Let A,B,C,D be Boolean random variables. Consider the following Bayesian network:

How can the following probabilities be calculated from the quantities in the above net?

- A. Prob(C=T)
- B. Prob(A=T | C=T)
- C. Prob(D=T | A=T, C=T)

** Answer:**

A.
P(C=T) =

P(C=T | A=T, B=T) P(A=T) P(B=T) +
P(C=T | A=T, B=F) P(A=T) P(B=F) +

P(C=T | A=F, B=T) P(A=F) P(B=T) +
P(C=T | A=F, B=F) P(A=F) P(B=F)

B. P(A=T | C=T) = (by Bayes' Law)

P(C=T | A=T) P(A=T) / P(C=T).

The denominator P(C=T) is given above in (A).

Now, P(C=T | A=T) = P(C=T,B=T | A=T) + P(C=T, B=F | A=T) =

P(C=T | B=T, A=T) P(B=T | A=T) +
P(C=T | B=F, A=T) P(B=F | A=T) =

(since B is independent of A)

P(C=T | A=T, B=T) P(B=T) + P(C=T | A=T,B=F) P(B=F).

Putting
this all together we get

P(A=T | C=T) =

[P(C=T | A=T, B=T) P(A=T) P(B=T) +
P(C=T | A=T, B=F) P(A=T) P(B=F)] /

[P(C=T | A=T, B=T) P(A=T) P(B=T) +
P(C=T | A=T, B=F) P(A=T) P(B=F) +

P(C=T | A=F, B=T) P(A=F) P(B=T) +
P(C=T | A=F, B=F) P(A=F) P(B=F) ]

C.

P(D=T | A=T, C=T) =

P(D=T,B=T | A=T,C=T) + P(D=T,B=F | A=T,C=T) =

P(D=T | B=T,A=T,C=T) P(B=T | A=T,C=T) +
P(D=T | B=F,A=T,C=T) P(B=F | A=T,C=T).

Now P(D=T | B=T, A=T, C=T) = P(D=T | B=T, C=T) since D is independent of A given B and C.

P(B=T | A=T, C=T) = (by Bayes' law)

P(C=T | B=T, A=T) P(B=T | A=T) / P(C=T | A=T) =

P(C=T | B=T, A=T) P(B=T) /

[P(C=T | B=T, A=T) P(B=T) +
P(C=T | B=F, A=T) P(B=F)]

Putting all this together gives

P(D=T | A=T, C=T) =

[P(D=T | B=T, C=T) P(C=T | B=T, A=T) P(B=T) +

P(D=T | B=F, C=T) P(C=T | B=F, A=T) P(B=F)] /

[P(C=T | B=T, A=T) P(B=T) +
P(C=T | B=F, A=T) P(B=F)]

Prob(B = T | A = T) = 3 Prob(B = F | A = T) = 5 Prob(B = T | A = F) = 1 Prob(B = F | A = F) = 4What are the actual probabilities?

** Answer: **

Prob(B = T | A = T) = 3/8 Prob(B = F | A = T) = 5/8 Prob(B = T | A = F) = 1/5 Prob(B = F | A = F) = 4/5

A. The attachment of "and some recent copies" is syntactically ambiguous. What strategy can be used to determine whether it attaches to "sculptures" or to "museum"?

** Answer:**
World knowledge:
A sculpture is much more likely to be in a museum than
in a copy of a sculpture. (This assumes that the ambiguity in (B) is resolved
correctly.)

B. There is also a somewhat subtle anaphoric ambiguity in the sentence. What is it, and how can it be resolved?

** Answer:**
The ambiguity is whether "copies" means "copies of sculptures" or "copies
of a museum." The resolution again involves world knowledge: A sculpture
is much more likely to be copied than a museum.

C. The semantics of a superlative like ``oldest'' are rather complex. What is the difference in semantic structure between "The oldest sculptures in the museum" and "The marble sculptures in the museum"?

** Answer:** The word "marble" simply states a property of the sculptures:
these sculptures happen to be marble. But it is nonsense to say that these
sculptures happen to be oldest. Rather, the meaning of "oldest" is
dependent on the qualification "in the museum"; these are sculptures such
that no other sculptures * in this museum * are older.

A. What is the k-gram model, in natural language interpretation? Describe one way it is applied (2 or 3 sentences).

** Answer:** The k-gram model is a probabilistic model in which it is
assumed that the probability of a "word" (loosely interpreted) occuring
in a sentence depends only on the previous k-1 words, and, given those
k-1 words, is independent of all previous words. "Word" here can be (a)
an actual word; (b) a part of speech; (c) a phoneme; (d) a particular
meaning of a particular word; among others.

B. How can the k-gram model be viewed as a Markov model?

** Answer:**
The states of the Markov model are tuples of k-1 words. If word w_{k}
follows tuple w_{1} ... w_{k-1} with probability p in the
k-gram model, this will be reflected in the Markov model by an arc from
the state w_{1} ... w_{k-1} to state
w_{2} ... w_{k} with probability p.

** Answer:**
The main points I wanted you to touch on were:
As an advantage of the knowledge-based approach, the fact that, ultimately,
a complete natural language system will have to deal with all the issues
that arise in this approach. Also that research in this approach can
draw on the work in linguistics.

As a disadvantage of the statistical approach, the "sparse data" and the "long tail" problems; the facts that no NL text of plausible size will contain all the low-level combinations that are possible.

As disadvantages to the knowledge-based approach: the fact that many of the strategies proposed are extremely ill-defined, and, in our current state of knowledge, unimplementable except in toy illustrations. The interaction between different kinds of information and the use of plausible inference is particularly poorly understood. The fact that this is a highly labor-intensive approach.

As an advantage of the statistical approach: the fact that working systems can often be made to work quite effectively by compiling statistics over existing online texts with a small fraction of the work that would be needed in the knowledge-based approach. The fact that this system in some respects allows many different types of information --- syntactic, semantic, pragmatic --- to be combined effortlessly.

A. What is segmentation?

** Answer:** The division of an image into regions corresponding to
different surfaces in the actual scene.

B. What is the general viewpoint assumption?

** Answer:** The assumption that significant qualitative features of
the image are stable under sufficiently small changes of viewpoint.

C. What is the difference between an affine and a perspective transformation? (You need not give the formula for the coordinate transformation.) Under what circumstances are the two classes of transformations applicable?

** Answer:** In an affine transformation, parallel lines remain parallel.
In a perspective transformation, a set of parallel lines may turn into
a set of lines that meet at a single point. Affine transformations are
applicable in cases where the objects in the scene are far from the eye
as compared to their size, but their orientation relative to the line of
sight varies. Perspective transformations apply when the distance from
the eye to an object is small compared to the size of the object, and so
some parts of the object are much closer to the eye than others.
than

Consider the following microworld, a trivial form of Chinese checkers:

You have an arrangement of locations and a smaller number of pieces. Each piece sits at one location. Each location holds at most one piece.

If piece P1 is at location L1, L1 is next to L2, and L2 is empty, then you can move P1 to L2. If piece P1 is at L1; P2 is at L2; L3 is empty; and L1, L2, and L3 are in sequence on a line; then you can jump P1 over P2 and put it in L3.

This microworld can be described in the situation calculus using the following language L:

Sorts: Pieces and locations. Atemporal: next(L1,L2) : Predicate. line(L1,L2,L3) : Predicate. Boolean fluents: at(P,L) : Function. empty(L) : Function Actions: move(P,L1,L2) : Function jump(P1,P2,L1,L2,L3) : Function Temporal: holds(S,F). result(S,A). poss(S,A).

A. State in L the domain constraint that a location L1 is empty if and only if there is no piece in L1.

** Answer:**
holds(S,empty(L1)) iff not exists(P) holds(S,at(P,L1))

B. State the causal axiom for move.

** Answer:**
poss(S,move(P,L1,L2)) implies holds(result(S,move(P,L1,L2)),at(P,L2)).

C. State the feasibility axiom for jump.

** Answer:**
poss(S,jump(P1,P2,L1,L2,L3)) iff

holds(S,at(P1,L1)) and holds(S,at(P2,L2)) and holds(S,empty(L3)) and
line(L1,L2,L3).