Final Exam: Artificial Intelligence

May 3, 2006

Note: For several of these problems, the answer below is one of several correct possibilities.

Problem 1 (15 points)

Let G be an undirected graph. A set of vertices S is called a VERTEX COVER if every edge in G contains at least one vertex in S. For instance, in the graph shown below, the set {B,C,F} is a vertex cover of size 3.

Consider the following problem: Given a graph G and a number K, find a vertex cover for G of size at most K.

A. Describe a systematic (tree-like) search space for solving the vertex cover problem

Answer: A state is a set of vertices of size at most K.
An operator on state S is to add to S a vertex in V that is alphabetically later than any of the vertices in S, and that covers at least one edge not covered by S.
The start state is the empty set of vertices.
The goal state is a set of vertices that covers all the edges in G.

B. What is the branching factor in your state space? What is the depth of the space? Give your answer in terms of G and K in general, not for the particular example above.

Answer: The branching factor is |V|, the number of vertices in G. The depth is K.
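The search space described in Parts A and B can be sketched in Python (an illustration, not part of the exam; the function name and the example graph are my own):

```python
# A sketch of the systematic search from Part A: states are vertex sets of
# size at most K; operators add an alphabetically later vertex that covers
# at least one uncovered edge.

def vertex_cover_search(vertices, edges, k):
    """DFS over the tree-like space; returns a cover of size <= k or None."""
    def uncovered(s):
        return [e for e in edges if e[0] not in s and e[1] not in s]

    def expand(s):
        if not uncovered(s):          # goal test: every edge is covered
            return s
        if len(s) == k:               # depth bound K reached
            return None
        last = max(s) if s else ""    # only add alphabetically later vertices
        for v in vertices:
            if v > last and any(v in e for e in uncovered(s)):
                result = expand(s | {v})
                if result is not None:
                    return result
        return None

    return expand(frozenset())

# Example: a star graph, where a goal is found at depth 1 (see Part C).
star_edges = [("A", "B"), ("A", "C"), ("A", "D")]
print(vertex_cover_search(["A", "B", "C", "D"], star_edges, 3))
# → frozenset({'A'})
```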

C. Give an example where a goal state is much shallower than the maximum depth of the space, or give an argument that no such example exists.

Answer: A "star" graph, where the edges all connect a center vertex to each of the other vertices in the graph. The center vertex by itself is a vertex cover, so a goal state is reached at depth 1.

Problem 2 (10 points)

The vertex cover problem discussed in problem 1 can be compiled into a propositional satisfiability problem as follows:

There are two types of propositional atoms.

The following categories of constraints are then required.

A. Give one example of each of these categories that would be generated in compiling the example from problem 1.

Answer:
W. A_in_S => A_1 V A_2 V A_3.
X. B_2 => A_1.
Y. ~(A_1 ^ B_1).
Z. A_in_S V B_in_S.

B. Argue that the category (Y) is necessary by describing a variable assignment that satisfies all the constraints in categories (W), (X), and (Z) but is not a solution to the problem.

Answer: The assignment in which every atom except A_2 and A_3 is assigned the value TRUE.

Problem 3 (10 points)

If alpha-beta pruning is applied to the tree below, what branches are pruned? What is the best move for MAX, and what is the value of the root node?

Problem 4 (15 points)

Consider the following collection of sentences in the propositional logic.
1. P => (Q <=> R).
2. Q <=> ~(R^W)
3. Q => (P^W).

A. Convert this set to CNF. (You need not show the intermediate steps.)

Answer: 1A. ~P V ~Q V R.
1B. ~P V Q V ~R.
2A. ~Q V ~R V ~W.
2B. Q V R
2C. Q V W.
3A. ~Q V P
3B. ~Q V W.

B. Show how the Davis-Putnam algorithm finds a satisfying assignment. (Assume that, when a branch point is reached, the algorithm chooses the first atom alphabetically and tries TRUE before FALSE.)

Answer:

Let S0 be the above set of clauses.
Try P=TRUE.
Delete ~P from 1A and 1B.  Delete 3A.


S1:
1A. ~Q V R.
1B. Q V ~R.
2A. ~Q V ~R V ~W.
2B. Q V R.
2C. Q V W.
3B. ~Q V W.

Try Q=TRUE. Delete ~Q from 1A, 2A, 3B. Delete 1B, 2B, 2C.

S2:
1A. R.
2A. ~R V ~W.
3B. W.

1A is a singleton clause. Set R=TRUE. Delete ~R from 2A. Delete 1A.

S3:
2A. ~W.
3B. W.

2A is a singleton clause. Set W=FALSE. Delete W from 3B. Delete 2A. 3B is now the null clause. Fail.

Return to S1. Try Q=FALSE. Delete Q from 1B, 2B, 2C. Delete 1A, 2A, 3B.

S4:
1B. ~R.
2B. R.
2C. W.

1B is a singleton clause. Set R=FALSE. Delete R from 2B. Delete 1B. 2B is now the null clause. Fail.

Return to S0. Set P=FALSE. Delete P from 3A. Delete 1A, 1B.

S5:
2A. ~Q V ~R V ~W.
2B. Q V R.
2C. Q V W.
3A. ~Q.
3B. ~Q V W.

3A is a singleton clause. Set Q=FALSE. Delete Q from 2B, 2C. Delete 2A, 3A, 3B.

S6:
2B. R.
2C. W.

2B and 2C are singleton clauses. Set R=TRUE, W=TRUE. Succeed.

Answer: P=FALSE, Q=FALSE, R=TRUE, W=TRUE.

Problem 5 (20 points)

Let L be the following first-order language over a space of people:
s(X) --- X is satisfied with life.
d(X) --- X is a doctor.
h(X) --- X is a heart surgeon.
c(X,Y) --- X is a child of Y.
f --- Fred.

A. Express the following statements in L.

W. A person is satisfied with life if all his/her children are doctors.
forall(A) [forall(B) c(B,A) => d(B)] => s(A).

X. All of Fred's children are heart surgeons.
forall(X) c(X,f) => h(X).

Y. Heart surgeons are doctors.
forall(X) h(X) => d(X).

Z. Fred is satisfied with life.
s(f).

B. Give a resolution proof of Z from W,X,Y. You need not show the intermediate stages of conversion to CNF. Your proof must be a resolution proof; no credit will be given for any other kind of proof. Your proof must show all the resolutions involved in the proof.

Answer: Converting W, X, Y, and the negation of Z to CNF gives five clauses:
W1. c(sk0(A),A) V s(A).
W2. ~d(sk0(A)) V s(A).
X. ~c(X,f) V h(X).
Y. ~h(X) V d(X).
Z. ~s(f).

Resolving Z with W1 gives A. c(sk0(f),f).
Resolving A with X gives B. h(sk0(f)).
Resolving B with Y gives C. d(sk0(f)).
Resolving C with W2 gives D. s(f).
Resolving D with Z gives the empty clause.
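As a sanity check (my own, not part of the exam): instantiating the five clauses at the ground terms used in the proof (A := f, X := sk0(f)) yields a propositional clause set, and brute force confirms it is unsatisfiable.

```python
# Ground atoms: s = s(f), c = c(sk0(f),f), h = h(sk0(f)), d = d(sk0(f)).
from itertools import product

clauses = [
    lambda s, c, h, d: c or s,           # W1
    lambda s, c, h, d: (not d) or s,     # W2
    lambda s, c, h, d: (not c) or h,     # X
    lambda s, c, h, d: (not h) or d,     # Y
    lambda s, c, h, d: not s,            # Z (negated goal)
]

# Unsatisfiable iff no assignment to (s, c, h, d) satisfies every clause.
unsat = all(not all(cl(*v) for cl in clauses)
            for v in product([True, False], repeat=4))
print(unsat)   # → True, so W, X, Y together entail Z
```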

C. Is this a Horn theory? Explain your answer.
Answer: No. W1 has two positive literals and is thus not a Horn clause.

Problem 6 (5 points)

For each of the following statements, state whether it is true of backward chaining, true of forward chaining, true of both backward and forward chaining, or true of neither backward nor forward chaining.

Problem 7 (15 points)

Let X,Y,Z be Boolean random variables. Suppose that Prob(X=T) = 1/3; Prob(Y=T|X=T) = 3/4; Prob(Y=T|X=F) = 1/4; Prob(Z=F) = 4/5; and Z is independent of X. Compute the following:
A. Prob(Y=T).
Answer: P(Y=T) = P(Y=T^X=T) + P(Y=T^X=F) = P(Y=T|X=T) P(X=T) + P(Y=T|X=F)P(X=F) = (3/4)(1/3) + (1/4)(2/3) = 5/12
B. Prob(Y=F|X=T).
P(Y=F|X=T) = 1-P(Y=T|X=T) = 1/4.
C. Prob(X=T|Y=F).
P(X=T|Y=F) = P(Y=F|X=T) P(X=T)/P(Y=F) = (1/4) (1/3) / (7/12) = 1/7
D. Prob(X=F,Z=F).
P(X=F,Z=F) = P(X=F) P(Z=F) = (2/3) (4/5) = 8/15.
E. Prob(X=T,Y=F).
P(X=T,Y=F) = P(Y=F|X=T) P(X=T) = (1/4)(1/3) = 1/12.
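The arithmetic in Parts A-E can be verified with exact fractions, assuming the givens the worked answers rely on: P(X=T)=1/3, P(Y=T|X=T)=3/4, P(Y=T|X=F)=1/4, P(Z=F)=4/5, with Z independent of X.

```python
# Exact-fraction check of Parts A-E; the givens below are the values the
# worked answers use.
from fractions import Fraction as F

pX = F(1, 3)               # P(X=T)
pY_given_X = F(3, 4)       # P(Y=T | X=T)
pY_given_notX = F(1, 4)    # P(Y=T | X=F)
pZfalse = F(4, 5)          # P(Z=F)

pY = pY_given_X * pX + pY_given_notX * (1 - pX)    # Part A: total probability
pYf_given_X = 1 - pY_given_X                       # Part B: complement
pX_given_Yf = pYf_given_X * pX / (1 - pY)          # Part C: Bayes' rule
pXf_Zf = (1 - pX) * pZfalse                        # Part D: independence of X, Z
pX_Yf = pYf_given_X * pX                           # Part E: chain rule

print(pY, pYf_given_X, pX_given_Yf, pXf_Zf, pX_Yf)
# → 5/12 1/4 1/7 8/15 1/12
```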

Problem 8 (10 points)

One of the disturbing aspects of machine learning is that the predictions of ML systems are highly dependent on the representation, and that different, logically equivalent representations can give rise to very different predictions.

This takes many different forms. In this problem, we will consider how different characterizations of the classification attribute affect the prediction. Suppose, for instance, that you are trying to guess in what month and year a given event occurred, based on other information. One way is to view this as two classification attributes, "Month" and "Year", and apply classification techniques to each separately. Another way is to view it as a single attribute "Month-Year" and apply a classification technique to that single attribute.

Suppose that the predictive attributes are A and B, and that the ML technique being used is 1R. Construct a data set in which applying the 1R technique to "Month" and "Year" separately gives rules that predict "Month=January" and "Year=2000" for the instance A=T,B=T, but applying the 1R technique to "Month-Year" gives the prediction "Month-Year=February-2002" for the same instance.

A B Month Year
T T January 1997
T T January 1998
T T January 1999
T T March 2000
T T April 2000
T T May 2000
T T February 2002
T T February 2002
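A quick sketch of the 1R computation on the data set above (my own illustration): since every row has A=T and B=T, each candidate rule simply predicts the most frequent class value, and the two framings disagree as the problem requires.

```python
# 1R on the answer data set: with only one attribute value (A=T, B=T),
# each rule predicts the majority class for that framing.
from collections import Counter

rows = [
    ("January", 1997), ("January", 1998), ("January", 1999),
    ("March", 2000), ("April", 2000), ("May", 2000),
    ("February", 2002), ("February", 2002),
]

months = Counter(m for m, y in rows)   # "Month" as its own attribute
years = Counter(y for m, y in rows)    # "Year" as its own attribute
pairs = Counter(rows)                  # combined "Month-Year" attribute

print(months.most_common(1))   # → [('January', 3)]
print(years.most_common(1))    # → [(2000, 3)]
print(pairs.most_common(1))    # → [(('February', 2002), 2)]
```

Separately, January (3 of 8) and 2000 (3 of 8) are the majority values, yet no row is January-2000; combined, February-2002 (2 of 8) wins.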