1. P => (Q <=> R).
2. Q <=> ~(R ^ W).
3. Q => (P ^ W).

A. Convert this set to CNF. (You need not show the intermediate steps.)
1A. ~P V ~Q V R
1B. ~P V Q V ~R
2A. ~Q V ~R V ~W
2B. Q V R
2C. Q V W
3A. ~Q V P
3B. ~Q V W
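As a quick sanity check (my own addition, not part of the original solution), the clauses 1A-3B can be confirmed equivalent to formulas 1-3 by brute force over all 16 truth assignments:

```python
# Brute-force equivalence check of the CNF conversion above.
from itertools import product

def iff(a, b):
    return a == b

checks = []
for P, Q, R, W in product([True, False], repeat=4):
    original = (((not P) or iff(Q, R))             # 1. P => (Q <=> R)
                and iff(Q, not (R and W))          # 2. Q <=> ~(R ^ W)
                and ((not Q) or (P and W)))        # 3. Q => (P ^ W)
    cnf = (((not P) or (not Q) or R)               # 1A
           and ((not P) or Q or (not R))           # 1B
           and ((not Q) or (not R) or (not W))     # 2A
           and (Q or R)                            # 2B
           and (Q or W)                            # 2C
           and ((not Q) or P)                      # 3A
           and ((not Q) or W))                     # 3B
    checks.append(original == cnf)
all_match = all(checks)
print(all_match)  # True
```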
B. Show how the Davis-Putnam algorithm finds a satisfying assignment. (Assume that, when a branch point is reached, the algorithm chooses the first atom alphabetically and tries TRUE before FALSE.)
S0: 1A. ~P V ~Q V R. 1B. ~P V Q V ~R. 2A. ~Q V ~R V ~W. 2B. Q V R. 2C. Q V W. 3A. ~Q V P. 3B. ~Q V W.

Try P=T. Delete ~P from 1A, 1B; delete clause 3A.
S1: 1A. ~Q V R. 1B. Q V ~R. 2A. ~Q V ~R V ~W. 2B. Q V R. 2C. Q V W. 3B. ~Q V W.

Try Q=T. Delete ~Q from 1A, 2A, 3B; delete clauses 1B, 2B, 2C.
1A. R. 2A. ~R V ~W. 3B. W.
R and W are singleton clauses. Set R=T, W=T. Delete clauses 1A, 3B; delete ~R and ~W from 2A, leaving the null clause. Fail; backtrack to S1.

Try Q=F. Delete clauses 1A, 2A, 3B; delete Q from 1B, 2B, 2C.
1B. ~R. 2B. R. 2C. W.
R and W are singleton clauses. Set R=T, W=T. Delete clauses 2B, 2C; delete ~R from 1B. 1B is now the null clause. Fail; backtrack to S0.

Try P=F. Delete clauses 1A, 1B; delete P from 3A.
2A. ~Q V ~R V ~W. 2B. Q V R. 2C. Q V W. 3A. ~Q. 3B. ~Q V W.
3A is a singleton clause. Set Q=F. Delete clauses 2A, 3A, 3B; delete Q from 2B, 2C.
2B. R. 2C. W.
2B and 2C are singleton clauses. Set R=T, W=T. Delete clauses 2B, 2C. No clauses remain; succeed.
Return P=F, Q=F, R=T, W=T.
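The trace can be reproduced mechanically. The following is a minimal DPLL sketch of my own (not the original code), following the same conventions: unit propagation first, then branching on the alphabetically first atom with TRUE tried before FALSE.

```python
# Minimal DPLL. Clauses are sets of literal strings; "~P" is the negation of "P".
def neg(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def dpll(clauses, assignment):
    # Unit propagation: repeatedly satisfy singleton clauses.
    while True:
        units = [c for c in clauses if len(c) == 1]
        if not units:
            break
        lit = next(iter(units[0]))
        assignment = {**assignment, lit.lstrip("~"): not lit.startswith("~")}
        clauses = [c - {neg(lit)} for c in clauses if lit not in c]
        if any(len(c) == 0 for c in clauses):
            return None                   # null clause: fail
    if not clauses:
        return assignment                 # no clauses left: succeed
    atom = min(l.lstrip("~") for c in clauses for l in c)
    for value in (True, False):           # try True before False
        lit = atom if value else "~" + atom
        reduced = [c - {neg(lit)} for c in clauses if lit not in c]
        if not any(len(c) == 0 for c in reduced):
            result = dpll(reduced, {**assignment, atom: value})
            if result is not None:
                return result
    return None                           # both branches failed: backtrack

clauses = [{"~P", "~Q", "R"}, {"~P", "Q", "~R"},        # 1A, 1B
           {"~Q", "~R", "~W"}, {"Q", "R"}, {"Q", "W"},  # 2A, 2B, 2C
           {"~Q", "P"}, {"~Q", "W"}]                    # 3A, 3B
print(dpll(clauses, {}))  # {'P': False, 'Q': False, 'R': True, 'W': True}
```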
s(X,L) --- Person X speaks language L.
c(X,Y) --- Persons X and Y can communicate.
i(W,X,Y) --- Person W can serve as an interpreter between persons X and Y.
j, p, e, f --- Constants: Joe, Pierre, English, and French respectively.

A. Express the following statements in L:
Answer: The Skolemized forms of (i)-(v) are
(sk1(L,M) is a Skolem function mapping L and M to a person who speaks both L and M.)
The Skolemized form of the negation of (vi) is
The proof is then as follows:
Prob(A=T) = 0.8
Prob(B=T | A=T) = 0.5
Prob(B=T | A=F) = 1
Prob(C=T | B=T) = 0.1
Prob(C=T | B=F) = 0.5
A and C are conditionally independent given B.

Evaluate the following terms. (If you wish, you can give your answer as an arithmetic expression such as "0.8*0.5 / (0.8*1 + 0.5*0.1)")
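The specific terms to evaluate are not preserved here, but as an illustration of the arithmetic involved, the marginals of the chain A -> B -> C follow from the given probabilities by summing out:

```python
# Illustration only: the original sub-questions are not preserved, so this
# just shows how marginals follow from the given numbers for A -> B -> C.
p_a = 0.8                                 # Prob(A=T)
p_b_given_a = {True: 0.5, False: 1.0}     # Prob(B=T | A)
p_c_given_b = {True: 0.1, False: 0.5}     # Prob(C=T | B)

# Sum out A: Prob(B=T) = 0.8*0.5 + 0.2*1 = 0.6
p_b = p_a * p_b_given_a[True] + (1 - p_a) * p_b_given_a[False]
# A and C are conditionally independent given B, so sum out B:
# Prob(C=T) = 0.6*0.1 + 0.4*0.5 = 0.26
p_c = p_b * p_c_given_b[True] + (1 - p_b) * p_c_given_b[False]
# Bayes' rule: Prob(A=T | B=T) = 0.8*0.5 / 0.6 = 2/3
p_a_given_b = p_a * p_b_given_a[True] / p_b
```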
X  Y  Z  C   number of instances
-----------------------------------
T  T  T  T   7
T  T  F  F   1
T  F  T  T   2
F  T  T  F   1
F  F  T  F   8
F  F  F  T   1

A. What classifier does the 1R algorithm output?
The best rule based on X would be ``if (X==T) then predict C=T else predict C=F" with accuracy 18/20.
The best rule based on Y would be ``if (Y==T) then predict C=T else predict C=F" with accuracy 15/20.
No rule based on Z gets more than 1/2 right for either value of Z.
Hence, return the rule based on X: ``if (X==T) then predict C=T else predict C=F" with accuracy 18/20.
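A sketch of the 1R computation (the data set is hard-coded from the table above; helper names are my own) reproduces the 18/20 and 15/20 counts:

```python
# 1R: for each attribute, predict the majority class per attribute value;
# return the attribute whose rule classifies the most instances correctly.
from collections import Counter

rows = [  # (X, Y, Z, C, count)
    (True,  True,  True,  True,  7),
    (True,  True,  False, False, 1),
    (True,  False, True,  True,  2),
    (False, True,  True,  False, 1),
    (False, False, True,  False, 8),
    (False, False, False, True,  1),
]

def one_r_score(attr_index):
    """Majority class per attribute value; returns (correct count, rule)."""
    counts = {}   # attribute value -> Counter over class labels
    for row in rows:
        value, label, n = row[attr_index], row[3], row[4]
        counts.setdefault(value, Counter())[label] += n
    rule = {v: c.most_common(1)[0][0] for v, c in counts.items()}
    correct = sum(c.most_common(1)[0][1] for c in counts.values())
    return correct, rule

scores = {name: one_r_score(i) for i, name in enumerate("XYZ")}
best = max(scores, key=lambda name: scores[name][0])
print({name: s[0] for name, s in scores.items()})  # {'X': 18, 'Y': 15, 'Z': 10}
print(best)                                        # X
```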
B. What classification does Naive Bayes predict for a test instance with X=F, Y=T, Z=F, and what probability does it give to that prediction?
Let A = P(X=F,Y=T,Z=F). Then
P(C=T|X=F,Y=T,Z=F) = (1/A) * Freq(C=T) * Freq(X=F|C=T) * Freq(Y=T|C=T) * Freq(Z=F|C=T) = (1/A) * (1/2) * (1/10) * (7/10) * (1/10) = 0.0035/A.
P(C=F|X=F,Y=T,Z=F) = (1/A) * Freq(C=F) * Freq(X=F|C=F) * Freq(Y=T|C=F) * Freq(Z=F|C=F) = (1/A) * (1/2) * (9/10) * (2/10) * (1/10) = 0.009/A.
A = 0.0035 + 0.009 = 0.0125.
Predict C=F with probability 0.009/0.0125 = 0.72.
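The same computation in code (data hard-coded from the table; maximum-likelihood frequencies with no smoothing, as above):

```python
# Naive Bayes for the test instance X=F, Y=T, Z=F.
rows = [  # (X, Y, Z, C, count)
    (True,  True,  True,  True,  7),
    (True,  True,  False, False, 1),
    (True,  False, True,  True,  2),
    (False, True,  True,  False, 1),
    (False, False, True,  False, 8),
    (False, False, False, True,  1),
]
total = sum(r[4] for r in rows)          # 20 instances

def class_count(label):
    return sum(r[4] for r in rows if r[3] == label)

def cond_freq(i, value, label):
    """Freq(attribute i == value | C == label)."""
    matching = sum(r[4] for r in rows if r[3] == label and r[i] == value)
    return matching / class_count(label)

def unnormalized(label, x, y, z):
    """Freq(C) times the product of the per-attribute frequencies."""
    p = class_count(label) / total
    for i, v in zip((0, 1, 2), (x, y, z)):
        p *= cond_freq(i, v, label)
    return p

num_t = unnormalized(True,  False, True, False)  # (1/2)(1/10)(7/10)(1/10) = 0.0035
num_f = unnormalized(False, False, True, False)  # (1/2)(9/10)(2/10)(1/10) = 0.009
A = num_t + num_f                                # 0.0125
print(round(num_f / A, 2))                       # 0.72
```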
C. If the ID3 algorithm is run on this data set, the top-level node of the decision tree will be a test of X. Show the entire decision tree. (The remaining entropy comparisons are trivial.)
                (X)
         T ------|------ F
        (Z)             (Z)
      T--|--F         T--|--F
      |     |         |     |
     C=T   C=F       C=F   C=T

D. Give an argument that no linear discriminator can perfectly classify this data set. (Hint: Compare the effect of attribute Z on C in going from the first line to the second line against its effect in going from the 5th line to the 6th line.)
Answer: The only difference between line 1 and line 2 in the predictive attributes is that Z changes from T to F, causing a change of C from T to F. Therefore in a perfect linear discriminator, Z must have a positive weight. The only difference between line 5 and line 6 in the predictive attributes is that Z changes from T to F, causing a change of C from F to T. Therefore in a perfect linear discriminator, Z must have a negative weight. They can't both be true.
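The argument can be illustrated numerically (helper names and the encoding T=1, F=0 are my own): rows 1 and 2 differ only in Z, so the difference of their discriminator scores equals the weight on Z exactly, and likewise for rows 5 and 6 with the opposite sign; a small brute-force search over integer weights also finds no perfect separator.

```python
# The linear discriminator's score for a point, with weights w and bias b.
from itertools import product

def margin(w, b, point):
    return sum(wi * xi for wi, xi in zip(w, point)) + b

# Encode T/F as 1/0; (X, Y, Z) -> C for the six rows of the table.
data = [((1, 1, 1), True), ((1, 1, 0), False), ((1, 0, 1), True),
        ((0, 1, 1), False), ((0, 0, 1), False), ((0, 0, 0), True)]

# Rows 1 and 2 differ only in Z, so margin(r1) - margin(r2) == w_z
# for ANY weights; rows 5 and 6 likewise give margin(r6) - margin(r5) == -w_z.
# Correctness would force both differences positive, i.e. w_z > 0 and w_z < 0.
w, b = (3, -2, 7), -1     # arbitrary integer weights, just to show the identity
r1, r2, r5, r6 = data[0][0], data[1][0], data[4][0], data[5][0]
assert margin(w, b, r1) - margin(w, b, r2) == w[2]
assert margin(w, b, r6) - margin(w, b, r5) == -w[2]

# No small-integer weight assignment separates all six rows.
separable = any(
    all((margin((wx, wy, wz), bb, p) > 0) == label for p, label in data)
    for wx, wy, wz, bb in product(range(-5, 6), repeat=4)
)
print(separable)  # False
```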
Answer: A training set is used by the base level learning algorithm to output a classifier. A validation set is used multiple times by a higher level process to compare different classifiers and to choose the best learning algorithm for the task or the best tuning for parameters in the learning algorithm. A test set is (ideally) used once to evaluate the quality of a classifier.