## Sample problems from second half of course

Let me emphasize that this is just a collection of sample problems, not a sample final exam.

## Multiple choice problems

### Problem 1

Bayes' Law states that
• A. Prob(P|Q) = Prob(P) / Prob(Q).
• B. Prob(P|Q) = Prob(Q|P).
• C. Prob(P|Q) = Prob(Q|P) / Prob(Q).
• D. Prob(P|Q) = Prob(P) * Prob(Q|P) / Prob(Q).
• E. Prob(P|Q) = Prob(Q) * Prob(Q|P) / Prob(P).
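
As a refresher, Bayes' Law — Prob(P|Q) = Prob(P) * Prob(Q|P) / Prob(Q) — can be sanity-checked numerically against the direct definition of conditional probability. The joint distribution below is an arbitrary illustration:

```python
# Sanity-check Bayes' Law against the definition Prob(P|Q) = Prob(P,Q) / Prob(Q),
# using an invented joint distribution over two Boolean events.
joint = {
    (True, True): 0.2,    # P and Q
    (True, False): 0.3,   # P and not Q
    (False, True): 0.1,   # not P and Q
    (False, False): 0.4,  # neither
}

prob_p = sum(v for (p, q), v in joint.items() if p)
prob_q = sum(v for (p, q), v in joint.items() if q)
prob_p_given_q = joint[(True, True)] / prob_q   # direct definition
prob_q_given_p = joint[(True, True)] / prob_p

# Bayes' Law recovers the same conditional probability.
bayes = prob_p * prob_q_given_p / prob_q
print(bayes, prob_p_given_q)
```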

### Problem 2

In Naive Bayes learning, we make the assumption that
• A. The classification attribute is independent of the predictive attributes.
• B. The classification attribute depends on only one predictive attribute.
• C. The predictive attributes are absolutely independent.
• D. The predictive attributes are conditionally independent given the classification attribute.
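
As a refresher on where the assumption enters: Naive Bayes scores a class C for an instance x as P(C) times the product over attributes of P(x_i | C), which is only valid if the predictive attributes are conditionally independent given the class. A minimal sketch, with an invented toy dataset:

```python
from collections import defaultdict

# Invented training set: (predictive attributes, class label).
data = [
    ({"outlook": "sunny", "windy": "no"},  "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay"),
    ({"outlook": "rain",  "windy": "no"},  "play"),
    ({"outlook": "rain",  "windy": "yes"}, "stay"),
    ({"outlook": "sunny", "windy": "no"},  "play"),
]

class_counts = defaultdict(int)
attr_counts = defaultdict(int)  # (class, attribute, value) -> count
for attrs, label in data:
    class_counts[label] += 1
    for a, v in attrs.items():
        attr_counts[(label, a, v)] += 1

def naive_bayes_score(attrs, label):
    # P(C) * product over attributes of P(a = v | C): the attributes are
    # treated as conditionally independent given the class.
    score = class_counts[label] / len(data)
    for a, v in attrs.items():
        score *= attr_counts[(label, a, v)] / class_counts[label]
    return score

instance = {"outlook": "sunny", "windy": "no"}
best = max(class_counts, key=lambda c: naive_bayes_score(instance, c))
print(best)
```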

### Problem 3

A support vector machine finds a linear separator that maximizes the "margin", which is:
• A. The number of misclassified data points.
• B. The sum over all misclassified points of the distance from the point to the separator.
• C. The sum over all misclassified points of the distance from the point to the separator squared.
• D. The minimum distance from any point to the separator.
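
As a refresher on the geometry: the distance from a point p to the hyperplane w·x + b = 0 is |w·p + b| / ||w||, and the margin is the minimum such distance over the data points. A sketch with invented weights and points:

```python
import math

# Separator w . x + b = 0 in 2-D; weights and points are invented.
w = (1.0, 1.0)
b = -1.0
points = [(2.0, 2.0), (0.0, 0.0), (3.0, 0.5)]

def distance(p):
    # Geometric distance from point p to the hyperplane w . x + b = 0.
    return abs(w[0] * p[0] + w[1] * p[1] + b) / math.hypot(*w)

# The margin is the minimum distance from any point to the separator.
margin = min(distance(p) for p in points)
print(margin)
```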

### Problem 4

In the problem of tagging elements E1 ... EN with tags T1 ... TN, the K-gram assumption is the assumption that
• A. EI is independent of EI-K
• B. TI is independent of TI-K
• C. EI is conditionally independent of E1 ... EI-K given EI+1-K ... EI-1
• D. TI is conditionally independent of T1 ... TI-K given TI+1-K ... TI-1
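
As a refresher: under the K-gram assumption, the probability of a tag sequence factors into local terms, each conditioned only on the K-1 preceding tags. A minimal bigram (K = 2) sketch; the tags and transition probabilities below are invented:

```python
# Bigram (K = 2) tag model: each tag depends only on the immediately
# preceding tag, i.e. Ti is conditionally independent of T1 ... Ti-2
# given Ti-1.  Transition probabilities are invented.
trans = {
    ("START", "Det"): 0.6, ("Det", "Noun"): 0.9,
    ("Noun", "Verb"): 0.5, ("Verb", "Det"): 0.4,
}

def sequence_prob(tags):
    # Product of local transition probabilities.
    p = 1.0
    prev = "START"
    for t in tags:
        p *= trans.get((prev, t), 0.0)
        prev = t
    return p

print(sequence_prob(["Det", "Noun", "Verb"]))
```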

### Problem 5

Learning takes place in a back-propagation network by
• A. Propagating activation levels from the input layer to the output layer.
• B. Propagating activation levels from the output layer to the input layer.
• C. Propagating modifications to the weights on the arcs from the input layer to the output layer.
• D. Propagating modifications to the weights on the arcs from the output layer to the input layer.
• E. Adding nodes and links in the hidden layers.
• F. Both adding and deleting nodes and links in the hidden layers.
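
As a refresher on the mechanics: during training, activation levels flow forward from input to output, while the error signal — and hence the weight modifications — propagates backward from the output layer toward the input layer. A minimal sketch; the network size, seed, and learning rate are arbitrary illustration choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny 2-4-1 network trained on XOR.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

losses = []
for _ in range(5000):
    # Forward pass: activation levels flow input layer -> output layer.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    losses.append(float(np.mean((out - y) ** 2)))

    # Backward pass: weight modifications propagate output layer -> input layer.
    d_out = (out - y) * out * (1 - out)  # error gradient at the output layer
    d_h = (d_out @ W2.T) * h * (1 - h)   # gradient pushed back to the hidden layer

    lr = 1.0
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0)

print(losses[0], losses[-1])
```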

## Long Answer Problems

### Problem 6

A. What conditional probabilities are recorded in the above Bayesian network?

B. For each of the following statements, say whether it is true or false in the above network:

• B and C are independent absolutely.
• B and C are independent given A.
• B and C are independent given D.
• A and D are independent absolutely.
• A and D are independent given B.
• A and D are independent given B and C.

C. Assuming that all the random variables are Boolean, show how Prob(B=T) can be calculated in terms of the probabilities recorded in the above network.

### Problem 7

Datasets often contain instances with null values in some of the attributes. Some classification learning algorithms are able to use such instances in the training set; other algorithms must discard them.
• A. Can Naive Bayes make use of instances with null values in the training set? Explain your answer.
• B. Can K-nearest neighbors make use of instances with null values in the training set? Explain your answer.

### Problem 8

The version of the ID3 algorithm in the class handout includes the following test: if AVG_ENTROPY(AS,C,T) is not substantially smaller than ENTROPY(C,T), then the algorithm constructs a leaf corresponding to the current state of T and does not recur. "Substantially smaller" here, of course, is rather vague. Is overfitting more likely to occur if this condition is changed to require that AVG_ENTROPY(AS,C,T) be much smaller than ENTROPY(C,T), or if it is changed to require only that AVG_ENTROPY(AS,C,T) be at all smaller than ENTROPY(C,T)? Explain your answer. What is the disadvantage of eliminating the test?
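
For reference, a minimal sketch of the two quantities being compared in that test. The function names follow the handout's, but the labels and the split below are invented:

```python
import math

def entropy(labels):
    # ENTROPY(C,T): entropy of the class distribution over table T.
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

def avg_entropy(split):
    # AVG_ENTROPY(AS,C,T): entropy after splitting T on attribute AS,
    # weighted by the size of each resulting subset.
    n = sum(len(part) for part in split)
    return sum((len(part) / n) * entropy(part) for part in split)

labels = ["yes"] * 5 + ["no"] * 5                       # maximally mixed table
split = [["yes"] * 4 + ["no"], ["no"] * 4 + ["yes"]]    # an informative split
print(entropy(labels), avg_entropy(split))
```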

### Problem 9

• A. What is the sparse data problem in using Naive Bayes for classifying text? How is it solved?
• B. What is the sparse data problem in using the k-gram model for tagging text? How is it solved?
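
One standard remedy in both settings is smoothing, so that an event unseen in the training data gets a small nonzero probability rather than zeroing out an entire product. A minimal sketch of add-one (Laplace) smoothing for word probabilities; the counts and vocabulary size are invented:

```python
# Add-one (Laplace) smoothing: every word, seen or unseen, gets one
# extra pseudo-count, so an unseen word has small nonzero probability.
counts = {"the": 50, "ball": 3, "umpire": 1}  # invented training counts
vocab_size = 1000                             # assumed total vocabulary size
total = sum(counts.values())

def smoothed_prob(word):
    return (counts.get(word, 0) + 1) / (total + vocab_size)

print(smoothed_prob("ball"))     # seen word
print(smoothed_prob("infield"))  # unseen word: small but nonzero
```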

### Problem 10

The quality of a classifier is most commonly measured by the accuracy of its predictions. Explain why accuracy is not always the best measure, and describe an alternative measure.
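
For a concrete contrast between accuracy and a class-sensitive measure such as recall: on a dataset where one class is rare, the trivial classifier that always predicts the majority class scores high accuracy while finding no positive instances at all. The labels below are invented:

```python
# Invented 95%-negative dataset and the trivial "always negative" classifier.
actual    = ["neg"] * 95 + ["pos"] * 5
predicted = ["neg"] * 100

# Accuracy: fraction of predictions that are correct.
accuracy = sum(a == p for a, p in zip(actual, predicted)) / len(actual)

# Recall on the positive class: fraction of actual positives found.
true_pos = sum(a == "pos" and p == "pos" for a, p in zip(actual, predicted))
recall = true_pos / actual.count("pos")

print(accuracy, recall)  # high accuracy, yet no positive is ever detected
```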