## Problem Set 5

Assigned: April 1
Due: April 15

### Problem 1

Machine learning algorithms can be very sensitive to how values of attributes are grouped together. Naive Bayes is one of the most sensitive.

Amy and Barbara are separately researching the question of what very rich people like to drink. They are both working from the same data set. An instance in this data set is a person. There are three attributes:

1. The make of the most expensive car the person owns.
2. Whether or not the person owns a yacht.
3. The person's favorite beverage.

Amy divides the beverages into two categories: alcoholic and non-alcoholic. She then applies the Naive Bayes algorithm to predict the category of drink for an individual with the predictive values "Owns a BMW" and "Owns a yacht." The result of her calculation is that, given that a person owns a BMW and a yacht, the probability is more than 90 per cent that his/her favorite drink is non-alcoholic.

Barbara divides the beverages into two categories: champagne and everything else. She applies Naive Bayes in the same way. The result of her calculation is that, given that a person owns a BMW and a yacht, the probability is more than 90 per cent that his/her favorite drink is champagne.

Describe a data set that supports both of these conclusions. (The kind of "description" you're looking for would be something like, "There are 100 instances of people who own a BMW, own a yacht and drink champagne; 3 instances of people who don't own a BMW, own a yacht, and drink beer;" etc.)
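As a reminder of the mechanics involved (this is not an answer to the problem), the sketch below shows how Naive Bayes turns raw counts into the posterior probability of each drink category given the evidence "Owns a BMW" and "Owns a yacht." The function name and the toy counts in the comment are illustrative placeholders, not a data set that satisfies the problem.

```python
def naive_bayes_scores(class_totals, cond_counts, evidence):
    """Unnormalized Naive Bayes scores P(c) * prod_a P(a=v | c).

    class_totals: {class: number of instances with that class}
    cond_counts:  {class: {(attr, value): count of instances in that
                  class having attr=value}}
    evidence:     list of observed (attr, value) pairs.
    """
    n = sum(class_totals.values())
    scores = {}
    for c, nc in class_totals.items():
        p = nc / n                       # prior P(c)
        for av in evidence:
            p *= cond_counts[c].get(av, 0) / nc   # likelihood P(a=v | c)
        scores[c] = p
    return scores

# Toy illustration (made-up counts): dividing scores by their sum gives
# the posterior P(class | evidence) that Amy and Barbara each compute.
scores = naive_bayes_scores(
    {'alcoholic': 2, 'non-alcoholic': 2},
    {'alcoholic': {('car', 'BMW'): 1}, 'non-alcoholic': {('car', 'BMW'): 2}},
    [('car', 'BMW')])
posterior = {c: s / sum(scores.values()) for c, s in scores.items()}
```

Your task is to choose the counts so that Amy's posterior for "non-alcoholic" and Barbara's posterior for "champagne" each exceed 0.9.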

### Problem 2

A. Let D be a data set with three predictive attributes, P, Q, and R, and one classification attribute, C. Attributes P, Q, and C are Boolean. Attribute R has three values: 1, 2, and 3. The data is as follows:

| P | Q | R | C | Number of instances |
|---|---|---|---|---------------------|
| Y | Y | 1 | N | 20 |
| Y | Y | 2 | Y | 1 |
| Y | Y | 3 | Y | 2 |
| Y | N | 1 | Y | 8 |
| Y | N | 2 | Y | 2 |
| Y | N | 3 | Y | 0 |
| N | Y | 1 | N | 12 |
| N | Y | 2 | Y | 2 |
| N | Y | 3 | Y | 1 |
| N | N | 1 | Y | 25 |
| N | N | 2 | Y | 1 |
| N | N | 3 | Y | 4 |

Trace the execution of the ID3 algorithm, and show the decision tree that it outputs. At each stage, you should compute the average entropy AVG\_ENTROPY(A,C,T) for each attribute A. (The book calls this "Remainder(A)" (p. 660).)
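For reference, the quantity AVG\_ENTROPY(A,C,T) can be sketched in code as follows. This is an illustrative implementation of the standard definition (the weighted average, over the values of A, of the entropy of C in each subset), not code you are required to write; the function names and the `(row, count)` table representation are my own.

```python
from math import log2

def entropy(counts):
    """Entropy of a class distribution given as a list of counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)

def avg_entropy(table, attr, cls):
    """Weighted average entropy of cls over the partition induced by attr.

    `table` is a list of (row_dict, count) pairs, where row_dict maps
    attribute names to values and count is the number of instances.
    """
    total = sum(n for _, n in table)
    result = 0.0
    for v in {row[attr] for row, _ in table}:
        subset = [(row, n) for row, n in table if row[attr] == v]
        weight = sum(n for _, n in subset) / total
        class_counts = {}
        for row, n in subset:
            class_counts[row[cls]] = class_counts.get(row[cls], 0) + n
        result += weight * entropy(list(class_counts.values()))
    return result
```

ID3 splits on the attribute with the smallest average entropy; at each stage of your trace, report this value for every remaining attribute.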