Due: Nov. 3.

**Answer:** According to the Naive Bayes formula,

Prob(Play=Yes | Sunny, Cool, High, Weak) is proportional to

Prob(Play=Yes) Prob(Sunny|Play=Yes) Prob(Cool|Play=Yes) Prob(High|Play=Yes)
Prob(Weak|Play=Yes) =

(9/14) * (2/9) * (3/9) * (3/9) * (6/9) = 0.0106

Prob(Play=No | Sunny, Cool, High, Weak) is proportional to

Prob(Play=No) Prob(Sunny|Play=No) Prob(Cool|Play=No) Prob(High|Play=No)
Prob(Weak|Play=No) =

(5/14) * (3/5) * (1/5) * (4/5) * (2/5) = 0.0137

Thus Play=No is more likely than Play=Yes. To find the constant of proportionality (normalizing factor) and recover the true probabilities, add the two products together, giving 0.0243, and divide each product by this sum. Using the unrounded products, this gives

Prob(Play=Yes | Sunny, Cool, High, Weak) = 0.4355

Prob(Play=No | Sunny, Cool, High, Weak) = 0.5645
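
The arithmetic above can be checked with exact fractions. The following is a minimal sketch in Python, not part of the assignment: the names `factors`, `scores`, and `posterior` are illustrative, and the fractions are copied directly from the two products above.

```python
from fractions import Fraction as F

# Prior followed by the four conditional probabilities, per class,
# copied from the two products in the answer above.
factors = {
    "Yes": [F(9, 14), F(2, 9), F(3, 9), F(3, 9), F(6, 9)],
    "No": [F(5, 14), F(3, 5), F(1, 5), F(4, 5), F(2, 5)],
}


def product(values):
    """Multiply a list of fractions together."""
    result = F(1)
    for value in values:
        result *= value
    return result


# Unnormalized Naive Bayes scores, one per class.
scores = {label: product(vals) for label, vals in factors.items()}

# Normalize so the two posteriors sum to 1.
total = sum(scores.values())
posterior = {label: score / total for label, score in scores.items()}

for label in scores:
    print(label, float(scores[label]), float(posterior[label]))
```

Because the sketch normalizes the unrounded products, its posteriors agree with the hand computation up to rounding in the last decimal place.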

Suppose that your original data set has two attributes. The predictive attribute is "Outlook", which has values "Sunny", "Overcast", and "Rainy"; the classification attribute is "Temperature", which has integer values in degrees Fahrenheit. Thus, a typical instance (row) in the data set might be "Sunny; 72".

You are considering two different discretization schemes:

- Scheme S1 divides the temperature into "Cold" (below 40), "Cool" (40-59), "Temperate" (60-75), and "Hot" (above 75).
- Scheme S2 divides the temperature into "Freezing" (below 32), "Chilly" (32-55), "Mild" (56-70), "Warm" (71-85), and "Broiling" (above 85).

Construct a data set of outlooks and integer temperatures with the following property: if the temperatures are discretized using S1, then Naive Bayes, given "Sunny", will predict "Hot"; but if they are discretized using S2, then Naive Bayes, given "Sunny", will predict "Freezing".

**Answer:** Suppose that the data set is

| Temperature | Outlook |
| --- | --- |
| 100 | Sunny |
| 90 | Sunny |
| 80 | Sunny |
| 77 | Sunny |
| 30 | Sunny |
| 20 | Sunny |
| 10 | Sunny |

If we discretize using S1, then we have

Prob(Hot | Sunny) = Prob(Sunny | Hot) Prob(Hot) / Prob(Sunny) = 1 * (4/7) / 1 = 4/7.

Prob(Cold | Sunny) = Prob(Sunny | Cold) Prob(Cold) / Prob(Sunny) = 1 * (3/7) / 1 = 3/7.

The probabilities of the other temperature ranges are all 0, so the prediction is Hot.

If we discretize using S2, then we have

Prob(Broiling | Sunny) = Prob(Sunny | Broiling) Prob(Broiling) / Prob(Sunny) = 1 * (2/7) / 1 = 2/7.

Prob(Warm | Sunny) = Prob(Sunny | Warm) Prob(Warm) / Prob(Sunny) = 1 * (2/7) / 1 = 2/7.

Prob(Freezing | Sunny) = Prob(Sunny | Freezing) Prob(Freezing) / Prob(Sunny) = 1 * (3/7) / 1 = 3/7.

The probabilities of the other temperature ranges are all 0, so the prediction is Freezing.
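
The two discretizations can also be checked mechanically. The sketch below is one possible way to do it, not the assignment's required method: the helper names `S1`, `S2`, `discretize`, and `posterior` are assumptions introduced here, and the cut points assume integer temperatures, as the problem states.

```python
from bisect import bisect_right
from collections import Counter
from fractions import Fraction

# Each scheme is (cut points, labels). A temperature t falls in bin i,
# where i is the number of cut points <= t. Integer temperatures assumed.
S1 = ([40, 60, 76], ["Cold", "Cool", "Temperate", "Hot"])
S2 = ([32, 56, 71, 86], ["Freezing", "Chilly", "Mild", "Warm", "Broiling"])


def discretize(temp, scheme):
    """Map an integer temperature to its bin label under a scheme."""
    cuts, labels = scheme
    return labels[bisect_right(cuts, temp)]


# (temperature, outlook) rows from the data set in the answer.
data = [(100, "Sunny"), (90, "Sunny"), (80, "Sunny"), (77, "Sunny"),
        (30, "Sunny"), (20, "Sunny"), (10, "Sunny")]


def posterior(data, scheme, outlook):
    """Naive Bayes posterior over temperature bins, given one outlook.

    Bins that never occur in the data are omitted; their probability is 0.
    """
    n = len(data)
    class_counts = Counter(discretize(t, scheme) for t, _ in data)
    scores = {}
    for label, n_label in class_counts.items():
        n_match = sum(1 for t, o in data
                      if discretize(t, scheme) == label and o == outlook)
        # Prob(label) * Prob(outlook | label)
        scores[label] = Fraction(n_label, n) * Fraction(n_match, n_label)
    z = sum(scores.values())
    return {label: score / z for label, score in scores.items()}


for name, scheme in (("S1", S1), ("S2", S2)):
    post = posterior(data, scheme, "Sunny")
    print(name, max(post, key=post.get), post)
```

Under these assumptions the sketch selects Hot for S1 and Freezing for S2, matching the hand computation.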

**Extra credit:** Suppose that we add a second discrete predictive
attribute, say "Day of the Week", with values "Sunday", "Monday", etc.
Describe a data set over "Outlook", "Day" and "Temperature" with the
following property: given "Sunny" and "Monday", if the temperature is
discretized using S1, then Naive Bayes will predict that it is 99%
sure that the temperature is Hot, but if the temperature is discretized
using S2, then Naive Bayes will predict that it is 99% sure that the
temperature is Freezing.

This is trickier. One data set is as follows:

| Temperature | Day of Week | Outlook | Number of instances |
| --- | --- | --- | --- |
| 30 | Monday | Sunny | 100 |
| 35 | Tuesday | Overcast | 1,000,000 |
| 90 | Monday | Sunny | 100 |
| 90 | Tuesday | Overcast | 9,900 |

If we discretize using S1, then Prob(Hot | Monday, Sunny) is proportional to

Prob(Hot) Prob(Monday | Hot) Prob(Sunny | Hot) = (10,000/1,010,100) * (100/10,000) * (100/10,000) ≈ $10^{-6}$.

Prob(Cold | Monday, Sunny) is proportional to

Prob(Cold) Prob(Monday | Cold) Prob(Sunny | Cold) = (1,000,100/1,010,100) * (100/1,000,100) * (100/1,000,100) ≈ $10^{-8}$.

The probability of all other temperature ranges is 0. To normalize, we divide by $10^{-6}+10^{-8}$, giving

Prob(Hot | Monday, Sunny) ≈ 0.99

Prob(Cold | Monday, Sunny) ≈ 0.01

On the other hand, if we discretize using S2, we get

Prob(Broiling | Monday, Sunny) is proportional to

Prob(Broiling) Prob(Monday | Broiling) Prob(Sunny | Broiling) = (10,000/1,010,100) * (100/10,000) * (100/10,000) ≈ $10^{-6}$.

Prob(Freezing | Monday, Sunny) is proportional to

Prob(Freezing) Prob(Monday | Freezing) Prob(Sunny | Freezing) = (100/1,010,100) * 1 * 1 ≈ $10^{-4}$.

The probability of all other temperature ranges is 0. In particular, Chilly gets probability 0 even though it contains 1,000,000 instances, because none of them is a Monday, so Prob(Monday | Chilly) = 0. To normalize, we divide by $10^{-4}+10^{-6}$, giving

Prob(Freezing | Monday, Sunny) ≈ 0.99

Prob(Broiling | Monday, Sunny) ≈ 0.01
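
The extra-credit data set can be verified the same way, now with two predictive attributes and a count per row. This is a hedged sketch rather than a reference solution: `S1`, `S2`, `discretize`, `rows`, and `posterior` are names assumed here for illustration, and the cut points again assume integer temperatures.

```python
from bisect import bisect_right
from fractions import Fraction

# Each scheme is (cut points, labels); integer temperatures assumed.
S1 = ([40, 60, 76], ["Cold", "Cool", "Temperate", "Hot"])
S2 = ([32, 56, 71, 86], ["Freezing", "Chilly", "Mild", "Warm", "Broiling"])


def discretize(temp, scheme):
    """Map an integer temperature to its bin label under a scheme."""
    cuts, labels = scheme
    return labels[bisect_right(cuts, temp)]


# (temperature, day, outlook, number of instances) from the table above.
rows = [
    (30, "Monday", "Sunny", 100),
    (35, "Tuesday", "Overcast", 1_000_000),
    (90, "Monday", "Sunny", 100),
    (90, "Tuesday", "Overcast", 9_900),
]


def posterior(rows, scheme, day, outlook):
    """Naive Bayes posterior over temperature bins, given day and outlook.

    Bins that never occur in the data are omitted; their probability is 0.
    """
    total = sum(n for *_, n in rows)
    labels = {discretize(t, scheme) for t, *_ in rows}
    scores = {}
    for label in labels:
        in_bin = [(d, o, n) for t, d, o, n in rows
                  if discretize(t, scheme) == label]
        n_bin = sum(n for _, _, n in in_bin)
        n_day = sum(n for d, _, n in in_bin if d == day)
        n_outlook = sum(n for _, o, n in in_bin if o == outlook)
        # Prob(label) * Prob(day | label) * Prob(outlook | label)
        scores[label] = (Fraction(n_bin, total)
                         * Fraction(n_day, n_bin)
                         * Fraction(n_outlook, n_bin))
    z = sum(scores.values())
    return {label: score / z for label, score in scores.items()}


for name, scheme in (("S1", S1), ("S2", S2)):
    post = posterior(rows, scheme, "Monday", "Sunny")
    print(name, {label: float(p) for label, p in post.items()})
```

Working with exact fractions shows that the powers of ten in the hand computation are approximations, but the normalized posteriors still come out at about 0.99 for Hot under S1 and about 0.99 for Freezing under S2.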