Lab 5 notes

Slides from Lab 5 can be found here: lab5_slides.pdf

Code demonstration

Demonstration of learning decision trees with scikit-learn and displaying them as PDFs
(also includes an example of imputing missing values with the population mean).
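
The workflow described above can be sketched roughly as follows. This is not the lab's actual demonstration code: the toy data, the feature names f0/f1, and the output filename tree.dot are all made up for illustration. It assumes a scikit-learn version that provides SimpleImputer (0.20 or later).

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Toy data (hypothetical, not the lab data): np.nan marks a missing value.
X = np.array([
    [1.0, 2.0],
    [np.nan, 1.0],
    [3.0, np.nan],
    [4.0, 5.0],
    [5.0, 6.0],
    [6.0, 4.0],
])
y = np.array([0, 0, 0, 1, 1, 1])

# Replace each missing value with the mean of its column.
imputer = SimpleImputer(strategy="mean")
X_filled = imputer.fit_transform(X)

# Learn a decision tree on the imputed data.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(X_filled, y)

# Export the tree in Graphviz .dot format (returned as a string when out_file=None).
dot_source = export_graphviz(clf, out_file=None, feature_names=["f0", "f1"])
with open("tree.dot", "w") as f:  # hypothetical output filename
    f.write(dot_source)
```

The resulting .dot file is plain text and can be rendered with Graphviz.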

Outputs are written as .dot files, which can be converted to PDF with the following command (requires Graphviz):
dot -Tpdf input.dot > output.pdf
Here input.dot is a placeholder for the .dot file to convert.

Example trees learned on the Pima Indians dataset
The data is available from the UCI repository. (For interest, see the Wikipedia page about the Pima people.)

Controlling the max depth:
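
A minimal sketch of how the depth limit might be set in scikit-learn, assuming the max_depth parameter of DecisionTreeClassifier is what the lab used. Synthetic data from make_classification stands in for the Pima data here, so the printed numbers will not match the lab's trees.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: not the lab dataset).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

depths = []
for max_depth in (1, 2, 3):
    # max_depth caps the number of levels of splits in the tree.
    clf = DecisionTreeClassifier(max_depth=max_depth, random_state=0).fit(X, y)
    depths.append(clf.get_depth())
    print(max_depth, clf.get_depth(), clf.get_n_leaves())
```

A smaller max_depth gives a smaller tree that is easier to read as a PDF, usually at the cost of a looser fit to the training data.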

Controlling the min samples per leaf:
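
A minimal sketch of the leaf-size constraint, assuming the min_samples_leaf parameter of DecisionTreeClassifier is what the lab used. The data is synthetic and the value 20 is an arbitrary choice for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption: not the lab dataset).
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Every leaf must contain at least 20 training samples.
clf = DecisionTreeClassifier(min_samples_leaf=20, random_state=0).fit(X, y)

# Leaves are the nodes without children (children_left == -1).
tree = clf.tree_
is_leaf = tree.children_left == -1
leaf_sizes = tree.n_node_samples[is_leaf]
print(leaf_sizes.min(), clf.get_n_leaves())
```

Raising min_samples_leaf stops the tree from carving out leaves that cover only a handful of samples, which tends to produce fewer and larger leaves.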