Introduction To Machine Learning
Fall 2012
Overview Machine learning is an exciting and fastmoving field of
Computer Science with many recent consumer applications
(e.g., Kinect, Google Translate, Siri, digital camera face
detection, Netflix recommendations) and applications
within the sciences and medicine (e.g., predicting
proteinprotein interactions, species modeling, detecting
tumors, personalized medicine). In this
undergraduatelevel class, students will learn about the
theoretical foundations of machine learning and how to
apply machine learning to solve new problems. 

General information Lectures: Tuesday and Thursday,
11am12:15pm
Office hours: Tuesday
56pm and by appointment. Location: 715 Broadway, 12th floor, Room
1204 Grading: problem sets (50%) + midterm
exam (25%) + project (20%) + participation (5%). Problem Set policy Books: No textbook is required for this class, but students may find it helpful to purchase one of the following books. Bishop's book is much easier to read, whereas Murphy's book has substantially more depth and coverage (and is up to date).
Mailing list: To subscribe to the class list,
follow instructions here. 
Schedule
Note: the Bishop and
Murphy readings suggested below are optional.
Lecture  Date  Topic  Required reading  Assignments 
1  Sept 4 (Tues) 
Overview [Slides]

Chapter
1 of Murphy's book Bishop, Chapter 1 (optional) 

2  Sept 6 (Th) 
Introduction to learning [Slides] Loss functions, Perceptron algorithm 
Review notes on probability,
linear algebra, and calculus 

3 
Sept 11 (Tues) 
Linear classifiers [Slides] Proof of perceptron mistake bound, introduction to Support vector machines 
Notes
on perceptron mistake bound (just section 1) Notes on support vector machines (sections 14) Bishop, Section 4.1.1 (pg. 181182) and Chapter 7 (pg. 325328) Murphy, Section 14.5.2 (pg. 498501) 
ps1 (data) due Sept 25 at
11am [Solutions] 
4 
Sept 13 (Th) 
Support vector machines [Slides] 
See above. Also: Bishop, Sections 7.1.1 and 7.1.3 

5 
Sept 18 (Tues) 
Support vector machines (continued) [Slides] Derivation of SVM dual, introduction to kernels 
Optional: Hastie (book online for free
 see below), Section 4.5 (pg. 129) 

6 
Sept 20 (Th) 
Kernel methods [Slides]  Notes
on kernel methods (sec. 38) Bishop, Section 6.2, Section 7.1 (except for 7.1.4), and Appendix E Murphy, Chapter 14 (except 14.4 and 14.7) Optional: For more on SVMs, see Hastie, Sections 12.112.3 (pg. 435). For more on crossvalidation see Hastie, Section 7.10 (pg. 250). Optional: For more advanced kernel methods, see chapter 3 of this book (free online from MIT libraries) 

7 
Sept 25 (Tues) 
Kernel methods & optimization Mercer's theorem, convexity 
Lecture
notes 

8 
Sept 27 (Th) 
Learning theory [Slides] Generalization of finite hypothesis spaces 
These have only highlevel overviews:  Murphy, Section 6.5.4 (pg. 209)  Bishop, Section 7.1.5 (pg. 344) 

9 
Oct 2 (Tues) 
Learning theory (continued) [Slides] VCdimension 
Notes
on learning theory 
ps2, due Oct 8 by 4pm [Solutions] 
10 
Oct 4 (Th) 
Nearest neighbor methods [Slides] Also marginbased generalization 
Notes
on gaptolerant classifiers (section 7.1, pg. 2931) Hastie et al., Sections 13.313.5 (on nearest neighbor methods) 

11 
Oct 9 (Tues) 
Decision trees [Slides] 
Bishop, Section 14.4 (pg. 663) Murphy, Section 16.2 

12 
Oct 11 (Th) No class on Oct 16 (Fall recess) 
Ensemble methods, Boosting [Slides] 
Hastie et al., Section 8.7 (bagging)
and Sections 10.110.6 (boosting) Boosting overview (except sections 6 and 8) Bishop, Sections 14.2 & 14.3 Murphy, Section 16.4 Optional: Hastie et al. Chapter 15 (on random forests) 

13 
Oct 18 (Th) 
Boosting (continued) [Slides] 
A
Few Useful Things to Know About Machine Learning 
ps3, due Oct 25 at 11am 
Oct 23 (Tues) 
Midterm exam


14 
Oct 25 (Th) 
Clustering [Slides] Kmeans, Agglomerative clustering 
Hastie et al., Sections 14.3.6,
14.3.8, 14.3.9, 14.3.12 Murphy, Sections 25.1, 25.525.5.3 Bishop, Section 9.1 (pg. 424) 
ps4 (2 page project proposal), due Nov. 6 at
5pm by email 
Oct 30 (Tues) 
Class cancelled due to Hurricane Sandy 

Nov 1 (Th) 
Class cancelled due to Hurricane Sandy 

15 
Nov 6 (Tues) 
Clustering (continued)
[Slides] Spectral clustering 
Hastie et al., Section 14.5.3 Optional: Tutorial on spectral clustering Murphy, Section 25.4 

16 
Nov 8 (Th) 
Introduction to Bayesian methods
[Slides] Probability, decision theory 
Murphy, Sections 33.3 Bishop, Sections 22.3.4 

17 
Nov 13 (Tues) 
Naive Bayes [Slides] 
Murphy, Sections 3.4, 3.5 (naive Bayes), 5.7
(decision theory) Bishop, Section 1.5 (decision theory) 

18 
Nov 15 (Th) 
Logistic regression [Slides] 
Notes
on naive Bayes and logistic regression Murphy, 88.3 (logistic reg.), 8.6 (generative vs. discriminative) Bishop, 4.24.3.4 (logistic reg.) 
ps5, due Nov 27 at 11am [Solutions] 
19 
Nov 20 (Tues) No class on Nov 22 (Thanksgiving) 
Logistic Regression (continued)
[Slides] 

20 
Nov 27 (Tues) 
Linear Regression [Slides] 
Hastie et al., Chapter 3 (pages
4347) Notes on linear regression (see Part 1) Murphy, Section 77.5 Bishop, Sections 33.2 

21 
Nov 29 (Th) 
Mixture models, EM algorithm
[Slides] 
Notes
on mixture models Notes on Expectation Maximization Murphy, 1111.4.2.5, Section 11.4.7 Bishop, Sections 9.2, 9.3, 9.4 

22 
Dec 4 (Tues) 
EM algorithm (continued) [Slides] 
ps6, due Dec 11 at 11am  
23 
Dec 6 (Th) 
Hidden Markov models [Slides] 
Notes
on HMMs Tutorial on HMMs Introduction to Bayesian networks Murphy, Chapter 17 Bishop, Sections 8.4.1, 13.12 

24 
Dec 11 (Tues) 
Factor analysis & dimensionality reduction (PCA) [Slides]  Notes
on PCA More notes on PCA Bishop, Sections 12.1 (PCA), 12.4.1 (ICA) Optional: Barber, Chapter 15 

25 
Dec 13 (Th) 
Dimensionality reduction (continued)
[Slides] Latent Dirichlet allocation 
Review
article on topic modeling 

Dec 18 (Tues) 
Project
presentations 10  11:50am, WWH 317 
Final
projects due Dec 16 at 5pm (electronically) 
Reference materials

PrerequisitesThis is an undergraduatelevel course. Students should be very comfortable with basic mathematical skills in addition to good programming skills. Some knowledge of probability theory and statistics, linear algebra, and multivariable calculus will be helpful. Basic Algorithms (CSCIUA.0310) is a prerequisite, although well qualified students taking it during Fall 2012 will be accepted with permission of the instructor. Please contact the professor with any additional questions. 
Problem
Set policy I expect you to try solving each problem set on your own. However, when being stuck on a problem, I encourage you to collaborate with other students in the class, subject to the following rules:
