This page contains the schedule, slides from the lectures, lecture notes, reading lists,
assignments, and web links.
I urge you to download the DjVu viewer
and view the DjVu versions of the documents below. They display faster,
are of higher quality, and generally have smaller file sizes than the PS and PDF versions.
Full-text search is provided for the entire
collection of slides and papers. Click here to search.
01/21: Introduction and basic concepts |
Subjects treated: Intro, types of learning, nearest neighbor, how biology does it,
linear classifier, perceptron learning procedure, linear regression,
training/test, capacity, overfitting, regularization, Occam's Razor, MDL.
Slides: [DjVu | PDF | PS]
Required Reading:
- Hastie/Tibshirani/Friedman: Chapter 2
Optional Reading:
- Refresher on random variables and probabilities by
Andrew Moore: (slides 1-27) [DjVu | PDF]
- Refresher on joint probabilities, Bayes theorem by
Chris Williams: [DjVu | PDF]
- Refresher on statistics and probabilities by
Sam Roweis: [DjVu | PS]
- If you are interested in the early history of self-organizing
systems and cybernetics, have a look at this book available from the
Internet Archive's Million Book Project: Self-Organizing
Systems, proceedings of a 1959 conference edited by Yovits and
Cameron (DjVu viewer required for full text).
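As a quick illustration of the nearest-neighbor rule listed in the subjects above, here is a minimal sketch in Python/NumPy. The course assignments use Lush, so this fragment (and its function name) is illustrative only:

    import numpy as np

    def nearest_neighbor_predict(X_train, y_train, x):
        # return the label of the training point closest to x (Euclidean distance)
        dists = np.sum((X_train - x) ** 2, axis=1)
        return y_train[np.argmin(dists)]

    X_train = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
    y_train = np.array([0, 1, 1])
    print(nearest_neighbor_predict(X_train, y_train, np.array([0.2, 0.1])))  # prints 0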
01/28: Probability Theory, Bayes Inversion, Bayes Decision Rule |
Subjects treated: Refreshers on probability theory.
Bayes decision rule, naive Bayes classifier, logistic regression.
Refresher on multivariate calculus and optimization.
Slides:
- Refresher on probability and statistics: [DjVu | PDF | PS]
- Optimization (pages 1-5): [DjVu]
Required Reading:
- Hastie/Tibshirani/Friedman: Sections 3.1, 3.2, 3.4.1 to 3.4.3, 4.1, 4.2, 4.4, 4.5
Optional Reading:
- Duda/Hart/Stork: Sections 5.1 to 5.8
- Bishop: Chapter 3
- Hastie/Tibshirani/Friedman: Sections 3.3, 4.3
- Paper on logistic regression by
Michael Jordan: [DjVu | PS]
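To make the Bayes decision rule from this lecture concrete: given class priors P(y) and class-conditional densities p(x|y), classify x as the class with the largest posterior, i.e. the largest p(x|y)P(y), since the normalizer p(x) is the same for all classes. A minimal sketch with two 1-D Gaussian classes (illustrative only; not part of the course materials):

    import numpy as np
    from scipy.stats import norm

    priors = {0: 0.7, 1: 0.3}                              # P(y)
    likelihoods = {0: norm(0.0, 1.0), 1: norm(2.0, 1.0)}   # p(x|y)

    def bayes_decide(x):
        # pick the class maximizing p(x|y) * P(y)
        scores = {y: likelihoods[y].pdf(x) * priors[y] for y in priors}
        return max(scores, key=scores.get)

    print(bayes_decide(0.5))   # -> 0 (close to class 0's mean, and class 0 has the larger prior)
    print(bayes_decide(2.5))   # -> 1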
02/04: MLE, MAP, Energy Functions |
Subjects treated: Bayesian Estimation, Maximum Likelihood
Estimation, MAP Estimation, Loss Functions and Energy-Based models.
Probability, Entropy, Energy, and Free Energy. Introduction to Lush.
Slides: [DjVu | PDF | PS]
Required Reading:
Optional Reading:
Homework Assignments: (see next lecture)
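As a worked example of maximum-likelihood estimation from this lecture: for data assumed i.i.d. Gaussian, maximizing the log-likelihood gives the sample mean and the (biased) sample variance in closed form. A quick check in Python/NumPy (illustrative only):

    import numpy as np

    x = np.array([1.2, 0.7, 1.9, 1.4, 0.8])

    # ML estimates for a 1-D Gaussian: mu = mean(x), sigma^2 = mean((x - mu)^2)
    mu_ml = x.mean()
    sigma2_ml = np.mean((x - mu_ml) ** 2)   # divides by N, not N-1

    print(mu_ml, sigma2_ml)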
02/11: Gradient-Based Learning I: Beyond Linear Classifiers |
Subjects treated: Intro to Gradient-Based Learning.
Limitations of linear classifiers. Basis function expansion, polynomial classifiers,
kernel expansion, RBF Networks, Simple multi-layer neural nets.
Optimization and the convergence of gradient-based learning.
Slides:
- Intro to gradient-based learning: [DjVu | PDF | PS]
- Optimization (pages 5-end): [DjVu]
Required Reading:
- Gradient-based Learning Applied to Document Recognition by LeCun,
Bottou, Bengio, and Haffner, pages 1-5 (Introduction):
[ DjVu | .ps.gz ]
- Efficient Backprop, by LeCun, Bottou, Orr, and Muller, Sections 1-5:
[ DjVu | .ps.gz ]
Homework Assignments: implementing the Perceptron
Algorithm, MSE Classifier (linear regression), and Logistic Regression
(a short illustrative sketch appears after the instructions).
Details and datasets below:
- Download this tar.gz archive. It
contains the datasets and the homework description.
- Decompress it with "tar xvfz homework-01.tgz" on Unix/Linux or
with WinZip on Windows.
- The file homework01.txt contains the questions and instructions.
- Most of the necessary Lush code is provided.
- Due Date is Wednesday March 3, before the lecture.
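For orientation, here is a minimal sketch of the perceptron learning procedure in Python/NumPy. The assignment itself must be done in Lush with the provided code, so treat this (and its made-up names) as illustrative only:

    import numpy as np

    def train_perceptron(X, y, epochs=100):
        # X: (n, d) inputs; y: labels in {-1, +1}; bias handled via an appended constant feature
        Xb = np.hstack([X, np.ones((X.shape[0], 1))])
        w = np.zeros(Xb.shape[1])
        for _ in range(epochs):
            errors = 0
            for xi, yi in zip(Xb, y):
                if yi * np.dot(w, xi) <= 0:   # misclassified (or on the boundary)
                    w += yi * xi              # perceptron update
                    errors += 1
            if errors == 0:                   # converged: every point classified correctly
                break
        return w

    X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    y = np.array([-1, -1, -1, 1])             # a linearly separable AND-like problem
    w = train_perceptron(X, y)
    print(np.sign(np.hstack([X, np.ones((4, 1))]) @ w))   # [-1. -1. -1.  1.]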
02/18: Gradient-Based Learning II: Multilayer Networks and Back-Propagation |
Subjects treated: Multi-Module learning machines. Vector
modules and switches. Multilayer neural nets. Backpropagation
Learning.
Slides:
- Multilayer nets and backpropagation: [DjVu | PDF | PS]
02/25: Gradient-Based Learning III: Special Architectures |
Subjects treated: Special architectures: RBF nets, mixtures
of experts, parameter-space transforms. Implementation and practical
issues with multi-module/multi-layer learning machines. Intro to
convolutional nets.
Slides:
- Special Architectures: [DjVu | PDF | PS]
- Optimization, Practical Issues with gradient-based training: [DjVu | PDF | PS]
Required Reading:
- Convolutional nets: "Gradient-based Learning Applied to Document Recognition" by LeCun,
Bottou, Bengio, and Haffner, pages 5-18 (up to and including Section IV-B):
[ DjVu | .ps.gz ]
- On the Lagrangian formulation of gradient-based learning:
"A theoretical framework for back-propagation":
[ DjVu | .ps.gz ]
- Efficient Backprop, by LeCun, Bottou, Orr, and Muller, Sections 6-end:
[ DjVu | .ps.gz ]
Optional Reading:
- Multi-module approach and Lagrangian formulation: "A Framework for
the Cooperation of Learning Algorithms" by Bottou and Gallinari:
DjVu.
03/03: Convolutional Nets. Cross-Validation, Model Selection, Learning Theory |
Subjects treated: Invariant Recognition, Feature Learning,
Convolutional Networks and Time-Delay Neural Nets.
Model Selection, Cross-Validation, VC-dimension,
Structural Risk Minimization, Bagging.
Slides:
- Convolutional Nets: no slides; see last week's required reading.
- Model selection, cross-validation, learning theory, bagging: [DjVu | PDF | PS]
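As a quick illustration of K-fold cross-validation from this lecture, here is a minimal sketch in Python/NumPy; the train_fn/error_fn interface is made up and stands for whatever learning algorithm and error measure are being evaluated:

    import numpy as np

    def k_fold_cv_error(X, y, train_fn, error_fn, k=5, seed=0):
        # split the data into k folds; train on k-1 folds, measure error on the held-out fold
        idx = np.random.RandomState(seed).permutation(X.shape[0])
        folds = np.array_split(idx, k)
        errors = []
        for i in range(k):
            val = folds[i]
            train = np.concatenate([folds[j] for j in range(k) if j != i])
            model = train_fn(X[train], y[train])
            errors.append(error_fn(model, X[val], y[val]))
        return np.mean(errors)   # average held-out error, used to compare and select models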
Homework Assignments: implementing Gradient-Based Learning
and back-propagation. You must implement gradient-based learning using
the object-oriented, module-based approach described in class
(a short illustrative sketch appears after the instructions).
Various architectures, including a multilayer neural net, must be
implemented and tested on two datasets.
- Download this tar.gz archive. It
contains the datasets and the homework description.
- Decompress it with "tar xvfz homework-02.tgz" on Unix/Linux or
with WinZip on Windows.
- The file homework-02.txt contains
the questions and instructions.
- Most of the necessary Lush code is provided.
- Due Date is Friday April 2 (NEW NEW DATE!).
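To illustrate the object-oriented, module-based style the homework asks for (each module has a forward pass and a backward pass that propagates gradients), here is a minimal sketch in Python/NumPy. The real assignment is in Lush, and the class and method names below are made up:

    import numpy as np

    class Linear:
        # y = W x + b
        def __init__(self, n_in, n_out):
            self.W = np.random.randn(n_out, n_in) * 0.1
            self.b = np.zeros(n_out)
        def fprop(self, x):
            self.x = x                        # cache the input for the backward pass
            return self.W @ x + self.b
        def bprop(self, dy):
            self.dW = np.outer(dy, self.x)    # dE/dW
            self.db = dy                      # dE/db
            return self.W.T @ dy              # dE/dx, passed to the previous module

    class Tanh:
        def fprop(self, x):
            self.y = np.tanh(x)
            return self.y
        def bprop(self, dy):
            return dy * (1.0 - self.y ** 2)   # chain rule through tanh

    # a two-layer net is a list of modules run forward, then backward in reverse order
    net = [Linear(2, 3), Tanh(), Linear(3, 1)]
    x, target = np.array([0.5, -1.0]), np.array([1.0])
    h = x
    for m in net:
        h = m.fprop(h)
    dy = h - target                           # gradient of the squared error 0.5*||h - target||^2
    for m in reversed(net):
        dy = m.bprop(dy)                      # after this, each Linear holds dW and db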
03/10: Unsupervised Learning |
Subjects treated: Unsupervised Learning: Principal Component
Analysis. Density Estimation: Parzen Windows, Mixtures of Gaussians,
Auto-Encoders. Latent variables and the Expectation-Maximization algorithm.
Slides:
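As a quick illustration of principal component analysis from the subjects above, here is a minimal sketch using the SVD of the centered data matrix, in Python/NumPy (illustrative only):

    import numpy as np

    def pca(X, k):
        # X: (n, d) data matrix; returns the k leading principal directions and the projected data
        Xc = X - X.mean(axis=0)                 # center the data
        U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
        components = Vt[:k]                     # rows are the principal directions
        return components, Xc @ components.T

    X = np.random.randn(200, 5)
    components, Z = pca(X, 2)
    print(components.shape, Z.shape)            # (2, 5) (200, 2)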
Spring break: NO CLASS.
03/24: Guest Lecture by Prof. Lawrence Saul: Dimensionality Reduction |
Subjects treated: Non-Linear Dimensionality Reduction and
Embedding. Guest lecture by Prof. Lawrence Saul
from the University of Pennsylvania.
Slides:
- L. Saul's Lecture Slides on non-linear dimensionality reduction
(caution: the PS and the PDF are over 25MB, the DjVu is 2MB): [DjVu | PDF | PS]
Required Reading: (please read this before the class)
- L. K. Saul and S. T. Roweis (2003). Think globally, fit locally:
unsupervised learning of low dimensional manifolds.
Journal of Machine Learning Research 4:119-155.
[PDF].
Optional Reading:
03/31: Efficient Optimization, Latent Variables, Graph Transformer Networks |
Subjects treated:
Efficient learning: conjugate gradient, Levenberg-Marquardt.
Lagrange Multipliers and Constrained Optimization.
More on latent variables and EM.
Modeling distributions over sequences. Learning machines that
manipulate graphs. Finite-state transducers. Graph Transformer
Networks.
Required Reading:
- Graph Transformer Networks: "Gradient-based Learning Applied to Document Recognition" by LeCun,
Bottou, Bengio, and Haffner, pages 18 (Section IV-C) to the end:
[ DjVu | .ps.gz ]
Homework Assignments: Homework 03: K-Means and Mixture of Gaussians estimation with EM.
- The subject of this homework is to implement the K-means algorithm
and the Expectation-Maximization algorithm for a Mixture of Gaussians model
(a short illustrative sketch appears after the instructions).
The algorithms must be tested on image data for a simulated image
compression task.
- Download this tar.gz archive. It
contains the datasets and the homework description.
- Decompress it with "tar xvfz homework-03.tgz" on Unix/Linux or
with WinZip on Windows.
- The file homework-03.txt contains
the questions and instructions.
- DUE DATE: Friday April 16
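For orientation, a minimal K-means sketch in Python/NumPy; the assignment itself must be done in Lush with the provided code, and the EM algorithm for the mixture of Gaussians follows the same alternating pattern with soft responsibilities instead of hard assignments:

    import numpy as np

    def kmeans(X, k, n_iters=100, seed=0):
        # X: (n, d) data; returns the (k, d) centroids and each point's cluster index
        rng = np.random.RandomState(seed)
        centroids = X[rng.choice(X.shape[0], k, replace=False)]   # start from k random points
        for _ in range(n_iters):
            # assignment step: each point goes to its nearest centroid
            dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            labels = dists.argmin(axis=1)
            # update step: each centroid moves to the mean of the points assigned to it
            new_centroids = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                      else centroids[j] for j in range(k)])
            if np.allclose(new_centroids, centroids):
                break
            centroids = new_centroids
        return centroids, labels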
04/07: Boosting and Support Vector Machines |
This lecture will be given by Prof. Dan Melamed.
Subjects treated: Boosting and Ensemble Methods.
Maximum Margin Classifiers.
Support Vector Machines, Kernel Machines.
Homework Assignments: Final Project
- A list of possible project topics is
available here.
Make a proposal (send an email message to me and
to the TA).
- This project counts heavily toward the final grade.
- Collaboration: you can do your final project in groups of two students.
- Due Date: Friday, May 14. Extensions may be granted for
ambitious projects by students who are not graduating this year.
If you intend to graduate this year, you must hand in your
project by the due date.
04/14: Hidden Markov Models |
Subjects treated: Probabilistic Automata, Distribution
over Sequences, Hidden Markov Models, Inference: Forward-Backward
Algorithm, Learning: Expectation-Maximization algorithm.
Reading:
- Slides of a tutorial on HMM by
Sam Roweis:
[PDF]
- Tutorial article on HMM by Larry Rabiner: [ DjVu | PDF ]
- Tutorial by Mehryar Mohri on Weighted finite-state transducers for
speech recognition: Part I [ PS],
Part II [ PS]
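A minimal sketch of the forward recursion for an HMM (one of the inference algorithms covered in this lecture), in Python/NumPy with made-up variable names, illustrative only:

    import numpy as np

    def hmm_forward(pi, A, B, obs):
        # pi: (S,) initial state distribution
        # A:  (S, S) transitions, A[i, j] = P(s_t = j | s_{t-1} = i)
        # B:  (S, O) emissions,   B[j, o] = P(obs = o | state = j)
        alpha = pi * B[:, obs[0]]             # alpha_1(j) = pi_j * B[j, o_1]
        for o in obs[1:]:
            alpha = (alpha @ A) * B[:, o]     # alpha_t(j) = sum_i alpha_{t-1}(i) A[i, j] * B[j, o_t]
        return alpha.sum()                    # P(o_1, ..., o_T)

    pi = np.array([0.6, 0.4])
    A  = np.array([[0.7, 0.3], [0.4, 0.6]])
    B  = np.array([[0.9, 0.1], [0.2, 0.8]])
    print(hmm_forward(pi, A, B, [0, 1, 0]))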
04/21: Graphical Models, Belief Propagation |
Subjects treated: Intro to graphical models,
Inference, Belief Propagation, Boltzmann Machines.
Required Reading:
- Book by David
MacKay (free download): Information Theory, Inference, and Learning Algorithms
[DjVu],
[PDF].
See chapters 24, 25, 26, and 43.
04/28: Learning, Sampling, and Energy-Based Models |
Subjects treated: Learning in Graphical Models;
Approximate Inference and Sampling, Markov-Chain Monte-Carlo,
Hybrid Monte-Carlo; Energy-Based Models, Contrastive Divergence.
Required Reading:
- Book by David
MacKay (free download): Information Theory, Inference, and Learning Algorithms
[DjVu],
[PDF].
See chapters 29-33.
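A minimal sketch of a Metropolis sampler, one instance of the Markov-chain Monte-Carlo methods covered here, in Python/NumPy; the target is a standard Gaussian specified only up to a normalizing constant (illustrative only):

    import numpy as np

    def metropolis(log_p, x0, n_samples, step=0.5, seed=0):
        # log_p: log of an unnormalized target density; symmetric Gaussian random-walk proposal
        rng = np.random.RandomState(seed)
        x, samples = x0, []
        for _ in range(n_samples):
            x_new = x + step * rng.randn()
            # accept with probability min(1, p(x_new)/p(x)); the normalizer cancels
            if np.log(rng.rand()) < log_p(x_new) - log_p(x):
                x = x_new
            samples.append(x)
        return np.array(samples)

    samples = metropolis(lambda x: -0.5 * x ** 2, x0=0.0, n_samples=20000)
    print(samples.mean(), samples.var())   # should be roughly 0 and 1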