G22-3033-014, Spring 2004:
Machine Learning and Pattern Recognition

Graduate course on machine learning, pattern recognition, neural nets, and statistical modeling.

Instructor: Yann LeCun, 715 Broadway, Room 706, x83283, yann [ a t ] cs.nyu.edu

Teaching Assistant: Fu Jie Huang, 715 Broadway, Room 705, jhuangfu [ a t ] cs.nyu.edu

Classes: Wednesdays 5:00-6:50PM, Room 101, Warren Weaver Hall.

Office Hours for Prof. LeCun: Thursdays 2:00-4:00 PM

Office Hours for TA Fu-Jie Huang: Tuesdays 2:00-4:00 PM

Course Description

This course will cover a wide variety of topics in machine learning, pattern recognition, statistical modeling, and neural computation. The course will cover the mathematical methods and theoretical aspects, but will primarily focus on algorithmic and practical issues.

Machine Learning and Pattern Recognition methods are at the core of many recent advances in "intelligent computing". Current applications include machine perception (vision, audition), control (process control, robotics), data mining, time-series prediction (e.g. in finance), natural language processing, text mining and text classification, bio-informatics and computational models of biological processes, and many other areas.

The topics studied in the course include:

the basics of inductive inference, learning, and generalization.

linear classifiers: perceptron, LMS, logistic regression.

non-linear classifiers with linear parameterizations: basis-function methods, boosting, support vector machines.

multilayer neural networks, backpropagation.

heterogeneous learning systems.

graph-based models for sequences: hidden Markov models, finite-state transducers, recurrent networks.

unsupervised learning: density estimation, clustering, and dimensionality reduction methods.

energy-based models and probabilistic models.

introduction to graphical models.

approximate inference, sampling.

optimization methods in learning: gradient-based methods, second-order methods, Expectation-Maximization.

objective functions: maximum likelihood, MAP, discriminative criteria.

the bias-variance dilemma, regularization, model selection.

applications in vision, speech, language, forecasting, and biological modeling.
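
As an illustration of the simplest topic in the list above (linear classifiers), here is a minimal perceptron sketch in Python. It is not taken from the course material: the function names `train_perceptron` and `predict`, the epoch limit, and the toy data in the usage note are assumptions chosen for illustration only.

```python
import random


def train_perceptron(samples, labels, epochs=100, seed=0):
    """Train a perceptron on labels in {-1, +1}.

    Returns (weights, bias). Training stops early once a full pass
    over the data produces no mistakes.
    """
    rng = random.Random(seed)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the samples in random order
        mistakes = 0
        for i in order:
            x, y = samples[i], labels[i]
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * activation <= 0:  # misclassified, or exactly on the boundary
                # Perceptron update: move the hyperplane toward the sample.
                weights = [w + y * xi for w, xi in zip(weights, x)]
                bias += y
                mistakes += 1
        if mistakes == 0:
            break
    return weights, bias


def predict(weights, bias, x):
    """Return +1 or -1 depending on the side of the hyperplane x falls on."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else -1
```

For example, calling `train_perceptron([[0, 0], [0, 1], [1, 0], [1, 1]], [-1, -1, -1, 1])` learns the logical AND function, which is linearly separable, so the update rule is guaranteed to converge on it. On data that is not linearly separable the loop simply stops after `epochs` passes.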

Prerequisites

Linear algebra, vector calculus, elementary statistics and probability theory. Good programming ability is a must: many assignments will involve implementing algorithms studied in class.

The course will include a short tutorial on the Lush language, a simple interpreted language for numerical applications.

Programming projects may be implemented in any language (C, C++, Java, Matlab, Lisp, Python, ...), but the use of a high-level interpreted language with good numerical support and good support for vector/matrix algebra is highly recommended (Lush, Matlab, Octave, ...).

Text Books

T. Hastie, R. Tibshirani, and J. Friedman: "Elements of Statistical Learning", Springer-Verlag, 2001.

I will not follow this book very closely. In particular, much of the material covered in the second half of the course cannot be found in the above book. I will refer to research papers and lecture notes for those topics.

Either one of the following books is also recommended, but not absolutely required (you can get a copy from the library):

C. Bishop: "Neural Networks for Pattern Recognition", Oxford University Press, 1996. [quite good, if a little dated].
Richard O. Duda, Peter E. Hart, David G. Stork: "Pattern Classification" Wiley-Interscience; 2nd edition, October 2000. [good reference because it covers a lot of topics].

Other Books of Interest

S. Haykin: "Neural Networks, a comprehensive foundation", Prentice Hall, 1999 (second edition).
Tom Mitchell: "Machine Learning", McGraw Hill, 1997.

Links

Code

Lush: A simple language for quick implementation of, and experimentation with, numerical algorithms (for Linux and Windows/Cygwin). Many algorithms described in this course are implemented in the Lush library.
Torch: A C++ library for machine learning.

Lush is installed on the department's PCs. It will soon be available on the Sun network as well.

Papers

Some of these papers are available in the DjVu format. Viewers and plugins for Windows, Linux, Mac, and various Unix flavors are available here.

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-Based Learning Applied to Document Recognition," Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998. [PS.GZ] [DjVu]
Y. LeCun, L. Bottou, G. Orr, and K. Muller, "Efficient BackProp," in Neural Networks: Tricks of the Trade, (G. Orr and K. Muller, eds.), 1998. [PS.GZ] [DjVu]
P. Simard, Y. LeCun, J. Denker, and B. Victorri, "Transformation Invariance in Pattern Recognition, Tangent Distance and Tangent Propagation," in Neural Networks: Tricks of the Trade, (G. Orr and K. Muller, eds.), 1998. [PS.GZ] [DjVu]

Publications, Journals

Web site of the textbook The Elements of Statistical Learning: Data Mining, Inference, and Prediction, by Trevor Hastie, Robert Tibshirani, and Jerome Friedman.
NIPS Online: Volumes 0-13 of the Neural Information Processing Systems Conference.
Journal of Machine Learning Research.
Neural Computation.

Conference Sites

Neural Information Processing Systems Conference.

Datasets

Datasets from the textbook "Elements of Statistical Learning".
The MNIST data: a database of handwritten digits with 60,000 training samples and 10,000 test samples. The data is in a Lush-readable format.

Demos and Pretty Pictures

Demo of Convolutional Nets for handwriting recognition.

More demos are available here.