Graduate course on machine learning, pattern recognition, neural nets,
and statistical modeling.
Instructor: Yann LeCun, 715 Broadway, Room 1220, 212-998-3283, yann [ a t ] cs.nyu.edu
Teaching Assistant: Daniel Galron, 715 Broadway, Room 704, 212-998-3496, galron [ a t ] cs.nyu.edu
Classes: Thursdays 5:00-6:50 PM, Room 109, Warren Weaver Hall.
Office Hours for Prof. LeCun: Wednesdays 4:00-6:00 PM.
Please send an email to Prof. LeCun prior to an office hour visit.
Office hours for the TA: Mondays 4:00-6:00 PM, or by appointment.
Click here for schedule and course material >>>
Click here for the list of assignments and final projects >>>
The course covers a wide variety of topics in machine learning,
pattern recognition, statistical modeling, and neural computation. It
covers mathematical methods and theoretical aspects, but focuses
primarily on algorithmic and practical issues.
Machine Learning and Pattern Recognition methods are at the core of
many recent advances in "intelligent computing". Current applications
include machine perception (vision, audition, speech recognition),
control (process control, robotics), data mining, time-series
prediction (e.g. in finance), natural language processing, text mining
and text classification, bio-informatics, neural modeling,
computational models of biological processes, and many other areas.
[Figure: Energy-Based Models]
Who Can Take This Course?
This course can be useful to all students who want to use or
develop statistical modeling methods. This includes students in CS
(AI, Vision, Graphics), Math (System Modeling), Neuroscience (Computational
Neuroscience, Brain Imaging), Finance (Financial modeling and
prediction), Psychology (Vision), Linguistics, Biology (Computational
Biology, Genomics, Bio-informatics), and Medicine (Bio-Statistics,
Epidemiology).
The only formal prerequisites are familiarity with computer
programming and linear algebra, but the course relies heavily on such
mathematical tools as probability and statistics, multivariate
calculus, and function optimization. The basic mathematical
concepts will be introduced when needed, but students will be
expected to assimilate a non-trivial amount of mathematical
material in a fairly short time.
Although this is a graduate-level course, highly motivated
undergraduates at the senior level with a good math background can
take this class. A few juniors have even taken this class
successfully in the past.
The topics studied in the course include:
- the basics of inductive inference, learning, and generalization.
- linear classifiers: perceptron, LMS, logistic regression (a minimal perceptron sketch follows this list).
- non-linear classifiers with linear parameterizations:
basis-function methods, boosting, support vector machines.
- multilayer neural networks, backpropagation.
- heterogeneous learning systems.
- graph-based models for sequences: hidden Markov models,
finite-state transducers, recurrent networks.
- unsupervised learning: density estimation,
clustering, and dimensionality reduction methods.
- introduction to graphical models and factor graphs.
- approximate inference, sampling.
- optimization methods in learning: gradient-based methods,
second-order methods, Expectation-Maximization.
- objective functions: maximum likelihood, maximum a-posteriori,
discriminative criteria, maximum margin.
- the bias-variance dilemma, regularization, model selection.
- applications in vision, speech, language, forecasting,
and biological modeling.
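To make the first of these topics concrete, here is a minimal sketch of
the perceptron learning rule, written in Python (one of the alternative
homework languages accepted below). The function and data are purely
illustrative, not course-provided code:

import numpy as np

def perceptron_train(X, y, epochs=10):
    # X: (n_samples, n_features) inputs; y: labels in {-1, +1}
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            # update weights only when the sample is misclassified
            if yi * (np.dot(w, xi) + b) <= 0:
                w += yi * xi
                b += yi
    return w, b

# usage: learn the logical AND of two binary inputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])
w, b = perceptron_train(X, y)
print(np.sign(X @ w + b))    # prints [-1. -1. -1.  1.]

Since AND is linearly separable, the perceptron convergence theorem
guarantees this loop finds a separating hyperplane in a finite number
of updates.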
By the end of the course, students will be able not only to understand
and use the major machine learning methods, but also to implement, apply,
and analyze them.
The best way (some would say the only way) to understand an algorithm
is to implement it and apply it. Building working systems is also a
lot more fun, more creative, and more relevant than taking formal exams.
Therefore, students will be evaluated primarily on programming projects
given on a three-week cycle, a final project, and an exam.
[Figure: Automatic Face Detection]
Linear algebra, vector calculus, elementary statistics and probability
theory. Good programming ability is a must: most assignments will
consist of implementing algorithms studied in class.
Software for the homeworks
Skeleton code for the homework will be provided in a
simple-to-learn numerical programming language called
Lush.
Lush is installed on the CIMS servers and workstations available for
student use (remotely through ssh or in the computer labs).
Lush only runs on Linux, so you must have a version of Linux installed
on your machine. If you don't already have Linux, you should install
Ubuntu Linux on your machine (instructions below).
Installing Ubuntu Linux
There are several options for installing Ubuntu Linux on a Windows machine:
- Install Ubuntu on your Windows machine without repartitioning your hard drive
using Wubi. Wubi is a Windows app
that installs Ubuntu in a file on your Windows partition (the file is a virtual
hard drive from Linux's point of view). Wubi is not an emulator,
so you have to boot your machine in either Linux or Windows. You can't run
them at the same time.
- Download and install Ubuntu.
The Ubuntu installer offers you the option of shrinking your Windows partition
non-destructively so as to make space for Linux. The installation is
very simple and takes about 20 minutes. It can be done from a CD/DVD or
from a USB flash drive.
- Install Ubuntu through VMware so you can run Windows and Ubuntu
at the same time.
You can also dispense with installing Ubuntu Linux altogether and
simply use the Linux machines at the Courant Institute computer labs.
Installing Lush
After you have installed Ubuntu Linux, you must install a number of
packages using the shell commands below (type your password when prompted):
sudo apt-get install gcc g++ libx11-dev binutils-dev indent libreadline5 libreadline5-dev
sudo apt-get install emacs cvs libgsl0-dev imagemagick
Then, at the shell, do:
cd
cvs -d:pserver:anonymous@lush.cvs.sourceforge.net:/cvsroot/lush login
[type enter when prompted for a password]
cvs -z3 -d:pserver:anonymous@lush.cvs.sourceforge.net:/cvsroot/lush co lush
cd lush
./configure
make
This will download and compile Lush in your home directory.
The Lush executable is in lush/bin/i686-pc-linux-gnu/lush.
For convenience, you can link this executable to some place in your
shell search path, for example with:
cd
mkdir bin
cd bin
ln -s ../lush/bin/i686-pc-linux-gnu/lush .
assuming that the bin directory in your home directory is in your shell's
search path (if it is not, add export PATH="$HOME/bin:$PATH" to your ~/.bashrc).
The alternative is to do
cd
cd lush
sudo make install
to perform a system-wide installation.
The course will include a short tutorial on the
Lush language, a simple
interpreted language for numerical applications.
Alternatives to Lush
If you really want, you can implement the homeworks in another
language (C, C++, Java, Matlab, Lisp, Python, ...), but the use of a
high-level interpreted language with good numerical support and good
support for vector/matrix algebra is highly recommended (Lush, Matlab,
Octave, ...). Some assignments require the use of an object-oriented
language, which rules out some of these options.
If you decide to use a language other than Lush, you will be doing
a lot more work than necessary (particularly for homework 2).
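For a sense of why a matrix-oriented language saves work, here is a
hypothetical few-line least-squares (LMS-style) fit in Python with
NumPy; the data and variable names are made up for illustration:

import numpy as np

# generate a small synthetic regression problem
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))              # 100 samples, 3 features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.01 * rng.standard_normal(100)

# one vectorized call replaces an explicit loop over samples
w, *_ = np.linalg.lstsq(X, y, rcond=None)      # closed-form least-squares fit
print(w)                                       # approximately [ 2.  -1.   0.5]

In a low-level language, the same fit would require hand-written matrix
routines or an external linear algebra library.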
[Figure: Invariant Object Recognition]
Register for the course's mailing list.
Pattern Recognition and Machine Learning, by Christopher
Bishop. Springer, August 2006.
NOTES: (1) The publisher just released the first edition of this book on
August 28, 2006, so you might have to wait a few days before getting
it; (2) this is a different book from past years; (3) this is not
Bishop's previous book, Neural Networks for Pattern Recognition.
The textbook can be used for reference, but I will not follow it very
closely.
The following books can also be used for complementary material
(you can get copies from the library):
Other Books of Interest
- C. Bishop: "Neural Networks for Pattern Recognition", Oxford
University Press, 1996. Pretty much superseded by Bishop's latest book.
- S. Haykin: "Neural Networks, a comprehensive foundation",
Prentice Hall, 1999 (second edition).
- Tom Mitchell: "Machine Learning", McGraw Hill, 1997.
[Figure: Automatic Obstacle Avoidance]
Machine Learning Research at NYU
Please have a look at the research project page of the
Computational and Biological Learning Lab for a few examples
of machine learning research at NYU.
There are numerous opportunities for independent studies and even
undergraduate research projects. Contact Prof. LeCun for details.
Code
- Lush: A simple language for quick
implementation of, and experimentation with, numerical algorithms
(for Linux, Mac, and Windows/Cygwin). Many algorithms described in this
course are implemented in the Lush library. Lush is available on the
department's Sun machines that are freely accessible to NYU graduate
students.
- EBLearn: A C++ library for machine learning
whose classes are similar to Lush's, developed at NYU.
- Torch: A C++ library for machine learning
developed at NEC Labs in Princeton that uses the Lua interpreter as a frontend.
- Theano: a machine learning
engine developed at the University of Montreal.
- NumPy: A Python library for numerical
and scientific computing.
Papers
Some of those papers are available in the DjVu format.
The viewer/plugins for Windows, Linux, Mac, and various Unix flavors are
available here.
- Y. LeCun, S. Chopra, R. Hadsell, F.-J. Huang, and M.-A. Ranzato,
"A Tutorial on Energy-Based Learning",
in Predicting Structured Outputs, MIT Press 2006.
[PDF]
[DjVu]
- Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner,
"Gradient-Based Learning Applied to Document Recognition,"
Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.
[PS.GZ]
[DjVu]
- Y. LeCun, L. Bottou, G. Orr, and K. Muller, "Efficient BackProp,"
in Neural Networks: Tricks of the Trade (G. Orr and K. Muller, eds.), 1998.
[PS.GZ]
[DjVu]
- P. Simard, Y. LeCun, J. Denker, and B. Victorri,
"Transformation Invariance in Pattern Recognition, Tangent Distance and Tangent Propagation,"
in Neural Networks: Tricks of the Trade (G. Orr and K. Muller, eds.), 1998.
[PS.GZ]
[DjVu]
Publications, Journals
Conference Sites
Datasets