

Graduate Course on machine learning, pattern recognition, neural nets,
statistical modeling.
Instructor: Yann LeCun, 715 Broadway, Room 1220, 212-998-3283, yann [ a t ] cs.nyu.edu
Teaching Assistant: Daniel Galron, 715 Broadway, Room 704, 212-998-3496, galron [ a t ] cs.nyu.edu
Classes: Thursdays 5:00-6:50 PM, Room 109, Warren Weaver Hall.
Office Hours for Prof. LeCun: Wednesdays 4:00-6:00 PM.
Please send an email to Prof. LeCun prior to an office hour visit.
Office Hours for the TA: Mondays 4:00-6:00 PM, or by appointment.

The course covers a wide variety of topics in machine learning,
pattern recognition, statistical modeling, and neural computation. It
covers the mathematical methods and theoretical aspects, but will
primarily focus on algorithmic and practical issues.
Machine Learning and Pattern Recognition methods are at the core of
many recent advances in "intelligent computing". Current applications
include machine perception (vision, audition, speech recognition),
control (process control, robotics), data mining, time-series
prediction (e.g. in finance), natural language processing, text mining
and text classification, bioinformatics, neural modeling,
computational models of biological processes, and many other areas.

Energy-Based Models

Who Can Take This Course? 
This course can be useful to all students who want to use or
develop statistical modeling methods. This includes students in CS
(AI, Vision, Graphics), Math (System Modeling), Neuroscience (Computational
Neuroscience, Brain Imaging), Finance (Financial modeling and
prediction), Psychology (Vision), Linguistics, Biology (Computational
Biology, Genomics, Bioinformatics), and Medicine (Biostatistics,
Epidemiology).
The only formal prerequisites are familiarity with computer
programming and linear algebra, but the course relies heavily on such
mathematical tools as probability and statistics, multivariate
calculus, and function optimization. The basic mathematical
concepts will be introduced when needed, but students will be
expected to assimilate a nontrivial number of mathematical
concepts in a fairly short time.
Although this is a graduate-level course, highly motivated
undergraduates at the senior level with a good math background can
take this class. A few juniors have even taken this class
successfully in the past.


The topics studied in the course include:
 the basics of inductive inference, learning, and generalization.
 linear classifiers: perceptron, LMS, logistic regression.
 non-linear classifiers with linear parameterizations:
basis-function methods, boosting, support vector machines.
 multilayer neural networks, backpropagation.
 heterogeneous learning systems.
 graph-based models for sequences: hidden Markov models,
finite-state transducers, recurrent networks.
 unsupervised learning: density estimation,
clustering, and dimensionality reduction methods.
 introduction to graphical models and factor graphs.
 approximate inference, sampling.
 optimization methods in learning: gradient-based methods,
second-order methods, Expectation-Maximization.
 objective functions: maximum likelihood, maximum a posteriori,
discriminative criteria, maximum margin.
 the bias-variance dilemma, regularization, model selection.
 applications in vision, speech, language, forecasting,
and biological modeling.
By the end of the course, students will be able to not only understand
and use the major machine learning methods, but also implement, apply
and analyze them.


The best way (some would say the only way) to understand an algorithm
is to implement it and apply it. Building working systems is also a
lot more fun, more creative, and more relevant than taking formal exams.
Therefore students will be evaluated primarily on programming projects
given on a 3-week cycle, a final project, and an exam.
Automatic Face Detection
Prerequisites
Linear algebra, vector calculus, elementary statistics and probability
theory. Good programming ability is a must: most assignments will
consist of implementing algorithms studied in class.
Software for the homeworks 
Skeleton code for the homework will be provided in a
simple-to-learn numerical programming language called Lush.
Lush is available on the CIMS servers and workstations that are
available for student use (remotely through ssh or in the computer labs).
Lush only runs on Linux, so you must have a version of Linux installed
on your machine. If you don't already have Linux, you should install
Ubuntu Linux on your machine (instructions below).
Installing Ubuntu Linux
There are several options for installing Ubuntu Linux on a Windows machine:
 Install Ubuntu on your Windows machine without repartitioning your hard drive
using Wubi. Wubi is a Windows app
that installs Ubuntu in a file on your Windows partition (the file is a virtual
hard drive from Linux's point of view). Wubi is not an emulator,
so you have to boot your machine in either Linux or Windows. You can't run
them at the same time.
 Download and install Ubuntu.
The Ubuntu installer offers you the option of shrinking your Windows partition
non-destructively so as to make space for Linux. The installation is
very simple and takes about 20 minutes. It can be done from a CD/DVD or
from a USB flash drive.
 Install Ubuntu through VMware so you can run Windows and Ubuntu
at the same time.
You can also dispense with installing Ubuntu Linux altogether and
simply use the Linux machines at the Courant Institute computer labs.
Installing Lush
After you have installed Ubuntu Linux, you must install a number of
packages using the shell commands below (type your password when prompted):
sudo apt-get install gcc g++ libx11-dev binutils-dev indent libreadline5 libreadline5-dev
sudo apt-get install emacs cvs libgsl0-dev imagemagick
then, at the shell, do:
cd
cvs -d:pserver:anonymous@lush.cvs.sourceforge.net:/cvsroot/lush login
[press Enter when prompted for a password]
cvs -z3 -d:pserver:anonymous@lush.cvs.sourceforge.net:/cvsroot/lush co lush
cd lush
./configure
make
This will download and compile Lush in your home directory.
The Lush executable is in lush/bin/i686-pc-linux-gnu/lush.
For convenience, you can link this executable to some place in your
shell search path, for example with:
cd
mkdir bin
cd bin
ln -s ../lush/bin/i686-pc-linux-gnu/lush .
assuming that the bin directory in your home directory is in your search path.
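Whether the link is picked up depends on the search path. A minimal sketch for checking this, assuming a POSIX shell; the helper name path_contains is hypothetical and is not part of Lush or Ubuntu:

```shell
#!/bin/sh
# path_contains DIR [PATHLIST]: succeed if DIR is one of the colon-separated
# entries of PATHLIST (defaults to the current $PATH). Hypothetical helper.
path_contains() {
    case ":${2-$PATH}:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Append $HOME/bin to the search path of the current shell only if it is missing.
if ! path_contains "$HOME/bin"; then
    PATH="$PATH:$HOME/bin"
    export PATH
fi
```

The change only lasts for the current shell session. To keep it, the same lines can be placed in a shell startup file (for example ~/.profile, depending on how your shell is configured).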
The alternative is to do
cd
cd lush
sudo make install
to perform a system-wide installation.
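One way to sanity-check the result is sketched below, assuming a POSIX shell. The helper name missing_tools is hypothetical, and the list of tools is simply the set of commands invoked in the steps above plus the lush executable; the output depends on your machine.

```shell
#!/bin/sh
# missing_tools NAME...: print each NAME that is not found on the search path,
# one per line. Hypothetical helper, not part of Lush.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Commands used by the build steps, plus the lush executable itself.
missing=$(missing_tools gcc g++ make cvs lush)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "not found on the search path:" $missing
else
    echo "all tools found"
fi
```

If lush is reported as missing after a successful build, the usual cause is that neither the link in your bin directory nor the system-wide installation is on the search path.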
The course will include a short tutorial on the Lush language,
a simple interpreted language for numerical applications.
Alternatives to Lush
If you really want, you can implement the homeworks in another
language (C, C++, Java, Matlab, Lisp, Python, ...), but the use of a
high-level interpreted language with good numerical support and good
support for vector/matrix algebra is highly recommended (Lush, Matlab,
Octave, ...). Some assignments require the use of an object-oriented
language, which rules out
If you decide to use a language other than Lush, you will be doing
a lot more work than necessary (particularly for homework 2).

Invariant Object Recognition

Register for the course's mailing list.
Pattern Recognition and Machine Learning, by Christopher
Bishop. Springer, August 2006.
NOTES: (1) The publisher just released the first edition of this book on
August 28, 2006, so you might have to wait a few days before getting
it. (2) This is a different book from past years. (3) This is not
Bishop's previous book called Neural Networks for Pattern Recognition.
The textbook can be used for reference, but I will not follow it very
closely.
The following books can also be used for complementary material
(you can get copies from the library):
Other Books of Interest
 C. Bishop: "Neural Networks for Pattern Recognition", Oxford
University Press, 1996. Pretty much superseded by Bishop's latest book.
 S. Haykin: "Neural Networks, a comprehensive foundation",
Prentice Hall, 1999 (second edition).
 Tom Mitchell: "Machine Learning", McGraw Hill, 1997.

Automatic Obstacle Avoidance
Machine Learning Research at NYU 
Please have a look at the research project page of the
Computational and Biological Learning Lab for a few examples
of machine learning research at NYU.
There are numerous opportunities for independent studies and even
undergraduate research projects. Contact Prof. LeCun for details.
 Lush: A simple language for quick
implementation of, and experimentation with, numerical algorithms
(for Linux, Mac, and Windows/Cygwin). Many algorithms described in this
course are implemented in the Lush library. Lush is available on the
department's Sun machines that are freely accessible to NYU graduate
students.
 EBLearn: A C++ library for machine learning
whose classes are similar to Lush's, developed at NYU.
 Torch: A C++ library for machine learning
developed at NEC Labs in Princeton that uses the Lua interpreter as a frontend.
 Theano: a machine learning
engine developed at the University of Montreal.
 NumPy: A Python library for numerical
and scientific computing.
Papers
Some of these papers are available in the DjVu format.
DjVu viewers and plugins are available for Windows, Linux, Mac, and
various Unix flavors.
Y. LeCun, S. Chopra, R. Hadsell, F.J. Huang, M.A. Ranzato,
"A Tutorial on Energy-Based Learning",
in Predicting Structured Outputs, MIT Press 2006.
[PDF]
[DjVu]
 Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner,
"Gradient-Based Learning Applied to Document Recognition,"
Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.
[PS.GZ]
[DjVu]
 Y. LeCun, L. Bottou, G. Orr, and K. Muller, "Efficient BackProp,"
in Neural Networks: Tricks of the trade, (G. Orr and Muller K., eds.), 1998.
[PS.GZ]
[DjVu]
 P. Simard, Y. LeCun, J. Denker, and B. Victorri,
"Transformation Invariance in Pattern Recognition, Tangent Distance and Tangent Propagation,"
in Neural Networks: Tricks of the trade, (G. Orr and Muller K., eds.), 1998.
[PS.GZ]
[DjVu]

