

Graduate Course on machine learning, pattern recognition, neural nets,
statistical modeling.
Instructor: Yann LeCun, 715 Broadway, Room 1220, x83283, yann [ a t ] cs.nyu.edu
Teaching Assistant: Raia Hadsell,
715 Broadway, Room 1015, x83277, raia [ a t ] cs.nyu.edu
Classes: Tuesdays 5:00-6:50 PM, Room 1221, 719 Broadway.
Office Hours for Prof. LeCun: Wednesdays 3:00-5:00 PM
Office Hours for Raia Hadsell: Mondays 2-4 PM, or most days by appointment

This course will be an updated version of G22.3033.002
taught in the Fall of 2004.
This course will cover a wide variety of topics in machine learning,
pattern recognition, statistical modeling, and neural computation.
The course will cover the mathematical methods and theoretical
aspects, but will primarily focus on algorithmic and practical issues.
Machine Learning and Pattern Recognition methods are at the core of
many recent advances in "intelligent computing". Current applications
include machine perception (vision, audition, speech recognition),
control (process control, robotics), data mining, time-series
prediction (e.g. in finance), natural language processing, text mining
and text classification, bioinformatics, neural modeling,
computational models of biological processes, and many other areas.

[Image: Energy-Based Models]

Who Can Take This Course? 
This course can be useful to all students who want to use or
develop statistical modeling methods. This includes students in CS
(AI, Vision, Graphics), Math (System Modeling), Neuroscience (Computational
Neuroscience, Brain Imaging), Finance (Financial modeling and
prediction), Psychology (Vision), Linguistics, Biology (Computational
Biology, Genomics, Bioinformatics), and Medicine (BioStatistics,
Epidemiology).
The only formal prerequisites are familiarity with computer
programming and linear algebra, but the course relies heavily on such
mathematical tools as probability and statistics, multivariate
calculus, and function optimization. The basic mathematical
concepts will be introduced when needed, but students will be
expected to assimilate a non-trivial number of mathematical
concepts in a fairly short time.
Although this is a graduate-level course, highly motivated
undergraduates at the senior level with a good math background can
take this class. A few juniors have even taken this class
successfully in the past.


The topics studied in the course include:
 the basics of inductive inference, learning, and generalization.
 linear classifiers: perceptron, LMS, logistic regression.
 nonlinear classifiers with linear parameterizations:
basis-function methods, boosting, support vector machines.
 multilayer neural networks, backpropagation
 heterogeneous learning systems
 graph-based models for sequences: hidden Markov models,
finite-state transducers, recurrent networks.
 unsupervised learning: density estimation,
clustering, and dimensionality reduction methods.
 introduction to graphical models and factor graphs
 approximate inference, sampling.
 optimization methods in learning: gradient-based methods,
second-order methods, Expectation-Maximization.
 objective functions: maximum likelihood, maximum a posteriori,
discriminative criteria, maximum margin.
 the bias-variance dilemma, regularization, model selection.
 applications in vision, speech, language, forecasting,
and biological modeling.
By the end of the course, students will be able to not only understand
and use the major machine learning methods, but also implement, apply
and analyze them.
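As a taste of the kind of implementation work involved, here is a minimal sketch of the perceptron, the first linear classifier in the topic list above. It is written in plain Python purely for illustration and is not course-provided code: the function names (`predict`, `train_perceptron`), the toy AND dataset, and the default epoch count and learning rate are all assumptions made for this example.

```python
# Minimal perceptron sketch (illustrative only; not course-provided code).
# Assumption: labels are +1 / -1 and the bias is kept as a separate term.

def predict(weights, bias, x):
    """Return +1 or -1 depending on the sign of w.x + b."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation >= 0 else -1


def train_perceptron(samples, labels, epochs=20, learning_rate=1.0):
    """Apply the perceptron update rule until an epoch makes no mistakes
    or the epoch budget runs out."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            if predict(weights, bias, x) != y:
                # Shift the decision boundary toward the misclassified point.
                weights = [w + learning_rate * y * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * y
                mistakes += 1
        if mistakes == 0:
            break
    return weights, bias


# Toy, linearly separable data: the logical AND of two binary inputs.
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, -1, -1, 1]

weights, bias = train_perceptron(samples, labels)
predictions = [predict(weights, bias, x) for x in samples]
print(predictions == labels)
```

The update only fires on misclassified points, so on linearly separable data such as this the loop stops as soon as one full pass produces no mistakes. On data that is not separable it would simply run out the epoch budget.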
This course will be a (much updated) rerun of G22.3033.002 taught in
Fall 2004. Please visit the site
of the Spring 2004 edition of this course to have a look at the
schedule and source material.


The best way (some would say the only way) to understand an algorithm
is to implement it and apply it. Building working systems is also a
lot more fun, more creative, and more relevant than taking formal exams.
Therefore students will be evaluated primarily (almost exclusively) on
programming projects given on a 2 week cycle, and on a final project.
[Image: Automatic Face Detection]
Prerequisites: linear algebra, vector calculus, elementary statistics and probability
theory. Good programming ability is a must: most assignments will
consist of implementing algorithms studied in class.
The course will include a short tutorial on the
Lush language, a simple
interpreted language for numerical applications.
Lush can be downloaded and installed on Linux,
Mac OSX, and Windows (under Cygwin).
See Chris Poultney's
notes on installing Lush under Cygwin.
Lush is available on the CIMS Sun
workstations available for student use.
Programming projects may be implemented in any language (C, C++,
Java, Matlab, Lisp, Python, ...), but the use of a high-level
interpreted language with good numerical support and good support
for vector/matrix algebra is highly recommended (Lush, Matlab,
Octave, ...). Some assignments require the use of an object-oriented
language.
Also, for most assignments, a code skeleton in Lush will be
provided.

[Image: Invariant Object Recognition]

Register for the course's mailing list.
Richard O. Duda, Peter E. Hart, David G. Stork: "Pattern Classification"
Wiley-Interscience; 2nd edition, October 2000.
I will not follow this book very closely. In particular, much of the
material covered in the second half of the course cannot be found in
the above book. I will refer to research papers and lecture notes
for those topics.
Either one of the following books is also recommended, but not
absolutely required (you can get a copy from the library):
 T. Hastie, R. Tibshirani, and J. Friedman:
"Elements of Statistical Learning",
Springer-Verlag, 2001.
 C. Bishop: "Neural Networks for Pattern Recognition", Oxford
University Press, 1996. [quite good, if a little dated].
Other Books of Interest
 S. Haykin: "Neural Networks, a comprehensive foundation",
Prentice Hall, 1999 (second edition).
 Tom Mitchell: "Machine Learning", McGraw Hill, 1997.

[Image: Automatic Obstacle Avoidance]
Machine Learning Research at NYU 
Please have a look at the research project page of the
Computational and Biological Learning Lab for a few examples
of machine learning research at NYU.
There are numerous opportunities for independent studies and even
undergraduate research projects. Contact Prof. LeCun for details.
 Lush: A simple language for quick
implementation of, and experimentation with, numerical algorithms
(for Linux, Mac, and Windows/Cygwin). Many algorithms described in this
course are implemented in the Lush library. Lush is available on the
department's Sun machines that are freely accessible to NYU graduate
students. See Chris Poultney's
notes on installing Lush under Cygwin.
 Torch: A C++ library for machine learning.
Lush is installed on the department's PCs. It will soon be available
on the Sun network as well.
Papers
Some of those papers are available in the DjVu format.
The viewer/plugins for Windows, Linux, Mac, and various Unix flavors are
available here.
 Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner,
"Gradient-Based Learning Applied to Document Recognition,"
Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.
[PS.GZ]
[DjVu]
 Y. LeCun, L. Bottou, G. Orr, and K. Muller, "Efficient BackProp,"
in Neural Networks: Tricks of the Trade (G. Orr and K. Muller, eds.), 1998.
[PS.GZ]
[DjVu]
 P. Simard, Y. LeCun, J. Denker, and B. Victorri,
"Transformation Invariance in Pattern Recognition, Tangent Distance and Tangent Propagation,"
in Neural Networks: Tricks of the Trade (G. Orr and K. Muller, eds.), 1998.
[PS.GZ]
[DjVu]
Publications, Journals
Conference Sites
Datasets
Demos and Pretty Pictures 
More demos are available here.
Object Recognition with Convolutional Neural Nets

