
Graduate Course on machine learning, pattern recognition, neural nets,
statistical modeling.
Instructor: Yann LeCun, 715 Broadway, Room 706, x83283, yann [ a t ] cs.nyu.edu
Teaching Assistant: Fu Jie Huang, 715 Broadway, Room 705, jhuangfu [ a t ] cs.nyu.edu
Classes: Wednesdays 5:00-6:50 PM, Room 101, Warren Weaver Hall.
Office Hours for Prof. LeCun: Thursdays 2:00-4:00 PM
Office Hours for TA Fu Jie Huang: Tuesdays 2:00-4:00 PM
This course will cover a wide variety of topics in machine learning,
pattern recognition, statistical modeling, and neural computation.
The course will cover the mathematical methods and theoretical
aspects, but will primarily focus on algorithmic and practical issues.
Machine Learning and Pattern Recognition methods are at the core of
many recent advances in "intelligent computing". Current applications
include machine perception (vision, audition), control (process
control, robotics), data mining, time-series prediction (e.g. in
finance), natural language processing, text mining and text
classification, bioinformatics and computational models of biological
processes, and many other areas.
The topics studied in the course include:
 the basics of inductive inference, learning, and generalization.
 linear classifiers: perceptron, LMS, logistic regression.
 nonlinear classifiers with linear parameterizations:
basis-function methods, boosting, support vector machines.
 multilayer neural networks, backpropagation
 heterogeneous learning systems
 graph-based models for sequences: hidden Markov models,
finite-state transducers, recurrent networks.
 unsupervised learning: density estimation,
clustering, and dimensionality reduction methods.
 energy-based models and probabilistic models.
 introduction to graphical models.
 approximate inference, sampling.
 optimization methods in learning: gradient-based methods,
second-order methods, Expectation-Maximization.
 objective functions: maximum likelihood, MAP,
discriminative criteria.
 the bias-variance dilemma, regularization, model selection.
 applications in vision, speech, language, forecasting,
and biological modeling.
Prerequisites: linear algebra, vector calculus, elementary statistics
and probability theory. Good programming ability is a must: many
assignments will involve implementing algorithms studied in class.
The course will include a short tutorial on the
Lush language, a simple
interpreted language for numerical applications.
Programming projects may be implemented in any language (C, C++,
Java, Matlab, Lisp, Python,...), but the use of a high-level
interpreted language with good numerical support and good support
for vector/matrix algebra is highly recommended (Lush, Matlab, Octave...).
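To give a rough sense of the scale of a small programming exercise in one of the languages listed above, here is a minimal sketch of the perceptron learning rule (one of the linear classifiers among the course topics), written in plain Python. The function names, the learning rate, the epoch limit, and the toy AND dataset are all illustrative assumptions; this is not taken from the course assignments.

```python
def train_perceptron(samples, labels, epochs=100, lr=1.0):
    """Perceptron rule for labels in {-1, +1}; returns (weights, bias).

    Hypothetical helper for illustration only.
    """
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(wi * xi for wi, xi in zip(w, x)) + b
            # Update only on misclassified (or boundary) samples.
            if y * activation <= 0:
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                mistakes += 1
        if mistakes == 0:
            # A full pass with no errors: the data are separated.
            break
    return w, b


def predict(w, b, x):
    """Return +1 or -1 according to the sign of the linear score."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else -1


# Toy linearly separable problem (logical AND), chosen for illustration.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
Y = [-1, -1, -1, 1]

w, b = train_perceptron(X, Y)
predictions = [predict(w, b, x) for x in X]
print(w, b, predictions)
```

In a language with built-in vector/matrix support, the inner sums and list comprehensions would collapse into single vector operations, which is the main reason such languages are recommended.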
Register for the course's mailing list.
T. Hastie, R. Tibshirani, and J. Friedman:
"Elements of Statistical Learning",
Springer-Verlag, 2001.
I will not follow this book very closely. In particular, much of the
material covered in the second half of the course cannot be found in
the above book. I will refer to research papers and lecture notes
for those topics.
Either one of the following books is also recommended, but not
absolutely required (you can get a copy from the library):
 C. Bishop: "Neural Networks for Pattern Recognition", Oxford
University Press, 1996. [quite good, if a little dated].
 Richard O. Duda, Peter E. Hart, David G. Stork: "Pattern Classification",
Wiley-Interscience; 2nd edition, October 2000. [good reference
because it covers a lot of topics].
Other Books of Interest
 S. Haykin: "Neural Networks, a comprehensive foundation",
Prentice Hall, 1999 (second edition).
 Tom Mitchell: "Machine Learning", McGraw Hill, 1997.
 Lush: A simple language for quick
implementation of, and experimentation with, numerical algorithms
(for Linux and Windows/Cygwin). Many algorithms described in this
course are implemented in the Lush library.
 Torch: A C++ library for machine learning.
Lush is installed on the department's PCs. It will soon be available
on the Sun network as well.
Papers
Some of these papers are available in the DjVu format.
The viewer/plugins for Windows, Linux, Mac, and various Unix flavors are
available here.
 Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner,
"Gradient-Based Learning Applied to Document Recognition,"
Proceedings of the IEEE, vol. 86, no. 11, pp. 2278-2324, Nov. 1998.
[PS.GZ]
[DjVu]
 Y. LeCun, L. Bottou, G. Orr, and K. Muller, "Efficient BackProp,"
in Neural Networks: Tricks of the trade, (G. Orr and Muller K., eds.), 1998.
[PS.GZ]
[DjVu]
 P. Simard, Y. LeCun, J. Denker, and B. Victorri,
"Transformation Invariance in Pattern Recognition, Tangent Distance and Tangent Propagation,"
in Neural Networks: Tricks of the trade, (G. Orr and Muller K., eds.), 1998.
[PS.GZ]
[DjVu]
Publications, Journals
Conference Sites
Datasets
Demos and Pretty Pictures 
More demos are available here.
Object Recognition with Convolutional Neural Nets

