This page contains the schedule, slides from the lectures, lecture notes, reading lists,
assignments, and web links.
I urge you to download the DjVu viewer
and view the DjVu versions of the documents below. They display faster,
are of higher quality, and generally have smaller file sizes than the PS and PDF versions.
Full-text search is provided for the entire
collection of slides and papers. Click here to search.
01/21: Introduction and basic concepts |
Subjects treated: Intro, types of learning, nearest neighbor, how biology does it,
linear classifier, perceptron learning procedure, linear regression,
training/test, capacity, overfitting, regularization, Occam's Razor, MDL.
Slides: [DjVu | PDF | PS]
Required Reading:
- Hastie/Tibshirani/Friedman: Chapter 2
Optional Reading:
- Refresher on random variables and probabilities by
Andrew Moore: (slides 1-27) [DjVu | PDF]
- Refresher on joint probabilities, Bayes theorem by
Chris Williams: [DjVu | PDF]
- Refresher on statistics and probabilities by
Sam Roweis: [DjVu | PS]
- If you are interested in the early history of self-organizing
systems and cybernetics, have a look at this book available from the
Internet Archive's Million Book Project: Self-Organizing
Systems, proceedings of a 1959 conference edited by Yovits and
Cameron (DjVu viewer required for full text).
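As a quick illustration of the nearest-neighbor rule listed in the subjects above, here is a minimal sketch in Python/NumPy. The course assignments use Lush, so this fragment (and its function name) is illustrative only:

    import numpy as np

    def nearest_neighbor_predict(X_train, y_train, x):
        # return the label of the training point closest to x (Euclidean distance)
        dists = np.sum((X_train - x) ** 2, axis=1)
        return y_train[np.argmin(dists)]

    X_train = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
    y_train = np.array([0, 1, 1])
    print(nearest_neighbor_predict(X_train, y_train, np.array([0.2, 0.1])))  # prints 0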
01/28: Probability Theory, Bayes Inversion, Bayes Decision Rule |
Subjects treated: Refreshers on probability theory.
Bayes decision rule, naive Bayes classifier, logistic regression.
Refresher on multivariate calculus and optimization.
Slides:
- Refresher on probability and statistics: [DjVu | PDF | PS]
- Optimization (pages 1-5): [DjVu]
Required Reading:
- Hastie/Tibshirani/Friedman: Sections 3.1, 3.2, 3.4.1 to 3.4.3, 4.1, 4.2, 4.4, 4.5
Optional Reading:
- Duda/Hart/Stork: Sections 5.1 to 5.8
- Bishop: Chapter 3
- Hastie/Tibshirani/Friedman: Sections 3.3, 4.3
- Paper on logistic regression by
Michael Jordan: [DjVu | PS]
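To make the Bayes decision rule from this lecture concrete: given class priors P(y) and class-conditional densities p(x|y), classify x as the class with the largest posterior, i.e. the largest p(x|y)P(y), since the normalizer p(x) is the same for all classes. A minimal sketch with two 1-D Gaussian classes (illustrative only; not part of the course materials):

    import numpy as np
    from scipy.stats import norm

    priors = {0: 0.7, 1: 0.3}                              # P(y)
    likelihoods = {0: norm(0.0, 1.0), 1: norm(2.0, 1.0)}   # p(x|y)

    def bayes_decide(x):
        # pick the class maximizing p(x|y) * P(y)
        scores = {y: likelihoods[y].pdf(x) * priors[y] for y in priors}
        return max(scores, key=scores.get)

    print(bayes_decide(0.5))   # -> 0 (close to class 0's mean, and class 0 has the larger prior)
    print(bayes_decide(2.5))   # -> 1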
02/04: MLE, MAP, Energy Functions |
Subjects treated: Bayesian Estimation, Maximum Likelihood
Estimation, MAP Estimation, Loss Functions and Energy-Based models.
Probability, Entropy, Energy, and Free Energy. Introduction to Lush.
Slides: [DjVu | PDF | PS]
Required Reading:
Optional Reading:
Homework Assignments: (see next lecture)
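As a worked example of maximum-likelihood estimation from this lecture: for data assumed i.i.d. Gaussian, maximizing the log-likelihood gives the sample mean and the (biased) sample variance in closed form. A quick check in Python/NumPy (illustrative only):

    import numpy as np

    x = np.array([1.2, 0.7, 1.9, 1.4, 0.8])

    # ML estimates for a 1-D Gaussian: mu = mean(x), sigma^2 = mean((x - mu)^2)
    mu_ml = x.mean()
    sigma2_ml = np.mean((x - mu_ml) ** 2)   # divides by N, not N-1

    print(mu_ml, sigma2_ml)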
02/11: Gradient-Based Learning I: Beyond Linear Classifiers |
Subjects treated: Intro to Gradient-Based Learning.
Limitations of linear classifiers. Basis function expansion, polynomial classifiers,
kernel expansion, RBF Networks, Simple multi-layer neural nets.
Optimization and the convergence of gradient-based learning.
Slides:
- Intro to gradient-based learning: [DjVu | PDF | PS]
- Optimization (pages 5-end): [DjVu]
Required Reading:
- Gradient-based Learning Applied to Document Recognition by LeCun,
Bottou, Bengio, and Haffner, pages 1-5 (Introduction):
[ DjVu | .ps.gz ]
- Efficient Backprop, by LeCun, Bottou, Orr, and Muller, Sections 1-5:
[ DjVu | .ps.gz ]
Homework Assignments: implementing the Perceptron
Algorithm, MSE Classifier (linear regression), and Logistic Regression
(a short illustrative sketch appears after the instructions).
Details and datasets below:
- Download this tar.gz archive. It
contains the datasets and the homework description.
- Decompress it with "tar xvfz homework-01.tgz" on Unix/Linux or
with WinZip on Windows.
- The file homework01.txt contains the questions and instructions.
- Most of the necessary Lush code is provided.
- Due Date is Wednesday March 3, before the lecture.
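For orientation, here is a minimal sketch of the perceptron learning procedure in Python/NumPy. The assignment itself must be done in Lush with the provided code, so treat this (and its made-up names) as illustrative only:

    import numpy as np

    def train_perceptron(X, y, epochs=100):
        # X: (n, d) inputs; y: labels in {-1, +1}; bias handled via an appended constant feature
        Xb = np.hstack([X, np.ones((X.shape[0], 1))])
        w = np.zeros(Xb.shape[1])
        for _ in range(epochs):
            errors = 0
            for xi, yi in zip(Xb, y):
                if yi * np.dot(w, xi) <= 0:   # misclassified (or on the boundary)
                    w += yi * xi              # perceptron update
                    errors += 1
            if errors == 0:                   # converged: every point classified correctly
                break
        return w

    X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    y = np.array([-1, -1, -1, 1])             # a linearly separable AND-like problem
    w = train_perceptron(X, y)
    print(np.sign(np.hstack([X, np.ones((4, 1))]) @ w))   # [-1. -1. -1.  1.]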
02/18: Gradient-Based Learning II: Multilayer Networks and Back-Propagation |
Subjects treated: Multi-Module learning machines. Vector
modules and switches. Multilayer neural nets. Backpropagation
Learning.
Slides:
- Multilayer nets and backpropagation: [DjVu | PDF | PS]
02/25: Gradient-Based Learning III: Special Architectures |
Subjects treated: Special architectures: RBF nets, mixtures
of experts, parameter-space transforms. Implementation and practical
issues with multi-module/multi-layer learning machines. Intro to
convolutional nets.
Slides:
- Special Architectures: [DjVu | PDF | PS]
- Optimization, Practical Issues with gradient-based training: [DjVu | PDF | PS]
Required Reading:
- Convolutional nets: "Gradient-based Learning Applied to Document Recognition" by LeCun,
Bottou, Bengio, and Haffner, pages 5-18 (up to and including Section IV-B):
[ DjVu | .ps.gz ]
- On the Lagrangian formulation of gradient-based learning:
"A theoretical framework for back-propagation":
[ DjVu | .ps.gz ]
- Efficient Backprop, by LeCun, Bottou, Orr, and Muller, Sections 6-end:
[ DjVu | .ps.gz ]
Optional Reading:
- Multi-module approach and Lagrangian formulation: "A Framework for
the Cooperation of Learning Algorithms" by Bottou and Gallinari:
DjVu.
03/03: Convolutional Nets. Cross-Validation, Model Selection, Learning Theory |
Subjects treated: Invariant Recognition, Feature Learning,
Convolutional Networks and Time-Delay Neural Nets.
Model Selection, Cross-Validation, VC-dimension,
Structural Risk Minimization, Bagging.
Slides:
- Convolutional Nets: no slides; see last week's required reading.
- Model selection, cross-validation, learning theory, bagging: [DjVu | PDF | PS]
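As a quick illustration of K-fold cross-validation from this lecture, here is a minimal sketch in Python/NumPy; the train_fn/error_fn interface is made up and stands for whatever learning algorithm and error measure are being evaluated:

    import numpy as np

    def k_fold_cv_error(X, y, train_fn, error_fn, k=5, seed=0):
        # split the data into k folds; train on k-1 folds, measure error on the held-out fold
        idx = np.random.RandomState(seed).permutation(X.shape[0])
        folds = np.array_split(idx, k)
        errors = []
        for i in range(k):
            val = folds[i]
            train = np.concatenate([folds[j] for j in range(k) if j != i])
            model = train_fn(X[train], y[train])
            errors.append(error_fn(model, X[val], y[val]))
        return np.mean(errors)   # average held-out error, used to compare and select models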
Homework Assignments: implementing Gradient-Based Learning
and back-propagation. You must implement gradient-based learning using
the object-oriented, module-based approach described in class
(a short illustrative sketch appears after the instructions).
Various architectures, including a multilayer neural net, must be
implemented and tested on two datasets.
- Download this tar.gz archive. It
contains the datasets and the homework description.
- Decompress it with "tar xvfz homework-02.tgz" on Unix/Linux or
with WinZip on Windows.
- The file homework-02.txt contains
the questions and instructions.
- Most of the necessary Lush code is provided.
- Due Date is Friday April 2 (NEW NEW DATE!).
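To illustrate the object-oriented, module-based style the homework asks for (each module has a forward pass and a backward pass that propagates gradients), here is a minimal sketch in Python/NumPy. The real assignment is in Lush, and the class and method names below are made up:

    import numpy as np

    class Linear:
        # y = W x + b
        def __init__(self, n_in, n_out):
            self.W = np.random.randn(n_out, n_in) * 0.1
            self.b = np.zeros(n_out)
        def fprop(self, x):
            self.x = x                        # cache the input for the backward pass
            return self.W @ x + self.b
        def bprop(self, dy):
            self.dW = np.outer(dy, self.x)    # dE/dW
            self.db = dy                      # dE/db
            return self.W.T @ dy              # dE/dx, passed to the previous module

    class Tanh:
        def fprop(self, x):
            self.y = np.tanh(x)
            return self.y
        def bprop(self, dy):
            return dy * (1.0 - self.y ** 2)   # chain rule through tanh

    # a two-layer net is a list of modules run forward, then backward in reverse order
    net = [Linear(2, 3), Tanh(), Linear(3, 1)]
    x, target = np.array([0.5, -1.0]), np.array([1.0])
    h = x
    for m in net:
        h = m.fprop(h)
    dy = h - target                           # gradient of the squared error 0.5*||h - target||^2
    for m in reversed(net):
        dy = m.bprop(dy)                      # after this, each Linear holds dW and db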
03/10: Unsupervised Learning |
Subjects treated: Unsupervised Learning: Principal Component
Analysis. Density Estimation: Parzen Windows, Mixtures of Gaussians,
Auto-Encoders. Latent variables and the Expectation-Maximization algorithm.
Slides:
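As a quick illustration of principal component analysis from the subjects above, here is a minimal sketch using the SVD of the centered data matrix, in Python/NumPy (illustrative only):

    import numpy as np

    def pca(X, k):
        # X: (n, d) data matrix; returns the k leading principal directions and the projected data
        Xc = X - X.mean(axis=0)                 # center the data
        U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
        components = Vt[:k]                     # rows are the principal directions
        return components, Xc @ components.T

    X = np.random.randn(200, 5)
    components, Z = pca(X, 2)
    print(components.shape, Z.shape)            # (2, 5) (200, 2)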
Spring break: NO CLASS.
03/24: Guest Lecture by Prof. Lawrence Saul: Dimensionality Reduction |
Subjects treated: Non-Linear Dimensionality Reduction and
Embedding. Guest lecture by Prof. Lawrence Saul
from the University of Pennsylvania.
Slides:
- L. Saul's Lecture Slides on non-linear dimensionality reduction
(caution: the PS and the PDF are over 25MB, the DjVu is 2MB): [DjVu | PDF | PS]
Required Reading: (please read this before the class)
- L. K. Saul and S. T. Roweis (2003). Think globally, fit locally:
unsupervised learning of low dimensional manifolds.
Journal of Machine Learning Research 4:119-155.
[PDF].
Optional Reading:
03/31: Efficient Optimization, Latent Variables, Graph Transformer Networks |
Subjects treated:
Efficient learning: conjugate gradient, Levenberg-Marquardt.
Lagrange Multipliers and Constrained Optimization.
More on latent variables and EM.
Modeling distributions over sequences. Learning machines that
manipulate graphs. Finite-state transducers. Graph Transformer
Networks.
Required Reading:
- Graph Transformer Networks: "Gradient-based Learning Applied to Document Recognition" by LeCun,
Bottou, Bengio, and Haffner, pages 18 (Section IV-C) to the end:
[ DjVu | .ps.gz ]
Homework Assignments: Homework 03: K-Means and Mixture of Gaussians estimation with EM.
- The subject of this homework is to implement the K-means algorithm
and the Expectation-Maximization algorithm for a Mixture of Gaussians model
(a short illustrative sketch appears after the instructions).
The algorithms must be tested on image data for a simulated image
compression task.
- Download this tar.gz archive. It
contains the datasets and the homework description.
- Decompress it with "tar xvfz homework-03.tgz" on Unix/Linux or
with WinZip on Windows.
- The file homework-03.txt contains
the questions and instructions.
- DUE DATE: Friday April 16
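For orientation, a minimal K-means sketch in Python/NumPy; the assignment itself must be done in Lush with the provided code, and the EM algorithm for the mixture of Gaussians follows the same alternating pattern with soft responsibilities instead of hard assignments:

    import numpy as np

    def kmeans(X, k, n_iters=100, seed=0):
        # X: (n, d) data; returns the (k, d) centroids and each point's cluster index
        rng = np.random.RandomState(seed)
        centroids = X[rng.choice(X.shape[0], k, replace=False)]   # start from k random points
        for _ in range(n_iters):
            # assignment step: each point goes to its nearest centroid
            dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            labels = dists.argmin(axis=1)
            # update step: each centroid moves to the mean of the points assigned to it
            new_centroids = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                                      else centroids[j] for j in range(k)])
            if np.allclose(new_centroids, centroids):
                break
            centroids = new_centroids
        return centroids, labels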
04/07: Boosting and Support Vector Machines |
This lecture will be given by Prof. Dan Melamed.
Subjects treated: Boosting and Ensemble Methods.
Maximum Margin Classifiers.
Support Vector Machines, Kernel Machines.
Homework Assignments: Final Project
- A list of possible project topics is
available here.
Make a proposal (send an email message to me and
to the TA).
- This project counts heavily toward the final grade.
- Collaboration: you can do your final project in groups of two students.
- Due Date: Friday, May 14. Extensions may be granted for
ambitious projects by students who are not graduating this year.
If you intend to graduate this year, you must hand in your
project by the due date.
04/14: Hidden Markov Models |
Subjects treated: Probabilistic Automata, Distribution
over Sequences, Hidden Markov Models, Inference: Forward-Backward
Algorithm, Learning: Expectation-Maximization algorithm.
Reading:
- Slides of a tutorial on HMM by
Sam Roweis:
[PDF]
- Tutorial article on HMM by Larry Rabiner: [ DjVu | PDF ]
- Tutorial by Mehryar Mohri on Weighted finite-state transducers for
speech recognition: Part I [ PS],
Part II [ PS]
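A minimal sketch of the forward recursion for an HMM (one of the inference algorithms covered in this lecture), in Python/NumPy with made-up variable names, illustrative only:

    import numpy as np

    def hmm_forward(pi, A, B, obs):
        # pi: (S,) initial state distribution
        # A:  (S, S) transitions, A[i, j] = P(s_t = j | s_{t-1} = i)
        # B:  (S, O) emissions,   B[j, o] = P(obs = o | state = j)
        alpha = pi * B[:, obs[0]]             # alpha_1(j) = pi_j * B[j, o_1]
        for o in obs[1:]:
            alpha = (alpha @ A) * B[:, o]     # alpha_t(j) = sum_i alpha_{t-1}(i) A[i, j] * B[j, o_t]
        return alpha.sum()                    # P(o_1, ..., o_T)

    pi = np.array([0.6, 0.4])
    A  = np.array([[0.7, 0.3], [0.4, 0.6]])
    B  = np.array([[0.9, 0.1], [0.2, 0.8]])
    print(hmm_forward(pi, A, B, [0, 1, 0]))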
04/21: Graphical Models, Belief Propagation |
Subjects treated: Intro to graphical models,
Inference, Belief Propagation, Boltzmann Machines.
Required Reading:
- Book by David
MacKay (free download): Information Theory, Inference, and Learning Algorithms
[DjVu],
[PDF].
See chapters 24, 25, 26, and 43.
04/28: Learning, Sampling, and Energy-Based Models |
Subjects treated: Learning in Graphical Models;
Approximate Inference and Sampling, Markov-Chain Monte-Carlo,
Hybrid Monte-Carlo; Energy-Based Models, Contrastive Divergence.
Required Reading:
- Book by David
MacKay (free download): Information Theory, Inference, and Learning Algorithms
[DjVu],
[PDF].
See chapters 29-33.
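A minimal sketch of a Metropolis sampler, one instance of the Markov-chain Monte-Carlo methods covered here, in Python/NumPy; the target is a standard Gaussian specified only up to a normalizing constant (illustrative only):

    import numpy as np

    def metropolis(log_p, x0, n_samples, step=0.5, seed=0):
        # log_p: log of an unnormalized target density; symmetric Gaussian random-walk proposal
        rng = np.random.RandomState(seed)
        x, samples = x0, []
        for _ in range(n_samples):
            x_new = x + step * rng.randn()
            # accept with probability min(1, p(x_new)/p(x)); the normalizer cancels
            if np.log(rng.rand()) < log_p(x_new) - log_p(x):
                x = x_new
            samples.append(x)
        return np.array(samples)

    samples = metropolis(lambda x: -0.5 * x ** 2, x0=0.0, n_samples=20000)
    print(samples.mean(), samples.var())   # should be roughly 0 and 1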