Machine Learning and Pattern Recognition: Schedule

This is the list of final project topics.

Projects can be done individually, in groups of 2, or (with permission) in groups of 3. Some projects require you to download the MNIST dataset.

Getting the MNIST dataset

The MNIST dataset is composed of 60,000 training samples and 10,000 test samples of handwritten digits. Details about MNIST are available here.

An additional set of digits produced by randomly distorting the original MNIST images is available (see below).

To obtain the data:

  • Go to http://www.cs.nyu.edu/~yann/datasets/mnist-distort/.
  • Download and gunzip the training images: train-images-idx3-ubyte.gz, and the labels for those: train-labels-idx1-ubyte.gz.
  • Download and gunzip the test images: t10k-images-idx3-ubyte-0.gz, and the labels for them: t10k-labels-idx1-ubyte-0.gz
  • If you have an account on dept.cs.nyu.edu, you can do:
     rsync -av dept.cs.nyu.edu:/home/yann/www.cs.nyu.edu/html/datasets/mnist-distort .
    and type your password when prompted. This will download the entire set (152MB, distorted digits included).
BE CAREFUL: your browser may gunzip the files without telling you.

Each of those files is a Lush matrix that can be loaded into Lush with (load-matrix "thefile") or, more efficiently, memory-mapped with (mmap-matrix "thefile").
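If you want to inspect the files outside Lush, the following Python sketch may help. It assumes the files follow the IDX layout described on the MNIST page (two zero bytes, a type code, the number of dimensions, then big-endian 32-bit dimension sizes followed by the raw bytes). The function names parse_idx and read_idx are made up for this example. The sketch also checks for the gzip magic bytes, since your browser may or may not have already gunzipped the download.

```python
import gzip
import struct


def parse_idx(raw):
    """Parse IDX bytes into (dims, data). Assumes unsigned-byte elements."""
    # Decompress only if the gzip magic bytes are present.
    if raw[:2] == b"\x1f\x8b":
        raw = gzip.decompress(raw)
    # Header: two zero bytes, a type code (0x08 = unsigned byte), then ndim.
    zero, type_code, ndim = struct.unpack(">HBB", raw[:4])
    if zero != 0 or type_code != 0x08:
        raise ValueError("not an unsigned-byte IDX file")
    # One big-endian 32-bit size per dimension.
    dims = struct.unpack(">%dI" % ndim, raw[4:4 + 4 * ndim])
    data = raw[4 + 4 * ndim:]
    expected = 1
    for d in dims:
        expected *= d
    if len(data) != expected:
        raise ValueError("payload size does not match header dimensions")
    return dims, data


def read_idx(path):
    """Read an IDX file (gzipped or not) from disk."""
    with open(path, "rb") as f:
        return parse_idx(f.read())
```

For the original training images, read_idx("train-images-idx3-ubyte") should report 60000 images of 28 by 28 pixels; anything else suggests a truncated or still-compressed download.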

A Lush script is available to train a convolutional network on MNIST. The script can be obtained here. The convolutional net has 20 feature maps at the first and second layers, and 50 at the third and fourth. The script should give you 0.7% to 0.8% error on the test set after training on the original training set (no distorted samples).

The net is quite a bit bigger than the one described in the paper "Gradient-Based Learning Applied to Document Recognition".

If you need help with this project and can't get hold of me, you can contact the TA, Piotr Mirowski, or one of my PhD students who has worked with MNIST and convolutional nets:

  • Fu Jie Huang (Jie): jhuangfu [ at ] cs
  • Sumit Chopra: sumit [ at ] cs

MNIST: Energy-Based Losses

Implement two or three of the energy-based losses discussed in class, and train the convolutional net on MNIST with them. Compare the results with Mean-Squared Error (the energy loss applied to an energy that is quadratic in Y).

The suggested loss functions are: negative log-likelihood (with variable beta), margin loss, and square-square loss.
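As a rough guide, the per-sample losses can be written as functions of the vector of energies, one energy per candidate digit. The sketch below is an assumption about the exact forms: it uses the hinge loss as one example of a margin loss, and defines the "most offending" answer as the incorrect class with the lowest energy. Check the definitions against the class notes before relying on them. All function names are illustrative.

```python
import math


def energy_loss(energies, y):
    """Energy of the correct answer y."""
    return energies[y]


def nll_loss(energies, y, beta=1.0):
    """E_y + (1/beta) * log(sum_k exp(-beta * E_k)), computed stably."""
    m = min(energies)
    # Factor exp(-beta * m) out of the sum so every exponent is <= 0.
    log_sum = -beta * m + math.log(
        sum(math.exp(-beta * (e - m)) for e in energies))
    return energies[y] + log_sum / beta


def most_offending_energy(energies, y):
    """Lowest energy among the incorrect answers."""
    return min(e for k, e in enumerate(energies) if k != y)


def hinge_loss(energies, y, margin=1.0):
    """max(0, margin + E_correct - E_most_offending)."""
    return max(0.0, margin + energies[y] - most_offending_energy(energies, y))


def square_square_loss(energies, y, margin=1.0):
    """E_correct^2 + max(0, margin - E_most_offending)^2."""
    gap = max(0.0, margin - most_offending_energy(energies, y))
    return energies[y] ** 2 + gap ** 2
```

With energies [0.5, 2.0, 3.0] and correct class 0, the hinge loss with margin 1 is zero because the correct energy is already more than one unit below every incorrect one.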

Try to train on the full 660,000 distorted training samples, and see if you can beat the world-record test error (0.39%).

MNIST: Adaboost

Implement Adaboost, and use it to boost a multinomial logistic regression, and to boost the convolutional net.
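The sketch below shows the reweighting loop of AdaBoost.M1, one common multi-class form of the algorithm; the project may intend a different variant. The weak learner is passed in as a function, so the same loop could wrap a logistic regression or a convolutional net that accepts per-sample weights (or that trains on a weighted resampling of the data). The threshold "stump" learner is a toy stand-in included only to make the example runnable, and every name here is hypothetical.

```python
import math


def adaboost_m1(X, y, fit_weak, rounds):
    """fit_weak(X, y, w) must return a function mapping x to a label."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []  # list of (vote weight, classifier)
    for _ in range(rounds):
        h = fit_weak(X, y, w)
        missed = [h(x) != t for x, t in zip(X, y)]
        err = sum(wi for wi, m in zip(w, missed) if m)
        if err >= 0.5:  # AdaBoost.M1 needs weak learners better than 50%
            break
        err = max(err, 1e-10)  # avoid division by zero for a perfect learner
        beta = err / (1.0 - err)
        ensemble.append((math.log(1.0 / beta), h))
        # Shrink the weights of correctly classified samples, then renormalize.
        w = [wi * (1.0 if m else beta) for wi, m in zip(w, missed)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, x):
    """Weighted vote over the labels proposed by the weak classifiers."""
    votes = {}
    for alpha, h in ensemble:
        label = h(x)
        votes[label] = votes.get(label, 0.0) + alpha
    return max(votes, key=votes.get)


def make_stump(thr, lo, hi):
    return lambda x: lo if x < thr else hi


def fit_stump(X, y, w):
    """Toy weak learner for 1-D inputs and labels 0/1 (illustration only)."""
    xs = sorted(set(X))
    thresholds = [xs[0] - 1.0] + [(a + b) / 2.0 for a, b in zip(xs, xs[1:])]
    best_err, best_h = None, None
    for thr in thresholds:
        for lo, hi in ((0, 1), (1, 0)):
            h = make_stump(thr, lo, hi)
            err = sum(wi for xi, yi, wi in zip(X, y, w) if h(xi) != yi)
            if best_err is None or err < best_err:
                best_err, best_h = err, h
    return best_h
```

On a toy set such as X = [0, 1, 2, 3, 4, 5] with labels [1, 1, 0, 0, 1, 1], no single stump separates the classes, but adaboost_m1(X, y, fit_stump, 3) combines three of them into a vote that does.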

MNIST: Jittered Queries

Train the convolutional net with versions of the samples that are randomly shifted by 1 pixel left, right, up, or down.

To test the network, run it 9 times on each test sample: shift the sample to all 9 positions (0 or ±1 pixel in each direction) and combine the 9 outputs to get the most reliable answer.
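One possible way to organize the test-time procedure is sketched below. It makes two assumptions that the project statement leaves open: the 9 outputs are combined by summing the score vectors, and a higher score means a more likely class (if your network outputs energies, pick the minimum instead). Images are plain lists of rows, the border is filled with a background value, and net stands for any function that maps an image to a list of class scores.

```python
def shift_image(image, dx, dy, fill=0):
    """Shift an image by dx pixels right and dy pixels down, padding with fill."""
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            src_r, src_c = r - dy, c - dx
            if 0 <= src_r < h and 0 <= src_c < w:
                out[r][c] = image[src_r][src_c]
    return out


def predict_jittered(net, image):
    """Sum the class scores over the 9 shifted copies and return the best class."""
    total = None
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            scores = net(shift_image(image, dx, dy))
            if total is None:
                total = list(scores)
            else:
                total = [t + s for t, s in zip(total, scores)]
    return max(range(len(total)), key=total.__getitem__)
```

The same shift_image helper can generate the shifted training samples, which keeps the training-time and test-time jitter consistent.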

Compare this with standard training.

MNIST: Overlapping subsampling layers

Analyze the effect of translated digits on the performance of convolutional nets and SVMs.

MNIST: thresholded backprop

Train the convolutional net, and modify the backprop code so that samples whose error is below a threshold get skipped during training.

See if the speed of convergence or the accuracy improves.
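The control flow is easy to prototype on a tiny model before touching the Lush code. The sketch below uses a one-parameter linear model with a squared-error loss purely as a stand-in for the network: the forward pass always runs, and the gradient step is skipped when the per-sample loss falls below the threshold. The function name and the choice of loss are assumptions made for illustration.

```python
def train_epoch(samples, w, lr=0.1, threshold=0.0):
    """One pass of stochastic gradient descent on the model y = w * x.

    Samples whose loss is below threshold are skipped. Returns the updated
    weight and the number of gradient updates actually performed.
    """
    updates = 0
    for x, y in samples:
        err = w * x - y          # forward pass
        loss = 0.5 * err * err
        if loss < threshold:     # sample already well learned: skip backprop
            continue
        w -= lr * err * x        # gradient step
        updates += 1
    return w, updates
```

Setting threshold=0.0 recovers standard training, which gives a baseline for comparing both convergence speed and the number of backward passes saved.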

MNIST: fastest convolutional layer implementation

You need access to a multicore or multi-cpu Intel or AMD system for this one.

The core of Lush's convolutional network implementation is the c-layer class defined in lush/packages/gblearn2/gb-modules-nn.lsh.

The current implementations of the fprop and bprop methods are generic and slow.

Since most of the time is spent performing convolutions, see if you can hack a faster version using all the tricks in the book: Intel's IPP library (which uses the SSE3 instruction set to vectorize convolutions), the ATLAS library, multi-threaded implementations (using pthreads), and so on.

You should be able to get a speedup of 5x to 10x over the current implementation on a single CPU. Can you get a speedup of 50x on an 8-core machine?
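Whatever optimizations you try, it helps to keep a slow but obviously correct reference to check results against. The sketch below is such a reference for a single "valid" 2D convolution, written in plain Python. It assumes the kernel is applied without flipping (cross-correlation style); whether the c-layer flips its kernels is not stated here, so verify that against the Lush source. The function names are illustrative.

```python
def conv2d_valid(image, kernel):
    """Naive 'valid' 2D convolution (no kernel flip, no padding)."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(iw - kw + 1)
        ]
        for r in range(ih - kh + 1)
    ]


def max_abs_diff(a, b):
    """Largest element-wise difference between two equally sized 2D lists."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```

Dumping a few inputs and outputs from the optimized layer and comparing them with max_abs_diff against this reference is a cheap way to catch indexing and threading bugs before measuring speed.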

MNIST: Fourier features

Compute the 2D Fourier power spectrum of the MNIST digits (with the zero-frequency components in the center of the image).

Train the convolutional net on the Fourier transforms.
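To make the preprocessing concrete, here is a standard-library Python sketch of a centered 2D power spectrum. It uses a naive separable DFT, which is slow but easy to check; in practice an FFT routine such as numpy.fft.fft2 followed by numpy.fft.fftshift computes the same thing much faster. The function names are made up for this example, and the zero-frequency component is placed at index (height // 2, width // 2).

```python
import cmath


def dft_1d(values):
    """Naive 1-D discrete Fourier transform."""
    n = len(values)
    return [
        sum(values[k] * cmath.exp(-2j * cmath.pi * u * k / n) for k in range(n))
        for u in range(n)
    ]


def power_spectrum_centered(image):
    """Squared magnitude of the 2-D DFT, with zero frequency at the center."""
    h, w = len(image), len(image[0])
    rows = [dft_1d(row) for row in image]               # transform along x
    cols = [dft_1d(list(col)) for col in zip(*rows)]    # then along y
    power = [[abs(cols[v][u]) ** 2 for v in range(w)] for u in range(h)]
    # Circularly shift so that frequency (0, 0) lands at (h // 2, w // 2).
    return [
        [power[(u - h // 2) % h][(v - w // 2) % w] for v in range(w)]
        for u in range(h)
    ]
```

Note that the power spectrum discards phase, so it is unchanged by circular shifts of the input.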

Diabetic Retinopathy

This project is proposed by Prof. Lakshmi Subramanian in the Networks and Distributed Systems group at NYU.

The idea is to train a convolutional network (using the implementation in the lush/packages/gblearn2 library) to classify image patches extracted from images of the human retina. The categories include normal retinas and various stages of diabetic retinopathy. Diabetic retinopathy is the leading cause of blindness in the world (particularly in India and China), even though it is generally preventable if diagnosed early.

More details are available here.

Images are available here.

You can get a good start on this project within the scope of the class assignment, but if you are looking for an independent study project next semester, or for a Masters thesis, this could be a good choice.

Learning with Conjugate Gradient

Implement and test Polak-Ribiere and Fletcher-Reeves Conjugate Gradient optimizations in the homework 3 (backprop) code.

Compare them with stochastic gradient on regression problems.

This comes down to implementing a new subclass of the "parameter" class in which the "update" method is redefined, and a subclass of the "simple-trainer" class in which the "learn-sample" and "learn" methods are redefined.
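The two variants differ only in how the coefficient beta mixes the previous search direction into the new one. The Python sketch below shows both on a small quadratic problem, 0.5 * x'Ax - b'x, where the exact line-search step has a closed form; for the homework code you would need a general line search instead, and the class structure described above is not reproduced here. All names are illustrative.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(A, v):
    return [dot(row, v) for row in A]


def conjugate_gradient(A, b, x, variant="PR", max_iters=50, tol=1e-20):
    """Minimize 0.5 * x'Ax - b'x for symmetric positive definite A."""
    g = [ax - bi for ax, bi in zip(matvec(A, x), b)]  # gradient: Ax - b
    d = [-gi for gi in g]                             # first direction
    for _ in range(max_iters):
        gg = dot(g, g)
        if gg < tol:
            break
        Ad = matvec(A, d)
        alpha = -dot(g, d) / dot(d, Ad)               # exact line search
        x = [xi + alpha * di for xi, di in zip(x, d)]
        g_new = [gi + alpha * adi for gi, adi in zip(g, Ad)]
        if variant == "FR":                           # Fletcher-Reeves
            beta = dot(g_new, g_new) / gg
        else:                                         # Polak-Ribiere
            diff = [gn - go for gn, go in zip(g_new, g)]
            beta = dot(g_new, diff) / gg
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        g = g_new
    return x
```

On a quadratic with exact line search the two formulas produce the same iterates, so differences between them only show up on the non-quadratic losses of the backprop code.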

Option 1: port your implementation to Lush's gblearn2 library.

Option 2: implement quasi-Newton methods such as limited-memory BFGS.