This is the list of final project topics.
Projects can be done individually, in groups of 2, or (with
permission) in groups of 3.
Some projects require you to download the MNIST dataset.
Getting the MNIST dataset 
The MNIST dataset is composed of 60,000 training samples and 10,000
test samples of handwritten digits. Details about MNIST are available
here.
An additional set of digits produced by randomly distorting the
original MNIST images is available (see below).
To obtain the data:
BE CAREFUL: your browser may gunzip the files without telling you.
Each of those files is a Lush matrix that can be loaded into Lush
with (load-matrix "thefile"), or more efficiently, it can be
memory-mapped in RAM with (mmap-matrix "thefile").
A Lush script is available to train a convolutional network
on MNIST. The script can be obtained
here. The convolutional net has 20 feature maps at the first
and second layers, and 50 at the third and fourth.
The script should give you 0.7% to 0.8% error on the test set after
training on the original training set (no distorted samples).
The net is quite a bit bigger than the one described in the
paper "Gradient-Based Learning Applied to Document Recognition".
If you need help with this project and can't get a hold of me you
can contact the TA Piotr Mirowski, or one of my PhD students who has
worked with MNIST and Convolutional nets:
 Fu Jie Huang (Jie): jhuangfu [ at ] cs
 Sumit Chopra: sumit [ at ] cs
MNIST: Energy-Based Losses 
Implement two or three of the energy-based losses discussed in class,
and train the convolutional net on MNIST with each of them.
Compare the results with mean-squared error (the energy loss
with an energy quadratic in Y).
The suggested loss functions are: negative log-likelihood
(with variable beta), margin loss, and square-square loss.
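The course code is in Lush, but the losses themselves are
language-neutral. Here is a minimal NumPy sketch of the three suggested
losses, assuming the net outputs a vector of per-class energies (lower
is better) and y is the index of the correct class; the function names
and the default margin m are illustrative, not part of the assignment.

    import numpy as np

    def nll_loss(energies, y, beta=1.0):
        # negative log-likelihood: E(y) + (1/beta) * log sum_k exp(-beta * E(k))
        a = -beta * energies
        log_z = (a.max() + np.log(np.exp(a - a.max()).sum())) / beta
        return energies[y] + log_z

    def margin_loss(energies, y, m=1.0):
        # penalize when E(correct) is not at least m below the most
        # offending incorrect answer
        e_bar = np.delete(energies, y).min()
        return max(0.0, m + energies[y] - e_bar)

    def square_square_loss(energies, y, m=1.0):
        # push E(correct) toward 0, push the best wrong answer above m
        e_bar = np.delete(energies, y).min()
        return energies[y] ** 2 + max(0.0, m - e_bar) ** 2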
Try to train on the full 660,000 distorted training samples,
and see if you can beat the world record (0.39%).
Implement AdaBoost, and use it to boost a multinomial logistic
regression, and to boost the convolutional net.
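Since the weak learners here are full classifiers, AdaBoost.M1 is the
natural variant. Below is a minimal NumPy sketch of the boosting loop,
assuming a hypothetical weak-learner factory whose objects expose
fit(X, y, sample_weight) and predict(X) returning labels 0..9; this
interface is an assumption for illustration, not the course API.

    import numpy as np

    def adaboost_m1(X, y, make_learner, n_rounds=10):
        # keep one weight per training sample; reweight after each round
        n = len(y)
        w = np.full(n, 1.0 / n)
        learners, alphas = [], []
        for _ in range(n_rounds):
            h = make_learner()
            h.fit(X, y, sample_weight=w)     # weak learner must accept weights
            miss = h.predict(X) != y
            eps = np.dot(w, miss)            # weighted training error
            if eps <= 0.0 or eps >= 0.5:     # M1 needs error strictly in (0, 0.5)
                break
            alpha = np.log((1.0 - eps) / eps)
            w *= np.exp(alpha * miss)        # boost the weight of mistakes
            w /= w.sum()
            learners.append(h)
            alphas.append(alpha)
        return learners, alphas

    def adaboost_predict(learners, alphas, X, n_classes=10):
        # weighted vote of all boosted learners
        votes = np.zeros((len(X), n_classes))
        for h, a in zip(learners, alphas):
            votes[np.arange(len(X)), h.predict(X)] += a
        return votes.argmax(axis=1)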
Train the convolutional net with versions of the samples that are
randomly shifted by 1 pixel to the left, right, top, and bottom.
To test the network, run it 9 times for each test sample: shift
the test sample to all 9 positions (shifts of 0 or ±1 pixel in each
direction) and combine the outputs of the 9 runs to get the most
reliable answer.
Compare this with standard training.
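A NumPy sketch of the shift-and-vote scheme, assuming net is a
hypothetical callable mapping a 28x28 image to a vector of per-class
scores (higher is better); if the outputs are energies, average them
and take the argmin instead.

    import numpy as np

    SHIFTS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]  # 9 positions

    def shift(img, dy, dx):
        # translate a 2D image by (dy, dx) pixels, padding with background (0)
        out = np.zeros_like(img)
        h, w = img.shape
        out[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)] = \
            img[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
        return out

    def vote(net, img):
        # run the net on all 9 shifted copies and average the outputs
        outputs = np.stack([net(shift(img, dy, dx)) for dy, dx in SHIFTS])
        return outputs.mean(axis=0).argmax()

The same shift routine can generate the randomly shifted training
samples.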
MNIST: Overlapping subsampling layers 
Modify the convolutional net so that its subsampling windows overlap,
and analyze the effect of translated digits on the performance of
convolutional nets and SVMs.
MNIST: thresholded backprop 
Train the convolutional net, but modify the backprop code
so that samples whose error is below a threshold are skipped during
training.
See if the speed of convergence or the accuracy improves.
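A sketch of the modified training loop, with a hypothetical net object
mirroring Lush's fprop/bprop style interface; the method names and the
threshold value are assumptions for illustration.

    def train_epoch(net, samples, threshold=0.01):
        # skip the backward pass for samples the net already gets nearly
        # right; only the cheap forward pass is paid for them
        n_skipped = 0
        for x, target in samples:
            loss = net.fprop(x, target)   # forward pass returns per-sample loss
            if loss < threshold:
                n_skipped += 1            # no gradient computation, no update
                continue
            net.bprop()                   # backprop only on hard samples
            net.update()                  # stochastic gradient update
        return n_skipped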
MNIST: fastest convolutional layer implementation 
You need access to a multi-core or multi-CPU Intel or AMD system for this one.
The core of Lush's convolutional network implementation is the c-layer
class defined in lush/packages/gblearn2/gb-modules-nn.lsh.
The current implementations of the fprop and bprop methods are generic and slow.
Since most of the time is spent performing convolutions, see if you
can hack a faster version using all the tricks in the book:
Intel's IPP library (which uses the SSE3 instruction set to vectorize
convolutions), the ATLAS library, multithreaded implementations
(using pthreads), etc.
You should be able to get a speedup of 5 to 10 over the
current implementation on a single CPU. Can you get a speedup
of 50 on an 8-core machine?
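The real speed work here has to happen in C, but the single biggest
trick is algorithmic and easy to show in NumPy: unroll the input
patches into a matrix ("im2col") so the whole bank of convolutions
becomes one large matrix multiply, which is exactly what tuned BLAS
libraries like ATLAS are fastest at. A sketch for illustration only
(like most neural-net code it computes a correlation, i.e. the kernel
is not flipped):

    import numpy as np

    def conv2d_im2col(x, kernels):
        # x: (H, W) input plane; kernels: (K, kh, kw) bank of K kernels
        K, kh, kw = kernels.shape
        H, W = x.shape
        oh, ow = H - kh + 1, W - kw + 1
        # every kh x kw patch becomes one row of the im2col matrix
        patches = np.lib.stride_tricks.sliding_window_view(x, (kh, kw))
        cols = patches.reshape(oh * ow, kh * kw)
        out = cols @ kernels.reshape(K, kh * kw).T   # one big GEMM call
        return out.T.reshape(K, oh, ow)              # K output feature maps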
MNIST: learning in the Fourier domain 
Compute the 2D Fourier power spectrum of the MNIST digits (with the
zero-frequency component in the center of the image).
Train the convolutional net on the Fourier transforms.
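A minimal NumPy sketch of the preprocessing step; fftshift moves the
zero-frequency component to the center of the 28x28 image, as the
project asks.

    import numpy as np

    def power_spectrum(img):
        # 2D FFT, shifted so the DC component sits at the image center
        f = np.fft.fftshift(np.fft.fft2(img))
        return np.abs(f) ** 2   # power spectrum, same shape as the input

Feeding log(1 + power) to the net instead of the raw power may help,
since the DC component otherwise dominates; that is an assumption worth
testing, not part of the assignment.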
Detecting diabetic retinopathy 
This project is proposed by
Prof. Lakshmi Subramanian in
the Networks and Distributed Systems group at NYU.
The idea is to train a convolutional network (using the implementation
in the lush/packages/gblearn2 library) to classify image patches
extracted from images of the human retina. The categories include
normal retinas and various stages of diabetic retinopathy. Diabetic
retinopathy is the leading cause of blindness in the world
(particularly in India and China), even though it is generally
preventable if diagnosed early.
More details are available here.
Images are available
here.
You can get a good start on this project within the scope of the
class assignment, but if you are looking for an independent study
project next semester, or for a Masters thesis, this could be a good
choice.
Learning with Conjugate Gradient 
Implement and test Polak-Ribière and Fletcher-Reeves conjugate
gradient optimizations in the homework 3 (backprop) code.
Compare them with stochastic gradient descent on regression problems.
This comes down to implementing a new subclass of the "parameter"
class in which the "update" method is redefined, and a subclass
of the "simple-trainer" class in which the "learn-sample"
and "learn" methods are redefined.
Option 1: port your implementation to Lush's gblearn2 library.
Option 2: implement quasi-Newton methods such as limited-memory BFGS.
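The assignment targets Lush's gblearn2 classes, but the algorithm is
language-neutral. Here is a minimal NumPy sketch of nonlinear conjugate
gradient with both direction formulas, assuming f(w) and grad(w)
evaluate the loss and gradient over the whole training set (CG, unlike
stochastic gradient, is a batch method); the Armijo backtracking line
search is the simplest choice, used here for brevity, and the
references below discuss better ones.

    import numpy as np

    def backtracking_line_search(f, w, d, g, alpha=1.0, c=1e-4, tau=0.5):
        # shrink alpha until the Armijo sufficient-decrease condition holds
        f0, slope = f(w), g @ d
        while f(w + alpha * d) > f0 + c * alpha * slope and alpha > 1e-12:
            alpha *= tau
        return alpha

    def nonlinear_cg(f, grad, w, n_iters=100, variant="PR"):
        g = grad(w)
        d = -g                                   # first step: steepest descent
        for _ in range(n_iters):
            alpha = backtracking_line_search(f, w, d, g)
            w = w + alpha * d
            g_new = grad(w)
            if variant == "PR":                  # Polak-Ribiere
                beta = max(0.0, g_new @ (g_new - g) / (g @ g))
            else:                                # Fletcher-Reeves
                beta = (g_new @ g_new) / (g @ g)
            d = -g_new + beta * d                # new search direction
            g = g_new
        return w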
References:
 Y. LeCun, L. Bottou, G. Orr, and K.-R. Müller,
"Efficient BackProp,"
in Neural Networks: Tricks of the Trade, Springer, 1998.
 Martin F. Møller,
"A Scaled Conjugate Gradient Algorithm for Fast Supervised Learning,"
Neural Networks, 1993.
 D. C. Liu and J. Nocedal,
"On the Limited Memory BFGS Method for Large Scale Optimization,"
Mathematical Programming B, 45(3), pp. 503-528, 1989.
