This is the list of final project topics.
Projects can be done individually, in groups of 2, or (with
permission) in groups of 3.
Some projects require you to download the MNIST dataset.
Getting the MNIST dataset |
The MNIST dataset is composed of 60,000 training samples and 10,000
test samples of handwritten digits. Details about MNIST are available
here.
An additional set of digits produced by randomly distorting the
original MNIST images is available (see below).
To obtain the data:
BE CAREFUL: your browser may gunzip the files without telling you.
Each of those files is a Lush matrix that can be loaded into Lush
with (load-matrix "thefile") or, more efficiently, memory-mapped
into RAM with (mmap-matrix "thefile").
A Lush script is available to train a convolutional network
on MNIST. The script can be obtained
here. The convolutional net has 20 feature maps at the first
and second layers, and 50 at the third and fourth.
The script should give you 0.7% to 0.8% error on the test set after
training on the original training set (no distorted samples).
The net is quite a bit bigger than the one described in the
paper "Gradient-Based Learning Applied to Document Recognition".
If you need help with this project and can't get hold of me, you
can contact the TA, Piotr Mirowski, or one of my PhD students who have
worked with MNIST and convolutional nets:
- Fu Jie Huang (Jie): jhuangfu [ at ] cs
- Sumit Chopra: sumit [ at ] cs
MNIST: Energy-Based Losses |
Implement two or three of the energy-based losses discussed in class,
and train the convolutional net on MNIST using them.
Compare the results with mean-squared error (i.e., the energy loss
with an energy that is quadratic in Y).
The suggested loss functions are: negative log-likelihood
(with variable beta), margin loss, and square-square loss.
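As a rough, framework-free illustration of the three suggested losses
(not the Lush implementation), here is a NumPy sketch for a net that
outputs one energy per class; the beta and margin values are
placeholders.

    import numpy as np

    def nll_loss(energies, y, beta=1.0):
        # Negative log-likelihood with inverse temperature beta:
        # L = E(y) + (1/beta) * log sum_k exp(-beta * E(k)), computed stably.
        z = -beta * energies
        logsumexp = z.max() + np.log(np.exp(z - z.max()).sum())
        return energies[y] + logsumexp / beta

    def margin_loss(energies, y, margin=1.0):
        # Push the correct energy below the most offending incorrect one
        # by at least `margin`.
        wrong = np.delete(energies, y)
        return max(0.0, margin + energies[y] - wrong.min())

    def square_square_loss(energies, y, margin=1.0):
        # Pull the correct energy toward 0, push the most offending
        # incorrect energy above `margin`.
        wrong = np.delete(energies, y)
        return energies[y] ** 2 + max(0.0, margin - wrong.min()) ** 2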
Try to train on the full 660,000 distorted training samples,
and see if you can beat the world record (0.39%).
Implement AdaBoost, and use it to boost a multinomial logistic
regression, and then to boost the convolutional net.
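A minimal sketch of the multiclass AdaBoost (SAMME) reweighting loop,
assuming a generic fit_weak(X, y, w) base learner that returns a
prediction function; all names here are placeholders, not part of the
course code.

    import numpy as np

    def adaboost(fit_weak, X, y, n_classes, n_rounds=10):
        n = len(y)
        w = np.full(n, 1.0 / n)                  # uniform sample weights
        learners, alphas = [], []
        for _ in range(n_rounds):
            h = fit_weak(X, y, w)                # train weak learner on weights
            err = w[h(X) != y].sum()
            if err >= 1.0 - 1.0 / n_classes:     # no better than chance: stop
                break
            err = max(err, 1e-12)
            alpha = np.log((1.0 - err) / err) + np.log(n_classes - 1.0)
            w *= np.exp(alpha * (h(X) != y))     # up-weight the mistakes
            w /= w.sum()
            learners.append(h)
            alphas.append(alpha)

        def predict(Xq):                         # weighted vote of the learners
            votes = np.zeros((len(Xq), n_classes))
            for h, a in zip(learners, alphas):
                votes[np.arange(len(Xq)), h(Xq)] += a
            return votes.argmax(axis=1)
        return predict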
Train the convolutional net with versions of the samples that are
randomly shifted by 1 pixel to the left, right, top, and bottom.
To test the network, run it 9 times for each test sample: shift
the test sample to all 9 positions (±1 pixel in each direction)
and combine the 9 outputs to get the most reliable answer.
Compare this with standard training.
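One simple way to combine the 9 outputs is to sum (or average) the
per-class scores over the shifted copies. A NumPy sketch follows, where
predict_scores is assumed to return a length-10 score vector and pixels
shifted in from outside the image are filled with background zeros.

    import numpy as np

    def shift(img, dy, dx):
        # Shift a 28x28 image by (dy, dx) pixels, padding with zeros.
        p = np.pad(img, 1)
        return p[1 + dy:1 + dy + img.shape[0], 1 + dx:1 + dx + img.shape[1]]

    def classify_with_shifts(predict_scores, img):
        # Run the trained net on all 9 one-pixel shifts and sum the outputs.
        scores = sum(predict_scores(shift(img, dy, dx))
                     for dy in (-1, 0, 1) for dx in (-1, 0, 1))
        return scores.argmax()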
MNIST: Overlapping subsampling layers |
Analyze the effect of translated digits on the performance of
convolutional nets and SVMs.
- train an SVM on MNIST (use the code in lush/packages/svm, or ask
Jie about other implementations).
- train the convolutional net.
- translate all the digits randomly by ±1 pixel in each
direction (training and test samples); a sketch of this step follows the list.
- retrain and test the SVM and the convolutional net with the
translated data.
- compare the increase in error rate between the SVM and the
convolutional net as the amount of shift increases.
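A sketch of the random translation step, assuming the digits are stored
as an (n, 28, 28) NumPy array and using zero padding at the exposed
borders; the function name and default shift are placeholders.

    import numpy as np

    rng = np.random.default_rng(0)

    def random_translate(images, max_shift=1):
        # Shift every image by a random offset in [-max_shift, +max_shift]
        # along each axis, filling exposed borders with zeros.
        n, h, w = images.shape
        out = np.zeros_like(images)
        for i in range(n):
            dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
            p = np.pad(images[i], max_shift)
            out[i] = p[max_shift + dy:max_shift + dy + h,
                       max_shift + dx:max_shift + dx + w]
        return out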
MNIST: thresholded backprop |
Train the convolutional net, and modify the backprop code
so that samples whose error is below a threshold are skipped during
training.
See if the speed of convergence or the accuracy improves.
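A schematic version of the modified training loop; forward, backward,
and update stand in for whatever your backprop code provides, and the
threshold value is arbitrary.

    def train_epoch(samples, forward, backward, update, threshold=0.01):
        # Skip the backward pass and weight update for samples whose loss
        # is already below the threshold; count how many were skipped.
        skipped = 0
        for x, y in samples:
            loss = forward(x, y)
            if loss < threshold:
                skipped += 1
                continue
            backward(x, y)
            update()
        return skipped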
MNIST: fastest convolutional layer implementation |
You need access to a multicore or multi-CPU Intel or AMD system for this one.
The core of Lush's convolutional network implementation is the c-layer
class defined in lush/packages/gblearn2/gb-modules-nn.lsh
The current implementations of the fprop and bprop methods are generic and slow.
Since most of the time is spent performing convolutions, see if you
can hack a faster version using all the tricks in the book:
Intel's IPP library (which uses the SSE3 instruction set to vectorize
convolutions), the ATLAS library, multi-threaded implementations
(using pthreads), etc.
You should be able to get a speed-up of 5 to 10 over the
current implementation on a single CPU. Can you get a speed-up
of 50 on an 8-core machine?
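The IPP/SSE3 and pthreads routes are machine specific, but one generic
trick is to rewrite the convolution as a single matrix product
("im2col"), which then benefits from any optimized BLAS such as ATLAS.
A NumPy sketch of the idea for a single input plane (an illustration,
not the c-layer code):

    import numpy as np

    def conv2d_im2col(image, kernels):
        # Valid 2-D convolution of one (H, W) image with a bank of
        # (n, kh, kw) kernels, expressed as one big matrix product.
        H, W = image.shape
        n, kh, kw = kernels.shape
        oh, ow = H - kh + 1, W - kw + 1
        cols = np.empty((kh * kw, oh * ow))
        for i in range(kh):                        # gather every kh x kw patch
            for j in range(kw):
                cols[i * kw + j] = image[i:i + oh, j:j + ow].ravel()
        out = kernels.reshape(n, kh * kw) @ cols   # one GEMM does the work
        return out.reshape(n, oh, ow)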
MNIST: Fourier power spectrum |
Compute the 2D Fourier power spectrum of the MNIST digits (with the
zero-frequency component in the center of the image).
Train the convolutional net on these power spectra.
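The preprocessing step is a few lines of NumPy; the log rescaling is an
optional assumption, often useful because the spectrum has a very large
dynamic range.

    import numpy as np

    def power_spectrum(img):
        # 2-D Fourier power spectrum with the zero-frequency component
        # shifted to the center of the image.
        f = np.fft.fftshift(np.fft.fft2(img))
        return np.log1p(np.abs(f) ** 2)   # log scaling is optional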
Retinal image classification |
This project is proposed by
Prof. Lakshmi Subramanian in
the Networks and Distributed Systems group at NYU.
The idea is to train a convolutional network (using the implementation
in the lush/packages/gblearn2 library) to classify image patches
extracted from images of the human retina. The categories include
normal retinas and various stages of diabetic retinopathy. Diabetic
retinopathy is the leading cause of blindness in the world
(particularly in India and China), even though it is generally
preventable if diagnosed early.
More details are available here.
Images are available
here.
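If you need a quick way to cut fixed-size training patches out of a
large retinal image, a simple NumPy sketch follows; the patch size and
stride are arbitrary placeholders, and labels must be assigned
separately.

    import numpy as np

    def extract_patches(image, patch=32, stride=16):
        # Slide a square window over an (H, W) or (H, W, 3) image and
        # collect every patch.
        patches = []
        for y in range(0, image.shape[0] - patch + 1, stride):
            for x in range(0, image.shape[1] - patch + 1, stride):
                patches.append(image[y:y + patch, x:x + patch])
        return np.stack(patches)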
You can get a good start on this project within the scope of the
class assignment, but if you are looking for an independent study
project next semester, or for a Masters thesis, this could be a good
choice.
Learning with Conjugate Gradient |
Implement and test Polak-Ribière and Fletcher-Reeves Conjugate
Gradient optimization in the homework 3 (backprop) code.
Compare them with stochastic gradient descent on regression problems.
This comes down to implementing a new subclass of the "parameter"
class in which the "update" method will be redefined, and a subclass
of the "simple-trainer" classes in which the methods "learn-sample"
and "learn" will be redefined.
Option 1: port your implementation to Lush's gblearn2 library.
Option 2: implement quasi-Newton methods such as limited-memory BFGS.
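As a language-neutral reference for the direction updates themselves
(not the Lush class structure), here is a small NumPy sketch of
nonlinear conjugate gradient with Polak-Ribière or Fletcher-Reeves
beta; a fixed step size stands in for the line search you would
normally implement.

    import numpy as np

    def nonlinear_cg(grad, w, variant="pr", step=0.01, n_iters=100):
        # grad(w) must return the gradient of the loss at parameter vector w.
        g = grad(w)
        d = -g
        for _ in range(n_iters):
            w = w + step * d                           # a line search goes here
            g_new = grad(w)
            if variant == "fr":                        # Fletcher-Reeves
                beta = (g_new @ g_new) / (g @ g)
            else:                                      # Polak-Ribière
                beta = (g_new @ (g_new - g)) / (g @ g)
                beta = max(beta, 0.0)                  # common restart rule
            d = -g_new + beta * d
            g = g_new
        return w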
References:
- Y. LeCun, L. Bottou, G. Orr, and K.-R. Müller,
"Efficient BackProp,"
in Neural Networks: Tricks of the Trade, 1998.
- Martin F. Møller,
"A Scaled Conjugate Gradient Algorithm for Fast Supervised Learning,"
Neural Networks, 1993.
- D. C. Liu and J. Nocedal,
"On the Limited Memory BFGS Method for Large Scale Optimization,"
Mathematical Programming B, 45, 3, pp. 503-528, 1989.