This is the list of final project topics.
Projects can be done individually, in groups of 2, or (with
permission) in groups of 3.
You can propose your own project topic, subject to approval by the instructor.
You must send a .tar or .tgz file to the TA with your code and a PDF file
describing your project and the results you obtained.
Implement Standard Learning Algorithms with Eblearn |
Eblearn is a C++ library that implements
classes and functionalities similar to Lush's gblearn2 package for gradient-based
and energy-based learning.
This series of projects consists in implementing a number of standard algorithms
and applications using eblearn. Suggested algorithms include AdaBoost, PCA,
K-Means, and Mixtures of Gaussians.
Students should contact Pierre Sermanet (pierre DOT sermanet AT gmail
DOT com), the principal maintainer of EBLearn, for information about
which modules and functionality should be implemented.
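As one concrete reference point for what such a module should compute, here is a minimal K-Means sketch in Python; the actual project would implement it as an EBLearn C++ module, and the 2-D data and initialization below are purely illustrative.

```python
def kmeans(points, k, iters=20):
    """Plain K-Means on 2-D points: assign each point to its nearest
    centroid, then move each centroid to the mean of its cluster."""
    centroids = points[:k]                      # simple deterministic init
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: (p[0] - centroids[c][0]) ** 2
                                + (p[1] - centroids[c][1]) ** 2)
            clusters[j].append(p)
        for j, cl in enumerate(clusters):
            if cl:                              # keep old centroid if empty
                centroids[j] = (sum(p[0] for p in cl) / len(cl),
                                sum(p[1] for p in cl) / len(cl))
    return centroids

# two well-separated blobs: the centroids land near each blob's mean
pts = ([(0.1 * i, 0.05 * i) for i in range(10)]
       + [(10 + 0.1 * i, 10 + 0.05 * i) for i in range(10)])
centers = sorted(kmeans(pts, 2))
```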
Predicting Financial Data |
We have a dataset consisting of description vectors of various
companies, together with a variable that indicates whether the company
defaulted on its loans.
The project consists in predicting whether the company will default
using various methods, including neural nets, logistic regression,
SVM, and perhaps other methods (each project team should pick a good
subset).
The complication resides in the fact that some variables are missing,
hence a latent variable inference model should be used.
Contact the instructor to obtain the dataset.
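As a baseline illustration only (not the latent-variable inference model the project calls for), here is a logistic-regression sketch in Python that handles missing values by mean imputation plus a "was missing" indicator feature; the data is synthetic, not the actual dataset.

```python
import math, random

rng = random.Random(0)
# synthetic stand-in for the company data: one informative feature
# that is missing 30% of the time, plus a default/no-default label
data = []
for _ in range(400):
    y = 1.0 if rng.random() < 0.5 else 0.0
    x = rng.gauss(1.0 if y else -1.0, 1.0)
    data.append((x if rng.random() > 0.3 else None, y))

# mean-impute missing values and add a missingness indicator feature
obs = [x for x, _ in data if x is not None]
mean = sum(obs) / len(obs)
feats = [(x if x is not None else mean, 0.0 if x is not None else 1.0)
         for x, _ in data]

w0 = w1 = b = 0.0
lr = 0.5
for _ in range(300):                          # batch gradient descent
    g0 = g1 = gb = 0.0
    for (f0, f1), (_, y) in zip(feats, data):
        p = 1.0 / (1.0 + math.exp(-(w0 * f0 + w1 * f1 + b)))
        g0 += (p - y) * f0
        g1 += (p - y) * f1
        gb += p - y
    n = len(data)
    w0 -= lr * g0 / n
    w1 -= lr * g1 / n
    b -= lr * gb / n

correct = sum((w0 * f0 + w1 * f1 + b > 0) == (y > 0.5)
              for (f0, f1), (_, y) in zip(feats, data))
accuracy = correct / len(data)
```

With a fully observed informative feature the model is near the Bayes rate; the imputed 30% drag the accuracy down, which is precisely the gap a latent-variable model should close.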
Implementing Learning Modules on NVidia GPUs with CUDA |
CUDA is a C-like language for writing programs for NVidia GPU cards.
Lush has a framework for writing, compiling, and calling CUDA functions.
A small number of modules (multiple convolutions, etc.) have been
implemented with this framework, but more are needed: for example,
fully-connected neural nets and sparse coding algorithms.
Students interested in this should contact Benoit Corda
(cordaben AT gmail DOT com) for the CUDA/Lush interface code
and information about which functions to implement.
You need to have access to an NVidia graphics card with a GPU
that supports CUDA (GeForce 9xxx, GTX-2xx, GTX-3xx, GTX-4xx, etc.).
A couple of publicly accessible CIMS servers have CUDA GPU cards
in them. A number of CBLL servers also have CUDA cards but
they are heavily used.
Learning Sparse and Overcomplete Representations for Music and Speech |
Learning algorithms to produce sparse and overcomplete representations have been
popular for image recognition, image denoising, and other visual tasks.
The project consists in applying these methods to musical signal analysis
and speech.
The method can be used as a first step in a "blind source separation" system
that could separate instrument tracks from a monophonic recording.
This would consist in adapting existing code (in Lush and/or C++)
and running it on audio datasets.
Some information on this class of algorithms is available here.
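To give the flavor of this class of algorithms, here is a Python sketch of sparse inference with a fixed dictionary via ISTA (iterative shrinkage-thresholding); the atoms and signal are hand-set stand-ins, whereas the project would learn an overcomplete dictionary from audio.

```python
import math

# toy dictionary of 3 unit-ish "atoms" in R^4; the signal is 2 * atom 0,
# so the sparse code should put (almost) all its weight on atom 0
D = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.5, 0.5, 0.5, 0.5]]
x = [2.0, 0.0, 0.0, 0.0]

def soft(v, t):
    """Soft-thresholding: the proximal operator of the L1 norm."""
    return math.copysign(max(abs(v) - t, 0.0), v)

lam, eta = 0.1, 0.2                  # L1 weight, gradient step size
z = [0.0, 0.0, 0.0]                  # sparse code, one entry per atom
for _ in range(200):                 # ISTA iterations
    # residual (reconstruction - signal), gradient D (Dz - x), prox step
    r = [sum(D[k][j] * z[k] for k in range(3)) - x[j] for j in range(4)]
    g = [sum(D[k][j] * r[j] for j in range(4)) for k in range(3)]
    z = [soft(z[k] - eta * g[k], eta * lam) for k in range(3)]
# z minimizes 1/2 ||x - reconstruction||^2 + lam * ||z||_1,
# landing near (2 - lam, 0, 0): one active atom, the rest exactly zero
```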
Belief Propagation for Music Composition |
The purpose of the project is to build a factor graph model
that implements the rules of simple counterpoint.
Given a few notes, the model can be used to "fill in the blanks"
by finding melodies that satisfy all the rules.
Each rule can be seen as a factor in a factor graph.
The best note sequence will be obtained with Belief Propagation
or some other efficient inference procedure.
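On a chain-structured factor graph, max-product belief propagation reduces to Viterbi-style dynamic programming. The Python sketch below uses made-up toy factors in place of real counterpoint rules (prefer stepwise motion, forbid large leaps) and fills in three notes between two clamped ones.

```python
# Max-product inference on a chain of note variables.  The "rules"
# below are illustrative stand-ins for counterpoint rules.
NOTES = list(range(60, 73))                 # MIDI C4..C5, variable domain

def pair_factor(a, b):
    """Factor on consecutive notes: steps score 1.0, repeats 0.2,
    small leaps 0.5, leaps over 5 semitones are forbidden."""
    d = abs(a - b)
    if d > 5:
        return 0.0
    return {0: 0.2, 1: 1.0, 2: 1.0}.get(d, 0.5)

def unary(i, n, clamped):
    """Clamping factor: position i must take its clamped value."""
    return 1.0 if clamped.get(i, n) == n else 0.0

def fill_in(length, clamped):
    """Forward max-product messages plus backtracking (Viterbi)."""
    m = [{n: unary(0, n, clamped) for n in NOTES}]
    back = []
    for i in range(1, length):
        cur, bp = {}, {}
        for n in NOTES:
            best = max(NOTES, key=lambda p: m[-1][p] * pair_factor(p, n))
            cur[n] = m[-1][best] * pair_factor(best, n) * unary(i, n, clamped)
            bp[n] = best
        m.append(cur)
        back.append(bp)
    seq = [max(NOTES, key=lambda n: m[-1][n])]
    for bp in reversed(back):
        seq.append(bp[seq[-1]])
    return seq[::-1]

melody = fill_in(5, {0: 60, 4: 64})   # fill 3 notes between C4 and E4
```

With cycles in the graph (rules spanning more than two notes), exact dynamic programming no longer applies directly and loopy belief propagation or a junction tree would be needed.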
Lp pooling for Convolutional Nets |
Convolutional networks have been trained with a number of different
pooling functions, such as average, average followed by tanh, and max.
This project concerns the implementation and testing of other pooling
functions, such as (SUM_i X_i^p)^(1/p), also known as the "Lp norm",
as well as 1/b log[ SUM_i exp(b*X_i) ], also known as log-sum pooling.
These will have to be implemented within Lush and/or EBLearn,
and tested on the MNIST or CIFAR-10 dataset.
Students should contact Koray Kavukcuoglu for the code and datasets
(koray AT cs DOT nyu DOT edu).
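A quick numeric illustration of the two pooling functions (the activation values are arbitrary):

```python
import math

def lp_pool(xs, p):
    """Lp pooling: (sum_i x_i^p)^(1/p).  p=1 gives the plain sum;
    as p grows the output approaches max(xs)."""
    return sum(x ** p for x in xs) ** (1.0 / p)

def logsum_pool(xs, b):
    """Log-sum pooling: (1/b) * log(sum_i exp(b*x_i));
    large b also approaches max(xs)."""
    return math.log(sum(math.exp(b * x) for x in xs)) / b

acts = [0.1, 0.5, 2.0]   # activations inside one pooling window
# both families interpolate toward max(acts) = 2.0 as p or b grows
```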
Trainable Contrast Normalization |
Convolutional nets and multilayer neural nets perform well
when the state of each layer has zero mean, and when the
state variables all have similar standard deviations.
The goal of this project is to implement a trainable subtractive
and divisive contrast normalization layer for convolutional networks.
(Current such modules are set up by hand and not trained.)
The idea is to make a trainable convolutional module so that every
variable in a 3D set of feature maps is replaced by itself minus a
(trainable) linear combination of its neighbors, in such a way that
the output is sparse (e.g. so that it minimizes the L1 norm of
the state).
Students should contact Koray Kavukcuoglu for the code and datasets
(koray AT cs DOT nyu DOT edu).
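For reference, here is a hand-set (non-trainable) 1-D version of subtractive plus divisive normalization in Python; making the weights trainable, in 2-D across feature maps with a sparsity objective, is the actual project. The neighbor weights below are arbitrary.

```python
import math

def contrast_normalize(xs, w=(0.25, 0.5, 0.25), eps=1e-6):
    """Replace each value by (x_i - weighted neighbor mean),
    divided by the weighted neighbor standard deviation."""
    n, r = len(xs), len(w) // 2
    out = []
    for i in range(n):
        # neighbor window, clipped at the signal boundaries
        nb = [(w[j + r], xs[i + j])
              for j in range(-r, r + 1) if 0 <= i + j < n]
        wsum = sum(wj for wj, _ in nb)
        mean = sum(wj * x for wj, x in nb) / wsum          # subtractive part
        var = sum(wj * (x - mean) ** 2 for wj, x in nb) / wsum
        out.append((xs[i] - mean) / math.sqrt(var + eps))  # divisive part
    return out

flat = contrast_normalize([5.0, 5.0, 5.0, 5.0])   # constant input -> all 0
spike = contrast_normalize([0.0, 1.0, 0.0])       # spike stands out
```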
Recurrent ConvNet for Image segmentation |
The idea is to train a convolutional network to perform image segmentation
(for example, to label the edges in an image). This will be tested on
the Berkeley image segmentation dataset.
Background information on using ConvNets for segmentation can be found
here.
The convnet architecture would be recurrent, in that multiple
successive layers would share the same convolution kernels.
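The parameter-sharing idea can be illustrated in one dimension (Python sketch; the kernel values are arbitrary): applying the same kernel at every layer means depth adds computation and receptive-field size without adding parameters.

```python
def conv1d(x, k):
    """1-D 'valid' convolution (here with a symmetric kernel,
    so correlation and convolution coincide)."""
    return [sum(k[j] * x[i + j] for j in range(len(k)))
            for i in range(len(x) - len(k) + 1)]

kernel = [0.25, 0.5, 0.25]           # one shared (would-be-trained) kernel
x = [0.0] * 5 + [1.0] + [0.0] * 5    # unit impulse
for layer in range(3):               # three layers, identical weights
    x = conv1d(x, kernel)
# parameter count stays at len(kernel) = 3 regardless of depth,
# while the composite kernel (a binomial) keeps widening
```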
Parallelizing Stochastic Gradient |
Stochastic gradient is difficult to parallelize because each sample is
run sequentially (unlike batch training where the samples can be run
in parallel). The present project idea is to train multiple convnets
in parallel with stochastic gradient, and to synchronize their weight
parameters periodically, rarely enough so that the communication cost
is minimal.
Lush provides an object serialization infrastructure that makes it
easy to run multiple Lush processes that talk to each other through
sockets.
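A single-process Python simulation of the scheme, with a trivial one-parameter regression model standing in for a convnet and lists standing in for socket communication; learning rate, shard layout, and synchronization period are illustrative.

```python
import random

rng = random.Random(0)
# noisy samples of y = 3*x: each worker fits the single weight w
data = [(x, 3.0 * x + rng.gauss(0, 0.1))
        for x in [rng.uniform(-1, 1) for _ in range(1000)]]

def grad(w, x, y):
    """Gradient of the squared error (w*x - y)^2 for one sample."""
    return 2.0 * (w * x - y) * x

workers = [0.0] * 4                        # 4 simulated replicas
shards = [data[i::4] for i in range(4)]    # disjoint data shards
lr, sync_every = 0.1, 50

for step in range(500):
    for k in range(4):                     # one local SGD step per replica
        x, y = shards[k][rng.randrange(len(shards[k]))]
        workers[k] -= lr * grad(workers[k], x, y)
    if (step + 1) % sync_every == 0:       # infrequent synchronization:
        workers = [sum(workers) / 4] * 4   # average the replicas' weights

w = sum(workers) / 4
```

Synchronizing only every 50 steps keeps the (simulated) communication cost at 1/50 of lockstep averaging, yet all replicas still converge to nearly the same weight.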
Multi-step Denoising Autoencoder |
(details to follow).
Fields of Experts Model in Lush |
Implement the so-called "Fields of Experts" model in Lush (see
reference below).
Reference: Stefan Roth and Michael J. Black, "Fields of Experts: A
Framework for Learning Image Priors," in Proc. of the IEEE Conference
on Computer Vision and Pattern Recognition (CVPR), vol. 2,
pp. 860-867, June 2005.
This model can be used for image restoration, denoising etc.
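The prior's energy has the form E(x) = SUM_k SUM_i alpha_k * rho((J_k * x)_i), a sum of expert penalties on linear filter responses at every position. A 1-D Python sketch with Student-t experts and hand-set filters (the real model learns 2-D filters and weights from images):

```python
import math

def foe_energy(x, filters):
    """Field-of-Experts-style energy of a 1-D signal:
    sum over filters (J, alpha) and positions i of
    alpha * log(1 + (J * x)_i^2 / 2)   (Student-t expert penalty)."""
    e = 0.0
    for J, alpha in filters:
        for i in range(len(x) - len(J) + 1):
            r = sum(J[j] * x[i + j] for j in range(len(J)))  # filter response
            e += alpha * math.log(1.0 + r * r / 2.0)
    return e

# hand-set derivative-like filters (illustrative; the model learns them)
filters = [([1.0, -1.0], 1.0), ([1.0, -2.0, 1.0], 1.0)]
smooth = [0.1 * i for i in range(10)]
noisy = [0.1 * i + (0.5 if i % 2 else -0.5) for i in range(10)]
# the prior assigns lower energy (higher probability) to the smooth
# signal, which is what makes it useful for denoising and restoration
```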
Text/Image Segmenter for DjVuLibre |
Implement a foreground/background segmenter for DjVuLibre.
DjVuLibre is the open source version of the DjVu system.
It currently lacks a good segmenter that can separate
the text and drawings from the backgrounds and continuous-tone
images in a scanned document.
The project will consist in building a foreground/background segmenter
by using various clustering methods.
Segmentation-Free "Brute-Force" OCR |
Train the convolutional net implemented in Lush on snippets of images
extracted from scanned pages. The snippets should include neighboring
characters, so that the network output will be robust to "distracting"
characters on each side of the character to be recognized.
The purpose is to demonstrate a segmentation-free "brute force" OCR by
sweeping the convolutional network over an entire page of text.
Students can use the convolutional network implementations from the gblearn2
library in Lush (in lush/packages/gblearn2) or from the EBLearn C++ library.
Learning with Conjugate Gradient in Eblearn |
Implement and test Polak-Ribière and Fletcher-Reeves Conjugate
Gradient optimization within
the EBLearn C++ library.
Compare them with stochastic gradient on regression problems.
This comes down to implementing a new subclass of the "parameter"
class in which the "update" method will be redefined.
Optional: implement quasi-Newton methods such as limited-memory BFGS.
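As a sanity check for the update rule, here is a minimal Polak-Ribière conjugate gradient sketch in Python on a 2-D quadratic (the project itself would redefine the "update" method of a C++ "parameter" subclass); the matrix and vector are illustrative, and the exact line search is only valid because the objective is quadratic.

```python
# minimize f(w) = 1/2 w^T A w - b^T w; the minimizer solves A w = b
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def dot(u, v):
    return sum(a * c for a, c in zip(u, v))

w = [0.0, 0.0]
g = [gi - bi for gi, bi in zip(matvec(A, w), b)]     # gradient A w - b
d = [-gi for gi in g]                                # first search direction
for _ in range(10):
    Ad = matvec(A, d)
    alpha = -dot(g, d) / dot(d, Ad)                  # exact line search
    w = [wi + alpha * di for wi, di in zip(w, d)]
    g_new = [gi - bi for gi, bi in zip(matvec(A, w), b)]
    if dot(g_new, g_new) < 1e-20:                    # converged
        break
    # Polak-Ribiere beta, clipped at zero ("PR+")
    beta = max(0.0, dot(g_new, [a - c for a, c in zip(g_new, g)]) / dot(g, g))
    d = [-gn + beta * di for gn, di in zip(g_new, d)]
    g = g_new
# on an n-dimensional quadratic, CG with exact line search
# converges in at most n iterations (here: 2)
```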
References:
- Y. LeCun, L. Bottou, G. Orr, and K.-R. Müller,
"Efficient BackProp,"
in Neural Networks: Tricks of the Trade, 1998.
- Martin F. Møller,
"A Scaled Conjugate Gradient Algorithm for Fast Supervised Learning," 1993.
- D. C. Liu and J. Nocedal,
"On the Limited Memory BFGS Method for Large Scale Optimization,"
Mathematical Programming B, vol. 45, no. 3, pp. 503-528, 1989.