Machine Learning and Pattern Recognition: Assignments


[ Course Homepage | Schedule and Course Material | Mailing List ]
Skeleton code for the assignments is provided in the Lush language.

You may use other languages to implement your assignments, but the time you spend re-implementing infrastructure code will likely exceed the time needed to get familiar with Lush (except perhaps for the first assignment).

For the adventurous among us, there is an open source C++ library called eblearn that implements all the functionality of the Lush numerical and machine learning libraries. It is still a bit rough, but usable.

Lush is available for Linux: http://lush.sf.net.

If you only have a Windows machine, you have several solutions:

  • Use one of the CIMS server machines, on which Lush is already installed. You can access them remotely using ssh (e.g. "ssh -X access.cims.nyu.edu"), or you can go to one of the public computer labs at Warren Weaver Hall.
  • Install Ubuntu on your Windows machine without repartitioning your hard drive using Wubi. Wubi is a Windows app that installs Ubuntu in a file on your Windows partition (the file is a virtual hard drive from Linux's point of view). Wubi is not an emulator, so you have to boot your machine in either Linux or Windows. You can't run them at the same time.
  • Download and install Ubuntu. The Ubuntu installer offers you the option of shrinking your Windows partition non-destructively to make space for Linux. The installation is very simple and takes about 20 minutes.
  • Install Ubuntu through VMware so you can run Windows and Ubuntu at the same time.
In any case, as a graduate student in Computer Science, you have to be exposed to Linux/Unix sooner or later.

Installing Lush

Do not install the Lush package that comes with Ubuntu: it is badly out of date. Instead, install Lush from the CVS repository on SourceForge as explained below.

After you have installed Ubuntu Linux, you must install a number of packages using the shell commands below (type your password when prompted):

sudo apt-get install gcc g++ libx11-dev binutils-dev indent libreadline5 libreadline5-dev 
sudo apt-get install emacs cvs libgsl0-dev imagemagick

Then, at the shell, run:

cd
cvs -d:pserver:anonymous@lush.cvs.sourceforge.net:/cvsroot/lush login 
# (press Enter when prompted for a password)
cvs -z3 -d:pserver:anonymous@lush.cvs.sourceforge.net:/cvsroot/lush co lush
cd lush
./configure
make
This will download and compile Lush in your home directory. The Lush executable is in lush/bin/i686-pc-linux-gnu/lush. For convenience, you can link this executable to some place in your shell search path, for example with:
cd
mkdir -p bin
cd bin
ln -s ../lush/bin/i686-pc-linux-gnu/lush .
assuming that the bin directory in your homedir is in your search path. The alternative is to do
cd
cd lush
sudo make install
to perform a system-wide installation.
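
The symbolic link approach above relies on the bin directory in your home directory being in your search path. A minimal sketch of how to check this, and to add the directory for the current shell session if it is missing, is shown here. The function name in_path is made up for this illustration and is not part of Lush or Ubuntu.

```shell
# Return success if directory $1 is already listed in $PATH.
# (in_path is a hypothetical helper name used only for this example.)
in_path() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Add ~/bin to the search path for this shell session only, if it is missing.
if ! in_path "$HOME/bin"; then
    PATH="$HOME/bin:$PATH"
    export PATH
fi
```

This change lasts only until you close the shell. To make it permanent you would typically put the export in a startup file such as ~/.profile or ~/.bashrc; which file applies depends on your shell and its configuration.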

The course will include a short tutorial on the Lush language, a simple interpreted language for numerical applications.

Linear Machines and Regularization

Implementing the Perceptron Algorithm, MSE Classifier (linear regression), Logistic Regression. Details and datasets below:

  • Download this .tgz archive. It contains the datasets for all the homeworks.
  • Download this .tgz archive. It contains the homework description.
  • "cd" to a directory and decompress the two files in this directory using "tar xvfz thefile.tgz" on Unix/Linux or Mac.
  • This will create two directories: datasets and hw-linear.
  • The file hw-linear/README.txt contains the questions and instructions.
  • Most of the necessary Lush code is provided.
  • The due date is Monday, October 17th.
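
The decompression step above can be sketched as a small shell function that unpacks every .tgz archive sitting in a given directory. The function name unpack_all and the directory name in the usage note are assumptions made for this example; the actual archive names are whatever the download links give you.

```shell
# Unpack every .tgz archive found in directory $1 into that same directory.
# (unpack_all is a hypothetical helper name used only for this example.)
unpack_all() {
    dir="$1"
    for archive in "$dir"/*.tgz; do
        [ -e "$archive" ] || continue   # skip if the pattern matched nothing
        tar xzf "$archive" -C "$dir"    # same as "tar xvfz", minus the file listing
    done
}
```

For example, if both archives were saved in a directory called ~/mlpr (a made-up name), running unpack_all ~/mlpr should leave the datasets and hw-linear directories inside it.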

Neural Networks and Backprop

Click on this link to get the homework: hw-backprop-v20.tgz.

The due date is Monday, November 7, before class.

K-Means and Mixtures of Gaussians

Click on this link to get the homework: hw-unsup-v20.tgz.

The due date is Monday, November 21.

Final Projects

Click here for a list of final projects.

Projects can be done individually, in groups of 2, or (with permission) in groups of 3.

If you have an interest in a particular topic, you can propose your own project topic, subject to approval by the instructor. Send a description of your project proposal to the instructor and the TA before November 15th.

Otherwise, a list of possible projects will be proposed during the course of the semester (early November).

All projects are due December 23rd.

There will be a project showcase and demo show on Tuesday December 14th on the 13th floor of Warren Weaver Hall from 5:00 to 9:00 PM. Extra points will be given to those making a presentation (poster and/or demo) at the project showcase.

You must send a .tar or .tgz file to the TA with your code and a PDF file describing your project and the results you obtained.
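
One possible way to build such an archive is sketched below. The function name make_submission and the file names in the usage note are assumptions made for this example, not course requirements.

```shell
# Bundle a code directory and a report PDF into one .tgz archive.
#   $1 = output archive, $2 = code directory, $3 = report PDF
# (make_submission is a hypothetical helper name used only for this example.)
make_submission() {
    out="$1"; code_dir="$2"; report="$3"
    [ -d "$code_dir" ] || { echo "missing code directory: $code_dir" >&2; return 1; }
    [ -f "$report" ]   || { echo "missing report: $report" >&2; return 1; }
    tar czf "$out" "$code_dir" "$report"
}
```

For example, make_submission project.tgz code report.pdf would bundle a directory named code and a file named report.pdf from the current directory. Running tar tzf project.tgz afterwards lists the archive contents so you can check them before sending.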

It is expected that you will implement a learning algorithm yourself, or use an existing one in a new and interesting way.

In other words, merely downloading an SVM package off the web and running it on a standard dataset won't get you a good grade.

On a few occasions, class projects have been known to turn into conference publications...

List of Proposed Projects

A more extensive list is coming soon.

Among other topics, we are looking for volunteers to implement various standard learning algorithms and demos for the C++ library eblearn.
