Visual Object Recognition and Retrieval

Tutorial at ICML 2008, Helsinki, Finland

Rob Fergus, Dept. of Computer Science, Courant Institute of Mathematical Sciences, New York University.

Abstract

The tutorial will address the problem of recognizing visual object classes in images, currently the focus of much interest in Computer Vision. As recent innovations in the area draw heavily on machine learning concepts, the tutorial will attempt to highlight the growing intersection between the two areas. The material will be divided into five sections, covering (i) bag of words models; (ii) parts and structure models; (iii) discriminative methods; (iv) objects and scenes; and (v) retrieval schemes for large datasets. The emphasis will be on the important general concepts rather than on in-depth coverage of contemporary papers. The tutorial is a revised version of the prize-winning short course given at ICCV 2005 and CVPR 2007 in conjunction with Fei-Fei Li (Princeton) and Antonio Torralba (MIT) (Link).

Outline of content

Introduction:

define the problem of object categorization (OC)
brief history
invariance issues in OC
representation
learning
recognition

Bag of words models:

model representation
learning
recognition
demo
all related works

Part-based models:

model representation
learning
recognition
demo
all related works

Discriminative models:

model representation
learning
recognition
demo
all related works

Objects and their contexts:

segmentation based recognition
context facilitated recognition
recognition and geometry

Retrieval methods:

large datasets
Internet imagery

Suitable audience

The tutorial is suitable for anyone interested in Object Recognition as a problem in and of itself, or as a target application for machine learning tools. The material is suitable for first- or second-year graduate students and beyond.

Course Materials

Final version of course slides:
1. Introduction
2. Bag of Words models
3. Bag of Words models with spatial information
4. Parts and Structure
5. Classifier-based methods
6. Combined segmentation and recognition
7. Recognition for retrieval
8. Datasets and Conclusions

Link to previous incarnation of tutorial, given at CVPR 2007 and ICCV 2005.

Presenter's Background

Rob Fergus is currently an assistant professor of computer science at the Courant Institute at New York University, USA. He received an MSc from Caltech (with Prof. Pietro Perona) and a PhD from Oxford (with Prof. Andrew Zisserman). Before coming to NYU, he spent two years as a post-doc in the Computer Science and Artificial Intelligence Lab (CSAIL) at MIT, working with Prof. William Freeman. He won the prize for the best paper at CVPR 2003 and the award for best computer science thesis in the UK in 2005.