Visual Object Recognition and Retrieval

Tutorial at ICML 2008, Helsinki, Finland

Rob Fergus, Dept. of Computer Science, Courant Institute of Mathematical Sciences, New York University.
Abstract

The tutorial will address the problem of recognizing visual object classes in images, currently the focus of much interest in Computer Vision. As recent innovations in the area draw heavily on machine learning concepts, the tutorial will attempt to highlight the growing intersection between the two areas. The material will be divided into five sections, covering (i) bag of words models; (ii) parts and structure models; (iii) discriminative methods; (iv) objects and scenes; and (v) retrieval schemes for large datasets. The emphasis will be on the important general concepts rather than in-depth coverage of contemporary papers. The tutorial is a revised version of the prize-winning short course given at ICCV 2005 and CVPR 2007 in conjunction with Fei-Fei Li (Princeton) and Antonio Torralba (MIT) (Link).

Outline of content
  1. Introduction:
    • define the problem of object categorization (OC)
    • brief history
    • invariance issues in OC
    • representation
    • learning
    • recognition
  2. Bag of words models:
    • model representation
    • learning
    • recognition
    • demo
    • all related works
  3. Part-based models:
    • model representation
    • learning
    • recognition
    • demo
    • all related works
  4. Discriminative models:
    • model representation
    • learning
    • recognition
    • demo
    • all related works
  5. Objects and their contexts:
    • segmentation based recognition
    • context facilitated recognition
    • recognition and geometry
  6. Retrieval methods:
    • large datasets
    • Internet imagery
Suitable audience
The tutorial is suitable for anyone interested in Object Recognition as a problem in and of itself, or as a target application for machine learning tools. The material is suitable for first- or second-year graduate students and beyond.
Course Materials
Final version of course slides:
1. Introduction
2. Bag of Words models
3. Bag of Words models with spatial information
4. Parts and Structure
5. Classifier-based methods
6. Combined segmentation and recognition
7. Recognition for retrieval
8. Datasets and Conclusions


Link to previous incarnation of tutorial, given at CVPR 2007 and ICCV 2005.
Presenter's Background
Rob Fergus is currently an assistant professor of computer science at the Courant Institute at New York University, USA. He received an MSc from Caltech (with Prof. Pietro Perona) and a PhD from Oxford (with Prof. Andrew Zisserman). Before coming to NYU, he spent two years as a post-doc in the Computer Science and Artificial Intelligence Lab (CSAIL) at MIT, working with Prof. William Freeman. He won the prize for the best paper at CVPR 2003 and the award for best computer science thesis in the UK in 2005.