Computer Vision - CSCI-GA.2271-001

Semester: Fall 2012.

Time and Location: Tuesday 5:00-6:50pm, Room 1221, 715 Broadway.

Instructor: Rob Fergus     

Office hours: Tuesday 6:50-8:50pm, Room 1226, 12th floor, 715 Broadway.

Overview

Computer Vision aims to extract descriptions of the world from pictures or video. In recent years, much progress has been made on this challenging problem. The course will start by looking at the established area of geometric vision. It will then move on to mid-level problems such as tracking and segmentation. The final part of the course will focus on recognition, particularly on the problem of detecting object classes (e.g. bottles, shoes, cars) in images, currently a topic of much research interest.

Prerequisites

The course will be suitable for master's and PhD students. A reasonable knowledge of linear algebra will be required, along with some basic concepts in machine learning. The homework assignments will require Matlab, so familiarity with it is desirable, although not essential.

Assessment

Assessment will be through four graded homework assignments.

Schedule

Date Time Topics Relevant Book Chapters
WEEK 1
Tues 09/04/2012
17:00-18:50 1. Introduction, Image Formation (Slides - PPT) (Slides - PDF) Szeliski, Ch. 1 and 2; F & P, Ch. 1
WEEK 2
Tues 09/11/2012
17:00-18:50 2. Color; Filters, Edge detection (Slides - PPT) (Slides - PDF) Szeliski, Ch. 3 and 4; F & P, Ch. 6, 7 and 8
WEEK 3
Tues 09/18/2012
17:00-18:50 3. Corner & Region detection and representation. (Slides - PPT) (Slides - PDF) Szeliski, Ch. 3 and 4; F and P, ch. 3 and 16; Lowe 2004

Assignment 1 Out (PDF) (einstein.jpg)
WEEK 4
Tues 09/25/2012
17:00-18:50 4. Fitting, RANSAC, Hough transform and Image Alignment (Slides - PPT) (Slides - PDF) Szeliski, Ch. 6; F & P sec. 3.1, ch. 15; Winder and Brown 2007
WEEK 5
Tues 10/02/2012
17:00-18:50 5. Epipolar geometry, Stereo reconstruction (Slides - PPT) (Slides - PDF) Szeliski, Ch. 7; H and Z, ch. 9-12; F and P, ch. 10 and 11
WEEK 6
Tues 10/09/2012
17:00-18:50 6. Multiview Stereo, Structure from Motion (Slides - PPT) (Slides - PDF) Szeliski, Ch. 7; F and P, ch. 12, 13; H and Z, ch. 18.
Tues 10/09/2012
17:00 Assignment 1 Due

Assignment 2 Out (PDF) (world) (image.txt) (sfm_point.mat)
WEEK 7
Fall Break (No class)
WEEK 8
Tues 10/23/2012
17:00-18:50 7. Introduction to Recognition. (Slides - PPT) (Slides - PDF)
WEEK 9
Tues 10/30/2012
17:00-18:50 8. Specific Object Recognition & Faces (Slides - PPT) (Slides - PDF) Szeliski, Ch. 14.
WEEK 10
Tues 11/06/2012
17:00-18:50 9. Recognition - Bag of words models (Slides - PPT) (Slides - PDF) Szeliski, Ch. 14.
Tues 11/06/2012
17:00 Assignment 2 Due

Assignment 3 Out (PDF) (faces.zip)
WEEK 11
Tues 11/13/2012
17:00-18:50 10. Parts-based models (Slides - PPT) (Slides - PDF)
WEEK 12
Tues 11/20/2012
17:00-18:50 11. Segmentation (Slides - PPT) (Slides - PDF)
WEEK 13
Tues 11/27/2012
17:00-18:50 12. Neural-network models Szeliski, Ch. 5
WEEK 14
Tues 12/04/2012
17:00-18:50 13. Optical flow (Slides - PPT) (Slides - PDF) Szeliski, Ch. 8
Tues 12/04/2012
17:00 Assignment 3 Due

Assignment 4 Out (PDF) (tiger.jpg) (baseball.jpg) (Shi and Malik, PAMI 2000)
WEEK 15
Tues 12/11/2012
17:00-18:50 14. Internet Vision (Slides - PPT) (Slides - PDF)
EXAM WEEK
Tues 12/18/2012
17:00 Assignment 4 Due (Note that this is a strict deadline)

Acknowledgments

The instructor would like to thank Andrew Zisserman and Svetlana Lazebnik for making their slides available. Thanks also go to Fei-Fei Li and Antonio Torralba for creating the ICCV'05/CVPR'07 object recognition tutorial slides used in classes 11, 12 and 13.

Textbook

The main textbook that we will use is:

Szeliski, Richard, Computer Vision: Algorithms and Applications, Springer, 2011. This book is available in electronic form at: Link

There are also a couple of other textbooks relevant to the course, although we won't be directly using them:

Forsyth, David A., and Ponce, J. Computer Vision: A Modern Approach, Prentice Hall, 2003.

Hartley, R. and Zisserman, A. Multiple View Geometry in Computer Vision, Cambridge University Press, 2002.

Both of these are available from the CIMS library.

For the object recognition part of the course, please see the Object Recognition Short Course. Link

Additional Material

Matlab guides

Matlab tutorial by Hany Farid and Eero Simoncelli Link

A more comprehensive Matlab tutorial by David Griffiths Link

Further documentation on Matlab can be found here Link

Books

Palmer, Stephen E. Vision Science: Photons to Phenomenology, MIT Press, 1999.

Strang, Gilbert. Linear Algebra and Its Applications 2/e, Academic Press, 1980.

Wandell, Brian A. Foundations of Vision, Sinauer, 1995.