Semester: Fall 2017. Time and Location: Thursday 7:10-9:00pm, Warren Weaver Hall, Room 1302. Instructor: Rob Fergus. Office hours: Thursday 6:00-7:00pm, Room 514, 60 5th Ave. Tutor: Rohit Muthyala (rrm404@nyu.edu).


Computer Vision aims to extract descriptions of the world from pictures or video. In recent years, much progress has been made on this challenging problem. The course will start by looking at the established area of geometric vision. It will then introduce the convolutional network model and describe the profound impact that it has had on recognition, segmentation and many other vision problems.
The course will be suitable for master's students and advanced undergraduates. A reasonable knowledge of linear algebra will be required, along with some basic concepts in machine learning. The homeworks will require Torch, so familiarity with it is desirable, although not essential.
Assessment will be through three graded homework assignments and a final course project. The weighting across these will be: 15%, 10%, 25%, 50%.
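As a rough illustration of how these weights combine, the sketch below computes an overall course score. It assumes that the weights apply, in the order listed, to homework 1, homework 2, homework 3 and the final project, and that each component is scored out of 100; neither assumption is stated above, and the names `WEIGHTS` and `course_score` are hypothetical.

```python
# Assumed mapping of the listed weights to components (not confirmed above).
WEIGHTS = {"hw1": 0.15, "hw2": 0.10, "hw3": 0.25, "project": 0.50}


def course_score(scores):
    """Return the weighted sum of component scores, each assumed out of 100."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical scores, for illustration only.
example = course_score({"hw1": 80, "hw2": 90, "hw3": 70, "project": 85})
```

Under these assumptions the project alone contributes half of the final score, so it outweighs all three homeworks individually.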
The policy regarding late homework is as follows: (a) assignments that are late by less than 24hrs will suffer a 10% reduction; (b) those between 24 and 72hrs late will suffer a 25% reduction; and (c) those more than 72hrs late will suffer a 50% reduction. You are strongly encouraged to start each assignment early, and don't be afraid to ask for help.
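The late policy can be sketched as a small function. This is only an illustration: it assumes that the reduction is a fraction of the earned score, that lateness is measured in hours, and that exactly 24 or 72 hours late falls into the middle band; the function names are hypothetical.

```python
def late_penalty(hours_late):
    """Return the fractional reduction for a submission `hours_late` hours late."""
    if hours_late <= 0:
        return 0.0   # on time: no reduction
    if hours_late < 24:
        return 0.10  # (a) less than 24 hours late
    if hours_late <= 72:
        return 0.25  # (b) between 24 and 72 hours late (boundaries assumed inclusive)
    return 0.50      # (c) more than 72 hours late


def apply_late_penalty(score, hours_late):
    """Scale an earned score by the late reduction (assumed multiplicative)."""
    return score * (1.0 - late_penalty(hours_late))
```

For example, under these assumptions `apply_late_penalty(80, 30)` would reduce a score of 80 by 25%.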
Relevant Book Chapters, in schedule order:
Szeliski, Ch. 1 and 2; F & P, Ch. 1
Szeliski, Ch. 3 and 4; F & P, Ch. 6, 7 and 8
Szeliski, Ch. 6; F & P, Ch. 3.1 and 15; Lowe 2004
Winder and Brown 2007
Szeliski, Ch. 7; F & P, Ch. 10 and 11
Szeliski, Ch. 14

The main textbook that we will use is:
Szeliski, Richard, Computer Vision: Algorithms and Applications, Springer, 2011. This book is available in electronic form at: Link
There are also a couple of other textbooks relevant to the course, although we won't be directly using them:
Forsyth, David A., and Ponce, J. Computer Vision: A Modern Approach, Prentice Hall, 2003.
Hartley, R. and Zisserman, A. Multiple View Geometry in Computer Vision, Academic Press, 2002.
Both these are available from the CIMS library.
For the object recognition part of the course, please see the Object Recognition Short Course. Link
Matlab tutorial by Hany Farid and Eero Simoncelli Link
A more comprehensive Matlab tutorial by David Griffiths Link
Further documentation on Matlab can be found here Link
Palmer, Stephen E. Vision Science: Photos to Phenomenology, MIT Press, 1999.
Strang, Gilbert. Linear Algebra and Its Applications 2/e, Academic Press, 1980.
Wandell, Brian A. Foundations of Vision, Sinauer, 1995.