# Computer Vision - CSCI-GA.2271-001

Semester: Fall 2017.

Time and Location: Thursday 7:10-9:00pm, Warren Weaver Hall, Room 1302.

Instructor: Rob Fergus

Office hours: Thursday 6:00-7:00pm, Room 514, 60 5th Ave.

Tutor: Rohit Muthyala (rrm404@nyu.edu).

Grader: Utku Evci (ue225@nyu.edu).

## Overview

Computer Vision aims to extract descriptions of the world from pictures or video. In recent years, much progress has been made on this challenging problem. The course will start by looking at the established area of geometric vision. It will then introduce the convolutional network model and describe the profound impact that it has had on recognition, segmentation and many other vision problems.

## Prerequisites

The course will be suitable for master's students and advanced undergraduates. A reasonable knowledge of linear algebra will be required, along with some basic concepts in machine learning. The homeworks will require Torch, so familiarity with it is desirable, although not essential.

## Assessment

Assessment will be through three graded homework assignments and a final course project. The weighting across these will be: 15%, 10%, 25%, 50%.

## Late Policy

The policy regarding late homework is as follows: (a) assignments that are late by less than 24 hrs will suffer a 10% reduction; (b) those between 24 and 72 hrs late will suffer a 25% reduction; and (c) those more than 72 hrs late will suffer a 50% reduction. You are strongly encouraged to start each assignment early, and please don't be afraid to ask for help.

## Schedule

| Date | Time | Topics | Relevant Book Chapters |
| --- | --- | --- | --- |
| **WEEK 1** | | | |
| Thur 09/07/2017 | 19:10-21:00 | 1. Introduction, Image Formation Pt. 1 | Szeliski, Ch. 1 and 2; F & P, Ch. 1 |
| Thur 09/07/2017 | 19:10-21:00 | 2. Image Formation Pt. 2 (Slides - PPT) (Slides - PDF) | |
| | | Assignment 0 Out (PDF) (assign0_data.py) | |
| **WEEK 2** | | | |
| Thur 09/14/2017 | 19:10-21:00 | 3. Color (Slides - PPT) (Slides - PDF) | Szeliski, Ch. 3 and 4; F & P, Ch. 6, 7 and 8 |
| Thur 09/14/2017 | 19:10-21:00 | 4. Filtering & Edges (Slides - PPT) (Slides - PDF) | |
| **WEEK 3** | | | |
| Tues 09/18/2017 | 23:59 | Assignment 0 Due | |
| Thur 09/21/2017 | 19:10-21:00 | 5. Fitting, RANSAC, Region representation | Szeliski, Ch. 6; F & P, Ch. 3.1 and 15; Lowe 2004 |
| Thur 09/21/2017 | 19:10-21:00 | 6. Image alignment, optical flow. (Slides - PPT) (Slides - PDF) | Winder and Brown 2007 |
| | | Assignment 1 Out (PDF) (assignment1.zip) | |
| **WEEK 4** | | | |
| Thur 09/28/2017 | 19:10-21:00 | 7. Epipolar geometry | Szeliski, Ch. 7; F & P, Ch. 10 & 11 |
| Thur 09/28/2017 | 19:10-21:00 | 8. Stereo reconstruction (Slides - PPT) (Slides - PDF) | |
| **WEEK 5** | | | |
| Thur 10/05/2017 | 19:10-21:00 | 9. Structure from Motion. | Szeliski, Ch. 14 |
| Thur 10/05/2017 | 19:10-21:00 | 10. Large scale structure from motion. (Slides - PPT) (Slides - PDF) | |
| **WEEK 6** | | | |
| Thur 10/12/2017 | 19:10-21:00 | 11. Neural nets 1 | |
| | | Assignment 2 Out (iPython Notebook) | |
| Thur 10/12/2017 | 19:10-21:00 | 12. Neural nets 2 (Slides - PPT) (Slides - PDF) | |
| Thur 10/12/2017 | 23:59 | Assignment 1 Due | |
| **WEEK 7** | | | |
| Thur 10/19/2017 | 19:10-21:00 | 13. Convolutional Networks 1 (Slides - PPT) (Slides - PDF) | |
| Thur 10/19/2017 | 19:10-21:00 | 14. Convolutional Networks 2 | |
| **WEEK 8** | | | |
| Thur 10/26/2017 | 19:10-21:00 | 15. Object Classification 1 (Slides - PPT) (Slides - PDF) | |
| Thur 10/26/2017 | 19:10-21:00 | 16. Object Classification 2 | |
| **WEEK 9** | | | |
| Thur 11/02/2017 | 19:10-21:00 | 17. Object Detection | |
| Thur 11/02/2017 | 19:00 | Assignment 2 Due | |
| | | Assignment 3 Out (PDF) | |
| Thur 11/02/2017 | 19:10-21:00 | 18. Object Detection Pt. 2 (PPT) (PDF) | |
| **WEEK 10** | | | |
| Thur 11/09/2017 | 19:10-21:00 | 19. Face Recognition (PDF) | |
| Thur 11/09/2017 | 19:10-21:00 | 20. Efficient Implementation Methods (PDF) | |
| **WEEK 11** | | | |
| Thur 11/16/2017 | 19:10-21:00 | 21. Semantic segmentation, depth prediction | |
| Thur 11/16/2017 | 19:10-21:00 | 22. Images and Text 1 (PDF) | |
| **WEEK 12** | | | |
| | | Thanksgiving Break (No class) | |
| **WEEK 13** | | | |
| Thur 11/30/2017 | 19:10-21:00 | 23. Text and Images 2 | |
| Thur 11/30/2017 | 19:10-21:00 | 24. Attention and Memory in Deep Nets (PDF) | |
| **WEEK 14** | | | |
| Thur 12/07/2017 | 19:10-21:00 | 25. GANs 1 (Slides - PDF) | |
| Thur 12/07/2017 | 19:10-21:00 | 26. GANs 2 | |
| Thur 12/07/2017 | 19:00 | Assignment 3 Due | |
| **WEEK 15** | | | |
| Thur 12/14/2017 | 19:10-21:00 | 27. Unsupervised Learning 1 (PDF) | |
| Thur 12/14/2017 | 19:10-21:00 | 28. Unsupervised Learning 2 | |
| **EXAM WEEK** | | | |
| Fri 12/22/2017 | 23:59 | Project Due (Note that this is a strict deadline) | |

## Acknowledgments

The instructor would like to thank Andrew Zisserman and Svetlana Lazebnik for making their slides available. Thanks also go to Fei-Fei Li and Antonio Torralba for creating the ICCV'05/CVPR'07 object recognition tutorial slides used in classes 11, 12 and 13.

## Textbook

The main textbook that we will use is:

Szeliski, Richard. Computer Vision: Algorithms and Applications, Springer, 2011. This book is available in electronic form at: Link

There are also a couple of other textbooks relevant to the course, although we won't be directly using them:

Forsyth, David A., and Ponce, J. Computer Vision: A Modern Approach, Prentice Hall, 2003.

Hartley, R. and Zisserman, A. Multiple View Geometry in Computer Vision, Cambridge University Press, 2002.

Both of these are available from the CIMS library.

For the object recognition part of the course, please see the Object Recognition Short Course. Link

### Matlab guides

Matlab tutorial by Hany Farid and Eero Simoncelli Link

A more comprehensive Matlab tutorial by David Griffiths Link

Further documentation on Matlab can be found here Link

### Books

Palmer, Stephen E. Vision Science: Photons to Phenomenology, MIT Press, 1999.

Strang, Gilbert. Linear Algebra and Its Applications 2/e, Academic Press, 1980.

Wandell, Brian A. Foundations of Vision, Sinauer, 1995.