Web Search Engines (CSCI-GA 2580)

Fall 2014, Department of Computer Science, NYU


Current News:

2014/09/28: Sign up for project demo slots (by 5pm November 10, 2014) here.
2014/09/15: Sign up for project group here.
2014/04/27: Course website is up.

Brief Description:

Search engines have become a core part of our daily lives. In this course, we will study the foundations of information retrieval and the technical aspects of modern Web search engines. We will also explore advanced and special topics that have become highly influential in Web search.

You are expected to study the course material (textbook and research papers), participate in class discussion, and work on a class project that involves system design and implementation.

Instructors and Logistics:

Dr. Fernando Diaz (Microsoft Research), first_initial_and_last_name [AT] cs dot nyu dot edu
Dr. Cong Yu (Google Research), full_name [AT] cs dot nyu dot edu

Teaching Assistants (Questions regarding homeworks should be sent to TAs first):
  Si Liu, sl4072 [at] nyu [dot] edu
  Weicheng Ma, hikaritgpass [at] nyu [dot] edu

Prerequisite: You are expected to have a good knowledge of algorithms and of at least one major programming language.
Although not a strict prereq, having taken UA.0310 is a good proxy.

Time and Location: Mondays 5:10p - 7:00p, CIWW 1302.
Office Hours: Mondays 4:00p - 5:00p, WWH 328 (starting 9/15)
Mailing List: csci_ga_2580_001_fa14 [AT] cs dot nyu dot edu

Textbook: Search Engines: Information Retrieval in Practice, by W. Bruce Croft, Donald Metzler, and Trevor Strohman. Addison-Wesley, 2009.

Grading:
Participation 10%;
Exams 40%: Midterm 15%, Final 25%;
Project 50%: 3 Homeworks 30% (10% each), Project Report 10%, Project Demo 10%.

Course Schedule (tentative)

Notation: FD = Fernando Diaz; CY = Cong Yu.
Reading materials will be provided on the website approximately one week before the lecture date.

Date | Topic (Instructor) | Reading Material | Deadlines
Lec 00 (a, b) (09/08) | Introduction (FD) | Chapter 1–2 | HW0 out.
Lec 01 (09/15) | Evaluation (FD) | Chapter 8 |
Lec 02 (09/22) | Ranking (FD) | Chapter 7 | HW0 due; HW1 out.
Lec 03 (09/29) | Indexing (CY) | Chapter 5 |
Lec 04 (10/06) | Document Processing (CY) | Chapter 4 | HW1 due; HW2 out.
10/13 | Fall Recess | no class |
Lec 05 (10/20) | Web Crawl (CY) | Chapter 3 | HW2 due.
Lec 06 (10/27) | Query Mining (FD) | Chapter 6 | Midterm out. [1]
Lec 07 (11/03) | Personalization (FD) | |
Lec 08 (11/10) | Big Data (CY) | | Midterm due; HW3 out.
Lec 09 (11/17) | Internet Advertising (by Nitish Korula) | |
Lec 10 (11/24) | Realtime (FD) | | HW3 due.
Lec 11 (12/01) | Knowledge (CY) | |
Lec 12 (12/08) | Rigors and Ethics (FD) | |
12/10 | Final Exam (CY+FD) | |
12/15-17 | Project Demo Days, WWH 805, 5pm to 7pm | | (via NYU Classes) Project Report due at 12/15 9am. Project Code due at 12/18 9am.
[1] Each group is encouraged to send us a short project proposal via email for a quick check on whether the work is on the right track.

Course Projects

A major component of the course is a group project. In the first part of the project, each group will design and implement a mini search engine through a series of homeworks. In the second part, the group will build an advanced component on top of that engine.

Group ID | Group Members
G01 | wl1002, fg742, jz1371
G02 | yy1112, zx339, ks3226
G03 | xh499, sw2507, cz764
G04 | ak4533, sm5119, aut204
G05 | sl3760, dz720, kc2180
G06 | yl1949, ws951, sy1288
G07 | ww738, hk1642, hy821
G08 | arpit.jain, ss7359, maw627
G09 | kbp247, rap450, aa3793
G10 | rsj259, sh3309, ans486
G11 | bmd296, chc490, yg706
G12 | cp1425, tsp261, sx238
G13 | kh1715, sz1288, ycy247
G14 | tt1161, aly233, ff648, adi225
G15 | jd3011, jd3007, jl6589
G16 | ytl264, saa567, hx364

Project Demo Slot Assignments:
Time (pm ET) | December 15 | December 16 | December 17
5:00 | G08 | G12 | G03
5:15 | G09 | | G05
5:30 | | | G06
5:45 | | G11 | G14
6:00 | G04 | G16 | last resort slot
6:15 | | | last resort slot
6:30 | | | last resort slot
6:45 | | | last resort slot