Web Search Engines

G22.2580
Monday 5:00-7:00
Room 101, Warren Weaver Hall
Professor Ernest Davis

Reaching Me

Prerequisites: None.

Textbook: Mining the Web: Discovering Knowledge from Hypertext Data by Soumen Chakrabarti

Other Useful Books: These will be on reserve at the CIMS Library.
Pierre Baldi et al., Modeling the Internet and the Web
Amy Langville and Carl Meyer, Google's PageRank and Beyond
George Chang, Mining the World Wide Web
Bing Liu, Web Data Mining

Useful web site: Search Engine Watch

Course topics:

We will discuss the design of a Web search engine and the extraction of information from the Web. The lecture list below gives the topics covered.

Lecture notes

Lecture 1: Architecture; Indexing Sept. 10
Lecture 2: Measuring Relevance Sept. 17
Lecture 3: PageRank and link-based measures of importance Sept. 24
Lecture 4: Similarity Searches; Evaluation Oct. 1
Lecture 5: Clustering Oct. 15
Lecture 6: Collaborative filtering Oct. 22
Lecture 7: Classification Oct. 29
Lecture 8: Invisible Web, Specialized Search Engines, and Metasearch Nov. 5
Lecture 9: Web Structure and Evolution Nov. 12
Lecture 10: Archiving / Usage Mining Nov. 19
Lecture 11: Images Nov. 21
Lecture 12: "Unsupervised" Content Mining Nov. 26
Lecture 13: The Multi-Lingual Web Dec. 3
Lecture 14: And Now For Something Completely Different Dec. 10

Requirements

A course project (60%)
Final exam (40%)

Class email list

Link to the class email web page and follow the instructions there for subscribing.

Teaching Assistant

The teaching assistant is Koray Kavukcuoglu (x8-3489; Room 1215, 719 Broadway; email: firstName at cs.nyu.edu).

Final Exam

Format of the Final Exam

Previous Web Page

The field has changed substantially in the last three years, so this fall's course will be quite different from the 2004 course, but the previous web page will give you an idea of the kind of material we will cover.