Project 3: Experiment

Assigned: Oct. 17
Due: Nov. 7

In this assignment, you will carry out and report on an experiment (or set of experiments). You have a great deal of freedom in choosing the experiment and designing it. The only constraints are as follows:

Otherwise, you can do whatever you want:

Use of other persons

You should not systematically use other people in your experiment. The reason is that this falls under the category of "Use of Human Beings as Experimental Subjects", and your plan would have to be approved by the corresponding University committee. And though there is no question about getting approval (unless you are planning to test how judgments of relevance are affected by recreational drugs or torture), all the same this is a process that takes more time than is probably available.

If you very much want to do an experiment involving human subjects, see me.

Deliverables

By Oct. 31, you should email to me (davise@cs.nyu.edu) a write-up of your proposed experiment, with a detailed statement of the hypothesis being tested or quantity being measured, the evaluation measure to be used, and the experimental technique. If your experiment is a repetition of some previous experiment, you should send the citation.

By Nov. 7, you should email to me (NOT the TA) a package including

Sample Outlines for Experiments

Note: These are just samples. You are not required to do one of these; in fact, it's better if you invent your own. And they are just outlines. Filling in the details is your task, so don't ask me for details.

1. Compare the quality of two search engines over a sheaf of queries.

2. Estimate the recall of several search engines over a simple query.

3. Study the effect on quality of broadening a query (a) by stemming; (b) by including synonyms or other related words.

4. Estimate the fraction of dead links on random Web pages. (Of course, there's no way to get a genuinely random sample of Web pages, but you can approximate it by doing a query with an immense results page, and sampling a page pretty far down in the list.)

5. Compare the ranking of answers to a query generated by a search engine to (a) the ranking given by the vector sum; (b) the ranking given purely by in-link count; (c) the ranking given by the optimal linear sum of (a) and (b). (In computing the vector sum, you can ignore the IDF of the words, unless you can find some resource that gives plausible values for that information.)

6. For some search engine that carries out search for images by keyword, try to figure out how the search engine is associating words with images.

7. Issue a set of queries, repeat the same set a week later, and determine how much the results change; characterize the changes. (If you want to do this over a period greater than a week, I will give an extension on the due date.)

8. Apply the PageRank algorithm to some large set of tightly interconnected pages. To what extent does the computed PageRank agree with an intuitive judgment of the significance of the page?

9. Compare the quality of results given by a specialized search engine for queries in its domain to results given by a general search engine. Try to identify the reasons for the difference.

10. Evaluate the success of the CiteSeer system at identifying citations with papers and with other citations, and at matching internal references to bibliographic entries. Characterize the errors.
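For outline 4, the point estimate of the dead-link fraction is just dead links divided by links checked, but a report should also say how uncertain that estimate is. The following is a minimal sketch, assuming you have already counted the links yourself; the function name `wilson_interval` and its parameters are illustrative, and the Wilson score interval is one reasonable choice of confidence interval, not a required one. It also assumes the sampled links are roughly independent, which links drawn from the same page may not be.

```python
import math


def wilson_interval(dead, sampled, z=1.96):
    """Approximate confidence interval for the true dead-link fraction.

    dead    -- number of sampled links that turned out to be dead
    sampled -- total number of links checked
    z       -- normal quantile; 1.96 corresponds to roughly 95% confidence
    """
    if sampled <= 0 or not 0 <= dead <= sampled:
        raise ValueError("need 0 <= dead <= sampled and sampled > 0")
    p = dead / sampled  # observed fraction of dead links
    denom = 1.0 + z * z / sampled
    center = (p + z * z / (2.0 * sampled)) / denom
    half = z * math.sqrt(p * (1.0 - p) / sampled
                         + z * z / (4.0 * sampled * sampled)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    low, high = wilson_interval(dead=18, sampled=200)
    print(f"observed {18 / 200:.3f}, interval ({low:.3f}, {high:.3f})")
```

Unlike the simpler normal approximation, this interval stays inside [0, 1] even when very few (or no) dead links are observed in a small sample.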
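For outlines 5 and 7, you need some way to quantify how much two ranked result lists differ. The sketch below shows two simple measures: the Jaccard overlap of the result sets, and Kendall's tau over the results that appear in both lists. The function names and the sample URLs are hypothetical, the code assumes each list contains no duplicate entries, and these two measures are only one possible choice of evaluation measure.

```python
from itertools import combinations


def jaccard(a, b):
    """Fraction of distinct results that appear in both lists (order ignored)."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 1.0


def kendall_tau(a, b):
    """Kendall's tau over the items common to both ranked lists.

    Returns a value in [-1, 1], or None if fewer than two items are shared.
    Assumes neither list contains duplicates.
    """
    pos_b = {item: i for i, item in enumerate(b)}
    shared = [item for item in a if item in pos_b]  # kept in a's order
    if len(shared) < 2:
        return None
    concordant = discordant = 0
    for x, y in combinations(shared, 2):
        # x is ranked above y in list a; check whether b agrees.
        if pos_b[x] < pos_b[y]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


if __name__ == "__main__":
    # Hypothetical result lists for the same query, one week apart.
    week1 = ["u1", "u2", "u3", "u4"]
    week2 = ["u2", "u1", "u3", "u5"]
    print(jaccard(week1, week2), kendall_tau(week1, week2))
```

The overlap measure captures pages entering or leaving the results, while tau captures reordering among the pages that stayed; reporting both separates the two kinds of change.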
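For outline 8, once you have collected the link structure of your page set, PageRank can be computed by simple power iteration. The following is a minimal sketch under stated assumptions: the graph is given as a dictionary from each page to the list of pages it links to, the damping factor is 0.85, and rank held by pages with no out-links is spread evenly over all pages. The function name, the parameter defaults, and the toy graph are illustrative, not part of the assignment.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Compute PageRank by power iteration.

    links -- dict mapping each page to a list of the pages it links to
    Returns a dict mapping each page to its rank; the ranks sum to 1.
    """
    # Include pages that are linked to but have no entry of their own.
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Rank sitting on pages without out-links is shared among all pages.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                new[q] += damping * rank[p] / len(outs)
        change = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if change < tol:
            break
    return rank


if __name__ == "__main__":
    # Toy graph, for illustration only.
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank(graph).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 4))
```

Sorting the pages by computed rank gives a list you can set beside your own intuitive ordering of the same pages, for example using a rank-correlation measure.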