Project 3: Experiment
Assigned: Oct. 17
Due: Nov. 7
In this assignment, you will carry out and report on an experiment (or
set of experiments).
You have a great deal of freedom in choosing the experiment and
designing it. The only constraints are as follows:
- The experiment must attempt either to test a specified hypothesis or
to measure a specified quantity. The result must not be known in
advance.
- The structure of the experiment, the statement of the hypothesis
to be tested, and the evaluation metric(s) to be used must be written up
before the experiment is carried out.
The experiment carried out should follow the stated plan exactly.
- The experiment should be, as far as possible, repeatable. Your
write-up should include enough detail that a second person could do
the same experiment on the same data. If you are using subjective
measures such as "relevance", you should describe as precisely
as possible what you mean by it, and give some examples. (Examples of
borderline cases would also be helpful.)
- The experiment must depend on the contingent structure of the Web;
it must be such that the results would be different if, for example, the
Web consisted of 50 documents in Gaelic about the topography of County
Clare. An example of an experiment that would NOT be acceptable on this
basis would be to measure the running time of the PageRank algorithm as
tested over random graphs.
- It must be a substantive piece of work. Conception, design, execution,
and write-up should take at least 10 hours or so. If you use one of my
suggestions for conception or use a design that you find in the literature,
since you will have saved time on conception and design, I will expect more
time to be spent on execution and write-up.
Otherwise, you can do whatever you want:
- The experiment can be automated or not to whatever extent you want.
You can use any existing electronic resources (software, data, etc.)
you can find.
- You may repeat an experiment that you find described in the literature.
- You can do quantitative or qualitative analysis.
- If you know how to do it, you should certainly do a statistical
analysis of the data.
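As one illustration of what a simple statistical analysis might look like: if your experiment measures a proportion (say, the fraction of sampled links that are dead), you could report a confidence interval along with the raw fraction. The sketch below computes a Wilson score interval using only the Python standard library. The function name wilson_interval and the counts in the example are made up for illustration; this is one reasonable choice among several, not a required method.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly 95% confidence.
    Returns a (low, high) pair clamped to [0, 1].
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Made-up counts for illustration only: 18 dead links among 120 sampled.
low, high = wilson_interval(18, 120)
print(f"observed fraction: {18 / 120:.3f}")
print(f"approx. 95% interval: [{low:.3f}, {high:.3f}]")
```

The interval is only as good as the sample: if the pages were not drawn approximately at random, the interval describes the sampling procedure rather than the Web.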
Use of other persons
You should not systematically use other people in your experiment.
The reason is that this falls under the category of "Use of Human
Beings as Experimental Subjects", and your plan would have to be approved
by the corresponding University committee. And though there is no
question about getting approval (unless you are planning to test how
judgments of relevance are affected by recreational drugs or torture),
all the same this is a process that takes more time than it is probably worth.
If you very much want to do an experiment involving human subjects, see me.
By Oct. 31, you should email me (firstname.lastname@example.org)
a write-up of your proposed experiment,
with a detailed statement of the hypothesis being tested or quantity
being measured, the evaluation measure to be used, and the experimental
technique. If your experiment is a repetition of some previous
experiment, you should send the citation.
By Nov. 7, you should email me (NOT the TA) a package including:
- A detailed write-up of the experiment.
- Links for any datasets or preexisting software that you used.
- Source code for any software that you wrote.
- Data generated by the experiment.
- Bibliography (if any).
Sample Outlines for Experiments
Note: These are just samples. You are not required to do one of these;
in fact, it's better if you invent your own. And they are just outlines.
Filling in the details is your task, so don't ask me for details.
1. Compare the quality of two search engines over a sheaf of queries.
2. Estimate the recall of several search engines over a simple query.
3. Study the effect on quality of broadening a query (a) by stemming;
(b) by including synonyms or other related words.
4. Estimate the fraction of dead links on random Web pages. (Of course,
there's no way to get a genuinely random sample of Web pages, but you can
approximate it by doing a query with an immense results page, and sampling
a page pretty far down in the list.)
5. Compare the ranking of answers to a query generated by a search engine
to (a) the ranking given by the vector sum; (b) the ranking given purely
by in-link count; (c) the ranking given by the optimal linear sum of (a)
and (b). (In computing the vector sum, you can ignore the IDF of
the words, unless you can find some resource that gives plausible values
for that information.)
6. For some search engine that carries out search for images by keyword,
try to figure out how the search engine is associating words with images.
7. Issue a set of queries, then repeat the same set a week later.
Determine how much the results change, and characterize the changes.
(If you want to do this over a period greater than a week, I will
give an extension on the due date.)
8. Apply the PageRank algorithm to some large set of tightly interconnected
pages. How well does the computed PageRank agree with an intuitive
judgment of the significance of each page?
9. Compare the quality of results given by a specialized search engine for
queries in its domain to results given by a general search engine. Try to
identify the reasons for the difference.
10. Evaluate the success of the CiteSeer system at matching citations to
papers and to other citations, and at matching internal references to
bibliographic entries. Characterize the errors.
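Several of these outlines (5 and 7, for instance) come down to comparing two ranked lists of results. A minimal sketch of two common measures follows: the Jaccard overlap of the top k results, and Kendall's tau over the results that appear in both lists. It assumes each result list is a Python list of URL strings with no duplicates; the function names and the example URLs are hypothetical, and you may well decide that a different measure suits your hypothesis better.

```python
from itertools import combinations


def jaccard_at_k(first, second, k):
    """Jaccard overlap of the top-k items of two ranked lists (0.0 to 1.0)."""
    a, b = set(first[:k]), set(second[:k])
    if not a and not b:
        return 1.0  # two empty lists are treated as identical
    return len(a & b) / len(a | b)


def kendall_tau(first, second):
    """Kendall's tau over the items that appear in both ranked lists.

    Returns a value in [-1.0, 1.0], or None if fewer than two items
    are shared. Assumes neither list contains duplicates.
    """
    pos_second = {item: i for i, item in enumerate(second)}
    # Shared items, kept in the order in which they appear in `first`.
    shared = [item for item in first if item in pos_second]
    if len(shared) < 2:
        return None
    concordant = discordant = 0
    for u, v in combinations(shared, 2):
        # u precedes v in `first`; check whether `second` agrees.
        if pos_second[u] < pos_second[v]:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical result lists for the same query, one week apart.
week1 = ["example.org/a", "example.org/b", "example.org/c", "example.org/d"]
week2 = ["example.org/b", "example.org/a", "example.org/c", "example.org/e"]

print("Jaccard overlap at k=4:", jaccard_at_k(week1, week2, 4))
print("Kendall tau on shared results:", kendall_tau(week1, week2))
```

Overlap captures which results appear or disappear, while tau captures reordering among the results that persist; reporting both keeps the two kinds of change separate.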