## Problem Set 7

Assigned: November 7
Due: November 14

### Problem 1

The famous first two sentences of *Pride and Prejudice* are:

> It is a truth universally acknowledged that a single man in possession of a good fortune must be in want of a wife.
>
> However little known the feelings or views of such a man may be on his first entering a neighbourhood, this truth is so well fixed in the minds of the surrounding families, that he is considered as the rightful property of some one or other of their daughters.

Describe two examples of lexical ambiguity, two examples of syntactic ambiguity, and one example of semantic ambiguity in the first sentence. (As regards the semantic ambiguity: Of course the whole sentence is ironic in tone, but that's not the ambiguity I'm looking for.) Describe an example of anaphoric ambiguity in the second sentence.

### Problem 2

The second sentence above contains the following lexical ambiguities, among others: "Views" means "opinions" but could mean "directions for looking at" (e.g., "Our hotel room had a great view of Niagara Falls."). "Fixed" means "believed" but could mean "repaired".
• A. Explain how the wrong interpretation of "fixed" can be excluded using selectional restrictions.
• B. Argue that the wrong interpretation of "views" cannot be excluded using selectional restrictions. Sketch how it could be excluded using world knowledge.
• C (extra credit): "Known" is semantically ambiguous: known to whom? Discuss briefly the issues involved in disambiguating this.
• D (extra credit): "Surrounding" here means "living nearby". Discuss briefly the issues involved in finding the correct interpretation.

### Problem 3

Suppose that we are doing text tagging using the trigram model. To make life simple, we'll consider the case where there are only two tags: N (noun) and O (other). We have a corpus of 1,000 sentences of the following forms:
• 500 instances of N O N.
• 300 instances of N O O N.
• 200 instances of O N O N.

Moreover, the corpus contains:
• The word "fish" 100 times: 70 as N and 30 as O.
• The word "can" 200 times: 20 as N and 180 as O.
• The word "swim" 50 times: 5 as N and 45 as O.

Assume, finally, that we are using the following smoothing function for transitions:
$$\mathrm{Prob}(T_i \mid T_{i-2}, T_{i-1}) = 0.6\,\mathrm{Freq}_C(T_i \mid T_{i-2}, T_{i-1}) + 0.3\,\mathrm{Freq}_C(T_i \mid T_{i-1}) + 0.1\,\mathrm{Freq}_C(T_i).$$

Compare the probabilities of the taggings "N O O" and "O O O" for the sentence "Fish can swim".
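
The smoothing function in Problem 3 can be sketched in code as follows. This is a minimal illustration, not a full solution: it covers only the transition term for a tag whose two predecessors are known. It assumes that n-grams are counted inside each sentence with no start or end padding (the problem does not say how sentence boundaries are handled), and that an unseen context contributes a frequency of 0. The names `CORPUS`, `ngram_counts`, `rel_freq`, and `smoothed` are hypothetical helpers introduced for this sketch.

```python
from collections import Counter

# Tag sequences and how often each occurs, taken from the problem statement.
CORPUS = {
    ("N", "O", "N"): 500,
    ("N", "O", "O", "N"): 300,
    ("O", "N", "O", "N"): 200,
}

def ngram_counts(n):
    """Count tag n-grams within sentences (assumption: no boundary padding)."""
    counts = Counter()
    for tags, freq in CORPUS.items():
        for i in range(len(tags) - n + 1):
            counts[tags[i:i + n]] += freq
    return counts

UNI, BI, TRI = ngram_counts(1), ngram_counts(2), ngram_counts(3)

def rel_freq(counts, context, tag):
    """Relative frequency of `tag` following `context`; 0.0 for an unseen context."""
    total = sum(c for gram, c in counts.items() if gram[:-1] == context)
    return counts[context + (tag,)] / total if total else 0.0

def smoothed(tag, prev2, prev1):
    """Smoothed transition probability with the weights 0.6, 0.3, 0.1."""
    return (0.6 * rel_freq(TRI, (prev2, prev1), tag)
            + 0.3 * rel_freq(BI, (prev1,), tag)
            + 0.1 * rel_freq(UNI, (), tag))

if __name__ == "__main__":
    # Third-tag transitions for the two candidate taggings.
    print(round(smoothed("O", "N", "O"), 3))  # tagging N O O
    print(round(smoothed("O", "O", "O"), 3))  # tagging O O O
```

Under these assumptions the two printed values are about 0.283 and 0.103. Comparing the full taggings also requires the word-given-tag (or tag-given-word) terms and a convention for the first two positions, which this sketch deliberately leaves out.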