## Solution Set 7

### Problem 1

The famous first two sentences of *Pride and Prejudice* are:

It is a truth universally acknowledged that a single man in possession of a good fortune must be in want of a wife.

However little known the feelings or views of such a man may be on his first entering a neighbourhood, this truth is so well fixed in the minds of the surrounding families, that he is considered as the rightful property of some one or other of their daughters.

Describe two examples of lexical ambiguity, two examples of syntactic ambiguity, and one example of semantic ambiguity in the first sentence. (As regards the semantic ambiguity: Of course the whole sentence is ironic in tone, but that's not the ambiguity I'm looking for.) Describe an example of anaphoric ambiguity in the second sentence.

Lexical:
"single" = unmarried / one.
"good" = virtuous / large.
"fortune" = luck / wealth.

Syntactic:
"on his first entering a neighbourhood" can attach to "be" (indicating the time) or to "feelings or views" (as in "views on the election").
"of a good fortune" could attach to "man" or "possession".
"of a wife" could attach to "want" or "be".
(Actually, the last two are easily resolved syntactically, because PPs with "of" almost always attach to the preceding word, except in series of conjoined phrases with "of" such as " 'The time has come,' the Walrus said, 'to talk of many things, of shoes and ships and sealing wax, of cabbages and kings.' ")

Semantic:
"a single man": Some specific single man, or all single men?
"a wife" generally means "a woman who is currently married", but here "in want of a wife" means "wishing (or needing) to make some woman his wife."

Anaphoric:
"their" in "their daughters" could refer to "families", "minds", "views", or "feelings".
"this truth" can refer either to the fact that "A single man of good fortune must be in want of a wife," or to the fact that "It is universally acknowledged that a.s.m.o.g.f.m.b.i.w.o.a.w."

### Problem 2

The second sentence above contains the following lexical ambiguities, among others: "Views" means opinions but could mean "directions for looking at" (e.g. "Our hotel room had a great view of Niagara Falls."). "Fixed" means "believed" but could mean "repaired".
• A. Explain how the wrong interpretation of "fixed" can be excluded using selectional restrictions.
Answer: "Repaired" can only be applied to concrete artifacts, not to abstractions like "truth".
• B. Argue that the wrong interpretation of "views" cannot be excluded using selectional restrictions. Sketch how it could be excluded using world knowledge.

Answer: The wrong meaning of "view" is compatible with "of a man" (compare "I got a good view of Senator Kerry at the rally"), and it is compatible with "known" (compare "The view of the Manhattan skyline from Queens is well known"). To find the correct disambiguation strategy, note that, even if we disregard everything else in the sentence, just the sentence "The views of this man are well known" requires the interpretation "opinion". The point is that "view" as in "view of Niagara Falls" means something like "appearance as viewed from a particular standpoint", and though this is possible for a moving object like a person at a single time, it is pretty much impossible for such an appearance to be known in a general way. The point is quite subtle, though, and I apologize for giving such a tough problem.

### Problem 3

Suppose that we are doing text tagging using the trigram model. To make life simple, we'll consider the case where there are only two tags: N (noun) and O (other). We have a corpus of 1,000 sentences of the following forms:
500 instances of N O N.
300 instances of N O O N.
200 instances of O N O N.

Moreover, the corpus contains:
the word "fish" 100 times: 70 as N and 30 as O.
the word "can" 200 times: 20 as N and 180 as O.
the word "swim" 50 times: 5 as N and 45 as O.

Assume, finally, that we are using the following smoothing function for transitions:
$$\mathrm{Prob}(T_i \mid T_{i-2}, T_{i-1}) = 0.6\,\mathrm{FreqC}(T_i \mid T_{i-2}, T_{i-1}) + 0.3\,\mathrm{FreqC}(T_i \mid T_{i-1}) + 0.1\,\mathrm{FreqC}(T_i).$$

Compare the probabilities of taggings "N O O" and "O O O" for the sentence "Fish can swim".
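
The comparison can be checked mechanically. The following is a minimal sketch, not the official solution, and it rests on several assumptions that the problem statement does not fix: each sentence is padded with two copies of a hypothetical start pseudo-tag `<s>`, so that the first two tags have a trigram context; the end of the sentence is not modelled; FreqC is taken as a relative frequency over the contexts observed in the corpus, with 0 for an unseen context; and the emission probability of a word given a tag is the count of the word with that tag divided by the total count of the tag. All names in the code (`START`, `transition`, `emission`, `tagging_probability`) are illustrative.

```python
from collections import Counter

START = "<s>"  # hypothetical start-of-sentence pseudo-tag (an assumption)

# Tag sequences in the corpus with their counts (from the problem statement).
CORPUS = [(("N", "O", "N"), 500),
          (("N", "O", "O", "N"), 300),
          (("O", "N", "O", "N"), 200)]

# (word, tag) -> count (from the problem statement).
WORD_TAG = {("fish", "N"): 70, ("fish", "O"): 30,
            ("can", "N"): 20, ("can", "O"): 180,
            ("swim", "N"): 5, ("swim", "O"): 45}

uni, bi, tri = Counter(), Counter(), Counter()
ctx1, ctx2 = Counter(), Counter()  # how often each context is followed by a tag
for tags, n in CORPUS:
    padded = (START, START) + tags
    for i in range(2, len(padded)):
        a, b, c = padded[i - 2], padded[i - 1], padded[i]
        uni[c] += n
        bi[(b, c)] += n
        tri[(a, b, c)] += n
        ctx1[b] += n
        ctx2[(a, b)] += n
total = sum(uni.values())


def ratio(num, den):
    """Relative frequency, taken to be 0 when the context was never seen."""
    return num / den if den else 0.0


def transition(a, b, c):
    """Smoothed Prob(c | a, b) with the weights 0.6 / 0.3 / 0.1 given above."""
    return (0.6 * ratio(tri[(a, b, c)], ctx2[(a, b)])
            + 0.3 * ratio(bi[(b, c)], ctx1[b])
            + 0.1 * ratio(uni[c], total))


def emission(word, tag):
    """Assumed emission model: count(word tagged as tag) / count(tag)."""
    return WORD_TAG.get((word, tag), 0) / uni[tag]


def tagging_probability(words, tags):
    """Product of smoothed transitions and emissions over the sentence."""
    padded = (START, START) + tuple(tags)
    p = 1.0
    for i, (word, tag) in enumerate(zip(words, tags)):
        p *= transition(padded[i], padded[i + 1], tag) * emission(word, tag)
    return p


words = ["fish", "can", "swim"]
p_noo = tagging_probability(words, ["N", "O", "O"])
p_ooo = tagging_probability(words, ["O", "O", "O"])
print(f"N O O: {p_noo:.3e}")
print(f"O O O: {p_ooo:.3e}")
```

Under these assumptions the corpus contains 2,000 N tags and 1,500 O tags, and "N O O" comes out at roughly 2.6e-5 against roughly 1.7e-7 for "O O O". The two taggings share the emissions for "can" and "swim", so the gap comes from the first word and from the transitions: "fish" is more likely as N, most sentences start with N, and the trigram O O O never occurs in the corpus, so its transition falls back entirely on the bigram and unigram terms. A different treatment of sentence boundaries would change the exact numbers.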