G22.2590 - Natural Language Processing -- Spring 2003 -- Prof. Grishman

Assignment #7

Assigned March 6, 2003  -- Corrected March 24, 2003

1.  With the grammar

s  =  np vp
vp =  v np pp
vp =  v np
np =  n
np =  n pp
pp =  p np
the sentence “Students eat pizza with relish.” gets two parses.  Suppose we were given a training corpus of 3 sentences, with their parses:
(s (np (n Men) (pp (p of) (np (n distinction)))) (vp (v like) (np (n broccoli))))
(s (np (n Men)) (vp (v like) (np (n ham) (pp (p with) (np (n eggs))))))
(s (np (n Men)) (vp (v eat) (np (n eggs)) (pp (p with) (np (n gusto)))))
Suppose we used these three parses to train a probabilistic CFG.  What probability would be assigned to each production?  What probability would be assigned to each of the two parses for “Students eat pizza with relish.”?  In your calculation, consider only the probabilities of the productions;  we are not concerned with the probabilities of generating specific lexical items.
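
A minimal sketch of the mechanics, assuming the usual relative-frequency (maximum likelihood) estimate, P(A -> rhs) = count(A -> rhs) / count(A), with the probability of a parse taken as the product of the probabilities of its productions.  The two-sentence treebank below is a made-up toy, not the training corpus above, and the function names (read_tree, productions, estimate, tree_probability) are hypothetical.  As in the question, lexical nodes such as (n Dogs) are ignored.

```python
from collections import Counter
from fractions import Fraction


def read_tree(text):
    """Parse a bracketed tree into nested lists: [label, child, ...]."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # skip "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:                     # a word (lexical item)
                children.append(tokens[pos])
                pos += 1
        pos += 1                      # skip ")"
        return [label] + children

    return node()


def productions(tree):
    """Yield (lhs, rhs) for every node whose children are all subtrees."""
    label, children = tree[0], tree[1:]
    if all(isinstance(child, list) for child in children):
        yield label, tuple(child[0] for child in children)
        for child in children:
            yield from productions(child)
    # Nodes that dominate a word, e.g. (n Dogs), are skipped on purpose.


def estimate(trees):
    """Relative-frequency estimate: count(lhs -> rhs) / count(lhs)."""
    rule_counts = Counter(rule for tree in trees for rule in productions(tree))
    lhs_counts = Counter()
    for (lhs, _), count in rule_counts.items():
        lhs_counts[lhs] += count
    return {rule: Fraction(count, lhs_counts[rule[0]])
            for rule, count in rule_counts.items()}


def tree_probability(tree, probs):
    """Product of the production probabilities used in the tree."""
    result = Fraction(1)
    for rule in productions(tree):
        result *= probs[rule]
    return result


# Toy treebank, invented for illustration only.
toy_trees = [
    read_tree("(s (np (n Dogs)) (vp (v bark)))"),
    read_tree("(s (np (n Dogs)) (vp (v chase) (np (n cats))))"),
]
probs = estimate(toy_trees)
for (lhs, rhs), p in sorted(probs.items()):
    print(lhs, "=", " ".join(rhs), p)
print(tree_probability(toy_trees[1], probs))   # 1/2
```

Exact fractions are used instead of floats so that the results can be checked by hand.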

2.  Try your ChunkPatterns from Assignment #6 on the Washington Square News article (see the instructions on the Jet resources page).  Report your score with and without the pattern enhancements required for Assignment #6.

We will give a small amount of extra credit for additional, linguistically motivated enhancements that further improve the noun group score.

Due March 27th.