G22.2590 - Natural Language Processing -- Spring 2003 -- Prof. Grishman
February 7, 2005
You may want to use the Jet parser for the first three exercises,
particularly for #2. When doing these exercises, keep in mind that many
words have several parts of speech.
- Using the tiny grammar below, draw the two parse trees for the sentence
“The fair features live music.” (If you do this with Jet, note that this
grammar is slightly different from that provided as grammar1.) Suggest a
constraint which would resolve this ambiguity.
sentence := np v | np v np;
np := n | art n | adj n | art adj n;
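To make concrete what "two parse trees" means here, the following is a minimal Python sketch that enumerates parses under the tiny grammar by brute force. It is not how Jet works. The lexicon is an assumption made for illustration (the assignment does not supply one), and the last np alternative is assumed to be "art adj n".

```python
# Tiny grammar from the exercise. The sentence rules (np v | np v np) are
# encoded directly in sentence_parses; the last np alternative is assumed
# to be "art adj n".
NP_RULES = [("n",), ("art", "n"), ("adj", "n"), ("art", "adj", "n")]

# Hypothetical lexicon: the parts of speech assumed for each word.
LEXICON = {
    "the": {"art"},
    "fair": {"n", "adj"},
    "features": {"n", "v"},
    "live": {"v", "adj"},
    "music": {"n"},
}

def np_parses(words):
    """Return every np rule whose tags fit `words` one-to-one."""
    return [
        rule for rule in NP_RULES
        if len(rule) == len(words)
        and all(tag in LEXICON[w] for tag, w in zip(rule, words))
    ]

def sentence_parses(words):
    """Enumerate parses of `words` by trying each word as the verb."""
    parses = []
    for i, word in enumerate(words):
        if i == 0 or "v" not in LEXICON[word]:
            continue
        subject, obj = words[:i], words[i + 1:]
        for subj_tags in np_parses(subject):
            subj = ("np", tuple(zip(subj_tags, subject)))
            if not obj:
                # sentence := np v
                parses.append(("sentence", subj, ("v", word)))
            for obj_tags in np_parses(obj):
                # sentence := np v np
                parses.append(("sentence", subj, ("v", word),
                               ("np", tuple(zip(obj_tags, obj)))))
    return parses

parses = sentence_parses("the fair features live music".split())
for p in parses:
    print(p)
```

Under the assumed lexicon the ambiguity comes from "features" and "live" each being usable as the verb. A different lexicon could yield a different number of parses.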
- Using the same tiny grammar, compare the efficiency of the top-down
backtracking parser, a bottom-up (immediate constituent) parser, and a
top-down chart parser on the sentence “The answers questions.” For each of
the two grammar symbols, sentence and np:
  - compare the backtracking and chart parsers with respect to the number
    of times the symbol is expanded (indicated by the message "Seeking ..."
    on both parsers);
  - compare all three parsers with respect to the number of times a
    complete constituent using that symbol is built.
(4 points: 1 point each)
- Modify this grammar to capture subject-verb number agreement. Does it
now produce one parse?
- J&M exercise 8.1 (practice tagging some text).
  - The names of some of the parts of speech in the Penn Treebank were
    later modified to avoid conflicts with phrase categories. For example,
    the tag for personal pronoun was changed from PP to PRP to avoid
    conflict with the symbol for "Prepositional Phrase". Unfortunately,
    Table 8.6 in the book uses the old set, while the exercise uses the
    new set. You can find the new set in the updated version of chapter 8
    or in the Jet documentation.
  - In the Penn set, the word "to" is always tagged "TO", whether it is
    an infinitival marker or a preposition.
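If you are working from the older table, a small helper for rewriting old tag names may be handy. The sketch below is illustrative only: the function name and the example sentence are made up here, and the mapping covers just the pronoun tags (PP to PRP as noted above, plus the analogous PP$ to PRP$ for possessive pronouns), not every difference between the two tag sets.

```python
# Old Penn Treebank tag names mapped to their later replacements.
# Only the pronoun tags are listed; this is not a complete mapping.
OLD_TO_NEW = {"PP": "PRP", "PP$": "PRP$"}

def modernize(tagged):
    """Rewrite word/TAG tokens that use an old tag name (hypothetical helper)."""
    out = []
    for token in tagged.split():
        # Split on the last slash so that words containing "/" survive.
        word, _, tag = token.rpartition("/")
        out.append(f"{word}/{OLD_TO_NEW.get(tag, tag)}")
    return " ".join(out)

# Made-up sentence, not taken from the exercise. Note that "to" carries
# the tag TO both as infinitival marker and as preposition.
example = "I/PP want/VBP to/TO go/VB to/TO Boston/NNP"
print(modernize(example))
```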
Due February 28th.