G22.2590 - Natural Language Processing -- Spring 2008 -- Prof. Grishman

Assignment #5

February 21st, 2008

[4 points]  Consider the J&M grammar shown in Figure 9.3 on page 330, and reproduced here in a Jet-like form (but with alternatives written as separate productions):

sentence := np vp;
np       := pronoun;
np       := propernoun;
np       := det nominal;
nominal  := noun nominal;
nominal  := noun;
vp       := verb;
vp       := verb np;
vp       := verb np pp;
vp       := verb pp;
pp       := preposition np;

Using feature constraints as presented in section 11.3, extend this grammar to handle
As was the case for assignment #2, you do not need to take special account of the verb "be"; it is enough to assume that a verb has two present tense forms, singular and plural.  Provide a sample dictionary containing at least the pronouns "he", "him", and "they", a singular and plural noun, a singular and plural verb, a singular and plural determiner, and a preposition.

The grammar should consist of lines with productions, each followed by zero or more lines with feature constraints, such as

sentence := np vp;
  <np agreement> = <vp agreement>

np := det noun;
  <np agreement> = <noun agreement>
  <det agreement> = <noun agreement>

vp := verb;
  <vp agreement> = <verb agreement>

and a dictionary which specifies the features of individual lexical items, such as:

this,,   cat=det,  agreement = [number = sg];
these,,  cat=det,  agreement = [number = pl];
cat,,    cat=noun, agreement = [number = sg];
cats,,   cat=noun, agreement = [number = pl];
sleeps,, cat=verb, agreement = [number = sg];
sleep,,  cat=verb, agreement = [number = pl];
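
To make the effect of these constraints concrete, here is a minimal Python sketch of how unification over agreement features accepts or rejects a word sequence. It is not part of Jet or of the featureGrammar extension: the names unify, LEXICON, and agree are hypothetical, feature structures are modelled as plain nested dicts, and shared (reentrant) structure is not handled. The agree function simply unifies the agreement features of a determiner, noun, and verb, which is what the three sample productions above jointly require.

```python
# Illustrative sketch only; not the Jet implementation.
# Feature structures are nested dicts; atomic values are strings.

def unify(a, b):
    """Return the unification of two feature structures, or None on a clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            if key in result:
                merged = unify(result[key], value)
                if merged is None:
                    return None          # conflicting values under the same feature
                result[key] = merged
            else:
                result[key] = value      # feature present on one side only
        return result
    return a if a == b else None         # atomic values must match exactly


# The sample dictionary from the assignment, transcribed by hand.
LEXICON = {
    "this":   {"cat": "det",  "agreement": {"number": "sg"}},
    "these":  {"cat": "det",  "agreement": {"number": "pl"}},
    "cat":    {"cat": "noun", "agreement": {"number": "sg"}},
    "cats":   {"cat": "noun", "agreement": {"number": "pl"}},
    "sleeps": {"cat": "verb", "agreement": {"number": "sg"}},
    "sleep":  {"cat": "verb", "agreement": {"number": "pl"}},
}


def agree(*words):
    """Unify the agreement features of all the words; None signals a violation."""
    result = {}
    for word in words:
        result = unify(result, LEXICON[word]["agreement"])
        if result is None:
            return None
    return result


print(agree("this", "cat", "sleeps"))    # agreement succeeds
print(agree("these", "cat", "sleeps"))   # determiner and noun clash
```

Under these assumptions the first call yields a singular agreement structure and the second yields None, since "these" and "cat" disagree in number.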

The basic Jet distribution does not support such feature grammars.  However, Guy Lichtman implemented such grammars as a Jet extension as his course project in 2005, and prepared a report describing this extension.  We are making this extension available and encourage its use, but will accept untested grammars.  If you use this implementation you should write grammar alternatives (a := x | y;) as separate productions (a := x; a := y;), as we did above, so that you can associate separate constraints with each production.
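
If you already have a grammar written with alternatives, the rewriting into separate productions is mechanical. The following Python sketch shows one way to do it; split_alternatives is a hypothetical helper, not part of Jet, and it assumes that each production sits on one line, ends with a semicolon, and uses "|" only to separate alternatives.

```python
def split_alternatives(production):
    """Rewrite 'a := x | y;' as the list ['a := x;', 'a := y;']."""
    # Drop trailing whitespace and the final semicolon, then split head from body.
    head, body = production.rstrip().rstrip(";").split(":=", 1)
    head = head.strip()
    # Each alternative becomes its own production; inner whitespace is normalized.
    return [f"{head} := {' '.join(alt.split())};" for alt in body.split("|")]


for line in split_alternatives("vp := verb | verb np | verb np pp;"):
    print(line)
```

Feature constraints can then be added on the lines following each of the resulting productions.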

To use the featureGrammar extension, download a zip file containing a jar file, jet-feature.jar, which should go in the lib directory of Jet, and a batch file, features.bat, which should go in the win directory.  (To run under Linux, use this file instead and place it in the unix directory.)  Then put a Jet properties file, featureParse.jet, in the props directory, of the form

# JET properties file
Jet.dataPath         = data
Grammar.fileName     = fsgrammar.txt
EnglishLex.fileName1 = features.dict
processSentence      = lexLookup, featureParse

An ideal assignment solution would contain your grammar and dictionary and several tests, with each test checking one constraint (number agreement for nouns, number agreement for pronouns, ...).  Each test should include one sentence that satisfies the constraint and gets a parse, and a minimally different sentence that violates the constraint and does not get a parse.  Do not include parser traces in your submission; they take too much paper.
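
One way to keep such tests organized is to record each constraint together with its pair of sentences. The Python sketch below is only an illustration: MINIMAL_PAIRS and check_pairs are hypothetical names, the sentences use only the sample dictionary shown earlier, and the parses argument stands for whatever procedure you use to decide whether a sentence gets a parse (it is not a Jet function).

```python
# Each entry: (constraint being tested, sentence that should parse,
#              minimally different sentence that should not parse).
MINIMAL_PAIRS = [
    ("determiner-noun number agreement", "this cat sleeps", "these cat sleeps"),
    ("subject-verb number agreement",    "this cat sleeps", "this cat sleep"),
]


def check_pairs(pairs, parses):
    """Return a list of (constraint, sentence, problem) for every failed expectation.

    `parses` is any callable that takes a sentence and returns True if the
    sentence receives a parse and False otherwise.
    """
    failures = []
    for constraint, good, bad in pairs:
        if not parses(good):
            failures.append((constraint, good, "expected a parse"))
        if parses(bad):
            failures.append((constraint, bad, "expected no parse"))
    return failures
```

An empty result means every pair behaved as intended; a grammar that ignores agreement would be reported once for each violating sentence it accepts.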

Due February 28th.