G22.2590 - Natural Language Processing - Spring 2003 Prof. Grishman

Lecture 5 Outline

February 20, 2003

Discussion of Assignment #2

Question 1:  syntactic structure
Question 2:  rules for pronouns

POS Tagging with Hidden Markov Models:  conclusion

Powerpoint slides on Viterbi Algorithm

The JET tagger:  possible improvements and extensions

Feature Grammars

Problems of context-free grammars

Capturing constraints:
Number agreement
Case constraints (only on pronouns in English)
Count noun constraint
Subcategorization
Selection
… these can be captured by a context-free grammar, but not efficiently

Regularization - accounting for ‘displaced constituents’ / movement

Feature structures (J&M 11.1)

Instead of having atomic symbols ('noun', 'NP'), the nodes of the parse tree will have feature structures:  sets of feature-value pairs (or attribute-value pairs). We will represent these in the form [attribute1 = value1, attribute2 = value2, ...].  For example, third-person-singular could be represented as [number = singular, person = 3].

We can include the category as a feature cat in the feature structure:  [cat = NP, number = singular, person = 3].  We can also nest the feature structures, with the value of a feature being another feature structure:  [cat = NP, agreement = [number = singular, person = 3]].

If X is a feature structure, then we will write the value of feature f of X as X.f.  If the feature structure does not specify a value for f, we say X.f = null;  furthermore, null.f = null.

A feature path is a sequence of one or more feature names which is used to select a value from a feature structure:  <f1 f2> applied to X gets (X.f1).f2.  For example, <agreement number> applied to [cat = NP, agreement = [number = singular, person = 3]] yields 'singular'.
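As an illustrative sketch (not part of the notes), feature structures can be encoded as nested Python dicts, with None playing the role of null; the helper name follow_path is invented:

```python
# Feature structures as nested dicts (an illustrative encoding).
np_fs = {"cat": "NP", "agreement": {"number": "singular", "person": 3}}

def follow_path(fs, path):
    """Apply a feature path (a sequence of feature names) to a feature
    structure.  None stands for null: a missing feature yields None,
    and null.f = null."""
    for f in path:
        if not isinstance(fs, dict):
            return None   # null (or an atomic value) has no features
        fs = fs.get(f)
    return fs

print(follow_path(np_fs, ["agreement", "number"]))  # singular
print(follow_path(np_fs, ["agreement", "case"]))    # None
```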

[ discuss feature graphs ... reentrant structures ]

Unification

Unification is a binary operation on feature structures.  It can fail (structures are not unifiable -- no result is returned), or it can succeed and return a new feature structure.

The definition of unify(X,Y) is recursive:
    if X = null, return Y
    if Y = null, return X
    if X and Y are the same atomic value, return X
    if X and Y are both feature structures, then
        create a new feature structure Z
        for each feature f in either X or Y
            add f with the value unify(X.f, Y.f) to Z
                (note that if X.f and Y.f cannot be unified, the entire process fails)
        return Z
    else fail

(note that this version does not account for reentrant structures;
J&M describe a more efficient version which avoids copying and accounts for re-entrancy, p. 423)
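The pseudocode above can be sketched directly in Python, again encoding feature structures as nested dicts with None for null (like the pseudocode, this copy-based version does not handle reentrancy):

```python
class UnificationFailure(Exception):
    """Raised when two feature structures cannot be unified."""

def unify(x, y):
    # null unifies with anything
    if x is None:
        return y
    if y is None:
        return x
    # two feature structures: unify feature by feature into a new structure
    if isinstance(x, dict) and isinstance(y, dict):
        return {f: unify(x.get(f), y.get(f)) for f in set(x) | set(y)}
    # two identical atomic values
    if x == y:
        return x
    raise UnificationFailure  # atomic mismatch, or atomic vs. structure

# [number = singular] unified with [person = 3]
print(unify({"number": "singular"}, {"person": 3}))
```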

Expressing Constraints in the Grammar

To a rule
    S := A B C
we can add a constraint of the form
    <node path> = atomic-value
or
    <node path> = <node' path'>
where node and node' are one of S, A, B, or C.  Here the operation '=' is to be interpreted as unification.

To capture number agreement, we would write
    S := NP VP
        <NP number> = <VP number>
If we wanted to check agreement in both number and person, we could write
    S := NP VP
        <NP number> = <VP number>
        <NP person> = <VP person>
or we could group both number and person under an 'agreement' feature, as shown above, and write
    S := NP VP
        <NP agreement> = <VP agreement>

Similarly, we could check determiner - head agreement by
    NP := DET N
        <DET agreement> = <N agreement>
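As a sketch, checking <DET agreement> = <N agreement> amounts to unifying the two agreement values; unify is as in the pseudocode above, repeated compactly here, and the example entries are illustrative:

```python
class UnificationFailure(Exception):
    pass

def unify(x, y):
    """Compact copy-based unification over nested dicts (no reentrancy)."""
    if x is None:
        return y
    if y is None:
        return x
    if isinstance(x, dict) and isinstance(y, dict):
        return {f: unify(x.get(f), y.get(f)) for f in set(x) | set(y)}
    if x == y:
        return x
    raise UnificationFailure

det  = {"cat": "det", "agreement": {"number": "singular"}}
n_sg = {"cat": "n", "agreement": {"number": "singular", "person": 3}}
n_pl = {"cat": "n", "agreement": {"number": "plural", "person": 3}}

unify(det["agreement"], n_sg["agreement"])      # succeeds: "this mouse"
try:
    unify(det["agreement"], n_pl["agreement"])  # fails: "this mice"
except UnificationFailure:
    print("agreement constraint violated")
```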

Lexical features

Dictionary definitions will no longer be simple word categories, but rather feature structures:

    "mouse"    cat = n, number = singular;
    "mice"       cat = n, number = plural;

(A feature can be omitted to indicate that it can take on either value:    "fish"    cat = n; )
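In the dict encoding sketched earlier, the lexicon entries above might look like this (the name LEXICON is illustrative):

```python
# Lexical entries as feature structures (entries from the notes).
LEXICON = {
    "mouse": {"cat": "n", "number": "singular"},
    "mice":  {"cat": "n", "number": "plural"},
    "fish":  {"cat": "n"},  # number omitted: compatible with either value
}

print(LEXICON["mice"]["number"])           # plural
print(LEXICON["fish"].get("number"))       # None, i.e. unspecified
```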

Feature propagation and head features

We require rules of the form
    NP := DET N
        <NP agreement> = <N agreement>
    VP := V NP
        <VP agreement> = <V agreement>
to propagate features up the tree.  In most cases, we will find that a group of features are systematically passed from one child node to the parent.  These features are called head features, and the child is called the head of the phrase.  Thus the N is the head of the NP, and the V the head of the VP.

Parsing and feature constraints

To modify a parser to incorporate feature tests
    extend parser to keep a feature structure with each node of the parse tree
    apply the feature tests when completing a node;  reject the node if the tests fail

When searching the chart for an equivalent prior node, we must check not only that the grammar symbol(s) and start and end positions match, but also that the feature structures match (J&M point out that a subsumption test on feature structures suffices -- p. 432).
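A minimal sketch of such a subsumption test, on the same nested-dict encoding used earlier (the name subsumes is invented; no reentrancy):

```python
def subsumes(x, y):
    """True if feature structure x subsumes y, i.e. y carries at least
    the information in x.  None is null and subsumes everything."""
    if x is None:
        return True
    if isinstance(x, dict):
        return isinstance(y, dict) and all(
            subsumes(v, y.get(f)) for f, v in x.items())
    return x == y   # atomic values must match exactly

# the less specific structure subsumes the more specific one
print(subsumes({"number": "singular"},
               {"number": "singular", "person": 3}))  # True
print(subsumes({"number": "singular", "person": 3},
               {"number": "singular"}))               # False
```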