G22.2590 - Natural Language Processing - Spring 2006 Prof. Grishman

Lecture 2 Outline

January 24, 2006

Role of Syntax Analysis

Basic Syntactic Structures of English (J&M 9.1 - 9.8)

Comparison with other Languages

Phrase-structure languages (J&M 9.2)

  • productions; rewrite operation; derivation
  • Chomsky hierarchy (regular grammars, context-free grammars, context-sensitive grammars)
  • [J&M 9.11, 13.1]

    A small context-free English grammar

    sentence := np vp;
    np := n | art n | art adj n;
    vp := v | v np;
    Including auxiliaries

    vp := v | v np | v vp;

    Including PPs

    sentence := np vp;
    np := ngroup | ngroup pp;
    ngroup := n | art n | art adj n;
    vp := v | v np | v vp | v np pp;
    pp := p np;
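
The productions, the rewrite operation, and a derivation can be made concrete with a small sketch. It encodes the grammar with PPs as a Python dictionary and rewrites the leftmost non-terminal at each step. The names GRAMMAR, rewrite, and leftmost_derivation are illustrative assumptions, not part of the course materials, and the terminals are parts of speech rather than words.

```python
# Grammar with PPs from the outline; each key is a non-terminal and each
# value lists its alternative right-hand sides.
GRAMMAR = {
    "sentence": [["np", "vp"]],
    "np": [["ngroup"], ["ngroup", "pp"]],
    "ngroup": [["n"], ["art", "n"], ["art", "adj", "n"]],
    "vp": [["v"], ["v", "np"], ["v", "vp"], ["v", "np", "pp"]],
    "pp": [["p", "np"]],
}


def rewrite(form, index, rhs):
    """Apply one production: replace the symbol at `index` with `rhs`."""
    symbol = form[index]
    if rhs not in GRAMMAR.get(symbol, []):
        raise ValueError(f"no production {symbol} := {' '.join(rhs)}")
    return form[:index] + rhs + form[index + 1:]


def leftmost_derivation(choices, start="sentence"):
    """Rewrite the leftmost non-terminal once per entry in `choices`."""
    form = [start]
    steps = [form]
    for rhs in choices:
        # Position of the leftmost symbol that still has productions.
        index = next(i for i, s in enumerate(form) if s in GRAMMAR)
        form = rewrite(form, index, rhs)
        steps.append(form)
    return steps


steps = leftmost_derivation([
    ["np", "vp"],   # sentence := np vp
    ["ngroup"],     # np := ngroup
    ["art", "n"],   # ngroup := art n
    ["v", "np"],    # vp := v np
    ["ngroup"],     # np := ngroup
    ["n"],          # ngroup := n
])
for form in steps:
    print(" ".join(form))
```

Each printed line is one sentential form; the last line contains only parts of speech, so the derivation is complete.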

    Parsers

    Top-down recognizer / parser (J&M 10.2 -- algorithm p. 366)

    Bottom-up (immediate-constituent) parser (Grishman 2.4.2)

    Uses tree nodes with components root (a non-terminal grammar symbol),
    start and end (token numbers), and
    constituents (a vector of parse tree nodes)
    For i = 1, …, number of words in sentence:
        Create a node with root = part of speech of word i, start = i, end = i+1
        (if the word has several parts of speech, create one node for each P.O.S.).
        Put this node on list todo.
        While todo is not empty:
            Remove node n from todo.
            If there exists a production A → a1 a2 … aj such that root(n) = aj
            and there exist nodes n1 … nj-1 such that root(nk) = ak and
            end(nk) = start(nk+1) (k = 1, …, j-1, taking nj = n),
            then create a new node with root = A, start = start(n1), end = end(n)
            and add it to todo.
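
The bottom-up procedure above can be sketched as follows. This is one possible reading, under stated assumptions: the while loop runs once per word (so every constituent ending before word i already exists when word i is processed), all nodes built so far are kept in a list for lookup, and every matching production and node combination produces a node. The names Node, GRAMMAR, left_sequences, parse, and bracket are illustrative, and the input is assumed to be one set of parts of speech per word.

```python
from dataclasses import dataclass


@dataclass
class Node:
    root: str                 # grammar symbol
    start: int                # token number where the node begins
    end: int                  # token number just past the node
    constituents: tuple = ()  # child nodes (empty for word nodes)


# Grammar with PPs from the outline, as (A, (a1, ..., aj)) pairs.
GRAMMAR = [
    ("sentence", ("np", "vp")),
    ("np", ("ngroup",)), ("np", ("ngroup", "pp")),
    ("ngroup", ("n",)), ("ngroup", ("art", "n")), ("ngroup", ("art", "adj", "n")),
    ("vp", ("v",)), ("vp", ("v", "np")), ("vp", ("v", "vp")), ("vp", ("v", "np", "pp")),
    ("pp", ("p", "np")),
]


def left_sequences(symbols, end, nodes):
    """Yield tuples of adjacent existing nodes matching `symbols`, ending at `end`."""
    if not symbols:
        yield ()
        return
    for node in nodes:
        if node.root == symbols[-1] and node.end == end:
            for prefix in left_sequences(symbols[:-1], node.start, nodes):
                yield prefix + (node,)


def parse(tag_sets):
    """tag_sets[i-1] holds the parts of speech of word i; returns sentence nodes."""
    nodes = []  # assumption: every node built so far is kept for lookup
    for i, tags in enumerate(tag_sets, start=1):
        todo = [Node(tag, i, i + 1) for tag in sorted(tags)]
        nodes.extend(todo)
        while todo:
            n = todo.pop()
            for lhs, rhs in GRAMMAR:
                if rhs[-1] != n.root:
                    continue
                # Materialize matches first, since new nodes are appended below.
                for prefix in list(left_sequences(rhs[:-1], n.start, nodes)):
                    start = prefix[0].start if prefix else n.start
                    new = Node(lhs, start, n.end, prefix + (n,))
                    nodes.append(new)
                    todo.append(new)
    last = len(tag_sets) + 1
    return [x for x in nodes if x.root == "sentence" and x.start == 1 and x.end == last]


def bracket(node):
    """Render a parse tree as a bracketed string."""
    if not node.constituents:
        return node.root
    return "(" + node.root + " " + " ".join(bracket(c) for c in node.constituents) + ")"


for tree in parse([{"n"}, {"v"}, {"n"}, {"p"}, {"n"}]):
    print(bracket(tree))
```

The example input is structurally ambiguous: the PP can attach to the object np or to the vp, so the parser returns one sentence node for each attachment.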