### Lecture 12 Outline

April 23, 2013

Term projects: presentations.

## Statistical Parsers, cont'd

### Evaluating Parsers

Constituent Parsers: the accuracy of constituent parsers is stated in terms of labeled constituent recall/precision/F-measure when compared to a standard parse. Comparison against a standard parse is feasible because the UPenn (Penn Treebank) parses have become such a widely used standard. (Text, section 14.7.) (Software at http://nlp.cs.nyu.edu/evalb/.)
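Labeled bracket scoring can be sketched in a few lines. This is a simplified, illustrative version of what evalb computes; the real tool applies additional normalizations (controlled by its parameter file), such as ignoring punctuation:

```python
from collections import Counter

def prf(gold_brackets, test_brackets):
    """Each bracket is a (label, start, end) tuple; counting with
    multisets (Counter intersection) handles duplicate brackets."""
    gold, test = Counter(gold_brackets), Counter(test_brackets)
    matched = sum((gold & test).values())
    precision = matched / sum(test.values()) if test else 0.0
    recall = matched / sum(gold.values()) if gold else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f
```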

Dependency Parsers: the accuracy of dependency parsers is generally stated as the fraction of tokens for which the proper head and dependency label are assigned (the labeled attachment score); the unlabeled attachment score (head only) may also be reported (see, for example, Nivre and Scholz).
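These two scores (commonly abbreviated LAS and UAS) are simple token-level accuracies. A minimal sketch, assuming each parse is encoded as a list of (head, label) pairs aligned token by token:

```python
def attachment_scores(gold, predicted):
    """Return (LAS, UAS): the fraction of tokens with the correct head
    and label, and with the correct head regardless of label."""
    assert len(gold) == len(predicted)
    n = len(gold)
    las = sum(1 for g, p in zip(gold, predicted) if g == p)
    uas = sum(1 for (gh, _), (ph, _) in zip(gold, predicted) if gh == ph)
    return las / n, uas / n
```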

Typical scores for constituent and dependency parsers are now in the low 90s.

### Fast Dependency Parsers

Dependency parses can be generated easily from constituent parses, so we can generate a constituent parse using a CKY parser in O(n^3) time and then convert it to a dependency parse. In the past few years, there has been considerable interest in producing dependency parses directly and quickly.
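The constituent-to-dependency conversion is typically done with head-percolation rules in the spirit of Magerman and Collins: each rule picks the head child of a constituent, and every other child's lexical head becomes a dependent of it. A minimal sketch; the tiny HEAD_RULES table is illustrative only, not a real head table:

```python
# Illustrative head rules: for each nonterminal, candidate head-child
# labels in priority order (real tables are far larger).
HEAD_RULES = {"S": ["VP", "NP"], "VP": ["VBZ", "VB"], "NP": ["NN", "NNS"]}

def find_head(label, children):
    """Index of the head child, defaulting to the leftmost."""
    for cand in HEAD_RULES.get(label, []):
        for i, (clabel, _) in enumerate(children):
            if clabel == cand:
                return i
    return 0

def to_dependencies(tree, arcs):
    """tree is (label, children) with leaves (pos_tag, token_index).
    Appends (head, dependent) pairs to arcs; returns the tree's
    lexical head."""
    label, children = tree
    if isinstance(children, int):          # leaf
        return children
    head_idx = find_head(label, children)
    heads = [to_dependencies(c, arcs) for c in children]
    h = heads[head_idx]
    for i, d in enumerate(heads):
        if i != head_idx:
            arcs.append((h, d))
    return h
```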

One benefit of dependency trees is that they can handle discontinuous constituents, as in "A hearing is scheduled on the issue today", where "on the issue" modifies "hearing" but is separated from it [Nivre and McDonald 2008]. These are not common in English, but are much more common in languages with freer word order. Trees with such crossing edges are termed non-projective dependency parses.
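Non-projectivity can be tested directly by looking for crossing arcs. A sketch, assuming an illustrative encoding in which heads[i] is the head of token i+1 and 0 is the artificial root:

```python
def is_projective(heads):
    """True iff no two dependency arcs cross when drawn above the
    sentence. heads[i] is the head of token i+1 (tokens numbered 1..n,
    head 0 = artificial root)."""
    arcs = [(min(h, d), max(h, d)) for d, h in enumerate(heads, start=1)]
    for (l1, r1) in arcs:
        for (l2, r2) in arcs:
            # arcs cross when one endpoint of the second lies strictly
            # inside the first span and the other strictly outside it
            if l1 < l2 < r1 < r2:
                return False
    return True
```

For the example sentence, the arc from "hearing" to "on" crosses the arc from "scheduled" to "today", so the check reports a non-projective tree.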

We will discuss three general approaches: graph-based, transition-based, and easy-first.

#### Graph-based parsers

Graph-based parsers make an exhaustive search of possible dependency structures, seeking the highest-scoring tree. The score of a tree is the sum of the scores of its individual arcs (or their product, if the scores are probabilities); an arc's score may be a probability (as for a probabilistic constituent grammar) or a linear combination of feature weights. There are implementations which are similar to CKY parsers and take time O(n^3) (J. Eisner, COLING 1996).
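The objective can be illustrated with a brute-force toy that literally enumerates every head assignment, keeps the well-formed trees, and returns the one with the highest summed arc scores. This is exponential and only usable on tiny sentences; real parsers get the same argmax via Eisner's O(n^3) dynamic program or maximum-spanning-tree algorithms. The function names here are hypothetical:

```python
from itertools import product

def best_tree(n, arc_score):
    """Brute-force first-order parsing: n tokens (1..n); arc_score(h, d)
    scores the arc head h -> dependent d, with h = 0 for the artificial
    root. Returns (heads, score), where heads[i] is the head of token i+1."""
    best, best_heads = float("-inf"), None
    for heads in product(range(n + 1), repeat=n):
        if not _is_tree(heads):
            continue
        score = sum(arc_score(h, d) for d, h in enumerate(heads, start=1))
        if score > best:
            best, best_heads = score, heads
    return best_heads, best

def _is_tree(heads):
    # well-formed iff every token reaches the root (0) without a cycle
    for d in range(1, len(heads) + 1):
        seen, node = set(), d
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node - 1]
    return True
```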

#### Transition-based parsers

Transition-based parsers (also called shift-reduce parsers) are deterministic left-to-right parsers. They are similar to the parsers used for programming languages. Given an input sequence and a stack, at each step the parser can push the next word onto the stack or link the top item on the stack with the next word in the input. The decision is made by a classifier which takes as input the current state of the stack and a limited number of words ahead in the input sequence. The classifier is trained by converting each dependency tree to a transition sequence which generates that tree. This is a linear-time (O(n)) algorithm.
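The transition loop can be sketched as follows. The arc-standard variant shown here (arcs between the top two stack items) is one common transition system among several; decide() is a hypothetical stand-in for the trained classifier, and scripted() simply replays a fixed action sequence for illustration:

```python
SHIFT, LEFT_ARC, RIGHT_ARC = "shift", "left-arc", "right-arc"

def parse(n_tokens, decide):
    """Arc-standard parsing. Tokens are numbered 1..n; 0 is the
    artificial root. Returns a list of (head, dependent) arcs."""
    stack, buffer, arcs = [0], list(range(1, n_tokens + 1)), []
    while buffer or len(stack) > 1:
        action = decide(stack, buffer)
        if action == SHIFT:
            stack.append(buffer.pop(0))
        elif action == LEFT_ARC:    # stack[-1] becomes head of stack[-2]
            dep = stack.pop(-2)
            arcs.append((stack[-1], dep))
        else:                       # RIGHT_ARC: stack[-2] heads stack[-1]
            dep = stack.pop()
            arcs.append((stack[-1], dep))
    return arcs

def scripted(actions):
    """Replay a fixed transition sequence in place of a classifier."""
    it = iter(actions)
    return lambda stack, buffer: next(it)
```

Each word is shifted exactly once and popped exactly once, so the number of transitions is linear in the sentence length, which is where the O(n) running time comes from.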

Making deterministic decisions with limited look-ahead limits the accuracy of the parser. The effect can be reduced by maintaining multiple states. Beam search keeps the top beam-width states. Equivalent states can be merged (Huang and Sagae 2010).
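The beam-search idea can be sketched generically. All names here are hypothetical: step() applies an action to a parser state, score_actions() stands in for the classifier's scored action list, and is_final() tests for a completed parse; state merging is omitted:

```python
import heapq

def beam_parse(initial_state, step, score_actions, is_final, beam_width=8):
    """Keep the beam_width highest-scoring states at each step instead
    of committing to a single deterministic choice. Returns the best
    (score, state) pair among the final states."""
    beam = [(0.0, initial_state)]
    while not all(is_final(s) for _, s in beam):
        candidates = []
        for score, state in beam:
            if is_final(state):
                candidates.append((score, state))  # carry finished states
                continue
            for action, s in score_actions(state):
                candidates.append((score + s, step(state, action)))
        beam = heapq.nlargest(beam_width, candidates, key=lambda x: x[0])
    return max(beam, key=lambda x: x[0])
```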

One widely used parser of this type is MaltParser.