G22.2590 - Natural Language Processing - Spring 2003 Prof. Grishman

Lecture 11 Outline

April 10, 2003

[Using NE for text analysis:  the Perseus Project;  more on such 'low-level' applications next week in a guest lecture by Prof. Dan Melamed.]

Resolving Ambiguity

Types of ambiguity

Selectional constraints

Constraints on the words (word senses) allowed as arguments and modifiers of other words (word senses)
(cf. syntactic constraints on lexical items).  (J&M p. 614-619).

Recording every acceptable combination of word senses is impractical, so ...
We gather word senses into word classes, in a hierarchical structure (tree or directed acyclic graph)
And then record selectional constraints in terms of the class(es) of words acceptable as an argument / modifier of another word (or word class)
These word classes usually correspond to ‘conceptual’ (semantic) classes

Word senses may be organized into a taxonomy, connected by hyponymy ('isa') relations (J&M p. 600).
WordNet is the most widely used taxonomy of English (J&M sec. 16.2);  similar taxonomies have been produced for many other languages (see the Global WordNet Association).

Associate selectional constraints with semantic roles (or treat different syntactic position --> semantic role assignments as different senses) to resolve semantic role ambiguities.
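The class-based idea above can be sketched in a few lines of Python. This is a toy illustration, not code from the course: the taxonomy, roles, and constraint table are invented, but they show how constraints stated over classes cover many individual words via 'isa' traversal.

```python
# A minimal sketch (invented toy data) of class-based selectional
# constraints over a small hand-built taxonomy.

# Hyponymy ('isa') links: each class points to its parent class.
ISA = {
    "dog": "animal", "cat": "animal",
    "animal": "physical-object",
    "rock": "physical-object",
    "idea": "abstraction",
}

def isa(word, cls):
    """True if `word` is `cls` or a descendant of it in the taxonomy."""
    while word is not None:
        if word == cls:
            return True
        word = ISA.get(word)     # climb one 'isa' link
    return False

# Constraints are recorded over classes, not individual words:
# the AGENT of 'eat' must be an animal; its PATIENT a physical object.
CONSTRAINTS = {
    ("eat", "agent"): "animal",
    ("eat", "patient"): "physical-object",
}

def satisfies(head, role, arg):
    """Check one head/role/argument combination against the constraints."""
    required = CONSTRAINTS.get((head, role))
    return required is None or isa(arg, required)

print(satisfies("eat", "agent", "dog"))    # True: dog isa animal
print(satisfies("eat", "agent", "idea"))   # False: idea is an abstraction
```

In a real system the hierarchy would be far larger (e.g., WordNet), but the lookup logic is the same: walk up the hypernym chain until the required class is found or the root is reached.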

Applying selectional constraints

Direct approach:  enumerate all semantic interpretations (logical forms), and see which ones satisfy all constraints
        Problem:  there may be very many interpretations
Factoring ambiguities (esp. word sense ambiguities)
        More efficient for highly ambiguous sentences
        Use iterative constraint satisfaction algorithm which eliminates senses
During or after the parse?
        Can apply constraints during parse, blocking the creation of partial parses if they do not satisfy selectional constraints
        Can significantly reduce the number of partial parses, but makes each parsing step slower.
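The iterative constraint-satisfaction idea can be sketched as a small sense-pruning loop. The sentence, sense inventory, and licensed sense pairs below are invented for illustration; the point is the mechanism: keep a set of candidate senses per word and repeatedly discard any sense that is incompatible with every remaining sense of a linked word, until nothing changes.

```python
# A minimal sketch (toy data) of iterative constraint satisfaction for
# factored word-sense ambiguities, in the style of arc consistency.

# Candidate senses for each word in "the bank raised its rate".
senses = {
    "bank": {"bank/institution", "bank/riverside"},
    "raise": {"raise/increase", "raise/lift"},
    "rate": {"rate/price", "rate/speed"},
}

# Dependency links, each with the set of sense pairs it licenses.
links = [
    ("bank", "raise", {("bank/institution", "raise/increase")}),
    ("raise", "rate", {("raise/increase", "rate/price")}),
]

def prune(senses, links):
    """Discard senses with no compatible partner until a fixed point."""
    changed = True
    while changed:
        changed = False
        for left, right, allowed in links:
            for a in list(senses[left]):
                if not any((a, b) in allowed for b in senses[right]):
                    senses[left].discard(a)
                    changed = True
            for b in list(senses[right]):
                if not any((a, b) in allowed for a in senses[left]):
                    senses[right].discard(b)
                    changed = True
    return senses

print(prune(senses, links))
```

This avoids enumerating the 2 x 2 x 2 = 8 full interpretations: each sense is tested only against the senses of its linked neighbors, which is the efficiency gain factoring buys on highly ambiguous sentences.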

Semantic grammar

Instead of having a syntactic grammar and separate selectional constraints, we can write a grammar in terms of semantic categories which then directly incorporates the selectional constraints (J&M p. 575-577)

Syntactic vs. semantic grammars:

Semantic grammars provide a simple approach to limited sublanguages (capture both syntactic and semantic constraints in a single component)
- convenient for constructs which fall outside general language syntax
But they lose the power of syntactic generalization … each semantic pattern must be stated separately in each of its syntactic forms (active, passive, question, …), so semantic grammars are cumbersome for broad-coverage systems.
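A semantic grammar for a tiny sublanguage can be sketched directly. The shipping-domain rules below are invented for illustration: because the nonterminals are semantic categories (SHIP, PORT) rather than syntactic ones (NP), the selectional constraint "only ships dock at ports" is built into the grammar itself.

```python
# A minimal sketch (invented sublanguage) of a semantic grammar with a
# naive top-down recognizer.  Nonterminals are semantic categories.

RULES = {
    "S":       [["SHIP-NP", "DOCK-VP"]],
    "SHIP-NP": [["the", "SHIP"]],
    "DOCK-VP": [["docked", "at", "PORT-NP"]],
    "PORT-NP": [["the", "PORT"]],
    "SHIP":    [["frigate"], ["tanker"]],
    "PORT":    [["harbor"]],
}

def parse(cat, tokens, i=0):
    """Return the token index reached after matching `cat`, or None."""
    if cat not in RULES:                       # terminal word
        return i + 1 if i < len(tokens) and tokens[i] == cat else None
    for rhs in RULES[cat]:                     # try each expansion
        j = i
        for sym in rhs:
            j = parse(sym, tokens, j)
            if j is None:
                break
        else:
            return j
    return None

def accepts(sentence):
    toks = sentence.split()
    return parse("S", toks) == len(toks)

print(accepts("the frigate docked at the harbor"))  # True
print(accepts("the harbor docked at the frigate"))  # False: a PORT cannot fill SHIP-NP
```

Note the cost of this design: to also accept "at the harbor the frigate docked" or the question form, each variant would need its own rule, which is exactly the loss of syntactic generalization noted above.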
Introducing semantic classes into Jet:  the concept hierarchy.

Acquiring selectional constraints

For restricted sublanguages (esp. technical domains), selectional restrictions may be quite ‘sharp’ and can be captured by manual text analysis
For broader coverage, capturing selectional constraints manually is very difficult
Can be acquired from tree banks (hand-parsed corpora), or
Learned from unambiguous examples in machine-parsed corpora
Corpus-trained approaches allow one to gather statistics on selection, making them selectional preferences rather than strict constraints
    - compute probability of each head-relation-argument triple
    - can generalize using a semantic hierarchy (e.g., WordNet, thesaurus)
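The triple-counting step can be sketched with toy counts (the triples below are invented): estimate P(argument | head, relation) by relative frequency over head-relation-argument triples harvested from a parsed corpus, which yields graded preferences rather than hard constraints.

```python
# A minimal sketch (toy data) of selectional preferences as
# probabilities over head-relation-argument triples.

from collections import Counter

# Triples as they might come out of a machine-parsed corpus.
triples = [
    ("drink", "obj", "water"), ("drink", "obj", "water"),
    ("drink", "obj", "coffee"), ("drink", "obj", "idea"),
]

pair_counts = Counter((h, r) for h, r, a in triples)   # count(head, rel)
triple_counts = Counter(triples)                       # count(head, rel, arg)

def preference(head, rel, arg):
    """P(arg | head, rel) by relative frequency; 0.0 if the pair is unseen."""
    denom = pair_counts[(head, rel)]
    return triple_counts[(head, rel, arg)] / denom if denom else 0.0

print(preference("drink", "obj", "water"))   # 0.5
print(preference("drink", "obj", "idea"))    # 0.25
```

In practice the counts are sparse, so the argument words would be generalized upward through a semantic hierarchy (e.g., pooling "water" and "coffee" under a BEVERAGE class) before estimating the probabilities.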
Corpus-trained methods can also be used to acquire statistics for word-sense disambiguation (J&M sec. 17.1, 17.2)
    - largely dependent on corpora hand-tagged with word senses