G22.2591 - Advanced Natural Language Processing - Spring 2009

Lecture 5

Name Recognition ... final words

Discuss combining multi-view approaches and active learning.

Hyponymy and Lexico-Syntactic Patterns

Name tagging provides useful but limited information because it is generally based on a small number of broad categories (person, organization, location, ...).  If we want to know "Who shot JR?" and the system knows that "who" is answered by a person name, then we can look for an answer of the form "<ENAMEX TYPE=PERSON>...</ENAMEX> shot JR".  On the other hand, if the question is posed as "Which rancher shot JR?" we have to know that a rancher is a kind of person, or have a list of ranchers, in order to answer the question.
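The answer pattern above can be sketched as a regular expression over name-tagged text. This is only an illustration: the tagged sentences, the names in them, and the function answer_who are hypothetical, and a real system would also have to handle paraphrase and word order.

```python
import re

# Hypothetical name-tagged sentences, using the ENAMEX markup shown above.
TAGGED = [
    "Yesterday <ENAMEX TYPE=PERSON>Pat Smith</ENAMEX> shot JR.",
    "<ENAMEX TYPE=ORGANIZATION>Acme Oil</ENAMEX> shot JR down in court.",
]

def answer_who(predicate, sentences):
    """Return person names found in the form '<PERSON name> predicate'."""
    pattern = re.compile(
        r"<ENAMEX TYPE=PERSON>([^<]*)</ENAMEX>\s+" + re.escape(predicate)
    )
    answers = []
    for sentence in sentences:
        answers.extend(pattern.findall(sentence))
    return answers

print(answer_who("shot JR", TAGGED))
```

Only the PERSON-tagged name is returned; the organization is skipped because its tag type does not match. Answering "Which rancher shot JR?" with the same pattern would need the additional knowledge that ranchers are persons.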

The relation between a general class X and a more specific class Y is the hypernym/hyponym relation:  X is a hypernym of Y, Y is a hyponym of X.  Thus rancher is a hyponym of person. 

Hyponym relations are valuable for many NLP tasks and applications.  They are useful for answering questions, information extraction, and generally for applying selectional (semantic) constraints.  So they have been a long-term object of study in NLP.

One early approach to acquiring hyponymy relations was through the use of machine-readable dictionaries.  Dictionary definitions are written in a standard style ("Y (n) 1. an X which ...") that allows the extraction of hyponym relations without full parsing.  Several dictionaries (Longman's, OALD [Oxford Advanced Learner's Dictionary], Merriam-Webster Pocket) were intensively analyzed in the 1970's and 80's for this purpose.  One problem which made it difficult to construct a hyponym graph from the individual relations is the presence of words with multiple senses. Hyponymy is really a relation between word senses, not words;  lumping the senses together produces a lot of false connections.
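As a rough illustration of extraction without full parsing, the definition style quoted above can be matched with a single regular expression. The entries below are made up for the example, not taken from any of the dictionaries named, and the pattern assumes the genus term is a single word directly before "which", "that" or "who".

```python
import re

# Matches the stylized form 'Y (n) 1. a/an/the X which|that|who ...'
DEFINITION = re.compile(
    r"^(?P<hypo>\w+) \(n\) 1\. (?:an?|the) (?P<hyper>\w+) (?:which|that|who)\b"
)

def hyponym_from_definition(entry):
    """Return (hyponym, hypernym) if the entry fits the pattern, else None."""
    m = DEFINITION.match(entry)
    return (m.group("hypo"), m.group("hyper")) if m else None

# Hypothetical dictionary entries.
print(hyponym_from_definition("rancher (n) 1. a person who owns or runs a ranch"))
print(hyponym_from_definition("poodle (n) 1. a small dog that has curly hair"))
```

The second entry is not matched because of the modifier "small", which shows how quickly a purely surface pattern runs out; note also that the result relates words, not senses, which is the problem described above.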

This problem was addressed by George Miller and his colleagues in the creation of WordNet starting in the mid-1980's.  The basic nodes in the WordNet graph are synsets [synonym sets], which are sets of synonymous word senses.  However, WordNet shares the problems of most large-scale hand-built resources;  it is somewhat inconsistent and incomplete.  It is intended to cover general word usage and so is not adequate for more specialized texts.
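The difference between a graph over senses and a graph over words can be seen in a toy example. The sense labels (word#number) and the links below are invented for illustration and are not actual WordNet data.

```python
from collections import defaultdict

# Hypothetical sense-level hypernym links: (hyponym sense, hypernym sense).
SENSE_LINKS = [
    ("savings_bank#1", "bank#1"),
    ("bank#1", "institution#1"),   # bank as a financial institution
    ("bank#2", "slope#1"),         # bank as the side of a river
]

def build_graph(links):
    """Map each node to the set of its direct hypernyms."""
    graph = defaultdict(set)
    for child, parent in links:
        graph[child].add(parent)
    return graph

def ancestors(node, graph):
    """All nodes reachable by following hypernym links upward."""
    seen = set()
    stack = [node]
    while stack:
        for parent in graph.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def strip_sense(label):
    """Drop the sense number, lumping all senses of a word together."""
    return label.split("#")[0]

sense_graph = build_graph(SENSE_LINKS)
word_graph = build_graph((strip_sense(c), strip_sense(p)) for c, p in SENSE_LINKS)

print(sorted(ancestors("savings_bank#1", sense_graph)))
print(sorted(ancestors("savings_bank", word_graph)))
```

In the sense graph a savings bank is an institution; in the word graph it also becomes a kind of slope, which is the sort of false connection that arises when senses are lumped together.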

To complement (or, for specialized domains, replace) WordNet, interest grew in the 1990's in learning hyponym relations from corpora.  Marti Hearst pointed out that many such relations could be acquired from a few lexico-syntactic patterns (patterns combining specific lexical items and syntactic structures):

Marti Hearst. Automatic acquisition of hyponyms from large text corpora. COLING 1992.
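One of the patterns in Hearst's paper has the form "NP such as NP, NP, (and | or) NP".  A minimal sketch of matching it follows, under a strong simplifying assumption: each noun phrase is approximated by a single word, where a real system would use a noun phrase chunker.  The sentences and the function name such_as_pairs are illustrative only.

```python
import re

# 'X such as Y1, Y2, and Y3', with each noun phrase approximated by one word.
# The lookahead keeps 'and'/'or' from being read as a list item.
SUCH_AS = re.compile(
    r"(\w+),? such as (\w+(?:, (?!and\b|or\b)\w+)*(?:,? (?:and|or) \w+)?)"
)
# Separators inside the matched list: ', ', ', and ', ' or ', ...
SEPARATOR = re.compile(r",?\s+(?:and|or)\s+|,\s*")

def such_as_pairs(text):
    """Return (hyponym, hypernym) pairs suggested by the 'such as' pattern."""
    pairs = []
    for m in SUCH_AS.finditer(text):
        hypernym = m.group(1)
        for hyponym in SEPARATOR.split(m.group(2)):
            pairs.append((hyponym, hypernym))
    return pairs

print(such_as_pairs("He treats injuries such as bruises, wounds, and fractures."))
print(such_as_pairs("She speaks languages such as French or German."))
```

The pairs come out as surface words (plural forms, no sense information), so in practice they would still need to be normalized and filtered.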

Marti Hearst's system was based on manually constructed patterns;  Rion Snow showed how to train a hyponym classifier which learned such patterns from WordNet and a large text corpus.

Rion Snow, Daniel Jurafsky, and Andrew Ng. Learning syntactic patterns for automatic hypernym discovery. NIPS 2004.
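The general idea of learning patterns from known pairs can be sketched in a much-simplified form.  This is not the method of the paper: it assumes surface word strings between two nouns in place of syntactic patterns, a tiny hand-made set of known pairs in place of WordNet, and a per-pattern precision estimate in place of a trained classifier.  All data and function names are hypothetical.

```python
from collections import Counter

# Toy stand-in for WordNet: known (hyponym, hypernym) pairs.
KNOWN = {("dog", "animal"), ("oak", "tree"), ("rose", "flower")}

# Toy corpus occurrences: (left noun, connecting words, right noun), assumed
# to come from an upstream noun-pair extractor that is not shown here.
OCCURRENCES = [
    ("dog", "and other", "animal"),
    ("oak", "and other", "tree"),
    ("rose", "is a", "flower"),
    ("dog", "is a", "friend"),
    ("dog", "under the", "tree"),
]

def pattern_precision(occurrences, known):
    """Fraction of each pattern's occurrences that link a known pair."""
    total, positive = Counter(), Counter()
    for left, context, right in occurrences:
        total[context] += 1
        if (left, right) in known:
            positive[context] += 1
    return {pattern: positive[pattern] / total[pattern] for pattern in total}

def propose(occurrences, scores, threshold=0.75):
    """Propose (hyponym, hypernym) pairs found with a high-scoring pattern."""
    return [
        (left, right)
        for left, context, right in occurrences
        if scores.get(context, 0.0) >= threshold
    ]

scores = pattern_precision(OCCURRENCES, KNOWN)
new_text = [("sparrow", "and other", "bird"), ("cat", "under the", "table")]
print(propose(new_text, scores))
```

Pairs missing from the known set are counted as negatives here, so an incomplete resource will make good patterns look worse than they are.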

presentation by Ang Sun


Looking ahead:
The latest word from Google on using lexico-syntactic patterns ...

Marius Pasca and Benjamin Van Durme. Weakly-Supervised Acquisition of Open-Domain Classes and Class Attributes from Web Documents and Query Logs. ACL 2008.

And KnowItAll, which also starts with lexico-syntactic patterns:

Oren Etzioni et al. Methods for domain-independent information extraction from the Web: an experimental comparison. AAAI 2004.