G22.2591 - Advanced Natural Language Processing - Spring 2009

Lecture 9

Further discussion of Assignment #2 results.

Discussion of Assignment #3

Discussion of Text Analysis Conference (TAC) 2009 Knowledge-Base Population (KBP) task as a possible source of term projects.

Relation Extraction (cont'd)

As we saw in looking at examples of ACE relations, identifying relations accurately may depend heavily on first identifying and classifying their arguments accurately: the relation is quite different for "French auto worker" and "Ford auto worker". Many relation taggers operate by considering every pair of entity mentions (basically, noun phrases representing people, places, or organizations) and applying a classifier to the pair, returning either no relation or the type of some relation. Poor performance of the entity mention tagger therefore severely compromises the relation tagger.
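The pairwise scheme can be sketched as follows. This is a minimal illustration, not any particular ACE system: the Mention class, the hand-written toy_classifier standing in for a trained classifier, and the relation labels EMPLOYMENT and CITIZENSHIP are all assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class Mention:
    text: str         # surface string of the noun phrase
    entity_type: str  # e.g. "PER", "ORG", "GPE", as assigned by the mention tagger


def toy_classifier(arg1: Mention, arg2: Mention) -> Optional[str]:
    """Hypothetical stand-in for a trained classifier.

    Returns an illustrative relation label, or None for "no relation".
    """
    if arg1.entity_type == "PER" and arg2.entity_type == "ORG":
        return "EMPLOYMENT"
    if arg1.entity_type == "PER" and arg2.entity_type == "GPE":
        return "CITIZENSHIP"
    return None


def tag_relations(
    mentions: List[Mention],
    classify: Callable[[Mention, Mention], Optional[str]],
) -> List[Tuple[str, str, str]]:
    """Apply the classifier to every ordered pair of entity mentions."""
    relations = []
    for arg1, arg2 in permutations(mentions, 2):
        label = classify(arg1, arg2)
        if label is not None:
            relations.append((arg1.text, label, arg2.text))
    return relations


# "Ford auto worker" with Ford correctly tagged as an organization.
correct = tag_relations(
    [Mention("worker", "PER"), Mention("Ford", "ORG")], toy_classifier
)

# The same phrase with Ford mis-tagged as a geo-political entity:
# the mention tagger's error changes the relation type.
mistagged = tag_relations(
    [Mention("worker", "PER"), Mention("Ford", "GPE")], toy_classifier
)
```

Because the classifier only sees what the mention tagger hands it, the single type error in the second call is enough to produce the wrong relation.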

To make matters worse, what the relation tagger finds is a relation between noun phrases. (In ACE parlance, these are relation mentions, which represent relations between entity mentions.) If the noun phrase is an anaphoric reference to a named entity (for example, "Bernard Madoff was jailed yesterday. ... His wife Mabel remained in seclusion." or "... The swindler's wife, Mabel, remained in seclusion."), we really want to recover the name of the entity. This requires establishing coreference between entity mentions.
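A minimal sketch of this last step, assuming the coreference chains have already been produced by some resolver. The chain representation, the function names, and the FAMILY label are hypothetical, and mentions are matched by their text only, where a real system would use document offsets.

```python
from typing import List, Tuple

# A coreference chain: (mention text, mention kind) pairs for one entity.
# Kinds used here: "NAME", "NOMINAL", "PRONOUN".
Chain = List[Tuple[str, str]]


def canonical_name(chain: Chain) -> str:
    """Return the first named mention in the chain, else the first mention."""
    for text, kind in chain:
        if kind == "NAME":
            return text
    return chain[0][0]


def resolve_relation(
    relation_mention: Tuple[str, str, str], chains: List[Chain]
) -> Tuple[str, str, str]:
    """Map both arguments of a relation mention to the names of their entities."""
    arg1, rel_type, arg2 = relation_mention

    def lookup(mention: str) -> str:
        for chain in chains:
            if any(text == mention for text, _ in chain):
                return canonical_name(chain)
        return mention  # no chain found: keep the mention as it is

    return (lookup(arg1), rel_type, lookup(arg2))


chains = [
    [("Bernard Madoff", "NAME"), ("His", "PRONOUN"), ("The swindler", "NOMINAL")],
    [("wife", "NOMINAL"), ("Mabel", "NAME")],
]

# Relation mention found between the noun phrases "His" and "wife".
resolved = resolve_relation(("His", "FAMILY", "wife"), chains)
```

Here the relation mention between "His" and "wife" is lifted to a relation between the entities named Bernard Madoff and Mabel.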

Semi- and un-supervised methods

All these methods involve bootstrapping, which alternates between finding pairs of arguments and finding the contexts ('patterns') in which these arguments appear. This is analogous to the co-training (between spelling features and context features) used in semi-supervised named-entity (NE) tagging.

All of these methods are based on named arguments; anaphoric arguments are not considered.
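The alternation between argument pairs and patterns can be sketched as below. This is a deliberately simplified illustration and not the algorithm of any of the papers listed here: a pattern is just the literal string between the two arguments, names are found with a crude capitalization regex, and there is no pattern scoring or filtering. The toy corpus, the seed pair, and all function names are assumptions.

```python
import re
from typing import Iterable, List, Set, Tuple

Pair = Tuple[str, str]

# Crude matcher for a capitalized name of one or more words (an assumption).
NAME = r"[A-Z]\w*(?: [A-Z]\w*)*"


def find_patterns(corpus: List[str], pairs: Iterable[Pair]) -> Set[str]:
    """Collect the text found between the two arguments of each known pair."""
    patterns = set()
    for sentence in corpus:
        for x, y in pairs:
            m = re.search(re.escape(x) + r"(.+?)" + re.escape(y), sentence)
            if m:
                patterns.add(m.group(1))
    return patterns


def find_pairs(corpus: List[str], patterns: Iterable[str]) -> Set[Pair]:
    """Collect name pairs that occur on either side of a known pattern."""
    pairs = set()
    for sentence in corpus:
        for middle in patterns:
            regex = "(" + NAME + ")" + re.escape(middle) + "(" + NAME + ")"
            for m in re.finditer(regex, sentence):
                pairs.add((m.group(1), m.group(2)))
    return pairs


def bootstrap(corpus: List[str], seeds: Iterable[Pair], rounds: int = 2) -> Set[Pair]:
    """Alternate between inducing patterns from pairs and pairs from patterns."""
    pairs = set(seeds)
    for _ in range(rounds):
        patterns = find_patterns(corpus, pairs)
        pairs |= find_pairs(corpus, patterns)
    return pairs


corpus = [
    "Isaac Asimov wrote Foundation in 1951.",
    "Charles Dickens wrote Great Expectations in 1861.",
    "Charles Dickens, author of Great Expectations, lived in London.",
    "Jane Austen, author of Emma, lived in Bath.",
]
authors = bootstrap(corpus, {("Isaac Asimov", "Foundation")}, rounds=2)
```

Starting from one seed pair, the first round learns the pattern " wrote " and finds the Dickens pair; the second round learns ", author of " from that pair and finds the Austen pair. With no scoring step, one bad pattern would pull in bad pairs on every later round, which is why practical systems score or filter patterns before reusing them.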

Sergey Brin. Extracting Patterns and Relations from the World Wide Web. In Proc. World Wide Web and Databases International Workshop, pages 172-183. Number 1590 in LNCS, Springer, March 1998.
Eugene Agichtein and Luis Gravano. Snowball: Extracting Relations from Large Plain-Text Collections. In Proc. 5th ACM International Conference on Digital Libraries (ACM DL), 2000.
Deepak Ravichandran and Eduard Hovy. Learning Surface Text Patterns for a Question-Answering System. ACL 2002.
Takaaki Hasegawa, Satoshi Sekine, and Ralph Grishman. Discovering Relations among Named Entities from Large Corpora. ACL 2004.

Looking ahead: event/scenario extraction

Ellen Riloff. Automatically Generating Extraction Patterns from Untagged Text. In Proc. Thirteenth National Conference on Artificial Intelligence (AAAI-96), pages 1044-1049, 1996.

Roman Yangarber, Ralph Grishman, Pasi Tapanainen, and Silja Huttunen. Automatic Acquisition of Domain Knowledge for Information Extraction. Proc. COLING 2000.