G22.2591 - Advanced Natural Language Processing - Spring 2004

Lecture 10

(Discuss results of sense disambiguation)

(Discuss term project topics)

Coreference:  corpus-trained methods -- paper presentations

Pronoun Coreference
Niyu Ge; John Hale; Eugene Charniak.  A Statistical Approach to Anaphora Resolution.  WVLC 1998.

Hale's WVLC slides.

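Remark:  when gains are small, it is important to assess their statistical significance.  Partitioning the data lets you measure the gain separately on each partition and so judge whether it is consistent or could be due to chance.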
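
One way to act on this remark is sketched below:  score a baseline and an improved system on each of k partitions of the test data, then apply an exact two-sided sign test to the per-partition results.  The sign test is our choice for illustration, not necessarily the test used in the paper, and the function names and the 0/1 "correct" flags are hypothetical.

```python
from math import comb


def partition_accuracies(correct_flags, k):
    """Split per-item 0/1 correctness flags into k contiguous partitions
    and return the accuracy on each partition."""
    n = len(correct_flags)
    if k < 1 or n < k:
        raise ValueError("need at least one item per partition")
    # Integer boundaries guarantee every partition is non-empty when n >= k.
    bounds = [i * n // k for i in range(k + 1)]
    return [sum(correct_flags[a:b]) / (b - a) for a, b in zip(bounds, bounds[1:])]


def sign_test_p_value(baseline_scores, improved_scores):
    """Exact two-sided sign test on paired per-partition scores.
    Partitions where the two systems tie are dropped."""
    pairs = list(zip(baseline_scores, improved_scores))
    wins = sum(1 for base, new in pairs if new > base)
    losses = sum(1 for base, new in pairs if new < base)
    n = wins + losses
    if n == 0:
        return 1.0
    k = min(wins, losses)
    # Probability of a split at least this lopsided if wins and losses
    # were equally likely (binomial with p = 0.5).
    tail = sum(comb(n, j) for j in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

If the improved system wins on every one of 10 partitions, the p-value is 2/1024, so even a small gain can be judged significant when it is consistent across partitions.  With a single undivided test set there is only one paired observation and no such judgment is possible.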

General Coreference
Wee Meng Soon; Daniel Chung Yong Lim; Hwee Tou Ng.  A Machine Learning Approach to Coreference Resolution of Noun Phrases.  Computational Linguistics 27 #4, 521-545 (2001).

Presentation by Iman Sen.

Remarks: 
(1) decision trees fragment the data space ... they make a separate decision for each conjunction of feature values, using only the training examples that match it.  In contrast, Naive Bayes and MaxEnt combine the effects of different features multiplicatively ... each feature contributes its own factor to the end result, independent of the other features.
(2) fragmentation can be an issue because there isn't that much data ... for example, MUC-6 has a little over 1,000 coreference links (positive training examples).
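
The multiplicative combination in remark (1) can be illustrated with a minimal Naive Bayes sketch.  The feature names, the toy training pairs, and the add-one smoothing are assumptions made for this example; they are not taken from the papers.  The point is that the model multiplies one factor per feature, so it can still score a conjunction of feature values that never occurs in training, where a decision-tree leaf for that conjunction would have no examples.

```python
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """examples: list of (feature_dict, label) pairs. Returns count tables."""
    label_counts = Counter()
    feature_counts = defaultdict(Counter)  # (label, feature name) -> value counts
    for features, label in examples:
        label_counts[label] += 1
        for name, value in features.items():
            feature_counts[(label, name)][value] += 1
    return label_counts, feature_counts


def posterior(model, features, n_values=2):
    """P(label | features): prior times one smoothed factor per feature,
    normalized over labels. n_values is the number of values a feature
    can take (2 for the binary features assumed here)."""
    label_counts, feature_counts = model
    total = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        score = count / total  # prior P(label)
        for name, value in features.items():
            seen = feature_counts[(label, name)][value]
            score *= (seen + 1) / (count + n_values)  # add-one smoothing
        scores[label] = score
    z = sum(scores.values())
    return {label: s / z for label, s in scores.items()}


# Toy data (hypothetical): candidate antecedent/anaphor pairs.
training = [
    ({"string_match": True, "gender_agree": True}, "coref"),
    ({"string_match": False, "gender_agree": True}, "coref"),
    ({"string_match": False, "gender_agree": False}, "not_coref"),
    ({"string_match": False, "gender_agree": True}, "not_coref"),
    ({"string_match": False, "gender_agree": False}, "not_coref"),
]
model = train_naive_bayes(training)

# This conjunction (string match without gender agreement) is unseen in training.
unseen = posterior(model, {"string_match": True, "gender_agree": False})
print(unseen)
```

The price of this robustness is the independence assumption:  the model cannot represent an interaction in which a feature matters only when another feature has a particular value, which is exactly what a decision tree captures.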

Vincent Ng; Claire Cardie.  Improving Machine Learning Approaches to Coreference Resolution.  ACL 2002.

Slides.

Sanda M. Harabagiu; Razvan C. Bunescu; Steven J. Maiorano.  Text and Knowledge Mining for Coreference Resolution.  NAACL 2001.

Presentation by Yusuke Shinyama.