G22.2591 - Advanced Natural Language Processing - Spring 2004
(Discuss results of sense disambiguation)
(Discuss term project topics)
Coreference: corpus-trained methods -- paper presentations
Niyu Ge; John Hale; Eugene Charniak. A Statistical
Approach to Anaphora Resolution. WVLC 1998.
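Remark: when gains are small, assessing their statistical
significance is important. Partitioning the data lets you measure
the gain separately on each partition, and so judge whether the gain
is consistent or could be due to chance.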
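One way to make the remark concrete is the sketch below. It scores two
systems on each of k partitions of the test data and applies a two-sided
sign test to the paired scores. The choice of a sign test, and the names
partition_scores and sign_test_p, are assumptions made for illustration;
the notes do not say which test was used.

```python
from math import comb

def partition_scores(gold, pred, k):
    """Accuracy on each of k contiguous partitions (assumes k <= len(gold))."""
    n = len(gold)
    scores = []
    for i in range(k):
        lo, hi = i * n // k, (i + 1) * n // k
        correct = sum(g == p for g, p in zip(gold[lo:hi], pred[lo:hi]))
        scores.append(correct / (hi - lo))
    return scores

def sign_test_p(scores_a, scores_b):
    """Two-sided sign test p-value for paired scores; ties are dropped."""
    wins = sum(a > b for a, b in zip(scores_a, scores_b))
    losses = sum(a < b for a, b in zip(scores_a, scores_b))
    n = wins + losses
    if n == 0:
        return 1.0
    k = min(wins, losses)
    # Probability of a split at least this lopsided under a fair coin.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Made-up per-partition scores: system A beats system B on all 5 partitions.
scores_a = [0.81, 0.79, 0.83, 0.80, 0.82]
scores_b = [0.80, 0.78, 0.81, 0.79, 0.80]
print(sign_test_p(scores_a, scores_b))  # 0.0625
```

With only 5 partitions, even a clean sweep gives p = 0.0625, so more
partitions are needed before a small but consistent gain can reach the
usual 0.05 level under this test.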
Wee Meng Soon; Daniel Chung Yong Lim; Hwee Tou Ng. A Machine
Learning Approach to Coreference Resolution of Noun Phrases.
Computational Linguistics 27 #4, 521-545 (2001).
(1) Decision trees fragment the data space: they make a separate
decision about each conjunction of feature values, so each leaf is
estimated only from the examples that reach it. In contrast, Naive
Bayes and MaxEnt combine the effects of different features
multiplicatively: each feature has a separate effect on the end
result, independent of the other features.
(2) Fragmentation can be an issue because there isn't that much data
... for example, MUC-6 has a little over 1,000 coreference links (positive
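The contrast between the two model families can be sketched as follows.
This is an illustration under stated assumptions, not the method of any
of the papers: the feature names (string_match, gender_agree) and all
probabilities are made up, and the tree count is the worst case of one
leaf per conjunction of binary feature values.

```python
from math import prod

def naive_bayes_posterior(prior, likelihoods, features):
    """Posterior over labels, multiplying one factor per feature.

    prior:       {label: P(label)}
    likelihoods: {label: {name: P(feature is True | label)}}
    features:    {name: bool}
    """
    scores = {}
    for label, p in prior.items():
        factors = [
            likelihoods[label][name] if value else 1.0 - likelihoods[label][name]
            for name, value in features.items()
        ]
        scores[label] = p * prod(factors)  # each feature contributes independently
    total = sum(scores.values())
    return {label: s / total for label, s in scores.items()}

def parameter_counts(n_binary_features):
    """(Naive Bayes likelihood parameters, worst-case decision tree leaves)."""
    naive_bayes = 2 * n_binary_features   # one P(f | class) per feature, 2 classes
    tree_leaves = 2 ** n_binary_features  # one leaf per conjunction of values
    return naive_bayes, tree_leaves

# Hypothetical numbers, for illustration only.
prior = {"coref": 0.1, "not_coref": 0.9}
likelihoods = {
    "coref": {"string_match": 0.6, "gender_agree": 0.9},
    "not_coref": {"string_match": 0.05, "gender_agree": 0.5},
}
print(naive_bayes_posterior(prior, likelihoods,
                            {"string_match": True, "gender_agree": True}))
print(parameter_counts(10))  # (20, 1024)
```

With 10 binary features, the multiplicative model estimates 20
likelihoods from all of the data, while a fully grown tree could spread
the same examples over as many as 1,024 leaves.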
Vincent Ng; Claire Cardie. Improving Machine
Learning Approaches to Coreference Resolution. ACL 2002.
Sanda M. Harabagiu; Razvan C. Bunescu; Steven J. Maiorano. Text and
Knowledge Mining for Coreference Resolution. NAACL 2001.
Presentation by Yusuke Shinyama.