G22.2591 - Advanced Natural Language Processing - Spring 2011

Lecture 13

Coreference, cont'd

The past decade has seen a great deal of work on building supervised coreference components.  These vary in
Today we will consider three variants on this research:

Learning accurate constraints from corpora

Aria Haghighi and Dan Klein, Simple coreference resolution with rich syntactic and semantic features.  EMNLP 2009. 
Is it possible to do coreference without a statistical coref model?  Haghighi and Klein learn rich lexical, syntactic, and semantic constraints and combine them to create a high-performance coreference system without the need to train a coref model.  (Further improved in Haghighi and Klein, Coreference resolution in a modular, entity-centered model, NAACL 2010.)

Unsupervised models

Colin Cherry and Shane Bergsma.  An expectation maximization approach to pronoun resolution.  CoNLL 2005.

One of the earliest successful efforts at unsupervised coreference.

First limits the search space for antecedents of pronouns
Then defines a generative probabilistic model for a document with pronouns

P(resolved document) = product(pronoun positions) sum(c) P(p, k | c) P(c)

P(p, k | c) P(c) = P(p | c) P(k | c) P(c) = P(p | l) P(k | l) P(l) P(j)

where 'p' is the pronoun, 'k' is its context, 'c' is a candidate antecedent, 'l' is the lexical content of the antecedent c, and 'j' (the jump value) is its position in the candidate list
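To make the generative model concrete, here is a minimal Python sketch that evaluates the document probability for one pronoun. The function names, the dictionary layout, and all probability values are illustrative assumptions, not taken from the paper; candidates are assumed to be listed nearest-first, with j as the 1-based list position.

```python
import math

def pronoun_marginal(pronoun, context, candidates, p_pron, p_ctx, p_lex, p_jump):
    """Sum over candidates c of P(p | l) P(k | l) P(l) P(j)."""
    total = 0.0
    # j is the 1-based position of the candidate in the list (assumed nearest-first).
    for j, l in enumerate(candidates, start=1):
        total += p_pron[(pronoun, l)] * p_ctx[(context, l)] * p_lex[l] * p_jump[j]
    return total

def document_log_prob(instances, p_pron, p_ctx, p_lex, p_jump):
    """Log of the product over pronoun positions of the per-pronoun sums."""
    return sum(
        math.log(pronoun_marginal(p, k, cands, p_pron, p_ctx, p_lex, p_jump))
        for p, k, cands in instances
    )

# Hand-picked toy parameters (hypothetical values, for illustration only).
p_pron = {("he", "john"): 0.9, ("he", "mary"): 0.1}
p_ctx = {("said", "john"): 0.5, ("said", "mary"): 0.5}
p_lex = {"john": 0.5, "mary": 0.5}
p_jump = {1: 0.6, 2: 0.4}

# One pronoun "he" in context "said" with two candidate antecedents.
instances = [("he", "said", ["mary", "john"])]

marginal = pronoun_marginal("he", "said", ["mary", "john"], p_pron, p_ctx, p_lex, p_jump)
log_prob = document_log_prob(instances, p_pron, p_ctx, p_lex, p_jump)
print(round(marginal, 3))  # 0.105
```

Here "mary" is the nearer candidate but "john" contributes most of the probability mass, because P(he | john) outweighs the jump preference.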

We can then use EM to learn the separate probabilities in order to maximize the probability of the resolved document. This involves alternating E steps and M steps.

In the E-step, we compute (fractional) counts of antecedents for each pronoun. 

P(c | p, k) = P(p | l) P(k | l) P(l) P(j) / sum(c') P(p | l') P(k | l') P(l') P(j')

Here a particular c defines values for l and j.  Given these P(c | p, k), and hence the counts of <p, l>, <k, l>, ... we can compute maximum likelihood probabilities such as

P(p | l) = #<p, l> / #l

in the M step.
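The E and M steps above can be sketched as a short Python loop. This is only an illustration under stated assumptions: the toy instances, the function names, the uniform first E-step, and the fixed iteration count are all hypothetical choices, and the paper's own initialization and candidate filtering are not reproduced here.

```python
from collections import defaultdict

def e_step(instances, params):
    """Compute P(c | p, k) for the candidates of every pronoun instance."""
    posteriors = []
    for p, k, cands in instances:
        if params is None:
            # Assumed initialization: uniform over candidates on the first pass.
            scores = [1.0] * len(cands)
        else:
            p_pron, p_ctx, p_lex, p_jump = params
            scores = [
                p_pron.get((p, l), 0.0) * p_ctx.get((k, l), 0.0)
                * p_lex.get(l, 0.0) * p_jump.get(j, 0.0)
                for j, l in enumerate(cands, start=1)
            ]
        z = sum(scores)
        posteriors.append([s / z for s in scores])
    return posteriors

def m_step(instances, posteriors):
    """Re-estimate each distribution from fractional counts."""
    pron, ctx, lex, jump = (defaultdict(float) for _ in range(4))
    for (p, k, cands), post in zip(instances, posteriors):
        for j, (l, w) in enumerate(zip(cands, post), start=1):
            pron[(p, l)] += w   # #<p, l>
            ctx[(k, l)] += w    # #<k, l>
            lex[l] += w         # #l
            jump[j] += w        # #j
    n = float(len(instances))   # total count: each posterior sums to 1
    p_pron = {(p, l): c / lex[l] for (p, l), c in pron.items() if lex[l] > 0}
    p_ctx = {(k, l): c / lex[l] for (k, l), c in ctx.items() if lex[l] > 0}
    p_lex = {l: c / n for l, c in lex.items()}
    p_jump = {j: c / n for j, c in jump.items()}
    return p_pron, p_ctx, p_lex, p_jump

def em(instances, iterations=10):
    """Alternate E and M steps; return the parameters and final posteriors."""
    params = None
    for _ in range(iterations):
        posteriors = e_step(instances, params)
        params = m_step(instances, posteriors)
    return params, e_step(instances, params)

# Toy data: (pronoun, context, candidate antecedents listed nearest-first).
instances = [
    ("he", "ran", ["john", "mary"]),
    ("he", "ran", ["john", "report"]),
    ("she", "sang", ["mary", "john"]),
    ("she", "sang", ["mary", "report"]),
    ("it", "crashed", ["report", "john"]),
    ("it", "crashed", ["report", "mary"]),
]

params, posteriors = em(instances)
for (p, k, cands), post in zip(instances, posteriors):
    best = max(range(len(cands)), key=lambda i: post[i])
    print(p, k, "->", cands[best])
```

On this toy data each pronoun co-occurs with one candidate more often than with the others, so the fractional counts shift towards that candidate from one iteration to the next. No labelled antecedents are used at any point.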

They report a pronoun resolution accuracy of 66%, compared to 71% for a supervised SVM resolver trained on 1400 examples.

More recent work on unsupervised coreference ...

Vincent Ng, Unsupervised models for coreference resolution.  EMNLP 2008.

Hoifung Poon and Pedro Domingos, Joint unsupervised coreference resolution with Markov logic, EMNLP 2008.

Discourse constraints

David Bean and Ellen Riloff. Unsupervised learning of contextual role knowledge for coreference resolution. Proc. HLT/NAACL 2004.

Can we make use of discourse information -- event sequences -- to resolve coref?  Bean and Riloff learn pairs of predicates which are likely to govern references to the same entity, starting with reliable coreference pairs. Applied in two narrow domains, terrorism and disasters.  (Generalized by Liao and Grishman, Large corpus-based semantic feature extraction for pronoun coreference, Proceedings of the Second International Workshop on NLP Challenges in the Information Explosion Era (at COLING 2010)).