G22.2591 - Advanced Natural Language Processing - Spring 2009
Discuss Assignment #2 results.
What are relations?
Relations are facts involving a pair of entities.
ex: Moby Dick was written by Herman Melville.
ex: Columbia University is located in New York City.
ex: Ralph Grishman works for New York University.
They generally represent states rather than events (but the distinction
between relations and events is not clear-cut).
Many can be described as attributes of one of the entities.
Relations were introduced as an NLP task for MUC-7
(1997) and extended for ACE.
Why extract relations?
- Relations capture much of the connection between entities, and
can be used to build various entity networks (e.g., social networks).
- Many user queries ask about relations; if these can be
identified, the search engine can return the answer or at least the
sentence which probably has the answer. See for example the analysis
of Encarta questions in:
Eugene Agichtein, Silviu Cucerzan, and Eric Brill. Analysis
of Factoid Questions for Effective Relation Extraction (poster).
Proc. ACM SIGIR International Conference on Research and Development in Information Retrieval (SIGIR), 2005.
- In bioinformatics and genomics, much of the information is in relations
such as protein/gene interactions which can be extracted from articles
(see for example Claudio Giuliano, Alberto Lavelli, Lorenza Romano.
Exploiting Shallow Linguistic Information for Relation Extraction from Biomedical Literature, EACL 2006).
What is the challenge?
The challenge for relation extraction
is the usual challenge for NLP: a coverage or paraphrase problem
... figuring out all the ways in which a relation may be expressed, or
all the more specific predicates which may imply a given relation (is a
professor at ==> works for; is taking a tour of ==> is located in).
ACE has defined a set of 18 relations (2005 guidelines).
These include Employment, Part-Whole:Geographical, Located, Near, and Family.
Note that relations may occur between any types of mentions -- names,
nominals, or pronouns.
Thus the phrase "his uncle" gives rise to a Family relation between
"he" and "uncle".
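The structure of a relation mention can be made concrete with a small sketch. The class and field names below are illustrative choices, not part of any ACE toolkit; they simply mirror the ACE notion of a typed relation holding between two entity mentions of any mention level:

```python
# Illustrative sketch of an ACE-style relation mention (names are hypothetical).
from dataclasses import dataclass

@dataclass
class EntityMention:
    text: str          # surface string, e.g. "his"
    mention_level: str # "NAME", "NOMINAL", or "PRONOUN"
    entity_type: str   # e.g. "PER", "ORG", "GPE"

@dataclass
class RelationMention:
    rel_type: str      # e.g. "Personal-Social:Family"
    arg1: EntityMention
    arg2: EntityMention

# "his uncle" -> a Family relation between a pronoun and a nominal
rel = RelationMention(
    rel_type="Personal-Social:Family",
    arg1=EntityMention("his", "PRONOUN", "PER"),
    arg2=EntityMention("uncle", "NOMINAL", "PER"),
)
```

Note that neither argument here is a name; as the example in the text shows, relations hold between mentions at any level.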
What features should we use to detect relations?
Let's look at some examples
Augmented parse tree method
S. Miller, M. Crystal, H. Fox, L. Ramshaw, R. Schwartz, R. Stone, R.
Weischedel, and the Annotation Group (BBN Technologies). BBN:
Description of the SIFT System as Used for MUC-7. MUC-7 Proceedings.
For MUC-7, BBN introduced a statistical
model for recognizing binary relations between entities -- the
'template relation' task introduced in that evaluation. (This task
involved a small number of relations, such as person -- organization,
and organization -- location.) They used a generative model based
on a parse tree augmented with semantic labels. The augmentation
is somewhat complicated (see Figure 3 of the paper). In
simplified terms, if a relation connects nodes A and B in the parse
tree, and the lowest node dominating both A and B is C, then they add a
semantic label to A, B, and C, and to all nodes on the paths from C to
A and B. In addition, in some cases a node is added to the tree
to indicate the type of relation and the argument.
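The core of the augmentation (finding the lowest node C dominating both arguments and labeling the paths from C down to A and B) can be sketched as follows. This is not BBN's code, just a minimal illustration of that one step, with a bare-bones tree class and no node insertion:

```python
# Sketch of the path-labeling step: given argument nodes a and b in a parse
# tree, find their lowest common ancestor and attach a semantic label to it
# and to every node on the paths from it down to a and to b.
class Node:
    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.sem = None  # semantic augmentation, filled in by augment()

def path_to(root, target, path=None):
    """Return the list of nodes from root down to target, or None."""
    path = (path or []) + [root]
    if root is target:
        return path
    for child in root.children:
        found = path_to(child, target, path)
        if found:
            return found
    return None

def augment(root, a, b, rel_label):
    pa, pb = path_to(root, a), path_to(root, b)
    # the lowest common ancestor is the last node shared by the two paths
    lca_index = 0
    for i, (x, y) in enumerate(zip(pa, pb)):
        if x is y:
            lca_index = i
    # label the LCA and all nodes from it down to each argument
    for node in pa[lca_index:] + pb[lca_index:]:
        node.sem = rel_label
```

Running this on a small tree labels exactly the spine connecting the two arguments, which is what lets the lexicalized PCFG learn to regenerate those labels at parse time.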
A large training corpus of this form is generated in a semi-automatic
fashion. The relations are first annotated by hand. The
sentences are then parsed using a TreeBank-based parser, and the
resulting (syntactic) tree is augmented with information about the
relations. In this way a training corpus of about 1/2 million
words was produced. From this training corpus they then produce a
lexicalized probabilistic context-free grammar.
This grammar is then used to parse new (test) text, and the
relations present are gleaned from the semantic labels (if any) on the
nodes of the resulting tree.
Highlights of SVMs [Support Vector Machines]
For brief presentations of SVMs, see the chapter on support vector
machines from Introduction to Information Retrieval or the SVM
tutorial slides from Andrew Moore.
- linear classifier
- select separating plane to maximize margin
- support vectors = data points closest to separating plane
- dealing with noisy (non-separable) data: soft margin
- kernel methods: replace dot product with kernel function to capture similarity of data points without reducing each data point to a vector of features
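The last point is worth a concrete illustration. scikit-learn's SVC (assumed installed here) accepts a callable kernel, so the classifier never needs an explicit feature enumeration, only a function returning the Gram matrix of similarities. This toy example uses the plain dot product, which makes it equivalent to a linear SVM; any valid kernel could be dropped in instead:

```python
# Minimal sketch of an SVM with a user-supplied kernel function.
import numpy as np
from sklearn.svm import SVC

# toy 2-D points, separable on the first coordinate
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([0, 0, 1, 1])

def dot_kernel(A, B):
    # SVC calls this with two matrices and expects the Gram matrix back
    return A @ B.T

clf = SVC(kernel=dot_kernel, C=1.0)  # soft margin controlled by C
clf.fit(X, y)
# clf.support_ holds the indices of the support vectors --
# the training points closest to the separating plane
```

Replacing `dot_kernel` with a similarity over trees is exactly the move the next paper makes.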
Dmitry Zelenko, Chinatsu Aone, and Anthony Richardella.
Kernel Methods for Relation Extraction. J. Machine Learning Research
3 (2003) 1083-1106.
SRA addressed the same
relation-extraction problem differently. They used a partial
parser (roughly, a chunker) and they used a discriminative method
(SVMs) instead of a generative one. The parse tree nodes contain a type
and a head or text field (Figure 1). To represent a relation, the
nodes get a 'role' field; for example, to capture a
person-affiliation relation, one node (the person) gets role=member and
one node (the organization) gets role=affiliation.
One advantage of SVMs is that we do not
have to explicitly enumerate the features which are used to classify
examples; it is sufficient to provide a kernel function which,
roughly speaking, computes a similarity between examples. As
their kernel, they used a measure of similarity between two
trees. Basically, two trees are considered similar if their roots
have the same type and role, and each has a subsequence of children
(not necessarily consecutive) with the same types and roles. The
value of the similarity depends on how many such subsequences exist,
and how spread out they are. All the training examples are
converted into such shallow parse trees with role labels, and used to
train the system; the SVM can then classify new examples of relations.
They obtain an F measure of 0.87 for person-affiliation and 0.83 for
organization-location, although this is with hand-checked parses.
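The subsequence idea behind their kernel can be sketched in drastically simplified form. The version below omits the decay factors and the recursion into matching children that the actual kernel uses; it just counts the common (not necessarily contiguous) subsequences of two nodes' child-label sequences, given matching roots:

```python
# Simplified sketch of a subsequence-style tree kernel (not the full
# Zelenko et al. kernel: no decay weights, no recursive child matching).

def common_subsequences(s, t):
    """Count pairs of equal, non-empty subsequences of label lists s and t."""
    m, n = len(s), len(t)
    # f[i][j] = number of matching subsequence pairs of s[:i] and t[:j],
    # including the empty/empty pair (hence the baseline value 1)
    f = [[1] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            f[i][j] = f[i - 1][j] + f[i][j - 1] - f[i - 1][j - 1]
            if s[i - 1] == t[j - 1]:
                f[i][j] += f[i - 1][j - 1]
    return f[m][n] - 1  # drop the empty/empty pair

def tree_sim(t1, t2):
    """Trees are (label, children) pairs; label encodes type and role."""
    label1, kids1 = t1
    label2, kids2 = t2
    if label1 != label2:
        return 0  # roots must agree on type and role
    return 1 + common_subsequences([k[0] for k in kids1],
                                   [k[0] for k in kids2])
```

Two trees with identical roots and child sequences `[NP, VP]` share three common subsequences (`NP`, `VP`, and `NP VP`), so they score higher than trees sharing only one child, capturing the "how many such subsequences exist" intuition from the text.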
Shubin Zhao and Ralph Grishman.
Extracting Relations with Integrated Information Using Kernel Methods. Proc. ACL 2005.
The work at NYU used different types
of evidence to identify ACE relations: words, bigrams, the syntactic
path between the two arguments, and the local syntactic context of
each individual argument. A separate kernel function was written for
each, and then a composite kernel combining them all.
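The composite construction relies on the fact that a (positively) weighted sum of kernels is itself a valid kernel. A minimal sketch, with hypothetical field names and only two of the evidence sources named above:

```python
# Sketch of combining per-evidence kernels into a composite kernel.
# The example dicts and weights are illustrative, not from the NYU system.

def word_kernel(x, y):
    # overlap of the word sets of the two relation examples
    return len(set(x["words"]) & set(y["words"]))

def path_kernel(x, y):
    # exact match on the syntactic path between the two arguments
    return 1 if x["path"] == y["path"] else 0

def composite_kernel(x, y, w_words=1.0, w_path=2.0):
    # a weighted sum of kernels is again a kernel
    return w_words * word_kernel(x, y) + w_path * path_kernel(x, y)
```

Each evidence source can thus be developed and weighted independently, and the composite handed to the SVM as a single kernel function.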
Looking ahead: semi- and un-supervised methods
Sergei Brin. Extracting
Patterns and Relations from the World Wide Web.
In Proc. World Wide Web and
Databases International Workshop, pages 172-183. Number 1590 in
LNCS, Springer, March 1998.
Eugene Agichtein and Luis Gravano, Snowball:
Extracting Relations from Large Plain-Text Collections. In Proc. 5th ACM
International Conference on Digital Libraries (ACM DL), 2000
Takaaki Hasegawa, Satoshi Sekine, Ralph Grishman
Discovering Relations among Named Entities from Large Corpora. Proc. ACL 2004.