Coreference Task Definition

3 WHAT THINGS TO ANNOTATE

3.1 - Markables
3.2 - Names and Other Named Entities
3.3 - Gerunds
3.4 - Pronouns
3.5 - Bare Nouns
3.6 - Implicit Pronouns
3.7 - Conjoined Noun Phrases

3.1 Markables

The coreference relation will be marked between elements of the following categories: NOUNS, NOUN PHRASES, and PRONOUNS. Elements of these categories are MARKABLES. PRONOUNS include both personal and demonstrative pronouns, and with respect to personal pronouns, all cases, including the possessive. Dates ("January 23"), currency expressions ("$1.2 billion"), and percentages ("17%") are considered noun phrases.

The relation is marked only between pairs of elements both of which are markables. This means that some markables that look anaphoric will not be coded, including pronouns, demonstratives, and definite NPs whose antecedent is a clause rather than a markable. For example, in

Program trading is "a racket," complains Edward Egnuss, a White Plains, N.Y., investor and electronics sales executive, "and *it's not to the benefit of the small investor*, *that*'s for sure."

Though "that" is related to "it's not to the benefit of the small investor", the latter is not markable, so no antecedent is annotated for "that".

3.2 Names and Other Named Entities

Names and other Named Entities as defined in the MUC-6 document titled "Named Entity Task Definition" (dates, times, currency amounts, and percentages) are all markables. A substring of a Named Entity, however, is not a markable. Thus in

*London* ... *London*-based ...

the two instances of London are to be marked coreferential; in

*Reuters Holding PLC* ... *Reuters* announced that

"Reuters Holding PLC" and "Reuters" are to be marked coreferential. But in

Equitable of Iowa Cos. ... located in Iowa.

the two instances of "Iowa" are NOT to be marked as coreferential since the first is not a markable: it is a substring of a Named Entity. Date expressions recognized by the Named Entity task are also treated as atomic; components of a date are not separate markables. Thus, in

In a report issued January 5, 1995, the program manager said that there would be no new funds this year.

no relation is to be marked between "1995" and "this year".
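
The substring rule above can be checked mechanically once Named Entity spans are known. The following is a minimal sketch, not part of the task definition: it assumes spans are given as hypothetical (start, end) character offsets, and the function name `is_inside_named_entity` is invented for illustration.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive (assumed representation)


def is_inside_named_entity(candidate: Span, entities: List[Span]) -> bool:
    """Return True if candidate is a proper substring of some Named Entity span."""
    c_start, c_end = candidate
    for e_start, e_end in entities:
        covered = e_start <= c_start and c_end <= e_end
        if covered and (c_start, c_end) != (e_start, e_end):
            return True
    return False


text = "Equitable of Iowa Cos. ... located in Iowa."
# Hypothetical Named Entity output: the company name and the second "Iowa".
company = (0, len("Equitable of Iowa Cos."))
first_iowa = (text.index("Iowa"), text.index("Iowa") + len("Iowa"))
second_iowa = (text.rindex("Iowa"), text.rindex("Iowa") + len("Iowa"))
entities = [company, second_iowa]

print(is_inside_named_entity(first_iowa, entities))   # True: substring of a Named Entity, not a markable
print(is_inside_named_entity(second_iowa, entities))  # False: a Named Entity in its own right
```

The same test would apply to "1995" inside the date "January 5, 1995", provided the full date is supplied as an entity span.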

3.3 Gerunds

Gerunds (verbal forms using a present participle) are not markable. In

*Slowing the economy* is supported by some Fed officials; *it* is repudiated by others.

one should not mark the relation between "slowing the economy" and "it". A phrase headed by a present participle is taken to be verbal if it can take an object (as in the above example) or can be modified by an adverb.

Present participles that are modified by other nouns or adjectives ("program trading", "excessive spending"), are preceded by "the", or are followed by an "of" phrase ("the slowing of the economy") are to be considered noun-like and ARE markable.
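
The noun-like tests for participles can be written down as a simple decision over hand-supplied features. This is only an illustrative sketch: the `ParticiplePhrase` class and its field names are hypothetical, the features are assumed to come from an annotator or parser, and treating any noun-like cue as sufficient is an assumption, since the guidelines do not state a precedence between the tests.

```python
from dataclasses import dataclass


@dataclass
class ParticiplePhrase:
    """A phrase headed by a present participle, with hand-supplied features (hypothetical)."""
    text: str
    has_direct_object: bool = False         # verbal cue: "slowing the economy"
    has_adverb_modifier: bool = False       # verbal cue
    has_noun_or_adj_modifier: bool = False  # noun-like cue: "program trading"
    preceded_by_the: bool = False           # noun-like cue: "the slowing ..."
    followed_by_of_phrase: bool = False     # noun-like cue: "... of the economy"


def is_markable_participle(p: ParticiplePhrase) -> bool:
    """Markable only if at least one noun-like cue is present (assumed precedence)."""
    return (
        p.has_noun_or_adj_modifier
        or p.preceded_by_the
        or p.followed_by_of_phrase
    )


gerund = ParticiplePhrase("slowing the economy", has_direct_object=True)
nominal = ParticiplePhrase(
    "the slowing of the economy", preceded_by_the=True, followed_by_of_phrase=True
)
compound = ParticiplePhrase("program trading", has_noun_or_adj_modifier=True)

print(is_markable_participle(gerund))    # False
print(is_markable_participle(nominal))   # True
print(is_markable_participle(compound))  # True
```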

3.4 Pronouns

The possessive forms of pronouns used as determiners are markable. Thus in

its chairperson

there are two potential markables for relations: "its" and the entire NP, "its chairperson". Similarly, in "the man's arm", there are two markables.

First, second, and third-person pronouns are all markable, so in

"There is no business reason for *my* departure", *he* added.

"my" and "he" should be marked as coreferential. Reflexive pronouns are markable, so in

*He* shot *himself* with *his* revolver.

"He", "himself", and "his" should all be marked coreferential.

3.5 Bare Nouns

Prenominal occurrences of nouns, e.g., in compound nouns, are markable. Thus in

The price of *aluminum* siding has steadily increased, as the market for *aluminum* reacts to the strike in Chile.

the relation between the two occurrences of "aluminum" should be marked. Note that this presupposes that the two occurrences co-refer; they do, since both refer to the type of material.

While nouns in prenominal positions are markable, the noun which appears at the head of a noun phrase is not separately markable -- it is markable only as part of the entire noun phrase. Thus in the passage

Linguists are a strange bunch. Some linguists even like spinach.

it would not be correct to link the two instances of "linguists".
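
The distinction between prenominal nouns and the head noun can be illustrated with a small sketch. It assumes a hypothetical representation in which each noun phrase is a list of (word, is_noun) pairs plus the index of its head; neither the representation nor the function name comes from the task definition.

```python
from typing import List, Tuple


def separately_markable_nouns(tokens: List[Tuple[str, bool]], head_index: int) -> List[str]:
    """Return the nouns in an NP that are markable on their own.

    Prenominal nouns (nouns before the head) qualify; the head noun does not,
    because it is markable only as part of the entire noun phrase.
    """
    return [
        word
        for i, (word, is_noun) in enumerate(tokens)
        if is_noun and i < head_index
    ]


# "aluminum siding": "aluminum" is prenominal, "siding" is the head.
print(separately_markable_nouns([("aluminum", True), ("siding", True)], head_index=1))
# "Some linguists": the only noun is the head, so nothing is separately markable.
print(separately_markable_nouns([("Some", False), ("linguists", True)], head_index=1))
```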

3.6 Implicit Pronouns

Assume that English has no zero pronouns; in other words, the empty string is not markable. In

Bill called John and spoke with him for an hour.

there is no relation between the implicit subject of "spoke" and "Bill".

Do not code relations between a relative pronoun and the head it attaches to or the gap that it fills.

3.7 Conjoined Noun Phrases

Noun phrases which contain two or more heads (as defined in section 4.1) are NOT markable. This restriction is imposed so that each markable can be identified by a unique contiguous head substring. Thus no coreference is to be marked for

The boys and girls enjoy their breakfast.

The individual conjuncts are markable if they are separately coreferential with other phrases:

<COREF ID="1">Edna Fribble</COREF> and <COREF ID="2">Sam Morton</COREF> addressed the meeting yesterday. <COREF ID="3" REF="1" TYPE="IDENT" MIN="Fribble">Ms. Fribble</COREF> discussed coreference, and <COREF ID="4" REF="2" TYPE="IDENT" MIN="Morton">Mr. Morton</COREF> discussed unnamed entities.

If the conjuncts share modifiers, the coreference is optional:

<COREF ID="1" MIN="Fribble">Ms. Fribble</COREF> was <COREF ID="2" REF="1" TYPE="IDENT" STAT="OPT">president</COREF> and <COREF ID="3" REF="1" TYPE="IDENT" STAT="OPT" MIN="CEO"> CEO of Amalgamated Text Processing Inc.</COREF>
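
The COREF markup used in the examples above can be read mechanically to recover coreference chains. The sketch below is one possible way to do this, not an official MUC-6 tool: it uses regular expressions, assumes COREF elements are not nested, and follows each REF back to the markable that has no resolvable REF. The function names `parse_corefs` and `build_chains` are hypothetical.

```python
import re
from collections import defaultdict

# Assumption: COREF elements are flat (not nested), as in the examples in this section.
COREF_RE = re.compile(r"<COREF\s+([^>]*)>(.*?)</COREF>", re.DOTALL)
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_corefs(sgml):
    """Return one dict per COREF element: its attributes plus the enclosed TEXT."""
    markables = []
    for match in COREF_RE.finditer(sgml):
        attrs = dict(ATTR_RE.findall(match.group(1)))
        attrs["TEXT"] = match.group(2).strip()
        markables.append(attrs)
    return markables


def build_chains(markables):
    """Group markable texts by the ID reached when REF links are followed to their end."""
    by_id = {m["ID"]: m for m in markables}

    def root(mid):
        seen = set()  # guards against REF cycles in malformed input
        while mid not in seen and by_id[mid].get("REF") in by_id:
            seen.add(mid)
            mid = by_id[mid]["REF"]
        return mid

    chains = defaultdict(list)
    for m in markables:
        chains[root(m["ID"])].append(m["TEXT"])
    return dict(chains)


sample = (
    '<COREF ID="1">Edna Fribble</COREF> and <COREF ID="2">Sam Morton</COREF> '
    'addressed the meeting yesterday. '
    '<COREF ID="3" REF="1" TYPE="IDENT" MIN="Fribble">Ms. Fribble</COREF> '
    'discussed coreference, and '
    '<COREF ID="4" REF="2" TYPE="IDENT" MIN="Morton">Mr. Morton</COREF> '
    'discussed unnamed entities.'
)

result = build_chains(parse_corefs(sample))
print(result)
# {'1': ['Edna Fribble', 'Ms. Fribble'], '2': ['Sam Morton', 'Mr. Morton']}
```

Attributes such as MIN and STAT are kept in the parsed dicts, so a caller could, for example, filter out links marked STAT="OPT" before building chains.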


Coreference Task Definition - 31 MAY 95