Coreference Task Definition
In preparing the key, the text element to be enclosed in SGML tags is the maximal noun phrase; the head will be designated by the MIN attribute.
[We expect that in the future it may be possible, when separate noun phrase bracketings are available, to automatically generate the maximal NP markup from a markup using only heads.]
<COREF MIN="task" ...>the coreference task</COREF>
<COREF MIN="contract" ...>the last contract</COREF> you will ever get
<COREF MIN="quantity" ...>a large quantity of sugar</COREF>
<COREF MIN="tons" ...>about 200,000 tons of sugar</COREF>
If the head is a name, the entire name is marked. This includes suffixes such as "Sr.", "III", etc. on personal names and "Corp." on organization names; it does not include personal titles or any modifiers. We follow in this regard the rules for marking personal and organization names for the Named Entity task.
<COREF MIN="Frederick F. Fernwhistle Jr." ...>the Honorable Frederick F. Fernwhistle Jr.</COREF>
<COREF MIN="Ford Motor Co." ...>Ford Motor Co. of Dearborn, Michigan</COREF>
<COREF MIN="Georg Rath" ...>Herr Dr. Georg Rath</COREF>
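The name rule above can be approximated mechanically. The following is only a rough sketch, not part of the task definition: the function name_min and the TITLES word list are hypothetical, and the capitalization test is a crude stand-in for real name recognition. It handles the three examples above and does not attempt location designators.

```python
# Hypothetical title list; a real annotator would follow the Named Entity rules.
TITLES = {"the", "honorable", "herr", "dr.", "mr.", "ms.", "mrs."}

def name_min(noun_phrase):
    """Guess the MIN string for a noun phrase whose head is a name."""
    tokens = noun_phrase.split()
    # Skip the determiner and personal titles that precede the name.
    start = 0
    while start < len(tokens) and tokens[start].lower() in TITLES:
        start += 1
    # Keep capitalized tokens (including suffixes such as "Jr." or "Co.")
    # and stop at the first lower-case token, assumed to begin a modifier.
    end = start
    while end < len(tokens) and tokens[end][0].isupper():
        end += 1
    return " ".join(tokens[start:end])

for phrase in ("the Honorable Frederick F. Fernwhistle Jr.",
               "Ford Motor Co. of Dearborn, Michigan",
               "Herr Dr. Georg Rath"):
    print(phrase, "->", name_min(phrase))
```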
In the case of location designators consisting of multiple names, each name is considered a separate unit (as in the Named Entity task) and the head is generally the first of these names, with the others treated as modifiers of the first name:
<COREF MIN="Newark" ...>Newark, New Jersey</COREF>
Dates, currency amounts, and percentages are also treated as atomic units, as in the Named Entity task:
<COREF MIN="December 7, 1941" ...>December 7, 1941, a day which will live in infamy,</COREF>
<COREF MIN="$1.2 million" ...>$1.2 million in crisp bills</COREF>
<COREF MIN="20%" ...>20% of the shares</COREF>
In the case of "headless" constructions, the "head" -- for coreference purposes -- shall be the last token of the noun phrase preceding any prepositional phrases, relative clauses, and other "right modifiers":
<COREF MIN="seven" ...>seven of the best</COREF>
<COREF MIN="five" ...>the five who were left standing</COREF>
<COREF MIN="youngest" ...>the six youngest</COREF>
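As an illustration of the headless rule, the sketch below picks the last token before the first right modifier. It is a heuristic under stated assumptions, not an official procedure: the function headless_min is hypothetical, and RIGHT_MODIFIER_STARTS is an invented, incomplete list of words taken to open a prepositional phrase or relative clause.

```python
# Hypothetical word list: tokens treated as the start of a "right modifier".
RIGHT_MODIFIER_STARTS = {"of", "in", "on", "with", "from",
                         "who", "whom", "which", "that"}

def headless_min(noun_phrase):
    """Return the token just before the first right modifier, else the last token."""
    tokens = noun_phrase.split()
    for i, tok in enumerate(tokens):
        # Ignore position 0 so that a leading "that" is not mistaken for a modifier.
        if i > 0 and tok.lower() in RIGHT_MODIFIER_STARTS:
            return tokens[i - 1]
    # No right modifier found: the head is the final token of the phrase.
    return tokens[-1]

for phrase in ("seven of the best",
               "the five who were left standing",
               "the six youngest"):
    print(phrase, "->", headless_min(phrase))
```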
If the maximal noun phrase is the same as the head, the MIN need not be marked.
*Mr. Holland*
*the senior of the executives who will assume Holland's duties*
*the rumor that the war had ended*
*Fred Frosty, the ice cream king of Tyson's Corner,*
*the Penn Central Co., which used to run a railroad,*
XYZ Inc. formed *a joint venture with Sony*
Note that in the fourth and fifth cases the final comma may be viewed as part of the NP, and so is included in the maximal NP; in the last case, "with Sony" could equally well be taken to modify "venture" or "formed", and so is included as part of the maximal NP around "venture". Note also that in the "Fred Frosty" example, there is a coreference between the entire noun phrase and the appositional phrase, "the ice cream king of Tyson's Corner"; see section 5.3 for a discussion of this construct.
In the case of a pair of conjoined noun phrases with shared complements or modifiers, the maximal noun phrases will NOT include the conjunct. The maximal NP for the first conjunct will include all of the NP up to the conjunction; the maximal NP for the second conjunct will include all of the NP following the conjunction:
<COREF ID="1" MIN="Fribble">Ms. Fribble</COREF> was <COREF ID="2" REF="1" TYPE="IDENT" STAT="OPT">president</COREF> and <COREF ID="3" REF="1" TYPE="IDENT" STAT="OPT" MIN="CEO">CEO of Amalgamated Text Processing Inc.</COREF>
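To make the relationship between the maximal noun phrase and MIN concrete, here is a minimal sketch of reading such markup. It is an assumption-laden illustration rather than an official tool: read_coref and min_is_inside are hypothetical helpers, the regular expressions handle only flat (non-nested) COREF elements with fully written-out attributes, and a missing MIN is taken to mean that the whole marked string is the head.

```python
import re

# Hypothetical reader for flat COREF elements; nested markup is not handled.
COREF_RE = re.compile(r'<COREF\b([^>]*)>(.*?)</COREF>', re.DOTALL)
ATTR_RE = re.compile(r'(\w+)="([^"]*)"')

def read_coref(text):
    """Return one dict per COREF element, with MIN defaulting to the full string."""
    elements = []
    for attrs, body in COREF_RE.findall(text):
        record = dict(ATTR_RE.findall(attrs))
        record["TEXT"] = body.strip()
        # When MIN is omitted, the maximal noun phrase is itself the head.
        record.setdefault("MIN", record["TEXT"])
        elements.append(record)
    return elements

def min_is_inside(record):
    """Check that the MIN string actually occurs inside the marked phrase."""
    return record["MIN"] in record["TEXT"]

sample = ('<COREF ID="1" MIN="Fribble">Ms. Fribble</COREF> was '
          '<COREF ID="2" REF="1" TYPE="IDENT" STAT="OPT">president</COREF> and '
          '<COREF ID="3" REF="1" TYPE="IDENT" STAT="OPT" MIN="CEO">CEO of '
          'Amalgamated Text Processing Inc.</COREF>')

for rec in read_coref(sample):
    print(rec["ID"], repr(rec["MIN"]), repr(rec["TEXT"]), min_is_inside(rec))
```

A check like min_is_inside can catch keys in which the MIN string was mistyped or does not appear in the enclosed phrase.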