A.1 Guidelines That Pertain to All Three TYPEs

A.1.1 - Entity names used as modifiers in complex NPs that are not proper names are to be tagged when it is clear to the annotator from context or the annotator's knowledge of the world that the name is that of an organization or person.
A.1.2 - In addition, entity names modifying person identifiers (where person identifier = title/role and/or name) are to be tagged.
A.1.3 - In some cases, multiword strings that are proper names will contain entity name substrings; such strings are not decomposable; therefore, the substrings are not to be tagged. (See A.1.2 regarding special cases involving prenominal modifiers of person identifiers.)
A.1.4 - Uncapitalized, common-noun designators such as "division" in the phrase "Chrysler division" are *not* considered part of an entity name.
A.1.5 - In a possessive construction, the possessor and possessed ENAMEX substrings should be tagged separately.
A.1.6 - Aliases for entities are to be tagged. (For further information on what constitutes an alias, see the "Information Extraction Task Definition.")
A.1.7 - Miscellaneous types of proper names that are *not* to be tagged as ENAMEX include artifacts, other products, and plural names that do not identify a single, unique entity. (For information on the treatment of facilities, see section A.2, below.)

A.1.1 Entity names used as modifiers in complex NPs that are not proper names are to be tagged when it is clear to the annotator from context or the annotator's knowledge of the world that the name is that of an organization or person.

"Bridgestone profits"

<ENAMEX TYPE="ORGANIZATION">Bridgestone</ENAMEX> profits

"the Clinton government"

the <ENAMEX TYPE="PERSON">Clinton</ENAMEX> government

"Treasury bonds and securities"

<ENAMEX TYPE="ORGANIZATION">Treasury</ENAMEX> bonds and securities

"U.S. exporters"

<ENAMEX TYPE="LOCATION">U.S.</ENAMEX> exporters
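
The markup in these examples can be read back mechanically. The following is a minimal Python sketch, not part of the task definition, that pulls the (TYPE, string) pairs out of an annotated line and recovers the untagged text. The function names are hypothetical, and the pattern assumes ENAMEX elements are never nested and carry only a TYPE attribute.

```python
import re

# Matches one ENAMEX element; assumes no nesting and a single TYPE attribute.
ENAMEX_RE = re.compile(r'<ENAMEX TYPE="([A-Z]+)">(.*?)</ENAMEX>')

def extract_entities(annotated):
    """Return (type, text) pairs for every ENAMEX element in the string."""
    return [(m.group(1), m.group(2)) for m in ENAMEX_RE.finditer(annotated)]

def strip_markup(annotated):
    """Return the text with the ENAMEX tags removed."""
    return ENAMEX_RE.sub(lambda m: m.group(2), annotated)

line = '<ENAMEX TYPE="ORGANIZATION">Bridgestone</ENAMEX> profits'
print(extract_entities(line))   # [('ORGANIZATION', 'Bridgestone')]
print(strip_markup(line))       # Bridgestone profits
```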

A.1.2 In addition, entity names modifying person identifiers (where person identifier = title/role and/or name) are to be tagged.

"Mips Vice President John Hime" [Mips is the name of a computer company]

<ENAMEX TYPE="ORGANIZATION">Mips</ENAMEX> Vice President <ENAMEX TYPE="PERSON">John Hime</ENAMEX>

"Treasury Secretary"

<ENAMEX TYPE="ORGANIZATION">Treasury</ENAMEX> Secretary

"the U.S. Vice President"

the <ENAMEX TYPE="LOCATION">U.S.</ENAMEX> Vice President

A.1.3 In some cases, multiword strings that are proper names will contain entity name substrings; such strings are not decomposable; therefore, the substrings are not to be tagged. (See A.1.2 regarding special cases involving prenominal modifiers of person identifiers.)

"Arthur Anderson Consulting"

<ENAMEX TYPE="ORGANIZATION">Arthur Anderson Consulting</ENAMEX>

[no markup for "Arthur Anderson" alone]

"Boston Chicken Corp."

<ENAMEX TYPE="ORGANIZATION">Boston Chicken Corp.</ENAMEX>

[no markup for "Boston" alone]

"U.S. Fish and Wildlife Service"

<ENAMEX TYPE="ORGANIZATION">U.S. Fish and Wildlife Service</ENAMEX>

[no markup for "U.S." alone]

"Northern California"

<ENAMEX TYPE="LOCATION">Northern California</ENAMEX>

[no markup for "California" alone]

"West Texas"

<ENAMEX TYPE="LOCATION">West Texas</ENAMEX>

[no markup for "Texas" alone]

"Ford Taurus"

Ford Taurus

[no markup, not even for "Ford"]

"Dow Jones Industrial Average"

Dow Jones Industrial Average

[no markup, not even for "Dow Jones"]

"West Texas Intermediate crude"

West Texas Intermediate crude

[no markup, not even for "West Texas" -- this example differs from those in A.1.1 in that "West Texas Intermediate" is itself a name and therefore cannot be decomposed]
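
One way an automatic tagger might respect non-decomposability is to prefer the longest known name at each position. The sketch below assumes a hypothetical name table that maps each known proper name either to an ENAMEX TYPE or to None (a name that is not to be tagged at all); the table, the function names, and the simple word-boundary test are illustrative assumptions, not part of the guidelines.

```python
def _at_boundary(text, start, end):
    """True if text[start:end] is not glued to a letter or digit on either side."""
    before_ok = start == 0 or not text[start - 1].isalnum()
    after_ok = end == len(text) or not text[end].isalnum()
    return before_ok and after_ok

def tag_longest_match(text, names):
    """Tag text, always choosing the longest known name at each position.

    `names` maps a proper name to an ENAMEX TYPE, or to None when the name
    must stay untagged (and its substrings must not be tagged either).
    """
    # Longer names are tried first, so a substring never beats the full name.
    ordered = sorted(names, key=len, reverse=True)
    out, i = [], 0
    while i < len(text):
        for name in ordered:
            end = i + len(name)
            if text.startswith(name, i) and _at_boundary(text, i, end):
                etype = names[name]
                out.append(name if etype is None
                           else '<ENAMEX TYPE="%s">%s</ENAMEX>' % (etype, name))
                i = end
                break
        else:
            # No known name starts here; copy one character and move on.
            out.append(text[i])
            i += 1
    return "".join(out)

# Hypothetical name table built from the examples in this section.
NAMES = {
    "West Texas Intermediate": None,
    "West Texas": "LOCATION",
    "Texas": "LOCATION",
}
print(tag_longest_match("West Texas Intermediate crude", NAMES))
print(tag_longest_match("oil from West Texas", NAMES))
```

With this table, the first call leaves the string untouched, while the second tags "West Texas" as a single LOCATION rather than tagging "Texas" alone.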

A.1.4 Uncapitalized, common-noun designators such as "division" in the phrase "Chrysler division" are *not* considered part of an entity name.

"Chrysler division"

<ENAMEX TYPE="ORGANIZATION">Chrysler</ENAMEX> division

"the Kennedy family"

the <ENAMEX TYPE="PERSON">Kennedy</ENAMEX> family

"the Southwest region"

the <ENAMEX TYPE="LOCATION">Southwest</ENAMEX> region
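
A rough consistency check for this rule can be sketched in Python: flag any tagged string whose last word is uncapitalized, since that word is probably a common-noun designator that belongs outside the tags. This is a heuristic under stated assumptions (non-nested ENAMEX elements, a hypothetical function name), and its hits would still need review by an annotator.

```python
import re

# Matches one ENAMEX element; assumes no nesting and a single TYPE attribute.
ENAMEX_RE = re.compile(r'<ENAMEX TYPE="([A-Z]+)">(.*?)</ENAMEX>')

def trailing_designator_errors(annotated):
    """Return tagged strings whose last word starts with a lowercase letter."""
    errors = []
    for m in ENAMEX_RE.finditer(annotated):
        words = m.group(2).split()
        if words and words[-1][0].islower():
            errors.append(m.group(2))
    return errors

wrong = '<ENAMEX TYPE="ORGANIZATION">Chrysler division</ENAMEX>'
right = '<ENAMEX TYPE="ORGANIZATION">Chrysler</ENAMEX> division'
print(trailing_designator_errors(wrong))  # ['Chrysler division']
print(trailing_designator_errors(right))  # []
```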

A.1.5 In a possessive construction, the possessor and possessed ENAMEX substrings should be tagged separately.

"Temple University's Graduate School of Business"

<ENAMEX TYPE="ORGANIZATION">Temple University</ENAMEX>'s <ENAMEX TYPE="ORGANIZATION">Graduate School of Business</ENAMEX>

"Shearson Lehman Hutton's OTC department"

<ENAMEX TYPE="ORGANIZATION">Shearson Lehman Hutton</ENAMEX>'s <ENAMEX TYPE="ORGANIZATION">OTC</ENAMEX> department

"California's Silicon Valley"

<ENAMEX TYPE="LOCATION">California</ENAMEX>'s <ENAMEX TYPE="LOCATION">Silicon Valley</ENAMEX>

"Canada's Parliament"

<ENAMEX TYPE="LOCATION">Canada</ENAMEX>'s <ENAMEX TYPE="ORGANIZATION">Parliament</ENAMEX>
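
The separate-tagging rule for possessives lends itself to a similar heuristic check: report any tagged string that contains a possessive 's followed by another word, which suggests that possessor and possessed were tagged as one unit. The function name is hypothetical, and a name that itself contains a possessive would be reported too, so this is only a screening aid.

```python
import re

# Matches one ENAMEX element; assumes no nesting and a single TYPE attribute.
ENAMEX_RE = re.compile(r'<ENAMEX TYPE="([A-Z]+)">(.*?)</ENAMEX>')

def possessive_inside_tag(annotated):
    """Return tagged strings that contain 's followed by another word."""
    return [m.group(2) for m in ENAMEX_RE.finditer(annotated)
            if re.search(r"'s\s+\S", m.group(2))]

merged = '<ENAMEX TYPE="LOCATION">California\'s Silicon Valley</ENAMEX>'
split = ('<ENAMEX TYPE="LOCATION">California</ENAMEX>\'s '
         '<ENAMEX TYPE="LOCATION">Silicon Valley</ENAMEX>')
print(possessive_inside_tag(merged))  # ["California's Silicon Valley"]
print(possessive_inside_tag(split))   # []
```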

A.1.6 Aliases for entities are to be tagged. (For further information on what constitutes an alias, see the "Information Extraction Task Definition.")

"IBM" [alias for International Business Machines Corp.]

<ENAMEX TYPE="ORGANIZATION">IBM</ENAMEX>

"Big Blue" [alias for International Business Machines Corp.]

<ENAMEX TYPE="ORGANIZATION">Big Blue</ENAMEX>

"Big Board" [alias for New York Stock Exchange]

<ENAMEX TYPE="ORGANIZATION">Big Board</ENAMEX>

"Mr. Fix-It" [nickname for candidate for head of the CIA]

Mr. <ENAMEX TYPE="PERSON">Fix-It</ENAMEX>

"the Big Apple" [nickname for New York City]

<ENAMEX TYPE="LOCATION">the Big Apple</ENAMEX>
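
As an illustration only, alias handling can be pictured as a lookup table from alias to the entity it names and that entity's TYPE. The table below is assembled from the organization examples in this section; the table, the function names, and the idea of storing the full name alongside the TYPE are assumptions made for this sketch, not something the task definition prescribes.

```python
# Hypothetical alias table built from the examples in this section.
ALIASES = {
    "IBM": ("International Business Machines Corp.", "ORGANIZATION"),
    "Big Blue": ("International Business Machines Corp.", "ORGANIZATION"),
    "Big Board": ("New York Stock Exchange", "ORGANIZATION"),
}

def tag_alias(alias):
    """Return ENAMEX markup for a known alias, or the string unchanged."""
    entry = ALIASES.get(alias)
    if entry is None:
        return alias
    _full_name, etype = entry
    return '<ENAMEX TYPE="%s">%s</ENAMEX>' % (etype, alias)

def same_entity(alias_a, alias_b):
    """True if both aliases are known and name the same entity."""
    a, b = ALIASES.get(alias_a), ALIASES.get(alias_b)
    return a is not None and b is not None and a[0] == b[0]

print(tag_alias("Big Blue"))            # <ENAMEX TYPE="ORGANIZATION">Big Blue</ENAMEX>
print(same_entity("IBM", "Big Blue"))   # True
```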

A.1.7 Miscellaneous types of proper names that are *not* to be tagged as ENAMEX include artifacts, other products, and plural names that do not identify a single, unique entity. (For information on the treatment of facilities, see section A.2, below.)

"Macintosh computers"

Macintosh computers

[no markup]

"Wall Street Journal"

Wall Street Journal

[no markup]

"the Campbell Soups of the world"

the Campbell Soups of the world

[no markup]


Named Entity Task Definition - 02 JUN 95