Named Entity Task Definition

2 TASK OVERVIEW

2.1 - Markup Description
2.2 - Named Entities (ENAMEX tag element)
2.3 - Temporal Expressions (TIMEX tag element)
2.4 - Number Expressions (NUMEX tag element)

2.1 Markup Description

The output of the systems to be evaluated will be in the form of SGML text markup. The only insertions allowed during tagging are tags enclosed in angle brackets. No extra whitespace or carriage returns may be inserted, because any additional character would change the offset count and adversely affect scoring.
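The insertion rule can be checked mechanically: removing every tag from the tagged text should give back the original text exactly. The following is a minimal sketch in Python, not part of the MUC-6 tools; the function names and the sample sentence are hypothetical, and it assumes that only the three elements defined in this section appear in the markup.

```python
import re

# Matches an opening or closing tag of the three elements defined in
# sections 2.2 to 2.4. Assumption: no other SGML tags are inserted.
TAG_RE = re.compile(r'</?(?:ENAMEX|TIMEX|NUMEX)\b[^<>]*>')


def strip_tags(tagged: str) -> str:
    """Remove the named-entity tags and leave all other characters untouched."""
    return TAG_RE.sub('', tagged)


def only_tags_inserted(original: str, tagged: str) -> bool:
    """True when the tagged text differs from the original only by inserted tags."""
    return strip_tags(tagged) == original


# Hypothetical sample sentence, used for illustration only.
original = "Taga Co. said profits rose."
tagged = '<ENAMEX TYPE="ORGANIZATION">Taga Co.</ENAMEX> said profits rose.'

print(only_tags_inserted(original, tagged))
print(only_tags_inserted(original, tagged + "\n"))  # extra carriage return
```

Any stray whitespace added by a tagger makes the comparison fail, which is the same condition that would shift the offsets used in scoring.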

The markup will have the following form:

<ELEMENT-NAME ATTR-NAME="ATTR-VALUE" ...>text-string</ELEMENT-NAME>

Example:

<ENAMEX TYPE="ORGANIZATION">Taga Co.</ENAMEX>
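As an illustration of the form shown above, the sketch below splits each tagged element into its element name, its attributes, and its text string. It is a simplified regular-expression reader rather than a real SGML parser; the function name is hypothetical, and it assumes upper-case names, double-quoted attribute values, and no nested elements.

```python
import re

# <ELEMENT-NAME ATTR-NAME="ATTR-VALUE" ...>text-string</ELEMENT-NAME>
# Assumptions: upper-case names, double-quoted values, no nesting.
ELEMENT_RE = re.compile(
    r'<(?P<name>[A-Z]+)'
    r'(?P<attrs>(?:\s+[A-Z-]+="[^"]*")*)\s*>'
    r'(?P<text>.*?)'
    r'</(?P=name)>',
    re.DOTALL,
)
ATTR_RE = re.compile(r'([A-Z-]+)="([^"]*)"')


def parse_elements(tagged: str):
    """Return a list of (element name, attribute dict, text string) tuples."""
    elements = []
    for match in ELEMENT_RE.finditer(tagged):
        attrs = dict(ATTR_RE.findall(match.group('attrs')))
        elements.append((match.group('name'), attrs, match.group('text')))
    return elements


print(parse_elements('<ENAMEX TYPE="ORGANIZATION">Taga Co.</ENAMEX>'))
# → [('ENAMEX', {'TYPE': 'ORGANIZATION'}, 'Taga Co.')]
```

For checking real submissions, the DTD-based validation tools remain the authoritative route; this sketch is only meant to make the markup form concrete.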

The markup is defined in SGML Document Type Definitions (DTDs), written for MUC-6 use by personnel at MITRE and maintained by personnel at NRaD. The DTDs enable annotators and system developers to use SGML validation tools to check the correctness of the SGML-tagged texts produced by the annotator or the system. The validation tools are available to MUC-6 participants in the file called muc6-sgml-tools. Annotators are using a software tool provided for MUC-6 by SRA Corporation to assist in generating the answer keys to be used for system training and testing.

2.2 Named Entities (ENAMEX tag element)

This subtask is limited to proper names, acronyms, and perhaps miscellaneous other unique identifiers, which are categorized via the TYPE attribute as follows:

ORGANIZATION: named corporate, governmental, or other organizational entity

PERSON: named person or family

LOCATION: name of politically or geographically defined location (cities, provinces, countries, international regions, bodies of water, mountains, etc.)

2.3 Temporal Expressions (TIMEX tag element)

This subtask is for "absolute" temporal expressions only; an explanation is provided in Appendix B. The tagged tokens are categorized via the TYPE attribute as follows:

DATE: complete or partial date expression

TIME: complete or partial expression of time of day

2.4 Number Expressions (NUMEX tag element)

This subtask is for two useful types of numeric expressions: monetary expressions and percentages. The numbers may be expressed in either numeric or alphabetic form.

The task covers the complete expression, which is categorized via the TYPE attribute as follows:

MONEY: monetary expression

PERCENT: percentage
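Taken together, sections 2.2 through 2.4 define three tag elements and seven TYPE values. The sketch below collects them in a lookup table that a system developer might use as a quick sanity check before running the DTD-based validation. The table and function names are hypothetical; the element names and TYPE values are those listed above.

```python
# TYPE values permitted for each tag element, as listed in sections 2.2 to 2.4.
ALLOWED_TYPES = {
    "ENAMEX": {"ORGANIZATION", "PERSON", "LOCATION"},
    "TIMEX": {"DATE", "TIME"},
    "NUMEX": {"MONEY", "PERCENT"},
}


def is_valid_type(element: str, type_value: str) -> bool:
    """True when type_value is a TYPE listed for the given tag element."""
    return type_value in ALLOWED_TYPES.get(element, set())


print(is_valid_type("ENAMEX", "ORGANIZATION"))  # → True
print(is_valid_type("TIMEX", "MONEY"))          # → False
```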


Named Entity Task Definition - 02 JUN 95