Prior MUCs had focused on a single task of "information extraction": analyzing free text, identifying events of a specified type, and filling a database template with information about each such event. Over the course of the five MUCs, the tasks and templates had become increasingly complicated. A meeting in December 1993, held after MUC-5 and chaired by Ralph Grishman, defined a broader set of objectives for the forthcoming MUCs: to push information extraction systems towards greater portability to new domains, and to encourage more basic work on natural language analysis by providing evaluations of some basic language analysis technologies.
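NYU and NRaD worked together to develop specifications for a set of four evaluation tasks: named entity recognition, coreference, template elements, and scenario templates (the traditional information extraction task).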
The formal MUC-6 evaluation was held in September 1995, and the MUC-6 Conference was held in Columbia, Maryland in November 1995. A proceedings of this conference, including descriptions of the systems from all the participants, is being assembled and will be distributed by Morgan Kaufmann.
Named Entity Task Definition, version 2.0 (31 May 95)
Tokenization Rules, version 1.2 (10 Feb 95)
Coreference Task Definition, version 2.1 (21 Mar 95)
Information Extraction Task Definition, version 1.5 (18 Apr 95)
A Revised Template Description for Time (v3)
Supplement to Time Treatment Used for MUC-5
The task specifications are available as compressed tar files for downloading by anonymous ftp from cs.nyu.edu, directory pub/nlp/muc6/, in both PostScript form (file ps.tar.Z) and text form (file text.tar.Z).
Evaluation for the information extraction task will be done with respect to a particular "scenario" (type of event). In order to reduce the time that participants spend becoming expert on a particular domain, and to encourage the development of tools to port systems to new domains, this scenario will be released only one month before the evaluation.
Two example scenarios are available, one involving orders for aircraft, the other involving labor negotiations:
Sample Scenario on Aircraft Orders, version 1.1 (22 Feb 95)
Example Templates for Aircraft Orders, version 1.2 (24 Mar 95)
Sample Scenario on Labor Negotiations, version 1.4 (20 Apr 95)
The labor negotiation scenario was used for the dry run in April 1995. The articles used for the (Aircraft Order) example templates are taken from the Wall Street Journal and are available in machine-readable form on the ACL/DCI disk, which is distributed by the Linguistic Data Consortium. In addition, systems can be evaluated on their ability to fill the template elements for people and organizations, independent of a particular scenario.
(Last updated April 25, 1996.)