MUC-6

Over the last eight years, a series of five Message Understanding Conferences (MUCs) has been organized by Beth Sundheim of the Naval Research and Development group (NRaD) of NCCOSC (previously NOSC). These conferences, which evaluate information extraction systems applied to a common task, have been funded by ARPA to measure and foster progress in information extraction.

A meeting held in December 1993, after the last MUC, and chaired by Ralph Grishman defined a broader set of objectives for the forthcoming MUCs: to push information extraction systems towards greater portability to new domains, and to encourage more basic work on natural language analysis by providing evaluations of some basic language analysis technologies.

NYU is cooperating with NRaD to organize the next MUC, in September 1995, which will incorporate some of these new evaluation tasks. Three task specifications are currently under development:

named entity recognition
coreference
information extraction "mini-MUC" (template filling)

These tasks are currently being refined through a process of corpus annotation and a "dry run" evaluation, to be held in April 1995. The dry run is limited to a group of researchers who have been involved in prior MUCs and the corpus annotation effort.

A call for participation in MUC-6 will be made in May 1995; the official evaluations will be conducted in September 1995. These evaluations will include the three tasks listed above and may also include a syntactic bracketing task ("Parseval"). Participants may enter one or more of the evaluations.

Named Entity Recognition

The Named Entity task for MUC-6 involves the recognition of entity names (for people and organizations), place names, temporal expressions, and certain types of numerical expressions. This task is intended both to be of direct practical value (in annotating text so that it can be searched for names, places, dates, etc.) and to serve as an essential component of many language processing tasks, such as information extraction.
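As a rough illustration of what annotating text with named entities can look like, the Python sketch below wraps already-identified spans in inline tags. The `ENTITY` tag, the category labels, and the helper names `Entity`, `span`, and `annotate` are all assumptions made for this example; they are not the markup or categories fixed by the MUC-6 task definition, and the sketch does not attempt recognition itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    start: int  # character offset where the span begins
    end: int    # character offset just past the end of the span
    kind: str   # hypothetical category label, e.g. "PERSON"


def span(text: str, phrase: str, kind: str) -> Entity:
    """Locate the first occurrence of phrase in text and label it."""
    start = text.index(phrase)
    return Entity(start, start + len(phrase), kind)


def annotate(text: str, entities: list[Entity]) -> str:
    """Wrap each entity span in an inline tag; spans must not overlap."""
    out = []
    pos = 0
    for ent in sorted(entities, key=lambda e: e.start):
        if ent.start < pos:
            raise ValueError("overlapping entity spans")
        out.append(text[pos:ent.start])
        body = text[ent.start:ent.end]
        # Hypothetical tag format, used only for illustration.
        out.append(f'<ENTITY TYPE="{ent.kind}">{body}</ENTITY>')
        pos = ent.end
    out.append(text[pos:])
    return "".join(out)


text = "Ralph Grishman of NYU spoke in December 1993."
entities = [
    span(text, "Ralph Grishman", "PERSON"),
    span(text, "NYU", "ORGANIZATION"),
    span(text, "December 1993", "DATE"),
]
annotated = annotate(text, entities)
print(annotated)
```

Keeping the spans as character offsets, separate from the text, makes it straightforward to compare one set of annotations against another before any markup is generated.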

Coreference

The Coreference task for MUC-6 involves the identification of coreference relations among noun phrases.

Coreference Task Definition, version 2.1 (21 Mar 95)
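One way to picture coreference relations among noun phrases is as pairwise links that group mentions into chains. The Python sketch below merges such links with a simple union-find structure. The mention identifiers, the example phrases, and the function name `coreference_chains` are hypothetical, and treating coreference as a set of equivalence classes is an assumption of this sketch, not something taken from the task definition.

```python
def coreference_chains(mentions, links):
    """Group mention ids into chains, given pairwise coreference links.

    mentions: list of mention ids, in document order
    links:    iterable of (id, id) pairs asserted to corefer
    Returns a list of chains; mentions with no links are omitted.
    """
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent pointers to the representative, compressing the path.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in links:
        parent[find(a)] = find(b)

    chains = {}
    for m in mentions:
        chains.setdefault(find(m), []).append(m)
    return [chain for chain in chains.values() if len(chain) > 1]


# Hypothetical noun-phrase mentions, keyed by id.
phrases = {
    "m1": "the airline",
    "m2": "it",
    "m3": "the manufacturer",
    "m4": "the carrier",
}
links = [("m1", "m2"), ("m2", "m4")]
chains = coreference_chains(list(phrases), links)
print(chains)
```

In this example the three mentions linked directly or indirectly end up in one chain, while the unlinked mention `m3` is left out.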

Information Extraction ("mini-MUC")

The template-filling task for MUC-6 involves the extraction of information about a specified class of events and the filling of a template for each instance of such an event. In contrast to MUC-5, the effort has been to design relatively simple templates and to predefine the low-level templates (for people, organizations, and artifacts) which would apply to a wide variety of different event types.

Information Extraction Task Definition, version 1.1 (22 Feb 95)

Evaluation for the information extraction task will be done with respect to a particular "scenario" (type of event). In order to reduce the time which participants spend becoming expert on a particular domain, and to encourage the development of tools to port systems to new domains, this scenario will be released only one month before the evaluation. An example scenario, involving orders for aircraft, is available.

The articles used for the example templates are taken from the Wall Street Journal and are available in machine-readable form on the ACL/DCI disk, which is distributed by the Linguistic Data Consortium.
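The split between predefined low-level templates and a scenario-specific event template can be sketched with a few Python data classes. Everything named here is an assumption for illustration: the class names, the slot names, the `AircraftOrder` event, and the placeholder values are invented and do not reproduce the slots in the task definition or the released example scenario.

```python
from dataclasses import dataclass, field, fields
from typing import Optional


# Low-level templates, intended to be reusable across event types.
@dataclass
class Organization:
    name: str
    kind: Optional[str] = None  # hypothetical slot, e.g. "COMPANY"


@dataclass
class Person:
    name: str
    title: Optional[str] = None  # hypothetical slot


@dataclass
class Artifact:
    name: str
    kind: Optional[str] = None  # hypothetical slot, e.g. "AIRCRAFT"


# Scenario-specific event template that points at the low-level ones.
@dataclass
class AircraftOrder:
    purchaser: Optional[Organization] = None
    supplier: Optional[Organization] = None
    aircraft: list[Artifact] = field(default_factory=list)
    quantity: Optional[int] = None


def filled_slots(template) -> list[str]:
    """Return the names of slots that hold a non-empty value."""
    names = []
    for f in fields(template):
        value = getattr(template, f.name)
        if value is not None and value != []:
            names.append(f.name)
    return names


# One event instance, filled with placeholder values.
order = AircraftOrder(
    purchaser=Organization("Example Air", kind="COMPANY"),
    aircraft=[Artifact("Model X", kind="AIRCRAFT")],
    quantity=12,
)
print(filled_slots(order))
```

Because only `AircraftOrder` is tied to the scenario, moving to a different event type under this layout would mean defining a new event class while reusing `Organization`, `Person`, and `Artifact` unchanged.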

(Information about other natural language processing research is available from the NYU Proteus Project.)