MUC-6
Over the last eight years, a series of five MUCs (Message
Understanding Conferences) has been organized by Beth Sundheim of the
Naval Research and Development group (NRaD) of NCCOSC (previously
NOSC). These conferences, in which information extraction systems
are evaluated on a common task, have been funded by ARPA to measure
and foster progress in information extraction.
A meeting in December 1993, following the last MUC, and chaired by
Ralph Grishman, defined a broader set of objectives for the
forthcoming MUCs: to push information extraction systems towards
greater portability to new domains, and to encourage more basic work
on natural language analysis by providing evaluations of some basic
language analysis technologies.
NYU is cooperating with NRaD to organize the next MUC, in September
1995, which will incorporate some of these new evaluation tasks.
Three task specifications are currently under development:
- named entity recognition
- coreference
- information extraction "mini-MUC" (template filling)
These tasks are currently being refined through a process of corpus
annotation and a "dry run" evaluation, to be held in April
1995. The dry run is limited to a group of researchers who
have been involved in prior MUCs and the corpus annotation effort.
A call for participation in MUC-6 will be made in May 1995; the
official evaluations will be conducted in September 1995.
These evaluations will include the three tasks listed above and may
also include a syntactic bracketing task ("Parseval").
Participants may enter one or more of the evaluations.
Named Entity Recognition
The Named Entity task for MUC-6 involves the
recognition of entity names (for people and organizations), place
names, temporal expressions, and certain types of numerical
expressions. This task is intended both to be of direct practical
value (in annotating text so that it can be searched for names,
places, dates, etc.) and to serve as an essential component of many
language processing tasks, such as information extraction.
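To make the idea of annotating text for names, places, dates, and amounts concrete, here is a minimal sketch. It is not the MUC-6 annotation scheme: the `ENTITY` tag, the type labels, the regular expressions, and the example sentence are all assumptions chosen for illustration, and a real system would need far richer resources than these toy patterns.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")

# Hypothetical toy patterns, one per entity type (illustration only).
PATTERNS = [
    ("PERSON", re.compile(r"\b(?:Mr|Ms|Dr)\. [A-Z][a-z]+")),
    ("ORGANIZATION", re.compile(r"\b(?:[A-Z][a-z]+ )+(?:Corp|Inc|Co)\.")),
    ("LOCATION", re.compile(r"\b(?:New York|Seattle|London)\b")),
    ("DATE", re.compile(r"\b(?:" + MONTHS + r") \d{4}\b")),
    ("MONEY", re.compile(r"\$\d+(?:\.\d+)? (?:million|billion)")),
]


def find_entities(text):
    """Return non-overlapping (start, end, label) spans in text order."""
    spans = []
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))
    spans.sort()
    result = []
    last_end = 0
    for start, end, label in spans:
        # When two spans overlap, keep the one that starts first.
        if start >= last_end:
            result.append((start, end, label))
            last_end = end
    return result


def annotate(text):
    """Wrap each recognized span in an inline tag carrying its type."""
    pieces = []
    pos = 0
    for start, end, label in find_entities(text):
        pieces.append(text[pos:start])
        pieces.append(f'<ENTITY TYPE="{label}">{text[start:end]}</ENTITY>')
        pos = end
    pieces.append(text[pos:])
    return "".join(pieces)


if __name__ == "__main__":
    # Invented sentence, not taken from the MUC-6 corpus.
    print(annotate("Mr. Smith said Acme Corp. paid $5 million in April 1995."))
```

Inline tags of this kind leave the original text recoverable by stripping the markup, which is one reason annotated text remains easy to search.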
Coreference
The Coreference task for MUC-6 involves the identification of
coreference relations among noun phrases.
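One way to picture the output of such a task is as a set of links between noun phrases, from which groups of phrases referring to the same entity can be derived. The sketch below assumes that coreference can be treated as symmetric and transitive and that noun phrases have already been identified and numbered. Both are assumptions made for illustration, as are the function name and the example phrases; none of this is part of the task specification.

```python
def coreference_chains(mentions, links):
    """Group mention ids into chains, merging along every link.

    mentions: dict mapping a mention id to its noun phrase text.
    links: iterable of (id_a, id_b) pairs marked as coreferent.
    """
    parent = {m: m for m in mentions}

    def find(x):
        # Follow parent pointers to the representative of x's group,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    chains = {}
    for m in mentions:
        chains.setdefault(find(m), []).append(m)
    # A phrase linked to nothing is not a chain.
    return [chain for chain in chains.values() if len(chain) > 1]


if __name__ == "__main__":
    # Invented noun phrases, numbered in text order.
    mentions = {1: "Acme Corp.", 2: "the company", 3: "its",
                4: "Mr. Smith", 5: "he", 6: "a new jet"}
    links = [(2, 1), (3, 2), (5, 4)]
    for chain in coreference_chains(mentions, links):
        print([mentions[m] for m in chain])
```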
Information Extraction ("mini-MUC")
The template-filling task for MUC-6 involves the extraction of
information about a specified class of events and the filling
of a template for each instance of such an event. In contrast to
MUC-5, the aim has been to design relatively simple templates
and to predefine the low-level templates (for people, organizations,
and artifacts) that apply to a wide variety of different
event types.
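The split between predefined low-level templates and a scenario-specific event template can be sketched as a small set of record types. Every class name, slot name, and value below is hypothetical and chosen only to illustrate the layering; the actual MUC-6 template definitions are not reproduced here.

```python
from dataclasses import dataclass, fields
from typing import Optional


# Low-level templates, intended to be reusable across event types.
@dataclass
class Organization:
    name: str
    location: Optional[str] = None


@dataclass
class Person:
    name: str
    title: Optional[str] = None


@dataclass
class Artifact:
    name: str
    kind: Optional[str] = None


# Event template for one hypothetical scenario (an aircraft order).
# Its slots point at low-level templates instead of repeating their details.
@dataclass
class AircraftOrder:
    buyer: Optional[Organization] = None
    seller: Optional[Organization] = None
    aircraft: Optional[Artifact] = None
    quantity: Optional[int] = None
    announced_by: Optional[Person] = None


def unfilled_slots(template):
    """List the slots of a template that have no value yet."""
    return [f.name for f in fields(template)
            if getattr(template, f.name) is None]


if __name__ == "__main__":
    # Invented event: "Example Air ordered 10 jets from Sample Aircraft Co."
    order = AircraftOrder(
        buyer=Organization("Example Air"),
        seller=Organization("Sample Aircraft Co."),
        aircraft=Artifact("jet", kind="aircraft"),
        quantity=10,
    )
    print(order)
    print("unfilled:", unfilled_slots(order))
```

Under this layering, moving to a new scenario would mean defining a new event record while the `Organization`, `Person`, and `Artifact` records stay unchanged.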
Evaluation for the information extraction task will be done
with respect to a particular "scenario" (type of event). In
order to reduce the time that participants spend becoming expert
on a particular domain, and to encourage the development of tools
to port systems to new domains, this scenario will be released only
one month before the evaluation. An example scenario, involving
orders for aircraft, is available.
The articles used for the example templates are taken from the
Wall Street Journal and are available in machine-readable form
on the ACL/DCI disk, which is distributed by the
Linguistic Data Consortium.