Title: Improving Event Extraction: Casting a Wider Net
Candidate: Cao, Kai
Advisor(s): Grishman, Ralph
Information extraction is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents. One facet of information extraction is event extraction (EE): identifying instances of selected types of events appearing in natural language text. For each instance, EE should identify the type of the event, the event trigger (the word or phrase which evokes the event), the participants in the event, and (where possible) the time and place of the event.
One EE task was defined and intensively studied as part of the ACE (Automatic Content Extraction) research program. The 2005 ACE EE task involved 8 types and 33 subtypes of events. For instance, given the sentence "She was killed by an automobile yesterday.", an EE system should be able to recognize the word "killed" as a trigger for an event of subtype DIE, and discover "an automobile" and "yesterday" as the Agent and Time arguments. This task is quite challenging, as the same event might appear in the form of various trigger expressions and an expression might represent different types of events in different contexts.
To support the development and evaluation of ACE EE systems, the Linguistic Data Consortium annotated a text corpus (consisting primarily of news articles) with information on the events mentioned. This corpus was widely used to train ACE EE systems. However, the event instances in the ACE corpus are not evenly distributed, and so some frequent expressions involving ACE events do not appear in the training data, adversely affecting performance.
The thesis presents several strategies for improving the performance of EE. We first demonstrate the effectiveness of two types of linguistic analysis -- dependency regularization and Abstract Meaning Representation -- in boosting EE performance. Next we show the benefit of an active learning strategy in which a person is asked to judge a limited number of phrases which may be event triggers. Finally we report the impact of combining our baseline system with event patterns from a system developed for a different EE task (the TABARI program). This step contains expert-level patterns generated by other research groups. Because the information received is complicated and quite different from the original corpus (ACE), the integration of this information requires more complex processing.
Title: On the Design of Small Coarse Spaces for Domain Decomposition Algorithms
Author(s): Dohrmann, Clark; Widlund, Olof
Methods are presented for automatically constructing %low-dimensional coarse spaces of low dimension for domain decomposition algorithms. These constructions use equivalence classes of nodes on the interface between the subdomains into which the domain of a given elliptic problem has been subdivided, e.g., by a mesh partitioner such as METIS; these equivalence classes already play a central role in the design, analysis, and programming of many domain decomposition algorithms. The coarse space elements are well defined even for irregular subdomains, are continuous, and well suited for use in two-level or multi-level preconditioners such as overlapping Schwarz algorithms. An analysis for scalar elliptic and linear elasticity problems ms reveals that significant reductions in the coarse space dimension can be achieved while not sacrificing the favorable condition number estimates for larger coarse spaces previously developed. These estimates depend primarily on the Lipschitz parameters of the subdomains. Numerical examples for problems in three dimensions are presented to illustrate the methods and to confirm the analysis. In some of the experiments, the coefficients have large discontinuities across the interface between the subdomains, and in some, the subdomains are generated by mesh partitioners.