Natural Language and Speech Processing


Ralph Grishman   Satoshi Sekine   Adam Meyers   Mehryar Mohri   Naomi Sager  

The amount of text available in electronic form is growing at an explosive rate. In addition to the web, large quantities of text are being collected for medical, legal, commercial, and scientific applications. But the tools for getting the information we need out of this text are still quite primitive. Our research groups in natural language processing are building systems to extract specific information from large text collections and to present it in the user's preferred language. A closely related area, speech processing, deals with the coding and synthesis of speech signals and the extraction of information from them.

Natural language processing has a long history at NYU. The Linguistic String Project was one of the pioneers in natural language processing research in the United States.

The Proteus Project focuses on automatically learning the linguistic knowledge needed for information extraction and machine translation. It has developed extraction systems in English and Japanese, and a series of language-independent translation models. It also conducts a wide range of basic research, and develops large-scale dictionaries and other resources for natural language processing.

Ralph Grishman's area of research is natural language processing: automatically 'understanding' natural language. He works on systems for information extraction, which can pull out of text information about specific relationships or types of events. For example, one of the systems developed by the group can process newspaper text and construct a table of executive hires and firings, or of corporate acquisitions. A current focus of Ralph Grishman's group is learning information from large text collections. The group is part of a major DARPA-funded effort, led by SRI International, to develop integrated software for speech, machine translation, and information extraction.

Satoshi Sekine is working on a variety of topics in natural language processing: on-demand information extraction; knowledge discovery from text corpora (to discover patterns, paraphrases, relations, and semantic knowledge); question answering and summarization; software for Japanese and English text analysis; and Japanese and English named entity tagging.

Adam Meyers's research interests are in natural language processing and computational linguistics. His main directions of work include computational lexicography, predicate-argument structure, sentence alignment, coreference, and corpus annotation.

Mehryar Mohri's primary research areas are machine learning, theory and algorithms, text and speech processing, and computational biology. In particular, his work covers the theoretical aspects of machine learning, the design of general and accurate learning algorithms, and their application to large-scale learning problems such as those found in bioinformatics and language processing.


Related Web Pages

Proteus Project  
