G22.2590 - Natural Language Processing - Spring 2003 Prof. Grishman

Term Project

March 13, 2003

You must submit a “term project” on material connected to the course;  this is worth 30% of your grade.  You have wide latitude in what you do for the project.  It may be a project based on Jet, a separate programming project, or a research paper.  A Jet or programming project must be accompanied by a separate, well-written description of the project;  an analysis of the data and your system’s performance will be an important part of the grade.

Joint projects are permitted.

The general idea is to do something interesting which will require you to confront some of the 'real problems' of doing NLP.

Possible projects

  1. We will discuss the use of Jet for a specific extraction task -- 'executive succession' -- using the pattern matching tools in Jet.  A project of this type would adapt Jet to perform extraction on a different topic (modeled on the executive succession patterns, but more extensive).  It should include some analysis of the performance of the extraction patterns.  You may either try to pull one specific topic from general news, or do a richer analysis of some narrow sublanguage, such as weather forecasts, apartment ads, death notices, cooking recipes, sports results, etc.
  2. Doing similar extraction with your own program (e.g., with Perl).
  3. Preparing a context-free semantic grammar for a sublanguage.
  4. Extension of Jet syntactic patterns to more constructs (e.g., a rich variety of modifiers for noun and verb groups, or additional structures such as clause structures).  This should include some performance analysis.
  5. Experimentation with HMMs to improve performance for POS tagging [e.g., using morphological clues] or name tagging, or to apply them to a different task (e.g., chunking), using Jet tools or your own HMM implementation.
  6. Foreign language analysis:  building a POS tagger or even a chunker for another language.
  7. Implementation of one of the parsing algorithms for feature grammars or grammars with semantic features, either as an extension to the Jet parser or a separate program.
  8. Research report on some topic not covered in the course (e.g., morphological analysis of morphologically rich languages;  question answering, summarization, machine translation methods).  The paper should show some understanding of what problems have and have not been addressed by current technology.

What were the best projects last time?

  - question-answering (NL data base interface) system for stock quotes (from scratch)
  - extraction systems for {criminal verdicts, lay-offs, weather reports, apartment ads}, including evaluation (using Jet)
  - noun group patterns, with evaluation on a larger corpus (using Jet)
  - feature parsers (both on top of Jet and stand-alone)

Due dates

Brief project description – April 1st  (1-2 paragraphs, emailed to grishman@cs.nyu.edu)

Project – May 1st (last class)

   2% penalty for each weekday late

Final exam – May 8th  (open book)