G22.2590 - Natural Language Processing  -- Spring 2010 -- Prof. Grishman

Assignment #2

January 28, 2010

Dollar Expressions

You are to write a regular expression that recognizes expressions denoting a specific quantity of (US) dollars, as they appear in written texts such as newspapers and blogs.
To test your regular expression, prepare a very small corpus in which every dollar expression is enclosed in brackets.  For example:

[Two million dollars] is a common salary on Wall Street.
Maharashtra to pay Enron a sum of [30 billion dollars].
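
As an illustration of the bracket convention, the following Python sketch reads one annotated line and recovers the plain text together with the character offsets of each bracketed expression.  The function name parse_key_line is hypothetical and is not part of the course test program; the sketch also assumes that brackets are never nested and never occur in the text for any other purpose.

```python
def parse_key_line(line):
    """Strip the brackets from one annotated line.

    Returns the plain text and a list of (start, end) character offsets
    (end exclusive) into that plain text, one per bracketed expression.
    Assumes brackets are not nested (an assumption of this sketch).
    """
    plain = []      # characters of the text with brackets removed
    spans = []      # offsets of the bracketed expressions
    start = None    # offset where the currently open bracket began
    pos = 0         # current offset in the plain text
    for ch in line:
        if ch == "[":
            start = pos
        elif ch == "]":
            if start is not None:
                spans.append((start, pos))
                start = None
        else:
            plain.append(ch)
            pos += 1
    return "".join(plain), spans


text, spans = parse_key_line(
    "Maharashtra to pay Enron a sum of [30 billion dollars]."
)
for begin, end in spans:
    print(text[begin:end])   # prints: 30 billion dollars
```

Storing offsets rather than the matched strings keeps two identical expressions in the same line distinct.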

Put this corpus in a file named "key".  Then insert your regular expression into our regular expression test program, and compile and run the program.  The program should report precision and recall.
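
For reference, precision is the fraction of the expressions your pattern matches that are also bracketed in the key, and recall is the fraction of bracketed expressions that your pattern matches.  The Python sketch below shows one way such scores could be computed.  It is not the course test program: the names TOY_PATTERN, split_key_line and score are hypothetical, the scoring assumes a match counts only if its span is exactly the bracketed span, and the pattern is a deliberately incomplete placeholder rather than a solution.

```python
import re

# Deliberately incomplete placeholder pattern, for illustration only.
TOY_PATTERN = re.compile(r"\d+ (?:million|billion) dollars")


def split_key_line(line):
    """Return (plain_text, gold_spans) for one bracket-annotated line."""
    # With a capturing group, re.split alternates between text outside
    # brackets (even indices) and text inside brackets (odd indices).
    parts = re.split(r"\[([^\]]*)\]", line)
    plain, gold, pos = [], set(), 0
    for i, part in enumerate(parts):
        if i % 2 == 1:
            gold.add((pos, pos + len(part)))
        plain.append(part)
        pos += len(part)
    return "".join(plain), gold


def score(key_lines, pattern):
    """Exact-span precision and recall of pattern against the key."""
    correct = n_pred = n_gold = 0
    for line in key_lines:
        text, gold = split_key_line(line)
        pred = {m.span() for m in pattern.finditer(text)}
        correct += len(pred & gold)
        n_pred += len(pred)
        n_gold += len(gold)
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    return precision, recall


key = [
    "[Two million dollars] is a common salary on Wall Street.",
    "Maharashtra to pay Enron a sum of [30 billion dollars].",
]
print(score(key, TOY_PATTERN))   # prints: (1.0, 0.5)
```

On the two example sentences the placeholder finds "30 billion dollars" but misses "Two million dollars", so its one match is correct (precision 1.0) while only one of the two bracketed expressions is found (recall 0.5).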

Submit the program and your test file.  Send them as separate attachments (in a single email) to grishman@cs.nyu.edu and asun@cs.nyu.edu.  We will run your program both on your data and on our data (culled from recent news stories), so you will be graded in part on whether you handled the common expressions.

You must prepare the regular expression yourself, but you may exchange test corpora with fellow students.   If you do, mention that in your submission email.

Due February 4.