G22.2590 - Natural Language Processing -- Spring 2006 -- Prof. Grishman
1. (2.5 points) Try your chunkPatterns from Assignment #6 on an article
from the Washington Square News. Save this article in the data directory
as "article.txt". Use the following properties file:
JET properties file
# apply chunkPatterns
= tokenize, tagJet, pat(chunks)
Then run your pattern set on the article, following the instructions
for Processing Documents in Jet;
the properties file will write out the document (as file "response-article.txt") with the
ngroup annotations marked in an XML notation. Score the response file
against the key file following the instructions
for Using the Jet SGML Scorer.
Compare the scores with and without the change to the pattern set
which you made for Assignment #6.
If your change was not very successful in improving the score, try
We will give a small extra credit for additional,
motivated enhancements which further improve the noun group
score. Such improvements should have some generality ... not
coded just to handle a specific example in the WSN article.
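
As a rough illustration of what the comparison in item 1 measures, the sketch below computes precision, recall, and F-measure over exactly matching noun group spans. It is not the Jet SGML Scorer and may not match its matching rules. It assumes that noun groups appear as <ngroup>...</ngroup> tags and that the key and response share the same underlying text. The names ngroup_spans and score are hypothetical helpers introduced only for this example.

```python
import re

# Assumption: noun groups are marked as <ngroup ...> ... </ngroup>.
NGROUP_TAG = re.compile(r"</?ngroup\b[^>]*>")


def ngroup_spans(marked_text):
    """Return the set of (start, end) offsets of ngroup spans,
    measured in the text with the ngroup tags removed."""
    spans = set()
    offset = 0         # current position in the tag-free text
    last_end = 0       # end of the previous tag in the marked text
    open_start = None  # start offset of the currently open ngroup
    for m in NGROUP_TAG.finditer(marked_text):
        offset += m.start() - last_end
        last_end = m.end()
        if m.group().startswith("</"):
            if open_start is not None:
                spans.add((open_start, offset))
                open_start = None
        else:
            open_start = offset
    return spans


def score(key_text, response_text):
    """Return (precision, recall, f_measure) for exact span matches."""
    key = ngroup_spans(key_text)
    response = ngroup_spans(response_text)
    correct = len(key & response)
    precision = correct / len(response) if response else 0.0
    recall = correct / len(key) if key else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    key = "<ngroup>The cat</ngroup> sat on <ngroup>the mat</ngroup>."
    response = "<ngroup>The cat</ngroup> sat on the <ngroup>mat</ngroup>."
    print(score(key, response))
```

In the usage example, the response gets "The cat" right but marks only "mat" instead of "the mat", so one of its two noun groups matches the key exactly. Running the same comparison for each version of the pattern set gives the "with and without" figures asked for above.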
2. (1.5 points) Rerun your experiment using the MUC named
entity tagger provided with Jet (command "tagNames"). This will
require small changes to the properties file and chunk patterns.
Your proper-name pattern should now look for ENAMEX annotations produced by
the NE tagger. Report your scores with the (otherwise) original
chunkPatterns and the chunkPatterns with your other modifications.
Due March 7th.