G22.2590 - Natural Language Processing -- Spring 2003 -- Prof. Grishman
Assignment #4
February 6, 2003
1. HMM: Consider an HMM with two states, Cow and Duck,
and a start and end state. Emission probabilities:
- In state Cow, the HMM can emit 'moo' (with 0.9 probability) or 'hello'
(0.1 probability).
- In state Duck, the HMM can emit 'quack' (0.6 probability) or 'hello'
(0.4 probability). The Duck has been studying English longer.
(Nothing is emitted in the start or end state.) Transition probabilities:
- From the start state, the HMM goes to state Cow with 1.0 probability
(i.e., always).
- From state Cow, the HMM can remain in state Cow (0.5 probability),
go to state Duck (0.3 probability), or go to state end (0.2 probability).
- From state Duck, the HMM can remain in state Duck (0.5 probability),
go to state Cow (0.3 probability), or go to state end (0.2 probability).
Using the Viterbi algorithm, decode (find the most likely state sequence
for) 'moo hello quack'. What is the probability of emitting this sentence
from this state sequence? Show your work, so that you can get partial
credit even if you make an error.
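As a way to check a hand calculation, the Viterbi recurrence for this model can be sketched in Python. This is a minimal sketch, not part of Jet: the names START, END, STATES, EMIT, TRANS and viterbi are chosen here for illustration, and the tables are transcribed from the problem statement above. Any emission or transition not listed is assumed to have probability 0.

```python
# Model parameters transcribed from the problem statement.
# All identifiers here are illustrative, not taken from Jet.
START, END = "start", "end"
STATES = ["Cow", "Duck"]

EMIT = {
    "Cow": {"moo": 0.9, "hello": 0.1},
    "Duck": {"quack": 0.6, "hello": 0.4},
}

TRANS = {
    START: {"Cow": 1.0},
    "Cow": {"Cow": 0.5, "Duck": 0.3, END: 0.2},
    "Duck": {"Duck": 0.5, "Cow": 0.3, END: 0.2},
}


def viterbi(words):
    """Return (best_state_sequence, probability) for a non-empty word list."""
    # best[s] = (probability of the best path ending in state s, that path)
    best = {
        s: (TRANS[START].get(s, 0.0) * EMIT[s].get(words[0], 0.0), [s])
        for s in STATES
    }
    for word in words[1:]:
        new_best = {}
        for s in STATES:
            # Choose the predecessor that maximizes the path probability.
            prob, path = max(
                ((best[p][0] * TRANS[p].get(s, 0.0), best[p][1]) for p in STATES),
                key=lambda item: item[0],
            )
            new_best[s] = (prob * EMIT[s].get(word, 0.0), path + [s])
        best = new_best
    # Finish with the transition into the end state.
    prob, path = max(
        ((best[s][0] * TRANS[s].get(END, 0.0), best[s][1]) for s in STATES),
        key=lambda item: item[0],
    )
    return path, prob
```

Calling viterbi(["moo", "hello", "quack"]) returns the most likely state sequence together with its probability, including the final transition to the end state. Printing the intermediate `best` table at each step gives the per-word values to compare against work done by hand.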
2. JET HMM Tagger. Try the Jet HMM tagger. Submit
one output with an error in a tag. Write out another example
involving the erroneous word and its two neighbors for which the sequence
of 3 tags assigned by the tagger would be correct.
Due February 27th.
Running the tagger:
Get version 1.11 of Jet (or later) and the file pos_hmm.txt.
Use the properties file tagPOS.properties:
# JET properties file for POS tagging
Jet.dataPath = C:/My Documents/nlp course
Tags.fileName = pos_hmm.txt
#
processSentence = lexLookup, tagPOS
On the "tagger" menu, turn on the "POS tagger trace".