G22.2590 - Natural Language Processing -- Spring 2003 -- Prof. Grishman
February 6, 2003
1. HMM: Consider an HMM with two states, Cow and Duck,
and a start and end state. Emission probabilities:
- In state Cow, the HMM can emit 'moo' (0.9 probability) or 'hello'
(0.1 probability).
- In state Duck, the HMM can emit 'quack' (0.6 probability) or 'hello'
(0.4 probability). The Duck has been studying English longer.
(Nothing is emitted in the start or end state.) Transition probabilities:
- From the start state, the HMM goes to state Cow with 1.0 probability.
- From state Cow, the HMM can remain in state Cow (0.5 probability),
go to state Duck (0.3 probability), or go to state end (0.2 probability).
- From state Duck, the HMM can remain in state Duck (0.5 probability),
go to state Cow (0.3 probability), or go to state end (0.2 probability).
Using the Viterbi algorithm, decode (find the most likely state sequence
for) 'moo hello quack'. What is the probability of emitting this sentence
from this state sequence? Show your work, so that you can get partial
credit even if you make an error.
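For checking a hand calculation, the Viterbi recurrence for this model can be sketched in Python. This is only an illustrative sketch, not part of Jet: the names START, END, STATES, TRANS, EMIT, and viterbi are hypothetical, and the 0.1 probability for Cow emitting 'hello' is an assumption inferred from the requirement that each state's emission probabilities sum to 1. The sketch also assumes a non-empty input sentence.

```python
# Minimal Viterbi sketch for the Cow/Duck HMM in problem 1.
# All identifiers here are illustrative; none come from Jet.

START, END = "start", "end"
STATES = ["Cow", "Duck"]

# Transition probabilities; transitions not listed have probability 0.
TRANS = {
    START: {"Cow": 1.0},
    "Cow": {"Cow": 0.5, "Duck": 0.3, END: 0.2},
    "Duck": {"Duck": 0.5, "Cow": 0.3, END: 0.2},
}

# Emission probabilities; words not listed have probability 0.
EMIT = {
    "Cow": {"moo": 0.9, "hello": 0.1},  # 0.1 is assumed (row must sum to 1)
    "Duck": {"quack": 0.6, "hello": 0.4},
}


def viterbi(words):
    """Return (best state sequence, its probability) for a non-empty word list."""
    # best[s] = (probability of the best path ending in state s, that path)
    best = {}
    for s in STATES:
        p = TRANS[START].get(s, 0.0) * EMIT[s].get(words[0], 0.0)
        best[s] = (p, [s])

    for word in words[1:]:
        new_best = {}
        for s in STATES:
            # Pick the predecessor r that maximizes best[r] * P(r -> s).
            prob, path = max(
                ((best[r][0] * TRANS[r].get(s, 0.0), best[r][1]) for r in STATES),
                key=lambda item: item[0],
            )
            new_best[s] = (prob * EMIT[s].get(word, 0.0), path + [s])
        best = new_best

    # Finish with the transition into the end state.
    prob, path = max(
        ((best[s][0] * TRANS[s].get(END, 0.0), best[s][1]) for s in STATES),
        key=lambda item: item[0],
    )
    return path, prob


if __name__ == "__main__":
    path, prob = viterbi(["moo", "hello", "quack"])
    print(path, prob)
```

Keeping the whole path alongside each probability avoids a separate backpointer table, which is reasonable for a two-state model; a larger model would normally store backpointers and reconstruct the path at the end.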
2. JET HMM Tagger. Try the Jet HMM tagger. Submit
one output in which the tagger assigns an incorrect tag. Then write out
another example, involving the mis-tagged word and its two neighbors, for
which the sequence of 3 tags assigned by the tagger would be correct.
Due February 27th.
Running the tagger:
Get version 1.11 of Jet (or later) and the file pos_hmm.txt.
Use the properties file tagPOS.properties:
    # JET properties file for POS tagging
    Jet.dataPath = C:/My Documents/nlp
    Tags.fileName = pos_hmm.txt
    processSentence = lexLookup, tagPOS
On the "tagger" menu, turn on the "POS tagger trace".