Name Tagger scoring
This utility can either score a set of files that have already been
tagged by a name tagger, or tag a set of files and then immediately
score them. The key files should have in-line XML name tags. The
utility is invoked by
xjet Jet.Scorer.NEScorer test-directory test-file-list key-directory key-file-list [order tagger-model] tag1 tag2 ...
with arguments as follows:
- test-directory
- directory containing test files
- test-file-list
- a file containing a list of test files, one per line
- key-directory
- directory containing key files
- key-file-list
- a file containing a list of key files, one per line
(test-file-list and key-file-list should be the same length.)
The following two arguments should be included only when the test
documents are to be tagged and then scored:
- order
- either "unigram" or "bigram", indicating what emission
probabilities should be used
- tagger-model
- file containing HMM model for name tagging
- tag1, tag2, ...
- a tag to be scored (e.g., ENAMEX)
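The argument order above can be sketched as a small helper that assembles the argument list and checks the stated constraints before the scorer is run. This is only an illustration: `read_file_list` and `build_scorer_args` are hypothetical names that are not part of Jet, and the sketch assumes the file lists are plain UTF-8 text in which blank lines can be ignored.

```python
from pathlib import Path


def read_file_list(path):
    """Return the non-empty lines of a file list (one file name per line)."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_scorer_args(test_dir, test_list, key_dir, key_list, tags,
                      order=None, tagger_model=None):
    """Assemble the NEScorer arguments in the documented order.

    Hypothetical helper: it only builds and validates the list,
    it does not run the scorer.
    """
    # The two file lists should be the same length.
    if len(read_file_list(test_list)) != len(read_file_list(key_list)):
        raise ValueError("test-file-list and key-file-list differ in length")
    # order and tagger-model are an optional pair: both or neither.
    if (order is None) != (tagger_model is None):
        raise ValueError("order and tagger-model must be given together")
    if order is not None and order not in ("unigram", "bigram"):
        raise ValueError('order must be "unigram" or "bigram"')

    args = [str(test_dir), str(test_list), str(key_dir), str(key_list)]
    if order is not None:
        args += [order, str(tagger_model)]
    return args + list(tags)
```

Omitting `order` and `tagger_model` gives the score-only form of the argument list; supplying both gives the tag-and-score form.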
The scorer reports scores for each document and for the entire
collection.
It reports both 'type' precision and recall and 'attribute' precision
and recall. The former measures matches at the level of the XML tag
type (e.g., ENAMEX); the latter measures matches at the feature level
(e.g., PERSON vs. ORGANIZATION).
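The difference between the two measures can be illustrated with a minimal sketch. It rests on assumptions that the text above does not state: each name is represented as a `(start, end, tag, attribute)` tuple, a response name counts as a match only when its offsets agree exactly with a key name, and the response contains no duplicates. The scorer's actual matching rules may differ, and `precision_recall` and `score` are hypothetical names.

```python
def precision_recall(correct, n_response, n_key):
    """Precision = correct / response count; recall = correct / key count."""
    precision = correct / n_response if n_response else 0.0
    recall = correct / n_key if n_key else 0.0
    return precision, recall


def score(key, response):
    """Compare two lists of (start, end, tag, attribute) tuples.

    Assumed matching rule (not taken from the scorer): exact offsets.
    """
    key_types = {(s, e, t) for s, e, t, _ in key}
    key_attrs = set(key)
    # Type match: same span and same XML tag (e.g., ENAMEX).
    type_correct = sum((s, e, t) in key_types for s, e, t, _ in response)
    # Attribute match: same span, tag, and feature value (e.g., PERSON).
    attr_correct = sum(item in key_attrs for item in response)
    return {
        "type": precision_recall(type_correct, len(response), len(key)),
        "attribute": precision_recall(attr_correct, len(response), len(key)),
    }


key = [(0, 10, "ENAMEX", "PERSON"), (20, 30, "ENAMEX", "ORGANIZATION")]
response = [(0, 10, "ENAMEX", "PERSON"),
            (20, 30, "ENAMEX", "PERSON"),      # right tag, wrong feature
            (40, 45, "ENAMEX", "LOCATION")]    # spurious name
result = score(key, response)
```

Here the second response name counts toward the type scores but not the attribute scores, so type recall is 2/2 while attribute recall is 1/2.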