Jet Tools:  Annotators and Resources

The analysis of a document in Jet is performed by a set of tools. Most of these tools are annotators: each annotator adds a set of annotations to the document. Many of these annotators use linguistic resources to perform their task; for example, the parser uses a grammar, and the pattern matcher uses a pattern set. These resources are described along with the annotators that use them. The annotators currently implemented in Jet are:
Annotator                     Function                                                           Linguistic resource
Tokenizer                     divides a text into tokens
Sentence Splitter             divides a text into sentences
Lexicon Lookup                looks up definitions of words in a dictionary                      lexicon
Part-of-speech Tagger         assigns parts of speech to words in context                        HMM of part-of-speech sequences
Name Tagger                   tags names, dates, times, ...                                      HMM of names
Noun group Chunker            tags noun groups                                                   Maxent model of noun groups
Parser or Statistical Parser  determines syntactic structure                                     grammar
Pattern Matcher               identifies structure through regular expression pattern matching   pattern set and concept hierarchy
Reference Resolver            resolves anaphoric references
Scorer                        scores performance against standard
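
The general idea of annotators that add annotations to a shared document, some of them consulting a linguistic resource, can be illustrated with a minimal sketch. This is not Jet's actual API (Jet itself is a Java toolkit): the names Annotation, Document, tokenize, make_lexicon_lookup, and run_pipeline are hypothetical, and the toy lexicon is invented purely for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """A typed span over the document text (hypothetical structure)."""
    kind: str                      # e.g. "token"
    start: int                     # offset where the span begins
    end: int                       # offset just past the end of the span
    features: dict = field(default_factory=dict)


@dataclass
class Document:
    """Text plus the annotations accumulated by successive annotators."""
    text: str
    annotations: list = field(default_factory=list)

    def of_kind(self, kind):
        return [a for a in self.annotations if a.kind == kind]


def tokenize(doc):
    """Annotator without a resource: adds one 'token' annotation per word
    or punctuation mark (a deliberately crude rule, for illustration)."""
    for m in re.finditer(r"\w+|[^\w\s]", doc.text):
        doc.annotations.append(Annotation("token", m.start(), m.end()))


def make_lexicon_lookup(lexicon):
    """Annotator that uses a resource: the lexicon is supplied up front,
    and the returned function attaches definitions to known tokens."""
    def lookup(doc):
        for tok in doc.of_kind("token"):
            word = doc.text[tok.start:tok.end].lower()
            if word in lexicon:
                tok.features["definition"] = lexicon[word]
    return lookup


def run_pipeline(doc, annotators):
    """Apply each annotator in order; later ones see earlier annotations."""
    for annotate in annotators:
        annotate(doc)
    return doc


# Toy resource, invented for this example.
toy_lexicon = {"dogs": "noun, plural", "bark": "verb"}

doc = run_pipeline(Document("Dogs bark."),
                   [tokenize, make_lexicon_lookup(toy_lexicon)])
tokens = [(doc.text[t.start:t.end], t.features) for t in doc.of_kind("token")]
```

Here the lexicon lookup depends on the tokenizer having run first, which mirrors the ordering of the tools in the table: later annotators build on the annotations left by earlier ones.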