constit
a syntactic constituent
HMMTagger; Retagger; Lexicon;
AddSyntacticRelations [for tree restructuring];
ParseTreeNode.makeParseAnnotations;
StatParser; ExternalDocument [when reading POS-format files]
cat
String
the category of the constituent, such as "np" or "s"
[required]
children
constit Annotation[]
the immediate constituents (in a parse tree)
apposite
constit Annotation
for appositive constructions (NP, NP, such as "Fred, the baker"),
a feature of the first NP pointing to the second;
assigned by AddSyntacticRelations
conj
constit Annotation
for constructs of the form NP1 conj NP2, a feature of NP1
pointing to NP2; assigned by AddSyntacticRelations
det
String
for NPs having a determiner, a feature on the NP whose value is the
determiner string (e.g., "the"), "poss" if the determiner is a possessive,
or "q" if the first word of the NP is a quantifier or quantifier phrase;
assigned by AddSyntacticRelations.addDetRelation
headC
constit Annotation
the head constituent (one of the immediate constituents)
[assigned by ParseTreeNode.makeParseAnnotations and by
AddSyntacticRelations when restructuring the parse tree]
host
constit Annotation
a link from a WHNP (relative clause) node to the NP node which
dominates it; assigned by AddSyntacticRelations
mainV
constit Annotation
a feature on S nodes pointing to the main verb of the clause;
assigned by AddSyntacticRelations
mention
"true" or null
if true, indicates that the constituent is treated as a mention
(phrase to be resolved) by reference resolution;
assigned by Resolve.markMentions
nameMod**
constit Annotation
if an NP includes a NAME left modifier, a feature on the NP
pointing to the NAME; assigned by AddSyntacticRelations
object*
constit Annotation
a link from an S node, or the vp node of a reduced relative,
to its logical object; assigned by AddSyntacticRelations
pa
a FeatureSet with features "head", "tense" [for verbs
and clauses], "number" [for nouns], and "voice" and
"part" [for clauses]
provides the root form [head], tense and number information;
assigned by EnglishLex to open-class words (nouns, verbs,
adjectives, and adverbs); assigned to S nodes, giving the
head of the main verb, the tense (if tensed), the voice ("passive"
for passive clauses), and the verbal particle ("part" feature),
by AddSyntacticRelations
p-obj**
constit Annotation
a link from a PP node to the object of the preposition;
assigned by AddSyntacticRelations.addPObjRelation
pp
constit Annotation
a link from an S node to a PP node under the VP node;
assigned by AddSyntacticRelations
prep
constit Annotation
a link from an S node to the object of a prepositional
modifier under the VP node, where prep =
{"of", "on", "in", "to", "by", "at", "through", "for", "with"};
assigned by AddSyntacticRelations
poss
constit Annotation
for NPs with a possessive modifier, a feature on the NP pointing
to the possessive; assigned by AddSyntacticRelations.addDetRelation
predComp
constit Annotation
for constructs of the form NP be/become X, where X
is an NP, ADJ, VEN, or PP, a feature on the subject NP
pointing to X; assigned by AddSyntacticRelations
preName**
constit Annotation
for NPs with a NAME head immediately preceded by an N or TITLE
("President Bush"), a feature on the NP pointing to the
N or TITLE; assigned by AddSyntacticRelations
subject*
constit Annotation
a link from an S node, or the vp node of a reduced relative,
to its logical subject; assigned by AddSyntacticRelations
* The reciprocal relations (pointing in the reverse direction)
subject-1 and object-1 are added by Ace.tagReciprocalRelations.
** The reciprocal relations nameMod-1, p-obj-1, and preName-1
are also added by AddSyntacticRelations.
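The constit features above can be pictured as a small tree of feature maps. The following is a minimal sketch only: ConstitSketch, categories, and addReciprocal are hypothetical names introduced for illustration and are not part of the Jet API. It assumes that "cat" is always present, that "children" holds an array of constituents, and that a reciprocal relation is stored on the target under the relation name with "-1" appended.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a "constit" annotation; not a Jet class.
class ConstitSketch {
    // Feature name -> value, mirroring the feature rows listed above.
    final Map<String, Object> features = new HashMap<>();

    ConstitSketch(String cat, ConstitSketch... children) {
        features.put("cat", cat);                // "cat" is the required feature
        if (children.length > 0) {
            features.put("children", children);  // immediate constituents
        }
    }

    String cat() {
        return (String) features.get("cat");
    }

    ConstitSketch[] children() {
        Object c = features.get("children");
        return c == null ? new ConstitSketch[0] : (ConstitSketch[]) c;
    }

    // Collect the categories of a node and its descendants in pre-order.
    static List<String> categories(ConstitSketch node) {
        List<String> out = new ArrayList<>();
        out.add(node.cat());
        for (ConstitSketch child : node.children()) {
            out.addAll(categories(child));
        }
        return out;
    }

    // Assumed convention: the reciprocal of relation R is stored on the
    // target constituent as feature "R-1", pointing back to the source.
    static void addReciprocal(ConstitSketch from, String relation) {
        Object target = from.features.get(relation);
        if (target instanceof ConstitSketch) {
            ((ConstitSketch) target).features.put(relation + "-1", from);
        }
    }
}
```

For example, an S node built over an np and a vp, with its "subject" feature set to the np, would after addReciprocal(s, "subject") leave a "subject-1" feature on the np pointing back to the S node.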
dateline
text at beginning of news article saying where (and sometimes when)
the news story was filed
SpecialZoner
ENAMEX
names, as defined by the MUC or ACE standard. Note: this type,
like the other MUC types (NUMEX and TIMEX), is specified by
data files read when the HMM is created
HMMNameTagger
entity
a set of co-referring phrases
Resolve (and MaxEntResolve)
ACEtype
"person", "organization", "location", ...
if the ACE entity type dictionary has been loaded, the
type according to that dictionary
gender
"male" or "female" or null
the gender of the entity, if it can be determined,
else null
human
"t" or null
"t" if entity is a person or set of people
lastMention
constit Annotation
the most recent (last) mention of the entity
mentions
Vector of annotations
the set of mentions of this entity in this document
name
String[]
the name of the entity, if it has one
nameType
String
for entities with names, the type of the name, as assigned
by the name tagger
nameWithMods
String[]
for entities which include a mention with a name head,
the tokens of the complete mention, including modifiers
number
"singular" or "plural"
position
Integer
the position (character offset in the document) of the
most recent (last) mention of the entity
properAdjective
"true" or null
"true" if the only mention(s) of this entity are proper adjectives
(such as "French"); such entities cannot be referred to by
pronouns
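The relationship among the mentions, lastMention, and position features of an entity can be sketched as follows. This is an illustration under stated assumptions, not Jet's Resolve code: EntitySketch and MentionSketch are hypothetical classes, and mentions are assumed to be added in document order so that the newest mention is also the last one.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Vector;

// Hypothetical stand-in for a mention (a constit annotation in the table above).
class MentionSketch {
    final int start;    // character offset of the mention in the document
    final String text;  // surface text of the mention

    MentionSketch(int start, String text) {
        this.start = start;
        this.text = text;
    }
}

// Hypothetical stand-in for an "entity" annotation; not a Jet class.
class EntitySketch {
    final Map<String, Object> features = new HashMap<>();
    private final Vector<MentionSketch> mentions = new Vector<>();

    EntitySketch() {
        features.put("mentions", mentions);
    }

    // Record a new mention and refresh the features that track the last mention.
    // Assumes mentions arrive in document order.
    void addMention(MentionSketch m) {
        mentions.add(m);
        features.put("lastMention", m);
        features.put("position", Integer.valueOf(m.start));
    }
}
```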
NE_INTERNAL
used internally by the Extended Named Entity tagger, erased on exit
from this annotator
NamedEntityUtil; DictionaryTagger; TransformRules
number
a numeric expression in the text
NumberAnnotator
value
Integer
the value of the expression
onoma
an item from a name dictionary (an "onomasticon")
Lexicon (using entries added to the dictionary by Gazetteer)
type
String
the type of name ("country", "usstate", "region", ...)
as specified by the gazetteer
sentence
a sentence
SentenceSplitter;
SpeechSplitter
parse
constit Annotation
the root of the parse tree for this sentence
StatParser
tagger
HMMTagger
when part-of-speech tagging is done using the Jet POS categories,
the POS tagger first assigns Penn POS categories using the "tagger"
annotations
cat
String
the (Penn) part of speech assigned by the tagger,
such as "NN" or "VBZ"
textbreak
a portion of a document indicating a sentence break,
such as a blank line or horizontal rule
SpecialZoner
TIMEX2
a time expression
rules called by TimeAnnotator
VAL
String
the normalized value of the time expression
token
a token (word or punctuation) of the text, roughly corresponding to
Penn Tree Bank, but decomposing hyphenated items
Tokenizer;
ExternalDocument.posRead (for POS-format documents)
case
{forcedCap, cap, null}
tokens beginning with an upper-case letter are marked "cap",
except that tokens at the beginning of a sentence or preceded by a
quote or underscore are marked "forcedCap"
intvalue
{Integer, null}
the value of an integer token
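The rule for the token "case" feature can be sketched as a small function. This is an illustration only, not the Jet Tokenizer: TokenCaseSketch and caseFeature are hypothetical names, the caller is assumed to supply whether the token is sentence-initial and which character precedes it, and treating both double and single quotes as "a quote" is an assumption. The value is spelled "forcedCap", as in the value set listed for the feature.

```java
// Hypothetical helper illustrating the "case" rule; not a Jet class.
class TokenCaseSketch {

    // Returns "cap", "forcedCap", or null (no case feature).
    static String caseFeature(String token, boolean sentenceInitial, char previousChar) {
        // Tokens that do not begin with an upper-case letter get no case feature.
        if (token.isEmpty() || !Character.isUpperCase(token.charAt(0))) {
            return null;
        }
        // Assumption: double and single quotes both count as a preceding quote.
        boolean afterQuoteOrUnderscore =
                previousChar == '"' || previousChar == '\'' || previousChar == '_';
        // Capitalization in these positions carries little information,
        // so it is marked as forced rather than as ordinary capitalization.
        return (sentenceInitial || afterQuoteOrUnderscore) ? "forcedCap" : "cap";
    }
}
```

Under these assumptions, a capitalized token in mid-sentence yields "cap", the same token at the start of a sentence or directly after a quote or underscore yields "forcedCap", and a lower-case token yields null.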