Jet.Refres
Class Resolve

java.lang.Object
  extended by Jet.Refres.Resolve

public class Resolve
extends java.lang.Object

contains static procedures for reference resolution within a document.


Field Summary
static boolean trace
           
 
Constructor Summary
Resolve()
           
 
Method Summary
static java.lang.String concat(java.lang.String[] s)
          append strings in 's', separated by blanks
static boolean equalArray(java.lang.Object[] first, java.lang.Object[] second)
           
static java.util.Vector gatherMentions(Document doc, Span span)
          collects and returns the set of all mentions -- constituents which are subject to reference resolution.
static java.util.HashMap gatherSyntacticCoref(Document doc, java.util.Vector mentions)
          gatherSyntacticCoref looks for particular syntactic patterns in the text which indicate coreference, and returns a Map with one entry for each such syntactic coreference, linking the anaphor to the antecedent.
static Annotation getHeadC(Annotation ann)
          returns the head constituent associated with constituent 'ann'.
static java.lang.String[] getHeadTokens(Document doc, Annotation constit)
           
static java.lang.String[] getNameTokens(Document doc, Annotation constit)
          returns the name associated with a noun phrase, as an array of token strings, or null if the np does not have a name.
static Annotation getNgHead(Annotation ng)
           
static boolean in(java.lang.Object o, java.lang.Object[] array)
           
static boolean intersect(java.lang.Object[] setA, java.lang.Object[] setB)
           
static int isAbbreviation(java.lang.String[] name, java.lang.String abbrev)
          returns true if 'abbrev' is an acronym-style abbreviation for 'name' -- i.e., an acronym with periods, such as U.S.A.
static int isAcronym(java.lang.String[] name, java.lang.String acronym)
          returns true if 'acronym' is a possible acronym for 'name', such as 'USA' for 'United States of America'.
static boolean isName(Annotation constit)
          returns true if 'constit' is a name.
static int matchFullName(java.lang.String[] mentionName, java.lang.String mentionHead, java.lang.String[] entityName, java.lang.String entityHead)
          returns true if 'mentionName' is a possible reference to 'entityName'.
static boolean matchPronoun(Document doc, Annotation anaphor, java.lang.String mentionHead, Annotation ent)
          returns true if pronoun 'mentionHead' is a possible anaphor for entity 'ent' (this also includes possessive pronouns of category 'det', and headless noun phrases of category 'np').
static boolean nameNomCoref(Document doc, java.lang.String det, java.lang.String mentionHead, Annotation mention, Annotation entity)
          returns true if a common noun phrase headed by 'mentionHead' is a possible anaphoric reference to the (named) entity 'entity'.
static boolean nomInName(Document doc, Annotation mention, Annotation entity)
           
static java.lang.String[] normalizeGazName(java.lang.String[] name, boolean notNP, boolean trace)
          returns a standardized country name, using the gazetteer.
static java.lang.String normalizeName(java.lang.String name)
          replaces whitespace between tokens with a single blank.
static void references(Document doc, Span span)
          Resolve.references resolves the mentions (noun groups) in span of Document doc.
static void references(Document doc, Span span, java.util.Vector mentions)
           
static int sentenceNumber(int posn)
          returns the number of the sentence containing character 'posn'
static void updateEvents(Document doc, Span span)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

trace

public static boolean trace
Constructor Detail

Resolve

public Resolve()
Method Detail

references

public static void references(Document doc,
                              Span span)
Resolve.references resolves the mentions (noun groups) in span of Document doc. It generates entity annotations, corresponding to one or more mentions in the document which are coreferential. In addition, for every event annotation in span, it generates an r-event annotation in which each feature pointing to a mention is replaced by the entity to which that mention has been resolved.
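
The r-event step described above can be pictured with plain maps. This is only a sketch under stated assumptions: REventSketch, toREvent, and the use of java.util.Map for features and for the mention-to-entity table are hypothetical stand-ins, not Jet's Annotation or Document classes.

```java
import java.util.HashMap;
import java.util.Map;

final class REventSketch {
    // Copies the features of an event; any feature value that is a resolved
    // mention is replaced by the entity it was resolved to. Other values are
    // copied unchanged, and the original event map is not modified.
    static Map<String, Object> toREvent(Map<String, Object> eventFeatures,
                                        Map<Object, Object> mentionToEntity) {
        Map<String, Object> rEvent = new HashMap<>();
        for (Map.Entry<String, Object> feature : eventFeatures.entrySet()) {
            Object value = feature.getValue();
            Object resolved = mentionToEntity.containsKey(value)
                    ? mentionToEntity.get(value)
                    : value;
            rEvent.put(feature.getKey(), resolved);
        }
        return rEvent;
    }
}
```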


references

public static void references(Document doc,
                              Span span,
                              java.util.Vector mentions)

gatherMentions

public static java.util.Vector gatherMentions(Document doc,
                                              Span span)
collects and returns the set of all mentions -- constituents which are subject to reference resolution. This includes all noun phrases, possessive pronouns, and (for ACE) names. To avoid duplication, we exclude NPs which are the head constituent of other NPs (so that, for example, if we have "the man in the chair" as a mention, we exclude "the man" as a separate mention).
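
The head-exclusion rule can be illustrated as follows. This is a minimal sketch, assuming a hypothetical Np class with a 'head' link; it does not use Jet's Annotation class and is not the library's implementation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class MentionFilterSketch {
    // Hypothetical stand-in for a noun-phrase constituent.
    static final class Np {
        final String text;
        final Np head; // the NP serving as head constituent, or null
        Np(String text, Np head) {
            this.text = text;
            this.head = head;
        }
    }

    // Keeps only NPs that are not the head constituent of another NP in the list.
    static List<Np> excludeHeadNps(List<Np> nps) {
        Set<Np> heads = new HashSet<>(); // identity-based, since Np does not override equals
        for (Np np : nps) {
            if (np.head != null) {
                heads.add(np.head);
            }
        }
        List<Np> mentions = new ArrayList<>();
        for (Np np : nps) {
            if (!heads.contains(np)) {
                mentions.add(np);
            }
        }
        return mentions;
    }
}
```

With "the man" as the head of "the man in the chair", only the larger phrase survives the filter.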


gatherSyntacticCoref

public static java.util.HashMap gatherSyntacticCoref(Document doc,
                                                     java.util.Vector mentions)
gatherSyntacticCoref looks for particular syntactic patterns in the text which indicate coreference, and returns a Map with one entry for each such syntactic coreference, linking the anaphor to the antecedent.


updateEvents

public static void updateEvents(Document doc,
                                Span span)

normalizeGazName

public static java.lang.String[] normalizeGazName(java.lang.String[] name,
                                                  boolean notNP,
                                                  boolean trace)
returns a standardized country name, using the gazetteer. If 'name' is a variant country name or a country adjective ('French'), returns the standard country name.


getNgHead

public static Annotation getNgHead(Annotation ng)

nameNomCoref

public static boolean nameNomCoref(Document doc,
                                   java.lang.String det,
                                   java.lang.String mentionHead,
                                   Annotation mention,
                                   Annotation entity)
returns true if a common noun phrase headed by 'mentionHead' is a possible anaphoric reference to the (named) entity 'entity'. Three cases are allowed:
1. if the head is one of the words in the name of the entity ("the Security Council" ... "the council")
2. if the head is one of the words in the first mention of the entity ("President Abe Lincoln" ... "the president")
3. if the name of the entity is the name of a country or nationality, and the head is "country", "nation", or "government"
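
The three cases can be sketched over plain token arrays. This is an illustration only: NameNomCorefSketch, possibleNominalReference, and its parameters are hypothetical, the comparison is assumed to ignore case, and the third case is read here as applying when the entity itself is a country or nationality. The real method works on Document and Annotation objects and also takes a determiner, which this sketch ignores.

```java
final class NameNomCorefSketch {
    private static boolean containsIgnoreCase(String[] tokens, String word) {
        for (String token : tokens) {
            if (token.equalsIgnoreCase(word)) {
                return true;
            }
        }
        return false;
    }

    static boolean possibleNominalReference(String mentionHead,
                                            String[] entityName,
                                            String[] firstMentionTokens,
                                            boolean entityIsCountryOrNationality) {
        // Case 1: the head is one of the words in the entity's name.
        if (containsIgnoreCase(entityName, mentionHead)) {
            return true;
        }
        // Case 2: the head is one of the words in the entity's first mention.
        if (containsIgnoreCase(firstMentionTokens, mentionHead)) {
            return true;
        }
        // Case 3: the entity is a country or nationality and the head is a generic country noun.
        String[] countryNouns = {"country", "nation", "government"};
        return entityIsCountryOrNationality && containsIgnoreCase(countryNouns, mentionHead);
    }
}
```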


nomInName

public static boolean nomInName(Document doc,
                                Annotation mention,
                                Annotation entity)

matchFullName

public static int matchFullName(java.lang.String[] mentionName,
                                java.lang.String mentionHead,
                                java.lang.String[] entityName,
                                java.lang.String entityHead)
returns true if 'mentionName' is a possible reference to 'entityName'. The test succeeds if the tokens in 'mentionName' are a subset of those in 'entityName', occurring in the same order as they do in 'entityName'. Comparisons are done ignoring case because a name may appear in all caps in the dateline (BAGHDAD vs. Baghdad) and may be capitalized differently at the beginning of a sentence (Sergio de Mello vs. De Mello).
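
The ordered-subset test described above amounts to a case-insensitive subsequence check. The sketch below is an assumption-laden illustration: FullNameMatchSketch and isOrderedSubset are hypothetical names, the result is a boolean rather than the int the real method returns, and the 'mentionHead' and 'entityHead' arguments are not modeled.

```java
final class FullNameMatchSketch {
    // Returns true if every token of mentionName occurs in entityName,
    // in the same relative order, comparing tokens without regard to case.
    // An empty mentionName trivially passes in this sketch.
    static boolean isOrderedSubset(String[] mentionName, String[] entityName) {
        int e = 0; // next position in entityName still available for matching
        for (String mentionToken : mentionName) {
            while (e < entityName.length && !entityName[e].equalsIgnoreCase(mentionToken)) {
                e++;
            }
            if (e == entityName.length) {
                return false; // ran out of entity tokens before matching this one
            }
            e++; // consume the matched entity token
        }
        return true;
    }
}
```

Under this check, {"De", "Mello"} matches {"Sergio", "de", "Mello"}, while {"Mello", "Sergio"} does not, because the order differs.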


isAcronym

public static int isAcronym(java.lang.String[] name,
                            java.lang.String acronym)
returns true if 'acronym' is a possible acronym for 'name', such as 'USA' for 'United States of America'. The test succeeds if the letters of 'acronym' are a subset of the initial letters of the tokens of 'name', appearing in the same order as in 'name'. The acronym must be at least 2 letters long.
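
A minimal sketch of the acronym test follows. AcronymSketch and isPossibleAcronym are hypothetical names, the boolean result stands in for the int the real method returns, and comparing letters without regard to case is an assumption not stated in the description.

```java
final class AcronymSketch {
    // Returns true if the letters of 'acronym' can be matched, in order,
    // against the initial letters of some of the tokens of 'name'.
    static boolean isPossibleAcronym(String[] name, String acronym) {
        if (acronym == null || acronym.length() < 2) {
            return false; // the acronym must be at least 2 letters long
        }
        int t = 0; // next token of 'name' still available for matching
        for (int i = 0; i < acronym.length(); i++) {
            char wanted = Character.toUpperCase(acronym.charAt(i));
            while (t < name.length
                    && (name[t].isEmpty() || Character.toUpperCase(name[t].charAt(0)) != wanted)) {
                t++; // skip tokens such as "of" that contribute no letter
            }
            if (t == name.length) {
                return false;
            }
            t++;
        }
        return true;
    }
}
```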


isAbbreviation

public static int isAbbreviation(java.lang.String[] name,
                                 java.lang.String abbrev)
returns true if 'abbrev' is an acronym-style abbreviation for 'name' -- i.e., an acronym with periods, such as U.S.A. for 'United States of America'.
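
One way to picture the dotted form is to require a period after every letter, strip the periods, and then apply the same initial-letter test used for acronyms. This is a sketch under assumptions: AbbreviationSketch and isDottedAcronym are hypothetical names, a trailing period is assumed to be required, and the boolean result stands in for the int the real method returns.

```java
final class AbbreviationSketch {
    static boolean isDottedAcronym(String[] name, String abbrev) {
        // Assumed shape: two or more letters, each followed by a period ("U.S.A.").
        if (abbrev == null || !abbrev.matches("(\\p{L}\\.){2,}")) {
            return false;
        }
        String letters = abbrev.replace(".", "");
        int t = 0; // next token of 'name' still available for matching
        for (char raw : letters.toCharArray()) {
            char wanted = Character.toUpperCase(raw);
            while (t < name.length
                    && (name[t].isEmpty() || Character.toUpperCase(name[t].charAt(0)) != wanted)) {
                t++;
            }
            if (t == name.length) {
                return false;
            }
            t++;
        }
        return true;
    }
}
```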


matchPronoun

public static boolean matchPronoun(Document doc,
                                   Annotation anaphor,
                                   java.lang.String mentionHead,
                                   Annotation ent)
returns true if pronoun 'mentionHead' is a possible anaphor for entity 'ent' (this also includes possessive pronouns of category 'det', and headless noun phrases of category 'np').


normalizeName

public static java.lang.String normalizeName(java.lang.String name)
replaces whitespace between tokens with a single blank.
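
The effect can be approximated with a regular expression. This is a sketch, not the library's code: NormalizeNameSketch is a hypothetical name, and trimming leading and trailing whitespace is an assumption, since the description only covers whitespace between tokens.

```java
final class NormalizeNameSketch {
    // Collapses each run of whitespace (blanks, tabs, newlines) into one blank.
    static String normalizeName(String name) {
        return name.trim().replaceAll("\\s+", " ");
    }
}
```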


concat

public static java.lang.String concat(java.lang.String[] s)
append strings in 's', separated by blanks
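
An equivalent result can be obtained with the standard library's String.join. The class name ConcatSketch is hypothetical, and the behavior for an empty array (an empty string) is that of String.join, not necessarily that of Resolve.concat.

```java
final class ConcatSketch {
    // Joins the strings in 's' with a single blank between neighbors.
    static String concat(String[] s) {
        return String.join(" ", s);
    }
}
```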


isName

public static boolean isName(Annotation constit)
returns true if 'constit' is a name.


getHeadC

public static Annotation getHeadC(Annotation ann)
returns the head constituent associated with constituent 'ann'.


getNameTokens

public static java.lang.String[] getNameTokens(Document doc,
                                               Annotation constit)
returns the name associated with a noun phrase, as an array of token strings, or null if the np does not have a name.


getHeadTokens

public static java.lang.String[] getHeadTokens(Document doc,
                                               Annotation constit)

in

public static boolean in(java.lang.Object o,
                         java.lang.Object[] array)

intersect

public static boolean intersect(java.lang.Object[] setA,
                                java.lang.Object[] setB)

equalArray

public static boolean equalArray(java.lang.Object[] first,
                                 java.lang.Object[] second)

sentenceNumber

public static int sentenceNumber(int posn)
returns the number of the sentence containing character 'posn'