CSCI-GA.2590 - Natural Language Processing - Spring 2013 Prof. Grishman
Lecture 10 Outline
April 9, 2013
Term projects: the importance of evaluation; separating development
and test data; collecting data for interactive tasks ("Wizard of Oz" methods).
For sentiment: alternative indicators; fine-grained analysis.
Reading: Bo Pang and Lillian Lee, "Opinion Mining and Sentiment Analysis" (2008)
Our focus until now has been on extracting objective information -- "the facts".
But there is a strong interest in extracting subjective information
as well -- people's opinions about things and about other people.
The output may be a binary classification, a document ranking,
or a classification with respect to particular features / issues.
- review of products and services -- useful both for customers
and as a source of business intelligence [a finer grain
is typically needed for business intelligence]
- political opinions about issues and people
Particularly for product and service reviews, there is now a vast
amount of coarse-grained labeled data (typically 1-to-5-star reviews).
This permits training a simple Naive Bayes classifier using individual
words as features:
    s' = argmax_s P(s | w1 ... wn) = argmax_s P(w1 ... wn | s) P(s)
       = argmax_s P(s) Π_i P(wi | s)
(other classifiers such as Support Vector Machines can also be used).
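The Naive Bayes formula above can be sketched in a few lines of Python. The tiny training set below is hypothetical; a real system would train on thousands of star-rated reviews. Add-one smoothing and log probabilities are standard practical choices, not part of the formula itself.

```python
import math
from collections import Counter, defaultdict

# Tiny hypothetical labeled corpus; real training data would be
# star-rated reviews in the thousands.
train = [
    ("pos", "great phone excellent battery"),
    ("pos", "excellent screen great value"),
    ("neg", "terrible battery poor screen"),
    ("neg", "poor value terrible phone"),
]

# Per-class word counts, class counts, and the vocabulary.
word_counts = defaultdict(Counter)
class_counts = Counter()
vocab = set()
for label, text in train:
    class_counts[label] += 1
    for w in text.split():
        word_counts[label][w] += 1
        vocab.add(w)

def classify(text):
    """Return argmax_s P(s) * prod_i P(w_i | s), with add-one
    smoothing, computed in log space to avoid underflow."""
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        score = math.log(class_counts[label] / sum(class_counts.values()))
        total = sum(word_counts[label].values())
        for w in text.split():
            score += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

print(classify("great battery"))   # → pos
```

In effect, training assigns each word a per-class weight; words like "great" and "terrible" end up as the strong indicators the notes describe.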
Training such a classifier in effect identifies particular words as
strong indicators of positive or negative opinion.
Such corpus-trained methods often work better than trying to select
indicator words by hand. Good indicators are often far from obvious.
Furthermore, the significance and polarity of many words will be
domain and sometimes product specific.
Part-of-speech information may help -- distinguishing nouns and adjectives
with the same spelling serves as a simple form of word sense disambiguation.
More refined indicators (bigrams, dependency bigrams, ...) have had mixed results.
In some cases training data is not available -- for low-resource
languages, for new domains, for specific aspects of products.
Semi-supervised methods can then be used to build lists of indicators.
These methods start with a seed set. However, just expanding the
seeds based on distributional similarity may lead to errors (why?),
so other evidence of similar indicators is required.
Hatzivassiloglou and McKeown used conjunction patterns ...
"X and Y" (for adjectives X and Y) suggests that X and Y have the same
polarity, while "X but Y" suggests they have opposite polarity.
WordNet and machine-readable dictionaries have also been used to
find related indicators.
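The conjunction-pattern idea can be sketched as a simple propagation loop. The seed words and (X, conjunction, Y) pairs below are hypothetical; in a real system the pairs would be extracted by parsing a large corpus, and conflicting evidence would be resolved by counting rather than by first-come-first-served.

```python
# Sketch of Hatzivassiloglou & McKeown-style polarity propagation:
# "X and Y" -> same polarity, "X but Y" -> opposite polarity.
seeds = {"good": +1, "bad": -1}

# Hypothetical (X, conjunction, Y) triples observed in text.
pairs = [
    ("good", "and", "reliable"),
    ("reliable", "but", "slow"),
    ("bad", "and", "slow"),
    ("good", "but", "expensive"),
]

polarity = dict(seeds)
changed = True
while changed:                      # propagate until no new labels appear
    changed = False
    for x, conj, y in pairs:
        same = (conj == "and")
        for a, b in ((x, y), (y, x)):
            if a in polarity and b not in polarity:
                polarity[b] = polarity[a] if same else -polarity[a]
                changed = True

print(polarity)
```

Starting from just "good" and "bad", the loop labels "reliable" positive and "slow" and "expensive" negative, illustrating how a small seed set can be grown from corpus evidence.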
Discourse. Until now we considered the structure and
meaning of sentences in isolation. We now turn to issues
primarily connected with multi-sentence text -- discourse.
Reference Resolution (J&M 21.3-8)
- referent: real-world object being referred to
- referring expression: a portion of text referring
to that object (we shall also refer to these as mentions of the object)
- discourse entities: the set of objects referred
to by a text
- coreference: two expressions referring to the same entity
- the first expression -- the antecedent
-- licenses the use of the second -- the anaphor
- the first mention of an object in a discourse evokes the corresponding discourse entity
Types of referring expressions
- definite pronouns (he, she, it, ...): generally anaphoric
- but 'it' has non-referring usages: "It is raining."
"It is unlikely that he will come."
- and can represent a bound variable in quantified contexts
- indefinite pronouns (one):
- refers to another entity with the same properties as the antecedent
- can be modified ('the green one')
- definite NPs (the car):
- reference to uniquely identifiable entity
- generally anaphoric
- but unique common nouns ('the sun') and functional NPs ('the sum of ...') need not be anaphoric
- indefinite NPs (a car): generally evoke a new discourse entity
- may also be generic: "Giraffes are beautiful creatures."
- names: named entities can be later referred to by portions of the full name
- inferrables: sometimes the relation between anaphor and
antecedent is not one of identity ...
"I entered the room and looked at the ceiling."
- zero anaphora: sometimes the anaphor is implicit
- many languages allow subject omission, and some allow omission of
arguments (e.g., Japanese)
- some cases of inferrable anaphora can be described in terms of PPs
"IBM announced the appointment of Fred as president [of
- expressions can also refer to events, propositions, ...
- "Fred claimed that no one programs in Lisp. That
Resolving pronoun reference
Pronoun resolution is generally based on constraints and preferences:
- constraints: number and gender agreement
- need to identify names by gender (using the first name)
- some common nouns are also gender-specific
- some syntactically singular nouns can be referred to by plural pronouns
- recency -- the entity must be salient (recently mentioned)
- grammatical role (reference to subjects is preferred over objects)
- implementation: associate a score with the preferences;
select the antecedent of highest score satisfying the constraints
(can incorporate preferences into the search order -- Hobbs' search order)
- learn the weights of the different factors using a training corpus
- selectional preferences
- is the antecedent likely to occur in the context of the pronoun?
- can be learned from a large corpus
- accuracy is fairly good, ranging from the mid-70s% to 80s%,
depending on the text
One of the first procedures for resolving pronouns statistically
(Ge, Hale, and Charniak WVLC 1998) maximized the product of four factors:
- probability of the correct antecedent at [Hobbs] distance i
- probability of the pronoun given the antecedent word
- probability of the antecedent given its mention count
- probability of the antecedent word given the pronoun context
(the head constituent above the pronoun and the type of that head constituent)
These factors together gave about 83% accurate resolution.
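The constraint-and-preference scheme can be sketched as a toy resolver: filter candidate antecedents by agreement constraints, then score the survivors by recency and grammatical role. The mention features and weights below are hypothetical; a real system would learn the weights from a training corpus and add selectional preferences.

```python
from dataclasses import dataclass

@dataclass
class Mention:
    text: str
    sentence: int      # sentence index (larger = more recent)
    number: str        # "sg" or "pl"
    gender: str        # "m", "f", or "n"
    role: str          # "subj" or "obj"

def resolve(pronoun, candidates):
    # Constraints: number and gender must agree.
    compatible = [c for c in candidates
                  if c.number == pronoun.number and c.gender == pronoun.gender]
    # Preferences: recency, plus a (hypothetical) bonus for subject role.
    def score(c):
        recency = -(pronoun.sentence - c.sentence)   # closer = higher
        role_bonus = 1 if c.role == "subj" else 0
        return recency + role_bonus
    return max(compatible, key=score, default=None)

mentions = [
    Mention("Fred",    0, "sg", "m", "subj"),
    Mention("the car", 0, "sg", "n", "obj"),
    Mention("Mary",    1, "sg", "f", "subj"),
]
he = Mention("he", 1, "sg", "m", "subj")
print(resolve(he, mentions).text)    # → Fred
```

Here the gender constraint alone eliminates "the car" and "Mary", so "he" resolves to "Fred"; with several compatible candidates, the recency and role scores would decide.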
Resolving other referring expressions
- names: generally quite straightforward -- look for a prior name
of which this is a substring
- common noun phrases: generally quite hard
- deciding if an NP is anaphoric
- deciding if an NP description is consistent with an antecedent
(for example, we may use different nouns to describe the
entity -- "the soldier", "the Marine", etc.)
- can refer to a named entity by a noun which is part of the name
- other name--noun pairs can be collected from corpora (e.g., from appositive constructions)
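The substring rule for names can be sketched in a few lines. The matching here is character-level for brevity; token-level matching (and checks that the shorter form is a plausible name fragment) would be more robust in practice.

```python
# Sketch of name coreference by substring matching: link a later name
# mention to the most recent earlier name that contains it.
def link_name(mention, prior_names):
    """Return the most recent prior name of which `mention` is a substring."""
    for name in reversed(prior_names):
        if mention in name:
            return name
    return None    # no antecedent: the mention evokes a new entity

names_so_far = ["Fred Smith", "General Motors"]
print(link_name("Smith", names_so_far))   # → Fred Smith
print(link_name("Ford", names_so_far))    # → None
```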
Anaphora resolution in Jet
- assumes the only referring expressions are noun groups
- generates entity annotations corresponding to discourse entities
- anaphora resolution consists of linking each noun group to a new
or existing entity
- simple rule for common noun phrases -- only checks for a matching head
- performed by resolve operation (typically done after
all pattern matching)
- results are displayed in a separate entity window (which
appears, along with the regular document viewer, if entities have been generated)
- both hard-coded and corpus-trained (maxent) pronoun resolvers are provided
Using anaphora resolution for extraction: an example
In many cases, we want to be able to retrieve an argument from context
when it is not part of the immediate syntactic structure. A simple way
of doing this is to generate a zero anaphor (an ngroup constituent not
spanning any text) and then let reference resolution map it to an entity.
We have created a version of the
AppointPatterns which uses this method to collect
organization names and, in some cases, people names.