March 27, 2003

Semantic Analysis

Mapping Syntax to Semantics (J&M Chapter 15)

We want to compute the semantic representation of a sentence from the parse tree.  We could embed this translation in a procedure but, as in the case of parsing, it will be easier to develop and maintain this translation if it is rule driven.  Furthermore, because the parse tree provides a structural framework, we will use a compositional, syntax-driven translation process.  This means that we will associate a (partial) semantic interpretation with each node of the parse tree, to be computed (using a rule) from the interpretations of its children.

The grammar will be extended to add a SEM feature, representing the semantic interpretation of a node.  Each production will then incorporate the rule for computing its SEM value, and the SEM of the root will be the interpretation of the sentence (J&M p. 549).

The semantics of a verb phrase is essentially the semantics of a clause with one argument (the subject) missing: a predicate with one unbound argument.  We can represent this by a lambda expression (J&M p. 551).  Lambda expressions are commonly used to capture the rules for composing the semantics.
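
As a minimal sketch of this composition (not the notation used in J&M), we can mimic the SEM rules in Python, with Python lambdas standing in for the lambda expressions.  The tiny lexicon, the tuple encoding of parse trees, and the function name sem are all assumptions made for illustration.

```python
# Hypothetical lexicon: the SEM value of each word.
# Verbs are lambda expressions awaiting their arguments.
LEXICON = {
    "Mary": "Mary",
    "John": "John",
    "sleeps": lambda subj: f"Sleep({subj})",
    "likes": lambda obj: lambda subj: f"Like({subj}, {obj})",
}


def sem(tree):
    """Compute the SEM of a node from the SEMs of its children.

    A tree is (label, word) for a leaf or (label, child, ...) otherwise.
    """
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return LEXICON[children[0]]          # lexical lookup
    kids = [sem(child) for child in children]
    if label == "S":                         # S -> NP VP : apply VP.SEM to NP.SEM
        np_sem, vp_sem = kids
        return vp_sem(np_sem)
    if label == "VP" and len(kids) == 2:     # VP -> V NP : apply V.SEM to NP.SEM
        v_sem, np_sem = kids
        return v_sem(np_sem)
    if len(kids) == 1:                       # unary rule: pass the SEM up
        return kids[0]
    raise ValueError(f"no semantic rule for {label}")


tree = ("S", ("NP", "Mary"), ("VP", ("V", "likes"), ("NP", "John")))
print(sem(tree))  # Like(Mary, John)
```

Here the VP "likes John" evaluates to a function still waiting for its subject, which is the unbound argument described above.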

Restricted Quantifiers

For the process of translating syntactic to semantic forms, it is convenient to introduce restricted quantifiers, of the form
(forall x: C(x))
(exists x: C(x))
These do not add any power to predicate calculus; they can be rewritten as
(forall x: C(x)) P(x) = (forall x) (C(x) => P(x))
(exists x: C(x)) P(x) = (exists x) (C(x) & P(x))
Roughly speaking, a noun phrase can be translated to a constant or a restricted quantifier.
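
The two rewrite rules can be sketched in Python as follows.  The function names and the string encoding of formulas are hypothetical; the second function evaluates the expanded forms over a small finite domain, which illustrates the rules but proves nothing in general.

```python
def expand(quantifier, var, restriction, body):
    """Rewrite a restricted quantifier as an ordinary one (formulas are strings)."""
    if quantifier == "forall":
        return f"(forall {var}) ({restriction} => {body})"
    if quantifier == "exists":
        return f"(exists {var}) ({restriction} & {body})"
    raise ValueError(f"unknown quantifier: {quantifier}")


def holds(quantifier, domain, c, p):
    """Evaluate (quantifier x: c(x)) p(x) over a finite domain, using the expanded form."""
    if quantifier == "forall":
        return all((not c(x)) or p(x) for x in domain)   # c(x) => p(x)
    if quantifier == "exists":
        return any(c(x) and p(x) for x in domain)        # c(x) & p(x)
    raise ValueError(f"unknown quantifier: {quantifier}")


print(expand("forall", "x", "C(x)", "P(x)"))  # (forall x) (C(x) => P(x))
print(expand("exists", "x", "C(x)", "P(x)"))  # (exists x) (C(x) & P(x))

is_even = lambda n: n % 2 == 0
print(holds("forall", range(10), is_even, lambda n: n < 9))  # True
print(holds("exists", range(10), is_even, lambda n: n > 8))  # False
```

Note the asymmetry: the universal case expands with an implication, the existential case with a conjunction.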

Complex Terms

One source of ambiguity is quantifier scope:
A woman gives birth in the United States every five minutes.
We can represent the two readings in conventional predicate calculus using different quantifier scopes.  If we explicitly represent all the semantic ambiguities in a sentence in this way, we may have very many readings.  It is therefore practical to initially produce (from the parse) a representation which captures multiple readings … which encodes (some of) the ambiguity. (And hope that this ambiguity can be resolved at a later stage of semantic analysis.)

In particular, we can use complex terms (J&M p. 555)

<Quantifier x: C(x)>
with the understanding that
P(<Quantifier x: C(x)>) = (Quantifier x: C(x)) P(x)
If an expression contains several complex terms, the scope of the quantifiers is indeterminate.  Semantic analysis will generate such quasi-logical forms, with a separate step then determining the quantifier scope and generating a predicate calculus expression.
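
The separate scoping step can be sketched as follows, under strong simplifying assumptions: complex terms are encoded as (quantifier, variable, restriction) tuples, they appear only as direct arguments of a single predicate, and every ordering of the quantifiers counts as a reading.  The function name scopings and the predicate names in the example are hypothetical.

```python
from itertools import permutations


def scopings(pred, args):
    """List every quantifier ordering for a predicate whose arguments
    may be complex terms, written as (quantifier, var, restriction)."""
    terms = [a for a in args if isinstance(a, tuple)]
    # Replace each complex term by its variable inside the predicate.
    core_args = [a[1] if isinstance(a, tuple) else a for a in args]
    core = f"{pred}({', '.join(core_args)})"
    readings = []
    for order in permutations(terms):
        formula = core
        # Wrap from the innermost quantifier outward, so that the
        # first term in `order` ends up with the widest scope.
        for quantifier, var, restriction in reversed(order):
            formula = f"({quantifier} {var}: {restriction}) {formula}"
        readings.append(formula)
    return readings


# "A woman gives birth ... every five minutes."
quasi_logical_form = [
    ("exists", "w", "Woman(w)"),
    ("forall", "t", "FiveMinuteInterval(t)"),
]
for reading in scopings("GiveBirth", quasi_logical_form):
    print(reading)
# (exists w: Woman(w)) (forall t: FiveMinuteInterval(t)) GiveBirth(w, t)
# (forall t: FiveMinuteInterval(t)) (exists w: Woman(w)) GiveBirth(w, t)
```

With n complex terms this produces n! readings, which is why it pays to postpone the choice of scope until later stages can rule some orderings out.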

Event Extraction

Like the syntax-directed semantic analysis just discussed, the analysis performed by Jet for information extraction is based on compositional semantics.  The task for information extraction is somewhat simpler, because we are not building logical forms with quantifiers (we are only interested in specific events).  Each pattern set builds a set of annotations, which carry the semantics as one of the features (the pa feature in Jet), using the semantics of the annotations from the previous pattern sets.

We present as a simple example the extraction of appointment events (hiring and firing of executives).