Sentence Splitter
action name | sentenceSplit
resources required | none
properties | none
annotations required | none
annotations added | sentence
The sentence splitter uses a detailed set of hand-coded rules (adapted
from Sekine's OAK system) to divide a span of text into
sentences. It adds sentence annotations
spanning each sentence.
A question mark or exclamation mark always ends a sentence. A
period followed by an upper-case letter generally ends a sentence, but
there are a number of exceptions. For example, if the period is
part of an abbreviated title ("Mr.", "Gen.", ...), it does not end a
sentence. A period following a single capitalized letter is
assumed to be a person's initial, and is not considered the end of a
sentence.
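The rules described above can be sketched roughly as follows. This is a minimal illustration only, not the splitter's actual rule set, which is far more detailed. The names TITLES, split_sentences, and the helper functions are hypothetical, and the short title list is an assumed sample.

```python
import re

# Hypothetical sample of abbreviated titles; the real list is much longer.
TITLES = {"Mr.", "Mrs.", "Ms.", "Dr.", "Gen.", "Prof."}


def _is_initial(text, dot):
    """True if the period at index `dot` follows a single capital letter."""
    if dot == 0 or not text[dot - 1].isupper():
        return False
    # The capital must stand alone: nothing, or a non-letter, before it.
    return dot == 1 or not text[dot - 2].isalpha()


def _ends_sentence(text, start, pos):
    """Decide whether the punctuation mark at index `pos` ends a sentence."""
    if text[pos] in "?!":
        return True  # question and exclamation marks always end a sentence
    following = text[pos + 1:].lstrip()
    if not following or not following[0].isupper():
        return False  # period not followed by an upper-case letter
    token = text[start:pos + 1].split()[-1]  # word that carries the period
    if token in TITLES:
        return False  # abbreviated title such as "Mr." or "Gen."
    return not _is_initial(text, pos)  # "J." is treated as an initial


def split_sentences(text):
    """Return (start, end) character spans, one per sentence."""
    spans = []
    start = 0
    for match in re.finditer(r"[.?!]", text):
        pos = match.start()
        if _ends_sentence(text, start, pos):
            spans.append((start, pos + 1))
            start = pos + 1
    if text[start:].strip():
        spans.append((start, len(text)))  # whatever remains is the last sentence
    trimmed = []
    for s, e in spans:
        # Drop surrounding whitespace so each span covers only the sentence.
        while text[s].isspace():
            s += 1
        while text[e - 1].isspace():
            e -= 1
        trimmed.append((s, e))
    return trimmed
```

For the input "Mr. Smith met Gen. Lee. J. R. Jones left early! Did he? Yes." this sketch yields four spans covering "Mr. Smith met Gen. Lee.", "J. R. Jones left early!", "Did he?", and "Yes.". Returning character spans rather than strings mirrors the idea of annotations that span each sentence.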