Sentence Splitter


action name            sentenceSplit
resources required     none
properties             none
annotations required   none
annotations added      sentence

The sentence splitter uses a detailed set of hand-coded rules (adapted from Sekine's OAK system) to divide a span of text into sentences.  It adds sentence annotations spanning each sentence.

A question mark or exclamation mark always ends a sentence.  A period followed by an upper-case letter generally ends a sentence, but there are a number of exceptions.  For example, if the period is part of an abbreviated title ("Mr.", "Gen.", ...), it does not end a sentence.  Likewise, a period following a single capital letter is assumed to mark a person's initial and is not treated as the end of a sentence.
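
The rules described above can be illustrated with a minimal Python sketch. This is not the splitter's actual rule set, which is considerably more detailed; it covers only the four rules mentioned here. The function name split_sentences, the TITLES list, and the requirement that whitespace separate the period from the following upper-case letter are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately short list of abbreviated titles;
# the real rule set is far more extensive.
TITLES = {"Mr", "Mrs", "Ms", "Dr", "Gen", "Prof"}


def split_sentences(text):
    """Return a list of (start, end) character spans, one per sentence."""
    spans = []
    start = 0
    for m in re.finditer(r"[.?!]", text):
        end = m.end()
        if m.group() == ".":
            # A period is a candidate boundary only when whitespace and
            # an upper-case letter follow it (assumed interpretation).
            if not re.match(r"\s+[A-Z]", text[end:]):
                continue
            # Letters immediately preceding the period.
            prev = re.search(r"[A-Za-z]*$", text[:m.start()]).group()
            # Exception 1: abbreviated title such as "Mr." or "Gen."
            if prev in TITLES:
                continue
            # Exception 2: single capital letter, assumed to be an initial.
            if len(prev) == 1 and prev.isupper():
                continue
        # "?" and "!" always end a sentence; so does a period that
        # survived the checks above.
        spans.append((start, end))
        # The next sentence starts at the next non-whitespace character.
        start = end
        while start < len(text) and text[start].isspace():
            start += 1
    # Whatever remains after the last boundary forms a final sentence.
    last = len(text.rstrip())
    if start < last:
        spans.append((start, last))
    return spans


if __name__ == "__main__":
    sample = "Mr. Smith met Gen. J. Doe. He was late! Why? Nobody knows."
    for s, e in split_sentences(sample):
        print(repr(sample[s:e]))
```

Returning character spans rather than strings mirrors the way the splitter adds sentence annotations over spans of the original text instead of copying it.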