when pattern, action, action, ...

The basic top-level pattern-matching operation in JET is the application of a PatternSet to a text segment, by the method PatternSet.apply().
There are two different representations used internally for patterns and rules. The first representation (the pattern and rule representation) corresponds closely to the structure of a pattern file: each pattern is represented separately as a sequence of pattern elements, and a pattern set is represented as a sequence of rules, where each rule consists of a pattern (name) and actions. This representation is used when the pattern file is being read in. It would also be suitable in the future if we provide some facility for editing patterns interactively.
Once the patterns have all been read in (or, in the future, after the patterns are modified), each pattern set is converted to a pattern graph. The graph represents all the patterns in a PatternSet as a single directed graph. Optional elements, repeated elements, and references within one pattern to other patterns are "expanded" in the graph; this simplifies the process of pattern matching. More importantly, however, the graph is created with a view to graph optimizations which may be performed in the future. Identical arcs leading from a node can be merged into a single arc (this corresponds to identifying common pattern prefixes). In addition, if a large number of arcs leading from a single node match different strings, these can be reduced to a hash table, so that the current token is looked up once rather than matched sequentially against each string.
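The hash-table optimization described above might look roughly like the following. This is only a sketch of the idea, not JET code: the class StringArcTable, its methods, and the use of integer node ids are all assumptions made for illustration, and identical arcs are assumed to have been merged already.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch (not part of JET): a node whose outgoing arcs each
// match one literal string, stored as a hash table instead of a list.
class StringArcTable {
    // Maps the string an arc matches to the id of the node the arc leads to.
    private final Map<String, Integer> targetByString = new HashMap<>();

    // Registers an arc; an arc with the same string replaces the earlier one,
    // which assumes identical arcs have already been merged.
    void addArc(String text, int targetNode) {
        targetByString.put(text, targetNode);
    }

    // Returns the target node for the current token, or -1 if no arc matches.
    // This is a single lookup rather than one comparison per arc.
    int follow(String token) {
        Integer target = targetByString.get(token);
        return target == null ? -1 : target;
    }

    public static void main(String[] args) {
        StringArcTable node = new StringArcTable();
        node.addArc("Mr.", 1);
        node.addArc("Dr.", 2);
        System.out.println(node.follow("Dr."));
        System.out.println(node.follow("Ms."));
    }
}
```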
For example, the pattern

p1 p2 | p3*

is represented as a PatternAlternation with two alternatives: the first alternative ("p1 p2") is a PatternSequence with two elements, p1 and p2, while the second alternative ("p3*") is a PatternRepetition.
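The nesting in this example can be made concrete with a small sketch. The classes below are simplified stand-ins that borrow the names used in the text; their fields, constructors, and the describe method are assumptions, and NamedPattern is a hypothetical leaf class standing for a reference to a pattern such as p1.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Simplified stand-ins for the pattern element classes; not the JET sources.
abstract class PatternElement {
    abstract String describe();

    // Helper: comma-separated descriptions of a list of elements.
    static String describeAll(List<PatternElement> elements) {
        return elements.stream()
                .map(PatternElement::describe)
                .collect(Collectors.joining(", "));
    }
}

// Hypothetical leaf: a reference to a named pattern such as p1.
class NamedPattern extends PatternElement {
    final String name;
    NamedPattern(String name) { this.name = name; }
    String describe() { return name; }
}

class PatternSequence extends PatternElement {
    final List<PatternElement> elements;
    PatternSequence(PatternElement... elements) { this.elements = Arrays.asList(elements); }
    String describe() { return "Sequence(" + describeAll(elements) + ")"; }
}

class PatternAlternation extends PatternElement {
    final List<PatternElement> options;
    PatternAlternation(PatternElement... options) { this.options = Arrays.asList(options); }
    String describe() { return "Alternation(" + describeAll(options) + ")"; }
}

class PatternRepetition extends PatternElement {
    final PatternElement body;
    PatternRepetition(PatternElement body) { this.body = body; }
    String describe() { return "Repetition(" + body.describe() + ")"; }
}

class PatternTreeSketch {
    // Builds the tree for: p1 p2 | p3*
    static PatternElement example() {
        return new PatternAlternation(
                new PatternSequence(new NamedPattern("p1"), new NamedPattern("p2")),
                new PatternRepetition(new NamedPattern("p3")));
    }

    public static void main(String[] args) {
        // Prints: Alternation(Sequence(p1, p2), Repetition(p3))
        System.out.println(example().describe());
    }
}
```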
Every class of PatternElement has a toGraph method for converting the element into a pattern graph.
Associated with PatternNodes is an eval method which matches the graph rooted at that node against the text. The eval method on a node invokes the eval method on each PatternArc leaving the node. The eval method on the arc invokes in turn the eval method on the AtomicPatternElement associated with the arc; the latter eval methods actually test the document (for the presence of a particular token, for example).
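The chain of eval calls (node to arc to atomic element) can be sketched as follows. This is an illustration under stated assumptions, not the JET implementation: SketchNode, SketchArc, AtomicElement, and TokenElement are hypothetical names, the text is modeled as a list of token strings, and a match is reported simply as the token position at which a final node is reached.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an atomic pattern element: tests the text at pos.
interface AtomicElement {
    // Returns the position after the match, or -1 if there is no match.
    int eval(List<String> tokens, int pos);
}

// Matches one specific token.
class TokenElement implements AtomicElement {
    final String text;
    TokenElement(String text) { this.text = text; }
    public int eval(List<String> tokens, int pos) {
        return pos < tokens.size() && tokens.get(pos).equals(text) ? pos + 1 : -1;
    }
}

// An arc carries one atomic element and leads to a target node.
class SketchArc {
    final AtomicElement element;
    final SketchNode target;
    SketchArc(AtomicElement element, SketchNode target) {
        this.element = element;
        this.target = target;
    }
    // Tests the element; on success, continues matching from the target node.
    void eval(List<String> tokens, int pos, List<Integer> ends) {
        int next = element.eval(tokens, pos);
        if (next >= 0) target.eval(tokens, next, ends);
    }
}

// A node evaluates every arc leaving it; a final node records a match.
class SketchNode {
    final List<SketchArc> arcs = new ArrayList<>();
    boolean isFinal;
    void eval(List<String> tokens, int pos, List<Integer> ends) {
        if (isFinal) ends.add(pos);
        for (SketchArc arc : arcs) arc.eval(tokens, pos, ends);
    }
}

class GraphEvalSketch {
    // Graph for zero or more "very" followed by "good": the repetition is
    // expanded into an arc that loops back to the start node.
    static List<Integer> match(List<String> tokens) {
        SketchNode start = new SketchNode();
        SketchNode end = new SketchNode();
        end.isFinal = true;
        start.arcs.add(new SketchArc(new TokenElement("very"), start));
        start.arcs.add(new SketchArc(new TokenElement("good"), end));
        List<Integer> ends = new ArrayList<>();
        start.eval(tokens, 0, ends);
        return ends;
    }

    public static void main(String[] args) {
        // Prints: [3]
        System.out.println(match(Arrays.asList("very", "very", "good")));
    }
}
```

Because every arc in this sketch consumes a token, the recursion through the loop arc always terminates; a graph with arcs that consume nothing would need an extra guard.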