Title: Practical Packrat Parsing


(NYU-CS-TR854)

Author: Robert Grimm



Abstract:

A considerable number of research projects are exploring how
to extend object-oriented programming languages such as Java with, for
example, support for generics, multiple dispatch, or pattern matching.
To keep up with these changes, language implementors need appropriate
tools.  In this context, easily extensible parser generators are
especially important because parsing program sources is a necessary
first step for any language processor, be it a compiler,
syntax-highlighting editor, or API documentation generator.
Unfortunately, context-free grammars and the corresponding LR or LL
parsers, while well understood and widely used, are also unnecessarily
hard to extend.  To address this lack of appropriate tools, we
introduce Rats!, a parser generator for Java that supports easily
modifiable grammars and avoids the complexities associated with
altering LR or LL grammars.  Our work builds on recent research on
packrat parsers, which are recursive descent parsers that perform
backtracking but also memoize all intermediate results (hence their
name), thus ensuring linear-time performance.  Our work makes this
parsing technique, which has been developed in the context of
functional programming languages, practical for object-oriented
languages.  Furthermore, our parser generator supports simpler grammar
specifications and more convenient error reporting, while also
producing better performing parsers through aggressive optimizations.
In this paper, we motivate the need for more easily extensible
parsers, describe our parser generator and its optimizations in
detail, and present the results of our experimental evaluation.