This release significantly improves Blink. The Jeannie expression evaluator has been greatly simplified to reduce the number of commands sent to the component debuggers. The intermediate agent has been enhanced to remove hard-coded limitations such as the maximum number of threads and synchronized accesses to shared data structures. This version is the one used for all experiments in the SPE ’14 paper “Debugging Mixed-Environment Programs with Blink” by Byeongcheol Lee, Martin Hirzel, Robert Grimm, and Kathryn S. McKinley.
This release also improves SuperC's AST node names, fixes various bugs in AST creation, and adds further regression tests.
Clang's C structure layout appears to be different from that of gcc, leading to a regression test failure on Mac OS 10.9.
This release significantly improves SuperC again. SuperC's performance has been tuned, the regression tests have been expanded, numerous bugs have been fixed, and the scripts for running and evaluating SuperC have been enhanced. All SuperC code is now released under the GPL version 2.0. This version is the one used for all experiments in the PLDI ’12 paper “SuperC: Parsing All of C by Taming the Preprocessor” by Paul Gazzillo and Robert Grimm.
This release significantly improves SuperC. The parsing algorithm, Fork-Merge LR, has been completely reimplemented. It is now based on the novel token follow-set, which captures the actual variability of static conditionals independent of how they are nested within each other and appended to each other. The parser also includes three optimizations, shared reductions, lazy forking, and early reductions, which further decrease the number of forked subparsers.
Both SuperC preprocessor and parser are now language-independent. To support a new language, a user needs to provide an annotated JFlex lexer definition, an annotated Bison grammar, and, optionally, a Java implementation of semantic actions.
Several new scripts help with running SuperC and collecting experimental data. They include a script to distribute the processing of Linux kernel source files across machines and scripts to compute summary statistics from SuperC's raw data output, e.g., a CDF of the number of subparsers.
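As a rough illustration of the kind of summary statistic mentioned above, the following sketch computes an empirical CDF over a list of subparser counts. It is not one of SuperC's scripts; the class name, method name, and sample data are all made up for this example.

```java
import java.util.Arrays;

final class SubparserCdfSketch {
  /** Returns the fraction of samples that are less than or equal to x. */
  static double cdfAt(int[] samples, int x) {
    if (samples.length == 0) {
      throw new IllegalArgumentException("no samples");
    }
    int count = 0;
    for (int s : samples) {
      if (s <= x) count++;
    }
    return (double) count / samples.length;
  }

  public static void main(String[] args) {
    // Hypothetical subparser counts; real data would come from SuperC's raw output.
    int[] counts = {1, 1, 2, 4};
    int[] sorted = counts.clone();
    Arrays.sort(sorted);
    for (int i = 0; i < sorted.length; i++) {
      // Print each distinct value once, together with its cumulative fraction.
      if (i + 1 < sorted.length && sorted[i] == sorted[i + 1]) continue;
      System.out.println(sorted[i] + " " + cdfAt(counts, sorted[i]));
    }
  }
}
```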
The new SuperC technical manual can be found in src/xtc/lang/cpp. It documents basic SuperC usage, its scripts, and the format of its statistics output. The manual is built by invoking make manual.
This release also fixes a bug in the Jeannie regression test harness, which failed on some Linux distributions. Thanks to Jacob Shufro and Martin Hirzel for their help in identifying and resolving this bug.
This release removes support for type checking the simply typed lambda calculus from xtc.lang.TypedLambda, since it depends on the already discontinued Typical compiler. Thanks to Thomas Huston for identifying this bug.
This release fixes the C structure layout regression tests to not utilize the Typical-generated type checker anymore.
This release also fixes a C type checker regression test to not use a deprecated preprocessor feature anymore. Thanks to Jacob Shufro for identifying this bug.
This release improves SuperC by adding significantly more regression tests and introducing attendant bug fixes. It also adds more code comments and fixes code formatting.
This release adds support for parsing and pretty printing Java 7. To support Java 7's try-with-resources statements, the AST for regular try-catch statements has been changed, even when using parsers for earlier Java versions. The Java analyzer, xtc.lang.JavaAnalyzer, has been updated accordingly. To support Java 7's underscores in numeric literals, the definition of Java constants in Rats!' xtc.lang.JavaConstant has been modified to more closely follow the language specification.
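For reference, the two Java 7 features named above look as follows in source code. This is a small self-contained illustration of the syntax a Java 7 parser has to accept, not code taken from xtc; the class and member names are invented for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

final class Java7SyntaxSketch {
  // Underscores in numeric literals are purely lexical; the values are unchanged.
  static final int MILLION = 1_000_000;
  static final int MASK = 0xFF_FF;

  // try-with-resources: the reader is closed automatically when the block exits.
  static String firstLine(String text) {
    try (BufferedReader in = new BufferedReader(new StringReader(text))) {
      return in.readLine();
    } catch (IOException e) {
      // Not expected for an in-memory reader; rethrown unchecked to keep the sketch short.
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(MILLION);
    System.out.println(MASK);
    System.out.println(firstLine("hello\nworld"));
  }
}
```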
Tokens are now extensible; the corresponding class, xtc.tree.Token, is not final anymore but has become abstract. Rats!' support for parse trees has been updated to utilize the new concrete subclass xtc.tree.TextToken. All parsers distributed with xtc have been updated accordingly.
To support Mac OS X 10.7 (Lion), this release adds support for the stpncpy_chk() builtin function.
Finally, this release removes a residual dependency on the Typical-generated type checker from xtc.lang.C. Thanks to Thomas Huston for identifying this bug.
This release introduces a preview of SuperC, a new tool for parsing C code with arbitrary preprocessor usage. SuperC first lexes C code, then uses a new configuration-preserving preprocessor to resolve all directives modulo conditionals, and finally uses a novel variant of LR parsing to generate a well-formed AST containing static choice nodes for conditionals. The corresponding Java package is xtc.lang.cpp.
This release removes the following unmaintained code: the ANTLR and JavaCC parsers for Java; the C4 compiler; the Overlog compiler; the Typical compiler; and the XForm AST query engine. The last release containing this code is xtc version 1.15.0 (with corresponding testsuite).
This release removes the unnecessary dependency on the "perfctr" library from Jinn. Thanks to Mengtao Sun for pointing out this bug.
src/xtc/lang/blink/agent and the make target is agent. Please direct any feedback to Byeong Lee.
This release also removes the unnecessary analyzers
target from the make file in src/xtc/lang/blink
. Thanks
to Tony Sloane for pointing out this bug.
rats.manifest to the source distribution. Thanks to Tony Sloane for pointing out this omission.
xtc.tree.Visitor and xtc.lang.C to avoid raw type warnings.
limits.c to support 64-bit architectures and eliminate compiler warnings.
include/win32 as an include file path for Cygwin, which is necessary for Sun's JDK 1.6. Thanks to Martin Hirzel for resolving this issue.
xtc.lang.c.ml.Machdep to reflect changed constants in xtc.Limits. Thanks to Mike Chrzanowski for identifying this bug.
Due to many of the above changes, xtc now passes all regression tests on Apple's Mac OS X Snow Leopard (10.6), whose C compiler and Java virtual machine default to 64-bit.
Rats! has been updated as follows:
break statements for character switches in optional expressions. Thanks to Chris Capel and the C# compiler for identifying this bug.
The Jeannie grammar has been updated to eliminate a bug that caused null pointer exceptions. Thanks to Matt Renzelmann for identifying this bug.
The Blink debugger has been updated to perform dynamic consistency
checks on the arguments to JNI functions. For example, it detects
when NULL
is passed to NewStringUTF
and
reports this invalid argument.
Rats! has been updated as follows:
explicit attribute instructs Rats! to generate error messages relative to the marked production's name (and thus ignore any already generated parse errors).
yyValue in recursive alternatives are now rejected, since Rats! cannot deduce their semantic value. Thanks to Chris Capel for raising this issue.
rats.jar. Thanks to Adrian Quark for raising this issue.
C support has been improved as follows:
parsetree option, which preserves all formatting. Thanks to Eric Hielscher for raising this issue.
__thread specifier provided by gcc for the ELF object format. Thanks to Matt Renzelmann for raising this issue and aiding in its resolution.
printFeatures option for xtc.lang.C prints major GCC extensions used by the code being processed.
make configure to recreate the appropriate xtc.Limits for your hardware, OS, and compiler. Thanks to BK Lee's tireless help, configuration now also works with Microsoft's Visual C.
The Blink inter-language debugger has been improved as follows:
Rats! has been updated as follows:
This release introduces Blink, a portable mixed-mode Java/native debugger. It currently supports Sun's Java virtual machine running on the x86 versions of Linux and Cygwin, with support for other JVM, OS and processor configurations under development. Please direct any feedback to BK Lee.
Rats! has been updated as follows:
variant
annotations.
The Typical compiler has been updated to
support fun
expressions, and the translation
of let
expressions has been optimized. Additionally,
bugs in the exhaustiveness checking for match
expressions
and when explicitly matching bottom
have been fixed.
Thanks to Christopher Conway for reporting these bugs.
The Java, Typical, and O'Caml type checkers for C have been updated to:
const char * instead of char *,
To track size, alignment, and offset values, the C type checkers now include a re-engineered version of gcc's structure layout algorithm. The local system's C configuration in xtc.Limits has been improved to support this. Run make configure to recreate the appropriate version for your hardware and operating system.
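To give a feel for what tracking size, alignment, and offset values involves, here is a deliberately simplified sketch. It handles only naturally aligned members and ignores bit-fields, packing attributes, and platform-specific rules, so it is neither gcc's algorithm nor xtc's re-engineered version of it. All names, and the member sizes used in main, are assumptions made for the example.

```java
import java.util.Arrays;

final class StructLayoutSketch {
  /** Rounds offset up to the next multiple of alignment (alignment must be positive). */
  static long alignUp(long offset, long alignment) {
    return (offset + alignment - 1) / alignment * alignment;
  }

  /**
   * Lays out a struct whose i-th member has size sizes[i] and alignment aligns[i].
   * The result holds each member's offset, followed by the struct's total size
   * and the struct's alignment.
   */
  static long[] layout(long[] sizes, long[] aligns) {
    long[] result = new long[sizes.length + 2];
    long offset = 0;
    long maxAlign = 1;
    for (int i = 0; i < sizes.length; i++) {
      offset = alignUp(offset, aligns[i]); // insert padding before the member
      result[i] = offset;
      offset += sizes[i];
      maxAlign = Math.max(maxAlign, aligns[i]);
    }
    result[sizes.length] = alignUp(offset, maxAlign); // trailing padding
    result[sizes.length + 1] = maxAlign;
    return result;
  }

  public static void main(String[] args) {
    // struct { char a; int b; char c; }, assuming a 1-byte char and a 4-byte, 4-aligned int.
    long[] r = layout(new long[] {1, 4, 1}, new long[] {1, 4, 1});
    System.out.println(Arrays.toString(r));
  }
}
```

The actual values depend on the target, which is why the release notes ask you to regenerate the configuration for your own hardware and operating system.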
The syntax for Jeannie top-level compilation
units has changed. The package and import declarations now come
before the initial `.C {…}
block instead of
after it. That way, top-level C code can use simple instead of fully
qualified names when referring to Java entities.
Internally, the Jeannie grammar and AST for array declarators has been updated to create "variable length" nodes, just like the C grammar and AST in release 1.13.0. Furthermore, the compiler has been updated to address several bugs, mostly thanks to helpful reporting by Matt Renzelmann.
Support for Overlog has been extended with a
translator targeting Java. The corresponding runtime is being
developed by Nalini Belaramani at UT Austin; the necessary JAR file is
available here.
Additionally, the Overlog language has been extended with tuple and
function type declarations, the Overlog grammar has been cleaned up,
and a bug in the inference of function return types has been fixed.
The corresponding Java package has been renamed
to xtc.lang.overlog
(from xtc.lang.p2
).
The Jeannie compiler now supports backticked Java
primitive types, e.g., `boolean
or `int
, as
C type specifiers. This change eliminates the need for using the
equivalent JNI types, e.g., jboolean
or jint
, in C contexts. This release also includes
various bug fixes to the Jeannie compiler and
a user
guide.
The Typical compiler now supports
the guard
construct for protecting
against bottom
values in arbitrary expressions. It also
incorporates various bug fixes, including mapping bottom
to bottom
in optimized pattern matches.
This release includes three type checkers for C.
The first is the previously released version, which is written in Java
and used by the Jeannie compiler. The second is new to this release
and written in Typical. It is invoked through
the -analyze
and -typical
options to the C
driver xtc.lang.C
. Just like the type checker written in
Java, the type checker written in Typical passes all of gcc 4.1.1's
regression tests. Both type checkers also process the entire Linux
2.6 kernel. To this end, the handwritten C type checker now:
__builtin_types_compatible_p() (which also required changing the C grammar),
src/xtc/lang/c/ml directory. Like the other two type checkers, the O'Caml version processes the entire Linux 2.6 kernel, though it does not recognize C99's variable length arrays.
xtc now includes support for type inference and concurrency
analysis of Overlog programs; the corresponding code
lives in the xtc.lang.p2
package.
Rats! has been updated as follows:
rawTypes attribute has been fixed; it does not result in a class cast exception anymore. However, support for this attribute has been deprecated and will be removed in a future release.
All tools now support a -no-exit option for not exiting a Java virtual machine. As a result, tools can now be invoked by other Java code in the same JVM without terminating the JVM after tool completion.
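The pattern behind such an option can be sketched as follows. This is only an assumption about how a tool might be structured, not xtc's actual implementation: the class, the run and noExit methods, and the -fail flag are hypothetical; only the -no-exit option name comes from the text above.

```java
final class NoExitSketch {
  /** Runs the (hypothetical) tool and returns its status instead of exiting. */
  static int run(String[] args) {
    // A real tool would do its work here. In this sketch, the made-up
    // "-fail" flag simply simulates an unsuccessful run.
    for (String arg : args) {
      if ("-fail".equals(arg)) return 1;
    }
    return 0;
  }

  /** Returns true if the arguments ask the tool not to terminate the JVM. */
  static boolean noExit(String[] args) {
    for (String arg : args) {
      if ("-no-exit".equals(arg)) return true;
    }
    return false;
  }

  public static void main(String[] args) {
    int status = run(args);
    // Only terminate the JVM when the caller has not opted out.
    if (!noExit(args)) {
      System.exit(status);
    }
  }
}
```

With this structure, other Java code in the same JVM can call main with -no-exit among the arguments and keep running afterwards.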
The licensing of most classes
in xtc.util
has been changed to the LGPL version 2.1. As
before, the complete list of LGPL-ed classes can be found
in overview.html.
This release makes the following changes to Rats!:
withLocation option now start counting columns at 1 (instead of 0) for consistency with most modern development environments. The following code fixes Emacs' column number mode:
(add-hook 'post-command-hook (lambda () (let ((help-echo "mouse-1: select (drag to resize), mouse-2: delete others, mouse-3: delete this") (col (number-to-string (+ 1 (current-column))))) (setq-default mode-line-position `((-3 ,(propertize "%p" 'help-echo help-echo)) (size-indication-mode (8 ,(propertize " of %I" 'help-echo help-echo))) (line-number-mode ((column-number-mode (10 ,(propertize (concat " (%l," col ")") 'help-echo help-echo)) (6 ,(propertize " L%l" 'help-echo help-echo)))) ((column-number-mode (5 ,(propertize (concat " C" col) 'help-echo help-echo)))))))) (force-mode-line-update)))
Thanks to Martin Hirzel for updating Emacs' original hook. The start column is now defined by xtc.Constants.FIRST_COLUMN; the xtc.tree.Printer utility has been updated to use this constant.
withLocation
option now
correctly annotate nodes resulting from directly left-recursive
generic productions with their source locations (again). Release
1.12.0 introduced a regression, which annotated nodes with a source
location past the position of the recursive nonterminal. This release
restores an optimized version of the correct approach introduced in
release 1.8.0.
xtc.parser.ParserBase.setLocation(int,String,int,int) method. The C, C4, and Jeannie grammars utilize this method to update the corresponding parsers' source location based on gcc line markers in the preprocessed input. As a result, all error messages now report the original file name and line number, though the column number may be inaccurate due to macro expansion.
toText() helper method that returns a string. For regular parsers, the method is the identity function for strings. For parsers generated with the withParseTree option, the method takes an annotated token as its only argument and returns the corresponding string. The C and Java grammars have been rewritten to utilize this method instead of various kludges for converting annotated tokens to strings.
withParseTree option now correctly preserve formatting in list-valued productions. Furthermore, they now correctly preserve formatting in some generic productions that are not directly left-recursive and end with a sequence consisting only of formatting; Rats! also does not split such productions any more.
noinline attribute for productions prevents inlining even if the production is marked as or recognized as transient. Furthermore, the new memoized attribute for productions prevents productions from being treated as transient.
The Typical compiler now supports the
hierarchical syntax tree definitions generated by Rats!,
including polymorphic variants and the 'a var
type.
The type describing the syntax tree's root defaults
to node
but can be overridden through
the -node
command line flag. Additional changes to
Typical include:
reduce construct now correctly follows its semantics.
parent and ancestor built-ins now follows their semantics.
The Jeannie compiler has been updated to reflect
the language described in the OOPSLA paper. In particular, it now
supports with
statements for non-primitive arrays,
declarations in with
statement initial clauses, and
compound initializers. Additional changes include:
abort (or _abort) has been renamed to cancel (or _cancel). The new -underscores command line option overrides this new default behavior, reverting to the underscored versions.
//#line <line> <file>
and indents both generated C and Java code identically to the source. The new -pretty command line option overrides this new default behavior, reverting to the Java and C pretty printers.
jeannie.sh shell script in src/xtc/lang/jeannie manages the entire build process from Jeannie source code to Java and C binaries.
The C regression tests have been updated to include all relevant tests from GCC version 4.1.1. The C type checker has been updated accordingly. In particular, it now explicitly checks for:
main not being an int.
The limits.c
utility for determining a local
system's C configuration has been improved to more accurately
determine the local pointer difference, size, and wide character
types. The corresponding xtc.Limits
class included in
the source distribution is valid for 32-bit x86-based Mac OS X
systems, but differs in endianness from PowerPC-based Mac OS X systems
and in the definitions for size and wide character types from Linux
and Windows systems. The new configure
target for the
global Makefile rebuilds xtc.Limits
and xtc.type.C
(whose constants depend
on Limits
) for a local system.
Thanks to Thomas Moschny, the implementation of for expressions in the XForm AST query and transformation engine has been fixed to properly iterate over nested sequences. Also thanks to Thomas Moschny, a bug causing a null pointer exception has been fixed.
All tools now support
a -diagnostics
option to print tool internal state.
Given this option, the C driver now prints the local system's
configuration parameters (as determined by limits.c
— see above).
Finally, the Java and C drivers now support
the -locateAST
command line option to print each node's
source location when printing the AST with the -printAST
option.
Starting with this release, xtc
includes Typical, a domain-specific language and
compiler for implementing semantic analysis including type checking.
The Typical language builds on the functional core of ML and extends
it with novel declarative constructs specifically designed for
implementing type checkers. The package description
for xtc.typical
provides an overview and introduction.
Examples included with xtc are a type checker for the simply typed
lambda calculus in src/xtc/lang/TypedLambda.tpcl
and for
the Typical language itself in src/xtc/lang/Typical.tpcl
.
A type checker for C written in Typical is under development. The
main developers for Typical are Laune Harris and Anh Le.
Starting with this release, xtc also includes Jeannie, a compiler
contributed to xtc, which integrates
Java with C. In Jeannie, Java and C code are nested within each other
at the level of individual statements and expressions and compile down
to JNI, the Java platform's standard foreign function interface. By
combining the two languages' syntax and semantics, Jeannie eliminates
verbose boiler-plate code, enables static error detection across the
language boundary, and simplifies dynamic resource management.
The OOPSLA '07
paper by Martin Hirzel and Robert Grimm describes both language and
compiler in detail; the package description
for xtc.lang.jeannie
provides instructions on how to
compile source code to binaries.
Instead of using strings, Rats! now
relies on xtc.type.Type
and its subclasses to internally
represent the types of semantic values. The first new feature to
leverage this improved internal representation is variant
typing for grammars. When the -ast
command line
option is combined with the new -variant
option, Rats! automatically determines ML-style variant
types representing a grammar's generic AST. To facilitate type
inference, Rats! relies on the new variant
attribute for productions, which indicates that all generic nodes
returned by a production are members of the same variant type, named
after the production. The C, Java, Typical, and simply typed lambda
calculus grammars have been updated accordingly.
The Java grammar and AST for this
expressions have been improved. Instead of accepting any primary and
postfix expression, the grammar now recognizes only a qualified
identifier with a trailing dot before the this
keyword.
For well-formed inputs, this change replaces zero or more nested
selection expression nodes as a this expression node's first child
with an optional qualified identifier.
The C grammar and AST have also been improved.
The "*
" string denoting variable-length arrays in array
declarator nodes and direct abstract declarator nodes has been
replaced with a dedicated "variable length" node. Next, the
identifier string in structure designators has been replaced by a
primary identifier node. Finally, goto statement nodes now have two
children. A "*
" string as the first child now indicates
a computed goto statement. The second child always is a node, with a
primary identifier providing a regular goto statement's label.
As described below, Rats!' handling of list values in
generic productions has changed. If your grammar contains generic
productions and you do not want to update your AST processing code,
add the flatten
option to your grammar.
xtc now supports parse trees in addition to abstract syntax trees, thus facilitating source code refactorings that preserve formatting and layout. In particular:
withParseTree attribute, Rats! rewrites generic, list-valued, text-only, and void productions as well as productions that pass the value through to generate parsers that preserve all formatting as annotations. Annotations are instances of the new class xtc.tree.Formatting, which replaces the generic annotations introduced in version 1.9.0.
withParseTree attribute. Strings are the exception; they are represented as instances of xtc.tree.Token. Additionally, generic nodes include additional children (consisting of Formatting annotating a null value) if a voided expression or void nonterminal appears between two list-valued expressions.
Token.test and Token.cast methods can be used to test for and cast to strings, irrespective of whether the tree is a parse tree or abstract syntax tree.
xtc.tree.ParseTreePrinter prints parse trees including formatting, and the new xtc.tree.ParseTreeStripper strips all formatting and tokens, extracting the embedded AST (but preserving any other annotations).
-parsetree option to use parse trees instead of abstract syntax trees. Furthermore, the -strip option removes all formatting and tokens from a parse tree again.
The interface to abstract syntax tree nodes has been improved as follows:
xtc.tree.Locatable. The corresponding field in xtc.tree.Node has been marked private. Rats! now uses this interface for parsers with the withLocation attribute, thus removing the dependency on xtc's node representation.
write(Appendable) method for incrementally creating a human-readable representation. Node.toString() now utilizes this method. Similar functionality for classes in xtc.type has been modified to utilize this generalized version.
xtc.util.Pair has been improved. In particular, the new Node.getList method returns a node's child as a list, and the new Node.isList and Node.toList methods test for and cast to lists of nodes, respectively. Additionally, the new Visitor.iterate, Visitor.map, and Visitor.mapInPlace methods apply a visitor to all nodes on a list.
The representation of programming language types in xtc.type has been cleaned up and expanded:
AnnotatedT is still available to annotate a type without directly modifying it.
xtc.type.VariableT.
UnitT, VariantT, and TupleT classes model the corresponding types in functional languages such as ML or Haskell. The latter two classes replace the ListT, OptionT, and ProductT classes introduced in version 1.10.0.
Parameter and Wildcard classes representing named parameters and wildcards, respectively. The new wrapped types ParameterizedT and InstantiatedT capture a type's declared parameters and its instantiation with concrete types, respectively.
Type.Tag now defines a Java enumeration over all type classes. Each instance's tag is accessible through the Type.tag() and Type.wtag() methods (with invocations of the former method being forwarded across wrapped types). As a result, it is now possible to implement switch statements for types.
Tag interface for C's enum, struct, and union types has been renamed to Tagged in order to avoid confusion with the new Type.Tag enumeration. The Constant interface for types' constant values has been replaced with a concrete implementation.
xtc.type.C.
xtc.type.AST contains common constants and operations for typing abstract syntax trees.
xtc.type, though the conversion is not yet complete.
The Java grammar and AST have been re-engineered to (mostly) eliminate the need for a separate AST simplification phase. Notably, the AST for postfix and primary expressions has been significantly cleaned up. The Java type checker has been updated accordingly.
Additionally, xtc now includes a grammar for
Java 5. The Java 5 grammar is implemented
as a modification of the Java 1.4 grammar, and ASTs for the two
versions are compatible, i.e., every valid Java 1.4 AST also is a
valid Java 5 AST. The Java pretty printer has been updated to
support both versions. Furthermore, the FactoryFactory
concrete syntax tool has been updated to use the Java 5 grammar.
Since ASTs for the two language versions are compatible, the concrete
syntax tool will create Java 1.4 ASTs as long as the input only
uses Java 1.4 features.
The C type checker now verifies that external declarations without initializers are complete only at the end of a translation unit, thus correctly allowing for the definition of a struct or union type after it has been used in an external declaration. It also adds support for three more GCC extensions:
extern and inline functions, which effectively are macros and may be defined in the same translation unit before a regular function definition,
In addition to supporting the generation of parse trees and using the new Locatable interface, Rats! has been improved as follows:
Pair<T>, automatically creating a list from the values of each alternative's component expressions. If the last component expression has a list value, that value becomes the tail of the production's list value. If the only component expression has a list value, that value becomes the production's value. For example,
Pair<Node> ExpressionList = Expression (void:",":Symbol Expression)* ;
creates a list of nodes, automatically consing the first expression's value onto the list of expression nodes. In contrast,
Pair<Node> TwoExpressions = Expression Expression ;
also creates a list of nodes, but by consing the two expressions' nodes onto the empty list.
null literal, which simply provides a null value. Previously, the C and Java grammars used a production
Node Null = ;
to generate null values; the null literal provides a more direct and efficient alternative. The old xtc.util.Null and xtc.util.NullNode modules have been removed.
flatten attribute.
-ast command line option, which instructs Rats! to print a formal definition of a grammar's abstract syntax tree, has been rewritten (again) to produce a more accurate definition. It now uses the optional modifier to indicate that an AST node's child may be null and the variable modifier to indicate that a child may not even be present. This feature remains under active development.
-Olocation) causes Rats! to (1) use simpler code for updating a node's source location where possible and (2) omit updates altogether where possible. This optimization is enabled by default.
ClassCastException during code generation.
null, even if the option was matched in the input; furthermore, the repetition was not matched completely. Thanks to Eclipse for raising the "unused variable binding" leading to the bug's discovery.
The AttributeList and MalformedNodeException classes in xtc.tree have been removed. All code using the former has been changed to use a List<Attribute>; there was no code using the latter.
Finally, this release incorporates several fixes to minor bugs identified by Eclipse and by FindBugs.
The licensing of several classes has been
changed. The Node
, GNode
,
and Annotation
classes in xtc.tree
and
the Action
and State
classes
in xtc.util
are now licensed under the LGPL version 2.1
instead of the GPL version 2. Consequently, parsers generated from
grammars with generic or stateful productions are not covered by the
GPL anymore.
This release simplifies the interface between nodes and
visitors. Processing methods cannot be specified as part of
nodes anymore; i.e., visitWith(Visitor)
methods are not
recognized by dispatch()
anymore. Furthermore, if a
visit method has void
as its return
type, dispatch()
now returns null
; i.e., it
does not return the specified node anymore. The first feature has
been removed because it has not been used in over 1 1/2 years; the
second feature has been removed because it is inconsistent with Java
reflection and programmer expectations about void methods (while also
having some runtime overhead).
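The following sketch illustrates the general idea of reflection-based dispatch with the void behavior described above. It is a simplified stand-in, not xtc.tree.Visitor: it resolves visit methods only by the node's exact class, and every class and method here is invented for the example. It relies on the fact that java.lang.reflect.Method.invoke returns null when the invoked method is declared void.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

final class DispatchSketch {
  static class Node {}
  static class Literal extends Node {}
  static class Block extends Node {}

  /** Hypothetical visitor base class with cached, reflective dispatch. */
  static class Visitor {
    private final Map<Class<?>, Method> cache = new HashMap<Class<?>, Method>();

    Object dispatch(Node n) {
      Method m = cache.get(n.getClass());
      if (m == null) {
        try {
          // Simplification: look up visit(ExactNodeClass) only.
          m = getClass().getMethod("visit", n.getClass());
        } catch (NoSuchMethodException e) {
          throw new IllegalArgumentException("No visit method for " + n.getClass());
        }
        m.setAccessible(true);
        cache.put(n.getClass(), m);
      }
      try {
        // For a void visit method, invoke() yields null, which is passed through.
        return m.invoke(this, n);
      } catch (IllegalAccessException e) {
        throw new IllegalStateException(e);
      } catch (InvocationTargetException e) {
        throw new IllegalStateException(e.getCause());
      }
    }
  }

  /** Example visitor with one value-returning and one void visit method. */
  public static class Counter extends Visitor {
    int blocks = 0;
    public String visit(Literal l) { return "literal"; }
    public void visit(Block b) { blocks++; }
  }

  public static void main(String[] args) {
    Counter c = new Counter();
    System.out.println(c.dispatch(new Literal()));
    System.out.println(c.dispatch(new Block()));
  }
}
```

Returning null for void methods means the dispatcher does no extra work after the reflective call, which matches both Java reflection and what programmers expect of a void method.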
Other changes to nodes and visitors include:
dispatch() cannot identify an appropriate visit method, it now invokes the new unableToVisit(Node) method. That method's default implementation simply raises a visitor exception, thus resulting in the already familiar behavior. However, visitors can override this method and thus implement their own error handling strategies. Note that dispatch() caches resolutions to unableToVisit(), just like it caches resolved visit methods.
VisitorException and VisitingException now inherit from a common superclass TraversalException. That class removes stack trace elements corresponding to dispatch() and Java reflection invocations from a stack trace, thus resulting in less clutter when printing the stack trace. Thanks to Martin Hirzel for raising this issue.
indexOf(), lastIndexOf(), and contains() operations consistent with the Java collections framework.
Rats! has been improved as follows:
FullParserBase has been rolled into ParserBase, thus eliminating the need to differentiate a parser's base class according to license.
@Name"; the last node marker in a sequence specifies the created generic node's name. Node markers are especially useful for expressing different left-associative operators that have the same precedence with a single directly left-recursive production. Where possible, explicit semantic actions in the C grammar have been replaced with node markers.
profile attribute instructs Rats! to include code for profiling the usage of the memoization table. For grammars with this attribute, Rats! includes a counter for every field (i.e., memoized production) in the memoization table. The parser then increments the appropriate counter on every table access. Rats! also includes a profile(xtc.tree.Printer) method, which prints the maximum value for all of a production's fields across all memoized productions. If that number consistently is 1 over a sampling of representative inputs, the corresponding production should probably not be memoized (i.e., marked as transient). The C and Java drivers have been updated to support parsers generated with this attribute.
factory attribute instructs Rats! to use a class different from xtc.tree.GNode for creating generic nodes.
verbose attribute has been rewritten to produce considerably more informative traces of a parser's execution. In particular, the parser now traces when it (1) enters a production, (2) exits a production (with either a match or parse error), and (3) looks up a previously memoized result.
nowarn attribute instructs Rats! to suppress warnings for a production or the entire grammar.
-ast command line option, which instructs Rats! to print a formal definition of a grammar's abstract syntax tree, has been generalized (and simplified) to produce a more accurate definition.
value(Result), format(ParseError), and signal(ParseError) methods of the parser base class xtc.parser.ParserBase. The old error reporting code has been removed from ParserBase and ParseException; all tools have been updated accordingly. The easiest way to use a parser with the updated interface is:
parser.value(parser.pNonterminal(0))
This expression tries to recognize nonterminal Nonterminal, starting at the beginning of the input, and either returns the corresponding semantic value or signals a parse exception.
-valued command line option instructs Rats! to reduce a grammar to only those expressions that directly contribute to the abstract syntax tree and to then print the reduced grammar. Like the -ast option, it helps developers understand a grammar's abstract syntax tree without them needing to understand the complete grammar.
xtc now supports concrete syntax for creating
Java and C abstract syntax trees. The
new xtc.lang.FactoryFactory
tool reads in a factory
declaration, which includes one or more snippets of Java or C code,
and creates the corresponding factory class. That class has one
method per snippet, with each method creating the abstract syntax tree
representing the code snippet. Code snippets may be declarations,
statements, or expressions; they may also contain pattern variables,
which are bound on method invocation.
The Java grammar has been improved by introducing a distinct production for variable declarations and by not recognizing constructor, method, and field declarations inside method bodies anymore. At the same time, the AST fragment for variable declarations has the same structure as that for field declarations; i.e., both nodes have the same name ("FieldDeclaration") and one or more children indicating the modifiers.
Additionally, the pretty printing of Java ASTs has been improved: synchronized statements now include parentheses around their expressions, compilation units and class bodies do not contain unnecessary blank lines any more, and the spacing of class declarations, catch clauses, and new expressions has been improved. Thanks to Martin Hirzel and Laune Harris for identifying several of these issues.
Thanks to Martin Hirzel, xtc now includes a type checker
for Java (version 1.4). Comparable to the C type checker,
the Java type checker is invoked through the -analyze
command line option to xtc.lang.JavaDriver
.
The -printSymbolTable
option instructs the Java driver to
print the symbol table after analysis. Note that the Java type
checker requires a simplified AST, as indicated by
the -simplifyAST
option.
Support for processing C programs has been improved as follows:
CDriver
exposes these features through
the -preserveLines
and -formatGNU
command
line options.
xtc.lang.CAnalyzer
) now correctly
type checks variable declarations with compound initializers, even if
the -markAST
command line option is specified. Under
certain conditions, it previously aborted with an exception indicating
that a node already has a type. Thanks to Martin Hirzel for
identifying this issue.
Tool support for I/O has been improved. In
particular, xtc.util.Runtime
now manages input/output
directories and can open character streams.
Furthermore, xtc.util.Tool
now allows for the
specification of character encodings on the command line. As a
result, Rats! now supports user-specified character
encodings. Thanks to Steven Foster for raising the issue of character
encodings.
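As a rough illustration of what a user-specified character encoding means when opening a character stream, the following sketch decodes a byte stream with a named encoding. It uses only the standard Java library; the class EncodingSketch and its methods are hypothetical and are not part of xtc.util.Runtime or xtc.util.Tool.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

/** Hypothetical helper for illustration only; not xtc's API. */
final class EncodingSketch {
  private EncodingSketch() {}

  /** Wrap a byte stream in a character stream that decodes with the named encoding. */
  static Reader open(InputStream in, String encoding) {
    // Charset.forName() throws an unchecked exception for unknown or illegal names.
    return new BufferedReader(new InputStreamReader(in, Charset.forName(encoding)));
  }

  /** Read all remaining characters from the reader into a string. */
  static String readAll(Reader reader) {
    StringBuilder buf = new StringBuilder();
    char[] chunk = new char[1024];
    try {
      int n;
      while ((n = reader.read(chunk)) != -1) {
        buf.append(chunk, 0, n);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return buf.toString();
  }
}
```

Decoding the same bytes with a different encoding name may yield different characters, which is why the encoding has to be supplied by whoever knows how the input was written.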
This release fixes the following bugs in XForm, the AST query and transformation engine:
Iterator.next()
has been removed
when processing ASTs.
for
expressions has
been removed.
Finally, this release makes the following miscellaneous changes:
xtc.util.Pair
now has improved support for treating
pairs as lists. In particular, the following methods have been
added: hashCode()
to determine a list's
hashcode, equals()
to test for list equality,
toString()
to determine a list's human-readable
representation, get()
and set()
to access a
list's elements, contains()
and consists()
to test for a list's elements, and setLastTail()
and append()
to append two lists (either destructively or
not).
Pair
has also been changed to
implement Iterable<T>
, thus enabling the use of
pairs in Java's enhanced for loop. Thanks to Petar Maymounkov for
suggesting this change.
printHeader()
method
in xtc.util.Tool
prints a header appropriate for
machine-generated code. Rats! has been updated to use this
method.
xtc.type.SourcePrinter
prints types (that is,
instances of xtc.type.Type
) as C source
declarations.
xtc.type.TypePrinter
now tracks already printed
compound types and prints just a reference on subsequent encounters
(instead of printing the complete type). This change avoids an
infinite recursion when a complex type references itself, e.g., a C
structure containing a pointer to itself.
xtc.tree.Printer
now supports close()
.
Furthermore, a NullPointerException
when
invoking reset()
on a Printer
that does not
buffer the output has been fixed. Thanks to Patrick Winters for
identifying the latter issue.
NullPointerException
when
invoking dump()
on
a xtc.util.SymbolTable.Scope
containing
a null
value has been fixed. Thanks to Laune Harris for
identifying this issue.
fmt()
and msg()
methods of xtc.util.Utilities
have
been removed after refactoring the error reporting code
in xtc.parser.ParserBase
and xtc.util.Runtime
.
All code is now compiled with Java 5:
xtc.util
, xtc.tree
,
xtc.parser
, and xtc.type
packages have been
updated to utilize the new language features, notably generics. As
part of the conversion process, many classes have been simplified,
notably by replacing explicit iterations with for-each loops.
xtc.lang
and xtc.xform
packages still needs to be updated and thus results in "unchecked
operation" warnings.
xtc.lang.antlr
and xtc.lang.javacc
have
been annotated with "@SuppressWarnings("unchecked")
" to
avoid unnecessary warnings; since they depend on external tools, they
will not be updated to Java 5.
rawTypes
grammar
attribute will still compile with previous versions of the runtime
classes (after removing one annotation, see below).
This release makes the following changes to Rats!:
nt
references a void production, then
"nt*
" is now treated as "void:(nt*)
".
Set<Integer>
" or
"Map<String, Integer>
". Note
that Rats! does not recognize wildcards. It does, however,
allow white space (but not comments) between typenames, type argument
brackets, and commas.
rawTypes
grammar attribute
instructs Rats! not to use generics and to include a
"@SuppressWarnings("unchecked")
" annotation in the
generated parser. Otherwise, Rats! now leverages xtc's new
support for Java 1.5. Performance measurements of the Java parser
show that (1) there is no difference in throughput or heap
utilization between the version using generic types and the version
using raw types and (2) both versions running on the Java 5 virtual
machine are 4-6% slower than previous versions of xtc running on the
Java 1.4 virtual machine for Mac OS X.
set
grammar attribute has been replaced
by the more specific setOfString
grammar attribute.
Other type-specific set attributes will be added as needed.
-ast
" command line option
instructs Rats! to print a description of a grammar's
abstract syntax tree as an ML-like type definition; it only considers
generic productions.
xtc.tree.Node
even if
the dump
option is specified. Thanks to Sukyoung Ryu for
identifying this bug.
) {
"). Thanks to Sukyoung Ryu for identifying this
bug.
Object
as the semantic value.
void:void:expr
", is
now parsed as a voided expression and not a voided binding to the
(illegal) identifier "void
". The redundant voiding
operator is ignored.
yyBase
variable without declaring
it. Thanks to Sukyoung Ryu for identifying this bug.
Pair
anymore when
creating generic nodes. These casts became unnecessary with the
improved deduction of semantic values' types in release 1.9.0; this
release (1.10.0) further refines type deduction, notably for the types
of repeated expressions.
element
field, which had
type Element
, has been replaced by the more specific
choice
field (of type OrderedChoice
). Next,
an ordered choice's alternatives are now sequences (and not arbitrary
elements anymore). Finally, all properties used by the parser
generator are now collected in the new Properties
class.
A bug in the implementation of generic nodes has
been fixed: GNode.ensureVariable()
does not reverse the
children anymore if it is invoked on a generic node with a fixed
number of children.
Support for language tools has been improved by
adding two new methods to xtc.util.Tool
:
The process(String)
method recursively processes the file
with the specified name and the wrapUp()
method is called
after all files have been processed. Thanks to Hunter Freyer for
suggesting these improvements.
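The relationship between the two methods can be pictured with a small, self-contained sketch: a per-file hook that runs once for every input and a wrap-up hook that runs once at the end. The classes ToolSketch and CountingToolSketch and the run() driver are hypothetical names chosen for illustration; they only assume the calling order described above and do not reproduce xtc.util.Tool.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical skeleton for illustration only; not xtc.util.Tool. */
abstract class ToolSketch {
  /** Process the file with the specified name. */
  abstract void process(String fileName);

  /** Called once after all files have been processed; does nothing by default. */
  void wrapUp() {}

  /** Assumed driver loop: process every file, then wrap up. */
  final void run(String... fileNames) {
    for (String name : fileNames) {
      process(name);
    }
    wrapUp();
  }
}

/** Example tool that records the order in which the hooks are called. */
final class CountingToolSketch extends ToolSketch {
  final List<String> log = new ArrayList<String>();

  @Override
  void process(String fileName) {
    log.add("process " + fileName);
  }

  @Override
  void wrapUp() {
    // log.size() is evaluated before this entry is added.
    log.add("wrapUp after " + log.size() + " files");
  }
}
```

A wrap-up hook of this kind is a natural place for work that needs to see every input, such as printing summary statistics.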
The Java grammar has been changed to support an optional comma in array initializers and to allow single-line comments to be terminated by the end-of-file. Thanks to Martin Hirzel for identifying and fixing these issues.
The Java simplifier now correctly processes this()
and super()
call expressions. Thanks to William Moy for
identifying this bug.
Finally, this release changes the C type checker to correctly use composite types for function definitions following one or more declarations.
This release fixes bugs when pretty printing switch, case, and default constructs for Java ASTs. Thanks to William Moy for pointing out this issue.
Thanks to Martin Hirzel, this release also improves the documentation for the Java AST simplifier.
Thanks to Martin Hirzel, this release includes further fixes for simplifying and printing Java abstract syntax trees.
Thanks to Martin Hirzel, this release fixes a bug when processing assignments during simplification of abstract syntax trees for Java.
xtc now requires JDK 1.5 to build and run.
While xtc still is written in version 1.4 of the Java language, it now
uses classes and interfaces from version 1.5 of the platform
libraries. Notably, all uses of StringBuffer
have been
replaced with StringBuilder
.
The interface to abstract syntax tree nodes has
been generalized by moving the methods for generic tree traversal and
for adding/removing children from xtc.tree.GNode
up
to xtc.tree.Node
. As part of that
move, hasChildren()
was renamed to isEmpty()
and children()
to iterator()
to be more
consistent with the Java platform libraries.
To avoid forcing every subclass into implementing these methods,
Node
provides default implementations for all methods,
which effectively signal unsupported operation exceptions. Code using
nodes can determine whether a node actually supports generic tree
traversal through the hasTraversal()
method and
adding/removing children through the hasVariable()
method. To support generic tree traversal, a subclass only needs to
implement the size()
, get(int)
, and
set(int, Object)
methods. To support
adding/removing children, a subclass only needs to implement the
add(Object)
, add(int, Object)
and remove(int)
methods.
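The pattern described above (defaults that signal unsupported operations, plus capability queries) can be sketched as follows. NodeSketch and BinaryNodeSketch are hypothetical stand-ins written for this illustration; the method names follow the text, but the bodies are assumptions and not xtc.tree.Node's source.

```java
/** Hypothetical base class for illustration only; not xtc.tree.Node. */
abstract class NodeSketch {
  /** Does this node support generic traversal? */
  boolean hasTraversal() { return false; }

  /** Does this node support adding and removing children? */
  boolean hasVariable() { return false; }

  // Defaults: every generic operation signals that it is unsupported.
  int size() { throw unsupported("size()"); }
  Object get(int index) { throw unsupported("get(int)"); }
  Object set(int index, Object value) { throw unsupported("set(int, Object)"); }
  void add(Object child) { throw unsupported("add(Object)"); }
  void add(int index, Object child) { throw unsupported("add(int, Object)"); }
  Object remove(int index) { throw unsupported("remove(int)"); }

  private static UnsupportedOperationException unsupported(String op) {
    return new UnsupportedOperationException(op + " is not supported by this node");
  }
}

/** A node with exactly two children: supports traversal, but not add/remove. */
final class BinaryNodeSketch extends NodeSketch {
  private Object left;
  private Object right;

  BinaryNodeSketch(Object left, Object right) {
    this.left = left;
    this.right = right;
  }

  @Override boolean hasTraversal() { return true; }

  @Override int size() { return 2; }

  @Override Object get(int index) {
    if (index == 0) return left;
    if (index == 1) return right;
    throw new IndexOutOfBoundsException("Index: " + index);
  }

  @Override Object set(int index, Object value) {
    Object old = get(index); // also validates the index
    if (index == 0) left = value; else right = value;
    return old;
  }
}
```

Client code would check hasTraversal() or hasVariable() before calling the corresponding methods instead of relying on instance tests.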
Support for AST annotations has been improved,
with xtc.tree.Annotation
now supporting generic
annotations through the before1()
, after1()
,
round1()
, and variable()
factory methods.
Furthermore, the new node type xtc.tree.Token
supports
the representation of source file symbols as nodes.
In the presence of annotations and tokens, instance tests and
casts on objects returned from an AST node may not work as expected.
Code processing trees should use getString()
to access
string children and getGeneric()
to access generic nodes.
Furthermore, it should use Token.test()
and Token.cast()
to test for and cast to strings and
GNode.test()
and GNode.cast()
to test for
and cast to generic nodes.
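To see why a plain instance test can fail, consider a child that is wrapped in an annotation. The sketch below uses hypothetical classes (AnnotationSketch and TokenSketch) and assumes that test() and cast() look through any annotations before inspecting the underlying value; xtc's actual implementation may differ.

```java
/** Hypothetical annotation wrapper for illustration only. */
final class AnnotationSketch {
  final Object node;

  AnnotationSketch(Object node) {
    this.node = node;
  }
}

/** Hypothetical token node for illustration only; not xtc.tree.Token. */
final class TokenSketch {
  final String text;

  TokenSketch(String text) {
    this.text = text;
  }

  /** Remove any number of enclosing annotations. */
  static Object strip(Object o) {
    while (o instanceof AnnotationSketch) {
      o = ((AnnotationSketch) o).node;
    }
    return o;
  }

  /** Is the object, ignoring annotations, a string or a token? */
  static boolean test(Object o) {
    Object s = strip(o);
    return s instanceof String || s instanceof TokenSketch;
  }

  /** Return the object's text, ignoring annotations; null stays null. */
  static String cast(Object o) {
    Object s = strip(o);
    if (s == null) return null;
    if (s instanceof String) return (String) s;
    if (s instanceof TokenSketch) return ((TokenSketch) s).text;
    throw new ClassCastException("Not a token or string: " + s.getClass().getName());
  }
}
```

With these helpers, an annotated token is not an instance of String, yet test() accepts it and cast() recovers its text.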
All code using generic nodes has been updated to reflect the new
interface. Furthermore, xtc.tree.Printer.format()
now
accepts any node and uses generic traversal to print that node.
xtc now includes working support for semantic
analysis of C. xtc.lang.CAnalyzer
provides a
type checker for C99 and commonly used GCC extensions. While it
successfully passes most of GCC's regression tests, its support for
C99's variable length arrays is not yet complete. It also does not
support GCC's extern inline
functions and variables in
specified registers. In support of CAnalyzer
,
the xtc.type
package has been significantly improved,
notably with a class hierarchy of references to model the memory
layout of lvalues. Several bugs have also been fixed. Furthermore,
the creation of fresh symbols in xtc.util.SymbolTable
has
been fixed so that symbols are, in fact, fresh.
The new type checker is invoked through the -analyze
command line option to xtc.lang.CDriver
.
The -strict
option instructs the C driver to disable
GCC's extensions. The -markAST
option instructs the C
driver to annotate AST nodes with their types. Finally,
the -printSymbolTable
instructs the C driver to print the
symbol table after analysis.
The C grammar has been extended with support for
unnamed struct and union fields within structs and unions.
Furthermore, an initialized declarator now starts with an optional
attribute specifier list, shifting all previous component expressions.
Next, the C grammar now recognizes
GCC's __builtin_offsetof()
function
and __complex__
as an alternative to
C99's _Complex
. Finally, the order of identifiers and
constants in PrimaryExpression
has been reversed, so that
wide C character and string constants are now correctly recognized.
This release makes the following changes to Rats!:
yyValue
are now treated just like bindings
to yyValue
: the parser uses the explicitly specified
value instead of creating a new generic node.
null
. Module
xtc.util.Null
has been updated accordingly, removing the
explicit semantic action.
null
value. In particular, productions representing desugared options now
have the type of the optional expression and not
necessarily Object
anymore.
genericAsVoid
attribute,
productions with type Node
are now automatically voided
as well.
yyValue
is now declared as
Node
and not as GNode
.
xtc.util.Pair
), parsers now add the list's values to the
production's generic node only if the list is not null
.
As a result, parsers for grammars containing such expressions do not
fail with a null pointer exception anymore. Thanks to Uwe Simm for
identifying this issue.
Node
(instead
of GNode
) as the type of productions that pass generic
node values through. That way, they can accommodate annotated
nodes.
The XForm AST query and transformation engine
now supports add and remove operations. For example, "add
Child<> to //Parent
" adds a Child
node to
all Parent
nodes in the AST, and "remove
//SomeName
" removes all SomeName
nodes from the
AST. Additionally, an out of range or otherwise malformed integer
predicate no longer causes a runtime exception; rather, an empty
sequence is returned.
This release improves Rats! by
featuring a completely rewritten Transformer
phase. This
phase deduces semantic values, lifts nested choices, repetitions, and
options, and desugars repetitions and options. The rewritten code is
more modular and (hopefully) more easily maintainable. It also is
more accurate in deducing semantic values and more uniform in
processing (deeply) nested choices, repetitions, and options. As a
result, the rewritten code also fixes a regression identified by
Thomas Moschny.
A set of regression tests for Rats! has been added. The
tests are invoked by typing make check-rats
in the
top-level directory of the distribution.
The old version of the transformer phase is still available
through the -oldTransform
command line option
to Rats!. However, it is deprecated and will be removed in
the near future.
Error checking of grammars has been improved. In particular:
inline
and transient
attributes,
since inline
subsumes transient
.
The folding of equal sequences has been modified so that it does not result in a trailing choice of empty alternatives anymore.
Code generation has been modified to avoid declaring and
assigning the yyPredIndex
variable if the variable's
value is never used. Thanks to Thomas Moschny (and Eclipse) for
pointing out this issue.
This release improves the Java grammar by adding support for empty declarations (a semicolon by itself), assert statements, and class selection expressions. Thanks to Terence Parr for identifying these issues.
This release also contains a snapshot of the on-going effort
towards supporting semantic analysis. Notably,
the xtc.type
package has been significantly improved and
xtc.lang.CAnalyzer
has been updated accordingly.
However, for now, typing of C programs still is buggy and
incomplete.
Finally, unnecessary import declarations have been removed throughout xtc, including from parsers generated by Rats!.
This release renames xtc.parser.BaseParser
to
ParserBase
and xtc.parser.PackratParser
to FullParserBase
.
Additionally, FullParserBase
now inherits from
ParserBase
to avoid code duplication.
Next, this release makes the following changes to Rats!' code generator:
ParserBase
and FullParserBase
classes.
This release also fixes a bug
in xtc.lang.JavaAstSimplifier
and xtc.lang.JavaPrinter
that caused a null pointer
exception when pretty printing simplified method declarations. The
fixed version of JavaAstSimplifier
preserves the number
of children in MethodDeclaration
AST nodes.
This release considerably improves xtc's support for
the semantic analysis of programs. In particular,
the new xtc.util.SymbolTable
class implements a scoped
symbol table that easily integrates with AST traversal through xtc's
visitors. The new xtc.type
package provides
representations for a program's types. It currently covers all of C's
and Java's types (as of JDK 1.4). The
new xtc.lang.CAnalyzer
visitor leverages the new classes
to fill in the symbol table for a program and to check semantic
correctness along the way. However, CAnalyzer
is still
incomplete and buggy.
The new interface xtc.Limits
specifies
the integer range limits for a local system's C
compiler. The version distributed with xtc's release is consistent
with GCC for Mac OS X on the PowerPC and for Mac OS X, Linux, and
Windows on x86 processors. limits.c
in the same package
can be used to generate the correct limits for other operating systems
and architectures.
Next, the C grammar has been changed as follows:
xtc.lang.CParserState
, which is used to disambiguate
typedef names from object, function, or enum
constant
names, has been changed to support subclassing and thus to simplify
the implementation of extensions to the C language.
Next, the Java grammar has been improved by using more descriptive names for a large number of productions, by optimizing several productions, and by eliminating the creation of unnecessary AST nodes. The Java printer has been updated accordingly.
Both the recognizer-only and the AST-building Java parsers are
now generated from the same grammar through the
new genericAsVoid
grammar attribute (see below). The
top-level module for both versions is xtc.lang.Java
and
the corresponding parsers now are xtc.lang.JavaRecognizer
(no AST) and xtc.lang.JavaParser
(AST).
To better evaluate and compare parser performance, the Java driver can now generate ASTs when using JavaCC- or ANTLR-generated parsers. The AST-building JavaCC grammar has been generated with Java Tree Builder (version 1.2.2) from the original JavaCC grammar (dated 5/5/02). The AST-building ANTLR grammar is distributed by the ANTLR project, with the recognizer-only version being manually derived from the original. Both versions of the ANTLR grammar have been updated to version 1.21.
The xtc distribution now contains support for SDF and Elkhound generated Java parsers (again to evaluate and compare parser performance):
glr
directory contains Java 1.5 and
1.4 grammars for SDF. The 1.5 version is the grammar from the
java-front
0.8 distribution (with a differently named top-level module) and
the 1.4 version has been derived from the former by removing support
for generics, the enhanced for loop, typesafe enums, varargs, static
imports, and metadata. The glr/buildsdf.sh
script is
used to generate the corresponding parse tables and
the data/sdf.sh
script is used to perform a performance
evaluation. The buildsdf.sh
script depends on the
pack-sdf
and sdf2table
tools, while the
sdf.sh
script depends on the sglr
and sglri
tools.
glr/ella
directory. It includes the corresponding
lexical, syntactic, and AST specifications as well as any supporting
C++ code. Ella depends on
the smbase
, ast
, elkhound
, and
elsa
packages from Elkhound's source distribution. It
can be built by copying the corresponding directories into
the glr
directory and then
executing ./configure
and make
in that
directory. The data/ella.sh
script is used to evaluate
Ella's performance.
This release makes the following changes to Rats!:
reserved
attribute has been replaced
with the new set
attribute, which results in the
generation of a static final set with the attribute's value as its
name. It also results in the inclusion of a convenience
method add(Set,Object[])
for filling this set. The
XForm, C, and Java grammars have been modified accordingly.
genericAsVoid
attribute can be
used to generate a parser that only recognizes a language but does not
build an AST from the same tree-building grammar. It is now used for
generating the recognizer-only Java parser from
the xtc.lang.Java
module.
-Ocost
), choices2
(-Ochoices2
), and prefixes (-Oprefixes
)
optimizations are now enabled by default. The choices2 optimization
now only inlines productions that have been marked with the
new inline
attribute. Otherwise, this attribute is
semantically equivalent to transient
.
-Ognodes
) leverages
xtc.tree.GNode
's new factory methods to create leaner
generic nodes. It is enabled by default.
-lgpl
option generates parsers that are not
restricted by the GPL. Parsers generated with this option use the
new xtc.parser.BaseParser
base class, which, unlike
xtc.parser.PackratParser
, does not reference any classes
released under the GPL.
-Oselect
) has been fixed. Thanks to Laune Harris for
helping to identify and fix this issue.
xtc.util.Action
) has been fixed. Actions are used to
construct left-recursive data structures from right-recursive
productions. But their application did not annotate nodes with
source code locations; this has been fixed through the new
PackratParser.apply(Pair, Object, int)
method.
Furthermore, the Action
class has been turned into an
interface.
Thanks to Laune Harris, this release makes the following major changes to XForm:
insert before
" and
("insert after
") and set difference
("differ
"). Next, arbitrary expressions including
function calls can now appear in predicates. Finally, function
arguments can now be sequences, strings, or integers instead of just
integers.
xform/samples
directory. In addition an example Java
language extension has been added
to xform/samples/javaproperty
.
Java's access control is now disabled for xtc's visitor
dispatch. As a result, visitors can now be specified as
anonymous inner classes. For example, xtc.lang.CAnalyzer
uses this feature to analyze declaration specifiers and
declarators.
Generic nodes now need to be created through a
set of factory methods; look for the create()
methods
in xtc.tree.GNode
. Several of these methods directly
accept a generic node's children and return generic nodes that are
specialized for the specified number of children. As a result, such
fixed size nodes do not support
the add()
, addAll()
,
and remove()
methods defined
by xtc.tree.GNode
. They can be distinguished from
variable sized nodes through isVariable()
and converted
to variable sized nodes through ensureVariable(GNode)
.
Rats!' new gnodes optimization (see below) utilizes these
factory methods to reduce the memory and performance overhead of
parsers with generic productions.
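The distinction between fixed size and variable sized nodes can be sketched with a small stand-in class. GNodeSketch below is hypothetical and covers only a two-child fixed variant; the method names create(), isVariable(), and ensureVariable() follow the text, while the implementation is an assumption and not xtc.tree.GNode's source.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical generic node for illustration only; not xtc.tree.GNode. */
abstract class GNodeSketch {
  final String name;

  GNodeSketch(String name) {
    this.name = name;
  }

  /** Create a variable sized node without children. */
  static GNodeSketch create(String name) {
    return new Variable(name);
  }

  /** Create a fixed size node specialized for exactly two children. */
  static GNodeSketch create(String name, Object c1, Object c2) {
    return new Fixed2(name, c1, c2);
  }

  abstract boolean isVariable();
  abstract int size();
  abstract Object get(int index);

  /** Fixed size nodes reject additions; the variable variant overrides this. */
  void add(Object child) {
    throw new UnsupportedOperationException("Fixed size node");
  }

  /** Return the node itself if variable, else a variable copy in the same order. */
  static GNodeSketch ensureVariable(GNodeSketch node) {
    if (node.isVariable()) return node;
    GNodeSketch copy = create(node.name);
    for (int i = 0; i < node.size(); i++) {
      copy.add(node.get(i));
    }
    return copy;
  }

  private static final class Variable extends GNodeSketch {
    private final List<Object> children = new ArrayList<Object>();
    Variable(String name) { super(name); }
    @Override boolean isVariable() { return true; }
    @Override int size() { return children.size(); }
    @Override Object get(int index) { return children.get(index); }
    @Override void add(Object child) { children.add(child); }
  }

  private static final class Fixed2 extends GNodeSketch {
    private final Object c1;
    private final Object c2;
    Fixed2(String name, Object c1, Object c2) {
      super(name);
      this.c1 = c1;
      this.c2 = c2;
    }
    @Override boolean isVariable() { return false; }
    @Override int size() { return 2; }
    @Override Object get(int index) {
      if (index == 0) return c1;
      if (index == 1) return c2;
      throw new IndexOutOfBoundsException("Index: " + index);
    }
  }
}
```

The fixed variant stores its children in fields rather than in a list, which is the kind of saving that motivates specialized nodes; code that needs to grow a node first converts it with ensureVariable().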
This release introduces improved support for building
language tools with xtc. In particular, the
new xtc.util.Runtime
class manages command line options,
errors and warnings, and output to the standard console. The
new xtc.util.Tool
class provides a skeleton tool
implementation, including support for several default command line
options. Rats! and the C, Java, and XForm drivers have been
rewritten to utilize both classes. Note that, as a result of this
rewrite, some command line options for these tools have changed.
This release also introduces our first unit
tests. We rely
on JUnit as our unit
testing framework and JUnit's binary release (junit.jar
)
must be in the classpath. Thanks to Anh Le, this release also
introduces our first regression tests, based on GCC's
regression tests. Just like GCC, we rely
on expect
and DejaGnu to
perform these tests. The
description of our development setup
and the sample shell scripts (setup.bat
and setup.sh
) have been updated accordingly.
xtc now builds with JDK 1.5 by passing
the -source 1.4
flag to the javac
compiler.
All sources remain at Java version 1.4.
xtc's licensing has been changed: Most of the
code is now released under the GNU General Public License (GPL)
version 2. The exceptions
are xtc.parser.BaseParser
, xtc.parser.Column
,
xtc.parser.Result
, xtc.parser.SemanticValue
,
xtc.parser.ParseError
, xtc.tree.Location
,
and xtc.util.Pair
, which are released under the GNU
Lesser General Public License (LGPL) version 2.1. The main licensing
change is that the option of using later versions of the GPL and LGPL
has been removed.
Thanks to Marco Yuen and Marc Fiuczynski, this release incorporates C4, the CrossCutting C Compiler. C4 makes aspect-oriented software development techniques available to C programmers, with the goal of simplifying the development of software variants, notably for the Linux kernel.
This release makes the following changes to Rats!:
reserved
attribute results in
the generation of a static final set of reserved
identifiers RESERVED
and a convenience
method reserve(String[])
for filling this set. This
attribute eliminates the need for explicitly defining this set in a
body action (though the set still has to be filled in an action).
flag
attribute results in the
generation of a static final boolean with the attribute's value as its
name. This attribute eliminates the need for explicitly defining such
a flag in a body action.
flag
attribute, the
processing of attributes has been updated. As a result, attributes
such as transient
, whose values used to be ignored, now
must not have values. Internally, the
class xtc.tree.AttributeList
has been added and
xtc.tree.Attribute.equals()
has been changed to take an
attribute's value into account.
stateful
, reserved
, or flag
attributes, which are preserved. When pretty printing modules with
the -html
command line option, globally effective
attributes are now highlighted (assuming
the grammar.css
stylesheet contained in the source distribution's root directory also
is in the same directory as the HTML files).
stateful
, reserved
,
and flag
).
stateful
attribute and not just the module itself.
Analyzer.remove(Module)
. Finally, ambiguous
nonterminals are now always detected. As a result of these bug fixes,
it is now possible to apply multiple independent modifications to the
same base module. Thanks to Martin Hirzel for identifying the first
bug (whose resolution triggered discovery of the other two).
:
linenumber:
column-number
to better integrate with Emacs. Thanks to Martin Hirzel for
suggesting this improvement.
The C, Java, and XForm grammars have been modified to utilize the new attributes. Additionally, the C and Java grammars have been further modularized, up to the respective top-level module, which now simply modifies another, parameterized module.
Additionally, this release makes the following changes to xtc's C support:
xtc.lang.CSymbolTable
has been renamed
to CParserState
to emphasize that it does not implement a
full symbol table.
xtc.lang.CCounter
can now print its own statistics
through the print(xtc.tree.Printer)
method. It has also
been updated to reflect recent changes in the C grammar. The C driver
has been updated accordingly.
xtc.tree.GNode
's interface has been improved. In
particular, numberOfChildren()
has been renamed
to size()
, addAll(List)
has been changed
to addAll(Collection)
, and add(int,Object)
,
addAll(int,Pair)
, and addAll(int,Collection)
have been added.
A bug in XForm, which causes the result of a query to contain internal item objects, has been fixed.
In short, this release adds a module system to Rats!, adds support for building and printing an AST in the Java driver, fixes several bugs in the C parser and printer, and includes a significantly improved XForm, our AST query and transformation engine.
In more detail, this release introduces a simple yet powerful
module system for Rats!. The module system
supports basic modules to factor grammars into re-usable units. It
supports module modifications to concisely specify extensions.
Finally, it supports module parameters to easily compose different
extensions with each other. As a result, the format of grammar
specifications has been changed and grammars not distributed with this
release need to be modified. The module system is described in detail
in the package documentation for xtc.parser
.
To get a peek at modules, execute the following command
in src/xtc/lang
:
java xtc.parser.Rats -in ../.. -instantiated -html C.rats
Then open the resulting
xtc.lang.C.html
file in your web
browser and explore.
This release makes the following, additional changes to Rats!:
-in
options.
If no such options are present, the search path is the current
directory.
-loaded
command line option), after instantiating
(-instantiated
), after applying modifications
(-applied
), and after all processing
(-processed
). If the -html
command line
option is present (as illustrated above), the last three printing
options will generate hyperlinked HTML in Rats!' output
directory (which can be set with -out
). The
corresponding stylesheet is grammar.css
.
-option
command line options. Most
attributes have also been renamed. Notably, debug
is
now verbose
, constantBinding
is now
constant
, state
is now stateful
,
reset
is now resetting
, ignoreCase
is now ignoringCase
, and location
is now
withLocation
. Furthermore, mainMethod
is
now main
and usePrinter
is
now printer
.
.
' to
'_
'. Nonterminals may not contain underscores
anymore.
NullPointerException
when processing undefined
nonterminals in xtc.parser.TextTester
has been
eliminated.
NullPointerException
when processing optional
sequences with no bindable value
in xtc.parser.MetaDataSetter
has been eliminated. Thanks
to Stacey Kuznetsov for identifying this bug.
PackratParser.format(ParseError)
and PackratParser.print(ParseError)
simplify the display
of parse errors, while the new
exception xtc.parser.ParseException
simplifies the
propagation of parse errors.
The Java driver can now optionally build an abstract syntax tree and also pretty print that tree. Thanks to Stacey Kuznetsov for implementing the necessary changes.
The C grammar and pretty printer have been improved as follows:
XForm, the query and transformation engine, has been improved as follows. Thanks to Joe Pamer for realizing these changes.
or
and and
logical operators.
inside_out
operator.
xtc.xform.Item
objects allocated while
performing a query has been reduced by a factor of 90.
NullPointerException
's
in xtc.lang.CPrinter.visitStructureDeclarationList()
and
in xtc.xform.Item.equals()
.
In detail, this release makes the following performance-related improvements:
-buffer
and -nobuffer
command
line options for Rats! and the C and Java drivers have been
removed. Parse error reporting now uses a new method
(xtc.parser.PackratParser.lineAt()
), as input lines are
not directly available
anymore. xtc.util.Utilities.msg()
, which is used for
error printing, has been changed accordingly.
xtc.tree.Printer
's constructors have been
changed accordingly.
-Onontransient
(for
"optimize non-transient productions"). This optimization is enabled
by default. Since this optimization creates new transient
productions, the -Oerrors2
optimization is now disabled
by default.
Character.isJavaIdentifierStart()
and Character.isJavaIdentifierPart()
instead of
(incorrectly specified) explicit character classes.
-Ooptional
. This optimization is enabled by default.
Note, however, that this optimization may result in a loss of accuracy
for deducing the type of a binding. For example,
if foo:Foo?
and bar:Bar?
both appear in the
same production, with Foo
having String
as
its type and Bar
having Pair
, then the
declared type for both foo
and bar
is the
common supertype Object
.
-Oleft2
. This optimization is
enabled by default. The previously supported transformation into
right-recursions is still available through the -Oleft1
command line option.
-Omatches
. This optimization is enabled by
default.
-Oselect
. This optimization is
enabled by default.
instanceof
tests and casts. To this
end, Rats! now interprets any import statements in a
grammar's header and tries to analyze the corresponding
classes.dumpTable
attribute results in
the generation of a method, dump(xtc.tree.Printer)
, to
print the memoization table in a human-readable format. The dump can
be used for analyzing allocation patterns. The C and Java drivers
include a corresponding command line option (-memo
),
which casues the memoization table to be printed after a successful
parse. Though, the C and Java grammars do not include the
dumpTable
attribute. The
new xtc.parser.TableAnalyzer
utility collects and prints
(minimal for now) statistics for a previously dumped memoization
table.Thanks to Adam Kravetz for helping to identify several opportunities for optimizations.
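The loss of type accuracy noted for -Ooptional can be illustrated with a small sketch. It is not Rats!' implementation: the class TypeUnifier and its method commonSuperclass are hypothetical names, and java.util.ArrayList merely stands in for Pair. The sketch walks up one type's superclass chain until it reaches a type the other one is assignable to, which is one plausible way to arrive at a common supertype.

```java
/** Hypothetical helper for illustration; not part of Rats! or xtc. */
final class TypeUnifier {
  private TypeUnifier() {}

  /**
   * Returns the first type on a's superclass chain (starting with a itself)
   * that b is assignable to, falling back to Object if the chain runs out.
   */
  static Class<?> commonSuperclass(Class<?> a, Class<?> b) {
    for (Class<?> c = a; c != null; c = c.getSuperclass()) {
      if (c.isAssignableFrom(b)) return c;
    }
    return Object.class;
  }

  public static void main(String[] args) {
    // String and a list type share only Object, mirroring the Foo/Bar example.
    System.out.println(commonSuperclass(String.class, java.util.ArrayList.class).getName());
    // Integer and Long share the more specific supertype Number.
    System.out.println(commonSuperclass(Integer.class, Long.class).getName());
  }
}
```

When the two bindings have unrelated types, as in the String and Pair example, only Object remains, so any code using either binding has to cast.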
This release also cleans up the interface between nodes and visitors. In particular, dispatch can now only be initiated by calling Visitor.dispatch(Node) (instead of Node.accept(Visitor)). Furthermore, processing methods specified as part of nodes are now named Node.visitWith(Visitor) (instead of Node.process(Visitor)). In contrast to accept(), dispatch() handles null nodes, doing nothing and returning null. Furthermore, if the selected visit() or visitWith() method has void as its return type, dispatch() returns the specified node (instead of null).
Rats!' internal visitors have been updated to utilize dispatch(). Additionally, many visitors have been refactored to utilize a common superclass, xtc.parser.GrammarVisitor, which reduces code bloat across Rats!' internal visitors. All visitors in xtc.lang were already using dispatch().
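The dispatch rules described above can be sketched with Java reflection. This is a minimal illustration under stated assumptions, not xtc's actual implementation: SketchNode, SketchVisitor, NameVisitor, and CountingVisitor are hypothetical names, only visit() methods on the visitor are considered (visitWith() methods on nodes are omitted), and method lookup simply walks the node's superclass chain.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

/** Hypothetical stand-in for a tree node; not xtc's Node class. */
class SketchNode {
  final String name;
  SketchNode(String name) { this.name = name; }
}

/** Hypothetical visitor base class illustrating the described dispatch rules. */
class SketchVisitor {
  /** Selects a public visit() method based on the node's runtime type. */
  public Object dispatch(SketchNode n) {
    if (n == null) return null;  // null nodes: do nothing and return null
    Method m = findVisit(n.getClass());
    if (m == null) {
      throw new IllegalStateException("No visit() method for " + n.getClass().getName());
    }
    try {
      m.setAccessible(true);
      Object result = m.invoke(this, n);
      // A void visit() method yields the node itself instead of null.
      return (m.getReturnType() == void.class) ? n : result;
    } catch (IllegalAccessException | InvocationTargetException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Looks for visit(T), trying the node's class first and then its superclasses. */
  private Method findVisit(Class<?> nodeType) {
    for (Class<?> t = nodeType; t != null; t = t.getSuperclass()) {
      try {
        return getClass().getMethod("visit", t);
      } catch (NoSuchMethodException e) {
        // No visit() method for this type; try its superclass.
      }
    }
    return null;
  }
}

/** Visitor whose visit() method returns a value. */
class NameVisitor extends SketchVisitor {
  public String visit(SketchNode n) { return n.name; }
}

/** Visitor whose visit() method returns void. */
class CountingVisitor extends SketchVisitor {
  int count = 0;
  public void visit(SketchNode n) { count++; }
}
```

With these classes, new NameVisitor().dispatch(node) returns the node's name, new NameVisitor().dispatch(null) returns null, and new CountingVisitor().dispatch(node) returns the node itself because the selected visit() method is void.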
Furthermore, this release makes the following changes to Rats!:
reset attribute. Rats!' own grammar and the C grammar have been modified accordingly.
resetTo(int) and isEOF(int) methods. Incremental parsing is useful for processing interactive or very large inputs. It is now used by the xtc.lang.CDriver by default (and disabled through the -noincr command line option).
gstring") has been removed. They provide little benefit with considerable complexity (and code duplication). The two generic text productions in the C grammar have been rewritten as regular generic productions.
ignoreCase attribute instructs Rats! to perform comparisons for string matches in a case-insensitive manner. It applies to either the entire grammar or individual productions and is useful for languages with case-insensitive keywords. Note that comparisons for string literals, character literals, and character classes continue to be case-sensitive, even in the presence of this attribute. Thanks to Ken Britton for suggesting this feature and providing me with a prototype implementation.
transient is used as a production's type and a warning if any other per-production attribute is used as a production's type. In either case, the actual type is probably missing from the production.
NullPointerException anymore. If the value of the repeated element is the value of a nonterminal, the corresponding production is not voided anymore, even if the repetition is automatically recognized as a production's semantic value.
ClassCastException anymore.
Transformer, DirectLeftRecurser, and Generifier visitors.
This release also adds support for local label declarations to the C grammar and pretty printer. Additionally, the C grammar, symbol table, and pretty printer have been modified so that annotations encapsulating regular AST nodes now represent the compiler directives preceding that node's text in the input (instead of the other way around).
The new xtc.xform package provides a facility for querying and transforming abstract syntax trees (ASTs). The query language is inspired by XPath 2.0, but has some significant differences, notably the ability to destructively modify ASTs. Thanks to Joe Pamer for implementing the query and transformation engine.
All tools now return appropriate exit codes: 0 on successful execution and 1 on error conditions.
This release also improves the C grammar and pretty printer. In particular, it fixes bugs in:
typedef declarations now only introduce the identifiers in the declarator list as type names and field names now properly override type names when preceded by a type specifier),
goto statements,
long and long long constant suffixes (such as LL),
+, -, and & operators,
-Wparentheses command line option.
Additionally, the C grammar and pretty printer now support the following (GCC) extensions:
#ident directives in preprocessed code,
typeof (and underscored variations) as a type specifier,
signed type specifier,
case labels,
__alignof__ as an expression,
__builtin_va_arg() function (which takes a type name as its second argument),
const, volatile and restrict type qualifiers,
__extension__ specifier.
The overall effect is that the C driver (xtc.lang.CDriver) now parses and pretty prints the entire Linux kernel (version 2.6.8). The resulting source code compiles with GCC under the -Wall command line option (and no warnings).
Thanks to Marc Fiuczynski for identifying most of the bugs and missing language constructs and for testing the C driver against the Linux kernel.
This release also changes the format of pretty printed ASTs to be more compact (and to be consistent with the AST query language currently being developed). Pairs (xtc.util.Pair) are now mutable, but should still be treated as immutable if they are memoized by a Rats!-generated parser.
It makes the following changes:
visible attribute supports the generation of parsers that are package private (instead of public).
xtc.util.State has been changed to reflect that state modifications are modeled as lightweight transactions.
ArrayIndexOutOfBoundsException has been fixed; thanks to Robin Lee Powell for identifying this bug.
ArrayIndexOutOfBoundsException when printing a default parse error (returned by a transient production under the errors2 optimization) has been fixed; thanks to Robin Lee Powell for identifying this bug.
-out) to select the output directory for parsers generated by Rats! has been added. Also, Rats! now prints only errors to the error console. Both changes improve integration with Ant; thanks to Yonas Jongkind for suggesting them.
Note that this release changes the basic parser interface and is not backwards-compatible. In particular, parsing methods now take an explicit index argument (named yyStart), and the character() method returns an int instead of a Result. Furthermore, parsers perform best if they are created with the three-argument constructor, which includes the length of the input. For example, the following code snippet parses a file named fileName of size fileSize with reader in and top-level production TopLevel:
Parser p = new Parser(in, fileName, fileSize);
Result r = p.pTopLevel(0);
This release also makes the following changes:
String value and containing a parser action were incorrectly treated as text-only productions (bug fix).
-Ochunks" command line flag; bug fix).
-Ochoices2. The original (though slightly improved) optimization for void and text-only productions is still available under the -Ochoices1 command line option and enabled by default. Note that the choices2 optimization includes the choices1 optimization.
-Oerrors2. This optimization is complementary to the previously available error object optimization, which is now controlled through the -Oerrors1 command line option. Both optimizations are enabled by default.
xtc.tree.GNode now uses less memory for generic nodes with zero or one children.
xtc.lang.CDriver, for parsing and printing C, has been added. It provides control over whether to print parsed files and also supports the collection of runtime performance statistics. Additionally, the invocation syntax for the main class, xtc.Main, has been changed, now using the -util command line option to control tool selection.
c.rats, has been tuned so that alternatives that are more likely to appear in the input are parsed first. For example, declarations are now parsed before function definitions.
yyValue may still not be referenced.
-Oleft command line option to control the automatic transformation of direct left-recursions into right-recursions. It also prints additional messages under the -verbose command line option.
! syntactic predicate on a character constant or class, followed by the any character constant, followed by any element has been eliminated (bug fix).
& syntactic predicates appearing in transient void or text-only productions (bug fix).
This release focuses on Rats!' automatic generation of abstract syntax trees (through generic nodes). Notable improvements include:
xtc.tree.GNode.
yyValue (instead of always resulting in a new generic node).
void:. This new prefix operator has lower precedence than all other operators, including regular prefix operators, with the exception of the ordered choice operator /.
xtc.util.Action class.
gstring (for "generic string") as its type and a generic node as its semantic value, whose only child is the text matched in the input.
Additionally, the newly added state attribute and the corresponding xtc.util.State interface help with writing grammars that are context-sensitive and require global state. The state attribute, as well as the debug, location, and constantBinding attributes, can now also be specified on a per-production basis, simply by including them before the production's type. Next, sequences can now be named; the name is specified as the first element in a sequence by including it between less-than < and greater-than > signs. Furthermore, the readability of printed grammars and generated code has been improved through new line-wrapping facilities in xtc.tree.Printer. Finally, identifiers may now contain underscores (_).
Almost all of the newly added features are utilized by the new grammar for C and the corresponding pretty printer (in the xtc.lang package). Parser and pretty printer can be tested by executing "java xtc.lang.CParser <file>".
xtc.tree.GNode
) as
its semantic value has
generic
as its type. The corresponding generic node has
the same name as the production, and the children of the generic node
are the semantic values of all component expressions in the matched
sequence, with the exception of character terminals and nonterminals
referencing void productions.visit()
methods
in visitors are selected based on the type of the node, the
corresponding process()
methods in nodes are selected
based on the type of the visitor. The dynamic dispatch mechanism
first tries to locate a process()
method and, if none can
be found, tries to locate the corresponding visit()
method.
This release also makes the following improvements to Rats!:
^ and contain low-level code that parses languages not expressible by parsing expression grammars. This feature was motivated by Sameer Ajmani and Bryan Ford.
mainMethod grammar attribute, which causes a main method to be generated that parses files specified on the command line.
rats, now supports command line flags to control which optimizations to perform. The Java parser tool, pjava, now supports the printing of parser statistics in white-space delimited format (to make it easier to import data into spreadsheets).