xtc Release History and Notes

2.3.1 (4/4/12)
Minor bug fix release.

This release fixes incorrect license headers in several source files and replaces 2.3.0 as described below.

2.3.0 (3/25/12)
Major feature release.

This release significantly improves SuperC again. SuperC's performance has been tuned, the regression tests have been expanded, numerous bugs have been fixed, and the scripts for running and evaluating SuperC have been enhanced. All SuperC code is now released under the GPL version 2.0. This version is the one used for all experiments in the PLDI ’12 paper “SuperC: Parsing All of C by Taming the Preprocessor” by Paul Gazzillo and Robert Grimm.

2.2.0 (11/19/11)
Major feature release.

This release significantly improves SuperC. The parsing algorithm, Fork-Merge LR, has been completely reimplemented. It is now based on the novel token follow-set, which captures the actual variability of static conditionals independent of how they are nested within each other and appended to each other. The parser also includes three optimizations (shared reductions, lazy forking, and early reductions), which further decrease the number of forked subparsers.

Both the SuperC preprocessor and parser are now language-independent. To support a new language, a user needs to provide an annotated JFlex lexer definition, an annotated Bison grammar, and, optionally, a Java implementation of semantic actions.

Several new scripts help with running SuperC and collecting experimental data. They include a script to distribute the processing of Linux kernel source files across machines and scripts to compute summary statistics from SuperC's raw data output, e.g., a CDF of the number of subparsers.
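
As an illustration of the kind of summary statistic mentioned above, the following is a minimal sketch of an empirical CDF over subparser counts. It is not one of the SuperC scripts; the class name, method name, and input format (a plain array of counts) are assumptions made for this example.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of the SuperC scripts.
class SubparserCdfSketch {
  // Maps each distinct count c to the fraction of samples that are <= c.
  static TreeMap<Integer, Double> cdf(int[] counts) {
    // Histogram of the counts, sorted by count.
    TreeMap<Integer, Integer> histogram = new TreeMap<>();
    for (int c : counts) {
      histogram.merge(c, 1, Integer::sum);
    }
    // Running sum over the sorted histogram yields the cumulative fraction.
    TreeMap<Integer, Double> result = new TreeMap<>();
    int running = 0;
    for (Map.Entry<Integer, Integer> e : histogram.entrySet()) {
      running += e.getValue();
      result.put(e.getKey(), (double) running / counts.length);
    }
    return result;
  }
}
```

For the made-up sample {1, 1, 2, 4}, the sketch maps 1 to 0.5, 2 to 0.75, and 4 to 1.0.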

The new SuperC technical manual can be found in src/xtc/lang/cpp. It documents basic SuperC usage, its scripts, and the format of its statistics output. The manual is built by invoking make manual.

This release also fixes a bug in the Jeannie regression test harness, which failed on some Linux distributions. Thanks to Jacob Shufro and Martin Hirzel for their help in identifying and resolving this bug.

2.1.1 (9/9/11)
Minor bug fix release.

This release removes support for type checking the simply typed lambda calculus from xtc.lang.TypedLambda, since it depends on the already discontinued Typical compiler. Thanks to Thomas Huston for identifying this bug.

This release fixes the C structure layout regression tests to not utilize the Typical-generated type checker anymore.

This release also fixes a C type checker regression test to not use a deprecated preprocessor feature anymore. Thanks to Jacob Shufro for identifying this bug.

2.1.0 (9/7/11)
Minor feature and bug fix release.

This release improves SuperC by adding significantly more regression tests and introducing attendant bug fixes. It also adds more code comments and fixes code formatting.

This release adds support for parsing and pretty printing Java 7. To support Java 7's try-with-resources statements, the AST for regular try-catch statements has been changed, even when using parsers for earlier Java versions. The Java analyzer, xtc.lang.JavaAnalyzer, has been updated accordingly. To support Java 7's underscores in numeric literals, the definition of Java constants in Rats!' xtc.lang.JavaConstant has been modified to more closely follow the language specification.
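
For reference, a minimal sketch of the two Java 7 constructs mentioned above. The class and method names are illustrative only and not part of xtc; the use of UncheckedIOException additionally assumes Java 8 or later.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Illustrative only: names are hypothetical, not part of xtc.
class Java7Sketch {
  // Underscores in numeric literals only group digits; the value is unchanged.
  static final int ONE_MILLION = 1_000_000;

  // try-with-resources: the reader is closed automatically on exit.
  static String firstLine(String text) {
    try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
      return reader.readLine();
    } catch (IOException e) {
      // Java 8+ wrapper, used here to keep the method signature simple.
      throw new UncheckedIOException(e);
    }
  }
}
```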

Tokens are now extensible; the corresponding class, xtc.tree.Token, is not final anymore but has become abstract. Rats!' support for parse trees has been updated to utilize the new concrete subclass xtc.tree.TextToken. All parsers distributed with xtc have been updated accordingly.

To support Mac OS X 10.7 (Lion), this release adds support for the stpncpy_chk() builtin function.

Finally, this release removes a residual dependency on the Typical-generated type checker from xtc.lang.C. Thanks to Thomas Huston for identifying this bug.

2.0.0 (7/20/11)
Major feature release.

This release introduces a preview of SuperC, a new tool for parsing C code with arbitrary preprocessor usage. SuperC first lexes C code, then uses a new configuration-preserving preprocessor to resolve all directives modulo conditionals, and finally uses a novel variant of LR parsing to generate a well-formed AST containing static choice nodes for conditionals. The corresponding Java package is xtc.lang.cpp.

This release removes the following unmaintained code: the ANTLR and JavaCC parsers for Java; the C4 compiler; the Overlog compiler; the Typical compiler; and the XForm AST query engine. The last release containing this code is xtc version 1.15.0 (with corresponding testsuite).

This release removes the unnecessary dependency on the "perfctr" library from Jinn. Thanks to Mengtao Sun for pointing out this bug.

1.15.0 (6/14/10)
Major feature release.

This release introduces Jinn, a dynamic bug detector for the Java Native Interface (JNI). It currently supports HotSpot and J9 running on the x86 version of Linux. Support for other OS and processor combinations is under development. The source directory is src/xtc/lang/blink/agent and the make target is agent. Please direct any feedback to Byeong Lee.

This release also removes the unnecessary analyzers target from the make file in src/xtc/lang/blink. Thanks to Tony Sloane for pointing out this bug.

1.14.4 (9/29/09)
Minor bug fix release:

Due to many of the above changes, xtc now passes all regression tests on Apple's Mac OS X Snow Leopard (10.6), whose C compiler and Java virtual machine default to 64-bit.

1.14.3 (4/6/09)
Minor feature and bug fix release.

Rats! has been updated as follows:

The Jeannie grammar has been updated to eliminate a bug that caused null pointer exceptions. Thanks to Matt Renzelmann for identifying this bug.

The Blink debugger has been updated to perform dynamic consistency checks on the arguments to JNI functions. For example, it detects when NULL is passed to NewStringUTF and reports this invalid argument.

1.14.2 (10/18/08)
Minor feature and bug fix release.

Rats! has been updated as follows:

C support has been improved as follows:

Please remember to run make configure to recreate the appropriate xtc.Limits for your hardware, OS, and compiler. Thanks to BK Lee's tireless help, configuration now also works with Microsoft's Visual C.

The Blink inter-language debugger has been improved as follows:

1.14.1 (7/31/08)
Bug fix release.

Rats! has been updated as follows:

1.14.0 (7/26/08)
Major feature release.

This release introduces Blink, a portable mixed-mode Java/native debugger. It currently supports Sun's Java virtual machine running on the x86 versions of Linux and Cygwin, with support for other JVM, OS and processor configurations under development. Please direct any feedback to BK Lee.

1.13.3 (5/14/08)
Minor feature and bug fix release.

Rats! has been updated as follows:

Due to improvements in variant typing, Rats! now statically types the Jeannie grammar, requiring no additional variant annotations.

The Typical compiler has been updated to support fun expressions, and the translation of let expressions has been optimized. Additionally, bugs in the exhaustiveness checking for match expressions and when explicitly matching bottom have been fixed. Thanks to Christopher Conway for reporting these bugs.

The Java, Typical, and O'Caml type checkers for C have been updated to:

Thanks to Matt Renzelmann for identifying the last two issues.

To track size, alignment, and offset values, the C type checkers now include a re-engineered version of gcc's structure layout algorithm. The local system's C configuration in xtc.Limits has been improved in support. Run make configure to recreate the appropriate version for your hardware and operating system.

The syntax for Jeannie top-level compilation units has changed. The package and import declarations now come before the initial `.C {…} block instead of after it. That way, top-level C code can use simple instead of fully qualified names when referring to Java entities.

Internally, the Jeannie grammar and AST for array declarators has been updated to create "variable length" nodes, just like the C grammar and AST in release 1.13.0. Furthermore, the compiler has been updated to address several bugs, mostly thanks to helpful reporting by Matt Renzelmann.

Support for Overlog has been extended with a translator targeting Java. The corresponding runtime is being developed by Nalini Belaramani at UT Austin; the necessary JAR file is available here. Additionally, the Overlog language has been extended with tuple and function type declarations, the Overlog grammar has been cleaned up, and a bug in the inference of function return types has been fixed. The corresponding Java package has been renamed to xtc.lang.overlog (from xtc.lang.p2).

1.13.2 (12/1/07)
Minor feature and bug fix release.

The Jeannie compiler now supports backticked Java primitive types, e.g., `boolean or `int, as C type specifiers. This change eliminates the need for using the equivalent JNI types, e.g., jboolean or jint, in C contexts. This release also includes various bug fixes to the Jeannie compiler and a user guide.

The Typical compiler now supports the guard construct for protecting against bottom values in arbitrary expressions. It also incorporates various bug fixes, including mapping bottom to bottom in optimized pattern matches.

This release includes three type checkers for C. The first is the previously released version, which is written in Java and used by the Jeannie compiler. The second is new to this release and written in Typical. It is invoked through the -analyze and -typical options to the C driver xtc.lang.C. Just like the type checker written in Java, the type checker written in Typical passes all of gcc 4.1.1's regression tests. Both type checkers also process the entire Linux 2.6 kernel. To this end, the handwritten C type checker now:

The third type checker for C is new to this release as well and written in O'Caml. It re-uses the parser and AST representation of CIL and is contained in the src/xtc/lang/c/ml directory. Like the other two type checkers, the O'Caml version processes the entire Linux 2.6 kernel, though it does not recognize C99's variable length arrays.

xtc now includes support for type inference and concurrency analysis of Overlog programs; the corresponding code lives in the xtc.lang.p2 package.

Rats! has been updated as follows:

All tools now support a -no-exit option for not exiting a Java virtual machine. As a result, tools can now be invoked by other Java code in the same JVM without terminating the JVM after tool completion.

The licensing of most classes in xtc.util has been changed to the LGPL version 2.1. As before, the complete list of LGPL-ed classes can be found in overview.html.

1.13.1 (10/16/07)
Bug fix and minor feature release.

This release makes the following changes to Rats!:

The Typical compiler now supports the hierarchical syntax tree definitions generated by Rats!, including polymorphic variants and the 'a var type. The type describing the syntax tree's root defaults to node but can be overridden through the -node command line flag. Additional changes to Typical include:

The Jeannie compiler has been updated to reflect the language described in the OOPSLA paper. In particular, it now supports with statements for non-primitive arrays, declarations in with statement initial clauses, and compound initializers. Additional changes include:

The C regression tests have been updated to include all relevant tests from GCC version 4.1.1. The C type checker has been updated accordingly. In particular, it now explicitly checks for:

Additionally, the processing of block-level extern declarations has been much improved.

The limits.c utility for determining a local system's C configuration has been improved to more accurately determine the local pointer difference, size, and wide character types. The corresponding xtc.Limits class included in the source distribution is valid for 32-bit x86-based Mac OS X systems, but differs in endianness from PowerPC-based Mac OS X systems and in the definitions for size and wide character types from Linux and Windows systems. The new configure target for the global Makefile rebuilds xtc.Limits and xtc.type.C (whose constants depend on Limits) for a local system.

Thanks to Thomas Moschny, the implementation of for expressions in the XForm AST query and transformation engine has been fixed to properly iterate over nested sequences. Also thanks to Thomas Moschny, a bug causing a null pointer exception has been fixed.

All tools now support a -diagnostics option to print tool internal state. Given this option, the C driver now prints the local system's configuration parameters (as determined by limits.c — see above).

Finally, the Java and C drivers now support the -locateAST command line option to print each node's source location when printing the AST with the -printAST option.

1.13.0 (8/31/07)
Major feature and bug fix release.

Starting with this release, xtc includes Typical, a domain-specific language and compiler for implementing semantic analysis including type checking. The Typical language builds on the functional core of ML and extends it with novel declarative constructs specifically designed for implementing type checkers. The package description for xtc.typical provides an overview and introduction. Examples included with xtc are a type checker for the simply typed lambda calculus in src/xtc/lang/TypedLambda.tpcl and for the Typical language itself in src/xtc/lang/Typical.tpcl. A type checker for C written in Typical is under development. The main developers for Typical are Laune Harris and Anh Le.

Starting with this release, xtc also includes "a compiler contributed to xtc" a.k.a. Jeannie, which integrates Java with C. In Jeannie, Java and C code are nested within each other at the level of individual statements and expressions and compile down to JNI, the Java platform's standard foreign function interface. By combining the two languages' syntax and semantics, Jeannie eliminates verbose boiler-plate code, enables static error detection across the language boundary, and simplifies dynamic resource management. The OOPSLA '07 paper by Martin Hirzel and Robert Grimm describes both language and compiler in detail; the package description for xtc.lang.jeannie provides instructions on how to compile source code to binaries.

Instead of using strings, Rats! now relies on xtc.type.Type and its subclasses to internally represent the types of semantic values. The first new feature to leverage this improved internal representation is variant typing for grammars. When the -ast command line option is combined with the new -variant option, Rats! automatically determines ML-style variant types representing a grammar's generic AST. To facilitate type inference, Rats! relies on the new variant attribute for productions, which indicates that all generic nodes returned by a production are members of the same variant type, named after the production. The C, Java, Typical, and simply typed lambda calculus grammars have been updated accordingly.

The Java grammar and AST for this expressions have been improved. Instead of accepting any primary and postfix expression, the grammar now recognizes only a qualified identifier with a trailing dot before the this keyword. For well-formed inputs, this change replaces zero or more nested selection expression nodes as a this expression node's first child with an optional qualified identifier.
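
To make the source-level construct concrete, here is a small sketch of a plain and a qualified this expression in Java. The class names are hypothetical and chosen for this example; the sketch shows the Java syntax only, not the AST that xtc builds for it.

```java
// Illustrative only: class names are hypothetical, not part of xtc.
class QualifiedThisSketch {
  int value = 1;

  class Inner {
    int value = 2;

    // Plain this expression: refers to the Inner instance.
    int innerValue() {
      return this.value;
    }

    // Qualified this expression: a qualified identifier, a dot, then this.
    int outerValue() {
      return QualifiedThisSketch.this.value;
    }
  }
}
```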

The C grammar and AST have also been improved. The "*" string denoting variable-length arrays in array declarator nodes and direct abstract declarator nodes has been replaced with a dedicated "variable length" node. Next, the identifier string in structure designators has been replaced by a primary identifier node. Finally, goto statement nodes now have two children. A "*" string as the first child now indicates a computed goto statement. The second child always is a node, with a primary identifier providing a regular goto statement's label.

1.12.0 (7/18/07)
Major feature and bug fix release.

As described below, Rats!' handling of list values in generic productions has changed. If your grammar contains generic productions and you do not want to update your AST processing code, add the flatten option to your grammar.

xtc now supports parse trees in addition to abstract syntax trees, thus facilitating source code refactorings that preserve formatting and layout. In particular:

The interface to abstract syntax tree nodes has been improved as follows:

The representation of programming language types in xtc.type has been cleaned up and expanded:

The Java grammar and AST have been re-engineered to (mostly) eliminate the need for a separate AST simplification phase. Notably, the AST for postfix and primary expressions has been significantly cleaned up. The Java type checker has been updated accordingly.

Additionally, xtc now includes a grammar for Java 5. The Java 5 grammar is implemented as a modification of the Java 1.4 grammar, and ASTs for the two versions are compatible, i.e., every valid Java 1.4 AST also is a valid Java 5 AST. The Java pretty printer has been updated to support both versions. Furthermore, the FactoryFactory concrete syntax tool has been updated to use the Java 5 grammar. Since ASTs for the two language versions are compatible, the concrete syntax tool will create Java 1.4 ASTs as long as the input only uses Java 1.4 features.

The C type checker now verifies that external declarations without initializers are complete only at the end of a translation unit, thus correctly allowing for the definition of a struct or union type after it has been used in an external declaration. It also adds support for three more GCC extensions:

  1. Global register variables (but without checking the register names),
  2. extern inline functions, which effectively are macros and may be defined in the same translation unit before a regular function definition,
  3. structures with trailing incomplete arrays as struct member types and array element types.

As a result, the C type checker now passes all GCC regression tests ported to xtc.

In addition to supporting the generation of parse trees and using the new Locatable interface, Rats! has been improved as follows:

The AttributeList and MalformedNodeException classes in xtc.tree have been removed. All code using the former has been changed to use a List<Attribute>; there was no code using the latter.

Finally, this release incorporates several fixes to minor bugs identified by Eclipse and by FindBugs.

1.11.0 (5/14/07)
Major feature and bug fix release.

The licensing of several classes has been changed. The Node, GNode, and Annotation classes in xtc.tree and the Action and State classes in xtc.util are now licensed under the LGPL version 2.1 instead of the GPL version 2. Consequently, parsers generated from grammars with generic or stateful productions are not covered by the GPL anymore.

This release simplifies the interface between nodes and visitors. Processing methods cannot be specified as part of nodes anymore; i.e., visitWith(Visitor) methods are not recognized by dispatch() anymore. Furthermore, if a visit method has void as its return type, dispatch() now returns null; i.e., it does not return the specified node anymore. The first feature has been removed because it has not been used in over 1 1/2 years; the second feature has been removed because it is inconsistent with Java reflection and programmer expectations about void methods (while also having some runtime overhead).
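
The void-returns-null behavior can be illustrated with a minimal reflective dispatcher. This is a sketch under stated assumptions, not xtc's Visitor implementation: SketchVisitor and CountingVisitor are hypothetical classes, and the lookup here matches only the node's exact class. It relies on the fact that java.lang.reflect.Method.invoke returns null for methods declared void.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for a reflective visitor; not xtc's Visitor class.
abstract class SketchVisitor {
  // Looks up a public visit(X) method, where X is the node's exact class,
  // and invokes it on this visitor.
  public Object dispatch(Object node) {
    try {
      Method visit = getClass().getMethod("visit", node.getClass());
      // Method.invoke returns null when the invoked method is void.
      return visit.invoke(this, node);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("no applicable visit method", e);
    }
  }
}

// Hypothetical visitor with one void and one value-returning visit method.
class CountingVisitor extends SketchVisitor {
  int count = 0;

  public void visit(String s) {
    count++; // void: dispatch returns null
  }

  public Integer visit(Integer i) {
    return i + 1; // non-void: dispatch returns this result
  }
}
```

In this sketch, dispatching a String to CountingVisitor returns null, while dispatching an Integer returns the incremented value.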

Other changes to nodes and visitors include:

Rats! has been improved as follows:

xtc now supports concrete syntax for creating Java and C abstract syntax trees. The new xtc.lang.FactoryFactory tool reads in a factory declaration, which includes one or more snippets of Java or C code, and creates the corresponding factory class. That class has one method per snippet, with each method creating the abstract syntax tree representing the code snippet. Code snippets may be declarations, statements, or expressions; they may also contain pattern variables, which are bound on method invocation.

The Java grammar has been improved by introducing a distinct production for variable declarations and by not recognizing constructor, method, and field declarations inside method bodies anymore. At the same time, the AST fragment for variable declarations has the same structure as that for field declarations; i.e., both nodes have the same name ("FieldDeclaration") and one or more children indicating the modifiers.

Additionally, the pretty printing of Java ASTs has been improved: synchronized statements now include parentheses around their expressions, compilation units and class bodies do not contain unnecessary blank lines anymore, and the spacing of class declarations, catch clauses, and new expressions has been improved. Thanks to Martin Hirzel and Laune Harris for identifying several of these issues.

Thanks to Martin Hirzel, xtc now includes a type checker for Java (version 1.4). Comparable to the C type checker, the Java type checker is invoked through the -analyze command line option to xtc.lang.JavaDriver. The -printSymbolTable option instructs the Java driver to print the symbol table after analysis. Note that the Java type checker requires a simplified AST, as indicated by the -simplifyAST option.

Support for processing C programs has been improved as follows:

Tool support for I/O has been improved. In particular, xtc.util.Runtime now manages input/output directories and can open character streams. Furthermore, xtc.util.Tool now allows for the specification of character encodings on the command line. As a result, Rats! now supports user-specified character encodings. Thanks to Steven Foster for raising the issue of character encodings.

This release fixes the following bugs in XForm, the AST query and transformation engine:

Thanks to Karen Osmond for identifying several of these issues, and thanks to Laune Harris for fixing them.

Finally, this release makes the following miscellaneous changes:

1.10.0 (12/24/06)
Major feature and bug fix release.

All code is now compiled with Java 5:

This release makes the following changes to Rats!:

A bug in the implementation of generic nodes has been fixed: GNode.ensureVariable() does not reverse the children anymore if it is invoked on a generic node with a fixed number of children.

Support for language tools has been improved by adding two new methods to xtc.util.Tool: The process(String) method recursively processes the file with the specified name and the wrapUp() method is called after all files have been processed. Thanks to Hunter Freyer for suggesting these improvements.

The Java grammar has been changed to support an optional comma in array initializers and to allow single-line comments to be terminated by the end-of-file. Thanks to Martin Hirzel for identifying and fixing these issues.

The Java simplifier now correctly processes this() and super() call expressions. Thanks to William Moy for identifying this bug.

Finally, this release changes the C type checker to correctly use composite types for function definitions following one or more declarations.

1.9.3 (9/20/06)
Minor bug fix release.

This release fixes bugs when pretty printing switch, case, and default constructs for Java ASTs. Thanks to William Moy for pointing out this issue.

Thanks to Martin Hirzel, this release also improves the documentation for the Java AST simplifier.

1.9.2 (9/12/06)
Minor bug fix release.

Thanks to Martin Hirzel, this release includes further fixes for simplifying and printing Java abstract syntax trees.

1.9.1 (9/7/06)
Minor bug fix release.

Thanks to Martin Hirzel, this release fixes a bug when processing assignments during simplification of abstract syntax trees for Java.

1.9.0 (9/5/06)
Major feature and bug fix release.

xtc now requires JDK 1.5 to build and run. While xtc still is written in version 1.4 of the Java language, it now uses classes and interfaces from version 1.5 of the platform libraries. Notably, all uses of StringBuffer have been replaced with StringBuilder.

The interface to abstract syntax tree nodes has been generalized by moving the methods for generic tree traversal and for adding/removing children from xtc.tree.GNode up to xtc.tree.Node. As part of that move, hasChildren() was renamed to isEmpty() and children() to iterator() to be more consistent with the Java platform libraries.

To avoid forcing every subclass into implementing these methods, Node provides default implementations for all methods, which effectively signal unsupported operation exceptions. Code using nodes can determine whether a node actually supports generic tree traversal through the hasTraversal() method and adding/removing children through the hasVariable() method. To support generic tree traversal, a subclass only needs to implement the size(), get(int), and set(int, Object) methods. To support adding/removing children, a subclass only needs to implement the add(Object), add(int, Object) and remove(int) methods.
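
The design described above can be sketched as follows. All class names are hypothetical, the return type of set(int, Object) is an assumption, and having the subclass override hasTraversal() is only one possible way to report support; the actual xtc.tree.Node may differ on each of these points.

```java
// Hypothetical sketch of a node base class; not xtc.tree.Node itself.
abstract class SketchNode {
  // Reports whether generic traversal is supported (assumed override point).
  public boolean hasTraversal() {
    return false;
  }

  // Default implementations signal an unsupported operation.
  public int size() {
    throw new UnsupportedOperationException("no generic traversal");
  }

  public Object get(int index) {
    throw new UnsupportedOperationException("no generic traversal");
  }

  public Object set(int index, Object value) {
    throw new UnsupportedOperationException("no generic traversal");
  }
}

// A node that opts out: it inherits the throwing defaults.
class LeafNode extends SketchNode {
}

// A node that opts in by implementing size(), get(int), and set(int, Object).
class PairNode extends SketchNode {
  private final Object[] children;

  PairNode(Object first, Object second) {
    children = new Object[] {first, second};
  }

  @Override public boolean hasTraversal() {
    return true;
  }

  @Override public int size() {
    return children.length;
  }

  @Override public Object get(int index) {
    return children[index];
  }

  // Replaces the child and returns the previous one.
  @Override public Object set(int index, Object value) {
    Object old = children[index];
    children[index] = value;
    return old;
  }
}
```

Client code would check hasTraversal() before calling size() or get(int), which avoids the exception for nodes such as LeafNode.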

Support for AST annotations has been improved, with xtc.tree.Annotation now supporting generic annotations through the before1(), after1(), round1(), and variable() factory methods. Furthermore, the new node type xtc.tree.Token supports the representation of source file symbols as nodes.

In the presence of annotations and tokens, instance tests and casts on objects returned from an AST node may not work as expected. Code processing trees should use getString() to access string children and getGeneric() to access generic nodes. Furthermore, it should use Token.test() and Token.cast() to test for and cast to strings and GNode.test() and GNode.cast() to test for and cast to generic nodes.

All code using generic nodes has been updated to reflect the new interface. Furthermore, xtc.tree.Printer.format() now accepts any node and uses generic traversal to print that node.

xtc now includes working support for semantic analysis of C. xtc.lang.CAnalyzer provides a type checker for C99 and commonly used GCC extensions. While it successfully passes most of GCC's regression tests, its support for C99's variable length arrays is not yet complete. It also does not support GCC's extern inline functions and variables in specified registers. In support of CAnalyzer, the xtc.type package has been significantly improved, notably with a class hierarchy of references to model the memory layout of lvalues. Several bugs have also been fixed. Furthermore, the creation of fresh symbols in xtc.util.SymbolTable has been fixed so that symbols are, in fact, fresh.

The new type checker is invoked through the -analyze command line option to xtc.lang.CDriver. The -strict option instructs the C driver to disable GCC's extensions. The -markAST option instructs the C driver to annotate AST nodes with their types. Finally, the -printSymbolTable option instructs the C driver to print the symbol table after analysis.

The C grammar has been extended with support for unnamed struct and union fields within structs and unions. Furthermore, an initialized declarator now starts with an optional attribute specifier list, shifting all previous component expressions. Next, the C grammar now recognizes GCC's __builtin_offsetof() function and __complex__ as an alternative to C99's _Complex. Finally, the order of identifiers and constants in PrimaryExpression has been reversed, so that wide C character and string constants are now correctly recognized.

This release makes the following changes to Rats!:

All grammars have been updated to use Node (instead of GNode) as the type of productions that pass generic node values through. That way, they can accommodate annotated nodes.

The XForm AST query and transformation engine now supports add and remove operations. For example, "add Child<> to //Parent" adds a Child node to all Parent nodes in the AST, and "remove //SomeName" removes all SomeName nodes from the AST. Additionally, an out of range or otherwise malformed integer predicate no longer causes a runtime exception; rather, an empty sequence is returned.

1.8.2 (8/8/06)
Minor feature and bug fix release.

This release improves Rats! by featuring a completely rewritten Transformer phase. This phase deduces semantic values, lifts nested choices, repetitions, and options, and desugars repetitions and options. The rewritten code is more modular and (hopefully) more easily maintainable. It also is more accurate in deducing semantic values and more uniform in processing (deeply) nested choices, repetitions, and options. As a result, the rewritten code also fixes a regression identified by Thomas Moschny.

A set of regression tests for Rats! has been added. The tests are invoked by typing make check-rats in the top-level directory of the distribution.

The old version of the transformer phase is still available through the -oldTransform command line option to Rats!. However, it is deprecated and will be removed in the near future.

Error checking of grammars has been improved. In particular:

The folding of equal sequences has been modified so that it does not result in a trailing choice of empty alternatives anymore.

Code generation has been modified to avoid declaring and assigning the yyPredIndex variable if the variable's value is never used. Thanks to Thomas Moschny (and Eclipse) for pointing out this issue.

This release improves the Java grammar by adding support for empty declarations (a semicolon by itself), assert statements, and class selection expressions. Thanks to Terence Parr for identifying these issues.
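
A short Java sketch of the three constructs named above. The class and method names are made up for this example, and placing the empty declaration inside a class body is an assumption about which positions the release note refers to.

```java
// Illustrative only: names are hypothetical, not part of xtc.
class GrammarConstructsSketch {
  ; // empty declaration: a semicolon by itself

  static String describe(int x) {
    // assert statement; only checked when assertions are enabled (java -ea).
    assert x >= 0 : "x must be non-negative";

    // Class selection expression: a type name followed by .class.
    Class<?> type = String.class;
    return type.getSimpleName() + ":" + x;
  }
}
```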

This release also contains a snapshot of the on-going effort towards supporting semantic analysis. Notably, the xtc.type package has been significantly improved and xtc.lang.CAnalyzer has been updated accordingly. However, for now, typing of C programs still is buggy and incomplete.

Finally, unnecessary import declarations have been removed throughout xtc, including from parsers generated by Rats!.

1.8.1 (6/10/06)
Minor bug fix release.

This release renames xtc.parser.BaseParser to ParserBase and xtc.parser.PackratParser to FullParserBase. Additionally, FullParserBase now inherits from ParserBase to avoid code duplication.

Next, this release makes the following changes to Rats!' code generator:

This release also fixes a bug in xtc.lang.JavaAstSimplifier and xtc.lang.JavaPrinter that caused a null pointer exception when pretty printing simplified method declarations. The fixed version of JavaAstSimplifier preserves the number of children in MethodDeclaration AST nodes.

1.8.0 (6/6/06)
Major feature and bug fix release.

This release considerably improves xtc's support for the semantic analysis of programs. In particular, the new xtc.util.SymbolTable class implements a scoped symbol table that easily integrates with AST traversal through xtc's visitors. The new xtc.type package provides representations for a program's types. It currently covers all of C's and Java's types (as of JDK 1.4). The new xtc.lang.CAnalyzer visitor leverages the new classes to fill in the symbol table for a program and to check semantic correctness along the way. However, CAnalyzer is still incomplete and buggy.
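
To illustrate the general idea of a scoped symbol table, here is a minimal self-contained sketch. It is not xtc.util.SymbolTable: the class ScopedTable and its enter, exit, define, and lookup methods are hypothetical names chosen for this example. A visitor would typically call enter() when it reaches a block and exit() when it leaves it.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical scoped symbol table; not xtc.util.SymbolTable.
class ScopedTable {
  // Innermost scope is at the head of the deque.
  private final Deque<Map<String, Object>> scopes = new ArrayDeque<>();

  ScopedTable() {
    scopes.push(new HashMap<>()); // root scope
  }

  // Opens a new nested scope.
  void enter() {
    scopes.push(new HashMap<>());
  }

  // Closes the innermost scope; the root scope cannot be closed.
  void exit() {
    if (scopes.size() == 1) {
      throw new IllegalStateException("cannot exit the root scope");
    }
    scopes.pop();
  }

  // Defines a name in the innermost scope, shadowing outer definitions.
  void define(String name, Object value) {
    scopes.peek().put(name, value);
  }

  // Searches from the innermost scope outward; returns null if undefined.
  Object lookup(String name) {
    for (Map<String, Object> scope : scopes) {
      if (scope.containsKey(name)) {
        return scope.get(name);
      }
    }
    return null;
  }
}
```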

The new interface xtc.Limits specifies the integer range limits for a local system's C compiler. The version distributed with xtc's release is consistent with GCC for Mac OS X on the PowerPC and for Mac OS X, Linux, and Windows on x86 processors. limits.c in the same package can be used to generate the correct limits for other operating systems and architectures.

Next, the C grammar has been changed as follows:

Next, the Java grammar has been improved by using more descriptive names for a large number of productions, by optimizing several productions, and by eliminating the creation of unnecessary AST nodes. The Java printer has been updated accordingly.

Both the recognizer-only and the AST-building Java parsers are now generated from the same grammar through the new genericAsVoid grammar attribute (see below). The top-level module for both versions is xtc.lang.Java and the corresponding parsers now are xtc.lang.JavaRecognizer (no AST) and xtc.lang.JavaParser (AST).

To better evaluate and compare parser performance, the Java driver can now generate ASTs when using JavaCC- or ANTLR-generated parsers. The AST-building JavaCC grammar has been generated with Java Tree Builder (version 1.2.2) from the original JavaCC grammar (dated 5/5/02). The AST-building ANTLR grammar is distributed by the ANTLR project, with the recognizer-only version being manually derived from the original. Both versions of the ANTLR grammar have been updated to version 1.21.

The xtc distribution now contains support for SDF and Elkhound generated Java parsers (again to evaluate and compare parser performance):

The new top-level glr directory contains Java 1.5 and 1.4 grammars for SDF. The 1.5 version is the grammar from the java-front 0.8 distribution (with a differently named top-level module) and the 1.4 version has been derived from the former by removing support for generics, the enhanced for loop, typesafe enums, varargs, static imports, and metadata. The glr/buildsdf.sh script is used to generate the corresponding parse tables and the data/sdf.sh script is used to perform a performance evaluation. The buildsdf.sh script depends on the pack-sdf and sdf2table tools, while the sdf.sh script depends on the sglr and sglri tools.
The Elkhound-based Java parser, called Ella, is contained in the glr/ella directory. It includes the corresponding lexical, syntactic, and AST specifications as well as any supporting C++ code. Ella depends on the smbase, ast, elkhound, and elsa packages from Elkhound's source distribution. It can be built by copying the corresponding directories into the glr directory and then executing ./configure and make in that directory. The data/ella.sh script is used to evaluate Ella's performance.

This release makes the following changes to Rats!:

As a result of these changes, the throughput of the AST-building Java parser has improved by 31.5% and the throughput of the C parser has improved by 52%.

Thanks to Laune Harris, this release makes the following major changes to XForm:

Additionally, several minor XForm bugs have been fixed.

Java's access control is now disabled for xtc's visitor dispatch. As a result, visitors can now be specified as anonymous inner classes. For example, xtc.lang.CAnalyzer uses this feature to analyze declaration specifiers and declarators.

Generic nodes now need to be created through a set of factory methods; look for the create() methods in xtc.tree.GNode. Several of these methods directly accept a generic node's children and return generic nodes that are specialized for the specified number of children. As a result, such fixed size nodes do not support the add(), addAll(), and remove() methods defined by xtc.tree.GNode. They can be distinguished from variable sized nodes through isVariable() and converted to variable sized nodes through ensureVariable(GNode). Rats!' new gnodes optimization (see below) utilizes these factory methods to reduce the memory and performance overhead of parsers with generic productions.
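The distinction between fixed size and variable sized nodes can be sketched as follows. This is a minimal, self-contained illustration: the classes NodeFactorySketch, Fixed2, and Variable are hypothetical, and only the method names create(), isVariable(), ensureVariable(), add(), and size() are borrowed from the description above. The actual signatures in xtc.tree.GNode may differ.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of fixed size vs. variable sized nodes; not xtc.tree.GNode. */
public class NodeFactorySketch {
  /** Common node type; by default, structural modification is rejected. */
  public abstract static class Node {
    public final String name;
    Node(String name) { this.name = name; }
    public abstract int size();
    public abstract Object get(int index);
    public abstract boolean isVariable();
    public void add(Object child) {
      throw new UnsupportedOperationException("fixed size node " + name);
    }
  }

  /** A node specialized for exactly two children; no backing list is allocated. */
  static final class Fixed2 extends Node {
    private final Object c1, c2;
    Fixed2(String name, Object c1, Object c2) {
      super(name);
      this.c1 = c1;
      this.c2 = c2;
    }
    @Override public int size() { return 2; }
    @Override public Object get(int index) {
      if (index == 0) return c1;
      if (index == 1) return c2;
      throw new IndexOutOfBoundsException("index " + index);
    }
    @Override public boolean isVariable() { return false; }
  }

  /** A node with a growable list of children. */
  static final class Variable extends Node {
    private final List<Object> children = new ArrayList<>();
    Variable(String name) { super(name); }
    @Override public int size() { return children.size(); }
    @Override public Object get(int index) { return children.get(index); }
    @Override public boolean isVariable() { return true; }
    @Override public void add(Object child) { children.add(child); }
  }

  /** Factory method for an empty, variable sized node. */
  public static Node create(String name) { return new Variable(name); }

  /** Factory method that accepts the children directly and returns a fixed size node. */
  public static Node create(String name, Object c1, Object c2) {
    return new Fixed2(name, c1, c2);
  }

  /** Returns the node itself if it is variable sized, or a variable sized copy otherwise. */
  public static Node ensureVariable(Node n) {
    if (n.isVariable()) return n;
    Node copy = new Variable(n.name);
    for (int i = 0; i < n.size(); i++) copy.add(n.get(i));
    return copy;
  }
}
```

In this sketch, code that needs to add children to a node of unknown origin would first pass it through ensureVariable() and then work with the returned node.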

This release introduces improved support for building language tools with xtc. In particular, the new xtc.util.Runtime class manages command line options, errors and warnings, and output to the standard console. The new xtc.util.Tool class provides a skeleton tool implementation, including support for several default command line options. Rats! and the C, Java, and XForm drivers have been rewritten to utilize both classes. Note that, as a result of this rewrite, some command line options for these tools have changed.

This release also introduces our first unit tests. We rely on JUnit as our unit testing framework and JUnit's binary release (junit.jar) must be in the classpath. Thanks to Anh Le, this release also introduces our first regression tests, based on GCC's regression tests. Just like GCC, we rely on expect and DejaGnu to perform these tests. The description of our development setup and the sample shell scripts (setup.bat and setup.sh) have been updated accordingly.

xtc now builds with JDK 1.5 by passing the -source 1.4 flag to the javac compiler. All sources remain at Java version 1.4.

xtc's licensing has been changed: Most of the code is now released under the GNU General Public License (GPL) version 2. The exceptions are xtc.parser.BaseParser, xtc.parser.Column, xtc.parser.Result, xtc.parser.SemanticValue, xtc.parser.ParseError, xtc.tree.Location, and xtc.util.Pair, which are released under the GNU Lesser General Public License (LGPL) version 2.1. The main licensing change is that the option of using later versions of the GPL and LGPL has been removed.

Thanks to Marco Yuen and Marc Fiuczynski, this release incorporates C4, the CrossCutting C Compiler. C4 makes aspect-oriented software development techniques available to C programmers, with the goal of simplifying the development of software variants, notably for the Linux kernel.

1.7.1 (8/17/05)
Minor feature and bug fix release.

This release makes the following changes to Rats!:

The C, Java, and XForm grammars have been modified to utilize the new attributes. Additionally, the C and Java grammars have been further modularized, up to the respective top-level module, which now simply modifies another, parameterized module.

Additionally, this release makes the following changes to xtc's C support:

xtc.tree.GNode's interface has been improved. In particular, numberOfChildren() has been renamed to size(), addAll(List) has been changed to addAll(Collection), and add(int,Object), addAll(int,Pair), and addAll(int,Collection) have been added.

A bug in XForm, which caused the result of a query to contain internal item objects, has been fixed.

1.7.0 (8/9/05)
Major feature release.

In short, this release adds a module system to Rats!, adds support for building and printing an AST in the Java driver, fixes several bugs in the C parser and printer, and includes a significantly improved XForm, our AST query and transformation engine.

In more detail, this release introduces a simple yet powerful module system for Rats!. The module system supports basic modules to factor grammars into re-usable units. It supports module modifications to concisely specify extensions. Finally, it supports module parameters to easily compose different extensions with each other. As a result, the format of grammar specifications has been changed and grammars not distributed with this release need to be modified. The module system is described in detail in the package documentation for xtc.parser.

To get a peek at modules, execute the following command in src/xtc/lang:

  java xtc.parser.Rats -in ../.. -instantiated -html C.rats

Then open the resulting xtc.lang.C.html file in your web browser and explore.

This release makes the following, additional changes to Rats!:

The Java driver can now optionally build an abstract syntax tree and also pretty print that tree. Thanks to Stacey Kuznetsov for implementing the necessary changes.

The C grammar and pretty printer have been improved as follows:

XForm, the query and transformation engine, has been improved as follows. Thanks to Joe Pamer for realizing these changes.

1.6.1 (6/11/05)
Minor bug fix release. This release eliminates NullPointerExceptions in xtc.lang.CPrinter.visitStructureDeclarationList() and in xtc.xform.Item.equals().
1.6.0 (6/11/05)
Performance tuning release. This release focuses on improving performance and a corresponding code clean-up; as a result, this release may break existing code. Performance tests on a 2002 iMac (with an 800 MHz PowerPC G4 processor and 1 GB of RAM) show that Java driver throughput has improved by 49%, from 256 KB/s up to 382 KB/s, and heap utilization has improved by 25%, from 58:1 (i.e., 58 bytes of heap per 1 byte in the input) down to 43:1. C driver performance for parsing and pretty printing the entire Linux 2.6.10 kernel (~1,000 files) has improved by 35%, from 211 minutes down to 137 minutes. Improvements are similar for a faster machine: C driver performance for parsing and pretty printing the Linux kernel on a 2004 PowerMac (with two 2.5 GHz PowerPC G5 processors and 1 GB of RAM) has improved by 34%, from 56 minutes down to 37 minutes. All our C driver experiments used a Java heap size of 512 MB (both minimum and maximum size); performance improvements for configurations with smaller heaps are likely to be much more pronounced.

In detail, this release makes the following performance-related improvements:

Thanks to Adam Kravetz for helping to identify several opportunities for optimizations.

This release also cleans up the interface between nodes and visitors. In particular, dispatch can now only be initiated by calling Visitor.dispatch(Node) (instead of Node.accept(Visitor)). Furthermore, processing methods specified as part of nodes are now named Node.visitWith(Visitor) (instead of Node.process(Visitor)). In contrast to accept(), dispatch() handles null nodes, doing nothing and returning null. Furthermore, if the selected visit() or visitWith() method has void as its return type, dispatch() returns the specified node (instead of null).
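The dispatch() behavior described above (null handling, selection of visit() by the node's type, and returning the node for void methods) can be approximated with plain Java reflection. The following sketch is self-contained and hypothetical: DispatchSketch and its nested classes are illustrative names and do not reproduce xtc's Visitor implementation, and the visitWith() path is omitted.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

/** Hypothetical sketch of reflective visitor dispatch; not xtc's actual Visitor class. */
public class DispatchSketch {
  public static class Node {}
  public static class Literal extends Node {}
  public static class Block extends Node {}

  public abstract static class Visitor {
    /** Selects a visit() method by the node's run-time type and invokes it. */
    public Object dispatch(Node n) {
      if (n == null) return null; // null nodes: do nothing and return null
      Method m = findVisit(n.getClass());
      if (m == null) {
        throw new IllegalStateException("no visit() method for " + n.getClass().getName());
      }
      try {
        Object result = m.invoke(this, n);
        // A void visit() method yields the node itself instead of null.
        return m.getReturnType() == void.class ? n : result;
      } catch (IllegalAccessException | InvocationTargetException e) {
        throw new IllegalStateException(e);
      }
    }

    /** Looks for a public visit(T), trying the node's class first, then its superclasses. */
    private Method findVisit(Class<?> type) {
      for (Class<?> c = type; c != null; c = c.getSuperclass()) {
        try {
          return getClass().getMethod("visit", c);
        } catch (NoSuchMethodException e) {
          // No match for this type; try the superclass next.
        }
      }
      return null;
    }
  }

  /** Example visitor with one value-returning and one void visit() method. */
  public static class Demo extends Visitor {
    public int blocks = 0;
    public String visit(Literal l) { return "literal"; }
    public void visit(Block b) { blocks++; }
  }

  public static void main(String[] args) {
    Demo v = new Demo();
    Block b = new Block();
    System.out.println(v.dispatch(new Literal()));
    System.out.println(v.dispatch(b) == b);
    System.out.println(v.dispatch(null));
  }
}
```

Because a void visit() method makes dispatch() return the node, a caller can chain traversal code without checking which kind of visit() method was selected.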

Rats!' internal visitors have been updated to utilize dispatch(). Additionally, many visitors have been refactored to utilize a common superclass, xtc.parser.GrammarVisitor, which reduces code bloat across Rats!' internal visitors. All visitors in xtc.lang were already using dispatch().

Furthermore, this release makes the following changes to Rats!:

This release also adds support for local label declarations to the C grammar and pretty printer. Additionally, the C grammar, symbol table, and pretty printer have been modified, so that annotations encapsulating regular AST nodes now represent the compiler directives preceding that node's text in the input (instead of the other way around).

The new xtc.xform package provides a facility for querying and transforming abstract syntax trees (ASTs). The query language is inspired by XPath 2.0, but has some significant differences, notably to destructively modify ASTs. Thanks to Joe Pamer for implementing the query and transformation engine.

1.5.2 (3/7/05)
Minor feature and bug fix release. This release changes Rats! so that all repetitions appearing in transient productions are implemented through iterations instead of being desugared into the corresponding recursive expressions. This can be used to avoid stack overflow errors for long sequences of expressions. This release also fixes a bug in Rats! that caused repeated sequences to be lifted too aggressively.
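To see why iteration helps, compare a repetition that has been desugared into a recursive expression with the same repetition implemented as a loop. The following sketch is hand-written and hypothetical; it does not show code generated by Rats!, and it matches a fixed character instead of a general expression.

```java
/** Hypothetical sketch contrasting recursive and iterative matching of 'a'*. */
public class RepetitionSketch {
  /**
   * Desugared form: Items <- 'a' Items / empty.
   * Each matched item adds one stack frame, so very long inputs
   * can exhaust the Java stack.
   */
  public static int matchRecursive(String s, int index) {
    if (index < s.length() && s.charAt(index) == 'a') {
      return matchRecursive(s, index + 1);
    }
    return index;
  }

  /**
   * Iterative form of the same repetition.
   * The stack depth is constant regardless of input length.
   */
  public static int matchIterative(String s, int index) {
    while (index < s.length() && s.charAt(index) == 'a') {
      index++;
    }
    return index;
  }

  public static void main(String[] args) {
    System.out.println(matchRecursive("aaab", 0));
    System.out.println(matchIterative("aaab", 0));
  }
}
```

Both methods return the index just past the last matched item; they differ only in how much stack they consume on long inputs.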

All tools now return appropriate exit codes: 0 on successful execution and 1 on error conditions.
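A tool following this convention might be structured as in the sketch below. ExitCodeSketch and its run() method are hypothetical and do not correspond to any xtc class; the argument check merely stands in for a tool's real work.

```java
/** Hypothetical sketch of a tool that reports success or failure through its exit code. */
public class ExitCodeSketch {
  /** Runs the tool and returns 0 on success or 1 on any error. */
  public static int run(String[] args) {
    try {
      if (args.length == 0) {
        throw new IllegalArgumentException("no input file specified");
      }
      // ... the tool's actual processing would go here ...
      return 0;
    } catch (RuntimeException e) {
      System.err.println("error: " + e.getMessage());
      return 1;
    }
  }

  public static void main(String[] args) {
    // Pass the result to the operating system so that scripts can test it.
    System.exit(run(args));
  }
}
```

Keeping the logic in run() and calling System.exit() only in main() makes the exit code testable without terminating the Java virtual machine.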

This release also improves the C grammar and pretty printer. In particular, it fixes bugs in:

For some nested expressions (such as arithmetic expressions appearing as operands for the bitwise or operator), the pretty printer now emits parentheses to avoid warnings when compiling the resulting code with GCC under the -Wparentheses command line option.

Additionally, the C grammar and pretty printer now support the following (GCC) extensions:

The C grammar now accepts source files with just white space and comments. Furthermore, Rats!-generated parsers, when created with an explicit file size argument to the constructor, now accept empty files (i.e., of length 0).

The overall effect is that the C driver (xtc.lang.CDriver) now parses and pretty prints the entire Linux kernel (version 2.6.8). The resulting source code compiles without warnings with GCC under the -Wall command line option.

Thanks to Marc Fiuczynski for identifying most of the bugs and missing language constructs and for testing the C driver against the Linux kernel.

This release also changes the format of pretty printed ASTs to be more compact (and to be consistent with the AST query language currently being developed). Pairs (xtc.util.Pair) are now mutable, but should still be treated as immutable if they are memoized by a Rats!-generated parser.

1.5.1 (12/16/04)
Bug fix release.

It makes the following changes:

1.5.0 (11/11/04)
Performance tuning and bug fix release. Parsers generated by Rats! now use arrays of read-in characters and memoized results instead of a linked list of parser objects. The current parser position now is an explicit index into these arrays instead of a reference to a parser object. Performance tests with the Java parser show that, compared with previous versions, the parser consumes only half the memory and takes only 80% of the time when recognizing Java source files.

Note that this release changes the basic parser interface and is not backwards-compatible. In particular, parsing methods now take an explicit index argument (named yyStart), and the character() method returns an int instead of a Result. Furthermore, parsers perform best if they are created with the three-argument constructor, which includes the length of the input. For example, the following code snippet parses a file named fileName of size fileSize with reader in and top-level production TopLevel:

  Parser p = new Parser(in, fileName, fileSize);
  Result r = p.pTopLevel(0);
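The idea of indexing characters and memoized results by input position can be sketched with a hand-written toy parser. The following code is hypothetical and much simpler than what Rats! generates: it reads from a String instead of a reader, memoizes a single production, and stores only the end index of a match. The names character(), pDigits(), and yyStart merely mirror the naming described above.

```java
import java.util.Arrays;

/** Hypothetical sketch of array-based memoization; not code generated by Rats!. */
public class ArrayMemoSketch {
  private static final int UNKNOWN = -2; // production not yet tried at this index
  public static final int FAIL = -1;     // production tried and failed at this index

  private final char[] input;
  private final int[] digitsEnd; // memo table for Digits, indexed by start position
  public int evaluations = 0;    // counts how often Digits is actually evaluated

  public ArrayMemoSketch(String text) {
    input = text.toCharArray();
    digitsEnd = new int[input.length + 1];
    Arrays.fill(digitsEnd, UNKNOWN);
  }

  /** Returns the character at the index as an int, or -1 at the end of input. */
  public int character(int index) {
    return index < input.length ? input[index] : -1;
  }

  /** Digits <- [0-9]+ ; returns the index after the match or FAIL. */
  public int pDigits(int yyStart) {
    if (digitsEnd[yyStart] != UNKNOWN) {
      return digitsEnd[yyStart]; // reuse the memoized result
    }
    evaluations++;
    int i = yyStart;
    while (character(i) >= '0' && character(i) <= '9') {
      i++;
    }
    digitsEnd[yyStart] = (i > yyStart) ? i : FAIL;
    return digitsEnd[yyStart];
  }

  public static void main(String[] args) {
    ArrayMemoSketch p = new ArrayMemoSketch("123+4");
    System.out.println(p.pDigits(0));
    System.out.println(p.pDigits(0)); // served from the memo table
    System.out.println(p.evaluations);
  }
}
```

Knowing the input length up front lets such a parser allocate its arrays once, which is one plausible reason why passing the file size to the constructor helps performance.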

This release also makes the following changes:

1.4.2 (9/23/04)
Performance tuning and bug fix release:
1.4.1 (9/16/04)
Minor feature and bug fix release:
1.4.0 (9/7/04)
Feature release.

This release focuses on Rats!' automatic generation of abstract syntax trees (through generic nodes). Notable improvements include:

Additionally, the newly added state attribute and the corresponding xtc.util.State interface help with writing grammars that are context-sensitive and require global state. The state attribute, as well as the debug, location, and constantBinding attributes can now also be specified on a per-production basis, simply by including them before the production's type. Next, sequences can now be named; the name is specified as the first element in a sequence by including it between less-than < and greater-than > signs. Furthermore, the readability of printed grammars and generated code has been improved through new line-wrapping facilities in xtc.tree.Printer. Finally, identifiers may now contain underscores (_).

Almost all of the newly added features are utilized by the new grammar for C and the corresponding pretty printer (in the xtc.lang package). Parser and pretty printer can be tested by executing "java xtc.lang.CParser <file>".

1.3.0 (4/21/04)
Feature release. This release adds the ability to automatically generate abstract syntax trees (ASTs) in Rats!. A production that should result in a generic node (xtc.tree.GNode) as its semantic value has generic as its type. The corresponding generic node has the same name as the production, and the children of the generic node are the semantic values of all component expressions in the matched sequence, with the exception of character terminals and nonterminals referencing void productions.
1.2.2 (4/16/04)
Internal release. Fixed a bug in the desugaring of repeated sequences for Rats!; thanks to Robin Lee Powell for identifying this bug.
1.2.1 (4/13/04)
Bug fix release. Fixed a bug in Rats!, under which options were too aggressively simplified; thanks to Robin Lee Powell for pointing out the incorrect behavior resulting from this bug. Also fixed two bugs in the processing of nested choices. Finally, fixed a bug in the handling of bindings to nested choices.
1.2.0 (4/9/04)
Feature release. This release improves the reflection-based dynamic dispatch for nodes and visitors by also allowing functionality to be expressed as part of nodes: While visit() methods in visitors are selected based on the type of the node, the corresponding process() methods in nodes are selected based on the type of the visitor. The dynamic dispatch mechanism first tries to locate a process() method and, if none can be found, tries to locate the corresponding visit() method.
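The lookup order described above, process() in the node first and visit() in the visitor second, can be sketched with plain Java reflection. ProcessFirstSketch and its nested classes are hypothetical and do not reproduce xtc's dispatch code; in particular, only public methods and superclass chains are searched.

```java
import java.lang.reflect.Method;

/** Hypothetical sketch of process()-then-visit() dispatch; not xtc's implementation. */
public class ProcessFirstSketch {
  public static class Visitor {}
  public static class Printer extends Visitor {
    // Visitor-side behavior, selected by the node's type.
    public String visit(Node n) { return "visit(Node)"; }
  }

  public static class Node {}
  public static class Comment extends Node {
    // Node-side behavior, selected by the visitor's type.
    public String process(Printer p) { return "process(Printer)"; }
  }

  /** Finds a public one-argument method matching the argument's class or a superclass. */
  static Method find(Object target, String name, Object arg) {
    for (Class<?> c = arg.getClass(); c != null; c = c.getSuperclass()) {
      try {
        return target.getClass().getMethod(name, c);
      } catch (NoSuchMethodException e) {
        // No match for this parameter type; try the superclass next.
      }
    }
    return null;
  }

  /** Tries the node's process() method first and falls back to the visitor's visit(). */
  public static Object dispatch(Node n, Visitor v) {
    try {
      Method m = find(n, "process", v);
      if (m != null) return m.invoke(n, v);
      m = find(v, "visit", n);
      if (m != null) return m.invoke(v, n);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException(e);
    }
    throw new IllegalStateException("no process() or visit() method found");
  }

  public static void main(String[] args) {
    Printer printer = new Printer();
    System.out.println(dispatch(new Comment(), printer));
    System.out.println(dispatch(new Node(), printer));
  }
}
```

Here Comment handles the Printer itself, while a plain Node falls through to the visitor's visit() method.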

This release also makes the following improvements to Rats!:

This release also includes support for generic abstract syntax tree nodes, though they cannot yet be generated automatically.

1.1.0 (2/3/04)
Minor feature and bug fix release. This release improves Rats! by fixing a bug in the processing of syntactic predicates and by adding a new optimization that avoids stack overflow errors on some Java virtual machines. The Rats! tool, rats, now supports command line flags to control which optimizations to perform. The Java parser tool, pjava, now supports the printing of parser statistics in white-space delimited format (to make it easier to import the data into spreadsheets).
1.0.0 (1/21/04)
Initial release.