Summary of Objectives and Approach.
Detailed Summary of Technical Progress.
The preceding work attracted attention at a Dagstuhl workshop last year, and led to an invited talk at the joint PLILP/ALP Conference in September 1994, where our program development technology was publicly demonstrated and written up in the proceedings.
As encouraging as this sounds, two hurdles still need to be overcome before this novel program development technology can be practical. First, the technology needs to scale up. The SIGSOFT paper reported the experimental development of nine C programs, the largest of which is around 10 pages long; a much more credible case would be made with programs 50 pages long. A related problem is the slow rate at which C programs are mechanically produced from hand-coded high-level specifications: currently, 3.3 lines of specification are turned into 24 lines of C every minute. A plan for increasing this production rate 6000-fold is outlined in the PLILP/ALP proceedings.
Much of the research over the last year was in preparation for scaling up the program development technology. At the heart of our transformational methodology is a type/subtype system that facilitates perspicuous mathematical specification of efficient data structures and algorithms without extraneous implementation detail. More importantly, such specifications lead to a set-theoretic language with computational transparency; i.e., programs can be analyzed accurately for run-time resource utilization, so that programming can be guided by complexity considerations.
Most of our effort last year (as part of the scale-up effort) went into extending the type/subtype system with an abstract datatype capability and with a crucial, but technically difficult, ability to handle recursively defined subtypes with disjoint alternation (which considerably broadens the contexts in which a unit-time implementable associative access can be used in our specification language). This theoretical work is now complete, and the more pragmatic research involved in fitting it into a programming language has just begun. Since reading high-level external input is the beginning of computation, we decided to begin this work by developing algorithms that translate high-level external input data in string form into the efficient data structures modeled by the new type/subtype system. This work is now done, and the core algorithms are reported in the Proceedings of the IFIP 94 Conference. The next step is to develop new type and subtype inference algorithms for the specification language, and to modify our specification compilers to exploit the new type/subtype system and to generate efficient C code.
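The flavor of such reading algorithms can be conveyed with a minimal sketch. Python is used here purely for exposition (it is not the project's specification language), and the dictionary stands in for the hashing-free discrimination structures of the actual algorithms: each distinct symbol read from external string input is interned as a small integer, after which equality tests and array-indexed associative access take unit time.

```python
def read_symbols(text):
    """Read whitespace-separated symbols from external string input and
    intern each one as a small integer.  After reading, two occurrences
    denote the same symbol iff their integers are equal, so equality
    tests and array-indexed associative access run in unit time."""
    table = {}   # symbol -> integer; a stand-in for a hashing-free structure
    out = []
    for tok in text.split():
        out.append(table.setdefault(tok, len(table)))
    return out, table

codes, table = read_symbols("f x f y x")
# codes == [0, 1, 0, 2, 1]
```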
Last year we were involved in several algorithm design projects, with the aim of implementing the new algorithms as part of experiments with scaled-up programs. In a journal article submitted to TCS (and now provisionally accepted), Cai and the investigator extended the simple algorithm of Paige and Tarjan (SICOMP 87) for finding duplicate strings in a multiset of strings into a general-purpose `multiset discrimination' tool for finding duplicate data contained in a wide subset of our new type system. Using this tool, we obtained improved solutions to virtually every compilation task, from front-end macro processing down to global optimization by strength reduction. The tool was also central to the development of the reading algorithms in the IFIP 94 paper.
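The core idea of multiset discrimination on strings can be sketched as follows. This is an expository Python sketch of the string case only, not the general tool, and the dictionary stands in for the character-indexed bucket array of the linear-time algorithm: a single class of string indices is refined position by position until each class holds exactly the indices of one repeated string.

```python
def discriminate(strings):
    """Partition the indices of `strings` into classes of identical strings
    by bucket partitioning on successive character positions (no comparison
    sorting).  With constant-time character-indexed buckets, total work is
    linear in the combined length of the input strings."""
    classes = [list(range(len(strings)))]
    result = []
    pos = 0
    while classes:
        next_classes = []
        for cls in classes:
            buckets = {}   # stand-in for an array indexed by character code
            done = []      # strings exhausted at this position are duplicates
            for i in cls:
                if len(strings[i]) == pos:
                    done.append(i)
                else:
                    buckets.setdefault(strings[i][pos], []).append(i)
            if done:
                result.append(done)
            next_classes.extend(buckets.values())
        classes = next_classes
        pos += 1
    return result

# e.g. discriminate(["ab", "cd", "ab", "a"]) yields the classes {0,2}, {1}, {3}
```

Because each string participates only until it is exhausted, no character is examined more than once, which is where the linear bound comes from.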
With Nils Klarlund at the U. of Aarhus we developed new DFA minimization algorithms for large alphabets and for a potentially small number of transitions leading out of any state. These algorithms make use of new, improved BDD simplification algorithms that depend on multiset discrimination. Klarlund needs these algorithms to speed up his program verification system.
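For contrast, the standard approach these algorithms improve upon can be sketched by textbook Moore-style partition refinement. This is an expository Python sketch, not the new large-alphabet or BDD-based algorithms: it enumerates the full alphabet at every state in every round, which is exactly the cost that becomes prohibitive for large alphabets.

```python
def minimize(states, alphabet, delta, accepting):
    """Textbook Moore-style partition refinement for DFA minimization.
    `delta` is a total transition map (state, symbol) -> state; the result
    maps each state to the id of its class in the minimal DFA."""
    part = {s: int(s in accepting) for s in states}   # split: accepting vs. not
    while True:
        # A state's signature: its current class and its successors' classes.
        sig = {s: (part[s],) + tuple(part[delta[s, a]] for a in alphabet)
               for s in states}
        ids, new = {}, {}
        for s in states:
            new[s] = ids.setdefault(sig[s], len(ids))
        if len(ids) == len(set(part.values())):       # no class was split
            return new
        part = new
```

Each round costs time proportional to the number of states times the alphabet size, which motivates algorithms whose cost depends instead on the transitions actually present.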
With Neil Jones at the U. of Copenhagen we developed new unification algorithms that accelerate the classical union/find-based algorithms of Huet and of Vitter and Simons. Our algorithms are also top-down, and they report occurs-check failures early. We plan to generate C code from high-level specifications of these algorithms, and to conduct comparative benchmarks against implementations of other algorithms.
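For reference, the union/find approach underlying the classical algorithms can be sketched as follows. This is an expository Python sketch with our own term representation; it performs the occurs check late, while extracting the substitution, and so does not reproduce the accelerated, early-occurs-check algorithms described above.

```python
class OccursCheck(Exception):
    """A variable would be bound to a term containing itself."""

def unify(s, t):
    """Unify two first-order terms: a variable is a string, a constant or
    compound term is a tuple (symbol, arg1, ..., argn).  Returns a most
    general substitution as a dict, or None on failure."""
    parent = {}                        # union/find forest; only variables get links

    def find(x):                       # follow binding links to a representative
        while x in parent:
            x = parent[x]
        return x

    work = [(s, t)]
    while work:
        a, b = work.pop()
        a, b = find(a), find(b)
        if a == b:
            continue
        if isinstance(a, str):         # variable: bind it to the other side
            parent[a] = b
        elif isinstance(b, str):
            parent[b] = a
        elif a[0] == b[0] and len(a) == len(b):
            work.extend(zip(a[1:], b[1:]))   # same symbol and arity: unify args
        else:
            return None                # symbol or arity clash

    def resolve(x, path):              # apply bindings fully, detecting cycles
        if isinstance(x, str):
            if x in path:
                raise OccursCheck(x)
            r = find(x)
            return x if r == x else resolve(r, path | {x})
        return (x[0],) + tuple(resolve(arg, path) for arg in x[1:])

    def variables(x):
        if isinstance(x, str):
            yield x
        else:
            for arg in x[1:]:
                yield from variables(arg)

    try:
        return {v: resolve(v, set())
                for v in set(variables(s)) | set(variables(t))}
    except OccursCheck:
        return None
```

For example, unifying f(X, g(Y)) with f(g(Z), X) binds X to g(Z) and Y to Z, while unifying X with f(X) fails the occurs check and returns None.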
With J.P. Keller we derived a new DFA minimization algorithm with asymptotically better space than Hopcroft's 1971 algorithm. This research also involved integrating program transformations in APTS with set-theoretic proof construction and checking in Keller's Aetna system. One conclusion drawn from this work is that the more rapidly APTS can produce high-performance C code, the more critical it becomes to equip APTS with the safety of machine-checked verification of the transformations involved in code production. Because we observed how difficult and painfully slow such formal proofs are to construct (a view shared by expert users of other systems), solving the formidable problem of making proof construction easier is essential to scaling up our technology in a practical way.
Besides the algorithms just mentioned, the new algorithms derived in journal articles by the investigator with Chang and with Bloom are candidates for scaled-up experiments.
The inference engine, the critical component of the APTS program transformation system that performs program analysis, has been submitted for journal publication. An independent application of APTS as part of a running geometric constraint solving system is reported in a journal article that has been accepted by CAD. A survey of our research, together with relevant open problems, was published in the Proceedings of the Atlantique Workshop.
Transitions and DOD Interactions.
Software and Hardware Prototypes.
Invited and Contributed Presentations.
Honors, Prizes or Awards Received.
Project Personnel Promotions Obtained.