Summary of Objectives and Approach.
Detailed Summary of Technical Progress.
The preceding work attracted attention at a Dagstuhl workshop last year, and led to an invited talk at the joint PLILP/ALP Conference in September 1994, where our program development technology was publicly demonstrated and written up in the proceedings.
As encouraging as this sounds, two hurdles still need to be overcome before this novel program development technology can be practical. First, the technology needs to scale up. The SIGSOFT paper reported the experimental development of nine C programs, the largest of which is around 10 pages long; a much more credible case would be made with programs 50 pages long. A related problem is the slow rate at which C programs are mechanically produced from hand-coded high-level specifications: currently, 3.3 lines of specification are turned into 24 lines of C every minute. A plan for increasing this production rate 6000-fold is outlined in the PLILP/ALP proceedings.
Much of the research over the last year was in preparation for scaling up the program development technology. At the heart of our transformational methodology is a type/subtype system that facilitates perspicuous mathematical specification of efficient data structures and algorithms without extraneous implementation detail. More importantly, such specifications lead to a set-theoretic language with computational transparency; i.e., programs can be analyzed accurately for run-time resource utilization, so that programming can be guided by complexity considerations.
Most of our effort last year (as part of the scale-up effort) went into extending the type/subtype system with an abstract datatype capability and with a crucial, but technically difficult, ability to handle recursively defined subtypes with disjoint alternation (which considerably broadens the contexts in which a unit-time implementable associative access can be used in our specification language). This theoretical work is now complete, and the more pragmatic research involved in fitting it into a programming language has just begun. Since reading high-level external input is the beginning of computation, we decided to begin this work by developing algorithms that translate high-level external input data in string form into the efficient data structures modeled by the new type/subtype system. This work is now done, and the core algorithms are reported in the Proceedings of the IFIP 94 Conference. The next step is to develop new type and subtype inference algorithms for the specification language, and to modify our specification compilers to exploit the new type/subtype system and to generate efficient C code.
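The flavor of such reading algorithms can be conveyed with a minimal sketch. Python is used here purely for exposition (it is not the project's specification language), and the dictionary stands in for the hashing-free discrimination structures of the actual algorithms: each distinct symbol read from external string input is interned as a small integer, after which equality tests and array-indexed associative access take unit time.

```python
def read_symbols(text):
    """Read whitespace-separated symbols from external string input and
    intern each one as a small integer.  After reading, two occurrences
    denote the same symbol iff their integers are equal, so equality
    tests and array-indexed associative access run in unit time."""
    table = {}   # symbol -> integer; a stand-in for a hashing-free structure
    out = []
    for tok in text.split():
        out.append(table.setdefault(tok, len(table)))
    return out, table

codes, table = read_symbols("f x f y x")
# codes == [0, 1, 0, 2, 1]
```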
Last year we were involved in several algorithm design projects, with the aim of implementing the new algorithms as part of experiments with scaled-up programs. In a journal article submitted to TCS (and now provisionally accepted), Cai and the investigator extended the simple algorithm of Paige and Tarjan (SICOMP 87) for finding duplicate strings in a multiset of strings into a general-purpose `multiset discrimination' tool for finding duplicate data contained in a wide subset of our new type system. Using this tool, we obtained improved solutions to virtually every compilation task, from front-end macro processing down to global optimization by strength reduction. The tool was also central to the development of the reading algorithms in the IFIP 94 paper.
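The core idea of multiset discrimination on strings can be sketched as follows. This is an expository Python sketch of the string case only, not the general tool, and the dictionary stands in for the character-indexed bucket array of the linear-time algorithm: a single class of string indices is refined position by position until each class holds exactly the indices of one repeated string.

```python
def discriminate(strings):
    """Partition the indices of `strings` into classes of identical strings
    by bucket partitioning on successive character positions (no comparison
    sorting).  With constant-time character-indexed buckets, total work is
    linear in the combined length of the input strings."""
    classes = [list(range(len(strings)))]
    result = []
    pos = 0
    while classes:
        next_classes = []
        for cls in classes:
            buckets = {}   # stand-in for an array indexed by character code
            done = []      # strings exhausted at this position are duplicates
            for i in cls:
                if len(strings[i]) == pos:
                    done.append(i)
                else:
                    buckets.setdefault(strings[i][pos], []).append(i)
            if done:
                result.append(done)
            next_classes.extend(buckets.values())
        classes = next_classes
        pos += 1
    return result

# e.g. discriminate(["ab", "cd", "ab", "a"]) yields the classes {0,2}, {1}, {3}
```

Because each string participates only until it is exhausted, no character is examined more than once, which is where the linear bound comes from.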
With Nils Klarlund at the U. of Aarhus we developed new DFA minimization algorithms for large alphabets and for a potentially small number of transitions leading out of any state. These algorithms make use of new, improved BDD simplification algorithms that depend on multiset discrimination. Klarlund needs these algorithms to speed up his program verification system.
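For contrast, the standard approach these algorithms improve upon can be sketched by textbook Moore-style partition refinement. This is an expository Python sketch, not the new large-alphabet or BDD-based algorithms: it enumerates the full alphabet at every state in every round, which is exactly the cost that becomes prohibitive for large alphabets.

```python
def minimize(states, alphabet, delta, accepting):
    """Textbook Moore-style partition refinement for DFA minimization.
    `delta` is a total transition map (state, symbol) -> state; the result
    maps each state to the id of its class in the minimal DFA."""
    part = {s: int(s in accepting) for s in states}   # split: accepting vs. not
    while True:
        # A state's signature: its current class and its successors' classes.
        sig = {s: (part[s],) + tuple(part[delta[s, a]] for a in alphabet)
               for s in states}
        ids, new = {}, {}
        for s in states:
            new[s] = ids.setdefault(sig[s], len(ids))
        if len(ids) == len(set(part.values())):       # no class was split
            return new
        part = new
```

Each round costs time proportional to the number of states times the alphabet size, which motivates algorithms whose cost depends instead on the transitions actually present.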
With Neil Jones at the U. of Copenhagen we developed new unification algorithms that accelerate the classical union/find-based algorithms of Huet and of Vitter and Simons. Our algorithms are also top-down, and they report occurs-check failures early. We plan to generate C code from high-level specifications of these algorithms, and to conduct comparative benchmarks against implementations of other algorithms.
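For reference, the union/find approach underlying the classical algorithms can be sketched as follows. This is an expository Python sketch with our own term representation; it performs the occurs check late, while extracting the substitution, and so does not reproduce the accelerated, early-occurs-check algorithms described above.

```python
class OccursCheck(Exception):
    """A variable would be bound to a term containing itself."""

def unify(s, t):
    """Unify two first-order terms: a variable is a string, a constant or
    compound term is a tuple (symbol, arg1, ..., argn).  Returns a most
    general substitution as a dict, or None on failure."""
    parent = {}                        # union/find forest; only variables get links

    def find(x):                       # follow binding links to a representative
        while x in parent:
            x = parent[x]
        return x

    work = [(s, t)]
    while work:
        a, b = work.pop()
        a, b = find(a), find(b)
        if a == b:
            continue
        if isinstance(a, str):         # variable: bind it to the other side
            parent[a] = b
        elif isinstance(b, str):
            parent[b] = a
        elif a[0] == b[0] and len(a) == len(b):
            work.extend(zip(a[1:], b[1:]))   # same symbol and arity: unify args
        else:
            return None                # symbol or arity clash

    def resolve(x, path):              # apply bindings fully, detecting cycles
        if isinstance(x, str):
            if x in path:
                raise OccursCheck(x)
            r = find(x)
            return x if r == x else resolve(r, path | {x})
        return (x[0],) + tuple(resolve(arg, path) for arg in x[1:])

    def variables(x):
        if isinstance(x, str):
            yield x
        else:
            for arg in x[1:]:
                yield from variables(arg)

    try:
        return {v: resolve(v, set())
                for v in set(variables(s)) | set(variables(t))}
    except OccursCheck:
        return None
```

For example, unifying f(X, g(Y)) with f(g(Z), X) binds X to g(Z) and Y to Z, while unifying X with f(X) fails the occurs check and returns None.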
With J.P. Keller we derived a new DFA minimization algorithm with asymptotically better space than Hopcroft's 1971 algorithm. This research also involved integrating program transformations in APTS with set-theoretic proof construction and checking in Keller's Aetna system. One conclusion drawn from this work is that the more rapidly APTS can produce high-performance C code, the more critical it becomes to equip APTS with the safety of machine-checked verification of the transformations involved in code production. Because we observed how difficult and painfully slow such formal proofs are to construct (a view shared by expert users of other systems), solving the formidable problem of making proof construction easier is essential to scaling up our technology in a practical way.
Besides the algorithms just mentioned, the new algorithms derived in journal articles by the investigator with Chang and with Bloom are candidates for scaled-up experiments.
The inference engine, the critical component of the APTS program transformation system that performs program analysis, has been submitted for journal publication. An independent application of APTS as part of a running geometric constraint solving system is reported in a journal article that has been accepted by CAD. A survey of our research, together with relevant open problems, was published in the Proceedings of the Atlantique Workshop.
Transitions and DOD Interactions.
Software and Hardware Prototypes.
Invited and Contributed Presentations.
Honors, Prizes or Awards Received.
Project Personnel Promotions Obtained.