Next: 6.3 Types Up: 6. Conclusions Previous: 6.1 Other Systems

6.2 Interoperability

By far the most important kind of interoperation between code written in different languages is that which occurs when the pieces of code are in fact entirely separate programs, linked only through some communication medium. This is especially true when processes are an abundant resource within easy reach, and is such a powerful and general modus operandi that I find it tends to remove most of the need for subroutine-call interoperability between high-level languages like SETL and lower-level languages like C.

This is largely because of the differences in data representation between languages of unequal level. These differences require data conversions, which introduce the potential for error, inefficiency, and clutter. The conversions must occur even if the interface is call-based rather than I/O-based. Conversely, if interfaces are kept narrow where practicable as a fundamental design principle, the high walls between processes are a welcome form of protection: memory corruption by a SETL program running under a correct interpreter is impossible, but no such guarantee can be made if an arbitrary C library is linked in. Furthermore, where a language split is already countenanced for the sake of access to some optimized low-level code, a process split allows the CPU-intensive computation to be moved easily to a different processor, perhaps one that is much faster than the user's workstation.
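As a minimal sketch of the I/O-based style, assuming a POSIX system, the following C routine treats another program as a separate process and reads one line of its output through a pipe. The name `ask_process_for_long` is hypothetical, and the only data conversion is the explicit parse of text into a number, which is exactly where the narrow interface sits.

```c
#define _POSIX_C_SOURCE 200809L  /* needed for popen/pclose under strict ISO C */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: run `cmd` as a separate process and parse the
   first line of its standard output as a long.
   Returns 0 on success, -1 on any failure. */
int ask_process_for_long(const char *cmd, long *result)
{
    char line[64];
    char *end;
    FILE *p;

    assert(cmd != NULL && result != NULL);

    p = popen(cmd, "r");            /* the other program runs in its own process */
    if (p == NULL)
        return -1;
    if (fgets(line, sizeof line, p) == NULL) {
        pclose(p);
        return -1;
    }
    if (pclose(p) == -1)
        return -1;

    /* The one data conversion: text from the pipe to a C long. */
    *result = strtol(line, &end, 10);
    return (end == line) ? -1 : 0;
}
```

Nothing the child process does can corrupt the caller's memory; the worst it can do is send text that fails to parse.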

However, there are occasions when it is genuinely useful to be able to use a predefined library of C, Fortran, or Ada routines directly from SETL, typically for graphics. Usually one does not really wish to write new code in the foreign language in such a case; otherwise one does well to write it in the implementation language of some SETL interpreter, and integrate it there. More likely is that one just wants to have an interface to the library in terms natural to SETL. The question then arises: how much effort is it worth investing in a "thick", SETL-oriented binding to this library, compared to the nearly mechanical generation of a "thin" binding which may require considerable accommodation of the library's needs by the SETL programmer? The answer will depend on how transient the need for that particular library is judged to be.
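To make the notion of a thin binding concrete, here is a sketch of the kind of stub a generator might emit for one library routine. Everything named here is an assumption for illustration: `hl_value` is a hypothetical tagged representation of a high-level value (not the representation used by any actual SETL implementation), and `lib_clamp` stands in for a routine from a foreign library.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical tagged representation of a high-level value. */
typedef enum { HL_OM, HL_INT, HL_REAL } hl_tag;  /* HL_OM plays the role of "undefined" */
typedef struct {
    hl_tag tag;
    union { long i; double r; } u;
} hl_value;

/* Stand-in for a routine from the foreign library. */
static int lib_clamp(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* A long can be passed as a C int only if it is in range. */
static int fits_int(long v)
{
    return v >= INT_MIN && v <= INT_MAX;
}

/* Thin stub: check arity, tags and ranges, convert each argument,
   call the library, and convert the result back.
   Any mismatch yields the "undefined" value. */
hl_value stub_lib_clamp(const hl_value *args, size_t nargs)
{
    hl_value out = { HL_OM, { 0 } };

    assert(args != NULL);
    if (nargs != 3)
        return out;
    for (size_t k = 0; k < nargs; k++)
        if (args[k].tag != HL_INT || !fits_int(args[k].u.i))
            return out;

    out.tag = HL_INT;
    out.u.i = lib_clamp((int)args[0].u.i, (int)args[1].u.i, (int)args[2].u.i);
    return out;
}
```

The stub does nothing except mediate between representations. A thick binding would go further and reshape the interface itself, for example by accepting a range as a single tuple rather than as two separate integers.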

In 1990, Jack Schwartz had an indefinitely transient need for the facilities of the Macintosh Toolbox, which even then contained over 1000 routines. SETL2 had been ported to the Mac, but multiple processes in the standard environment for that platform were not to be available for many years to come. Jack therefore enlisted my help in generating a thin SETL2 interface to the Toolbox. The SETL2 callout interface was exceedingly primitive (all calls had to be routed through a single routine), and Kirk Snyder staunchly refused to allow anyone else access to the SETL2 source code in those days. Partly for these reasons, but mainly because there were a great many parameter types, all needing conversions of one kind or another, the C part of this interface was quite bulky, running to some 16,000 lines of mechanically generated code. The SETL programs and other scripts I wrote to generate the interface benefited from the helpful redundancy of Pascal "header" files (source files containing declarations meant to be incorporated into user programs). The C header files did not by themselves convey enough information to distinguish value parameters from result or value-result parameters when an asterisk (indicating a pointer) appeared, but the appearance of var in the corresponding Pascal declaration supplied the needed discrimination in all but a few special cases. Nowadays, good discipline in C headers calls for the use of const to allow programmers to make this distinction at a glance, but this was not common practice in 1990, at least not among the authors of the Macintosh Toolbox.

This was the first of several SETL2 and later SETL interfaces I generated over the years. The Griffin group once even had me generate a SETL2 interface to the X graphics library. Eventually these generators led to what is now a reasonably civilized procedure for customizing my SETL interpreter with thin interfaces to libraries described by C headers. The only such interfaces I have personally found useful to date have been for graphics libraries, such as GLUT/Mesa, which implements a simple event-based windowing system together with an essentially complete realization of OpenGL.

The customization procedure will never be fully automatic, because not all the information pertaining to a library interface is contained in the C header files. Some decisions about the correspondence between C structures and SETL objects have to be made consciously. The goal of customization is principally to extend the SETL library, but except where a very thin interface is acceptable (which is rare), the goal is also to produce a SETL package containing "wrapper" routines and other definitions. The customizer's work culminates in a Makefile and associated scripts. These build files that fit into the structure of the SETL distribution package, so that inclusion of the customization can be selected at configuration time, before the SETL system is compiled and installed.

Recently, in a similar vein, a fairly powerful package called SWIG (Simplified Wrapper and Interface Generator) [24] has been developed by Dave Beazley for the purpose of generating interfaces between a number of languages and C/C++ functions (and variables), again based on the kind of C/C++ declarations found in header files. Currently, the scripting languages Tcl/Tk [184], Perl [155], and Python [201] are fully supported by SWIG, and there is partial support for Eiffel, Guile, and Java.

Of course, even the best customization procedure or ad hoc interface generator cannot ultimately be as good as a properly formalized foreign-language interface. The requisite sublanguage must be able to describe external entities precisely, and to specify how to translate between them and SETL entities. Since low-level languages often deal directly in representations of low-level scalar types and memory layouts, the sublanguage must accommodate these, and observe the restrictions that attend them.
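As a rough indication of what describing low-level scalar types and memory layouts involves, the following sketch pins down a record layout in C as far as that language allows. The record `wire_header` and its byte order are assumptions made up for the example. The compile-time assertions reject any platform on which the compiler's layout differs from the intended one, and the decoder spells out the byte order rather than relying on the host's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical external record: two 16-bit fields followed by a
   32-bit field, 8 bytes in all, stored least significant byte first. */
typedef struct {
    uint16_t kind;
    uint16_t flags;
    uint32_t length;
} wire_header;

/* C cannot prescribe a layout, but it can refuse to compile if the
   layout chosen by the compiler is not the one intended (C11). */
_Static_assert(sizeof(wire_header) == 8, "unexpected padding in wire_header");
_Static_assert(offsetof(wire_header, flags) == 2, "flags at wrong offset");
_Static_assert(offsetof(wire_header, length) == 4, "length at wrong offset");

/* Decode from raw bytes with an explicit little-endian byte order,
   so that the result does not depend on the host's endianness. */
wire_header decode_header(const unsigned char buf[8])
{
    wire_header h;

    assert(buf != NULL);
    h.kind   = (uint16_t)(buf[0] | (buf[1] << 8));
    h.flags  = (uint16_t)(buf[2] | (buf[3] << 8));
    h.length = (uint32_t)buf[4]
             | ((uint32_t)buf[5] << 8)
             | ((uint32_t)buf[6] << 16)
             | ((uint32_t)buf[7] << 24);
    return h;
}
```

Here the layout is checked after the fact instead of being specified, which is one symptom of the expressive gap that a properly formalized interface sublanguage would have to close.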

Of all the languages currently in use for systems programming, Ada 95 stands out as the only one able to express such specifications in a way that is simultaneously convenient, comprehensive, and precise. Accordingly, I would make the following proposal.

SETL is badly in need of a respectable implementation, yet it is a small and semantically straightforward language. I believe that the ideal way to write a formal specification for it would be to describe each of its syntactic constructs as expansions into Ada code, and to describe each of its run-time objects using Ada specifications. These two bodies of description obviously dovetail, and the "meta-rules" which are developed to discipline them will form a good basis for describing SETL extensions, including foreign-language interfaces.

The interfacing sublanguage should nominally fit into the style of SETL, but since Ada is well suited to this kind of descriptive role, the SETL forms ought to translate rather directly into Ada.

In fact, it is reasonable to contemplate an "in-line Ada" construct for SETL if Ada is to play such a central definitional role. I have some relevant experience in this regard, as a system I built several years ago called SETL/C++ was a successful though not entirely satisfactory implementation of SETL which used C++ to describe all SETL objects. It had an "in-line C++" feature, which worked perfectly well but was somewhat hard on the eyes. More unsatisfactory was the relative weakness of C++ as a specification language, though I was ultimately able to make the templates etc. do my bidding. But in practical terms, the greatest obstacle to having SETL/C++ take over from the well-worn, C-coded, interpreter-based SETL implementation I still use was the unreliability of C++ compilers, a problem that continues to this day. Ada, on the other hand, is probably the best interface specification language in existence, and GNAT [1] is much more robust than any C++ compiler currently available, so perhaps it is time to let Ada repay SETL for Ada/Ed by using Ada/GNAT to specify and implement the world's first truly robust SETL system.


David Bacon
1999-12-10