Next: 1. Introduction Up: SETL for Internet Data Processing Previous: Acknowledgments

Preface

Colin Broughton, a colleague in Edmonton, Canada, first made me aware of SETL in 1980, when he saw the heavy use I was making of associative tables in SPITBOL for data processing in a protein X-ray crystallography laboratory.

Accordingly, he loaned me B.J. Mailloux's copy of A SETLB Primer by Henry Mullish and Max Goldstein (1973). I must have spoken often of this language, because when another colleague, Mark Israel, visited the University of British Columbia two or three years later, and came across a tutorial entitled The SETL Programming Language by Robert Dewar (1979), he photocopied it in its entirety for me.

The syntactic treatment of maps in SETL places its expressive balance closer to algebraic mathematics than is customary for programming languages, and I immediately started finding SETL helpful as a notation for planning out the more difficult parts of programs destined to be coded in SPITBOL, Algol 68, Fortran, or Assembler/360.

When I took my M. Sc. in 1984-5, my enthusiasm about SETL was such that I made a presentation about it to my Advanced Programming Languages class. Later, when I was invited to stay on for my Ph.D. at the University of Toronto, I declined specifically because no one there was willing to supervise work on SETL, and I for my part did not want to get caught up in either the logic programming or the (really very good) systems programming tradition prevailing there at that time.

So it was that in 1987, not having access to any good SETL implementation (the CIMS version seemed to crash on the first garbage collection attempt on the mainframe I was using), I decided to dash one off in SPITBOL. It was a good learning exercise. The compiler seemed to work correctly to the extent it was tested, and produced runnable SPITBOL code, but never saw much practical application on the hardware of the day.

In December of 1988, I decided to write a production-grade SETL compiler in C, and thus began an implementation that I use to this day and continue to extend. From the start, I found maps useful in the combinatorial programming I was doing in molecular modeling. The implementation is packaged in such a way that its most common invocation mode ("compile and go") is via the shell-level command setl, making SETL programs easy to fit into Unix filter pipelines, just like other scripts or pre-compiled programs.

Reasonably convenient though it was to be able to set up arrangements of communicating SETL programs in this way, the languages used for interconnecting the various programs, such as the Bourne and C shells, were still nothing more than thin descendants of the job control languages of yore. Since SETL was already so competent at handling data, it was only natural to extend it with facilities for process creation and communication. This led to the one-shot filter, unidirectional pipe, and bidirectional pump models for communication described in Chapter 2 [Environmentally Friendly I/O] of this dissertation.

The fact that a writea to a reader's reada can as easily transfer a large map as a small integer, combined with the fact that the Unix command rsh can be used to launch tasks on remote processors and communicate with them, made this a very comfortable environment for distributed programming, although it could not express a TCP/IP (TCP or UDP) server or client directly.
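
As a rough illustration of the bidirectional pump idea, the sketch below uses Python rather than SETL. It launches a child process and exchanges whole values with it over the child's standard input and output. The names pump_round_trip and CHILD_SOURCE are hypothetical, and JSON lines are an assumption standing in for the textual value format that writea and reada use; nothing here is taken from the SETL implementation itself.

```python
import json
import subprocess
import sys

# Hypothetical child program: it reads one JSON value per line from
# standard input and echoes each value back on standard output.
CHILD_SOURCE = r"""
import json, sys
for line in sys.stdin:
    value = json.loads(line)
    print(json.dumps(value), flush=True)
"""


def pump_round_trip(value):
    """Send one value to a freshly launched child and read one value back."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    try:
        # One value per line: an integer and a nested map travel the same way.
        child.stdin.write(json.dumps(value) + "\n")
        child.stdin.flush()
        reply = json.loads(child.stdout.readline())
    finally:
        child.stdin.close()   # end of input lets the child's loop finish
        child.wait()
        child.stdout.close()
    return reply


if __name__ == "__main__":
    print(pump_round_trip(7))
    print(pump_round_trip({"residue": "GLY", "atoms": [1, 2, 3]}))
```

The caller never distinguishes between small and large values; the same two calls (one write, one read) carry either.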

In 1994, the World Wide Web "arrived", and it became clear that TCP/IP was to become firmly established as a global standard: the MIME-conveying HTTP protocol rides upon TCP streams, and the namespace-structuring Uniform Resource Locator (URL) convention uses host names that map to IP addresses through the auspices of the widely used Domain Name System (DNS). I therefore decided to build support for TCP/IP directly into the SETL I/O system, such that opening and using a bidirectional TCP communications stream in SETL would be as easy and natural as opening and using a file. How this is done is detailed in Chapter 3 [Internet Sockets], which also describes SETL's programmer-friendly support for UDP datagrams.
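
The goal of treating a TCP stream like a file can be suggested, by analogy only, with Python's standard socket module, whose makefile method wraps a connected socket in ordinary file objects. This is not SETL's interface; the helper names start_echo_server and ask are hypothetical, and the upper-casing server exists only so the sketch is self-contained.

```python
import socket
import threading


def start_echo_server():
    """Start a one-connection server on a free local port; return the port."""
    listener = socket.create_server(("127.0.0.1", 0))
    port = listener.getsockname()[1]

    def serve_one():
        conn, _ = listener.accept()
        # Separate reader and writer file objects over the same connection.
        with conn, conn.makefile("r", encoding="utf-8") as reader, \
                conn.makefile("w", encoding="utf-8") as writer:
            for line in reader:          # reads lines exactly as from a file
                writer.write(line.upper())
                writer.flush()
        listener.close()

    threading.Thread(target=serve_one, daemon=True).start()
    return port


def ask(port, text):
    """Open a TCP stream, write one line, and read one line back."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        with sock.makefile("r", encoding="utf-8") as reader, \
                sock.makefile("w", encoding="utf-8") as writer:
            writer.write(text + "\n")
            writer.flush()
            return reader.readline().rstrip("\n")


if __name__ == "__main__":
    print(ask(start_echo_server(), "hello"))
```

Once the connection is wrapped, the client code uses only write, flush, and readline, the same operations it would use on a file.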

The ability to code servers in SETL has proven to be even more useful than I predicted. Servers act as the primary objects in server hierarchies. They bear state and control access to that state through message-passing protocols with child processes which in turn deal with clients. A server tends to keep track of such children with a dynamically varying map. Between the server and these trusted, proximal children, communication is safe and quick, minimizing the risk of the server becoming a bottleneck. The WEBeye study of Chapter 4 [WEBeye: A Case Study] illustrates this pattern.
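
The pattern of a state-bearing server that tracks its children in a dynamically varying map might be sketched as follows, again in Python rather than SETL. Everything named here (Server, spawn_child, handle_client_request, reap, the "GET" message format, and the hits counter) is hypothetical, and the client is simulated by feeding a request line to a child; WEBeye's actual protocols are described in Chapter 4 and are not reproduced here.

```python
import subprocess
import sys

# Hypothetical child: it deals with a (simulated) client by turning each
# request line into a message for the server on its standard output.
CHILD_SOURCE = r"""
import sys
for line in sys.stdin:
    print("GET " + line.strip(), flush=True)
"""


class Server:
    """Owns the state; children reach it only through messages."""

    def __init__(self):
        self.state = {"hits": 0}
        self.children = {}  # map from process id to child process

    def spawn_child(self):
        child = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
        )
        self.children[child.pid] = child
        return child.pid

    def handle_client_request(self, pid, request):
        """Pass a simulated client request to a child and act on its message."""
        child = self.children[pid]
        child.stdin.write(request + "\n")
        child.stdin.flush()
        message = child.stdout.readline().strip()
        self.state["hits"] += 1  # only the server ever touches the state
        return message

    def reap(self, pid):
        """Shut a child down and drop it from the map."""
        child = self.children.pop(pid)
        child.stdin.close()
        child.wait()
        child.stdout.close()


if __name__ == "__main__":
    server = Server()
    pid = server.spawn_child()
    print(server.handle_client_request(pid, "camera"))
    server.reap(pid)
```

The map grows and shrinks as children come and go, while the state itself stays in one process.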

The liberal use of processes turns out to be beneficial time and again. Real work tends to fall to simple modules which communicate in a primitive way through their standard input and output channels, and these modules can easily be written in more efficiency-oriented languages than SETL where necessary. Small components are also easy to isolate for special or unusual testing, or for those rare but inevitable episodes called debugging.

Overall, systems designed as process-intensive server hierarchies tend to acquire a satisfying dataflow feel. Not only is this in the spirit of Unix filters, it also dovetails with SETL's value semantics, which abhor a pointer and cherish a copy, and in so doing avoid the hazard of distributed dangling references.


David Bacon
1999-12-10