Next: 1.2 A Brief History of SETL Up: 1. Introduction Previous: 1. Introduction

  
1.1 Why SETL?

First of all, SETL strives to put the needs of the programmer ahead of those of the machine, as is reflected in its automatic memory management, in the fact that flexible structures can be employed as easily as size-constrained ones, and in its concise and natural syntax for powerful built-in datatypes. This high-level nature makes SETL a pleasure to use, and has long been appreciated outside the world of distributed data processing. Flexibility does not itself ensure good discipline, but is highly desirable for rapid prototyping. Rapid prototyping fills an important need, because experimentation is a crucial early phase in the evolution of most large software systems, especially those featuring novel designs [135,173,66,70,71].

Second, SETL's strong bias in favor of ``value semantics'' facilitates the distribution of work and responsibility over multiple processes in the client-server setting. The absence of pointers eliminates a major nuisance in distributed systems design, namely the question of how to copy data structures which contain pointers. SETL realizes Hoare's ideal of programming without pointers [114].

Third, every SETL object, except for atoms and procedure values, can be converted to a string and back (with some slight loss of precision in the case of floating-point values), and indeed will be so converted when a sender's writea call is matched by a receiver's reada. SETL programs are therefore little inconvenienced by process boundaries, while they enjoy the mutual protections attending private memories. Maps and tuples can represent all kinds of data structures in an immediate if undisciplined way. The syntactic extension presented in Section 2.15 [Field Selection Syntax for Maps] allows record-style field selection on suitably domain-restricted maps to be made with a familiar dot notation; it further abets the direct use of maps as objects in programs, complementing the ease with which they can be transmitted between programs. A similar freedom of notation exists in JavaScript, where associative arrays are identified with ``properties'' [152].
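The round trip through a string can be illustrated by analogy in Python. This is only a sketch of the idea, not SETL and not the behavior of writea and reada: the helper names to_text and from_text are hypothetical, and Python's repr and ast.literal_eval stand in for the conversions that SETL performs implicitly.

```python
import ast

def to_text(value):
    """Render a value built from maps, lists, sets, strings and numbers as text."""
    return repr(value)

def from_text(text):
    """Rebuild an equal value from its textual form, without evaluating code."""
    return ast.literal_eval(text)

# A map holding a string, a list and a set, as it might be sent to another process.
record = {"name": "sensor-7", "readings": [1, 2, 3], "tags": {"a", "b"}}

wire = to_text(record)    # what would travel across the process boundary
copy = from_text(wire)    # what the receiver reconstructs in its own memory
```

The receiver's copy is equal to the sender's value but shares no storage with it, which is the sense in which the two processes keep private memories.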

Fourth, strings themselves are first-class objects in SETL. They are completely flexible, such that assigning to a substring can change the length of the hosting string, just as a tuple can change length through subtuple assignment. Strings have a rich set of built-in operations for searching and manipulation based on algebraically formulated patterns. Further extensions to allow selections and substitutions to be specified by regular expressions in slicing notations and other forms are described in Section 2.14 [Strings]. Because strings are at the heart of data processing, it is vital to support them well, and SETL does.
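The effect of length-changing slice assignment can be sketched in Python, again only as an analogy. Python's own str type is immutable, so the sketch assumes a bytearray in place of a SETL string and a list in place of a SETL tuple.

```python
# A mutable byte string stands in for a SETL string.
text = bytearray(b"hello world")
text[0:5] = b"goodbye"       # the replacement is longer than the slice it replaces
grown_length = len(text)     # the hosting string has grown from 11 to 13 bytes

# A list stands in for a SETL tuple.
items = [1, 2, 3, 4]
items[1:3] = [9]             # two elements are replaced by one, so the list shrinks
```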

Fifth, SETL's skill with general finite maps has welcome consequences for the data processing practitioner. At an abstract level, a data processing system can be viewed as a dynamic graph along whose arcs messages pass. The Khoros [134] system makes this abstract view concrete in its visual programming language, Cantata. The data in messages undergoes transformation and recombination in processing nodes, where maps represent data relationships directly. Processes of extraction tend to gather data around keys (often strings) that identify categories. Processes of association discovery correspond to set intersection, and merging is closely related to set union. The sets themselves are most often domains or ranges of maps, or sometimes projections of sets of more general tuples. Maps are so very much at the heart of SETL style that Dewar, in his 1979 book The SETL Programming Language [53], wrote:

``The general rule in SETL is to use maps wherever possible. This may take some practice, especially if you are used to programming in some other language, but remember this simple principle: find the maps, they are always there!''

This is a principle that works well in practice. For example, if a process P has to multiplex input streams from several other processes, a map M over the corresponding I/O handles in P will often be used to track P's state of knowledge about those processes, and the domain of M will be in the set of input handles that is passed to the select primitive (see Section 2.5 [Multiplexing with Select]) when P waits nondeterministically for I/O events.
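The multiplexing pattern just described can be sketched with Python's select module, which wraps the same Posix primitive. This is an analogy under stated assumptions rather than SETL code: the function drain_ready and the fields name, bytes_seen and last are hypothetical, and local socket pairs stand in for the peer processes.

```python
import select
import socket

def drain_ready(state, timeout=1.0):
    """Wait for input on any handle in the map's domain and record what arrives."""
    # The domain of the map (its keys) is the set of handles passed to select.
    readable, _, _ = select.select(list(state), [], [], timeout)
    for handle in readable:
        data = handle.recv(4096)
        state[handle]["bytes_seen"] += len(data)
        state[handle]["last"] = data
    return readable

# Two socket pairs stand in for connections to two other processes.
a_ours, a_peer = socket.socketpair()
b_ours, b_peer = socket.socketpair()

# The map M: one entry per input handle, holding what is known about that peer.
state = {
    a_ours: {"name": "A", "bytes_seen": 0, "last": b""},
    b_ours: {"name": "B", "bytes_seen": 0, "last": b""},
}

b_peer.sendall(b"hello")     # only peer B has something to say
ready = drain_ready(state)   # returns the handles on which input was read

for s in (a_ours, a_peer, b_ours, b_peer):
    s.close()
```

Keeping the per-peer knowledge in a map keyed by handle means that adding or dropping a peer is a single map update, and the set handed to select never drifts out of step with the bookkeeping.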

Sixth, the absence of restrictions that are unhelpful to programmers brings with it a substantial measure of orthogonality and robustness. Orthogonality promotes the use of feature combinations that make sense, which is conducive to directness of expression. The absence of size restrictions similarly helps to eliminate clutter. For example, when programmers do not have to write extra code to deal with the fact that messages embedded in a TCP stream can be of arbitrary length, the most general case is handled gracefully and effortlessly. The importance of this for publicly exposed network servers cannot be overemphasized. If the only thing the most assiduous attack can do is bring down a subprocess S due to overall exhaustion of resources that are allocated to S, and S is the hostile client's only interface to the public service, then the damage is easily isolated. It does not even need to be damage per se, but can be naturally handled like an ordinary transaction that aborts when it cannot commit.

The remarkable adaptability of SETL and its gift for concise expression over a wide range of programming problems stem from its close connection to the foundations of mathematics. Set formers, modeled after set comprehensions, are a splendid case in point. They are highly accessible little pictures which encourage the programmer to take a dual view of sets as entities that can be characterized by predicates or constructed from parts. Tuple displays also exemplify SETL's directness of expression. In ``fetch'' contexts, they are enumerative denotations much like the written form of LISP lists, and in ``store'' positions, they show immediately the pattern of a required structure. Finally, SETL has freely borrowed the best ideas of other programming languages, such as the Algol family, APL [120,169], and SNOBOL [99], as well as adding a few of its own.
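The dual view of sets, and the use of a tuple display as a pattern, can be sketched in Python, whose set comprehensions and unpacking assignments are loosely comparable. The variable names and sample values below are illustrative assumptions, and the notation is Python's rather than SETL's.

```python
numbers = set(range(1, 21))

# Predicate view: the set is characterized by a condition on its members.
even_squares = {n * n for n in numbers if n % 2 == 0 and n <= 6}

# Constructive view: the same set is assembled from parts.
built = {4, 16} | {36}

# In a "store" position, the bracketed form shows the shape of the expected structure.
[name, [host, port]] = ["db", ["example.org", 5432]]
```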

The main significance of all the foregoing attributes of SETL for data processing over the Internet is that they pave the way for small programs. A small SETL program can do a lot, is not constrained by the usual obstacles to the communication of complex or pointer-bearing objects between processes, and is a well isolated module. Shared variables, the plague of concurrent programs, are simply not in the language, and are rarely missed: sharing resources is a serious matter that usually calls for a management mechanism that is best encapsulated in a module anyway. Namespace is adequately structured by a file system or URL-like convention for most purposes, but access to the objects in that space is best mediated by an appropriately synchronizing small process.

This raises the important issue of the data processing environment: an insular language may be admired, but it cannot scream. I have been fortunate in choosing to adopt the Posix [118,119,117] standards as a design benchmark for the SETL interface to files, commands, users, processes, and network communication. This operating system model gained wide acceptance among vendors in the 1990s, and is now embodied in the X/Open specification commonly known as Unix 98 [154]. Chapters 2 [Environmentally Friendly I/O] and 3 [Internet Sockets] are largely devoted to a presentation of those features in my current definition of SETL that employ and build upon this model.


David Bacon
1999-12-10