5.3 Guidelines

Let us now shift our attention from the responsibilities of the language designer and implementer to those of the programmer.

5.3.1 On Checking

Practically any data processing system will be forced to deal with some environment of ``foreign'' input data. It happens again and again that programmers, armed with tools better suited to systems programming than to high-level applications, will, in the face of deadlines, inexperience, and negligent supervisors, take shortcuts in the coding of input routines, and allocate fixed-size input buffers even in situations where they know they shouldn't. This kind of bug remains dormant until some attacker or innocent button-pusher awakens it with a long input record, and when it finally bites by nibbling at some memory it isn't supposed to, it can be very difficult to track down. From this we learn the rule:

$\bullet$ No unchecked restrictions.

Actually, this should be refined a little, because overflows occur in many forms, and some of them are quite innocent. The correct advice is to be aware of when overflows are possible, and to make sure that their effects are understood (and not disastrous).

This rule is particularly relevant for distributed data processing, where programs tend to have greater exposure to malicious or clumsy adversaries than subroutines in relatively protected environments have. Stevens, in his famous introductory text on network programming, writes [194, p. 15]:

It is remarkable how many network break-ins have occurred by a hacker sending data to cause a server's call to sprintf to overflow its buffer. Other functions that we should be careful with are gets, strcat, and strcpy, normally calling fgets, strncat, and strncpy instead.
The distinction he is making here is between ANSI C routines that respectively do not and do provide a way for the programmer to limit the amount of data written into a given memory area.

  
5.3.2 On Limits

Often the best way to guard against the ill effects of a memory or arithmetic overflow is to make sure the overflow can't happen at all. If the restriction isn't really necessary, perhaps it is worth removing, giving us the closely related rule:

$\bullet$ No silly restrictions.
This is entirely germane to the input buffer example, because the scrupulous programmer will either put in appropriate checks, or use some form of dynamic allocation to make sure there is always a big enough buffer if there is a buffer at all.

The truly wise programmer will have this taken care of automatically by using a language like SETL, which actually makes it more convenient not to have an input size restriction than to impose one. For example, if the input is organized into some kind of ``lines'' as defined by local file system and operating system conventions, the statement

line := getline fd;
assigns a single line (or om, at the end of file) to line no matter whether the line contains 0 characters or a billion. Its only alternative is to raise an exception due to insufficient virtual memory. The silent disasters of overflowing a buffer or yielding only part of the input line simply are not options, by the definition of getline.

Again, this is a rule which makes most sense at the highest semantic levels. At lower levels, closer to the hardware, some restrictions are unavoidable, and need to be properly checked. At the SETL level, it is usually most appropriate simply to pretend that no restriction exists, as in this example. This is really saying that exceeding the restriction is a rare resource exhaustion event that should probably be treated along the lines of running out of hard disk space--crashing the program with a diagnostic message may be a reasonable response, especially if the program is a small and well-isolated module whose main purpose is to deal with external clients, and its parent is a SETL program that sees the crash merely as an end-of-file condition on a file descriptor (see Section 2.13 [Normal and Abnormal Endings]).

Language designers obey this rule when they pay allegiance to the principle of orthogonality. Some of my own extensions to SETL were made in this spirit. For example, any expression may validly initialize a const, general loop headers may appear within set and tuple formers, and for and while clauses can appear within a single loop header. Restrictions on such things as the length of identifiers have no place in a modern language design, of course.

Language implementers also do well to avoid things like fixed-size tables and incommodious integers wherever there is a risk of unnecessarily restricting program size, the number of symbols or procedures, and so on.

  
5.3.3 On the Unexpected

Every programmer makes mistakes, networks and computers crash, file systems overflow, resources of every description reach the point of exhaustion sometimes, and clients present a myriad of surprises, so:

$\bullet$ Expect the unexpected.

Except in safety-critical systems, which require a heavy investment in equipment and an entirely different approach to software design than what is appropriate for data processing (essentially to guarantee that every need is always covered by a working component, and that resource exhaustion absolutely cannot occur), the best way to maximize reliability in a large distributed system is to layer it, with each module doing local checks and fielding the failures of lower-level components.

When a check fails, a module's path of least resistance is usually to throw up its hands and fail completely but recognizably. The module at the next higher level, usually a parent process in a program hierarchy, should always be prepared to deal with such failure, if for no other reason than that there are so many ways in which the child process (say) can fail. Design economy is achieved by routing various types of failure through a common handling mechanism. In the ``Box'' pattern embodied in WEBeye, this is done by having parent processes connected to child processes through pipe or pump streams. Failure or natural termination of a child process is made known to the parent through an end-of-file condition. The parent generally knows or doesn't care whether the termination was expected. Serious errors of the kind which ``crash'' the subprocess and evoke a complaint from a language processor's run-time system are also logged.

Note that the parent's responsibility here is usually to check for an end-of-file condition on each input operation, and take appropriate action, which may simply be to fail in turn. Even the end-of-file check can sometimes be elided if the parent is a SETL program that is content to crash immediately upon trying to use the om value ensuing from a failed read, though this is not very good programming style.
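For concreteness, here is a minimal sketch of such a check, assuming child_fd is a pipe or pump stream already opened to the child (the names are illustrative):

line := getline child_fd;          -- read one whole line from the child
if line = om then                  -- end of file: the child exited or crashed
  close (child_fd);                -- release the stream; close waits for the child
  -- recover here, or simply fail in turn and let the next level up cope
else
  -- process the line normally
end if;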

Crashing on an unexpected om is not that far, however, from the liberal use of the assert statement, which is to be recommended highly even though its only purpose is to abend the program in the ostensibly impossible case that the assertion fails.
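For example, a routine that ``knows'' some local invariant holds can still record that knowledge cheaply (n here is just an illustrative variable):

assert n >= 0;     -- ostensibly impossible to fail; abends the program if it ever does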

Component failures can obviously lead to a cascade of failures, moving up the chain of responsibility. Child processes need to be aware that their parents can fail, and again the path of least resistance is usually for the child to exit when it sees an end-of-file condition on the communication channel with the parent. Sometimes a process will want to do some ``cleanup'' housekeeping before actually exiting. The exit_gracefully routine in the vc-exit.setl file (Section A.15 [vc-exit.setl]) that is textually included by many components of WEBeye is a fairly extreme example of this, as it propagates SIGTERM signals to subprocesses in an effort to give them all a chance to shut down cleanly, but ends up issuing the irresistible SIGKILL to any that do not respond to the SIGTERM. Conversely, processes in WEBeye that have children strive to remain receptive to SIGTERM at all times, as is perhaps best illustrated by the use of a routine called select_or_exit_on_sigterm throughout vc-seq.setl (Section A.39 [vc-seq.setl]).

At some level, in a good design, a cascade of failures will reach a module which attempts some form of recovery. The main advantage of not being overzealous in riddling lower-level modules with recovery code relates to the variety of possible failure causes in real systems: unless the failure has a very specific and immediately recoverable cause, the best chance for a ``clean slate'' upon which to bring the failed part of the system back up will be engendered by clearing out the failed incarnation as completely as possible. This is especially true if the root cause of the failure was resource exhaustion, one of the most unpredictable and problematic failure modes--the very act of removing a large subtree of processes in such circumstances may be the most important part of the recovery itself, as it frees up a large quantity of vital resources.

  
5.3.4 On Clients

The sequence of server examples presented in Sections 3.2 [Concurrent Servers] and 3.3 [Defensive Servers] implicitly suggested the rule:

$\bullet$ Never trust clients.

Often it will be the case that a child process appointed by a server to deal with a client will have numerous responsibilities, and protection against denial-of-service attacks through resource exhaustion can simply be part of the child's natural propensity for crashing on conditions it cannot handle.

But sometimes the child process is interposed purely for protection at the communications level. Let us now examine a couple of general-purpose child processes that can be used by any server for safe line-by-line communication with any client.

The model, as always, is that it doesn't matter if the child process crashes, but it matters very much if the server crashes or is blocked in an I/O operation other than its main select. Hence the server should not even try to read something as seemingly innocuous as a single line directly from a client, because the client could send part of the line and then pause indefinitely. Similarly, the server should not send a line directly to a client, because the client could absorb part of it and then block.

To handle the input side of a connection to a newly accepted client having file descriptor fd, the server can use the following trivial program, which merely copies lines from stdin to stdout, flushing stdout after each one:

tie (stdin, stdout);                       -- auto-flush stdout on each read from stdin
while (line := getline stdin) /= om loop   -- read line or EOF indication
  putline (stdout, line);                  -- write line (and auto-flush)
end loop;                                  -- loop ends when EOF reached
If this program is named line-pipe.setl, the server can start it as an intermediary between itself and the client as follows:
fd_in := open ('exec setl line-pipe.setl <&' + str fd, 'pipe-in');
Notice that the shell has been used to redirect input from fd into the child process's stdin. Now whenever this child receives a whole line from the client, it writes it out to its own stdout, which is connected by a pipe to the parent's fd_in. Whenever the parent (the server) is ready to receive input from the child, it can read the whole line at high speed through this pipe. Typically, fd_in will be one of many file descriptors in a set passed to select, and the child's attempt to write the line to its parent will cause select to wake up and return fd_in in a set of ``ready for reading'' file descriptors.
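Concretely, the parent's side of this arrangement might be sketched as follows (ready and fds are illustrative names; fd_in is assumed to be a member of fds):

[ready] := select (fds);                   -- sleep until some stream has input
if fd_in in ready then
  line := getline fd_in;                   -- a whole line arrives at pipe speed
  if line = om then
    -- line-pipe.setl has terminated, usually because the client closed the connection
  else
    -- act on one complete request line from this client
  end if;
end if;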

On the output side, matters are not so simple. One might reasonably expect that the same little program could be started by the parent using

fd_out := open ('exec setl line-pipe.setl >&' + str fd, 'pipe-out');
since if the parent waits for fd_out to become ``ready for writing'' before sending it a line, then it can do so in the certain knowledge that the child will accept the whole line at high speed, no matter how long it takes for the child to send that line along to a slow (or even indefinitely blocking) client.

But here we run into an annoying fact about Unix pipes (a fact which is usually quite welcome from the performance point of view): that they can be filled ``to capacity'' by a sender in advance of the receiver being ready to receive even one byte. The result is that from the point of view of senders, receivers appear to be ready to receive before they really are. In the present case, this means that line-pipe.setl may appear to be ready to receive a line of data from its parent when in fact it is in the middle of doling a previous line out to an arbitrarily slow client.

A solution to this problem is to use a sending child process which gives its parent an explicit indication when it is truly ready to receive a line:

fd := open (val command_line(1), 'w');     -- get fd from command line
tie (stdin, stdout);                       -- auto-flush stdout on each read from stdin
while (line := getline stdin) /= om loop   -- read line or EOF indication
  putline (fd, line);                      -- write line to remote client
  flush (fd);                              -- flush client output buffer
  putline (stdout, '');                    -- tell parent we're ready for more
end loop;                                  -- loop ends when EOF reached
If this program is named line-pump.setl, the parent can invoke it thus:
fd_out := open ('exec setl line-pump.setl -- ' + str fd, 'pump');
Notice that fd_out here is actually a bidirectional file descriptor, and that fd is not redirected by the shell but instead identified on the child's invocation command line and inherited. The parent must now wait for fd_out to become ready for reading and clear that indication by reading the empty line from fd_out before sending any line (on fd_out) after the first one. (An obvious slight variation on this parent-child protocol is to have the child send the parent an empty line initially as well, so that every write by the parent, including the first, has to be preceded by absorption of the empty ``clear to send'' line.)
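In the parent, the bookkeeping for this protocol can be sketched roughly as follows, using the variation just mentioned in which the child announces readiness before every line including the first (ready, clear_to_send, and out_queue are illustrative names):

if fd_out in ready then                    -- select reported fd_out readable
  ack := getline fd_out;                   -- absorb the empty ``clear to send'' line
  if ack = om then
    -- the child has terminated, probably on a SIGPIPE after the client vanished
  else
    clear_to_send := true;
  end if;
end if;
if clear_to_send and out_queue /= [] then
  putline (fd_out, out_queue(1));          -- send exactly one queued line
  flush (fd_out);
  out_queue := out_queue(2..);             -- dequeue it
  clear_to_send := false;                  -- and wait for the next empty line
end if;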

Once the parent has opened both fd_in and fd_out, it is free to close (fd). The actual network connection will not be closed until both child processes have released it, which a cursory inspection of line-pipe.setl and line-pump.setl shows will not happen until they both terminate.

When a process tries to write through a pipe or pump stream to a process that has terminated, a PIPE signal is sent to the would-be writer. This signal can also be generated by attempted output to a TCP connection that has been closed by the peer, though the semantics are a little more complex (the error indication will only be generated after the second low-level write to such a stream). Normally, SIGPIPE causes silent termination of the process, though this behavior can be overridden in the usual way through a call such as

open ('SIGPIPE', 'ignore');
The process which does this should also normally check for errors on all output operations to pipes, pumps, and network connections:
clear_error;
putline (fd, ...);
flush (fd);
if last_error /= no_error then
  -- output error has occurred
  ...
end if;
Clearly, this is a messy business, and best avoided.

Another advantage of using line-pump.setl, in addition to endowing servers with a simple output flow control mechanism, is that it assists servers in just such avoidance--it eliminates their need to consider PIPE signals explicitly. If the child process goes down on a SIGPIPE, the parent merely receives an end-of-file indication from the child. The situation we have here is that the parent (server) always waits for the child to declare its readiness to receive a whole line. The child will not at that point be trying to write to an adversarial client, so it will not itself cause a SIGPIPE to be sent to the parent.

Let us now complete this picture. The parent that is communicating with a client through the two processes just reviewed should check for an end-of-file condition on fd_in whenever it becomes ``ready for reading''. The same is true for fd_out, which is bidirectional despite its name. This can be done in the usual way, such as by checking for an om return from getline or by interrogating eof after a geta. An end-of-file from fd_in will usually mean that the client (or its host, or the network) has closed the connection. An end-of-file from fd_out is less likely to be normal behavior, but essentially means that a dropped connection has led to the line-pump.setl child terminating on a SIGPIPE. In both cases, the parent can finish by executing the following code:

kill (pid (fd_in));    -- send SIGTERM to input subprocess
kill (pid (fd_out));   -- send SIGTERM to output subprocess
close (fd_in);
close (fd_out);
One or both of the kill calls will be redundant but harmless here. The case where a kill is not redundant is where the client has left one side of the connection (input or output) open but blocking. The kill makes sure that the child process is not trying to complete an input or output operation while the parent is left waiting for it to exit--close on a pipe or pump stream involves a low-level wait, which can be indefinite if the child is blocked.

  
5.3.5 On Aliases

Memory management, even when low-level allocation is hidden from view, is always an issue: one of the decisions a programmer repeatedly has to make is whether to copy or merely to reference. Most languages make it easier to reference than to copy. This is only natural considering that languages have traditionally been designed from the machine upwards, because it usually takes less CPU time to copy a pointer than to copy the data it points to. But the effect at the application level is that programmers tend to code for copying only if it seems necessary.

And as a result, they all too often produce ``pointer spaghetti'' which ultimately leads to bugs in which aliases are mistaken for unique pointers, blocks are deallocated prematurely, and pointers fail to get updated when their referents are moved.

SETL, on the other hand, encourages copying with its so-called ``value semantics''. There is no way to create an alias in SETL in the usual sense of more than one variable referring to a single object, nor are there pointers per se in SETL. Assignment, including both directions of parameter passing, is defined as a full copy of the object regardless of whether it is a simple number or a vast and complex map. (Of course, implementations are free to optimize out the actual copying, using, for example, a copy-upon-change regime.) To emphasize this orientation, I usually speak not of SETL objects, but of SETL values, except where it is actually necessary to distinguish between a value and its machine representation. The closest thing to a pointer in SETL is a value that serves as a ``key'' or domain element in one or more maps. Each map plays the role of a memory, and the map's name has to be mentioned on every fetch or store. Because a value of any type can be a key, maps are fully associative memories with unbounded address (key) spaces. Where the key space is naturally a dense set of small positive integers, a tuple can serve as the map/memory.
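As a tiny illustration of maps playing the role of memories, here is a binary tree kept entirely without pointers (the node names and map names are invented for the example):

left  := {['root', 'a']};                          -- left-child map
right := {['root', 'b'], ['a', 'c']};              -- right-child map
value := {['root', 10], ['a', 7], ['b', 3], ['c', 1]};
print (value(right(left('root'))));                -- value at node 'c', namely 1
Every fetch names the map being consulted, so nothing is referred to anonymously, and rebuilding or copying value can never leave a dangling reference behind.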

In short, SETL pushes programmers gently but firmly in the direction of the salutary rule:

$\bullet$ No unnecessary aliases.
And when aliases are necessary, SETL insists that they be mediated by specific, named maps.

The effect on programs of this bias is far-reaching, and likely to be somewhat discomfiting to people with a LISP background. It is a major paradigm shift. Whereas LISP focuses on the map element (the ordered pair, or ``cons'' cell), SETL treats the whole map, and does so with considerable regard for human syntactic needs. This represents a significant elevation in the semantic level, which is perhaps not surprising--pure LISP is, after all, nothing more than a machine language for a tiny recursive interpreter.

This ``anti-alias'' recommendation, however, though I believe it to be highly appropriate for virtually all data processing programs, is not always good for systems programming. It is hard to imagine a tree manipulation package or operating system kernel written in C without pointers or Ada without access types.

5.3.6 On Accessibility

Avoiding bottlenecks, providing helpful redundancy in the form of double-checks and assertions, and following the very sound rules about modularization, abstraction, and even style that have emerged from the young science of software engineering are all just as valid for applications programming and systems programming as they are for data processing. But there is another, much humbler rule which is particularly worth following in the specific context of data processing:

$\bullet$ No unprintable data.

In other words, all data that passes between data processing programs should be represented in a form that will be displayed by the most basic tools such as text editors and printers in ``natural'' denotations, unless there is some compelling reason for not doing so. There was, a long time ago, some justification for ``binary'' formats, which can save CPU time, disk space, and communication bandwidth, but as of well before the 1990s, these are trivial, inconsequential benefits at the data processing level when weighed against the inconvenience of data that can only be viewed through special filters. Of course, wherever formats are predefined, this rule cannot necessarily be followed. And insofar as browsers are now basic tools for displaying data, the rule does not necessarily mean that everything should be constrained to the ``printable'' part of ASCII (strictly speaking, the print class of characters in the POSIX locale defined in Unix 98 [154]), though this is probably still desirable for all but image data (and even for images it is sometimes the best choice).

SETL actively supports the bias in favor of printable data. It has a good repertoire of facilities for deciphering and formatting arbitrary values as strings (Section 2.14.2 [Formatting and Extracting Values]), it features the pretty operator (Section 2.14.3 [Printable Strings]) which produces printable strings exclusively, and the general output routines render values legibly except that they do not interfere more than necessary with the contents of strings. Routines such as getchar, getfile, putchar, and putfile do not interfere with them at all, ensuring that any bit pattern can be read or written if necessary. In the case where pure SETL programs are exchanging data, writea and reada can be used as perfect reciprocals, making the data stream an ideal place to probe for testing or instrumentation purposes. Similarly, if programs are designed to communicate using a line-by-line protocol, typically incorporating some simple command language, printa can be used for formatting and sending messages to a corresponding geta, which will receive each message as a whole line before parsing and interpreting it. This mode of operation is especially appropriate for communication with servers, which usually need to be able to defend themselves against miscreant clients. A primitive tool such as the general telnet TCP client can be used to perform some basic tests on such a command-oriented server.
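For example, two SETL programs connected by a stream fd (the names here are illustrative) can exchange arbitrarily structured values in fully printable form:

-- sender:
config := {['camera', [640, 480]], ['log', 'on']};
writea (fd, config);                 -- rendered as a legible, self-describing denotation
flush (fd);

-- receiver, in the peer program:
reada (fd, config);                  -- recovers exactly the same SETL value
The text flowing over fd can then be captured and inspected with nothing more exotic than a text editor.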

  
5.3.7 On Program Size

Because even large, complex data structures are just values in SETL, they can be passed from one SETL program to another with consummate ease by the primitive writea and reada operations just mentioned. This further facilitates the division of labor into many small programs instead of a few large ones, and hints at the rule:

$\bullet$ No monster programs.

The nature of modern data processing, to the extent that it involves piecework done by programmers with a flexible attitude towards languages and configurable off-the-shelf software, pressures programs to be small. Conversely, the affordability of large populations of processes on modern hardware removes the efficiency obstacle to treating programs as a plentiful resource. Furthermore, the relatively high walls of protection provided by modern operating systems around processes suggest that programs themselves may be ideal as modules or even ``objects''. Indeed, what practitioners of object-oriented programming now speak of as a method call was originally defined literally as a message-passing operation [185, p. 438].

There are numerous advantages to the use of programs as the fundamental modules in data processing systems. First, each program can be written in whatever language is most appropriate for it--even ``call-out'' conventions usually constrain the choices severely in the single-program case. Second, independent threads of control help to avoid bottlenecks, and to ward off the syndrome of a single program trying to juggle multiple activities. Third, shared resources tend to be guarded by their own supervisory processes rather than being carelessly managed by global variables (though of course a careful programmer would make such things private to a package and accessible only indirectly through subroutines). With regard to the latter, the motivation to copy rather than reference data is clearly strong in the setting of ``fullweight'' processes--resources will only be shared if they need to be.

5.3.8 On Standards

It goes without saying that adherence to recognized standards such as the Internet protocols and HTTP/MIME is a compatibility prerequisite for practically any new piece of software that hopes to deal with the global public network, and that confining it to use the API, shell, and utilities defined by Unix 98 [154] and Posix [118,119,117] where feasible will lend it a high degree of portability.

But there are also some other specific rules which should always be followed unless there is some compelling reason not to. These are rules that have become established practice because they work well.

5.3.8.1 Port Number Independence

As recommended in Section 3.1.3 [Choosing the Port Number] and illustrated in Section 4.2 [Software Structure], servers should strive to be independent of any specific TCP or UDP port number. To do otherwise is to risk making it impossible for a service to be offered at all, which will happen if the port number is already in use by another program. This condition can persist indefinitely, as is likely if the other program is itself a server. If a server critically depends on obtaining a certain port number, and some fundamental servers do (e.g., Web servers), then the port number should at least be registered with the IANA [116], though even this is no guarantee of its availability. Such a server's chances of getting the port number it wants will be further improved by having it started soon after its host comes up, perhaps as part of the system initialization sequence.

5.3.8.2 Configuration and Installation

A little effort on the part of the software developer to make a package easy to configure, install, and maintain can save every person responsible for installing and administering it a significant amount of trouble and vulnerability to mistakes. This is true for any software package, but especially so for large and complex systems that require configuration decisions to be made by the installer.

In this regard, perhaps the most important rule is to provide a step-by-step installation procedure that offers reasonable defaults and an opportunity to override them, together with clear documentation on the places where the software package impinges on the target platform. An installation script can be quite helpful.

A good principle to follow is to try to minimize, within reason, the number of dependencies on specific files or other resources in target systems. So, for example, although a software package may comprise a large number of programs and configuration files, they should by default all be grouped under a directory whose name serves as a common prefix, if practicable. Then this prefix, together with any particular system files that need to be inserted or modified, will be the entire extent of the package's ``footprint'' (apart from the space and time it ultimately consumes, of course).

One convention deserves special mention where servers are concerned, and that is the matter of how to make them sensitive to configuration changes without requiring them to be stopped and restarted. Fortunately, the Unix tradition has an answer to this question: make the server accept SIGHUP (the ``hang-up'' signal) as a request to re-read the configuration data. For example, inetd (the so-called ``super-server'' that is running on virtually every Internet-aware Unix host) re-reads its configuration file, /etc/inetd.conf, whenever a HUP signal is sent to it.

For a server structured as an event loop, as in Section 5.3.8.3 [The Event Loop], this behavior can be easily implemented by including a HUP signal stream among the file descriptors passed on the main select invocation.
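A sketch of what this might look like, anticipating the event loop of the next subsection and assuming a signal stream can be opened in a 'signal' mode analogous to the 'ignore' mode shown earlier for SIGPIPE (hup_fd, clients, and server_fd are illustrative names):

hup_fd := open ('SIGHUP', 'signal');       -- becomes readable when a HUP arrives
...
[ready] := select ((domain clients) with server_fd with hup_fd);
if hup_fd in ready then
  discard := getline hup_fd;               -- consume the signal indication
  -- re-read the configuration data here
end if;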

  
5.3.8.3 The Event Loop

Any SETL program that waits nondeterministically for inputs from more than one source will do so by calling select. For example, even the simple server vc-snap.setl listed in Section A.41 [vc-snap.setl] and discussed in Section 4.2.1 [Video Services] is typical in maintaining a map from pump file descriptors to client records. Each pump stream is connected to a child process that deals with one particular client. The domain of the map, i.e. the set of pump file descriptors, is passed to select along with the file descriptor of the socket that listens for new client connection requests. Again, this is an entirely typical arrangement, where the server delegates all long-term work to subprocesses and gets back to its main job, sleeping in a select call, as quickly as possible. If the server had other events to be concerned about, such as HUP signals telling it to re-read configuration data, or timers telling it to do some periodic checks, the file descriptors for those signal or timer streams would also be included in the set passed to select.

Personally, I find myself most comfortable with select appearing naked in an overt main event loop, but the sensibilities of those who prefer the ``callback'' style of programming can easily be accommodated too. Suppose fd_map is a global map from pump file descriptors to records, each of which contains a handler field designating a unary event-handling routine, and fd_ready is a global set-valued variable. Then the SETL main program, if the programmer so wishes, can consist of nothing more than some initialization and a final call to a routine such as

proc process_events;
  var fd;     -- local
  loop     -- cycle until some event-handling routine executes a stop
    [fd_ready] := select (domain fd_map);
    -- The set fd_ready is rechecked on each iteration:
    for fd in fd_ready | fd in fd_ready loop
      call (fd_map(fd).handler, fd);     -- indirect call
    end loop;
  end loop;
end proc;
which could be incorporated verbatim into programs using #include, preceded if desired by #define lines that rename fd_map and/or fd_ready.

``Registering'' an event-handling routine named (say) client_input could be done with

register (fd, routine client_input);
where the procedure register is defined as follows:
proc register (fd, callback_routine);
  fd_map(fd) ?:= {};     -- establish new record if necessary
  fd_map(fd).handler := callback_routine;
end proc;
``De-registering'' a callback routine via
deregister (fd);
might then be done by the following procedure:
proc deregister (fd);
  fd_map(fd).handler := om;  -- caller may now remove whole record
  fd_ready less:= fd;        -- remove fd from transient ready set
end proc;

The usual cautions about manipulation of global variables apply here: callback routines must be sensitive to what other callbacks might do to those variables. This is why process_events has the odd-looking loop header ``fd in fd_ready | fd in fd_ready'', which inspects the same global fd_ready set as deregister modifies. This loop header makes sure that each fd produced by the first ``fd in fd_ready'', which iterates over a copy of fd_ready, is still to be found in the global variable fd_ready before the corresponding loop iteration occurs. If it is not there at that time, a previous iteration has de-registered the file descriptor and its associated event handler from fd_map, and it would then be inappropriate to try to call that event handler.

It may appear at first glance that this circumstance could be dealt with more gracefully simply by guarding against fd_map(fd) or fd_map(fd).handler being om, but this would offer no protection against the case where a file descriptor, retired by both a callback and by a close executed by that callback, reappeared on a subsequent callback's open--indeed, Unix always yields the lowest-numbered free file descriptor on a system-level open call, so a just-closed descriptor is highly likely to be reused immediately. This file descriptor could then be mistaken for its older incarnation, and inappropriate processing performed on it. Since it is really a new file descriptor that has not yet entered the set of candidates supplied to select, the appropriate processing for it is none at all, at this point. Notice that performing ``fd_ready with:= fd'' in register, perhaps in a fatuous appeal to symmetry with deregister, would be exactly equivalent to making this oversimplification.

In WEBeye, where each main server loop typically has a select call over at least the domain of a clients map and a listening server socket, these semantic subtleties are kept at bay by adhering to the principle that in the code following the return from select, operations which may shrink the clients map precede those which may expand it. For example, tests for input from existing clients, which cause shrinkage of the clients map when clients terminate their connections as indicated by an end-of-file condition on the input file descriptor, are placed before the test for newly connecting clients, which can increase the size of the clients map.
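Schematically, the body of such a loop can be sketched as follows (clients and server_fd are illustrative, and accept is assumed to yield a new client file descriptor as in the server examples of Sections 3.2 and 3.3):

[ready] := select ((domain clients) with server_fd);
-- operations that may shrink the clients map come first:
for fd in domain clients | fd in ready loop
  line := getline fd;
  if line = om then                        -- this client has gone away
    close (fd);
    clients(fd) := om;                     -- shrink the map
  else
    -- handle one request line on behalf of this client
  end if;
end loop;
-- only then do we look for newly connecting clients, which expands the map:
if server_fd in ready then
  fd := accept (server_fd);
  if fd /= om then
    clients(fd) := {};                     -- new, empty client record
  end if;
end if;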

