In contexts as diverse as science, business, and academic research, data processing is as fundamental to what computers really do as computer science is to the study of what they might do.
But what is involved in data processing? All that usually happens to data when it gets ``processed'' is that it gets copied from one place to another. There is almost no computation in data processing, and few algorithms. And yet arranging for the right data to appear in the right place at the right time can be every bit as challenging as the most difficult mathematical analysis. More so, in fact, because the problems tend to be open-ended, and the psychology of users cannot usually be reduced to cogent proofs.
Neither does data processing have clear boundaries that distinguish it from other forms of computing, and we must be content with a rather intensional description of what it embraces. However, this thesis does not seek primarily to define data processing, but to show that SETL, conservatively augmented, is remarkably well suited to roles far from those described in \emph{On Programming}.
Perhaps the most salient feature of data processing is that it tends to be concerned with data interfaces, spanning the range from low-level input/output formats to high-level user interfaces. A contemporary office information system will typically comprise mostly off-the-shelf software such as spreadsheet packages, database packages, word processors, and operating systems, configured for local use. Additionally, there will be customized elements such as graphical user interfaces and specialized programs relating to the specific activity of the organization. And then of course there is the data itself, which all the other components must directly or indirectly accommodate.
A major task of the data processing programmer is to provide interfaces between these components: decoding data, observing data access coordination requirements, and, if not actually transforming the data, then at least formatting it for presentation to one or more sinks. With the great rise in the importance of computer networks, the modern data processing programmer also has to be able to deal with the issues surrounding distributed concurrency, including latency and the need for redundancy and security.
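As a concrete illustration of such interface glue, consider decoding records from one format and re-formatting the same data for a different sink. The sketch below is in Python rather than SETL, purely for illustration; the field names are invented and no particular system is implied. Note how little actual computation it contains: the work is all in moving data across format boundaries.

```python
import csv
import io
import json

def csv_to_json(csv_text):
    # Decode CSV records (one interface) and re-format them as JSON
    # (another interface) for presentation to a downstream sink.
    # The field names "name" and "qty" are hypothetical.
    reader = csv.DictReader(io.StringIO(csv_text))
    return json.dumps(list(reader))

raw = "name,qty\nwidgets,12\nsprockets,7\n"
print(csv_to_json(raw))
```

No values are computed here; the program merely arranges for the right data to appear in the right form, which is the essence of the task described above.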
We have come a long way from the simple, sheltered world of card readers, 9-track tapes, local disks, and line printers all operated in batch mode. The increase in the complexity of data processing environments is one reason why high-level languages are more important than ever. Fortunately, they are also more affordable.