The Life of Information: From Drums to Wikipedia

Review of The Information by James Gleick

by Ernest Davis, Times Literary Supplement, August 17, 2011.

James Gleick. The Information: A History, a Theory, a Flood. 526 pp. Fourth Estate. ISBN 978-0-00-722573-6

The view of information as a substance that is stored, transmitted, and transformed, and the analysis of processes of all kinds — cognitive, social, biological, and physical — in terms of their manipulation of information, are today ubiquitous modes of thought. This conceptual framework barely existed two centuries ago. James Gleick's ambitious book The Information: A History, a Theory, a Flood traces the emergence of this idea and its interaction with the simultaneous, enormous expansion of information-processing technology.

The history that Gleick recounts is very complex. His book includes historical and technical material drawn mostly from engineering, logic, physics, and biology, but also from mathematics, computer science, anthropology, library science, and lexicography. The scientific and historical interrelations between the first four areas are intricate. Gleick's narrative zigzags back and forth in time across the past two centuries. (A timeline would have been helpful.) Consequently, the book lies somewhere between a coherent narrative and a collection of episodes.

Gleick begins his book with an extraordinary set piece: the drum languages of Africa. European travellers discovered these in the 1830s; messages could be transmitted one hundred miles in an hour, faster than by any communication system then in Europe. The language of the drums was based on imitating the tonal structure of the spoken language; the loss of the information carried in the vowels and consonants was made up for by the use of long fixed phrases. For example, the word "moon" was expressed in the drum languages using the tones corresponding to the phrase "the moon looks down at the earth".

At that date the fastest communication system in Europe was the Chappe telegraph: towers on hilltops, with huge arms that could be placed in semaphore configurations. In good weather the system could transmit signals at 47 miles an hour, at a rate of three signals per minute. In the 1840s the electric telegraph came into use. The effects of instantaneous, long-distance communication were profound and innumerable, ranging from the synchronization of trains to useful weather reports. A side effect of the telegraph was the invention of Morse code, which introduced two critical innovations in the expression of information. First, each letter was expressed as a sequence of dots and dashes. Second, to create a more efficient encoding, common letters were expressed in short sequences and uncommon letters in long sequences.
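To see how much Morse's second idea saves, consider a toy sketch of my own (the code words are drawn from the standard International Morse alphabet, but the little program and its function name are purely illustrative): the commonest letters cost one symbol, the rarest four.

    # A sketch of variable-length encoding: frequent letters get short codes.
    # (Only a fragment of the Morse alphabet; the function name is mine.)
    MORSE = {
        "E": ".",    "T": "-",      # the two most common English letters
        "A": ".-",   "N": "-.",     # common letters: two symbols each
        "S": "...",  "O": "---",    # three symbols
        "Q": "--.-", "J": ".---",   # rare letters: four symbols
    }

    def encode(word):
        # Encode letter by letter, separating the letters with spaces.
        return " ".join(MORSE[letter] for letter in word.upper())

    print(encode("notes"))   # -. --- - . ...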

These three insights of Morse code and of the drum languages --- the expression of information in terms of two symbols, the use of variable-length encodings for efficiency, and the use of redundant encodings to correct for errors and ambiguities --- were systematized in the elegant mathematical theory of information, invented in 1948 by Claude Shannon (1916-2001), then a mathematician at Bell Labs. Shannon identified the bit --- a single choice of 0 or 1 --- as the fundamental element of information, and the bit string as the fundamental encoding of information. He formulated the mathematics of using an efficient code to compress information, and of using a redundant code to transmit information reliably despite noise such as static. A distinctive feature of Shannon's information theory was that he explicitly excluded any consideration of meaning; in Shannon's theory, information is stored and transmitted but not interpreted or even used. Shannon is the central figure in "The Information"; the hundred pages that Gleick devotes to his life and works are the most complete biography that has been written of this great scientist.
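The third idea, redundancy as a guard against noise, can also be sketched in a few lines. The toy scheme below, the triple-repetition code, is my illustration rather than anything of Shannon's or Gleick's.

    # Redundancy correcting noise: send each bit three times and let the
    # receiver take a majority vote.
    def encode(bits):
        return [b for b in bits for _ in range(3)]

    def decode(received):
        # Majority vote within each group of three received bits.
        return [int(sum(received[i:i + 3]) >= 2)
                for i in range(0, len(received), 3)]

    message = [1, 0, 1, 1]
    sent = encode(message)           # [1,1,1, 0,0,0, 1,1,1, 1,1,1]
    sent[4] = 1                      # noise flips one bit in transit
    assert decode(sent) == message   # the flipped bit is outvoted

The price, of course, is that every message becomes three times as long; the achievement of Shannon's theory was to show how much redundancy is really necessary for a given level of noise, and his codes do far better than this crude scheme.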

One of Shannon's major contributions was to define a measure of ignorance in probabilistic situations, called the entropy. Suppose that the phone rings; you judge that with probability 1/2 it is your spouse; with probability 1/4 it is your father; and with probability 1/4 it is your daughter. Information theory measures the entropy of this probability distribution as 1.5 bits; this is the amount of information that you will gain, on average, when you pick up the phone, and therefore the amount of information you lack before you pick up the phone. The entropy is also a fundamental measure in the theory of efficient coding. You can develop a method of encoding English in bits where the average number of bits in the encoding of a message is equal to the entropy of English times the number of letters in the message; and that is the best you can do, in terms of efficiency. Remarkably, the formula for computing the entropy in information theory had been discovered in 1871 by Ludwig Boltzmann as expressing the fundamental relation between the thermodynamic state of a gas and the motions of its molecules.
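The figure of 1.5 bits is easy to check: Shannon's formula adds up, for each possible caller, the probability times the base-2 logarithm of one over that probability. The little computation below is mine, not Gleick's.

    # Checking the telephone example: entropy = sum of p * log2(1/p).
    from math import log2

    probabilities = [0.5, 0.25, 0.25]    # spouse, father, daughter
    entropy = sum(p * log2(1 / p) for p in probabilities)
    print(entropy)                       # 1.5 (bits)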

The informational approach revolutionized the biochemistry of genetics over the course of the twentieth century. The analysis of biochemical processes shifted its focus from explaining the flux of energy and the flow of material to explaining the flow of information, from DNA to RNA to proteins. In theoretical physics, informational analysis takes many different forms. The best known is Heisenberg's uncertainty principle: the position and momentum of a particle cannot both be known precisely. The most far-reaching is John Archibald Wheeler's conjecture, summarized in the slogan "It from Bit", that all of physics can and should be described in terms of information.

Another thread in Gleick's complex weave is the development of computational devices, mathematical logic, and computation theory in the nineteenth and twentieth centuries. In the mid-nineteenth century Charles Babbage designed a mechanical computer called the Analytical Engine and Byron's daughter Ada Lovelace wrote programs for it; but it was never actually built. (Gleick's chapter on the Analytical Engine very much exaggerates its historical and scientific importance.) During the same era, and in the long run much more fruitfully, George Boole and Augustus De Morgan independently invented methods to describe logical inference in mathematical terms. Mathematical logic advanced rapidly over the next century; in 1910, Principia Mathematica by Whitehead and Russell demonstrated that all rigorous mathematical proofs can be expressed in purely logical terms. In 1936, Alan Turing invented the Turing machine, a model of computation that is beautifully simple, yet can carry out any feasible calculation. The intimate connections between logic and computation were uncovered in the 1920s and 1930s by Turing, Emil Post, Kurt Gödel, and Alonzo Church.
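To convey just how simple Turing's model is, here is a minimal sketch of my own; the rule format and the particular machine, which does nothing more than flip every bit on its tape, are illustrative choices rather than Turing's own notation.

    # A Turing machine: a finite table of rules that reads and writes symbols
    # on a tape, moving one cell at a time. This machine flips every bit on
    # the tape and halts when it reaches a blank cell.
    def run(tape, rules, state="start", head=0):
        while state != "halt":
            state, write, move = rules[(state, tape[head])]
            tape[head] = write
            head += move
            if head == len(tape):        # extend the tape with a blank if needed
                tape.append(" ")
        return tape

    rules = {
        ("start", "0"): ("start", "1", +1),   # flip 0 to 1, move right
        ("start", "1"): ("start", "0", +1),   # flip 1 to 0, move right
        ("start", " "): ("halt",  " ", 0),    # blank: stop
    }

    print("".join(run(list("10110 "), rules)))   # prints 01001 followed by a blank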

However, Gleick's narrative has a strange gap; he loses interest in the electronic computer exactly when it becomes a reality. This is conspicuous because elsewhere Gleick gives elaborately detailed descriptions of technological devices and of hypothetical computers. There is a chapter on Babbage's non-existent Analytical Engine; there are two pages on the workings of the telegraph and four pages on unsuccessful precursors; and there are similarly detailed accounts of the telephone switchboard, Vannevar Bush's differential analyzer, and the purely theoretical Turing machine. But there is not a word describing the workings of any actual computer or program. The exception proves the rule; the single post-1940 device whose workings are described is the quantum computer, which also does not exist.

This may reflect the fact that Shannon's information theory, with the immense exception of the definition of the bit, is not actually very important in most areas of computer science. Most American computer science majors never encounter it. Computer programs very rarely deal with meaningless bit strings; they deal with meaningful data structures. The development of programming languages, undiscussed in Gleick's book, is largely a movement toward meaning; a programming language is "high-level" to the extent that its data and operations are meaningful. Gleick's claim that the invention of information theory was more significant than the invention in the same year of the transistor --- the fundamental hardware element of all computers and almost all electronics in the last fifty years --- is patently absurd.

Gleick has a couple of chapters on the history of information before the nineteenth century, but these are much weaker. He often makes historical claims that are far-fetched or wrong. "The written word ... was a prerequisite for conscious thought as we understand it." There have been many recent studies of consciousness, but I have not seen a theory that comes close to supporting this. "It did not occur to Sophocles' audiences that it would be sad for his plays to be lost; they enjoyed the show." I don't know particularly about Sophocles or his audiences; but classical authors in general were very conscious that they were writing for posterity. They were aware both that texts survived for centuries and that important texts could be and had been lost. "Before print, scripture was not truly fixed." On the contrary, medieval manuscripts of Hebrew scripture differ from the current standard version only in very rare and minor errors, in contrast to the many major differences between the First Folio and the First Quarto of Shakespeare.

Gleick devotes a chapter to the development of the English dictionary; this is largely irrelevant to his main theme. The OED is a splendid accomplishment, but it does not mark a major innovation from the standpoint of informatics. Moreover, Gleick's exclusive focus on dictionaries in English is historically misleading. For example, he gives the impression that alphabetical order was invented by Cawdrey (1604); in fact it had been used by Arabic lexicographers since the thirteenth century.

The final chapters, on "the flood", are readable and informative --- I particularly enjoyed the long account of Wikipedia --- but overwrought. To my mind, global information overload is a minor issue; the more important issues are narrower. The Web is beneficial because it makes useful information more available and transactions easier. It is harmful because of a number of unrelated side effects: it makes some valued professions, particularly journalism, less profitable and thus harder to sustain; it enables violations of personal privacy; it makes it possible for adolescent folly to become known immediately, everywhere, and forever; and so on. Some of these may have technical or social solutions. None will turn us into supermen or robots.

These exceptions aside, however, Gleick is consistently entertaining, informative, and clear. He moves effortlessly between high-level principles, vivid historical anecdotes, and detailed nitty-gritty explanation, and he uses each level to shed light on the others. Overall, this is a brilliant work of popular history of science.