To measure an increase or a decrease in information, theoreticians have borrowed a concept from thermodynamics that by now has become an integral part of the lexicon of information theory: the concept of entropy. The term has been bandied about long enough for everyone to have heard of it and, in most cases, to have used it somewhat loosely. We should therefore take a fresh look at it, so as to divest it of all the more or less legitimate echoes it has carried over from thermodynamics. According to the second law of thermodynamics, formulated by Rudolf Clausius, although a certain amount of work can be transformed into heat (as stated by the first law), every time heat is transformed into work certain limitations arise to prevent the process from ever being fully completed.
To obtain an optimum transformation of heat into work, a machine must provoke exchanges of heat between two bodies with different temperatures: a heater and a cooler. The machine draws a certain amount of heat from the heater but, instead of transforming it all into work, passes part of it on to the cooler. The amount of heat, Q, is then partly transformed into work, Q₁, and partly funneled into the cooler, Q − Q₁.
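In the standard notation of thermodynamics textbooks (the formula below is a conventional restatement added only as a reference point, not part of the original argument), this bookkeeping takes the form of an efficiency:

\[
\eta \;=\; \frac{Q_1}{Q} \;=\; 1 - \frac{Q - Q_1}{Q},
\qquad
\eta \;\le\; 1 - \frac{T_{\text{cooler}}}{T_{\text{heater}}},
\]

where the second inequality, due to Carnot, says that the conversion could be complete only if the cooler sat at absolute zero; some heat must always be surrendered to the cooler.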
Thus, the amount of work that is transformed into heat will be greater than the amount of work derived from a subsequent transformation of heat into work. In the process, there has been a degradation, more commonly known as a consumption, of energy that is absolutely irreversible. This is often the case with natural processes: «Certain processes have only one direction: each of them is like a step forward whose trace can never be erased.» To obtain a general measure of irreversibility, we have to consider the possibility that nature favors certain states over others (the ones at the receiving end of an irreversible process), and we must find a physical measure that could quantify nature’s preference for a certain state and that would increase whenever a process is irreversible. This measure is entropy.
The second law of thermodynamics, concerning the consumption of energy, has therefore become the law of entropy, so much so that the concept of entropy has often been associated with that of consumption, and with the theory stating that the evolution of all natural processes toward an increasing consumption and progressive degradation of energy will eventually result in the «thermic death» of the universe. And here it is important to stress, once and for all, that although in thermodynamics entropy is used to define consumption (thereby acquiring pessimistic connotations—whether or not it is reasonable to react emotionally to a scientific concept), in fact it is merely a statistical measure and, as such, a mathematically neutral instrument.
In other words, entropy is the measure of that state of maximal equiprobability toward which natural processes tend. This is why one can say that nature shows certain preferences: nature prefers greater uniformity to lesser uniformity, and heat moves from a warmer body to a cooler body because a state in which heat is equally distributed is more probable than a state in which heat is unequally distributed. In other words, the relative speeds of the molecules tend toward a state of uniformity rather than toward a state of differentiation, in which certain molecules move faster than others and the temperature is constantly changing. Ludwig Boltzmann’s research on the kinetic theory of gases demonstrated that nature tends toward an elemental disorder of which entropy is the measure.
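Boltzmann's measure can be stated compactly; the formula below is the standard one, quoted here only as a reference point:

\[
S \;=\; k \,\ln W,
\]

where W counts the microscopic arrangements of the molecules compatible with a given overall state and k is Boltzmann's constant. A uniform distribution of speeds can be realized in vastly more arrangements than any differentiated one, so it has the largest W and therefore the greatest entropy; this is the precise sense in which nature «prefers» uniformity.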
It is, therefore, important to insist on the purely statistical character of entropy—no less purely statistical than the principle of irreversibility, whereby, as proved by Boltzmann, the process of reversion within a closed system is not impossible, only improbable. The collisions of the molecules of a gas are governed by statistical laws which lead to an average equalization of differences in speed. When a fast molecule hits a slow one, it may occasionally happen that the slow molecule loses most of its speed and imparts it to the fast one, which then travels away even faster; but such occurrences are exceptions. In the overwhelming number of collisions, the faster molecule will lose speed and the slower one will gain it, thus bringing about a more uniform state and an increase in elemental disorder.
As Hans Reichenbach has written, «The law of the increase of entropy is guaranteed by the law of large numbers, familiar from statistics of all kinds, but it is not of the type of the strict laws of physics, such as the laws of mechanics, which are regarded as exempt from possible exceptions.»
Reichenbach has provided us with the clearest and simplest explanation of how the concept of entropy has passed from the theory of energy consumption to that of information. The increase in entropy that generally occurs during physical processes does not exclude the possibility of other physical processes (such as those we experience every day, since most organic processes seem to belong to this category) that entail an organization of events running counter to all probability—in other words, involving a decrease in entropy.
Starting with the entropy curve of the universe, Reichenbach calls these decreasing phases, characterized by an interaction of events that leads to a new organization of elements, branch systems, to indicate their deviation from the curve.
Consider, for example, the chaotic effect (resulting from a sudden imposition of uniformity) of a strong wind on the innumerable grains of sand that compose a beach: amid this confusion, the action of a human foot on the surface of the beach constitutes a complex interaction of events that leads to the statistically very improbable configuration of a footprint. The organization of events that has produced this configuration, this form, is only temporary: the footprint will soon be swept away by the wind.
In other words, a deviation from the general entropy curve (consisting of a decrease in entropy and the establishment of improbable order) will generally tend to be reabsorbed into the universal curve of increasing entropy. And yet, for a moment, the elemental chaos of this system has made room for the appearance of an order, based on the relationship of cause and effect: the cause being the series of events interacting with the grains of sand (in this case, the human foot), and the effect being the organization resulting from it (in this case, the footprint).
The existence of these relationships of cause and effect in systems organized according to decreasing entropy is at the basis of memory. Physically speaking, memory is a record (an imprint, a print), an «ordered macroarrangement, the order of which is preserved: a frozen order, so to speak.» Memory helps us reestablish causal links, reconstruct facts. «Since the second law of thermodynamics leads to the existence of records of the past, and records store information, it is to be expected that there is a close relationship between entropy and information.»7
We shouldn’t, therefore, be too surprised by the frequent use of the term «entropy» in information theories, since to measure a quantity of information means nothing more than to measure the levels of order and disorder in the organization of a given message.
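The connection can be made explicit with the standard definition used in information theory (given here only as a reference point, not as part of the argument): a source that emits symbols with probabilities \(p_1, \dots, p_n\) has entropy

\[
H \;=\; -\sum_{i=1}^{n} p_i \log_2 p_i ,
\]

which reaches its maximum value, \(\log_2 n\), exactly when all the symbols are equiprobable, that is, when the message is at its most disordered and each new symbol is least predictable.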
The Concept of Information in the Work of Norbert Wiener
For Norbert Wiener—who has relied extensively on information theory for his research in cybernetics, that is, in his investigation of the possibilities of control and communication in human beings and machines—the informative content of a message is given by the degree of its organization. Since information is a measure of order, the measure of disorder, that is to say, entropy, must be its opposite. This means that the information of a message depends on its ability to elude, however temporarily, the equiprobability, the uniformity, the elemental disorder toward which all natural events seem destined, and to organize itself according to a particular order.
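Wiener's identification can be put in one line; the formulation below is only a schematic gloss on his claim in Cybernetics that information and entropy are, up to sign, the same quantity:

\[
\text{information} \;=\; -\,\text{entropy},
\]

so that a highly organized, improbable message carries much information, while a message indistinguishable from a background of equiprobable noise carries almost none.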
For instance, if I throw in the air a bunch of cubes with different letters printed on their faces, once they hit the ground they will probably spell out something utterly meaningless—say, AAASQMFLENSUF101. This sequence of letters does not tell me anything in particular. In order to tell me something, it would have to be organized according to the orthographic and grammatical laws of a particular language—in other words, it would have to be organized according to a particular linguistic code.
A language is a human event, a typical branch system in which several factors have intervened to produce a state of order and to establish precise connections. In relation to the entropy curve, language—an organization that has escaped the equiprobability of disorder—is another improbable event, a naturally improbable configuration that can now establish its own chain of probability (the probabilities on which the organization of a language depends) within the system that governs it. This kind of organization is what allows me to predict, with a fair amount of certainty, that in an English word containing three consonants in a row the next letter will be a vowel.
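A small computational sketch may make the point tangible (the weighted frequencies below are invented for illustration and are not English letter statistics): a sequence drawn uniformly from the alphabet, like the tossed cubes of the earlier example, has maximal per-letter entropy, whereas a source whose letters are weighted the way a linguistic code weights them has lower entropy and is correspondingly more predictable.

```python
import math

def entropy(probs):
    """Per-symbol entropy in bits: H = -sum(p * log2(p))."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# The "tossed cubes": 26 equiprobable letters, maximal disorder.
uniform = [1 / 26] * 26

# A purely illustrative weighting in which a few letters carry most of
# the probability, as they do in a natural language. These numbers are
# assumptions made for the sketch, not measured English frequencies.
weighted = [0.12, 0.09, 0.08, 0.075, 0.07, 0.065, 0.06, 0.055,
            0.05, 0.045, 0.04, 0.035, 0.03, 0.025, 0.02, 0.018,
            0.016, 0.014, 0.012, 0.010, 0.008, 0.006, 0.005,
            0.004, 0.002, 0.001]
total = sum(weighted)
weighted = [p / total for p in weighted]  # normalize so the probabilities sum to 1

print(f"uniform source:  {entropy(uniform):.2f} bits/letter")   # log2(26) ≈ 4.70
print(f"weighted source: {entropy(weighted):.2f} bits/letter")  # noticeably lower
```

Running the sketch prints about 4.70 bits per letter for the uniform source and a smaller figure for the weighted one; the gap is the room the code leaves for prediction, the same room that lets one expect a vowel after three consonants.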
The tonal system, in music, is another language, another code, another branch system. Though extremely improbable when compared to other natural acoustic events, the tonal system also introduces, within its own organization, certain criteria of probability that allow one to predict, with