Claude E. Shannon and Information Theory

By NASRULLAH MAMBROL on July 29, 2018

Claude E. Shannon’s publication of A Mathematical Theory of Communication in the Bell System Technical Journal of July and October 1948 marks the beginning of information theory and can be considered “the Magna Carta of the information age” (Verdú 1998: 2057). Shannon’s work brought into being a research field that is both an important sub-discipline of mathematics and an applied science relevant to a multiplicity of fields, including but not restricted to computer science, cryptology, philosophy, psychology, (functional) linguistics, statistics, engineering, physics, biology (especially genetics), and economics.¹ But Shannon could not have written his seminal paper without the work done by important precursors: the Bell Labs engineers Harry Nyquist (1924, 1928) and Ralph Hartley (1928); the mathematicians John von Neumann (1932) and Norbert Wiener (1942, 1948);² and the physicists Ludwig Boltzmann (1896–98), J. Willard Gibbs (1876, 1878), and Leó Szilárd (1929). In contemporary information theory, much of this early work still plays an important role. More recent research has either elaborated on Shannon’s original insights (Verdú 1998) or followed the different path of algorithmic information theory outlined by Gregory J. Chaitin, Andrey Nikolaevich Kolmogorov, and Ray Solomonoff in the 1960s (Chaitin 1987).

Claude E. Shannon

For uses of information theory within literary, cultural, and media theory, the case is different. Here, most research builds on Shannon’s work. More precisely, most information-theoretic reflections in the humanities and social sciences rely on The Mathematical Theory of Communication (1963), which reprints Shannon’s original paper along with an expository introduction by Warren Weaver.

As an engineer working for Bell Telephone Laboratories, Shannon was crucially interested in theorizing ways of making the transmission of information more efficient. Drawing on probabilistic theory, statistics, and thermodynamics, Shannon studied the impact of two factors – the bandwidth of a channel and its signal-to-noise ratio – on channel transmission capacity. Thus, he was able to provide crucial assistance to engineers intent on maximizing the capacity of communication channels. Indeed, the primary practical use of Shannon’s theorems is in the design of more efficient telecommunications systems.
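The dependence of capacity on bandwidth and signal-to-noise ratio is usually stated as the Shannon–Hartley formula, C = B log2(1 + S/N), which holds under the assumption of additive white Gaussian noise. The Python sketch below is only an illustration: the formula is not quoted in the passage above, and the function names as well as the 3 kHz and 30 dB figures are hypothetical values chosen to resemble a telephone line.

```python
import math


def db_to_linear(snr_db: float) -> float:
    """Convert a signal-to-noise ratio from decibels to a plain power ratio."""
    return 10 ** (snr_db / 10)


def channel_capacity(bandwidth_hz: float, snr_linear: float) -> float:
    """Shannon-Hartley capacity in bits per second (Gaussian-noise assumption)."""
    if bandwidth_hz < 0 or snr_linear < 0:
        raise ValueError("bandwidth and SNR must be non-negative")
    return bandwidth_hz * math.log2(1.0 + snr_linear)


# Hypothetical telephone-like channel: 3 kHz of bandwidth, 30 dB SNR.
capacity = channel_capacity(3000.0, db_to_linear(30.0))
print(f"capacity: about {capacity / 1000:.1f} kbit/s")

# Doubling the bandwidth doubles the capacity; raising the SNR helps only
# logarithmically.
print(channel_capacity(6000.0, db_to_linear(30.0)) / capacity)
```

On these assumed figures the capacity comes out just under 30 kbit/s, which suggests why both factors mattered to engineers: bandwidth scales capacity linearly, while a better signal-to-noise ratio yields diminishing returns.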

For Shannon, there was no doubt as to what constitutes a maximally successful act of communication: the message received must be identical to the message sent. In this model, the final touchstone of communicative success is the replication of the sender’s intention, and noise is defined as all those “things [that] are added to the signal which were not intended by the information source” (Shannon and Weaver 1963: 7). Yet Shannon made an interesting discovery concerning noise that proved to be relevant to many in the literature and science community. While noise is completely unintelligible for the receiver, it is also the part of the signal with the highest information content. This might seem counterintuitive at first, since one would expect that a more ordered, less chaotic signal transmits more information. But Shannon’s observation will become clear once we have had a look at his recourse to thermodynamics.

To his surprise, Shannon found that his definition of information, rendered as a mathematical equation, corresponded to Boltzmann’s definition of entropy, a measure of disorder or the unavailability of energy to do work within a closed system. This makes sense if we follow Shannon in considering messages not in isolation but in the context of the range of possible messages from which the actual message has been selected: “To be sure, this word information in communication theory relates not so much to what you do say, as to what you could say. That is, information is a measure of one’s freedom of choice when one selects a message” (Shannon and Weaver 1963: 8–9). The larger the set of possible messages, the greater the freedom of choice a sender has in choosing a specific message, the greater the uncertainty on the part of the receiver as to what specific message the sender has actually chosen, and the greater the amount of information received. In Shannon’s model, the amount of information received corresponds to the degree of uncertainty removed at the receiver’s end: “Thus greater freedom of choice, greater uncertainty, greater information go hand in hand” (Shannon and Weaver 1963: 19). Hence, a message that is completely predictable is redundant and thus devoid of information. Conversely, a message about whose content the receiver was highly uncertain prior to its arrival conveys much information, and a maximally entropic (or “informative”) message is one that has been chosen out of a maximally large set of messages that are all equally probable:

That information be measured by entropy is, after all, natural when we remember that information, in communication theory, is associated with the amount of freedom of choice we have in constructing messages. Thus for a communication source one can say, just as he would also say it of a thermodynamic ensemble, “This situation is highly organized, it is not characterized by a large degree of randomness or of choice – that is to say, the information (or the entropy) is low.” (Shannon and Weaver 1963: 13)
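The measure described here is conventionally written H = −Σ p log2 p, summed over the probabilities of the possible messages. The following Python sketch, with hypothetical function names and toy probability sets, illustrates the two limiting cases mentioned above: a set of equally probable messages yields maximal entropy, and a completely predictable message yields none.

```python
import math
from collections import Counter


def shannon_entropy(probabilities):
    """Entropy in bits: the sum of p * log2(1/p) over outcomes with p > 0."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0)


def empirical_entropy(message):
    """Entropy per symbol, estimated from symbol frequencies in `message`.

    This treats symbols as independent draws and ignores their order.
    """
    counts = Counter(message)
    total = len(message)
    return shannon_entropy(count / total for count in counts.values())


equiprobable = shannon_entropy([1 / 8] * 8)   # eight equally likely messages
predictable = shannon_entropy([1.0])          # only one possible message
skewed = shannon_entropy([0.9, 0.1])          # two messages, one very likely

print(equiprobable, predictable, skewed)
print(empirical_entropy("aaaaaaaa"), empirical_entropy("abababab"))
```

Eight equally likely messages carry three bits each, the single certain message carries zero, and the skewed two-message source falls below the one bit that a fair binary choice would carry.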

Now, since the introduction of noise into a channel of communication increases uncertainty and makes messages less predictable, it also increases information. Thus, noise is defined in Shannon’s framework as the signal that exhibits both maximum entropy and the greatest amount of information. As such, it is the opposite of redundancy – a completely predictable signal that conveys no information whatsoever.

From an engineering point of view, though, one needs to distinguish between useful and useless information, and Shannon and Weaver quickly point out that the (large) amount of information contained in noise is useless:

Uncertainty which arises by virtue of freedom of choice on the part of the sender is desirable uncertainty. Uncertainty which arises because of errors or because of the influence of noise is undesirable uncertainty. It is thus clear where the joker is in saying that the received signal has more information. Some of this information is spurious and undesirable and has been introduced via the noise. To get the useful information in the received signal we must subtract out this spurious portion. (Shannon and Weaver 1963: 19)
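The subtraction Weaver describes is standardly formalized as I(X;Y) = H(Y) − H(Y|X): the entropy of the received signal minus the entropy contributed by the noise. The Python sketch below works this out for a binary symmetric channel, a textbook model that the passage itself does not mention; the function names and the 10% figures are assumptions chosen for illustration.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in bits of a binary choice with probabilities p and 1 - p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def received_entropy(p_one: float, flip: float) -> float:
    """H(Y): entropy of the received bit when each sent bit flips with
    probability `flip` and the source sends a 1 with probability `p_one`."""
    p_received_one = p_one * (1 - flip) + (1 - p_one) * flip
    return binary_entropy(p_received_one)


def useful_information(p_one: float, flip: float) -> float:
    """I(X;Y) = H(Y) - H(Y|X); for this channel H(Y|X) is the noise entropy."""
    return received_entropy(p_one, flip) - binary_entropy(flip)


# Hypothetical source that sends a 1 ten percent of the time, over a channel
# that flips ten percent of the bits.
source = binary_entropy(0.1)
received = received_entropy(0.1, 0.1)
useful = useful_information(0.1, 0.1)
print(f"source {source:.3f}, received {received:.3f}, useful {useful:.3f} bits")
```

In this toy case the received signal has higher entropy than the source, yet once the noise entropy is subtracted, the useful portion is smaller than what the source originally offered.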

Noise, it appears, has been successfully exorcized from the mathematical theory of communication. This comes as little surprise, since, as an employee of a telephone company, Shannon was interested in minimizing noise in order to ensure maximally efficient ways of transmitting (useful) information. What has become known as “the fundamental theorem of information theory” also testifies to this: “it is possible to transmit information through a noisy channel at any rate less than channel capacity with an arbitrarily small probability of error” (Ash 1965: 63).
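A rough feel for the theorem can be had from a binary symmetric channel, whose capacity is 1 − H(p) bits per use when each bit is flipped with probability p. The Python sketch below is an assumption-laden illustration, not the coding scheme the theorem refers to: it uses a simple repetition code with majority voting, and the function names and the 10% flip probability are hypothetical.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in bits of a binary choice with probabilities p and 1 - p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def bsc_capacity(flip: float) -> float:
    """Capacity in bits per channel use of a binary symmetric channel."""
    return 1.0 - binary_entropy(flip)


def repetition_error(n: int, flip: float) -> float:
    """Probability that a majority vote over n copies of one bit is wrong,
    i.e. that more than half of the n copies are flipped (n must be odd)."""
    if n % 2 == 0:
        raise ValueError("n must be odd so that a majority always exists")
    return sum(
        math.comb(n, k) * flip**k * (1 - flip) ** (n - k)
        for k in range(n // 2 + 1, n + 1)
    )


flip = 0.1  # hypothetical noise level
print(f"capacity: {bsc_capacity(flip):.3f} bits per use")
for n in (1, 3, 5, 9):
    print(f"n={n}: rate {1 / n:.3f}, error {repetition_error(n, flip):.6f}")
```

Repetition drives the error probability down only by letting the rate 1/n shrink toward zero. The theorem makes the stronger claim that suitable codes can keep the rate fixed at any value below capacity while still making errors arbitrarily rare.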

However, toward the end of his expository introduction, Shannon’s coauthor, Weaver, intimates that one might think about noise differently. Throughout his introduction, Weaver stresses that “information must not be confused with meaning” and that “the semantic aspects of communication are irrelevant to the engineering aspects” (Shannon and Weaver 1963: 8). This exclusion of semantic considerations is already visible in the communication model Shannon proposes on the first pages of his article (Figure 13.1).⁵

Figure 13.1 Shannon’s Communication Model

There is no box in this diagram for the interpretive activity of the receiver, and it is clear that the purpose of communication in this model is to transmit messages so that the message received is identical to the message sent. But when Weaver does turn to semantic issues in the final section of his introduction, he proposes a number of changes to Shannon’s model:

One can imagine, as an addition to the diagram, another box labeled “Semantic Receiver” interposed between the engineering receiver (which changes signals to messages) and the destination. This semantic receiver subjects the message to a second decoding, the demand on this one being that it must match the statistical semantic characteristics of the message to the statistical semantic capacities of the totality of receivers, or of that subset of receivers which constitute the audience one wishes to affect. (Shannon and Weaver 1963: 26)

Weaver’s consideration of the receiver’s role indicates a shift away from a communication model that regards the sender’s intention as the sole source of meaning. Moreover, his assertion that the message’s semantic properties must be adjusted, in a “second decoding,” to the receiver’s capacity for processing meaning already qualifies Shannon’s original premise that the goal of communication is the transmission of self-identical messages. Weaver moves even further away from a communication model that is based on intentionality when he considers the possibility of adding an additional box labeled “semantic noise” to the diagram:

Similarly one can imagine another box in the diagram which, inserted between the information source and the transmitter, would be labeled “semantic noise,” the box previously labeled as simply “noise” now being labeled “engineering noise.” From this source is imposed into the signal the perturbations or distortions of meaning which are not intended by the source but which inescapably affect the destination. And the problem of semantic decoding must take this semantic noise into account. It is also possible to think of an adjustment of original message so that the sum of message meaning plus semantic noise is equal to the desired total message meaning at the destination. (Shannon and Weaver 1963: 26)

Weaver’s suggestion that distortions of meaning that were not intended by the sender might not impair but contribute to the meaning received at the other end of the communication process represents a break with communication models that are based on the sender’s intention as the sole reference point for communicative success. Weaver’s changes to Shannon’s model also re-inject the noise that had been exorcized through Shannon’s distinction between useful and useless information (Figure 13.2).

Clearly, Weaver’s reflections on noise and meaning propose a model of communication that works in spite of the noise rather than because of it. Still, his suggestion that noise is not only an inevitable constituent of any form of communication but may actually be an essential part of the desired message assigns to noise the status of a potentially beneficial element. Together with Shannon’s assertion that noise is the signal with the highest information content (or highest degree of “informativeness”), it forms the basis for a host of revalorizations of noise in literary, cultural, and media theory.

There are, of course, problems with Shannon and Weaver’s model of communication. First, it is a one-way transmission model that can account for communication only between two entities. A broader understanding of communication as it informs, for instance, the notion of discourse or even Stuart Hall’s encoding/decoding model (1980) is well beyond its scope. Second, Weaver’s clear-cut differentiation between “semantic noise” and “engineering noise” cannot be upheld because it presupposes a strict separation of the level of the signifier (affected by the engineering noise) and the signified (affected by the semantic noise).⁶ The poststructuralist assertion of the primacy of the signifier and the endless deferral of the signified has rendered such a distinction problematic. Finally, because Shannon and Weaver’s concept of noise is based on the assumption that noise corresponds to all those things that have been added to the signal unintentionally (Shannon and Weaver 1963: 7), their model does not allow for noise that has been added on purpose. In the analysis of literature, especially certain types of modernist literature, it is desirable to broaden Shannon and Weaver’s understanding of noise to include textual distortions and fragmentations which we, as readers, tend to see as intended by a writer who uses them consciously and for artistic effect.⁷

Figure 13.2 Shannon’s Communication Model with Weaver’s Proposed Changes

Despite these limitations, Shannon and Weaver’s mathematical theory of communication has had profound effects on cybernetics and systems theory. However, in their later development, these disciplines abandon the older transmission model of communication for one that describes processes of information exchange or cognitive construction taking place at several hierarchically distinct levels within highly complex systems such as computers, the human body, and society. Apart from these further developments, Shannon and Weaver’s theorems themselves had a strong impact on the humanities and social sciences. In what follows, we will sketch some of their most prominent uses, with a special emphasis on theoretical and literary reflections on noise.

Hayles (1987) compares Shannon and Weaver’s model of communication to Roland Barthes’s as developed in S/Z. Starting from the assumption that science and literature are isomorphic manifestations of a shared culture (Hayles 1987: 119–20), Hayles contrasts the different “economies of explanation” at work in Shannon and Barthes. Both theorists note that noise contains a surplus of information. But while Shannon’s work is embedded in a capitalist-scientific economy that demands the reduction of the many to the few and tries to mute noise by designating it as useless,⁸ Barthes’s work is embedded in a literary economy that demands the expansion of the few to the many, values playfulness over usefulness, and celebrates noise. Thus, “similar concepts emerge with radically different values when they are embedded within different economies” (Hayles 1987: 131). Hayles’s observation applies to many of the uses information theory has been put to in literary, cultural, and media theory. This is especially the case for reflections on the innovative and subversive potential of noise.

Michel Serres

For instance, Michel Serres, in The Parasite, Genesis, and a variety of essays, appropriates le parasite – informatic “noise” in technical French – as a figure for the excluded third, i.e., for all those objects and people that dualist thinking seeks to exclude:

Science is not necessarily a matter of the one or of order, the multiple and noise are not necessarily the province of the irrational. This can be the case, but it is not always so. The whole set of these divisions delineates the space of noise, the clash of these dichotomies overruns it with noise, simple and naïve, repetitive, strategies of the desire for domination. To think in terms of pairs is to make ready some dangerous weapon, arrows, darts, dovetails, whereby to hold space and kill. To think by negation is not to think. Dualism tries to start a ruckus [chercher noise], make noise, it relates to death alone. It puts to death and it maintains death. Death to the parasite, someone says, without seeing that a parasite is put to death only by a stronger parasite. (Serres 1997: 131)

It is in line with these observations that Jacques Attali, in Noise: The Political Economy of Music, champions the improvisational sounding practices of what he calls “composition”: “the conquest of the right to make noise, in other words, to create one’s own code and work, without advertising its goal in advance” (Attali 1985: 132). For Attali, such practices are prophetic; they herald “the emergence of a formidable subversion, one leading to a radically new organization never yet theorized” (Attali 1985: 6).

In The Noise of Culture: Literary Texts in a World of Information, William R. Paulson draws on Serres’s work, information theory, and theoretical biology, as well as Russian and Belgian formalism to reflect on the function of literature in a world that is structured increasingly around the production, circulation, and exchange of machine-readable, clear information. Acknowledging the marginality of literature in the information age, Paulson contends that the social function of literature today may best be described as “the noise of culture”:

Literature is not and will not ever again be at the center of culture, if indeed it ever was. There is no use in either proclaiming or debunking its central position. Literature is the noise of culture, the rich and indeterminate margin into which messages are sent off, never to return the same, in which signals are received not quite like anything emitted. (Paulson 1988: 180)

Friedrich A. Kittler

Other uses of information theory in the humanities and social sciences can be traced in Friedrich A. Kittler’s media archeology, which starts from the assumption that “media determine our situation” (Kittler 1999: xxxix) and finds its most influential expression in two of Kittler’s major books, Discourse Networks 1800/1900 and Gramophone, Film, Typewriter. Kittler’s “informational-theoretical materialism” is shared by a host of other German media theorists – the so-called “Berlin School” of media theory – among them Bernhard Dotzler, Wolfgang Ernst, and Bernhard Siegert, whose Relays: Literature as an Epoch of the Postal System (1999) is one of the most fascinating books to come out of that tradition. Well before Kittler, Max Bense inaugurated another German tradition of technology-centered media theory, the “Stuttgart School.” Bense’s informational aesthetics considers acts of selection as the most fundamental link between art and mathematics and is particularly interested in the interplay of order and complexity in works of art.

So far, we have sketched the basic assumptions of Shannon’s theory of communication and some of its uses in literary, cultural, and media theory. Concerning the intersections of information theory and literature, there is, however, a second avenue to explore, if only very briefly: the impact of information theory on the literary imagination. Many writers have drawn on information theory: Joseph Heller in Something Happened (1974), William Gibson in Neuromancer (1984), Don DeLillo in White Noise (1985), David Foster Wallace in The Broom of the System (1987), Richard Powers in The Gold Bug Variations (1991), Neal Stephenson in Snow Crash (1992) and Cryptonomicon (1999), and Greg Bear in Dead Lines: a novel of life … after death (2004), to name but a few. Yet Thomas Pynchon’s The Crying of Lot 49 (1966) remains the most prominent example of such a text, and the remainder of this chapter discusses that novel as a paradigmatic case.

Like earlier writers such as H.G. Wells in The Time Machine (1895) and Henry Adams in The Education of Henry Adams (1907/1918), Pynchon draws on the thermodynamic notion of entropy to draw a gloomy picture of the Earth as moving toward heat death, i.e., to the gradual but complete dissipation of energy predicted by the nineteenth-century physicist Hermann von Helmholtz, who considered the world a closed thermodynamic system subject to the irreversible increase in entropy postulated by the second law of thermodynamics (Freese 1997: 99–105). But in Pynchon’s fictional world, thermodynamic entropy is counteracted by a second type of entropy: informational entropy. While the thermodynamic world of von Helmholtz and Henry Adams knew entropy only as dissipation of energy, the informational world of Shannon and Pynchon has learned to distinguish between two types of entropy with contrary connotations.

Thomas Pynchon

In Pynchon’s novel, an encounter between informational entropy and thermodynamic entropy is played out in a machine built by John Nefastis. Nefastis claims that his apparatus reverses the process of entropic increase and thus refutes the second law of thermodynamics. Thus, the Nefastis Machine would make James Clerk Maxwell’s thought experiment come true: the idea that a Demon who sorts out the slower- and faster-moving molecules within a closed system could halt entropic degradation and produce a perpetual motion machine. Nefastis’s apparatus requires a psychic who can communicate with Maxwell’s Demon:

“Communication is the key,” cried Nefastis. “The Demon passes his data on to the sensitive, and the sensitive must reply in kind. There are untold billions of molecules in that box. The Demon collects data on each and every one. At some deep psychic level he must get through. The sensitive must receive that staggering set of energies, and feed back something like the same quantity of information. To keep it all cycling.” (Pynchon 1966: 72–73)

In Nefastis’s scheme, an exchange of information between a sensitive and the Demon allows the machine to wage a battle against the increase in thermodynamic entropy. While Pynchon casts the viability of Nefastis’s apparatus into doubt, the competition staged in it between thermodynamic and informational entropy plays out on a larger scale throughout Pynchon’s novel. The cultural inertia of Southern California depicted at the beginning of the narrative, its “unvarying gray sickness” (Pynchon 1966: 14), corresponds to a state near thermodynamic equilibrium or maximum entropy, at which the system has come to an almost complete standstill. But in the course of the novel the movement toward entropic degradation is countered by repeated injections of informational entropy or noise into the system: the Paranoids’ “shuddering deluge of thick guitar sounds” (25), the cryptic messages relayed by the underground mail delivery system W.A.S.T.E., and the communication networks of the 1960s counterculture more generally. The battle between thermodynamic and informational entropy ends indecisively in Pynchon’s novel, but in its staging of that battle, The Crying of Lot 49 stands as a powerful monument to the energy that the fusion of literature and science can release. What enables Pynchon’s novel to do that, though, is not only its negotiation of information and noise at the plot level, but also its recalcitrant literary form – its fragmented plot structure, multiple indeterminacies, complex system of intertextual references, and refusal of narrative closure – which challenges conventionalized language uses and thus injects noise into the system of cultural communication.

Reference: Clarke, Bruce, and Manuela Rossini. The Routledge Companion To Literature And Science. London: Routledge, 2012. Print.

Notes
1 See, however, Hayles’s account of an alternative, British tradition within information theory (Hayles 1999: 18–19, 54–57, 63) initiated by Donald M. MacKay, whose model of structural information – contrary to Shannon’s probabilistic model – takes meaning into account (MacKay 1969). Apart from Shannon, MacKay, and Norbert Wiener, a fourth contender for the title of “father of information theory” would be Dennis Gábor (1946), now best remembered as the inventor of holography.
2 Hence, the line of influence does not run solely from Shannon to Wiener (and thus from information theory to cybernetics) but also in the opposite direction.
4 Hence Luciano Floridi’s suggestion that Shannonian information theory might more accurately be labeled “mathematical theory of data communication” (Floridi 2004: 52).
5 In human oral communication, the information source corresponds to the brain of the speaker; the transmitter to the physical speech apparatus (vocal cords, oral cavity, tongue, etc.), which transforms the message into a coded signal that is sent over the communication channel (the air); the receiver to the ear of the hearer; the destination to the brain of the hearer.
6 Engineering noise, however, corresponds not only to the signifier but also to the writing tool that inscribes it (e.g., a defective keyboard).
7 This does not amount to suggesting that the meaning of a literary text can be equated with the author’s intention. Note also that it is at least debatable to what extent Shannon’s notion of intentionality corresponds to that of literary critics. However far we may have traveled from the intentional fallacies of earlier critics, the question of intentionality haunts any use of information theory within literary studies.
8 See also Heims (1993), who situates the beginnings of information theory and cybernetics in their military-industrial contexts.

Bibliography
Adams, H. (1907/1918) The Education of Henry Adams, London: Penguin, 1995.
Ash, R. (1965) Information Theory, New York: John Wiley and Sons.
Attali, J. (1985) Noise: the political economy of music, Minneapolis: University of Minnesota Press.
Bear, G. (2004) Dead Lines: a novel of life … after death, New York: Ballantine.
Bense, M. (1982) Aesthetica: Einführung in die neue Aesthetik, Baden-Baden: Agis.
Boltzmann, L. (1896–98) Vorlesungen über Gastheorie, Leipzig: Johann Ambrosius Barth.
Chaitin, G.J. (1987) Algorithmic Information Theory, Cambridge: Cambridge University Press.
DeLillo, D. (1985) White Noise, New York: Viking.
Floridi, L. (2004) “Information,” in L. Floridi (ed.) The Blackwell Guide to the Philosophy of Computing and Information, Malden: Blackwell, pp. 40–61.
Freese, P. (1997) From Apocalypse to Entropy and Beyond: the second law of thermodynamics in postwar American fiction, Essen: Die Blaue Eule.
Gábor, D. (1946) “Theory of communication,” The Journal of the Institution of Electrical Engineers, 93(26): 429–57.
Gibbs, J.W. (1876, 1878) “On the equilibrium of heterogeneous substances,” Transactions of the Connecticut Academy of Sciences, 3: 108–248, 343–524.
Gibson, W. (1984) Neuromancer, New York: Ace Books, 1994.
Hall, S. (1980) “Encoding/decoding,” in S. Hall, D. Hobson, A. Lowe, and P. Willis (eds) Culture, Media, Language: working papers in cultural studies, 1972–1979, London: Hutchinson, pp. 128–38.
Hartley, R.V.L. (1928) “Transmission of information,” Bell System Technical Journal, 7: 535–63.
Hayles, N.K. (1987) “Information or noise? economy of explanation in Barthes’s S/Z and Shannon’s information theory,” in G. Levine (ed.) One Culture: essays in science and literature, Madison: University of Wisconsin Press, pp. 119–42.
——(1999) How We Became Posthuman: virtual bodies in cybernetics, literature, and informatics, Chicago: University of Chicago Press.
Heims, S.J. (1993) Constructing a Social Science for Postwar America: the cybernetics group (1946–1953), Cambridge, Mass.: MIT Press.
Heller, J. (1975) Something Happened, New York: Ballantine.
Kittler, F.A. (1990) Discourse Networks 1800/1900, Stanford: Stanford University Press.
——(1999) Gramophone, Film, Typewriter, trans. G. Winthrop-Young and M. Wutz, Stanford: Stanford University Press.
MacKay, D.M. (1969) Information, Mechanism and Meaning, Cambridge, Mass.: MIT Press.
Neumann, J. v. (1932) Mathematische Grundlagen der Quantenmechanik, Berlin: Springer.
Nyquist, H. (1924) “Certain factors affecting telegraph speed,” Bell System Technical Journal, 3: 324–46.
——(1928) “Certain topics in telegraph transmission theory,” Transactions of the American Institute for Electrical Engineers, 47: 617–44.
Paulson, W.R. (1988) The Noise of Culture: literary texts in a world of information, Ithaca, N.Y.: Cornell University Press.
Powers, R. (1992) The Gold Bug Variations, New York: Harper Perennial.
Pynchon, T. (1966) The Crying of Lot 49, London: Picador, 1979.
Schweighauser, P. (2006) The Noises of American Literature, 1890–1985: toward a history of literary acoustics, Gainesville: University Press of Florida.
Serres, M. (1982) The Parasite, Baltimore: Johns Hopkins University Press.
——(1997) Genesis, Ann Arbor: University of Michigan Press.
Shannon, C.E. (1956) “The bandwagon,” Institute of Radio Engineers Transactions on Information Theory, 2: 3.
——and Weaver, W. (1963) The Mathematical Theory of Communication, Chicago: University of Illinois Press.
Siegert, B. (1999) Relays: literature as an epoch of the postal system, Stanford: Stanford University Press.
Sokal, A.D. (2008) Beyond the Hoax: science, philosophy and culture, Oxford: Oxford University Press.
——and Bricmont, J. (1998) Fashionable Nonsense: postmodern intellectuals’ abuse of science, New York: Picador.
Stephenson, N. (1993) Snow Crash, New York: Bantam.
——(1999) Cryptonomicon, New York: Avon.
Szilárd, L. (1929) “Über die Entropieverminderung in einem thermodynamischen System bei Eingriffen intelligenter Wesen,” Zeitschrift für Physik, 53(11–12): 840–56.
Verdú, S. (1998) “Fifty years of Shannon theory,” IEEE Transactions on Information Theory, 44(6): 2057–78.
Wallace, D.F. (1987) The Broom of the System, New York: Viking.
Wells, H.G. (1895) The Time Machine, New York: W.W. Norton, 2008.
Wiener, N. (1942) The Extrapolation, Interpolation, and Smoothing of Stationary Time Series, Cambridge: NDRC Report 370.
——(1948) Cybernetics or Control and Communication in the Animal and the Machine, Cambridge, Mass.: MIT Press.