Complexity - A Guided Tour - novelonlinefull.com
You’re read light novel Complexity - A Guided Tour Part 3 online at NovelOnlineFull.com. Please use the follow button to get notification about the latest chapter next time when you visit NovelOnlineFull.com. Use F11 button to read novel in full-screen(PC only). Drop by anytime you want to read free – fast – latest novel. It’s great if you could leave a comment, share your opinion about the new chapters, new novel with others on the internet. We’ll do our best to bring you the finest, latest novel everyday. Enjoy
Leo Szilard, 18981964 (AIP Emilio Segre Visual Archives).
From our twenty-first-century vantage, it may seem obvious (or at least unsurprising) that the acquisition of information requires expenditure of work. But at the time of Maxwell, and even sixty years later when Szilard wrote his famous paper, there was still a strong tendency in people's minds to view physical and mental processes as completely separate. This highly ingrained intuition may be why Maxwell, as astute as he was, did not see the "intelligence" or "observing powers" of the demon as relating to the thermodynamics of the box-molecules-demon system. Such relationships between information and physics became clear only in the twentieth century, beginning with the discovery that the "observer" plays a key role in quantum mechanics.
Szilard's theory was later extended and generalized by the French physicists Leon Brillouin and Denis Gabor. Many scientists of the 1950s and later believed that Brillouin's theory in particular had definitively finished off the demon by demonstrating in detail how making a measurement entails an increase of entropy.
However, it wasn't over yet. Fifty years after Szilard's paper, it was discovered that there were some holes in Szilard's and Brillouin's solutions as well. In the 1980s, the mathematician Charles Bennett showed that there are very clever ways to observe and remember information-in the demon's case, whether an air molecule is fast or slow-without increasing entropy. Bennett's remarkable demonstration of this formed the basis for reversible computing, which says that, in theory, any computation can be done without expending energy. Bennett's discoveries might seem to imply that we are back at square one with the demon, since measurement can, in fact, be done without increasing entropy. However, Bennett noted that the second law of thermodynamics was saved again by an earlier discovery made in the 1960s by physicist Rolf Landauer: it is not the act of measurement, but rather the act of erasing memory that necessarily increases entropy. Erasing memory is not reversible; if there is true erasure, then once the information is gone, it cannot be restored without additional measurement. Bennett showed that for the demon to work, its memory must be erased at some point, and when it is, the physical act of this erasure will produce heat, thus increasing entropy by an amount exactly equal to the amount entropy was decreased by the demon's sorting actions.
Landauer and Bennett's solution to the paradox of Maxwell's demon fixed holes in Szilard's solution, but it was in the same spirit: the demon's act of measurement and decision making, which requires erasure, will inevitably increase entropy, and the second law is saved. (I should say here that there are still some physicists who don't buy the Landauer and Bennett solution; the demon remains controversial to this day.) Maxwell invented his demon as a simple thought experiment to demonstrate his view that the second law of thermodynamics was not a law but a statistical effect. However, like many of the best thought-experiments in science, the demon's influence was much broader: resolutions to the demon paradox became the foundations of two new fields: information theory and the physics of information.
Statistical Mechanics in a Nutsh.e.l.l.
In an earlier section, I defined "entropy" as a measure of the energy that cannot be converted into additional work but is instead transformed into heat. This notion of entropy was originally defined by Rudolph Clausius in 1865. At the time of Clausius, heat was believed to be a kind of fluid that could move from one system to another, and temperature was a property of a system that affected the flow of heat.
In the next few decades, a different view of heat emerged in the scientific community: systems are made up of molecules, and heat is a result of the motion, or kinetic energy, of those molecules. This new view was largely a result of the work of Ludwig Boltzmann, who developed what is now called statistical mechanics.
Ludwig Boltzmann, 18441906 (AIP Emilio Segre Visual Archives, Segre Collection).
Statistical mechanics proposes that large-scale properties (e.g., heat) emerge from microscopic properties (e.g., the motions of trillions of molecules). For example, think about a room full of moving air molecules. A cla.s.sical mechanics a.n.a.lysis would determine the position and velocity of each molecule, as well as all the forces acting on that molecule, and would use this information to determine the future position and velocity of that molecule. Of course, if there are fifty quadrillion molecules, this approach would take rather a long time-in fact it always would be impossible, both in practice and, as quantum mechanics has shown, in principle. A statistical mechanics approach gives up on determining the exact position, velocity, and future behavior of each molecule and instead tries to predict the average positions and velocities of large ensembles of molecules.
In short, cla.s.sical mechanics attempts to say something about every single microscopic ent.i.ty (e.g., molecule) by using Newton's laws. Thermodynamics gives laws of macroscopic ent.i.ties-heat, energy, and entropy-without acknowledging that any microscopic molecules are the source of these macroscopic ent.i.ties. Statistical mechanics is a bridge between these two extremes, in that it explains how the behavior of the macroscopic ent.i.ties arise from statistics of large ensembles of microscopic ent.i.ties.
There is one problem with the statistical approach-it gives only the probable behavior of the system. For example, if all the air molecules in a room are flying around randomly, they are most likely to be spread out all over the room, and all of us will have enough air to breathe. This is what we predict and depend on, and it has never failed us yet. However, according to statistical mechanics, since the molecules are flying around randomly, there is some very small chance that at some point they will all fly over to the same corner at the same time. Then any person who happened to be in that corner would be crushed by the huge air pressure, and the rest of us would suffocate from lack of air. As far as I know, such an event has never happened in any room anywhere. However, there is nothing in Newton's laws that says it can't happen; it's just incredibly unlikely. Boltzmann reasoned that if there are enough microscopic ent.i.ties to average over, his statistical approach will give the right answer virtually all the time, and indeed, in practice it does so. But at the time Boltzmann was formulating his new science, the suggestion that a physical law could apply only "virtually all of the time" rather than exactly all of the time was repellent to many other scientists. Furthermore, Boltzmann's insistence on the reality of microscopic ent.i.ties such as molecules and atoms was also at odds with his colleagues. Some have speculated that the rejection of his ideas by most of his fellow scientists contributed to his suicide in 1906, at the age of 62. Only years after his death were his ideas generally accepted; he is now considered to be one of the most important scientists in history.
Microstates and Macrostates.
Given a room full of air, at a given instant in time each molecule has a certain position and velocity, even if it is impossible to actually measure all of them. In statistical mechanics terminology, the particular collection of exact molecule positions and velocities at a given instant is called the microstate of the whole room at that instant. For a room full of air molecules randomly flying around, the most probable type of microstate at a given time is that the air molecules are spread uniformly around the room. The least probable type of microstate is that the air molecules are all clumped together as closely as possible in a single location, for example, the corner of the room. This seems simply obvious, but Boltzmann noted that the reason for this is that there are many more possible microstates of the system in which the air molecules are spread around uniformly than there are microstates in which they all are clumped together.
The situation is a.n.a.logous to a slot machine with three rotating pictures (figure 3.2). Suppose each of the three pictures can come up "apple," "orange," "cherry," "pear," or "lemon." Imagine you put in a quarter, and pull the handle to spin the pictures. It is much more likely that the pictures will all be different (i.e., you lose your money) than that the pictures will all be the same (i.e., you win a jackpot). Now imagine such a slot machine with fifty quadrillion pictures, and you can see that the probability of all coming up the same is very close to zero, just like the probability of the air molecules ending up all clumped together in the same location.
FIGURE 3.2. Slot machine with three rotating fruit pictures, ill.u.s.trating the concepts microstate and macrostate. (Drawing by David Moser.).
A type of microstate, for example, "pictures all the same-you win" versus "pictures not all the same-you lose" or "molecules clumped together-we can't breathe" versus "molecules uniformly spread out-we can breathe," is called a macrostate of the system. A macrostate can correspond to many different microstates. In the slot machine, there are many different microstates consisting of three nonidentical pictures, each of which corresponds to the single "you lose" macrostate, and only a few microstates that correspond to the "you win" macrostate. This is how casinos are sure to make money. Temperature is a macrostate-it corresponds to many different possible microstates of molecules at different velocities that happen to average to the same temperature.
Using these ideas, Boltzmann interpreted the second law of thermodynamics as simply saying that an isolated system will more likely be in a more probable macrostate than in a less probable one. To our ears this sounds like a tautology but it was a rather revolutionary way of thinking about the point back then, since it included the notion of probability. Boltzmann defined the entropy of a macrostate as a function of the number of microstates that could give rise to that macrostate. For example, on the slot machine of figure 3.2, where each picture can come up "apple," "orange," "cherry," "pear," or "lemon," it turns out that there are a total of 125 possible combinations (microstates), out of which five correspond to the macrostate "pictures all the same-you win" and 120 correspond to the macrostate "pictures not all the same-you lose." The latter macrostate clearly has a higher Boltzmann entropy than the former.
FIGURE 3.3. Boltzmann's tombstone, in Vienna. (Photograph courtesy of Martin Roell.).
Boltzmann's entropy obeys the second law of thermodynamics. Unless work is done, Boltzmann's entropy will always increase until it gets to a macrostate with highest possible entropy. Boltzmann was able to show that, under many conditions, his simple and intuitive definition of entropy is equivalent to the original definition of Clausius.
The actual equation for Boltzmann's entropy, now so fundamental to physics, appears on Boltzmann's tombstone in Vienna (figure 3.3).
Shannon Information.
Many of the most basic scientific ideas are spurred by advances in technology. The nineteenth-century studies of thermodynamics were inspired and driven by the challenge of improving steam engines. The studies of information by mathematician Claude Shannon were likewise driven by the twentieth-century revolution in communications-particularly the development of the telegraph and telephone. In the 1940s, Shannon adapted Boltzmann's ideas to the more abstract realm of communications. Shannon worked at Bell Labs, a part of the American Telephone and Telegraph Company (AT&T). One of the most important problems for AT&T was to figure out how to transmit signals more quickly and reliably over telegraph and telephone wires.
Claude Shannon, 19162001. (Reprinted with permission of Lucent Technologies Inc./Bell Labs.) Shannon's mathematical solution to this problem was the beginning of what is now called information theory. In his 1948 paper "A Mathematical Theory of Communication," Shannon gave a narrow definition of information and proved a very important theorem, which gave the maximum possible transmission rate of information over a given channel (wire or other medium), even if there are errors in transmission caused by noise on the channel. This maximum transmission rate is called the channel capacity.
Shannon's definition of information involves a source that sends messages to a receiver. For example, figure 3.4 shows two examples of a source talking to a receiver on the phone. Each word the source says can be considered a message in the Shannon sense. Just as the telephone doesn't understand the words being said on it but only transmits the electrical pulses used to encode the voice, Shannon's definition of information completely ignores the meaning of the messages and takes into account only how often the source sends each of the possible different messages to the receiver.
FIGURE 3.4. Top: Information content (zero) of Nicky's conversation with Grandma. Bottom: Higher information content of Jake's conversation with Grandma. (Drawings by David Moser.) Shannon asked, "How much information is transmitted by a source sending messages to a receiver?" In a.n.a.logy with Boltzmann's ideas, Shannon defined the information of a macrostate (here, a source) as a function of the number of possible microstates (here, ensembles of possible messages) that could be sent by that source. When my son Nicky was barely a toddler, I would put him on the phone to talk with Grandma. He loved to talk on the phone, but could say only one word-"da." His messages to Grandma were "da da da da da...." In other words, the Nicky-macrostate had only one possible microstate (sequences of "da"s), and although the macrostate was cute, the information content was, well, zero. Grandma knew just what to expect. My son Jake, two years older, also loved to talk on the phone but had a much bigger vocabulary and would tell Grandma all about his activities, projects, and adventures, constantly surprising her with his command of language. Clearly the information content of the Jake-source was much higher, since so many microstates-i.e., more different collections of messages-could be produced.
Shannon's definition of information content was nearly identical to Boltzmann's more general definition of entropy. In his cla.s.sic 1948 paper, Shannon defined the information content in terms of the entropy of the message source. (This notion of entropy is often called Shannon entropy to distinguish it from the related definition of entropy given by Boltzmann.) People have sometimes characterized Shannon's definition of information content as the "average amount of surprise" a receiver experiences on receiving a message, in which "surprise" means something like the "degree of uncertainty" the receiver had about what the source would send next. Grandma is clearly more surprised at each word Jake says than at each word Nicky says, since she already knows exactly what Nicky will say next but can't as easily predict what Jake will say next. Thus each word Jake says gives her a higher average "information content" than each word Nicky says.
In general, in Shannon's theory, a message can be any unit of communication, be it a letter, a word, a sentence, or even a single bit (a zero or a one). Once again, the entropy (and thus information content) of a source is defined in terms of message probabilities and is not concerned with the "meaning" of a message.
Shannon's results set the stage for applications in many different fields. The best-known applications are in the field of coding theory, which deals with both data compression and the way codes need to be structured to be reliably transmitted. Coding theory affects nearly all of our electronic communications; cell phones, computer networks, and the worldwide global positioning system are a few examples.
Information theory is also central in cryptography and in the relatively new field of bioinformatics, in which entropy and other information theory measures are used to a.n.a.lyze patterns in gene sequences. It has also been applied to a.n.a.lysis of language and music and in psychology, statistical inference, and artificial intelligence, among many other fields. Although information theory was inspired by notions of entropy in thermodynamics and statistical mechanics, it is controversial whether or not information theory has had much of a reverse impact on those and other fields of physics. In 1961, communications engineer and writer John Pierce quipped that "efforts to marry communication theory and physics have been more interesting than fruitful." Some physicists would still agree with him. However, there are a number of new approaches to physics based on concepts related to Shannon's information theory (e.g., quantum information theory and the physics of information) that are beginning to be fruitful as well as interesting.
As you will see in subsequent chapters, information theoretic notions such as entropy, information content, mutual information, information dynamics, and others have played central though controversial roles in attempts to define the notion of complexity and in characterizing different types of complex systems.
CHAPTER 4.
Computation.
Quo facto, quando orientur controversiae, non magis disputatione opus erit inter duos philosophos, quam inter duos Computistas. Sufficiet enim calamos in ma.n.u.s sumere sedereque ad abacos, et sibi mutuo dicere: Calculemus!
[If controversies were to arise, there would be no more need of disputation between two philosophers than between two accountants. For it would suffice to take their pencils in their hands, to sit down to their slates, and say to each other, "Let us calculate."]
-G. Leibniz (Trans. B. Russell).
PEOPLE USUALLY THINK of computation as the thing that a computer does, such as calculations in spreadsheets, word processing, e-mail, and the like. And they usually think of a computer as the machine on one's desk (or lap) that has electronic circuits inside, and usually a color monitor and a mouse, and that in the distant past used old-fashioned technology such as vacuum tubes. We also have a vague idea that our brains themselves are roughly like computers, with logic, memory, input, and output.
However, if you peruse some of the scholarly literature or seminar t.i.tles in complex systems, you will find the word computation used in some rather unfamiliar contexts: a biology book on "computation in cells and tissues"; a keynote lecture on "immune system computation"; an economics lecture concerning "the nature and limits of distributed computation in markets"; an article in a prestigious science journal on "emergent computation in plants." And this is just a small sampling of such usage.
The notion of computation has come a long way since the early days of computers, and many scientists now view the phenomenon of computation as being widespread in nature. It is clear that cells, tissues, plants, immune systems, and financial markets do not work anything like the computer on your desk, so what exactly do these people mean by computation, and why do they call it that?
In order to set the stage for addressing this question in chapter 12, this chapter gives an overview of the history of ideas about computation and what can be computed, and describes basics of computational concepts used by scientists to understand natural complex systems.
What Is Computation and What Can Be Computed?
Information, as narrowly defined by Shannon, concerns the predictability of a message source. In the real world, however, information is something that is a.n.a.lyzed for meaning, that is remembered and combined with other information, and that produces results or actions. In short, information is processed via computation.
The meaning of computation has changed dramatically over the years. Before the late 1940s, computing meant performing mathematical calculations by hand (what nineteenth-century British schoolboys would have called "doing sums"). Computers were people who did such calculations. One of my former professors, Art Burks, used to tell us how he had married a "computer"-the term used for women who were enlisted during World War II to hand-calculate ballistic trajectories. Alice Burks was working as such a computer when she met Art.
Nowadays computation is what computers of the electronic variety do and what natural complex systems seem to do as well. But what exactly is computation, and how much can it accomplish? Can a computer compute anything, in principle, or does it have any limits? These are questions that were answered only in the mid-twentieth century.
Hilbert's Problems and G.o.del's Theorem.
The study of the foundations and limitations of computation, which led to the invention of electronic computers, was developed in response to a set of seemingly abstract (and abstruse) math problems. These problems were posed in the year 1900 at the International Congress of Mathematicians in Paris by the German mathematician David Hilbert.
David Hilbert, 18621943 (AIP Emilio Segre Visual Archives, Lande Collection).
Hilbert's lecture at this congress set out a list of mathematical New Year's resolutions for the new century in the form of twenty-three of the most important unsolved problems in mathematics. Problems 2 and 10 ended up making the biggest splash. Actually, these were not just problems in mathematics; they were problems about mathematics itself and what can be proved by using mathematics. Taken together, these problems can be stated in three parts as follows: Is mathematics complete? That is, can every mathematical statement be proved or disproved from a given finite set of axioms?
For example, remember Euclid's axioms from high-school geometry? Remember using these axioms to prove statements such as "the angles in a triangle sum to 180 degrees"? Hilbert's question was this: Given some fixed set of axioms, is there a proof for every true statement?
Is mathematics consistent? In other words, can only the true statements be proved? The notion of what is meant by "true statement" is technical, but I'll leave it as intuitive. For example, if we could prove some false statement, such as 1 + 1 = 3, mathematics would be inconsistent and in big trouble.
Is every statement in mathematics decidable? That is, is there a definite procedure that can be applied to every statement that will tell us in finite time whether or not the statement is true or false? The idea here is that you could come up with a mathematical statement such as, "Every even integer greater than 2 can be expressed as the sum of two prime numbers," hand it to a mathematician (or a computer), who would apply a precise recipe (a "definite procedure"), which would yield the correct answer "true" or "false" in finite time.
Kurt G.o.del, 19061978 (Photograph courtesy of Princeton University Library) The last question is known by its German name as the Entscheidungsproblem ("decision problem"), and goes back to the seventeenth-century mathematician Gottfried Leibniz. Leibniz actually built his own calculating machine, and believed that humans could eventually build a machine that could determine the truth or falsity of any mathematical statement.
Up until 1930, these three problems remained unsolved, but Hilbert expressed confidence that the answer would be "yes" in each case, and a.s.serted that "there is no such thing as an unsolvable problem."
However, his optimism turned out to be short-lived. Very short-lived. At the same meeting in 1930 at which Hilbert made his confident a.s.sertion, a twenty-five-year-old mathematician named Kurt G.o.del astounded the mathematical world by presenting a proof of the so-called incompleteness theorem. This theorem stated that if the answer to question 2 above is "yes" (i.e., mathematics is consistent), then the answer to question 1 (is mathematics complete?) has to be "no."
G.o.del's incompleteness theorem looked at arithmetic. He showed that if arithmetic is consistent, then there are true statements in arithmetic that cannot be proved-that is, arithmetic is incomplete. If arithmetic were inconsistent, then there would be false statements that could be proved, and all of mathematics would come crashing down.
G.o.del's proof is complicated. However, intuitively, it can be explained very easily. G.o.del gave an example of a mathematical statement that can be translated into English as: "This statement is not provable."
Think about it for a minute. It's a strange statement, since it talks about itself-in fact, it a.s.serts that it is not provable. Let's call this statement "Statement A." Now, suppose Statement A could indeed be proved. But then it would be false (since it states that it cannot be proved). That would mean a false statement could be proved-arithmetic would be inconsistent. Okay, let's a.s.sume the opposite, that Statement A cannot be proved. That would mean that Statement A is true (because it a.s.serts that it cannot be proved), but then there is a true statement that cannot be proved-arithmetic would be incomplete. Ergo, arithmetic is either inconsistent or incomplete.
It's not easy to imagine how this statement gets translated into mathematics, but G.o.del did it-therein lies the complexity and brilliance of G.o.del's proof, which I won't cover here.
This was a big blow for the large number of mathematicians and philosophers who strongly believed that Hilbert's questions would be answered affirmatively. As the mathematician and writer Andrew Hodges notes: "This was an amazing new turn in the enquiry, for Hilbert had thought of his programme as one of tidying up loose ends. It was upsetting for those who wanted to find in mathematics something that was perfect and una.s.sailable...."
Turing Machines and Uncomputability.
While G.o.del dispatched the first and second of Hilbert's questions, the British mathematician Alan Turing killed off the third.
In 1935 Alan Turing was a twenty-three-year-old graduate student at Cambridge studying under the logician Max Newman. Newman introduced Turing to G.o.del's recent incompleteness theorem. When he understood G.o.del's result, Turing was able to see how to answer Hilbert's third question, the Entscheidungsproblem, and his answer, again, was "no."
How did Turing show this? Remember that the Entscheidungsproblem asks if there is always a "definite procedure" for deciding whether a statement is provable. What does "definite procedure" mean? Turing's first step was to define this notion. Following the intuition of Leibniz of more than two centuries earlier, Turing formulated his definition by thinking about a powerful calculating machine-one that could not only perform arithmetic but also could manipulate symbols in order to prove mathematical statements. By thinking about how humans might calculate, he constructed a mental design of such a machine, which is now called a Turing machine. The Turing machine turned out to be a blueprint for the invention of the electronic programmable computer.
Alan Turing, 19121954 (Photograph copyright 2003 by Photo Researchers Inc. Reproduced by permission.).
A QUICK INTRODUCTION TO TURING MACHINES.
As ill.u.s.trated in figure 4.1, a Turing machine consists of three parts: (1) A tape, divided into squares (or "cells"), on which symbols can be written and from which symbols can be read. The tape is infinitely long in both directions. (2) A movable read/write tape head that reads symbols from the tape and writes symbols to the tape. At any time, the head is in one of a number of states. (3) A set of rules that tell the head what to do next.
The head starts out at a particular tape cell and in a special start state. At each time step, the head reads the symbol at its current tape cell. The head then follows the rule that corresponds to that symbol and the head's current state. The rule tells the head what symbol to write on the current tape cell (replacing the previous symbol); whether the head should move to the right, move to the left, or stay put; and what the head's new state is. When the head goes into a special halt state, the machine is done and stops.
FIGURE 4.1. Ill.u.s.tration of a Turing machine.
The input to the machine is the set of symbols written on the tape before the machine starts. The output from the machine is the set of symbols written on the tape after the machine halts.
A SIMPLE EXAMPLE.