
Sunday, June 22, 2014

On quantum measurement (Part 1: Hans Bethe, the oracle)

For this series of posts, I'm going to take you on a ride through the bewildering jungle that is quantum measurement. I've no idea how many parts will be enough, but I'm fairly sure there will be more than one. After all, the quantum mechanics of measurement has been that subject's "mystery of mysteries" for ages now.

Before we begin, I should tell you how I became interested in the quantum measurement problem. Because for the longest time I wasn't. During graduate school (at the University of Bonn), the usual thing happened: the Prof (in my case Prof. Werner Sandhas, who I hope turned eighty this past April 14th) says that they'll tell us about quantum measurement towards the end of the semester, and then never actually gets there. I have developed a sneaking suspicion that this happened a lot, in quantum mechanics classes everywhere, every time. Which would explain a lot of the confusion that still reigns.

However, to tell you how I became interested in this problem is a little difficult, because I risk embarrassing myself. The embarrassment that I'm risking is not the usual type. It is because the story that I will tell you will seem utterly ridiculous, outrageously presumptuous, and altogether improbable. But it occurred just as I will attempt to tell it. There is one witness to this story, my collaborator in this particular endeavor, the Belgian theoretical physicist Nicolas Cerf.  

Now, because Nicolas and I worked together very closely on a number of different topics in quantum information theory when we shared an office at Caltech, you might surmise that he would corroborate any story I write (and thus not be an independent witness). I'm sure he remembers the story (wait for it, I know I'm teasing) differently, but you would have to ask him. All I can say is that this is how I remember it.

Nicolas and I had begun to work in quantum information theory around 1995-1996. After a while we were studying the quantum communication protocols of quantum teleportation and quantum superdense coding, and in our minds (that is, our manner of counting), information did not add up. But, we thought, information must be conserved. We were certain. (Obviously that has been an obsession of mine for a while, as those of you who have read my black hole stories will think to yourselves).
Space-time diagrams for the quantum teleportation process (a) and superdense coding process (b). EPR stands for an entangled Einstein-Podolsky-Rosen pair. Note the information values for the various classical and quantum bits in red. Adapted from Ref. [1]. The letters 'M' and 'U' stand for a measurement and a unitary operation, respectively. A and B are the communication partners 'Alice' and 'Bob'.

But information cannot be conserved, we realized, unless you can have negative bits. Negative entropy: anti-qubits (see the illustration above). This discovery of ours is by now fairly well-known (so well-known, in fact, that sometimes articles about negative quantum entropy don't seem to feel it necessary to refer to our original paper at all). But it is only the beginning of the story (ludicrous as it may well appear to you) that I want to tell. 

After Nicolas and I wrote the negative entropy paper, we realized that quantum measurement was, after all, reversible. That fact was obvious once you understood these quantum communication protocols, but it was even more obvious once you understood the quantum erasure experiment. Well, for all we knew, this was flying in the face of accepted lore, which (ever since Niels Bohr) would maintain that quantum measurement required an irreversible collapse of the quantum wavefunction. Ordinarily, I would now put up a picture of the Danish physicist who championed wave function collapse, but I cannot bring myself to do it: I have come to loathe the man. I'm sure I'm being petty here.

With this breakthrough discovery in mind ("Quantum measurement is reversible!") Nicolas and I went to see Hans Bethe, who was visiting Caltech at the time. At this point, Hans and I had become good friends, as he visited Caltech regularly. I wrote up my recollections of my first three weeks with him (and also our last meeting) in the volume commemorating his life. (If you don't want to buy that book but read the story, try this link. But you should really buy the book: there's other fun stuff in it). The picture below is from Wikipedia, but that is not how I remember him. I first met him when he was 85. 
        
Hans A. Bethe (1906-2005) (Source: Wikimedia)

Alright, enough of the preliminaries. Nicolas Cerf and I decided to ask for Hans's advice, and entered his office, then on the 3rd floor of Caltech's Kellogg Radiation Laboratory. For us, that meant one flight of stairs up. We tell him right away that we think we have discovered something important that is relevant to the physics of quantum measurement, and start explaining our theory. I should tell you that what we have at this point isn't much of a theory: it is the argument, based on negative conditional quantum entropies, that quantum measurement can in principle be reversed.

Hans listens patiently. Once in a while he asks a question that forces us to be more specific.

After we are done, he speaks.

"I am not that much interested in finding that quantum measurement is reversible. What I find much more interesting is that you have solved the quantum measurement problem."

[See Note added later, below.] 

After that, there is a moment of silence. Both Nicolas and I are utterly stunned. 

I am first to ask the obvious. 
"Can you explain to us why?"

You see, it is fairly improbable that a physicist of the caliber of Hans Bethe tells you that you have solved the "mystery of mysteries". Neither Nicolas nor I had seen this coming from a mile away. And we certainly had no idea why he just said that.

We were waiting with--shall we say--bated breath. Put yourself into our position. How would you have reacted? What came after was also wholly unexpected.

After I asked him to explain that last statement, he was silent for--I don't know--maybe three seconds. In a conversation like this, that is bordering on a perceived eternity.

My recollection is fuzzy at this point. Either he began by saying "I can't explain it to you", or he immediately told the story of the Mathematics Professor who lectures on a complex topic and fills blackboard after blackboard, until a student interrupts him and asks: "Can you explain this last step in your derivation to me?"

The Professor answers: "It is obvious". The student insists. "If it is obvious, can you explain it?", and the Professor answers: "It is obvious, but I'll have to get back to you to explain it tomorrow".

At this point in Hans's telling of the story, the atmosphere is a little awkward. Hans tells us that it is obvious that we solved the quantum measurement problem, but he can't tell us exactly why he thinks it is obvious that we did. It certainly is not obvious to us.

I know Hans well enough at this point that I press on. I cannot let that statement go just like that. He did go on to try to explain what he meant. Now of course I wish I had taken notes, but I didn't. But what he said resonated in my mind for a long time (and I suspect that this is true for Nicolas as well). After what he said, we both dropped everything we were doing, and worked only on the quantum measurement problem, for six months, culminating in this paper.

What he said was something like this: "When you make a measurement, its outcome is conditional on the measurements made on that quantum system before that, and so on, giving rise to a long series of measurements, all conditional on each other".

This is nowhere near an exact rendition of what he said. All I remember is him talking about atomic decay, and measuring the product of the decay, and that this is conditional on previous events, and (that is the key thing I remember) that this gives rise to long arcs of successive measurements whose outcomes are conditional on the past, and condition the future.

Both Nicolas and I kept trying to revive that conversation in our memory when we worked on the problem for the six months following. (Hans left Caltech that year the day after our conversation). Hans also commented that our finding had deep implications for quantum statistical mechanics, because it showed that the theory is quite different from the classical theory after all. We did some work on the quantum Maxwell Demon in reaction to that, but never really had enough time to finish it. Other people after us did. But for the six months that followed, Nicolas and I worked with only this thought in our mind:

"He said we solved the problem. Let us find out how!"

In the posts that follow this one, I will try to give you an idea of what it is we did discover (most of it contained in the article mentioned above). You will easily find out that this article isn't published (and I'll happily tell you the story of how that happened some other time). While a good part of what's in that paper did get published ultimately, I think the main story is still untold. I am still attempting to tell it, via a preprint I have about consecutive measurements that I'm also still working on. But consecutive measurements are what Hans was telling us about in that brief session, which changed the scientific lives of both Nicolas and me. He knew what he was talking about, but he didn't know how to tell us just then. It was obvious to him. I hope it will be obvious to me one day too.

Even though the conversation with Hans happened as I described, I should tell you that 18 years after Hans said this to us (and after thinking about it and working on it for quite a while), I don't think he was altogether right. We had solved something, but I don't think we solved "the whole thing". There is more to it. Perhaps much more.

Stay tuned for Part 2, where I will explain the very basics of quantum measurement, what von Neumann had to say about it, as well as what this has to do with Everett and the "Many-worlds" interpretation. And if this all works out as I plan, perhaps I will ultimately get to the point that Hans Bethe certainly did not foresee: that the physics of quantum measurement is intimately linked to Gödel incompleteness. But I'm getting ahead of myself.

[1] N.J. Cerf and C. Adami. Negative entropy and information in quantum mechanics. Phys. Rev. Lett. 79 (1997) 5194-5197.

Note added: upon reading the manuscript again after all this time, I found in the acknowledgements the (I suppose more or less exact) statement that Hans had made. He stated that "negative entropy solves the problem of the reduction of the wave packet". Thus, it appears he did not maintain that we had "solved the measurement problem" as I had written above, only a piece of it.

Part 2 (Some history, and John von Neumann is confused) continues here.

Sunday, June 8, 2014

Whose entropy is it anyway? (Part 2: The so-called Second Law)

This is the second part of the "Whose entropy is it anyway?" series. Part 1: "Boltzmann, Shannon, and Gibbs" is here.

Yes, let's talk about that second law in light of the fact we just established, namely that Boltzmann and Shannon entropy are fundamentally describing the same thing: they are measures of uncertainty applied to different realms of inquiry, making us thankful that Johnny vN was smart enough to see this right away. 

The second law is usually written like this: 

"When an isolated system approaches equilibrium from a non-equilibrium state, its entropy almost always increases"

I want to point out here that this is a very curious law, because there is, in fact, no proof for it. Really, there isn't. Not every thermodynamics textbook is honest enough to point this out, but I was taught this early on, because I learned Thermodynamics from the East-German edition of Landau and Lifshitz's tome "Statistische Physik", which is quite forthcoming about this (in the English translation):

"At the present time, it is not certain whether the law of increase of entropy thus formulated can be derived from classical mechanics"

From that, L&L go on to speculate that the arrow of time may be a consequence of quantum mechanics.

I personally think that quantum mechanics has nothing to do with it (but see further below). The reason the law cannot be derived is that it does not exist.

I know, I know. Deafening silence. Then:

"What do you mean? Obviously the law exists!"

What I mean, to be more precise, is that strictly speaking Boltzmann's entropy cannot describe what goes on when a system not at equilibrium approaches said equilibrium, because Boltzmann's entropy is an equilibrium concept. It describes the value that is approached when a system equilibrates. It cannot describe its value as it approaches that constant. Yes, Boltzmann's entropy is a constant: it counts how many microstates can be taken on by a system at fixed energy. 

When a system is not at equilibrium, fewer microstates are actually occupied by the system, but the number it could potentially take on is constant. Take, for example, the standard "perfume bottle" experiment that is so often used to illustrate the second law:
An open "perfume bottle" (left) about to release its molecules into the available space (right)


The entropy of the gas inside the bottle is usually described as being small, while the entropy of the gas on the right (because it occupies a large space) is believed to be large. But Boltzmann's formula is actually not applicable to the situation on the left, because it assumes (on account of the equilibrium condition) that the probability distributions in phase space of all particles involved are independent. But they are clearly not, because if I know the location of one of the particles in the bottle, I can make very good predictions about the other particles because they occupy such a confined space. (This is much less true for the particles in the larger space at right, obviously).

What should we do to correct this? 

We need to come up with a formula for entropy that is not explicitly true only at equilibrium, and that allows us to quantify correlations between particles. Thermodynamics cannot do this, because equilibrium thermodynamics is precisely that theory that deals with systems whose correlations have decayed long ago, or as Feynman put it, systems "where all the fast things have happened but the slow things have not". 

Shannon's formula, it turns out, does precisely what we are looking for: quantify correlations between all particles involved. Thus, Shannon's entropy describes, in a sense, nonequilibrium thermodynamics. Let me show you how.

Let's go back to Shannon's formula applied to a single molecule, described by a random variable \(A_1\), and call this entropy \(H(A_1)\). 

I want to point out right away something that may shock and disorient you, unless you followed the discussion in the post "What is Information? (Part 3: Everything is conditional)" that I mentioned earlier. This entropy \(H(A_1)\) is actually conditional. This will become important later, so just store this away for the moment.

OK. Now let's look at a two-atom gas. Our second atom is described by random variable \(A_2\), and you can see that we are assuming here that the atoms are distinguishable. I do this only for convenience, everything can be done just as well for indistinguishable particles.

If there are no correlations between the two atoms, then the entropy of the joint system is \(H(A_1A_2)=H(A_1)+H(A_2)\), that is, entropy is extensive. Thermodynamical entropy is extensive because it describes things at equilibrium. Shannon entropy, on the other hand, is not. It can describe things that are not at equilibrium, because then

                                   \(H(A_1A_2)=H(A_1)+H(A_2)-H(A_1:A_2) ,\)

where \(H(A_1:A_2)\) is the correlation entropy, or shared entropy, or information, between \(A_1\) and \(A_2\). It is what allows you to predict something about \(A_2\) when you know \(A_1\), which is precisely what we already knew we could do in the picture of the molecules crammed into the perfume bottle on the left. This is stunning news for people who only know thermodynamics.
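To make this concrete, here is a tiny Python sketch (my own illustration, not part of the original argument; the joint distribution is made up) that checks the bookkeeping \(H(A_1A_2)=H(A_1)+H(A_2)-H(A_1:A_2)\) for two correlated binary variables:

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a probability array; zero entries are skipped."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# A made-up joint distribution of two correlated binary variables A1 and A2
# (rows index A1, columns index A2).
joint = np.array([[0.4, 0.1],
                  [0.1, 0.4]])

H12 = entropy(joint)               # joint entropy H(A1 A2)
H1  = entropy(joint.sum(axis=1))   # marginal entropy H(A1)
H2  = entropy(joint.sum(axis=0))   # marginal entropy H(A2)
I   = H1 + H2 - H12                # shared entropy H(A1:A2)

print(H1, H2, H12, I)
# With these numbers: H(A1) = H(A2) = 1 bit, H(A1A2) is about 1.72 bits,
# so the two atoms share about 0.28 bits.
```

If you zero out the off-diagonal entries (perfect correlation), the shared entropy grows to a full bit; if you make all four entries equal (no correlation at all), it drops to zero and the entropy becomes extensive again.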

What if we have more particles? Well, we can quantify those correlations too. Say we have three variables, and the third one is (with very little surprise) described by variable \(A_3\). It is then a simple exercise to write the joint entropy \(H(A_1A_2A_3)\) as

         \(H(A_1)+H(A_2)+H(A_3)-H(A_1:A_2)-H(A_1:A_3)-H(A_2:A_3)+H(A_1:A_2:A_3)\)

Entropy Venn diagram for three random variables, with the correlation entropies indicated.

We find thus that the entropy of the joint system of variables can be written in terms of the extensive entropy (the sum of the subsystem entropies) minus the correlation entropy \(H_{\rm corr}\), which includes correlations between pairs of variables, triplets of variables, and so forth. Indeed, the joint entropy of an \(n\)-particle system can be written as the (extensive) sum of single-particle entropies plus (or minus) the possible many-particle correlation entropies (the sign alternates between even and odd numbers of participating particles):

\(H(A_1,...,A_n)=\sum_{i=1}^n H(A_i)-\sum_{i<j}H(A_i:A_j)+\sum_{i<j<k} H(A_i:A_j:A_k)-\cdots. \)

This formula quickly becomes cumbersome, which is why Shannon entropy isn't a very useful formulation of non-equilibrium thermodynamics unless the correlations are somehow confined to just a few variables. 
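If you want to convince yourself that the three-variable decomposition above really holds, here is a short numerical check (again just a sketch of mine, with a random joint distribution):

```python
import numpy as np

def H(p):
    """Shannon entropy (in bits) of a (possibly multi-dimensional) probability array."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

rng = np.random.default_rng(0)
joint = rng.random((2, 2, 2))
joint /= joint.sum()                      # random joint distribution p(a1, a2, a3)

# Single-variable and pairwise marginals
H1, H2, H3    = H(joint.sum((1, 2))), H(joint.sum((0, 2))), H(joint.sum((0, 1)))
H12, H13, H23 = H(joint.sum(2)), H(joint.sum(1)), H(joint.sum(0))
H123 = H(joint)

# Pairwise shared entropies and the triple correlation entropy
I12 = H1 + H2 - H12
I13 = H1 + H3 - H13
I23 = H2 + H3 - H23
I123 = H1 + H2 + H3 - H12 - H13 - H23 + H123

# H(A1 A2 A3) = sum of singles - pairwise terms + triple term
print(H123, H1 + H2 + H3 - I12 - I13 - I23 + I123)   # the two numbers agree
```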

Now, let's look at what happens when the gas in the bottle escapes into the larger area. Initially, the entropy is small, because the correlation entropy is large. Let's write this entropy as 

                                            \(H(A_1,...,A_n|I)=H(A_1,...,A_n)-I,\)

where \(I\) is the information I have because I know that the molecules are in the bottle. You now see why the entropy is small: you know a lot (in fact, \(I\)) about the system. The unconditional piece is the entropy of the system when all the fast things (the molecules escaping the bottle) have happened.  

Some of you may have already understood what happens when the bottle is opened: the information \(I\) that I have (or any other observer, for that matter, has) decreases. And as a consequence, the conditional entropy \(H(A_1,...,A_n|I)\) increases. It does so until \(I=0\), and the maximum entropy state is achieved. Thus, what is usually written as the second law is really just the increase of the conditional entropy as information becomes outdated. Information, after all, is that which allows me to make predictions with accuracy better than chance. If the symbols that I have in my hand (and that I use to make the predictions) do not predict anymore, then they are not information anymore: they have turned to entropy. Indeed, in the end this is all the second law is about: how information turns into entropy.
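Here is a toy version of the perfume-bottle story, in numbers (the cell counts are made up, and I am ignoring any residual correlations among the molecules inside their allowed region; this is purely an illustration of the bookkeeping):

```python
import numpy as np

V, V0, n = 1024, 16, 100       # coarse-grained cells in the room, cells in the bottle, molecules

H_max = n * np.log2(V)         # unconditional (joint) entropy: a constant

I_closed = n * np.log2(V / V0) # information while the molecules are known to be in the bottle
H_closed = H_max - I_closed    # conditional entropy H(A_1,...,A_n | I) with the bottle closed

I_open = 0.0                   # once the gas has equilibrated, that knowledge predicts nothing
H_open = H_max - I_open        # conditional entropy after opening the bottle

print(H_closed, H_open, H_max)
# The conditional entropy climbs from n*log2(V0) = 400 bits to n*log2(V) = 1000 bits,
# while the unconditional entropy (1000 bits) never changed.
```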

You have probably already noticed that I could now take the vessel on the right of the figure above and open that one up. Then you realize that you did have information after all, namely you knew that the particles were confined to the larger area. This example teaches us that, as I pointed out in "What is Information? (Part I)", the entropy of a system is not a well-defined quantity unless we specify what measurement device we are going to use to measure it with, and as a consequence what the range of values of the measurements are going to be. 

The original second law, being faulty, should therefore be reformulated like this: 

In a thermodynamical equilibrium or non-equilibrium process, the unconditional (joint) entropy of a closed system remains a constant. 

The "true second law", I propose, should read:

When an isolated system approaches equilibrium from a non-equilibrium state, its conditional entropy almost always increases

Well, that looks suspiciously like the old law, except with the word "conditional" in front of "entropy". It seems like an innocuous change, but it took two blog posts to get there, and I hope I have convinced you that this change is not at all trivial. 

Now to close this part, let's return to Gibbs's entropy, which really looks exactly like Shannon's. And indeed, the \(p_i\) in Gibbs's formula 

                                                   \(S=-\sum_i p_i\log p_i\)

could just as well refer to non-equilibrium distributions. If it does refer to equilibrium, we should use the Boltzmann distribution (I set here Boltzmann's constant to \(k=1\), as it really just renormalizes the entropy)
$$p_i=\frac1Z e^{-E_i/T}$$
where \(Z=\sum_ie^{-E_i/T}\) is known as the "partition function" in thermodynamics (which just makes sure that the \(p_i\) are correctly normalized), and \(E_i\) is the energy of the \(i\)th microstate. Oh yeah, \(T\) is the temperature, in case you were wondering.

If we plug this \(p_i\) into Gibbs's (or Shannon's) formula, we get 
$$S=\log Z+E/T$$
This is, of course, a well-known thermodynamical relationship because \(F=-T\log Z\) is also known as the Helmholtz free energy, so that \(F=E-TS\). 
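If you want to see this with actual numbers, here is a quick check (my own sketch; the energy levels and temperature are arbitrary) that plugging the Boltzmann distribution into the Gibbs/Shannon formula gives back \(S=\log Z+E/T\) and \(F=-T\log Z=E-TS\):

```python
import numpy as np

E = np.array([0.0, 1.0, 2.0, 5.0])    # some arbitrary microstate energies
T = 1.5                                # temperature, in the same (energy) units, with k = 1

Z = np.sum(np.exp(-E / T))             # partition function
p = np.exp(-E / T) / Z                 # Boltzmann distribution

S_gibbs  = -np.sum(p * np.log(p))      # Gibbs/Shannon entropy of that distribution
E_mean   = np.sum(p * E)               # average energy
S_thermo = np.log(Z) + E_mean / T      # the thermodynamic expression log Z + E/T

print(S_gibbs, S_thermo)                       # these coincide
print(-T * np.log(Z), E_mean - T * S_gibbs)    # F = -T log Z  equals  E - T S
```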

Since we have just seen that this classical formula is the limiting case of using the Boltzmann (equilibrium) distribution within Gibbs's (or Shannon's) formula, we can be pretty confident that the relationship between information theory and thermodynamics I just described is sound.

As a last thought: how did von Neumann know that Shannon's formula was the (non-equilibrium) entropy of thermodynamics? He had been working on quantum statistical mechanics in 1927, and deduced that the quantum entropy should be written in terms of the quantum density matrix \(\rho\) as (here "Tr" stands for the matrix trace)
$$S(\rho)=-{\rm Tr} \rho\log \rho.$$
Quantum mechanical density matrices are in general non-diagonal. But were they to become classical, they would approach a diagonal matrix where all the elements on the diagonal are probabilities \(p_1,...,p_n\). In that case, we just find
$$S(\rho)\to-\sum_{i=1}^n p_i\log p_i, $$ 
in other words, Shannon's formula is just the classical limit of the quantum entropy that was invented twenty-one years before Shannon thought of it, and you can bet that Johnny immediately saw this!
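For the programmatically inclined, here is a minimal sketch (mine, with a made-up density matrix) of that classical limit: the von Neumann entropy is computed from the eigenvalues of \(\rho\), and for a diagonal \(\rho\) those eigenvalues are just the probabilities \(p_i\), so the formula reduces to Shannon's:

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -Tr rho log rho, computed from the eigenvalues of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return -np.sum(evals * np.log(evals))

# A made-up qubit density matrix with off-diagonal (coherence) terms
rho = np.array([[0.7, 0.2],
                [0.2, 0.3]])

# Its diagonal, read as a classical probability distribution
p = np.diag(rho)
shannon = -np.sum(p * np.log(p))

print(von_neumann_entropy(rho), shannon)
# Roughly 0.52 vs 0.61 (in nats): the coherences make the quantum state less
# uncertain than its diagonal alone would suggest.  Set the off-diagonal terms
# to zero and the two numbers coincide.
```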

So there is a very good reason why Boltzmann's, Gibbs's, and Shannon's formulas are all called entropy, and Johnny von Neumann didn't make this suggestion to Shannon in jest.

Is this the end of "Whose entropy is it anyway?" Perhaps, but I have a lot more to write about the quantum notion of entropy, and whether considering quantum mechanical measurements can say anything about the arrow of time (as Landau and Lifshitz suggested). Because considering the quantum entropy of the universe can also say something about the evolution of our universe and the nature of the "Big Bang", perhaps a Part 3 will be appropriate.

Stay tuned!


Saturday, June 7, 2014

Whose entropy is it anyway? (Part 1: Boltzmann, Shannon, and Gibbs )

Note: this post was slated to appear on May 31, 2014, but events outside of my control (such as grant submission deadlines, and parties at my house) delayed its issuance.

The word "entropy" is used a lot, isn't it? OK, not in your average conversation, but it is a staple of conversations between some scientists, but certainly all nerds and geeks. You have read my introduction to information theory I suppose (and if not, go ahead and start here, right away!)  But in my explanations of Shannon's entropy concept, I only obliquely referred to another "entropy": that which came before Shannon: the thermodynamic entropy concept of Boltzmann and Gibbs. The concept was originally discussed by Clausius, but because he did not give a formula, I will just have to ignore him here. 

Why do these seemingly disparate concepts have the same name? How are they related? And what does this tell us about the second law of thermodynamics?

This is the blog post (possibly a series) where I try to throw some light on that relationship. I suspect that what follows below isn't very original (otherwise I probably should have written it up in a paper), but I have to admit that I didn't really check. I did write about some of these issues in an article that was published in a Festschrift on the occasion of the 85th birthday of Gerry Brown, who was my Ph.D. co-advisor and a strong influence on my scientific career. He passed away a year ago to this day, and I have not yet found a way to remember him properly. Perhaps a blog post on the relationship between thermodynamics and information theory is appropriate, as it bridges a subject Gerry taught often (Thermodynamics) with a subject I have come to love: the concept of information. But face it: a book chapter doesn't get a lot of readership. Fortunately, you can read it on arxiv here, and I urge you to because it does talk about Gerry in the introduction.  

Gerry Brown (1926-2013)
Before we get to the relationship between Shannon's entropy and Boltzmann's, how did they end up being called by the same name? After all, one is a concept within the realm of physics, the other from electrical engineering. What gives?

The one to blame for this confluence is none other than John von Neumann, the mathematician, physicist, engineer, computer scientist (perhaps Artificial Life researcher, sometimes moonlighting as an economist). It is difficult to appreciate the genius that was John von Neumann, not least because there aren't many people who are as broadly trained as he was. For me, the quote that fills me with awe comes from another genius who I've had the privilege to know well, the physicist Hans Bethe. I should write a blog post about my recollections of our interactions, but there is already a write-up in the book memorializing Hans's life. While I have never asked Hans directly about his impressions of von Neumann (how I wish that I had!), he is quoted as saying (in the 1957 LIFE magazine article commemorating von Neumann's death): "I have sometimes wondered whether a brain like von Neumann's does not indicate a species superior to man".

The reason why I think that this is quite a statement is that I think Bethe's brain was in itself very unrepresentative of our species, and perhaps indicated an altogether different kind.

So, the story goes (as told by Myron Tribus in his 1971 article "Energy and Information") that when Claude Shannon had figured out his channel capacity theorem, he consulted von Neumann (both at Princeton at the time) about what he should call the "-p log p" value of the message to be sent over a channel. von Neumann supposedly replied:

"You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name. In the second place, and more importantly, no one knows what entropy really is, so in a debate you will always have the advantage.”

The quote is also reprinted in the fairly well-known book "Maxwell's Demon: Entropy, Information, and Computing", edited by Leff and Rex. Indeed, von Neumann had defined a quantity just like that as early as 1927 in the context of quantum mechanics (I'll get to that). So he knew exactly what he was talking about.

Let's assume that this is an authentic quote. I can see how it could be authentic, because the thermodynamic concept of entropy (due to the Austrian physicist Ludwig Boltzmann) can be quite, let's say, challenging. I'm perfectly happy to report that I did not understand it for the longest time, in fact not until I understood Shannon's entropy, and perhaps not until I understood quantum entropy.
Ludwig Boltzmann (1844-1906). Source: Wikimedia
Boltzmann defined entropy. In fact, his formula $S= k \log W$ is engraved on top of his tombstone, as shown here:
Google "Boltzmann tombstone" to see the entire marble edifice to Boltzmann
In this formula, $S$ stands for entropy, $k$ is now known as "Boltzmann's constant", and $W$ is the number of states (usually called "microstates" in statistical physics) a system can take on. But it is the $\log W$ that is the true entropy of the system. Entropy is actually a dimensionless quantity in thermodynamics. It takes on the form above (which has the dimensions of the constant $k$) only if you fail to measure temperature in units of energy, and instead use more conventional units such as the Kelvin. In fact, $k$ just tells you how to translate between the two:
$$k=1.38\times 10^{-23} {\rm J/K},$$
where J (for Joule) is the SI unit for energy. If you define temperature in these units, then entropy is dimensionless
$$S=\log W.   (1)$$
But this doesn't at all look like Shannon's formula, you say? 

You're quite right. We still have a bit of work to do. We haven't yet exploited the fact that $W$ is the number of microstates consistent with a macrostate at energy $E$. Let us write down the probability distribution $w(E)$ for the macrostate to be found with energy $E$. We can then see that


I'm sorry, that last derivation was censored. It would have bored the tears out of you. I know because I could barely stand it myself. I can tell you where to look it up in Landau & Lifshitz if you really want to see it.

The final result is this: Eq. (1) can be written as
$$S=-\sum_{E_i} w_i\log w_i   (2)$$
implying that Boltzmann's entropy formula looks to be exactly the same as Shannon's. 

Except, of course, that in the equation above the probabilities $w_i$ are all equal to each other. If some microstates are more likely than others, the entropy becomes simply
$$S=-\sum_{i} p_i\log p_i     (3)$$
where the $p_i$ are the different probabilities to occupy the different microstates $i$.
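Just to spell out the connection back to Eq. (1): if all $W$ microstates are equally likely, then $p_i=1/W$, and Eq. (3) collapses right back to Boltzmann's formula,
$$S=-\sum_{i=1}^{W}\frac{1}{W}\log\frac{1}{W}=\log W.$$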

Equation (3) was derived by the American theoretical physicist Willard Gibbs, who is generally credited with the development of statistical mechanics.

J. Willard Gibbs (1839-1903) Source: Wikimedia
Now Eq. (3) does look precisely like Shannon's, which you can check by comparing to Eq. (1) in the post "What is Information? (Part 3: Everything is conditional)". Thus, it is Gibbs's entropy that is like Shannon's, not Boltzmann's. But before I discuss this subtlety, ponder this:

At first sight, this similarity between Boltzmann's and Shannon's entropy appears ludicrous. Boltzmann was concerned with the dynamics of gases (and many-particle systems in general). Shannon wanted to understand whether you can communicate accurately over noisy channels. These appear to be completely unrelated endeavors. Except they are not, if you move far enough away from the particulars. Both, in the end, have to do with measurement. 

If you want to communicate over a noisy channel, the difficult part is on the receiving end (even though you quickly find out that in order to be able to receive the message in its pristine form, you also have to do some work at the sender's end). Retrieving a message from a noisy channel requires that you or I make accurate measurements that can distinguish the signal from the noise. 

If you want to characterize the state of a many-particle system, you have to do something other than measure the state of every particle (because that would be impossible). You'll have to develop a theory that allows you to quantify the state given a handful of proxy variables, such as energy, temperature, and pressure. This is, fundamentally, what thermodynamics is all about. But before you can think about what to measure in order to know the state of your system, you have to define what it is you don't know. This is Boltzmann's entropy: how much you don't know about the many-particle system.

In Shannon's channel, a message is simply a set of symbols that can encode meaning (they can refer to something). But before it has any meaning, it is just a vessel that can carry information. How much information? This is what's given by Shannon's entropy. Thus, the Shannon entropy quantifies how much information you could possibly send across the channel (per use of the channel), that is, entropy is potential information.

Of course, Boltzmann entropy is also potential information: If you knew the state of the many-particle system precisely, then the Boltzmann entropy would vanish. You (being an ardent student of thermodynamics) already know what is required to make a thermodynamical entropy vanish: the temperature of the system must be zero. This, incidentally, is the content of the third law of thermodynamics.
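To see why (a quick check, assuming a unique, non-degenerate ground state): as $T\to 0$, the weights $p_i\propto e^{-E_i/T}$ of all excited states vanish, so that
$$p_{\rm ground}\to 1,\qquad p_{i\neq {\rm ground}}\to 0,\qquad S=-\sum_i p_i\log p_i\to -1\cdot\log 1=0.$$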

"The third law?", I hear some of you exclaim. "What about the second?"

Yes, what about this so-called Second Law?

To be continued, with special emphasis on the Second Law, in Part 2