## Friday, December 5, 2014

### Life and Information

I wrote a blog post about "information-theoretic considerations concerning life and its origins" for PBS's Blog "The Nature of Reality", but as they own the copyright to that piece, I cannot reproduce it here. You are free to follow the link, though: "Living Bits: Information and the Origin of Life".

I'm not complaining: the contract I signed clearly spelled out ownership. I can ask for permission to reproduce it, and I may.

The piece is based in part on an article that is currently in review. You can find the arXiv version here, and some other bloggers' comments here and here.

Creationists also had something to say about that article, but I won't link any of it here. After all, this is a serious blog.

## Sunday, November 9, 2014

### On quantum measurement (Part 5: Quantum Venn diagrams)

Here's what you missed, in case you have stumbled into this series midway. As you are wont to do, of course.

Part 1 had me reminiscing about how I got interested in the quantum measurement problem, even though my Ph.D. was in theoretical nuclear physics, not "foundational" stuff, and introduced the incomparable Hans Bethe, who put my colleague Nicolas Cerf and me on the scent of the problem.

Part 2 provides a little bit of historical background. After all, a bunch of people have thought about quantum measurement, and they are all rightfully famous: Bohr, Einstein, von Neumann. Two of those three are also heroes of mine. Two, not three.

Part 3 starts out with the math of classical measurement, and then goes on to show that quantum mechanics can't do anything like that, because no-cloning. Really: the no-cloning theorem ruins quantum measurement. Read about it if you don't believe me.

Part 4 goes further. In that part you learn that measuring something in quantum physics means not looking at the quantum system, and that classical measurement devices are, in truth,  really, really large quantum measurement devices, whose measurement basis is statistically orthogonal to the quantum system (on account of them being very high-dimensional). But that you should still respect their quantumness, which Bohr did not.

Sometimes I wonder how our understanding of quantum physics would be if Bohr had never lived. Well, come to think of it, perhaps I would not be writing this, as Bohr actually gave Gerry Brown his first faculty position at the NORDITA in Copenhagen, in 1960 (yes, before I was even born).  And it was in Gerry's group where I got my Ph.D., which led to everything else. So, if Niels Bohr had never lived, we would all understand quantum mechanics a little better, and this blog series would not only never have been written, but also be altogether unnecessary? So, when I wonder about such things, clearly I am wasting everybody's time.

All right, let's get back to the problem at hand. I showed you how Born's rule emerges from not looking. Not looking at the quantum system, that is, which of course you never do because you are so focused on your classical measurement device (that you fervently hope will reveal to you the quantum truth). And said measurement device then proceeds to lie to you by not revealing the quantum truth, because it can't. Let's study this mathematically.

First, I will change a bit the description of the measurement process from what I showed you in the previous post, where a quantum system (to be measured) was entangled with another measurement device (which intrinsically is also a quantum system). One of the two (system and measurement device) has a special role (namely we are going to look at it, because it looks classical). Rather than describing that measurement device by $10^{23}$ qubits measuring that lonely quantum bit, I'm going to describe the measurement device by two bits. I'm doing that so that I can monitor the consistency of the measurement device: after all, each and every fibre of the measurement device should confidently tell the same story, so all individual bits that make up the measurement device should agree. And if I show you only two of the bits and their correlation, that's because it is simpler than showing all $10^{23}$, even though the calculation including all the others would be exactly the same.

All right, let's do the measurement with three systems: the quantum system $Q$, and the measurement devices (aka ancillae, see previous posts for explanation of that terminology) $A_1$ and $A_2$.

Initially then, the quantum system and the ancillae are in the state
$$|Q\rangle|A_1A_2\rangle=|Q\rangle|00\rangle.$$
I'll be working in the "position-momentum" picture of measurement again, that is, the state I want to transfer from $Q$ to $A$ is the position $x$. And I'm going to jump right in and say that $Q$ is in a superposition $x+y$. After measurement, the system $QA_1A_2$ will then be
$$|QA_1A_2\rangle=|x,x,x\rangle+|y,y,y\rangle.$$
Note that I'm dispensing with normalizing the state. Because I'm not a mathematician, is why. I am allowed to be sloppy to get the point across.

This quantum state after measurement is pure, which you know of course means that it is perfectly "known", and has zero entropy:
$$\rho_{QA_1A_2}=|QA_1A_2\rangle\langle QA_1A_2|.$$
Yes, obviously something that is perfectly known has zero uncertainty. And indeed, any density matrix of the form $|.\rangle\langle.|$ has vanishing entropy. If you are still wondering why, wait until you see some mixed ("non-pure") states, and you'll know.

Now, you're no dummy. I know you know what comes next. Yes, we're not looking at the quantum system $Q$. We're looking at *you*, the measurement device! So we have to trace out the quantum system $Q$ to do that. Nobody's looking at that.

Quick note on "tracing out". I remember when I first heard that jargony terminology of "tracing over" (or "out") a system. It is a mathematical operation that reduces the dimension of a matrix by "removing" the degrees of freedom that are "not involved". In my view, the only way to really "get" what is going on there is to do one of those "tracing outs" yourself. The best example is, perhaps, to take the joint density matrix of an EPR pair, and "trace out" one of the two elements. Once you've done this and seen the result, you'll know in your guts forever what all this means. If this were a class, I'd show you at least two examples. Alas, it's a blog.
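If you would rather let a computer do the tracing, here is a minimal sketch in Python with NumPy. It assumes the EPR pair is the two-qubit state $(|00\rangle+|11\rangle)/\sqrt2$; the variable names are illustrative only.

```python
import numpy as np

# EPR (Bell) state (|00> + |11>)/sqrt(2); basis order is |00>, |01>, |10>, |11>.
psi = np.array([1.0, 0.0, 0.0, 1.0]) / np.sqrt(2.0)
rho_joint = np.outer(psi, psi.conj())      # 4x4 density matrix of the pure pair

# View the matrix with indices (a, b, a', b') and sum over b = b',
# which traces out the second member of the pair.
rho_first = np.einsum("abcb->ac", rho_joint.reshape(2, 2, 2, 2))

print(rho_first)
```

The joint matrix has off-diagonal entries (it is a pure, entangled state), but the reduced matrix of one member is diagonal with equal weights: taken alone, that member looks completely random.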

So let's trace out the quantum system, which is not involved in this measurement, after all. (See what I did there?)
$$\rho_{A_1A_2}={\rm Tr}(\rho_{QA_1A_2})=|x,x\rangle\langle x,x|+|y,y\rangle\langle y,y|\;.$$
Hey, this is a mixed state! It has *two* of the $|.\rangle\langle.|$ terms. And if I had done the normalization like I'm supposed to, each one would have a "0.5" in front of it.

Let's make a quick assessment of the entropies involved here. The entropy of the density matrix $\rho_{A_1A_2}$ is positive because it is a mixed state. But the entropy of the joint system was zero! Well, this is possible because someone you know has shown that conditional entropies can be negative:
$$S(QA_1A_2)=S(Q|A_1A_2)+S(A_1A_2)=0.$$
In the last equation, the left hand side has zero entropy because it is a pure state. The entropy of the mixed classical state (second term on right hand side) is positive, implying that the entropy of the quantum system given the measurement device (first term on the right hand side) is negative.
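The bookkeeping above can be checked numerically. The sketch below assumes each system is a single qubit, with $x\to0$ and $y\to1$, and restores the normalization that was skipped in the text; `entropy_bits` is a helper introduced here for illustration.

```python
import numpy as np

def entropy_bits(rho):
    """Von Neumann entropy -Tr(rho log2 rho), in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]           # 0 * log(0) is taken to be 0
    return float(-np.sum(evals * np.log2(evals)))

# Normalized |x,x,x> + |y,y,y> with x -> 0, y -> 1; qubit order is Q, A1, A2.
psi = np.zeros(8)
psi[0b000] = psi[0b111] = 1.0 / np.sqrt(2.0)
rho_QA1A2 = np.outer(psi, psi)

# Indices (q, a1, a2, q', a1', a2'): sum over q = q' to trace out Q.
rho_A1A2 = np.einsum(
    "qabqcd->abcd", rho_QA1A2.reshape(2, 2, 2, 2, 2, 2)
).reshape(4, 4)

S_joint = entropy_bits(rho_QA1A2)    # S(Q A1 A2), a pure state
S_device = entropy_bits(rho_A1A2)    # S(A1 A2), a mixed state
S_cond = S_joint - S_device          # S(Q | A1 A2)
print(S_joint, S_device, S_cond)
```

With one qubit per system the device entropy comes out as one bit, so the conditional entropy of $Q$ given the device is minus one bit.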

What about the measurement device itself? What is the shared entropy between all the "pieces" of the measurement device? Because I gave you only two pieces here, the calculation is much simpler than you might have imagined. I only have to calculate the shared entropy between $A_1$ and $A_2$. But that is trivial given the density matrix $\rho_{A_1A_2}$. Whatever $A_1$ shows, $A_2$ shows also: every single piece of the measurement device agrees with every other piece. Pure bliss and harmony!

Except when you begin to understand that this kumbaya of understanding may have nothing at all to do with the state of the quantum system! They may all sing the same tune, but the melody can be false. Like I said before: measurement devices can lie to you, and I'll now proceed to show you that they must.

The pieces of the measurement device are correlated, all right. A quick look at the entropy Venn diagram will tell you as much:
 Fig. 1: Venn diagram of the entropies in the measurement device made by the pieces $A_1$ and $A_2$.
Here, the entropy $S$ is the logarithm of the number of states that the device can possibly take on. A simple example is a device that can take on only two states, in which case $S=1$ bit. You can also imagine a Venn diagram of a measurement device with more than two pieces. If it is more than five your imagination may become fuzzy. The main thing to remember when dealing with classical measurement devices is that each piece of the device is exactly like any other piece. Once you know the state of one part, you know the state of all other parts. The device is of "one mind", not several.

But we know, of course, that the pieces by themselves are not really classical, they are quantum. How come they look classical?

Let's look at the entire system from a quantum information-theoretic point of view, not just the measurement device. The Venn diagram in question, of a quantum system $Q$ measured by a classical system $A$ that has two pieces $A_1$ and $A_2$ is
 Fig. 2: Venn diagram of entropies for the full quantum measurement problem, including the quantum system $Q$ and two "pieces" $A_1$ and $A_2$ of the measurement device $A$.
Now, that diagram looks a bit baffling, so let's spend some time with it. There are a bunch of minus signs in there for conditional entropies, but they should not be baffling you, because you should be getting used to them by now. Remember, $A$ is measuring $Q$. Let's take a look at what the entropy Venn diagram between $A$ and $Q$ looks like:
 Fig. 3: Entropic Venn diagram for quantum system $Q$ and measurement device $A$
That's right, $Q$ and $A$ are perfectly entangled, because that is what the measurement operation does when you deal with quantum systems: it entangles. This diagram can be obtained in a straightforward manner from the joint diagram just above, simply by taking $A$ to be the joint system $A_1A_2$. Then, the conditional entropy of $A$ (given $Q$) is the sum of the three terms $-S$, $S$, and $-S$, the shared entropy is the sum of the three terms $S$, 0, and $S$, and so on. And, if you ignore $Q$ (meaning you don't look at it), then you get back the classically correlated diagram (0,$S$,0) for $A_1$ and $A_2$ you see in Fig. 1.

But how much does the measurement device $A$ know about the quantum system?

From the entropy diagram above, the shared quantum entropy is $2S$, twice as much as the classical device can have! That doesn't seem to make any sense, and that is because the Venn diagram above has quantum entanglement $2S$, which is not the same thing as classical information. Classical information is that which all the pieces of the measurement device agree upon. So let's find out how much of that shared entropy is actually shared with the quantum system.

That, my friends, is given by the center of the triple Venn diagram above (Fig. 2). And that entry happens to be zero!
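For the skeptics, the entries of these diagrams can be computed directly. The following is a sketch under the assumption that $Q$, $A_1$ and $A_2$ are single qubits (so $S=1$ bit); the center of the triple diagram is evaluated by inclusion-exclusion over the subsystem entropies, and all helper names are illustrative.

```python
import numpy as np
from itertools import combinations

def entropy_bits(rho):
    """Von Neumann entropy in bits."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]
    return float(-np.sum(evals * np.log2(evals)))

def reduced_entropy(psi, keep, n=3):
    """Entropy of the qubits in `keep` for an n-qubit pure state psi."""
    t = psi.reshape([2] * n)
    drop = [i for i in range(n) if i not in keep]
    # Contract psi with its conjugate over the qubits that are traced out.
    rho = np.tensordot(t, t.conj(), axes=(drop, drop))
    d = 2 ** len(keep)
    return entropy_bits(rho.reshape(d, d))

# Normalized |0,0,0> + |1,1,1>; qubit 0 is Q, qubits 1 and 2 are A1 and A2.
psi = np.zeros(8)
psi[0] = psi[7] = 1.0 / np.sqrt(2.0)

S = {keep: reduced_entropy(psi, keep)
     for k in (1, 2) for keep in combinations(range(3), k)}
S_all = entropy_bits(np.outer(psi, psi.conj()))

# Center of the triple diagram, by inclusion-exclusion.
center = (S[(0,)] + S[(1,)] + S[(2,)]
          - S[(0, 1)] - S[(0, 2)] - S[(1, 2)] + S_all)
shared_Q_A = S[(0,)] + S[(1, 2)] - S_all        # between Q and A = A1 A2
shared_A1_A2 = S[(1,)] + S[(2,)] - S[(1, 2)]    # within the device, Q ignored
print(center, shared_Q_A, shared_A1_A2)
```

The three numbers reproduce the story in the text: zero in the center, $2S$ shared between $Q$ and the whole device, and $S$ shared between the two pieces of the device.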

"Oh, well", you rumble, "that must be due to how you chose to construct this particular measurement!"

Actually, no. That big fat zero is generic: it will always be there in any tri-partite quantum entropy diagram where the joint system (system and measurement device combined) is fully known. It is a LAW.
"What law is that?", you immediately question.

That law is nothing other than the law that prevents quantum cloning! The classical device cannot have any information about the quantum system, because as I said in a previous post, quantum measurement is actually impossible! What I just showed you is the mathematical proof of that statement.
"Hold on, hold on!"
What?
"Measurement reveals nothing about the quantum state? Ever?"

I can see you're alarmed. But rest assured: it depends. If quantum system $Q$ and classical measurement device $A$ (composed of a zillion subsystems $A_1,\ldots,A_{\rm zillion}$) are the only things there (meaning they form a pure state together), then yeah, you can't know anything. This law is really a quite reasonable one: sometimes it is called the law of "monogamy of entanglement".

Otherwise, that is, if a bunch of other systems exist so that, if you trace over them the state and the measurement device are mixed, then you can learn something from your measurement.

"Back up there for a sec. Monogamy what?"

Monogamy of entanglement? That's just a law that says that if one system is fully entangled with another (meaning their entropy diagram looks like Fig. 3), then they can't share that entanglement in the same intimate manner with a third party. Hence the big fat zero in the center of the Venn diagram in Fig. 2. If you think just a little about it, it will occur to you that if you could clone quantum states, then entanglement would not have to be monogamous at all.

"A second is all it takes?"

I said a little. It could be a second. It could be a tad longer. Stop reading and start thinking. It's not a very hard quiz.

"Oh, right, because if you're entangled with one system, and then if you could make a clone..."

Go on, you're on the right track!

"...then I could take a partner of a fully entangled pair, clone it, and then transform its partner into a new state (because if you clone the state you got to take its partner with it) so that the cloned state is entangled with the new state. Voilà, no more monogamy."

OK you, get back to your seat. You get an A.

So we learn that if a measurement leaves the quantum system in a state as depicted in Fig. 2, then nothing at all can be learned about the quantum state. But this is rarely, if ever, the case. In general, there are many other systems that the quantum system $Q$ is entangled with. And if we do not observe them, then the joint system $QA$ is not pure, meaning that the joint entropy does not vanish. And in that case, the center of the triple quantum Venn diagram does not vanish. And because that center tells you how much information you can obtain about the quantum system in a classical measurement, that means that you can actually learn something about the quantum system, after all.

But keep in mind: if the joint system $QA$ does not have exactly zero entropy, it is just a little bit classical, for reasons I discussed earlier in this series. Indeed, the whole concept of "environment-induced superselection" (or einselection) advocated by the quantum theorist Wojciech Zurek hinges precisely on such additional systems that interact with $QA$ (the "environment"), and that are being ignored ("traced over"). They "classicize" the quantum system, and allow you to extract some information about it.

I do realize that "classicize" is not a word. Or at least, not a word usually used in this context.  Not until I gave it a new meaning, right?

With this, dear patient reader, I must end this particular post, as I have exceeded the length limit of single blog posts (a limit that I clearly have not bothered with in any of my previous posts). I know you are waiting for Schrödinger's cat. I'll give you that, along with your run-of-the-mill beam splitter, as well as the not-so-run-of-the-mill quantum eraser, in Part 6.

"Quantum eraser? Is that some sort of super hero, or are you telling me that you can reverse a quantum measurement?"

Oh, the quantum eraser is real. The super hero, however, is Marlan Scully, and you'll learn all about it in Part 6.

## Tuesday, October 7, 2014

### Nifty papers I wrote that nobody knows about (Part 4: Complex Langevin equation)

This is the last installment of the "Nifty Papers" series. Here are the links to Part 1, Part 2, and Part 3.

For those outside the computational physics community, the following words don't mean anything: The Sign Problem.

For those others that have encountered the problem, these words elicit terror. They stand for sleepless nights. They spell despair. They make grown men and women weep helplessly. The Sign Problem.

OK, I get it, you're not one of those. So let me help you out.

In computational physics, one of the main tools people use to calculate complicated quantities is the Monte Carlo method. The method relies on random sampling of distributions in order to obtain accurate estimates of means. In the lab where I was a postdoc from 1992-1995, the Monte Carlo method was used predominantly to calculate the properties of nuclei, using a shell model approach.

I can't get into the specifics of the Monte Carlo method in this post, not least because such an exposition would lose me a good fraction of what viewers/readers I have left at this point. Basically, it is a numerical method to calculate integrals (even though it can be used for other things too). It involves sampling the integrand and summing the terms. If the integrand is strongly oscillating (lots of high positives and high negatives), then the integral may be slow to converge. Such integrals appear in particular when calculating expectation values in strongly interacting systems, such as for example big nuclei. And yes, the group I had joined as a postdoc at that point in my career specialized in calculating properties of large nuclei computationally using the nuclear shell model. These folks would battle the sign problem on a daily basis.

And while as a Fairchild Prize Fellow (at the time it was called the "Division Prize Fellowship", because Fairchild did not at that time want their name attached) I could work on anything I wanted (and I did!), I also wanted to do something that would make the life of these folks a little easier. I decided to try to tackle the sign problem. I started work on this problem in the beginning of 1993 (the first page of my notes reproduced below is dated February 9th, 1993, shortly after I arrived).

The last calculation, pages 720-727 of my notes, is dated August 27th, 1999, so I clearly took my good time with this project! Actually, it lay dormant for about four years as I worked on digital life and quantum information theory. But my notes were so detailed that I could pick the project back up in 1999.

The idea to use the complex Langevin equation to calculate "difficult" integrals is not mine, and not new (the literature on this topic goes back to 1985, see the review by Gausterer [1]). I actually had the idea without knowing these papers, but this is neither here nor there. I was the first to apply the method to the many-fermion problem, where I also was able to show that the complex Langevin (CL) averages converge reliably. Indeed, the CL method was, when I began working on it, largely abandoned because people did not trust those averages. But enough of the preliminaries. Let's jump into the mathematics.

Take a look at the following integral:

$$\frac1{\sqrt{2\pi}}\int_{-\infty}^\infty d\sigma e^{-(1/2)\sigma^2}\cos(\sigma z).$$
This integral looks very much like the Gaussian integral
$$\frac1{\sqrt{2\pi}}\int_{-\infty}^\infty d\sigma e^{-(1/2)\sigma^2}=1,$$
except for that cosine function. The exact result for the integral with the cosine function is (trust me there, but of course you can work it out yourself if you feel like it)
$$e^{-(1/2)z^2}.$$
This result might surprise you, as the integrand itself (on account of the cos function) oscillates a lot:
 The integrand $\cos(10x)e^{-1/2 x^2}$
The factor $e^{-(1/2) \sigma^2}$ dampens these oscillations, and in the end the result is simple: It is as if the cosine function wasn't even there, and just replaces $\sigma$ by $z$.  But a Monte Carlo evaluation of this integral runs into the sign problem when $z$ gets large and the oscillations become more and more violent. The numerical average converges very very slowly, which means that your computer has to run for a very long time to get a good estimate.
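To see the problem in action, here is a minimal sketch of the simplest possible Monte Carlo estimate: draw $\sigma$ from the Gaussian weight and average $\cos(\sigma z)$. The sample size, the seed, and the name `mc_cosine` are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = rng.standard_normal(200_000)      # samples drawn from the Gaussian weight

def mc_cosine(z):
    """Monte Carlo estimate of the cosine integral and its standard error."""
    values = np.cos(sigma * z)
    return values.mean(), values.std(ddof=1) / np.sqrt(values.size)

for z in (1.0, 3.0, 5.0):
    estimate, stderr = mc_cosine(z)
    exact = np.exp(-0.5 * z**2)
    print(f"z={z}: estimate={estimate:+.5f} +/- {stderr:.5f}, exact={exact:.2e}")
```

The statistical error stays roughly constant as $z$ grows, while the exact answer shrinks like $e^{-(1/2)z^2}$, so the relative error explodes.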

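Now imagine calculating an expectation value where this problem occurs both in the numerator and the denominator. In that case, we have to deal with small but slowly converging averages both in the numerator and denominator, and the ratio converges even more slowly. For example, imagine calculating the "mean square"
$$\langle \sigma^2\rangle_N=\frac{\int_{-\infty}^\infty d\sigma\, \sigma^2 e^{-(1/2)\sigma^2}\cos^N(\sigma z)}{\int_{-\infty}^\infty d\sigma\, e^{-(1/2)\sigma^2}\cos^N(\sigma z)}.$$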
The denominator of this ratio (for $N=1$) is the integral we looked at above. The numerator just has an extra $\sigma^2$ in it. The $N$ ("particle number") is there to just make things worse if you choose a large one, just as in nuclear physics larger nuclei are harder to calculate. I show you below the result of calculating this expectation value using the Monte Carlo approach (data with error bars), along with the analytical exact result (solid line), and as inset the average "sign" $\Phi$ of the calculation. The sign here is just the expectation value of
$$\Phi(z)=\frac{\cos(\sigma z)}{|\cos(\sigma z)|}$$

You see that for increasing $z$, the Monte Carlo average becomes very noisy, and the average sign disappears. For a $z$ larger than three, this calculation is quite hopeless: sign 1, Monte Carlo 0.
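A rough version of this experiment can be sketched in a few lines. The code below takes the mean square to be the ratio of the Gaussian-weighted integral of $\sigma^2\cos^N(\sigma z)$ to that of $\cos^N(\sigma z)$, and it assumes that the average sign is taken with respect to the weight $e^{-(1/2)\sigma^2}|\cos(\sigma z)|^N$. For $N=1$ the exact answer follows from differentiating $e^{-(1/2)z^2}$ twice with respect to $z$, which gives $1-z^2$. Function names, sample size and seed are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = rng.standard_normal(400_000)      # samples from the Gaussian weight

def mean_square(z, N=1):
    """Ratio estimate of <sigma^2>_N using the signed weight cos^N(sigma z)."""
    w = np.cos(sigma * z) ** N
    return np.mean(sigma**2 * w) / np.mean(w)

def average_sign(z, N=1):
    """Average of w/|w| under the weight Gaussian * |w| (an assumed convention)."""
    w = np.cos(sigma * z) ** N
    return np.mean(w) / np.mean(np.abs(w))

for z in (0.5, 1.5, 3.0, 4.0):
    print(f"z={z}: MC={mean_square(z):+.3f}, exact={1 - z**2:+.3f}, "
          f"sign={average_sign(z):+.4f}")
```

For small $z$ the estimate tracks $1-z^2$ and the sign is close to one. As $z$ increases, the sign drops into the noise and the ratio becomes unreliable.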

I want to make one thing clear here: of course you would not use the Monte Carlo method to calculate this integral if you can do it "by hand" (as you can for the example I show here). I'm using this integral as a test case, because the exact result is easy to get. The gist is: if you can solve this integral computationally, maybe you can solve those integrals for which you don't know the answer analytically in the same manner. And then you solve the sign problem. So what other methods are there?

The solution I proposed was using the complex Langevin equation. Before moving to the complex version (and why), let's look at using the real Langevin equation to calculate averages. The idea here is the following. When you calculate an integral using the Monte Carlo approach, what you are really doing is summing over a set of points that are chosen such that you reject (probabilistically) those that are not close to the integrand--and you accept those that are close, again probabilistically, which creates a sequence of random samples that approximates the probability distribution that you want to integrate.
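The accept/reject idea can be made concrete with the Metropolis rule. The post does not say which Monte Carlo variant is meant, so treat the choice of algorithm, the step size, and the function name `metropolis` as assumptions made for this sketch.

```python
import numpy as np

def metropolis(log_weight, n_steps, step=1.0, seed=0):
    """Random-walk Metropolis sampler for an unnormalized positive weight."""
    rng = np.random.default_rng(seed)
    x = 0.0
    samples = np.empty(n_steps)
    for i in range(n_steps):
        proposal = x + step * rng.uniform(-1.0, 1.0)
        # Accept with probability min(1, weight(proposal) / weight(x)).
        if np.log(rng.uniform()) < log_weight(proposal) - log_weight(x):
            x = proposal
        samples[i] = x          # a rejected move repeats the current point
    return samples

# Sample the Gaussian weight exp(-sigma^2 / 2) and check its first two moments.
samples = metropolis(lambda s: -0.5 * s**2, n_steps=100_000, step=2.0)
print(samples.mean(), np.mean(samples**2))
```

Note that the rule needs a positive weight to interpret as a probability, which is exactly what an oscillating integrand does not provide.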

But there are other methods to create sequences that appear to be drawn from a given probability distribution. One is the Langevin equation which I'm going to explain. Another is the Fokker-Planck equation, which is related to the Langevin equation but that I'm not going to explain.

Here's the theory (not due to me, of course), on how you use the Langevin equation to calculate averages. Say you want to calculate the expectation value of a function $O(\sigma)$. To do that, you need to average $O(\sigma)$, which means you sum (and by that I mean integrate), this function over the probability that you find $\sigma$. The idea here is that $\sigma$ is controlled by a physical process: $\sigma$ does not change randomly, but according to some laws of physics. You want to know the average $O$, which depends on $\sigma$, given that $\sigma$ changes according to some natural process.

If you think about it long enough, you realize that many many things in physics boil down to calculating averages just like that. Say, the pressure at room temperature given that the molecules are moving according to the known laws of physics. Right, almost everything in physics, then. So you see, being able to do this is important. Most of the time, Monte Carlo will serve you just fine. We are dealing with all the other cases here.

First, we need to make sure we capture the fact that the variable $\sigma$ changes according to some physical law. When you are first exposed to classical mechanics, you learn that the time development of any variable is described by a Lagrangian function (and then you move on to the Hamiltonian so that you are prepared to deal with quantum mechanics, but we won't go there here). The integral of the Lagrangian is called the "action" $S$, and that is the function that is used to quantify how likely any variable $\sigma$ is given that it follows these laws. For example, if you are a particle following the laws of gravity, then I can write down for you the Lagrangian (and hence the action) that makes sure the particles follow the law. It is $L=-\frac12m v^2+mV(\sigma)$, where $m$ is the mass, and $v$ is the velocity of the $\sigma$ variable, $v=d\sigma/dt$,  and $V(\sigma)$ is the gravitational potential.

The action is $S=\int L(\sigma(t))\, dt$, and the equilibrium distribution of $\sigma$ is
$$P(\sigma)=\frac1Z e^{-S}$$ where $Z$ is the partition function $Z=\int e^{-S}d\sigma$.

In computational physics, what you want is a process that creates this equilibrium distribution, because if you have it, then you can just sum over the variables so created and you have your integral. Monte Carlo is one method to create that distribution. We are looking for another.

It turns out that the Langevin equation
$$\frac{d\sigma}{dt}=-\frac12 \frac{dS}{d\sigma}+\eta(t)$$
creates precisely such a process. Here, $S$ is the action for the process, and $\eta(t)$ is a noise term with zero mean and unit variance:
$$\langle \eta(t)\eta(t^{\prime})\rangle=\delta(t-t^\prime).$$
Note that $t$ here is a "fictitious" time: we use it only to create a set of $\sigma$s that are distributed according to the probability distribution $P(\sigma)$ above. If we have this fictitious time series $\sigma_0$ (the solution to the differential equation above), then we can just average the observable $O(\sigma)$:
$$\langle O\rangle=\lim_{T\to\infty}\frac1T\int_0^TO(\sigma_0(t))dt$$
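As a sanity check of this recipe, here is a minimal sketch for the toy action $S=\frac12\sigma^2$, for which $\langle\sigma^2\rangle=1$. It assumes a simple Euler discretization of the Langevin equation with step `dt`, so that the noise integrated over one step has variance `dt`; the function name and parameters are illustrative.

```python
import numpy as np

def langevin_average(dS, observable, dt=0.01, n_steps=400_000, seed=0):
    """Time average of observable(sigma) along a discretized Langevin path."""
    rng = np.random.default_rng(seed)
    # eta integrated over one step of length dt is Gaussian with variance dt.
    noise = rng.standard_normal(n_steps) * np.sqrt(dt)
    sigma = 0.0
    total = 0.0
    for xi in noise:
        sigma += -0.5 * dS(sigma) * dt + xi     # drift -(1/2) dS/dsigma plus noise
        total += observable(sigma)
    return total / n_steps

# Toy action S = sigma^2 / 2, so dS/dsigma = sigma and <sigma^2> should be 1.
avg = langevin_average(dS=lambda s: s, observable=lambda s: s**2)
print(avg)
```

The finite step size introduces a small bias and the finite run length a statistical error, so the result is only expected to be close to one.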
Let's try the "Langevin approach" to calculating averages on the example integral $\langle \sigma^2\rangle_N$ above. The action we have to use is
$$S=\frac12 \sigma^2-N\ln [\cos(\sigma z)]$$ so that $e^{-S}$ gives exactly the integrand we are looking for. Remember, all expectation values are calculated as
$$\langle O\rangle=\frac{\int O(\sigma) e^{-S(\sigma)}d\sigma}{\int e^{-S(\sigma)}d\sigma}.$$

With that action, the Langevin equation is
$$\dot \sigma=-\frac12(\sigma+Nz\tan(\sigma z))+\eta \ \ \ \ (1)$$
This update rule creates a sequence of $\sigma$ that can be used to calculate the integral in question.

And the result is ..... a catastrophe!

The average does not converge, mainly because in the differential equation (1), I ignored a drift term that goes like $\pm i\delta(\cos(z\sigma))$. That it's there is not entirely trivial, but if you sit with that equation a little while you'll realize that weird stuff will happen if the cosine is zero. That term throws the trajectory all over the place once in a while, giving rise to an average that simply will not converge.
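To get a feeling for the "weird stuff", the sketch below evaluates only the deterministic drift of equation (1) close to a zero of the cosine. It does not model the extra $\pm i\delta$ term; the function name `drift` and the chosen distances are illustrative.

```python
import numpy as np

def drift(sigma, z, N=1):
    """Deterministic part of equation (1): -(1/2) dS/dsigma."""
    return -0.5 * (sigma + N * z * np.tan(sigma * z))

z = 1.0
pole = np.pi / (2.0 * z)        # first positive zero of cos(sigma * z)
for eps in (1e-1, 1e-3, 1e-6):
    below = drift(pole - eps, z)
    above = drift(pole + eps, z)
    print(f"eps={eps:g}: drift below={below:.3e}, above={above:.3e}")
```

The drift grows without bound and flips sign across the zero, so a single finite-step update that lands near such a point can fling the trajectory far away.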

In the end, this is the sign problem raising its ugly head again. You do one thing, you do another, and it comes back to haunt you. Is there no escape?

You've been reading patiently so far, so you must have suspected that there is an escape. There is indeed, and I'll show it to you now.

This simple integral that we are trying to calculate
$$\frac1{\sqrt{2\pi}}\int_{-\infty}^\infty d\sigma e^{-(1/2)\sigma^2}\cos(\sigma z),$$
we could really write it also as
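$$\frac1{\sqrt{2\pi}}\int_{-\infty}^\infty d\sigma e^{-(1/2)\sigma^2}e^{i\sigma z},$$
because the latter integral really has no imaginary part: the imaginary part of the integrand, $e^{-(1/2)\sigma^2}\sin(\sigma z)$, is odd in $\sigma$, and the integration range is symmetric.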
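A quick numerical check of this claim, reading the exponent as $i\sigma z$: the sketch below uses a plain Riemann sum on a symmetric grid, with the grid range, spacing, and the value of $z$ chosen arbitrarily for illustration.

```python
import numpy as np

z = 2.0
sigma, dx = np.linspace(-12.0, 12.0, 48_001, retstep=True)   # symmetric grid
gauss = np.exp(-0.5 * sigma**2) / np.sqrt(2.0 * np.pi)

# Riemann sum of the Gaussian times the complex exponential.
integral = np.sum(gauss * np.exp(1j * sigma * z)) * dx

print(integral.real, integral.imag, np.exp(-0.5 * z**2))
```

The real part agrees with $e^{-(1/2)z^2}$, and the imaginary part vanishes up to rounding error.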

This is the part that you have to understand to appreciate this article. And as a consequence this blog post. If you did, skip the next part. It is only there for those people that are still scratching their heads.