Eqs

Friday, December 5, 2014

Life and Information

I wrote a blog post about "information-theoretic considerations concerning life and its origins" for PBS's Blog "The Nature of Reality", but as they own the copyright to that piece, I cannot reproduce it here. You are free to follow the link, though: "Living Bits: Information and the Origin of Life".

I'm not complaining: the contract I signed clearly spelled out ownership. I can ask for permission to reproduce it, and I may. 

The piece is based in part on an article that is currently in review. You can find the arXiv version here, and some other bloggers' comments here and here.

Creationists also had something to say about that article, but I won't link any of it here. After all, this is a serious blog. 

Sunday, November 9, 2014

On quantum measurement (Part 5: Quantum Venn diagrams)

Here's what you missed, in case you have stumbled into this series midway. As you are wont to do, of course.

Part 1 had me reminiscing about how I got interested in the quantum measurement problem, even though my Ph.D. was in theoretical nuclear physics, not "foundational" stuff, and introduced the incomparable Hans Bethe, who put my colleague Nicolas Cerf and me on the scent of the problem.

Part 2 provides a little bit of historical background. After all, a bunch of people have thought about quantum measurement, and they are all rightfully famous: Bohr, Einstein, von Neumann. Two of those three are also heroes of mine. Two, not three.

Part 3 starts out with the math of classical measurement, and then goes on to show that quantum mechanics can't do anything like that, because no-cloning. Really: the no-cloning theorem ruins quantum measurement. Read about it if you don't believe me. 

Part 4 goes further. In that part you learn that measuring something in quantum physics means not looking at the quantum system, and that classical measurement devices are, in truth,  really, really large quantum measurement devices, whose measurement basis is statistically orthogonal to the quantum system (on account of them being very high-dimensional). But that you should still respect their quantumness, which Bohr did not.  

Sometimes I wonder how our understanding of quantum physics would be if Bohr had never lived. Well, come to think of it, perhaps I would not be writing this, as Bohr actually gave Gerry Brown his first faculty position at the NORDITA in Copenhagen, in 1960 (yes, before I was even born).  And it was in Gerry's group where I got my Ph.D., which led to everything else. So, if Niels Bohr had never lived, we would all understand quantum mechanics a little better, and this blog series would not only never have been written, but also be altogether unnecessary? So, when I wonder about such things, clearly I am wasting everybody's time.

All right, let's get back to the problem at hand. I showed you how Born's rule emerges from not looking. Not looking at the quantum system, that is, which of course you never do because you are so focused on your classical measurement device (that you fervently hope will reveal to you the quantum truth). And said measurement device then proceeds to lie to you by not revealing the quantum truth, because it can't. Let's study this mathematically.  

First, I will change a bit the description of the measurement process from what I showed you in the previous post, where a quantum system (to be measured) was entangled with a measurement device (which is, intrinsically, also a quantum system). One of the two (system and measurement device) has a special role: we are going to look at it, because it looks classical. Rather than describing that measurement device by \(10^{23}\) qubits measuring that lonely quantum bit, I'm going to describe the measurement device by two bits. I'm doing that so that I can monitor the consistency of the measurement device: after all, each and every fibre of the measurement device should confidently tell the same story, so all individual bits that make up the measurement device should agree. And if I show you only two of the bits and their correlation, that's because it is simpler than showing all \(10^{23}\), even though the calculation including all the others would be exactly the same.

All right, let's do the measurement with three systems: the quantum system \(Q\), and the measurement devices (aka ancillae, see previous posts for explanation of that terminology) \(A_1\) and \(A_2\). 

Initially then, the quantum system and the ancillae are in the state
$$|Q\rangle|A_1A_2\rangle=|Q\rangle|00\rangle.$$
I'll be working in the "position-momentum" picture of measurement again, that is, the state I want to transfer from \(Q\) to \(A\) is the position \(x\). And I'm going to jump right in and say that \(Q\) is in a superposition \(x+y\). After measurement, the system \(QA_1A_2\) will then be
$$|QA_1A_2\rangle=|x,x,x\rangle+|y,y,y\rangle.$$
Note that I'm dispensing with normalizing the state. Because I'm not a mathematician, is why. I am allowed to be sloppy to get the point across.

This quantum state after measurement is pure, which you know of course means that it is perfectly "known", and has zero entropy:
$$\rho_{QA_1A_2}=|QA_1A_2\rangle\langle QA_1A_2|.$$
Yes, obviously something that is perfectly known has zero uncertainty. And indeed, any density matrix of the form \(|.\rangle\langle.|\) has vanishing entropy. If you are still wondering why, wait until you see some mixed ("non-pure") states, and you'll know.

Now, you're no dummy. I know you know what comes next. Yes, we're not looking at the quantum system \(Q\). We're looking at *you*, the measurement device! So we have to trace out the quantum system \(Q\) to do that. Nobody's looking at that.

Quick note on "tracing out". I remember when I first heard that jargony terminology of "tracing over" (or "out") a system. It is a mathematical operation that reduces the dimension of a matrix by "removing" the degrees of freedom that are "not involved". In my view, the only way to really "get" what is going on there is to do one of those "tracing outs" yourself. Best example is, perhaps, to take the joint density matrix of an EPR pair, and "trace out" one of the two elements. Once you've done this and seen the result, you'll know in your guts forever what all this means. If this was a class, I'd show you at least two examples. Alas, it's a blog.
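Well, almost. Here is at least one of them, as a minimal numpy sketch (mine, not part of the original post): build the density matrix of an EPR (Bell) pair and trace out one half. The punchline is that the half you keep is maximally mixed.

```python
import numpy as np

# Bell state |phi+> = (|00> + |11>)/sqrt(2)
psi = (np.kron([1, 0], [1, 0]) + np.kron([0, 1], [0, 1])) / np.sqrt(2)
rho_AB = np.outer(psi, psi)      # pure joint state: zero entropy

# Trace out system B: view rho as rho[a, b, a', b'] and sum over b = b'
rho_A = np.trace(rho_AB.reshape(2, 2, 2, 2), axis1=1, axis2=3)
print(rho_A)                     # [[0.5, 0.], [0., 0.5]]: maximally mixed, one bit of entropy
```

One partial trace, and a perfectly known joint state turns into a maximally uncertain marginal. That, in your guts, is what "tracing out" does.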

So let's trace out the quantum system, which is not involved in this measurement, after all. (See what I did there?)
$$\rho_{A_1A_2}={\rm Tr_Q}(\rho_{QA_1A_2})=|x,x\rangle\langle x,x|+|y,y\rangle\langle y,y|\;.$$
Hey, this is a mixed state! It has *two* of the $|.\rangle\langle.|$ terms. And if I had done the normalization like I'm supposed to, each one would have a "0.5" in front of it. 

Let's make a quick assessment of the entropies involved here. The entropy of the density matrix \(\rho_{A_1A_2}\) is positive because it is a mixed state. But the entropy of the joint system was zero! Well, this is possible because someone you know has shown that conditional entropies can be negative:
$$S(QA_1A_2)=S(Q|A_1A_2)+S(A_1A_2)=0.$$
In the last equation, the left hand side has zero entropy because it is a pure state. The entropy of the mixed classical state (second term on right hand side) is positive, implying that the entropy of the quantum system given the measurement device (first term on the right hand side) is negative.
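To put numbers on it (using the properly normalized state, and measuring entropies in bits): the two-term mixture \(\rho_{A_1A_2}\) carries exactly one bit of entropy, so
$$S(Q|A_1A_2)=S(QA_1A_2)-S(A_1A_2)=0-1=-1\ {\rm bit}.$$
A negative conditional entropy: that is the signature of entanglement.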

What about the measurement device itself? What is the shared entropy between all the "pieces" of the measurement device? Because I gave you only two pieces here, the calculation is much simpler than you might have imagined. I only have to calculate the shared entropy between \(A_1\) and \(A_2\). But that is trivial given the density matrix \(\rho_{A_1A_2}\). Whatever \(A_1\) shows, \(A_2\) shows also: every single piece of the measurement device agrees with every other piece. Pure bliss and harmony!

Except when you begin to understand that this kumbaya of understanding may have nothing at all to do with the state of the quantum system! They may all sing the same tune, but the melody can be false. Like I said before: measurement devices can lie to you, and I'll now proceed to show you that they must.

The pieces of the measurement device are correlated, all right. A quick look at the entropy Venn diagram will tell you as much:
 Fig. 1: Venn diagram of the entropies in the measurement device made by the pieces \(A_1\) and \(A_2\).
Here, the entropy \(S\) is the logarithm of the number of states that the device can possibly take on. A simple example is a device that can take on only two states, in which case \(S=1\) bit. You can also imagine a Venn diagram of a measurement device with more than two pieces. If it is more than five your imagination may become fuzzy. The main thing to remember when dealing with classical measurement devices is that each piece of the device is exactly like any other piece. Once you know the state of one part, you know the state of all other parts. The device is of "one mind", not several. 

But we know, of course, that the pieces by themselves are not really classical, they are quantum. How come they look classical? 

Let's look at the entire system from a quantum information-theoretic point of view, not just the measurement device. The Venn diagram in question, of a quantum system \(Q\) measured by a classical system \(A\) that has two pieces \(A_1\) and \(A_2\) is
Fig. 2: Venn diagram of entropies for the full quantum measurement problem: including the quantum system \(Q\) and two "pieces" of the measurement device \(A\).
Now, that diagram looks a bit baffling, so let's spend some time with it. There are a bunch of minus signs in there for conditional entropies, but they should not be baffling you, because you should be getting used to them by now. Remember, \(A\) is measuring \(Q\). Let's take a look at what the entropy Venn diagram between \(A\) and \(Q\) looks like:
Fig. 3: Entropic Venn diagram for quantum system \(Q\) and measurement device \(A\)
That's right, \(Q\) and \(A\) are perfectly entangled, because that is what the measurement operation does when you deal with quantum systems: it entangles. This diagram can be obtained in a straightforward manner from the joint diagram just above, simply by taking \(A\) to be the joint system \(A_1A_2\). Then, the conditional entropy of \(A\) (given \(Q\)) is the sum of the three terms \(-S\), \(S\), and \(-S\), the shared entropy is the sum of the three terms \(S\), 0, and \(S\), and so on. And, if you ignore \(Q\) (meaning you don't look at it), then you get back the classically correlated diagram (0,\(S\),0) for \(A_1\) and \(A_2\) you see in Fig. 1.
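If you would rather check those entries than take my word for it, here is a small numpy sketch (my own, not from the original post) that computes the relevant entropies for the normalized post-measurement state, with \(|x\rangle,|y\rangle\) standing in for the qubit basis states \(|0\rangle,|1\rangle\):

```python
import numpy as np

def entropy(rho):
    """von Neumann entropy in bits."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > 1e-12]                      # drop zero eigenvalues
    return float(-np.sum(p * np.log2(p)))

def reduced(rho, keep, dims):
    """Reduced density matrix on the subsystems listed in `keep`."""
    n = len(dims)
    t = rho.reshape(dims + dims)
    for i in sorted(set(range(n)) - set(keep), reverse=True):
        m = t.ndim // 2                   # subsystems still present
        t = np.trace(t, axis1=i, axis2=i + m)
    d = int(np.prod([dims[i] for i in keep]))
    return t.reshape(d, d)

# post-measurement state (|0,0,0> + |1,1,1>)/sqrt(2) on Q, A1, A2
ket0, ket1 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
psi = (np.kron(np.kron(ket0, ket0), ket0) +
       np.kron(np.kron(ket1, ket1), ket1)) / np.sqrt(2)
rho = np.outer(psi, psi)
dims = [2, 2, 2]                          # Q = 0, A1 = 1, A2 = 2

S = {s: entropy(reduced(rho, list(s), dims)) for s in
     [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2), (0, 1, 2)]}

print("S(Q|A1A2) =", S[(0, 1, 2)] - S[(1, 2)])            # -1 bit (the -S)
print("S(Q:A1A2) =", S[(0,)] + S[(1, 2)] - S[(0, 1, 2)])  # +2 bits (the 2S of Fig. 3)
print("S(A1:A2)  =", S[(1,)] + S[(2,)] - S[(1, 2)])       # +1 bit (the S of Fig. 1)
print("center    =", S[(0,)] + S[(1,)] + S[(2,)]
      - S[(0, 1)] - S[(0, 2)] - S[(1, 2)] + S[(0, 1, 2)])  # 0 (the big fat zero)
```

Every entry in Figs. 1 through 3 drops out of those four lines, including the zero in the center, which is about to become the star of this post.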

But how much does the measurement device \(A\) know about the quantum system? 

From the entropy diagram above, the shared quantum entropy is \(2S\), twice as much as the classical device can have! That doesn't seem to make any sense, and that is because the Venn diagram above has quantum entanglement (\(2S\)), which is not the same thing as classical information. Classical information is that which all the pieces of the measurement device agree upon. So let's find out how much of that shared entropy is actually shared with the quantum system.

That, my friends, is given by the center of the triple Venn diagram above (Fig. 2). And that entry happens to be zero!

"Oh, well", you rumble, "that must be due to how you chose to construct this particular measurement!"

Actually, no. That big fat zero is generic: it will always be there in any tri-partite quantum entropy diagram where the joint system (system and measurement device combined) is fully known. It is a LAW.

"What law is that?", you immediately question.

That law is nothing other than the law that prevents quantum cloning! The classical device cannot have any information about the quantum system, because as I said in a previous post, quantum measurement is actually impossible! What I just showed you is the mathematical proof of that statement.

"Hold on, hold on!"
What?
"Measurement reveals nothing about the quantum state? Ever?"

I can see you're alarmed. But rest assured: it depends. If quantum system \(Q\) and classical measurement device \(A\) (composed of a zillion subsystems \(A_1,\ldots,A_{\rm zillion}\)) are the only things there (meaning they form a pure state together), then yeah, you can't know anything. This law is really a quite reasonable one: sometimes it is called the law of "monogamy of entanglement". 

Otherwise, that is, if a bunch of other systems exist so that, if you trace over them, the state and the measurement device are mixed, then you can learn something from your measurement. 

"Back up there for a sec. Monogamy what?"

Monogamy of entanglement? That's just a law that says that if one system is fully entangled with another (meaning their entropy diagram looks like Fig. 3), then they can't share that entanglement in the same intimate manner with a third party. Hence the big fat zero in the center of the Venn diagram in Fig. 2. If you think just a little about it, it will occur to you that if you could clone quantum states, then entanglement would not have to be monogamous at all.

"A second is all it takes?"

I said a little. It could be a second. It could be a tad longer. Stop reading and start thinking. It's not a very hard quiz.

"Oh, right, because if you're entangled with one system, and then if you could make a clone..."

Go on, you're on the right track!

"...then I could take a partner of a fully entangled pair, clone it, and then transform its partner into a new state (because if you clone the state you got to take its partner with it) so that the cloned state is entangled with the new state. Voilà, no more monogamy."

OK you, get back to your seat. You get an A.

So we learn that if a measurement leaves the quantum system in a state as depicted in Fig. 2, then nothing at all can be learned about the quantum state. But this is hardly if ever the case. In general, there are many other systems that the quantum system \(Q\) is entangled with. And if we do not observe them, then the joint system \(QA\) is not pure, meaning that the joint entropy does not vanish. And in that case, the center of the triple quantum Venn diagram does not vanish. And because that center tells you how much information you can obtain about the quantum system in a classical measurement, that means that you can actually learn something about the quantum system, after all. 

But keep in mind: if the joint system \(QA\) does not have exactly zero entropy, it is just a little bit classical, for reasons I discussed earlier in this series. Indeed, the whole concept of "environment-induced superselection" (or einselection) advocated by the quantum theorist Wojciech Zurek hinges precisely on such additional systems that interact with \(QA\) (the "environment"), and that are being ignored ("traced over"). They "classicize" the quantum system, and allow you to extract some information about it. 

I do realize that "classicize" is not a word. Or at least, not a word usually used in this context.  Not until I gave it a new meaning, right?

With this, dear patient reader, I must end this particular post, as I have exceeded the length limit of single blog posts (a limit that I clearly have not bothered with in any of my previous posts). I know you are waiting for Schrödinger's cat. I'll give you that, along with your run-of-the-mill beam splitter, as well as the not-so-run-of-the-mill quantum eraser, in Part 6.

"Quantum eraser? Is that some sort of superhero, or are you telling me that you can reverse a quantum measurement?" 

Oh, the quantum eraser is real. The superhero, however, is Marlan Scully, and you'll learn all about it in Part 6.




Tuesday, October 7, 2014

Nifty papers I wrote that nobody knows about (Part 4: Complex Langevin equation)

This is the last installment of the "Nifty Papers" series. Here are the links to Part 1, Part 2, and Part 3.

For those outside the computational physics community, the following words don't mean anything: The Sign Problem.

For those others that have encountered the problem, these words elicit terror. They stand for sleepless nights. They spell despair. They make grown men and women weep helplessly.

OK, I get it, you're not one of those. So let me help you out.

In computational physics, one of the main tools people use to calculate complicated quantities is the Monte Carlo method. The method relies on random sampling of distributions in order to obtain accurate estimates of means. In the lab where I was a postdoc from 1992-1995, the Monte Carlo method was used predominantly to calculate the properties of nuclei, using a shell model approach. 

I can't get into the specifics of the Monte Carlo method in this post, not least because such an exposition would lose me a good fraction of what viewers/readers I have left at this point. Basically, it is a numerical method to calculate integrals (even though it can be used for other things too). It involves sampling the integrand and summing the terms. If the integrand is strongly oscillating (lots of high positives and high negatives), then the integral may be slow to converge. Such integrals appear in particular when calculating expectation values in strongly interacting systems, such as for example big nuclei. And yes, the group I had joined as a postdoc at that point in my career specialized in calculating properties of large nuclei computationally using the nuclear shell model. These folks would battle the sign problem on a daily basis.

And while as a Fairchild Prize Fellow (at the time it was called the "Division Prize Fellowship", because Fairchild did not at that time want their name attached) I could work on anything I wanted (and I did!), I also wanted to do something that would make the life of these folks a little easier. I decided to try to tackle the sign problem. I started work on this problem in the beginning of 1993 (the first page of my notes reproduced below is dated February 9th, 1993, shortly after I arrived).

The last calculation, pages 720-727 of my notes, is dated August 27th, 1999, so I clearly took my good time with this project! Actually, it lay dormant for about four years as I worked on digital life and quantum information theory. But my notes were so detailed that I could pick the project back up in 1999.


The idea to use the complex Langevin equation to calculate "difficult" integrals is not mine, and not new (the literature on this topic goes back to 1985, see the review by Gausterer [1]). I actually had the idea without knowing these papers, but this is neither here nor there. I was the first to apply the method to the many-fermion problem, where I also was able to show that the complex Langevin (CL) averages converge reliably. Indeed, the CL method was, when I began working on it, largely abandoned because people did not trust those averages. But enough of the preliminaries. Let's jump into the mathematics.

Take a look at the following integral:

$$\frac1{\sqrt{2\pi}}\int_{-\infty}^\infty d\sigma e^{-(1/2)\sigma^2}\cos(\sigma z).$$
This integral looks very much like the Gaussian integral
$$\frac1{\sqrt{2\pi}}\int_{-\infty}^\infty d\sigma e^{-(1/2)\sigma^2}=1,$$
except for that cosine function. The exact result for the integral with the cosine function is (trust me there, but of course you can work it out yourself if you feel like it) 
$$e^{-(1/2)z^2}.$$
This result might surprise you, as the integrand itself (on account of the cos function) oscillates a lot:
The integrand \(\cos(10x)e^{-1/2 x^2}\)
The factor \(e^{-(1/2) \sigma^2}\) dampens these oscillations, and in the end the result is simple: it is as if the only thing the cosine did was replace \(\sigma\) by \(z\) in the Gaussian. But a Monte Carlo evaluation of this integral runs into the sign problem when \(z\) gets large and the oscillations become more and more violent. The numerical average converges very, very slowly, which means that your computer has to run for a very long time to get a good estimate.

Now imagine calculating an expectation value where this problem occurs both in the numerator and the denominator. In that case, we have to deal with small and slowly converging averages both in the numerator and denominator, and the ratio converges even more slowly. For example, imagine calculating the "mean square"
$$\langle\sigma^2\rangle_N=\frac{\int_{-\infty}^\infty d\sigma\,\sigma^2 e^{-(1/2)\sigma^2}\cos^N(\sigma z)}{\int_{-\infty}^\infty d\sigma\, e^{-(1/2)\sigma^2}\cos^N(\sigma z)}.$$
The denominator of this ratio (for \(N=1\)) is the integral we looked at above. The numerator just has an extra \(\sigma^2\) in it. The \(N\) ("particle number") is there to just make things worse if you choose a large one, just as in nuclear physics larger nuclei are harder to calculate. I show you below the result of calculating this expectation value using the Monte Carlo approach (data with error bars), along with the analytical exact result (solid line), and as inset the average "sign" \(\Phi\) of the calculation. The sign here is just the expectation value of
$$\Phi(z)=\frac{\cos(\sigma z)}{|\cos(\sigma z)|}$$

You see that for increasing \(z\), the Monte Carlo average becomes very noisy, and the average sign disappears. For a \(z\) larger than three, this calculation is quite hopeless: sign 1, Monte Carlo 0.
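Just to make the flavor of the problem concrete, here is a minimal Python sketch (mine, not the Metropolis machinery of the actual paper): sample \(\sigma\) from the Gaussian, use \(\cos(\sigma z)\) as a signed reweighting factor for the \(N=1\) case, and compare to the exact answer, which for this toy integral works out to \(1-z^2\) (differentiate the Gaussian formula above twice with respect to \(z\) to check).

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples = 200_000

for z in (1.0, 2.0, 3.0, 4.0):
    sigma = rng.standard_normal(n_samples)     # samples drawn from exp(-sigma^2/2)
    w = np.cos(sigma * z)                      # oscillating (signed) weight, N = 1
    num = np.mean(sigma**2 * w)                # estimates the numerator integral
    den = np.mean(w)                           # estimates the denominator; plays the role of the average "sign" here
    print(f"z = {z}:  MC estimate {num/den:+8.2f}   exact {1 - z**2:+6.2f}   "
          f"average sign {den:+.1e}")
```

By \(z=3\) the average sign has dropped to about a percent and the estimate gets shaky; by \(z=4\) you are essentially dividing noise by noise. That is the sign problem in one screenful.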

I want to make one thing clear here: of course you would not use the Monte Carlo method to calculate this integral if you can do it "by hand" (as you can for the example I show here). I'm using this integral as a test case, because the exact result is easy to get. The gist is: if you can solve this integral computationally, maybe you can solve those integrals for which you don't know the answer analytically in the same manner. And then you solve the sign problem. So what other methods are there?

The solution I proposed was using the complex Langevin equation. Before moving to the complex version (and why), let's look at using the real Langevin equation to calculate averages. The idea here is the following. When you calculate an integral using the Monte Carlo approach, what you are really doing is summing over a set of points that are chosen such that you reject (probabilistically) those that fall where the weight of the integrand is small, and accept (again probabilistically) those that fall where it is large, which creates a sequence of random samples that approximates the probability distribution that you want to integrate. 

But there are other methods to create sequences that appear to be drawn from a given probability distribution. One is the Langevin equation which I'm going to explain. Another is the Fokker-Planck equation, which is related to the Langevin equation but that I'm not going to explain. 

Here's the theory (not due to me, of course) on how you use the Langevin equation to calculate averages. Say you want to calculate the expectation value of a function \(O(\sigma)\). To do that, you need to average \(O(\sigma)\), which means you sum (and by that I mean integrate) this function over the probability that you find \(\sigma\). The idea here is that \(\sigma\) is controlled by a physical process: \(\sigma\) does not change randomly, but according to some laws of physics. You want to know the average \(O\), which depends on \(\sigma\), given that \(\sigma\) changes according to some natural process.

If you think about it long enough, you realize that many many things in physics boil down to calculating averages just like that. Say, the pressure at room temperature given that the molecules are moving according to the known laws of physics. Right, almost everything in physics, then. So you see, being able to do this is important. Most of the time, Monte Carlo will serve you just fine. We are dealing with all the other cases here. 

First, we need to make sure we capture the fact that the variable $\sigma$ changes according to some physical law. When you are first exposed to classical mechanics, you learn that the time development of any variable is described by a Lagrangian function (and then you move on to the Hamiltonian so that you are prepared to deal with quantum mechanics, but we won't go there here). The integral of the Lagrangian is called the "action" \(S\), and that is the function that is used to quantify how likely any variable \(\sigma\) is given that it follows these laws. For example, if you are a particle following the laws of gravity, then I can write down for you the Lagrangian (and hence the action) that makes sure the particles follow the law. It is \(L=-\frac12m v^2+mV(\sigma)\), where \(m\) is the mass, and \(v\) is the velocity of the \(\sigma\) variable, \(v=d\sigma/dt\),  and \(V(\sigma)\) is the gravitational potential.

The action is \(S=\int L(\sigma(t))\,dt\), and the equilibrium distribution of \(\sigma\) is
$$P(\sigma)=\frac1Z e^{-S}$$ where $Z$ is the partition function \(Z=\int e^{-S}d\sigma\).

In computational physics, what you want is a process that creates this equilibrium distribution, because if you have it, then you can just sum over the variables so created and you have your integral. Monte Carlo is one method to create that distribution. We are looking for another. 

It turns out that the Langevin equation
$$\frac{d\sigma}{dt}=-\frac12 \frac{dS}{d\sigma}+\eta(t)$$
creates precisely such a process. Here, \(S\) is the action for the process, and \(\eta(t)\) is a noise term with zero mean and unit variance:
$$\langle \eta(t)\eta(t^{\prime})\rangle=\delta(t-t^\prime).$$
Note that \(t\) here is a "fictitious" time: we use it only to create a set of $\sigma$s that are distributed according to the probability distribution \(P(\sigma)\) above. If we have this fictitious time series \(\sigma_0\) (the solution to the differential equation above), then we can just average the observable \(O(\sigma)\):
$$\langle O\rangle=\lim_{T\to\infty}\frac1T\int_0^TO(\sigma_0(t))dt$$
Let's try the "Langevin approach" to calculating averages on the example integral \(\langle \sigma^2\rangle_N\) above. The action we have to use is
$$S=\frac12 \sigma^2-N\ln [\cos(\sigma z)]$$ so that \(e^{-S}\) gives exactly the integrand we are looking for. Remember, all expectation values are calculated as
$$\langle O\rangle=\frac{\int O(\sigma) e^{-S(\sigma)}d\sigma}{\int e^{-S(\sigma)}d\sigma}.$$

With that action, the Langevin equation is
$$\dot \sigma=-\frac12(\sigma+Nz\tan(\sigma z))+\eta \ \ \ \      (1)$$
This update rule creates a sequence of $\sigma$ that can be used to calculate the integral in question.
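If you want to see what such a sequence looks like, here is a minimal sketch (my own discretization and parameter choices, not code from the paper) of update rule (1) for \(N=1\) and \(z=2\). The \(\delta\)-correlated noise turns into \(\sqrt{dt}\) times a unit Gaussian in a discrete time step. Run it a few times and see what kind of \(\langle\sigma^2\rangle\) you get.

```python
import numpy as np

rng = np.random.default_rng(0)
z, dt, n_steps = 2.0, 1e-3, 500_000
sigma, acc = 0.0, 0.0

for _ in range(n_steps):
    drift = -0.5 * (sigma + z * np.tan(sigma * z))   # -1/2 dS/dsigma for N = 1
    sigma += drift * dt + np.sqrt(dt) * rng.standard_normal()
    acc += sigma**2

print("real Langevin estimate of <sigma^2>:", acc / n_steps)
# compare to the exact answer 1 - z^2 = -3
```

(Notice, by the way, that a real \(\sigma\) can never produce a negative \(\langle\sigma^2\rangle\), while the exact answer at \(z=2\) is \(-3\). That alone tells you something has to give.)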

And the result is ..... a catastrophe! 

The average does not converge, mainly because in the differential equation (1), I ignored a drift term that goes like \(\pm i\delta(\cos(z\sigma))\). That it's there is not entirely trivial, but if you sit with that equation a little while you'll realize that weird stuff will happen if the cosine is zero. That term throws the trajectory all over the place once in a while, giving rise to an average that simply will not converge.

In the end, this is the sign problem raising its ugly head again. You do one thing, you do another, and it comes back to haunt you. Is there no escape?

You've been reading patiently so far, so you must have suspected that there is an escape. There is indeed, and I'll show it to you now.

This simple integral that we are trying to calculate
$$\frac1{\sqrt{2\pi}}\int_{-\infty}^\infty d\sigma e^{-(1/2)\sigma^2}\cos(\sigma z),$$
we could really write it also as
$$\frac1{\sqrt{2\pi}}\int_{-\infty}^\infty d\sigma e^{-(1/2)\sigma^2}e^{i\sigma z},$$
because the latter integral really has no imaginary part: the imaginary piece of the integrand is antisymmetric in \(\sigma\), and integrates to zero. 

This is the part that you have to understand to appreciate this article. And as a consequence this blog post.  If you did, skip the next part. It is only there for those people that are still scratching their head.

OK: here's what you learn in school: \(e^{iz}=\cos(z)+i\sin(z)\). This formula is so famous, it even has its own name. It is called Euler's formula. And \(\cos(z)\) is a symmetric function (it remains the same if you change \(z\to-z\)), while \(\sin(z)\) is antisymmetric (\(\sin(-z)=-\sin(z)\)). An integral from \(-\infty\) to \(\infty\) kills any antisymmetric part of the integrand: only the symmetric parts survive. Therefore, as long as everything else in the integrand is symmetric (and our Gaussian is), replacing \(\cos(\sigma z)\) by \(e^{i\sigma z}\) changes nothing. 
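Spelled out for the integral we actually care about (nothing deep here, just Euler's formula applied under the integral sign):
$$\int_{-\infty}^\infty d\sigma\, e^{-(1/2)\sigma^2}e^{i\sigma z}=\int_{-\infty}^\infty d\sigma\, e^{-(1/2)\sigma^2}\cos(\sigma z)\;+\;i\int_{-\infty}^\infty d\sigma\, e^{-(1/2)\sigma^2}\sin(\sigma z),$$
and the last integral vanishes because its integrand is antisymmetric in \(\sigma\).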

This is the one flash of brilliance in the entire paper: that you can replace a cos by a complex exponential if you are dealing with symmetric integrals. Because this changes everything for the Langevin equation (it doesn't do that much for the Monte Carlo approach). The rest was showing that this worked also for more complicated shell models of nuclei, rather than the trivial integral I showed you. Well, you also have to figure out how to replace oscillating functions that are not just a cosine, (that is, how to extend arbitrary negative actions into the complex plane) but in the end, it turns out that this can be done if necessary.

But let's first see how this changes the Langevin equation. 

Let's first look at the case \(N=1\). The action for the Langevin equation was 
$$S=\frac12\sigma^2-\log\cos(\sigma z)$$
If you replace the cos, the action instead becomes
$$S=\frac12\sigma^2\pm i\sigma z .$$ The fixed point of the differential equation (1), which was on the real line and therefore could hit the singularity \(\delta(\cos(z\sigma))\), has now moved into the complex plane. 
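To see that explicitly for \(N=1\) (sticking with my sloppy normalization), the Langevin equation that goes with the complexified action is
$$\frac{d\sigma}{dt}=-\frac12\frac{dS}{d\sigma}+\eta=-\frac12(\sigma\pm iz)+\eta,$$
and its drift vanishes at \(\sigma^*=\mp iz\): squarely off the real axis.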

And in the complex plane there are no singularities! Because they are all on the real line! As a consequence, the averages based on the complex action should converge! The sign problem can be vanquished just by moving to the complex Langevin equation!
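Here is the same minimal sketch as before (again mine, not the paper's code), with the only change being the complexified drift: the variable \(\sigma\) is now allowed to be complex, while the noise stays real.

```python
import numpy as np

rng = np.random.default_rng(0)
z, dt, n_steps = 2.0, 1e-3, 500_000
sigma, acc = 0.0 + 0.0j, 0.0 + 0.0j

for _ in range(n_steps):
    drift = -0.5 * (sigma + 1j * z)      # -1/2 dS/dsigma for S = sigma^2/2 + i sigma z
    sigma += drift * dt + np.sqrt(dt) * rng.standard_normal()
    acc += sigma**2

print("complex Langevin estimate of <sigma^2>:", (acc / n_steps).real)
# should come out close to the exact 1 - z^2 = -3
```

The imaginary part of \(\sigma\) relaxes to \(-z\), the real part rattles around with unit variance, and the time average of \(\sigma^2\) lands near \(1-z^2\): a negative "mean square", which no real trajectory could ever have given you.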

And that explains the title of the paper. Sort of. In the figure below, I show you how the complex Langevin equation fares in calculating that integral that, scrolling up all the way, gave rise to such bad error bars when using the Monte Carlo approach. And the triangles in that plot show the result of using a real Langevin equation. That's the catastrophe I mentioned: not only is the result wrong. It doesn't even have large error bars, so it is wrong with conviction! 

The squares (and the solid line) come from using the extended (complex) action in the Langevin equation. It reproduces the exact result precisely.


Average calculated with the real Langevin equation (triangles) and the complex Langevin equation (squares), as a function of the variable \(z\). The inset shows the "sign" of the integral, which still vanishes at large \(z\) even as the complex Langevin equation remains accurate.
The rest of the paper is somewhat anti-climactic. First I show that the same trick works in a quantum-mechanical toy model of rotating nuclei (as opposed to the trivial example integral). I offer the plot below from the paper as proof:
Solid line is exact theory, symbols are my numerical estimates. You've got to hand it to me: Complex Langevin rules.

But if you want to convince the nuclear physicists, you have to do a little bit more than solve a quantum mechanical toy model. Short of solving the entire beast of the nuclear shell model, I decided to tackle something in between: the Lipkin model (sometimes called the Lipkin-Meshkov-Glick model), which is a schematic nuclear shell model that is able to describe collective effects in nuclei. And the advantage is that exact analytic solutions to the model exist, which I can use to compare my numerical estimates to.

The math for this model is far more complicated and I spare you the exposition for the sake of sanity here. (Mine, not yours). A lot of path integrals to calculate. The only thing I want to say here is that in this more realistic model, the complex plane is not entirely free of singularities: there are in fact an infinity of them. But they naturally lie in the complex plane, so a random trajectory will avoid them almost all of the time, whereas you are guaranteed to run into them if they are on the real line and the dynamics return you to the real line without fail. That is, in a nutshell, the discovery of this paper. 

So, this is obviously not a well-known contribution. This is a bit of a bummer, because the sign problem still very much exists, in particular in lattice gauge theory calculations of matter at finite chemical potential (meaning, at finite density). Indeed, a paper came out just recently (see the arXiv link in case you ended up behind a paywall) where the authors try to circumvent the sign problem in lattice QCD at finite density by doing the calculations explicitly at high temperature using the old trick of doubling your degrees of freedom. Incidentally, this is the same trick that gives you black holes at Hawking temperature, because the event horizon naturally doubles degrees of freedom. I used this trick a lot when calculating Feynman diagrams in QCD at finite temperature. But that's a fairly well-known paper, so I can't discuss it here. 

Well, maybe some brave soul one day rediscovers this work, and  writes a "big code" that solves the problem once and for all using this trick. I think the biggest reason why this paper never got any attention is that I don't write big code. I couldn't apply this to a real-world problem, because to do that you need mad software engineering skills. And I don't have those, as anybody who knows me will be happy to tell you. 

So there this work lingers. Undiscovered. Lonely. Unappreciated. Like sooo many other papers by sooo many other researchers over time. If only there was a way that old papers like that could get a second chance! If only :-)

[1] H. Gausterer, Complex Langevin: A numerical Method? Nuclear Physics A 642 (1998) 239c-250c.
[2] C. Adami and S.E. Koonin, Complex Langevin equation and the many-fermion problem. Physical Review C 63 (2001) 034319. 













Friday, October 3, 2014

Nifty papers I wrote that nobody knows about: (Part 3: Non-equilibrium Quantum Statistical Mechanics)

This is the third part of the "Nifty Papers" series. Link to Part 1. Link to Part 2.

In 1999, I was in the middle of writing about quantum information theory with my colleague Nicolas Cerf. We had discovered that quantum conditional entropy can be negative, discussed this finding with respect to the problem of quantum measurement, separability, Bell inequalities, as well as the capacity of quantum channels. Heady stuff, you might think. But we were still haunted by Hans Bethe's statement to us that the discovery of negative conditional entropies would change the way we perceive quantum statistical physics. We had an opportunity to write an invited article for a special issue on Quantum Computation in the journal "Chaos, Solitons, and Fractals", and so we decided to take a shot at the "Quantum Statistical Mechanics" angle.

Because I'm writing this blog post in the series of "Articles I wrote that nobody knows about", you already know that this didn't work out as planned. 

Maybe this was in part because of the title. Here it is, in all its ingloriousness:
"Prolegomena to a non-equilibrium quantum statistical mechanics"
C. Adami & N.J. Cerf, Chaos Sol. Fract. 10 (1999) 1637-1650
There are many things that, in my view, conspired to get this paper summarily ignored. The paper has two citations, for what it's worth, and one is a self-citation! 

There is, of course, the reputation of the journal to blame. While this was a special issue that put together papers that were presented at a conference (and which were altogether quite good), the journal itself was terrible as it was being ruled autocratically by its editor Mohammed El Naschie, who penned and published a sizable fraction of the papers appearing in the journal (several hundred, in fact). A cursory look at any of these papers shows him to be an incurable (but certainly self-assured) crackpot, and he was ultimately fired from his position by the publisher, Elsevier. He's probably going to try to sue me just for writing this, but I'm trusting MSU has adequate legal protection for my views. 

There is, also, the fairly poor choice of a title. "Prolegomena?" Since nobody ever heard of this article, I never found anyone who would, after a round of beers, poke me in the sides and exclaim "Oh you prankster, choosing this title in homage to the one by Tom Banks!" Because there is indeed a paper by Tom Banks (a string theorist) entitled: "Prolegomena to a theory of bifurcating universes: A nonlocal solution to the cosmological constant problem or little lambda goes back to the future".  Seriously, it's a real paper, take a look:


For a reason that I can't easily reconstruct, at the time I thought this was a really cool paper. In hindsight it probably wasn't, but it certainly has been cited a LOT more often than my own Prolegomena. That word, by the way, is a very innocent Greek word meaning: "An introduction at the start of a book". So I meant to say: "This is not a theory, it is the introduction to something that I would hope could one day become a theory". 

There is also the undeniable fact that I violated the consistency of singular/plural usage, as "a" is singular, and "Statistical Mechanics" is technically plural, even though it is never used in the singular.

Maybe this constitutes three strikes already. Need I go on?

The paper begins with a discussion of the second law of thermodynamics, and my smattering of faithful readers has read my opinions about this topic before. My thoughts on the matter were born around that time, and this is indeed the first time that these arguments were put in print. It even has the "perfume bottle" picture that also appears in the aforementioned blog post.

Now, the arguments outlined in this paper concerning the second law are entirely classical (not quantum), but I used them to introduce the quantum information-theoretic considerations that followed, because the main point was that for the second law, it is a conditional entropy that increases. And it is precisely the conditional entropy that is peculiar in quantum mechanics, because it can be negative. So in the paper I'm writing about I first review that fact, and then show that the negativity of conditional quantum entropy has interesting consequences for measurements on Bell states. The two figures of the Venn diagrams of same-spin as opposed to orthogonal-spin measurements are reproduced here:

What these quantum Venn diagrams show is that the choice of measurement to make on a fully entangled quantum state \(Q_1Q_2\) will determine the relative state of the measurement devices (perfect correlation in the case of same-direction spin measurements, zero correlation in the case of orthogonal measurements), but the quantum reality is that the measurement devices are even more strongly entangled with the quantum system in the case of the orthogonal measurement, even though they are not correlated at all with each other. Which goes to show you that quantum and physical reality can be two entirely different things altogether.

I assure you these results are profound, and because this paper is essentially unknown, you might even try to make a name for yourself! By, umm, citing this one? (I didn't encourage you to plagiarize, obviously!)

So what comes after that? After that come the Prolegomena of using quantum information theory to solve the black hole information paradox!

This is indeed the first time that any of my thoughts on black hole quantum thermodynamics appear in print. And if you compare what's in this paper with the later papers that appeared first in preprint form in 2004, and finally in print in 2014, the formalism in this manuscript seems fairly distinct from these calculations.

But if you look closer, you will see that the basic idea was already present there.

The way I approach the problem is clearly rooted in quantum information theory. For example, people often start by saying "Suppose a black hole forms from a pure state". But what this really means is that the joint state between the matter and radiation forming the black hole, as well as the radiation that is being produced at the same time (which does not ultimately become the black hole) is in a pure state. So you have to describe the pure state in terms of a quantum Venn diagram, and it would look like this:
Entropy Venn diagram between the forming black hole ("proto-black hole" PBH) and a radiation field R. The black hole will ultimately have entropy \(\Sigma\), the entanglement entropy.
Including this radiation field R entangled with the forming black hole is precisely the idea of stimulated emission of radiation that ultimately would solve all the black hole information paradoxes: it was clear to me that you could not form a black hole without leaving an entangled signature behind. I didn't know at the time that R was stimulated emission, but I knew something had to be there. 

Once the black hole is formed, it evaporates by the process of Hawking radiation. During evaporation, the black hole becomes entangled with the radiation field R' via the system R:
Entropy Venn diagram between radiation-of-formation R, the black hole BH, and the Hawking radiation R'. Note that the entropy of the black hole \(S_{\rm BH}\) is smaller than the entropy-of-formation \(\Sigma\) by \(\Delta S\), the entropy of the Hawking radiation. 
The quantum entropy diagram of three systems is characterized by three (and exactly three) variables, and the above diagram was our best bet at this diagram. Note how the entire system has zero entropy and is highly entangled, but when tracing out the radiation-of-formation, the black hole is completely uncorrelated with the Hawking radiation as it should be. 

Now keep in mind, this diagram was drawn up without any calculation whatsoever. And as such, it is prone to be dismissed as a speculation, and it was without doubt a speculation at the time. Five years later I had a calculation, but its acceptance would have to wait for a while.

In hindsight, I'm still proud of this paper. In part because I was bold enough to pronounce the death of the second law as we know it in print, and in part because it documents my first feeble attempts to make sense of the black hole information morass. This was before I had made any calculations in curved space quantum field theory, and my ruminations can therefore easily be dismissed as naive. They were naive (for sure), but not necessarily stupid.

Next week, be prepared for the last installment of the "Nifty Papers" series. The one where I single-handedly take on the bugaboo of computational physics: the "Sign Problem". That paper has my postdoctoral advisor Steve Koonin as a co-author, and he did provide encouragement and helped edit the manuscript. But by and large, this was my first single-author publication in theoretical/computational physics. And the crickets are still chirping....






Sunday, September 28, 2014

Nifty papers I wrote that nobody knows about: (Part 2: Quark-Gluon Plasma)

This is Part 2 of the "Nifty papers" series, talking about papers of mine that I think are cool, but that have been essentially ignored by the community. Part 1 is here.

This is the story of my third paper, still as a graduate student (in my third year) at Stony Brook University, on Long Island.

Here's the title:

Physics Letters B 217 (1989) 5-8
First things first: What on earth is "Charmonium"? To answer this question, I'll give you in what follows a concise introduction to the theory of quarks and gluons, known as "Quantum Chromodynamics".

Just kidding, of course. The Wiki article I linked above should get you far enough for the purposes of this article right here. But if this is TL;DR for you, here's the synopsis:

There are exactly six quarks in this universe: up (u), down (d), strange (s), charm (c), bottom (b, also sometimes "beauty"), and top (t).

These are real, folks. Just because they have weird names doesn't mean you don't carry them in every fibre of your body. In fact you carry only two types of quarks with you, really: the u and d, because they make up the protons and neutrons that make all of you: proton=uud, neutron=udd: three quarks for every nucleon. 

The s, c, and b quarks exist only to annoy you, and provide work for high-energy experimental physicists! 

Just kidding again, of course. The fact that they (s,c, and b) exist provides us with a tantalizing clue about the structure of matter. As fascinating as this is, you and I have to move on right now. 

For every particle, there's an anti-particle. So there have to be anti-u, and anti-d. They make up almost all of anti-matter. You did know that anti-matter is a real thing, not just an invention of Sci-Fi movies, right?

The particles that make up all of our known matter (and energy). The stuff that makes you (and almost all of our known universe) is in the first and 4th column. I'm still on the fence about the Higgs. It doesn't look quite right in this graph, does it?  Kind of like it's a bit of a mistake? Or maybe because it really is a condensate of top quarks? Source: Wikimedia
Right. You did. Good thing that. So we can move on then. So if we have u and d, we also must have anti-u and anti-d. And I'm sure you already did the math on charge to figure out that the charge of u better be +2/3, and the charge of d is necessarily -1/3. Because anti-matter has anti-charge, duh. If you're unsure why, contemplate the CPT theorem. 

Yes, quarks have fractional charges. If this blows your mind, you're welcome. And this is how we make one positive charge for the proton (uud), and a neutral particle (the neutron) from udd.

But the tinkerer in you has already set the brain gears in motion: what prevents me from making a (u-anti u), (d-anti d), (u-anti d), (d-anti u), etc.?

The answer is: nothing. They exist. (Next time, come up with this discovery when somebody else has NOT YET claimed a Nobel for it, OK?) These things are called mesons. They are important. I wrote my very first paper on the quantization of an effective theory that would predict how nucleons (you remember: protons and neutrons, meaning "we-stuff") interact with pions (a type of meson made only of u,d, anti-u, and anti-d), as discussed here.

What about the other stuff, the "invisible universe" made from all the other quarks, like strange, charm, bottom, and top? Well, they also form all kinds of baryons (the word that describes all the heavy stuff, such as protons and neutrons) and mesons (the semi-heavy stuff). But they tend to decay real fast.

But one very important such meson--both in the history of QCD and our ability to understand it-- is the meson called "charmonium".

I did tell you that it would take me a little time to get you up to date, right? Right. So, Charmonium is a bound state of the charm and anti-charm quark.

(By the way, if there is anybody reading this that still thinks: "Are you effing kidding me with all these quark names and stuff, are they even real?", please reconsider these thoughts, because they are like doubting we landed on the moon. We did, and there really are six quarks, and six anti-quarks. We are discussing their ramifications here. Thank you. Yes, "ramification" is a real word, that's why I linked to a dictionary. Yes, those Wiki pages on science are not making things up. Now, let's move on, shall we?)

The reason why we call the \({\rm c}\bar {\rm c}\) meson "charmonium" is because we have a name for the bound state of the electron and positron (also known as the anti-electron): we call it Positronium. Yes, that's a real thing. Not related to the Atomium, by the way. That's a real building, but not a real element.

So why is charmonium important? To understand that, we have to go back to the beginning of the universe.

No, we don't have to do it by time travel. Learning about charmonium might allow you to understand something about what was going on when our universe was really young. Like, less than a second young. Why would we care about these early times? Because they might reveal to us clues about the most fundamental laws of nature. Because the state of matter in the first few milliseconds (even the first few microseconds) might have left clues for us to decipher today.

At that time (before a millisecond), our universe was very different from how we see it today. No stars, no solar systems. We didn't even have elements. We didn't have nuclei! What we had was a jumble of quarks and gluons, which one charitable soul dubbed the "quark gluon plasma" (aka: QGP). The thing about a plasma is that positive and negative charges mill about unhindered, because they have way too much energy to form these laid-back bound states that we might (a few milliseconds later) find everywhere. 

So, here on Earth, people have been trying to recreate this monster of a time when the QGP reigned supreme, by shooting big nuclei onto other big nuclei. The idea here is that, for a brief moment in time, a quark gluon plasma would be formed that would allow us to study the properties of this very very early universe first hand. Make a big bang at home, so to speak. Study hot and dense matter.

While contemplating such a possibility at the RHIC collider in Brookhaven, NY (not far from where I was penning the paper I'm about to talk to you about), a pair of physicists (Tetsuo Matsui and Helmut Satz [1]) speculated that charmonium (you know, the \(\bar c c\) bound state) might be seriously affected by the hot plasma. In the sense that you could not see the charmonium anymore.

Now, for ordinary matter, the \(J/\psi\) (as the lowest energy state of the charmonium system is called for reasons I can't get into) has well known properties. It has a mass of 3.1 GeV (I still know this by heart), and a short but measurable lifetime. Matsui and Satz in 1986 speculated that this \(J/\psi\) would look very different if it was born in the midst of a quark gluon plasma, and that this would allow us to figure out whether such a state of matter was formed: all you have to do is measure the \(J/\psi\)'s properties: if it is much reduced in appearance (or even absent), then we've created a quark gluon plasma in the lab.

It was a straightforward prediction that many people accepted. The reason why the \(J/\psi\) would disappear in a QGP according to Matsui and Satz was the phenomenon of "color screening". Basically, the collision would liberate so many color charges (quarks, anti-quarks, and gluons) that they would provide a "screen" against the formation of the meson. It is as if a conversation shouted over long distances is disrupted by a bunch of people standing in between, whispering to each other. 

For a reason I cannot remember, Ismail Zahed (my Ph.D. advisor) and I came to doubt this scenario. We were wondering whether it was really the "hotness" of the plasma (creating all these screening pairs) that destroyed the \(J/\psi\). Could it instead be destroyed even if a hot plasma was not formed?

Heavy ion collision in the rest system of the target (above), and in the center of mass rest system (below)

The image we had in our heads was the following. When a relativistically accelerated nucleus hits another nucleus, then in the reference frame of the center of mass of both nuclei, both are moving (from this center of mass, you see two nuclei coming at you at crazy speed). And when two nuclei move relativistically, their longitudinal dimension (in the direction of motion) contracts, while the transverse directions remain unchanged. This means that the shapes of the nuclei are affected: instead of spherical nuclei they appear to be squeezed, as the image above suggests.

When looked at from this vantage point, a very different point of view concerning the disappearance of the \(J/\psi\) can be had. Each of the nuclei creates around it a color-electric and color-magnetic field, because of all the gluons exchanged between the flat nuclei. Think of it in terms of electrodynamics as opposed to color dynamics: if the two nuclei were electrical conductors, they would set up an electric field between them. Indeed, a set of conducting plates separated by a small distance is a capacitor. So, could it be that in such a collision, instead of all that hot screening, all that happens is the formation of a color-electric capacitor that simply rips the \(J/\psi\) to pieces?

That's the question we decided to check, by doing an old-fashioned calculation. How do you do this? I reasoned, more or less, that if I am going to calculate the fate of a bound state within a color-electric field, I ought to know how to calculate the fate of a bound state in an electric field. Like, for example: who calculated what happens to the hydrogen atom if it is placed between the plates of a capacitor? Today, I would just google the question, but in 1988, you had to really search. But after searching (I spent a lot of time in the library in those days) I hit paydirt. A physicist by the name of Cornelius Lanczos had done exactly that calculation (the fate of a hydrogen atom in a strong electric field). What he showed is that in strong electric fields, the electron is ripped off of the proton, leading to the ionization of the hydrogen atom.

This was the calculation I was looking for! All I had to do was take the potential (namely, the standard Coulomb potential of electrodynamics) and replace it by the effective potential of quantum chromodynamics.

Now, both you and I know that if we don't have a potential, then we can't calculate the energy of the bound state. And the potential for color-electric flux tubes (as opposed to the exchange of photons, which gives rise to the electromagnetic forces as we all know) ought to be notably different from the Coulomb potential. 

No, I'm not known to be sidetracked by engaging in a celebration of the pioneers of quantum mechanics. But the career of Lanczos should give you pause. The guy was obviously brilliant (another one of the Hungarian diaspora) but he is barely remembered now. Spend some time with his biography on Wikipedia: there are others besides Schrödinger, Heisenberg, Planck, and Einstein that advanced our understanding of not just physics, but in the case of Lanczos, computational physics as well.

So I was sidetracked after all. Moving on. So, I take Lanczos's calculation, and just replace the Coulomb potential by the color-electric potential. Got it? 

Easier said than done! The Coulomb potential is, as everybody knows, \(V(r)=-\frac{e^2}r\). The color-electric potential is (we decided to ignore color-magnetic forces for reasons that I don't fully remember, but that made perfect sense then)
$$V(r)=-\frac43\frac{\alpha_s}r+\sigma r.      (1)$$

"All right", you say, "what's all this about?"

Well, I respond, you have to understand that when it comes to color-electric (rather than just electric) effects, the coupling constant is not the electron charge squared, but 4/3 times the strong coupling constant \(\alpha_s\).
"But why 4/3?"

Ok, the 4/3 is tricky. You don't want to see this, trust me. It's not even in the paper. You do want to see it? OK. All others, skip the colored text.

How to obtain the quark-antiquark Coulomb potential
To calculate the interquark potential you have to take the Fourier transform of the Feynman diagram of quark-anti-quark scattering:

The solid lines are quarks or anti-quarks, and the dashed line is the gluon exchanged between them. Because the gluon propagator \(D^{-1}_{ab}\) is diagonal in color, the amplitude of the process is governed by the expectation value of \(\vec T_1\cdot\vec T_2\), the product of the color charges of the quark and the anti-quark. \(T^{(a)}\) is the generator of the symmetry group of quarks SU(3), given by \(T^{(a)}=\frac12\lambda^a\). And \(\lambda^a\) is a Gell-Mann matrix. There are eight of them. What the value of \(\langle \vec T_1\cdot\vec T_2\rangle\) is depends on the representation the pair of quarks is in. A little calculation shows that for a quark-antiquark pair in a singlet state, \(\langle \vec T_1\cdot\vec T_2\rangle=-4/3\). If the pair is in an octet state, this same expectation value gives you 1/6, meaning that the octet is unbound. 
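In case you want to see that "little calculation" (my shorthand here, not necessarily the paper's notation): write \(\vec T=\vec T_1+\vec T_2\) for the total color charge of the pair, so that
$$\langle\vec T_1\cdot\vec T_2\rangle=\tfrac12\left[\langle\vec T^2\rangle-\langle\vec T_1^2\rangle-\langle\vec T_2^2\rangle\right]=\tfrac12\left[C_R-\tfrac43-\tfrac43\right],$$
where \(C_R\) is the Casimir of the representation the pair sits in: \(C_R=0\) for the singlet (giving \(-4/3\), attraction) and \(C_R=3\) for the octet (giving \(+1/6\), repulsion), while each quark or anti-quark by itself carries the fundamental Casimir \(4/3\).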

More interesting than the Coulomb term is the second term in the potential (1), the one with the wavy thing in it. Which is called a "sigma", by the way.

"What of it?"

Well, \(\sigma\) is what is known as the "string tension". As I mentioned earlier, quarks and anti-quarks can't just run away from each other (even if you gave them enough energy). In strong interactions, the potential energy between a quark and an anti-quark increases in proportion to their distance, so the attractive force between them never dies off: this is called "confinement". At short distances, on the other hand, quarks get to be (nearly) free, which in the lingo of strong interaction physics is called "asymptotic freedom". Not so free if they attempt to stray, I'm afraid.

So suppose we insert this modified potential, which looks just like a Coulomb potential but has this funny extra term, into the equations that Lanczos wrote down to show that the hydrogen atom gets ripped apart by a strong electric field?

Well, what happens (after a bit of a song and dance that you'll have to read about by yourself) is that a color-electric field strength only marginally larger than the string tension \(\sigma\) is sufficient to rip apart the charmonium bound state. Rip apart, as in disintegrate. The color-electric field generated by the colliding nuclei will kill the charmonium, but not because a hot quark-gluon plasma creates a screening effect: it is because the cold color-electric field rips the bound state apart!
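
To get some intuition for why the string tension sets the critical field, picture the potential (1) tilted by an external color-electric field, in the same spirit as the Stark-effect tilt in Lanczos's hydrogen calculation: a term \(-Er\) gets added, and once \(E\) exceeds \(\sigma\) the confining wall at large separation is gone. Here is a rough numerical sketch of that intuition (the coupling and string-tension values below are illustrative ballpark numbers, not the ones from the paper):

```python
import numpy as np

alpha_s = 0.4    # illustrative strong coupling constant
sigma   = 0.18   # illustrative string tension in GeV^2 (a commonly quoted ballpark)

def V(r, E=0.0):
    """Potential (1), tilted by an external color-electric field of strength E.
    Natural units: r in GeV^-1, V in GeV."""
    return -4.0 / 3.0 * alpha_s / r + (sigma - E) * r

r = np.linspace(0.1, 20.0, 2000)
for E in (0.0, 0.5 * sigma, 1.1 * sigma):
    print(f"E = {E/sigma:3.1f} sigma: V at large r = {V(r[-1], E):6.2f} GeV")

# The point: for E < sigma the potential still climbs without bound (the pair stays
# confined), while for E > sigma it turns over and heads downhill at large r,
# so the charm and anti-charm quark can be pulled apart -- field ionization.
```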

The observable result of these two very different mechanisms might look the same, though: the number of \(J/\psi\) particles you actually see is strongly reduced compared to what you would otherwise expect.

So what have we learned here? One way to look at this is to say that a smoking gun is only a smoking gun if there are no other smoking guns nearby. A depletion of \(J/\psi\)s does not necessarily signal a quark gluon plasma.

But this caveat went entirely unheard, as you already know because otherwise I would not be writing about this here. Even though we also published this same paper as a conference proceeding, nobody wanted to hear about something doubting the holy QGP.

Is the controversy resolved today? Actually, it is still in full swing, almost 30 years after the Matsui and Satz paper [1], and 25 years after my summarily ignored contribution. How can this still be a mystery? After all, we have had more and more powerful accelerators attempt to probe the elusive QGP. At first it was CERN's SPS, followed by RHIC at Brookhaven (not far from where I wrote the article in question). And after RHIC there was the LHC, which, after basking in the glory of the Higgs discovery, needed something else to do and turned its attention to.... the QGP and \(J/\psi\) suppression!

The reason this is not yet solved is that the signal of \(J/\psi\) suppression is tricky. What you want to do is compare the number of \(J/\psi\) produced in a collision of really heavy nuclei (say, lead on lead) with the number produced when solitary protons hit other protons, scaled by the number of binary nucleon-nucleon collisions in the lead-on-lead event. Except that in the messy situation of lead-on-lead, a \(J/\psi\) can be produced at the edge rather than the center, be ripped apart, re-form, etc. Taking all these processes into account is tedious and messy.
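
In the jargon, that comparison gets packaged into a single number, the nuclear modification factor \(R_{AA}\): the yield per heavy-ion collision divided by the proton-proton yield scaled up by the number of binary nucleon-nucleon collisions. A toy version of the bookkeeping (all numbers below are made up for illustration, not measured values):

```python
def nuclear_modification_factor(yield_AA, yield_pp, n_coll):
    """R_AA = (J/psi yield per AA event) / (n_coll * J/psi yield per pp event).
    R_AA near 1 means no nuclear effect; R_AA < 1 means suppression."""
    return yield_AA / (n_coll * yield_pp)

# Made-up illustrative yields per event, chosen only to show the arithmetic:
r_aa = nuclear_modification_factor(yield_AA=4.0e-4, yield_pp=1.0e-6, n_coll=1000)
print(f"R_AA = {r_aa:.2f}")   # 0.40 -> the J/psi is suppressed, by one mechanism or another
```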

So the latest news is: yes, the \(J/\psi\) is suppressed in these collisions. But whether it is due to color screening, as the standard picture of the QGP suggests, or whether it is because a strong color-electric field rips these states apart (which could happen even if there is no plasma present at all, as I have shown in the paper you can download from here), this we do not yet know. After all this time.

[1] T. Matsui and H. Satz, “J/ψ Suppression by Quark-Gluon Plasma Formation,” Phys. Lett. B 178 (1986) 416.

Now, move over to Part 3, where I awkwardly explain the meaning of the word "Prolegomena".

Monday, September 15, 2014

Nifty papers I wrote that nobody knows about (Part I: Solitons)

I suppose this happens even to the best of us: you write a paper that you think is really cool and has an important insight in it, but nobody ever reads it. Or if they read it, they don't cite it. I was influenced here by the blog post by Claus Wilke, who argues that you should continue writing papers even if nobody reads them. I'm happy to do that, but I also crave attention. If I have a good idea, I want people to notice. 

The truth is, there are plenty of papers out there that are true gems and that should be read by everybody in the field, but are completely obscure for one reason or another. I know this to be true but I have little statistical evidence because, well, the papers I am talking about are obscure. You can actually use algorithms to detect these gems, but they usually only find papers that are already fairly well known. 

In fact, this is just common sense: once in a while a paper just "slips by". You chose a bad title, you submitted to the wrong journal, you wrote in a convoluted manner. But you had something of value. Something that is now, perhaps, lost. One of my favorite examples of this sort of overlooked insight is physicist Rafael Sorkin's article "A Simple Derivation of Stimulated Emission by Black Holes", familiar to those of you who follow my efforts in this area. The article has 10 citations. In my view, it is brilliant and ground-breaking in more than one way. But it was summarily ignored. It still is, despite my advocacy.

I was curious how often this had happened to me. In the end the answer is: not so much, actually. I counted four papers that I can say have been "overlooked". I figured I would write a little vignette about each of them, why I like them (as opposed to the rest of the world), and what may have gone wrong (meaning: why nobody else likes them).

Here are my criteria for a paper to be included into the list:

1.) Must be older than ten years. Obviously, papers written within the last decade may not have had a significant amount of time to "test the waters". (But truthfully, if a paper does not get some citations within the first five years, it probably never will.)

2.) Must have had fewer than 10 citations on Google Scholar (excluding self-citations).

3.) Must not be a re-hash of an idea published somewhere else (by me) where it did get at least some attention.

4.) Must not be a commentary about somebody else's work (obvious, this one). 

5.) Must be work that I'm actually proud of. 

When going through my Google Scholar list, I found exactly four papers that meet these criteria. 

(Without taking into account criterion 5, the list is perhaps twice as long, mind you. But some of my work is just not that interesting in hindsight. Go figure.)

These are the four papers in the final list:

1. Soliton quantization in chiral models with vector mesons, C Adami, I Zahed (1988)
2. Charmonium disintegration by field-ionization, C Adami, M Prakash, I Zahed (1989)
3. Prolegomena to a non-equilibrium quantum statistical mechanics, C Adami, NJ Cerf (1999)
4. Complex Langevin equation and the many-fermion problem, C Adami, SE Koonin (2001).

I will publish a blog post about one of these in each of the coming weeks.

I'll start in chronological order:

Physics Letters B 215 (1988) 387-391. Number of citations: 10 

This is actually my first paper ever, written at the tender age of 25. But it didn't get cited nearly as much as the follow-up paper, which was published a few months earlier: Physics Letters B 213 (1988) 373-375. 

How is this possible, you ask? 

Well, the editors at Physics Letters lost my manuscript after it was accepted, is how it happened! 

You have to remember that this was "the olden days". We had computers all right. But we used them to make plots and send Bitnet messages. You did not send electronic manuscripts to publishers. Those were sent around in large manila envelopes. And one day I get the message (after the paper was accepted): "Please send another copy, we lost ours". Copies, actually, in triplicate, because each reviewer gets a copy that you send in, of course. I used to keep all the correspondence about manuscripts from those days, but I guess after moving offices so many times, at some point stuff got lost. So I can't show you the actual letter that said this (I looked for it). Of course, after that mishap the editorial office used a new "received" date, just so that it wouldn't look so embarrassing. And the arXiv wouldn't exist for another four years to prove my point.

So that's probably the reason why the paper didn't get cited: people cited the second one that was published first, instead. But what is this paper all about?

It is about solitons, and how to quantize them. Solitons were my first exposure to theoretical physics in a way, because I had to give a talk about topological solitons called "Skyrmions" in a theoretical physics seminar at Bonn University in, oh, 1983. Solitons are pretty cool things: they are really waves that behave like particles. You can read a description of how they were discovered by John Scott Russell, riding his horse alongside a canal in Scotland and noticing this wave that just... wouldn't... dissipate, here.
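
If you want to see a soliton without a horse and a canal, the simplest textbook example (not the Skyrmion; just the classic shallow-water wave of the Korteweg-de Vries equation, which is the modern description of what Russell saw) is a sech-squared bump that travels at speed \(c\) without ever changing its shape. A minimal Python sketch, just to make the "wave that behaves like a particle" idea concrete:

```python
import numpy as np

def kdv_soliton(x, t, c=1.0):
    """One-soliton solution of the KdV equation u_t + 6 u u_x + u_xxx = 0:
    a hump of height c/2 moving at speed c without changing shape."""
    return 0.5 * c / np.cosh(0.5 * np.sqrt(c) * (x - c * t)) ** 2

x = np.linspace(-20.0, 20.0, 401)
for t in (0.0, 5.0, 10.0):
    u = kdv_soliton(x, t)
    print(f"t = {t:4.1f}: peak height = {u.max():.3f}, peak position = {x[u.argmax()]:.1f}")
# The height stays at 0.500 while the position moves at speed c = 1:
# the wave neither disperses nor dissipates, which is what makes it a soliton.
```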

Now, there is a non-linear field theory due to T.H.R. Skyrme that has such soliton solutions, and people suggested that maybe these Skyrmions could describe a nucleon. You know, the thing you are made of, mostly? A nucleon is a proton or a neutron, depending on charge. Nuclei are made from them. You are all nucleons and electrons, really. Deal with it. 

Skyrme, incidentally, is the one who died just days after I submitted the very manuscript I'm writing about, which started the rumour that my publications are lethal. Irrelevant fact, here. 

Skyrme's theory was a classical one, and so the question arose of what happens if you quantize that theory. This is an interesting question because usually, if you quantize a field you create fluctuations of that field, and if these fluctuations are of the right kind, they should (if they fluctuate around a nucleon) describe pions. And voilà: we would have a theory that describes how nucleons have to interact with pions. 

What are pions, you ask? Go read the Wiki page about them. But really, they are the stuff you get if you bang a nucleon and an anti-nucleon together. They have a quark and an anti-quark in them, as opposed to the nucleons, which have three quarks: Three quarks for Muster Mark!

Now, people actually already knew at the time what such an interaction term was supposed to look like: the so-called pion-nucleon coupling. But if the term that comes out of quantizing Skyrme's theory did not look like this, well then you could safely forget about that theory being a candidate to describe nucleons. Water waves maybe, just not the stuff we are made out of.  

So I started working this out, using the theory of quantization under constraints that Paul Dirac developed, because we (my thesis advisor Ismail Zahed and I) had stabilized the Skyrmion using another meson, namely the ω-meson. You don't have to know what this is, but what is important here is that the components of the ω field are not independent, and therefore you have to quantize under that constraint.

You very quickly run into a problem: you can't quantize the field, because there are fluctuation modes that have zero energy. Indeed, in order to do the quantization you have to invert the matrix of fluctuations, and these zero modes make that matrix singular (its determinant vanishes), so it cannot be inverted. What to do?

The answer is: you find out what those zero modes are, and quantize them independently. It turns out that those zero modes were really rotations in "isospin-space", and they naturally have zero energy because you can rotate that soliton in iso-space and it costs you nothing. I figured out how to quantize those modes by themselves (you just get the Hamiltonian for a spinning top out of that), then project out these zero modes from the Skyrmion fluctuations, and quantize only those modes that are orthogonal to the zero modes. And that's what I proceeded to do. Easy as pie.
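
The linear-algebra part of that trick is easy to sketch: diagonalize the (symmetric) fluctuation matrix, set the zero-eigenvalue directions aside to be treated separately (for the Skyrmion these are the iso-rotations, which become the spinning-top Hamiltonian), and invert only on the subspace orthogonal to them. Here is a toy numpy version; the matrix is made up purely to show the mechanics, it is not the actual Skyrmion fluctuation operator:

```python
import numpy as np

rng = np.random.default_rng(0)

# A random symmetric "fluctuation matrix" that we force to have two exact zero modes.
A = rng.normal(size=(6, 6))
M = A + A.T
zero_modes = np.linalg.qr(rng.normal(size=(6, 2)))[0]   # two orthonormal directions
P = np.eye(6) - zero_modes @ zero_modes.T               # projector onto their complement
M = P @ M @ P                                           # now M annihilates the zero modes

eigval, eigvec = np.linalg.eigh(M)
nonzero = np.abs(eigval) > 1e-10
print("number of zero modes:", (~nonzero).sum())        # -> 2

# Invert M only on the subspace orthogonal to the zero modes.
Q = eigvec[:, nonzero]
M_inv = Q @ np.diag(1.0 / eigval[nonzero]) @ Q.T
# Sanity check: on that subspace, M_inv really is the inverse (M @ M_inv = projector Q Q^T).
print(np.allclose(M @ M_inv, Q @ Q.T))                  # -> True
```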

And the result is fun too, because the interaction term that comes out looks almost like the one we should have gotten, and then we realized that the "standard" term of chiral theory emerges in a particular limit, known as the "strong coupling" limit. Even better, using this interaction I could calculate the mass of the first excitation of the nucleon, the so-called Δ resonance. That would be the content of the second paper, which you now know actually got published first, and stole the thunder of this pretty calculation.  

So what did we learn in this paper, in hindsight? Skyrmions are actually very nice field-theoretic objects, and the effective theory (while obviously not the full underlying theory that should describe you, namely the theory of quarks and gluons called Quantum Chromodynamics, or QCD) can give you very nice predictions about low-energy hadronic physics, where QCD itself is not at all predictive. That's because we can only calculate QCD in the high-energy limit (for example, what happens when you shoot quarks at quarks with lots of energy). Research on Skyrmions (and low-energy effective theories in general) is still going strong, it turns out. And perhaps even more surprising is this: there is now a connection (uncovered by my former advisor) between these Skyrmions and the holographic principle.

So even old things turn out to be new sometimes, and old calculations can still teach you something today. Also we learn: electronic submissions aren't as easily lost behind file cabinets. So there is that.

Next up: Charmonium Disintegration by Field-Ionization [Physics Letters B 217 (1989), 5-8]. A story involving the quark-gluon plasma, and how an old calculation by Cornelius Lanczos from 1930, when suitably modernized, can shed light on what happens to the \(J/\psi\). All of 5 citations on Google Scholar this one got. But what a fun calculation! Read on here.