
Monday, October 14, 2024

What gets your attention? Brain evolution suggests a new theory

 Are you sure you know what's going on in your brain? Are you really the master of your thoughts?


"Of course I am!" you exhort me for asking, "Who else would be in control?" 


I won't actually be discussing multiple personalities in this post. What I will ask instead is: "How do you make decisions? What is the basis of these decisions? Can you always trust what you base your decisions on?"


A more provocative question would be: "How easily are you fooled?" Or more precisely,


"What happens in your brain when you are fooled?"


Everybody knows we get fooled a lot. A prime example of this tomfoolery is optical illusions. Many have played around with these before, but if you never have, let me suggest a couple of sites that are really good.


Michael Bach: Visual Phenomena & Optical Illusions


Gizmodo: Optical Illusions that might break your mind


In particular, Michael Bach's page explains some of the cognitive science of perception that underlies each illusion. But basically, what it boils down to is that it is not your eyes that see, it is your brain.


What this implies is that your brain makes a lot of assumptions about what it perceives, and it makes these assumptions because they save a lot of time and energy.


I'll give you an example (not from vision). Suppose you have reached your car, put your hand in your pocket, and find that the pocket is empty. The place where you were sure the key was does not actually hold it. What do you do now?


If the key is not in your pocket, it could be anywhere, right? But do you start checking possible locations randomly? Of course not. You first try the other pocket. Why? Because this is the next most likely location? Why do you think that?


The answer is that this is guided by experience. You know where keys are likely to be. If they are not in any of your pockets, they might still be on the desk. If that's not the case, you check on the counter. And so on. Experience has created a model of "where keys usually are to be found" in your head, and you are using this model to drastically reduce the time it takes you to find the key. You are not conducting a random search here.


Your visual system operates on a similar premise, except not only does it take into account past experience of visual scenes, but there are evolutionarily hardwired assumptions at work here too. "Chairs are usually on the floor. Windows separate the inside from the outside. Objects that are close appear to be larger." These are just a few of the things your brain "knows" about visual scenes. In fact, it is known that when we look at a scene, we barely take in anything of what is out there. Instead, we make most of it up: we hallucinate it. It works really well because most of the stuff in visual scenes is always the same. Chairs are always on the floor. Windows always separate the inside from the outside. Our eyes will saccade over the scene to perform a few spot checks, to fill in the stuff that is difficult to synthesize. Is it day or night? What kind of a chair is it? What color? We need to look directly at an object to notice its color, because we can only see color in a narrow cone surrounding the point we focus: everything that is outside of about 5 degrees of your focus is grey-scale (see below).


Central visual field as described in the XKCD comic. 

From https://xkcd.com/1080/ (licensed under CC Attribution-NonCommercial 2.5 License)


It is because of all these assumptions that our brain makes (in order to save time) that we are so easily fooled. We are fooled precisely when these assumptions are subtly violated.


So now you understand that you can make decisions based on what you thought you saw (even when you did not actually see it), but how exactly are you being fooled? What did your brain actually do when it suggested falsehoods to you?

You can imagine that this is difficult to figure out in practice, for two reasons: 1.) people tend not to like having electrodes stuck in their brains to record their thoughts, and 2.) even if we did that, we may not be able to figure out what really happened in the brain just from those recordings. (Because, duh, we have a hard time interpreting those recordings.) So how can we then test hypotheses about how we reach decisions, in particular faulty ones?

Let me indulge in a quick interlude about the importance of models in science. In areas other than brain/cognitive sciences, we often create models of the process we are interested in, and then test different hypotheses within that model. If the model makes predictions that are similar to what you observe in real life, then you have.... a model that at least does not contradict real life. You still don't know what's going on in real life, of course. But you can put the model through its paces. You can test it in many different scenarios. If you cannot get a discrepancy, as hard as you may try, then maybe your model actually incorporates something that is quite similar to what you have in real life. And I can already sense your question:
 
 "Could you do this with brains? Can you make brain models that behave just like human brains?"

So what you are really asking is: can you make Artificial Intelligence? Well, up to now, nobody has succeeded. And you, who have been reading this blog since the very beginning (or at least since I wrote "Your Conscious You", haven't you?), already know how I'm going to answer this. We can make it alright, but we can't design it, because we don't know how brains work, really. Instead, we're going to use the process that already did the "making" at least once: Darwinian evolution. And we can do this because evolution will produce something that works whether we understand that working thing or not. So in this post, I'm going to tell you about an experiment where we evolved brains to solve a task that humans can fairly easily solve in the lab. And then, we'll test whether these artificial brains pass some other tests that are the equivalent of, say, a "brain-permit". (If you do not pass this, you can't be called a brain.) And after the brains that we make pass this test, we're going to try to fool them. We will then find that they get fooled just as easily as humans get fooled, but we can look into these brains and figure out how and why they are fooled. And from that we learn a lot about how our brain works. And it is all in a paper that you'll get to read, of course. (I'm not being very original here, I know. There's always a paper.)


What kind of brain should we evolve? You may think that because I write so much about vision that I'll evolve a brain that is fooled by visual illusions. But there isn't anyone I know that does experiments with such illusions here at MSU. When I was at Caltech, my colleague and friend Christof Koch was doing psychophysics experiments all the time, and I probably could have used some of his data sets for this purpose. At MSU, it turns out that there is a lab that also focuses on psychophysics in a way, by studying how people perceive sound and music: the "Timing, Perception, and Action" Lab, led by Devin McAuley in the Department of Psychology. Devin has been doing some really interesting experiments on how people perceive rhythmic sequences. When I called him up to discuss a possible collaboration, he told me about this really interesting experiment he did with people. And when I say "people", I mean undergrad students that were paid $10 per hour. Shamefully, this is better than federal minimum wage.

The task is really simple: You put on headphones and listen to a repeating beep. It repeats rhythmically. At one point (you are not told when) a beep occurs that is distinguished by being at a higher frequency. You are then asked whether this beep is longer or shorter than the background beeps this "oddball beep" was embedded in. You are exposed to longer and shorter oddball tones equally, and you think "This really isn't all that hard!" People are indeed pretty good at this task, as long as the difference between the background and the oddball is noticeable. And that is actually the "brain-permit" part I mentioned. There is a psychometric relationship called "Weber's Law" that describes a subject's (or for that matter, any detector's) ability to perceive relative differences. The main idea of this law (sometimes called the Weber-Fechner Law) is that "relative differential sensitivity remains the same regardless of size of stimulus". Basically, this means that the smallest noticeable difference is proportional to how strong the signal is. You already have an intuitive understanding of this law: you can perceive small differences in loudness when things are whispered, but such differences are not noticeable when you are standing next to enormous speakers at a concert.
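If you like your laws executable, here is a minimal sketch of Weber's Law in Python (the Weber fraction of 0.1 and the example durations are made-up illustrative numbers, not values from the experiments):

```python
def noticeably_different(standard, comparison, weber_fraction=0.1):
    """Weber's Law: a change is detectable only if it exceeds a fixed
    fraction of the standard stimulus (fraction chosen for illustration)."""
    return abs(comparison - standard) > weber_fraction * standard

print(noticeably_different(350, 390))    # a 40 ms change on a 350 ms tone: detectable
print(noticeably_different(1400, 1440))  # the same 40 ms on a 1400 ms tone: not detectable
```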

You can verify this law by testing how well people perceive the difference between the oddball length and the standard tone (within which the oddball is embedded). The longer the standard tone, the bigger the just-noticeable-difference. "Just noticeable difference" (or "JND") is actually a technical term. It is the difference at which half the subjects deem the oddball short, and the other half deem it long.

I know, I know. I'm boring you with psychometric laws when all you want to hear is how you can't be fooled. But psychometry is important, people. The word itself pretty much means "measuring the mental state". So let's leave psychometry for a moment (until I tell you that the brains we evolve pass this test with flying colors, that is). And now let's go right into the "fooling" part.

You see, when you were taking the oddball test (I don't really think it was you specifically, I just put you into the narrative like this, for effect), so when you were taking the test, sometimes something happened that you weren't told about. Most of the time, the oddball tone started exactly when a standard tone would have started, that is, they were "in rhythm". But sometimes, the oddball started a little late, or a little early. And you were not told that. And it turns out that how you judge the length of the oddball depends strongly on this time difference.

If you mull this over a bit, it really shouldn't come as a big surprise that timing is very important in how we perceive the world. The world, after all, is full of periodic (or rhythmic) signals, and our brains are attuned to detecting such rhythmic signals, because they allow us to predict the world a little better. Accurate prediction (generally speaking) is a recipe for survival, so you can imagine how important it is that we get this right. To a large extent, this sensitivity to repeating auditory sequences explains our affinity to music, as I wrote about in an entirely different context.

Here's what happens when subjects (this includes you) are asked to judge advanced or delayed tones: delayed tones are judged long (even if they are short), and advanced tones are judged short. Here's the data that Devin McAuley obtained in his lab:

Duration distortion factor as a function of oddball delay, from Ref. [1].


In this figure, you see three groups of trials: when the oddball was delivered early (first column), when it was delivered on time (middle column), and when it appeared late (third column). The y-axis is the "duration distortion factor", which measures how "wrong" the reported oddball length was. The DDF is one if the oddball appears on time, which means that there is no distortion. But if it is early, the perceived length is short (DDF<1), while late onset creates a DDF>1. However, the subjects were never given longer or shorter oddballs in these experiments: they were all the same length as the standard tone!

So how is this important, you ask?

Well, get this: When the subjects self-report a longer or shorter tone, they are mistaken! They are fooled!

How can you explain this illusion? Well, there are theories that try to explain this effect. One theory posits that the brain measures time intervals using an internal clock that, in a way, pours "time units" into a bucket, so that counting how many units accumulated measures the interval. The start of the tone opens the gate and the end of the tone closes it, so that the amount of time units determines the length of the signal. I can sense that you're not buying this theory, and indeed that theory has not stood the test of time. Another theory, called "Dynamic Attending Theory" (or DAT) assumes that our attention is driven by changes, and that a rhythmic signal will produce peaks of attention at the beginning of the rhythm. When a tone is late or early, it gets less attention (because it misses the attentional peak), and that explains the illusion of longer and shorter tones (when in fact they are the same length as the background).

Indeed, it was Devin McAuley who tested the "time units bucket" theory (actually named "Scalar Expectancy Theory", or SET) against DAT using those undergrad cohorts, and showed that DAT came out ahead. But of course, it's still a theory. How can we figure out what's really going on?


One way is to evolve brains in the computer to do this same task, and then check whether they are fooled by those malicious early or late tones. And if they are, we can find out why, because we can measure the heck out of those artificial brains without violating any IACUC rules. I will spare you the details of how we evolve artificial brains in the computer. I wrote about it before in this post, and I'll probably write about it more extensively in the future. We call these artificial brains "Markov Brains", and there is a write-up on arXiv that tells you most of what you need to know about them. Here, just imagine we can evolve a lot of them. For this study, we "made" 50 brains (each is the best brain evolved in fifty independent experiments). These are relatively simple brains: they have 14 neurons that they can use to perform computations, along with a single neuron that perceives the tone, and a single neuron that signals the action, as shown in the figure below.

Structure of a Markov brain that listens to a tone, and signals the duration of the tone with a binary decision.
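To fix ideas, here is a cartoon of that architecture in a few lines of Python. It is only a sketch: real Markov Brains are genetically encoded, their gates can be probabilistic, and the details are in the arXiv write-up; the gate count, the random wiring, and the choice of which node plays the decision neuron below are all purely illustrative.

```python
import random

class Gate:
    """A deterministic logic gate: reads a few nodes at time t, writes a few at t+1."""
    def __init__(self, inputs, outputs, table=None):
        self.inputs, self.outputs = inputs, outputs
        # one output pattern for each possible input pattern
        self.table = table or [tuple(random.randint(0, 1) for _ in outputs)
                               for _ in range(2 ** len(inputs))]

    def fire(self, state, new_state):
        idx = int("".join(str(state[i]) for i in self.inputs), 2)
        for node, bit in zip(self.outputs, self.table[idx]):
            new_state[node] |= bit          # gates writing to the same node are OR-ed

def step(state, gates, tone_bit, input_node=0):
    state = list(state)
    state[input_node] = tone_bit            # the single sensory neuron hears the tone
    new_state = [0] * len(state)
    for gate in gates:
        gate.fire(state, new_state)
    return new_state

# a 16-node brain with a handful of randomly wired gates, listening to a 3-unit tone
random.seed(1)
gates = [Gate(random.sample(range(16), 2), random.sample(range(16), 2)) for _ in range(8)]
state = [0] * 16
for bit in [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]:
    state = step(state, gates, bit)
print("decision neuron:", state[15])        # node 15 plays the role of the output neuron
```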

The brains get high fitness (that is, their genome will have many copies in the next generation) if they can correctly judge the length of an "oddball" tone that is embedded within a rhythmic sequence of tones, as shown in the figure below.



The "oddball paradigm" asks subjects to judge the length of the oddball tone (in red) with respect to a rhythmic sequence of tones (in grey). The oddball tone to be judged is indicated to the subject by, for example, an elevated pitch. During training, the subjects always hear tones that begin at the exact time the rhythmic signal is expected. But during testing, the subject may be given tones that are advanced or delayed with respect to the expected onset (without revealing that manipulation to the subject).


When we evolved brains to judge the length of oddball tones, they solved the problem in less than 2,000 generations. Not all the brains could do this perfectly, but most could. We evolved them to excel at this task not just for one example sequence (like the one shown in the figure above): we did this for many different background rhythms, defined by the time between onsets of the background rhythm, known as the inter-onset interval (IOI). For each IOI (between 10 and 25 units) we created a standard tone (half the IOI if the IOI is even, otherwise half of IOI minus 1) as well as all possible longer and shorter tones that fit within the IOI, and asked those brains to judge all of them.
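If you want to see what such a stimulus looks like, here is a small Python sketch that generates one (the binary on/off encoding, the number of background tones, and the example numbers are my own illustrative choices, not the exact protocol of the paper):

```python
def oddball_sequence(ioi, oddball_len, onset_shift=0, n_background=4):
    """Binary tone signal (1 = tone, 0 = silence): n_background standard tones
    on a rhythmic grid of period `ioi`, followed by one oddball tone that may
    start early (negative onset_shift) or late (positive onset_shift)."""
    standard = ioi // 2                       # standard tone: half the IOI (rounded down)
    seq = [0] * ((n_background + 1) * ioi)
    for k in range(n_background):             # the rhythmic background
        for t in range(k * ioi, k * ioi + standard):
            seq[t] = 1
    start = n_background * ioi + onset_shift  # the (possibly shifted) oddball onset
    for t in range(start, start + oddball_len):
        if 0 <= t < len(seq):
            seq[t] = 1
    return seq

# IOI of 14 (standard tone of 7), with a long oddball (8 units) delivered 2 units early:
print(oddball_sequence(14, 8, onset_shift=-2))
```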

We found that the brains we evolved could judge the short IOIs without any problems, but the task became progressively more difficult the longer the IOI. This is in fact precisely what Weber's Law predicts, and indeed our evolved brains followed Weber's Law almost precisely! But what about the illusions observed in the experiments with students? How do the evolved brains react to delayed and advanced tones? It turns out that they are fooled in the exact same way as the students are! Below you can see the measured duration distortion factor (DDF) for an IOI of 14 time units, with a standard tone of length 7. This corresponds to the experiment above where the standard tone was 350 msec, if you take a time unit to correspond to 50 msec.

Duration distortion factor for evolved Markov brains exposed to a standard tone of 7 time units, embedded in an IOI of 14 time units. From [4].

Evolved Markov brains also perceive advanced tones (negative oddball onset) as short, and delayed tones (positive oddball onset) as long. What accounts for this illusion? How is this even possible, since these brains are deterministic? Now, compared to human brains, we have a distinct advantage here: we can peek into these brains to find out how they work!


One way to do this is to understand how the brain's state changes as it is listening to the tone. Here, the brain's state can simply be rendered as a decimal number composed from the binary states of the combined neurons. So, for example, for a brain with 10 neurons, the brain state where all 10 neurons are quiescent (the state `0000000000') is '0', while the state of all ten neurons firing is the state '1023'. We can then depict a state change by drawing an arrow between two states, depending on whether a tone (a '1') or no tone (a '0') was perceived, as in the figure below.


Brain state change as a function of the input (digit next to the arrow). Top: a ten-neuron brain in binary notation. Bottom: the same state change, but in decimal notation.
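In code, that bookkeeping is a one-liner (the transition numbers in the example dictionary are made up for illustration; only the all-quiescent and all-firing labels come from the text above):

```python
def state_label(neurons):
    """Decimal label of a tuple of binary neuron states, e.g. (1, 0, 1, ...)."""
    return int("".join(str(b) for b in neurons), 2)

print(state_label((0,) * 10))   # all quiescent -> 0
print(state_label((1,) * 10))   # all firing    -> 1023

# the state diagram is then just a map (current state, input bit) -> next state:
transitions = {(0, 1): 37, (37, 1): 359}   # illustrative numbers only
```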

In the movie below you can follow the brain state changes as an evolved brain listens to a standard tone of length 5. Note the loop in state space that has evolved to make the length assessment possible. This loop is in fact an evolved representation of the standard tone.



In that movie, you see the state changes as the brain listens to a standard tone of length 5 embedded in an IOI of length 10. This brain has a total of 14 (of 16) neurons participating in the computation, but only 12 are used to depict the state (one receives the tone, one signals the decision). The oddball tone is depicted at the bottom in green. Note that the decision (here 'L', which stands for "same length as standard or longer", since the decision must be binary) is rendered at the very end of the IOI, in the transition from state 359 to state 3,911.

The movie below shows the transitions in the same brain for a signal that is long (six units), but is advanced by two units. Due to its advancement, it ends at the same exact time that a short tone would have ended, and indeed because the brain does not pay attention to the beginning of the tone, it ends up in precisely the same state as it would have ended up in if it had listened to a short tone. Because of that lack of attention, it issues the "S" determination with full confidence, but is completely wrong. 





Why does the brain not pay attention to the beginning of the tone? According to the DAT theory of attention, both the beginning and the end of the tone represent a contrast that the brain should be paying attention to. In hindsight, however, ignoring the onset makes perfect sense. These brains never experienced out-of-rhythm tones (during evolution) and so focus only on the end of the tone, as this is the only place where there is expected variation! From an information-theoretic perspective, there is no entropy at the beginning of the tone, so information can only be gathered from the end! In other words, the brain only pays attention to the potentially informative parts of the signal, and not to those aspects that are always the same.
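Here is a toy version of that information-theoretic argument in Python (the IOI and the set of tone lengths are made-up illustrative numbers): across a training set in which every tone starts on the beat, the first few time steps carry zero entropy, and all the uncertainty sits near the possible tone endings.

```python
import math
from collections import Counter

def entropy(bits):
    """Shannon entropy (in bits) of a list of 0/1 samples."""
    n = len(bits)
    h = -sum(c / n * math.log2(c / n) for c in Counter(bits).values())
    return abs(h)   # avoid returning -0.0 for constant inputs

ioi = 10
lengths = range(3, 8)                                    # illustrative tone lengths
stimuli = [[1] * L + [0] * (ioi - L) for L in lengths]   # every tone starts on the beat

for t in range(ioi):
    print(f"t={t}  H={entropy([s[t] for s in stimuli]):.2f} bits")
```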


An information-theoretic analysis of all 50 evolved brains indeed bears this out. Do we then have to completely change our theories of attention?


In the realm of visual attention, one of the common theories of what gets our attention is the "visual saliency" model of Itti and Koch [2]. In this model, attention is attracted to parts of the visual stimulus that are visually salient, that is, they stand out from their background. This is similar to the DAT theory of auditory attention, and it may very well be that this is only part of the story of visual attention. Indeed, we have some evidence that our eye saccades are drawn not only by saliency, but also by our expectation of where the relevant information is to be found. 


A strong indication that what we expect to find plays a crucial role in what we pay attention to is an experiment run by Lawrence Stark. He recorded the "scan path" of human subjects when saccading an image of the famous "Rubin vase" (the black-and-white image that can be seen either as a vase or as two faces in profile looking at each other). When priming the subject with a Rubin vase adorned in such a manner that it is clearly recognizable as one or the other image (see image below), the scan path of subjects follows that of the expected image, even though the subjects were looking at the unadorned image [3].


Two images of the "Rubin vase" adorned with markers that suggest one or the other image. They were used to "prime" subjects who had their scan path measured in Ref. [3], but the image they actually saw when their eye saccade path was scanned was the image without adornment. Image from [5].

We can thus assume that attention is driven by multiple mechanisms: a "bottom-up" mechanism driven by high-contrast features, as well as a "top-down" mechanism where attention is driven by what the brain expects to experience. We may not always be conscious of what it is that we expect to experience, so you may think you know what your brain is paying attention to, but don't be surprised if you can easily be fooled!


References:


[1] J. D. McAuley and E.K. Fromboluti, "Attentional entrainment and perceived event duration", Phil. Trans. Roy. Soc. B 369 (2014) 20130401.

[2] L. Itti and C. Koch (2001). "Computational modelling of visual attention." Nature Reviews Neuroscience 2, 194–203.

[3] L.W. Stark, C.M. Privitera, H. Yang, M. Azzariti, Y.F. Ho, T.T. Blackmon, and D. Chernyak (2001). "Representation of human vision in the brain: How does human perception recognize images?" Journal of Electronic Imaging 10, 123–151.

[4] A. Tehrani-Saleh, J.D. McAuley, and C. Adami (2024). "Mechanism of duration perception in artificial brains suggests new model of attentional entrainment." Neural Computation 36, 2170-2200.

[5] C. Adami, "How brains perceive the world" (2024). Artificial Life 30



Thursday, August 12, 2021

What is a Black Hole Laser?

When you grow up as a child that does not have to worry about being fed and housed and all that, you get to think about all kinds of awesome and fantastic things: Dinosaurs! Space ships! Black holes! Lasers!

What if you could combine all of those? Would that create the ultimate awe-pocalypse? Dinosaurs flying into black holes in spaceships powered by lasers? Obviously we can't pull this off because dinosaurs are extinct, we have a hard time building spaceships (though Elon is working hard to change that) and we certainly can't fly into black holes, even though movies such as Interstellar  have teased that. But where do Lasers fit in? Well, hold on to your seats: it appears that (at least theoretically) black holes can be turned into Lasers, or maybe some already are, but we don't know it yet! 

If you are reading this post you probably already know what Lasers are, but in order to make the case for BHLs (black hole lasers), let me reintroduce you to them briefly. `Laser' is an acronym for "Light Amplification by Stimulated Emission of Radiation" (be honest, it's one of the greatest Science Acronyms of all time, because it is not tortured at all). The phenomenon relies entirely on Einstein's work concerning the probability of absorption and emission of light from atoms or molecules. Here's the title of this groundbreaking (and, as a matter of fact, easy to read) paper:

In this paper, Einstein showed how you can derive Planck's celebrated formula (the "Law of Radiation", or "Strahlungsgesetz") from first principles. Yeah, Planck had essentially guessed his law. 

Einstein made the following assumptions, given an atom or molecule that has discrete energy levels with energies \(E_1\) (the "ground state") and \(E_2\) (the "excited state"):

1. A quantum of light can be absorbed by the atom, which raises the state from ground state to excited state 
Here, \(\hbar \nu\) is the energy of the quantum that is tuned to the difference in energies: \(\hbar\nu=E_2-E_1\).

2. If an atom is in its excited state, it can spontaneously emit a quantum of energy:


3. If an atom is in the excited state and absorbs a quantum at the same time, the quantum stimulates the emission of another quantum just like it:

Using these assumptions, Einstein was able to derive Planck's law, which in turn means that these are the only processes needed to understand how radiation interacts with matter (barring quantum field-theoretic effects, of course). 
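For completeness, here is a compressed sketch of that derivation in my own notation (ignoring level degeneracies). Schematically, the three processes are absorption \(X_1 + h\nu \to X_2\), spontaneous emission \(X_2 \to X_1 + h\nu\), and stimulated emission \(X_2 + h\nu \to X_1 + 2h\nu\), where \(X_1\) and \(X_2\) denote the atom in its ground and excited state. With \(N_1\) and \(N_2\) atoms in the two states and radiation energy density \(\rho(\nu)\), thermal equilibrium demands that the upward and downward rates balance:

                    \(N_1 B_{12}\,\rho(\nu) = N_2 A_{21} + N_2 B_{21}\,\rho(\nu)\;.\)

Using the Boltzmann ratio \(N_2/N_1 = e^{-h\nu/kT}\) and solving for \(\rho(\nu)\) gives

                    \(\rho(\nu) = \frac{A_{21}/B_{21}}{(B_{12}/B_{21})\,e^{h\nu/kT}-1}\;.\)

Requiring agreement with the classical (Rayleigh-Jeans) limit at high temperature forces \(B_{12}=B_{21}\) and \(A_{21}/B_{21}=8\pi h\nu^3/c^3\), and out pops Planck's law

                    \(\rho(\nu) = \frac{8\pi h\nu^3/c^3}{e^{h\nu/kT}-1}\;.\)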

The third process, called stimulated emission, is the one that gave us Lasers. The thinking behind it is (in hindsight) quite simple. Take a look at the picture above. What happened here is that the incoming quantum was doubled (copied). You don't violate the no-cloning theorem this way because the spontaneous emission process provides just enough noise. But you already knew this. Now suppose that after you made two-out-of-one, the two quanta go on their merry way but encounter a mirror and head back:

At the same time, imagine you have a way to "pump" atoms back to their excited state from the ground state (that's what the yellow light bulb in the sketch above is supposed to do). When the two quanta encounter an atom (not necessarily the same one, there will be plenty of atoms in the gas that's enclosed in the cavity between the two mirrors), each will stimulate the emission of a clone, so from 2 make 4. Because this doubling gives rise to an exponential process, after a short while the number of quanta becomes astronomical. Now, if the quanta remain in the cavity then they would be of no use to anyone, which is why one of the mirrors is usually made semi-transparent: so that the beam of coherent quanta can leave out of that side. That's your Laser, right there:





Alright, now that you have become a Laser expert, what does this have to do with black holes? Those of you that have been following this blog from its inception maybe remember that stimulated emission is precisely the process that saves information from disappearing into a black hole (if you haven't, then start here). In a nutshell, the processes that Einstein wrote about don't just hold for light interacting with atoms, they hold for any and all particles (fermions or bosons) interacting with any matter. In particular, they hold in quantum field theory. And as a consequence, they hold for quantum field theory in curved space. They hold for stuff interacting with a black hole. So when a particle is absorbed by a black hole, it stimulates the emission of a clone outside the horizon (which is carrying a copy of the information), and so we don't have to worry about the one that disappears behind the horizon. That's it, that's the whole ballgame for solving the problem of information loss in black holes. If you take into account this process (instead of ignoring it, as Hawking did in his calculation [1]) you find that information is conserved (the capacity of the black hole to process classical information is positive). 

"Fine (I hear you saying), but this doesn't make a Laser yet"!

True. For example, where's the mirror? But, hear me out. Suppose you fell into a black hole (one that is very large, so you don't get ripped apart by the tidal forces), and suppose you brought with you some rocket engines that would allow you to hover inside the event horizon, but you don't fall towards the singularity (if there is one). Even better, imagine that there are planets inside the black hole (like in the movie Interstellar, again). Then you would "look up" to the horizon, and you would notice something strange. If you shine a light beam towards the horizon, it is reflected back. The reason for this is simple: nothing can escape the black hole, so the best that your light beam can do is "go into orbit" below the event horizon. Basically, this means that from inside of a black hole, you're looking at a white hole horizon. (I've written about this before: it is a consequence of time-reversal invariance.)

The other thing you have to keep in mind when discussing stimulated emission in quantum field theory is that you always "stimulate in pairs". The reason for this is that you need to conserve quantum numbers: for example, if you are going to stimulate an electron, you are going to have to stimulate a positron also. So for every clone that is stimulated outside of the horizon, you also stimulate an anti-clone inside of the black hole (see Fig. 1)
Fig. 1 The horizon of a perfectly absorbing black hole looks black from the outside, and white from the inside. Particle p (black) is absorbed, and stimulates the emission of the clone-anticlone pair (red).

But clearly that is not enough yet to make a Laser. So imagine then two black holes that are connected by an Einstein-Rosen bridge: a wormhole. This might look a bit like this:
Wikipedia's depiction of a Schwarzschild wormhole. 

Basically, it is two black holes connected by a "throat". We don't know if it's traversable for people, but you certainly can imagine that a particle thrown into one of the black holes might come out at the other end. 

Come out? But nothing can come out of a perfectly absorbing black hole, right? Well, this is both right and wrong. Physical particles cannot come out, but you can clone those particles, so copies can come out, which after all is just as good. 

Let's see what this would look like. We now take the black hole horizon from Fig. 1, and add another one like so:

Fig. 2: Two connected black holes with horizons \(H_1\) and \(H_2\). The anti-clone that travels towards \(H_2\) is reflected there, and stimulates another pair.

Because from the inside the black hole looks like a perfectly reflecting white hole, the anti-clone is reflected at \(H_2\) and heads back towards horizon \(H_1\); the mathematics of stimulated emission in black holes says that this reflection creates a clone pair as well.

What happens now? Well it's clear. The absorbed particle and one of the anti-clones are on a collision course leading to annihilation, but the other anti-clone will reflect on \(H_1\) and stimulate another pair. As if by magic, there are now two clones outside horizon \(H_1\), and two outside horizon \(H_2\). But the anti-clones inside the wormhole keep reflecting between the horizons just as in the optical Laser described above. Except that we don't need semi-transparent mirrors: the "Laser beams" will emanate from the horizon in a coherent manner as long as the inside of the wormhole is coherent!

Fig. 3: The wormhole Laser. Anti-clones that are reflected from the inside of the black holes stimulate emission of clones outside the respective horizons. 

So where does all this energy come from? After all, there is no pump that "charges up" this Laser, as there must be for an optical Laser. The answer is that this bill is paid for by the black holes' masses, just as it is in spontaneous emission. A detailed calculation would have to show how fast a wormhole might deplete its mass, but the calculation is already difficult for a single black hole (and it can only be made by approximating the interaction between black hole and radiation with a model, see here).

Now to the last question: what would a BHL look like? First of all, it is clear that whatever radiation emanates from the black hole, it will look like it is coming from a disk surrounding the black hole. We also know from explicit calculations that the stimulated emission in response to absorbed material is not red-shifted (because this is "late-time" absorption). However, what happens to material that reaches the second horizon I can't say without a calculation. The important distinction for Laser light, however, is that it is coherent. If the stuff that is stimulated outside the horizon is similarly coherent, we might be able to detect this using typical Hong-Ou-Mandel interferometry of light coming from such a black hole. We've just learned how to look at light emitted from black holes using telescopes like the Event Horizon Telescope, so it might be some time before we can check if that light is really BHL light. We don't know how many black holes are actually connected to others making BHLs possible, but at least there is a chance to find out! 



[1] The reason Hawking ignored stimulated emission in his calculation of radiation coming from a black hole is that he thought that it would require energy from a black hole's rotation (the rotational energy would provide the "pump energy"). Because he treated a non-rotating black hole, he decided he could ignore the effect. It turns out that stimulated emission does not require black hole rotation.




Monday, December 2, 2019

On quantum measurement (Part 8: Leggett-Garg Inequalities)

So Part 7 wasn't the final installment in this series after all. I know this because this is Part 8 of the quantum measurement saga. And for this one, I'm going to change the style a bit. I will begin with a fictional dialogue between two quantum physicists, called "Lenny" and "Chris". After the dialogue, I will delve more deeply into what it is that Lenny and Chris are talking about.


This is a dialogue between Lenny and Chris, two fictional physicists. Both know quantum physics. They both can calculate things. But Lenny has an advantage: he can also do experiments. He has made a discovery, and wants to show it to Chris.


Lenny: "Hey Chris, I just wrote down this equation describing a series of three consecutive measurements on the same quantum system. My equation says that when you make these three measurements 'a', 'b', and 'c', a particular sum must always be smaller or equal to 1."

Chris: "Wow Lenny, this looks great. It looks like a solid prediction. You should try and check that. You know, because of your ability to measure things. In quantum physics."

Lenny: "You know what, I'm going to do just that.

Time passes. Music is playing in the background. It is mildly annoying. A while later:

Lenny: "Hey, that's actually a difficult experiment. Doing two measurements in a row is easy, but doing three is hard. Do you think I can instead do three different measurements where I just do a pair of measurements in each?"

Chris: "What do you mean?"

Lenny: "Here I'll sketch it for you on the board"



Lenny: "Here's the quantum state \(Q\), and here's the three measurements, see? Let's call them a, b, and c. And we should do each measurement at an angle \(\theta\) with respect to the previous measurement"

Chris: "Yup. Go on."

Lenny: "Well, I wanted to just do the pairs, 'ab', 'bc', and 'ac', like so:




Lenny: "See, I first do the first pair just like above. Then I do the "bc" measurement (but I don't really have to because it is really just the same as the "ab" measurement), and then I do the one at the bottom of the chalkboard, where I leave out the middle one."

Chris: "Ah but Lenny, leaving out the middle measurement changes everything. It's, like, the first lesson in quantum mechanics. Measurements change the system that is being measured. That's why we have the uncertainty relation".

Lenny: "But what if the middle measurement doesn't matter?"

Chris: "What?"

Lenny: "Hear me out. What if I did the experiment and assume that I get the same result as if I had done the measurement? As if the middle measurement didn't matter."

Chris: "You mean, like in classical physics?"

Lenny: "Yes, just as in classical physics".

Chris: "But we already know that measurements like this cannot be described by classical physics".

Lenny: "I'll just say I'm testing "classical thinking".

Chris: "Sure, you could do that. But nobody thinks that classical thinking will get the same result, so you'll never be able to publish such a result."

Lenny: "Oh but I bet I can. I just won't call it 'classical thinking'. I'll call it macrorealism. I'll say I'm testing macrorealist theories.''

Chris: "Now you're just trolling me. But go ahead knock yourself out."

More time passes. A lot of time. Finally Lenny is back. He is out of breath. The blackboard, miraculously, has not been erased.

Lenny: "Chris, you're not going to believe it. I did those three pairs of measurements and I get a result that's different from what quantum mechanics predicts for doing all three measurements!"

Chris: "Actually, Lenny, I totally believe it. It is what quantum mechanics predicts after all."

Lenny: "But it's much better than that! The results are such that the equation that quantum mechanics says can never exceed one, actually does! I broke the inequality!"

Chris: "That is not possible. Show me what you did."

Lenny: "I did the measurements just like on the board."

Chris: "You did? What angles did you use?"

Lenny: "I used \(\theta\) for the angle between the first two and also between the last two, and \(2\theta\) between the first and the third, assuming the middle one does not matter."

Chris: "Yeah, you can't do that."

Lenny: "What?"

Chris: "The three experiments are supposed to mimic the experiment at the top of the blackboard, right?

Lenny: "Yes.  And as long as I prepare each of the three experiments exactly in the same manner, I can use the statistics of the three pairwise measurements to stand in for the triple one."

Chris: "That is true.  But you didn't prepare your three measurements in the same way. You used very different parameters in each. That is not allowed."

Lenny: "What do you mean? Angle \(\theta\) between the first two and the last two, and angle \(2\theta\) between the first and third. It's the same."

Chris: "It is not. Let me explain."


So what are Lenny and Chris talking about here? They are discussing making consecutive measurements on the same quantum system, something that we encountered already in Part 7, which is linked. While I describe the theory of consecutive measurements in that part, with a focus on whether or not wavefunctions collapse during measurements, that part of the theory is completely irrelevant for the following. Collapse and no-collapse pictures make the same predictions for what follows. 

Let's first set up the three measurements. For convenience, we'll take a quantum two-state system (a qubit) and all our detectors are binary detectors (they click or they don't). There are of course many ways in which we can realize such a series of measurements, and one of the simplest is via a Mach-Zehnder optical setup. You've seen versions of it several times in this series.
Fig. 1: Measuring the quantum system \(|\Psi_1\rangle\) using the classical detectors  \(A_1\), \(A_2\), and \(A_3\).
Here, the dark grey boxes with an angle written on top are polarizing beam splitters set at an angle with respect to the polarization of the initial state. You can think of these polarizing beam splitters as changing the basis of the quantum state, so that, for example, \(A_2\) measures the quantum state at a relative angle \(\theta_1\) compared to \(A_1\), and \(A_3\) measures at angle \(\theta_2\) with respect to \(A_2\). I'm sure you realized by now that these three measurements are just the 'a', 'b', and 'c' of the dialogue.

If you don't want to think in terms of interferometers, you can also just look at the quantum circuit in Fig. 2. 

Fig. 2: The three consecutive measurements as a quantum circuit. The initial quantum state is here written as an entangled bi-partite state. Measurements are indicated by a CNOT operation followed by a Hadamard gate that rotates the quantum state by the indicated angle.
Alright, so what is this equation that Lenny and Chris are discussing? It concerns the correlation between pairs of detectors. For example, I might want to know what the correlation is between detectors 1 and 2. I call this function \(K_{12}\). It's fairly simple to derive it, but I won't do it here. It's described in detail in the paper that I'll be linking to. Don't groan, you knew this was coming.

For example, suppose \(\theta_1=0\). This means that \(A_1\) and \(A_2\) measure in the same basis, and quantum mechanics then tells us that both detectors must agree. So \(K_{12}(0)=1\). On the contrary, if \(\theta_1=\pi/2\), then you are measuring at orthogonal angles and the correlation must vanish. Indeed quantum mechanics predicts

\(K_{12}=\cos(\theta_1)\;.\)             (1)

We can also calculate \(K_{23}\) in this manner. You get the same result, only with the angle \(\theta_2\), that is, \(K_{23}=\cos(\theta_2)\).

What about \(K_{13}\)? Well, this result will depend on whether or not we make the 'b' measurement in between. If we do it, then the result is  

                                      \(K_{13}=\cos(\theta_1)\cos(\theta_2)\;.\)             (2)

If instead we don't do it, then the third measurement will occur at the angle \(\theta_1+\theta_2\) with respect to the first, and we then should find \(K_{13}=\cos(\theta_1+\theta_2)\). 

There is, by the way, no controversy about the result of these calculations. They are easy to do. Doing or not doing a measurement matters. Everybody knows that. I assume you are nodding vigorously in assent. You are, aren't you?

Let me now write down the equation that Chris and Lenny have been discussing. 

It can be shown, using the formalism I introduced in Part 7 (or any other standard quantum measurement formalism, for that matter) that the following inequality holds:

                                     \(K_{12}+K_{23}-K_{13}\leq1\;.\)             (3)

This is only one of the so-called "Leggett-Garg inequalities", introduced by Sir Anthony Leggett and Anupam Garg [1], but for the sake of being brief it is the only one I'm going to consider here.

Like I wrote above, whether or not you are making two or three measurements matters. Let's find out how. First, let us check what we get for the inequality if all three measurements are performed. Using the results above, and assuming that \(\theta_1=\theta_2\equiv\theta\) (that is, I'm choosing the same change of basis between the first and the second as between the second and the third measurement), we get

                                 \(2\cos(\theta)-\cos^2(\theta)\leq1\;.\)             (4)

It's immediately clear that this inequality cannot be violated, because the left hand side of Eq. (4) is \(1-(1-\cos(\theta))^2\), which is at most one for all \(\theta\). There is no way that Equation (3) can ever be violated, even if you use two different angles \(\theta_1\) and \(\theta_2\).
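If you would rather let a computer do the checking, here is a small numerical sketch (standard projective-measurement rules for a qubit, written by me for illustration; it is not code from the paper). It reproduces Eqs. (1), (2), and (4), and also the \(\cos(\theta_1+\theta_2)\) result when the middle measurement is skipped:

```python
import numpy as np
from itertools import product

def projectors(theta):
    """Projectors for the +1/-1 outcomes of a qubit measurement at Bloch angle theta."""
    plus  = np.array([np.cos(theta / 2),  np.sin(theta / 2)])
    minus = np.array([-np.sin(theta / 2), np.cos(theta / 2)])
    return {+1: np.outer(plus, plus), -1: np.outer(minus, minus)}

def branch_statistics(angles):
    """All outcome branches (with probabilities) of consecutive projective
    measurements at the given absolute angles, starting from the state |0>."""
    rho0 = np.array([[1.0, 0.0], [0.0, 0.0]])
    stats = []
    for outcomes in product([+1, -1], repeat=len(angles)):
        rho, p = rho0, 1.0
        for s, th in zip(outcomes, angles):
            P = projectors(th)[s]
            prob = float(np.trace(P @ rho))
            if prob == 0.0:
                p = 0.0
                break
            rho, p = P @ rho @ P / prob, p * prob
        stats.append((outcomes, p))
    return stats

def K(stats, i, j):
    """Correlation between the outcomes of measurements i and j."""
    return sum(p * o[i] * o[j] for o, p in stats)

theta = 0.7
s3 = branch_statistics([0.0, theta, 2 * theta])   # 'a', 'b', 'c' all performed
K12, K23, K13 = K(s3, 0, 1), K(s3, 1, 2), K(s3, 0, 2)
print(K12, np.cos(theta))                # Eq. (1): K12 = cos(theta)
print(K13, np.cos(theta) ** 2)           # Eq. (2): K13 = cos(theta) cos(theta)
print(K12 + K23 - K13, "<= 1")           # Eq. (4): never violated

s2 = branch_statistics([0.0, 2 * theta])          # middle measurement skipped
print(K(s2, 0, 1), np.cos(2 * theta))    # K13 = cos(2 theta) without 'b'
```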

But Lenny said that he did violate inequality (3), and all he had to do was not make the middle measurement. If that were true, quantum mechanics would be broken, because equation (3) should hold whether or not I'm making the middle measurement.

OK, what does it mean to *not* make a measurement? There are actually two ways in which you could do that (of course they turn out to be the same). One way is to actually *do* the measurement, but at the same angle as the first measurement. Choose any other value, and the second measurement *is* made. But remember, in the 'ac' experiment you are not doing the second measurement, and since in order to simulate the triple measurement you must do all three pairwise experiments with the same parameter values, if you do not measure at position 'b' in the third experiment (for the 'ac' correlation), then you must also not make that measurement in the first experiment (for the 'ab' correlation), and just the same in the second (for the 'bc' correlation).

Otherwise, those three pairwise measurements do not describe the triple-measurement situation, and the inequality does not describe this set of measurements. So instead, what you should do is this:



You're not doing the 'b' measurement, so you have to choose \(\theta=0\) for the 'ab' measurement. You are doing the 'bc' measurement at angle \(\theta\). That's fine. But the angle between the first (non-measurement) and the third is then \(\theta\), not \(2\theta\). Because 0 plus  \(\theta\) is \(\theta\).

If you use these angles and plug them into equation (3), you immediately realize that the inequality is not violated: it is exactly equal to 1. If you want to use \(2\theta\) for the third ('ac') measurement instead, then you also have to use that angle in the 'bc' measurement, since the first measurement (at 'b') had to be done at zero angle. You can't say that you don't do a measurement in the 'ac' experiment, yet do it in the 'ab' measurement, and believe that it describes the same experiment (the 'abc' measurement). It's like saying I use \(\theta=45\) degrees in the 'ab' measurement, but \(\theta=30\) degrees in the 'ac' measurement. You can't just change angles if you want the statistics of the pairwise measurements to reflect the full triple measurement. If you do, then the three pairwise measurements are not standing in for the triple measurement.
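In numbers, Lenny's bookkeeping versus the consistent one looks like this (a two-line check using the correlation functions derived above; the angle is arbitrary):

```python
import numpy as np

theta = 1.0   # any angle between 0 and pi/2

# Lenny's bookkeeping: angle theta in 'ab' and 'bc', but 2*theta in 'ac'
K12, K23, K13 = np.cos(theta), np.cos(theta), np.cos(2 * theta)
print(K12 + K23 - K13)   # 1 + 2*cos(theta)*(1 - cos(theta)) > 1: an apparent "violation"

# Consistent bookkeeping for a skipped 'b': the 'b' angle is 0 in every pairwise experiment
K12, K23, K13 = 1.0, np.cos(theta), np.cos(0 + theta)
print(K12 + K23 - K13)   # exactly 1: no violation
```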

I told you earlier that there was another way of *not* doing a measurement. This is making a measurement so weakly that it becomes non-existent. What's a weak measurement you ask? It's actually a simple concept, and I'll briefly explain it here.

At the risk of repeating myself, here's what happens in a measurement. You want to move the measurement device's "needle" in such a manner that it reflects the thing that you want to measure. If what you are measuring has only two states, then (for example) you want to keep the measurement device as is when the system is in one of the states, and you want to move the measurement device into a different state if the system is in the other. If you want to make sure which is which, you need to make sure that the two states of the measurement device are easily distinguishable.  In quantum physics, the best you can do (for distinguishability) is to make the two states orthogonal. Indeed that is precisely what you do in projective measurements.

But it is possible to make measurements where you don't move the measurement device's "needle" by 90 degrees (into the orthogonal direction). What if you just move it a little?

If you move the device's needle just a little, then it becomes harder to make sure what the device's state is after the measurement. Say you move the needle by an angle \(\phi\). Because your device at this point is just another qubit, then in order to amplify your measurement (make it available to a classical reader) you need to measure it again. But this time using an orthogonal device. This means there is a chance that the ancillary qubit will be misread. (This is not possible if the measurement was done at an angle of 90 degrees, that is, a strong measurement.)

A weak measurement, therefore, is a fuzzy measurement. You might think that this is a terrible way of doing a measurement, but there are plenty of uses for this. For one, you can make this measurement more precise by repeating it many times. If you do, you can detect even very small angles \(\phi\). What you gained is that your weak measurement only weakly disturbed the quantum system. OK, so let's do some weak measurements. To prepare you for that, I'm going to magnanimously repeat strong measurements, as if you did not already read Parts 1-7. I know you did, but who can remember things from so long ago?

Here we imagine that our quantum system is prepared. This means that we took an arbitrary quantum state, and measured it with a particular device. No matter what the outcome of that "preparing" step is, we'll treat this as the first measurement (the 'a'). So we'll only have to do two more, but Lenny wants to simplify things, and do 'b' only for the first pair. Then, he'll prepare an identical quantum state with 'b' and measure 'c' (and as I said, that's really a repetition of the first measurement), and finally he'll do the 'ac' measurement. 

We do measurements by entangling using a CNOT operator 

                    \(U_1=|\theta_1\rangle \langle\theta_1|\otimes \mathbb{1} +|\bar\theta_1\rangle \langle\bar\theta_1|\otimes\sigma_x\;.\)

Here, \(|\theta_1\rangle \langle\theta_1|\) is the projector into the new basis, and \(|\bar\theta_1\rangle \langle\bar\theta_1|=1-|\theta_1\rangle \langle\theta_1|\) is the projector on the orthogonal basis state. \(\sigma_x\) is the first Pauli matrix, and just flips a qubit. 

Here's a little exercise for you: you can write the same operator as 

                               \(U_1=e^{i\pi/2 |\theta_1\rangle \langle\theta_1|\otimes \sigma_y}\;.\)

We won't do this exercise here. But if this is stunning to you, go ahead and expand the exponential and so forth, it'll be worth it. And no, it's not a typo in the exponent: it is really the y-Pauli matrix. 

The angle \(\pi/2\) in the above expression is the "flipping" part of the operator. In a weak measurement, we just use a smaller angle \(\phi\) instead. If you do that, then instead of moving your ancilla from \(|0\rangle\) to \(|1\rangle\), you instead move it to the state

                          \(|\epsilon\rangle=\sqrt{1-\epsilon^2}|0\rangle + \epsilon|1\rangle\;.\)

Oh, and \(\cos(\phi)=\sqrt{1-\epsilon^2}\). So basically, we now repeat the entire calculation with a weak measurement 

                                  \(U_1=e^{i\phi |\theta_1\rangle \langle\theta_1|\otimes \sigma_y}\;,\)

keeping in mind that in the limit \(\phi\to\pi/2\) we return to a strong measurement, but in the limit of \(\phi\to0\) the measurement becomes so weak that it does not even take place! Just what we need!

Alright, here's the plan. We still measure the quantum state with the 'a' measurement in whatever basis. The next measurement is the middle one, so it needs to be potentially weak, so we perform it with a strength  \(\phi\), at angle \(\theta\). Just to be clear: \(\phi\) is the strength of measurement, with 0 meaning no measurement, and \(\pi/2\) meaning full-strength orthogonal measurement. The angle is still \(\theta\). The last measurement will be strong again, at angle  \(\theta\) again. The calculation for the correlation function now gives

                            \(K_{12}=(1-\epsilon^2) + \epsilon^2\cos(\theta)\) 

and you're not surprised to recover \(K_{12}=\cos(\theta)\)  when the weak measurement is strong instead. Now let's calculate 

                   \(K_{23}=(1-\epsilon^2)\cos^2(\theta) + \epsilon^2\cos(\theta)\;.\) 

You can check again that in the limit \(\epsilon\to1\) you recover the old (strong) result. Finally, \(K_{13}\) does not depend on \(\epsilon\), because it is the 'ac' measurement and only 'b' is potentially weak. (Note that this is only true if the first measurement prepares the quantum state in an eigenstate, as opposed to a superposition, of the first detector basis.)

Let's write down the Leggett-Garg inequality for this generalized (weak) 'b' measurement:

                    \(K_{12}+K_{23}-K_{13}=1-\epsilon^2(1-\cos(\theta))^2\;.\)

In the limit of a non-existing measurement 'b' \(\epsilon\to0\), we get 

                         \(K_{12}+K_{23}-K_{13}=1\;,\)

the same result we obtained when we didn't do the second measurement by choosing the first \(\theta=0\). Whichever way we choose not to do the middle measurement (because we want to simplify our work by doing only pairs of measurements), the Leggett-Garg inequalities cannot be violated. Ever. 

What does all this mean for us? Well, people have wondered over and over again why it is so easy to violate inequalities that should be obeyed according to quantum mechanics. Some (including Leggett) have suggested that when it comes to macroscopic phenomena, quantum mechanics can't be the right theory. Of course, you might say, we use classical mechanics there. But you need to remember that classical mechanics is not a correct theory. We need to be able to describe classical objects with quantum mechanics, or at least something that describes both microscopic and macroscopic objects accurately at the same time. Others have suggested that quantum mechanics is indeed the right theory, but that we just haven't (for a variety of reasons) been able to observe the deviations that the violation of LG inequalities implies. One of the reasons suggested by that camp is that perhaps we can never really perform perfectly strong measurements, and that this "fuzziness" of measurements might obscure the violations [2]. Instead, we now realize that there are no paradoxes at all. Quantum mechanics correctly describes both microscopic and macroscopic physics. And we see no "weird" stuff in quantum measurement of macroscopic devices not because strong measurements can't be achieved, but because those violations simply do not exist. They are based on the faulty thinking that plagued Lenny. I do realize that there are literally hundreds of papers that follow Lenny's thinking in making pairwise experiments. All of them claim the Leggett-Garg inequalities are violated. All of them are wrong.

The preprint describing this work is [3] below. I have applied this thinking to the so-called "entropic" Leggett-Garg inequalities, which are related to the original inequalities but distinct. In [4] I show that these can't be violated either.

[1] A. J. Leggett and A. Garg, “Quantum mechanics versus macroscopic realism: Is the flux there when nobody looks?” Phys. Rev. Lett. 54, 857–860 (1985).

[2] J. Kofler and C. Brukner, “Conditions for quantum violation of macroscopic realism,” Phys. Rev. Lett. 101, 090403 (2008).

[3] C. Adami, “Leggett-Garg inequalities cannot be violated in quantum measurements,” arXiv:1908.02866 (2019).

[4] C. Adami, “Neither weak nor strong entropic Leggett-Garg inequalities can be violated,” arXiv:1908.03124 (2019).