A Primer on the Interpretations of Quantum Mechanics, Part 3

If you missed them, parts 1 and 2 are here:



Hidden Variables

Last time, I talked about interpretations involving wave function collapse, which collectively form one of the three main traditions in the foundations of Quantum Mechanics. To repeat a little bit, the issue is that Quantum Mechanics appears to predict that measurements never really have outcomes, and that all measurements will end up putting human brains into strange states where they are in a superposition of having different experiences, of believing different things. There are more or less three ways to try to deal with this:

1. Add something to the dynamics1 of Quantum Mechanics, namely a collapse postulate, that violates the Schrödinger Equation and results in one or another possible outcome being selected.

2. Add something to the ontology2 of Quantum Mechanics, over and above the wave functions of particles, that allows one or another outcome to be “picked out” as the one that occurs, even though the Schrödinger Equation remains true at all times.

3. Don’t add anything to Quantum Mechanics, and instead accept that those strange states exist and try to make sense of what they mean.

This time, I’m going to talk about option 2. Most of the ideas under this category are what are typically called “hidden variable” interpretations, though there are newer “modal interpretations” that would go under this heading as well. I’m going to primarily talk about the best-known of these theories, the model of David Bohm.

I should start by saying that there used to be, and to an extent still is, a tremendous bias against hidden variable interpretations in the physics community. Some of this has to do with the fact that Niels Bohr insisted that wave function collapse (and anti-realism) was the only proper way to understand QM. Louis de Broglie, an important contributor to early quantum theory, had sketched out what was called a pilot wave model of the theory in the 1920s, but was convinced by Bohr to abandon it in favor of the Copenhagen interpretation. In the 1950s, his work was revived by David Bohm, who developed it into a fully realized, rigorous theory. Unfortunately, Bohm’s work was ignored for many years, partly because of his suspected Communist ties3. And a widespread misunderstanding of several proofs (an argument by von Neumann, the Kochen-Specker Theorem, and, chiefly, Bell’s Theorem4) led many people to believe that hidden variable theories had been conclusively ruled out.

What Bell’s Theorem actually showed is that no local hidden variable interpretation is possible – that is, any hidden variable interpretation must involve instantaneous cause and effect between physically separate locations. This rules out the kind of hidden variable theory that Einstein hoped could be developed, preserving the principle of locality5. But non-local hidden variable theories are still entirely feasible. Moreover, the collapse-based interpretations that we’ve already looked at are also necessarily non-local, since when collapse occurs, it instantaneously affects all entangled particles, regardless of their location. So, setting aside any biases against this sort of model, let’s talk about Bohm’s theory.

Bohmian Mechanics

The idea behind the pilot wave interpretation6 is to posit that the universe contains point-like particles, as in classical Newtonian mechanics, and, associated with each particle, a Quantum Mechanical wave function. The particles always have definite positions and velocities; a particle can never be in a superposition of different states – in other words, they are really, truly, like classical particles. The wave functions, on the other hand, are just like the wave functions in standard QM. The wave functions evolve – always – according to the Schrödinger Equation. The wave function in turn governs the motion of its corresponding particle according to a new “Guiding Equation”. Roughly, you can think of the wave function as “pushing around” its corresponding particle, and the particle will tend to move toward regions where the amplitude of the wave function is high.
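For readers who like to see the equations, here is the standard textbook form of the two laws for a single spinless particle. The notation is my own shorthand rather than anything defined earlier in this series: Q(t) is the particle's actual position, m is its mass, and V is whatever potential it moves in.

```latex
\begin{align}
  i\hbar\,\frac{\partial \psi}{\partial t}(\mathbf{x}, t)
    &= -\frac{\hbar^{2}}{2m}\,\nabla^{2}\psi(\mathbf{x}, t) + V(\mathbf{x})\,\psi(\mathbf{x}, t)
    && \text{(Schr\"odinger Equation)} \\
  \frac{d\mathbf{Q}}{dt}
    &= \frac{\hbar}{m}\,\operatorname{Im}\!\left(\frac{\nabla\psi}{\psi}\right)\!\bigl(\mathbf{Q}(t), t\bigr)
    && \text{(Guiding Equation)}
\end{align}
```

If you write the wave function as ψ = R·exp(iS/ħ), the second line is just dQ/dt = ∇S/m. So the "push" comes from how the phase of the wave function varies near the particle, and the drift toward high-amplitude regions is a consequence of that rather than a separate rule.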

Both the Schrödinger Equation and the Guiding Equation are deterministic. In other words, if you were given the exact positions and velocities of all the particles, and their exact wave functions, you could calculate exactly what the positions of the particles and the values of the wave functions would be for any future time. On the face of it, this seems crazy – Quantum Mechanics, after all, is supposed to be intrinsically probabilistic. So how can Bohm’s theory possibly work as an interpretation of Quantum Mechanics?

The trick is that even though Bohmian Mechanics is, in principle, deterministic, it also ends up placing very definite limitations on how precisely the positions and momenta of particles can be measured. In fact, it exactly preserves Heisenberg’s Uncertainty Principle – except that while the Uncertainty Principle is usually understood as saying, essentially, that a particle doesn’t have an exact position and momentum at the same time, the Bohmian version says that the particle does have an exact position and momentum, but that they can’t be simultaneously known. And that limitation is not ad hoc; it follows rigorously from the way the mechanics of the theory work. The probabilities that come up in performing a measurement, then, are in Bohm’s theory classical sorts of probabilities. We can’t predict the result of a measurement with certainty simply because we don’t – and can’t – know the initial conditions with sufficient precision, not because the fundamental laws of the universe are inherently non-deterministic.

Bohmian mechanics makes one further assumption, namely that at some time in the past, the distribution of particles throughout space statistically followed the square of the amplitudes of the particles’ wave functions. In other words, that if you picked, from each wave function, a physical region containing X% of the square of its amplitude, exactly X% of those wave functions would have their corresponding particle within that region. That is to say that, at that time in the past, the physical positions of the particles were just right so that if you knew their wave functions and then measured their positions, the usual probability rule would give you the correct probabilities for the results of those measurements. And it can be shown that if this was the case at any one instant, the deterministic laws that move those particles and wave functions around will result in it being true for all time7. That means that we are guaranteed that, for all time, any measurements we make of the positions of those particles will give us the same results we expect from the probability rules of standard Quantum Mechanics.
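Here is a minimal numerical sketch of that last claim, under some loud assumptions: one spatial dimension, a free particle whose wave function is a Gaussian packet at rest, and units where ħ = m = 1. For that packet the Guiding Equation reduces to the simple velocity field coded below. The names (`born_sigma`, `final_position`, `particle_spread`) are mine, chosen for illustration.

```python
import math
import random

HBAR = 1.0    # natural units (assumption)
MASS = 1.0
SIGMA0 = 1.0  # initial standard deviation of |psi|^2
RATE = HBAR / (2.0 * MASS * SIGMA0**2)


def born_sigma(t):
    """Standard deviation of |psi(x, t)|^2 for a free Gaussian packet at rest."""
    return SIGMA0 * math.sqrt(1.0 + (RATE * t) ** 2)


def velocity(x, t):
    """Guiding-equation velocity (hbar/m) * Im(psi'/psi) for that same packet."""
    return x * RATE**2 * t / (1.0 + (RATE * t) ** 2)


def final_position(x0, t_final, steps=200):
    """Integrate dx/dt = velocity(x, t) from x0 using the midpoint method."""
    dt = t_final / steps
    x, t = x0, 0.0
    for _ in range(steps):
        x_mid = x + 0.5 * dt * velocity(x, t)
        x += dt * velocity(x_mid, t + 0.5 * dt)
        t += dt
    return x


def particle_spread(t_final, n_particles=4000, seed=0):
    """Start particles distributed as |psi|^2, move them, return their final spread."""
    rng = random.Random(seed)
    finals = [final_position(rng.gauss(0.0, SIGMA0), t_final)
              for _ in range(n_particles)]
    mean = sum(finals) / n_particles
    return math.sqrt(sum((x - mean) ** 2 for x in finals) / n_particles)


if __name__ == "__main__":
    t = 4.0
    print("spread of the particles:", particle_spread(t))
    print("spread of |psi|^2      :", born_sigma(t))
```

The two printed numbers should agree up to sampling noise: particles that start out distributed like the squared amplitude are carried along so that they stay distributed like the squared amplitude, even as the packet spreads.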

Two-slit Interference Revisited

So far this is all pretty abstract, but let’s take a look at how the two-slit interference experiment that I talked about in part 1 works in Bohmian mechanics. As a reminder, this experiment consists of a coherent light source emitting light toward a screen – let’s imagine that the screen is made of a photosensitive material that will change color wherever light strikes it. Between the light source and the screen we place an opaque plate that contains two small vertical slits. If we turn on the light source, the light waves will pass through the slits, and the part of the wave coming through one slit will interfere with the part coming through the other slit, producing an “interference pattern” of bright and dark regions, like the one shown in the header image. Recall that the puzzling thing was that even if we shoot one photon at a time – which, classically, we’d expect would have to go through one slit or the other, but certainly not both at the same time – the little dots that appear on the screen as the photons hit, one by one, will gradually build up into that same interference pattern, even though it would seem that the individual photons had nothing to interfere with. In standard Quantum Mechanics (i.e. in any interpretation involving wave function collapse), this is explained as happening because as the photon travels from the source to the screen, it doesn’t have a definite position. Its wave function passes through both slits and interferes with itself; only once it hits the screen8 does the photon “decide” where within that wave function it is located. Since the probabilities for that “decision” depend on the amplitude of the wave function, the photon is more likely to choose to be in a spot where the wave function’s amplitude is large, and that’s why the interference pattern gradually emerges.

That’s orthodox QM; let’s look at the experiment again from the point of view of Bohmian mechanics. In this interpretation, each photon is a classical particle, with an associated wave function. The wave function of each photon starts out (as in standard QM) localized in a very small region at the light source – if we’re using, say, a laser, the wave function is localized within the small area of the laser-emitting surface. The amplitude of the wave function is presumably roughly uniform within that area. The particle – the photon itself – will have a definite position, somewhere within that area where the wave function has a non-zero amplitude. As we emit photon after photon, all of those photons’ wave functions will start out the same, in that same small area. But the photons themselves will not all start in exactly the same spot within that area; they’ll be distributed in proportion to the square of the wave function’s amplitude, which in this case means they’ll be uniformly distributed across the laser-emitting surface.

The wave function follows Schrödinger’s Equation, which means (just as in standard QM) it will pass through the two slits and interfere with itself – and since all the photons’ wave functions start out the same, they’ll all end up with the same interference pattern on the other side of the slits. But what happens to the photons themselves? Well, each photon has a definite position at all times, so a photon must go through one hole or the other one9. Remember, the photons in Bohmian mechanics are pushed around by their wave functions according to the Guiding Equation, which is deterministic. So if we start with a photon in some particular spot within the initial wave function, the Guiding Equation tells us exactly what path it will be pushed along. Which slit it passes through, then, depends only on where, exactly, within the initial wave function, the photon started. We can say, with certainty, that if a photon starts here then it will pass through this slit, and if it starts there then it will pass through the other one. But since we don’t actually know where this particular photon started, all we can say is that there’s a 50% chance it will pass through one slit and a 50% chance through the other.

Whichever slit it passes through, the photon will then emerge into the region between the plate and the screen, which is where its corresponding wave function is interfering with itself. The wave function now has a high amplitude at certain spots along the screen and a low amplitude at others. And because of the way the Guiding Equation works, it will tend to push the photon toward the regions where it has a high amplitude and away from those where it has a low amplitude. Where, exactly, on the screen the photon ends up hitting depends on where, exactly, it emerged from the slit it passed through – and that depends on where, exactly, it started within the initial wave function. But because the Guiding Equation pushes the photon toward the high amplitude region, more of those possible starting places will end up with the photon hitting the screen in one of the high amplitude regions. The lower the amplitude of the wave function, the smaller the area the photon would have had to start in to end up there.

If you used the Guiding Equation (and the Schrödinger Equation) to calculate the exact path that a photon would take from its starting position, and did this for many photons at slightly different starting positions distributed uniformly across the slits, you’d end up with this:

Trajectories of particles in the two slit experiment, according to Bohmian mechanics

You can see that the photons end up being pushed into separate bands, which gives you the pattern of light and dark regions you end up with when you do this experiment.
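If you want to trace trajectories like these yourself, here is a rough sketch. It is a toy model under stated assumptions, not the calculation behind the figure: I treat only the direction across the slits, model the wave emerging from each slit as a spreading Gaussian packet, use a massive non-relativistic particle rather than a photon, and work in units where ħ = m = 1. `HALF_GAP`, `SIGMA0` and the function names are illustrative choices of mine.

```python
import cmath

HBAR = 1.0      # natural units (assumption)
MASS = 1.0
SIGMA0 = 0.5    # initial width of the packet behind each slit (assumption)
HALF_GAP = 4.0  # slits are centred at x = +HALF_GAP and x = -HALF_GAP (assumption)
RATE = HBAR / (2.0 * MASS * SIGMA0**2)


def velocity(x, t):
    """Guiding-equation velocity (hbar/m) * Im(psi'/psi) for two spreading Gaussians."""
    spread = complex(1.0, RATE * t)  # 1 + i*hbar*t / (2*m*sigma0^2)
    scale = 4.0 * SIGMA0**2 * spread
    upper = cmath.exp(-((x - HALF_GAP) ** 2) / scale)  # packet from the slit at +HALF_GAP
    lower = cmath.exp(-((x + HALF_GAP) ** 2) / scale)  # packet from the slit at -HALF_GAP
    # Spatial derivative of the sum; the common normalisation factor cancels in the ratio.
    dpsi = -((x - HALF_GAP) * upper + (x + HALF_GAP) * lower) / (2.0 * SIGMA0**2 * spread)
    psi = upper + lower
    return (HBAR / MASS) * (dpsi / psi).imag


def trajectory(x0, t_final=4.0, steps=2000):
    """Follow one particle from x0 with a fourth-order Runge-Kutta integrator."""
    dt = t_final / steps
    x, t = x0, 0.0
    for _ in range(steps):
        k1 = velocity(x, t)
        k2 = velocity(x + 0.5 * dt * k1, t + 0.5 * dt)
        k3 = velocity(x + 0.5 * dt * k2, t + 0.5 * dt)
        k4 = velocity(x + dt * k3, t + dt)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return x


if __name__ == "__main__":
    offsets = (-0.6, -0.3, 0.0, 0.3, 0.6)
    starts = [sign * HALF_GAP + off for sign in (1, -1) for off in offsets]
    for x0 in starts:
        print(f"start {x0:+.2f} -> screen {trajectory(x0):+.3f}")
```

Two things are worth noticing in this toy version. The velocity is exactly zero on the centre line between the slits, so a particle that comes out of the upper slit never crosses into the lower half, and vice versa. And where a particle lands is fixed entirely by where it started, which is the whole point of the discussion above.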

So that’s what it means when we say that in Bohm’s interpretation, the probabilities (as in classical physics) come purely from ignorance of the exact starting conditions. Standard QM says that there are only the wave functions, and you could know everything there is to know about a photon’s wave function and still not be able to predict where it will hit the screen. In Bohmian mechanics, although the wave functions are all the same, the photons themselves each start in slightly different positions, and the reason we can’t predict exactly where a given photon will hit is just that we don’t (and can’t) know exactly where it started.

Spin in Bohm’s Interpretation

I’ve so far ignored the example that featured so prominently last time, that of the measurement of the x-spin of an electron. The reason for this is that the way Bohm’s theory works makes a measurement of position, like that in the two-slit experiment, conceptually a lot simpler. But let’s look at the spin example now.

The spin part of the wave function works just the same way in Bohm’s theory as it does in standard QM. In other words, the wave function of an electron will have a spin-component that could be something like ψ = |x-up> or ψ = a|x-up> + b|x-down>. But the particles themselves in Bohm’s theory do not have spins. They just have positions and velocities. So it’s perfectly sensible to ask what the spin part of an electron’s wave function is, but it makes no sense here to ask what the actual spin of the electron itself is.

In our hypothetical experiment, we start with an electron in this state:

ψ = a|x-up> + b|x-down>

. . . and we let it enter a measuring device that contains a magnetic field. The field is supposed to push the electron upward if its x-spin is up and downward if its x-spin is down. Again, the wave function starts out localized within a very small, but non-zero, region, and, as with the photon in the two-slit experiment, the electron itself is somewhere within that region, but we don’t know exactly where. In this initial state, the |x-up> and |x-down> parts of the wave function are in the same spatial location (in other words, the spin part of the wave function and the position part are not entangled with each other). When the electron (and its wave function) enters the device, what the magnetic field essentially does is to separate the |x-up> and |x-down> components of the wave function in space. You can think of this as the wave function splitting into two pieces, the |x-up> piece drifting upward and the |x-down> piece drifting downward. Another way of saying this is that the spin part of the wave function is now entangled with the position part, so the wave function is now something like:

ψ = a|x-up>|upper-region> + b|x-down>|lower-region>

The amplitude of the piece of the wave function that has moved up into the upper region of the measuring device is still a, and the amplitude of the part that has moved into the lower region is b.

So, what happens to the electron itself? Well, the electron doesn’t care about x-spin; the electron doesn’t even know what that is. But the electron does start out at some particular position within the original wave function. When the wave function separates into two parts, each part will try to “push” the electron in its direction, and where the electron ends up will depend on where it started and what the amplitudes are. Let’s say that a = b, so the two parts have equal amplitude. Then, the electron will feel an equal push in both directions, so if it started out in the upper half of the initial region, it will end up moving upward with the |x-up> part of the wave function, and if it started out in the lower half of the initial region, it will end up moving downward with the |x-down> part. If, say, b is much smaller than a, then the |x-up> part will exert a greater “push” on the electron, and there will be a correspondingly smaller area within the original wave function where the electron could start to end up moving downward with the |x-down> piece.

If the electron moves upward, it will push on the pointer, and the measuring device will give a result of “up”, and if it moves downward, it will give a result of “down”. And the way the Guiding Equation works, if the electrons are uniformly distributed within the initial region of the wave function, then an a² fraction of the times we perform that experiment, we’ll get the “up” result, and a b² fraction of the times we’ll get a “down” result. And that’s a pretty neat little trick, if you ask me. Once again, we get the same results we expect from “normal” Quantum Mechanics, but instead of invoking probability on a fundamental level, it all comes down to the fact that we don’t know the precise location that the electron started in.

There’s one other kind of interesting thing to note here. Imagine that we start in a state where a = b, so the |x-up> and |x-down> components of the wave function have equal amplitude. And imagine that the electron happens to start out somewhere in the upper half of the region of the initial wave function. As we’ve said, if we run the experiment, the electron will drift upward with the |x-up> component, so we’ll measure the x-spin to be “up”. But now imagine that we have the same initial setup but we’ve reversed the direction of the magnetic field in the device (or just turned the device upside down, for that matter). What this means is that now the |x-up> component will drift downward and the |x-down> component will drift upward10. Now, our electron that starts in the upper half of the initial region will still feel a greater pull from the part of the wave function that moves upward. And so it will still drift upward, even though now that means that it’s moving with the |x-down> part of the wave function, not the |x-up> part. In other words, the result that we get when we measure the x-spin depends on the way the measuring device happens to be oriented! Spin, therefore, is said to be a “contextual” property in Bohmian mechanics. There’s no “fact” about whether the electron’s spin is up or down except in relation to some particular measurement device.
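The bookkeeping in the last few paragraphs can be condensed into a few lines of code. This is a deliberately crude toy, and the rule it uses is an assumption on my part: in a one-dimensional version of the device, trajectories cannot cross, so the electrons that leave with the upward-moving piece of the wave function are exactly those that started in the top slice of the initial region whose share of the squared amplitude equals that piece's weight. The names `start_fraction` and `up_weight` are made up for this sketch.

```python
import random


def measure_x_spin(start_fraction, up_weight, field_reversed=False):
    """Return the pointer reading ("up" or "down") for a single run.

    start_fraction: share of the initial squared amplitude lying below the
                    electron's starting point (0 = very bottom, 1 = very top).
    up_weight:      a squared, the weight of the |x-up> component.
    """
    if not field_reversed:
        # The |x-up> piece (weight a^2) drifts upward; the top a^2 slice goes with it.
        return "up" if start_fraction >= 1.0 - up_weight else "down"
    # Reversed field: the |x-down> piece (weight b^2 = 1 - a^2) drifts upward instead.
    return "down" if start_fraction >= up_weight else "up"


def up_frequency(up_weight, field_reversed=False, runs=20000, seed=1):
    """Fraction of runs reading "up" when starting points follow the squared amplitude."""
    rng = random.Random(seed)
    ups = sum(measure_x_spin(rng.random(), up_weight, field_reversed) == "up"
              for _ in range(runs))
    return ups / runs


if __name__ == "__main__":
    print("frequency of 'up', normal field  :", up_frequency(0.8))
    print("frequency of 'up', reversed field:", up_frequency(0.8, field_reversed=True))
    # Same electron, same starting point, different device orientation:
    print(measure_x_spin(0.75, 0.5), measure_x_spin(0.75, 0.5, field_reversed=True))
```

The long-run frequency of "up" comes out close to a² whichever way the field points, so the statistics match standard QM. But a single electron starting three quarters of the way up gives "up" with the field one way and "down" with it the other way, which is the contextuality described above.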

Multiple Particles and Dimensionality

I told a little white lie earlier, and it’s time to go back and correct that. Actually, there’s something that I mentioned in part 1 but that I’ve kind of been glossing over and not really calling attention to since then, namely that when you come down to it, you can’t really think of there being a wave function for each particle, but rather you need a single wave function for all the particles to adequately describe the universe. This is true in all interpretations of QM, but one of the biggest worries about Bohmian mechanics in particular has a lot to do with this, so let’s be clear about it now.

The issue is entanglement. If a particle is not entangled with any other particles (like the photon or the electron in the examples we’ve just been considering), then it is a perfectly good shorthand to talk about “that particle’s wave function”. But because entanglement is possible in Quantum Mechanics, to provide a general description of a system of particles, we need a wave function that describes all of them.

There’s going to be a little bit more math-y notation than usual in what follows, just because I want to show why it ends up being this way, but don’t worry – if your eyes glaze over, you can just take my word for it.

Imagine we have a universe containing two particles. If entanglement weren’t possible, then we could think of there being two wave functions, ψ_1 and ψ_2, in 3D space. To specify the state of the wave functions, we just need to specify the amplitude of ψ_1 and the amplitude of ψ_2 at each x, y, and z position in 3D space. The amplitude of ψ_1 at coordinates x, y, z tells us (once squared) what the chance is of measuring particle 1 to be at position x, y, z, and similarly for ψ_2 and particle 2. Or we could think of there being a single wave function, ψ, in 6D space. The amplitude of that wave function at coordinates x1, y1, z1, x2, y2, z2 tells us (again, once squared) what the chance is of measuring particle 1 to be at position x1, y1, z1 and also of measuring particle 2 to be at position x2, y2, z2.

That might seem like it’s just two ways of saying exactly the same thing. It’s, like, do you specify three numbers twice or do you specify six numbers once? But once you allow entanglement, only the second way is adequate, because now the positions of the two particles could be correlated.

To see why, let’s simplify things and imagine we have two regions each of the two particles could be in, A and B. Let’s say that |1-A> means “the state where particle 1 is in region A”, and so on for particle 2 and region B. The two particles might be in an unentangled state where we could write their wave functions separately, like this11:

ψ_1 = 1/√2 |1-A> + 1/√2 |1-B>
ψ_2 = 1/√2 |2-A> + 1/√2 |2-B>

In that case, we would write the total wave function of the system as the product of the two particles’ wave functions:

ψ = ψ_1 ψ_2

Multiplying it out, we have:

ψ = 1/2|1-A>|2-A> + 1/2|1-A>|2-B> + 1/2|1-B>|2-A> + 1/2|1-B>|2-B>

This is a state where particle 1 has a 50% chance of being in region A and 50% in region B, and the same for particle 2, and those probabilities are uncorrelated. Particle 2 has the same chance of being in A or B regardless of where particle 1 actually turns out to be. This means that there’s a 25% chance of getting each possible combination of particles 1 and 2 in regions A and B.

But we could also have a state like this:

ψ = 1/√2 |1-A>|2-A> + 1/√2 |1-B>|2-B>

This is also a state where particle 1 has a 50% chance of being in region A and 50% in region B, and the same is true of particle 2. But it is not the same as the state above, because now there is no chance of finding the particles in different regions. If we were to specify just the amplitudes for each particle to be in each region, we’d think that this was the same state as the one above. Instead, we need to specify the amplitude for each combination of particles 1 and 2 in regions A and B. And note that, no matter how hard you try, there’s no way of writing this state out as a product of a wave function just involving particle 1 and one just involving particle 2.
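To make the comparison concrete, here is a small sketch that stores each of the two states above as a table of amplitudes, one per combination of regions, and then computes the single-particle chances and the joint chances. The representation (a dictionary keyed by region pairs) and the function names are my own illustrative choices. The factorisation test relies on a standard fact: a two-by-two table of amplitudes can be written as a product of two single-particle states exactly when its determinant is zero.

```python
import math

REGIONS = ("A", "B")

# Amplitude for (region of particle 1, region of particle 2).
product_state = {(r1, r2): 0.5 for r1 in REGIONS for r2 in REGIONS}

S = 1.0 / math.sqrt(2.0)
entangled_state = {("A", "A"): S, ("A", "B"): 0.0,
                   ("B", "A"): 0.0, ("B", "B"): S}


def joint_probabilities(state):
    """Chance of each combination of regions: the squared amplitude."""
    return {pair: abs(amp) ** 2 for pair, amp in state.items()}


def marginal(state, particle):
    """Chance of each region for one particle alone (0 = particle 1, 1 = particle 2)."""
    probs = {region: 0.0 for region in REGIONS}
    for pair, amp in state.items():
        probs[pair[particle]] += abs(amp) ** 2
    return probs


def is_product(state, tol=1e-12):
    """True if the state factorises into (particle 1 state) times (particle 2 state)."""
    det = (state[("A", "A")] * state[("B", "B")]
           - state[("A", "B")] * state[("B", "A")])
    return abs(det) < tol


if __name__ == "__main__":
    for name, state in (("product", product_state), ("entangled", entangled_state)):
        print(name, marginal(state, 0), marginal(state, 1))
        print("  joint:", joint_probabilities(state))
        print("  factorises:", is_product(state))
```

Both states give each particle a 50/50 chance of being in either region, so the single-particle numbers cannot tell them apart. Only the joint table, and the failed factorisation test, shows that the second state is a different animal.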

So, in our little example, we see that instead of two wave functions living in three-dimensional space, we have one wave function living in six-dimensional space. And obviously the same arguments can be extended to three or four or a million particles. So if the universe contains N particles, we have, as it turns out, not N wave functions living in three-dimensional space, but one wave function living in 3N dimensional space.

A Strange Picture of the Universe

The above conclusion – that we must really deal with a single wave function in a many-dimensional space instead of many separate wave functions in three-dimensional space – is true of all interpretations of Quantum Mechanics, but let’s return now to Bohm’s theory. The little white lie I told was when I said that in Bohm’s theory, each particle has an associated wave function. As we’ve now seen, there’s actually just a single wave function for all particles. Now, mathematically, there’s no problem here whatsoever. The Guiding Equation simply relates the appropriate parts of the universal wave function to each particle.
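In symbols, the many-particle Guiding Equation takes the following standard form. The notation is mine, not something set up earlier in the series: Q_k is the actual position of particle k, m_k is its mass, ∇_k is the gradient with respect to the coordinates of particle k only, and Ψ is the single wave function on 3N-dimensional space.

```latex
\begin{equation}
  \frac{d\mathbf{Q}_k}{dt}
    = \frac{\hbar}{m_k}\,
      \operatorname{Im}\!\left(\frac{\nabla_k \Psi}{\Psi}\right)\!
      \bigl(\mathbf{Q}_1(t), \dots, \mathbf{Q}_N(t), t\bigr),
  \qquad k = 1, \dots, N
\end{equation}
```

Notice that the right-hand side is evaluated at the actual positions of all N particles at once. When Ψ is a product of single-particle wave functions, the other particles drop out and each particle is guided by "its own" wave function, which is the shorthand I was using before. When Ψ is entangled, the velocity of particle k depends on where the other particles are at that same instant, however far away they happen to be, and that is where the non-locality lives.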

But, to a certain way of thinking, there may be a philosophical problem here. The idea of Bohm’s interpretation was that the wave function was supposed to be pushing the particles around. And if you’re a realist about the wave function, you’re kind of committed to saying that the 3N-dimensional space that the wave function exists in is a literal, physical kind of space, as “real” as the three-dimensional space we live in. But if the wave function exists in this 3N-dimensional space, how can it be pushing around particles in our three-dimensional space? It’s almost as if the wave function exists in one universe and the particles in another, and it might be seen as philosophically objectionable that objects in one physical space could have physical effects on objects in a completely different space12.

One way around this is to say that, just as there is really only one wave function, there is also really only one particle, existing, like the wave function, in a 3N-dimensional space. The picture of the universe that this leads to is one in which there is a single world-particle being pushed around by a single wave function in a very high-dimensional space, and everything that we seem to see and interact with in the universe is, more or less, just a matter of the particular spot in that space that the world-particle occupies. In this view, the appearance that the world is three-dimensional, and that it contains many particles, is just an illusion that comes about because of the particular way that the world-particle is moving about in its strange universe13.

Now, as I’ve mentioned before, I’m more or less a logical positivist. As a result, this whole thing strikes me as a non-issue. If the purpose of a physical theory is just to describe and predict empirical experiences, then there’s no use worrying about the high-dimensional space of the wave function. The Schrödinger Equation and the Guiding Equation together describe the motions of particles in three-dimensional space, and that’s that. But for many of its proponents, the whole appeal of Bohm’s interpretation is that it allows us to be flat-footedly realistic about particles, in the ordinary, everyday, three-dimensional sense. And certainly, from that point of view, one can see it might be troubling if it turns out that the only way to be a realist about Bohmian mechanics is to make the astonishingly counter-intuitive claim that the universe contains just one single particle and that the number of spatial dimensions is not three but some almost inconceivable, astronomically high number.

There’s a lot more that could be said about Bohm’s interpretation, and interpretations of this type in general, and maybe I’ll have a chance to come back to them when I go over a few odds and ends in the last part. But for now, let me just wrap this up by saying that despite its unpopularity in the scientific community, I think Bohmian mechanics is a perfectly good solution to the measurement problem. It could easily have become the orthodox version of QM, and in a way it’s kind of surprising that it didn’t, and that a much more revolutionary and anti-classical way of thinking about the world assumed that role instead.

Next time, I’ll talk about the third option in that trilemma I mentioned at the beginning, that of both insisting that collapse never happens and refusing to add anything beyond wave functions to the theory. And we’ll discuss the only interpretation of QM that is the subject of a Star Trek episode, the Many Worlds Interpretation.