Chapter 5: Spin 1/2 and the Stern-Gerlach Experiment — Seeds of Hilbert Space
Story so far:
In Ch. 4, we introduced the rules of probability amplitudes as Feynman's three laws. We learned that quantum mechanical probabilities are given by "the absolute value squared of the amplitude," that amplitudes are complex numbers, and that when intermediate states are inserted, we "multiply then add" the amplitudes. This time, we apply these rules to a concrete physical system—the Stern-Gerlach experiment with spin-1/2 particles—and open the door to the mathematical structure of quantum mechanics.
Goals of this chapter
- Confirm the discreteness of "spin" through the Stern-Gerlach experiment, and introduce state vectors \(|\pm\rangle\) and the concept of a basis
- Describe how measurements in different directions change the state using the language of amplitudes, and experience the structure of a 2-dimensional complex vector space (the seeds of Hilbert space)
5.1 The Stern-Gerlach Experiment — A Beam Splits in Two
🟡 Lina: Now, in the previous chapter we acquired the rules of probability amplitudes. Today we'll see how those rules are actually used, through a historically famous experiment. It was performed in 1922 by Otto Stern and Walther Gerlach.
🔵 Kai: What kind of experiment is it?
🟡 Lina: The experimental apparatus is simple. A beam of silver atoms is emitted from a hot furnace and passed through an inhomogeneous magnetic field—that is, a magnetic field whose strength varies from place to place. Then we observe where the atoms arrive on a screen after passing through the field.
⚪ Mei: Why use an inhomogeneous magnetic field?
🟡 Lina: Good question. If you place a magnet in a uniform magnetic field, the magnet rotates but doesn't translate. To make a magnet translate, the field strength must vary with position—there must be a gradient. Since silver atoms behave like tiny magnets, they experience a force in an inhomogeneous field and get deflected.
🔵 Kai: I see. So what would classical physics predict?
🟡 Lina: In classical physics, the "magnet orientations" of silver atoms coming out of the furnace should be random, pointing in all directions. Atoms aligned with the field direction get deflected upward, those pointing opposite get deflected downward, and atoms at intermediate angles experience intermediate forces. So—
🔵 Kai: A continuous band-like pattern would appear on the screen?
🟡 Lina: Yes, that's the classical prediction. But the actual result was—
🔵 Kai: It was different!?
🟡 Lina: The beam split into exactly two spots. One above and one below. Nothing in between. Look at Fig. 5.1 "The Stern-Gerlach experimental apparatus"—the light gray shows the classical prediction of a continuous band, while what was actually observed was two concentrated spots.
Fig. 5.1: The Stern-Gerlach experimental apparatus. A beam of silver atoms from a furnace passes through an inhomogeneous magnetic field and splits into two spots on the screen. Rather than the continuous band predicted by classical physics (light gray), the atoms concentrate at two discrete spots (\(|+\rangle\) and \(|-\rangle\)).
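The contrast between the two predictions can be sketched numerically. This is a toy model, not a simulation of the real apparatus. It assumes that the deflection is proportional to the \(z\)-component of the atom's magnetic moment, that classical orientations are uniformly random on a sphere (so \(\cos\theta\) is uniform on \([-1, 1]\)), and that the two quantum outcomes are equally likely. The function names, the units, and the 0.9 cutoff are illustrative choices.

```python
import random

def classical_deflections(n, rng):
    """Deflection (arbitrary units) for n classical magnets with random orientation.

    For directions uniform on a sphere, cos(theta) is uniform on [-1, 1],
    and the deflection is assumed proportional to cos(theta).
    """
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

def quantum_deflections(n, rng):
    """Deflection for n spin-1/2 atoms: only +1 or -1 in the same units."""
    return [rng.choice((+1.0, -1.0)) for _ in range(n)]

rng = random.Random(0)  # fixed seed so the run is reproducible
classical = classical_deflections(10_000, rng)
quantum = quantum_deflections(10_000, rng)

# Count how many atoms land strictly between the two extreme spots.
classical_between = sum(1 for d in classical if abs(d) < 0.9)
quantum_between = sum(1 for d in quantum if abs(d) < 0.9)
print("classical atoms in the middle region:", classical_between)
print("quantum atoms in the middle region:  ", quantum_between)
print("distinct quantum outcomes:", sorted(set(quantum)))
```

In this sketch the classical list fills the whole interval, which corresponds to the continuous band, while the quantum list contains only the two extreme values.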
🔵 Kai: Wait... not continuous, but only two?
🟡 Lina: Yes. This is the phenomenon called spatial quantization. It shows that "something" the silver atoms possess in the magnetic field direction can only take discrete—that is, separated—values, not continuous ones.
⚪ Mei: So it's not that they "can point in any direction," but rather there are only two choices: "up or down."
🟡 Lina: To be precise, the component in the magnetic field direction (let's call it the \(z\) direction) can only take two values. Those values are:
\[ S_z = +\frac{\hbar}{2} \quad \text{or} \quad S_z = -\frac{\hbar}{2} \]
Here \(\hbar\) (h-bar) is the reduced Planck constant from Ch. 1—\(\hbar = h/(2\pi) \approx 1.055 \times 10^{-34}\) J·s.
🔵 Kai: What's \(S_z\)?
🟡 Lina: That's today's main theme. It's the \(z\)-component of spin angular momentum. I'll explain it in detail in the next section.
✅ Comprehension Check: The splitting of the beam into two in the Stern-Gerlach experiment is called "spatial quantization." What property of silver atoms does this demonstrate?
Answer
It demonstrates that the spin angular momentum component \(S_z\) of silver atoms in the magnetic field direction can only take the two discrete values \(+\hbar/2\) and \(-\hbar/2\), not continuous values.
✅ Comprehension Check: How did the actual result of the Stern-Gerlach experiment differ from the prediction of classical physics?
Answer
Classical physics predicted a continuous band-like pattern on the screen, but in reality the beam split into exactly two discrete spots. This shows that the magnetic property of silver atoms can only take discrete values, not continuous ones.
📝 Exercises:
- Qualitatively sketch the classical prediction of the Stern-Gerlach experiment and explain the difference from the quantum mechanical result → Problem A-2. "Which Path Was Taken" and the Disappearance of Interference
5.2 What Is Spin? — An Intrinsic Property, Not Classical "Rotation"
🔵 Kai: You said "spin angular momentum" earlier—does that mean the electron is spinning? Like the Earth rotates on its axis?
🟡 Lina: A very good question. The answer is "no." The name is "spin"—meaning "rotation" in English—but it's something completely different from classical rotation.
🔵 Kai: Then what is it?
🟡 Lina: Spin is an intrinsic angular momentum that particles are born with. Unlike angular momentum that arises from orbiting (called orbital angular momentum), spin exists even when a particle is at rest. It has no classical counterpart—it's a purely quantum mechanical property.
⚪ Mei: "No counterpart" means it's futile to try to understand it with classical imagery?
🟡 Lina: That's right. You might want to think of the electron as "a tiny spinning sphere," but models that treat the electron as a point particle work best. A point "spinning" doesn't make sense, does it? Spin exists as a measurable physical quantity, but you cannot draw a classical picture of it.
🔵 Kai: Hmm, then how should we understand it?
🟡 Lina: Through mathematical structure and experimental results. Let me list the properties of spin. I've also summarized the differences from classical angular momentum in Table 5.1 "Comparison of spin and classical angular momentum".
Table 5.1: Comparison of spin and classical angular momentum
| Property | Classical angular momentum (orbital) | Spin angular momentum |
|---|---|---|
| Origin | Rotation/revolution of objects | Intrinsic to the particle (independent of motion) |
| Measured values | Continuous, arbitrary values | Discrete (\(\pm\hbar/2\) only, for spin 1/2) |
| Changing magnitude | Changes by rotating faster/slower | Fixed by particle type, cannot be changed |
| Classical image | Rotating object (Earth's rotation, etc.) | No corresponding image |
🟡 Lina: Specifically, the properties of spin are:
- Intrinsic — determined by the type of particle, cannot be changed
- Discrete — measured values can only take separated values
- Characterized by the spin quantum number \(s\). Electrons, protons, and neutrons have \(s = 1/2\)
🟡 Lina: When a particle with spin quantum number \(s\) is measured along the magnetic field direction, the possible values of \(S_z\) range from \(-s\hbar\) to \(+s\hbar\) in steps of \(\hbar\)—a total of \((2s+1)\) values. For \(s = 1/2\), there are \(2 \times 1/2 + 1 = 2\) values: \(-\hbar/2\) and \(+\hbar/2\)—that's why the beam splits into two in the Stern-Gerlach experiment.
⚪ Mei: I see. Since the outermost electron of a silver atom has \(s = 1/2\), \(S_z\) can only take two values \(+\hbar/2\) and \(-\hbar/2\), so the beam splits into two lines. ...But silver atoms have many electrons—why does only one matter?
🟡 Lina: Good question. A silver atom has 47 electrons, but the inner 46 are paired so their spins cancel each other out. Only the remaining outermost one "shows" its spin. So the atom as a whole behaves as if it has spin 1/2. Incidentally, in 1925 Uhlenbeck and Goudsmit proposed this concept of "electron spin." That was three years after the Stern-Gerlach experiment.
🔵 Kai: Wait a moment. If it were a spin-1 particle, would the beam split into three?
🟡 Lina: Yes! For \(s = 1\), there are \(2 \times 1 + 1 = 3\) possible values, \(S_z = +\hbar, 0, -\hbar\), so the beam splits into three. Feynman's textbook starts with the spin-1 example, but here we'll work with spin 1/2. The reason is that the electron, one of the most familiar fundamental particles, has spin 1/2, and since there are only two states, the mathematics becomes the simplest possible.
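The counting rule "from \(-s\hbar\) to \(+s\hbar\) in steps of \(\hbar\)" can be written as a few lines of code. The sketch below works in units of \(\hbar\) and uses exact fractions so that half-integer values stay exact. The function name `sz_values` is an illustrative choice.

```python
from fractions import Fraction

def sz_values(s):
    """Allowed S_z values in units of hbar for spin quantum number s.

    The values run from -s to +s in steps of 1, which gives 2s + 1 values.
    """
    s = Fraction(s)
    count = int(2 * s + 1)
    return [-s + k for k in range(count)]

for s in (Fraction(1, 2), 1):
    values = sz_values(s)
    print(f"s = {s}: {len(values)} values ->", [str(v) for v in values])
```

The length of the returned list is the number of spots expected on the screen for a particle of that spin.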
✅ Comprehension Check: How many possible values of \(S_z\) can a particle with spin quantum number \(s\) have when measured along the magnetic field direction? Answer for both \(s = 1/2\) and \(s = 1\).
Answer
The number of possible values is \(2s+1\). For \(s = 1/2\): \(2 \times 1/2 + 1 = 2\) values (\(\pm\hbar/2\)). For \(s = 1\): \(2 \times 1 + 1 = 3\) values (\(+\hbar, 0, -\hbar\)).
✅ Comprehension Check: Give two ways in which spin differs from classical "rotation."
Answer
(1) Spin is an intrinsic property that exists even when the particle is at rest, and does not originate from orbital motion. (2) Its measured values are discrete (only \(\pm\hbar/2\)), unlike a classical rotating body which can have any arbitrary angular momentum.
5.3 Introducing State Vectors and Ket Notation
🟡 Lina: Now, here comes the main topic. We're going to describe the results of the Stern-Gerlach experiment using the language of probability amplitudes from Ch. 4. To do that, I'll introduce the concept of state vectors and the notation to write them.
🔵 Kai: State vectors?
🟡 Lina: In quantum mechanics, the "state" of a particle is represented by a vector. The vectors you learned about in high school were arrows: quantities with magnitude and direction. Quantum mechanical state vectors also have a kind of "direction," but they live in a different space. Not in real 2-dimensional or 3-dimensional space, but in a complex number space. Specifically, they can be represented as vectors whose components are complex numbers arranged vertically, for example
\[ \begin{pmatrix} c_+ \\ c_- \end{pmatrix} \]
This kind of "vector with components arranged vertically" is called a column vector. In high school, you usually wrote components horizontally as \((a, b)\), but in quantum mechanics we use the vertical notation. Conversely, components arranged horizontally—like \((c_+^*,\; c_-^*)\)—are called a row vector. The reason this distinction matters will become clear in "Key Point 1: States Are Vectors".
🟡 Lina: To write these state vectors, we use a notation devised by Dirac. A vector representing a state is written as
\[ |\alpha\rangle \]
and called a ket. It's a notation enclosed between a vertical bar and an angle bracket, where the \(\alpha\) part is a label to distinguish the state—like a name.
🔵 Kai: Why is it called a "ket"?
🟡 Lina: The English word "bracket" was split into "bra" and "ket." The "bra" will appear in the section right after this one. For now, just remember ket.
🟡 Lina: The state measured as \(S_z = +\hbar/2\) in the Stern-Gerlach experiment is written \(|+\rangle\), and the state measured as \(S_z = -\hbar/2\) is written \(|-\rangle\). These are the two basic states of the spin-1/2 system.
⚪ Mei: \(|+\rangle\) and \(|-\rangle\) correspond to the beam deflected upward and downward in the Stern-Gerlach apparatus.
🟡 Lina: Exactly. And here is the core of quantum mechanics. A general state can be written as a sum of these two basic states with complex number coefficients:
\[ |\psi\rangle = c_+|+\rangle + c_-|-\rangle \tag{5.4} \]
Such a combination is called a superposition.
🔵 Kai: "Adding" states—what does that mean? Why is such a thing allowed?
🟡 Lina: Because experiments demand it. For example, you can orient the Stern-Gerlach apparatus's magnetic field in the \(x\) direction instead of the \(z\) direction. Then a state "spin up in the \(x\) direction"—let's write it as \(|+\rangle_x\)—can be selected. If you then pass this \(|+\rangle_x\) particle through a \(z\)-direction apparatus, up and down come out in equal proportions.
🔵 Kai: Equal proportions? If it's \(|+\rangle\) it should go 100% up, and if it's \(|-\rangle\) it should go 100% down!
🟡 Lina: Right. That means \(|+\rangle_x\) is "a third state that is neither \(|+\rangle\) nor \(|-\rangle\)." To express this third state within a 2-dimensional framework, the only option is to "add" \(|+\rangle\) and \(|-\rangle\) with complex number coefficients—just as you decompose a diagonal arrow in a plane into \(x\) and \(y\) components. (The specific form and calculation will be done in 5.5 "Measurements in Different Directions — Basis Transformation and Probability Amplitudes". For now, just know that "such a state exists.") Why "addition" is mathematically allowed can be understood from the fact that the fundamental equation of quantum mechanics (the Schrödinger equation) is linear, which we'll cover in later chapters. "Linear" means that if a state is a solution of the equation, then its constant multiples and sums of two solutions are also solutions—so superposed states are also physically valid states. But for now, let's proceed taking "experiments demand it" as our starting point.
⚪ Mei: So superposition is not just "mathematically allowed" but "experimentally necessary."
In equation (5.4), \(c_+\) and \(c_-\) are complex number coefficients. In the language of Ch. 4, \(c_+\) is "the probability amplitude for state \(|\psi\rangle\) to be found as \(|+\rangle\)," and \(c_-\) is "the probability amplitude for state \(|\psi\rangle\) to be found as \(|-\rangle\)."
🔵 Kai: Probability amplitudes! Those came up in the previous chapter. So \(|c_+|^2\) is the probability of getting \(S_z = +\hbar/2\), and \(|c_-|^2\) is the probability of getting \(S_z = -\hbar/2\)?
🟡 Lina: Perfect. And since the total probability is 1:
\[ |c_+|^2 + |c_-|^2 = 1 \]
This is called the normalization condition.
⚪ Mei: So equation (5.4) says "the spin state can be expressed as a complex-weighted combination of \(|+\rangle\) and \(|-\rangle\)." And the absolute value squared of the weights gives the measurement probabilities.
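Mei's summary translates directly into a small calculation. The sketch below takes two amplitudes, checks the normalization condition, and returns the two measurement probabilities. The function name and the example amplitudes (one real, one purely imaginary) are illustrative and do not come from the text.

```python
import math

def measurement_probabilities(c_plus, c_minus):
    """Return (P(+hbar/2), P(-hbar/2)) for the state c_plus|+> + c_minus|->."""
    p_plus = abs(c_plus) ** 2
    p_minus = abs(c_minus) ** 2
    # The normalization condition: the two probabilities must add up to 1.
    if not math.isclose(p_plus + p_minus, 1.0, abs_tol=1e-12):
        raise ValueError("state is not normalized")
    return p_plus, p_minus

# Example amplitudes chosen for illustration.
p_plus, p_minus = measurement_probabilities(0.6, 0.8j)
print(f"P(S_z = +hbar/2) = {p_plus:.2f}")
print(f"P(S_z = -hbar/2) = {p_minus:.2f}")
```

The second amplitude is imaginary, yet its probability is an ordinary real number, because only the absolute value squared enters.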
🔵 Kai: I get that superposition is necessary, but "adding states" still doesn't quite click for me concretely. With regular vectors, there's a clear physical meaning like "composition of forces"...
🟡 Lina: That feeling is completely valid. It's natural for it to feel abstract. When we do the concrete calculation of \(x\)-direction measurement in 5.5 "Measurements in Different Directions — Basis Transformation and Probability Amplitudes", you'll get a real sense that "without superposition, the experimental results can't be explained." This is the kind of concept that clicks after seeing concrete examples, so let's move forward for now.
✅ Comprehension Check: What does the normalization condition \(|c_+|^2 + |c_-|^2 = 1\) physically mean?
Answer
When \(S_z\) is measured, the probability of getting \(+\hbar/2\) plus the probability of getting \(-\hbar/2\) equals 1 (i.e., 100%), meaning the measurement result must be one or the other.
✅ Comprehension Check: For the state \(|\psi\rangle = \frac{1}{\sqrt{3}}|+\rangle + \sqrt{\frac{2}{3}}|-\rangle\), what is the probability of obtaining \(+\hbar/2\) when \(S_z\) is measured?
Answer
\(|c_+|^2 = |1/\sqrt{3}|^2 = 1/3\).
📝 Exercises:
- Verify that the state \(|\psi\rangle = \frac{1+i}{2}|+\rangle + \frac{\sqrt{2}}{2}|-\rangle\) is normalized, and find the probability of obtaining each measurement value of \(S_z\) → Problem B-1. Verifying the Normalization Condition
5.4 Basis, Orthonormality, and Completeness — The 2-Dimensional State Space
🟡 Lina: To understand equation (5.4) more deeply, let me organize what properties \(|+\rangle\) and \(|-\rangle\) have.
Orthonormality
🟡 Lina: First, I want to consider the "overlap" between the two basic states. To do this, I'll introduce the inner product. The bra corresponding to a ket \(|\alpha\rangle\) is written \(\langle\alpha|\), and the combination of a bra and a ket
\[ \langle\beta|\alpha\rangle \]
is called the inner product. It generally takes a complex number value. Just as the high school vector dot product \(\vec{a} \cdot \vec{b}\) measured the "closeness" of two vectors, \(\langle\beta|\alpha\rangle\) measures the "closeness" of state \(|\alpha\rangle\) and state \(|\beta\rangle\). The specific computation method (how to calculate using column vector components) will be shown in "Key Point 1: States Are Vectors", so for now let's proceed using the property of "orthonormality."
🔵 Kai: Bracket... the first half of "bracket" is "bra" and the second half is "ket"!
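🟡 Lina: Right, it's Dirac's pun. Now, \(|+\rangle\) and \(|-\rangle\) satisfy the following properties:
\[ \langle+|+\rangle = \langle-|-\rangle = 1 \tag{5.6} \]
\[ \langle+|-\rangle = \langle-|+\rangle = 0 \tag{5.7} \]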
🔵 Kai: The inner product with itself is 1, and the inner product with a different state is 0?
🟡 Lina: Yes. Equation (5.6) is normalization — each basic state has "length" 1. Equation (5.7) is orthogonality — the two basic states are "perpendicular." Together this is called orthonormal.
⚪ Mei: It's the same structure as the unit vectors \(\hat{\mathbf{e}}_x\) and \(\hat{\mathbf{e}}_y\) in the \(xy\)-plane from high school math satisfying \(\hat{\mathbf{e}}_x \cdot \hat{\mathbf{e}}_x = 1\) and \(\hat{\mathbf{e}}_x \cdot \hat{\mathbf{e}}_y = 0\).
🟡 Lina: Exactly. You can think of \(|+\rangle\) and \(|-\rangle\) as "the quantum mechanics version of orthogonal coordinate axes." However, the difference is that the space is not real 2-dimensional but complex 2-dimensional — meaning complex numbers can be used as coefficients.
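🟡 Lina: We can summarize equations (5.6) and (5.7) using the Kronecker delta \(\delta_{ij}\), letting the labels \(i\) and \(j\) each stand for \(+\) or \(-\):
\[ \langle i|j\rangle = \delta_{ij} \]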
where \(\delta_{ij}\) is 1 when \(i = j\) and 0 when \(i \neq j\).
Completeness Relation
🟡 Lina: Now, to write the completeness relation, let me explain what the symbol \(|+\rangle\langle+|\) means.
🔵 Kai: \(|+\rangle\langle+|\)? The ket and bra are in the reverse order...
🟡 Lina: Good catch. The inner product \(\langle\beta|\alpha\rangle\) from before was in "bra × ket" order, and the result was a complex number (just a number). This time, \(|+\rangle\langle+|\) is in "ket × bra" order — this is not a number, but something that "acts on" a state vector — an "operation" that takes a state vector as input and returns another state vector. Just as plugging a number into a function gives another number, plugging a vector into this "operation" gives another vector.
🟡 Lina: Let's compute it concretely. For \(|\psi\rangle = c_+|+\rangle + c_-|-\rangle\), first let's calculate \(\langle+|\psi\rangle\). Here I'll use an important property of the inner product: linearity. In high school vectors, you could expand \(\vec{a} \cdot (c_1 \vec{b}_1 + c_2 \vec{b}_2) = c_1 (\vec{a} \cdot \vec{b}_1) + c_2 (\vec{a} \cdot \vec{b}_2)\) like the distributive law, right? The quantum mechanical inner product can be expanded the same way. That is, when the bra \(\langle+|\) acts on the sum of kets \(c_+|+\rangle + c_-|-\rangle\), we can distribute to each term:
\[ \langle+|\psi\rangle = c_+\langle+|+\rangle + c_-\langle+|-\rangle = c_+ \cdot 1 + c_- \cdot 0 = c_+ \]
I used the orthonormality from equations (5.6) and (5.7). With this in mind:
\[ |+\rangle\langle+|\psi\rangle = |+\rangle\,\big(\langle+|\psi\rangle\big) = c_+|+\rangle \]
In other words, it's an operation that extracts only the \(|+\rangle\) component from state \(|\psi\rangle\) — a projection. Something created in the "ket × bra" form like \(|+\rangle\langle+|\) is called an outer product.
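The inner product and the projection can be tried out with explicit components. The text defers the component rule to "Key Point 1: States Are Vectors", so the sketch below assumes the standard rule: conjugate the components of the bra-side vector, multiply pairwise with the ket's components, and add. It also assumes the representation \(|+\rangle = (1, 0)\), \(|-\rangle = (0, 1)\). The helper names `inner` and `project` and the example amplitudes are illustrative.

```python
def inner(bra_of, ket):
    """Inner product <bra_of|ket>: conjugate the first vector's components."""
    return sum(b.conjugate() * k for b, k in zip(bra_of, ket))

def project(onto, ket):
    """Apply the outer product |onto><onto| to ket, returning a new vector."""
    amplitude = inner(onto, ket)  # the number <onto|ket>
    return [amplitude * component for component in onto]

plus = [1 + 0j, 0j]   # |+> as a column vector (assumed representation)
minus = [0j, 1 + 0j]  # |-> as a column vector

# Orthonormality, equations (5.6) and (5.7)
print(inner(plus, plus), inner(minus, minus), inner(plus, minus))

# Build |psi> = c_plus|+> + c_minus|-> and project it onto |+>
c_plus, c_minus = 0.6, 0.8j
psi = [c_plus * p + c_minus * m for p, m in zip(plus, minus)]
projected = project(plus, psi)
print(projected)  # only the |+> component survives
```

The bra-ket combination returns a single number, while the ket-bra combination applied to a state returns another vector, which is the distinction the dialogue draws.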
🔵 Kai: Is "outer product" different from the "cross product" we learned in high school — the \(\vec{a} \times \vec{b}\) that produces a perpendicular vector?
🟡 Lina: Completely different. The name is confusingly the same, but the "outer product" here means "the operation of placing a ket and a bra side by side to create an operator." It has nothing to do with the cross product, so don't mix them up.
🔵 Kai: I see, so it's an "operation" where you put in a state vector and get out another state vector.
🟡 Lina: Right. In general, an "operation that acts on a state vector and returns another state vector" is called an operator.
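🟡 Lina: Now, using the concept of projection, let me derive a very useful relation. For any state \(|\psi\rangle = c_+|+\rangle + c_-|-\rangle\), what happens if we add together both the projection onto the \(|+\rangle\) component and the projection onto the \(|-\rangle\) component?
\[ \big(|+\rangle\langle+| + |-\rangle\langle-|\big)|\psi\rangle = |+\rangle\langle+|\psi\rangle + |-\rangle\langle-|\psi\rangle = c_+|+\rangle + c_-|-\rangle \]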
🔵 Kai: \(c_+|+\rangle + c_-|-\rangle\)... we get back the original \(|\psi\rangle\)!
🟡 Lina: Exactly! So \(|+\rangle\langle+|\) is "projection onto the \(|+\rangle\) direction" and \(|-\rangle\langle-|\) is "projection onto the \(|-\rangle\) direction." Adding these two gives an operation that returns any state unchanged, the identity operator \(\mathbf{1}\):
\[ |+\rangle\langle+| + |-\rangle\langle-| = \mathbf{1} \tag{5.9} \]
This is called the completeness relation.
⚪ Mei: I see. \(|+\rangle\langle+|\) is "projection onto the \(|+\rangle\) direction," \(|-\rangle\langle-|\) is "projection onto the \(|-\rangle\) direction." Adding both gives the whole — the identity operator.
🟡 Lina: Right. This is the mathematical expression of the fact that "\(|+\rangle\) and \(|-\rangle\) together exhaust the entire state space." In a spin-1/2 system, the state space is 2-dimensional. No additional basic states are needed.
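The completeness relation can be checked with \(2 \times 2\) matrices. The sketch below assumes the column-vector representation \(|+\rangle = (1, 0)\), \(|-\rangle = (0, 1)\) and builds each outer product as a matrix whose entries are (ket component) times (conjugated bra component). The helper names `outer`, `add`, and `act`, and the example state, are illustrative.

```python
def outer(ket, bra_of):
    """Outer product |ket><bra_of| as a 2x2 matrix (nested lists)."""
    return [[k * b.conjugate() for b in bra_of] for k in ket]

def add(a, b):
    """Entry-wise sum of two matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def act(matrix, ket):
    """Matrix acting on a column vector."""
    return [sum(m * k for m, k in zip(row, ket)) for row in matrix]

plus = [1 + 0j, 0j]
minus = [0j, 1 + 0j]

p_plus = outer(plus, plus)     # projector onto |+>
p_minus = outer(minus, minus)  # projector onto |->
total = add(p_plus, p_minus)
print(total)                   # compare with the 2x2 identity matrix

psi = [0.6 + 0j, 0.8j]         # illustrative normalized state
print(act(total, psi) == psi)  # the sum of projectors leaves psi unchanged
```

Each projector alone discards one component. Only their sum returns every state unchanged, which is what "exhausting the state space" means in matrix terms.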
🔵 Kai: 2-dimensional — is that like a plane?
🟡 Lina: Since it's "complex 2-dimensional," counting in real numbers there are 4 degrees of freedom, because \(c_+\) and \(c_-\) each have a real part and an imaginary part. One is removed by the normalization condition. Another is removed because multiplying the whole state by the same phase \(e^{i\theta}\) produces a physically indistinguishable state. (Here \(\theta\) is a real angle, and \(e^{i\theta} = \cos\theta + i\sin\theta\) is the complex number of absolute value 1 that appeared in Ch. 4.) That leaves 2 physically independent parameters, so a state can be specified by a single point on a sphere; this will appear as the "Bloch sphere" in later chapters. Don't worry about the detailed counting for now. What matters is that "\(|+\rangle\) and \(|-\rangle\) are all you need to write everything."
🔵 Kai: I understand "degrees of freedom go from 4 → 3 → 2," but what specifically does "one is removed by the normalization condition" mean?
🟡 Lina: Since we have the condition \(|c_+|^2 + |c_-|^2 = 1\), one of the 4 real parameters of \(c_+\) and \(c_-\) is determined by the other 3 — only 3 are independent. For example, once \(|c_+|\) is determined, \(|c_-|\) is automatically determined as \(\sqrt{1 - |c_+|^2}\), right?
🔵 Kai: What does "multiplying by an overall phase makes no difference" mean?
🟡 Lina: Probabilities are calculated as \(|c_+|^2\) or \(|c_-|^2\), right? If you multiply both \(c_+\) and \(c_-\) by the same \(e^{i\theta}\), then \(|e^{i\theta} c_+|^2 = |e^{i\theta}|^2 |c_+|^2 = 1 \cdot |c_+|^2 = |c_+|^2\), so the probability doesn't change (\(|e^{i\theta}| = 1\) was confirmed in Ch. 4). That means the overall phase has absolutely no effect on measurement results. So we regard them as the same physical state. We'll revisit this in detail in later chapters.
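The invariance under an overall phase is easy to confirm numerically. The sketch below multiplies both amplitudes by the same \(e^{i\theta}\) for several angles and recomputes the probabilities. The example amplitudes and the list of angles are arbitrary illustrative choices.

```python
import cmath
import math

def probabilities(c_plus, c_minus):
    """Return (|c_plus|^2, |c_minus|^2)."""
    return abs(c_plus) ** 2, abs(c_minus) ** 2

c_plus, c_minus = 0.6, 0.8j            # illustrative normalized state
results = []
for theta in (0.0, math.pi / 3, math.pi, 2.5):
    phase = cmath.exp(1j * theta)      # e^{i theta}, absolute value 1
    shifted = probabilities(phase * c_plus, phase * c_minus)
    results.append(shifted)
    print(f"theta = {theta:.3f}: probabilities = ({shifted[0]:.3f}, {shifted[1]:.3f})")
```

Every row shows the same pair of probabilities, up to floating-point rounding, whatever angle is used.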
Physical Meaning of the Inner Product
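🟡 Lina: Using the completeness relation, the meaning of the coefficients \(c_+\), \(c_-\) in equation (5.4) becomes clear. Acting with equation (5.9) on \(|\psi\rangle\) from the left:
\[ |\psi\rangle = \mathbf{1}|\psi\rangle = |+\rangle\langle+|\psi\rangle + |-\rangle\langle-|\psi\rangle \]
Comparing with equation (5.4), we get:
\[ c_+ = \langle+|\psi\rangle, \qquad c_- = \langle-|\psi\rangle \]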
⚪ Mei: So the coefficient \(c_+\) is simply "the inner product of state \(|\psi\rangle\) with basic state \(|+\rangle\)." This matches the probability amplitude \(\langle+|\psi\rangle\) we learned in Ch. 4.
🟡 Lina: Right. The inner product \(\langle+|\psi\rangle\) is "the probability amplitude for finding \(|+\rangle\) when measuring a particle in state \(|\psi\rangle\)." Its absolute value squared \(|\langle+|\psi\rangle|^2\) gives the probability. The rules from Ch. 4 have now taken concrete form.
✅ Comprehension Check: What do you get when you apply the outer product \(|+\rangle\langle+|\) to a state \(|\psi\rangle\)? What is this operation called?
Answer
You get \(|+\rangle\langle+|\psi\rangle = c_+|+\rangle\), extracting only the \(|+\rangle\) component from state \(|\psi\rangle\). This operation is called projection.
✅ Comprehension Check: State the physical meaning of the completeness relation \(|+\rangle\langle+| + |-\rangle\langle-| = \mathbf{1}\) in one sentence.
Answer
Any state of a spin-1/2 system can be completely described as a superposition of \(|+\rangle\) and \(|-\rangle\), and these two basic states exhaust the state space.
📝 Exercises:
- Using the completeness relation, show that \(|\langle+|\psi\rangle|^2 + |\langle-|\psi\rangle|^2 = 1\) for any normalized state \(|\psi\rangle\) → Problem M-1. Deriving the Normalization Condition from the Completeness Relation
5.5 Measurements in Different Directions — Basis Transformation and Probability Amplitudes
🟡 Lina: So far we've only considered the \(z\)-direction Stern-Gerlach apparatus. But the apparatus can be rotated to orient the magnetic field in the \(x\) or \(y\) direction. What happens then — this is where we touch the core of quantum mechanics.
Eigenstates of the \(x\) Direction
🟡 Lina: When we pass a beam through a Stern-Gerlach apparatus with the magnetic field oriented in the \(x\) direction, the beam again splits into two. A state with \(S_x = +\hbar/2\) and a state with \(S_x = -\hbar/2\). We write these as \(|+\rangle_x\) and \(|-\rangle_x\).
🔵 Kai: Are those different from \(|+\rangle\)?
🟡 Lina: Different. \(|+\rangle\) is the state "up when measured in the \(z\) direction." \(|+\rangle_x\) is the state "up when measured in the \(x\) direction." They are different states corresponding to measurements in different directions.
🟡 Lina: Here's the important question. How do we write \(|+\rangle_x\) in terms of the \(z\) basis \(|+\rangle\), \(|-\rangle\)? Hint: The \(x\) and \(z\) directions are spatially equivalent — neither is "special."
🔵 Kai: What does "equivalent" specifically mean?
🟡 Lina: Physical laws don't depend on the orientation of space: rotating the entire apparatus by 90° should give the same experimental results. So "measuring a particle with spin up in the \(x\) direction along the \(z\) direction" and "measuring a particle with spin up in the \(z\) direction along the \(x\) direction" are essentially the same situation. Both are "measuring a particle whose spin is definite along one direction along a second direction perpendicular to it." By symmetry, there's no reason for it to go up rather than down when measured in the perpendicular direction, so...
🔵 Kai: Up and down have the same probability — 50% each!
🟡 Lina: Exactly! Since the probability is \(1/2\), we have \(|c_+|^2 = |c_-|^2 = 1/2\), so the absolute value of each coefficient is \(1/\sqrt{2}\). The coefficients could have a phase (angle) freedom like \(e^{i\theta}/\sqrt{2}\), but as explained in 5.4 "Basis, Orthonormality, and Completeness — The 2-Dimensional State Space", "multiplying by the same overall phase makes no physical difference," so we're allowed to choose the first component as a positive real number. The phase of the second coefficient relative to the first is not fixed by this symmetry argument; taking it to be positive and real as well is the standard convention. So here we'll choose the simplest option with both as positive real numbers:
\[ |+\rangle_x = \frac{1}{\sqrt{2}}|+\rangle + \frac{1}{\sqrt{2}}|-\rangle \]
Now what about \(|-\rangle_x\)? Since \(|+\rangle_x\) and \(|-\rangle_x\) correspond to different measurement results, they must be orthogonal to each other.
🔵 Kai: The orthogonality condition — meaning the inner product of \(|+\rangle_x\) and \(|-\rangle_x\) is 0 — determines the sign. If \(|-\rangle_x = \frac{1}{\sqrt{2}}|+\rangle + \frac{1}{\sqrt{2}}|-\rangle\), it would be the same as \(|+\rangle_x\), so...
🟡 Lina: Right. \(|-\rangle_x\) should also give 50%-50% when measured in the \(z\) direction (the equivalence of \(x\) and \(z\) applies to \(|-\rangle_x\) too), so \(|c_+|^2 = |c_-|^2 = 1/2\) and the absolute values of the coefficients are both \(1/\sqrt{2}\). Writing with real numbers, we can set \(|-\rangle_x = \frac{1}{\sqrt{2}}|+\rangle + \frac{b}{\sqrt{2}}|-\rangle\) (\(b\) is \(\pm 1\)). Try determining \(b\) using the orthogonality condition. The method is the same as when we calculated \(\langle+|\psi\rangle\) earlier — expand using linearity of the inner product, and evaluate each term using the orthonormality from equations (5.6), (5.7).
🔵 Kai: Um, I need to calculate \({}_x\langle+|-\rangle_x\), right? Since \(|+\rangle_x = \frac{1}{\sqrt{2}}|+\rangle + \frac{1}{\sqrt{2}}|-\rangle\)... when I make it a bra, what happens to the coefficients?
🟡 Lina: In general, you take the complex conjugate of the ket's coefficients — this will be explained in detail in "Key Point 1: States Are Vectors". But this time the coefficients are \(1/\sqrt{2}\), which is real, so taking the complex conjugate doesn't change the value. So we can write \({}_{x}\langle+| = \frac{1}{\sqrt{2}}\langle+| + \frac{1}{\sqrt{2}}\langle-|\) directly.
🔵 Kai: Got it. Then applying this to \(|-\rangle_x = \frac{1}{\sqrt{2}}|+\rangle + \frac{b}{\sqrt{2}}|-\rangle\)... \(\frac{1}{\sqrt{2}} \cdot \frac{1}{\sqrt{2}}\langle+|+\rangle + \frac{1}{\sqrt{2}} \cdot \frac{b}{\sqrt{2}}\langle+|-\rangle + \frac{1}{\sqrt{2}} \cdot \frac{1}{\sqrt{2}}\langle-|+\rangle + \frac{1}{\sqrt{2}} \cdot \frac{b}{\sqrt{2}}\langle-|-\rangle\). By orthonormality, \(\langle+|+\rangle = \langle-|-\rangle = 1\), \(\langle+|-\rangle = \langle-|+\rangle = 0\), so \(\frac{1}{2} + \frac{b}{2} = 0\)... \(b = -1\)!
🟡 Lina: Perfect. So:
\[ |-\rangle_x = \frac{1}{\sqrt{2}}|+\rangle - \frac{1}{\sqrt{2}}|-\rangle \]
🔵 Kai: I see, the sign difference is necessarily determined by the orthogonality condition.
🔵 Kai: It looks simple as an equation, but thinking about it again, it's strange. The spin is "definitely up" in the \(x\) direction, yet measuring in the \(z\) direction gives a completely fifty-fifty result.
🟡 Lina: Right. Since the coefficients are \(1/\sqrt{2}\), we get \(|1/\sqrt{2}|^2 = 1/2\). That is, when you measure a particle with spin up in the \(x\) direction along the \(z\) direction, \(+\hbar/2\) and \(-\hbar/2\) come out with equal probability — 50% each.
🔵 Kai: Both coefficients are \(1/\sqrt{2}\) so it's perfectly half and half... So even though the \(x\) direction is determined, nothing is known about the \(z\) direction?
🟡 Lina: Exactly. This is an essential feature of quantum mechanics. \(S_x\) and \(S_z\) cannot simultaneously have definite values. When one is made definite, the other becomes completely undetermined.
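The two \(x\)-direction states can be checked in components. The sketch below assumes the column-vector representation \(|+\rangle = (1, 0)\), \(|-\rangle = (0, 1)\) and the usual component rule for the inner product (conjugate the bra side). It verifies that \(|+\rangle_x\) and \(|-\rangle_x\) are orthonormal and computes the \(z\)-measurement probabilities for each. The variable names are illustrative.

```python
import math

def inner(bra_of, ket):
    """Inner product <bra_of|ket>: conjugate the first vector's components."""
    return sum(b.conjugate() * k for b, k in zip(bra_of, ket))

r = 1 / math.sqrt(2)
plus_z, minus_z = [1 + 0j, 0j], [0j, 1 + 0j]
plus_x = [complex(r), complex(r)]     # (|+> + |->)/sqrt(2)
minus_x = [complex(r), complex(-r)]   # (|+> - |->)/sqrt(2)

print("x<+|->x =", inner(plus_x, minus_x))
print("x<+|+>x =", inner(plus_x, plus_x))

z_probabilities = {}
for name, state in (("|+>_x", plus_x), ("|->_x", minus_x)):
    p_up = abs(inner(plus_z, state)) ** 2
    p_down = abs(inner(minus_z, state)) ** 2
    z_probabilities[name] = (p_up, p_down)
    print(f"{name}: P(up) = {p_up:.3f}, P(down) = {p_down:.3f}")
```

The minus sign in \(|-\rangle_x\) is invisible in the \(z\)-measurement probabilities; it shows up only in the inner product with \(|+\rangle_x\).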
✅ Comprehension Check: For a particle in state \(|+\rangle_x\) (spin up in the \(x\) direction), what happens when \(S_z\) is measured? Explain the reason from the perspective of superposition.
Answer
Since \(|+\rangle_x = \frac{1}{\sqrt{2}}|+\rangle + \frac{1}{\sqrt{2}}|-\rangle\), \(S_z = +\hbar/2\) and \(S_z = -\hbar/2\) are each obtained with probability \(|1/\sqrt{2}|^2 = 1/2\). When spin in the \(x\) direction is determined, the information in the \(z\) direction becomes completely undetermined.
Eigenstates of the \(y\) Direction
🟡 Lina: What about the \(y\) direction? Let's think with the same logic as the \(x\) direction. Since \(y\) and \(z\) are also spatially equivalent, measuring \(|+\rangle_y\) in the \(z\) direction gives 50%-50% — meaning the absolute values of the coefficients are both \(1/\sqrt{2}\). And \(|+\rangle_y\) and \(|-\rangle_y\) must be orthogonal.
🔵 Kai: Same conditions as the \(x\) direction. But for the \(x\) direction we solved it with \(+1/\sqrt{2}\) and \(-1/\sqrt{2}\) — what's different about the \(y\) direction?
🟡 Lina: Good question. Actually, \(|+\rangle_y\) must be different from both \(|+\rangle_x\) and \(|-\rangle_x\). Why? Because the \(y\) and \(x\) directions are also perpendicular, just as \(z\) and \(x\) are, so by exactly the same reasoning we used for "\(z\) and \(x\) perpendicular → 50%-50%," a particle with spin determined in the \(y\) direction should also give 50%-50% when measured in the \(x\) direction. If \(|+\rangle_y\) were the same state as \(|+\rangle_x\), then measuring in the \(x\) direction would give \(+\hbar/2\) with 100% probability, not 50%-50%. Similarly, if \(|+\rangle_y\) were the same as \(|-\rangle_x\), measuring in the \(x\) direction would give \(-\hbar/2\) with 100%, again not 50%-50%. With that requirement in mind, let's see whether we can find a suitable orthonormal pair using only real coefficients with absolute value \(1/\sqrt{2}\). Each coefficient must be one of \(\pm 1/\sqrt{2}\), so there are four candidates: \((+1/\sqrt{2}, +1/\sqrt{2})\), \((+1/\sqrt{2}, -1/\sqrt{2})\), \((-1/\sqrt{2}, +1/\sqrt{2})\), and \((-1/\sqrt{2}, -1/\sqrt{2})\).
🔵 Kai: With four candidates, it seems like we could make a different combination from the \(x\) direction...
🟡 Lina: But it doesn't work out that way. Remember — in 5.4 "Basis, Orthonormality, and Completeness — The 2-Dimensional State Space" I said "multiplying by the same overall phase \(e^{i\theta}\) doesn't change probabilities, so it's physically the same state." Using Euler's formula with \(\theta = \pi\), we get \(e^{i\pi} = \cos\pi + i\sin\pi = -1\), so multiplying by \(-1\) is the same as "multiplying by \(e^{i\theta}\) with phase \(\pi\)" — meaning \(-1\) is also a kind of phase. So multiplying the entire state by \(-1\) produces a physically indistinguishable state.
🔵 Kai: Wait a moment. I don't intuitively get why \(-1\) is a "phase"... \(-1\) is just a regular real number, right?
🟡 Lina: Good question. The key point is that "all complex numbers with absolute value 1 are called phase factors." Complex numbers of the form \(e^{i\theta}\) always satisfy \(|e^{i\theta}| = 1\) — we confirmed this in Ch. 4. Conversely, any complex number with absolute value 1 can be written in the form \(e^{i\theta}\). Since \(|-1| = 1\), we can write \(-1 = e^{i\pi}\) — it's a perfectly valid phase factor. Probabilities are calculated as \(|c|^2\), so \(|(-1) \times c|^2 = |-1|^2 |c|^2 = 1 \times |c|^2 = |c|^2\) — completely unchanged.
🔵 Kai: Ah, I see. So \((-1/\sqrt{2}, -1/\sqrt{2})\) is just \((+1/\sqrt{2}, +1/\sqrt{2})\) multiplied overall by \(-1\), meaning they're probabilistically the exact same state?
🟡 Lina: Exactly. Check it — \(|-1/\sqrt{2}|^2 = |1/\sqrt{2}|^2 = 1/2\), so whether you measure in the \(z\) direction or the \(x\) direction, the probabilities are exactly the same. No experiment can in principle distinguish the two states — so we regard them as physically "the same state." Similarly, \((-1/\sqrt{2}, +1/\sqrt{2})\) is \((+1/\sqrt{2}, -1/\sqrt{2})\) multiplied overall by \(-1\) — check that \((-1) \times (1/\sqrt{2}, -1/\sqrt{2}) = (-1/\sqrt{2}, +1/\sqrt{2})\), right? The probabilities are \(|-1/\sqrt{2}|^2 = 1/2\), \(|1/\sqrt{2}|^2 = 1/2\) — exactly the same. For the same reason as before, it's physically the same state.
So using the overall phase freedom, we can choose the first component to be positive — for example, multiply \((-1/\sqrt{2}, -1/\sqrt{2})\) by \(-1\) to get \((+1/\sqrt{2}, +1/\sqrt{2})\), and multiply \((-1/\sqrt{2}, +1/\sqrt{2})\) by \(-1\) to get \((+1/\sqrt{2}, -1/\sqrt{2})\). Then the only physically distinguishable states are \((+1/\sqrt{2}, +1/\sqrt{2})\) and \((+1/\sqrt{2}, -1/\sqrt{2})\) — these are precisely \(|+\rangle_x\) and \(|-\rangle_x\). So within the realm of real numbers, only the \(x\)-direction basis can be constructed.
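The counting argument above can be checked with a minimal Python sketch. The tuple representation `(c_plus, c_minus)` for the \(z\)-basis coefficients and the helper name `fix_overall_sign` are illustrative assumptions, not notation from this chapter.

```python
import math

s = 1 / math.sqrt(2)

# The four real candidates (c_plus, c_minus), each coefficient being +-1/sqrt(2).
candidates = [(+s, +s), (+s, -s), (-s, +s), (-s, -s)]

def fix_overall_sign(state):
    """Multiply the whole state by -1 if needed so the first component is positive.

    An overall factor of -1 is a phase factor, so this does not change the
    physical state; it only picks one representative.
    """
    c_plus, c_minus = state
    if c_plus < 0:
        return (-c_plus, -c_minus)
    return (c_plus, c_minus)

# Candidates that differ only by an overall sign collapse to the same tuple.
distinct = sorted({fix_overall_sign(c) for c in candidates}, reverse=True)

for state in distinct:
    print(state)
```

Only two representatives survive, and they have the same coefficients as \(|+\rangle_x\) and \(|-\rangle_x\), which is the point of the argument: real coefficients leave no room for a third direction.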
⚪ Mei: Real numbers alone can't represent a new direction — so complex numbers become essentially necessary.
🟡 Lina: Exactly. So for the \(y\) direction, complex numbers play an essential role:
\[|+\rangle_y = \frac{1}{\sqrt{2}}|+\rangle + \frac{i}{\sqrt{2}}|-\rangle, \qquad |-\rangle_y = \frac{1}{\sqrt{2}}|+\rangle - \frac{i}{\sqrt{2}}|-\rangle \tag{5.13}\]
🔵 Kai: \(i\) appeared! That's the imaginary unit. I see — \(|i/\sqrt{2}|^2 = 1/2\) so the probability condition is satisfied, and since the phase is different from the \(x\)-direction states, they're distinguishable.
🟡 Lina: Right. In the \(x\) direction the coefficients were real, but in the \(y\) direction the imaginary number \(i\) appears. This isn't a coincidence. To represent three independent directions of 3-dimensional space in a 2-dimensional complex space, real numbers alone aren't enough — complex numbers are required. This is a concrete manifestation of the fact that the quantum mechanical world is "made of complex probability amplitudes."
⚪ Mei: So for both the \(x\) and \(y\) directions, measuring in \(z\) gives 50%-50% — probabilities alone can't distinguish \(|+\rangle_x\) from \(|+\rangle_y\). To distinguish them, you need to look at the phase of the amplitudes, not just the probabilities?
🟡 Lina: Exactly right. Looking at probabilities alone, they're the same. But the phases of the amplitudes differ — \(1/\sqrt{2}\) and \(i/\sqrt{2}\) have the same absolute value but their phases are shifted by \(90°\). This phase difference shows up in interference experiments.
🔵 Kai: I see... they can't be distinguished by probability alone, but they can be distinguished by amplitude — that is, by information including the phase.
🟡 Lina: Right. That's why quantum mechanics takes "probability amplitudes" rather than "probabilities" as fundamental.
✅ Comprehension Check: Does the appearance of the imaginary number \(i\) in the coefficients of the \(y\)-direction eigenstates affect the probability calculation? Also, where does the difference from the \(x\)-direction eigenstates manifest?
Answer
It does not affect the probability calculation (\(|i/\sqrt{2}|^2 = 1/2\), same as the \(x\) direction). However, since the phase of the amplitude differs by \(90°\), the difference between \(x\)-direction and \(y\)-direction eigenstates appears in situations involving phase, such as interference experiments.
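As a numerical cross-check of the claims in this subsection, here is a small sketch in plain Python. It assumes the standard choice \(|+\rangle_y = \frac{1}{\sqrt{2}}|+\rangle + \frac{i}{\sqrt{2}}|-\rangle\), and the helper names `inner` and `prob` are hypothetical conveniences for this example.

```python
import cmath
import math

s = 1 / math.sqrt(2)

def inner(phi, psi):
    """<phi|psi>: conjugate the left components, multiply by the right ones, add."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def prob(phi, psi):
    """Probability of finding a particle prepared in psi as phi."""
    return abs(inner(phi, psi)) ** 2

# Components in the z basis, written as (c_plus, c_minus).
up_z = (complex(1, 0), complex(0, 0))
up_x = (complex(s, 0), complex(s, 0))
up_y = (complex(s, 0), complex(0, s))   # assumed form of |+>_y

p_z = prob(up_z, up_y)   # z measurement on |+>_y
p_x = prob(up_x, up_y)   # x measurement on |+>_y

# Relative phase between the two coefficients of |+>_y, in degrees.
relative_phase_deg = math.degrees(cmath.phase(up_y[1] / up_y[0]))

print(p_z, p_x, relative_phase_deg)
```

Both probabilities come out at one half (up to floating-point rounding), while the relative phase is a quarter turn. The probabilities in \(z\) alone cannot tell \(|+\rangle_y\) from \(|+\rangle_x\); the phase can.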
Matrix Representation of Basis Transformation¶
🟡 Lina: Let's organize equations (5.11)–(5.12) in matrix form. This matrix plays the role of "converting a state written in the \(x\) basis to the \(z\) basis." Each component is the inner product of "the \(z\)-basis bra corresponding to the row" and "the \(x\)-basis ket corresponding to the column." Let me do it concretely. The component in the 1st row, 1st column is \(\langle+|+\rangle_x\) — the inner product of the \(z\)-basis bra \(\langle+|\) and the \(x\)-basis ket \(|+\rangle_x\). From equation (5.11), \(|+\rangle_x = \frac{1}{\sqrt{2}}|+\rangle + \frac{1}{\sqrt{2}}|-\rangle\), so \(\langle+|+\rangle_x = \frac{1}{\sqrt{2}}\langle+|+\rangle + \frac{1}{\sqrt{2}}\langle+|-\rangle = \frac{1}{\sqrt{2}} \cdot 1 + \frac{1}{\sqrt{2}} \cdot 0 = \frac{1}{\sqrt{2}}\). Similarly, the 2nd row, 1st column is \(\langle-|+\rangle_x = \frac{1}{\sqrt{2}}\). Arranging all components this way:
\[\begin{pmatrix} \langle+|+\rangle_x & \langle+|-\rangle_x \\ \langle-|+\rangle_x & \langle-|-\rangle_x \end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\]
🔵 Kai: So each component is an amplitude. What specifically does the \(1/\sqrt{2}\) in the \((1,1)\) position mean?
🟡 Lina: \(\langle+|+\rangle_x = 1/\sqrt{2}\) is "the amplitude for obtaining spin up in the \(z\) direction when measuring a particle with spin up in the \(x\) direction." The entire matrix serves as a basis transformation matrix — a "translation dictionary" from one basis to another.
🔵 Kai: So this is a matrix version of the transformation when you rewrite vector components in a different coordinate system?
🟡 Lina: Right. The difference from an ordinary real coordinate transformation is that the components can in general be complex numbers (they happen to be real for the \(x\) basis, but not for the \(y\) basis). A matrix that transforms between two orthonormal bases is a unitary matrix, a property that guarantees the total probability stays at 1. We'll cover the details in Ch. 11, but for now remember that "basis transformations can be written as matrices."
🔵 Kai: What's "unitary"? How is it different from a regular rotation matrix?
🟡 Lina: A rotation matrix is "a matrix with real components that preserves length," right? A unitary matrix is the complex number version — "a matrix with complex components that preserves vector length (= total probability)." Just remembering the name is sufficient. We'll go through the details carefully in Ch. 11.
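The "length-preserving" property can be tried out ahead of Ch. 11 with a short sketch. It builds the transformation matrix entry by entry from inner products and checks that its conjugate transpose times itself is the identity. The names `U`, `dagger`, and `matmul` are assumptions made for this illustration.

```python
import math

s = 1 / math.sqrt(2)

def inner(phi, psi):
    """<phi|psi> with complex conjugation on the left vector."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

# z-basis and x-basis kets as (c_plus, c_minus) in the z basis.
z_basis = [(complex(1), complex(0)), (complex(0), complex(1))]
x_basis = [(complex(s), complex(s)), (complex(s), complex(-s))]

# Row = z-basis bra, column = x-basis ket.
U = [[inner(z, x) for x in x_basis] for z in z_basis]

def dagger(m):
    """Conjugate transpose of a 2x2 matrix."""
    return [[m[j][i].conjugate() for j in range(2)] for i in range(2)]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# For a unitary matrix this product is the identity matrix.
product = matmul(dagger(U), U)
for row in product:
    print(row)
```

Swapping `x_basis` for the \(y\)-direction kets gives a matrix with genuinely complex entries, and the same check should still pass.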
✅ Comprehension Check: When a particle in state \(|+\rangle\) (spin up in \(z\) direction) is passed through an \(x\)-direction Stern-Gerlach apparatus, what is the probability of obtaining \(S_x = +\hbar/2\)?
Answer
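We want \(|{}_x\langle+|+\rangle|^2\), the squared magnitude of the amplitude for finding \(|+\rangle\) as \(|+\rangle_x\). (It equals \(|\langle+|+\rangle_x|^2\), since swapping the order of an inner product only takes the complex conjugate.) Inverting equations (5.11), (5.12) to expand \(|+\rangle\) in the \(x\) basis gives \(|+\rangle = \frac{1}{\sqrt{2}}|+\rangle_x + \frac{1}{\sqrt{2}}|-\rangle_x\). Therefore \({}_x\langle+|+\rangle = 1/\sqrt{2}\), and the probability is \(1/2\).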
📝 Exercises:
- Verify that equation (5.13) is normalized (\({}_y\langle+|+\rangle_y = 1\)) and that \(|+\rangle_y\) and \(|-\rangle_y\) are orthogonal (\({}_y\langle+|-\rangle_y = 0\)) → Problem B-3. Action of Outer Products (Projection Operators)
5.6 Sequential Stern-Gerlach Experiments — The Heart of Quantum Mechanics¶
🟡 Lina: Now, using the tools we've built up so far, let's look at the most striking feature of quantum mechanics. Sequential Stern-Gerlach experiments.
Experiment 1: Measuring the Same Direction Twice¶
🟡 Lina: First, the simplest case. Select \(|+\rangle\) with a \(z\)-direction apparatus, then immediately pass it through another \(z\)-direction apparatus.
🔵 Kai: Obviously, everything goes up, right? Since it was already confirmed as "up."
🟡 Lina: Yes. The probability amplitude is \(\langle+|+\rangle = 1\). With 100% probability, \(+\hbar/2\) is obtained again. This is consistent with classical intuition.
Experiment 2: Three-Stage Measurement \(z\) → \(x\) → \(z\)¶
🟡 Lina: Next is the core. We line up three apparatuses:
- First apparatus (\(z\) direction): Pass only \(|+\rangle\)
- Second apparatus (\(x\) direction): Pass only \(|+\rangle_x\)
- Third apparatus (\(z\) direction): Observe whether \(|+\rangle\) passes
🔵 Kai: Let me see, first confirm spin up in the \(z\) direction, then confirm spin up in the \(x\) direction, then measure again in the \(z\) direction... Since the first and last are the same measurement, shouldn't everything pass through?
🟡 Lina: Classically, you'd expect that. But the quantum mechanical answer is — only half pass through.
🔵 Kai: What!? But we confirmed "\(z\) up" at the beginning!?
🟡 Lina: Let's calculate. The key point is that the second apparatus blocks \(|-\rangle_x\) — meaning it performs a measurement that establishes "passed through as \(|+\rangle_x\)."
Remember the rules from Ch. 4. When "the path taken in between is not determined (indistinguishable)," we add amplitudes. But when, as in this case, a measurement is made along the way and the path is determined — that is, \(|-\rangle_x\) is blocked so we know "it passed through as \(|+\rangle_x\)" — what happens? The blocking narrows the path to a single one, so we can directly apply the third rule from Ch. 4 (the amplitude for a sequential process is the product). The overall amplitude is the product of amplitudes at each stage:
\[\langle+|+\rangle_x \, {}_x\langle+|+\rangle = \frac{1}{\sqrt{2}} \cdot \frac{1}{\sqrt{2}} = \frac{1}{2}\]
The probability is the absolute value squared \(|1/2|^2 = 1/4\). Note that this also matches "multiplying the passage probability at each stage: \(1/2 \times 1/2 = 1/4\)." This is actually not a coincidence — when there's only a single path, the property of absolute values \(|A \times B|^2 = |A|^2 \times |B|^2\) means "squaring after multiplying amplitudes" and "multiplying probabilities" always give the same result. But when there are two or more paths and amplitudes are added, it's a different story — since \(|A + B|^2 \neq |A|^2 + |B|^2\), interference terms appear and change the result. Comparing with Experiment 3, where "nothing is blocked" — meaning two paths coexist and interfere — makes this difference even clearer.
🔵 Kai: Why does measuring along the way change things to "multiplying probabilities"? The third rule was "multiply amplitudes," right?
🟡 Lina: Good question. The key is that "blocking reduces the path to one." Remember the double slit — if you close one slit, the interference pattern disappears, right? Interference requires "adding the amplitudes of two paths," but if one is blocked, there's no partner to add. So no interference occurs.
🔵 Kai: I see. But how does "no interference" connect to "multiplying probabilities"?
🟡 Lina: Think of it this way. At the point of passing through the second apparatus, the particle's state is determined as \(|+\rangle_x\) — the "memory" that it was originally \(|+\rangle\) is erased. So for the particle entering the third apparatus, the first apparatus is irrelevant — it's as if a new experiment has begun. Since each stage becomes an independent event, multiplying probabilities is the correct rule.
🔵 Kai: "Memory is erased"... so after passing through the second apparatus, the information "it was originally \(z\) up" is completely lost, and it starts fresh from the state \(|+\rangle_x\).
🟡 Lina: Right. You can see it clearly from the equation — \(|+\rangle_x = \frac{1}{\sqrt{2}}|+\rangle + \frac{1}{\sqrt{2}}|-\rangle\), so the \(z\)-direction information is "reset to half up and half down." There's no trace anywhere in this equation that it was originally \(|+\rangle\). That's the mathematical meaning of "memory is erased." Let me compute it concretely:
- Stage 1: The amplitude for state \(|+\rangle\) to pass through the second apparatus as \(|+\rangle_x\) is \({}_x\langle+|+\rangle = 1/\sqrt{2}\), probability \(1/2\)
- Stage 2: The amplitude for state \(|+\rangle_x\) to pass through the third apparatus as \(|+\rangle\) is \(\langle+|+\rangle_x = 1/\sqrt{2}\), probability \(1/2\)
In short: when the state is determined midway, the phase information from before is lost. Without that phase, interference cannot occur, and when interference does not occur, multiplying the probabilities of the independent stages is the correct rule.
⚪ Mei: So when nothing is blocked, the amplitudes "going through \(|+\rangle_x\)" and "going through \(|-\rangle_x\)" coexist and interference occurs. But blocking one eliminates the interference partner. Comparing with Experiment 3 where "nothing is blocked" should make this difference clear.
Now let me calculate concretely. The state after passing the first apparatus is \(|+\rangle\).
Step 1: Probability of passing through the second apparatus (\(x\) direction) as \(|+\rangle_x\). The amplitude is:
\[{}_x\langle+|+\rangle = \frac{1}{\sqrt{2}}\]
(Verification: From equation (5.11), \(|+\rangle_x = \frac{1}{\sqrt{2}}|+\rangle + \frac{1}{\sqrt{2}}|-\rangle\), so the corresponding bra is \({}_x\langle+| = \frac{1}{\sqrt{2}}\langle+| + \frac{1}{\sqrt{2}}\langle-|\) (coefficients are real so complex conjugation doesn't change them). Acting on \(|+\rangle\): \({}_x\langle+|+\rangle = \frac{1}{\sqrt{2}}\langle+|+\rangle + \frac{1}{\sqrt{2}}\langle-|+\rangle = \frac{1}{\sqrt{2}} \cdot 1 + \frac{1}{\sqrt{2}} \cdot 0 = 1/\sqrt{2}\). Note that in general \(\langle\alpha|\beta\rangle = \langle\beta|\alpha\rangle^*\) (swapping the order of the inner product gives the complex conjugate), but since all coefficients here are real, \(\langle+|+\rangle_x = {}_x\langle+|+\rangle = 1/\sqrt{2}\) regardless of order.)
So the passage probability is \(|1/\sqrt{2}|^2 = 1/2\). The state of particles that passed is determined as \(|+\rangle_x\).
Step 2: Probability of passing through the third apparatus (\(z\) direction) as \(|+\rangle\). The amplitude for finding \(|+\rangle\) from state \(|+\rangle_x\) is:
\[\langle+|+\rangle_x = \frac{1}{\sqrt{2}}\]
So the passage probability is \(|1/\sqrt{2}|^2 = 1/2\).
Overall passage probability: Since the state resets to \(|+\rangle_x\) at the point of passing the second apparatus, Steps 1 and 2 are independent probabilistic events. The probability of independent events is their product:
\[\frac{1}{2} \times \frac{1}{2} = \frac{1}{4}\]
That is, only one quarter of particles confirmed as \(z\) up at the beginning emerge as \(z\) up at the end.
🔵 Kai: Where did the remaining \(3/4\) go?
🟡 Lina: Half of the total is blocked by the second apparatus as \(|-\rangle_x\) (\(1/2\) is lost), and of the remaining \(1/2\), another half is deflected downward by the third apparatus as \(|-\rangle\) (another \(1/4\) of the total is lost). In total, \(1/2 + 1/4 = 3/4\) is removed along the way.
⚪ Mei: So inserting the \(x\)-direction measurement in the middle destroyed the \(z\)-direction information that was initially determined.
🟡 Lina: Exactly. This is the essence of measurement in quantum mechanics. Borrowing Feynman's words:
Once measured in a different direction, the particle does not "remember" its previous state.
🔵 Kai: But why? Isn't measurement just "looking"?
🟡 Lina: In quantum mechanics, measurement changes the state of the system. Passing only \(|+\rangle_x\) through the second apparatus means the state has been re-prepared as \(|+\rangle_x\). And since \(|+\rangle_x\) is an equal superposition of \(|+\rangle\) and \(|-\rangle\), the \(z\)-direction information is completely reset.
🔵 Kai: ...So the act of "selecting in the \(x\) direction" itself resets the \(z\)-direction state back to half-and-half between \(|+\rangle\) and \(|-\rangle\). That's why only half pass the final \(z\) measurement. "Looking" rewrites the state. ...But what if you could "look gently" — that is, obtain information without disturbing the state?
🟡 Lina: That's a sharp question. But in quantum mechanics, to learn the direction of spin, you must interact with an apparatus, and that interaction changes the state. The limits of "measuring without disturbing" will be quantitatively discussed as the uncertainty relation in Ch. 8.
🔵 Kai: Understood. For now I'll accept that "measurement changes the state" and move forward.
✅ Comprehension Check: In the sequential Stern-Gerlach experiment, why does the intermediate \(x\)-direction measurement "destroy" the \(z\)-direction information?
Answer
When \(|+\rangle_x\) is selected in the \(x\) direction, the state is re-prepared as \(|+\rangle_x\). Since \(|+\rangle_x = \frac{1}{\sqrt{2}}|+\rangle + \frac{1}{\sqrt{2}}|-\rangle\), the \(z\)-direction spin becomes an equal-probability superposition of \(+\hbar/2\) and \(-\hbar/2\), completely resetting the \(z\)-direction information that was initially determined.
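The bookkeeping for Experiment 2 (one quarter passes, three quarters are removed) can be reproduced with a few lines of Python. This is only a sketch: the variable names are invented for the example, and states are written as `(c_plus, c_minus)` in the \(z\) basis.

```python
import math

s = 1 / math.sqrt(2)

def inner(phi, psi):
    """<phi|psi> with complex conjugation on the left vector."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

def prob(phi, psi):
    """Probability of finding a particle prepared in psi as phi."""
    return abs(inner(phi, psi)) ** 2

up_z, down_z = (complex(1), complex(0)), (complex(0), complex(1))
up_x, down_x = (complex(s), complex(s)), (complex(s), complex(-s))

# Single open path: |+> -> |+>_x -> |+>. Multiply the stage amplitudes.
total_amplitude = inner(up_z, up_x) * inner(up_x, up_z)
p_pass = abs(total_amplitude) ** 2

# Where the rest of the beam goes.
p_blocked_at_2 = prob(down_x, up_z)                      # stopped as |->_x
p_down_at_3 = prob(up_x, up_z) * prob(down_z, up_x)      # leaves as |-> at stage 3

print(p_pass, p_blocked_at_2, p_down_at_3)
```

The three numbers add up to 1, so every particle that entered the second apparatus is accounted for.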
Experiment 3: Passing All Beams in the \(x\) Direction¶
🟡 Lina: For comparison, let's consider the case where nothing is blocked at the second apparatus. We pass through the \(x\)-direction apparatus, but let both \(|+\rangle_x\) and \(|-\rangle_x\) through.
🔵 Kai: If nothing is blocked, isn't it the same as having no apparatus at all?
🟡 Lina: Exactly! Intuitively, yes. But what's important is confirming with the "add amplitudes" calculation that this really works — we want to see where in the equations the difference from Experiment 2 appears. In terms of the amplitude rules from Ch. 4, expanding \(|+\rangle\) in the \(x\) basis:
\[|+\rangle = \frac{1}{\sqrt{2}}|+\rangle_x + \frac{1}{\sqrt{2}}|-\rangle_x \tag{5.19}\]
(Adding equations (5.11) and (5.12): \(|+\rangle_x + |-\rangle_x = \left(\frac{1}{\sqrt{2}}|+\rangle + \frac{1}{\sqrt{2}}|-\rangle\right) + \left(\frac{1}{\sqrt{2}}|+\rangle - \frac{1}{\sqrt{2}}|-\rangle\right) = \frac{2}{\sqrt{2}}|+\rangle = \sqrt{2}|+\rangle\). Dividing both sides by \(\sqrt{2}\) gives \(|+\rangle = \frac{1}{\sqrt{2}}(|+\rangle_x + |-\rangle_x)\).)
And in fact, the completeness relation holds not just for the \(z\) basis but for any orthonormal basis. The reason is the same — \(|+\rangle_x\) and \(|-\rangle_x\) are also orthonormal and these two span the entire 2-dimensional space, so projecting any state onto these two and adding recovers the original. Here \({}_x\langle+|\) is the bra corresponding to \(|+\rangle_x\), and \({}_x\langle-|\) is the bra corresponding to \(|-\rangle_x\) (the same relationship as \(\langle+|\) being the bra of \(|+\rangle\) in the \(z\) basis). The completeness relation in the \(x\) basis is:
\[|+\rangle_x \, {}_x\langle+| + |-\rangle_x \, {}_x\langle-| = \mathbf{1} \tag{5.20}\]
This expresses that "\(|+\rangle_x\) and \(|-\rangle_x\) also exhaust the state space." Passing all beams through is equivalent to applying the identity operator \(\mathbf{1}\) — that is, doing nothing. In this case, 100% pass through the third apparatus.
⚪ Mei: I see. Blocking changes the state, but passing everything through doesn't — equation (5.20) guarantees this mathematically.
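The completeness relation in the \(x\) basis can also be verified numerically. The sketch below represents each outer product \(|a\rangle\langle a|\) as a 2×2 matrix in the \(z\) basis; the helper names `outer` and `add` are assumptions for this example.

```python
import math

s = 1 / math.sqrt(2)

def outer(ket):
    """Matrix of |ket><ket|: entry (i, j) is ket[i] times the conjugate of ket[j]."""
    return [[ket[i] * ket[j].conjugate() for j in range(2)] for i in range(2)]

def add(a, b):
    """Entrywise sum of two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

up_x = (complex(s), complex(s))
down_x = (complex(s), complex(-s))

proj_plus = outer(up_x)     # projector onto |+>_x
proj_minus = outer(down_x)  # projector onto |->_x

# Letting both beams through corresponds to the sum of the two projectors.
total = add(proj_plus, proj_minus)
for row in total:
    print(row)
```

Each projector alone has off-diagonal entries, which is how blocking one beam changes the state. In the sum those entries cancel and only the identity remains.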
🔵 Kai: "Passing everything" is the same as "doing nothing"... but what happens when you verify this with amplitude calculations? In Experiment 2 we got \(1/4\), so does it really go back to \(1\)?
🟡 Lina: Good question. Let's do it explicitly. We calculate the total amplitude for obtaining \(|+\rangle\) at the third apparatus after passing both through the second apparatus. Remember the second rule from Ch. 4 — "add amplitudes for indistinguishable paths." This time, the particle has two paths through the second apparatus — through \(|+\rangle_x\) and through \(|-\rangle_x\) — but since neither is blocked, they're indistinguishable. So we add the amplitudes of the two paths.
Mathematically, "blocking nothing" is the same as inserting the identity operator \(\mathbf{1}\). Substituting the completeness relation from equation (5.20) there automatically produces the form of "adding two path amplitudes":
\[\langle+|+\rangle = \langle+|\mathbf{1}|+\rangle = \langle+|+\rangle_x \, {}_x\langle+|+\rangle + \langle+|-\rangle_x \, {}_x\langle-|+\rangle\]
Let me read off each factor from equations (5.11), (5.12). \(\langle+|+\rangle_x\) is "the amplitude for obtaining \(|+\rangle\) when measuring a particle in state \(|+\rangle_x\) in the \(z\) direction," which is \(1/\sqrt{2}\). \({}_x\langle+|+\rangle\) is "the amplitude for obtaining \(|+\rangle_x\) when measuring a particle in state \(|+\rangle\) in the \(x\) direction," also \(1/\sqrt{2}\).
🔵 Kai: The remaining two are the same way?
🟡 Lina: Right. Let me calculate \(\langle+|-\rangle_x\). From equation (5.12), \(|-\rangle_x = \frac{1}{\sqrt{2}}|+\rangle - \frac{1}{\sqrt{2}}|-\rangle\), so:
\[\langle+|-\rangle_x = \frac{1}{\sqrt{2}}\langle+|+\rangle - \frac{1}{\sqrt{2}}\langle+|-\rangle = \frac{1}{\sqrt{2}} \cdot 1 - \frac{1}{\sqrt{2}} \cdot 0 = \frac{1}{\sqrt{2}}\]
Next, \({}_x\langle-|+\rangle\) — "the amplitude for \(|+\rangle\) to be found as \(|-\rangle_x\)." In general, \(\langle\alpha|\beta\rangle = \langle\beta|\alpha\rangle^*\) (swapping the order of the inner product gives the complex conjugate), so once we know \(\langle+|-\rangle_x\), we get \({}_x\langle-|+\rangle = \langle+|-\rangle_x^*\). Since \(\langle+|-\rangle_x = 1/\sqrt{2}\) is real, \({}_x\langle-|+\rangle = (1/\sqrt{2})^* = 1/\sqrt{2}\) immediately. Let me also verify by direct calculation. From equation (5.12), \(|-\rangle_x = \frac{1}{\sqrt{2}}|+\rangle - \frac{1}{\sqrt{2}}|-\rangle\), so the corresponding bra is \({}_x\langle-| = \frac{1}{\sqrt{2}}\langle+| - \frac{1}{\sqrt{2}}\langle-|\) (coefficients are real so no complex conjugation needed). Acting on \(|+\rangle\):
\[{}_x\langle-|+\rangle = \frac{1}{\sqrt{2}}\langle+|+\rangle - \frac{1}{\sqrt{2}}\langle-|+\rangle = \frac{1}{\sqrt{2}} \cdot 1 - \frac{1}{\sqrt{2}} \cdot 0 = \frac{1}{\sqrt{2}}\]
Indeed this matches \(\langle+|-\rangle_x^* = 1/\sqrt{2}\). (Alternative method: expand \(|+\rangle = \frac{1}{\sqrt{2}}|+\rangle_x + \frac{1}{\sqrt{2}}|-\rangle_x\) from equation (5.19) and use orthonormality of the \(x\) basis to get the same result.)
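🔵 Kai: They're all \(1/\sqrt{2}\). So substituting...
\[\langle+|+\rangle_x \, {}_x\langle+|+\rangle + \langle+|-\rangle_x \, {}_x\langle-|+\rangle = \frac{1}{\sqrt{2}} \cdot \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{2}} \cdot \frac{1}{\sqrt{2}} = \frac{1}{2} + \frac{1}{2} = 1\]
Therefore the probability of obtaining \(|+\rangle\) at the third apparatus is:
\[|1|^2 = 1\]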
🔵 Kai: Oh, it really comes back to 1! Adding the amplitudes of the two paths gives 1. But wait — in Experiment 2, we multiplied probabilities to get \(1/2 \times 1/2 = 1/4\). Here we're multiplying amplitudes and then adding. The difference is "whether or not a measurement was made in between"?
🟡 Lina: Exactly! In Experiment 2, the second apparatus blocked \(|-\rangle_x\), meaning "which one it went through" was determined. So we took the probability at each stage and multiplied. But this time both are passed through, so "which one it went through is unknown," and we add amplitudes. If one is blocked, as we saw in Experiment 2, the interference disappears and the probability drops to \(1/4\). The comparison between these two experiments will be summarized in a figure shortly.
🔵 Kai: This has the same structure as the double-slit experiment! Closing one slit makes the interference pattern disappear. But in the double slit, it was "two spatially separated paths," while here "internal states of spin" that you can't see are interfering — interference isn't limited to spatial paths?
🟡 Lina: Exactly right. This is the universality of Feynman's laws from Ch. 4: "add amplitudes for indistinguishable paths → interference" versus "determine which path was taken → work with probabilities → no interference." What determines whether interference occurs is not whether the paths are spatially separated, but whether they are distinguishable. It boils down to this single point.
⚪ Mei: So the sequential Stern-Gerlach experiment is "the double-slit experiment in the spin world." And since the completeness relation \(|+\rangle_x{}_{x}\langle+| + |-\rangle_x{}_{x}\langle-| = \mathbf{1}\) mathematically guarantees that "passing everything = doing nothing," the reason interference fully recovers is also clear.
🔵 Kai: Is there a figure where I can compare the apparatus setups of Experiments 1, 2, and 3 at a glance?
🟡 Lina: Look at Fig. 5.2. (a) is the basic Stern-Gerlach experiment from the first section: a beam from the furnace (with random spin orientations, unfiltered) splits into two. (b) is Experiment 1: selecting \(|+\rangle\) with SG\(_z\) and passing through SG\(_z\) again gives 100% passage. (c) is Experiment 2: inserting SG\(_x\) in between and selecting only \(|+\rangle_x\) causes the final SG\(_z\) to revert to 50:50. You can see at a glance that the \(x\)-direction measurement erases the \(z\)-direction information. The comparison with Experiment 3 (no blocking) will be shown right after in a separate figure.
Fig. 5.2: Comparison of sequential Stern-Gerlach experiments. Three experimental configurations compared. (a) An unfiltered beam from the furnace (random spin orientations) splits into two spots when passed through SG\(_z\). (b) Extracting \(|+\rangle\) with SG\(_z\) and passing through SG\(_z\) again gives 100% passage (Experiment 1). (c) Inserting SG\(_x\) in between and selecting only \(|+\rangle_x\) causes the final SG\(_z\) to give 50:50 again: the \(x\)-direction measurement has erased the \(z\)-direction information (Experiment 2).
🔵 Kai: Is there also a figure comparing Experiment 2 and Experiment 3 at a glance?
🟡 Lina: Look at Fig. 5.3. The left side is Experiment 2: one path is blocked, and the probabilities at each stage are multiplied, giving an overall \(1/4\). The right side is Experiment 3: nothing is blocked, the amplitudes of the two paths are added, and interference restores the passage probability to \(1\).
Fig. 5.3: Comparison of interference and blocking. Left: Experiment 2 (one path blocked) — probabilities at each stage are multiplied, giving an overall passage probability of \(1/4\). Right: Experiment 3 (no blocking) — amplitudes of two paths are added, and interference restores the passage probability to \(1\) (100%).
⚪ Mei: Looking at this figure, the difference between "adding amplitudes" and "multiplying probabilities" is crystal clear on the left and right. The results being \(1/4\) versus \(1\) — that dramatic difference comes solely from whether a measurement was made in between.
🟡 Lina: Right. As this figure shows, "whether or not a measurement was made in between" dramatically changes the result. In the double slit, amplitudes for "the path through slit A" and "the path through slit B" interfered. Here, amplitudes for "the path through \(|+\rangle_x\)" and "the path through \(|-\rangle_x\)" interfere to determine the result. Blocking one kills the interference and changes the result — the structure is exactly the same. And the completeness relation in equation (5.20) mathematically guarantees that "passing everything = doing nothing," making the reason for the complete revival of interference clear.
⚪ Mei: So "slit A / B" in the double slit just corresponds to "\(|+\rangle_x\) / \(|-\rangle_x\)" here — the structure is the same.
✅ Comprehension Check: When nothing is blocked at the second apparatus (\(x\) direction) and all beams are passed through, from the perspective of the completeness relation, what is this operation equivalent to?
Answer
Since \(|+\rangle_x{}_{x}\langle+| + |-\rangle_x{}_{x}\langle-| = \mathbf{1}\) (identity operator), passing all beams through is equivalent to "doing nothing." Therefore the state doesn't change, and \(|+\rangle\) is obtained with 100% probability at the end.
✅ Comprehension Check: In the \(z\) → \(x\) → \(z\) sequential experiment, what is the probability of obtaining \(|+\rangle\) at the end when (a) only \(|+\rangle_x\) is passed through the second apparatus (\(x\) direction), and (b) nothing is blocked?
Answer
When only \(|+\rangle_x\) is passed: passage probability at second apparatus \(1/2\), passage probability at third apparatus \(1/2\), overall \(1/4\). When nothing is blocked: amplitudes interfere to sum to 1, probability \(1\) (100% passage).
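The contrast between (a) and (b) can be condensed into one function that sums "multiply then add" amplitudes over whichever intermediate paths are left open. This is a sketch under the chapter's conventions; `final_probability` and the path labels `"+x"` and `"-x"` are hypothetical names introduced here.

```python
import math

s = 1 / math.sqrt(2)

def inner(phi, psi):
    """<phi|psi> with complex conjugation on the left vector."""
    return sum(a.conjugate() * b for a, b in zip(phi, psi))

up_z = (complex(1), complex(0))
paths = {
    "+x": (complex(s), complex(s)),    # |+>_x
    "-x": (complex(s), complex(-s)),   # |->_x
}

def final_probability(open_paths):
    """Probability of |+> at the third apparatus, starting from |+>.

    For each open intermediate state, multiply the two stage amplitudes;
    then add the results over all open paths and take the absolute square.
    """
    amplitude = sum(
        inner(up_z, paths[name]) * inner(paths[name], up_z)
        for name in open_paths
    )
    return abs(amplitude) ** 2

p_one_path = final_probability(["+x"])           # case (a): |->_x blocked
p_both_paths = final_probability(["+x", "-x"])   # case (b): nothing blocked

print(p_one_path, p_both_paths)
```

With one path the amplitude is \(1/2\) and the probability \(1/4\); with both paths the two contributions of \(1/2\) add before squaring, which restores the probability to \(1\).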
📝 Exercises:
- In a \(z\) → \(y\) → \(z\) sequential experiment, find the probability of obtaining \(|+\rangle\) at the end when only \(|+\rangle_y\) is passed through the second apparatus (\(y\) direction) → Problem B-6. Finding Probabilities from Probability Amplitudes
5.7 Structure of State Space — Seeds of Hilbert Space¶
🟡 Lina: Let's look back at our discussion so far and organize what kind of mathematical structure quantum mechanics has.
🔵 Kai: Please do. So many things have come up that I'm a bit confused.
🟡 Lina: Don't worry. There are 4 key points.
Key Point 1: States Are Vectors¶
🟡 Lina: The state of a spin-1/2 particle is specified by a pair of two complex numbers \((c_+, c_-)\). This is an element of a 2-dimensional complex vector space — that is, a vector.
\[|\psi\rangle = c_+|+\rangle + c_-|-\rangle \doteq \begin{pmatrix} c_+ \\ c_- \end{pmatrix}\]
The right side is the column vector representation. It corresponds to \(|+\rangle\) as \(\begin{pmatrix} 1 \\ 0 \end{pmatrix}\) and \(|-\rangle\) as \(\begin{pmatrix} 0 \\ 1 \end{pmatrix}\).
🔵 Kai: Why do you use \(\doteq\) instead of \(=\)?
🟡 Lina: Good question. \(|\psi\rangle\) is an abstract state vector — the "entity itself" that doesn't depend on the choice of basis. On the other hand, the column vector \(\begin{pmatrix} c_+ \\ c_- \end{pmatrix}\) is the "component representation" when the \(z\) basis is chosen — if you chose the \(x\) basis, the numerical values of the components would change. So I use \(\doteq\) with the meaning of "representation after choosing a basis." It's similar to how the same terrain looks different when you change the projection method of a map. I'll continue using this symbol from here on.
🔵 Kai: How does the bra \(\langle\psi|\) correspond to the column vector?
🟡 Lina: The bra is a row vector, obtained by taking the complex conjugate of the components. Complex conjugation, as appeared in Ch. 4, means taking a complex number \(z = a + bi\) and flipping the sign of its imaginary part to get \(z^* = a - bi\):
\[\langle\psi| = c_+^*\langle+| + c_-^*\langle-| \doteq (c_+^*,\; c_-^*)\]
🟡 Lina: The reason we take the complex conjugate is that without it, "the inner product with itself" wouldn't be a positive real number. If you try calculating \(\langle\psi|\psi\rangle\), you get \(c_+^* c_+ + c_-^* c_- = |c_+|^2 + |c_-|^2\), where \(c^* c = |c|^2\) is the relation we confirmed in the Prologue. This is the sum of probabilities, so it must be a positive real number. If you didn't take the complex conjugate and computed \(c_+ \cdot c_+ + c_- \cdot c_-\) instead, the result could be negative, imaginary, or even zero when the coefficients are complex, right?
🔵 Kai: I see — the complex conjugate is needed so that probabilities are positive real numbers.
🟡 Lina: Right. In general, the inner product is \(\langle\phi|\psi\rangle = \phi_1^* \psi_1 + \phi_2^* \psi_2\) — take the complex conjugate of the left components and multiply by the right components, then add. In matrix language, it's the product of the row vector \((\phi_1^*,\; \phi_2^*)\) and the column vector \(\begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}\).
🔵 Kai: Let me check. For example, if \(|\psi\rangle \doteq \begin{pmatrix} 1/\sqrt{2} \\ i/\sqrt{2} \end{pmatrix}\), then \(\langle\psi| \doteq (1/\sqrt{2},\; -i/\sqrt{2})\)? Because the complex conjugate of \(i\) is \(-i\).
🟡 Lina: Perfect. And \(\langle\psi|\psi\rangle = (1/\sqrt{2})(1/\sqrt{2}) + (-i/\sqrt{2})(i/\sqrt{2}) = 1/2 + 1/2 = 1\). To check the second term: \((-i) \times i = -(i \times i) = -i^2 = -(-1) = 1\), so \((-i/\sqrt{2})(i/\sqrt{2}) = 1 \times (1/2) = 1/2\).
⚪ Mei: It's satisfying that it comes out neatly to 1. I really feel the reason for taking the complex conjugate now.
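Kai's example can be run directly to see what goes wrong without the complex conjugate. The sketch below uses Python's built-in `complex` type; the variable names are illustrative only.

```python
import math

s = 1 / math.sqrt(2)

# |psi> in the z basis: (1/sqrt(2), i/sqrt(2))
psi = (complex(s, 0), complex(0, s))

# Bra components: complex conjugate of each ket component.
bra = tuple(c.conjugate() for c in psi)

# Proper inner product <psi|psi>: row vector times column vector.
with_conjugate = sum(b * c for b, c in zip(bra, psi))

# What you would get by forgetting the conjugate.
without_conjugate = sum(c * c for c in psi)

print(with_conjugate, without_conjugate)
```

With conjugation the result is 1, as a total probability should be. Without it the two terms cancel to zero for this state, so the expression cannot serve as a length.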
Key Point 2: The Inner Product Gives Probability Amplitudes¶
🟡 Lina: The inner product \(\langle\phi|\psi\rangle\) of two states \(|\psi\rangle\) and \(|\phi\rangle\) is the probability amplitude for finding a particle in \(|\psi\rangle\) as \(|\phi\rangle\). The probability is \(|\langle\phi|\psi\rangle|^2\).
Key Point 3: The Choice of Basis Is Not Unique¶
🟡 Lina: \(\{|+\rangle, |-\rangle\}\) is the \(z\)-direction basis. \(\{|+\rangle_x, |-\rangle_x\}\) is the \(x\)-direction basis. Both are orthonormal and complete — meaning both are "valid bases." Physics doesn't depend on the choice of basis.
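The claim that both bases are orthonormal and complete can be checked directly. The sketch below assumes the standard components \(|\pm\rangle_x \doteq \frac{1}{\sqrt{2}}(1, \pm 1)\) in the \(z\) basis; `is_orthonormal` and `completeness_sum` are hypothetical helper names.

```python
import math

def inner(phi, psi):
    """Inner product <phi|psi>."""
    return sum(p.conjugate() * q for p, q in zip(phi, psi))

def is_orthonormal(basis, tol=1e-12):
    """True if <b_i|b_j> is 1 for i == j and 0 otherwise."""
    for i, a in enumerate(basis):
        for j, b in enumerate(basis):
            expected = 1.0 if i == j else 0.0
            if abs(inner(a, b) - expected) > tol:
                return False
    return True

def completeness_sum(basis):
    """Sum of the projectors |b><b| over the basis, as a 2x2 nested list."""
    total = [[0j, 0j], [0j, 0j]]
    for ket in basis:
        for i in range(2):
            for j in range(2):
                total[i][j] += ket[i] * ket[j].conjugate()
    return total

s = 1 / math.sqrt(2)
z_basis = [[1 + 0j, 0j], [0j, 1 + 0j]]
x_basis = [[s + 0j, s + 0j], [s + 0j, -s + 0j]]   # assumed standard components

for name, basis in [("z", z_basis), ("x", x_basis)]:
    print(name, "orthonormal:", is_orthonormal(basis))
    print(name, "sum of projectors:", completeness_sum(basis))
```

For both bases the sum of projectors comes out as the identity matrix (up to rounding), which is the completeness relation in matrix form.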
Key Point 4: Measurement Corresponds to Choosing a Basis¶
🟡 Lina: Measuring \(S_z\) means projecting onto the \(z\) basis. Measuring \(S_x\) means projecting onto the \(x\) basis. Measurement "collapses" the state to one of the basis vectors.
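Key Point 4 can be sketched as a tiny calculation. The function `measure` below is a hypothetical helper that lists, for each basis vector, the probability of that outcome together with the state after collapse; it enumerates outcomes rather than drawing random ones. The \(x\)-basis components are the standard ones, taken here as an assumption.

```python
import math

def inner(phi, psi):
    """Inner product <phi|psi>."""
    return sum(p.conjugate() * q for p, q in zip(phi, psi))

def measure(state, basis):
    """Return [(probability, collapsed_state), ...], one entry per basis vector."""
    return [(abs(inner(b, state)) ** 2, b) for b in basis]

s = 1 / math.sqrt(2)
z_basis = [[1 + 0j, 0j], [0j, 1 + 0j]]
x_basis = [[s + 0j, s + 0j], [s + 0j, -s + 0j]]   # assumed standard components

state = z_basis[0]                          # start with spin definitely up along z
after_z = measure(state, z_basis)           # outcome is certain
after_x = measure(state, x_basis)           # two equally likely outcomes
collapsed = after_x[0][1]                   # suppose the x measurement gave "+"
back_to_z = measure(collapsed, z_basis)     # the z result is no longer certain

print("z then z:", [p for p, _ in after_z])
print("z then x:", [p for p, _ in after_x])
print("z, x, then z:", [p for p, _ in back_to_z])
```

The last line shows the effect discussed in this chapter: after the state collapses onto an \(x\)-basis vector, a second \(S_z\) measurement gives either result with equal probability.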
⚪ Mei: So in summary — states are vectors, inner products give probability amplitudes, and measurement corresponds to projection onto a basis. All four key points are connected.
🟡 Lina: Exactly. And this entire structure — "a complex vector space with an inner product defined" — is called a Hilbert space in mathematics. The spin-1/2 system we've seen is a 2-dimensional Hilbert space. When we deal with wave functions from Ch. 7 onward, infinite-dimensional Hilbert spaces will appear, but the structure is essentially the same.
🔵 Kai: 2-dimensional becomes infinite-dimensional...?
🟡 Lina: Yes. But don't worry. The rules we learned in 2 dimensions today — "orthonormal basis," "completeness relation," "inner product = amplitude," "projection = measurement" — can be used directly even as the dimension increases. That's precisely why it's important to drill the structure into ourselves using the simplest 2-dimensional system.
✅ Comprehension Check: Briefly state the 4 key points of the mathematical structure of quantum mechanics (states, inner product, basis, measurement).
Answer
(1) States are represented by vectors (elements of a complex vector space). (2) The inner product \(\langle\phi|\psi\rangle\) gives the probability amplitude. (3) The choice of basis is not unique — there are different orthonormal bases for each measurement direction. (4) Measurement corresponds to the operation of projecting the state onto the chosen basis.
Summary of Matrix Representation¶
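🟡 Lina: Finally, let me summarize the column vector representations in the \(z\) basis:
\[
|+\rangle \doteq \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad
|-\rangle \doteq \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad
|\pm\rangle_x \doteq \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ \pm 1 \end{pmatrix}, \quad
|\pm\rangle_y \doteq \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ \pm i \end{pmatrix}
\]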
🔵 Kai: It's interesting that only the \(y\) direction has the imaginary number \(i\).
🟡 Lina: Right. Three independent measurement directions have to fit into a 2-dimensional state space, and real coefficients alone can't do that; we need imaginary coefficients as well. You can see a glimpse here of why complex numbers are indispensable in quantum mechanics. Take a look at Fig. 5.4 "Basis vectors of spin 1/2", where I've drawn how each basis vector "points" when the \(z\) basis is taken as the coordinate axes. The \(x\) basis points in a direction rotated by 45° within the real plane, but the \(y\) basis has imaginary components, so it can't be fully represented in the real plane alone. The figure shows it for convenience, but the true "direction" can only be understood when the complex plane is included.
Fig. 5.4: Basis vectors of spin 1/2. With the \(z\) basis \(\{|+\rangle, |-\rangle\}\) as coordinate axes, the \(x\) basis points in a direction rotated by 45°. The \(y\) basis has imaginary components and cannot be fully represented in the real plane alone.
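One way to see why the \(y\) basis needs \(i\) is the following sketch. It assumes, as an input from the sequential Stern-Gerlach experiments rather than something derived here, that a state with definite spin along \(y\) gives 50/50 results for both \(S_z\) and \(S_x\) measurements. It also uses the standard components \(|+\rangle_x \doteq \frac{1}{\sqrt{2}}(1, 1)\); the phase \(\delta\) is a symbol introduced only for this illustration.

\[
\begin{aligned}
|+\rangle_y &\doteq \frac{1}{\sqrt{2}} \begin{pmatrix} 1 \\ e^{i\delta} \end{pmatrix}
&& \text{(50/50 for } S_z \text{; overall phase chosen so the first component is real)} \\
{}_x\langle +|+\rangle_y &= \frac{1}{\sqrt{2}} \cdot \frac{1}{\sqrt{2}} \left( 1 + e^{i\delta} \right) = \frac{1 + e^{i\delta}}{2} \\
\left| {}_x\langle +|+\rangle_y \right|^2 &= \frac{(1 + e^{-i\delta})(1 + e^{i\delta})}{4} = \frac{1 + \cos\delta}{2} \\
\frac{1 + \cos\delta}{2} = \frac{1}{2} \;&\Longrightarrow\; \cos\delta = 0 \;\Longrightarrow\; e^{i\delta} = \pm i
\end{aligned}
\]

With real coefficients only, \(e^{i\delta}\) could be just \(+1\) or \(-1\), which reproduces the two \(x\)-basis states and leaves no room for a third direction. Under the assumptions above, the only remaining options are \(\pm i\), matching the components of \(|\pm\rangle_y\).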
✅ Comprehension Check: What does it mean that the state space of a spin-1/2 system is a "2-dimensional Hilbert space"?
Answer
It means that any spin state can be expressed as a linear combination with complex number coefficients of two orthonormal basis vectors (for example \(|+\rangle\) and \(|-\rangle\)), and it is a complex vector space with an inner product defined.
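The statement "any spin state is a linear combination of two orthonormal basis vectors" can be tried out on an example. The sketch below picks an arbitrary normalized state (the numbers 0.6 and 0.8i are chosen only for illustration), computes its coefficients in the \(x\) basis with the inner product, and rebuilds the original vector from them. The \(x\)-basis components are the standard ones, taken as an assumption.

```python
import math

def inner(phi, psi):
    """Inner product <phi|psi>."""
    return sum(p.conjugate() * q for p, q in zip(phi, psi))

s = 1 / math.sqrt(2)
x_basis = [[s + 0j, s + 0j], [s + 0j, -s + 0j]]   # assumed standard components

psi = [0.6 + 0j, 0.8j]                            # illustrative state, |0.6|^2 + |0.8|^2 = 1

# Expansion coefficients in the x basis: c_k = <x_k|psi>
coeffs = [inner(b, psi) for b in x_basis]

# Rebuild the z-basis components from the expansion: sum over k of c_k |x_k>
rebuilt = [sum(c * b[k] for c, b in zip(coeffs, x_basis)) for k in range(2)]

total_probability = sum(abs(c) ** 2 for c in coeffs)

print("x-basis coefficients:", coeffs)
print("rebuilt z-basis components:", rebuilt)
print("sum of |c_k|^2:", total_probability)
```

The same state has different coefficients in different bases, but the probabilities still add up to 1 and the state itself is recovered unchanged.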
📝 Exercises:
- Verify that \(|+\rangle_x\) and \(|-\rangle_x\) from equation (5.24) are orthonormal using the column vector inner product (\(\langle a|b\rangle = a_1^* b_1 + a_2^* b_2\)) → Problem B-2. Calculating Inner Products
Summary and Outlook¶
🟡 Lina: Let's review today's content. The main concepts are summarized in Table 5.2 "Summary of main concepts in Ch. 5".
Table 5.2: Summary of main concepts in Ch. 5
| Concept | Content |
|---|---|
| Stern-Gerlach experiment | Silver atom beam splits into two → discreteness of spin |
| Spin | Intrinsic angular momentum of particles. Not classical "rotation" |
| State vector \(\vert\psi\rangle\) | Element of a complex vector space representing a quantum state |
| Basis \(\vert+\rangle, \vert-\rangle\) | Set of orthonormal basic states. Different for each measurement direction |
| Inner product \(\langle\phi\vert\psi\rangle\) | Probability amplitude. \(\vert\langle\phi\vert\psi\rangle\vert^2\) is the probability |
| Completeness relation | \(\vert+\rangle\langle+\vert + \vert-\rangle\langle-\vert = \mathbf{1}\) |
| Measurement | Operation of projecting the state onto a basis. Destroys information in other directions |
🔵 Kai: What surprised me most is that after confirming the \(z\) direction, measuring the \(x\) direction erases the \(z\) direction information. But conversely, is there no way to measure \(z\) and \(x\) simultaneously? For example, if you tilt the apparatus to 45°, could you get "intermediate" information about both \(z\) and \(x\)?
🟡 Lina: That's a good idea. But tilting the apparatus to 45° means you're measuring "the spin component in the 45° direction"; you're not measuring \(z\) and \(x\) "simultaneously." The result is still one of two values, \(\pm\hbar/2\), and the spin along that direction becomes determined while the \(z\) and \(x\) components both become undetermined.
⚪ Mei: So no matter which direction you orient the apparatus, you can only measure "the component in that direction," and information in other directions is lost.
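Kai's 45° idea can be explored with a short calculation. This sketch relies on a standard result that is not derived in this chapter and is therefore an assumption here: for a direction in the \(x\)-\(z\) plane tilted by angle \(\theta\) from the \(z\) axis, the "up" state has \(z\)-basis components \((\cos(\theta/2), \sin(\theta/2))\). The function name `up_along` is hypothetical.

```python
import math

def inner(phi, psi):
    """Inner product <phi|psi>."""
    return sum(p.conjugate() * q for p, q in zip(phi, psi))

def up_along(theta):
    """Spin-up state along a direction tilted by theta from z in the x-z plane
    (standard form, assumed here rather than derived)."""
    return [complex(math.cos(theta / 2), 0), complex(math.sin(theta / 2), 0)]

up_z = up_along(0.0)                 # reduces to |+>
up_x = up_along(math.pi / 2)         # reduces to |+>_x = (1, 1)/sqrt(2)
tilted = up_along(math.pi / 4)       # spin-up along the 45-degree direction

# After the 45-degree measurement gives "up", ask about z and x separately.
p_z_up = abs(inner(up_z, tilted)) ** 2
p_x_up = abs(inner(up_x, tilted)) ** 2

print("P(z up | 45-degree up):", p_z_up)
print("P(x up | 45-degree up):", p_x_up)
```

Both probabilities are strictly between 0 and 1, so under this assumption neither the \(z\) nor the \(x\) component is determined once the 45° component is, in line with Mei's summary.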
🔵 Kai: But why can't we "determine them simultaneously"? There must be some deep reason...
🟡 Lina: That question hits the core. "Spin components in different directions cannot be simultaneously determined" — this property mathematically originates from commutation relations. The matrices representing \(S_x\) and \(S_z\) "give different results when you swap the order of multiplication" — this non-commutativity is the root of uncertainty. We'll formulate this quantitatively as the uncertainty relation in Ch. 8.
🔵 Kai: "The result changes when you swap the order of multiplication"... I learned in high school that the order matters in matrix multiplication, but I didn't know that connects to physical uncertainty. So conversely, if there's a pair of physical quantities where "swapping the multiplication order doesn't change the result," could those be determined simultaneously?
🟡 Lina: Exactly right. That's one of the core themes of Ch. 8. Look forward to it.
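The remark that \(S_x\) and \(S_z\) "give different results when you swap the order of multiplication" can be checked with 2×2 matrices. The matrices below are the standard spin-1/2 representation in the \(z\) basis (Pauli matrices times \(\hbar/2\)), written in units where \(\hbar = 1\); they are not given in this chapter, so treat them as an assumption. The helpers `matmul` and `commutator` are hypothetical names.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

# Assumed standard spin-1/2 matrices in the z basis, with hbar = 1
Sx = [[0, 0.5], [0.5, 0]]
Sy = [[0, -0.5j], [0.5j, 0]]
Sz = [[0.5, 0], [0, -0.5]]

comm_xz = commutator(Sx, Sz)          # nonzero: the order of multiplication matters
comm_zz = commutator(Sz, Sz)          # zero: a matrix always commutes with itself
minus_i_Sy = [[-1j * Sy[i][j] for j in range(2)] for i in range(2)]

print("Sx Sz:", matmul(Sx, Sz))
print("Sz Sx:", matmul(Sz, Sx))
print("[Sx, Sz]:", comm_xz)
print("-i Sy:   ", minus_i_Sy)
```

In this representation the difference \(S_x S_z - S_z S_x\) is not zero; it equals \(-i S_y\) (with \(\hbar = 1\)), so the third direction shows up in the commutator of the other two.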
Preview of the Next Chapter¶
🟡 Lina: This time we acquired tools for describing "the state at a given instant." But what physics really wants to know is how states change with time, right?
🔵 Kai: Right. Does that mean the coefficients \(c_+\) and \(c_-\) of the state vector change with time?
🟡 Lina: Exactly. In the next chapter, Ch. 6, we'll deal with the time evolution of two-state systems. Specifically, we'll look at how the nitrogen atom in an ammonia molecule (NH₃) oscillates quantum mechanically between "up" and "down" positions, a phenomenon called quantum oscillation. The device that exploits this oscillation is the maser, the ancestor of the laser.
⚪ Mei: If "rules for time evolution" are added to "description of states" from this chapter, we'll be able to make predictions.
🟡 Lina: Right. The probability amplitude rules from Ch. 4, the state vectors and bases from Ch. 5, and the time evolution from Ch. 6 — only when these three come together does quantum mechanics begin to function as "a model that makes predictions."
References¶
- J. J. Sakurai, J. Napolitano, Modern Quantum Mechanics, 3rd ed., Cambridge University Press, 2021 — Ch.1: A structure that introduces state vectors, operators, and measurement starting from the Stern-Gerlach experiment. The primary reference for this chapter.
- R. P. Feynman, R. B. Leighton, M. Sands, The Feynman Lectures on Physics, Vol. III, Basic Books — Ch.5–6: "Spin One" / "Spin One-Half" — Detailed discussion of the Stern-Gerlach experiment for the spin-1 system, and derivation of rotation matrices for spin 1/2.
- D. J. Griffiths, D. F. Schroeter, Introduction to Quantum Mechanics, 3rd ed., Cambridge University Press, 2018 — Ch.4.4: Introduction of spin and Pauli matrices.