## Finite noncommutative probability, the Born rule, and wave function collapse

The previous post on noncommutative probability was too long to leave much room for examples of random algebras. In this post we will describe all finite-dimensional random algebras with faithful states and all states on them. This will lead, in particular, to a derivation of the Born rule from statistical mechanics. We will then give a mathematical description of wave function collapse as taking a conditional expectation.

### Some preliminary remarks

Previously we described how the tensor product $A \otimes B$ of two random algebras could be endowed with the structure of a random algebra in a natural way. The direct product $A \times B$ can also be endowed with such a structure, but to construct a state on it given states $\mathbb{E}_A, \mathbb{E}_B$ on $A, B$ it is necessary to fix a parameter $p \in [0, 1]$ and then define

$\displaystyle \mathbb{E}_{A \times B}( (a, b) ) = p \mathbb{E}_A(a) + (1 - p) \mathbb{E}_B(b)$.

Conversely, every state on $A \times B$ is of this form for some $p$, which is precisely the expected value of the idempotent $(1, 0)$. The corresponding noncommutative probability space is a disjoint union of the spaces $\text{Spec } A, \text{Spec } B$, with the system finding itself in $\text{Spec } A$ with probability $p$ and in $\text{Spec } B$ with probability $1 - p$.

More generally, the set of states on any complex $^{\dagger}$-algebra is closed under convex linear combinations: that is, if $\mathbb{E}_1, \dots, \mathbb{E}_n$ are states, then so is

$p_1 \mathbb{E}_1 + ... + p_n \mathbb{E}_n$

where the $p_i$ are non-negative reals such that $\sum p_i = 1$. This is what one might refer to as a classical superposition of states. Thus states form a convex set. Suitably defined, we can also take infinite sums or integrals of families of states to get more states, but this won’t be necessary in the finite-dimensional case.
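As a small illustration of convex combinations, here is a minimal Python sketch on the toy algebra $\mathbb{C}^3$ (tuples with pointwise operations). The helper `mix` and the two point-evaluation states are hypothetical names chosen for this example only.

```python
from fractions import Fraction

def mix(states, weights):
    """Return the convex combination sum_i p_i * E_i of the given states.

    Each state is modelled as a function from algebra elements to numbers;
    the weights must be non-negative and sum to 1.
    """
    if any(p < 0 for p in weights) or sum(weights) != 1:
        raise ValueError("weights must be non-negative and sum to 1")
    return lambda a: sum(p * E(a) for p, E in zip(weights, states))

# Toy algebra: C^3, i.e. random variables on a three-point sample space.
# Two states, given by evaluation at the points 0 and 2 respectively.
E1 = lambda a: a[0]
E2 = lambda a: a[2]

# The classical superposition (1/4) E1 + (3/4) E2.
E = mix([E1, E2], [Fraction(1, 4), Fraction(3, 4)])

one = (1, 1, 1)                          # the unit of the algebra
a = (2, 5, -1)                           # a sample random variable
a_dag_a = tuple(abs(x) ** 2 for x in a)  # a^dagger a, computed pointwise

print(E(one))      # normalization: should equal 1
print(E(a_dag_a))  # positivity: should be non-negative
```

The same `mix` with two weights $p$ and $1 - p$ models the state on a direct product $A \times B$ described above, once each factor's state is composed with the corresponding projection.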

### States and density operators

Recall that we showed that a finite-dimensional random algebra $A$ with a faithful state is semisimple. Actually the proof shows slightly more than this: it shows that $A$ is a finite direct product, as a $^{\dagger}$-algebra, of $^{\dagger}$-algebras of the form $V \Rightarrow V$ where $V$ is a finite-dimensional Hilbert space, or more concretely of the form $\mathcal{M}_n(\mathbb{C})$ (equipped with the familiar conjugate transpose). In the commutative case, we get a finite direct product of copies of $\mathbb{C}$, which is familiar as the algebra of random variables on a finite set. In both the commutative and noncommutative cases states have a uniform description as follows.

Any finite-dimensional algebra $A$ over a field $k$ comes equipped with a canonical linear functional $A \to k$, namely the trace

$\displaystyle \text{tr}(a) = \text{tr}(L_a)$

where $L_a : x \mapsto ax$ is the linear operator $A \to A$ described by left multiplication by $a$. This defines a canonical bilinear form $B : A \times A \to k$, the trace form

$\displaystyle B(a, b) = \text{tr}(ab) = \text{tr}(L_a L_b)$.

Since $\text{tr}(ab) = \text{tr}(ba)$, the trace form is symmetric. The radical $\text{rad}(B)$ consists of all $a \in A$ such that $B(a, -)$ is identically zero. Since $B(a, b) = \text{tr}(ab)$, this condition is invariant under right multiplication, so $\text{rad}(B)$ is a right ideal of $A$, and since $\text{tr}(ab) = \text{tr}(ba)$, this condition is invariant under left multiplication, so $\text{rad}(B)$ is a two-sided ideal of $A$. In fact it is a familiar such ideal.

Theorem: $\text{rad}(B) = J(A)$, the Jacobson radical of $A$.

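Proof. Recall that for a left Artinian ring, the Jacobson radical is the largest nilpotent (left or right) ideal, so it suffices to show that $\text{rad}(B)$ is the largest nilpotent right ideal. By passing to the algebraic closure $\bar{k}$ and upper-triangularizing, it follows that if $a$ is nilpotent then $\text{tr}(a) = 0$. Consequently, if $N$ is a nilpotent right ideal then $ab \in N$ is nilpotent for all $a \in N, b \in A$, hence $\text{tr}(ab) = B(a, b) = 0$ for all $b \in A$, so $N \subseteq \text{rad}(B)$. Conversely, assuming that $k$ has characteristic zero, if $a \in \text{rad}(B)$ then $\text{tr}(a^n) = B(a, a^{n-1}) = 0$ for all $n \ge 1$, which forces $L_a$ to be nilpotent, and hence $a$ is nilpotent since $a^n = L_a^n(1)$. So $\text{rad}(B)$ is a nil ideal, and nil ideals of an Artinian ring are nilpotent. $\Box$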

Corollary: $A$ is semisimple if and only if the trace form is nondegenerate.
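To see the corollary in action, here is a minimal Python sketch that computes the Gram matrix of the trace form from structure constants. The encoding `c[i][j][k]` (the coefficient of $e_k$ in $e_i e_j$) and all function names are assumptions made for this example. It compares the dual numbers $\mathbb{C}[x]/(x^2)$, which are not semisimple, with $\mathbb{C} \times \mathbb{C}$, which is.

```python
def left_mult_matrices(c):
    """Matrices of L_{e_i} in the basis e_0, ..., e_{n-1}.

    c[i][j][k] is the coefficient of e_k in the product e_i * e_j,
    so the (k, j) entry of L_{e_i} is c[i][j][k].
    """
    n = len(c)
    return [[[c[i][j][k] for j in range(n)] for k in range(n)] for i in range(n)]

def trace_form_gram(c):
    """Gram matrix B(e_i, e_j) = tr(L_{e_i} L_{e_j}) of the trace form."""
    n = len(c)
    L = left_mult_matrices(c)
    return [[sum(L[i][k][l] * L[j][l][k] for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Dual numbers with basis (1, x): 1*1 = 1, 1*x = x, x*1 = x, x*x = 0.
dual = [
    [[1, 0], [0, 1]],
    [[0, 1], [0, 0]],
]

# C x C with basis (e1, e2): e1*e1 = e1, e2*e2 = e2, e1*e2 = e2*e1 = 0.
cxc = [
    [[1, 0], [0, 0]],
    [[0, 0], [0, 1]],
]

gram_dual = trace_form_gram(dual)
gram_cxc = trace_form_gram(cxc)

print(gram_dual, det2(gram_dual))  # degenerate: x spans the radical
print(gram_cxc, det2(gram_cxc))    # nondegenerate
```

In the dual numbers the radical of the trace form is spanned by $x$, which is also the Jacobson radical, in line with the theorem.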

Specializing to the case that $A$ is a finite-dimensional complex $^{\dagger}$-algebra, we can say more.

Theorem: Let $A$ be a finite-dimensional complex $^{\dagger}$-algebra. The following are equivalent:

1. $A$ has a faithful state.
2. $A$ is a finite direct product of matrix $^{\dagger}$-algebras (those of the form $\mathcal{M}_n(\mathbb{C})$ with involution the conjugate transpose).
3. $A$ has a faithful $^{\dagger}$-representation on a Hilbert space.
4. $A$ is a C*-algebra.
5. $A$ is semisimple.
6. The trace form on $A$ is nondegenerate.
7. The normalized trace $\frac{1}{\dim A} \text{tr}(a)$ is a faithful state.

(In particular, a finite-dimensional C*-algebra has a faithful state and a faithful $^{\dagger}$-representation on a Hilbert space. This is a special case of the noncommutative Gelfand-Naimark theorem.)

Proof. $1 \Rightarrow 2$: proved previously; follows from the GNS construction.

$2 \Rightarrow 3$: any matrix $^{\dagger}$-algebra comes by definition with a faithful $^{\dagger}$-representation, and we can take finite direct sums of these.

$3 \Rightarrow 4$: any closed $^{\dagger}$-subalgebra of a C*-algebra is a C*-algebra.

$4 \Rightarrow 5$: follows from the fact that C*-algebras are semiprimitive.

$5 \Rightarrow 6$: proven above.

$6 \Rightarrow 7$: faithfulness is equivalent to a modified version of the trace form $\langle a, b \rangle = \text{tr}(a^{\dagger} b)$ being an inner product, the Hilbert-Schmidt inner product. The axioms are straightforward to verify with the possible exception of positive-definiteness, which we verify as follows. Since $^{\dagger}$ is invertible, the form $\langle a, b \rangle$ is nondegenerate, and consequently for every nonzero $a \in A$ there exists $b \in A$ such that $\langle a, b \rangle \neq 0$. But by Cauchy-Schwarz, it follows that

$0 < | \langle a, b \rangle |^2 \le \langle a, a \rangle \langle b, b \rangle$

so $\langle a, a \rangle > 0$ as desired.

$7 \Rightarrow 1$: by assumption the normalized trace is such a state. $\Box$
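As a sanity check on condition 7 (a worked example, not part of the proof), take $A = \mathcal{M}_n(\mathbb{C})$. Left multiplication by $a$ acts on each of the $n$ columns of a matrix separately, so $L_a$ is a direct sum of $n$ copies of $a$, and

$\displaystyle \text{tr}(L_a) = n \, \text{tr}_{\mathbb{C}^n}(a), \qquad \frac{1}{\dim A} \text{tr}(L_a) = \frac{1}{n} \text{tr}_{\mathbb{C}^n}(a)$

where $\text{tr}_{\mathbb{C}^n}$ (notation introduced only for this example) denotes the ordinary matrix trace. In particular

$\displaystyle \langle a, a \rangle = \text{tr}(L_{a^{\dagger} a}) = n \sum_{i, j} |a_{ij}|^2$

which is visibly positive for $a \neq 0$.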

For $A$ satisfying any of the equivalent conditions above, by the nondegeneracy of the trace form a state $\mathbb{E} : A \to \mathbb{C}$ can be uniquely expressed in the form

$\mathbb{E}(a) = \text{tr}(\rho a)$

for some $\rho \in A$, the density operator associated to the state. The axioms describing a state translate into properties satisfied by the density operator:

1. $^{\dagger}$-linearity is equivalent to the condition that $\text{tr}(\rho a^{\dagger}) = \text{tr}(\rho a)^{\dagger} = \text{tr}(\rho^{\dagger} a^{\dagger})$, and by the uniqueness of $\rho$ this is equivalent to the condition that $\rho^{\dagger} = \rho$ ($\rho$ is self-adjoint).
2. $\mathbb{E}(1) = 1$ is equivalent to the condition that $\text{tr}(\rho) = 1$.
3. $\mathbb{E}(a^{\dagger} a) \ge 0$ is equivalent to the condition that $\text{tr}(\rho a^{\dagger} a) = \text{tr}(a \rho a^{\dagger}) \ge 0$.

The last condition can be interpreted as follows. $\rho$ acts by left multiplication on $A$, which is a Hilbert space equipped with the inner product $\langle a, b \rangle = \text{tr}(a^{\dagger} b)$, as a self-adjoint operator. Consequently, by the spectral theorem it admits an orthonormal basis of eigenvectors with real eigenvalues, and the last condition is equivalent to the condition that all of the eigenvalues are non-negative, hence that $\rho$ is a positive element of $A$. Furthermore, if $\psi_i$ denotes an orthonormal basis of eigenvectors of $\rho$ with eigenvalues $p_i$, then we can write

$\displaystyle \rho = \sum_i p_i \psi_i \otimes \psi_i^{\ast} \in A \otimes A^{\ast}$.

The condition that $\text{tr}(\rho) = 1$ is equivalent to the condition that $\sum p_i = 1$. The corresponding state is

$\displaystyle \mathbb{E}(a) = \sum_i p_i \text{tr}\left( (\psi_i \otimes \psi_i^{\ast}) a \right) = \sum_i p_i \langle \psi_i, a \psi_i \rangle$

(where the inner product above is the Hilbert-Schmidt inner product). This describes a mixed state, which is a classical superposition of the pure states $\langle \psi_i, a \psi_i \rangle$.
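The identity between the trace formula and the mixture of pure states can be checked numerically. The sketch below is only an illustration under a simplifying assumption: it works in $\mathcal{M}_2(\mathbb{C})$ acting on $\mathbb{C}^2$ with the ordinary matrix trace (which differs from $\text{tr}(L_a)$ by a factor of $2$), rather than on $A$ with the Hilbert-Schmidt inner product. All helper names are hypothetical.

```python
import math

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(x):
    return sum(x[i][i] for i in range(len(x)))

def apply(a, v):
    """Matrix-vector product a v."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

def inner(u, v):
    """Inner product <u, v>, conjugate-linear in the first argument."""
    return sum(complex(ui).conjugate() * vi for ui, vi in zip(u, v))

def projector(psi):
    """Matrix of psi (x) psi^*, with entries psi_i * conj(psi_j)."""
    return [[pi * complex(pj).conjugate() for pj in psi] for pi in psi]

# An orthonormal basis of C^2 and probabilities p_i summing to 1.
s = 1 / math.sqrt(2)
basis = [[s, s], [s, -s]]
probs = [0.8, 0.2]

# rho = sum_i p_i psi_i (x) psi_i^*
rho = [[sum(p * projector(psi)[i][j] for p, psi in zip(probs, basis))
        for j in range(2)] for i in range(2)]

# A self-adjoint observable.
a = [[1, 2 - 1j], [2 + 1j, -3]]

via_trace = trace(matmul(rho, a))
via_mixture = sum(p * inner(psi, apply(a, psi)) for p, psi in zip(probs, basis))

print(via_trace, via_mixture)  # the two expressions should agree
```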

It is worth pointing out the special status of the density operator $\rho = \frac{1}{\dim A}$, whose corresponding state is the normalized trace itself. With respect to an arbitrary orthonormal basis $\psi_i$ of $A$, it may be written

$\displaystyle \rho = \sum_i \frac{1}{\dim A} \psi_i \otimes \psi_i^{\ast}$

which gives the uniform distribution, and it is the unique density operator with this property.

If desired, writing $A$ as a finite direct product of matrix $^{\dagger}$-algebras $\mathcal{M}_n(\mathbb{C})$ gives a corresponding decomposition of a density operator as a finite direct product of density operators associated to each factor $\mathcal{M}_n(\mathbb{C})$, so understanding density operators in general reduces to understanding density operators in matrix $^{\dagger}$-algebras.

Strictly speaking, from the perspective of noncommutative probability the term “pure state” as we have been using it is a misnomer because being pure is not a property of a state on a complex $^{\dagger}$-algebra $A$. Rather, it is a property of a state together with a $^{\dagger}$-representation of $A$. If $V$ is a $^{\dagger}$-representation, then a state $\mathbb{E}$ is pure with respect to $V$ if

$\displaystyle \mathbb{E}(a) = \langle \psi, a \psi \rangle_V$

for some unit vector $\psi \in V$, and this notion depends very strongly on the inner product space $V$. Above we used $V = A$ with the Hilbert-Schmidt inner product; in the special case that $A = \mathcal{M}_n(\mathbb{C})$ it is more typical to use $\mathbb{C}^n$.

Note that by the GNS construction, every state is pure with respect to some $^{\dagger}$-representation. Apparently the use of this perspective has a name: it is referred to as the Church of the Larger Hilbert Space.

### Examples in finite probability

Let $A$ be a finite-dimensional commutative semisimple complex $^{\dagger}$-algebra. Then $A \cong \mathbb{C}^S$ for some finite set $S$, the sample space in the usual sense. A density operator on $A$ is in this case precisely a function $\rho : S \to \mathbb{R}_{\ge 0}$ with $\sum_{s \in S} \rho(s) = 1$, assigning to each element of the sample space its probability, and so we recover the familiar setting of probability on a finite sample space from the above.
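For concreteness, here is a minimal Python sketch of the commutative case, using a fair six-sided die as a hypothetical sample space. Random variables are functions on $S$, the density operator is the probability function `rho`, and the state is $\mathbb{E}(a) = \sum_s \rho(s) a(s)$.

```python
from fractions import Fraction

# Sample space of a fair die and its density "operator" (a probability function).
S = range(1, 7)
rho = {s: Fraction(1, 6) for s in S}

def expect(a):
    """The state E(a) = tr(rho a) = sum over s of rho(s) * a(s)."""
    return sum(rho[s] * a(s) for s in S)

face = lambda s: s                       # the value shown on the die
even = lambda s: 1 if s % 2 == 0 else 0  # idempotent: indicator of an event

print(expect(face))  # the mean of the die
print(expect(even))  # the probability of the event "even"
```

Idempotents of $\mathbb{C}^S$ are exactly indicator functions of subsets of $S$, so the expected value of an idempotent is the probability of the corresponding event.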

For a less familiar example, let $A = \mathcal{M}_2(\mathbb{C})$ be the complex $^{\dagger}$-algebra describing a qubit. The space of states on $A$ can be explicitly described as follows. Any density operator $\rho \in \mathcal{M}_2(\mathbb{C})$ must in particular be self-adjoint with trace $1$, hence must have the form

$\displaystyle \rho = \frac{I + \sigma}{2}$

where $\sigma$ is self-adjoint with trace $0$. If the eigenvalues of $\sigma$ are $\pm r$ for some real $r \ge 0$, then the eigenvalues of $\rho$ are $\frac{1 \pm r}{2}$, hence $\rho$ is positive if and only if $r \in [0, 1]$.

If $r = 0$, then $\rho = \frac{I}{2}$ is the normalized trace. Otherwise, for fixed $r > 0$, $\sigma$ has distinct eigenvalues, hence the $\frac{1 + r}{2}$-eigenspace uniquely determines the $\frac{1 - r}{2}$-eigenspace by orthogonality, so $\sigma$ is uniquely specified by specifying a $1$-dimensional subspace of $\mathbb{C}^2$, which may be identified with a point on a sphere (the Bloch sphere) of radius $r$.

This interpretation continues to make sense when $r = 0$, giving a parameterization of the states on $A$ by the points in a solid ball $B^3$, which may be thought of as the Bloch sphere together with its interior. Note that the boundary of this ball consists of precisely the states such that $r = 1$, which is equivalent to $\rho$ having rank $1$ and hence equivalent to $\rho$ describing a pure state.
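The parameterization by the ball can be made explicit in coordinates. The sketch below assumes the standard expansion $\sigma = x \sigma_x + y \sigma_y + z \sigma_z$ in Pauli matrices, which the text above does not spell out, so the coordinate conventions and function names here are choices made for this example. It also assumes its input is self-adjoint with trace $1$.

```python
import math

def density_from_bloch(x, y, z):
    """rho = (I + x*sigma_x + y*sigma_y + z*sigma_z) / 2 as a 2x2 matrix."""
    return [[(1 + z) / 2, complex(x, -y) / 2],
            [complex(x, y) / 2, (1 - z) / 2]]

def bloch_vector(rho):
    """Recover (x, y, z) from a self-adjoint trace-1 matrix rho."""
    off = complex(rho[0][1])  # equals (x - i*y) / 2
    x = 2 * off.real
    y = -2 * off.imag
    z = (complex(rho[0][0]) - complex(rho[1][1])).real
    return (x, y, z)

def radius(rho):
    """The parameter r, i.e. the length of the Bloch vector."""
    return math.sqrt(sum(c * c for c in bloch_vector(rho)))

def eigenvalues(rho):
    """Eigenvalues (1 + r) / 2 and (1 - r) / 2 of rho."""
    r = radius(rho)
    return ((1 + r) / 2, (1 - r) / 2)

def is_state(rho, tol=1e-12):
    """rho is positive exactly when r <= 1."""
    return radius(rho) <= 1 + tol

def is_pure(rho, tol=1e-12):
    """rho has rank 1 exactly when r = 1."""
    return abs(radius(rho) - 1) <= tol

plus = [[0.5, 0.5], [0.5, 0.5]]   # a rank-1 density operator
mixed = [[0.5, 0.0], [0.0, 0.5]]  # the normalized trace, r = 0

print(bloch_vector(plus), is_pure(plus))
print(bloch_vector(mixed), eigenvalues(mixed))
```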

### Gibbs states and the Born rule

Although mixed states are natural from the perspective of noncommutative probability, traditional introductions to quantum mechanics generally begin by talking about pure states. The probability distributions on measurements are then given by the Born rule, which may appear somewhat mysterious at first glance. In noncommutative probability, the Born rule is equivalent to the description of a state as

$\displaystyle \mathbb{E}(a) = \langle \psi, a \psi \rangle_H$

where $a$ is an element of a complex $^{\dagger}$-algebra $A$, $H$ is a $^{\dagger}$-representation of $A$, and $\psi \in H$ is a unit vector. When $A$ is finite-dimensional and admits a faithful state, we saw above that any state on $A$ may be described in terms of a density operator. Moreover, the density operators in, say, $A = \mathcal{M}_n(\mathbb{C})$ giving rise to states of the above form (with $H = \mathbb{C}^n$) are precisely the density operators of rank $1$. So the only mystery in the Born rule is why one should expect density operators to have rank $1$, at least in some situations.

One answer is the following. Let $A = \mathcal{M}_n(\mathbb{C})$, and let $H \in A$ be a self-adjoint element, to be thought of as a Hamiltonian. Using the Heisenberg picture, we can describe time evolution of states on $A$ with respect to the Hamiltonian $H$ as follows: after time $t$, a state $\mathbb{E}_0$ is sent to the state

$\displaystyle \mathbb{E}_t(a) = \mathbb{E}_0(e^{- \frac{H}{i \hbar} t} a e^{ \frac{H}{i \hbar} t})$.

Writing $\mathbb{E}_0(a) = \text{tr}(\rho_0 a)$ in terms of a density operator, we may equivalently describe $\mathbb{E}_t$ by describing its density operator, which is given by

$\displaystyle \rho_t = e^{ \frac{H}{i \hbar} t} \rho_0 e^{- \frac{H}{i \hbar} t}$

using the cyclic symmetry of the trace.
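The agreement between the two descriptions can be checked numerically. The following sketch makes several simplifying assumptions that are not in the text: $\hbar = 1$, the Hamiltonian is already diagonal (so that its exponential is computed entrywise), and the particular $\rho_0$, observable, and time are arbitrary choices for illustration.

```python
import cmath

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(x):
    return sum(x[i][i] for i in range(len(x)))

def evolution(energies, t, hbar=1.0):
    """U = exp(H t / (i hbar)) for the diagonal Hamiltonian H = diag(energies).

    Since 1/i = -i, the diagonal entries are exp(-i E_k t / hbar).
    """
    n = len(energies)
    return [[cmath.exp(-1j * energies[i] * t / hbar) if i == j else 0
             for j in range(n)] for i in range(n)]

energies = [0.0, 1.0]
t = 0.7
U = evolution(energies, t)       # exp( H t / (i hbar))
U_inv = evolution(energies, -t)  # exp(-H t / (i hbar))

rho0 = [[0.5, 0.5], [0.5, 0.5]]  # initial density operator
a = [[0, 1], [1, 0]]             # an observable

# Evolve the observable: E_t(a) = E_0(U^{-1} a U).
evolved_observable = trace(matmul(rho0, matmul(U_inv, matmul(a, U))))

# Evolve the density operator: rho_t = U rho_0 U^{-1}.
rho_t = matmul(U, matmul(rho0, U_inv))
evolved_density = trace(matmul(rho_t, a))

print(evolved_observable, evolved_density)  # should agree
```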

Suppose now that the state $\mathbb{E}_0$ is a Gibbs state, meaning that it is invariant under time evolution. This is equivalent to the condition that $\rho$ commutes with $e^{ \frac{H}{i \hbar} t}$ for all $t$. Differentiating with respect to $t$ at $t = 0$, it follows that $\rho$ commutes with $H$, and exponentiating it follows that this necessary condition is also sufficient. In other words, such an equilibrium state is a state whose density operator lies in the centralizer

$\text{Cen}(H) = \{ a \in \mathcal{M}_n(\mathbb{C}) : [a, H] = 0 \}$

of $H$.

Assume now that $H$ has distinct eigenvalues (that is, that there are no degenerate energy levels).

Proposition: Let $T$ be a linear operator on a finite-dimensional vector space $V$ over an algebraically closed field $k$, and suppose that $T$ has $\dim V$ distinct eigenvalues. Then $\text{Cen}(T) = \text{span}(1, T, T^2, ...)$.

Proof. Since $T$ has distinct eigenvalues, $V$ breaks up into a direct sum of one-dimensional eigenspaces, and any operator commuting with $T$ necessarily preserves each eigenspace. Writing $T$ as a diagonal matrix in a basis of eigenvectors, it follows that $\text{Cen}(T)$ consists precisely of diagonal matrices, which are spanned by the powers of $T$ by standard arguments (for example the invertibility of Vandermonde determinants). $\Box$

(By passing to the algebraic closure we can remove the hypothesis that $k$ is algebraically closed, but this is not needed in what follows.)
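The last step of the proof can be made concrete. Instead of inverting a Vandermonde matrix directly, the sketch below uses Lagrange interpolation, which solves the same linear system, to write a given diagonal matrix as a polynomial in a diagonal $T$ with distinct eigenvalues. The function names and the sample data are hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    r = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def lagrange_coeffs(xs, ys):
    """Coefficients of the unique polynomial p of degree < n with p(xs[i]) = ys[i].

    Requires the xs to be distinct, mirroring the distinct-eigenvalue hypothesis.
    """
    n = len(xs)
    coeffs = [Fraction(0)] * n
    for i in range(n):
        basis = [Fraction(1)]  # product of (x - xs[j]) over j != i
        denom = Fraction(1)    # product of (xs[i] - xs[j]) over j != i
        for j in range(n):
            if j != i:
                basis = poly_mul(basis, [Fraction(-xs[j]), Fraction(1)])
                denom *= xs[i] - xs[j]
        for k in range(n):
            coeffs[k] += ys[i] * basis[k] / denom
    return coeffs

def evaluate(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))

eigs = [1, 2, 4]    # T = diag(1, 2, 4)
diag = [5, -1, 3]   # target element of Cen(T): D = diag(5, -1, 3)

coeffs = lagrange_coeffs(eigs, diag)

# Since T is diagonal, p(T) = diag(p(1), p(2), p(4)), which should equal D.
print(coeffs)
print([evaluate(coeffs, x) for x in eigs])
```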

It follows that a Gibbs state has a density operator $\rho = \rho_0$ which is a polynomial in $H$. That is, writing

$\displaystyle H = \sum_i E_i \psi_i \otimes \psi_i^{\ast}$

where the $\psi_i$ are energy eigenstates with energy eigenvalues $E_i$, it follows that

$\displaystyle \rho = \sum_i p(E_i) \psi_i \otimes \psi_i^{\ast}$

for some polynomial $p$; in other words, $\rho$ describes a state in which the probability that the system is measured to be in a particular energy eigenstate depends only on the value of the energy. However, in order for $\rho$ to be determined in a physically meaningful way from $H$, it must be the case that adding an arbitrary scalar to $H$ does not affect $\rho$; equivalently, only differences between energies are physically meaningful, and consequently the ratios $\frac{p(E_i)}{p(E_j)}$ can only depend on the value of $E_i - E_j$. Writing

$\displaystyle \frac{p(E_i)}{p(E_j)} = f(E_i - E_j)$

we observe that

$\displaystyle \frac{p(E_i)}{p(E_j)} \frac{p(E_j)}{p(E_k)} = f(E_i - E_j) f(E_j - E_k) = \frac{p(E_i)}{p(E_k)} = f(E_i - E_k)$

and so, under very mild continuity assumptions on $f$ (enough to rule out pathological solutions to the Cauchy functional equation, none of which are physically meaningful anyway), together with the constraint that $\rho$ is positive, it follows that $f$ must have the form

$\displaystyle f(E_i - E_j) = e^{ - \beta (E_i - E_j)}$

for some real constant $\beta$. Under the additional physical assumption that the system is more likely to be in a state with low energy than in a state with high energy, it follows that $\beta$ is non-negative. An equivalent way of stating the above identity is to write the density operator as

$\displaystyle \rho = \frac{1}{Z(\beta)} e^{ - \beta H}$

where

$\displaystyle Z(\beta) = \text{tr}(e^{- \beta H}) = \sum_i e^{- \beta E_i}$

is the partition function (a normalization necessary for $\rho$ to have trace $1$). The corresponding Gibbs state is then

$\displaystyle \mathbb{E}(a) = \frac{1}{Z(\beta)} \text{tr}(e^{- \beta H} a)$.

Explicitly, the probability that the system is in energy eigenstate $\psi_i$ is given by $\frac{1}{Z(\beta)} e^{- \beta E_i}$.

In statistical mechanics, $\beta$ is interpreted as thermodynamic beta $\frac{1}{k_B T}$ where $T$ is temperature and $k_B$ is the Boltzmann constant. When $T$ is very large, $\beta$ is very small, and the system resembles a distribution which is uniform in energy eigenstates.

When $T$ is very small, $\beta$ is very large, and the system is overwhelmingly likely to be in its lowest energy eigenstate; that is, if the energy eigenvalues are ordered $E_0 < E_1 < ... < E_{n-1}$, then the lowest energy eigenstate is $e^{ \beta (E_1 - E_0)}$ times more likely to occur than the second-lowest energy eigenstate, and the other energy eigenstates are even less likely. In terms of the density operator, we can write

$\displaystyle \rho = \frac{1}{Z(\beta)} \sum_i e^{- \beta E_i} \psi_i \otimes \psi_i^{\ast} \approx \psi_0 \otimes \psi_0^{\ast}$

for large $\beta$; in other words, as $\beta \to \infty$, $\rho$ approaches an operator of rank $1$, and the corresponding Gibbs state approaches the pure (with respect to $\mathbb{C}^n$) state described by the lowest energy eigenstate.
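The two temperature limits are easy to see numerically. Here is a minimal Python sketch computing the probabilities $\frac{1}{Z(\beta)} e^{-\beta E_i}$ for a hypothetical three-level system. The energies are arbitrary, and subtracting the lowest energy before exponentiating is a choice made here for numerical stability; it does not change the result, precisely because shifting $H$ by a scalar does not affect $\rho$.

```python
import math

def gibbs_probabilities(energies, beta):
    """Probabilities exp(-beta * E_i) / Z(beta) of the energy eigenstates."""
    e_min = min(energies)
    # Shifting all energies by a constant cancels between numerator and Z.
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

energies = [0.0, 1.0, 2.5]  # E_0 < E_1 < E_2, arbitrary units

hot = gibbs_probabilities(energies, beta=0.0)    # T very large
warm = gibbs_probabilities(energies, beta=1.0)
cold = gibbs_probabilities(energies, beta=50.0)  # T very small

print(hot)   # uniform over energy eigenstates
print(warm)  # lower energies are more likely
print(cold)  # concentrated on the lowest energy eigenstate
```

For any $\beta$ the ratio `warm[0] / warm[1]` equals $e^{\beta (E_1 - E_0)}$, matching the ratio quoted above.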

### Wave function collapse as conditioning

The Born rule describes the probability distribution associated to a measurement of a quantum system, but another postulate of quantum mechanics, the collapse postulate, describes what happens to a measured quantum system afterwards. It asserts the following (simplified to the case of discrete spectrum): suppose a quantum system is in a pure state described by a unit vector $\psi$ in a Hilbert space $H$. Suppose we measure an observable $a$, given by a suitable self-adjoint operator $H \to H$. Then this observable takes value $\lambda$, some eigenvalue of $a$, with probability $\langle \psi, P_{\lambda} \psi \rangle$, where $P_{\lambda}$ is the projection onto the $\lambda$-eigenspace of $a$. After this measurement, the state $\psi$ is replaced by its projection $P_{\lambda} \psi$, normalized to have unit length.

The physical meaning of the collapse postulate is contentious. Here I only want to record a mathematical observation, which is that it is mathematically equivalent to conditioning with respect to the observation that $a$ was measured to have value $\lambda$.

Recall that if $A$ is a random algebra and $P$ a projection in $A$ with $\mathbb{E}(P) > 0$ (an event that occurs with positive probability), then the conditional expectation $\mathbb{E}(a | P)$ (the expected value of $a$ conditional on the fact that the event $P$ happened) is given by

$\displaystyle \mathbb{E}(a | P) = \frac{\mathbb{E}(PaP)}{\mathbb{E}(P)}$.

If $A$ has a $^{\dagger}$-representation on a Hilbert space $H$ relative to which the state $\mathbb{E}$ is the pure state $\mathbb{E}(a) = \langle \psi, a \psi \rangle$ for some $\psi \in H$, then

$\displaystyle \mathbb{E}(a | P) = \frac{\langle \psi, PaP \psi \rangle}{\langle \psi, P \psi \rangle} = \frac{\langle P \psi, a P \psi \rangle}{\langle P \psi, P \psi \rangle}$

(using the fact that $P^2 = P$ and that $P$ is self-adjoint). In other words, the conditional expectation is precisely the pure state associated to the projection $P \psi$, normalized to have unit length.
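This identity can be checked on a small example. The sketch below uses a hypothetical three-dimensional system with a rank-$2$ projection; the vector $\psi$, the observable, and the helper names are all arbitrary choices for illustration.

```python
import math

def matmul(x, y):
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(a, v):
    """Matrix-vector product a v."""
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

def inner(u, v):
    """Inner product <u, v>, conjugate-linear in the first argument."""
    return sum(complex(ui).conjugate() * vi for ui, vi in zip(u, v))

psi = [1 / 3, 2j / 3, 2 / 3]              # a unit vector in C^3
P = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]     # projection onto the first two coordinates
a = [[1, 1j, 0], [-1j, 2, 1], [0, 1, 3]]  # a self-adjoint observable

def E(x):
    """The pure state E(x) = <psi, x psi>."""
    return inner(psi, apply(x, psi))

# Conditioning: E(a | P) = E(P a P) / E(P).
prob_P = E(P)
conditional = E(matmul(P, matmul(a, P))) / prob_P

# Collapse: project psi, normalize, and take the resulting pure state.
projected = apply(P, psi)
norm = math.sqrt(inner(projected, projected).real)
phi = [c / norm for c in projected]
collapsed = inner(phi, apply(a, phi))

print(prob_P)                  # probability that the event P occurs
print(conditional, collapsed)  # the two expectations should agree
```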