Hilbert spaces are a particularly nice class of Banach spaces. They axiomatize ideas from Euclidean geometry such as orthogonality, projection, and the Pythagorean theorem, but the ideas apply to many infinite-dimensional spaces of functions of interest to various branches of mathematics. Hilbert spaces are also fundamental to quantum mechanics, as vectors in Hilbert spaces (up to phase) describe (pure) states of quantum systems.
Today we’ll develop and discuss some of the basic theory of Hilbert spaces. As with the theory of Banach spaces, there are (at least) two types of morphisms we might want to talk about (unitary operators and bounded operators), and we will discuss an elegant formalism that allows us to talk about both. Things written by John Baez will be cited excessively.
Definition and introductory remarks
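Let $V$ be a vector space over $k = \mathbb{R}$ or $\mathbb{C}$. An inner product on $V$ is a map $\langle \cdot, \cdot \rangle : V \times V \to k$ satisfying
- $\langle u, av + bw \rangle = a \langle u, v \rangle + b \langle u, w \rangle$ (linearity in the second argument),
- $\langle v, w \rangle = \overline{\langle w, v \rangle}$ (conjugate symmetry; this implies conjugate linearity in the first argument),
- $\langle v, v \rangle \ge 0$ and $\langle v, v \rangle = 0 \Rightarrow v = 0$ (positive-definiteness).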
(Linearity in the second variable is conventional in physics but in mathematics the convention is generally to have linearity in the first variable. We use the physics convention above for reasons explained in the next section.)
A vector space equipped with an inner product is an inner product space. Inner products generalize the ordinary dot product of vectors in $\mathbb{R}^n$, but the formalism applies to infinite-dimensional spaces such as various function spaces, allowing us to use geometric intuition from the former to understand the latter. In quantum mechanics, inner products are fundamental as they give rise to transition amplitudes (see for example the Born rule).
Any inner product space gives rise to a function $\|v\| = \sqrt{\langle v, v \rangle}$ which is readily seen to satisfy all of the axioms of a norm with the possible exception of the triangle inequality, which we now prove.
Cauchy-Schwarz inequality: let $v, w$ be vectors in an inner product space. Then $|\langle v, w \rangle| \le \|v\| \|w\|$.
The Cauchy-Schwarz inequality can be proven in many ways (see for example Steele’s The Cauchy-Schwarz Master Class). Although it is stated here for an arbitrary inner product space, by restricting to the subspace generated by $v$ and $w$ we see that it is really a statement about $2$-dimensional inner product spaces.
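Proof. Consider the quadratic polynomial
$$\langle tv + w, tv + w \rangle = t^2 \|v\|^2 + 2t \, \text{Re} \langle v, w \rangle + \|w\|^2$$
in the real variable $t$. By positive-definiteness, it cannot be negative, so its discriminant cannot be positive. This gives
$$4 (\text{Re} \langle v, w \rangle)^2 - 4 \|v\|^2 \|w\|^2 \le 0$$
and it follows that $|\text{Re} \langle v, w \rangle| \le \|v\| \|w\|$. Multiplying $v$ by a complex number of absolute value $1$ does not change the RHS, and it can make $\langle v, w \rangle$ real and non-negative, giving the desired inequality.
Corollary (triangle inequality): $\|v + w\| \le \|v\| + \|w\|$.
Proof. By Cauchy-Schwarz,
$$\|v + w\|^2 = \|v\|^2 + 2 \, \text{Re} \langle v, w \rangle + \|w\|^2 \le \|v\|^2 + 2 \|v\| \|w\| + \|w\|^2 = (\|v\| + \|w\|)^2.$$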
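Following the above, for an inner product $\langle \cdot, \cdot \rangle$ we call $\|v\| = \sqrt{\langle v, v \rangle}$ the induced norm.
Corollary: For any inner product space $V$ and any $v \in V$, the map $w \mapsto \langle v, w \rangle$ is a continuous linear functional of operator norm $\|v\|$ with respect to the induced norm.
The identity $\|v + w\|^2 = \|v\|^2 + 2 \, \text{Re} \langle v, w \rangle + \|w\|^2$ should be thought of as an abstract form of the law of cosines. In particular, if $\langle v, w \rangle = 0$ ($v, w$ are orthogonal), then we recover the Pythagorean theorem $\|v + w\|^2 = \|v\|^2 + \|w\|^2$.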
An inner product space is a Hilbert space if it is complete with respect to the induced norm.
Example. For any measure space $X$ with measure $\mu$, the space $L^2(X, \mu)$ is a Hilbert space with inner product
$$\langle f, g \rangle = \int_X \overline{f(x)} g(x) \, d\mu.$$
Special cases include the spaces $\ell^2(S)$ for a set $S$ as in the Banach space examples; when $S$ is finite and we work over the reals we recover Euclidean space with the usual inner product. In quantum mechanics, a fundamental example is $L^2(\mathbb{R}^3)$ with Lebesgue measure, as this is the space in which wave functions describing a particle in three spatial dimensions live. If $\mu$ is a probability measure we can think of real-valued $f, g \in L^2(X, \mu)$ as random variables, and if they happen to have expected value $0$ then $\langle f, g \rangle$ is their covariance.
If $V$ is a real inner product space with induced norm $\| \cdot \|$, then a straightforward computation shows that
$$\langle v, w \rangle = \frac{\|v + w\|^2 - \|v - w\|^2}{4}$$
and if $V$ is a complex inner product space a somewhat more tedious computation shows that
$$\langle v, w \rangle = \frac{\|v + w\|^2 - \|v - w\|^2 - i \|v + iw\|^2 + i \|v - iw\|^2}{4}.$$
In any case, we conclude that the inner product is uniquely determined by the norm it induces. Thus being Hilbert is a property of a Banach space up to isometric isomorphism. We can even characterize the Banach spaces with this property in a fairly straightforward manner: they are precisely the ones with norms satisfying the parallelogram identity
$$\|v + w\|^2 + \|v - w\|^2 = 2 \|v\|^2 + 2 \|w\|^2.$$
This is fairly annoying to prove, but it has a nice interpretation: if a norm is like the Euclidean norm in this particular respect, then it must be like the Euclidean norm in various other respects (coming from what can be proven using the inner product space axioms).
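As a quick sanity check, here is a minimal Python sketch that recovers an inner product on $\mathbb{C}^3$ from its norm alone and verifies the parallelogram identity numerically. It assumes the standard inner product (conjugate-linear in the first slot, matching the convention above); the helper names `inner`, `norm_sq`, `add` and `polarization` and the sample vectors are made up for illustration.

```python
def inner(v, w):
    """Standard inner product on C^n, conjugate-linear in the FIRST slot."""
    return sum(a.conjugate() * b for a, b in zip(v, w))

def norm_sq(v):
    """Squared induced norm <v, v> (always real)."""
    return inner(v, v).real

def add(v, w, c=1):
    """Return v + c*w componentwise."""
    return [a + c * b for a, b in zip(v, w)]

def polarization(v, w):
    """Recover <v, w> using only norms (complex polarization identity)."""
    return (norm_sq(add(v, w)) - norm_sq(add(v, w, -1))
            - 1j * norm_sq(add(v, w, 1j)) + 1j * norm_sq(add(v, w, -1j))) / 4

# Arbitrary sample vectors (an assumption for illustration only).
v = [1 + 2j, -1j, 0.5]
w = [2 - 1j, 3, 1j]

# Both sides of the parallelogram identity.
lhs = norm_sq(add(v, w)) + norm_sq(add(v, w, -1))
rhs = 2 * norm_sq(v) + 2 * norm_sq(w)

print(abs(inner(v, w) - polarization(v, w)) < 1e-9)
print(abs(lhs - rhs) < 1e-9)
```

Note that the signs on the two imaginary terms depend on which slot is linear; with linearity in the first variable they would be swapped.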
We might now be tempted to think of Hilbert spaces as a subcategory of the category of Banach spaces and weak contractions, but we shouldn’t. For example, the product or coproduct of Hilbert spaces in that category is almost never a Hilbert space; Hilbert spaces instead admit a direct sum coming from a generalized $\ell^2$-norm rather than a generalized $\ell^{\infty}$- or $\ell^1$-norm. This suggests that weak contractions aren’t a natural choice of morphisms between Hilbert spaces.
If we want to be permissive, we should take bounded linear operators as morphisms. If we want to be restrictive, we want all of the relevant structure to be preserved (namely the inner product), so we could take as morphisms linear maps $T : H_1 \to H_2$ such that
$$\langle Tv, Tw \rangle = \langle v, w \rangle \text{ for all } v, w \in H_1.$$
These include the unitary maps, which are the invertible maps with this property.
(Note that since the inner product is uniquely determined by a composition of linear functions and the norm, it follows that a linear operator between Hilbert spaces preserves the inner product if and only if it preserves the norm. Thus we may call a map satisfying the above property an isometry.)
We also make the following observation whose name will be explained below.
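The Yoneda lemma for inner product spaces: Let $v, w$ be vectors in an inner product space $V$ such that $\langle u, v \rangle = \langle u, w \rangle$ for all $u \in V$. Then $v = w$.
Proof. The above implies $\langle u, v - w \rangle = 0$ for all $u$, so $\langle v - w, v - w \rangle = 0$, so $v = w$ by positive-definiteness.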
The theory of real Hilbert spaces is a straightforward axiomatization of the properties of the dot product in Euclidean space, but the theory of complex Hilbert spaces includes an additional wrinkle, namely the issue of conjugate symmetry and the fact that the inner product is conjugate-linear rather than linear in one variable. Above I chose to have inner products be linear in the second variable rather than the first, and the reason is the following example.
Let $G$ be a finite group and consider the category $\text{Rep}(G)$ of finite-dimensional complex representations of $G$. For $V, W \in \text{Rep}(G)$ with characters $\chi_V, \chi_W$, recall that we have
$$\dim \text{Hom}_G(V, W) = \frac{1}{|G|} \sum_{g \in G} \overline{\chi_V(g)} \chi_W(g) = \langle \chi_V, \chi_W \rangle.$$
In other words, the dimension of spaces of intertwining operators defines an inner product on the complex vector space spanned by characters (formally, the tensor product $K(\text{Rep}(G)) \otimes \mathbb{C}$ where $K$ denotes the Grothendieck group) which is naturally conjugate-linear in the first variable. Morally this is because Hom is contravariant in the first variable and covariant in the second.
This example is particularly interesting because in quantum mechanics the inner product of states describes the transition amplitude between them (in a sense that I don’t completely understand), and it would not be too far-fetched to think of transition amplitudes as being morphisms in some vague sense between states.
In this way we see that $\text{Rep}(G)$ itself is a kind of categorified Hilbert space, with morphisms as a kind of categorified inner product. Decategorifying the Yoneda lemma for elements of $\text{Rep}(G)$ gives back the Yoneda lemma for inner products above. Decategorifying the isomorphism $\text{Hom}(V, W) \cong \text{Hom}(W, V)^{\ast}$ gives conjugate-symmetry. Decategorifying the adjunction between, say, restriction and induction functors gives adjoint operators (see below). And so forth. For a further elaboration on this theme, see Baez’s Higher-Dimensional Algebra II: 2-Hilbert spaces.
Projections and complements
In $\mathbb{R}^n$, the ordinary dot product allows us to define the projection
$$\text{proj}_w(v) = \frac{\langle v, w \rangle}{\langle w, w \rangle} w$$
of a vector $v$ onto another vector $w$. The above notation is somewhat confusing, as it takes two vectors as inputs when it should really take as input a vector $v$ and a subspace $W$; the projection should then be the closest vector in $W$ to $v$. The above is just the special case that $W = \text{span}(w)$.
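We formalize this as follows. For $v \in V$ and $S \subseteq V$, define the distance
$$d(v, S) = \inf_{s \in S} \|v - s\|.$$
(Of course this definition makes sense in any metric space.) Then $s \in S$ is a closest vector in $S$ to $v$ if $\|v - s\| = d(v, S)$. We say that $S$ admits closest vectors if such a vector always exists for all $v \in V$. (Note that such a subset is in particular closed.)
For general subsets $S$, closest vectors are not guaranteed to be unique. However:
Proposition: Let $S$ be a subset of an inner product space $V$ which is closed under taking midpoints. Then the closest vector to a vector $v \in V$ is unique if it exists.
Proof. Suppose that $s_1, s_2$ are two closest vectors. By the parallelogram identity,
$$\left\| v - \frac{s_1 + s_2}{2} \right\|^2 + \left\| \frac{s_1 - s_2}{2} \right\|^2 = \frac{\|v - s_1\|^2 + \|v - s_2\|^2}{2} = d(v, S)^2.$$
It follows that $\frac{s_1 + s_2}{2}$ (which lies in $S$ by assumption) is strictly closer to $v$ than either $s_1$ or $s_2$ unless $\|s_1 - s_2\| = 0$, hence unless $s_1 = s_2$.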
Note that this is badly false in a general normed space. For example, in $\mathbb{R}^2$ with the $\ell^{\infty}$ norm, every vector $(x, 0)$ with $|x| \le 1$ is closest among the vectors on the $x$-axis to the vector $(0, 1)$.
In Euclidean space, projection is valuable among other things because it resolves a vector into two perpendicular components. The same is true in arbitrary inner product spaces.
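Proposition: Let $S$ be a subset of an inner product space $V$ which is closed under scalar multiplication. If the closest vector $s \in S$ to a vector $v$ exists, then $\langle v - s, s \rangle = 0$.
Proof. Since $e^{it} s \in S$ for all real $t$ and $s$ is closest, the real function $t \mapsto \|v - e^{it} s\|^2$ has a local minimum at $t = 0$; its derivative there is $2 \, \text{Im} \langle v, s \rangle$, so $\langle v, s \rangle$ is real. Likewise, the real function $f(t) = \|v - ts\|^2 = \|v\|^2 - 2t \langle v, s \rangle + t^2 \|s\|^2$ has a local minimum at $t = 1$. Its derivative there is therefore
$$f'(1) = 2 \|s\|^2 - 2 \langle v, s \rangle = -2 \langle v - s, s \rangle = 0.$$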
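Let $W$ be a subspace (necessarily closed) which admits closest vectors, and write $\text{proj}_W(v)$ for the closest vector in $W$ to $v$. Then it follows by the above that we may write any $v \in V$ as a sum
$$v = \text{proj}_W(v) + (v - \text{proj}_W(v))$$
of a vector in $W$ and a vector in its orthogonal complement $W^{\perp} = \{ u \in V : \langle u, w \rangle = 0 \text{ for all } w \in W \}$.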
We now need to introduce some important terminology. If $V, W$ are inner product spaces, their direct sum $V \oplus W$ can be given the inner product
$$\langle (v_1, w_1), (v_2, w_2) \rangle = \langle v_1, v_2 \rangle_V + \langle w_1, w_2 \rangle_W.$$
This defines the direct sum of inner product spaces. If an inner product space $V$ has subspaces $W_1, W_2$ such that $V$ is the internal direct sum of $W_1, W_2$ as vector spaces and moreover such that $W_1, W_2$ are orthogonal, then $V$ is an internal direct sum of $W_1, W_2$ as inner product spaces, which we write as $V = W_1 \oplus W_2$.
Proposition: Let $W$ be a subspace of an inner product space $V$ which admits closest vectors. Then $V = W \oplus W^{\perp}$.
Proof. By assumption, every $v \in V$ can be written as $v = w + w'$ where $w \in W, w' \in W^{\perp}$. Since $W \cap W^{\perp} = \{ 0 \}$, this sum decomposition is necessarily unique, which already implies that it must be linear. Since $W$ is orthogonal to $W^{\perp}$ by definition, $V$ has the direct sum inner product.
We can reformulate the above geometric discussion algebraically in terms of axioms that the map $v \mapsto \text{proj}_W(v)$ satisfies as follows. A projection on an inner product space $V$ is a bounded linear operator $P : V \to V$ such that
- $P$ is idempotent ($P^2 = P$), and
- $P$ is self-adjoint ($\langle Pv, w \rangle = \langle v, Pw \rangle$ for all $v, w \in V$).
We recall the following general result on idempotents.
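Proposition: Let $M$ be a left $R$-module ($R$ a ring, not necessarily commutative) and $P : M \to M$ be an idempotent morphism of $R$-modules. Then $M$ admits a direct sum decomposition
$$M = \text{im}(P) \oplus \ker(P).$$
Proof. We may write any $m \in M$ as $m = Pm + (m - Pm)$. Since $P^2 = P$, we have $m - Pm \in \ker(P)$. Conversely, if $m \in \ker(P)$ then $m = m - Pm$, so $\ker(P) = \text{im}(1 - P)$. Since $P^2 = P$, $P$ fixes any element of $\text{im}(P)$, so $\text{im}(P) \cap \ker(P) = 0$. Finally, since $P$ is a morphism, its kernel and image are both submodules of $M$.
The converse is straightforward; hence studying idempotents in $\text{End}_R(M)$ is equivalent to studying direct sum decompositions of $M$.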
Applied to projections, we have the following.
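Proposition: Let $P$ be a projection on an inner product space $V$. Then $V$ admits a direct sum decomposition
$$V = \text{im}(P) \oplus \ker(P).$$
In particular, $\ker(P) = \text{im}(P)^{\perp}$, so a projection is uniquely determined by its image.
Proof. Everything follows from the last proposition except the last claim, which follows from self-adjointness:
$$w \in \text{im}(P)^{\perp} \Leftrightarrow \langle Pv, w \rangle = \langle v, Pw \rangle = 0 \text{ for all } v \in V \Leftrightarrow Pw = 0.$$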
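To make the projection axioms concrete, the following sketch projects a vector of $\mathbb{C}^3$ onto the span of an orthonormal pair and checks numerically that the remainder is orthogonal to the subspace, that projecting twice changes nothing, and that the Pythagorean theorem holds. The formula $Pv = \sum_j \langle e_j, v \rangle e_j$ used here assumes the spanning vectors are orthonormal; the function names and the sample data are illustrative only.

```python
def inner(v, w):
    """Standard inner product on C^n, conjugate-linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(v, w))

def project(v, basis):
    """Orthogonal projection of v onto span(basis); basis must be orthonormal."""
    out = [0j] * len(v)
    for e in basis:
        c = inner(e, v)                      # coefficient <e, v>
        out = [o + c * ei for o, ei in zip(out, e)]
    return out

s = 2 ** -0.5
basis = [[s, s, 0], [0, 0, 1]]               # an orthonormal pair in C^3
v = [1 + 1j, 3, -2j]

p = project(v, basis)                        # component in W
r = [a - b for a, b in zip(v, p)]            # component in the complement

print(all(abs(inner(e, r)) < 1e-9 for e in basis))
print(abs(inner(v, v).real - inner(p, p).real - inner(r, r).real) < 1e-9)
```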
The converse is again straightforward. Altogether we can summarize our discussion as follows.
Theorem: The following conditions on a subspace $W$ of an inner product space $V$ are equivalent.
- $W$ admits closest vectors.
- There exists a projection $P : V \to V$ such that $\text{im}(P) = W$.
We turn now to the question of which subspaces have this property.
Proposition: Let $W$ be a finite-dimensional subspace of an inner product space $V$. Then $W$ admits closest vectors.
Proof. Let $v \in V$. We want to show that there is a closest vector in $W$ to $v$. Since $0 \in W$ is at a distance $\|v\|$ from $v$, it follows by the triangle inequality that the closest vector, if it exists, is necessarily contained in the closed ball of radius $2 \|v\|$ centered at the origin in $W$. Since $W$ is finite-dimensional, this ball is compact, so the function $w \mapsto \|v - w\|$ attains its minimum.
This proof does not generalize to the infinite-dimensional case, since closed unit balls are no longer compact in this setting. By assuming that $V$ is a Hilbert space, we can substitute completeness for compactness.
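Theorem: Let $C$ be a closed convex subset of a Hilbert space $H$. Then $C$ admits closest vectors.
Proof. Let $v \in H$ be a vector and let $c_n \in C$ be a sequence such that
$$\lim_{n \to \infty} \|v - c_n\| = d(v, C).$$
By the parallelogram identity,
$$\left\| v - \frac{c_n + c_m}{2} \right\|^2 + \left\| \frac{c_n - c_m}{2} \right\|^2 = \frac{\|v - c_n\|^2 + \|v - c_m\|^2}{2}$$
for any $n, m$ (note that this is the same use of the parallelogram identity as when we proved that closest vectors are unique). The RHS approaches $d(v, C)^2$ as $n, m \to \infty$ while $\left\| v - \frac{c_n + c_m}{2} \right\| \ge d(v, C)$ by definition (the midpoint lies in $C$ by convexity), so it follows that $\|c_n - c_m\| \to 0$, hence that $c_n$ is a Cauchy sequence. Since $H$ is a Hilbert space and $C$ is closed, $c_n$ has a limit $c \in C$ satisfying $\|v - c\| = d(v, C)$, hence this limit must be the closest vector.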
Corollary: A subspace $W$ of a Hilbert space $H$ admits closest vectors if and only if it is closed.
Corollary: If $W$ is a subspace of a Hilbert space $H$, then $(W^{\perp})^{\perp} = \overline{W}$.
Proof. $\overline{W}$ is a closed subspace of $H$ such that $\overline{W}^{\perp} = W^{\perp}$, hence by the above we have a direct sum decomposition
$$H = \overline{W} \oplus W^{\perp}.$$
In any direct sum decomposition the two spaces are orthogonal complements of each other, so it follows that $(W^{\perp})^{\perp} = \overline{W}$ as desired.
The theory of Banach spaces is unlike ordinary linear algebra in that Banach spaces do not admit a particularly good notion of basis. The linear-algebraic notion of basis, which only allows finite sums, is clearly unsuitable: it ignores the infinite sums which are now available, and spaces of functions don’t have reasonable Hamel bases anyway (see for example this math.SE question). The next obvious choice is to talk about Schauder bases, which are sequences $e_1, e_2, \dots$ in a Banach space $V$ such that every $v \in V$ has a unique representation as an infinite sum
$$v = \sum_{i=1}^{\infty} c_i e_i.$$
Unlike ordinary bases, Schauder bases must be ordered since the sum above is not required to converge absolutely. They also don’t always exist, even for separable Banach spaces; there is a counterexample due to Enflo. Finally, although the function sending a vector $v$ to the coefficient $c_i$ above is linear by uniqueness of the representation, its continuity is not at all apparent from the definition due to the lack of absolute convergence (it is true, but it is a theorem rather than an observation).
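But everything works out for Hilbert spaces. In any inner product space, a collection of vectors $\{ e_i \}$ is orthonormal if they satisfy $\langle e_i, e_j \rangle = \delta_{ij}$. In particular, the $e_i$ have norm $1$ and are linearly independent, since if $v = \sum c_i e_i$ then $c_i = \langle e_i, v \rangle$ for all $i$. An orthonormal basis of a Hilbert space $H$ is an orthonormal set whose span is dense in $H$.
Bessel’s inequality: Let $\{ e_i \}$ be an orthonormal set and $v$ a vector in an inner product space $V$. Then $\langle e_i, v \rangle = 0$ for all but countably many $i$, and $\sum_i |\langle e_i, v \rangle|^2 \le \|v\|^2$.
Proof. Let $\{ e_i \}$ be indexed by a set $I$ and let $J$ be a finite subset of $I$. Let $P_J$ be the projection onto $\text{span}(e_j : j \in J)$. Then we may write $P_J$ explicitly as
$$P_J v = \sum_{j \in J} \langle e_j, v \rangle e_j$$
by inspection. Since $\|P_J v\| \le \|v\|$, taking norms gives $\sum_{j \in J} |\langle e_j, v \rangle|^2 \le \|v\|^2$ for all finite subsets $J$. By exhausting every countable subset of $I$ by finite sets, it follows that the inequality holds for all countable subsets of $I$. Because we cannot take uncountable sums of positive real numbers, it follows that $\langle e_i, v \rangle = 0$ for all but countably many $i$, so the inequality holds for $I$.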
Bessel’s inequality becomes an equality in the following case, which is an infinite-dimensional generalization of the Pythagorean theorem.
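Parseval’s identity: Let $\{ e_i \}$ be an at most countable orthonormal set. If $v = \sum_i c_i e_i$ converges, then $\sum_i |c_i|^2$ converges, $\|v\|^2 = \sum_i |c_i|^2$, and $c_i = \langle e_i, v \rangle$.
Proof. Let $v = \sum_i c_i e_i$ and let $v_n = \sum_{i=1}^n c_i e_i$. We have $\|v_n\|^2 = \sum_{i=1}^n |c_i|^2$ by the Pythagorean theorem. Convergence implies convergence of norms, hence $\sum_i |c_i|^2$ converges (absolutely, since its terms are non-negative) and $\|v\|^2 = \sum_i |c_i|^2$. Finally, since $\langle e_i, v_n \rangle = c_i$ for all $n \ge i$, it follows by continuity that $\langle e_i, v \rangle = c_i$.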
We would like to conclude that orthonormal bases really are bases in a suitable Hilbert space sense, but first we need to prove the following.
Proposition: Let $\{ e_i \}$ be an orthonormal set and suppose that $v$ lies in the closure of the span of the $e_i$. Then $v = \sum_i \langle e_i, v \rangle e_i$.
Proof. By the above, we may assume WLOG that $\{ e_i \}$ is countable, indexed $e_1, e_2, \dots$. Let $P_n$ be the projection onto $\text{span}(e_1, \dots, e_n)$. Since $P_n v$ is the closest vector in $\text{span}(e_1, \dots, e_n)$ to $v$, it follows that $v$ lies in the closure of the span of the $e_i$ if and only if $P_n v \to v$.
Corollary: Let $H$ be a Hilbert space with an orthonormal basis $\{ e_i \}_{i \in I}$. Then the map
$$\phi : \ell^2(I) \ni (c_i)_{i \in I} \mapsto \sum_{i \in I} c_i e_i \in H$$
is a unitary isomorphism.
Proof. We showed above that $\phi$ preserves norms, so it remains to prove that it is linear. $\phi$ clearly respects scalar multiplication, and it also clearly respects addition on the subspace of $\ell^2(I)$ consisting of sequences with finite support. Since $\phi$ preserves norms, the rest follows by the continuity of $\phi$ and addition.
This is a strong structure theorem for Hilbert spaces with an orthonormal basis. We now turn our attention to proving that an orthonormal basis always exists. The idea, known as the Gram-Schmidt process, is the following in finitely many dimensions.
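Suppose $v_1, \dots, v_n$ are finitely many nonzero vectors in an inner product space. We’d like to find an orthonormal set of vectors $e_1, \dots, e_m$ with the same span. We’ll do this inductively. First, set $e_1 = \frac{v_1}{\|v_1\|}$. Assuming that $e_1, \dots, e_k$ have been defined, let $P_k$ denote the projection onto $\text{span}(e_1, \dots, e_k)$. Now, if $j$ is the smallest index such that $P_k v_j \neq v_j$, we can set
$$e_{k+1} = \frac{v_j - P_k v_j}{\|v_j - P_k v_j\|}.$$
It follows that $e_1, \dots, e_{k+1}$ is an orthonormal basis of $\text{span}(v_1, \dots, v_j)$ for all $k$ (with $j$ chosen as above).
The Gram-Schmidt process as defined here extends without fuss to countably many vectors $v_1, v_2, \dots$.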
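For readers who like to see the induction run, here is a small sketch of the finite Gram-Schmidt process in plain Python for vectors in $\mathbb{C}^n$. It skips any input vector that already lies in the span of the vectors found so far, which is the role of "the smallest index" in the description above. The tolerance `tol` and the sample vectors are arbitrary choices for illustration, and in floating point this naive version is not the most numerically stable one.

```python
import math

def inner(v, w):
    """Standard inner product on C^n, conjugate-linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(v, w))

def norm(v):
    return math.sqrt(inner(v, v).real)

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list with the same span as `vectors`."""
    basis = []
    for v in vectors:
        # Subtract the projection of v onto span(basis): sum of <e, v> e.
        r = list(v)
        for e in basis:
            c = inner(e, v)
            r = [ri - c * ei for ri, ei in zip(r, e)]
        n = norm(r)
        if n > tol:                      # skip vectors already in the span
            basis.append([ri / n for ri in r])
    return basis

# The second vector is a multiple of the first, so it contributes nothing.
vs = [[1, 1, 0], [2, 2, 0], [1, 0, 1j], [0, 1, 1]]
es = gram_schmidt(vs)
print(len(es))
```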
Corollary: Every separable Hilbert space has an orthonormal basis.
In particular, the separable infinite-dimensional Hilbert space is unique up to unitary isomorphism. Thus physicists sometimes speak of “Hilbert space” (as in “vectors in Hilbert space”) by which they mean the unique separable infinite-dimensional Hilbert space.
To extend the Gram-Schmidt process to an arbitrary number of vectors in a Hilbert space, we use transfinite induction. If you don’t care about non-separable Hilbert spaces, you can stop reading here.
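Let $v_{\alpha}$ be a collection of vectors in a Hilbert space $H$ indexed by ordinals and define a corresponding orthonormal set $e_{\alpha}$ as follows. As above, we set $e_0 = \frac{v_0}{\|v_0\|}$. If $e_{\alpha}$ has already been defined for all $\alpha \le \beta$, let $\gamma$ be the least ordinal such that $v_{\gamma}$ is not contained in the closure of $\text{span}(e_{\alpha} : \alpha \le \beta)$, let $P$ be the projection onto this subspace, and define
$$e_{\beta + 1} = \frac{v_{\gamma} - P v_{\gamma}}{\|v_{\gamma} - P v_{\gamma}\|}.$$
Similarly, if $e_{\alpha}$ has already been defined for all $\alpha < \beta$ for $\beta$ a limit ordinal, let $\gamma$ be the least ordinal such that $v_{\gamma}$ is not contained in the closure of $\text{span}(e_{\alpha} : \alpha < \beta)$, let $P$ be the projection onto this subspace, and define
$$e_{\beta} = \frac{v_{\gamma} - P v_{\gamma}}{\|v_{\gamma} - P v_{\gamma}\|}.$$
By transfinite induction the $e_{\alpha}$ are orthonormal and $\{ e_{\alpha} : \alpha \le \beta \}$ is an orthonormal basis for the closure of $\text{span}(v_{\alpha} : \alpha \le \gamma)$ (where $\gamma$ is defined as above in relation to $\beta$).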
Corollary: Every Hilbert space has an orthonormal basis.
Unfortunately, this seems to require some form of choice. What we can prove in ZF is that every Hilbert space for which one can exhibit explicitly a dense well-ordered subset has an orthonormal basis.
Using orthonormal bases
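Consider the Hilbert space $L^2(S^1)$ where $S^1$ carries normalized Haar measure. Equivalently, consider $L^2([-\pi, \pi])$ with the inner product
$$\langle f, g \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} \overline{f(x)} g(x) \, dx.$$
The function $x$ separates points, so by Stone-Weierstrass the smallest algebra it contains which is closed under complex conjugation is dense in $C([-\pi, \pi])$ in the uniform topology, hence in $L^2([-\pi, \pi])$. Since $C([-\pi, \pi])$ is dense in $L^2([-\pi, \pi])$, it follows that in fact the algebra $\mathbb{C}[x]$ of complex polynomials is dense in $L^2([-\pi, \pi])$. Consequently, $L^2([-\pi, \pi])$ is separable and has an orthonormal basis. The Gram-Schmidt process can be used to construct such a basis starting from the vectors $1, x, x^2, \dots$; these are, up to some normalization, the Legendre polynomials.
Another orthonormal basis comes from the observation that $e^{ix}$ also separates points, so the span of the functions $e^{inx}$, $n \in \mathbb{Z}$ is also dense by Stone-Weierstrass. Happily, these functions are already orthonormal: we have
$$\langle e^{inx}, e^{imx} \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{i(m-n)x} \, dx = \delta_{nm}.$$
It follows that we may expand any function $f$ in $L^2(S^1)$ in a Fourier series
$$f(x) = \sum_{n \in \mathbb{Z}} \langle e^{inx}, f \rangle e^{inx}.$$
We caution that what we have proven so far is only enough to conclude that Fourier series converge in $L^2$, which says nothing about uniform or pointwise convergence; these are much more subtle matters. However, even just $L^2$ convergence is enough to prove some nontrivial results. For example, we compute using integration by parts that
$$\langle e^{inx}, x \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} x e^{-inx} \, dx = \frac{i (-1)^n}{n}$$
if $n \neq 0$ and $\langle 1, x \rangle = 0$ since $x$ is odd, hence
$$x = \sum_{n \neq 0} \frac{i (-1)^n}{n} e^{inx}$$
in $L^2([-\pi, \pi])$. Taking norms of both sides, we conclude
$$\frac{\pi^2}{3} = 2 \sum_{n=1}^{\infty} \frac{1}{n^2}, \quad \text{that is,} \quad \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}.$$
This is the answer to the famous Basel problem. Replacing $x$ with $x^k$ above gives us a method for evaluating $\zeta(2k)$ for all positive integers $k$.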
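The computation can be checked numerically. The sketch below assumes the normalization $\langle f, g \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} \overline{f} g \, dx$ on $[-\pi, \pi]$, approximates the Fourier coefficient of $f(x) = x$ with a midpoint rule, compares it with the closed form $i(-1)^n / n$, and then compares a partial sum of $\sum |c_n|^2$ with $\|x\|^2 = \pi^2 / 3$. The step counts are arbitrary.

```python
import math

def fourier_coeff_of_x(n, steps=20000):
    """Midpoint-rule estimate of (1/2pi) * integral of x e^{-inx} over [-pi, pi]."""
    h = 2 * math.pi / steps
    total = 0j
    for k in range(steps):
        x = -math.pi + (k + 0.5) * h
        total += x * complex(math.cos(n * x), -math.sin(n * x))
    return total * h / (2 * math.pi)

def closed_form(n):
    """Integration by parts gives i(-1)^n / n for n != 0."""
    return 1j * (-1) ** n / n

# Parseval: ||x||^2 = pi^2 / 3 should equal the sum of |c_n|^2 over n != 0.
# The terms for n and -n are equal, hence the factor of 2.
partial = sum(2 * abs(closed_form(n)) ** 2 for n in range(1, 100001))

print(abs(fourier_coeff_of_x(3) - closed_form(3)) < 1e-6)
print(abs(partial - math.pi ** 2 / 3) < 1e-4)
```

The partial sum converges slowly (the tail after $N$ terms is roughly $2/N$), which is one reason the closed form is worth having.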
The assignment $v \mapsto \langle v, \cdot \rangle$ defines an injection from any inner product space $V$ to its dual space $V^{\ast}$ (recall that this consists of bounded linear operators $V \to k$). Moreover, $\langle v, \cdot \rangle$ has norm $\|v\|$, so this injection is norm-preserving. However, it is conjugate-linear rather than linear. To fix this, we introduce for any complex vector space $V$ the conjugate $\overline{V}$ (not to be confused with its closure in some ambient space!), which is the same abelian group as $V$ but with scalar multiplication defined by the conjugate of scalar multiplication in $V$. (This only matters if we work over $\mathbb{C}$ rather than over $\mathbb{R}$.) Then the inner product on any inner product space $V$ defines a linear norm-preserving injection
$$\overline{V} \ni v \mapsto \langle v, \cdot \rangle \in V^{\ast}.$$
It is natural to ask when this map is an isomorphism (of normed vector spaces).
Riesz representation: Let $H$ be a Hilbert space. Then the map $\overline{H} \to H^{\ast}$ above is an isomorphism.
Proof. We know that it is linear, injective, and norm-preserving, so it suffices to prove that it is surjective. Let $f : H \to k$ be a continuous linear functional. The claim is trivial if $f$ is zero, so suppose $f$ is nonzero. $\ker(f)$ is closed, so $H$ admits a direct sum decomposition
$$H = \ker(f) \oplus \ker(f)^{\perp}.$$
Since $f$ is nonzero, $\ker(f)^{\perp}$ is nontrivial, and if $v, w \in \ker(f)^{\perp}$ then $f(w) v - f(v) w \in \ker(f) \cap \ker(f)^{\perp} = \{ 0 \}$, so it follows that $\ker(f)^{\perp}$ is one-dimensional. If $v \in \ker(f)^{\perp}$ is any nonzero vector, then $\langle v, \cdot \rangle$ is a continuous linear functional which is trivial on $\ker(f)$ and nontrivial on its orthogonal complement, so must be equal to $f$ up to a scalar.
The completeness of $H$ is essential. For example, let $V$ be the space of finitely supported sequences $(a_n)_{n \ge 1}$ with the inner product induced from $\ell^2(\mathbb{N})$. Then there is a continuous linear functional sending such a sequence to, say, $\sum_{n \ge 1} \frac{a_n}{n}$ which is not of the form $\langle v, \cdot \rangle$ for any $v \in V$.
Corollary: Hilbert spaces are reflexive.
The Riesz representation theorem allows us to define the following crucial operation.
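Theorem-Definition: let $T : H_1 \to H_2$ be a bounded linear operator between Hilbert spaces. There exists a unique map $T^{\dagger} : H_2 \to H_1$, the adjoint (or Hermitian adjoint) of $T$, which satisfies
$$\langle T^{\dagger} w, v \rangle_{H_1} = \langle w, Tv \rangle_{H_2} \text{ for all } v \in H_1, w \in H_2.$$
Proof. For fixed $w \in H_2$ the map $v \mapsto \langle w, Tv \rangle$ is a continuous linear functional on $H_1$, so by Riesz representation there exists a unique vector $T^{\dagger} w \in H_1$ such that $\langle T^{\dagger} w, v \rangle = \langle w, Tv \rangle$ for all $v \in H_1$. Moreover, by uniqueness
$$T^{\dagger}(a w_1 + b w_2) = a T^{\dagger} w_1 + b T^{\dagger} w_2$$
so the assignment $w \mapsto T^{\dagger} w$ is linear. Finally,
$$\|T^{\dagger} w\|^2 = \langle T^{\dagger} w, T^{\dagger} w \rangle = \langle w, T T^{\dagger} w \rangle \le \|w\| \|T\| \|T^{\dagger} w\|$$
so $T^{\dagger}$ is bounded (in fact has the same norm as $T$).
Remark. Let $H_1 = H_2 = H$ and let $\{ e_i \}$ be an orthonormal basis. Then $\langle e_i, T^{\dagger} e_j \rangle = \overline{\langle e_j, T e_i \rangle}$, which says precisely that the “matrix” of $T^{\dagger}$ with respect to the basis $e_i$ is the conjugate transpose of the “matrix” of $T$.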
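In finite dimensions the remark can be tested directly. The sketch below takes a made-up $2 \times 3$ complex matrix as a map $\mathbb{C}^3 \to \mathbb{C}^2$, forms its conjugate transpose, and checks the defining property of the adjoint on a pair of sample vectors. The function names are illustrative, and the inner product is assumed conjugate-linear in the first slot as in the rest of the post.

```python
def inner(v, w):
    """Standard inner product on C^n, conjugate-linear in the first slot."""
    return sum(a.conjugate() * b for a, b in zip(v, w))

def apply(T, v):
    """Matrix-vector product, with T given as a list of rows."""
    return [sum(T[i][j] * v[j] for j in range(len(v))) for i in range(len(T))]

def dagger(T):
    """Conjugate transpose of a matrix given as a list of rows."""
    return [[T[i][j].conjugate() for i in range(len(T))]
            for j in range(len(T[0]))]

T = [[1 + 1j, 2, 0], [0, -1j, 3]]    # a map C^3 -> C^2 (sample data)
v = [1, 1j, 2 - 1j]                  # a vector in C^3
w = [2j, 1 - 1j]                     # a vector in C^2

lhs = inner(apply(dagger(T), w), v)  # <T^dagger w, v>
rhs = inner(w, apply(T, v))          # <w, T v>
print(abs(lhs - rhs) < 1e-12)
```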
Remark. The adjoint $T^{\dagger}$ is closely related, but not identical, to the dual $T^{\ast}$. If $V, W$ are any two Banach spaces, then for any bounded linear operator $T : V \to W$ we may define its dual $T^{\ast} : W^{\ast} \to V^{\ast}$ on dual spaces, which is defined by precomposition. It is a corollary of the Hahn-Banach theorem that $\|T^{\ast}\| = \|T\|$, but the above argument does not need the Hahn-Banach theorem. If $H_1, H_2$ are Hilbert spaces, then $T^{\ast}$ is a map $H_2^{\ast} \to H_1^{\ast}$, or equivalently by Riesz representation a map $\overline{H_2} \to \overline{H_1}$, whereas the adjoint is a map $H_2 \to H_1$, so it is important not to confuse the two as mathematical objects; however, one is essentially the complex conjugate of the other.
The adjoint satisfies the following basic properties which follow straightforwardly from the definition. The second property shows that taking adjoints may be regarded as a generalization of complex conjugation for operators on Hilbert spaces.
- $(S + T)^{\dagger} = S^{\dagger} + T^{\dagger}$,
- $(cT)^{\dagger} = \overline{c} T^{\dagger}$ ($c$ a scalar),
- $(ST)^{\dagger} = T^{\dagger} S^{\dagger}$,
- $(T^{\dagger})^{\dagger} = T$.
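The adjoint allows us to define the following important classes of linear operators. A bounded linear operator $T : H \to H$ on a Hilbert space is
- self-adjoint if $T^{\dagger} = T$,
- skew-adjoint if $T^{\dagger} = -T$,
- unitary if $T^{\dagger} T = T T^{\dagger} = \text{id}_H$,
- normal if $T^{\dagger} T = T T^{\dagger}$.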
In quantum mechanics, self-adjoint operators play the role of real-valued observables. They should be thought of as the “real operators,” for example because their eigenvalues are necessarily real. Any operator $T$ can be written uniquely as the sum of a self-adjoint and skew-adjoint operator
$$T = \frac{T + T^{\dagger}}{2} + \frac{T - T^{\dagger}}{2}.$$
Since $T$ is self-adjoint if and only if $iT$ is skew-adjoint, one can think of the above as a decomposition of an operator into its real and imaginary parts $\text{Re}(T) = \frac{T + T^{\dagger}}{2}, \text{Im}(T) = \frac{T - T^{\dagger}}{2i}$, although this is not particularly useful unless the two commute (which is the case if and only if $T$ is normal). When that happens, if $v$ is an eigenvector of $T$ with eigenvalue $\lambda$, then $v$ is an eigenvector of $T^{\dagger}$ with eigenvalue $\overline{\lambda}$ (we will prove this below), hence $v$ is an eigenvector of $\text{Re}(T)$ with eigenvalue $\text{Re}(\lambda)$ and an eigenvector of $\text{Im}(T)$ with eigenvalue $\text{Im}(\lambda)$.
The unitary maps are precisely the invertible maps preserving the inner product. They form a group, the unitary group $U(H)$ of $H$. A homomorphism $G \to U(H)$ where $G$ is a group is a unitary representation of $G$, and these are a very natural object of study. (See for example the Peter-Weyl theorem.)
The skew-adjoint maps form a Lie algebra under commutator, the unitary Lie algebra $\mathfrak{u}(H)$. These are precisely the maps $T$ such that $t \mapsto e^{tT}$ is a continuous group homomorphism $\mathbb{R} \to U(H)$. The proof is straightforward but we will defer it to the next post when it can be done in slightly greater generality.
The spectral theorem in finite dimensions
As a simple but important illustration of thinking in terms of adjoints, we prove the following.
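Spectral theorem: Let $T$ be a self-adjoint operator on a finite-dimensional Hilbert space $H$. Then there exists an orthonormal basis of $H$ consisting of eigenvectors of $T$, and all eigenvalues of $T$ are real.
Proof. The first step is to prove that $T$ has an eigenvector. This is true for any linear transformation on a finite-dimensional complex vector space using, for example, standard facts about characteristic polynomials, but we will give an independent proof that more strongly suggests the correct generalization to the infinite-dimensional case.
Let $v$ be a vector of norm $1$ such that
$$\langle v, Tv \rangle$$
is maximized (this quantity is real because $T$ is self-adjoint). (Such a vector exists by compactness.) We claim that $v$ is an eigenvector of $T$. To see this, let $t \in \mathbb{R}$ and let $w$ be a unit vector orthogonal to $v$. Then $v_t = \frac{v + tw}{\sqrt{1 + t^2}}$ is a one-parameter family of unit vectors, and by assumption the function $t \mapsto \langle v_t, T v_t \rangle$ has a local maximum at $t = 0$. We compute that this is equal to
$$\frac{\langle v, Tv \rangle + 2t \, \text{Re} \langle w, Tv \rangle + t^2 \langle w, Tw \rangle}{1 + t^2}.$$
Its derivative at $t = 0$ is equal to
$$2 \, \text{Re} \langle w, Tv \rangle.$$
Since we may scale $w$ by unit complex numbers without loss of generality, it follows that $\langle w, Tv \rangle = 0$ for all $w$ orthogonal to $v$, hence $Tv = \lambda v$ for some $\lambda$. Since
$$\lambda = \langle v, Tv \rangle = \langle Tv, v \rangle = \overline{\lambda}$$
it follows that $\lambda$ is real. Finally, since
$$\langle v, Tw \rangle = \langle Tv, w \rangle = \lambda \langle v, w \rangle = 0 \text{ whenever } \langle v, w \rangle = 0$$
it follows that $v^{\perp}$ is an invariant subspace for $T$, so by induction we may complete $v$ to an orthonormal basis of eigenvectors of $T$ as desired.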
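The variational idea in the proof is easy to watch in the smallest interesting case. The sketch below takes a sample real symmetric $2 \times 2$ matrix (with eigenvalues $1$ and $3$), searches the unit circle on a grid for a maximizer of $\langle v, Av \rangle$, and then checks that the maximizer is (approximately) an eigenvector. The matrix and the grid size are assumptions made purely for illustration; a grid search is of course not how one would find eigenvectors in practice.

```python
import math

A = [[2.0, 1.0], [1.0, 2.0]]   # real symmetric sample matrix, eigenvalues 1 and 3

def apply(A, v):
    """2x2 matrix times a vector."""
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

def quad(v):
    """The quadratic form <v, Av> for real v."""
    Av = apply(A, v)
    return v[0] * Av[0] + v[1] * Av[1]

# Search unit vectors (cos t, sin t) on a grid for the maximum of <v, Av>.
steps = 100000
best_t = max((2 * math.pi * k / steps for k in range(steps)),
             key=lambda t: quad([math.cos(t), math.sin(t)]))
v = [math.cos(best_t), math.sin(best_t)]
lam = quad(v)                                  # candidate eigenvalue

# If v is an eigenvector then Av - lam * v should vanish.
residual = [a - lam * b for a, b in zip(apply(A, v), v)]
print(abs(lam - 3) < 1e-6, max(abs(x) for x in residual) < 1e-3)
```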
An equivalent statement is that a self-adjoint operator is diagonalizable by a unitary operator. Since commuting operators act on each other’s eigenspaces, this is also true for normal operators (although the eigenvalues need no longer be real in this case). More generally, we can say the following.
Corollary: Let $T_i$ be a commuting family of normal operators on a finite-dimensional Hilbert space $H$. Then there exists an orthonormal basis consisting of eigenvectors for all of the $T_i$.
In other words, the $T_i$ may be simultaneously diagonalized by a unitary operator.
A geometric interpretation of the spectral theorem is the following. Working over $\mathbb{R}$ for simplicity, $T$ is a self-adjoint operator if and only if the bilinear form $B(v, w) = \langle v, Tw \rangle$ is symmetric. Associated to such a bilinear form is the quadratic form $q(v) = B(v, v)$ from which it may be recovered. The spectral theorem shows that, letting $e_1, \dots, e_n$ be an orthonormal basis of eigenvectors and letting $\lambda_1, \dots, \lambda_n$ be the corresponding eigenvalues, we may write
$$q \left( \sum_i x_i e_i \right) = \sum_i \lambda_i x_i^2.$$
The “unit spheres” $\{ v : q(v) = 1 \}$ then describe shapes in $\mathbb{R}^n$ generalizing conic sections for $n = 2$ depending on how many of the $\lambda_i$ are positive, negative, or zero. For example, when $n = 3$ we may get ellipsoids or hyperboloids. $q$ is positive-definite if and only if all of the $\lambda_i$ are positive, in which case $q(v) = 1$ describes an ellipsoid. In this case the vectors $e_i$ can be interpreted as the “principal axes” of the ellipsoid, which generalize the semimajor and semiminor axis from the case $n = 2$, and the $\lambda_i$ are the squares of the reciprocals of the lengths of these axes.
The dagger category of Hilbert spaces
The category $\text{Hilb}$ of Hilbert spaces has as morphisms the bounded linear operators. Since two Hilbert spaces which are bi-Lipschitz equivalent have orthonormal bases of the same cardinality, they are actually isometrically (equivalently, unitarily) isomorphic, but not every bi-Lipschitz equivalence is an isometry. We still want to talk about unitary maps in this setting, so how should we do that?
The answer is to explicitly make the adjoint part of the structure of $\text{Hilb}$. We define a dagger category, or $\dagger$-category, to be a category $C$ equipped with a contravariant functor
$$\dagger : C \to C$$
which is the identity on objects and which satisfies $\dagger^2 = \text{id}_C$. More explicitly, for every pair of objects $a, b$ there is a map
$$\dagger : \text{Hom}(a, b) \to \text{Hom}(b, a)$$
such that $(f \circ g)^{\dagger} = g^{\dagger} \circ f^{\dagger}$ and $(f^{\dagger})^{\dagger} = f$. In any dagger category, an endomorphism $f$ is self-adjoint if $f^{\dagger} = f$ and an isomorphism $f$ is unitary if $f^{\dagger} = f^{-1}$. A functor $F$ between dagger categories is a dagger functor if $F(f^{\dagger}) = F(f)^{\dagger}$.
Example. Let $\text{Rel}$ denote the category of sets and relations. Recall that a relation $R : X \to Y$ between two sets is a subset of their Cartesian product $X \times Y$. We write $x R y$ to mean that $(x, y)$ is in this subset. Composition of relations is defined as follows: if $R : X \to Y$ and $S : Y \to Z$ are two relations, then $R \circ S : X \to Z$ is the relation defined by
$$x (R \circ S) z \Leftrightarrow \exists y \in Y : x R y \text{ and } y S z.$$
(Note that this disagrees with the usual convention for function composition, where a function $f : X \to Y$ is realized as the relation $\{ (x, f(x)) : x \in X \}$; what I call $R \circ S$ would be for functions called $S \circ R$.) For intuition, you should think of a relation between two sets as defining a partially defined and nondeterministic function between them (“nondeterministic” is another way to say “multivalued” but I think it gives a better intuition).
$\text{Rel}$ is a dagger category with the dagger defined by
$$y R^{\dagger} x \Leftrightarrow x R y.$$
A relation is self-adjoint if and only if it is symmetric, and every isomorphism is unitary (and is also a bijective function).
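Relations between finite sets are easy to play with directly. The following sketch models a relation as a Python set of pairs, composes in the diagrammatic order used above (first the left relation, then the right one), and checks that the dagger reverses composition and squares to the identity. The sample relations and the helper names `compose` and `dagger` are made up for illustration.

```python
def compose(R, S):
    """Diagrammatic composite: x relates to z iff x R y and y S z for some y."""
    return {(x, z) for (x, y1) in R for (y2, z) in S if y1 == y2}

def dagger(R):
    """Turn every pair around."""
    return {(y, x) for (x, y) in R}

R = {(1, "a"), (1, "b"), (2, "b")}          # a relation {1, 2} -> {a, b}
S = {("a", "p"), ("b", "q"), ("b", "p")}    # a relation {a, b} -> {p, q}

# The dagger is contravariant and an involution.
print(dagger(compose(R, S)) == compose(dagger(S), dagger(R)))
print(dagger(dagger(R)) == R)

# A symmetric relation on a single set is self-adjoint.
E = {(1, 2), (2, 1), (3, 3)}
print(dagger(E) == E)
```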
Example. Let $n$ be a positive integer. The category $n\text{Cob}$ of $n$-cobordisms is the category whose objects are $(n-1)$-dimensional compact manifolds and whose morphisms $M \to N$ are diffeomorphism classes of $n$-dimensional manifolds with boundary the disjoint union $M \sqcup N$. Composition in this category is defined by “sewing together” two manifolds at a common boundary component. (There are some subtleties here about maintaining a manifold structure when doing this that we will ignore completely.)
$n\text{Cob}$ is a dagger category with the dagger given by switching the role of $M$ and $N$; in other words, “turning cobordisms around.”
Heuristically speaking, the morphisms in $n\text{Cob}$ describe time evolution between $(n-1)$-dimensional “spaces,” with the cobordisms describing $n$-dimensional “spacetimes.” (To make the connection to general relativity closer we should require, say, a Lorentzian structure on the cobordisms such that the boundary is a spacelike slice.) $n\text{Cob}$ is of fundamental importance to the subject of topological quantum field theory, which is roughly speaking the study of certain kinds of functors $n\text{Cob} \to \text{Vect}$. A unitary TQFT is a certain kind of dagger functor $n\text{Cob} \to \text{Hilb}$, which can be thought of as a “functor from general relativity to quantum mechanics.” For an elaboration on this point of view, see Baez’s Physics, Topology, Logic, and Computation: a Rosetta Stone.
Example. Let $C$ be any category which admits finite pullbacks. The category $\text{Span}(C)$ of spans in $C$ is the category whose objects are those of $C$ and whose morphisms $a \to b$ are diagrams $a \leftarrow c \to b$ with composition defined by pullback. Given any span $a \leftarrow c \to b$ its dagger is simply obtained by switching $a$ and $b$.
Spans of sets generalize relations in that they allow “multiple arrows” between an element of $X$ and an element of $Y$. They also generalize cobordisms, since one can think of a cobordism as a cospan $M \to W \leftarrow N$ where the two arrows are the two inclusions of the boundary components into the cobordism. For more about spans, see this page by Baez, which contains slides for a talk as well as references. The tale of groupoidification is also relevant.
But let’s return to Hilbert spaces for the time being. Given that we can define unitary maps using only the adjoint, and unitary maps are the isomorphisms preserving the inner products, it seems that the adjoint already captures the inner product on a Hilbert space. This is in fact true.
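We first need some notation. In $\text{Hilb}$, there is a distinguished object $1$, the one-dimensional Hilbert space $\mathbb{C}$. This object represents the obvious forgetful functor to $\text{Set}$ in that $\text{Hom}(1, H)$ can be canonically identified with the vectors in $H$. Thus we may think of vectors in $H$ as morphisms $1 \to H$.
Proposition: Let $v, w : 1 \to H$ be vectors. Then $\langle v, w \rangle = v^{\dagger} w$.
Proof. By definition, $v^{\dagger} : H \to 1$ is the unique operator satisfying
$$\langle v^{\dagger} w, c \rangle_{\mathbb{C}} = \langle w, vc \rangle_H \text{ for all } c \in \mathbb{C}, w \in H.$$
Since $v^{\dagger} w$ is a morphism $1 \to 1$, it is just a scalar, so taking $c = 1$ gives $\overline{v^{\dagger} w} = \langle w, v \rangle$ and the conclusion follows.
In any dagger category $C$ with a distinguished object $1$ (usually the identity object of a monoidal operation on $C$ making it a dagger monoidal category) we may therefore define inner products $\langle v, w \rangle = v^{\dagger} w$ of morphisms $v, w : 1 \to a$ taking values in $\text{End}(1)$, and this inner product satisfies $\langle f^{\dagger} v, w \rangle = \langle v, f w \rangle$, so the dagger behaves the same way with respect to it as the adjoint does for Hilbert spaces. Moreover, among the isomorphisms in $C$ we can distinguish the unitary isomorphisms because they preserve inner products.
Example. In $\text{Rel}$, a morphism $1 \to X$ (where $1$ is a one-element set) is a subset of $X$, so the functor $\text{Hom}(1, -)$ sends a set to its collection of subsets and sends a relation $R : X \to Y$ to the function
$$2^X \ni S \mapsto \{ y \in Y : x R y \text{ for some } x \in S \} \in 2^Y.$$
(These functions are precisely the functions which preserve arbitrary unions.) If $S, T \subseteq X$ are two subsets, then $\langle S, T \rangle$ is one of the two possible subsets of $1$, the empty set and the entire set; it is empty if $S, T$ are disjoint and the entire set otherwise. Then the relation $\langle R^{\dagger} v, w \rangle = \langle v, R w \rangle$ when restricted to one-element subsets $v = \{ y \}, w = \{ x \}$ says precisely that $y R^{\dagger} x \Leftrightarrow x R y$.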
Quantum weirdness is not so weird
It turns out that some important quantum phenomena, such as quantum teleportation, can be described in an abstract framework based on dagger categories. More precisely, we need dagger compact categories, which are dagger categories equipped with extra structure generalizing the tensor product and dual of Hilbert spaces. The nLab page on this subject has a nice list of references.
This suggests that part of the difference between classical and quantum mechanics boils down to the difference between dagger compact categories and a category like $\text{Set}$. A basic such difference is that in a dagger category, the two representable functors $\text{Hom}(a, -)$ and $\text{Hom}(-, a)$ are canonically (contravariantly) isomorphic, the isomorphism provided by the dagger operation. (A unitary isomorphism is then precisely an isomorphism which preserves both representable functors and which also preserves this identification between them.) This is very far from the case in a more classical category like $\text{Set}$.
Replacing $\text{Set}$ with $\text{Rel}$ already helps a great deal. Since relations behave like nondeterministic functions, they are morally much more closely related to linear operators between vector spaces than to (deterministic) functions between sets. In some sense they already are linear operators: it is possible to think of relations as being matrices over the truth semiring $\{ 0, 1 \}$ with addition defined by union (logical or) and multiplication defined by intersection (logical and). For the special case of relations between finite sets, this is abstractly because $\text{Rel}$ admits finite biproducts (given by the disjoint union) and every finite set is a biproduct of copies of $1$.
$\text{Rel}$ admits a monoidal operation given on sets by the Cartesian product. The fact that this is not the categorical product is reflected in the fact that “entangled states” exist: namely there are subsets of a Cartesian product $X \times Y$ which cannot be obtained by taking the product of a subset of $X$ with a subset of $Y$. $\text{Rel}$ further admits an internal hom $X \Rightarrow Y$ which is also given on sets by the Cartesian product (but it is contravariant in the first variable; remember that the underlying set here is $\text{Hom}(1, X \Rightarrow Y)$, so we get the set of subsets of the Cartesian product as we should), and the tensor-hom adjunction
$$\text{Hom}(X \times Y, Z) \cong \text{Hom}(X, Y \Rightarrow Z)$$
holds, making $\text{Rel}$ a closed monoidal category and in fact a dagger compact category.
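Two of the claims above can be illustrated with a few lines of Python. The sketch encodes relations between small finite sets as boolean matrices, checks that composing relations (in the order "first the left one, then the right one") matches matrix multiplication over the truth semiring, and exhibits an "entangled" subset of a Cartesian product. All names and sample data here are hypothetical choices for illustration.

```python
def compose(R, S):
    """x relates to z iff x R y and y S z for some y."""
    return {(x, z) for (x, y1) in R for (y2, z) in S if y1 == y2}

def to_matrix(R, X, Y):
    """Boolean matrix of R with rows indexed by X and columns by Y."""
    return [[(x, y) in R for y in Y] for x in X]

def bool_matmul(A, B):
    """Matrix product over the truth semiring: OR of ANDs."""
    return [[any(A[i][k] and B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_product(subset):
    """Is a subset of X x Y of the form S x T (an 'unentangled' state)?"""
    S = {x for x, _ in subset}
    T = {y for _, y in subset}
    return subset == {(x, y) for x in S for y in T}

X, Y, Z = [1, 2], ["a", "b"], ["p", "q"]
R = {(1, "a"), (2, "b")}
S = {("a", "p"), ("b", "p"), ("b", "q")}

lhs = to_matrix(compose(R, S), X, Z)
rhs = bool_matmul(to_matrix(R, X, Y), to_matrix(S, Y, Z))
print(lhs == rhs)

print(is_product({(0, 0), (0, 1)}))   # a product {0} x {0, 1}
print(is_product({(0, 0), (1, 1)}))   # the "diagonal" is not a product
```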
There is a lot more to say here, but it will have to wait for later posts.