One annoying feature of the abstract theory of vector spaces, and one that often trips up beginners, is that it is not possible to make sense of an infinite sum of vectors in general. If we want to make sense of infinite sums, we should probably define them as limits of finite sums, so rather than work with bare vector spaces we need to work with topological vector spaces over a topological field, usually $\mathbb{R}$ or $\mathbb{C}$ (but sometimes fields like $\mathbb{Q}_p$ are also considered, e.g. in number theory). Common and important examples include spaces of continuous or differentiable functions.
Today we’ll discuss a class of topological vector spaces which is convenient to work with but which still covers many examples of interest, namely Banach spaces. The material in the first half of this post is completely standard and can be found in any text on functional analysis.
In the second half of the post we discuss a category of Banach spaces such that two Banach spaces are isomorphic in this category if and only if they are isometrically isomorphic but which still allows us to talk about bounded linear operators between Banach spaces, and to do this we briefly discuss Lawvere metrics; this material can be found on the nLab.
Definition and examples
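Let $V$ be a vector space over $\mathbb{R}$ or $\mathbb{C}$. A seminorm on $V$ is a function $\| \cdot \| : V \to \mathbb{R}_{\ge 0}$ satisfying
- $\| av \| = |a| \| v \|$ for all scalars $a$ and all $v \in V$, and
- $\| v + w \| \le \| v \| + \| w \|$ for all $v, w \in V$ (the triangle inequality).
A norm is a seminorm such that $\| v \| = 0$ implies $v = 0$. Any norm defines a metric $d(v, w) = \| v - w \|$ on $V$, and a vector space equipped with a norm is a normed (vector) space. It is a Banach space if in addition it is complete with respect to this metric.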
Example. The completion of any normed space is naturally a Banach space.
Example. Let $p \ge 1$ be real and let $S$ be a set. The $\ell^p$-space $\ell^p(S)$ is the normed space of sequences $x : S \to \mathbb{R}$ (or $\mathbb{C}$) such that
$$\| x \|_p = \left( \sum_{s \in S} |x_s|^p \right)^{1/p}$$
converges, with the above as a norm (the triangle inequality for this norm is Minkowski's inequality, which is a consequence of Hölder’s inequality). These spaces are always complete. (The proof is an unnecessary digression at this point, but can be found in most texts on measure theory.)
When $p = 2$ and $S$ is a finite set of size $n$ we recover the Euclidean space $\mathbb{R}^n$ with the usual Euclidean distance.
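Example. The $\ell^{\infty}$-space $\ell^{\infty}(S)$ is the space of bounded sequences $x : S \to \mathbb{R}$ equipped with the $\ell^{\infty}$, supremum, or uniform norm
$$\| x \|_{\infty} = \sup_{s \in S} |x_s|.$$
$\ell^{\infty}$ gets its name from the fact that $\| x \|_{\infty} = \lim_{p \to \infty} \| x \|_p$ for $x$ such that both sides are well-defined, and it is also complete (although the proof is more straightforward in this case).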
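As a quick numerical sanity check of the claim that the $\ell^p$ norms tend to the supremum norm as $p \to \infty$, here is a minimal Python sketch on a finite sequence. The helper names `lp_norm` and `sup_norm` and the sample vector are illustrative choices of mine, not part of any library.

```python
def lp_norm(x, p):
    """l^p norm (sum |x_s|^p)^(1/p) of a finite sequence, for real p >= 1."""
    m = max(abs(t) for t in x)
    if m == 0:
        return 0.0
    # Factor out the largest entry so that large p does not overflow.
    return m * sum((abs(t) / m) ** p for t in x) ** (1.0 / p)


def sup_norm(x):
    """Supremum norm of a finite sequence."""
    return max(abs(t) for t in x)


x = [3.0, -4.0, 1.0, 0.5]  # illustrative sample vector
exponents = [1, 2, 4, 16, 64, 256]
norms = [lp_norm(x, p) for p in exponents]

for p, value in zip(exponents, norms):
    print(p, value)
print("sup", sup_norm(x))
```

The printed values decrease toward the largest absolute entry, which is the supremum norm of the sample vector.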
Example. More generally, let $(X, \mu)$ be a measure space with measure $\mu$ and let $p \ge 1$ be real. The $\mathcal{L}^p$-space $\mathcal{L}^p(X)$ is the space of all measurable functions $f : X \to \mathbb{R}$ such that
$$\| f \|_p = \left( \int_X |f|^p \, d\mu \right)^{1/p}$$
exists. The above function is a seminorm (the triangle inequality is again a consequence of Hölder’s inequality), but usually not a norm since, for example, $f$ may have support a set of measure zero. To get a norm, we quotient by the subspace of $f$ with $\| f \|_p = 0$ to get the $L^p$-space $L^p(X)$, which is genuinely a normed space. These spaces are also always complete. (Again, an unnecessary digression.)
This construction generalizes the construction of the spaces $\ell^p(S)$: take $X = S$ with every subset of $X$ measurable and with the measure assigning measure $1$ to all singletons in $X$ (hence assigning measure $\infty$ to all infinite sets in $X$).
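Example. The $\mathcal{L}^{\infty}$-space $\mathcal{L}^{\infty}(X)$ is the space of measurable functions $f : X \to \mathbb{R}$ such that the $L^{\infty}$ or essential supremum norm
$$\| f \|_{\infty} = \inf \{ c \ge 0 : |f(x)| \le c \text{ for almost every } x \}$$
exists. In more words, the essential supremum is the “almost everywhere supremum”: $|f|$ may be greater than it, but only on a set of measure zero, and $\| f \|_{\infty}$ is the infimum of all numbers with this property. Again after quotienting by the elements of norm zero we obtain a normed space $L^{\infty}(X)$ which is again complete. Again we have $\| f \|_{\infty} = \lim_{p \to \infty} \| f \|_p$ for suitable $f$.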
(The rest of this post requires no measure theory. It’s a curious feature of functional analysis that measure theory is not needed to understand the proofs of the main theorems but it is used to construct some of the most important examples!)
Example. Let $X$ be a topological space and $C_b(X)$ be the space of bounded continuous functions $f : X \to \mathbb{R}$ equipped with the supremum norm
$$\| f \| = \sup_{x \in X} |f(x)|.$$
Since the uniform limit of continuous functions is continuous, $C_b(X)$ is complete with respect to this norm.
Example. Any closed subspace of a Banach space is a Banach space with the induced norm.
Example. Let $V$ be a Banach space and $W$ a subspace of $V$. (When) can we equip the quotient $V/W$ with the structure of a Banach space? The first question is what the norm of a coset $v + W$ ought to be. The norm of the zero coset must be zero. More generally, if we want the quotient map $V \to V/W$ to be continuous, the norm of a coset $v + W$ where $v$ is small must also be small. A natural idea is to take the distance from $v + W$ (thought of as a hyperplane) to the origin, or
$$\| v + W \| = \inf_{w \in W} \| v + w \|.$$
This always defines a seminorm. $\| v + W \| = 0$ if and only if $v$ lies in the closure of $W$, so in order to get a norm it is necessary and sufficient that $W$ is closed.
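To make the quotient seminorm concrete, here is a minimal Python sketch that approximates the distance from a coset to the origin, $\inf_{w \in W} \| v + w \|$, when $W$ is a line in $\mathbb{R}^2$ with the supremum norm. The function name `quotient_norm`, the choice of norm, and the ternary search are illustrative assumptions on my part; any convex minimization would do.

```python
def sup_norm(v):
    return max(abs(t) for t in v)


def quotient_norm(v, w, norm=sup_norm, iterations=200):
    """Approximate inf over real t of norm(v + t*w), for W the line spanned by w != 0.

    The function t -> norm(v + t*w) is convex, so ternary search finds its minimum.
    """
    # Any minimizer satisfies |t| * norm(w) <= 2 * norm(v), which bounds the search.
    radius = 2.0 * norm(v) / norm(w) + 1.0
    lo, hi = -radius, radius
    for _ in range(iterations):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        fa = norm([x + a * y for x, y in zip(v, w)])
        fb = norm([x + b * y for x, y in zip(v, w)])
        if fa <= fb:
            hi = b
        else:
            lo = a
    t = (lo + hi) / 2.0
    return norm([x + t * y for x, y in zip(v, w)])


w = [1.0, 1.0]                          # W is the diagonal line
print(quotient_norm([3.0, 1.0], w))     # a coset at positive distance from 0
print(quotient_norm([8.0, 6.0], w))     # the same coset, different representative
print(quotient_norm([2.0, 2.0], w))     # a vector in W: the zero coset
```

The first two calls agree because the two vectors differ by an element of $W$, and the third is (numerically) zero because the vector lies in $W$.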
The infinite sums $\sum_i v_i$ that make sense in a Banach space include those which converge absolutely in the sense that $\sum_i \| v_i \|$ converges. The partial sums of any such series are Cauchy, so the series converges by completeness. As in real analysis, the order of summation in an absolutely convergent series is irrelevant (sketch: given any permutation of an absolutely convergent series, for any $n$ we may continue the summation until the first $n$ terms in the original order are included in the summation).
If $\sum_i v_i$ only converges in the sense that its partial sums approach a limit, then it converges conditionally, and just as in real analysis sums which converge conditionally but not absolutely cannot be permuted in general (since we can take the $v_i$ to lie in the same one-dimensional subspace and use the same counterexamples as in real analysis).
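The standard real-analysis counterexample can be checked numerically. The sketch below sums the alternating harmonic series in its usual order and in the rearrangement "two positive terms, then one negative term"; the first tends to $\log 2$ and the second to $\frac{3}{2} \log 2$. The function names and the numbers of terms are illustrative choices.

```python
import math


def alternating_harmonic(n_terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... in the usual order."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))


def rearranged(n_blocks):
    """Partial sum of the same terms taken two positive, then one negative:
    1 + 1/3 - 1/2 + 1/5 + 1/7 - 1/4 + ...
    """
    total = 0.0
    for j in range(1, n_blocks + 1):
        total += 1.0 / (4 * j - 3) + 1.0 / (4 * j - 1) - 1.0 / (2 * j)
    return total


usual = alternating_harmonic(200000)
permuted = rearranged(100000)
print(usual, math.log(2))
print(permuted, 1.5 * math.log(2))
```

Both partial sums use exactly the same terms eventually, yet they approach different limits, which is only possible because the series does not converge absolutely.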
It is possible to make sense of the limit of a sequence of more than countably many terms using ordinals, and therefore to make sense of conditional convergence for an infinite series with more than countably many terms, but we will not do so for the following reason.
Proposition: Let $S$ be an uncountable set of positive real numbers. For any positive real $r$, there exists a finite subset of $S$ whose sum is greater than $r$.
(Consequently, no infinite series with more than countably many nonzero terms can converge absolutely. Also, the sequences appearing in the function spaces $\ell^p(S)$ for uncountable $S$ and finite $p$ necessarily have countable support.)
Proof. The sets $S_n = \{ s \in S : s > \frac{1}{n} \}$ for positive integers $n$ exhibit $S$ as a countable union of sets. Since $S$ is uncountable, at least one of these sets is uncountable, in particular infinite. Any $m$ elements of such an $S_n$ sum to more than $\frac{m}{n}$, which exceeds $r$ once $m > nr$, and the conclusion follows.
Some basic propositions
Let $V, W$ be normed spaces and $T : V \to W$ a linear transformation between them. In functional analysis $V, W$ are typically spaces of functions and conventionally $T$ is referred to as a (linear) operator in this context. The operator norm of $T$, when it exists, is
$$\| T \| = \sup_{v \neq 0} \frac{\| Tv \|}{\| v \|}.$$
If $\| T \|$ exists, we say that $T$ is bounded. Note that if $\| T \|$ exists, then $\| Tv \| \le \| T \| \| v \|$ for any $v \in V$; moreover, $\| T \|$ is the infimum of all numbers with this property.
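In finite dimensions the operator norm can sometimes be computed exactly. The sketch below uses the standard fact that, when both $\mathbb{R}^n$ and $\mathbb{R}^m$ carry the supremum norm, the operator norm of a matrix is its largest absolute row sum, and checks the inequality $\| Tv \| \le \| T \| \| v \|$ on random vectors. The matrix and the helper names are illustrative assumptions.

```python
import random


def sup_norm(v):
    return max(abs(t) for t in v)


def apply(T, v):
    """Apply the matrix T (a list of rows) to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in T]


def op_norm_sup(T):
    """Operator norm of T between sup-normed spaces: the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in T)


T = [[1.0, -2.0, 0.5],
     [0.0, 3.0, -1.0]]  # illustrative matrix

random.seed(0)
samples = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(1000)]
largest_ratio = max(sup_norm(apply(T, v)) / sup_norm(v) for v in samples)

# The signs of the heaviest row give a unit vector at which the supremum is attained.
heaviest = max(T, key=lambda row: sum(abs(a) for a in row))
extremal = [1.0 if a >= 0 else -1.0 for a in heaviest]

print(op_norm_sup(T), largest_ratio, sup_norm(apply(T, extremal)))
```

Random sampling only ever gives a lower bound for the supremum; the extremal sign vector shows that the bound given by the row sums is actually attained.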
Proposition: $T$ is bounded if and only if it is continuous.
Proof. If $T$ is bounded then it is Lipschitz, so continuous. In the other direction, if $T$ is continuous then the preimage of the open ball of radius $1$ in $W$ is open and contains the origin, hence contains an open ball of some radius $r > 0$ about the origin. Scaling, it follows that the image of the open ball of radius $1$ under $T$ is contained in the open ball of radius $\frac{1}{r}$, and the conclusion follows.
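Corollary: Two norms $\| \cdot \|, \| \cdot \|'$ induce the same topology on a vector space $V$ if and only if there exist constants $c, C > 0$ such that
$$c \| v \| \le \| v \|' \le C \| v \| \text{ for all } v \in V.$$
We say that $\| \cdot \|, \| \cdot \|'$ are (bi-Lipschitz) equivalent.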
Given two normed spaces $V, W$, the operator norm endows the space $[V, W]$ of bounded linear operators $V \to W$ with a norm, hence also making it a normed space.
Proposition: If $W$ is a Banach space, then so is $[V, W]$.
Proof. Let $T_n \in [V, W]$ be a Cauchy sequence. Then $T_n v$ is Cauchy for any $v \in V$, so by the completeness of $W$ we may define
$$Tv = \lim_{n \to \infty} T_n v.$$
It remains to verify that $T$ is a linear operator and that it is the limit of the sequence $T_n$. The former is straightforward. The latter is also straightforward but it is worth noticing that it remains to be proven. Fix $\epsilon > 0$ and let $N$ be such that $\| T_n - T_m \| < \epsilon$ whenever $n, m \ge N$. Then
$$\| T_n v - T_m v \| < \epsilon$$
for all $v$ such that $\| v \| \le 1$. Taking the pointwise limit as $m \to \infty$, it follows that $\| T_n v - Tv \| \le \epsilon$ for all $v$ with $\| v \| \le 1$ and for all $n \ge N$ (with $N$ not depending on $v$; this is the reason why we weren’t already done), and the conclusion follows.
The dual space $V^{\ast}$ of a normed space $V$ is the space $[V, \mathbb{R}]$ of all bounded linear functionals $V \to \mathbb{R}$.
Corollary: $V^{\ast}$ is a Banach space.
A normed space $V$ is reflexive if the natural map $V \to V^{\ast\ast}$ is a homeomorphism.
Corollary: Any reflexive normed space is a Banach space.
Note that unlike the finite-dimensional case, it is not obvious that the natural map $V \to V^{\ast\ast}$ is even injective in general. The problem is that $V^{\ast}$ is not obviously nontrivial. In fact, it is consistent with ZF that there exist vector spaces $V$ such that $V^{\ast}$ is trivial!
For Banach spaces the nontriviality of $V^{\ast}$ is only settled by the Hahn-Banach theorem, which is independent of ZF but follows from the ultrafilter lemma. In fact, Hahn-Banach shows that the natural map $V \to V^{\ast\ast}$ is injective and norm-preserving, but for many Banach spaces it is still not an isomorphism (Wikipedia gives examples).
Finite-dimensional and infinite-dimensional normed spaces
The real meat of the theory of Banach spaces lies in infinite dimensions so it is good to get an understanding of the differences between the finite- and infinite-dimensional cases.
Proposition: Let $V$ be a finite-dimensional real vector space. There exists a unique Hausdorff topology on $V$ relative to which addition and scalar multiplication are continuous.
(The Hausdorff condition is necessary to prevent the trivial counterexample of the indiscrete topology.)
Proof. Assume that $V$ is equipped with such a topology and let $e_1, \dots, e_n$ be a basis for $V$. By assumption, the map
$$\phi : \mathbb{R}^n \ni (a_1, \dots, a_n) \mapsto \sum_{i=1}^n a_i e_i \in V$$
is a continuous bijection. Thinking of $\mathbb{R}^n$ as being equipped with the Euclidean norm, $\phi$ restricts to a continuous bijection from the closed ball of radius $r$ in $\mathbb{R}^n$ to the corresponding subset of $V$, which is compact since closed balls in $\mathbb{R}^n$ are compact and Hausdorff by assumption. But any continuous bijection between compact Hausdorff spaces is a homeomorphism. Taking $r \to \infty$ it follows that $\phi$ is a homeomorphism from $\mathbb{R}^n$ with the product topology to $V$.
Note that the above argument implies that $V$ is complete with respect to any norm, hence all finite-dimensional normed spaces are Banach spaces.
Corollary: Any two norms on a finite-dimensional real vector space are bi-Lipschitz equivalent.
This is quite false in infinite dimensions.
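Example. Consider the $\ell^p$ norms, $1 \le p \le \infty$, on the space $c_c(\mathbb{N})$ of functions $\mathbb{N} \to \mathbb{R}$ of compact support (that is, sequences with finitely many nonzero values). For a real parameter $r > 0$ consider the sequence of functions
$$f_n(k) = \begin{cases} k^{-r} & \text{if } 1 \le k \le n \\ 0 & \text{otherwise.} \end{cases}$$
This sequence is Cauchy with respect to the $\ell^p$ norm for finite $p$ if and only if $\sum_k k^{-rp}$ converges, hence if and only if $rp > 1$, and it is always Cauchy with respect to the $\ell^{\infty}$ norm. Choosing suitable values of $r$ shows that no two of the $\ell^p$ norms are bi-Lipschitz equivalent (since bi-Lipschitz equivalence preserves Cauchy sequences).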
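Here is a minimal numerical sketch of this example, assuming the sequence in question is the truncation $f_n(k) = k^{-r}$ for $k \le n$ and $0$ otherwise. With $r = 1$ it compares the distance between $f_n$ and $f_{2n}$ in the $\ell^1$, $\ell^2$ and $\ell^{\infty}$ norms; the function name `lp_distance` is my own.

```python
def lp_distance(m, n, r, p):
    """l^p distance between f_m and f_n (m < n), where f_n(k) = k**(-r) for k <= n, else 0."""
    if p == float("inf"):
        return max(k ** (-r) for k in range(m + 1, n + 1))
    return sum(k ** (-r * p) for k in range(m + 1, n + 1)) ** (1.0 / p)


r = 1.0
for n in (10, 100, 1000, 10000):
    print(n,
          lp_distance(n, 2 * n, r, 1),
          lp_distance(n, 2 * n, r, 2),
          lp_distance(n, 2 * n, r, float("inf")))
```

The $\ell^1$ column stays near $\log 2$ instead of shrinking, so the sequence is not Cauchy in $\ell^1$ (here $rp = 1$), while the $\ell^2$ and $\ell^{\infty}$ columns tend to zero.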
A nice feature of the finite-dimensional case is that the unit ball of a finite-dimensional normed space is always compact. This is also quite false in infinite dimensions.
Proposition: Let $V$ be an infinite-dimensional normed space. Then the unit ball of $V$ is not compact.
Proof idea. A compact subset of a metric space admits a finite cover by open balls of any given radius. It takes at least $2^n$ such balls of radius $\frac{1}{2}$ to cover the unit ball in a normed space of dimension $n$ by considering volumes, so “as $n \to \infty$” it ought to take infinitely many such balls to cover the unit ball in an infinite-dimensional normed space.
(The proof idea, unfortunately, leads to the realization that there is no obvious generalization of Lebesgue measure to infinite-dimensional vector spaces. The problem is that we would like this measure to satisfy $\mu(cE) = c^n \mu(E)$ where $E$ is some measurable subset, $c > 0$, and $n$ is the dimension, but when $n = \infty$ we can’t satisfy this requirement in a reasonable way.)
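Proof. Suppose by contradiction that the unit ball of $V$ admits a covering by $n$ open balls of radius $\frac{1}{2}$ centered at points $v_1, \dots, v_n$. Since $V$ is infinite-dimensional, by adding vectors not in the span of the $v_i$ of norm less than $\epsilon$ if necessary to the $v_i$, we can construct a covering of the unit ball by $n$ open balls of radius $\frac{1}{2} + \epsilon$ centered at points $w_1, \dots, w_n$ which are linearly independent.
These points span a subspace $W$ of dimension $n$ in which the unit ball is also covered by $n$ open balls of radius $\frac{1}{2} + \epsilon$ centered at the $w_i$. However, $W$ is finite-dimensional, so the proof idea applies: the unit ball has some volume $\mu > 0$ (relative to Lebesgue measure on $W$ after choosing an identification $W \cong \mathbb{R}^n$) but the union of the balls centered at the $w_i$ has volume at most $n \left( \frac{1}{2} + \epsilon \right)^n \mu$, which is less than $\mu$ for $\epsilon$ small enough since $n < 2^n$; contradiction.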
The non-compactness of the unit ball in infinite dimensions is an inconvenience, as it makes certain naive arguments fail, but it can be dealt with.
“The” category of Banach spaces I
What should a morphism between Banach spaces be? If you’re used to thinking of morphisms of metric spaces as being continuous functions, then the obvious answer is a bounded linear operator. We’ll denote this category by $\text{Ban}$. Two Banach spaces are isomorphic in this category if and only if they are bi-Lipschitz equivalent.
A less obvious choice of morphisms between metric spaces is to take morphisms to be distance-decreasing maps or weak contractions: maps $f : X \to Y$ satisfying
$$d_Y(f(x_1), f(x_2)) \le d_X(x_1, x_2).$$
In other words, we allow only maps which are Lipschitz with Lipschitz constant at most $1$. Two metric spaces are isomorphic in this category if and only if they are isometrically isomorphic, so among other things this definition gives a notion of isomorphism of metric spaces which actually preserves all metric structure rather than just the induced topology.
When applied to Banach spaces, we get the category $\text{Ban}_1$ whose morphisms are operators of norm at most $1$ (short maps).
Where does this definition come from?
A lengthy digression into Lawvere metric spaces
The story goes that Lawvere noticed that the triangle inequality
$$d(a, b) + d(b, c) \ge d(a, c)$$
bears a certain resemblance to composition in a category, which is a map
$$\text{Hom}(a, b) \times \text{Hom}(b, c) \to \text{Hom}(a, c).$$
This observation eventually led Lawvere to the realization that metric spaces are, in fact, enriched categories. Recall that a $\mathcal{V}$-enriched category, for $\mathcal{V}$ a monoidal category, is a collection of objects together with an object $\text{Hom}(a, b) \in \mathcal{V}$ for every pair of objects $a, b$, a distinguished arrow $1 \to \text{Hom}(a, a)$ (where $1$ denotes the identity object in $\mathcal{V}$), and composition maps
$$\text{Hom}(a, b) \otimes \text{Hom}(b, c) \to \text{Hom}(a, c)$$
satisfying the usual identity and associativity requirements. A Lawvere metric space is a $\mathcal{V}$-enriched category where $\mathcal{V} = [0, \infty]$ is the non-negative extended reals regarded as a poset (hence there is a morphism $x \to y$ if and only if $x \ge y$) together with the monoidal operation $+$. In other words, it is a collection of objects together with a non-negative extended real $d(a, b)$ assigned to each pair of objects $a, b$ such that there is a unique arrow $0 \to d(a, a)$ (hence $d(a, a) = 0$) and such that composition maps
$$d(a, b) + d(b, c) \to d(a, c)$$
exist, and this is precisely the statement of the triangle inequality. (The identity and associativity requirements are automatic here.)
The correct notion of functor between enriched categories is that of an enriched functor, which is a map $F$ on objects together with a collection of induced maps
$$\text{Hom}(a, b) \to \text{Hom}(F(a), F(b))$$
compatible with composition. When applied to Lawvere metric spaces, this is precisely the requirement that $d(a, b) \ge d(F(a), F(b))$, that is, that $F$ is a weak contraction. (Again the compatibility with composition is automatic here.)
Lawvere metric spaces generalize ordinary metric spaces in three ways (besides the fact that their collection of objects need not be a set):
- a Lawvere metric need not take only finite values,
- a Lawvere metric need not be symmetric, and
- a Lawvere metric need not have the property that if $d(a, b) = 0$ then $a = b$.
The first generalization is actually quite natural geometrically: intuitively an infinite distance between two points means it is not possible to reach one from the other. Infinite distances also allow us to construct coproducts of Lawvere metric spaces by defining the distance between points in different summands to be infinite. The third generalization is just saying that we allow objects to be isomorphic without being identical, which is analogous to the generalization from posets to preorders and which we were already doing when we considered seminorms that were not norms.
The second generalization is probably the hardest to come to terms with geometrically. What does it mean for the distance between $a$ and $b$ to be different from the distance between $b$ and $a$? My basic intuition here comes from traveling on a hilly surface under the influence of gravity: if $a$ is on top of a hill and $b$ is at the bottom, traveling from $a$ to $b$ will be easier than traveling from $b$ to $a$ if one factors in gravity even though the ordinary spatial distance between the two is the same in both directions. This leads to the following family of Lawvere metric spaces which are not symmetric.
Example. Let $(M, d)$ be a metric space in the ordinary sense and let $h : M \to \mathbb{R}$ be a non-constant function which we think of as a potential (e.g. gravitational potential). Define a Lawvere metric by
$$d_h(a, b) = d(a, b) + \max(h(b) - h(a), 0).$$
This metric reduces to the ordinary metric whenever the potential at $b$ is less than or equal to the potential at $a$ but is greater if going from $a$ to $b$ involves going “uphill” (in the direction of increasing potential).
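The following Python sketch checks such an asymmetric metric by brute force on a small example. It assumes the formula "ordinary distance plus the potential gained when going uphill", which matches the description above; the points, the potential `h` and the function names are illustrative choices.

```python
def lawvere_distance(a, b, d, h):
    """Ordinary distance plus the potential gained when going uphill from a to b."""
    return d(a, b) + max(h(b) - h(a), 0)


def d(a, b):
    return abs(a - b)  # ordinary metric on the integers


def h(x):
    return x * x  # illustrative potential: a valley with its bottom at 0


def dist(a, b):
    return lawvere_distance(a, b, d, h)


points = range(-3, 4)

triangle_ok = all(dist(a, c) <= dist(a, b) + dist(b, c)
                  for a in points for b in points for c in points)
# Forgetting the potential never increases distances, so the identity map
# from the Lawvere metric to the ordinary metric is a weak contraction.
forgetting_is_short = all(d(a, b) <= dist(a, b) for a in points for b in points)

print(triangle_ok, forgetting_is_short)
print(dist(0, 2), dist(2, 0))  # uphill versus downhill
```

Climbing out of the valley costs more than rolling back into it, while the triangle inequality and $d(a, a) = 0$ still hold.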
More generally, I think a reasonable intuition for Lawvere metrics is to think of the set of points of a Lawvere metric space as a set of states of some physical system and the metric as a measure of the minimal “cost” or “energy” necessary to transition from one state to another. (This is closely related to the intuition for ordinary categories where one thinks of a Hom-set as describing the set of ways one can transition from one state to another.) The triangle inequality is the fundamental observation that it can’t cost more to transition from $a$ to $c$ than it does to transition from $a$ to $b$, then from $b$ to $c$ (since the latter is a valid way to transition from $a$ to $c$), but beyond that, in general
- there’s no reason it should be possible to transition from $a$ to $b$ at all,
- there’s no reason the cost to transition from $a$ to $b$ should be the same as the cost to transition from $b$ to $a$, and
- there’s no reason it should cost a positive amount to transition between different states.
What about weak contractions? Well, keeping our physicist hats on, if we think of the points in a Lawvere metric space as being specified by the values of certain observable features of the state of a physical system and the “cost” or “energy” as a certain function of these observables, then it is natural to think about what happens if we “forget” some of these observables. This identifies some of the states and also, intuitively speaking, removes part of the “cost” function, which is more or less what a surjective weak contraction ought to do (and any weak contraction is surjective onto its image).
“The” category of Banach spaces II
One advantage of working in the category $\text{Ban}_1$ is that any categorical constructions (e.g. products, coproducts) are guaranteed to spit out a Banach space with a unique norm rather than a norm which is unique up to bi-Lipschitz equivalence. So what kind of categorical constructions are available?
The first thing to observe is that $\text{Ban}_1$ has an unexpected forgetful functor to $\text{Set}$, namely $\text{Hom}(\mathbb{R}, -)$, which gives the unit ball of a Banach space rather than the entire space.
Remark. It is possible to think of the unit ball of a Banach space as a kind of algebraic object in its own right, which motivates this construction. Namely, the unit ball of a Banach space is naturally a totally convex space, which is roughly speaking the “closest algebraic theory to Banach spaces.”
Proposition: The forgetful functor above has left adjoint $S \mapsto \ell^1(S)$.
Proof. We want to show that there is a natural identification
$$\text{Ban}_1(\ell^1(S), V) \cong \text{Set}(S, \text{Ban}_1(\mathbb{R}, V))$$
(where above we denote by $C(a, b)$ the morphisms between two objects $a, b$ of a category $C$. Formerly I used the notation $\text{Hom}_C(a, b)$ for this but this gets clunky.) Given a map $f$ from a set $S$ to the unit ball of $V$, it clearly linearly extends to a map
$$\mathbb{R}[S] \to V$$
where $\mathbb{R}[S]$ denotes the free vector space on $S$ (note that if $S$ is infinite we cannot identify this with $\mathbb{R}^S$). In order to place a norm on $\mathbb{R}[S]$ such that the linear extension above has operator norm at most $1$, we should give each $s \in S$ norm $1$ since $f(s)$, which lies in the unit ball, has norm at most $1$ and equality may occur. The linear combination $\sum_s c_s s$ is sent to a vector of norm at most $\sum_s |c_s|$ by the triangle inequality, and equality may occur, so we should actually give it norm $\sum_s |c_s|$. But this is just the $\ell^1$ norm on the space of compactly supported functions $S \to \mathbb{R}$, so its completion is $\ell^1(S)$.
The above argument may be summarized as “$\ell^1$ has a universal property because it describes the equality case of the triangle inequality.”
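The triangle-inequality bound behind this universal property is easy to test numerically. The sketch below takes a hypothetical map `f` from a three-element set into the unit ball of Euclidean $\mathbb{R}^2$, extends it linearly to finitely supported coefficient functions, and checks that the extension never increases the sum of the absolute values of the coefficients. All names and sample data are illustrative.

```python
import random


def euclidean_norm(v):
    return sum(t * t for t in v) ** 0.5


def extend(f, c):
    """Linear extension of f : S -> V to finitely supported c : S -> R, i.e. sum_s c_s f(s)."""
    dim = len(next(iter(f.values())))
    total = [0.0] * dim
    for s, coeff in c.items():
        total = [t + coeff * x for t, x in zip(total, f[s])]
    return total


def l1_norm(c):
    return sum(abs(coeff) for coeff in c.values())


# Illustrative map from S = {"a", "b", "c"} into the unit ball of Euclidean R^2.
f = {"a": [1.0, 0.0], "b": [0.0, -1.0], "c": [0.6, 0.8]}

random.seed(1)
worst = 0.0
for _ in range(1000):
    c = {s: random.uniform(-5.0, 5.0) for s in f}
    worst = max(worst, euclidean_norm(extend(f, c)) / l1_norm(c))

single = {"a": 2.0}  # a combination supported on one point attains equality
print(worst, euclidean_norm(extend(f, single)), l1_norm(single))
```

The ratio never exceeds $1$, and equality is attained on a single basis element whose image has norm exactly $1$, which is why no smaller norm on the free vector space would work for every target.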
Remark. $\ell^1(S)$ is a space of functions on $S$, which is an essentially contravariant construction. The functor above is supposed to be covariant, so it is better to think of it as the completion of $\mathbb{R}[S]$ or some confusion might result about what it does to morphisms. See this math.SE question for a thorough discussion of this point.
Since $\ell^1$ is a left adjoint, it preserves coproducts, so it follows that $\ell^1(S)$ is the coproduct of $|S|$ copies of $\mathbb{R}$ (the free Banach space on one element). This suggests more generally that coproducts ought to exist in the category of Banach spaces, and indeed they do.
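Proposition: $\text{Ban}_1$ has all small coproducts.
Proof. Let $V_i, i \in I$ be a family of Banach spaces. We want to construct a Banach space $\bigsqcup_i V_i$ so that we have natural identifications
$$\text{Ban}_1\left( \bigsqcup_i V_i, W \right) \cong \prod_i \text{Ban}_1(V_i, W).$$
The same remarks above about the equality case of the triangle inequality apply and give the following: take the direct sum $\bigoplus_i V_i$ equipped with the “$\ell^1$-norm”
$$\left\| \sum_i v_i \right\| = \sum_i \| v_i \|$$
where $v_i \in V_i$, and complete it. From any family of weak contractions $V_i \to W$ this naturally produces a weak contraction $\bigsqcup_i V_i \to W$ and vice versa as desired.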
Given the above we might even suspect that $\text{Ban}_1$ has all small colimits.
Proposition: $\text{Ban}_1$ has all coequalizers.
Proof. The coequalizer of a pair of morphisms $f, g : V \to W$ is just the cokernel of $\frac{f - g}{2}$, so it suffices to construct cokernels. The cokernel of a map $f : V \to W$ is in turn just the quotient of $W$ by the closure of the image of $f$.
Remark. Generally, in any $\text{Ab}$-enriched category, the coequalizer of a pair of morphisms $f, g$ is just the cokernel of $f - g$. The category of Banach spaces is not $\text{Ab}$-enriched because the sum of two weak contractions is not in general a weak contraction (hence the factor of $\frac{1}{2}$ above), but it is quite close: since its morphisms are unit balls in certain Banach spaces, it is enriched over the category of totally convex spaces, so even if we can’t take arbitrary finite linear combinations of morphisms we can still at least take appropriately bounded (infinite!) linear combinations.
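A two-line example shows why the factor of one half is needed: the difference of two weak contractions can have norm $2$, while half the difference is again a weak contraction. The sketch below uses $2 \times 2$ matrices acting on $\mathbb{R}^2$ with the supremum norm, for which the operator norm is the largest absolute row sum; the matrices and helper names are illustrative.

```python
def op_norm_sup(T):
    """Operator norm between sup-normed spaces: the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in T)


def combine(F, G, scale):
    """Matrix of scale * (F - G)."""
    return [[scale * (a - b) for a, b in zip(rf, rg)] for rf, rg in zip(F, G)]


F = [[1.0, 0.0], [0.0, 1.0]]    # the identity, a weak contraction
G = [[-1.0, 0.0], [0.0, -1.0]]  # minus the identity, also a weak contraction

difference = combine(F, G, 1.0)
halved = combine(F, G, 0.5)
print(op_norm_sup(F), op_norm_sup(G))
print(op_norm_sup(difference), op_norm_sup(halved))
```

Rescaling does not change the kernel or the closure of the image, so replacing the difference by half of it changes nothing about the resulting equalizer or coequalizer.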
We now appeal to the following general result.
Proposition: Let $C$ be a category which has all small coproducts and coequalizers. Then $C$ is cocomplete (it has all small colimits). (Dually, if a category has small products and equalizers, then it is complete.)
Proof. Let $F : J \to C$ be a diagram where $J$ is small. Recall that a colimit of this diagram is an object $\text{colim } F$ together with maps $\iota_j : F(j) \to \text{colim } F$ such that $\iota_{j'} \circ F(f) = \iota_j$ for any morphism $f : j \to j'$ in $J$, and which is initial with respect to this property. This is equivalent to the statement that $\text{colim } F$ is a coequalizer of the diagram
$$\alpha, \beta : \bigsqcup_{f \in \text{Mor}(J)} F(s(f)) \rightrightarrows \bigsqcup_{j \in \text{Ob}(J)} F(j)$$
where $\text{Ob}(J)$ and $\text{Mor}(J)$ are the objects and morphisms in $J$, $s(f)$ is the domain of a morphism $f$, $\alpha$ acts as the identity map $F(s(f)) \to F(s(f))$ on components, and $\beta$ acts by $F(f) : F(s(f)) \to F(t(f))$ (where $t(f)$ is the codomain of $f$) on components. Since $J$ is small, both the coproducts above exist by assumption, and so does the coequalizer above.
Corollary: $\text{Ban}_1$ is cocomplete.
Given that $\text{Ban}_1$ “behaves like an algebraic theory” (namely the theory of totally convex spaces whose operations are certain infinite sums) this is perhaps not too surprising. We might expect the same to be true of limits for the same reason.
Proposition: $\text{Ban}_1$ has all small products.
Proof. Let $V_i, i \in I$ be a family of Banach spaces. We want to construct a Banach space $\prod_i V_i$ so that we have natural identifications
$$\text{Ban}_1\left( W, \prod_i V_i \right) \cong \prod_i \text{Ban}_1(W, V_i).$$
In particular, setting $W = \mathbb{R}$, we observe that $\prod_i V_i$ needs to have a unit ball which is naturally identified with the product of the unit balls of the $V_i$. This suggests the following: we will take $\prod_i V_i$ to be the subspace of the Cartesian product of the $V_i$ which is bounded in the sense that the “$\ell^{\infty}$-norm”
$$\| (v_i)_{i \in I} \| = \sup_i \| v_i \|$$
exists. This subspace is already complete. Its fundamental property is that an element has norm at most $1$ if and only if all of its components do, and the universal property follows straightforwardly from this.
Note that it follows that $\ell^{\infty}(S)$ is the product of $|S|$ copies of $\mathbb{R}$.
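To see why the supremum of the component norms is the right norm for products, and not the sum that worked for coproducts, consider pairing two weak contractions $W \to V_1$ and $W \to V_2$ into a single map to the product. The sketch below, with hypothetical helper names and with all three spaces taken to be $\mathbb{R}$, compares the two candidate norms on the paired image of a unit vector.

```python
def sup_norm(v):
    return max(abs(t) for t in v)


def product_norm(components, norm=sup_norm):
    """'l^infinity-norm' on a finite product: the largest of the component norms."""
    return max(norm(v) for v in components)


def sum_norm(components, norm=sup_norm):
    """'l^1-norm' on the same tuples, shown only for comparison."""
    return sum(norm(v) for v in components)


# Pair two weak contractions W -> V_1 and W -> V_2, here both the identity on R.
w = [1.0]        # a unit vector in W
paired = (w, w)  # its image in the Cartesian product

print(product_norm(paired), sum_norm(paired))
```

With the supremum norm the paired map is still a weak contraction, whereas with the sum it would double the norm of the unit vector, so the universal property of the product would fail.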
Proposition: $\text{Ban}_1$ has all equalizers.
Proof. The equalizer of a pair of morphisms $f, g : V \to W$ is just the kernel of $\frac{f - g}{2}$, so it suffices to construct kernels. The kernel of a map $f : V \to W$ is in turn just its kernel in the ordinary algebraic sense (since $f$ is continuous, it is already closed).
Corollary: $\text{Ban}_1$ is complete.
The closed monoidal category of Banach spaces
$\text{Ban}_1$, as we saw above, has nice categorical properties. But it is undeniable that in practice we want to consider bounded linear operators of norm greater than $1$, and we don’t want to have to normalize them in order to consider them as morphisms.
Fortunately, we don’t have to. The space $[V, W]$ of bounded linear operators between two Banach spaces $V, W$ may be regarded as a bifunctor (contravariant in $V$, covariant in $W$)
$$[-, -] : \text{Ban}_1^{op} \times \text{Ban}_1 \to \text{Ban}_1$$
instead. This bifunctor has all the properties that morphisms in a category satisfy, which are summarized by the axioms of a closed category; we call such a functor an internal hom. Among other things, there is a distinguished object $I$ such that there is a natural isomorphism $[I, V] \cong V$; moreover,
$$\text{Hom}(I, [V, W]) \cong \text{Hom}(V, W).$$
The prototypical example of a category with internal homs is $\text{Set}$, where $[X, Y]$ is just the set of functions from $X$ to $Y$. An example where the distinction between the ordinary hom and the internal hom is clearer is the category of $G$-sets for $G$ a group; here the internal hom $[X, Y]$ is the set of functions between two $G$-sets, which is itself a $G$-set. $I$ is the trivial one-point $G$-set, $\text{Hom}(I, -)$ is the functor sending a $G$-set to its subset of fixed points, and so $\text{Hom}(I, [X, Y])$ is the set of morphisms between two $G$-sets as expected.
A linear version of this family of examples is given by the categories $\text{Rep}(G)$ of representations of groups $G$. There the internal hom is given by the space of linear maps between two representations, which is itself a representation. $\text{Hom}(I, -)$ in this context is the functor sending a representation to its invariant subspace, and the invariant subspace of the internal hom is the space of morphisms between two representations.
A feature that the above examples of internal homs have in common is that there exists a monoidal operation $\otimes$ and a natural identification
$$\text{Hom}(X \otimes Y, Z) \cong \text{Hom}(X, [Y, Z])$$
called the tensor-hom adjunction. This says precisely that $- \otimes Y$ is left adjoint to $[Y, -]$, or equivalently that $[Y, -]$ is right adjoint to $- \otimes Y$. This relationship, which holds in very general contexts, allows us to characterize various tensor products we care about in terms of various internal homs we care about or vice versa.
In $\text{Set}$ the monoidal operation is just the categorical product and this is one of the axioms describing a Cartesian closed category. In $\text{Rep}(G)$ the monoidal operation is the tensor product and this is one of the axioms describing a closed monoidal category.
Does such a monoidal operation exist for $\text{Ban}_1$? If it does, we can make sense of the statement that $\text{Ban}_1$ enriches over itself (since we need to define composition maps $[U, V] \otimes [V, W] \to [U, W]$ to do this). As it turns out, yes. We will ask for a stronger statement to hold, namely
$$[U \otimes V, W] \cong [U, [V, W]]$$
so that taking unit balls of both sides recovers the natural identification we want. The space on the right is precisely the space of bilinear maps $B : U \times V \to W$ which are continuous in both variables with norm given by
$$\| B \| = \sup_{u, v \neq 0} \frac{\| B(u, v) \|}{\| u \| \| v \|}.$$
This suggests that we ought to define $U \otimes V$ in the same way that we define the tensor product of vector spaces. A first thought is to start with the ordinary tensor product and complete it with respect to some norm, but the question of exactly what norm to place on the tensor product is somewhat subtle and there is more than one reasonable choice; see Wikipedia for a discussion.
Fortunately, the tensor product in the above sense has a norm which is already uniquely specified by its adjunction with $[-, -]$, so in this sense the categorical machinery at work here picks out a unique tensor product, if it exists. But does it exist?
If we want to imitate the construction of the tensor product of vector spaces, we should complete earlier. Recall that the ordinary tensor product of two vector spaces $U, V$ is the quotient of the vector space on formal symbols $u \otimes v$ by the relations
$$(u_1 + u_2) \otimes v = u_1 \otimes v + u_2 \otimes v, \quad u \otimes (v_1 + v_2) = u \otimes v_1 + u \otimes v_2, \quad (cu) \otimes v = c (u \otimes v) = u \otimes (cv).$$
When repeating this construction in the Banach space setting, we should give the formal symbol $u \otimes v$ the norm $\| u \| \| v \|$ by bilinearity and compatibility with the norm used above on bilinear maps. After completing this space with respect to this norm, we should then quotient by the closure of the space spanned by the relations above. This gives, if I’m not mistaken, the projective tensor product of two Banach spaces.
With this tensor product $\text{Ban}_1$ becomes a closed monoidal category. Moreover, for any Banach space $V$, the space $[V, V]$ of bounded linear operators $V \to V$ becomes a monoid object in the category of Banach spaces, or a Banach algebra, which we’ll turn to soon.