One annoying feature of the abstract theory of vector spaces, and one that often trips up beginners, is that it is not possible to make sense of an infinite sum of vectors in general. If we want to make sense of infinite sums, we should probably define them as limits of finite sums, so rather than work with bare vector spaces we need to work with topological vector spaces over a topological field, usually $\mathbb{R}$ or $\mathbb{C}$ (but sometimes fields like $\mathbb{Q}_p$ are also considered, e.g. in number theory). Common and important examples include spaces of continuous or differentiable functions.
Today we’ll discuss a class of topological vector spaces which is convenient to work with but which still covers many examples of interest, namely Banach spaces. The material in the first half of this post is completely standard and can be found in any text on functional analysis.
In the second half of the post we discuss a category of Banach spaces such that two Banach spaces are isomorphic in this category if and only if they are isometrically isomorphic but which still allows us to talk about bounded linear operators between Banach spaces, and to do this we briefly discuss Lawvere metrics; this material can be found on the nLab.
Definition and examples
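Let $V$ be a vector space over $\mathbb{R}$ or $\mathbb{C}$. A seminorm on $V$ is a function $\| \cdot \| : V \to \mathbb{R}_{\ge 0}$ satisfying

$$\| \lambda v \| = |\lambda| \, \| v \|, \qquad \| v + w \| \le \| v \| + \| w \|$$

for all $v, w \in V$ and all scalars $\lambda$.

A norm is a seminorm such that $\| v \| = 0$ implies $v = 0$. Any norm defines a metric $d(v, w) = \| v - w \|$ on $V$, and a vector space equipped with a norm is a normed (vector) space. It is a Banach space if in addition it is complete with respect to this metric.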
Example. The completion of any normed space is naturally a Banach space.
Example. Let $p \ge 1$ be real and let $S$ be a set. The $\ell^p$-space $\ell^p(S)$ is the normed space of sequences $x : S \to \mathbb{R}$ such that

$$\| x \|_p = \left( \sum_{s \in S} |x(s)|^p \right)^{1/p}$$

converges, with the above as a norm (the triangle inequality for this norm is Minkowski’s inequality, which is a consequence of Hölder’s inequality). These spaces are always complete. (The proof is an unnecessary digression at this point, but can probably be found in any text on measure theory.)

When $p = 2$ and $S$ is a finite set of size $n$ we recover the Euclidean space $\mathbb{R}^n$ with the usual Euclidean distance.
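Example. The $\ell^{\infty}$-space $\ell^{\infty}(S)$ is the space of bounded sequences $x : S \to \mathbb{R}$ equipped with the $\ell^{\infty}$, supremum, or uniform norm

$$\| x \|_{\infty} = \sup_{s \in S} |x(s)|.$$

$\ell^{\infty}$ gets its name from the fact that

$$\| x \|_{\infty} = \lim_{p \to \infty} \| x \|_p$$

for $x$ such that both sides are well-defined, and it is also complete (although the proof is more straightforward in this case).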
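The relationship between the $\ell^p$ norms and the supremum norm can be checked numerically on a finite set. The following is only a minimal sketch: the helper name `lp_norm` and the sample vectors are illustrative assumptions, not anything from the text.

```python
import math

def lp_norm(x, p):
    """ell^p norm of a finite sequence x, for real p >= 1 or p = math.inf."""
    if p == math.inf:
        return max((abs(t) for t in x), default=0.0)
    if p < 1:
        raise ValueError("p must be at least 1 for the triangle inequality to hold")
    # Factor out the largest entry so that large exponents stay numerically tame.
    m = max((abs(t) for t in x), default=0.0)
    if m == 0.0:
        return 0.0
    return m * sum((abs(t) / m) ** p for t in x) ** (1.0 / p)

x = [3.0, -4.0, 1.0]
y = [1.0, 2.0, -2.0]

# p = 2 on a finite set recovers the Euclidean length.
euclid = lp_norm(x, 2)

# Triangle inequality for a few exponents (small tolerance for rounding).
triangle_ok = all(
    lp_norm([a + b for a, b in zip(x, y)], p) <= lp_norm(x, p) + lp_norm(y, p) + 1e-12
    for p in (1, 1.5, 2, 3, 10, math.inf)
)

# The p-norms decrease toward the sup norm as p grows.
norms = [lp_norm(x, p) for p in (1, 2, 4, 16, 64, 256)]
```

Here `norms` starts at the $\ell^1$ norm and shrinks toward `lp_norm(x, math.inf)`, the largest absolute entry.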
Example. More generally, let $X$ be a measure space with measure $\mu$ and let $p \ge 1$ be real. The $\mathcal{L}^p$-space $\mathcal{L}^p(X)$ is the space of all measurable functions $f : X \to \mathbb{R}$ such that

$$\| f \|_p = \left( \int_X |f|^p \, d\mu \right)^{1/p}$$

exists. The above function is a seminorm (the triangle inequality is again a consequence of Hölder’s inequality), but usually not a norm since, for example, a nonzero $f$ may have support a set of measure zero. To get a norm, we quotient by the subspace of $f$ with $\| f \|_p = 0$ to get the $L^p$-space $L^p(X)$, which is genuinely a normed space. These spaces are also always complete. (Again, an unnecessary digression.)
This construction generalizes the construction of the spaces $\ell^p(S)$, which arise in the case that every subset of $S$ is measurable and the measure $\mu$ assigns measure $1$ to all singletons in $S$ (hence assigns measure $\infty$ to all infinite sets in $S$).
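Example. The $\mathcal{L}^{\infty}$-space $\mathcal{L}^{\infty}(X)$ is the space of measurable functions $f : X \to \mathbb{R}$ such that the $\mathcal{L}^{\infty}$ or essential supremum norm

$$\| f \|_{\infty} = \inf \{ c \ge 0 : |f(x)| \le c \text{ for almost every } x \}$$

exists. In more words, the essential supremum is the “almost everywhere supremum”: $|f|$ may be greater than it, but only on a set of measure zero, and $\| f \|_{\infty}$ is the infimum of all numbers with this property. Again after quotienting by the elements of norm zero we obtain a normed space $L^{\infty}(X)$ which is again complete. Again we have

$$\| f \|_{\infty} = \lim_{p \to \infty} \| f \|_p$$

for suitable $f$.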
(The rest of this post requires no measure theory. It’s a curious feature of functional analysis that measure theory is not needed to understand the proofs of the main theorems but it is used to construct some of the most important examples!)
Example. Let $X$ be a topological space and $C_b(X)$ be the space of bounded continuous functions $f : X \to \mathbb{R}$ equipped with the supremum norm

$$\| f \| = \sup_{x \in X} |f(x)|.$$

Since the uniform limit of continuous functions is continuous, $C_b(X)$ is complete with respect to this norm.
Example. Any closed subspace of a Banach space is a Banach space with the induced norm.
Example. Let $V$ be a Banach space and $W$ a subspace of $V$. (When) can we equip the quotient $V/W$ with the structure of a Banach space? The first question is what the norm of a coset $v + W$ ought to be. The norm of the zero coset $W$ must be zero. More generally, if we want the quotient map $V \to V/W$ to be continuous, the norm of a coset $v + W$ where $\| v \|$ is small must also be small. A natural idea is to take the distance from $v + W$ (thought of as an affine subspace) to the origin, or

$$\| v + W \| = \inf_{w \in W} \| v + w \|.$$

This always defines a seminorm. $\| v + W \| = 0$ if and only if $v$ lies in the closure of $W$, so in order to get a norm it is necessary and sufficient that $W$ is closed.
Infinite sums
The infinite sums that make sense in a Banach space include those infinite sums $\sum_n v_n$ which converge absolutely in the sense that $\sum_n \| v_n \|$ converges. Any such infinite sum has the property that its partial sums are Cauchy, hence converges by completeness. Absolute convergence also has the same nice property that it does in real analysis that the order of summation is irrelevant (sketch: given any permutation of an absolutely convergent series, for any $N$ we may continue the summation until the first $N$ terms in the original order are included in the summation; the remaining terms then contribute at most the tail $\sum_{n > N} \| v_n \|$, which tends to zero).

If $\sum_n v_n$ only converges in the sense that its partial sums approach a limit, then it converges conditionally, and just as in real analysis sums which converge conditionally but not absolutely cannot be permuted in general (since we can take the $v_n$ to lie in the same one-dimensional subspace and use the same counterexamples as in real analysis).
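One of the real-analysis counterexamples alluded to here is the alternating harmonic series, which sums to $\ln 2$ in its usual order but to $\frac{3}{2} \ln 2$ when reordered as two positive terms followed by one negative term. A minimal numerical sketch follows; the function names are illustrative, and the partial sums only approximate the limits.

```python
import math
from itertools import islice

def alternating_harmonic():
    """Terms 1, -1/2, 1/3, -1/4, ... in their usual order."""
    n = 1
    while True:
        yield (1.0 if n % 2 == 1 else -1.0) / n
        n += 1

def rearranged():
    """The same terms, reordered as two positive terms followed by one negative term."""
    odd, even = 1, 2
    while True:
        yield 1.0 / odd
        yield 1.0 / (odd + 2)
        yield -1.0 / even
        odd += 4
        even += 2

def partial_sum(terms, count):
    """Sum of the first `count` terms, using fsum to limit rounding error."""
    return math.fsum(islice(terms, count))

usual = partial_sum(alternating_harmonic(), 300_000)
permuted = partial_sum(rearranged(), 300_000)
```

With these many terms, `usual` is close to `math.log(2)` and `permuted` is close to `1.5 * math.log(2)`, even though both series use exactly the same terms.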
It is possible to make sense of the limit of a sequence of more than countably many terms using ordinals, and therefore to make sense of conditional convergence for an infinite series with more than countably many terms, but we will not do so for the following reason.
Proposition: Let $S$ be an uncountable set of positive real numbers. For any positive real $M$, there exists a finite subset of $S$ whose sum is greater than $M$.

(Consequently, no infinite series with more than countably many nonzero terms can converge absolutely. Also, the sequences appearing in the function spaces $\ell^p(S)$ for $p$ finite and $S$ uncountable necessarily have countable support.)

Proof. The sets $S_n = \{ s \in S : s > \frac{1}{n} \}$ for positive integers $n$ exhibit $S$ as a countable union of sets. Since $S$ is uncountable, at least one of these sets is uncountable, in particular infinite, and the conclusion follows: any $k$ elements of that $S_n$ have sum greater than $\frac{k}{n}$.
Some basic propositions
Let $V, W$ be normed spaces and $T : V \to W$ a linear transformation between them. In functional analysis $V, W$ are typically spaces of functions and conventionally $T$ is referred to as a (linear) operator in this context. The operator norm of $T$, when it exists, is

$$\| T \| = \sup_{\| v \| = 1} \| T v \|.$$

If $\| T \|$ exists, we say that $T$ is bounded. Note that if $\| T \|$ exists, then $\| T v \| \le \| T \| \, \| v \|$ for any $v \in V$; moreover, $\| T \|$ is the infimum of all numbers with this property.
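In finite dimensions the operator norm can be computed explicitly for some choices of norm. As a sketch, assume both the domain and codomain carry the supremum norm; then the operator norm of a matrix is its largest absolute row sum (a standard fact, though the helper names below are illustrative).

```python
import random

def sup_norm(v):
    return max(abs(t) for t in v)

def apply(T, v):
    """Matrix-vector product; T is a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in T]

def operator_norm_sup(T):
    """Operator norm of T when domain and codomain both carry the sup norm:
    the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in T)

T = [[1.0, -2.0, 0.5],
     [0.0, 3.0, -1.0]]
norm_T = operator_norm_sup(T)   # row sums are 3.5 and 4.0

# The supremum is attained at a sign vector matching the heaviest row.
extremal = [1.0 if a >= 0 else -1.0 for a in T[1]]
attained = sup_norm(apply(T, extremal))

# ||Tv|| <= ||T|| ||v|| on random samples (small tolerance for rounding).
rng = random.Random(0)
bound_holds = all(
    sup_norm(apply(T, v)) <= norm_T * sup_norm(v) + 1e-12
    for v in ([rng.uniform(-1, 1) for _ in range(3)] for _ in range(1000))
)
```

The vector `extremal` has sup norm $1$ and is sent to a vector of sup norm `norm_T`, so no smaller constant works in the bound.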
Proposition: $T$ is bounded if and only if it is continuous.

Proof. If $T$ is bounded then it is Lipschitz, so continuous. In the other direction, if $T$ is continuous then the preimage of the open ball of radius $1$ in $W$ is open and contains the origin, hence contains an open ball of some radius $\epsilon > 0$ centered at the origin. Scaling, it follows that the image of the open ball of radius $1$ under $T$ is contained in the open ball of radius $\frac{1}{\epsilon}$, and the conclusion follows.
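Corollary: Two norms $\| \cdot \|, \| \cdot \|'$ induce the same topology on a vector space $V$ if and only if there exist constants $c, C > 0$ such that

$$c \| v \| \le \| v \|' \le C \| v \|$$

for all $v \in V$.

We say that $\| \cdot \|, \| \cdot \|'$ are (bi-Lipschitz) equivalent.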
Given two normed spaces $V, W$, the operator norm endows the space $V \Rightarrow W$ of bounded linear operators $V \to W$ with a norm, hence also making it a normed space.

Proposition: If $W$ is a Banach space, then so is $V \Rightarrow W$.
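Proof. Let $T_n$ be a Cauchy sequence of bounded linear operators $V \to W$. Then $T_n v$ is Cauchy for any $v \in V$, so by the completeness of $W$ we may define

$$T v = \lim_{n \to \infty} T_n v.$$

It remains to verify that $T$ is a bounded linear operator and that it is the limit of the sequence $T_n$ in the operator norm. The former is straightforward. The latter is also straightforward but it is worth noticing that it remains to be proven. Fix $\epsilon > 0$ and let $N$ be such that $\| T_n - T_m \| < \epsilon$ for all $n, m \ge N$. Then $\| T_n v - T_m v \| < \epsilon$ for all $v$ such that $\| v \| \le 1$. Taking the pointwise limit as $m \to \infty$, it follows that $\| T_n v - T v \| \le \epsilon$ for all $n \ge N$ and for all $\| v \| \le 1$ (with $N$ not depending on $v$; this is the reason why we weren’t already done), and the conclusion follows.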
The dual space of a normed space $V$ is the space $V^{\ast}$ of all bounded linear functionals $V \to \mathbb{R}$.

Corollary: $V^{\ast}$ is a Banach space.

A normed space $V$ is reflexive if the natural map $V \to V^{\ast\ast}$ is a homeomorphism.

Corollary: Any reflexive normed space is a Banach space.

Note that unlike the finite-dimensional case, it is not obvious that the natural map is even injective in general. The problem is that $V^{\ast}$ is not obviously nontrivial. In fact, it is consistent with ZF that there exist nonzero vector spaces $V$ such that $V^{\ast}$ is trivial!

For Banach spaces the nontriviality of $V^{\ast}$ is only settled by the Hahn-Banach theorem, which is independent of ZF but follows from the ultrafilter lemma. In fact, Hahn-Banach shows that the natural map $V \to V^{\ast\ast}$ is injective and norm-preserving, but for many Banach spaces it is still not an isomorphism (Wikipedia gives examples).
Finite-dimensional and infinite-dimensional normed spaces
The real meat of the theory of Banach spaces lies in infinite dimensions so it is good to get an understanding of the differences between the finite- and infinite-dimensional cases.
Proposition: Let $V$ be a finite-dimensional real vector space. There exists a unique Hausdorff topology on $V$ relative to which addition and scalar multiplication are continuous.
(The Hausdorff condition is necessary to prevent the trivial counterexample of the indiscrete topology.)
Proof. Assume that $V$ is equipped with such a topology and let $e_1, \dots, e_n$ be a basis for $V$. By assumption, the map

$$\phi : \mathbb{R}^n \ni (c_1, \dots, c_n) \mapsto \sum_{i=1}^n c_i e_i \in V$$

is a continuous bijection. Thinking of $\mathbb{R}^n$ as being equipped with the Euclidean norm, $\phi$ restricts to a continuous bijection from the closed ball of radius $r$ in $\mathbb{R}^n$ to the corresponding subset of $V$, which is compact since closed balls in $\mathbb{R}^n$ are compact and Hausdorff by assumption. But any continuous bijection between compact Hausdorff spaces is a homeomorphism. Taking $r \to \infty$ it follows that $\phi$ is a homeomorphism from $\mathbb{R}^n$ with the product topology to $V$.

Note that the above argument implies that $V$ is complete with respect to any norm, hence all finite-dimensional normed spaces are Banach spaces.

Corollary: Any two norms on a finite-dimensional real vector space are bi-Lipschitz equivalent.
This is quite false in infinite dimensions.
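Example. Consider the norms $\| \cdot \|_p$, $1 \le p \le \infty$, on the space $c_c(\mathbb{N})$ of functions $\mathbb{N} \to \mathbb{R}$ of compact support (that is, sequences with finitely many nonzero values), where $\mathbb{N}$ denotes the positive integers. For $r > 0$ a real parameter consider the sequence of functions

$$f_n(k) = \begin{cases} k^{-r} & \text{if } k \le n \\ 0 & \text{otherwise.} \end{cases}$$

This sequence is Cauchy with respect to the norm $\| \cdot \|_p$ for $p$ finite if and only if $\sum_k k^{-rp}$ converges, hence if and only if $rp > 1$, and it is always Cauchy with respect to the $\ell^{\infty}$ norm. Choosing suitable values of $r$ shows that none of the $\ell^p$ norms are bi-Lipschitz equivalent (since bi-Lipschitz equivalence preserves Cauchy sequences).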
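The failure of equivalence can be seen numerically. The sketch below assumes the truncated power sequences $f_n(k) = k^{-r}$ for $1 \le k \le n$ (and $0$ otherwise) with the hypothetical choice $r = 0.75$, and measures the gaps $f_{2n} - f_n$ in three norms; the function name `gap_norm` is illustrative.

```python
import math

def gap_norm(n, r, p):
    """ell^p norm of f_{2n} - f_n, where f_n(k) = k**(-r) for 1 <= k <= n and 0 otherwise."""
    tail = [k ** (-r) for k in range(n + 1, 2 * n + 1)]
    if p == math.inf:
        return max(tail)
    return math.fsum(t ** p for t in tail) ** (1.0 / p)

r = 0.75
sizes = [10, 100, 1000, 10000]
gaps_1 = [gap_norm(n, r, 1) for n in sizes]            # r * p = 0.75 <= 1: gaps grow
gaps_2 = [gap_norm(n, r, 2) for n in sizes]            # r * p = 1.5 > 1: gaps shrink
gaps_inf = [gap_norm(n, r, math.inf) for n in sizes]   # gaps shrink for any r > 0
```

The same sequence has shrinking gaps in the $\ell^2$ and $\ell^{\infty}$ norms but growing gaps in the $\ell^1$ norm, so it cannot be Cauchy for all three at once.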
A nice feature of the finite-dimensional case is that the closed unit ball of a finite-dimensional normed space is always compact. This is also quite false in infinite dimensions.

Proposition: Let $V$ be an infinite-dimensional normed space. Then the closed unit ball of $V$ is not compact.

Proof idea. A compact subset of a metric space admits a finite cover by open balls of any given radius. It takes at least $2^n$ such balls of radius $\frac{1}{2}$ to cover the unit ball in a normed space of dimension $n$ by considering volumes, so “as $n \to \infty$” it ought to take infinitely many such balls to cover the unit ball in an infinite-dimensional normed space.
(The proof idea, unfortunately, leads to the realization that there is no obvious generalization of Lebesgue measure to infinite-dimensional vector spaces. The problem is that we would like this measure to satisfy $\mu(cE) = c^n \mu(E)$ where $E$ is some measurable subset, $c > 0$, and $n$ is the dimension, but when $n = \infty$ we can’t satisfy this requirement in a reasonable way.)
Proof. Suppose by contradiction that the unit ball of $V$ admits a covering by open balls of radius $\frac{1}{4}$ centered at points $v_1, \dots, v_n$. Since $V$ is infinite-dimensional, by adding vectors not in $\text{span}(v_1, \dots, v_n)$ of norm less than $\frac{1}{4}$ if necessary to the $v_i$, we can construct a covering of the unit ball of $V$ by open balls of radius $\frac{1}{2}$ centered at points $w_1, \dots, w_n$ which are linearly independent.

These points span a subspace $W$ in which the unit ball is also covered by open balls of radius $\frac{1}{2}$ centered at the $w_i$. However, $W$ is finite-dimensional, so the proof idea applies: the unit ball has volume $\mu > 0$ (relative to Lebesgue measure on $W$ after choosing an identification $W \cong \mathbb{R}^n$) but the union of the balls centered at the $w_i$ has volume at most $\frac{n \mu}{2^n} < \mu$; contradiction.
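For a concrete space such as $\ell^2$ there is a more hands-on way to see the failure of compactness, different from the volume argument above: the standard basis vectors are unit vectors at mutual distance $\sqrt{2}$, so a ball of radius $\frac{1}{2}$ can contain at most one of them. The sketch below checks this on a truncation to finitely many coordinates; the helper names and the truncation size are illustrative assumptions.

```python
def basis_vector(i, n):
    """The i-th standard basis vector, truncated to n coordinates."""
    return [1.0 if j == i else 0.0 for j in range(n)]

def l2_distance(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

n = 50
vectors = [basis_vector(i, n) for i in range(n)]

# Collect every pairwise distance between distinct basis vectors.
distances = {
    l2_distance(vectors[i], vectors[j])
    for i in range(n) for j in range(i + 1, n)
}
```

Every pairwise distance is the same number, so covering these `n` unit vectors by balls of radius $\frac{1}{2}$ takes at least `n` balls, and `n` can be taken as large as we like.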
The non-compactness of the unit ball in infinite dimensions is an inconvenience, as it makes certain naive arguments fail, but it can be dealt with.
“The” category of Banach spaces I
What should a morphism between Banach spaces be? If you’re used to thinking of morphisms of metric spaces as being continuous functions, then the obvious answer is a bounded linear operator. We’ll denote this category by $\text{Ban}$. Two Banach spaces are isomorphic in this category if and only if they are bi-Lipschitz equivalent.

A less obvious choice of morphisms between metric spaces is to take morphisms to be distance-decreasing maps or weak contractions: maps $f : X \to Y$ satisfying

$$d_Y(f(x_1), f(x_2)) \le d_X(x_1, x_2).$$

In other words, we allow only maps which are Lipschitz with Lipschitz constant at most $1$. Two metric spaces are isomorphic in this category if and only if they are isometrically isomorphic, so among other things this definition gives a notion of isomorphism of metric spaces which actually preserves all metric structure rather than just the induced topology.

When applied to Banach spaces, we get the category $\text{Ban}_1$ whose morphisms are operators of norm at most $1$ (short maps).
Where does this definition come from?
A lengthy digression into Lawvere metric spaces
The story goes that Lawvere noticed that the triangle inequality

$$d(a, b) + d(b, c) \ge d(a, c)$$

bears a certain resemblance to composition in a category, which is a map

$$\text{Hom}(a, b) \times \text{Hom}(b, c) \to \text{Hom}(a, c).$$

This observation eventually led Lawvere to the realization that metric spaces are, in fact, enriched categories. Recall that a $\mathcal{V}$-enriched category, for $\mathcal{V}$ a monoidal category, is a collection of objects together with an object $\text{Hom}(a, b) \in \mathcal{V}$ for every pair of objects $a, b$, a distinguished arrow $I \to \text{Hom}(a, a)$ (where $I$ denotes the identity object in $\mathcal{V}$), and composition maps

$$\text{Hom}(a, b) \otimes \text{Hom}(b, c) \to \text{Hom}(a, c)$$

satisfying the usual identity and associativity requirements. A Lawvere metric space is a $\mathcal{V}$-enriched category where $\mathcal{V}$ is the non-negative extended reals $[0, \infty]$ regarded as a poset (hence there is a morphism $x \to y$ if and only if $x \ge y$) together with the monoidal operation $+$. In other words, it is a collection of objects together with a non-negative extended real $d(a, b)$ assigned to each pair of objects such that there is a (necessarily unique) arrow $0 \to d(a, a)$ (hence $d(a, a) = 0$) and such that composition maps

$$d(a, b) + d(b, c) \to d(a, c)$$

exist, and this is precisely the statement of the triangle inequality. (The identity and associativity requirements are automatic here.)

The correct notion of functor between enriched categories is that of an enriched functor, which is a map $F$ on objects together with a collection of induced maps

$$\text{Hom}(a, b) \to \text{Hom}(F(a), F(b))$$

compatible with composition. When applied to Lawvere metric spaces, this is precisely the requirement that $F$ is a weak contraction. (Again the compatibility with composition is automatic here.)
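On a finite set of points, both conditions can be checked by brute force. The following sketch assumes distances are given as Python functions that may return `math.inf`; the names `is_lawvere_metric`, `is_weak_contraction`, and the two-point “one-way street” example are all hypothetical illustrations.

```python
import math
from itertools import product

def is_lawvere_metric(points, d):
    """Check d(a, a) == 0 and d(a, b) + d(b, c) >= d(a, c) on a finite set.
    Values may be math.inf; symmetry and separation are deliberately not required."""
    if any(d(a, a) != 0 for a in points):
        return False
    return all(d(a, b) + d(b, c) >= d(a, c) for a, b, c in product(points, repeat=3))

def is_weak_contraction(points, d_source, d_target, f):
    """Check d_target(f(a), f(b)) <= d_source(a, b) for all pairs of points."""
    return all(d_target(f(a), f(b)) <= d_source(a, b)
               for a, b in product(points, repeat=2))

# A one-way street: going from "up" to "down" costs 1, but there is no way back.
cost = {("up", "up"): 0, ("down", "down"): 0,
        ("up", "down"): 1, ("down", "up"): math.inf}
street = lambda a, b: cost[(a, b)]

# The real line with its usual metric, and a map sending up -> 1.0, down -> 0.5.
line = lambda s, t: abs(s - t)
height = {"up": 1.0, "down": 0.5}.get
```

For instance, `is_lawvere_metric(["up", "down"], street)` holds even though `street` is neither symmetric nor finite, and `height` is a weak contraction from the street to the line.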
Lawvere metric spaces generalize ordinary metric spaces in three ways (besides the fact that their collection of objects need not be a set):
- a Lawvere metric need not take only finite values,
- a Lawvere metric need not be symmetric, and
- a Lawvere metric need not have the property that if $d(a, b) = 0$ then $a = b$.
The first generalization is actually quite natural geometrically: intuitively an infinite distance between two points means it is not possible to reach one from the other. Infinite distances also allow us to construct coproducts of Lawvere metric spaces by defining the distance between points in different summands to be infinite. The third generalization is just saying that we allow objects to be isomorphic without being identical, which is analogous to the generalization from posets to preorders and which we were already doing when we considered seminorms that were not norms.
The second generalization is probably the hardest to come to terms with geometrically. What does it mean for the distance between $a$ and $b$ to be different from the distance between $b$ and $a$? My basic intuition here comes from traveling on a hilly surface under the influence of gravity: if $a$ is on top of a hill and $b$ is at the bottom, traveling from $a$ to $b$ will be easier than traveling from $b$ to $a$ if one factors in gravity even though the ordinary spatial distance between the two is the same. This leads to the following family of Lawvere metric spaces which are not symmetric.
Example. Let $(X, d)$ be a metric space in the ordinary sense and let $h : X \to \mathbb{R}$ be a non-constant function which we think of as a potential (e.g. gravitational potential). Define a Lawvere metric by

$$d_h(a, b) = d(a, b) + \max(h(b) - h(a), 0).$$

This metric reduces to the ordinary metric whenever the potential at $b$ is less than or equal to the potential at $a$ but is greater if going from $a$ to $b$ involves going “uphill” (in the direction of increasing potential).
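As a sanity check, here is a small sketch of one such metric, assuming the uphill penalty is simply added to the ordinary distance, that is, $d_h(a, b) = d(a, b) + \max(h(b) - h(a), 0)$. The real line, the potential $h(a) = a^2$, and the sample points are hypothetical choices made for illustration.

```python
from itertools import product

def uphill_metric(d, h):
    """Lawvere metric d_h(a, b) = d(a, b) + max(h(b) - h(a), 0)."""
    return lambda a, b: d(a, b) + max(h(b) - h(a), 0.0)

d = lambda a, b: abs(a - b)      # ordinary metric on the real line
h = lambda a: a * a              # hypothetical potential: a valley with its bottom at 0
d_h = uphill_metric(d, h)

points = [-2.0, -1.0, 0.0, 0.5, 1.5, 3.0]
downhill = d_h(3.0, 0.0)   # potential drops from 9 to 0: just the distance
uphill = d_h(0.0, 3.0)     # potential rises by 9: distance plus 9

# Triangle inequality on all triples of sample points (tolerance for rounding).
triangle_ok = all(
    d_h(a, b) + d_h(b, c) >= d_h(a, c) - 1e-12
    for a, b, c in product(points, repeat=3)
)
```

The triangle inequality survives because the positive part of a sum is at most the sum of the positive parts, while symmetry visibly fails: `downhill` and `uphill` differ.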
More generally, I think a reasonable intuition for Lawvere metrics is to think of the set of points of a Lawvere metric space as a set of states of some physical system and the metric as a measure of the minimal “cost” or “energy” necessary to transition from one state to another. (This is closely related to the intuition for ordinary categories where one thinks of a Hom-set as describing the set of ways one can transition from one state to another.) The triangle inequality is the fundamental observation that it can’t cost more to transition from $a$ to $c$ than it does to transition from $a$ to $b$, then from $b$ to $c$ (since the latter is a valid way to transition from $a$ to $c$), but beyond that, in general
- there’s no reason it should be possible to transition from $a$ to $b$ at all,
- there’s no reason the cost to transition from $a$ to $b$ should be the same as the cost to transition from $b$ to $a$, and
- there’s no reason it should cost a positive amount to transition between different states.
What about weak contractions? Well, keeping our physicist hats on, if we think of the points in a Lawvere metric space as being specified by the values of certain observable features of the state of a physical system and the “cost” or “energy” as a certain function of these observables, then it is natural to think about what happens if we “forget” some of these observables. This identifies some of the states and also, intuitively speaking, removes part of the “cost” function, which is more or less what a surjective weak contraction ought to do (and any weak contraction is surjective onto its image).
“The” category of Banach spaces II
One advantage of working in the category $\text{Ban}_1$ is that any categorical constructions (e.g. products, coproducts) are guaranteed to spit out a Banach space with a unique norm rather than a norm which is unique up to bi-Lipschitz equivalence. So what kind of categorical constructions are available?

The first thing to observe is that $\text{Ban}_1$ has an unexpected forgetful functor to $\text{Set}$, namely $\text{Ban}_1(\mathbb{R}, -)$, which gives the unit ball of a Banach space rather than the entire space.
Remark. It is possible to think of the unit ball of a Banach space as a kind of algebraic object in its own right, which motivates this construction. Namely, the unit ball of a Banach space is naturally a totally convex space, which is roughly speaking the “closest algebraic theory to Banach spaces.”
Proposition: The forgetful functor above has left adjoint $S \mapsto \ell^1(S)$.

Proof. We want to show that there is a natural identification

$$\text{Ban}_1(\ell^1(S), V) \cong \text{Set}(S, \text{Ban}_1(\mathbb{R}, V))$$

(where above we denote by $C(a, b)$ the morphisms between two objects $a, b$ of a category $C$. Formerly I used the notation $\text{Hom}_C(a, b)$ for this but this gets clunky.) Given a map $f : S \to \text{Ban}_1(\mathbb{R}, V)$ from a set $S$ to the unit ball of $V$, it clearly linearly extends to a map $\mathbb{R}[S] \to V$, where $\mathbb{R}[S]$ denotes the free vector space on $S$ (note that if $S$ is infinite we cannot identify this with $\mathbb{R}^S$). In order to place a norm on $\mathbb{R}[S]$ such that the linear extension above has operator norm at most $1$, we should give each $s \in S$ norm $1$ since $f(s)$, which lies in the unit ball, has norm at most $1$ and equality may occur. The image $\sum_s c_s f(s)$ of a linear combination $\sum_s c_s s$ has norm at most $\sum_s |c_s|$ by the triangle inequality, and equality may occur, so we should actually give $\sum_s c_s s$ the norm $\sum_s |c_s|$. But this is just the $\ell^1$ norm on the space of compactly supported functions $S \to \mathbb{R}$, so its completion is $\ell^1(S)$.
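The key inequality in this proof is easy to test on a small example. The sketch below takes a hypothetical three-element set $S$, a hypothetical map `f` into the unit ball of $\mathbb{R}^2$ with the Euclidean norm, and checks that the linear extension does not increase norms when the source carries the $\ell^1$ norm; all names are illustrative.

```python
import math

def l1_norm(coeffs):
    """ell^1 norm of a finitely supported function S -> R, stored as a dict."""
    return sum(abs(c) for c in coeffs.values())

def extend_linearly(f, coeffs):
    """Send sum_s c_s * s to sum_s c_s * f(s), where each f(s) is a vector in R^2."""
    x = sum(c * f[s][0] for s, c in coeffs.items())
    y = sum(c * f[s][1] for s, c in coeffs.items())
    return (x, y)

def euclidean_norm(v):
    return math.hypot(v[0], v[1])

# A hypothetical map from S = {"a", "b", "c"} into the unit ball of Euclidean R^2.
f = {"a": (1.0, 0.0), "b": (0.0, -1.0), "c": (0.6, 0.8)}

coeffs = {"a": 2.0, "b": -1.0, "c": 0.5}
image_norm = euclidean_norm(extend_linearly(f, coeffs))
source_norm = l1_norm(coeffs)     # 2 + 1 + 0.5

# Equality case: all of the weight sits on a point sent to a unit vector.
aligned = {"a": 2.0}
aligned_image_norm = euclidean_norm(extend_linearly(f, aligned))
```

Here `image_norm` is at most `source_norm`, and for `aligned` the two norms agree, which is the equality case that forces the $\ell^1$ norm.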
The above argument may be summarized as “$\ell^1$ has a universal property because it describes the equality case of the triangle inequality.”

Remark. $\ell^1(S)$ is a space of functions on $S$, which is an essentially contravariant construction. The functor above is supposed to be covariant, so it is better to think of it as the completion of $\mathbb{R}[S]$ or some confusion might result about what it does to morphisms. See this math.SE question for a thorough discussion of this point.

Since $\ell^1$ is a left adjoint, it preserves coproducts, so it follows that $\ell^1(S)$ is the coproduct of $|S|$ copies of $\ell^1(\text{pt}) \cong \mathbb{R}$ (the free Banach space on one element). This suggests more generally that coproducts ought to exist in the category of Banach spaces, and indeed they do.
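Proposition: $\text{Ban}_1$ has all small coproducts.

Proof. Let $V_i, i \in I$ be a family of Banach spaces. We want to construct a Banach space $\bigsqcup_i V_i$ so that we have natural identifications

$$\text{Ban}_1 \left( \bigsqcup_i V_i, W \right) \cong \prod_i \text{Ban}_1(V_i, W).$$

The same remarks above about the equality case of the triangle inequality apply and give the following: take the direct sum $\bigoplus_i V_i$ equipped with the “$\ell^1$-norm”

$$\left\| \sum_i v_i \right\| = \sum_i \| v_i \|$$

where $v_i \in V_i$, and complete it. From any family of weak contractions $f_i : V_i \to W$ this naturally produces a weak contraction $\bigsqcup_i V_i \to W$ and vice versa as desired.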
Given the above we might even suspect that $\text{Ban}_1$ has all small colimits.

Proposition: $\text{Ban}_1$ has all coequalizers.

Proof. The coequalizer of a pair of morphisms $f, g : V \to W$ is just the cokernel of $\frac{f - g}{2}$, so it suffices to construct cokernels. The cokernel of a map $f : V \to W$ is in turn just the quotient $W / \overline{f(V)}$ of $W$ by the closure of the image of $f$.

Remark. Generally, in any $\text{Ab}$-enriched category, the coequalizer of a pair of morphisms $f, g$ is just the cokernel of $f - g$. The category of Banach spaces is not $\text{Ab}$-enriched because the sum of two weak contractions is not in general a weak contraction (hence the factor of $\frac{1}{2}$ above), but it is quite close: since its morphisms are unit balls in certain Banach spaces, it is enriched over the category of totally convex spaces, so even if we can’t take arbitrary finite linear combinations of morphisms we can still at least take appropriately bounded (infinite!) linear combinations.
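To spell out why halving the difference is harmless, here is a short worked computation for two weak contractions $f, g : V \to W$ (so $\| f \|, \| g \| \le 1$). The halved difference is again a weak contraction, and it has the same kernel and the same image closure as $f - g$:

```latex
\left\| \tfrac{f - g}{2} \right\|
  \le \tfrac{1}{2} \left( \| f \| + \| g \| \right)
  \le \tfrac{1}{2} (1 + 1) = 1,
\qquad
\ker \left( \tfrac{f - g}{2} \right) = \{ v \in V : f(v) = g(v) \},
\qquad
\overline{\operatorname{im} \left( \tfrac{f - g}{2} \right)} = \overline{\operatorname{im}(f - g)}.
```

So the rescaled map is a legitimate morphism of norm at most $1$ whose kernel and cokernel are what one would compute from $f - g$ itself.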
We now appeal to the following general result.
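Proposition: Let $C$ be a category which has all small coproducts and coequalizers. Then $C$ is cocomplete (it has all small colimits). (Dually, if a category has small products and equalizers, then it is complete.)

Proof. Let $F : J \to C$ be a diagram where $J$ is small. Recall that a colimit of this diagram is an object $c$ together with maps $\phi_j : F(j) \to c$ such that $\phi_{j'} \circ F(f) = \phi_j$ for any morphism $f : j \to j'$, and which is initial with respect to this property. This is equivalent to the statement that $c$ is a coequalizer of the diagram

$$\alpha, \beta : \bigsqcup_{f \in \text{Mor}(J)} F(s(f)) \rightrightarrows \bigsqcup_{j \in \text{Ob}(J)} F(j)$$

where $\text{Ob}(J)$ and $\text{Mor}(J)$ are the objects and morphisms in $J$, $s(f)$ is the domain $j$ of a morphism $f : j \to j'$, $\alpha$ acts as the identity map $F(s(f)) \to F(s(f))$ on components, and $\beta$ acts by $F(f) : F(s(f)) \to F(t(f))$ (where $t(f)$ is the codomain $j'$ of $f$) on components. Since $J$ is small, both the coproducts above exist by assumption, and so does the coequalizer above.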
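The same recipe can be carried out by hand in the category of finite sets, where the coproduct is a disjoint union and the coequalizer is a quotient by a generated equivalence relation. The sketch below is an illustration in $\text{Set}$, not in the category of Banach spaces; the function name and the encoding of a diagram as dictionaries are assumptions made for the example. Identity and composite morphisms may be omitted from the list since they impose no new identifications.

```python
def colimit_of_finite_sets(objects, morphisms):
    """Colimit of a diagram of finite sets.

    objects:   dict mapping an object name j to the finite set F(j)
    morphisms: list of (source, target, mapping), with mapping a dict F(j) -> F(j')
    Returns the set of equivalence classes, each a frozenset of tagged elements (j, x).
    """
    # Coproduct: the disjoint union, with elements tagged by the object they come from.
    parent = {(j, x): (j, x) for j, elems in objects.items() for x in elems}

    def find(a):
        # Union-find lookup with path halving.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    # Coequalizer: identify (j, x) with (j', F(f)(x)) for every morphism f: j -> j'.
    for source, target, mapping in morphisms:
        for x in objects[source]:
            a, b = find((source, x)), find((target, mapping[x]))
            if a != b:
                parent[a] = b

    classes = {}
    for a in parent:
        classes.setdefault(find(a), set()).add(a)
    return {frozenset(c) for c in classes.values()}

# A pushout: glue two copies of {0, 1, 2} along the single point "*".
objects = {"A": {"*"}, "B": {0, 1, 2}, "C": {0, 1, 2}}
morphisms = [("A", "B", {"*": 0}), ("A", "C", {"*": 2})]
pushout = colimit_of_finite_sets(objects, morphisms)
```

In the pushout, the point `"*"` is glued to `0` in the first copy and `2` in the second, and the remaining four points stay separate.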
Corollary: $\text{Ban}_1$ is cocomplete.

Given that $\text{Ban}_1$ “behaves like an algebraic theory” (namely the theory of totally convex spaces whose operations are certain infinite sums) this is perhaps not too surprising. We might expect the same to be true of limits for the same reason.
Proposition: $\text{Ban}_1$ has all small products.

Proof. Let $V_i, i \in I$ be a family of Banach spaces. We want to construct a Banach space $\prod_i V_i$ so that we have natural identifications

$$\text{Ban}_1 \left( W, \prod_i V_i \right) \cong \prod_i \text{Ban}_1(W, V_i).$$

In particular, setting $W = \mathbb{R}$, we observe that $\prod_i V_i$ needs to have a unit ball which is naturally identified with the product of the unit balls of the $V_i$. This suggests the following: we will take $\prod_i V_i$ to be the subspace of the Cartesian product of the $V_i$ which is bounded in the sense that the “$\ell^{\infty}$-norm”

$$\| (v_i) \| = \sup_i \| v_i \|$$

exists. This subspace is already complete. Its fundamental property is that an element has norm at most $1$ if and only if all of its components do, and the universal property follows straightforwardly from this.

Note that it follows that $\ell^{\infty}(S)$ is the product of $|S|$ copies of $\mathbb{R}$.
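For a finite family the two constructions live on the same underlying vector space and differ only in the norm, which makes the contrast easy to see. A minimal sketch, assuming a hypothetical family consisting of $\mathbb{R}$ with the absolute value and $\mathbb{R}^2$ with the Euclidean norm (all names illustrative):

```python
import math

def product_norm(components, norms):
    """Sup-type norm on a finite product: the largest component norm."""
    return max(norm(v) for v, norm in zip(components, norms))

def coproduct_norm(components, norms):
    """ell^1-type norm on a finite direct sum: the sum of the component norms."""
    return sum(norm(v) for v, norm in zip(components, norms))

abs_norm = abs                        # R with the absolute value
euclid = lambda v: math.hypot(*v)     # R^2 with the Euclidean norm
norms = [abs_norm, euclid]

inside = [0.9, (0.3, 0.4)]     # both components lie in their unit balls
outside = [0.2, (1.0, 1.0)]    # the second component has norm sqrt(2) > 1
```

The element `inside` lies in the unit ball of the product, because each component lies in its own unit ball, but not in the unit ball of the coproduct, where the component norms add up to more than $1$.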
Proposition: $\text{Ban}_1$ has all equalizers.

Proof. The equalizer of a pair of morphisms $f, g : V \to W$ is just the kernel of $\frac{f - g}{2}$, so it suffices to construct kernels. The kernel of a map $f : V \to W$ is in turn just its kernel in the ordinary algebraic sense (since $f$ is continuous, it is already closed).

Corollary: $\text{Ban}_1$ is complete.
The closed monoidal category of Banach spaces
$\text{Ban}_1$, as we saw above, has nice categorical properties. But it is undeniable that in practice we want to consider bounded linear operators of norm greater than $1$, and we don’t want to have to normalize them in order to consider them as morphisms.

Fortunately, we don’t have to. The space $V \Rightarrow W$ of bounded linear operators between two Banach spaces $V, W$ may be regarded as a bifunctor (contravariant in $V$, covariant in $W$)

$$\Rightarrow \, : \text{Ban}_1^{op} \times \text{Ban}_1 \to \text{Ban}_1$$

instead. This bifunctor has all the properties that morphisms in a category satisfy, which are summarized by the axioms of a closed category; we call such a functor an internal hom. Among other things, there is a distinguished object $I$ such that there is a natural isomorphism $(I \Rightarrow V) \cong V$; moreover, $\text{Hom}(I, V \Rightarrow W) \cong \text{Hom}(V, W)$.
The prototypical example of a category with internal homs is $\text{Set}$, where $A \Rightarrow B$ is just the set of functions from $A$ to $B$. An example where the distinction between the ordinary hom and the internal hom is clearer is the category $G\text{-Set}$ of $G$-sets for $G$ a group; here the internal hom is the set of functions between two $G$-sets, which is itself a $G$-set. $I$ is the trivial $G$-set, $\text{Hom}(I, -)$ is the functor sending a $G$-set to its subset of fixed points, and so $\text{Hom}(I, A \Rightarrow B)$ is the set of morphisms between two $G$-sets as expected.

A linear version of this family of examples is given by the categories of representations of groups $\text{Rep}(G)$. There the internal hom is given by the space of linear maps between two representations, which is itself a representation. $\text{Hom}(I, -)$ in this context is the functor sending a representation to its invariant subspace, and the invariant subspace of the internal hom is the space of morphisms between two representations.
A feature that the above examples of internal homs have in common is that there exists a monoidal operation $\otimes$ and a natural identification

$$\text{Hom}(A \otimes B, C) \cong \text{Hom}(A, B \Rightarrow C)$$

called the tensor-hom adjunction. This says precisely that $- \otimes B$ is left adjoint to $B \Rightarrow -$, or equivalently that $B \Rightarrow -$ is right adjoint to $- \otimes B$. This relationship, which holds in very general contexts, allows us to characterize various tensor products we care about in terms of various internal homs we care about or vice versa.

In $\text{Set}$ (and in $G\text{-Set}$) the monoidal operation is just the categorical product and this is one of the axioms describing a Cartesian closed category. In $\text{Rep}(G)$ the monoidal operation is the tensor product and this is one of the axioms describing a closed monoidal category.
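Does such a monoidal operation exist for $\text{Ban}_1$? If it does, we can make sense of the statement that $\Rightarrow$ enriches $\text{Ban}_1$ over itself (since we need to define composition maps

$$(U \Rightarrow V) \otimes (V \Rightarrow W) \to (U \Rightarrow W)$$

to do this). As it turns out, yes. We will ask for a stronger statement to hold, namely

$$(U \otimes V) \Rightarrow W \cong U \Rightarrow (V \Rightarrow W)$$

so that taking unit balls of both sides recovers the natural identification we want. The space on the right is precisely the space of bilinear maps $B : U \times V \to W$ which are continuous in both variables with norm given by

$$\| B \| = \sup_{\| u \| = \| v \| = 1} \| B(u, v) \|.$$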
This suggests that we ought to define $U \otimes V$ in the same way that we define the tensor product of vector spaces. A first thought is to start with the ordinary tensor product and complete it with respect to some norm, but the question of exactly what norm to place on the tensor product is somewhat subtle and there is more than one reasonable choice; see Wikipedia for a discussion.

Fortunately, the tensor product in the above sense has a norm which is already uniquely specified by its adjunction with $\Rightarrow$, so in this sense the categorical machinery at work here picks out a unique tensor product, if it exists. But does it exist?

If we want to imitate the construction of the tensor product of vector spaces, we should complete earlier. Recall that the ordinary tensor product of two vector spaces $U, V$ is the quotient of the free vector space on formal symbols $u \otimes v$ by the relations

$$(u_1 + u_2) \otimes v = u_1 \otimes v + u_2 \otimes v, \qquad u \otimes (v_1 + v_2) = u \otimes v_1 + u \otimes v_2,$$

$$(c u) \otimes v = c (u \otimes v) = u \otimes (c v).$$

When repeating this construction in the Banach space setting, we should give the formal symbol $u \otimes v$ the norm $\| u \| \, \| v \|$ by bilinearity and compatibility with the norm used above on bilinear maps. After completing this space with respect to this norm, we should then quotient by the closure of the space spanned by the relations above. This gives, if I’m not mistaken, the projective tensor product of two Banach spaces.
With this tensor product $\text{Ban}_1$ becomes a closed monoidal category. Moreover, for any Banach space $V$, the space $V \Rightarrow V$ of bounded linear operators becomes a monoid object in the category of Banach spaces, or a Banach algebra, which we’ll turn to soon.
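Concretely, the monoid structure is composition of operators, and the requirement that composition be a morphism of norm at most $1$ amounts to the submultiplicative inequality $\| S T \| \le \| S \| \, \| T \|$. A minimal numerical sketch in finite dimensions, assuming the supremum norm on $\mathbb{R}^n$ (so that the operator norm is the largest absolute row sum); the helper names and random test matrices are illustrative.

```python
import random

def operator_norm_sup(T):
    """Operator norm for the sup norm on R^n: the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in T)

def compose(S, T):
    """Matrix of the composite v -> S(T(v)); S and T are lists of rows."""
    columns = list(zip(*T))
    return [[sum(a * b for a, b in zip(row, col)) for col in columns] for row in S]

rng = random.Random(1)

def random_matrix(n):
    return [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]

pairs = [(random_matrix(4), random_matrix(4)) for _ in range(200)]

# ||S T|| <= ||S|| ||T|| on every sampled pair (small tolerance for rounding).
submultiplicative = all(
    operator_norm_sup(compose(S, T))
    <= operator_norm_sup(S) * operator_norm_sup(T) + 1e-12
    for S, T in pairs
)
```

A random check like this proves nothing, of course; the inequality itself follows directly from $\| S T v \| \le \| S \| \, \| T v \| \le \| S \| \, \| T \| \, \| v \|$.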
You say an enriched functor gives a collection of maps between hom-objects. Perhaps a (single) morphism in the enrichment category between the hom-objects might be a better definition?
I’m not sure what you mean by this. The hom-objects are still indexed by their sources and targets, and the objects still form a class.
I read the sentence as Hom(a,b) -> Hom(F(a),F(b)) (for fixed a,b) was the collection of maps. I think I was thrown by the word “map”. But I see now how it is meant. Thanks.