Our route towards this result will turn out to pass through all of the most common types of characteristic classes: we’ll invoke, in order, Euler classes, Chern classes, Pontryagin classes, Wu classes, and Stiefel-Whitney classes.

**Examples in the plane**

Recall that a smooth projective hypersurface of degree $d$ is a projective variety cut out by a single homogeneous polynomial $f(x_0, \dots, x_n)$ of degree $d$ which is smooth. This is the case if and only if the partial derivatives $\frac{\partial f}{\partial x_i}$ have no zeroes in common with $f$ in $\mathbb{CP}^n$. Such a variety has complex dimension $n - 1$, hence real dimension $2n - 2$.

*Example.* When $n = 2$ we are considering smooth projective curves in the projective plane $\mathbb{CP}^2$. Examples are given by the Fermat curves

$$x^d + y^d = z^d.$$

Topologically, these are compact connected oriented surfaces, and hence their homeomorphism and even diffeomorphism type is completely determined by the rank of their first homology, or equivalently by their genus $g$. The genus-degree formula asserts that the genus of a plane curve of degree $d$ is $\frac{(d-1)(d-2)}{2}$.

*Subexample.* When $d = 1$ or $d = 2$ the genus is $0$, so we just get projective lines $\mathbb{CP}^1$, or topologically we get $2$-spheres $S^2$. When $d = 3$ the genus is $1$, so we get elliptic curves (after choosing identities), or topologically we get tori $T^2$.

There is a nice heuristic proof of the genus-degree formula (which can be made rigorous; see this MO discussion) which goes as follows. First consider the singular curve of degree $d$ given by $d$ lines in general position, so that every pair of lines intersects exactly once but otherwise there are no intersections. Topologically this gives a collection of $d$ spheres each pairwise intersecting in a point. If we perturb the coefficients of the singular curve, it will become smooth; topologically the spheres become pairwise connected by tubes. After using $d - 1$ of these tubes to connect the spheres in a line, to obtain a sphere, the remaining $\binom{d}{2} - (d - 1) = \binom{d-1}{2}$ tubes each increase the genus of the resulting surface by $1$.
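The tube-counting heuristic is easy to check numerically. The following is a minimal sketch (the function names are hypothetical, chosen only for illustration) comparing the count of leftover tubes with the closed form $\frac{(d-1)(d-2)}{2}$.

```python
from math import comb

def genus_by_tubes(d: int) -> int:
    """Genus from the heuristic: d spheres, one tube per pair of lines."""
    tubes = comb(d, 2)      # one tube per intersection point of two lines
    chaining = d - 1        # tubes used to chain the d spheres into one sphere
    return tubes - chaining # each remaining tube adds one handle

def genus_by_formula(d: int) -> int:
    """Genus-degree formula (d-1)(d-2)/2."""
    return (d - 1) * (d - 2) // 2

# The two counts agree for small degrees.
for d in range(1, 10):
    assert genus_by_tubes(d) == genus_by_formula(d)
```

For instance, lines and conics give genus $0$ and cubics give genus $1$, matching the subexample above.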

**An aside**

The following is not necessary for the computation to come but is nevertheless a nice explanation of a particular aspect of how it turns out. Eventually we’ll show that the cohomology of a smooth projective hypersurface depends only on the degree and the dimension of the ambient projective space, and this is explained by the fact that an even stronger statement than this holds.

**Theorem:** The diffeomorphism type of a smooth projective hypersurface of degree $d$ in $\mathbb{CP}^n$ depends only on $d$ and $n$.

*Remark.* This statement cannot be strengthened to a statement about isomorphism in the holomorphic / algebraic category, as the example of cubic curves in $\mathbb{CP}^2$ already shows.

*Rough sketch.* The idea is that slightly perturbing the coefficients of a homogeneous polynomial does not affect the diffeomorphism type of the hypersurface it cuts out, and moreover that the space of homogeneous polynomials defining a smooth hypersurface is the complement of a subvariety (the subvariety of polynomials sharing at least one zero with their partial derivatives). This subvariety has real codimension at least $2$, so its complement is path-connected, and we can perturb the coefficients of any such polynomial to get any other such polynomial.

*Proof.* Let $V$ be a complex vector space of dimension $n + 1$, so that we can identify $\mathbb{CP}^n$ with $\mathbb{P}(V)$. A homogeneous polynomial of degree $d$ on $V$ is an element of $S^d(V^{\ast})$, but since we’re only looking at the hypersurface cut out by such a polynomial we can ignore the zero polynomial and scaling, so we are really looking at an element of $\mathbb{P}(S^d(V^{\ast}))$. Now let

$$U \subset \mathbb{P}(S^d(V^{\ast}))$$

be the complement in $\mathbb{P}(S^d(V^{\ast}))$ of the singular locus of polynomials having a zero in common with their partial derivatives, and let

$$E = \{ (v, f) \in \mathbb{P}(V) \times U : f(v) = 0 \}.$$

The space $E$ admits a projection map $\pi : E \to U$ onto the second coordinate, and the hypersurface cut out by $f \in U$ is precisely the fiber $\pi^{-1}(f)$.

Our goal is to show that the fibers of this map are diffeomorphic by applying Ehresmann’s theorem to it, which tells us that $\pi$ is a locally trivial smooth fibration provided that it is a proper surjective submersion. This implies in particular that the fibers are all diffeomorphic if $U$ is path-connected.

We’ll divide up the rest of the proof into the following steps.

**Step 1:** The base $U$ of the projection is a path-connected smooth manifold. More generally the following is true.

**Proposition:** Let $Z \subset \mathbb{CP}^N$ be a Zariski-closed subset. Then $\mathbb{CP}^N \setminus Z$ is a path-connected smooth manifold.

*Proof.* A Zariski-closed subset is in particular closed, so $\mathbb{CP}^N \setminus Z$ is an open subset of a smooth manifold and hence a smooth manifold. The key point for path-connectedness is that $Z$ has real codimension at least $2$, but we can avoid explicitly using this fact as follows. Any two distinct points $p, q \in \mathbb{CP}^N \setminus Z$ determine a complex line $\mathbb{CP}^1$ passing through them. The intersection of this complex line with $Z$ is finite, since it is a Zariski-closed subset of $\mathbb{CP}^1$ but not the whole thing. Now $\mathbb{CP}^1 \cong S^2$ minus a finite set of points is path-connected, so $p$ and $q$ can be connected by a path lying inside $\mathbb{CP}^N \setminus Z$ as desired.

It remains to show that the singular locus of polynomials having a zero in common with their partial derivatives is Zariski-closed, but this is a corollary of the existence of the multivariate resultant of the polynomials $\frac{\partial f}{\partial x_0}, \dots, \frac{\partial f}{\partial x_n}$, which is a polynomial in the coefficients vanishing iff the polynomials have a common zero, together with the identity

$$\sum_{i=0}^n x_i \frac{\partial f}{\partial x_i} = d \cdot f$$

showing that if all of the $\frac{\partial f}{\partial x_i}$ vanish at a point then so does $f$.

**Step 2:** The total space $E = \{ (v, f) : f(v) = 0 \}$ is a smooth manifold. To start with, we’ll work locally. On the open subset where $x_0 \neq 0$ and the coefficient of $x_0^d$ in $f$ is also nonzero, $E$ is locally the zero locus of the function

$$(y_1, \dots, y_n, g) \mapsto g(y_1, \dots, y_n)$$

where $y_i = \frac{x_i}{x_0}$ and $g$ is the dehomogenization of $f$, scaled so that the constant coefficient (the coefficient of $x_0^d$ in $f$) is $1$. Fixing $g$, the differential of this map in the $y_i$ has coefficients the partial derivatives $\frac{\partial g}{\partial y_i}$, and since by assumption we’ve removed the singular hypersurfaces, at least one of these partial derivatives must be nonzero, so by the regular value theorem the zero locus is locally a smooth manifold. Running this argument with $x_0$ replaced by any $x_i$ and $x_0^d$ replaced by any monomial of degree $d$, we get that $E$ is a smooth manifold as desired.

**Step 3:** The projection $\pi : E \to U$ is a submersion. $\pi$ is clearly surjective and proper (since hypersurfaces are compact), so this is the only interesting step remaining. Again working locally and on the open subset where $x_0 \neq 0$ and the coefficient of $x_0^d$ in $f$ is nonzero, $\pi$ locally takes the form

$$(y_1, \dots, y_n, g) \mapsto g$$

where again $g$ is the dehomogenization scaled to have constant coefficient $1$, and $g(y_1, \dots, y_n) = 0$. To show that $\pi$ is surjective on tangent spaces it suffices to show that any infinitesimal deformation in the coefficients of $g$ can be canceled out by a corresponding deformation in the $y_i$ so that the relation $g(y_1, \dots, y_n) = 0$ continues to hold (this is what it means to lift a tangent vector from our target to our source). But this is precisely guaranteed by the condition that at least one of the partial derivatives $\frac{\partial g}{\partial y_i}$ is nonzero. Again, running this argument with all of the other coordinates and monomials we get the result.

*Remark.* A simpler version of this argument can be used to give a proof of the fundamental theorem of algebra. The rough sketch here is to argue 1) that the space of polynomials with nonzero discriminant is connected, 2) that the number of roots of a polynomial with nonzero discriminant does not change when you perturb its coefficients, and 3) that establishing the fundamental theorem for polynomials with nonzero discriminant establishes it for all polynomials, since if $f$ is any polynomial then $\frac{f}{\gcd(f, f')}$ has the same roots as $f$ and has nonzero discriminant, or equivalently is squarefree.

**Most of the cohomology**

Below all cohomologies are with integer coefficients unless otherwise stated.

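Let $X$ be a smooth projective hypersurface of degree $d$ in $\mathbb{CP}^n$. Most of the cohomology of $X$ is determined by the Lefschetz hyperplane theorem, as follows. Thinking again of $\mathbb{CP}^n$ as $\mathbb{P}(V)$ where $\dim V = n + 1$, we have a Veronese embedding

$$\mathbb{P}(V) \ni v \mapsto v^d \in \mathbb{P}(S^d(V))$$

and, essentially by definition, $X$ is the intersection of the image of the Veronese embedding with a hyperplane in $\mathbb{P}(S^d(V))$. The Lefschetz hyperplane theorem then guarantees that the natural map $H^k(\mathbb{CP}^n) \to H^k(X)$ is an isomorphism for $k \le n - 2$ and an injection for $k = n - 1$. Recalling that

$$H^{\bullet}(\mathbb{CP}^n) \cong \mathbb{Z}[h]/h^{n+1}, \quad \deg h = 2$$

we conclude that $H^k(X)$ is $\mathbb{Z}$ if $k$ is even and $0$ otherwise for all $k \le n - 2$. Moreover, since $X$, by virtue of being a compact complex manifold, is in particular a compact oriented manifold, we can apply Poincaré duality to conclude that the same is true of $H^k(X)$ for $n \le k \le 2n - 2$. That is,

$$H^k(X) \cong \begin{cases} \mathbb{Z} & k \text{ even} \\ 0 & k \text{ odd} \end{cases} \qquad (0 \le k \le 2n - 2, \ k \neq n - 1)$$

and so the only remaining question is what the middle cohomology $H^{n-1}(X)$ looks like. So far all we know is that $H^{n-1}(\mathbb{CP}^n)$ injects into it; this is $0$ if $n - 1$ is odd but $\mathbb{Z}$ if $n - 1$ is even.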

**Reduction to the Euler characteristic**

We claim that to compute the middle cohomology of $X$ it suffices to compute its Euler characteristic $\chi(X)$. First, recall that a compact manifold has finitely generated cohomology. It follows that $X$ has a well-defined Euler characteristic. Since we know all of the Betti numbers except one, computing the Euler characteristic will tell us the remaining Betti number. Explicitly, our computations above give

$$\chi(X) = \begin{cases} n - b_{n-1} & n \text{ even} \\ n - 1 + b_{n-1} & n \text{ odd}. \end{cases}$$

However, we still need to rule out the possibility of torsion in the middle cohomology in order to be confident that knowing the Betti number is enough. We can do this using the universal coefficient theorem, which gives a short exact sequence

$$0 \to \text{Ext}^1(H_{n-2}(X), \mathbb{Z}) \to H^{n-1}(X) \to \text{Hom}(H_{n-1}(X), \mathbb{Z}) \to 0.$$

The group on the right is torsion-free because it is given by homomorphisms into a torsion-free group, and the group on the left is torsion-free because it vanishes: $H_{n-2}(X)$ is free by another part of the Lefschetz hyperplane theorem, hence has no nontrivial extensions. It follows that $H^{n-1}(X)$ is free abelian, so is determined by its rank $b_{n-1}$.

**The Euler characteristic via Chern classes**

Recall that the Euler characteristic of a compact oriented smooth manifold $M$ can be computed as the evaluation of the Euler class $e(TM)$ of its tangent bundle on the fundamental class $[M]$. (Since the Euler class of a vector bundle can be thought of as Poincaré dual to the zero locus of a generic section, this can be thought of as a restatement of the Poincaré-Hopf theorem.)

On a compact complex manifold $M$, the tangent bundle has a complex structure and hence Chern classes $c_i(TM)$. It is common to refer to these, and to notate them, as the Chern classes $c_i(M)$ of $M$ itself. Moreover, the top Chern class is the Euler class. Hence one way to compute the Euler characteristic of a compact complex manifold is to compute its top Chern class, which is the approach we will take: in fact we will compute all Chern classes.

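We will first need to compute the Chern classes of $\mathbb{CP}^n$. The key tool is the Euler sequence

$$0 \to \mathbb{C} \to \mathcal{O}(1)^{n+1} \to T\mathbb{CP}^n \to 0$$

where $\mathbb{C}$ is the trivial line bundle and $\mathcal{O}(1)$ is the dual of the tautological line bundle $\mathcal{O}(-1)$ whose fiber at a point in $\mathbb{CP}^n$ is the line in $\mathbb{C}^{n+1}$ it represents; equivalently, $\mathcal{O}(1)$ is the line bundle whose holomorphic sections are homogeneous polynomials of degree $1$. Since the total Chern class is multiplicative in exact sequences, we get

$$c(T\mathbb{CP}^n) = c(\mathcal{O}(1))^{n+1} = (1 + h)^{n+1}$$

where $h = c_1(\mathcal{O}(1))$ is a generator of the cohomology ring $H^{\bullet}(\mathbb{CP}^n) \cong \mathbb{Z}[h]/h^{n+1}$. It follows that the Chern classes of $\mathbb{CP}^n$ are given by

$$c_k(\mathbb{CP}^n) = \binom{n+1}{k} h^k.$$

(In particular, the top Chern class is $(n + 1) h^n$, which when evaluated on the fundamental class gives the Euler characteristic $n + 1$ as expected.)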

To get from here to the Chern classes of a hypersurface $X$ we need to relate the two tangent bundles, which we do via the short exact sequence

$$0 \to TX \to T\mathbb{CP}^n|_X \to N \to 0$$

of vector bundles on $X$, where $N$ is the normal bundle.

Now, it turns out that the normal bundle $N$ is the restriction to $X$ of the line bundle $\mathcal{O}(d)$ whose holomorphic sections are homogeneous polynomials of degree $d$; this is essentially the content of the adjunction formula. Roughly speaking this is because $X$ is defined as the zero locus of a nonzero section of $\mathcal{O}(d)$, and the actual map $T\mathbb{CP}^n|_X \to \mathcal{O}(d)|_X$ can be thought of as the derivative of this section, although I’m not sure how to make this precise.

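In particular, since $\mathcal{O}(d) \cong \mathcal{O}(1)^{\otimes d}$, the total Chern class of the normal bundle $N$ is given by $1 + dh$, and hence the total Chern class of $X$ is

$$c(X) = \frac{(1 + h)^{n+1}}{1 + dh}$$

where by abuse of notation $h$ denotes the pullback of our previously chosen generator of $H^2(\mathbb{CP}^n)$ to $H^2(X)$. We can now compute that the top Chern class of $X$ is

$$c_{n-1}(X) = \sum_{k=0}^{n-1} \binom{n+1}{k} (-d)^{n-1-k} h^{n-1}.$$

It remains to evaluate $h^{n-1}$ on the fundamental class of $X$. Now, $h^{n-1}$ is Poincaré dual to the intersection of $n - 1$ generic hyperplanes in $\mathbb{CP}^n$, which give a copy of $\mathbb{CP}^1$, and since $X$ is a hypersurface of degree $d$, intersecting it with a generic line gives $d$ points, so we conclude that $h^{n-1}[X] = d$ and hence that

$$\chi(X) = d \sum_{k=0}^{n-1} \binom{n+1}{k} (-d)^{n-1-k} = \frac{(1 - d)^{n+1} - 1}{d} + n + 1$$

which gives our desired computation of the rank of the middle cohomology:

$$b_{n-1} = (-1)^{n-1} \left( \frac{(1 - d)^{n+1} - 1}{d} + 1 \right) + \frac{1 + (-1)^{n-1}}{2}.$$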

*Example.* Let $n = 2$. As mentioned above, in this case $X$ is topologically a compact oriented surface. The Betti numbers of $X$ are $b_0 = b_2 = 1$, and

$$b_1 = (d - 1)(d - 2),$$

and since $b_1 = 2g$ we recover the genus-degree formula.
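The Chern class computation above is mechanical enough to run. Here is a minimal sketch (function names are hypothetical) that expands $\frac{(1+h)^{n+1}}{1+dh}$ as a truncated power series in $h$, reads off the top coefficient, multiplies by $h^{n-1}[X] = d$ to get the Euler characteristic, and then solves for the middle Betti number using the fact that every other even-degree cohomology group is $\mathbb{Z}$.

```python
from math import comb

def chern_coefficients(n: int, d: int) -> list[int]:
    """Coefficients of h^0, ..., h^(n-1) in (1+h)^(n+1) / (1+d*h)."""
    top = n - 1
    numer = [comb(n + 1, k) for k in range(top + 1)]   # (1+h)^(n+1), truncated
    inverse = [(-d) ** k for k in range(top + 1)]      # 1/(1+dh) = sum (-d)^k h^k
    return [sum(numer[i] * inverse[k - i] for i in range(k + 1))
            for k in range(top + 1)]

def euler_characteristic(n: int, d: int) -> int:
    """Top Chern number: coefficient of h^(n-1) times h^(n-1)[X] = d."""
    return d * chern_coefficients(n, d)[-1]

def middle_betti(n: int, d: int) -> int:
    """Solve chi = (sum of known Betti numbers) +/- b_(n-1) for b_(n-1)."""
    chi = euler_characteristic(n, d)
    if (n - 1) % 2 == 0:
        # Middle degree is even: the other n - 1 even degrees contribute 1 each.
        return chi - (n - 1)
    # Middle degree is odd: all n even degrees contribute 1, the middle one subtracts.
    return n - chi

# Plane curves: b_1 = 2g = (d-1)(d-2).
for d in range(1, 8):
    assert middle_betti(2, d) == (d - 1) * (d - 2)
```

As a sanity check beyond curves, `middle_betti(3, 4)` is the second Betti number of a quartic surface and `middle_betti(4, 3)` is the third Betti number of a cubic threefold.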

*Example.* Rewriting the formula above as

$$b_{n-1} = \frac{(d - 1)^{n+1} + (-1)^{n+1} (d - 1)}{d} + \frac{1 + (-1)^{n+1}}{2}$$

makes it more convenient to do some kinds of computations with. In particular, for $d = 1$ we get

$$b_{n-1} = \frac{1 + (-1)^{n+1}}{2}$$

as expected since in this case $X$ is just $\mathbb{CP}^{n-1}$ and we know its middle cohomology already. For $d = 2$ we get

$$b_{n-1} = 1 + (-1)^{n+1}$$

which is a little more interesting; the resulting hypersurfaces, namely the quadric hypersurfaces, are birational to $\mathbb{CP}^{n-1}$ but not necessarily homotopy equivalent. We’ll identify the quadric hypersurface when $n = 3$ below; when $n = 5$ it turns out to be the Grassmannian of complex planes in $\mathbb{C}^4$, with the embedding into $\mathbb{CP}^5$ being given up to projective change of coordinates by the Plücker embedding.

For $d \ge 3$ by inspection the Betti number grows exponentially in $n$.

**Complex surfaces as 4-manifolds**

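Now let $n = 3$. In this case $X$ is topologically a compact oriented 4-manifold. The Betti numbers of $X$ are $b_0 = b_4 = 1$, $b_1 = b_3 = 0$, and

$$b_2 = \frac{(d - 1)^4 + (d - 1)}{d} + 1 = d^3 - 4d^2 + 6d - 2.$$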

*Example.* When $d = 1$, so that $X$ is $\mathbb{CP}^2$, we get $b_2 = 1$ as expected.

*Example.* When $d = 2$, so that $X$ is a quadric surface, we get $b_2 = 2$; here $X$ is $\mathbb{CP}^1 \times \mathbb{CP}^1$, so diffeomorphic to $S^2 \times S^2$, with the embedding into $\mathbb{CP}^3$ being given up to projective change of coordinates by the Segre embedding.

*Example.* When $d = 3$, so that $X$ is a cubic surface, we get $b_2 = 7$.

*Example.* When $d = 4$, so that $X$ is a quartic surface, we get $b_2 = 22$; in this case $X$ is also a K3 surface.

(When $d \ge 5$, $X$ is a surface of general type.)
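For surfaces the middle Betti number is a cubic polynomial in the degree, which makes the examples above quick to tabulate. The sketch below assumes the closed form $b_2 = d^3 - 4d^2 + 6d - 2$ obtained by specializing the Euler characteristic computation to $n = 3$; the function name is hypothetical.

```python
def b2_surface(d: int) -> int:
    """Second Betti number of a smooth degree-d surface in CP^3."""
    # chi = d^3 - 4d^2 + 6d, and chi = b_0 + b_2 + b_4 = 2 + b_2.
    return d**3 - 4 * d**2 + 6 * d - 2

# Values for the plane, quadric, cubic, and quartic (K3) surfaces.
examples = {1: 1, 2: 2, 3: 7, 4: 22}
for d, expected in examples.items():
    assert b2_surface(d) == expected

for d in range(1, 7):
    print(d, b2_surface(d))
```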

For $n \ge 3$, the homotopy group version of the Lefschetz hyperplane theorem implies that the natural map $\pi_1(X) \to \pi_1(\mathbb{CP}^n)$ is an isomorphism; since the latter is trivial, so is the former. Hence as 4-manifolds, our complex surfaces are compact, oriented, and simply connected.

For such a 4-manifold, once we know its cohomology groups the only additional data of the sort that one usually calculates in a first course in algebraic topology is the cup product, which is completely determined by the **intersection form**

$$H^2(X) \times H^2(X) \ni (a, b) \mapsto (a \smile b)[X] \in \mathbb{Z}$$

where $[X]$ is the fundamental class in $H_4(X)$. Since $2$ is even, the intersection form is symmetric, so gives $H^2(X)$ the structure of an integral **lattice** (that is, a free abelian group equipped with a symmetric bilinear $\mathbb{Z}$-valued form), and by Poincaré duality this lattice is **unimodular**.

Thus identifying invariants of lattices immediately gives us (oriented homotopy) invariants of compact oriented 4-manifolds, and more generally of compact oriented manifolds in dimension $4k$. We’ll focus our attention on three such invariants.

- The **rank** of a lattice is its rank as an abelian group; in the case of 4-manifolds this is just the second Betti number $b_2$.
- The **signature** of a lattice $L$ is the signature of its bilinear form on $L \otimes \mathbb{R}$. More explicitly, by Sylvester’s law of inertia any nondegenerate symmetric bilinear form on a real vector space can be diagonalized so that the corresponding quadratic form is $x_1^2 + \dots + x_{b^+}^2 - y_1^2 - \dots - y_{b^-}^2$ for two integers $b^+, b^-$, which can equivalently be described as the number of positive resp. negative eigenvalues of a matrix describing the bilinear form. The signature is then $b^+ - b^-$; note that the rank is $b^+ + b^-$, so the signature and the rank together determine the ordered pair $(b^+, b^-)$, which is also sometimes called the signature. This gives an invariant of compact oriented manifolds in dimension $4k$ also called the signature and denoted $\sigma$.
- The **parity** of a lattice $L$ is defined as follows: if $\langle x, x \rangle$ is always divisible by $2$, then the lattice is **even**, and otherwise the lattice is **odd**. In other words, where the signature comes from looking at $L \otimes \mathbb{R}$, the parity comes from looking at $L \otimes \mathbb{F}_2$.

*Remark.* In general this is very far from being a complete set of invariants of lattices. In the case that the signature is equal to the rank (so that the lattice is positive definite), the Smith-Minkowski-Siegel mass formula implies that the number of isomorphism classes of lattices grows very rapidly with the rank.

*Remark.* The signature is a particularly interesting invariant: its definition can be extended to manifolds in dimension not divisible by $4$ by declaring the corresponding signatures to be $0$, and then the signature is a genus, although we won’t use this fact.

The intersection form turns out to be a surprisingly strong invariant. Milnor and Whitehead showed that compact, oriented, simply connected 4-manifolds are determined up to oriented homotopy equivalence by their intersection forms as lattices. Freedman showed that every unimodular lattice arises in this way and that the only additional data required to determine such a 4-manifold up to homeomorphism is a class in $H^4(X, \mathbb{F}_2) \cong \mathbb{F}_2$ called the Kirby-Siebenmann invariant; moreover,

- if the lattice is even, then there is a unique corresponding 4-manifold up to homeomorphism with Kirby-Siebenmann invariant $0$, and
- if the lattice is odd, then there are exactly two corresponding 4-manifolds, one with each possible value of the Kirby-Siebenmann invariant.

The Kirby-Siebenmann invariant vanishes whenever a manifold has a smooth structure, and so in the odd case at least one of the two 4-manifolds does not have a smooth structure.

There are also other obstructions to having a smooth structure involving the intersection form. For example, by the above the $E_8$ lattice occurs as the intersection form of a unique homeomorphism class of compact, orientable, simply connected 4-manifold, the $E_8$ manifold. The $E_8$ lattice is positive definite but not diagonalizable, so by Donaldson’s theorem the $E_8$ manifold does not have a smooth structure.

The computations we’ve done so far don’t tell us what the intersection form is. Fortunately, we’ll be able to compute the intersection form, and hence the cup product structure on cohomology, as follows. First, we can compute the signature using the Hirzebruch signature theorem in terms of Pontryagin classes. Second, if the signature is not equal to the rank in absolute value (so the lattice is indefinite) then the possible lattices have been completely classified. There are only two possibilities if the rank and signature are fixed, depending only on the parity:

- if the lattice is odd, it must be the lattice $I_{b^+, b^-}$ of vectors with integer entries in $\mathbb{R}^{b^+, b^-}$, the real vector space equipped with the symmetric bilinear form of signature $(b^+, b^-)$;
- if the lattice is even, the signature must be divisible by $8$, and the lattice must be the lattice $II_{b^+, b^-}$ of vectors in $\mathbb{R}^{b^+, b^-}$ whose entries are either all integers or all integers plus $\frac{1}{2}$ and which sum to an even number.

In other words, for indefinite unimodular lattices the rank, signature, and parity form a complete set of invariants. Hence if we compute that the signature $\sigma$ is not equal to the rank $b_2$ in absolute value, the only additional information we need to determine the lattice is its parity. It will turn out that this is determined by whether the second Stiefel-Whitney class $w_2$ vanishes, or equivalently by whether $X$ admits a spin structure.

**The signature via Pontryagin classes**

Recall that if $V$ is a real vector bundle over a space $B$ then it admits a complexification $V \otimes_{\mathbb{R}} \mathbb{C}$ which is a complex vector bundle, and that the Pontryagin classes of $V$ are characteristic classes defined in terms of the Chern classes of the complexification via

$$p_k(V) = (-1)^k c_{2k}(V \otimes_{\mathbb{R}} \mathbb{C}) \in H^{4k}(B).$$

For a compact smooth oriented 4-manifold $X$, the Hirzebruch signature theorem asserts that the signature is given by

$$\sigma(X) = \frac{p_1(X)[X]}{3}$$

where $p_1(X)$ is the first Pontryagin class

$$p_1(X) = -c_2(TX \otimes_{\mathbb{R}} \mathbb{C})$$

of (the tangent bundle of) $X$ and $[X]$ is the fundamental class as usual. In particular, it implies that the first Pontryagin number $p_1(X)[X]$ is divisible by $3$.

Hence to compute the signature of a hypersurface we need to compute the second Chern class of the complexification of its tangent bundle, regarded as a real vector bundle (whereas above we computed the Chern classes of the tangent bundle, which already had a complex structure). In general we can compute the Chern classes of the complexification of a complex vector bundle in terms of the Chern classes of the original bundle as follows.

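**Theorem:** Let $V$ be a complex vector bundle. Then the complexification $V \otimes_{\mathbb{R}} \mathbb{C}$ of the underlying real vector bundle of $V$ is isomorphic, as a complex vector bundle, to $V \oplus \overline{V}$, where $\overline{V}$ is the conjugate vector bundle.

**Corollary:** The Pontryagin classes of the underlying real vector bundle of a complex vector bundle $V$ can be computed in terms of its Chern classes via the Whitney sum formula as

$$\sum_k (-1)^k p_k(V) = c(V) \, c(\overline{V}) = \left( \sum_i c_i(V) \right) \left( \sum_j (-1)^j c_j(V) \right).$$

In particular,

$$p_1(V) = c_1(V)^2 - 2 c_2(V).$$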

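*Proof.* Write

$$V \otimes_{\mathbb{R}} \mathbb{C} \cong V \otimes_{\mathbb{C}} (\mathbb{C} \otimes_{\mathbb{R}} \mathbb{C}).$$

This tells us that to understand the endofunctor $V \mapsto V \otimes_{\mathbb{R}} \mathbb{C}$ on complex vector bundles it suffices to understand $\mathbb{C} \otimes_{\mathbb{R}} \mathbb{C}$ as a $(\mathbb{C}, \mathbb{C})$-bimodule; the left $\mathbb{C}$-module structure tells us how to take the tensor product and the right $\mathbb{C}$-module structure tells us what the complex structure on the tensor product is. The theorem is then equivalent to the claim that, as a bimodule,

$$\mathbb{C} \otimes_{\mathbb{R}} \mathbb{C} \cong \mathbb{C} \oplus \overline{\mathbb{C}}$$

where

- $\mathbb{C}$ denotes the identity bimodule, with $\mathbb{C}$ acting on the left and right by left and right multiplication, so that tensoring with this bimodule is the identity endofunctor $V \mapsto V$, and
- $\overline{\mathbb{C}}$ denotes the bimodule where left and right multiplication by $i$ disagree by a sign of $-1$ (more explicitly, we can take the left module structure to be the usual one and the right module structure to be multiplication by the conjugate), so that tensoring with this bimodule is the endofunctor $V \mapsto \overline{V}$.

To see this, we will first think of $\mathbb{C} \otimes_{\mathbb{R}} \mathbb{C}$ as a right $\mathbb{C}$-module with basis $1 \otimes 1, i \otimes 1$, and then we will diagonalize left multiplication by $i$. When we do this we find that on

$$1 \otimes 1 - i \otimes i$$

left and right multiplication by $i$ agree, whereas on

$$1 \otimes 1 + i \otimes i$$

left and right multiplication differ by a sign. The left, or equivalently right, $\mathbb{C}$-submodules generated by these vectors give the desired decomposition.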

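Now let $X$ be a hypersurface of degree $d$ in $\mathbb{CP}^3$. Above we computed the total Chern class to be

$$c(X) = \frac{(1 + h)^4}{1 + dh} = 1 + (4 - d) h + (d^2 - 4d + 6) h^2$$

so we compute that

$$c_1(X)^2 = (4 - d)^2 h^2, \qquad c_2(X) = (d^2 - 4d + 6) h^2$$

and hence that

$$p_1(X) = c_1(X)^2 - 2 c_2(X) = (4 - d^2) h^2$$

and, using again the fact that $h^2[X] = d$, we compute the signature of a smooth projective hypersurface of degree $d$ in $\mathbb{CP}^3$ to be

$$\sigma(X) = \frac{d(4 - d^2)}{3} = -\frac{(d - 2) d (d + 2)}{3}.$$

Above the numerator has been written in a form that makes it clear that it is divisible by $3$.

We conclude that for $d \ge 2$ the signature is nonpositive. Since the intersection form takes the positive value $h^2[X] = d$ on $h$, it is not negative definite, and so it is indefinite in this case, which tells us that to uniquely identify the intersection form we only need to know the parity as we hoped.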

*Example.* When $d = 1$ the signature is $1$. This reflects the fact that the intersection form on $\mathbb{CP}^2$ is positive definite, since it is just given by $\langle h, h \rangle = 1$ on the generator $h$.

*Example.* When $d = 2$ the signature is $0$. This reflects the fact that the intersection form on $\mathbb{CP}^1 \times \mathbb{CP}^1$ is indefinite, since by the Künneth formula $H^2(\mathbb{CP}^1 \times \mathbb{CP}^1)$ is generated by two elements $h \otimes 1, 1 \otimes h$ (where $h$ denotes a generator of $H^2(\mathbb{CP}^1)$) which square to zero but whose cup product is a generator of $H^4(\mathbb{CP}^1 \times \mathbb{CP}^1)$. An explicit diagonalization of the intersection form over $\mathbb{Q}$ is given by the basis $h \otimes 1 + 1 \otimes h, \ h \otimes 1 - 1 \otimes h$.

*Example.* When $d = 3$ the signature is $-5$. In particular it is not divisible by $8$, so the intersection form is odd and hence must be the lattice $I_{1, 6}$.

*Example.* When $d = 4$ the signature is $-16$. We’ll see later that in this case the intersection form is even, and hence must be the lattice $II_{3, 19}$.

In general, when $d$ is odd the signature is odd, so the intersection form is odd and hence is uniquely determined. When $d$ is even the signature is divisible by $16$, and in particular is divisible by $8$, so the intersection form could be even or odd.
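The signature computation can be replayed in a few lines. The sketch below (hypothetical function names) reads off $c_1$ and $c_2$ of a degree $d$ surface in $\mathbb{CP}^3$ as integer multiples of $h$ and $h^2$, forms $p_1 = c_1^2 - 2c_2$, evaluates on the fundamental class using $h^2[X] = d$, and divides by $3$ as in the Hirzebruch signature theorem.

```python
def surface_chern_classes(d: int) -> tuple[int, int]:
    """Coefficients of h and h^2 in (1+h)^4 / (1+dh) = (1+4h+6h^2)(1-dh+d^2h^2)."""
    c1 = 4 - d
    c2 = 6 - 4 * d + d * d
    return c1, c2

def signature(d: int) -> int:
    """Signature of a smooth degree-d surface in CP^3 via p_1[X] / 3."""
    c1, c2 = surface_chern_classes(d)
    p1 = c1 * c1 - 2 * c2        # coefficient of h^2 in p_1 = c_1^2 - 2 c_2
    p1_number = d * p1           # h^2 evaluates to d on the fundamental class
    assert p1_number % 3 == 0    # the Pontryagin number is divisible by 3
    return p1_number // 3

for d in range(1, 9):
    s = signature(d)
    # Odd degree gives odd signature; even degree gives a multiple of 16.
    assert (s % 2 == 1) if d % 2 == 1 else (s % 16 == 0)
    print(d, s)
```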

**The parity via Stiefel-Whitney classes**

To summarize, the story so far is the following:

- If $X$ is a smooth projective hypersurface in $\mathbb{CP}^3$ of degree $d$, then in particular it is a smooth, compact, oriented, and simply connected 4-manifold.
- For such a manifold, $H^2(X)$ is a free abelian group of finite rank, and $X$ is determined up to homeomorphism by the intersection form on $H^2(X)$, which gives $H^2(X)$ the structure of a unimodular lattice.
- The rank and the signature of $H^2(X)$ are given by $b_2 = d^3 - 4d^2 + 6d - 2$ and $\sigma = -\frac{(d - 2) d (d + 2)}{3}$, and in particular, for $d \ge 2$, $H^2(X)$ is indefinite.
- By the classification of indefinite unimodular lattices, the only remaining bit of information we need about $H^2(X)$ to completely determine it is its parity. More specifically, if the parity is odd then $H^2(X)$ must be $I_{b^+, b^-}$ and if the parity is even then $H^2(X)$ must be $II_{b^+, b^-}$, where $b^{\pm} = \frac{b_2 \pm \sigma}{2}$.

In this section we’ll compute the parity. It will turn out to depend only on $d$, which via Freedman’s work gives an independent confirmation that when $n = 3$ the homeomorphism type of a smooth projective hypersurface of degree $d$ only depends on $d$ (since the Kirby-Siebenmann invariant vanishes when a 4-manifold has a smooth structure).

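Let $X$ be a smooth, compact, oriented, simply connected 4-manifold. Since $H_1(X)$ vanishes, we have $H^2(X, \mathbb{F}_2) \cong H^2(X) \otimes \mathbb{F}_2$, and so the parity of $H^2(X)$ is determined by whether or not the map

$$H^2(X, \mathbb{F}_2) \ni a \mapsto a^2 \in H^4(X, \mathbb{F}_2) \cong \mathbb{F}_2$$

is identically zero. Over $\mathbb{F}_2$ this map is linear; in fact it can be identified with the Steenrod square $\text{Sq}^2$. By Poincaré duality (this step only requires that $X$ is compact, since every compact manifold is orientable over $\mathbb{F}_2$) there must therefore be a unique cohomology class $v_2 \in H^2(X, \mathbb{F}_2)$ such that

$$a^2 = v_2 a \quad \text{for all } a \in H^2(X, \mathbb{F}_2).$$

This class is called the second Wu class, and by definition $a \mapsto a^2$ vanishes identically iff $v_2$ vanishes, so $H^2(X)$ is even iff $v_2$ vanishes.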

So it remains to compute $v_2$. The Wu classes turn out to be closely related to the Stiefel-Whitney classes (of the tangent bundle). More precisely, the total Stiefel-Whitney class is the total Steenrod square of the total Wu class:

$$w = \text{Sq}(v).$$

*Remark.* In particular, the Stiefel-Whitney classes of a compact smooth manifold depend only on its cohomology over $\mathbb{F}_2$ as a module over the Steenrod algebra, which is surprising: a priori the Stiefel-Whitney classes also depend on the additional data of the tangent bundle.

This gives

$$w_1 = v_1, \qquad w_2 = v_2 + \text{Sq}^1 v_1 = v_2 + v_1^2$$

and hence

$$v_2 = w_2 + w_1^2.$$

Since we assumed that $X$ is oriented, $w_1$ vanishes (although this also follows from the fact that $H^1(X, \mathbb{F}_2)$ vanishes as well), from which it follows that $v_2$ vanishes iff the second Stiefel-Whitney class $w_2$ vanishes. Hence we have proven the following.

**Theorem:** Let $X$ be a compact oriented simply connected 4-manifold. Then $H^2(X)$ is even iff $w_2(X)$ vanishes.

*Remark.* Even if $X$ is not equipped with a smooth structure, hence is not equipped with a tangent bundle, as long as $X$ is compact we can still define its Stiefel-Whitney classes in terms of its Wu classes, and these will agree with the Stiefel-Whitney classes computed from any smooth structure on $X$. If $X$ is equipped with a smooth structure and is oriented, then the vanishing of $w_2$ is equivalent to $X$ also admitting a spin structure.

*Remark.* The $E_8$ lattice is even; in fact it is the unique even positive definite unimodular lattice of rank $8$. It follows that if the $E_8$ manifold had a smooth structure, it would also admit a spin structure, and then Rokhlin’s theorem would imply that its signature is divisible by $16$. But its signature is $8$; contradiction. This gives a second proof that the $E_8$ manifold has no smooth structure.

*Remark.* If $X$ is not simply connected, or more precisely if $H_1(X)$ has $2$-torsion, then it is still true that if $w_2$ vanishes then $H^2(X)$ is even, but the converse need not hold owing to the presence of an additional direct summand in $H^2(X, \mathbb{F}_2)$ coming from universal coefficients.

It remains to compute the second Stiefel-Whitney class. We can in fact compute all Stiefel-Whitney classes of a hypersurface of degree $d$ in $\mathbb{CP}^n$ as follows.

**Theorem:** Let $V$ be a complex vector bundle. Then the Stiefel-Whitney classes of the underlying real vector bundle are determined by the Chern classes as follows: the odd Stiefel-Whitney classes $w_{2k+1}(V)$ vanish, and the even Stiefel-Whitney classes satisfy

$$w_{2k}(V) \equiv c_k(V) \bmod 2.$$

*Proof.* We’ll first prove this in the case when $V$ is a line bundle $L$. (This is the only case we need but it’s not much harder to prove the general statement.) In this case we only need to show that $w_1(L)$ vanishes and that $w_2(L) \equiv c_1(L) \bmod 2$.

First, $w_1(L)$ vanishes if and only if $L$ has an orientation. But any complex structure induces an orientation, so this is clear.

To compute $w_2(L)$ we can use the fact that the top Stiefel-Whitney class of an oriented vector bundle is the reduction $\bmod 2$ of its Euler class while the top Chern class of a complex vector bundle is its Euler class, which gives $w_2(L) \equiv c_1(L) \bmod 2$ since they are both the Euler class $e(L)$. If we want to avoid the Euler class, we can also argue as follows:

The functor from complex line bundles to real plane bundles is induced, at the level of classifying spaces, by the map

$$BU(1) \to BO(2)$$

induced by the standard embedding $U(1) \to O(2)$. Since $U(1) = SO(2)$ as subgroups of $O(2)$, the map above factors as a composite

$$BU(1) \to BSO(2) \to BO(2)$$

where the first map is a homotopy equivalence, showing that the classification of complex line bundles is in fact equivalent to the classification of oriented real plane bundles.

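From standard results about characteristic classes we know that on the one hand

$$H^{\bullet}(BSO(2), \mathbb{F}_2) \cong \mathbb{F}_2[w_2]$$

is a polynomial algebra on the universal second Stiefel-Whitney class $w_2$, while on the other hand

$$H^{\bullet}(BU(1), \mathbb{Z}) \cong \mathbb{Z}[c_1]$$

is a polynomial algebra on the universal first Chern class $c_1$. In particular, $c_1$ generates $H^2(BU(1), \mathbb{Z})$ while $w_2$ is the unique generator of $H^2(BSO(2), \mathbb{F}_2)$, so the homotopy equivalence necessarily identifies the latter with the reduction $\bmod 2$ of the former.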

We have the desired result for line bundles. To obtain the result for all bundles we appeal to the splitting principle, which tells us in particular that to prove an equality of characteristic classes it suffices to prove it on a direct sum of line bundles.

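So let $L_1, \dots, L_r$ be complex line bundles. We now know that the total Stiefel-Whitney class of the underlying real vector bundle of $L_1 \oplus \dots \oplus L_r$ can be computed, using the Whitney sum formula, as

$$w(L_1 \oplus \dots \oplus L_r) = \prod_{i=1}^r (1 + w_2(L_i))$$

since we know that $w_1(L_i)$ vanishes. This implies that all of the odd Stiefel-Whitney classes vanish. Since we also know that $w_2(L_i) \equiv c_1(L_i) \bmod 2$, this tells us that the total Stiefel-Whitney class is

$$\prod_{i=1}^r (1 + c_1(L_i)) \bmod 2$$

and this is the reduction $\bmod 2$ of the total Chern class as desired.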

Now again let $X$ be a hypersurface of degree $d$ in $\mathbb{CP}^3$. Above we computed the first Chern class to be

$$c_1(X) = (4 - d) h$$

where $h$, as before, denotes the pullback of the generator of $H^2(\mathbb{CP}^3)$ to $H^2(X)$. By the Lefschetz hyperplane theorem, or from the fact that we know $h^2[X] = d$, the cohomology class $h$ is nonzero, hence the reduction

$$w_2(X) \equiv (4 - d) h \equiv d h \bmod 2$$

vanishes if and only if $d$ is even. We conclude that the parity of $H^2(X)$ is precisely the parity of $d$, so when $d$ is even the intersection form is even and uniquely determined. This completes our computation of the cohomology ring of $X$.
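Putting the three invariants together gives a small lookup for the intersection lattice of a degree $d$ surface. This is a sketch under the formulas derived in this post (rank $d^3 - 4d^2 + 6d - 2$, signature $d(4 - d^2)/3$, parity equal to the parity of $d$); the function name and the tuple encoding of the lattice are hypothetical conventions for illustration.

```python
def intersection_lattice(d: int) -> tuple[str, int, int]:
    """Return (type, b_plus, b_minus) for a smooth degree-d surface in CP^3.

    type is "I" for an odd lattice and "II" for an even one.
    """
    rank = d**3 - 4 * d**2 + 6 * d - 2
    sigma = d * (4 - d * d) // 3          # exact: d(4 - d^2) is divisible by 3
    b_plus = (rank + sigma) // 2
    b_minus = (rank - sigma) // 2
    # w_2 = c_1 mod 2 = (4 - d) h mod 2 vanishes exactly when d is even.
    lattice_type = "II" if d % 2 == 0 else "I"
    return lattice_type, b_plus, b_minus

for d in range(1, 7):
    print(d, intersection_lattice(d))
```

For $d = 1$ the form is definite, so the classification of indefinite lattices does not apply, but the output `("I", 1, 0)` still describes the rank one form on $H^2(\mathbb{CP}^2)$.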

*Remark.* When $d$ is even we also conclude that the hypersurfaces have a spin structure, and in particular we get an independent confirmation of Rokhlin’s theorem that the signature is divisible by $16$ in this case.


into its presheaf category $\widehat{C} = [C^{op}, \text{Set}]$ (where we use $[C, D]$ to denote the category of functors $C \to D$). The Yoneda lemma asserts in particular that the Yoneda embedding is full and faithful, which justifies calling it an embedding.

When $C$ is in addition assumed to be small, the Yoneda embedding has the following elegant universal property.

**Theorem:** The Yoneda embedding $C \to \widehat{C}$ exhibits $\widehat{C}$ as the **free cocompletion** of $C$ in the sense that for any cocomplete category $D$, the restriction functor

$$[\widehat{C}, D]_{\text{cocont}} \to [C, D]$$

from the category of cocontinuous functors $\widehat{C} \to D$ to the category of functors $C \to D$ is an equivalence. In particular, any functor $C \to D$ extends (uniquely, up to natural isomorphism) to a cocontinuous functor $\widehat{C} \to D$, and all cocontinuous functors $\widehat{C} \to D$ arise this way (up to natural isomorphism).

Colimits should be thought of as a general notion of gluing, so the above should be understood as the claim that $\widehat{C}$ is the category obtained by “freely gluing together” the objects of $C$ in a way dictated by the morphisms. This intuition is important when trying to understand the definition of, among other things, a simplicial set. A simplicial set is by definition a presheaf on a certain category, the simplex category, and the universal property above says that this means simplicial sets are obtained by “freely gluing together” simplices.

In this post we’ll content ourselves with meandering towards a proof of the above result. In a subsequent post we’ll give a sampling of applications.

**A toy version of the above result**

Coproducts in particular are examples of colimits, so if we think of coproducts as being analogous to addition, we can think of a cocomplete category as being analogous to a commutative monoid and a cocontinuous functor as being analogous to a morphism of commutative monoids. The universal property above can then be thought of as analogous to the following. Let $S$ be a set and let $\mathbb{Z}_{\ge 0}[S]$ be the set of functions $S \to \mathbb{Z}_{\ge 0}$ which vanish except at finitely many points in $S$. There is an inclusion $S \to \mathbb{Z}_{\ge 0}[S]$ sending a point in $S$ to the indicator function which is equal to $1$ at that point and $0$ elsewhere.

**Theorem:** The natural inclusion $S \to \mathbb{Z}_{\ge 0}[S]$ exhibits $\mathbb{Z}_{\ge 0}[S]$ as the free commutative monoid on $S$ in the sense that for any commutative monoid $M$, the restriction map

$$\text{Hom}_{\text{CMon}}(\mathbb{Z}_{\ge 0}[S], M) \to \text{Hom}_{\text{Set}}(S, M)$$

from the set of monoid homomorphisms $\mathbb{Z}_{\ge 0}[S] \to M$ to the set of functions $S \to M$ is a bijection.
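The toy theorem can be made concrete by writing down the inverse of the restriction map. In the sketch below, finitely supported functions $S \to \mathbb{Z}_{\ge 0}$ are modeled by `collections.Counter`, and a commutative monoid is passed as an operation together with its unit; the names `indicator` and `extend` are hypothetical, and commutativity of the operation is assumed rather than checked.

```python
from collections import Counter

def indicator(s):
    """The inclusion of S: the function equal to 1 at s and 0 elsewhere."""
    return Counter({s: 1})

def extend(f, op, unit):
    """Extend a function f : S -> M to a monoid homomorphism on finite multisets.

    op and unit describe the (assumed commutative) monoid structure on M.
    """
    def hom(multiset: Counter):
        total = unit
        for s, count in multiset.items():
            for _ in range(count):
                total = op(total, f(s))
        return total
    return hom

# Example: S = {"a", "b", "c"}, M = positive integers under multiplication.
primes = {"a": 2, "b": 3, "c": 5}
hom = extend(primes.__getitem__, lambda x, y: x * y, 1)

x = Counter({"a": 2, "b": 1})
y = Counter({"b": 1, "c": 3})
assert hom(indicator("b")) == primes["b"]   # restricting along the inclusion recovers f
assert hom(Counter()) == 1                  # the zero function goes to the unit
assert hom(x + y) == hom(x) * hom(y)        # addition goes to the monoid operation
```

The first assertion is the statement that restriction undoes extension; uniqueness of the extension is the part of the theorem that code like this can illustrate but not prove.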

(Of course an intriguing difference between the toy theorem and the real theorem is that being cocomplete is a property of a category, while being a commutative monoid is a structure placed on a set.)

In the setting of commutative monoids, a shorter description of the above theorem is that there’s a forgetful functor from commutative monoids to sets and that the free commutative monoid construction describes its left adjoint. Similarly, we’d like to be able to say that there’s a forgetful functor from cocomplete categories to categories and that the Yoneda embedding exhibits $C \mapsto \widehat{C}$ as its left adjoint. Unfortunately, there are nontrivial size issues that get in the way: $\widehat{C}$ is never small when $C$ is nonempty, and in fact, the only cocomplete small categories are preorders by a theorem of Freyd.

In any case, before we get to discussing the result in full generality, let’s look at some illustrative examples.

**Sets**

Take $C = 1$ to be the terminal category. Then $\widehat{C}$ is just the category $\text{Set}$ of sets. This example already says something interesting: the universal property implies that $\text{Set}$ is the free cocomplete category on an object in the sense that if $D$ is a cocomplete category, then the category of cocontinuous functors $\text{Set} \to D$ is equivalent to $D$ itself. The inverse of this equivalence sends an object $d \in D$ to the functor

$$\text{Set} \ni S \mapsto \coprod_{s \in S} d \in D$$

which, given a set $S$, returns the coproduct of $S$-many copies of $d$, and conversely every cocontinuous functor $\text{Set} \to D$ has this form. This statement should be thought of as analogous to the statement that $\mathbb{N}$ is the free commutative monoid on a point.

**Graphs**

Take $C$ to be the category with two objects $V, E$ and two parallel morphisms $V \rightrightarrows E$ between them. (This category is in fact a truncation of the simplex category.) Think of $V$ as a vertex, $E$ as an edge, and the two morphisms as the two inclusions of the endpoints of the edge. A presheaf $G : C^{op} \to \text{Set}$ is then precisely a pair of sets $G(V), G(E)$ together with a pair of functions

$$s, t : G(E) \to G(V).$$

The two maps have been named $s, t$ because we can think of them as source and target maps: in fact, $G$ is precisely a (directed multi)graph with vertex set $G(V)$ and edge set $G(E)$. Here the universal property of presheaves can be interpreted as the claim that graphs are obtained by freely gluing together edges along vertices.
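As a sanity check on this description, here is a small Python sketch that stores a graph as a pair of sets with source and target maps, and counts maps into it from the two representable presheaves (a single vertex, and a single edge with its two endpoints). By the Yoneda lemma these counts should be the number of vertices and the number of edges. The dictionary layout, the example graph, and the name `homomorphisms` are all illustrative assumptions.

```python
from itertools import product

# A graph viewed as a presheaf: vertex set, edge set, source and target maps.
# Hypothetical example: a triangle 0 -> 1 -> 2 -> 0 plus a parallel edge 0 -> 1.
graph = {
    "V": {0, 1, 2},
    "E": {"a", "b", "c", "d"},
    "s": {"a": 0, "b": 1, "c": 2, "d": 0},
    "t": {"a": 1, "b": 2, "c": 0, "d": 1},
}

# The two representable presheaves.
y_vertex = {"V": {"*"}, "E": set(), "s": {}, "t": {}}
y_edge = {"V": {"src", "tgt"}, "E": {"e"}, "s": {"e": "src"}, "t": {"e": "tgt"}}

def homomorphisms(g, h):
    """Enumerate natural transformations g -> h by brute force."""
    gv, ge = sorted(g["V"]), sorted(g["E"])
    hv, he = sorted(h["V"]), sorted(h["E"])
    found = []
    for v_img in product(hv, repeat=len(gv)):
        on_v = dict(zip(gv, v_img))
        for e_img in product(he, repeat=len(ge)):
            on_e = dict(zip(ge, e_img))
            # Naturality: the map must commute with source and target.
            if all(on_v[g["s"][e]] == h["s"][on_e[e]] and
                   on_v[g["t"][e]] == h["t"][on_e[e]] for e in ge):
                found.append((on_v, on_e))
    return found
```

The brute-force search is only meant for tiny examples; the point is that maps out of the one-edge graph are the same thing as edges of the target.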

The universal property also gives a natural way of describing graphs as topological spaces, as follows: $\text{Top}$ is a cocomplete category, and there is a functor $C \to \text{Top}$ sending the vertex object $V$ to a point, the edge object $E$ to an interval $[0, 1]$, and the two arrows to the two inclusions of the endpoints of the interval. By the universal property, this functor extends to a cocontinuous functor $\widehat{C} \to \text{Top}$ sending a graph to its underlying topological space (with directions on the edges ignored). This is a simple version of geometric realization.

But of course the universal property implies that there are many other more exotic notions of geometric realization for graphs. For example, instead of using topological spaces we could use affine schemes: fixing a field $k$, the category $\text{Aff}_k$ of affine schemes over $k$ is cocomplete, and there is a functor $C \to \text{Aff}_k$ sending the vertex object $V$ to a point $\text{Spec } k$, the edge object $E$ to the affine line $\mathbb{A}^1_k = \text{Spec } k[x]$, and the two maps to the inclusions of the two points $0, 1$ into $\mathbb{A}^1_k$ (for example). By the universal property we obtain a geometric realization functor $\widehat{C} \to \text{Aff}_k$ which, for example, sends the loop (the graph consisting of a vertex and an edge from that vertex to itself) to the affine scheme with ring of functions

$$\{ f \in k[x] : f(0) = f(1) \}.$$

This affine scheme is precisely the nodal cubic. To see this, write the loop as the coequalizer of the two maps $V \rightrightarrows E$, thought of as natural transformations between the corresponding representable presheaves. To compute the ring of functions on the resulting affine scheme means computing the equalizer of the two maps $k[x] \to k$ given by evaluation at $0$ and $1$ respectively.
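One way to see the nodal cubic appear is to exhibit two polynomials in the equalizer that satisfy its defining relation. The sketch below assumes the two glued points are $0$ and $1$ on a line with coordinate $x$, and picks the functions $u = 4x(x-1)$ and $v = (2x-1)u$; this choice of coordinates is mine (it needs characteristic different from $2$) and is only one of many. The code checks, with integer coefficient lists, that both lie in the equalizer and that $v^2 = u^2(u + 1)$.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists [a0, a1, ...]."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_add(p, q):
    """Add polynomials given as coefficient lists."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def evaluate(p, x):
    """Evaluate a coefficient list at the point x."""
    return sum(a * x**i for i, a in enumerate(p))

def in_equalizer(p):
    """Membership in the equalizer of evaluation at 0 and at 1."""
    return evaluate(p, 0) == evaluate(p, 1)

# Assumed coordinates on the glued line (not taken from the text above).
u = [0, -4, 4]              # u = 4x(x - 1)
v = poly_mul([-1, 2], u)    # v = (2x - 1) * u

lhs = poly_mul(v, v)                               # v^2
rhs = poly_mul(poly_mul(u, u), poly_add(u, [1]))   # u^2 (u + 1)
```

The coordinate $x$ itself is not in the equalizer, which reflects the fact that the two points $0$ and $1$ have been identified.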

**Species**

Write $\mathbb{S}$ for the category (really groupoid) of finite sets and bijections. This is equivalently the core of the category of finite sets and functions. It is equivalent, as a category, to the disjoint union

$$\coprod_{n \ge 0} BS_n$$

of the one-object groupoids $BS_n$ corresponding to the symmetric groups $S_n$, hence the name $\mathbb{S}$. We will often think of the objects of $\mathbb{S}$ as the non-negative integers. A presheaf $F : \mathbb{S}^{op} \to \text{Set}$ is, depending on who you ask, a **species**, **$\mathbb{S}$-module**, or **symmetric sequence** in sets; we’ll use the term species. More concretely, a species is a collection of sets $F(n)$ indexed by the non-negative integers such that each set $F(n)$ is equipped with a (right) action of the symmetric group $S_n$.

Species are surprisingly fundamental objects in mathematics. Under the name species, they were introduced by Joyal to study combinatorics, and among other things to categorify the theory of exponential generating functions; see, for example, Bergeron, Labelle, and Leroux. I think the names -module and symmetric sequence are used by authors studying operads, as operads are species with extra structure (see the nLab for details).

The universal property tells us that we can extend any functor from $\mathbb{S}$ to a cocomplete category $D$ to a cocontinuous functor $\widehat{\mathbb{S}} \to D$. An important source of functors $\mathbb{S} \to D$ is given by taking $D$ to be a symmetric monoidal category, $d \in D$ to be an object, and considering the functor

$$\mathbb{S} \ni n \mapsto d^{\otimes n} \in D.$$

This observation can be codified as the following universal property.

**Theorem:** $\mathbb{S}$, equipped with disjoint union, is the free symmetric monoidal category on an object in the sense that for any symmetric monoidal category $D$, the restriction functor from the category of symmetric monoidal functors $\mathbb{S} \to D$ to the category of functors $1 \to D$, which is just $D$, is an equivalence.

If $D$ is in addition cocomplete, in such a way that the monoidal operation is cocontinuous in both arguments (**symmetric monoidally cocomplete**), then after choosing an object $d \in D$, we get not only a symmetric monoidal functor $\mathbb{S} \to D$ but even a functor $\widehat{\mathbb{S}} \to D$, which turns out to be symmetric monoidal if $\widehat{\mathbb{S}}$ is given a monoidal structure via Day convolution. (Day convolution is the monoidal structure categorifying the product of exponential generating functions.) This observation can in turn be codified as a universal property.

**Theorem:** $\widehat{\mathbb{S}}$, equipped with Day convolution, is the free symmetric monoidally cocomplete category on an object in the sense that for any symmetric monoidally cocomplete category $D$, the restriction functor from the category of symmetric monoidal cocontinuous functors $\widehat{\mathbb{S}} \to D$ to the category of functors $1 \to D$ (thinking of $1$ as a representable presheaf), which is just $D$, is an equivalence.

What do these symmetric monoidal cocontinuous functors actually look like? For an object $d \in D$, the corresponding functor is

$$\widehat{\mathbb{S}} \ni F \mapsto \coprod_{n \ge 0} F(n) \otimes_{S_n} d^{\otimes n} \in D$$

where $\otimes_{S_n}$ is shorthand for taking coinvariants with respect to the diagonal action of $S_n$, and $F(n) \otimes d^{\otimes n}$ is shorthand for the coproduct of an $F(n)$-indexed family of $d^{\otimes n}$s (see copower for some motivation behind this notation). This is an important construction: in the special case that $F$ is an operad, so that $F(n)$ describes the set of $n$-ary operations in the operad, the above construction describes the free $F$-algebra on $d$. If all of the $F(n)$ are finite sets, the above construction can also be thought of as categorifying the exponential generating function

$$\sum_{n \ge 0} \frac{|F(n)|}{n!} x^n$$

(thinking of taking coinvariants with respect to an $S_n$-action as categorifying dividing by $n!$, in accordance with the general yoga of groupoid cardinality.)

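*Example.* Let $F = \text{Ass}$ be the associative operad. Here $F(n)$ consists of operations of the form

$$(x_1, \dots, x_n) \mapsto x_{\sigma(1)} \cdots x_{\sigma(n)}$$

for each permutation $\sigma \in S_n$ and hence, as a right $S_n$-set, is isomorphic to $S_n$. $F(n) \otimes_{S_n} d^{\otimes n}$ is then naturally isomorphic to $d^{\otimes n}$, so the free associative algebra (monoid) on an object $d$ in a symmetric monoidally cocomplete category is the infinite coproduct

$$\coprod_{n \ge 0} d^{\otimes n}.$$

Regarded just as a combinatorial species, $\text{Ass}$ categorifies the generating function $\frac{1}{1 - x}$.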

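*Example.* Let $F = \text{Com}$ be the commutative operad. Here $F(n)$ consists of the single operation

$$(x_1, \dots, x_n) \mapsto x_1 \cdots x_n$$

and hence, as a right $S_n$-set, is trivial. $F(n) \otimes_{S_n} d^{\otimes n}$ is then the quotient $d^{\otimes n} / S_n$, so the free commutative algebra (commutative monoid) on an object $d$ in a symmetric monoidally cocomplete category is the infinite coproduct

$$\coprod_{n \ge 0} d^{\otimes n} / S_n.$$

Regarded just as a combinatorial species, $\text{Com}$ categorifies the generating function $e^x$.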
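The two examples can be checked by brute force in the category of finite sets with cartesian product. The sketch below takes the object to be a set with $k$ elements and counts orbits of the symmetric group acting diagonally on (structures) × (words of length $n$), which is what taking coinvariants amounts to here. The function names are illustrative assumptions; the expected answers are $k^n$ words for the associative operad and $\binom{n+k-1}{n}$ multisets for the commutative one.

```python
from itertools import permutations, product
from math import comb

def count_orbits(structures, act, n, k):
    """Count orbits of S_n acting diagonally on structures x {0..k-1}^n.

    `act(s, sigma)` is the right action of the permutation sigma on a structure s.
    Words are acted on by (w . sigma)(i) = w(sigma(i)).
    """
    seen, orbits = set(), 0
    for s in structures:
        for word in product(range(k), repeat=n):
            if (s, word) in seen:
                continue
            orbits += 1
            for sigma in permutations(range(n)):
                moved = tuple(word[j] for j in sigma)
                seen.add((act(s, sigma), moved))
    return orbits

def act_ass(s, sigma):
    """The n-ary operations of the associative operad: S_n acting on itself on the right."""
    return tuple(s[j] for j in sigma)

def act_com(s, sigma):
    """The commutative operad has a single n-ary operation with trivial action."""
    return s

n, k = 3, 2
ass_count = count_orbits(list(permutations(range(n))), act_ass, n, k)
com_count = count_orbits([()], act_com, n, k)
```

Since the action on the associative structures is free, dividing by it cancels the $n!$ structures exactly, which is the combinatorial shadow of the generating function having all coefficients equal to $1$.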

**Intuitions about the proof**

Recall that we are trying to show that the restriction functor

$$Y^{\ast} : [\widehat{C}, D]_{\text{cocont}} \to [C, D]$$

is an equivalence. By analogy with the corresponding statement about sets, commutative monoids, and free commutative monoids, one way to proceed with this proof is to figure out how to write every presheaf as a colimit of representable presheaves (the image of the Yoneda embedding $Y : C \to \widehat{C}$), then turn this colimit into a colimit in $D$ by applying a given cocontinuous functor $\widehat{C} \to D$. This will show, roughly speaking, that the restriction map is “injective” (although we need to be careful about what this means because we’re dealing with categories, not sets).

To show that the restriction map is “surjective,” we need to extend a functor $C \to D$ to a cocontinuous functor $\widehat{C} \to D$. We’d like to do this “by linearity,” by choosing an expression for a presheaf as a colimit of representable presheaves and turning this colimit into a colimit in $D$ by applying our functor; however, we need to be able to make this choice functorially, and then we still need to verify that the resulting functor is actually cocontinuous.

**Presheaves as colimits of representable presheaves**

The following result is at least implicit in the use of the terminology “free cocompletion” and is important in getting the above proof to work, as well as being a generally useful thing to know in category theory. It is sometimes called the co-Yoneda lemma for reasons that are a little difficult to explain without more background. Previously it showed up when we discussed operations and pro-objects, but there we rushed through the proof and here we’ll take a more leisurely pace.

**Theorem:** Let be a (locally small) category. Then every presheaf is canonically a colimit of representable presheaves.

*Idea #1.* One relevant intuition here is to think of a presheaf as a recipe for writing down a colimit in by prescribing how many copies of each object and morphism in appear in the diagram, in the same way that one can think of a function from a set to the non-negative integers (with finite support) as a recipe for writing down an element of the free commutative monoid on by prescribing how many copies of each element of to add up. This intuition is hopefully quite clear in the case of graphs, where a presheaf on tells you how many edges and vertices to glue together as well as how to glue them together.

*Idea #2.* For the more categorically minded, a related intuition is the following. Let $J \to C$ be a diagram in $C$. The colimit $\text{colim}_{j \in J} c_j$, if it exists, is defined by a universal property describing how maps out of it behave. This determines the covariant functor it represents uniquely, but says very little about the contravariant functor it represents. However, there is in some sense a “minimal” possibility for this contravariant functor. For example, if the colimit in question is the coproduct $c_1 \sqcup c_2$ of two objects, then by definition

$$\text{Hom}(c_1 \sqcup c_2, -) \cong \text{Hom}(c_1, -) \times \text{Hom}(c_2, -)$$

but the only thing we know about $\text{Hom}(-, c_1 \sqcup c_2)$ is that there are natural inclusion maps $c_1, c_2 \to c_1 \sqcup c_2$, hence we know that $\text{Hom}(-, c_1 \sqcup c_2)$ admits a natural map from $\text{Hom}(-, c_1) \sqcup \text{Hom}(-, c_2)$, but this is all we know without further information. Now, since colimits in functor categories are computed pointwise, $\text{Hom}(-, c_1) \sqcup \text{Hom}(-, c_2)$ is none other than the coproduct of $c_1, c_2$, but regarded as lying in the presheaf category. In general, the sense in which presheaves are “free colimits” of objects of $C$ is that, as contravariant functors, they describe the “minimal” contravariant functors that a colimit of objects in $C$ could represent.

Now we turn to the proof itself.

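*Proof.* Let $F$ be a presheaf. Since we want to describe $F$ as a colimit, let’s think about the covariant functor $\text{Hom}(F, -)$ that $F$ represents. By definition, $\text{Hom}(F, G)$ consists of families of maps $F(c) \to G(c)$ satisfying the naturality condition that if $f : c \to c'$ is a morphism, then the diagram

$$\begin{array}{ccc} F(c') & \longrightarrow & G(c') \\ \downarrow & & \downarrow \\ F(c) & \longrightarrow & G(c) \end{array}$$

commutes, where the vertical maps are $F(f)$ and $G(f)$. We want to write $F$ as a colimit of representable functors, and we know that by the Yoneda lemma, if $c \in C$ (which we use to designate the representable functor $\text{Hom}(-, c)$) is a representable functor, then $\text{Hom}(c, G) \cong G(c)$. To go from elements of $G(c)$ to maps $F(c) \to G(c)$ we need $F(c)$ copies of $c$.

A clean way to obtain these copies is to write down a diagram whose objects are given by pairs $(c, x)$ of an object $c \in C$ and an element $x \in F(c)$, equipped with the map to $C$ given by forgetting $x$. The preimage of $c$ is then precisely $F(c)$, and if we don’t specify any morphisms then a cocone over this diagram in $\widehat{C}$ with vertex $G$ is precisely a family of maps $F(c) \to G(c)$ satisfying no naturality conditions.

To get the naturality conditions back we need to equip this diagram with morphisms. Choosing the morphisms $(c, x) \to (c', x')$ to be the morphisms $f : c \to c'$ such that $F(f)$ sends $x'$ to $x$ enforces precisely the naturality condition desired on the maps $F(c) \to G(c)$, and furthermore the maps $c \to F$ corresponding to the elements $x \in F(c)$ canonically exhibit $F$ as the colimit of the corresponding diagram in $\widehat{C}$ as desired.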

(The diagram we constructed above is the opposite of the **category of elements** of , which is a special case of the **Grothendieck construction**. As described in the nLab article, we can think of as the classifying space of -bundles, and then is the classifying map of a -bundle on and is the total space of the bundle. admits other more sophisticated descriptions that won’t concern us at the moment.)

**The actual proof**

Now we return to the proof of the theorem. Let $C$ be a small category and $D$ be a (locally small) cocomplete category. Recall, again, that we are trying to show that the restriction functor

$$Y^{\ast} : [\widehat{C}, D]_{\text{cocont}} \to [C, D]$$

is an equivalence of categories. If we wanted to show that a map of sets was a bijection, we’d just have to show that it’s injective and surjective, and we sketched some intuition for why this should be the case above. But an equivalence of categories is more subtle, and instead of verifying two conditions we need to verify three: $Y^{\ast}$ needs to be full, faithful, and essentially surjective.

To show that is fully faithful, let be a natural transformation between two cocontinuous functors . We want to show that knowing the restriction of to representable functors uniquely determines for all presheaves , and moreover that given such a restriction we can always extend it to a natural transformation on all presheaves. But since is cocontinuous and is a colimit of representables, is freely determined by the universal property of colimits: in particular it is determined by its restriction to every representable , which is just composed with the inclusion by naturality, and given such a compatible family of restrictions it exists.

To show that $Y^{\ast}$ is essentially surjective, let $G : C \to D$ be a functor. We want to extend $G$ to a cocontinuous functor $G_! : \widehat{C} \to D$, which we will do “by linearity”: if $F$ is a presheaf, we’ll write it canonically as a colimit of representable presheaves using the diagram described above, indexed by pairs $(c, x)$ with $x \in F(c)$ (which is small since $C$ is small), then apply $G$ to this diagram to obtain a diagram in $D$, then take the colimit in $D$. In symbols,

$$G_!(F) = \text{colim}_{(c, x), \, x \in F(c)} G(c).$$

Every step of this process, including the formation of the category of elements, is functorial, so $G_!$ really is a functor. (It is crucial that $C$ be small to ensure that this is a small diagram; “cocomplete” only means that all small colimits exist, and in fact the theorem of Freyd alluded to above also implies that a category with all colimits is a preorder.)
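Here is a small Python sketch of this recipe in the simplest nontrivial case I could think of: presheaves are graphs (a vertex object, an edge object, and source and target maps), the target category is sets, and the functor being extended sends both the vertex and the edge to a one-point set. The colimit over the category of elements of a diagram of one-point sets is the set of connected components of that category, so the extension should send a graph to its set of connected components. The data layout and function names are illustrative assumptions.

```python
def category_of_elements(graph):
    """Objects are pairs (object, element); each edge e gives two morphisms,
    from its source vertex and from its target vertex into e."""
    objects = [("V", v) for v in graph["V"]] + [("E", e) for e in graph["E"]]
    morphisms = []
    for e in graph["E"]:
        morphisms.append((("V", graph["s"][e]), ("E", e)))
        morphisms.append((("V", graph["t"][e]), ("E", e)))
    return objects, morphisms

def colimit_of_points(objects, morphisms):
    """Colimit in Set of the diagram sending every object to a one-point set.

    Implemented with union-find: the result has one representative per
    connected component of the indexing category.
    """
    parent = {obj: obj for obj in objects}

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for a, b in morphisms:
        parent[find(a)] = find(b)
    return {find(obj) for obj in objects}

# Hypothetical example: an edge 0 -> 1, a loop at 2, and an isolated vertex 3.
example = {
    "V": {0, 1, 2, 3},
    "E": {"a", "b"},
    "s": {"a": 0, "b": 2},
    "t": {"a": 1, "b": 2},
}
components = colimit_of_points(*category_of_elements(example))
```

Replacing the one-point sets by intervals and points in a category of spaces would give the geometric realization instead; only the colimit step changes, not the indexing diagram.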

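It remains to verify first that $G_!$ really is cocontinuous and second that it really does restrict to (a functor naturally isomorphic to) $G$. These will both be a corollary of the following.

**Proposition:** $G_!$, the extension of $G : C \to D$ constructed above, is the left adjoint of the functor

$$G^{\ast} : D \ni d \mapsto \text{Hom}_D(G(-), d) \in \widehat{C}.$$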

(A version of this construction, the “left pro-adjoint,” appeared previously on this blog.)

(There is some mild abuse of notation going on here. should really denote the functor given by precomposition with , and should really denote the left adjoint of this functor, also known as **left Kan extension**. The decorations (pronounced “upper star” and “lower shriek” respectively) on these functors are by analogy with some of Grothendieck’s six operations on sheaves.)

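*Proof.* Write $G_!(F) = \text{colim}_{(c, x)} G(c)$ for the extension constructed above and $G^{\ast}(d) = \text{Hom}_D(G(-), d)$. We want to show that there is a natural bijection

$$\text{Hom}_D(G_!(F), d) \cong \text{Hom}_{\widehat{C}}(F, G^{\ast}(d)).$$

We know that $F \cong \text{colim}_{(c, x)} \text{Hom}(-, c)$, hence we can write the RHS as

$$\text{Hom}_{\widehat{C}}(F, G^{\ast}(d)) \cong \lim_{(c, x)} \text{Hom}_{\widehat{C}}(\text{Hom}(-, c), G^{\ast}(d)) \cong \lim_{(c, x)} \text{Hom}_D(G(c), d)$$

first by the universal property of colimits and second by the Yoneda lemma. On the other hand, by definition $G_!(F)$ is also a colimit over the same diagram, hence we can write the LHS as

$$\text{Hom}_D(G_!(F), d) \cong \lim_{(c, x)} \text{Hom}_D(G(c), d)$$

by the universal property of colimits. The conclusion follows.

In particular, since $G_!$ is a left adjoint, it is necessarily cocontinuous, and if $F$ above is a representable presheaf $\text{Hom}(-, c)$ then the above adjunction gives

$$\text{Hom}_D(G_!(\text{Hom}(-, c)), d) \cong \text{Hom}_{\widehat{C}}(\text{Hom}(-, c), G^{\ast}(d)) \cong \text{Hom}_D(G(c), d)$$

by the Yoneda lemma, so $G_!(\text{Hom}(-, c)) \cong G(c)$ by a second application of the Yoneda lemma. It follows that $Y^{\ast}$ is essentially surjective, hence an equivalence as desired.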

(In fact should really have denoted the functor given by precomposition with , and what we really wrote down above is the left adjoint to this functor, which is a genuine left Kan extension along . We could’ve written the proof so as to show that is not only a left adjoint but in fact an inverse once we restrict to cocontinuous functors.)


Although I’m sure there are more, I’m only aware of two other students at Berkeley who’ve posted transcripts of their quals, namely Christopher Wong and Eric Peterson. It would be nice if more people did this.


Standard presentations of propositional logic treat the Boolean operators “and,” “or,” and “not” as fundamental (e.g. these are the operators axiomatized by Boolean algebras). But from the point of view of category theory, arguably the most fundamental Boolean operator is “implies,” because it gives a collection of propositions the structure of a category, or more precisely a poset. We can endow the set of propositions with a morphism $p \to q$ whenever $p$ implies $q$, and no morphisms otherwise. Then the identity morphisms $p \to p$ simply reflect the fact that a proposition always implies itself, while composition of morphisms

$$\frac{p \to q, \quad q \to r}{p \to r}$$

is a familiar inference rule (hypothetical syllogism). Since it is possible to define “and,” “or,” and “not” in terms of “implies” in the Boolean setting, we might want to see what happens when we start from the perspective that propositional logic ought to be about certain posets and figure out how to recover the familiar operations from propositional logic by thinking about what their universal properties should be.

It turns out that when we do this, we don’t get ordinary propositional logic back in the sense that the posets we end up identifying are not just the Boolean algebras: instead we’ll get Heyting algebras, and the corresponding notion of logic we’ll get is intuitionistic logic.

**True, false**

Propositional logic should have two special propositions, “true” and “false.” Categorically, we should expect “true” and “false” to have universal properties, and indeed they do: “true” should be implied by everything while “false” should imply everything. In other words, “true” should be a terminal object or **top element** or **greatest element** of the poset and “false” should be an initial object or **bottom element** or **least element**. We will denote these by $\top$ and $\bot$ respectively.

Hence the posets we are interested in have both a top and a bottom element; these are the **bounded** posets.

*Example.* Starting from any poset , we can adjoin top and bottom elements in the obvious way, and every bounded poset arises in this way for a unique poset (namely the one obtained by removing the top and bottom elements). So this hypothesis is not very restrictive.

**And, or**

Propositional logic should have a logical “and” operator. Categorically, we again expect a universal property, which is the following: we should have $r \to p \wedge q$ if and only if $r \to p$ and $r \to q$. This is precisely the universal property of products, so $p \wedge q$ should be the product or **meet** or **infimum** of the two elements $p, q$. The projection maps $p \wedge q \to p$ and $p \wedge q \to q$ reproduce conjunction elimination, another familiar inference rule.

Dually, we need to be able to take the logical “or” of two propositions. The corresponding universal property is that we should have $p \vee q \to r$ if and only if $p \to r$ and $q \to r$. This is precisely the universal property of coproducts, so $p \vee q$ should be the coproduct or **join** or **supremum** of the two elements $p, q$. The inclusion maps $p \to p \vee q$ and $q \to p \vee q$ reproduce disjunction introduction. Note in particular that the empty meet is the top element and the empty join is the bottom element.

Hence the posets we are interested in have all finite joins and meets; these are the (bounded) **lattices**.

*Example.* Any total order with top and bottom elements is a lattice, where the meet of two elements is their minimum and the join of two elements is their maximum. For example, is such a total order, as is any successor ordinal.

*Example.* The poset of open subsets of a topological space and the poset of measurable subsets of a measurable space are by definition lattices.

*Example.* The poset of subobjects of an object in a category is often a lattice. For example, any group has a lattice of subgroups, where the meet is the intersection and the join is the subgroup generated by two subgroups. Similarly, any module has a lattice of submodules, where the meet is the intersection and the join is the sum. In particular, any ring has a lattice of ideals (left, right, or two-sided).

**Implies**

Propositional logic should have an “internal” notion of implication. In other words, $p \Rightarrow q$ should not merely be true or false but should itself be a proposition. This would allow us to state inference rules like modus ponens ($p \wedge (p \Rightarrow q) \to q$). The corresponding universal property is that $p \wedge q \to r$ if and only if $p \to (q \Rightarrow r)$. This is precisely the universal property of exponential objects, which we encountered when talking about the Lawvere fixed point theorem.

For posets, having finite meets is equivalent to having finite limits, and dually having finite joins is equivalent to having finite colimits. A category with finite limits and exponential objects is cartesian closed, and a cartesian closed category with finite coproducts is bicartesian closed. Hence the posets we are interested in are precisely the bicartesian closed posets; these are in turn precisely the **Heyting algebras**.

*Example.* Let $L$ be a lattice with arbitrary joins such that finite meets distribute over arbitrary joins. Then $L$ is cartesian closed and hence a Heyting algebra. This is a consequence of the adjoint functor theorem for posets, and in particular implies that the lattice of open subsets of any topological space is a Heyting algebra. For open sets $U, V$ the implication $U \Rightarrow V$ turns out to be

$$U \Rightarrow V = \text{int}(U^c \cup V).$$
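The universal property of implication is easy to test by brute force on a finite topological space. The sketch below assumes the standard formula for open sets (implication is the interior of the union of the complement of the first set with the second) and checks the adjunction: $W \cap U \subseteq V$ if and only if $W \subseteq (U \Rightarrow V)$. The three-point space and the function names are hypothetical choices for illustration.

```python
# Hypothetical finite space: points {0, 1, 2} with the open sets listed below.
X = frozenset({0, 1, 2})
opens = [frozenset(s) for s in ([], [0], [0, 1], [0, 2], [0, 1, 2])]

def interior(subset):
    """Largest open set contained in `subset`."""
    result = frozenset()
    for u in opens:
        if u <= subset:
            result |= u
    return result

def implies(u, v):
    """Heyting implication on open sets: interior of (complement of u) union v."""
    return interior((X - u) | v)

def adjunction_holds():
    """Check W & U <= V  iff  W <= (U => V) for all open W, U, V."""
    return all(((w & u) <= v) == (w <= implies(u, v))
               for w in opens for u in opens for v in opens)
```

For instance, in this space the implication from `{0, 1}` to `{0}` is `{0, 2}`, which is strictly larger than the target and is not the set-theoretic complement union.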

**Not**

Finally, propositional logic should have a notion of negation. The notion of negation we’ll adopt is that the negation of a proposition asserts that it implies false, so

$$\neg p = (p \Rightarrow \bot).$$

Note that there is no reason for double negation to hold in general. There is also no reason for excluded middle to hold in general. So we’re really in the realm of intuitionistic logic here.

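*Example.* Let $L$ be the lattice of open subsets of a topological space $X$ as above. Then negation takes the form

$$\neg U = \text{int}(U^c)$$

and $\neg \neg U = U$ can easily fail. For example, let $X = \mathbb{R}$ and let $U = \mathbb{R} \setminus \{ 0 \}$. Then $\neg U = \emptyset$, so $\neg \neg U = \mathbb{R} \neq U$. Excluded middle fails even more badly: in any topological space, $U \vee \neg U = X$ iff $U$ is clopen, hence $L$ satisfies excluded middle iff every open subset of $X$ is also closed (for example, if $X$ is discrete), in which case $L$ is a Boolean algebra. $X$ is connected iff $L$ never satisfies any nontrivial case of excluded middle.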

Note, however, that for topological spaces we always have $U \subseteq \neg \neg U$, and in fact in any Heyting algebra we always have

$$p \to \neg \neg p.$$

To see this, observe that by the universal property this implication holds if and only if $p \wedge \neg p \to \bot$, but this follows from modus ponens.
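The same kind of finite check works for negation. The sketch below assumes negation of an open set is the interior of its complement, uses a hypothetical connected three-point space, and verifies three things: every open set is contained in its double negation, double negation can be strictly bigger, and excluded middle holds only for the empty set and the whole space.

```python
# Hypothetical finite space: points {0, 1, 2} with the open sets listed below.
X = frozenset({0, 1, 2})
opens = [frozenset(s) for s in ([], [0], [0, 1], [0, 2], [0, 1, 2])]

def interior(subset):
    """Largest open set contained in `subset`."""
    result = frozenset()
    for u in opens:
        if u <= subset:
            result |= u
    return result

def neg(u):
    """Negation in the Heyting algebra of open sets: interior of the complement."""
    return interior(X - u)

# Open sets for which double negation is the identity.
regular = [u for u in opens if neg(neg(u)) == u]

# Open sets satisfying excluded middle: u or not-u is everything.
decidable = [u for u in opens if (u | neg(u)) == X]
```

In this space the open set `{0}` is dense, so its negation is empty and its double negation is the whole space, mirroring the behaviour of a punctured line.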


**Method 1: Mayer-Vietoris**

For this particular method write . We can compute the cohomology of inductively by regarding it as the union of two copies of with intersection and using Mayer-Vietoris. The cohomological version of Mayer-Vietoris is a long exact sequence of the form

.

The maps are induced by pulling back along the inclusion , whereas the maps are induced by the difference between the pullbacks along the inclusions . Because these maps are homotopic to the identity map , we can think of as being given by

where , and we can think of as being given by two copies of a single map , which we’ll denote by . It follows that is the antidiagonal copy of in , hence factors through the map from to and contains a copy of given by .

It also follows that is the diagonal copy of , hence that is surjective. Finally, is the kernel of , hence the quotient of by is . In other words, we have short exact sequences

$$0 \to H^{k-1}(T^{n-1}, \mathbb{Z}) \to H^k(T^n, \mathbb{Z}) \to H^k(T^{n-1}, \mathbb{Z}) \to 0.$$

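But inductively it will turn out that all the groups involved are free abelian so all of these exact sequences split. In fact, inducting on the above relation it follows that the Poincaré polynomials

$$P_n(t) = \sum_{k \ge 0} \text{rank} \, H^k(T^n, \mathbb{Z}) \, t^k$$

satisfy $P_0(t) = 1$ and $P_n(t) = (1 + t) P_{n-1}(t)$, hence

$$P_n(t) = (1 + t)^n.$$

So by induction we conclude that $H^k(T^n, \mathbb{Z}) \cong \mathbb{Z}^{\binom{n}{k}}$. Note that we have not computed the cup product structure.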
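The induction is easy to run mechanically. This sketch assumes the recurrence in the form "the Poincaré polynomial of the $n$-torus is $(1 + t)$ times that of the $(n-1)$-torus, starting from the constant polynomial $1$ for a point", and returns the list of Betti numbers; the function name is an illustrative choice.

```python
from math import comb

def poincare_polynomial(n):
    """Coefficients [b_0, ..., b_n] obtained by applying P -> (1 + t) P a total of n times."""
    coeffs = [1]                      # the cohomology of a point
    for _ in range(n):
        shifted = [0] + coeffs        # t * P(t)
        padded = coeffs + [0]         # P(t), padded to the same length
        coeffs = [a + b for a, b in zip(padded, shifted)]
    return coeffs

betti_4 = poincare_polynomial(4)
```

Each step is exactly the rank count coming from a split short exact sequence: the rank in degree $k$ is the sum of the previous ranks in degrees $k$ and $k - 1$, which is Pascal’s rule.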

**Method 2: the Künneth formula**

This method will compute the cup product structure. $T^n$ is the product of $n$ copies of $S^1$, whose cohomology as a ring is $\mathbb{Z}[x]/x^2$ with $x$ in degree $1$; there are no interesting cup products. By the Künneth formula, the cohomology of $T^n$ is the graded tensor product, as algebras, of $n$ copies of $\mathbb{Z}[x]/x^2$ (since all of the cohomology groups involved are free). This is precisely the exterior algebra $\Lambda_{\mathbb{Z}}[x_1, \dots, x_n]$, with each generator in degree $1$. In particular, $H^k(T^n, \mathbb{Z}) \cong \Lambda^k(H^1(T^n, \mathbb{Z}))$ naturally, and under this isomorphism the cup product corresponds to the wedge product.

**Method 3: de Rham cohomology**

This method will compute the cohomology over $\mathbb{R}$ by computing the de Rham cohomology of $T^n$. One particularly nice way to do this is to use the following.

**Theorem:** Let $G$ be a compact connected Lie group acting on a smooth manifold $M$. The inclusion $\Omega^{\bullet}(M)^G \to \Omega^{\bullet}(M)$ of invariant differential forms into differential forms is a quasi-isomorphism (induces an isomorphism on cohomology).

The idea behind this result is that, since $G$ is compact, there is an averaging operator $\Omega^{\bullet}(M) \to \Omega^{\bullet}(M)^G$ given by averaging over the action of $G$ with respect to normalized Haar measure on $G$. But since $G$ is connected, the action of any individual element of $G$ is homotopic to the identity, so this average is also homotopic to the identity.

In particular, letting $T^n$ act on itself by translation, we conclude that we can compute its de Rham cohomology using translationally invariant differential forms on $T^n$, or equivalently on its universal cover $\mathbb{R}^n$. But these are precisely the differential forms obtained by wedging together the $1$-forms $dx_1, \dots, dx_n$. The exterior derivative vanishes on all such forms, so we conclude that the de Rham cohomology of $T^n$ is the exterior algebra on $dx_1, \dots, dx_n$.

**Method 4: Hopf algebras**

This method will compute the cohomology over . Since is a topological group, it’s equipped with a product operation . The induced map in cohomology has the form

by the Künneth formula. This map is coassociative and compatible with cup product, so equips with the structure of a bialgebra. Together with the map induced by the inversion map and the identity , the cohomology of acquires the structure of a Hopf algebra, and in fact this was Hopf’s motivation for introducing Hopf algebras. Hopf algebras arising in this way satisfy the following very stringent structure theorem.

**Theorem (Hopf):** Let $H$ be a finite-dimensional graded commutative and cocommutative Hopf algebra over a field $k$ of characteristic zero such that $H_0 = k$ (the Hopf algebra is connected). Then $H$ is the exterior algebra on a finite collection of generators of odd degrees.

The comultiplication sends each generator $x$ to $x \otimes 1 + 1 \otimes x$, the antipode sends each generator $x$ to $-x$, and the counit sends each generator to $0$.

To compute the cohomology of it therefore suffices to determine what the possible generators of the exterior algebra are. For starters, let’s write more abstractly as where is a finite-dimensional real vector space of dimension and is a lattice in of full rank (the subgroup generated by a basis of ). Covering space theory gives us that . By the Hurewicz theorem, , so by the universal coefficient theorem,

.

This gives us generators of degree , one for each element of a basis of , and so at the very least contains the exterior algebra . But now we’re done: the cohomology can’t contain any generators of higher degree because wedging them with the generators we’ve already found would produce nonzero elements of the cohomology of in degrees higher than , and no such elements exist (either because admits a CW-decomposition involving cells of dimension at most or because the de Rham complex only extends up to dimension for a smooth manifold of dimension ).

**Method 5: suspension**

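Recall that cohomology is a stable invariant in the sense that

$$\tilde{H}^{k+1}(\Sigma X) \cong \tilde{H}^k(X)$$

where $\Sigma X$ is the (reduced) suspension of $X$ (here a pointed space). Recall also that for nice pointed spaces the suspension of a product has homotopy type

$$\Sigma (X \times Y) \simeq \Sigma X \vee \Sigma Y \vee \Sigma (X \wedge Y)$$

where $\vee$ is the wedge sum and $\wedge$ is the smash product. Finally, recall that $\Sigma S^n \cong S^{n+1}$ and that $S^n \wedge S^m \cong S^{n+m}$, so $\Sigma (S^1 \times S^1) \simeq S^2 \vee S^2 \vee S^3$.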

Two spaces $X, Y$ are said to be stably homotopy equivalent if $\Sigma^m X \simeq \Sigma^m Y$ for some $m$; in particular, stably homotopy equivalent spaces have isomorphic cohomology. The above result tells us that $S^1 \times S^1$ is stably homotopy equivalent to $S^1 \vee S^1 \vee S^2$ (once we know that suspension commutes with wedge sums). More generally, by induction we conclude that a product $X_1 \times \dots \times X_n$ is stably homotopy equivalent to a wedge obtained formally by expanding

$$(1 \vee X_1) \wedge \dots \wedge (1 \vee X_n),$$

where $1 = S^0$ denotes the unit of the smash product, and removing the unit. It follows that $T^n$ is stably homotopy equivalent to a wedge of $\binom{n}{k}$ copies of the $k$-sphere, $1 \le k \le n$, and by a simple application of Mayer-Vietoris (for wedge sums), the cohomology of such a wedge is the same as what we’ve computed before.

This argument does not get us the cup product structure, since the cup product is an unstable phenomenon; after suspension, all cup products are trivial. However, it does describe the stable homotopy type of , which contains information that cohomology doesn’t (e.g. about stable homotopy groups).

**Method 6: cellular homology**

To compute the cohomology of it suffices to compute the homology and apply either universal coefficients or Poincaré duality. It is possible to describe fairly concretely what the homology of looks like using cellular homology. Recall that cellular homology describes a chain complex computing the homology of a CW-complex which in degree is free abelian on the -cells in a cell decomposition of . Our particular admits a cell decomposition with cells of dimension given by starting with the minimal cell decomposition of into two cells (a -cell and a -cell connecting the -cell to itself) and taking products, where we’re thinking of cubical -cells here. Equivalently, we can think of as being with opposite -faces identified, and then our cells are the faces of up to this identification.

The boundary maps in the cellular complex are as follows. If is a -cell and its attaching map (where here denotes the -skeleton of ), then the differential is

where runs over an enumeration of all -cells , denotes the degree, and is the map induced by collapsing all of except the cell to a point.

In this particular case all of the boundary maps in the cellular complex are trivial, so the homology is free abelian on cells. To see this, note that if is not surjective, then it necessarily has degree since it is null-homotopic, so we reduce to the surjective case. In this case the -cell must be a face of the -cell , and since we’ve collapsed everything else we can reduce to the case that , so that is the top-dimensional cell. At this point we will cheat a little: if in this case, then we would have , but is a compact orientable manifold and therefore must satisfy .

In particular, the cell decomposition we gave above is minimal: it is not possible to give a cell decomposition with fewer cells. In addition, by Poincaré duality the cohomology can also be thought of as free abelian on cells, and moreover we can describe the cup product in terms of transverse intersections of submanifolds representing homology classes. We can do this by explicitly intersecting the cells above, but the following description is perhaps more elegant: if we think of as , then a subspace represents a homology class if it is translation-invariant (given by the pushforward of a fundamental class). The images of two such subspaces intersect transversely if , and then their intersection represents a homology class which Poincaré dualizes to the cup product of the Poincaré duals of . In particular, note that the short exact sequence

implies that . Its Poincaré dual is therefore a class in , which has the correct degree.

**Method ???: the Lefschetz fixed point theorem**

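This method is not numbered because the argument is incomplete. Consider the map

$$f : T^n \ni (z_1, \dots, z_n) \mapsto (z_1^{m_1}, \dots, z_n^{m_n}) \in T^n$$

where each $m_i$ is an integer with $m_i \ge 2$. This map has $\prod_i (m_i - 1)$ fixed points, since in each coordinate the fixed points of $z \mapsto z^{m_i}$ are precisely the $(m_i - 1)$th roots of unity. Each fixed point has index $(-1)^n$. By the Lefschetz fixed point theorem it follows that

$$\prod_{i=1}^n (1 - m_i) = \sum_{k \ge 0} (-1)^k \, \text{tr} \left( f^{\ast} |_{H^k(T^n, \mathbb{Q})} \right).$$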

Knowing what we already know about the cohomology, it is tempting to identify a monomial on the LHS with a cohomology class on the RHS on which acts by multiplication by that monomial. We can do this as follows. For any subset of indices we have a projection map . Since is a compact orientable manifold, it has a fundamental class generating its top cohomology. The map induces a map on such that any point has preimages, hence has degree as a map on , so acts on the fundamental class by multiplication by . This action induces an action on the pullback of the fundamental class of to which is also by multiplication by .

As the vary this argument shows that the cohomology classes arising in this way are all linearly independent, hence all contribute to the RHS of the Lefschetz fixed point theorem. The sum of the corresponding contributions to the RHS exhaust all terms on the LHS, so if there is any more cohomology to be found then it isn’t being detected by .
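As a sanity check on this bookkeeping, the following Python sketch compares the two sides of the Lefschetz formula numerically. It rests on assumptions that should be read as such: the exponents of the map are passed in as a hypothetical list `ks`, each coordinate map $z \mapsto z^k$ is taken to have $k - 1$ fixed points on the circle, every fixed point is taken to have index $(-1)^n$, and there is assumed to be exactly one cohomology class per subset of indices, scaled by the product of the corresponding exponents.

```python
from itertools import combinations
from math import prod


def fixed_point_side(ks):
    """Signed fixed-point count of (z_1, ..., z_n) -> (z_1^k_1, ..., z_n^k_n).

    Assumes k - 1 fixed points per coordinate and index (-1)^n at each one.
    """
    n = len(ks)
    return (-1) ** n * prod(k - 1 for k in ks)


def trace_side(ks):
    """Alternating sum of traces on cohomology.

    Assumes one class per subset S of indices, sitting in degree |S| and
    scaled by the product of the exponents indexed by S.
    """
    n = len(ks)
    total = 0
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            total += (-1) ** size * prod(ks[i] for i in subset)
    return total


if __name__ == "__main__":
    ks = [2, 3, 5]
    print(fixed_point_side(ks), trace_side(ks))
```

If the two numbers agree for every choice of exponents, then the classes indexed by subsets already account for the whole fixed-point count, which is the sense in which this map cannot detect any further cohomology.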

**Method ???: Morse theory**

There is a convenient choice of Morse function on given by

.

The gradient of this function is , and in particular it vanishes iff for all . There are therefore critical points , organized in batches of critical points such that coordinates are equal to and coordinates are equal to . At such a point of the second derivatives of each term are equal to and are equal to , with no other contributions to the second-order Taylor series expansion of , so all critical points are nondegenerate (hence we do in fact have a Morse function) with index . Morse theory then guarantees that has the homotopy type of a CW-complex with cells of dimension .
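The count of critical points by index can be checked by brute force. The sketch below assumes a specific Morse function, namely the sum of $\cos(2 \pi x_i)$ over the coordinates of the torus regarded as $\mathbb{R}^n / \mathbb{Z}^n$; this choice is an assumption made for illustration, as is the helper name `morse_index_counts`. Under that assumption the critical points are the points with every coordinate in $\{0, 1/2\}$, and the index is the number of coordinates at which the second derivative is negative.

```python
from itertools import product
from math import cos, pi


def morse_index_counts(n):
    """Count critical points of sum(cos(2*pi*x_i)) on the n-torus by index.

    Returns a list whose k-th entry is the number of critical points of
    index k. The Hessian is diagonal, so the index is just the number of
    negative diagonal entries.
    """
    counts = [0] * (n + 1)
    for point in product((0.0, 0.5), repeat=n):
        # Second derivative of cos(2*pi*x) is -(2*pi)^2 * cos(2*pi*x).
        hessian_diagonal = [-((2 * pi) ** 2) * cos(2 * pi * x) for x in point]
        index = sum(1 for entry in hessian_diagonal if entry < 0)
        counts[index] += 1
    return counts


if __name__ == "__main__":
    print(morse_index_counts(4))
```

The counts come out as binomial coefficients, which matches the number of cells in each dimension predicted by the Morse-theoretic argument.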

This argument should be placed in the context of a Morse-theoretic proof of the Künneth formula; more generally, if are manifolds with Morse functions , then is a Morse function on the product , and critical points of are precisely products of critical points on the , and so forth.

With more effort Morse theory even provides a complex computing the homology, but I wasn’t able to easily compute the differentials in it (they should all vanish in this case).

**Interpretation**

Our computations admit the following interpretation. Recall that $S^1$ is the Eilenberg-MacLane space $K(\mathbb{Z}, 1)$ representing integral cohomology $H^1(-, \mathbb{Z})$ in the sense that there is a natural isomorphism $H^1(X, \mathbb{Z}) \cong [X, S^1]$, where $[X, Y]$ denotes the set of homotopy classes of maps (or weak homotopy classes if $X, Y$ are not CW-complexes) $X \to Y$. It follows that $T^n$ represents $n$-tuples of cohomology classes in $H^1$. By the Yoneda lemma, cohomology classes in $H^k(T^n, \mathbb{Z})$, or equivalently homotopy classes of maps $T^n \to K(\mathbb{Z}, k)$, can naturally be identified with natural transformations

$$H^1(-, \mathbb{Z})^n \to H^k(-, \mathbb{Z}).$$

Such natural transformations between cohomology functors are called cohomology operations, and the computations we did above imply that the only cohomology operations of this form are integer linear combinations of cup products of the input classes. (“Interesting” cohomology operations over $\mathbb{Z}$, not generated by addition and the cup product, require higher cohomology classes as input. The smallest one is a cohomology operation ; see this math.SE question.)


Omega places before you two opaque boxes. Box A, it informs you, contains $1,000. Box B, it informs you, contains either $1,000,000 or nothing. You must decide whether to take only Box B or to take both Box A and Box B, with the following caveat: Omega filled Box B with $1,000,000 if and only if it predicted that you would take only Box B.

What do you do?

(If you haven’t heard this problem before, please take a minute to decide on an option before continuing.)

**The paradox**

The paradox is that there appear to be two reasonable arguments about which option to take, but unfortunately the two arguments support opposite conclusions.

The **two-box** argument is that you should clearly take both boxes. You take Box B either way, so the only decision you’re making is whether to also take Box A. No matter what Omega did before offering the boxes to you, Box A is guaranteed to contain $1,000, so taking it is guaranteed to make you $1,000 richer.

The **one-box** argument is that you should clearly take only Box B. By hypothesis, if you take only Box B, Omega will predict that and will fill Box B, so you get $1,000,000; if you take both boxes, Omega will predict that and won’t fill Box B, so you only get $1,000.

The two-boxer might respond to the one-boxer as follows: “it sounds like you think a decision you make in the present, at the moment Omega offers you the boxes, will affect what Omega did in the past, at the moment Omega filled the boxes. That’s absurd.”

The one-boxer might respond to the two-boxer as follows: “it sounds like you think you can just make decisions without Omega predicting them. But by hypothesis he can predict them. That’s absurd.”

Now what do you do?

(Again, please take a minute to reassess your original choice before continuing.)

**The von Neumann-Morgenstern theorem**

Let’s avoid the above question entirely by asking some other questions instead. For example, a question one might want to ask after having thought about Newcomb’s paradox for a bit is “in general, how should I think about the process of making decisions?” This is the subject of **decision theory**, which is roughly about decisions in the same sense that game theory is about games. The things that make decisions in decision theory are abstractions that we will refer to as **agents**. Agents have some preferences about the world and are making decisions in an attempt to satisfy their preferences.

One model of preferences is as follows: there is a set $X$ of (mutually exclusive) **outcomes**, and we will model preferences by a binary relation $\succeq$ on outcomes describing pairs of outcomes $a, b$ such that the agent **weakly prefers** $a$ to $b$, written $a \succeq b$. This means either that in a decision between the two the agent would pick $a$ over $b$ (the agent **strictly prefers** $a$ to $b$; we write this as $a \succ b$) or that the agent is indifferent between them. The weak preference relation should be a total preorder; that is, it should satisfy the following axioms:

- **Reflexivity:** $a \succeq a$. (The agent is indifferent between an outcome and itself.)
- **Transitivity:** If $a \succeq b$ and $b \succeq c$, then $a \succeq c$. (The agent’s preferences are transitive.)
- **Totality:** Either $a \succeq b$ or $b \succeq a$. (The agent has a preference about every pair of outcomes.)

If $a \succeq b$ and $b \succeq a$ then this means that the agent is indifferent between the two outcomes; we write this as $a \sim b$. The axioms above imply that indifference is an equivalence relation.

The strong assumptions here are transitivity and totality. One reason totality is a reasonable axiom is that an agent whose preferences aren’t total may be incapable of making a decision if presented with a choice between two outcomes the agent doesn’t have a defined preference between, and this seems undesirable. For example, if we were trying to write a program to make medical decisions, we wouldn’t want the program to crash if faced with the wrong kind of medical crisis.

One reason transitivity is a reasonable axiom is that an agent whose preferences aren’t transitive can be **money pumped**. For example, if an agent strictly prefers apples to oranges, oranges to bananas, and bananas to apples, then I can offer the agent an apple, then offer to trade it a banana for the apple and a penny (say), then offer to trade it an orange for the banana and a penny (say), and so forth. Again, if we were trying to write a program to make important decisions of some kind, this kind of vulnerability would be very dangerous.
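The money pump is easy to simulate. The sketch below is an illustration under stated assumptions: strict preferences are encoded as a set of `(better, worse)` pairs, every accepted trade costs a fixed fee in cents, and the agent accepts a trade exactly when it strictly prefers the offered item to the one it holds. The names `money_pump`, `CYCLIC`, and `TRANSITIVE` are hypothetical.

```python
def money_pump(strictly_prefers, start, offers, fee_cents=1):
    """Offer a sequence of trades to an agent and track what it pays.

    strictly_prefers: set of (better, worse) pairs.
    start: the item the agent holds initially.
    offers: items offered one at a time in exchange for the held item plus a fee.
    Returns the final holding and the total paid in cents.
    """
    holding, paid = start, 0
    for item in offers:
        if (item, holding) in strictly_prefers:
            holding = item
            paid += fee_cents
    return holding, paid


# Cyclic preferences: apples over oranges, oranges over bananas, bananas over apples.
CYCLIC = {("apple", "orange"), ("orange", "banana"), ("banana", "apple")}

# A transitive alternative: apples over oranges over bananas.
TRANSITIVE = {("apple", "orange"), ("orange", "banana"), ("apple", "banana")}

if __name__ == "__main__":
    offers = ["banana", "orange", "apple"] * 10
    print(money_pump(CYCLIC, "apple", offers))
    print(money_pump(TRANSITIVE, "apple", offers))
```

With the cyclic preferences the agent ends up holding the apple it started with and is poorer by one fee per trade, while the agent with transitive preferences declines every offer.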

In this model, an agent makes decisions as follows. Each time it makes a decision, it must choose from some number of actions. It needs to determine what outcomes result from each of these actions. Then it needs to determine which of these outcomes is greatest in its preference ordering, and it selects the corresponding action.

This is very unsatisfying as a model of decision making because it fails to take into account uncertainty. In practice, agents making decisions cannot completely determine what outcomes result from their actions: instead, they have some uncertainty about possible outcomes, and that uncertainty should be factored into the decision-making process. We will take uncertainty into account as follows. Define a **lottery** over outcomes to be a formal linear combination

$$L = \sum_i p_i x_i$$

of outcomes $x_i$, where the $p_i$ are non-negative real numbers summing to $1$ and should be interpreted as the probabilities that the outcomes $x_i$ occur. (Equivalently, a lottery is a particularly simple kind of probability measure on the space of outcomes, which is given the discrete $\sigma$-algebra as a measurable space, but we will not need to use this language.) We now want our agent to have preferences over lotteries rather than preferences over outcomes. That is, the agent’s preferences are now modeled by a total preorder $\succeq$ on lotteries, with $\succ$ denoting strict preference and $\sim$ denoting indifference.

Aside from the axioms defining a total preorder, what other axioms seem reasonable? First, suppose that $L, M$ are two lotteries such that the agent weakly prefers $L$ to $M$, written $L \succeq M$ (with $\succ$ for strict preference and $\sim$ for indifference). Now consider the modified lotteries $tL + (1-t)N$ and $tM + (1-t)N$ where with probability $t$ the original lotteries occur but with probability $1 - t$ some other fixed lottery $N$ occurs. Whether we are in the first case or not, the agent weakly prefers what happens in the first modified lottery to what happens in the second, so the following seems reasonable.

**Independence:** If $L \succeq M$, then for all $t \in [0, 1]$ and all lotteries $N$ we have $tL + (1-t)N \succeq tM + (1-t)N$. Moreover, if $L \succ M$ and $t \in (0, 1]$ then $tL + (1-t)N \succ tM + (1-t)N$.

Note that by taking the contrapositive of the second part of independence we get a partial converse of the first part: if $t \in (0, 1]$ is such that $tL + (1-t)N \succeq tM + (1-t)N$, then $L \succeq M$. In particular, if $tL + (1-t)N \sim tM + (1-t)N$, then $L \sim M$. This will be useful later.

Another reasonable axiom is the following. Suppose $L, M, N$ are three lotteries such that $L \preceq M \preceq N$. Now consider the family of lotteries $tL + (1-t)N$ for $t \in [0, 1]$. When $t = 0$ the agent weakly prefers this lottery to $M$, but when $t = 1$ the agent weakly prefers $M$ to this lottery. What happens for intermediate values of $t$? It seems reasonable for an “intermediate value theorem” to hold here: the agent’s preferences should not jump as $t$ varies. So the following seems reasonable.

**Continuity:** If $L \preceq M \preceq N$, then there exists some $t \in [0, 1]$ such that $tL + (1-t)N \sim M$.

With these axioms we can now state the following foundational theorem.

**Theorem (von Neumann-Morgenstern):** Suppose an agent’s preferences satisfy the above axioms. Then there exists a real-valued function $u$ on outcomes, the **utility function** of the agent, such that $L \succeq M$ if and only if

$$\sum_i p_i u(x_i) \ge \sum_i q_i u(x_i)$$

where $L = \sum_i p_i x_i$ and $M = \sum_i q_i x_i$. The utility function is unique up to affine transformations $u \mapsto au + b$ where $a > 0$.

If is a lottery, the corresponding sum is the **expected utility** with respect to the lottery, so the von Neumann-Morgenstern theorem allows us to describe the goal of an agent (a **VNM-rational agent**) satisfying the above axioms as maximizing expected utility.
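To make the statement concrete, here is a minimal Python sketch of comparing lotteries by expected utility. The representation is an assumption made for illustration: a lottery is a dictionary from outcome names to probabilities, a utility function is a dictionary from outcome names to numbers, and the names `expected_utility`, `weakly_prefers`, and `rescale` are hypothetical.

```python
def expected_utility(lottery, utility):
    """Expected utility of a lottery given as {outcome: probability}."""
    probabilities = list(lottery.values())
    if any(p < 0 for p in probabilities) or abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must be non-negative and sum to 1")
    return sum(p * utility[outcome] for outcome, p in lottery.items())


def weakly_prefers(first, second, utility):
    """True if an expected utility maximizer weakly prefers first to second."""
    return expected_utility(first, utility) >= expected_utility(second, utility)


def rescale(utility, a, b):
    """Apply the affine transformation u -> a*u + b; a should be positive."""
    return {outcome: a * value + b for outcome, value in utility.items()}


if __name__ == "__main__":
    utility = {"bad": 0.0, "fine": 1.0, "great": 3.0}
    gamble = {"bad": 0.5, "great": 0.5}
    sure_thing = {"fine": 1.0}
    print(weakly_prefers(gamble, sure_thing, utility))
    print(weakly_prefers(gamble, sure_thing, rescale(utility, 2.0, 5.0)))
```

Rescaling the utility function by a positive affine transformation changes the expected utilities but not the comparisons between lotteries, which is the uniqueness clause of the theorem in action.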

*Proof.* First observe that we can reduce to the case that is finite. If the theorem were false in the infinite case, then for any proposed utility function we would be able to find a pair of lotteries such that but . But since in total only involve finitely many outcomes, restricts to a utility function with the same property on the finitely many outcomes involved in , so the theorem is false in the finite case.

Now for the proof. It is possible to take a fairly concrete but tedious approach by first constructing using continuity and then proving that satisfies the conclusions of the theorem by induction. We will instead take a more abstract approach by appealing to the hyperplane separation theorem. To start with, think of the set of lotteries as sitting inside Euclidean space as the probability simplex . Let be outcomes which are minimal resp. maximal in the agent’s preference ordering. For , let .

We would like to show that the subset

(of lotteries the agent strictly prefers to , but strictly prefers to) and the subset

(of lotteries the agent strictly prefers to but strictly prefers to ) are disjoint convex open subsets of . That they are disjoint follows from the definition of strict preference. That they are convex can be seen as follows: if are two lotteries such that , then by independence we have

for all , hence for all . Applying this argument with and then applying the argument with reversed inequality signs, first with general and then with , gives the desired result.

Finally, that they are open can be seen as follows: let be a lottery such that . By inspection every point in an open ball around has the form where is some other lottery, which can be taken to be either a lottery equivalent to (in that the agent is indifferent between them) or a lottery such that . So it suffices by convexity to show that for any such there exists some such that .

In the case that can be taken to be equivalent to this is straightforward; by independence

.

In the case that can be taken to satisfy , a similar application of independence gives

.

Again, applying the argument with and then applying the argument with reversed inequality signs, first with general and then with , gives the desired result.

Now by the hyperplane separation theorem there exists a hyperplane separating and , where are constants. These constants are in fact independent of and are (up to affine transformation, and in particular we may need to flip their signs) the utility function we seek. To see this, let be two lotteries. Then by independence , and by continuity there is a constant such that

.

If , then , and the separating hyperplane must pass through both and (since are in neither nor , and the complement of their union consists of lotteries equivalent to ), so they have the same utility. Conversely, if a separating hyperplane passes through two lotteries then they must be equivalent to the same and hence must be equivalent.

Otherwise, , and the separating hyperplane separates and . With the correct choice of signs, it follows that as desired. Conversely, if a separating hyperplane separates two lotteries then they cannot have the same expected utility and hence cannot be equivalent; with the correct choice of signs, if then .

It remains to address the uniqueness claim. The above discussion shows that the utility function is uniquely determined by its value on and , subject to the constraint that . To fix the correct choice of signs above we may set ; any other choice is related to this choice by a unique positive affine linear transformation.

**But what about the paradox?**

The relevance of the von Neumann-Morgenstern theorem to Newcomb’s paradox is that a particular interpretation of Newcomb’s paradox in the context of expected utility maximization supports the one-box argument. A VNM-rational agent participating in Newcomb’s paradox should be acting in order to maximize expected utility. For the purposes of recasting Newcomb’s paradox in this framework, it’s reasonable to equate utility with money; agents certainly don’t need to have the property that their utility functions are linear in money, but Newcomb’s paradox can just be restated in units of utility (**utilons**) rather than money.

So, it remains to determine the expected utility of the lottery that occurs if the agent takes one box and the lottery that occurs if the agent takes two boxes. Newcomb’s paradox can be interpreted as saying that in the first lottery, the box contains $1,000,000 with high probability (whatever probability the agent assigns to Omega being an accurate predictor), while in the second lottery, the two boxes together contain $1,000 with high probability. Provided that this probability is sufficiently high, which again can be absorbed into a suitable restatement of Newcomb’s paradox, it seems clear that a VNM-rational agent should take one box. (Note that stating the one-box argument in this way shows that it does not depend on Omega being a perfect predictor; Omega need only be a sufficiently good predictor, where the meaning of “sufficiently” depends on the ratio of the amounts of money in each box.)
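The dependence on Omega’s accuracy can be made explicit. The sketch below assumes that utility is linear in dollars and that Omega predicts correctly with the same probability `accuracy` whichever choice the agent makes; both are modeling assumptions, and the function names are hypothetical.

```python
def expected_payoffs(accuracy, small=1_000, large=1_000_000):
    """Expected dollars from one-boxing and from two-boxing.

    accuracy: assumed probability that Omega predicts the agent's choice correctly.
    small, large: contents of Box A and of a filled Box B.
    """
    # One-boxing pays off only if Omega correctly predicted one-boxing.
    one_box = accuracy * large
    # Two-boxing yields only Box A if predicted correctly, both amounts otherwise.
    two_box = accuracy * small + (1 - accuracy) * (small + large)
    return one_box, two_box


def break_even_accuracy(small=1_000, large=1_000_000):
    """Accuracy at which both choices have equal expected payoff.

    Solves p * large = p * small + (1 - p) * (small + large) for p.
    """
    return (small + large) / (2 * large)


if __name__ == "__main__":
    print(break_even_accuracy())
    for accuracy in (0.5, 0.6, 0.9, 0.99):
        print(accuracy, expected_payoffs(accuracy))
```

With the stated amounts the break-even accuracy is only slightly above one half, so under these assumptions Omega does not need to be anywhere near a perfect predictor for one-boxing to have the higher expected payoff.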

This version of the one-box argument is therefore based on the **principle of expected utility** (to be distinguished from the von Neumann-Morgenstern theorem); roughly speaking, that rational agents should act so as to maximize expected utility. Relative to the definition of expected utility given above this says exactly that rational agents should be VNM-rational.

The two-box argument can also be based on a decision-making principle, namely the **principle of dominance**, which says the following. Suppose an agent is choosing between two options $A$ and $B$. Say that $A$ **dominates** $B$ if there is a way to partition the possible states of the world such that in each part of the partition, the agent would prefer choosing $A$ to choosing $B$. (The notion of domination does not depend on having a notion of probability distribution over world states; it requires something much weaker, namely a set of possible world states.) The principle of dominance asserts that rational agents should choose dominant options.

This seems plausible. But it also seems to be the case that taking two boxes dominates taking one box in Newcomb’s paradox:

- If Omega has filled Box B with $1,000,000, then taking both boxes gives you $1,001,000 rather than $1,000,000, so it’s $1,000 better.
- If Omega hasn’t filled Box B with $1,000,000, then taking both boxes gives you $1,000 rather than $0, so it’s still $1,000 better.
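The two cases above amount to a check over a payoff table. The following sketch writes that check out; the table layout, the state labels `"filled"` and `"empty"`, and the function name `dominates` are all choices made here for illustration.

```python
# Payoff in dollars for each (state of Box B, action) pair.
PAYOFF = {
    ("filled", "one_box"): 1_000_000,
    ("filled", "two_box"): 1_001_000,
    ("empty", "one_box"): 0,
    ("empty", "two_box"): 1_000,
}

STATES = ("filled", "empty")


def dominates(first, second, payoff=PAYOFF, states=STATES):
    """True if first pays strictly more than second in every listed state."""
    return all(payoff[(state, first)] > payoff[(state, second)] for state in states)


if __name__ == "__main__":
    print(dominates("two_box", "one_box"))
    print(dominates("one_box", "two_box"))
```

Note that the check never refers to probabilities: it treats the state of Box B as fixed independently of the action, which is exactly the assumption the one-boxer disputes.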

One situation in which the principle of dominance doesn’t make sense is if the choice between options itself affects which partition of world-states you’re in. For example, if you chose which boxes to open and then Omega chose whether to fill Box B based on your choice, then the above reasoning doesn’t seem to apply since Omega gets to choose which partition of world-states you’re in after seeing your choice between the two options. But in the setting of Newcomb’s paradox itself this doesn’t seem to be the case: Omega has already made its decision in the past, and it seems absurd to think of the agent’s decision in the present as having an effect on Omega’s past decision.

So Newcomb’s paradox appears to show that the principle of expected utility maximization and the principle of dominance are inconsistent.

Now what do you do?

**Further reading**

Newcomb’s paradox remains, as far as I can tell, a hotly debated topic in the philosophical literature, and in particular is considered unresolved. Campbell and Sowden’s *Paradoxes of Rationality and Cooperation* is a thorough, if somewhat outdated, overview of some aspects of Newcomb’s paradox and its relationship to the prisoner’s dilemma.


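Let $f : X \times X \to 2$ be a function, where $2 = \{ 0, 1 \}$. Then we can write down a function $g : X \to 2$ such that $g(x) \neq f(x, x)$ for every $x \in X$. If we **curry** $f$ to obtain a function

$$\text{curry}(f) : X \to 2^X$$

it now follows that there cannot exist $x \in X$ such that $\text{curry}(f)(x) = g$, since $\text{curry}(f)(x)(x) = f(x, x) \neq g(x)$.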

Currying is a fundamental notion. In mathematics, it is constantly implicitly used to talk about function spaces. In computer science, it is how some programming languages like Haskell describe functions which take multiple arguments: such a function is modeled as taking one argument and returning a function which takes further arguments. In type theory, it reproduces function types. In logic, it reproduces material implication.
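For readers who prefer code, here is a small sketch of currying and uncurrying for two-argument functions. It is written in Python rather than Haskell, so currying is not automatic and has to be spelled out with nested functions; the names `curry`, `uncurry`, and `add` are just illustrative.

```python
def curry(f):
    """Turn a function of two arguments into a function returning a function."""
    def curried(x):
        def partially_applied(y):
            return f(x, y)
        return partially_applied
    return curried


def uncurry(g):
    """Inverse of curry: turn a function-returning function into a two-argument function."""
    def uncurried(x, y):
        return g(x)(y)
    return uncurried


def add(x, y):
    return x + y


if __name__ == "__main__":
    add_three = curry(add)(3)      # a one-argument function
    print(add_three(4))            # same value as add(3, 4)
    print(uncurry(curry(add))(3, 4))
```

The two operations are mutually inverse, which is the set-level shadow of the adjunction discussed in the rest of this post.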

Today we will discuss the appropriate categorical setting for understanding currying, namely that of cartesian closed categories. As an application of the formalism, we will prove the Lawvere fixed point theorem, which generalizes the argument behind Cantor’s theorem to cartesian closed categories.

**Some examples of mathematical currying**

*Example.* A group action on a set is often described using a function . Currying gives a function ; in other words, it associates to every element a function . It seems more natural to define a group action in this way, but what works in may work less well in other categories; for example, when defining actions of Lie groups on manifolds, we talk about smooth functions because it is unclear in this setting in what sense the space of smooth functions is a smooth manifold (hence in what sense we should be asking for smooth functions from into this space).

*Example.* A vector space is equipped with a dual pairing . Currying gives a function , and the corresponding functions are in fact linear, so we can associate to every an element of the double dual space . In other words, currying gives us the double dual map . There is a similar map in the setting of Pontrjagin duality.

*Example.* A topological space is equipped with an evaluation map , where here denotes the space of continuous complex-valued functions . Currying gives a function which associates to every an evaluation map . When is compact Hausdorff, every homomorphism of complex algebras has this form.

**Cartesian closed categories**

A **cartesian closed category** is a category $C$ with finite products in which, for every object $Y$, the product functor $(-) \times Y : C \to C$ has a right adjoint, the **exponential** $Y \Rightarrow (-)$. In other words, there is a natural identification

$$\text{Hom}(X \times Y, Z) \cong \text{Hom}(X, Y \Rightarrow Z).$$

The notation $Y \Rightarrow Z$ is nonstandard; a more conventional notation is $Z^Y$, but the notation $Y \Rightarrow Z$ (which is sometimes used for the more general notion of internal hom) emphasizes the fact that a cartesian closed category is in particular a closed monoidal category, and in particular is enriched over itself.

Letting $X = 1$ be the terminal object, we get that there is a natural identification

$$\text{Hom}(Y, Z) \cong \text{Hom}(1, Y \Rightarrow Z).$$

In other words, the **global points** (morphisms from $1$, also just called **points**) of $Y \Rightarrow Z$ are naturally identified with the set of morphisms from $Y$ to $Z$.

More generally, the $X$-points of $Y \Rightarrow Z$, which by definition are morphisms $X \to (Y \Rightarrow Z)$ and hence are naturally identified with $\text{Hom}(X \times Y, Z)$, should be thought of as “$X$-parameterized families of morphisms from $Y$ to $Z$.”

Uncurrying the identity map $(Y \Rightarrow Z) \to (Y \Rightarrow Z)$, we obtain the **evaluation map**

$$\text{eval} : (Y \Rightarrow Z) \times Y \to Z$$

describing, internally, how to evaluate functions on arguments. In computer science, this function is also called **apply**.

*Example.* is cartesian closed, and is the basic example. Here the internal hom is the set of functions from to and the global points of a set are its set of points in the ordinary sense. The same applies to .

*Example.* The category of (small) categories is cartesian closed. Here the product is the usual product of categories and the internal hom is the category of functors from to , with morphisms given by natural transformations. The global points of a category are its objects.

*Subexample.* In , the subcategory of groupoids is cartesian closed, since the product of groupoids and the functor category between two groupoids both remain groupoids. If are two groups regarded as one-object categories, the functor category is the groupoid whose objects are the morphisms and whose morphisms are given by pointwise conjugation by elements of . Note that the category of groups is not cartesian closed.

*Subexample.* In , the subcategory of posets is cartesian closed, since the product of posets and the functor category between two posets both remain posets. If are two posets, then is the poset of order-preserving functions with iff for all .

*Example.* Let be a group. The category is cartesian closed; it has a product inherited from , and exponential objects are given by the set of all functions from to together with the -action

.

The global points of a -set are its fixed points, and in particular the global points of are the set of -morphisms .
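The claim about fixed points can be tested on a small case. The sketch below assumes the conjugation action on functions, $(g \cdot f)(x) = g \cdot f(g^{-1} \cdot x)$, which is the standard choice; the specific group (the cyclic group of order 3 acting on a 3-element set by rotation) and all function names are assumptions made for illustration.

```python
from itertools import product

N = 3  # the cyclic group Z/3, written additively, acting on {0, 1, 2}


def act(g, x):
    """Z/N acting on {0, ..., N-1} by rotation."""
    return (x + g) % N


def act_on_function(g, f):
    """Assumed conjugation action: (g . f)(x) = g . f(g^{-1} . x).

    A function f is stored as a tuple with f[x] the value at x.
    """
    return tuple(act(g, f[act(-g, x)]) for x in range(N))


def fixed_functions():
    """Functions fixed by every group element under the conjugation action."""
    return [
        f for f in product(range(N), repeat=N)
        if all(act_on_function(g, f) == f for g in range(N))
    ]


def equivariant_functions():
    """Functions commuting with the action: f(g . x) = g . f(x)."""
    return [
        f for f in product(range(N), repeat=N)
        if all(f[act(g, x)] == act(g, f[x]) for g in range(N) for x in range(N))
    ]


if __name__ == "__main__":
    print(fixed_functions())
    print(equivariant_functions())
```

Both lists consist of the three rotations, so in this example the fixed points of the action on all functions are exactly the equivariant maps.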

*Example.* Any Boolean algebra, regarded as a poset and then regarded as a category, is cartesian closed. The product of two propositions is their logical “and” , and the exponential object is the material implication . The currying adjunction

simply says that implies if and only if implies . The terminal object is the proposition “true,” and a proposition has a global point if and only if it is a tautology. The evaluation map is an internal description of modus ponens.
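In the smallest nontrivial case, the two-element Boolean algebra of truth values, the adjunction and modus ponens can be checked exhaustively. The sketch below does only that; it is a check on one example, not a proof for arbitrary Boolean algebras, and the function names are hypothetical.

```python
from itertools import product


def implies(p, q):
    """Material implication on truth values."""
    return (not p) or q


def entails(p, q):
    """The order relation p <= q on truth values: there is a morphism from p to q."""
    return implies(p, q)


def adjunction_holds():
    """Check that (a and b) <= c exactly when a <= (b implies c)."""
    return all(
        entails(a and b, c) == entails(a, implies(b, c))
        for a, b, c in product((False, True), repeat=3)
    )


def modus_ponens_holds():
    """Check the evaluation map: ((b implies c) and b) <= c."""
    return all(
        entails(implies(b, c) and b, c)
        for b, c in product((False, True), repeat=2)
    )


if __name__ == "__main__":
    print(adjunction_holds(), modus_ponens_holds())
```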

*Non-example.* It is an unfortunate fact about point-set topology that is not cartesian closed (see, for example, this math.SE question). When it exists, the exponential is often given the compact-open topology. This problem is fixed by working instead with a convenient category of topological spaces, such as the category of compactly generated spaces.

*Non-example.* Suppose a cartesian closed category has a zero object . Since there is a unique morphism from to any other object, it follows that every exponential has a unique global point, hence that there is a unique morphism from any object to any other object (necessarily the zero morphism). Conversely, if has a zero object and a nonzero morphism, then cannot be cartesian closed.

**Proposition:** In a cartesian closed category, products distribute over colimits in both variables, and exponentials send colimits in the source variable to limits and preserve limits in the target variable.

**Corollary:** If is a cartesian closed category with finite coproducts (a **bicartesian closed category**), then letting denote the coproduct, we have the following natural identifications:

- (so is a distributive category),
- .

*Proof.* These all follow from the natural identifications

.

In more detail, is a left adjoint and hence preserves arbitrary colimits, is a right adjoint and hence preserves arbitrary limits, and is a (contravariant) right adjoint (to itself!) and hence, as a contravariant functor on , sends colimits to limits.

Specialized to the cartesian closed category of finite sets, the above result explains from a categorical point of view the algebraic axioms satisfied by addition, multiplication, and exponentiation of non-negative integers.
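For finite sets these identities can be checked by counting. The sketch below enumerates functions between small finite sets and compares cardinalities; the particular list of laws it checks (distributivity and three exponent laws) is my reading of what the corollary gives for finite sets, and the helper names are hypothetical.

```python
from itertools import product


def functions(domain, codomain):
    """Every function domain -> codomain, each stored as a tuple of outputs."""
    domain = list(domain)
    return list(product(list(codomain), repeat=len(domain)))


def disjoint_union(xs, ys):
    """Coproduct of two finite sets, with tags keeping the summands apart."""
    return [("left", x) for x in xs] + [("right", y) for y in ys]


def check_laws(x, y, z):
    """Compare cardinalities for sets of sizes x, y, z."""
    X, Y, Z = range(x), range(y), range(z)
    pairs = lambda A, B: list(product(A, B))
    # x * (y + z) = x * y + x * z
    distributive = len(pairs(X, disjoint_union(Y, Z))) == len(pairs(X, Y)) + len(pairs(X, Z))
    # z^(x + y) = z^x * z^y
    sum_law = len(functions(disjoint_union(X, Y), Z)) == len(functions(X, Z)) * len(functions(Y, Z))
    # (y * z)^x = y^x * z^x
    product_law = len(functions(X, pairs(Y, Z))) == len(functions(X, Y)) * len(functions(X, Z))
    # (z^y)^x = z^(x * y)
    power_law = len(functions(X, functions(Y, Z))) == len(functions(pairs(X, Y), Z))
    return distributive and sum_law and product_law and power_law


if __name__ == "__main__":
    print(check_laws(2, 3, 2))
```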

**Corollary:** Let be a category. Then the category of presheaves on is cartesian closed.

This greatly generalizes the example of ; we get, for example, a version of the category of graphs and the category of simplicial sets as special cases.

*Proof.* Products are easy to construct, since limits are computed pointwise. To construct exponentials, suppose that are two presheaves whose exponential exists. The universal property and the Yoneda lemma together imply that

which uniquely defines a presheaf. It remains to check that this presheaf really satisfies the universal property, but this follows from the fact that every presheaf is a colimit of representable presheaves and from the fact that products distribute over colimits, which is true because it is true pointwise; that is, in .

The terminal object in is the presheaf sending every object to and sending every morphism to the unique morphism . If itself has a terminal object , then it represents the terminal presheaf, hence a global point of a presheaf is just an element of , so we can explicitly verify that . In general, a global point of a presheaf is a choice of element for each which is compatible with every morphism in in the sense that if is any morphism, then ; in other words, it is an element of the limit .

In particular, if is the category of open subsets of a topological space (so that a presheaf on is a presheaf on in the usual sense), then a global point of a presheaf is a **global section**. Note that this is equivalently an element of (since is the terminal object) or a choice of element for each open which is compatible with inclusions in the sense that if then restricts to .

The category of sheaves on a topological space is also a cartesian closed category, and moreover is a topos.

**Presheaves on a monoid**

We showed earlier that is cartesian closed for a group, but the explicit description we gave of the exponential requires talking about inverses in . On the other hand, the above theorem implies in particular that is cartesian closed for a monoid which is not necessarily a group. What does the exponential look like in this case?

Let be a category with one object with endomorphism monoid . Then is the category of right -sets, and the unique representable presheaf is as a right -module over itself. If are two -sets, then the above description of the exponential gives

with right -action induced from the left -action of on itself. If is a group, this is naturally isomorphic to , since a morphism of right -sets is freely and uniquely determined by what it does to (where is the identity). This can fail if is not a group, since the value of such a morphism on may not be determined by the value on an element of the form if is not of the form for any , and also since the value of a morphism on may be constrained by the value on two elements of the form if there exists an such that .

*Example.* Let be the free monoid on an idempotent, so that . This is the smallest monoid which is not a group. The category of right -sets is the category of sets equipped with idempotent endomorphisms. The subcategory of such sets such that is constant (equivalently, such that has a unique fixed point) is equivalent to the category of pointed sets: a morphism between such -sets is precisely a map of sets which preserves the unique fixed point. Thus if are such -sets of cardinalities respectively, then is again such a set of cardinality , and so there are morphisms . On the other hand, there are maps of sets.

**The Lawvere fixed point theorem**

To motivate the Lawvere fixed point theorem, let’s write the diagonalization argument above in somewhat greater generality. If is any function, then we can find a function such that iff . Now we curry to obtain a function . If there exists such that , then as before and cannot be in the image of , hence is not surjective.

The crucial step is the step where we write down the function such that . A systematic way to do this is to compose with a function with no fixed points. Lawvere realized that, by taking contrapositives, this means the basic argument behind Cantor’s theorem can be recast as the following fixed point theorem.

**Theorem (Lawvere):** Let $X, Y$ be objects in a category with finite products such that the exponential $X \Rightarrow Y$ (also written $Y^X$) exists (in particular, this is true for any pair of objects in a cartesian closed category). Let $f : X \to (X \Rightarrow Y)$ and suppose that $f$ is **surjective on points** in the sense that the induced map $\text{Hom}(1, X) \to \text{Hom}(1, X \Rightarrow Y)$ is surjective. Then every morphism $g : Y \to Y$ has a **fixed point** in the sense that the induced map $\text{Hom}(1, Y) \to \text{Hom}(1, Y)$ has a fixed point; that is, $Y$ has the **fixed point property**.

*Proof.* Let be any morphism and let

where is the diagonal map; see, for example, this blog post. ( specializes to the paradoxical subset constructed in the usual proof of Cantor’s theorem.) By hypothesis, there exists a point such that (where, if are two morphisms in a category with finite products, denotes the product morphism .) But then

whereas by definition , from which it follows that is a fixed point of .

**Taking the contrapositive**

Taking the contrapositive, we conclude that if is an object in a cartesian closed category such that there exists a function with no fixed points, then no morphism can be surjective on points. When in we immediately reproduce Cantor’s theorem, and morally we reproduce Russell’s paradox as well. The proof of the Lawvere fixed point theorem actually provides a particular morphism not in the image of any morphism ; this particular morphism generalizes CantorBot and also gives us the unsolvability of the halting problem.
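In the category of finite sets the contrapositive can be verified exhaustively for small examples. The sketch below takes a hypothetical 3-element set `X`, the 2-element set `Y`, and the fixed-point-free map `swap` on `Y`, and checks that for every function `f` from `X` to the set of functions `X -> Y`, the diagonal function sending `x` to `swap(f(x)(x))` is missed by `f`. Functions are stored as dictionaries; all names are illustrative.

```python
from itertools import product


def all_functions(domain, codomain):
    """Every function domain -> codomain, each stored as a dict."""
    domain = list(domain)
    return [
        dict(zip(domain, values))
        for values in product(list(codomain), repeat=len(domain))
    ]


def diagonal(f, g, xs):
    """The function x -> g(f(x)(x)), stored as a dict."""
    return {x: g(f[x][x]) for x in xs}


def misses_diagonal(f, g, xs):
    """True if the diagonal function differs from f(x) for every x."""
    h = diagonal(f, g, xs)
    return all(f[x] != h for x in xs)


def swap(y):
    """A map on {0, 1} with no fixed points."""
    return 1 - y


X = [0, 1, 2]
Y = [0, 1]

# Check every one of the 8^3 functions from X to the set of functions X -> Y.
never_surjective = all(
    misses_diagonal(f, swap, X)
    for f in all_functions(X, all_functions(X, Y))
)

if __name__ == "__main__":
    print(never_surjective)
```

The reason the check succeeds is visible in `diagonal`: the diagonal function disagrees with `f[x]` at the input `x` itself, because `swap` changes every value.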


- Given a Lie group $G$, its tangent space at the identity is *a priori* a vector space, but it ends up having the structure of a Lie algebra.
- Given a space $X$, its cohomology is *a priori* a graded abelian group, but it ends up having the structure of a graded ring.
- Given a space $X$, its cohomology over $\mathbb{F}_p$ is *a priori* a graded abelian group (or a graded ring, once you make the above discovery), but it ends up having the structure of a module over the mod-$p$ Steenrod algebra.

The following question suggests itself: given a construction which we believe to output objects having a certain amount of structure, can we show that in some sense there is no extra structure to be found? For example, can we rule out the possibility that the tangent space to the identity of a Lie group has some mysterious natural trilinear operation that cannot be built out of the Lie bracket?

In this post we will answer this question for the homotopy groups of a space: that is, we will show that, in a suitable sense, each individual homotopy group is “only a group” and does not carry any additional structure. (This is not true about the collection of homotopy groups considered together: there are additional operations here like the Whitehead product.)

**Extra structure on a functor**

The setting in which we will work is the following. Suppose we have some functor $F : C \to D$ which *a priori* takes values in a category $D$. To what extent can we lift $F$ to a functor $\widetilde{F} : C \to \widetilde{D}$ taking values in a “more structured” category $\widetilde{D}$ equipped with a forgetful functor $\widetilde{D} \to D$ such that the obvious diagram commutes? As phrased, this question is incredibly general, so we will restrict ourselves to lifts which are described by taking into account structure coming from $n$-ary operations, as follows.

Suppose $D$ has finite products. Then we can consider natural transformations $F^n \to F$ to be $n$-ary operations (as in this previous post on Lawvere theories) on the outputs of the functor $F$ which equip the objects $F(c)$ with extra structure. More precisely, the full subcategory of the functor category $D^C$ on the objects $F^n$ is a Lawvere theory, the **endomorphism Lawvere theory** $\text{End}(F)$ of $F$ (named in analogy with the endomorphism operad). Note that equipping an object $d$ in a category $D$ with finite products with the structure of a model of a Lawvere theory $T$ is equivalent to giving a morphism $T \to \text{End}(d)$ of Lawvere theories; in particular, $F$ itself is tautologically a model of $\text{End}(F)$, and this model structure passes to $F(c)$. This lets us lift $F$ to a functor taking values in the category of $D$-valued models of $\text{End}(F)$, or more precisely the category of product-preserving functors $\text{End}(F) \to D$.

If $D = \text{Set}$, $F$ is representable by some object $c$, and $C$ also has finite coproducts, then we can identify natural transformations $F^n \to F$ with morphisms $c \to c^{\sqcup n}$ by the Yoneda lemma. Consequently, we can identify $\text{End}(F)$ with $\text{End}(c)$, where $c$ is regarded as an object in the opposite category $C^{op}$. There is a corresponding story where $F$ is a contravariant representable functor; here we just have $\text{End}(F) \cong \text{End}(c)$.

It may be hard to compute the entire endomorphism Lawvere theory of a functor, but any natural transformations $F^n \to F$ that we can find may already provide extra structure that wasn’t there before. More generally it is often possible to identify Lawvere theories $T$ and morphisms $T \to \text{End}(F)$ of Lawvere theories, which allow us to lift $F$ to the category of $D$-valued models of $T$. These kinds of observations are already enough to reproduce many familiar examples of extra structure, and generalize the observation that $\text{Hom}(a, b)$ is acted on from the left by the monoid of endomorphisms $\text{End}(b)$ and from the right by the monoid of endomorphisms $\text{End}(a)$.

*Example.* If $G$ is a group object in a category $C$ with finite products, then the group operation $G \times G \to G$ gives a morphism from the Lawvere theory of groups to $\text{End}(G)$. Hence $\text{Hom}(-, G)$ naturally acquires the structure of a group. (Conversely, by the Yoneda lemma, if $\text{Hom}(-, G)$ naturally has the structure of a group then $G$ is a group object.)

*Example.* Dually, if $G$ is a cogroup object in a category $C$ with finite coproducts, then the cogroup operation $G \to G \sqcup G$ gives a morphism from the Lawvere theory of groups to $\text{End}(G)$, where $G$ is regarded as an object of $C^{op}$. Hence $\text{Hom}(G, -)$ naturally acquires the structure of a group. (Again, conversely, by the Yoneda lemma, if $\text{Hom}(G, -)$ naturally has the structure of a group then $G$ is a cogroup object.)

*Example.* In the category of schemes over a base ring $k$, the endomorphism Lawvere theory of the affine line $\mathbb{A}^1$ is the Lawvere theory of polynomials over $k$, or equivalently the Lawvere theory of commutative $k$-algebras. Hence $\text{Hom}(-, \mathbb{A}^1)$ naturally acquires the structure of a commutative $k$-algebra. (We previously discussed the case for affine schemes in this blog post.)

*Example.* In the category of topological spaces, the space $\mathbb{R}$ admits addition and multiplication operations $\mathbb{R}^2 \to \mathbb{R}$ in addition to scalar multiplication operations $\mathbb{R} \to \mathbb{R}$, and these generate the Lawvere theory of polynomials over $\mathbb{R}$. Hence $\text{Hom}(-, \mathbb{R})$ naturally acquires the structure of a commutative $\mathbb{R}$-algebra.

*Example.* A **distributive category** is a category with finite products and coproducts such that the former naturally distribute over the latter; the standard example is , although and more generally any cartesian closed category also qualify, and and (the category of schemes) are important examples which are not cartesian closed.

In any distributive category, the endomorphism Lawvere theory of the object $2 = 1 \sqcup 1$ canonically admits a morphism from the Lawvere theory of Boolean algebras, or equivalently the Lawvere theory of Boolean rings, or equivalently the category of Boolean functions (the full subcategory of $\text{Set}$ on finite sets of size $2^n$). Hence $\text{Hom}(-, 2)$ naturally acquires the structure of a Boolean algebra, or equivalently a Boolean ring. In $\text{Top}$ this reproduces the lattice of clopen subsets of a topological space. In general I think it should be interpreted as something like the “lattice of decidable properties.”

*Example.* If $A$ is an abelian group, then the group operation $A \times A \to A$ is itself a morphism in $\text{Ab}$, giving a morphism from the Lawvere theory of abelian groups to $\text{End}(A)$. Hence $\text{Hom}(-, A)$ naturally acquires the structure of an abelian group. (We discussed a more general setting in which such an abelian group structure exists in this previous post on semiadditive categories.)

**The homotopy groups are groups**

Recall that the **pointed homotopy category** $\text{hTop}_*$ is the category whose objects are pointed topological spaces and whose morphisms are homotopy classes of pointed continuous maps preserving the base point. Recall also that the homotopy groups $\pi_n$ are a sequence of functors $\text{hTop}_* \to \text{Set}$ naturally defined on this category and represented by the spheres $S^n$ with some choice of base point, which we will usually omit in our notation. That the homotopy groups are groups is equivalent to the statement that the spaces $S^n$, $n \ge 1$, as objects of the pointed homotopy category, are all cogroup objects.

The basic idea is to observe that a pointed map from $S^n$ to a pointed space $(X, x)$ is the same thing as a map from the $n$-cube $[0,1]^n$ to $X$ such that the boundary is sent to $x$. In general, morphisms from the $n$-cube can be glued together along any pair of $(n-1)$-dimensional faces provided that the images of those faces match. There are $n$ distinguished such gluings coming from gluing together each of the $n$ copies of $[0,1]$ in the product $[0,1]^n$ in the usual way that one glues two intervals together. These gluing operations are natural, associative, and have inverses up to homotopy. They give $n$ compatible group operations on $\pi_n(X, x)$ which, when $n \ge 2$, make it an abelian group by the Eckmann-Hilton argument.
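For readers who have not seen it, the Eckmann-Hilton argument fits in a few lines. The following is a sketch in notation chosen here for illustration (it is not taken from the post): $\cdot$ and $*$ stand for two of the gluing operations on homotopy classes, with units $1_\cdot$ and $1_*$, and the only input is the interchange law in the first line, which holds because gluing in two different coordinate directions can be done in either order.

```latex
\begin{aligned}
&\text{Interchange law:} && (a * b) \cdot (c * d) = (a \cdot c) * (b \cdot d). \\
&\text{Units agree:} && 1_\cdot = 1_\cdot \cdot 1_\cdot
   = (1_* * 1_\cdot) \cdot (1_\cdot * 1_*)
   = (1_* \cdot 1_\cdot) * (1_\cdot \cdot 1_*)
   = 1_* * 1_* = 1_*. \\
&\text{Operations agree:} && a \cdot b = (a * 1) \cdot (1 * b)
   = (a \cdot 1) * (1 \cdot b) = a * b. \\
&\text{Commutativity:} && a \cdot b = (1 * a) \cdot (b * 1)
   = (1 \cdot b) * (a \cdot 1) = b * a.
\end{aligned}
```

So any two of the gluing operations coincide and the common operation is commutative, which requires at least two independent directions and hence $n \ge 2$.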

The appearance of maps out of $[0,1]^n$ and multiple composition operations suggests a higher-category-theoretic perspective on the situation where we can think of $\pi_n$ as a suitable automorphism group. More precisely, for any $n$ we can associate to an unpointed topological space $X$ its fundamental $n$-groupoid $\Pi_n(X)$, which is the $n$-category whose

- objects are the points of $X$,
- morphisms are the paths between points of $X$,
- $2$-morphisms are the homotopies between paths,
- $3$-morphisms are the homotopies between homotopies,
- …
- $n$-morphisms are the homotopy classes of homotopies between homotopies between…

Note that a $k$-morphism can be thought of as a map $[0,1]^k \to X$, with its source and its target determined by its restriction to a suitable choice of two copies of $[0,1]^{k-1}$ in it. $k$-morphisms have $k$ notions of composition given by gluing along the $k$ coordinate directions, generalizing horizontal and vertical composition of $2$-morphisms in $2$-categories (in particular, of natural transformations between functors).

The homotopy group $\pi_n(X, x)$ of a pointed space can then be interpreted as the group of $n$-automorphisms of the identity $(n-1)$-endomorphism of the identity $(n-2)$-endomorphism of… of the identity endomorphism of $x$ in the fundamental $n$-groupoid.

**The homotopy groups are only groups**

We would like to show that the homotopy groups are only groups in the sense that the endomorphism Lawvere theories of the functors $\pi_n$ are generated by the Lawvere theory of groups. In fact we will be able to say slightly more than this.

**Theorem:** The endomorphism Lawvere theory of $\pi_1$ is precisely the Lawvere theory of groups.

*Proof.* By the Yoneda lemma, this means we want to show that the full subcategory of $\text{hTop}_*$ on the finite wedge sums of $S^1$ is equivalent, as a category with finite coproducts, to the full subcategory of $\text{Grp}$ on the finitely generated free groups. To show this it more or less suffices to show that the fundamental group of a wedge of circles is the free group generated by each circle (strictly speaking we should show that this identification can be made compatible with partial composition, but we already know this because we already know that the fundamental group is a group), but this follows from Seifert-van Kampen.

In the context of a more general result, not only has $\bigvee_{i=1}^k S^1$ fundamental group $F_k$ but is an Eilenberg-MacLane space $K(F_k, 1)$, since its universal cover is a tree, and the subcategory of $\text{hTop}_*$ on Eilenberg-MacLane spaces $K(G, 1)$ (suitably pointed) is known to be equivalent to $\text{Grp}$, with the equivalence given by $\pi_1$.

**Theorem:** The endomorphism Lawvere theory of $\pi_n$, $n \ge 2$, is precisely the Lawvere theory of abelian groups.

*Proof.* By the Yoneda lemma, this means we want to show that the full subcategory of $\text{hTop}_*$ on the finite wedge sums $\bigvee_{i=1}^k S^n$ is equivalent, as a category with finite coproducts, to the subcategory of $\text{Ab}$ on the finitely generated free abelian groups. To show this it more or less suffices to show that $\pi_n \left( \bigvee_{i=1}^k S^n \right)$ is the free abelian group $\mathbb{Z}^k$ generated by each inclusion of $S^n$ into the wedge (where there are $k$ spheres in the wedge) (and, again, strictly speaking we should show compatibility with partial composition, but we already know this).

Since $\bigvee_{i=1}^k S^n$ admits a CW-structure with a single $0$-cell and no $j$-cells, $1 \le j \le n-1$, it is $(n-1)$-connected by cellular approximation. By the Hurewicz theorem, it follows that the Hurewicz map $\pi_n \left( \bigvee_{i=1}^k S^n \right) \to H_n \left( \bigvee_{i=1}^k S^n \right)$ is an isomorphism, so to compute the former it suffices to compute the latter. But now $H_n \left( \bigvee_{i=1}^k S^n \right) \cong \mathbb{Z}^k$ by Mayer-Vietoris.


**Theorem:** Let $G$ be a finite $p$-group acting on a finite set $X$. Let $X^G$ denote the subset of $X$ consisting of those elements fixed by $G$. Then $|X^G| \equiv |X| \bmod p$; in particular, if $p \nmid |X|$ then $G$ has a fixed point.

Although this theorem is an elementary exercise, it has a surprising number of fundamental corollaries.

**Proof**

$X$ is a disjoint union of orbits for the action of $G$, all of which have the form $G/H$ for some subgroup $H$ and hence all of which have cardinality divisible by $p$ except for the trivial orbits corresponding to fixed points.

**Some group-theoretic applications**

**Theorem:** Let $G$ be a nontrivial finite $p$-group. Then its center $Z(G)$ is nontrivial.

*Proof.* $G$ acts by conjugation on $G \setminus \{ e \}$. If $|G| = p^n$ with $n \ge 1$, this set has cardinality $p^n - 1$, which is not divisible by $p$. Hence $G$ has a fixed point, which is precisely a nontrivial central element.

**Corollary:** Every finite $p$-group is nilpotent.
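As a sanity check on the conjugation argument, here is a minimal sketch that builds the dihedral group of order $8$ as permutations of the vertices of a square and computes its center by brute force. The generators `r` and `s` and the helper names are choices made for this illustration, not anything from the post.

```python
def compose(s, t):
    """Return the permutation 'apply t, then s' (permutations as tuples)."""
    return tuple(s[t[i]] for i in range(len(t)))

def generate(gens):
    """Closure of a list of permutations under composition (a finite group)."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def center(group):
    """Elements fixed by conjugation, i.e. commuting with everything."""
    return {z for z in group if all(compose(z, g) == compose(g, z) for g in group)}

r = (1, 2, 3, 0)  # rotation of the square by a quarter turn
s = (0, 3, 2, 1)  # reflection fixing vertices 0 and 2
D4 = generate([r, s])
Z = center(D4)
print(len(D4), sorted(Z))
```

The fixed points of the conjugation action are exactly the central elements, so the count of non-central elements must be divisible by $2$ here.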

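**Cauchy’s theorem:** Let $G$ be a finite group. Suppose $p$ is a prime dividing $|G|$. Then $G$ has an element of order $p$.

*Proof.* $\mathbb{Z}/p\mathbb{Z}$ acts by cyclic shifts on the set

$$\{ (g_1, g_2, \dots, g_p) \in G^p : g_1 g_2 \cdots g_p = e \} \setminus \{ (e, e, \dots, e) \}$$

since if $g_1 g_2 \cdots g_p = e$ then $g_2 \cdots g_p g_1 = e$. This set has cardinality $|G|^{p-1} - 1$, which is not divisible by $p$, hence $\mathbb{Z}/p\mathbb{Z}$ has a fixed point, which is precisely a nontrivial element $g \in G$ such that $g^p = e$.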
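The counting in this proof can be checked by brute force on a small group. The sketch below uses the symmetric group on three letters with $p = 3$; it counts tuples including the trivial tuple $(e, \dots, e)$, so the expected totals are $|G|^{p-1}$ tuples and one fixed tuple $(g, \dots, g)$ for every $g$ with $g^p = e$. The function names are illustrative.

```python
from itertools import permutations, product

def compose(s, t):
    """Return the permutation 'apply t, then s' (permutations as tuples)."""
    return tuple(s[t[i]] for i in range(len(t)))

def cauchy_counts(n, p):
    """For G = S_n: (|G|, #p-tuples with product e, #tuples fixed by cyclic shift)."""
    group = list(permutations(range(n)))
    identity = tuple(range(n))
    tuples = []
    for gs in product(group, repeat=p):
        prod = identity
        for g in gs:
            prod = compose(prod, g)
        if prod == identity:
            tuples.append(gs)
    # A tuple is fixed by the cyclic shift exactly when all entries are equal.
    fixed = [gs for gs in tuples if gs[1:] + gs[:1] == gs]
    return len(group), len(tuples), len(fixed)

order, total, fixed = cauchy_counts(3, 3)
print(order, total, fixed)
```

Here the three fixed tuples come from the identity and the two $3$-cycles, and the difference between the two counts is divisible by $3$ as the fixed point theorem predicts.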

**Some number-theoretic applications**

**Fermat’s little theorem:** Let $a$ be a non-negative integer and $p$ be a prime. Then $a^p \equiv a \bmod p$.

*Proof.* $\mathbb{Z}/p\mathbb{Z}$ acts by cyclic shifts on the set $[a]^p$ of functions $\mathbb{Z}/p\mathbb{Z} \to [a]$, where $[a] = \{ 1, 2, \dots, a \}$; equivalently, on the set of strings of length $p$ on $a$ letters. (Orbits of this group action are sometimes called **necklaces**; see, for example, this previous blog post.) This set has cardinality $a^p$ and its fixed point set has cardinality $a$, since a function is fixed if and only if it is constant.
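To see the orbit structure concretely, the following sketch enumerates all strings of length $p$ on $a$ letters and groups them into orbits under cyclic shift. The helper name `shift_orbits` and the sample values $a = 3$, $p = 5$ are choices made for this illustration.

```python
from itertools import product

def shift_orbits(a, p):
    """Sizes of the orbits of length-p strings over a letters under cyclic shift."""
    seen, sizes = set(), []
    for s in product(range(a), repeat=p):
        if s in seen:
            continue
        orbit = {s[i:] + s[:i] for i in range(p)}
        seen |= orbit
        sizes.append(len(orbit))
    return sizes

sizes = shift_orbits(3, 5)
fixed = sizes.count(1)   # constant strings
print(sum(sizes), fixed, set(sizes))
```

Every orbit has size $1$ or $p$, so the number of strings and the number of constant strings agree modulo $p$.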

**Fermat’s little theorem for matrices:** Let $A$ be a square matrix of non-negative integers and let $p$ be a prime. Then $\text{tr}(A^p) \equiv \text{tr}(A) \bmod p$.

This result also appeared in this previous blog post.

*Proof.* Interpret $A$ as the adjacency matrix of a graph. $\mathbb{Z}/p\mathbb{Z}$ acts by cyclic shifts on the set of closed walks of length $p$ on this graph. This set has cardinality $\text{tr}(A^p)$ and its fixed point set has cardinality $\text{tr}(A)$, since a closed walk is fixed if and only if it consists of $p$ repetitions of the same loop at a vertex.

In the same way that Fermat’s little theorem can be used to construct a primality test, the Fermat primality test, Fermat’s little theorem for matrices can also be used to construct a primality test, which doesn’t seem to have a name. For example, the Perrin pseudoprimes are the composite numbers that pass this test when

$$A = \left[ \begin{array}{ccc} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \end{array} \right].$$

More generally, a stronger version of this test gives rise to the notion of Frobenius pseudoprime.
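The trace test is easy to try out. The sketch below assumes that the matrix for the Perrin test is the companion matrix of $x^3 - x - 1$ (so that $\text{tr}(A^n)$ is the $n$-th Perrin number); that choice, like the function names, is an assumption made for this illustration. It computes $\text{tr}(A^n) \bmod n$ by repeated squaring and lists the $n$ that pass.

```python
def mat_mul(X, Y, mod):
    """Product of two square matrices with entries reduced modulo `mod`."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) % mod for j in range(n)]
            for i in range(n)]

def trace_power_mod(A, n, mod):
    """tr(A^n) modulo `mod`, using repeated squaring."""
    size = len(A)
    result = [[int(i == j) for j in range(size)] for i in range(size)]
    base = [[x % mod for x in row] for row in A]
    while n:
        if n & 1:
            result = mat_mul(result, base, mod)
        base = mat_mul(base, base, mod)
        n >>= 1
    return sum(result[i][i] for i in range(size)) % mod

def passes_trace_test(A, n):
    """True if tr(A^n) is congruent to tr(A) modulo n."""
    trace = sum(A[i][i] for i in range(len(A)))
    return (trace_power_mod(A, n, n) - trace) % n == 0

PERRIN = [[0, 1, 0], [0, 0, 1], [1, 1, 0]]  # assumed companion matrix of x^3 - x - 1
passing = [n for n in range(2, 60) if passes_trace_test(PERRIN, n)]
print(passing)
```

In this small range the numbers that pass are exactly the primes; composite numbers that pass do exist but are much larger.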

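**Wilson’s theorem:** Let $p$ be a prime. Then $(p-1)! \equiv -1 \bmod p$.

*Proof.* Consider the set of total orderings of the numbers $\{ 0, 1, \dots, p-1 \}$ modulo addition $\bmod p$; that is, we identify the total ordering

$$(a_1, a_2, \dots, a_p)$$

with the total ordering

$$(a_1 + 1, a_2 + 1, \dots, a_p + 1)$$

where the addition occurs $\bmod p$.

$\mathbb{Z}/p\mathbb{Z}$ acts by cyclic shifts on this set. It has cardinality $\frac{p!}{p} = (p-1)!$ and its fixed point set has cardinality $p - 1$, since the fixed orderings are precisely the ones such that $a_{i+1} \equiv a_i + k \bmod p$ for some fixed $k \not\equiv 0 \bmod p$.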
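For a small prime the orderings can be enumerated directly. The sketch below represents each class of orderings modulo translation by the representative whose first entry is $0$, applies the cyclic shift, and counts fixed classes. The names `normalize` and `wilson_counts` are illustrative.

```python
from itertools import permutations
from math import factorial

def normalize(order, p):
    """Canonical representative modulo adding a constant mod p: first entry 0."""
    return tuple((a - order[0]) % p for a in order)

def wilson_counts(p):
    """(number of classes of orderings, number of classes fixed by cyclic shift)."""
    classes = {normalize(o, p) for o in permutations(range(p))}
    fixed = [c for c in classes if normalize(c[1:] + c[:1], p) == c]
    return len(classes), len(fixed)

classes, fixed = wilson_counts(5)
print(classes, fixed)
```

The fixed classes are the arithmetic progressions with nonzero common difference, so there are $p - 1$ of them.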

**Lucas’ theorem:** Let $m, n$ be non-negative integers and $p$ be a prime. Suppose the base-$p$ expansions of $m, n$ are

$$m = m_k p^k + \dots + m_1 p + m_0, \quad n = n_k p^k + \dots + n_1 p + n_0$$

respectively. Then

$${m \choose n} \equiv \prod_{i=0}^k {m_i \choose n_i} \bmod p.$$

*Proof.* It suffices to show that

$${m \choose n} \equiv {m' \choose n'} {m_0 \choose n_0} \bmod p$$

where $m = p m' + m_0$ and $n = p n' + n_0$ with $0 \le m_0, n_0 < p$, and then the desired result follows by induction. To see this, consider a set $S$ of size $m$, and partition it into a block of size $m_0$ together with $p$ blocks of size $m'$. Identify all of the blocks of size $m'$ with each other in some way. Then $\mathbb{Z}/p\mathbb{Z}$ acts by cyclic shift on these blocks. This action extends to an action on the set of subsets of $S$ of size $n$. A fixed point of this action consists of a subset of the block of size $m_0$ together with $p$ copies of a subset of any block of size $m'$. Since the union of the $p$ copies has size divisible by $p$, and since $m_0, n_0 < p$, it follows that there must be $n_0$ elements in the block of size $m_0$ and $n'$ elements in each of the $p$ blocks. Thus the set of subsets has cardinality ${m \choose n}$, and its fixed point set has cardinality ${m_0 \choose n_0} {m' \choose n'}$.

(It is also possible to prove this result directly by considering a more complicated $p$-group action. The relevant $p$-group is a Sylow $p$-subgroup of the symmetric group of $S$. It is built from iterated wreath products of the cyclic groups $\mathbb{Z}/p\mathbb{Z}$, and describing how it acts on $S$ requires recursively partitioning blocks in a way that is somewhat tedious to describe.)
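The inductive step in this proof translates directly into a digit-by-digit computation of a binomial coefficient modulo $p$. Here is a minimal sketch; the function name `lucas_binomial` is a choice made for this illustration, and it relies on `math.comb` returning $0$ when the lower index exceeds the upper one.

```python
from math import comb

def lucas_binomial(m, n, p):
    """C(m, n) mod p for a prime p, peeling off one base-p digit at a time."""
    result = 1
    while m or n:
        m, m0 = divmod(m, p)   # m = p * m' + m0
        n, n0 = divmod(n, p)   # n = p * n' + n0
        result = result * comb(m0, n0) % p
    return result

print(lucas_binomial(10, 3, 7), comb(10, 3) % 7)
```

Each pass through the loop is one application of the congruence ${m \choose n} \equiv {m' \choose n'} {m_0 \choose n_0} \bmod p$.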

The following proof is due to Don Zagier.

**Fermat’s two-square theorem:** Every prime $p \equiv 1 \bmod 4$ is the sum of two squares.

*Proof.* Let $S$ be the set of all solutions $(x, y, z)$ in the positive integers of the equation $x^2 + 4yz = p$. The involution $(x, y, z) \mapsto (x, z, y)$ has the property that its fixed points correspond to solutions to the equation $x^2 + (2y)^2 = p$. We will show that such fixed points exist by showing that there are an odd number of them; to do so, it suffices to show that $|S|$ is odd, and to show that $|S|$ is odd it suffices to exhibit any other involution on $S$ which has an odd number of fixed points. (Notice that this is two applications of the $2$-group fixed point theorem.) We claim that

$$(x, y, z) \mapsto \begin{cases} (x + 2z, z, y - x - z) & \text{ if } x < y - z \\ (2y - x, y, x - y + z) & \text{ if } y - z < x < 2y \\ (x - 2y, x - y + z, y) & \text{ if } x > 2y \end{cases}$$

is an involution on $S$ with exactly one fixed point, which suffices to prove the desired claim. Verifying that it sends $S$ to $S$ is straightforward. If $(x, y, z)$ is a fixed point, then since $x, y, z$ are all assumed to be positive integers we cannot have $x + 2z = x$, so we are not in the first case. If we are in the second case, we conclude that $x = y$, but then $p = x^2 + 4xz = x(x + 4z)$, and since $p$ is prime and $x + 4z > 1$ we conclude that $x = y = 1$, which gives $z = \frac{p-1}{4}$. If we are in the third case, we conclude that $y = z$, then that $x = y$, but this contradicts $x > 2y$. Hence $\left( 1, 1, \frac{p-1}{4} \right)$ is the unique fixed point as desired.
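Zagier's argument can be run on small primes. The sketch below assumes the standard form of his map on triples $(x, y, z)$ with $x^2 + 4yz = p$ (the three cases in the code are that assumption), checks that it is an involution with the single fixed point $(1, 1, \frac{p-1}{4})$, and then reads off a two-square representation from a fixed point of the swap $(x, y, z) \mapsto (x, z, y)$. The function names are illustrative.

```python
def solutions(p):
    """All (x, y, z) in positive integers with x^2 + 4yz = p."""
    sols = []
    x = 1
    while x * x < p:
        rest = p - x * x
        if rest % 4 == 0:
            yz = rest // 4
            sols += [(x, y, yz // y) for y in range(1, yz + 1) if yz % y == 0]
        x += 1
    return sols

def zagier(t):
    """Assumed form of Zagier's involution; the boundary cases cannot occur for prime p."""
    x, y, z = t
    if x < y - z:
        return (x + 2 * z, z, y - x - z)
    if x < 2 * y:
        return (2 * y - x, y, x - y + z)
    return (x - 2 * y, x - y + z, y)

def two_squares(p):
    """For a prime p = 1 mod 4, return (a, b) with a^2 + b^2 = p."""
    S = solutions(p)
    assert all(zagier(zagier(t)) == t for t in S)
    assert [t for t in S if zagier(t) == t] == [(1, 1, (p - 1) // 4)]
    x, y, _ = next(t for t in S if t[1] == t[2])  # fixed point of the swap
    return x, 2 * y

print(two_squares(13))
```

The search for a fixed point of the swap is a brute-force scan here; the proof only guarantees that one exists.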

The following proof is due to V. Lebesgue.

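**Quadratic reciprocity:** Let $p, q$ be distinct odd primes. Let $\left( \frac{p}{q} \right)$ denote the Legendre symbol. Then

$$\left( \frac{p}{q} \right) \left( \frac{q}{p} \right) = (-1)^{\frac{p-1}{2} \frac{q-1}{2}}.$$

*Proof.* Let $p$ be an odd prime and consider the set

$$S_{n,a} = \{ (x_1, \dots, x_n) \in \mathbb{F}_p^n : x_1^2 + \dots + x_n^2 = a \}.$$

For $q \neq p$ an odd prime, we will compute $|S_{q,1}| \bmod q$ in two different ways. The first way is to observe that $\mathbb{Z}/q\mathbb{Z}$ acts by cyclic shifts on $S_{q,1}$. The number of fixed points is the number of $x \in \mathbb{F}_p$ such that $q x^2 = 1$, hence

$$|S_{q,1}| \equiv 1 + \left( \frac{q}{p} \right) \bmod q.$$

The second way is to find a recursive formula for $|S_{n,1}|$ which will end up being expressed in terms of $|S_{n-2,1}|$. The details are as follows.

First, it is clear that $|S_{1,a}| = 1 + \left( \frac{a}{p} \right)$. For $a$ nonzero, we claim that $|S_{2,a}|$ is independent of $a$. First, observe that by the pigeonhole principle (there are $\frac{p+1}{2}$ possible values of $x_1^2$ and $\frac{p+1}{2}$ possible values of $a - x_2^2$, hence at least one common value) we always have $|S_{2,a}| \ge 1$. Second, observe that the map $(x_1, x_2) \mapsto x_1^2 + x_2^2$ is the norm map from $\mathbb{F}_p[i]/(i^2 + 1)$ to $\mathbb{F}_p$; in particular, it restricts to a homomorphism from the unit group of the former ring to the unit group of the latter ring, and we know that this homomorphism is surjective, so each fiber has the same size. The unit group of $\mathbb{F}_p[i]/(i^2 + 1)$ has size $(p-1)^2$ if $p \equiv 1 \bmod 4$ (since then we are looking at $\mathbb{F}_p \times \mathbb{F}_p$) and size $p^2 - 1$ if $p \equiv 3 \bmod 4$ (since then we are looking at $\mathbb{F}_{p^2}$). Hence

$$|S_{2,a}| = p - (-1)^{\frac{p-1}{2}}$$

for any nonzero $a$. When $a = 0$, the equation $x_1^2 + x_2^2 = 0$ always has the solution $(0, 0)$. There are no nonzero solutions if $p \equiv 3 \bmod 4$ (since $-1$ does not have a square root) and $2(p-1)$ nonzero solutions if $p \equiv 1 \bmod 4$ (since for each of the $p-1$ possible nonzero values of $x_1$ there are $2$ possible values of $x_2$). Hence

$$|S_{2,0}| = p + (p-1) (-1)^{\frac{p-1}{2}}.$$

We will now compute $|S_{n,1}|$ inductively by rewriting $x_1^2 + \dots + x_n^2 = 1$ as

$$x_1^2 + x_2^2 = 1 - x_3^2 - \dots - x_n^2.$$

From this it follows that

$$|S_{n,1}| = \sum_{(x_3, \dots, x_n) \in \mathbb{F}_p^{n-2}} \left| S_{2, \, 1 - x_3^2 - \dots - x_n^2} \right|.$$

We know that $1 - x_3^2 - \dots - x_n^2$ takes the value $0$ for $|S_{n-2,1}|$ possible values of $(x_3, \dots, x_n)$ and takes any particular other value $a$ for $|S_{n-2,1-a}|$ possible values of $(x_3, \dots, x_n)$. Hence we can write

$$|S_{n,1}| = \left( p + (p-1) (-1)^{\frac{p-1}{2}} \right) |S_{n-2,1}| + \left( p - (-1)^{\frac{p-1}{2}} \right) \sum_{a \neq 0} |S_{n-2,1-a}|.$$

But it’s clear that

$$\sum_{a \in \mathbb{F}_p} |S_{n-2,1-a}| = p^{n-2}$$

since this is just the number of possible tuples $(x_3, \dots, x_n)$. Hence

$$|S_{n,1}| = p^{n-1} + (-1)^{\frac{p-1}{2}} p \left( |S_{n-2,1}| - p^{n-3} \right).$$

This gives a recursion which we can unwind into an explicit formula for $|S_{n,1}|$ as follows. Substituting the recursion into itself and canceling terms gives

$$|S_{n,1}| = p^{n-1} + (-1)^{2 \cdot \frac{p-1}{2}} p^2 \left( |S_{n-4,1}| - p^{n-5} \right).$$

This pattern continues, and by induction we can show that

$$|S_{n,1}| = p^{n-1} + (-1)^{k \frac{p-1}{2}} p^k \left( |S_{n-2k,1}| - p^{n-2k-1} \right).$$

Hence if $n$ is odd, taking $k = \frac{n-1}{2}$ and using $|S_{1,1}| = 2$, we conclude that

$$|S_{n,1}| = p^{n-1} + (-1)^{\frac{p-1}{2} \frac{n-1}{2}} p^{\frac{n-1}{2}}.$$

Now let $n = q$ be an odd prime. Then $p^{q-1} \equiv 1 \bmod q$ by Fermat’s little theorem and $p^{\frac{q-1}{2}} \equiv \left( \frac{p}{q} \right) \bmod q$ by Euler’s criterion, so we find that

$$|S_{q,1}| \equiv 1 + (-1)^{\frac{p-1}{2} \frac{q-1}{2}} \left( \frac{p}{q} \right) \bmod q.$$

On the other hand, we know that $|S_{q,1}| \equiv 1 + \left( \frac{q}{p} \right) \bmod q$. We conclude that

$$\left( \frac{q}{p} \right) \equiv (-1)^{\frac{p-1}{2} \frac{q-1}{2}} \left( \frac{p}{q} \right) \bmod q$$

and since the LHS and RHS are both equal to $\pm 1$ this equality holds on the nose.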
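The two computations in this proof can be compared numerically for small primes. The sketch below counts, by brute force, the solutions of $x_1^2 + \dots + x_n^2 = 1$ in $\mathbb{F}_p^n$ and compares the count with the closed form $p^{n-1} + \left( (-1)^{\frac{p-1}{2}} p \right)^{\frac{n-1}{2}}$ for odd $n$, and with the fixed-point count $1 + \left( \frac{q}{p} \right)$ modulo $q$ when $n = q$. The function names are assumptions made for this illustration.

```python
from itertools import product

def sphere_count(p, n):
    """Number of (x_1, ..., x_n) in F_p^n with x_1^2 + ... + x_n^2 = 1."""
    return sum(1 for xs in product(range(p), repeat=n)
               if sum(x * x for x in xs) % p == 1)

def closed_form(p, n):
    """p^(n-1) + ((-1)^((p-1)/2) * p)^((n-1)/2), valid for odd n."""
    sign = 1 if p % 4 == 1 else -1
    return p ** (n - 1) + (sign * p) ** ((n - 1) // 2)

def legendre(a, p):
    """Legendre symbol (a/p) via Euler's criterion, for an odd prime p."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

for p, q in [(3, 5), (5, 3), (3, 7), (7, 3)]:
    count = sphere_count(p, q)
    print(p, q, count, closed_form(p, q), (count - 1 - legendre(q, p)) % q)
```

The last column printed is the difference between the point count and the fixed-point count modulo $q$, which the fixed point theorem forces to vanish.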

This proof is in some sense a proof via Gauss sums in disguise; the numbers are closely related to Jacobi sums, which can be written in terms of Gauss sums. More generally, various kinds of exponential sums can be related to counting points on varieties over finite fields. The Weil conjectures give bounds on the latter which translate into bounds on the former, and this has various applications, for example the original proof of Zhang’s theorem on bounded gaps between primes (although my understanding is that this step is unnecessary just to establish the theorem).

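Similar methods can be used to prove the second supplementary law, although strictly speaking we will use more than the $p$-group fixed point theorem.

**Quadratic reciprocity, second supplement:** $\left( \frac{2}{p} \right) = (-1)^{\frac{p^2 - 1}{8}}$. Equivalently, $\left( \frac{2}{p} \right) = 1$ iff $p \equiv \pm 1 \bmod 8$.

*Proof.* We will compute $|S_{2,1}| \bmod 8$ in two different ways, where $S_{2,1}$ is the set of solutions $(x, y) \in \mathbb{F}_p^2$ to $x^2 + y^2 = 1$. The first is using the formula we found earlier, which gives

$$|S_{2,1}| = p - (-1)^{\frac{p-1}{2}}.$$

The second is to observe that the dihedral group $D_4$ of order $8$ acts on $S_{2,1}$ by rotation and reflection (thinking of the points as being on the “circle” defined by $x^2 + y^2 = 1$). Most orbits of this action have size $8$ and hence are invisible $\bmod 8$. The ones that are not invisible correspond to points with some nontrivial stabilizer. Any nontrivial stabilizer contains an element of order $2$, and there are five such elements in $D_4$, so the remaining possible orbits are as follows:

- Points stable under $(x, y) \mapsto (-x, y)$. These are precisely the points such that $x = 0$, hence there are $2$ of them.
- Points stable under $(x, y) \mapsto (x, -y)$. These are precisely the points such that $y = 0$, hence there are $2$ of them.
- Points stable under $(x, y) \mapsto (y, x)$. These are precisely the points such that $x = y$, hence there are $1 + \left( \frac{2}{p} \right)$ of them.
- Points stable under $(x, y) \mapsto (-x, -y)$. There are no such points.
- Points stable under $(x, y) \mapsto (-y, -x)$. These are precisely the points such that $x = -y$, hence there are $1 + \left( \frac{2}{p} \right)$ of them.

In total, we conclude that

$$p - (-1)^{\frac{p-1}{2}} \equiv 4 + 2 \left( 1 + \left( \frac{2}{p} \right) \right) \bmod 8$$

and the conclusion follows.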
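The orbit count can also be checked numerically. The sketch below counts the points on $x^2 + y^2 = 1$ over $\mathbb{F}_p$ by brute force and compares the result with $p - (-1)^{\frac{p-1}{2}}$, with the orbit tally $4 + 2 \left( 1 + \left( \frac{2}{p} \right) \right)$ modulo $8$, and with the criterion $p \equiv \pm 1 \bmod 8$. The helper names are illustrative.

```python
def circle_count(p):
    """Number of (x, y) in F_p^2 with x^2 + y^2 = 1."""
    return sum(1 for x in range(p) for y in range(p) if (x * x + y * y) % p == 1)

def legendre(a, p):
    """Legendre symbol (a/p) via Euler's criterion, for an odd prime p."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def second_supplement_holds(p):
    """Check both counts of the circle and the resulting value of (2/p)."""
    n = circle_count(p)
    closed = p - (1 if p % 4 == 1 else -1)
    orbit_tally = 4 + 2 * (1 + legendre(2, p))
    by_residue = 1 if p % 8 in (1, 7) else -1
    return n == closed and (n - orbit_tally) % 8 == 0 and legendre(2, p) == by_residue

print([second_supplement_holds(p) for p in (3, 5, 7, 11, 13, 17)])
```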

A general lesson of the above proofs is that it is a good idea to replace integers with sets because sets can be equipped with more structure, such as group actions, than integers. This is a simple form of categorification.
