## Constructions vs. specifications

What does it mean to define a mathematical object?

Roughly speaking, there are two strategies you could use. One is to construct that object from other objects. For example, $\mathbb{R}$ can be constructed from $\mathbb{Q}$ using Cauchy sequences or Dedekind cuts.

Another is to give a list of properties that uniquely specify the object. For example, $\mathbb{R}$ can be defined as the unique Dedekind-complete totally ordered field.

Let’s call these strategies construction and specification. Construction has the advantage of concreteness, and it also proves that the object exists. However, it has the disadvantage that constructions are not unique in general and you might not be working with the most useful construction at a given time; to prove things about an interesting mathematical object you might have to switch between various constructions (while also proving that they all construct the same object).

Relying excessively on construction also has a curious disadvantage when talking to students about the object in question: namely, they might ask you how to show that object $X$ satisfies property $P$, and you might respond “well, it’s obvious if you use construction $A$,” and they might respond “but I was taught that $X$ is [construction $B$]!” The possibility of multiple different constructions of the same mathematical object makes construction somewhat unsatisfying in terms of describing what you even mean when you say “object $X$.”

This is the appeal of specification. Although specification is more abstract and harder to grasp at first, it has the advantage that you don’t have to use any particular construction of an object to prove things about it: instead, you should in principle be able to prove everything from the properties alone, since those uniquely specify the object. This often leads to cleaner proofs. Specifications also usually give a better motivation for looking at an object: it’s easier to make sense of the claim that we want a mathematical object with various properties than to make sense of the claim that we want a mathematical object which is constructed in some arbitrary-looking (to the uninitiated) fashion.

Below we’ll go through a few examples of mathematical objects and some constructions and specifications of them.

### The natural numbers

How would you define the natural numbers $\mathbb{N}$? Here are a few options:

1. The finite von Neumann ordinals in ZF.
2. The unique model of second-order Peano arithmetic.
3. The isomorphism classes of finite sets.
4. The initial algebra of the endofunctor $X \mapsto 1 + X$ on $\text{Set}$ (where $1$ denotes a one-element set and $+$ denotes the disjoint union).

Of these options, I would describe Options 1 and 3 as constructions and Options 2 and 4 as specifications.

I am not a fan of Option 1 as a definition. Of the four choices above, it does the worst job of explaining why we care about the natural numbers in the first place.

Option 2 has the virtue of making the conceptual point that we care about the natural numbers as a way of formalizing our experiences with numbers, which we codify into the axioms of Peano arithmetic. It’s also interesting in that no first-order version of Peano arithmetic suffices (e.g. by the Löwenheim-Skolem theorem); there will always exist nonstandard models.

Option 3 is a nice example of (de)categorification (thinking of $\text{FinSet}$ as a categorification of the natural numbers). One reason to like Option 3 is that it imbues the standard operations on natural numbers with categorical meaning. Addition of natural numbers is a decategorification of taking coproducts, multiplication is a decategorification of taking products, and exponentiation is a decategorification of taking exponential objects. Hence together these three structures reflect the fact that $\text{FinSet}$ is a bicartesian closed category.
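The decategorification can be checked concretely: cardinality sends coproducts, products, and exponential objects in $\text{FinSet}$ to sums, products, and powers. A minimal Python sketch (helper names like `disjoint_union` are mine, chosen for illustration):

```python
from itertools import product

def disjoint_union(A, B):
    """Coproduct in FinSet: tag elements so overlaps don't collapse."""
    return {(0, a) for a in A} | {(1, b) for b in B}

def cartesian_product(A, B):
    """Product in FinSet."""
    return set(product(A, B))

def function_space(A, B):
    """Exponential object B^A in FinSet: all functions A -> B, each
    represented as a tuple of (input, output) pairs."""
    inputs = sorted(A)
    return {tuple(zip(inputs, values))
            for values in product(sorted(B), repeat=len(inputs))}

A, B = {1, 2, 3}, {"x", "y"}
assert len(disjoint_union(A, B)) == len(A) + len(B)     # 3 + 2 = 5
assert len(cartesian_product(A, B)) == len(A) * len(B)  # 3 * 2 = 6
assert len(function_space(A, B)) == len(B) ** len(A)    # 2 ** 3 = 8
```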

Option 4 may be unfamiliar, but it is currently my favorite, and unlike Option 3 it has a strong relationship to induction. In more detail: if $F : C \to C$ is an endofunctor of a category, then an algebra of $F$, or $F$-algebra, is an object $x \in C$ together with a morphism $F(x) \to x$, and the initial algebra of $F$ is the initial object in the category of $F$-algebras. Initial algebras offer an elegant formalism for recursive definitions.

Example. If $F : \text{Set} \to \text{Set}$ is the functor $X \mapsto 1 + S \times X$ where $S$ is a fixed set, then the initial $F$-algebra is the set $1 + S + S^2 + \dots$ of (finite) lists of elements of $S$.

Example. If $F : \text{Set} \to \text{Set}$ is the functor $X \mapsto 1 + X \times X$, then the initial $F$-algebra is the set of full binary trees.
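The first example can be made concrete: initiality of the list algebra says that out of lists there is a unique algebra morphism to any other algebra of $X \mapsto 1 + S \times X$, namely the right fold. A Python sketch (the names `fold`, `nil_case`, `cons_case` are illustrative):

```python
def fold(nil_case, cons_case, lst):
    """The unique algebra morphism from the initial algebra (lists of S) to
    the algebra on X given by an element nil_case in X (the image of 1) and
    a map cons_case : S x X -> X."""
    if not lst:
        return nil_case
    head, *tail = lst
    return cons_case(head, fold(nil_case, cons_case, tail))

# X = N, nil_case = 0, cons_case(s, n) = 1 + n: the unique morphism is length.
assert fold(0, lambda s, n: 1 + n, ["a", "b", "c"]) == 3
# X = N, nil_case = 0, cons_case(s, n) = s + n: the unique morphism is the sum.
assert fold(0, lambda s, n: s + n, [1, 2, 3, 4]) == 10
```

Recursive definitions over lists are exactly choices of `nil_case` and `cons_case`; uniqueness of the fold is what makes such definitions well-posed.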

Initial algebras are specified via a particularly nice kind of property, namely a universal property. Objects satisfying universal properties are always guaranteed to be unique up to unique isomorphism, so half of the problem of using a specification doesn’t occur and we only need to show existence.

Option 4 can be made more explicit as follows. An algebra over the functor $X \mapsto 1 + X$ is a set $X$ together with a map $1 + X \to X$. This is precisely the data of an endomorphism $f : X \to X$ together with a point $x \in X$ (the image of $1$). Hence the category of algebras over this functor is the category of pointed dynamical systems, and Option 4 says that $\mathbb{N}$, together with the point $0 \in \mathbb{N}$ and the endomorphism $n \mapsto n + 1$, is the initial pointed dynamical system. Said another way, $\mathbb{N}$ is the free dynamical system on one element. More explicitly, if $f : X \to X$ is a dynamical system and $x \in X$ is a point, then there is a unique map $\varphi : \mathbb{N} \to X$ such that $\varphi(0) = x$ and $\varphi(n+1) = f(\varphi(n))$ for all $n$.

But this is a form of the principle of induction! In fact, it constitutes a definition of the notation $f^n(x)$ describing the iterated application of a function to an element, since we of course have

$\varphi(n) = f^n(x)$.

The lesson here is that $\mathbb{N}$ is the thing that indexes iterations. When you talk about doing something, then doing it twice, then doing it a third time, and so forth, you’re talking about the universal property of $\mathbb{N}$!
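The unique map $\varphi$ is just function iteration, which takes one loop to write down; a Python sketch (indexing from $0$ so that $\varphi(n) = f^n(x)$; the name `iterate` is illustrative):

```python
def iterate(f, x, n):
    """phi(n) = f^n(x): the unique map phi : N -> X with phi(0) = x and
    phi(n+1) = f(phi(n)), i.e. the universal property of N applied to the
    pointed dynamical system (X, f, x)."""
    for _ in range(n):
        x = f(x)
    return x

assert iterate(lambda k: 2 * k, 1, 10) == 1024        # doubling, ten times
assert iterate(lambda s: s + "!", "hey", 3) == "hey!!!"
```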

The universal property gives an alternate method of constructing the standard operations on natural numbers which is in better agreement with how they are usually constructed. First, apply the universal property to $\mathbb{N}$ itself: for any natural number $m \in \mathbb{N}$, there is a unique map $\varphi_m : \mathbb{N} \to \mathbb{N}$ such that $\varphi_m(0) = m$ and $\varphi_m(n+1) = \varphi_m(n) + 1$, hence

$\displaystyle \varphi_m(n) = m + n$.

This gives us addition (“addition is repeated successor”). Now $\varphi_m$ above is itself an endomorphism of $\mathbb{N}$, so we can apply the universal property again: for any natural number $m \in \mathbb{N}$ there is a unique map $\psi_m : \mathbb{N} \to \mathbb{N}$ such that $\psi_m(0) = 0$ and $\psi_m(n+1) = \psi_m(n) + m$, hence

$\displaystyle \psi_m(n) = mn$.

This gives us multiplication (“multiplication is repeated addition”). $\psi_m$ is itself an endomorphism of $\mathbb{N}$, so we can apply the universal property once again to recover exponentiation (“exponentiation is repeated multiplication”). Repeating this process gives tetration and so on, which isn’t suggested by Option 3 (tetration doesn’t seem to decategorify an interesting categorical operation on $\text{FinSet}$). It’s unclear what to make of this.
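The whole tower can be sketched in a few lines of Python, each operation obtained by iterating the previous one (names are illustrative, and this is a sketch of the construction above, not efficient arithmetic):

```python
def iterate(f, x, n):
    """f applied n times to x, via the universal property of N."""
    for _ in range(n):
        x = f(x)
    return x

succ = lambda n: n + 1
add = lambda m, n: iterate(succ, m, n)                    # repeated successor
mul = lambda m, n: iterate(lambda k: add(k, m), 0, n)     # repeated addition
power = lambda m, n: iterate(lambda k: mul(k, m), 1, n)   # repeated multiplication
tet = lambda m, n: iterate(lambda k: power(m, k), 1, n)   # repeated exponentiation

assert add(4, 3) == 7
assert mul(4, 3) == 12
assert power(2, 10) == 1024
assert tet(2, 3) == 16  # 2^(2^2)
```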

### The exponential function

How would you define the exponential function $\exp(x) = e^x$? Here are a few options:

1. The limit $\displaystyle \lim_{n \to \infty} \left( 1 + \frac{x}{n} \right)^n$.
2. The series $\displaystyle \sum_{n=0}^{\infty} \frac{x^n}{n!}$.
3. The unique solution to the differential equation $f'(x) = f(x)$ with initial condition $f(0) = 1$.

Of these options, I would describe Options 1 and 2 as constructions and Option 3 as a specification.

Here I think the situation is more clear-cut than with the natural numbers: as a definition, Option 3 is the clear winner. It describes concisely why we care about the exponential function and why it’s important: it’s an eigenfunction of the differentiation operator with eigenvalue $1$. This fundamental fact is ultimately responsible for the important role that the exponential function plays in the theory of differential equations and hence all of mathematics.

Option 3 also explains why we care about Option 1 and Option 2. Why do we care about that particular limit? Because it’s the limit you get when you apply Euler’s method with $n$ steps and step size $\frac{x}{n}$ to the initial value problem $f'(x) = f(x), f(0) = 1$ and let $n \to \infty$. Why do we care about that particular series? Because it’s the unique power series that satisfies $f'(x) = f(x), f(0) = 1$.
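A quick numerical comparison of the two constructions (function names are mine): the Euler-method limit of Option 1 converges to $\exp(x)$ only at rate $O(1/n)$, while the power series of Option 2 converges extremely fast.

```python
import math

def exp_limit(x, n):
    """Option 1 truncated: Euler's method with n steps for f' = f, f(0) = 1."""
    return (1 + x / n) ** n

def exp_series(x, terms):
    """Option 2 truncated: partial sum of sum_n x^n / n!."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)  # next term x^(n+1)/(n+1)! from x^n/n!
    return total

x = 1.5
# A million Euler steps still leave an O(1/n) error; 30 series terms are
# already accurate to machine precision.
assert abs(exp_limit(x, 10**6) - math.exp(x)) < 1e-4
assert abs(exp_series(x, 30) - math.exp(x)) < 1e-12
```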

One of the most important properties of the exponential function is that $\exp(x + y) = \exp(x) \exp(y)$. What does a proof of this property look like starting from each of the three options?

Option 1: We claim that

$\displaystyle \exp(x) = \lim_{n \to \infty} \left( 1 + \frac{x}{n} + O \left( \frac{1}{n^2} \right) \right)^n$.

The desired conclusion follows from this claim and the observation that

$\displaystyle \exp(x) \exp(y) = \lim_{n \to \infty} \left( 1 + \frac{x + y}{n} + \frac{xy}{n^2} \right)^n$.

To see the claim, note that for any $\epsilon > 0$ and all sufficiently large $n$, the expression in question is bounded between $\left(1 + \frac{x - \epsilon}{n}\right)^n$ and $\left(1 + \frac{x + \epsilon}{n}\right)^n$, so its limit lies between $\exp(x - \epsilon)$ and $\exp(x + \epsilon)$; now let $\epsilon \to 0$ and use the continuity of $\exp(x)$.

Option 2: Formally, we have

$\displaystyle \exp(x + y) = \sum_{n=0}^{\infty} \frac{(x + y)^n}{n!} = \sum_{n=0}^{\infty} \frac{1}{n!} \sum_{k=0}^n {n \choose k} x^k y^{n-k}$

which we can rewrite as

$\displaystyle \sum_{n=0}^{\infty} \sum_{k=0}^n \frac{x^k}{k!} \frac{y^{n-k}}{(n-k)!} = \left( \sum_{i=0}^{\infty} \frac{x^i}{i!} \right) \left( \sum_{j=0}^{\infty} \frac{y^j}{j!} \right)$

but various bounds need to be checked to ensure that these formal manipulations are valid.
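Here is a numerical version of the manipulation (a sketch; `series` is an illustrative helper): truncating at total degree $N$, the double sum (the Cauchy product) agrees with the product of partial sums and with $\exp(x+y)$ up to truncation error.

```python
from math import factorial, exp

def series(x, N):
    """Partial sum of the exponential series up to degree N - 1."""
    return sum(x**n / factorial(n) for n in range(N))

x, y, N = 0.7, 1.3, 40
# sum over n of sum over k <= n of x^k/k! * y^(n-k)/(n-k)!,
# i.e. the product of the two series collected by total degree n = k + j.
cauchy = sum(x**k / factorial(k) * y**(n - k) / factorial(n - k)
             for n in range(N) for k in range(n + 1))
assert abs(cauchy - series(x, N) * series(y, N)) < 1e-9
assert abs(cauchy - exp(x + y)) < 1e-9
```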

Option 3: $\exp(x + y)$ and $\exp(x) \exp(y)$ are both solutions to the differential equation $f'(x) = f(x)$ with initial value $f(0) = \exp(y)$, and this solution is unique.

I find the proof from Option 3 the least mysterious and the most satisfying. The proof from Option 2 can be given a combinatorial interpretation using exponential generating functions, but since there is a combinatorial interpretation of differentiation, Option 3 can also be given a combinatorial interpretation.

### Free groups

How would you define the free groups $F_n, n \in \mathbb{N}$? Here are a few options:

1. The group of words on the letters $x_1, x_1^{-1}, \ldots, x_n, x_n^{-1}$ where no $x_i$ appears adjacent to an $x_i^{-1}$ and where multiplication is given by concatenation together with the removal of any adjacent pairs $x_i, x_i^{-1}$.
2. The unique group such that there is an isomorphism $\text{Hom}(F_n, G) \cong G^n$, natural in the group $G$.
3. The fundamental group of a wedge of $n$ circles.

Of these options, I would describe Options 1 and 3 as constructions and Option 2 as a specification. Option 2 is, of course, the universal property of the free groups. Option 3 is a standard application of the Seifert-van Kampen theorem (which reveals a close relationship between Option 2 and Option 3, so perhaps Option 2 is in between a construction and a specification).
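Option 1 is easy to implement directly, which gives a concrete feel for the construction. A Python sketch (encoding $x_i$ as the integer $i$ and $x_i^{-1}$ as $-i$; function names are mine):

```python
def reduce_word(word):
    """Cancel adjacent inverse pairs; a stack handles nested cancellations
    like x1 x2 x2^{-1} x1^{-1} in a single pass."""
    out = []
    for letter in word:
        if out and out[-1] == -letter:
            out.pop()
        else:
            out.append(letter)
    return tuple(out)

def multiply(u, v):
    """Multiplication in F_n: concatenate, then cancel."""
    return reduce_word(u + v)

def inverse(w):
    """Reverse the word and invert each letter."""
    return tuple(-a for a in reversed(w))

x1, x2 = (1,), (2,)
w = multiply(x1, x2)                          # the word x1 x2
assert multiply(w, inverse(w)) == ()          # w w^{-1} is the empty word
assert multiply(x1, x2) != multiply(x2, x1)   # F_2 is nonabelian
```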

Many simple properties of the free groups are cleaner to prove from Option 2 than from Option 1. For example, a standard exercise is to show that the free groups $F_n, F_m$ are not isomorphic if $n \neq m$. The shortest proof is to let $G = \mathbb{Z}/2\mathbb{Z}$ in the universal property; then $F_n \cong F_m$ implies $2^n = 2^m$, hence $n = m$. It’s less clear where to start from the construction, unless we first use the construction to prove the universal property.

The standard application of Option 3 is to proving that subgroups of free groups are free via covering space theory (although the use of algebraic topology can be avoided if we instead develop covering space theory for groupoids, as in Brown). This is a statement which does not seem easy to prove from either Option 1 or Option 2 and is a good example of the value of having multiple perspectives on a mathematical object.

Option 2 reveals that statements about the free groups have a strong effect on the rest of group theory. For example, a group $G$ is said to be residually finite if for every $g \in G$ there is a finite quotient $G \to G'$ of $G$ in which the image of $g$ is not the identity. The free groups $F_n$ are residually finite. This shouldn’t be surprising. If the free groups were not residually finite, there would exist an element $g$ of some free group $F_n$ whose image under any homomorphism $F_n \to G$, for $G$ finite, is the identity. But this would be a strange state of affairs: it would mean that $g$ constitutes an identity which holds in every finite group, but not in some infinite group!
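Residual finiteness is also easy to witness computationally for any particular word: brute-force search for a homomorphism to a small symmetric group under which the word survives. A hedged Python sketch (helper names are mine), using the commutator $x_1 x_2 x_1^{-1} x_2^{-1} \in F_2$:

```python
from itertools import permutations, product

def apply_word(word, images):
    """Image of a word (generator i as the integer i, inverse as -i) under
    the homomorphism F_n -> S_k sending generator i to images[i]. A
    permutation is a tuple p with p[j] the image of j."""
    def compose(p, q):
        return tuple(p[q[j]] for j in range(len(p)))
    def invert(p):
        inv = [0] * len(p)
        for j, pj in enumerate(p):
            inv[pj] = j
        return tuple(inv)
    k = len(next(iter(images.values())))
    result = tuple(range(k))  # identity permutation
    for letter in word:
        g = images[abs(letter)]
        result = compose(result, g if letter > 0 else invert(g))
    return result

# The commutator x1 x2 x1^{-1} x2^{-1} is nontrivial in F_2; since S_3 is
# nonabelian, some pair of permutations in S_3 witnesses this.
word = (1, 2, -1, -2)
identity = (0, 1, 2)
witness = next((a, b) for a, b in product(permutations(range(3)), repeat=2)
               if apply_word(word, {1: a, 2: b}) != identity)
assert apply_word(word, {1: witness[0], 2: witness[1]}) != identity
```

Geometrically, as in the Groupprops proof cited below, a homomorphism to a symmetric group is a covering space of a wedge of circles.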

### 7 Responses

1. I think by “$x\in X$” in the paragraph beginning “Option 4 may be unfamiliar” you mean “$X \in C$”. As an aside, what’s an example of an endofunctor on $\mathrm{Set}$ that doesn’t have an initial algebra? For all reasonable endofunctors that I can think of, it looks to me like the category of algebras is in fact presentable.

On an unrelated note, I find the residual finiteness of free groups utterly mysterious. The eigenvalues of any element of any finite group acting on any vector space are roots of unity, and therefore algebraic integers with unit norm (at all places). I could certainly imagine a situation in which some word held for algebraic integers of unit norm, but not for more general numbers.

• Take the double powerset functor $x \mapsto 2^{2^x}$. This doesn’t have an initial algebra: by Lambek’s theorem the structure map of an initial algebra would be an isomorphism $2^{2^x} \cong x$, contradicting Cantor’s theorem.

Hmm. Maybe.

• Well, I’m certainly happy to believe that the category of algebras for $x \mapsto 2^{2^x}$ has no initial object. (My math doesn’t seem to process in comments — what am I doing wrong?) For functors built out of products and coproducts, I think I see how to prove that the category of algebras is presentable, but for these “exponential” functors I will make no such claim.

• On WordPress you need to add “latex” after the first dollar sign (apparently other blogs need to write ordinary dollar signs sometimes).

• I agree with Theo that the residual finiteness of free groups is mysterious. The only proof that I can think of at the moment is to use the ping-pong lemma to show that $SL_2(\mathbb{Z})$ contains a free subgroup, and then show that $SL_2(\mathbb{Z})$ itself is residually finite, since any element is non-trivial modulo an appropriate prime $p$. Is there a simpler way to do it?

Moreover, the state of affairs with discrete groups often is strange: for example, in finite groups, the equation $a^{-1}b^2a=b^3$ implies that $a^{-1}ba$ commutes with $b$, but this implication fails in some infinite groups! This is a classical result of Baumslag and Solitar; see this shameless self-promotion (p.6, proof of Thm 1(a)) for a proof along the lines of Theo’s comment.

Groupprops has a cute and reasonably short proof: the idea is just to show that one can cook up a suitable homomorphism into a symmetric group. Geometrically one is cooking up a suitable covering space of a wedge of $n$ circles.