*Remark: To forestall empty set difficulties, whenever I talk about arbitrary sets in this post they will be non-empty.*

We continue our exploration of ultrafilters from the previous post. Recall that a (proper) **filter** on a poset is a non-empty subset such that

- For every , there is some such that .
- For every , if then .
- is not the whole set .

If has finite infima (meets), then the first condition, given the second, can be replaced with the condition that if then . (This holds in particular if is the poset structure on a Boolean ring, in which case .) A filter is an **ultrafilter** if in addition it is maximal under inclusion among (proper) filters. For Boolean rings, an equivalent condition is that for every either or , but not both. Recall that this condition tells us that ultrafilters are precisely complements of maximal ideals, and can be identified with morphisms in . If for some set , then we will sometimes call an ultrafilter on an ultrafilter on (for example, this is what people usually mean by “an ultrafilter on “).

Today we will meander towards an ultrafilter point of view on topology. This point of view provides, among other things, a short, elegant proof of Tychonoff’s theorem.

**Some intuitions**

There are several ways to get a more intuitive grasp on what it is, exactly, that ultrafilters are. We saw one of these intuitions already: an ultrafilter on is a maximal consistent deductively closed set of propositions in . If is a set of propositions in , then ultrafilters on containing can be identified with “models of ,” that is, ways to make the statements of all true. (We do not actually need to construct these models because the only thing we care about is which statements are true or false in a model, which is precisely what the ultrafilter tells us!) According to this intuition, the ultrafilter axioms are just familiar properties of propositional logic:

- Conjunction: if and , then ( and ).
- Modus ponens: if and (that is, , or implies ), then .
- Excluded middle: if and only if (not not ).

In his post on ultrafilters, Terence Tao also uses the closely related “voting” intuition. Here we only consider ultrafilters on sets. We think of the set as a collection of voters, and a collection of votes as a subset of (the subset of people who voted “yes” on something). An ultrafilter on is then a consistent way of deciding elections: given a collection of votes on some proposition, the outcome of the election is “yes” if the set of “yes” votes is in and “no” otherwise. According to this intuition, the ultrafilter axioms then say that this voting system does not produce voting paradoxes: the explanations are essentially the same as the ones above if one substitutes “ is true” with “the electorate voted for .”

Terence Tao also uses a third probabilistic, or measure-theoretic, intuition: an ultrafilter on can be thought of as defining a “measure” which assigns to each subset of measure if the subset is in (“almost surely true”) and otherwise (“almost surely false”). Informally, this is a kind of zero-one law. is precisely the morphism associated to the maximal ideal associated to , but here we are thinking of its codomain in a somewhat different way. According to this intuition, the ultrafilter axioms set out several natural properties a “measure” like should have:

- If and , then .
- If and , then .
- if and only if .

Together with the other axioms, the third axiom is actually equivalent to being **finitely additive** (and equal to on the entire set): if are disjoint subsets, then either they all have measure (hence so does their union) or some has measure , hence its complement (hence all of the other subsets) has measure . This is not as strong as countable additivity, so does not actually form a measure.

Nevertheless, finite additivity is a convenient way to think about how ultrafilters work: if is a “random variable” which takes on finitely many values, it breaks up into a disjoint union of finitely many sets, exactly one of which can have measure , hence takes that value “almost surely” (so that value is its “expected value”). In voter-theoretic terms, the electorate can hold elections with any finite number of options, rather than just two.

**Non-principal ultrafilters**

Given a set , there are some obvious ultrafilters we can write down on . These are the ultrafilters consisting of the set of all subsets containing some , or the **principal** ultrafilters. It’s not hard to see that this condition is equivalent to the condition that the ultrafilter has a least element (the intersection of all of its elements), since this least element must be a singleton. Algebraically, principal ultrafilters correspond to principal ideals generated by elements of vanishing at one point and not vanishing anywhere else, and the corresponding morphism is the obvious projection to the coordinate. In voting terms these are “dictatorial” voting systems, in measure-theoretic terms these are Dirac measures supported on points, and in probabilistic terms these are deterministic random variables taking a single value.

When is finite, all ultrafilters are principal. However, when is infinite, we know that is compact, Hausdorff, and totally disconnected. But it’s not hard to see that has the discrete topology, so there must be more points than the elements of ; in other words, there must be **non-principal** ultrafilters (hence morphisms other than the obvious ones) which, as points in the spectrum, give a compactification of the discrete space .

**Proposition:** An ultrafilter on an infinite set is non-principal if and only if it contains the Fréchet filter of cofinite subsets of .

*Proof.* An ultrafilter contains the Fréchet filter if and only if its complement contains all finite subsets of . Since principal ultrafilters contain singletons, no principal ultrafilter can contain the Fréchet filter, and conversely if a principal ultrafilter contains the Fréchet filter it cannot contain any singletons.

At this point, the statements I’ve been making about compactification are probably worth justifying. Recall that given a space , a space equipped with a continuous injection is said to be a **compactification** of if is compact Hausdorff and the image of is dense. In this case, has the property that its topology is determined by the continuous functions , so to show that is dense it suffices to show that a function is determined by its values on . But this is true by definition. So the space of ultrafilters on is in fact a compactification of the discrete space , which sits inside it as the subspace of principal ultrafilters.

In voting terms, if a principal ultrafilter is like having a dictator, a non-principal ultrafilter is like having a “virtual” or “ideal” dictator. This perspective was explained to me, in different words, by Todd Trimble in the comments to the previous post. Given an ultrafilter , imagine playing the following game (which we might call “Find the Dictator”): repeatedly ask yes or no questions of the electorate. Then you know that the set of voters who voted against the outcome of any particular election have no say in any of them; in probabilistic terms, they have measure . So the next question you ask, you only look at the set of voters who do have some say (measure ), and ask them some question. Continuing in this way you end up looking at a smaller and smaller set of voters who have some say in the elections. (In topological terms, these smaller and smaller sets of voters are precisely intersections of neighborhoods of the ultrafilter with the discrete subspace of principal ultrafilters; by density, this essentially determines the neighborhoods.) If is principal, then if you get lucky with the questions you’ll eventually find the dictator. But if is non-principal, there isn’t one! So your chain of questions is attempting to converge to a dictator which isn’t actually there, but which is mysteriously somehow still influencing the election.

**Convergence and Tychonoff’s theorem**

The remarks about convergence above suggest the following definition.

**Definition:** Let be a topological space and let be a filter on . We say that **converges** to (written ) if for every open set containing , there exists such that .

Filter convergence is a generalization of convergence of sequences which is better adapted to spaces which are far from being metric spaces (for example, spaces which aren’t first-countable). For a thorough explanation of the issues here, I recommend Pete Clark’s notes on convergence.

It is useful to extend the definition of filter convergence and clustering to **filter bases**, or **prefilters**, which are nonempty collections of nonempty sets closed under finite intersection. The collection of all supersets of the elements of a prefilter forms a filter, the filter **generated by** the prefilter, and a prefilter converges to a point if and only if the filter it generates converges to .

We can recover convergence of sequences as follows. If is a continuous function and a prefilter on , then is a prefilter on (the **image** or **pushforward** prefilter). If and is the Fréchet filter, then converges to precisely when the sequence defined by converges to in the usual sense. If we want to stick to filters, we can instead take the filter generated by .

In a metric space, the closure of a subset is precisely the set of limits of sequences of elements in the subset. This fails for more general topological spaces, but the corresponding statement is true for filters.

**Proposition:** Let be a topological space, , and . Then the following are equivalent:

- .
- There exists a prefilter on which converges to .
- There exists a filter containing which converges to .
- There exists an ultrafilter containing which converges to .

*Proof.* Let denote the collection of all neighborhoods of . This is the **neighborhood filter** at , and by definition, if and only if converges to . This gives . That follows by taking the filter generated by the prefilter in question, and that follows by taking any ultrafilter containing the filter in question, so it remains to prove that .

Let be an ultrafilter containing and converging to , hence for every open set containing there exists such that . Then does not contain any subset of the complement of , so for any the intersection is nonempty. Moreover, implies that . It follows that every open set containing contains elements of , so as desired.

In voting terms, an ultrafilter containing is an electorate where the voters in are the ones which have power, and it converges to if and only if any neighborhood of contains voters in power. In probabilistic terms, an ultrafilter containing is a “random variable” which is in almost surely, and it converges to if and only if it is in any neighborhood of almost surely.

Recall that the topology of a space is determined by the closure operator , and that conversely any such operator satisfying the Kuratowski closure axioms defines a topology. It follows that the topology of a space is determined by (pre/ultra)filter convergence. In particular, a function is continuous if and only if it preserves convergence of (pre/ultra)filters. As the following proposition shows, this is a convenient way to rephrase certain topological properties.

**Proposition:** Let be a topological space.

- is Hausdorff if and only if every ultrafilter on converges to at most one point.
- is compact if and only if every ultrafilter on converges to at least one point.

*Proof.*

- : Let be an ultrafilter converging to , and let . By assumption, there exist disjoint open sets with . Since , there is such that . It follows that , so cannot converge to . : By assumption, there exist such that every neighborhood of one intersects a neighborhood of another. It follows that any ultrafilter containing the filter (the collection of intersections of neighborhoods of and neighborhoods of ) converges to both and .
- : Given an ultrafilter , the collection of closures of elements of satisfies the finite intersection property, so by assumption there exists which is in the closure of every element of . For any open set containing , the closure of does not contain , hence cannot contain and must contain . It follows that . : Any collection of closed sets with the finite intersection property is contained in an ultrafilter . (In order for a collection of sets to be contained in an ultrafilter it is necessary and sufficient that it have the finite intersection property.) By assumption this ultrafilter converges to some point . Since contains each , it cannot contain their complements (or any subset thereof), so cannot converge to any element of . It follows that .

Note that unlike the case of sequential compactness of metric spaces, we don’t have to pass to a subsequence to find a limit; in some sense the ultrafilter has already done the passing to a subsequence for us. This is one reason why ultrafilters are so convenient in analysis.

In voting terms, an ultrafilter converging to a set of points is a voting system where the elements of are “approximate dictators.” A Hausdorff space of voters is then one for which at most one approximate dictator can occupy the throne, and a compact space of voters is one for which at least one approximate dictator must occupy the throne. In probabilistic terms, an ultrafilter converging to a set of points is a “random variable” which is in a neighborhood of any almost surely. A Hausdorff space is then one for which such a random variable can be at most one such collection of neighborhoods almost surely, and a compact space is one for which such a random variable must be in at least one such collection of neighborhoods almost surely.

Since every ultrafilter on a finite set is principal, I think the above proposition is a elegant way to justify the intuition that compact spaces “behave like” finite discrete spaces. It also manifests a certain duality between the properties of being compact and being Hausdorff. Finally, it helps explains why compact Hausdorff spaces are so special: these are the ones for which every ultrafilter converges to **exactly** one point.

Thus giving the structure of a compact Hausdorff space on a set is equivalent to giving a function describing which ultrafilters converge to which points in a consistent manner. One can think of this as a kind of algebraic structure on whose -ary operations can be identified with ultrafilters on ; we’ll come back to this in a later post.

We will now give the ultrafilter proof of Tychonoff’s theorem. First recall that if are a collection of topological spaces, then the product topology on is the initial topology with respect to the projections . In ultrafilter terms, this means that an ultrafilter on converges to if and only if for all .

**Theorem (Tychonoff):** A product of compact spaces is compact.

*Proof.* Let be an ultrafilter on . Since each space is compact, converges to some non-empty set of points for each . Then converges to precisely the points in , and by the axiom of choice (!) this set is non-empty.

I find this proof much more elegant and easier to follow than the typical proof, which is some combination of the Alexander subbase theorem and messing around with a basis for the product topology. Munkres gave an alternate proof when I took his class, but I don’t remember any of the details. This proof, on the other hand, is perhaps maximally easy to reconstruct.

Note that if we assume in addition that each is Hausdorff, each is a singleton, so we do not need to use the axiom of choice to prove that is non-empty. We only need the ultrafilter lemma (to prove all of the equivalences above), so it follows that Tychonoff’s theorem for compact Hausdorff spaces follows from (hence is equivalent to) the ultrafilter lemma.

**Topology and logic**

Since ultrafilters are, at their heart, logical constructions, what we have described above is a tantalizing connection between topology and logic. There appear to be several such connections. There is, for example, this answer and this answer to Minhyong Kim’s MO question about open sets, which identifies topology with an idealization of measurement and with an idealization of computation, respectively, both of which appear to be related to what I’ve said above. There is also an interpretation of the open sets of a topological space as a Heyting algebra, which is related to intuitionistic logic. Perhaps this all fits into a cohesive picture which I’ll understand one day.

on December 14, 2010 at 12:41 am |Ultrafilters in Ramsey theory « Annoying Precision[...] Comments Ultrafilters in topo… on Boolean rings, ultrafilters, a…Peter Cameron on Empty setsTodd Trimble on [...]

on December 15, 2010 at 9:30 am |PeterI think I once looked up that the earliest introduction of ultrafilters is found in topology (I think Cartan in the 30s?) to get convergence defined properly. So even though you could consider them creatures of logic, their origin is more topological.

on December 16, 2010 at 1:37 am |Pete L. ClarkDear Peter,

Yes, you are remembering correctly. Bibliographic data for Cartan’s paper (Comptes Rendus, 1937) on filters is given in my notes on convergence that Qiaochu links to above:

http://math.uga.edu/~pete/convergence.pdf

Of course in many ways (and probably in others yet to be discovered) topology, algebra and logic are three sides of the same coi….er, everyday object with at least three sides. Moreover, once you learn about ultrafilters, you start seeing traces of them everywhere: when combined with the fact that probably the majority of research mathematicians live in blissful ignorance of their existence, it’s a pretty strange situation.

on December 16, 2010 at 7:11 am |PeterPete, thank you for the link. I agree that mathematicians are too unaware of their (terrible) beauty (non-measurable sets anyone?). From my point of view I would add that even those aware are not aware how different ultrafilters can be.

Take for example non-standard arguments using ultrapowers. Outside of logic and set theory it seems to be enough to consider an ultrapower by any odd free ultrafilter. At most, I have seen a selective or a p-point being used (which reminds me of another proof of Ramsey’s Theorem via forcing).

Do you know of any example where specific properties of an ultrafilter are used? I can only think of minimal idempotent ultrafilters in ergodic theory.

on August 30, 2014 at 4:55 am |Mozibur UllahThere are actually three sides to a *real* coin – top, bottom & the circular rim;of course with the idealised coins that mathematicians like to play with there are only two!

on April 13, 2013 at 4:40 am |AnonymousHi,

I know this post is a bit old but I am curious to know how ultrafilters are useful. In particular, did you learn ultrafilters while learning point-set topology for the first time ? I haven’t studied topology in detail yet except for that used in real analysis and I am trying to select a textbook- there are so many texts out there !

Thanks!

on April 13, 2013 at 3:23 pm |Qiaochu YuanNo, I didn’t learn ultrafilters while learning point-set topology for the first time. The first time it’s probably unnecessary to learn about ultrafilters. They’re useful elsewhere, though (e.g. in model theory).

on July 15, 2014 at 2:18 pm |André CaldasI think Anonymous meant:

- Did you lear about ultrafilters for the first time in a topological context? :-)

on April 14, 2013 at 5:13 am |AnonymousThanks for taking the time to reply to my question!

on June 9, 2013 at 12:40 pm |Operations and Lawvere theories | Annoying Precision[…] is just the functor which associates to a set the set of ultrafilters on it (see, for example, this previous post). It follows that the ultrafilters on a set give us the -ary operations on compact Hausdorff […]