Feeds:
Posts

## Coxeter groups

At SPUR this summer I’ll be working on the Kazhdan-Lusztig polynomials, although my mentor and I haven’t quite pinned down what problem I’m working on. I thought I’d take the chance to share some interesting mathematics and also to write up some background for my own benefit. I’ll mostly be following the second half of Humphreys.

A Coxeter system $(W, S)$ is a group $W$ together with a generating set $S$ and presentation of the form

$\langle s_1, ... s_n | (s_i s_j)^{m(i, j)} = 1 \rangle$

where $m(i, j) = m(j, i), m(i, i) = 1$, and $m(i, j) \ge 2, i \neq j$. (When there is no relation between $s_i, s_j$, we write this as $m(i, j) = \infty$.) The group $W$ is a Coxeter group, and is usually understood to come with a preferred presentation, so we will often abuse terminology and use “group” and “system” interchangeably. $S$ is also referred to as the set of simple reflections in $W$, and $n$ the rank. (We will only consider finitely-generated Coxeter groups.)

Historically, Coxeter groups arose as symmetry groups of regular polytopes and as Weyl groups associated to root systems, which in turn are associated to Lie groups, Lie algebras, and/or algebraic groups; the former are very important in understanding the latter. John Armstrong over at the Unapologetic Mathematician has a series on root systems. In addition, for a non-technical overview of Coxeter groups and $q$-analogues, I recommend John Baez’s week184 through week187. The slogan you should remember is that Weyl groups are “simple algebraic groups over $\mathbb{F}_1$.”

Coxeter diagrams

A convenient way of describing a Coxeter system $(W, S)$ is through the construction of its Coxeter diagram. This is a graph whose vertices are the elements of $S$ and where there is an edge between $s_i$ and $s_j$ with label $m(i, j)$ if $m(i, j) \ge 3$, although traditionally edges with label $3$ are left unlabeled. This is because $m(i, j) = 2$ if and only if $s_i$ and $s_j$ commute. It follows that if the Coxeter diagram of a Coxeter system is disconnected, $W$ is the direct product of the Coxeter groups associated to each connected component of the Coxeter diagram. A Coxeter system is called irreducible if its Coxeter diagram is connected, and from now on we will only talk about irreducible Coxeter systems.

Alternating subgroups

Thanks to the nature of the relations in a Coxeter group, the assignment $s_i \mapsto -1$ always gives a homomorphism $\varepsilon : W \to \{ +1, -1 \}$ for any Coxeter group. This is called the sign homomorphism, and reflects the fact that the generators $s_i$ are supposed to act as reflections. The kernel of the sign homomorphism is called the alternating subgroup of $W$. One might think of it as the “orientation-preserving” elements of $W$.

Examples

The only Coxeter group of rank $1$ is $\mathbb{Z}/2\mathbb{Z}$, so that’s not very interesting.

The irreducible Coxeter groups of rank $2$ are precisely the finite and infinite dihedral groups. When $m(1, 2) \ge 3$ is finite, $s_1$ can be thought of as a reflection across, say, the $x$-axis in $\mathbb{R}^2$, and $s_2$ can be thought of as a reflection across the line $y = x \tan \frac{2\pi}{m(1, 2)}$. When $m(1, 2)$ is infinite, $s_2$ can be thought of as a reflection across a line which makes an angle to the $x$-axis which is not a rational multiple of $\pi$. The sign homomorphism is the determinant of the corresponding linear transformation on $\mathbb{R}^2$, and the alternating subgroup is the subgroup of rotations.

An important class of finite Coxeter groups occurs when $m(i, i+1) = 3$ and $m(i, j) = 2, |i - j| \ge 2$; their Coxeter diagrams are the path graphs. The resulting Coxeter group of rank $n-1$ is none other than the symmetric group $S_n$ with generators $s_i = (i, i+1)$ (and this is where the sign homomorphism and the alternating subgroup get their name), and it is the Weyl group associated to the general linear group and its relatives. This relationship is ultimately responsible for the identity

$\displaystyle \sum_{g \in S_n} q^{\text{inv}(g)} = (n)!_q$

where the LHS counts permutations according to their inversions and the RHS is the $q$-factorial, one interpretation of which is the number of maximal flags of $\mathbb{F}_q^n$. This relationship generalizes to other Weyl groups, but we won’t pursue this connection here.

Note that $m(i, i+1) = 3$ is equivalent to the braid relation $s_i s_{i+1} s_i = s_{i+1} s_i s_{i+1}$. It is instructive to prove this by writing out permutations in “graphical notation,” as this is the fundamental relation describing the braid group. (One can also associate generalized braid groups to any Coxeter system.)

A particularly interesting class of examples of Coxeter systems are the Coxeter systems of rank $3$, which are (more or less) the triangle groups (and the corresponding alternating subgroups are the von Dyck groups). Given a rank $3$ Coxeter system, construct a triangle with angles $\frac{\pi}{m(1, 2)}, \frac{\pi}{m(2, 3)}, \frac{\pi}{m(1, 3)}$. (The triangle does not necessarily live in the Euclidean plane; this construction requires the triangle to be curved in general.) Then the reflections across each side of the triangle generate a group isomorphic to the corresponding Coxeter group, essentially because of our earlier discussion about dihedral groups. One can think of these reflections as instructions for patching together copies of the triangle. Eventually, one of three things will happen in this process.

1. If the triangle’s angles add up to $\pi$, the copies of the triangle will form a tessellation of $\mathbb{R}^2$. This only occurs in finitely many cases, which are described in the Wikipedia article.
2. If the triangle’s angles add up to greater than $\pi$, the copies of the triangle will form a tessellation of the sphere. This is related to the McKay correspondence, and again the cases are described in the Wikipedia article.
3. If the triangle’s angles add up to less than $\pi$, the copies of the triangle will form a tessellation of the hyperbolic plane. This is the “generic” case, since it occurs whenever $m(1, 2), m(2, 3), m(1, 3)$ are sufficiently large.

This division between Euclidean, spherical, and hyperbolic geometry is a common theme throughout mathematics; it is particularly fundamental, for example, in the uniformization theorem for surfaces.

Of particular interest outside of the theory of Coxeter groups is the $(2, 3, 7)$ triangle group. By a special case of the Gauss-Bonnet theorem, the area of a hyperbolic triangle is proportional to the amount by which its angle sum is less than $\pi$. Among the hyperbolic triangles above, the one of maximal angle sum, hence minimal area, has angle measures $\frac{\pi}{2}, \frac{\pi}{3}, \frac{\pi}{7}$ (exercise!). This observation turns out to imply, at least heuristically, Hurwitz’s theorem on the automorphisms of a Riemann surface of genus greater than $1$. In particular, the automorphism group of a Hurwitz surface is a finite quotient of the alternating subgroup of the $(2, 3, 7)$ triangle group. (On that note, there is a beautiful paper by Noam Elkies, the Klein quartic in number theory, on the most famous Hurwitz surface.)

Also of particular interest is the triangle group $(2, 3, \infty)$, which has the property that one of the vertices of the corresponding hyperbolic triangle is “at infinity” and consequently has hyperbolic angle zero. It is in fact isomorphic to $\text{PGL}_2(\mathbb{Z})$. With relations $m(1, 2) = 3, m(2, 3) = \infty, m(1, 3) = 2$, the isomorphism is given by

$\displaystyle s_1 \mapsto \left[ \begin{array}{cc} 0 & 1 \\ 1 & 0 \end{array} \right], s_2 \mapsto \left[ \begin{array}{cc} -1 & 1 \\ 0 & 1 \end{array} \right], s_3 \mapsto \left[ \begin{array}{cc} -1 & 0 \\ 0 & 1 \end{array} \right].$

Its alternating subgroup $\text{PSL}_2(\mathbb{Z})$ is the infamous modular group, whose action on the hyperbolic plane we’ve discussed before.

The geometric representation

So far we know nothing about an arbitrary Coxeter group except that its order is at least two. In particular, we we don’t know that $s_i s_j$ actually has order exactly $m(i, j)$. This will be rectified by constructing a particularly well-behaved linear representation of an arbitrary Coxeter group.

The geometric representation of a Coxeter group $(W, S)$ is defined as follows. Let $V$ be the free vector space on the basis $\{ \alpha_s | s \in S \}$. Define a bilinear form on $V$ by extending

$\displaystyle B(\alpha_{s_i}, \alpha_{s_j}) = - \cos \frac{\pi}{m(i, j)}$

bilinearly. We are going to represent $s_i$ as a “reflection” across the hyperplane orthogonal to $\alpha_{s_i}$, and in that case the above bilinear form is precisely what the inner product “should be” in order to agree with what we said above about dihedral groups. ($s_i$ won’t actually act as a reflection in the usual sense unless $B$ is positive-definite.) We now define

$\displaystyle \sigma_s v = v - 2B(\alpha_s, v) \alpha_s, s \in S, v \in V$.

By construction, $\sigma_s \alpha_s = - \alpha_s$ and $\sigma_s v = v$ for all $v$ orthogonal to $\alpha_s$; exactly what a reflection would do (again, if $B$ were positive-definite). It follows that $\sigma_s$ has order $2$. It’s also not hard to show that $\sigma_s$ preserves the bilinear form, hence so does any element in the group generated by the $\sigma_s$ in $\text{GL}(V)$. But we don’t know yet whether this group is actually $W$.

Claim: $(\sigma_{s_i} \sigma_{s_j})^{m(i, j)} = 1$.

Proof. Restrict to the subspace $U$ spanned by $s_i, s_j$. Computation shows that $B$ is positive-semidefinite when restricted to $U$, and in particular is positive-definite if and only if $m(i, j) < \infty$ (exactly as one would expect from our discussion of dihedral groups above). It also shows that $\sigma_{s_i}, \sigma_{s_j}$ fix $U$. When $m(i, j) < \infty$, we compute that $B(\alpha_{s_i}, \alpha_{s_j}) = - \cos \frac{\pi}{m(i, j)}$, which implies that $\sigma_{s_i} \sigma_{s_j}$ is the rotation through $\frac{2\pi}{m(i, j)}$ radians. When $m(i, j) = \infty$, we compute that

$\displaystyle (\sigma_{s_i} \sigma_{s_j})^k \alpha_{s_i} = 2k(\alpha_{s_i} + \alpha_{s_j}) + \alpha_s$

from which it follows that $\sigma_{s_i} \sigma_{s_j}$ has infinite order as desired.

It follows that there is a unique homomorphism $\sigma : W \to \text{GL}(V)$ sending $s$ to $\sigma_s$, hence that $s_i s_j$ really does have order $m(i, j)$ for all $i, j$ in $W$. However, we do not yet know whether $\sigma$ has trivial kernel. We will show this in the next section.

(Aside: we do know that if $B$ is positive-definite, the geometric representation actually maps into the orthogonal group, so the corresponding Coxeter groups (which turn out to be precisely the finite Coxeter groups) act on a sphere. In the rank $3$ case, this is precisely the action of the spherical triangle groups described above. In the “simply laced” case where $m(i, j) \in \{ 1, 2, 3 \}$ (hence the corresponding Coxeter diagram has only unlabeled edges), it’s not hard to see that $2B = 2 - A$ where $A$ is the adjacency matrix of the Coxeter diagram, hence that $B$ is positive-definite if and only if $A$ has spectral radius less than $2$. This is the connection which is responsible for the ADE classification showing up both here and in the McKay correspondence, and in fact one can classify the finite Coxeter groups by a process almost exactly like that used in the linked post, except that we must take labels into account. The affine Dynkin diagrams that occur there are associated to the simply laced affine Weyl groups, which are the ones where $B$ is positive-semidefinite and has corank $1$ (in the simply laced case, the ones where $A$ has spectral radius exactly $2$), which are in turn associated to affine Lie algebras.)

The length function

Since all the generators square to the identity, we know that any element of $W$ can be written as a word in the generators, without the need for inverses, such that no two consecutive generators are the same. Let the length $\ell(w)$ of an element $w \in W$ denote the minimal length of such a word. When $W$ is a Weyl group, the length of a word corresponds to the dimension of the corresponding cell in the Bruhat decomposition of the corresponding flag variety. (This is merely by way of motivation and has nothing to do with how we will use the length function.)

Example. The length function on $S_n$ is the number of inversions.

Below is a list of elementary properties of the length function.

1. $\ell(w) = \ell(w^{-1})$.
2. $\ell(w) = 1$ if and only if $w \in S$.
3. $\ell(w w') \le \ell(w) + \ell(w')$ (the triangle inequality).
4. $\ell(w w') \ge \ell(w) - \ell(w')$.
5. $\ell(w) - 1 \le \ell(ws) \le \ell(w) + 1$ for all $s \in S, w \in W$.
6. $\varepsilon(w) = (-1)^{\ell(w)}$.

It follows that $\ell(ws)$ is always either equal to $\ell(w) + 1$ or to $\ell(w) - 1$. To determine which is the case, we turn to the geometric representation. First, we declare by convention that $w(\alpha_s) = \sigma_w(\alpha_s)$ in order to simplify our notation.

The root system of $(W, S)$ in its geometric representation $V$ is the set of all vectors of the form $w(\alpha_s), w \in W, s \in S$. (This notion of root system is more general than the notion used in Lie theory.) Any root $\alpha$ can be written in the form

$\displaystyle \alpha = \sum_{s \in S} c_s \alpha_s, c_s \in \mathbb{R}$

where, to distinguish them from the other roots, we will refer to the $\alpha_s$ as the simple roots. Call a root positive if $c_s \ge 0$ for all $s$, and write $\alpha > 0$. Similarly, call a root negative if $c_s \le 0$ for all $s$, and write $\alpha < 0$. Let $\Phi^{+}, \Phi^{-}$ denote the positive and negative roots, respectively.

If $I \subset S$, let $W_I$ denote the parabolic subgroup of $W$ generated by the elements of $I$ and let $\ell_I(w)$ denote the corresponding length function, with $w \in W_I$. (The name comes from the corresponding notion in the theory of algebraic groups.) Although we have not proven it yet, it will turn out that the restriction of the presentation of $W$ to $W_I$ makes $W_I$ a Coxeter group and that $\ell_I(w) = \ell(w)$. For now, all we know is that $\ell_I(w) \ge \ell(w)$.

Theorem: If $\ell(ws) > \ell(w)$, then $w(\alpha_s) > 0$. (Consequently, if $\ell(ws) < \ell(w)$, then $w(\alpha_s) < 0$.)

Proof. We proceed by induction on the length of $w$. The statement is obvious when $\ell(w) = 0, 1$. In general, let $s'$ be a generator that appears as the last letter in a reduced word for $w$; then $\ell(ws') < \ell(w)$ and $s' \neq s$. We want to write $w$ as a product of two words in a sufficiently clever way that we can apply the inductive hypothesis. To this end, let $I = \{ s, s' \}$ and let $A$ be the set of all words $a$ such that $wa^{-1} \in W_I$ and $\ell(w) = \ell(a) + \ell_I(wa^{-1})$. Clearly $w \in A$, so $A$ is nonempty. Let $v$ be an element in $A$ of minimal length, and write

$\displaystyle w = v v_I, v_I \in W_I$.

Since $ws' \in A$, it follows that $\ell(v) \le \ell(w) - 1$, so we will be able to apply the inductive hypothesis to $v$ once we know that $\ell(vs) > \ell(v)$.

Suppose to the contrary that that $\ell(vs) < \ell(v)$. Then we can argue as follows. By the triangle inequality,

$\displaystyle \ell(w) \le \ell(vs) + \ell(sv^{-1} w)$.

The hypothesis implies that $\ell(vs) = \ell(v) - 1$, whereas the fact that $sv^{-1} w \in W_I$ implies that $\ell(sv^{-1} w) \le \ell_I(sv^{-1} w) \le \ell_I(v^{-1} w) + 1$. But this implies

$\ell(w) \le \ell(v) + \ell_I(v^{-1} w) = \ell(w)$.

It follows that the first inequality must be an equality, hence that $vs \in A$. But $v$ has minimal length in $A$; contradiction. Hence $\ell(vs) > \ell(v)$, and the inductive hypothesis applies to $v$. By induction, it follows that $v(\alpha_s) > 0$. The same argument shows that $\ell(vs') > \ell(v)$, hence that $v(\alpha_{s'}) > 0$. It now remains to show that $v_I$ maps $\alpha_s$ to a non-negative linear combination of $\alpha_s, \alpha_{s'}$.

We now claim that $\ell_I(v_I s) \ge \ell_I(v_I)$. Otherwise, we get

$\displaystyle \ell(ws) \le \ell(v) + \ell(v^{-1} ws) \le \ell(v) + \ell_I(v_I s) < \ell(v) + \ell_I(v_I) = \ell(w)$

which contradicts the inductive hypothesis. It follows that a reduced expression for $v_I$ in $W_I$, which can only be an alternating product of $s, s'$, must end in $s'$. The rest is computation and casework.

Case: $m(s, s') = \infty$. Then $B(\alpha_s, \alpha_{s'}) = -1$, hence $s'(\alpha_s) = \alpha_s + 2 \alpha_{s'}, ss'(\alpha_s) = 2 \alpha_s + 3 \alpha_{s'}$, and so forth.

Case: $m(s, s') = m < \infty$. It follows that $v_I$ is a product of less than $\frac{m}{2}$ terms of the form $ss'$ together with possibly an $s'$ in front. Now, observe that $ss'$ rotates $\alpha_s$ an angle of $\frac{2\pi}{m}$ towards $\alpha_{s'}$, hence less than $\frac{m}{2}$ such rotations keep us within the cone spanned by $\alpha_s, \alpha_{s'}$. It's also not hard to see (draw a picture!) that the extra reflection caused by a starting $s'$ also keeps us within this cone.

Corollary: Every root is either positive or negative.

Corollary: The geometric representation $\sigma : W \to \text{GL}(V)$ is faithful.

Proof. Suppose $w$ is not the identity but is in the kernel of $\sigma$. Then there exists $s$ such that $\ell(ws) < \ell(w)$, hence $w(\alpha_s) < 0$. But the identity fixes $\alpha_s$; contradiction.