In order to continue our discussion of symmetric functions it will be useful to have some group representation theory prerequisites, although I will use many of the results in the representation theory of the symmetric groups as black boxes. I had planned on using this post to discuss Frobenius reciprocity, but got so carried away with motivating it that this post now stands alone.
Today I’d like to discuss the representation theory of finite groups over $\mathbb{C}$. As these are strong assumptions, the resulting theory is quite elegant, but I always found the proofs a little unmotivated, so I’m going to try to use the categorical perspective to fix that. Admittedly, I don’t have much experience with this kind of thing, so this post is for my own benefit as much as anyone else’s. The main focus of this post is motivating the orthogonality relations.
When talking about a subject as popular as representation theory, a math blogger runs the risk of repeating material that has been thoroughly exposited on other blogs. Here are some of the posts I will try to avoid repeating, at least not without citation: John Armstrong has written about the general category-theoretic picture, hilbertthm90 has some posts about the basic results of group representation theory, and Akhil Mathew over at Delta Epsilons has also written about group representations. (For the sake of completeness, there are also the higher-level representation-theoretic posts at the Secret Blogging Seminar and at Concrete Nonsense.)
Let $G$ be a finite group regarded as a one-object category and let $\text{FinVect}$ denote the category of finite-dimensional vector spaces (here, over $\mathbb{C}$) and linear operators between them. The finite-dimensional representations of $G$ form a functor category $\text{Rep}(G)$ whose objects are functors $G \to \text{FinVect}$. We will use capital letters such as $V, W$ to denote the underlying vector space of a representation and Greek letters such as $\rho$ to denote the function that sends an element $g$ of $G$ to a linear transformation $\rho(g) : V \to V$, and we will often name a representation by its underlying vector space. Concretely, a representation of $G$ on $V$ is a homomorphism $\rho : G \to \text{Aut}(V)$ (in $\text{FinVect}$; from here on in, $\text{Hom}$, $\text{End}$ and $\text{Aut}$ should be interpreted as being in $\text{FinVect}$).
The morphisms in $\text{Rep}(G)$ are natural transformations of functors. Concretely, a morphism between two representations $\rho_V, \rho_W$ of the same group on vector spaces $V, W$ is a linear transformation $T : V \to W$ such that $T \rho_V(g) = \rho_W(g) T$ for all $g \in G$. To avoid confusion regarding what “morphism” refers to we will call such a map an intertwining operator. Two representations are isomorphic if there is an invertible intertwining operator between them.
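The category $\text{Rep}(G)$ is quite rigid. One can think of it as “$\text{FinVect}$ with extra structure” (just as, say, groups are sets with extra structure), and from that perspective it inherits the structure of an additive category, so it has a biproduct in the form of the direct sum of representations defined by setting $\rho_{V \oplus W}(g) = \rho_V(g) \oplus \rho_W(g)$. Additivity also implies that its Hom-sets are enriched over $\text{FinVect}$. (All this means is that intertwining operators from $V$ to $W$ form a vector space.) The universal properties of the biproduct imply that for any three representations $U, V, W$ one has natural identifications

$$\text{Hom}(U \oplus V, W) \cong \text{Hom}(U, W) \times \text{Hom}(V, W), \qquad \text{Hom}(U, V \oplus W) \cong \text{Hom}(U, V) \times \text{Hom}(U, W).$$

Anticipating later discussion we will write these as

$$\text{Hom}(U \oplus V, W) \cong \text{Hom}(U, W) \oplus \text{Hom}(V, W), \qquad \text{Hom}(U, V \oplus W) \cong \text{Hom}(U, V) \oplus \text{Hom}(U, W).$$

Thus the functor $\text{Hom}(-, -)$ is “bilinear.”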
Concretely, let’s talk about how this works for $\text{FinVect}$ first. The direct sum $V \oplus W$ is characterized by the fact that every vector in it is uniquely representable as a sum $v + w$ where $v \in V, w \in W$. (I’m abusing notation here, but it should be clear what I mean by this.) It is this uniqueness that is the key to its universal properties. If we ignore all the representation stuff and just consider linear operators, then:
- Every linear operator $T : V \oplus W \to U$ satisfies $T(v + w) = T(v) + T(w)$, and the restriction $T|_V$ must be a linear operator defined on $V$ while $T|_W$ must be a linear operator defined on $W$. Conversely, any pair of linear operators $T_1 : V \to U, T_2 : W \to U$ describes an operator $T : V \oplus W \to U$ by setting $T(v + w) = T_1(v) + T_2(w)$. Hence the direct sum is a finite coproduct in $\text{FinVect}$.
- Every linear operator $T : U \to V \oplus W$ has the property that the image $T(u)$ can be uniquely written as the sum $T_1(u) + T_2(u)$ of two functions $T_1 : U \to V, T_2 : U \to W$, which must themselves be linear operators. Conversely, any pair of linear operators $T_1 : U \to V, T_2 : U \to W$ describes a linear operator $T : U \to V \oplus W$. Hence the direct sum is a finite product in $\text{FinVect}$.
But it’s easy to verify that the above arguments continue to hold if $T$ is required to be an intertwining operator rather than a mere linear operator; the operators $T_1, T_2$ must all in fact be intertwining.
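A subobject in $\text{Rep}(G)$ is a subrepresentation, or a subspace invariant under the action of $G$, and a nonzero representation is irreducible if it has no subrepresentations other than $0$ and itself. A second source of rigidity in $\text{Rep}(G)$ comes from Schur’s lemma, which can be stated (over $\mathbb{C}$) as follows: if $V, W$ are irreducible, then

$$\dim \text{Hom}(V, W) = \begin{cases} 1 & \text{if } V \cong W \\ 0 & \text{otherwise.} \end{cases}$$

Thus irreducible representations are “orthonormal.”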
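The proof is short enough to give here: the kernel of any intertwining operator $T : V \to W$ must be $G$-invariant, hence it forms a subrepresentation of $V$, which by irreducibility is either all of $V$ or $0$. In the latter case $T$ is injective, and since its image is likewise a nonzero subrepresentation of $W$ it is also surjective; thus $V, W$ are isomorphic, so one can regard an intertwining operator as an endomorphism of $V$. Over $\mathbb{C}$, this endomorphism has an eigenvector with some eigenvalue $\lambda$, and the nullspace of $T - \lambda I$ is again $G$-invariant and nonzero, hence all of $V$. So $T = \lambda I$, and the intertwining operators form a one-dimensional space.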
Combining Schur’s lemma and “bilinearity” gives the following result.
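Proposition: Let $V_1, \dots, V_k$ be non-isomorphic irreducible complex representations of a group $G$, and let $V = \bigoplus_i m_i V_i$ and $W = \bigoplus_i n_i V_i$ be semisimple, i.e. direct sums of irreducible representations where $m_i V_i$ indicates that the irreducible representation $V_i$ occurs as a summand $m_i$ times. Then

$$\dim \text{Hom}(V, W) = \sum_{i=1}^{k} m_i n_i.$$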
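As a small worked instance (the multiplicities here are chosen purely for illustration), take two non-isomorphic irreducibles $V_1, V_2$ with $V = 2V_1 \oplus V_2$ and $W = V_1 \oplus 3V_2$. Expanding with “bilinearity” and then applying Schur’s lemma to each block gives:

```latex
\begin{align*}
\text{Hom}(2V_1 \oplus V_2,\; V_1 \oplus 3V_2)
  &\cong \text{Hom}(V_1, V_1)^{\oplus 2} \oplus \text{Hom}(V_1, V_2)^{\oplus 6}
     \oplus \text{Hom}(V_2, V_1)^{\oplus 1} \oplus \text{Hom}(V_2, V_2)^{\oplus 3}, \\
\dim \text{Hom}(V, W)
  &= 2 \cdot 1 + 6 \cdot 0 + 1 \cdot 0 + 3 \cdot 1 = 5 = m_1 n_1 + m_2 n_2 .
\end{align*}
```

Only the blocks pairing an irreducible with itself contribute, which is exactly why the answer looks like a dot product of the multiplicity vectors $(2, 1)$ and $(1, 3)$.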
If the point hasn’t already been made, the functor $\text{Hom}(-, -)$ in $\text{Rep}(G)$ has the remarkable property that it behaves like a categorified inner product. I believe some people call this structure a 2-Hilbert space, and I really like the circle of ideas here. As John Baez writes,
The inner product of a Hilbert space $H$ is a bilinear map

$$\langle \cdot, \cdot \rangle : \overline{H} \times H \to \mathbb{C}$$

taking each pair of elements $x, y \in H$ to the inner product $\langle x, y \rangle$. Here $\overline{H}$ denotes the conjugate of the Hilbert space $H$. Similarly, the hom functor in a category $C$ is a bifunctor

$$\text{hom} : C^{op} \times C \to \text{Set}$$

taking each pair of objects $c, d \in C$ to the set $\text{hom}(c, d)$ of morphisms from $c$ to $d$. This analogy clarifies the relation between category theory and quantum theory that is so important in topological quantum field theory. In quantum theory the inner product $\langle x, y \rangle$ is a number representing the amplitude to pass from $x$ to $y$, while in category theory $\text{hom}(c, d)$ is a set of morphisms passing from $c$ to $d$.
This is a beautiful insight; I think intuition from physics is a fundamental tool in mathematics and I wish I knew more examples as nice as this.
Another important source of rigidity in $\text{Rep}(G)$ is Maschke’s theorem, which states that every representation of a finite group is semisimple. As a corollary, the previous results now completely describe the morphisms in $\text{Rep}(G)$. I’m somewhat unsatisfied with the standard proof of Maschke’s theorem, so I’ll delay it until the end; it interrupts the flow of the post somewhat.
The orthogonality relations and character theory
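The character theory of finite groups allows us to compute $\dim \text{Hom}(V, W)$ using surprisingly little information about the representations themselves. Associated to any representation $\rho$ is a function $\chi(g) = \text{tr}\, \rho(g)$ called the character of the representation. Given two characters $\chi_V, \chi_W$ of irreducible representations the orthogonality relations state that

$$\langle \chi_V, \chi_W \rangle = \frac{1}{|G|} \sum_{g \in G} \overline{\chi_V(g)} \chi_W(g) = \begin{cases} 1 & \text{if } V \cong W \\ 0 & \text{otherwise.} \end{cases}$$

By bilinearity it then follows that $\langle \chi_V, \chi_W \rangle = \dim \text{Hom}(V, W)$ for arbitrary representations $V, W$. In other words, in the case of $\text{Rep}(G)$ we can in fact associate a concrete inner product with our categorified inner product. The proofs of this result that I have seen do not mention explicitly the connection to $\text{Hom}$ and are annoyingly computational, so I would like to give a more conceptual (read: functorial) proof.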
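Before the conceptual proof, it may help to see the relations hold numerically in one small case. The sketch below assumes the standard character table of the symmetric group $S_3$ (trivial, sign, and two-dimensional standard representation), listed by conjugacy class; the names `class_sizes`, `characters`, `inner` and `gram` are illustrative only.

```python
from fractions import Fraction

# Conjugacy classes of S_3: identity, transpositions, 3-cycles.
class_sizes = [1, 3, 2]

# Assumed input: the character table of S_3, one value per conjugacy class.
characters = {
    "trivial":  [1,  1,  1],
    "sign":     [1, -1,  1],
    "standard": [2,  0, -1],
}

order = sum(class_sizes)  # |S_3| = 6

def inner(chi, psi):
    """(1/|G|) * sum over g of conj(chi(g)) * psi(g), summed class by class.

    The characters of S_3 are real, so the complex conjugate is omitted.
    """
    total = sum(n * a * b for n, a, b in zip(class_sizes, chi, psi))
    return Fraction(total, order)

# Gram matrix of the irreducible characters under this inner product.
gram = {
    (a, b): inner(characters[a], characters[b])
    for a in characters
    for b in characters
}

for (a, b), value in gram.items():
    print(f"<{a}, {b}> = {value}")
```

The printed Gram matrix is the identity, matching the “orthonormality” that Schur’s lemma predicts for $\dim \text{Hom}$.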
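As a way of motivating it, here is a proof for permutation representations. Let $G$ act on two finite sets $X, Y$ and consider the corresponding representations on $\mathbb{C}[X], \mathbb{C}[Y]$ (the vector spaces with bases $X, Y$), which have characters $\chi_X(g) = |\{ x \in X : gx = x \}|$ and $\chi_Y(g) = |\{ y \in Y : gy = y \}|$. These characters are real, and $\chi_X(g) \chi_Y(g)$ is the number of fixed points of $g$ on $X \times Y$. By the orbit-counting lemma,

$$\langle \chi_X, \chi_Y \rangle = \frac{1}{|G|} \sum_{g \in G} \chi_X(g) \chi_Y(g)$$

is precisely the number of orbits in the product group action on $X \times Y$. Thus it remains to prove the following.
Proposition: The intertwining operators $\mathbb{C}[X] \to \mathbb{C}[Y]$ have a basis which is in one-to-one correspondence with orbits of the action of $G$ on $X \times Y$.
Proof. Given any orbit $O$ of $X \times Y$ there is an intertwining operator $T_O$ which sends each $x \in X$ to the sum of those $y$ with $(x, y) \in O$ (so a basis element that appears in no pair of $O$ is sent to zero). Conversely, suppose $T$ is an intertwining operator with matrix $(t_{y,x})$ (where we pick the basis given by the elements of $X, Y$). Then the intertwining condition requires that, working through the definition of matrix multiplication,

$$t_{gy, gx} = t_{y, x} \quad \text{for all } g \in G,$$

or equivalently that $t$ is constant on orbits of $X \times Y$. Such a $T$ is a linear combination of the $T_O$, which are linearly independent because their matrices have disjoint supports. The conclusion follows.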
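The counting argument above can be checked by brute force on a small example. This is a minimal sketch, assuming $G = S_3$ acting on $X = Y = \{0, 1, 2\}$ in the obvious way; the helper names `fixed_points` and `count_orbits` are made up for the illustration.

```python
from itertools import permutations, product

# S_3 as tuples: g[i] is the image of i under the permutation g.
G = list(permutations(range(3)))
X = list(range(3))  # G acts on X by g.x = g[x]
Y = list(range(3))  # same action on Y, chosen only for simplicity

def fixed_points(g, points):
    """Character of the permutation representation: number of fixed points."""
    return sum(1 for p in points if g[p] == p)

# Left side: the average over G of chi_X(g) * chi_Y(g).
total = sum(fixed_points(g, X) * fixed_points(g, Y) for g in G)
assert total % len(G) == 0
average = total // len(G)

def count_orbits(pairs):
    """Count orbits of the diagonal action g.(x, y) = (g.x, g.y)."""
    seen = set()
    count = 0
    for x, y in pairs:
        if (x, y) not in seen:
            count += 1
            seen.update((g[x], g[y]) for g in G)
    return count

# Right side: the number of orbits on X x Y, which is also the dimension
# of the space of intertwining operators C[X] -> C[Y].
num_orbits = count_orbits(product(X, Y))

print(average, num_orbits)  # prints "2 2"
```

The two orbits are the diagonal $\{(x, x)\}$ and its complement, corresponding to the identity operator and the operator whose matrix has zeros on the diagonal and ones elsewhere.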
A proof for general representations
Let’s generalize the above proof. The generalization of the product group action is the (internal) tensor product of representations, but defining it properly requires recognizing a subtlety I usually ignore. (Edit, 8/30/09: The text that used to be here has been moved to where it is more relevant; see Theo’s comment below.)
The tensor product $V \otimes W$ of a right representation $V$ and a left representation $W$ is a left representation defined on the tensor product of the underlying vector spaces with the action given by the linear extension of $g(v \otimes w) = v g^{-1} \otimes g w$. It satisfies the following universal property: any bilinear function $f : V \times W \to \mathbb{C}$ preserving the action of $G$ in the sense that $f(v g^{-1}, g w) = f(v, w)$ must factor through the tensor product, i.e. must come from an intertwining operator $V \otimes W \to \mathbb{C}$ (with $\mathbb{C}$ regarded as the trivial representation). As expected from a generalization of the product action, the tensor product has character $\chi_V \chi_W$, where $\chi_V$ is the character of $V$ regarded as a left representation via $v \mapsto v g^{-1}$.
The generalization of the orbit-counting lemma is the following special case of the orthogonality relations: for any representation $V$ with character $\chi$, the average $\frac{1}{|G|} \sum_{g \in G} \chi(g)$ counts the number of copies of the trivial representation in $V$. Since adding a copy of the trivial representation increases the value of this average by $1$ (in the permutation picture, every permutation gains a fixed point), it suffices to show that for a non-trivial irreducible representation one has

$$\sum_{g \in G} \chi(g) = 0.$$
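One way to prove this is as follows: the image of the linear transformation $\sum_{g \in G} \rho(g)$ consists of vectors fixed by $G$, so for irreducible $\rho$ it is a $G$-invariant subspace of dimension at most $1$ and hence (under the assumption of non-triviality) must be zero. The transformation is therefore zero, and so is its trace $\sum_{g \in G} \chi(g)$. I dislike this proof for the same reason that I dislike the usual proof of Maschke’s theorem, but I’ll let this one slide.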
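The averaging operator $\frac{1}{|G|} \sum_{g \in G} \rho(g)$ can be computed explicitly in a small case. The sketch below assumes the permutation representation of $S_3$ on $\mathbb{C}^3$ (which is not irreducible: it contains exactly one copy of the trivial representation) and uses exact rational arithmetic; `perm_matrix` and `matmul` are helper names invented for this illustration.

```python
from fractions import Fraction
from itertools import permutations

n = 3
G = list(permutations(range(n)))  # S_3; g[j] is the image of j

def perm_matrix(g):
    """Matrix sending basis vector e_j to e_{g(j)}: a 1 in row g[j] of column j."""
    return [[1 if g[j] == i else 0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Plain matrix product of two n x n matrices given as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

# P = (1/|G|) * sum of rho(g) over g in G, computed entry by entry.
matrices = [perm_matrix(g) for g in G]
P = [
    [Fraction(sum(M[i][j] for M in matrices), len(G)) for j in range(n)]
    for i in range(n)
]

# The trace of P is the average of the character, i.e. the number of
# copies of the trivial representation.
trace = sum(P[i][i] for i in range(n))

print(P)
print(trace)
```

Here every entry of `P` is $1/3$, so `P` is the projection onto the line spanned by $(1, 1, 1)$, the vectors fixed by every permutation. One can confirm that `matmul(P, P) == P` and that `matmul(perm_matrix(g), P) == P` for every `g`.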
There is an additional step necessary to account for conjugate linearity. The definition of a representation we gave above is equivalent to giving a left action $G \times V \to V$ satisfying $(gh)v = g(hv)$. We can also define a right action $V \times G \to V$ satisfying $v(gh) = (vg)h$; note that the order of multiplication is reversed. (Equivalently, a right action of $G$ is a left action of the opposite group $G^{op}$.)
Any left representation $\rho$ of a group $G$ on a finite-dimensional vector space $V$ naturally (that is, functorially) defines a right dual representation $\rho^{\ast}$ on the dual space $V^{\ast}$, as follows: there is a unique representation preserving the natural pairing $\langle \cdot, \cdot \rangle : V^{\ast} \times V \to \mathbb{C}$ in the sense that $\langle f g, v \rangle = \langle f, g v \rangle$. In matrix form, if one thinks of the natural pairing as multiplying row and column vectors, $\rho^{\ast}(g) = \rho(g)$, but acting on row vectors instead of column vectors; thus if we want to think of the dual representation as a left representation we should take the transpose of $\rho(g^{-1})$, which recovers the usual order of multiplication. Because every element of a finite group has finite order, $\rho(g)$ is diagonalizable and has eigenvalues consisting of roots of unity for every $g \in G$, and roots of unity have the property that their inverses are their conjugates; in other words, the dual representation has character $\overline{\chi}$, the conjugate of the character $\chi$ of $\rho$. (This is essentially because representations of finite groups over $\mathbb{C}$ are isomorphic to unitary representations, which is most of the proof of Maschke’s theorem; more on this later.)
It now follows that for two representations $V, W$ the inner product $\langle \chi_V, \chi_W \rangle$ is equal to the number of copies of the trivial representation in the representation $V^{\ast} \otimes W$. Thus it remains to prove the following.
Proposition: $\text{Hom}(V, W)$ is isomorphic to the subspace of $V^{\ast} \otimes W$ consisting of the vectors fixed by all of $G$.
Proof. Suspend the representation theory and again focus on the linear algebra. Any pure tensor $f \otimes w \in V^{\ast} \otimes W$ naturally defines a linear operator $V \to W$ defined by letting $f$ act on $v \in V$ and multiplying the result by $w$, that is, $v \mapsto f(v) w$. Consequently one can show by induction or by picking a basis that every linear operator $V \to W$ is a linear combination of such pure tensors.
Now we reintroduce the representation theory. Generalizing the permutation proof, the condition that every element of $G$ fixes an element of $V^{\ast} \otimes W$ is precisely the condition that the corresponding linear operator $T$ satisfies $\rho_W(g) T \rho_V(g)^{-1} = T$, i.e. $T \rho_V(g) = \rho_W(g) T$! (In other words, one way to motivate the definition of both the tensor and dual representations is that this should be true.) This completes the proof.
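A corollary of the orthogonality relations is the surprising fact that a representation is determined by its character (up to isomorphism), since its decomposition into irreducible representations can be computed from the character alone: the multiplicity of an irreducible $V_i$ in $V$ is $\langle \chi_{V_i}, \chi_V \rangle$. Since characters are invariant under conjugation and the irreducible characters are orthonormal, hence linearly independent, another corollary is that the number of irreducible representations is at most the number of conjugacy classes of $G$. (The two numbers are in fact equal, but equality takes an additional argument.)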
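To make the “determined by its character” claim concrete, here is a minimal sketch that reads off a decomposition from a character alone. It again assumes the standard character table of $S_3$ as input; `irreducibles`, `inner` and `multiplicities` are names invented for the example.

```python
from fractions import Fraction

# Conjugacy classes of S_3: identity, transpositions, 3-cycles.
class_sizes = [1, 3, 2]
order = sum(class_sizes)

# Assumed input: the irreducible characters of S_3 (all real-valued).
irreducibles = {
    "trivial":  [1,  1,  1],
    "sign":     [1, -1,  1],
    "standard": [2,  0, -1],
}

def inner(chi, psi):
    """(1/|G|) * sum over g of chi(g) * psi(g); no conjugate needed for real characters."""
    return Fraction(sum(n * a * b for n, a, b in zip(class_sizes, chi, psi)), order)

def multiplicities(chi):
    """Multiplicity of each irreducible in the representation with character chi."""
    return {name: inner(irr, chi) for name, irr in irreducibles.items()}

# Permutation representation on 3 points: chi(g) = number of fixed points.
perm = multiplicities([3, 1, 0])

# Regular representation: chi(e) = |G| and chi(g) = 0 otherwise.
regular = multiplicities([6, 0, 0])

print(perm)
print(regular)
```

The permutation representation comes out as trivial plus standard, and in the regular representation each irreducible appears with multiplicity equal to its dimension ($1$, $1$ and $2$).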
More general than the notion of representation of a group is a representation of an associative algebra $A$, which can be defined as an algebra homomorphism $A \to \text{End}(V)$ (since $\text{End}(V)$ has the structure of an algebra with the multiplication given by composition). For the case of finite groups the relevant algebra is the group algebra $\mathbb{C}[G]$, which consists of formal linear combinations of the elements of $G$ which multiply by the group operation. From this perspective, one way to think about the representation theory of finite groups is in terms of special properties enjoyed by group algebras.
Even more general than the notion of a representation of an associative algebra is that of a module for a ring. The module-theoretic perspective clarifies many of the above constructions, since the category $R\text{-Mod}$ of (left) modules over a fixed ring $R$ directly generalizes the category of vector spaces over a field. An intertwining operator of $G$-representations is precisely a homomorphism of $\mathbb{C}[G]$-modules, since the requirement that “scalar multiplication” be respected is precisely the requirement that the action of $G$ be respected. $R\text{-Mod}$ is an additive category for any ring $R$, so one gets the “bilinearity” of the functor $\text{Hom}$ in a very general context. And the notion of irreducible representation (as well as the proof of Schur’s lemma) carries through to simple modules in general.
However, neither internal tensor products nor dual representations are defined for modules over general rings. Their existence is due to the fact that group algebras carry a lot of extra structure: they are in fact Hopf algebras. The comultiplication gives us the internal tensor product and the antipode gives us duals. The Wikipedia article explains this quite well.
The two major tools left to be explained are Maschke’s theorem and characters. Maschke’s theorem is essentially a compactness result. It generalizes straightforwardly to compact groups as follows: given a representation of a compact group $G$ on a Hilbert space $H$ with inner product $\langle \cdot, \cdot \rangle$ one uses the Haar integral to average over $G$, obtaining a new inner product $\langle \cdot, \cdot \rangle_G$ which is $G$-invariant. $G$-invariant inner products have the property that the orthogonal complement of a $G$-invariant subspace (that is, a subrepresentation) is also $G$-invariant, so it follows by induction that every finite-dimensional representation is semisimple, since any finite-dimensional vector space can be turned into a Hilbert space. (The standard results are much stronger than this, but I don’t know much about them.) As a corollary, every compact subgroup of $\text{GL}_n(\mathbb{C})$ is conjugate to a subgroup of the unitary group $\text{U}(n)$; in other words, the unitary group is an essentially unique maximal compact subgroup of $\text{GL}_n(\mathbb{C})$.
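For a finite group the Haar integral is just an average over group elements, and the two key steps can be written out in full. The following is a sketch of that finite case only; the symbol $\langle \cdot, \cdot \rangle_G$ for the averaged inner product and the name $U$ for the invariant subspace are notation chosen here.

```latex
\begin{align*}
\langle v, w \rangle_G
  &:= \frac{1}{|G|} \sum_{g \in G} \langle g v, g w \rangle, \\
\langle h v, h w \rangle_G
  &= \frac{1}{|G|} \sum_{g \in G} \langle g h v, g h w \rangle
   = \frac{1}{|G|} \sum_{g' \in G} \langle g' v, g' w \rangle
   = \langle v, w \rangle_G
   \qquad (g' = g h), \\
\langle g w, u \rangle_G
  &= \langle w, g^{-1} u \rangle_G = 0
   \qquad \text{for } u \in U,\ w \in U^{\perp},\ \text{since } g^{-1} u \in U .
\end{align*}
```

The second line is the invariance of the averaged inner product, and the third shows that the orthogonal complement $U^{\perp}$ (taken with respect to $\langle \cdot, \cdot \rangle_G$) of a $G$-invariant subspace $U$ is again $G$-invariant, which is the inductive step in the semisimplicity argument.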
Characters, on the other hand, are still mysterious to me. There is a functorial definition of the trace based on the identification of $\text{End}(V)$ with $V^{\ast} \otimes V$, then extending the natural bilinear pairing $V^{\ast} \times V \to \mathbb{C}$, but I can’t come up with a particularly good reason why this should accurately reflect the structure of a representation. If anyone has any comments on this, they would be much appreciated.