In order to continue our discussion of symmetric functions it will be useful to have some group representation theory prerequisites, although I will use many of the results in the representation theory of the symmetric groups as black boxes. I had planned on using this post to discuss Frobenius reciprocity, but got so carried away with motivating it that this post now stands alone.
Today I’d like to discuss the representation theory of finite groups over $\mathbb{C}$. As these are strong assumptions, the resulting theory is quite elegant, but I always found the proofs a little unmotivated, so I’m going to try to use the categorical perspective to fix that. Admittedly, I don’t have much experience with this kind of thing, so this post is for my own benefit as much as anyone else’s. The main focus of this post is motivating the orthogonality relations.
When talking about a subject as popular as representation theory, a math blogger runs the risk of repeating material that has been thoroughly exposited on other blogs. Here are some of the posts I will try to avoid repeating, at least not without citation: John Armstrong has written about the general category-theoretic picture, hilbertthm90 has some posts about the basic results of group representation theory, and Akhil Mathew over at Delta Epsilons has also written about group representations. (For the sake of completeness, there are also the higher-level representation-theoretic posts at the Secret Blogging Seminar and at Concrete Nonsense.)
Intertwining operators
Let $G$ be a finite group regarded as a one-object category and let $\text{FinVect}$ denote the category of finite-dimensional vector spaces (here, over $\mathbb{C}$) and linear operators between them. The finite-dimensional representations of a group $G$ form a functor category $\text{Rep}(G)$ whose objects are functors $G \to \text{FinVect}$. We will use uppercase letters such as $V, W$ to denote the underlying vector space of a representation and Greek letters such as $\rho, \sigma$ to denote the function that sends an element $g \in G$ to a linear transformation $\rho(g) : V \to V$, and we will often name a representation by its underlying vector space. Concretely, a representation of $G$ on $V$ is a homomorphism $\rho : G \to \text{GL}(V)$ (in $\text{FinVect}$; from here on in, every vector space should be interpreted as being an object of $\text{FinVect}$).
The morphisms in $\text{Rep}(G)$ are natural transformations of functors. Concretely, a morphism between two representations $\rho, \sigma$ of the same group on vector spaces $V, W$ is a linear transformation $T : V \to W$ such that $T \rho(g) = \sigma(g) T$ for all $g \in G$. To avoid confusion regarding what “morphism” refers to we will call such a map an intertwining operator. Two representations are isomorphic if there is an invertible intertwining operator between them.
The category $\text{Rep}(G)$ is quite rigid. One can think of it as $\text{FinVect}$ “with extra structure” (just as, say, groups are sets with extra structure), and from that perspective $\text{Rep}(G)$ inherits the structure of an additive category, so it has a biproduct in the form of the direct sum of representations $V \oplus W$ defined by setting $(\rho \oplus \sigma)(g) = \rho(g) \oplus \sigma(g)$. Additivity also implies that its Hom-sets are enriched over $\text{Vect}$. (All this means is that intertwining operators from $V$ to $W$ form a vector space.) The universal properties of the biproduct imply that for any three representations $U, V, W$ one has natural identifications $\text{Hom}(U \oplus V, W) \simeq \text{Hom}(U, W) \oplus \text{Hom}(V, W)$ and $\text{Hom}(U, V \oplus W) \simeq \text{Hom}(U, V) \oplus \text{Hom}(U, W)$.

Anticipating later discussion we will write these as $\langle U \oplus V, W \rangle = \langle U, W \rangle + \langle V, W \rangle$ and $\langle U, V \oplus W \rangle = \langle U, V \rangle + \langle U, W \rangle$.

Thus the functor $\text{Hom}(-, -)$ is “bilinear.”
Concretely, let’s talk about how this works for $\text{FinVect}$ first. The direct sum $V \oplus W$ is characterized by the fact that every vector in it is uniquely representable as a sum $v + w$ where $v \in V, w \in W$. (I’m abusing notation here, but it should be clear what I mean by this.) It is this uniqueness that is the key to its universal properties. If we ignore all the representation stuff and just consider linear operators, then:

- Every linear operator $T : U \oplus V \to W$ satisfies $T(u + v) = T(u) + T(v)$, and the restriction of $T$ to $U$ must be a linear operator $U \to W$ while the restriction of $T$ to $V$ must be a linear operator $V \to W$. Conversely, any pair of linear operators $U \to W$ and $V \to W$ describes an operator $U \oplus V \to W$. Hence the direct sum is a finite coproduct in $\text{FinVect}$.
- Every linear operator $T : U \to V \oplus W$ has the property that the image $T(u)$ can be uniquely written as the sum of two functions $T_V(u) + T_W(u)$, which must themselves be linear operators $U \to V$ and $U \to W$. Conversely, any pair of linear operators $U \to V$ and $U \to W$ describes a linear operator $U \to V \oplus W$. Hence the direct sum is a finite product in $\text{FinVect}$.
But it’s easy to verify that the above arguments continue to hold if $T$ is required to be an intertwining operator rather than a mere linear operator; the component operators obtained must then all in fact be intertwining.
A subobject in $\text{Rep}(G)$ is a subrepresentation, i.e. a subspace invariant under the action of $G$, and a representation is irreducible if it has no nontrivial subrepresentations. A second source of rigidity in $\text{Rep}(G)$ comes from Schur’s lemma, which can be stated (over $\mathbb{C}$) as follows: if $V, W$ are irreducible, then

$\displaystyle \dim \text{Hom}(V, W) = \begin{cases} 1 & \text{if } V \simeq W \\ 0 & \text{otherwise.} \end{cases}$

Thus irreducible representations are “orthonormal.”
The proof is short enough to give here: the kernel of any intertwining operator $T : V \to W$ must be $G$-invariant, hence it forms a subrepresentation of $V$, which by irreducibility is either all of $V$ (in which case $T = 0$) or $0$. In the latter case $T$ is injective, so its image is a nonzero subrepresentation of $W$, hence all of $W$ by irreducibility; thus $V, W$ are isomorphic, and one can regard an intertwining operator $T$ as an endomorphism of $V$. Over $\mathbb{C}$, this endomorphism has an eigenvector with some eigenvalue $\lambda$, and the nullspace of $T - \lambda I$ is again $G$-invariant, hence all of $V$; that is, $T = \lambda I$, so $\dim \text{Hom}(V, V) = 1$.
Combining Schur’s lemma and “bilinearity” gives the following result.
Proposition: Let $V_1, V_2, \dots$ be the non-isomorphic irreducible complex representations of a group $G$, and let $V = \bigoplus_i n_i V_i$ and $W = \bigoplus_i m_i V_i$ be semisimple, i.e. direct sums of irreducible representations, where $n_i V_i$ indicates that the irreducible representation $V_i$ occurs as a summand $n_i$ times. Then

$\displaystyle \langle V, W \rangle = \dim \text{Hom}(V, W) = \sum_i n_i m_i.$
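To make the formula concrete, here is a small worked instance (my example, not from the original text): if $V = 2 V_1 \oplus V_2$ and $W = V_1 \oplus 3 V_2$, then bilinearity expands $\langle V, W \rangle$ into four terms, the cross terms $\langle V_1, V_2 \rangle$ and $\langle V_2, V_1 \rangle$ vanish by Schur’s lemma, and

$\displaystyle \dim \text{Hom}(V, W) = 2 \cdot 1 \cdot \langle V_1, V_1 \rangle + 1 \cdot 3 \cdot \langle V_2, V_2 \rangle = 2 + 3 = 5.$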
If the point hasn’t already been made, the functor $\text{Hom}(-, -)$ in $\text{Rep}(G)$ has the remarkable property that it behaves like a categorified inner product. I believe some people call this structure a 2-Hilbert space, and I really like the circle of ideas here. As John Baez writes,
The inner product of a Hilbert space is a bilinear map $\langle \cdot, \cdot \rangle : \overline{H} \times H \to \mathbb{C}$ taking each pair of elements $x, y$ to the inner product $\langle x, y \rangle$. Here $\overline{H}$ denotes the conjugate of the Hilbert space $H$. Similarly, the hom functor in a category $C$ is a bifunctor $\hom : C^{op} \times C \to \text{Set}$ taking each pair of objects $x, y$ to the set $\hom(x, y)$ of morphisms from $x$ to $y$. This analogy clarifies the relation between category theory and quantum theory that is so important in topological quantum field theory. In quantum theory the inner product $\langle x, y \rangle$ is a number representing the amplitude to pass from $x$ to $y$, while in category theory $\hom(x, y)$ is a set of morphisms passing from $x$ to $y$.
This is a beautiful insight; I think intuition from physics is a fundamental tool in mathematics and I wish I knew more examples as nice as this.
Another important source of rigidity in $\text{Rep}(G)$ is Maschke’s theorem, which states that every representation of a finite group is semisimple. As a corollary, the previous results now completely describe the morphisms in $\text{Rep}(G)$. I’m somewhat unsatisfied with the standard proof of Maschke’s theorem, so I’ll delay it until the end; it interrupts the flow of the post somewhat.
The orthogonality relations and character theory
The character theory of finite groups allows us to compute $\dim \text{Hom}(V, W)$ using surprisingly little information about the representations themselves. Associated to any representation $\rho$ of $G$ on $V$ is a function $\chi_V(g) = \text{tr}\, \rho(g)$ called the character of the representation. Given two characters $\chi_V, \chi_W$ of irreducible representations, the orthogonality relations state that

$\displaystyle \frac{1}{|G|} \sum_{g \in G} \overline{\chi_V(g)} \chi_W(g) = \begin{cases} 1 & \text{if } V \simeq W \\ 0 & \text{otherwise.} \end{cases}$
By bilinearity it then follows that $\displaystyle \frac{1}{|G|} \sum_{g \in G} \overline{\chi_V(g)} \chi_W(g) = \dim \text{Hom}(V, W)$ for arbitrary semisimple $V, W$. In other words, in the case of $\text{Rep}(G)$ we can in fact associate a concrete inner product with our categorified inner product. The proofs of this result that I have seen do not mention explicitly the connection to $\dim \text{Hom}(V, W)$ and are annoyingly computational, so I would like to give a more conceptual (read: functorial) proof.
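As a quick numerical sanity check (my addition, not part of the original argument), one can verify the orthogonality relations directly from the standard character table of $S_3$, whose conjugacy classes are the identity, the three transpositions, and the two 3-cycles:

```python
from itertools import product

# Conjugacy classes of S_3, in the order: identity, transpositions, 3-cycles.
class_sizes = [1, 3, 2]
group_order = sum(class_sizes)  # |S_3| = 6

# Character table of S_3: each irreducible character, listed by conjugacy class.
chars = {
    "trivial":  [1,  1,  1],
    "sign":     [1, -1,  1],
    "standard": [2,  0, -1],
}

def inner_product(chi1, chi2):
    # (1/|G|) * sum over g of conj(chi1(g)) * chi2(g), computed class by class.
    # The characters of S_3 are real, so conjugation is omitted.
    return sum(size * a * b for size, a, b in zip(class_sizes, chi1, chi2)) / group_order

for (name1, chi1), (name2, chi2) in product(chars.items(), repeat=2):
    print(f"<{name1}, {name2}> = {inner_product(chi1, chi2)}")
# Prints 1.0 when the two characters agree and 0.0 otherwise,
# exactly as the orthogonality relations predict.
```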
As a way of motivating it, here is a proof for permutation representations. Let $G$ act on two finite sets $S, T$ and consider the corresponding permutation representations on the free vector spaces $\mathbb{C}^S, \mathbb{C}^T$, whose characters $\chi_S(g), \chi_T(g)$ count the fixed points of $g$ on $S$ and $T$ respectively. Since a point of $S \times T$ is fixed by $g$ precisely when both of its coordinates are, the product $\chi_S(g) \chi_T(g)$ counts the fixed points of $g$ on $S \times T$, so by the orbit-counting lemma $\displaystyle \frac{1}{|G|} \sum_{g \in G} \chi_S(g) \chi_T(g)$ is precisely the number of orbits in the product group action on $S \times T$. Thus it remains to prove the following.
Proposition: The intertwining operators $\mathbb{C}^S \to \mathbb{C}^T$ have a basis which is in one-to-one correspondence with orbits of the action of $G$ on $S \times T$.
Proof. Given any orbit $O$ of $G$ acting on $S \times T$ there is an intertwining operator which sends each basis element $s \in S$ to $\displaystyle \sum_{t \,:\, (s, t) \in O} t$ and sends everything else to zero. Conversely, suppose $A : \mathbb{C}^S \to \mathbb{C}^T$ is an intertwining operator with matrix $(a_{ts})$ (where we pick the basis given by the elements of $S$ and $T$). Then the intertwining condition requires that, working through the definition of matrix multiplication, $a_{g(t), g(s)} = a_{ts}$ for every $g \in G$, or equivalently that $(a_{ts})$ is constant on orbits of $G$ acting on $S \times T$. The conclusion follows.
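A small example (mine, for illustration): take $G = S_3$ acting on $S = T = \{1, 2, 3\}$. The action on $S \times T$ has exactly two orbits, the diagonal pairs and the off-diagonal pairs, so $\dim \text{Hom}_G(\mathbb{C}^3, \mathbb{C}^3) = 2$. Equivalently, the permutation character takes the values $3, 1, 0$ on the three conjugacy classes, and

$\displaystyle \frac{1}{6}\left(1 \cdot 3^2 + 3 \cdot 1^2 + 2 \cdot 0^2\right) = 2,$

reflecting the decomposition of $\mathbb{C}^3$ into the trivial representation and the standard representation.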
A proof for general representations
Let’s generalize the above proof. The generalization of the product group action is the (internal) tensor product of representations,
but defining it properly requires recognizing a subtlety I usually ignore. (Edit, 8/30/09: The text that used to be here has been moved to where it is more relevant; see Theo’s comment below.)
The tensor product $V \otimes W$ of two representations $(V, \rho)$ and $(W, \sigma)$ of $G$ is a representation defined on the tensor product of the underlying vector spaces with the action given by the linear extension of $g(v \otimes w) = gv \otimes gw$. It satisfies the following universal property: any bilinear function $f : V \times W \to \mathbb{C}$ preserving the action of $G$ in the sense that $f(gv, gw) = f(v, w)$ must factor through the tensor product, i.e. must come from an intertwining operator $V \otimes W \to \mathbb{C}$ (with $\mathbb{C}$ regarded as the trivial representation). As expected from a generalization of the product action, the tensor product has character $\chi_{V \otimes W}(g) = \chi_V(g) \chi_W(g)$.
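To spell out the character identity (a standard computation, added here for completeness): the matrix of the action of $g$ on $V \otimes W$ is the Kronecker product $\rho(g) \otimes \sigma(g)$, and the trace of a Kronecker product is the product of the traces, so

$\displaystyle \chi_{V \otimes W}(g) = \text{tr}\,(\rho(g) \otimes \sigma(g)) = \text{tr}\,\rho(g) \cdot \text{tr}\,\sigma(g) = \chi_V(g) \chi_W(g).$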
The generalization of the orbit-counting lemma is the following special case of the orthogonality relations: for any representation $V$ with character $\chi_V$,

$\displaystyle \frac{1}{|G|} \sum_{g \in G} \chi_V(g)$

counts the number of copies of the trivial representation in $V$. Since adding a copy of the trivial representation increases the value of this sum by $1$ (in the permutation picture, every permutation gains a fixed point), it suffices to show that for a non-trivial irreducible representation one has $\displaystyle \sum_{g \in G} \chi_V(g) = 0$.
One way to prove this is as follows: the image of the linear transformation $\displaystyle P = \frac{1}{|G|} \sum_{g \in G} \rho(g)$ is a $G$-invariant subspace on which $G$ acts trivially, and hence (under the assumption of non-triviality) must be zero; taking traces then gives $\sum_{g \in G} \chi_V(g) = 0$. I dislike this proof for the same reason that I dislike the usual proof of Maschke’s theorem, but I’ll let this one slide.
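For the record, here is the computation behind that argument (standard, and not spelled out in the original): since left multiplication by any $h \in G$ permutes the summands, $\rho(h) P = P$, hence

$\displaystyle P^2 = \frac{1}{|G|} \sum_{h \in G} \rho(h) P = P,$

so $P$ is a projection onto the subspace $V^G$ of $G$-fixed vectors, and therefore

$\displaystyle \frac{1}{|G|} \sum_{g \in G} \chi_V(g) = \text{tr}\, P = \dim V^G.$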
There is an additional step necessary to account for conjugate linearity. The definition of a representation we gave above is equivalent to giving a left action $(g, v) \mapsto g \cdot v$ satisfying $(gh) \cdot v = g \cdot (h \cdot v)$. We can also define a right action $(v, g) \mapsto v \cdot g$ satisfying $v \cdot (gh) = (v \cdot g) \cdot h$; note that the order of multiplication is reversed. (Equivalently, a right action of $G$ is a left action of the opposite group $G^{op}$.)
Any left representation $\rho$ of a group $G$ on a finite-dimensional vector space $V$ naturally (that is, functorially) defines a right dual representation on the dual space $V^{*}$, as follows: there is a unique right action compatible with the natural pairing $\langle v^{*}, v \rangle = v^{*}(v)$ in the sense that $\langle v^{*} \cdot g, v \rangle = \langle v^{*}, g \cdot v \rangle$. In matrix form, if one thinks of the natural pairing as multiplying row and column vectors, the dual representation is given by the same matrices $\rho(g)$, but acting on row vectors instead of column vectors; thus if we want to think of the dual representation as a left representation we should take the transpose of $\rho(g^{-1})$, which recovers the usual order of multiplication. Because every element of a finite group has finite order, $\rho(g)$ is diagonalizable and has eigenvalues consisting of roots of unity for every $g \in G$, and roots of unity have the property that their inverses are their conjugates; in other words, the dual representation has character the conjugate of the character of $V$. (This is essentially because representations of finite groups over $\mathbb{C}$ are isomorphic to unitary representations, which is most of the proof of Maschke’s theorem; more on this later.)
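Explicitly (again a standard computation, added for completeness): if $\rho(g)$ has eigenvalues $\lambda_1, \dots, \lambda_n$, all roots of unity, then

$\displaystyle \chi_{V^{*}}(g) = \text{tr}\, \rho(g^{-1})^{T} = \sum_i \lambda_i^{-1} = \sum_i \overline{\lambda_i} = \overline{\chi_V(g)}.$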
It now follows that for two representations $V, W$ the inner product $\displaystyle \frac{1}{|G|} \sum_{g \in G} \overline{\chi_V(g)} \chi_W(g)$ is equal to the number of copies of the trivial representation in the representation $V^{*} \otimes W$. Thus it remains to prove the following.
Proposition: The space $\text{Hom}_G(V, W)$ of intertwining operators is isomorphic to the subspace of $V^{*} \otimes W$ consisting of the vectors fixed by all of $G$.
Proof. Suspend the representation theory and again focus on the linear algebra. Any pure tensor $v^{*} \otimes w$ naturally defines a linear operator $V \to W$ defined by letting $v^{*}$ act on an input vector $v$ and multiplying the result by $w$, i.e. $v \mapsto v^{*}(v) w$. Consequently one can show by induction or by picking a basis that every linear operator $V \to W$ is a linear combination of such pure tensors; that is, $V^{*} \otimes W \simeq \text{Hom}(V, W)$ as vector spaces.

Now we reintroduce the representation theory. Generalizing the permutation proof, the condition that every element of $G$ fixes an element of $V^{*} \otimes W$ is precisely the condition that the corresponding linear operator $T : V \to W$ satisfies $T \rho(g) = \sigma(g) T$ for all $g \in G$! (In other words, one way to motivate the definition of both the tensor and dual representations is that this should be true.) This completes the proof.
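To see this in formulas (my unpacking of the last step): under the identification above, $g$ acts on an operator $T$ by

$\displaystyle g \cdot T = \sigma(g)\, T\, \rho(g)^{-1},$

so $g \cdot T = T$ for all $g$ exactly when $\sigma(g) T = T \rho(g)$, i.e. when $T$ is an intertwining operator.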
A corollary of the orthogonality relations is the surprising fact that a representation is determined by its character (up to isomorphism), since its decomposition into irreducible representations can be computed from the character alone. Since characters are invariant under conjugation, i.e. are class functions, another corollary is that the number of irreducible representations is at most the number of conjugacy classes of $G$ (in fact it is exactly the number of conjugacy classes, though that requires a further argument).
Other perspectives
More general than the notion of a representation of a group is that of a representation of an associative algebra $A$, which can be defined as an algebra homomorphism $A \to \text{End}(V)$ (since $\text{End}(V)$ has the structure of an algebra with the multiplication given by composition). For the case of finite groups the relevant algebra is the group algebra $\mathbb{C}[G]$, which consists of formal linear combinations of the elements of $G$, multiplied according to the group operation. From this perspective, one way to think about the representation theory of finite groups is in terms of special properties enjoyed by group algebras.
Even more general than the notion of a representation of an associative algebra is that of a module over a ring. The module-theoretic perspective clarifies many of the above constructions, since the category of (left) modules over a fixed ring $R$ directly generalizes the category of vector spaces over a field. An intertwining operator of $G$-representations is precisely a homomorphism of $\mathbb{C}[G]$-modules, since the requirement that “scalar multiplication” be respected is precisely the requirement that the action of $G$ be respected.
The category $R\text{-Mod}$ is an additive category for any ring $R$, so one gets the “bilinearity” of the $\text{Hom}$ functor in a very general context. And the notion of irreducible representation (as well as the proof of Schur’s lemma) carries through to simple modules in general.
However, neither internal tensor products nor dual representations are defined for modules over general rings. Their existence is due to the fact that group algebras carry a lot of extra structure: they are in fact Hopf algebras. The comultiplication gives us the internal tensor product and the antipode gives us duals. The Wikipedia article explains this quite well.
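For concreteness (standard formulas, not spelled out in the original): on group elements the comultiplication, counit, and antipode of $\mathbb{C}[G]$ are

$\displaystyle \Delta(g) = g \otimes g, \qquad \epsilon(g) = 1, \qquad S(g) = g^{-1},$

and it is $\Delta$ that produces the diagonal action $g(v \otimes w) = gv \otimes gw$ on tensor products, while $S$ produces the left action $g \mapsto \rho(g^{-1})^{T}$ on duals.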
The two major tools left to be explained are Maschke’s theorem and characters. Maschke’s theorem is essentially a compactness result. It generalizes straightforwardly to compact groups as follows: given a representation of a compact group $G$ on a Hilbert space $V$ with inner product $\langle \cdot, \cdot \rangle$, one uses the Haar integral to average over $G$, obtaining a new inner product which is $G$-invariant. $G$-invariant inner products have the property that the orthogonal complement of a $G$-invariant subspace (that is, a subrepresentation) is also $G$-invariant, so it follows by induction that every finite-dimensional representation is semisimple, since any finite-dimensional vector space can be turned into a Hilbert space. (The standard results are much stronger than this, but I don’t know much about them.) As a corollary, every compact subgroup of $\text{GL}_n(\mathbb{C})$ is conjugate to a subgroup of the unitary group $\text{U}(n)$; in other words, the unitary group is an essentially unique maximal compact subgroup of $\text{GL}_n(\mathbb{C})$.
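Concretely (my sketch of the averaging step; for finite $G$ the Haar integral is just a finite average), the invariant inner product is

$\displaystyle \langle v, w \rangle_G = \int_G \langle g \cdot v, g \cdot w \rangle \, dg \quad \left( = \frac{1}{|G|} \sum_{g \in G} \langle g \cdot v, g \cdot w \rangle \text{ for finite } G \right),$

and the invariance $\langle h \cdot v, h \cdot w \rangle_G = \langle v, w \rangle_G$ follows from the translation invariance of the Haar measure.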
Characters, on the other hand, are still mysterious to me. There is a functorial definition of the trace based on the identification of $\text{End}(V)$ with $V^{*} \otimes V$, then extending the natural pairing $v^{*} \otimes v \mapsto v^{*}(v)$ to a map $V^{*} \otimes V \to \mathbb{C}$, but I can’t come up with a particularly good reason why this should accurately reflect the structure of a representation. If anyone has any comments on this, they would be much appreciated.
Comments

You mention being unsatisfied with your proof (“I dislike this proof for the same reason that I dislike the usual proof of Maschke’s theorem”). I sense there is a different approach which would stem from observing that the character of an irreducible representation preserves products. Product-preserving maps in the Hopf algebra $[G, \mathbb{C}]$ are always orthogonal for $G$ finite.
Akhil: although Qiaochu’s post gives a very nice exposition of some classical representation theory, I suspect it is only very nice for those already well-versed in either representation theory or category theory. Although I agree that this is a much nicer presentation than in several books already mentioned in the comments, I do not see how this category-theoretic viewpoint answers the question “Why is this interesting?”
Point taken. This post wasn’t intended to be a beginner’s introduction; I very much had in mind an audience who had seen the basic results at least once, and I wasn’t trying to motivate interest in the subject. Hopefully my own motivation regarding the representation theory of the symmetric groups will become clear in a few posts.
Nice post! A few comments:
I’m bothered by the second paragraph under “A proof for general representations”. The problem is the left/right thing. The category of _left_ G-reps is monoidal, where on objects the monoidal structure is \otimes_\CC and the action of g on V\otimes W is \rho_V(g) \otimes \rho_W(g), where \rho_V,\rho_W are the actions on V,W. Similarly, the category of _right_ G-reps is monoidal. In either case, the one-dimensional rep \CC with the trivial action is the monoidal unit.
You propose a tensor product that multiplies a right G-rep with a left one. This will not define a representation of G (unless G is abelian, whence “right” and “left” are the same). What you can do is this: if R is any ring (perhaps \CC[G], the group algebra), and V is a right R-module and W is a left R-module, then the product V \otimes_R W is defined. However, it is not a module over R at all — the R action has canceled out. If you want an R-linear monoidal category, the easiest thing to do is to take the category of R-R bimodules, i.e. spaces with both a left R action and a right R action. (If R is commutative, these can be required to be the same, or not, depending on your goals.) The morphisms are required to commute with both actions, and the two actions are required to commute. This category is monoidal with \otimes_R as the monoidal structure, and R with its canonical left- and right- actions on itself is the monoidal unit.
Ok, so it looks like you wanted that right representation in order to say that \Hom(V,W) = V^* \otimes W. The trick is this. If V is a left module over R, then V^* is a right R-module with the transpose action. How do you turn a right R-module into a left R-module? In general you cannot — doing so functorially requires R to have an antiautomorphism, i.e. an additive map s: R\to R satisfying s(ab) = s(b) s(a).
Fortunately, in the category of representations of a group, you have one of these: the map G \to G sending g \mapsto g^{-1} extends to an antiautomorphism of the group algebra \CC[G]. So this lets you define on V^* a _left_ G-action, namely g \mapsto \rho(g^{-1})^*, where \rho(g) is the action of g on V.
If you’re doing Hopf algebras, the map s is called the “antipode”. There is one more condition on the antipode in a Hopf algebra: it is required to be an antiautomorphism of the comultiplication as well as of the multiplication. This assures that (V\otimes W)^* = W^* \otimes V^*, which it should be. In the group algebra, the comultiplication is almost trivial, and so this condition is free.
Anyhoo, so all of this defines on \Hom_\CC(V,W) the structure of a left G-module, if V and W are both left G-modules. And then the action is that g acts on a linear map \phi by taking it to \rho(g) \circ \phi \circ \rho(g^{-1}), so you’re exactly correct that the G-intertwiners \Hom_G(V,W) are precisely the fixed points of the G-action \Hom_\CC(V,W).
A good interpretation of this is as follows. \CC with the trivial action is the monoidal unit in G-rep, and \Hom_G(\CC,V) is the vector space (not naturally a G-rep) of fixed points of the G-action on V. If we replace the word “G-rep” by, say, “sheaf”, then \Hom_G(\CC,V) is “the space of global sections of V”; this is a useful notion in any closed monoidal category, and the discussion above shows that G-rep is closed. So we’re saying that the global sections of the enriched \Hom(V,W) comprise precisely the set of morphisms \Hom(V,W). This is true in an arbitrary closed monoidal category.
Thanks for the comment! That’s the part I was wondering about – the left/right convention was taken from the Wikipedia article about tensor products of modules, but it seemed out of place here after I thought about it. I’ll correct that part.
I’ll have to look this over later, but it seems at first glance that it might be useful as a reference when I finally get around to more finite-group representation theory.
I wouldn’t trust my definitions to be exactly right, since I’m not working off of a reference here; in particular I may be somewhat confused about the tensor product.
Nice post.
Most people who write about representation theory (including, unfortunately, myself in what was supposed to be a quick summary) tend not to emphasize the categorical nature of all this. As a different (and more trivial) example from what you mentioned, if $F : \text{Vect} \to \text{Vect}$ is a (say covariant additive) functor, it induces a functor on $\text{Rep}(G)$ just by composition. Similarly for contravariant functors and bifunctors. Thus the facts that tensor products and hom-sets of representations are representations all become immediate rather than requiring separate ad-hoc arguments.
I think this tends to be a problem in textbooks, which often just present the material with minimal motivation and without answering the question “Why is this interesting?”, especially when the material becomes so ad hoc.
I agree. For example, Artin’s proof of the orthogonality relations involves a lot of juggling around terms in sums, which I’ve never found to be a satisfying way of convincing myself of anything. To my mind, the categorical perspective is just a generalization of the idea that binomial coefficient identities, for example, should be proven bijectively.
Ditto for Serre, which means I forget the proof quickly. Fulton and Harris reduce to the case where one of the irreducible modules is 1, which I think is an improvement.
I think Lang’s Algebra proves them a bit more functorially though.
If I have a vector space V and I look at the functor which is tensoring by V, the representation I get on $V \otimes W$ will be trivial on the $V$ factor and whatever it was on $W$. How do you get an arbitrary action on $V$ to carry over via a tensor product functor?
I think you have to consider the bifunctor $\text{Vect} \times \text{Vect} \to \text{Vect}$ given by $(V, W) \mapsto V \otimes W$ (and similarly for $\text{Hom}$).