Introduction to symmetric functions

August 20, 2009

The theory of symmetric functions, which generalizes some ideas that came up in the previous discussion of Polya theory, can be motivated by thinking about polynomial functions of the roots of a monic polynomial P(x) \in \mathbb{Z}[x]. Problems on high school competitions often ask for the sum of the squares or the cubes of the roots of a given polynomial, and those not familiar with symmetric functions will often be surprised that these quantities end up being integers even though the roots themselves aren’t integers. The key is that the sums being asked for are always symmetric in the roots, i.e. invariant under an arbitrary permutation. The coefficients of a monic polynomial are the elementary symmetric polynomials of its roots, which we are given are integers. It follows that any symmetric polynomial that can be written using integer coefficients in terms of the elementary symmetric polynomials must in fact be an integer, and as we will see, every symmetric polynomial with integer coefficients has this property.

These ideas lead naturally to the use of symmetric polynomials in Galois theory, but combinatorialists are interested in symmetric functions for a very different reason involving the representation theory of S_n. This post is a brief overview of the basic results of symmetric function theory relevant to combinatorialists. I’m mostly following Sagan, who recommends Macdonald as a more comprehensive reference.

A lemma on group actions

First, a straightforward lemma. Let G be a group acting on a set X and extend this group action to a representation of G on the free module R^X with basis X over some commutative ring R.

Lemma: The fixed points of the action of G on R^X form a submodule spanned by the vectors of the form \displaystyle \sum_{x \in O} x, where O is an orbit of the action of G on X.

Proof. Clearly any linear combination of such vectors are fixed points. In the other direction, if v = \sum_{x \in X} c_x x is a fixed point, then it must be equal to

\displaystyle gv = \sum_{x \in X} c_x gx

for every g \in G. It follows that c_x = c_{g^{-1} x} for every g, x; in particular, c_x = c_y if x, y are in the same orbit. The conclusion follows.

Symmetric polynomials

The symmetric group S_n acts on R[x_1, ... x_n] by permutation of the variables, and the fixed points of this action are the symmetric polynomials, denoted R[x_1, ... x_n]^{S_n}. Since symmetry is preserved under addition and multiplication, and since symmetric polynomials inherit the grading from R[x_1, ... x_n], the symmetric polynomials form a graded R-algebra. General results in algebra then guarantee that the symmetric polynomials are finitely generated, but we’ll be able to prove this directly.

Applying the lemma above tells us that R[x_1, ... x_n]^{S_n} is spanned by the orbits of monomials, which we describe as follows. Given a partition k = \lambda_1 + \lambda_2 + ... + \lambda_n into at most n parts, the monomial symmetric polynomial m_{\lambda} is defined to be

\displaystyle m_{\lambda}(x_1, ... x_n) = \frac{1}{| \text{Stab}(\lambda) | } \sum_{\sigma \in S_n} x_{\sigma(1)}^{\lambda_1} x_{\sigma(2)}^{\lambda_2} ... x_{\sigma(n)}^{\lambda_n}

where \text{Stab}(\lambda) is the stabilizer of any monomial of type \lambda under the action of S_n; this ensures that the above sum has every monomial in the orbit exactly once. If the partition doesn’t have n parts then the extra factors are set to 1. By the lemma, the degree k component of R[x_1, ... x_n]^{S_n} has as a basis the monomial symmetric polynomials corresponding to partitions of k with at most n parts. This implies the following.

Theorem: The Hilbert series of R[x_1, ... x_n]^{S_n} is \displaystyle \prod_{i=1}^{n} \frac{1}{1 - t^i}.

Proof. The conjugate \lambda' of a partition \lambda with at most n parts is a partition in which every part has size at most n, and such a partition is uniquely specified by describing how many parts of each size 1, 2, ... n it has. This is exactly what the generating function above does.

(Molien’s theorem generalizes this result, but the invariant-theoretic aspects aren’t too relevant to what we’re doing here.)

This result suggests that R[x_1, ... x_n]^{S_n} is generated by n algebraically independent polynomials, one of each degree; such an algebra would have the same Hilbert series as the one above, since the homogeneous elements of degree k would be spanned by products of the generators of total degree k. This is in fact the content of the fundamental theorem of symmetric polynomials. Recall that the elementary symmetric polynomials can be defined by

\displaystyle E_n(t) = \sum_{k=0}^{n} t^k e_k(x_1, ... x_n) = \prod_{i=1}^{n} (1 + tx_i).

Equivalently, we can define e_k = m_{(1^k)}. Considering arbitrary products of such polynomials is easiest with the following notation: given a partition k = \lambda_1 + ... + \lambda_m where each part has size at most n, define

\displaystyle e_{\lambda} = e_{\lambda_1} e_{\lambda_2} ... e_{\lambda_n}

where e_0 = 1. Note that the restriction on \lambda is by size of parts rather than number; the restriction is natural because e_{n+1} isn’t well-defined.

Theorem: The degree k part of R[x_1, ... x_n]^{S_n} has as a basis the polynomials e_{\lambda} corresponding to partitions of k where each part has size at most n. Equivalently, R[x_1, ... x_n]^{S_n} is freely generated as an R-algebra by e_1, ... e_n.

Proof. We are done if we can show that the coefficients expressing e_{\lambda} in terms of m_{\lambda} form a matrix whose inverse has coefficients in \mathbb{Z}. In fact, it turns out that with a suitable choice of ordering this matrix is upper-triangular with all diagonal entries 1, which gives the result immediately. To see this, we’ll use the dominance or majorization order on two partitions \lambda, \mu of the same integer, where \lambda \trianglelefteq \mu if and only if \lambda_1 + ... + \lambda_i \le \mu_1 + ... + \mu_i for all i. Then we claim that there exist integer coefficients d_{\lambda \mu} such that

\displaystyle e_{\lambda'} = m_{\lambda} + \sum_{\mu \triangleleft \lambda} d_{\lambda \mu} m_{\mu}.

To see this visually, write down a monomial in e_{\lambda'} by drawing the Young diagram of \lambda' and filling in the row corresponding to \lambda'_i with the monomial chosen from the factor e_{\lambda'}, placing the labels in increasing order. This gives a Young tableau with rows strictly increasing. Now note that since x_1 can only appear in the first column of such a tableau, the greatest power of x_1 in any such monomial is \lambda_1; generally, x_i can only appear in the first i columns, which means that the monomials that appear in e_{\lambda'} are precisely the monomials dominated by \lambda. The equality case only occurs if the i^{th} column consists entirely of x_i, so the coefficient of m_{\lambda} is 1 as desired.

As the above discussion suggests, symmetric polynomials can be thought of as certain kinds of weighted generating functions. For example, m_{\lambda} counts compositions of type \lambda, e_k counts subsets of \{ x_1, ... x_n \} of size k, and e_{\lambda} counts Young tableaux with strictly increasing rows. Changing “subset” to “multiset,” i.e. “strictly” to “weakly,” we obtain the complete homogeneous symmetric polynomials, which can be defined by

\displaystyle H_n(t) = \sum_{k=0}^{\infty} t^k h_k(x_1, ... x_n) = \prod_{i=1}^{n} \frac{1}{1 - tx_i}.

Again, given a partition k = \lambda_1 + ... + \lambda_m we can define

h_{\lambda}(x_1, ... x_n) = h_{\lambda_1} h_{\lambda_2} ... h_{\lambda_n}

and we expect as before that R[x_1, ... x_n]^{S_n} has basis h_{\lambda} where the parts of the partition have size at most n. In fact, we can deduce this directly from the fundamental theorem thanks to the identity E_n(t) H_n(-t) = 1, which gives

h_k - e_1 h_{k-1} \pm ... + (-1)^k e_k = 0, k \ge 1.

Newton’s sums

The power symmetric polynomials p_k are defined by

p_k(x_1, ... x_n) = m_{(k)} = x_1^k + ... + x_n^k.

Again, given a partition k = \lambda_1 + ... + \lambda_m we define

p_{\lambda}(x_1, ... x_n) = p_{\lambda_1} p_{\lambda_2} ... p_{\lambda_m}.

Again, p_{\lambda} has degree k. I personally think of p_k as computing \text{tr} A^k where A is a matrix with eigenvalues x_1, x_2, ... x_n. It follows that p_n should satisfy a linear recurrence with characteristic polynomial \prod_{i=1}^{n} (t - x_i), so we should have

\displaystyle p_{n+k} = e_1 p_{n+k-1} - e_2 p_{n+k-2} \pm ... .

This is most of Newton’s identities already: we just need to describe p_k for k < n. The generating functions proof is essentially one I've already given in past GILA posts. Write \displaystyle \log E_n(t) = \sum_{i=1}^{n} \log (1 - tx_i) and differentiate both sides, then multiply by -t. This gives

\displaystyle \frac{- t E_n'(t)}{E_n(t)} = \sum_{i=1}^{n} \frac{tx_i}{1 - tx_i} = \sum_{k \ge 1} p_k(x_1, ... x_n) t^k.

Multiplying both sides by E_n(t) we obtain the identity

p_k - e_1 p_{k-1} + e_2 p_{k-2} \mp ... = (-1)^{k+1} ke_k.

Essentially the same argument applied to \log H_n(t) also gives

p_k + h_1 p_{k-1} + h_2 p_{k-2} + ... = kh_k.

One might now expect that by writing the elementary symmetric polynomials in terms of the power symmetric polynomials, we could conclude that the symmetric polynomials have as a basis the power symmetric polynomials. Unfortunately, that pesky k factor implies that this isn’t true for general R. For example, if R has characteristic 2, then p_1^2 = p_2, so we have less polynomials to work with than we expect. And if 2 isn’t invertible in R, then e_2 = \frac{p_1^2 - p_2}{2} can’t be generated by the power symmetric polynomials. Fortunately, if R contains \mathbb{Q}, everything works out. To see this more explicitly, note that Newton’s sums are equivalent to the pair of identities

\displaystyle E_n(t) = \prod_{i=1}^{n} (1 + tx_i) = \exp \left( p_1 t - \frac{p_2 t^2}{2} + \frac{p_3 t^3}{3} \mp ... \right)

and

\displaystyle H_n(t) = \prod_{i=1}^{n} \frac{1}{1 - tx_i} = \exp \left( p_1 t + \frac{p_2 t^2}{2} + \frac{p_3 t^3}{3} + ... \right).

The second identity should be familiar from the previous post on the cycle index polynomials of the symmetric groups. As stated in this form we get an explicit formula for the e_k in terms of the p_k by expanding the exponential as follows: given a permutation \sigma with cycle type a partition \lambda, define p_{\sigma} = p_{\lambda}. Then

\displaystyle e_k = \frac{1}{k!} \sum_{\sigma \in S_k} \text{sgn}(\sigma) p_{\sigma}.

This follows because the sign of a cycle of even length is negative and sign is multiplicative. Similarly,

\displaystyle h_k = \frac{1}{k!} \sum_{\sigma \in S_k} p_{\sigma}.

These identities hold over any field of characteristic zero and more generally over any \mathbb{Q}-algebra. They imply, among other things, that it is possible to compute the characteristic polynomial of a square matrix by taking traces of its powers.

The ring of symmetric functions

As the above examples show, the interesting identities among symmetric polynomials hold independent of the number of variables; they only depend on the partitions \lambda involved. One reason we should expect this to be true is that setting x_n = 0 in a symmetric polynomial identity among n variables immediately gives a symmetric polynomial identity among n-1 variables that is essentially identical in the terms involved for sufficiently large n. It would therefore be nice to reverse this process and prove symmetric polynomial identities “abstractly” rather than fixing the number of variables in advance.

This abstract setting is known as the ring of symmetric functions, although it does not really consist of a ring of functions in the usual sense. As the Wikipedia article describes, it can be thought of in two ways:

  1. Start with the ring R[[x_1, x_2, ... ]] of power series in countably many variables over a commutative ring R and consider only the elements of finite degree, then only the elements invariant under the full permutation group S_{\infty} of a countable set, or
  2. Consider the homomorphisms R[x_1, ... x_{n-1}]^{S_{n-1}} \to R[x_1, ... x_n]^{S_n} given by sending e_k(x_1, ... x_{n-1}) to e_k(x_1, ... x_n) and set up a direct limit / colimit.

To keep things concrete we’ll take the first perspective, although the second is an important conceptual point. The ring of symmetric functions over R, denoted \Lambda_R, is spanned as an R-module by the monomial symmetric functions

m_{\lambda} = \sum x_{i_1}^{\lambda_1} x_{i_2}^{\lambda_2} ... x_{i_n}^{\lambda_n}

where i_1, ... i_n range over all distinct indices and \lambda ranges over all partitions. \Lambda_R is graded by degree with multiplication is defined as expected, although it is no longer finitely generated, and one can check that all of the above proofs make perfect sense in this setting; in particular \Lambda_R is freely generated by the elementary symmetric functions, defined as the coefficients of

\displaystyle \sum_{k \ge 0} e_k t^k = \prod_{i \ge 1} (1 + tx_i)

and extended to partitions by multiplicativity; similarly for the complete homogeneous symmetric functions. We therefore know that the functions m_{\lambda}, e_{\lambda}, h_{\lambda} are all bases of \Lambda_R with no restriction on the partition \lambda and under the same assumptions as before so is p_{\lambda}. The Hilbert series of \Lambda_R is therefore equal to \displaystyle \prod_{i \ge 1} \frac{1}{1 - t^i}; this is precisely the limit, in the t-adic topology, of the Hilbert series of R[x_1, ... x_n]^{S_n} as n tends to infinity.

Relationship to representation theory

Recall that the number of conjugacy classes of S_n, hence the number of irreducible representations, is the number of partitions p(n) of n. This implies that the class functions C_R(S_n) of S_n over R form a R-module of dimension p(n), so the countable direct sum

\mathcal{R} = R \oplus C_R(S_1) \oplus C_R(S_2) \oplus ...

has Hilbert series \displaystyle \prod_{i \ge 1} \frac{1}{1 - t^i} as an R-module if the degree of the elements of C_R(S_n) is defined to be n.

Over \mathbb{C}, there is a distinguished basis of C_{\mathbb{C}}(S_n) given by the irreducible characters \chi_{\lambda} associated to the partitions of n, and there is a distinguished inner product given by

\displaystyle < \alpha, \beta >  = \frac{1}{n!} \sum_{\sigma \in S_n} \alpha(\sigma) \overline{ \beta(\sigma) }

with respect to which the irreducible characters are orthonormal, i.e. one has < \chi_{\lambda}, \chi_{\mu} > = \delta_{\lambda \mu}. This inner product extends to \mathcal{R} by declaring characters of different degree to be orthogonal. A question then naturally presents itself: does there exist an invertible linear transformation \phi : \mathcal{R} \to \Lambda = \Lambda_{\mathbb{C}} such that

  1. there exists a nice inner product on \Lambda such that \phi preserves the inner product of class functions, and
  2. there exists a nice multiplication on \mathcal{R} such that \phi^{-1} preserves the product of symmetric functions?

As we will see, the answer to both questions is yes.

Tags: , , ,

3 Responses to “Introduction to symmetric functions”


  1. [...] of the Frobenius map…GILA 0: Roth’s… on Mathematical historical f…Introduction to symm… on GILA V: The Polya enumeration …Dave on Going beyond your comfort…Akhil Mathew on [...]

  2. Jason Bandlow Says:

    Nice presentation. One minor correction: it’s ‘Macdonald’ and not ‘MacDonald’.


  3. [...] sums, necklace congruences, and zeta functions IIGILA I: Group actions and equivalence relationsIntroduction to symmetric functionsIMO 2009 and proof systemsYoung's latticeExtracting the diagonalGenerally Interested Lay [...]


Leave a Reply