## Introduction to symmetric functions

The theory of symmetric functions, which generalizes some ideas that came up in the previous discussion of Polya theory, can be motivated by thinking about polynomial functions of the roots of a monic polynomial $P(x) \in \mathbb{Z}[x]$. Problems on high school competitions often ask for the sum of the squares or the cubes of the roots of a given polynomial, and those not familiar with symmetric functions will often be surprised that these quantities end up being integers even though the roots themselves aren’t integers. The key is that the sums being asked for are always symmetric in the roots, i.e. invariant under an arbitrary permutation. The coefficients of a monic polynomial are the elementary symmetric polynomials of its roots, which we are given are integers. It follows that any symmetric polynomial that can be written using integer coefficients in terms of the elementary symmetric polynomials must in fact be an integer, and as we will see, every symmetric polynomial with integer coefficients has this property.

These ideas lead naturally to the use of symmetric polynomials in Galois theory, but combinatorialists are interested in symmetric functions for a very different reason involving the representation theory of $S_n$. This post is a brief overview of the basic results of symmetric function theory relevant to combinatorialists. I’m mostly following Sagan, who recommends Macdonald as a more comprehensive reference.

### A lemma on group actions

First, a straightforward lemma. Let $G$ be a group acting on a set $X$ and extend this group action to a representation of $G$ on the free module $R^X$ with basis $X$ over some commutative ring $R$.

Lemma: The fixed points of the action of $G$ on $R^X$ form a submodule spanned by the vectors of the form $\displaystyle \sum_{x \in O} x$, where $O$ is an orbit of the action of $G$ on $X$.

Proof. Clearly any linear combination of such vectors is a fixed point. In the other direction, if $v = \sum_{x \in X} c_x x$ is a fixed point, then it must be equal to

$\displaystyle gv = \sum_{x \in X} c_x gx$

for every $g \in G$. It follows that $c_x = c_{g^{-1} x}$ for every $g, x$; in particular, $c_x = c_y$ if $x, y$ are in the same orbit. The conclusion follows.
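As a quick sanity check on the lemma, here is a small computation (pure Python; the example and helper names are my own): $S_3$ acting on ordered pairs from $\{0, 1, 2\}$ has two orbits, the orbit sums are fixed vectors, and a vector that fails to be constant on an orbit is not fixed.

```python
from itertools import permutations

# S_3 acting on X = ordered pairs (i, j) with i, j in {0, 1, 2}
X = [(i, j) for i in range(3) for j in range(3)]
group = list(permutations(range(3)))  # the six elements of S_3

def act(g, x):
    return (g[x[0]], g[x[1]])

# Orbits of the action: here, the diagonal and the off-diagonal pairs.
orbits = []
seen = set()
for x in X:
    if x not in seen:
        orbit = {act(g, x) for g in group}
        orbits.append(orbit)
        seen |= orbit

# A vector in R^X is a dict x -> coefficient; v is fixed iff its
# coefficients satisfy c_{gx} = c_x for all g and x.
def is_fixed(v):
    return all(v.get(act(g, x), 0) == v.get(x, 0) for g in group for x in X)

orbit_sums = [{x: 1 for x in O} for O in orbits]

v = {x: (7 if x[0] == x[1] else -2) for x in X}  # constant on each orbit
w = {x: x[0] for x in X}                          # not constant on orbits
print(len(orbits), all(is_fixed(s) for s in orbit_sums), is_fixed(v), is_fixed(w))
```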

### Symmetric polynomials

The symmetric group $S_n$ acts on $R[x_1, ... x_n]$ by permutation of the variables, and the fixed points of this action are the symmetric polynomials, denoted $R[x_1, ... x_n]^{S_n}$. Since symmetry is preserved under addition and multiplication, and since symmetric polynomials inherit the grading from $R[x_1, ... x_n]$, the symmetric polynomials form a graded $R$-algebra. General results in algebra then guarantee that the symmetric polynomials are finitely generated, but we’ll be able to prove this directly.

Applying the lemma above tells us that $R[x_1, ... x_n]^{S_n}$ is spanned by the orbits of monomials, which we describe as follows. Given a partition $k = \lambda_1 + \lambda_2 + ... + \lambda_n$ into at most $n$ parts, the monomial symmetric polynomial $m_{\lambda}$ is defined to be

$\displaystyle m_{\lambda}(x_1, ... x_n) = \frac{1}{| \text{Stab}(\lambda) | } \sum_{\sigma \in S_n} x_{\sigma(1)}^{\lambda_1} x_{\sigma(2)}^{\lambda_2} ... x_{\sigma(n)}^{\lambda_n}$

where $\text{Stab}(\lambda)$ is the stabilizer of any monomial of type $\lambda$ under the action of $S_n$; this ensures that the above sum contains every monomial in the orbit exactly once. If the partition has fewer than $n$ parts, the remaining exponents $\lambda_i$ are taken to be $0$. By the lemma, the degree $k$ component of $R[x_1, ... x_n]^{S_n}$ has as a basis the monomial symmetric polynomials corresponding to partitions of $k$ with at most $n$ parts. This implies the following.
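To make the definition concrete, here is a small sketch (helper names are mine) that builds $m_{\lambda}$ as a set of exponent vectors; summing over the *distinct* permutations of the padded exponent sequence avoids dividing by the stabilizer explicitly.

```python
from itertools import permutations

def m(la, n):
    """Monomial symmetric polynomial m_lambda in n variables, as a
    dict {exponent tuple: coefficient}.  The partition is padded with
    zeros to length n, and each distinct rearrangement of the exponents
    contributes exactly one monomial (with coefficient 1)."""
    padded = tuple(la) + (0,) * (n - len(la))
    return {exps: 1 for exps in set(permutations(padded))}

# m_{(2,1)} in 3 variables has the six monomials x^2 y, x^2 z, y^2 x,
# y^2 z, z^2 x, z^2 y, while m_{(1,1)} = e_2 has three.
print(sorted(m((2, 1), 3)), sorted(m((1, 1), 3)))
```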

Theorem: The Hilbert series of $R[x_1, ... x_n]^{S_n}$ is $\displaystyle \prod_{i=1}^{n} \frac{1}{1 - t^i}$.

Proof. The conjugate $\lambda'$ of a partition $\lambda$ with at most $n$ parts is a partition in which every part has size at most $n$, and such a partition is uniquely specified by describing how many parts of each size $1, 2, ... n$ it has. This is exactly what the generating function above does.
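One can check the theorem numerically: the coefficient of $t^k$ in $\prod_{i=1}^{n} \frac{1}{1-t^i}$ should count partitions of $k$ into at most $n$ parts. A small sketch (my own helpers, assuming nothing beyond the statement):

```python
def partitions_at_most_n_parts(k, n, max_part=None):
    """Count partitions of k into at most n parts, parts nonincreasing."""
    if max_part is None:
        max_part = k
    if k == 0:
        return 1
    if n == 0:
        return 0
    return sum(partitions_at_most_n_parts(k - p, n - 1, p)
               for p in range(1, min(k, max_part) + 1))

def hilbert_coeffs(n, N):
    """Coefficients of prod_{i=1}^n 1/(1 - t^i) up to degree N."""
    coeffs = [1] + [0] * N
    for i in range(1, n + 1):
        # multiplying by 1/(1 - t^i) is a running sum with stride i
        for k in range(i, N + 1):
            coeffs[k] += coeffs[k - i]
    return coeffs

n, N = 4, 12
series = hilbert_coeffs(n, N)
direct = [partitions_at_most_n_parts(k, n) for k in range(N + 1)]
print(series == direct, series[:8])
```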

(Molien’s theorem generalizes this result, but the invariant-theoretic aspects aren’t too relevant to what we’re doing here.)

This result suggests that $R[x_1, ... x_n]^{S_n}$ is generated by $n$ algebraically independent polynomials, one of each degree; such an algebra would have the same Hilbert series as the one above, since the homogeneous elements of degree $k$ would be spanned by products of the generators of total degree $k$. This is in fact the content of the fundamental theorem of symmetric polynomials. Recall that the elementary symmetric polynomials can be defined by

$\displaystyle E_n(t) = \sum_{k=0}^{n} t^k e_k(x_1, ... x_n) = \prod_{i=1}^{n} (1 + tx_i)$.

Equivalently, we can define $e_k = m_{(1^k)}$. Considering arbitrary products of such polynomials is easiest with the following notation: given a partition $k = \lambda_1 + ... + \lambda_m$ where each part has size at most $n$, define

$\displaystyle e_{\lambda} = e_{\lambda_1} e_{\lambda_2} ... e_{\lambda_m}$

where $e_0 = 1$. Note that the restriction on $\lambda$ is by size of parts rather than number; the restriction is natural because $e_k$ vanishes identically for $k > n$.

Theorem: The degree $k$ part of $R[x_1, ... x_n]^{S_n}$ has as a basis the polynomials $e_{\lambda}$ corresponding to partitions of $k$ where each part has size at most $n$. Equivalently, $R[x_1, ... x_n]^{S_n}$ is freely generated as an $R$-algebra by $e_1, ... e_n$.

Proof. We are done if we can show that the coefficients expressing $e_{\lambda}$ in terms of the $m_{\mu}$ form a matrix whose inverse has coefficients in $\mathbb{Z}$. In fact, it turns out that with a suitable choice of ordering this matrix is upper-triangular with all diagonal entries $1$, which gives the result immediately. To see this, we’ll use the dominance or majorization order on partitions of the same integer, where $\lambda \trianglelefteq \mu$ if and only if $\lambda_1 + ... + \lambda_i \le \mu_1 + ... + \mu_i$ for all $i$. Then we claim that there exist integer coefficients $d_{\lambda \mu}$ such that

$\displaystyle e_{\lambda'} = m_{\lambda} + \sum_{\mu \triangleleft \lambda} d_{\lambda \mu} m_{\mu}$.

To see this visually, write down a monomial in $e_{\lambda'}$ by drawing the Young diagram of $\lambda'$ and filling in the row of length $\lambda'_i$ with the monomial chosen from the factor $e_{\lambda'_i}$, placing the labels in increasing order. This gives a Young tableau with strictly increasing rows. Now note that since $x_1$ can only appear in the first column of such a tableau, the greatest power of $x_1$ in any such monomial is $\lambda_1$; generally, $x_i$ can only appear in the first $i$ columns, which means that every monomial appearing in $e_{\lambda'}$ is dominated by $\lambda$. The equality case occurs only when the $i^{th}$ column consists entirely of $x_i$, so the coefficient of $m_{\lambda}$ is $1$ as desired.
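The triangularity claim can be verified directly for a small case. A sketch (helper names are mine) representing polynomials as dicts from exponent tuples to coefficients, with $\lambda = (2,2,1)$ in $4$ variables:

```python
from itertools import combinations

n = 4  # number of variables

def e(k):
    """Elementary symmetric polynomial e_k as {exponent tuple: coeff}."""
    return {tuple(1 if i in S else 0 for i in range(n)): 1
            for S in combinations(range(n), k)}

def mul(p, q):
    out = {}
    for a, ca in p.items():
        for b, cb in q.items():
            s = tuple(x + y for x, y in zip(a, b))
            out[s] = out.get(s, 0) + ca * cb
    return out

def in_m_basis(poly):
    """Coefficients of a symmetric polynomial on the m_mu basis; the
    coefficient is constant across each orbit, so overwriting is safe."""
    out = {}
    for exps, c in poly.items():
        mu = tuple(sorted((x for x in exps if x > 0), reverse=True))
        out[mu] = c
    return out

def dominates(la, mu):
    """True if la dominates mu (partitions of the same integer)."""
    pa = list(la) + [0] * len(mu)
    pb = list(mu) + [0] * len(la)
    sa = sb = 0
    for a, b in zip(pa, pb):
        sa, sb = sa + a, sb + b
        if sb > sa:
            return False
    return True

def conjugate(la):
    return tuple(sum(1 for part in la if part > i) for i in range(la[0]))

la = (2, 2, 1)             # a partition of 5 with at most n parts
la_conj = conjugate(la)    # (3, 2)
e_conj = {(0,) * n: 1}
for part in la_conj:
    e_conj = mul(e_conj, e(part))

coeffs = in_m_basis(e_conj)
# e_{(3,2)} = m_{(2,2,1)} + (lower terms in dominance order)
print(coeffs[la], all(dominates(la, mu) for mu in coeffs))
```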

As the above discussion suggests, symmetric polynomials can be thought of as certain kinds of weighted generating functions. For example, $m_{\lambda}$ counts compositions of type $\lambda$, $e_k$ counts subsets of $\{ x_1, ... x_n \}$ of size $k$, and $e_{\lambda}$ counts Young tableaux with strictly increasing rows. Changing “subset” to “multiset,” i.e. “strictly” to “weakly,” we obtain the complete homogeneous symmetric polynomials, which can be defined by

$\displaystyle H_n(t) = \sum_{k=0}^{\infty} t^k h_k(x_1, ... x_n) = \prod_{i=1}^{n} \frac{1}{1 - tx_i}$.

Again, given a partition $k = \lambda_1 + ... + \lambda_m$ we can define

$h_{\lambda}(x_1, ... x_n) = h_{\lambda_1} h_{\lambda_2} ... h_{\lambda_m}$

and we expect as before that $R[x_1, ... x_n]^{S_n}$ has basis $h_{\lambda}$ where the parts of the partition have size at most $n$. In fact, we can deduce this directly from the fundamental theorem thanks to the identity $E_n(t) H_n(-t) = 1$, which gives

$h_k - e_1 h_{k-1} \pm ... + (-1)^k e_k = 0, k \ge 1$.
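Since $e_k$ is a sum over size-$k$ subsets and $h_k$ a sum over size-$k$ multisets, the identity is easy to spot-check numerically. A sketch at sample values of my choosing:

```python
from itertools import combinations, combinations_with_replacement
from math import prod

xs = [2, 3, 5, 7]  # sample values for x_1, ..., x_4
n = len(xs)

def e(k):
    """Elementary symmetric polynomial e_k: sum over size-k subsets."""
    return sum(prod(xs[i] for i in S) for S in combinations(range(n), k))

def h(k):
    """Complete homogeneous h_k: sum over size-k multisets."""
    return sum(prod(xs[i] for i in S)
               for S in combinations_with_replacement(range(n), k))

# Coefficient of t^k in E_n(t) H_n(-t) = 1 for k >= 1:
# h_k - e_1 h_{k-1} + e_2 h_{k-2} - ... + (-1)^k e_k = 0
checks = [sum((-1) ** i * e(i) * h(k - i) for i in range(k + 1))
          for k in range(1, 7)]
print(checks)
```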

### Newton’s sums

The power symmetric polynomials $p_k$ are defined by

$p_k(x_1, ... x_n) = m_{(k)} = x_1^k + ... + x_n^k$.

Again, given a partition $k = \lambda_1 + ... + \lambda_m$ we define

$p_{\lambda}(x_1, ... x_n) = p_{\lambda_1} p_{\lambda_2} ... p_{\lambda_m}$.

Again, $p_{\lambda}$ has degree $k$. I personally think of $p_k$ as computing $\text{tr} A^k$ where $A$ is a matrix with eigenvalues $x_1, x_2, ... x_n$. It follows that the sequence $p_k$ should satisfy a linear recurrence with characteristic polynomial $\prod_{i=1}^{n} (t - x_i)$, so we should have

$\displaystyle p_{n+k} = e_1 p_{n+k-1} - e_2 p_{n+k-2} \pm ...$.
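A quick numeric check of this recurrence (sample eigenvalues of my choosing, $n = 3$):

```python
from itertools import combinations
from math import prod

xs = [2, 3, 5]  # eigenvalues of a hypothetical 3x3 matrix
n = len(xs)

def e(k):
    return sum(prod(xs[i] for i in S) for S in combinations(range(n), k))

def p(k):
    return sum(x ** k for x in xs)

# p_m = e_1 p_{m-1} - e_2 p_{m-2} + ... + (-1)^{n+1} e_n p_{m-n}  for m > n,
# since each x_i^m satisfies the recurrence with characteristic
# polynomial t^n - e_1 t^{n-1} + e_2 t^{n-2} - ...
ok = all(
    p(m) == sum((-1) ** (j + 1) * e(j) * p(m - j) for j in range(1, n + 1))
    for m in range(n + 1, n + 6)
)
print(ok)
```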

This is most of Newton’s identities already: we just need to describe $p_k$ for $k < n$. The generating functions proof is essentially one I've already given in past GILA posts. Write $\displaystyle \log E_n(t) = \sum_{i=1}^{n} \log (1 + tx_i)$ and differentiate both sides, then multiply by $t$. This gives

$\displaystyle \frac{t E_n'(t)}{E_n(t)} = \sum_{i=1}^{n} \frac{tx_i}{1 + tx_i} = \sum_{k \ge 1} (-1)^{k+1} p_k(x_1, ... x_n) t^k$.

Multiplying both sides by $E_n(t)$ we obtain the identity

$p_k - e_1 p_{k-1} + e_2 p_{k-2} \mp ... = (-1)^{k+1} ke_k$.

Essentially the same argument applied to $\log H_n(t)$ also gives

$p_k + h_1 p_{k-1} + h_2 p_{k-2} + ... = kh_k$.
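Both families of identities are easy to verify at sample values (my own helpers; exact integer arithmetic):

```python
from itertools import combinations, combinations_with_replacement
from math import prod

xs = [1, 2, 3, 4]
n = len(xs)

def e(k):
    # empty sum for k > n correctly gives e_k = 0
    return sum(prod(xs[i] for i in S) for S in combinations(range(n), k))

def h(k):
    return sum(prod(xs[i] for i in S)
               for S in combinations_with_replacement(range(n), k))

def p(k):
    return sum(x ** k for x in xs)

# p_k - e_1 p_{k-1} + e_2 p_{k-2} - ... + (-1)^{k-1} e_{k-1} p_1
#     = (-1)^{k+1} k e_k
newton_e = all(
    sum((-1) ** i * e(i) * p(k - i) for i in range(k))
    == (-1) ** (k + 1) * k * e(k)
    for k in range(1, 8)
)

# p_k + h_1 p_{k-1} + h_2 p_{k-2} + ... + h_{k-1} p_1 = k h_k
newton_h = all(
    sum(h(i) * p(k - i) for i in range(k)) == k * h(k)
    for k in range(1, 8)
)
print(newton_e, newton_h)
```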

One might now expect that by writing the elementary symmetric polynomials in terms of the power symmetric polynomials, we could conclude that the symmetric polynomials have as a basis the power symmetric polynomials. Unfortunately, that pesky $k$ factor implies that this isn’t true for general $R$. For example, if $R$ has characteristic $2$, then $p_1^2 = p_2$, so we have fewer polynomials to work with than we expect. And if $2$ isn’t invertible in $R$, then $e_2 = \frac{p_1^2 - p_2}{2}$ can’t be generated by the power symmetric polynomials. Fortunately, if $R$ contains $\mathbb{Q}$, everything works out. To see this more explicitly, note that Newton’s sums are equivalent to the pair of identities

$\displaystyle E_n(t) = \prod_{i=1}^{n} (1 + tx_i) = \exp \left( p_1 t - \frac{p_2 t^2}{2} + \frac{p_3 t^3}{3} \mp ... \right)$

and

$\displaystyle H_n(t) = \prod_{i=1}^{n} \frac{1}{1 - tx_i} = \exp \left( p_1 t + \frac{p_2 t^2}{2} + \frac{p_3 t^3}{3} + ... \right)$.

The second identity should be familiar from the previous post on the cycle index polynomials of the symmetric groups. Stated in this form, we get an explicit formula for the $e_k$ in terms of the $p_k$ by expanding the exponential as follows: given a permutation $\sigma$ with cycle type a partition $\lambda$, define $p_{\sigma} = p_{\lambda}$. Then

$\displaystyle e_k = \frac{1}{k!} \sum_{\sigma \in S_k} \text{sgn}(\sigma) p_{\sigma}$.

This follows because the sign of a cycle of even length is negative and sign is multiplicative. Similarly,

$\displaystyle h_k = \frac{1}{k!} \sum_{\sigma \in S_k} p_{\sigma}$.
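These two averages over $S_k$ can be verified by brute force for small $k$, using the fact that $\text{sgn}(\sigma) = (-1)^{k - c(\sigma)}$ where $c(\sigma)$ is the number of cycles. A sketch with sample values and my own helper names:

```python
from itertools import permutations, combinations, combinations_with_replacement
from fractions import Fraction
from math import prod, factorial

xs = [Fraction(1), Fraction(2), Fraction(3)]
n = len(xs)

def p(k):
    return sum(x ** k for x in xs)

def cycle_type(sigma):
    """Cycle lengths of a permutation given in one-line notation."""
    seen, lengths = set(), []
    for i in range(len(sigma)):
        if i not in seen:
            length, j = 0, i
            while j not in seen:
                seen.add(j)
                j = sigma[j]
                length += 1
            lengths.append(length)
    return lengths

def from_power_sums(k, signed):
    """(1/k!) sum over S_k of (sgn(sigma)) p_sigma."""
    total = Fraction(0)
    for sigma in permutations(range(k)):
        cycles = cycle_type(sigma)
        sgn = (-1) ** (k - len(cycles)) if signed else 1
        total += sgn * prod((p(c) for c in cycles), start=Fraction(1))
    return total / factorial(k)

def e(k):
    return sum((prod(xs[i] for i in S) for S in combinations(range(n), k)),
               Fraction(0))

def h(k):
    return sum((prod(xs[i] for i in S)
                for S in combinations_with_replacement(range(n), k)),
               Fraction(0))

ok = all(from_power_sums(k, signed=True) == e(k) and
         from_power_sums(k, signed=False) == h(k)
         for k in range(5))
print(ok)
```

Note that the check includes $k = 4 > n$, where the signed average correctly collapses to $e_4 = 0$.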

These identities hold over any field of characteristic zero and more generally over any $\mathbb{Q}$-algebra. They imply, among other things, that it is possible to compute the characteristic polynomial of a square matrix by taking traces of its powers.
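As a sketch of that last remark (the matrix and helper names are hypothetical), one can recover the coefficients of the characteristic polynomial from $p_k = \text{tr} A^k$ by running Newton's identities as a recursion over $\mathbb{Q}$:

```python
from fractions import Fraction

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def elementary_from_traces(A):
    """Recover e_1, ..., e_n of the eigenvalues of A from p_k = tr(A^k)
    via Newton's identities, rearranged as
        k e_k = e_{k-1} p_1 - e_{k-2} p_2 + ... + (-1)^{k+1} e_0 p_k.
    The characteristic polynomial is then
        t^n - e_1 t^{n-1} + e_2 t^{n-2} - ... + (-1)^n e_n."""
    n = len(A)
    powers, P = [], A
    for _ in range(n):
        powers.append(P)
        P = mat_mul(P, A)
    ps = [Fraction(trace(Pk)) for Pk in powers]  # p_1, ..., p_n
    es = [Fraction(1)]                            # e_0
    for k in range(1, n + 1):
        ek = sum((-1) ** (i + 1) * es[k - i] * ps[i - 1]
                 for i in range(1, k + 1))
        es.append(ek / k)
    return es[1:]

# Upper triangular example with eigenvalues 1, 2, 3, so the
# characteristic polynomial is t^3 - 6t^2 + 11t - 6.
A = [[1, 1, 0],
     [0, 2, 1],
     [0, 0, 3]]
print(elementary_from_traces(A))  # e_1, e_2, e_3 = 6, 11, 6
```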

### The ring of symmetric functions

As the above examples show, the interesting identities among symmetric polynomials hold independent of the number of variables; they only depend on the partitions $\lambda$ involved. One reason we should expect this to be true is that setting $x_n = 0$ in a symmetric polynomial identity among $n$ variables immediately gives a symmetric polynomial identity among $n-1$ variables that is essentially identical in the terms involved for sufficiently large $n$. It would therefore be nice to reverse this process and prove symmetric polynomial identities “abstractly” rather than fixing the number of variables in advance.

This abstract setting is known as the ring of symmetric functions, although it does not really consist of a ring of functions in the usual sense. As the Wikipedia article describes, it can be thought of in two ways:

1. Start with the ring $R[[x_1, x_2, ... ]]$ of power series in countably many variables over a commutative ring $R$ and consider only the elements of finite degree, then only the elements invariant under the full permutation group $S_{\infty}$ of a countable set, or
2. Consider the homomorphisms $R[x_1, ... x_{n-1}]^{S_{n-1}} \to R[x_1, ... x_n]^{S_n}$ given by sending $e_k(x_1, ... x_{n-1})$ to $e_k(x_1, ... x_n)$ and set up a direct limit / colimit.

To keep things concrete we’ll take the first perspective, although the second is an important conceptual point. The ring of symmetric functions over $R$, denoted $\Lambda_R$, is spanned as an $R$-module by the monomial symmetric functions

$m_{\lambda} = \sum x_{i_1}^{\lambda_1} x_{i_2}^{\lambda_2} ... x_{i_m}^{\lambda_m}$

where $i_1, ... i_m$ range over all distinct indices (each monomial appearing exactly once) and $\lambda = (\lambda_1, ... \lambda_m)$ ranges over all partitions. $\Lambda_R$ is graded by degree, with multiplication defined as expected, although it is no longer finitely generated, and one can check that all of the above proofs make perfect sense in this setting; in particular $\Lambda_R$ is freely generated by the elementary symmetric functions, defined as the coefficients of

$\displaystyle \sum_{k \ge 0} e_k t^k = \prod_{i \ge 1} (1 + tx_i)$

and extended to partitions by multiplicativity; similarly for the complete homogeneous symmetric functions. We therefore know that the functions $m_{\lambda}, e_{\lambda}, h_{\lambda}$ are all bases of $\Lambda_R$ with no restriction on the partition $\lambda$ and under the same assumptions as before so is $p_{\lambda}$. The Hilbert series of $\Lambda_R$ is therefore equal to $\displaystyle \prod_{i \ge 1} \frac{1}{1 - t^i}$; this is precisely the limit, in the $t$-adic topology, of the Hilbert series of $R[x_1, ... x_n]^{S_n}$ as $n$ tends to infinity.

### Relationship to representation theory

Recall that the number of conjugacy classes of $S_n$, hence the number of irreducible representations, is the number of partitions $p(n)$ of $n$. This implies that the class functions $C_R(S_n)$ of $S_n$ over $R$ form a free $R$-module of rank $p(n)$, so the countable direct sum

$\mathcal{R} = R \oplus C_R(S_1) \oplus C_R(S_2) \oplus ...$

has Hilbert series $\displaystyle \prod_{i \ge 1} \frac{1}{1 - t^i}$ as an $R$-module if the degree of the elements of $C_R(S_n)$ is defined to be $n$.

Over $\mathbb{C}$, there is a distinguished basis of $C_{\mathbb{C}}(S_n)$ given by the irreducible characters $\chi_{\lambda}$ associated to the partitions of $n$, and there is a distinguished inner product given by

$\displaystyle \langle \alpha, \beta \rangle = \frac{1}{n!} \sum_{\sigma \in S_n} \alpha(\sigma) \overline{ \beta(\sigma) }$

with respect to which the irreducible characters are orthonormal, i.e. one has $\langle \chi_{\lambda}, \chi_{\mu} \rangle = \delta_{\lambda \mu}$. This inner product extends to $\mathcal{R}$ by declaring characters of different degree to be orthogonal. A question then naturally presents itself: does there exist an invertible linear transformation $\phi : \mathcal{R} \to \Lambda = \Lambda_{\mathbb{C}}$ such that

1. there exists a nice inner product on $\Lambda$ such that $\phi$ preserves the inner product of class functions, and
2. there exists a nice multiplication on $\mathcal{R}$ such that $\phi^{-1}$ preserves the product of symmetric functions?

As we will see, the answer to both questions is yes.
