Extracting the diagonal

October 7, 2009

Needless to say, I have been very, very busy. But enough about me.

Suppose you are given a bivariate generating function

\displaystyle F(x, y) = \sum_{m, n \ge 0} f(m, n) x^m y^n

in “closed form,” where I’ll be vague about what that means. Such a generating function may arise, for example, from counting lattice paths in \mathbb{Z}_{\ge 0}^2; then f(m,  n) might count the number of paths from (0, 0) to (m, n). If the path is only constrained by the fact that its steps must come from some set S \subset \mathbb{Z}_{\ge 0}^2 of steps containing only up or left steps, then we have the simple identity

\displaystyle F_S(x, y) = \frac{1}{1 - \sum_{(a, b) \in S} x^a y^b}.

Question: When can we determine the generating function \displaystyle D_F(x) = \sum_{n \ge 0} f(n, n) x^n in closed form?

I’d like to discuss an analytic approach to this question that gives concrete answers in at least a few important special cases, including the generating function for the central binomial coefficients, which is our motivating example.

The motivating example

Let’s be clear that extracting the diagonal is hard in general. As an important special case, if F(x, y) = A(x) B(y) where A, B are two generating functions, then extracting the diagonal D_F allows you to compute the Hadamard product of A and B, and Hadamard products are hard to compute in closed form in general; we get a lot of hypergeometric series this way, such as \displaystyle \sum_{k \ge 0} {n \choose k}^r x^k.

Nevertheless, at least one special case of this problem has a well-known answer. If S = \{ (1, 0), (0, 1) \}, then

\displaystyle F_S(x, y) = \frac{1}{1 - x - y} = \sum_{i, j \ge 0} {i+j \choose j} x^i y^j

and

\displaystyle D_{F_S}(x, y) = \sum_{n \ge 0} {2n \choose n} x^n = \frac{1}{ \sqrt{1 - 4x} }.

I am aware of at least four proofs of this result, and we’ll be discussing a fifth:

  1. Expand the RHS with the binomial theorem. This is perhaps the least informative proof.
  2. Establish the corresponding result for the Catalan numbers using their functional equation, then differentiate.
  3. Establish the corresponding result for the Catalan numbers, then use a path decomposition argument. (More on this later.)
  4. Prove the identity \sum_{k=0}^{n} {2k \choose k} {2(n-k) \choose n-k} = 4^n combinatorially. Unfortunately, I don’t know this proof; I’ve been told it’s hard.

Of these, I prefer proof #3. It goes as follows: let C(x) = \sum_{n \ge 0} \frac{1}{n+1} {2n \choose n} x^n be the usual generating function for the Catalan numbers. \frac{1}{n+1} {2n \choose n} counts the number of paths using the step set S = \{ (0, 1), (1, 0) \} = \{ U, R \} of length 2n which never go above (say) the diagonal y = x, and which end at (n, n); call these Catalan paths. {2n \choose n} counts the number of paths ending on (n, n) with no such restriction; call these grand Catalan paths. We decompose a grand path as follows: mark all the points at which such a path intersects the diagonal. Between two consecutive such points the path is of the form UCR or RCU where C is an arbitrary Catalan path. It follows that a grand Catalan path uniquely decomposes as a sequence of Catalan paths padded by two other steps in two ways, hence

\displaystyle D_{F_S}(x) = \frac{1}{1 - 2x C(x)}.

(Note that the exponent of x is half the number of steps.) A little algebra shows that the desired result follows. Unfortunately, if S is more complicated this argument doesn’t readily generalize. (The proof does, however, generalize to the grand Motzkin paths, which is the case S = \{ (2, 0), (1, 1), (0, 2) \}.)

So, barring a much more general combinatorial argument, how should we handle more complicated step sets S?

Fourier and complex analysis

Since the transformation F \mapsto D_F is linear, what we would like to do is to find a nice linear operator such that the terms x^i y^j, i \neq j are sent to zero but the terms x^n y^n are sent to r^n for some other variable r. In fact, there is a very nice way to do this. Consider the substitution

\displaystyle F(re^{it}, re^{-it}) = \sum_{m, n \ge 0} f(m, n) r^{m+n} e^{i(m-n)t}.

If F(x, y) converges absolutely in some neighborhood of the origin in \mathbb{C}^2, which is true whenever f(m, n) satisfies some kind of growth condition – in particular, it is true if F is rational – we can take r to be small enough so that this is an honest-to-goodness holomorphic function in r, t. Now observe that if m - n \neq 0 then e^{i(m-n)t} has exactly the property we want: namely, it integrates to 0 on the circle. But the terms with m - n = 0 integrate to 2\pi. In other words,

\displaystyle \frac{1}{2\pi} \int_{-\pi}^{\pi} F(re^{it}, re^{-it}) \, dt = \sum_{n \ge 0} f(n, n) r^{2n} = D_F(r^2).

This is exactly the kind of result we were looking for. Let me state the strongest result I can prove regarding when this integral is computable in “closed form.”

Theorem: If F is rational, then D_F is algebraic.

Proof. Let z = e^{it} and write F(re^{it}, re^{-it}) = \frac{P(r, z)}{Q(r, z)} where P, Q are polynomials (after clearing out by the appropriate power of z). We want to think of r as a fixed constant small enough so that \frac{P(r, z)}{Q(r, z)} is holomorphic as a function of z in a disc containing the unit disc. Then one has

\displaystyle \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{P(r,z )}{Q(r, z)} \, dt = \frac{1}{2\pi i} \int_{\gamma} \frac{P(r, z)}{z Q(r, z)} \, dz

after the appropriate change of variables, where \gamma is the contour |z| = 1. Note the relationship to Cauchy’s integral formula. Now the residue theorem gives

\displaystyle \frac{1}{2\pi i} \int_{\gamma} \frac{P(r, z)}{z Q(r, z)} \, dz = \sum_{z_i Q(r, z_i) = 0, |z_i| < 1} \text{Res}_{z=z_i} \frac{P(r, z)}{z Q(r, z)}

where the sum is over all poles z_i (as a function of z) in the unit circle and \text{Res} denotes the residue. These poles occur at the roots of z Q(r, z) = 0, which will be a finite collection of algebraic functions of r along with z_0 = 0, and the residues c_i are rational functions of the roots, hence are also going to be algebraic functions of r. The rule for whether z_i should be included in the summation is essentially that |z_i(0)| \le 1, since then every neighborhood of z_i(0) intersects the unit circle. And the desired result more or less follows.

(It may not be clear that the functions z_i exist, at least in the sense that they can be distinguished from each other. I can think of two arguments that should work in most cases: either use Hensel’s lemma to show that the z_i exist as formal power series in r, or work with small enough r so that the discriminant of Q(r, z) in z doesn’t vanish and the zeroes, as a function of r, are distinct and vary smoothly, in fact holomorphically, so the functions z_i have an analytic continuation.)

Example 1. Let S = \{ (1, 0), (0, 1) \}. Then F_S is rational, in particular

\displaystyle F_S(x, y) = \frac{1}{1 - x - y} = \frac{z}{-rz^2 + z - r}.

Then \frac{P(r, z)}{z Q(r, z)} has poles

\displaystyle z_0 = \frac{1 + \sqrt{1 - 4r^2}}{2r}, z_1 = \frac{1 - \sqrt{1 - 4r^2}}{2r}.

For sufficiently small r the root z_0 tends to infinity (in other words it does not have a well-defined power series expansion at r = 0), so we only sum the residue at z_1, giving

\displaystyle D_{F_S}(r^2) = \frac{1}{1 - 2rz_1} = \frac{1}{ \sqrt{1 - 4r^2} }.

Example 2. Let S = \{ (2, 0), (1, 1), (0, 2) \}. Then F_S is rational, in particular

\displaystyle F_S(x, y) = \frac{1}{1 - xy - x^2 - y^2} = \frac{z^2}{-r^2 z^4 + (1 - r^2) z^2 - r^2}.

Then \frac{P(r, z)}{z Q(r, z)} has poles

\displaystyle z_0^2 = z_1^2 = \frac{1 - r^2 + \sqrt{1 - 2r^2 - 3r^4}}{2r^2}, z_2^2 = z_3^2 = \frac{1 - r^2 - \sqrt{1 - 2r^2 - 3r^4}}{2r^2}.

For sufficiently small r the first two roots tend to infinity, so we only sum the residues at z_2, z_3, giving

\displaystyle D_{F_S}(r^2) = \frac{1}{2(1 - r^2) - 4r^2 z_2^2} + \frac{1}{2(1 - r^2) - 4r^2 z_3^2} = \frac{1}{ \sqrt{1 - 2r^2 - 3r^4} }.

Example 3. Let S = \mathbb{Z}_{\ge 0}^2 - \{ (0, 0) \}. Then F_S is rational, in particular

\displaystyle F_S(x, y) = \frac{1}{2 - \frac{1}{(1 - x)(1 - y)}} = \frac{1}{2} \left( 1 + \frac{z}{-2rz^2 + (1 + 2r^2)z - 2r} \right).

Then \frac{P(r, z)}{z Q(r, z)} has poles

\displaystyle z_0 = 0, z_1 = \frac{1 + 2r^2 + \sqrt{1 + 4r^2 - 12r^4}}{4r}, z_2 = \frac{1 + 2r^2 - \sqrt{1 + 4r^2 - 12r^4}}{4r}.

For sufficiently small r the root z_1 tends to infinity, so we sum over the residues at z_0, z_2 and obtain

\displaystyle D_{F_S} = \frac{1}{2} + \frac{1}{2} \frac{1}{1 + 2r^2 - 4rz_2} = \frac{1}{2} + \frac{1}{2 \sqrt{1 + 4r^2 - 12r^4} }.

This problem appeared on one of my problem sets recently and I’m curious if a path decomposition argument can be made to work.

Example 4. Let S = \{ (a, 0), (0, b) \} where a, b \ge 1. Then

\displaystyle D_{F_S}(r^2) = \sum_{n \ge 0} {(a+b)n \choose bn} r^{2(a+b)n}

is an algebraic function. For a = 2, b = 1 this appeared on one of my problem sets last year; see also this thread on ArtofProblemSolving.com where the path decomposition and differentiation argument is discussed for b = 1.

Combinatorial objections

In principle, none of the above computations should require analysis. There should be an alternate approach along the following lines: returning to our motivating case, one way to think about the binomial coefficients is that they describe all words on two letters a, b with a fixed number of as and a fixed number of bs. Then the central binomial coefficients describe the context-free language consisting of all words with the same number of as as bs. (We can take the generating grammar S \mapsto SS | \epsilon | aSb | bSa, if I’m not mistaken.) The following is known.

Theorem: Let L be a (edit: unambiguous) context-free language, let L_n be the number of words of length n, and let L(x) = \sum_{n \ge 0} L_n x^n. Then L(x) is algebraic.

I therefore suspect that the following is true, which would imply the result proven above.

Conjecture: Let S be a step set such that a map from S to uniquely decodable words preserving length, extended to paths consisting of steps in S, results in regular languages (read: F_S is rational). Then the sublanguage of words corresponding to paths ending on the diagonal y = x is context-free.

I hope it’s clear what I mean by this. Does anyone have a proof? (Probably it’s obvious.)

Edit, 10/15/09: Indeed, it is. See the proof at MathOverflow.net.

Tags: , , , ,

7 Responses to “Extracting the diagonal”

  1. uncudh Says:

    I think that perhaps your last theorem concerning context-free languages is not entirely true. For it to be true I believe one should add the hypothesis that the language be generated by an unambiguous context-free grammar. This paper of Flajolet gives numerous examples of context-free languages with transcendental generating functions.

    • Qiaochu Yuan Says:

      Sorry, of course you’re right. The post has been edited.

      • Akhil Mathew Says:

        “Theorem: Let be a (edit: unambiguous) context-free language, let be the number of words of length , and let . Then is algebraic.”

        Interesting- do you have a reference for the proof of this result? Also, is a regular language necessarily an unambiguous context-free language?

        I think I see how to prove this for unambiguous regular languages (by induction on the number of symbols defining the regular expression) but that’s as far as I’ve got.

      • Qiaochu Yuan Says:

        The theorem is quoted, although I’m not sure it’s proven, in Flajolet and Sedgwick, so there should be a reference there. An unambiguous regular language is necessarily an unambiguous context-free language, but a stronger result holds there: the generating function of an unambiguous regular language is rational.

      • Akhil Mathew Says:

        Thanks for the reference!

  2. Lauris Pauris Says:

    This was complex and crazy, but awesome as hell!

  3. Qiaochu Yuan Says:

    If people would like to see a discussion of diagonals that is less handwavy, I just learned today that this topic is discussed in Chapter 6, Section 3 of EC2. Stanley even uses the same examples (which makes sense since this post was inspired by his problem sets).


Leave a Reply