Feeds:
Posts

## IMO 2009 and proof systems

The problems from IMO 2009 are now available. I haven’t had much time to work on them, though.

There are two classical geometry problems, which I already know I won’t attempt. While I am well aware that classical geometry often requires a great deal of ingenuity, I am also aware of the existence of the field of automatic geometric theorem proving. On this subject I largely agree with Doron Zeilberger: it is more interesting to find an algorithm to prove classes of theorems than to prove individual theorems. The sooner we see areas like classical geometry as “low-level,” the sooner we can work on the really interesting “high-level” stuff. (Plus, I’m not very good at classical geometry.)

Zeilberger’s typical example of a type of theorem with a proof system is the addition or multiplication of very large numbers: it is more interesting to prove $(a + 1)(a - 1) = a^2 - 1$ symbolically than it is to prove individual “theorems” such as $999 \cdot 1001 = 999999$. Zeilberger himself played a significant role in the creation of another proof system, but for the far less trivial case of hypergeometric identities (which includes binomial identities as a special case).

But so I can make my point concretely, I’d like to discuss some examples based on the types of problems most of us had to deal with in middle or high school.

Trigonometric identities

… are trivial. For the purposes of this post a trigonometric identity in one angle is an identity of the form

$P(\cos \theta, \sin \theta) = Q(\cos \theta, \sin \theta)$

where $P, Q$ are polynomials. This is equivalent to allowing $P, Q$ to be rational functions, as well as being equivalent to allowing $P, Q$ to be rational functions in $\cos n \theta, \sin n \theta, n \in \mathbb{Q}$, since we can redefine $\theta$ so that $n \in \mathbb{Z}$ without loss of generality and since the Chebyshev polynomials allow us to remove the coefficients of $n$.

There are (at least) four ways to prove a trigonometric identity in one variable in a totally algorithmic manner:

1. Let $q = e^{i \theta}$. Then $\cos \theta = \frac{q + q^{-1}}{2}, \sin \theta = \frac{q - q^{-1}}{2i}$, and checking an identity of the form $P(\cos \theta, \sin \theta) = Q(\cos \theta, \sin \theta)$ is equivalent (after clearing denominators) to checking a polynomial identity in $q$, and checking polynomial identities is trivial: verify that each coefficient of the polynomial is zero.
2. Let $t = \tan \frac{\theta}{2}$. Then $\cos \theta = \frac{1 - t^2}{1 + t^2}, \sin \theta = \frac{2t}{1 + t^2}$, and again we can reduce to checking a polynomial identity. (This is one of my favorite tricks in all of mathematics and I may write a separate post about it at some point.)
3. Predict the degree of the polynomial you would get by using either of the substitutions above and evaluate both sides at more points than the degree of the polynomial. (If you’re worried that doing this numerically isn’t enough, repeatedly use the half-angle formula.)
4. The ring $\mathbb{C}[\cos \theta, \sin \theta]$ is isomorphic to $\mathbb{C}[x, y]/(x^2 + y^2 - 1)$. Any element of this ring has a unique representative in which no more than single powers of $x$ appear, so checking whether an element is zero again reduces to polynomial identity testing.

Admittedly, the problem of testing polynomial identities efficiently is far from trivial; see, for example, the Wikipedia article on the Schwartz-Zippel theorem. Efficiency is, at the moment, not important: what I am concerned with is the fact that such an algorithm exists. Serious mathematicians do not care about trigonometric identities; they have already internalized some algorithm for proving them, and that’s the end of the matter. As for students, it seems to me that the main reason students have difficulty proving trigonometric identities is that there are many equivalence relations in play and nobody says the word “equivalence relation,” much less “quotient ring,” until much later. Only when you sternly decide to find nice representatives to deal with all of them does everything reduce to polynomial identity testing.

Methods 1, 2, and 4 work just fine for trigonometric identities in a finite number of angles where we allow rational linear combinations of the angles, since after using the angle addition formulas (which aren’t even necessary in Method 1) we reduce to polynomial identity testing in $n$ variables, which again reduces to checking that every coefficient is equal to zero. Method 3 can be salvaged by an inductive argument: checking a polynomial identity in $n-1$ variables after setting the last variable equal to enough constant values should work. (I make no claim as to the efficiency of this algorithm.)

Factoring polynomials over the integers

… is trivial. The key to practical algorithms is that the irreducible factors of a polynomial over the integers are “visible” modulo primes; this is the local perspective in number theory again. Finding the irreducible factors of a polynomial over a particular finite field is a problem with a finite search space, and good algorithms are known for doing this.

Given that you can find the irreducible factors of a polynomial $p(x) \in \mathbb{Z}[x]$ modulo various primes, you obtain a great deal of information about its irreducible factors over $\mathbb{Z}$. The simplest example is that if $p(x)$ is found to be irreducible modulo any prime, it is already irreducible; more generally, if $p(x)$ is found to factor into factors $p_1(x), ... p_k(x) \bmod q$ then every irreducible factor over $\mathbb{Z}$ must reduce to some product of these factors $\bmod q$. Although I don’t know an explicit upper bound on the number of distinct primes needed to “tease out” the irreducible factors, it seems quite reasonable (at least to me) that the CRT and a finite amount of casework should settle the matter.

Indefinite integration of elementary functions

… is almost trivial. There is an algorithm called the Risch algorithm that determines the indefinite integral of an elementary function if it exists, based on differential Galois theory, except that one step (checking whether a certain expression is identically zero) is apparently an open problem.

As a special case, the indefinite integration of rational functions can be done algorithmically, at least in the sense that the coefficients of the indefinite integral can be found to arbitrary precision. To compute an indefinite integral

$\displaystyle \int \frac{P(x)}{Q(x)} \, dx, P, Q \in \mathbb{C}[x]$

for $P, Q$ polynomials, one simply computes the partial fraction decomposition of the integrand. If $Q(x)$ has solvable Galois group this can be done in “closed form”; otherwise, one can approximate the roots $r_i$ of $Q(x)$ numerically and get the coefficients of the partial fraction decomposition numerically as well. I don’t know how numerically stable doing this naively is.

Via Method 2 for trigonometric identities, the indefinite integration of rational functions of $\cos \theta, \sin \theta$ can be done in the same way. It follows that the indefinite integration of rational functions of $x, \sqrt{1 - x^2}$ can be done in the same way. In fact one can bypass trigonometric functions entirely to get this done: Method 2 is equivalent to directly using the substitution $x = \frac{2t}{1 + t^2}$.

Concluding remarks

A recurring theme in Zeilberger’s opinions is that students should be taught how to program using computer algebra. I rather like this idea. The surest way to show that you understand, for example, trig identities is to write an algorithm that proves them for you; similarly for certain classes of indefinite integrals. I think a generation of mathematicians raised in this way would do very interesting mathematics. At the very least, they would conjecture very interesting mathematics.

Another reason I like the algorithmic perspective is that it suggests interesting problems that are deeper than the problems the algorithms solve, and not just the obvious problems (such as “can this algorithm be improved?”). For example, here are some problems that the above discussion might suggest:

1. One way to generate new trigonometric identities from old ones is to differentiate with respect to $\theta$. As an abstract operation on $\mathbb{C}[x, y]/(x^2 + y^2 - 1)$, differentiation is a derivation $D$ defined by

$D(x) = -y, D(y) = x$

and then the claim that $D$ is “well-defined” is equivalent to the claim that the derivation of any multiple of $x^2 + y^2 - 1$ is identically zero. By the Leibniz identity this is equivalent to the simple condition that $D(x^2 + y^2 - 1) = D(x^2 + y^2) = 0$, which in turn is equivalent to the condition that $2x D(x) = -2y D(y)$. It follows that, up to multiplication by a “constant,” $D$ is essentially the only sensible notion of a derivative on the ring of functions of the circle.

This leads to an interesting idea: studying the derivations on the ring of functions of a variety $V$ gives us information about the possible “nice” parameterizations of $V$. (What I have in mind here is a possible motivation of the definition of the Weierstrass elliptic functions. This is probably standard material somewhere.)

2. Let $p(x) \in \mathbb{Z}[x]$ be an irreducible polynomial. Must there exist a prime relative to which $p(x)$ remains irreducible? (This is a classic problem; the answer is no.)
3. What functions share with $\sqrt{1 - x^2}$ the property that a rational substitution allows us to reduce to integrating rational functions? (One way to answer this problem is to find algebraic curves of genus zero.)

By the way, if you want to learn more about the proof system that inspired this post, a very readable introduction is the relevant section in Ideals, Varieties, and Algorithms by Cox, Little, and O’Shea. If you don’t want to get the book, here’s a one-sentence summary: everything is about Gröbner bases. (Zeilberger once remarked that the impact of the theory of Gröbner bases on mathematics should be worth ten Fields medals.)

### 3 Responses

1. [...] racional de e utilizando estas substituições. Este é  um dos métodos indicados neste post de Annoying [...]

2. Hmm…this is interesting but I have to say that I respectfully disagree. It seems to me that if you solve a problem generally, you may loose a lot. Perhaps there is a nicer solution that you miss by checking off the problem as solved. Or maybe there are certain special cases that merit more attention.

I would also argue that less general problems may have an aesthetic quality to them. If you consider algebraic inequalities, you may say well I can solve all of these with lagrange multipliers so lets just ignore them. But I feel like there are problems that would have nice elementary solutions (say with AM-GM or Cauchy) that are nice in themselves.

(Oh and obviously I feel the same way about geometry and thats why I wouldn’t just discount the problems haha)

• I certainly agree that different solutions to a problem offer different perspectives. I think the biggest value in a solution is what direction of generalization it suggests. Using Grobner bases points towards computational algebraic geometry, which has lots of interesting applications, but I agree that this is not all there is to say about geometry and different proofs suggest different generalizations – the axiomatic study of projective geometries, inner product spaces, and so forth. (I write posts like this precisely because I want to hear people disagree with them in interesting ways!)

I don’t agree that an elementary solution to an inequality is more interesting or even more elegant than a general one. Most Olympians think of Muirhead’s inequality as being “cheap,” but the proof of Muirhead’s inequality is quite beautiful; its central ingredient is the Birkhoff-von Neumann theorem, which can be proven via, of all things, Hall’s marriage theorem. I think the underlying idea of convexity is a powerful one and Olympians shouldn’t ignore it just because it’s “cheap.” (Again, this proof also suggests an interesting direction, which is to try to generalize the Birkhoff-von Neumann theorem, for example to certain types of measures generalizing doubly stochastic matrices.)

But I agree that Lagrange multipliers don’t make polynomial inequalities a pointless subject to study. For example, I believe it’s an open problem to characterize the polynomial inequalities satisfied by the elementary symmetric functions of n real variables.