Let $A, B$ be two $n \times n$ matrices. If $A, B$ don’t commute, then $AB \neq BA$; however, the two share several properties. If either $A$ or $B$ is invertible, then $AB$ is conjugate to $BA$ (for example $BA = A^{-1}(AB)A$), so in particular they have the same characteristic polynomial.
What if neither $A$ nor $B$ is invertible? As it turns out, $AB$ and $BA$ still have the same characteristic polynomial, although they are not conjugate in general (e.g. we might have $AB = 0$ but $BA$ nonzero). There are several ways of proving this result, which implies in particular that $AB$ and $BA$ have the same eigenvalues.
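For concreteness, here is a quick sympy check with an arbitrarily chosen pair of singular matrices, for which $AB = 0$ but $BA \neq 0$:

```python
# Two singular 2x2 matrices: AB = 0 and BA != 0, yet both have
# characteristic polynomial lambda^2.
import sympy as sp

A = sp.Matrix([[0, 1], [0, 0]])
B = sp.Matrix([[1, 0], [0, 0]])
lam = sp.symbols('lambda')

print(A * B)                              # zero matrix
print(B * A)                              # nonzero
print((A * B).charpoly(lam).as_expr())    # lambda**2
print((B * A).charpoly(lam).as_expr())    # lambda**2
```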
What if $A, B$ are linear transformations on an infinite-dimensional vector space? Do $AB$ and $BA$ still have the same eigenvalues in an appropriate sense? As it turns out, the answer is yes, and the key lemma in the proof is an interesting piece of “noncommutative high school algebra.”
Square matrices
Proposition: Let $A, B$ be two $n \times n$ matrices over an algebraically closed field $k$. Then $AB$ and $BA$ have the same characteristic polynomial.
(Note that this is a polynomial identity in the entries of $A$ and $B$ regarded as formal variables, so to prove it identically (in particular for all commutative rings $R$) it suffices to prove it over a fixed algebraically closed field, e.g. we could take $k = \mathbb{C}$.)
Proof 1. This is clear if $B$ is invertible, since then $AB = B^{-1}(BA)B$. The invertible matrices are Zariski-dense in all matrices (when $k = \mathbb{C}$ they are also dense in the usual topology), so the result follows in general.
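The specialization idea can be illustrated in sympy by perturbing a singular $B$ to $B + tI$, which is invertible for generic $t$; the matrices below are an ad hoc choice:

```python
# Perturb a singular B to B + t*I; the identity of characteristic polynomials
# holds identically in t, so in particular at t = 0.
import sympy as sp

t, lam = sp.symbols('t lambda')
A = sp.Matrix([[1, 2], [3, 6]])    # singular
B = sp.Matrix([[0, 1], [0, 0]])    # singular
Bt = B + t * sp.eye(2)             # invertible for generic t

p = (A * Bt).charpoly(lam).as_expr()
q = (Bt * A).charpoly(lam).as_expr()
print(sp.expand(p - q))            # 0, identically in t and lambda
```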
Proof 2. Recall that $\text{tr}(AB) = \text{tr}(BA)$. (This is straightforward to prove by computation, but it also has an elegant proof using the invariant description of the trace as the tensor contraction $V \otimes V^{\ast} \to k$, and one can also prove it in the same way as above.) By induction we conclude that $\text{tr}((AB)^m) = \text{tr}((BA)^m)$ for all $m$, so the power sum symmetric polynomials in the eigenvalues of $AB$ and $BA$ are identical. By the Newton-Girard identities, it follows that the elementary symmetric polynomials in the eigenvalues of $AB$ and $BA$ (which are, up to sign, the coefficients of the characteristic polynomials) are identical.
(Edit, 10/25/17:) Proof 2 only works over a field of characteristic zero. Fixing it basically gives a version of Proof 3.
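As a quick numerical illustration of Proof 2 (in characteristic zero, per the edit above), the traces of corresponding powers agree even when both matrices are singular; the matrices are an ad hoc choice:

```python
# Power sums of the eigenvalues of AB and BA agree, since tr((AB)^m) = tr((BA)^m).
import sympy as sp

A = sp.Matrix([[1, 2], [2, 4]])    # singular
B = sp.Matrix([[0, 1], [0, 3]])    # singular
for m in range(1, 5):
    print(m, ((A * B)**m).trace(), ((B * A)**m).trace())   # the last two agree
```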
Proof 3. We work universally. Let the entries of $A, B$ be formal variables $a_{ij}, b_{ij}$ in a polynomial ring $R = \mathbb{Z}[a_{ij}, b_{ij}]$. Note that we have

$$\det(B) \det(\lambda I - AB) = \det(\lambda B - BAB) = \det(\lambda I - BA) \det(B)$$

where $\lambda$ is another (scalar) variable and $I$ is the $n \times n$ identity matrix. Since $R[\lambda]$ is an integral domain, we can cancel $\det(B)$ from both sides to obtain

$$\det(\lambda I - AB) = \det(\lambda I - BA).$$
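For small $n$ the universal identity can also be checked directly by computer algebra; here is a sympy verification for $n = 2$, with all eight entries treated as independent formal variables:

```python
# Universal check of det(lambda*I - AB) = det(lambda*I - BA) for n = 2.
import sympy as sp

lam = sp.symbols('lambda')
A = sp.Matrix(2, 2, list(sp.symbols('a11 a12 a21 a22')))
B = sp.Matrix(2, 2, list(sp.symbols('b11 b12 b21 b22')))
I = sp.eye(2)

lhs = (lam * I - A * B).det()
rhs = (lam * I - B * A).det()
print(sp.expand(lhs - rhs))        # 0
```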
Eigencomplications
What can we say beyond matrices? It might be reasonable to guess that $AB$ and $BA$ still have the same eigenvalues if $A, B$ are linear transformations on an infinite-dimensional vector space; however, since no analogue of the characteristic polynomial, the trace, or the determinant is available in general in this setting, the proofs above don’t generalize, and in fact the result is false.
Example. Consider the differential operators $\frac{d}{dx} x$ and $x \frac{d}{dx}$ (that is, $AB$ and $BA$ with $A = \frac{d}{dx}$ and $B$ multiplication by $x$) acting on $k[x]$ ($k$ a field of characteristic zero). The former has eigenvectors $x^n$ with eigenvalues $n + 1$ while the latter has eigenvectors $x^n$ with eigenvalues $n$; thus the eigenvalues of the former are $1, 2, 3, \dots$ while the eigenvalues of the latter are $0, 1, 2, \dots$.
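A short sympy computation confirms the bookkeeping on the monomials $x^n$:

```python
# (d/dx o x)(x^n) = (n+1) x^n  versus  (x o d/dx)(x^n) = n x^n.
import sympy as sp

x = sp.symbols('x')
for n in range(4):
    AB = sp.diff(x * x**n, x)      # differentiate after multiplying by x
    BA = x * sp.diff(x**n, x)      # multiply by x after differentiating
    print(n, sp.simplify(AB / x**n), sp.simplify(BA / x**n))   # n+1 and n
```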
$\frac{d}{dx}$ and $x$ provide examples of other infinite-dimensional phenomena too: they are endomorphisms of a vector space like $n \times n$ matrices, but one is injective without being surjective, one is surjective without being injective, and $x$ doesn’t have any eigenvectors whatsoever (even if $k$ is algebraically closed)!
The issue that occurs in the above example is the following. Let $A, B$ be two linear operators on a vector space $V$. Suppose that $v$ is an eigenvector for $AB$, thus $ABv = \lambda v$ for some $\lambda$. Then

$$BA(Bv) = B(ABv) = \lambda (Bv)$$

so that $Bv$ is an eigenvector for $BA$ with the same eigenvalue, unless it is equal to zero. But in that case $ABv = \lambda v = 0$, so $\lambda = 0$. If $\lambda \neq 0$, then this cannot occur. So we have proven the following.
Proposition: Let $A, B$ be two linear transformations on a vector space $V$. Then $\lambda$ is a nonzero eigenvalue of $AB$ if and only if it is a nonzero eigenvalue of $BA$.
Thus the only possible discrepancy in the sets of eigenvalues occurs when $\lambda = 0$, and the example of $\frac{d}{dx}$ and $x$ shows that this can occur.
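A finite-dimensional cousin of this phenomenon, using rectangular matrices (an ad hoc choice, and a slight variation since $A$ and $B$ now map between two different spaces), shows the discrepancy at zero concretely:

```python
# AB (2x2) and BA (3x3) act on spaces of different dimensions, so their
# eigenvalue lists differ, but only by extra copies of 0.
import sympy as sp

A = sp.Matrix([[1, 0, 2], [0, 1, 1]])
B = sp.Matrix([[1, 1], [0, 2], [1, 0]])
print((A * B).eigenvals())    # two nonzero eigenvalues
print((B * A).eigenvals())    # the same two, plus the eigenvalue 0
```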
Unfortunately, as we have seen, if $V$ is infinite-dimensional there are linear transformations with no eigenvectors whatsoever, such as $x$ acting on $k[x]$. Is there no hope of generalizing the above result to such linear transformations?
The spectrum
When we talk about eigenvalues, what are we really talking about? On a finite-dimensional vector space $V$, to say that a linear transformation $T$ has an eigenvector $v$ with eigenvalue $\lambda$ is to say that $Tv = \lambda v$, or $(T - \lambda)v = 0$. This is true if and only if $T - \lambda$ fails to be invertible, which suggests the following definition.
Definition: Let $T$ be a linear transformation on a vector space $V$ over a field $k$. The spectrum $\sigma(T)$ consists of all $\lambda \in k$ such that $T - \lambda$ is not invertible.
If $T - \lambda$ is not invertible because it is not injective, then $\lambda$ is an eigenvalue; in functional analysis terms, $\lambda$ lies in the point spectrum $\sigma_p(T)$. However, $T - \lambda$ may also fail to be invertible because it is not surjective, and in this case the spectrum is strictly larger than the point spectrum.
Example. The linear operator $x$ acting on $k[x]$ has spectrum all of $k$ even though it has no point spectrum.
Example. The linear operator $x$ acting on the field of rational functions $k(x)$ has empty spectrum.
Example. Let $X$ be a topological space, $V = C(X)$ be the space of continuous functions $X \to \mathbb{C}$, and let $T : V \to V$ be multiplication by some function $f \in C(X)$. Then the spectrum of $T$ is precisely the range of $f$. (This is one basic reason it’s reasonable to think of eigenvalues of operators in quantum mechanics as being values of functions such as position and momentum.) In many cases, $T$ has no point spectrum (e.g. $X = \mathbb{R}$ and $f(x) = x$).
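Here is a toy model of the multiplication-operator example, with a finite set standing in for $X$ and plain Python functions standing in for $C(X)$; the names are ad hoc:

```python
# Toy model: multiplication operators on functions on a finite set X.
def mult(f):
    return lambda g: (lambda p: f(p) * g(p))

def try_invert_shift(f, lam, X):
    """Return the inverse of (f - lam) as a multiplication operator, or None."""
    if any(f(p) == lam for p in X):
        return None                       # lam is a value of f: not invertible
    return mult(lambda p: 1 / (f(p) - lam))

X = range(-3, 4)
f = lambda p: p * p                       # values {0, 1, 4, 9}
print(try_invert_shift(f, 4, X) is None)  # True: 4 is in the spectrum
print(try_invert_shift(f, 2, X) is None)  # False: 2 is not a value of f
```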
There is a nice connection between spectra in this sense and spectra of rings in algebraic geometry. A simple version is as follows. If $T$ is a linear transformation of a finite-dimensional vector space $V$ over an algebraically closed field $k$, then we can consider the ring $k[T]$ generated by $T$ in $\operatorname{End}(V)$. This ring is isomorphic to $k[x]/(m(x))$, where $m(x)$ is the minimal polynomial of $T$, so the spectrum of the ring $k[T]$ (in the algebro-geometric sense) can naturally be identified with the set of eigenvalues (that is, the spectrum) of $T$!
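For a concrete matrix one can watch this identification happen; the matrix below is an ad hoc diagonal example whose minimal polynomial is a proper divisor of its characteristic polynomial:

```python
# T is diagonalizable with eigenvalues 2 and 3, so m(x) = (x - 2)(x - 3) kills T
# even though the characteristic polynomial is (x - 2)^2 (x - 3).
import sympy as sp

x = sp.symbols('x')
T = sp.Matrix([[2, 0, 0], [0, 2, 0], [0, 0, 3]])
I = sp.eye(3)

print((T - 2 * I) * (T - 3 * I) == sp.zeros(3, 3))   # True
print(T.charpoly(x).as_expr())                        # x**3 - 7*x**2 + 16*x - 12
print(sorted(T.eigenvals()))                          # [2, 3], the roots of m(x)
```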
A more general connection can be obtained by generalizing the definition of spectrum.
Definition: Let $k$ be a field and $A$ a $k$-algebra. Let $a \in A$. The spectrum $\sigma(a)$ of $a$ consists of all $\lambda \in k$ such that $a - \lambda$ is not invertible in $A$.
(We recover the definition applied to linear operators by taking $A = \operatorname{End}(V)$.)
Now observe that if $A$ is commutative, then $a - \lambda$ is not invertible if and only if it is contained in a maximal ideal $\mathfrak{m}$. This maximal ideal has the property that $a \equiv \lambda \bmod \mathfrak{m}$ in $A/\mathfrak{m}$; thus, thinking of $a$ as a function on $\operatorname{MaxSpec}(A)$, it follows that $a$ takes on every value in its spectrum at an appropriate maximal ideal! If the residue fields of $A$ are all $k$ (which occurs for example if $k$ is algebraically closed and $A$ is finitely-generated, by the Nullstellensatz) then the converse is also true.
We are now ready to state the appropriate generalization of the proposition above about characteristic polynomials.
Proposition: Let $a, b$ be two elements of a $k$-algebra $A$. Then $\sigma(ab) \cup \{ 0 \} = \sigma(ba) \cup \{ 0 \}$.
The crucial lemma
The proposition claims that if $\lambda$ is nonzero, then $ab - \lambda$ is invertible if and only if $ba - \lambda$ is invertible. Equivalently, $1 - \lambda^{-1} ab$ is invertible if and only if $1 - \lambda^{-1} ba$ is invertible. By setting $a' = \lambda^{-1} a$, we see that the following piece of “noncommutative high school algebra” implies and in fact generalizes our proposition.
Lemma: Let $a, b$ be elements of a ring $R$. Then $1 - ab$ is invertible if and only if $1 - ba$ is invertible.
This lemma is somewhat infamous for having the following “proof.” Pretend that it makes sense to write $(1 - ab)^{-1} = 1 + ab + abab + ababab + \dots$ in a general ring. Then

$$(1 - ba)^{-1} = 1 + ba + baba + \dots = 1 + b(1 + ab + abab + \dots)a = 1 + b(1 - ab)^{-1} a.$$
And indeed, if we write $c = (1 - ab)^{-1}$, then we dutifully find that $(1 + bca)(1 - ba) = 1$ and similarly $(1 - ba)(1 + bca) = 1$.
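One can also sanity-check the lemma in a concrete noncommutative ring, say $2 \times 2$ matrices over $\mathbb{Q}$, with an arbitrary choice of $a$ and $b$ for which $1 - ab$ is invertible:

```python
# With c = (1 - ab)^(-1), both (1 + bca)(1 - ba) and (1 - ba)(1 + bca) equal 1.
import sympy as sp

a = sp.Matrix([[0, 1], [2, 3]])
b = sp.Matrix([[1, 1], [0, 1]])
one = sp.eye(2)

c = (one - a * b).inv()
print((one + b * c * a) * (one - b * a) - one)   # zero matrix
print((one - b * a) * (one + b * c * a) - one)   # zero matrix
```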
Halmos once posed the problem of explaining in what sense the geometric series proof works. There is a discussion of this problem on MO which I would summarize as follows. The universal ring describing this problem is the free ring $S = \mathbb{Z} \langle a, b, c \rangle / (c(1 - ab) = (1 - ab)c = 1)$ on two elements $a, b$ and an inverse $c$ to $1 - ab$. If we show that $1 - ba$ has an inverse in this ring, then we are done by the universal property. And the idea is that this ring ought to embed in a suitable ring of formal power series where one can make sense of geometric series expansions. To make this easier, we’ll work with a different ring $S' = \mathbb{Z} \langle a, b, c \rangle [t] / (c(1 - tab) = (1 - tab)c = 1)$, where $t$ is a new central variable, and we want to embed $S'$ into the formal power series ring $\mathbb{Z} \langle a, b \rangle [[t]]$ by sending $c$ to $\sum_{n \ge 0} t^n (ab)^n$. In this ring the geometric series argument makes perfect sense and proves that $1 + tbca$ is an inverse to $1 - tba$. Applying the evaluation homomorphism out of $S'$ obtained by setting $t = 1$ then solves our original problem.
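Here is a small self-contained sketch of the truncated version of this computation, with noncommutative polynomials in $a, b$ stored as dictionaries keyed by words and all products truncated above a fixed degree; the representation is an ad hoc one:

```python
# Truncated geometric-series check: with c = 1 + ab + abab + ... (cut off at
# degree N), the product (1 + b*c*a)(1 - b*a) equals 1 modulo words longer than N.
N = 8

def mul(p, q):
    r = {}
    for w1, c1 in p.items():
        for w2, c2 in q.items():
            w = w1 + w2
            if len(w) <= N:                       # truncate long words
                r[w] = r.get(w, 0) + c1 * c2
    return {w: c for w, c in r.items() if c != 0}

def add(p, q):
    r = dict(p)
    for w, c in q.items():
        r[w] = r.get(w, 0) + c
        if r[w] == 0:
            del r[w]
    return r

one = {(): 1}
a, b = {('a',): 1}, {('b',): 1}
ab = mul(a, b)

c = one
for _ in range(N):                                # c = 1 + ab + abab + ... (truncated)
    c = add(one, mul(ab, c))

one_minus_ba = add(one, {('b', 'a'): -1})
product = mul(add(one, mul(mul(b, c), a)), one_minus_ba)
print(product)                                    # {(): 1}, i.e. 1 up to degree N
```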
However, it does not seem trivial to me to prove that the natural map from $S'$ to $\mathbb{Z} \langle a, b \rangle [[t]]$ is actually injective, and nobody in the MO discussion above seems to actually prove this.
Hi Qiaochu, I don’t know if you have any plans to update this. But the rectangular matrix case can be dealt with very easily after having done the square case: see http://math.stackexchange.com/a/332688/22857
In proof 3 for square matrices, you wrote an argument of the form $\det(B) \det(X) = \det(Z) = \det(Y) \det(B)$ and since we work over an integral domain we can cancel $\det(B)$ from both sides. Put another way, $$ \det(B) \cdot ( \det(X)-\det(Y) ) = 0 \ .$$
While the image of the det homomorphism is an integral domain, its domain is not! You can have a non-zero non-invertible matrix $B$, and then $\det(B)=0$ and you can’t say anything about $\det(X)-\det(Y)$. So the proof is valid only when both $A$ and $B$ are invertible (or at least when $A$ and $B$ have a non-zero determinant).
In proof 3 the entries of the matrices are all algebraically independent formal variables; in particular, the determinant is literally the determinant as a polynomial in the entries, and so is definitely nonzero. That’s the benefit of working universally. (That is, in that proof the matrices aren’t particular matrices; they’re literally the universal pair of matrices.)
In Proof 3, why is $\det(\lambda I - bab) = \det(b) \det(\lambda I - ab)$?
It’s not, of course! I meant $\det(\lambda b - bab)$.
In proof 3 you missed $b$ in the first determinant: $\det(b \lambda I - bab) = \det(b) \det(\lambda I - ab) = \det(b) \det(\lambda I - ba)$. And $K[T]$ is not isomorphic to $K[x]/m(x)$ where $m$ is the characteristic polynomial but to $K[x]/p(x)$ where $p$ is the minimal polynomial (otherwise, for example, $K[\mathrm{Id}]$ would be $n$-dimensional rather than 1-dimensional).
Thanks! Those were both typos; I meant the minimal polynomial (that’s why I called it $m$!).