Chapter 5 - Eigenvalues, Eigenvectors, and Invariant Subspaces

5.A: Invariant Subspaces

Suppose T ∈ L(V). If we have a direct sum decomposition:

V = U1 ⊕ ⋯ ⊕ Um

where each Uj is a proper subspace of V, then to understand T : V → V we only need to understand each restriction T|Uj of T to the subspace Uj. The problem is that T|Uj may not map back into Uj itself, a requirement for being an operator (see Chapter 3 - Linear Maps#Operators as to why this is the case). So we must only consider decompositions of V for which that property holds.

invariant subspace

Suppose T ∈ L(V). A subspace U of V is invariant under T if u ∈ U implies that Tu ∈ U.

Note that U is invariant under T exactly when T|U is an operator on U.

Some examples of invariant subspaces under T ∈ L(V):

  • {0};
  • V itself;
  • null(T);
  • range(T).

So the big question is whether T ∈ L(V) has invariant subspaces other than these so-called trivial ones. Notice that null(T) may very well equal the first one, {0}, and likewise range(T) = V is also a possibility, so null(T) and range(T) need not give us anything new.

Eigenvalues/vectors

Let's look at invariant subspaces of dimension 1, the simplest kind there is.

Take any v ∈ V where v ≠ 0 and let U be:

U = {λv : λ ∈ F} = span(v)

Then U is a 1-dimensional subspace of V. If U is invariant under an operator T ∈ L(V), then Tv ∈ U and thus there is some scalar λ ∈ F such that:

Tv = λv

Conversely, if Tv = λv for some λ ∈ F, then span(v) is a 1-dimensional subspace of V invariant under T.

eigenvalue

Suppose T ∈ L(V). A number λ ∈ F is called an eigenvalue of T if there exists v ∈ V such that v ≠ 0 and Tv = λv.

Thus T has a 1-dimensional invariant subspace iff T has an eigenvalue.
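As a quick numerical sanity check of this correspondence (a minimal sketch; the operator below is made up purely for illustration):

```python
import numpy as np

# Hypothetical operator on R^2, chosen for illustration.
T = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# v = (1, 1) is an eigenvector: Tv = (3, 3) = 3v,
# so span(v) is a 1-dimensional subspace invariant under T.
v = np.array([1.0, 1.0])
lam = 3.0
assert np.allclose(T @ v, lam * v)

# Any vector in span(v) stays in span(v): T(c*v) = c*(lam*v).
c = -2.5
assert np.allclose(T @ (c * v), (c * lam) * v)
```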

Lemma

Suppose V is finite-dimensional, T ∈ L(V), and λ ∈ F. Then the following are equivalent.

  • λ is an eigenvalue of T;
  • T − λI is not injective;
  • T − λI is not surjective;
  • T − λI is not invertible.

The proof of this is in Lecture 22 (online) - Invariant Subspaces#^dedd4b.
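A numerical illustration of the lemma (a sketch with a made-up matrix; the determinant is used here merely as a computational stand-in for invertibility):

```python
import numpy as np

# Hypothetical upper-triangular operator; its eigenvalues are 2 and 5.
T = np.array([[2.0, 1.0],
              [0.0, 5.0]])
I = np.eye(2)

# lam = 2 is an eigenvalue, so T - 2I is not invertible (det = 0).
assert np.isclose(np.linalg.det(T - 2.0 * I), 0.0)

# lam = 4 is not an eigenvalue, so T - 4I is invertible (det != 0).
assert not np.isclose(np.linalg.det(T - 4.0 * I), 0.0)
```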

eigenvector

Suppose T ∈ L(V) and λ ∈ F is an eigenvalue of T. A vector v ∈ V is an eigenvector of T corresponding to λ if v ≠ 0 and Tv = λv.

Since Tv = λv ⟺ (T − λI)v = 0, a vector v ∈ V with v ≠ 0 is an eigenvector corresponding to λ iff v ∈ null(T − λI).

Linearly Independent Eigenvectors

Let T ∈ L(V). Suppose λ1,...,λm are distinct eigenvalues of T and v1,...,vm are corresponding eigenvectors. Then the list v1,...,vm is LI.

The proof is at Lecture 22 (online) - Invariant Subspaces#^4f4e13.
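This can be checked numerically (a sketch with a made-up triangular matrix): stacking eigenvectors for distinct eigenvalues as columns gives a full-rank matrix.

```python
import numpy as np

# Hypothetical operator with three distinct eigenvalues (1, 2, 3).
T = np.array([[1.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 3.0]])
eigvals, eigvecs = np.linalg.eig(T)   # columns of eigvecs are eigenvectors

# Distinct eigenvalues -> the corresponding eigenvectors are LI,
# so the matrix whose columns are those eigenvectors has full rank.
assert len(set(np.round(eigvals.real, 8))) == 3
assert np.linalg.matrix_rank(eigvecs) == 3
```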

Lemma

Suppose V is a finite-dimensional vector space. Then each operator T on V has at most dim(V) distinct eigenvalues.

The proof is at Lecture 22 (online) - Invariant Subspaces#^cecd64.

Restriction and Quotient Operators

If T ∈ L(V) and U is a subspace of V invariant under T, then U determines two other operators T|U ∈ L(U) and T/U ∈ L(V/U):

T|U and T/U

Suppose T ∈ L(V) and U is a subspace of V invariant under T.

  • The restriction operator T|U ∈ L(U) is defined by:
T|U(u) = Tu

for u ∈ U.

  • The quotient operator T/U ∈ L(V/U) is defined by:
(T/U)(v + U) = Tv + U

for v ∈ V.

Here v + U is the set v + U = {v + u : u ∈ U}. Notice that if v + U = w + U then Tv + U = Tw + U, so T/U is well-defined: v + U = w + U implies v − w ∈ U, so T(v − w) = Tv − Tw ∈ U because U is invariant under T, and hence Tv + U = Tw + U. To be honest, we're not covering these in detail, so see sections 3.D and 3.E later on to use these in more detail.

5.B: Eigenvectors and Upper-Triangular Matrices

Polynomials Applied to Operators

T^m

Suppose T ∈ L(V) and m is a positive integer.

  • T^m is defined by T^m = T⋯T (m times).
  • T^0 is defined to be the identity operator I on V.
  • If T is invertible with inverse T^(-1), then T^(-m) is defined by T^(-m) = (T^(-1))^m.
p(T)

Suppose T ∈ L(V) and p ∈ P(F) is a polynomial given by:

p(z) = a0 + a1 z + a2 z^2 + ⋯ + am z^m

for z ∈ F. Then p(T) is the operator defined by:

p(T) = a0 I + a1 T + a2 T^2 + ⋯ + am T^m

If we fix T ∈ L(V), then the function from P(F) to L(V) given by p ↦ p(T) is linear.

product of polynomials

If p, q ∈ P(F) then pq ∈ P(F) is the polynomial defined by:

(pq)(z) = p(z)q(z)

for z ∈ F.

We get some properties as a result:

Multiplicative properties

Suppose p, q ∈ P(F) and T ∈ L(V). Then:

  1. (pq)(T) = p(T)q(T)
  2. p(T)q(T) = q(T)p(T)

The proof is pretty straightforward, but if you're curious check out Year3/Winter2024/MATH306-LinearAlgebraII/2015_Book_LinearAlgebraDoneRight.pdf#page=144.
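Both properties are easy to verify numerically. Below is a minimal sketch: `poly_of_operator` is a hypothetical helper implementing the definition of p(T) above, and the matrix T is made up for illustration.

```python
import numpy as np

def poly_of_operator(coeffs, T):
    """Evaluate p(T) = a0*I + a1*T + ... + am*T^m, with coeffs = [a0, ..., am]."""
    result = np.zeros_like(T, dtype=float)
    power = np.eye(T.shape[0])   # T^0 = I
    for a in coeffs:
        result += a * power
        power = power @ T
    return result

T = np.array([[1.0, 2.0],
              [3.0, 4.0]])
p = [1.0, 1.0]               # p(z) = 1 + z
q = [-2.0, 0.0, 1.0]         # q(z) = z^2 - 2
pq = [-2.0, -2.0, 1.0, 1.0]  # (pq)(z) = (1+z)(z^2-2) = -2 - 2z + z^2 + z^3

# Property 1: (pq)(T) = p(T) q(T)
assert np.allclose(poly_of_operator(pq, T),
                   poly_of_operator(p, T) @ poly_of_operator(q, T))
# Property 2: polynomials in the same operator commute.
assert np.allclose(poly_of_operator(p, T) @ poly_of_operator(q, T),
                   poly_of_operator(q, T) @ poly_of_operator(p, T))
```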

Existence of Eigenvalues

Operators on complex vector spaces have an eigenvalue

Every operator on a finite-dimensional, nonzero, complex vector space has an eigenvalue.

We covered the proof in lecture; see Lecture 23 - Polynomial Operator#^9bac66.

Upper-Triangular Matrices

matrix of an operator, M(T)

Suppose T ∈ L(V) and v1,...,vn is a basis of V. The matrix of T with respect to this basis is the n×n matrix:

$$M(T) = \begin{pmatrix} A_{1,1} & \cdots & A_{1,n} \\ \vdots & & \vdots \\ A_{n,1} & \cdots & A_{n,n} \end{pmatrix}$$

whose entries Aj,k are defined by:

Tvk = A1,k v1 + ⋯ + An,k vn

If the basis is not clear from context, M(T,β) is used instead.
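The definition can be turned into a small computation (a sketch; `matrix_of_operator` is a hypothetical helper and the matrix T is made up): column k of M(T) holds the coordinates of T(vk) in the chosen basis.

```python
import numpy as np

def matrix_of_operator(T, basis):
    """M(T) w.r.t. a basis: column k holds the coordinates of T(v_k) in that basis."""
    B = np.column_stack(basis)       # basis vectors as columns
    # Solving B @ A = T @ B gives A = B^{-1} T B; column k of A solves
    # T v_k = A[0,k] v_0 + ... + A[n-1,k] v_{n-1}.
    return np.linalg.solve(B, T @ B)

# Hypothetical operator on R^2 (made up for illustration).
T = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# With respect to the standard basis, M(T) is just T itself.
std = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
M_std = matrix_of_operator(T, std)
assert np.allclose(M_std, T)
```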

A particularly nice basis is one whose first vector is an eigenvector, so that the first column of M(T) is a non-zero entry λ followed by nothing but zeroes:

$$\begin{pmatrix} \lambda & * & \cdots & * \\ 0 & & & \\ \vdots & & A & \\ 0 & & & \end{pmatrix}$$

If V is a finite-dimensional complex vector space, we know that an eigenvalue λ exists, and we can use an associated eigenvector as the first vector of the basis. We can then repeat the argument on the smaller matrix A obtained by removing the first row and column, and get the same shape, all the way down!

diagonal of a matrix

The diagonal of a square matrix consists of the entries along the line from the upper left corner to the bottom right corner.

upper-triangular matrix

A matrix is called upper-triangular if all the entries below the diagonal equal 0.

Typically they have the shape:

$$\begin{pmatrix} \lambda_1 & & * \\ & \ddots & \\ 0 & & \lambda_n \end{pmatrix}$$
Conditions for upper-triangular matrix

Suppose T ∈ L(V) and v1,...,vn is a basis of V. Then the following are equivalent:

  1. The matrix M(T) of T with respect to v1,...,vn is upper-triangular
  2. Tvj ∈ span(v1,...,vj) for each j = 1,...,n
  3. span(v1,...,vj) is invariant under T for each j = 1,...,n

To see the intuition for the proof below, refer to Lecture 23 - Polynomial Operator#Upper Triangular for an overview of how the proof works.

Proof
(1) is equivalent to (2) by the definitions. (3) implies (2) is easy to show, so to finish we only prove that (2) implies (3).

Suppose (2) holds. Fix j ∈ {1,...,n}. From (2) we know that:

Tv1 ∈ span(v1) ⊆ span(v1,...,vj)
Tv2 ∈ span(v1,v2) ⊆ span(v1,...,vj)
⋮
Tvj ∈ span(v1,...,vj)

Thus if v is a linear combination of v1,...,vj then:

Tv ∈ span(v1,...,vj)

So span(v1,...,vj) is invariant under T.

Now we want to show that each operator on a finite-dimensional complex vector space has, with respect to some basis, a matrix with only 0's below the diagonal.

Over C, every operator has an upper-triangular matrix

Suppose V is a finite-dimensional complex vector space and TL(V). Then T has an upper-triangular matrix with respect to some basis of V.

See the proof at Lecture 24 - Finishing Eigenstuff#^98fa4c. Notice that to construct such a proof, one usually works backwards from the desired conclusion, then writes it forwards. In this case, U seems arbitrary, but it's a good choice because the bottom-right block at the end of the proof is specifically guaranteed to have λ's on the diagonal.

Determination of invertibility from upper-triangular matrix

Suppose T ∈ L(V) has an upper-triangular matrix with respect to some basis of V. Then T is invertible iff all the entries on the diagonal of that upper-triangular matrix are non-zero.

Again the proof we did in lecture is at Lecture 24 - Finishing Eigenstuff#^d13947 for the first half, and Lecture 25 - Eigenvalues (cont.)#^a6bb24 for the more detailed proof of the second half.
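A quick numerical illustration (a sketch with made-up matrices; again, the determinant is just a computational stand-in for invertibility):

```python
import numpy as np

# Upper-triangular with a zero on the diagonal -> not invertible.
A = np.array([[1.0, 5.0],
              [0.0, 0.0]])
assert np.isclose(np.linalg.det(A), 0.0)

# Upper-triangular with all diagonal entries non-zero -> invertible.
B = np.array([[1.0, 5.0],
              [0.0, 3.0]])
assert not np.isclose(np.linalg.det(B), 0.0)
```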

This will be used for a really important lemma, so hold onto your butts.

Determination of eigenvalues from upper-triangular matrix

Suppose T ∈ L(V) has an upper-triangular matrix with respect to some basis of V. Then the eigenvalues of T are precisely the entries on the diagonal of that upper-triangular matrix.

The proof is at Lecture 25 - Eigenvalues (cont.)#^8972ff.
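This one is easy to see numerically (a sketch with a made-up upper-triangular matrix):

```python
import numpy as np

# Hypothetical 3x3 upper-triangular matrix.
A = np.array([[2.0, 1.0, 4.0],
              [0.0, 5.0, 6.0],
              [0.0, 0.0, 2.0]])

# The eigenvalues are exactly the diagonal entries (with repetition).
assert np.allclose(sorted(np.linalg.eigvals(A).real), [2.0, 2.0, 5.0])
```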

5.C: Eigenspaces and Diagonal Matrices

diagonal matrix

A diagonal matrix is a square matrix that is 0 everywhere except possibly along the diagonal.

For instance:

$$\begin{pmatrix} 8 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 5 \end{pmatrix} = \operatorname{diag}(8, 5, 5)$$

Note that every diagonal matrix is upper triangular, so all the properties from the previous section apply here.

Using Chapter 5 - Eigenvalues, Eigenvectors, and Invariant Subspaces#^bb525c, if an operator T has a diagonal matrix with respect to some basis, then the entries along the diagonal are precisely the eigenvalues themselves.

eigenspace, E(λ,T)

Suppose T ∈ L(V) and λ ∈ F. The eigenspace of T corresponding to λ, denoted E(λ,T), is defined by:

E(λ,T) = null(T − λI)

In other words, E(λ,T) is the set of all eigenvectors of T corresponding to λ, along with the 0 vector.

Notice that for T ∈ L(V) and λ ∈ F, the eigenspace E(λ,T) is a subspace of V, because the null space of each linear map on V is a subspace of V. Thus, these definitions imply that λ is an eigenvalue of T iff E(λ,T) ≠ {0}.

For example, consider the matrix diag(8,5,5) above, regarded as the matrix of an operator T with respect to a basis v1, v2, v3. Here:

E(8,T) = span(v1)
E(5,T) = span(v2,v3)
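The eigenspace dimensions can be computed directly (a sketch; `eigenspace_dim` is a hypothetical helper using the rank-nullity relation dim null(T − λI) = n − rank(T − λI)):

```python
import numpy as np

T = np.diag([8.0, 5.0, 5.0])

def eigenspace_dim(T, lam):
    # dim E(lam, T) = dim null(T - lam*I) = n - rank(T - lam*I)
    n = T.shape[0]
    return n - np.linalg.matrix_rank(T - lam * np.eye(n))

assert eigenspace_dim(T, 8.0) == 1   # E(8, T) = span(v1)
assert eigenspace_dim(T, 5.0) == 2   # E(5, T) = span(v2, v3)
assert eigenspace_dim(T, 3.0) == 0   # 3 is not an eigenvalue
```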

Now notice that if we restrict T to E(λ,T), then all that the restricted operator does is scale vectors by λ.

Sum of eigenspaces is a direct sum

Suppose V is finite-dimensional and T ∈ L(V). Suppose also that λ1,...,λm are distinct eigenvalues of T. Then:

E(λ1,T) + ⋯ + E(λm,T)

is a direct sum, and furthermore,

dim E(λ1,T) + ⋯ + dim E(λm,T) ≤ dim(V)

Proof
To show that E(λ1,T) + ⋯ + E(λm,T) is a direct sum, suppose:

u1 + ⋯ + um = 0

where each uj ∈ E(λj,T). Each non-zero uj is an eigenvector for λj, and eigenvectors corresponding to distinct eigenvalues are LI (from Chapter 5 - Eigenvalues, Eigenvectors, and Invariant Subspaces#^45cb5f), so if any uj were non-zero the sum above could not equal 0. Hence each uj = 0, and we have a direct sum via Chapter 1 - Vector Spaces#^038942. Now:

dim E(λ1,T) + ⋯ + dim E(λm,T) = dim(E(λ1,T) ⊕ ⋯ ⊕ E(λm,T)) ≤ dim(V)

where the = comes from HW 2 - Finite Dimensional Vector Spaces#16.

diagonalizable

An operator T ∈ L(V) is diagonalizable if the operator has a diagonal matrix with respect to some basis of V.

For instance, consider T ∈ L(R^2) where:

T(x,y) = (41x + 7y, −20x + 74y)

The matrix of T with respect to the standard basis of R^2 is:

$$\begin{pmatrix} 41 & 7 \\ -20 & 74 \end{pmatrix}$$

which isn't diagonal. But T is diagonalizable, since the matrix of T with respect to the basis (1,4), (7,5) is:

$$\begin{pmatrix} 69 & 0 \\ 0 & 46 \end{pmatrix}$$

which is diagonal.
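We can verify this change of basis numerically (a minimal sketch: with P holding the basis vectors as columns, M(T) with respect to that basis is P⁻¹AP):

```python
import numpy as np

A = np.array([[41.0, 7.0],
              [-20.0, 74.0]])
# Columns of P are the basis vectors (1, 4) and (7, 5).
P = np.array([[1.0, 7.0],
              [4.0, 5.0]])

# M(T) with respect to this basis is P^{-1} A P, which is diagonal.
D = np.linalg.inv(P) @ A @ P
assert np.allclose(D, np.diag([69.0, 46.0]))
```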

Conditions equivalent to diagonalizability

Suppose V is finite-dimensional and T ∈ L(V). Let λ1,...,λm denote the distinct eigenvalues of T. Then the following are equivalent:

  (a) T is diagonalizable;
  (b) V has a basis consisting of eigenvectors of T;
  (c) there exist 1-dimensional subspaces U1,...,Un of V, each invariant under T, such that:
V = U1 ⊕ ⋯ ⊕ Un
  (d) V = E(λ1,T) ⊕ ⋯ ⊕ E(λm,T);
  (e) dim(V) = dim E(λ1,T) + ⋯ + dim E(λm,T)

Proof
(a) ⟺ (b). An operator T ∈ L(V) has a diagonal matrix diag(λ1,...,λn) with respect to a basis v1,...,vn iff Tvj = λj vj for each j, i.e. iff each vj is an eigenvector. Thus (a) and (b) are equivalent.

(b) ⟹ (c). Suppose (b). Then V has a basis v1,...,vn consisting of eigenvectors of T. For each j, let Uj = span(vj). Each Uj is a 1-dimensional subspace that is invariant under T. Because v1,...,vn is a basis of V, each vector in V can be written uniquely as a linear combination of v1,...,vn. Hence, each vector in V can be written uniquely as a sum u1 + ⋯ + un, where each uj ∈ Uj. Thus V = U1 ⊕ ⋯ ⊕ Un, so (b) implies (c).

(c) ⟹ (b). Suppose (c); so there are 1-dimensional subspaces U1,...,Un of V, each invariant under T, such that V = U1 ⊕ ⋯ ⊕ Un. For each j, let vj be a non-zero vector in Uj. Then each vj is an eigenvector of T, as Uj is 1-dimensional and invariant under T. Because each vector in V can be written uniquely as a sum u1 + ⋯ + un where each uj ∈ Uj (so each uj is a scalar multiple of vj), we see that v1,...,vn is a basis of V. So (c) implies (b).

We now know that (a), (b), and (c) are all equivalent. We finish by showing (b) implies (d), (d) implies (e), and (e) implies (b).

Suppose (b) holds; thus V has a basis consisting of eigenvectors of T. Hence every vector in V is a linear combination of eigenvectors of T, which means that:

V = E(λ1,T) + ⋯ + E(λm,T)

Then Chapter 5 - Eigenvalues, Eigenvectors, and Invariant Subspaces#^c87cad shows that this sum is a direct sum, so (d) holds.

(d) implies (e) follows directly, again, from HW 2 - Finite Dimensional Vector Spaces#16.

Finally, suppose (e) holds; so:

dim(V) = dim E(λ1,T) + ⋯ + dim E(λm,T)

Choose a basis of each E(λj,T), and put all these bases together to form a list v1,...,vn of eigenvectors of T, where n = dim(V) by the dimension equation above. To show v1,...,vn is LI, suppose:

a1 v1 + ⋯ + an vn = 0

For each j = 1,...,m, let uj denote the sum of all the terms ak vk such that vk ∈ E(λj,T). Thus each uj is in E(λj,T), and:

u1 + ⋯ + um = 0

Because eigenvectors corresponding to distinct eigenvalues are LI via Chapter 5 - Eigenvalues, Eigenvectors, and Invariant Subspaces#^45cb5f, each uj = 0 (otherwise the non-zero uj's would be LI eigenvectors summing to 0). Because each uj is a sum of terms ak vk where the vk's were chosen as a basis of E(λj,T), this implies that all ak = 0. Thus v1,...,vn is LI, and hence a basis of V (since it has the right number of vectors). Thus (e) implies (b).

Okay that was a long proof. Feel free to digest this, then move onto the next thing.

The sad thing, though, is that not every T ∈ L(V) is diagonalizable. For example, the operator T ∈ L(C^2) defined by:

T(w,z) = (z,0)

is not diagonalizable: 0 is the only eigenvalue of T, and E(0,T) = {(w,0) ∈ C^2 : w ∈ C}, so conditions (b)-(e) in the theorem above all fail, hence (a) fails and T isn't diagonalizable.
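In matrix form this failure is easy to check (a sketch; T below is the matrix of (w,z) ↦ (z,0) with respect to the standard basis):

```python
import numpy as np

# T(w, z) = (z, 0) with respect to the standard basis.
T = np.array([[0.0, 1.0],
              [0.0, 0.0]])

# 0 is the only eigenvalue ...
assert np.allclose(np.linalg.eigvals(T), [0.0, 0.0])

# ... and E(0, T) = null(T) is only 1-dimensional, so eigenvectors
# cannot span C^2 and T is not diagonalizable.
assert 2 - np.linalg.matrix_rank(T) == 1
```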

However, the following lemma guarantees diagonalizability when there are enough distinct eigenvalues:

Enough eigenvalues implies diagonalizability

If T ∈ L(V) has dim(V) distinct eigenvalues, then T is diagonalizable.

Proof
Suppose T ∈ L(V) has dim(V) distinct eigenvalues λ1,...,λdim(V). For each j, let vj ∈ V be an eigenvector corresponding to the eigenvalue λj. Because eigenvectors corresponding to distinct eigenvalues are LI via Chapter 5 - Eigenvalues, Eigenvectors, and Invariant Subspaces#^45cb5f, the list v1,...,vdim(V) is LI. An LI list of dim(V) vectors is a basis of V, so v1,...,vdim(V) is a basis of V. With respect to this basis consisting of eigenvectors, T has a diagonal matrix.

Note that the converse is not true: an operator can be diagonalizable without having dim(V) distinct eigenvalues (e.g. the identity operator, which has only the eigenvalue 1).
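The lemma is also easy to see in action (a sketch with a made-up 2×2 matrix having two distinct eigenvalues):

```python
import numpy as np

# Hypothetical operator with dim(V) = 2 distinct eigenvalues (2 and 3).
T = np.array([[2.0, 1.0],
              [0.0, 3.0]])
eigvals, eigvecs = np.linalg.eig(T)

# Two distinct eigenvalues -> the eigenvectors form a basis,
# and in that basis T is diagonal.
assert len(set(np.round(eigvals.real, 8))) == 2
D = np.linalg.inv(eigvecs) @ T @ eigvecs
assert np.allclose(D, np.diag(eigvals))
```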