Chapter 8 - Operators on Complex Vector Spaces

8.A: Generalized Eigenvectors and Nilpotent Operators

Null Spaces of Powers of an Operator

Sequence of increasing null spaces

Suppose $T \in \mathcal{L}(V)$. Then:

$$\{0\} = \operatorname{null}(T^0) \subseteq \operatorname{null}(T^1) \subseteq \cdots \subseteq \operatorname{null}(T^k) \subseteq \operatorname{null}(T^{k+1}) \subseteq \cdots$$

Proof
Suppose $k$ is a nonnegative integer and $v \in \operatorname{null}(T^k)$. Then $T^k v = 0$ and hence $T^{k+1} v = T(T^k v) = 0$, so $v \in \operatorname{null}(T^{k+1})$. Hence $\operatorname{null}(T^k) \subseteq \operatorname{null}(T^{k+1})$.

Equality in the sequence of null spaces

Suppose $T \in \mathcal{L}(V)$ and $m$ is a nonnegative integer such that $\operatorname{null}(T^m) = \operatorname{null}(T^{m+1})$. Then:

$$\operatorname{null}(T^m) = \operatorname{null}(T^{m+1}) = \operatorname{null}(T^{m+2}) = \cdots$$

Proof
Let $k \in \mathbb{Z}^+$. We want to show:

$$\operatorname{null}(T^{m+k}) = \operatorname{null}(T^{m+k+1})$$

We know from the sequence of increasing null spaces that $\operatorname{null}(T^{m+k}) \subseteq \operatorname{null}(T^{m+k+1})$.

To prove the other inclusion, suppose $v \in \operatorname{null}(T^{m+k+1})$. Then:

$$T^{m+1}(T^k v) = T^{m+k+1} v = 0$$

Hence:

$$T^k v \in \operatorname{null}(T^{m+1}) = \operatorname{null}(T^m)$$

Thus $T^{m+k} v = T^m(T^k v) = 0$, so $v \in \operatorname{null}(T^{m+k})$. This shows $\operatorname{null}(T^{m+k+1}) \subseteq \operatorname{null}(T^{m+k})$, so we have equality.

Null spaces stop growing

Suppose $T \in \mathcal{L}(V)$. Let $n = \dim(V)$. Then:

$$\operatorname{null}(T^n) = \operatorname{null}(T^{n+1}) = \operatorname{null}(T^{n+2}) = \cdots$$

Proof
Using equality in the sequence of null spaces will give the final result; we just have to show $\operatorname{null}(T^n) = \operatorname{null}(T^{n+1})$. Suppose this were not true. Then by the two previous lemmas, every inclusion up to stage $n$ must be strict (if any were an equality, the chain would stabilize there):

$$\{0\} = \operatorname{null}(T^0) \subsetneq \operatorname{null}(T^1) \subsetneq \cdots \subsetneq \operatorname{null}(T^n) \subsetneq \operatorname{null}(T^{n+1})$$

The dimensions then satisfy:

$$\dim(\operatorname{null}(T^0)) < \dim(\operatorname{null}(T^1)) < \cdots < \dim(\operatorname{null}(T^n)) < \dim(\operatorname{null}(T^{n+1}))$$

At each step of the way the dimension must increase by at least 1 (because we have $\subsetneq$ instead of $\subseteq$'s and thus $<$ rather than $\le$). Thus after $n+1$ steps, $\dim(\operatorname{null}(T^{n+1})) > n$, which is a contradiction since $\operatorname{null}(T^{n+1})$ is a subspace of $V$, whose dimension is $n$.
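This grow-then-freeze behavior can be checked numerically. Below is a minimal sketch (mine, not from the text), using a hypothetical $4 \times 4$ nilpotent Jordan block and computing $\dim \operatorname{null}(T^k) = \dim V - \operatorname{rank}(T^k)$:

```python
import numpy as np

# Hypothetical example: a 4x4 nilpotent Jordan block (ones on the superdiagonal).
T = np.diag([1.0, 1.0, 1.0], k=1)
n = T.shape[0]

dims = []
P = np.eye(n)
for k in range(n + 2):
    # dim null(T^k) = dim V - rank(T^k), by the Fundamental Theorem of Linear Maps
    dims.append(int(n - np.linalg.matrix_rank(P)))
    P = P @ T

print(dims)  # [0, 1, 2, 3, 4, 4] -- strictly increasing until k = n = 4, then constant
```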

While in general it's not true that $V = \operatorname{null}(T) \oplus \operatorname{range}(T)$, the next result can be used as a substitute:

$V$ is the direct sum of $\operatorname{null}(T^{\dim V})$ and $\operatorname{range}(T^{\dim V})$

Suppose $T \in \mathcal{L}(V)$. Let $n = \dim(V)$. Then:

$$V = \operatorname{null}(T^n) \oplus \operatorname{range}(T^n)$$

Proof
First we show the intersection is trivial. Suppose $v \in \operatorname{null}(T^n) \cap \operatorname{range}(T^n)$. Then $T^n v = 0$ and there exists $u \in V$ with $T^n u = v$. Applying $T^n$ to both sides of the latter gives $T^{2n} u = T^n v = 0$. Thus $T^n u = 0$, by the fact that null spaces stop growing ($\operatorname{null}(T^{2n}) = \operatorname{null}(T^n)$). Hence $v = T^n u = 0$, so the intersection is trivial.

Since the intersection is trivial, the sum is a direct sum. Also:

$$\dim(\operatorname{null}(T^n) \oplus \operatorname{range}(T^n)) = \dim(\operatorname{null}(T^n)) + \dim(\operatorname{range}(T^n)) = \dim(V)$$

via Chapter 3 (cont.) - Products and Quotients of Vector Spaces#^5193d7 and via the Fundamental Theorem of Linear Maps.
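As a quick numerical illustration (a sketch with an arbitrarily chosen matrix, not from the text): the direct sum forces $\operatorname{range}(T^n) = \operatorname{range}(T^{2n})$, i.e. $\operatorname{rank}(T^n) = \operatorname{rank}(T^{2n})$, which is easy to check:

```python
import numpy as np

# Hypothetical operator on C^4 (real entries for simplicity): a nilpotent
# 2x2 block plus an invertible 2x2 block.
T = np.array([[0, 1, 0, 0],
              [0, 0, 0, 0],
              [0, 0, 5, 1],
              [0, 0, 0, 5]], dtype=float)
n = T.shape[0]
Tn = np.linalg.matrix_power(T, n)
# rank(T^n) == rank(T^{2n}) is equivalent to null(T^n) and range(T^n) meeting trivially
print(np.linalg.matrix_rank(Tn) == np.linalg.matrix_rank(Tn @ Tn))  # True
```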

Generalized Eigenvectors

More often than we desire, we don't have enough eigenvectors to lead to diagonalization; we need something more general. Let's examine this issue by fixing $T \in \mathcal{L}(V)$. We wish to describe $V$ as a direct sum of simpler spaces:

$$V = \bigoplus_{i=1}^{m} U_i$$

where each $U_i$ is a subspace of $V$ invariant under $T$. The simplest possible $U_i$'s would be 1-dimensional, but this is possible iff $V$ has a basis consisting of eigenvectors of $T$, which holds iff $V$ has an eigenspace decomposition:

$$V = \bigoplus_{i=1}^{m} E(\lambda_i, T)$$

where $\lambda_1, \dots, \lambda_m$ are the distinct eigenvalues of $T$. The Chapter 7 - Operators on Inner Product Spaces#^6995f7 says that if $V$ is an inner product space, then a decomposition of this form holds for every normal operator if $F = \mathbb{C}$ and for every self-adjoint operator if $F = \mathbb{R}$, as these operators have enough eigenvectors to form a basis of $V$.

But the form above may not always hold, so we need to expand our definition of eigenvectors.

generalized eigenvector

Suppose $T \in \mathcal{L}(V)$ and $\lambda$ is an eigenvalue of $T$. A vector $v \in V$ is called a generalized eigenvector of $T$ corresponding to $\lambda$ if $v \neq 0$ and:

$$(T - \lambda I)^j v = 0$$

for some $j \in \mathbb{Z}^+$.

Although $j$ is arbitrary here, we will soon prove that every generalized eigenvector satisfies this equation with $j = \dim(V)$.

generalized eigenspace, G(λ,T)

Suppose $T \in \mathcal{L}(V)$ and $\lambda \in F$. The generalized eigenspace of $T$ corresponding to $\lambda$, denoted $G(\lambda, T)$, is defined to be the set of all generalized eigenvectors of $T$ corresponding to $\lambda$, along with the $0$ vector.

Because every eigenvector of $T$ is a generalized eigenvector of $T$ (take $j = 1$ in the definition), each eigenspace is contained in the corresponding generalized eigenspace (i.e. $E(\lambda, T) \subseteq G(\lambda, T)$). The next result implies that $G(\lambda, T)$ is a subspace of $V$, since the null space of each linear map on $V$ is a subspace of $V$.

Description of generalized eigenspaces

Suppose $T \in \mathcal{L}(V)$ and $\lambda \in F$. Then $G(\lambda, T) = \operatorname{null}((T - \lambda I)^{\dim V})$.

Proof
Suppose $v \in \operatorname{null}((T - \lambda I)^{\dim V})$. Using the definition with $j = \dim(V)$, we get $v \in G(\lambda, T)$, so $\operatorname{null}((T - \lambda I)^{\dim V}) \subseteq G(\lambda, T)$.

For the other inclusion, suppose $v \in G(\lambda, T)$, so there is some $j \in \mathbb{Z}^+$ with $v \in \operatorname{null}((T - \lambda I)^j)$. From Chapter 8 - Operators on Complex Vector Spaces#^3fdb93 and Chapter 8 - Operators on Complex Vector Spaces#^56d370, using $T - \lambda I$ in place of $T$, we get $v \in \operatorname{null}((T - \lambda I)^{\dim V})$, showing $G(\lambda, T) \subseteq \operatorname{null}((T - \lambda I)^{\dim V})$.

As an example, consider the operator $T \in \mathcal{L}(\mathbb{C}^3)$ defined by:

$$T(z_1, z_2, z_3) = (4z_2, 0, 5z_3)$$

Using the definition of eigenvalue shows that $T$'s eigenvalues are $0$ and $5$. The corresponding eigenspaces are $E(0, T) = \{(z_1, 0, 0) : z_1 \in \mathbb{C}\}$ and $E(5, T) = \{(0, 0, z_3) : z_3 \in \mathbb{C}\}$. There are clearly not enough eigenvectors to span $\mathbb{C}^3$.

But we have $T^3(z_1, z_2, z_3) = (0, 0, 125z_3)$, so the description of generalized eigenspaces implies that $G(0, T) = \{(z_1, z_2, 0) : z_1, z_2 \in \mathbb{C}\}$. We also have $(T - 5I)^3(z_1, z_2, z_3) = (-125z_1 + 300z_2, -125z_2, 0)$, so the same theorem says $G(5, T) = \{(0, 0, z_3) : z_3 \in \mathbb{C}\}$.

Because of this, $\mathbb{C}^3 = G(0, T) \oplus G(5, T)$. We want to show such a decomposition holds in general for complex vector spaces; we'll do this later on.
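These generalized eigenspace dimensions can be checked numerically. Below is a sketch; the helper `gen_eigenspace_dim` is my own, not from the text:

```python
import numpy as np

# Matrix of T(z1, z2, z3) = (4 z2, 0, 5 z3) in the standard basis of C^3.
M = np.array([[0, 4, 0],
              [0, 0, 0],
              [0, 0, 5]], dtype=float)
n = M.shape[0]

def gen_eigenspace_dim(M, lam):
    # dim G(lam, T) = dim null (T - lam I)^{dim V} = dim V - rank((T - lam I)^{dim V})
    A = np.linalg.matrix_power(M - lam * np.eye(n), n)
    return n - np.linalg.matrix_rank(A)

print(gen_eigenspace_dim(M, 0), gen_eigenspace_dim(M, 5))  # 2 1
```

The two dimensions sum to $3 = \dim \mathbb{C}^3$, consistent with the direct sum $\mathbb{C}^3 = G(0, T) \oplus G(5, T)$.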

Linearly Independent generalized eigenvectors

Let $T \in \mathcal{L}(V)$. Suppose $\lambda_1, \dots, \lambda_m$ are distinct eigenvalues of $T$ and $v_1, \dots, v_m$ are corresponding generalized eigenvectors. Then $v_1, \dots, v_m$ is linearly independent.

Proof
See Lecture 22 - Finishing G. Eigenspaces, Starting 8.B#^73f19f.

Nilpotent Operators

nilpotent

An operator is called nilpotent if some power of it equals 0.

For instance, the operator $N \in \mathcal{L}(F^4)$ defined by:

$$N(z_1, z_2, z_3, z_4) = (z_3, z_4, 0, 0)$$

is nilpotent because $N^2 = 0$.
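A quick numerical check of this example (sketch):

```python
import numpy as np

# Matrix of N(z1, z2, z3, z4) = (z3, z4, 0, 0) in the standard basis of F^4.
N = np.array([[0, 0, 1, 0],
              [0, 0, 0, 1],
              [0, 0, 0, 0],
              [0, 0, 0, 0]], dtype=float)
print(np.count_nonzero(N @ N))  # 0, i.e. N^2 = 0, so N is nilpotent
```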

Nilpotent operator raised to dimension of domain is 0

Suppose $N \in \mathcal{L}(V)$ is nilpotent. Then $N^{\dim V} = 0$.

Proof
Because $N$ is nilpotent, every nonzero vector of $V$ is annihilated by some power of $N$, so $G(0, N) = V$. Thus by the description of generalized eigenspaces, $\operatorname{null}(N^{\dim V}) = V$, i.e. $N^{\dim V} = 0$.

Given an operator $T$ on $V$, we want to find a basis of $V$ such that the matrix of $T$ with respect to this basis is as simple as possible, so hopefully $\mathcal{M}(T)$ has a lot of $0$'s.

Matrix of a nilpotent operator

Suppose $N$ is a nilpotent operator on $V$. Then there is a basis of $V$ with respect to which the matrix of $N$ has the form:

$$\begin{bmatrix} 0 & & * \\ & \ddots & \\ 0 & & 0 \end{bmatrix}$$

so all entries on and below the diagonal are $0$'s.

Proof
First choose a basis of $\operatorname{null}(N)$. Then extend this to a basis of $\operatorname{null}(N^2)$, then extend to a basis of $\operatorname{null}(N^3)$. Continue in this fashion, eventually getting a basis of $V$, since $\operatorname{null}(N^{\dim V}) = V$.

Now let's think about the matrix of $N$ with respect to this basis. The first columns, corresponding to the basis vectors in $\operatorname{null}(N)$, consist of all $0$'s. The next set of columns comes from the basis vectors in $\operatorname{null}(N^2)$; applying $N$ to any such vector, we get a vector in $\operatorname{null}(N)$, which is a linear combination of the previous basis vectors. Thus all nonzero entries in these columns lie above the diagonal. The next set of columns comes from the basis vectors in $\operatorname{null}(N^3)$; applying $N$ to any such vector, we get a vector in $\operatorname{null}(N^2)$, which again is a linear combination of the previous basis vectors. Thus once again, all nonzero entries in these columns lie above the diagonal. Continue in this fashion to get the result.

8.B: Decomposition of an Operator

Description of Operators on Complex Vector Spaces

We saw prior that we may lack enough eigenvectors to have a decomposition of $V$ into eigenspaces. But we observed that for $T \in \mathcal{L}(V)$, both $\operatorname{null}(T)$ and $\operatorname{range}(T)$ are invariant under $T$, via the nullspace proof and the range proof. Now we can show that the null space and the range of each polynomial of $T$ are also invariant under $T$:

The null space and range of p(T) are invariant under T

Suppose $T \in \mathcal{L}(V)$ and $p \in \mathcal{P}(F)$. Then $\operatorname{null}(p(T))$ and $\operatorname{range}(p(T))$ are invariant under $T$.

Proof
Suppose $v \in \operatorname{null}(p(T))$, so $p(T)v = 0$. Thus:

$$p(T)(Tv) = T(p(T)v) = T(0) = 0$$

Hence $Tv \in \operatorname{null}(p(T))$, so it's invariant under $T$.

Now suppose $v \in \operatorname{range}(p(T))$, so there exists $w \in V$ with $p(T)w = v$. Then:

$$Tv = T(p(T)w) = p(T)(Tw)$$

So clearly $Tv \in \operatorname{range}(p(T))$, so it's invariant under $T$.

The following result shows that every operator on a complex vector space can be thought of as composed of pieces, each of which is a nilpotent operator plus a scalar multiple of the identity.

Description of operators on complex vector spaces

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $T$. Then:

  • $V = \bigoplus_{i=1}^{m} G(\lambda_i, T)$
  • Each $G(\lambda_i, T)$ is invariant under $T$.
  • Each $(T - \lambda_i I)|_{G(\lambda_i, T)}$ is nilpotent.

Proof
Let $n = \dim(V)$ for clarity.

(b), (c): Recall that:

$$G(\lambda_i, T) = \operatorname{null}((T - \lambda_i I)^n)$$

for each $i$, via the description of generalized eigenspaces. Using our previous lemma saying the null space of a polynomial of $T$ is invariant under $T$, but with:

$$p(z) = (z - \lambda_i)^n$$

we get (b). (c) comes from the definitions: if $v \in G(\lambda_i, T)$ then $(T - \lambda_i I)^n v = 0$, so:

$$\left((T - \lambda_i I)|_{G(\lambda_i, T)}\right)^n = 0 \quad \text{(the zero operator)}$$

showing nilpotency.

(a): See (a) proof from this proof.

A basis of generalized eigenvectors

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Then there is a basis of $V$ consisting of generalized eigenvectors of $T$.

Proof
Choose a basis of each $G(\lambda_i, T)$ via our previous lemma. Put all these bases together to form a basis of $V$ consisting of generalized eigenvectors of $T$. This is covered more specifically via this in-depth look.

Multiplicity of an Eigenvalue

If $V$ is a complex vector space and $T \in \mathcal{L}(V)$, then the decomposition of $V$ provided by the description of operators on complex vector spaces can be a powerful tool. The dimensions of the subspaces involved are important enough that they get their own name:

multiplicity

Suppose $T \in \mathcal{L}(V)$. The multiplicity of an eigenvalue $\lambda$ of $T$ is defined to be the dimension of the corresponding generalized eigenspace $G(\lambda, T)$. In other words, the multiplicity of an eigenvalue $\lambda$ of $T$ equals $\dim(\operatorname{null}((T - \lambda I)^{\dim V}))$.

Notice that using Chapter 8 - Operators on Complex Vector Spaces#^90d6af, we can justify the second sentence in the definition.

As an example, consider the operator $T \in \mathcal{L}(\mathbb{C}^3)$ given by:

$$T(z_1, z_2, z_3) = (6z_1 + 3z_2 + 4z_3,\ 6z_2 + 2z_3,\ 7z_3)$$

Here:

$$\mathcal{M}(T) = \begin{bmatrix} 6 & 3 & 4 \\ 0 & 6 & 2 \\ 0 & 0 & 7 \end{bmatrix}$$

The eigenvalues of $T$ are $6$ and $7$, read off the diagonal of this upper-triangular matrix. The generalized eigenspaces are:

$$G(6, T) = \operatorname{span}((1, 0, 0), (0, 1, 0)), \qquad G(7, T) = \operatorname{span}((10, 2, 1))$$

Thus $\lambda = 6$ has multiplicity $2$ while $\lambda = 7$ has multiplicity $1$. The direct sum of our $G$'s gives a decomposition of $\mathbb{C}^3$. Thus a basis for $\mathbb{C}^3$ is:

$$\{(1, 0, 0),\ (0, 1, 0),\ (10, 2, 1)\}$$

You may ask if the sum of the multiplicities always equals the dimension. It turns out it does!

Sum of the multiplicities equals dim(V)

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Then the sum of the multiplicities of all eigenvalues of $T$ equals $\dim(V)$.

Proof
Use our lemma giving the description of operators on complex vector spaces, combined with the fact that dimensions add across direct sums.

The terms algebraic multiplicity and geometric multiplicity appear in many books: in our terminology, the algebraic multiplicity of $\lambda$ is $\dim G(\lambda, T)$ (our multiplicity), while the geometric multiplicity of $\lambda$ is $\dim E(\lambda, T)$.

Block Diagonal Matrices

To interpret our results in matrix form, we make the following definition, generalizing the notion of a diagonal matrix. In the case where all the blocks $A_j$ are $1 \times 1$, we actually have a diagonal matrix.

block diagonal matrix

A block diagonal matrix is a square matrix of the form:

$$\begin{bmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \end{bmatrix}$$

where $A_1, \dots, A_m$ are square matrices lying along the diagonal and all the other entries of the matrix equal $0$.

For instance, the $5 \times 5$ matrix:

$$A = \begin{bmatrix} \begin{bmatrix} 4 \end{bmatrix} & 0 & 0 \\ 0 & \begin{bmatrix} 2 & -3 \\ 0 & 2 \end{bmatrix} & 0 \\ 0 & 0 & \begin{bmatrix} 1 & 7 \\ 0 & 1 \end{bmatrix} \end{bmatrix}$$

is a block diagonal matrix of the form:

$$\begin{bmatrix} A_1 & 0 & 0 \\ 0 & A_2 & 0 \\ 0 & 0 & A_3 \end{bmatrix}$$

Block diagonal matrix with upper-triangular blocks

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $T$, with multiplicities $d_1, \dots, d_m$. Then there is a basis of $V$ with respect to which $T$ has a block diagonal matrix like seen above:

$$\begin{bmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \end{bmatrix}$$

where each $A_j$ is a $d_j \times d_j$ upper-triangular matrix:

$$A_j = \begin{bmatrix} \lambda_j & & * \\ & \ddots & \\ 0 & & \lambda_j \end{bmatrix}$$

Proof
Each $(T - \lambda_j I)|_{G(\lambda_j, T)}$ is nilpotent via the description of operators on complex vector spaces (namely (c)). For each $j$, choose a basis of $G(\lambda_j, T)$ (a vector space of dimension $d_j$) such that the matrix of $(T - \lambda_j I)|_{G(\lambda_j, T)}$ with respect to this basis is strictly upper triangular, as in what the matrix of a nilpotent operator should look like. Then the matrix of $T|_{G(\lambda_j, T)}$, which equals $(T - \lambda_j I)|_{G(\lambda_j, T)} + \lambda_j I|_{G(\lambda_j, T)}$, with respect to this basis will look like the desired form for $A_j$.

Putting the bases of the $G(\lambda_j, T)$'s together gives a basis of $V$, via the description of operators on complex vector spaces (a). The matrix of $T$ with respect to this basis has our desired form.

For example, suppose $T \in \mathcal{L}(\mathbb{C}^3)$ is defined by:

$$T(z_1, z_2, z_3) = (6z_1 + 3z_2 + 4z_3,\ 6z_2 + 2z_3,\ 7z_3)$$

We have:

$$\mathcal{M}(T) = \begin{bmatrix} 6 & 3 & 4 \\ 0 & 6 & 2 \\ 0 & 0 & 7 \end{bmatrix}$$

We found that:

$$G(6, T) = \operatorname{span}((1, 0, 0), (0, 1, 0)), \qquad G(7, T) = \operatorname{span}((10, 2, 1))$$

We saw that the basis of $\mathbb{C}^3$ was:

$$\{(1, 0, 0),\ (0, 1, 0),\ (10, 2, 1)\}$$

The matrix of $T$ with respect to this basis is:

$$\begin{bmatrix} \begin{bmatrix} 6 & 3 \\ 0 & 6 \end{bmatrix} & 0 \\ 0 & \begin{bmatrix} 7 \end{bmatrix} \end{bmatrix}$$

Square Roots

Recall that a square root of an operator $T \in \mathcal{L}(V)$ is an operator $R \in \mathcal{L}(V)$ such that $R^2 = T$, via its definition. Every complex number has a square root, but not every operator on a complex vector space has a square root.

For example, the operator on $\mathbb{C}^3$ given by this problem has no square root; its non-invertibility has something to do with it. But first, we'll show that the identity plus any nilpotent operator has a square root.

Identity plus nilpotent has a square root

Suppose $N \in \mathcal{L}(V)$ is nilpotent. Then $I + N$ has a square root.

Proof
Consider the Taylor series for the function $\sqrt{1+x}$:

$$\sqrt{1+x} = 1 + a_1 x + a_2 x^2 + \cdots$$

We will not find an explicit formula for the coefficients or worry about convergence, because we use this equation only for motivation.

Because $N$ is nilpotent, $N^m = 0$ for some positive integer $m \in \mathbb{Z}^+$. In the equation for $\sqrt{1+x}$ up top, suppose we replace $x$ with $N$ and $1$ with $I$. Then the infinite sum on the right side becomes finite, since $N^m = N^{m+1} = \cdots = 0$:

$$\sqrt{I + N} \approx I + a_1 N + a_2 N^2 + \cdots + a_{m-1} N^{m-1}$$

Having made this guess, we try to choose $a_1, \dots, a_{m-1}$ such that the operator above has its square equal to $I + N$. Just apply this squaring process:

$$I + N = (I + a_1 N + a_2 N^2 + \cdots + a_{m-1} N^{m-1})^2 = I + 2a_1 N + (2a_2 + a_1^2) N^2 + (2a_3 + 2a_1 a_2) N^3 + \cdots + (2a_{m-1} + \cdots) N^{m-1}$$

We want the right side of the equation to equal $I + N$. Thus we should choose $a_1$ so that $2a_1 = 1$, i.e. $a_1 = \frac{1}{2}$. Next, choose $a_2$ such that $2a_2 + a_1^2 = 0$, i.e. $a_2 = -\frac{1}{8}$. Then choose $a_3$ such that $2a_3 + 2a_1 a_2 = 0$, i.e. $a_3 = \frac{1}{16}$. Continue in this manner for $j = 4, \dots, m-1$.

We don't actually care about the formula for each $a_j$; we just need to know that for each $j$ we can make a choice of $a_j$ giving a square root of $I + N$.
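The recursion above reproduces the Taylor coefficients of $\sqrt{1+x}$, namely $a_k = \binom{1/2}{k}$. Here is a sketch (my own, using a hypothetical Jordan-block $N$) verifying that the truncated series really squares to $I + N$:

```python
import numpy as np
from math import prod, factorial

def sqrt_coeff(k):
    # k-th Taylor coefficient of sqrt(1 + x): the binomial coefficient binom(1/2, k)
    return prod(0.5 - i for i in range(k)) / factorial(k)

# Hypothetical nilpotent example: a 4x4 Jordan block, so N^4 = 0 and the
# series for sqrt(I + N) truncates after the N^3 term.
N = np.diag([1.0, 1.0, 1.0], k=1)
n = N.shape[0]

R = sum(sqrt_coeff(k) * np.linalg.matrix_power(N, k) for k in range(n))
print(np.allclose(R @ R, np.eye(n) + N))  # True: R^2 = I + N
```

Note that `sqrt_coeff(1)`, `sqrt_coeff(2)`, `sqrt_coeff(3)` come out to $\frac{1}{2}$, $-\frac{1}{8}$, $\frac{1}{16}$, matching the choices made in the proof.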

The previous result works on real as well as complex vector spaces. However, the next result holds only on complex vector spaces. For example, the operator of multiplication by $-1$ on the 1-dimensional real vector space $\mathbb{R}$ has no square root.

Over C, invertible operators have square roots

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$ is invertible. Then $T$ has a square root.

Proof
Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $T$. For each $j$ there exists a nilpotent operator $N_j \in \mathcal{L}(G(\lambda_j, T))$ such that $T|_{G(\lambda_j, T)} = \lambda_j I + N_j$, via the description of operators on complex vector spaces. Because $T$ is invertible, $\lambda_j \neq 0$ for all $j$, so:

$$T|_{G(\lambda_j, T)} = \lambda_j \left(I + \frac{N_j}{\lambda_j}\right)$$

for each $j$. Clearly $\frac{N_j}{\lambda_j}$ is nilpotent, so $I + \frac{N_j}{\lambda_j}$ has a square root using our freshly proved lemma. Multiplying a square root of the complex number $\lambda_j$ by a square root of $I + \frac{N_j}{\lambda_j}$, we obtain a square root $R_j$ of $T|_{G(\lambda_j, T)}$ (see the equation above). A typical vector $v \in V$ can be written uniquely in the form:

$$v = u_1 + \cdots + u_m$$

where each $u_j \in G(\lambda_j, T)$, because of the generalized eigenspace decomposition. Using this decomposition, define an operator $R \in \mathcal{L}(V)$ by:

$$Rv = R_1 u_1 + \cdots + R_m u_m$$

We can verify that this operator $R$ is a square root of $T$ by just applying $R^2$ (each $R_j u_j$ stays in $G(\lambda_j, T)$, so $R$ acts on it by $R_j$):

$$R^2 v = R(R_1 u_1 + \cdots + R_m u_m) = R_1^2 u_1 + \cdots + R_m^2 u_m = T|_{G(\lambda_1, T)} u_1 + \cdots + T|_{G(\lambda_m, T)} u_m = Tv$$

8.C: Characteristic and Minimal Polynomials

characteristic polynomial

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$ denote the distinct eigenvalues of $T$, with multiplicities $d_1, \dots, d_m$. The polynomial:

$$(z - \lambda_1)^{d_1} \cdots (z - \lambda_m)^{d_m}$$

is called the characteristic polynomial of $T$.

Example

Suppose $T \in \mathcal{L}(\mathbb{C}^3)$ is defined as before. Because the eigenvalues of $T$ are $6$ with multiplicity $2$, and $7$ with multiplicity $1$, the characteristic polynomial of $T$ is $(z - 6)^2 (z - 7)$.

Degree and zeros of characteristic polynomial

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Then:

  • the characteristic polynomial of $T$ has degree $\dim(V)$
  • the zeros of the characteristic polynomial of $T$ are the eigenvalues of $T$.

Proof
(a): Use the fact that the multiplicities add up to $\dim(V)$: the degree of the characteristic polynomial is $d_1 + \cdots + d_m = \dim(V)$.

(b): Use the definition and plug in $z := \lambda_i$ for each $i$.

We can now prove a really cool theorem easily, without determinants:

Cayley-Hamilton Theorem

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Let $q$ denote the characteristic polynomial of $T$. Then $q(T) = 0$.

Proof
Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of the operator $T$, and let $d_1, \dots, d_m$ be the dimensions of the corresponding generalized eigenspaces $G(\lambda_1, T), \dots, G(\lambda_m, T)$. For each $1 \le j \le m$ we know that $(T - \lambda_j I)|_{G(\lambda_j, T)}$ is nilpotent, so $\left((T - \lambda_j I)|_{G(\lambda_j, T)}\right)^{d_j} = 0$, using the fact that a nilpotent operator raised to the dimension of its domain is the zero operator.

Every vector in $V$ is a unique sum of vectors from the $G(\lambda_j, T)$'s, by how operators are described via generalized eigenspaces. To prove $q(T) = 0$, we need only show that $q(T)|_{G(\lambda_j, T)} = 0$ for each $j$. We have:

$$q(T) = (T - \lambda_1 I)^{d_1} \cdots (T - \lambda_m I)^{d_m}$$

The operators on the right side of the equation all commute, so we can move the factor $(T - \lambda_j I)^{d_j}$ to be the last term; since $\left((T - \lambda_j I)^{d_j}\right)|_{G(\lambda_j, T)} = 0$, we conclude that $q(T)|_{G(\lambda_j, T)} = 0$, as desired.
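We can sanity-check Cayley-Hamilton numerically on the earlier example (a sketch, not part of the text):

```python
import numpy as np

# Running example M(T) = [[6,3,4],[0,6,2],[0,0,7]], whose characteristic
# polynomial is q(z) = (z - 6)^2 (z - 7).
T = np.array([[6, 3, 4],
              [0, 6, 2],
              [0, 0, 7]], dtype=float)
I = np.eye(3)
q_of_T = (T - 6*I) @ (T - 6*I) @ (T - 7*I)
print(np.allclose(q_of_T, 0))  # True: q(T) = 0
```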

The Minimal Polynomial

monic polynomial

A monic polynomial is a polynomial whose highest-degree coefficient equals $1$.

For instance, the polynomial $2 + 9z^2 + z^7$ is a monic polynomial of degree $7$ (since the coefficient of $z^7$ is $1$).

Minimal Polynomial

Suppose $T \in \mathcal{L}(V)$. Then there is a unique monic polynomial $p$ of smallest degree such that $p(T) = 0$.

Proof
Let $n = \dim(V)$. Then the list:

$$I, T, T^2, \dots, T^{n^2}$$

is not linearly independent in $\mathcal{L}(V)$, because the vector space $\mathcal{L}(V)$ has dimension $n^2$ (Chapter 3 - Linear Maps#^ee27e8) while we have a list of length $n^2 + 1$. Let $m$ be the smallest positive integer such that:

$$I, T, T^2, \dots, T^m$$

is linearly dependent. The Linear Dependence Lemma implies that one of the operators above is a linear combination of the previous ones, and because $m$ was chosen to be the smallest positive integer where the above list is linearly dependent, we conclude that $T^m$ is a linear combination of $I, T, \dots, T^{m-1}$. So there exist $a_0, \dots, a_{m-1} \in F$ such that:

$$a_0 I + a_1 T + \cdots + a_{m-1} T^{m-1} + T^m = 0$$

Define the monic polynomial $p \in \mathcal{P}(F)$ by:

$$p(z) = a_0 + a_1 z + \cdots + a_{m-1} z^{m-1} + z^m$$

Plugging $z := T$ into the polynomial and using our equation above, we conclude that $p(T) = 0$.

To show uniqueness of $p$, note that the choice of $m$ implies that no monic polynomial $q \in \mathcal{P}(F)$ with degree smaller than $m$ can satisfy $q(T) = 0$. Suppose $q \in \mathcal{P}(F)$ is a monic polynomial with degree $m$ and $q(T) = 0$. Then $(p - q)(T) = 0$ and $\deg(p - q) < m$. The choice of $m$ now implies that $p - q = 0$, i.e. $q = p$.

minimal polynomial

Suppose $T \in \mathcal{L}(V)$. Then the minimal polynomial of $T$ is the unique monic polynomial $p$ of smallest degree such that $p(T) = 0$.

The proof above shows that each operator on $V$ has a minimal polynomial of degree at most $\dim(V)^2$. The Cayley-Hamilton Theorem tells us that if $V$ is a complex vector space, then the minimal polynomial of each operator on $V$ has degree at most $\dim(V)$. This remarkable improvement also holds on real vector spaces.

Programming a computer

Suppose you are given $T \in \mathcal{L}(V)$, and thus $\mathcal{M}(T)$. You can program a computer to find the minimal polynomial of $T$ by considering the system of equations:

$$a_0 \mathcal{M}(I) + a_1 \mathcal{M}(T) + \cdots + a_{m-1} \mathcal{M}(T)^{m-1} = -\mathcal{M}(T)^m$$

for successive values $m = 1, 2, \dots$ until the system has a solution $a_0, \dots, a_{m-1}$, which then are the coefficients of the minimal polynomial of $T$. Each system can be solved via Gaussian elimination or similar methods.
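Here is a sketch of that procedure in Python with numpy. The helper name `minimal_polynomial` is mine, and it uses least squares with a residual check in place of Gaussian elimination:

```python
import numpy as np

def minimal_polynomial(M, tol=1e-8):
    """Return the coefficients (a_0, ..., a_{m-1}, 1) of the minimal
    polynomial of M, by solving
        a_0 I + a_1 M + ... + a_{m-1} M^{m-1} = -M^m
    in the least-squares sense for m = 1, 2, ... until it is consistent."""
    n = M.shape[0]
    powers = [np.eye(n)]
    for m in range(1, n * n + 1):
        powers.append(powers[-1] @ M)
        # Flatten M^0, ..., M^{m-1} into the columns of A; the target is -M^m.
        A = np.column_stack([P.ravel() for P in powers[:m]])
        b = -powers[m].ravel()
        a, *_ = np.linalg.lstsq(A, b, rcond=None)
        if np.linalg.norm(A @ a - b) < tol:
            return np.append(a, 1.0)  # monic: append the leading coefficient 1
    raise RuntimeError("no annihilating polynomial found")

# Example: the matrix of T(z1,z2,z3) = (6z1+3z2+4z3, 6z2+2z3, 7z3);
# its minimal polynomial is (z-6)^2 (z-7) = z^3 - 19 z^2 + 120 z - 252.
T = np.array([[6, 3, 4], [0, 6, 2], [0, 0, 7]], dtype=float)
print(minimal_polynomial(T))  # approximately [-252, 120, -19, 1]
```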

Example

Let $T$ be the operator on $\mathbb{C}^5$ whose matrix w.r.t. the standard basis is:

$$\begin{bmatrix} 0 & 0 & 0 & 0 & -3 \\ 1 & 0 & 0 & 0 & 6 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}$$

The minimal polynomial can be calculated by trying the powers $\mathcal{M}(T)^m$ for increasing $m$. To save you some trouble, there's no solution until $m = 5$ in this case, where:

$$a_0 \mathcal{M}(I) + \cdots + a_4 \mathcal{M}(T)^4 = -\mathcal{M}(T)^5$$

Solving quickly gives $a_0 = 3$, $a_1 = -6$, $a_2 = a_3 = a_4 = 0$, so the minimal polynomial is $z^5 - 6z + 3$.

$q(T) = 0$ iff $q$ is a multiple of the minimal polynomial

Suppose $T \in \mathcal{L}(V)$ and $q \in \mathcal{P}(F)$. Then $q(T) = 0$ iff $q$ is a polynomial multiple of the minimal polynomial of $T$.

Proof
Let $p$ denote the minimal polynomial of $T$.

First we prove ($\Leftarrow$). Suppose $q$ is a polynomial multiple of $p$, so there exists $s \in \mathcal{P}(F)$ such that $q = ps$. Thus:

$$q(T) = p(T) s(T) = 0 \cdot s(T) = 0$$

as desired.

For ($\Rightarrow$), suppose $q(T) = 0$. By the division algorithm for polynomials, there exist $s, r \in \mathcal{P}(F)$ such that:

$$q = ps + r$$

with $\deg(r) < \deg(p)$. We have:

$$0 = q(T) = p(T) s(T) + r(T) = r(T)$$

This implies that $r = 0$, since otherwise dividing $r$ by its highest-degree coefficient would produce a monic polynomial that, when applied to $T$, gives $0$; this polynomial would have a smaller degree than the minimal polynomial, creating a contradiction.

Thus $q = ps$, so $q$ is a polynomial multiple of $p$, as desired.

Characteristic polynomial is a multiple of minimal polynomial

Suppose $F = \mathbb{C}$ and $T \in \mathcal{L}(V)$. Then the characteristic polynomial of $T$ is a polynomial multiple of the minimal polynomial of $T$.

Proof
Use Cayley-Hamilton to get that the characteristic polynomial $q$ satisfies $q(T) = 0$, then use the lemma just proved to show that $q$ is a multiple of the minimal polynomial of $T$.

Eigenvalues are the zeroes of the minimal polynomial

Let $T \in \mathcal{L}(V)$. Then the zeros of the minimal polynomial of $T$ are precisely the eigenvalues of $T$.

Proof
Let:

$$p(z) = a_0 + a_1 z + \cdots + a_{m-1} z^{m-1} + z^m$$

be the minimal polynomial of $T$.

First, to prove ($\Rightarrow$), suppose $\lambda \in F$ is a zero of $p$. Then $p$ can be written as:

$$p(z) = (z - \lambda) q(z)$$

for some monic polynomial $q \in \mathcal{P}(F)$. Because $p(T) = 0$:

$$0 = p(T) = (T - \lambda I) q(T)$$

Because $\deg(q) < \deg(p)$, there exists $v \in V$ such that $q(T)v \neq 0$. The equation above shows $q(T)v \in \operatorname{null}(T - \lambda I)$, so $q(T)v$ is an eigenvector of $T$. Thus $\lambda$ must be an eigenvalue of $T$.

To prove ($\Leftarrow$), suppose $\lambda \in F$ is an eigenvalue of $T$, so there exists $v \neq 0$ with $Tv = \lambda v$. Repeatedly applying $T$ to both sides shows that $T^j v = \lambda^j v$ for all $j \in \mathbb{Z}^+$. Thus:

$$0 = p(T)v = (a_0 I + a_1 T + \cdots + a_{m-1} T^{m-1} + T^m)v = (a_0 + a_1 \lambda + \cdots + a_{m-1} \lambda^{m-1} + \lambda^m)v = p(\lambda) v$$

Since $v \neq 0$, $p(\lambda) = 0$, as desired.

Let's do some examples:

Example

Find the minimal polynomial of $T \in \mathcal{L}(\mathbb{C}^3)$ given:

$$\mathcal{M}(T) = \begin{bmatrix} 6 & 3 & 4 \\ 0 & 6 & 2 \\ 0 & 0 & 7 \end{bmatrix}$$

Proof
We found that $\lambda_1 = 6$ has multiplicity $d_1 = 2$ and $\lambda_2 = 7$ has multiplicity $d_2 = 1$, so the characteristic polynomial is:

$$p(z) = (z - 6)^2 (z - 7)$$

The minimal polynomial divides $p$ and has the same zeros, so it is either $(z - 6)(z - 7)$ or the characteristic polynomial itself. To check, we test whether $q(T) = 0$ for the lower-degree candidate $q(z) = (z - 6)(z - 7)$. Notice:

$$(T - 6I)(T - 7I) = T^2 - 13T + 42I \neq 0$$

Thus the minimal polynomial is the characteristic polynomial $p(z) = (z - 6)^2 (z - 7)$ itself.

Example

Find the minimal polynomial of the operator $T \in \mathcal{L}(\mathbb{C}^3)$ defined by $T(z_1, z_2, z_3) = (6z_1, 6z_2, 7z_3)$.

Proof
Clearly $\lambda_1 = 6$ with $d_1 = 2$ and $\lambda_2 = 7$ with $d_2 = 1$. For later computations, notice that:

$$T^2(z_1, z_2, z_3) = (36z_1, 36z_2, 49z_3)$$

Now check whether $q(z) = (z - 6)(z - 7) = z^2 - 13z + 42$ is the minimal polynomial:

$$q(T)(z_1, z_2, z_3) = (T^2 - 13T + 42I)(z_1, z_2, z_3) = (36z_1 - 78z_1 + 42z_1,\ 36z_2 - 78z_2 + 42z_2,\ 49z_3 - 91z_3 + 42z_3) = (0, 0, 0)$$

Thus $q(T) = 0$, so $q(z) = (z - 6)(z - 7)$ is the minimal polynomial.
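Both examples can be verified numerically (a sketch; the matrices are taken from the examples above):

```python
import numpy as np

I = np.eye(3)

# First example: M(T) = [[6,3,4],[0,6,2],[0,0,7]]; here (T-6I)(T-7I) != 0,
# so the minimal polynomial must be the full (z-6)^2 (z-7).
T1 = np.array([[6, 3, 4], [0, 6, 2], [0, 0, 7]], dtype=float)
print(np.allclose((T1 - 6*I) @ (T1 - 7*I), 0))  # False

# Second example: T(z1,z2,z3) = (6z1, 6z2, 7z3); here (T-6I)(T-7I) = 0,
# so the minimal polynomial drops to (z-6)(z-7).
T2 = np.diag([6.0, 6.0, 7.0])
print(np.allclose((T2 - 6*I) @ (T2 - 7*I), 0))  # True
```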

8.D: Jordan Form

We know that if $V$ is a complex vector space, then every $T \in \mathcal{L}(V)$ has a basis $\beta$ of $V$ w.r.t. which $\mathcal{M}(T)$ is upper triangular. In this section, we can get even more $0$'s above the diagonal. To show what we mean, consider the nilpotent operator $N \in \mathcal{L}(\mathbb{C}^4)$ defined by:

$$N(z_1, z_2, z_3, z_4) = (0, z_1, z_2, z_3)$$

Taking $N^3 v, N^2 v, N v, v$ (with $v = (1, 0, 0, 0)$) as our basis gives:

$$\mathcal{M}(N) = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

The idea here is that we can always get a diagonal of eigenvalues (in this case all $0$'s), with square blocks along the diagonal whose only other nonzero entries are $1$'s just above the diagonal. As another example, consider the block diagonal matrix:

$$\begin{bmatrix} \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} & 0 & 0 \\ 0 & \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} & 0 \\ 0 & 0 & \begin{bmatrix} 0 \end{bmatrix} \end{bmatrix}$$

Here each submatrix along the diagonal has this same shape, making the whole matrix block diagonal of the kind we want. To obtain this in the general case, we use the following result:

Basis corresponding to a nilpotent operator

Suppose $N \in \mathcal{L}(V)$ is nilpotent. Then there exist vectors $v_1, \dots, v_n \in V$ and nonnegative integers $m_1, \dots, m_n$ such that:

  1. $N^{m_1} v_1, \dots, N v_1, v_1, \dots, N^{m_n} v_n, \dots, N v_n, v_n$ is a basis of $V$
  2. $N^{m_1 + 1} v_1 = \cdots = N^{m_n + 1} v_n = 0$

Proof
While Axler does an induction over $\dim(V)$, I do like the usage of quotient spaces used in Lecture 28 - Continuing Jordan Block Decomposition. I feel it gets at the whole "process" of how this result is derived, rather than going through the motions of an induction proof. While it's not a "proof" per se, the findings are easily generalized.

We want to define this "form" of a matrix, as well as the basis required to make it.

Jordan Basis

Suppose $T \in \mathcal{L}(V)$. A basis of $V$ is called a Jordan basis for $T$ if, w.r.t. this basis, $T$ has a block diagonal matrix:

$$\begin{bmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_p \end{bmatrix}$$

where each $A_j$ is an upper-triangular matrix of the form:

$$A_j = \begin{bmatrix} \lambda_j & 1 & & 0 \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ 0 & & & \lambda_j \end{bmatrix}$$

Jordan Form

Suppose $V$ is a complex vector space. If $T \in \mathcal{L}(V)$, then there is a basis of $V$ that is a Jordan basis for $T$.

Proof
First consider a nilpotent operator $N \in \mathcal{L}(V)$ and vectors $v_1, \dots, v_n \in V$ as given by the basis corresponding to a nilpotent operator. For each $j$, note that $N$ sends the first vector in the list:

$$N^{m_j} v_j, \dots, N v_j, v_j$$

to $0$, and sends each vector in this list other than the first to the previous vector. In other words, the basis:

$$\beta = \{N^{m_1} v_1, \dots, N v_1, v_1, \dots, N^{m_n} v_n, \dots, N v_n, v_n\}$$

is a basis for which $\mathcal{M}(N, \beta)$ is block diagonal, with each block of the desired form with $\lambda_j = 0$.

Now suppose $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $T$. We have the generalized eigenspace decomposition:

$$V = \bigoplus_{j=1}^{m} G(\lambda_j, T)$$

where each $(T - \lambda_j I)|_{G(\lambda_j, T)}$ is nilpotent, via the description of operators on complex vector spaces. Thus, applying the logic above with $N := (T - \lambda_j I)|_{G(\lambda_j, T)}$, some basis of each $G(\lambda_j, T)$ is a Jordan basis for $T|_{G(\lambda_j, T)}$. Putting these bases together gives a basis of $V$ that is a Jordan basis for $T$.