Chapter 8 - Operators on Complex Vector Spaces

8.A: Generalized Eigenvectors and Nilpotent Operators

Null Spaces of Powers of an Operator

Sequence of increasing null spaces

Suppose $T \in \mathcal{L}(V)$. Then:

$$\{0\} = \operatorname{null}(T^0) \subseteq \operatorname{null}(T^1) \subseteq \cdots \subseteq \operatorname{null}(T^k) \subseteq \operatorname{null}(T^{k+1}) \subseteq \cdots$$

Proof
Suppose $k$ is a nonnegative integer and $v \in \operatorname{null}(T^k)$. Then $T^k v = 0$ and hence $T^{k+1} v = T(T^k v) = 0$, so $v \in \operatorname{null}(T^{k+1})$. Hence $\operatorname{null}(T^k) \subseteq \operatorname{null}(T^{k+1})$.

Equality in the sequence of null spaces

Suppose $T \in \mathcal{L}(V)$ and $m$ is a nonnegative integer such that $\operatorname{null}(T^m) = \operatorname{null}(T^{m+1})$. Then:

$$\operatorname{null}(T^m) = \operatorname{null}(T^{m+1}) = \operatorname{null}(T^{m+2}) = \cdots$$

Proof
Let $k \in \mathbb{Z}^+$. We want to show:

$$\operatorname{null}(T^{m+k}) = \operatorname{null}(T^{m+k+1})$$

We know from the sequence of increasing null spaces that $\operatorname{null}(T^{m+k}) \subseteq \operatorname{null}(T^{m+k+1})$.

To prove the other inclusion, suppose $v \in \operatorname{null}(T^{m+k+1})$. Then:

$$T^{m+1}(T^k v) = T^{m+k+1} v = 0$$

Hence:

$$T^k v \in \operatorname{null}(T^{m+1}) = \operatorname{null}(T^m)$$

Thus $T^{m+k} v = T^m(T^k v) = 0$, so $v \in \operatorname{null}(T^{m+k})$. This shows $\operatorname{null}(T^{m+k+1}) \subseteq \operatorname{null}(T^{m+k})$, so we have equality.

Null spaces stop growing

Suppose $T \in \mathcal{L}(V)$. Let $n = \dim(V)$. Then:

$$\operatorname{null}(T^n) = \operatorname{null}(T^{n+1}) = \operatorname{null}(T^{n+2}) = \cdots$$

Proof
Using equality in the sequence of null spaces will give the final result; we just have to show $\operatorname{null}(T^n) = \operatorname{null}(T^{n+1})$. Suppose this were not true. Then by the two previous lemmas, every inclusion up to stage $n$ must be strict (if any were an equality, the chain would stabilize there):

$$\{0\} = \operatorname{null}(T^0) \subsetneq \operatorname{null}(T^1) \subsetneq \cdots \subsetneq \operatorname{null}(T^n) \subsetneq \operatorname{null}(T^{n+1})$$

The dimensions then satisfy:

$$\dim(\operatorname{null}(T^0)) < \dim(\operatorname{null}(T^1)) < \cdots < \dim(\operatorname{null}(T^n)) < \dim(\operatorname{null}(T^{n+1}))$$

At each step of the way the dimension must increase by at least 1 (because we have $\subsetneq$ instead of $\subseteq$'s and thus $<$ rather than $\le$). Thus after $n+1$ steps, $\dim(\operatorname{null}(T^{n+1})) > n$, which is a contradiction since $\operatorname{null}(T^{n+1})$ is a subspace of $V$, whose dimension is $n$.
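This grow-then-freeze behavior can be checked numerically. Below is a minimal sketch (mine, not from the text), using a hypothetical $4 \times 4$ nilpotent Jordan block and computing $\dim \operatorname{null}(T^k) = \dim V - \operatorname{rank}(T^k)$:

```python
import numpy as np

# Hypothetical example: a 4x4 nilpotent Jordan block (ones on the superdiagonal).
T = np.diag([1.0, 1.0, 1.0], k=1)
n = T.shape[0]

dims = []
P = np.eye(n)
for k in range(n + 2):
    # dim null(T^k) = dim V - rank(T^k), by the Fundamental Theorem of Linear Maps
    dims.append(int(n - np.linalg.matrix_rank(P)))
    P = P @ T

print(dims)  # [0, 1, 2, 3, 4, 4] -- strictly increasing until k = n = 4, then constant
```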

While in general it's not true that $V = \operatorname{null}(T) \oplus \operatorname{range}(T)$, the next result can be used as a substitute:

$V$ is the direct sum of $\operatorname{null}(T^{\dim V})$ and $\operatorname{range}(T^{\dim V})$

Suppose $T \in \mathcal{L}(V)$. Let $n = \dim(V)$. Then:

$$V = \operatorname{null}(T^n) \oplus \operatorname{range}(T^n)$$

Proof
First we show the intersection is trivial. Suppose $v \in \operatorname{null}(T^n) \cap \operatorname{range}(T^n)$. Then $T^n v = 0$ and there exists $u \in V$ with $T^n u = v$. Applying $T^n$ to both sides of the latter gives $T^{2n} u = T^n v = 0$. Thus $T^n u = 0$, by the fact that null spaces stop growing ($\operatorname{null}(T^{2n}) = \operatorname{null}(T^n)$). Hence $v = T^n u = 0$, so the intersection is trivial.

Since the intersection is trivial, the sum is a direct sum. Also:

$$\dim(\operatorname{null}(T^n) \oplus \operatorname{range}(T^n)) = \dim(\operatorname{null}(T^n)) + \dim(\operatorname{range}(T^n)) = \dim(V)$$

via Chapter 3 (cont.) - Products and Quotients of Vector Spaces#^5193d7 and via the Fundamental Theorem of Linear Maps.
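As a quick numerical illustration (a sketch with an arbitrarily chosen matrix, not from the text): the direct sum forces $\operatorname{range}(T^n) = \operatorname{range}(T^{2n})$, i.e. $\operatorname{rank}(T^n) = \operatorname{rank}(T^{2n})$, which is easy to check:

```python
import numpy as np

# Hypothetical operator on C^4 (real entries for simplicity): a nilpotent
# 2x2 block plus an invertible 2x2 block.
T = np.array([[0, 1, 0, 0],
              [0, 0, 0, 0],
              [0, 0, 5, 1],
              [0, 0, 0, 5]], dtype=float)
n = T.shape[0]
Tn = np.linalg.matrix_power(T, n)
# rank(T^n) == rank(T^{2n}) is equivalent to null(T^n) and range(T^n) meeting trivially
print(np.linalg.matrix_rank(Tn) == np.linalg.matrix_rank(Tn @ Tn))  # True
```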

Generalized Eigenvectors

More often than we desire, we don't have enough eigenvectors to lead to diagonalization; we need something more general. Let's examine this issue by fixing $T \in \mathcal{L}(V)$. We wish to describe $V$ as a direct sum of simpler spaces:

$$V = \bigoplus_{i=1}^{m} U_i$$

where each $U_i$ is a subspace of $V$ invariant under $T$. The simplest possible $U_i$'s would be 1-dimensional, but this is possible iff $V$ has a basis consisting of eigenvectors of $T$, which holds iff $V$ has an eigenspace decomposition:

$$V = \bigoplus_{i=1}^{m} E(\lambda_i, T)$$

where $\lambda_1, \dots, \lambda_m$ are the distinct eigenvalues of $T$. The Chapter 7 - Operators on Inner Product Spaces#^6995f7 says that if $V$ is an inner product space, then a decomposition of this form holds for every normal operator if $F = \mathbb{C}$ and for every self-adjoint operator if $F = \mathbb{R}$, as these operators have enough eigenvectors to form a basis of $V$.

But the form above may not always hold, so we need to expand our definition of eigenvectors.

generalized eigenvector

Suppose $T \in \mathcal{L}(V)$ and $\lambda$ is an eigenvalue of $T$. A vector $v \in V$ is called a generalized eigenvector of $T$ corresponding to $\lambda$ if $v \neq 0$ and:

$$(T - \lambda I)^j v = 0$$

for some $j \in \mathbb{Z}^+$.

Although $j$ is arbitrary here, we will soon prove that every generalized eigenvector satisfies this equation with $j = \dim(V)$.

generalized eigenspace, G(λ,T)

Suppose $T \in \mathcal{L}(V)$ and $\lambda \in F$. The generalized eigenspace of $T$ corresponding to $\lambda$, denoted $G(\lambda, T)$, is defined to be the set of all generalized eigenvectors of $T$ corresponding to $\lambda$, along with the $0$ vector.

Because every eigenvector of $T$ is a generalized eigenvector of $T$ (take $j = 1$ in the definition), each eigenspace is contained in the corresponding generalized eigenspace (i.e. $E(\lambda, T) \subseteq G(\lambda, T)$). The next result implies that $G(\lambda, T)$ is a subspace of $V$, since the null space of each linear map on $V$ is a subspace of $V$.

Description of generalized eigenspaces

Suppose $T \in \mathcal{L}(V)$ and $\lambda \in F$. Then $G(\lambda, T) = \operatorname{null}((T - \lambda I)^{\dim V})$.

Proof
Suppose $v \in \operatorname{null}((T - \lambda I)^{\dim V})$. Using the definition with $j = \dim(V)$, we get $v \in G(\lambda, T)$, so $\operatorname{null}((T - \lambda I)^{\dim V}) \subseteq G(\lambda, T)$.

For the other inclusion, suppose $v \in G(\lambda, T)$, so there is some $j \in \mathbb{Z}^+$ with $v \in \operatorname{null}((T - \lambda I)^j)$. From Chapter 8 - Operators on Complex Vector Spaces#^3fdb93 and Chapter 8 - Operators on Complex Vector Spaces#^56d370, using $T - \lambda I$ in place of $T$, we get $v \in \operatorname{null}((T - \lambda I)^{\dim V})$, showing $G(\lambda, T) \subseteq \operatorname{null}((T - \lambda I)^{\dim V})$.

As an example, consider the operator $T \in \mathcal{L}(\mathbb{C}^3)$ defined by:

$$T(z_1, z_2, z_3) = (4z_2, 0, 5z_3)$$

Using the definition of eigenvalue shows that $T$'s eigenvalues are $0$ and $5$. The corresponding eigenspaces are $E(0, T) = \{(z_1, 0, 0) : z_1 \in \mathbb{C}\}$ and $E(5, T) = \{(0, 0, z_3) : z_3 \in \mathbb{C}\}$. There are clearly not enough eigenvectors to span $\mathbb{C}^3$.

But we have $T^3(z_1, z_2, z_3) = (0, 0, 125z_3)$, so the description of generalized eigenspaces implies that $G(0, T) = \{(z_1, z_2, 0) : z_1, z_2 \in \mathbb{C}\}$. We also have $(T - 5I)^3(z_1, z_2, z_3) = (-125z_1 + 300z_2, -125z_2, 0)$, so the same theorem says $G(5, T) = \{(0, 0, z_3) : z_3 \in \mathbb{C}\}$.

Because of this, $\mathbb{C}^3 = G(0, T) \oplus G(5, T)$. We want to show such a decomposition holds in general for complex vector spaces; we'll do this later on.
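These generalized eigenspace dimensions can be checked numerically. Below is a sketch; the helper `gen_eigenspace_dim` is my own, not from the text:

```python
import numpy as np

# Matrix of T(z1, z2, z3) = (4 z2, 0, 5 z3) in the standard basis of C^3.
M = np.array([[0, 4, 0],
              [0, 0, 0],
              [0, 0, 5]], dtype=float)
n = M.shape[0]

def gen_eigenspace_dim(M, lam):
    # dim G(lam, T) = dim null (T - lam I)^{dim V} = dim V - rank((T - lam I)^{dim V})
    A = np.linalg.matrix_power(M - lam * np.eye(n), n)
    return n - np.linalg.matrix_rank(A)

print(gen_eigenspace_dim(M, 0), gen_eigenspace_dim(M, 5))  # 2 1
```

The two dimensions sum to $3 = \dim \mathbb{C}^3$, consistent with the direct sum $\mathbb{C}^3 = G(0, T) \oplus G(5, T)$.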

Linearly Independent generalized eigenvectors

Let $T \in \mathcal{L}(V)$. Suppose $\lambda_1, \dots, \lambda_m$ are distinct eigenvalues of $T$ and $v_1, \dots, v_m$ are corresponding generalized eigenvectors. Then $v_1, \dots, v_m$ is linearly independent.

Proof
See Lecture 22 - Finishing G. Eigenspaces, Starting 8.B#^73f19f.

Nilpotent Operators

nilpotent

An operator is called nilpotent if some power of it equals 0.

For instance, the operator $N \in \mathcal{L}(F^4)$ defined by:

$$N(z_1, z_2, z_3, z_4) = (z_3, z_4, 0, 0)$$

is nilpotent because $N^2 = 0$.
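A quick numerical check of this example (sketch):

```python
import numpy as np

# Matrix of N(z1, z2, z3, z4) = (z3, z4, 0, 0) in the standard basis of F^4.
N = np.array([[0, 0, 1, 0],
              [0, 0, 0, 1],
              [0, 0, 0, 0],
              [0, 0, 0, 0]], dtype=float)
print(np.count_nonzero(N @ N))  # 0, i.e. N^2 = 0, so N is nilpotent
```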

Nilpotent operator raised to dimension of domain is 0

Suppose $N \in \mathcal{L}(V)$ is nilpotent. Then $N^{\dim V} = 0$.

Proof
Because $N$ is nilpotent, every nonzero vector of $V$ is annihilated by some power of $N$, so $G(0, N) = V$. Thus by the description of generalized eigenspaces, $\operatorname{null}(N^{\dim V}) = V$, i.e. $N^{\dim V} = 0$.

Given an operator $T$ on $V$, we want to find a basis of $V$ such that the matrix of $T$ with respect to this basis is as simple as possible, so hopefully $\mathcal{M}(T)$ has a lot of $0$'s.

Matrix of a nilpotent operator

Suppose $N$ is a nilpotent operator on $V$. Then there is a basis of $V$ with respect to which the matrix of $N$ has the form:

$$\begin{bmatrix} 0 & & * \\ & \ddots & \\ 0 & & 0 \end{bmatrix}$$

so all entries on and below the diagonal are $0$'s.

Proof
First choose a basis of $\operatorname{null}(N)$. Then extend this to a basis of $\operatorname{null}(N^2)$, then extend to a basis of $\operatorname{null}(N^3)$. Continue in this fashion, eventually getting a basis of $V$, since $\operatorname{null}(N^{\dim V}) = V$.

Now let's think about the matrix of $N$ with respect to this basis. The first columns, corresponding to the basis vectors in $\operatorname{null}(N)$, consist of all $0$'s. The next set of columns comes from the basis vectors in $\operatorname{null}(N^2)$; applying $N$ to any such vector, we get a vector in $\operatorname{null}(N)$, which is a linear combination of the previous basis vectors. Thus all nonzero entries in these columns lie above the diagonal. The next set of columns comes from the basis vectors in $\operatorname{null}(N^3)$; applying $N$ to any such vector, we get a vector in $\operatorname{null}(N^2)$, which again is a linear combination of the previous basis vectors. Thus once again, all nonzero entries in these columns lie above the diagonal. Continue in this fashion to get the result.

8.B: Decomposition of an Operator

Description of Operators on Complex Vector Spaces

We saw prior that we may lack enough eigenvectors to have a decomposition of $V$ into eigenspaces. But we observed that for $T \in \mathcal{L}(V)$, both $\operatorname{null}(T)$ and $\operatorname{range}(T)$ are invariant under $T$, via the nullspace proof and the range proof. Now we can show that the null space and the range of each polynomial of $T$ are also invariant under $T$:

The null space and range of p(T) are invariant under T

Suppose $T \in \mathcal{L}(V)$ and $p \in \mathcal{P}(F)$. Then $\operatorname{null}(p(T))$ and $\operatorname{range}(p(T))$ are invariant under $T$.

Proof
Suppose $v \in \operatorname{null}(p(T))$, so $p(T)v = 0$. Thus:

$$p(T)(Tv) = T(p(T)v) = T(0) = 0$$

Hence $Tv \in \operatorname{null}(p(T))$, so it's invariant under $T$.

Now suppose $v \in \operatorname{range}(p(T))$, so there exists $w \in V$ with $p(T)w = v$. Then:

$$Tv = T(p(T)w) = p(T)(Tw)$$

So clearly $Tv \in \operatorname{range}(p(T))$, so it's invariant under $T$.

The following result shows that every operator on a complex vector space can be thought of as composed of pieces, each of which is a nilpotent operator plus a scalar multiple of the identity.

Description of operators on complex vector spaces

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $T$. Then:

  • $V = \bigoplus_{i=1}^{m} G(\lambda_i, T)$
  • Each $G(\lambda_i, T)$ is invariant under $T$.
  • Each $(T - \lambda_i I)|_{G(\lambda_i, T)}$ is nilpotent.

Proof
Let $n = \dim(V)$ for clarity.

(b), (c): Recall that:

$$G(\lambda_i, T) = \operatorname{null}((T - \lambda_i I)^n)$$

for each $i$, via the description of generalized eigenspaces. Using our previous lemma saying the null space of a polynomial of $T$ is invariant under $T$, but with:

$$p(z) = (z - \lambda_i)^n$$

we get (b). (c) comes from the definitions: if $v \in G(\lambda_i, T)$ then $(T - \lambda_i I)^n v = 0$, so:

$$\left((T - \lambda_i I)|_{G(\lambda_i, T)}\right)^n = 0 \quad \text{(the zero operator)}$$

showing nilpotency.

(a): See (a) proof from this proof.

A basis of generalized eigenvectors

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Then there is a basis of $V$ consisting of generalized eigenvectors of $T$.

Proof
Choose a basis of each $G(\lambda_i, T)$ via our previous lemma. Put all these bases together to form a basis of $V$ consisting of generalized eigenvectors of $T$. This is covered more specifically via this in-depth look.

Multiplicity of an Eigenvalue

If $V$ is a complex vector space and $T \in \mathcal{L}(V)$, then the decomposition of $V$ provided by the description of operators on complex vector spaces can be a powerful tool. The dimensions of the subspaces involved are important enough that they get their own name:

multiplicity

Suppose $T \in \mathcal{L}(V)$. The multiplicity of an eigenvalue $\lambda$ of $T$ is defined to be the dimension of the corresponding generalized eigenspace $G(\lambda, T)$. In other words, the multiplicity of an eigenvalue $\lambda$ of $T$ equals $\dim(\operatorname{null}((T - \lambda I)^{\dim V}))$.

Notice that using Chapter 8 - Operators on Complex Vector Spaces#^90d6af, we can justify the second sentence in the definition.

As an example, consider the operator $T \in \mathcal{L}(\mathbb{C}^3)$ given by:

$$T(z_1, z_2, z_3) = (6z_1 + 3z_2 + 4z_3,\ 6z_2 + 2z_3,\ 7z_3)$$

Here:

$$\mathcal{M}(T) = \begin{bmatrix} 6 & 3 & 4 \\ 0 & 6 & 2 \\ 0 & 0 & 7 \end{bmatrix}$$

The eigenvalues of $T$ are $6$ and $7$, read off the diagonal of this upper-triangular matrix. The generalized eigenspaces are:

$$G(6, T) = \operatorname{span}((1, 0, 0), (0, 1, 0)), \qquad G(7, T) = \operatorname{span}((10, 2, 1))$$

Thus $\lambda = 6$ has multiplicity $2$ while $\lambda = 7$ has multiplicity $1$. The direct sum of our $G$'s gives a decomposition of $\mathbb{C}^3$. Thus a basis for $\mathbb{C}^3$ is:

$$\{(1, 0, 0),\ (0, 1, 0),\ (10, 2, 1)\}$$

You may ask if the sum of the multiplicities always equals the dimension. It turns out it does!

Sum of the multiplicities equals dim(V)

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Then the sum of the multiplicities of all eigenvalues of $T$ equals $\dim(V)$.

Proof
Use our lemma giving the description of operators on complex vector spaces, combined with the fact that dimensions add across direct sums.

The terms algebraic multiplicity and geometric multiplicity appear in many books: in our terminology, the algebraic multiplicity of $\lambda$ is $\dim G(\lambda, T)$ (our multiplicity), while the geometric multiplicity of $\lambda$ is $\dim E(\lambda, T)$.

Block Diagonal Matrices

To interpret our results in matrix form, we make the following definition, generalizing the notion of a diagonal matrix. In the case where all the blocks $A_j$ are $1 \times 1$, we actually have a diagonal matrix.

block diagonal matrix

A block diagonal matrix is a square matrix of the form:

$$\begin{bmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \end{bmatrix}$$

where $A_1, \dots, A_m$ are square matrices lying along the diagonal and all the other entries of the matrix equal $0$.

For instance, the $5 \times 5$ matrix:

$$A = \begin{bmatrix} \begin{bmatrix} 4 \end{bmatrix} & 0 & 0 \\ 0 & \begin{bmatrix} 2 & -3 \\ 0 & 2 \end{bmatrix} & 0 \\ 0 & 0 & \begin{bmatrix} 1 & 7 \\ 0 & 1 \end{bmatrix} \end{bmatrix}$$

is a block diagonal matrix of the form:

$$\begin{bmatrix} A_1 & 0 & 0 \\ 0 & A_2 & 0 \\ 0 & 0 & A_3 \end{bmatrix}$$

Block diagonal matrix with upper-triangular blocks

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $T$, with multiplicities $d_1, \dots, d_m$. Then there is a basis of $V$ with respect to which $T$ has a block diagonal matrix like seen above:

$$\begin{bmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_m \end{bmatrix}$$

where each $A_j$ is a $d_j \times d_j$ upper-triangular matrix:

$$A_j = \begin{bmatrix} \lambda_j & & * \\ & \ddots & \\ 0 & & \lambda_j \end{bmatrix}$$

Proof
Each $(T - \lambda_j I)|_{G(\lambda_j, T)}$ is nilpotent via the description of operators on complex vector spaces (namely (c)). For each $j$, choose a basis of $G(\lambda_j, T)$ (a vector space of dimension $d_j$) such that the matrix of $(T - \lambda_j I)|_{G(\lambda_j, T)}$ with respect to this basis is strictly upper triangular, as in what the matrix of a nilpotent operator should look like. Then the matrix of $T|_{G(\lambda_j, T)}$, which equals $(T - \lambda_j I)|_{G(\lambda_j, T)} + \lambda_j I|_{G(\lambda_j, T)}$, with respect to this basis will look like the desired form for $A_j$.

Putting the bases of the $G(\lambda_j, T)$'s together gives a basis of $V$, via the description of operators on complex vector spaces (a). The matrix of $T$ with respect to this basis has our desired form.

For example, suppose $T \in \mathcal{L}(\mathbb{C}^3)$ is defined by:

$$T(z_1, z_2, z_3) = (6z_1 + 3z_2 + 4z_3,\ 6z_2 + 2z_3,\ 7z_3)$$

We have:

$$\mathcal{M}(T) = \begin{bmatrix} 6 & 3 & 4 \\ 0 & 6 & 2 \\ 0 & 0 & 7 \end{bmatrix}$$

We found that:

$$G(6, T) = \operatorname{span}((1, 0, 0), (0, 1, 0)), \qquad G(7, T) = \operatorname{span}((10, 2, 1))$$

We saw that the basis of $\mathbb{C}^3$ was:

$$\{(1, 0, 0),\ (0, 1, 0),\ (10, 2, 1)\}$$

The matrix of $T$ with respect to this basis is:

$$\begin{bmatrix} \begin{bmatrix} 6 & 3 \\ 0 & 6 \end{bmatrix} & 0 \\ 0 & \begin{bmatrix} 7 \end{bmatrix} \end{bmatrix}$$

Square Roots

Recall that a square root of an operator $T \in \mathcal{L}(V)$ is an operator $R \in \mathcal{L}(V)$ such that $R^2 = T$, via its definition. Every complex number has a square root, but not every operator on a complex vector space has a square root.

For example, the operator on $\mathbb{C}^3$ given by this problem has no square root; its non-invertibility has something to do with it. But first, we'll show that the identity plus any nilpotent operator has a square root.

Identity plus nilpotent has a square root

Suppose $N \in \mathcal{L}(V)$ is nilpotent. Then $I + N$ has a square root.

Proof
Consider the Taylor series for the function $\sqrt{1+x}$:

$$\sqrt{1+x} = 1 + a_1 x + a_2 x^2 + \cdots$$

We will not find an explicit formula for the coefficients or worry about convergence, because we use this equation only for motivation.

Because $N$ is nilpotent, $N^m = 0$ for some positive integer $m \in \mathbb{Z}^+$. In the equation for $\sqrt{1+x}$ up top, suppose we replace $x$ with $N$ and $1$ with $I$. Then the infinite sum on the right side becomes finite, since $N^m = N^{m+1} = \cdots = 0$:

$$\sqrt{I + N} \approx I + a_1 N + a_2 N^2 + \cdots + a_{m-1} N^{m-1}$$

Having made this guess, we try to choose $a_1, \dots, a_{m-1}$ such that the operator above has its square equal to $I + N$. Just apply this squaring process:

$$I + N = (I + a_1 N + a_2 N^2 + \cdots + a_{m-1} N^{m-1})^2 = I + 2a_1 N + (2a_2 + a_1^2) N^2 + (2a_3 + 2a_1 a_2) N^3 + \cdots + (2a_{m-1} + \cdots) N^{m-1}$$

We want the right side of the equation to equal $I + N$. Thus we should choose $a_1$ so that $2a_1 = 1$, i.e. $a_1 = \frac{1}{2}$. Next, choose $a_2$ such that $2a_2 + a_1^2 = 0$, i.e. $a_2 = -\frac{1}{8}$. Then choose $a_3$ such that $2a_3 + 2a_1 a_2 = 0$, i.e. $a_3 = \frac{1}{16}$. Continue in this manner for $j = 4, \dots, m-1$.

We don't actually care about the formula for each $a_j$; we just need to know that for each $j$ we can make a choice of $a_j$ giving a square root of $I + N$.
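The recursion above reproduces the Taylor coefficients of $\sqrt{1+x}$, namely $a_k = \binom{1/2}{k}$. Here is a sketch (my own, using a hypothetical Jordan-block $N$) verifying that the truncated series really squares to $I + N$:

```python
import numpy as np
from math import prod, factorial

def sqrt_coeff(k):
    # k-th Taylor coefficient of sqrt(1 + x): the binomial coefficient binom(1/2, k)
    return prod(0.5 - i for i in range(k)) / factorial(k)

# Hypothetical nilpotent example: a 4x4 Jordan block, so N^4 = 0 and the
# series for sqrt(I + N) truncates after the N^3 term.
N = np.diag([1.0, 1.0, 1.0], k=1)
n = N.shape[0]

R = sum(sqrt_coeff(k) * np.linalg.matrix_power(N, k) for k in range(n))
print(np.allclose(R @ R, np.eye(n) + N))  # True: R^2 = I + N
```

Note that `sqrt_coeff(1)`, `sqrt_coeff(2)`, `sqrt_coeff(3)` come out to $\frac{1}{2}$, $-\frac{1}{8}$, $\frac{1}{16}$, matching the choices made in the proof.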

The previous result works on real as well as complex vector spaces. However, the next result holds only on complex vector spaces. For example, the operator of multiplication by $-1$ on the 1-dimensional real vector space $\mathbb{R}$ has no square root.

Over C, invertible operators have square roots

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$ is invertible. Then $T$ has a square root.

Proof
Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $T$. For each $j$ there exists a nilpotent operator $N_j \in \mathcal{L}(G(\lambda_j, T))$ such that $T|_{G(\lambda_j, T)} = \lambda_j I + N_j$, via the description of operators on complex vector spaces. Because $T$ is invertible, $\lambda_j \neq 0$ for all $j$, so:

$$T|_{G(\lambda_j, T)} = \lambda_j \left(I + \frac{N_j}{\lambda_j}\right)$$

for each $j$. Clearly $\frac{N_j}{\lambda_j}$ is nilpotent, so $I + \frac{N_j}{\lambda_j}$ has a square root using our freshly proved lemma. Multiplying a square root of the complex number $\lambda_j$ by a square root of $I + \frac{N_j}{\lambda_j}$, we obtain a square root $R_j$ of $T|_{G(\lambda_j, T)}$ (see the equation above). A typical vector $v \in V$ can be written uniquely in the form:

$$v = u_1 + \cdots + u_m$$

where each $u_j \in G(\lambda_j, T)$, because of the generalized eigenspace decomposition. Using this decomposition, define an operator $R \in \mathcal{L}(V)$ by:

$$Rv = R_1 u_1 + \cdots + R_m u_m$$

We can verify that this operator $R$ is a square root of $T$ by just applying $R^2$ (each $R_j u_j$ stays in $G(\lambda_j, T)$, so $R$ acts on it by $R_j$):

$$R^2 v = R(R_1 u_1 + \cdots + R_m u_m) = R_1^2 u_1 + \cdots + R_m^2 u_m = T|_{G(\lambda_1, T)} u_1 + \cdots + T|_{G(\lambda_m, T)} u_m = Tv$$

8.C: Characteristic and Minimal Polynomials

characteristic polynomial

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$ denote the distinct eigenvalues of $T$, with multiplicities $d_1, \dots, d_m$. The polynomial:

$$(z - \lambda_1)^{d_1} \cdots (z - \lambda_m)^{d_m}$$

is called the characteristic polynomial of $T$.

Example

Suppose $T \in \mathcal{L}(\mathbb{C}^3)$ is defined as before. Because the eigenvalues of $T$ are $6$ with multiplicity $2$, and $7$ with multiplicity $1$, the characteristic polynomial of $T$ is $(z - 6)^2 (z - 7)$.

Degree and zeros of characteristic polynomial

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Then:

  • the characteristic polynomial of $T$ has degree $\dim(V)$
  • the zeros of the characteristic polynomial of $T$ are the eigenvalues of $T$.

Proof
(a): Use the fact that the multiplicities add up to $\dim(V)$: the degree of the characteristic polynomial is $d_1 + \cdots + d_m = \dim(V)$.

(b): Use the definition and plug in $z := \lambda_i$ for each $i$.

We can now prove a really cool theorem easily, without determinants:

Cayley-Hamilton Theorem

Suppose $V$ is a complex vector space and $T \in \mathcal{L}(V)$. Let $q$ denote the characteristic polynomial of $T$. Then $q(T) = 0$.

Proof
Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of the operator $T$, and let $d_1, \dots, d_m$ be the dimensions of the corresponding generalized eigenspaces $G(\lambda_1, T), \dots, G(\lambda_m, T)$. For each $1 \le j \le m$ we know that $(T - \lambda_j I)|_{G(\lambda_j, T)}$ is nilpotent, so $\left((T - \lambda_j I)|_{G(\lambda_j, T)}\right)^{d_j} = 0$, using the fact that a nilpotent operator raised to the dimension of its domain is the zero operator.

Every vector in $V$ is a unique sum of vectors from the $G(\lambda_j, T)$'s, by how operators are described via generalized eigenspaces. To prove $q(T) = 0$, we need only show that $q(T)|_{G(\lambda_j, T)} = 0$ for each $j$. We have:

$$q(T) = (T - \lambda_1 I)^{d_1} \cdots (T - \lambda_m I)^{d_m}$$

The operators on the right side of the equation all commute, so we can move the factor $(T - \lambda_j I)^{d_j}$ to be the last term; since $\left((T - \lambda_j I)^{d_j}\right)|_{G(\lambda_j, T)} = 0$, we conclude that $q(T)|_{G(\lambda_j, T)} = 0$, as desired.
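We can sanity-check Cayley-Hamilton numerically on the earlier example (a sketch, not part of the text):

```python
import numpy as np

# Running example M(T) = [[6,3,4],[0,6,2],[0,0,7]], whose characteristic
# polynomial is q(z) = (z - 6)^2 (z - 7).
T = np.array([[6, 3, 4],
              [0, 6, 2],
              [0, 0, 7]], dtype=float)
I = np.eye(3)
q_of_T = (T - 6*I) @ (T - 6*I) @ (T - 7*I)
print(np.allclose(q_of_T, 0))  # True: q(T) = 0
```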

The Minimal Polynomial

monic polynomial

A monic polynomial is a polynomial whose highest-degree coefficient equals $1$.

For instance, the polynomial $2 + 9z^2 + z^7$ is a monic polynomial of degree $7$ (since the coefficient of $z^7$ is $1$).

Minimal Polynomial

Suppose $T \in \mathcal{L}(V)$. Then there is a unique monic polynomial $p$ of smallest degree such that $p(T) = 0$.

Proof
Let $n = \dim(V)$. Then the list:

$$I, T, T^2, \dots, T^{n^2}$$

is not linearly independent in $\mathcal{L}(V)$, because the vector space $\mathcal{L}(V)$ has dimension $n^2$ (Chapter 3 - Linear Maps#^ee27e8) while we have a list of length $n^2 + 1$. Let $m$ be the smallest positive integer such that:

$$I, T, T^2, \dots, T^m$$

is linearly dependent. The Linear Dependence Lemma implies that one of the operators above is a linear combination of the previous ones, and because $m$ was chosen to be the smallest positive integer where the above list is linearly dependent, we conclude that $T^m$ is a linear combination of $I, T, \dots, T^{m-1}$. So there exist $a_0, \dots, a_{m-1} \in F$ such that:

$$a_0 I + a_1 T + \cdots + a_{m-1} T^{m-1} + T^m = 0$$

Define the monic polynomial $p \in \mathcal{P}(F)$ by:

$$p(z) = a_0 + a_1 z + \cdots + a_{m-1} z^{m-1} + z^m$$

Plugging $z := T$ into the polynomial and using our equation above, we conclude that $p(T) = 0$.

To show uniqueness of $p$, note that the choice of $m$ implies that no monic polynomial $q \in \mathcal{P}(F)$ with degree smaller than $m$ can satisfy $q(T) = 0$. Suppose $q \in \mathcal{P}(F)$ is a monic polynomial with degree $m$ and $q(T) = 0$. Then $(p - q)(T) = 0$ and $\deg(p - q) < m$. The choice of $m$ now implies that $p - q = 0$, i.e. $q = p$.

minimal polynomial

Suppose $T \in \mathcal{L}(V)$. Then the minimal polynomial of $T$ is the unique monic polynomial $p$ of smallest degree such that $p(T) = 0$.

The proof above shows that each operator on $V$ has a minimal polynomial of degree at most $\dim(V)^2$. The Cayley-Hamilton Theorem tells us that if $V$ is a complex vector space, then the minimal polynomial of each operator on $V$ has degree at most $\dim(V)$. This remarkable improvement also holds on real vector spaces.

Programming a computer

Suppose you are given $T \in \mathcal{L}(V)$, and thus $\mathcal{M}(T)$. You can program a computer to find the minimal polynomial of $T$ by considering the system of equations:

$$a_0 \mathcal{M}(I) + a_1 \mathcal{M}(T) + \cdots + a_{m-1} \mathcal{M}(T)^{m-1} = -\mathcal{M}(T)^m$$

for successive values $m = 1, 2, \dots$ until the system has a solution $a_0, \dots, a_{m-1}$, which then are the coefficients of the minimal polynomial of $T$. Each system can be solved via Gaussian elimination or similar methods.
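Here is a sketch of that procedure in Python with numpy. The helper name `minimal_polynomial` is mine, and it uses least squares with a residual check in place of Gaussian elimination:

```python
import numpy as np

def minimal_polynomial(M, tol=1e-8):
    """Return the coefficients (a_0, ..., a_{m-1}, 1) of the minimal
    polynomial of M, by solving
        a_0 I + a_1 M + ... + a_{m-1} M^{m-1} = -M^m
    in the least-squares sense for m = 1, 2, ... until it is consistent."""
    n = M.shape[0]
    powers = [np.eye(n)]
    for m in range(1, n * n + 1):
        powers.append(powers[-1] @ M)
        # Flatten M^0, ..., M^{m-1} into the columns of A; the target is -M^m.
        A = np.column_stack([P.ravel() for P in powers[:m]])
        b = -powers[m].ravel()
        a, *_ = np.linalg.lstsq(A, b, rcond=None)
        if np.linalg.norm(A @ a - b) < tol:
            return np.append(a, 1.0)  # monic: append the leading coefficient 1
    raise RuntimeError("no annihilating polynomial found")

# Example: the matrix of T(z1,z2,z3) = (6z1+3z2+4z3, 6z2+2z3, 7z3);
# its minimal polynomial is (z-6)^2 (z-7) = z^3 - 19 z^2 + 120 z - 252.
T = np.array([[6, 3, 4], [0, 6, 2], [0, 0, 7]], dtype=float)
print(minimal_polynomial(T))  # approximately [-252, 120, -19, 1]
```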

Example

Let $T$ be the operator on $\mathbb{C}^5$ whose matrix w.r.t. the standard basis is:

$$\begin{bmatrix} 0 & 0 & 0 & 0 & -3 \\ 1 & 0 & 0 & 0 & 6 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \end{bmatrix}$$

The minimal polynomial can be calculated by trying the powers $\mathcal{M}(T)^m$ for increasing $m$. To save you some trouble, there's no solution until $m = 5$ in this case, where:

$$a_0 \mathcal{M}(I) + \cdots + a_4 \mathcal{M}(T)^4 = -\mathcal{M}(T)^5$$

Solving quickly gives $a_0 = 3$, $a_1 = -6$, $a_2 = a_3 = a_4 = 0$, so the minimal polynomial is $z^5 - 6z + 3$.

$q(T) = 0$ iff $q$ is a multiple of the minimal polynomial

Suppose $T \in \mathcal{L}(V)$ and $q \in \mathcal{P}(F)$. Then $q(T) = 0$ iff $q$ is a polynomial multiple of the minimal polynomial of $T$.

Proof
Let $p$ denote the minimal polynomial of $T$.

First we prove ($\Leftarrow$). Suppose $q$ is a polynomial multiple of $p$, so there exists $s \in \mathcal{P}(F)$ such that $q = ps$. Thus:

$$q(T) = p(T) s(T) = 0 \cdot s(T) = 0$$

as desired.

For ($\Rightarrow$), suppose $q(T) = 0$. By the division algorithm for polynomials, there exist $s, r \in \mathcal{P}(F)$ such that:

$$q = ps + r$$

with $\deg(r) < \deg(p)$. We have:

$$0 = q(T) = p(T) s(T) + r(T) = r(T)$$

This implies that $r = 0$, since otherwise dividing $r$ by its highest-degree coefficient would produce a monic polynomial that, when applied to $T$, gives $0$; this polynomial would have a smaller degree than the minimal polynomial, creating a contradiction.

Thus $q = ps$, so $q$ is a polynomial multiple of $p$, as desired.

Characteristic polynomial is a multiple of minimal polynomial

Suppose $F = \mathbb{C}$ and $T \in \mathcal{L}(V)$. Then the characteristic polynomial of $T$ is a polynomial multiple of the minimal polynomial of $T$.

Proof
Use Cayley-Hamilton to get that the characteristic polynomial $q$ satisfies $q(T) = 0$, then use the lemma just proved to show that $q$ is a multiple of the minimal polynomial of $T$.

Eigenvalues are the zeroes of the minimal polynomial

Let $T \in \mathcal{L}(V)$. Then the zeros of the minimal polynomial of $T$ are precisely the eigenvalues of $T$.

Proof
Let:

$$p(z) = a_0 + a_1 z + \cdots + a_{m-1} z^{m-1} + z^m$$

be the minimal polynomial of $T$.

First, to prove ($\Rightarrow$), suppose $\lambda \in F$ is a zero of $p$. Then $p$ can be written as:

$$p(z) = (z - \lambda) q(z)$$

for some monic polynomial $q \in \mathcal{P}(F)$. Because $p(T) = 0$:

$$0 = p(T) = (T - \lambda I) q(T)$$

Because $\deg(q) < \deg(p)$, there exists $v \in V$ such that $q(T)v \neq 0$. The equation above shows $q(T)v \in \operatorname{null}(T - \lambda I)$, so $q(T)v$ is an eigenvector of $T$. Thus $\lambda$ must be an eigenvalue of $T$.

To prove ($\Leftarrow$), suppose $\lambda \in F$ is an eigenvalue of $T$, so there exists $v \neq 0$ with $Tv = \lambda v$. Repeatedly applying $T$ to both sides shows that $T^j v = \lambda^j v$ for all $j \in \mathbb{Z}^+$. Thus:

$$0 = p(T)v = (a_0 I + a_1 T + \cdots + a_{m-1} T^{m-1} + T^m)v = (a_0 + a_1 \lambda + \cdots + a_{m-1} \lambda^{m-1} + \lambda^m)v = p(\lambda) v$$

Since $v \neq 0$, $p(\lambda) = 0$, as desired.

Let's do some examples:

Example

Find the minimal polynomial of $T \in \mathcal{L}(\mathbb{C}^3)$ given:

$$\mathcal{M}(T) = \begin{bmatrix} 6 & 3 & 4 \\ 0 & 6 & 2 \\ 0 & 0 & 7 \end{bmatrix}$$

Proof
We found that $\lambda_1 = 6$ has multiplicity $d_1 = 2$ and $\lambda_2 = 7$ has multiplicity $d_2 = 1$, so the characteristic polynomial is:

$$p(z) = (z - 6)^2 (z - 7)$$

The minimal polynomial divides $p$ and has the same zeros, so it is either $(z - 6)(z - 7)$ or the characteristic polynomial itself. To check, we test whether $q(T) = 0$ for the lower-degree candidate $q(z) = (z - 6)(z - 7)$. Notice:

$$(T - 6I)(T - 7I) = T^2 - 13T + 42I \neq 0$$

Thus the minimal polynomial is the characteristic polynomial $p(z) = (z - 6)^2 (z - 7)$ itself.

Example

Find the minimal polynomial of the operator $T \in \mathcal{L}(\mathbb{C}^3)$ defined by $T(z_1, z_2, z_3) = (6z_1, 6z_2, 7z_3)$.

Proof
Clearly $\lambda_1 = 6$ with $d_1 = 2$ and $\lambda_2 = 7$ with $d_2 = 1$. For later computations, notice that:

$$T^2(z_1, z_2, z_3) = (36z_1, 36z_2, 49z_3)$$

Now check whether $q(z) = (z - 6)(z - 7) = z^2 - 13z + 42$ is the minimal polynomial:

$$q(T)(z_1, z_2, z_3) = (T^2 - 13T + 42I)(z_1, z_2, z_3) = (36z_1 - 78z_1 + 42z_1,\ 36z_2 - 78z_2 + 42z_2,\ 49z_3 - 91z_3 + 42z_3) = (0, 0, 0)$$

Thus $q(T) = 0$, so $q(z) = (z - 6)(z - 7)$ is the minimal polynomial.
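Both examples can be verified numerically (a sketch; the matrices are taken from the examples above):

```python
import numpy as np

I = np.eye(3)

# First example: M(T) = [[6,3,4],[0,6,2],[0,0,7]]; here (T-6I)(T-7I) != 0,
# so the minimal polynomial must be the full (z-6)^2 (z-7).
T1 = np.array([[6, 3, 4], [0, 6, 2], [0, 0, 7]], dtype=float)
print(np.allclose((T1 - 6*I) @ (T1 - 7*I), 0))  # False

# Second example: T(z1,z2,z3) = (6z1, 6z2, 7z3); here (T-6I)(T-7I) = 0,
# so the minimal polynomial drops to (z-6)(z-7).
T2 = np.diag([6.0, 6.0, 7.0])
print(np.allclose((T2 - 6*I) @ (T2 - 7*I), 0))  # True
```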

8.D: Jordan Form

We know that if $V$ is a complex vector space, then every $T \in \mathcal{L}(V)$ has a basis $\beta$ of $V$ w.r.t. which $\mathcal{M}(T)$ is upper triangular. In this section, we can get even more $0$'s above the diagonal. To show what we mean, consider the nilpotent operator $N \in \mathcal{L}(\mathbb{C}^4)$ defined by:

$$N(z_1, z_2, z_3, z_4) = (0, z_1, z_2, z_3)$$

Taking $N^3 v, N^2 v, N v, v$ (with $v = (1, 0, 0, 0)$) as our basis gives:

$$\mathcal{M}(N) = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

The idea here is that we can always get a diagonal of eigenvalues (in this case all $0$'s), with square blocks along the diagonal whose only other nonzero entries are $1$'s just above the diagonal. As another example, consider the block diagonal matrix:

$$\begin{bmatrix} \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} & 0 & 0 \\ 0 & \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} & 0 \\ 0 & 0 & \begin{bmatrix} 0 \end{bmatrix} \end{bmatrix}$$

Here each submatrix along the diagonal has this same shape, making the whole matrix block diagonal of the kind we want. To obtain this in the general case, we use the following result:

Basis corresponding to a nilpotent operator

Suppose $N \in \mathcal{L}(V)$ is nilpotent. Then there exist vectors $v_1, \dots, v_n \in V$ and nonnegative integers $m_1, \dots, m_n$ such that:

  1. $N^{m_1} v_1, \dots, N v_1, v_1, \dots, N^{m_n} v_n, \dots, N v_n, v_n$ is a basis of $V$
  2. $N^{m_1 + 1} v_1 = \cdots = N^{m_n + 1} v_n = 0$

Proof
While Axler does an induction over $\dim(V)$, I do like the usage of quotient spaces used in Lecture 28 - Continuing Jordan Block Decomposition. I feel it gets at the whole "process" of how this result is derived, rather than going through the motions of an induction proof. While it's not a "proof" per se, the findings are easily generalized.

We want to define this "form" of a matrix, as well as the basis required to make it.

Jordan Basis

Suppose $T \in \mathcal{L}(V)$. A basis of $V$ is called a Jordan basis for $T$ if, w.r.t. this basis, $T$ has a block diagonal matrix:

$$\begin{bmatrix} A_1 & & 0 \\ & \ddots & \\ 0 & & A_p \end{bmatrix}$$

where each $A_j$ is an upper-triangular matrix of the form:

$$A_j = \begin{bmatrix} \lambda_j & 1 & & 0 \\ & \ddots & \ddots & \\ & & \ddots & 1 \\ 0 & & & \lambda_j \end{bmatrix}$$

Jordan Form

Suppose $V$ is a complex vector space. If $T \in \mathcal{L}(V)$, then there is a basis of $V$ that is a Jordan basis for $T$.

Proof
First consider a nilpotent operator $N \in \mathcal{L}(V)$ and vectors $v_1, \dots, v_n \in V$ as given by the basis corresponding to a nilpotent operator. For each $j$, note that $N$ sends the first vector in the list:

$$N^{m_j} v_j, \dots, N v_j, v_j$$

to $0$, and sends each vector in this list other than the first to the previous vector. In other words, the basis:

$$\beta = \{N^{m_1} v_1, \dots, N v_1, v_1, \dots, N^{m_n} v_n, \dots, N v_n, v_n\}$$

is a basis for which $\mathcal{M}(N, \beta)$ is block diagonal, with each block of the desired form with $\lambda_j = 0$.

Now suppose $T \in \mathcal{L}(V)$. Let $\lambda_1, \dots, \lambda_m$ be the distinct eigenvalues of $T$. We have the generalized eigenspace decomposition:

$$V = \bigoplus_{j=1}^{m} G(\lambda_j, T)$$

where each $(T - \lambda_j I)|_{G(\lambda_j, T)}$ is nilpotent, via the description of operators on complex vector spaces. Thus, applying the logic above with $N := (T - \lambda_j I)|_{G(\lambda_j, T)}$, some basis of each $G(\lambda_j, T)$ is a Jordan basis for $T|_{G(\lambda_j, T)}$. Putting these bases together gives a basis of $V$ that is a Jordan basis for $T$.