Lecture 17 - Quadratic Forms (More Applications)

Last week we talked about isometries and positive operators. We finished up that section, so see previous lecture notes for information on that.

But recall the definition of quadratic forms:

Quadratic Forms

A quadratic form in $n$ commuting variables is a homogeneous polynomial of degree 2 in terms of those variables.

Taylor Polynomials

Recall from Calc III that we have Taylor polynomials, which approximate some function $f : R \to R$ near a point. For instance, the degree 1 Taylor polynomial approximation is:

f (x) \approx f (a) + f^{'} (a) (x - a)

centered at $x = a$ . For degree 2:

f (x) \approx f (a) + f^{'} (a) (x - a) + \frac{f^{″} (a)}{2} (x - a)^{2}

Notice that if we have $f^{'} (a) = 0$ , then $f^{'} (a) (x - a) = 0$ so then the function approximation just looks flat. We can ask if we have a maximum/minimum there though. It all comes down to the rightmost $\frac{f^{″} (a)}{2} (x - a)^{2}$ term:

If $f^{″} (a) > 0$ then the term is positive for $x \neq a$ , so we have a minimum.
If it's the other way around, where $f^{″} (a) < 0$ , we have a maximum.
If $f^{″} (a) = 0$ then the right term is equal to 0, and the test is inconclusive.

This idea can extend to 3rd, 4th, ..., $n$ -th derivatives to make similar arguments.
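As a quick numerical sanity check of the degree 2 approximation, here is a minimal Python sketch. The choice of $f = \cos$ centered at $a = 0$ and the helper name `taylor2` are assumptions made for illustration; they are not from the lecture.

```python
import math

def taylor2(f_a, df_a, d2f_a, a, x):
    """Degree 2 Taylor polynomial: f(a) + f'(a)(x - a) + f''(a)/2 (x - a)^2."""
    return f_a + df_a * (x - a) + 0.5 * d2f_a * (x - a) ** 2

# Assumed example: f = cos at a = 0, so f(0) = 1, f'(0) = 0, f''(0) = -1.
approx = taylor2(1.0, 0.0, -1.0, 0.0, 0.1)
exact = math.cos(0.1)

# f'(0) = 0 and f''(0) < 0, so the quadratic term bends the graph downward.
bends_down = approx < 1.0
print(approx, exact, bends_down)
```

Here the linear term vanishes and the quadratic term is negative, which matches $\cos$ having a local maximum at $0$ .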

Two or More Variable Taylor Polynomials

The Degree 1 approximation is:

f (x, y) \approx f (a, b) + f_{x} (a, b) (x - a) + f_{y} (a, b) (y - b)

The Degree 2 (Quadratic) approximation is:

f (x, y) \approx f (a, b) + f_{x} (a, b) (x - a) + f_{y} (a, b) (y - b) + \frac{1}{2} \underbrace{[f_{x x} (a, b) (x - a)^{2} + 2 f_{x y} (a, b) (x - a) (y - b) + f_{y y} (a, b) (y - b)^{2}]}_{\text{This is a quadratic form!}}

And similar to the one variable case, if the tangent plane is horizontal, namely if $f_{x} (a, b) = f_{y} (a, b) = 0$ so that the $f_{x} (a, b) (x - a) + f_{y} (a, b) (y - b)$ term vanishes, we are at a critical point. Notice that the braced term on the right can be written as a matrix product:

[\begin{matrix} x - a & y - b \end{matrix}] \underbrace{[\begin{matrix} f_{x x} (a, b) & f_{x y} (a, b) \\ f_{x y} (a, b) & f_{y y} (a, b) \end{matrix}]}_{A} \overbrace{[\begin{matrix} x - a \\ y - b \end{matrix}]}^{\vec{x}}

At a critical point the linear terms vanish, so we can simplify:

f (x, y) \approx f (a, b) + \frac{1}{2} {\vec{x}}^{T} A \vec{x}

Here $A$ is just a matrix, but it really represents an operator.

If $A$ is positive-definite, namely ${\vec{x}}^{T} A \vec{x} > 0$ for all $\vec{x} \neq \vec{0}$ , then $f (a, b)$ is a local minimum.
Similarly, if $A$ is negative-definite, namely ${\vec{x}}^{T} A \vec{x} < 0$ for all $\vec{x} \neq \vec{0}$ , then $f (a, b)$ is a local maximum.
If $A$ is indefinite, so that ${\vec{x}}^{T} A \vec{x}$ takes both positive and negative values, then $(a, b)$ is a saddle point.
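To see the quadratic approximation at a critical point in action, here is a small Python sketch that keeps the factor of $\frac{1}{2}$ from the degree 2 formula. The function $f (x, y) = \cos (x) \cos (y)$ , its critical point $(0, 0)$ , and the helper name `quadratic_approx` are assumptions chosen for illustration only.

```python
import math

def quadratic_approx(f_ab, A, a, b, x, y):
    """f(a, b) + (1/2) v^T A v with v = (x - a, y - b); valid at a critical point."""
    v = (x - a, y - b)
    q = sum(v[i] * A[i][j] * v[j] for i in range(2) for j in range(2))
    return f_ab + 0.5 * q

# Assumed example: f(x, y) = cos(x) cos(y) at (0, 0).
# There f = 1, f_x = f_y = 0, f_xx = f_yy = -1 and f_xy = 0.
hessian = [[-1.0, 0.0], [0.0, -1.0]]
approx = quadratic_approx(1.0, hessian, 0.0, 0.0, 0.1, 0.2)
exact = math.cos(0.1) * math.cos(0.2)
print(approx, exact)
```

The matrix here is negative-definite, and the approximation stays below $f (0, 0) = 1$ near the origin, consistent with a local maximum.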

As such, let $Q (\vec{x}) = {\vec{x}}^{T} A \vec{x}$ . It's a quadratic form, but we want to look at the extreme values of $Q (\vec{x})$ . We don't really care about necessarily maximizing $Q$ in terms of output value since we saw before that:

Q (α \vec{x}) = α^{2} Q (\vec{x})

so whenever $Q$ takes a positive value we could get an arbitrarily large value from $Q$ just by scaling. However, we can ask: out of all vectors $\vec{x}$ that live on the unit circle, which vector $\vec{x}$ maximizes $Q$ ?

Note

Since $α^{2} \geq 0$ , scaling $\vec{x}$ by $α$ only stretches the value of $Q$ and never changes its sign. The sign is dictated by the direction of $\vec{x}$ ! If the inner product $⟨ \vec{x}, A \vec{x} ⟩$ is positive, then $Q$ is positive along that whole direction, and if it is negative, then $Q$ is negative along that whole direction!
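The scaling identity $Q (α \vec{x}) = α^{2} Q (\vec{x})$ can be checked numerically with a short Python sketch. The matrix `A`, the vector `x`, and the scalar `alpha` below are hypothetical values picked for illustration.

```python
import math

def quad_form(A, x):
    """Q(x) = x^T A x for a 2x2 matrix A given as nested lists."""
    return sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2))

A = [[2.0, 1.0], [1.0, 2.0]]   # hypothetical symmetric matrix
x = (0.6, 0.8)                 # a unit vector, since 0.36 + 0.64 = 1
alpha = -5.0                   # a negative scalar, to show the sign is unaffected

lhs = quad_form(A, (alpha * x[0], alpha * x[1]))
rhs = alpha ** 2 * quad_form(A, x)
print(lhs, rhs, math.isclose(lhs, rhs))
```

Even with a negative `alpha` the two sides agree, which is why restricting to unit vectors loses nothing about the sign of $Q$ .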

An Aside

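As an aside, suppose $A$ is a positive operator (symmetric with $⟨ \vec{x}, A \vec{x} ⟩ \geq 0$ for all $\vec{x}$ ) and $ε > 0$ . We'll want to define a new inner product ${⟨ \vec{x}, \vec{y} ⟩}_{ε}$ , where: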

\begin{array}{r} {⟨ \vec{x}, \vec{y} ⟩}_{ε} = ⟨ \vec{x}, (A + ε I) \vec{y} ⟩ \end{array}

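We claim that ${⟨ \cdot, \cdot ⟩}_{ε}$ is an inner product. Going through the list of properties:

Positive-definiteness (since ${⟨ \vec{x}, \vec{x} ⟩}_{ε} = ⟨ \vec{x}, A \vec{x} ⟩ + ε ‖ \vec{x} ‖^{2} \geq ε ‖ \vec{x} ‖^{2}$ , which is positive for $\vec{x} \neq \vec{0}$ )
Symmetry (follows from $A + ε I$ being symmetric; no conjugate is needed since we're working in $R$ )
Linearity (mainly uses linearity from $⟨ \cdot, \cdot ⟩$ )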

Cauchy-Schwarz then says that:

\begin{array}{r} | {⟨ \vec{x}, \vec{y} ⟩}_{ε} | \leq \sqrt{⟨ \vec{x}, (A + ε I) \vec{x} ⟩} \sqrt{⟨ \vec{y}, (A + ε I) \vec{y} ⟩} \end{array}

If we let $ε \to 0$ then:

| ⟨ \vec{x}, A \vec{y} ⟩ | \leq \sqrt{⟨ \vec{x}, A \vec{x} ⟩} \sqrt{⟨ \vec{y}, A \vec{y} ⟩}

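As a result, for a positive operator $A$ :

(1) $| ⟨ \vec{x}, A \vec{y} ⟩ |^{2} \leq ⟨ \vec{x}, A \vec{x} ⟩ ⟨ \vec{y}, A \vec{y} ⟩$
(2) If $⟨ \vec{y}, A \vec{y} ⟩ = 0$ then $A \vec{y} = \vec{0}$ (comes from (1) with $\vec{x} = A \vec{y}$ , which gives $‖ A \vec{y} ‖^{4} \leq 0$ )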
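Both consequences can be spot-checked numerically. The sketch below is only an illustration under assumed inputs: the matrices `A` and `B` and the test vectors are hypothetical, chosen so that `A` is positive-definite and `B` is positive with a nontrivial null space.

```python
def matvec(M, v):
    """Product of a 2x2 matrix (nested lists) with a length-2 vector."""
    return [M[0][0] * v[0] + M[0][1] * v[1], M[1][0] * v[0] + M[1][1] * v[1]]

def dot(u, v):
    """Standard inner product on R^2."""
    return u[0] * v[0] + u[1] * v[1]

# Inequality: |<x, A y>|^2 <= <x, A x> <y, A y> for a positive operator A.
A = [[2.0, 1.0], [1.0, 1.0]]   # hypothetical; trace 3 and determinant 1, so positive-definite
x, y = [3.0, -1.0], [2.0, 5.0]
lhs = dot(x, matvec(A, y)) ** 2
rhs = dot(x, matvec(A, x)) * dot(y, matvec(A, y))

# Null-space statement: <z, B z> = 0 forces B z = 0 for a positive operator B.
B = [[1.0, 1.0], [1.0, 1.0]]   # hypothetical; eigenvalues 0 and 2
z = [1.0, -1.0]
print(lhs, rhs, dot(z, matvec(B, z)), matvec(B, z))
```

A single pair of vectors proves nothing in general, but it is a cheap way to catch a sign or indexing mistake when working such examples by hand.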

Using these on Quadratic Forms

Going back, we can make a theorem using these:

Theorem

An extremal value of $Q (\vec{x})$ on $‖ \vec{x} ‖ = 1$ :

occurs at an eigenvector of $A$
is an eigenvalue of $A$

Here we know that $A$ is symmetric by our requirements.

Proof
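Suppose $m$ is the minimum value of $Q (\vec{x})$ with $‖ \vec{x} ‖ = 1$ , attained at the unit vector ${\vec{x}}_{0}$ . Then $Q (\vec{x}) \geq m$ for all $‖ \vec{x} ‖ = 1$ by the definition of being a minimum. Thus for every unit vector $\vec{x}$ :

\begin{aligned} \Rightarrow & Q (\vec{x}) - m \geq 0 \\ \Rightarrow & {\vec{x}}^{T} A \vec{x} - {\vec{x}}^{T} m \vec{x} \geq 0 & \text{(since } ‖ \vec{x} ‖ = 1 \text{)} \\ \Rightarrow & {\vec{x}}^{T} (A - m I) \vec{x} \geq 0 \\ \Rightarrow & ⟨ \vec{x}, (A - m I) \vec{x} ⟩ \geq 0 \end{aligned}

By scaling, this holds for every $\vec{x}$ , so $A - m I$ is a positive operator. At the minimizer we have equality, $⟨ {\vec{x}}_{0}, (A - m I) {\vec{x}}_{0} ⟩ = Q ({\vec{x}}_{0}) - m = 0$ , so:

\begin{aligned} \Rightarrow & (A - m I) {\vec{x}}_{0} = \vec{0} & \text{by (2) prior, applied to } A - m I \\ \Rightarrow & {\vec{x}}_{0} \text{ is an eigenvector w/ e-val } m \end{aligned}

The maximum case follows by applying the same argument to $- A$ .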

☐
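The theorem can be illustrated by brute force: sample $Q$ on the unit circle, pick the best and worst samples, and check that they behave like eigenvectors. This is only a sketch under assumptions; the matrix `A` (whose eigenvalues are 1 and 3), the sample count `n`, and the helper names are all hypothetical.

```python
import math

A = [[2.0, 1.0], [1.0, 2.0]]   # hypothetical symmetric matrix with eigenvalues 1 and 3

def quad_form(x):
    """Q(x) = x^T A x."""
    return sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2))

def residual(x):
    """Length of A x - Q(x) x; zero exactly when the unit vector x is an eigenvector."""
    q = quad_form(x)
    Ax = [A[i][0] * x[0] + A[i][1] * x[1] for i in range(2)]
    return math.hypot(Ax[0] - q * x[0], Ax[1] - q * x[1])

# Sample the unit circle at n equally spaced angles.
n = 3600
circle = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n)) for k in range(n)]
x_min = min(circle, key=quad_form)
x_max = max(circle, key=quad_form)
print(quad_form(x_min), residual(x_min))
print(quad_form(x_max), residual(x_max))
```

For this matrix the extreme directions fall exactly on sampled angles, so the residuals are at rounding level; for a general matrix a sampled extreme would only be close to an eigenvector.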
Notice that since $A$ is symmetric, all of its eigenvalues are real, and by the theorem the smallest and largest eigenvalues are the minimum and maximum of $Q$ on the unit circle, so their signs tell us whether we have a maximum or a minimum. Notice for our two-variable case, we only have a 2x2 matrix, so we only have two eigenvalues:

Conclusion

If both $λ$ 's are positive, then $A$ is positive-definite
If both $λ$ 's are negative, then $A$ is negative-definite
If the $λ$ 's have opposite signs, then $A$ is indefinite

To check this without computing the eigenvalues, we can use the determinant. We'll give a quick definition just for this case:

determinant

The determinant of an operator $T \in L (V)$ is the product of its eigenvalues (counted with multiplicity).
As such, the determinant of $M (T)$ is equal to the determinant of the operator $T$ .
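For a symmetric 2x2 matrix this definition can be compared against the usual $a d - b^{2}$ formula. The sketch below gets the eigenvalues of $[\begin{matrix} a & b \\ b & d \end{matrix}]$ from the quadratic formula applied to the characteristic polynomial; the helper name `eigenvalues_sym2` and the sample entries are assumptions for illustration.

```python
import math

def eigenvalues_sym2(a, b, d):
    """Eigenvalues of [[a, b], [b, d]]: roots of t^2 - (a + d) t + (a d - b^2)."""
    mean = (a + d) / 2
    radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
    return mean - radius, mean + radius

a, b, d = 3.0, 1.0, 3.0              # hypothetical entries
lam_small, lam_large = eigenvalues_sym2(a, b, d)
product = lam_small * lam_large      # determinant as a product of eigenvalues
formula = a * d - b ** 2             # determinant from the 2x2 formula
print(lam_small, lam_large, product, formula)
```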

So looking at our matrix $A$ for this problem:

A = [\begin{matrix} f_{x x} (a, b) & f_{x y} (a, b) \\ f_{x y} (a, b) & f_{y y} (a, b) \end{matrix}]

Then:

det (A) = f_{x x} (a, b) f_{y y} (a, b) - f_{x y} (a, b)^{2} = D

If $D > 0$ then the $λ$ 's must have the same sign, since the determinant is the product of the eigenvalues (so you are at a local max or min).

If $D < 0$ then they don't share the same sign. Then $A$ is an indefinite matrix (we have a saddle point)

If $D = 0$ then we have at least one $λ = 0$ , so it's flat in one direction (but possibly not for the other direction), and this test alone cannot classify the point.

So if we have $D > 0$ , how do we know if we have a maximum or a minimum? Since $A$ is then definite, $Q$ has the same sign at every $\vec{x} \neq \vec{0}$ , so we can just plug in any $\vec{x}$ on the unit circle and check the sign. As an easy example, if we plug in $\hat{i}$ then we get:

{\hat{i}}^{T} A \hat{i} = f_{x x} (a, b)

so if $f_{x x} (a, b) > 0$ then it's a minimum. If it's $< 0$ it's a maximum. But this choice is arbitrary! We could use $\hat{j}$ instead to give:

{\hat{j}}^{T} A \hat{j} = f_{y y} (a, b)
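Putting the pieces together, the whole test fits in a few lines of Python. This is a minimal sketch, assuming the second partials at the critical point have already been computed by hand; the function name `classify_critical_point` and the example functions in the comments are hypothetical.

```python
def classify_critical_point(fxx, fxy, fyy):
    """Second derivative test at a critical point, from the Hessian entries there."""
    D = fxx * fyy - fxy ** 2
    if D > 0:
        # Same-sign eigenvalues; fxx is nonzero here and gives their sign.
        return "local minimum" if fxx > 0 else "local maximum"
    if D < 0:
        # Opposite-sign eigenvalues, so the Hessian is indefinite.
        return "saddle point"
    return "inconclusive"

# Assumed examples, all with a critical point at the origin:
bowl = classify_critical_point(2.0, 0.0, 2.0)      # f = x^2 + y^2
dome = classify_critical_point(-2.0, 0.0, -2.0)    # f = -x^2 - y^2
saddle = classify_critical_point(2.0, 0.0, -2.0)   # f = x^2 - y^2
flat = classify_critical_point(0.0, 0.0, 0.0)      # f = x^4 + y^4
print(bowl, dome, saddle, flat)
```

The last example has a genuine minimum at the origin even though the test reports "inconclusive", which shows that $D = 0$ says nothing either way.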