Chapter 6 - Inner Product Spaces

6.A: Inner Products and Norms

Inner Products

Think of vectors in $\mathbb{R}^2$ or $\mathbb{R}^3$. Specifically, think of distances in these spaces. We denote the norm of $x$ via $\|x\|$, where for $x=(x_1,x_2)$ we have $\|x\|=\sqrt{x_1^2+x_2^2}$. Similarly if $x=(x_1,x_2,x_3)$ then we have $\|x\|=\sqrt{x_1^2+x_2^2+x_3^2}$. In $\mathbb{R}^n$ we have:

$$\|x\|=\sqrt{x_1^2+\cdots+x_n^2}$$

The norm itself isn't linear on $\mathbb{R}^n$, but to inject linearity we define:

dot product

For $x,y\in\mathbb{R}^n$ the dot product of $x,y$, denoted $x\cdot y$, is defined by:

$$x\cdot y=x_1y_1+\cdots+x_ny_n$$

where $x=(x_1,\dots,x_n)$ and $y=(y_1,\dots,y_n)$.

Keep in mind the dot product is a binary operator $\cdot:\mathbb{R}^n\times\mathbb{R}^n\to\mathbb{R}$. Obviously $x\cdot x=\|x\|^2$. We notice the following properties of this dot product:

  • $x\cdot x\ge 0$ for all $x\in\mathbb{R}^n$
  • $x\cdot x=0$ iff $x=0$
  • for fixed $y\in\mathbb{R}^n$, the map $x\mapsto x\cdot y$ is linear from $\mathbb{R}^n$ to $\mathbb{R}$
  • $x\cdot y=y\cdot x$ for all $x,y\in\mathbb{R}^n$
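A minimal sketch of the dot product and its tie to the norm (the Python here is my own illustration, not part of the text; `dot` and `norm` are hypothetical helper names):

```python
# Illustration only: the Euclidean dot product on R^n and the identity x . x = ||x||^2.
def dot(x, y):
    # x . y = x1*y1 + ... + xn*yn
    return sum(xi * yi for xi, yi in zip(x, y))

def norm(x):
    # ||x|| is the square root of x . x
    return dot(x, x) ** 0.5

x = [3.0, 4.0]
assert abs(dot(x, x) - norm(x) ** 2) < 1e-12  # x . x = ||x||^2
assert abs(norm(x) - 5.0) < 1e-12             # (3,4) has norm 5
```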

The inner product is a generalization of this dot product. But keep in mind that the properties above are true mainly for real spaces, and we need something to deal with complex spaces. Namely, if $\lambda=a+bi$ where $a,b\in\mathbb{R}$ then:

$$|\lambda|=\sqrt{a^2+b^2},\qquad \lambda\bar{\lambda}=|\lambda|^2$$

So for $z=(z_1,\dots,z_n)\in\mathbb{C}^n$ we have the norm as:

$$\|z\|=\sqrt{|z_1|^2+\cdots+|z_n|^2}$$

Notice we need $|z_i|^2$ instead of $z_i^2$ since it's possible for $z_i^2$ to be a negative number (which is bad under a square root: in the example $z=(i)$ we'd get $\|z\|=\sqrt{i^2}=\sqrt{-1}$, which is not a nonnegative real number). Note that:

$$\|z\|^2=z_1\bar{z_1}+\cdots+z_n\bar{z_n}$$

You want to think of $\|z\|^2$ as the inner product of $z$ with itself, similar to the dot product above. As such, this implies that if $w=(w_1,\dots,w_n)\in\mathbb{C}^n$ is defined similar to $z$ then the inner product of $w$ with $z$ is:

$$w_1\bar{z_1}+\cdots+w_n\bar{z_n}$$

If $w$ and $z$ are swapped, we get the complex conjugate, suggesting that the inner product of $w$ with $z$ is the complex conjugate of the inner product of $z$ with $w$.
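This swap symmetry is easy to spot-check numerically; a sketch (my own, with a hypothetical helper `inner` implementing $w_1\bar{z_1}+\cdots+w_n\bar{z_n}$):

```python
# Illustration only: the C^n inner product <w, z> = w1*conj(z1) + ... + wn*conj(zn).
def inner(w, z):
    return sum(wi * zi.conjugate() for wi, zi in zip(w, z))

w = [1 + 2j, 3 - 1j]
z = [2 - 1j, 0 + 1j]
# swapping the slots conjugates the result
assert inner(w, z) == inner(z, w).conjugate()
# <z, z> is real and nonnegative (it's ||z||^2)
assert inner(z, z).imag == 0 and inner(z, z).real >= 0
```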

Some things before we define the inner product:

inner product

An inner product on $V$ is a function that takes each ordered pair $(u,v)$ of elements of $V$ to a number $\langle u,v\rangle\in\mathbb{F}$ and has the following properties:

  • (positivity): $\langle v,v\rangle\ge 0$ for all $v\in V$
  • (definiteness): $\langle v,v\rangle=0$ iff $v=0$
  • (additivity in the first slot): $\langle u+v,w\rangle=\langle u,w\rangle+\langle v,w\rangle$ for all $u,v,w\in V$
  • (homogeneity in the first slot): $\langle\lambda u,v\rangle=\lambda\langle u,v\rangle$ for all $\lambda\in\mathbb{F}$ and all $u,v\in V$
  • (conjugate symmetry): $\langle u,v\rangle=\overline{\langle v,u\rangle}$ for all $u,v\in V$

Since any real number is its own complex conjugate, if we're dealing with a real vector space then the last condition just says that $\langle u,v\rangle=\langle v,u\rangle$ for all $u,v\in V$.

An Inner Product on Functions

An inner product can be defined on the vector space of continuous real-valued functions over the interval $[-1,1]$ by:

$$\langle f,g\rangle=\int_{-1}^{1}f(x)g(x)\,dx$$
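Such integral inner products can be approximated numerically; a sketch (my own illustration, using a simple midpoint rule; `inner` is a hypothetical helper):

```python
# Illustration only: approximate <f, g> = integral of f(x)g(x) over [-1, 1]
# with a composite midpoint rule.
def inner(f, g, n=20001):
    h = 2.0 / n
    return sum(f(-1 + (k + 0.5) * h) * g(-1 + (k + 0.5) * h) for k in range(n)) * h

# x and x^2 are orthogonal here (odd integrand), and <1, 1> is the interval length.
assert abs(inner(lambda x: x, lambda x: x * x)) < 1e-8
assert abs(inner(lambda x: 1.0, lambda x: 1.0) - 2.0) < 1e-8
```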

Another Inner Product on Functions

An inner product can be defined on $\mathcal{P}(\mathbb{R})$ by:

$$\langle p,q\rangle=\int_{0}^{\infty}p(x)q(x)e^{-x}\,dx$$

inner product space

An inner product space is a vector space $V$ along with an inner product on $V$.

If you're given $V=\mathbb{R}^n$ you can usually assume that refers to the standard dot product we talked about earlier (called the Euclidean Inner Product).

For the sake of brevity we make the following assumption:

$V$

For the rest of this chapter, $V$ denotes an inner product space over $\mathbb{F}$.

Note the abuse of language here. $V$ itself is an inner product space, meaning it talks about its own vector space $V$ (same name, different thing), with an obvious (from context) inner product.

Note

Note that the inner products from the examples were "obvious" because they took the idea of the Euclidean dot product, of multiplying similar numbers and adding them up, and wrapped the adding-up part into an integral, which means essentially the same thing.

Basic Properties of an Inner Product

  • (a) For each fixed $u\in V$, the function that takes $v$ to $\langle v,u\rangle$ is a linear map from $V$ to $\mathbb{F}$
  • (b) $\langle 0,u\rangle=0$ for every $u\in V$
  • (c) $\langle u,0\rangle=0$ for every $u\in V$
  • (d) $\langle u,v+w\rangle=\langle u,v\rangle+\langle u,w\rangle$ for all $u,v,w\in V$
  • (e) $\langle u,\lambda v\rangle=\bar{\lambda}\langle u,v\rangle$ for all $\lambda\in\mathbb{F}$ and $u,v\in V$

Proof
(a): This comes from the conditions of additivity in the first slot and homogeneity in the first slot in the definition of an inner product.

(b): Follows from (a) and the result that every linear map takes 0 to 0.

(c): Follows from (a) and the conjugate symmetry property in the definition of an inner product.

(d): Suppose $u,v,w\in V$. Then:

$$\langle u,v+w\rangle=\overline{\langle v+w,u\rangle}=\overline{\langle v,u\rangle+\langle w,u\rangle}=\overline{\langle v,u\rangle}+\overline{\langle w,u\rangle}=\langle u,v\rangle+\langle u,w\rangle$$

(e): Suppose $\lambda\in\mathbb{F}$ and $u,v\in V$. Then:

$$\langle u,\lambda v\rangle=\overline{\langle\lambda v,u\rangle}=\overline{\lambda\langle v,u\rangle}=\bar{\lambda}\,\overline{\langle v,u\rangle}=\bar{\lambda}\langle u,v\rangle$$

Norms

Our initial desire was to define distances for other spaces. Now we see that the inner product determines this norm:

norm, $\|v\|$

For $v\in V$ the norm of $v$, denoted $\|v\|$, is defined by:

$$\|v\|=\sqrt{\langle v,v\rangle}$$

For instance, the norm on $\mathbb{R}^n$ (with the Euclidean inner product) is:

$$\|x\|=\sqrt{x_1^2+\cdots+x_n^2}$$

Example

In the vector space of continuous real-valued functions on $[-1,1]$, with the inner product given from Chapter 6 - Inner Product Spaces#^b56893, the norm is:

$$\|f\|=\sqrt{\int_{-1}^{1}(f(x))^2\,dx}$$

Basic Properties of the norm

Suppose $v\in V$:

  • (a) $\|v\|=0$ iff $v=0$
  • (b) $\|\lambda v\|=|\lambda|\,\|v\|$ for all $\lambda\in\mathbb{F}$

Proof
(a): Comes from the fact that $\langle v,v\rangle=0$ iff $v=0$ from the properties of the inner product.

(b): Suppose $\lambda\in\mathbb{F}$, then:

$$\|\lambda v\|^2=\langle\lambda v,\lambda v\rangle=\lambda\langle v,\lambda v\rangle=\lambda\bar{\lambda}\langle v,v\rangle=|\lambda|^2\|v\|^2$$

Taking the square roots of both sides finishes the proof.

Notice that the proof above worked with the norm squared. In general, it's better to do proofs in this way, because the norm is never negative (so no $\pm$ sign ambiguity appears when we take square roots at the end).

orthogonal

Two vectors $u,v\in V$ are called orthogonal if $\langle u,v\rangle=0$.

Notice that the order here doesn't matter, since even in a complex vector space the complex conjugate of $0$ is $0$, so $\langle u,v\rangle=0=\langle v,u\rangle$.

In HW 7 - Inner Product Spaces#13 we show that if $u,v\in\mathbb{R}^2$ are non-zero then we get that:

$$\langle u,v\rangle=\|u\|\,\|v\|\cos(\theta)$$

where $\theta$ is the angle between $u$ and $v$ (thinking of $u$ and $v$ as arrows pointing from the origin). Thus, the two vectors are orthogonal, using the Euclidean inner product, iff $\cos(\theta)=0$, i.e. when $\theta=\pi/2$ or equivalent. Thus, we're able to take the words perpendicular and orthogonal as meaning the same thing.

Orthogonality and 0

  • (a) $0$ is orthogonal to every vector in $V$
  • (b) $0$ is the only vector in $V$ that is orthogonal to itself

Proof
(a): Part (b) from Chapter 6 - Inner Product Spaces#^6bc3f4 states that $\langle 0,u\rangle=0$ for every $u\in V$.

(b): If $v\in V$ and $\langle v,v\rangle=0$ then $v=0$ by the definiteness property in the definition of an inner product.

For the special case $V=\mathbb{R}^2$ the proof of the next result is super classic. But now, we can abstract it away!

Pythagorean Theorem

Suppose $u,v$ are orthogonal vectors in $V$. Then:

$$\|u+v\|^2=\|u\|^2+\|v\|^2$$

Proof

$$\|u+v\|^2=\langle u+v,u+v\rangle=\langle u,u\rangle+\langle u,v\rangle+\langle v,u\rangle+\langle v,v\rangle$$

Here notice that $\langle v,u\rangle=\overline{\langle u,v\rangle}$, so in general:

$$\langle u,v\rangle+\langle v,u\rangle=\langle u,v\rangle+\overline{\langle u,v\rangle}=2\,\mathrm{Re}(\langle u,v\rangle)$$

In this case, since $u,v$ are orthogonal the real part becomes $0$, so then:

$$\|u+v\|^2=\langle u,u\rangle+\langle v,v\rangle=\|u\|^2+\|v\|^2$$


Suppose $u,v\in V$ with $v\ne 0$. We would like to write $u$ as a scalar multiple of $v$ plus a vector $w$ orthogonal to $v$, as suggested by:

Pasted image 20240308221135.png

We want to get the vector $w$ above. Notice that, for any $c\in\mathbb{F}$:

$$u=cv+(u-cv)$$

where we want $w=u-cv$ to be orthogonal to $v$, so namely:

$$0=\langle u-cv,v\rangle=\langle u,v\rangle-c\|v\|^2$$

Since $v\ne 0$ then $\|v\|^2\ne 0$, so we can solve for $c$:

$$c=\frac{\langle u,v\rangle}{\|v\|^2}$$

So then plug it back in to get:

$$u=\frac{\langle u,v\rangle}{\|v\|^2}v+\left(u-\frac{\langle u,v\rangle}{\|v\|^2}v\right)$$

Thus, we proved the following:

An orthogonal decomposition

Suppose $u,v\in V$, with $v\ne 0$. Set $c=\frac{\langle u,v\rangle}{\|v\|^2}$ and $w=u-\frac{\langle u,v\rangle}{\|v\|^2}v$. Then:

$$\langle w,v\rangle=0$$

and:

$$u=cv+w$$
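This decomposition is easy to verify numerically; a sketch (my own NumPy illustration; `orthogonal_decomposition` is a hypothetical name):

```python
import numpy as np

# Illustration only: with c = <u,v>/||v||^2 and w = u - c v,
# we get u = c v + w with <w, v> = 0.
def orthogonal_decomposition(u, v):
    c = np.dot(u, v) / np.dot(v, v)   # scalar multiple of v
    w = u - c * v                     # remainder, orthogonal to v
    return c, w

u = np.array([2.0, 3.0])
v = np.array([1.0, 0.0])
c, w = orthogonal_decomposition(u, v)
assert abs(np.dot(w, v)) < 1e-12     # w is orthogonal to v
assert np.allclose(c * v + w, u)     # u = c v + w
```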

Cauchy-Schwarz Inequality

Suppose $u,v\in V$. Then:

$$|\langle u,v\rangle|\le\|u\|\,\|v\|$$

This inequality is an equality iff one of u,v is a scalar multiple of the other.

Proof
If $v=0$ then both sides equal $0$ and we get an equality, so let $v\ne 0$. Then consider the orthogonal decomposition:

$$u=\frac{\langle u,v\rangle}{\|v\|^2}v+w$$

given by our above decomposition, where $w$ and $v$ are orthogonal. By the Pythagorean Theorem:

$$\|u\|^2=\left\|\frac{\langle u,v\rangle}{\|v\|^2}v\right\|^2+\|w\|^2=\frac{|\langle u,v\rangle|^2}{\|v\|^2}+\|w\|^2\ge\frac{|\langle u,v\rangle|^2}{\|v\|^2}$$

Multiply both sides by $\|v\|^2$, then take square roots of both sides to get the inequality above.

Notice equality only happens when $\|w\|^2=0$, which only happens when $w=0$. But $w=0$ iff $u$ is a multiple of $v$ via Chapter 6 - Inner Product Spaces#^315ed2, so then we only get equality iff $u$ is a scalar multiple of $v$ or $v$ is a scalar multiple of $u$.

Example

If $f,g$ are continuous real-valued functions on $[-1,1]$, then:

$$\left(\int_{-1}^{1}f(x)g(x)\,dx\right)^2\le\left(\int_{-1}^{1}(f(x))^2\,dx\right)\left(\int_{-1}^{1}(g(x))^2\,dx\right)$$

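A numerical spot-check of this inequality (my own illustration; the midpoint-rule `inner` is a hypothetical helper, not from the text):

```python
import math

# Illustration only: approximate <f, g> over [-1, 1] and check Cauchy-Schwarz.
def inner(f, g, n=20001):
    h = 2.0 / n
    return sum(f(-1 + (k + 0.5) * h) * g(-1 + (k + 0.5) * h) for k in range(n)) * h

lhs = inner(math.sin, math.exp) ** 2
rhs = inner(math.sin, math.sin) * inner(math.exp, math.exp)
assert lhs <= rhs  # (integral of f g)^2 <= (integral of f^2)(integral of g^2)
```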
Triangle Inequality

Suppose $u,v\in V$. Then:

$$\|u+v\|\le\|u\|+\|v\|$$

where we get equality iff one of $u,v$ is a nonnegative multiple of the other.

Proof
We have:

$$\begin{aligned}\|u+v\|^2&=\langle u+v,u+v\rangle\\&=\langle u,u\rangle+\langle v,v\rangle+\langle u,v\rangle+\langle v,u\rangle\\&=\langle u,u\rangle+\langle v,v\rangle+2\,\mathrm{Re}(\langle u,v\rangle)&&(\langle u,v\rangle+\langle v,u\rangle=2\,\mathrm{Re}(\langle u,v\rangle))\\&\le\|u\|^2+\|v\|^2+2|\langle u,v\rangle|\\&\le\|u\|^2+\|v\|^2+2\|u\|\,\|v\|&&\text{(Cauchy-Schwarz)}\\&=(\|u\|+\|v\|)^2\end{aligned}$$

Square rooting both sides gives the inequality. Notice that we have an equality only if we have equality from the top to the bottom, requiring from both $\le$'s that:

$$\langle u,v\rangle=\|u\|\,\|v\|$$

where notice that if one of $u,v$ is a nonnegative multiple of the other, then we get the equation above. Conversely, if the equation holds, then the condition for equality in the Cauchy-Schwarz Inequality (see Chapter 6 - Inner Product Spaces#^2ef5ba) implies that one of $u,v$ is a scalar multiple of the other, forcing the scalar in question to be nonnegative as needed.

Similar to the triangle inequality, geometric interpretations suggest a parallelogram equality:

Pasted image 20240308225041.png

Parallelogram Equality

Suppose $u,v\in V$. Then:

$$\|u+v\|^2+\|u-v\|^2=2(\|u\|^2+\|v\|^2)$$

Proof
We have:

$$\begin{aligned}\|u+v\|^2+\|u-v\|^2&=\langle u+v,u+v\rangle+\langle u-v,u-v\rangle\\&=\|u\|^2+\|v\|^2+\langle u,v\rangle+\langle v,u\rangle+\|u\|^2+\|v\|^2-\langle u,v\rangle-\langle v,u\rangle\\&=2(\|u\|^2+\|v\|^2)\end{aligned}$$
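A quick numerical check of the parallelogram equality (my own NumPy illustration; the vectors are made up):

```python
import numpy as np

# Illustration only: check ||u+v||^2 + ||u-v||^2 = 2(||u||^2 + ||v||^2).
u = np.array([1.0, 2.0, -1.0])
v = np.array([0.5, -3.0, 2.0])
lhs = np.linalg.norm(u + v) ** 2 + np.linalg.norm(u - v) ** 2
rhs = 2 * (np.linalg.norm(u) ** 2 + np.linalg.norm(v) ** 2)
assert abs(lhs - rhs) < 1e-9
```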

6.B: Orthonormal Bases

orthonormal

A list of vectors is called orthonormal if each vector in the list has norm $1$ and is orthogonal to all the other vectors in the list. In other words, a list $e_1,\dots,e_m$ of vectors in $V$ is orthonormal if:

$$\langle e_j,e_k\rangle=\begin{cases}1&j=k\\0&j\ne k\end{cases}$$

For instance, the standard basis in $\mathbb{F}^n$ is an orthonormal list.

The norm of an orthonormal linear combination

If $e_1,\dots,e_m$ is an orthonormal list of vectors in $V$, then:

$$\|a_1e_1+\cdots+a_me_m\|^2=|a_1|^2+\cdots+|a_m|^2$$

for all $a_1,\dots,a_m\in\mathbb{F}$.

Proof
Since each $e_j$ has norm $1$, we can just apply Chapter 6 - Inner Product Spaces#^4c14b5 over $m-1$ iterations.

An orthonormal list is LI

Every orthonormal list of vectors is LI.

Proof
Suppose $e_1,\dots,e_m$ is an orthonormal list of vectors in $V$ and $a_1,\dots,a_m\in\mathbb{F}$ such that:

$$a_1e_1+\cdots+a_me_m=0$$

Then $|a_1|^2+\cdots+|a_m|^2=0$ from Chapter 6 - Inner Product Spaces#^845f7d. Thus, all $a_j=0$, showing our list of vectors is LI.

orthonormal basis

An orthonormal basis of V is an orthonormal list of vectors in V that is also a basis of V.

For instance, the standard basis is an orthonormal basis of Fn.

An orthonormal list of the right length is an orthonormal basis

Every orthonormal list of vectors in $V$ with length $\dim(V)$ is an orthonormal basis of $V$.

Proof
By Chapter 6 - Inner Product Spaces#^73b58d the list is LI, and since we have the right number of vectors (namely $\dim(V)$ of them), it's a basis.

In general, given a basis $e_1,\dots,e_n$ of $V$, and a vector $v\in V$, we know that there's a choice of scalars $a_1,\dots,a_n\in\mathbb{F}$ such that:

$$v=a_1e_1+\cdots+a_ne_n$$

But how do we find all these ai's in an efficient manner? The next results will help us in doing that:

Writing a vector as a linear combination of orthonormal basis

Suppose $e_1,\dots,e_n$ is an orthonormal basis of $V$ and $v\in V$. Then:

$$v=\langle v,e_1\rangle e_1+\cdots+\langle v,e_n\rangle e_n$$

and:

$$\|v\|^2=|\langle v,e_1\rangle|^2+\cdots+|\langle v,e_n\rangle|^2$$

Proof
Because $e_1,\dots,e_n$ is a basis of $V$, there are $a_i$ such that:

$$v=a_1e_1+\cdots+a_ne_n$$

Since $e_1,\dots,e_n$ is orthonormal, taking the inner product of both sides with $e_j$ gives $\langle v,e_j\rangle=a_j$. This shows the first equation of our lemma.

The second equation follows immediately from using the first equation with Chapter 6 - Inner Product Spaces#^845f7d.
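The lemma is easy to see in coordinates; a sketch (my own NumPy illustration, with a hand-picked orthonormal basis of $\mathbb{R}^2$):

```python
import numpy as np

# Illustration only: against an orthonormal basis e_1, ..., e_n,
# the coordinates of v are just the inner products <v, e_j>.
e1 = np.array([1.0, 1.0]) / np.sqrt(2)   # an orthonormal basis of R^2
e2 = np.array([1.0, -1.0]) / np.sqrt(2)

v = np.array([3.0, 1.0])
a1, a2 = np.dot(v, e1), np.dot(v, e2)    # a_j = <v, e_j>
assert np.allclose(a1 * e1 + a2 * e2, v)          # v = sum of <v,e_j> e_j
assert abs(a1**2 + a2**2 - np.dot(v, v)) < 1e-12  # ||v||^2 = sum of |<v,e_j>|^2
```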

See Lecture 29 - More on Orthonormality#Gram Schmidt Process for a more in-depth look as to what's going on here.

We see how it's useful to have an orthonormal basis, so how do we get one? This is the Gram-Schmidt Procedure:

Gram-Schmidt Procedure

Suppose $v_1,\dots,v_m$ is a LI list of vectors from $V$. Let $e_1=v_1/\|v_1\|$. For $j=2,\dots,m$, define $e_j$ inductively by:

$$e_j=\frac{v_j-\langle v_j,e_1\rangle e_1-\cdots-\langle v_j,e_{j-1}\rangle e_{j-1}}{\|v_j-\langle v_j,e_1\rangle e_1-\cdots-\langle v_j,e_{j-1}\rangle e_{j-1}\|}$$

Then $e_1,\dots,e_m$ is an orthonormal list of vectors in $V$ such that:

$$\operatorname{span}(v_1,\dots,v_j)=\operatorname{span}(e_1,\dots,e_j)$$

for $j=1,\dots,m$.
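The procedure can be sketched directly (my own NumPy illustration; `gram_schmidt` is a hypothetical name):

```python
import numpy as np

# Illustration only: Gram-Schmidt for vectors in R^n.
# Subtract off the components along the earlier e's, then normalize.
def gram_schmidt(vs):
    es = []
    for v in vs:
        w = v - sum(np.dot(v, e) * e for e in es)  # the numerator in the formula
        es.append(w / np.linalg.norm(w))           # divide by its norm
    return es

vs = [np.array([1.0, 1.0, 0.0]), np.array([1.0, 0.0, 1.0])]
es = gram_schmidt(vs)
# the result is orthonormal
assert abs(np.dot(es[0], es[1])) < 1e-12
assert all(abs(np.linalg.norm(e) - 1.0) < 1e-12 for e in es)
```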

Proof
We'll use induction over $j$. Start with $j=1$. Notice that $\operatorname{span}(v_1)=\operatorname{span}(e_1)$ since $v_1$ is a positive multiple of $e_1$.

Suppose $1<j<m$ and we have it that:

$$\operatorname{span}(v_1,\dots,v_{j-1})=\operatorname{span}(e_1,\dots,e_{j-1})$$

Notice that $v_j\notin\operatorname{span}(v_1,\dots,v_{j-1})$ since $v_1,\dots,v_m$ is LI. Thus $v_j\notin\operatorname{span}(e_1,\dots,e_{j-1})$ by our inductive hypothesis. Hence, we are not dividing by $0$ in the definition of $e_j$ given by the lemma. Dividing a vector by its norm produces a new vector with norm $1$, so $\|e_j\|=1$.

Let $1\le k<j$. Then:

$$\langle e_j,e_k\rangle=\left\langle\frac{v_j-\langle v_j,e_1\rangle e_1-\cdots-\langle v_j,e_{j-1}\rangle e_{j-1}}{\|v_j-\langle v_j,e_1\rangle e_1-\cdots-\langle v_j,e_{j-1}\rangle e_{j-1}\|},e_k\right\rangle=\frac{\langle v_j,e_k\rangle-\langle v_j,e_k\rangle}{\|v_j-\langle v_j,e_1\rangle e_1-\cdots-\langle v_j,e_{j-1}\rangle e_{j-1}\|}=0$$

Thus e1,...,ej is an orthonormal list.

From the definition of $e_j$ given by the lemma, we see that $v_j\in\operatorname{span}(e_1,\dots,e_j)$, and combining this information with the inductive hypothesis gives:

$$\operatorname{span}(v_1,\dots,v_j)\subseteq\operatorname{span}(e_1,\dots,e_j)$$

Both lists are LI (the $v$'s by hypothesis, the $e$'s by orthonormality and Chapter 6 - Inner Product Spaces#^ef122d). Thus, both subspaces above have dimension $j$, and hence they are equal, completing the proof.

An Example

We'll find an orthonormal basis of $\mathcal{P}_2(\mathbb{R})$, where the inner product is given by:

$$\langle p,q\rangle=\int_{-1}^{1}p(x)q(x)\,dx$$

Apply Gram-Schmidt to the basis $1,x,x^2$. To get started, we see that:

$$\|1\|^2=\int_{-1}^{1}1^2\,dx=2$$

Thus $\|1\|=\sqrt{2}$, so then $e_1=\frac{1}{\sqrt{2}}$.

Now the numerator for $e_2$ should be:

$$x-\langle x,e_1\rangle e_1=x-\left(\int_{-1}^{1}x\frac{1}{\sqrt{2}}\,dx\right)\frac{1}{\sqrt{2}}=x$$

We have:

$$\|x\|^2=\int_{-1}^{1}x^2\,dx=\frac{2}{3}$$

Thus we have $e_2=\sqrt{\frac{3}{2}}\,x$. Now the numerator for $e_3$ is:

$$x^2-\langle x^2,e_1\rangle e_1-\langle x^2,e_2\rangle e_2=x^2-\frac{1}{\sqrt{2}}\int_{-1}^{1}x^2\frac{1}{\sqrt{2}}\,dx-\sqrt{\tfrac{3}{2}}\,x\int_{-1}^{1}x^2\sqrt{\tfrac{3}{2}}\,x\,dx=x^2-\frac{1}{3}$$

And then:

$$\left\|x^2-\tfrac{1}{3}\right\|^2=\int_{-1}^{1}\left(x^2-\tfrac{1}{3}\right)^2dx=\frac{8}{45}$$

Thus $e_3=\sqrt{\frac{45}{8}}\left(x^2-\tfrac{1}{3}\right)$.

Thus $e_1,e_2,e_3$ is $\frac{1}{\sqrt{2}},\ \sqrt{\frac{3}{2}}\,x,\ \sqrt{\frac{45}{8}}\left(x^2-\tfrac{1}{3}\right)$, which is our orthonormal list of length $3$ in our vector space. Hence, this orthonormal list is an orthonormal basis of $\mathcal{P}_2(\mathbb{R})$ since it's LI (from orthonormality), and is of the right dimension.
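We can numerically spot-check that these three polynomials are orthonormal (my own illustration; the midpoint-rule `inner` is a hypothetical helper):

```python
import math

# Illustration only: check orthonormality of the basis found above under
# <p, q> = integral of p(x)q(x) over [-1, 1], via a midpoint rule.
def inner(p, q, n=20001):
    h = 2.0 / n
    return sum(p(-1 + (k + 0.5) * h) * q(-1 + (k + 0.5) * h) for k in range(n)) * h

e1 = lambda x: 1 / math.sqrt(2)
e2 = lambda x: math.sqrt(3 / 2) * x
e3 = lambda x: math.sqrt(45 / 8) * (x * x - 1 / 3)

basis = [e1, e2, e3]
for i in range(3):
    for j in range(3):
        expected = 1.0 if i == j else 0.0
        assert abs(inner(basis[i], basis[j]) - expected) < 1e-6
```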

Existence of orthonormal basis

Every finite-dimensional inner product space has an orthonormal basis.

Proof
If $V$ is finite-dimensional, then there's a basis $v_1,\dots,v_n$ for $V$. Apply Gram-Schmidt to get an orthonormal list with length $\dim(V)=n$. This orthonormal list is LI, so it's an orthonormal basis of $V$.

Orthonormal list extends to orthonormal basis

Suppose V is finite-dimensional. Then every orthonormal list of vectors in V can be extended to an orthonormal basis of V.

Proof
If $e_1,\dots,e_m$ is an orthonormal list of vectors in $V$, then $e_1,\dots,e_m$ is LI. This list can as a result be extended to a basis $e_1,\dots,e_m,v_1,\dots,v_n$ of $V$. Applying Gram-Schmidt, we get an orthonormal list:

$$e_1,\dots,e_m,f_1,\dots,f_n$$

where the first $m$ vectors are unchanged since they are already orthonormal. The list above is an orthonormal basis of $V$ since it's the right length.

Recall that a matrix is upper triangular if all the entries below the diagonal equal 0. From Chapter 5 - Eigenvalues, Eigenvectors, and Invariant Subspaces#^2e8389, we would like to know whether there exists an orthonormal basis specifically, with respect to which we have an upper-triangular matrix.

Upper-triangular matrix with respect to orthonormal basis

Suppose $T\in\mathcal{L}(V)$. If $T$ has an upper-triangular matrix with respect to some basis of $V$, then $T$ has an upper-triangular matrix with respect to some orthonormal basis of $V$.

Proof
Suppose $T$ has an upper-triangular matrix with respect to some basis $v_1,\dots,v_n$ of $V$. Then $\operatorname{span}(v_1,\dots,v_j)$ is invariant under $T$ for each $j=1,\dots,n$ via Chapter 5 - Eigenvalues, Eigenvectors, and Invariant Subspaces#^a5a043.

Apply the Gram-Schmidt Procedure to $v_1,\dots,v_n$, producing an orthonormal basis $e_1,\dots,e_n$ of $V$. Because:

$$\operatorname{span}(e_1,\dots,e_j)=\operatorname{span}(v_1,\dots,v_j)$$

for each $j$ via our Gram-Schmidt procedure, we can conclude that $\operatorname{span}(e_1,\dots,e_j)$ is invariant under $T$ for each $j=1,\dots,n$. Thus, by our invariance property, $T$ has an upper-triangular matrix with respect to the orthonormal basis $e_1,\dots,e_n$.
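The key step, that Gram-Schmidt preserves triangularity, can be mirrored numerically: the QR factorization of the basis matrix performs exactly this orthonormalization (my own NumPy illustration; the matrices are made up):

```python
import numpy as np

# Illustration only: if M is upper triangular in the basis given by the columns of
# B, then the Q from B = QR (Q's columns are the Gram-Schmidt orthonormalization
# of B's columns) gives an orthonormal basis in which the operator is still
# upper triangular.
B = np.array([[1.0, 2.0], [1.0, 3.0]])   # a non-orthonormal basis, as columns
M = np.array([[2.0, 1.0], [0.0, 5.0]])   # the operator's matrix in that basis

Q, R = np.linalg.qr(B)
A = B @ M @ np.linalg.inv(B)             # the operator itself (standard basis)
M_orth = Q.T @ A @ Q                     # Q is real orthogonal, so Q^{-1} = Q^T
assert abs(M_orth[1, 0]) < 1e-12         # still upper triangular
```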

Schur's Theorem

Suppose $V$ is a finite-dimensional complex vector space and $T\in\mathcal{L}(V)$. Then $T$ has an upper-triangular matrix with respect to some orthonormal basis of $V$.

Proof
Recall that T has an upper-triangular matrix with respect to some basis of V via Chapter 5 - Eigenvalues, Eigenvectors, and Invariant Subspaces#^2e8389. Apply Gram-Schmidt and Chapter 6 - Inner Product Spaces#^b22ab5.

Linear Functionals on Inner Product Spaces

linear functional

A linear functional on $V$ is a linear map from $V$ to $\mathbb{F}$. In other words, a linear functional is an element of $\mathcal{L}(V,\mathbb{F})$.

For instance, the function $\varphi:\mathbb{F}^3\to\mathbb{F}$ given by:

$$\varphi(z_1,z_2,z_3)=2z_1-5z_2+z_3$$

is a linear functional on $\mathbb{F}^3$. We could write this linear functional in the form:

$$\varphi(z)=\langle z,u\rangle$$

for all $z\in\mathbb{F}^3$ where $u=(2,-5,1)$.
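A quick check of this representation (my own illustration over $\mathbb{R}^3$, so no conjugates are needed; I'm reading the stripped signs above as $\varphi(z)=2z_1-5z_2+z_3$ with $u=(2,-5,1)$):

```python
import numpy as np

# Illustration only: the functional phi(z) = 2 z1 - 5 z2 + z3 equals <z, u>
# with u = (2, -5, 1).
def phi(z):
    return 2 * z[0] - 5 * z[1] + z[2]

u = np.array([2.0, -5.0, 1.0])
z = np.array([1.0, 2.0, 3.0])
assert abs(phi(z) - np.dot(z, u)) < 1e-12
```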

We won't need to cover linear functionals for the time being, so we just end here!

6.C: Orthogonal Complements and Minimization Problems

(I'll see you back here next quarter!!!)