Think of vectors in $\mathbb{R}^2$ or $\mathbb{R}^3$. Specifically, think of distances in these spaces. We denote the norm of $x$ by $\|x\|$, where for $x = (x_1, x_2) \in \mathbb{R}^2$ we have $\|x\| = \sqrt{x_1^2 + x_2^2}$. Similarly if $x = (x_1, x_2, x_3) \in \mathbb{R}^3$ then we have $\|x\| = \sqrt{x_1^2 + x_2^2 + x_3^2}$. In $\mathbb{R}^n$ we have:

$$\|x\| = \sqrt{x_1^2 + \cdots + x_n^2}$$
The norm itself isn't linear on $\mathbb{R}^n$, but to inject linearity we define:
dot product
For $x, y \in \mathbb{R}^n$, the dot product of $x$ and $y$, denoted $x \cdot y$, is defined by:

$$x \cdot y = x_1 y_1 + \cdots + x_n y_n$$

where $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$.
Keep in mind the dot product is a binary operator: it takes two vectors in $\mathbb{R}^n$ and returns a number in $\mathbb{R}$. Obviously $x \cdot x = \|x\|^2$. We notice the following properties of this dot product:
$x \cdot x \geq 0$ for all $x \in \mathbb{R}^n$. (Distance is never negative)
$x \cdot x = 0$ iff $x = 0$. (The distance to the origin is 0 only for the zero vector)
For $y \in \mathbb{R}^n$ fixed, the map from $\mathbb{R}^n$ to $\mathbb{R}$ that sends $x \in \mathbb{R}^n$ to $x \cdot y$ is linear. (Namely $(x + z) \cdot y = x \cdot y + z \cdot y$ and $(\lambda x) \cdot y = \lambda(x \cdot y)$, which is the distributive property)
$x \cdot y = y \cdot x$ for all $x, y \in \mathbb{R}^n$ (commutativity)
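As a quick sanity check of the dot product definition and the properties above, here's a minimal NumPy sketch (the particular vectors and scalar are arbitrary):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, -1.0, 2.0])
z = np.array([0.5, 0.0, -2.0])
lam = 3.0

# x . x equals the norm squared
assert np.isclose(np.dot(x, x), np.linalg.norm(x) ** 2)

# linearity in the first slot for fixed y
assert np.isclose(np.dot(x + z, y), np.dot(x, y) + np.dot(z, y))
assert np.isclose(np.dot(lam * x, y), lam * np.dot(x, y))

# commutativity
assert np.isclose(np.dot(x, y), np.dot(y, x))
```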
The inner product is a generalization of this dot product. But keep in mind the properties above are true mainly for real spaces; we need something to deal with complex spaces. Namely, if $\lambda = a + bi$ where $a, b \in \mathbb{R}$, then:

$$|\lambda| = \sqrt{a^2 + b^2}, \qquad |\lambda|^2 = \lambda \bar{\lambda}$$
So for $z = (z_1, \dots, z_n) \in \mathbb{C}^n$ we have the norm as:

$$\|z\| = \sqrt{|z_1|^2 + \cdots + |z_n|^2}$$
Notice we need $|z_k|$ instead of $z_k$, since it's possible for $z_k^2$ to be a negative number (which is bad under a square root, as in the example $z_k = i$, since then $z_k^2 = -1$). Note that:

$$\|z\|^2 = z_1 \bar{z}_1 + \cdots + z_n \bar{z}_n$$
You want to think of $\|z\|^2$ as the inner product of $z$ with itself, similar to the dot product above. As such, this implies that if $w = (w_1, \dots, w_n) \in \mathbb{C}^n$ is defined similar to $z$, then the inner product of $w$ with $z$ is:

$$w_1 \bar{z}_1 + \cdots + w_n \bar{z}_n$$

If $w$ and $z$ are swapped, we get the complex conjugate, suggesting that the inner product of $z$ with $w$ is the complex conjugate of the inner product of $w$ with $z$.
Some things before we define the inner product:
If $\lambda \in \mathbb{C}$, then $\lambda \geq 0$ means that $\lambda$ is both real and nonnegative.
We use $\langle u, v \rangle$ to denote the inner product of $u$ and $v$.
inner product
An inner product on $V$ is a function that takes each ordered pair $(u, v)$ of elements of $V$ to a number $\langle u, v \rangle \in \mathbb{F}$ and has the following properties:
(positivity): $\langle v, v \rangle \geq 0$ for all $v \in V$
(definiteness): $\langle v, v \rangle = 0$ iff $v = 0$
(additivity in the first slot): $\langle u + v, w \rangle = \langle u, w \rangle + \langle v, w \rangle$ for all $u, v, w \in V$
(homogeneity in the first slot): $\langle \lambda u, v \rangle = \lambda \langle u, v \rangle$ for all $\lambda \in \mathbb{F}$ and all $u, v \in V$
(conjugate symmetry): $\langle u, v \rangle = \overline{\langle v, u \rangle}$ for all $u, v \in V$
Since any real number is its own complex conjugate, if we're dealing with a real vector space then the last condition just says that $\langle u, v \rangle = \langle v, u \rangle$ for all $u, v \in V$.
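To see the complex case concretely, here's a small NumPy sketch of the Euclidean inner product $\langle w, z \rangle = w_1 \bar{z}_1 + \cdots + w_n \bar{z}_n$ on $\mathbb{C}^n$, checking conjugate symmetry, homogeneity in the first slot, and positivity on some arbitrary vectors:

```python
import numpy as np

def inner(w, z):
    """Euclidean inner product on C^n: sum of w_j * conj(z_j)."""
    return np.sum(w * np.conj(z))

w = np.array([1 + 2j, 3 - 1j])
z = np.array([-2 + 1j, 0.5 + 0.5j])
lam = 2 - 3j

# conjugate symmetry: <w, z> = conj(<z, w>)
assert np.isclose(inner(w, z), np.conj(inner(z, w)))

# homogeneity in the first slot
assert np.isclose(inner(lam * w, z), lam * inner(w, z))

# <z, z> is real and nonnegative (positivity)
assert np.isclose(inner(z, z).imag, 0) and inner(z, z).real >= 0
```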
An Inner Product on Functions
An inner product can be defined on the vector space of continuous real-valued functions on the interval $[-1, 1]$ by:

$$\langle f, g \rangle = \int_{-1}^{1} f(x)\, g(x)\, dx$$
Another Inner Product on Functions
An inner product can be defined on $\mathcal{P}(\mathbb{R})$ by:

$$\langle p, q \rangle = \int_{0}^{\infty} p(x)\, q(x)\, e^{-x}\, dx$$
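To make these function-space inner products concrete, here's a sketch that evaluates both numerically with `scipy.integrate.quad`; it assumes the interval $[-1, 1]$ for the first example and the weight $e^{-x}$ on $[0, \infty)$ for the second, matching the formulas written above:

```python
import numpy as np
from scipy.integrate import quad

def inner_C(f, g):
    """Inner product on continuous functions on [-1, 1]: integral of f*g."""
    return quad(lambda x: f(x) * g(x), -1, 1)[0]

def inner_P(p, q):
    """Inner product on polynomials: integral over [0, inf) of p*q*e^(-x)."""
    return quad(lambda x: p(x) * q(x) * np.exp(-x), 0, np.inf)[0]

print(inner_C(np.sin, np.cos))               # ~0: sin*cos is odd on [-1, 1]
print(inner_P(lambda x: x, lambda x: 1.0))   # integral of x*e^(-x) over [0, inf) = 1
```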
inner product space
An inner product space is a vector space $V$ along with an inner product on $V$.
If you're given $\mathbb{F}^n$, you can usually assume that $\langle \cdot, \cdot \rangle$ refers to the standard dot product we talked about earlier (called the Euclidean Inner Product).
For the sake of brevity we make the following assumption:
For the rest of this chapter, $V$ denotes an inner product space over $\mathbb{F}$.
Note the abuse of language here. Saying "$V$ is an inner product space" means $V$ is a vector space (same name, different thing) together with an inner product on it that is obvious from context.
Note
Note that the inner products from the examples were "obvious" because they took the idea of the Euclidean dot product, of multiplying corresponding values and adding them up, and wrapped the adding-up part into an integral, which means essentially the same thing.
Basic Properties of an Inner Product
(a) For each fixed $u \in V$, the function that takes $v$ to $\langle v, u \rangle$ is a linear map from $V$ to $\mathbb{F}$
(b) $\langle 0, u \rangle = 0$ for every $u \in V$
(c) $\langle u, 0 \rangle = 0$ for every $u \in V$
(d) $\langle u, v + w \rangle = \langle u, v \rangle + \langle u, w \rangle$ for all $u, v, w \in V$
(e) $\langle u, \lambda v \rangle = \bar{\lambda} \langle u, v \rangle$ for all $\lambda \in \mathbb{F}$ and all $u, v \in V$
Proof
(a): This comes from the conditions of additivity in the first slot and homogeneity in the first slot in the definition of an inner product.
(b): Follows from (a) and the result that every linear map takes $0$ to $0$.
(c): Follows from (b) and the conjugate symmetry property in the definition of an inner product.
(d): Suppose $u, v, w \in V$. Then:

$$\langle u, v + w \rangle = \overline{\langle v + w, u \rangle} = \overline{\langle v, u \rangle + \langle w, u \rangle} = \overline{\langle v, u \rangle} + \overline{\langle w, u \rangle} = \langle u, v \rangle + \langle u, w \rangle$$
(e): Suppose $\lambda \in \mathbb{F}$ and $u, v \in V$. Then:

$$\langle u, \lambda v \rangle = \overline{\langle \lambda v, u \rangle} = \overline{\lambda \langle v, u \rangle} = \bar{\lambda}\, \overline{\langle v, u \rangle} = \bar{\lambda} \langle u, v \rangle$$
☐
Norms
Our initial desire was to define distances for other spaces. Now we see that the inner product determines this norm:

norm

For $v \in V$, the norm of $v$, denoted $\|v\|$, is defined by:

$$\|v\| = \sqrt{\langle v, v \rangle}$$

Basic Properties of the Norm

Suppose $v \in V$. Then:

(a) $\|v\| = 0$ iff $v = 0$
(b) $\|\lambda v\| = |\lambda|\, \|v\|$ for all $\lambda \in \mathbb{F}$
Proof
(a): Comes from the fact that $\langle v, v \rangle = 0$ iff $v = 0$ (the definiteness property of the inner product).
(b): Suppose $\lambda \in \mathbb{F}$, then:

$$\|\lambda v\|^2 = \langle \lambda v, \lambda v \rangle = \lambda \langle v, \lambda v \rangle = \lambda \bar{\lambda} \langle v, v \rangle = |\lambda|^2 \|v\|^2$$

Taking the square root of both sides finishes the proof.
☐
Notice that the proof above worked with the norm squared. In general, it's better to do proofs this way, because the norm is never negative (so the $\pm$ we would usually get when taking square roots at the end is not an issue).
orthogonal
Two vectors $u, v \in V$ are called orthogonal if $\langle u, v \rangle = 0$.
Notice that the order here doesn't matter, since even in a complex vector space the complex conjugate of $0$ is $0$, so $\langle u, v \rangle = 0$ iff $\langle v, u \rangle = 0$.
In $\mathbb{R}^2$ we have:

$$u \cdot v = \|u\|\, \|v\| \cos\theta$$

where $\theta$ is the angle between $u$ and $v$ (thinking of $u$ and $v$ as arrows pointing from the origin). Thus, the two vectors are orthogonal, using the Euclidean inner product, iff $\cos\theta = 0$, i.e. when $\theta = \pi/2$ or equivalent. Thus, we're able to take the words perpendicular and orthogonal as meaning the same thing.
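A quick numerical illustration of the angle interpretation, recovering $\theta$ from the Euclidean dot product (the vectors are arbitrary):

```python
import numpy as np

u = np.array([2.0, 0.0])
v = np.array([0.0, 3.0])     # perpendicular to u

cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
print(np.degrees(np.arccos(cos_theta)))      # 90.0: orthogonal means perpendicular

w = np.array([1.0, 1.0])
cos_theta_uw = np.dot(u, w) / (np.linalg.norm(u) * np.linalg.norm(w))
print(np.degrees(np.arccos(cos_theta_uw)))   # 45.0
```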
Orthogonality and $0$
(a) $0$ is orthogonal to every vector in $V$
(b) $0$ is the only vector in $V$ that is orthogonal to itself
Proof

(a): Follows from the result above that $\langle 0, u \rangle = 0$ for every $u \in V$.

(b): If $v \in V$ and $\langle v, v \rangle = 0$, then $v = 0$ by the definiteness property in the definition of the inner product.
☐
For the special case $V = \mathbb{R}^2$, the proof of the next result is super classic. But now, we can abstract it away!
Pythagorean Theorem
Suppose $u, v$ are orthogonal vectors in $V$. Then:

$$\|u + v\|^2 = \|u\|^2 + \|v\|^2$$
Proof
Here notice that $\|u + v\|^2 = \langle u + v, u + v \rangle$, so then:

$$\begin{aligned} \|u + v\|^2 &= \langle u + v, u + v \rangle \\ &= \langle u, u \rangle + \langle u, v \rangle + \langle v, u \rangle + \langle v, v \rangle \\ &= \|u\|^2 + \|v\|^2 + \langle u, v \rangle + \overline{\langle u, v \rangle} \\ &= \|u\|^2 + \|v\|^2 + 2\operatorname{Re}\langle u, v \rangle \end{aligned}$$

in general. In this case, since $u, v$ are orthogonal, $\langle u, v \rangle = 0$, so the real-part term becomes $0$, and then:

$$\|u + v\|^2 = \|u\|^2 + \|v\|^2$$
☐
Suppose $u, v \in V$ with $v \neq 0$. We would like to write $u$ as a scalar multiple of $v$ plus a vector $w$ orthogonal to $v$, as suggested by the usual picture of projecting $u$ onto the line through $v$.
We want to get the vector $w$ above. Notice here that, where $c \in \mathbb{F}$:

$$u = cv + (u - cv)$$

We want the vector $u - cv$ to be perpendicular to $v$, so namely:

$$\langle u - cv, v \rangle = 0$$

Since $\langle u - cv, v \rangle = \langle u, v \rangle - c\|v\|^2$, set this to $0$ and solve for $c$:

$$c = \frac{\langle u, v \rangle}{\|v\|^2}$$

So then plug it back in to get:

$$u = \frac{\langle u, v \rangle}{\|v\|^2}\, v + \left( u - \frac{\langle u, v \rangle}{\|v\|^2}\, v \right)$$
Thus, we proved the following:
An orthogonal decomposition
Suppose $u, v \in V$, with $v \neq 0$. Then set $c = \dfrac{\langle u, v \rangle}{\|v\|^2}$ and $w = u - \dfrac{\langle u, v \rangle}{\|v\|^2}\, v$. Then:

$$\langle w, v \rangle = 0$$

and:

$$u = cv + w$$
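Here's a minimal sketch of this orthogonal decomposition for real vectors (arbitrary $u$ and $v$), which also checks the Pythagorean Theorem, since $cv$ and $w$ are orthogonal:

```python
import numpy as np

def orthogonal_decomposition(u, v):
    """Return (c, w) with u = c*v + w and <w, v> = 0 (real case), assuming v != 0."""
    c = np.dot(u, v) / np.dot(v, v)
    w = u - c * v
    return c, w

u = np.array([3.0, 1.0, 2.0])
v = np.array([1.0, 1.0, 0.0])

c, w = orthogonal_decomposition(u, v)
assert np.allclose(u, c * v + w)
assert np.isclose(np.dot(w, v), 0.0)

# Pythagorean Theorem applies, since c*v and w are orthogonal
assert np.isclose(np.linalg.norm(u)**2,
                  np.linalg.norm(c * v)**2 + np.linalg.norm(w)**2)
```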
Cauchy-Schwarz Inequality
Suppose $u, v \in V$. Then:

$$|\langle u, v \rangle| \leq \|u\|\, \|v\|$$

This inequality is an equality iff one of $u, v$ is a scalar multiple of the other.
Proof
If $v = 0$ then both sides are $0$ and we get an equality, so let $v \neq 0$. Then consider the orthogonal decomposition:

$$u = \frac{\langle u, v \rangle}{\|v\|^2}\, v + w$$

given by our decomposition above, where $w$ and $v$ are orthogonal. By the Pythagorean Theorem:

$$\|u\|^2 = \left\| \frac{\langle u, v \rangle}{\|v\|^2}\, v \right\|^2 + \|w\|^2 = \frac{|\langle u, v \rangle|^2}{\|v\|^2} + \|w\|^2 \geq \frac{|\langle u, v \rangle|^2}{\|v\|^2}$$

Multiply both sides by $\|v\|^2$, then square root both sides to get the inequality above.

Notice equality only happens when $\|w\|^2 = 0$, which only happens when $w = 0$. But $w = 0$ iff $u$ is a multiple of $v$ via Chapter 6 - Inner Product Spaces#^315ed2, so then we only get equality iff $u$ is a scalar multiple of $v$ or $v$ is a scalar multiple of $u$.
☐
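A quick numerical check of the Cauchy-Schwarz Inequality on arbitrary real vectors, including the equality case where one vector is a scalar multiple of the other:

```python
import numpy as np

rng = np.random.default_rng(0)
u = rng.standard_normal(5)
v = rng.standard_normal(5)

# |<u, v>| <= ||u|| ||v||
lhs = abs(np.dot(u, v))
rhs = np.linalg.norm(u) * np.linalg.norm(v)
assert lhs <= rhs + 1e-12

# equality when one vector is a scalar multiple of the other
assert np.isclose(abs(np.dot(u, 3 * u)),
                  np.linalg.norm(u) * np.linalg.norm(3 * u))
```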
Example
If $f, g$ are continuous real-valued functions on $[-1, 1]$, then:

$$\left| \int_{-1}^{1} f(x)\, g(x)\, dx \right|^2 \leq \left( \int_{-1}^{1} f(x)^2\, dx \right) \left( \int_{-1}^{1} g(x)^2\, dx \right)$$
Triangle Inequality
Suppose $u, v \in V$. Then:

$$\|u + v\| \leq \|u\| + \|v\|$$

where we get equality iff one of $u, v$ is a nonnegative multiple of the other.
Proof
We have:

$$\begin{aligned} \|u + v\|^2 &= \langle u + v, u + v \rangle \\ &= \|u\|^2 + \|v\|^2 + \langle u, v \rangle + \langle v, u \rangle \\ &= \|u\|^2 + \|v\|^2 + 2\operatorname{Re}\langle u, v \rangle \\ &\leq \|u\|^2 + \|v\|^2 + 2|\langle u, v \rangle| \\ &\leq \|u\|^2 + \|v\|^2 + 2\|u\|\,\|v\| \\ &= (\|u\| + \|v\|)^2 \end{aligned}$$

Square rooting both sides gives the inequality. Notice that we have an equality only if we have equality in both $\leq$'s above, requiring that:

$$\langle u, v \rangle = \|u\|\, \|v\|$$

where notice that if one of $u, v$ is a nonnegative multiple of the other, then we get the equation above. Conversely, if the equation holds, then the condition for equality in the Cauchy-Schwarz Inequality (see Chapter 6 - Inner Product Spaces#^2ef5ba) implies that one of $u, v$ is a scalar multiple of the other, and the equation forces the scalar in question to be nonnegative, as needed.
☐
Similar to the triangle inequality, geometric interpretations suggest a parallelogram equality:
Parallelogram Equality
Suppose $u, v \in V$. Then:

$$\|u + v\|^2 + \|u - v\|^2 = 2\left( \|u\|^2 + \|v\|^2 \right)$$
Proof
We have:

$$\begin{aligned} \|u + v\|^2 + \|u - v\|^2 &= \langle u + v, u + v \rangle + \langle u - v, u - v \rangle \\ &= \|u\|^2 + \|v\|^2 + \langle u, v \rangle + \langle v, u \rangle + \|u\|^2 + \|v\|^2 - \langle u, v \rangle - \langle v, u \rangle \\ &= 2\left( \|u\|^2 + \|v\|^2 \right) \end{aligned}$$
☐
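And a numerical check of both the Triangle Inequality and the Parallelogram Equality on arbitrary real vectors:

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.standard_normal(4)
v = rng.standard_normal(4)

# triangle inequality
assert np.linalg.norm(u + v) <= np.linalg.norm(u) + np.linalg.norm(v) + 1e-12

# parallelogram equality
lhs = np.linalg.norm(u + v)**2 + np.linalg.norm(u - v)**2
rhs = 2 * (np.linalg.norm(u)**2 + np.linalg.norm(v)**2)
assert np.isclose(lhs, rhs)
```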
6.B: Orthonormal Bases
orthonormal
A list of vectors is called orthonormal if each vector in the list has norm 1 and is orthogonal to all the other vectors in the list. In other words, a list $e_1, \dots, e_m$ of vectors in $V$ is orthonormal if:

$$\langle e_j, e_k \rangle = \begin{cases} 1 & \text{if } j = k \\ 0 & \text{if } j \neq k \end{cases}$$
For instance, the standard basis in $\mathbb{F}^n$ is an orthonormal list.
An orthonormal basis of $V$ is an orthonormal list of vectors in $V$ that is also a basis of $V$.
For instance, the standard basis is an orthonormal basis of $\mathbb{F}^n$.
An orthonormal list of the right length is an orthonormal basis
Every orthonormal list of vectors in $V$ with length $\dim V$ is an orthonormal basis of $V$.
Proof
By Chapter 6 - Inner Product Spaces#^73b58d, the list is LI, and since we have the right number of vectors by having $\dim V$ of them, it's a basis.
☐
In general, given a basis $e_1, \dots, e_n$ of $V$ and a vector $v \in V$, we know that there's a choice of scalars $a_1, \dots, a_n$ such that:

$$v = a_1 e_1 + \cdots + a_n e_n$$

But how do we find all these $a_j$'s in an efficient manner? The next results will help us in doing that:
Writing a vector as a linear combination of orthonormal basis
Suppose $e_1, \dots, e_n$ is an orthonormal basis of $V$ and $v \in V$. Then:

$$v = \langle v, e_1 \rangle e_1 + \cdots + \langle v, e_n \rangle e_n$$

and:

$$\|v\|^2 = |\langle v, e_1 \rangle|^2 + \cdots + |\langle v, e_n \rangle|^2$$
Proof
Because $e_1, \dots, e_n$ is a basis of $V$, there are scalars $a_1, \dots, a_n$ such that:

$$v = a_1 e_1 + \cdots + a_n e_n$$

Since $e_1, \dots, e_n$ is orthonormal, taking the inner product of both sides with $e_j$ gives $\langle v, e_j \rangle = a_j$. This shows the first equation of our lemma. The second equation follows by expanding $\|v\|^2 = \langle v, v \rangle$ with the first equation and using orthonormality to kill the cross terms.

☐
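Here's a small sketch of this result in $\mathbb{R}^3$: the rows of `E` below form an (arbitrary) orthonormal basis, and the coefficients of $v$ are exactly the inner products $\langle v, e_j \rangle$:

```python
import numpy as np

# An orthonormal basis e_1, e_2, e_3 of R^3 (each row is one basis vector).
E = np.array([[1.0,  0.0, 0.0],
              [0.0,  0.6, 0.8],
              [0.0, -0.8, 0.6]])
v = np.array([2.0, 1.0, -3.0])

coeffs = E @ v                       # coeffs[j] = <v, e_j>
assert np.allclose(v, coeffs @ E)    # v = sum_j <v, e_j> e_j
assert np.isclose(np.linalg.norm(v)**2, np.sum(coeffs**2))  # norm formula
```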
We see how useful it is to have an orthonormal basis, so how do we get one? This is the Gram-Schmidt Procedure:
Gram-Schmidt Procedure
Suppose $v_1, \dots, v_m$ is a LI list of vectors from $V$. Let $e_1 = v_1 / \|v_1\|$. For $j = 2, \dots, m$, define $e_j$ inductively by:

$$e_j = \frac{v_j - \langle v_j, e_1 \rangle e_1 - \cdots - \langle v_j, e_{j-1} \rangle e_{j-1}}{\left\| v_j - \langle v_j, e_1 \rangle e_1 - \cdots - \langle v_j, e_{j-1} \rangle e_{j-1} \right\|}$$

Then $e_1, \dots, e_m$ is an orthonormal list of vectors in $V$ such that:

$$\operatorname{span}(v_1, \dots, v_j) = \operatorname{span}(e_1, \dots, e_j)$$

for $j = 1, \dots, m$.
Proof
We'll use induction over $j$. Start with $j = 1$. Notice that $\operatorname{span}(v_1) = \operatorname{span}(e_1)$ since $e_1$ is a positive multiple of $v_1$.

Suppose $1 < j \leq m$ and we have it that:

$$\operatorname{span}(v_1, \dots, v_{j-1}) = \operatorname{span}(e_1, \dots, e_{j-1})$$

Notice that $v_j \notin \operatorname{span}(v_1, \dots, v_{j-1})$ since $v_1, \dots, v_m$ is LI. Thus $v_j \notin \operatorname{span}(e_1, \dots, e_{j-1})$ by our inductive hypothesis. Hence, we are not dividing by 0 in the definition of $e_j$ given by the lemma. Dividing a vector by its norm produces a new vector with norm 1, so $\|e_j\| = 1$.
Let $1 \leq k < j$. Then:

$$\begin{aligned} \langle e_j, e_k \rangle &= \left\langle \frac{v_j - \langle v_j, e_1 \rangle e_1 - \cdots - \langle v_j, e_{j-1} \rangle e_{j-1}}{\left\| v_j - \langle v_j, e_1 \rangle e_1 - \cdots - \langle v_j, e_{j-1} \rangle e_{j-1} \right\|}, e_k \right\rangle \\ &= \frac{\langle v_j, e_k \rangle - \langle v_j, e_k \rangle}{\left\| v_j - \langle v_j, e_1 \rangle e_1 - \cdots - \langle v_j, e_{j-1} \rangle e_{j-1} \right\|} \\ &= 0 \end{aligned}$$

Thus $e_1, \dots, e_j$ is an orthonormal list.

From the definition of $e_j$ given by the lemma, we see that $v_j \in \operatorname{span}(e_1, \dots, e_j)$, and combining this information with the inductive hypothesis gives:

$$\operatorname{span}(v_1, \dots, v_j) \subseteq \operatorname{span}(e_1, \dots, e_j)$$

Both lists are LI (the $v$'s by hypothesis, the $e$'s by orthonormality and Chapter 6 - Inner Product Spaces#^ef122d). Thus, both subspaces above have dimension $j$, and hence they are equal, completing the proof.
☐
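Here's a minimal implementation of the Gram-Schmidt Procedure for real vectors, following the formula in the statement (subtract off the components along the $e$'s found so far, then normalize); the test vectors are arbitrary:

```python
import numpy as np

def gram_schmidt(vectors):
    """Apply the Gram-Schmidt Procedure to a linearly independent list of
    real vectors, returning an orthonormal list with the same spans."""
    es = []
    for v in vectors:
        # subtract off the components along the e's found so far
        w = v - sum(np.dot(v, e) * e for e in es)
        es.append(w / np.linalg.norm(w))
    return es

vs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]

es = gram_schmidt(vs)
G = np.array([[np.dot(a, b) for b in es] for a in es])
assert np.allclose(G, np.eye(3))     # the e's are orthonormal
```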
An Example
We'll find an orthonormal basis of $\mathcal{P}_2(\mathbb{R})$, where the inner product is given by:

$$\langle p, q \rangle = \int_{-1}^{1} p(x)\, q(x)\, dx$$

Apply Gram-Schmidt to the basis $1, x, x^2$. To get started, we see that:

$$\|1\|^2 = \int_{-1}^{1} 1\, dx = 2$$

Thus $\|1\| = \sqrt{2}$, so then $e_1 = \sqrt{\tfrac{1}{2}}$.

Now the numerator for $e_2$ should be:

$$x - \langle x, e_1 \rangle e_1 = x - \left( \int_{-1}^{1} x \sqrt{\tfrac{1}{2}}\, dx \right) \sqrt{\tfrac{1}{2}} = x$$

We have:

$$\|x\|^2 = \int_{-1}^{1} x^2\, dx = \frac{2}{3}$$

Thus $\|x\| = \sqrt{\tfrac{2}{3}}$, so we have $e_2 = \sqrt{\tfrac{3}{2}}\, x$. Now the numerator for $e_3$ is:

$$x^2 - \langle x^2, e_1 \rangle e_1 - \langle x^2, e_2 \rangle e_2 = x^2 - \left( \int_{-1}^{1} x^2 \sqrt{\tfrac{1}{2}}\, dx \right) \sqrt{\tfrac{1}{2}} - \left( \int_{-1}^{1} x^2 \sqrt{\tfrac{3}{2}}\, x\, dx \right) \sqrt{\tfrac{3}{2}}\, x = x^2 - \tfrac{1}{3}$$

And then:

$$\left\| x^2 - \tfrac{1}{3} \right\|^2 = \int_{-1}^{1} \left( x^2 - \tfrac{1}{3} \right)^2 dx = \frac{8}{45}$$

Thus $e_3 = \sqrt{\tfrac{45}{8}} \left( x^2 - \tfrac{1}{3} \right)$.

Thus our list is $\sqrt{\tfrac{1}{2}},\ \sqrt{\tfrac{3}{2}}\, x,\ \sqrt{\tfrac{45}{8}} \left( x^2 - \tfrac{1}{3} \right)$, which is our orthonormal list of length 3 in our vector space. Hence, this orthonormal list is an orthonormal basis of $\mathcal{P}_2(\mathbb{R})$ since it's LI (from orthonormality) and of the right length.
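We can verify this basis numerically by computing the pairwise inner products $\int_{-1}^{1} e_j(x)\, e_k(x)\, dx$ with `scipy.integrate.quad`; they should come out to 1 when $j = k$ and 0 otherwise:

```python
import numpy as np
from scipy.integrate import quad

def inner(p, q):
    """Inner product on P_2(R): integral of p*q over [-1, 1]."""
    return quad(lambda x: p(x) * q(x), -1, 1)[0]

e1 = lambda x: np.sqrt(1 / 2)
e2 = lambda x: np.sqrt(3 / 2) * x
e3 = lambda x: np.sqrt(45 / 8) * (x**2 - 1 / 3)

basis = [e1, e2, e3]
for j, p in enumerate(basis):
    for k, q in enumerate(basis):
        expected = 1.0 if j == k else 0.0
        assert np.isclose(inner(p, q), expected, atol=1e-10)
```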
Existence of orthonormal basis
Every finite-dimensional inner product space has an orthonormal basis.
Proof
If $V$ is finite-dimensional, then there's a basis $v_1, \dots, v_n$ of $V$. Apply Gram-Schmidt to it to get an orthonormal list of length $n$. This orthonormal list is LI and of the right length, so it's an orthonormal basis of $V$.
☐
Orthonormal list extends to orthonormal basis
Suppose $V$ is finite-dimensional. Then every orthonormal list of vectors in $V$ can be extended to an orthonormal basis of $V$.
Proof
If $e_1, \dots, e_m$ is an orthonormal list of vectors in $V$, then $e_1, \dots, e_m$ is LI. As a result, this list can be extended to a basis $e_1, \dots, e_m, v_1, \dots, v_n$ of $V$. Applying Gram-Schmidt to it, we get an orthonormal list:

$$e_1, \dots, e_m, f_1, \dots, f_n$$

Here the first $m$ vectors are unchanged since they are already orthonormal. The list above is an orthonormal basis of $V$ since it's LI and of the right length.
☐
Recall that a matrix is upper triangular if all the entries below the diagonal equal 0. From Chapter 5 - Eigenvalues, Eigenvectors, and Invariant Subspaces#^2e8389, we know when an operator has an upper-triangular matrix with respect to some basis; we would now like to know whether there exists an orthonormal basis, specifically, with respect to which we have an upper-triangular matrix.
Upper-triangular matrix with respect to orthonormal basis
Suppose $T \in \mathcal{L}(V)$. If $T$ has an upper-triangular matrix with respect to some basis of $V$, then $T$ has an upper-triangular matrix with respect to some orthonormal basis of $V$.
Proof

Suppose $T$ has an upper-triangular matrix with respect to the basis $v_1, \dots, v_n$ of $V$; equivalently, $\operatorname{span}(v_1, \dots, v_j)$ is invariant under $T$ for each $j = 1, \dots, n$. Apply the Gram-Schmidt Procedure to $v_1, \dots, v_n$, producing an orthonormal basis $e_1, \dots, e_n$ of $V$. Because:

$$\operatorname{span}(e_1, \dots, e_j) = \operatorname{span}(v_1, \dots, v_j)$$

for each $j$ via our Gram-Schmidt Procedure, we can conclude that $\operatorname{span}(e_1, \dots, e_j)$ is invariant under $T$ for each $j = 1, \dots, n$. Thus, by our invariance characterization of upper-triangular matrices, $T$ has an upper-triangular matrix with respect to the orthonormal basis $e_1, \dots, e_n$.
☐
Schur's Theorem
Suppose $V$ is a finite-dimensional complex vector space and $T \in \mathcal{L}(V)$. Then $T$ has an upper-triangular matrix with respect to some orthonormal basis of $V$.
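Numerically, this corresponds to the Schur decomposition: `scipy.linalg.schur` returns a unitary $Z$ (whose columns form an orthonormal basis) and an upper-triangular $T$ with $A = Z T Z^*$. A small sketch on an arbitrary complex matrix:

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))

# A = Z T Z^*, with T upper triangular and the columns of Z orthonormal
T, Z = schur(A, output='complex')

assert np.allclose(Z.conj().T @ Z, np.eye(4))    # Z is unitary
assert np.allclose(np.tril(T, -1), 0)            # T is upper triangular
assert np.allclose(Z @ T @ Z.conj().T, A)        # recovers A
```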