1 Reading

Material related to this page, as well as additional exercises, can be found in ALA 4.1.
2 Learning Objectives

By the end of this page, you should know:

- orthogonal and orthonormal bases and their examples
- how to check if a basis is orthogonal or orthonormal
- how to write the coordinates of a vector in an orthogonal or orthonormal basis

3 Orthogonality

Orthogonality is a generalization/abstraction of perpendicularity (right angles) to general inner product spaces. Algorithms based on orthogonality are at the core of modern linear algebra, and include the Gram-Schmidt algorithm, the QR decomposition, and the least-squares algorithm, all of which we shall see in this lecture.
More abstract applications of orthogonality, which you will see, for example, in ESE 2240, include the Discrete Cosine Transform (DCT) and the Discrete Fourier Transform (DFT), algorithms that lie at the heart of modern digital media (e.g., JPEG image compression and MP3 audio compression).
4 Orthogonal and Orthonormal Bases

Let $V$ be an inner product space (as usual, we assume that the scalars over which $V$ is defined are real valued). Recall that $\vv v, \vv w \in V$ are orthogonal if $\langle \vv v, \vv w \rangle = 0$. If $\vv v, \vv w \in \mathbb{R}^n$ and $\langle \vv v, \vv w \rangle = \vv v \cdot \vv w$ is the dot product, this simply means that $\vv v$ and $\vv w$ are perpendicular (meet at a right angle). For example, $\vv v = \bm 1 \\ 1\em$ and $\vv w = \bm 1 \\ -1\em$ are orthogonal in $\mathbb{R}^2$, since $\vv v \cdot \vv w = 1 - 1 = 0$.
Orthogonal vectors are useful because they point in completely different directions, making them particularly well-suited for defining bases. They give rise to the concept of an orthogonal basis.
Definition 1 (Orthogonal Basis)
A basis $\vv{b_1}, \ldots, \vv{b_n}$ of an $n$-dimensional inner product space $V$ is called orthogonal if $\langle \vv{b_i}, \vv{b_j} \rangle = 0$ for all $i \neq j$. In this case, the vectors $\vv{b_i}$ are said to be mutually orthogonal, i.e., every pair of distinct vectors is orthogonal.
If each basis vector in an orthogonal basis is a unit vector (has norm equal to one), then the basis is a special type of orthogonal basis known as an orthonormal basis.
Definition 2 (Orthonormal Basis)
An orthogonal basis $\vv{b_1}, \ldots, \vv{b_n}$ of an $n$-dimensional inner product space $V$ is called orthonormal if $\|\vv{b_i}\| = 1$ for each $i$. Here, $\|\vv v\| = \sqrt{\langle \vv v, \vv v \rangle}$ is the norm induced by the inner product.
A simple way to construct an orthonormal basis from an orthogonal basis is to normalize each of its elements, that is, to replace each basis element $\vv{b_i}$ with its normalized counterpart $\frac{\vv{b_i}}{\|\vv{b_i}\|}$. As an exercise, can you formally verify that $\frac{\vv{b_1}}{\|\vv{b_1}\|}, \ldots, \frac{\vv{b_n}}{\|\vv{b_n}\|}$ is an orthonormal basis if $\vv{b_1}, \ldots, \vv{b_n}$ is an orthogonal one? Can you explain why rescaling each element does not affect the mutual orthogonality of the set?
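These definitions are easy to test numerically. Below is a minimal sketch (the helper names is_orthogonal and is_orthonormal are our own, purely illustrative) of how one might check a set of vectors stored as the columns of a NumPy matrix B: the Gram matrix of pairwise inner products B.T @ B is diagonal for an orthogonal set, and equal to the identity for an orthonormal one.

# Checking orthogonality / orthonormality (illustrative helpers)
import numpy as np

def is_orthogonal(B, tol=1e-10):
    # Gram matrix: G[i, j] = <b_i, b_j> for the columns of B
    G = B.T @ B
    off_diag = G - np.diag(np.diag(G))  # zero out the diagonal
    return bool(np.all(np.abs(off_diag) < tol))

def is_orthonormal(B, tol=1e-10):
    # Orthonormal columns satisfy B^T B = I
    return np.allclose(B.T @ B, np.eye(B.shape[1]), atol=tol)

For instance, is_orthonormal(np.eye(3)) returns True, while the orthogonal basis of Example 2 below passes is_orthogonal but fails is_orthonormal.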
Example 1 (The standard basis for $\mathbb{R}^n$)

A familiar example of an orthonormal basis for $\mathbb{R}^n$ equipped with the standard inner product is the collection of standard basis elements:

\begin{align*}
\vv{e_1} = \bm 1 \\ 0 \\ \vdots \\ 0\em, \quad \vv{e_2} = \bm 0 \\ 1 \\ \vdots \\ 0\em, \quad \ldots, \quad \vv{e_n} = \bm 0 \\ 0 \\ \vdots \\ 1\em
\end{align*}

This is known as the standard basis of $\mathbb{R}^n$.
A very useful property of a collection of mutually orthogonal vectors is that they are automatically linearly independent. In particular, if $\vv{v_1}, \ldots, \vv{v_k}$ satisfy $\langle \vv{v_i}, \vv{v_j} \rangle = 0$ for all $i \neq j$ (and $\vv{v_i} \neq \vv 0$ for all $i$), then they are linearly independent.
To see this, we take an arbitrary linear combination of the $\vv{v_i}$ and set it to $\vv 0$:
\begin{align*}
c_1 \vv{v_1} + c_2 \vv{v_2} + \ldots + c_k \vv{v_k} = \vv 0
\end{align*}

Let's take the inner product of both sides of this equation with any $\vv{v_i}$:

\begin{align*}
0 = \langle \vv 0, \vv{v_i} \rangle &= \langle c_1 \vv{v_1} + c_2 \vv{v_2} + \ldots + c_k \vv{v_k}, \vv{v_i} \rangle \\
&= c_1 \langle \vv{v_1}, \vv{v_i} \rangle + \ldots + c_i \langle \vv{v_i}, \vv{v_i} \rangle + \ldots + c_k \langle \vv{v_k}, \vv{v_i} \rangle \quad \text{(linearity of $\langle \cdot, \vv{v_i} \rangle$)} \\
&= c_i \langle \vv{v_i}, \vv{v_i} \rangle = c_i \|\vv{v_i}\|^2 \quad \text{(orthogonality)}
\end{align*}

Since $\vv{v_i} \neq \vv 0$, we have $\|\vv{v_i}\|^2 > 0$, which means $c_i = 0$. We can repeat this game with each $\vv{v_i}$ for $i = 1, \ldots, k$, to conclude that the linear combination above equals $\vv 0$ only if $c_1 = c_2 = \ldots = c_k = 0$. Hence, the mutually orthogonal collection $\vv{v_1}, \ldots, \vv{v_k}$ is linearly independent.
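As a quick numerical illustration of this fact, here is a minimal sketch (assuming NumPy; the columns are the mutually orthogonal vectors from Example 2 below): stacking mutually orthogonal nonzero vectors as the columns of a matrix always yields full column rank.

# Mutual orthogonality implies linear independence: a numerical check
import numpy as np

V = np.array([[1, 0, 5],
              [2, 1, -2],
              [-1, 2, 1]])   # mutually orthogonal columns

print(np.linalg.matrix_rank(V))   # prints 3: the columns are linearly independent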
Example 2 (Normalizing an orthogonal basis)
The vectors

\begin{align*}
\vv{b_1} = \bm 1 \\ 2 \\ -1\em, \quad \vv{b_2} = \bm 0 \\ 1 \\ 2\em, \quad \vv{b_3} = \bm 5 \\ -2 \\ 1\em
\end{align*}

are an orthogonal basis for $\mathbb{R}^3$. One easy way to check this is to confirm that $\vv{b_i} \cdot \vv{b_j} = 0$ for all $i \neq j$ (this is indeed true, as shown below). Since $\dim(\mathbb{R}^3) = 3$, and $\vv{b_1}, \vv{b_2}, \vv{b_3}$ are linearly independent, they must be a basis.
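Concretely, the three pairwise dot products are

\begin{align*}
\vv{b_1} \cdot \vv{b_2} &= (1)(0) + (2)(1) + (-1)(2) = 0, \\
\vv{b_1} \cdot \vv{b_3} &= (1)(5) + (2)(-2) + (-1)(1) = 0, \\
\vv{b_2} \cdot \vv{b_3} &= (0)(5) + (1)(-2) + (2)(1) = 0.
\end{align*}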
To turn them from an orthogonal basis into an orthonormal basis, we simply divide every vector by its length to obtain

\begin{align*}
\vv{v_1} = \frac{\vv{b_1}}{\|\vv{b_1}\|} = \frac{1}{\sqrt 6}\bm 1 \\ 2 \\ -1\em, \quad \vv{v_2} = \frac{\vv{b_2}}{\|\vv{b_2}\|} = \frac{1}{\sqrt 5}\bm 0 \\ 1 \\ 2\em, \quad \vv{v_3} = \frac{\vv{b_3}}{\|\vv{b_3}\|} = \frac{1}{\sqrt{30}}\bm 5 \\ -2 \\ 1\em
\end{align*}

This example highlights a more general principle, which is again quite useful: if $\vv{v_1}, \ldots, \vv{v_n}$ are mutually orthogonal (and nonzero), then they form a basis for their span $W = \text{span}\{\vv{v_1}, \ldots, \vv{v_n}\} \subseteq V$, which is thus a subspace with $\dim(W) = n$.
It then follows that if $\dim(V) = n$, then $\vv{v_1}, \ldots, \vv{v_n}$ are an orthogonal basis for $V$ (this is precisely the observation we used in this example).
4.1 Python break!

In the following code, we demonstrate how to normalize a set of vectors (an orthogonal basis) represented as the columns of a matrix using np.linalg.norm, so that we obtain a normalized set of vectors (an orthonormal basis).
# Normalizing
import numpy as np
b = np.array([[1, 0, 5],
              [2, 1, -2],
              [-1, 2, 1]])
print("The basis represented as a matrix: \n", b)
b_norm = np.linalg.norm(b, axis=0) # axis=0: the norm of each column, i.e., of each basis vector
b_normalized = b / b_norm # Dividing a matrix by a vector! Broadcasting divides each column by its norm
print("Normalized basis: \n", b_normalized)
print("Norm of each basis vector: \n", np.linalg.norm(b_normalized, axis=0))
The basis represented as a matrix:
[[ 1 0 5]
[ 2 1 -2]
[-1 2 1]]
Normalized basis:
[[ 0.40824829 0. 0.91287093]
[ 0.81649658 0.4472136 -0.36514837]
[-0.40824829 0.89442719 0.18257419]]
Norm of each basis vector:
[1. 1. 1.]
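A note on the axis argument: axis=0 treats each column of b as a basis vector. If your basis vectors were instead stored as rows, a small sketch of the same computation (reusing b from above) would be:

# Variant: basis vectors stored as rows instead of columns
b_rows = b.T                                   # one basis vector per row
row_norms = np.linalg.norm(b_rows, axis=1)     # norm of each row
b_rows_normalized = b_rows / row_norms[:, np.newaxis]  # divide each row by its norm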
5 Working in Orthogonal Bases

So why do we care about orthogonal (or even better, orthonormal) bases? It turns out they make a lot of the computations we've been doing so far much easier.
We’ll start with some important properties of computing a vector’s coordinates with respect to an orthogonal basis.
Theorem 1 (Coordinates and Norm in an Orthonormal Basis)
Let $\vv{u_1}, \ldots, \vv{u_n}$ be an orthonormal basis for an inner product space $V$. Then any $\vv v \in V$ is a linear combination

\begin{align*}
\vv v = c_1 \vv{u_1} + \ldots + c_n \vv{u_n}
\end{align*}

in which its coordinates are given by

\begin{align*}
c_i = \langle \vv v, \vv{u_i} \rangle, \quad i = 1, \ldots, n
\end{align*}

Moreover, its norm is given by the Pythagorean formula,

\begin{align*}
\|\vv v\|^2 = c_1^2 + \ldots + c_n^2 = \sum_{i=1}^{n} \langle \vv v, \vv{u_i} \rangle^2
\end{align*}

The trick here is to exploit that
\begin{align*}
\langle \vv{u_i}, \vv{u_j} \rangle = \begin{cases} 0 \quad\text{if $i \neq j$} \\ 1 \quad\text{if $i = j$} \end{cases}
\end{align*}

Let's compute:
\begin{align*}
\langle \vv v, \vv{u_i} \rangle &= \langle c_1 \vv{u_1} + \ldots + c_n \vv{u_n}, \vv{u_i} \rangle \\
&= c_1 \langle \vv{u_1}, \vv{u_i} \rangle + \ldots + c_i \langle \vv{u_i}, \vv{u_i} \rangle + \ldots + c_n \langle \vv{u_n}, \vv{u_i} \rangle \quad \text{(linearity of $\langle \cdot, \vv{u_i} \rangle$)} \\
&= c_i \|\vv{u_i}\|^2 \quad \text{(orthogonality)} \\
&= c_i \quad \text{($\|\vv{u_i}\| = 1$)}
\end{align*}

So we have $c_i = \langle \vv v, \vv{u_i} \rangle$.
Now to compute the norm, we again use a similar trick:

\begin{align*}
\|\vv v\|^2 = \langle \vv v, \vv v \rangle &= \left\langle \sum_{i=1}^{n} c_i \vv{u_i}, \sum_{j=1}^{n} c_j \vv{u_j} \right\rangle \\
&= \sum_{i=1}^{n} c_i \left\langle \vv{u_i}, \sum_{j=1}^{n} c_j \vv{u_j} \right\rangle \quad \text{(linearity of $\left\langle \cdot, \sum_{j=1}^{n} c_j \vv{u_j} \right\rangle$)} \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} c_i c_j \langle \vv{u_i}, \vv{u_j} \rangle \quad \text{(linearity of $\langle \vv{u_i}, \cdot \rangle$)} \\
&= \sum_{i=1}^{n} c_i^2 \|\vv{u_i}\|^2 \quad \text{(orthogonality)} \\
&= \sum_{i=1}^{n} c_i^2 \quad \text{($\|\vv{u_i}\| = 1$)}
\end{align*}

A very small change to the above allows us to extend these ideas to orthogonal, but not orthonormal, bases:
Theorem 2 (Coordinates and Norm in an Orthogonal Basis)
If $\vv{v_1}, \ldots, \vv{v_n}$ are an orthogonal basis, then $\vv v \in V$ can be written as

\begin{align*}
\vv v = a_1 \vv{v_1} + \ldots + a_n \vv{v_n} \quad \text{with $a_i = \frac{\langle \vv v, \vv{v_i} \rangle}{\|\vv{v_i}\|^2}$}
\end{align*}

and its norm is given by

\begin{align*}
\|\vv v\|^2 = a_1^2 \|\vv{v_1}\|^2 + \ldots + a_n^2 \|\vv{v_n}\|^2
\end{align*}

This is derived by applying our theorem for orthonormal bases to the rescaled vectors $\frac{\vv{v_i}}{\|\vv{v_i}\|}$, which form an orthonormal basis.
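As a final sanity check, here is a minimal numerical verification of Theorem 2 (a sketch assuming NumPy, reusing the orthogonal, not yet normalized, basis from Example 2 and the same arbitrary test vector v as before):

# Coordinates in an orthogonal (non-normalized) basis: a_i = <v, v_i> / ||v_i||^2
import numpy as np

B = np.array([[1.0, 0.0, 5.0],
              [2.0, 1.0, -2.0],
              [-1.0, 2.0, 1.0]])   # orthogonal, but not orthonormal, columns

v = np.array([3.0, 1.0, 2.0])      # arbitrary test vector

norms_sq = np.sum(B**2, axis=0)    # ||v_i||^2 for each basis vector
a = (B.T @ v) / norms_sq           # a_i = <v, v_i> / ||v_i||^2

print(np.allclose(B @ a, v))                        # True: v = a_1 v_1 + ... + a_n v_n
print(np.allclose(np.sum(a**2 * norms_sq), v @ v))  # True: the weighted Pythagorean formula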