Most of the vector space bases that are useful in applications are assembled from the eigenvectors of a particular matrix. In this section, we focus on matrices with a “complete” set of eigenvectors and show how these eigenvectors form a basis for $\mathbb{R}^n$ (or, in the complex case, $\mathbb{C}^n$); such a basis of eigenvectors is known as an eigenbasis:
Such eigenbases allow us to rewrite the linear transformation determined by a matrix in a simple diagonal form; matrices that allow us to do this are called diagonalizable, a definition which we will formalize shortly. We focus first on matrices with real eigenvalues and eigenvectors, and will return to matrices with complex eigenvalues and eigenvectors in a few pages.
Our starting point is the following theorem, which we will state as a fact. It generalizes a pattern we saw in an earlier example, namely that eigenvectors corresponding to distinct eigenvalues are linearly independent:
However, we also saw an example where a $3 \times 3$ matrix had only two distinct eigenvalues, but still had three linearly independent eigenvectors:
Notice that in this last example $\dim V_{\lambda_1} = 2$ (why?) for the double eigenvalue $\lambda_1 = 2$ (i.e., the eigenspace corresponding to $\lambda_1$ has dimension 2), and similarly, $\dim V_{\lambda_2} = 1$ for the simple eigenvalue $\lambda_2 = 4$, so that there is a “real” eigenvector for each time an eigenvalue appears as a root of the characteristic polynomial.
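We can check these dimensions numerically. Here is a minimal sketch, using a matrix with this same eigenvalue structure (the $3 \times 3$ matrix below, which reappears in the code example at the end of this section, has the double eigenvalue $\lambda_1 = 2$ and the simple eigenvalue $\lambda_2 = 4$; whether it is the exact matrix from the example above is an assumption):

import numpy as np

# a matrix with the eigenvalue structure described above:
# lambda_1 = 2 (double) and lambda_2 = 4 (simple)
A = np.array([[2, -1, -1],
              [0,  3,  1],
              [0,  1,  3]])

# dim V_lambda is the nullity of A - lambda*I, i.e., n - rank(A - lambda*I)
for lam in (2, 4):
    print(lam, '->', 3 - np.linalg.matrix_rank(A - lam * np.eye(3)))
# prints: 2 -> 2 and 4 -> 1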
These notions can be captured in the idea of algebraic and geometric multiplicity:
Our observation is that if the algebraic and geometric multiplicities match for each eigenvalue, then we can form a basis of $\mathbb{R}^n$ from the eigenvectors.
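As a sketch of how one might test this observation numerically, here is a hypothetical helper (multiplicities is not a library routine; it groups numerically equal eigenvalues up to a tolerance):

import numpy as np

def multiplicities(A, tol=1e-8):
    # group numerically-equal eigenvalues; for each group, report the
    # algebraic multiplicity (number of repeated roots) and the geometric
    # multiplicity (the nullity of A - lambda*I)
    n = A.shape[0]
    remaining = list(np.linalg.eigvals(A))
    results = []
    while remaining:
        lam = remaining[0]
        alg = sum(abs(mu - lam) < tol for mu in remaining)
        remaining = [mu for mu in remaining if abs(mu - lam) >= tol]
        geo = n - np.linalg.matrix_rank(A - lam * np.eye(n), tol=tol)
        results.append((lam, alg, geo))
    return results

A = np.array([[2, -1, -1], [0, 3, 1], [0, 1, 3]])
print(multiplicities(A))   # for both eigenvalues, algebraic == geometric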
For the next little bit, we will assume that our matrix A satisfies the above theorem. What does this buy us? To answer this question, we need to introduce the idea of similarity transformations.
Given a vector $x \in \mathbb{R}^n$ with coordinates $x_i$ with respect to the standard basis, i.e., $x = x_1 e_1 + x_2 e_2 + \cdots + x_n e_n$, we can find the coordinates $y_1, \ldots, y_n$ of $x$ with respect to a new basis $b_1, \ldots, b_n$ by solving the following linear system:

$$By = x,$$

where $B = [\, b_1 \ b_2 \ \cdots \ b_n \,]$. Since the $b_i$ form a basis of $\mathbb{R}^n$, they are linearly independent, which means that $B$ is nonsingular.
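For instance, here is a minimal sketch of this change of coordinates with a made-up basis of $\mathbb{R}^2$:

import numpy as np

# columns of B are the (made-up) basis vectors b_1, b_2
B = np.array([[1.0, 1.0],
              [0.0, 2.0]])
x = np.array([3.0, 4.0])

y = np.linalg.solve(B, x)   # solve B y = x for the B-coordinates of x
print(y)                    # [1. 2.]
print(B @ y)                # reassembling x from its coordinates: [3. 4.]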
Now, suppose I have a matrix $A \in \mathbb{R}^{n \times n}$, which I use to define the linear transformation $f : \mathbb{R}^n \to \mathbb{R}^n$ given by $f(x) = Ax$. Here $f$'s inputs $x \in \mathbb{R}^n$ and outputs $f(x) \in \mathbb{R}^n$ are both expressed with the standard basis $e_1, \ldots, e_n$, and its matrix representative is $A$.
What if we would like to implement this linear transformation with respect to the basis $B$, that is, define a function $g : \mathbb{R}^n \to \mathbb{R}^n$ with inputs $y \in \mathbb{R}^n$ in $B$-coordinates, and outputs $g(y) \in \mathbb{R}^n$ in $B$-coordinates? To accomplish this, we need to convert both the input $x$ and the output $f(x)$ to $B$-coordinates.
Relating inputs $x$ to $B$-coordinate inputs $y$ is easy: $x = By$.
Relating outputs $f(x)$ to $B$-coordinate outputs $g(y)$ is easy too: $f(x) = Bg(y)$.

Putting these together, $Bg(y) = f(x) = Ax = ABy$, so $g(y) = B^{-1}ABy$. We conclude that if $A$ is the matrix representation of a linear transformation in the standard basis, then $B^{-1}AB$ is its matrix representation in the basis $B$.
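Here is a quick numerical check of this conclusion with randomly generated $A$ and $B$ (a random square matrix is nonsingular with probability 1, so its columns form a valid basis):

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))      # the transformation in standard coordinates
B = rng.standard_normal((3, 3))      # columns form a (generically valid) basis
x = rng.standard_normal(3)

y = np.linalg.solve(B, x)            # B-coordinates of the input x
g_of_y = np.linalg.solve(B, A @ x)   # B-coordinates of the output f(x)

# B^{-1} A B should map y to g(y)
print(np.allclose(np.linalg.inv(B) @ A @ B @ y, g_of_y))   # True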
In the above example, our change of basis didn’t really help us understand what the linear transformation $f(x)$ is doing any better than our starting point. However, we’ll see that if we use the basis defined by the eigenvectors of a matrix, some magic happens! We’ll start with an example, and then extract a general conclusion.
The above example illustrates a very important property of an eigenbasis: it diagonalizes the original matrix representative! Working with diagonal matrices is very convenient, and thus diagonalization is very useful when we can do it.
Although we only saw a $2 \times 2$ example, the idea applies to general $n \times n$ matrices, leading to the notion of diagonalizable matrices.
Let’s try to understand condition (D) a little bit more by writing it as

$$AV = VD, \qquad \text{i.e.,} \qquad Av_i = \lambda_i v_i \ \text{ for } i = 1, \ldots, n;$$

that is, the columns $v_i$ of $V$ must be eigenvectors, and the diagonal elements $\lambda_i$ must be the corresponding eigenvalues! Therefore, we immediately get the following characterization of when a matrix is diagonalizable:
Next, let’s look at some examples of diagonalizable and nondiagonalizable matrices:
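As a numerical companion to those examples, here is a standard pair of $2 \times 2$ matrices (illustrative choices, not necessarily the ones worked above), showing diagonalizability succeed and fail:

import numpy as np

# diagonalizable: two distinct eigenvalues
A1 = np.array([[2.0, 0.0],
               [0.0, 3.0]])
# nondiagonalizable ("defective"): eigenvalue 1 has algebraic multiplicity 2
# but geometric multiplicity 1
A2 = np.array([[1.0, 1.0],
               [0.0, 1.0]])

for A in (A1, A2):
    evals, V = np.linalg.eig(A)
    # A is diagonalizable exactly when we can find n independent eigenvectors,
    # i.e., when the eigenvector matrix V has full rank
    print(evals, '-> independent eigenvectors:', np.linalg.matrix_rank(V))
# prints rank 2 for A1, but only rank 1 for A2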
Here, we’ll show how to use numpy.linalg (or scipy.linalg) to diagonalize a matrix.
import numpy as np

# given a square matrix A, returns a tuple of matrices (P, D) such that A = PDP^{-1}
# (note: if A is not diagonalizable, P will be singular and no such factorization exists)
def diagonalize(A):
    # np.linalg.eig returns the eigenvalues and a matrix whose columns
    # are the corresponding eigenvectors, in matching order
    evals, evecs = np.linalg.eig(A)
    return evecs, np.diag(evals)

A = np.array([
    [2, -1, -1],
    [0,  3,  1],
    [0,  1,  3]
])

P, D = diagonalize(A)
print('P:')
print(P, '\n')
print('D:')
print(D, '\n')
print('PDP^{-1}:')
print(P @ D @ np.linalg.inv(P))
As you can see, finding a diagonalization in Python is really easy! The numpy.linalg.eig function returns the eigenvalues along with a matrix whose columns are the corresponding eigenvectors (conveniently, the eigenvalues appear in the same order as the corresponding eigenvector columns).
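As a quick illustration of why the diagonal form is so convenient, here is a sketch of one classic payoff (reusing the diagonalize function and matrix from above): powers of $A$ reduce to elementwise powers of the diagonal of $D$, since $A^k = PD^kP^{-1}$.

import numpy as np

A = np.array([[2, -1, -1],
              [0,  3,  1],
              [0,  1,  3]])
P, D = diagonalize(A)          # the function defined above

k = 5
Dk = np.diag(np.diag(D) ** k)  # D^k: just raise each diagonal entry to the k-th power
print(np.allclose(P @ Dk @ np.linalg.inv(P), np.linalg.matrix_power(A, k)))  # True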