6.3 Characteristic Equations
1 Reading
Material related to this page, as well as additional exercises, can be found in ALA 8.1.
2 Learning Objectives
By the end of this page, you should know:
- how to define the characteristic equation of a square matrix $A$,
- how to solve the characteristic equation to find the eigenvalues of $A$,
- that $A$ and $A^\top$ have the same eigenvalues for square $A$.
3 The Characteristic Equation
In the previous section, we mentioned that a square matrix $A$ is singular if and only if $\det A = 0$. Combined with the fact that a scalar λ is an eigenvalue of a square matrix $A$ if and only if $A - \lambda I$ is singular, this motivates the following characterization of eigenvalues: λ is an eigenvalue of $A$ if and only if it satisfies the characteristic equation $\det(A - \lambda I) = 0$.
A key property which follows is that $A$ and $A^\top$ have the same characteristic polynomial, since $\det(A - \lambda I) = \det((A - \lambda I)^\top) = \det(A^\top - \lambda I)$ (see determinant Fact 5), and hence the same eigenvalues. We emphasize that they do not, however, have the same eigenvectors!
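As a quick numerical sanity check (the matrix below is our own illustration, not from the text), we can verify that a matrix and its transpose share eigenvalues but not eigenvectors:

```python
import numpy as np

# Our own 2x2 example: upper triangular, so the eigenvalues sit on the diagonal.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# A and A.T have the same eigenvalues.
print(np.sort(np.linalg.eigvals(A)))    # [2. 3.]
print(np.sort(np.linalg.eigvals(A.T)))  # [2. 3.]

# But not the same eigenvectors: e1 = (1, 0) is an eigenvector of A for λ = 2 ...
e1 = np.array([1.0, 0.0])
print(A @ e1)    # [2. 0.]  -> equals 2 * e1
# ... yet A.T @ e1 is not a scalar multiple of e1.
print(A.T @ e1)  # [2. 1.]
```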
3.1 Finding the Eigenvalues by Solving the Characteristic Equation
We now have all the pieces needed to find the eigenvalues of $2 \times 2$ and $3 \times 3$ matrices (which is all we will ever ask you to compute by hand, unless there is a special structure). This leads to the following method for finding all eigenvalue/eigenvector pairs of a square matrix $A$:
First, we find all solutions λ to the characteristic equation $\det(A - \lambda I) = 0$ to get the eigenvalues.
Second, for each eigenvalue λ, we find all solutions $v$ to the homogeneous system $(A - \lambda I)v = 0$ to get the eigenvectors for that eigenvalue.
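The two steps above can be sketched in code (the matrix and numbers below are our own illustration). For a $2 \times 2$ matrix, the characteristic polynomial is $\lambda^2 - \operatorname{tr}(A)\lambda + \det A$, and the null space of $A - \lambda I$ can be read off from the singular vector with singular value zero:

```python
import numpy as np

# Our own 2x2 example: det(A - λI) = λ² - 7λ + 10 = (λ - 2)(λ - 5).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Step 1: solve the characteristic equation for the eigenvalues.
coeffs = [1.0, -np.trace(A), np.linalg.det(A)]  # λ² - tr(A) λ + det(A)
eigenvalues = np.roots(coeffs)
print(np.sort(eigenvalues))  # [2. 5.]

# Step 2: for each λ, solve the homogeneous system (A - λI) v = 0.
# The null space is spanned by the right-singular vector whose singular value is ~0.
for lam in eigenvalues:
    _, _, Vt = np.linalg.svd(A - lam * np.identity(2))
    v = Vt[-1]                      # basis vector of the 1-D null space
    print(lam, v, A @ v - lam * v)  # residual A v - λ v should be ~0
```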
Let’s look at a small example:
In general, the characteristic equation cannot easily be factored, because there is no formula that exactly computes the roots of an arbitrary polynomial of degree 5 or higher. One way to get around this is to use root-finding algorithms such as Newton’s method, but these may converge slowly or suffer from numerical issues. In the next section, we’ll introduce an algorithm based on the QR decomposition which alleviates these issues.
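As a minimal sketch of the Newton’s method idea mentioned above (the polynomial and starting guess are our own; this is not the algorithm NumPy uses), we repeatedly apply the update $x \leftarrow x - p(x)/p'(x)$:

```python
import numpy as np

# Our own example polynomial: λ² - 7λ + 10 = (λ - 2)(λ - 5).
p = np.poly1d([1.0, -7.0, 10.0])
dp = p.deriv()

x = 0.0  # initial guess; Newton's method finds only one root per starting point
for _ in range(50):
    step = p(x) / dp(x)
    x -= step
    if abs(step) < 1e-12:  # stop once the update is negligible
        break

print(x)  # converges to the root λ = 2 nearest this starting guess
```

Note the limitation discussed later in this page: a single run finds a single root, so we would have to restart from different guesses to recover all the eigenvalues.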
3.2 Optional: Finding the Characteristic Equation
Here, we’ll show how to find the characteristic equation, and then solve it for the eigenvalues, in Python. We stress that this method is not used in practice, as it is slower than the QR algorithm (the most commonly used eigenvalue algorithm) and has numerical issues. Our aim here is just to demonstrate, in Python, some applications of ideas we’ve covered previously in the course.
In the previous section, we showed how to compute the characteristic equation for small ($2 \times 2$ and some $3 \times 3$) matrices. A naive way to extend this to larger matrices is through the Laplace expansion, a method for computing determinants; to get the characteristic equation, we could apply the Laplace expansion to $\det(A - \lambda I)$ and keep track of the coefficients of each power of λ. However, this takes far too long for larger matrices.
However, instead of using the Laplace expansion, we can make a clever observation. Recall the characteristic equation $\det(A - \lambda I) = 0$, where $\det(A - \lambda I)$ is an $n$-th degree polynomial in λ, i.e., $\det(A - \lambda I) = c_0 + c_1 \lambda + c_2 \lambda^2 + \cdots + c_n \lambda^n$ for some unknown coefficients $c_0, c_1, \ldots, c_n$.
Suppose we pick $n + 1$ distinct values $\lambda_0, \lambda_1, \ldots, \lambda_n$, and evaluate $d_i = \det(A - \lambda_i I)$ for each $i$. Based on our observation above, we know:

$$c_0 + c_1 \lambda_i + c_2 \lambda_i^2 + \cdots + c_n \lambda_i^n = d_i, \quad i = 0, 1, \ldots, n.$$
If we know the values $\lambda_i$ and the matching values $d_i = \det(A - \lambda_i I)$, then this is actually a linear system with $n + 1$ equations and $n + 1$ unknowns $c_0, c_1, \ldots, c_n$! In this case, the coefficient matrix, whose $(i, j)$ entry is $\lambda_i^j$, is a special square matrix known as a Vandermonde matrix, and it is nonsingular if and only if the $\lambda_i$’s are all distinct.
This method of finding the characteristic equation by solving a system of linear equations for the coefficients is an application of the more general method of Lagrange interpolation.
Python Break!¶
Below, we have Python code constructing the characteristic equation via Lagrange interpolation, and then applying a root-finding algorithm to get the eigenvalues.
import numpy as np
np.random.seed(541)
n = 10
# Generate random 10x10 matrix
A = np.random.rand(n, n)
# Generate 11 values for which to evaluate the characteristic equation
values = [i / n for i in range(n + 1)]
dets = [np.linalg.det(A - value * np.identity(n)) for value in values]
# Construct Vandermonde matrix and solve the linear system for the coefficients
coeff_matrix = np.array([[values[i]**j for j in range(n + 1)] for i in range(n + 1)])
characteristic_eqtn_coeffs = np.linalg.solve(coeff_matrix, dets)
print('Coefficients of characteristic equation (c_0, c_1, c_2, ..., c_n):')
print(characteristic_eqtn_coeffs, '\n')
# Flip the order; np.roots expects the coefficient corresponding to the highest degree to come first
characteristic_eqtn_coeffs = np.flip(characteristic_eqtn_coeffs)
print('Eigenvalues (from solving characteristic equation):')
print(np.roots(characteristic_eqtn_coeffs), '\n')
print('Eigenvalues (from np.linalg.eigvals):')
print(np.linalg.eigvals(A), '\n')
Coefficients of characteristic equation (c_0, c_1, c_2, ..., c_n):
[ 0.0418432 -0.01477132 -0.16053083 -0.22766626 0.41869727 0.14725493
1.69564907 1.29781919 -2.0174545 -4.63859515 1. ]
Eigenvalues (from solving characteristic equation):
[ 4.97742109+0.j -0.63759595+0.40955455j -0.63759595-0.40955455j
0.03608572+0.61160805j 0.03608572-0.61160805j -0.38836424+0.29676011j
-0.38836424-0.29676011j 0.74401263+0.j 0.44845519+0.13529401j
0.44845519-0.13529401j]
Eigenvalues (from np.linalg.eigvals):
[ 4.97742109+0.j -0.63759595+0.40955455j -0.63759595-0.40955455j
-0.38836424+0.29676011j -0.38836424-0.29676011j 0.03608572+0.61160805j
0.03608572-0.61160805j 0.44845519+0.13529401j 0.44845519-0.13529401j
0.74401263+0.j ]
As you can see, the eigenvalues obtained from solving the characteristic equation are the same as those from the built-in numpy.linalg.eigvals function, just in a different order.
To end, we have a few more notes on what’s going on under the hood of np.roots. Given a list of coefficients for a polynomial, np.roots computes its roots. You can see in the official documentation that np.roots actually works by finding the eigenvalues of the companion matrix, a matrix defined using the coefficients of the polynomial!
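To make this concrete, here is a sketch of that idea (the polynomial is our own example, and this hand-rolled companion matrix mirrors the documented approach rather than NumPy’s exact internal code): build the companion matrix of the polynomial and hand it to an eigenvalue solver.

```python
import numpy as np

# Our own example: λ² - 7λ + 10, with roots 2 and 5.
coeffs = np.array([1.0, -7.0, 10.0])  # highest degree first, as np.roots expects

# Companion matrix: negated lower-order coefficients (divided by the leading
# coefficient) in the first row, ones on the subdiagonal.
n = len(coeffs) - 1
C = np.zeros((n, n))
C[0, :] = -coeffs[1:] / coeffs[0]
C[1:, :-1] = np.identity(n - 1)

# The eigenvalues of the companion matrix are exactly the roots of the polynomial.
print(np.sort(np.linalg.eigvals(C)))  # [2. 5.]
print(np.sort(np.roots(coeffs)))      # [2. 5.]
```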
To convince yourself that there isn’t any circular logic here (i.e., we seem to need an eigenvalue algorithm to find roots, and a root-finding algorithm to find eigenvalues!), we note that there exist other root-finding algorithms which are not based on eigenvalue problems. For example, you may have seen Newton’s method in a previous calculus course. A main reason eigenvalue-based root-finding is preferred to iterative procedures such as Newton’s method is that eigenvalue algorithms find all roots at once, whereas iterative procedures only find a single root and must be applied multiple times (which can lead to numerical issues and slow convergence).