6.3 Characteristic Equations
1 Reading
Material related to this page, as well as additional exercises, can be found in ALA 8.1.
2 Learning Objectives
By the end of this page, you should know:
- how to define the characteristic equation of a square matrix $A$,
- how to solve the characteristic equation to find the eigenvalues of $A$,
- that $A$ and $A^\top$ have the same eigenvalues for square $A$.
3 The Characteristic Equation
In the previous section, we mentioned that a square matrix $A$ is singular if and only if $\det A = 0$. Combined with the fact that a scalar λ is an eigenvalue of a square matrix $A$ if and only if $A - \lambda I$ is singular, this motivates the following characterization of eigenvalues: λ is an eigenvalue of $A$ if and only if it satisfies the characteristic equation $\det(A - \lambda I) = 0$.
A key property which follows is that $A$ and $A^\top$ have the same characteristic polynomial, since $\det(A - \lambda I) = \det((A - \lambda I)^\top) = \det(A^\top - \lambda I)$ (see determinant Fact 5), and hence the same eigenvalues. We emphasize that they do not, however, have the same eigenvectors!
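As a quick numerical sanity check (the matrix below is our own illustration, not from the text), we can verify that a matrix and its transpose share eigenvalues but not eigenvectors:

```python
import numpy as np

# Our own 2x2 example: upper triangular, so the eigenvalues sit on the diagonal.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

# A and A.T have the same eigenvalues.
print(np.sort(np.linalg.eigvals(A)))    # [2. 3.]
print(np.sort(np.linalg.eigvals(A.T)))  # [2. 3.]

# But not the same eigenvectors: e1 = (1, 0) is an eigenvector of A for λ = 2 ...
e1 = np.array([1.0, 0.0])
print(A @ e1)    # [2. 0.]  -> equals 2 * e1
# ... yet A.T @ e1 is not a scalar multiple of e1.
print(A.T @ e1)  # [2. 1.]
```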
3.1 Finding the Eigenvalues by Solving the Characteristic Equation
We now have all the pieces needed to find the eigenvalues of $2 \times 2$ and $3 \times 3$ matrices (which is all we will ever ask you to compute by hand, unless there is a special structure). This leads to the following method for finding all eigenvalue/eigenvector pairs of a square matrix $A$:
First, we find all solutions λ to the characteristic equation $\det(A - \lambda I) = 0$ to get the eigenvalues.
Second, for each eigenvalue λ, we find all solutions $v$ to the homogeneous system $(A - \lambda I)v = 0$ to get the eigenvectors for that eigenvalue.
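The two steps above can be sketched in code (the matrix and numbers below are our own illustration). For a $2 \times 2$ matrix, the characteristic polynomial is $\lambda^2 - \operatorname{tr}(A)\lambda + \det A$, and the null space of $A - \lambda I$ can be read off from the singular vector with singular value zero:

```python
import numpy as np

# Our own 2x2 example: det(A - λI) = λ² - 7λ + 10 = (λ - 2)(λ - 5).
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Step 1: solve the characteristic equation for the eigenvalues.
coeffs = [1.0, -np.trace(A), np.linalg.det(A)]  # λ² - tr(A) λ + det(A)
eigenvalues = np.roots(coeffs)
print(np.sort(eigenvalues))  # [2. 5.]

# Step 2: for each λ, solve the homogeneous system (A - λI) v = 0.
# The null space is spanned by the right-singular vector whose singular value is ~0.
for lam in eigenvalues:
    _, _, Vt = np.linalg.svd(A - lam * np.identity(2))
    v = Vt[-1]                      # basis vector of the 1-D null space
    print(lam, v, A @ v - lam * v)  # residual A v - λ v should be ~0
```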
Let’s look at a small example:
In general, the characteristic equation cannot easily be factored, because there is no formula that exactly computes the roots of an arbitrary polynomial of degree 5 or higher. One way to get around this is to use root-finding algorithms such as Newton’s method, but these may converge slowly or suffer from numerical issues. In the next section, we’ll introduce an algorithm based on the QR decomposition which alleviates these issues.
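As a minimal sketch of the Newton’s method idea mentioned above (the polynomial and starting guess are our own; this is not the algorithm NumPy uses), we repeatedly apply the update $x \leftarrow x - p(x)/p'(x)$:

```python
import numpy as np

# Our own example polynomial: λ² - 7λ + 10 = (λ - 2)(λ - 5).
p = np.poly1d([1.0, -7.0, 10.0])
dp = p.deriv()

x = 0.0  # initial guess; Newton's method finds only one root per starting point
for _ in range(50):
    step = p(x) / dp(x)
    x -= step
    if abs(step) < 1e-12:  # stop once the update is negligible
        break

print(x)  # converges to the root λ = 2 nearest this starting guess
```

Note the limitation discussed later in this page: a single run finds a single root, so we would have to restart from different guesses to recover all the eigenvalues.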
3.2 Optional: Finding the Characteristic Equation
Here, we’ll show how to find the characteristic equation, and then solve it for the eigenvalues, in Python. We stress that this method is not used in practice, as it is slower than the QR algorithm (the most commonly used eigenvalue algorithm) and has numerical issues. Our aim here is just to demonstrate, in Python, some applications of ideas we’ve covered previously in the course.
In the previous section, we showed how to compute the characteristic equation for small ($2 \times 2$ and some $3 \times 3$) matrices. A naive way to extend this to larger matrices is through the Laplace expansion, a method for computing determinants; to get the characteristic equation, we could apply the Laplace expansion to $\det(A - \lambda I)$ and keep track of the coefficients of each power of λ. However, this takes far too long for larger matrices.
However, instead of using the Laplace expansion, we can make a clever observation. Recall the characteristic equation $\det(A - \lambda I) = 0$, where $\det(A - \lambda I)$ is an $n$-th degree polynomial in λ, i.e., $\det(A - \lambda I) = c_0 + c_1 \lambda + c_2 \lambda^2 + \cdots + c_n \lambda^n$ for some unknown coefficients $c_0, c_1, \ldots, c_n$.
Suppose we pick $n + 1$ distinct values $\lambda_0, \lambda_1, \ldots, \lambda_n$, and evaluate $d_i = \det(A - \lambda_i I)$ for each $i$. Based on our observation above, we know:

$$c_0 + c_1 \lambda_i + c_2 \lambda_i^2 + \cdots + c_n \lambda_i^n = d_i, \quad i = 0, 1, \ldots, n.$$
If we know the values $\lambda_i$ and the matching values $d_i = \det(A - \lambda_i I)$, then this is actually a linear system with $n + 1$ equations and $n + 1$ unknowns $c_0, c_1, \ldots, c_n$! In this case, the coefficient matrix, whose $(i, j)$ entry is $\lambda_i^j$, is a special square matrix known as a Vandermonde matrix, and it is nonsingular if and only if the $\lambda_i$’s are all distinct.
This method of finding the characteristic equation by solving a system of linear equations for the coefficients is an application of the more general method of Lagrange interpolation.
Python Break!¶
Below, we have Python code constructing the characteristic equation via Lagrange interpolation, and then applying a root-finding algorithm to get the eigenvalues.
import numpy as np
np.random.seed(541)
n = 10
# Generate random 10x10 matrix
A = np.random.rand(n, n)
# Generate 11 values for which to evaluate the characteristic equation
values = [i / n for i in range(n + 1)]
dets = [np.linalg.det(A - value * np.identity(n)) for value in values]
# Construct Vandermonde matrix and solve the linear system for the coefficients
coeff_matrix = np.array([[values[i]**j for j in range(n + 1)] for i in range(n + 1)])
characteristic_eqtn_coeffs = np.linalg.solve(coeff_matrix, dets)
print('Coefficients of characteristic equation (c_0, c_1, c_2, ..., c_n):')
print(characteristic_eqtn_coeffs, '\n')
# Flip the order; np.roots expects the coefficient corresponding to the highest degree to come first
characteristic_eqtn_coeffs = np.flip(characteristic_eqtn_coeffs)
print('Eigenvalues (from solving characteristic equation):')
print(np.roots(characteristic_eqtn_coeffs), '\n')
print('Eigenvalues (from np.linalg.eigvals):')
print(np.linalg.eigvals(A), '\n')
Coefficients of characteristic equation (c_0, c_1, c_2, ..., c_n):
[ 0.0418432 -0.01477132 -0.16053083 -0.22766626 0.41869727 0.14725493
1.69564907 1.29781919 -2.0174545 -4.63859515 1. ]
Eigenvalues (from solving characteristic equation):
[ 4.97742109+0.j -0.63759595+0.40955455j -0.63759595-0.40955455j
0.03608572+0.61160805j 0.03608572-0.61160805j -0.38836424+0.29676011j
-0.38836424-0.29676011j 0.74401263+0.j 0.44845519+0.13529401j
0.44845519-0.13529401j]
Eigenvalues (from np.linalg.eigvals):
[ 4.97742109+0.j -0.63759595+0.40955455j -0.63759595-0.40955455j
-0.38836424+0.29676011j -0.38836424-0.29676011j 0.03608572+0.61160805j
0.03608572-0.61160805j 0.44845519+0.13529401j 0.44845519-0.13529401j
0.74401263+0.j ]
As you can see, the eigenvalues obtained from solving the characteristic equation are the same as those from the built-in numpy.linalg.eigvals function, just in a different order.
To end, we have a few more notes on what’s going on under the hood of np.roots. Given a list of coefficients for a polynomial, np.roots computes its roots. You can see in the official documentation that np.roots actually works by finding the eigenvalues of the companion matrix, a matrix defined using the coefficients of the polynomial!
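To make this concrete, here is a sketch of that idea (the polynomial is our own example, and this hand-rolled companion matrix mirrors the documented approach rather than NumPy’s exact internal code): build the companion matrix of the polynomial and hand it to an eigenvalue solver.

```python
import numpy as np

# Our own example: λ² - 7λ + 10, with roots 2 and 5.
coeffs = np.array([1.0, -7.0, 10.0])  # highest degree first, as np.roots expects

# Companion matrix: negated lower-order coefficients (divided by the leading
# coefficient) in the first row, ones on the subdiagonal.
n = len(coeffs) - 1
C = np.zeros((n, n))
C[0, :] = -coeffs[1:] / coeffs[0]
C[1:, :-1] = np.identity(n - 1)

# The eigenvalues of the companion matrix are exactly the roots of the polynomial.
print(np.sort(np.linalg.eigvals(C)))  # [2. 5.]
print(np.sort(np.roots(coeffs)))      # [2. 5.]
```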
To convince yourself that there isn’t any circular logic here (i.e., we seem to need an eigenvalue algorithm to find roots, and a root-finding algorithm to find eigenvalues!), we note that there exist other root-finding algorithms which are not based on eigenvalue problems. For example, you may have seen Newton’s method in a previous calculus course. A main reason eigenvalue-based root-finding is preferred to iterative procedures such as Newton’s method is that eigenvalue algorithms find all roots at once, whereas iterative procedures only find a single root and must be applied multiple times (which can lead to numerical issues and slow convergence).