
1.7 Matrix Inverses

How do I divide by a matrix?

Dept. of Electrical and Systems Engineering
University of Pennsylvania


1 Reading

Material related to this page, as well as additional exercises, can be found in ALA Ch. 1.5 and LAA Ch. 2.2. These notes are mostly based on ALA Ch. 1.5.

2 Learning Objectives

By the end of this page, you should know:

  • what the inverse of a matrix is
  • how to compute the inverse of a 2×2 matrix
  • important properties of the matrix inverse: existence and products
  • how to compute inverses via Gauss-Jordan elimination

3 Basic Definition

The inverse of a matrix is analogous to the reciprocal $a^{-1} = \frac{1}{a}$ of a nonzero scalar $a \neq 0$. We already encountered the inverses of matrices corresponding to elementary row operations. In this section, we will study inverses of general square matrices. We begin with the formal definition: the inverse of an $n \times n$ matrix $A$ is an $n \times n$ matrix $X$ satisfying $AX = XA = I$, where $I$ is the $n \times n$ identity matrix. If such an $X$ exists, we say that $A$ is invertible and write $X = A^{-1}$.

The inverse of a matrix is typically more useful in theory than it is in practice. In fact, a commandment of numerical linear algebra is “thou shalt not invert a matrix,” because doing so tends to cause numerical issues like the ones we described in the last section. Because of this, we will not spend too much time on computing matrix inverses: it is actually very rare that you will ever need to do this outside of a linear algebra class!
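For instance, if your end goal is to solve $A\textbf{x} = \textbf{b}$, a linear solver is the better tool. Here is a minimal sketch (the matrices are made-up example data) contrasting the two approaches:

import numpy as np

# Example data (chosen arbitrarily for illustration)
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

# Preferred: solve Ax = b directly via a matrix factorization
x_solve = np.linalg.solve(A, b)

# Discouraged: form the inverse explicitly, then multiply
x_inv = np.linalg.inv(A) @ b

print(x_solve)  # [2. 3.]
print(x_inv)    # [2. 3.] -- same answer here

Both approaches return the same answer on this small example, but np.linalg.solve avoids explicitly forming $A^{-1}$, which is both cheaper and more numerically robust for large systems.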

4 Formula for 2×2 Matrices

We want to find the inverse of the matrix $A$, which we denote by $X$:

$$A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \quad X = \begin{bmatrix} x & y \\ z & w \end{bmatrix} \quad \Rightarrow \quad AX = I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

The above matrix equation produces a set of four linear equations in the unknowns $(x, y, z, w)$:

$$AX = \begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} x & y \\ z & w \end{bmatrix} = \begin{bmatrix} ax + bz & ay + bw \\ cx + dz & cy + dw \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I,$$

which holds if and only if $(x, y, z, w)$ satisfy the linear system:

$$\begin{align*} ax + bz &= 1\\ ay + bw &= 0\\ cx + dz &= 0\\ cy + dw &= 1. \end{align*}$$

Solving by Gaussian Elimination, we find

$$x = \frac{d}{ad-bc}, \quad y = \frac{-b}{ad-bc}, \quad z = \frac{-c}{ad-bc}, \quad w = \frac{a}{ad-bc} \quad \Rightarrow \quad X = \frac{1}{ad-bc}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix},$$

provided that the common denominator $ad - bc \neq 0$. You can verify that $XA = I$ also holds, which lets us conclude that $X = A^{-1}$ is the inverse of $A$.
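If you'd like to double-check this derivation symbolically rather than by hand, here is an optional sketch using SymPy (not otherwise used in these notes) to solve the four equations for $(x, y, z, w)$:

import sympy as sp

# Symbolic entries of A and the unknown inverse X
a, b, c, d, x, y, z, w = sp.symbols('a b c d x y z w')

# The four linear equations arising from AX = I
eqs = [sp.Eq(a*x + b*z, 1), sp.Eq(a*y + b*w, 0),
       sp.Eq(c*x + d*z, 0), sp.Eq(c*y + d*w, 1)]

sol = sp.solve(eqs, [x, y, z, w])
print(sol[x], sol[y], sol[z], sol[w])
# d/(a*d - b*c) -b/(a*d - b*c) -c/(a*d - b*c) a/(a*d - b*c)

SymPy recovers exactly the formula above, including the common denominator $ad - bc$.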

4.1 Python Break!

NumPy has a built-in function for computing matrix inverses, np.linalg.inv. Let’s compare the output of that function to a matrix inverse computed with the formula we just derived. Notice that because of numerical errors, $AA^{-1} \approx I$, but the approximation error is very small (on the order of 1e-16).

import numpy as np

def my_inv(A):
    # Inverse of a 2x2 matrix via the formula derived above
    a = A[0,0]
    b = A[0,1]
    c = A[1,0]
    d = A[1,1]
    denominator = a*d - b*c  # must be nonzero for A to be invertible
    X = 1/denominator * np.array([[d, -b], [-c, a]])
    return X

A = np.array([[1,2],[-3,5]])
Ainv = my_inv(A)
Ainv_np = np.linalg.inv(A)

print(f'A=\n {A}, \n Ainv =\n {Ainv}, \n Ainv_np = \n {Ainv_np}')
print(f'AA^-1 = \n{A @ Ainv}')  # write ^-1 literally: {-1} inside an f-string would be evaluated as an expression
A=
 [[ 1  2]
 [-3  5]], 
 Ainv =
 [[ 0.45454545 -0.18181818]
 [ 0.27272727  0.09090909]], 
 Ainv_np = 
 [[ 0.45454545 -0.18181818]
 [ 0.27272727  0.09090909]]
AA^-1 = 
[[ 1.00000000e+00  0.00000000e+00]
 [-2.22044605e-16  1.00000000e+00]]

5 Some Useful Properties

One way to understand the matrix inverse is to take a dynamic view of matrix multiplication. If we think of the matrix $A$ as defining a function $f(\textbf{x})$ that maps $\textbf{x}$ to a new vector $f(\textbf{x}) = A\textbf{x}$, then we can intuitively think of the matrix inverse as “undoing” this action, just as we did for elementary operation matrices and their inverses.

For example, the elementary operation of adding twice the first row to the third row is given by

$$E = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 2 & 0 & 1 \end{bmatrix},$$

while the inverse operation is given by

$$L = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix},$$

and you can verify that $L = E^{-1}$. You can similarly verify that, for permutation matrices with exactly one row interchange,

$$P = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} = P^{-1},$$

i.e., that $P$ is its own inverse! Our observation above gives us some easy intuition for understanding why this is true, but you can also check this directly by multiplying $P$ by itself.
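As a quick numerical sanity check, here is a short sketch verifying both claims with the matrices defined above:

import numpy as np

# E adds twice row 1 to row 3; L is its claimed inverse
E = np.array([[1, 0, 0],
              [0, 1, 0],
              [2, 0, 1]])
L = np.array([[1, 0, 0],
              [0, 1, 0],
              [-2, 0, 1]])

# P swaps rows 1 and 2, and is claimed to be its own inverse
P = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 1]])

print(np.allclose(E @ L, np.eye(3)))  # True: L = E^{-1}
print(np.allclose(P @ P, np.eye(3)))  # True: P = P^{-1}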

An important fact, which we will prove later, is that a square matrix $A$ is invertible if and only if the system $A\textbf{x} = \textbf{b}$ has a unique solution for every right-hand side $\textbf{b}$. For now, think about the scalar analogy: the equation $ax = b$ has a unique solution $x = a^{-1}b$ if and only if $a \neq 0$. Similarly, $A\textbf{x} = \textbf{b}$ has a unique solution $\textbf{x} = A^{-1}\textbf{b}$ if and only if $A^{-1}$ exists.

One more useful property concerns products: if $A$ and $B$ are both invertible $n \times n$ matrices, then so is $AB$, and $(AB)^{-1} = B^{-1}A^{-1}$, with the order of the inverses reversed. To see why, note that $(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AA^{-1} = I$. The Python break below verifies this numerically.

5.1 Python Break!

# Verify that (AB)^{-1} = B^{-1} A^{-1}
A = np.array([[1, 2, 3], 
              [4, -5, 6], 
              [7, 8, 9]])

B = np.array([[1, 2, 0],
              [1, 3, 0],
              [1, -3, 1]])

A_inv = np.linalg.inv(A)
B_inv = np.linalg.inv(B)

AB_inv = np.linalg.inv(A@B)
print("\nInverse of product: \n", AB_inv, "\nProduct of inverses:\n", B_inv @ A_inv)

Inverse of product: 
 [[-2.425       0.35        0.575     ]
 [ 0.825      -0.15       -0.175     ]
 [ 5.45833333 -0.75       -1.20833333]] 
Product of inverses:
 [[-2.425       0.35        0.575     ]
 [ 0.825      -0.15       -0.175     ]
 [ 5.45833333 -0.75       -1.20833333]]

6 Gauss-Jordan Elimination

Gauss-Jordan Elimination (GJE) is the principal algorithm for computing the inverse of a nonsingular matrix.

Here is some good news: we already have all of the tools needed to solve $AX = I$ for the unknown matrix $X$. Our starting point is to recognize that the matrix equation $AX = I$ is really $n$ linear systems of the form $A\textbf{x}_i = \textbf{e}_i$ solved in parallel, where the $\textbf{x}_i$ and $\textbf{e}_i$ are the columns of the matrix $X$ and the identity matrix $I$, respectively.

We define the $n \times 1$ unit vectors $\textbf{e}_i$:

$$\textbf{e}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad \textbf{e}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad \cdots, \quad \textbf{e}_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}$$

as the vectors with exactly one entry of 1 in the $i^{th}$ position and zeros elsewhere. The vectors $\textbf{e}_i$ are the columns of the identity matrix $I_n$:

$$I_n = \begin{bmatrix} \textbf{e}_1 & \textbf{e}_2 & \cdots & \textbf{e}_n \end{bmatrix}$$

Hence, the right inverse equation can be written as

$$AX = I \iff A \begin{bmatrix} \textbf{x}_1 & \textbf{x}_2 & \cdots & \textbf{x}_n \end{bmatrix} = \begin{bmatrix} A \textbf{x}_1 & A \textbf{x}_2 & \cdots & A \textbf{x}_n \end{bmatrix} = \begin{bmatrix} \textbf{e}_1 & \textbf{e}_2 & \cdots & \textbf{e}_n \end{bmatrix} \iff A\textbf{x}_1 = \textbf{e}_1, \ A\textbf{x}_2 = \textbf{e}_2, \ \cdots, \ A\textbf{x}_n = \textbf{e}_n.$$

The above defines a set of $n$ systems of linear equations. A key feature here is that all $n$ linear systems share the same coefficient matrix $A$. We can take advantage of this to build one large augmented matrix $M$ that stacks all $n$ right-hand sides to the right of the coefficient matrix $A$:

$$M = \left[ \begin{array}{c|ccc} A & \textbf{e}_1 & \textbf{e}_2 & \cdots & \textbf{e}_n \end{array}\right] = \left[ \begin{array}{c|c} A & I \end{array}\right].$$

We can then apply our row operations (scaling and adding, swapping) to $M$ to reduce $A$ to upper triangular form

$$M = \left[ \begin{array}{c|c} A & I \end{array}\right] \to N = \left[ \begin{array}{c|c} U & C \end{array}\right],$$

which is equivalent to reducing the original $n$ linear systems to

$$U\textbf{x}_1 = \textbf{c}_1, \ U\textbf{x}_2 = \textbf{c}_2, \ \cdots, \ U\textbf{x}_n = \textbf{c}_n,$$

which we could then solve via back substitution for the columns $\textbf{x}_i$ of the matrix inverse $X$.
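This column-by-column viewpoint is easy to express in NumPy. Below is a minimal sketch that solves $A\textbf{x}_i = \textbf{e}_i$ for each unit vector (letting np.linalg.solve handle the elimination and back substitution) and stacks the solutions into $X$; the matrix $A$ is the example used in the next paragraph.

import numpy as np

A = np.array([[0., 2., 1.],
              [2., 6., 1.],
              [1., 1., 4.]])
n = A.shape[0]

# Solve A x_i = e_i for each unit vector e_i, then stack the solutions
X = np.column_stack([np.linalg.solve(A, e) for e in np.eye(n)])

print(np.allclose(A @ X, np.eye(n)))  # True: X is A^{-1}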

For example, consider the following matrix $A$, its corresponding augmented matrix $M$, and the corresponding upper triangular form:

$$A = \begin{bmatrix} 0 & 2 & 1 \\ 2 & 6 & 1 \\ 1 & 1 & 4 \end{bmatrix}, \quad M = \left[ \begin{array}{ccc|ccc} 0 & 2 & 1 & 1 & 0 & 0 \\ 2 & 6 & 1 & 0 & 1 & 0 \\ 1 & 1 & 4 & 0 & 0 & 1 \end{array}\right] \to N = \left[ \begin{array}{ccc|ccc} 2 & 6 & 1 & 0 & 1 & 0 \\ 0 & 2 & 1 & 1 & 0 & 0 \\ 0 & 0 & \frac{9}{2} & 1 & -\frac{1}{2} & 1 \end{array}\right].$$

Although we could stop here, it’s worth highlighting that a more common version of GJE continues to apply row operations to fully reduce the augmented matrix to the form $\left[ \begin{array}{c|c} I & X \end{array}\right]$, so that $X$ is the inverse of $A$.

We first note that both $U$ and $I$ have zeros below the diagonal, so this is a good start! However, in our current form $\left[ \begin{array}{c|c} U & C \end{array}\right]$, the diagonal pivots of $U$ are not 1. We need another row operation!

The scaling operation on $N$ reduces the augmented matrix to

$$N = \left[ \begin{array}{ccc|ccc} 2 & 6 & 1 & 0 & 1 & 0 \\ 0 & 2 & 1 & 1 & 0 & 0 \\ 0 & 0 & \frac{9}{2} & 1 & -\frac{1}{2} & 1 \end{array}\right] \to \left[ \begin{array}{c|c} V & B \end{array}\right] = \left[ \begin{array}{ccc|ccc} 1 & 3 & \frac{1}{2} & 0 & \frac{1}{2} & 0 \\ 0 & 1 & \frac{1}{2} & \frac{1}{2} & 0 & 0 \\ 0 & 0 & 1 & \frac{2}{9} & -\frac{1}{9} & \frac{2}{9} \end{array}\right],$$

where we divide each row by its corresponding pivot. Now, to turn $V$ into the identity, we use the same idea as in Gaussian Elimination, but zero out the entries above each pivot. In this case, we start with the $(3,3)$ pivot to zero out the $(2,3)$ and $(1,3)$ entries, and then use the $(2,2)$ pivot to zero out the $(1,2)$ entry.

$$\left[ \begin{array}{c|c} V & B \end{array}\right] = \left[ \begin{array}{ccc|ccc} 1 & 3 & \frac{1}{2} & 0 & \frac{1}{2} & 0 \\ 0 & 1 & \frac{1}{2} & \frac{1}{2} & 0 & 0 \\ 0 & 0 & 1 & \frac{2}{9} & -\frac{1}{9} & \frac{2}{9} \end{array}\right] \to \left[ \begin{array}{ccc|ccc} 1 & 0 & 0 & -\frac{23}{18} & \frac{7}{18} & \frac{2}{9} \\ 0 & 1 & 0 & \frac{7}{18} & \frac{1}{18} & -\frac{1}{9} \\ 0 & 0 & 1 & \frac{2}{9} & -\frac{1}{9} & \frac{2}{9} \end{array}\right]$$

Finally, the right-hand block of the fully reduced augmented matrix is the inverse of $A$:

$$A^{-1} = \begin{bmatrix} -\frac{23}{18} & \frac{7}{18} & \frac{2}{9} \\ \frac{7}{18} & \frac{1}{18} & -\frac{1}{9} \\ \frac{2}{9} & -\frac{1}{9} & \frac{2}{9} \end{bmatrix}$$
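To wrap up, here is a minimal sketch of the full Gauss-Jordan procedure as a NumPy function. It folds the downward elimination, pivot scaling, and upward elimination into a single loop over pivot columns, with row swaps when a pivot would be zero; this is an illustration, not production code (np.linalg.inv uses more careful factorizations).

import numpy as np

def gauss_jordan_inverse(A):
    # Invert a nonsingular matrix by reducing [A | I] to [I | A^{-1}]
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])  # augmented matrix [A | I]
    for j in range(n):
        # Swap up the row with the largest-magnitude pivot candidate
        p = j + np.argmax(np.abs(M[j:, j]))
        if np.isclose(M[p, j], 0.0):
            raise ValueError("matrix is singular")
        M[[j, p]] = M[[p, j]]
        M[j] /= M[j, j]              # scale so the pivot equals 1
        for i in range(n):           # zero out the rest of column j
            if i != j:
                M[i] -= M[i, j] * M[j]
    return M[:, n:]                  # the right block is now A^{-1}

A = np.array([[0, 2, 1],
              [2, 6, 1],
              [1, 1, 4]])
print(gauss_jordan_inverse(A))

Running this on our example reproduces the fractions above in decimal form (e.g., $-\frac{23}{18} \approx -1.2778$).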
