1.5 LU-Factorization

1Reading¶

This section is based on ALA Ch 1.3. Optional: for an alternative perspective on the LU-factorization, see Ch 5.3 of Jessy Grizzles ROB 101 notes.

2Learning Objectives¶

By the end of this page, you should know:-

what the $LU$ factorization of a matrix is
how to apply $LU$ factorization to solve systems of linear equations
how to represent row operations using elementary matrices
LU factorization using elementary matrix actions
how LU factorization relates to Gaussian Elimination (forward elimination and back substitution)

3We Like Triangular Matrices!¶

A theme we’ve exploited so far is that if we have an upper triangular system of equations, i.e., if $U\vv x = \vv c$ for $U$ and upper triangular matrix, then finding a solution $\vv x$ is easy via back substitution. Hopefully you believe that a lower triangular system of equations of the form $L\vv x = \vv c$ for $L$ a lower triangular matrix is also easy to solve via forward substitution. This section starts with the following question:

Is it easy to solve a system of linear equations of the form $LU \vv x = \vv b$ , where $L$ is a lower triangular matrix and $U$ is an upper triangular matrix?

This looks like something we might be able to handle, since we know how to solve linear systems that look like $U\vv x = \vv z$ , for $U$ upper triangular, by Back Substitution, and how to solve linear systems that look like $L\vv y = \vv c$ , for $L$ lower triangular, by Forward Substitution.^[1] Our strategy for solving $LU\vv x = \vv b$ will be to tackle the problem in two steps: first, we solve an intermediate lower triangular system by Forward Substitution, and then use the solution to that intermediate problem to define an upper triangular system, which we solve by Back Substitution! With that strategy in mind, let’s introduce a new intermediate variable $\vv z = U \vv x$ . Then we can find the solution to $LU\vv x = \vv b$ by solving

L\vv z = \vv b,

(1)

for $\vv z$ via Forward Substitution, and then solving

U \vv x = \vv z,

(2)

for $\vv x$ via Backward Substitution. If we piece everything together, we see that $\vv x$ is indeed a solution to our original equation $LU\vv x = \vv b$ because

L\underbrace{U \vv x}_{=\vv z}= \underbrace{L\vv z}_{=\vv b} = \vv b.

(3)

We summarize the above in the following algorithm for solving linear systems of the form $LU\vv x = \vv b$ , where $L$ is a lower triangular matrix, and $U$ is an upper triangular matrix:

Solve $L\vv z = \vv b$ via forward substitution.
Using $\vv z$ , solve $U\vv x = \vv z$ via back substitution.

This is exciting! We can solve systems of linear equations even if the coefficient matrix is not triangular, as long as it can be factored into a product of a lower triangular matrix and an upper triangular one. This expands our class of “easy to solve systems” pretty signifcantly. The natural next question is then of course

When can a square matrix $A$ be factored as $A=LU$ , for $L$ and $U$ lower and upper triangular matrices, respectively?

What we’ll show in this section is an amazing fact of linear algebra: any matrix $A$ that is regular can be factored as a product of a lower triangular matrix $L$ and an upper triangular matrix $U$ such that $A=LU$ . This matrix factorization, the first (but not the last!) we will see in this class, is called the $LU$ factorization of a matrix (a descriptive, if not particularly creative, name).

4Elementary Matrices¶

Our starting point is the following crucial idea that underpins much of linear algebra.

For example, our strategy in previous sections of adding equations together is in fact an operation that can be realized by matrix multiplication. Our starting point will be to introduce a the idea of an elementary matrix, which encodes the corresponding elementary row operation. So far we’ve only seen the elementary operation of adding equations together, which we recall, is equivalent to adding rows of the augmented matrix $M = [A \, | \, \vv b]$ together, but we’ll see other types of operations shortly. The definition below will work for all of them.

This definition is a bit abstract, so let’s see it in action. Suppose that we have a system of three equations in three unknowns, encoded by the augmented matrix $M = [ A\, | \, \vv b]$ , where $A$ is a $3\times 3$ matrix and $\vv b$ is a $3 \times 1$ vector. What is the elementary matrix associated with subtracting twice the first row from the second row? If we start with the identify matrix

I_3 = \bm 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \em,

(4)

and apply this operation to it, we end up with the elementary matrix

E_1 = \bm 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1\em,

(5)

which we named $E_1$ just to keep track of which elementary matrix we are talking about later on. Let’s check if it does what it’s supposed to on a two different $3 \times 3$ matrices

\underbrace{\bm 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1\em}_{E_1}\bm 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \em = \bm 1 & 2 & 3 \\ 2 & 1 & 0 \\ 7 & 8 & 9 \em, \quad \underbrace{\bm 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1\em}_{E_1}\bm 1 & 2 & 3 \\ 1 & 2 & 3 \\ 1 & 2 & 3 \em =\bm 1 & 2 & 3 \\ -1 & -2 & -3 \\ 1 & 2 & 3 \em .

(6)

Indeed it does, and by playing around with our rules of matrix arithmetic and multiplication, you should be able to convince yourself that that left multiplying any 3-row matrix by $E_1$ will subtract twice its first row from its second row.

We can also use Definition 1 to reverse engineer what row operation an elementary matrix is encoding.

Exercise 1 (Reverse Engineering Elementary Matrices)

What elementary row operations are realized if we left multiply by the following matrices?

E_2 = \bm 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \em, \quad E_3 = \bm 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & \frac{1}{2} & 1 \em.

(7)

Solution to Exercise 1

Left multiplying by $E_2$ realizes subtracting the first row from the last row of a matrix. This is true because to get the last row of $E_2$ , $\bm -1 & 0 & 1\em$ from $I_3$ , we need to subract the first row $\bm 1 & 0 & 0 \em$ of $I_3$ from the last row $\bm 0 & 0 & 1 \em$ of $I_3$ .

Using similar reasoning, we see that left multiplying by $E_3$ realizes adding $1/2$ the second row of a matrix to its last row.

Python break!¶

import numpy as np
# Define the elementary matrices above

E_1 = np.array([[1., 0., 0.], [-2., 1., 0.],[0., 0., 1.]])
E_2 = np.array([[1., 0., 0.], [0., 1., 0.],[-1., 0., 1.]])
E_3 = np.array([[1., 0., 0.], [0., 1., 0.],[0., 0.5, 1.]])

# Apply them to a 3 x 4 matrix and confirm they do what they're supposed to
A = np.array([[1., 2., 3., 4.],[2., 4., 6., 8.],[0.5, 1., 1.5, 2.]])

# E1: row 2 - 2 x row 1
print(f'A = {A},\n E_1A = {E_1 @ A}')

# E2: row 3 - row 1
print(f'E_2A = {E_2 @ A}')

# E3: row 3 + .5 x row 2
print(f'E_3A = {E_3 @ A}')

A = [[1.  2.  3.  4. ]
 [2.  4.  6.  8. ]
 [0.5 1.  1.5 2. ]],
 E_1A = [[1.  2.  3.  4. ]
 [0.  0.  0.  0. ]
 [0.5 1.  1.5 2. ]]
E_2A = [[ 1.   2.   3.   4. ]
 [ 2.   4.   6.   8. ]
 [-0.5 -1.  -1.5 -2. ]]
E_3A = [[1.  2.  3.  4. ]
 [2.  4.  6.  8. ]
 [1.5 3.  4.5 6. ]]

4.1Elementary Matrices in Action¶

Let us use elementary matrices to design a matrix that can help us solve linear equations. Let’s revisit the very first system of equations we encountered:

\begin{align*} x_1+2x_2+x_3 & = 2,\\ 2x_1+6x_2+x_3 & =7,\\ x_1+x_2+4x_3 & =3, \end{align*}

(9)

or in matrix-vector form

\underbrace{\begin{bmatrix} 1 & 2 & 1 \\ 2 & 6 & 1 \\ 1 & 1 & 4 \end{bmatrix}}_A\underbrace{\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}}_{\vv x} = \underbrace{\begin{bmatrix} 2 \\ 7 \\ 3 \end{bmatrix}}_{\vv b}

(10)

which we solved by first applying the row operation encoded by $E_1$ (subtracted twice equation 1 from equation 2), then applying the operation encoded by $E_2$ (subtract the first equation from the last equation ), and finally by applying the operation encoded by $E_3$ (add 1/2 first equation to last equation). Funny how that conveninently worked out! Therefore, if we keep careful track of the row operations and their matrix multiplication realizations, we conclude that:

A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 6 & 1 \\ 1 & 1 & 4 \end{bmatrix}, \quad E_3 E_2 E_1 A = U = \bm 1 & 2 & 1 \\ 0 & 2 & -1 \\ 0 & 0 & \frac{5}{2} \em.

(11)

The way to read the expression $E_3 E_2 E_1 A$ is from right-to-left: we start with $A$ , then apply $E_1$ , then $E_2$ , and finally $E_3$ . If we give the composition of elementary operations a name, say $E = E_3 E_2 E_1 A$ , then we can check that $E [A \, | \, \vv b] = [U \, | \, \vv c]$ , that is to say left multiplying the augmented matrix $M = [A \, | \, \vv b]$ by the matrix $E$ performs Gaussian Elmination for us! This is just a first taste of a powerful feature of linear algebra: we can encode very sophisticated and complex operations via matrix multiplication.

Python break!¶

Let’s implement the above example and confirm that we can indeed cook up a matrix $E$ such that left multiplying $A$ by $E$ gives $EA = U$ an upper triangular matrix.

# Define A and b
A = np.array([[1., 2., 1.],[2., 6., 1.],[1., 1., 4.]])
b = np.array([2., 7., 3.])

# Build M = [A|b]
M = np.hstack((A,b.reshape((3,1))))

# Build E
E = E_3 @ E_2 @ E_1

# Compute N = [U|c] = E [A|b] = EM
N = E @ M

# Extract out U and c
U = N[:,0:3]
c = N[:,-1].flatten()

print(f'A = {A}, \n b = {b}, \n U = {U}, \n c = {c}')

A = [[1. 2. 1.]
 [2. 6. 1.]
 [1. 1. 4.]], 
 b = [2. 7. 3.], 
 U = [[ 1.   2.   1. ]
 [ 0.   2.  -1. ]
 [ 0.   0.   2.5]], 
 c = [2.  3.  2.5]

4.2Inverse Elementary Matrices¶

Our work so far shows that we can find a matrix $E$ such that $EA = U$ , for $U$ upper triangular. This is progress, but we still haven’t reached our goal of decomposing $A$ into a product $LU$ of a lower and an upper triangular matrix. Our approach will be to look at the matrix equation $EA=U$ and find a matrix $L$ that “undoes” the action of left-multiplyting by $E$ .

Let’s start by focussing on how to undo the action of an elementary matrix using its corresponding inverse elementary matrix. Since elmentary matrix multiplication implements simple row operations, we can intuitively figure out how to undo them. For example, to undo the action of $E_1$ (subtract twice equation 1 from equation 2), we add twice equation 1 to equation 2. The corresponding matrix is

L_1 = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}

(12)

We note that $L_1E_1 = I$ , since, $L_1$ reverses the action done by $E_1$ and the end result $I$ does not do any action. Similarly, we can define appropriate inverses

L_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}, \ L_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -\frac{1}{2} & 1 \end{bmatrix}

(13)

We have $U = E_3E_2E_1A$ . If we reverse the actions using $L_1L_2L_3$ , we have

L_1L_2L_3U = L_1L_2L_3E_3E_2E_1A = L_1L_2E_2E_1A = L_1E_1A,

(14)

where we used the fact that $L_2E_2 = L_1E_1 = I$ . Hence, calling $L = L_1L_2L_3$ , we showed that $A = LU$ . The matrix $U$ is upper triangular, butis $L$ lower triangular? The answer is yes due to the following fundamental property of triangular matrices.

Since $L_1, L_2, L_3$ are all lower triangular, so is their product $L$ . Thus we have computed the LU facotrization of $A$ using elementary matrix operations.

Python break!¶

Let’s double check all of the above in code. We’ll recycle the $E_i$ , $U$ , and $A$ from above.

# Define the inverse elementary matrices
L_1 = np.array([[1., 0., 0.], [2., 1., 0.],[0., 0., 1.]])
L_2 = np.array([[1., 0., 0.], [0., 1., 0.],[1., 0., 1.]])
L_3 = np.array([[1., 0., 0.], [0., 1., 0.],[0., -0.5, 1.]])

# Confirm that they "undo" their corresponding elementary matrices' actions
print(f'These should all be the 3x3 identity matrix:\n {L_1 @ E_1},\n {L_2 @ E_2},\n {L_3 @ E_3}')

# Confirm that L = L_3 L_2 L_1 "undoes" the action of E
L = L_1 @ L_2 @ L_3
print(f'L = {L},\n E={E}')
print(f'So should this: \n {L @ E}')

# Finally, let's check that A = LU
print(f'This should be all zeros: \n {A - L@U}')

These should all be the 3x3 identity matrix:
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]],
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]],
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
L = [[ 1.   0.   0. ]
 [ 2.   1.   0. ]
 [ 1.  -0.5  1. ]],
 E=[[ 1.   0.   0. ]
 [-2.   1.   0. ]
 [-2.   0.5  1. ]]
So should this: 
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
This should be all zeros: 
 [[ 0.   0.   0. ]
 [-3.  -9.  -1. ]
 [-1.  -5.   2.5]]

4.3Regular Matrix¶

Using the $LU$ factorization explained in this page, we can finally give a formal definition of a regular matrix.

Python break!¶

Below is a simple NumPy implementation of the $LU$ -factorization algorithm that we described above. Notice that it looks very similar to our code for Gaussian elimination, but that we are keeping track of the changes we made to reduce $A$ to $U$ within the matrix $L$ . Specifically, when we subtract scalar times row j from row i in the line U[i, :] = U[i, :] - scalar*U[j, :] to perform forward elimination, we adjust the corresponding (i,j) entry of $L$ to undo this action by setting L[i, j] = scalar. You should be able to convince yourself that building $L$ this way gives the same answer as computing the product $L_k L_{k-1}\cdots L_2 L_1$ of elementary inverse matrix operations.

def lu(A):
    
    #Get the number of rows
    n = A.shape[0]
    
    U = A.copy()
    L = np.eye(n, dtype=np.double)
    
    #Loop over rows
    for j in range(n):
        if U[j, j] == 0:
            print("A is not regular")
            break
        else:
            for i in range(j+1, n):
                scalar = U[i, j]/U[j, j]
                U[i, :] = U[i, :] - scalar*U[j, :]
                L[i, j] = scalar
    return L, U

Now let’s test this out on our favorite linear system!

## applying LU factoriation to solve linear equations

A = np.array([[2., 6., 1.],
              [1., 1., 4.],
              [1., 2., 1.]])
b = np.array([7., 3., 2.])
L, U = lu(A) # LU decomposition using scipy package

print("\nL: \n", L, "\nU: \n", U)

n = A.shape[1]

# forward substitution
z = np.zeros((n,))
z[0] = b[0]/L[0, 0]
for i in range(1, n):
    z[i] = (b[i] - np.sum(L[i, :i]*z[:i]))/L[i, i]

# back substitution
x = np.zeros((n,))
x[-1] = z[-1]/U[-1, -1]
for i in range(n-2, -1, -1):
    x[i] = (z[i] - np.sum(U[i, i+1:]*x[i+1:]))/U[i, i]

print("\nSolution x: \n", x)


L: 
 [[1.  0.  0. ]
 [0.5 1.  0. ]
 [0.5 0.5 1. ]] 
U: 
 [[ 2.    6.    1.  ]
 [ 0.   -2.    3.5 ]
 [ 0.    0.   -1.25]]

Solution x: 
 [-3.  2.  1.]

5Are We Done?¶

Not quite! We’ve assumed throughout that our matrix $A$ is regular, which we now somewhat circularly defined as being the class of matrices for which the above $LU$ -facotrization works. But there are very simple (and easily solvable!) systems of equations for which $A$ is not regular. For example, if $A$ is lower triangular, the above algorithm fails! We’ll see a very simple fix next that let’s us handle a much more general class of systems with just a small tweak.

6Optional: $LU$ -factorization is Gaussian Elimination!¶

We have $A = LU$ . Suppose we have applied Gaussian Elimination to get our system into $U\vv x = \vv c$ . There is a matrix $M$ such that $MA= U$ and $M\vv b=\vv c$ . So we have

MA\vv x = MLU\vv x = M\vv b

(15)

But if $MA = U$ then we must have that $ML= I$ .

Now, if we go back to solving our system using $LU$ -factorization, we have that $LU\vv x = \vv b$ which we can turn into $L\vv z = \vv b$ and $U\vv x = \vv z$ . We saw that we can solve that $L\vv z = b$ by forward substitution, but notice that we also have that $ML\vv z = \vv z = M \vv b$ , i.e., the matrix $M$ that reduces our original system of equations to upper triangular form is actually performing this forward substitution.

1Reading¶

2Learning Objectives¶

3We Like Triangular Matrices!¶

4Elementary Matrices¶

Python break!¶

4.1Elementary Matrices in Action¶

Python break!¶

4.2Inverse Elementary Matrices¶

Python break!¶

4.3Regular Matrix¶

Python break!¶

5Are We Done?¶

6Optional: LULULU-factorization is Gaussian Elimination!¶

6Optional: $LU$ -factorization is Gaussian Elimination!¶