
1.5 LU-Factorization

A new matrix factorization

Dept. of Electrical and Systems Engineering
University of Pennsylvania


Lecture notes

1 Reading

This section is based on ALA Ch 1.3. Optional: for an alternative perspective on the LU-factorization, see Ch 5.3 of Jessy Grizzle's ROB 101 notes.

2 Learning Objectives

By the end of this page, you should know:

  • what the $LU$ factorization of a matrix is
  • how to apply $LU$ factorization to solve systems of linear equations
  • how to represent row operations using elementary matrices
  • LU factorization using elementary matrix actions
  • how LU factorization relates to Gaussian Elimination (forward elimination and back substitution)

3 We Like Triangular Matrices!

A theme we’ve exploited so far is that if we have an upper triangular system of equations, i.e., if $U\vv x = \vv c$ for $U$ an upper triangular matrix, then finding a solution $\vv x$ is easy via back substitution. Hopefully you believe that a lower triangular system of equations of the form $L\vv x = \vv c$, for $L$ a lower triangular matrix, is also easy to solve via forward substitution. This section starts with the following question:

Is it easy to solve a system of linear equations of the form $LU\vv x = \vv b$, where $L$ is a lower triangular matrix and $U$ is an upper triangular matrix?

This looks like something we might be able to handle, since we know how to solve linear systems that look like $U\vv x = \vv z$, for $U$ upper triangular, by Back Substitution, and how to solve linear systems that look like $L\vv y = \vv c$, for $L$ lower triangular, by Forward Substitution.[1] Our strategy for solving $LU\vv x = \vv b$ will be to tackle the problem in two steps: first, we solve an intermediate lower triangular system by Forward Substitution, and then use the solution to that intermediate problem to define an upper triangular system, which we solve by Back Substitution! With that strategy in mind, let’s introduce a new intermediate variable $\vv z = U \vv x$. Then we can find the solution to $LU\vv x = \vv b$ by solving

$$
L\vv z = \vv b,
$$

for $\vv z$ via Forward Substitution, and then solving

$$
U \vv x = \vv z,
$$

for $\vv x$ via Back Substitution. If we piece everything together, we see that $\vv x$ is indeed a solution to our original equation $LU\vv x = \vv b$ because

$$
L\underbrace{U \vv x}_{=\vv z} = \underbrace{L\vv z}_{=\vv b} = \vv b.
$$

We summarize the above in the following algorithm for solving linear systems of the form $LU\vv x = \vv b$, where $L$ is a lower triangular matrix and $U$ is an upper triangular matrix:

  1. Solve $L\vv z = \vv b$ via forward substitution.
  2. Using $\vv z$, solve $U\vv x = \vv z$ via back substitution (see the code sketch below).
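
To make the two-step recipe concrete, here is a minimal code sketch (not part of the original notes) that assumes SciPy is available and uses scipy.linalg.solve_triangular for the two triangular solves. The particular $L$, $U$, and $\vv b$ below are the factors and right-hand side that will reappear in the worked example later on this page.

import numpy as np
from scipy.linalg import solve_triangular

# Lower and upper triangular factors and right-hand side (from the example later on this page)
L = np.array([[1., 0., 0.],
              [2., 1., 0.],
              [1., -0.5, 1.]])
U = np.array([[1., 2., 1.],
              [0., 2., -1.],
              [0., 0., 2.5]])
b = np.array([2., 7., 3.])

# Step 1: solve L z = b by forward substitution
z = solve_triangular(L, b, lower=True)

# Step 2: solve U x = z by back substitution
x = solve_triangular(U, z, lower=False)

print(x)                          # the solution to (LU) x = b
print(np.allclose(L @ U @ x, b))  # sanity check: should print True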

This is exciting! We can solve systems of linear equations even if the coefficient matrix is not triangular, as long as it can be factored into a product of a lower triangular matrix and an upper triangular one. This expands our class of “easy to solve systems” pretty significantly. The natural next question is then of course

When can a square matrix $A$ be factored as $A=LU$, for $L$ and $U$ lower and upper triangular matrices, respectively?

What we’ll show in this section is an amazing fact of linear algebra: any matrix $A$ that is regular can be factored as a product of a lower triangular matrix $L$ and an upper triangular matrix $U$ such that $A=LU$. This matrix factorization, the first (but not the last!) we will see in this class, is called the $LU$ factorization of a matrix (a descriptive, if not particularly creative, name).

4 Elementary Matrices

Our starting point is a crucial idea that underpins much of linear algebra: operations on the rows of a matrix, or equivalently on the equations of a linear system, can themselves be realized by matrix multiplication.

For example, our strategy in previous sections of adding equations together is in fact an operation that can be realized by matrix multiplication. We begin by introducing the idea of an elementary matrix, which encodes the corresponding elementary row operation. So far we’ve only seen the elementary operation of adding equations together, which, we recall, is equivalent to adding rows of the augmented matrix $M = [A \, | \, \vv b]$ together, but we’ll see other types of operations shortly. The definition below will work for all of them.

Definition 1 (elementary matrix): The elementary matrix $E$ associated with an elementary row operation is the matrix obtained by applying that row operation to the identity matrix $I$ of the appropriate size.

This definition is a bit abstract, so let’s see it in action. Suppose that we have a system of three equations in three unknowns, encoded by the augmented matrix $M = [A \, | \, \vv b]$, where $A$ is a $3\times 3$ matrix and $\vv b$ is a $3 \times 1$ vector. What is the elementary matrix associated with subtracting twice the first row from the second row? If we start with the identity matrix

$$
I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},
$$

and apply this operation to it, we end up with the elementary matrix

$$
E_1 = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix},
$$

which we named $E_1$ just to keep track of which elementary matrix we are talking about later on. Let’s check that it does what it’s supposed to on two different $3 \times 3$ matrices:

$$
\underbrace{\begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}}_{E_1}\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 1 & 0 \\ 7 & 8 & 9 \end{bmatrix}, \quad \underbrace{\begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}}_{E_1}\begin{bmatrix} 1 & 2 & 3 \\ 1 & 2 & 3 \\ 1 & 2 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 \\ -1 & -2 & -3 \\ 1 & 2 & 3 \end{bmatrix}.
$$

Indeed it does, and by playing around with our rules of matrix arithmetic and multiplication, you should be able to convince yourself that left multiplying any 3-row matrix by $E_1$ will subtract twice its first row from its second row.

We can also use Definition 1 to reverse engineer what row operation an elementary matrix is encoding.
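
As a concrete illustration of Definition 1, here is a small sketch (not part of the original notebook) that builds an elementary matrix by starting from the identity and applying the row operation to it, and then reads the operation back off the result:

import numpy as np

# Start from the 3 x 3 identity and apply "row 2 <- row 2 - 2 * row 1" to it
E = np.eye(3)
E[1, :] = E[1, :] - 2 * E[0, :]
print(E)   # [[1, 0, 0], [-2, 1, 0], [0, 0, 1]], which is exactly E_1 from above

# Reverse engineering: the single off-diagonal entry E[1, 0] = -2 tells us the
# encoded operation is "subtract twice row 1 from row 2".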

Python break!

import numpy as np
# Define the elementary matrices above

E_1 = np.array([[1., 0., 0.], [-2., 1., 0.],[0., 0., 1.]])
E_2 = np.array([[1., 0., 0.], [0., 1., 0.],[-1., 0., 1.]])
E_3 = np.array([[1., 0., 0.], [0., 1., 0.],[0., 0.5, 1.]])

# Apply them to a 3 x 4 matrix and confirm they do what they're supposed to
A = np.array([[1., 2., 3., 4.],[2., 4., 6., 8.],[0.5, 1., 1.5, 2.]])

# E1: row 2 - 2 x row 1
print(f'A = {A},\n E_1A = {E_1 @ A}')

# E2: row 3 - row 1
print(f'E_2A = {E_2 @ A}')

# E3: row 3 + .5 x row 2
print(f'E_3A = {E_3 @ A}')
A = [[1.  2.  3.  4. ]
 [2.  4.  6.  8. ]
 [0.5 1.  1.5 2. ]],
 E_1A = [[1.  2.  3.  4. ]
 [0.  0.  0.  0. ]
 [0.5 1.  1.5 2. ]]
E_2A = [[ 1.   2.   3.   4. ]
 [ 2.   4.   6.   8. ]
 [-0.5 -1.  -1.5 -2. ]]
E_3A = [[1.  2.  3.  4. ]
 [2.  4.  6.  8. ]
 [1.5 3.  4.5 6. ]]

4.1 Elementary Matrices in Action

Let us use elementary matrices to design a matrix that can help us solve linear equations. Let’s revisit the very first system of equations we encountered:

$$
\begin{aligned} x_1+2x_2+x_3 &= 2,\\ 2x_1+6x_2+x_3 &= 7,\\ x_1+x_2+4x_3 &= 3, \end{aligned}
$$

or in matrix-vector form

$$
\underbrace{\begin{bmatrix} 1 & 2 & 1 \\ 2 & 6 & 1 \\ 1 & 1 & 4 \end{bmatrix}}_A \underbrace{\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}}_{\vv x} = \underbrace{\begin{bmatrix} 2 \\ 7 \\ 3 \end{bmatrix}}_{\vv b},
$$

which we solved by first applying the row operation encoded by $E_1$ (subtract twice equation 1 from equation 2), then applying the operation encoded by $E_2$ (subtract equation 1 from equation 3), and finally applying the operation encoded by $E_3$ (add one half of equation 2 to equation 3). Funny how that conveniently worked out! Therefore, if we keep careful track of the row operations and their matrix multiplication realizations, we conclude that:

$$
A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 6 & 1 \\ 1 & 1 & 4 \end{bmatrix}, \quad E_3 E_2 E_1 A = U = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & -1 \\ 0 & 0 & \frac{5}{2} \end{bmatrix}.
$$

The way to read the expression $E_3 E_2 E_1 A$ is from right to left: we start with $A$, then apply $E_1$, then $E_2$, and finally $E_3$. If we give the composition of elementary operations a name, say $E = E_3 E_2 E_1$, then we can check that $E [A \, | \, \vv b] = [U \, | \, \vv c]$, that is to say, left multiplying the augmented matrix $M = [A \, | \, \vv b]$ by the matrix $E$ performs Gaussian Elimination for us! This is just a first taste of a powerful feature of linear algebra: we can encode very sophisticated and complex operations via matrix multiplication.

Python break!

Let’s implement the above example and confirm that we can indeed cook up a matrix $E$ such that left multiplying $A$ by $E$ gives an upper triangular matrix $EA = U$.

# Define A and b
A = np.array([[1., 2., 1.],[2., 6., 1.],[1., 1., 4.]])
b = np.array([2., 7., 3.])

# Build M = [A|b]
M = np.hstack((A,b.reshape((3,1))))

# Build E
E = E_3 @ E_2 @ E_1

# Compute N = [U|c] = E [A|b] = EM
N = E @ M

# Extract out U and c
U = N[:,0:3]
c = N[:,-1].flatten()

print(f'A = {A}, \n b = {b}, \n U = {U}, \n c = {c}')
A = [[1. 2. 1.]
 [2. 6. 1.]
 [1. 1. 4.]], 
 b = [2. 7. 3.], 
 U = [[ 1.   2.   1. ]
 [ 0.   2.  -1. ]
 [ 0.   0.   2.5]], 
 c = [2.  3.  2.5]

4.2 Inverse Elementary Matrices

Our work so far shows that we can find a matrix $E$ such that $EA = U$, for $U$ upper triangular. This is progress, but we still haven’t reached our goal of decomposing $A$ into a product $LU$ of a lower and an upper triangular matrix. Our approach will be to look at the matrix equation $EA = U$ and find a matrix $L$ that “undoes” the action of left multiplying by $E$.

Let’s start by focusing on how to undo the action of an elementary matrix using its corresponding inverse elementary matrix. Since elementary matrix multiplication implements simple row operations, we can intuitively figure out how to undo them. For example, to undo the action of $E_1$ (subtract twice equation 1 from equation 2), we add twice equation 1 to equation 2. The corresponding matrix is

$$
L_1 = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
$$

We note that $L_1E_1 = I$, since $L_1$ reverses the action of $E_1$, and the end result $I$ performs no action at all. Similarly, we can define the appropriate inverses

$$
L_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}, \quad L_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -\frac{1}{2} & 1 \end{bmatrix}.
$$

We have $U = E_3E_2E_1A$. If we reverse the actions using $L_1L_2L_3$, we have

$$
L_1L_2L_3U = L_1L_2L_3E_3E_2E_1A = L_1L_2E_2E_1A = L_1E_1A = A,
$$

where we used the fact that $L_3E_3 = L_2E_2 = L_1E_1 = I$. Hence, calling $L = L_1L_2L_3$, we have shown that $A = LU$. The matrix $U$ is upper triangular, but is $L$ lower triangular? The answer is yes, due to the following fundamental property of triangular matrices.

The product of lower triangular matrices is itself lower triangular. Since $L_1, L_2, L_3$ are all lower triangular, so is their product $L$. Thus we have computed the $LU$ factorization of $A$ using elementary matrix operations.
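
Here is a quick numerical sanity check of this closure property, a small sketch (not part of the original notes) using randomly generated lower triangular matrices:

import numpy as np

# Two random 4 x 4 lower triangular matrices
rng = np.random.default_rng(0)
L_a = np.tril(rng.random((4, 4)))
L_b = np.tril(rng.random((4, 4)))

# Their product should again be lower triangular (everything above the diagonal is zero)
product = L_a @ L_b
print(np.allclose(product, np.tril(product)))   # True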

Python break!

Let’s double check all of the above in code. We’ll recycle the $E_i$, $U$, and $A$ from above.

# Define the inverse elementary matrices
L_1 = np.array([[1., 0., 0.], [2., 1., 0.],[0., 0., 1.]])
L_2 = np.array([[1., 0., 0.], [0., 1., 0.],[1., 0., 1.]])
L_3 = np.array([[1., 0., 0.], [0., 1., 0.],[0., -0.5, 1.]])

# Confirm that they "undo" their corresponding elementary matrices' actions
print(f'These should all be the 3x3 identity matrix:\n {L_1 @ E_1},\n {L_2 @ E_2},\n {L_3 @ E_3}')

# Confirm that L = L_1 L_2 L_3 "undoes" the action of E
L = L_1 @ L_2 @ L_3
print(f'L = {L},\n E={E}')
print(f'So should this: \n {L @ E}')

# Finally, let's check that A = LU
print(f'This should be all zeros: \n {A - L@U}')
These should all be the 3x3 identity matrix:
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]],
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]],
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
L = [[ 1.   0.   0. ]
 [ 2.   1.   0. ]
 [ 1.  -0.5  1. ]],
 E=[[ 1.   0.   0. ]
 [-2.   1.   0. ]
 [-2.   0.5  1. ]]
So should this: 
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
This should be all zeros: 
 [[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]

4.3 Regular Matrix

Using the $LU$ factorization explained in this page, we can finally give a formal definition of a regular matrix: a square matrix $A$ is regular if it can be factored as $A = LU$, with $L$ a lower triangular matrix whose diagonal entries are all ones[2] and $U$ an upper triangular matrix whose diagonal entries (the pivots) are all nonzero. Equivalently, $A$ is regular if Gaussian Elimination can reduce it to such a $U$ using only the row operation of adding a multiple of one row to another, without ever encountering a zero pivot.

Python break!

Below is a simple NumPy implementation of the $LU$-factorization algorithm that we described above. Notice that it looks very similar to our code for Gaussian elimination, but that we are keeping track of the changes we made to reduce $A$ to $U$ within the matrix $L$. Specifically, when we subtract scalar times row j from row i in the line U[i, :] = U[i, :] - scalar*U[j, :] to perform forward elimination, we adjust the corresponding (i, j) entry of $L$ to undo this action by setting L[i, j] = scalar. You should be able to convince yourself that building $L$ this way gives the same answer as computing the product $L_1 L_2 \cdots L_{k-1} L_k$ of elementary inverse matrix operations.

def lu(A):
    
    #Get the number of rows
    n = A.shape[0]
    
    U = A.copy()
    L = np.eye(n, dtype=np.double)
    
    #Loop over rows
    for j in range(n):
        if U[j, j] == 0:
            print("A is not regular")
            break
        else:
            for i in range(j+1, n):
                scalar = U[i, j]/U[j, j]
                U[i, :] = U[i, :] - scalar*U[j, :]
                L[i, j] = scalar
    return L, U

Now let’s test this out on our favorite linear system!

## applying LU factorization to solve linear equations

A = np.array([[2., 6., 1.],
              [1., 1., 4.],
              [1., 2., 1.]])
b = np.array([7., 3., 2.])
L, U = lu(A) # LU factorization using the lu function defined above

print("\nL: \n", L, "\nU: \n", U)

n = A.shape[1]

# forward substitution
z = np.zeros((n,))
z[0] = b[0]/L[0, 0]
for i in range(1, n):
    z[i] = (b[i] - np.sum(L[i, :i]*z[:i]))/L[i, i]

# back substitution
x = np.zeros((n,))
x[-1] = z[-1]/U[-1, -1]
for i in range(n-2, -1, -1):
    x[i] = (z[i] - np.sum(U[i, i+1:]*x[i+1:]))/U[i, i]

print("\nSolution x: \n", x)

L: 
 [[1.  0.  0. ]
 [0.5 1.  0. ]
 [0.5 0.5 1. ]] 
U: 
 [[ 2.    6.    1.  ]
 [ 0.   -2.    3.5 ]
 [ 0.    0.   -1.25]]

Solution x: 
 [-3.  2.  1.]
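
As a quick sanity check (not in the original notes), we can confirm in the same session that the computed factors reproduce $A$ and that our forward/back substitution solution agrees with NumPy’s built-in solver:

# L and U should multiply back to A, and x should agree with a direct solve
print(np.allclose(L @ U, A))                  # True
print(np.allclose(x, np.linalg.solve(A, b)))  # True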

5 Are We Done?

Not quite! We’ve assumed throughout that our matrix $A$ is regular, which we have now somewhat circularly defined as the class of matrices for which the above $LU$-factorization works. But there are very simple (and easily solvable!) systems of equations for which $A$ is not regular. For example, if the first pivot of $A$ is zero, say because the equation involving $x_1$ happens to be written last, the above algorithm fails immediately even though the system is perfectly solvable! We’ll see a very simple fix next that lets us handle a much more general class of systems with just a small tweak.
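
For instance, here is a small demo (not part of the original notes) that runs the lu function above on a trivially solvable system whose coefficient matrix simply swaps the roles of $x_1$ and $x_2$:

# The system x_2 = 1, x_1 = 2 is trivially solvable, but its coefficient matrix
# has a zero in the (1, 1) pivot position, so our lu routine gives up immediately.
A_swap = np.array([[0., 1.],
                   [1., 0.]])
L_swap, U_swap = lu(A_swap)   # prints "A is not regular"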

6 Optional: $LU$-factorization is Gaussian Elimination!

We have $A = LU$. Suppose we have applied Gaussian Elimination to put our system into the form $U\vv x = \vv c$. Then there is a matrix $M$ such that $MA = U$ and $M\vv b = \vv c$. So we have

$$
MA\vv x = MLU\vv x = M\vv b.
$$

But if $MA = U$ and $A = LU$, then $MLU = U$; since $U$ is invertible (it is upper triangular with nonzero pivots), we must have $ML = I$.

Now, if we go back to solving our system using the $LU$-factorization, we have $LU\vv x = \vv b$, which we can turn into $L\vv z = \vv b$ and $U\vv x = \vv z$. We saw that we can solve $L\vv z = \vv b$ by forward substitution, but notice that we also have $ML\vv z = \vv z = M\vv b$, i.e., the matrix $M$ that reduces our original system of equations to upper triangular form is actually performing this forward substitution for us.
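
We can see this numerically with a small sketch (not part of the original notes) that reuses the elementary matrices from the earlier cells; here $M$ is the matrix we called $E$ above, and we restore the original $A$ and $\vv b$:

# Restore the original system and rebuild M = E_3 E_2 E_1 and L = L_1 L_2 L_3
A = np.array([[1., 2., 1.], [2., 6., 1.], [1., 1., 4.]])
b = np.array([2., 7., 3.])
M = E_3 @ E_2 @ E_1
L = L_1 @ L_2 @ L_3

print(M @ L)   # the identity matrix, confirming M L = I
print(M @ b)   # [2, 3, 2.5], i.e., the same c that forward substitution on L z = b produces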


Footnotes
  1. Forward Substitution is exactly what it sounds like. To solve $L\vv y = \vv c$, when $L$ is lower triangular, we start at the top row and set $y_1 = c_1/l_{11}$. We then move down a row, and solve $l_{21}y_1 + l_{22}y_2 = c_2$ by plugging in $y_1$ and solving for $y_2$. We continue in this manner until we’ve solved for all of $\vv y$.

  2. A triangular matrix with all 1’s on the diagonal is also called a unitriangular matrix. The “uni-” prefix indicates that the diagonal should be all 1’s.