1.5 LU-Factorization
A new matrix factorization
1Reading¶
This section is based on ALA Ch 1.3. Optional: for an alternative perspective on the LU-factorization, see Ch 5.3 of Jessy Grizzles ROB 101 notes.
2Learning Objectives¶
By the end of this page, you should know:-
- what the factorization of a matrix is
- how to apply factorization to solve systems of linear equations
- how to represent row operations using elementary matrices
- LU factorization using elementary matrix actions
- how LU factorization relates to Gaussian Elimination (forward elimination and back substitution)
3We Like Triangular Matrices!¶
A theme we’ve exploited so far is that if we have an upper triangular system of equations, i.e., if for and upper triangular matrix, then finding a solution is easy via back substitution. Hopefully you believe that a lower triangular system of equations of the form for a lower triangular matrix is also easy to solve via forward substitution. This section starts with the following question:
Is it easy to solve a system of linear equations of the form , where is a lower triangular matrix and is an upper triangular matrix?
This looks like something we might be able to handle, since we know how to solve linear systems that look like , for upper triangular, by Back Substitution, and how to solve linear systems that look like , for lower triangular, by Forward Substitution.[1] Our strategy for solving will be to tackle the problem in two steps: first, we solve an intermediate lower triangular system by Forward Substitution, and then use the solution to that intermediate problem to define an upper triangular system, which we solve by Back Substitution! With that strategy in mind, let’s introduce a new intermediate variable . Then we can find the solution to by solving
for via Forward Substitution, and then solving
for via Backward Substitution. If we piece everything together, we see that is indeed a solution to our original equation because
We summarize the above in the following algorithm for solving linear systems of the form , where is a lower triangular matrix, and is an upper triangular matrix:
- Solve via forward substitution.
- Using , solve via back substitution.
This is exciting! We can solve systems of linear equations even if the coefficient matrix is not triangular, as long as it can be factored into a product of a lower triangular matrix and an upper triangular one. This expands our class of “easy to solve systems” pretty signifcantly. The natural next question is then of course
When can a square matrix be factored as , for and lower and upper triangular matrices, respectively?
What we’ll show in this section is an amazing fact of linear algebra: any matrix that is regular can be factored as a product of a lower triangular matrix and an upper triangular matrix such that . This matrix factorization, the first (but not the last!) we will see in this class, is called the factorization of a matrix (a descriptive, if not particularly creative, name).
4Elementary Matrices¶
Our starting point is the following crucial idea that underpins much of linear algebra.
For example, our strategy in previous sections of adding equations together is in fact an operation that can be realized by matrix multiplication. Our starting point will be to introduce a the idea of an elementary matrix, which encodes the corresponding elementary row operation. So far we’ve only seen the elementary operation of adding equations together, which we recall, is equivalent to adding rows of the augmented matrix together, but we’ll see other types of operations shortly. The definition below will work for all of them.
This definition is a bit abstract, so let’s see it in action. Suppose that we have a system of three equations in three unknowns, encoded by the augmented matrix , where is a matrix and is a vector. What is the elementary matrix associated with subtracting twice the first row from the second row? If we start with the identify matrix
and apply this operation to it, we end up with the elementary matrix
which we named just to keep track of which elementary matrix we are talking about later on. Let’s check if it does what it’s supposed to on a two different matrices
Indeed it does, and by playing around with our rules of matrix arithmetic and multiplication, you should be able to convince yourself that that left multiplying any 3-row matrix by will subtract twice its first row from its second row.
We can also use Definition 1 to reverse engineer what row operation an elementary matrix is encoding.
Python break!¶
import numpy as np
# Define the elementary matrices above
E_1 = np.array([[1., 0., 0.], [-2., 1., 0.],[0., 0., 1.]])
E_2 = np.array([[1., 0., 0.], [0., 1., 0.],[-1., 0., 1.]])
E_3 = np.array([[1., 0., 0.], [0., 1., 0.],[0., 0.5, 1.]])
# Apply them to a 3 x 4 matrix and confirm they do what they're supposed to
A = np.array([[1., 2., 3., 4.],[2., 4., 6., 8.],[0.5, 1., 1.5, 2.]])
# E1: row 2 - 2 x row 1
print(f'A = {A},\n E_1A = {E_1 @ A}')
# E2: row 3 - row 1
print(f'E_2A = {E_2 @ A}')
# E3: row 3 + .5 x row 2
print(f'E_3A = {E_3 @ A}')
A = [[1. 2. 3. 4. ]
[2. 4. 6. 8. ]
[0.5 1. 1.5 2. ]],
E_1A = [[1. 2. 3. 4. ]
[0. 0. 0. 0. ]
[0.5 1. 1.5 2. ]]
E_2A = [[ 1. 2. 3. 4. ]
[ 2. 4. 6. 8. ]
[-0.5 -1. -1.5 -2. ]]
E_3A = [[1. 2. 3. 4. ]
[2. 4. 6. 8. ]
[1.5 3. 4.5 6. ]]
4.1Elementary Matrices in Action¶
Let us use elementary matrices to design a matrix that can help us solve linear equations. Let’s revisit the very first system of equations we encountered:
or in matrix-vector form
which we solved by first applying the row operation encoded by (subtracted twice equation 1 from equation 2), then applying the operation encoded by (subtract the first equation from the last equation ), and finally by applying the operation encoded by (add 1/2 first equation to last equation). Funny how that conveninently worked out! Therefore, if we keep careful track of the row operations and their matrix multiplication realizations, we conclude that:
The way to read the expression is from right-to-left: we start with , then apply , then , and finally . If we give the composition of elementary operations a name, say , then we can check that , that is to say left multiplying the augmented matrix by the matrix performs Gaussian Elmination for us! This is just a first taste of a powerful feature of linear algebra: we can encode very sophisticated and complex operations via matrix multiplication.
Python break!¶
Let’s implement the above example and confirm that we can indeed cook up a matrix such that left multiplying by gives an upper triangular matrix.
# Define A and b
A = np.array([[1., 2., 1.],[2., 6., 1.],[1., 1., 4.]])
b = np.array([2., 7., 3.])
# Build M = [A|b]
M = np.hstack((A,b.reshape((3,1))))
# Build E
E = E_3 @ E_2 @ E_1
# Compute N = [U|c] = E [A|b] = EM
N = E @ M
# Extract out U and c
U = N[:,0:3]
c = N[:,-1].flatten()
print(f'A = {A}, \n b = {b}, \n U = {U}, \n c = {c}')
A = [[1. 2. 1.]
[2. 6. 1.]
[1. 1. 4.]],
b = [2. 7. 3.],
U = [[ 1. 2. 1. ]
[ 0. 2. -1. ]
[ 0. 0. 2.5]],
c = [2. 3. 2.5]
4.2Inverse Elementary Matrices¶
Our work so far shows that we can find a matrix such that , for upper triangular. This is progress, but we still haven’t reached our goal of decomposing into a product of a lower and an upper triangular matrix. Our approach will be to look at the matrix equation and find a matrix that “undoes” the action of left-multiplyting by .
Let’s start by focussing on how to undo the action of an elementary matrix using its corresponding inverse elementary matrix. Since elmentary matrix multiplication implements simple row operations, we can intuitively figure out how to undo them. For example, to undo the action of (subtract twice equation 1 from equation 2), we add twice equation 1 to equation 2. The corresponding matrix is
We note that , since, reverses the action done by and the end result does not do any action. Similarly, we can define appropriate inverses
We have . If we reverse the actions using , we have
where we used the fact that . Hence, calling , we showed that . The matrix is upper triangular, butis lower triangular? The answer is yes due to the following fundamental property of triangular matrices.
Since are all lower triangular, so is their product . Thus we have computed the LU facotrization of using elementary matrix operations.
Python break!¶
Let’s double check all of the above in code. We’ll recycle the , , and from above.
# Define the inverse elementary matrices
L_1 = np.array([[1., 0., 0.], [2., 1., 0.],[0., 0., 1.]])
L_2 = np.array([[1., 0., 0.], [0., 1., 0.],[1., 0., 1.]])
L_3 = np.array([[1., 0., 0.], [0., 1., 0.],[0., -0.5, 1.]])
# Confirm that they "undo" their corresponding elementary matrices' actions
print(f'These should all be the 3x3 identity matrix:\n {L_1 @ E_1},\n {L_2 @ E_2},\n {L_3 @ E_3}')
# Confirm that L = L_3 L_2 L_1 "undoes" the action of E
L = L_1 @ L_2 @ L_3
print(f'L = {L},\n E={E}')
print(f'So should this: \n {L @ E}')
# Finally, let's check that A = LU
print(f'This should be all zeros: \n {A - L@U}')
These should all be the 3x3 identity matrix:
[[1. 0. 0.]
[0. 1. 0.]
[0. 0. 1.]],
[[1. 0. 0.]
[0. 1. 0.]
[0. 0. 1.]],
[[1. 0. 0.]
[0. 1. 0.]
[0. 0. 1.]]
L = [[ 1. 0. 0. ]
[ 2. 1. 0. ]
[ 1. -0.5 1. ]],
E=[[ 1. 0. 0. ]
[-2. 1. 0. ]
[-2. 0.5 1. ]]
So should this:
[[1. 0. 0.]
[0. 1. 0.]
[0. 0. 1.]]
This should be all zeros:
[[ 0. 0. 0. ]
[-3. -9. -1. ]
[-1. -5. 2.5]]
4.3Regular Matrix¶
Using the factorization explained in this page, we can finally give a formal definition of a regular matrix.
Python break!¶
Below is a simple NumPy implementation of the -factorization algorithm that we described above. Notice that it looks very similar to our code for Gaussian elimination, but that we are keeping track of the changes we made to reduce to within the matrix . Specifically, when we subtract scalar
times row j from row i in the line U[i, :] = U[i, :] - scalar*U[j, :]
to perform forward elimination, we adjust the corresponding (i,j) entry of to undo this action by setting L[i, j] = scalar
. You should be able to convince yourself that building this way gives the same answer as computing the product of elementary inverse matrix operations.
def lu(A):
#Get the number of rows
n = A.shape[0]
U = A.copy()
L = np.eye(n, dtype=np.double)
#Loop over rows
for j in range(n):
if U[j, j] == 0:
print("A is not regular")
break
else:
for i in range(j+1, n):
scalar = U[i, j]/U[j, j]
U[i, :] = U[i, :] - scalar*U[j, :]
L[i, j] = scalar
return L, U
Now let’s test this out on our favorite linear system!
## applying LU factoriation to solve linear equations
A = np.array([[2., 6., 1.],
[1., 1., 4.],
[1., 2., 1.]])
b = np.array([7., 3., 2.])
L, U = lu(A) # LU decomposition using scipy package
print("\nL: \n", L, "\nU: \n", U)
n = A.shape[1]
# forward substitution
z = np.zeros((n,))
z[0] = b[0]/L[0, 0]
for i in range(1, n):
z[i] = (b[i] - np.sum(L[i, :i]*z[:i]))/L[i, i]
# back substitution
x = np.zeros((n,))
x[-1] = z[-1]/U[-1, -1]
for i in range(n-2, -1, -1):
x[i] = (z[i] - np.sum(U[i, i+1:]*x[i+1:]))/U[i, i]
print("\nSolution x: \n", x)
L:
[[1. 0. 0. ]
[0.5 1. 0. ]
[0.5 0.5 1. ]]
U:
[[ 2. 6. 1. ]
[ 0. -2. 3.5 ]
[ 0. 0. -1.25]]
Solution x:
[-3. 2. 1.]
5Are We Done?¶
Not quite! We’ve assumed throughout that our matrix is regular, which we now somewhat circularly defined as being the class of matrices for which the above -facotrization works. But there are very simple (and easily solvable!) systems of equations for which is not regular. For example, if is lower triangular, the above algorithm fails! We’ll see a very simple fix next that let’s us handle a much more general class of systems with just a small tweak.
6Optional: -factorization is Gaussian Elimination!¶
We have . Suppose we have applied Gaussian Elimination to get our system into . There is a matrix such that and . So we have
But if then we must have that .
Now, if we go back to solving our system using -factorization, we have that which we can turn into and . We saw that we can solve that by forward substitution, but notice that we also have that , i.e., the matrix that reduces our original system of equations to upper triangular form is actually performing this forward substitution.
Forward Substitution is exactly what it sounds like. To solve , when is lower triangular, we start at the top row and set . We then move down a row, and solve by plugging in and solving for . We continue in this manner until we’ve solved for .
A triangular matrix with all 1’s on the diagonal is also called a unitriangular matrix. The “uni-” prefix indicates that the diagonal should be all 1’s.