
1.6 The Permuted LU-Factorization

Let's mix things up a little

Dept. of Electrical and Systems Engineering
University of Pennsylvania


Lecture notes

1 Reading

Material related to this page, as well as additional exercises, can be found in ALA Ch. 1.4, LAA Ch 2.5, and ILA Ch. 2.4. These notes are mostly based on ALA Ch 1.4.

2 Learning Objectives

By the end of this page, you should know:

  • how to solve linear equations when A is not regular
  • what permutation matrices are
  • how to use the permuted LU factorization

3 Interchanging Rows

The method of Gaussian Elimination works only if the matrix is regular. However, not every matrix is regular, as the example below shows.

\begin{align*} 2y + z & = 2,\\ 2x + 6y + z & = 7,\\ x + y + 4z & = 3, \end{align*}

The augmented coefficient matrix for the above set of equations is

\left[ \begin{array}{ccc|c} 0 & 2 & 1 & 2\\ 2 & 6 & 1 & 7\\ 1 & 1 & 4 & 3 \end{array}\right].

In the above example, the (1,1) entry is 0: it sits in the first diagonal position, but cannot serve as a pivot. The “problem” is that x does not appear in the first equation. This is actually a good thing, because we already have an equation with only two variables in it. Hence, we need to eliminate x from only one of the other two equations. Let us interchange the first two rows of the augmented matrix above:

\left[ \begin{array}{ccc|c} 2 & 6 & 1 & 7\\ 0 & 2 & 1 & 2\\ 1 & 1 & 4 & 3 \end{array}\right],

which clearly does not change the solution set, but now we have pivots at (1,1) and (2,2). Interchanging the first and second rows is equivalent to swapping the first and second equations. Now we can proceed as in Gaussian Elimination, zeroing out the (3,1) and (3,2) entries using elementary row operations to get

\left[ \begin{array}{c|c} U & \textbf{c} \end{array}\right] = \left[ \begin{array}{ccc|c} 2 & 6 & 1 & 7\\ 0 & 2 & 1 & 2\\ 0 & 0 & \frac{9}{2} & \frac{3}{2} \end{array}\right].

The pivots are 2, 2, \frac{9}{2}, and solving via back substitution yields the solution x = \frac{5}{6}, y = \frac{5}{6}, z = \frac{1}{3}.
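Assuming NumPy is available, the hand computation above can be reproduced numerically. This is only a sketch of this particular example, not a general solver:

```python
import numpy as np

# The system from above, with the zero in the (1,1) position
A = np.array([[0., 2., 1.],
              [2., 6., 1.],
              [1., 1., 4.]])
b = np.array([2., 7., 3.])

# Type 2 operation: swap rows 1 and 2 so that (1,1) holds a nonzero pivot
A[[0, 1]] = A[[1, 0]]
b[[0, 1]] = b[[1, 0]]

# Type 1 operations: zero out the entries below the pivots
m = A[2, 0] / A[0, 0]                 # multiplier 1/2 for the (3,1) entry
A[2] -= m * A[0]; b[2] -= m * b[0]
m = A[2, 1] / A[1, 1]                 # multiplier -1 for the (3,2) entry
A[2] -= m * A[1]; b[2] -= m * b[1]

# Back substitution on the resulting upper triangular system
z = b[2] / A[2, 2]
y = (b[1] - A[1, 2] * z) / A[1, 1]
x = (b[0] - A[0, 1] * y - A[0, 2] * z) / A[0, 0]
print(x, y, z)   # 5/6, 5/6, 1/3
```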

This row swapping operation is pretty useful, and so we’ll make it another elementary row operation, in addition to the type 1 operation of adding a scalar multiple of one row to another that we’ve been relying on so far.

Square matrices A that can be reduced to upper triangular form with nonzeros along the diagonal via Gaussian Elimination with pivoting play a distinguished role in linear algebra. As we’ll see shortly, when such a matrix defines a system of linear equations A\vv x = \vv b, there is a unique solution for any choice of right-hand side \vv b. Because these kinds of matrices are so important, we’ll give them a name and call them nonsingular. Why we call them this will become clearer later in the course, but for now, this will just be our name for “nice” matrices A as described below.

In contrast, a singular square matrix cannot be reduced to such upper triangular form by such row operations, because at some stage in the elimination procedure the diagonal pivot entry and all of the entries below it are zero.
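As a minimal (hypothetical) illustration, consider a 2x2 matrix whose second row is twice its first; one elimination step wipes out the second pivot position entirely:

```python
import numpy as np

# Hypothetical singular matrix: row 2 is twice row 1
A = np.array([[1., 2.],
              [2., 4.]])

# Type 1 operation: subtract 2x row 1 from row 2
A[1] -= 2 * A[0]
print(A)
# The (2,2) pivot position, and everything below it, is now zero,
# so no row interchange can supply a nonzero second pivot.
```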

We summarize the discussion about unique solutions of systems of equations for nonsingular A in the theorem below.

We are able to prove the “if” part of this theorem: when A is nonsingular, Gaussian Elimination with pivoting succeeds, so the solution exists and is unique. We’ll come back to prove the “only if” part later.

4 Small Pivots Cause Numerical Issues

It’s important to remember that when linear algebra is applied in engineering, data science, economics, and scientific domains, it is done using computers. Computers store data using bits (0s and 1s), and as such have limited precision. This means that computers eventually have to round off numbers, because they can’t keep infinitely many decimal places. This matters because when the numbers you are working with span a very large range (i.e., some numbers are very big and others are very small), even very small rounding errors introduced by the limits of your computer can cause significant errors in your solutions. Here we’ll look at a simple example of such numerical ill-conditioning, and see that by being cognizant of these issues, we can find a workaround.

We’ll look at the simple system of two equations in two unknowns:

\begin{align*} 0.01x + 1.6y & = 32.1,\\ x + 0.6y & = 22, \end{align*}

where the exact solution is x = 10, y = 20 (you can check this). Now let’s assume that we only have a very primitive computer for solving this system of equations, one that retains just three significant digits for any number we store in it. Of course, modern computers have much higher, but still finite, precision. We can modify the example below to produce similar issues on modern computers.

The exact augmented matrix, and its exact reduction to upper triangular form, are given below.

\left[ \begin{array}{cc|c} .01 & 1.6 & 32.1\\ 1 & .6 & 22 \end{array}\right] \leftrightarrow \left[ \begin{array}{cc|c} .01 & 1.6 & 32.1\\ 0 & -159.4 & -3188 \end{array}\right].

Since our primitive computer retains only three significant digits, the matrix it actually computes is instead

\left[ \begin{array}{cc|c} .01 & 1.6 & 32.1\\ 0 & -159 & -3190 \end{array}\right].

If we use back substitution with the rounded augmented matrix to solve the system, we first get y = \frac{-3190}{-159} \approx 20.1. Well, that’s not too bad: we’re only off by 0.1 in y. But if we proceed to solve for x using this value of y, we get x \approx -10! This is a big problem! A small error in y, i.e., 0.1, produced a huge error in x! The culprit is the small pivot 0.01 at (1,1). The small error in y is scaled up when evaluating the first equation, where we have to divide the expression by the small pivot 0.01. If we instead interchange the rows so that the pivot is 1 and proceed from there via Gaussian Elimination, the approximate solution is y \approx 20.1, x \approx 9.9, which is much more reasonable.
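We can see this effect concretely by simulating our three-digit computer in Python. The helper `round3` below is a hypothetical stand-in that rounds every intermediate result to three significant digits; this is a sketch of the two elimination orders, not a general algorithm:

```python
def round3(v):
    """Round to three significant digits, mimicking our primitive computer."""
    return float(f"{v:.3g}")

# Elimination with the small pivot 0.01 first (no row swap)
m = round3(1 / 0.01)                   # multiplier 100
a22 = round3(0.6 - m * 1.6)            # -159.4 rounds to -159
b2 = round3(22 - m * 32.1)             # -3188 rounds to -3190
y = round3(b2 / a22)                   # about 20.1
x = round3(round3(32.1 - round3(1.6 * y)) / 0.01)
print(x, y)                            # -10.0 20.1  (x is wildly wrong)

# Now swap rows first so the pivot is 1
m = round3(0.01 / 1)
a22s = round3(1.6 - m * 0.6)           # 1.59
b2s = round3(32.1 - m * 22)            # 31.9
ys = round3(b2s / a22s)                # about 20.1
xs = round3(22 - round3(0.6 * ys))     # about 9.9
print(xs, ys)                          # 9.9 20.1  (much closer to x = 10)
```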

5 Permutation matrices

Similar to how type 1 operations are represented using elementary matrices, we can represent type 2 operations using permutation matrices.

For example, to swap rows 1 and 2 of the matrix AA, we left multiply AA by PP as follows.

A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}, \quad P = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \Rightarrow PA = \begin{bmatrix} 4 & 5 & 6 \\ 1 & 2 & 3 \\ 7 & 8 & 9 \end{bmatrix}

5.1 Python Break!

import numpy as np

P = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 1]])
A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# Swap rows 1 and 2 of A
P @ A
array([[4, 5, 6], [1, 2, 3], [7, 8, 9]])

6 Permuted LU-factorization

From Definition 1, every nonsingular matrix A can be reduced to upper triangular form using type 1 operations and type 2 operations. If we perform all the permutations first using a permutation matrix P, then we obtain a regular matrix PA that admits an LU factorization. Hence we can define the permuted LU factorization of a nonsingular matrix A as follows.

Given a permuted LU factorization PA = LU of the matrix A, we can solve the system of equations A\textbf{x} = \textbf{b} by first doing forward substitution to solve L\textbf{z} = P\textbf{b}, and then back substitution to solve U\textbf{x} = \textbf{z}.
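As a sketch, here is this two-step recipe applied to our running example, assuming the factors worked out earlier (P swaps the first two rows, L has a unit diagonal, and PA = LU):

```python
import numpy as np

# Factors of PA = LU for the running example A = [[0,2,1],[2,6,1],[1,1,4]]
P = np.array([[0., 1., 0.],
              [1., 0., 0.],
              [0., 0., 1.]])
L = np.array([[1.,   0., 0.],
              [0.,   1., 0.],
              [0.5, -1., 1.]])
U = np.array([[2., 6., 1. ],
              [0., 2., 1. ],
              [0., 0., 4.5]])
b = np.array([2., 7., 3.])

# Forward substitution: solve L z = P b (L has unit diagonal, so no division)
pb = P @ b
z = np.zeros(3)
for i in range(3):
    z[i] = pb[i] - L[i, :i] @ z[:i]

# Back substitution: solve U x = z
x = np.zeros(3)
for i in reversed(range(3)):
    x[i] = (z[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]

print(x)   # [5/6, 5/6, 1/3], matching the solution found earlier
```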

6.1 Python Break!

Implementing the algorithms you learn in this class from scratch in NumPy is a great way to both gain a deeper understanding of the math and to practice your coding skills. However, when applying these ideas in real-world situations, it is usually better to use built-in functions: these typically have 100s, if not 1000s, of engineering hours behind them, and have been optimized for reliability, performance, and scalability. In the code snippet below, we use the standard scientific computing package SciPy’s built-in lu implementation for the LU-factorization (with pivots). Note that this implementation is slightly different from the one described in our notes, and instead computes P, L, U such that A = PLU. By the end of the next section on matrix inverses, you’ll know how to relate this expression to the one we described above.

# Permuted LU factorization

import numpy as np
from scipy.linalg import lu

A = np.array([[0, 2, 1],
              [2, 6, 1],
              [1, 1, 4]])
P, L, U = lu(A)
print("P: \n", P, "\nL: \n", L, "\nU: \n", U)
print(f'A - P L U =\n {A - P @ L @ U}')
P: 
 [[0. 1. 0.]
 [1. 0. 0.]
 [0. 0. 1.]] 
L: 
 [[ 1.   0.   0. ]
 [ 0.   1.   0. ]
 [ 0.5 -1.   1. ]] 
U: 
 [[2.  6.  1. ]
 [0.  2.  1. ]
 [0.  0.  4.5]]
A - P L U =
 [[0. 0. 0.]
 [0. 0. 0.]
 [0. 0. 0.]]
# Let's try a bigger example!  How big can you make n before it takes too long to solve?

n = 2000
A = np.random.randn(n,n)

P, L, U = lu(A)
print(f'max|(A - P L U)_ij| =\n {np.max(np.abs(A - P @ L @ U))}')
max|(A - P L U)_ij| =
 1.461053500406706e-13
