Material related to this page, as well as additional exercises, can be found in ALA Ch. 1.2, LAA Ch. 2.1, and ILA Ch. 2.4. These notes are mostly based on ALA Ch. 1.2 and LAA Ch. 2.1.
A lot of this page reviews mechanics that you’ve already seen in Math 1410. If you feel like you remember all of this material well and just want to skip to the new stuff, here are some shortcuts:
for a generic matrix A of size m×n (read “m by n”), where m denotes the number of rows in A and n denotes the number of columns[1]. Therefore, the preceding examples of matrices have respective sizes 2×3, 4×2, and 1×3. A matrix is square if m=n, i.e., it has the same number of rows as columns. A column vector is an m×1 matrix, while a row vector is a 1×n matrix. While these might seem like they are the same thing, they very much are not! Column vectors end up playing a much more important role in our story, and so whenever we just say “vector” we will always mean a column vector. A 1×1 matrix, which has a single entry, is both a column and row vector, and as we’ll see later, behaves like an ordinary scalar number.
The number that lies in the ith row and jth column of A is called the (i,j) entry of A, and is denoted by aij. The row index always appears first and the column index second. Each column of A is an m×1 vector, and we denote the columns by a1,…,an. It will often be convenient to write a matrix in terms of its columns:
Python Break! Constructing Matrices and Vectors in NumPy
NumPy is a standard scientific computing module in the Python programming language, and is widely used in the engineering, data science, and financial industries. Throughout these notes, we will provide you with NumPy code snippets that implement the mathematical concepts we introduce. The purpose of these snippets is twofold. First, you will have coding exercises on the problem sets, and these snippets give you examples of the syntax needed to implement concepts from class. Second, they let you see the math “come to life.” While we are starting with some pretty basic ideas now, we’ll soon be able to do some really interesting things with a few lines of code. A great way to get intuition about mathematical objects is to play with them in code. You can do this by clicking the button below[2], which will launch an interactive version of this notebook where you can modify code to see what changes in the output. You are strongly encouraged to do this!
NumPy uses data structures called NumPy Arrays to represent matrices and vectors. Accessing elements of a matrix is done using zero indexing: see this page for examples.
# Constructing matrices and vectors
import numpy as np
# Create a 2x3 matrix A
A = np.array([[1, 2, 3],
[4, 5, 6]])
print(f'A= {A}')
# We can check the size of an array by accessing its shape field
print(A.shape)
# Print 2,3 element of A (note the zero indexing means we access [1,2] element of array!)
print(f'A23 = {A[1,2]}')
# Create a 1x4 row-vector b
b = np.array([[0, -1, 1, 3]])
print(f'b = {b}')
# Print first two elements of b
print(f'first two elements of b = {b[0,0:2]}')
# Create a 3x1 column-vector c
c = np.array([[0],
[1],
[2]])
print(f'c = {c}')
A= [[1 2 3]
[4 5 6]]
(2, 3)
A23 = 6
b = [[ 0 -1 1 3]]
first two elements of b = [ 0 -1]
c = [[0]
[1]
[2]]
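The column view of a matrix described earlier can also be explored in NumPy. The following is a minimal sketch, not part of the original notebook; the names `a1`, `a2`, `a3`, and `A_rebuilt` are chosen here for illustration. It pulls out each column of A as a 2×1 array and then stacks the columns side by side to recover A.

```python
import numpy as np

# Same 2x3 example matrix as in the cell above
A = np.array([[1, 2, 3],
              [4, 5, 6]])

# Indexing with a list, A[:, [j]], keeps the result two-dimensional,
# so each column comes out as a 2x1 column vector rather than a 1-D array
a1 = A[:, [0]]
a2 = A[:, [1]]
a3 = A[:, [2]]
print(f'a1 = {a1}, with shape {a1.shape}')

# Stacking the columns side by side recovers the original matrix
A_rebuilt = np.hstack([a1, a2, a3])
print(f'Columns reassemble A: {np.array_equal(A, A_rebuilt)}')
```

Note that `A[:, 0]` (without the inner brackets) would return a 1-D array of shape `(2,)` instead, which is neither a row nor a column vector.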
First we define addition of matrices. You are allowed to add two matrices only if they are of the same size, and matrix addition is performed entry-wise. For example
More generally, if A and B are m×n matrices, then their sum C=A+B is the m×n matrix whose entries are given by cij=aij+bij for i=1,…,m and j=1,…,n. When defined, matrix addition behaves just like ordinary addition. It is commutative, A+B=B+A, and associative, A+(B+C)=(A+B)+C.
## Matrix addition
# Create two 2x3 matrices A and B
A = np.array([[1, 2, 3],
[4, 5, 6]])
B = np.array([[2, 4, 6],
[8, 10, 12]])
#Add them together
add = A + B # adding two matrices
print(f'A + B = {add}')
# this is a 1-D array of size 3 in python, which is different than the column-vector c created above
vec = np.array([-1, -2, 3])
# Notice how a 1-D array is added to a 2-D array in Python: vec is added to every row of A.
# Try A + c and observe what happens.
add_vec = A + vec
print(f'Adding incompatible matrices in python does something strange! A + vec = {add_vec}')
A + B = [[ 3 6 9]
[12 15 18]]
Adding incompatible matrices in python does something strange! A + vec = [[0 0 6]
[3 3 9]]
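Because NumPy silently stretches ("broadcasts") some arrays of different shapes, it does not always enforce the mathematical rule that only matrices of the same size can be added. The sketch below shows one possible way to enforce it yourself. The helper `add_same_size` is a hypothetical function written for this illustration, not a NumPy function.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
B = np.array([[2, 4, 6],
              [8, 10, 12]])
vec = np.array([-1, -2, 3])  # 1-D array of shape (3,)

def add_same_size(X, Y):
    """Add X and Y only if their shapes match exactly (hypothetical helper)."""
    if X.shape != Y.shape:
        raise ValueError(f'cannot add arrays of shapes {X.shape} and {Y.shape}')
    return X + Y

# Same-size matrices add entry-wise, and the order does not matter
print(np.array_equal(add_same_size(A, B), add_same_size(B, A)))

# A (2x3) and vec (shape (3,)) have different shapes, so the helper refuses
try:
    add_same_size(A, vec)
    size_error = False
except ValueError as err:
    size_error = True
    print(f'Error: {err}')
```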
A scalar is a fancy name for an ordinary number. For now, we restrict ourselves to scalars, vectors, and matrices with real entries, but we will eventually extend these ideas to complex numbers and matrices with complex entries. Although they are technically not the same thing, we will treat the 1×1 matrix [c] and the scalar c∈R as the same object,[3] that is to say as an ordinary number, so we will drop the brackets. Scalar multiplication takes a scalar c and an m×n matrix A and computes the m×n matrix B=cA by multiplying each entry of A by c. For example:
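A minimal NumPy sketch of scalar multiplication, using a made-up matrix and the made-up scalar c=3 (these values are illustrative and are not taken from the notes):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
c = 3

# Scalar multiplication multiplies every entry of A by c
B = c * A
print(f'cA = {B}')

# A 1x1 array behaves like the scalar it contains when multiplied entry-wise
c_as_matrix = np.array([[3]])
print(np.array_equal(c_as_matrix * A, B))
```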
We first look at multiplying a row vector with a column vector and use this as the base to multiply two matrices. Let a be a 1×n row vector and x be an n×1 column vector. The product ax is defined as the scalar ax=a1x1+a2x2+⋯+anxn.
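As a minimal NumPy sketch, the row-times-column product sums the entry-by-entry products a1x1+⋯+anxn. The vectors below are made-up examples, and `manual` is a name introduced here to hold the sum computed by hand.

```python
import numpy as np

a = np.array([[1, 2, 3]])        # 1x3 row vector
x = np.array([[4], [5], [6]])    # 3x1 column vector

# The @ operator performs matrix multiplication; the result is a 1x1 array
ax = a @ x
print(f'ax = {ax}, with shape {ax.shape}')

# The same number, computed directly from the definition
manual = sum(a[0, k] * x[k, 0] for k in range(a.shape[1]))
print(f'a1*x1 + a2*x2 + a3*x3 = {manual}')
```

Here 1·4 + 2·5 + 3·6 = 32, so both computations agree.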
We can multiply two matrices A and B if and only if they have compatible sizes. For example, to compute the product C=AB, A must have the same number of columns as the number of rows in B. If A is an m×n matrix and B is an n×p matrix, then C=AB is an m×p matrix, where each entry is the product of the ith row of A with the jth column of B: cij=ai1b1j+ai2b2j+⋯+ainbnj for i=1,…,m and j=1,…,p.
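The sketch below spells out the entry-by-entry definition of the matrix product and compares it against NumPy's built-in `@` operator. The function `matmul_by_definition` is a hypothetical helper written for illustration (in practice you would simply use `A @ B`), and the matrices are made-up examples.

```python
import numpy as np

def matmul_by_definition(A, B):
    """Compute C = AB one entry at a time (illustrative, not efficient)."""
    m, n = A.shape
    n_B, p = B.shape
    if n != n_B:
        raise ValueError(f'incompatible sizes: {A.shape} and {B.shape}')
    C = np.zeros((m, p))
    for i in range(m):
        for j in range(p):
            # Entry (i, j) is row i of A times column j of B
            C[i, j] = A[i, :] @ B[:, j]
    return C

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # 2x3
B = np.array([[1, 0],
              [0, 1],
              [1, 1]])      # 3x2

C = matmul_by_definition(A, B)
print(f'C = {C}, with shape {C.shape}')
print(f'Matches A @ B: {np.allclose(C, A @ B)}')
```

Swapping the order gives `B @ A`, which is a 3×3 matrix here, so AB and BA need not even have the same size.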
An important special case of matrix-matrix multiplication is the matrix-vector product. Let A be an m×n matrix and x be an n×1 column vector. Then, the matrix-vector product b=Ax is an m×1 column vector, where the entries of b are bi=ai1x1+ai2x2+⋯+ainxn for i=1,…,m.
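A short NumPy sketch of a matrix-vector product, with a made-up matrix and vector. Each entry of b is the corresponding row of A multiplied by x.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])          # 2x3 matrix
x = np.array([[1], [0], [-1]])     # 3x1 column vector

# The product of a 2x3 matrix and a 3x1 vector is a 2x1 column vector
b = A @ x
print(f'b = {b}, with shape {b.shape}')

# Entry i of b is row i of A times x
b1 = (A[[0], :] @ x)[0, 0]
b2 = (A[[1], :] @ x)[0, 0]
print(f'b1 = {b1}, b2 = {b2}')
```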
NumPy has built-in functions for constructing special matrices such as the identity matrix and the all-zeros matrix. Other special matrices include the all-ones matrix and Vandermonde matrices (which you’ll see in your homework set).
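A minimal sketch of these constructors, using `np.eye`, `np.zeros`, `np.ones`, and `np.vander`. The sizes and the points used for the Vandermonde matrix are made up for illustration.

```python
import numpy as np

# 3x3 identity matrix
I = np.eye(3)
print(f'I = {I}')

# 2x3 all-zeros and all-ones matrices (the shape is passed as a tuple)
Z = np.zeros((2, 3))
O = np.ones((2, 3))
print(f'Z = {Z}')
print(f'O = {O}')

# Vandermonde matrix built from the points 1, 2, 3;
# with increasing=True the columns are x^0, x^1, x^2
V = np.vander(np.array([1, 2, 3]), increasing=True)
print(f'V = {V}')

# Multiplying by the identity leaves a matrix unchanged
print(np.array_equal(I @ V, V))
```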
Summarized below are the rules for how you can combine matrix addition, scalar multiplication, and matrix-matrix multiplication. These are properties you should have seen in Math 1410, so we will not go over them in detail, but it is important to keep these in mind as we progress through the semester. The big thing to remember is that matrix-matrix multiplication comes with more rules than traditional scalar multiplication.
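A few of the standard rules can be spot-checked numerically. The sketch below uses three made-up 2×2 matrices to check associativity and distributivity, and to show that matrix multiplication is generally not commutative. A numerical check on one example illustrates a rule but does not prove it.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])
C = np.array([[2, 0],
              [0, 3]])

# Associativity: (AB)C = A(BC)
associative = np.array_equal((A @ B) @ C, A @ (B @ C))
print(f'(AB)C == A(BC): {associative}')

# Distributivity: A(B + C) = AB + AC
distributive = np.array_equal(A @ (B + C), A @ B + A @ C)
print(f'A(B+C) == AB + AC: {distributive}')

# Commutativity fails in general: AB is usually different from BA
commutes = np.array_equal(A @ B, B @ A)
print(f'AB == BA: {commutes}')
```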
Let’s use our new matrix-vector tools to compactly write out systems of linear equations. Recall that a general linear system of m equations in n unknowns takes the form
Equation (21) is composed of three basic ingredients: the m×n coefficient matrix A, with entries aij as in (2), the column vector $x=\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$ containing the unknowns or variables, and the column vector $b=\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$ containing the right-hand sides. As you can see, it is a bit unwieldy to write column vectors inline, and so we will often equivalently write them as x=(x1,x2,⋯,xn) and b=(b1,b2,⋯,bm) instead.
Revisiting linear system (2), we see that the coefficient matrix A, the unknown vector x, and the right-hand side vector b can be read off as
An important observation is that if a variable doesn’t appear in an equation, then the corresponding matrix entry is 0. For example, the following (admittedly silly) system of 2 equations in 2 unknowns
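To make the three ingredients concrete, here is a minimal NumPy sketch for a made-up system of 2 equations in 2 unknowns (it is not one of the systems from these notes): x1 + 2x2 = 5 and 3x2 = 6. Since x1 does not appear in the second equation, its coefficient there is 0.

```python
import numpy as np

# Hypothetical system:
#   1*x1 + 2*x2 = 5
#   0*x1 + 3*x2 = 6
A = np.array([[1, 2],
              [0, 3]])      # coefficient matrix
b = np.array([[5],
              [6]])         # right-hand sides

# Candidate solution x1 = 1, x2 = 2, written as a column vector
x = np.array([[1],
              [2]])

# x solves the system exactly when Ax equals b
is_solution = np.array_equal(A @ x, b)
print(f'Ax = {A @ x}')
print(f'x solves Ax = b: {is_solution}')
```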
Remember that we write x∈S to mean that the element x lives in the set S. In this example, c∈R means that the element c lives in the real line R, and hence c is a scalar.