1.3 Matrices and Vectors

1 Reading

Material related to this page, as well as additional exercises, can be found in ALA Ch. 1.2, LAA Ch. 2.1, and ILA Ch. 2.4. These notes are mostly based on ALA Ch. 1.2 and LAA Ch. 2.1.

2 Learning Objectives

By the end of this page, you should know:

what matrices and vectors are
what arithmetic operations are allowed when working with matrices and vectors
how to perform arithmetic operations on matrices and vectors
how to represent a linear system of equations using matrices and vectors

3 What’s New from Math 1410

A lot of this page reviews mechanics that you’ve already seen in Math 1410. If you feel like you remember all of this material well and just want to skip to the new stuff, here are some shortcuts:

Matrix-vector multiplication as a linear combination of columns
Matrix-matrix multiplication as batch matrix-vector multiplication
Linear systems in matrix vector notation

4 Matrices and Vectors

A matrix is a rectangular array of numbers. For example

\begin{bmatrix} 1 & 0 & 3 \\ -2 & 4 & 1 \end{bmatrix}, \quad \begin{bmatrix} \pi & 0 \\ e & \frac{1}{2} \\ -1 & 0.83 \\ \sqrt{5} & -\frac{4}{7} \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 & 0 \end{bmatrix}

(1)

are all examples of matrices. We use the notation

A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}

(2)

for a generic matrix $A$ of size $m \times n$ (read “ $m$ by $n$ ”), where $m$ denotes the number of rows in $A$ and $n$ denotes the number of columns^[1]. Therefore, the preceding examples of matrices have respective sizes $2 \times 3$ , $4 \times 2$ , and $1 \times 3$ . A matrix is square if $m=n$ , i.e., it has the same number of rows as columns. A column vector is an $m \times 1$ matrix, while a row vector is a $1 \times n$ matrix. While these might seem like they are the same thing, they very much are not! Column vectors end up playing a much more important role in our story, and so whenever we just say “vector” we will always mean a column vector. A $1 \times 1$ matrix, which has a single entry, is both a column and row vector, and as we’ll see later, behaves like an ordinary scalar number.

The number that lies in the $i$ th row and $j$ th column of $A$ is called the $(i,j)$ entry of $A$ , and is denoted by $a_{ij}$ . The row index always appears first and the column index second. Each column of $A$ is an $m \times 1$ vector, and we denote the columns by $\vv a_1, \dots, \vv a_n$ . It will often be convenient to write a matrix in terms of its columns:

A = \begin{bmatrix} \vv a_1 & \vv a_2 & \cdots & \vv a_n \end{bmatrix}

(3)

Python Break! Constructing Matrices and Vectors in NumPy

NumPy is a standard scientific computing module in the Python programming language, and is widely used in the engineering, data science, and financial industries. Throughout these notes, we will provide you with NumPy code snippets that implement the mathematical concepts we introduce. The purpose of these snippets is twofold. First, you will have coding exercises on the problem sets, and these snippets give you examples of the syntax needed to implement concepts from class. Second, they let you see the math “come to life.” While we are starting with some pretty basic ideas now, we’ll soon be able to do some really interesting things with a few lines of code. A great way to get intuition about mathematical objects is to play with them in code. You can do this by clicking the button below^[2], which will launch an interactive version of this notebook where you can modify code to see what changes in the output. You are strongly encouraged to do this!

NumPy uses data structures called NumPy Arrays to represent matrices and vectors. Accessing elements of a matrix is done using zero indexing: see this page for examples.

# Constructing matrices and vectors
import numpy as np

# Create a 2x3 matrix A
A = np.array([[1, 2, 3],
             [4, 5, 6]])
print(f'A= {A}')

# We can check the size of an array by accessing its shape field
print(A.shape)

# Print 2,3 element of A (note the zero indexing means we access [1,2] element of array!)
print(f'A23 = {A[1,2]}')

# Create a 1x4 row-vector b
b = np.array([[0, -1, 1, 3]])
print(f'b = {b}')

# Print first two elements of b
print(f'first two elements of b = {b[0,0:2]}')

# Create a 3x1 column-vector c
c = np.array([[0],
              [1],
              [2]])

print(f'c = {c}')

A= [[1 2 3]
 [4 5 6]]
(2, 3)
A23 = 6
b = [[ 0 -1  1  3]]
first two elements of b = [ 0 -1]
c = [[0]
 [1]
 [2]]
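The column notation in (3) can also be explored in NumPy. The following is a minimal sketch, not part of the original notebook; the names `columns` and `A_rebuilt` are chosen here purely for illustration. It pulls out each column of a matrix as an $m \times 1$ column vector and then stacks the columns side by side to recover the matrix.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])

# A[:, j] would give a 1-D slice of column j (zero-indexed).
# Indexing with a list, A[:, [j]], keeps the result 2-D with shape (m, 1),
# which matches our convention that a "vector" is a column vector.
columns = [A[:, [j]] for j in range(A.shape[1])]
for j, col in enumerate(columns, start=1):
    print(f'a_{j} has shape {col.shape}')

# Placing the columns side by side rebuilds the original matrix
A_rebuilt = np.hstack(columns)
print(np.array_equal(A, A_rebuilt))
```

Note that the math uses one-based indices ($\vv a_1$ is the first column) while NumPy is zero-based (`A[:, [0]]` is the first column).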

5 Matrix Arithmetic

Matrix arithmetic involves three basic operations: matrix addition, scalar multiplication, and matrix multiplication.

5.1 Matrix Addition

First we define addition of matrices. You are allowed to add two matrices only if they are of the same size, and matrix addition is performed entry-wise. For example

\bm 1 & 2 \\ -1 & 0\em + \bm 3 & -5 \\ 2 & 1 \em = \bm 4 & -3 \\ 1 & 1 \em.

(4)

More generally, if $A$ and $B$ are $m \times n$ matrices, then their sum $C = A+ B$ is the $m \times n$ matrix whose entries are given by $c_{ij} = a_{ij} + b_{ij}$ for $i=1,\dots,m$ and $j=1,\dots,n$ . When defined, matrix addition behaves just like ordinary addition. It is

commutative: $A + B = B + A$ , and
associative: $A + (B + C) = (A+ B) + C = A + B + C$ .
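As a quick numerical sanity check of these two properties, here is a small sketch. The matrices `A` and `B` are taken from example (4); `C` is an arbitrary matrix picked for this illustration. A check on one example is of course not a proof, but it is a handy way to catch mistakes.

```python
import numpy as np

A = np.array([[1, 2], [-1, 0]])
B = np.array([[3, -5], [2, 1]])
C = np.array([[0, 1], [4, -2]])   # arbitrary third matrix (an assumption for this example)

# Commutativity: A + B = B + A
commutes = np.array_equal(A + B, B + A)

# Associativity: A + (B + C) = (A + B) + C
associates = np.array_equal(A + (B + C), (A + B) + C)

print(f'A + B = {A + B}')
print(f'commutative: {commutes}, associative: {associates}')
```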

Exercise 1 (Matrix addition)

For (a)-(e), we are given some matrices. If they can be added together, then find their sum. Otherwise, indicate that they can’t be added.

a. $\begin{bmatrix} 1 & 2 \\ -1 & 0 \end{bmatrix}$ and $\begin{bmatrix} 3 & -5 \\ 2 & 1 \end{bmatrix}$

b. $\begin{bmatrix} 5 \end{bmatrix}$ and $\begin{bmatrix} -3 \end{bmatrix}$

c. $\begin{bmatrix} 1&2\\3&4\\5&6 \end{bmatrix}$ and $\begin{bmatrix} 2&4\\6&8\\10&12 \end{bmatrix}$

d. $\begin{bmatrix} 1 & 2 \end{bmatrix}$ and $\begin{bmatrix} 1\\3 \end{bmatrix}$

e. $\begin{bmatrix} 1\\2 \end{bmatrix}$ and $\begin{bmatrix} 1\\3 \end{bmatrix}$ and $\begin{bmatrix} 2\\5 \end{bmatrix}$

Solution to Exercise 1

a. $\begin{bmatrix} 1 & 2 \\ -1 & 0 \end{bmatrix} + \begin{bmatrix} 3 & -5 \\ 2 & 1 \end{bmatrix} = \bm 1+3&2-5\\-1+2&0+1 \em = \bm 4&-3\\1&1 \em$

b. $\begin{bmatrix} 5 \end{bmatrix} + \begin{bmatrix} -3 \end{bmatrix} = \bm 5 -3 \em = \bm -2 \em$

c. $\begin{bmatrix} 1&2\\3&4\\5&6 \end{bmatrix} + \begin{bmatrix} 2&4\\6&8\\10&12 \end{bmatrix} = \bm 1 + 2 & 2 + 4\\ 3 + 6 & 4 + 8\\5 + 10& 6 + 12 \em = \bm 3&6\\9&12\\15&18 \em$

d. They can’t be added. The left matrix is a 1-by-2 matrix, whereas the right matrix is a 2-by-1 matrix, meaning they have different dimensions.

e. $\begin{bmatrix} 1\\2 \end{bmatrix} + \begin{bmatrix} 1\\3 \end{bmatrix} + \begin{bmatrix} 2\\5 \end{bmatrix} = \bm 1 + 1 + 2 \\ 2 + 3 + 5 \em = \bm 4 \\ 10 \em$

Python Break!

## Matrix addition

# Create two 2x3 matrices A and B
A = np.array([[1, 2, 3],
             [4, 5, 6]])

B = np.array([[2, 4, 6],
             [8, 10, 12]])

#Add them together
add = A + B # adding two matrices
print(f'A + B = {add}')

# this is a 1-D array of size 3 in python, which is different from the column-vector c created above
vec = np.array([-1, -2, 3])
# notice how 1-D arrays add to 2-D arrays in python: vec is added to each row of A.
# Try A+c and observe what happens
add_vec = A + vec
print(f'Adding incompatible matrices in python does something strange! A + vec = {add_vec}')

A + B = [[ 3  6  9]
 [12 15 18]]
Adding incompatible matrices in python does something strange! A + vec = [[0 0 6]
 [3 3 9]]
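The behavior above comes from NumPy's broadcasting rules, which go beyond the mathematical definition of matrix addition. The sketch below is an illustration added here, under the assumption that `A`, `vec`, and `c` are defined as in the earlier cells. It shows that the 1-D array is added to every row of `A`, and that adding the $3 \times 1$ column vector `c` to the $2 \times 3$ matrix `A` is rejected.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
vec = np.array([-1, -2, 3])      # 1-D array of shape (3,)
c = np.array([[0], [1], [2]])    # 3x1 column vector

# Broadcasting treats vec as a row and adds it to each row of A,
# which is the same as adding the 2x3 matrix with vec repeated in both rows
same_as_stacking = np.array_equal(A + vec, A + np.vstack([vec, vec]))
print(same_as_stacking)

# Shapes (2, 3) and (3, 1) cannot be broadcast together, so NumPy raises an error
try:
    A + c
    raised = False
except ValueError as err:
    raised = True
    print(f'ValueError: {err}')
```

When you want the strict mathematical rule, it can help to compare `A.shape` and `B.shape` before adding.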

5.2 Scalar Multiplication

A scalar is a fancy name for an ordinary number. For now, we restrict ourselves to scalars, vectors, and matrices with real entries, but we will eventually extend these ideas to complex numbers and matrices with complex entries. Although technically not the same thing, we will treat the $1 \times 1$ matrix $[c]$ and the scalar $c \in \R$ as the same object,^[3] that is to say as an ordinary number, so we will drop the brackets. Scalar multiplication takes a scalar $c$ and an $m \times n$ matrix $A$ and computes the $m \times n$ matrix $B = cA$ by multiplying each entry of $A$ by $c$ . For example:

c = 3, \quad A = \bm 1 & 2 \\ -1 & 0 \em, \quad cA = 3\bm 1 & 2 \\ -1 & 0 \em=\bm 3 & 6 \\ -3 & 0\em.

(5)

In general, if $B = cA$ , then $b_{ij}=ca_{ij}$ for each entry $i=1,\dots,m$ and $j=1,\dots,n$ .

Exercise 2 (Scalar multiplication)

For (a)-(e), compute the scalar-matrix product.

a. $5\bm 1 & 2 \\ 3 & 4\em$

b. $0\bm 1 & 1 \\ 2 & 3 \em$

c. $\frac 1 2 \bm 2 & 4 \\ 6 & 5 \em$

d. $\sqrt 2 \bm 1 & \sqrt 2 \\ 0 & \pi \em$

e. $i \bm 1 & i \\ -i & -1 \em$ where $i = \sqrt{-1}$

Solution to Exercise 2

a. $5\bm 1 & 2 \\ 3 & 4\em = \bm 5\cdot 1 & 5\cdot 2 \\ 5\cdot 3 & 5\cdot 4\em = \bm 5 & 10 \\ 15 & 20\em$

b. $0\bm 1 & 1 \\ 2 & 3 \em = \bm 0\cdot 1 & 0\cdot 1 \\ 0\cdot 2 & 0\cdot 3 \em = \bm 0 & 0\\ 0 & 0\em$

c. $\frac 1 2 \bm 2 & 4 \\ 6 & 5 \em = \bm 2/2 & 4/2 \\ 6/2 & 5/2 \em = \bm 1 & 2 \\ 3 & 5/2 \em$

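d. $\sqrt 2 \bm 1 & \sqrt 2 \\ 0 & \pi \em = \bm \sqrt 2 \cdot 1 & \sqrt 2 \cdot \sqrt 2 \\ \sqrt 2 \cdot 0 & \sqrt 2 \cdot \pi \em = \bm \sqrt 2 & 2 \\ 0 & \sqrt 2 \pi \em$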

e. $i \bm 1 & i \\ -i & -1 \em = \bm i\cdot 1 & i\cdot i \\i \cdot (-i) & i \cdot (-1) \em= \bm i & -1 \\1 & -i \em$

Python Break!

# scalar multiplication
scalar = 3
scalar_mult = scalar*A
print(f'A = {A},\n c={scalar},\n cA = {scalar_mult}')

A = [[1 2 3]
 [4 5 6]],
 c=3,
 cA = [[ 3  6  9]
 [12 15 18]]

6 Matrix Multiplication Warmup

We first look at multiplying a row vector with a column vector and use this as the base to multiply two matrices. Let $\vv a$ be a $1 \times n$ row vector and $\vv x$ be a $n \times 1$ column vector. The product $\vv a \vv x$ is defined as

\vv a \vv x = \begin{bmatrix} a_1 & a_2 & \cdots & a_n\end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = a_1x_1 + a_2x_2 + \cdots + a_nx_n = \sum_{k=1}^n a_kx_k

(6)

Python Break!

# vector products
b_c = np.array([[1], [2], [-1], [2]])
product = b @ b_c # multiplying row-vector b with column-vector b_c
print(f'b = {b},\n b_c = {b_c},\n  b x b_c = {product}')

b = [[ 0 -1  1  3]],
 b_c = [[ 1]
 [ 2]
 [-1]
 [ 2]],
  b x b_c = [[3]]
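To connect the `@` operator back to formula (6), here is a small sketch that computes the same product by summing $a_k x_k$ term by term. It is an added illustration; the names `a`, `x`, and `total` are chosen for this example and reuse the values of `b` and `b_c` from the cell above.

```python
import numpy as np

a = np.array([[0, -1, 1, 3]])          # 1x4 row vector
x = np.array([[1], [2], [-1], [2]])    # 4x1 column vector

# Sum a_k * x_k over k, exactly as written in formula (6)
n = a.shape[1]
total = sum(a[0, k] * x[k, 0] for k in range(n))

# The @ operator returns a 1x1 matrix holding the same number
product = a @ x
print(f'sum = {total}, a @ x = {product}, shape = {product.shape}')
```

The result of `a @ x` is a $1 \times 1$ array rather than a plain number; `product[0, 0]` extracts the scalar.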

7 Matrix-Matrix Multiplication

We can multiply two matrices $A, B$ if and only if they have compatible sizes. Specifically, to compute the product $C = AB$ , $A$ must have as many columns as $B$ has rows. If $A$ is an $m \times n$ matrix and $B$ is an $n \times p$ matrix, then $C = AB$ is an $m \times p$ matrix, where each element is defined by

c_{ij} = \sum_{k=1}^na_{ik}b_{kj}

(7)

The $(i, j)^{th}$ entry of $C$ is the vector product of the $i^{th}$ row of $A$ and $j^{th}$ column of $B$ .
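Formula (7) translates directly into three nested loops. The function below is a teaching sketch written for these notes' formula, with the hypothetical name `matmul_by_definition`; it is far slower than NumPy's built-in `@` and is only meant to make the indices concrete.

```python
import numpy as np

def matmul_by_definition(A, B):
    """Compute C = AB entry by entry from c_ij = sum_k a_ik * b_kj."""
    m, n = A.shape
    n_B, p = B.shape
    if n != n_B:
        # The number of columns of A must equal the number of rows of B
        raise ValueError(f'incompatible sizes: {A.shape} and {B.shape}')
    C = np.zeros((m, p))
    for i in range(m):          # row of A
        for j in range(p):      # column of B
            for k in range(n):  # summation index
                C[i, j] += A[i, k] * B[k, j]
    return C

A = np.array([[1, 1], [1, 0]])
B = np.array([[2, 4, 1], [0, 3, -1]])
C = matmul_by_definition(A, B)
print(C)
print(np.allclose(C, A @ B))
```

Calling `matmul_by_definition(B, A)` raises a `ValueError`, since a $2 \times 3$ matrix cannot multiply a $2 \times 2$ matrix from the left.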

7.1 Matrix-Vector Multiplication

An important special case of matrix-matrix multiplication is matrix-vector products. Let $A$ be an $m \times n$ matrix and $\vv x$ be an $n \times 1$ column vector. Then, the matrix-vector product $\vv b = A \vv x$ is an $m\times 1$ column vector, where the entries of $\vv b$ are

b_{i} = \sum_{k=1}^na_{ik}x_{k}

(8)

Matrix-vector multiplication as linear combinations of columns

An alternative expression for the matrix-vector product $A\vv x$ can be obtained in terms of a weighted sum of the columns of $A$ with weights given by the entries of $\vv x$ . More precisely, if we let $\vv a_1, \vv a_2, \dots, \vv a_n$ be the columns of $A$ so that

A = \bm \vv a_1 & \vv a_2 & \cdots & \vv a_n \em,

(9)

then, another formula for $\vv b = A \vv x$ is

\vv b = x_1 \vv a_1 + x_2 \vv a_2 + \cdots + x_n \vv a_n,

(10)

that is, $\vv b$ is computed by adding the columns of $A$ weighted by the entries of $\vv x$ . This will be useful later when we discuss the column span of a matrix $A$ .
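The column view in (10) is easy to check numerically. In the sketch below (an added illustration; the matrix and vector are chosen for this example and the name `combo` is hypothetical), we build the weighted sum of columns by hand and compare it with `A @ x`.

```python
import numpy as np

A = np.array([[2, 4, 1],
              [0, 3, -1]])
x = np.array([13, 8, 2])

# x_1 * a_1 + x_2 * a_2 + ... + x_n * a_n, where a_j = A[:, j] is column j of A
combo = sum(x[j] * A[:, j] for j in range(A.shape[1]))

print(f'linear combination of columns = {combo}')
print(f'A @ x = {A @ x}')
print(np.array_equal(combo, A @ x))
```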

Matrix-matrix multiplication as batch matrix-vector multiplication

Another convenient way of computing the matrix product $C = AB$ is in terms of matrix-vector products between $A$ and the columns $\vv b_i$ of $B$ . If $A$ is an $m \times n$ matrix, and $B$ is an $n \times p$ matrix with columns $\vv b_1,\dots,\vv b_p$ , then the matrix product $AB$ is the $m \times p$ matrix whose columns are $A\vv b_1, A\vv b_2,\dots, A\vv b_p$ , that is,

C = AB = A\bm \vv b_1 & \vv b_2 & \cdots & \vv b_p\em = \bm A\vv b_1 & A\vv b_2 & \cdots & A\vv b_p\em.

(11)

The $k^{th}$ column of $C$ is the matrix-vector product of $A$ and the $k^{th}$ column of $B$ , and so we can think of the matrix-matrix product as conducting $p$ matrix-vector products as a batch in parallel.
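The batch interpretation in (11) can be sketched as follows. This is an added illustration with example matrices and hypothetical names `cols` and `C_batch`: we multiply $A$ by each column of $B$ separately, place the results side by side, and compare against `A @ B`.

```python
import numpy as np

A = np.array([[1, 1],
              [1, 0]])
B = np.array([[2, 4, 1],
              [0, 3, -1]])

# Multiply A by each column b_k of B separately (B[:, [k]] keeps b_k as an n x 1 column)
cols = [A @ B[:, [k]] for k in range(B.shape[1])]

# Place the p resulting m x 1 columns side by side to form the m x p product
C_batch = np.hstack(cols)

print(C_batch)
print(np.array_equal(C_batch, A @ B))
```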

Exercise 4 (Matrix Multiplication)

For (a)-(c), compute the expressions if they exist; otherwise, indicate that the product doesn’t exist.

$A = \bm 1 & 1 \\ 1 & 0 \em, B = \bm 2 & 4 & 1 \\ 0 & 3 & -1 \em, \vv x = \bm 13\\ 8 \\ 2\em, \vv y = \bm 1 & 2\em$

a. $AB, BA$

b. $B \vv x$

c. $\vv y B, \vv y B \vv x$

Solution to Exercise 4

a. $BA$ does not exist: $B$ is $2 \times 3$ and $A$ is $2 \times 2$ , so the number of columns of $B$ (3) does not match the number of rows of $A$ (2).

AB = \bm 1 & 1 \\ 1 & 0 \em \bm 2 & 4 & 1 \\ 0 & 3 & -1 \em = \bm 1*2 + 1*0 & 1*4 + 1*3 & 1*1 + 1*-1\\ 1*2 + 0*0 & 1*4 + 0*3 & 1*1 + 0*-1\em = \bm 2 & 7 & 0 \\ 2 & 4 & 1 \em

(12)

b. $\bm 2 & 4 & 1 \\ 0 & 3 & -1 \em\bm 13\\ 8 \\ 2\em = 13\bm 2 \\ 0 \em + 8\bm 4 \\ 3\em + 2\bm 1 \\ -1 \em= \bm 60 \\ 22 \em$

c. $\vv y B = \bm 1 & 2\em \bm 2 & 4 & 1 \\ 0 & 3 & -1 \em = 1\bm 2 & 4 & 1 \em + 2 \bm 0 & 3 & -1 \em = \bm 2 & 10 & -1 \em$

\vv y B \vv x = \bm 2 & 10 & -1 \em \bm 13\\ 8 \\ 2\em = 2*13 + 10*8 + -1*2 = 104

(13)

Python Break!

# Code implementing solutions to Exercise 4
A = np.array([[1, 1],
              [1, 0]])
B = np.array([[2, 4, 1],
              [0, 3, -1]])
x = np.array([13, 8, 2])
y = np.array([1, 2])

a_r = A@B
b_r = B@x
c_r = y@B
c_r1 = y@B@x
print("\nAB: \n", a_r, "\nBx: \n", b_r, "\nyB: \n", c_r, "\nyBx: \n", c_r1)


AB: 
 [[2 7 0]
 [2 4 1]] 
Bx: 
 [60 22] 
yB: 
 [ 2 10 -1] 
yBx: 
 104
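In the cell above, `x` and `y` are 1-D arrays, so NumPy returns 1-D results and a plain number for `y@B@x`. If you prefer the shapes to match the math exactly, one option is to store $\vv x$ as a $3 \times 1$ array and $\vv y$ as a $1 \times 2$ array. The sketch below is an added illustration; the names `x_col` and `y_row` are chosen here for the example.

```python
import numpy as np

B = np.array([[2, 4, 1],
              [0, 3, -1]])
x_col = np.array([13, 8, 2]).reshape(-1, 1)   # 3x1 column vector
y_row = np.array([1, 2]).reshape(1, -1)       # 1x2 row vector

Bx = B @ x_col           # (2x3)(3x1) -> 2x1
yB = y_row @ B           # (1x2)(2x3) -> 1x3
yBx = y_row @ B @ x_col  # (1x2)(2x3)(3x1) -> 1x1

print(Bx.shape, yB.shape, yBx.shape)
print(yBx[0, 0])
```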

8 Special Matrices

Definition 1

The identity matrix $I = I_m$ is the $m \times m$ matrix

I_m = \bm 1 & 0 & 0 & \cdots & 0 & 0 \\ 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & 0 \\ 0 & 0 & 0 & \cdots & 0 & 1 \em.

(14)

The entries along the main diagonal, which runs from top left to bottom right, are all equal to 1 ( $I_{ii} = 1$ ), and all other entries are equal to 0 ( $I_{ij} = 0$ for all $i \neq j$ ). The identity matrix satisfies $IA = AI = A$ . If $A$ is not square, the identity matrices multiplying $A$ on the left and on the right have different sizes: for an $m \times n$ matrix $A$ , we have $I_m A = A I_n = A$ .
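Here is a small numerical check of the identity property for a non-square matrix. It is an added sketch that uses NumPy's `np.eye` to build identity matrices; the example matrix and the names `left` and `right` are chosen for this illustration.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # a 2x3 matrix

I2 = np.eye(2)   # 2x2 identity, multiplies A on the left
I3 = np.eye(3)   # 3x3 identity, multiplies A on the right

left = I2 @ A
right = A @ I3

print(np.array_equal(left, A), np.array_equal(right, A))
```

Swapping the two, for example `I3 @ A`, fails because the sizes are no longer compatible.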

Python Break! Built-in NumPy Functions

NumPy has built-in functions for constructing special matrices such as the identity and all-zeros matrices. Other special matrices include the all-ones matrix and Vandermonde matrices (which you’ll see in your homework set).

# Identity matrices
I2 = np.eye(2)
I3 = np.eye(3)
print(f'I_2 = {I2}')
print(f'I_3 = {I3}')

# Zero matrix
O23 = np.zeros((2,3))
print(f'O_23 = {O23}')

# Ones matrix
ones_54 = np.ones((5,4))
print(f'ones_54 = {ones_54}')

I_2 = [[1. 0.]
 [0. 1.]]
I_3 = [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
O_23 = [[0. 0. 0.]
 [0. 0. 0.]]
ones_54 = [[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]

Note

The arguments to np.ones and np.zeros are tuples of the form (num_rows, num_cols). This leads to initially odd-looking expressions with double parentheses, and you might be wondering why you can’t just write np.ones(5,4) instead of np.ones((5,4)). The reason is that the first argument of these functions is a shape tuple, and so we need to group the sizes of the dimensions (in our case number of rows and number of columns) into a single “object”: we do that by putting them within parentheses. So while np.ones((5,4)) tells Python to make a 5 x 4 all-ones matrix, np.ones(5,4) tells Python to make an all-ones 1-D array with 5 entries, and sets the second function argument (dtype in this case) to 4. This latter expression does not make sense, and so Python will raise an error. When using built-in functions, it’s always helpful to take a look at the documentation page to see what kinds of arguments (tuples, lists, etc.) the function expects.
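To see the difference described in the note for yourself, the following sketch (an added illustration; the variable names are chosen for this example) calls `np.ones` both ways and catches the resulting error. It assumes, as stated above, that the second positional argument of `np.ones` is `dtype`.

```python
import numpy as np

# Correct: the shape is passed as a single tuple
ones_ok = np.ones((5, 4))
print(ones_ok.shape)

# Incorrect: 4 is interpreted as the dtype argument, which is not a valid data type
try:
    np.ones(5, 4)
    raised = False
except TypeError as err:
    raised = True
    print(f'TypeError: {err}')
```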

9 Matrix Arithmetic Rules

Summarized below are the rules for how you can combine matrix addition, scalar multiplication, and matrix-matrix multiplication. These are properties you should have seen in Math 1410, so we will not go over them in detail, but it is important to keep these in mind as we progress through the semester. The big thing to remember is that matrix-matrix multiplication comes with more caveats than multiplication of ordinary numbers.

Example 1 (Matrix Multiplication Cautions)

Here, we’ll give some examples of when matrices don’t commute:

\begin{align*} \bm 1 & 2 \\3 & 4 \em \bm 2 & 0 \\1 & 2 \em = \bm 4 &4 \\10& 8\em \neq \bm 2 &4 \\ 7& 10 \em= \bm 2 & 0 \\ 1 & 2 \em \bm 1 & 2 \\ 3 & 4\em \end{align*}

(15)

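It can also be the case that only one of the products exists. For example, if $\vv x$ is the column vector $(0, 1, 2)$ and $\vv y$ is the row vector $\bm 3 &4\em$ , then $\vv x \vv y$ is a well-defined $3 \times 2$ matrix, but $\vv y \vv x$ does not exist, since $\vv y$ has 2 columns while $\vv x$ has 3 rows.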

As an additional exercise, can you think of some specific examples of when matrices do commute? For example, $AB = BA$ if $A = I$ or $B = I$ .

Here, we’ll give an example of when the cancellation law doesn’t hold (i.e., $AB = AC$ does not in general imply that $B = C$ ):

\begin{align*} \bm 0 & 1 \\ 0 & 1 \em \bm 0 & 1 \\ 0 & 1 \em = \bm 0 & 1 \\ 0 & 1 \em \end{align*}

(16)

and

\begin{align*} \bm 0 & 1 \\ 0 & 1 \em\bm 1 & 0 \\ 0 & 1 \em = \bm 0 & 1 \\ 0 & 1 \em \end{align*}

(17)

but of course,

\begin{align*} \bm 1 & 0 \\ 0 & 1 \em \neq \bm 0 & 1 \\ 0 & 1 \em \end{align*}

(18)

Here, we’ll show that the product of 2 nonzero matrices can be zero:

\begin{align*} \bm 0& 1 \\ 0 & 0\em^2 = \bm 0& 1 \\ 0 & 0\em\bm 0& 1 \\ 0 & 0\em = \bm 0& 0 \\ 0 & 0\em \end{align*}

(19)
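The three cautions in this example can be reproduced in a few lines of NumPy. The sketch below is an added illustration that reuses the matrices from (15) through (19); the variable names are chosen here for the example.

```python
import numpy as np

# 1. Matrices need not commute: PQ != QP
P = np.array([[1, 2], [3, 4]])
Q = np.array([[2, 0], [1, 2]])
commute = np.array_equal(P @ Q, Q @ P)
print(f'PQ == QP? {commute}')

# 2. No cancellation law: AB == AC even though B != C
A = np.array([[0, 1], [0, 1]])
B = np.array([[0, 1], [0, 1]])
C = np.array([[1, 0], [0, 1]])
same_product = np.array_equal(A @ B, A @ C)
same_factor = np.array_equal(B, C)
print(f'AB == AC? {same_product}, B == C? {same_factor}')

# 3. A product of nonzero matrices can be the zero matrix
N = np.array([[0, 1], [0, 0]])
square_is_zero = np.array_equal(N @ N, np.zeros((2, 2)))
print(f'N @ N == 0? {square_is_zero}')
```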

10 Linear Systems in Matrix-Vector Notation

Let’s use our new matrix-vector tools to compactly write out systems of linear equations. Recall that a general linear system of $m$ equations in $n$ unknowns takes the form

\begin{array}{cccl} a_{11} x_1 + a_{12} x_2 +& \cdots &+ a_{1n} x_n = & b_1,\\ a_{21} x_1 + a_{22} x_2 +& \cdots &+ a_{2n} x_n = & b_2,\\ \vdots & \vdots && \vdots \\ a_{m1} x_1 + a_{m2} x_2 +& \cdots &+ a_{mn} x_n = & b_m,\\ \end{array}

(20)

which we rewrite compactly as

A \vv x = \vv b.

(21)

Equation (21) is composed of three basic ingredients: the $m \times n$ coefficient matrix $A$ , with entries $a_{ij}$ as in (2), the column vector $\vv x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n\end{bmatrix}$ containing the unknowns or variables, and the column vector $\vv b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}$ containing the right-hand sides. As you can see, it is a bit unwieldy to write column vectors inline, and so we will often equivalently write them as $\vv x = (x_1, x_2, \dots, x_n)$ and $\vv b = (b_1, b_2, \dots, b_m)$ instead.

Revisiting linear system (2), we see that the coefficient matrix $A$ , the unknown vector $\vv x$ , and the right hand side vector $\vv b$ can be read off as

A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 6 & 1 \\ 1 & 1 & 4 \end{bmatrix}, \quad \vv x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad \vv b = \begin{bmatrix} 2 \\ 7 \\ 3 \end{bmatrix}.

(22)

An important observation is that if a variable doesn’t appear in an equation, then the corresponding matrix entry is 0. For example, the following (admittedly silly) system of 2 equations in 2 unknowns

\begin{align*} x + y &= 2\\ y & = 1 \end{align*}

(23)

becomes

\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2 \\ 1\end{bmatrix}.

(24)
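Once a system is written as $A \vv x = \vv b$ , checking whether a given vector solves it is a single matrix-vector product. The sketch below is an added illustration: `x_candidate` is a solution of system (22) worked out by hand for this example (it is not taken from the notes), and `np.linalg.solve` is NumPy's built-in solver for square systems, used here on (24) without explaining how it works.

```python
import numpy as np

# System (22): coefficient matrix and right-hand side
A = np.array([[1, 2, 1],
              [2, 6, 1],
              [1, 1, 4]])
b = np.array([[2], [7], [3]])

# Candidate solution (an assumption of this example, found by hand)
x_candidate = np.array([[-3], [2], [1]])
is_solution = np.array_equal(A @ x_candidate, b)
print(f'A @ x_candidate == b? {is_solution}')

# System (24): the missing x in the second equation shows up as a 0 entry
A_small = np.array([[1, 1],
                    [0, 1]])
b_small = np.array([[2], [1]])
x_small = np.linalg.solve(A_small, b_small)
print(f'solution of (24): {x_small.ravel()}')
```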