Geometrical view - Juan Klopper

Lecture01_Geometric_view_of_linear_systems (2015/03/28)

This notebook is part of lecture 1 The geometry of linear equations in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper

Head of Acute Care Surgery
Groote Schuur Hospital
University Cape Town
Email me with your thoughts, comments, suggestions and corrections (mailto:[email protected])

Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/).

[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)
[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June 2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org

In this series of notebooks I will make use of a custom cascading style sheet
The file style.css must be in the same folder as the notebook file
The first block of code executes the stylesheet

In [1]: from IPython.core.display import HTML, Image
css_file = 'style.css'
HTML(open(css_file, 'r').read())

All the modules and functions will be imported here

In [2]: import numpy as np # Using namespace abbreviation to import numerical python
from sympy import init_printing, symbols, Matrix, Eq # Importing only the
# required functions in the sympy module
import matplotlib.pyplot as plt # Using namespace abbreviation to import
# the pyplot submodule of matplotlib
import seaborn as sns # Using namespace abbreviation to import
# the seaborn plotting library
from IPython.display import Image
from warnings import filterwarnings

init_printing(use_latex = 'mathjax') # Used to print Latex to the screen
%matplotlib inline
filterwarnings('ignore') # Ignore those ugly pink warning boxes

In [3]: # Comments will be in this form
# Comments are not executed

In [4]: x, y, z = symbols('x y z') # Creating symbolic mathematical variables as opposed to computer variables
# These symbols can no longer be used as computer variable names

Geometrical view

System of linear equations

A system of linear equations is a set of equations in which each variable appears to the power one and not inside a transcendental function
Example

$$ \begin{aligned} 2x - y &= 0 \\ -x + 2y &= 3 \end{aligned} $$

This can be represented as an augmented matrix

In [5]: A_augm = Matrix([[2, -1, 0], [-1, 2, 3]]) # Note the placement of ()'s and []'s
A_augm # A_augm is a computer variable that contains the matrix

Out[5]: $$ \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & 3 \end{bmatrix} $$

In [6]: # We can ask python what type of computer variable A_augm holds
type(A_augm) # We see that it is a mutable dense matrix

Out[6]: sympy.matrices.dense.MutableDenseMatrix

The matrix of coefficients:

In [7]: A = Matrix([[2, -1], [-1, 2]])
A

Out[7]: $$ \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} $$

The variable vector:

In [8]: x_vect = Matrix([x, y])
x_vect

Out[8]: $$ \begin{bmatrix} x \\ y \end{bmatrix} $$

The right-hand side vector b (referred to as the solution vector in these notes):

In [9]: b_vect = Matrix([0, 3])
b_vect

Out[9]: $$ \begin{bmatrix} 0 \\ 3 \end{bmatrix} $$

In [10]: Eq(A * x_vect, b_vect) # From Ax = b
# The Eq function takes the arguments left-hand side (LHS), right-hand side (RHS) of the equation

Out[10]: $$ \begin{bmatrix} 2x - y \\ -x + 2y \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \end{bmatrix} $$

The row picture



In [11]: # Don't be too concerned about the code for plotting
# It does not form part of this series of notebooks

x_vals = np.linspace(-3, 3, 100) # Create 100 values between -3 and 3
# Note that we cannot use the computer variable x, because it has been reserved above as a mathematical variable in
# the symbols function

plt.figure(figsize = (10,8)) # Create a graph of size 10 by 8
plt.plot(x_vals, 2 * x_vals) # Plot every single value created above against 2 times that value
# Taken from the first equation which was y = 2x or f(x) = 2x
# The plot takes the arguments (code between parentheses) of x,y
plt.plot(x_vals, ((x_vals / 2) + (3 / 2))) # Also plot the second equation
plt.show(); # Draw the plot on screen

The column picture

In the column picture we look at the column vectors associated with the variables:

$$ x \begin{bmatrix} 2 \\ -1 \end{bmatrix} + y \begin{bmatrix} -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \end{bmatrix} $$

It asks us to look at the linear combination of the columns

Performing this multiplication results in the same equation

In [12]: Eq(x * Matrix([2, -1]) + y * Matrix([-1, 2]), Matrix([0, 3]))

Out[12]: $$ \begin{bmatrix} 2x - y \\ -x + 2y \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \end{bmatrix} $$
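As a small check of the column picture, the two weights x and y can be solved for directly. This is only a sketch: the names col1, col2, equations and solution are mine and do not appear in the original notebook.

```python
from sympy import Matrix, symbols, solve

x, y = symbols('x y')

col1 = Matrix([2, -1])   # first column of A
col2 = Matrix([-1, 2])   # second column of A
b = Matrix([0, 3])       # right-hand side

# Every component of x*col1 + y*col2 - b has to be zero
equations = list(x * col1 + y * col2 - b)
solution = solve(equations, [x, y], dict=True)[0]
print(solution)
```

The weights found this way are the same x and y that sit at the intersection of the two lines in the row picture.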



In [13]: from mpl_toolkits.mplot3d import proj3d

fig = plt.figure(figsize = (10, 8))
ax = fig.add_subplot(111, projection='3d')
ax.plot([0, 2], [0, -1], zs=[0, 0])
# The three sets of square brackets contain as first element the starting
# point, i.e. 0, 0, 0 (as in x, y, z coordinates)
# The second element in each square bracket represents the end-point, i.e. 2, -1, 0
ax.plot([0, -1], [0, 2], zs=[0, 0])

plt.show();

In [14]: # Method adding arrow heads (very complicated)
from mpl_toolkits.mplot3d import Axes3D
from itertools import product, combinations
fig = plt.figure(figsize = (10, 8))
ax = fig.gca(projection='3d')
ax.set_aspect("equal")

# draw a vector
from matplotlib.patches import FancyArrowPatch
from mpl_toolkits.mplot3d import proj3d

class Arrow3D(FancyArrowPatch):
    def __init__(self, xs, ys, zs, *args, **kwargs):
        FancyArrowPatch.__init__(self, (0,0), (0,0), *args, **kwargs)
        self._verts3d = xs, ys, zs

    def draw(self, renderer):
        xs3d, ys3d, zs3d = self._verts3d
        xs, ys, zs = proj3d.proj_transform(xs3d, ys3d, zs3d, renderer.M)
        self.set_positions((xs[0],ys[0]),(xs[1],ys[1]))
        FancyArrowPatch.draw(self, renderer)

a = Arrow3D([0, 2],[0, -1],[0, 0], mutation_scale=20, lw=1, arrowstyle="-|>", color="k")
b = Arrow3D([0, -1],[0, 2],[0, 0], mutation_scale=20, lw=1, arrowstyle="-|>", color="k")
ax.add_artist(a)
ax.add_artist(b)
plt.show()

The column view suggests that we need one times the first vector added to two times the second vector to get to the point (0,3)

Note that we are working in the xy-plane
Forgetting for now the solution (0,3), if we took all the possible values (on the real line) for x and y, we would fill the whole plane
Linear combinations of the two (column) vectors...

$$ \begin{bmatrix} 2 \\ -1 \end{bmatrix} $$

...and...

$$ \begin{bmatrix} -1 \\ 2 \end{bmatrix} $$

...fill $\mathbb{R}^2$

It's easy to see that these two vectors are not multiples of each other (they don't lie on the same line)
If this is so (they are linearly independent) and linear combinations of them fill the plane, we say they span the plane ($\mathbb{R}^2$)

It's also easy to imagine that the xy-plane is filled with vectors, i.e. I can reach any coordinate by drawing a vector to it
All these vectors together can be called a set
Let's call this set W, which equals $\mathbb{R}^2$
Later we will see that this vector space is a subspace of $V = \mathbb{R}^3$
We will also see that the vectors above span W, i.e. W = span(set of two vectors above)
It will also be shown that this set of two vectors is a basis of W (they are linearly independent and they span W)
$\mathbb{R}^2$ is of dimension two (2) as the whole space can be represented by linear combinations of just two vectors

The standard basis vectors for $\mathbb{R}^2$ are actually...

$$ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix} $$

...which we commonly call

$$ \hat{i}, \hat{j} $$

The 3-space picture

$$ \begin{aligned} 3x + 2y - z &= 2 \\ x - 2y - z &= 3 \\ 2x + y - z &= 1 \end{aligned} $$

In [15]: A_augm = Matrix([[3, 2, -1, 2], [1, -2, -1, 3], [2, 1, -1, 1]])
A_augm

Out[15]: $$ \begin{bmatrix} 3 & 2 & -1 & 2 \\ 1 & -2 & -1 & 3 \\ 2 & 1 & -1 & 1 \end{bmatrix} $$

In [16]: A_augm.rref()

Out[16]: $$ \left( \begin{bmatrix} 1 & 0 & 0 & \frac{5}{2} \\ 0 & 1 & 0 & -\frac{3}{2} \\ 0 & 0 & 1 & \frac{5}{2} \end{bmatrix}, \; [0, 1, 2] \right) $$

In [17]: Image(filename = '3d.png')

Out[17]: [image: 3d.png]
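The output of rref() is a pair: the reduced matrix and the positions of the pivot columns. A minimal sketch of how the solution can be read off and checked follows. The names rref_matrix, pivot_columns, solution and check are my own, and depending on the sympy version the pivot positions come back as a list or a tuple.

```python
from sympy import Matrix

A_augm = Matrix([[3, 2, -1, 2], [1, -2, -1, 3], [2, 1, -1, 1]])

# rref() returns the reduced row-echelon form and the pivot column indices
rref_matrix, pivot_columns = A_augm.rref()

solution = rref_matrix[:, -1]   # the last column holds x, y and z
A = A_augm[:, :3]               # matrix of coefficients
b = A_augm[:, 3]                # right-hand side
check = A * solution - b        # should be the zero vector
print(solution, check)
```

Substituting the last column back into Ax = b is a cheap way to confirm that the three planes really meet in that single point.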

I_02_Overview (2015/03/28)

This notebook is part of the additional lecture An overview of key ideas in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper

In [2]: from sympy import init_printing, Matrix, symbols, sqrt, Rational
from numpy import matrix, transpose, sqrt # Note: this sqrt replaces the sympy sqrt imported above
from numpy.linalg import pinv, inv, det, svd, norm
from scipy.linalg import pinv2
from warnings import filterwarnings

In [3]: init_printing(use_latex = 'mathjax')
filterwarnings('ignore')

An overview of key ideas

Moving from vectors to matrices

Consider a position vector in three-dimensional space
It can be written as a column-vector

$$ u = \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}, \quad v = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} $$

We can add constant scalar multiples of these vectors

$$ x_1 u + x_2 v = b $$

This is simple vector addition
It's easy to visualize that if we take all possible combinations, we start filling a plane through the origin
Adding a third vector that is not in this plane will extend all possible linear combinations to fill all of three-dimensional space

$$ w = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} $$

We now have the following

$$ x_1 u + x_2 v + x_3 w = b $$

Notice how this last equation can be written in matrix form Ax=b

$$ \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 - x_1 \\ x_3 - x_2 \end{bmatrix} $$

This is the column-view of matrix-vector multiplication as opposed to the row view
Matrices are seen as columns, each representing a vector
Each element of the column vector x is the scalar that multiplies the corresponding column in the matrix A

$$ x_1 \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} + x_3 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} x_1 \\ -x_1 + x_2 \\ -x_2 + x_3 \end{bmatrix} = x_1 u + x_2 v + x_3 w $$
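The claim that the row view and the column view give the same vector can be checked symbolically. This is a sketch only; row_view and column_view are names I introduce for illustration.

```python
from sympy import Matrix, symbols

x1, x2, x3 = symbols('x1 x2 x3')

u = Matrix([1, -1, 0])
v = Matrix([0, 1, -1])
w = Matrix([0, 0, 1])

A = Matrix.hstack(u, v, w)      # u, v and w become the columns of A
x_vect = Matrix([x1, x2, x3])

row_view = A * x_vect                      # each row of A dotted with x
column_view = x1 * u + x2 * v + x3 * w     # combination of the columns of A
print(row_view, column_view)
```

Both expressions produce the same three entries, which is all that the matrix form Ax = b is saying.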







Now consider the right-hand side vector b

$$ Ax = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 - x_1 \\ x_3 - x_2 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} $$

By substitution we now have the following

$$ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_1 + b_2 \\ b_1 + b_2 + b_3 \end{bmatrix} $$

This, though, looks like a matrix times b

$$ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} $$

This matrix is the inverse of A such that $x = A^{-1} b$

The above matrix A is called a difference matrix as it took simple differences between the elements of vector x
It was lower triangular
Its inverse became a sum matrix
So it was a good matrix, able to transform between x and b (back and forth) and therefore invertible: for every b there is exactly one x
It transforms (maps) x into b

Let's look at the code for a new matrix C, in which the column w above is replaced

In [4]: x1, x2, x3, b1, b2, b3 = symbols('x1, x2, x3, b1, b2, b3') # Creating algebraic symbols
# This reserves these symbols so as not to see them as computer variable names

In [5]: C = Matrix([[1, 0, -1], [-1, 1, 0], [0, -1, 1]]) # Creating a matrix and putting
# it into a computer variable called C
C # Displaying it to the screen

Out[5]: $$ \begin{bmatrix} 1 & 0 & -1 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix} $$

In [6]: x_vect = Matrix([[x1], [x2], [x3]]) # Giving this column vector a computer
# variable name
x_vect

Out[6]: $$ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} $$

In [7]: C * x_vect

Out[7]: $$ \begin{bmatrix} x_1 - x_3 \\ -x_1 + x_2 \\ -x_2 + x_3 \end{bmatrix} $$

We now have three equations

$$ \begin{aligned} x_1 - x_3 &= b_1 \\ x_2 - x_1 &= b_2 \\ x_3 - x_2 &= b_3 \end{aligned} $$

Adding the left and right sides we get the following

$$ 0 = b_1 + b_2 + b_3 $$

We are now constrained for values of b: only vectors b whose components add up to zero can be reached

The problem is clear to see geometrically as the new w is in the same plane as u and v
In essence w did not add anything
All combinations of u, v, and w will still be in the plane
The first matrix A above had three independent columns and their linear combinations could fill all of three-dimensional space
That made the first matrix A invertible as opposed to the second one (C), which is not invertible (i.e. it cannot take any vector in three-dimensional space back to x)
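The difference between the two matrices can also be seen by asking sympy directly. A minimal sketch follows; the names A_inverse, det_C and null_C are mine, chosen for illustration.

```python
from sympy import Matrix

A = Matrix([[1, 0, 0], [-1, 1, 0], [0, -1, 1]])   # difference matrix
C = Matrix([[1, 0, -1], [-1, 1, 0], [0, -1, 1]])  # w replaced by a vector in the plane of u and v

A_inverse = A.inv()       # the sum matrix
det_C = C.det()           # zero for a matrix that is not invertible
null_C = C.nullspace()    # non-zero vectors x with Cx = 0
print(A_inverse, det_C, null_C)
```

A non-zero vector in the nullspace of C is another way of saying that its columns are dependent, so no inverse can undo the multiplication.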



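Let's look at the original column vectors in C
Remember the following dot product

$$ a \cdot b = \|a\| \|b\| \cos\theta $$

In linear algebra getting the dot product of two vectors is written as follows

$$ a \cdot b = b^T a $$

Which is the transpose of the second times the first

In [8]: u = Matrix([[1], [-1], [0]])
v = Matrix([[0], [1], [-1]])
w = Matrix([[-1], [0], [1]])
u, v, w

Out[8]: $$ \left( \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}, \; \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}, \; \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \right) $$

In [9]: v.transpose() * u

Out[9]: $$ \begin{bmatrix} -1 \end{bmatrix} $$

In [10]: w.transpose() * u

Out[10]: $$ \begin{bmatrix} -1 \end{bmatrix} $$

In [11]: w.transpose() * v

Out[11]: $$ \begin{bmatrix} -1 \end{bmatrix} $$

In [12]: u.transpose() * v

Out[12]: $$ \begin{bmatrix} -1 \end{bmatrix} $$

In [13]: u.transpose() * w

Out[13]: $$ \begin{bmatrix} -1 \end{bmatrix} $$

In [14]: v.transpose() * w

Out[14]: $$ \begin{bmatrix} -1 \end{bmatrix} $$

Every dot product is -1 and every vector has length $\sqrt{2}$, so $\cos\theta = \frac{-1}{\sqrt{2}\sqrt{2}} = -\frac{1}{2}$
The angle between any two of them is therefore $2\pi/3$ radians (120°), and three vectors that are pairwise 120° apart must all lie in a plane (indeed u + v + w = 0)

Example problems

Example problem 1

Suppose A is a matrix with the following solution

$$ Ax = \begin{bmatrix} 1 \\ 4 \\ 1 \\ 1 \end{bmatrix}, \quad x = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} + c \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix} $$

What can you say about the columns of A?

Solution

In [15]: c = symbols('c')
x_vect = Matrix([[0], [1 + 2 * c], [1 + c]])
b = Matrix([[1], [4], [1], [1]])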


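x is of size m × n = 3 × 1
b is of size 4 × 1
Therefore A must be of size 4 × 3 and each column vector in A is in $\mathbb{R}^4$

Let's call these columns of A $C_1$, $C_2$, and $C_3$

$$ A = \begin{bmatrix} \vdots & \vdots & \vdots \\ C_1 & C_2 & C_3 \\ \vdots & \vdots & \vdots \end{bmatrix} $$

With the particular way in which x was written we can say that we have a particular solution and a special solution

$$ A \left( x_p + c \cdot x_s \right) = b $$

For c = 0 we have:

$$ A x_p = b $$

For c = 1 we have:

$$ \begin{aligned} A x_p + A x_s &= b \\ \because A x_p &= b \\ b + A x_s &= b \\ \therefore A x_s &= 0 \end{aligned} $$

We also have the following

$$ x_p = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad x_s = \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix} $$

For $x_p$ we have the following

$$ \begin{bmatrix} \vdots & \vdots & \vdots \\ C_1 & C_2 & C_3 \\ \vdots & \vdots & \vdots \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = b \Rightarrow C_2 + C_3 = b $$

For $x_s$ we have the following

$$ \begin{bmatrix} \vdots & \vdots & \vdots \\ C_1 & C_2 & C_3 \\ \vdots & \vdots & \vdots \end{bmatrix} \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix} = \underline{0} \Rightarrow 2 C_2 + C_3 = \underline{0} $$

Solving for $C_2$ and $C_3$ we have the following

$$ \begin{aligned} C_3 &= -2 C_2 \\ C_2 - 2 C_2 &= b \\ C_2 &= -b \\ C_3 &= 2b \end{aligned} $$

$$ A = \begin{bmatrix} \vdots & -1 & 2 \\ C_1 & -4 & 8 \\ \vdots & -1 & 2 \\ \vdots & -1 & 2 \end{bmatrix} $$

As for the first column of A, we need to know more about ranks and subspaces
We see, though, that columns 2 and 3 are already constant multiples of each other
So, as long as column 1 is not a constant multiple of b, we are safe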
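The result can be checked by building A with an unknown first column and confirming that Ax = b for every value of c. This is a sketch under my own naming: the symbols c1 to c4 are hypothetical placeholders for the entries of column 1, which the problem leaves open.

```python
from sympy import Matrix, symbols, zeros

c = symbols('c')
c1, c2, c3, c4 = symbols('c1 c2 c3 c4')   # hypothetical entries of the unknown column 1

b = Matrix([1, 4, 1, 1])
C1 = Matrix([c1, c2, c3, c4])

A = Matrix.hstack(C1, -b, 2 * b)          # columns C1, C2 = -b, C3 = 2b
x_vect = Matrix([0, 1 + 2 * c, 1 + c])    # x = x_p + c * x_s

residual = (A * x_vect - b).expand()      # zero vector if Ax = b for all c
print(residual)
```

The first entry of x is zero, so column 1 never takes part in the product, which is why this check cannot say anything about it.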

I_03_Elimination (2015/03/28)

This notebook is part of lecture 2 Elimination with matrices in the OCW MIT course 18.06 [1]
Created by me, Dr Juan H Klopper

In [2]: from sympy import init_printing, Matrix, symbols, eye, Rational
from warnings import filterwarnings


Elimination

A system of linear equations



Linear refers to the fact that each variable appears on its own (i.e. to the power 1) and is not inside a transcendental function
A solution satisfies all of the equations at once
Consider the following linear set

$$ \begin{aligned} 1x + 2y + 1z &= 2 \\ 3x + 8y + 1z &= 12 \\ 0x + 4y + 1z &= 2 \end{aligned} $$

A solution for x, y, and z could be as follows

$$ \begin{aligned} 1(2) + 2(1) + 1(-2) &= 2 \\ 3(2) + 8(1) + 1(-2) &= 12 \\ 0(2) + 4(1) + 1(-2) &= 2 \end{aligned} $$

Since this is a set (of three) equations that share a solution for the variables, all left- and right-hand sides can be manipulated in certain ways
We could simply exchange the order of the equations (here equations 2 and 3 have been exchanged; row exchange)

$$ \begin{aligned} 1x + 2y + 1z &= 2 \\ 0x + 4y + 1z &= 2 \\ 3x + 8y + 1z &= 12 \end{aligned} $$

We could multiply both the left- and right-hand side of one of the equations by a scalar (here I multiply the first equation by 2)

$$ \begin{aligned} 2x + 4y + 2z &= 4 \\ 3x + 8y + 1z &= 12 \\ 0x + 4y + 1z &= 2 \end{aligned} $$

Lastly, we can subtract a constant multiple of one equation from another
This serves an excellent purpose, as I can eliminate one (or more) of the variables (give it a coefficient of 0)
Remember that we have three equations and three unknowns to solve for
We could struggle through this problem algebraically by substitution, but linear algebra makes it much easier
Here I have multiplied the first equation by 3 (both sides, so that we maintain the integrity of the equation) and subtracted the left-hand side of this new equation from the left-hand side of equation 2 and its right-hand side from the right-hand side of equation 2
This is quite legitimate, as the left- and right-hand sides are equal (it is an equation after all) and so, when subtracting from equation 2, we are still doing the same thing to the left-hand side as to the right-hand side

$$ \begin{aligned} 1x + 2y + 1z &= 2 \\ 0x + 2y - 2z &= 6 \\ 0x + 4y + 1z &= 2 \end{aligned} $$

This has introduced a nice zero for me in the second equation
Let's go further and multiply equation 2 by 2 and subtract that from equation 3

$$ \begin{aligned} 1x + 2y + 1z &= 2 \\ 0x + 2y - 2z &= 6 \\ 0x + 0y + 5z &= -10 \end{aligned} $$

Now the last equation is easy to solve for z

$$ z = -2 $$

Knowing this I can go back up to equation 2 and solve for y

$$ \begin{aligned} 2y - 2(-2) &= 6 \\ y &= 1 \end{aligned} $$

Finally up to equation 1

$$ \begin{aligned} x + 2(1) + 1(-2) &= 2 \\ x &= 2 \end{aligned} $$

We need not have gone straight for substitution; instead, we could have tried to get zeros above all our leading (non-zero) coefficients
Let's just clean up equation 3 by multiplying it by ⅕

$$ \begin{aligned} 1x + 2y + 1z &= 2 \\ 0x + 2y - 2z &= 6 \\ 0x + 0y + 1z &= -2 \end{aligned} $$

Now we have to get rid of the -2z in equation 2, which we can do by multiplying equation 3 by -2 and subtracting it from equation 2

$$ \begin{aligned} 1x + 2y + 1z &= 2 \\ 0x + 2y - 0z &= 2 \\ 0x + 0y + 1z &= -2 \end{aligned} $$

Multiplying equation 2 by ½ gives us the following

$$ \begin{aligned} 1x + 2y + 1z &= 2 \\ 0x + 1y + 0z &= 1 \\ 0x + 0y + 1z &= -2 \end{aligned} $$

Now we can do the same to get rid of the 1z in equation 1 (multiply equation 3 by 1 and subtract it from equation 1)

$$ \begin{aligned} 1x + 2y + 0z &= 4 \\ 0x + 1y + 0z &= 1 \\ 0x + 0y + 1z &= -2 \end{aligned} $$

Now to get rid of the 2y in equation 1, which is above our leading 1y in equation 2
Simple enough, we multiply equation 2 by 2 and subtract that from equation 1

$$ \begin{aligned} 1x + 0y + 0z &= 2 \\ 0x + 1y + 0z &= 1 \\ 0x + 0y + 1z &= -2 \end{aligned} $$

The solution is now clear for x, y, and z
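The elimination and back substitution steps above can be tried out on the rows of coefficients. This is only a sketch of the steps as described: I store each equation as a row of a sympy Matrix (coefficients of x, y, z followed by the right-hand side), and the name system is my own.

```python
from sympy import Matrix

# Each row holds the coefficients of x, y, z and the right-hand side
system = Matrix([[1, 2, 1, 2], [3, 8, 1, 12], [0, 4, 1, 2]])

system[1, :] = system[1, :] - 3 * system[0, :]   # equation 2 minus 3 times equation 1
system[2, :] = system[2, :] - 2 * system[1, :]   # equation 3 minus 2 times equation 2

# Back substitution, from the last equation upwards
z = system[2, 3] / system[2, 2]
y = (system[1, 3] - system[1, 2] * z) / system[1, 1]
x = (system[0, 3] - system[0, 1] * y - system[0, 2] * z) / system[0, 0]
print(system, x, y, z)
```

Writing only the coefficients like this is exactly the shorthand that the augmented matrix formalizes.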



We need not rewrite all of the variables all the time
We can simply write the coefficients

$$ \begin{bmatrix} 1 & 2 & 1 & 2 \\ 3 & 8 & 1 & 12 \\ 0 & 4 & 1 & 2 \end{bmatrix} $$

This is called the augmented matrix (the right-hand side is added)
A matrix has rows and columns (attached in position to our algebraic equations above; we simply omit the variables)

The upper-left entry is called the pivot
Our aim is to get everything below it to be a zero (as we did with the algebra)
We do exactly the same as we did above, which is multiply row 1 by 3 and subtract these new values from row 2

$$ \begin{bmatrix} 1 & 2 & 1 & 2 \\ 0 & 2 & -2 & 6 \\ 0 & 4 & 1 & 2 \end{bmatrix} $$

Now 2 times row 2 subtracted from row 3

$$ \begin{bmatrix} 1 & 2 & 1 & 2 \\ 0 & 2 & -2 & 6 \\ 0 & 0 & 5 & -10 \end{bmatrix} $$

Multiply the last row by ⅕

$$ \begin{bmatrix} 1 & 2 & 1 & 2 \\ 0 & 2 & -2 & 6 \\ 0 & 0 & 1 & -2 \end{bmatrix} $$

This shows 1z to equal -2
With this small matrix, it's easy to do back substitution as we did algebraically above
The first non-zero number in each row is the pivot (just like the upper-left entry)
The steps we have taken up to this point are called Gauss elimination and the form we end up with is row-echelon form
We could carry on and do the same sort of thing to get rid of all the non-zero entries above each pivot
This is called Gauss-Jordan elimination and the result is reduced row-echelon form (see the computer code below)
All of these steps are called elementary row operations
The only one we didn't do is row exchange

We reserve this for when a zero lands in a pivot position

In [4]: A_augmented = Matrix([[1, 2, 1, 2], [3, 8, 1, 12], [0, 4, 1, 2]])
A_augmented

Out[4]: $$ \begin{bmatrix} 1 & 2 & 1 & 2 \\ 3 & 8 & 1 & 12 \\ 0 & 4 & 1 & 2 \end{bmatrix} $$

We can ask python™ to simply get the augmented matrix in reduced row-echelon form and read off the solutions

In [5]: A_augmented.rref() # The rref() method returns the reduced row-echelon form

Out[5]: $$ \left( \begin{bmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & -2 \end{bmatrix}, \; [0, 1, 2] \right) $$

So row one reads as follows

$$ \begin{aligned} 1x + 0y + 0z &= 2 \\ x &= 2 \end{aligned} $$

Elimination matrices


Two matrices can only be multiplied if, in order, the number of columns of the first equals the number of rows of the second
The number of rows is usually called m and the number of columns n
So, our augmented matrix above is m × n = 3 × 4
Let's look at how matrices are multiplied by looking at two small matrices

$$ \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} $$

The subscripts refer to row and column position, i.e. 21 means row 2 column 1
We see that we have a 2 × 2 matrix times a 2 × 2 matrix

The inner two values are the same (2 and 2), so this multiplication is allowed
The resultant matrix will have the size equal to the outer two values (rows of the first and columns of the last); here also 2 and 2

So let's look at position 11 (row 1 and column 1)
To get this we take the entries in row 1 of the first matrix and multiply them by the entries in the first column of the second matrix
We do this element by element and add the products to each other
The python code below shows you exactly how this is done

In [6]: a11, a12, a21, a22, b11, b12, b21, b22 = symbols('a11 a12 a21 a22 b11 b12 b21 b22')

In [7]: A = Matrix([[a11, a12], [a21, a22]])
B = Matrix([[b11, b12], [b21, b22]])
A, B

Out[7]: $$ \left( \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \; \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} \right) $$

In [8]: A * B

Out[8]: $$ \begin{bmatrix} a_{11} b_{11} + a_{12} b_{21} & a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} & a_{21} b_{12} + a_{22} b_{22} \end{bmatrix} $$

Let's constrain ourselves to the matrix of coefficients (this discards the right-hand side from the augmented matrix above)

In [9]: A = Matrix([[1, 2, 1], [3, 8, 1], [0, 4, 1]]) # I use the same computer variable as above, which
# will change its value in the computer memory
A # A 3 by 3 matrix, which we call square

Out[9]: $$ \begin{bmatrix} 1 & 2 & 1 \\ 3 & 8 & 1 \\ 0 & 4 & 1 \end{bmatrix} $$

The identity matrix is akin to the number 1, i.e. multiplying by it leaves everything unchanged
It has 1's along what is called the main diagonal and 0's everywhere else

In [10]: I = eye(3) # Identity matrices are always square and the argument
# here is 3, so it is a 3 by 3 matrix
I # Note what the main diagonal is

Out[10]: $$ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$

Let's multiply this by A

In [11]: I * A # Nothing will change

Out[11]: $$ \begin{bmatrix} 1 & 2 & 1 \\ 3 & 8 & 1 \\ 0 & 4 & 1 \end{bmatrix} $$

To get rid of the leading 3 in row 2 (because we want a zero under the pivot 1 in row 1), we multiplied row 1 by 3 and subtracted that from row 2
Interestingly enough we can do something to this identity matrix so that, when it is multiplied by A, it results in the first step we have above
Since we are required to subtract 3 times the first row from the second (it's all about that 3 in row 2, column 1), we can do the following

In [12]: E21 = Matrix([[1, 0, 0], [-3, 1, 0], [0, 0, 1]])
E21 # 21 because we are working on row 2, column 1

Out[12]: $$ \begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$


That gives us the required 3 times row 1 and the negative shows that we subtract (add the negative)
It's a thing of beauty

In [13]: E21 * A

Out[13]: $$ \begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & -2 \\ 0 & 4 & 1 \end{bmatrix} $$

Just what we wanted
E21 is called the first elimination matrix

Let's do something to the identity matrix to get rid of the 4 in row 3, column 2
It would require 2 times row 2 subtracted from row 3
Look carefully at the positions

In [14]: E32 = Matrix([[1, 0, 0], [0, 1, 0], [0, -2, 1]])
E32

Out[14]: $$ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix} $$

In [15]: E32 * (E21 * A)

Out[15]: $$ \begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & -2 \\ 0 & 0 & 5 \end{bmatrix} $$

Spot on!
We now have nice pivots (leading non-zeros), with nothing under them
As a tip, try not to get fractions involved
As far as the other two row operations are concerned, we can either exchange rows in the identity matrix or multiply the required row by a scalar constant

Look at what happens when we multiply E32 and E21

In [16]: L_inv = E32 * E21
L_inv

Out[16]: $$ \begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 6 & -2 & 1 \end{bmatrix} $$

Later we'll call this matrix the inverse of L
It is in triangular form, in this case lower triangular (note all the zeros above the main diagonal)

In [17]: L_inv * A # Later we'll call this result the matrix U

Out[17]: $$ \begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & -2 \\ 0 & 0 & 5 \end{bmatrix} $$

If we can get the inverse of the inverse of L we'll have the following

$$ \begin{aligned} L^{-1} A &= U \\ L L^{-1} A &= LU \\ IA &= LU \\ A &= LU \end{aligned} $$

A square invertible matrix multiplied by its inverse gives the identity matrix

We can construct L from E32 and E21 above

$$ \begin{aligned} E_{32} E_{21} A &= U \\ E_{21}^{-1} E_{32}^{-1} E_{32} E_{21} A &= E_{21}^{-1} E_{32}^{-1} U \\ \therefore E_{21}^{-1} E_{32}^{-1} &= L \end{aligned} $$


In [18]: E21.inv() # The inverse is easy to understand in words
# We just want to add 3 instead of subtracting 3

Out[18]: $$ \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$

In [19]: E32.inv()

Out[19]: $$ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix} $$

In [20]: E21.inv() * E32.inv()

Out[20]: $$ \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix} $$

This is exactly the inverse of our inverse of L above

In [21]: L_inv.inv()

Out[21]: $$ \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix} $$

This is called the LU-decomposition of A
More about this two chapters from now (I_05_LU_decomposition)

As an aside, we can also do elementary column operations, but then we have to multiply on the right of A and not on the left as above

Example problems

Example problem 1

Solve the following linear set (set of linear equations)

$$ \begin{aligned} x - y - z + u &= 0 \\ 2x + 2z &= 8 \\ -y - 2z &= -8 \\ 3x - 3y - 2z + 4u &= 7 \end{aligned} $$

Solution

In [22]: A_augm = Matrix([[1, -1, -1, 1, 0], [2, 0, 2, 0, 8], [0, -1, -2, 0, -8], [3, -3, -2, 4, 7]])
A_augm

Out[22]: $$ \begin{bmatrix} 1 & -1 & -1 & 1 & 0 \\ 2 & 0 & 2 & 0 & 8 \\ 0 & -1 & -2 & 0 & -8 \\ 3 & -3 & -2 & 4 & 7 \end{bmatrix} $$

In [23]: A_augm.rref()

Out[23]: $$ \left( \begin{bmatrix} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 2 \\ 0 & 0 & 1 & 0 & 3 \\ 0 & 0 & 0 & 1 & 4 \end{bmatrix}, \; [0, 1, 2, 3] \right) $$

Whoa! That was easy!
Let's take it down a notch and do some elementary matrices
First off, we want the matrix of coefficients


In [24]: A = Matrix([[1, -1, -1, 1], [2, 0, 2, 0], [0, -1, -2, 0], [3, -3, -2, 4]])
A

Out[24]: $$ \begin{bmatrix} 1 & -1 & -1 & 1 \\ 2 & 0 & 2 & 0 \\ 0 & -1 & -2 & 0 \\ 3 & -3 & -2 & 4 \end{bmatrix} $$

Now we need to get rid of the 2 in position row 2, column 1
We start by numbering the elementary matrix by this position and modifying the identity matrix

In [25]: E21 = Matrix([[1, 0, 0, 0], [-2, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
E21 * A

Out[25]: $$ \begin{bmatrix} 1 & -1 & -1 & 1 \\ 0 & 2 & 4 & -2 \\ 0 & -1 & -2 & 0 \\ 3 & -3 & -2 & 4 \end{bmatrix} $$

Now for position row 3, column 2
We have to use row 2 to do this
If we used row 1, we would introduce a non-zero into position row 3, column 1

In [26]: E32 = Matrix([[1, 0, 0, 0], [0, 1, 0, 0], [0, Rational(1, 2), 1, 0], [0, 0, 0, 1]])
E32 * (E21 * A)

Out[26]: $$ \begin{bmatrix} 1 & -1 & -1 & 1 \\ 0 & 2 & 4 & -2 \\ 0 & 0 & 0 & -1 \\ 3 & -3 & -2 & 4 \end{bmatrix} $$

Now for the 3 in position row 4, column 1

In [27]: E41 = Matrix([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [-3, 0, 0, 1]])
E41 * (E32 * E21 * A)

Out[27]: $$ \begin{bmatrix} 1 & -1 & -1 & 1 \\ 0 & 2 & 4 & -2 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 1 \end{bmatrix} $$

Let's exchange rows 3 and 4

In [28]: Ee34 = Matrix([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])
Ee34 * E41 * E32 * E21 * A

Out[28]: $$ \begin{bmatrix} 1 & -1 & -1 & 1 \\ 0 & 2 & 4 & -2 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & -1 \end{bmatrix} $$

Let's see where that leaves b; after all, what we do to the left, we must do to the right

$$ E_{e34} \times E_{41} \times E_{32} \times E_{21} \times Ax = E_{e34} \times E_{41} \times E_{32} \times E_{21} \times b $$

In [29]: b_vect = Matrix([[0], [8], [-8], [7]])
b_vect

Out[29]: $$ \begin{bmatrix} 0 \\ 8 \\ -8 \\ 7 \end{bmatrix} $$


In [30]: Ee34 * E41 * E32 * E21 * b_vect

Out[30]: $$ \begin{bmatrix} 0 \\ 8 \\ 7 \\ -4 \end{bmatrix} $$

Let's print them next to each other on the screen

In [31]: Ee34 * E41 * E32 * E21 * A, Ee34 * E41 * E32 * E21 * b_vect

Out[31]: $$ \left( \begin{bmatrix} 1 & -1 & -1 & 1 \\ 0 & 2 & 4 & -2 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & -1 \end{bmatrix}, \; \begin{bmatrix} 0 \\ 8 \\ 7 \\ -4 \end{bmatrix} \right) $$

So we can simply do back substitution
We note that -1u = -4 and thus u = 4
From here, we work our way back up

$$ \begin{aligned} -1(u) = -4 \quad &\therefore u = 4 \\ 1(z) + 1(4) = 7 \quad &\therefore z = 3 \\ 2(y) + 4(3) - 2(4) = 8 \quad &\therefore y = 2 \\ 1(x) - 1(2) - 1(3) + 1(4) = 0 \quad &\therefore x = 1 \end{aligned} $$
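The back substitution done by hand above follows the same pattern for any upper triangular system, so it can be written as a short loop. This is a sketch only: back_substitute is a helper name I made up, and it assumes every pivot on the diagonal is non-zero.

```python
from sympy import Matrix, zeros

def back_substitute(U, c):
    """Solve Ux = c for upper triangular U, working from the last row upwards."""
    n = U.rows
    x = zeros(n, 1)
    for i in range(n - 1, -1, -1):
        # Contribution of the unknowns that are already solved
        known = sum(U[i, j] * x[j] for j in range(i + 1, n))
        x[i] = (c[i] - known) / U[i, i]
    return x

# The upper triangular system reached above
U = Matrix([[1, -1, -1, 1], [0, 2, 4, -2], [0, 0, 1, 1], [0, 0, 0, -1]])
c = Matrix([0, 8, 7, -4])

solution = back_substitute(U, c)
print(solution)
```

The loop starts with u in the last row, exactly as in the hand calculation, and each earlier row only needs the values found after it.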

I_04_Matrix_multiplication_Inverses (2015/03/28)

This notebook is part of lecture 3 Multiplication and inverse matrices in the OCW MIT course 18.06 [1]
Created by me, Dr Juan H Klopper







Matrix multiplication, inverse and transpose

Multiplying matrices

Method 1

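Consider multiplying matrices A and B to result in C
We have already seen that the column size of the first must equal the row size of the second, $n_A$ must equal $m_B$

$$ m_A \times n_A \cdot m_B \times n_B = m_A \times n_B $$

C will then be of size $m_A \times n_B$
Every position $c_{ij}$, with i as the row position and j as the column position, is calculated by taking the dot product (i.e. each element times its corresponding element, all added), $c_{ij}$ = (row i in A ⋅ column j of B)
Here we calculate the row 2, column 1 position in C by the dot product of row 2 in A with column 1 in B

$$ \begin{bmatrix} \cdots & \cdots & \cdots \\ 3 & 2 & -1 \\ \cdots & \cdots & \cdots \\ \cdots & \cdots & \cdots \end{bmatrix}_{4 \times 3} \begin{bmatrix} 1 & \vdots \\ 2 & \vdots \\ 1 & \vdots \end{bmatrix}_{3 \times 2} = \begin{bmatrix} c_{11} & c_{12} \\ (3 \times 1) + (2 \times 2) + (-1 \times 1) & c_{22} \\ c_{31} & c_{32} \\ c_{41} & c_{42} \end{bmatrix}_{4 \times 2} $$

$$ \begin{bmatrix} \cdots & \cdots & \cdots \\ a_{21} & a_{22} & a_{23} \\ \cdots & \cdots & \cdots \\ \cdots & \cdots & \cdots \end{bmatrix}_{4 \times 3} \begin{bmatrix} b_{11} & \vdots \\ b_{21} & \vdots \\ b_{31} & \vdots \end{bmatrix}_{3 \times 2} = \begin{bmatrix} c_{11} & c_{12} \\ (a_{21} b_{11}) + (a_{22} b_{21}) + (a_{23} b_{31}) & c_{22} \\ c_{31} & c_{32} \\ c_{41} & c_{42} \end{bmatrix}_{4 \times 2} $$

$$ c_{21} = \sum_{k=1}^{n} a_{2k} b_{k1} $$

Notice how this multiplication is only possible because each row of A has as many entries as each column of B

Method 2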



In this method we note that each column in C is the result of the matrix A times the corresponding column in B
This is akin to a matrix multiplied by a vector, Ax=b
We see B as made up of column vectors
The columns of C are thus combinations of the columns of A

The numbers in the corresponding column of B give this combination

Method 3

Here every row in A produces the same numbered row in C by multiplying it with the matrix B
The rows of C are linear combinations of the rows of B

Method 4

In method 1 we looked at row × col producing a single number in C
What if we did column × row?
The size of a column of A is $m_A \times 1$ and a row of B is of size $1 \times n_B$
This results in C of size $m_A \times n_B$
Let's look at a simple example using python (with sympy)

In [4]: A = Matrix([[2], [3], [4]])
B = Matrix([[1, 6]])
A, B

Out[4]: $$ \left( \begin{bmatrix} 2 \\ 3 \\ 4 \end{bmatrix}, \; \begin{bmatrix} 1 & 6 \end{bmatrix} \right) $$

In [5]: C = A * B
C

Out[5]: $$ \begin{bmatrix} 2 & 12 \\ 3 & 18 \\ 4 & 24 \end{bmatrix} $$

Notice how the columns of C are multiples of the column of A
The rows of C are multiples of the row of B
So in method 4, C is the sum of the columns of A × the rows of B

$$ \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11} \\ a_{21} \\ a_{31} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \end{bmatrix} + \begin{bmatrix} a_{12} \\ a_{22} \\ a_{32} \end{bmatrix} \begin{bmatrix} b_{21} & b_{22} \end{bmatrix} $$

Block multiplication

Combining the above we can do the following
Both A and B can be broken into blocks of sizes that allow for multiplication
Here is an example of two square matrices

$$ \begin{bmatrix} A_1 & A_2 \\ A_3 & A_4 \end{bmatrix} \begin{bmatrix} B_1 & B_2 \\ B_3 & B_4 \end{bmatrix} = \begin{bmatrix} A_1 B_1 + A_2 B_3 & A_1 B_2 + A_2 B_4 \\ A_3 B_1 + A_4 B_3 & A_3 B_2 + A_4 B_4 \end{bmatrix} $$

Inverses

If the inverse of a matrix A exists then $A^{-1}A = I$, the identity matrix
Above is a left inverse, but what about a right inverse, $AA^{-1}$?

This is also equal to the identity for invertible square matrices
Invertible matrices are also called non-singular matrices
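A quick way to see the left and right inverse agree is to compute both products. The matrix M below is an arbitrary invertible 2 × 2 matrix that I picked purely for illustration; it does not come from the lecture.

```python
from sympy import Matrix, eye

M = Matrix([[2, 1], [5, 3]])   # arbitrary invertible matrix, chosen only for illustration
M_inv = M.inv()

left = M_inv * M     # inverse applied on the left
right = M * M_inv    # inverse applied on the right
print(M_inv, left, right)
```

Both products come out as the identity, which is what we expect for any square matrix that has an inverse.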



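Non-invertible matrices are also called singular matrices
An example would look like this

$$ \begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix} $$

Note how the elements in row 2 are just two times the elements in row 1 (a linear combination)
The same goes for the columns, the second being a multiple of the first (each element multiplied by 3)
More profoundly, note that you could find a non-zero column vector x such that Ax=0

$$ \begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix} \begin{bmatrix} 3 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} $$

This says 3 times column 1 in A plus -1 times column 2 gives nothing

Let's construct an inverse as an example

$$ \begin{bmatrix} 1 & 3 \\ 2 & 7 \end{bmatrix} \begin{bmatrix} a & c \\ b & d \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} $$

In essence we have to solve two systems
A × column j of $A^{-1}$ = column j of I
This is the Gauss-Jordan idea of solving two systems at once

$$ \begin{bmatrix} 1 & 3 \\ 2 & 7 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 1 & 3 \\ 2 & 7 \end{bmatrix} \begin{bmatrix} c \\ d \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} $$

This gives us the two columns of $A^{-1}$
We now create the augmented matrix

$$ \left[ \begin{array}{cc|cc} 1 & 3 & 1 & 0 \\ 2 & 7 & 0 & 1 \end{array} \right] $$

Now we use elementary row operations to get to reduced row-echelon form (leading 1's in the pivot positions, with 0's below and above each)

$$ \left[ \begin{array}{cc|cc} 1 & 3 & 1 & 0 \\ 2 & 7 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{cc|cc} 1 & 3 & 1 & 0 \\ 0 & 1 & -2 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{cc|cc} 1 & 0 & 7 & -3 \\ 0 & 1 & -2 & 1 \end{array} \right] $$

We now read off the two columns of $A^{-1}$

$$ A^{-1} = \begin{bmatrix} 7 & -3 \\ -2 & 1 \end{bmatrix} $$

To do all of the elimination, we created a lot of elimination (elementary) matrices
If we combine all of them into E we have $E[A \; I] = [I \; A^{-1}]$, because EA=I tells us $E = A^{-1}$

Example problems

Example problem 1

Find the conditions on a and b that make the matrix A invertible and find $A^{-1}$

$$ A = \begin{bmatrix} a & b & b \\ a & a & b \\ a & a & a \end{bmatrix} $$

Solution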



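A matrix is singular (non-invertible) if it has a row or column of zeros, so a ≠ 0
We can also not have identical columns, so a ≠ b
Using Gauss-Jordan elimination we will have the following

$$\left[\begin{array}{ccc|ccc} a & b & b & 1 & 0 & 0 \\ a & a & b & 0 & 1 & 0 \\ a & a & a & 0 & 0 & 1 \end{array}\right] \rightarrow \left[\begin{array}{ccc|ccc} a & b & b & 1 & 0 & 0 \\ 0 & a-b & 0 & -1 & 1 & 0 \\ 0 & a-b & a-b & -1 & 0 & 1 \end{array}\right] \rightarrow \left[\begin{array}{ccc|ccc} a & b & b & 1 & 0 & 0 \\ 0 & a-b & 0 & -1 & 1 & 0 \\ 0 & 0 & a-b & 0 & -1 & 1 \end{array}\right]$$

$$\rightarrow \left[\begin{array}{ccc|ccc} a & b & b & 1 & 0 & 0 \\ 0 & \frac{a-b}{a-b} & 0 & \frac{-1}{a-b} & \frac{1}{a-b} & 0 \\ 0 & 0 & \frac{a-b}{a-b} & 0 & \frac{-1}{a-b} & \frac{1}{a-b} \end{array}\right] \rightarrow \left[\begin{array}{ccc|ccc} a & b & b & 1 & 0 & 0 \\ 0 & 1 & 0 & \frac{-1}{a-b} & \frac{1}{a-b} & 0 \\ 0 & 0 & 1 & 0 & \frac{-1}{a-b} & \frac{1}{a-b} \end{array}\right]$$

$$\rightarrow \left[\begin{array}{ccc|ccc} a & b & 0 & 1 & \frac{b}{a-b} & \frac{-b}{a-b} \\ 0 & 1 & 0 & \frac{-1}{a-b} & \frac{1}{a-b} & 0 \\ 0 & 0 & 1 & 0 & \frac{-1}{a-b} & \frac{1}{a-b} \end{array}\right] \rightarrow \left[\begin{array}{ccc|ccc} a & 0 & 0 & 1 + \frac{b}{a-b} & 0 & \frac{-b}{a-b} \\ 0 & 1 & 0 & \frac{-1}{a-b} & \frac{1}{a-b} & 0 \\ 0 & 0 & 1 & 0 & \frac{-1}{a-b} & \frac{1}{a-b} \end{array}\right]$$

$$\rightarrow \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & \frac{1}{a-b} & 0 & \frac{-b}{a(a-b)} \\ 0 & 1 & 0 & \frac{-1}{a-b} & \frac{1}{a-b} & 0 \\ 0 & 0 & 1 & 0 & \frac{-1}{a-b} & \frac{1}{a-b} \end{array}\right]$$

$$A^{-1} = \frac{1}{a-b} \begin{bmatrix} 1 & 0 & -\frac{b}{a} \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}$$

From the elimination we note again that for the inverse of A to exist we need a - b ≠ 0, which is the same as a ≠ b, and a ≠ 0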
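The Gauss-Jordan results in this notebook can be double-checked with sympy. The sketch below row-reduces the augmented matrix [A I] for the 2×2 example and multiplies the symbolic matrix by the inverse found by hand; the variable names are assumptions made for this illustration only.

```python
from sympy import Matrix, eye, symbols, simplify

# Numeric example from the text: row-reduce [A | I] and read off the inverse
A = Matrix([[1, 3], [2, 7]])
augmented = A.row_join(eye(2))
reduced, pivots = augmented.rref()
A_inv = reduced[:, 2:]          # the right-hand block holds the inverse

# Symbolic example problem: check the hand-derived inverse
a, b = symbols('a b')
B = Matrix([[a, b, b], [a, a, b], [a, a, a]])
B_inv_by_hand = Matrix([[1, 0, -b / a], [-1, 1, 0], [0, -1, 1]]) / (a - b)
check = (B * B_inv_by_hand).applyfunc(simplify)

print(A_inv)
print(check == eye(3))
```

The symbolic check is only meaningful under the conditions found above (a ≠ 0 and a ≠ b), since the hand-derived inverse divides by both a and a - b.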

2015/03/28, 12:39 PM Chapter05_LU_decomposition_of_A

This notebook is part of lecture 4 Factorization into LU in the OCW MIT course 18.06 [1]
Created by me, Dr Juan H Klopper

In [2]: import numpy as np
        from sympy import *
        import matplotlib.pyplot as plt
        import seaborn as sns
        from IPython.display import Image
        from warnings import filterwarnings

        init_printing(use_latex = 'mathjax')
        %matplotlib inline
        filterwarnings('ignore')

LU decomposition of a matrix A

We will decompose the matrix A into an upper and a lower triangular matrix, such that multiplying these gives back A

$$A = LU$$

Turning the matrix of coefficients into upper triangular form

Consider the following matrix of coefficients

$$\begin{bmatrix} 1 & -2 & 1 \\ 3 & 2 & -2 \\ 6 & -1 & -1 \end{bmatrix}$$

Successive elementary row operations follow
Each one is nothing other than multiplication by an elementary matrix
An elementary matrix is an identity matrix on which one elementary row operation was performed

In [3]: A = Matrix([[1, -2, 1], [3, 2, -2], [6, -1, -1]])
        A

Out[3]: $\begin{bmatrix} 1 & -2 & 1 \\ 3 & 2 & -2 \\ 6 & -1 & -1 \end{bmatrix}$

In [4]: eye(3)

Out[4]: $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

To get rid of the 3 in row 2, column 1 we multiply row 1 (whose pivot is the 1 in row 1, column 1) by -3 and add the result to row 2
We call the elementary matrix that does this E21, referring to row 2, column 1
Performing the same operation on the identity matrix, -3 times row 1 is (-3,0,0) and adding this to row 2 leaves (-3,1,0); row 1 itself stays (1,0,0) in E21
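The construction of an elimination matrix described above can be sketched as a small function. The name `elimination_matrix` and its 1-based arguments are hypothetical choices for this illustration, not part of sympy or of the lecture.

```python
from sympy import Matrix, eye

def elimination_matrix(size, row, col, multiplier):
    """Hypothetical helper: identity matrix with `multiplier` placed at
    (row, col), using 1-based indices as in the text."""
    E = eye(size)
    E[row - 1, col - 1] = multiplier
    return E

A = Matrix([[1, -2, 1], [3, 2, -2], [6, -1, -1]])
E21 = elimination_matrix(3, 2, 1, -3)   # add -3 times row 1 to row 2
result = E21 * A
print(result)
```

Multiplying on the left by this matrix leaves rows 1 and 3 of A untouched and replaces row 2 by row 2 minus 3 times row 1.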








In [5]: E21 = Matrix([[1, 0, 0], [-3, 1, 0], [0, 0, 1]])
        E21

Out[5]: $\begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

In [6]: E21 * A # The resulting matrix after multiplication by E21

Out[6]: $\begin{bmatrix} 1 & -2 & 1 \\ 0 & 8 & -5 \\ 6 & -1 & -1 \end{bmatrix}$

We do the same to get rid of the 6 in row 3, column 1
We multiply row 1 (of the identity matrix) by -6 and add this new row to row 3 (but again leave row 1 as (1,0,0) in E31)

In [7]: E31 = Matrix([[1, 0, 0], [0, 1, 0], [-6, 0, 1]])
        E31

Out[7]: $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -6 & 0 & 1 \end{bmatrix}$

In [8]: E31 * E21 * A # This got rid of the leading 6 in row 3

Out[8]: $\begin{bmatrix} 1 & -2 & 1 \\ 0 & 8 & -5 \\ 0 & 11 & -7 \end{bmatrix}$

Now the 8 in row 2, column 2 is the pivot and we need to get rid of the 11 in row 3, column 2
Unfortunately we have an 8 and an 11 to deal with
We will have to do two elementary row operations

-11 times row 2 of the identity matrix (0,-11,0)
Added to 8 times row 3 (0,0,8) ∴ (0,-11,8)

In [9]: E32 = Matrix([[1, 0, 0], [0, 1, 0], [0, -11, 8]])
        E32

Out[9]: $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -11 & 8 \end{bmatrix}$

In [10]: U = E32 * E31 * E21 * A
         U # We call it U for upper triangular

Out[10]: $\begin{bmatrix} 1 & -2 & 1 \\ 0 & 8 & -5 \\ 0 & 0 & -1 \end{bmatrix}$

The matrix is now in upper triangular form

$$\left( E_{32} \right) \left( E_{31} \right) \left( E_{21} \right) A = U$$

Calculating the lower triangular form

Note, to reverse this process we would have to do the following:

$$\left( E_{21} \right)^{-1} \left( E_{31} \right)^{-1} \left( E_{32} \right)^{-1} \left( E_{32} \right) \left( E_{31} \right) \left( E_{21} \right) A = A$$

The inverse of a matrix can be calculated using the sympy method .inv()

We can check this with a Boolean request

In [11]: E21.inv() * E31.inv() * E32.inv() * E32 * E31 * E21 * A == A
         # The Boolean double equal sign asks: Is the left-hand side equal to the right-hand side?

Out[11]: True



Indeed, we get back the identity matrix just by multiplying the inverse elementary matrices and the elementary matrices

In [12]: E21.inv() * E31.inv() * E32.inv() * E32 * E31 * E21

Out[12]: $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

Multiplying by the inverse elementary matrices on the left-hand side of the equation means we must do the same on the right-hand side

$$\left( E_{21} \right)^{-1} \left( E_{31} \right)^{-1} \left( E_{32} \right)^{-1} \left( E_{32} \right) \left( E_{31} \right) \left( E_{21} \right) A = \left( E_{21} \right)^{-1} \left( E_{31} \right)^{-1} \left( E_{32} \right)^{-1} U$$

$$A = LU$$

The product of these inverse elementary matrices is lower triangular
We can call it L

In [13]: L = E21.inv() * E31.inv() * E32.inv()
         L

Out[13]: $\begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 6 & \frac{11}{8} & \frac{1}{8} \end{bmatrix}$

In [14]: A == L * U # Checking this with a Boolean question

Out[14]: True

In [15]: A, L * U # They are identical

Out[15]: $\left( \begin{bmatrix} 1 & -2 & 1 \\ 3 & 2 & -2 \\ 6 & -1 & -1 \end{bmatrix}, \ \begin{bmatrix} 1 & -2 & 1 \\ 3 & 2 & -2 \\ 6 & -1 & -1 \end{bmatrix} \right)$

Doing this in one go using sympy

In [16]: L, U, _ = A.LUdecomposition()

In [17]: L

Out[17]: $\begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 6 & \frac{11}{8} & 1 \end{bmatrix}$

In [18]: U # Note the difference from the U above

Out[18]: $\begin{bmatrix} 1 & -2 & 1 \\ 0 & 8 & -5 \\ 0 & 0 & -\frac{1}{8} \end{bmatrix}$

In [19]: L * U # Back to A

Out[19]: $\begin{bmatrix} 1 & -2 & 1 \\ 3 & 2 & -2 \\ 6 & -1 & -1 \end{bmatrix}$

What's special about L?

The entries of L below the main diagonal are the multipliers used during elimination
This only works when no row interchange happens
It also only works for the conventional form of elimination, in which a scalar multiple of one row is subtracted from another row, leaving the positive scalar in L (as opposed to the negatives I use, which allow me to add the two rows instead of subtracting)
Note the 3 (in row 2, column 1) and the 6 (in row 3, column 1)
They are the row multiplications we had to do for E21 and E31
The 11/8 is what we did for E32 (we just did it in two steps so as not to use fractions)
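The remark about the multipliers can be sketched in code: writing the three multipliers 3, 6 and 11/8 straight into an identity matrix should reproduce the L that sympy returns. The dictionary layout below is an assumption made for this illustration.

```python
from sympy import Matrix, Rational, eye

A = Matrix([[1, -2, 1], [3, 2, -2], [6, -1, -1]])

# Multipliers of conventional elimination, keyed by (row, column), 0-based
multipliers = {(1, 0): 3, (2, 0): 6, (2, 1): Rational(11, 8)}

L = eye(3)
for (row, col), value in multipliers.items():
    L[row, col] = value

U = L.inv() * A     # what remains must be the upper triangular factor
print(L * U == A, U.is_upper)
```

The resulting U has -1/8 in its last pivot position, as in the sympy output, rather than the -1 obtained with the two-step E32.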



Row exchanges

We have to allow row exchanges if a pivot position contains a zero

For an example, from a 3×3 identity matrix we could have:

In [20]: eye(3)

Out[20]: $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

Exchanging rows one and two would be:

In [21]: Matrix([[0, 1, 0], [1, 0, 0], [0, 0, 1]])

Out[21]: $\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

In [22]: A, Matrix([[0, 1, 0], [1, 0, 0], [0, 0, 1]]) * A # Showing row exchange

Out[22]: $\left( \begin{bmatrix} 1 & -2 & 1 \\ 3 & 2 & -2 \\ 6 & -1 & -1 \end{bmatrix}, \ \begin{bmatrix} 3 & 2 & -2 \\ 1 & -2 & 1 \\ 6 & -1 & -1 \end{bmatrix} \right)$

How many permutations of row exchanges are there?

$$n!$$

In a 3×3 matrix there are 3! = 6 permutations
Multiplying any two of them will result in one of the 6
The inverse of each of them is again one of the 6
The inverses are the transposes

For 4×4 there are 4! = 24

Example problems

Example problem 01

Perform LU decomposition of:

$$\begin{bmatrix} 1 & 0 & 1 \\ a & a & a \\ b & b & a \end{bmatrix}$$

For which values of a and b do L and U exist?

Solution

In [23]: a, b = symbols('a b')

In [24]: A = Matrix([[1, 0, 1], [a, a, a], [b, b, a]])
         A

Out[24]: $\begin{bmatrix} 1 & 0 & 1 \\ a & a & a \\ b & b & a \end{bmatrix}$

In [25]: L, U, _ = A.LUdecomposition()



In [26]: L, U

Out[26]: $\left( \begin{bmatrix} 1 & 0 & 0 \\ a & 1 & 0 \\ b & \frac{b}{a} & 1 \end{bmatrix}, \ \begin{bmatrix} 1 & 0 & 1 \\ 0 & a & 0 \\ 0 & 0 & a - b \end{bmatrix} \right)$

Checking

In [27]: L * U == A

Out[27]: True

For existence: a ≠ 0

It's easy to see why, since if a equals zero, we will have a zero in a pivot position and we will have to do a row exchange, which is not allowed for LU-decomposition

Hints and tips

In [28]: E21, E21.inv()
         # To invert an elementary matrix that adds a multiple of one row to another (determinant 1),
         # simply change the sign of the off-diagonal element

Out[28]: $\left( \begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \ \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right)$

In [29]: E31, E31.inv()

Out[29]: $\left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -6 & 0 & 1 \end{bmatrix}, \ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 6 & 0 & 1 \end{bmatrix} \right)$

In [30]: E32, E32.inv()
         # E32 also scales row 3 by 8, so its inverse involves dividing by the determinant (8)

Out[30]: $\left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -11 & 8 \end{bmatrix}, \ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & \frac{11}{8} & \frac{1}{8} \end{bmatrix} \right)$

By keeping track of the elementary matrices it is easy to get L and U
It's easy to get the inverses of L and U
This means it is easy to calculate x

$$A\underline{x} = LU\underline{x} = \underline{b}$$
$$U\underline{x} = L^{-1}\underline{b}$$
$$\underline{x} = U^{-1}L^{-1}\underline{b}$$
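The three equations above translate directly into code. The sketch below reuses the 3×3 matrix from earlier in this notebook; the right-hand side b is an arbitrary illustrative choice, not taken from the lecture.

```python
from sympy import Matrix

A = Matrix([[1, -2, 1], [3, 2, -2], [6, -1, -1]])
b = Matrix([1, 2, 3])   # illustrative right-hand side (an assumption)

L, U, _ = A.LUdecomposition()
c = L.inv() * b         # U x = L^{-1} b
x = U.inv() * c         # x = U^{-1} L^{-1} b
print(x, A * x == b)
```

In practice forward and back substitution would replace the explicit inverses; sympy's `LUsolve` method takes that route and should return the same x.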

2015/03/28, 12:51 PM I_06_Transposes_Permutations_Spaces

This notebook is part of lecture 5 Transposes, permutations, and vector spaces in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper

In [2]: #import numpy as np
        from sympy import init_printing, Matrix, symbols
        #import matplotlib.pyplot as plt
        #import seaborn as sns
        #from IPython.display import Image
        from warnings import filterwarnings

Transposes, permutations and vector spaces

The permutation matrices

Remember that the permutation matrices allow for row exchanges
They are used to manage zeros in pivot positions
They have the following property

$$P^{-1} = P^{T}$$

In [3]: P = Matrix([[0, 1, 0], [1, 0, 0], [0, 0, 1]])
        P # Exchanging rows 1 and 2

Out[3]: $\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

In [4]: P.inv(), P.transpose()

Out[4]: $\left( \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \ \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right)$

In [5]: P.inv() == P.transpose()

Out[5]: True

If a matrix is of size n × n then there are n! permutation matrices

The transpose of a matrix








We have mentioned transposes of a matrix, but what are they?
They simply make rows of the column elements and columns of the row elements, as in the example below

In [6]: a11, a12, a13, a14, a21, a22, a23, a24, a31, a32, a33, a34 = symbols('a11, a12, a13, a14, a21, a22, a23, a24, a31, a32, a33, a34')
        # Creating mathematical scalar constants

In [7]: A = Matrix([[a11, a12, a13], [a21, a22, a23], [a31, a32, a33]])
        A

Out[7]: $\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$

In [8]: A.transpose()

Out[8]: $\begin{bmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{bmatrix}$

This applies to any size matrix

In [9]: A = Matrix([[a11, a12, a13, a14], [a21, a22, a23, a24]])
        A

Out[9]: $\begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \end{bmatrix}$

In [10]: A.transpose()

Out[10]: $\begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \\ a_{13} & a_{23} \\ a_{14} & a_{24} \end{bmatrix}$

Multiplying a matrix by its transpose results in a symmetric matrix

In [11]: A * A.transpose()

Out[11]: $\begin{bmatrix} a_{11}^2 + a_{12}^2 + a_{13}^2 + a_{14}^2 & a_{11} a_{21} + a_{12} a_{22} + a_{13} a_{23} + a_{14} a_{24} \\ a_{11} a_{21} + a_{12} a_{22} + a_{13} a_{23} + a_{14} a_{24} & a_{21}^2 + a_{22}^2 + a_{23}^2 + a_{24}^2 \end{bmatrix}$

Symmetric matrices

A symmetric matrix is a square matrix with elements opposite the main diagonal all equal
Example

In [12]: S = Matrix([[1, 3, 2], [3, 2, 4], [2, 4, 2]])
         S

Out[12]: $\begin{bmatrix} 1 & 3 & 2 \\ 3 & 2 & 4 \\ 2 & 4 & 2 \end{bmatrix}$

On the main diagonal we have 1, 2, 2
Opposite this main diagonal we have a 3 and a 3 and a 2 and a 2 and a 4 and a 4
The transpose of a symmetric matrix is equal to the matrix

In [13]: S == S.transpose()

Out[13]: True

Vector spaces



A vector space is a bunch of vectors (a set of vectors) with certain properties that allow us to do stuff with them
The space $\mathbb{R}^2$ is all vectors of two components, which together reach every coordinate point in $\mathbb{R}^2$
It always includes the zero vector 0
We usually call this vector space V, such that V = $\mathbb{R}^2$ or V = $\mathbb{R}^n$
Linear combinations of a certain number of these vectors can also fill all of $\mathbb{R}^2$
A good example is the two unit vectors along the two axes
Such a set of vectors forms a basis for V
The two of them also span $\mathbb{R}^2$, i.e. linear combinations of them fill V = $\mathbb{R}^2$
Linear independence means the vectors in $\mathbb{R}^2$ don't fall on the same line

If they do, we can't get to all coordinate points in $\mathbb{R}^2$
The important point about a vector space V is that it allows for vector addition and scalar multiplication

Taking any of the vectors in V and adding them results in a new vector which is still an element of V
Multiplying a scalar by any of the vectors in V results in a vector still in V

A subspace

For a subspace the rules of vector addition and scalar multiplication must apply
E.g. a quadrant of $\mathbb{R}^2$ is not a vector subspace

Addition or scalar multiplication of any vector in this quadrant can lead to a vector outside of this quadrant
The zero vector 0 is a subspace (every subspace must contain the zero vector)
The whole space V = $\mathbb{R}^n$ (here we use n = 2) is a subspace of itself
Continuing with our example of n = 2, any line through the origin is a subspace of $\mathbb{R}^2$

Adding a vector on this line to itself or to a scalar multiple of itself will eventually fill the whole line
For n = 3 the whole space V = $\mathbb{R}^3$, a plane through the origin, a line through the origin and the zero vector are all subspaces of V = $\mathbb{R}^3$
The point is that vector addition and scalar multiplication of vectors in the subspace must result in a new vector that remains in the subspace
Every subspace must include the zero vector 0
All the properties of vectors must apply to the vectors in a subspace (and a space)

Column spaces of matrices

Here we see the columns of a matrix as vectors
If there are two columns and three rows we will have the following as an example

$$\begin{bmatrix} 2 & 1 \\ 1 & 3 \\ 2 & 2 \end{bmatrix} \quad \text{with column vectors} \quad \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}, \ \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix}$$

If they are not scalar multiples of each other, linear combinations of the two of them will fill a plane in $\mathbb{R}^3$
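Whether the two columns really span a plane can be checked by computing the rank of the matrix. This is a minimal sketch using sympy's `rank` and `columnspace` methods on the example matrix above.

```python
from sympy import Matrix

A = Matrix([[2, 1], [1, 3], [2, 2]])

rank = A.rank()             # number of independent columns
basis = A.columnspace()     # list of basis vectors of the column space
print(rank, len(basis))
```

A rank of 2 means the column space is two-dimensional, i.e. a plane through the origin in three-dimensional space.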

2015/03/28, 1:11 PM I_07_Column_and_null_spaces

This notebook is part of lecture 6 Columnspace and nullspace in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper

Columnspace and nullspace of a matrix

Columnspaces of matrices

We saw in the previous lecture that columns of a matrix can form vectors
Consider now the LU-decomposition of A

$$PA = PLU$$

For two subspaces P and L (for example a plane and a line through the origin), the union P∪L (all vectors in P or L or both) is NOT a subspace
The intersection P∩L (all vectors in both P and L) is a subspace (for a line not lying in the plane, their intersection is only the zero vector)

The intersection of any two subspaces is a subspace

Consider the following example matrix

In [3]: A = Matrix([[1, 1, 2], [2, 1, 3], [3, 1, 4], [4, 1, 5]])
        A

Out[3]: $\begin{bmatrix} 1 & 1 & 2 \\ 2 & 1 & 3 \\ 3 & 1 & 4 \\ 4 & 1 & 5 \end{bmatrix}$

Each of the columns is a vector in $\mathbb{R}^4$

The linear combinations of all the column vectors form a subspace (the column space)
Is it the whole V = $\mathbb{R}^4$, though?








The reason why we ask is because we want to bring it back to a system of linear equations and ask the question: Is there (always) a solution to the following:

$$A\underline{x} = \underline{b}$$

Thus, which right-hand sides b are allowed?
In our example above we are in $\mathbb{R}^4$ and we ask if linear combinations of all of the columns fill $\mathbb{R}^4$

From our example above some right-hand sides will be allowed (they form a subspace)
Let's look at an example for b

In [4]: x1, x2, x3 = symbols('x1, x2, x3')
        vec_x = Matrix([x1, x2, x3])
        b = Matrix([1, 2, 3, 4])
        A, vec_x, b

Out[4]: $\left( \begin{bmatrix} 1 & 1 & 2 \\ 2 & 1 & 3 \\ 3 & 1 & 4 \\ 4 & 1 & 5 \end{bmatrix}, \ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \ \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix} \right)$

In [5]: A * vec_x

Out[5]: $\begin{bmatrix} x_1 + x_2 + 2 x_3 \\ 2 x_1 + x_2 + 3 x_3 \\ 3 x_1 + x_2 + 4 x_3 \\ 4 x_1 + x_2 + 5 x_3 \end{bmatrix}$

You can do the row multiplication, but it's easy to see from above we are asking about linear combinations of the columns, i.e. how many ($x_1$) of column 1 plus how many ($x_2$) of column 2 plus how many ($x_3$) of column 3 equals b?
Well, since b is the same as the first column, x would be

$$\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$$

So we can solve Ax=b exactly when b is in the column space

Linear independence

We really need to know if the columns above are linearly independent
We note that column three above is a linear combination of the first two, so adds nothing new
Actually, we could also throw away the first one because it is column 3 plus -1 times column 2
Same for column 2
We thus have two columns left and we say that the column space is of dimension 2 (a 2-dimensional subspace of $\mathbb{R}^4$)

The nullspace

It contains all solutions x for Ax=0
These solutions are in $\mathbb{R}^3$

In [6]: zero_b = Matrix([0, 0, 0, 0])
        A, vec_x, zero_b

Out[6]: $\left( \begin{bmatrix} 1 & 1 & 2 \\ 2 & 1 & 3 \\ 3 & 1 & 4 \\ 4 & 1 & 5 \end{bmatrix}, \ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \ \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \right)$



Some solutions would be

$$\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \ \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}, \ \begin{bmatrix} 2 \\ 2 \\ -2 \end{bmatrix}$$

In fact, we have:

$$c \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}$$

It is thus a line
The nullspace is a line in $\mathbb{R}^3$

PLEASE remember, for any space the rules of addition and scalar multiplication must hold for vectors to remain in that space
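The dimensions found by inspection above can be confirmed with sympy. The sketch below uses the built-in `columnspace` and `nullspace` methods on the 4×3 example matrix; the variable names are illustrative.

```python
from sympy import Matrix, zeros

A = Matrix([[1, 1, 2], [2, 1, 3], [3, 1, 4], [4, 1, 5]])

column_basis = A.columnspace()   # basis vectors of the column space
null_basis = A.nullspace()       # basis vectors of the nullspace
special = Matrix([1, 1, -1])     # the solution found by inspection

print(len(column_basis), len(null_basis))
print(A * special == zeros(4, 1))
```

sympy may return a different multiple of the basis vector than the one written above, which is fine since any non-zero multiple spans the same line.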

2015/03/28, 1:15 PM I_08_Solving_homogeneous_systems_Pivot_variables_Special_solutions

This notebook is part of lecture 7 Solving Ax=0, pivot variables, and special solutions in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper

Solving homogeneous systems
Pivot variables
Special solutions

We are trying to solve a system of linear equations
For homogeneous systems the right-hand side is the zero vector
Consider the example below

In [3]: A = Matrix([[1, 2, 2, 2], [2, 4, 6, 8], [3, 6, 8, 10]])
        A # A 3x4 matrix

Out[3]: $\begin{bmatrix} 1 & 2 & 2 & 2 \\ 2 & 4 & 6 & 8 \\ 3 & 6 & 8 & 10 \end{bmatrix}$

In [4]: x1, x2, x3, x4 = symbols('x1, x2, x3, x4')
        x_vect = Matrix([x1, x2, x3, x4]) # A 4x1 matrix
        x_vect

Out[4]: $\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}$

In [5]: b = Matrix([0, 0, 0])
        b # A 3x1 matrix

Out[5]: $\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$

The set of all column vectors x that solve this homogeneous equation forms the nullspace
Note that the column vectors in A are not linearly independent








Performing elementary row operations leaves us with the matrix below
It has two pivots, which is termed rank 2

In [6]: A.rref() # rref being reduced row echelon form

Out[6]: $\left( \begin{bmatrix} 1 & 2 & 0 & -2 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \ \begin{bmatrix} 0, & 2 \end{bmatrix} \right)$

Which represents the following

$$x_1 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix} + x_3 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} -2 \\ 2 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

$$x_1 + 2 x_2 + 0 x_3 - 2 x_4 = 0 \\ 0 x_1 + 0 x_2 + x_3 + 2 x_4 = 0 \\ 0 x_1 + 0 x_2 + 0 x_3 + 0 x_4 = 0$$

We are free to set a value for $x_4$, let's say t

$$x_1 + 2 x_2 + 0 x_3 - 2 x_4 = 0 \\ 0 x_1 + 0 x_2 + x_3 + 2t = 0 \\ 0 x_1 + 0 x_2 + 0 x_3 + 0 x_4 = 0 \\ \therefore x_3 = -2t$$

We will have to make $x_2$ equal to another variable, say s

$$x_1 + 2s + 0 x_3 - 2t = 0 \\ \therefore x_1 = 2t - 2s$$

This results in the following, which is the complete nullspace and has dimension 2

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} -2s + 2t \\ s \\ -2t \\ t \end{bmatrix} = \begin{bmatrix} -2s \\ s \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 2t \\ 0 \\ -2t \\ t \end{bmatrix} = s \begin{bmatrix} -2 \\ 1 \\ 0 \\ 0 \end{bmatrix} + t \begin{bmatrix} 2 \\ 0 \\ -2 \\ 1 \end{bmatrix}$$

From the above, we clearly have two vectors in the solution and we can take linear combinations of these to fill up our solution space (our nullspace)

We can easily calculate how many free variables we will have by subtracting the number of pivots (rank) from the number of variables (x) in x
Here we have 4 - 2 = 2

Example problem

Calculate x for the transpose of A above

Solution

In [7]: A_trans = A.transpose() # Creating a new matrix called A_trans and giving it the value of the transpose of A
        A_trans

Out[7]: $\begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \\ 2 & 6 & 8 \\ 2 & 8 & 10 \end{bmatrix}$

In [8]: A_trans.rref() # In reduced row echelon form this would be the following matrix

Out[8]: $\left( \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \ \begin{bmatrix} 0, & 1 \end{bmatrix} \right)$



Remember this is 4 equations in 3 unknowns, i.e.

$$x_1 \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} + x_3 \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$

$$x_1 + 0 x_2 + x_3 = 0 \\ 0 x_1 + x_2 + x_3 = 0 \\ 0 x_1 + 0 x_2 + 0 x_3 = 0 \\ 0 x_1 + 0 x_2 + 0 x_3 = 0$$

It seems we are free to choose a value for $x_3$
Let's make it t

$$x_3 = t \\ x_1 + 0 x_2 + t = 0 \\ 0 x_1 + x_2 + t = 0 \\ \therefore x_2 = -t \\ \therefore x_1 = -t$$

$$-t \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} - t \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} + t \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -t \\ -t \\ t \end{bmatrix} = t \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}$$

We had n = 3 unknowns and r (rank) = 2 pivots
The solution set (nullspace) will thus have 1 variable (t) (3-2=1)

The third column is the sum of the first two, so only 2 columns are linearly independent
We thus expect 2 pivots and can predict the nullspace to have only 1 variable (i.e. it is one-dimensional)
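The special solutions worked out by hand in this notebook can be compared with what sympy's `nullspace` method returns. This is a sketch only; the names `s_vec` and `t_vec` are assumptions chosen to match the parameters s and t above.

```python
from sympy import Matrix, zeros

A = Matrix([[1, 2, 2, 2], [2, 4, 6, 8], [3, 6, 8, 10]])

# Special solutions derived by hand (coefficients of s and t)
s_vec = Matrix([-2, 1, 0, 0])
t_vec = Matrix([2, 0, -2, 1])

null_A = A.nullspace()                  # basis of the nullspace of A
null_At = A.transpose().nullspace()     # basis of the nullspace of the transpose
free_count = A.cols - A.rank()          # number of free variables

print(A * s_vec == zeros(3, 1), A * t_vec == zeros(3, 1))
print(len(null_A), len(null_At), free_count)
```

The counts agree with the rule stated above: the number of free variables equals the number of unknowns minus the rank, both for A (4 - 2) and for its transpose (3 - 2).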

2015/03/28, 1:26 PM II_09_Diagonalization_and_Powers

This notebook is part of lecture 22 Diagonalization and powers of A in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper

Diagonalizing a matrix
Powers of a matrix A

Definition

If A is an n×n matrix, then a non-zero vector x in $\mathbb{R}^n$ is called an eigenvector of the matrix A if Ax is a scalar multiple of x
What this suggests is that if you consider the column vector x and multiply it by a scalar (here called λ) (the result is then parallel to x, just of different length) it gives the same result as multiplying the matrix A by x
Let's try another explanation: if a matrix A, multiplied with a (column) vector (x) results in a scalar multiple of that same (column) vector (and is thus parallel to that (column) vector) then this (column) vector is an eigenvector of the matrix A

In essence this multiplication of a matrix with a (column) vector produces another vector on the same line as the original vector
Depending on the value of this scalar the resulting vector might point in the opposite direction and be shorter or longer than the original

This scalar multiple is called the eigenvalue
Matrices can have more than one eigenvalue and eigenvector

Derivations

We need to insert an identity matrix of size n into the equation that describes the explanation above

$$A\underline{x} = \lambda \underline{x} \\ A\underline{x} = \lambda I \underline{x} \\ A\underline{x} - \lambda I \underline{x} = \underline{0} \\ \left( A - \lambda I \right) \underline{x} = \underline{0}$$

Look at this carefully and you'll notice that we are describing the nullspace (eigenspace) of the matrix (A-λI)
This matrix has to be singular, i.e. have a determinant of 0

$$\left| A - \lambda I \right| = 0$$

Solving this equation (called the characteristic equation) will give us the eigenvalues (λ's)
It will always be a polynomial in λ (called the characteristic polynomial of A), with a leading coefficient of 1 and a degree of n corresponding to the size of A

$$p \left( \lambda \right) = \lambda^n + c_1 \lambda^{n-1} + \cdots + c_n$$

Substituting them back into...

$$\left( A - \lambda I \right) \underline{x} = \underline{0}$$

... allows us to calculate the eigenvector(s) x








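Let's look at the following matrix A

$$A = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix}$$

$$A - \lambda I = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix} - \begin{bmatrix} \lambda & 0 & 0 \\ 0 & \lambda & 0 \\ 0 & 0 & \lambda \end{bmatrix} = \begin{bmatrix} -\lambda & 0 & -2 \\ 1 & 2 - \lambda & 1 \\ 1 & 0 & 3 - \lambda \end{bmatrix}$$

$$\begin{vmatrix} -\lambda & 0 & -2 \\ 1 & 2 - \lambda & 1 \\ 1 & 0 & 3 - \lambda \end{vmatrix} = 0$$

$$\lambda^3 - 5 \lambda^2 + 8 \lambda - 4 = 0$$

$$\lambda_1 = 1, \quad \lambda_2 = \lambda_3 = 2$$

Let's start with the first eigenvalue, which is equal to 1 and replace it in A-λI

In [4]: A = Matrix([[-1, 0, -2], [1, 1, 1], [1, 0, 2]]) # This is A - λI with λ = 1
        A

Out[4]: $\begin{bmatrix} -1 & 0 & -2 \\ 1 & 1 & 1 \\ 1 & 0 & 2 \end{bmatrix}$

We now need the nullspace of this matrix

In [5]: A.nullspace()

Out[5]: $\left[ \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} \right]$

We knew that this would be 1-dimensional after looking at the row-reduced form

In [6]: A.rref()

Out[6]: $\left( \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}, \ \begin{bmatrix} 0, & 1 \end{bmatrix} \right)$

It has rank 2 (two pivot columns and 1 free variable)

Now for the other 2 eigenvalues, both equaling 2

In [7]: A = Matrix([[-2, 0, -2], [1, 0, 1], [1, 0, 1]]) # This is A - λI with λ = 2
        A

Out[7]: $\begin{bmatrix} -2 & 0 & -2 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix}$

In [8]: A.nullspace()

Out[8]: $\left[ \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \ \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \right]$

In [9]: A.rref()

Out[9]: $\left( \begin{bmatrix} 1 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \ \begin{bmatrix} 0 \end{bmatrix} \right)$

Only a single pivot column, therefore rank of 1 and two independent (free) variables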
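The whole calculation above can be repeated in one go on the original matrix. The sketch below forms det(λI - A), which has leading coefficient 1, and then computes the nullspace of A - λI for each eigenvalue; the names `char_poly` and `eigenspaces` are illustrative assumptions.

```python
from sympy import Matrix, eye, symbols

lam = symbols('lambda')
A = Matrix([[0, 0, -2], [1, 2, 1], [1, 0, 3]])

char_poly = (lam * eye(3) - A).det()     # characteristic polynomial, det(λI - A)
eigenvalues = A.eigenvals()              # {eigenvalue: algebraic multiplicity}

# Eigenspace of each eigenvalue = nullspace of (A - λI)
eigenspaces = {value: (A - value * eye(3)).nullspace() for value in eigenvalues}

print(char_poly.expand())
print(eigenvalues)
```

The eigenvalue 2 appears twice and still has a two-dimensional eigenspace, so this matrix has three independent eigenvectors in total.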



Corresponding to the first eigenvalue we have a single eigenvector that is the basis for a 1-dimensional (line) eigenspace in $\mathbb{R}^3$
Corresponding to the second (and third) eigenvalues we have two basis vectors for a 2-dimensional plane in $\mathbb{R}^3$
Since we are talking about subspaces, we must note that the zero vector must be in both eigenspaces (a type of nullspace), but isn't an eigenvector

The eigenvalues of triangular (upper and lower) and diagonal matrices

The eigenvalues of these types of matrices are exactly the entries along the main diagonal

Real and complex eigenvalues

There will be characteristic polynomials resulting in complex roots
The consequences of real-valued eigenvalues for a square matrix A of size n are the following

The system (A-λI)x=0 has non-trivial solutions
There is a non-zero vector x in $\mathbb{R}^n$ such that Ax=λx

The eigenvector matrix S and eigenvalue matrix Λ

We need to create S from the (column) eigenvectors such that the following holds

$$S^{-1} A S = \Lambda$$

As such, S should be square of size n×n and invertible, so we need n independent eigenvectors

Suppose we have n linearly independent eigenvectors of A
Put them in the columns of S and calculate AS

$$AS = A \begin{bmatrix} \vdots & \vdots & & \vdots \\ \underline{x}_1 & \underline{x}_2 & \cdots & \underline{x}_n \\ \vdots & \vdots & & \vdots \end{bmatrix} = \begin{bmatrix} \vdots & \vdots & & \vdots \\ \lambda_1 \underline{x}_1 & \lambda_2 \underline{x}_2 & \cdots & \lambda_n \underline{x}_n \\ \vdots & \vdots & & \vdots \end{bmatrix} = \begin{bmatrix} \vdots & \vdots & & \vdots \\ \underline{x}_1 & \underline{x}_2 & \cdots & \underline{x}_n \\ \vdots & \vdots & & \vdots \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}$$

$$AS = S \Lambda$$

From this we have the following

$$AS = S \Lambda \\ S^{-1} A S = \Lambda \\ A = S \Lambda S^{-1}$$

Later I will use the computer variable D for this diagonal matrix Λ

The power of a matrix (only for n independent eigenvectors)

We saw in the example section of the last lecture that the following holds

$$A^2 \underline{x} = \lambda^2 \underline{x}$$

The eigenvectors are the same for A and $A^2$
We can also see the following

$$A^2 = S \Lambda S^{-1} S \Lambda S^{-1} = S \Lambda^2 S^{-1}$$

The power need not be 2, but any k, which will have $S^{-1}S$ appearing k-1 times

We thus have the following theorems

$$A^k \rightarrow 0 \ \text{as} \ k \rightarrow \infty \ \text{if all} \ \left| \lambda_i \right| < 1$$

...and...
If k is a positive integer, λ is an eigenvalue of the matrix A, and x is a corresponding eigenvector, then $\lambda^k$ is an eigenvalue of $A^k$ and x is a corresponding eigenvector
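The identity for powers can be checked on the 3×3 example used earlier in this notebook, which has three independent eigenvectors. The exponent k = 5 in this sketch is an arbitrary illustrative choice.

```python
from sympy import Matrix

A = Matrix([[0, 0, -2], [1, 2, 1], [1, 0, 3]])
S, D = A.diagonalize()      # columns of S are eigenvectors, D holds the eigenvalues

k = 5                       # illustrative exponent
power_via_eigen = S * D**k * S.inv()
power_direct = A**k

print(power_via_eigen == power_direct)
```

Only the diagonal entries of D are raised to the power k, which is much cheaper than repeated matrix multiplication for large k.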



What makes a matrix diagonalizable

In discussing diagonalization we are concerned with finding a basis for $\mathbb{R}^n$ that consists of eigenvectors of a given square matrix of size n
These bases can tell us about geometric properties of A and they can simplify numerical computations involving A

We need to answer two questions (which are actually the same)
Given a square matrix of size n, is there a basis for $\mathbb{R}^n$ consisting of eigenvectors?
Given a square matrix of size n, is there an invertible matrix S, such that $S^{-1}AS$ is a diagonal matrix? (It is the same matrix S referred to above)

If such a matrix S exists, it is said to diagonalize A (and we will call the resultant diagonal matrix D)

In short the answer to the above question(s) is yes if A has n independent eigenvectors
This happens if all λ's are different (none are repeated) (not totally excluded if they are repeated though)

If they are repeated, we still might have independent eigenvectors, e.g. any size identity matrix (because it is already diagonal)

In [10]: A = eye(5)
         A

Out[10]: $\begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$

In [11]: A.eigenvals()

Out[11]: $\left\{ 1 : 5 \right\}$

In [12]: A.eigenvects()

Out[12]: $\left[ \left( 1, \ 5, \ \left[ \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \ \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \ \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \ \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \ \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} \right] \right) \right]$

Here we look at a triangular matrix, though

In [13]: A = Matrix([[2, 1], [0, 2]])
         A.eigenvals()

Out[13]: $\left\{ 2 : 2 \right\}$

In [14]: A.eigenvects()

Out[14]: $\left[ \left( 2, \ 2, \ \left[ \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right] \right) \right]$

We can use python™ code to calculate the diagonalized matrix

In [15]: A = Matrix([[3, -2, 4, -2], [5, 3, -3, -2], [5, -2, 2, -2], [5, -2, -3, 3]])
         A

Out[15]: $\begin{bmatrix} 3 & -2 & 4 & -2 \\ 5 & 3 & -3 & -2 \\ 5 & -2 & 2 & -2 \\ 5 & -2 & -3 & 3 \end{bmatrix}$

In [16]: S, D = A.diagonalize()



In [17]: S # S, such that A = S times D times the inverse of S

Out[17]: $\begin{bmatrix} 0 & 1 & 1 & 0 \\ 1 & 1 & 1 & -1 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{bmatrix}$

In [18]: D # The diagonal

Out[18]: $\begin{bmatrix} -2 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 5 & 0 \\ 0 & 0 & 0 & 5 \end{bmatrix}$

In [19]: S * D * S.inv() == A # Checking to see if our statement above is correct

Out[19]: True

In [20]: S.inv() * A * S == D # Checking to see if our statement above is correct

Out[20]: True

In [21]: A.eigenvals()

Out[21]: $\left\{ -2 : 1, \ 3 : 1, \ 5 : 2 \right\}$

Remember Λ from above?
The eigenvalues are precisely the entries along the main diagonal of the diagonal matrix

Producing the required diagonal matrix manually requires computing n linearly independent eigenvectors for the matrix A of size n (assuming that it is diagonalizable), creating a matrix with its columns equal to these eigenvectors (called matrix S) and evaluating $S^{-1}AS$ to calculate the diagonal matrix D (or Λ)

Back to the topic of what makes a matrix diagonalizable

Suppose we have an equation that starts with some vector and every subsequent vector is a matrix A times the previous vector

$$\underline{u}_{k+1} = A \underline{u}_k$$

From this arises the following

$$\underline{u}_1 = A \underline{u}_0 \\ \underline{u}_2 = A A \underline{u}_0 = A^2 \underline{u}_0 \\ \underline{u}_k = A^k \underline{u}_0$$

To really solve this problem, rewrite $\underline{u}_0$ as follows (a linear combination of the eigenvectors)

$$\underline{u}_0 = c_1 \underline{x}_1 + c_2 \underline{x}_2 + \cdots + c_n \underline{x}_n = S \underline{c}$$

Where Sc is a linear combination of the individual eigenvectors

Now multiply both sides by A

$$A \underline{u}_0 = c_1 A \underline{x}_1 + c_2 A \underline{x}_2 + \cdots + c_n A \underline{x}_n \\ A \underline{u}_0 = c_1 \lambda_1 \underline{x}_1 + c_2 \lambda_2 \underline{x}_2 + \cdots + c_n \lambda_n \underline{x}_n$$

Taking a power of A now (i.e. k) would be akin to taking each eigenvalue to that power

$$A^k \underline{u}_0 = c_1 \lambda_1^k \underline{x}_1 + c_2 \lambda_2^k \underline{x}_2 + \cdots + c_n \lambda_n^k \underline{x}_n$$

This can be written as

$$\underline{u}_k = A^k \underline{u}_0 = S \Lambda^k \underline{c}$$
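The last formula can be tried out on a small case. The sketch below assumes an illustrative symmetric 2×2 matrix and starting vector, neither of which comes from the lecture, and obtains the coefficient vector c by solving u0 = Sc.

```python
from sympy import Matrix

A = Matrix([[2, 1], [1, 2]])    # illustrative matrix with eigenvalues 1 and 3
u0 = Matrix([1, 0])             # illustrative starting vector

S, D = A.diagonalize()
c = S.inv() * u0                # coefficients in u0 = S c

k = 6                           # illustrative number of steps
u_k_eigen = S * D**k * c        # S Λ^k c
u_k_direct = A**k * u0          # A^k u0

print(u_k_eigen == u_k_direct)
```

Both routes give the same vector, with the eigenvector route needing only the k-th powers of the two eigenvalues.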



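As an example consider the Fibonacci numbers: 0, 1, 1, 2, 3, 5, 8, 13, ...
What would the 100th number be?
Consider the following

$$F_{k+2} = F_{k+1} + F_k$$

This is a (second-order) difference equation; think of this example as similar to a second-order differential equation (without derivatives)
By adding a second equation $F_{k+1} = F_{k+1}$, consider $\underline{u}_k$ to be the following vector

$$\underline{u}_k = \begin{bmatrix} F_{k+1} \\ F_k \end{bmatrix}$$

This means the following

$$\underline{u}_{k+1} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} F_{k+1} \\ F_k \end{bmatrix} = \begin{bmatrix} F_{k+1} + F_k \\ F_{k+1} \end{bmatrix}$$

$$\underline{u}_{k+1} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \underline{u}_k$$

In [22]: A = Matrix([[1, 1], [1, 0]])
         A

Out[22]: $\begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$

In [23]: A.eigenvals()

Out[23]: $\left\{ \frac{1}{2} + \frac{\sqrt{5}}{2} : 1, \ -\frac{\sqrt{5}}{2} + \frac{1}{2} : 1 \right\}$

In [24]: A.eigenvects()

Out[24]: $\left[ \left( \frac{1}{2} + \frac{\sqrt{5}}{2}, \ 1, \ \left[ \begin{bmatrix} -\frac{1}{-\frac{\sqrt{5}}{2} + \frac{1}{2}} \\ 1 \end{bmatrix} \right] \right), \ \left( -\frac{\sqrt{5}}{2} + \frac{1}{2}, \ 1, \ \left[ \begin{bmatrix} -\frac{1}{\frac{1}{2} + \frac{\sqrt{5}}{2}} \\ 1 \end{bmatrix} \right] \right) \right]$

In [25]: S, D = A.diagonalize()

In [26]: D

Out[26]: $\begin{bmatrix} \frac{1}{2} + \frac{\sqrt{5}}{2} & 0 \\ 0 & -\frac{\sqrt{5}}{2} + \frac{1}{2} \end{bmatrix}$

From above we remember the following

$$\underline{u}_k = A^k \underline{u}_0 = S \Lambda^k \underline{c}$$

We have $\underline{u}_0$ containing the first two values

In [27]: u_zero = Matrix([1, 0])
         u_100 = A ** 100 * u_zero
         u_100 # u_100 = (F_101, F_100): counting from F_0 = 0, the bottom value is F_100

Out[27]: $\begin{bmatrix} 573147844013817084101 \\ 354224848179261915075 \end{bmatrix}$

In [28]: u_four = A ** 4 * u_zero
         u_four # u_4 = (F_5, F_4) = (5, 3) when counting from F_0 = 0

Out[28]: $\begin{bmatrix} 5 \\ 3 \end{bmatrix}$

Example problems

Example problem 1

Find an equation for $C^k$ where C is given by the following matrix

$$C = \begin{bmatrix} 2b - a & a - b \\ 2b - 2a & 2a - b \end{bmatrix}$$

Calculate $C^{100}$ when a=b=-1

Solution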



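In [29]: a, b, k = symbols('a b k')

In [30]: C = Matrix([[2 * b - a, a - b], [2 * b - 2 * a, 2 * a - b]])
         C

Out[30]: $\begin{bmatrix} -a + 2b & a - b \\ -2a + 2b & 2a - b \end{bmatrix}$

We remember the following

$$A^k = S \Lambda^k S^{-1}$$

Where Λ is denoted by the computer variable D

In [31]: S, D = C.diagonalize()

In [32]: S

Out[32]: $\begin{bmatrix} -\frac{2a - 2b}{-3a + 3b + \sqrt{(a-b)^2}} & \frac{2a - 2b}{3a - 3b + \sqrt{(a-b)^2}} \\ 1 & 1 \end{bmatrix}$

Python™ is not always good at simplifying these
If you look at it carefully you will note the following

In [33]: S = Matrix([[1, Rational(1, 2)], [1, 1]])
         S

Out[33]: $\begin{bmatrix} 1 & \frac{1}{2} \\ 1 & 1 \end{bmatrix}$

In [34]: D

Out[34]: $\begin{bmatrix} \frac{a}{2} + \frac{b}{2} - \frac{1}{2} \sqrt{(a-b)^2} & 0 \\ 0 & \frac{a}{2} + \frac{b}{2} + \frac{1}{2} \sqrt{(a-b)^2} \end{bmatrix}$

In [35]: D = Matrix([[b, 0], [0, a]])
         D

Out[35]: $\begin{bmatrix} b & 0 \\ 0 & a \end{bmatrix}$

For the values given, we have the following

In [36]: C = Matrix([[-1, 0], [0, -1]])
         C

Out[36]: $\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$

In [37]: S, D = C.diagonalize()

In [38]: D

Out[38]: $\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$

In [39]: D ** 100

Out[39]: $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

In [40]: S * (D ** 100) * S.inv()

Out[40]: $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

Doing the same, but with eigenvalues and eigenvectors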




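Out[41]: $\begin{bmatrix} -a + 2b & a - b \\ -2a + 2b & 2a - b \end{bmatrix}$

In [42]: C.eigenvals()

Out[42]: $\left\{ \frac{a}{2} + \frac{b}{2} - \frac{1}{2}\sqrt{(a - b)^2} : 1, \; \frac{a}{2} + \frac{b}{2} + \frac{1}{2}\sqrt{(a - b)^2} : 1 \right\}$

This simplifies to $\lambda_1 = b$ and $\lambda_2 = a$
That makes Λ (or D) the following

In [43]: D = Matrix([[b, 0], [0, a]])
         D

Out[43]: $\begin{bmatrix} b & 0 \\ 0 & a \end{bmatrix}$

In [44]: C.eigenvects() # The solution is a list of two tuples, each holding an eigenvalue, its multiplicity and its eigenvector

Out[44]: $\left[ \left( \frac{a}{2} + \frac{b}{2} - \frac{1}{2}\sqrt{(a - b)^2}, \; 1, \; \left[ \begin{bmatrix} -\frac{a - b}{-\frac{3a}{2} + \frac{3b}{2} + \frac{1}{2}\sqrt{(a - b)^2}} \\ 1 \end{bmatrix} \right] \right), \; \left( \frac{a}{2} + \frac{b}{2} + \frac{1}{2}\sqrt{(a - b)^2}, \; 1, \; \left[ \begin{bmatrix} -\frac{a - b}{-\frac{3a}{2} + \frac{3b}{2} - \frac{1}{2}\sqrt{(a - b)^2}} \\ 1 \end{bmatrix} \right] \right) \right]$

This simplifies to the following eigenvector matrix S

Out[45]: $\begin{bmatrix} 1 & \frac{1}{2} \\ 1 & 1 \end{bmatrix}$

We can see if we can get back to C

In [46]: S * D * S.inv()

Out[46]: $\begin{bmatrix} -a + 2b & a - b \\ -2a + 2b & 2a - b \end{bmatrix}$

In [47]: S * D * S.inv() == C

Out[47]: True

Python™ won't do $D^k$ for you, but it's easy to do yourself

In [48]: D = Matrix([[b ** k, 0], [0, a ** k]])
         D

Out[48]: $\begin{bmatrix} b^k & 0 \\ 0 & a^k \end{bmatrix}$

Now we can compute $S \Lambda^k S^{-1}$

In [49]: S * D * S.inv()

Out[49]: $\begin{bmatrix} -a^k + 2b^k & a^k - b^k \\ -2a^k + 2b^k & 2a^k - b^k \end{bmatrix}$

Placing the given values into this equation will give you the same solution for $C^{100}$ as above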
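The closed form for $C^k$ can be sanity-checked against direct matrix powers. The following is a minimal sketch, assuming the simplified S and Λ from the solution above; the names `C_k`, `first_power_matches`, `third_power_matches` and `C_100` are illustrative and not part of the original notebook.

```python
from sympy import Matrix, Rational, symbols, zeros, eye

a, b = symbols('a b')
k = symbols('k', integer=True, positive=True)

# The matrix from the example problem and its simplified eigen-decomposition
C = Matrix([[2 * b - a, a - b], [2 * b - 2 * a, 2 * a - b]])
S = Matrix([[1, Rational(1, 2)], [1, 1]])   # eigenvectors as columns
D_k = Matrix([[b ** k, 0], [0, a ** k]])    # Lambda raised to the power k

# Closed form C^k = S * Lambda^k * S^(-1)
C_k = (S * D_k * S.inv()).expand()

# Compare the closed form with direct powers for k = 1 and k = 3
first_power_matches = (C_k.subs(k, 1) - C).expand() == zeros(2, 2)
third_power_matches = (C_k.subs(k, 3) - C ** 3).expand() == zeros(2, 2)

# The requested case a = b = -1, k = 100
C_100 = C_k.subs({a: -1, b: -1, k: 100})
```

Under these assumptions both comparisons hold and `C_100` is the 2×2 identity, in agreement with the result obtained from `D ** 100`.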

2015/03/28, 1:31 PM I_10_Independence_Spanning_Basis_Dimension

This notebook is part of lecture 9 Independence, basis and dimension in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper






init_printing(use_latex = 'mathjax') # Pretty LaTeX printing to the screen
#%matplotlib inline
filterwarnings('ignore')

Independence
Spanning
Basis
Dimension

Independence

Vectors are linearly independent if
No combination of these vectors results in the zero vector (except the zero combination, in which every coefficient is zero)

$$c_1 x_1 + c_2 x_2 + \dots + c_n x_n \neq 0 \quad \text{unless every } c_i = 0$$

In 2-space, this means that they should not be on the same line through the origin
In 3-space they should not be on the same line through the origin or on a plane through the origin
In higher-dimensional space they should not be on the same line through the origin or a hyperplane through the origin

If they are independent by the constraints above, only the zero combination of them will result in the zero vector
If there are vectors in the nullspace (apart from the zero vector), then there is a linear combination that gives zero and the vectors are not linearly independent

In [3]: A = Matrix([[1, 2, 4], [3, 1, 4]])
        A

Out[3]: $\begin{bmatrix} 1 & 2 & 4 \\ 3 & 1 & 4 \end{bmatrix}$

Here we will have a rank of 2 (2 pivots), 3 unknowns and 2 rows
Thus, r = m (full row rank)
We are left with n - r free variables, i.e. 3 - 2 = 1, and will have one vector in the nullspace

In [4]: A.rref()

Out[4]: $\left( \begin{bmatrix} 1 & 0 & \frac{4}{5} \\ 0 & 1 & \frac{8}{5} \end{bmatrix}, \; [0, 1] \right)$




Out[5]: $\left[ \begin{bmatrix} -\frac{4}{5} \\ -\frac{8}{5} \\ 1 \end{bmatrix} \right]$

Another way to state independence

Consider the columns of the matrix A as vectors $v_1, v_2, \dots, v_n$

If r = n then the nullspace only contains the zero vector and the column vectors are linearly independent

Spanning

If we have a set of vectors, then all their linear combinations (including the zero combination) span a subspace (in this instance a column space)

We are particularly interested in a set of (column) vectors (in a matrix) that are linearly independent and span a subspace
This leads us to the next topic, basis

Basis

A set of vectors (in a space W) with the properties
They are linearly independent
They span the space (linear combinations of them fill the space)

Up until now we looked at columns in a matrix A
It is more common in textbooks to look at a space first and ask about basis vectors, spanning vectors, dimension, etc

So let's look at $\mathbb{R}^3$
The obvious set of basis vectors are

$$i, j, k$$

What about

$$\begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 2 \\ 2 \\ 5 \end{bmatrix}$$

So, are they linearly independent and do they span $\mathbb{R}^3$?

In [6]: A = Matrix([[1, 2], [1, 2], [2, 5]])
        A

Out[6]: $\begin{bmatrix} 1 & 2 \\ 1 & 2 \\ 2 & 5 \end{bmatrix}$

Here we will have r = 2, n = 2 and thus (n - r = 0) only the zero vector in the nullspace

Out[7]: []

In [8]: A.rref()

Out[8]: $\left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}, \; [0, 1] \right)$



For now, our intuition is that they will not span $\mathbb{R}^3$
This intuition is correct, because all their linear combinations will only fill a plane through the origin
The zero combination does result in the zero vector, though, so they do fill a subspace of $\mathbb{R}^3$
Some textbooks refer to this as V = $\mathbb{R}^n$, with W a subspace of V

If we added a column vector that is a linear combination of these, it will also fall in the plane and thus not be linearly independent (there will be a vector in the nullspace other than the zero vector)

In [9]: A = Matrix([[1, 2, 3], [1, 2, 3], [2, 5, 7]])
        A.nullspace()

Out[9]: $\left[ \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix} \right]$

In [10]: A.rref()

Out[10]: $\left( \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \; [0, 1] \right)$

Indeed, we have a column without a pivot and thus a free variable

Let's add another, such that we have

In [11]: A = Matrix([[1, 2, 3], [1, 2, 3], [2, 5, 8]])
         A.rref()

Out[11]: $\left( \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix}, \; [0, 1] \right)$

Again, a column without a pivot and sure enough, we'll find a vector (other than the zero vector) in the nullspace

Out[12]: $\left[ \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} \right]$

The special case of a square matrix

If we now end up with a square matrix, we need only look at its determinant, i.e., is it invertible

In [13]: A.det() # .det() calculates the determinant

Out[13]: 0

Indeed the determinant is zero as expected

Dimension

Given a (sub)space, every basis for that (sub)space has the same number of vectors (there is usually more than one basis for every (sub)space)
This number is called the dimension of the (sub)space

Important point to remember

The (sub)space which a set of (column) vectors (matrix of coefficients, A) span is the set of possible b-values, i.e. the right-hand sides for which Ax = b can be solved
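The three ideas above (independence, spanning and basis) all reduce to comparing the rank with the shape of the matrix. The following is a minimal sketch, assuming the vectors are placed as columns of a SymPy matrix; `describe_columns` is a hypothetical helper written for illustration, not a SymPy function.

```python
from sympy import Matrix, eye

def describe_columns(A):
    """Summarise the columns of A using only the rank and the shape."""
    m, n = A.shape
    r = A.rank()
    return {
        'rank': r,
        'independent': r == n,       # no free variables, N(A) holds only the zero vector
        'spans_R_m': r == m,         # a pivot in every row, every b in R^m can be reached
        'basis_of_R_m': r == n == m  # square and invertible
    }

two_columns = describe_columns(Matrix([[1, 2], [1, 2], [2, 5]]))
three_columns = describe_columns(Matrix([[1, 2, 3], [1, 2, 3], [2, 5, 8]]))
standard_basis = describe_columns(eye(3))
```

For the two-column matrix the columns are independent but do not span 3-space; adding the third column keeps the rank at 2, so the columns become dependent and still only fill a plane.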



More examples

Example problem

Consider the column space

In [14]: A = Matrix([[1, 2, 3, 1], [1, 1, 2, 1], [1, 2, 3, 1]])

In [15]: A

Out[15]: $\begin{bmatrix} 1 & 2 & 3 & 1 \\ 1 & 1 & 2 & 1 \\ 1 & 2 & 3 & 1 \end{bmatrix}$

There are n = 4 unknowns and m = 3 equations
We note that column 1 = column 4
We note that with 4 unknowns we are dealing with $\mathbb{R}^4$
In essence, there are at most three independent columns, thus the columns of the matrix cannot be a basis for $\mathbb{R}^4$
It is possible for them to span the column space (don't get confused by column space ($\mathbb{R}^4$) and matrix here)

Out[16]: $\left[ \begin{bmatrix} -1 \\ -1 \\ 1 \\ 0 \end{bmatrix}, \; \begin{bmatrix} -1 \\ 0 \\ 0 \\ 1 \end{bmatrix} \right]$

In [17]: A.rref()

Out[17]: $\left( \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \; [0, 1] \right)$

As we can see here, columns three and four have free variables, i.e. no pivots

$$x_1 + 0x_2 + x_3 + x_4 = 0$$
$$0x_1 + 1x_2 + x_3 + 0x_4 = 0$$
$$x_4 = c_2, \quad x_3 = c_1$$
$$\therefore x_2 = -c_1$$
$$\therefore x_1 = -c_1 - c_2$$

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} -c_1 - c_2 \\ -c_1 \\ c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} -c_1 \\ -c_1 \\ c_1 \\ 0 \end{bmatrix} + \begin{bmatrix} -c_2 \\ 0 \\ 0 \\ c_2 \end{bmatrix} = c_1 \begin{bmatrix} -1 \\ -1 \\ 1 \\ 0 \end{bmatrix} + c_2 \begin{bmatrix} -1 \\ 0 \\ 0 \\ 1 \end{bmatrix}$$

The rank of the matrix is two (it is the number of pivot columns)
The column space thus has two basis vectors (column vectors 1 and 2) and we say the dimension of this space is two
Remember, a matrix has a rank, which is the dimension of its column space (the column space representing the space 'produced' by the column vectors)
We talk about the rank of a matrix, rank(A), and the column space of a matrix, C(A)

In summary we have two basis vectors above (they span a space)
Any two vectors in this space that are not linearly dependent will also span it, they can't help but to
dim C(A) = r
The nullspace will have n - r basis vectors (the dimension of the nullspace equals the number of free variables)
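The hand calculation above (set one free variable to 1, the others to 0, and read the pivot variables off the reduced matrix) can be written as a short routine. This is a minimal sketch under that recipe; `special_solutions` is a hypothetical helper name, and it is only meant to mirror the steps shown, not to replace `A.nullspace()`.

```python
from sympy import Matrix, zeros

def special_solutions(A):
    """Build one nullspace vector per free column from the reduced row echelon form."""
    R, pivots = A.rref()
    n = A.cols
    free_columns = [j for j in range(n) if j not in pivots]
    basis = []
    for f in free_columns:
        x = zeros(n, 1)
        x[f] = 1                      # this free variable is set to 1, the others stay 0
        for row, p in enumerate(pivots):
            x[p] = -R[row, f]         # each pivot row reads x_p + R[row, f] * x_f = 0
        basis.append(x)
    return basis

A = Matrix([[1, 2, 3, 1], [1, 1, 2, 1], [1, 2, 3, 1]])
by_hand = special_solutions(A)
```

Each vector in `by_hand` is sent to zero by A, and there are n - r = 2 of them, matching the two vectors multiplied by $c_1$ and $c_2$ in the derivation.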

2015/03/28, 1:38 PM I_11_Subspaces

This notebook is part of lecture 10 The four fundamental subspaces in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper




In [ ]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())

In [ ]: #import numpy as np
        from sympy import init_printing, Matrix, symbols
        #import matplotlib.pyplot as plt
        #import seaborn as sns
        #from IPython.display import Image
        from warnings import filterwarnings


The four fundamental subspaces
Introducing the matrix space

The four fundamental subspaces

Columnspace, C(A)
Nullspace, N(A)
Rowspace
  All linear combinations of rows
  All the linear combinations of the columns of $A^T$, $C(A^T)$
Nullspace of $A^T$, $N(A^T)$ (the left nullspace of A)

Where are these spaces for a matrix $A_{m \times n}$?

C(A) is in $\mathbb{R}^m$
N(A) is in $\mathbb{R}^n$
$C(A^T)$ is in $\mathbb{R}^n$
$N(A^T)$ is in $\mathbb{R}^m$

Calculating basis and dimension

For C(A)

A basis is given by the pivot columns
The dimension is the rank r

For N(A)

A basis is given by the special solutions (one for every free variable, n - r)
The dimension is n - r

For $C(A^T)$

If A undergoes row reduction to row echelon form (R), then C(R) ≠ C(A), but rowspace(R) = rowspace(A) (or $C(R^T) = C(A^T)$)
A basis for the rowspace of A (or R) is the first r rows of R

So we row reduce A and take the pivot rows and transpose them
The dimension is also equal to the rank r

For $N(A^T)$

It is also called the left nullspace, because the vector ends up on the left (as seen below)
Here we have $A^T y = 0$

$$y^T (A^T)^T = 0^T$$
$$y^T A = 0^T$$

A basis is (again) given by the special solutions, this time of $A^T y = 0$ (after row reduction of $A^T$)

The dimension is m - r
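The four recipes above can be collected into one routine. This is a minimal sketch, assuming SymPy's `rref`, `rank` and `nullspace` methods; the function name `four_subspaces` and the dictionary keys are illustrative choices, not part of the original notebook.

```python
from sympy import Matrix

def four_subspaces(A):
    """Return a basis (list of column vectors) for each fundamental subspace of A."""
    R, pivots = A.rref()
    r = A.rank()
    return {
        'column_space': [A[:, j] for j in pivots],      # pivot columns of A itself
        'nullspace': A.nullspace(),                     # special solutions of Ax = 0
        'row_space': [R[i, :].T for i in range(r)],     # pivot rows of R, transposed
        'left_nullspace': A.T.nullspace(),              # special solutions of A^T y = 0
    }

A = Matrix([[1, 2, 3, 1], [1, 1, 2, 1], [1, 2, 3, 1]])
spaces = four_subspaces(A)
dimensions = {name: len(basis) for name, basis in spaces.items()}
```

For this 3×4 matrix of rank 2 the dimensions come out as r = 2, n - r = 2, r = 2 and m - r = 1, in the order listed above.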

Example problems

Consider this example matrix and calculate the bases and dimension for all four fundamental spaces

In [ ]: A = Matrix([[1, 2, 3, 1], [1, 1, 2, 1], [1, 2, 3, 1]]) # We note that rows 1 and 3 are identical, that
        # column 3 is the sum of columns 1 and 2, and that column 1 equals column 4
        A

Out[ ]: $\begin{bmatrix} 1 & 2 & 3 & 1 \\ 1 & 1 & 2 & 1 \\ 1 & 2 & 3 & 1 \end{bmatrix}$

Columnspace

In [ ]: A.rref() # Remember that the pivot columns form a basis for the columnspace

Out[ ]: $\left( \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \; [0, 1] \right)$

The pivot columns are columns 1 and 2, so the basis is given by these columns of A (the original matrix, not the reduced one):

$$\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}$$

It is indeed in $\mathbb{R}^3$ (rows of A = m = 3, i.e. each column vector is in 3-space or has 3 components)

The rank (number of columns with pivots) is 2, thus dim C(A) = 2

Nullspace

In [ ]: A.nullspace() # Calculating the nullspace vectors

Out[ ]: $\left[ \begin{bmatrix} -1 \\ -1 \\ 1 \\ 0 \end{bmatrix}, \; \begin{bmatrix} -1 \\ 0 \\ 0 \\ 1 \end{bmatrix} \right]$

So, indeed the basis is in $\mathbb{R}^4$ (A has n = 4 columns)

In [ ]: A.rref() # No pivots for columns 3 and 4

Out[ ]: $\left( \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \; [0, 1] \right)$

The dimension is two (there are 2 column vectors, which is indeed n - r = 4 - 2 = 2)

Rowspace $C(A^T)$

So we are looking for a basis for the columnspace of $A^T$

In [ ]: A.rref()

Out[ ]: $\left( \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \; [0, 1] \right)$

The pivot rows are rows 1 and 2
We take them and transpose them

$$\begin{bmatrix} 1 \\ 0 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}$$

As stated above, it is in $\mathbb{R}^4$

The dimension is equal to the rank, r = 2

Nullspace of $A^T$

In [ ]: A.transpose().nullspace()

Out[ ]: $\left[ \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \right]$

Which is indeed in $\mathbb{R}^3$

The dimension is 1, since m - r = 3 - 2 = 1 (remember that the rank is the number of pivot columns)

Consider this example matrix (in LU form) and calculate the bases and dimension for all four fundamental spaces

In [ ]: L = Matrix([[1, 0, 0], [2, 1, 0], [-1, 0, 1]])
        U = Matrix([[5, 0, 3], [0, 1, 1], [0, 0, 0]])
        A = L * U
        L, U, A

Out[ ]: $\left( \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}, \; \begin{bmatrix} 5 & 0 & 3 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \; \begin{bmatrix} 5 & 0 & 3 \\ 10 & 1 & 7 \\ -5 & 0 & -3 \end{bmatrix} \right)$

Columnspace of A

In [ ]: A.rref()

Out[ ]: $\left( \begin{bmatrix} 1 & 0 & \frac{3}{5} \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \; [0, 1] \right)$

The pivot columns are columns 1 and 2, so the basis is given by these columns of A:

$$\begin{bmatrix} 5 \\ 10 \\ -5 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$$

Another basis would be the pivot columns of L:

$$\begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$$

It is in $\mathbb{R}^3$, since m = 3
It has a rank of 2 (two pivot columns)
Since the dimension of the columnspace is equal to the rank, dim C(A) = 2

Note that it is also equal to the number of pivot columns in U

Nullspace of A

In [ ]: A.nullspace()

Out[ ]: $\left[ \begin{bmatrix} -\frac{3}{5} \\ -1 \\ 1 \end{bmatrix} \right]$

The nullspace is in $\mathbb{R}^3$, since n = 3
The basis is the special solution(s), which is one column vector for every free variable

Since we only have a single free variable, we have a single nullspace column vector
This fits in with the fact that it needs to be n - r
It can also be calculated by taking U, setting the free variable to 1 and solving for the other variables by setting each row equal to zero

The dimension of the nullspace is also 1 (n - r, i.e. a single column)
It is also the number of free variables

The rowspace

This is the columnspace of $A^T$
Don't take the transpose first!
Row reduce, identify the rows with pivots and transpose them

In [ ]: A.rref()

Out[ ]: $\left( \begin{bmatrix} 1 & 0 & \frac{3}{5} \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \; [0, 1] \right)$

The basis can also be written down by identifying the rows with pivots in U and writing them down as columns (getting their transpose)

$$\begin{bmatrix} 5 \\ 0 \\ 3 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$$

It is in $\mathbb{R}^3$, since n = 3
The rank r = 2, which is equal to the dimension, i.e. dim $C(A^T)$ = 2

The nullspace of $A^T$

In [ ]: A.transpose().nullspace()

Out[ ]: $\left[ \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \right]$

It is indeed in $\mathbb{R}^3$, since m = 3
A good way to do it is to take the inverse of L, such that $L^{-1} A = U$

Now the zero row in U is row three
Take the corresponding row in $L^{-1}$ and transpose it

The dimension is m - r = 3 - 2 = 1

The matrix space

Square matrices of a given size also form a 'vector' space, because they obey the vector space rules of addition and scalar multiplication
Subspaces (of same) would include

Upper triangular matrices
Symmetric matrices
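That symmetric and upper triangular matrices are subspaces comes down to closure under addition and scalar multiplication. A minimal symbolic sketch for the 2×2 case follows; the symbol names and the combined check `s * M1 + M2` are choices made here for illustration only.

```python
from sympy import Matrix, symbols

a, b, c, d, e, f, s = symbols('a b c d e f s')

# Two generic symmetric matrices and two generic upper triangular matrices
S1 = Matrix([[a, b], [b, c]])
S2 = Matrix([[d, e], [e, f]])
U1 = Matrix([[a, b], [0, c]])
U2 = Matrix([[d, e], [0, f]])

# A scalar multiple plus another member covers both closure rules at once
symmetric_combination = s * S1 + S2
upper_combination = s * U1 + U2

stays_symmetric = symmetric_combination == symmetric_combination.T
stays_upper = upper_combination.is_upper
```

Both checks hold for arbitrary symbols, which is the content of the subspace claim; the same argument applies to 3×3 matrices.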

2015/03/28, 1:35 PM I_12_Matrix_spaces

This notebook is part of lecture 11 Matrix spaces, rank 1, small world graphs in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper







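Matrix spaces

New vector / matrix spaces

Square matrices

Consider M to be all 3×3 matrices (with real elements)

Subspaces would be:
Upper or lower triangular matrices
Symmetric matrices

Basis would be:

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix},$$
$$\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

The dimension would be 9

For upper and lower triangular matrices the dimension would be 6 and the basis (shown here for the upper triangular case):

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

For symmetric matrices the dimension would also be six (knowing the diagonal and the entries on one of the two sides determines the matrix)

These are unique cases where the bases for the subspaces are contained in the basis of the 3×3 matrix M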



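Other square matrices that are subspaces of M

The intersection of symmetric and upper triangular matrices (that is symmetric AND upper triangular, S∩U)
These are the diagonal matrices
The dimension is 3
The basis is

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

The union of symmetric and upper triangular matrices (that is symmetric OR upper triangular, S∪U)
It is NOT a subspace

The addition (sum) of symmetric and upper triangular matrices
It IS a subspace
It is actually all 3×3 matrices
The dimension is 9

This gives the equation: dim(S) + dim(U) = dim(S∩U) + dim(S+U) = 12

Example problems

Example problem 1

Show that the set of 2×3 matrices whose nullspace contains the column vector below is a vector subspace and find a basis for it

$$\begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix}$$

Solution

In essence we have to show the following

$$A \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

... and ...

$$B \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

This can be shown by addition:

$$(A + B) \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} = A \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} + B \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

Therefore the sum of two matrices in the set still has the vector in its nullspace, i.e. the set is closed under addition

We also need to look at scalar multiplication (if we multiply a matrix in the set by a scalar, does it remain in the set)

$$(cA) \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} = c \left( A \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} \right) = c \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

With closure under both addition and scalar multiplication, the set is a vector subspace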



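Example problem 2

Find a basis for the subspace above

Solution

Let's look at the first row:

$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \end{bmatrix} \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} = 0$$

$$2a_{11} + a_{12} + a_{13} = 0$$
$$a_{13} = -2a_{11} - a_{12}$$

From this we can make the following row vectors

$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & (-2a_{11} - a_{12}) \end{bmatrix}$$
$$= \begin{bmatrix} a_{11} & 0 & -2a_{11} \end{bmatrix} + \begin{bmatrix} 0 & a_{12} & -a_{12} \end{bmatrix}$$
$$= a_{11} \begin{bmatrix} 1 & 0 & -2 \end{bmatrix} + a_{12} \begin{bmatrix} 0 & 1 & -1 \end{bmatrix}$$

From this (and the same argument for the second row) we can construct 4 basis matrices:

$$\begin{bmatrix} 1 & 0 & -2 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & -2 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & -1 \end{bmatrix}$$

Example problem 3

What about the set of those whose column space contains the following column vector?

$$\begin{bmatrix} 2 \\ 1 \end{bmatrix}$$

Solution

Well, any subspace must contain the zero matrix

$$\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$

The column space of the zero matrix does not contain the above column vector, so the zero matrix is not in the set, which is therefore not a subspace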
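Returning to Example problem 2, the four basis matrices can be checked mechanically. This is a minimal sketch, assuming each 2×3 matrix is flattened into a 6-component column so that independence becomes a rank question; the variable names are illustrative.

```python
from sympy import Matrix, zeros

v = Matrix([2, 1, 1])

# The four proposed basis matrices from Example problem 2
basis = [
    Matrix([[1, 0, -2], [0, 0, 0]]),
    Matrix([[0, 1, -1], [0, 0, 0]]),
    Matrix([[0, 0, 0], [1, 0, -2]]),
    Matrix([[0, 0, 0], [0, 1, -1]]),
]

# Every basis matrix must send v to the zero vector
all_contain_v_in_nullspace = all(B * v == zeros(2, 1) for B in basis)

# Flatten each matrix to a 6x1 column and stack the columns side by side
stacked = Matrix.hstack(*[B.reshape(6, 1) for B in basis])
number_independent = stacked.rank()
```

The rank of the stacked matrix is 4, so the four matrices are independent. This agrees with a count of constraints: a 2×3 matrix has 6 free entries and the condition on v removes one per row, leaving a subspace of dimension 4.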

2015/03/28, 1:43 PM I_13_Graphs_Incidence_matrices_Kirchhoff_laws

This notebook is part of lecture 12 Graphs, networks, and incidence matrices in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper





In [2]: from sympy import init_printing, symbols, Matrix
        from warnings import filterwarnings
        from IPython.display import Image


Graphs and networks
Incidence matrices
Kirchhoff's laws

This lecture is about the application of matrices

Graphs and networks

In this instance we refer to nodes and their connections, called edges
Consider the graph below:

In [4]: Image(filename = 'Graph1.png')

Out[4]: [figure: Graph1.png]

We will call the nodes n (columns), in this case n = 4
The edges (connections) will be called m (rows), with m = 5 in this case
This will give us an m×n = 5×4 matrix
We will have to give a direction to every edge

The incidence matrix

This corresponds to the graph above



In [5]: A = Matrix([[-1, 1, 0, 0], [0, -1, 1, 0], [-1, 0, 1, 0], [-1, 0, 0, 1], [0, 0, -1, 1]])
        A
        # For each row (edge) look only at that edge (line)
        # In the case of row (edge, line) 1, the arrow points away from node 1, hence the first -1 in the matrix
        # The arrow points towards node 2, hence the 1
        # It does not point to nodes 3 and 4, hence the 0's

Out[5]: $\begin{bmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ -1 & 0 & 1 & 0 \\ -1 & 0 & 0 & 1 \\ 0 & 0 & -1 & 1 \end{bmatrix}$

Edges 1, 2, and 3 form a loop
Notice for the first loop (edges 1, 2, and 3) the corresponding third row is a linear combination of rows 1 and 2
Intuitively, you can see that you can reach node 3 from node 1 by a combination of edges (rows) 1 and 2

In [6]: A.rref()

Out[6]: $\left( \begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \; [0, 1, 2] \right)$

We note that we have three pivot columns, hence a rank, r = 3
We have one column without a pivot and will thus have one vector in the nullspace (n - r = 4 - 3 = 1)

Out[7]: $\left[ \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \right]$

The nullspace is one dimensional and includes all scalar multiples of this vector
The meaning in our example is that nothing will happen when the solutions fall on this line in 4-dimensional space, i.e. no current will flow

If you think of the solution x and every component of x being a potential at a node, the matrix multiplication Ax gives you the potential differences along the edges
The nullspace would then be the solution where all the potential differences are 0

In [8]: x1, x2, x3, x4 = symbols('x1, x2, x3, x4')

In [9]: x_vect = Matrix([x1, x2, x3, x4])
        x_vect

Out[9]: $\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}$

In [10]: A * x_vect

Out[10]: $\begin{bmatrix} -x_1 + x_2 \\ -x_2 + x_3 \\ -x_1 + x_3 \\ -x_1 + x_4 \\ -x_3 + x_4 \end{bmatrix}$

For the nullspace, each row now equals 0 (the potential difference between two nodes)

Let's look at the row space and the nullspace of the row picture
We know how to get the rowspace by transposing the rows that contain pivots



In [11]: A_row = Matrix([[1, 0, 0, -1], [0, 1, 0, -1], [0, 0, 1, -1]]).transpose()
         A_row

Out[11]: $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ -1 & -1 & -1 \end{bmatrix}$

In [12]: A

Out[12]: $\begin{bmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ -1 & 0 & 1 & 0 \\ -1 & 0 & 0 & 1 \\ 0 & 0 & -1 & 1 \end{bmatrix}$

Out[13]: $\begin{bmatrix} -1 & 0 & -1 & -1 & 0 \\ 1 & -1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}$

In [14]: A.transpose().rref()

Out[14]: $\left( \begin{bmatrix} 1 & 0 & 1 & 0 & -1 \\ 0 & 1 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \; [0, 1, 3] \right)$

Note how the pivot columns are columns 1, 2, and 4
These represent edges 1, 2, 4
Note (from the graph above) that they are independent as they are not a part of a loop
A graph without a loop (with 1 less edge than nodes) is called a tree
$A^T$ has a nullspace of

In [15]: A.transpose().nullspace()

Out[15]: $\left[ \begin{bmatrix} -1 \\ -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \; \begin{bmatrix} 1 \\ 1 \\ 0 \\ -1 \\ 1 \end{bmatrix} \right]$

The dimension of the nullspace of $A^T$ is m - r, i.e. number of loops = number of edges - (number of nodes - 1)
∴ number of nodes - number of edges + number of loops = 1
This is Euler's formula and works for all connected graphs
It tells you how many independent loops there are

There is a connection between potentials and currents
With 5 edges we will have 5 currents, which we can represent as a vector y

$$\underline{y} = \begin{bmatrix} y_1 & y_2 & y_3 & y_4 & y_5 \end{bmatrix}$$

This relationship is Ohm's law

Kirchhoff's law

By the way, Kirchhoff's current law is: $A^T y = 0$
We can look at it in the following way




Out[16]: $\begin{bmatrix} -1 & 0 & -1 & -1 & 0 \\ 1 & -1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}$

In [17]: y1, y2, y3, y4, y5 = symbols('y1, y2, y3, y4, y5')

In [18]: y_vect = Matrix([y1, y2, y3, y4, y5])
         y_vect

Out[18]: $\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{bmatrix}$

In [19]: A.transpose() * y_vect

Out[19]: $\begin{bmatrix} -y_1 - y_3 - y_4 \\ y_1 - y_2 \\ y_2 + y_3 - y_5 \\ y_4 + y_5 \end{bmatrix}$

Row 1 (setting it equal to 0 and looking at the graph above) tells us that current flows out from node 1 on all these 3 edges
For row 2 (doing the same as above) we note that for node 2 current flows towards it on edge $y_1$ and away from it along edge $y_2$
For row 3 we note that current flows towards node three along edges 2 (edge $y_2$) and 3 (edge $y_3$) and away from it along edge 5 (edge $y_5$)
For row 4 we note that current flows towards it along edges 4 (edge $y_4$) and 5 (edge $y_5$)

Look back at the nullspace of $A^T$
The two basis vectors show the flow of current that will allow for NO current to accumulate at a node
In this example, current flowed along the loop of edges 1, 2, and 3 (with nothing along 4 and 5)
The other solution would be current flowing all along the periphery, with nothing along 3
These are the basis vectors of the nullspace
Another valid basis would include flow along the upper loop
Notice that the nullspace is two dimensional as (between the 3 flows explained above) one is a linear combination of the other two

Putting it all together

All of the above can be stated as follows

$$\underline{e} = A \underline{x}$$
$$\underline{y} = C \underline{e}$$
$$A^T \underline{y} = \underline{f}$$

Where
e is the potential differences
f is an external current in Kirchhoff's law

This gives us the fundamental equation for applications as stated here

$$A^T C A \underline{x} = \underline{f}$$

These equations are for equilibrium (no Newton's law, no time)

Remember that $A^T A$ is always symmetric

Example problem

Example problem 1



In [20]: Image(filename = 'Graph2.png')

Out[20]: [figure: Graph2.png]

Calculate the incidence matrix A
Calculate the nullspaces of A and $A^T$
Calculate the trace of $A^T A$

Solution

In [21]: A = Matrix([[-1, 1, 0, 0, 0], [0, -1, 1, 0, 0], [-1, 0, 1, 0, 0], [0, -1, 0, 1, 0], [0, 0, 0, -1, 1], [0, 0, 1, 0, -1]])
         A

Out[21]: $\begin{bmatrix} -1 & 1 & 0 & 0 & 0 \\ 0 & -1 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 \\ 0 & -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 & 1 \\ 0 & 0 & 1 & 0 & -1 \end{bmatrix}$

In [22]: A.rref()

Out[22]: $\left( \begin{bmatrix} 1 & 0 & 0 & 0 & -1 \\ 0 & 1 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \; [0, 1, 2, 3] \right)$

We note that we have 4 independent columns
The dimension of the nullspace will be n - r = 5 - 4 = 1
We will let $x_5 = s$, then from the row-reduced echelon form above we have

$$x_1 - x_5 = 0$$
$$x_2 - x_5 = 0$$
$$x_3 - x_5 = 0$$
$$x_4 - x_5 = 0$$

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} = s \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}$$

Out[23]: $\left[ \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \right]$

It represents the potential difference between all nodes being zero: Ax = 0
This means that the potential at all nodes must be the same constant



In [24]: A.transpose().nullspace()

Out[24]: $\left[ \begin{bmatrix} -1 \\ -1 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \; \begin{bmatrix} 0 \\ -1 \\ 0 \\ 1 \\ 1 \\ 1 \end{bmatrix} \right]$

It is of dimension 2, as there are two independent loops
As per Euler's formula

nodes - edges + loops = 1
5 - 6 + 2 = 1

This tells us about current that needs to flow so as not to accumulate current at a node
It therefore indicates the independent loops
It works out beautifully

Look at the two loops and assign flow as per the two vector columns for each edge and you will see perfect flow along either of the two independent loops with no current accumulating at any node

We could calculate it from the row-reduced echelon form of $A^T$

In [25]: A.transpose().rref()

Out[25]: $\left( \begin{bmatrix} 1 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \; [0, 1, 3, 4] \right)$

This gives us 4 independent columns, with free variables $y_3$ and $y_6$

$$y_6 = s, \quad y_3 = t$$
$$y_1 + y_3 = y_1 + t = 0 \quad \therefore y_1 = -t$$
$$y_2 + y_3 + y_6 = 0 \quad \therefore y_2 = -s - t$$
$$y_4 - y_6 = y_4 - s = 0 \quad \therefore y_4 = s$$
$$y_5 - y_6 = y_5 - s = 0 \quad \therefore y_5 = s$$

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \end{bmatrix} = \begin{bmatrix} -t \\ -s - t \\ t \\ s \\ s \\ s \end{bmatrix} = \begin{bmatrix} 0 \\ -s \\ 0 \\ s \\ s \\ s \end{bmatrix} + \begin{bmatrix} -t \\ -t \\ t \\ 0 \\ 0 \\ 0 \end{bmatrix} = s \begin{bmatrix} 0 \\ -1 \\ 0 \\ 1 \\ 1 \\ 1 \end{bmatrix} + t \begin{bmatrix} -1 \\ -1 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$

In [26]: A.transpose() * A

Out[26]: $\begin{bmatrix} 2 & -1 & -1 & 0 & 0 \\ -1 & 3 & -1 & -1 & 0 \\ -1 & -1 & 3 & 0 & -1 \\ 0 & -1 & 0 & 2 & -1 \\ 0 & 0 & -1 & -1 & 2 \end{bmatrix}$

In [27]: (A.transpose() * A).trace()

Out[27]: 12

The degree of a node is the number of edges it has
Look at the columns of the incidence matrix A
Every non-trivial (non-zero) entry represents an edge
Note that there are 2 in column 1

This gives us a degree of 2, which will also be the first entry on the diagonal of $A^T A$
Column 2 has 3 entries representing 3 edges from node 2 and an entry of 3 on the diagonal of $A^T A$
... and so on
The trace is therefore just the sum of the degrees of all the nodes
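The two counting facts used in this example (Euler's formula and the trace of $A^T A$ as the sum of the node degrees) can be checked from an edge list. This is a minimal sketch: `incidence_matrix` is a hypothetical helper, and the edge list is read off the rows of the matrix A given in the solution (start node, end node), not taken from the image file.

```python
from sympy import Matrix, zeros

def incidence_matrix(edges, n_nodes):
    """One row per directed edge: -1 at the start node, +1 at the end node."""
    A = zeros(len(edges), n_nodes)
    for row, (start, end) in enumerate(edges):
        A[row, start - 1] = -1
        A[row, end - 1] = 1
    return A

# Directed edges of the second graph, as encoded in the rows of A above
edges = [(1, 2), (2, 3), (1, 3), (2, 4), (4, 5), (5, 3)]
A = incidence_matrix(edges, 5)

m, n = A.shape                      # m edges, n nodes
r = A.rank()                        # n - 1 for a connected graph
loops = m - r                       # dimension of the nullspace of A^T
euler = n - m + loops               # nodes - edges + loops

# Degree of a node = number of non-zero entries in its column
degrees = [sum(abs(A[i, j]) for i in range(m)) for j in range(n)]
trace = (A.T * A).trace()
```

With these edges the sketch gives 2 independent loops, Euler's formula evaluates to 1, and the degrees 2, 3, 3, 2, 2 sum to the trace 12.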

2015/03/28, 1:46 PM I_14_Exam_review

This notebook is part of Exam review 1 in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper





In [2]: from sympy import init_printing, symbols, Matrix
        from warnings import filterwarnings

In [3]: init_printing(use_latex='mathjax')
        filterwarnings('ignore')

Exam review

Question 1

Consider three non-zero vectors in $\mathbb{R}^7$
What is the dimension of the subspace that they can span?

Solution 1

One, two, or three
They can't span a subspace of higher dimension as there are only three vectors
Zero cannot be an answer, because they are all non-zero vectors

Question 2

Part 1
Consider a 5×3 matrix in echelon form with three pivots
What's the nullspace?

Part 2
Consider a 10×3 matrix of the form below and calculate its rank and echelon form

$$\begin{bmatrix} R \\ 2R \end{bmatrix}$$

Part 3
Give the row-reduced form of the matrix

$$\begin{bmatrix} U & U \\ U & 0 \end{bmatrix}$$

Solution 2

Part 1
The nullspace can only be the zero vector

$$\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

With three columns all with pivots, we have n - r = 3 - 3 = 0

Part 2
Row reduction will take us to

$$\begin{bmatrix} R \\ 0 \end{bmatrix}$$

Part 3

$$\begin{bmatrix} U & U \\ 0 & -U \end{bmatrix}$$

In reduced row echelon form

$$\begin{bmatrix} U & 0 \\ 0 & U \end{bmatrix}$$

Question 3

Consider

$$A \underline{x} = \begin{bmatrix} 2 \\ 4 \\ 2 \end{bmatrix}$$

With

$$x = \begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix} + c \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + d \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$

Part 1
What is the dimension of the rowspace of A and the nullspace of A

Part 2
For what values of b can Ax = b be solved?

Solution 3

Part 1
Well, the size of the matrix must be 3×3
The dimension of the nullspace is 2 (because of two non-pivot columns)
With n - r = 2, we have r = 1, which is also the dimension of the rowspace of A

Part 2
Looking only at the particular solution we must have

$$2 \begin{bmatrix} a_{11} \\ a_{21} \\ a_{31} \end{bmatrix} = \begin{bmatrix} 2 \\ 4 \\ 2 \end{bmatrix}$$

$$\therefore \begin{bmatrix} a_{11} \\ a_{21} \\ a_{31} \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$$

$$\therefore \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = \begin{bmatrix} 1 & -1 & 0 \\ 2 & -2 & 0 \\ 1 & -1 & 0 \end{bmatrix}$$

So, how did I get the last two columns?
Well, they cannot be independent of the first column and the last column must have all zeros to set up the free variable solution for $x_3$, i.e. $x_3 = d$
Adding column 2 to column 1 to get all zeros must result in what you see for column 2

In [4]: A = Matrix([[1, -1, 0], [2, -2, 0], [1, -1, 0]])
        x_vect = Matrix([2, 0, 0])
        x_vect_null_1 = Matrix([1, 1, 0])
        x_vect_null_2 = Matrix([0, 0, 1])
        A * x_vect + A * x_vect_null_1 + A * x_vect_null_2

Out[4]: $\begin{bmatrix} 2 \\ 4 \\ 2 \end{bmatrix}$

Out[5]: $\left[ \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \; \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right]$

It can only be solved for scalar multiples of

$$\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$$

Question and solution 4

If the nullspace of a square matrix is the zero vector only, does the nullspace of the transpose also only contain the zero vector?
Yes

Consider the matrix space of all 5×5 matrices; do the invertible 5×5 matrices form a subspace?
No, as the set of invertible matrices would not contain the zero matrix
Also if I add two invertible matrices, I don't know if the resultant matrix is invertible
The singular ones won't work either, as when adding two singular matrices we also don't know if the resultant matrix is singular

If $B^2 = 0$, is B = 0?
No, e.g.

$$B = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$$

In [6]: B = Matrix([[0, 1], [0, 0]])
        B ** 2 # Could also use B * B

Out[6]: $\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$

In [7]: B == B * B # Checking by Boolean statement

Out[7]: False

Is a system of n equations in n unknowns solvable for every b if the columns of the matrix of coefficients are independent?
Yes

Question 5

Calculate the basis of the nullspace of B

$$B = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & -1 & 2 \\ 0 & 1 & 1 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

Solution 5

B will have to be a 3×4 matrix
The last row of the second factor is all zeros

In [8]: B1 = Matrix([[1, 1, 0], [0, 1, 0], [1, 0, 1]])
        B2 = Matrix([[1, 0, -1, 2], [0, 1, 1, -1], [0, 0, 0, 0]])
        B1, B2

Out[8]: $\left( \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}, \; \begin{bmatrix} 1 & 0 & -1 & 2 \\ 0 & 1 & 1 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix} \right)$

It is important to note that if we multiply an invertible matrix by another matrix (assuming multiplication is possible by the shape of the matrices), then the invertible one (B1 above) plays no part in the nullspace

N(CD) = N(D) if C is invertible
We therefore only have to look at B2
It has 2 pivot columns, i.e. rank is r = 2
It will therefore have n - r = 4 - 2 = 2 free variables, making the nullspace 2-dimensional

In [9]: B = B1 * B2
        B.nullspace()

Out[9]: $\left[ \begin{bmatrix} 1 \\ -1 \\ 1 \\ 0 \end{bmatrix}, \; \begin{bmatrix} -2 \\ 1 \\ 0 \\ 1 \end{bmatrix} \right]$
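The claim N(CD) = N(D) for invertible C, used in Solution 5, can be checked directly on the two factors. This is a minimal sketch with the matrices from the question; the names `C`, `D` and the boolean variables are chosen here to match the statement of the rule.

```python
from sympy import Matrix

C = Matrix([[1, 1, 0], [0, 1, 0], [1, 0, 1]])              # the invertible factor (B1)
D = Matrix([[1, 0, -1, 2], [0, 1, 1, -1], [0, 0, 0, 0]])   # the second factor (B2)

c_is_invertible = C.det() != 0

nullspace_of_D = D.nullspace()
nullspace_of_CD = (C * D).nullspace()

# If CDx = 0 and C is invertible, multiplying by the inverse of C gives Dx = 0, and vice versa
same_nullspace = nullspace_of_D == nullspace_of_CD
```

Because C is invertible, CD and D have the same reduced row echelon form, so SymPy returns the same two special solutions for both.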

2015/03/29, 1:46 PM II_01_Orthogonality_of_vectors_and_subspaces

This notebook is part of lecture 14 Orthogonal vectors and subspaces in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper







Orthogonal vectors and subspaces

Rowspace orthogonal to nullspace and columnspace to nullspace of A

N(A A) = N(A)

Orthogonal vectors

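Two vectors are orthogonal if their dot product is zero
If they are written as column vectors x and y, their dot product is $\underline{x}^T \underline{y}$

For orthogonal (perpendicular) vectors $\underline{x}^T \underline{y} = 0$
From the Pythagorean theorem they are orthogonal if

$$\|\underline{x}\|^2 + \|\underline{y}\|^2 = \|\underline{x} + \underline{y}\|^2$$

The length squared of a (column) vector x can be calculated by $\underline{x}^T \underline{x}$
This achieves exactly the same as the sum of the squares of each element in the vector

$$\|\underline{x}\| = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}$$

$$\underline{x}^T \underline{x} = x_1^2 + x_2^2 + \dots + x_n^2$$

Following from the Pythagorean theorem we have

$$\|\underline{x}\|^2 + \|\underline{y}\|^2 = \|\underline{x} + \underline{y}\|^2$$

$$\underline{x}^T \underline{x} + \underline{y}^T \underline{y} = (\underline{x} + \underline{y})^T (\underline{x} + \underline{y})$$

$$\underline{x}^T \underline{x} + \underline{y}^T \underline{y} = \underline{x}^T \underline{x} + \underline{x}^T \underline{y} + \underline{y}^T \underline{x} + \underline{y}^T \underline{y}$$

$$\because \underline{x}^T \underline{y} = \underline{y}^T \underline{x}$$

$$\underline{x}^T \underline{x} + \underline{y}^T \underline{y} = \underline{x}^T \underline{x} + 2 \underline{x}^T \underline{y} + \underline{y}^T \underline{y}$$

$$2 \underline{x}^T \underline{y} = 0$$

$$\underline{x}^T \underline{y} = 0$$

This states that the dot product of orthogonal vectors equals zero

The zero vector is orthogonal to all other vectors of the same dimension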
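The link between a zero dot product and the Pythagorean identity can be checked on a concrete pair of vectors. The vectors `x` and `y` below are example values chosen for this sketch and do not come from the lecture.

```python
from sympy import Matrix

# Two example column vectors (chosen so that they are orthogonal)
x = Matrix([1, 2, -1])
y = Matrix([3, -1, 1])

# Dot product written as x^T y (a 1x1 matrix, so take its single entry)
dot = (x.T * y)[0]

# Squared lengths on both sides of the Pythagorean identity
lhs = x.dot(x) + y.dot(y)        # ||x||^2 + ||y||^2
rhs = (x + y).dot(x + y)         # ||x + y||^2
```

Because `dot` is zero, `lhs` and `rhs` agree; for a non-orthogonal pair they would differ by twice the dot product.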








Orthogonality of subspaces

Consider two subspaces S and T
To be orthogonal every vector in S must be orthogonal to every vector in T

Consider the XY and YZ planes in 3-space
They are not orthogonal, since many pairs of vectors (one in each plane) are not orthogonal
A nonzero vector in the intersection (along the Y axis) lies in both planes, so it can be chosen from each plane, and it is not orthogonal to itself
We can say that any two subspaces that share a nonzero vector cannot be orthogonal to each other

Orthogonality of the rowspace and the nullspace

The nullspace contains vectors x such that Ax = 0
Remember that $\underline{x}^T \underline{y} = 0$ for orthogonal column vectors
Each row of A is a transposed column vector and its product with x (a column vector) is zero, meaning that each row is orthogonal to x, so we have:

$$\begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \vdots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

$$\begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = 0, \quad \dots$$

The rows (row vectors) in A are NOT the only vectors in the rowspace, so we also need to show that ALL linear combinations of them are orthogonal to x
This is easy to see by the structure above: if every row $r_i$ satisfies $r_i \underline{x} = 0$, then $(c_1 r_1 + c_2 r_2 + \dots + c_m r_m) \underline{x} = 0$ as well

Orthogonality of the columnspace and the nullspace of $A^T$

The proof is the same as above, applied to $A^T$

The orthogonality of the rowspace and the nullspace creates two orthogonal subspaces in $\mathbb{R}^n$
The orthogonality of the columnspace and the nullspace of $A^T$ creates two orthogonal subspaces in $\mathbb{R}^m$

Note how the dimensions add up to the dimension of the space
The rowspace (a fundamental subspace in $\mathbb{R}^n$) is of dimension r
The nullspace (a fundamental subspace in $\mathbb{R}^n$) is of dimension n - r
Addition of these dimensions gives us the dimension of the total space, n, as in $\mathbb{R}^n$
AND
The columnspace is of dimension r and the nullspace of $A^T$ is of dimension m - r, which adds to m as in $\mathbb{R}^m$

This means that two lines that may be orthogonal in $\mathbb{R}^3$ cannot be orthogonal complements in $\mathbb{R}^3$, since the dimensions of these two subspaces (lines) add to 2 and not to 3 (as in $\mathbb{R}^3$)

We call this complementarity, i.e. the nullspace and rowspace are orthogonal complements in $\mathbb{R}^n$

$A^T A$

We know that
The result is square, i.e. (n×m)(m×n) = n×n
The result is symmetric, i.e. $(A^T A)^T = A^T A^{TT} = A^T A$
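The orthogonality of the fundamental subspaces and the dimension count can be verified on a small matrix. The matrix `A` below is a hypothetical rank-2 example picked for this sketch; it does not appear in the lecture.

```python
from sympy import Matrix

# Hypothetical 3x3 example: row 2 is twice row 1, so the rank is 2
A = Matrix([[1, 2, 3], [2, 4, 6], [1, 0, 1]])
m, n = A.shape
r = A.rank()

# Bases for the nullspace of A (in R^n) and of A^T (in R^m)
null_A = A.nullspace()
null_AT = A.T.nullspace()

# Rows of A times a nullspace vector: every entry is a row . vector dot product
row_products = [A * v for v in null_A]

# Columns of A times a vector from N(A^T)
col_products = [A.T * w for w in null_AT]
```

Here `len(null_A)` should equal n - r and `len(null_AT)` should equal m - r, matching the dimension count in the text.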



When Ax = b is not solvable we use $A^T A \hat{x} = A^T b$
Ax = b in the first instance did not have a solution, but after multiplying both sides by $A^T$ we hope that the second equation has a solution, now called $\hat{x}$

$$A^T A \hat{x} = A^T b$$

Consider the matrix below with m = 3 equations in n = 2 unknowns
Ax = b can only be solved when b is a linear combination of the columns of A, i.e. when b lies in the columnspace of A

In [4]: A = Matrix([[1, 1], [1, 2], [1, 5]])
        A

Out[4]: $\begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 5 \end{bmatrix}$

$$x_1 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + x_2 \begin{bmatrix} 1 \\ 2 \\ 5 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$$

Out[5]: $\begin{bmatrix} 3 & 8 \\ 8 & 30 \end{bmatrix}$

Note how the nullspace of $A^T A$ is equal to the nullspace of A

In [6]: (A.transpose() * A).nullspace() == A.nullspace()

Out[6]: True

The same goes for the rank

In [7]: A.rref(), (A.transpose() * A).rref()

Out[7]: $\left( \left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}, \; [0, 1] \right), \; \left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \; [0, 1] \right) \right)$

$A^T A$ is not always invertible
In fact it is only invertible if the nullspace of A only contains the zero vector (A has independent columns)
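To see why independent columns matter, the matrix from this notebook can be compared with one whose columns are dependent. `A_dep` is a hypothetical matrix invented for this sketch.

```python
from sympy import Matrix

# Matrix from the notebook: independent columns
A_indep = Matrix([[1, 1], [1, 2], [1, 5]])

# Hypothetical matrix with dependent columns (column 2 = 2 * column 1)
A_dep = Matrix([[1, 2], [1, 2], [1, 2]])

# A^T A is invertible exactly when its determinant is non-zero
det_indep = (A_indep.T * A_indep).det()
det_dep = (A_dep.T * A_dep).det()

# N(A^T A) = N(A) holds in the dependent case too
same_nullspace = (A_dep.T * A_dep).nullspace() == A_dep.nullspace()
```

The dependent case gives a zero determinant, so the normal equations there would not have a unique solution.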

2015/03/29, 1:51 PMII_02_Projection_onto_subspaces


This notebook is part of lecture 15 Projections onto subspaces in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper





In [2]: from sympy import init_printing, Matrix, symbols
        from IPython.display import Image
        from warnings import filterwarnings


Projections onto subspaces

Geometry in the plane

Projection of a vector onto another (in the plane)
Consider the orthogonal projection of b onto a

In [4]: Image(filename = 'Orthogonal projection in the plane.png')

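Out[4]: [image: Orthogonal projection in the plane]

Note that p falls on a line, which is a subspace of the plane $\mathbb{R}^2$
Remember from the previous lecture that orthogonal vectors have a dot product of zero
Note that p is some scalar multiple of a, say p = xa
With a perpendicular to e and e = b - xa
Thus we have $\underline{a}^T (\underline{b} - x\underline{a}) = 0$ and $x \underline{a}^T \underline{a} = \underline{a}^T \underline{b}$
Since $\underline{a}^T \underline{a}$ is a number we can simplify

$$x = \frac{\underline{a}^T \underline{b}}{\underline{a}^T \underline{a}}$$

We also have p = ax

$$\underline{p} = \underline{a} x = \underline{a} \frac{\underline{a}^T \underline{b}}{\underline{a}^T \underline{a}}$$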
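The formula for x and p can be tried on numbers. The vectors `a` and `b` below are example values assumed for this sketch; the lecture image does not give numerical values.

```python
from sympy import Matrix

# Example vectors in the plane (assumed values)
a = Matrix([1, 2])
b = Matrix([2, 2])

# x = a^T b / a^T a, the scalar multiple of a that gives the projection
x = a.dot(b) / a.dot(a)

# p = a x is the projection, e = b - p is the error vector
p = x * a
e = b - p

# e must be perpendicular to a
check = a.dot(e)
```

With these values x is 6/5, and the final dot product is zero, confirming that the error is perpendicular to the line through a.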








This equation is helpful
Doubling (or any other scalar multiple of) b doubles (or scalar multiplies) p
Doubling (or any other scalar multiple of) a has no effect

Eventually we are looking for proj = Pb, where P is the projection matrix

$$\underline{p} = P \underline{b}$$

$$P = \frac{1}{\underline{a}^T \underline{a}} \underline{a} \underline{a}^T$$

Properties of the projection matrix P
The columnspace of P (C(P)) is the line which contains a
The rank is 1, rank(P) = 1
P is symmetric, i.e. $P^T = P$
Applying the projection matrix a second time (i.e. $P^2$) changes nothing, thus $P^2 = P$

Why project?

(projecting onto more than a one-dimensional line)

Because Ax = b may not have a solution
b may not be in the columnspace
There may be more equations than unknowns

Solve for the closest vector in the columnspace
This is done by solving for p instead, where p is the projection of b onto the columnspace of A

$$A \hat{x} = \underline{p}$$

Now we have to get b orthogonally projected (as p) onto the column(sub)space
This is done by calculating two basis vectors for the plane that contains p, i.e. $a_1$ and $a_2$

Going way back to the graph up top we note that e is perpendicular to the plane
So, we have:

$$A \hat{x} = \underline{p}$$

We know that both $a_1$ and $a_2$ are perpendicular to e, so:

$$a_1^T \underline{e} = 0; \quad a_2^T \underline{e} = 0$$

$$\because \underline{e} = \underline{b} - \underline{p}$$

$$\because \underline{p} = A \hat{x}$$

$$a_1^T (\underline{b} - A \hat{x}) = 0; \quad a_2^T (\underline{b} - A \hat{x}) = 0$$

$$\begin{bmatrix} a_1^T \\ a_2^T \end{bmatrix} (\underline{b} - A \hat{x}) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

$$A^T (\underline{b} - A \hat{x}) = 0$$

We know that from ...

$$A^T (\underline{b} - A \hat{x}) = 0$$

... e must be in the nullspace of $A^T$
Which is right, because from the previous lecture the nullspace of $A^T$ is orthogonal to the columnspace of A

Simplifying the last equation we have

$$A^T A \hat{x} = A^T \underline{b}$$

Just look back at the example in the plane $\mathbb{R}^2$ that we started with
Simplifying things back to a column vector a instead of a matrix A in this last equation does give us what we had in $\mathbb{R}^2$

Solving this we have

$$\hat{x} = (A^T A)^{-1} A^T \underline{b}$$
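The properties listed for the rank-one projection matrix can be checked directly. The vector `a` is an example value assumed for this sketch.

```python
from sympy import Matrix

# Example vector (assumed); P projects onto the line through a
a = Matrix([1, 2, 2])
P = (a * a.T) / a.dot(a)      # P = a a^T / (a^T a)

rank_P = P.rank()             # the columnspace is a line, so the rank is 1
is_symmetric = (P.T == P)     # P^T = P
is_idempotent = (P * P == P)  # P^2 = P

# Doubling a has no effect on the projection matrix
a2 = 2 * a
P_doubled = (a2 * a2.T) / a2.dot(a2)
unchanged = (P_doubled == P)
```

Since sympy keeps the entries as exact fractions, the equality tests are exact rather than approximate.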



Which leaves us with

$$\underline{p} = A \hat{x}$$

$$\underline{p} = A (A^T A)^{-1} A^T \underline{b}$$

Making the projection matrix P

$$P = A (A^T A)^{-1} A^T$$

Just note that for a square invertible matrix A, P is the identity matrix
Most of the time A is not square (and thus not invertible), so we have to leave the equation as it is
Also, note that $P^T = P$ and $P^2 = P$

Applications

Least squares

Given a set of data points in two dimensions, i.e. with variables (t, b)
We need to fit them onto the best line
So, as an example consider the points (1,1), (2,2), (3,2)

A best line in this instance means a straight line in the form

$$b = C + Dt$$

Using the three points above we get three equations

$$C + D = 1$$
$$C + 2D = 2$$
$$C + 3D = 2$$

If the line went through all three points, we would have a solution
Instead we have the following

$$\begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} C \\ D \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$$

Three equations, two unknowns, no solution, so solve ...

$$A^T A \hat{x} = A^T b$$

... for which the solution is

$$\hat{x} = (A^T A)^{-1} A^T b$$

In [5]: A = Matrix([[1, 1], [1, 2], [1, 3]])
        A

Out[5]: $\begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}$

In [6]: b = Matrix([1, 2, 2])
        b

Out[6]: $\begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$

In [7]: (A.transpose() * A).inv() * A.transpose() * b

Out[7]: $\begin{bmatrix} \frac{2}{3} \\ \frac{1}{2} \end{bmatrix}$

Thus, the solution is:

$$b = \frac{2}{3} + \frac{1}{2} t$$

2015/03/29, 1:56 PMII_03_Projection_matrices_and_least_squares


This notebook is part of lecture 16 Projection matrices and least squares in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper





In [2]: from sympy import init_printing, Matrix, symbols
        from IPython.display import Image
        from warnings import filterwarnings


Projection matrices and least squares

In [4]: Image(filename = 'Line.png')

Least squares

Out[1]:

Out[4]:








Consider from the previous lecture the three data points in the plane

$$(t_i, y_i) = (1, 1), (2, 2), (3, 2)$$

From this we need to construct a straight line
This could be helpful in, say, statistics (remember, though, that in statistics we might have to get rid of statistical outliers)
Nonetheless (view image above) we note that we have a straight line in slope-intercept form

$$y = C + Dt$$

On the line at t values of 1, 2, and 3 we will have

$$y_1 = C + D$$
$$y_2 = C + 2D$$
$$y_3 = C + 3D$$

The actual y values at these t values are 1, 2, and 2, though
We are thus including an error $\delta y$ of

$$(e_1)^2 = [(C + D) - 1]^2$$
$$(e_2)^2 = [(C + 2D) - 2]^2$$
$$(e_3)^2 = [(C + 3D) - 2]^2$$

Since some errors are positive and some are negative (actual values below or above the line), we simply determine the square (which will always be positive)
Adding the squares (three in our example here) we have the sum total of the error (which is actually just the square of the distance between the line and the actual y values)
The line will be the best fit when this error sum is at a minimum (hence least squares)
We can do this with calculus or with linear algebra
For calculus we take the partial derivatives with respect to both unknowns and set them to zero
For linear algebra we project orthogonally onto the columnspace (hence minimizing the error)

Note that the vector b does not lie in the columnspace (it is not a linear combination of the columns)

Calculus method

We'll create a function f(C,D), successively take the partial derivatives with respect to both variables, and set them to zero
We will then have two equations with two unknowns to solve (which is easy enough to do manually or by simple linear algebra and row reduction)

In [5]: C, D = symbols('C D')

In [6]: e1_squared = ((C + D) - 1) ** 2
        e2_squared = ((C + 2 * D) - 2) ** 2
        e3_squared = ((C + 3 * D) - 2) ** 2
        f = e1_squared + e2_squared + e3_squared
        f

Out[6]: $(C + D - 1)^2 + (C + 2D - 2)^2 + (C + 3D - 2)^2$

In [7]: f.expand() # Expanding the expression

Out[7]: $3C^2 + 12CD - 10C + 14D^2 - 22D + 9$

$$f(C, D) = 3C^2 + 12CD - 10C + 14D^2 - 22D + 9$$

Doing the partial derivatives will be

$$\frac{\partial f}{\partial C} = 6C + 12D - 10 = 0$$

$$\frac{\partial f}{\partial D} = 12C + 28D - 22 = 0$$

In [8]: f.diff(C) # Taking the partial derivative with respect to C

Out[8]: $6C + 12D - 10$

In [9]: f.diff(D) # Taking the partial derivative with respect to D

Out[9]: $12C + 28D - 22$

Setting both equal to zero (and creating a simple augmented matrix) we get

$$6C + 12D - 10 = 0$$
$$12C + 28D - 22 = 0$$
$$\therefore 6C + 12D = 10$$
$$\therefore 12C + 28D = 22$$



In [10]: A_augm = Matrix([[6, 12, 10], [12, 28, 22]])
         A_augm

Out[10]: $\begin{bmatrix} 6 & 12 & 10 \\ 12 & 28 & 22 \end{bmatrix}$

In [11]: A_augm.rref() # Doing a Gauss-Jordan elimination to reduced row echelon form

Out[11]: $\left( \begin{bmatrix} 1 & 0 & \frac{2}{3} \\ 0 & 1 & \frac{1}{2} \end{bmatrix}, \; [0, 1] \right)$

We now have a solution

$$y = \frac{2}{3} + \frac{1}{2} t$$

Linear algebra

We note that we can construct the following

$$C + 1D = 1$$
$$C + 2D = 2$$
$$C + 3D = 2$$

$$C \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + D \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$$

$$A \underline{x} = \underline{b}$$

$$\begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} C \\ D \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$$

b is not in the columnspace of A and we have to do orthogonal projection

$$A^T A \hat{x} = A^T \underline{b}$$
$$\hat{x} = (A^T A)^{-1} A^T \underline{b}$$

In [12]: A = Matrix([[1, 1], [1, 2], [1, 3]])
         b = Matrix([1, 2, 2])
         A, b # Showing the two matrices

Out[12]: $\left( \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}, \; \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} \right)$

In [13]: x_hat = (A.transpose() * A).inv() * A.transpose() * b
         x_hat

Out[13]: $\begin{bmatrix} \frac{2}{3} \\ \frac{1}{2} \end{bmatrix}$

Again, we get the same values for C and D

Remember the following

$$\underline{b} = \underline{p} + \underline{e}$$

p and e are perpendicular
Indeed p is in the columnspace of A and e is perpendicular to the columnspace (i.e. to every vector in the columnspace)

Example problem

Example problem 1

Find the quadratic (second order polynomial) equation through the origin, with the following data points: (1,1), (2,5) and (-1,-2)
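The decomposition b = p + e for the three data points can be verified with the projection matrix. The variable names `P`, `p` and `e` are chosen for this sketch; the notebook itself only computes x_hat.

```python
from sympy import Matrix

# Data from the least squares example in this lecture
A = Matrix([[1, 1], [1, 2], [1, 3]])
b = Matrix([1, 2, 2])

# Projection matrix onto the columnspace of A
P = A * (A.T * A).inv() * A.T

p = P * b      # the projection of b (the values on the best line)
e = b - p      # the error vector

p_dot_e = p.dot(e)     # p and e are perpendicular
At_e = A.T * e         # e is perpendicular to every column of A
```

The entries of `p` are the heights of the fitted line at t = 1, 2, 3, and the entries of `e` are the vertical errors that least squares minimizes.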



Solution

Let's just think about a quadratic equation in y and t

$$y = c_1 + Ct + Dt^2$$

Through the origin (0,0) means y = 0 when t = 0, thus we have

$$0 = c_1 + C(0) + D(0)^2$$
$$c_1 = 0$$
$$y = Ct + Dt^2$$

This gives us three equations for our three data points

$$C(1) + D(1)^2 = 1$$
$$C(2) + D(2)^2 = 5$$
$$C(-1) + D(-1)^2 = -2$$

$$C \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} + D \begin{bmatrix} 1 \\ 4 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 5 \\ -2 \end{bmatrix}$$

$$A = \begin{bmatrix} 1 & 1 \\ 2 & 4 \\ -1 & 1 \end{bmatrix}, \quad \underline{x} = \begin{bmatrix} C \\ D \end{bmatrix}, \quad \underline{b} = \begin{bmatrix} 1 \\ 5 \\ -2 \end{bmatrix}$$

Clearly b is not in the columnspace of A and we have to project orthogonally onto the columnspace using

$$\hat{x} = (A^T A)^{-1} A^T \underline{b}$$

In [14]: A = Matrix([[1, 1], [2, 4], [-1, 1]])
         b = Matrix([1, 5, -2])
         x_hat = (A.transpose() * A).inv() * A.transpose() * b
         x_hat

Out[14]: $\begin{bmatrix} \frac{41}{22} \\ \frac{5}{22} \end{bmatrix}$

Here's a simple plot of the equation

In [15]: import matplotlib.pyplot as plt # The graph plotting module
         import numpy as np # The numerical mathematics module
         %matplotlib inline



In [16]: x = np.linspace(-2, 3, 100) # Creating 100 x-values
         y = (41 / 22) * x + (5 / 22) * x ** 2 # From the equation above
         plt.figure(figsize = (8, 6)) # Creating a plot of the indicated size
         plt.plot(x, y, 'b-') # Plot the equation above, in essence 100 little plots using small segments of blue lines
         plt.plot(1, 1, 'ro') # Plot the point as a red dot
         plt.plot(2, 5, 'ro')
         plt.plot(-1, -2, 'ro')
         plt.plot(0, 0, 'gs') # Plot the origin as a green square
         plt.show() # Create the plot

2015/03/29, 1:59 PMII_04_Orthogonal_matrices_Gram_Schmidt


This notebook is part of lecture 17 Orthogonal matrices and Gram-Schmidt in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper





In [2]: from sympy import init_printing, symbols, Matrix, sin, cos, sqrt, Rational, GramSchmidt
        from warnings import filterwarnings


In [4]: theta = symbols('theta')

Orthogonal basis

Orthogonal matrix

Gram-Schmidt

Orthogonal basis

Here we mean vectors $q_1, q_2, \dots, q_n$

We actually mean orthonormal vectors (orthogonal, i.e. perpendicular, and of unit length, i.e. normalized)
Vectors that are orthogonal have a dot product equal to zero

If they are orthogonal

$$q_i^T q_j = 0$$

If they are not

$$q_i^T q_j \neq 0$$

Orthogonal matrix

We can now put these (column) basis vectors into a matrix Q

This brings about

$$Q^T Q = I$$

In the case of the matrix Q being square the term orthogonal matrix is used
When it is square we can calculate the inverse, making

$$Q^T = Q^{-1}$$

Consider the following permutation matrix with orthonormal column vectors








In [5]: Q = Matrix([[0, 0, 1], [1, 0, 0], [0, 1, 0]])
        Q, Q.transpose()

Out[5]: $\left( \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \; \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} \right)$

In this example the transpose also contains orthonormal column vectors
Multiplication gives the identity matrix

In [6]: Q.transpose() * Q

Out[6]: $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

Consider this example

In [7]: Q = Matrix([[cos(theta), -sin(theta)], [sin(theta), cos(theta)]])
        Q, Q.transpose()

Out[7]: $\left( \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix}, \; \begin{bmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{bmatrix} \right)$

The two column vectors are orthogonal and the length of each column vector is 1
It is thus an orthogonal matrix

Out[8]: $\begin{bmatrix} \sin^2(\theta) + \cos^2(\theta) & 0 \\ 0 & \sin^2(\theta) + \cos^2(\theta) \end{bmatrix}$

The example below certainly has orthogonal column vectors, but they are not of unit length

$$Q = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$

Well, we can change them into unit vectors by dividing each component by the length of that vector

$$\sqrt{(1)^2 + (1)^2} = \sqrt{2}$$
$$\sqrt{(1)^2 + (-1)^2} = \sqrt{2}$$

$$Q = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$

As it stands $Q^T Q$ is not the identity matrix

In [9]: Q = Matrix([[1, 1], [1, -1]])
        Q.transpose() * Q

Out[9]: $\begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}$

Turning it into an orthogonal matrix

In [10]: Q = (1 / sqrt(2)) * Matrix([[1, 1], [1, -1]])
         Q

Out[10]: $\begin{bmatrix} \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \end{bmatrix}$

Out[11]: $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$



Consider this example with orthogonal (but not orthonormal) column vectors

In [12]: Q = Matrix([[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]])
         Q

Out[12]: $\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}$

Again, as it stands $Q^T Q$ is not the identity matrix

Out[13]: $\begin{bmatrix} 4 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & 4 \end{bmatrix}$

But turning it into an orthogonal matrix works

In [14]: Q = Rational(1, 2) * Matrix([[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]])
         # Rational() creates a mathematical fraction instead of a decimal
         Q

Out[14]: $\begin{bmatrix} \frac{1}{2} & \frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} \\ \frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} & \frac{1}{2} \end{bmatrix}$

Out[15]: $\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$

Consider this matrix Q with orthonormal column vectors, but that is not square

In [16]: Q = Rational(1, 3) * Matrix([[1, -2], [2, -1], [2, 2]])
         Q

Out[16]: $\begin{bmatrix} \frac{1}{3} & -\frac{2}{3} \\ \frac{2}{3} & -\frac{1}{3} \\ \frac{2}{3} & \frac{2}{3} \end{bmatrix}$

We now have a matrix with two column vectors that are normalized and orthogonal to each other, and they form a basis for a plane (subspace) in $\mathbb{R}^3$

There must be a third column vector of unit length, orthogonal to the other two, so that we end up with an orthogonal matrix

In [17]: Q = Rational(1, 3) * Matrix([[1, -2, 2], [2, -1, -2], [2, 2, 1]])
         Q

Out[17]: $\begin{bmatrix} \frac{1}{3} & -\frac{2}{3} & \frac{2}{3} \\ \frac{2}{3} & -\frac{1}{3} & -\frac{2}{3} \\ \frac{2}{3} & \frac{2}{3} & \frac{1}{3} \end{bmatrix}$

Out[18]: $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
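For the non-square Q with orthonormal columns used above, the product taken in the other order is worth looking at. This sketch only uses sympy calls that already appear in the notebook; the names `QtQ` and `P` are introduced here for illustration.

```python
from sympy import Matrix, Rational, eye

# The 3x2 matrix with orthonormal columns from this notebook
Q = Rational(1, 3) * Matrix([[1, -2], [2, -1], [2, 2]])

QtQ = Q.T * Q     # 2x2: this is the identity matrix
P = Q * Q.T       # 3x3: a projection matrix, not the identity

is_identity_small = (QtQ == eye(2))
is_identity_big = (P == eye(3))
is_symmetric = (P.T == P)
is_idempotent = (P * P == P)
rank_P = P.rank()
```

So for a rectangular Q only the product of the transpose with Q (in that order) gives the identity; the other product projects onto the plane spanned by the two columns.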



Let's make use of these matrices with orthonormal columns (which we will always denote with a letter Q) and project onto their columnspace
What would the projection matrix be?

$$Q \hat{x} = \underline{b}$$
$$P = Q (Q^T Q)^{-1} Q^T$$

Remember, though, that for matrices with orthonormal column vectors $Q^T Q$ is the identity matrix, and we have

$$P = Q Q^T$$

If additionally Q is square, then we have independent columns, the columnspace contains the whole space $\mathbb{R}^n$, and the projection matrix is the n×n identity matrix
Remember $Q^T = Q^{-1}$ in these cases, making it easy to see that we get the identity matrix
Remember also that the projection matrix is symmetric
Lastly, the projection matrix has the property that squaring it leaves us in the same spot, so here we will have $(Q Q^T)^2 = Q Q^T$

All of this has the final consequence that

$$Q^T Q \hat{x} = Q^T \underline{b}$$
$$\hat{x} = Q^T \underline{b}$$
$$\hat{x}_i = q_i^T \underline{b}$$

Gram-Schmidt

All of the above makes things quite easy, so we should try to create matrices with orthonormal columns

Let's start with two independent vectors a and b, create two orthogonal vectors A and B from them, and then create two orthonormal vectors

$$q_1 = \frac{A}{\|A\|}$$
$$q_2 = \frac{B}{\|B\|}$$

We can choose one of them as our initial vector, say A = a, so B must be the part of b that is orthogonal to A
This is what we previously called the error vector e

$$\underline{e} = \underline{b} - \underline{p}$$

Remembering how to get p we have the following

$$B = \underline{b} - \frac{A^T \underline{b}}{A^T A} A$$

Let's do an example

In [19]: a = Matrix([1, 1, 1])
         b = Matrix([1, 0, 2])
         a, b

Out[19]: $\left( \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \; \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} \right)$

In [20]: A = a
         A

Out[20]: $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$

In [21]: A.transpose() * b

Out[21]: $\begin{bmatrix} 3 \end{bmatrix}$

Out[22]: $\begin{bmatrix} 3 \end{bmatrix}$



In [23]: B = b - A # The factor in front of A is 3 / 3 = 1 here
         B

Out[23]: $\begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}$

Checking that they are perpendicular

In [24]: A.transpose() * B

Out[24]: $\begin{bmatrix} 0 \end{bmatrix}$

Now we have to create Q by turning A and B into unit vectors and placing them in the same matrix

In [25]: A.normalized() # Easy way to normalize a vector

Out[25]: $\begin{bmatrix} \frac{\sqrt{3}}{3} \\ \frac{\sqrt{3}}{3} \\ \frac{\sqrt{3}}{3} \end{bmatrix}$

In [26]: B.normalized()

Out[26]: $\begin{bmatrix} 0 \\ -\frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} \end{bmatrix}$

In [27]: Q = Matrix([[sqrt(3) / 3, 0], [sqrt(3) / 3, -sqrt(2) / 2], [sqrt(3) / 3, sqrt(2) / 2]])
         Q

Out[27]: $\begin{bmatrix} \frac{\sqrt{3}}{3} & 0 \\ \frac{\sqrt{3}}{3} & -\frac{\sqrt{2}}{2} \\ \frac{\sqrt{3}}{3} & \frac{\sqrt{2}}{2} \end{bmatrix}$

The columnspace of the original matrix (of two column vectors) and that of Q are the same

In Python (sympy) we can use the following code

In [28]: # The independent (not yet orthogonal) column vectors are entered individually inside square brackets []
         A = [Matrix([1, 1, 1]), Matrix([1, 0, 2])]
         A

Out[28]: $\left[ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \; \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} \right]$

In [29]: Q = GramSchmidt(A, True) # The True argument normalizes the columns
         Q

Out[29]: $\left[ \begin{bmatrix} \frac{\sqrt{3}}{3} \\ \frac{\sqrt{3}}{3} \\ \frac{\sqrt{3}}{3} \end{bmatrix}, \; \begin{bmatrix} 0 \\ -\frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} \end{bmatrix} \right]$

Example problems

Example problem 1



Create an orthogonal matrix from the following matrix

$$\begin{bmatrix} 1 & 2 & 4 \\ 0 & 0 & 5 \\ 0 & 3 & 6 \end{bmatrix}$$

Solution

In [30]: A = [Matrix([1, 0, 0]), Matrix([2, 0, 3]), Matrix([4, 5, 6])]
         A

Out[30]: $\left[ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \; \begin{bmatrix} 2 \\ 0 \\ 3 \end{bmatrix}, \; \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix} \right]$

In [31]: Q = GramSchmidt(A, True)
         Q

Out[31]: $\left[ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \; \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \; \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right]$

We can also consider QR-factorization

In [32]: from mpmath import matrix, qr # Older sympy versions exposed this as sympy.mpmath

In [33]: A = matrix([[1, 2, 4], [0, 0, 5], [0, 3, 6]])
         print(A)

[1.0 2.0 4.0]
[0.0 0.0 5.0]
[0.0 3.0 6.0]

In [34]: Q, R = qr(A)

In [35]: print(Q)

[1.0 0.0 0.0]
[0.0 0.0 -1.0]
[0.0 -1.0 0.0]

In [36]: print(R)

[1.0 2.0 4.0]
[0.0 -3.0 -6.0]
[0.0 0.0 -5.0]
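The Gram-Schmidt steps used by hand earlier in this notebook (subtract the projections, then normalize) can be written as a small function. This is a sketch of the classical procedure, not sympy's own implementation; the function name `gram_schmidt` is made up for this example.

```python
from sympy import Matrix

def gram_schmidt(vectors):
    """Classical Gram-Schmidt on a list of independent column vectors.

    Returns a list of orthonormal column vectors spanning the same space.
    """
    orthogonal = []
    for v in vectors:
        w = v
        # Subtract the projection of v onto every vector found so far
        for u in orthogonal:
            w = w - (u.dot(v) / u.dot(u)) * u
        orthogonal.append(w)
    # Normalize at the end by dividing each vector by its length
    return [w / w.norm() for w in orthogonal]

# The two vectors from the worked example in this notebook
a = Matrix([1, 1, 1])
b = Matrix([1, 0, 2])
q1, q2 = gram_schmidt([a, b])
```

Called on the vectors a and b from the worked example, this should reproduce the two columns of Q found there.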

2015/03/29, 2:01 PMII_05_Properties_of_the_determinant


This notebook is part of lecture 18 Properties of determinants in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper





In [2]: from sympy import init_printing, Matrix, symbols
        from IPython.display import HTML
        from warnings import filterwarnings


In [4]: a, b, c = symbols('a b c')

Properties of the determinant

Notation

The determinant of a matrix A is written as det(A) or |A|

Main properties

There are three main properties (first three listed below) and seven that follow from them

Out[1]:








1. det(I) = 1

2. A row exchange changes the sign of the determinant
   We now know the determinant of every permutation matrix

3. Multiplying any row by a constant results in the determinant also being multiplied by that constant
   * Only works when altering a single row (the determinant is a linear function of each row separately)
   * An alternate way of seeing this is

$$\begin{vmatrix} a + a' & b + b' \\ c & d \end{vmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} + \begin{vmatrix} a' & b' \\ c & d \end{vmatrix}$$

4. If two rows are equal then the determinant is zero
   This follows from property number two, where if we interchange rows the sign must change
   This only works for zero, since the row exchange leaves the matrix unchanged, which now can't have a different determinant (opposite sign)

5. Subtracting a constant multiple of one row from another leaves the determinant unchanged
   This flows from property three above

$$\begin{vmatrix} a & b \\ c - la & d - lb \end{vmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} + \begin{vmatrix} a & b \\ -la & -lb \end{vmatrix}$$

$$= \begin{vmatrix} a & b \\ c & d \end{vmatrix} + (-l) \begin{vmatrix} a & b \\ a & b \end{vmatrix}$$

   From property four (the determinant of a matrix with two equal rows is zero) we now have the following

$$= \begin{vmatrix} a & b \\ c & d \end{vmatrix} + 0$$

$$= \begin{vmatrix} a & b \\ c & d \end{vmatrix}$$

6. The determinant of a matrix with a complete row (or column) of zeros is zero
   This also follows from property three above, by multiplying a row by zero

7. The determinant of an upper triangular matrix is the product of the elements of the main diagonal (the pivots)
   Same goes for a diagonal matrix
   We can change an upper triangular matrix into a diagonal matrix by row operations (leaving the determinant unchanged by property five)
   Now we can use the first part of the third property and take out each pivot
   Eventually we are left with the identity matrix and the product of all the pivots
   For a zero on the main diagonal we can use the property of a row of zeros and know the determinant is zero
   This helps us to develop the expression for the determinant of a 2×2 matrix
   Subtracting c/a times row 1 from row 2 (assuming a ≠ 0) gives

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \rightarrow \begin{bmatrix} a & b \\ c - \frac{c}{a} a & d - \frac{c}{a} b \end{bmatrix} = \begin{bmatrix} a & b \\ 0 & d - \frac{c}{a} b \end{bmatrix}$$

$$\therefore \begin{vmatrix} a & b \\ 0 & d - \frac{c}{a} b \end{vmatrix} = (a) \left( d - \frac{c}{a} b \right) = ad - a \frac{c}{a} b = ad - bc$$

8. If the determinant is zero, the matrix is singular (not invertible)

9. The determinant of the product of matrices

$$|AB| = |A| |B|$$
$$\left| A^{-1} \right| = \frac{1}{|A|}$$
$$\left| A^2 \right| = |A| |A| = |A|^2$$
$$|cA| = c^n |A|$$

10. For the determinant of a transpose of a matrix we have

$$\left| A^T \right| = |A|$$

Example problems

Do the following by making use of the properties above

Example problem 1



In [5]: A = Matrix([[101, 201, 301], [102, 202, 302], [103, 203, 303]])
        A

Out[5]: $\begin{bmatrix} 101 & 201 & 301 \\ 102 & 202 & 302 \\ 103 & 203 & 303 \end{bmatrix}$

Solution

By constant multiple subtraction we get

$$\begin{bmatrix} 101 & 201 & 301 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$

Two identical rows, thus the determinant is zero

In [6]: A.det()

Out[6]: 0

Example problem 2

In [7]: A = Matrix([[1, a, a ** 2], [1, b, b ** 2], [1, c, c ** 2]])
        A

Out[7]: $\begin{bmatrix} 1 & a & a^2 \\ 1 & b & b^2 \\ 1 & c & c^2 \end{bmatrix}$

Solution

Subtracting row 1 from rows 2 and 3 and factoring the elements

$$\begin{vmatrix} 1 & a & a^2 \\ 0 & b - a & b^2 - a^2 \\ 0 & c - a & c^2 - a^2 \end{vmatrix} = \begin{vmatrix} 1 & a & a^2 \\ 0 & b - a & (b - a)(b + a) \\ 0 & c - a & (c - a)(c + a) \end{vmatrix}$$

Using property three, which states that the determinant is a linear function of each row

$$= (b - a)(c - a) \begin{vmatrix} 1 & a & a^2 \\ 0 & 1 & b + a \\ 0 & 1 & c + a \end{vmatrix}$$

Another elimination on row three

$$= (b - a)(c - a) \begin{vmatrix} 1 & a & a^2 \\ 0 & 1 & b + a \\ 0 & 0 & c - b \end{vmatrix}$$

Now we have upper triangular form and the determinant is the product of the elements on the main diagonal, multiplied by the factor (b-a)(c-a)

$$= (b - a)(c - a)(c - b)$$

In [8]: (A.det()).factor() # Calculating the determinant and factorizing the result

Out[8]: $-(a - b)(a - c)(b - c)$

This is called a Vandermonde matrix
http://en.wikipedia.org/wiki/Vandermonde_matrix

Example problem 3



In [9]: A = Matrix([1, 2, 3]) * Matrix([[1, -4, 5]])
        A

Out[9]: $\begin{bmatrix} 1 & -4 & 5 \\ 2 & -8 & 10 \\ 3 & -12 & 15 \end{bmatrix}$

Solution

The rows of the resultant 3×3 matrix are linearly dependent, i.e. row one is 1 times the row (1,-4,5), row two is twice this same row and row three is three times the same row
This means that the determinant will be zero

In [10]: A.det()

Out[10]: 0

Example problem 4

In [11]: A = Matrix([[0, 1, 3], [-1, 0, 4], [-3, -4, 0]])
         A

Out[11]: $\begin{bmatrix} 0 & 1 & 3 \\ -1 & 0 & 4 \\ -3 & -4 & 0 \end{bmatrix}$

Solution

Note how this matrix is skew symmetric
This means that $A^T = -A$
With the matrices $A^T$ and -A being equal, their determinants are equal
Remember, though, that the determinant of a matrix is the same as the determinant of the transpose of the matrix

$$|A| = \left| A^T \right| = |-A| = (-1)^3 |A| = -|A|$$

$$|A| = -|A| \quad \therefore |A| = 0$$

In [12]: A.det()

Out[12]: 0

Not all skew symmetric matrices have a zero determinant
It only works here because n is odd (the matrix is 3×3), which leaves the negative sign in $(-1)^n$
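Several of the properties listed in this notebook can be spot-checked numerically. The matrices `A` and `B` below are arbitrary examples invented for this sketch; a check on one example is of course an illustration and not a proof.

```python
from sympy import Matrix, eye

# Arbitrary 3x3 example matrices (assumed values)
A = Matrix([[2, 1, 0], [1, 3, 1], [0, 1, 4]])
B = Matrix([[1, 0, 2], [0, 1, 1], [1, 1, 0]])

det_I = eye(3).det()                 # property 1: det(I) = 1

swapped = A.copy()
swapped.row_swap(0, 1)               # property 2: a row exchange flips the sign

eliminated = A.copy()
eliminated[2, :] = eliminated[2, :] - 5 * eliminated[0, :]   # property 5

det_A = A.det()
det_B = B.det()
det_swapped = swapped.det()
det_eliminated = eliminated.det()
det_product = (A * B).det()          # |AB| = |A||B|
det_scaled = (3 * A).det()           # |cA| = c^n |A| with n = 3
det_transpose = A.T.det()            # |A^T| = |A|
det_inverse = A.inv().det()          # |A^-1| = 1/|A|
```

Note that scaling the whole matrix by 3 multiplies the determinant by 27, not by 3, because each of the three rows is scaled.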

2015/03/29, 2:04 PMII_06_Determinant_formulas_and_cofactors


This notebook is part of lecture 19 Determinant formulas and cofactors in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper







In [4]: x, y = symbols('x y')

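Determinant formulas and cofactors
Tridiagonal matrices

Creating an equation for the determinant of a 2×2 matrix

Using just the three main properties from the previous lecture and knowing that the determinant of a matrix with a column of zeros is zero we have the following

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = \begin{vmatrix} a & 0 \\ c & d \end{vmatrix} + \begin{vmatrix} 0 & b \\ c & d \end{vmatrix}$$

$$= \begin{vmatrix} a & 0 \\ c & 0 \end{vmatrix} + \begin{vmatrix} a & 0 \\ 0 & d \end{vmatrix} + \begin{vmatrix} 0 & b \\ c & 0 \end{vmatrix} + \begin{vmatrix} 0 & b \\ 0 & d \end{vmatrix}$$

$$\because \begin{vmatrix} a & 0 \\ c & 0 \end{vmatrix} = \begin{vmatrix} 0 & b \\ 0 & d \end{vmatrix} = 0$$

$$\begin{vmatrix} a & 0 \\ 0 & d \end{vmatrix} + \begin{vmatrix} 0 & b \\ c & 0 \end{vmatrix} = \begin{vmatrix} a & 0 \\ 0 & d \end{vmatrix} - \begin{vmatrix} c & 0 \\ 0 & b \end{vmatrix} = ad - bc$$

Creating an equation for the determinant of a 3×3 matrix

By the method above, this will create a lot of matrices
We need to figure out which ones remain, i.e. do not have columns of zeros
Note carefully that we just keep those with exactly one element from each row and each column

$$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = \begin{vmatrix} a_{11} & 0 & 0 \\ 0 & a_{22} & 0 \\ 0 & 0 & a_{33} \end{vmatrix} + \begin{vmatrix} a_{11} & 0 & 0 \\ 0 & 0 & a_{23} \\ 0 & a_{32} & 0 \end{vmatrix} + \begin{vmatrix} 0 & a_{12} & 0 \\ a_{21} & 0 & 0 \\ 0 & 0 & a_{33} \end{vmatrix}$$

$$+ \begin{vmatrix} 0 & a_{12} & 0 \\ 0 & 0 & a_{23} \\ a_{31} & 0 & 0 \end{vmatrix} + \begin{vmatrix} 0 & 0 & a_{13} \\ a_{21} & 0 & 0 \\ 0 & a_{32} & 0 \end{vmatrix} + \begin{vmatrix} 0 & 0 & a_{13} \\ 0 & a_{22} & 0 \\ a_{31} & 0 & 0 \end{vmatrix}$$

$$= a_{11} a_{22} a_{33} - a_{11} a_{23} a_{32} - a_{12} a_{21} a_{33} + a_{12} a_{23} a_{31} + a_{13} a_{21} a_{32} - a_{13} a_{22} a_{31}$$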








Creating an equation for the determinant of a n × n matrix

We will have n! terms, half of which are positive and the other half negative
We have n! because for the first row we have n positions to choose from, then for the second row we have n-1, and so on

$$|A| = \sum \pm a_{1\alpha} a_{2\beta} a_{3\gamma} \dots a_{n\omega}$$

This holds for permutations of the columns (each used only once)

$$(\alpha, \beta, \gamma, \delta, \dots, \omega) = \text{a permutation of } (1, 2, 3, 4, \dots, n)$$

Consider this example

$$\begin{bmatrix} 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}$$

Successively choosing a single nonzero element from each row (using column numbers for the Greek symbols above), we get the following permutations (note their sign as we interchange the numbers to put them in the order 1 2 3 4)

(4,3,2,1) = (1,2,3,4) Two swaps
(3,2,1,4) = -(1,2,3,4) One swap
That is it!
So we have 1 - 1 = 0

Note that in this example of a 4×4 matrix a lot of the permutations would have a zero in the product, so we won't end up with 4! = 24 nonzero terms

In [5]: A = Matrix([[0, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0], [1, 0, 0, 1]])
        A

Out[5]: $\begin{bmatrix} 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}$

In [6]: A.det()

Out[6]: 0

We could have seen that this matrix is singular by noting that some combination of rows gives identical rows and then, by subtraction, a row of zeros

In [7]: A.rref()

Out[7]: $\left( \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \; [0, 1, 2] \right)$

Cofactors of a 3×3 matrix

Start with the equation above

$$a_{11} a_{22} a_{33} - a_{11} a_{23} a_{32} - a_{12} a_{21} a_{33} + a_{12} a_{23} a_{31} + a_{13} a_{21} a_{32} - a_{13} a_{22} a_{31}$$

$$= a_{11} (a_{22} a_{33} - a_{23} a_{32}) + a_{12} (-a_{21} a_{33} + a_{23} a_{31}) + a_{13} (a_{21} a_{32} - a_{22} a_{31})$$

The cofactors are in parentheses and are the 2×2 submatrix determinants
They signify the determinant of a smaller (n-1)×(n-1) matrix with a sign attached, i.e. some are plus the determinant and some are minus the determinant
We are especially interested here in row one, but any row (or even column) will do
So for any $a_{ij}$ the cofactor is ± the determinant of the (n-1)×(n-1) matrix with row i and column j erased
For the sign, if i + j is even, the sign is positive and if it is odd, then the sign is negative
So the cofactor of $a_{ij}$ is $C_{ij}$

For rows we have

$$|A|_i = \sum_{k=1}^{n} a_{ik} C_{ik}$$
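The sum over permutations described in this notebook can be written out directly for small matrices. The function names `permutation_sign` and `big_formula_det` are made up for this sketch, and the approach is only practical for small n because it visits all n! permutations.

```python
from itertools import permutations
from sympy import Matrix

def permutation_sign(perm):
    """Return +1 for an even permutation and -1 for an odd one (by counting inversions)."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1

def big_formula_det(A):
    """Sum of signed products with one entry from each row and each column."""
    n = A.rows
    total = 0
    for perm in permutations(range(n)):
        term = permutation_sign(perm)
        for row, col in enumerate(perm):
            term *= A[row, col]
        total += term
    return total

# The 4x4 example from this notebook
A = Matrix([[0, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0], [1, 0, 0, 1]])

# Column choices (0-based) whose product has no zero factor
nonzero_terms = [
    perm for perm in permutations(range(4))
    if all(A[row, col] != 0 for row, col in enumerate(perm))
]
det_A = big_formula_det(A)
```

Only two of the 24 permutations survive for this matrix. In 0-based column numbering they are (2, 1, 0, 3) and (3, 2, 1, 0), which correspond to (3,2,1,4) and (4,3,2,1) in the text, and their signs cancel.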



Tridiagonal matrices

Calculate $|A_1|$

In [8]: A = Matrix([1])
        A

Out[8]: $\begin{bmatrix} 1 \end{bmatrix}$

In [9]: A.det()

Out[9]: 1

Calculate $|A_2|$

In [10]: A = Matrix([[1, 1], [1, 1]])
         A

Out[10]: $\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$

In [11]: A.det()

Out[11]: 0

Calculate $|A_3|$

In [12]: A = Matrix([[1, 1, 0], [1, 1, 1], [0, 1, 1]])
         A

Out[12]: $\begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{bmatrix}$

By the cofactor equation above

$$|A|_i = \sum_{k=1}^{n} a_{ik} C_{ik}$$

$$|A|_1 = a_{11} C_{11} + a_{12} C_{12} + a_{13} C_{13}$$

$$C_{ij} \rightarrow +; \quad (i + j) \in 2n$$
$$C_{ij} \rightarrow -; \quad (i + j) \in 2n + 1$$

$$|A|_1 = 1(0) - 1(1) + 0(1) = -1$$

In [13]: A.det()

Out[13]: −1

Calculate $|A_4|$

In [14]: A = Matrix([[1, 1, 0, 0], [1, 1, 1, 0], [0, 1, 1, 1], [0, 0, 1, 1]])
         A

Out[14]: $\begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}$

In [15]: A.det()

Out[15]: −1
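Rather than typing each matrix by hand, the tridiagonal matrices of ones can be generated for any size. The helper name `tridiagonal_ones` is invented for this sketch.

```python
from sympy import Matrix

def tridiagonal_ones(n):
    """n x n matrix with ones on the main diagonal and on the two adjacent diagonals."""
    return Matrix(n, n, lambda i, j: 1 if abs(i - j) <= 1 else 0)

# Determinants of the first eight matrices in the sequence
dets = [tridiagonal_ones(n).det() for n in range(1, 9)]
```

The first four entries of `dets` should match the outputs computed cell by cell above, and the later entries can be compared with the recurrence discussed next.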



Continuing on this path of tridiagonal matrices we have

$$|A_n| = |A_{n-1}| - |A_{n-2}|$$

We would thus have

$$|A_5| = |A_4| - |A_3| = -1 - (-1) = 0$$

$$|A_6| = |A_5| - |A_4| = 0 - (-1) = 1$$

We note that $A_7$ starts the sequence all over again
These tridiagonal matrices of ones have determinants that repeat with period 6

Example problems

Example problem 1

Calculate the determinant of the following matrix

In [16]: A = Matrix([[x, y, 0, 0, 0], [0, x, y, 0, 0], [0, 0, x, y, 0], [0, 0, 0, x, y], [y, 0, 0, 0, x]])
         A

Out[16]: $\begin{bmatrix} x & y & 0 & 0 & 0 \\ 0 & x & y & 0 & 0 \\ 0 & 0 & x & y & 0 \\ 0 & 0 & 0 & x & y \\ y & 0 & 0 & 0 & x \end{bmatrix}$

In [17]: A.det()

Out[17]: $x^5 + y^5$

Solution

Note how selecting column 1's x and then its y leaves triangular matrices in the remaining (n-1)×(n-1) matrix
These form cofactors and their determinants are simply the product of the entries along the main diagonal
We simply have to remember the sign rule, which for the y in row 5, column 1 will be $(-1)^{(5+1)}$

$$|A| = x(x^4) + y(y^4) = x^5 + y^5$$

Example problem 2

In [18]: A = Matrix([[x, y, y, y, y], [y, x, y, y, y], [y, y, x, y, y], [y, y, y, x, y], [y, y, y, y, x]])
         A

Out[18]: $\begin{bmatrix} x & y & y & y & y \\ y & x & y & y & y \\ y & y & x & y & y \\ y & y & y & x & y \\ y & y & y & y & x \end{bmatrix}$

Solution

In [19]: A.det()

Out[19]: $x^5 - 10x^3 y^2 + 20x^2 y^3 - 15x y^4 + 4y^5$

In [20]: (A.det()).factor()

Out[20]: $(x + 4y)(x - y)^4$



Note that we can introduce many zero entries by the elementary row operation of subtracting one row from another
Let's subtract row 4 from row 5

$$\begin{bmatrix} x & y & y & y & y \\ y & x & y & y & y \\ y & y & x & y & y \\ y & y & y & x & y \\ 0 & 0 & 0 & y-x & x-y \end{bmatrix}$$

Now subtract row 3 from 4

$$\begin{bmatrix} x & y & y & y & y \\ y & x & y & y & y \\ y & y & x & y & y \\ 0 & 0 & y-x & x-y & 0 \\ 0 & 0 & 0 & y-x & x-y \end{bmatrix}$$

Subtract 2 from 3

$$\begin{bmatrix} x & y & y & y & y \\ y & x & y & y & y \\ 0 & y-x & x-y & 0 & 0 \\ 0 & 0 & y-x & x-y & 0 \\ 0 & 0 & 0 & y-x & x-y \end{bmatrix}$$

... and 1 from 2

$$\begin{bmatrix} x & y & y & y & y \\ y-x & x-y & 0 & 0 & 0 \\ 0 & y-x & x-y & 0 & 0 \\ 0 & 0 & y-x & x-y & 0 \\ 0 & 0 & 0 & y-x & x-y \end{bmatrix}$$

Now consider some column operations, adding the 5th column to the 4th column and then the 4th to the 3rd, etc.
This will introduce new non-zero entries, though
These can be changed back to zero by adding the 5th column and the 4th to the 3rd
Then columns 5, 4 and 3 to 2, etc.

$$\begin{bmatrix} x+4y & 4y & 3y & 2y & y \\ 0 & x-y & 0 & 0 & 0 \\ 0 & 0 & x-y & 0 & 0 \\ 0 & 0 & 0 & x-y & 0 \\ 0 & 0 & 0 & 0 & x-y \end{bmatrix}$$

This is upper triangular and the determinant is the product of the entries on the main diagonal

In [21]: (x + 4 * y) * (x - y) ** 4

Out[21]: $(x + 4y)(x - y)^4$

In [22]: ((x + 4 * y) * (x - y) ** 4).expand()

Out[22]: $x^5 - 10x^3y^2 + 20x^2y^3 - 15xy^4 + 4y^5$
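The same elimination argument works for any size, not only 5×5. As a hedged sketch, the code below checks for a few small n that the matrix with x on the main diagonal and y everywhere else has determinant (x + (n-1)y)(x - y)^(n-1). The helper name `x_diag_y_elsewhere` and the range of n are my own choices for illustration.

```python
from sympy import Matrix, symbols, expand

x, y = symbols('x y')

def x_diag_y_elsewhere(n):
    """Hypothetical helper: n-by-n matrix with x on the main diagonal
    and y in every other position."""
    return Matrix(n, n, lambda i, j: x if i == j else y)

results = {}
for n in range(2, 7):
    det = x_diag_y_elsewhere(n).det()
    closed_form = (x + (n - 1) * y) * (x - y) ** (n - 1)
    # The difference expands to zero when the two expressions agree
    results[n] = expand(det - closed_form) == 0

print(results)
```

For n = 5 this is the factored form found in the example above.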

2015/03/29, 2:07 PMII_07_Equations_for_the_inverse_Cramer_rule_Volume_of_a_box


This notebook is part of lecture 20, Cramer's rule, inverse, and volume of a box, in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper





In [2]: from sympy import init_printing, symbols, eye, Matrix, Rational
        from warnings import filterwarnings


Equations for the inverse of a matrix

Cramer's rule

The volume of a box

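Deriving an equation for the inverse of a matrix

The equation for the inverse of a matrix

$$A^{-1} = \frac{1}{|A|}C^T$$

With arithmetic alteration we have the following

$$\therefore AC^T = |A|I$$

Writing out the left-hand side we have

$$\begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \dots & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{bmatrix} \begin{bmatrix} C_{11} & C_{21} & \dots & C_{n1} \\ C_{12} & C_{22} & \dots & C_{n2} \\ \vdots & \vdots & \dots & \vdots \\ C_{1n} & C_{2n} & \dots & C_{nn} \end{bmatrix}$$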








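From the previous lecture we had the equation for the determinant (using cofactors), which corresponds to the above when looking at row 1 times column 1 (i=1)

$$|A| = a_{i1}C_{i1} + a_{i2}C_{i2} + \dots + a_{in}C_{in}$$

Alas, we have to get |A| only on the main diagonal for the right-hand side above
It follows, though, that for e.g. position row 1, column 2 we do get a zero
Look at the 2×2 matrix

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{|A|}C^T$$
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{|A|}\begin{bmatrix} d & -c \\ -b & a \end{bmatrix}^T$$
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{|A|}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$

So for $AC^T$ we would have the following (note, though, what happens if we try and get row 1, column 2)

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}\begin{bmatrix} d & -b \\ -c & a \end{bmatrix} = \begin{bmatrix} ad-bc & -ab+ab \\ cd-cd & ad-bc \end{bmatrix} = \begin{bmatrix} ad-bc & 0 \\ 0 & ad-bc \end{bmatrix} = \begin{bmatrix} |A| & 0 \\ 0 & |A| \end{bmatrix} = |A|\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

... and that's so cool!

Cramer's rule

From Ax=b we have $\underline{x} = A^{-1}\underline{b}$, which gives us the following

$$\underline{x} = \frac{1}{|A|}C^T\underline{b}$$
$$\therefore \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \frac{1}{|A|}\begin{bmatrix} C_{11} & C_{21} & \dots & C_{n1} \\ C_{12} & C_{22} & \dots & C_{n2} \\ \vdots & \vdots & \dots & \vdots \\ C_{1n} & C_{2n} & \dots & C_{nn} \end{bmatrix}\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$$

This is difficult to see, but we successively replace each column in A with the column vector b, which creates a bunch of new matrices $B_j$, such that the following applies

$$\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} \frac{|B_1|}{|A|} \\ \frac{|B_2|}{|A|} \\ \vdots \\ \frac{|B_n|}{|A|} \end{bmatrix}$$

The volume of a box (parallelepiped)

Consider a box in three dimensions (each side is a parallelogram)
Make one corner coincide with the origin
The vector coordinates of the three sides that emanate from this corner become the rows of a square matrix, 3×3 in this case
The volume is then the determinant of this matrix (in absolute value)

Consider a cube with sides of unit length

In [4]: A = eye(3)
        A

Out[4]: $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

In [5]: A.det()

Out[5]: 1

This agrees with the first property of determinants, |I| = 1

What about an orthogonal matrix Q?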



Here we have the three edges being orthonormal
We know that $Q^TQ = I$

$$\left|Q^TQ\right| = |I|$$
$$\left|Q^T\right||Q| = |I|$$
$$\because \left|Q^T\right| = |Q| \quad \therefore |Q||Q| = |I|$$
$$|Q|^2 = |I| = 1 \quad \therefore |Q| = \pm 1$$

A rectangular box (edges at right angles)

Doubling an edge doubles the volume
This is akin to a single row multiplied by a scalar
Thus the determinant will be multiplied by this scalar

Area of parallelogram and a triangle

The area of a parallelogram is just the determinant of a 2×2 matrix with the rows being the row vectors of two sides from the origin
The area of a triangle is simply half of this

For the triangle that is not at the origin, with three vertices at $(x_1, y_1)$, $(x_2, y_2)$ and $(x_3, y_3)$, we simply subtract values along the axes from each other
That is akin to getting the determinant of this matrix

$$\begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix}$$

Simple row reduction is equivalent to moving the triangle by the subtraction above

$$\begin{vmatrix} x_1 & y_1 & 1 \\ x_2 - x_1 & y_2 - y_1 & 0 \\ x_3 - x_1 & y_3 - y_1 & 0 \end{vmatrix}$$

Example problems

Example problem 1

Calculate the volume of the tetrahedron with vertices being the following vectors
$a_1 = (2, 2, -1)$, $a_2 = (1, 3, 0)$, $a_3 = (-1, 1, 4)$

Also calculate the volume if $a_3 = (-201, -199, 104)$

Solution

In [6]: A = Matrix([[2, 2, -1], [1, 3, 0], [-1, 1, 4]])
        A

Out[6]: $\begin{bmatrix} 2 & 2 & -1 \\ 1 & 3 & 0 \\ -1 & 1 & 4 \end{bmatrix}$

The volume of a tetrahedron is a third times the area of the (any) base times the height from the (chosen) base
The volume of a parallelepiped is the area of the base times the height
If we keep the base plane of the two the same and the apex the same, we note that the base of the parallelepiped is twice the area of the triangle that forms the base of the tetrahedron
We thus have that the volume of the tetrahedron is a 6th of the volume of the parallelepiped



In [7]: A.det()

Out[7]: 12

In [8]: Rational(1, 6) * A.det()

Out[8]: 2

In [9]: A_new = Matrix([[2, 2, -1], [1, 3, 0], [-201, -199, 104]])
        A_new

Out[9]: $\begin{bmatrix} 2 & 2 & -1 \\ 1 & 3 & 0 \\ -201 & -199 & 104 \end{bmatrix}$

In [10]: Rational(1, 6) * A_new.det()

Out[10]: 2

By the second part of the third property of determinants we know, though, that a constant multiple of a row subtracted from another (one of the elementary row operations) does not change the determinant
In this case we subtracted 100 times row 1 from row 3

In effect, the height is not changing; the apex is moving away parallel to $a_1$
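Returning to the inverse formula and Cramer's rule from earlier in this notebook, both can be tried out directly in SymPy. This is only a sketch: the right-hand side `b` is a made-up vector chosen for illustration, and the function name `cramer` is my own. Cramer's rule is handy for seeing the structure of the solution, but elimination is far cheaper for actual computation.

```python
from sympy import Matrix

A = Matrix([[2, 2, -1], [1, 3, 0], [-1, 1, 4]])
b = Matrix([1, 2, 3])  # hypothetical right-hand side, for illustration only

# Inverse from the cofactor formula: A^{-1} = C^T / |A|
C = A.cofactor_matrix()
A_inv_from_cofactors = C.T / A.det()

def cramer(A, b):
    """Solve Ax = b by replacing column j of A with b to form B_j
    and taking x_j = |B_j| / |A|."""
    det_A = A.det()
    x = []
    for j in range(A.cols):
        B_j = A.copy()
        B_j[:, j] = b
        x.append(B_j.det() / det_A)
    return Matrix(x)

x = cramer(A, b)
print(A_inv_from_cofactors == A.inv())
print(A * x == b)
```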

2015/03/29, 2:10 PMII_08_Eigenvalues_and_eigenvectors


This notebook is part of lecture 21, Eigenvalues and eigenvectors, in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper





In [2]: from sympy import init_printing, Matrix, symbols, eye
        from warnings import filterwarnings


In [4]: lamda = symbols('lamda') # Note that lambda is a reserved word in python, so we use lamda (without the b)

Eigenvalues and eigenvectors

What are eigenvectors?

A matrix is a mathematical object that acts on a (column) vector, resulting in a new vector, i.e. Ax=b
An eigenvector is a vector x for which the resulting vector Ax is parallel to x (some multiple of x)

$$A\underline{x} = \lambda\underline{x}$$

The eigenvectors with an eigenvalue of zero are the non-zero vectors in the nullspace
If A is singular (takes some non-zero vector into 0) then λ=0 is an eigenvalue

What are the eigenvectors and eigenvalues for projection matrices?

A projection matrix P projects some vector (b) onto a subspace (in 3-space we are talking about a plane through the origin)
In general Pb is not in the same direction as b
A vector x that is already in the subspace will result in Px=x, so λ=1
Another good x would be one perpendicular to the subspace, i.e. Px=0x, so λ=0

What are the eigenvectors and eigenvalues for permutation matrices?








A permutation matrix such as the one below changes the order of the elements in a (column) vector

$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$

A good example of a vector that would remain in the same direction after multiplication by the permutation matrix above would be the following vector

$$\begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

The eigenvalue would just be λ=1
The next (eigen)vector would also work

$$\begin{bmatrix} -1 \\ 1 \end{bmatrix}$$

It would have an eigenvalue of λ=-1

The trace and the determinant

The trace is the sum of the values down the main diagonal of a square matrix
Note how this is the same as the sum of the eigenvalues (look at the permutation matrix above and its eigenvalues)
The determinant of A is the product of the eigenvalues

How to solve Ax=λx

$$A\underline{x} = \lambda\underline{x}$$
$$(A - \lambda I)\underline{x} = \underline{0}$$

The only way for this equation to have a non-zero solution x is for A-λI to be singular and therefore have a determinant of zero

$$|A - \lambda I| = 0$$

This is called the characteristic (or eigenvalue) equation
There will be n λ's for an n×n matrix (some of which may be of equal value)

In [5]: A = Matrix([[3, 1], [1, 3]])
        I = eye(2)
        A, I # Printing A and the 2-by-2 identity matrix to the screen

Out[5]: $\left(\begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\right)$

In [6]: (A - lamda * I) # Printing A minus lambda times the identity matrix to the screen

Out[6]: $\begin{bmatrix} -\lambda + 3 & 1 \\ 1 & -\lambda + 3 \end{bmatrix}$

This will have the following determinant

In [7]: (A - lamda * I).det()

Out[7]: $\lambda^2 - 6\lambda + 8$

For this 2×2 matrix the absolute value of the -6 is the trace of A and the 8 is the determinant of A

In [8]: ((A - lamda * I).det()).factor()

Out[8]: $(\lambda - 4)(\lambda - 2)$

I now have two eigenvalues of 2 and 4

In python we could also use the .eigenvals() method

In [9]: A.eigenvals() # There is one value of 2 and one value of 4

Out[9]: {2: 1, 4: 1}
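The link between the eigenvalues, the trace and the determinant can be checked for this matrix in a few lines. This is a small sketch that relies only on the fact that `.eigenvals()` returns a dictionary of eigenvalue to multiplicity; the variable names `eig_sum` and `eig_prod` are my own.

```python
from sympy import Matrix

A = Matrix([[3, 1], [1, 3]])
eigs = A.eigenvals()  # {eigenvalue: algebraic multiplicity}

# Sum and product of the eigenvalues, counting repeated ones
eig_sum = sum(value * mult for value, mult in eigs.items())
eig_prod = 1
for value, mult in eigs.items():
    eig_prod *= value ** mult

print(eig_sum, A.trace())
print(eig_prod, A.det())
```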



The eigenvectors are calculated by substituting the two values of λ into the original equation

$$(A - \lambda I)\underline{x} = \underline{0}$$

Out[10]: $\left[\left(2, 1, \left[\begin{bmatrix} -1 \\ 1 \end{bmatrix}\right]\right), \left(4, 1, \left[\begin{bmatrix} 1 \\ 1 \end{bmatrix}\right]\right)\right]$

The results above are interpreted as follows: each entry lists an eigenvalue, its multiplicity and its eigenvector(s)
The first eigenvalue has one eigenvector and the second eigenvalue also has a single eigenvector

Note the similarity between the eigenvectors of the two examples above
It is easy to see that adding a constant multiple of the identity matrix to another matrix (above we added 3I to the initial matrix) doesn't change the eigenvectors; it does add that constant to the eigenvalues though (we went from -1 and 1 to 2 and 4)

$$A\underline{x} = \lambda\underline{x} \quad \therefore (A + cI)\underline{x} = (\lambda + c)\underline{x}$$

If we add another matrix to A (not a constant multiple of I) or even multiply them, then the influence on the original eigenvalues and eigenvectors of A is NOT so predictable (as above)

The eigenvalues and eigenvectors of a rotation matrix

Consider this rotation matrix that rotates a vector by 90° (it is orthogonal)
Think about it, though: what vector can come out parallel to itself after a 90° rotation?

In [11]: Q = Matrix([[0, -1], [1, 0]])
         Q

Out[11]: $\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$

From the trace and determinant above we know that we will have the following equation

$$\lambda^2 - 0\lambda + 1 = 0$$
$$\lambda^2 = -1$$

In [12]: Q.eigenvals()

Out[12]: $\{-i: 1,\ i: 1\}$

In [13]: Q.eigenvects()

Out[13]: $\left[\left(-i, 1, \left[\begin{bmatrix} -i \\ 1 \end{bmatrix}\right]\right), \left(i, 1, \left[\begin{bmatrix} i \\ 1 \end{bmatrix}\right]\right)\right]$

Note how the eigenvalues are complex conjugates
Symmetric matrices will only have real eigenvalues
A real anti-symmetric matrix (where the transpose is the original matrix times the scalar -1, as our example above) will only have purely imaginary eigenvalues
Matrices in between can have a mix of these

Eigenvalues and eigenvectors of an upper triangular matrix

Compute the eigenvalues and eigenvectors of the following matrix (note it is upper triangular)

In [14]: A = Matrix([[3, 1], [0, 3]])
         A

Out[14]: $\begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix}$

Out[15]: {3: 2}



We have two eigenvalues, both equal to 3

Out[16]: $\left[\left(3, 2, \left[\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right]\right)\right]$

This is a degenerate matrix; it does not have two independent eigenvectors

Look at this upper triangular matrix

In [17]: A = Matrix([[3, 1, 1], [0, 3, 4], [0, 0, 3]])
         A

Out[17]: $\begin{bmatrix} 3 & 1 & 1 \\ 0 & 3 & 4 \\ 0 & 0 & 3 \end{bmatrix}$

Out[18]: {3: 3}

Out[19]: $\left[\left(3, 3, \left[\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\right]\right)\right]$

Example problems

Example problem 1

Find the eigenvalues and eigenvectors of the square of the following matrix as well as the inverse of the matrix minus the identity matrix

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & -2 \\ 0 & 1 & 4 \end{bmatrix}$$

Solution

Notice the following

$$A\underline{x} = \lambda\underline{x}$$
$$A^2\underline{x} = A(A\underline{x}) = A(\lambda\underline{x}) = \lambda(A\underline{x}) = \lambda^2\underline{x}$$

Once we know the eigenvalues for A we then simply square them to get the eigenvalues of the matrix squared

Similarly for the inverse of the matrix we have the following (for a non-zero λ, which is fine as A must be invertible for this problem)

$$A^{-1}\underline{x} = A^{-1}\frac{A\underline{x}}{\lambda} = \frac{1}{\lambda}A^{-1}A\underline{x} = \frac{1}{\lambda}\underline{x}$$

In [20]: A = Matrix([[1, 2, 3], [0, 1, -2], [0, 1, 4]])
         A

Out[20]: $\begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & -2 \\ 0 & 1 & 4 \end{bmatrix}$

Out[21]: {1: 1, 2: 1, 3: 1}




Out[22]: $\left[\left(1, 1, \left[\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\right]\right), \left(2, 1, \left[\begin{bmatrix} -1 \\ -2 \\ 1 \end{bmatrix}\right]\right), \left(3, 1, \left[\begin{bmatrix} \frac{1}{2} \\ -1 \\ 1 \end{bmatrix}\right]\right)\right]$

From this it is clear that the eigenvalues of $A^2$ will be 1, 4, and 9 and for $A^{-1}$ would be 1, a half and a third

In [23]: (A ** 2).eigenvals()

Out[23]: {1: 1, 4: 1, 9: 1}

In [24]: (A.inv()).eigenvals()

Out[24]: $\left\{\frac{1}{3}: 1,\ \frac{1}{2}: 1,\ 1: 1\right\}$

The eigenvectors will be as follows (exactly the same)

In [25]: (A ** 2).eigenvects()

Out[25]: $\left[\left(1, 1, \left[\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\right]\right), \left(4, 1, \left[\begin{bmatrix} -1 \\ -2 \\ 1 \end{bmatrix}\right]\right), \left(9, 1, \left[\begin{bmatrix} \frac{1}{2} \\ -1 \\ 1 \end{bmatrix}\right]\right)\right]$

In [26]: (A.inv()).eigenvects()

Out[26]: $\left[\left(\frac{1}{3}, 1, \left[\begin{bmatrix} \frac{1}{2} \\ -1 \\ 1 \end{bmatrix}\right]\right), \left(\frac{1}{2}, 1, \left[\begin{bmatrix} -1 \\ -2 \\ 1 \end{bmatrix}\right]\right), \left(1, 1, \left[\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\right]\right)\right]$
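Looking back at the remark in this notebook that adding a multiple of the identity keeps the eigenvectors and shifts the eigenvalues, here is a minimal sketch using the 2×2 permutation matrix and c = 3. The names `P`, `c` and `shifted` are my own.

```python
from sympy import Matrix, eye

P = Matrix([[0, 1], [1, 0]])   # permutation matrix with eigenvalues -1 and 1
c = 3
shifted = P + c * eye(2)       # this is the matrix [[3, 1], [1, 3]]

checks = []
for eigenvalue, multiplicity, vectors in P.eigenvects():
    for x in vectors:
        # Same eigenvector x, eigenvalue moved from lambda to lambda + c
        checks.append(P * x == eigenvalue * x)
        checks.append(shifted * x == (eigenvalue + c) * x)

print(all(checks))
print(shifted.eigenvals())
```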


Page 1 of 8http://localhost:8888/nbconvert/html/II_09_Diagonalization_and_Powers.ipynb?download=false

This notebook is part of lecture 22, Diagonalization and powers of A, in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper







Diagonalizing a matrix
Powers of a matrix A

Definition

If A is an n×n matrix, then a non-zero vector x in $\mathbb{R}^n$ is called an eigenvector of the matrix A if Ax is a scalar multiple of x
What this suggests is that if you consider the column vector x and multiply it by a scalar (here called λ) (which is then parallel to x, just of different length) it results in the same solution as multiplying the matrix A by x
Let's try another explanation: if a matrix A, multiplied with a (column) vector (x), results in a scalar multiple of that same (column) vector (and is thus parallel to that (column) vector) then this (column) vector is an eigenvector of the matrix A

In essence this multiplication of a matrix with a (column) vector produces another vector on the same line as the original vector
Depending on the value of this scalar the resulting vector might point in the opposite direction and be shorter or longer than the original

This scalar multiple is called the eigenvalue
Matrices can have more than one eigenvalue and eigenvector

Derivations

We need to insert an identity matrix of size n into the equation that describes the explanation above

$$A\underline{x} = \lambda\underline{x}$$
$$A\underline{x} = \lambda I\underline{x}$$
$$A\underline{x} - \lambda I\underline{x} = \underline{0}$$
$$(A - \lambda I)\underline{x} = \underline{0}$$

Look at this carefully and you'll notice that we are describing the nullspace (eigenspace) of the matrix (A-λI)
This matrix has to be singular, i.e. have a determinant of 0

$$|A - \lambda I| = 0$$

Solving this equation (called the characteristic equation) will give us the eigenvalues (λ's)
It will always be a polynomial in λ (called the characteristic polynomial of A), with a leading coefficient of 1 and a degree of n corresponding to the size of A

$$p(\lambda) = \lambda^n + c_1\lambda^{n-1} + \dots + c_n$$

Substituting them back into...

$$(A - \lambda I)\underline{x} = \underline{0}$$

... allows us to calculate the eigenvector(s) x








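Let's look at the following matrix A

$$A = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix}$$
$$A - \lambda I = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix} - \begin{bmatrix} \lambda & 0 & 0 \\ 0 & \lambda & 0 \\ 0 & 0 & \lambda \end{bmatrix} = \begin{bmatrix} -\lambda & 0 & -2 \\ 1 & 2-\lambda & 1 \\ 1 & 0 & 3-\lambda \end{bmatrix}$$
$$\begin{vmatrix} -\lambda & 0 & -2 \\ 1 & 2-\lambda & 1 \\ 1 & 0 & 3-\lambda \end{vmatrix} = 0$$
$$\lambda^3 - 5\lambda^2 + 8\lambda - 4 = 0$$
$$\lambda_1 = 1,\ \lambda_2 = \lambda_3 = 2$$

Let's start with the first eigenvalue, which is equal to 1, and substitute it into A-λI

In [4]: A = Matrix([[-1, 0, -2], [1, 1, 1], [1, 0, 2]])
        A

Out[4]: $\begin{bmatrix} -1 & 0 & -2 \\ 1 & 1 & 1 \\ 1 & 0 & 2 \end{bmatrix}$

We now need the nullspace of this matrix

Out[5]: $\left[\begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix}\right]$

We can see that this has to be 1-dimensional by looking at the row-reduced form

In [6]: A.rref()

Out[6]: $\left(\begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}, [0, 1]\right)$

It has rank 2 (two pivot columns and 1 free variable)

Now for the other 2 eigenvalues, both equaling 2

In [7]: A = Matrix([[-2, 0, -2], [1, 0, 1], [1, 0, 1]])
        A

Out[7]: $\begin{bmatrix} -2 & 0 & -2 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix}$

Out[8]: $\left[\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}\right]$

In [9]: A.rref()

Out[9]: $\left(\begin{bmatrix} 1 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, [0]\right)$

Only a single pivot column, therefore a rank of 1 and two free variables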
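The two nullspace computations above can be bundled into one loop that also confirms each basis vector really is an eigenvector of the original matrix. This is a sketch under the assumption that the eigenvalues 1 and 2 have already been found; the dictionary name `eigenspaces` is my own.

```python
from sympy import Matrix, eye, symbols, factor

lamda = symbols('lamda')
A = Matrix([[0, 0, -2], [1, 2, 1], [1, 0, 3]])

# Characteristic polynomial |A - lambda I|, factored to show the roots
char_poly = (A - lamda * eye(3)).det()
print(factor(char_poly))

eigenspaces = {}
all_are_eigenvectors = True
for eigenvalue in (1, 2):
    # The eigenspace is the nullspace of A - lambda I
    basis = (A - eigenvalue * eye(3)).nullspace()
    eigenspaces[eigenvalue] = basis
    for x in basis:
        all_are_eigenvectors = all_are_eigenvectors and (A * x == eigenvalue * x)

print({value: len(basis) for value, basis in eigenspaces.items()})
```

The printed dictionary gives the dimension of each eigenspace.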



Corresponding to the first eigenvalue we have a single eigenvector that is the basis for a 1-dimensional (line) eigenspace in $\mathbb{R}^3$
Corresponding to the second (and third) eigenvalues we have two basis vectors for a 2-dimensional plane in $\mathbb{R}^3$
Since we are talking about subspaces, we must note that the zero vector must be in both eigenspaces (a type of nullspace), but isn't an eigenvector

The eigenvalues of triangular (upper and lower) and diagonal matrices

The eigenvalues of these types of matrices are exactly the entries along the main diagonal

Real and complex eigenvalues

There will be characteristic polynomials resulting in complex roots
The consequences of real-valued eigenvalues for a square matrix A of size n are the following

The system (A-λI)x=0 has non-trivial solutions
There is a non-zero vector x in $\mathbb{R}^n$ such that Ax=λx

The eigenvector matrix S and eigenvalue matrix Λ

We need to create S from the (column) eigenvectors such that the following holds

$$S^{-1}AS = \Lambda$$

As such, S should be square of size n×n and invertible, so we need n independent eigenvectors

Suppose we have n linearly independent eigenvectors of A
Put them in the columns of S and calculate AS

$$AS = A\begin{bmatrix} \vdots & \vdots & \vdots & \vdots \\ \underline{x}_1 & \underline{x}_2 & \dots & \underline{x}_n \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix} = \begin{bmatrix} \vdots & \vdots & \vdots & \vdots \\ \lambda_1\underline{x}_1 & \lambda_2\underline{x}_2 & \dots & \lambda_n\underline{x}_n \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix} = \begin{bmatrix} \vdots & \vdots & \vdots & \vdots \\ \underline{x}_1 & \underline{x}_2 & \dots & \underline{x}_n \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix}\begin{bmatrix} \lambda_1 & 0 & \dots & 0 \\ 0 & \lambda_2 & \dots & 0 \\ \vdots & \vdots & \dots & \vdots \\ 0 & 0 & \dots & \lambda_n \end{bmatrix}$$
$$AS = S\Lambda$$
$$S^{-1}AS = \Lambda$$
$$A = S\Lambda S^{-1}$$

Later I will use the computer variable D for this diagonal matrix Λ

The power of a matrix (only for n independent eigenvectors)

We saw in the example section of the last lecture that the following holds

$$A^2\underline{x} = \lambda^2\underline{x}$$

The eigenvectors are the same for A and $A^2$
We can also see the following

$$A^2 = S\Lambda S^{-1}S\Lambda S^{-1} = S\Lambda^2S^{-1}$$

The power need not be 2, but any k, which will have $S^{-1}S$ appearing k-1 times

We thus have the following theorems

$$A^k \to 0 \ \text{as} \ k \to \infty \ \text{if all} \ |\lambda_i| < 1$$

...and...
If k is a positive integer, λ is an eigenvalue of the matrix A, and x is a corresponding eigenvector, then $\lambda^k$ is an eigenvalue of $A^k$ and x is a corresponding eigenvector
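The identity for powers stated above can be checked on the 3×3 example from earlier in this notebook. This is a small sketch: the power k = 5 is an arbitrary choice, and `diagonalize()` is used here to obtain S and the diagonal matrix in one step.

```python
from sympy import Matrix, diag

A = Matrix([[0, 0, -2], [1, 2, 1], [1, 0, 3]])
S, D = A.diagonalize()   # A = S * D * S^{-1}, with D playing the role of Lambda

k = 5  # arbitrary power, chosen for illustration
# Raising a diagonal matrix to a power only raises each diagonal entry
D_k = diag(*[D[i, i] ** k for i in range(D.rows)])

power_via_eigen = S * D_k * S.inv()
power_direct = A ** k

print(power_via_eigen == power_direct)
```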



What makes a matrix diagonalizable

In discussing diagonalization we are concerned with finding a basis for $\mathbb{R}^n$ that consists of eigenvectors of a given square matrix of size n
These bases can tell us about geometric properties of A and can simplify numerical computations involving A

We need to answer two questions (which are actually the same)
Given a square matrix of size n, is there a basis for $\mathbb{R}^n$ consisting of eigenvectors?
Given a square matrix of size n, is there an invertible matrix S, such that $S^{-1}AS$ is a diagonal matrix? (It is the same matrix S referred to above)

If such a matrix S exists, it is said to diagonalize A (and we will call the resultant diagonal matrix D)

In short the answer to the above question(s) is yes if A has n independent eigenvectors
This happens if all λ's are different (none are repeated) (not totally excluded if they are repeated though)

If they are repeated, we still might have independent eigenvectors, e.g. any size identity matrix (because it is already diagonal)

In [10]: A = eye(5)
         A

Out[10]: $\begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$

Out[11]: {1: 5}

Out[12]: $\left[\left(1, 5, \left[\begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}\right]\right)\right]$

Here we look at a triangular matrix, though

In [13]: A = Matrix([[2, 1], [0, 2]])
         A.eigenvals()

Out[13]: {2: 2}

Out[14]: $\left[\left(2, 2, \left[\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right]\right)\right]$

We can use python™ code to calculate the diagonalized matrix

In [15]: A = Matrix([[3, -2, 4, -2], [5, 3, -3, -2], [5, -2, 2, -2], [5, -2, -3, 3]])
         A

Out[15]: $\begin{bmatrix} 3 & -2 & 4 & -2 \\ 5 & 3 & -3 & -2 \\ 5 & -2 & 2 & -2 \\ 5 & -2 & -3 & 3 \end{bmatrix}$



In [17]: S # S, such that A = S times D times the inverse of S

Out[17]: $\begin{bmatrix} 0 & 1 & 1 & 0 \\ 1 & 1 & 1 & -1 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{bmatrix}$

In [18]: D # The diagonal

Out[18]: $\begin{bmatrix} -2 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 5 & 0 \\ 0 & 0 & 0 & 5 \end{bmatrix}$

In [19]: S * D * S.inv() == A # Checking to see if our statement above is correct

Out[19]: True

In [20]: S.inv() * A * S == D # Checking to see if our statement above is correct

Out[20]: True

Out[21]: {−2: 1, 3: 1, 5: 2}

Remember Λ from above?
The eigenvalues are precisely the entries along the main diagonal of the diagonal matrix

To produce the required diagonal matrix manually will require computing n linearly independent eigenvectors for matrix A of size n (assuming that it is diagonalizable), creating a matrix with its columns equal to these eigenvectors (called matrix S) and evaluating $S^{-1}AS$ to calculate the diagonal matrix D (or Λ)

Back to the topic of what makes a matrix diagonalizable

Suppose we have an equation that starts with some vector and every subsequent vector is a matrix A times the previous vector

$$\underline{u}_{k+1} = A\underline{u}_k$$

From this arises the following

$$\underline{u}_1 = A\underline{u}_0$$
$$\underline{u}_2 = AA\underline{u}_0 = A^2\underline{u}_0$$
$$\underline{u}_k = A^k\underline{u}_0$$

To really solve this problem, rewrite $\underline{u}_0$ as follows (a sum of scalar multiples of the eigenvectors)

$$\underline{u}_0 = c_1\underline{x}_1 + c_2\underline{x}_2 + \dots + c_n\underline{x}_n = S\underline{c}$$

Where Sc is a linear combination of the individual eigenvectors

Now multiply both sides by A

$$A\underline{u}_0 = c_1A\underline{x}_1 + c_2A\underline{x}_2 + \dots + c_nA\underline{x}_n$$
$$A\underline{u}_0 = c_1\lambda_1\underline{x}_1 + c_2\lambda_2\underline{x}_2 + \dots + c_n\lambda_n\underline{x}_n$$

Taking a power of A now (i.e. k) would be akin to taking each eigenvalue to that power

$$A^k\underline{u}_0 = c_1\lambda_1^k\underline{x}_1 + c_2\lambda_2^k\underline{x}_2 + \dots + c_n\lambda_n^k\underline{x}_n$$

This can be written as

$$\underline{u}_k = A^k\underline{u}_0 = S\Lambda^k\underline{c}$$
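The last formula can be tried on a small matrix before it is put to work. In this sketch the matrix, the starting vector `u0` and the step count `k` are all my own choices for illustration; the coefficient vector c comes from solving Sc = u_0.

```python
from sympy import Matrix, diag

A = Matrix([[3, 1], [1, 3]])   # illustrative matrix with eigenvalues 2 and 4
u0 = Matrix([1, 0])            # illustrative starting vector
k = 6                          # illustrative number of steps

S, D = A.diagonalize()         # columns of S are eigenvectors, D is Lambda
c = S.inv() * u0               # coefficients in u0 = c1 x1 + c2 x2

# Each eigenvalue is simply raised to the power k
D_k = diag(*[D[i, i] ** k for i in range(D.rows)])
u_k = S * D_k * c

print(u_k)
print(u_k == A ** k * u0)
```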



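As an example consider the Fibonacci numbers: 0, 1, 1, 2, 3, 5, 8, 13, ...
What would the 100th number be?
Consider the following

$$F_{k+2} = F_{k+1} + F_k$$

This is a (second-order) difference equation; think of this example as similar to a second-order differential equation (without derivatives)
By adding a second equation $F_{k+1} = F_{k+1}$, consider $\underline{u}_k$ to be the following vector

$$\underline{u}_k = \begin{bmatrix} F_{k+1} \\ F_k \end{bmatrix}$$
$$\underline{u}_{k+1} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} F_{k+1} \\ F_k \end{bmatrix} = \begin{bmatrix} F_{k+1} + F_k \\ F_{k+1} \end{bmatrix}$$
$$\underline{u}_{k+1} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}\underline{u}_k$$

In [22]: A = Matrix([[1, 1], [1, 0]])
         A

Out[22]: $\begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$

Out[23]: $\left\{\frac{1}{2} + \frac{\sqrt{5}}{2}: 1,\ -\frac{\sqrt{5}}{2} + \frac{1}{2}: 1\right\}$

Out[24]: $\left[\left(\frac{1}{2} + \frac{\sqrt{5}}{2}, 1, \left[\begin{bmatrix} -\frac{1}{-\frac{\sqrt{5}}{2} + \frac{1}{2}} \\ 1 \end{bmatrix}\right]\right), \left(-\frac{\sqrt{5}}{2} + \frac{1}{2}, 1, \left[\begin{bmatrix} -\frac{1}{\frac{1}{2} + \frac{\sqrt{5}}{2}} \\ 1 \end{bmatrix}\right]\right)\right]$

In [26]: D

Out[26]: $\begin{bmatrix} \frac{1}{2} + \frac{\sqrt{5}}{2} & 0 \\ 0 & -\frac{\sqrt{5}}{2} + \frac{1}{2} \end{bmatrix}$

From above we remember the following

$$\underline{u}_k = A^k\underline{u}_0 = S\Lambda^k\underline{c}$$

We have $\underline{u}_0$ containing the first two values

In [27]: u_zero = Matrix([1, 0])
         u_100 = A ** 100 * u_zero
         u_100 # u_100 = [F_101, F_100]; with F_0 = 0 the bottom value is F_100

Out[27]: $\begin{bmatrix} 573147844013817084101 \\ 354224848179261915075 \end{bmatrix}$

In [28]: u_four = A ** 4 * u_zero
         u_four # u_4 = [F_5, F_4]; with F_0 = 0 the bottom value is F_4

Out[28]: $\begin{bmatrix} 5 \\ 3 \end{bmatrix}$

Example problems

Example problem 1

Find an equation for $C^k$ where C is given by the following matrix

$$C = \begin{bmatrix} 2b - a & a - b \\ 2b - 2a & 2a - b \end{bmatrix}$$

Calculate $C^{100}$ when a=b=-1

Solution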



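In [29]: a, b, k = symbols('a b k')

Out[30]: $\begin{bmatrix} -a + 2b & a - b \\ -2a + 2b & 2a - b \end{bmatrix}$

We remember the following

$$A^k = S\Lambda^kS^{-1}$$

Where Λ is denoted by the computer variable D

In [32]: S

Out[32]: $\begin{bmatrix} -\frac{2a - 2b}{-3a + 3b + \sqrt{(a-b)^2}} & \frac{2a - 2b}{3a - 3b + \sqrt{(a-b)^2}} \\ 1 & 1 \end{bmatrix}$

Python™ is not always good at simplifying these
If you look at it carefully you will note the following

Out[33]: $\begin{bmatrix} 1 & \frac{1}{2} \\ 1 & 1 \end{bmatrix}$

In [34]: D

Out[34]: $\begin{bmatrix} \frac{a}{2} + \frac{b}{2} - \frac{1}{2}\sqrt{(a-b)^2} & 0 \\ 0 & \frac{a}{2} + \frac{b}{2} + \frac{1}{2}\sqrt{(a-b)^2} \end{bmatrix}$

In [35]: D = Matrix([[b, 0], [0, a]])
         D

Out[35]: $\begin{bmatrix} b & 0 \\ 0 & a \end{bmatrix}$

For the values given, we have the following

In [36]: C = Matrix([[-1, 0], [0, -1]])
         C

Out[36]: $\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$

In [38]: D

Out[38]: $\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$

In [39]: D ** 100

Out[39]: $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

In [40]: S * (D ** 100) * S.inv()

Out[40]: $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$

Doing the same, but with eigenvalues and eigenvectors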




Out[41]: $\begin{bmatrix} -a + 2b & a - b \\ -2a + 2b & 2a - b \end{bmatrix}$

In [42]: C.eigenvals()

Out[42]: $\left\{\frac{a}{2} + \frac{b}{2} - \frac{1}{2}\sqrt{(a-b)^2}: 1,\ \frac{a}{2} + \frac{b}{2} + \frac{1}{2}\sqrt{(a-b)^2}: 1\right\}$

This simplifies to $\lambda_1 = b$ and $\lambda_2 = a$
That makes Λ (or D) the following

In [43]: D = Matrix([[b, 0], [0, a]])
         D

Out[43]: $\begin{bmatrix} b & 0 \\ 0 & a \end{bmatrix}$

In [44]: C.eigenvects() # The solution is two tuples, with each being eigenvalue, multiplicity, eigenvector

Out[44]: $\left[\left(\frac{a}{2} + \frac{b}{2} - \frac{1}{2}\sqrt{(a-b)^2}, 1, \left[\begin{bmatrix} -\frac{a-b}{-\frac{3a}{2} + \frac{3b}{2} + \frac{1}{2}\sqrt{(a-b)^2}} \\ 1 \end{bmatrix}\right]\right), \left(\frac{a}{2} + \frac{b}{2} + \frac{1}{2}\sqrt{(a-b)^2}, 1, \left[\begin{bmatrix} -\frac{a-b}{-\frac{3a}{2} + \frac{3b}{2} - \frac{1}{2}\sqrt{(a-b)^2}} \\ 1 \end{bmatrix}\right]\right)\right]$

This simplifies to the following eigenvector matrix S

Out[45]: $\begin{bmatrix} 1 & \frac{1}{2} \\ 1 & 1 \end{bmatrix}$

We can see if we can get back to C

In [46]: S * D * S.inv()

Out[46]: $\begin{bmatrix} -a + 2b & a - b \\ -2a + 2b & 2a - b \end{bmatrix}$

In [47]: S * D * S.inv() == C

Out[47]: True

Python™ won't compute $D^k$ for you, but it's easy to do yourself

In [48]: D = Matrix([[b ** k, 0], [0, a ** k]])
         D

Out[48]: $\begin{bmatrix} b^k & 0 \\ 0 & a^k \end{bmatrix}$

Now we can compute $S\Lambda^kS^{-1}$

In [49]: S * D * S.inv()

Out[49]: $\begin{bmatrix} -a^k + 2b^k & a^k - b^k \\ -2a^k + 2b^k & 2a^k - b^k \end{bmatrix}$

Placing the given values into this equation will give you the same solution for $C^{100}$ as above
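As a sanity check on the closed form for the powers of C found in this example, the sketch below compares it with direct matrix powers for a few small k and then substitutes a = b = -1 for k = 100. The function name `C_power` and the range of k are my own choices; S is the simplified eigenvector matrix from the solution.

```python
from sympy import Matrix, Rational, symbols, zeros, eye

a, b = symbols('a b')
C = Matrix([[-a + 2*b, a - b], [-2*a + 2*b, 2*a - b]])
S = Matrix([[1, Rational(1, 2)], [1, 1]])   # eigenvectors for b and a

def C_power(k):
    """Closed form S * Lambda^k * S^{-1} with Lambda = diag(b, a)."""
    D_k = Matrix([[b**k, 0], [0, a**k]])
    return S * D_k * S.inv()

# Compare with repeated multiplication for small integer powers
agrees = all((C**k - C_power(k)).expand() == zeros(2, 2)
             for k in range(1, 6))
print(agrees)

# The requested special case a = b = -1, k = 100
print(C_power(100).subs({a: -1, b: -1}))
```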

2015/03/29, 7:41 PMII_10_Differential_equations_Exponential_of_a_matrix


This notebook is part of lecture 23, Differential equations and exponent A, in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper





In [2]: from sympy import init_printing, Matrix, symbols, exp
        from warnings import filterwarnings


In [4]: u, u1, u2, t, a, b, c = symbols('u u1 u2 t a b c')

Differential equations
Exponential $e^{At}$ of a matrix

Differential equations (ordinary only)

A differential equation moves on from the previous lecture's difference equation, which had finite steps, to continuously changing systems (here in the time parameter t)
A differential equation includes a function and its derivative(s)
It has an order based on the highest derivative that appears
Here we are only concerned with differential equations with constant coefficients
The simplest differential equation is the following

$$\frac{dy}{dt} = ay(t)$$

It is simply solved in the following manner (which gives us some insight into the general solution for these equations)

$$\frac{dy}{dt} = ay$$
$$\frac{1}{y}dy = a\,dt$$
$$\int\frac{1}{y}dy = a\int dt$$
$$\ln|y| = a(t + c_1)$$
$$e^{\ln|y|} = e^{at + ac_1}$$
$$y = e^{at}e^{ac_1}$$
$$y = ce^{at}$$

We can solve for the constant(s) if we have values for the initial condition (usually t=0) for y(t) and all its derivatives (called initial value problems)
We can write a system of differential equations in matrix form

$$y_1' = 3y_1$$
$$y_2' = -2y_2$$
$$y_3' = 6y_3$$
$$\therefore \begin{bmatrix} y_1' \\ y_2' \\ y_3' \end{bmatrix} = \begin{bmatrix} 3 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 6 \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}$$








Rewriting the above, we consider the following differential equations in this lecture

$$\frac{d\underline{u}}{dt} = A\underline{u}$$

So suppose we have these two differential equations

$$\frac{du_1}{dt} = -u_1 + 2u_2$$
$$\frac{du_2}{dt} = u_1 - 2u_2$$

The initial conditions are given by the following

$$\underline{u}(0) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$

We can write it as Au

In [5]: A = Matrix([[-1, 2], [1, -2]])
        A

Out[5]: $\begin{bmatrix} -1 & 2 \\ 1 & -2 \end{bmatrix}$

In [6]: u_vect = Matrix([u1, u2]) # u is now a sympy mathematical symbol
        # Have to use another variable name, i.e. u_vect
        u_vect

Out[6]: $\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$

Multiplying out this Au brings you back to the two linear equations

In [7]: A * u_vect

Out[7]: $\begin{bmatrix} -u_1 + 2u_2 \\ u_1 - 2u_2 \end{bmatrix}$

Out[8]: {−3: 1, 0: 1}

In [9]: A.eigenvects() # The results give the two eigenvectors in the following format
        # (eigenvalue, no of eigenvectors, eigenvector)

Out[9]: $\left[\left(-3, 1, \left[\begin{bmatrix} -1 \\ 1 \end{bmatrix}\right]\right), \left(0, 1, \left[\begin{bmatrix} 2 \\ 1 \end{bmatrix}\right]\right)\right]$

In [10]: S, D = A.diagonalize() # For interest sake we get the matrix of eigenvectors and the diagonal matrix
         S, D

Out[10]: $\left(\begin{bmatrix} -1 & 2 \\ 1 & 1 \end{bmatrix}, \begin{bmatrix} -3 & 0 \\ 0 & 0 \end{bmatrix}\right)$

To complete the solution now, we note that there are two eigenvalues, which will give us the following
Two constants
Two exponentials of an eigenvalue times t
Two eigenvectors

It is written like this, with $\underline{x}_i$ denoting an eigenvector

$$\underline{u}(t) = c_1e^{\lambda_1t}\underline{x}_1 + c_2e^{\lambda_2t}\underline{x}_2$$

This makes our solution as follows

$$\underline{u}(t) = c_1e^{-3t}\begin{bmatrix} -1 \\ 1 \end{bmatrix} + c_2e^{(0)t}\begin{bmatrix} 2 \\ 1 \end{bmatrix}$$
$$\underline{u}(t) = c_1e^{-3t}\begin{bmatrix} -1 \\ 1 \end{bmatrix} + c_2\begin{bmatrix} 2 \\ 1 \end{bmatrix}$$

There is clearly a constant term and a term that approaches zero as t approaches infinity

Writing this as two separate equations we have the following

$$u_1(t) = -c_1e^{-3t} + 2c_2$$
$$u_2(t) = c_1e^{-3t} + c_2$$



Using the initial conditions we can solve for $c_i$

$$u_1(0) = -c_1e^{-3(0)} + 2c_2 = 1$$
$$u_2(0) = c_1e^{-3(0)} + c_2 = 0$$
$$-c_1 + 2c_2 = 1$$
$$c_1 + c_2 = 0$$

Let's use an augmented matrix and Gauss-Jordan elimination to solve for the two constants (or at least the python™ equivalent)

In [11]: C = Matrix([[-1, 2, 1], [1, 1, 0]])
         C.rref()

Out[11]: $\left(\begin{bmatrix} 1 & 0 & -\frac{1}{3} \\ 0 & 1 & \frac{1}{3} \end{bmatrix}, [0, 1]\right)$

Which gives us the final solution

$$u_1(t) = \frac{1}{3}e^{-3t} + \frac{2}{3}$$
$$u_2(t) = -\frac{1}{3}e^{-3t} + \frac{1}{3}$$

Just to remind ourselves about the previous lecture where we had difference equations and had something like the following

$$\underline{u}_k = c_1\lambda_1^k\underline{x}_1 + c_2\lambda_2^k\underline{x}_2$$
$$\underline{u}_{k+1} = A\underline{u}_k$$

Which is for finite steps, i.e. stepping by one

What can the eigenvalues tell us about these equations as t approaches ∞

If both eigenvalues (real parts) are negative, the solution u(t) approaches 0 (called stability)
If one eigenvalue (real part) is zero and the others (real parts) are less than zero, the equations reach a specific value (called a steady state)
If any eigenvalue (real part) is larger than zero, the equations approach ±∞

What can the matrix A tell us about the eigenvalues (and then what happens when t approaches ∞)

The trace is equal to the sum of the eigenvalues
The determinant is the product of the eigenvalues

If the trace is negative and the determinant positive, we will have stability (for a 2×2 matrix)

Using diagonalization

Consider the following derivation

$$\frac{d\underline{u}}{dt} = A\underline{u}$$
$$\because \underline{u} = S\underline{v}$$
$$S\frac{d\underline{v}}{dt} = AS\underline{v}$$
$$\frac{d\underline{v}}{dt} = S^{-1}AS\underline{v} = \Lambda\underline{v}$$
$$\underline{v}(t) = e^{\Lambda t}\underline{v}(0)$$
$$\underline{u}(t) = Se^{\Lambda t}S^{-1}\underline{u}(0)$$
$$e^{At} = Se^{\Lambda t}S^{-1}$$
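The last two lines of the derivation can be tested on the 2×2 system solved earlier in this notebook. This sketch builds u(t) from S, the exponentials of the eigenvalues and the initial condition, and then checks that it satisfies both the differential equation and the initial condition. The variable names are my own.

```python
from sympy import Matrix, symbols, exp, diag, simplify, zeros

t = symbols('t')
A = Matrix([[-1, 2], [1, -2]])
u0 = Matrix([1, 0])                      # initial condition u(0)

S, D = A.diagonalize()                   # A = S * D * S^{-1}
# e^{Lambda t}: exponentiate each diagonal entry times t
exp_Lambda_t = diag(*[exp(D[i, i] * t) for i in range(D.rows)])

# u(t) = S e^{Lambda t} S^{-1} u(0)
u = (S * exp_Lambda_t * S.inv() * u0).applyfunc(simplify)

# Residual of du/dt = A u, which should vanish identically
residual = (u.diff(t) - A * u).applyfunc(simplify)

print(u)
print(residual)
```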



Matrix exponential $e^{At}$

How do we calculate e to the power of a matrix?
Consider the Taylor series expansion

$$e^{At} = I + At + \frac{(At)^2}{2!} + \frac{(At)^3}{3!} + \dots + \frac{(At)^n}{n!} + \dots$$

This comes from the following

$$e^x = \sum_{n=0}^{\infty}\frac{x^n}{n!}$$

As the denominator increases the nth term approaches 0
Remember also this (geometric) series (just for fun)

$$\frac{1}{1 - x} = \sum_{n=0}^{\infty}x^n$$
$$(I - At)^{-1} = I + At + (At)^2 + \dots + (At)^n + \dots$$

This will blow up unless the eigenvalues of At are less than 1 in absolute value

Now, let's calculate $e^{At}$, remembering the following

$$A^k = S\Lambda^kS^{-1}$$

We thus have the following

$$e^{At} = \sum_{n=0}^{\infty}\frac{\left(S\Lambda S^{-1}\right)^nt^n}{n!}$$
$$e^{At} = Se^{\Lambda t}S^{-1}$$

This is with the assumption that A can be diagonalized (otherwise we will have to use the infinite series above)

$$e^{At} = I + At + \frac{(At)^2}{2!} + \frac{(At)^3}{3!} + \dots + \frac{(At)^n}{n!} + \dots$$

Remember that Λ is a diagonal matrix and therefore we would have the following

$$e^{\Lambda t} = \begin{bmatrix} e^{\lambda_1t} & 0 & \dots & 0 \\ 0 & e^{\lambda_2t} & \dots & 0 \\ \vdots & \vdots & \dots & \vdots \\ 0 & 0 & \dots & e^{\lambda_nt} \end{bmatrix}$$

The S and $S^{-1}$ matrices are fixed; it is therefore the $e^{\Lambda t}$ matrix that provides an approach to zero as t approaches ∞
This is achieved by every $\lambda_i$ having a real part less than zero

The powers of A go to zero if the absolute value of every $\lambda_i$ is less than 1

Let's consider this example

$$y'' + by' + ky = 0$$

We have to create a system of two first-order equations

$$\underline{u} = \begin{bmatrix} y' \\ y \end{bmatrix}$$
$$\underline{u}' = \begin{bmatrix} y'' \\ y' \end{bmatrix} = \begin{bmatrix} -b & -k \\ 1 & 0 \end{bmatrix}\begin{bmatrix} y' \\ y \end{bmatrix}$$

Example problems

Example problem 1



Find the general solution, the matrix A, and the first column of $e^{At}$ for the following third-order, ordinary, homogeneous differential equation with constant coefficients

$$\frac{d^3y}{dt^3} + 2\frac{d^2y}{dt^2} - \frac{dy}{dt} - 2y = 0$$

Solution

Since the differential equation is third order, we need to create three first-order differential equations

$$\therefore y''' = -2y'' + y' + 2y$$

$$\underline{u} = \begin{bmatrix} y'' \\ y' \\ y \end{bmatrix}, \quad \underline{u}' = \begin{bmatrix} y''' \\ y'' \\ y' \end{bmatrix}$$

$$\underline{u}' = A\underline{u} = \begin{bmatrix} -2 & 1 & 2 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} y'' \\ y' \\ y \end{bmatrix}$$

Notice that when we do the last matrix multiplication, we get exactly what we need

$$\underline{u}' = \begin{bmatrix} y''' \\ y'' \\ y' \end{bmatrix} = \begin{bmatrix} -2y'' + y' + 2y \\ y'' \\ y' \end{bmatrix}$$

We now have the matrix A and we can calculate $e^{At}$ if A is diagonalizable

In [12]: A = Matrix([[-2, 1, 2], [1, 0, 0], [0, 1, 0]])
         A

Out[12]: $$\begin{bmatrix} -2 & 1 & 2 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$$

In [13]: S, D = A.diagonalize()
         S, D

Out[13]: $$\left( \begin{bmatrix} 4 & 1 & 1 \\ -2 & -1 & 1 \\ 1 & 1 & 1 \end{bmatrix}, \; \begin{bmatrix} -2 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right)$$

We can calculate $e^{At}$ from the following

$$e^{At} = Se^{\Lambda t}S^{-1}$$

Remember that Λ is a diagonal matrix and therefore we would have the following

$$e^{\Lambda t} = \begin{bmatrix} e^{\lambda_1 t} & 0 & \cdots & 0 \\ 0 & e^{\lambda_2 t} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & e^{\lambda_n t} \end{bmatrix}$$

Out[14]: $$\{ -2 : 1, \; -1 : 1, \; 1 : 1 \}$$

This gives us the three eigenvalues from which we can create the diagonal matrix $e^{\Lambda t}$



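In [15]: e_Lamda_t = Matrix([[exp(-2 * t), 0, 0], [0, exp(-t), 0], [0, 0, exp(t)]])
         e_Lamda_t

Out[15]: $$\begin{bmatrix} e^{-2t} & 0 & 0 \\ 0 & e^{-t} & 0 \\ 0 & 0 & e^{t} \end{bmatrix}$$

We have to multiply the following three matrices (I'll do it in two steps)

In [16]: S, e_Lamda_t, S.inv()

Out[16]: $$\left( \begin{bmatrix} 4 & 1 & 1 \\ -2 & -1 & 1 \\ 1 & 1 & 1 \end{bmatrix}, \; \begin{bmatrix} e^{-2t} & 0 & 0 \\ 0 & e^{-t} & 0 \\ 0 & 0 & e^{t} \end{bmatrix}, \; \begin{bmatrix} \frac{1}{3} & 0 & -\frac{1}{3} \\ -\frac{1}{2} & -\frac{1}{2} & 1 \\ \frac{1}{6} & \frac{1}{2} & \frac{1}{3} \end{bmatrix} \right)$$

In [17]: first_part = S * e_Lamda_t
         first_part

Out[17]: $$\begin{bmatrix} 4e^{-2t} & e^{-t} & e^{t} \\ -2e^{-2t} & -e^{-t} & e^{t} \\ e^{-2t} & e^{-t} & e^{t} \end{bmatrix}$$

In [18]: first_part * S.inv()

Out[18]: $$\begin{bmatrix} \frac{e^{t}}{6} - \frac{e^{-t}}{2} + \frac{4}{3}e^{-2t} & \frac{e^{t}}{2} - \frac{e^{-t}}{2} & \frac{e^{t}}{3} + e^{-t} - \frac{4}{3}e^{-2t} \\ \frac{e^{t}}{6} + \frac{e^{-t}}{2} - \frac{2}{3}e^{-2t} & \frac{e^{t}}{2} + \frac{e^{-t}}{2} & \frac{e^{t}}{3} - e^{-t} + \frac{2}{3}e^{-2t} \\ \frac{e^{t}}{6} - \frac{e^{-t}}{2} + \frac{1}{3}e^{-2t} & \frac{e^{t}}{2} - \frac{e^{-t}}{2} & \frac{e^{t}}{3} + e^{-t} - \frac{1}{3}e^{-2t} \end{bmatrix}$$

The first column of $e^{At}$ is the first column of this matrix

We still need to write our general solution

$$\underline{u}(t) = c_1 e^{\lambda_1 t}\underline{x}_1 + c_2 e^{\lambda_2 t}\underline{x}_2 + c_3 e^{\lambda_3 t}\underline{x}_3$$

Out[19]: $$\left[ \left( -2, \; 1, \; \left[ \begin{bmatrix} 4 \\ -2 \\ 1 \end{bmatrix} \right] \right), \; \left( -1, \; 1, \; \left[ \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} \right] \right), \; \left( 1, \; 1, \; \left[ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right] \right) \right]$$

With these eigenvalues and eigenvectors we get

$$\underline{u}(t) = \begin{bmatrix} y'' \\ y' \\ y \end{bmatrix} = c_1 e^{-2t} \begin{bmatrix} 4 \\ -2 \\ 1 \end{bmatrix} + c_2 e^{-t} \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} + c_3 e^{t} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$

$$\therefore y(t) = c_1 e^{-2t} + c_2 e^{-t} + c_3 e^{t}$$

Example problem 2

Solve the following second-order ordinary differential equation

$$y'' - y' - 6y = 0$$

Solution

We need to create two first-order equations

$$\therefore y'' = y' + 6y$$

$$\because \underline{u}(t) = \begin{bmatrix} y' \\ y \end{bmatrix}, \quad \underline{u}'(t) = \begin{bmatrix} y'' \\ y' \end{bmatrix}$$

$$\therefore \underline{u}'(t) = \begin{bmatrix} 1 & 6 \\ 1 & 0 \end{bmatrix} \underline{u}(t)$$

In [20]: A = Matrix([[1, 6], [1, 0]])
         A

Out[20]: $$\begin{bmatrix} 1 & 6 \\ 1 & 0 \end{bmatrix}$$

Out[21]: $$\left[ \left( -2, \; 1, \; \left[ \begin{bmatrix} -2 \\ 1 \end{bmatrix} \right] \right), \; \left( 3, \; 1, \; \left[ \begin{bmatrix} 3 \\ 1 \end{bmatrix} \right] \right) \right]$$

The solution is thus as follows

$$y(t) = c_1 e^{-2t} + c_2 e^{3t}$$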
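Both general solutions can be checked by substituting them back into their differential equations. The snippet below is a small sketch of such a check with sympy; the variable names (`residual_1`, `residual_2`) are illustrative and not part of the original notebook.

```python
from sympy import symbols, exp, diff, simplify

t, c1, c2, c3 = symbols('t c1 c2 c3')

# Example problem 1: y''' + 2y'' - y' - 2y = 0
y1 = c1 * exp(-2 * t) + c2 * exp(-t) + c3 * exp(t)
residual_1 = diff(y1, t, 3) + 2 * diff(y1, t, 2) - diff(y1, t) - 2 * y1

# Example problem 2: y'' - y' - 6y = 0
y2 = c1 * exp(-2 * t) + c2 * exp(3 * t)
residual_2 = diff(y2, t, 2) - diff(y2, t) - 6 * y2

# Both residuals should simplify to zero for any choice of constants
print(simplify(residual_1), simplify(residual_2))
```

A zero residual for arbitrary constants confirms that each exponent is an eigenvalue of the corresponding matrix A.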


This notebook is part of lecture 24 Markov matrices and Fourier series in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper








Markov matrices and steady state
Projections and Fourier series

Markov matrix

Consider the following Markov matrix (transition matrix)

In [5]: A = Matrix([[0.1, 0.01, 0.3], [0.2, 0.99, 0.3], [0.7, 0, 0.4]])
        A

Out[5]: $$\begin{bmatrix} 0.1 & 0.01 & 0.3 \\ 0.2 & 0.99 & 0.3 \\ 0.7 & 0 & 0.4 \end{bmatrix}$$

All the column entries add to 1 (also true for powers of the matrix)
All entries ≥ 0 (also true for powers of the matrix)
There will be an eigenvalue of 1
All other eigenvalues will have an absolute value of at most 1
For difference equations we will have the following

$$u_k = A^k u_0 = c_1\lambda_1^k x_1 + c_2\lambda_2^k x_2 + \ldots$$

Thus for the eigenvalues with absolute value less than 1, successive powers will lead to those terms approaching zero and a steady state being reached by the term with the eigenvalue of 1

Out[6]: $$\left\{ 1 : 1, \; \frac{49}{200} + \frac{3\sqrt{1009}}{200} : 1, \; -\frac{3\sqrt{1009}}{200} + \frac{49}{200} : 1 \right\}$$

All the components of the eigenvector of the eigenvalue 1 are positive




Out[7]: $$\left[ \left( 1.0, \; 1, \; \left[ \begin{bmatrix} 0.857142857142857 \\ 47.1428571428571 \\ 1.0 \end{bmatrix} \right] \right), \; \left( 0.721471405228058, \; 1, \; \left[ \begin{bmatrix} 0.459244864611511 \\ -1.45924486461151 \\ 1.0 \end{bmatrix} \right] \right), \; \left( -0.231471405228058, \; 1, \; \left[ \begin{bmatrix} -0.902102007468654 \\ -0.0978979925313461 \\ 1.0 \end{bmatrix} \right] \right) \right]$$

If the matrix A − 1I (that is, A − λI with λ = 1) is singular, then 1 is an eigenvalue

In [8]: A - eye(3) # A minus the identity matrix

Out[8]: $$\begin{bmatrix} -0.9 & 0.01 & 0.3 \\ 0.2 & -0.01 & 0.3 \\ 0.7 & 0 & -0.6 \end{bmatrix}$$

In [9]: (A - eye(3)).det() # Floating-point round-off; this is effectively 0

Out[9]: $$-3.03576608295941 \cdot 10^{-18}$$

The sum of the entries in each column is now zero
We would like a proof involving this (sum of column entries equal zero) as an assumption to give a singular matrix, without having to calculate the determinant
Because every column sums to zero, adding the three rows gives the zero row; the rows are therefore linearly dependent, which proves that the matrix is singular
Look also at the nullspace of the transpose: (1, 1, 1) is in the nullspace of $(A - I)^T$
(With the decimal entries used here the matrix is not recognised as exactly singular, which is why Out[10] is empty and Out[13] is the identity)

In [10]: A_1t = A - eye(3)
         (A_1t.transpose()).nullspace()

Out[10]: []

In [11]: (A.transpose()).nullspace()

Out[11]: []

Out[12]: []

In [13]: (A - eye(3)).rref()

Out[13]: $$\left( \begin{bmatrix} 1.0 & 0 & 0 \\ 0 & 1.0 & 0 \\ 0 & 0 & 1.0 \end{bmatrix}, \; [0, 1, 2] \right)$$

Note how the eigenvalues of A and $A^T$ are the same

In [14]: A.eigenvals() == A.transpose().eigenvals()

Out[14]: True

The proof of this lies in the fact that we calculate the eigenvalue(s) by the following equation

$$\left| A - \lambda I \right| = 0$$

The eigenvalue(s) of the transpose must also solve this equation

$$\left| A^T - \lambda I \right| = 0$$

Since $(\lambda I)^T = \lambda I$, the second matrix is the transpose of the first; a matrix and its transpose have the same determinant, so both equations have the same solutions and the eigenvalue(s) must be equal

Example of a Markov matrix

Consider the population of two (isolated) states (c and m)
Over time there is movement between the two (no loss or entry from outside the system)
We consider the following system of movement (difference equation)

$$u_{k+1} = Au_k$$

We can create the following matrix equation

$$\begin{bmatrix} u_c \\ u_m \end{bmatrix}_{k+1} = \begin{bmatrix} 0.9 & 0.2 \\ 0.1 & 0.8 \end{bmatrix} \begin{bmatrix} u_c \\ u_m \end{bmatrix}_k$$



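In [15]: A = Matrix([[0.9, 0.2], [0.1, 0.8]])
         A

Out[15]: $$\begin{bmatrix} 0.9 & 0.2 \\ 0.1 & 0.8 \end{bmatrix}$$

Let's start at time t=0 (or in our case k=0), with the whole population in state m
Because the system is closed, the total population stays at 1000

In [16]: u0 = Matrix([0, 1000])

After one time period, $u_{k+1}$ will be given by the following

In [17]: u1 = A * u0
         u1

Out[17]: $$\begin{bmatrix} 200.0 \\ 800.0 \end{bmatrix}$$

In [18]: u2 = A * u1
         u2

Out[18]: $$\begin{bmatrix} 340.0 \\ 660.0 \end{bmatrix}$$

Where will we end up?
Let's look at the eigenvalues
One will be 1 and the other must be the trace minus 1, which is 0.7

Out[19]: $$\left\{ \frac{7}{10} : 1, \; 1 : 1 \right\}$$

Out[20]: $$\left[ \left( 0.7, \; 1, \; \left[ \begin{bmatrix} -1.0 \\ 1.0 \end{bmatrix} \right] \right), \; \left( 1.0, \; 1, \; \left[ \begin{bmatrix} 2.0 \\ 1.0 \end{bmatrix} \right] \right) \right]$$

We can look at what happens after a hundred steps using a small while loop

In [21]: u = Matrix([0, 1000])
         i = 0

         while i < 101:
             u = A * u
             i += 1
         u

Out[21]: $$\begin{bmatrix} 666.666666666668 \\ 333.333333333334 \end{bmatrix}$$

It seems that we are at about ⅔ vs ⅓
It looks suspiciously like the eigenvector of the eigenvalue 1

Remember the equation for the difference equation we used above?

$$u_k = A^k u_0 = c_1\lambda_1^k x_1 + c_2\lambda_2^k x_2 + \ldots$$

We only have two terms, so this will become somewhat shortened

$$u_k = A^k u_0 = c_1\lambda_1^k x_1 + c_2\lambda_2^k x_2$$

$$u_k = c_1 1^k \begin{bmatrix} 2 \\ 1 \end{bmatrix} + c_2 \left( \frac{7}{10} \right)^k \begin{bmatrix} -1 \\ 1 \end{bmatrix}$$

$$u_0 = c_1 1^0 \begin{bmatrix} 2 \\ 1 \end{bmatrix} + c_2 \left( \frac{7}{10} \right)^0 \begin{bmatrix} -1 \\ 1 \end{bmatrix}$$

$$u_0 = c_1 \begin{bmatrix} 2 \\ 1 \end{bmatrix} + c_2 \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1000 \end{bmatrix}$$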

In [22]: result = Matrix([[2, -1, 0], [1, 1, 1000]])
         result.rref()

Out[22]: $$\left( \begin{bmatrix} 1 & 0 & \frac{1000}{3} \\ 0 & 1 & \frac{2000}{3} \end{bmatrix}, \; [0, 1] \right)$$

So we have $c_1$ = 1000÷3 and $c_2$ = 2000÷3

This results in the final solution

$$u_k = \frac{1000}{3} 1^k \begin{bmatrix} 2 \\ 1 \end{bmatrix} + \frac{2000}{3} \left( \frac{7}{10} \right)^k \begin{bmatrix} -1 \\ 1 \end{bmatrix}$$

Now we can work out the solution for any time step k and even see what happens as time approaches infinity
We note that the second expression disappears as k approaches infinity, which leaves the steady state

Projections and Fourier series

Consider projections with orthonormal basis $q_1, q_2, \ldots, q_n$
Any vector v can be expressed (expanded) as a combination of this basis

$$v = x_1 q_1 + x_2 q_2 + \cdots + x_n q_n$$

Because the basis vectors $q_i$ were chosen to be orthonormal, taking the dot product with $q_1$ on both sides will cancel out all the $q_{i \neq 1}$ terms

$$q_1^T v = x_1 q_1^T q_1 + 0 + 0 + \ldots$$

$$q_1^T v = x_1$$

We can see this clearly in matrix form below

$$\begin{bmatrix} \vdots & & \vdots \\ q_1 & \cdots & q_n \\ \vdots & & \vdots \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = v$$

$$Qx = v$$

$$x = Q^{-1}v$$

$$\because Q^{-1} = Q^T$$

$$x = Q^T v$$

The first component of x, $x_1$, is the first row of $Q^T$ times v
The first row of $Q^T$ is just $q_1^T$, so we end up exactly as we had above

$$q_1^T v = x_1$$

The key here was to choose orthonormal basis vectors
Now for how this relates to Fourier series
We might want to write a function as an infinite series of expressions

$$f(x) = a_0 + a_1\cos x + b_1\sin x + a_2\cos 2x + b_2\sin 2x + \ldots$$

The idea here is that there is still something orthogonal in each of these expressions (sine and cosine)
Joseph Fourier realized that he could work in function space
The vectors are now functions in an infinitely large space
The basis vectors are 1, cos(x), sin(x), cos(2x), sin(2x), ...



What is the inner (dot) product of functions that makes them orthogonal?
For vectors it was the following

$$v^T w = v_1 w_1 + v_2 w_2 + \cdots + v_n w_n$$

For functions f and g the analogue is to multiply them and the analogue of all the addition is integration

$$f^T g = \int f(x)\,g(x)\,dx$$

Now we need to know the lower and upper limit of integration
We note that the sine and cosine functions are periodic and repeat every 2π

$$\because f(x) = f(x + 2\pi)$$

$$f^T g = \int_0^{2\pi} f(x)\,g(x)\,dx$$

Just like the inner product of pairs gave zero because of orthogonality we have the same here

$$\int_0^{2\pi} \sin x \cos x\,dx$$

$$u(x) = \sin x, \quad \frac{du}{dx} = \cos x, \quad \cos x\,dx = du$$

$$u_2 = u(2\pi) = \sin 2\pi = 0, \quad u_1 = u(0) = \sin 0 = 0$$

$$\int_0^0 u\,du = 0$$

With some trigonometric identities we can show the same for all the other pairs

A Fourier series is then an expansion of a function on this orthogonal basis

How do we get the coefficients then?
Same as above, i.e. taking the inner product of both sides with cos(x)

$$\int_0^{2\pi} f(x)\cos x\,dx = a_1 \int_0^{2\pi} (\cos x)^2\,dx + 0 + 0 + \cdots = a_1\pi$$

$$a_1 = \frac{1}{\pi} \int_0^{2\pi} f(x)\cos x\,dx$$

Example problem (Markov matrices)

Example problem 1

A particle jumps between positions A and B with the following probabilities
A to A (stays in A) probability is 0.6
A to B probability is 0.4 (so all probabilities from A add to 1.0)
B to B probability is 0.8
B to A probability is 0.2

If the particle starts in position A, what is the probability that it will be at A and at B after the first step, after n steps, and after ∞ steps?

Solution

In [23]: x_vect = Matrix([1, 0])
         A = Matrix([[0.6, 0.2], [0.4, 0.8]]) # Look carefully at what goes where
         A, x_vect # Displaying the two matrices

Out[23]: $$\left( \begin{bmatrix} 0.6 & 0.2 \\ 0.4 & 0.8 \end{bmatrix}, \; \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right)$$

After a single step we will have the following

In [24]: A * x_vect

Out[24]: $$\begin{bmatrix} 0.6 \\ 0.4 \end{bmatrix}$$

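A probability of 0.6 of being in position A and a probability of 0.4 of being in position B

Now look at the following trend

$$p_1 = Ap_0$$

$$p_2 = Ap_1 = A(Ap_0) = A^2 p_0$$

$$p_3 = A^3 p_0$$

$$\vdots$$

$$p_n = A^n p_0$$

To take the power of a matrix in python is easy, but let's remind ourselves that we need to work with eigenvalues and eigenvectors
Because all the column entries add to 1, we are dealing with a Markov matrix
One of the eigenvalues will be 1, the trace is 1.4, therefore the other eigenvalue is 0.4
Remember diagonalization?

In [25]: S, D = A.diagonalize()
         S, D

Out[25]: $$\left( \begin{bmatrix} -1.0 & 1.0 \\ 1.0 & 2.0 \end{bmatrix}, \; \begin{bmatrix} 0.4 & 0 \\ 0 & 1.0 \end{bmatrix} \right)$$

Note the matrix of eigenvalues (diagonal) D
Note the matrix of eigenvectors S
So for the eigenvalue of 1 we have the following eigenvector (steady state)

$$\begin{bmatrix} 1 \\ 2 \end{bmatrix}$$

For the other eigenvalue we have the following eigenvector (decay)

$$\begin{bmatrix} -1 \\ 1 \end{bmatrix}$$

$$u_k = A^k u_0 = c_1\lambda_1^k x_1 + c_2\lambda_2^k x_2$$

$$u_k = c_1 1^k \begin{bmatrix} 1 \\ 2 \end{bmatrix} + c_2 \left( \frac{4}{10} \right)^k \begin{bmatrix} -1 \\ 1 \end{bmatrix}$$

$$u_0 = c_1 1^0 \begin{bmatrix} 1 \\ 2 \end{bmatrix} + c_2 \left( \frac{4}{10} \right)^0 \begin{bmatrix} -1 \\ 1 \end{bmatrix}$$

$$u_0 = c_1 \begin{bmatrix} 1 \\ 2 \end{bmatrix} + c_2 \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$

Solving for the constants we can create an augmented matrix

In [26]: u = Matrix([[1, -1, 1], [2, 1, 0]])
         u.rref()

Out[26]: $$\left( \begin{bmatrix} 1 & 0 & \frac{1}{3} \\ 0 & 1 & -\frac{2}{3} \end{bmatrix}, \; [0, 1] \right)$$

This gives us the following solution

$$u_n = \frac{1}{3} 1^n \begin{bmatrix} 1 \\ 2 \end{bmatrix} - \frac{2}{3} \left( \frac{4}{10} \right)^n \begin{bmatrix} -1 \\ 1 \end{bmatrix}$$

$$p(A)_n = \frac{1}{3} - \frac{2}{3} \left( \frac{4}{10} \right)^n (-1)$$

$$p(B)_n = \frac{1}{3}(2) - \frac{2}{3} \left( \frac{4}{10} \right)^n (1)$$

So at infinity it will be at position A with a probability of ⅓ and at position B with a probability of ⅔

In [27]: (A ** 1000000) * x_vect # Just to show how easy it is to take the power of A in python

Out[27]: $$\begin{bmatrix} 0.333333333362792 \\ 0.666666666725585 \end{bmatrix}$$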
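The closed-form expressions for $p(A)_n$ and $p(B)_n$ can be compared with direct matrix powers. The following is a minimal numerical sketch using numpy rather than sympy; the function names are illustrative only.

```python
import numpy as np

A = np.array([[0.6, 0.2], [0.4, 0.8]])  # columns add to 1 (Markov matrix)
p0 = np.array([1.0, 0.0])               # the particle starts in position A


def p_closed_form(n):
    """Probabilities from the eigenvalue solution: steady state plus decay."""
    decay = (2 / 3) * 0.4 ** n
    return np.array([1 / 3 + decay, 2 / 3 - decay])


def p_matrix_power(n):
    """Probabilities from p_n = A^n p_0."""
    return np.linalg.matrix_power(A, n) @ p0


for n in (1, 2, 10, 50):
    print(n, p_matrix_power(n), p_closed_form(n))
```

The decay term shrinks by a factor of 0.4 per step, so the two probabilities settle at ⅓ and ⅔ very quickly.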


This notebook is part of lecture 25 Symmetric matrices and positive definiteness in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper






In [3]: from sympy import init_printing, Matrix, symbols, sqrt
        from warnings import filterwarnings

Symmetric matrices
Positive definite matrices

Symmetric matrices

Symmetric matrices are square with the following property

$$A = A^T$$

We are concerned with the eigenvalues and eigenvectors of symmetric matrices
The eigenvalues are real
The eigenvectors are orthogonal, or at least, can be chosen orthogonal



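Considering proof of the real nature of eigenvalues we have the following
Any matrix equation of the following form can be changed to its complex conjugate form by changing each element into its complex conjugate (here marked with a bar over the top); for a real matrix A the conjugate of A is A itself

$$A\underline{x} = \lambda\underline{x}$$

$$A\overline{\underline{x}} = \overline{\lambda}\,\overline{\underline{x}}$$

We can multiply the first equation on the left by the complex conjugate transpose of x

$$\overline{\underline{x}}^T A\underline{x} = \lambda\overline{\underline{x}}^T\underline{x} \quad \ldots (1)$$

Transposing the complex conjugate form gives the following

$$\overline{\underline{x}}^T A^T = \overline{\underline{x}}^T\overline{\lambda}$$

Now if A is symmetric we use the fact that $A = A^T$ (and multiply on the right by x)

$$\overline{\underline{x}}^T A\underline{x} = \overline{\underline{x}}^T\overline{\lambda}\,\underline{x} \quad \ldots (2)$$

Note how the left-hand sides of (1) and (2) are equal; the right-hand sides must then also be equal and we therefore have the following

$$\lambda\overline{\underline{x}}^T\underline{x} = \overline{\lambda}\,\overline{\underline{x}}^T\underline{x}$$

$$\lambda = \overline{\lambda}$$

The only way that this is possible is if the imaginary part is zero, so only real eigenvalues are possible
Note also what happens if the complex conjugate of the vector x is multiplied by the vector itself

Remember that $\overline{x}^T x$ is a form of the dot product (which is the length squared)
Any number times its complex conjugate gets rid of the imaginary part, so this length squared is positive for a non-zero x and we may divide by it

Consider the following symmetric matrix A

In [5]: A = Matrix([[5, 2], [2, 3]])
        A

Out[5]: $$\begin{bmatrix} 5 & 2 \\ 2 & 3 \end{bmatrix}$$

Let's see if it really is symmetric by making sure that it is equal to its transpose

In [6]: A == A.transpose() # Boolean (true or false) statement

Out[6]: True

S, the matrix containing the eigenvectors as its columns
Remember that these eigenvectors are not necessarily the same as those you would get doing these problems by hand
When substituting the values for $\lambda_i$ a singular matrix is created with rows that are simply linear combinations of each other
You are free to choose values for the components of the eigenvectors for each eigenvalue (usually choosing the simplest ones)

In [8]: S

Out[8]: $$\begin{bmatrix} -\frac{2}{1 + \sqrt{5}} & -\frac{2}{-\sqrt{5} + 1} \\ 1 & 1 \end{bmatrix}$$

D, the matrix containing the values of the eigenvalues down the main diagonal

In [9]: D

Out[9]: $$\begin{bmatrix} -\sqrt{5} + 4 & 0 \\ 0 & \sqrt{5} + 4 \end{bmatrix}$$

In decomposition, a symmetric matrix results in the following (with the eigenvectors in S normalized)

$$A = S\Lambda S^T$$

In this case we have an orthogonal matrix times diagonal matrix times transpose of orthogonal matrix

$$A = Q\Lambda Q^T$$

Out[10]: $$\left\{ -\sqrt{5} + 4 : 1, \; \sqrt{5} + 4 : 1 \right\}$$

Out[11]: $$\left[ \left( -\sqrt{5} + 4, \; 1, \; \left[ \begin{bmatrix} -\frac{2}{1 + \sqrt{5}} \\ 1 \end{bmatrix} \right] \right), \; \left( \sqrt{5} + 4, \; 1, \; \left[ \begin{bmatrix} -\frac{2}{-\sqrt{5} + 1} \\ 1 \end{bmatrix} \right] \right) \right]$$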



We've seen in our example that, indeed, the eigenvalues are real
Let's see if the eigenvectors are orthogonal by looking at their dot product

In [12]: eigenvec_1 = Matrix([-2 / (1 + sqrt(5)), 1])
         eigenvec_2 = Matrix([-2 / (1 - sqrt(5)), 1])
         eigenvec_1.dot(eigenvec_2)

Out[12]: $$1 + \frac{4}{\left( 1 + \sqrt{5} \right)\left( -\sqrt{5} + 1 \right)}$$

This is certainly zero when simplified

In [13]: (eigenvec_1.dot(eigenvec_2)).simplify() # Using the simplify() method

Out[13]: 0

We need not use symbolic computing (computer algebra system, CAS)
Let's look at numerical evaluation using numerical python (numpy)

In [14]: import numpy as np # Using namespace abbreviations

In [15]: A = np.matrix([[5, 2], [2, 3]])
         A

Out[15]: matrix([[5, 2], [2, 3]])

In [16]: w, v = np.linalg.eig(A) # Calculating the eigenvalues and eigenvectors
         # The result of np.linalg.eig() is a tuple, the first being the eigenvalues
         # The second being the eigenvectors

In [17]: w

Out[17]: array([ 6.23606798, 1.76393202])

In [18]: v

Out[18]: matrix([[ 0.85065081, -0.52573111], [ 0.52573111, 0.85065081]])

In [19]: # Creating the diagonal matrix manually from the eigenvalues
         D = np.matrix([[6.23606798, 0], [0, 1.76393202]])
         D

Out[19]: matrix([[ 6.23606798, 0. ], [ 0. , 1.76393202]])

In [20]: # Checking to see if our equation for A holds
         v * D * np.matrix.transpose(v)

Out[20]: matrix([[ 5., 2.], [ 2., 3.]])

Positive definite matrices (referring to symmetric matrices)

The properties of positive definite (symmetric) matrices
All eigenvalues are positive
All pivots are positive
The determinant (and actually also all the leading sub-determinants) is positive

A positive definite (square symmetric) matrix A is invertible, which follows from the above
The determinant is the product of the eigenvalues
All the eigenvalues are positive, so the determinant must be larger than zero (and in particular non-zero)
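Two of the tests listed above (positive eigenvalues, positive leading sub-determinants) are easy to sketch numerically. The helper below is illustrative only: `is_positive_definite` is a hypothetical name, it assumes a real symmetric input, and comparing floating-point values with zero is fragile for matrices that are nearly singular.

```python
import numpy as np


def is_positive_definite(A):
    """Return (eigenvalue test, leading sub-determinant test) for a symmetric A."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    # Test 1: all eigenvalues positive (eigvalsh assumes a symmetric matrix)
    eigenvalues_positive = bool(np.all(np.linalg.eigvalsh(A) > 0))
    # Test 2: all leading sub-determinants (upper-left k x k blocks) positive
    minors_positive = all(np.linalg.det(A[:k, :k]) > 0 for k in range(1, n + 1))
    return eigenvalues_positive, minors_positive


print(is_positive_definite([[5, 2], [2, 3]]))    # the symmetric example above
print(is_positive_definite([[3, 1], [1, -1]]))   # symmetric, but not positive definite
```

For a symmetric matrix the two tests always agree, so either one is enough in practice.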



For projection matrices
The eigenvalues are either 0 or 1
If this projection matrix is positive definite
The eigenvalues must all be 1 (since they must be larger than zero)
The only matrix that satisfies this property is the identity matrix

The diagonal matrix D is positive definite
This means that for any non-zero vector x we have $x^T D x > 0$
Let's look at a 3-component vector with a 3×3 matrix D

In [21]: d1, d2, d3, x1, x2, x3 = symbols('d1 d2 d3 x1 x2 x3')

In [22]: D = Matrix([[d1, 0, 0], [0, d2, 0], [0, 0, d3]])
         x_vect = Matrix([x1, x2, x3])
         x_vect.transpose(), D, x_vect

Out[22]: $$\left( \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}, \; \begin{bmatrix} d_1 & 0 & 0 \\ 0 & d_2 & 0 \\ 0 & 0 & d_3 \end{bmatrix}, \; \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \right)$$

Indeed we have $x^T D x > 0$ since the components of x are squared and the eigenvalues are all positive

In [23]: x_vect.transpose() * D * x_vect

Out[23]: $$\begin{bmatrix} d_1 x_1^2 + d_2 x_2^2 + d_3 x_3^2 \end{bmatrix}$$

Not all symmetric matrices with a positive determinant are positive definite
Easy matrices to construct with this property have negative values on the main diagonal
Note below how the eigenvalues are not all more than zero
Also note how $x^T D x \ngtr 0$
It is important to note that the sub-determinant must also be positive

In the example below the sub-determinant of 3 is -1

In [24]: A = Matrix([[3, 1], [1, -1]])
         A

Out[24]: $$\begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix}$$

In [25]: A == A.transpose()

Out[25]: True

In [26]: A.det()

Out[26]: −4

Out[27]: $$\left\{ 1 + \sqrt{5} : 1, \; -\sqrt{5} + 1 : 1 \right\}$$

Out[28]: $$\left[ \left( 1 + \sqrt{5}, \; 1, \; \left[ \begin{bmatrix} -\frac{1}{-\sqrt{5} + 2} \\ 1 \end{bmatrix} \right] \right), \; \left( -\sqrt{5} + 1, \; 1, \; \left[ \begin{bmatrix} -\frac{1}{2 + \sqrt{5}} \\ 1 \end{bmatrix} \right] \right) \right]$$

In [30]: S

Out[30]: $$\begin{bmatrix} -\frac{1}{-\sqrt{5} + 2} & -\frac{1}{2 + \sqrt{5}} \\ 1 & 1 \end{bmatrix}$$

In [31]: D

Out[31]: $$\begin{bmatrix} 1 + \sqrt{5} & 0 \\ 0 & -\sqrt{5} + 1 \end{bmatrix}$$



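In [32]: x_vect = Matrix([x1, x2])
         x_vect

Out[32]: $$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

Out[33]: $$\begin{bmatrix} x_1^2 \left( 1 + \sqrt{5} \right) + x_2^2 \left( -\sqrt{5} + 1 \right) \end{bmatrix}$$

In this example the sub-determinant of 1 is -3

In [34]: A = Matrix([[-3, 1], [1, 1]])
         A

Out[34]: $$\begin{bmatrix} -3 & 1 \\ 1 & 1 \end{bmatrix}$$

In [35]: A == A.transpose()

Out[35]: True

Out[37]: $$\begin{bmatrix} x_1^2 \left( -1 + \sqrt{5} \right) + x_2^2 \left( -\sqrt{5} - 1 \right) \end{bmatrix}$$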


This notebook is part of lecture 26 Complex matrices and the fast Fourier transform in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper





In [2]: from sympy import init_printing, Matrix, symbols, I, sqrt, Rational
        from IPython.display import Image
        from warnings import filterwarnings

Complex vectors, matrices
Fast Fourier transform

Complex vectors

Consider the following vector with complex entries (from this point on I will not use the underscore to indicate a vector, so as not to create confusion with the bar, which denotes the complex conjugate; vectors are instead inferred from context)

$$z = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix}$$

The length (actually length squared) of this vector calculated as $z^T z$ is no good, since it should be positive

Instead we consider the following

$$\left| z \right|^2 = \overline{z}^T z$$

$$\therefore \overline{z}^T z = \begin{bmatrix} \overline{z}_1, \overline{z}_2, \ldots, \overline{z}_n \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix}$$

In [4]: z = Matrix([1, I]) # I is the sympy symbol for the imaginary number i
        z

Out[4]: $$\begin{bmatrix} 1 \\ i \end{bmatrix}$$

Let's calculate this manually

In [5]: z.norm() # The length of a vector

Out[5]: $$\sqrt{2}$$

In [6]: z_cc = Matrix([1, -I])
        z_cc

Out[6]: $$\begin{bmatrix} 1 \\ -i \end{bmatrix}$$

In [7]: sqrt(z_cc.transpose() * z)

Out[7]: $$\left( \begin{bmatrix} 2 \end{bmatrix} \right)^{\frac{1}{2}}$$

Taking the transpose of the complex conjugate is called the Hermitian

$$z^H z$$

We can use the Hermitian for non-complex (or mixed complex) vectors x and y too; the inner product $y^T x$ then becomes

$$y^H x$$

In [8]: from sympy.physics.quantum.dagger import Dagger # A fun way to quickly get the Hermitian

In [9]: Dagger(z)

Out[9]: $$\begin{bmatrix} 1 & -i \end{bmatrix}$$

In [10]: sqrt(Dagger(z) * z)

Out[10]: $$\left( \begin{bmatrix} 2 \end{bmatrix} \right)^{\frac{1}{2}}$$

Complex symmetric matrices

The transpose

If the symmetric matrix has complex entries then $A^T = A$ is no good

In [11]: A = Matrix([[2, 3 + I], [3 - I, 5]])
         A # A Hermitian matrix

Out[11]: $$\begin{bmatrix} 2 & 3 + i \\ 3 - i & 5 \end{bmatrix}$$

In [12]: A.transpose() == A

Out[12]: False

In [13]: Dagger(A)

Out[13]: $$\begin{bmatrix} 2 & 3 + i \\ 3 - i & 5 \end{bmatrix}$$

In [14]: Dagger(A) == A

Out[14]: True

This will work for real-valued symmetric matrices as well

In [15]: A = Matrix([[3, 4], [4, 2]])
         A

Out[15]: $$\begin{bmatrix} 3 & 4 \\ 4 & 2 \end{bmatrix}$$

In [16]: A.transpose() == A

Out[16]: True

In [17]: Dagger(A) == A

Out[17]: True



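The eigenvalues and eigenvectors

Back to the complex matrix A

In [18]: A = Matrix([[2, 3 + I], [3 - I, 5]])
         A

Out[18]: $$\begin{bmatrix} 2 & 3 + i \\ 3 - i & 5 \end{bmatrix}$$

$$A = \begin{bmatrix} 2 & 3 + i \\ 3 - i & 5 \end{bmatrix}$$

$$\left| A - \lambda I \right| = 0$$

$$\left| \begin{bmatrix} 2 & 3 + i \\ 3 - i & 5 \end{bmatrix} - \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} \right| = 0$$

$$\begin{vmatrix} 2 - \lambda & 3 + i \\ 3 - i & 5 - \lambda \end{vmatrix} = 0$$

$$(2 - \lambda)(5 - \lambda) - (3 + i)(3 - i) = 0$$

$$10 - 7\lambda + \lambda^2 - (9 + 1) = 0$$

$$\lambda^2 - 7\lambda = 0$$

$$\lambda_1 = 0, \quad \lambda_2 = 7$$

Out[19]: $$\{ 0 : 1, \; 7 : 1 \}$$

Out[20]: $$\left[ \left( 0, \; 1, \; \left[ \begin{bmatrix} -\frac{3}{2} - \frac{i}{2} \\ 1 \end{bmatrix} \right] \right), \; \left( 7, \; 1, \; \left[ \begin{bmatrix} \frac{3}{5} + \frac{i}{5} \\ 1 \end{bmatrix} \right] \right) \right]$$

In [22]: S

Out[22]: $$\begin{bmatrix} -3 - i & 3 + i \\ 2 & 5 \end{bmatrix}$$

In [23]: D

Out[23]: $$\begin{bmatrix} 0 & 0 \\ 0 & 7 \end{bmatrix}$$

What about $S^T$ now?
We have to use its transpose, but it is complex, so we have to take the Hermitian

In [24]: Dagger(S)

Out[24]: $$\begin{bmatrix} -3 + i & 2 \\ 3 - i & 5 \end{bmatrix}$$

In [25]: S == Dagger(S) # Don't get confused here, S is not symmetric

Out[25]: False

Remember that for a symmetric matrix the column vectors in S (usually called Q, the matrix of eigenvectors) are orthogonal, with $Q^T Q = I$
With complex entries we have to consider the Hermitian here, not just the simple transpose, i.e. $Q^H Q = I$
Here we call Q unitary

The fast Fourier transform

Look at this special matrix (where we start counting rows and columns at zero)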



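$$F_n = \begin{bmatrix} W^{(0)(0)} & W^{(0)(1)} & W^{(0)(2)} & \cdots & W^{(0)(n-1)} \\ W^{(1)(0)} & W^{(1)(1)} & W^{(1)(2)} & \cdots & W^{(1)(n-1)} \\ W^{(2)(0)} & W^{(2)(1)} & W^{(2)(2)} & \cdots & W^{(2)(n-1)} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ W^{(n-1)(0)} & W^{(n-1)(1)} & W^{(n-1)(2)} & \cdots & W^{(n-1)(n-1)} \end{bmatrix}$$

$$\left( F_n \right)_{ij} = W^{ij}; \quad i, j = 0, 1, 2, \ldots, n - 1$$

W is a special number whose n-th power equals 1

$$W^n = 1$$

$$W = e^{\frac{i2\pi}{n}} = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}$$

It is in the complex plane of course (as written in sin and cos above)

Remember that n here refers to the size of the matrix
Here it also refers to the n n-th roots of 1 (if that makes any sense, else look at the image below)

In [26]: Image(filename = 'W.png')

Out[26]: (image: W.png)

So for n=4 we will have the following

$$F_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & \left( e^{\frac{2\pi i}{4}} \right)^1 & \left( e^{\frac{2\pi i}{4}} \right)^2 & \left( e^{\frac{2\pi i}{4}} \right)^3 \\ 1 & \left( e^{\frac{2\pi i}{4}} \right)^2 & \left( e^{\frac{2\pi i}{4}} \right)^4 & \left( e^{\frac{2\pi i}{4}} \right)^6 \\ 1 & \left( e^{\frac{2\pi i}{4}} \right)^3 & \left( e^{\frac{2\pi i}{4}} \right)^6 & \left( e^{\frac{2\pi i}{4}} \right)^9 \end{bmatrix}$$

We note that a quarter of the way around is i

$$e^{\frac{2\pi i}{4}} = i$$

We thus have the following

$$F_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & i & i^2 & i^3 \\ 1 & i^2 & i^4 & i^6 \\ 1 & i^3 & i^6 & i^9 \end{bmatrix}$$

$$F_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & i & -1 & -i \\ 1 & -1 & 1 & -1 \\ 1 & -i & -1 & i \end{bmatrix}$$

Note how the columns are orthogonal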



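In [27]: F = Matrix([[1, 1, 1, 1], [1, I, -1, -I], [1, -1, 1, -1], [1, -I, -1, I]])
         F

Out[27]: $$\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & i & -1 & -i \\ 1 & -1 & 1 & -1 \\ 1 & -i & -1 & i \end{bmatrix}$$

In [28]: F.col(0) # Calling only the selected column (counting starts at 0)

Out[28]: $$\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}$$

The columns are supposed to be orthogonal, i.e. the inner (dot) product should be zero
Clearly below it is not

In [29]: F.col(1).dot(F.col(3))

Out[29]: 4

Remember, though, that this is a complex matrix and we have to use the Hermitian

In [30]: col1 = F.col(1)
         col3 = F.col(3)
         col1, col3

Out[30]: $$\left( \begin{bmatrix} 1 \\ i \\ -1 \\ -i \end{bmatrix}, \; \begin{bmatrix} 1 \\ -i \\ -1 \\ i \end{bmatrix} \right)$$

In [31]: Dagger(col3), col1

Out[31]: $$\left( \begin{bmatrix} 1 & i & -1 & -i \end{bmatrix}, \; \begin{bmatrix} 1 \\ i \\ -1 \\ -i \end{bmatrix} \right)$$

In [32]: Dagger(col3) * col1 # Another way to do the dot product

Out[32]: $$\begin{bmatrix} 0 \end{bmatrix}$$

So, these columns are all orthogonal, but they are not orthonormal
Note, though, that they are all of length 2, so we can normalize each

In [33]: Rational(1, 2) * F

Out[33]: $$\begin{bmatrix} \frac{1}{2} & \frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{i}{2} & -\frac{1}{2} & -\frac{i}{2} \\ \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} & -\frac{1}{2} \\ \frac{1}{2} & -\frac{i}{2} & -\frac{1}{2} & \frac{i}{2} \end{bmatrix}$$

We also note the following

$$F_n^H F_n = I$$

Just remember to normalize them

In [34]: Dagger(Rational(1, 2) * F)

Out[34]: $$\begin{bmatrix} \frac{1}{2} & \frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & -\frac{i}{2} & -\frac{1}{2} & \frac{i}{2} \\ \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{i}{2} & -\frac{1}{2} & -\frac{i}{2} \end{bmatrix}$$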



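In [35]: Dagger(Rational(1, 2) * F) * ((Rational(1, 2) * F))

Out[35]: $$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

Now why do we call it the fast Fourier transform?
Note the following

$$W_n = e^{\frac{2\pi i}{n}}$$

$$\left( W_n \right)^p = \left( e^{\frac{2\pi i}{n}} \right)^p$$

$$\left( W_{64} \right)^2 = \left( e^{\frac{2\pi i}{64}} \right)^2; \quad n = 64, \; p = 2$$

$$\therefore \left( W_{64} \right)^2 = W_{32}$$

Now we have the following connection between the two

$$\left[ F_{64} \right] = \begin{bmatrix} I & D \\ I & -D \end{bmatrix} \begin{bmatrix} F_{32} & 0 \\ 0 & F_{32} \end{bmatrix} \left[ P \right]$$

P is a permutation matrix

$$D = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & W & 0 & \cdots & 0 \\ 0 & 0 & W^2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & W^{31} \end{bmatrix}$$

Going down to 16 will include the following

$$\begin{bmatrix} I & D & 0 & 0 \\ I & -D & 0 & 0 \\ 0 & 0 & I & D \\ 0 & 0 & I & -D \end{bmatrix} \begin{bmatrix} F_{16} & 0 & 0 & 0 \\ 0 & F_{16} & 0 & 0 \\ 0 & 0 & F_{16} & 0 \\ 0 & 0 & 0 & F_{16} \end{bmatrix} \left[ P \right]$$

The recursive work above leads to decreasing the work that is required for working with these problems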


This notebook is part of lecture 27 Positive definite matrices and minima in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper





In [2]: from sympy import init_printing, Matrix, symbols, Derivative, diff
        from warnings import filterwarnings

In [4]: a, b, c, d, lamda, x1, x2, x3 = symbols('a b c d lamda x1 x2 x3')

Tests for positive definite matrices
Tests for minimum $x^T A x > 0$
Ellipsoids in $\mathbb{R}^n$

When is a symmetric matrix positive definite

Let's first consider the 2×2 matrix

$$A = \begin{bmatrix} a & b \\ b & c \end{bmatrix}$$

Tests for positive definiteness
$\lambda_1 > 0$, $\lambda_2 > 0$
$a > 0$, $ac - b^2 > 0$
The pivots larger than zero

$$a > 0; \quad \frac{ac - b^2}{a} > 0$$

$x^T A x > 0$

Let's look at some example matrices

In [5]: A = Matrix([[2, 6], [6, a]])
        A

Out[5]: $$\begin{bmatrix} 2 & 6 \\ 6 & a \end{bmatrix}$$

The first question is what value of a would make this symmetric matrix positive definite
The second would be, which of the tests above would you use

The second question first
It seems the determinant test would suffice
We need 2a-36>0

The first question is then answered
a must therefore be larger than 18

Let's play around by making a equal to 18

In [6]: A = Matrix([[2, 6], [6, 18]])
        A

Out[6]: $$\begin{bmatrix} 2 & 6 \\ 6 & 18 \end{bmatrix}$$

In [7]: A.charpoly(lamda)

Out[7]: $$\text{PurePoly}\left( \lambda^2 - 20\lambda, \; \lambda, \; domain = \mathbb{Z} \right)$$

Out[8]: $$\{ 0 : 1, \; 20 : 1 \}$$

One of the eigenvalues is zero (after all, it is a singular matrix now and one eigenvalue must be zero)
It is a 2×2 matrix and we must have two eigenvalues
The other eigenvalue must equal the trace of A, which is 20 (therefore there was no need to calculate the eigenvalues, we could just reason and read it off)
We'll call this matrix positive semi-definite

Notice that the pivot test would not have helped, since the second pivot is zero

$$\frac{(18)(2) - 6^2}{2} = 0$$

Let's look at $x^T A x > 0$ (where x is any correctly-sized vector)

In [9]: x_vect = Matrix([x1, x2])
        x_vect

Out[9]: $$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$$

In [10]: f = x_vect.transpose() * A * x_vect
         f

Out[10]: $$\begin{bmatrix} x_1 \left( 2x_1 + 6x_2 \right) + x_2 \left( 6x_1 + 18x_2 \right) \end{bmatrix}$$

In [11]: f.expand() # Expanding the expression shows it is now quadratic (not linear anymore)

Out[11]: $$\begin{bmatrix} 2x_1^2 + 12x_1 x_2 + 18x_2^2 \end{bmatrix}$$

For A to be positive definite, this quadratic must be positive for all non-zero vectors x

Below I use some 3D plotting
It is not too clear to see, but note that nowhere does the plot go below zero on the z-axis

In [12]: import scipy as sp
         import matplotlib.pyplot as plt
         from mpl_toolkits.mplot3d import Axes3D

         %matplotlib inline



In [13]: fig = plt.figure(figsize = (10, 8))
         ax = Axes3D(fig)
         x = sp.linspace(-2, 2, 100)
         y = sp.linspace(-2, 2, 100)
         [x, y] = sp.meshgrid(x, y)
         z = 2 * x ** 2 + 12 * x * y + 18 * y ** 2

         ax.plot_wireframe(x, y, z, rstride = 5, cstride = 5)
         plt.show();

We can construct a matrix with a smaller value for a (same form of matrix as above), which will clearly not be positive definite

In [14]: A = Matrix([[2, 6], [6, 7]])
         A

Out[14]: $$\begin{bmatrix} 2 & 6 \\ 6 & 7 \end{bmatrix}$$

In [15]: f = x_vect.transpose() * A * x_vect
         f.expand()

Out[15]: $$\begin{bmatrix} 2x_1^2 + 12x_1 x_2 + 7x_2^2 \end{bmatrix}$$

I've saved a separate rendition of this which is rotated so that you can see that we are dipping below z=0

In [17]: Image(filename ='figure_1.png', width = 800)

Out[17]: (image: figure_1.png)

Clearly now, for some values of $x_i$ the z value is less than zero

Now for an example which is clearly positive definite

In [18]: A = Matrix([[2, 6], [6, 26]])
         A

Out[18]: $$\begin{bmatrix} 2 & 6 \\ 6 & 26 \end{bmatrix}$$

In [19]: f = x_vect.transpose() * A * x_vect
         f.expand()

Out[19]: $$\begin{bmatrix} 2x_1^2 + 12x_1 x_2 + 26x_2^2 \end{bmatrix}$$

Minima

We use the following function from our symmetric matrix above

$$f(x_1, x_2) = 2x_1^2 + 12x_1 x_2 + 20x_2^2$$

Completing the square we have the following

$$f(x_1, x_2) = 2\left( x_1 + 3x_2 \right)^2 + 2x_2^2$$

From this we can see that we are dealing with all positive values irrespective of the values of the variables

In [21]: (2 * (x1 + 3 * x2) ** 2).expand() # Just checking if we are correct

Out[21]: $$2x_1^2 + 12x_1 x_2 + 18x_2^2$$

Setting the equation equal to a (positive) value will cut through the plot and result in an ellipse
Cutting through a saddle point results in a hyperbola

In [22]: # Rewriting the function above as a computer variable
         function = 2 * (x1 + 3 * x2) ** 2 + 2 * x2 ** 2
         function

Out[22]: $$2x_2^2 + 2\left( x_1 + 3x_2 \right)^2$$

In [23]: # Derivative(f, variable with respect to which the partial derivative is taken, order)
         Derivative(function, x1) # Printing the partial derivative to the screen

Out[23]: $$\frac{\partial}{\partial x_1}\left( 2x_2^2 + 2\left( x_1 + 3x_2 \right)^2 \right)$$

In [24]: Derivative(function, x1).doit() # The .doit() method execute the partial derivative

In [25]: diff(function, x2, 1) # Alternative method of doing the partial derivative

Solving for the two variables in two equations using an augmented matrix

In [26]: M = Matrix([[4, 12, 0], [12, 40, 0]])M.rref()

Let's look at this if we cut through the x z-plane (that is x = 0) and the x z-plane (that is x =0)

Let's look at the two plots

In [27]: x = sp.linspace(-2, 2, 100)

plt.figure(figsize = (10, 8))plt.plot(x, 2 * x **2)plt.show();

Out[24]: 4 + 12x1 x2

Out[25]: 12 + 40x1 x2

Out[26]: ( )[ ] ,10

01

00 [ ]0, 1

1 y 2 1f ( , ) = 2 + 2x1 x2 ( + 3 )x1 x2

2 x22

f ( , 0) = 2x1 ( )x12

f (0, ) = 2 + 2 = 8x2 (3 )x22 x2

2 x22



In [28]: y = sp.linspace(-2, 2, 100)

plt.figure(figsize = (10, 8))plt.plot(y, 8 * x **2)plt.show();

If you reconstruct this in your mind's eye, you can see that we are dealing with a bowl shapeRemember from calculus that an extrema is a first derivative set to zeroWe look at the second derivative to know if we are dealing with a minimum or a maximum

In [29]: diff(2 * x1 ** 2, x1, 2) # Taking a derivative twice

In [30]: diff(8 * x2 ** 2, x2, 2)

These second derivatives are both positive and we have a minimum

Let's go for a setup again which will not be positive definite

Completing the square will have the following equation

In [31]: function = 2 * (x1 + 3 * x2) ** 2 - 11 * x2 ** 2
function, function.expand() # Just checking to see if our completion of the square was correct

In [32]: Derivative(function, x1), diff(function, x1)

In [33]: Derivative(function, x2), diff(function, x2)

In [34]: M = Matrix([[4, 12, 0], [12, 14, 0]])
M.rref()

Again, an extremum (critical point) at (0,0)

Out[29]: 4

Out[30]: 40

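$$f(x_1, x_2) = 2x_1^2 + 12x_1x_2 + 7x_2^2$$

$$f(x_1, x_2) = 2(x_1 + 3x_2)^2 - 11x_2^2$$

Out[31]: $\left(-11x_2^2 + 2(x_1 + 3x_2)^2,\ 2x_1^2 + 12x_1x_2 + 7x_2^2\right)$

Out[32]: $\left(\frac{\partial}{\partial x_1}\left(-11x_2^2 + 2(x_1 + 3x_2)^2\right),\ 4x_1 + 12x_2\right)$

Out[33]: $\left(\frac{\partial}{\partial x_2}\left(-11x_2^2 + 2(x_1 + 3x_2)^2\right),\ 12x_1 + 14x_2\right)$

Out[34]: $\left(\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix},\ [0, 1]\right)$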



In your mind's eye we clearly have a saddle point at $(x_1, x_2) = (0, 0)$

Let's look at $x_2 = -x_1$

We are creating a 45° plane and values for z will be negative here

This is what makes the matrix non-positive definite
$x^T A x$ gives an equation from which we can tell whether the values are always positive, not always positive, or fall in the marginal case described above

So positive definite is the matrix equivalent of the first and second derivative in calculus (which looks at the shape of the plot, i.e extrema)

For a 2×2 we are thus looking for the following

The pivots, the multiplier and completing the square

Let's take the matrix below
We know it's symmetric and positive definite
We also saw that for $x^T A x$ we had to complete the square to show z > 0
This can easily be done by looking at the pivots and the multiplier

In [35]: A = Matrix([[2, 6], [6, 20]])
A

In [36]: L, U, _ = A.LUdecomposition()
L, U

Note how the pivots are 2 and 2 and the multiplier was 3
Now look at the completed square equation

In [37]: (x_vect.transpose() * A * x_vect).expand()

So, we wanted squares, but we are also interested in what goes on outside the squares, i.e. the pivots (2 and 2 in our example)
Positive pivots give a sum of squares; everything positive means there is a minimum (everything goes up)
We can extend this to any n×n symmetric matrix
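The link between elimination and completing the square can be checked symbolically. The following is a minimal sketch (the variable names `multiplier`, `pivots`, `energy` and `squares` are illustrative, not from the notebook): it rebuilds the sum of squares from the pivots and the multiplier and compares it with the expanded $x^T A x$.

```python
from sympy import Matrix, symbols, expand

x1, x2 = symbols('x1 x2')
A = Matrix([[2, 6], [6, 20]])

# Elimination: L holds the multiplier, U holds the pivots on its diagonal
L, U, _ = A.LUdecomposition()
multiplier = L[1, 0]
pivots = [U[0, 0], U[1, 1]]

# Energy x^T A x, and the same expression rebuilt from pivots and multiplier
x_vect = Matrix([x1, x2])
energy = expand((x_vect.T * A * x_vect)[0, 0])
squares = pivots[0] * (x1 + multiplier * x2) ** 2 + pivots[1] * x2 ** 2

print(multiplier, pivots)
print(expand(squares) == energy)
```

The pivots end up as the coefficients outside the squares and the multiplier ends up inside the first square, which is the pattern described above.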

Let's look again at the matrix of second derivatives we had above

$f_{xx}$ and $f_{yy}$ have to be positive (for a minimum) and their product has to be larger than the product of the other two, $f_{xy}$ and $f_{yx}$

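$$f(x_1, x_2) = 2x_1^2 + 12x_1x_2 + 7x_2^2$$

$$f(x_1, -x_1) = 2x_1^2 - 12x_1x_1 + 7x_1^2$$

$$f(x_1, -x_1) = 2x_1^2 - 12x_1^2 + 7x_1^2$$

$$f(x_1, -x_1) = -3x_1^2$$

$$\begin{bmatrix} \frac{\partial^2}{\partial x \partial x} & \frac{\partial^2}{\partial x \partial y} \\ \frac{\partial^2}{\partial y \partial x} & \frac{\partial^2}{\partial y \partial y} \end{bmatrix}$$

Out[35]: $\begin{bmatrix} 2 & 6 \\ 6 & 20 \end{bmatrix}$

Out[36]: $\left(\begin{bmatrix} 1 & 0 \\ 3 & 1 \end{bmatrix},\ \begin{bmatrix} 2 & 6 \\ 0 & 2 \end{bmatrix}\right)$

Out[37]: $\left[2x_1^2 + 12x_1x_2 + 20x_2^2\right]$

$$2x_1^2 + 12x_1x_2 + 20x_2^2$$

$$2(x_1 + 3x_2)^2 + 2x_2^2$$

$$\begin{bmatrix} \frac{\partial^2}{\partial x \partial x} & \frac{\partial^2}{\partial x \partial y} \\ \frac{\partial^2}{\partial y \partial x} & \frac{\partial^2}{\partial y \partial y} \end{bmatrix}$$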



In [38]: function = 2 * x1 ** 2 + 12 * x1 * x2 + 20 * x2 ** 2
function

In [39]: fxx = diff(function, x1, 2)
fxy = diff(function, x1, x2)
fyx = diff(function, x2, x1)
fyy = diff(function, x2, 2)

In [40]: deriv_matr = Matrix([[fxx, fxy], [fyx, fyy]])
deriv_matr

In [41]: deriv_matr.det()

Setting the first partial derivatives equal to zero finds the extrema
The condition above sets everything positive (positive definite)

Let's then look at the following

In [42]: function = 2 * x1 ** 2 + 12 * x1 * x2 + 7 * x2 ** 2
function

In [43]: fxx = diff(function, x1, 2)
fxy = diff(function, x1, x2)
fyx = diff(function, x2, x1)
fyy = diff(function, x2, 2)

In [44]: deriv_matr = Matrix([[fxx, fxy], [fyx, fyy]])
deriv_matr


In [45]: deriv_matr.det()

Although the entries of the matrix of second derivatives are all positive, the determinant is negative
Not all conditions are met for the original matrix to be positive definite
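The two cases can be compared side by side with sympy's `hessian` helper. This is a small sketch, not the notebook's own code; the function name `second_derivative_test` and the verdict strings are made up for illustration.

```python
from sympy import symbols, hessian

x1, x2 = symbols('x1 x2')

def second_derivative_test(f):
    """Return the Hessian of f(x1, x2), its determinant, and a verdict."""
    H = hessian(f, (x1, x2))
    d = H.det()
    if d > 0 and H[0, 0] > 0:
        verdict = 'minimum'
    elif d > 0 and H[0, 0] < 0:
        verdict = 'maximum'
    elif d < 0:
        verdict = 'saddle point'
    else:
        verdict = 'inconclusive'
    return H, d, verdict

bowl = 2 * x1 ** 2 + 12 * x1 * x2 + 20 * x2 ** 2
saddle = 2 * x1 ** 2 + 12 * x1 * x2 + 7 * x2 ** 2

for f in (bowl, saddle):
    H, d, verdict = second_derivative_test(f)
    print(H.tolist(), d, verdict)
```

Both Hessians have only positive entries, so the sign of the determinant is what separates the bowl from the saddle.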

Let's step this up to 3×3 symmetric matrices

In [46]: A = Matrix([[2, -1, 0], [-1, 2, -1], [0, -1, 2]])
A

In [47]: A.transpose() == A # Test to see if A is symmetric

Is this symmetric matrix positive definite?

Let's start by looking at the determinant (and sub-determinants)

In [48]: A.det() # determinant of the whole matrix

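Out[38]: $2x_1^2 + 12x_1x_2 + 20x_2^2$

Out[40]: $\begin{bmatrix} 4 & 12 \\ 12 & 40 \end{bmatrix}$

Out[41]: 16

Out[42]: $2x_1^2 + 12x_1x_2 + 7x_2^2$

Out[44]: $\begin{bmatrix} 4 & 12 \\ 12 & 14 \end{bmatrix}$

Out[45]: −88

Out[46]: $\begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}$

Out[47]: True

Out[48]: 4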



All the submatrices will be the following

Their determinants are all positive (2, 3, 4)

Let's look at the pivots


The pivots are all positive

Notice how the first pivot was 2 (also the first sub-determinant)
The product of the first two pivots must equal the second sub-determinant (which is 3), so the second pivot must be 3/2
The product of the first three (all) pivots must equal the third determinant (which is 4), so the third pivot must be 4/3

Let's look at the eigenvalues


Again, all positive
So far, so good
Just as a reminder, remember that the sum of the eigenvalues must equal the trace (sum of the entries on the main diagonal) and multiplying them must equal the determinant
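The determinant, pivot and eigenvalue tests for this 3×3 matrix can be collected in one place. The sketch below is only one possible way to do it (the names `minors`, `pivots` and `eigs` are illustrative): it computes the pivots as ratios of consecutive sub-determinants and checks the trace and determinant identities for the eigenvalues.

```python
from sympy import Matrix, Mul, Rational, simplify

A = Matrix([[2, -1, 0], [-1, 2, -1], [0, -1, 2]])

# Leading principal minors (the "sub-determinants")
minors = [A[:k, :k].det() for k in range(1, 4)]

# Pivots as ratios of consecutive minors: d1, d2/d1, d3/d2
pivots = [minors[0]] + [minors[k] / minors[k - 1] for k in range(1, 3)]

# Eigenvalues with multiplicity, flattened into a list
eigs = [lam for lam, mult in A.eigenvals().items() for _ in range(mult)]

print(minors, pivots, eigs)
print(simplify(sum(eigs) - A.trace()) == 0)   # sum of eigenvalues equals the trace
print(simplify(Mul(*eigs) - A.det()) == 0)    # product of eigenvalues equals the determinant
```

All three lists contain only positive numbers, which is the same conclusion reached by each test separately.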

Let's look at $x^T A x$

In [51]: x_vect = Matrix([x1, x2, x3])
x_vect


In [53]: function = 2 * x1 ** 2 - 2 * x1 * x2 + 2 * x2 ** 2 - 2 * x2 * x3 + 2 * x3 ** 2
function # A quadratic form in three variables

We can construct this as follows
The main diagonal entries are the coefficients of the squared variables (2, 2, and 2)
There is a -1 and a -1 in the 12 and 21 row-column positions, whose sum is -2 and which then belongs to the $x_1 x_2$ (or $x_2 x_1$) term
The 13 and 31 entries are both zero, so there will be no $x_1 x_3$ coefficient
The 23 and 32 entries are both -1, so again, a coefficient of -2 for $x_2 x_3$

This matrix represents a plot in 4D space, so we can't draw it
We can construct it as the sum of three squares though
The three squares will be made up of the three pivots (for their coefficients)
They (and obviously the squared values) are all positive and therefore we will only have f(x,y,z) values which are positive

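$$[2], \quad \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}, \quad \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}$$

Out[49]: $\left(\begin{bmatrix} 1 & 0 & 0 \\ -\frac{1}{2} & 1 & 0 \\ 0 & -\frac{2}{3} & 1 \end{bmatrix},\ \begin{bmatrix} 2 & -1 & 0 \\ 0 & \frac{3}{2} & -1 \\ 0 & 0 & \frac{4}{3} \end{bmatrix}\right)$

$$2, \quad \frac{3}{2}, \quad \frac{4}{3}$$

Out[50]: $\left\{2: 1,\ -\sqrt{2} + 2: 1,\ \sqrt{2} + 2: 1\right\}$

Out[51]: $\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$

Out[52]: $\left[2x_1^2 - 2x_1x_2 + 2x_2^2 - 2x_2x_3 + 2x_3^2\right]$

Out[53]: $2x_1^2 - 2x_1x_2 + 2x_2^2 - 2x_2x_3 + 2x_3^2$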



Cutting through this 4D space (which is difficult to visualize) at, say, f(x,y,z) = 1 will give an ellipsoid (lopsided football)
A sphere would have three equal eigenvalues
A football shape would have two identical eigenvalues and the third different
The lopsided ellipsoid has all three eigenvalues different, as in this case
The half-lengths of the axes of these shapes are 1 over the square roots of the eigenvalues
Diagonalization gives the principal axis theorem

In [54]: fxx = diff(function, x1, x1)
fxy = diff(function, x1, x2)
fxz = diff(function, x1, x3)
fyx = diff(function, x2, x1)
fyy = diff(function, x2, x2)
fyz = diff(function, x2, x3)
fzx = diff(function, x3, x1)
fzy = diff(function, x3, x2)
fzz = diff(function, x3, x3)

In [55]: deriv_matr = Matrix([[fxx, fxy, fxz], [fyx, fyy, fyz], [fzx, fzy, fzz]])
deriv_matr


The determinant (and all sub-determinants) are positive

Example problems

Example problem 1

For which values of c will the following matrix be positive definite, and for which will it be positive semidefinite?

Solution

Let's try the determinant test first

In [57]: A = Matrix([[2, -1, -1], [-1, 2, -1], [-1, -1, 2 + c]])
A

In [58]: A.det()

All the sub-determinants are positive, being 2, 3 and then 3c for c > 0

Let's look at the pivot test

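$$Q \Lambda Q^T$$

Out[55]: $\begin{bmatrix} 4 & -2 & 0 \\ -2 & 4 & -2 \\ 0 & -2 & 4 \end{bmatrix}$

Out[56]: 32

$$\begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 + c \end{bmatrix}$$

Out[57]: $\begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & c + 2 \end{bmatrix}$

Out[58]: 3c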




Again, all the pivots (in U) are positive for c > 0
So for positive definite we have c > 0 and for semi-definite we have c = 0


The energy or completing the square test


For $x_3$ we have $(c+2)x_3^2$
Remember, though, that for the squares of the x-components we must have the entries along the main diagonal of A as their coefficients; thus c + 2 = 2 and hence, again, c = 0

For interest's sake, we will have the following completed square equation

Now the coefficients come from the values along the diagonal of U
The -½ values come from the multipliers as seen in column 1 of L
The +1 and -1 for (y-z) come from column 2 of L
The +1 in front of z comes from column 3 of L (actually every set of () contains an x, y and z; some coefficients (from L) are just zero)
Be that as it may, for c > 0 the squared equation as it stands will only equal zero if x = y = z = 0
For a value of more than zero, c must be positive
Please note that by x, y and z I am referring to $x_1$, $x_2$ and $x_3$

If c = 0 we have the following matrix

In [62]: A = Matrix([[2, -1, -1], [-1, 2, -1], [-1, -1, 2]])
A

In [63]: A.det()

A singular matrix, again only possible if all variables equal zero

In [ ]:

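Out[59]: $\left(\begin{bmatrix} 1 & 0 & 0 \\ -\frac{1}{2} & 1 & 0 \\ -\frac{1}{2} & -1 & 1 \end{bmatrix},\ \begin{bmatrix} 2 & -1 & -1 \\ 0 & \frac{3}{2} & -\frac{3}{2} \\ 0 & 0 & c \end{bmatrix}\right)$

Out[60]: $\left\{3: 1,\ \frac{c}{2} - \frac{1}{2}\sqrt{c^2 + 2c + 9} + \frac{3}{2}: 1,\ \frac{c}{2} + \frac{1}{2}\sqrt{c^2 + 2c + 9} + \frac{3}{2}: 1\right\}$

Out[61]: $\left[c x_3^2 + 2x_1^2 - 2x_1x_2 - 2x_1x_3 + 2x_2^2 - 2x_2x_3 + 2x_3^2\right]$

$$2\left(x - \frac{1}{2}y - \frac{1}{2}z\right)^2 + \frac{3}{2}(y - z)^2 + cz^2$$

Out[62]: $\begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{bmatrix}$

Out[63]: 0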

2015/03/29, 8:05 PMIII_04_Similar_matrices_Jordan_form


This notebook is part of lecture 28 Similar matrices and Jordan form in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper







Similar matrices

Positive definite matrices

Remember the following from positive definite matrices

These always refer to symmetric matrices
What do we know about their inverses?

We can't say anything about their pivots
The following is true for their eigenvalues, though

The inverse is also positive definite
If both A and B are positive definite

We don't know the pivots of (A+B)
We don't know the eigenvalues of (A+B)
We could look at the following (which is true)

From least squares, the m×n matrix A is not square and not symmetric (for this section, though, we assume its rank is n)
We used $A^T A$, which is square and symmetric, but is it positive definite?
This is analogous to real numbers, where we ask whether the square of any value is positive

Again, we don't know the pivots or eigenvalues
We do look at the following, which is always positive (which you can show by grouping some terms)

This last statement is just the squared length of Ax, which must be positive (it is zero only if Ax = 0, which for rank n means x = 0)
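The grouping argument can be checked numerically. This is a minimal sketch with an arbitrary random 5×3 matrix (the sizes, the seed and the names are assumptions made for illustration): it compares $x^T A^T A x$ with $\|Ax\|^2$ and looks at the eigenvalues of $A^T A$.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))      # a 5x3 matrix; random entries give rank 3
x = rng.standard_normal(3)           # an arbitrary non-zero vector

energy = x @ (A.T @ A) @ x           # x^T (A^T A) x
length_sq = np.linalg.norm(A @ x) ** 2

# A^T A is symmetric, so eigvalsh applies
eigenvalues = np.linalg.eigvalsh(A.T @ A)

print(np.isclose(energy, length_sq))
print(eigenvalues)
```

With full column rank every eigenvalue of $A^T A$ is strictly positive; if the columns of A were dependent, at least one eigenvalue would be zero and $A^T A$ would only be positive semidefinite.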

Similar matrices

$$x^T A x > 0; \quad x \ne 0$$

$$\lambda_{A^{-1}} = \frac{1}{\lambda_A}$$

$$x^T (A + B) x > 0$$

$$x^T A^T A x = (Ax)^T (Ax) = \|Ax\|^2$$



Consider two similar, square matrices A and B (no longer with the requirement that they are symmetric)
They have the same size, though
The similarity lies in the fact that there is some invertible matrix M for which the following holds

Remember the creation of the diagonal matrix using the eigenvector matrix

This says A is similar to Λ

Now we consider some (invertible) matrix M and create a matrix B from $M^{-1}AM$
We state that B is then similar to A (it is now part of some family of matrices of A, the neatest of which is the diagonal matrix Λ, created via the eigenvector matrix of A)

In [4]: A = Matrix([[2, 1], [1, 2]])
A

In [5]: S, D = A.diagonalize() # S is the eigenvector matrix

In [6]: S.inv() * A * S # The matrix Lambda

Now let's invent a matrix M

In [7]: M = Matrix([[1, 4], [0, 1]])
M

In [8]: B = M.inv() * A * M
A, B # Printing both to the screen

What do A and B have in common?
They have the same eigenvalues

In [9]: A.eigenvals(), B.eigenvals() # The solution is in the form {eigenvalue: how many times that value occurs...}

All matrices of this size with these same (distinct) eigenvalues are similar matrices
The most special member of this family is the diagonal matrix with the eigenvalues on the main diagonal

The eigenvectors are not the same though

In [10]: A.eigenvects(), B.eigenvects()
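The relation between the eigenvectors of A and B can be verified directly. The sketch below reuses the A and M from the cells above; the names `same_eigenvalues` and `checks` are illustrative. For every eigenvector x of A it tests that $M^{-1}x$ is an eigenvector of B with the same eigenvalue.

```python
from sympy import Matrix

A = Matrix([[2, 1], [1, 2]])
M = Matrix([[1, 4], [0, 1]])
B = M.inv() * A * M

same_eigenvalues = A.eigenvals() == B.eigenvals()

# If A x = lambda x, then B (M^-1 x) = lambda (M^-1 x)
checks = []
for lam, mult, vecs in A.eigenvects():
    for x in vecs:
        y = M.inv() * x
        checks.append(B * y == lam * y)

print(B.tolist(), same_eigenvalues, checks)
```

So the eigenvalues are shared by the whole family, while each member's eigenvectors are the original ones multiplied by $M^{-1}$.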

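$$B = M^{-1}AM$$

$$S^{-1}AS = \Lambda$$

Out[4]: $\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$

Out[6]: $\begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix}$

Out[7]: $\begin{bmatrix} 1 & 4 \\ 0 & 1 \end{bmatrix}$

Out[8]: $\left(\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix},\ \begin{bmatrix} -2 & -15 \\ 1 & 6 \end{bmatrix}\right)$

Out[9]: $\left(\{1: 1,\ 3: 1\},\ \{1: 1,\ 3: 1\}\right)$

$$Ax = \lambda x$$

$$M^{-1}Ax = \lambda M^{-1}x$$

$$\because MM^{-1} = I$$

$$M^{-1}AMM^{-1}x = \lambda M^{-1}x$$

$$\because B = M^{-1}AM$$

$$BM^{-1}x = \lambda M^{-1}x$$

Out[10]: $\left(\left[\left(1, 1, \left[\begin{bmatrix} -1 \\ 1 \end{bmatrix}\right]\right), \left(3, 1, \left[\begin{bmatrix} 1 \\ 1 \end{bmatrix}\right]\right)\right],\ \left[\left(1, 1, \left[\begin{bmatrix} -5 \\ 1 \end{bmatrix}\right]\right), \left(3, 1, \left[\begin{bmatrix} -3 \\ 1 \end{bmatrix}\right]\right)\right]\right)$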



Remember that we have a problem when eigenvalues are repeated for a matrix
If this is so, we might not have a full set of eigenvectors and we cannot diagonalize

In [11]: A1 = Matrix([[4, 0], [0, 4]])
A2 = Matrix([[4, 1], [0, 4]])

In [12]: A1.eigenvals()


Both matrices $A_1$ and $A_2$ have the same repeated eigenvalue, namely 4
They are not similar, though
There is no matrix M to use with $A_1$ to produce $A_2$

Note that $A_1$ is 4 multiplied by the identity matrix of size 2
It is a small family, with only this member
$A_2$ is the neatest member of its much larger family
Diagonalizing it is not possible, though; if it were, it would result in $A_1$, which is not in the same family, leaving $A_2$ as the neatest family member

The nicest (most diagonal one) is called the Jordan form of the family

Let's find more members of the family of $A_2$

The matrix $A_2$ is

The trace is 8, so let's choose 5 and 3

The determinant must remain 16, so let's choose 1 and -1

In [14]: A3 = Matrix([[5, 1], [-1, 3]])
A1.eigenvals() == A3.eigenvals() # Check to see if the eigenvalues are the same

So we have to add the same number of independent eigenvectors to the definition of similar matrices
It's more than that, though

In [15]: A4 = Matrix([[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
A4

In [16]: A4.eigenvals() # Four zeros

In [17]: A4.eigenvects() # Rank of 2

Out[12]: { }4 : 2

Out[13]: { }4 : 2

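$$\begin{bmatrix} 4 & 1 \\ 0 & 4 \end{bmatrix}$$

$$\begin{bmatrix} 5 & \\ & 3 \end{bmatrix}$$

$$\begin{bmatrix} 5 & 1 \\ -1 & 3 \end{bmatrix}$$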

Out[14]: True

Out[15]: $\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$

Out[16]: $\{0: 4\}$

Out[17]: $\left[\left(0, 4, \left[\begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}\right]\right)\right]$



In [18]: A5 = Matrix([[0, 1, 7, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
A5


In [20]: A5.eigenvects()

In [21]: A6 = Matrix([[0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]])
A6


In [23]: A6.eigenvects()

In [24]: A4.eigenvects() == A5.eigenvects()

In [25]: A4.eigenvects() == A6.eigenvects()

Jordan's theorem
Every square matrix A is similar to a Jordan matrix J
There is one eigenvector per block
The eigenvalues sit along the main diagonal
The matrices are not similar if the blocks are not of the same size
See problem 3 below where Jordan blocks are formed (they must actually both be broken down further into true Jordan blocks, which will show the blocks to be of unequal size; instead I keep them in non-Jordan form (not correct) and show a different number of pivots and thereby a different number of eigenvectors)
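The difference in block sizes between A4 and A6 can be made visible without comparing eigenvectors. The sketch below is an illustration, not the notebook's method: it squares both matrices (a 3×3 block survives squaring, 2×2 blocks do not) and then asks sympy's `jordan_form()` for P and J with $A = PJP^{-1}$.

```python
from sympy import Matrix, zeros

A4 = Matrix([[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
A6 = Matrix([[0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]])

# Same eigenvalues and the same number of eigenvectors (4 - rank = 2) ...
same_eigs = A4.eigenvals() == A6.eigenvals()
same_rank = A4.rank() == A6.rank()

# ... but different block sizes: a 3x3 block survives squaring, 2x2 blocks do not
A4_sq_is_zero = (A4 ** 2 == zeros(4, 4))
A6_sq_is_zero = (A6 ** 2 == zeros(4, 4))

# jordan_form returns P and J with A = P J P^-1
P4, J4 = A4.jordan_form()
P6, J6 = A6.jordan_form()

print(same_eigs, same_rank, A4_sq_is_zero, A6_sq_is_zero)
print(J4.tolist())
print(J6.tolist())
```

Similar matrices have similar powers, so one matrix squaring to zero while the other does not is already enough to rule out similarity.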

Example problems

Example problem 1

If A and B are similar matrices, why are the following similar?

Solution

Out[18]: $\begin{bmatrix} 0 & 1 & 7 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$

Out[19]: $\{0: 4\}$

Out[20]: $\left[\left(0, 4, \left[\begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}\right]\right)\right]$

Out[21]: $\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$

Out[22]: $\{0: 4\}$

Out[23]: $\left[\left(0, 4, \left[\begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}\right]\right)\right]$

Out[24]: True

Out[25]: False

$$2A^3 + A - 3I$$

$$2B^3 + B - 3I$$



There is some matrix M such that the following is true

From this follows

I.e. if two matrices (A and B) are similar, any polynomial in A is similar to the same polynomial in B (replacing A with B)
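The claim is easy to test on a concrete pair. The following sketch uses an arbitrary A and M chosen for illustration (they are not part of the example problem) and a small helper `p` for the polynomial $2X^3 + X - 3I$.

```python
from sympy import Matrix, eye

A = Matrix([[2, 1], [1, 2]])
M = Matrix([[1, 4], [0, 1]])
B = M * A * M.inv()                      # B = M A M^-1, as in the solution

def p(X):
    """The polynomial 2 X^3 + X - 3 I evaluated at a square matrix X."""
    return 2 * X ** 3 + X - 3 * eye(X.shape[0])

lhs = M * p(A) * M.inv()                 # M (2A^3 + A - 3I) M^-1
rhs = p(B)                               # 2B^3 + B - 3I

print(lhs == rhs)
```

The same M that connects A and B also connects p(A) and p(B), because every $M^{-1}M$ in the middle of the products cancels.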

Example problem 2

Are the two 3×3 matrices A and B , with eigenvalues 1, 0, -1 similar?

Solution

Yes, because the eigenvalues are distinct, so both matrices are diagonalizable and similar to the same diagonal matrix

Example problem 3

Are these two matrices similar?

Solution

No

In [26]: J1 = Matrix([[-1, 1, 0], [0, -1, 1], [0, 0, -1]])
J2 = Matrix([[-1, 1, 0], [0, -1, 0], [0, 0, -1]])
J1, J2

Let's create Jordan blocks from these

In [27]: J1 + eye(3), J2 + eye(3)

After this shift the blocks have zeros on the main diagonal and 1s in various positions just above the main diagonal
Note the difference between the Jordan blocks of $J_1$ and $J_2$
The first now contains two pivots and the second only one; they will not have the same number of eigenvectors and cannot be similar

In [ ]:

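$$MAM^{-1} = B$$

$$M(2A^3 + A - 3I)M^{-1}$$

$$= 2(MAM^{-1}MAM^{-1}MAM^{-1}) + MAM^{-1} - 3MIM^{-1}$$

$$= 2B^3 + B - 3I$$

$$J_1 = \begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & -1 \end{bmatrix}$$

$$J_2 = \begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix}$$

Out[26]: $\left(\begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & -1 \end{bmatrix},\ \begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix}\right)$

Out[27]: $\left(\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}\right)$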

2015/03/29, 8:08 PMIII_05_Singular_value_decomposition


This notebook is part of lecture 29 Singular value decomposition in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper





In [2]: from sympy import init_printing, Matrix, symbols, sqrt, Rational
from warnings import filterwarnings


This chapter starts with explanations using sympy
A proper method using numpy and scipy is described in the example problem at the end

Singular value decomposition (SVD)

Derivation

This is the final form of matrix factorization
The factors are an orthogonal matrix U, a diagonal matrix Σ, and an orthogonal matrix V

In case the matrix A is symmetric positive definite, the decomposition is akin to the following

Consider a vector $v_1$ in the $\mathbb{R}^n$ row space, transformed into a vector $u_1$ in the $\mathbb{R}^m$ column space by the matrix A

What we are looking for is an orthogonal basis in the $\mathbb{R}^n$ row space, transformed into an orthogonal basis in the $\mathbb{R}^m$ column space

It's easy to calculate an orthogonal basis in the row space using Gram-Schmidt
Now, though, we need something special in A that would ensure that the basis $u_i$ in the $\mathbb{R}^m$ column space is also orthogonal (and at the same time make it orthonormal, so that $Av_i$ ends up as $\sigma_i u_i$)
The two nullspaces are not required
So, we are looking for the following

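$$A = U \Sigma V^T$$

$$A = Q \Lambda Q^T$$

$$u_1 = Av_1$$

$$u_1 = Av_1, \qquad v_1 \perp v_2; \quad u_1 \perp u_2$$

$$A \begin{bmatrix} \vdots & \vdots & & \vdots \\ v_1 & v_2 & \cdots & v_r \\ \vdots & \vdots & & \vdots \end{bmatrix} = \begin{bmatrix} \vdots & \vdots & & \vdots \\ u_1 & u_2 & \cdots & u_r \\ \vdots & \vdots & & \vdots \end{bmatrix} \begin{bmatrix} \sigma_1 & & & \\ & \sigma_2 & & \\ & & \ddots & \\ & & & \sigma_r \end{bmatrix}$$

$$AV = U\Sigma$$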



In the case that we are not changing spaces, V and U would be the same matrix Q (and then $Q^{-1}$)

Example problem explaining the derivation

Look at the next matrix A that is square and invertible (i.e. rank 2)

In [4]: A = Matrix([[4, 4], [-3, 3]])
A

We are looking for $v_1$ and $v_2$ in the $\mathbb{R}^2$ rowspace and $u_1$ and $u_2$ in the $\mathbb{R}^2$ columnspace, as well as the scaling factors $\sigma_1 > 0$ and $\sigma_2 > 0$

Just to be complete, we extend V until $v_n$ with zero columns and U with zero columns until $u_m$, as well as zeros for Σ to include the nullspaces

Now A is not symmetric, so its eigenvectors are not orthogonal (Q), and we can't go that route

From above we have the following and because V is square and orthogonal we have

Multiplying both sides by $A^T$ we will have a left-hand side that is square and positive (semi)definite

Because $A^T A$ is now positive (semi)definite, we have a perfect situation akin to being able to use $Q \Lambda Q^T$
The eigenvalues are the squares of the $\sigma_i$ values
To get U we use $AA^T$ and use its eigenvalues and eigenvectors
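The relation between the eigenvalues of $A^T A$ (and $AA^T$) and the singular values can be checked numerically with numpy. This is a minimal sketch; the variable names are illustrative.

```python
import numpy as np

A = np.array([[4.0, 4.0], [-3.0, 3.0]])

# Eigen-decomposition of the symmetric matrices A^T A and A A^T
eig_ATA, V = np.linalg.eigh(A.T @ A)     # eigenvalues in ascending order
eig_AAT, U = np.linalg.eigh(A @ A.T)

# Singular values from the library routine, in descending order
sigma = np.linalg.svd(A, compute_uv=False)

print(eig_ATA, eig_AAT)
print(np.sort(sigma ** 2))
```

Both symmetric products share the eigenvalues 18 and 32, and their square roots are the singular values. The columns of `V` and `U` are orthonormal eigenvectors, although their signs still have to be matched so that $Av_i = \sigma_i u_i$.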

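All of this is easy to accomplish with the mpmath library (bundled with older versions of sympy as sympy.mpmath, a separate package since SymPy 1.0)

In [5]: from mpmath import svd # Older sympy versions: from sympy.mpmath import svd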

In [6]: U, S, V = svd(A)

In [7]: U # The numbers round to zero!!! Please see it as zero

In [8]: S # Not the final Sigma matrix

In [9]: V

There are square roots, so the values are given instead of symbols

Now let's do it step-by-step

Out[4]: $\begin{bmatrix} 4 & 4 \\ -3 & 3 \end{bmatrix}$

$$A = U \Sigma V^{-1}$$

$$A = U \Sigma V^T$$

$$A = U \Sigma V^T$$

$$A^T A = V \Sigma^T U^T U \Sigma V^T$$

$$\because U^T U = I$$

$$\because \Sigma^T \Sigma = \dots \sigma_i^2 \dots$$

$$A^T A = V \Sigma^T \Sigma V^T$$

Out[7]: $\begin{bmatrix} 1.0 & 3.33066907387547 \cdot 10^{-16} \\ 1.11022302462516 \cdot 10^{-16} & -1.0 \end{bmatrix}$

Out[8]: matrix([['5.65685424949238'], ['4.24264068711928']])

Out[9]: matrix([['0.707106781186547', '0.707106781186548'], ['0.707106781186548', '-0.707106781186547']])




In [11]: (A.transpose() * A).eigenvals()

In [12]: (A.transpose() * A).eigenvects()

These are not normalized, though
Also remember to take the square roots of the eigenvalues...
... and to add zeros to incorporate the correct size for m and n...
... and to take the transpose

Now let's tackle U


In [14]: (A * A.transpose()).eigenvals()

The eigenvalues are always the same

In [15]: (A * A.transpose()).eigenvects()

Also remember to normalize (see example problem below)

We now have U, Σ and V (although Σ must still be constructed; see below)

Example problem to explain the derivation for dependent rows, columns

Let's consider this rank = 1, 2×2 singular matrix
The rowspace is just a line (the second row is a constant multiple of the first)
The nullspace of this row picture is a line perpendicular to this
The columnspace is also a line, with the nullspace of $A^T$ being a line perpendicular to this

In [16]: A = Matrix([[4, 3], [8, 6]])A

Let's use svd() first

In [17]: U, S, V = svd(A, full_matrices = True, compute_uv = True)

This is likely to be different to the value you calculate for UWe are talking unit basis vectors, though, which can be in a different direction depending on your choice

In [18]: U

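Out[10]: $\begin{bmatrix} 25 & 7 \\ 7 & 25 \end{bmatrix}$

Out[11]: $\{18: 1,\ 32: 1\}$

Out[12]: $\left[\left(18, 1, \left[\begin{bmatrix} -1 \\ 1 \end{bmatrix}\right]\right), \left(32, 1, \left[\begin{bmatrix} 1 \\ 1 \end{bmatrix}\right]\right)\right]$

Out[13]: $\begin{bmatrix} 32 & 0 \\ 0 & 18 \end{bmatrix}$

Out[14]: $\{18: 1,\ 32: 1\}$

Out[15]: $\left[\left(18, 1, \left[\begin{bmatrix} 0 \\ 1 \end{bmatrix}\right]\right), \left(32, 1, \left[\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right]\right)\right]$

Out[16]: $\begin{bmatrix} 4 & 3 \\ 8 & 6 \end{bmatrix}$

Out[18]: $\begin{bmatrix} -0.447213595499958 & -0.894427190999916 \\ -0.894427190999916 & 0.447213595499958 \end{bmatrix}$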



In [19]: S

Note that the size of our Σ matrix is wrong
It has to be 2×2 and we have to create it from this info
Since A has rank = 1 and all off-diagonal entries must be zero, we will only have a value in the first row, first column position
Below I show you how to correct this

In [20]: V

In [21]: A.transpose() * A # Which will be symmetric positive semidefinite and of rank = 1

In [22]: (A.transpose() * A).eigenvals() # One eigenvalue will be zero and the other must then be the trace

In [23]: (A.transpose() * A).eigenvects()


In [25]: (A * A.transpose()).eigenvals()

In [26]: (A * A.transpose()).eigenvects()

Inserted below is the three resultant matrices from our calculations above (normalized, etc)

In [27]: Matrix([[1 / sqrt(5), 2 / sqrt(5)], [2 / sqrt(5), 1 / sqrt(5)]]), Matrix([[sqrt(125), 0], [0, 0]]), Matrix([[0.8, 0.6], [0.6, -0.8]])

Now let me show you how to correct the svd() solutions

In [28]: U

In [29]: S

In [30]: S = Matrix([[11.1803398874989, 0], [0, 0]])
S # Composed by hand (proper method further below)

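Out[19]: matrix([['11.1803398874989'], ['0.0']])

Out[20]: matrix([['-0.8', '-0.6'], ['-0.6', '0.8']])

Out[21]: $\begin{bmatrix} 80 & 60 \\ 60 & 45 \end{bmatrix}$

Out[22]: $\{0: 1,\ 125: 1\}$

Out[23]: $\left[\left(0, 1, \left[\begin{bmatrix} -\frac{3}{4} \\ 1 \end{bmatrix}\right]\right), \left(125, 1, \left[\begin{bmatrix} \frac{4}{3} \\ 1 \end{bmatrix}\right]\right)\right]$

Out[24]: $\begin{bmatrix} 25 & 50 \\ 50 & 100 \end{bmatrix}$

Out[25]: $\{0: 1,\ 125: 1\}$

Out[26]: $\left[\left(0, 1, \left[\begin{bmatrix} -2 \\ 1 \end{bmatrix}\right]\right), \left(125, 1, \left[\begin{bmatrix} \frac{1}{2} \\ 1 \end{bmatrix}\right]\right)\right]$

Out[27]: $\left(\begin{bmatrix} \frac{\sqrt{5}}{5} & \frac{2\sqrt{5}}{5} \\ \frac{2\sqrt{5}}{5} & \frac{\sqrt{5}}{5} \end{bmatrix},\ \begin{bmatrix} 5\sqrt{5} & 0 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0.8 & 0.6 \\ 0.6 & -0.8 \end{bmatrix}\right)$

Out[28]: $\begin{bmatrix} -0.447213595499958 & -0.894427190999916 \\ -0.894427190999916 & 0.447213595499958 \end{bmatrix}$

Out[29]: matrix([['11.1803398874989'], ['0.0']])

Out[30]: $\begin{bmatrix} 11.1803398874989 & 0 \\ 0 & 0 \end{bmatrix}$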



In [31]: V

In [32]: V = Matrix([[-0.8, -0.6], [-0.6, 0.8]])
V # Remember that this is actually V transpose

Let's calculate $U \Sigma V^T$

In [33]: U * S * V

Compensating for rounding, this is the original matrix A

Summary

The orthonormal basis for the rowspace is

The orthonormal basis for the columnspace is

The orthonormal basis for the nullspace is

The orthonormal basis for the nullspace of $A^T$ is
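For the rank 1 matrix used above, all four bases can be read off from a numerical SVD. The sketch below uses numpy rather than the sympy route of this section; the tolerance `tol` and the variable names are assumptions made for illustration.

```python
import numpy as np

A = np.array([[4.0, 3.0], [8.0, 6.0]])
U, s, VT = np.linalg.svd(A)

tol = 1e-10
r = int(np.sum(s > tol))        # numerical rank

row_space = VT[:r].T            # v_1 ... v_r
null_space = VT[r:].T           # v_{r+1} ... v_n
col_space = U[:, :r]            # u_1 ... u_r
left_null_space = U[:, r:]      # u_{r+1} ... u_m

print(r, s)
print(row_space.ravel(), null_space.ravel())
print(col_space.ravel(), left_null_space.ravel())
```

The signs of the basis vectors may differ from a hand calculation, since each unit vector can point in either direction along its line.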

Example problem

Example problem 1

Find the singular value decomposition of the matrix

Solution

First off, I'll show you how to make proper use of numpy and scipy (as opposed to sympy) to solve singular value decomposition problems

In [34]: from numpy import matrix, transpose # Importing the matrix object and the
# transpose object from numerical python (numpy)
from numpy.linalg import svd, det # Importing the svd and determinant
# methods from the linalg submodule
from scipy.linalg import diagsvd

In [35]: type(transpose) # Type tells us what 'something' is (sometimes)

In [36]: C = matrix([[5, 5], [-1, 7]]) # Using the numpy matrix object
C

We can see from the determinant that the rows and columns are independent

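Out[31]: matrix([['-0.8', '-0.6'], ['-0.6', '0.8']])

Out[32]: $\begin{bmatrix} -0.8 & -0.6 \\ -0.6 & 0.8 \end{bmatrix}$

Out[33]: $\begin{bmatrix} 3.99999999999998 & 2.99999999999999 \\ 7.99999999999996 & 5.99999999999997 \end{bmatrix}$

$$v_1, v_2, \dots, v_r$$

$$u_1, u_2, \dots, u_r$$

$$v_{r+1}, v_{r+2}, \dots, v_n$$

$$u_{r+1}, u_{r+2}, \dots, u_m$$

$$\begin{bmatrix} 5 & 5 \\ -1 & 7 \end{bmatrix}$$

Out[35]: function

Out[36]: matrix([[ 5, 5], [-1, 7]])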



In [37]: det(C) # Notice the difference in syntax

Let's calculate V by looking at $C^T C$

In [38]: transpose(C) * C # Notice the difference in syntax

This is symmetric, positive definite
Both eigenvalues are positive; they must sum to the trace (100) and multiply to the determinant (1600)
Remember that the eigenvalues are the squares of the $\sigma_i$ values

Now let's put numpy and scipy to good use

In [39]: U, S, VT = svd(C) # I use the computer variable VT to remind us that
# this is the transpose of V

S will only indicate the singular values and must be converted to the correct sized matrix

In [40]: M, N = C.shape # Shape returns a tuple (two values), indicating
# row and column size
M, N

In [41]: Sig = diagsvd(S, M, N) # Creating an m times n matrix from S
Sig

In [42]: VT

Let's check if it worked!

In [43]: U * Sig * VT

Now, let's use good old sympy

In [44]: C = Matrix([[5, 5], [-1, 7]])C

We need to work with a positive (semi)definite matrix

In [45]: CTC = C.transpose() * C # Using the computer variable CTC to remind us that
# it is C transpose times C
CTC, CTC.det()

Let's look at the eigenvalues

In [46]: CTC.eigenvals()

Out[37]: 40.0

Out[38]: matrix([[26, 18], [18, 74]])

Out[40]: (2, 2)

Out[41]: array([[ 8.94427191, 0. ], [ 0. , 4.47213595]])

Out[42]: matrix([[ 0.31622777, 0.9486833 ], [ 0.9486833 , -0.31622777]])

Out[43]: matrix([[ 5., 5.], [-1., 7.]])

Out[44]: $\begin{bmatrix} 5 & 5 \\ -1 & 7 \end{bmatrix}$

Out[45]: $\left(\begin{bmatrix} 26 & 18 \\ 18 & 74 \end{bmatrix},\ 1600\right)$

Out[46]: $\{20: 1,\ 80: 1\}$



Σ will contain along its main diagonal the square root of these eigenvalues

In [47]: Sig = Matrix([[sqrt(20), 0], [0, sqrt(80)]])
Sig

For V we require the eigenvectors of $C^T C$
We need to remember to normalize each vector (dividing each component by the length (norm) of that vector)

In [48]: CTC.eigenvects()

Let's normalize each $v_i$ by calculating the length (norm) of each

In [49]: v1 = Matrix([-3, 1])
v1.norm()

In [50]: v2 = Matrix([Rational(1, 3), 1])
v2.norm()

We'll get each element of V by dividing by these norms

In [51]: -3 / v1.norm(), 1 / v1.norm()

In [52]: Rational(1, 3) / v2.norm(), 1 / v2.norm()

In [53]: V = Matrix([[-3 / sqrt(10), 1 / sqrt(10)], [1 / sqrt(10), 3 / sqrt(10)]])
# Just remember to put the elements of V in the correct place
V

Note that this V is symmetric, so it is equal to its own transpose

In [54]: V == V.transpose()

Now for U using $CC^T$

In [55]: CCT = C * C.transpose() # Using the computer variable CCT
CCT

The eigenvalues will be the same

In [56]: CCT.eigenvals()

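Out[47]: $\begin{bmatrix} 2\sqrt{5} & 0 \\ 0 & 4\sqrt{5} \end{bmatrix}$

Out[48]: $\left[\left(20, 1, \left[\begin{bmatrix} -3 \\ 1 \end{bmatrix}\right]\right), \left(80, 1, \left[\begin{bmatrix} \frac{1}{3} \\ 1 \end{bmatrix}\right]\right)\right]$

Out[49]: $\sqrt{10}$

Out[50]: $\frac{\sqrt{10}}{3}$

Out[51]: $\left(-\frac{3\sqrt{10}}{10},\ \frac{\sqrt{10}}{10}\right)$

Out[52]: $\left(\frac{\sqrt{10}}{10},\ \frac{3\sqrt{10}}{10}\right)$

Out[53]: $\begin{bmatrix} -\frac{3\sqrt{10}}{10} & \frac{\sqrt{10}}{10} \\ \frac{\sqrt{10}}{10} & \frac{3\sqrt{10}}{10} \end{bmatrix}$

Out[54]: True

Out[55]: $\begin{bmatrix} 50 & 30 \\ 30 & 50 \end{bmatrix}$

Out[56]: $\{20: 1,\ 80: 1\}$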



In [57]: CCT.eigenvects()

In [58]: u1 = Matrix([-1, 1])
u2 = Matrix([1, 1])

In [59]: -1 / u1.norm(), 1 / u1.norm()

In [60]: 1 / u2.norm(), 1 / u2.norm()

In [61]: U = Matrix([[-sqrt(2) / 2, sqrt(2) / 2], [sqrt(2) / 2, sqrt(2) / 2]])
# Just remember to put the elements of U in the correct place
U

Let's see if it worked!

In [62]: U * Sig * V

In [ ]:

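Out[57]: $\left[\left(20, 1, \left[\begin{bmatrix} -1 \\ 1 \end{bmatrix}\right]\right), \left(80, 1, \left[\begin{bmatrix} 1 \\ 1 \end{bmatrix}\right]\right)\right]$

Out[59]: $\left(-\frac{\sqrt{2}}{2},\ \frac{\sqrt{2}}{2}\right)$

Out[60]: $\left(\frac{\sqrt{2}}{2},\ \frac{\sqrt{2}}{2}\right)$

Out[61]: $\begin{bmatrix} -\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix}$

Out[62]: $\begin{bmatrix} 5 & 5 \\ -1 & 7 \end{bmatrix}$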

2015/03/29, 8:10 PMIII_06_Linear_transformations


This notebook is part of lecture 30 Linear transformation and their matrices in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper







Linear transformations

Mappings / transformations / projections

We'll begin the chapter with a familiar example: Projections

Let's look at a mapping / transformation / projection
$T: \mathbb{R}^2 \to \mathbb{R}^2$
No axes are required for this mapping

Another example using matrices

For these to be a linear mapping, the following must apply

Examples that are not linear transformations
Shifting the axes
The transformation that turns a vector into its length
Transformations involving powers or transcendental functions
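The two linearity conditions can be tested numerically on a few candidate maps. The sketch below is an illustration with made-up names (`T`, `shift`, `length`) and an arbitrary 2×3 matrix: the matrix map passes both conditions, while shifting and taking the length each fail one.

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[1.0, 1.0, 1.0], [3.0, -2.0, 1.0]])   # an arbitrary 2x3 matrix
v, w = rng.standard_normal(3), rng.standard_normal(3)
c = -2.0

def T(u):
    return A @ u                 # multiplication by a matrix

def shift(u):
    return u + 1.0               # shifting every component

def length(u):
    return np.linalg.norm(u)     # turning a vector into its length

T_additive = np.allclose(T(v + w), T(v) + T(w))
T_homogeneous = np.allclose(T(c * v), c * T(v))
shift_additive = np.allclose(shift(v + w), shift(v) + shift(w))
length_homogeneous = np.isclose(length(c * v), c * length(v))

print(T_additive, T_homogeneous, shift_additive, length_homogeneous)
```

A negative constant c is used on purpose: the length of cv is |c| times the length of v, so the second condition breaks as soon as c is negative.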

Example

Let's create an example which accomplishes the following
$T: \mathbb{R}^3 \to \mathbb{R}^2$

Without axes we could look at something like this

$$T(v) = Av$$

$$T(v + w) = T(v) + T(w)$$

$$T(cv) = cT(v)$$

$$f(x, y, z) = (x + y + z,\ 3x - 2y + z)$$



With axes we naturally turn to matrices
We require v in 3-space

To have a resultant vector in 2-space we require a matrix A of size 2×3

Notice if we know what a linear transformation does to a single vector, we know what it does to constant multiples of that vector (from the property T(cv) = cT(v))
In 2- or 3-space this would represent a line

If we knew what T does to two (linearly independent or basis) vectors, we know what it does to the subspace created by those two vectors (the whole plane in $\mathbb{R}^2$ or a plane in $\mathbb{R}^3$), i.e. linear combinations of these two vectors

Coordinates / basis

Coordinates originate from a basis

This need not be the standard basis, though

Let's look then at the following transformation: $T: \mathbb{R}^n \to \mathbb{R}^m$
We need a basis for the input ($v_1, v_2, \dots, v_n$ in $\mathbb{R}^n$) and the output ($w_1, w_2, \dots, w_m$ in $\mathbb{R}^m$)

So what is the rule for finding A that will transform v into w?
Rule for finding A, given bases for v and w

The first column of A: apply $T(v_1) = a_{11}w_1 + a_{21}w_2 + \dots + a_{m1}w_m$, with the first column entries of A being $a_{11}, a_{21}, \dots, a_{m1}$
The second column of A: apply $T(v_2) = a_{12}w_1 + a_{22}w_2 + \dots + a_{m2}w_m$, with the second column entries of A being $a_{12}, a_{22}, \dots, a_{m2}$
... and so on until $v_n$

Example

Let's look at polynomials with the following transformation

The input might be

The output will be

This gives us the following basis for each

We need the following

Following the rule above we will have to have the following
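The rule can be applied to the derivative example with a few lines of numpy. In this sketch the coefficient vector `c` is an arbitrary example polynomial, and `polyder` from `numpy.polynomial.polynomial` is used only as an independent check.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Columns are T(1) = 0, T(x) = 1, T(x^2) = 2x written in the output basis {1, x}
A = np.array([[0, 1, 0],
              [0, 0, 2]])

c = np.array([5, -4, 3])          # the polynomial 5 - 4x + 3x^2
derivative = A @ c                # coefficients of the derivative in the basis {1, x}

print(derivative)
print(P.polyder(c))
```

Each column of A records what the derivative does to one input basis vector, which is exactly the rule for finding A stated above.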

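$$\begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$$

$$T(v) = Av = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}_{2 \times 3} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}_{3 \times 1} = \begin{bmatrix} a_{11}v_1 + a_{12}v_2 + a_{13}v_3 \\ a_{21}v_1 + a_{22}v_2 + a_{23}v_3 \end{bmatrix}_{2 \times 1}$$

$$v = c_1v_1 + c_2v_2 + \dots + c_nv_n$$

$$v = \begin{bmatrix} 2 \\ 3 \\ -1 \end{bmatrix} = 2\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + 3\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} - \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$

$$T = \frac{d}{dx}$$

$$c_1 + c_2x + c_3x^2$$

$$c_2 + 2c_3x$$

$$1,\ x,\ x^2$$

$$1,\ x$$

$$A\begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix} = \begin{bmatrix} c_2 \\ 2c_3 \end{bmatrix}$$

$$A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}$$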



Example problems

Example problem 1

Consider the 2×2 matrix A and let $T(A) = A^T$
Why is T linear and what is $T^{-1}$?
Write the matrix of T in:

What are the eigenvalues and eigenvectors of T?

Solution

From the properties of linear transformation we have the following

A transpose turns a row into a column and vice versa, from which we infer the following

For the next question we will have to see what T does to each basis matrix

From this we have to form a matrix
Think of the columns each being $Tv_1$, $Tv_2$, $Tv_3$, and $Tv_4$
We see that transforming $v_1$ gives 1 times $v_1$, and none of the rest

For the $w_i$ basis we will have the following matrix

For the last question we will have the following
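The matrix of T in the first basis can be checked numerically. The sketch below assumes that a 2×2 matrix is stored by its coordinates $(a_{11}, a_{12}, a_{21}, a_{22})$ in the basis $v_1, \dots, v_4$; the example matrix `A` is arbitrary.

```python
import numpy as np

# Matrix of T(A) = A^T in the basis v1..v4 (a single 1 in position 11, 12, 21, 22)
M_T = np.array([[1, 0, 0, 0],
                [0, 0, 1, 0],
                [0, 1, 0, 0],
                [0, 0, 0, 1]])

A = np.array([[1, 2], [3, 4]])
coords = A.reshape(-1)                       # coordinates (a11, a12, a21, a22)
transposed = (M_T @ coords).reshape(2, 2)    # apply T in coordinates

# M_T is symmetric, so eigvalsh applies
eigenvalues = np.sort(np.linalg.eigvalsh(M_T))

print(transposed)
print(M_T @ M_T)
print(eigenvalues)
```

The eigenvalue 1 (three times) belongs to symmetric matrices, which the transpose leaves alone, and the eigenvalue -1 belongs to the antisymmetric direction, which changes sign.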

In [ ]:

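$$v_1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad v_3 = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \quad v_4 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$$

$$w_1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad w_2 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}, \quad w_3 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \quad w_4 = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$$

$$T(A + B) = (A + B)^T = A^T + B^T = T(A) + T(B)$$

$$T(cA) = (cA)^T = cT(A)$$

$$T^2 = I$$

$$\therefore T^{-1} = T$$

$$T(v_1) = v_1, \quad T(v_2) = v_3, \quad T(v_3) = v_2, \quad T(v_4) = v_4$$

$$M_T = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

$$M_T = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}$$

$$T(v) = \lambda v$$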

2015/03/29, 8:12 PMIII_07_Image_compression_Change_of_basis


This notebook is part of lecture 31 Change of basis and image compression in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper







Image compression and change of basis

Lossy image compression

Consider a $2^9 \times 2^9$ monochrome image
Every pixel in this 512×512 image can take a value of $0 \le x_i \le 255$ (this is 8-bit)
This makes x a vector in $\mathbb{R}^n$, with $n = 512^2$ (for color images this would be 3×n)

In [4]: # Just look at what 512 squared is
512 ** 2

This is a very large, unwieldy basis
Consider the standard basis

Consider now the better basis

Indeed, there are many options

Out[1]:

9 9

in 2

Out[4]: 262144

, , ⋯ ,

⎡

⎣

⎢⎢⎢⎢⎢⎢

100⋮0

⎤

⎦

⎥⎥⎥⎥⎥⎥

⎡

⎣

⎢⎢⎢⎢⎢⎢

010⋮0

⎤

⎦

⎥⎥⎥⎥⎥⎥

⎡

⎣

⎢⎢⎢⎢⎢⎢

000⋮1

⎤

⎦

⎥⎥⎥⎥⎥⎥

, , , ⋯

⎡

⎣

⎢⎢⎢⎢⎢⎢

111⋮1

⎤

⎦

⎥⎥⎥⎥⎥⎥

⎡

⎣

⎢⎢⎢⎢⎢⎢⎢

1⋮1

−1⋮

−1

⎤

⎦

⎥⎥⎥⎥⎥⎥⎥

⎡

⎣

⎢⎢⎢⎢⎢⎢

1−11

−1⋮

⎤

⎦

⎥⎥⎥⎥⎥⎥








JPEG uses an 8 × 8 Fourier basis. This means that an image is broken up into 8 × 8 pixel blocks (64 pixels each). See the lectures on the Fourier basis

$$ \begin{bmatrix} 1 \\ 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ W \\ W^2 \\ \vdots \\ W^{n-1} \end{bmatrix}, \cdots $$

This gives us a vector x in $\mathbb{R}^{64}$ (i.e. with 64 coefficients)

$$ x = \sum c_i v_i $$

Up until this point the compression is lossless. Now comes the compression (of which there are many kinds, such as thresholding).

Thresholding: get rid of coefficients that fall below a set threshold (now there are fewer coefficients).

Video is a sequence of images that are highly correlated (not big changes from one image to the next) and you can predict future changes from previous changes.

There are newer bases such as wavelets. Here is an example

$$ \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ -1 \\ -1 \\ -1 \\ -1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ -1 \\ -1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ -1 \\ -1 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \\ -1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \\ -1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 1 \\ -1 \end{bmatrix} $$

Every vector in $\mathbb{R}^8$ is a linear combination of these 8 basis vectors.

Let's do some linear algebra. Consider only a top row of 8 pixels. The standard vector of the values will be as follows (with $0 \le p_i \le 255$)

$$ P = \begin{bmatrix} p_1 \\ p_2 \\ p_3 \\ p_4 \\ p_5 \\ p_6 \\ p_7 \\ p_8 \end{bmatrix} $$

We have to write this as a linear combination of the wavelet basis vectors $w_i$ (the lossless step)

$$ P = c_1 w_1 + c_2 w_2 + \cdots + c_8 w_8 $$

In vector form we have the following

$$ P = \begin{bmatrix} \vdots & \cdots & \vdots \\ w_1 & \cdots & w_8 \\ \vdots & \cdots & \vdots \end{bmatrix} \begin{bmatrix} c_1 \\ \vdots \\ c_8 \end{bmatrix} $$

$$ P = Wc $$

$$ c = W^{-1} P $$

Let's bring some reality to this. For fast computation, W must be as easy to invert as possible. There is great competition to come up with better compression matrices. A good matrix must have the following

Be fast, e.g. the fast Fourier transform (FFT). The wavelet basis above is fast.

The basis vectors are orthogonal (and can be made orthonormal). If they are orthonormal then the inverse is equal to the transpose.

Good compression. If we threw away some of the $p_i$ values, we would just have a dark image. If we threw away, say, the last two $c_i$ values (last two basis vectors), that won't lose us so much quality.
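The change to the wavelet basis and the thresholding step can be sketched in a few lines of numpy. This is only an illustration under stated assumptions: the pixel row `p` and the `threshold` value are made up for the example and do not come from the lecture. Because the columns of W are orthogonal, $c = W^{-1}P$ reduces to dividing $W^T P$ by the squared column lengths.

```python
import numpy as np

# Columns are the eight wavelet basis vectors w1..w8 from the text
W = np.array([
    [1,  1,  1,  0,  1,  0,  0,  0],
    [1,  1,  1,  0, -1,  0,  0,  0],
    [1,  1, -1,  0,  0,  1,  0,  0],
    [1,  1, -1,  0,  0, -1,  0,  0],
    [1, -1,  0,  1,  0,  0,  1,  0],
    [1, -1,  0,  1,  0,  0, -1,  0],
    [1, -1,  0, -1,  0,  0,  0,  1],
    [1, -1,  0, -1,  0,  0,  0, -1],
], dtype=float)

# Hypothetical row of 8 pixel values (two nearly flat regions)
p = np.array([100, 102, 98, 101, 200, 203, 199, 201], dtype=float)

# Orthogonal columns: W^T W is diagonal, so inverting W is cheap
norms_sq = np.sum(W * W, axis=0)
c = (W.T @ p) / norms_sq          # lossless change of basis, c = W^{-1} p

# Lossy step: drop coefficients below an (assumed) threshold
threshold = 2.0
c_kept = np.where(np.abs(c) >= threshold, c, 0.0)
p_approx = W @ c_kept             # reconstruct from the kept coefficients

print(np.round(c, 3))
print(np.count_nonzero(c_kept), "coefficients kept")
print(np.round(p_approx, 2))
```

For this particular row only the first two coefficients survive, yet every reconstructed pixel stays within a few grey levels of the original, which is the behaviour the "good compression" requirement describes.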



Change of basis

Let's look at this change in basis. Above, we had the following

$$ x = Wc $$

Here the columns of W are the new basis vectors, so W takes the coefficient vector c (new basis) to the vector x (old basis), and $W^{-1}$ takes x to c.

Consider any transformation T (such as a rotation transformation). With respect to $v_1, \ldots, v_8$ it has a matrix A. With respect to $w_1, \ldots, w_8$ it has a matrix B.

It turns out that matrices A and B are similar

$$ B = M^{-1} A M $$

Here M is the matrix that transforms the basis.

What is A then, using the basis $v_1, \ldots, v_8$? We know T completely from $T(v_i)$...

... because if every $x = \sum c_i v_i$

... then $T(x) = \sum c_i T(v_i)$

Constructing A. Write down all the transformations

$$ T(v_1) = a_{11} v_1 + a_{21} v_2 + \cdots + a_{81} v_8 $$

$$ T(v_2) = a_{12} v_1 + a_{22} v_2 + \cdots + a_{82} v_8 $$

$$ \vdots $$

$$ T(v_8) = a_{18} v_1 + a_{28} v_2 + \cdots + a_{88} v_8 $$

Now we know A

$$ A = \begin{bmatrix} a_{11} & \cdots & a_{18} \\ \vdots & \cdots & \vdots \\ a_{81} & \cdots & a_{88} \end{bmatrix} $$

Let's consider the linear transformation $T(v_i) = \lambda_i v_i$ (the basis vectors are eigenvectors). This makes A the following

$$ A = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_8 \end{bmatrix} $$

Example problems

Example problem 1

The vector space of all polynomials in x (of degree ≤ 2) has the basis $1, x, x^2$. Consider a different basis $w_1, w_2, w_3$ whose values at x = -1, 0, and 1 are given by the following

$$ x = -1 \rightarrow 1 w_1 + 0 w_2 + 0 w_3 $$

$$ x = 0 \rightarrow 0 w_1 + 1 w_2 + 0 w_3 $$

$$ x = 1 \rightarrow 0 w_1 + 0 w_2 + 1 w_3 $$

That is, $w_1$ equals 1 at x = -1 and 0 at the other two points, and similarly for $w_2$ (at x = 0) and $w_3$ (at x = 1).

Express y(x) = -x + 5 in the new basis. Find the change of basis matrices. Find the matrix of taking derivatives in both of the bases.

Solution



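$$ y(x) = 5 - x $$

$$ y(x) = \alpha w_1 + \beta w_2 + \gamma w_3 $$

$$ y(-1) = 6, \quad y(0) = 5, \quad y(1) = 4 $$

$$ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \alpha \\ \beta \\ \gamma \end{bmatrix} = \begin{bmatrix} 6 \\ 5 \\ 4 \end{bmatrix} $$

$$ \alpha = 6, \quad \beta = 5, \quad \gamma = 4 $$

$$ y = 6 w_1 + 5 w_2 + 4 w_3 $$

For the second part let's look at the values that 1 (which is $x^0$), x, and $x^2$ take at the three points. At x = -1 we have 1, -1, and 1. At x = 0 we have 1, 0, and 0. At x = 1 we have 1, 1, and 1.

From this we can conclude the following

$$ 1 = w_1 + w_2 + w_3 $$

$$ x = -w_1 + w_3 $$

$$ x^2 = w_1 + w_3 $$

Now we have the following matrix

$$ A = \begin{bmatrix} 1 & -1 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix} $$

This converts the first basis to the second. To convert the second basis to the original we just need $A^{-1}$

In [5]: A = Matrix([[1, -1, 1], [1, 0, 0], [1, 1, 1]])
        A.inv()

Out[5]:

$$ \begin{bmatrix} 0 & 1 & 0 \\ -\frac{1}{2} & 0 & \frac{1}{2} \\ \frac{1}{2} & -1 & \frac{1}{2} \end{bmatrix} $$

Now for the derivative matrices. For the original basis, this is easy

$$ D_x = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{bmatrix} $$

For the second basis we need the following

$$ D_w = A D_x A^{-1} $$

In [6]: Dx = Matrix([[0, 1, 0], [0, 0, 2], [0, 0, 0]])
        Dw = A * Dx * A.inv()
        Dw

Out[6]:

$$ \begin{bmatrix} -\frac{3}{2} & 2 & -\frac{1}{2} \\ -\frac{1}{2} & 0 & \frac{1}{2} \\ \frac{1}{2} & -2 & \frac{3}{2} \end{bmatrix} $$

Just to conclude, we can write the $w_i$ from the inverse of A (the columns)

$$ w_1 = -\frac{1}{2} x + \frac{1}{2} x^2 $$

$$ w_2 = 1 - x^2 $$

$$ w_3 = \frac{1}{2} x + \frac{1}{2} x^2 $$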
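As a sanity check on the derivative matrix in the second basis, one can differentiate y(x) = 5 - x both ways and confirm that the answers agree. The sketch below assumes sympy; the variable names (`y_x`, `y_w`, and so on) are illustrative only.

```python
from sympy import Matrix

# Change of basis (coefficients in 1, x, x^2 -> values at x = -1, 0, 1)
A = Matrix([[1, -1, 1], [1, 0, 0], [1, 1, 1]])
# Derivative matrix in the basis 1, x, x^2
Dx = Matrix([[0, 1, 0], [0, 0, 2], [0, 0, 0]])
# Derivative matrix in the basis w1, w2, w3
Dw = A * Dx * A.inv()

y_x = Matrix([5, -1, 0])   # y = 5 - x in the basis 1, x, x^2
y_w = A * y_x              # the same polynomial in the w basis

dy_x = Dx * y_x            # derivative computed in the original basis
dy_w = Dw * y_w            # derivative computed in the w basis

print(y_w.T, dy_x.T, dy_w.T)
```

The derivative of 5 - x is the constant -1, which takes the value -1 at each of the three points, so its coordinates in the w basis are all -1; both routes give the same result.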


This notebook is part of Quiz review 3 in the OCW MIT course 18.06 by Prof Gilbert Strang [1]. Created by me, Dr Juan H Klopper





In [2]: from sympy import init_printing, Matrix, symbols, sqrt, Rational
        from numpy import matrix, transpose, sqrt, eye
        from numpy.linalg import pinv, inv, det, svd, norm, eig
        from scipy.linalg import pinv2
        from warnings import filterwarnings


Review of select topics

Quick facts

Eigenvalues. There are shortcuts that we can sometimes employ to calculate them.

Symmetric matrices have $A = A^T$. Their eigenvalues are always real. There are always enough eigenvectors and we can choose them to be orthogonal. They can be diagonalized and factorized as $Q \Lambda Q^T$.

Similar matrices are any square matrices that are related by $A = M^{-1} B M$. They have the same eigenvalues (not eigenvectors). As one grows / decays so does the other: $A^k = M^{-1} B^k M$.

Exercise problems

Differential equation matrix

Consider the following and solve for a general solution and solve for $e^{At}$

$$ \frac{du}{dt} = Au = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix} u $$

There are no initial conditions, so we need the general solution. It will be in the form

$$ u(t) = c_1 e^{\lambda_1 t} x_1 + c_2 e^{\lambda_2 t} x_2 + c_3 e^{\lambda_3 t} x_3 $$
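The general form above can be explored numerically before working it out by hand. The following is a rough sketch, assuming numpy; the function name `exp_At` is hypothetical. It computes the eigenvalues of A and builds $e^{At}$ from the eigenvector matrix, then evaluates it at the candidate period $t = \pi\sqrt{2}$.

```python
import numpy as np

A = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])

# Eigenvalues and eigenvector matrix S (complex for this skew-symmetric A)
eigenvalues, S = np.linalg.eig(A)

def exp_At(t):
    """e^{At} = S e^{Lambda t} S^{-1}; the tiny imaginary round-off is dropped."""
    return (S @ np.diag(np.exp(eigenvalues * t)) @ np.linalg.inv(S)).real

period = np.pi * np.sqrt(2)

print(np.round(eigenvalues, 6))        # purely imaginary: 0 and +-sqrt(2) i
print(np.round(exp_At(period), 6))     # identity: the solution has returned to u(0)
```

Since $e^{At}$ is the identity at that time, every solution $u(t) = e^{At} u(0)$ is back at its starting value, i.e. the motion is periodic and neither grows nor decays.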








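In [4]: A = Matrix([[0, -1, 0], [1, 0, -1], [0, 1, 0]])
        A, A.det()

Out[4]:

$$ \left( \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}, \ 0 \right) $$

It is clearly singular (dependent rows and columns).

In [5]: A.transpose()

Out[5]:

$$ \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 1 \\ 0 & -1 & 0 \end{bmatrix} $$

It is skew-symmetric ($A^T = -A$) and therefore its eigenvalues are purely imaginary (including 0).

In [6]: A.eigenvals()

Out[6]:

$$ \left\{ 0 : 1, \ -\sqrt{2} i : 1, \ \sqrt{2} i : 1 \right\} $$

The solution is thus as follows

$$ u(t) = c_1 x_1 + c_2 e^{\sqrt{2} i t} x_2 + c_3 e^{-\sqrt{2} i t} x_3 $$

The solution moves around the unit circle (doesn't grow / decay). It returns to the same value (it's periodic) after a certain time T

$$ \sqrt{2} i T = 2 \pi i; \quad e^{0} = 1 $$

$$ T = \frac{2 \pi}{\sqrt{2}} $$

$$ T = \pi \sqrt{2} $$

Finding u(t) allows the following

$$ u(t) = e^{At} u(0) $$

If A is diagonalizable ($A = S \Lambda S^{-1}$) then we have the following

$$ e^{At} = S e^{\Lambda t} S^{-1} $$

$$ e^{\Lambda t} = \begin{bmatrix} e^{\lambda_1 t} & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & e^{\lambda_n t} \end{bmatrix} $$

Orthogonal eigenvectors

Which matrices have orthogonal eigenvectors?

The following: symmetric matrices; matrices with $A^T = -A$ (skew-symmetric); orthogonal matrices. In general these are all matrices for which $A A^T = A^T A$.

Definitions

Given the following

$$ \lambda_1 = 0; \quad \lambda_2 = c; \quad \lambda_3 = 2 $$

$$ x_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad x_2 = \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}, \quad x_3 = \begin{bmatrix} 1 \\ 1 \\ -2 \end{bmatrix} $$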
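The matrix described by these eigenvalues and eigenvectors can be assembled explicitly as $X \Lambda X^{-1}$, with the eigenvectors as the columns of X. The sketch below assumes sympy and treats c as a real symbol; `half_A_is_projection` is a hypothetical helper used only to test a few values of c.

```python
from sympy import Matrix, diag, symbols, zeros, simplify

c = symbols('c', real=True)

# Columns are the eigenvectors x1, x2, x3
X = Matrix([[1, 1, 1],
            [1, -1, 1],
            [1, 0, -2]])
Lam = diag(0, c, 2)

# A = X Lambda X^{-1}
A = (X * Lam * X.inv()).applyfunc(simplify)

# The eigenvectors are orthogonal, so A comes out symmetric for every real c
is_symmetric = (A - A.T).applyfunc(simplify) == zeros(3, 3)

def half_A_is_projection(value):
    """Check P^2 = P for P = A/2 at a specific value of c."""
    P = A.subs(c, value) / 2
    return (P * P - P).applyfunc(simplify) == zeros(3, 3)

print(A)
print(is_symmetric, half_A_is_projection(0), half_A_is_projection(2))
```

Trying other values, such as c = 1, shows the projection test failing, because then ½A has an eigenvalue of ½.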



Is the matrix A diagonalizable and if so for which value(s) of c? We need enough eigenvectors and they should be independent. They are indeed. More so, they are orthogonal. So, for all c the matrix is diagonalizable.

Is A symmetric and if so, for which value(s) of c? The eigenvalues all have to be real values. Thus all real values for c.

Is A positive definite and if so, for which values of c? This is a sub-case of symmetric matrices. There are a lot of tests for positive definite matrices. One of the eigenvalues is zero, so it can, at best, be semi-definite, for c ≥ 0.

Is this a Markov matrix and if so, for which values of c? One of the eigenvalues must be 1 and the others must be smaller. So, no (one eigenvalue is 2).

Could ½A be a projection matrix? Projection matrices are symmetric and their eigenvalues must be real. The eigenvalues of any projection matrix must be 0 or 1

$$ P = \frac{1}{2} A $$

$$ P^2 = P $$

$$ \therefore \lambda^2 = \lambda $$

$$ \therefore \lambda = 0; \quad \lambda = 1 $$

Thus c = 0 or c = 2 will work (for ½A we will have ½λ).

Singular value decomposition

In SVD we have the following

$$ A = U \Sigma V^T $$

Where U and V are orthogonal matrices and Σ a diagonal matrix.

Every matrix has an SVD.

V is the eigenvector matrix for $A^T A$. Σ has along its main diagonal the square roots of the eigenvalues. U is similarly calculated as the eigenvector matrix of $A A^T$. There is always, though, a sign issue when choosing V and U

$$ A^T A = (V \Sigma^T U^T)(U \Sigma V^T) $$

$$ A^T A = V (\Sigma^T \Sigma) V^T $$

For whichever signs are chosen for V, this forces the signs for U, which can be checked against the following

$$ A v_i = \sigma_i u_i $$

$$ AV = U \Sigma $$

Σ can tell us a lot about A. All values must be ≥ 0. If it contains a 0 along the main diagonal, A is singular.

Symmetric AND orthogonal matrices (matrices that are both)

$$ A = A^T = A^{-1} $$

What can be said about the eigenvalues of these? Symmetric matrices have real eigenvalues and the eigenvalues of an orthogonal matrix must have length 1; |λ| = 1. Together this leaves only λ = 1 and λ = -1.

Is A sure to be positive definite? No, as λ can be -1.



Does it have repeated eigenvalues? Yes (the only possible eigenvalues are 1 and -1, so if n ≥ 3 some eigenvalues must be repeated).

Is it diagonalizable? Most definitely (it is symmetric).

Is it non-singular? Yes (no zero eigenvalues).

Prove that the following is a projection matrix

$$ \frac{1}{2} (A + I) $$

Squaring it should result in the same

$$ \frac{1}{4} (A^2 + 2A + I) $$

This raises the question, what is $A^2$? Well, if A equals its inverse, $A^2 = I$, so the square becomes ¼(2A + 2I) = ½(A + I). As an aside, the eigenvalues of A + I are the eigenvalues of A plus 1 (i.e. 0 or 2), so ½(A + I) has eigenvalues 0 or 1.

Example problems

Example problem 1

Find the eigenvalues and eigenvectors of the following

1: The projection matrix P

$$ P = \frac{a a^T}{a^T a}; \quad a = \begin{bmatrix} 3 \\ 4 \end{bmatrix} $$

2: The matrix Q

$$ Q = \begin{bmatrix} 0.6 & -0.8 \\ 0.8 & 0.6 \end{bmatrix} $$

3: The matrix R

$$ R = 2P - I $$

Solution

1:

In [7]: a = matrix([[3], [4]]) # Not using sympy
        P = (a * transpose(a)) / (transpose(a) * a)
        P

Out[7]: matrix([[ 0.36, 0.48], [ 0.48, 0.64]])

The eigenvalues of a projection matrix are either 0 or 1

In [8]: eig(P) # eig() gives the eigenvalues and eigenvector matrix

Out[8]: (array([ 0., 1.]), matrix([[-0.8, -0.6], [ 0.6, -0.8]]))

2:

In [9]: Q = matrix([[0.6, -0.8], [0.8, 0.6]])
        Q

Out[9]: matrix([[ 0.6, -0.8], [ 0.8, 0.6]])

Note that Q is a rotation (orthogonal) matrix, so its eigenvalues have absolute value 1

In [10]: eig(Q) # eigenvalues come in complex conjugate pairs

Out[10]: (array([ 0.6+0.8j, 0.6-0.8j]), matrix([[ 0.70710678+0.j , 0.70710678-0.j ], [ 0.00000000-0.70710678j, 0.00000000+0.70710678j]]))

3:

R will have the same eigenvectors as P, but (shifted) eigenvalues

In [11]: R = 2 * P - eye(2)
         R

Out[11]: matrix([[-0.28, 0.96], [ 0.96, 0.28]])

In [12]: eig(R)

Out[12]: (array([-1., 1.]), matrix([[-0.8, -0.6], [ 0.6, -0.8]]))

The eigenvalues of P were 0 and 1, so those of R are 2 × 0 - 1 = -1 and 2 × 1 - 1 = 1.
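The matrix R = 2P - I from this example is itself both symmetric and orthogonal, so it gives a concrete instance of the earlier claim that ½(A + I) is a projection. A small sketch, assuming numpy with plain arrays instead of the `matrix` class used in the notebook:

```python
import numpy as np

a = np.array([[3.0], [4.0]])
P = (a @ a.T) / (a.T @ a)        # projection onto the line through a
R = 2 * P - np.eye(2)            # reflection across that line

# R is symmetric and orthogonal, hence R equals its own inverse
symmetric = np.allclose(R, R.T)
own_inverse = np.allclose(R @ R, np.eye(2))

# Half of (R + I) is a projection, and here it is exactly P again
half = 0.5 * (R + np.eye(2))
is_projection = np.allclose(half @ half, half)

print(symmetric, own_inverse, is_projection)
print(np.linalg.eigvalsh(R))     # eigenvalues of R in ascending order
```

The eigenvalues of R are -1 and 1, and adding I then halving maps them to 0 and 1, the only eigenvalues a projection can have.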


This notebook is part of Final exam review in the OCW MIT course 18.06 by Prof Gilbert Strang [1]. Created by me, Dr Juan H Klopper





In [2]: from sympy import init_printing, Matrix, symbols, sqrt, Rational
        from sympy.solvers import solve
        from numpy import matrix, transpose, sqrt, eye
        from numpy.linalg import pinv, inv, det, svd, norm, eig
        from scipy.linalg import pinv2
        from warnings import filterwarnings


Final examination review

Previous examination questions

Question 1

If A is an m × n matrix of rank r and the following holds

No solution

$$ Ax = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} $$

One solution

$$ Ax = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} $$

How many rows in this matrix? m = 3.

What is the rank? If there are no solutions then r < m. If there is only a single solution then the nullspace has only the zero vector and so r = n.

How many columns? For one solution (as above) r = n and with m = 3 and r < m we have r = n < 3.








Write down a matrix that fits the description above

$$ A = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} $$

True or False for the above

The determinant of $A^T A$ is the same as the determinant of $A A^T$. False.

$A^T A$ is invertible. If r = n (independent columns of A) then TRUE.

$A A^T$ is positive definite. False (it is going to be 3 × 3, but still with only rank 2).

In [4]: A = Matrix([[0, 0], [1, 0], [0, 1]])
        A

Out[4]:

$$ \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix} $$

In [5]: (A.transpose() * A).inv()

Out[5]:

$$ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} $$

In [6]: (A.transpose() * A).det() == (A * A.transpose()).det()

Out[6]: False

In [7]: A * A.transpose()

Out[7]:

$$ \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$

Prove that $A^T y = c$ has at least one solution for every c and in fact infinitely many solutions for every c. It has at least one solution because the number of rows of $A^T$ (n) is equal to r. The dimension of the nullspace of $A^T$ is m - r, which in our example here would be > 0, thus infinitely many solutions.

Question 2

Suppose we have a matrix A with columns containing vectors $v_1$, $v_2$, and $v_3$.

Solve $Ax = v_1 - v_2 + v_3$. This is simple multiplication by columns

$$ x = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} $$

Suppose $v_1 - v_2 + v_3 = 0$. Is the solution unique or are there more?

Uniqueness means nothing in the nullspace except the zero vector, so in this case the solutions are not unique.

Suppose the columns are orthonormal (they would then be called $q_1$, $q_2$, $q_3$).

What combination of $v_1$ and $v_2$ is closest to $v_3$? Zero of each of $v_1$ and $v_2$.

Question 3

Consider the Markov matrix

$$ \begin{bmatrix} 0.2 & 0.4 & 0.3 \\ 0.4 & 0.2 & 0.3 \\ 0.4 & 0.4 & 0.4 \end{bmatrix} $$



Calculate the eigenvalues. The matrix is singular (note how ½ of column 1 plus ½ of column 2 equals column 3) so one eigenvalue will be zero. Another must be 1. The trace adds to 0.8 and so must the sum of the eigenvalues, thus the last eigenvalue is -0.2.

If for the following the u(0) vector is as indicated, what would the solution be after k steps?

$$ u_k = A^k u(0); \quad u(0) = \begin{bmatrix} 0 \\ 10 \\ 0 \end{bmatrix} $$

The following will hold

$$ u_k = A^k u(0) = c_1 \lambda_1^k x_1 + c_2 \lambda_2^k x_2 + c_3 \lambda_3^k x_3 $$

$$ u_k = A^k u(0) = 0 + c_2 (1)^k x_2 + c_3 (-0.2)^k x_3 $$

So at ∞ the only term that survives is $c_2 x_2$. Indeed, the key eigenvalue in any Markov matrix is 1.

Consider the eigenvector and calculate u at ∞. We already know that we have to use the λ = 1 eigenvalue. The distribution at ∞ will be as follows (see python code below)

$$ u(\infty) = \begin{bmatrix} 3 \\ 3 \\ 4 \end{bmatrix} $$

In [8]: A = Matrix([[0.2, 0.4, 0.3], [0.4, 0.2, 0.3], [0.4, 0.4, 0.4]])
        A

Out[8]:

$$ \begin{bmatrix} 0.2 & 0.4 & 0.3 \\ 0.4 & 0.2 & 0.3 \\ 0.4 & 0.4 & 0.4 \end{bmatrix} $$

In [9]: A.eigenvects() # Looking for the eigenvector of eigenvalue 1
        # Have to scale it so that the entries total 10 (there were 10 in total initially)

Out[9]:

$$ \left[ \left( -0.2, \ 1, \ \left[ \begin{bmatrix} -1.0 \\ 1.0 \\ 0 \end{bmatrix} \right] \right), \ \left( 0, \ 1, \ \left[ \begin{bmatrix} -0.5 \\ -0.5 \\ 1.0 \end{bmatrix} \right] \right), \ \left( 1.0, \ 1, \ \left[ \begin{bmatrix} 0.75 \\ 0.75 \\ 1.0 \end{bmatrix} \right] \right) \right] $$

The eigenvector for λ = 1 sums to 2.5, so multiplying it by 4 gives the total of 10.

Question 4

Calculate the projection onto the following line

$$ a = \begin{bmatrix} 4 \\ -3 \end{bmatrix} $$

The projection matrix is

$$ P = \frac{a a^T}{a^T a} $$

In [10]: a = matrix([[4], [-3]]) # Using numpy
         (a * transpose(a)) / (transpose(a) * a)

Out[10]: matrix([[ 0.64, -0.48], [-0.48, 0.36]])

Consider the matrix with eigenvalues 0 and 3 and the following eigenvectors

$$ 0, \begin{bmatrix} 1 \\ 2 \end{bmatrix}; \quad 3, \begin{bmatrix} 2 \\ 1 \end{bmatrix} $$

We use the following decomposition

$$ A = S \Lambda S^{-1} $$

In [11]: S = matrix([[1, 2], [2, 1]])
         L = matrix([[0, 0], [0, 3]])
         S_inv = inv(S)

In [12]: A = S * L * S_inv
         A

Out[12]: matrix([[ 4., -2.], [ 2., -1.]])



Give a 2 × 2 matrix A such that $A \ne B^T B$ for any B. $B^T B$ is always symmetric, so A can be any non-symmetric matrix.

Give a matrix that has orthogonal eigenvectors, but is not symmetric. Any skew-symmetric matrix (transpose = negative of matrix)

$$ \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} $$

Any orthogonal matrix

$$ \begin{bmatrix} \cos & -\sin \\ \sin & \cos \end{bmatrix} $$

Question 5

Consider the following system Ax = b, with the least squares solution shown, and calculate the projection of b onto the columnspace of A

$$ \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} c \\ d \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} \hat{c} \\ \hat{d} \end{bmatrix} = \begin{bmatrix} \frac{11}{3} \\ -1 \end{bmatrix} $$

The least squares solution is given, so simply multiply each entry by its column

$$ \frac{11}{3} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} - 1 \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix} $$

Calculate a different vector b such that all the least squares solutions are zero. This requires b to be orthogonal to those columns, such as the following

$$ \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} $$

Question 6 (from recitation)

Consider the 3 × 3 matrix A, with $\lambda_1 = 1$ and $\lambda_2 = 2$ and the first two pivots $d_1 = d_2 = 1$

$$ A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix} $$

Find $\lambda_3$ and $d_3$. The sum of the eigenvalues must equal the trace, thus $\lambda_3 = -1$. Constant multiples of a row subtracted from another won't change the determinant, leaving $d_1 \times d_2 \times d_3 = |A|$ (just watching out for singular matrices, which will have a zero on the main diagonal; here though we have three non-zero eigenvalues, so the matrix is non-singular), leaving $d_3 = -2$ (the product of the eigenvalues is also the determinant of A).

Calculate the smallest $a_{33}$ entry that will make A positive semi-definite. For positive semi-definite the eigenvalues must all be ≥ 0. The determinant must also be ≥ 0.

In [13]: a33 = symbols('a33')
         A = Matrix([[1, 0, 1], [0, 1, 1], [1, 1, a33]])
         A

Out[13]:

$$ \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & a_{33} \end{bmatrix} $$

In [14]: A.det() # Thus a33 must be greater than or equal to 2

Out[14]:

$$ a_{33} - 2 $$



Calculate the smallest value of c such that the following is still positive semi-definite

$$ A + cI $$

We can calculate the eigenvalues of A using sympy (see below, where the roots of det(A - cI) are the eigenvalues of A) or we can make use of the fact that adding a constant multiple of the identity matrix will only add that constant to each eigenvalue, leaving the eigenvectors intact

$$ 1 + c, \quad 2 + c, \quad -1 + c $$

Each must be ≥ 0, so the smallest value of c is 1.

In [15]: c = symbols('c')
         A = Matrix([[1, 0, 1], [0, 1, 1], [1, 1, 0]])
         (A - c * eye(3))

Out[15]:

$$ \begin{bmatrix} -1.0c + 1 & 0 & 1 \\ 0 & -1.0c + 1 & 1 \\ 1 & 1 & -1.0c \end{bmatrix} $$

In [16]: (A - c * eye(3)).det() # This is the characteristic polynomial of A;
         # its roots are the eigenvalues of A

Out[16]:

$$ -1.0 c^3 + 2.0 c^2 + 1.0 c - 2 $$

In [17]: f = -c ** 3 + 2 * c ** 2 + c - 2
         f

Out[17]:

$$ -c^3 + 2 c^2 + c - 2 $$

In [18]: solve(f, c) # solve the equation f = 0 for the variable c

Out[18]:

$$ [-1, \ 1, \ 2] $$

Consider now one of the starting vectors $u_0$ below and with $u_{k+1} = \frac{1}{2} A u_k$ calculate the limiting behavior of $u_k$ as k approaches ∞

$$ u_0 = \begin{bmatrix} 3 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 3 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 3 \end{bmatrix} $$

Notice that ½A is a Markov matrix. We cannot be sure in advance that there will be a steady state, as there are zero entries in ½A. Multiplying a matrix by a constant scalar will not change the eigenvectors, but will change the eigenvalues by the same scalar multiple, and we will have $\lambda_1 = \frac{1}{2}$, $\lambda_2 = 1$, and $\lambda_3 = -\frac{1}{2}$. We do have an eigenvalue of 1, and the other two are smaller than 1 in absolute value, so we will reach a steady state. The eigenvector of $\lambda_2 = 1$ is the following

$$ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} $$

This already sums to 3, so it will be $u_\infty$.
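Both parts of this question can be checked numerically. The sketch below assumes numpy; the helper `limit` and its step count of 200 are arbitrary choices for illustration. It reads the smallest admissible c off the most negative eigenvalue of A and then simply iterates $u_{k+1} = \frac{1}{2} A u_k$ from each starting vector.

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0]])

# A is symmetric, so eigvalsh returns real eigenvalues in ascending order
eigenvalues = np.linalg.eigvalsh(A)

# A + cI shifts every eigenvalue by c; the most negative one decides
smallest_c = -eigenvalues.min()

M = 0.5 * A   # columns sum to 1, i.e. a Markov matrix

def limit(u0, steps=200):
    """Apply u <- M u repeatedly (hypothetical helper, fixed step count)."""
    u = np.array(u0, dtype=float)
    for _ in range(steps):
        u = M @ u
    return u

starts = [[3, 0, 0], [0, 3, 0], [0, 0, 3]]
limits = [limit(u0) for u0 in starts]

print(np.round(eigenvalues, 6), smallest_c)
for u in limits:
    print(np.round(u, 6))
```

All three starting vectors settle on the same steady state, because the components along the eigenvectors with eigenvalues ½ and -½ shrink at every step.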


This notebook is part of lecture 32, Left-, right-, and pseudoinverses, in the OCW MIT course 18.06 by Prof Gilbert Strang [1]. Created by me, Dr Juan H Klopper





In [2]: from sympy import init_printing, Matrix, symbols, sqrt, Rational
        from numpy import matrix, transpose, sqrt
        from numpy.linalg import pinv, inv, det, svd, norm
        from scipy.linalg import pinv2
        from warnings import filterwarnings


Left- and right-sided inverses and pseudoinverses

The inverse

Recall the four fundamental subspaces. The rowspace (with x) and the nullspace are in $\mathbb{R}^n$. The columnspace (with Ax) and the nullspace of $A^T$ are in $\mathbb{R}^m$.

The two-sided inverse gives us the following

$$ A A^{-1} = I = A^{-1} A $$

For this we need r = m = n (i.e. full rank).

For a left-inverse we have the following. Full column rank, with r = n (but possibly more rows). The nullspace contains just the zero vector (columns are independent). The rows might not all be independent. We thus have either no or only a single solution to Ax = b. $A^T A$ will now also have full rank. From $(A^T A)^{-1} A^T A = I$ follows the fact that $(A^T A)^{-1} A^T$ is a left-sided inverse ($A^{-1}_{left}$). Note, though, that $(A^T A)^{-1} A^T$ is an n × m matrix and A is of size m × n, resulting in an n × n identity matrix. We cannot form $A A^{-1}_{left}$ and expect an identity matrix, though; it is m × m and will instead be a projection matrix (onto the columnspace).

For a right-inverse we have the following. Full row rank, with r = m < n. The nullspace of $A^T$ is the zero vector (rows are independent). Elimination will result in many solutions to Ax = b (n - m free variables). Now there will be an $A^{-1}$ to the right of A to give I: $A A^T (A A^T)^{-1} = I$. $A^{-1}_{right}$ is now $A^T (A A^T)^{-1}$. Putting the right-inverse on the left is also a projection (onto the rowspace).
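The two one-sided inverses are easy to try out on a small example. The following sketch assumes numpy; the 3 × 2 matrix `A` (full column rank) and its transpose `B` (full row rank) are arbitrary choices, not taken from the lecture.

```python
import numpy as np

# Tall matrix with independent columns (r = n = 2, m = 3)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])

left_inv = np.linalg.inv(A.T @ A) @ A.T     # (A^T A)^{-1} A^T, size n x m
P_col = A @ left_inv                        # m x m, projects onto the columnspace

# Wide matrix with independent rows (r = m = 2, n = 3)
B = A.T
right_inv = B.T @ np.linalg.inv(B @ B.T)    # B^T (B B^T)^{-1}, size n x m
P_row = right_inv @ B                       # n x n, projects onto the rowspace

print(np.round(left_inv @ A, 6))    # 2 x 2 identity
print(np.round(B @ right_inv, 6))   # 2 x 2 identity
print(np.round(P_col, 6))           # a projection, not the identity
```

Multiplying in the "wrong" order gives a square matrix of the larger size that satisfies $P^2 = P$ but has rank 2, so it cannot be the identity.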








The pseudoinverse

Consider a matrix where r is less than m and n. Remember that the rowspace has dimension r (it lies in $\mathbb{R}^n$) and the columnspace also has dimension r (it lies in $\mathbb{R}^m$). The nullspace of A has dimension n - r and the nullspace of $A^T$ has dimension m - r. The rowspace and columnspace have the same dimension and every vector x in the rowspace maps to exactly one vector Ax in the columnspace (one-to-one).

If y is another vector in the rowspace (not the same as x) then Ax ≠ Ay.

The pseudoinverse $A^+$, then, maps Ax (or Ay) from the columnspace back to x (or y) in the rowspace

$$ y = A^+ A y $$

Suppose Ax = Ay or A(x - y) = 0. Now (x - y) is in the nullspace and in the rowspace, i.e. it has to be the zero vector.

Finding the pseudoinverse $A^+$

One way is to start from the singular value decomposition

$$ A = U \Sigma V^T $$

Σ has along the main diagonal all the square roots of the eigenvalues and r pivots, but m rows and n columns, which can be more than r. $\Sigma^+$ will have 1 over the square roots of the eigenvalues along the main diagonal and then (possibly) zero values further along, but be of size n × m. $\Sigma \Sigma^+$ will have 1's along the main diagonal, and then 0's (if larger than r).

It will be of size m × m. It is a projection onto the columnspace.

$\Sigma^+ \Sigma$ will also have 1's along the main diagonal, but be of size n × n. It is a projection onto the rowspace.

$$ A^+ = V \Sigma^+ U^T $$

Let's see how easy this is in python™

In [4]: A = matrix([[3, 6], [2, 4]]) # Not sympy
        A, det(A) # The det is zero, so no inverse exists

Out[4]: (matrix([[3, 6], [2, 4]]), 0.0)

In [5]: # The numpy pinv() function uses SVD
        Aplus = pinv(A)
        Aplus

Out[5]: matrix([[ 0.04615385, 0.03076923], [ 0.09230769, 0.06153846]])

In [6]: # The scipy pinv2() function also uses SVD
        # The scipy pinv() function uses least squares to approximate
        # the pseudoinverse and as matrices get BIG, this
        # becomes computationally expensive
        Aplus_sp = pinv2(A)
        Aplus_sp

Out[6]: array([[ 0.04615385, 0.03076923], [ 0.09230769, 0.06153846]])
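The formula $A^+ = V \Sigma^+ U^T$ can also be carried out by hand on the same singular matrix. This is a minimal sketch, assuming numpy; the relative tolerance `tol` used to decide which singular values count as zero is an assumption of the example.

```python
import numpy as np

A = np.array([[3.0, 6.0],
              [2.0, 4.0]])   # rank 1, so no ordinary inverse

U, s, VT = np.linalg.svd(A)

# Sigma^+ is n x m with 1/sigma for each non-zero singular value
tol = 1e-12 * s.max()
Sigma_plus = np.zeros((A.shape[1], A.shape[0]))
for i, sigma in enumerate(s):
    if sigma > tol:
        Sigma_plus[i, i] = 1.0 / sigma

A_plus = VT.T @ Sigma_plus @ U.T

print(np.round(s, 6))
print(np.round(A_plus, 8))
```

The result agrees with the output of `pinv` shown above; for this rank-one matrix it is simply $A^T / 65$, since the single non-zero singular value is $\sqrt{65}$.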



Calculate the pseudoinverse of A = [1, 2]. Calculate $A A^+$. Calculate $A^+ A$. If x is in the nullspace of A, what is the effect of $A^+ A$ on x (i.e. $A^+ A x$)? If x is in the columnspace of $A^T$, what is $A^+ A x$?

Solution

In [7]: A = matrix([1, 2])
        A

Out[7]: matrix([[1, 2]])

Let's use singular value decomposition

In [8]: U, S, VT = svd(A)

In [9]: U

Out[9]: matrix([[-1.]])

In [10]: S

Out[10]: array([ 2.23606798])

In [11]: VT

Out[11]: matrix([[-0.4472136 , -0.89442719], [-0.89442719, 0.4472136 ]])

Remember,

$$ A^+ = V \Sigma^+ U^T $$

$\Sigma^+$ must be of size 2 × 1, though, and holds the reciprocal of the singular value

In [12]: S_plus = matrix([[1 / sqrt(5)], [0]])

In [13]: Aplus = transpose(VT) * S_plus * transpose(U)
         Aplus

Out[13]: matrix([[ 0.2], [ 0.4]])

This agrees with the pinv() function

In [16]: Aplus = pinv(A)
         Aplus

Out[16]: matrix([[ 0.2], [ 0.4]])

In [17]: A * Aplus

Out[17]: matrix([[ 1.]])

In [18]: Aplus * A

Out[18]: matrix([[ 0.2, 0.4], [ 0.4, 0.8]])

Let's create a vector in the nullspace of A. It will be any vector

$$ c \begin{bmatrix} -2 \\ 1 \end{bmatrix} $$

Let's choose the constant c = 1

In [19]: x_vect_null_A = matrix([[-2], [1]])
         Aplus * A * x_vect_null_A

Out[19]: matrix([[ 0.], [ 0.]])

This is no surprise, as $A^+ A$ projects a vector onto the rowspace of A. We chose x in the nullspace of A, so Ax must be 0 and $A^+ A x = 0$.

The columnspace of $A^T$ is any vector

$$ c \begin{bmatrix} 1 \\ 2 \end{bmatrix} $$

We'll choose c = 1 again

In [20]: x_vect_col_AT = matrix([[1], [2]])
         Aplus * A * x_vect_col_AT

Out[20]: matrix([[ 1.], [ 2.]])

We recover x again.

For fun, let's just check what $A^+$ is when A is invertible

In [21]: A = matrix([[1, 2], [3, 4]])

In [22]: pinv(A)

Out[22]: matrix([[-2. , 1. ], [ 1.5, -0.5]])

In [23]: inv(A)

Out[23]: matrix([[-2. , 1. ], [ 1.5, -0.5]])
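A compact way to confirm that a candidate really is the pseudoinverse is to test the four Moore-Penrose conditions, which are not stated in the notebook but characterise $A^+$ uniquely. A short sketch for A = [1, 2], assuming numpy:

```python
import numpy as np

A = np.array([[1.0, 2.0]])
A_plus = np.linalg.pinv(A)

# The four Moore-Penrose conditions
conditions = {
    "A A+ A = A": np.allclose(A @ A_plus @ A, A),
    "A+ A A+ = A+": np.allclose(A_plus @ A @ A_plus, A_plus),
    "A A+ symmetric": np.allclose((A @ A_plus).T, A @ A_plus),
    "A+ A symmetric": np.allclose((A_plus @ A).T, A_plus @ A),
}

for name, holds in conditions.items():
    print(name, holds)
```

The two symmetric products are exactly the projections met in the example: $A A^+$ onto the columnspace (here the 1 × 1 identity) and $A^+ A$ onto the rowspace.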
