2015/03/28, 12:22 PMLecture01_Geometric_view_of_linear_systems
Page 1 of 6http://localhost:8888/nbconvert/html/Dropbox/Python/Mathematic…ecture01_Geometric_view_of_linear_systems.ipynb?download=false
This notebook is part of lecture 1, The geometry of linear equations, in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper
Head of Acute Care Surgery
Groote Schuur Hospital
University of Cape Town
Email me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)
[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June 2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org
In this series of notebooks I will make use of a custom cascading style sheet
The file style.css must be in the same folder as the notebook file
The first block of code executes the stylesheet
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())
All the modules and functions will be imported here
In [2]: import numpy as np # Using namespace abbreviation to import numerical python
        from sympy import init_printing, symbols, Matrix, Eq # Importing only the
        # required functions in the sympy module
        import matplotlib.pyplot as plt # Using namespace abbreviation to import
        # the pyplot submodule of matplotlib
        import seaborn as sns # Using namespace abbreviation to import
        # the seaborn plotting library
        from IPython.display import Image
        from warnings import filterwarnings

        init_printing(use_latex = 'mathjax') # Used to print Latex to the screen
        %matplotlib inline
        filterwarnings('ignore') # Ignore those ugly pink warning boxes
In [3]: # Comments will be in this form
        # Comments are not executed
In [4]: x, y, z = symbols('x y z') # Creating symbolic mathematical variables as opposed to computer variables
        # The names x, y and z now hold these symbols, so they should not be reused as ordinary computer variable names
Geometrical view
System of linear equations
A set of equations in which each variable appears only to the power one and not inside a transcendental function
Example
$$\begin{aligned} 2x - y &= 0 \\ -x + 2y &= 3 \end{aligned}$$
This can be represented as an augmented matrix
In [5]: A_augm = Matrix([[2, -1, 0], [-1, 2, 3]]) # Note the placement of ()'s and []'s
        A_augm # A_augm is a computer variable that contains the matrix
Out[5]: $\begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & 3 \end{bmatrix}$
In [6]: # We can ask python what type of computer variable A_augm holds
        type(A_augm) # We see that it is a mutable dense matrix
Out[6]: sympy.matrices.dense.MutableDenseMatrix
The matrix of coefficients:
In [7]: A = Matrix([[2, -1], [-1, 2]])
        A
Out[7]: $\begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$
The variable vector:
In [8]: x_vect = Matrix([x, y])
        x_vect
Out[8]: $\begin{bmatrix} x \\ y \end{bmatrix}$
The solution vector:
In [9]: b_vect = Matrix([0, 3])
        b_vect
Out[9]: $\begin{bmatrix} 0 \\ 3 \end{bmatrix}$
In [10]: Eq(A * x_vect, b_vect) # From Ax = b
         # The Eq function takes the arguments left-hand side (LHS), right-hand side (RHS) of the equation
Out[10]: $\begin{bmatrix} 2x - y \\ -x + 2y \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \end{bmatrix}$
The row picture
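In the row picture each equation is a line in the xy-plane and the solution is the point the two lines share. As a minimal sketch (the names `line_1`, `line_2` and `intersection` are my own and not part of the original notebook), sympy's `solve` can find that point directly:

```python
from sympy import symbols, Eq, solve

x, y = symbols('x y')

# The two rows of the system, each describing a line in the xy-plane
line_1 = Eq(2 * x - y, 0)
line_2 = Eq(-x + 2 * y, 3)

# Solve both equations at once to find the point shared by the two lines
intersection = solve([line_1, line_2], [x, y])
print(intersection)
```

The point returned is where the two plotted lines cross.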
In [11]: # Don't be too concerned about the code for plotting
         # It does not form part of this series of notebooks

         x_vals = np.linspace(-3, 3, 100) # Create 100 values between -3 and 3
         # Note that we cannot use the computer variable x, because it has been reserved above as a mathematical variable in
         # the symbols function

         plt.figure(figsize = (10,8)) # Create a graph of size 10 by 8
         plt.plot(x_vals, 2 * x_vals) # Plot every single value created above against 2 times that value
         # Taken from the first equation which was y = 2x or f(x) = 2x
         # The plot takes the arguments (code between parentheses) of x, y
         plt.plot(x_vals, ((x_vals / 2) + (3 / 2))) # Also plot the second equation
         plt.show() # Draw the plot on screen
The column picture
In the column picture we look at the column vectors associated with the variables:
$$x \begin{bmatrix} 2 \\ -1 \end{bmatrix} + y \begin{bmatrix} -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \end{bmatrix}$$
It asks us to look at the linear combination of the columns
Performing this multiplication results in the same equation
In [12]: Eq(x * Matrix([2, -1]) + y * Matrix([-1, 2]), Matrix([0, 3]))
Out[12]: $\begin{bmatrix} 2x - y \\ -x + 2y \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \end{bmatrix}$
In [13]: from mpl_toolkits.mplot3d import proj3d

         fig = plt.figure(figsize = (10, 8))
         ax = fig.add_subplot(111, projection='3d')
         ax.plot([0, 2], [0, -1], zs=[0, 0])
         # The three sets of square brackets contain as first element the starting
         # point, i.e. 0, 0, 0 (as in x, y, z coordinates)
         # The second element in each square bracket represents the end-point, i.e. 2, -1, 0
         ax.plot([0, -1], [0, 2], zs=[0, 0])

         plt.show();
In [14]: # Method adding arrow heads (very complicated)
         from mpl_toolkits.mplot3d import Axes3D
         from itertools import product, combinations
         fig = plt.figure(figsize = (10, 8))
         ax = fig.gca(projection='3d')
         ax.set_aspect("equal")

         # draw a vector
         from matplotlib.patches import FancyArrowPatch
         from mpl_toolkits.mplot3d import proj3d

         class Arrow3D(FancyArrowPatch):
             def __init__(self, xs, ys, zs, *args, **kwargs):
                 FancyArrowPatch.__init__(self, (0,0), (0,0), *args, **kwargs)
                 self._verts3d = xs, ys, zs

             def draw(self, renderer):
                 xs3d, ys3d, zs3d = self._verts3d
                 xs, ys, zs = proj3d.proj_transform(xs3d, ys3d, zs3d, renderer.M)
                 self.set_positions((xs[0],ys[0]),(xs[1],ys[1]))
                 FancyArrowPatch.draw(self, renderer)

         a = Arrow3D([0, 2],[0, -1],[0, 0], mutation_scale=20, lw=1, arrowstyle="-|>", color="k")
         b = Arrow3D([0, -1],[0, 2],[0, 0], mutation_scale=20, lw=1, arrowstyle="-|>", color="k")
         ax.add_artist(a)
         ax.add_artist(b)
         plt.show()
The column view suggests that we need one times the first vector added to two times the second vector to get to the point (0,3)
Note that we are working in the xy-plane
Forgetting for now the solution (0,3), if we took all the possible values (on the real line) for x and y, we would fill the whole plane
Linear combinations of the two (column) vectors...
$$\begin{bmatrix} 2 \\ -1 \end{bmatrix}$$
...and...
$$\begin{bmatrix} -1 \\ 2 \end{bmatrix}$$
...fill $\mathbb{R}^2$
It's easy to see that these two vectors are not multiples of each other (they don't lie on the same line)
If this is so (they are linearly independent) and linear combinations of them fill the plane we say they span the plane ($\mathbb{R}^2$)
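The two claims above, that one times the first column plus two times the second reaches (0,3) and that the two columns are linearly independent, can be checked with a short sketch. The variable names are illustrative only and do not appear in the original notebook.

```python
from sympy import Matrix

col_1 = Matrix([2, -1])
col_2 = Matrix([-1, 2])
b = Matrix([0, 3])

# One times the first column plus two times the second column
combination = 1 * col_1 + 2 * col_2
print(combination == b)

# Place the columns side by side; a non-zero determinant (equivalently
# rank 2) means neither column is a multiple of the other
A = col_1.row_join(col_2)
print(A.det(), A.rank())
```

A rank of 2 is what allows the two columns to span the whole plane.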
It's also easy to imagine that the xy-plane is filled with (all the points are filled with) vectors, i.e. I can find any coordinate by drawing a vector to it
All these vectors together can be called a set
Let's call this set W; it equals $\mathbb{R}^2$
Later we will see that this vector space is a subspace of V = $\mathbb{R}^3$
We will also see that the vectors above span W, i.e. W = span(set of two vectors above)
It will also be shown that this set of two vectors is a basis of W (they are linearly independent and they span W)
$\mathbb{R}^2$ is of dimension two (2) as the whole space can be represented by a linear combination of just two vectors
The standard basis vectors for $\mathbb{R}^2$ are actually...
$$\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$
...which we commonly call
$$i, j$$
The 3-space picture
$$\begin{aligned} 3x + 2y - z &= 2 \\ x - 2y - z &= 3 \\ 2x + y - z &= 1 \end{aligned}$$
In [15]: A_augm = Matrix([[3, 2, -1, 2], [1, -2, -1, 3], [2, 1, -1, 1]])
         A_augm
Out[15]: $\begin{bmatrix} 3 & 2 & -1 & 2 \\ 1 & -2 & -1 & 3 \\ 2 & 1 & -1 & 1 \end{bmatrix}$
In [16]: A_augm.rref()
Out[16]: $\left( \begin{bmatrix} 1 & 0 & 0 & \frac{5}{2} \\ 0 & 1 & 0 & -\frac{3}{2} \\ 0 & 0 & 1 & \frac{5}{2} \end{bmatrix}, \; [0, 1, 2] \right)$
In [17]: Image(filename = '3d.png')
Out[17]: (image 3d.png, not reproduced here)
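The last column of the reduced row-echelon form is the solution of the three-plane system. A minimal sketch, assuming only sympy and using names of my own choosing, reads that column off and substitutes it back into the original equations:

```python
from sympy import Matrix, Rational

A_augm = Matrix([[3, 2, -1, 2], [1, -2, -1, 3], [2, 1, -1, 1]])

# Split the augmented matrix into the coefficients and the right-hand side
A = A_augm[:, :3]
b = A_augm[:, 3]

# rref() returns the reduced matrix together with the pivot column indices
rref_matrix, pivot_columns = A_augm.rref()
solution = rref_matrix[:, 3]
print(solution.T)

# Substituting the solution back must reproduce the right-hand side
print(A * solution == b)
```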
2015/03/28, 12:27 PMI_02_Overview
Page 1 of 4http://localhost:8888/nbconvert/html/Dropbox/Python/Mathematics…n_overview_of_linear_algebra/I_02_Overview.ipynb?download=false
This notebook is part of the additional lecture An overview of key ideas in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, Matrix, symbols, sqrt, Rational
        from numpy import matrix, transpose, sqrt
        from numpy.linalg import pinv, inv, det, svd, norm
        from scipy.linalg import pinv2
        from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
        filterwarnings('ignore')
An overview of key ideas
Moving from vectors to matrices
Consider a position vector in three-dimensional space
It can be written as a column-vector
$$u = \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}, \quad v = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}$$
We can add constant scalar multiples of these vectors
$$x_1 u + x_2 v = b$$
This is simple vector addition
It's easy to visualize that if we take all possible combinations, we start filling a plane through the origin
Adding a third vector that is not in this plane will extend all possible linear combinations to fill all of three-dimensional space
$$w = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$
We now have the following
$$x_1 u + x_2 v + x_3 w = b$$
$$x_1 \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} + x_3 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} x_1 \\ -x_1 + x_2 \\ -x_2 + x_3 \end{bmatrix} = x_1 u + x_2 v + x_3 w$$
Notice how this last equation can be written in matrix form Ax=b
$$\begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 - x_1 \\ x_3 - x_2 \end{bmatrix}$$
This is the column-view of matrix-vector multiplication as opposed to the row view
Matrices are seen as columns, each representing a vector
Each element of the column vector x is the scalar that multiplies the corresponding column in the matrix A
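The column view and the usual row-times-column product give the same vector. The following sketch, with variable names that are my own rather than the notebook's, builds both and compares them symbolically:

```python
from sympy import Matrix, symbols, zeros

x1, x2, x3 = symbols('x1 x2 x3')
A = Matrix([[1, 0, 0], [-1, 1, 0], [0, -1, 1]])
x_vect = Matrix([x1, x2, x3])

# Row view: each entry is a row of A dotted with x
row_view = A * x_vect

# Column view: x1, x2 and x3 scale the columns of A, which are then added
column_view = x1 * A.col(0) + x2 * A.col(1) + x3 * A.col(2)

# The difference expands to the zero vector
difference = (row_view - column_view).expand()
print(difference == zeros(3, 1))
```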
Now consider the solution vector b
$$\begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 - x_1 \\ x_3 - x_2 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$$
By substitution we now have the following
$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_1 + b_2 \\ b_1 + b_2 + b_3 \end{bmatrix}$$
This, though, looks like a matrix times b
$$\begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$$
This matrix is the inverse of A such that $x = A^{-1} b$
The above matrix A is called a difference matrix as it took simple differences between the elements of vector x
It was lower triangular
Its inverse became a sum matrix
So it was a good matrix, able to transform between x and b (back-and-forth) and therefore invertible: every x is mapped to exactly one b and every b comes from exactly one x
It transforms x into b (maps)
Let's look at the code for a new matrix C, in which the third column w above is replaced by a new vector
In [4]: x1, x2, x3, b1, b2, b3 = symbols('x1, x2, x3, b1, b2, b3') # Creating algebraic symbols
        # This reserves these symbols so as not to see them as computer variable names
In [5]: C = Matrix([[1, 0, -1], [-1, 1, 0], [0, -1, 1]]) # Creating a matrix and putting
        # it into a computer variable called C
        C # Displaying it to the screen
Out[5]: $\begin{bmatrix} 1 & 0 & -1 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}$
In [6]: x_vect = Matrix([[x1], [x2], [x3]]) # Giving this column vector a computer
        # variable name
        x_vect
Out[6]: $\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$
In [7]: C * x_vect
Out[7]: $\begin{bmatrix} x_1 - x_3 \\ -x_1 + x_2 \\ -x_2 + x_3 \end{bmatrix}$
We now have three equations
$$\begin{aligned} x_1 - x_3 &= b_1 \\ x_2 - x_1 &= b_2 \\ x_3 - x_2 &= b_3 \end{aligned}$$
Adding the left and right sides we get the following
$$0 = b_1 + b_2 + b_3$$
We are now constrained for values of b
The problem is clear to see geometrically as the new w is in the same plane as u and v
In essence w did not add anything
All combinations of u, v, and w will still be in the plane
The first matrix A above had three independent columns and their linear combinations could fill all of three-dimensional space
That made the first matrix A invertible as opposed to the second one (C), which is not invertible (i.e. it cannot take any vector in three-dimensional space back to x)
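The contrast between the two matrices can be confirmed numerically. This is only a sketch using sympy's `inv`, `det` and `nullspace` methods; the names `A_inverse` and `column_sum` are mine.

```python
from sympy import Matrix

A = Matrix([[1, 0, 0], [-1, 1, 0], [0, -1, 1]])   # difference matrix
C = Matrix([[1, 0, -1], [-1, 1, 0], [0, -1, 1]])  # third column replaced

# The inverse of the difference matrix is the sum matrix
A_inverse = A.inv()
print(A_inverse)

# A has a non-zero determinant, C does not
print(A.det(), C.det())

# The columns of C add up to the zero vector, so they are dependent
column_sum = C.col(0) + C.col(1) + C.col(2)
print(column_sum.T)

# A non-trivial nullspace is another way of seeing that C is singular
print(C.nullspace())
```

Because the columns of C sum to zero, x = (1, 1, 1) is sent to the zero vector, which is why no inverse can recover x from b.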
Let's look at the original column vectors in C
Remember the following dot product
$$a \cdot b = \|a\| \|b\| \cos\theta$$
In linear algebra getting the dot product of two vectors is written as follows
$$a \cdot b = b^T a$$
Which is the transpose of the second times the first
In [8]: u = Matrix([[1], [-1], [0]])
        v = Matrix([[0], [1], [-1]])
        w = Matrix([[-1], [0], [1]])
        u, v, w
Out[8]: $\left( \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \right)$
In [9]: v.transpose() * u
Out[9]: $\begin{bmatrix} -1 \end{bmatrix}$
In [10]: w.transpose() * u
Out[10]: $\begin{bmatrix} -1 \end{bmatrix}$
In [11]: w.transpose() * v
Out[11]: $\begin{bmatrix} -1 \end{bmatrix}$
In [12]: u.transpose() * v
Out[12]: $\begin{bmatrix} -1 \end{bmatrix}$
In [13]: u.transpose() * w
Out[13]: $\begin{bmatrix} -1 \end{bmatrix}$
In [14]: v.transpose() * w
Out[14]: $\begin{bmatrix} -1 \end{bmatrix}$
Every dot product is −1 and each vector has length √2, so cos θ = −1/2 and the angle between each pair is 2π/3 radians
The three angles add up to 2π (and u + v + w = 0) and therefore the vectors must all lie in a plane
Example problems
Example problem 1
Suppose A is a matrix with the following solution
$$Ax = \begin{bmatrix} 1 \\ 4 \\ 1 \\ 1 \end{bmatrix}, \quad x = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} + c \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix}$$
What can you say about the columns of A?
Solution
In [15]: c = symbols('c')
         x_vect = Matrix([[0], [1 + 2 * c], [1 + c]])
         b = Matrix([[1], [4], [1], [1]])
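x is of size m × n = 3 × 1
b is of size 4 × 1
Therefore A must be of size 4 × 3 and each column vector in A is in $\mathbb{R}^4$
Let's call these columns of A $C_1$, $C_2$, and $C_3$
$$A = \begin{bmatrix} \vdots & \vdots & \vdots \\ C_1 & C_2 & C_3 \\ \vdots & \vdots & \vdots \\ \vdots & \vdots & \vdots \end{bmatrix}$$
With the particular way in which x was written we can say that we have a particular solution and a special solution
$$A (x_p + c \cdot x_s) = b$$
For c = 0 we have:
$$A x_p = b$$
For c = 1 we have:
$$\begin{aligned} A x_p + A x_s &= b \\ \because A x_p &= b \\ b + A x_s &= b \\ \therefore A x_s &= 0 \end{aligned}$$
We also have the following
$$x_p = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \quad x_s = \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix}$$
For $x_p$ we have the following
$$\begin{bmatrix} \vdots & \vdots & \vdots \\ C_1 & C_2 & C_3 \\ \vdots & \vdots & \vdots \\ \vdots & \vdots & \vdots \end{bmatrix} \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = b \Rightarrow C_2 + C_3 = b$$
For $x_s$ we have the following
$$\begin{bmatrix} \vdots & \vdots & \vdots \\ C_1 & C_2 & C_3 \\ \vdots & \vdots & \vdots \\ \vdots & \vdots & \vdots \end{bmatrix} \begin{bmatrix} 0 \\ 2 \\ 1 \end{bmatrix} = \underline{0} \Rightarrow 2 C_2 + C_3 = \underline{0}$$
Solving for $C_2$ and $C_3$ we have the following
$$\begin{aligned} C_3 &= -2 C_2 \\ C_2 - 2 C_2 &= b \\ C_2 &= -b \\ C_3 &= 2b \end{aligned}$$
$$A = \begin{bmatrix} \vdots & -1 & 2 \\ C_1 & -4 & 8 \\ \vdots & -1 & 2 \\ \vdots & -1 & 2 \end{bmatrix}$$
As for the first column of A, we need to know more about ranks and subspaces
We see, though, that columns 2 and 3 are already constant multiples of each other
So, as long as column 1 is not a constant multiple of b, we are safe: the only vectors that A sends to zero are then the multiples of $x_s$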
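The result C2 = −b and C3 = 2b can be checked for every value of c. In the sketch below the first column `C1` is a purely hypothetical choice (the problem leaves it undetermined apart from not being a multiple of b); everything else follows the derivation above.

```python
from sympy import Matrix, symbols, zeros

c = symbols('c')
b = Matrix([1, 4, 1, 1])

C1 = Matrix([1, 0, 0, 0])  # hypothetical: any vector that is not a multiple of b
C2 = -b                    # from the derivation above
C3 = 2 * b                 # from the derivation above
A = Matrix.hstack(C1, C2, C3)

x_vect = Matrix([0, 1 + 2 * c, 1 + c])

# A * x - b should vanish for every value of c
residual = (A * x_vect - b).expand()
print(residual == zeros(4, 1))

# Two independent columns: C1 and one of the multiples of b
print(A.rank())
```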
2015/03/28, 12:34 PMI_03_Elimination
Page 1 of 8http://localhost:8888/nbconvert/html/Dropbox/Python/Mathematics/…rt_Strang/I_03_Elimination/I_03_Elimination.ipynb?download=false
This notebook is part of lecture 2, Elimination with matrices, in the OCW MIT course 18.06 [1]
Created by me, Dr Juan H Klopper
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, Matrix, symbols, eye, Rational
        from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
        filterwarnings('ignore')
Elimination
A system of linear equations
Linear refers to the fact that each variable appears on its own (i.e. to the power 1) and is not transcendental
A solution satisfies all of the equations at once
Consider the following linear set
$$\begin{aligned} 1x + 2y + 1z &= 2 \\ 3x + 8y + 1z &= 12 \\ 0x + 4y + 1z &= 2 \end{aligned}$$
A solution for x, y, and z could be as follows
$$\begin{aligned} 1(2) + 2(1) + 1(-2) &= 2 \\ 3(2) + 8(1) + 1(-2) &= 12 \\ 0(2) + 4(1) + 1(-2) &= 2 \end{aligned}$$
Since this is a set (of three) equations that have a solution (solutions) for the variables in common, all left- and all right-hand sides can be manipulated in certain ways
We could simply exchange the order of the equations (here equations 2 and 3 have been exchanged; row exchange)
$$\begin{aligned} 1x + 2y + 1z &= 2 \\ 0x + 4y + 1z &= 2 \\ 3x + 8y + 1z &= 12 \end{aligned}$$
We could multiply both the left- and right-hand side of one of the equations by a scalar (here I multiply the first equation by 2)
$$\begin{aligned} 2x + 4y + 2z &= 4 \\ 3x + 8y + 1z &= 12 \\ 0x + 4y + 1z &= 2 \end{aligned}$$
Lastly, we can subtract a constant multiple of one equation from another
This serves an excellent purpose, as I can eliminate one (or more) of the variables (give it a coefficient of 0)
Remember that we are trying to solve all three equations and have three unknowns
We can most definitely struggle through this problem algebraically by substitution, but linear algebra makes it much easier
Here I have multiplied the first equation by 3 (both sides, so that we maintain the integrity of the equation) and subtracted the left-hand side of this new equation from the left-hand side of equation 2 and its right-hand side from the right-hand side of equation 2
This is quite legitimate, as the left- and right-hand sides are equal (it is an equation after all) and so, when subtracting from equation 2, we are still doing the same thing to the left-hand side as to the right-hand side
$$\begin{aligned} 1x + 2y + 1z &= 2 \\ 0x + 2y - 2z &= 6 \\ 0x + 4y + 1z &= 2 \end{aligned}$$
This has introduced a nice zero for me in the second equation
Let's go further and multiply equation 2 by 2 and subtract that from equation 3
$$\begin{aligned} 1x + 2y + 1z &= 2 \\ 0x + 2y - 2z &= 6 \\ 0x + 0y + 5z &= -10 \end{aligned}$$
Now the last equation is easy to solve for z
$$z = -2$$
Knowing this I can go back up to equation 2 and solve for y
$$2y - 2(-2) = 6 \Rightarrow y = 1$$
Finally up to equation 1
$$x + 2(1) + 1(-2) = 2 \Rightarrow x = 2$$
We need not have gone straight for substitution; indeed, we could have tried to get zeros above all our leading (non-zero) coefficients
Let's just clean up equation three by multiplying throughout by ⅕
$$\begin{aligned} 1x + 2y + 1z &= 2 \\ 0x + 2y - 2z &= 6 \\ 0x + 0y + 1z &= -2 \end{aligned}$$
Now we have to get rid of the -2z in equation 2, which we can do by multiplying equation 3 by -2 and subtracting from equation 2
$$\begin{aligned} 1x + 2y + 1z &= 2 \\ 0x + 2y - 0z &= 2 \\ 0x + 0y + 1z &= -2 \end{aligned}$$
Multiplying equation 2 by ½ gives us the following
$$\begin{aligned} 1x + 2y + 1z &= 2 \\ 0x + 1y + 0z &= 1 \\ 0x + 0y + 1z &= -2 \end{aligned}$$
Now we can do the same to get rid of the 1z in equation 1 (multiplying equation 3 by 1 and subtracting from equation 1)
$$\begin{aligned} 1x + 2y + 0z &= 4 \\ 0x + 1y + 0z &= 1 \\ 0x + 0y + 1z &= -2 \end{aligned}$$
Now to get rid of the 2y in equation 1, which is above our leading 1y in equation 2
Simple enough, we multiply equation 2 by 2 and subtract that from equation 1
$$\begin{aligned} 1x + 0y + 0z &= 2 \\ 0x + 1y + 0z &= 1 \\ 0x + 0y + 1z &= -2 \end{aligned}$$
The solution is now clear for x, y, and z
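The same steps can be carried out on the coefficients alone. The sketch below is one possible way to do this with sympy and is not taken from the original notebook; each row of `rows` holds the coefficients of x, y, z followed by the right-hand side of one equation.

```python
from sympy import Matrix

# Coefficients of x, y, z and the right-hand side, one equation per row
rows = Matrix([[1, 2, 1, 2], [3, 8, 1, 12], [0, 4, 1, 2]])

# Subtract 3 times equation 1 from equation 2
rows[1, :] = rows[1, :] - 3 * rows[0, :]
# Subtract 2 times (the new) equation 2 from equation 3
rows[2, :] = rows[2, :] - 2 * rows[1, :]
print(rows)

# Back substitution, from the last equation upwards
z = rows[2, 3] / rows[2, 2]
y = (rows[1, 3] - rows[1, 2] * z) / rows[1, 1]
x = (rows[0, 3] - rows[0, 1] * y - rows[0, 2] * z) / rows[0, 0]
print(x, y, z)
```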
We need not rewrite all of the variables all the time
We can simply write the coefficients
$$\begin{bmatrix} 1 & 2 & 1 & 2 \\ 3 & 8 & 1 & 12 \\ 0 & 4 & 1 & 2 \end{bmatrix}$$
This is called the augmented matrix (right-hand side is added)
A matrix has rows and columns (attached in position to our algebraic equations above; we simply omit the variables)
The left-upper entry is called the pivot
Our aim is to get everything below it to be a zero (as we did with the algebra)
We do exactly the same as we did above, which is multiply row 1 by 3 and subtract these new values from row 2
$$\begin{bmatrix} 1 & 2 & 1 & 2 \\ 0 & 2 & -2 & 6 \\ 0 & 4 & 1 & 2 \end{bmatrix}$$
Now 2 times row 2 subtracted from row 3
$$\begin{bmatrix} 1 & 2 & 1 & 2 \\ 0 & 2 & -2 & 6 \\ 0 & 0 & 5 & -10 \end{bmatrix}$$
Multiply the last row by ⅕
$$\begin{bmatrix} 1 & 2 & 1 & 2 \\ 0 & 2 & -2 & 6 \\ 0 & 0 & 1 & -2 \end{bmatrix}$$
This shows 1z to equal -2
With this small matrix, it's easy to do back substitution as we did algebraically above
The first non-zero number in each row is the pivot (just like the upper-left entry)
The steps we have taken up to this point are called Gauss elimination and the form we end up with is row-echelon form
We could carry on and do the same sort of thing to get rid of all the non-zero entries above each pivot
This is called Gauss-Jordan elimination and the result is reduced row-echelon form (see the computer code below)
All of these steps are called elementary row operations
The only one we didn't do is row exchange
We reserve this so as not to have leading (in the pivot position) zeros
In [4]: A_augmented = Matrix([[1, 2, 1, 2], [3, 8, 1, 12], [0, 4, 1, 2]])
        A_augmented
Out[4]: $\begin{bmatrix} 1 & 2 & 1 & 2 \\ 3 & 8 & 1 & 12 \\ 0 & 4 & 1 & 2 \end{bmatrix}$
We can ask python to simply get the augmented matrix in reduced row-echelon form and read off the solutions
In [5]: A_augmented.rref() # The rref() method returns the reduced row-echelon form
Out[5]: $\left( \begin{bmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & -2 \end{bmatrix}, \; [0, 1, 2] \right)$
So row one reads as follows
$$1x + 0y + 0z = 2 \Rightarrow x = 2$$
Elimination matrices
Matrices can only be multiplied by each other if, in order, the number of columns of the first equals the number of rows of the second
The number of rows is usually called m and the number of columns n
So, our augmented matrix above will be m × n = 3 × 4
Let's look at how matrices are multiplied by looking at two small matrices
$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}$$
The subscripts refer to row and column position, i.e. 21 means row 2 column 1
We see that we have a 2 × 2 matrix times a 2 × 2 matrix
The inner two values are the same (2 and 2), so this multiplication is allowed
The resultant matrix will have the size equal to the outer two values (rows of the first and columns of the last); here also 2 and 2
So let's look at position 11 (row 1 and column 1)
To get this we take the entries in row 1 of the first matrix and multiply them by the entries in the first column of the second matrix
We do this element by element and add the products of each pair of elements to each other
The python code below shows you exactly how this is done
In [6]: a11, a12, a21, a22, b11, b12, b21, b22 = symbols('a11 a12 a21 a22 b11 b12 b21 b22')
In [7]: A = Matrix([[a11, a12], [a21, a22]])
        B = Matrix([[b11, b12], [b21, b22]])
        A, B
Out[7]: $\left( \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}, \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} \right)$
In [8]: A * B
Out[8]: $\begin{bmatrix} a_{11} b_{11} + a_{12} b_{21} & a_{11} b_{12} + a_{12} b_{22} \\ a_{21} b_{11} + a_{22} b_{21} & a_{21} b_{12} + a_{22} b_{22} \end{bmatrix}$
Let's constrain ourselves to the matrix of coefficients (this discards the right-hand side from the augmented matrix above)
In [9]: A = Matrix([[1, 2, 1], [3, 8, 1], [0, 4, 1]]) # I use the same computer variable above, which
        # will change its value in the computer memory
        A # A 3 by 3 matrix, which we call square
Out[9]: $\begin{bmatrix} 1 & 2 & 1 \\ 3 & 8 & 1 \\ 0 & 4 & 1 \end{bmatrix}$
The identity matrix is akin to the number 1, i.e. multiplying by it leaves everything unchanged
It has 1's along what is called the main diagonal and 0's everywhere else
In [10]: I = eye(3) # Identity matrices are always square and the argument
         # here is 3, so it is a 3 by 3 matrix
         I # Note what the main diagonal is
Out[10]: $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
Let's multiply this by A
In [11]: I * A # Nothing will change
Out[11]: $\begin{bmatrix} 1 & 2 & 1 \\ 3 & 8 & 1 \\ 0 & 4 & 1 \end{bmatrix}$
To get rid of the leading 3 in row 2 (because we want a zero under the pivot 1 in row 1), we multiplied row 1 by 3 and subtracted that from row 2
Interestingly enough we can do something to this identity matrix that, when multiplied by A, will result in the first step we have above
Since we are required to subtract 3 times the first row from row 2 (it's all about that 3 in row 2, column 1), we can do the following
In [12]: E21 = Matrix([[1, 0, 0], [-3, 1, 0], [0, 0, 1]])
         E21 # 21 because we are working on row 2, column 1
Out[12]: $\begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
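The pattern behind E21 generalises: start from the identity matrix and place the negative of the multiplier in the position that must become zero. The helper below, `elimination_matrix`, is a hypothetical function of my own (it is not part of sympy or of the original notebook) that sketches this idea with 1-based row and column positions.

```python
from sympy import Matrix, eye

def elimination_matrix(size, row, col, multiplier):
    """Identity matrix with -multiplier placed at (row, col), counted from 1.

    Multiplying on the left by this matrix subtracts `multiplier` times
    row `col` from row `row`.
    """
    E = eye(size)
    E[row - 1, col - 1] = -multiplier
    return E

A = Matrix([[1, 2, 1], [3, 8, 1], [0, 4, 1]])

# Subtract 3 times row 1 from row 2
E21 = elimination_matrix(3, 2, 1, 3)
print(E21)
print(E21 * A)
```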
That gives us the required 3 times row 1 and the negative shows that we subtract (add the negative)
It's a thing of beauty
In [13]: E21 * A
Out[13]: $\begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & -2 \\ 0 & 4 & 1 \end{bmatrix}$
Just what we wanted
E21 is called the first elimination matrix
Let's do something to the identity matrix to get rid of the 4 in row 3 column 2
It would require 2 times row 2 subtracted from row 3
Look carefully at the positions
In [14]: E32 = Matrix([[1, 0, 0], [0, 1, 0], [0, -2, 1]])
         E32
Out[14]: $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix}$
In [15]: E32 * (E21 * A)
Out[15]: $\begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & -2 \\ 0 & 0 & 5 \end{bmatrix}$
Spot on!
We now have nice pivots (leading non-zeros), with nothing under them
As a tip, try not to get fractions involved
As far as the other two row operations are concerned, we can either exchange rows in the identity matrix or multiply the required row by a scalar constant
Look at what happens when we multiply E32 and E21
In [16]: L_inv = E32 * E21
         L_inv
Out[16]: $\begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 6 & -2 & 1 \end{bmatrix}$
Later we'll call this matrix the inverse of L
It is in triangular form, in this case lower triangular (note all the zeros above the main diagonal)
In [17]: L_inv * A # Later we'll call this result the matrix U
Out[17]: $\begin{bmatrix} 1 & 2 & 1 \\ 0 & 2 & -2 \\ 0 & 0 & 5 \end{bmatrix}$
We now have the following
$$L^{-1} A = U$$
If we can get the inverse of the inverse of L we'll have the following
$$L L^{-1} A = L U$$
A square invertible matrix multiplied by its inverse gives the identity matrix
$$\begin{aligned} I A &= L U \\ A &= L U \end{aligned}$$
We can construct L from E32 and E21 above
$$\begin{aligned} E_{32} E_{21} A &= U \\ E_{21}^{-1} E_{32}^{-1} E_{32} E_{21} A &= E_{21}^{-1} E_{32}^{-1} U \\ \therefore E_{21}^{-1} E_{32}^{-1} &= L \end{aligned}$$
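The derivation can be checked by building U and L from the two elimination matrices and multiplying them back together. This is only a verification sketch; the matrices are the ones used above and the variable names `U` and `L` follow the text.

```python
from sympy import Matrix, eye

A = Matrix([[1, 2, 1], [3, 8, 1], [0, 4, 1]])
E21 = Matrix([[1, 0, 0], [-3, 1, 0], [0, 0, 1]])
E32 = Matrix([[1, 0, 0], [0, 1, 0], [0, -2, 1]])

# Upper triangular result of the two elimination steps
U = E32 * E21 * A

# L undoes the elimination steps, in reverse order
L = E21.inv() * E32.inv()
print(L)
print(U)

# L times U should give back the original matrix
print(L * U == A)
# L is indeed the inverse of E32 * E21
print(L * (E32 * E21) == eye(3))
```

Note that the multipliers 3 and 2 appear in L unchanged, which is what makes this ordering of the inverses convenient.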
In [18]: E21.inv() # The inverse is easy to understand in words
         # We just want to add 3 instead of subtracting 3
Out[18]: $\begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
In [19]: E32.inv()
Out[19]: $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix}$
In [20]: E21.inv() * E32.inv()
Out[20]: $\begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix}$
This is exactly the inverse of our inverse of L above
In [21]: L_inv.inv()
Out[21]: $\begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 0 & 2 & 1 \end{bmatrix}$
This is called the LU-decomposition of A
More about this two chapters from now (I_05_LU_decomposition)
As an aside we can also do elementary column operations, but then we have to multiply on the right of A and not on the left as above
Example problems
Example problem 1
Solve the following linear set (set of linear equations)
$$\begin{aligned} x - y - z + u &= 0 \\ 2x + 2z &= 8 \\ -y - 2z &= -8 \\ 3x - 3y - 2z + 4u &= 7 \end{aligned}$$
Solution
In [22]: A_augm = Matrix([[1, -1, -1, 1, 0], [2, 0, 2, 0, 8], [0, -1, -2, 0, -8], [3, -3, -2, 4, 7]])
         A_augm
Out[22]: $\begin{bmatrix} 1 & -1 & -1 & 1 & 0 \\ 2 & 0 & 2 & 0 & 8 \\ 0 & -1 & -2 & 0 & -8 \\ 3 & -3 & -2 & 4 & 7 \end{bmatrix}$
In [23]: A_augm.rref()
Out[23]: $\left( \begin{bmatrix} 1 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 & 2 \\ 0 & 0 & 1 & 0 & 3 \\ 0 & 0 & 0 & 1 & 4 \end{bmatrix}, \; [0, 1, 2, 3] \right)$
Whoa! That was easy!
Let's take it a notch down and do some elementary matrices
First off, we want the matrix of coefficients
In [24]: A = Matrix([[1, -1, -1, 1], [2, 0, 2, 0], [0, -1, -2, 0], [3, -3, -2, 4]])
         A
Out[24]: $\begin{bmatrix} 1 & -1 & -1 & 1 \\ 2 & 0 & 2 & 0 \\ 0 & -1 & -2 & 0 \\ 3 & -3 & -2 & 4 \end{bmatrix}$
Now we need to get rid of the 2 in position row 2, column 1
We start by numbering the elementary matrix by this position and modifying the identity matrix
In [25]: E21 = Matrix([[1, 0, 0, 0], [-2, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]])
         E21 * A
Out[25]: $\begin{bmatrix} 1 & -1 & -1 & 1 \\ 0 & 2 & 4 & -2 \\ 0 & -1 & -2 & 0 \\ 3 & -3 & -2 & 4 \end{bmatrix}$
Now for position row 3, column 2
We have to use row 2 to do this
If we used row 1, we would introduce a non-zero into position row 3, column 1
In [26]: E32 = Matrix([[1, 0, 0, 0], [0, 1, 0, 0], [0, Rational(1, 2), 1, 0], [0, 0, 0, 1]])
         E32 * (E21 * A)
Out[26]: $\begin{bmatrix} 1 & -1 & -1 & 1 \\ 0 & 2 & 4 & -2 \\ 0 & 0 & 0 & -1 \\ 3 & -3 & -2 & 4 \end{bmatrix}$
Now for the 3 in position row 4, column 1
In [27]: E41 = Matrix([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [-3, 0, 0, 1]])
         E41 * (E32 * E21 * A)
Out[27]: $\begin{bmatrix} 1 & -1 & -1 & 1 \\ 0 & 2 & 4 & -2 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 1 \end{bmatrix}$
Let's exchange rows 3 and 4
In [28]: Ee34 = Matrix([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]])
         Ee34 * E41 * E32 * E21 * A
Out[28]: $\begin{bmatrix} 1 & -1 & -1 & 1 \\ 0 & 2 & 4 & -2 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & -1 \end{bmatrix}$
Let's see where that leaves b; after all, what we do to the left, we must do to the right
$$E_{e34} \times E_{41} \times E_{32} \times E_{21} \, A x = E_{e34} \times E_{41} \times E_{32} \times E_{21} \, b$$
In [29]: b_vect = Matrix([[0], [8], [-8], [7]])
         b_vect
Out[29]: $\begin{bmatrix} 0 \\ 8 \\ -8 \\ 7 \end{bmatrix}$
In [30]: Ee34 * E41 * E32 * E21 * b_vect
Out[30]: $\begin{bmatrix} 0 \\ 8 \\ 7 \\ -4 \end{bmatrix}$
Let's print them next to each other on the screen
In [31]: Ee34 * E41 * E32 * E21 * A, Ee34 * E41 * E32 * E21 * b_vect
Out[31]: $\left( \begin{bmatrix} 1 & -1 & -1 & 1 \\ 0 & 2 & 4 & -2 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & -1 \end{bmatrix}, \begin{bmatrix} 0 \\ 8 \\ 7 \\ -4 \end{bmatrix} \right)$
So we can simply do back substitution
We note that -1u = -4 and thus u = 4
From here, we work our way back up
$$\begin{aligned} -1(u) = -4 \quad &\therefore u = 4 \\ 1(z) + 1(4) = 7 \quad &\therefore z = 3 \\ 2(y) + 4(3) - 2(4) = 8 \quad &\therefore y = 2 \\ 1(x) - 1(2) - 1(3) + 1(4) = 0 \quad &\therefore x = 1 \end{aligned}$$
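Back substitution follows the same recipe for any upper triangular system, so it can be written as a small loop. The function `back_substitute` below is a hypothetical helper of my own (not a sympy function) and assumes that every diagonal entry of U is non-zero.

```python
from sympy import Matrix, zeros

def back_substitute(U, c):
    """Solve U * x = c for an upper triangular U with non-zero diagonal entries."""
    n = U.rows
    x = zeros(n, 1)
    # Start with the last equation and work upwards
    for i in range(n - 1, -1, -1):
        # Contribution of the unknowns that have already been solved
        known = sum(U[i, j] * x[j] for j in range(i + 1, n))
        x[i] = (c[i] - known) / U[i, i]
    return x

# The upper triangular system reached after the four elementary matrices
U = Matrix([[1, -1, -1, 1], [0, 2, 4, -2], [0, 0, 1, 1], [0, 0, 0, -1]])
c = Matrix([0, 8, 7, -4])

solution = back_substitute(U, c)
print(solution.T)
```

The entries of `solution` are in the order x, y, z, u.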
2015/03/28, 12:35 PMI_04_Matrix_multiplication_Inverses
Page 1 of 4http://localhost:8888/nbconvert/html/Dropbox/Python/Mathematics/…nspose/I_04_Matrix_multiplication_Inverses.ipynb?download=false
This notebook is part of lecture 3, Multiplication and inverse matrices, in the OCW MIT course 18.06 [1]
Created by me, Dr Juan H Klopper
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, Matrix, symbols, eye, Rational
        from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
        filterwarnings('ignore')
Matrix multiplication, inverse and transpose
Multiplying matrices
Method 1
Consider multiplying matrices A and B to give C
We have already seen that the column size of the first must equal the row size of the second: $n_A$ must equal $m_B$
C will then be of size $m_A \times n_B$
Every position $c_{ij}$, with i as the row position and j as the column position, is calculated by taking the dot product (i.e. each element times its corresponding element, all added), $c_{ij}$ = (row i of A) ⋅ (column j of B)
Here we calculate the row 2, column 1 position in C by the dot product of row 2 in A and column 1 in B
Notice how this multiplication is only possible because the number of columns of A equals the number of rows of B
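As a rough check of method 1, the sketch below computes a single entry as a dot product and compares it with sympy's own product. Only row 2 of A and column 1 of B come from the example in this notebook; every other entry, and the helper name `entry`, is made up for illustration.

```python
from sympy import Matrix

# Hypothetical 4x3 and 3x2 matrices: row 2 of A and column 1 of B match the
# example in the notes, all other entries are invented
A = Matrix([[1, 0, 2],
            [3, 2, -1],
            [0, 1, 1],
            [2, 2, 0]])
B = Matrix([[1, 4],
            [2, 0],
            [1, 5]])


def entry(A, B, i, j):
    """Dot product of row i of A with column j of B (0-based indices)."""
    return sum(A[i, k] * B[k, j] for k in range(A.cols))


c21 = entry(A, B, 1, 0)  # row 2, column 1 in the 1-based notation of the notes
C = A * B                # sympy's matrix product, size 4x2
```

Here `c21` is (3 × 1) + (2 × 2) + (−1 × 1), the same sum written out in the notes.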
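$$A_{m_A \times n_A} \cdot B_{m_B \times n_B} = C_{m_A \times n_B}$$

$$\begin{bmatrix} \cdots & \cdots & \cdots \\ 3 & 2 & -1 \\ \cdots & \cdots & \cdots \\ \cdots & \cdots & \cdots \end{bmatrix}_{4 \times 3} \begin{bmatrix} 1 & \vdots \\ 2 & \vdots \\ 1 & \vdots \end{bmatrix}_{3 \times 2} = \begin{bmatrix} c_{11} & c_{12} \\ (3 \times 1) + (2 \times 2) + (-1 \times 1) & c_{22} \\ c_{31} & c_{32} \\ c_{41} & c_{42} \end{bmatrix}_{4 \times 2}$$

$$\begin{bmatrix} \cdots & \cdots & \cdots \\ a_{21} & a_{22} & a_{23} \\ \cdots & \cdots & \cdots \\ \cdots & \cdots & \cdots \end{bmatrix}_{4 \times 3} \begin{bmatrix} b_{11} & \vdots \\ b_{21} & \vdots \\ b_{31} & \vdots \end{bmatrix}_{3 \times 2} = \begin{bmatrix} c_{11} & c_{12} \\ (a_{21} b_{11}) + (a_{22} b_{21}) + (a_{23} b_{31}) & c_{22} \\ c_{31} & c_{32} \\ c_{41} & c_{42} \end{bmatrix}_{4 \times 2}$$

$$c_{21} = \sum_{k=1}^{n} a_{2k} b_{k1}$$

Method 2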
In this method we note that each column in C is the result of the matrix A times the corresponding column in B
This is akin to a matrix multiplied by a vector, Ax = b
We see B as made up of column vectors
The columns of C are thus combinations of the columns of A
The numbers in the corresponding column of B are the weights of this combination
Method 3
Here every row in A produces the same-numbered row in C by multiplying it with the matrix B
The rows of C are linear combinations of the rows of B
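Methods 2 and 3 can be tried out directly. The sketch below uses two small matrices with made-up entries and rebuilds the product once column by column and once row by row.

```python
from sympy import Matrix

A = Matrix([[1, 2], [3, 4], [5, 6]])  # 3x2, hypothetical values
B = Matrix([[1, 0, 2], [-1, 3, 1]])   # 2x3, hypothetical values
C = A * B

# Method 2: column j of C is A times column j of B
C_by_columns = Matrix.hstack(*[A * B.col(j) for j in range(B.cols)])

# Method 3: row i of C is row i of A times B
C_by_rows = Matrix.vstack(*[A.row(i) * B for i in range(A.rows)])

# Column 1 of C written as a combination of the columns of A,
# weighted by the entries of column 1 of B
first_column = B[0, 0] * A.col(0) + B[1, 0] * A.col(1)
```

Both reconstructions should equal `C`, and `first_column` should equal `C.col(0)`.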
Method 4
In method 1 we looked at row × column producing a single number in C
What if we did column × row?
A column of A is of size m × 1 and a row of B is of size 1 × n
This results in a matrix of size m × n
Let's look at a simple example using python (with sympy)

In [4]: A = Matrix([[2], [3], [4]])
        B = Matrix([[1, 6]])
        A, B

In [5]: C = A * B
        C

Notice how the columns of C are multiples of the column of A
The rows of C are multiples of the row of B
So in method 4, C is the sum of the (columns of A) × (rows of B)
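A short sketch of method 4 for matrices with more than one column and row: the product is accumulated as a sum of column × row terms. The first column of A and the first row of B are the ones used in the cells above; the second column and second row are made up so that the sum has two terms.

```python
from sympy import Matrix, zeros

A = Matrix([[2, 7], [3, 8], [4, 9]])  # second column is hypothetical
B = Matrix([[1, 6], [0, 5]])          # second row is hypothetical

# Sum over k of (column k of A) times (row k of B)
outer_sum = zeros(A.rows, B.cols)
for k in range(A.cols):
    outer_sum += A.col(k) * B.row(k)

first_term = A.col(0) * B.row(0)  # the single column x row product from the cells above
```

`outer_sum` should agree with `A * B`, and `first_term` reproduces the 3 × 2 matrix C computed above.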
Block multiplication
Combining the above we can do the following
Both A and B can be broken into blocks of sizes that allow for multiplication
Here is an example of two square matrices
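Block multiplication can be checked on any pair of matrices that split into compatible blocks. The sketch below assumes two 4 × 4 matrices with arbitrary entries, each cut into four 2 × 2 blocks; the block names follow the notation used in this notebook.

```python
from sympy import Matrix

# Two hypothetical 4x4 matrices
A = Matrix(4, 4, list(range(1, 17)))
B = Matrix(4, 4, list(range(16, 0, -1)))

# Cut each matrix into four 2x2 blocks
A1, A2, A3, A4 = A[:2, :2], A[:2, 2:], A[2:, :2], A[2:, 2:]
B1, B2, B3, B4 = B[:2, :2], B[:2, 2:], B[2:, :2], B[2:, 2:]

# Multiply block by block, exactly as if the blocks were numbers
top = (A1 * B1 + A2 * B3).row_join(A1 * B2 + A2 * B4)
bottom = (A3 * B1 + A4 * B3).row_join(A3 * B2 + A4 * B4)
C_blocks = top.col_join(bottom)
```

`C_blocks` should be identical to the ordinary product `A * B`.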
Inverses
If the inverse of a matrix A exists then $A^{-1}A = I$, the identity matrix
Above is a left inverse, but what about a right inverse, $AA^{-1}$?
This is also equal to the identity for invertible square matrices
Invertible matrices are also called non-singular matrices
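Out[4]: $$\left( \begin{bmatrix} 2 \\ 3 \\ 4 \end{bmatrix}, \quad \begin{bmatrix} 1 & 6 \end{bmatrix} \right)$$

Out[5]: $$\begin{bmatrix} 2 & 12 \\ 3 & 18 \\ 4 & 24 \end{bmatrix}$$

Method 4 (columns × rows):

$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \\ a_{31} & a_{32} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11} \\ a_{21} \\ a_{31} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \end{bmatrix} + \begin{bmatrix} a_{12} \\ a_{22} \\ a_{32} \end{bmatrix} \begin{bmatrix} b_{21} & b_{22} \end{bmatrix}$$

Block multiplication:

$$\begin{bmatrix} A_1 & A_2 \\ A_3 & A_4 \end{bmatrix} \begin{bmatrix} B_1 & B_2 \\ B_3 & B_4 \end{bmatrix} = \begin{bmatrix} A_1 B_1 + A_2 B_3 & A_1 B_2 + A_2 B_4 \\ A_3 B_1 + A_4 B_3 & A_3 B_2 + A_4 B_4 \end{bmatrix}$$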
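Non-invertible matrices are also called singular matrices
An example would look like this

$$\begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix}$$

Note how the elements in row 2 are just two times the elements in row 1 (a linear combination)
The same goes for the columns, the second being the first with each element multiplied by 3
More profoundly, note that you could find a non-zero column vector x such that Ax = 0

$$\begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix} \begin{bmatrix} 3 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$$

This says 3 times column 1 in A plus −1 times column 2 gives nothing

Let's construct an example

$$\begin{bmatrix} 1 & 3 \\ 2 & 7 \end{bmatrix} \begin{bmatrix} a & c \\ b & d \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$

In essence we have to solve two systems
A × column j of $A^{-1}$ = column j of I
This is the Gauss-Jordan idea of solving two systems at once

$$\begin{bmatrix} 1 & 3 \\ 2 & 7 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$

$$\begin{bmatrix} 1 & 3 \\ 2 & 7 \end{bmatrix} \begin{bmatrix} c \\ d \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$$

This gives us the two columns of $A^{-1}$
We now create the augmented matrix

$$\left[ \begin{array}{cc|cc} 1 & 3 & 1 & 0 \\ 2 & 7 & 0 & 1 \end{array} \right]$$

Now we use elementary row operations to reach reduced row-echelon form (leading 1's in the pivot positions, with 0's below and above each)

$$\left[ \begin{array}{cc|cc} 1 & 3 & 1 & 0 \\ 2 & 7 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{cc|cc} 1 & 3 & 1 & 0 \\ 0 & 1 & -2 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{cc|cc} 1 & 0 & 7 & -3 \\ 0 & 1 & -2 & 1 \end{array} \right]$$

We now read off the two columns of $A^{-1}$

$$\begin{bmatrix} 7 & -3 \\ -2 & 1 \end{bmatrix}$$

To do all of the elimination, we created a lot of elimination (elementary) matrices
If we combine all of them into E we have $E[A \; I] = [I \; A^{-1}]$, because EA = I tells us $E = A^{-1}$

Example problems

Example problem 1

Find the conditions on a and b that make the matrix A invertible and find $A^{-1}$

$$A = \begin{bmatrix} a & b & b \\ a & a & b \\ a & a & a \end{bmatrix}$$

Solution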
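A matrix is singular (non-invertible) if we have a row or column of zeros, so a ≠ 0
We can also not have identical columns, so a ≠ b
Using Gauss-Jordan elimination we will have the following

$$\left[ \begin{array}{ccc|ccc} a & b & b & 1 & 0 & 0 \\ a & a & b & 0 & 1 & 0 \\ a & a & a & 0 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{ccc|ccc} a & b & b & 1 & 0 & 0 \\ 0 & a-b & 0 & -1 & 1 & 0 \\ 0 & a-b & a-b & -1 & 0 & 1 \end{array} \right] \rightarrow \left[ \begin{array}{ccc|ccc} a & b & b & 1 & 0 & 0 \\ 0 & a-b & 0 & -1 & 1 & 0 \\ 0 & 0 & a-b & 0 & -1 & 1 \end{array} \right]$$

$$\rightarrow \left[ \begin{array}{ccc|ccc} a & b & b & 1 & 0 & 0 \\ 0 & \frac{a-b}{a-b} & 0 & \frac{-1}{a-b} & \frac{1}{a-b} & 0 \\ 0 & 0 & \frac{a-b}{a-b} & 0 & \frac{-1}{a-b} & \frac{1}{a-b} \end{array} \right] \rightarrow \left[ \begin{array}{ccc|ccc} a & b & b & 1 & 0 & 0 \\ 0 & 1 & 0 & \frac{-1}{a-b} & \frac{1}{a-b} & 0 \\ 0 & 0 & 1 & 0 & \frac{-1}{a-b} & \frac{1}{a-b} \end{array} \right]$$

$$\rightarrow \left[ \begin{array}{ccc|ccc} a & b & 0 & 1 & \frac{1}{a-b}(b) & -\frac{1}{a-b}(b) \\ 0 & 1 & 0 & \frac{-1}{a-b} & \frac{1}{a-b} & 0 \\ 0 & 0 & 1 & 0 & \frac{-1}{a-b} & \frac{1}{a-b} \end{array} \right] \rightarrow \left[ \begin{array}{ccc|ccc} a & 0 & 0 & 1 + \frac{b}{a-b} & 0 & -\frac{1}{a-b}(b) \\ 0 & 1 & 0 & \frac{-1}{a-b} & \frac{1}{a-b} & 0 \\ 0 & 0 & 1 & 0 & \frac{-1}{a-b} & \frac{1}{a-b} \end{array} \right]$$

$$\rightarrow \left[ \begin{array}{ccc|ccc} 1 & 0 & 0 & \frac{1}{a-b} & 0 & -\frac{1}{a(a-b)}(b) \\ 0 & 1 & 0 & \frac{-1}{a-b} & \frac{1}{a-b} & 0 \\ 0 & 0 & 1 & 0 & \frac{-1}{a-b} & \frac{1}{a-b} \end{array} \right]$$

Additionally then we note that for the inverse of A to exist a − b ≠ 0, which is the same as a ≠ b, and again a ≠ 0

$$A^{-1} = \frac{1}{a-b} \begin{bmatrix} 1 & 0 & -\frac{b}{a} \\ -1 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}$$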
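Both Gauss-Jordan calculations in this notebook can be checked with sympy. The sketch below row-reduces the augmented matrix of the 2 × 2 example and then verifies the symbolic inverse found in example problem 1; the variable names are chosen only for this sketch.

```python
from sympy import Matrix, eye, symbols, simplify, factor

# Gauss-Jordan on the 2x2 example: reduce [M | I] and read off the right half
M = Matrix([[1, 3], [2, 7]])
reduced, pivots = M.row_join(eye(2)).rref()
M_inv = reduced[:, 2:]

# Example problem 1 with symbolic entries
a, b = symbols('a b')
A = Matrix([[a, b, b], [a, a, b], [a, a, a]])

# The determinant vanishes exactly when a = 0 or a = b
determinant = factor(A.det())

# Inverse as derived by hand above
A_inv = (1 / (a - b)) * Matrix([[1, 0, -b / a],
                                [-1, 1, 0],
                                [0, -1, 1]])
check = (A * A_inv).applyfunc(simplify)  # should simplify to the identity
```

The factored determinant makes the two conditions visible at once, since it is zero only for a = 0 or a = b.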
2015/03/28, 12:39 PMChapter05_LU_decomposition_of_A
Page 1 of 5http://localhost:8888/nbconvert/html/Dropbox/Python/Mathematic…position/Chapter05_LU_decomposition_of_A.ipynb?download=false
This notebook is part of lecture 4, Factorization into LU, in the OCW MIT course 18.06 [1]
Created by me, Dr Juan H Klopper
Head of Acute Care Surgery
Groote Schuur Hospital
University Cape Town
Email me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())

In [2]: import numpy as np
        from sympy import *
        import matplotlib.pyplot as plt
        import seaborn as sns
        from IPython.display import Image
        from warnings import filterwarnings
        init_printing(use_latex = 'mathjax')
        %matplotlib inline
        filterwarnings('ignore')
LU decomposition of a matrix A
We will decompose the matrix A into an upper and a lower triangular matrix, such that multiplying these will result back in A

$$A = LU$$

Turning the matrix of coefficients into Upper triangular form

Consider the following matrix of coefficients

$$\begin{bmatrix} 1 & -2 & 1 \\ 3 & 2 & -2 \\ 6 & -1 & -1 \end{bmatrix}$$

Successive elementary row operations follow
These are nothing other than matrix multiplications by elementary matrices
An elementary matrix is an identity matrix on which one elementary row operation was performed

In [3]: A = Matrix([[1, -2, 1], [3, 2, -2], [6, -1, -1]])
        A

Out[3]: $$\begin{bmatrix} 1 & -2 & 1 \\ 3 & 2 & -2 \\ 6 & -1 & -1 \end{bmatrix}$$

In [4]: eye(3)

Out[4]: $$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

To get rid of the 3 in position row 2, column 1 we multiply row 1 (whose pivot is the 1 in row 1, column 1) by −3 and add it to row 2 (we call the resulting elementary matrix E21, referring to row 2, column 1)
Doing the same to the identity matrix, −3 times row 1 is (−3,0,0) (but we leave row 1 as (1,0,0) in E21) and adding this to row 2 leaves (−3,1,0)
In [5]: E21 = Matrix([[1, 0, 0], [-3, 1, 0], [0, 0, 1]])
        E21

Out[5]: $$\begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

In [6]: E21 * A # The resulting matrix after multiplication by E21

Out[6]: $$\begin{bmatrix} 1 & -2 & 1 \\ 0 & 8 & -5 \\ 6 & -1 & -1 \end{bmatrix}$$

We do the same to get rid of the 6 in row 3, column 1
Multiplying row 1 (of the identity matrix) by −6 and adding this new row to row 3 (but again leaving row 1 as (1,0,0) in E31)

In [7]: E31 = Matrix([[1, 0, 0], [0, 1, 0], [-6, 0, 1]])
        E31

Out[7]: $$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -6 & 0 & 1 \end{bmatrix}$$

In [8]: E31 * E21 * A # This got rid of the leading 6 in row 3

Out[8]: $$\begin{bmatrix} 1 & -2 & 1 \\ 0 & 8 & -5 \\ 0 & 11 & -7 \end{bmatrix}$$

Now the 8 in row 2, column 2 is the pivot and we need to get rid of the 11 in row 3, column 2
Unfortunately we have an 8 and an 11 to deal with
We will have to do two elementary row operations
−11 times row 2 of the identity matrix (0,−11,0)
Added to 8 times row 3 (0,0,8) ∴ (0,−11,8)

In [9]: E32 = Matrix([[1, 0 , 0], [0, 1, 0], [0, -11, 8]])
        E32

Out[9]: $$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -11 & 8 \end{bmatrix}$$

In [10]: U = E32 * E31 * E21 * A
         U # We call it U for upper triangular

Out[10]: $$\begin{bmatrix} 1 & -2 & 1 \\ 0 & 8 & -5 \\ 0 & 0 & -1 \end{bmatrix}$$

The matrix is now in upper triangular form

$$(E_{32})(E_{31})(E_{21}) A = U$$

Calculating the Lower triangular form

Note, to reverse this process we would have to do the following:

$$(E_{21})^{-1} (E_{31})^{-1} (E_{32})^{-1} (E_{32})(E_{31})(E_{21}) A = A$$

The inverse of a matrix can be calculated using the sympy method .inv()
We can check this with a Boolean request

In [11]: E21.inv() * E31.inv() * E32.inv() * E32 * E31 * E21 * A == A # The Boolean double equal signs asks: Is the
         # left-hand side equal to the right-hand side?

Out[11]: True
Indeed, we get back the identity matrix by just multiplying the inverse elementary matrices and the elementary matrices

In [12]: E21.inv() * E31.inv() * E32.inv() * E32 * E31 * E21

Out[12]: $$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

If we multiply by the inverse elementary matrices on the left-hand side, we must do the same on the right-hand side

$$(E_{21})^{-1} (E_{31})^{-1} (E_{32})^{-1} (E_{32})(E_{31})(E_{21}) A = (E_{21})^{-1} (E_{31})^{-1} (E_{32})^{-1} U$$

The product of these inverse elementary matrices is lower triangular
We can call it L

$$A = LU$$

In [13]: L = E21.inv() * E31.inv() * E32.inv()
         L

Out[13]: $$\begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 6 & \frac{11}{8} & \frac{1}{8} \end{bmatrix}$$

In [14]: A == L * U # Checking this with a Boolean question

Out[14]: True

In [15]: A, L * U # They are identical

Out[15]: $$\left( \begin{bmatrix} 1 & -2 & 1 \\ 3 & 2 & -2 \\ 6 & -1 & -1 \end{bmatrix}, \quad \begin{bmatrix} 1 & -2 & 1 \\ 3 & 2 & -2 \\ 6 & -1 & -1 \end{bmatrix} \right)$$

Doing this in one go using sympy

In [16]: L, U, _ = A.LUdecomposition()

In [17]: L

Out[17]: $$\begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 6 & \frac{11}{8} & 1 \end{bmatrix}$$

In [18]: U # Note the difference from the U above

Out[18]: $$\begin{bmatrix} 1 & -2 & 1 \\ 0 & 8 & -5 \\ 0 & 0 & -\frac{1}{8} \end{bmatrix}$$

In [19]: L * U # Back to A

Out[19]: $$\begin{bmatrix} 1 & -2 & 1 \\ 3 & 2 & -2 \\ 6 & -1 & -1 \end{bmatrix}$$

What's special about L?

This only works when no row interchange happens
It also only works with the conventional form of elimination, in which a positive multiple of one row is subtracted from another row, as opposed to the negative multiples I use above, which allow me to add the two rows
Note the 3 (in row 2, column 1) and the 6 (in row 3, column 1) of L
They are the row multipliers we used for E21 and E31
The 11/8 is what we did for E32 (we just did it in two steps so as not to use fractions)
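To see why the multipliers end up in L, here is a minimal sketch of the conventional elimination (subtracting a multiple of the pivot row), written as a small helper. The function name `lu_no_pivot` is made up for this sketch, and it assumes that no zero ever appears in a pivot position, so no row exchange is needed.

```python
from sympy import Matrix, Rational, eye

A = Matrix([[1, -2, 1], [3, 2, -2], [6, -1, -1]])


def lu_no_pivot(M):
    """Conventional elimination without row exchanges (hypothetical helper).

    Assumes every pivot is non-zero. Each multiplier is stored in L at the
    position of the entry it eliminates.
    """
    n = M.rows
    L = eye(n)
    U = M.copy()
    for col in range(n - 1):
        for row in range(col + 1, n):
            multiplier = U[row, col] / U[col, col]
            L[row, col] = multiplier
            # Subtract the multiple of the pivot row from the current row
            U[row, :] = U[row, :] - multiplier * U[col, :]
    return L, U


L, U = lu_no_pivot(A)
```

The entries below the diagonal of `L` are the multipliers 3, 6 and 11/8, matching the output of `A.LUdecomposition()` shown above.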
Row exchanges
We have to allow row exchanges if a pivot position contains a zero
For an example, from a 3×3 identity matrix we could have:

In [20]: eye(3)

Out[20]: $$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

Exchanging rows one and two would be:

In [21]: Matrix([[0, 1, 0], [1, 0, 0], [0, 0, 1]])

Out[21]: $$\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

In [22]: A, Matrix([[0, 1, 0], [1, 0, 0], [0, 0, 1]]) * A # Showing row exchange

Out[22]: $$\left( \begin{bmatrix} 1 & -2 & 1 \\ 3 & 2 & -2 \\ 6 & -1 & -1 \end{bmatrix}, \quad \begin{bmatrix} 3 & 2 & -2 \\ 1 & -2 & 1 \\ 6 & -1 & -1 \end{bmatrix} \right)$$

How many permutations of row exchanges are there?

$$n!$$

In a 3×3 matrix there are 3! = 6 permutations
Multiplying any two of them will result in one of the 6
The inverse of each of them is also one of the 6
The inverses are the transposes
For 4×4 there are 4! = 24

Example problems

Example problem 01

Perform LU decomposition of:

$$\begin{bmatrix} 1 & 0 & 1 \\ a & a & a \\ b & b & a \end{bmatrix}$$

For which values of a and b do L and U exist?

Solution

In [23]: a, b = symbols('a b')

In [24]: A = Matrix([[1, 0, 1], [a, a, a], [b, b, a]])
         A

Out[24]: $$\begin{bmatrix} 1 & 0 & 1 \\ a & a & a \\ b & b & a \end{bmatrix}$$

In [25]: L,U, _ = A.LUdecomposition()
In [26]: L, U

Out[26]: $$\left( \begin{bmatrix} 1 & 0 & 0 \\ a & 1 & 0 \\ b & \frac{b}{a} & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 & 1 \\ 0 & a & 0 \\ 0 & 0 & a - b \end{bmatrix} \right)$$

Checking

In [27]: L * U == A

Out[27]: True

For existence: a ≠ 0
It's easy to see why, since if a equals zero, we will have a zero in a pivot position and we will have to do a row exchange, which is not allowed for LU-decomposition

Hints and tips

In [28]: E21, E21.inv() # To take the inverse of an elementary matrix, simply change the sign of the off-diagonal elements and
         # multiply each element by 1 over the determinant
         # The determinant is easy to do for these *n* = 3 square matrices, since the top row is (1,0,0)

Out[28]: $$\left( \begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right)$$

In [29]: E31, E31.inv()

Out[29]: $$\left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -6 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 6 & 0 & 1 \end{bmatrix} \right)$$

In [30]: E32, E32.inv()

Out[30]: $$\left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -11 & 8 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & \frac{11}{8} & \frac{1}{8} \end{bmatrix} \right)$$

By keeping track of the elementary matrices it is easy to get L and U
It's easy to get the inverses of L and U
This means it is easy to calculate x

$$\begin{aligned} Ax = LUx &= b \\ Ux &= L^{-1} b \\ x &= U^{-1} L^{-1} b \end{aligned}$$
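The last three equations can be followed step by step in sympy. The sketch below uses the numerical matrix A from earlier in this notebook together with a made-up right-hand side b, and solves the two triangular systems with sympy's `lower_triangular_solve` and `upper_triangular_solve` instead of forming the inverses explicitly. It assumes that the decomposition needs no row exchanges.

```python
from sympy import Matrix

A = Matrix([[1, -2, 1], [3, 2, -2], [6, -1, -1]])
b = Matrix([1, 2, 3])  # hypothetical right-hand side

L, U, perm = A.LUdecomposition()
assert not perm  # assumption: no row exchanges were needed

# Ax = LUx = b, so first solve Lc = b (c plays the role of L^-1 b) ...
c = L.lower_triangular_solve(b)
# ... and then Ux = c (x plays the role of U^-1 L^-1 b)
x = U.upper_triangular_solve(c)
```

Solving the two triangular systems in turn is the usual reason for factoring A in the first place, since each of them only needs forward or back substitution.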
2015/03/28, 12:51 PMI_06_Transposes_Permutations_Spaces
Page 1 of 3http://localhost:8888/nbconvert/html/I_06_Transposes_Permutations_Spaces.ipynb?download=false
This notebook is part of lecture 5, Transposes, permutations, and vector spaces, in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper
Head of Acute Care Surgery
Groote Schuur Hospital
University Cape Town
Email me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())

In [2]: #import numpy as np
        from sympy import init_printing, Matrix, symbols
        #import matplotlib.pyplot as plt
        #import seaborn as sns
        #from IPython.display import Image
        from warnings import filterwarnings
        init_printing(use_latex = 'mathjax')
        %matplotlib inline
        filterwarnings('ignore')
Transposes, permutations and vector spaces
The permutation matrices
Remember that the permutation matrices allow for row exchanges
They are used to manage zeros in pivot positions
They have the following property: the inverse of a permutation matrix equals its transpose

In [3]: P = Matrix([[0, 1, 0], [1, 0, 0], [0, 0, 1]])
        P # Exchanging rows 1 and 2

In [4]: P.inv(), P.transpose()

In [5]: P.inv() == P.transpose()

If a matrix is of size n × n then there are n! permutation matrices
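The count of n! and the inverse-equals-transpose property can be checked for n = 3 by generating every permutation matrix. The helper name `permutation_matrices` is made up for this sketch; it simply reorders the rows of the identity matrix in every possible way.

```python
from itertools import permutations
from sympy import Matrix, eye


def permutation_matrices(n):
    """All n x n permutation matrices, built by reordering the rows of I (hypothetical helper)."""
    identity = eye(n)
    return [Matrix([list(identity.row(i)) for i in order])
            for order in permutations(range(n))]


matrices = permutation_matrices(3)
count = len(matrices)  # 3! of them

# Each inverse is the transpose ...
inverse_is_transpose = all(P.inv() == P.T for P in matrices)
# ... and the product of any two is again one of them
closed = all(P * Q in matrices for P in matrices for Q in matrices)
```

The same helper with n = 4 should return 4! = 24 matrices.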
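$$P^{-1} = P^{T}$$

Out[3]: $$\begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

Out[4]: $$\left( \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right)$$

Out[5]: True

The transpose of a matrix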
We have mentioned transposes of a matrix, but what are they?
They simply make rows of the column elements and columns of the row elements, as in the example below

In [6]: a11, a12, a13, a14, a21, a22, a23, a24, a31, a32, a33, a34 = symbols('a11, a12, a13, a14, a21, a22, a23, a24, a31, a32, a33, a34')
        # Creating mathematical scalar constants

In [7]: A = Matrix([[a11, a12, a13], [a21, a22, a23], [a31, a32, a33]])
        A

Out[7]: $$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

In [8]: A.transpose()

Out[8]: $$\begin{bmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{bmatrix}$$

This applies to any size matrix

In [9]: A = Matrix([[a11, a12, a13, a14], [a21, a22, a23, a24]])
        A

Out[9]: $$\begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \end{bmatrix}$$

In [10]: A.transpose()

Out[10]: $$\begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \\ a_{13} & a_{23} \\ a_{14} & a_{24} \end{bmatrix}$$

Multiplying a matrix by its transpose results in a symmetric matrix

In [11]: A * A.transpose()

Out[11]: $$\begin{bmatrix} a_{11}^2 + a_{12}^2 + a_{13}^2 + a_{14}^2 & a_{11} a_{21} + a_{12} a_{22} + a_{13} a_{23} + a_{14} a_{24} \\ a_{11} a_{21} + a_{12} a_{22} + a_{13} a_{23} + a_{14} a_{24} & a_{21}^2 + a_{22}^2 + a_{23}^2 + a_{24}^2 \end{bmatrix}$$

Symmetric matrices

A symmetric matrix is a square matrix in which the elements mirrored across the main diagonal are equal
Example

In [12]: S = Matrix([[1, 3, 2], [3, 2, 4], [2 , 4, 2]])
         S

Out[12]: $$\begin{bmatrix} 1 & 3 & 2 \\ 3 & 2 & 4 \\ 2 & 4 & 2 \end{bmatrix}$$

On the main diagonal we have 1, 2, 2
Opposite this main diagonal we have a 3 and a 3, a 2 and a 2, and a 4 and a 4
The transpose of a symmetric matrix is equal to the matrix

In [13]: S == S.transpose()

Out[13]: True

Vector spaces
A vector space is a bunch of vectors (a set of vectors) with certain properties that allow us to do stuff with them
The space $\mathbb{R}^2$ is all vectors of two components, which reach every coordinate point in $\mathbb{R}^2$
It always includes the zero vector 0
We usually call this vector space V, such that $V = \mathbb{R}^2$ or $V = \mathbb{R}^n$
Linear combinations of a certain number of these vectors can also fill all of $\mathbb{R}^2$
A good example is the two unit vectors along the two axes
Such a set of vectors forms a basis for V
The two of them also span $\mathbb{R}^2$, i.e. linear combinations of them fill $V = \mathbb{R}^2$
Linear independence means the vectors in $\mathbb{R}^2$ don't fall on the same line
If they do, we can't get to all coordinate points in $\mathbb{R}^2$
The important point about a vector space V is that it allows for vector addition and scalar multiplication
Taking any of the vectors in V and adding them results in a new vector which is still an element of V
Multiplying a scalar by any of the vectors in V results in a vector still in V

A subspace

For a subspace the rules of vector addition and scalar multiplication must apply
I.e. a quadrant of $\mathbb{R}^2$ is not a vector subspace
Addition or scalar multiplication of vectors in this quadrant can lead to a vector outside of this quadrant
The zero vector 0 on its own is a subspace (every subspace must contain the zero vector)
The whole space $V = \mathbb{R}^n$ (here we use n = 2) is a subspace of itself
Continuing with our example of n = 2, any line through the origin is a subspace of $\mathbb{R}^2$
Adding a vector on this line to itself or to a scalar multiple of itself gives another vector on the line, and together these fill the whole line
For n = 3, the whole space $V = \mathbb{R}^3$, a plane through the origin, a line through the origin and the zero vector are all subspaces of $V = \mathbb{R}^3$
The point is that vector addition and scalar multiplication of vectors in the subspace must result in a new vector that remains in the subspace
Every subspace must include the zero vector 0
All the properties of vectors must apply to the vectors in a subspace (and a space)

Column spaces of matrices

Here we see the columns of a matrix as vectors
If there are two columns and three rows we will have the following as an example

$$\begin{bmatrix} 2 & 1 \\ 1 & 3 \\ 2 & 2 \end{bmatrix} \quad \text{with columns} \quad \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} \text{ and } \begin{bmatrix} 1 \\ 3 \\ 2 \end{bmatrix}$$

If they are not multiples of each other, addition and scalar multiplication of the two of them will fill a plane in $\mathbb{R}^3$
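A quick sympy sketch of the plane claim for the example matrix above: the rank tells us that the two columns are independent, and the cross product of the columns gives a normal vector that every combination of the columns is perpendicular to. The symbols `x1` and `x2` are just placeholder weights for this sketch.

```python
from sympy import Matrix, symbols, expand

A = Matrix([[2, 1], [1, 3], [2, 2]])
rank = A.rank()  # 2 independent columns, so the column space is a plane

# A vector perpendicular to both columns
normal = A.col(0).cross(A.col(1))

# An arbitrary linear combination of the two columns
x1, x2 = symbols('x1 x2')
combination = x1 * A.col(0) + x2 * A.col(1)

# Every such combination stays in the plane through the origin
# with this normal vector
in_plane = expand(normal.dot(combination))
```

`in_plane` expands to zero whatever the weights are, and the zero vector (both weights zero) is in the plane, as every subspace requires.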
2015/03/28, 1:11 PMI_07_Column_and_null_spaces
Page 1 of 3http://localhost:8888/nbconvert/html/Dropbox/Python/Mathematic…rt_Strang/ZIP/I_07_Column_and_null_spaces.ipynb?download=false
This notebook is part of lecture 6, Columnspace and nullspace, in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper
Head of Acute Care Surgery
Groote Schuur Hospital
University Cape Town
Email me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())

In [2]: #import numpy as np
        from sympy import init_printing, Matrix, symbols
        #import matplotlib.pyplot as plt
        #import seaborn as sns
        #from IPython.display import Image
        from warnings import filterwarnings
        init_printing(use_latex = 'mathjax')
        %matplotlib inline
        filterwarnings('ignore')
Columnspace and nullspace of a matrix
Columnspaces of matrices
We saw in the previous lecture that columns of a matrix can form vectors
Consider now the LU-decomposition of A

$$PA = PLU$$

For two subspaces P and L, the union P∪L (all vectors in P or L or both) is NOT a subspace
The intersection P∩L (all vectors in both P and L) is a subspace (because their intersection is only the zero vector)
The intersection of any two subspaces is a subspace

Consider the following example matrix

In [3]: A = Matrix([[1, 1, 2], [2, 1, 3], [3, 1, 4], [4, 1, 5]])
        A

Out[3]: $$\begin{bmatrix} 1 & 1 & 2 \\ 2 & 1 & 3 \\ 3 & 1 & 4 \\ 4 & 1 & 5 \end{bmatrix}$$

Each of the columns is a vector in $\mathbb{R}^4$
The linear combinations of all the column vectors form a subspace (the column space)
Is it the whole $V = \mathbb{R}^4$, though?
The reason why we ask is because we want to bring it back to a system of linear equations and ask the question: Is there (always) a solution to the following:

$$A \underline{x} = \underline{b}$$

Thus, which right-hand sides b are allowed?
In our example above we are in $\mathbb{R}^4$ and we ask if linear combinations of all of the columns fill $\mathbb{R}^4$
From our example above some right-hand sides will be allowed (they form a subspace)
Let's look at an example for b

In [4]: x1, x2, x3 = symbols('x1, x2, x3')
        vec_x = Matrix([x1, x2, x3])
        b = Matrix([1, 2, 3, 4])
        A, vec_x, b

Out[4]: $$\left( \begin{bmatrix} 1 & 1 & 2 \\ 2 & 1 & 3 \\ 3 & 1 & 4 \\ 4 & 1 & 5 \end{bmatrix}, \quad \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix} \right)$$

In [5]: A * vec_x

Out[5]: $$\begin{bmatrix} x_1 + x_2 + 2 x_3 \\ 2 x_1 + x_2 + 3 x_3 \\ 3 x_1 + x_2 + 4 x_3 \\ 4 x_1 + x_2 + 5 x_3 \end{bmatrix}$$

You can do the row multiplication, but it's easy to see from above we are asking about linear combinations of the columns, i.e. how many ($x_1$) of column 1 plus how many ($x_2$) of column 2 plus how many ($x_3$) of column 3 equals b?
Well, since b is the same as the first column, x would be

$$\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$$

So we can solve for all values of b if b is in the column space

Linear independence

We really need to know if the columns above are linearly independent
We note that column three above is a linear combination of the first two, so adds nothing new
Actually, we could also throw away the first one because it is column 3 plus −1 times column 2
Same for column 2
We thus have two columns left and we say that the column space is of dimension 2 (a 2-dimensional subspace of $\mathbb{R}^4$)

The nullspace

It contains all solutions x for Ax=0
These solutions are in $\mathbb{R}^3$

In [6]: zero_b = Matrix([0, 0, 0, 0])
        A, vec_x, zero_b

Out[6]: $$\left( \begin{bmatrix} 1 & 1 & 2 \\ 2 & 1 & 3 \\ 3 & 1 & 4 \\ 4 & 1 & 5 \end{bmatrix}, \quad \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \right)$$
Some solutions would be

$$\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}, \quad \begin{bmatrix} 2 \\ 2 \\ -2 \end{bmatrix}$$

In fact, we have:

$$c \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}$$

It is thus a line
The nullspace is a line in $\mathbb{R}^3$

PLEASE remember, for any space the rules of addition and scalar multiplication must hold for vectors to remain in that space
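The column space and nullspace of the example matrix can also be asked for directly. This sketch uses sympy's `rank`, `columnspace` and `nullspace` methods on the same A; the check that b = (1, 2, 3, 4) is an allowed right-hand side compares the rank of A with the rank of the augmented matrix.

```python
from sympy import Matrix, zeros

A = Matrix([[1, 1, 2], [2, 1, 3], [3, 1, 4], [4, 1, 5]])
b = Matrix([1, 2, 3, 4])

rank = A.rank()                # dimension of the column space
column_basis = A.columnspace() # a basis taken from the pivot columns
null_basis = A.nullspace()     # a basis for all solutions of Ax = 0

# b is in the column space if appending it does not raise the rank
b_is_allowed = A.row_join(b).rank() == rank

# Any multiple of the basis vector is again a solution, so the nullspace is a line
direction = null_basis[0]
still_a_solution = A * (5 * direction) == zeros(4, 1)
```

sympy may scale the nullspace basis vector differently from the (1, 1, −1) used above, but it spans the same line.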
2015/03/28, 1:15 PMI_08_Solving_homogeneous_systems_Pivot_variables_Special_solutions
Page 1 of 3http://localhost:8888/nbconvert/html/Dropbox/Python/Mathematics…_systems_Pivot_variables_Special_solutions.ipynb?download=false
This notebook is part of lecture 7, Solving Ax=0, pivot variables, and special solutions, in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper
Head of Acute Care Surgery
Groote Schuur Hospital
University Cape Town
Email me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())

In [2]: #import numpy as np
        from sympy import init_printing, Matrix, symbols
        #import matplotlib.pyplot as plt
        #import seaborn as sns
        #from IPython.display import Image
        from warnings import filterwarnings
        init_printing(use_latex = 'mathjax')
        %matplotlib inline
        filterwarnings('ignore')
Solving homogeneous systems
Pivot variables
Special solutions

We are trying to solve a system of linear equations
For homogeneous systems the right-hand side is the zero vector
Consider the example below

In [3]: A = Matrix([[1, 2, 2, 2], [2, 4, 6, 8], [3, 6, 8, 10]])
        A # A 3x4 matrix

Out[3]: $$\begin{bmatrix} 1 & 2 & 2 & 2 \\ 2 & 4 & 6 & 8 \\ 3 & 6 & 8 & 10 \end{bmatrix}$$

In [4]: x1, x2, x3, x4 = symbols('x1, x2, x3, x4')
        x_vect = Matrix([x1, x2, x3, x4]) # A 4x1 matrix
        x_vect

Out[4]: $$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}$$

In [5]: b = Matrix([0, 0, 0])
        b # A 3x1 matrix

Out[5]: $$\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

The x column vector stands for the set of all the solutions to this homogeneous equation
It forms the nullspace
Note that the column vectors in A are not linearly independent
Performing elementary row operations leaves us with the matrix below
It has two pivots, which is termed rank 2

In [6]: A.rref() # rref being reduced row echelon form

Out[6]: $$\left( \begin{bmatrix} 1 & 2 & 0 & -2 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0, & 2 \end{bmatrix} \right)$$

Which represents the following

$$x_1 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix} + x_3 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + x_4 \begin{bmatrix} -2 \\ 2 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

$$\begin{aligned} x_1 + 2 x_2 + 0 x_3 - 2 x_4 &= 0 \\ 0 x_1 + 0 x_2 + x_3 + 2 x_4 &= 0 \\ 0 x_1 + 0 x_2 + 0 x_3 + 0 x_4 &= 0 \end{aligned}$$

We are free to set a value for $x_4$, let's say t

$$\begin{aligned} x_1 + 2 x_2 + 0 x_3 - 2 x_4 &= 0 \\ 0 x_1 + 0 x_2 + x_3 + 2t &= 0 \\ 0 x_1 + 0 x_2 + 0 x_3 + 0 x_4 &= 0 \\ \therefore x_3 &= -2t \end{aligned}$$

We will have to make $x_2$ equal to another variable, say s

$$\begin{aligned} x_1 + 2s + 0 x_3 - 2t &= 0 \\ \therefore x_1 &= 2t - 2s \end{aligned}$$

This results in the following, which is the complete nullspace and has dimension 2

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} -2s + 2t \\ s \\ -2t \\ t \end{bmatrix} = \begin{bmatrix} -2s \\ s \\ 0 \\ 0 \end{bmatrix} + \begin{bmatrix} 2t \\ 0 \\ -2t \\ t \end{bmatrix} = s \begin{bmatrix} -2 \\ 1 \\ 0 \\ 0 \end{bmatrix} + t \begin{bmatrix} 2 \\ 0 \\ -2 \\ 1 \end{bmatrix}$$

From the above, we clearly have two vectors in the solution and we can take constant multiples of these to fill up our solution space (our nullspace)
We can easily calculate how many free variables we will have by subtracting the number of pivots (rank) from the number of variables (x) in x
Here we have 4 − 2 = 2

Example problem

Calculate x for the transpose of A above

Solution

In [7]: A_trans = A.transpose() # Creating a new matrix called A_trans and giving it the value of the transpose of A
        A_trans

Out[7]: $$\begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 6 \\ 2 & 6 & 8 \\ 2 & 8 & 10 \end{bmatrix}$$

In [8]: A_trans.rref() # In reduced row echelon form this would be the following matrix

Out[8]: $$\left( \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0, & 1 \end{bmatrix} \right)$$
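Remember this is 4 equations in 3 unknowns, i.e.

$$x_1 \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} + x_3 \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$

$$\begin{aligned} x_1 + 0 x_2 + x_3 &= 0 \\ 0 x_1 + x_2 + x_3 &= 0 \\ 0 x_1 + 0 x_2 + 0 x_3 &= 0 \\ 0 x_1 + 0 x_2 + 0 x_3 &= 0 \end{aligned}$$

It seems we are free to choose a value for $x_3$
Let's make it t

$$\begin{aligned} x_3 &= t \\ x_1 + 0 x_2 + t &= 0 \\ 0 x_1 + x_2 + t &= 0 \\ \therefore x_2 &= -t \\ \therefore x_1 &= -t \end{aligned}$$

$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -t \\ -t \\ t \end{bmatrix} = t \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}$$

Substituting these values back into the vector equation confirms the solution

$$-t \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} - t \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} + t \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$

We had n = 3 unknowns and r (rank) = 2 pivots
The solution set (nullspace) will thus have 1 free variable (t) (3 − 2 = 1)
The third column is the sum of the first two, so only 2 columns are linearly independent
We thus expect 2 pivots and can predict the nullspace to have only 1 free variable (i.e. it is one-dimensional)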
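The special solutions for both A and its transpose can be cross-checked with sympy's `nullspace` method, which sets one free variable to 1 and the others to 0 in turn. The helper name `describe_nullspace` is made up for this sketch.

```python
from sympy import Matrix, zeros

A = Matrix([[1, 2, 2, 2], [2, 4, 6, 8], [3, 6, 8, 10]])


def describe_nullspace(M):
    """Return (rank, number of free variables, special solutions) for M (hypothetical helper)."""
    rank = M.rank()
    free = M.cols - rank  # variables minus pivots
    return rank, free, M.nullspace()


rank_A, free_A, special_A = describe_nullspace(A)
rank_T, free_T, special_T = describe_nullspace(A.transpose())

# Every special solution must satisfy the homogeneous system
all_solve = (all(A * v == zeros(3, 1) for v in special_A)
             and all(A.transpose() * v == zeros(4, 1) for v in special_T))
```

The rank is the same for A and its transpose, while the number of free variables differs because the number of columns differs.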
2015/03/28, 1:26 PMII_09_Diagonalization_and_Powers
Page 1 of 8http://localhost:8888/nbconvert/html/Dropbox/Python/Mathematics…rang/ZIP/II_09_Diagonalization_and_Powers.ipynb?download=false
This notebook is part of lecture 22, Diagonalization and powers of A, in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper
Head of Acute Care Surgery
Groote Schuur Hospital
University Cape Town
Email me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())

In [2]: from sympy import init_printing, Matrix, symbols, eye, Rational
        from warnings import filterwarnings

In [3]: init_printing(use_latex = 'mathjax')
        filterwarnings('ignore')
Diagonalizing a matrixPowers of a matrix A
Definition
If A is a n×n, then a non-zero vector x in � is called an eigenvector of the matrix A if Ax is a scalar multiple of xWhat this suggests is that if you consider the column vector x and multiply it by a scalar (here called λ) (which is then parallel to x, just of different length) it results in thesame solution as multiplying the matrix A by xLet's try another explanation: if a matrix A, multiplied with a (column) vector (x) results in a scalar multiple of that same (column) vector (and is thus parallel to that(column) vector) then this (column) vector is an eigenvector of the matrix A
In essence this multiplication of a matrix with a (column) vector produces another vector on the same line as the original vectorDepending on the value of this scalar the resulting vector might point in the opposite direction and be shorter or longer than the original
This scalar multiple is called the eigenvalueMatrices can have more than one eigenvalue and eigenvector
Derivations
We need to insert an identity matrix of size n into the equation that describes the explanation above
$$ A\underline{x} = \lambda\underline{x} \\ A\underline{x} = \lambda I\underline{x} \\ A\underline{x} - \lambda I\underline{x} = \underline{0} \\ \left( A - \lambda I \right)\underline{x} = \underline{0} $$
Look at this carefully and you'll notice that we are describing the nullspace (eigenspace) of the matrix (A-λI)
For a non-zero x to exist this matrix has to be singular, i.e. have a determinant of 0
$$ \left| A - \lambda I \right| = 0 $$
Solving this equation (called the characteristic equation) will give us the eigenvalues (the λ's)
Up to an overall sign, the determinant is a polynomial in λ (called the characteristic polynomial of A), which we write with a leading coefficient of 1 and a degree of n corresponding to the size of A
$$ p\left( \lambda \right) = \lambda^{n} + c_{1}\lambda^{n-1} + \dots + c_{n} $$
Substituting them back into...
$$ \left( A - \lambda I \right)\underline{x} = \underline{0} $$
... allows us to calculate the eigenvector(s) x
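Let's look at the following matrix A
$$ A = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix} $$
$$ A - \lambda I = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix} - \begin{bmatrix} \lambda & 0 & 0 \\ 0 & \lambda & 0 \\ 0 & 0 & \lambda \end{bmatrix} = \begin{bmatrix} -\lambda & 0 & -2 \\ 1 & 2 - \lambda & 1 \\ 1 & 0 & 3 - \lambda \end{bmatrix} $$
$$ \begin{vmatrix} -\lambda & 0 & -2 \\ 1 & 2 - \lambda & 1 \\ 1 & 0 & 3 - \lambda \end{vmatrix} = 0 $$
$$ \lambda^{3} - 5\lambda^{2} + 8\lambda - 4 = 0 $$
$$ \lambda_{1} = 1, \quad \lambda_{2} = \lambda_{3} = 2 $$
Let's start with the first eigenvalue, which is equal to 1, and substitute it into A-λI
In [4]: A = Matrix([[-1, 0, -2], [1, 1, 1], [1, 0, 2]]) # A - 1I (the name A is reused)
        A
Out[4]: $$ \begin{bmatrix} -1 & 0 & -2 \\ 1 & 1 & 1 \\ 1 & 0 & 2 \end{bmatrix} $$
We now need the nullspace of this matrix
In [5]: A.nullspace()
Out[5]: $$ \left[ \begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix} \right] $$
We knew that this would be 1-dimensional after looking at the row-reduced form
In [6]: A.rref()
Out[6]: $$ \left( \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}, \quad \left[ 0, 1 \right] \right) $$
It has rank 2 (two pivot columns and 1 free variable)
Now for the other 2 eigenvalues, both equaling 2
In [7]: A = Matrix([[-2, 0, -2], [1, 0, 1], [1, 0, 1]]) # A - 2I (the name A is reused)
        A
Out[7]: $$ \begin{bmatrix} -2 & 0 & -2 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix} $$
In [8]: A.nullspace()
Out[8]: $$ \left[ \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \right] $$
In [9]: A.rref()
Out[9]: $$ \left( \begin{bmatrix} 1 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad \left[ 0 \right] \right) $$
Only a single pivot column, therefore a rank of 1 and two independent (free) variables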
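The eigenpairs found above can be cross-checked directly against the definition Ax = λx. The following is a minimal sketch in plain Python (no sympy), assuming the original matrix A and the eigenvectors read off the nullspace outputs; the helper names `matvec` and `char_poly` are illustrative and not part of any library.

```python
# Original 3x3 example matrix (before subtracting lambda * I)
A = [[0, 0, -2],
     [1, 2, 1],
     [1, 0, 3]]

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def char_poly(lam):
    """Characteristic polynomial of A as derived in the text."""
    return lam**3 - 5 * lam**2 + 8 * lam - 4

# (eigenvalue, eigenvector) pairs taken from the nullspace outputs
pairs = [(1, [-2, 1, 1]), (2, [0, 1, 0]), (2, [-1, 0, 1])]

# A x should equal lambda x for every pair
checks = [matvec(A, x) == [lam * xi for xi in x] for lam, x in pairs]
# every eigenvalue should be a root of the characteristic polynomial
roots_ok = [char_poly(lam) == 0 for lam, _ in pairs]
print(checks, roots_ok)
```

All entries should come out as `True` if the eigenvalues and eigenvectors were read correctly.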
Corresponding to the first eigenvalue we have a single eigenvector that is the basis for a 1-dimensional eigenspace (a line) in $\mathbb{R}^3$
Corresponding to the second (and third) eigenvalue we have two basis vectors for a 2-dimensional eigenspace (a plane) in $\mathbb{R}^3$
Since we are talking about subspaces, we must note that the zero vector is in both eigenspaces (each is a type of nullspace), but it isn't an eigenvector
The eigenvalues of triangular (upper and lower) and diagonal matrices
The eigenvalues of these types of matrices are exactly the entries along the main diagonal
Real and complex eigenvalues
Some characteristic polynomials have complex roots
For a real eigenvalue λ of a square matrix A of size n the consequences are the following
The system (A-λI)x=0 has non-trivial solutions
There is a non-zero vector x in $\mathbb{R}^n$ such that Ax=λx
The eigenvector matrix S and eigenvalue matrix Λ
We need to create S from the (column) eigenvectors such that the following holds
$$ S^{-1}AS = \Lambda $$
As such, S should be square of size n×n and invertible, so we need n independent eigenvectors
Suppose we have n linearly independent eigenvectors of A
Put them in the columns of S and calculate AS
$$ AS = A \begin{bmatrix} \vdots & \vdots & & \vdots \\ \underline{x}_1 & \underline{x}_2 & \cdots & \underline{x}_n \\ \vdots & \vdots & & \vdots \end{bmatrix} = \begin{bmatrix} \vdots & \vdots & & \vdots \\ \lambda_1 \underline{x}_1 & \lambda_2 \underline{x}_2 & \cdots & \lambda_n \underline{x}_n \\ \vdots & \vdots & & \vdots \end{bmatrix} = \begin{bmatrix} \vdots & \vdots & & \vdots \\ \underline{x}_1 & \underline{x}_2 & \cdots & \underline{x}_n \\ \vdots & \vdots & & \vdots \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} $$
$$ AS = S\Lambda $$
From this we have the following
$$ AS = S\Lambda \\ S^{-1}AS = \Lambda \\ A = S \Lambda S^{-1} $$
Later I will use the computer variable D for this diagonal matrix Λ
The power of a matrix (only for n independent eigenvectors)
We saw in the example section of the last lecture that the following holds
$$ A^{2}\underline{x} = \lambda^{2}\underline{x} $$
The eigenvectors are the same for A and $A^2$
We can also see the following
$$ A^{2} = S \Lambda S^{-1} S \Lambda S^{-1} = S \Lambda^{2} S^{-1} $$
The power need not be 2, but any k, which will have $S^{-1}S$ appearing k-1 times
We thus have the following theorems
$$ A^{k} \to 0 \text{ as } k \to \infty \text{ if all } \left| \lambda_i \right| < 1 $$
...and...
If k is a positive integer, λ is an eigenvalue of the matrix A, and x is a corresponding eigenvector, then $\lambda^k$ is an eigenvalue of $A^k$ and x is a corresponding eigenvector
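The identity $A^k = S \Lambda^k S^{-1}$ can be checked numerically on a small case. This is a sketch only, in plain Python with exact fractions; the eigenvectors (1, 1) and (1, 2) and the eigenvalues 2 and 3 are made-up illustrative values, and `matmul` and `inverse_2x2` are hypothetical helpers written for this example.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][m] * Y[m][j] for m in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse_2x2(M):
    """Exact inverse of an invertible 2x2 matrix."""
    (p, q), (r, s) = M
    det = Fraction(p * s - q * r)
    return [[s / det, -q / det], [-r / det, p / det]]

S = [[1, 1], [1, 2]]      # illustrative eigenvectors (1,1) and (1,2) as columns
Lam = [[2, 0], [0, 3]]    # illustrative eigenvalues 2 and 3 on the diagonal
S_inv = inverse_2x2(S)
A = matmul(matmul(S, Lam), S_inv)   # A = S Lam S^-1

k = 5
A_power = [[1, 0], [0, 1]]
for _ in range(k):                  # A^k by repeated multiplication
    A_power = matmul(A_power, A)

Lam_k = [[2**k, 0], [0, 3**k]]      # only the diagonal entries are raised to k
via_eigen = matmul(matmul(S, Lam_k), S_inv)
print(A, A_power == via_eigen)
```

The point of the comparison is that the right-hand side needs only scalar powers, whereas the left-hand side needs k matrix products.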
What makes a matrix diagonalizable
In discussing diagonalization we are concerned with finding a basis for $\mathbb{R}^n$ that consists of eigenvectors of a given square matrix A of size n
Such a basis can tell us about geometric properties of A and it can simplify numerical computations involving A
We need to answer two questions (which are actually the same)
Given a square matrix of size n, is there a basis for $\mathbb{R}^n$ consisting of eigenvectors?
Given a square matrix of size n, is there an invertible matrix S such that $S^{-1}AS$ is a diagonal matrix? (It is the same matrix S referred to above)
If such a matrix S exists, it is said to diagonalize A (and we will call the resultant diagonal matrix D)
In short the answer to the above question(s) is yes if A has n independent eigenvectors
This is guaranteed if all the λ's are different (none are repeated), but it is not totally excluded if they are repeated
If they are repeated, we still might have n independent eigenvectors, e.g. an identity matrix of any size (because it is already diagonal)
In [10]: A = eye(5)
         A
Out[10]: $$ \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} $$
In [11]: A.eigenvals()
Out[11]: $$ \left\{ 1 : 5 \right\} $$
In [12]: A.eigenvects()
Out[12]: $$ \left[ \left( 1, \quad 5, \quad \left[ \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} \right] \right) \right] $$
Here we look at a triangular matrix with a repeated eigenvalue, though, and it has only a single independent eigenvector
In [13]: A = Matrix([[2, 1], [0, 2]])
         A.eigenvals()
Out[13]: $$ \left\{ 2 : 2 \right\} $$
In [14]: A.eigenvects()
Out[14]: $$ \left[ \left( 2, \quad 2, \quad \left[ \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right] \right) \right] $$
We can use Python code to calculate the diagonalized matrix
In [15]: A = Matrix([[3, -2, 4, -2], [5, 3, -3, -2], [5, -2, 2, -2], [5, -2, -3, 3]])
         A
Out[15]: $$ \begin{bmatrix} 3 & -2 & 4 & -2 \\ 5 & 3 & -3 & -2 \\ 5 & -2 & 2 & -2 \\ 5 & -2 & -3 & 3 \end{bmatrix} $$
In [16]: S, D = A.diagonalize()
In [17]: S # S, such that A = S times D times the inverse of S
Out[17]: $$ \begin{bmatrix} 0 & 1 & 1 & 0 \\ 1 & 1 & 1 & -1 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{bmatrix} $$
In [18]: D # The diagonal
Out[18]: $$ \begin{bmatrix} -2 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 5 & 0 \\ 0 & 0 & 0 & 5 \end{bmatrix} $$
In [19]: S * D * S.inv() == A # Checking to see if our statement above is correct
Out[19]: True
In [20]: S.inv() * A * S == D # Checking to see if our statement above is correct
Out[20]: True
In [21]: A.eigenvals()
Out[21]: $$ \left\{ -2 : 1, \quad 3 : 1, \quad 5 : 2 \right\} $$
Remember Λ from above?
The eigenvalues are precisely the entries along the main diagonal of the diagonal matrix
To produce the required diagonal matrix manually will then require computing n linearly independent eigenvectors for a matrix A of size n (assuming that it is diagonalizable), creating a matrix with its columns equal to these eigenvectors (called matrix S) and evaluating $S^{-1}AS$ to calculate the diagonal matrix D (or Λ)
Back to the topic of what makes a matrix diagonalizable
Suppose we have an equation that starts with some vector $\underline{u}_0$ and every subsequent vector is the matrix A times the previous vector
$$ \underline{u}_{k+1} = A\underline{u}_{k} $$
From this arises the following
$$ \underline{u}_{1} = A\underline{u}_{0} \\ \underline{u}_{2} = A\underline{u}_{1} = A^{2}\underline{u}_{0} \\ \underline{u}_{k} = A^{k}\underline{u}_{0} $$
To really solve this problem, rewrite $\underline{u}_0$ as follows (a certain scalar times each eigenvector)
$$ \underline{u}_{0} = c_{1}\underline{x}_{1} + c_{2}\underline{x}_{2} + \dots + c_{n}\underline{x}_{n} = S\underline{c} $$
Where Sc is a linear combination of the individual eigenvectors
Now multiply both sides by A
$$ A\underline{u}_{0} = c_{1}A\underline{x}_{1} + c_{2}A\underline{x}_{2} + \dots + c_{n}A\underline{x}_{n} \\ A\underline{u}_{0} = c_{1}\lambda_{1}\underline{x}_{1} + c_{2}\lambda_{2}\underline{x}_{2} + \dots + c_{n}\lambda_{n}\underline{x}_{n} $$
Taking a power of A now (i.e. k) would be akin to taking each eigenvalue to that power
$$ A^{k}\underline{u}_{0} = c_{1}\lambda_{1}^{k}\underline{x}_{1} + c_{2}\lambda_{2}^{k}\underline{x}_{2} + \dots + c_{n}\lambda_{n}^{k}\underline{x}_{n} $$
This can be written as
$$ \underline{u}_{k} = A^{k}\underline{u}_{0} = S \Lambda^{k} \underline{c} $$
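The closed form for $\underline{u}_k$ can be compared against plain iteration of $\underline{u}_{k+1} = A\underline{u}_k$. The sketch below uses plain Python integers; the matrix, its eigenpairs and the coefficients c are illustrative values chosen for this example (they are not from the lecture), and `step` is a hypothetical helper.

```python
# Illustrative matrix with eigenvalue 2 (eigenvector (1,1)) and 3 (eigenvector (1,2))
A = [[1, 1], [-2, 4]]
eigen = [(2, [1, 1]), (3, [1, 2])]
c = [1, 2]   # coefficients so that u_0 = 1*(1,1) + 2*(1,2)

# u_0 = c_1 x_1 + c_2 x_2
u0 = [sum(ci * x[i] for ci, (_, x) in zip(c, eigen)) for i in range(2)]

def step(u):
    """One application of u_{k+1} = A u_k."""
    return [sum(a * ui for a, ui in zip(row, u)) for row in A]

k = 6
u = u0
for _ in range(k):
    u = step(u)

# u_k = c_1 lambda_1^k x_1 + c_2 lambda_2^k x_2
closed_form = [sum(ci * lam**k * x[i] for ci, (lam, x) in zip(c, eigen))
               for i in range(2)]
print(u0, u, closed_form)
```

Both routes should give the same vector; the closed form avoids the loop entirely, which is what makes large k practical.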
As an example consider the Fibonacci numbers: 0, 1, 1, 2, 3, 5, 8, 13, ...
What would the 100th number be?
Consider the recurrence $F_{k+2} = F_{k+1} + F_{k}$
This is a (second-order) difference equation; think of this example as similar to a second-order differential equation (without derivatives)
By adding a second (trivial) equation $F_{k+1} = F_{k+1}$, consider $\underline{u}_k$ to be the column vector with entries $F_{k+1}$ and $F_k$
This means that $\underline{u}_{k+1}$ is the matrix with rows (1, 1) and (1, 0) times $\underline{u}_k$
In [22]: A = Matrix([[1, 1], [1, 0]])
         A
In [23]: A.eigenvals()
In [24]: A.eigenvects()
In [25]: S, D = A.diagonalize()
In [26]: D
From above we remember that $\underline{u}_k = A^k \underline{u}_0 = S \Lambda^k \underline{c}$
Here $\underline{u}_0$ contains the first two values, $F_1 = 1$ and $F_0 = 0$
In [27]: u_zero = Matrix([1, 0])
         u_100 = A ** 100 * u_zero
         u_100 # u_100 holds F_101 (top) and F_100 (bottom), counting from F_0 = 0
In [28]: u_four = A ** 4 * u_zero
         u_four # u_4 holds F_5 = 5 (top) and F_4 = 3 (bottom)
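The same vectors can be produced without sympy by raising the 2×2 matrix to a power with exact Python integers. This is a minimal sketch assuming the indexing $F_0 = 0$, $F_1 = 1$; the helper names `mat_mul`, `mat_pow` and `fib_pair` are illustrative only, and exponentiation by squaring is a swapped-in technique to keep the number of products small.

```python
def mat_mul(X, Y):
    """Product of two 2x2 integer matrices."""
    return [[X[0][0] * Y[0][0] + X[0][1] * Y[1][0],
             X[0][0] * Y[0][1] + X[0][1] * Y[1][1]],
            [X[1][0] * Y[0][0] + X[1][1] * Y[1][0],
             X[1][0] * Y[0][1] + X[1][1] * Y[1][1]]]

def mat_pow(M, k):
    """M ** k by repeated squaring (k >= 0)."""
    result = [[1, 0], [0, 1]]
    while k > 0:
        if k & 1:
            result = mat_mul(result, M)
        M = mat_mul(M, M)
        k >>= 1
    return result

def fib_pair(k):
    """Return u_k = (F_{k+1}, F_k), assuming F_0 = 0 and F_1 = 1."""
    P = mat_pow([[1, 1], [1, 0]], k)
    # u_k = A^k u_0 with u_0 = (1, 0), i.e. the first column of A^k
    return P[0][0], P[1][0]

print(fib_pair(4))
print(fib_pair(100))
```

The two printed pairs should agree with the sympy vectors `u_four` and `u_100` computed above.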
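Equations and outputs for the Fibonacci example above
$$ F_{k+2} = F_{k+1} + F_{k} $$
$$ \underline{u}_{k} = \begin{bmatrix} F_{k+1} \\ F_{k} \end{bmatrix} $$
$$ \underline{u}_{k+1} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} F_{k+1} \\ F_{k} \end{bmatrix} = \begin{bmatrix} F_{k+1} + F_{k} \\ F_{k+1} \end{bmatrix} $$
$$ \underline{u}_{k+1} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} \underline{u}_{k} $$
Out[22]: $$ \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix} $$
Out[23]: $$ \left\{ \frac{1}{2} + \frac{\sqrt{5}}{2} : 1, \quad -\frac{\sqrt{5}}{2} + \frac{1}{2} : 1 \right\} $$
Out[24]: $$ \left[ \left( \frac{1}{2} + \frac{\sqrt{5}}{2}, \quad 1, \quad \left[ \begin{bmatrix} -\frac{1}{-\frac{\sqrt{5}}{2} + \frac{1}{2}} \\ 1 \end{bmatrix} \right] \right), \quad \left( -\frac{\sqrt{5}}{2} + \frac{1}{2}, \quad 1, \quad \left[ \begin{bmatrix} -\frac{1}{\frac{1}{2} + \frac{\sqrt{5}}{2}} \\ 1 \end{bmatrix} \right] \right) \right] $$
Out[26]: $$ \begin{bmatrix} \frac{1}{2} + \frac{\sqrt{5}}{2} & 0 \\ 0 & -\frac{\sqrt{5}}{2} + \frac{1}{2} \end{bmatrix} $$
$$ \underline{u}_{k} = A^{k}\underline{u}_{0} = S \Lambda^{k} \underline{c} $$
Out[27]: $$ \begin{bmatrix} 573147844013817084101 \\ 354224848179261915075 \end{bmatrix} $$
Out[28]: $$ \begin{bmatrix} 5 \\ 3 \end{bmatrix} $$
Example problems
Example problem 1
Find an equation for $C^k$ where C is given by the following matrix
$$ C = \begin{bmatrix} 2b - a & a - b \\ 2b - 2a & 2a - b \end{bmatrix} $$
Calculate $C^{100}$ when a=b=-1
Solution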
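In [29]: a, b, k = symbols('a b k')
In [30]: C = Matrix([[2 * b - a, a - b], [2 * b - 2 * a, 2 * a - b]])
         C
Out[30]: $$ \begin{bmatrix} -a + 2b & a - b \\ -2a + 2b & 2a - b \end{bmatrix} $$
We remember the following
$$ A^{k} = S \Lambda^{k} S^{-1} $$
Where Λ is denoted by the computer variable D
In [31]: S, D = C.diagonalize()
In [32]: S
Out[32]: $$ \begin{bmatrix} -\frac{2a - 2b}{-3a + 3b + \sqrt{\left( a - b \right)^{2}}} & \frac{2a - 2b}{3a - 3b + \sqrt{\left( a - b \right)^{2}}} \\ 1 & 1 \end{bmatrix} $$
Python is not always good at simplifying these
If you look at it carefully you will note the following
In [33]: S = Matrix([[1, Rational(1, 2)], [1, 1]])
         S
Out[33]: $$ \begin{bmatrix} 1 & \frac{1}{2} \\ 1 & 1 \end{bmatrix} $$
In [34]: D
Out[34]: $$ \begin{bmatrix} \frac{a}{2} + \frac{b}{2} - \frac{1}{2}\sqrt{\left( a - b \right)^{2}} & 0 \\ 0 & \frac{a}{2} + \frac{b}{2} + \frac{1}{2}\sqrt{\left( a - b \right)^{2}} \end{bmatrix} $$
In [35]: D = Matrix([[b, 0], [0, a]])
         D
Out[35]: $$ \begin{bmatrix} b & 0 \\ 0 & a \end{bmatrix} $$
For the values given, we have the following
In [36]: C = Matrix([[-1, 0], [0, -1]])
         C
Out[36]: $$ \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} $$
In [37]: S, D = C.diagonalize()
In [38]: D
Out[38]: $$ \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} $$
In [39]: D ** 100
Out[39]: $$ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} $$
In [40]: S * (D ** 100) * S.inv()
Out[40]: $$ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} $$
Doing the same, but with eigenvalues and eigenvectors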
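In [41]: C = Matrix([[2 * b - a, a - b], [2 * b - 2 * a, 2 * a - b]])
         C
Out[41]: $$ \begin{bmatrix} -a + 2b & a - b \\ -2a + 2b & 2a - b \end{bmatrix} $$
In [42]: C.eigenvals()
Out[42]: $$ \left\{ \frac{a}{2} + \frac{b}{2} - \frac{1}{2}\sqrt{\left( a - b \right)^{2}} : 1, \quad \frac{a}{2} + \frac{b}{2} + \frac{1}{2}\sqrt{\left( a - b \right)^{2}} : 1 \right\} $$
This simplifies to $\lambda_1 = b$ and $\lambda_2 = a$
That makes Λ (or D) the following
In [43]: D = Matrix([[b, 0], [0, a]])
         D
Out[43]: $$ \begin{bmatrix} b & 0 \\ 0 & a \end{bmatrix} $$
In [44]: C.eigenvects() # The solution is two tuples, with each being eigenvalue, multiplicity, eigenvector
Out[44]: $$ \left[ \left( \frac{a}{2} + \frac{b}{2} - \frac{1}{2}\sqrt{\left( a - b \right)^{2}}, \quad 1, \quad \left[ \begin{bmatrix} -\frac{a - b}{-\frac{3a}{2} + \frac{3b}{2} + \frac{1}{2}\sqrt{\left( a - b \right)^{2}}} \\ 1 \end{bmatrix} \right] \right), \quad \left( \frac{a}{2} + \frac{b}{2} + \frac{1}{2}\sqrt{\left( a - b \right)^{2}}, \quad 1, \quad \left[ \begin{bmatrix} -\frac{a - b}{-\frac{3a}{2} + \frac{3b}{2} - \frac{1}{2}\sqrt{\left( a - b \right)^{2}}} \\ 1 \end{bmatrix} \right] \right) \right] $$
This simplifies to the following eigenvector matrix S
In [45]: S = Matrix([[1, Rational(1, 2)], [1, 1]])
         S
Out[45]: $$ \begin{bmatrix} 1 & \frac{1}{2} \\ 1 & 1 \end{bmatrix} $$
We can see if we can get back to C
In [46]: S * D * S.inv()
Out[46]: $$ \begin{bmatrix} -a + 2b & a - b \\ -2a + 2b & 2a - b \end{bmatrix} $$
In [47]: S * D * S.inv() == C
Out[47]: True
Python won't compute $D^k$ for you, but it's easy to do yourself
In [48]: D = Matrix([[b ** k, 0], [0, a ** k]])
         D
Out[48]: $$ \begin{bmatrix} b^{k} & 0 \\ 0 & a^{k} \end{bmatrix} $$
Now we can compute $S \Lambda^{k} S^{-1}$
In [49]: S * D * S.inv()
Out[49]: $$ \begin{bmatrix} -a^{k} + 2b^{k} & a^{k} - b^{k} \\ -2a^{k} + 2b^{k} & 2a^{k} - b^{k} \end{bmatrix} $$
Placing the given values into this equation will give you the same solution for $C^{100}$ as above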
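The closed form for $C^k$ can be spot-checked against repeated multiplication for concrete numbers. This is a sketch in plain Python; the test values a = 3, b = 2, k = 5 are arbitrary choices for illustration, and the function names are hypothetical helpers rather than anything from sympy.

```python
def matmul(X, Y):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(X[i][m] * Y[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]

def C_matrix(a, b):
    """The matrix C from example problem 1."""
    return [[2 * b - a, a - b], [2 * b - 2 * a, 2 * a - b]]

def C_power_direct(a, b, k):
    """C ** k by k successive multiplications."""
    result = [[1, 0], [0, 1]]
    for _ in range(k):
        result = matmul(result, C_matrix(a, b))
    return result

def C_power_formula(a, b, k):
    """C ** k from S * D**k * S.inv(), as derived in the text."""
    return [[-a**k + 2 * b**k, a**k - b**k],
            [-2 * a**k + 2 * b**k, 2 * a**k - b**k]]

print(C_power_direct(3, 2, 5) == C_power_formula(3, 2, 5))
print(C_power_formula(-1, -1, 100))
```

With a = b = -1 and k = 100 the formula should reduce to the identity matrix, in agreement with the earlier result.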
2015/03/28, 1:31 PMI_10_Independence_Spanning_Basis_Dimension
Page 1 of 4http://localhost:8888/nbconvert/html/Dropbox/Python/Mathemati…10_Independence_Spanning_Basis_Dimension.ipynb?download=false
This notebook is part of lecture 9 Independence, basis and dimension in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())
In [2]: #import numpy as np
        from sympy import init_printing, Matrix, symbols
        #import matplotlib.pyplot as plt
        #import seaborn as sns
        #from IPython.display import Image
        from warnings import filterwarnings
        init_printing(use_latex = 'mathjax') # Pretty Latex printing to the screen
        #%matplotlib inline
        filterwarnings('ignore')
Independence
Spanning
Basis
Dimension
Independence
Vectors are linearly independent if
No combination of these vectors results in the zero vector (except the zero combination)
$$ c_{1}x_{1} + c_{2}x_{2} + \dots + c_{n}x_{n} \ne 0 \quad \text{unless every } c_{i} = 0 $$
In 2-space, this means that two vectors should not be on the same line through the origin
In 3-space they should not be on the same line through the origin or in the same plane through the origin
In higher-dimensional space they should not be on the same line through the origin or in the same hyperplane through the origin
If they are independent by the constraints above, only the zero combination of them will result in the zero vector
If the vectors are the columns of a matrix and there are vectors in its nullspace (apart from the zero vector), then there is a linear combination that gives zero and the vectors are not linearly independent
In [3]: A = Matrix([[1, 2, 4], [3, 1, 4]])
        A
Out[3]: $$ \begin{bmatrix} 1 & 2 & 4 \\ 3 & 1 & 4 \end{bmatrix} $$
Here we will have a rank of 2 (2 pivots), 3 unknowns and 2 rows
Thus, r = m (full row rank)
We are left with n - r free variables, i.e. 3 - 2 = 1, and will have one vector in the nullspace
In [4]: A.rref()
Out[4]: $$ \left( \begin{bmatrix} 1 & 0 & \frac{4}{5} \\ 0 & 1 & \frac{8}{5} \end{bmatrix}, \quad \left[ 0, 1 \right] \right) $$
In [5]: A.nullspace()
Out[5]: $$ \left[ \begin{bmatrix} -\frac{4}{5} \\ -\frac{8}{5} \\ 1 \end{bmatrix} \right] $$
Another way to state independence
Consider the columns of the matrix A as vectors $v_1, v_2, \dots, v_n$
If r = n then the nullspace only contains the zero vector and the column vectors are linearly independent
Spanning
A set of vectors spans a subspace if all their linear combinations (including the zero combination) fill that subspace (in this instance a column space)
We are particularly interested in a set of (column) vectors (in a matrix) that are linearly independent and span a subspace
This leads us to the next topic, basis
Basis
A basis is a set of vectors (in a space W) with the properties
They are linearly independent
They span the space (linear combinations of them fill the space)
Up until now we looked at columns in a matrix A
It is more common in textbooks to look at a space first and ask about basis vectors, spanning vectors, dimension, etc
So let's look at $\mathbb{R}^3$
The obvious set of basis vectors is
$$ \hat{i}, \quad \hat{j}, \quad \hat{k} $$
What about
$$ \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}, \quad \begin{bmatrix} 2 \\ 2 \\ 5 \end{bmatrix} $$
So, are they linearly independent and do they span $\mathbb{R}^3$?
In [6]: A = Matrix([[1, 2], [1, 2], [2, 5]])
        A
Out[6]: $$ \begin{bmatrix} 1 & 2 \\ 1 & 2 \\ 2 & 5 \end{bmatrix} $$
Here we will have r = 2, n = 2 and thus (n - r = 0) a nullspace containing only the zero vector
In [7]: A.nullspace()
Out[7]: $$ \left[ \right] $$
In [8]: A.rref()
Out[8]: $$ \left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad \left[ 0, 1 \right] \right) $$
For now, our intuition is that they will not span $\mathbb{R}^3$
This intuition is correct, because all their linear combinations will only fill a plane through the origin
The zero combination does result in the zero vector, though, so they do fill a subspace of $\mathbb{R}^3$
Some textbooks refer to this as V = $\mathbb{R}^n$, with W a subspace of V
If we add a column vector that is a linear combination of these, it will also fall in the plane and thus not be linearly independent of them (there will be a vector in the nullspace other than the zero vector)
In [9]: A = Matrix([[1, 2, 3], [1, 2, 3], [2, 5, 7]])
        A.nullspace()
Out[9]: $$ \left[ \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix} \right] $$
In [10]: A.rref()
Out[10]: $$ \left( \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \quad \left[ 0, 1 \right] \right) $$
Indeed, we have a column without a pivot and thus a free variable
Let's try a different third column, such that we have
In [11]: A = Matrix([[1, 2, 3], [1, 2, 3], [2, 5, 8]])
         A.rref()
Out[11]: $$ \left( \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix}, \quad \left[ 0, 1 \right] \right) $$
Again, a column without a pivot and sure enough, we'll find a vector (other than the zero vector) in the nullspace
In [12]: A.nullspace()
Out[12]: $$ \left[ \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} \right] $$
The special case of a square matrix
If we end up with a square matrix, we need only look at its determinant, i.e. ask whether it is invertible
In [13]: A.det() # .det() calculates the determinant
Out[13]: 0
Indeed the determinant is zero as expected
Dimension
Given a (sub)space, every basis for that (sub)space has the same number of vectors (there is usually more than one basis for every (sub)space)
This number is called the dimension of the (sub)space
Important point to remember
The (sub)space which a set of (column) vectors (matrix of coefficients, A) spans is the set of possible b-values
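The independence tests above all come down to comparing the rank with the number of vectors. The following is a minimal sketch of that idea in plain Python with exact fractions, assuming Gaussian elimination as the way to count pivots; `rank` and `independent` are hypothetical helper names, not sympy functions.

```python
from fractions import Fraction

def rank(M):
    """Number of pivots found by Gaussian elimination (exact arithmetic)."""
    rows = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        # look for a non-zero entry in column c at or below row r
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def independent(vectors):
    """Vectors are independent when the rank equals their count."""
    return rank(vectors) == len(vectors)

v1, v2 = [1, 1, 2], [2, 2, 5]
print(independent([v1, v2]))                 # the two vectors from the text
print(independent([v1, v2, [3, 3, 7]]))      # third vector = v1 + v2
print(independent([v1, v2, [3, 3, 8]]))      # third vector = -v1 + 2*v2
```

The vectors are passed as rows here, which is harmless because a matrix and its transpose have the same rank.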
More examples
Example problem
Consider the column space
In [14]: A = Matrix([[1, 2, 3, 1], [1, 1, 2, 1], [1, 2, 3, 1]])
In [15]: A
Out[15]: $$ \begin{bmatrix} 1 & 2 & 3 & 1 \\ 1 & 1 & 2 & 1 \\ 1 & 2 & 3 & 1 \end{bmatrix} $$
There are n = 4 unknowns and m = 3 equations
We note that column 1 = column 4
We note that with 4 unknowns the solution vectors x live in $\mathbb{R}^4$
In essence, there are at most three independent columns, thus the columns of the matrix cannot be a basis for $\mathbb{R}^4$
It is possible for them to span the column space (don't confuse the column space with the matrix here)
In [16]: A.nullspace()
Out[16]: $$ \left[ \begin{bmatrix} -1 \\ -1 \\ 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} -1 \\ 0 \\ 0 \\ 1 \end{bmatrix} \right] $$
In [17]: A.rref()
Out[17]: $$ \left( \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad \left[ 0, 1 \right] \right) $$
As we can see here, columns three and four correspond to free variables, i.e. they have no pivots
$$ x_{1} + 0x_{2} + x_{3} + x_{4} = 0 \\ 0x_{1} + 1x_{2} + x_{3} + 0x_{4} = 0 $$
$$ x_{4} = c_{2} \\ x_{3} = c_{1} \\ \therefore x_{2} = -c_{1} \\ \therefore x_{1} = -c_{1} - c_{2} $$
$$ \begin{bmatrix} x_{1} \\ x_{2} \\ x_{3} \\ x_{4} \end{bmatrix} = \begin{bmatrix} -c_{1} - c_{2} \\ -c_{1} \\ c_{1} \\ c_{2} \end{bmatrix} = \begin{bmatrix} -c_{1} \\ -c_{1} \\ c_{1} \\ 0 \end{bmatrix} + \begin{bmatrix} -c_{2} \\ 0 \\ 0 \\ c_{2} \end{bmatrix} = c_{1}\begin{bmatrix} -1 \\ -1 \\ 1 \\ 0 \end{bmatrix} + c_{2}\begin{bmatrix} -1 \\ 0 \\ 0 \\ 1 \end{bmatrix} $$
The rank of the matrix is two (it is the number of pivot columns)
The column space of this matrix thus has two basis vectors (column vectors 1 and 2) and we say the dimension of this space is two
Remember, a matrix has a rank, which is the dimension of its column space (the column space representing the space 'produced' by the column vectors)
We talk about the rank of a matrix, rank(A), and the column space of a matrix, C(A)
In summary we have two basis vectors above (they span a space)
Any two vectors in this space that are not linearly dependent will also span this space; they can't help but to
dim C(A) = r
A basis for the nullspace will have n - r vectors (the dimension of the nullspace equals the number of free variables)
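The special solutions can be verified by substituting them back into Ax = 0. This is a sketch in plain Python; `matvec` and `combo` are illustrative helper names, and the range of coefficients tried is an arbitrary choice.

```python
A = [[1, 2, 3, 1],
     [1, 1, 2, 1],
     [1, 2, 3, 1]]

s1 = [-1, -1, 1, 0]   # special solution for free variable x3
s2 = [-1, 0, 0, 1]    # special solution for free variable x4

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def combo(c1, c2):
    """The general nullspace vector c1 * s1 + c2 * s2."""
    return [c1 * p + c2 * q for p, q in zip(s1, s2)]

# every combination of the special solutions should be sent to zero
results = [matvec(A, combo(c1, c2))
           for c1 in range(-2, 3) for c2 in range(-2, 3)]
print(all(r == [0, 0, 0] for r in results))
print(combo(1, 1))
```

Two special solutions for n - r = 4 - 2 = 2 free variables is exactly the count stated above.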
2015/03/28, 1:38 PMI_11_Subspaces
Page 1 of 5http://localhost:8888/nbconvert/html/Dropbox/Python/Mathematic…a_18_06_Gilbert_Strang/ZIP/I_11_Subspaces.ipynb?download=false
This notebook is part of lecture 10 The four fundamental subspaces in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper
In [ ]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())
In [ ]: #import numpy as np
        from sympy import init_printing, Matrix, symbols
        #import matplotlib.pyplot as plt
        #import seaborn as sns
        #from IPython.display import Image
        from warnings import filterwarnings
        init_printing(use_latex = 'mathjax') # Pretty Latex printing to the screen
        #%matplotlib inline
        filterwarnings('ignore')
The four fundamental subspaces
Introducing the matrix space
The four fundamental subspaces
Columnspace, C(A)
Nullspace, N(A)
Rowspace
All linear combinations of the rows of A
These are all the linear combinations of the columns of $A^T$, $C(A^T)$
Nullspace of $A^T$, $N(A^T)$ (the left nullspace of A)
Where are these spaces for a matrix $A_{m \times n}$?
C(A) is in $\mathbb{R}^m$
N(A) is in $\mathbb{R}^n$
$C(A^T)$ is in $\mathbb{R}^n$
$N(A^T)$ is in $\mathbb{R}^m$
Calculating basis and dimension
For C(A)
The basis vectors are the pivot columns
The dimension is the rank r
For N(A)
The basis vectors are the special solutions (one for every free variable, n - r of them)
The dimension is n - r
For $C(A^T)$
If A undergoes row reduction to row echelon form (R), then C(R) ≠ C(A), but rowspace(R) = rowspace(A) (or $C(R^T) = C(A^T)$)
A basis for the rowspace of A (or R) is the first r rows of R
So we row reduce A and take the pivot rows and transpose them
The dimension is also equal to the rank r
For $N(A^T)$
It is also called the left nullspace, because the vector ends up on the left (as seen below)
Here we have
$$ A^{T}\underline{y} = \underline{0} \\ \underline{y}^{T}\left( A^{T} \right)^{T} = \underline{0}^{T} \\ \underline{y}^{T}A = \underline{0}^{T} $$
A basis is (again) given by special solutions, this time of $A^T\underline{y} = \underline{0}$ (after row reduction of $A^T$)
The dimension is m - r
Example problems
Consider this example matrix and calculate the bases and dimension for all four fundamental spaces
In [ ]: A = Matrix([[1, 2, 3, 1], [1, 1, 2, 1], [1, 2, 3, 1]])
        # We note that rows 1 and 3 are identical, that column 3 is
        # the sum of columns 1 and 2, and that column 1 equals column 4
        A
Out[ ]: $$ \begin{bmatrix} 1 & 2 & 3 & 1 \\ 1 & 1 & 2 & 1 \\ 1 & 2 & 3 & 1 \end{bmatrix} $$
Columnspace
In [ ]: A.rref() # Remember that the columnspace has the pivot columns (of A) as a basis
Out[ ]: $$ \left( \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad \left[ 0, 1 \right] \right) $$
The pivots are in columns 1 and 2, and since C(R) ≠ C(A) we take those columns from A itself, so the basis is thus:
$$ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} $$
It is indeed in $\mathbb{R}^3$ (rows of A = m = 3, i.e. each column vector is in 3-space or has 3 components)
The rank (number of columns with pivots) is 2, thus dim C(A) = 2
Nullspace
In [ ]: A.nullspace() # Calculating the nullspace vectors
Out[ ]: $$ \left[ \begin{bmatrix} -1 \\ -1 \\ 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} -1 \\ 0 \\ 0 \\ 1 \end{bmatrix} \right] $$
So, indeed the basis is in $\mathbb{R}^4$ (A has n = 4 columns)
In [ ]: A.rref() # No pivots for columns 3 and 4
Out[ ]: $$ \left( \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad \left[ 0, 1 \right] \right) $$
The dimension is two (there are 2 column vectors, which is indeed n - r = 4 - 2 = 2)
Rowspace $C(A^T)$
So we are looking for a basis for the columns of $A^T$, i.e. the rows of A
In [ ]: A.rref()
Out[ ]: $$ \left( \begin{bmatrix} 1 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad \left[ 0, 1 \right] \right) $$
The pivot rows are rows 1 and 2
We take them and transpose them
$$ \begin{bmatrix} 1 \\ 0 \\ 1 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \end{bmatrix} $$
As stated above, it is in $\mathbb{R}^4$
The dimension is equal to the rank, r = 2
Nullspace of $A^T$
In [ ]: A.transpose().nullspace() # The left nullspace is the nullspace of the transpose
Out[ ]: $$ \left[ \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \right] $$
Which is indeed in $\mathbb{R}^3$
The dimension is 1, since m - r = 3 - 2 = 1 (remember that the rank is the number of pivot columns)
Consider this example matrix (in LU form) and calculate the bases and dimension for all four fundamental spaces
In [ ]: L = Matrix([[1, 0, 0], [2, 1, 0], [-1, 0, 1]])
        U = Matrix([[5, 0, 3], [0, 1, 1], [0, 0, 0]])
        A = L * U
        L, U, A
Out[ ]: $$ \left( \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 5 & 0 & 3 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 5 & 0 & 3 \\ 10 & 1 & 7 \\ -5 & 0 & -3 \end{bmatrix} \right) $$
Columnspace of A
In [ ]: A.rref()
Out[ ]: $$ \left( \begin{bmatrix} 1 & 0 & \frac{3}{5} \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \quad \left[ 0, 1 \right] \right) $$
The pivots are in columns 1 and 2, so taking those columns from A itself the basis is thus:
$$ \begin{bmatrix} 5 \\ 10 \\ -5 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} $$
Another basis would be the pivot columns of L:
$$ \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} $$
It is in $\mathbb{R}^3$, since m = 3
It has a rank of 2 (two pivot columns)
Since the dimension of the columnspace is equal to the rank, dim C(A) = 2
Note that it is also equal to the number of pivot columns in U
Nullspace of A
In [ ]: A.nullspace()
Out[ ]: $$ \left[ \begin{bmatrix} -\frac{3}{5} \\ -1 \\ 1 \end{bmatrix} \right] $$
The nullspace is in $\mathbb{R}^3$, since n = 3
The basis is the special solution(s), which is one column vector for every free variable
Since we only have a single free variable, we have a single nullspace column vector
This fits in with the fact that there need to be n - r of them
It can also be calculated by taking U, setting the free variable to 1 and solving for the other variables by setting each row equal to zero
The dimension of the nullspace is also 1 (n - r, i.e. a single column)
It is also the number of free variables
The rowspace
This is the columnspace of $A^T$
Don't take the transpose first!
Row reduce, identify the rows with pivots and transpose them
In [ ]: A.rref()
Out[ ]: $$ \left( \begin{bmatrix} 1 & 0 & \frac{3}{5} \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \quad \left[ 0, 1 \right] \right) $$
The basis can also be written down by identifying the rows with pivots in U and writing them down as columns (taking their transpose)
$$ \begin{bmatrix} 5 \\ 0 \\ 3 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} $$
It is in $\mathbb{R}^3$, since n = 3
The rank r = 2, which is equal to the dimension, i.e. dim $C(A^T)$ = 2
The nullspace of $A^T$
In [ ]: A.transpose().nullspace()
Out[ ]: $$ \left[ \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \right] $$
It is indeed in $\mathbb{R}^3$, since m = 3
A good way to do it is to take the inverse of L, such that $L^{-1}A = U$
Now the zero row in U (the row without a pivot) is row three
Take the corresponding row in $L^{-1}$ and transpose it
The dimension is m - r = 3 - 2 = 1
The matrix space
The set of all square matrices of a given size is also a 'vector' space, because these matrices obey the vector space rules of addition and scalar multiplication
Subspaces (of same) would include
Upper triangular matrices
Symmetric matrices
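Returning to the LU example above, the shortcut for the left nullspace (take the row of $L^{-1}$ that lines up with the zero row of U) can be checked by hand-sized arithmetic. This is a sketch in plain Python; the entries of `L_inv` were worked out by hand for this illustration and are verified inside the code, and `matmul` is a hypothetical helper.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][m] * Y[m][j] for m in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

L = [[1, 0, 0], [2, 1, 0], [-1, 0, 1]]
U = [[5, 0, 3], [0, 1, 1], [0, 0, 0]]
A = matmul(L, U)

# Inverse of L, computed by hand for this sketch and checked below
L_inv = [[1, 0, 0], [-2, 1, 0], [1, 0, 1]]
identity = matmul(L_inv, L)     # should be the 3x3 identity
reduced = matmul(L_inv, A)      # should reproduce U

y = L_inv[2]                    # row of L_inv matching the zero row of U
yTA = matmul([y], A)[0]         # y^T A, should be the zero row
print(A, reduced == U, y, yTA)
```

The vector y found this way matches the sympy result of `A.transpose().nullspace()` above.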
2015/03/28, 1:35 PMI_12_Matrix_spaces
Page 1 of 3http://localhost:8888/nbconvert/html/Dropbox/Python/Mathematics…_06_Gilbert_Strang/ZIP/I_12_Matrix_spaces.ipynb?download=false
This notebook is part of lecture 11 Matrix spaces, rank 1, small world graphs in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())
In [2]: #import numpy as np
        from sympy import init_printing, Matrix, symbols
        #import matplotlib.pyplot as plt
        #import seaborn as sns
        #from IPython.display import Image
        from warnings import filterwarnings
        init_printing(use_latex = 'mathjax') # Pretty Latex printing to the screen
        #%matplotlib inline
        filterwarnings('ignore')
Matrix spaces
New vector / matrix spaces
Square matrices
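Consider M to be all 3×3 matrices (with real elements)
Subspaces would be:
Upper or lower triangular matrices
Symmetric matrices
Basis would be:
$$ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$
The dimension would be 9
For upper and lower triangular matrices the dimension would be 6 and the basis (shown here for the upper triangular case):
$$ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$
For symmetric matrices the dimension would also be six (knowing the diagonal and the entries on one of the two sides determines the matrix)
The triangular cases are unusual in that the basis for the subspace is contained in the standard basis of the 3×3 matrices M; a basis for the symmetric matrices also needs sums of two of the standard basis matrices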
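Other sets of square matrices and whether they are subspaces of M
The intersection of symmetric and upper triangular matrices (that is symmetric AND upper triangular, S∩U)
These are the diagonal matrices
The dimension is 3
The basis is
$$ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} $$
The union of symmetric and upper triangular matrices (that is symmetric OR upper triangular, S∪U)
It is NOT a subspace (the sum of a symmetric and an upper triangular matrix need not be either)
The addition (sum) of symmetric and upper triangular matrices, S+U
It IS a subspace
It is actually all 3×3 matrices
The dimension is 9
This gives the equation: dim(S) + dim(U) = dim(S∩U) + dim(S+U) = 12 (that is 6 + 6 = 3 + 9)
Example problems
Example problem 1
Show that the set of 2×3 matrices whose nullspace contains the column vector below is a vector subspace and find a basis for it
$$ \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} $$
Solution
In essence we take two matrices A and B in the set, so that we have the following
$$ A\begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} $$
$$ \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} $$
... and ...
$$ B\begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} $$
That the sum stays in the set can be shown by addition:
$$ \left( A + B \right)\begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} = A\begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} + B\begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} $$
Therefore the set is closed under addition (the column vector remains in the nullspace of the sum)
We also need to look at scalar multiplication (if we multiply a matrix in the set by a scalar, does it remain in the set)
$$ \left( cA \right)\begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} = c\left( A\begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} \right) = c\begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} $$
Since both hold, the set is a vector subspace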
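The two closure properties can be tried out on concrete members of the set. This is a minimal sketch in plain Python; the matrices A and B and the scalar c are arbitrary illustrative choices that satisfy the condition, and `matvec`, `add` and `scale` are hypothetical helpers.

```python
v = [2, 1, 1]

# Two illustrative 2x3 matrices whose nullspace contains v
A = [[1, 0, -2], [0, 1, -1]]
B = [[0, 1, -1], [1, 0, -2]]

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def add(X, Y):
    """Entrywise sum of two matrices of the same shape."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(X, Y)]

def scale(c, X):
    """Scalar multiple of a matrix."""
    return [[c * p for p in row] for row in X]

c = 7   # arbitrary scalar
print(matvec(A, v), matvec(B, v))
print(matvec(add(A, B), v))      # closure under addition
print(matvec(scale(c, A), v))    # closure under scalar multiplication
```

A numerical check on two matrices is of course not a proof; the proof is the algebra above, which the numbers merely illustrate.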
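Example problem 2
Find a basis for the subspace above
Solution
Let's look at the first row:
$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \end{bmatrix} \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} = 0$$
$$2 a_{11} + a_{12} + a_{13} = 0$$
$$a_{13} = -2 a_{11} - a_{12}$$
From this we can make the following row vectors
$$\begin{aligned} \begin{bmatrix} a_{11} & a_{12} & a_{13} \end{bmatrix} &= \begin{bmatrix} a_{11} & a_{12} & (-2 a_{11} - a_{12}) \end{bmatrix} \\ &= \begin{bmatrix} a_{11} & 0 & -2 a_{11} \end{bmatrix} + \begin{bmatrix} 0 & a_{12} & -a_{12} \end{bmatrix} \\ &= a_{11} \begin{bmatrix} 1 & 0 & -2 \end{bmatrix} + a_{12} \begin{bmatrix} 0 & 1 & -1 \end{bmatrix} \end{aligned}$$
The same holds for the second row, so from this we can construct four basis matrices:
$$\begin{bmatrix} 1 & 0 & -2 \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 & 0 \\ 1 & 0 & -2 \end{bmatrix}, \quad \begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & -1 \end{bmatrix}$$
Example problem 3
What about the set of 2×3 matrices whose column space contains the following column vector?
$$\begin{bmatrix} 2 \\ 1 \end{bmatrix}$$
Solution
Well, any subspace must contain the zero matrix
$$\begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$
The column space of the zero matrix does not contain the above column vector, so the zero matrix is not in the set, which is therefore not a subspace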
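The basis found in example problem 2 can be checked numerically. The following is only a minimal sketch, assuming sympy is available as in the rest of these notebooks; the names `basis`, `in_set`, and `flattened` are illustrative and do not appear in the original notebook.

```python
from sympy import Matrix, zeros

# The column vector that must lie in the nullspace of every matrix in the set
v = Matrix([2, 1, 1])

# The four candidate basis matrices from example problem 2
basis = [
    Matrix([[1, 0, -2], [0, 0, 0]]),
    Matrix([[0, 1, -1], [0, 0, 0]]),
    Matrix([[0, 0, 0], [1, 0, -2]]),
    Matrix([[0, 0, 0], [0, 1, -1]]),
]

# Each basis matrix must send v to the zero vector
in_set = all(M * v == zeros(2, 1) for M in basis)

# Flatten each 2x3 matrix into a row of 6 entries;
# full row rank (4) means the four matrices are linearly independent
flattened = Matrix([list(M) for M in basis])
rank = flattened.rank()
```

If `in_set` is true and `rank` equals 4, the four matrices are independent members of the set, which is consistent with a 4-dimensional subspace.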
2015/03/28, 1:43 PMI_13_Graphs_Incidence_matrices_Kirchhoff_laws
Page 1 of 7http://localhost:8888/nbconvert/html/Dropbox/Python/Mathematics…_Graphs_Incidence_matrices_Kirchhoff_laws.ipynb?download=false
This notebook is part of lecture 12 Graphs, networks, and incidence matrices in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper
Head of Acute Care SurgeryGroote Schuur HospitalUniversity Cape TownEmail me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, symbols, Matrix
        from warnings import filterwarnings
        from IPython.display import Image
In [3]: init_printing(use_latex = 'mathjax')
        filterwarnings('ignore')
Graphs and networks
Incidence matrices
Kirchhoff's laws
This lecture is about the application of matrices
Graphs and networks
In this instance we refer to nodes and their connections, called edges
Consider the graph below:
In [4]: Image(filename = 'Graph1.png')
Out[4]: [image: Graph1.png]
We will call the number of nodes n (columns), in this case n = 4
The number of edges (connections) will be called m (rows), with m = 5 in this case
This will give us an m×n = 5×4 matrix
We will have to give a direction to every edge
The incidence matrix
This corresponds to the graph above
In [5]: A = Matrix([[-1, 1, 0, 0], [0, -1, 1, 0], [-1, 0, 1, 0], [-1, 0, 0, 1], [0, 0, -1, 1]])
        A
        # For each row (edge) look only at that edge (line)
        # In the case of row (edge, line) 1, the arrow points away from node 1, hence the first -1 in the matrix
        # The arrow points towards node 2, hence the 1
        # It does not touch nodes 3 and 4, hence the 0's
Out[5]: $$\begin{bmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ -1 & 0 & 1 & 0 \\ -1 & 0 & 0 & 1 \\ 0 & 0 & -1 & 1 \end{bmatrix}$$
Edges 1, 2, and 3 form a loop
Notice that for the first loop (edges 1, 2, and 3) the corresponding third row is a linear combination of rows 1 and 2
Intuitively, you can see that you can reach node 3 from node 1 by a combination of edges (rows) 1 and 2
In [6]: A.rref()
Out[6]: $$\left( \begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad [0, 1, 2] \right)$$
We note that we have three pivot columns, hence a rank, r = 3
We have one column without a pivot and will thus have one basis vector in the nullspace (n - r = 4 - 3 = 1)
In [7]: A.nullspace()
Out[7]: $$\left[ \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \right]$$
This subspace is one-dimensional and consists of all scalar multiples of this vector
The meaning in our example is that nothing will happen when the solutions fall on this line in 4-dimensional space, i.e. no current will flow
If you think of the solution x and every component of x being a potential at a node, the matrix multiplication Ax gives you the potential differences along the edges
The nullspace would then be the solution where all the potential differences are 0
In [8]: x1, x2, x3, x4 = symbols('x1, x2, x3, x4')
In [9]: x_vect = Matrix([x1, x2, x3, x4])
        x_vect
Out[9]: $$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}$$
In [10]: A * x_vect
Out[10]: $$\begin{bmatrix} -x_1 + x_2 \\ -x_2 + x_3 \\ -x_1 + x_3 \\ -x_1 + x_4 \\ -x_3 + x_4 \end{bmatrix}$$
For the nullspace, each row now equals 0 (the potential difference between two nodes)
Let's look at the rowspace and the nullspace of $A^T$
We get a basis for the rowspace by transposing the rows of the reduced row echelon form that contain pivots
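The two observations above (row 3 depends on rows 1 and 2, and constant potentials give zero potential differences) can be checked directly. This is a hedged sketch that re-creates the incidence matrix from the notebook; the variable names `loop_dependent`, `constant_potential`, and `no_differences` are assumptions made for illustration.

```python
from sympy import Matrix, ones, zeros

# Incidence matrix of the 4-node, 5-edge graph used in this notebook
A = Matrix([[-1, 1, 0, 0],
            [0, -1, 1, 0],
            [-1, 0, 1, 0],
            [-1, 0, 0, 1],
            [0, 0, -1, 1]])

# Edge 3 closes the loop formed with edges 1 and 2: row 3 = row 1 + row 2
loop_dependent = A.row(2) == A.row(0) + A.row(1)

# Equal potentials at all nodes give zero potential difference on every edge
constant_potential = ones(4, 1)
no_differences = A * constant_potential == zeros(5, 1)

rank = A.rank()
```

A rank of 3 with 4 columns leaves exactly one free direction, the constant-potential vector.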
In [11]: A_row = Matrix([[1, 0, 0, -1], [0, 1, 0, -1], [0, 0, 1, -1]]).transpose()
         A_row
Out[11]: $$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ -1 & -1 & -1 \end{bmatrix}$$
In [12]: A
Out[12]: $$\begin{bmatrix} -1 & 1 & 0 & 0 \\ 0 & -1 & 1 & 0 \\ -1 & 0 & 1 & 0 \\ -1 & 0 & 0 & 1 \\ 0 & 0 & -1 & 1 \end{bmatrix}$$
In [13]: A.transpose()
Out[13]: $$\begin{bmatrix} -1 & 0 & -1 & -1 & 0 \\ 1 & -1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}$$
In [14]: A.transpose().rref()
Out[14]: $$\left( \begin{bmatrix} 1 & 0 & 1 & 0 & -1 \\ 0 & 1 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \quad [0, 1, 3] \right)$$
Note how the pivot columns are columns 1, 2, and 4
These represent edges 1, 2, and 4
Note (from the graph above) that they are independent, as they do not form a loop
A graph without a loop (with 1 less edge than nodes) is called a tree
$A^T$ has a nullspace of
In [15]: A.transpose().nullspace()
Out[15]: $$\left[ \begin{bmatrix} -1 \\ -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 1 \\ 0 \\ -1 \\ 1 \end{bmatrix} \right]$$
The dimension of the nullspace of $A^T$ is m - r = number of edges minus (number of nodes - 1), which is the number of independent loops
∴ number of nodes - number of edges + number of loops = 1
This is Euler's formula and works for all (connected) graphs
It tells you how many independent loops there are
There is a connection between potentials and currents
With 5 edges we will have 5 currents, which we can represent as a vector y
$$\underline{y} = \begin{bmatrix} y_1 & y_2 & y_3 & y_4 & y_5 \end{bmatrix}^T$$
The relationship between the potential differences and the currents is Ohm's law
Kirchhoff's law
By the way, Kirchhoff's current law is: $A^T y = 0$
We can look at it in the following way
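Euler's formula can be confirmed for this graph by counting the independent loops as the dimension of the nullspace of the transpose. A minimal sketch, assuming the same incidence matrix as above; the names `nodes`, `edges`, `loops`, and `euler` are illustrative only.

```python
from sympy import Matrix

# Incidence matrix: rows are edges, columns are nodes
A = Matrix([[-1, 1, 0, 0],
            [0, -1, 1, 0],
            [-1, 0, 1, 0],
            [-1, 0, 0, 1],
            [0, 0, -1, 1]])

edges, nodes = A.shape
rank = A.rank()

# Each basis vector of N(A^T) corresponds to one independent loop
loops = len(A.T.nullspace())

# Euler's formula: nodes - edges + loops
euler = nodes - edges + loops
```

Here `loops` equals `edges - rank`, so the formula is just a restatement of dim N(A^T) = m - r with r = nodes - 1.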
In [16]: A.transpose()
Out[16]: $$\begin{bmatrix} -1 & 0 & -1 & -1 & 0 \\ 1 & -1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}$$
In [17]: y1, y2, y3, y4, y5 = symbols('y1, y2, y3, y4, y5')
In [18]: y_vect = Matrix([y1, y2, y3, y4, y5])
         y_vect
Out[18]: $$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{bmatrix}$$
In [19]: A.transpose() * y_vect
Out[19]: $$\begin{bmatrix} -y_1 - y_3 - y_4 \\ y_1 - y_2 \\ y_2 + y_3 - y_5 \\ y_4 + y_5 \end{bmatrix}$$
For row 1, setting it equal to 0 and looking at the graph above tells us that current flows out from node 1 on all 3 of these edges
For row 2 (doing the same as above) we note that for node 2 current flows towards it on edge $y_1$ and away from it along edge $y_2$
For row 3 we note that current flows towards node 3 along edges 2 (edge $y_2$) and 3 (edge $y_3$) and away from it along edge 5 (edge $y_5$)
For row 4 we note that current flows towards node 4 along edges 4 (edge $y_4$) and 5 (edge $y_5$)
Look back at the nullspace of $A^T$
The two basis vectors show the flow in current that will allow for NO current to accumulate at a node
In this example, current flowed along the loop of edges 1, 2, and 3 (with nothing along 4 and 5)
The other solution would be current flowing all along the periphery, with nothing along 3
These are the basis vectors of the nullspace
Another valid basis would include flow along the upper loop
Notice that the nullspace is two-dimensional, as (between the 3 flows explained above) one is a linear combination of the other two
Putting it all together
All of the above can be stated as follows
$$\underline{e} = A \underline{x}$$
$$\underline{y} = C \underline{e}$$
$$A^T \underline{y} = \underline{f}$$
Where
e is the potential differences
C relates the potential differences to the currents (Ohm's law)
f is an external current in Kirchhoff's law
This gives us the fundamental equation for applications as stated here
$$A^T C A \underline{x} = \underline{f}$$
These equations are for equilibrium (no Newton's law, no time)
Remember that $A^T A$ is always symmetric
Example problem
Example problem 1
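The three equations can be combined symbolically to see what the matrix $A^T C A$ looks like for the graph above. This is only a sketch under stated assumptions: C is taken to be a diagonal matrix of edge conductances, and the symbols `c1` to `c5` and the name `K` are hypothetical, not part of the original notebook.

```python
from sympy import Matrix, diag, symbols

# Incidence matrix of the 4-node, 5-edge graph
A = Matrix([[-1, 1, 0, 0],
            [0, -1, 1, 0],
            [-1, 0, 1, 0],
            [-1, 0, 0, 1],
            [0, 0, -1, 1]])

# Assumed: one positive conductance per edge on the diagonal of C
c1, c2, c3, c4, c5 = symbols('c1:6', positive=True)
C = diag(c1, c2, c3, c4, c5)

# The matrix of the fundamental equation A^T C A x = f
K = A.T * C * A

# A^T C A inherits symmetry from the diagonal C
symmetric = K.is_symmetric()
```

The first diagonal entry of `K` collects the conductances of the three edges that touch node 1 (edges 1, 3, and 4).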
In [20]: Image(filename = 'Graph2.png')
Out[20]: [image: Graph2.png]
Calculate the incidence matrix A
Calculate the nullspaces of A and $A^T$
Calculate the trace of $A^T A$
Solution
In [21]: A = Matrix([[-1, 1, 0, 0, 0], [0, -1, 1, 0, 0], [-1, 0, 1, 0, 0], [0, -1, 0, 1, 0], [0, 0, 0, -1, 1], [0, 0, 1, 0, -1]])
         A
Out[21]: $$\begin{bmatrix} -1 & 1 & 0 & 0 & 0 \\ 0 & -1 & 1 & 0 & 0 \\ -1 & 0 & 1 & 0 & 0 \\ 0 & -1 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 & 1 \\ 0 & 0 & 1 & 0 & -1 \end{bmatrix}$$
In [22]: A.rref()
Out[22]: $$\left( \begin{bmatrix} 1 & 0 & 0 & 0 & -1 \\ 0 & 1 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \quad [0, 1, 2, 3] \right)$$
We note that we have 4 independent columns
The dimension of the nullspace will be n - r = 5 - 4 = 1
We will let $x_5 = s$, then from the row-reduced echelon form above we have
$$\begin{aligned} x_1 - x_5 &= 0 \\ x_2 - x_5 &= 0 \\ x_3 - x_5 &= 0 \\ x_4 - x_5 &= 0 \end{aligned}$$
$$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{bmatrix} = s \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}$$
In [23]: A.nullspace()
Out[23]: $$\left[ \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \right]$$
It represents the potential differences between all connected nodes being zero: Ax = 0
This means that the potential at all nodes must be the same constant
In [24]: A.transpose().nullspace()
Out[24]: $$\left[ \begin{bmatrix} -1 \\ -1 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ -1 \\ 0 \\ 1 \\ 1 \\ 1 \end{bmatrix} \right]$$
It is of dimension 2, as there are two independent loops
As per Euler's formula
nodes - edges + loops = 1
5 - 6 + 2 = 1
This tells us about the current that needs to flow so as not to accumulate at a node
It therefore indicates the independent loops
It works out beautifully
Look at the two loops and assign flow as per the two vector columns for each edge and you will see perfect flow along either of the two independent loops, with no current accumulating at any node
We could calculate it from the row-reduced echelon form of $A^T$
In [25]: A.transpose().rref()
Out[25]: $$\left( \begin{bmatrix} 1 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 & -1 \\ 0 & 0 & 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}, \quad [0, 1, 3, 4] \right)$$
This gives us 4 independent columns, with free variables $y_3$ and $y_6$
$$y_6 = s, \quad y_3 = t$$
$$\begin{aligned} y_1 + y_3 = y_1 + t = 0 \quad &\therefore \quad y_1 = -t \\ y_2 + y_3 + y_6 = 0 \quad &\therefore \quad y_2 = -s - t \\ y_4 - y_6 = y_4 - s = 0 \quad &\therefore \quad y_4 = s \\ y_5 - y_6 = y_5 - s = 0 \quad &\therefore \quad y_5 = s \end{aligned}$$
$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \end{bmatrix} = \begin{bmatrix} -t \\ -s - t \\ t \\ s \\ s \\ s \end{bmatrix} = \begin{bmatrix} 0 \\ -s \\ 0 \\ s \\ s \\ s \end{bmatrix} + \begin{bmatrix} -t \\ -t \\ t \\ 0 \\ 0 \\ 0 \end{bmatrix} = s \begin{bmatrix} 0 \\ -1 \\ 0 \\ 1 \\ 1 \\ 1 \end{bmatrix} + t \begin{bmatrix} -1 \\ -1 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$
In [26]: A.transpose() * A
Out[26]: $$\begin{bmatrix} 2 & -1 & -1 & 0 & 0 \\ -1 & 3 & -1 & -1 & 0 \\ -1 & -1 & 3 & 0 & -1 \\ 0 & -1 & 0 & 2 & -1 \\ 0 & 0 & -1 & -1 & 2 \end{bmatrix}$$
In [27]: (A.transpose() * A).trace()
Out[27]: 12
The degree of a node is the number of edges it has
Look at the columns of the incidence matrix A
Every non-zero entry represents an edge
Note that there are 2 in column 1
This gives us a degree of 2, which will also be the first entry on the diagonal of $A^T A$
Column 2 has 3 entries representing 3 edges from node 2 and an entry of 3 on the diagonal of $A^T A$
... and so on
The trace is therefore just the sum of the degrees of all the nodes
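The link between the trace and the node degrees can be verified by counting the non-zero entries in each column. A minimal sketch, assuming the 5-node, 6-edge incidence matrix from the example problem; the name `degrees` is illustrative.

```python
from sympy import Matrix

# Incidence matrix of the example problem (6 edges, 5 nodes)
A = Matrix([[-1, 1, 0, 0, 0],
            [0, -1, 1, 0, 0],
            [-1, 0, 1, 0, 0],
            [0, -1, 0, 1, 0],
            [0, 0, 0, -1, 1],
            [0, 0, 1, 0, -1]])

# Every entry is -1, 0, or 1, so summing absolute values per column
# counts the edges that touch each node (its degree)
degrees = [sum(abs(entry) for entry in A.col(j)) for j in range(A.cols)]

# The diagonal of A^T A holds the same degrees, so the trace is their sum
trace = (A.T * A).trace()
```

Because every edge touches exactly two nodes, the sum of the degrees is also twice the number of edges.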
2015/03/28, 1:46 PMI_14_Exam_review
Page 1 of 4http://localhost:8888/nbconvert/html/Dropbox/Python/Mathematic…18_06_Gilbert_Strang/ZIP/I_14_Exam_review.ipynb?download=false
This notebook is part of Exam review 1 in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper
Head of Acute Care SurgeryGroote Schuur HospitalUniversity Cape TownEmail me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, symbols, Matrix
        from warnings import filterwarnings
In [3]: init_printing(use_latex='mathjax')
        filterwarnings('ignore')
Exam review
Question 1
Consider three non-zero vectors in $\mathbb{R}^7$
What is the dimension of the subspace that they can span?
Solution 1
One, two, or three
They can't span a subspace of higher dimension as there are only three vectors
Zero cannot be an answer, because they are all non-zero vectors
Question 2
Part 1
Consider a 5×3 matrix in echelon form with three pivots (called R in Part 2 and U in Part 3)
What's the nullspace?
Part 2
Consider a 10×3 matrix of the form below and calculate its rank and echelon form
$$\begin{bmatrix} R \\ 2R \end{bmatrix}$$
Part 3
Give the row-reduced form of the matrix
$$\begin{bmatrix} U & U \\ U & 0 \end{bmatrix}$$
Solution 2
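Part 1
The nullspace can only be the zero vector
$$\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$
With three columns all with pivots, we have n - r = 3 - 3 = 0
Part 2
Row reduction will take us to
$$\begin{bmatrix} R \\ 0 \end{bmatrix}$$
Part 3
$$\begin{bmatrix} U & U \\ 0 & -U \end{bmatrix}$$
In reduced row echelon form
$$\begin{bmatrix} U & 0 \\ 0 & U \end{bmatrix}$$
Question 3
Consider
$$A \underline{x} = \begin{bmatrix} 2 \\ 4 \\ 2 \end{bmatrix}$$
With
$$\underline{x} = \begin{bmatrix} 2 \\ 0 \\ 0 \end{bmatrix} + c \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + d \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$
Part 1
What is the dimension of the rowspace of A and the nullspace of A?
Part 2
For what values of b can Ax = b be solved?
Solution 3
Part 1
Well, the size of the matrix must be 3×3
The dimension of the nullspace is 2 (because there are two non-pivot columns)
With n - r = 2, we have r = 1, which is also the dimension of the rowspace of A
Part 2
Looking only at the particular solution we must have
$$2 \begin{bmatrix} a_{11} \\ a_{21} \\ a_{31} \end{bmatrix} = \begin{bmatrix} 2 \\ 4 \\ 2 \end{bmatrix}$$
$$\therefore \begin{bmatrix} a_{11} \\ a_{21} \\ a_{31} \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$$
$$\therefore \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix} = \begin{bmatrix} 1 & -1 & 0 \\ 2 & -2 & 0 \\ 1 & -1 & 0 \end{bmatrix}$$
So, how did I get the last two columns?
Well, they cannot be independent of the first column, and the last column must have all zeros to leave $x_3$ free, i.e. $x_3 = d$
Adding column 2 to column 1 must give all zeros (the vector multiplied by c is in the nullspace), which results in what you see for column 2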
In [4]: A = Matrix([[1, -1, 0], [2, -2, 0], [1, -1, 0]])
        x_vect = Matrix([2, 0, 0])
        x_vect_null_1 = Matrix([1, 1, 0])
        x_vect_null_2 = Matrix([0, 0, 1])
        A * x_vect + A * x_vect_null_1 + A * x_vect_null_2
Out[4]: $$\begin{bmatrix} 2 \\ 4 \\ 2 \end{bmatrix}$$
In [5]: A.nullspace()
Out[5]: $$\left[ \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right]$$
Ax = b can only be solved when b is a scalar multiple of
$$\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$$
Question and solution 4
If the nullspace of a square matrix is the zero vector only, does the nullspace of the transpose also only contain the zero vector?
Yes
Consider the matrix space of all 5×5 matrices; do the invertible 5×5 matrices form a subspace?
No, as the set of invertible matrices would not contain the zero matrix
Also, if I add two invertible matrices, I don't know if the resultant matrix is invertible
The singular matrices won't work either, as adding two of them can give an invertible matrix
If $B^2 = 0$, is B = 0?
No, e.g.
$$B = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$$
In [6]: B = Matrix([[0, 1], [0, 0]])
        B ** 2 # Could also use B * B
Out[6]: $$\begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$
In [7]: B == B * B # Checking by Boolean statement
Out[7]: False
Is a system of n equations in n unknowns solvable for every b if the columns of the matrix of coefficients are independent?
Yes
Question 5
Calculate the basis of the nullspace of B
$$B = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & -1 & 2 \\ 0 & 1 & 1 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$
Solution 5
B will have to be a 3×4 matrix
The last row of the second factor is all zeros
In [8]: B1 = Matrix([[1, 1, 0], [0, 1, 0], [1, 0, 1]])
        B2 = Matrix([[1, 0, -1, 2], [0, 1, 1, -1], [0, 0, 0, 0]])
        B1, B2
Out[8]: $$\left( \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & 0 & -1 & 2 \\ 0 & 1 & 1 & -1 \\ 0 & 0 & 0 & 0 \end{bmatrix} \right)$$
It is important to note that if we multiply an invertible matrix by another matrix (assuming multiplication is possible by the shape of the matrices), then the invertible one (B1 above) plays no part in the nullspace
N(CD) = N(D) if C is invertible
We therefore only have to look at B2
It has 2 pivot columns, i.e. rank is r = 2
It will therefore have 2 free variables (n - r = 4 - 2 = 2), making the nullspace 2-dimensional
In [9]: B = B1 * B2
        B.nullspace()
Out[9]: $$\left[ \begin{bmatrix} 1 \\ -1 \\ 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} -2 \\ 1 \\ 0 \\ 1 \end{bmatrix} \right]$$
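The claim N(CD) = N(D) for invertible C can be checked for these two factors. This is a hedged sketch rather than part of the original solution; the names `invertible`, `null_basis`, and `annihilated` are assumptions for illustration.

```python
from sympy import Matrix, zeros

B1 = Matrix([[1, 1, 0], [0, 1, 0], [1, 0, 1]])
B2 = Matrix([[1, 0, -1, 2], [0, 1, 1, -1], [0, 0, 0, 0]])

# B1 is invertible when its determinant is non-zero
invertible = B1.det() != 0

# Basis of the nullspace of the second factor alone
null_basis = B2.nullspace()

# Every vector in N(B2) is also sent to zero by the product B1 * B2
annihilated = all((B1 * B2) * v == zeros(3, 1) for v in null_basis)

# Equal ranks mean the product has no additional nullspace directions
same_rank = (B1 * B2).rank() == B2.rank()
```

Together, `annihilated` and `same_rank` indicate that the two nullspaces coincide for this example.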
2015/03/29, 1:46 PMII_01_Orthogonality_of_vectors_and_subspaces
Page 1 of 3http://localhost:8888/nbconvert/html/Dropbox/Python/Mathematic…01_Orthogonality_of_vectors_and_subspaces.ipynb?download=false
This notebook is part of lecture 14 Orthogonal vectors and subspaces in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper
Head of Acute Care SurgeryGroote Schuur HospitalUniversity Cape TownEmail me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, symbols, Matrix
        from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
        filterwarnings('ignore')
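Orthogonal vectors and subspaces
Rowspace orthogonal to nullspace and columnspace to nullspace of $A^T$
$N(A^T A) = N(A)$
Orthogonal vectors
Two vectors are orthogonal if their dot product is zero
If they are written as column vectors x and y, their dot product is $x^T y$
For orthogonal (perpendicular) vectors $x^T y = 0$
From the Pythagorean theorem they are orthogonal if
$$\|\underline{x}\|^2 + \|\underline{y}\|^2 = \|\underline{x} + \underline{y}\|^2$$
$$\|\underline{x}\| = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}$$
The length squared of a (column) vector x can be calculated by $x^T x$
This achieves exactly the same as the sum of the squares of each element in the vector
$$x_1^2 + x_2^2 + \dots + x_n^2$$
Following from the Pythagorean theorem we have
$$\begin{aligned} \|\underline{x}\|^2 + \|\underline{y}\|^2 &= \|\underline{x} + \underline{y}\|^2 \\ \underline{x}^T \underline{x} + \underline{y}^T \underline{y} &= (\underline{x} + \underline{y})^T (\underline{x} + \underline{y}) \\ \underline{x}^T \underline{x} + \underline{y}^T \underline{y} &= \underline{x}^T \underline{x} + \underline{x}^T \underline{y} + \underline{y}^T \underline{x} + \underline{y}^T \underline{y} \\ \because \ \underline{x}^T \underline{y} &= \underline{y}^T \underline{x} \\ \underline{x}^T \underline{x} + \underline{y}^T \underline{y} &= \underline{x}^T \underline{x} + 2 \underline{x}^T \underline{y} + \underline{y}^T \underline{y} \\ 2 \underline{x}^T \underline{y} &= 0 \\ \underline{x}^T \underline{y} &= 0 \end{aligned}$$
This states that the dot product of orthogonal vectors equals zero
The zero vector is orthogonal to all other vectors of the same dimension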
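The derivation can be illustrated with a concrete pair of vectors. The vectors below are hypothetical examples chosen so that their dot product is zero; they are not taken from the lecture.

```python
from sympy import Matrix

# Two example column vectors (assumed for illustration)
x = Matrix([1, 2, 3])
y = Matrix([2, -1, 0])

# Dot product written as x^T y; the result is a 1x1 matrix, so take entry 0
dot = (x.T * y)[0]

# Squared lengths via x^T x, and the Pythagorean comparison
lhs = (x.T * x)[0] + (y.T * y)[0]
rhs = ((x + y).T * (x + y))[0]
```

With a zero dot product, `lhs` and `rhs` agree, as the Pythagorean relation requires.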
Orthogonality of subspaces
Consider two subspaces S and T
To be orthogonal, every vector in S must be orthogonal to every vector in T
Consider the XY and YZ planes in 3-space
They are not orthogonal, since many combinations of vectors (one in each plane) are not orthogonal
In particular, a non-zero vector in the intersection lies in both planes and is not orthogonal to itself
We can say that any two planes that intersect in a non-zero vector cannot be orthogonal to each other
Orthogonality of the rowspace and the nullspace
The nullspace contains vectors x such that Ax = 0
Now remembering that $x^T y = 0$ for orthogonal column vectors, and considering each row in A as a transposed column vector whose product with x (indeed a column vector) is zero, meaning that they are orthogonal, we have:
$$\begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & \vdots & \vdots \\ a_{m1} & a_{m2} & \dots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$
$$\begin{bmatrix} a_{11} & a_{12} & \dots & a_{1n} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = 0 \quad \dots$$
The rows (row vectors) in A are NOT the only vectors in the rowspace, so we also need to show that ALL linear combinations of them are orthogonal to x
This is easy to see by the structure above
Orthogonality of the columnspace and the nullspace of $A^T$
The proof is the same as above
The orthogonality of the rowspace and the nullspace is creating two orthogonal subspaces in $\mathbb{R}^n$
The orthogonality of the columnspace and the nullspace of $A^T$ is creating two orthogonal subspaces in $\mathbb{R}^m$
Note how the dimensions add up to the dimension of the space $\mathbb{R}^n$
The rowspace (a fundamental subspace in $\mathbb{R}^n$) is of dimension r
The nullspace (a fundamental subspace in $\mathbb{R}^n$) is of dimension n - r
Addition of these dimensions gives us the dimension of the total space, n, as in $\mathbb{R}^n$
AND
The columnspace is of dimension r and the nullspace of $A^T$ is of dimension m - r, which adds to m as in $\mathbb{R}^m$
This means that two lines that may be orthogonal in $\mathbb{R}^3$ cannot be the rowspace and nullspace of a matrix, since the addition of the dimensions of these two subspaces (lines) is not 3 (as in $\mathbb{R}^3$)
We call this complementarity, i.e. the nullspace and rowspace are orthogonal complements in $\mathbb{R}^n$
$A^T A$
We know that
The result is square, i.e. (n×m)(m×n) = n×n
The result is symmetric, i.e. $(A^T A)^T = A^T A^{TT} = A^T A$
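The orthogonality of the rowspace and the nullspace, and the way the dimensions add up, can be checked on a small example. The matrix below is a hypothetical 2×3 example, not one from the lecture; the names `null_A`, `left_null`, and `rows_orthogonal` are illustrative.

```python
from sympy import Matrix

# Example matrix (assumed for illustration): m = 2 rows, n = 3 columns
A = Matrix([[1, 2, 1], [2, 4, 3]])
m, n = A.shape
r = A.rank()

# Bases for the nullspace of A and the nullspace of A^T
null_A = A.nullspace()
left_null = A.T.nullspace()

# Every row of A has a zero dot product with every nullspace basis vector
rows_orthogonal = all((A.row(i) * v)[0] == 0
                      for i in range(m) for v in null_A)
```

The counts `r + len(null_A)` and `r + len(left_null)` should equal n and m respectively.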
When Ax = b is not solvable we use $A^T A \hat{x} = A^T b$
x in the first instance did not have a solution, but after multiplying both sides by $A^T$, we hope that the second x has a solution, now called $\hat{x}$
$$A^T A \hat{x} = A^T \underline{b}$$
Consider the matrix below with m = 3 equations in n = 2 unknowns
The only b for which Ax = b is solvable must be linear combinations of the columns of A
In [4]: A = Matrix([[1, 1], [1, 2], [1, 5]])
        A
Out[4]: $$\begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 5 \end{bmatrix}$$
$$x_1 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + x_2 \begin{bmatrix} 1 \\ 2 \\ 5 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix}$$
In [5]: A.transpose() * A
Out[5]: $$\begin{bmatrix} 3 & 8 \\ 8 & 30 \end{bmatrix}$$
Note how the nullspace of $A^T A$ is equal to the nullspace of A
In [6]: (A.transpose() * A).nullspace() == A.nullspace()
Out[6]: True
The same goes for the rank
In [7]: A.rref(), (A.transpose() * A).rref()
Out[7]: $$\left( \left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad [0, 1] \right), \quad \left( \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad [0, 1] \right) \right)$$
$A^T A$ is not always invertible
In fact it is only invertible if the nullspace of A only contains the zero vector (A has independent columns)
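To see when $A^T A$ fails to be invertible, the matrix from this notebook can be compared with one whose columns are dependent. The second matrix, `A_dep`, is a hypothetical example added for contrast and does not appear in the original.

```python
from sympy import Matrix

# Independent columns (the matrix used in this notebook)
A_indep = Matrix([[1, 1], [1, 2], [1, 5]])

# Dependent columns (assumed example: column 2 is twice column 1)
A_dep = Matrix([[1, 2], [1, 2], [1, 2]])

# A^T A is invertible exactly when its determinant is non-zero
det_indep = (A_indep.T * A_indep).det()
det_dep = (A_dep.T * A_dep).det()

# With dependent columns, A and A^T A share a non-trivial nullspace
shared_nullspace = (A_dep.T * A_dep).nullspace() == A_dep.nullspace()
```

A non-zero `det_indep` and a zero `det_dep` match the statement that invertibility of $A^T A$ depends on A having independent columns.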
2015/03/29, 1:51 PMII_02_Projection_onto_subspaces
Page 1 of 3http://localhost:8888/nbconvert/html/II_02_Projection_onto_subspaces.ipynb?download=false
This notebook is part of lecture 15 Projections onto subspaces in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper
Head of Acute Care SurgeryGroote Schuur HospitalUniversity Cape TownEmail me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, Matrix, symbols
        from IPython.display import Image
        from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
        filterwarnings('ignore')
Projections onto subspaces
Geometry in the plane
Projection of a vector onto another (in the plane)
Consider the orthogonal projection of b onto a
In [4]: Image(filename = 'Orthogonal projection in the plane.png')
Out[4]: [image: Orthogonal projection in the plane.png]
Note that p falls on a line, which is a subspace of the plane $\mathbb{R}^2$
Remember from the previous lecture that orthogonal vectors have a dot product of zero (as with the rows of A and x in Ax = 0)
Note that p is some scalar multiple of a, i.e. p = xa
With a perpendicular to e and e = b - xa
Thus we have $a^T (b - xa) = 0$ and $x a^T a = a^T b$
Since $a^T a$ is a number we can simplify
$$x = \frac{\underline{a}^T \underline{b}}{\underline{a}^T \underline{a}}$$
We also have p = ax
$$\underline{p} = \underline{a} x = \underline{a} \frac{\underline{a}^T \underline{b}}{\underline{a}^T \underline{a}}$$
This equation is helpful
Doubling (or any other scalar multiple of) b doubles (or scalar multiplies) p
Doubling (or any other scalar multiple of) a has no effect
Eventually we are looking for proj$_p$ = Pb, where P is the projection matrix
$$\underline{p} = P \underline{b}$$
$$P = \frac{1}{\underline{a}^T \underline{a}} \underline{a} \underline{a}^T$$
Properties of the projection matrix P
The columnspace of P (C(P)) is the line which contains a
The rank is 1, rank(P) = 1
P is symmetric, i.e. $P^T = P$
Applying the projection matrix a second time (i.e. $P^2$) changes nothing, thus $P^2 = P$
Why project?
(projecting onto more than a one-dimensional line)
Because Ax = b may not have a solution
b may not be in the columnspace
May have more equations than unknowns
Solve for the closest vector in the columnspace
This is done by solving for p instead, where p is the projection of b onto the columnspace of A
$$A \hat{x} = \underline{p}$$
Now we have to get b orthogonally projected (as p) onto the column(sub)space
This is done by calculating two basis vectors for the plane that contains p, i.e. $a_1$ and $a_2$
Going way back to the graph up top we note that e is perpendicular to the plane
So, we have:
$$A \hat{x} = \underline{p}$$
We know that both $a_1$ and $a_2$ are perpendicular to e, so:
$$\begin{aligned} a_1^T \underline{e} = 0; \quad & a_2^T \underline{e} = 0 \\ \because \ \underline{e} &= \underline{b} - \underline{p} \\ \because \ \underline{p} &= A \hat{x} \\ a_1^T (\underline{b} - A \hat{x}) = 0; \quad & a_2^T (\underline{b} - A \hat{x}) = 0 \\ \begin{bmatrix} a_1^T \\ a_2^T \end{bmatrix} (\underline{b} - A \hat{x}) &= \begin{bmatrix} 0 \\ 0 \end{bmatrix} \end{aligned}$$
We know that from ...
$$A^T (\underline{b} - A \hat{x}) = 0$$
... e must be in the nullspace of $A^T$
Which is right, because from the previous lecture the nullspace of $A^T$ is orthogonal to the columnspace of A
Simplifying the last equations we have
$$A^T A \hat{x} = A^T \underline{b}$$
Just look back at the plane example in $\mathbb{R}^2$ we started with
Simplifying things back to a column vector a instead of a matrix subspace A in this last equation does give us what we had in $\mathbb{R}^2$
Solving this we have
$$\hat{x} = (A^T A)^{-1} A^T \underline{b}$$
Which leaves us with
$$\underline{p} = A \hat{x}$$
$$\underline{p} = A (A^T A)^{-1} A^T \underline{b}$$
Making the projection matrix P
$$P = A (A^T A)^{-1} A^T$$
Just note that for a square invertible matrix A, P is the identity matrix
Most of the time A is not square (and thus not invertible) so we have to leave the equation as it is
Also, note that $P^T = P$ and $P^2 = P$
Applications
Least squares
Given a set of data points in two dimensions, i.e. with variables (t,b)
We need to fit them onto the best line
So, as an example consider the points (1,1), (2,2), (3,2)
A best line in this instance means a straight line in the form
$$b = C + Dt$$
Using the three points above we get three equations
$$\begin{aligned} C + D &= 1 \\ C + 2D &= 2 \\ C + 3D &= 2 \end{aligned}$$
If the line went through all the points, we would have a solution
Instead we have the following
$$\begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} C \\ D \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$$
Three equations, two unknowns, no solution, so solve ...
$$A^T A \hat{x} = A^T \underline{b}$$
... for which the solution is
$$\hat{x} = (A^T A)^{-1} A^T \underline{b}$$
In [5]: A = Matrix([[1, 1], [1, 2], [1, 3]])
        A
Out[5]: $$\begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix}$$
In [6]: b = Matrix([1, 2, 2])
        b
Out[6]: $$\begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$$
In [7]: (A.transpose() * A).inv() * A.transpose() * b
Out[7]: $$\begin{bmatrix} \frac{2}{3} \\ \frac{1}{2} \end{bmatrix}$$
Thus, the solution is:
$$b = \frac{2}{3} + \frac{1}{2} t$$
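The stated properties of the projection matrix can be checked for the least squares example. This is a minimal sketch assuming the same A and b as in the cells above; the names `P`, `p`, and `e` follow the notation of the text but the code itself is not part of the original notebook.

```python
from sympy import Matrix, Rational, zeros

A = Matrix([[1, 1], [1, 2], [1, 3]])
b = Matrix([1, 2, 2])

# Projection matrix onto the columnspace of A
P = A * (A.T * A).inv() * A.T

# Projection of b and the error vector
p = P * b
e = b - p

# Expected properties: P is symmetric, P^2 = P, and e is
# perpendicular to the columnspace (A^T e = 0)
is_symmetric = P.T == P
is_idempotent = P * P == P
error_orthogonal = A.T * e == zeros(2, 1)
```

Since sympy works with exact rationals here, the comparisons are exact rather than approximate.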
2015/03/29, 1:56 PMII_03_Projection_matrices_and_least_squares
Page 1 of 5http://localhost:8888/nbconvert/html/II_03_Projection_matrices_and_least_squares.ipynb?download=false
This notebook is part of lecture 16 Projection matrices and least squares in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper
Head of Acute Care SurgeryGroote Schuur HospitalUniversity Cape TownEmail me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, Matrix, symbols
        from IPython.display import Image
        from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
        filterwarnings('ignore')
Projection matrices and least squares
In [4]: Image(filename = 'Line.png')
Out[4]: [image: Line.png]
Least squares
Consider from the previous lecture the three data points in the plane
$$(t_i, y_i) = (1, 1), (2, 2), (3, 2)$$
From this we need to construct a straight line
This could be helpful in, say, statistics (remember, though, that in statistics we might have to get rid of statistical outliers)
Nonetheless (view image above) we note that we have a straight line in slope-intercept form
$$y = C + Dt$$
On the line at t values of 1, 2, and 3 we will have
$$\begin{aligned} y_1 &= C + D = 1 \\ y_2 &= C + 2D = 2 \\ y_3 &= C + 3D = 2 \end{aligned}$$
The actual y values at these t values are 1, 2, and 2, though
We are thus including an error of
$$\delta y$$
$$\begin{aligned} (e_1)^2 &= [(C + D) - 1]^2 \\ (e_2)^2 &= [(C + 2D) - 2]^2 \\ (e_3)^2 &= [(C + 3D) - 2]^2 \end{aligned}$$
Since some are positive and some are negative (actual values below or above the line), we simply determine the square (which will always be positive)
Adding the (three in our example here) squares we have the sum total of the error (which is actually just the sum of the squares of the distances between the line and the actual y values)
The line will be the best fit when this error sum is at a minimum (hence least squares)
We can do this with calculus or with linear algebra
For calculus we take the partial derivatives with respect to both unknowns and set them to zero
For linear algebra we project orthogonally onto the columnspace (hence minimizing the error)
Note that b is not in the columnspace (it is not a linear combination of the columns)
Calculus method
We'll create a function f(C,D) and successively take the partial derivatives with respect to both variables and set them to zero
We will then have two equations with two unknowns to solve (which is easy enough to do manually or by simple linear algebra and row reduction)
In [5]: C, D = symbols('C D')
In [6]: e1_squared = ((C + D) - 1) ** 2
        e2_squared = ((C + 2 * D) - 2) ** 2
        e3_squared = ((C + 3 * D) - 2) ** 2
        f = e1_squared + e2_squared + e3_squared
        f
Out[6]: $$(C + D - 1)^2 + (C + 2D - 2)^2 + (C + 3D - 2)^2$$
In [7]: f.expand() # Expanding the expression
Out[7]: $$3C^2 + 12CD - 10C + 14D^2 - 22D + 9$$
Doing the partial derivatives will be
$$f(C, D) = 3C^2 + 12CD - 10C + 14D^2 - 22D + 9$$
$$\frac{\partial f}{\partial C} = 6C + 12D - 10 = 0$$
$$\frac{\partial f}{\partial D} = 12C + 28D - 22 = 0$$
In [8]: f.diff(C) # Taking the partial derivative with respect to C
Out[8]: $$6C + 12D - 10$$
In [9]: f.diff(D) # Taking the partial derivative with respect to D
Out[9]: $$12C + 28D - 22$$
Setting both equal to zero (and creating a simple augmented matrix) we get
$$\begin{aligned} 6C + 12D - 10 &= 0 \\ 12C + 28D - 22 &= 0 \\ \therefore \ 6C + 12D &= 10 \\ \therefore \ 12C + 28D &= 22 \end{aligned}$$
In [10]: A_augm = Matrix([[6, 12, 10], [12, 28, 22]])
A_augm
In [11]: A_augm.rref() # Doing a Gauss-Jordan elimination to reduced row echelon form
We now have a solution
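As a cross-check on the row reduction, the two partial derivatives can also be handed directly to SymPy's `solve`. This is only a sketch: the name `solution` is introduced here for illustration and is not part of the original notebook.

```python
from sympy import symbols, solve

C, D = symbols('C D')

# Sum of the three squared errors, as constructed in the calculus method
f = ((C + D) - 1) ** 2 + ((C + 2 * D) - 2) ** 2 + ((C + 3 * D) - 2) ** 2

# Set both partial derivatives to zero and solve the 2x2 linear system
solution = solve([f.diff(C), f.diff(D)], [C, D], dict=True)[0]
print(solution)
```

The values of C and D obtained this way should agree with the last column of the reduced row echelon form.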
Linear algebra
We note that we can construct the following
b is not in the columnspace of A and we have to do orthogonal projection
In [12]: A = Matrix([[1, 1], [1, 2], [1, 3]])
b = Matrix([1, 2, 2])
A, b # Showing the two matrices
In [13]: x_hat = (A.transpose() * A).inv() * A.transpose() * b
x_hat
Again, we get the same values for C and D
Remember the following
p and e are perpendicular
Indeed p is in the columnspace of A and e is perpendicular to the columnspace (that is, to every vector in the columnspace)
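The perpendicularity of p and e can be checked for the line-fitting example. The sketch below recomputes the least squares solution and then forms the projection and the error; the names `p` and `e` follow the text, everything else is as in the cells above.

```python
from sympy import Matrix

A = Matrix([[1, 1], [1, 2], [1, 3]])
b = Matrix([1, 2, 2])

# Least squares solution of A x = b
x_hat = (A.T * A).inv() * A.T * b

p = A * x_hat   # projection of b onto the columnspace of A
e = b - p       # error vector, the part of b outside the columnspace

print(p.T, e.T)
print(A.T * e)  # e is perpendicular to every column of A
print(p.dot(e)) # and therefore to p
```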
Example problem
Example problem 1
Find the quadratic (second order polynomial) equation through the origin, with the following data points: (1,1), (2,5) and (-1,-2)
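Out[10]: $$\begin{bmatrix} 6 & 12 & 10 \\ 12 & 28 & 22 \end{bmatrix}$$
Out[11]: $$\left( \begin{bmatrix} 1 & 0 & \frac{2}{3} \\ 0 & 1 & \frac{1}{2} \end{bmatrix},\ [0, 1] \right)$$
$$y = \frac{2}{3} + \frac{1}{2} t$$
$$C + 1D = 1 \\ C + 2D = 2 \\ C + 3D = 2$$
$$C \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + D \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$$
$$A \underline{x} = \underline{b}$$
$$\begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix} \begin{bmatrix} C \\ D \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}$$
$$A^T A \hat{x} = A^T \underline{b} \\ \hat{x} = \left( A^T A \right)^{-1} A^T \underline{b}$$
Out[12]: $$\left( \begin{bmatrix} 1 & 1 \\ 1 & 2 \\ 1 & 3 \end{bmatrix},\ \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix} \right)$$
Out[13]: $$\begin{bmatrix} \frac{2}{3} \\ \frac{1}{2} \end{bmatrix}$$
$$\underline{b} = \underline{p} + \underline{e}$$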
Solution
Let's just think about a quadratic equation in y and t
Through the origin (0,0) means y = 0 and t = 0, thus we have
This gives us three equations for our three data points
Clearly b is not in the columnspace of A and we have to project orthogonally onto the columnspace using
In [14]: A = Matrix([[1, 1], [2, 4], [-1, 1]])
b = Matrix([1, 5, -2])
x_hat = (A.transpose() * A).inv() * A.transpose() * b
x_hat
Here's a simple plot of the equation
In [15]: import matplotlib.pyplot as plt # The graph plotting module
import numpy as np # The numerical mathematics module
%matplotlib inline
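$$y = c_1 + Ct + Dt^2$$
$$0 = c_1 + C(0) + D(0)^2$$
$$c_1 = 0 \\ y = Ct + Dt^2$$
$$C(1) + D(1)^2 = 1$$
$$C(2) + D(2)^2 = 5$$
$$C(-1) + D(-1)^2 = -2$$
$$C \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} + D \begin{bmatrix} 1 \\ 4 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 5 \\ -2 \end{bmatrix}$$
$$A = \begin{bmatrix} 1 & 1 \\ 2 & 4 \\ -1 & 1 \end{bmatrix}, \quad \underline{x} = \begin{bmatrix} C \\ D \end{bmatrix}, \quad \underline{b} = \begin{bmatrix} 1 \\ 5 \\ -2 \end{bmatrix}$$
$$\hat{x} = \left( A^T A \right)^{-1} A^T \underline{b}$$
Out[14]: $$\begin{bmatrix} \frac{41}{22} \\ \frac{5}{22} \end{bmatrix}$$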
In [16]: x = np.linspace(-2, 3, 100) # Creating 100 x-values
y = (41 / 22) * x + (5 / 22) * x ** 2 # From the equation above
plt.figure(figsize = (8, 6)) # Creating a plot of the indicated size
plt.plot(x, y, 'b-') # Plot the equation above, in essence 100 little plots using small segments of blue lines
plt.plot(1, 1, 'ro') # Plot the point as a red dot
plt.plot(2, 5, 'ro')
plt.plot(-1, -2, 'ro')
plt.plot(0, 0, 'gs') # Plot the origin as a green square
plt.show(); # Display the plot
2015/03/29, 1:59 PMII_04_Orthogonal_matrices_Gram_Schmidt
Page 1 of 6http://localhost:8888/nbconvert/html/II_04_Orthogonal_matrices_Gram_Schmidt.ipynb?download=false
This notebook is part of lecture 17 Orthogonal matrices and Gram-Schmidt in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper
Head of Acute Care Surgery
Groote Schuur Hospital
University of Cape Town
Email me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)
[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June 2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org
In [1]: from IPython.core.display import HTML, Image
css_file = 'style.css'
HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, symbols, Matrix, sin, cos, sqrt, Rational, GramSchmidt
from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
filterwarnings('ignore')
In [4]: theta = symbols('theta')
Orthogonal basis
Orthogonal matrix
Gram-Schmidt
Orthogonal basis
Here we mean vectors $q_1, q_2, \ldots, q_n$
We actually mean orthonormal vectors (orthogonal, i.e. perpendicular, and of unit length, i.e. normalized)
Vectors that are orthogonal have a dot product equal to zero
If they are orthogonal
If they are not
Orthogonal matrix
We can now put these (column) basis vectors into a matrix Q
This brings about
In the case of the matrix Q being square, the term orthogonal matrix is used
When it is square we can calculate the inverse, making
Consider the following permutation matrix with orthonormal column vectors
Out[1]:
$$q_1, q_2, \ldots, q_n$$
$$q_i^T q_j = 0$$
$$q_i^T q_j \neq 0$$
$$Q^T Q = I$$
$$Q^T = Q^{-1}$$
In [5]: Q = Matrix([[0, 0, 1], [1, 0, 0], [0, 1, 0]])
Q, Q.transpose()
In this example the transpose also contains orthonormal column vectors
Multiplication gives the identity matrix
In [6]: Q.transpose() * Q
Consider this example
In [7]: Q = Matrix([[cos(theta), -sin(theta)], [sin(theta), cos(theta)]])
Q, Q.transpose()
The two column vectors are orthogonal and the length of each column vector is 1
It is thus an orthogonal matrix
In [8]: Q.transpose() * Q
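SymPy leaves the diagonal entries of this product as sin²θ + cos²θ. A small sketch of how the product can be reduced to the identity explicitly, applying `simplify` to each entry (the name `product` is introduced here for illustration):

```python
from sympy import symbols, Matrix, sin, cos, simplify, eye

theta = symbols('theta')
Q = Matrix([[cos(theta), -sin(theta)], [sin(theta), cos(theta)]])

# Simplify every entry of Q^T Q so that sin^2 + cos^2 collapses to 1
product = (Q.T * Q).applyfunc(simplify)
print(product)
print(product == eye(2))
```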
The example below certainly has orthogonal column vectors, but they are not of unit length
We can change them into unit vectors by dividing each component by the length of that vector
As it stands $Q^T Q$ is not the identity matrix
In [9]: Q = Matrix([[1, 1], [1, -1]])
Q.transpose() * Q
Turning it into an orthogonal matrix
In [10]: Q = (1 / sqrt(2)) * Matrix([[1, 1], [1, -1]])
Q
In [11]: Q.transpose() * Q
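Out[5]: $$\left( \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} \right)$$
Out[6]: $$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
Out[7]: $$\left( \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix},\ \begin{bmatrix} \cos(\theta) & \sin(\theta) \\ -\sin(\theta) & \cos(\theta) \end{bmatrix} \right)$$
Out[8]: $$\begin{bmatrix} \sin^2(\theta) + \cos^2(\theta) & 0 \\ 0 & \sin^2(\theta) + \cos^2(\theta) \end{bmatrix}$$
$$Q = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$
$$\sqrt{(1)^2 + (1)^2} = \sqrt{2}$$
$$\sqrt{(1)^2 + (-1)^2} = \sqrt{2}$$
$$Q = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$$
Out[9]: $$\begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}$$
Out[10]: $$\begin{bmatrix} \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \end{bmatrix}$$
Out[11]: $$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$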
Consider this example with orthogonal (but not orthonormal) column vectors
In [12]: Q = Matrix([[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]])
Q
Again, as it stands $Q^T Q$ is not the identity matrix
In [13]: Q.transpose() * Q
But turning it into an orthogonal matrix works
In [14]: Q = Rational(1, 2) * Matrix([[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]])
# Rational() creates a mathematical fraction instead of a decimal
Q
In [15]: Q.transpose() * Q
Consider this matrix Q with orthogonal column vectors, but that is not square
In [16]: Q = Rational(1, 3) * Matrix([[1, -2], [2, -1], [2, 2]])
Q
We now have a matrix with two column vectors that are normalized and orthogonal to each other and they form a basis for a plane (subspace) in $\mathbb{R}^3$
There must be a third column vector of unit length, orthogonal to the other two, so that we end up with an orthogonal matrix
In [17]: Q = Rational(1, 3) * Matrix([[1, -2, 2], [2, -1, -2], [2, 2, 1]])
Q
In [18]: Q.transpose() * Q
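One way to find such a third column in three dimensions is the cross product of the first two. This is a sketch of that idea only (the notebook does not say how the third column was obtained); the names `q1`, `q2` and `q3` are introduced here, and the negative of `q3` would serve equally well.

```python
from sympy import Matrix, Rational

q1 = Rational(1, 3) * Matrix([1, 2, 2])
q2 = Rational(1, 3) * Matrix([-2, -1, 2])

# The cross product is perpendicular to both q1 and q2;
# since q1 and q2 are orthonormal it also has unit length
q3 = q1.cross(q2)
print(q3.T)
print(q3.dot(q1), q3.dot(q2), q3.dot(q3))
```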
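Out[12]: $$\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & -1 & 1 \end{bmatrix}$$
Out[13]: $$\begin{bmatrix} 4 & 0 & 0 & 0 \\ 0 & 4 & 0 & 0 \\ 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & 4 \end{bmatrix}$$
Out[14]: $$\begin{bmatrix} \frac{1}{2} & \frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} \\ \frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} & \frac{1}{2} \end{bmatrix}$$
Out[15]: $$\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
Out[16]: $$\begin{bmatrix} \frac{1}{3} & -\frac{2}{3} \\ \frac{2}{3} & -\frac{1}{3} \\ \frac{2}{3} & \frac{2}{3} \end{bmatrix}$$
Out[17]: $$\begin{bmatrix} \frac{1}{3} & -\frac{2}{3} & \frac{2}{3} \\ \frac{2}{3} & -\frac{1}{3} & -\frac{2}{3} \\ \frac{2}{3} & \frac{2}{3} & \frac{1}{3} \end{bmatrix}$$
Out[18]: $$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$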
Let's make use of these matrices with orthonormal columns (which we will always denote with a letter Q) and project onto their columnspace
What would the projection matrix be?
Remember, though, that for matrices with orthonormal column vectors $Q^T Q$ is the identity matrix and we have
If, additionally, Q is square, then we have independent columns, the columnspace contains the whole space $\mathbb{R}^n$ and the projection matrix is the n×n identity matrix
Remember $Q^T = Q^{-1}$ in these cases, making it easy to see that we get the identity matrix
Remember also that the projection matrix is symmetric
Lastly, the projection matrix has the property that squaring it leaves us in the same spot, so here we will have $(QQ^T)^2 = QQ^T$
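These properties can be checked on the non-square Q with orthonormal columns used earlier. A minimal sketch, with `P` as an illustrative name for the projection matrix:

```python
from sympy import Matrix, Rational, eye

# Two orthonormal columns spanning a plane in three dimensions
Q = Rational(1, 3) * Matrix([[1, -2], [2, -1], [2, 2]])

# Because Q^T Q = I, the projection matrix Q (Q^T Q)^-1 Q^T reduces to Q Q^T
P = Q * Q.T

print(Q.T * Q == eye(2))  # orthonormal columns
print(P.T == P)           # symmetric
print(P * P == P)         # projecting twice changes nothing
print(P == eye(3))        # not the identity, since Q is not square
```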
All of this has the final consequence that
Gram-Schmidt
All of the above makes things quite easy, so we should try and create orthogonal matrices
Good, let's start with two independent vectors a and b and try and create two orthogonal vectors A and B and then create two orthonormal vectors
We can choose one of them as our initial vector, say a = A, so we have to get an orthogonal projection (to a) for B
This is what we previously called the error vector e
Remembering how to get p we have the following
Let's do an example
In [19]: a = Matrix([1, 1, 1])
b = Matrix([1, 0, 2])
a, b
In [20]: A = a
A
In [21]: A.transpose() * b
In [22]: A.transpose() * A
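$$Q \underline{x} = \underline{b} \\ P = Q \left( Q^T Q \right)^{-1} Q^T$$
$$P = Q Q^T$$
$$Q^T Q \hat{x} = Q^T \underline{b} \\ \hat{x} = Q^T \underline{b} \\ \hat{x}_i = q_i^T \underline{b}$$
$$q_1 = \frac{A}{\lVert A \rVert}$$
$$q_2 = \frac{B}{\lVert B \rVert}$$
$$\underline{e} = \underline{b} - \underline{p}$$
$$B = \underline{b} - \frac{A^T \underline{b}}{A^T A} A$$
Out[19]: $$\left( \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix},\ \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} \right)$$
Out[20]: $$\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$
Out[21]: $$\begin{bmatrix} 3 \end{bmatrix}$$
Out[22]: $$\begin{bmatrix} 3 \end{bmatrix}$$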
In [23]: B = b - A # The multiplier (A.transpose() * b) / (A.transpose() * A) is 3 / 3 = 1 here
B
Checking that they are perpendicular
In [24]: A.transpose() * B
Now we have to create Q by turning A and B into unit vectors and place them in the same matrix
In [25]: A.normalized() # Easy way to normalize a vector
In [26]: B.normalized()
In [27]: Q = Matrix([[sqrt(3) / 3, 0], [sqrt(3) / 3, -sqrt(2) / 2], [sqrt(3) / 3, sqrt(2) / 2]])
Q
The columnspace of the original matrix (of two column vectors) and Q are the same
In Python we can use the following code
In [28]: # The column matrices (independent column vectors) are entered individually inside square brackets []
A = [Matrix([1, 1, 1]), Matrix([1, 0, 2])]
A
In [29]: Q = GramSchmidt(A, True) # The True argument normalizes the columns
Q
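The manual steps and the library call can be compared side by side. The sketch below writes the general multiplier out in full instead of relying on it being 1; the names `q1`, `q2`, `library_q1` and `library_q2` are introduced here for illustration.

```python
from sympy import Matrix, GramSchmidt

a = Matrix([1, 1, 1])
b = Matrix([1, 0, 2])

# Manual Gram-Schmidt: keep a, then remove from b its projection onto A
A = a
B = b - (A.dot(b) / A.dot(A)) * A

# Normalize both vectors to obtain orthonormal columns
q1, q2 = A.normalized(), B.normalized()

# The same computation using SymPy's GramSchmidt with normalization switched on
library_q1, library_q2 = GramSchmidt([a, b], True)

print(B.T)
print(q1.T, q2.T)
print(library_q1.T, library_q2.T)
```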
Example problems
Example problem 1
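Out[23]: $$\begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}$$
Out[24]: $$\begin{bmatrix} 0 \end{bmatrix}$$
Out[25]: $$\begin{bmatrix} \frac{\sqrt{3}}{3} \\ \frac{\sqrt{3}}{3} \\ \frac{\sqrt{3}}{3} \end{bmatrix}$$
Out[26]: $$\begin{bmatrix} 0 \\ -\frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} \end{bmatrix}$$
Out[27]: $$\begin{bmatrix} \frac{\sqrt{3}}{3} & 0 \\ \frac{\sqrt{3}}{3} & -\frac{\sqrt{2}}{2} \\ \frac{\sqrt{3}}{3} & \frac{\sqrt{2}}{2} \end{bmatrix}$$
Out[28]: $$\left[ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix},\ \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} \right]$$
Out[29]: $$\left[ \begin{bmatrix} \frac{\sqrt{3}}{3} \\ \frac{\sqrt{3}}{3} \\ \frac{\sqrt{3}}{3} \end{bmatrix},\ \begin{bmatrix} 0 \\ -\frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} \end{bmatrix} \right]$$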
Create an orthogonal matrix from the following matrix
Solution
In [30]: A = [Matrix([1, 0, 0]), Matrix([2, 0, 3]), Matrix([4, 5, 6])]
A
In [31]: Q = GramSchmidt(A, True)
Q
We can also consider QR-factorization
In [32]: from mpmath import matrix, qr # mpmath is a standalone package; the older sympy.mpmath import path was removed in SymPy 1.0
In [33]: A = matrix([[1, 2, 4], [0, 0, 5], [0, 3, 6]])
print(A)
In [34]: Q, R = qr(A)
In [35]: print(Q)
In [36]: print(R)
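SymPy's own `Matrix.QRdecomposition` gives an exact (symbolic) factorization of the same matrix and can serve as a cross-check. This is a sketch that swaps in the SymPy routine for the mpmath one used above; the signs of its Q and R may differ from the mpmath output, since a QR factorization is only unique up to the signs of the columns of Q.

```python
from sympy import Matrix, eye

A = Matrix([[1, 2, 4], [0, 0, 5], [0, 3, 6]])

# Exact QR factorization: Q has orthonormal columns, R is upper triangular
Q, R = A.QRdecomposition()

print(Q)
print(R)
print(Q * R == A)
print(Q.T * Q == eye(3))
```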
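$$\begin{bmatrix} 1 & 2 & 4 \\ 0 & 0 & 5 \\ 0 & 3 & 6 \end{bmatrix}$$
Out[30]: $$\left[ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix},\ \begin{bmatrix} 2 \\ 0 \\ 3 \end{bmatrix},\ \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix} \right]$$
Out[31]: $$\left[ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right]$$
Output of print(A):
[1.0 2.0 4.0]
[0.0 0.0 5.0]
[0.0 3.0 6.0]
Output of print(Q):
[1.0 0.0 0.0]
[0.0 0.0 -1.0]
[0.0 -1.0 0.0]
Output of print(R):
[1.0 2.0 4.0]
[0.0 -3.0 -6.0]
[0.0 0.0 -5.0]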
2015/03/29, 2:01 PMII_05_Properties_of_the_determinant
Page 1 of 4http://localhost:8888/nbconvert/html/II_05_Properties_of_the_determinant.ipynb?download=false
This notebook is part of lecture 18 Properties of determinants in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper
Head of Acute Care Surgery
Groote Schuur Hospital
University of Cape Town
Email me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)
[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June 2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org
In [1]: from IPython.core.display import HTML, Image
css_file = 'style.css'
HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, Matrix, symbols
from IPython.display import HTML
from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
filterwarnings('ignore')
In [4]: a, b, c = symbols('a b c')
Properties of the determinant
Notation
The determinant of a matrix A is written as det(A) or |A|
Main properties
There are three main properties (first three listed below) and seven that follow from them
Out[1]:
1. det(I)=1
2. A row exchange changes the sign of the determinant
 * We now know the determinant of every permutation matrix
3. Multiplying any row by a constant results in the determinant also being multiplied by that constant
 * Only works when altering a single row (the determinant is a linear function of each row separately)
 * An alternate way of seeing this is
4. If two rows are equal then the determinant is zero
 * This follows from property two, where if we interchange rows the sign must change
 * Exchanging two equal rows leaves the matrix unchanged, so the determinant must equal its own negative, which is only possible when it is zero
5. Subtracting a constant multiple of one row from another leaves the determinant unchanged
 * This follows from property three above
 * From property four (the determinant of a matrix with two equal rows is zero) we now have the following
6. The determinant of a matrix with a complete row (or column) of zeros is zero
 * This also follows from property three above, by multiplying a row by zero
7. The determinant of an upper triangular matrix is the product of the elements of the main diagonal (the pivots)
 * Same goes for a diagonal matrix
 * This helps us to develop the expression for the determinant of a 2×2 matrix
 * We can change an upper triangular matrix into a diagonal matrix by row operations (leaving the determinant unchanged by property five)
 * Now we can use the first part of the third property and take out each pivot
 * Eventually we are left with the identity matrix and the product of all the pivots
 * For a zero on the main diagonal we can use the property of a row of zeros and know the determinant is zero
8. If the determinant is zero, the matrix is singular (not invertible, so Ax = 0 has nonzero solutions)
9. The determinant of the product of matrices
10. For the determinant of a transpose of a matrix we have
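The product, inverse, scalar multiple and transpose rules can be checked symbolically for a general 2×2 matrix. This is a sketch only; the symbols `e` to `k` and the `*_rule` names are introduced here for illustration, and each difference should simplify to zero if the rule holds.

```python
from sympy import Matrix, symbols, simplify

a, b, c, d, e, f, g, h, k = symbols('a b c d e f g h k')
A = Matrix([[a, b], [c, d]])
B = Matrix([[e, f], [g, h]])

# Each expression is (left-hand side) - (right-hand side) of one rule
product_rule = simplify((A * B).det() - A.det() * B.det())
inverse_rule = simplify(A.inv().det() - 1 / A.det())
scalar_rule = simplify((k * A).det() - k ** 2 * A.det())  # n = 2 here
transpose_rule = simplify(A.T.det() - A.det())

print(product_rule, inverse_rule, scalar_rule, transpose_rule)
```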
Example problems
Do the following by making use of the properties above
Example problem 1
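$$\begin{vmatrix} a + a' & b + b' \\ c & d \end{vmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} + \begin{vmatrix} a' & b' \\ c & d \end{vmatrix}$$
$$\begin{vmatrix} a & b \\ c - la & d - lb \end{vmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} + \begin{vmatrix} a & b \\ -la & -lb \end{vmatrix}$$
$$= \begin{vmatrix} a & b \\ c & d \end{vmatrix} + (-l) \begin{vmatrix} a & b \\ a & b \end{vmatrix}$$
$$= \begin{vmatrix} a & b \\ c & d \end{vmatrix} + 0$$
$$= \begin{vmatrix} a & b \\ c & d \end{vmatrix}$$
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}$$
$$\begin{bmatrix} a \frac{c}{a} & b \frac{c}{a} \\ c & d \end{bmatrix}$$
$$\begin{bmatrix} a & b \\ c - a \frac{c}{a} & d - b \frac{c}{a} \end{bmatrix}$$
$$\begin{bmatrix} a & b \\ 0 & d - \frac{c}{a} b \end{bmatrix}$$
$$\therefore \begin{vmatrix} a & b \\ 0 & d - \frac{c}{a} b \end{vmatrix} = (a) \left( d - \frac{c}{a} b \right) = ad - a \frac{c}{a} b = ad - bc$$
$$|AB| = |A| |B|$$
$$\left| A^{-1} \right| = \frac{1}{|A|}$$
$$\left| A^2 \right| = |A| |A| = |A|^2$$
$$|cA| = c^n |A|$$
$$\left| A^T \right| = |A|$$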
In [5]: A = Matrix([[101, 201, 301], [102, 202, 302], [103, 203, 303]])
A
Solution
By constant multiple subtraction we get
Two identical rows, thus the determinant is zero
In [6]: A.det()
Example problem 2
In [7]: A = Matrix([[1, a, a ** 2], [1, b, b ** 2], [1, c, c ** 2]])
A
Solution
Subtracting row 1 from rows 2 and 3 and expanding the elements
Using property three, which states that the determinant is a linear function of each row
Another elimination on row three
Now we have upper triangular form and the determinant is the product of the elements on the main diagonal, multiplied by the factor (b-a)(c-a) taken out earlier
In [8]: (A.det()).factor() # Calculating the determinant and factorizing the result
This is called a Vandermonde matrix
http://en.wikipedia.org/wiki/Vandermonde_matrix
Example problem 3
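Out[5]: $$\begin{bmatrix} 101 & 201 & 301 \\ 102 & 202 & 302 \\ 103 & 203 & 303 \end{bmatrix}$$
$$\begin{bmatrix} 101 & 201 & 301 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$$
Out[6]: $$0$$
Out[7]: $$\begin{bmatrix} 1 & a & a^2 \\ 1 & b & b^2 \\ 1 & c & c^2 \end{bmatrix}$$
$$= \begin{vmatrix} 1 & a & a^2 \\ 0 & b - a & b^2 - a^2 \\ 0 & c - a & c^2 - a^2 \end{vmatrix}$$
$$= \begin{vmatrix} 1 & a & a^2 \\ 0 & b - a & (b - a)(b + a) \\ 0 & c - a & (c - a)(c + a) \end{vmatrix}$$
$$= (b - a)(c - a) \begin{vmatrix} 1 & a & a^2 \\ 0 & 1 & b + a \\ 0 & 1 & c + a \end{vmatrix}$$
$$= (b - a)(c - a) \begin{vmatrix} 1 & a & a^2 \\ 0 & 1 & b + a \\ 0 & 0 & c - b \end{vmatrix}$$
$$= (b - a)(c - a)(c - b)$$
Out[8]: $$-(a - b)(a - c)(b - c)$$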
In [9]: A = Matrix([1, 2, 3]) * Matrix([[1, -4, 5]])
A
Solution
The rows of the resultant 3×3 matrix are linearly dependent, i.e. row one is 1 times the row (1,-4,5), row two is twice this same row and row three is three times this same row
This means that the determinant will be zero
In [10]: A.det()
Example problem 4
In [11]: A = Matrix([[0, 1, 3], [-1, 0, 4], [-3, -4, 0]])
A
Solution
Note how this matrix is skew symmetric
This means that $A^T = -A$
With the matrices $A^T$ and $-A$ being equal, their determinants are equal
Remember, though, that the determinant of a matrix is the same as the determinant of the transpose of the matrix
In [12]: A.det()
Not all skew symmetric matrices have a zero determinant
It only works here because n = 3 is odd for this 3×3 matrix, so that $(-1)^n = -1$
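For even n the factor (-1)^n is +1 and the argument gives no information. The following sketch shows two skew symmetric matrices of even size with nonzero determinants; the matrices `S2` and `S4` are examples made up here, not taken from the lecture.

```python
from sympy import Matrix

# Skew symmetric matrices of even size
S2 = Matrix([[0, 1], [-1, 0]])
S4 = Matrix([[0, 1, 0, 0], [-1, 0, 0, 0], [0, 0, 0, 2], [0, 0, -2, 0]])

print(S2.T == -S2, S2.det())
print(S4.T == -S4, S4.det())
```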
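Out[9]: $$\begin{bmatrix} 1 & -4 & 5 \\ 2 & -8 & 10 \\ 3 & -12 & 15 \end{bmatrix}$$
Out[10]: $$0$$
Out[11]: $$\begin{bmatrix} 0 & 1 & 3 \\ -1 & 0 & 4 \\ -3 & -4 & 0 \end{bmatrix}$$
$$|A| = \left| A^T \right| = |-A| = (-1)^3 |A| = -|A|$$
$$|A| = -|A| \\ \therefore |A| = 0$$
Out[12]: $$0$$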
2015/03/29, 2:04 PMII_06_Determinant_formulas_and_cofactors
Page 1 of 5http://localhost:8888/nbconvert/html/II_06_Determinant_formulas_and_cofactors.ipynb?download=false
This notebook is part of lecture 19 Determinant formulas and cofactors in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper
Head of Acute Care Surgery
Groote Schuur Hospital
University of Cape Town
Email me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)
[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June 2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org
In [1]: from IPython.core.display import HTML, Image
css_file = 'style.css'
HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, symbols, Matrix
from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
filterwarnings('ignore')
In [4]: x, y = symbols('x y')
Determinant formulas and cofactors
Tridiagonal matrices
Creating an equation for the determinant of a 2×2 matrix
Using just the three main properties from the previous lecture and knowing that the determinant of a matrix with a column of zeros is zero we have the following
Creating an equation for the determinant of a 3×3 matrix
By the method above, this will create a lot of matrices
We need to figure out which ones remain, i.e. do not have columns of zeros
Note carefully that we just keep those with exactly one element from each row and each column
Out[1]:
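$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = \begin{vmatrix} a & 0 \\ c & d \end{vmatrix} + \begin{vmatrix} 0 & b \\ c & d \end{vmatrix}$$
$$= \begin{vmatrix} a & 0 \\ c & 0 \end{vmatrix} + \begin{vmatrix} a & 0 \\ 0 & d \end{vmatrix} + \begin{vmatrix} 0 & b \\ c & 0 \end{vmatrix} + \begin{vmatrix} 0 & b \\ 0 & d \end{vmatrix}$$
$$\because \begin{vmatrix} a & 0 \\ c & 0 \end{vmatrix} = \begin{vmatrix} 0 & b \\ 0 & d \end{vmatrix} = 0$$
$$\begin{vmatrix} a & 0 \\ 0 & d \end{vmatrix} + \begin{vmatrix} 0 & b \\ c & 0 \end{vmatrix} = \begin{vmatrix} a & 0 \\ 0 & d \end{vmatrix} - \begin{vmatrix} c & 0 \\ 0 & b \end{vmatrix} = ad - bc$$
$$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}$$
$$= \begin{vmatrix} a_{11} & 0 & 0 \\ 0 & a_{22} & 0 \\ 0 & 0 & a_{33} \end{vmatrix} + \begin{vmatrix} a_{11} & 0 & 0 \\ 0 & 0 & a_{23} \\ 0 & a_{32} & 0 \end{vmatrix} + \begin{vmatrix} 0 & a_{12} & 0 \\ a_{21} & 0 & 0 \\ 0 & 0 & a_{33} \end{vmatrix} + \begin{vmatrix} 0 & a_{12} & 0 \\ 0 & 0 & a_{23} \\ a_{31} & 0 & 0 \end{vmatrix} + \begin{vmatrix} 0 & 0 & a_{13} \\ a_{21} & 0 & 0 \\ 0 & a_{32} & 0 \end{vmatrix} + \begin{vmatrix} 0 & 0 & a_{13} \\ 0 & a_{22} & 0 \\ a_{31} & 0 & 0 \end{vmatrix}$$
$$= a_{11} a_{22} a_{33} - a_{11} a_{23} a_{32} - a_{12} a_{21} a_{33} + a_{12} a_{23} a_{31} + a_{13} a_{21} a_{32} - a_{13} a_{22} a_{31}$$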
Creating an equation for the determinant of an n × n matrix
We will have n! terms, half of which are positive and the other half negative
We have n! because for the first row we have n positions to choose from, then for the second row we have n-1 and so on
This holds for permutations of the columns (each used only once)
Consider this example
Successively choosing a single element from each column (using column numbers for the Greek symbols above), we get the following permutations (note their sign as we interchange the numbers to restore the order 1 2 3 4)
(4,3,2,1) = (1,2,3,4) Two swaps
(3,2,1,4) = -(1,2,3,4) One swap
That is it!
So we have 1 - 1 = 0
Note that in this example of a 4×4 matrix a lot of the permutations would have a zero in the product, so we won't end up with 4! = 24 nonzero terms
In [5]: A = Matrix([[0, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0], [1, 0, 0, 1]])
A
In [6]: A.det()
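The sum over permutations can be written out directly for this 4×4 example. The sketch below is a brute-force illustration, not an efficient method; the helper `sign` and the names `nonzero_terms` and `big_formula_det` are introduced here, and columns are numbered from 0 as is usual in Python.

```python
from itertools import permutations
from sympy import Matrix

def sign(perm):
    """Return +1 for an even permutation and -1 for an odd one (by counting inversions)."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

A = Matrix([[0, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0], [1, 0, 0, 1]])
n = A.rows

# Keep only the column permutations whose product of entries is nonzero
nonzero_terms = []
for cols in permutations(range(n)):
    product = 1
    for row, col in enumerate(cols):
        product *= A[row, col]
    if product != 0:
        nonzero_terms.append((cols, sign(cols) * product))

big_formula_det = sum(term for _, term in nonzero_terms)
print(nonzero_terms)
print(big_formula_det)
```

Only two of the 24 permutations survive, with opposite signs, which matches the hand count above.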
We could have seen that this matrix is singular by noting that some combination of rows gives identical rows and then, by subtraction, a row of zeros
In [7]: A.rref()
Cofactors of a 3×3 matrix
Start with the equation above
The cofactors are in parentheses and are the 2×2 submatrix determinants
They signify the determinant of a smaller (n-1)×(n-1) matrix with some sign problems, i.e. some are the positive determinant and some are the negative determinant
We are especially interested here in row one, but any row (or even column) will do
So for any $a_{ij}$ the cofactor is the ± determinant of the (n-1)×(n-1) matrix with row i and column j erased
For the sign, if i + j is even, the sign is positive and if it is odd, then the sign is negative
So the cofactor of $a_{ij}$ is $C_{ij}$
For rows we have
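$$|A| = \sum \pm a_{1\alpha} a_{2\beta} a_{3\gamma} \ldots a_{n\omega}$$
$$(\alpha, \beta, \gamma, \delta, \ldots, \omega) = (1, 2, 3, 4, \ldots, n)$$
$$\begin{bmatrix} 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}$$
Out[5]: $$\begin{bmatrix} 0 & 0 & 1 & 1 \\ 0 & 1 & 1 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}$$
Out[6]: $$0$$
Out[7]: $$\left( \begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix},\ [0, 1, 2] \right)$$
$$a_{11} a_{22} a_{33} - a_{11} a_{23} a_{32} - a_{12} a_{21} a_{33} + a_{12} a_{23} a_{31} + a_{13} a_{21} a_{32} - a_{13} a_{22} a_{31} \\ = a_{11} (a_{22} a_{33} - a_{23} a_{32}) + a_{12} (-a_{21} a_{33} + a_{23} a_{31}) + a_{13} (a_{21} a_{32} - a_{22} a_{31})$$
$$|A|_i = \sum_{k=1}^{n} a_{ik} C_{ik}$$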
Tridiagonal matrices
Calculate $|A_1|$
In [8]: A = Matrix([1])
A
In [9]: A.det()
Calculate $|A_2|$
In [10]: A = Matrix([[1, 1], [1, 1]])
A
In [11]: A.det()
Calculate $|A_3|$
In [12]: A = Matrix([[1, 1, 0], [1, 1, 1], [0, 1, 1]])
A
By the cofactor equation above
In [13]: A.det()
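The cofactor expansion along row one can be spelled out with SymPy's `Matrix.cofactor`, which already includes the sign. A short sketch, with `cofactors` and `expansion` as illustrative names and rows and columns numbered from 0:

```python
from sympy import Matrix

A = Matrix([[1, 1, 0], [1, 1, 1], [0, 1, 1]])

row = 0  # expand along the first row
cofactors = [A.cofactor(row, k) for k in range(A.cols)]

# Sum of each entry in the row times its cofactor
expansion = sum(A[row, k] * cofactors[k] for k in range(A.cols))

print(cofactors)
print(expansion, A.det())
```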
Calculate $|A_4|$
In [14]: A = Matrix([[1, 1, 0, 0], [1, 1, 1, 0], [0, 1, 1, 1], [0, 0, 1, 1]])
A
In [15]: A.det()
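$$\left| A_1 \right|$$
Out[8]: $$\begin{bmatrix} 1 \end{bmatrix}$$
Out[9]: $$1$$
$$\left| A_2 \right|$$
Out[10]: $$\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$$
Out[11]: $$0$$
$$\left| A_3 \right|$$
Out[12]: $$\begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{bmatrix}$$
$$|A|_i = \sum_{k=1}^{n} a_{ik} C_{ik}$$
$$|A|_1 = a_{11} C_{11} + a_{12} C_{12} + a_{13} C_{13}$$
$$C_{ij} \to +; \ (i + j) \in 2n$$
$$C_{ij} \to -; \ (i + j) \in 2n + 1$$
$$|A|_1 = 1(0) - 1(1) + 0(1) = -1$$
Out[13]: $$-1$$
$$\left| A_4 \right|$$
Out[14]: $$\begin{bmatrix} 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \end{bmatrix}$$
Out[15]: $$-1$$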
Continuing on this path of tridiagonal matrices we have
We would thus have
We note that $A_7$ starts the sequence all over again
The determinants of these tridiagonal matrices of ones repeat with period 6
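The period can be observed by computing the determinants for a range of sizes. This is a sketch; the helper `tridiagonal_ones` is a name introduced here and builds the n×n matrix with ones on the main diagonal and the two diagonals next to it.

```python
from sympy import Matrix

def tridiagonal_ones(n):
    """n x n matrix with ones on the main, upper and lower diagonals and zeros elsewhere."""
    return Matrix(n, n, lambda i, j: 1 if abs(i - j) <= 1 else 0)

# Determinants of A_1 up to A_12
determinants = [tridiagonal_ones(n).det() for n in range(1, 13)]
print(determinants)
```

The second block of six values should repeat the first block of six.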
Example problems
Example problem 1
Calculate the determinant of the following matrix
In [16]: A = Matrix([[x, y, 0, 0, 0], [0, x, y, 0, 0], [0, 0, x, y, 0], [0, 0, 0, x, y], [y, 0, 0, 0, x]])
A
In [17]: A.det()
Solution
Note how selecting the x and the y in column 1 in turn leaves triangular matrices in the remaining (n-1)×(n-1) matrix
These form cofactors and their determinants are simply the product of the entries along the main diagonal
We simply have to remember the sign rule, which for the y in row 5, column 1 will be $(-1)^{(5+1)}$
Example problem 2
In [18]: A = Matrix([[x, y, y, y, y], [y, x, y, y, y], [y, y, x, y, y], [y, y, y, x, y], [y, y, y, y, x]])
A
Solution
In [19]: A.det()
In [20]: (A.det()).factor()
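$$\left| A_n \right| = \left| A_{n-1} \right| - \left| A_{n-2} \right|$$
$$\left| A_5 \right| = \left| A_4 \right| - \left| A_3 \right| = -1 - (-1) = 0$$
$$\left| A_6 \right| = \left| A_5 \right| - \left| A_4 \right| = 0 - (-1) = 1$$
Out[16]: $$\begin{bmatrix} x & y & 0 & 0 & 0 \\ 0 & x & y & 0 & 0 \\ 0 & 0 & x & y & 0 \\ 0 & 0 & 0 & x & y \\ y & 0 & 0 & 0 & x \end{bmatrix}$$
Out[17]: $$x^5 + y^5$$
$$|A| = x \left( x^4 \right) + y \left( y^4 \right) = x^5 + y^5$$
Out[18]: $$\begin{bmatrix} x & y & y & y & y \\ y & x & y & y & y \\ y & y & x & y & y \\ y & y & y & x & y \\ y & y & y & y & x \end{bmatrix}$$
Out[19]: $$x^5 - 10 x^3 y^2 + 20 x^2 y^3 - 15 x y^4 + 4 y^5$$
Out[20]: $$(x + 4y)(x - y)^4$$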
Note that we can introduce many zero entries by the elementary row operation of subtracting one row from another
Let's subtract row 4 from row 5
Now subtract row 3 from 4
Subtract 2 from 3
... and 1 from 2
Now consider some column operations, adding the 5th column to the 4th column
This will introduce new non-zero entries, though
These can be changed back to zero by then adding the 5th and the 4th columns to the 3rd
Then columns 5, 4 and 3 to column 2, etc...
This is upper triangular and the determinant is the product of the entries on the main diagonal
In [21]: (x + 4 * y) * (x - y) ** 4
In [22]: ((x + 4 * y) * (x - y) ** 4).expand()
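$$\begin{bmatrix} x & y & y & y & y \\ y & x & y & y & y \\ y & y & x & y & y \\ y & y & y & x & y \\ 0 & 0 & 0 & y - x & x - y \end{bmatrix}$$
$$\begin{bmatrix} x & y & y & y & y \\ y & x & y & y & y \\ y & y & x & y & y \\ 0 & 0 & y - x & x - y & 0 \\ 0 & 0 & 0 & y - x & x - y \end{bmatrix}$$
$$\begin{bmatrix} x & y & y & y & y \\ y & x & y & y & y \\ 0 & y - x & x - y & 0 & 0 \\ 0 & 0 & y - x & x - y & 0 \\ 0 & 0 & 0 & y - x & x - y \end{bmatrix}$$
$$\begin{bmatrix} x & y & y & y & y \\ y - x & x - y & 0 & 0 & 0 \\ 0 & y - x & x - y & 0 & 0 \\ 0 & 0 & y - x & x - y & 0 \\ 0 & 0 & 0 & y - x & x - y \end{bmatrix}$$
$$\begin{bmatrix} x + 4y & 4y & 3y & 2y & y \\ 0 & x - y & 0 & 0 & 0 \\ 0 & 0 & x - y & 0 & 0 \\ 0 & 0 & 0 & x - y & 0 \\ 0 & 0 & 0 & 0 & x - y \end{bmatrix}$$
Out[21]: $$(x + 4y)(x - y)^4$$
Out[22]: $$x^5 - 10 x^3 y^2 + 20 x^2 y^3 - 15 x y^4 + 4 y^5$$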
2015/03/29, 2:07 PMII_07_Equations_for_the_inverse_Cramer_rule_Volume_of_a_box
Page 1 of 4http://localhost:8888/nbconvert/html/II_07_Equations_for_the_inverse_Cramer_rule_Volume_of_a_box.ipynb?download=false
This notebook is part of lecture 20 Cramer's rule, inverse, and volume of a box in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper
Head of Acute Care Surgery
Groote Schuur Hospital
University of Cape Town
Email me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)
[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June 2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org
In [1]: from IPython.core.display import HTML, Image
css_file = 'style.css'
HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, symbols, eye, Matrix, Rational
from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
filterwarnings('ignore')
Equations for the inverse of a matrix
Cramer's rule
The volume of a box
Deriving an equation for the inverse of a matrix
The equation for the inverse of a matrix
With arithmetic alteration we have the following
Writing out the left-hand side we have
Out[1]:
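$$A^{-1} = \frac{1}{|A|} C^T$$
$$\therefore A C^T = |A| I$$
$$\begin{bmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & \ldots & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{bmatrix} \begin{bmatrix} C_{11} & C_{21} & \ldots & C_{n1} \\ C_{12} & C_{22} & \ldots & C_{n2} \\ \vdots & \vdots & \ldots & \vdots \\ C_{1n} & C_{2n} & \ldots & C_{nn} \end{bmatrix}$$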
From the previous lecture we had the equation for the determinant (using cofactors), which correlates with the above (looking at row 1 times column 1, i.e. i = 1)
Alas, we have to get |A| only on the main diagonal for the right-hand side above
It follows, though, that for e.g. the position row 1, column 2 we do get a zero
Look at the 2×2 matrix
So for $AC^T$ we would have the following (note, though, what happens when we calculate the entry in row 1, column 2)
... and that's so cool!
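The same identity can be checked on a 3×3 example using SymPy's `cofactor_matrix`. This is a sketch; the matrix is chosen here for illustration and `A_inverse` is an illustrative name.

```python
from sympy import Matrix, eye

A = Matrix([[2, 2, -1], [1, 3, 0], [-1, 1, 4]])

C = A.cofactor_matrix()      # matrix of cofactors C_ij
A_inverse = C.T / A.det()    # inverse as (1 / |A|) C^T

print(A * C.T)               # |A| on the main diagonal, zeros elsewhere
print(A_inverse == A.inv())
```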
Cramer's rule
From Ax=b we have $x = A^{-1} b$, which gives us the following
This is difficult to see, but we successively replace each column in A with the column vector b, which creates a bunch of new matrices $B_j$, such that the following applies
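Cramer's rule, with column j of A replaced by b to form each new matrix, can be sketched as a small function. The name `cramer_solve` and the example right-hand side are assumptions made here for illustration; the rule is mainly of theoretical interest, since elimination is far cheaper for large systems.

```python
from sympy import Matrix

def cramer_solve(A, b):
    """Solve A x = b for a square, invertible A using Cramer's rule."""
    det_A = A.det()
    solution = []
    for j in range(A.cols):
        B_j = A.copy()
        B_j[:, j] = b                    # replace column j of A with b
        solution.append(B_j.det() / det_A)
    return Matrix(solution)

A = Matrix([[2, 2, -1], [1, 3, 0], [-1, 1, 4]])
b = Matrix([1, 2, 3])
x = cramer_solve(A, b)

print(x.T)
print(A * x == b)
```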
The volume of a box (parallelepiped)
Consider a box in three dimensions (each side is a parallelogram)
Make one corner coincide with the origin
The vector coordinates of the three edges that emanate from this corner become the rows of a square matrix, 3×3 in this case
The volume is then the absolute value of the determinant of this matrix
Consider a cube with sides of unit length
In [4]: A = eye(3)
A
In [5]: A.det()
This agrees with the first property of determinants, det(I)=1
What about the orthogonal matrix
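$$|A| = a_{i1} C_{i1} + a_{i2} C_{i2} + \ldots + a_{in} C_{in}$$
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{|A|} C^T$$
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{|A|} \begin{bmatrix} d & -c \\ -b & a \end{bmatrix}^T$$
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \frac{1}{|A|} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} = \begin{bmatrix} ad - bc & -ab + ab \\ cd - cd & ad - bc \end{bmatrix} = \begin{bmatrix} ad - bc & 0 \\ 0 & ad - bc \end{bmatrix} = \begin{bmatrix} |A| & 0 \\ 0 & |A| \end{bmatrix} = |A| \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
$$\underline{x} = \frac{1}{|A|} C^T \underline{b}$$
$$\therefore \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \frac{1}{|A|} \begin{bmatrix} C_{11} & C_{21} & \ldots & C_{n1} \\ C_{12} & C_{22} & \ldots & C_{n2} \\ \vdots & \vdots & \ldots & \vdots \\ C_{1n} & C_{2n} & \ldots & C_{nn} \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix}$$
$$\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} \frac{\left| B_1 \right|}{|A|} \\ \frac{\left| B_2 \right|}{|A|} \\ \vdots \\ \frac{\left| B_n \right|}{|A|} \end{bmatrix}$$
Out[4]: $$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
Out[5]: $$1$$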
Here we have the three edges being orthonormal
We know that $Q^T Q = I$
A rectangular box (edges at right angles)
Doubling an edge doubles the volume
This is akin to a single row multiplied by a scalar
Thus the determinant will be multiplied by this scalar
Area of parallelogram and a triangle
The area of a parallelogram is just the (absolute value of the) determinant of a 2×2 matrix with the rows being the row vectors of two sides from the origin
The area of a triangle is simply half of this
For the triangle that is not at the origin, with its three vertices at $(x_1, y_1)$, $(x_2, y_2)$ and $(x_3, y_3)$, we simply subtract values along the axes from each other
That is akin to getting the determinant of this matrix
Simple row reduction is equivalent to moving the triangle by the subtraction above
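The triangle formula can be tried on a right-angled triangle whose area is known, once with a vertex at the origin and once shifted. A sketch, with `triangle_area` and the example coordinates introduced here for illustration:

```python
from sympy import Matrix, Rational

def triangle_area(p1, p2, p3):
    """Half the absolute value of the 3x3 determinant built from the three vertices."""
    M = Matrix([[p1[0], p1[1], 1],
                [p2[0], p2[1], 1],
                [p3[0], p3[1], 1]])
    return Rational(1, 2) * abs(M.det())

# Right-angled triangle with legs 4 and 3, so the area is (4 * 3) / 2
area_at_origin = triangle_area((0, 0), (4, 0), (0, 3))

# The same triangle moved by (1, 1); the area should not change
area_shifted = triangle_area((1, 1), (5, 1), (1, 4))

print(area_at_origin, area_shifted)
```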
Example problems
Example problem 1
Calculate the volume of the tetrahedron with vertices being the following vectors
$a_1 = (2,2,-1)$, $a_2 = (1,3,0)$, $a_3 = (-1,1,4)$
Also calculate the volume if $a_3 = (-201,-199,104)$
Solution
In [6]: A = Matrix([[2, 2, -1], [1, 3, 0], [-1, 1, 4]])
A
The volume of a tetrahedron is a third of the area of the (any) base times the height from the (chosen) base
The volume of a parallelepiped is the area of the base times the height
If we keep the edges from the origin and the apex the same for the two, we note that the base of the parallelepiped is twice the area of the triangle that forms the base of the tetrahedron
We thus have that the volume of the tetrahedron is a sixth of the volume of the parallelepiped
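$$|Q^{T}Q| = |I|$$
$$|Q^{T}||Q| = |I|$$
$$\because |Q^{T}| = |Q| \quad \therefore |Q||Q| = |I|$$
$$|Q|^{2} = |I| = 1 \quad \therefore |Q| = \pm 1$$
$$\begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix}$$
$$\begin{vmatrix} x_1 & y_1 & 1 \\ x_2 - x_1 & y_2 - y_1 & 0 \\ x_3 - x_1 & y_3 - y_1 & 0 \end{vmatrix}$$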
Out[6]: $$\begin{bmatrix} 2 & 2 & -1 \\ 1 & 3 & 0 \\ -1 & 1 & 4 \end{bmatrix}$$
In [7]: A.det()
In [8]: Rational(1, 6) * A.det()
In [9]: A_new = Matrix([[2, 2, -1], [1, 3, 0], [-201, -199, 104]])
        A_new
In [10]: Rational(1, 6) * A_new.det()
By the second part of the third property of determinants we know, though, that subtracting a constant multiple of one row from another (one of the elementary row operations) does not change the determinant
In this case we subtracted 100 times row 1 from row 3
In effect, the height is not changing; the apex is moving away parallel to a₁
Out[7]: 12
Out[8]: 2
Out[9]: $$\begin{bmatrix} 2 & 2 & -1 \\ 1 & 3 & 0 \\ -201 & -199 & 104 \end{bmatrix}$$
Out[10]: 2
2015/03/29, 2:10 PMII_08_Eigenvalues_and_eigenvectors
Page 1 of 5http://localhost:8888/nbconvert/html/II_08_Eigenvalues_and_eigenvectors.ipynb?download=false
This notebook is part of lecture 21 Eigenvalues and eigenvectors in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper
Head of Acute Care SurgeryGroote Schuur HospitalUniversity Cape TownEmail me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, Matrix, symbols, eye
        from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
        filterwarnings('ignore')
In [4]: lamda = symbols('lamda') # Note that lambda is a reserved word in python, so we use lamda (without the b)
Eigenvalues and eigenvectors
What are eigenvectors?
A matrix is a mathematical object that acts on a (column) vector, resulting in a new vector, i.e. Ax=b
An eigenvector is a non-zero vector x for which the resulting vector Ax is parallel to x (some multiple of x)
The eigenvectors with an eigenvalue of zero are the non-zero vectors in the nullspace
If A is singular (takes some non-zero vector into 0) then λ=0 is an eigenvalue
What are the eigenvectors and eigenvalues for projection matrices?
A projection matrix P projects some vector (b) onto a subspace (in 3-space we are talking about a plane through the origin)
In general, Pb is not in the same direction as b
A vector x that is already in the subspace will result in Px=x, so λ=1
Another good x would be one perpendicular to the subspace, i.e. Px=0x, so λ=0
What are the eigenvectors and eigenvalues for permutation matrices?
$$A\underline{x} = \lambda\underline{x}$$
A permutation matrix such as the one below changes the order of the elements in a (column) vector
A good example of a vector that would remain in the same direction after multiplication by the permutation matrix above would be the following vector
The eigenvalue would just be λ=1
The next (eigen)vector would also work
It would have an eigenvalue of λ=-1
The trace and the determinant
The trace is the sum of the values down the main diagonal of a square matrix
Note how this is the same as the sum of the eigenvalues (look at the permutation matrix above and its eigenvalues)
The determinant of A is the product of the eigenvalues
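The two statements about the trace and the determinant can be checked directly. The following is a small sketch using sympy and the 2×2 permutation matrix from this section; the variable names are chosen here for illustration only.

```python
from sympy import Matrix

# The permutation matrix discussed above
P = Matrix([[0, 1], [1, 0]])

# eigenvals() returns a dictionary {eigenvalue: multiplicity}
eigs = P.eigenvals()

# Sum and product of the eigenvalues, counting multiplicities
eig_sum = sum(value * mult for value, mult in eigs.items())
eig_prod = 1
for value, mult in eigs.items():
    eig_prod *= value ** mult

print(eig_sum, P.trace())  # sum of eigenvalues and trace
print(eig_prod, P.det())   # product of eigenvalues and determinant
```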
How to solve Ax=λx
A non-zero solution x exists only if A-λI is singular and therefore has a determinant of zero
This is called the characteristic (or eigenvalue) equation
There will be n λ's for an n×n matrix (some of which may be of equal value)
In [5]: A = Matrix([[3, 1], [1, 3]])
        I = eye(2)
        A, I # Printing A and the 2-by-2 identity matrix to the screen
In [6]: (A - lamda * I) # Printing A minus lambda times the identity matrix to the screen
This will have the following determinant
In [7]: (A - lamda * I).det()
For this 2×2 matrix the coefficient -6 is the negative of the trace of A (3 + 3 = 6) and the constant 8 is the determinant of A
In [8]: ((A - lamda * I).det()).factor()
I now have two eigenvalues of 2 and 4
In sympy we could also use the .eigenvals() method
In [9]: A.eigenvals() # There is one value of 2 and one value of 4
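$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$
$$\begin{bmatrix} 1 \\ 1 \end{bmatrix}$$
$$\begin{bmatrix} -1 \\ 1 \end{bmatrix}$$
$$A\underline{x} = \lambda\underline{x}$$
$$(A - \lambda I)\underline{x} = \underline{0}$$
$$|A - \lambda I| = 0$$
Out[5]: $$\left(\begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix},\ \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\right)$$
Out[6]: $$\begin{bmatrix} -\lambda + 3 & 1 \\ 1 & -\lambda + 3 \end{bmatrix}$$
Out[7]: $$\lambda^{2} - 6\lambda + 8$$
Out[8]: $$(\lambda - 4)(\lambda - 2)$$
Out[9]: $$\{2 : 1,\ 4 : 1\}$$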
The eigenvectors are calculated by substituting the two values of λ into the original equation
In [10]: A.eigenvects()
The result above is interpreted as follows
Each tuple holds an eigenvalue, its multiplicity, and a list of its eigenvectors
The first eigenvalue has one eigenvector and the second eigenvalue also has a single eigenvector
Note the similarity between the eigenvectors of the two examples above
It is easy to see that adding a constant multiple of the identity matrix to another matrix (above we added 3I to the initial matrix) doesn't change the eigenvectors; it does add that constant to the eigenvalues though (we went from -1 and 1 to 2 and 4)
If we add another matrix to A (not a constant multiple of I) or even multiply them, then the influence on the original eigenvalues and eigenvectors of A is NOT so predictable (as above)
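The shift property can be confirmed with a short sketch. It assumes the permutation matrix from earlier as the initial matrix and c = 3, as in the text; the names P, A_shifted, x_plus and x_minus are introduced here only for illustration.

```python
from sympy import Matrix, eye

P = Matrix([[0, 1], [1, 0]])   # eigenvalues 1 and -1
A_shifted = P + 3 * eye(2)     # the matrix [[3, 1], [1, 3]]

x_plus = Matrix([1, 1])        # eigenvector of P with eigenvalue 1
x_minus = Matrix([-1, 1])      # eigenvector of P with eigenvalue -1

# The same vectors are eigenvectors of the shifted matrix,
# with each eigenvalue increased by 3
print(A_shifted * x_plus == 4 * x_plus)
print(A_shifted * x_minus == 2 * x_minus)
```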
The eigenvalues and eigenvectors of a rotation matrix
Consider this rotation matrix that rotates a vector by 90° (it is orthogonal)
Think about it, though: what vector can come out parallel to itself after a 90° rotation?
In [11]: Q = Matrix([[0, -1], [1, 0]])
         Q
From the trace and determinant above we know that we will have the following equation
In [12]: Q.eigenvals()
In [13]: Q.eigenvects()
Note how the eigenvalues are complex conjugates
Symmetric matrices will only have real eigenvalues
An anti-symmetric matrix (where the transpose is the original matrix times the scalar -1, as in our example above) will only have purely imaginary (or zero) eigenvalues
Matrices in between can have a mix of these
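A quick sketch of the contrast between symmetric and anti-symmetric matrices follows. The two matrices are hypothetical examples chosen here, not taken from the lecture.

```python
from sympy import Matrix, re

# Hypothetical symmetric matrix (equal to its transpose)
S_sym = Matrix([[2, 1], [1, 2]])
# Hypothetical anti-symmetric matrix (transpose equals minus the matrix)
K_anti = Matrix([[0, -2], [2, 0]])

sym_eigs = list(S_sym.eigenvals())
anti_eigs = list(K_anti.eigenvals())

# Real eigenvalues for the symmetric matrix
print(sym_eigs, all(e.is_real for e in sym_eigs))
# Zero real part for every eigenvalue of the anti-symmetric matrix
print(anti_eigs, all(re(e) == 0 for e in anti_eigs))
```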
Eigenvalues and eigenvectors of an upper triangular matrix
Compute the eigenvalues and eigenvectors of the following matrix (note it is upper triangular)
In [14]: A = Matrix([[3, 1], [0, 3]])
         A
In [15]: A.eigenvals()
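$$(A - \lambda I)\underline{x} = \underline{0}$$
Out[10]: $$\left[\left(2,\ 1,\ \left[\begin{bmatrix} -1 \\ 1 \end{bmatrix}\right]\right),\ \left(4,\ 1,\ \left[\begin{bmatrix} 1 \\ 1 \end{bmatrix}\right]\right)\right]$$
$$A\underline{x} = \lambda\underline{x} \quad \therefore (A + cI)\underline{x} = (\lambda + c)\underline{x}$$
Out[11]: $$\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$$
$$\lambda^{2} - 0\lambda + 1 = 0$$
$$\lambda^{2} = -1$$
Out[12]: $$\{-i : 1,\ i : 1\}$$
Out[13]: $$\left[\left(-i,\ 1,\ \left[\begin{bmatrix} -i \\ 1 \end{bmatrix}\right]\right),\ \left(i,\ 1,\ \left[\begin{bmatrix} i \\ 1 \end{bmatrix}\right]\right)\right]$$
Out[14]: $$\begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix}$$
Out[15]: $$\{3 : 2\}$$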
We have two eigenvalues, both equal to 3
In [16]: A.eigenvects()
This is a degenerate matrix; the repeated eigenvalue has only one independent eigenvector, so there are not two independent eigenvectors
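The shortage of eigenvectors can be made explicit. This is a small sketch using the same 2×2 matrix; it relies on sympy's eigenvects() returning (eigenvalue, multiplicity, eigenvectors) tuples and on the is_diagonalizable() method.

```python
from sympy import Matrix

A = Matrix([[3, 1], [0, 3]])

# A single tuple is returned: (eigenvalue, multiplicity, list of eigenvectors)
(value, multiplicity, vectors), = A.eigenvects()

print(value, multiplicity, len(vectors))
# Fewer independent eigenvectors than the size of A
print(A.is_diagonalizable())
```

The multiplicity of 2 against a single eigenvector is what marks the matrix as degenerate.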
Look at this upper triangular matrix
In [17]: A = Matrix([[3, 1, 1], [0, 3, 4], [0, 0, 3]])
         A
In [18]: A.eigenvals()
In [19]: A.eigenvects()
Example problems
Example problem 1
Find the eigenvalues and eigenvectors of the square of the following matrix as well as the inverse of the matrix minus the identity matrix
Solution
Notice the following
Once we know the eigenvalues for A we can then simply square them to get the eigenvalues of the matrix squared
Similarly for the inverse of the matrix we have the following (for a non-zero λ, which is fine as A must be invertible for this problem)
In [20]: A = Matrix([[1, 2, 3], [0, 1, -2], [0, 1, 4]])
         A
In [21]: A.eigenvals()
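Out[16]: $$\left[\left(3,\ 2,\ \left[\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right]\right)\right]$$
Out[17]: $$\begin{bmatrix} 3 & 1 & 1 \\ 0 & 3 & 4 \\ 0 & 0 & 3 \end{bmatrix}$$
Out[18]: $$\{3 : 3\}$$
Out[19]: $$\left[\left(3,\ 3,\ \left[\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\right]\right)\right]$$
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & -2 \\ 0 & 1 & 4 \end{bmatrix}$$
$$A\underline{x} = \lambda\underline{x}$$
$$A^{2}\underline{x} = A(A\underline{x}) = A(\lambda\underline{x}) = \lambda(A\underline{x}) = \lambda^{2}\underline{x}$$
$$A^{-1}\underline{x} = A^{-1}\frac{A\underline{x}}{\lambda} = A^{-1}A\frac{1}{\lambda}\underline{x} = \frac{1}{\lambda}\underline{x}$$
Out[20]: $$\begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & -2 \\ 0 & 1 & 4 \end{bmatrix}$$
Out[21]: $$\{1 : 1,\ 2 : 1,\ 3 : 1\}$$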
In [22]: A.eigenvects()
From this it is clear that the eigenvalues of A² will be 1, 4, and 9 and those of A⁻¹ will be 1, a half and a third
In [23]: (A ** 2).eigenvals()
In [24]: (A.inv()).eigenvals()
The eigenvectors will be as follows (exactly the same)
In [25]: (A ** 2).eigenvects()
In [26]: (A.inv()).eigenvects()
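Out[22]: $$\left[\left(1,\ 1,\ \left[\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\right]\right),\ \left(2,\ 1,\ \left[\begin{bmatrix} -1 \\ -2 \\ 1 \end{bmatrix}\right]\right),\ \left(3,\ 1,\ \left[\begin{bmatrix} \frac{1}{2} \\ -1 \\ 1 \end{bmatrix}\right]\right)\right]$$
Out[23]: $$\{1 : 1,\ 4 : 1,\ 9 : 1\}$$
Out[24]: $$\left\{\frac{1}{3} : 1,\ \frac{1}{2} : 1,\ 1 : 1\right\}$$
Out[25]: $$\left[\left(1,\ 1,\ \left[\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\right]\right),\ \left(4,\ 1,\ \left[\begin{bmatrix} -1 \\ -2 \\ 1 \end{bmatrix}\right]\right),\ \left(9,\ 1,\ \left[\begin{bmatrix} \frac{1}{2} \\ -1 \\ 1 \end{bmatrix}\right]\right)\right]$$
Out[26]: $$\left[\left(\frac{1}{3},\ 1,\ \left[\begin{bmatrix} \frac{1}{2} \\ -1 \\ 1 \end{bmatrix}\right]\right),\ \left(\frac{1}{2},\ 1,\ \left[\begin{bmatrix} -1 \\ -2 \\ 1 \end{bmatrix}\right]\right),\ \left(1,\ 1,\ \left[\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\right]\right)\right]$$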
2015/03/29, 7:37 PMII_09_Diagonalization_and_Powers
Page 1 of 8http://localhost:8888/nbconvert/html/II_09_Diagonalization_and_Powers.ipynb?download=false
This notebook is part of lecture 22 Diagonalization and powers of A in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper
Head of Acute Care SurgeryGroote Schuur HospitalUniversity Cape TownEmail me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, Matrix, symbols, eye, Rational
        from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
        filterwarnings('ignore')
Diagonalizing a matrix
Powers of a matrix A
Definition
If A is an n×n matrix, then a non-zero vector x in ℝⁿ is called an eigenvector of the matrix A if Ax is a scalar multiple of x
What this suggests is that if you consider the column vector x and multiply it by a scalar (here called λ) (the result is then parallel to x, just of different length) it gives the same result as multiplying the matrix A by x
Let's try another explanation: if a matrix A, multiplied with a (column) vector (x) results in a scalar multiple of that same (column) vector (and is thus parallel to that (column) vector) then this (column) vector is an eigenvector of the matrix A
In essence this multiplication of a matrix with a (column) vector produces another vector on the same line as the original vector
Depending on the value of this scalar the resulting vector might point in the opposite direction and be shorter or longer than the original
This scalar multiple is called the eigenvalue
Matrices can have more than one eigenvalue and eigenvector
Derivations
We need to insert an identity matrix of size n into the equation that describes the explanation above
Look at this carefully and you'll notice that we are looking for the nullspace (eigenspace) of the matrix (A-λI)
This matrix has to be singular, i.e. have a determinant of 0
Solving this equation (called the characteristic equation) will give us the eigenvalues (λ's)
It will always be a polynomial in λ (called the characteristic polynomial of A), with a leading coefficient of 1 (after multiplying through by -1 if n is odd) and a degree of n corresponding to the size of A
Substituting them back into...
... allows us to calculate the eigenvector(s) x
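$$A\underline{x} = \lambda\underline{x}$$
$$A\underline{x} = \lambda I\underline{x}$$
$$A\underline{x} - \lambda I\underline{x} = \underline{0}$$
$$(A - \lambda I)\underline{x} = \underline{0}$$
$$|A - \lambda I| = 0$$
$$p(\lambda) = \lambda^{n} + c_1\lambda^{n-1} + \dots + c_n$$
$$(A - \lambda I)\underline{x} = \underline{0}$$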
Let's look at the following matrix A
Let's start with the first eigenvalue, which is equal to 1 and replace it in A-λI
In [4]: A = Matrix([[-1, 0, -2], [1, 1, 1], [1, 0, 2]])
        A
We now need the nullspace of this matrix
In [5]: A.nullspace()
We knew that this would be 1-dimensional after looking at the row-reduced form
In [6]: A.rref()
It has rank 2 (two pivot columns and 1 free variable)
Now for the other 2 eigenvalues, both equaling 2
In [7]: A = Matrix([[-2, 0, -2], [1, 0, 1], [1, 0, 1]])
        A
In [8]: A.nullspace()
In [9]: A.rref()
Only a single pivot column, therefore a rank of 1 and two independent (free) variables
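$$A = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix}$$
$$A - \lambda I = \begin{bmatrix} 0 & 0 & -2 \\ 1 & 2 & 1 \\ 1 & 0 & 3 \end{bmatrix} - \begin{bmatrix} \lambda & 0 & 0 \\ 0 & \lambda & 0 \\ 0 & 0 & \lambda \end{bmatrix} = \begin{bmatrix} -\lambda & 0 & -2 \\ 1 & 2 - \lambda & 1 \\ 1 & 0 & 3 - \lambda \end{bmatrix}$$
$$\begin{vmatrix} -\lambda & 0 & -2 \\ 1 & 2 - \lambda & 1 \\ 1 & 0 & 3 - \lambda \end{vmatrix} = 0$$
$$\lambda^{3} - 5\lambda^{2} + 8\lambda - 4 = 0$$
$$\lambda_1 = 1, \quad \lambda_2 = \lambda_3 = 2$$
Out[4]: $$\begin{bmatrix} -1 & 0 & -2 \\ 1 & 1 & 1 \\ 1 & 0 & 2 \end{bmatrix}$$
Out[5]: $$\left[\begin{bmatrix} -2 \\ 1 \\ 1 \end{bmatrix}\right]$$
Out[6]: $$\left(\begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix},\ [0,\ 1]\right)$$
Out[7]: $$\begin{bmatrix} -2 & 0 & -2 \\ 1 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix}$$
Out[8]: $$\left[\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix},\ \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}\right]$$
Out[9]: $$\left(\begin{bmatrix} 1 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},\ [0]\right)$$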
Corresponding to the first eigenvalue we have a single eigenvector that is the basis for a 1-dimensional (line) eigenspace in ℝ³
Corresponding to the second (and third) eigenvalues we have two basis vectors for a 2-dimensional plane in ℝ³
Since we are talking about subspaces, we must note that the zero vector must be in both eigenspaces (a type of nullspace), but isn't an eigenvector
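The two eigenspace computations can be collected in one loop. The following is a sketch using the 3×3 matrix A from this example; the dictionary names eigenspaces and dimensions are chosen here for illustration.

```python
from sympy import Matrix, eye

A = Matrix([[0, 0, -2], [1, 2, 1], [1, 0, 3]])

eigenspaces = {}
for eigenvalue in (1, 2):
    # The eigenspace is the nullspace of A - lambda * I
    basis = (A - eigenvalue * eye(3)).nullspace()
    eigenspaces[eigenvalue] = basis

# Dimension of each eigenspace (number of basis vectors)
dimensions = {value: len(basis) for value, basis in eigenspaces.items()}
print(dimensions)

# Every basis vector satisfies A x = lambda x
checks = [A * x == value * x
          for value, basis in eigenspaces.items() for x in basis]
print(all(checks))
```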
The eigenvalues of triangular (upper and lower) and diagonal matrices
The eigenvalues of these types of matrices are exactly the entries along the main diagonal
Real and complex eigenvalues
There will be characteristic polynomials resulting in complex roots
The consequences of a real-valued eigenvalue λ for a square matrix A of size n are the following
The system (A-λI)x=0 has non-trivial solutions
There is a non-zero vector x in ℝⁿ such that Ax=λx
The eigenvector matrix S and eigenvalue matrix Λ
We need to create S from the (column) eigenvectors such that the following holds
As such, S should be square of size n×n and invertible, so we need n independent eigenvectors
Suppose we have n linearly independent eigenvectors of A
Put them in the columns of S and calculate AS
From this we have the following
Later I will use the computer variable D for this diagonal matrix Λ
The power of a matrix (only for n independent eigenvectors)
We saw in the example section of the last lecture that the following holds
The eigenvectors are the same for A and A²
We can also see the following
The power need not be 2, but can be any k, in which case S⁻¹S appears (and cancels) k-1 times
We thus have the following theorems
...and...
If k is a positive integer, λ is an eigenvalue of the matrix A, and x is a corresponding eigenvector, then λᵏ is an eigenvalue of Aᵏ and x is a corresponding eigenvector
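$$S^{-1}AS = \Lambda$$
$$AS = A\begin{bmatrix} \vdots & \vdots & & \vdots \\ \underline{x}_1 & \underline{x}_2 & \dots & \underline{x}_n \\ \vdots & \vdots & & \vdots \end{bmatrix} = \begin{bmatrix} \vdots & \vdots & & \vdots \\ \lambda_1\underline{x}_1 & \lambda_2\underline{x}_2 & \dots & \lambda_n\underline{x}_n \\ \vdots & \vdots & & \vdots \end{bmatrix} = \begin{bmatrix} \vdots & \vdots & & \vdots \\ \underline{x}_1 & \underline{x}_2 & \dots & \underline{x}_n \\ \vdots & \vdots & & \vdots \end{bmatrix}\begin{bmatrix} \lambda_1 & 0 & 0 & 0 \\ 0 & \lambda_2 & 0 & 0 \\ \vdots & \vdots & \dots & \vdots \\ 0 & 0 & 0 & \lambda_n \end{bmatrix}$$
$$AS = S\Lambda$$
$$AS = S\Lambda \quad \therefore S^{-1}AS = \Lambda$$
$$A = S\Lambda S^{-1}$$
$$A^{2}\underline{x} = \lambda^{2}\underline{x}$$
$$A^{2} = S\Lambda S^{-1}S\Lambda S^{-1} = S\Lambda^{2}S^{-1}$$
$$A^{k} \to 0 \text{ as } k \to \infty \text{ if all } |\lambda_i| < 1$$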
What makes a matrix diagonalizable
In discussing diagonalization we are concerned with finding a basis for ℝⁿ that consists of eigenvectors of a given square matrix of size n
These bases can tell us about geometric properties of A and can simplify numerical computations involving A
We need to answer two questions (which are actually the same)
Given a square matrix of size n, is there a basis for ℝⁿ consisting of eigenvectors?
Given a square matrix of size n, is there an invertible matrix S, such that S⁻¹AS is a diagonal matrix? (It is the same matrix S referred to above)
If such a matrix S exists, it is said to diagonalize A (and we will call the resultant diagonal matrix D)
In short the answer to the above question(s) is yes if A has n independent eigenvectors
This happens if all the λ's are different (none are repeated) (it is not totally excluded if they are repeated though)
If they are repeated, we still might have independent eigenvectors, e.g. an identity matrix of any size (because it is already diagonal)
In [10]: A = eye(5)
         A
In [11]: A.eigenvals()
In [12]: A.eigenvects()
Here we look at a triangular matrix, though
In [13]: A = Matrix([[2, 1], [0, 2]])
         A.eigenvals()
In [14]: A.eigenvects()
We can use python™ code to calculate the diagonalized matrix
In [15]: A = Matrix([[3, -2, 4, -2], [5, 3, -3, -2], [5, -2, 2, -2], [5, -2, -3, 3]])
         A
In [16]: S, D = A.diagonalize()
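Out[10]: $$\begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}$$
Out[11]: $$\{1 : 5\}$$
Out[12]: $$\left[\left(1,\ 5,\ \left[\begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 0 \end{bmatrix},\ \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}\right]\right)\right]$$
Out[13]: $$\{2 : 2\}$$
Out[14]: $$\left[\left(2,\ 2,\ \left[\begin{bmatrix} 1 \\ 0 \end{bmatrix}\right]\right)\right]$$
Out[15]: $$\begin{bmatrix} 3 & -2 & 4 & -2 \\ 5 & 3 & -3 & -2 \\ 5 & -2 & 2 & -2 \\ 5 & -2 & -3 & 3 \end{bmatrix}$$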
In [17]: S # S, such that A = S times D times the inverse of S
In [18]: D # The diagonal
In [19]: S * D * S.inv() == A # Checking to see if our statement above is correct
In [20]: S.inv() * A * S == D # Checking to see if our statement above is correct
In [21]: A.eigenvals()
Remember Λ from above?
The eigenvalues are precisely the entries along the main diagonal of the diagonal matrix
Producing the required diagonal matrix manually requires computing n linearly independent eigenvectors for a matrix A of size n (assuming that it is diagonalizable), creating a matrix with its columns equal to these eigenvectors (called matrix S) and computing S⁻¹AS to obtain the diagonal matrix D (or Λ)
Back to the topic of what makes a matrix diagonalizable
Suppose we have an equation that starts with some vector and every subsequent vector is a matrix A times the previous vector
From this arises the following
To really solve this problem, rewrite u₀ as follows (a sum of scalars times eigenvectors)
Where the Sc is a linear combination of the individual eigenvectors
Now multiply both sides by A
Taking a power of A now (i.e. k) would be akin to taking each eigenvalue to that power
This can be written as
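Out[17]: $$\begin{bmatrix} 0 & 1 & 1 & 0 \\ 1 & 1 & 1 & -1 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 0 & 1 \end{bmatrix}$$
Out[18]: $$\begin{bmatrix} -2 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 5 & 0 \\ 0 & 0 & 0 & 5 \end{bmatrix}$$
Out[19]: True
Out[20]: True
Out[21]: $$\{-2 : 1,\ 3 : 1,\ 5 : 2\}$$
$$\underline{u}_{k+1} = A\underline{u}_k$$
$$\underline{u}_1 = A\underline{u}_0$$
$$\underline{u}_2 = A(A\underline{u}_0) = A^{2}\underline{u}_0$$
$$\underline{u}_k = A^{k}\underline{u}_0$$
$$\underline{u}_0 = c_1\underline{x}_1 + c_2\underline{x}_2 + \dots + c_n\underline{x}_n = S\underline{c}$$
$$A\underline{u}_0 = c_1A\underline{x}_1 + c_2A\underline{x}_2 + \dots + c_nA\underline{x}_n$$
$$A\underline{u}_0 = c_1\lambda_1\underline{x}_1 + c_2\lambda_2\underline{x}_2 + \dots + c_n\lambda_n\underline{x}_n$$
$$A^{k}\underline{u}_0 = c_1\lambda_1^{k}\underline{x}_1 + c_2\lambda_2^{k}\underline{x}_2 + \dots + c_n\lambda_n^{k}\underline{x}_n$$
$$\underline{u}_k = A^{k}\underline{u}_0 = S\Lambda^{k}\underline{c}$$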
As an example consider the Fibonacci numbers: 0, 1, 1, 2, 3, 5, 8, 13, ...
What would the 100th number be?
Consider the following
This is a (second-order) difference equation; think of this example as similar to a second-order differential equation (without derivatives)
By adding a second equation Fₖ₊₁=Fₖ₊₁, consider uₖ to be the following vector
This means the following
In [22]: A = Matrix([[1, 1], [1, 0]])
         A
In [23]: A.eigenvals()
In [24]: A.eigenvects()
In [25]: S, D = A.diagonalize()
In [26]: D
From above we remember the following
Here u₀ contains the first two values
In [27]: u_zero = Matrix([1, 0])
         u_100 = A ** 100 * u_zero
         u_100 # u_100 holds (F_101, F_100); counting from F_0 = 0, the bottom value is F_100
In [28]: u_four = A ** 4 * u_zero
         u_four # u_4 holds (F_5, F_4) = (5, 3) when counting from F_0 = 0
Example problems
Example problem 1
Find an equation for Cᵏ where C is given by the following matrix
Calculate C¹⁰⁰ when a=b=-1
Solution
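$$F_{k+2} = F_{k+1} + F_k$$
$$\underline{u}_k = \begin{bmatrix} F_{k+1} \\ F_k \end{bmatrix}$$
$$\underline{u}_{k+1} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} F_{k+1} \\ F_k \end{bmatrix} = \begin{bmatrix} F_{k+1} + F_k \\ F_{k+1} \end{bmatrix}$$
$$\underline{u}_{k+1} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}\underline{u}_k$$
Out[22]: $$\begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}$$
Out[23]: $$\left\{\frac{1}{2} + \frac{\sqrt{5}}{2} : 1,\ -\frac{\sqrt{5}}{2} + \frac{1}{2} : 1\right\}$$
Out[24]: $$\left[\left(\frac{1}{2} + \frac{\sqrt{5}}{2},\ 1,\ \left[\begin{bmatrix} -\frac{1}{-\frac{\sqrt{5}}{2} + \frac{1}{2}} \\ 1 \end{bmatrix}\right]\right),\ \left(-\frac{\sqrt{5}}{2} + \frac{1}{2},\ 1,\ \left[\begin{bmatrix} -\frac{1}{\frac{1}{2} + \frac{\sqrt{5}}{2}} \\ 1 \end{bmatrix}\right]\right)\right]$$
Out[26]: $$\begin{bmatrix} \frac{1}{2} + \frac{\sqrt{5}}{2} & 0 \\ 0 & -\frac{\sqrt{5}}{2} + \frac{1}{2} \end{bmatrix}$$
$$\underline{u}_k = A^{k}\underline{u}_0 = S\Lambda^{k}\underline{c}$$
Out[27]: $$\begin{bmatrix} 573147844013817084101 \\ 354224848179261915075 \end{bmatrix}$$
Out[28]: $$\begin{bmatrix} 5 \\ 3 \end{bmatrix}$$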
In [29]: a, b, k = symbols('a b k')
In [30]: C = Matrix([[2 * b - a, a - b], [2 * b - 2 * a, 2 * a - b]])
         C
We remember the following
Where Λ is denoted by the computer variable D
In [31]: S, D = C.diagonalize()
In [32]: S
Python™ is not always good at simplifying these
If you look at it carefully (taking the square root of (a-b)² to be a-b) you will note the following
In [33]: S = Matrix([[1, Rational(1, 2)], [1, 1]])
         S
In [34]: D
In [35]: D = Matrix([[b, 0], [0, a]])D
For the values given, we have the following
In [36]: C = Matrix([[-1, 0], [0, -1]])
         C
In [37]: S, D = C.diagonalize()
In [38]: D
In [39]: D ** 100
In [40]: S * (D ** 100) * S.inv()
Doing the same, but with eigenvalues and eigenvectors
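Out[30]: $$\begin{bmatrix} -a + 2b & a - b \\ -2a + 2b & 2a - b \end{bmatrix}$$
$$A^{k} = S\Lambda^{k}S^{-1}$$
Out[32]: $$\begin{bmatrix} -\frac{2a - 2b}{-3a + 3b + \sqrt{(a - b)^{2}}} & \frac{2a - 2b}{3a - 3b + \sqrt{(a - b)^{2}}} \\ 1 & 1 \end{bmatrix}$$
Out[33]: $$\begin{bmatrix} 1 & \frac{1}{2} \\ 1 & 1 \end{bmatrix}$$
Out[34]: $$\begin{bmatrix} \frac{a}{2} + \frac{b}{2} - \frac{1}{2}\sqrt{(a - b)^{2}} & 0 \\ 0 & \frac{a}{2} + \frac{b}{2} + \frac{1}{2}\sqrt{(a - b)^{2}} \end{bmatrix}$$
Out[35]: $$\begin{bmatrix} b & 0 \\ 0 & a \end{bmatrix}$$
Out[36]: $$\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$$
Out[38]: $$\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}$$
Out[39]: $$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
Out[40]: $$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$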
In [41]: C = Matrix([[2 * b - a, a - b], [2 * b - 2 * a, 2 * a - b]])
         C
In [42]: C.eigenvals()
This simplifies to λ₁ = b and λ₂ = a
That makes Λ (or D) the following
In [43]: D = Matrix([[b, 0], [0, a]])
         D
In [44]: C.eigenvects() # The solution is two tuples, each holding eigenvalue, multiplicity, eigenvector
This simplifies to the following eigenvector matrix S
In [45]: S = Matrix([[1, Rational(1, 2)], [1, 1]])
         S
We can see if we can get back to C
In [46]: S * D * S.inv()
In [47]: S * D * S.inv() == C
Python™ won't compute Dᵏ for you, but it's easy to do yourself
In [48]: D = Matrix([[b ** k, 0], [0, a ** k]])
         D
Now we can compute SΛᵏS⁻¹
In [49]: S * D * S.inv()
Placing the given values into this equation will give you the same solution for C¹⁰⁰ as above
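As a sanity check, the symbolic formula can be compared against direct matrix powers. This sketch reuses C, S and D as defined in this solution; the test values a = 2, b = 3, k = 5 are arbitrary choices made here, alongside the a = b = -1, k = 100 case from the problem.

```python
from sympy import Matrix, Rational, symbols, eye

a, b, k = symbols('a b k')

C = Matrix([[2 * b - a, a - b], [2 * b - 2 * a, 2 * a - b]])
S = Matrix([[1, Rational(1, 2)], [1, 1]])
D_k = Matrix([[b ** k, 0], [0, a ** k]])

# Symbolic expression for the k-th power of C
C_k = S * D_k * S.inv()

# Arbitrary numeric check: a = 2, b = 3, k = 5
numeric = {a: 2, b: 3}
print(C.subs(numeric) ** 5 == C_k.subs({a: 2, b: 3, k: 5}))

# The case asked for in the problem: a = b = -1, k = 100
C_100 = C_k.subs({a: -1, b: -1, k: 100})
print(C_100 == eye(2))
```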
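Out[41]: $$\begin{bmatrix} -a + 2b & a - b \\ -2a + 2b & 2a - b \end{bmatrix}$$
Out[42]: $$\left\{\frac{a}{2} + \frac{b}{2} - \frac{1}{2}\sqrt{(a - b)^{2}} : 1,\ \frac{a}{2} + \frac{b}{2} + \frac{1}{2}\sqrt{(a - b)^{2}} : 1\right\}$$
Out[43]: $$\begin{bmatrix} b & 0 \\ 0 & a \end{bmatrix}$$
Out[44]: $$\left[\left(\frac{a}{2} + \frac{b}{2} - \frac{1}{2}\sqrt{(a - b)^{2}},\ 1,\ \left[\begin{bmatrix} -\frac{a - b}{-\frac{3a}{2} + \frac{3b}{2} + \frac{1}{2}\sqrt{(a - b)^{2}}} \\ 1 \end{bmatrix}\right]\right),\ \left(\frac{a}{2} + \frac{b}{2} + \frac{1}{2}\sqrt{(a - b)^{2}},\ 1,\ \left[\begin{bmatrix} -\frac{a - b}{-\frac{3a}{2} + \frac{3b}{2} - \frac{1}{2}\sqrt{(a - b)^{2}}} \\ 1 \end{bmatrix}\right]\right)\right]$$
Out[45]: $$\begin{bmatrix} 1 & \frac{1}{2} \\ 1 & 1 \end{bmatrix}$$
Out[46]: $$\begin{bmatrix} -a + 2b & a - b \\ -2a + 2b & 2a - b \end{bmatrix}$$
Out[47]: True
Out[48]: $$\begin{bmatrix} b^{k} & 0 \\ 0 & a^{k} \end{bmatrix}$$
Out[49]: $$\begin{bmatrix} -a^{k} + 2b^{k} & a^{k} - b^{k} \\ -2a^{k} + 2b^{k} & 2a^{k} - b^{k} \end{bmatrix}$$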
2015/03/29, 7:41 PMII_10_Differential_equations_Exponential_of_a_matrix
Page 1 of 7http://localhost:8888/nbconvert/html/II_10_Differential_equations_Exponential_of_a_matrix.ipynb?download=false
This notebook is part of lecture 23 Differential equations and exponent A in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper
Head of Acute Care SurgeryGroote Schuur HospitalUniversity Cape TownEmail me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, Matrix, symbols, exp
        from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
        filterwarnings('ignore')
In [4]: u, u1, u2, t, a, b, c = symbols('u u1 u2 t a b c')
Differential equations
Exponential e^(At) of a matrix
Differential equations (ordinary only)
A differential equation moves on from the previous lecture's difference equation, which had finite steps, to continuously changing systems (here in the time parameter t)
A differential equation includes a function and its derivative(s)
It has an order based on the highest derivative that appears
Here we are only concerned with differential equations with constant coefficients
The simplest differential equation is the following
It is simply solved in the following manner (which gives us some insight into the general solution for these equations)
We can solve for the constant(s) if we have values for the initial condition (usually t=0) for y(t) and all its derivatives (called initial value problems)
We can write a system of differential equations in matrix form
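$$\frac{dy}{dt} = ay(t)$$
$$\frac{dy}{dt} = ay$$
$$\frac{1}{y}dy = a\,dt$$
$$\int \frac{1}{y}dy = a\int dt$$
$$\ln|y| = a(t + c_1)$$
$$e^{\ln|y|} = e^{at + ac_1}$$
$$y = e^{at}e^{ac_1}$$
$$y = ce^{at}$$
$$y_1' = 3y_1$$
$$y_2' = -2y_2$$
$$y_3' = 6y_3$$
$$\therefore \begin{bmatrix} y_1' \\ y_2' \\ y_3' \end{bmatrix} = \begin{bmatrix} 3 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 6 \end{bmatrix}\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}$$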
Rewriting the above, we consider the following differential equations in this lecture
So suppose we have these two differential equations
The initial conditions are given by the following
We can write it as Au
In [5]: A = Matrix([[-1, 2], [1, -2]])
        A
In [6]: u_vect = Matrix([u1, u2]) # u is now a sympy mathematical symbol
        # Have to use another variable name, i.e. u_vect
        u_vect
Multiplying this Au brings you back to the two linear equations
In [7]: A * u_vect
In [8]: A.eigenvals()
In [9]: A.eigenvects() # The results give the two eigenvectors in the following format
        # (eigenvalue, multiplicity, eigenvector)
In [10]: S, D = A.diagonalize() # For interest sake we get the matrix of eigenvectors and the diagonal matrix
         S, D
To complete the solution now, we note that there are two eigenvalues, which will give us the following
Two constants
Two exponentials, each e to the power of an eigenvalue times t
Two eigenvectors
It is written like this, with xᵢ denoting an eigenvector
This makes our solution as follows
There is clearly a constant term and a term that approaches zero as t approaches infinity
Writing this as two separate equations we have the following
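$$\frac{d\underline{u}}{dt} = A\underline{u}$$
$$\frac{du_1}{dt} = -u_1 + 2u_2$$
$$\frac{du_2}{dt} = u_1 - 2u_2$$
$$\underline{u}(0) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$
Out[5]: $$\begin{bmatrix} -1 & 2 \\ 1 & -2 \end{bmatrix}$$
Out[6]: $$\begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$$
Out[7]: $$\begin{bmatrix} -u_1 + 2u_2 \\ u_1 - 2u_2 \end{bmatrix}$$
Out[8]: $$\{-3 : 1,\ 0 : 1\}$$
Out[9]: $$\left[\left(-3,\ 1,\ \left[\begin{bmatrix} -1 \\ 1 \end{bmatrix}\right]\right),\ \left(0,\ 1,\ \left[\begin{bmatrix} 2 \\ 1 \end{bmatrix}\right]\right)\right]$$
Out[10]: $$\left(\begin{bmatrix} -1 & 2 \\ 1 & 1 \end{bmatrix},\ \begin{bmatrix} -3 & 0 \\ 0 & 0 \end{bmatrix}\right)$$
$$\underline{u}(t) = c_1e^{\lambda_1 t}\underline{x}_1 + c_2e^{\lambda_2 t}\underline{x}_2$$
$$\underline{u}(t) = c_1e^{-3t}\begin{bmatrix} -1 \\ 1 \end{bmatrix} + c_2e^{(0)t}\begin{bmatrix} 2 \\ 1 \end{bmatrix}$$
$$\underline{u}(t) = c_1e^{-3t}\begin{bmatrix} -1 \\ 1 \end{bmatrix} + c_2\begin{bmatrix} 2 \\ 1 \end{bmatrix}$$
$$u_1(t) = -c_1e^{-3t} + 2c_2$$
$$u_2(t) = c_1e^{-3t} + c_2$$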
Using the initial conditions we can solve for the constants cᵢ
Let's use an augmented matrix and Gauss-Jordan elimination to solve for the two constants (or at least the python™ equivalent)
In [11]: C = Matrix([[-1, 2, 1], [1, 1, 0]])
         C.rref()
Which gives us the final solution
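The final solution can be verified by substituting it back into the system. The sketch below writes out u₁(t) and u₂(t) with c₁ = -1/3 and c₂ = 1/3 (the values read off the row-reduced matrix) and checks both the differential equation and the initial condition; the name residual is introduced here for illustration.

```python
from sympy import Matrix, Rational, exp, simplify, symbols

t = symbols('t')
A = Matrix([[-1, 2], [1, -2]])

# Solution with c1 = -1/3 and c2 = 1/3
u_t = Matrix([Rational(1, 3) * exp(-3 * t) + Rational(2, 3),
              -Rational(1, 3) * exp(-3 * t) + Rational(1, 3)])

# du/dt - A u should vanish identically
residual = (u_t.diff(t) - A * u_t).applyfunc(simplify)
print(residual)

# The initial condition u(0) = (1, 0)
print(u_t.subs(t, 0))
```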
Just to remind ourselves about the previous lecture where we had difference equations and had something like the following
Which is for finite steps, i.e. stepping by one
What can the eigenvalues tell us about these equations as t approaches ∞
If both eigenvalues (real parts) are negative, the solution u(t) approaches 0 (called stability)
If one eigenvalue (real part) is zero and the others (real parts) are less than zero, the solution approaches a specific value (called a steady state)
If any eigenvalue (real part) is larger than zero, the solution approaches ±∞
What can the 2×2 matrix A tell us about the eigenvalues (and then what happens when t approaches ∞)
The trace is equal to the sum of the eigenvalues
The determinant is the product of the eigenvalues
If the trace is negative and the determinant positive, we will have stability
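The rules above can be turned into a rough classifier. This is only a sketch: classify is a hypothetical helper written here, it looks only at the signs of the real parts of the eigenvalues, and it lumps every case with a zero real part and no positive real part under one label.

```python
from sympy import Matrix, re

def classify(matrix):
    """Label the long-term behaviour of du/dt = A u from the eigenvalues."""
    real_parts = []
    for value, mult in matrix.eigenvals().items():
        real_parts.extend([re(value)] * mult)
    if all(part < 0 for part in real_parts):
        return 'stable'
    if any(part > 0 for part in real_parts):
        return 'grows without bound'
    return 'steady state (or neutral)'

A = Matrix([[-1, 2], [1, -2]])      # eigenvalues -3 and 0
B = Matrix([[-2, 0], [0, -1]])      # trace negative, determinant positive
P = Matrix([[0, 1], [1, 0]])        # eigenvalues 1 and -1

print(classify(A), classify(B), classify(P))
```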
Using diagonalization
Consider the following derivation
From this we have the following
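$$u_1(0) = -c_1e^{-3(0)} + 2c_2 = 1$$
$$u_2(0) = c_1e^{-3(0)} + c_2 = 0$$
$$-c_1 + 2c_2 = 1$$
$$c_1 + c_2 = 0$$
Out[11]: $$\left(\begin{bmatrix} 1 & 0 & -\frac{1}{3} \\ 0 & 1 & \frac{1}{3} \end{bmatrix},\ [0,\ 1]\right)$$
$$u_1(t) = \frac{1}{3}e^{-3t} + \frac{2}{3}$$
$$u_2(t) = -\frac{1}{3}e^{-3t} + \frac{1}{3}$$
$$\underline{u}_k = c_1\lambda_1^{k}\underline{x}_1 + c_2\lambda_2^{k}\underline{x}_2$$
$$\underline{u}_{k+1} = A\underline{u}_k$$
$$\frac{d\underline{u}}{dt} = A\underline{u}$$
$$\because \underline{u} = S\underline{v}$$
$$S\frac{d\underline{v}}{dt} = AS\underline{v}$$
$$\frac{d\underline{v}}{dt} = S^{-1}AS\underline{v} = \Lambda\underline{v}$$
$$\underline{v}(t) = e^{\Lambda t}\underline{v}(0)$$
$$\underline{u}(t) = Se^{\Lambda t}S^{-1}\underline{u}(0)$$
$$e^{At} = Se^{\Lambda t}S^{-1}$$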
Matrix exponential e^(At)
How do we calculate e raised to the power of a matrix?
Consider the Taylor series expansion
This comes from the following
As the denominator increases the nth term approaches 0
Remember also this (geometric) series (just for fun)
This will blow up unless the eigenvalues of At are less than 1 in absolute value
Now, let's calculate e^(At), remembering the following
We thus have the following
This is with the assumption that A can be diagonalized (otherwise we will have to use the infinite series above)
Remember that Λ is a diagonal matrix and therefore we would have the following
The S and S⁻¹ matrices do not depend on t; it is therefore the Λ matrix that provides an approach to zero as t approaches ∞
This is achieved by every λᵢ having a real part less than zero
The powers of A go to zero if the absolute value of every λᵢ is less than 1
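The formula for the matrix exponential can be tried on the 2×2 system from earlier in this lecture. The sketch below builds e^(Λt) from the diagonal of D and then checks two defining properties of e^(At): it equals I at t = 0 and its derivative is A times itself. The names exp_Lambda_t and exp_At are introduced here for illustration.

```python
from sympy import Matrix, exp, eye, simplify, symbols, zeros

t = symbols('t')
A = Matrix([[-1, 2], [1, -2]])

# Eigenvector matrix S and diagonal eigenvalue matrix D
S, D = A.diagonalize()

# e^(Lambda t): exponentiate each diagonal entry times t
exp_Lambda_t = Matrix.diag(*[exp(D[i, i] * t) for i in range(D.rows)])

# e^(At) = S e^(Lambda t) S^(-1)
exp_At = (S * exp_Lambda_t * S.inv()).applyfunc(simplify)
print(exp_At)

# Checks: e^(A 0) = I and d/dt e^(At) = A e^(At)
print(exp_At.subs(t, 0) == eye(2))
print((exp_At.diff(t) - A * exp_At).applyfunc(simplify) == zeros(2, 2))
```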
Let's consider this example
We have to create a system of two first-order equations
Example problems
Example problem 1
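$$e^{At} = I + At + \frac{(At)^{2}}{2!} + \frac{(At)^{3}}{3!} + \dots + \frac{(At)^{n}}{n!} + \dots$$
$$e^{x} = \sum_{n=0}^{\infty}\frac{x^{n}}{n!}$$
$$\frac{1}{1 - x} = \sum_{n=0}^{\infty}x^{n}$$
$$(I - At)^{-1} = I + At + (At)^{2} + \dots + (At)^{n} + \dots$$
$$A^{k} = S\Lambda^{k}S^{-1}$$
$$e^{At} = \sum_{n=0}^{\infty}\frac{(S\Lambda S^{-1})^{n}t^{n}}{n!}$$
$$e^{At} = Se^{\Lambda t}S^{-1}$$
$$e^{At} = I + At + \frac{(At)^{2}}{2!} + \frac{(At)^{3}}{3!} + \dots + \frac{(At)^{n}}{n!} + \dots$$
$$e^{\Lambda t} = \begin{bmatrix} e^{\lambda_1 t} & 0 & 0 & 0 \\ 0 & e^{\lambda_2 t} & 0 & 0 \\ \vdots & \vdots & \dots & \vdots \\ 0 & 0 & 0 & e^{\lambda_n t} \end{bmatrix}$$
$$y'' + by' + ky = 0$$
$$\underline{u} = \begin{bmatrix} y' \\ y \end{bmatrix}$$
$$\underline{u}' = \begin{bmatrix} y'' \\ y' \end{bmatrix} = \begin{bmatrix} -b & -k \\ 1 & 0 \end{bmatrix}\begin{bmatrix} y' \\ y \end{bmatrix}$$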
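Find the general solution, the matrix A, and the first column of e^{At} for the following third-order, ordinary, homogeneous differential equation with constant coefficients
$$ \frac{d^3 y}{dt^3} + 2 \frac{d^2 y}{dt^2} - \frac{dy}{dt} - 2y = 0 $$
Solution
Since the differential equation is third order, we need to create three first-order differential equations
$$ \frac{d^3 y}{dt^3} + 2 \frac{d^2 y}{dt^2} - \frac{dy}{dt} - 2y = 0 $$
$$ \therefore y''' = -2y'' + y' + 2y $$
$$ \underline{u} = \begin{bmatrix} y'' \\ y' \\ y \end{bmatrix} $$
$$ \underline{u}' = \begin{bmatrix} y''' \\ y'' \\ y' \end{bmatrix} $$
$$ \underline{u}' = A \underline{u} = \begin{bmatrix} -2 & 1 & 2 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} y'' \\ y' \\ y \end{bmatrix} $$
Notice that when we do the last matrix multiplication, we get exactly what we need
$$ \underline{u}' = \begin{bmatrix} y''' \\ y'' \\ y' \end{bmatrix} = \begin{bmatrix} -2y'' + y' + 2y \\ y'' \\ y' \end{bmatrix} $$
We now have the matrix A and we can calculate e^{At} if A is diagonalizable
In [12]: A = Matrix([[-2, 1, 2], [1, 0, 0], [0, 1, 0]])
A
Out[12]: $$ \begin{bmatrix} -2 & 1 & 2 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix} $$
In [13]: S, D = A.diagonalize()
S, D
Out[13]: $$ \left( \begin{bmatrix} 4 & 1 & 1 \\ -2 & -1 & 1 \\ 1 & 1 & 1 \end{bmatrix}, \; \begin{bmatrix} -2 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right) $$
We can calculate e^{At} from the following
$$ e^{At} = S e^{\Lambda t} S^{-1} $$
Remember that Λ is a diagonal matrix and therefore we have the following
$$ e^{\Lambda t} = \begin{bmatrix} e^{\lambda_1 t} & 0 & \cdots & 0 \\ 0 & e^{\lambda_2 t} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & e^{\lambda_n t} \end{bmatrix} $$
In [14]: A.eigenvals()
Out[14]: $$ \{ -2 : 1, \; -1 : 1, \; 1 : 1 \} $$
This gives us the three eigenvalues from which we can create the diagonal matrix e^{Λt}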
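In [15]: e_Lamda_t = Matrix([[exp(-2 * t), 0, 0], [0, exp(-t), 0], [0, 0, exp(t)]])
e_Lamda_t
Out[15]: $$ \begin{bmatrix} e^{-2t} & 0 & 0 \\ 0 & e^{-t} & 0 \\ 0 & 0 & e^{t} \end{bmatrix} $$
We have to multiply the following three matrices (with which Python is not comfortable, so I'll do it in two steps)
In [16]: S, e_Lamda_t, S.inv()
Out[16]: $$ \left( \begin{bmatrix} 4 & 1 & 1 \\ -2 & -1 & 1 \\ 1 & 1 & 1 \end{bmatrix}, \; \begin{bmatrix} e^{-2t} & 0 & 0 \\ 0 & e^{-t} & 0 \\ 0 & 0 & e^{t} \end{bmatrix}, \; \begin{bmatrix} \frac{1}{3} & 0 & -\frac{1}{3} \\ -\frac{1}{2} & -\frac{1}{2} & 1 \\ \frac{1}{6} & \frac{1}{2} & \frac{1}{3} \end{bmatrix} \right) $$
In [17]: first_part = S * e_Lamda_t
first_part
Out[17]: $$ \begin{bmatrix} 4e^{-2t} & e^{-t} & e^{t} \\ -2e^{-2t} & -e^{-t} & e^{t} \\ e^{-2t} & e^{-t} & e^{t} \end{bmatrix} $$
In [18]: first_part * S.inv()
Out[18]: $$ \begin{bmatrix} \frac{e^{t}}{6} - \frac{e^{-t}}{2} + \frac{4}{3} e^{-2t} & \frac{e^{t}}{2} - \frac{e^{-t}}{2} & \frac{e^{t}}{3} + e^{-t} - \frac{4}{3} e^{-2t} \\ \frac{e^{t}}{6} + \frac{e^{-t}}{2} - \frac{2}{3} e^{-2t} & \frac{e^{t}}{2} + \frac{e^{-t}}{2} & \frac{e^{t}}{3} - e^{-t} + \frac{2}{3} e^{-2t} \\ \frac{e^{t}}{6} - \frac{e^{-t}}{2} + \frac{1}{3} e^{-2t} & \frac{e^{t}}{2} - \frac{e^{-t}}{2} & \frac{e^{t}}{3} + e^{-t} - \frac{1}{3} e^{-2t} \end{bmatrix} $$
We still need to write our general solution
$$ \underline{u}(t) = c_1 e^{\lambda_1 t} \underline{x}_1 + c_2 e^{\lambda_2 t} \underline{x}_2 + c_3 e^{\lambda_3 t} \underline{x}_3 $$
In [19]: A.eigenvects()
Out[19]: $$ \left[ \left( -2, \; 1, \; \left[ \begin{bmatrix} 4 \\ -2 \\ 1 \end{bmatrix} \right] \right), \; \left( -1, \; 1, \; \left[ \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} \right] \right), \; \left( 1, \; 1, \; \left[ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right] \right) \right] $$
With these eigenvalues and eigenvectors we get
$$ \underline{u}(t) = \begin{bmatrix} y'' \\ y' \\ y \end{bmatrix} = c_1 e^{-2t} \begin{bmatrix} 4 \\ -2 \\ 1 \end{bmatrix} + c_2 e^{-t} \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} + c_3 e^{t} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} $$
$$ \therefore y(t) = c_1 e^{-2t} + c_2 e^{-t} + c_3 e^{t} $$
Example problem 2
Solve the following second-order ordinary differential equation
$$ y'' - y' - 6y = 0 $$
Solution
We need to create two first-order equations
$$ \therefore y'' = y' + 6y $$
$$ \because \underline{u}(t) = \begin{bmatrix} y' \\ y \end{bmatrix} $$
$$ \underline{u}'(t) = \begin{bmatrix} y'' \\ y' \end{bmatrix} $$
$$ \therefore \underline{u}'(t) = \begin{bmatrix} 1 & 6 \\ 1 & 0 \end{bmatrix} \underline{u}(t) $$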
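In [20]: A = Matrix([[1, 6], [1, 0]])
A
Out[20]: $$ \begin{bmatrix} 1 & 6 \\ 1 & 0 \end{bmatrix} $$
In [21]: A.eigenvects()
Out[21]: $$ \left[ \left( -2, \; 1, \; \left[ \begin{bmatrix} -2 \\ 1 \end{bmatrix} \right] \right), \; \left( 3, \; 1, \; \left[ \begin{bmatrix} 3 \\ 1 \end{bmatrix} \right] \right) \right] $$
The solution is thus as follows
$$ y(t) = c_1 e^{-2t} + c_2 e^{3t} $$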
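Both example problems follow the same recipe: solve for the highest derivative, place its coefficients in the first row of A, and fill the rows below with a shifted identity. As a hedged sketch, the helper companion_matrix below is a hypothetical name introduced only for this illustration (it is not part of sympy or of the notebook). The block also checks that the stated solution of example problem 2 satisfies the differential equation.

```python
from sympy import Matrix, exp, simplify, symbols, zeros

t, c1, c2 = symbols('t c1 c2')


def companion_matrix(coeffs):
    """Hypothetical helper: matrix A of u' = A u for
    y^(n) = coeffs[0] * y^(n-1) + ... + coeffs[n-1] * y,
    with u = (y^(n-1), ..., y', y)."""
    n = len(coeffs)
    A = zeros(n, n)
    for j, c in enumerate(coeffs):
        A[0, j] = c          # first row holds the coefficients
    for i in range(1, n):
        A[i, i - 1] = 1      # each remaining row copies a derivative down
    return A


# Example problem 2: y'' = y' + 6y
A2 = companion_matrix([1, 6])
print(A2, A2.eigenvals())

# Example problem 1: y''' = -2y'' + y' + 2y
A1 = companion_matrix([-2, 1, 2])
print(A1, A1.eigenvals())

# The general solution of example problem 2 should give a zero residual
y = c1 * exp(-2 * t) + c2 * exp(3 * t)
residual = simplify(y.diff(t, 2) - y.diff(t) - 6 * y)
print(residual)
```

The eigenvalues of each matrix are the exponents that appear in the corresponding general solution.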
2015/03/29, 7:46 PMII_11_Markov_matrices_Projections_and_Fourier_series
Page 1 of 6http://localhost:8888/nbconvert/html/II_11_Markov_matrices_Projections_and_Fourier_series.ipynb?download=false
This notebook is part of lecture 24 Markov matrices and Fourier series in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper
Head of Acute Care SurgeryGroote Schuur HospitalUniversity Cape TownEmail me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
css_file = 'style.css'
HTML(open(css_file, 'r').read())
Out[1]:
In [2]: from IPython.core.display import HTML, Image
css_file = 'style.css'
HTML(open(css_file, 'r').read())
Out[2]:
In [3]: from sympy import init_printing, Matrix, symbols, eye
from warnings import filterwarnings
In [4]: init_printing(use_latex = 'mathjax')
filterwarnings('ignore')
Markov matrices and steady state
Projections and Fourier series
Markov matrix
Consider the following Markov matrix (transition matrix)
In [5]: A = Matrix([[0.1, 0.01, 0.3], [0.2, 0.99, 0.3], [0.7, 0, 0.4]])
A
Out[5]: $$ \begin{bmatrix} 0.1 & 0.01 & 0.3 \\ 0.2 & 0.99 & 0.3 \\ 0.7 & 0 & 0.4 \end{bmatrix} $$
All the column entries add to 1 (also true for powers of the matrix)
All entries ≥ 0 (also true for powers of the matrix)
There will be an eigenvalue of 1
All other eigenvalues will have an absolute value of at most 1
For difference equations we will have the following
$$ u_k = A^k u_0 = c_1 \lambda_1^k x_1 + c_2 \lambda_2^k x_2 + \ldots $$
Thus for the eigenvalues with an absolute value less than 1, successive powers lead to those terms approaching zero, and a steady state is reached by the term with the eigenvalue of 1
In [6]: A.eigenvals()
Out[6]: $$ \left\{ 1 : 1, \; \frac{49}{200} + \frac{3 \sqrt{1009}}{200} : 1, \; -\frac{3 \sqrt{1009}}{200} + \frac{49}{200} : 1 \right\} $$
All the components of the eigenvector of the eigenvalue 1 are positive
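The properties listed above can be checked without rounding error. The following sketch is an illustration rather than one of the notebook's cells: it assumes the same matrix A, but enters it with sympy's Rational so that every value is exact. It verifies the column sums, confirms that A - I is singular (so 1 is an eigenvalue), and extracts the eigenvector for the eigenvalue 1.

```python
from sympy import Matrix, Rational, eye, ones

# The same Markov matrix, entered with exact fractions instead of floats
A = Matrix([[Rational(1, 10), Rational(1, 100), Rational(3, 10)],
            [Rational(2, 10), Rational(99, 100), Rational(3, 10)],
            [Rational(7, 10), 0, Rational(4, 10)]])

# A row of ones times A gives the sum of each column
column_sums = ones(1, 3) * A

# If 1 is an eigenvalue, A - I is singular and this determinant is exactly 0
det_A_minus_I = (A - eye(3)).det()

# The nullspace of A - I holds the eigenvector for the eigenvalue 1
steady = (A - eye(3)).nullspace()[0]
steady = steady / steady[2]  # scale so that the last component is 1

print(column_sums)
print(det_A_minus_I)
print(steady.T)
```

With exact entries the determinant is exactly zero and the nullspace is not empty, and every component of the eigenvector is positive.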
In [7]: A.eigenvects()
Out[7]: $$ \left[ \left( 1.0, \; 1, \; \left[ \begin{bmatrix} 0.857142857142857 \\ 47.1428571428571 \\ 1.0 \end{bmatrix} \right] \right), \; \left( 0.721471405228058, \; 1, \; \left[ \begin{bmatrix} 0.459244864611511 \\ -1.45924486461151 \\ 1.0 \end{bmatrix} \right] \right), \; \left( -0.231471405228058, \; 1, \; \left[ \begin{bmatrix} -0.902102007468654 \\ -0.0978979925313461 \\ \ldots \end{bmatrix} \right] \right) \right] $$
If the matrix A - 1·I is singular, then 1 is an eigenvalue
In [8]: A - eye(3) # A minus the identity matrix
Out[8]: $$ \begin{bmatrix} -0.9 & 0.01 & 0.3 \\ 0.2 & -0.01 & 0.3 \\ 0.7 & 0 & -0.6 \end{bmatrix} $$
In [9]: (A - eye(3)).det() # A computer peculiarity to indicate 0
Out[9]: $$ -3.03576608295941 \cdot 10^{-18} $$
The sum of the entries in each column is now zero
We would like a proof that uses this (the column entries sum to zero) as an assumption to give a singular matrix, without having to calculate the determinant
It is easy to see that the rows are linearly dependent (they add up to the zero row) and that gives proof of a singular matrix
Look also at the nullspace of A^T
In [10]: A_1t = A - eye(3)
(A_1t.transpose()).nullspace()
Out[10]: []
In [11]: (A.transpose()).nullspace()
Out[11]: []
In [12]: A.nullspace()
Out[12]: []
In [13]: (A - eye(3)).rref()
Out[13]: $$ \left( \begin{bmatrix} 1.0 & 0 & 0 \\ 0 & 1.0 & 0 \\ 0 & 0 & 1.0 \end{bmatrix}, \; [0, 1, 2] \right) $$
Because the entries are floating-point numbers, rounding makes A - I look invertible here, which is why the nullspace comes back empty and the reduced row-echelon form is the identity; in exact arithmetic the nullspace of (A - I)^T contains the vector (1, 1, 1)
Note how the eigenvalues of A and A^T are the same
In [14]: A.eigenvals() == A.transpose().eigenvals()
Out[14]: True
The proof of this lies in the fact that we calculate the eigenvalue(s) by the following equation
$$ \left| A - \lambda I \right| = 0 $$
The eigenvalue(s) of the transpose must solve this equation
$$ \left| A^T - \lambda I^T \right| = 0 $$
Since λI^T = λI and a matrix and its transpose have the same determinant, both equations have the same solutions and the eigenvalue(s) must be equal
Example of a Markov matrix
Consider the population of two (isolated) states (c and m)
Over time there is movement between the two (no loss or entry from outside the system)
We consider the following system of movement (difference equation)
$$ u_{k+1} = A u_k $$
We can create the following matrix equation
$$ \begin{bmatrix} u_c \\ u_m \end{bmatrix}_{k+1} = \begin{bmatrix} 0.9 & 0.2 \\ 0.1 & 0.8 \end{bmatrix} \begin{bmatrix} u_c \\ u_m \end{bmatrix}_k $$
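In [15]: A = Matrix([[0.9, 0.2], [0.1, 0.8]])
A
Out[15]: $$ \begin{bmatrix} 0.9 & 0.2 \\ 0.1 & 0.8 \end{bmatrix} $$
Let's start at time t=0 (or in our case k=0), with the whole population in state m
Because the system is closed, the total population stays at 1000
In [16]: u0 = Matrix([0, 1000])
After one time period, the state at k+1 will be given by the following
In [17]: u1 = A * u0
u1
Out[17]: $$ \begin{bmatrix} 200.0 \\ 800.0 \end{bmatrix} $$
In [18]: u2 = A * u1
u2
Out[18]: $$ \begin{bmatrix} 340.0 \\ 660.0 \end{bmatrix} $$
Where will we end up?
Let's look at the eigenvalues
One will be 1 and the other must be the trace minus 1, i.e. 1.7 - 1 = 0.7
In [19]: A.eigenvals()
Out[19]: $$ \left\{ \frac{7}{10} : 1, \; 1 : 1 \right\} $$
In [20]: A.eigenvects()
Out[20]: $$ \left[ \left( 0.7, \; 1, \; \left[ \begin{bmatrix} -1.0 \\ 1.0 \end{bmatrix} \right] \right), \; \left( 1.0, \; 1, \; \left[ \begin{bmatrix} 2.0 \\ 1.0 \end{bmatrix} \right] \right) \right] $$
We can look at what happens after about a hundred steps using a small while loop
In [21]: u = Matrix([0, 1000])
i = 0
while i < 101:
    u = A * u
    i += 1
u
Out[21]: $$ \begin{bmatrix} 666.666666666668 \\ 333.333333333334 \end{bmatrix} $$
It seems that we are at about ⅔ vs ⅓
It looks suspiciously like the eigenvector of the eigenvalue 1
Remember the equation for the difference equation we used above?
$$ u_k = A^k u_0 = c_1 \lambda_1^k x_1 + c_2 \lambda_2^k x_2 + \ldots $$
We only have two terms, so this will become somewhat shortened
$$ u_k = A^k u_0 = c_1 \lambda_1^k x_1 + c_2 \lambda_2^k x_2 $$
$$ u_k = c_1 1^k \begin{bmatrix} 2 \\ 1 \end{bmatrix} + c_2 \left( \frac{7}{10} \right)^k \begin{bmatrix} -1 \\ 1 \end{bmatrix} $$
$$ u_0 = c_1 1^0 \begin{bmatrix} 2 \\ 1 \end{bmatrix} + c_2 \left( \frac{7}{10} \right)^0 \begin{bmatrix} -1 \\ 1 \end{bmatrix} $$
$$ u_0 = c_1 \begin{bmatrix} 2 \\ 1 \end{bmatrix} + c_2 \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1000 \end{bmatrix} $$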
In [22]: result = Matrix([[2, -1, 0], [1, 1, 1000]])
result.rref()
Out[22]: $$ \left( \begin{bmatrix} 1 & 0 & \frac{1000}{3} \\ 0 & 1 & \frac{2000}{3} \end{bmatrix}, \; [0, 1] \right) $$
So we have c_1 = 1000÷3 and c_2 = 2000÷3
This results in the final solution
$$ u_k = \frac{1000}{3} 1^k \begin{bmatrix} 2 \\ 1 \end{bmatrix} + \frac{2000}{3} \left( \frac{7}{10} \right)^k \begin{bmatrix} -1 \\ 1 \end{bmatrix} $$
Now we can work out the solution for any time step k and even see what happens as time approaches infinity
We note that the second term disappears as k approaches infinity, leaving the first term, which represents the steady state
Projections and Fourier series
Consider projections with orthonormal basis q_1, q_2, ..., q_n
Any vector v can be expressed (expanded) as a combination of this basis
$$ v = x_1 q_1 + x_2 q_2 + \cdots + x_n q_n $$
Because the basis vectors q_i were chosen to be orthonormal, taking the dot product with q_1 on both sides will cancel out all q_{i \neq 1} terms
$$ q_1^T v = x_1 q_1^T q_1 + 0 + 0 + \ldots $$
$$ q_1^T v = x_1 $$
We can see this clearly in matrix form below
$$ \begin{bmatrix} \vdots & \vdots & \vdots \\ q_1 & \cdots & q_n \\ \vdots & \vdots & \vdots \end{bmatrix} \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix} = v $$
$$ Qx = v $$
$$ x = Q^{-1} v $$
$$ \because Q^{-1} = Q^T $$
$$ x = Q^T v $$
The first component of x, x_1, is the first row of Q^T times v
The first row of Q^T is just q_1^T, so we end up with what we had above
$$ q_1^T v = x_1 $$
The key here was to choose orthonormal basis vectors
Now for how this relates to Fourier series
We might want to write a function as an infinite series of expressions
$$ f(x) = a_0 + a_1 \cos x + b_1 \sin x + a_2 \cos 2x + b_2 \sin 2x + \ldots $$
The idea here is that there is still something orthogonal in each of these expressions (sine and cosine)
Joseph Fourier realized that he could work in function space
The vectors are now functions in an infinite-dimensional space
The basis vectors are cos(x), sin(x), cos(2x)...
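The orthogonality of these basis functions can be checked symbolically. The sketch below is only an illustration: it assumes that the inner product of two functions is the integral of their product over one period, from 0 to 2π, and the helper name inner is hypothetical. It builds the table of inner products for the first four basis functions.

```python
from sympy import cos, integrate, pi, simplify, sin, symbols

x = symbols('x')


def inner(f, g):
    """Assumed inner product: integral of f * g over one period."""
    return integrate(f * g, (x, 0, 2 * pi))


basis = [cos(x), sin(x), cos(2 * x), sin(2 * x)]

# Table of all pairwise inner products
gram = [[simplify(inner(f, g)) for g in basis] for f in basis]

for row in gram:
    print(row)
```

Every off-diagonal entry is zero, which is the orthogonality property. The diagonal entries equal π rather than 1, so these functions are orthogonal but not of unit length.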
What is the inner (dot) product of functions that makes them orthogonal?
For vectors it was the following
$$ v^T w = v_1 w_1 + v_2 w_2 + \cdots + v_n w_n $$
For functions f and g the analogue is to multiply them, and the analogue of all the addition is integration
$$ f^T g = \int f(x) g(x) \, dx $$
Now we need to know the lower and upper limits of integration
We note that the sine and cosine functions are periodic and repeat every 2π
$$ \because f(x) = f(x + 2\pi) $$
$$ f^T g = \int_0^{2\pi} f(x) g(x) \, dx $$
Just like the inner product of pairs of orthogonal vectors gave zero, we have the same here
$$ \int_0^{2\pi} \sin x \cos x \, dx $$
$$ u(x) = \sin x $$
$$ \frac{du}{dx} = \cos x $$
$$ \cos x \, dx = du $$
$$ u_2(2\pi) = \sin 2\pi = 0 $$
$$ u_1(0) = \sin 0 = 0 $$
$$ \int_0^0 u \, du = 0 $$
With some trigonometric identities we can show the same for all the other pairs
A Fourier series is then an expansion of a function on this orthogonal basis
How do we get the coefficients then?
Same as above, i.e. taking the inner product of both sides with cos(x)
$$ \int_0^{2\pi} f(x) \cos x \, dx = a_1 \int_0^{2\pi} (\cos x)^2 \, dx + 0 + 0 + \cdots = a_1 \pi $$
$$ a_1 = \frac{1}{\pi} \int_0^{2\pi} f(x) \cos x \, dx $$
Example problem (Markov matrices)
Example problem 1
A particle jumps between positions A and B with the following probabilities
A to A (stays in A) probability is 0.6
A to B probability is 0.4 (so all probabilities from A add to 1.0)
B to B probability is 0.8
B to A probability is 0.2
If the particle starts in position A, what is the probability that it will be at A and B after the first step, n steps, and ∞ steps?
Solution
In [23]: x_vect = Matrix([1, 0])
A = Matrix([[0.6, 0.2], [0.4, 0.8]]) # Look carefully at what goes where
A, x_vect # Displaying the two matrices
Out[23]: $$ \left( \begin{bmatrix} 0.6 & 0.2 \\ 0.4 & 0.8 \end{bmatrix}, \; \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right) $$
After a single step we will have the following
In [24]: A * x_vect
Out[24]: $$ \begin{bmatrix} 0.6 \\ 0.4 \end{bmatrix} $$
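A probability of 0.6 of being in position A and a probability of 0.4 of being in position B
Now look at the following trend
$$ p_1 = A p_0 $$
$$ p_2 = A p_1 = A (A p_0) = A^2 p_0 $$
$$ p_3 = A^3 p_0 $$
$$ \vdots $$
$$ p_n = A^n p_0 $$
To take the power of a matrix in Python is easy, but let's remind ourselves that we need to work with eigenvalues and eigenvectors
Because all the column entries add to 1, we are dealing with a Markov matrix
One of the eigenvalues will be 1, the trace is 1.4, therefore the other eigenvalue is 0.4
Remember diagonalization?
In [25]: S, D = A.diagonalize()
S, D
Out[25]: $$ \left( \begin{bmatrix} -1.0 & 1.0 \\ 1.0 & 2.0 \end{bmatrix}, \; \begin{bmatrix} 0.4 & 0 \\ 0 & 1.0 \end{bmatrix} \right) $$
Note the matrix of eigenvalues (diagonal) D
Note the matrix of eigenvectors S
So for the eigenvalue of 1 we have the following eigenvector (steady state)
$$ \begin{bmatrix} 1 \\ 2 \end{bmatrix} $$
For the other eigenvalue we have the following eigenvector (decay)
$$ \begin{bmatrix} -1 \\ 1 \end{bmatrix} $$
$$ u_k = A^k u_0 = c_1 \lambda_1^k x_1 + c_2 \lambda_2^k x_2 $$
$$ u_k = c_1 1^k \begin{bmatrix} 1 \\ 2 \end{bmatrix} + c_2 \left( \frac{4}{10} \right)^k \begin{bmatrix} -1 \\ 1 \end{bmatrix} $$
$$ u_0 = c_1 1^0 \begin{bmatrix} 1 \\ 2 \end{bmatrix} + c_2 \left( \frac{4}{10} \right)^0 \begin{bmatrix} -1 \\ 1 \end{bmatrix} $$
$$ u_0 = c_1 \begin{bmatrix} 1 \\ 2 \end{bmatrix} + c_2 \begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} $$
Solving for the constants we can create an augmented matrix
In [26]: u = Matrix([[1, -1, 1], [2, 1, 0]])
u.rref()
Out[26]: $$ \left( \begin{bmatrix} 1 & 0 & \frac{1}{3} \\ 0 & 1 & -\frac{2}{3} \end{bmatrix}, \; [0, 1] \right) $$
This gives us the following solution
$$ u_n = \frac{1}{3} (1)^n \begin{bmatrix} 1 \\ 2 \end{bmatrix} + \left( -\frac{2}{3} \right) \left( \frac{4}{10} \right)^n \begin{bmatrix} -1 \\ 1 \end{bmatrix} $$
$$ p(A)_n = \frac{1}{3} + \left( -\frac{2}{3} \right) \left( \frac{4}{10} \right)^n (-1) $$
$$ p(B)_n = \frac{1}{3} (2) + \left( -\frac{2}{3} \right) \left( \frac{4}{10} \right)^n (1) $$
So at infinity it will be at position A with a probability of ⅓ and at position B with a probability of ⅔
In [27]: (A ** 1000000) * x_vect # Just to show how easy it is to take the power of A in Python
Out[27]: $$ \begin{bmatrix} 0.333333333362792 \\ 0.666666666725585 \end{bmatrix} $$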
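The n-step formula and its limit can also be obtained symbolically. This is a hedged sketch under the assumption that the transition matrix is entered with exact fractions and that n is treated as a positive symbol; it builds A^n from the diagonalization instead of multiplying A by itself.

```python
from sympy import Matrix, Rational, diag, expand, limit, oo, symbols

n = symbols('n', positive=True)

# Transition matrix of the particle, with exact fractions
A = Matrix([[Rational(6, 10), Rational(2, 10)],
            [Rational(4, 10), Rational(8, 10)]])
p0 = Matrix([1, 0])  # the particle starts in position A

# A = S D S^-1, so A^n = S D^n S^-1
S, D = A.diagonalize()
D_n = diag(*[D[i, i] ** n for i in range(D.rows)])
p_n = (S * D_n * S.inv() * p0).applyfunc(expand)

# Let the number of steps grow without bound
steady = p_n.applyfunc(lambda entry: limit(entry, n, oo))

print(p_n)
print(steady)
```

Substituting n = 1 into p_n reproduces the single-step result, and the limit gives the steady state directly, without computing a very large power of A.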
2015/03/29, 7:50 PMIII_01_Symmetric_matrices
Page 1 of 5http://localhost:8888/nbconvert/html/III_01_Symmetric_matrices.ipynb?download=false
This notebook is part of lecture 25 Symmetric matrices and positive definiteness in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper
Head of Acute Care SurgeryGroote Schuur HospitalUniversity Cape TownEmail me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
css_file = 'style.css'
HTML(open(css_file, 'r').read())
Out[1]:
In [2]: from IPython.core.display import HTML, Image
css_file = 'style.css'
HTML(open(css_file, 'r').read())
Out[2]:
In [3]: from sympy import init_printing, Matrix, symbols, sqrt
from warnings import filterwarnings
In [4]: init_printing(use_latex = 'mathjax')
filterwarnings('ignore')
Symmetric matrices
Positive definite matrices
Symmetric matrices
Symmetric matrices are square with the following property
$$ A = A^T $$
We are concerned with the eigenvalues and eigenvectors of symmetric matrices
The eigenvalues are real
The eigenvectors are orthogonal, or at least, can be chosen orthogonal
Considering a proof of the real nature of the eigenvalues, we have the following
Any matrix equation such as the following example can be changed to its complex conjugate form by changing each element into its complex conjugate (here marked with a bar over the top); the real matrix A is unchanged
$$ A \underline{x} = \lambda \underline{x} $$
$$ A \bar{\underline{x}} = \bar{\lambda} \bar{\underline{x}} $$
We can multiply the first equation on the left by the complex conjugate transpose of x
$$ \bar{\underline{x}}^T A \underline{x} = \lambda \bar{\underline{x}}^T \underline{x} \; \ldots (1) $$
Transposing the complex conjugate form and multiplying on the right by x, this becomes the following
$$ \bar{\underline{x}}^T A^T \underline{x} = \bar{\lambda} \bar{\underline{x}}^T \underline{x} $$
Now if A is symmetric we use the fact that A = A^T
$$ \bar{\underline{x}}^T A \underline{x} = \bar{\lambda} \bar{\underline{x}}^T \underline{x} \; \ldots (2) $$
Note how the left-hand sides of (1) and (2) are equal, so the right-hand sides must be equal too, and we therefore have the following
$$ \lambda \bar{\underline{x}}^T \underline{x} = \bar{\lambda} \bar{\underline{x}}^T \underline{x} $$
This means the following
$$ \lambda = \bar{\lambda} $$
The only way that this is possible is if the imaginary part is zero, so only real eigenvalues are possible
Note also what happens if the complex conjugate transpose of the vector x is multiplied by the vector itself
Remember that \bar{x}^T x is a form of the dot product (which is the length squared)
Any number times its complex conjugate gets rid of the imaginary part, so \bar{x}^T x is real and positive for a non-zero x and can be divided out
Consider the following symmetric matrix A
In [5]: A = Matrix([[5, 2], [2, 3]])
A
Out[5]: $$ \begin{bmatrix} 5 & 2 \\ 2 & 3 \end{bmatrix} $$
Let's see if it really is symmetric by making sure that it is equal to its transpose
In [6]: A == A.transpose() # Boolean (true or false) statement
Out[6]: True
In [7]: S, D = A.diagonalize()
S, the matrix containing the eigenvectors as its columns
Remember that these eigenvectors are not necessarily the same as those you would get doing these problems by hand
When substituting the values for λ_i a singular matrix is created with rows that are simply multiples of each other
You are free to choose values for the components of the eigenvectors for each eigenvalue (usually choosing the simplest ones)
In [8]: S
Out[8]: $$ \begin{bmatrix} -\frac{2}{1 + \sqrt{5}} & -\frac{2}{-\sqrt{5} + 1} \\ 1 & 1 \end{bmatrix} $$
D, the matrix containing the eigenvalues down the main diagonal
In [9]: D
Out[9]: $$ \begin{bmatrix} -\sqrt{5} + 4 & 0 \\ 0 & \sqrt{5} + 4 \end{bmatrix} $$
In decomposition, a symmetric matrix results in the following (with the eigenvectors in S scaled to unit length, so that S = Q and Q^{-1} = Q^T)
$$ A = S \Lambda S^T $$
$$ A = Q \Lambda Q^T $$
In this case we have an orthogonal matrix times a diagonal matrix times the transpose of the orthogonal matrix
In [10]: A.eigenvals()
Out[10]: $$ \left\{ -\sqrt{5} + 4 : 1, \; \sqrt{5} + 4 : 1 \right\} $$
In [11]: A.eigenvects()
Out[11]: $$ \left[ \left( -\sqrt{5} + 4, \; 1, \; \left[ \begin{bmatrix} -\frac{2}{1 + \sqrt{5}} \\ 1 \end{bmatrix} \right] \right), \; \left( \sqrt{5} + 4, \; 1, \; \left[ \begin{bmatrix} -\frac{2}{-\sqrt{5} + 1} \\ 1 \end{bmatrix} \right] \right) \right] $$
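The eigenvectors returned by diagonalize() are orthogonal but not of unit length, so S itself is not the orthogonal matrix Q. A minimal sketch of the missing step, assuming the same matrix A as above: each column of S is divided by its length, and the two identities Q^T Q = I and A = QΛQ^T are then checked numerically.

```python
from sympy import Matrix, eye

A = Matrix([[5, 2], [2, 3]])
S, Lam = A.diagonalize()

# Scale every eigenvector (column of S) to unit length to obtain Q
Q = Matrix.hstack(*[S.col(j) / S.col(j).norm() for j in range(S.cols)])

# Both differences should be zero matrices (evaluated numerically)
orthogonality_error = (Q.T * Q - eye(2)).evalf()
decomposition_error = (Q * Lam * Q.T - A).evalf()

print(orthogonality_error)
print(decomposition_error)
```

The numerical evaluation is a design choice for this sketch: the exact entries contain nested square roots, and evaluating them avoids relying on how far symbolic simplification gets.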
We've seen in our example that, indeed, the eigenvalues are real
Let's see if the eigenvectors are orthogonal by looking at their dot product
In [12]: eigenvec_1 = Matrix([-2 / (1 + sqrt(5)), 1])
eigenvec_2 = Matrix([-2 / (1 - sqrt(5)), 1])
eigenvec_1.dot(eigenvec_2)
Out[12]: $$ 1 + \frac{4}{\left( 1 + \sqrt{5} \right) \left( -\sqrt{5} + 1 \right)} $$
This is certainly zero when simplified
In [13]: (eigenvec_1.dot(eigenvec_2)).simplify() # Using the simplify() method
Out[13]: 0
We need not use symbolic computing (computer algebra system, CAS)
Let's look at numerical evaluation using numerical python (numpy)
In [14]: import numpy as np # Using namespace abbreviations
In [15]: A = np.matrix([[5, 2], [2, 3]])
A
Out[15]: matrix([[5, 2], [2, 3]])
In [16]: w, v = np.linalg.eig(A) # Calculating the eigenvalues and eigenvectors
# The result of np.linalg.eig() is a tuple, the first being the eigenvalues
# The second being the eigenvectors
In [17]: w
Out[17]: array([ 6.23606798, 1.76393202])
In [18]: v
Out[18]: matrix([[ 0.85065081, -0.52573111], [ 0.52573111, 0.85065081]])
In [19]: # Creating the diagonal matrix manually from the eigenvalues
D = np.matrix([[6.23606798, 0], [0, 1.76393202]])
D
Out[19]: matrix([[ 6.23606798, 0. ], [ 0. , 1.76393202]])
In [20]: # Checking to see if our equation for A holds
v * D * np.matrix.transpose(v)
Out[20]: matrix([[ 5., 2.], [ 2., 3.]])
Positive definite matrices (referring to symmetric matrices)
The properties of positive definite (symmetric) matrices
All eigenvalues are positive
All pivots are positive
All determinants (actually also all sub-determinants) are positive
A positive definite (square symmetric) matrix A is invertible, which follows from the following
The determinant is the product of the eigenvalues
All the eigenvalues are positive
The determinant must therefore be larger than zero (and thus non-zero)
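The three tests listed above can be run side by side. The sketch below is an illustration, assuming the symmetric matrix A = [[5, 2], [2, 3]] from earlier in these notes. It reads the pivots off the diagonal of U in an LU decomposition and takes the determinants of the upper-left k×k blocks as the sub-determinants.

```python
from sympy import Matrix

A = Matrix([[5, 2], [2, 3]])

# Test 1: all eigenvalues positive
eigenvalues = list(A.eigenvals().keys())

# Test 2: all pivots positive (the diagonal of U in A = L U)
L, U, _ = A.LUdecomposition()
pivots = [U[i, i] for i in range(U.rows)]

# Test 3: all upper-left sub-determinants positive
minors = [A[:k, :k].det() for k in range(1, A.rows + 1)]

print(eigenvalues, all(ev.is_positive for ev in eigenvalues))
print(pivots, all(p > 0 for p in pivots))
print(minors, all(m > 0 for m in minors))
```

For a symmetric matrix the three tests agree, so any one of them is enough in practice. Note that the product of the pivots equals the determinant, which is the last of the sub-determinants.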
For projection matrices
The eigenvalues are either 0 or 1
If this projection matrix is positive definite
The eigenvalues must all be 1 (since they must be larger than zero)
The only matrix that satisfies this property is the identity matrix
A diagonal matrix D with positive entries is positive definite
This means that for any non-zero vector x we have x^T D x > 0
Let's look at a 3-component vector with a 3×3 matrix D
In [21]: d1, d2, d3, x1, x2, x3 = symbols('d1 d2 d3 x1 x2 x3')
In [22]: D = Matrix([[d1, 0, 0], [0, d2, 0], [0, 0, d3]])
x_vect = Matrix([x1, x2, x3])
x_vect.transpose(), D, x_vect
Out[22]: $$ \left( \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix}, \; \begin{bmatrix} d_1 & 0 & 0 \\ 0 & d_2 & 0 \\ 0 & 0 & d_3 \end{bmatrix}, \; \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \right) $$
Indeed we have x^T D x > 0 since the components of x are squared and the eigenvalues are all positive
In [23]: x_vect.transpose() * D * x_vect
Out[23]: $$ \begin{bmatrix} d_1 x_1^2 + d_2 x_2^2 + d_3 x_3^2 \end{bmatrix} $$
Not all symmetric matrices with a positive determinant are positive definite
Easy matrices to construct with this property have negative values on the main diagonal
Note below how the eigenvalues are not all more than zero
Also note how x^T D x ≯ 0
It is important to note that the sub-determinants must also be positive
In the example below the sub-determinant of the entry 3 is -1
In [24]: A = Matrix([[3, 1], [1, -1]])
A
Out[24]: $$ \begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix} $$
In [25]: A == A.transpose()
Out[25]: True
In [26]: A.det()
Out[26]: −4
In [27]: A.eigenvals()
Out[27]: $$ \left\{ 1 + \sqrt{5} : 1, \; -\sqrt{5} + 1 : 1 \right\} $$
In [28]: A.eigenvects()
Out[28]: $$ \left[ \left( 1 + \sqrt{5}, \; 1, \; \left[ \begin{bmatrix} -\frac{1}{-\sqrt{5} + 2} \\ 1 \end{bmatrix} \right] \right), \; \left( -\sqrt{5} + 1, \; 1, \; \left[ \begin{bmatrix} -\frac{1}{2 + \sqrt{5}} \\ 1 \end{bmatrix} \right] \right) \right] $$
In [29]: S, D = A.diagonalize()
In [30]: S
Out[30]: $$ \begin{bmatrix} -\frac{1}{-\sqrt{5} + 2} & -\frac{1}{2 + \sqrt{5}} \\ 1 & 1 \end{bmatrix} $$
In [31]: D
Out[31]: $$ \begin{bmatrix} 1 + \sqrt{5} & 0 \\ 0 & -\sqrt{5} + 1 \end{bmatrix} $$
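In [32]: x_vect = Matrix([x1, x2])
x_vect
Out[32]: $$ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} $$
In [33]: x_vect.transpose() * D * x_vect
Out[33]: $$ \begin{bmatrix} x_1^2 \left( 1 + \sqrt{5} \right) + x_2^2 \left( -\sqrt{5} + 1 \right) \end{bmatrix} $$
In this example the sub-determinant of the entry 1 is -3
In [34]: A = Matrix([[-3, 1], [1, 1]])
A
Out[34]: $$ \begin{bmatrix} -3 & 1 \\ 1 & 1 \end{bmatrix} $$
In [35]: A == A.transpose()
Out[35]: True
In [36]: S, D = A.diagonalize()
In [37]: x_vect.transpose() * D * x_vect
Out[37]: $$ \begin{bmatrix} x_1^2 \left( -1 + \sqrt{5} \right) + x_2^2 \left( -\sqrt{5} - 1 \right) \end{bmatrix} $$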
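The cells above evaluate the quadratic form with the diagonal matrix D of eigenvalues. The same conclusion can be reached with A itself, as in the following sketch, which assumes the first of the two example matrices and evaluates x^T A x at two simple vectors.

```python
from sympy import Matrix, expand, symbols

x1, x2 = symbols('x1 x2', real=True)
x = Matrix([x1, x2])

A = Matrix([[3, 1], [1, -1]])

# The quadratic form x^T A x written out as a polynomial in x1 and x2
quad = expand((x.T * A * x)[0])
print(quad)

# Evaluate the form along the two coordinate axes
along_x1 = quad.subs({x1: 1, x2: 0})
along_x2 = quad.subs({x1: 0, x2: 1})
print(along_x1, along_x2)
```

The form is positive in one direction and negative in the other, so the matrix is neither positive definite nor negative definite.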
2015/03/29, 7:54 PMIII_02_Complex_matrices_FFT
Page 1 of 6http://localhost:8888/nbconvert/html/III_02_Complex_matrices_FFT.ipynb?download=false
This notebook is part of lecture 26 Complex matrices and the fast Fourier transform in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper
Head of Acute Care SurgeryGroote Schuur HospitalUniversity Cape TownEmail me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
css_file = 'style.css'
HTML(open(css_file, 'r').read())
Out[1]:
In [2]: from sympy import init_printing, Matrix, symbols, I, sqrt, Rational
from IPython.display import Image
from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
filterwarnings('ignore')
Complex vectors, matrices
Fast Fourier transform
Complex vectors
Consider the following vector with complex entries (from this point on I will not use the underline to indicate a vector, so as not to create confusion with the bar that denotes the complex conjugate; vectors are instead inferred from context)
$$ z = \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} $$
The length (actually length squared) of this vector calculated in the following way is no good, since it should be positive
$$ z^T z $$
Instead we consider the following
$$ \left| z \right|^2 = \bar{z}^T z $$
$$ \therefore \bar{z}^T z = \begin{bmatrix} \bar{z}_1, & \bar{z}_2, & \ldots, & \bar{z}_n \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} $$
In [4]: z = Matrix([1, I]) # I is the sympy symbol for the imaginary unit i
z
Out[4]: $$ \begin{bmatrix} 1 \\ i \end{bmatrix} $$
Let's calculate this manually
In [5]: z.norm() # The length of a vector
Out[5]: $$ \sqrt{2} $$
In [6]: z_cc = Matrix([1, -I])
z_cc
Out[6]: $$ \begin{bmatrix} 1 \\ -i \end{bmatrix} $$
In [7]: sqrt(z_cc.transpose() * z)
Out[7]: $$ \left( \begin{bmatrix} 2 \end{bmatrix} \right)^{\frac{1}{2}} $$
Taking the transpose of the complex conjugate is called the Hermitian
$$ z^H z $$
We can use the Hermitian for non-complex (or mixed complex) vectors x and y too
$$ \bar{y}^T x $$
$$ y^H x $$
In [8]: from sympy.physics.quantum.dagger import Dagger # A fun way to quickly get the Hermitian
In [9]: Dagger(z)
Out[9]: $$ \begin{bmatrix} 1 & -i \end{bmatrix} $$
In [10]: sqrt(Dagger(z) * z)
Out[10]: $$ \left( \begin{bmatrix} 2 \end{bmatrix} \right)^{\frac{1}{2}} $$
Complex symmetric matrices
The transpose
If the symmetric matrix has complex entries then A^T = A is no good
In [11]: A = Matrix([[2, 3 + I], [3 - I, 5]])
A # A Hermitian matrix
Out[11]: $$ \begin{bmatrix} 2 & 3 + i \\ 3 - i & 5 \end{bmatrix} $$
In [12]: A.transpose() == A
Out[12]: False
In [13]: Dagger(A)
Out[13]: $$ \begin{bmatrix} 2 & 3 + i \\ 3 - i & 5 \end{bmatrix} $$
In [14]: Dagger(A) == A
Out[14]: True
This will work for real-valued symmetric matrices as well
In [15]: A = Matrix([[3, 4], [4, 2]])
A
Out[15]: $$ \begin{bmatrix} 3 & 4 \\ 4 & 2 \end{bmatrix} $$
In [16]: A.transpose() == A
Out[16]: True
In [17]: Dagger(A) == A
Out[17]: True
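The conjugate transpose is also available directly on sympy matrices, without importing Dagger. The sketch below is an alternative to the cells above, not a replacement for them: it uses the H attribute and the is_hermitian property of a sympy Matrix on the same vector z and matrix A.

```python
from sympy import I, Matrix, sqrt

z = Matrix([1, I])

# z.H is the conjugate transpose; z.H * z is a 1x1 matrix holding |z|^2
length = sqrt((z.H * z)[0])

A = Matrix([[2, 3 + I], [3 - I, 5]])

# A Hermitian matrix equals its own conjugate transpose
equals_conjugate_transpose = (A.H == A)
equals_plain_transpose = (A.T == A)

print(length)
print(equals_conjugate_transpose, A.is_hermitian, equals_plain_transpose)
```

Taking the single entry out of the 1×1 product before applying sqrt gives a plain number instead of the power of a matrix shown in the outputs above.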
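The eigenvalues and eigenvectors
Back to the complex matrix A
In [18]: A = Matrix([[2, 3 + I], [3 - I, 5]])
A
Out[18]: $$ \begin{bmatrix} 2 & 3 + i \\ 3 - i & 5 \end{bmatrix} $$
In [19]: A.eigenvals()
Out[19]: $$ \{ 0 : 1, \; 7 : 1 \} $$
$$ A = \begin{bmatrix} 2 & 3 + i \\ 3 - i & 5 \end{bmatrix} $$
$$ \left| A - \lambda I \right| = 0 $$
$$ \left| \begin{bmatrix} 2 & 3 + i \\ 3 - i & 5 \end{bmatrix} - \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} \right| = 0 $$
$$ \begin{vmatrix} 2 - \lambda & 3 + i \\ 3 - i & 5 - \lambda \end{vmatrix} = 0 $$
$$ (2 - \lambda)(5 - \lambda) - (3 + i)(3 - i) = 0 $$
$$ 10 - 7\lambda + \lambda^2 - (9 + 1) = 0 $$
$$ \lambda^2 - 7\lambda = 0 $$
$$ \lambda_1 = 0 $$
$$ \lambda_2 = 7 $$
In [20]: A.eigenvects()
Out[20]: $$ \left[ \left( 0, \; 1, \; \left[ \begin{bmatrix} -\frac{3}{2} - \frac{i}{2} \\ 1 \end{bmatrix} \right] \right), \; \left( 7, \; 1, \; \left[ \begin{bmatrix} \frac{3}{5} + \frac{i}{5} \\ 1 \end{bmatrix} \right] \right) \right] $$
In [21]: S, D = A.diagonalize()
In [22]: S
Out[22]: $$ \begin{bmatrix} -3 - i & 3 + i \\ 2 & 5 \end{bmatrix} $$
In [23]: D
Out[23]: $$ \begin{bmatrix} 0 & 0 \\ 0 & 7 \end{bmatrix} $$
What about S^T now?
We have to use its transpose, but it is complex, so we have to take the Hermitian
In [24]: Dagger(S)
Out[24]: $$ \begin{bmatrix} -3 + i & 2 \\ 3 - i & 5 \end{bmatrix} $$
In [25]: S == Dagger(S) # Don't get confused here, S is not symmetric
Out[25]: False
Remember that for a symmetric matrix the column vectors in S (usually called Q, the matrix of eigenvectors) are orthogonal, with Q^T Q = I once they are scaled to unit length
With complex entries we have to consider the Hermitian here, not just the simple transpose, i.e. Q^H Q = I
Here we call Q unitary
The fast Fourier transform
Look at this special matrix (where we start counting rows and columns at zero)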
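$$ F_n = \begin{bmatrix} W^{(0)(0)} & W^{(0)(1)} & W^{(0)(2)} & \cdots & W^{(0)(n-1)} \\ W^{(1)(0)} & W^{(1)(1)} & W^{(1)(2)} & \cdots & W^{(1)(n-1)} \\ W^{(2)(0)} & W^{(2)(1)} & W^{(2)(2)} & \cdots & W^{(2)(n-1)} \\ \vdots & \vdots & \vdots & \cdots & \vdots \\ W^{(n-1)(0)} & W^{(n-1)(1)} & W^{(n-1)(2)} & \cdots & W^{(n-1)(n-1)} \end{bmatrix} $$
$$ \left( F_n \right)_{ij} = W^{ij}; \quad i, j = 0, 1, 2, \ldots, n - 1 $$
W is a special number whose nth power equals 1
$$ W^n = 1 $$
It is in the complex plane of course (as written in sin and cos below)
$$ W = e^{\frac{i 2 \pi}{n}} = \cos \frac{2 \pi}{n} + i \sin \frac{2 \pi}{n} $$
Remember that n here refers to the size of the matrix
Here it also refers to the n nth roots of 1 (if that makes any sense, else look at the image below)
In [26]: Image(filename = 'W.png')
Out[26]: (image W.png)
So for n=4 we will have the following
$$ F_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & \left( e^{\frac{2 \pi i}{4}} \right) & \left( e^{\frac{2 \pi i}{4}} \right)^2 & \left( e^{\frac{2 \pi i}{4}} \right)^3 \\ 1 & \left( e^{\frac{2 \pi i}{4}} \right)^2 & \left( e^{\frac{2 \pi i}{4}} \right)^4 & \left( e^{\frac{2 \pi i}{4}} \right)^6 \\ 1 & \left( e^{\frac{2 \pi i}{4}} \right)^3 & \left( e^{\frac{2 \pi i}{4}} \right)^6 & \left( e^{\frac{2 \pi i}{4}} \right)^9 \end{bmatrix} $$
We note that a quarter of the way around is i
$$ e^{\frac{2 \pi i}{4}} = i $$
We thus have the following
$$ F_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & i & i^2 & i^3 \\ 1 & i^2 & i^4 & i^6 \\ 1 & i^3 & i^6 & i^9 \end{bmatrix} $$
$$ F_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & i & -1 & -i \\ 1 & -1 & 1 & -1 \\ 1 & -i & -1 & i \end{bmatrix} $$
Note how the columns are orthogonal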
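In [27]: F = Matrix([[1, 1, 1, 1], [1, I, -1, -I], [1, -1, 1, -1], [1, -I, -1, I]])
F
Out[27]: $$ \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & i & -1 & -i \\ 1 & -1 & 1 & -1 \\ 1 & -i & -1 & i \end{bmatrix} $$
In [28]: F.col(0) # Calling only the selected column (counting starts at 0)
Out[28]: $$ \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} $$
The columns are supposed to be orthogonal, i.e. the inner (dot) product should be zero
Clearly below it is not
In [29]: F.col(1).dot(F.col(3))
Out[29]: 4
Remember, though, that this is a complex matrix and we have to use the Hermitian
In [30]: col1 = F.col(1)
col3 = F.col(3)
col1, col3
Out[30]: $$ \left( \begin{bmatrix} 1 \\ i \\ -1 \\ -i \end{bmatrix}, \; \begin{bmatrix} 1 \\ -i \\ -1 \\ i \end{bmatrix} \right) $$
In [31]: Dagger(col3), col1
Out[31]: $$ \left( \begin{bmatrix} 1 & i & -1 & -i \end{bmatrix}, \; \begin{bmatrix} 1 \\ i \\ -1 \\ -i \end{bmatrix} \right) $$
In [32]: Dagger(col3) * col1 # Another way to do the dot product
Out[32]: $$ \begin{bmatrix} 0 \end{bmatrix} $$
So, these columns are all orthogonal, but they are not orthonormal
Note, though, that they are all of length 2, so we can normalize each
In [33]: Rational(1, 2) * F
Out[33]: $$ \begin{bmatrix} \frac{1}{2} & \frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{i}{2} & -\frac{1}{2} & -\frac{i}{2} \\ \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} & -\frac{1}{2} \\ \frac{1}{2} & -\frac{i}{2} & -\frac{1}{2} & \frac{i}{2} \end{bmatrix} $$
We also note the following
$$ F_n^H F_n = I $$
Just remember to normalize them
In [34]: Dagger(Rational(1, 2) * F)
Out[34]: $$ \begin{bmatrix} \frac{1}{2} & \frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & -\frac{i}{2} & -\frac{1}{2} & \frac{i}{2} \\ \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} & -\frac{1}{2} \\ \frac{1}{2} & \frac{i}{2} & -\frac{1}{2} & -\frac{i}{2} \end{bmatrix} $$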
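In [35]: Dagger(Rational(1, 2) * F) * ((Rational(1, 2) * F))
Out[35]: $$ \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} $$
Now why do we call it the fast Fourier transform?
Note the following
$$ W_n = e^{\frac{2 \pi i}{n}} $$
$$ \left( W_n \right)^p = \left( e^{\frac{2 \pi i}{n}} \right)^p $$
$$ \left( W_{64} \right)^2 = \left( e^{\frac{2 \pi i}{64}} \right)^2; \quad n = 64, \; p = 2 $$
$$ \therefore \left( W_{64} \right)^2 = W_{32} $$
Now we have the following connection between the two
$$ \left[ F_{64} \right] = \begin{bmatrix} I & D \\ I & -D \end{bmatrix} \begin{bmatrix} F_{32} & 0 \\ 0 & F_{32} \end{bmatrix} \left[ P \right] $$
$$ D = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & W & 0 & \cdots & 0 \\ 0 & 0 & W^2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \cdots & \vdots \\ 0 & 0 & 0 & \cdots & W^{31} \end{bmatrix} $$
P is a permutation matrix (it places the even-numbered components ahead of the odd-numbered ones)
Going down to 16 will include the following
$$ \begin{bmatrix} I & D & 0 & 0 \\ I & -D & 0 & 0 \\ 0 & 0 & I & D \\ 0 & 0 & I & -D \end{bmatrix} \begin{bmatrix} F_{16} & 0 & 0 & 0 \\ 0 & F_{16} & 0 & 0 \\ 0 & 0 & F_{16} & 0 \\ 0 & 0 & 0 & F_{16} \end{bmatrix} \left[ P \right] $$
The recursive work above decreases the work that is required for these problems, from about n² multiplications to about ½ n log₂ n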
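The factorization can be tried on the smallest interesting case, going from F_4 down to F_2. The sketch below is an illustration under stated assumptions: the helper fourier_matrix is a hypothetical name introduced here, the roots of unity are passed in explicitly (i for n = 4 and -1 for n = 2), and P is written out by hand as the permutation that places components 0 and 2 ahead of components 1 and 3.

```python
from sympy import I, Integer, Matrix, diag, eye, simplify, zeros


def fourier_matrix(n, w):
    """Hypothetical helper: n x n matrix with entries w**(j*k), j, k from 0."""
    return Matrix(n, n, lambda j, k: w ** (j * k))


F4 = fourier_matrix(4, I)            # w = e^(2 pi i / 4) = i
F2 = fourier_matrix(2, Integer(-1))  # w = e^(2 pi i / 2) = -1

# D holds the powers 1, w of the 4th root of unity
D = diag(1, I)

# Left factor [[I, D], [I, -D]] built from 2x2 blocks
left = eye(2).row_join(D).col_join(eye(2).row_join(-D))

# Middle factor: two copies of F2 on the block diagonal
middle = diag(F2, F2)

# Permutation: even-numbered components first, then odd-numbered ones
P = Matrix([[1, 0, 0, 0],
            [0, 0, 1, 0],
            [0, 1, 0, 0],
            [0, 0, 0, 1]])

factorization_error = (left * middle * P - F4).applyfunc(simplify)
unitary_error = (F4.H * F4 - 4 * eye(4)).applyfunc(simplify)

print(factorization_error == zeros(4, 4))
print(unitary_error == zeros(4, 4))
```

The second check confirms that the columns of F_4 are orthogonal with squared length 4, which is why the factor ½ was needed to normalize them.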
2015/03/29, 8:00 PMIII_03_Positive_definite_matrix_Minima_Ellipsoid
Page 1 of 13http://localhost:8888/nbconvert/html/III_03_Positive_definite_matrix_Minima_Ellipsoid.ipynb?download=false
This notebook is part of lecture 27, Positive definite matrices and minima, in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper
Head of Acute Care Surgery
Groote Schuur Hospital
University of Cape Town
Email me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
Linear Algebra OCW MIT 18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)
[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June 2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org
In [1]: from IPython.core.display import HTML, Image
css_file = 'style.css'
HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, Matrix, symbols, Derivative, diff
from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
filterwarnings('ignore')
In [4]: a, b, c, d, lamda, x1, x2, x3 = symbols('a b c d lamda x1 x2 x3')
Tests for positive definite matrices
Tests for minimum ($x^T A x > 0$)
Ellipsoids in $\mathbb{R}^n$
When is a symmetric matrix positive definite?
Let's first consider the 2×2 matrix $\begin{bmatrix} a & b \\ b & c \end{bmatrix}$
Tests for positive definiteness
$\lambda_1 > 0$, $\lambda_2 > 0$
$a > 0$, $ac - b^2 > 0$
The pivots larger than zero: $a > 0; \ \frac{ac - b^2}{a} > 0$
$x^T A x > 0$
Let's look at some example matrices
In [5]: A = Matrix([[2, 6], [6, a]])
A
Out[5]: $\begin{bmatrix} 2 & 6 \\ 6 & a \end{bmatrix}$
The first question is what value of a would make this symmetric matrix positive definite
The second would be, which of the tests above would you use
The second question first
It seems the determinant test would suffice
We need 2a − 36 > 0
The first question is then answered
a must therefore be larger than 18
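The determinant test above can be reproduced in SymPy. This is a small sketch rather than the notebook's own cell: the sample values 7, 18 and 26 are the ones the notes go on to use, and declaring `a` as a real symbol is an assumption made here.

```python
from sympy import Matrix, symbols, solve

a = symbols('a', real=True)
A = Matrix([[2, 6], [6, a]])

determinant = A.det()                  # 2*a - 36
boundary = solve(determinant, a)[0]    # value of a where the determinant is zero

# Determinant test: the top-left entry and the full determinant must both be positive
samples = {
    value: bool(A[0, 0] > 0) and bool(determinant.subs(a, value) > 0)
    for value in (7, 18, 26)
}
print(boundary, samples)
```

The boundary value itself (a = 18) fails the strict test, which is the positive semi-definite case.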
Let's play around by making a equal to 18
In [6]: A = Matrix([[2, 6], [6, 18]])
A
Out[6]: $\begin{bmatrix} 2 & 6 \\ 6 & 18 \end{bmatrix}$
In [7]: A.charpoly(lamda)
Out[7]: $\text{PurePoly}\left( \lambda^2 - 20 \lambda, \lambda, domain = \mathbb{Z} \right)$
In [8]: A.eigenvals()
Out[8]: $\left\{ 0 : 1, \ 20 : 1 \right\}$
One of the eigenvalues is zero (after all, it is a singular matrix now, so one eigenvalue must be zero)
It is a 2×2 matrix and we must have two eigenvalues
The other eigenvalue must equal the trace of A, which is 20 (therefore there was no need to calculate the eigenvalues; we could just reason and read them off)
We'll call this matrix positive semi-definite
Notice that the pivot test would not have helped: the second pivot is $\frac{(2)(18) - 6^2}{2} = 0$
Let's look at $x^T A x > 0$ (where x is any correctly-sized vector)
In [9]: x_vect = Matrix([x1, x2])
x_vect
Out[9]: $\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}$
In [10]: f = x_vect.transpose() * A * x_vect
f
Out[10]: $\begin{bmatrix} x_1 \left( 2 x_1 + 6 x_2 \right) + x_2 \left( 6 x_1 + 18 x_2 \right) \end{bmatrix}$
In [11]: f.expand() # Expanding the expression shows it is now quadratic (not linear anymore)
Out[11]: $\begin{bmatrix} 2 x_1^2 + 12 x_1 x_2 + 18 x_2^2 \end{bmatrix}$
For A to be positive definite, this quadratic must be positive for all values of $x_i$
Below I use some 3D plotting
It is not too clear to see, but note that nowhere does the plot go below zero on the z-axis
In [12]: import scipy as sp
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
%matplotlib inline
In [13]: fig = plt.figure(figsize = (10, 8))
ax = Axes3D(fig)
x = sp.linspace(-2, 2, 100)
y = sp.linspace(-2, 2, 100)
[x, y] = sp.meshgrid(x, y)
z = 2 * x ** 2 + 12 * x * y + 18 * y ** 2
ax.plot_wireframe(x, y, z, rstride = 5, cstride = 5)
plt.show();
We can construct a matrix with a different value for a (otherwise the same matrix as above), which will clearly not be positive definite
In [14]: A = Matrix([[2, 6], [6, 7]])
A
Out[14]: $\begin{bmatrix} 2 & 6 \\ 6 & 7 \end{bmatrix}$
In [15]: f = x_vect.transpose() * A * x_vect
f.expand()
Out[15]: $\begin{bmatrix} 2 x_1^2 + 12 x_1 x_2 + 7 x_2^2 \end{bmatrix}$
In [16]: fig = plt.figure(figsize = (10, 8))
ax = Axes3D(fig)
x = sp.linspace(-20, 20, 100)
y = sp.linspace(-20, 20, 100)
[x, y] = sp.meshgrid(x, y)
z = 2 * x ** 2 + 12 * x * y + 7 * y ** 2
ax.plot_wireframe(x, y, z, rstride = 5, cstride = 5)
plt.show();
I've saved a separate rendition of this, rotated so that you can see that we are dipping below z = 0
In [17]: Image(filename = 'figure_1.png', width = 800)
Out[17]: (saved figure: the rotated wireframe plot)
Clearly now, for some values of $x_i$ the z value is less than zero
Now for an example which is clearly positive definite
In [18]: A = Matrix([[2, 6], [6, 26]])
A
Out[18]: $\begin{bmatrix} 2 & 6 \\ 6 & 26 \end{bmatrix}$
In [19]: f = x_vect.transpose() * A * x_vect
f.expand()
Out[19]: $\begin{bmatrix} 2 x_1^2 + 12 x_1 x_2 + 26 x_2^2 \end{bmatrix}$
In [20]: fig = plt.figure(figsize = (10, 8))
ax = Axes3D(fig)
x = sp.linspace(-2, 2, 100)
y = sp.linspace(-2, 2, 100)
[x, y] = sp.meshgrid(x, y)
z = 2 * x ** 2 + 12 * x * y + 26 * y ** 2
ax.plot_wireframe(x, y, z, rstride = 5, cstride = 5)
plt.show();
Minima
We use the following function, from the symmetric matrix above with a = 20
$f \left( x_1, x_2 \right) = 2 x_1^2 + 12 x_1 x_2 + 20 x_2^2$
Completing the square we have the following
$f \left( x_1, x_2 \right) = 2 \left( x_1 + 3 x_2 \right)^2 + 2 x_2^2$
From this we can see that we are dealing with all positive values irrespective of the values of the variables
In [21]: (2 * (x1 + 3 * x2) ** 2).expand() # Just checking if we are correct (the squared term only; adding 2 * x2 ** 2 gives the 20)
Out[21]: $2 x_1^2 + 12 x_1 x_2 + 18 x_2^2$
Setting the equation equal to a (positive) value will cut through the plot and result in an ellipse
Cutting through a saddle point results in a hyperbola
In [22]: # Rewriting the function above as a computer variable
function = 2 * (x1 + 3 * x2) ** 2 + 2 * x2 ** 2
function
Out[22]: $2 x_2^2 + 2 \left( x_1 + 3 x_2 \right)^2$
In [23]: # Derivative(f, variable with respect to which the partial derivative is taken, order)
Derivative(function, x1) # Printing the partial derivative to the screen
Out[23]: $\frac{\partial}{\partial x_1} \left( 2 x_2^2 + 2 \left( x_1 + 3 x_2 \right)^2 \right)$
In [24]: Derivative(function, x1).doit() # The .doit() method executes the partial derivative
Out[24]: $4 x_1 + 12 x_2$
In [25]: diff(function, x2, 1) # Alternative method of doing the partial derivative
Out[25]: $12 x_1 + 40 x_2$
Solving for the two variables in two equations using an augmented matrix
In [26]: M = Matrix([[4, 12, 0], [12, 40, 0]])
M.rref()
Out[26]: $\left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \ \left[ 0, 1 \right] \right)$
Let's look at this if we cut through the $x_1 z$-plane (that is $x_2 = 0$) and the $x_2 z$-plane (that is $x_1 = 0$)
$f \left( x_1, x_2 \right) = 2 \left( x_1 + 3 x_2 \right)^2 + 2 x_2^2$
$f \left( x_1, 0 \right) = 2 \left( x_1 \right)^2$
$f \left( 0, x_2 \right) = 2 \left( 3 x_2 \right)^2 + 2 x_2^2 = 20 x_2^2$
Let's look at the two plots
In [27]: x = sp.linspace(-2, 2, 100)
plt.figure(figsize = (10, 8))
plt.plot(x, 2 * x ** 2)
plt.show();
In [28]: y = sp.linspace(-2, 2, 100)
plt.figure(figsize = (10, 8))
plt.plot(y, 20 * y ** 2)
plt.show();
If you reconstruct this in your mind's eye, you can see that we are dealing with a bowl shape
Remember from calculus that an extremum is found by setting the first derivative to zero
We look at the second derivative to know if we are dealing with a minimum or a maximum
In [29]: diff(2 * x1 ** 2, x1, 2) # Taking a derivative twice
Out[29]: 4
In [30]: diff(20 * x2 ** 2, x2, 2)
Out[30]: 40
These second derivatives are both positive and we have a minimum
Let's go for a setup again which will not be positive definite
$f \left( x_1, x_2 \right) = 2 x_1^2 + 12 x_1 x_2 + 7 x_2^2$
Completing the square gives the following equation
$f \left( x_1, x_2 \right) = 2 \left( x_1 + 3 x_2 \right)^2 - 11 x_2^2$
In [31]: function = 2 * (x1 + 3 * x2) ** 2 - 11 * x2 ** 2
function, function.expand() # Just checking to see if our completion of the square was correct
Out[31]: $\left( -11 x_2^2 + 2 \left( x_1 + 3 x_2 \right)^2, \ 2 x_1^2 + 12 x_1 x_2 + 7 x_2^2 \right)$
In [32]: Derivative(function, x1), diff(function, x1)
Out[32]: $\left( \frac{\partial}{\partial x_1} \left( -11 x_2^2 + 2 \left( x_1 + 3 x_2 \right)^2 \right), \ 4 x_1 + 12 x_2 \right)$
In [33]: Derivative(function, x2), diff(function, x2)
Out[33]: $\left( \frac{\partial}{\partial x_2} \left( -11 x_2^2 + 2 \left( x_1 + 3 x_2 \right)^2 \right), \ 12 x_1 + 14 x_2 \right)$
In [34]: M = Matrix([[4, 12, 0], [12, 14, 0]])
M.rref()
Out[34]: $\left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}, \ \left[ 0, 1 \right] \right)$
Again, an extremum at (0,0)
In your mind's eye we clearly have a saddle point at $\left( x_1, x_2 \right) = \left( 0, 0 \right)$
Let's look at $x_1 = -x_2$
$f \left( x_1, x_2 \right) = 2 x_1^2 + 12 x_1 x_2 + 7 x_2^2$
$f \left( x_1, -x_1 \right) = 2 x_1^2 - 12 x_1 x_1 + 7 x_1^2$
$f \left( x_1, -x_1 \right) = 2 x_1^2 - 12 x_1^2 + 7 x_1^2$
$f \left( x_1, -x_1 \right) = -3 x_1^2$
We are creating a 45° plane and values for z will be negative here
This is what makes the matrix non-positive definite
$x^T A x$ results in an expression which we can use to show the always positive case, the not-always positive case (and the marginal case described above)
So positive definite is the matrix equivalent of the first and second derivative tests in calculus (which look at the shape of the plot, i.e. extrema)
For a 2×2 we are thus looking for the following
$\begin{bmatrix} \frac{\partial^2}{\partial x \partial x} & \frac{\partial^2}{\partial x \partial y} \\ \frac{\partial^2}{\partial y \partial x} & \frac{\partial^2}{\partial y \partial y} \end{bmatrix}$
The pivots, the multiplier and completing the square
Let's take the matrix below
We know it's symmetric and positive definite
We also saw that for $x^T A x$ we had to complete the square to show z > 0
This can easily be done by looking at the pivots and the multiplier
In [35]: A = Matrix([[2, 6], [6, 20]])
A
Out[35]: $\begin{bmatrix} 2 & 6 \\ 6 & 20 \end{bmatrix}$
In [36]: L, U, _ = A.LUdecomposition()
L, U
Out[36]: $\left( \begin{bmatrix} 1 & 0 \\ 3 & 1 \end{bmatrix}, \ \begin{bmatrix} 2 & 6 \\ 0 & 2 \end{bmatrix} \right)$
Note how the pivots are 2 and 2 and the multiplier was 3
Now look at the completed square equation
In [37]: (x_vect.transpose() * A * x_vect).expand()
Out[37]: $\begin{bmatrix} 2 x_1^2 + 12 x_1 x_2 + 20 x_2^2 \end{bmatrix}$
$2 x_1^2 + 12 x_1 x_2 + 20 x_2^2$
$2 \left( x_1 + 3 x_2 \right)^2 + 2 x_2^2$
So, we wanted squares, but we are also interested in what goes on outside the squares, i.e. the pivots (2 and 2 in our example)
Positive pivots give a sum of squares; everything positive means there is a minimum (everything goes up)
We can extend this to any n×n symmetric matrix
Let's look again at the matrix of second derivatives we had above
$\begin{bmatrix} \frac{\partial^2}{\partial x \partial x} & \frac{\partial^2}{\partial x \partial y} \\ \frac{\partial^2}{\partial y \partial x} & \frac{\partial^2}{\partial y \partial y} \end{bmatrix}$
$f_{xx}$ and $f_{yy}$ have to be positive (for a minimum) and their product has to be larger than the product of the other two, $f_{xy}$ and $f_{yx}$
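The link noted above between the pivots, the multiplier and the completed square can be verified symbolically. The sketch below is an illustration under the assumption that the symmetric matrix factors without row exchanges; the variable names are made up here.

```python
from sympy import Matrix, symbols, expand

x1, x2 = symbols('x1 x2')
A = Matrix([[2, 6], [6, 20]])

L, U, _ = A.LUdecomposition()
pivot_1, pivot_2 = U[0, 0], U[1, 1]    # the pivots sit on the diagonal of U
multiplier = L[1, 0]                   # the elimination multiplier sits in L

# Pivots go outside the squares, the multiplier goes inside the first square
completed_square = pivot_1 * (x1 + multiplier * x2) ** 2 + pivot_2 * x2 ** 2

x = Matrix([x1, x2])
energy = (x.T * A * x)[0, 0]
difference = expand(energy - completed_square)
print(pivot_1, pivot_2, multiplier, difference)
```

A zero difference confirms that $x^T A x$ and the completed square are the same polynomial.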
In [38]: function = 2 * x1 ** 2 + 12 * x1 * x2 + 20 * x2 ** 2
function
Out[38]: $2 x_1^2 + 12 x_1 x_2 + 20 x_2^2$
In [39]: fxx = diff(function, x1, 2)
fxy = diff(function, x1, x2)
fyx = diff(function, x2, x1)
fyy = diff(function, x2, 2)
In [40]: deriv_matr = Matrix([[fxx, fxy], [fyx, fyy]])
deriv_matr
Out[40]: $\begin{bmatrix} 4 & 12 \\ 12 & 40 \end{bmatrix}$
In [41]: deriv_matr.det()
Out[41]: 16
Setting the first partial derivatives equal to zero finds the extrema
The condition above sets everything positive (positive definite)
Let's then look at the following
In [42]: function = 2 * x1 ** 2 + 12 * x1 * x2 + 7 * x2 ** 2
function
Out[42]: $2 x_1^2 + 12 x_1 x_2 + 7 x_2^2$
In [43]: fxx = diff(function, x1, 2)
fxy = diff(function, x1, x2)
fyx = diff(function, x2, x1)
fyy = diff(function, x2, 2)
In [44]: deriv_matr = Matrix([[fxx, fxy], [fyx, fyy]])
deriv_matr
Out[44]: $\begin{bmatrix} 4 & 12 \\ 12 & 14 \end{bmatrix}$
In [45]: deriv_matr.det()
Out[45]: −88
Although the entries of the matrix of second derivatives are all positive, the determinant is negative
Not all conditions are met for the original matrix to be positive definite
Let's step this up to 3×3 symmetric matrices
In [46]: A = Matrix([[2, -1, 0], [-1, 2, -1], [0, -1, 2]])
A
Out[46]: $\begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}$
In [47]: A.transpose() == A # Test to see if A is symmetric
Out[47]: True
Is this symmetric matrix positive definite?
Let's start by looking at the determinant (and sub-determinants)
In [48]: A.det() # determinant of the whole matrix
Out[48]: 4
All the submatrices will be the following
$\begin{bmatrix} 2 \end{bmatrix}, \ \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}, \ \begin{bmatrix} 2 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 2 \end{bmatrix}$
Their determinants are all positive (2, 3, 4)
Let's look at the pivots
In [49]: L, U, _ = A.LUdecomposition()
L, U
Out[49]: $\left( \begin{bmatrix} 1 & 0 & 0 \\ -\frac{1}{2} & 1 & 0 \\ 0 & -\frac{2}{3} & 1 \end{bmatrix}, \ \begin{bmatrix} 2 & -1 & 0 \\ 0 & \frac{3}{2} & -1 \\ 0 & 0 & \frac{4}{3} \end{bmatrix} \right)$
The pivots are all positive
$2, \ \frac{3}{2}, \ \frac{4}{3}$
Notice how the first pivot was 2 (also the first sub-determinant)
The product of the first two pivots must equal the 2nd determinant (which is 3), so the second pivot must therefore be 3/2
The product of the first three (all) pivots must equal the 3rd determinant (which is 4), so the third pivot must therefore be 4/3
Let's look at the eigenvalues
In [50]: A.eigenvals()
Out[50]: $\left\{ 2 : 1, \ -\sqrt{2} + 2 : 1, \ \sqrt{2} + 2 : 1 \right\}$
Again, all positive
So far, so good
Just as a reminder, remember that the sum of the eigenvalues must equal the trace (sum of the entries on the main diagonal) and multiplying them must equal the determinant
Let's look at $x^T A x$
In [51]: x_vect = Matrix([x1, x2, x3])
x_vect
Out[51]: $\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$
In [52]: (x_vect.transpose() * A * x_vect).expand()
Out[52]: $\begin{bmatrix} 2 x_1^2 - 2 x_1 x_2 + 2 x_2^2 - 2 x_2 x_3 + 2 x_3^2 \end{bmatrix}$
In [53]: function = 2 * x1 ** 2 - 2 * x1 * x2 + 2 * x2 ** 2 - 2 * x2 * x3 + 2 * x3 ** 2
function # A quadratic form in three variables
Out[53]: $2 x_1^2 - 2 x_1 x_2 + 2 x_2^2 - 2 x_2 x_3 + 2 x_3^2$
We can construct this as follows
The main diagonal entries are the coefficients of the squared variables (2, 2, and 2)
There is a −1 and a −1 in the 12 and 21 row-column positions, whose sum is −2 and which then belongs to the $x_1 x_2$ (or $x_2 x_1$) term
The 13 and 31 entries are both zero, so there will be no $x_1 x_3$ coefficient
The 23 and 32 entries are both −1, so again, a coefficient of −2 for $x_2 x_3$
This matrix represents a plot in 4D space, so we can't draw it
We can construct it as the sum of three squares, though
The three squares will be made up of the three pivots (for their coefficients)
They (and obviously the squared values) are all positive and therefore we will only have f(x,y,z) values which are positive
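Both observations above, that each pivot is a ratio of consecutive sub-determinants and that the pivots are the coefficients of a sum of three squares, can be checked with SymPy. This is a sketch under the assumption that no row exchanges are needed, so that $A = L D L^T$ for this symmetric matrix; the variable names are made up for the illustration.

```python
from sympy import Matrix, Rational, symbols, expand

x1, x2, x3 = symbols('x1 x2 x3')
A = Matrix([[2, -1, 0], [-1, 2, -1], [0, -1, 2]])

# Sub-determinants of the upper-left 1x1, 2x2 and 3x3 blocks
minors = [A[:k, :k].det() for k in (1, 2, 3)]
pivots_from_minors = [minors[0], minors[1] / minors[0], minors[2] / minors[1]]

L, U, _ = A.LUdecomposition()
pivots_from_lu = [U[k, k] for k in range(3)]

# Sum of squares: each pivot multiplies the square of one component of L^T x
x = Matrix([x1, x2, x3])
y = L.T * x
sum_of_squares = sum(pivots_from_lu[k] * y[k] ** 2 for k in range(3))
energy = (x.T * A * x)[0, 0]
difference = expand(energy - sum_of_squares)
print(minors, pivots_from_lu, difference)
```

With all three pivots positive, the quadratic form is a sum of positively weighted squares, which is the positive definite case.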
Cutting through this 4D space (which is difficult to visualize) at a constant value, say f(x,y,z) = 1, will give an ellipsoid (lopsided football)
A sphere would have three equal eigenvalues
A football shape would have two identical eigenvalues and the third different
The lopsided ellipsoid would have all three eigenvalues different, as in this case
The half-lengths of the axes of these shapes are 1 over the square roots of the eigenvalues
Diagonalization, $Q \Lambda Q^T$, will give the principal axis theorem
In [54]: fxx = diff(function, x1, x1)
fxy = diff(function, x1, x2)
fxz = diff(function, x1, x3)
fyx = diff(function, x2, x1)
fyy = diff(function, x2, x2)
fyz = diff(function, x2, x3)
fzx = diff(function, x3, x1)
fzy = diff(function, x3, x2)
fzz = diff(function, x3, x3)
In [55]: deriv_matr = Matrix([[fxx, fxy, fxz], [fyx, fyy, fyz], [fzx, fzy, fzz]])
deriv_matr
Out[55]: $\begin{bmatrix} 4 & -2 & 0 \\ -2 & 4 & -2 \\ 0 & -2 & 4 \end{bmatrix}$
In [56]: deriv_matr.det()
Out[56]: 32
The determinant (and all sub-determinants) are positive
Example problems
Example problem 1
For which values of c will the following matrix be positive definite and positive semi-definite?
$\begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 + c \end{bmatrix}$
Solution
Let's try the determinant test first
In [57]: A = Matrix([[2, -1, -1], [-1, 2, -1], [-1, -1, 2 + c]])
A
Out[57]: $\begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & c + 2 \end{bmatrix}$
In [58]: A.det()
Out[58]: $3c$
All the sub-determinants are positive, being 2, 3 and then 3c for c > 0
Let's look at the pivot test
In [59]: L, U, _ = A.LUdecomposition()
L, U
Out[59]: $\left( \begin{bmatrix} 1 & 0 & 0 \\ -\frac{1}{2} & 1 & 0 \\ -\frac{1}{2} & -1 & 1 \end{bmatrix}, \ \begin{bmatrix} 2 & -1 & -1 \\ 0 & \frac{3}{2} & -\frac{3}{2} \\ 0 & 0 & c \end{bmatrix} \right)$
Again, all the pivots (in U) are positive for c > 0
So for positive definite we have c > 0 and for semi-definite we have c = 0
In [60]: A.eigenvals()
Out[60]: $\left\{ 3 : 1, \ \frac{c}{2} - \frac{1}{2} \sqrt{c^2 + 2c + 9} + \frac{3}{2} : 1, \ \frac{c}{2} + \frac{1}{2} \sqrt{c^2 + 2c + 9} + \frac{3}{2} : 1 \right\}$
The energy or completing the square test
In [61]: (x_vect.transpose() * A * x_vect).expand()
Out[61]: $\begin{bmatrix} c x_3^2 + 2 x_1^2 - 2 x_1 x_2 - 2 x_1 x_3 + 2 x_2^2 - 2 x_2 x_3 + 2 x_3^2 \end{bmatrix}$
For $x_3$ we have $\left( c + 2 \right) \left( x_3 \right)^2$
Remember, though, that for the squares of the x-components we must have the entries along the main diagonal of A as their coefficients; thus c + 2 = 2 and hence, again, c = 0
For interest's sake, we will have the following completed square equation
$2 \left( x - \frac{1}{2} y - \frac{1}{2} z \right)^2 + \frac{3}{2} \left( y - z \right)^2 + c z^2$
Now the coefficients come from the values along the diagonal of U
The −½ values come from the multipliers as seen in column 1 of L
The +1 and −1 for (y − z) come from column 2 of L
The +1 in front of z comes from column 3 of L (actually every set of () contains an x, y and z; some coefficients (from L) are just zero)
Be that as it may, for c > 0 the squared equation as it stands will only equal zero if x = y = z = 0
For a value of more than zero, c must be positive
Please note that by x, y and z I am referring to $x_1$, $x_2$ and $x_3$
If c = 0 we have the following matrix
In [62]: A = Matrix([[2, -1, -1], [-1, 2, -1], [-1, -1, 2]])
A
Out[62]: $\begin{bmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{bmatrix}$
In [63]: A.det()
Out[63]: 0
A singular matrix: with c = 0 the completed square is zero whenever x = y = z, not only at the origin, so the matrix is only positive semi-definite
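The conclusion of this example (positive definite for c > 0, semi-definite at c = 0) can be checked with a short SymPy sketch. It is an illustration only: the sample value c = 1 and the variable names are assumptions made here, not part of the original solution.

```python
from sympy import Matrix, symbols, zeros

c = symbols('c', real=True)
A = Matrix([[2, -1, -1], [-1, 2, -1], [-1, -1, 2 + c]])

# Determinants of the upper-left 1x1, 2x2 and 3x3 blocks
minors = [A[:k, :k].det() for k in (1, 2, 3)]

# A sample positive value of c makes every sub-determinant positive
minors_at_one = [minor.subs(c, 1) for minor in minors]

# At c = 0 the matrix is singular: some nonzero x gives A x = 0, so x^T A x = 0
A_zero = A.subs(c, 0)
null_vector = A_zero.nullspace()[0]
print(minors, minors_at_one, null_vector.T)
```

The nonzero vector in the nullspace at c = 0 is what rules out strict positive definiteness there.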
2015/03/29, 8:05 PMIII_04_Similar_matrices_Jordan_form
Page 1 of 5http://localhost:8888/nbconvert/html/III_04_Similar_matrices_Jordan_form.ipynb?download=false
This notebook is part of lecture 28, Similar matrices and Jordan form, in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper
Head of Acute Care Surgery
Groote Schuur Hospital
University of Cape Town
Email me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
Linear Algebra OCW MIT 18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)
[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June 2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org
In [1]: from IPython.core.display import HTML, Image
css_file = 'style.css'
HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, Matrix, symbols, eye
from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
filterwarnings('ignore')
Similar matrices
Positive definite matrices
Remember the following from positive definite matrices
$x^T A x > 0; \ x \neq 0$
These always refer to symmetric matrices
What do we know about their inverses?
We can't say anything about their pivots
The following is true for their eigenvalues, though
$\lambda_{A^{-1}} = \frac{1}{\lambda_A}$
The inverse is also positive definite
If both A and B are positive definite
We don't know the pivots of (A+B)
We don't know the eigenvalues of (A+B)
We could look at the following (which is true)
$x^T \left( A + B \right) x > 0$
From least squares, for the m×n matrix A (which is neither square nor symmetric; for this section, though, we assume the rank is n) we used $A^T A$, which is square and symmetric, but is it positive definite?
Analogous with real numbers, where we ask: is the square of any value positive?
Again, we don't know the pivots or eigenvalues
We do look at the following, which is always positive (which you can show by grouping some terms)
$x^T A^T A x$
$= \left( Ax \right)^T \left( Ax \right)$
$= {\left\| Ax \right\|}^2$
This last statement is just the squared length of Ax, which must be positive (or zero; only if x = 0)
Similar matrices
Consider two similar, square matrices A and B (no longer with the requirement that they are symmetric)
They have the same size, though
The similarity lies in the fact that there is some invertible matrix M for which the following holds
$B = M^{-1} A M$
Remember the creation of the diagonal matrix using the eigenvector matrix
$S^{-1} A S = \Lambda$
This says A is similar to Λ
Now we consider some (invertible) matrix M and create a matrix B from $M^{-1} A M$
We state that B is then similar to A (it is now part of some family of matrices of A, the neatest of which is the diagonal matrix Λ, created via the eigenvector matrix of A)
In [4]: A = Matrix([[2, 1], [1, 2]])
A
Out[4]: $\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$
In [5]: S, D = A.diagonalize() # S is the eigenvector matrix
In [6]: S.inv() * A * S # The matrix Lambda
Out[6]: $\begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix}$
Now let's invent a matrix M
In [7]: M = Matrix([[1, 4], [0, 1]])
M
Out[7]: $\begin{bmatrix} 1 & 4 \\ 0 & 1 \end{bmatrix}$
In [8]: B = M.inv() * A * M
A, B # Printing both to the screen
Out[8]: $\left( \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}, \ \begin{bmatrix} -2 & -15 \\ 1 & 6 \end{bmatrix} \right)$
What do A and B have in common?
They have the same eigenvalues
In [9]: A.eigenvals(), B.eigenvals() # The solution is in the form {eigenvalue: how many times that value occurs...}
Out[9]: $\left( \left\{ 1 : 1, \ 3 : 1 \right\}, \ \left\{ 1 : 1, \ 3 : 1 \right\} \right)$
All same-sized matrices with these same (distinct) eigenvalues are similar matrices
The most special member of this family is the diagonal matrix with the eigenvalues on the main diagonal
$A x = \lambda x$
$M^{-1} A x = \lambda M^{-1} x$
$\because M M^{-1} = I$
$M^{-1} A M M^{-1} x = \lambda M^{-1} x$
$\because B = M^{-1} A M$
$B M^{-1} x = \lambda M^{-1} x$
The eigenvectors are not the same though
In [10]: A.eigenvects(), B.eigenvects()
Out[10]: $\left( \left[ \left( 1, \ 1, \ \left[ \begin{bmatrix} -1 \\ 1 \end{bmatrix} \right] \right), \ \left( 3, \ 1, \ \left[ \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right] \right) \right], \ \left[ \left( 1, \ 1, \ \left[ \begin{bmatrix} -5 \\ 1 \end{bmatrix} \right] \right), \ \left( 3, \ 1, \ \left[ \begin{bmatrix} -3 \\ 1 \end{bmatrix} \right] \right) \right] \right)$
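The derivation above says that if x is an eigenvector of A, then $M^{-1} x$ is an eigenvector of B for the same eigenvalue. A minimal SymPy check using the same A and M follows; the loop and the `checks` list are illustrative additions.

```python
from sympy import Matrix

A = Matrix([[2, 1], [1, 2]])
M = Matrix([[1, 4], [0, 1]])
B = M.inv() * A * M

checks = []
transformed_vectors = []
for eigenvalue, multiplicity, vectors in A.eigenvects():
    for x in vectors:
        y = M.inv() * x                       # candidate eigenvector of B
        transformed_vectors.append(y)
        checks.append(B * y == eigenvalue * y)

print(checks, transformed_vectors)
```

The transformed vectors match the eigenvectors SymPy reports for B, up to the scaling it chooses.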
Remember that we have a problem when eigenvalues are repeated for a matrix
If this is so, we might not have a full set of eigenvectors and we cannot diagonalize
In [11]: A1 = Matrix([[4, 0], [0, 4]])
A2 = Matrix([[4, 1], [0, 4]])
In [12]: A1.eigenvals()
Out[12]: $\left\{ 4 : 2 \right\}$
In [13]: A2.eigenvals()
Out[13]: $\left\{ 4 : 2 \right\}$
Both matrices $A_1$ and $A_2$ have the same eigenvalue, namely 4, repeated twice
They are not similar, though
There is no matrix M to use with $A_1$ to produce $A_2$
Note that $A_1$ is 4 multiplied by the identity matrix of size 2
It is a small family, with only this member
$A_2$ is the neatest member of its much larger family
Diagonalizing it is not possible, though, because if it were, it would result in $A_1$, which is not in the same family, leaving $A_2$ as the neatest family member
The nicest (most diagonal) one is called the Jordan form of the family
Let's find more members of the family of $A_2$
The matrix $A_2$ is
$\begin{bmatrix} 4 & 1 \\ 0 & 4 \end{bmatrix}$
The trace is 8, so let's choose 5 and 3
$\begin{bmatrix} 5 & \\ & 3 \end{bmatrix}$
The determinant must remain 16, so let's choose 1 and −1
$\begin{bmatrix} 5 & 1 \\ -1 & 3 \end{bmatrix}$
In [14]: A3 = Matrix([[5, 1], [-1, 3]])
A1.eigenvals() == A3.eigenvals() # Check to see if the eigenvalues are the same
Out[14]: True
So we have to add the same number of independent eigenvectors to the definition of similar matrices
It's more than that, though
In [15]: A4 = Matrix([[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
A4
Out[15]: $\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$
In [16]: A4.eigenvals() # Four zeros
Out[16]: $\left\{ 0 : 4 \right\}$
In [17]: A4.eigenvects() # Rank of 2
Out[17]: $\left[ \left( 0, \ 4, \ \left[ \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \ \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} \right] \right) \right]$
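The eigenvector counts used in the comparisons above can be computed directly: the number of independent eigenvectors for an eigenvalue λ is the dimension of the nullspace of A − λI. The helper name `independent_eigenvectors` below is made up for this sketch.

```python
from sympy import Matrix, eye

def independent_eigenvectors(matrix, eigenvalue):
    """Dimension of the nullspace of (matrix - eigenvalue * I)."""
    shifted = matrix - eigenvalue * eye(matrix.rows)
    return len(shifted.nullspace())

A1 = Matrix([[4, 0], [0, 4]])
A2 = Matrix([[4, 1], [0, 4]])
A3 = Matrix([[5, 1], [-1, 3]])
A4 = Matrix([[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]])

counts = {
    'A1': independent_eigenvectors(A1, 4),
    'A2': independent_eigenvectors(A2, 4),
    'A3': independent_eigenvectors(A3, 4),
    'A4': independent_eigenvectors(A4, 0),
}
print(counts)
```

A1 has a full set of two eigenvectors while A2 and A3 have only one each, which is consistent with A2 and A3 sharing a family that A1 does not belong to.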
In [18]: A5 = Matrix([[0, 1, 7, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]])
A5
Out[18]: $\begin{bmatrix} 0 & 1 & 7 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$
In [19]: A5.eigenvals()
Out[19]: $\left\{ 0 : 4 \right\}$
In [20]: A5.eigenvects()
Out[20]: $\left[ \left( 0, \ 4, \ \left[ \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \ \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix} \right] \right) \right]$
In [21]: A6 = Matrix([[0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]])
A6
Out[21]: $\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}$
In [22]: A6.eigenvals()
Out[22]: $\left\{ 0 : 4 \right\}$
In [23]: A6.eigenvects()
Out[23]: $\left[ \left( 0, \ 4, \ \left[ \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \ \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix} \right] \right) \right]$
In [24]: A4.eigenvects() == A5.eigenvects()
Out[24]: True
In [25]: A4.eigenvects() == A6.eigenvects()
Out[25]: False
Jordan's theorem
Every square matrix A is similar to a Jordan matrix J
There is one eigenvector per block
The eigenvalues sit along the main diagonal
The matrices are not similar if the blocks are not of the same size
See problem 3 below where Jordan blocks are formed (they must actually both be broken down further into true Jordan blocks, which will show the blocks to be of unequal size; instead I keep them in non-Jordan form (not correct) and show a different number of pivots and thereby a different number of eigenvectors)
Example problems
Example problem 1
If A and B are similar matrices, why are the following similar?
$2 A^3 + A - 3I$
$2 B^3 + B - 3I$
Solution
There is some matrix M such that the following is true
$M A M^{-1} = B$
From this follows
$M \left( 2 A^3 + A - 3I \right) M^{-1}$
$= 2 \left( M A M^{-1} M A M^{-1} M A M^{-1} \right) + M A M^{-1} - 3 M I M^{-1}$
$= 2 B^3 + B - 3I$
I.e. if two matrices (A and B) are similar, any polynomial in A is similar to the same polynomial in B
Example problem 2
Are the two 3×3 matrices A and B, with eigenvalues 1, 0, −1, similar?
Solution
Yes, because the eigenvalues are distinct (and the matrices are therefore diagonalizable)
Example problem 3
Are these two matrices similar?
$J_1 = \begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & -1 \end{bmatrix}$
$J_2 = \begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix}$
Solution
No
In [26]: J1 = Matrix([[-1, 1, 0], [0, -1, 1], [0, 0, -1]])
J2 = Matrix([[-1, 1, 0], [0, -1, 0], [0, 0, -1]])
J1, J2
Out[26]: $\left( \begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & -1 \end{bmatrix}, \ \begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix} \right)$
Let's create Jordan blocks from these
In [27]: J1 + eye(3), J2 + eye(3)
Out[27]: $\left( \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}, \ \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \right)$
After this shift the Jordan blocks have zeros on the main diagonal and various forms of 1 just above the main diagonal
Note the difference between the Jordan blocks of $J_1$ and $J_2$
The first now contains two pivots and the second only 1; they will not have the same number of eigenvectors and cannot be similar
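Example problems 1 and 3 above can both be checked with a few lines of SymPy. In this sketch the pair A and M used for problem 1 is an arbitrary concrete choice (borrowed from earlier in these notes), and the helper name `polynomial` is made up; problem 3 uses the matrices exactly as given.

```python
from sympy import Matrix, eye

# Example problem 1: a concrete pair of similar matrices, B = M A M^-1
A = Matrix([[2, 1], [1, 2]])
M = Matrix([[1, 4], [0, 1]])
B = M * A * M.inv()

def polynomial(X):
    """The polynomial 2 X^3 + X - 3 I from the problem statement."""
    return 2 * X ** 3 + X - 3 * eye(X.rows)

polynomial_is_similar = M * polynomial(A) * M.inv() == polynomial(B)

# Example problem 3: count pivots and eigenvectors after shifting by the eigenvalue -1
J1 = Matrix([[-1, 1, 0], [0, -1, 1], [0, 0, -1]])
J2 = Matrix([[-1, 1, 0], [0, -1, 0], [0, 0, -1]])
ranks = ((J1 + eye(3)).rank(), (J2 + eye(3)).rank())
eigenvector_counts = (3 - ranks[0], 3 - ranks[1])

print(polynomial_is_similar, ranks, eigenvector_counts)
```

Since similar matrices must have the same number of independent eigenvectors, the differing counts confirm that $J_1$ and $J_2$ are not similar.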
2015/03/29, 8:08 PMIII_05_Singular_value_decomposition
Page 1 of 8http://localhost:8888/nbconvert/html/III_05_Singular_value_decomposition.ipynb?download=false
This notebook is part of lecture 29, Singular value decomposition, in the OCW MIT course 18.06 by Prof Gilbert Strang [1]
Created by me, Dr Juan H Klopper
Head of Acute Care Surgery
Groote Schuur Hospital
University of Cape Town
Email me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
Linear Algebra OCW MIT 18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)
[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June 2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org
In [1]: from IPython.core.display import HTML, Image
css_file = 'style.css'
HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, Matrix, symbols, sqrt, Rational
from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
filterwarnings('ignore')
This chapter starts with explanations using sympy
A proper method using numpy and sympy is described in the example problem at the end
Singular value decomposition (SVD)
Derivation
This is the final form of matrix factorization
The factors are an orthogonal matrix U, a diagonal matrix Σ, and an orthogonal matrix V
$A = U \Sigma V^T$
In case the matrix A is symmetric positive definite, the decomposition is akin to the following
$A = Q \Lambda Q^T$
Consider a vector $v_1$ in the $\mathbb{R}^n$ row space, transformed into a vector $u_1$ in the $\mathbb{R}^m$ column space by the matrix A
$u_1 = A v_1$
What we are looking for is an orthogonal basis in the $\mathbb{R}^n$ row space, transformed into an orthogonal basis in the $\mathbb{R}^m$ column space
$u_1 = A v_1$
$v_1 \perp v_2; \ u_1 \perp u_2$
It's easy to calculate an orthogonal basis in the row space using Gram-Schmidt
Now, though, we need something special in A that would ensure that the basis $u_i$ in the $\mathbb{R}^m$ column space is also orthogonal (and at the same time make it orthonormal, so that $A v_i$ ends up as $\sigma_i u_i$)
The two nullspaces are not required
So, we are looking for the following
$A \begin{bmatrix} \vdots & \vdots & \vdots & \vdots \\ v_1 & v_2 & \cdots & v_r \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix} = \begin{bmatrix} \vdots & \vdots & \vdots & \vdots \\ u_1 & u_2 & \cdots & u_r \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix} \begin{bmatrix} \sigma_1 & & & \\ & \sigma_2 & & \\ & & \ddots & \\ & & & \sigma_r \end{bmatrix}$ (with zeros everywhere off the diagonal)
$A V = U \Sigma$
In the case that we are not changing spaces, V and U would be the same matrix Q (and then $Q^{-1}$)
Example problem explaining the derivation
Look at the next matrix A that is square and invertible (i.e. rank 2)
In [4]: A = Matrix([[4, 4], [-3, 3]])
A
Out[4]: $\begin{bmatrix} 4 & 4 \\ -3 & 3 \end{bmatrix}$
We are looking for $v_1$ and $v_2$ in the $\mathbb{R}^2$ rowspace and $u_1$ and $u_2$ in the $\mathbb{R}^2$ columnspace, as well as the scaling factors $\sigma_1 > 0$ and $\sigma_2 > 0$
Just to be complete, we extend V up to $v_n$ and U up to $u_m$, as well as padding Σ with zeros, to include the nullspaces
Now A is not symmetric, so its eigenvectors are not orthogonal (Q), and we can't go that route
From above we have the following and because V is square and orthogonal we have
$A = U \Sigma V^{-1}$
$A = U \Sigma V^T$
Multiplying both sides by $A^T$ we will have a left-hand side that is square and positive (semi)definite
$A = U \Sigma V^T$
$A^T A = V \Sigma^T U^T U \Sigma V^T$
$\because U^T U = I$
$\because \Sigma^T \Sigma = \ldots \sigma_i^2 \ldots$
$A^T A = V \Sigma^T \Sigma V^T$
Because $A^T A$ is now positive (semi)definite, we have a perfect situation akin to being able to use $Q \Lambda Q^T$
The eigenvalues are the squares of the $\sigma_i$ values
To get U we use $A A^T$ and use its eigenvalues and eigenvectors
All of this is easy to accomplish with the mpmath submodule of sympy
In [5]: from sympy.mpmath import svd # From sympy 1.0 onwards mpmath is a separate package: from mpmath import svd
In [6]: U, S, V = svd(A)
In [7]: U # The numbers round to zero!!! Please see it as zero
Out[7]: $\begin{bmatrix} 1.0 & 3.33066907387547 \cdot 10^{-16} \\ 1.11022302462516 \cdot 10^{-16} & -1.0 \end{bmatrix}$
In [8]: S # Not the final Sigma matrix
Out[8]: matrix([['5.65685424949238'], ['4.24264068711928']])
In [9]: V
Out[9]: matrix([['0.707106781186547', '0.707106781186548'], ['0.707106781186548', '-0.707106781186547']])
There are square roots, so numerical values are given instead of symbols
Now let's do it step-by-step
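The recipe derived above (eigenvectors of $A^T A$ give V, the square roots of its eigenvalues give the $\sigma_i$, and $u_i = A v_i / \sigma_i$) can be previewed numerically. This sketch uses NumPy rather than the notebook's sympy/mpmath route, and it assumes A is invertible so that no $\sigma_i$ is zero.

```python
import numpy as np

A = np.array([[4.0, 4.0], [-3.0, 3.0]])

# Eigen-decomposition of the symmetric matrix A^T A (eigh returns ascending eigenvalues)
eigenvalues, V = np.linalg.eigh(A.T @ A)
order = np.argsort(eigenvalues)[::-1]          # largest singular value first
eigenvalues, V = eigenvalues[order], V[:, order]

sigmas = np.sqrt(eigenvalues)                  # singular values
U = (A @ V) / sigmas                           # column i is A v_i / sigma_i
reconstruction = U @ np.diag(sigmas) @ V.T

print(sigmas)
print(np.allclose(reconstruction, A))
```

The signs of the columns of U and V may differ from those returned by a library SVD routine; each $u_i$, $v_i$ pair can be flipped together without changing the product.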
In [10]: A.transpose() * A
In [11]: (A.transpose() * A).eigenvals()
In [12]: (A.transpose() * A).eigenvects()
These are not normalized, thoughAlso remember to take the square roots of the eigenvalues... and to add zeros to incorporate the correct size for m and n... and to take the transpose
Now let's tackle U
In [13]: A * A.transpose()
In [14]: (A * A.transpose()).eigenvals()
The eigenvalues are always the same
In [15]: (A * A.transpose()).eigenvects()
Also remember to normalize (see example problem below)
We now have U, Σ and V (although Σ must still be constructed; see below)
Example problem to explain the derivation for dependent rows, columns
Let's consider this rank=1, 2×2 singular matrixThe rowspace is just a line (the second row is a constant multiple of the first)The nullspace of this row picture is a line perpendicular to thisThe columnspace is also on a line, with the nullspace of A being a line perpendicular to this
In [16]: A = Matrix([[4, 3], [8, 6]])A
Let's use svd() first
In [17]: U, S, V = svd(A, full_matrices = True, compute_uv = True)
This is likely to be different to the value you calculate for U. We are talking unit basis vectors, though, which can be in a different direction depending on your choice.
In [18]: U
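Out[10]: $\begin{bmatrix} 25 & 7 \\ 7 & 25 \end{bmatrix}$
Out[11]: $\{18 : 1, \; 32 : 1\}$
Out[12]: $\left[ \left( 18, 1, \left[ \begin{bmatrix} -1 \\ 1 \end{bmatrix} \right] \right), \left( 32, 1, \left[ \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right] \right) \right]$
Out[13]: $\begin{bmatrix} 32 & 0 \\ 0 & 18 \end{bmatrix}$
Out[14]: $\{18 : 1, \; 32 : 1\}$
Out[15]: $\left[ \left( 18, 1, \left[ \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right] \right), \left( 32, 1, \left[ \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right] \right) \right]$
Out[16]: $\begin{bmatrix} 4 & 3 \\ 8 & 6 \end{bmatrix}$
Out[18]: $\begin{bmatrix} -0.447213595499958 & -0.894427190999916 \\ -0.894427190999916 & 0.447213595499958 \end{bmatrix}$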
In [19]: S
Note that the size of our Σ matrix is wrong. It has to be 2×2 and we have to create it from this info. Since A has rank = 1 and all off-diagonal entries must be zero, we will only have a value in the first row, first column position. Below I show you how to correct this.
In [20]: V
In [21]: A.transpose() * A # Which will be symmetric positive semi-definite and of rank = 1
In [22]: (A.transpose() * A).eigenvals() # One eigenvalue will be zero and the other must then be the trace
In [23]: (A.transpose() * A).eigenvects()
In [24]: A * A.transpose()
In [25]: (A * A.transpose()).eigenvals()
In [26]: (A * A.transpose()).eigenvects()
Inserted below are the three resultant matrices from our calculations above (normalized, etc)
In [27]: Matrix([[1 / sqrt(5), 2 / sqrt(5)], [2 / sqrt(5), 1 / sqrt(5)]]), Matrix([[sqrt(125), 0], [0, 0]]), Matrix([[0.8, 0.6], [0.6, -0.8]])
Now let me show you how to correct the svd() solutions
In [28]: U
In [29]: S
In [30]: S = Matrix([[11.1803398874989, 0], [0, 0]])
S # Composed by hand (proper method further below)
Out[19]: matrix([['11.1803398874989'], ['0.0']])
Out[20]: matrix([['-0.8', '-0.6'], ['-0.6', '0.8']])
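Out[21]: $\begin{bmatrix} 80 & 60 \\ 60 & 45 \end{bmatrix}$
Out[22]: $\{0 : 1, \; 125 : 1\}$
Out[23]: $\left[ \left( 0, 1, \left[ \begin{bmatrix} -\frac{3}{4} \\ 1 \end{bmatrix} \right] \right), \left( 125, 1, \left[ \begin{bmatrix} \frac{4}{3} \\ 1 \end{bmatrix} \right] \right) \right]$
Out[24]: $\begin{bmatrix} 25 & 50 \\ 50 & 100 \end{bmatrix}$
Out[25]: $\{0 : 1, \; 125 : 1\}$
Out[26]: $\left[ \left( 0, 1, \left[ \begin{bmatrix} -2 \\ 1 \end{bmatrix} \right] \right), \left( 125, 1, \left[ \begin{bmatrix} \frac{1}{2} \\ 1 \end{bmatrix} \right] \right) \right]$
Out[27]: $\left( \begin{bmatrix} \frac{\sqrt{5}}{5} & \frac{2 \sqrt{5}}{5} \\ \frac{2 \sqrt{5}}{5} & \frac{\sqrt{5}}{5} \end{bmatrix}, \begin{bmatrix} 5 \sqrt{5} & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0.8 & 0.6 \\ 0.6 & -0.8 \end{bmatrix} \right)$
Out[28]: $\begin{bmatrix} -0.447213595499958 & -0.894427190999916 \\ -0.894427190999916 & 0.447213595499958 \end{bmatrix}$
Out[29]: matrix([['11.1803398874989'], ['0.0']])
Out[30]: $\begin{bmatrix} 11.1803398874989 & 0 \\ 0 & 0 \end{bmatrix}$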
In [31]: V
In [32]: V = Matrix([[-0.8, -0.6], [-0.6, 0.8]])
V # Remember that this is actually V transpose
Let's calculate $U \Sigma V^T$
In [33]: U * S * V
Compensating for rounding, this is the original matrix A
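For a rank 1 matrix only the first singular value is non-zero, so the whole product collapses to $A = \sigma_1 u_1 v_1^T$. The sketch below checks this with numpy's `svd` (not the mpmath routine used above); the tolerance `1e-10` and the names `rank` and `A_rank1` are assumptions of mine.

```python
import numpy as np

# The rank 1 example matrix
A = np.array([[4.0, 3.0], [8.0, 6.0]])

# numpy returns U, a 1-D array of singular values, and V transpose
U, s, Vt = np.linalg.svd(A)

# Count the singular values that are not (numerically) zero
rank = int(np.sum(s > 1e-10))

# A = sigma_1 * u_1 * v_1^T; the sign choices for u_1 and v_1 cancel in the product
A_rank1 = s[0] * np.outer(U[:, 0], Vt[0, :])

print(rank)
print(A_rank1)
```

Because the second singular value is zero, the second columns of U and V never contribute to the product, which is why different sign or direction choices for them still reproduce A.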
Summary
The orthonormal basis for the rowspace is
The orthonormal basis for the columnspace is
The orthonormal basis for the nullspace is
The orthonormal basis for the nullspace of $A^T$ is
Example problem
Example problem 1
Find the singular value decomposition of the matrix
Solution
First off, I'll show you how to make proper use of numpy and scipy (as opposed to sympy) to solve singular value decomposition problems
In [34]: from numpy import matrix, transpose # Importing the matrix object and the transpose object from numerical python (numpy)
from numpy.linalg import svd, det # Importing the svd and determinant methods from the linalg submodule
from scipy.linalg import diagsvd
In [35]: type(transpose) # Type tells us what 'something' is (sometimes)
In [36]: C = matrix([[5, 5], [-1, 7]]) # Using the numpy matrix object
C
We can see from the determinant that the rows and columns are independent
Out[31]: matrix([['-0.8', '-0.6'], ['-0.6', '0.8']])
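Out[32]: $\begin{bmatrix} -0.8 & -0.6 \\ -0.6 & 0.8 \end{bmatrix}$
Out[33]: $\begin{bmatrix} 3.99999999999998 & 2.99999999999999 \\ 7.99999999999996 & 5.99999999999997 \end{bmatrix}$
$$v_1, v_2, \ldots, v_r$$
$$u_1, u_2, \ldots, u_r$$
$$v_{r+1}, v_{r+2}, \ldots, v_n$$
$$u_{r+1}, u_{r+2}, \ldots, u_m$$
$$\begin{bmatrix} 5 & 5 \\ -1 & 7 \end{bmatrix}$$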
Out[35]: function
Out[36]: matrix([[ 5, 5], [-1, 7]])
In [37]: det(C) # Notice the difference in syntax
Let's calculate V by looking at $A^TA$
In [38]: transpose(C) * C # Notice the difference in syntax
This is symmetric, positive definite. The eigenvalues must sum to the trace (here 26 + 74 = 100). Remember that the eigenvalues are the squares of the $\sigma_i$ values
Now let's put numpy and scipy to good use
In [39]: U, S, VT = svd(C) # I use the computer variable VT to remind us that this is the transpose of V
S will only indicate the singular values and must be converted to the correct sized matrix
In [40]: M, N = C.shape # Shape returns a tuple (two values), indicating row and column size
M, N
In [41]: Sig = diagsvd(S, M, N) # Creating a m times n matrix from S
Sig
In [42]: VT
Let's check if it worked!
In [43]: U * Sig * VT
Now, let's use good old sympy
In [44]: C = Matrix([[5, 5], [-1, 7]])C
We need to work with a positive (semi)definite matrix
In [45]: CTC = C.transpose() * C # Using the computer variable CTC to remind that it is C transpose times C
CTC, CTC.det()
Let's look at the eigenvalues
In [46]: CTC.eigenvals()
Out[37]: 40.0
Out[38]: matrix([[26, 18], [18, 74]])
Out[40]: (2, 2)
Out[41]: array([[ 8.94427191, 0. ], [ 0. , 4.47213595]])
Out[42]: matrix([[ 0.31622777, 0.9486833 ], [ 0.9486833 , -0.31622777]])
Out[43]: matrix([[ 5., 5.], [-1., 7.]])
Out[44]: $\begin{bmatrix} 5 & 5 \\ -1 & 7 \end{bmatrix}$
Out[45]: $\left( \begin{bmatrix} 26 & 18 \\ 18 & 74 \end{bmatrix}, 1600 \right)$
Out[46]: $\{20 : 1, \; 80 : 1\}$
Σ will contain along its main diagonal the square root of these eigenvalues
In [47]: Sig = Matrix([[sqrt(20), 0], [0, sqrt(80)]])
Sig
For V we require the eigenvectors of $C^TC$. We need to remember to normalize each vector (dividing each component by the length (norm) of that vector)
In [48]: CTC.eigenvects()
Let's normalize each v by calculating the length (norm) of each
In [49]: v1 = Matrix([-3, 1])
v1.norm()
In [50]: v2 = Matrix([Rational(1, 3), 1])
v2.norm()
We'll get each element of V by dividing by these norms
In [51]: -3 / v1.norm(), 1 / v1.norm()
In [52]: Rational(1, 3) / v2.norm(), 1 / v2.norm()
In [53]: V = Matrix([[-3 / sqrt(10), 1 / sqrt(10)], [1 / sqrt(10), 3 / sqrt(10)]]) # Just remember to put the elements of V in the correct place
V
Remember that we need the transpose of V. Here V is symmetric, so it is equal to its transpose
In [54]: V == V.transpose()
Now for U using $CC^T$
In [55]: CCT = C * C.transpose() # Using the computer variable CCT
CCT
The eigenvalues will be the same
In [56]: CCT.eigenvals()
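Out[47]: $\begin{bmatrix} 2 \sqrt{5} & 0 \\ 0 & 4 \sqrt{5} \end{bmatrix}$
Out[48]: $\left[ \left( 20, 1, \left[ \begin{bmatrix} -3 \\ 1 \end{bmatrix} \right] \right), \left( 80, 1, \left[ \begin{bmatrix} \frac{1}{3} \\ 1 \end{bmatrix} \right] \right) \right]$
Out[49]: $\sqrt{10}$
Out[50]: $\frac{\sqrt{10}}{3}$
Out[51]: $\left( -\frac{3 \sqrt{10}}{10}, \frac{\sqrt{10}}{10} \right)$
Out[52]: $\left( \frac{\sqrt{10}}{10}, \frac{3 \sqrt{10}}{10} \right)$
Out[53]: $\begin{bmatrix} -\frac{3 \sqrt{10}}{10} & \frac{\sqrt{10}}{10} \\ \frac{\sqrt{10}}{10} & \frac{3 \sqrt{10}}{10} \end{bmatrix}$
Out[54]: True
Out[55]: $\begin{bmatrix} 50 & 30 \\ 30 & 50 \end{bmatrix}$
Out[56]: $\{20 : 1, \; 80 : 1\}$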
In [57]: CCT.eigenvects()
In [58]: u1 = Matrix([-1, 1])
u2 = Matrix([1, 1])
In [59]: -1 / u1.norm(), 1 / u1.norm()
In [60]: 1 / u2.norm(), 1 / u2.norm()
In [61]: U = Matrix([[-sqrt(2) / 2, sqrt(2) / 2], [sqrt(2) / 2, sqrt(2) / 2]]) # Just remember to put the elements of U in the correct place
U
Let's see if it worked!
In [62]: U * Sig * V
In [ ]:
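Out[57]: $\left[ \left( 20, 1, \left[ \begin{bmatrix} -1 \\ 1 \end{bmatrix} \right] \right), \left( 80, 1, \left[ \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right] \right) \right]$
Out[59]: $\left( -\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2} \right)$
Out[60]: $\left( \frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2} \right)$
Out[61]: $\begin{bmatrix} -\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix}$
Out[62]: $\begin{bmatrix} 5 & 5 \\ -1 & 7 \end{bmatrix}$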
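The signs chosen for the columns of V force the signs of the columns of U through $Cv_i = \sigma_i u_i$. As a hedged cross-check of the hand-built factors, the sympy sketch below verifies that relation for both pairs; the names `residual1` and `residual2` are mine and are not part of the notebook.

```python
from sympy import Matrix, sqrt, simplify, zeros

C = Matrix([[5, 5], [-1, 7]])

# Normalized eigenvectors of C^T C (columns of V) and of C C^T (columns of U)
v1 = Matrix([-3, 1]) / sqrt(10)
v2 = Matrix([1, 3]) / sqrt(10)
u1 = Matrix([-1, 1]) / sqrt(2)
u2 = Matrix([1, 1]) / sqrt(2)

# Singular values: square roots of the eigenvalues 20 and 80
sigma1, sigma2 = sqrt(20), sqrt(80)

# Both residuals should simplify to the zero vector if the signs are consistent
residual1 = (C * v1 - sigma1 * u1).applyfunc(simplify)
residual2 = (C * v2 - sigma2 * u2).applyfunc(simplify)
print(residual1, residual2)
```

If one of the residuals were non-zero, flipping the sign of the corresponding column of U (or V) would be the fix.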
2015/03/29, 8:10 PMIII_06_Linear_transformations
Page 1 of 3http://localhost:8888/nbconvert/html/III_06_Linear_transformations.ipynb?download=false
This notebook is part of lecture 30 Linear transformation and their matrices in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper
Head of Acute Care SurgeryGroote Schuur HospitalUniversity Cape TownEmail me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
css_file = 'style.css'
HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, Matrix, symbols, sqrt, Rational
from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
filterwarnings('ignore')
Linear transformations
Mappings / transformations / projections
We'll begin the chapter with a familiar example: Projections
Let's look at a mapping / transformation / projection $T: \mathbb{R}^2 \rightarrow \mathbb{R}^2$. No axes are required for this mapping
Another example using matrices
For these to be a linear mapping, the following must apply
Examples that are not linear transformations:
Shifting the axes
The transformation that turns a vector into its length
Transformations involving powers or transcendental functions
Example
Let's create an example which accomplishes the following: $T: \mathbb{R}^3 \rightarrow \mathbb{R}^2$
Without axes we could look at something like this
Out[1]:
$$T(v) = Av$$
$$T(v + w) = T(v) + T(w)$$
$$T(cv) = cT(v)$$
$$f(x, y, z) = (x + y + z, 3x - 2y + z)$$
With axes we naturally turn to matrices. We require v in 3-space
To have a resultant vector in 2-space we require a matrix A of size 2×3
Notice if we know what a linear transformation does to a single vector, we know what it does to constant multiples of that vector (from the property T(cv)=cT(v)). In 2- or 3-space this would represent a line
If we knew what T does to two (linearly independent or basis) vectors, we know what it does to the subspace created by those two vectors (the whole plane in $\mathbb{R}^2$ or a plane in $\mathbb{R}^3$), i.e. linear combinations of these two vectors
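Knowing what T does to each basis vector is enough to write down its matrix. The sketch below builds the 2×3 matrix of the example map $f(x, y, z) = (x + y + z, 3x - 2y + z)$ column by column from the standard basis, and then checks the two linearity properties numerically. The test vectors `v`, `w` and the scalar `c` are arbitrary values chosen by me.

```python
import numpy as np

def f(v):
    """The example map from 3-space to 2-space."""
    x, y, z = v
    return np.array([x + y + z, 3 * x - 2 * y + z])

# Column j of A is the image of the j-th standard basis vector
A = np.column_stack([f(e) for e in np.eye(3)])

# Arbitrary test data (not from the notebook)
v = np.array([2.0, 3.0, -1.0])
w = np.array([1.0, 0.0, 4.0])
c = 2.5

print(A)
print(A @ v, f(v))                           # the matrix reproduces the map
print(np.allclose(f(v + w), f(v) + f(w)))    # T(v + w) = T(v) + T(w)
print(np.allclose(f(c * v), c * f(v)))       # T(cv) = cT(v)
```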
Coordinates / basis
Coordinates originate from a basis
This need not be the standard basis, though
Let's look then at the following transformation: $T: \mathbb{R}^n \rightarrow \mathbb{R}^m$. We need a basis for the input ($v_1, v_2, \ldots, v_n$ in $\mathbb{R}^n$) and the output ($w_1, w_2, \ldots, w_m$ in $\mathbb{R}^m$)
So what is the rule for finding A that will transform v into w? Rule for finding A given a basis for v and w:
The first column of A: apply $T(v_1) = a_{11} w_1 + a_{21} w_2 + \ldots + a_{m1} w_m$, with the first column entries of A being $a_{11}, a_{21}, \ldots, a_{m1}$
The second column of A: apply $T(v_2) = a_{12} w_1 + a_{22} w_2 + \ldots + a_{m2} w_m$, with the second column entries of A being $a_{12}, a_{22}, \ldots, a_{m2}$
... and so on until $v_n$
Example
Let's look at polynomials with the following transformation
The input might be
The output will be
This gives us the following basis for each
We need the following
Following the rule above we will have to have the following
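$$v = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$$
$$T(v) = Av$$
$$\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}_{2 \times 3} \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}_{3 \times 1} = \begin{bmatrix} a_{11} v_1 + a_{12} v_2 + a_{13} v_3 \\ a_{21} v_1 + a_{22} v_2 + a_{23} v_3 \end{bmatrix}_{2 \times 1}$$
$$v = c_1 v_1 + c_2 v_2 + \cdots + c_n v_n$$
$$v = \begin{bmatrix} 2 \\ 3 \\ -1 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + 3 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} - \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$
$$T = \frac{d}{dx}$$
$$c_1 + c_2 x + c_3 x^2$$
$$c_2 + 2 c_3 x$$
$$1, x, x^2$$
$$1, x$$
$$A \begin{bmatrix} c_1 \\ c_2 \\ c_3 \end{bmatrix} = \begin{bmatrix} c_2 \\ 2 c_3 \end{bmatrix}$$
$$A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}$$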
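As a quick check of the derivative matrix, the sketch below applies A to the coefficient vector of a sample quadratic and compares the result with numpy's own polynomial derivative. The coefficients in `c` are a made-up example, and `numpy.polynomial.polynomial.polyder` is used only as an independent reference.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Matrix of d/dx from the basis (1, x, x^2) to the basis (1, x)
A = np.array([[0, 1, 0],
              [0, 0, 2]])

# Hypothetical input polynomial 7 - 3x + 5x^2, stored as (c1, c2, c3)
c = np.array([7.0, -3.0, 5.0])

# Coefficients of the derivative in the output basis: (c2, 2*c3)
d = A @ c

print(d)
print(P.polyder(c))   # numpy's derivative of the same coefficient list
```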
Example problems
Example problem 1
Consider the 2×2 matrix A and let $T(A) = A^T$. Why is T linear and what is $T^{-1}$? Write the matrix of T in:
What are the eigenvalues and eigenvectors of T?
Solution
From the properties of linear transformation we have the following
A transpose turns a row into a column and vice versa, from which we infer the following
For the next question we will have to see what T does to each basis matrix
From this we have to form a matrix. Think of the columns each being $Tv_1$, $Tv_2$, $Tv_3$, and $Tv_4$. We see that transforming $v_1$ takes 1 $v_1$, and none of the rest
For the $w_i$ we will have the following matrix
For the last question we will have the following
In [ ]:
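$$v_1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, v_2 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, v_3 = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, v_4 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$$
$$w_1 = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, w_2 = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}, w_3 = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, w_4 = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$$
$$T(A + B) = (A + B)^T = A^T + B^T = T(A) + T(B)$$
$$T(cA) = (cA)^T = cA^T = cT(A)$$
$$T^2 = I$$
$$\therefore T^{-1} = T$$
$$T(v_1) = v_1$$
$$T(v_2) = v_3$$
$$T(v_3) = v_2$$
$$T(v_4) = v_4$$
$$M_T = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
$$M_T = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{bmatrix}$$
$$T(v) = \lambda v$$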
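The last question asks for the eigenvalues and eigenvectors of T. One possible way to explore this, sketched here with sympy, is to build the matrix of T in the v basis by transposing each basis matrix and reading off its coordinates, and then to ask for the eigenvalues. The helper `coords` and the list `basis_v` are names I introduce for illustration only.

```python
from sympy import Matrix, eye

# The basis v_1 .. v_4 of the 2x2 matrices
basis_v = [Matrix([[1, 0], [0, 0]]), Matrix([[0, 1], [0, 0]]),
           Matrix([[0, 0], [1, 0]]), Matrix([[0, 0], [0, 1]])]

def coords(m):
    """Coordinates of a 2x2 matrix in the v basis (row by row)."""
    return m.reshape(4, 1)

# Column j of M_T holds the coordinates of T(v_j) = v_j^T
M_T = Matrix.hstack(*[coords(v.T) for v in basis_v])

# Transposing twice is the identity, so T is its own inverse
print(M_T * M_T == eye(4))

# Eigenvalue 1 (three times) belongs to symmetric matrices,
# eigenvalue -1 (once) to the antisymmetric one
eigenvalues = M_T.eigenvals()
print(eigenvalues)
for lam, mult, vecs in M_T.eigenvects():
    for vec in vecs:
        print(lam, vec.reshape(2, 2))
```

Reshaping each eigenvector back into a 2×2 matrix shows that $T(A) = \lambda A$ means $A^T = A$ for λ = 1 and $A^T = -A$ for λ = -1, which matches the diagonal matrix obtained in the w basis.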
2015/03/29, 8:12 PMIII_07_Image_compression_Change_of_basis
Page 1 of 4http://localhost:8888/nbconvert/html/III_07_Image_compression_Change_of_basis.ipynb?download=false
This notebook is part of lecture 31 Change of basis and image compression in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper
Head of Acute Care SurgeryGroote Schuur HospitalUniversity Cape TownEmail me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
css_file = 'style.css'
HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, Matrix, symbols, sqrt, Rational
from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
filterwarnings('ignore')
Image compression and change of basis
Lossy image compression
Consider a $2^9 \times 2^9$ monochrome image. Every pixel in this 512×512 image can take a value of $0 \le x_i \le 255$ (this is 8-bit). This makes x a vector in $\mathbb{R}^n$, with $n = 512^2$ (for color images this would be 3×n)
In [4]: # Just look at what 512 square is
512 ** 2
This is a very large, unwieldy basis. Consider the standard basis
Consider now the better basis
Indeed, there are many options
Out[1]:
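Out[4]: 262144
$$\begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \cdots, \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}$$
$$\begin{bmatrix} 1 \\ 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ \vdots \\ 1 \\ -1 \\ \vdots \\ -1 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \\ 1 \\ -1 \\ \vdots \end{bmatrix}, \cdots$$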
JPEG uses an 8 × 8 Fourier basis. This means that an image is broken up into blocks of 8 × 8 = 64 pixels. See the lectures on the Fourier basis
This gives us a vector x in $\mathbb{R}^{64}$ (i.e. with 64 coefficients). Up until this point the compression is lossless. Now comes the compression (of which there are many, such as thresholding)
Thresholding: get rid of coefficients that fall outside set threshold values (typically the small ones), so that there are now fewer coefficients
Video is a sequence of images that are highly correlated (not big changes from one image to the next) and you can predict future changes from previous changes
There are newer bases such as wavelets. Here is an example
Every vector in $\mathbb{R}^8$ is a linear combination of these 8 basis vectors
Let's do some linear algebra. Consider only a top row of 8 pixels
The standard vector of the values will be as follows (with $0 \le p_i \le 255$)
We have to write this as a linear combination of the wavelet basis vectors $w_i$ (the lossless step)
In vector form we have the following
Let's bring some reality to this. For fast computation, W must be as easy to invert as possible. There is great competition to come up with better compression matrices. A good matrix must have the following
Be fast, i.e. the fast Fourier transform (FFT). The wavelet basis above is fast
The basis vectors are orthogonal (and can be made orthonormal). If they are orthonormal then the inverse is equal to the transpose
Good compression. If we threw away some of the $p_i$ values, we would just have a dark image. If we threw away, say, the last two $c_i$ values (last two basis vectors), that won't lose us so much quality
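$$\begin{bmatrix} 1 \\ 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ W \\ W^2 \\ \vdots \\ W^{n-1} \end{bmatrix}, \cdots$$
$$x = \sum c_i v_i$$
$$\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ -1 \\ -1 \\ -1 \\ -1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ -1 \\ -1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ -1 \\ -1 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \\ -1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \\ -1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 1 \\ -1 \end{bmatrix}$$
$$P = \begin{bmatrix} p_1 \\ p_2 \\ p_3 \\ p_4 \\ p_5 \\ p_6 \\ p_7 \\ p_8 \end{bmatrix}$$
$$P = c_1 w_1 + c_2 w_2 + \cdots + c_8 w_8$$
$$P = \begin{bmatrix} \vdots & \cdots & \vdots \\ w_1 & \cdots & w_8 \\ \vdots & \cdots & \vdots \end{bmatrix} \begin{bmatrix} c_1 \\ \vdots \\ c_8 \end{bmatrix}$$
$$P = Wc$$
$$c = W^{-1} P$$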
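The lossless step $c = W^{-1}P$ followed by thresholding can be sketched with numpy. Everything specific here is an assumption for illustration: the pixel row `p`, the threshold of 5, and the choice to drop coefficients of small magnitude. The fourth column is taken as (0, 0, 0, 0, 1, 1, -1, -1), which is what makes the eight columns mutually orthogonal.

```python
import numpy as np

# Columns are the eight wavelet basis vectors w_1 .. w_8
W = np.array([
    [1,  1,  1,  0,  1,  0,  0,  0],
    [1,  1,  1,  0, -1,  0,  0,  0],
    [1,  1, -1,  0,  0,  1,  0,  0],
    [1,  1, -1,  0,  0, -1,  0,  0],
    [1, -1,  0,  1,  0,  0,  1,  0],
    [1, -1,  0,  1,  0,  0, -1,  0],
    [1, -1,  0, -1,  0,  0,  0,  1],
    [1, -1,  0, -1,  0,  0,  0, -1],
], dtype=float)

# Orthogonal columns: W^T W is diagonal, so W is cheap to invert
gram = W.T @ W

# Hypothetical row of 8 pixel values (two nearly flat halves)
p = np.array([100.0, 102.0, 98.0, 101.0, 200.0, 198.0, 203.0, 199.0])

# Lossless change of basis: solve W c = p
c = np.linalg.solve(W, p)

# Lossy step: drop coefficients of small magnitude (threshold is an assumption)
threshold = 5.0
c_kept = np.where(np.abs(c) >= threshold, c, 0.0)
kept = int(np.count_nonzero(c_kept))

# Reconstruct the pixel row from the surviving coefficients
p_approx = W @ c_kept
print(kept, np.max(np.abs(p_approx - p)))
```

With this sample row only the two coarsest coefficients survive, yet every reconstructed pixel stays within a few grey levels of the original, which is the kind of trade-off the text describes.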
Change of basis
Let's look at this change in basis. Above, we had the following
Here W is the matrix that takes us from the vector x in the old basis to the vector c in the new basis
Consider any transformation T (such as a rotation transformation). With respect to $v_1, \ldots, v_8$ it has a matrix A. With respect to $w_1, \ldots, w_8$ it has a matrix B
Turns out that matrices A and B are similar
Here M is the matrix that transforms the basis
What is A then, using the basis $v_1, \ldots, v_8$? We know T completely from $T(v_i)$...
... because if every $x = \sum c_i v_i$
... then $T(x) = \sum c_i T(v_i)$
Constructing A: write down all the transformations
Now we know A
Let's consider the linear transformation $T(v_i) = \lambda_i v_i$. This makes A the following
Example problems
Example problem 1
The vector space of all polynomials in x (of degree ≤ 2) has the basis $1, x, x^2$. Consider a different basis $w_1, w_2, w_3$ whose values at x = -1, 0, and 1 are given by the following
Express y(x) = -x + 5 in the new basis
Find the change of basis matrices
Find the matrix of taking derivatives in both of the bases
Solution
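$$x = Wc$$
$$B = M^{-1} A M$$
$$T(v_1) = a_{11} v_1 + a_{21} v_2 + \cdots + a_{81} v_8$$
$$T(v_2) = a_{12} v_1 + a_{22} v_2 + \cdots + a_{82} v_8$$
$$\vdots$$
$$T(v_8) = a_{18} v_1 + a_{28} v_2 + \cdots + a_{88} v_8$$
$$A = \begin{bmatrix} a_{11} & \cdots & a_{18} \\ \vdots & \cdots & \vdots \\ a_{81} & \cdots & a_{88} \end{bmatrix}$$
$$A = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_8 \end{bmatrix}$$
$$x = -1 \rightarrow 1 w_1 + 0 w_2 + 0 w_3$$
$$x = 0 \rightarrow 0 w_1 + 1 w_2 + 0 w_3$$
$$x = 1 \rightarrow 0 w_1 + 0 w_2 + 1 w_3$$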
For the second part let's look at what happens at x for the various values at 1 (which is $x^0$), x, and $x^2$. For -1 we have 1, -1, and 1. For 0 we have 1, 0, and 0. For 1 we have 1, 1, and 1
From this we can conclude the following
Now we have the following matrix
This converts the first basis to the second. To convert the second basis to the original we just need $A^{-1}$
In [5]: A = Matrix([[1, -1, 1], [1, 0, 0], [1, 1, 1]])
A.inv()
Now for derivative matrices. For the original basis, this is easy
For the second basis we need the following
In [6]: Dx = Matrix([[0, 1, 0], [0, 0, 2], [0, 0, 0]])
Dw = A * Dx * A.inv()
Dw
Just to conclude we can write the values for w from the inverse of A (the columns)
In [ ]:
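$$y(x) = 5 - x$$
$$y(x) = \alpha w_1 + \beta w_2 + \gamma w_3$$
$$y(-1) = 6$$
$$y(0) = 5$$
$$y(1) = 4$$
$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \alpha \\ \beta \\ \gamma \end{bmatrix} = \begin{bmatrix} 6 \\ 5 \\ 4 \end{bmatrix}$$
$$\alpha = 6, \beta = 5, \gamma = 4$$
$$y = 6 w_1 + 5 w_2 + 4 w_3$$
$$1 = w_1 + w_2 + w_3$$
$$x = -w_1 + w_3$$
$$x^2 = w_1 + w_3$$
$$A = \begin{bmatrix} 1 & -1 & 1 \\ 1 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix}$$
Out[5]: $\begin{bmatrix} 0 & 1 & 0 \\ -\frac{1}{2} & 0 & \frac{1}{2} \\ \frac{1}{2} & -1 & \frac{1}{2} \end{bmatrix}$
$$D_x = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 0 \end{bmatrix}$$
$$D_w = A D_x A^{-1}$$
Out[6]: $\begin{bmatrix} -\frac{3}{2} & 2 & -\frac{1}{2} \\ -\frac{1}{2} & 0 & \frac{1}{2} \\ \frac{1}{2} & -2 & \frac{3}{2} \end{bmatrix}$
$$w_1 = -\frac{1}{2} x + \frac{1}{2} x^2$$
$$w_2 = 1 - x^2$$
$$w_3 = \frac{1}{2} x + \frac{1}{2} x^2$$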
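The results of this example can be cross-checked symbolically. The sketch below reads $w_1, w_2, w_3$ off the columns of $A^{-1}$, confirms their values at x = -1, 0, 1, rebuilds y, and compares the columns of $D_w$ with the sampled derivatives of the $w_i$. The names `points`, `values` and `dw_values` are mine and only serve the check.

```python
from sympy import Matrix, symbols, expand, diff, eye

x = symbols('x')
points = (-1, 0, 1)

A = Matrix([[1, -1, 1], [1, 0, 0], [1, 1, 1]])
A_inv = A.inv()

# Column j of A^-1 holds the coefficients of w_j in the basis 1, x, x^2
w = (Matrix([[1, x, x**2]]) * A_inv).applyfunc(expand)

# values[i, j] = w_j evaluated at points[i]; this should be the identity
values = Matrix(3, 3, lambda i, j: w[j].subs(x, points[i]))

# y = 6 w_1 + 5 w_2 + 4 w_3 should expand back to 5 - x
y = expand(6 * w[0] + 5 * w[1] + 4 * w[2])

# Derivative matrix in the w basis
Dx = Matrix([[0, 1, 0], [0, 0, 2], [0, 0, 0]])
Dw = A * Dx * A_inv

# Column j of Dw should list the values of w_j' at the three points
dw_values = Matrix(3, 3, lambda i, j: diff(w[j], x).subs(x, points[i]))

print(w)
print(y)
print(Dw == dw_values)
```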
2015/03/29, 8:18 PMIII_09_Quiz_review
Page 1 of 5http://localhost:8888/nbconvert/html/III_09_Quiz_review.ipynb?download=false
This notebook is part of Quiz review 3 in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper
Head of Acute Care SurgeryGroote Schuur HospitalUniversity Cape TownEmail me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
css_file = 'style.css'
HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, Matrix, symbols, sqrt, Rational
from numpy import matrix, transpose, sqrt, eye
from numpy.linalg import pinv, inv, det, svd, norm, eig
from scipy.linalg import pinv2
from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
filterwarnings('ignore')
Review of select topics
Quick facts
Eigenvalues: there are shortcuts that we can sometimes employ to calculate them
In symmetric matrices $A = A^T$. Their eigenvalues are always real. There are always enough eigenvectors and we can choose them to be orthogonal. They can be diagonalized and factorized as $Q \Lambda Q^T$
Similar matrices are any square matrices that are related by $A = M^{-1}BM$. They have the same eigenvalues (not eigenvectors). As one grows / decays so does the other: $A^k = M^{-1}B^kM$
Exercise problems
Differential equation matrix
Consider the following and solve for a general solution and solve for $e^{At}$
There are no initial conditions, so we need the general solution. It will be in the form
Out[1]:
$$\frac{du}{dt} = Au = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix} u$$
$$u(t) = c_1 e^{\lambda_1 t} x_1 + c_2 e^{\lambda_2 t} x_2 + c_3 e^{\lambda_3 t} x_3$$
In [4]: A = Matrix([[0, -1, 0], [1, 0, -1], [0, 1, 0]])
A, A.det()
It is clearly singular (dependent rows and columns)
In [5]: A.transpose()
It is skew-symmetric and therefore its eigenvalues are purely imaginary (including 0i)
In [6]: A.eigenvals()
The solution is thus as follows
The solution moves around the unit circle (doesn't grow / decay). It returns to the same value (it's periodic) after a certain time T
Finding u(t) allows the following
If A is diagonalizable ($A = S \Lambda S^{-1}$) then we have the following
Orthogonal eigenvectors
Which matrices have orthogonal eigenvectors?
The following:
Symmetric matrices
When $A^T = -A$ (skew-symmetric)
Orthogonal matrices
In general these are all when $AA^T = A^TA$
Definitions
Given the following
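Out[4]: $\left( \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix}, 0 \right)$
Out[5]: $\begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 1 \\ 0 & -1 & 0 \end{bmatrix}$
Out[6]: $\{0 : 1, \; -\sqrt{2} i : 1, \; \sqrt{2} i : 1\}$
$$u(t) = c_1 x_1 + c_2 e^{\sqrt{2} i t} x_2 + c_3 e^{-\sqrt{2} i t} x_3$$
$$\sqrt{2} i T = 2 \pi i; \quad e^0 = 1$$
$$T = \frac{2 \pi}{\sqrt{2}}$$
$$T = \pi \sqrt{2}$$
$$u(t) = e^{At} u(0)$$
$$e^{At} = S e^{\Lambda t} S^{-1}$$
$$e^{\Lambda t} = \begin{bmatrix} e^{\lambda_1 t} & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & e^{\lambda_n t} \end{bmatrix}$$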
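The period $T = \pi \sqrt{2}$ of the differential equation example can be checked numerically with the matrix exponential. This is a sketch only: `scipy.linalg.expm` stands in for the $S e^{\Lambda t} S^{-1}$ factorization, and the initial condition `u0` and the intermediate time 0.7 are arbitrary values of my own choosing.

```python
import numpy as np
from scipy.linalg import expm

# The skew-symmetric matrix of the differential equation example
A = np.array([[0.0, -1.0, 0.0],
              [1.0, 0.0, -1.0],
              [0.0, 1.0, 0.0]])

# Claimed period
T = np.pi * np.sqrt(2)

# Hypothetical initial condition
u0 = np.array([1.0, 2.0, 3.0])

# After one period the solution returns to where it started
u_T = expm(A * T) @ u0

# At any other time the length of u is unchanged (no growth or decay)
u_mid = expm(A * 0.7) @ u0

print(u_T)
print(np.linalg.norm(u0), np.linalg.norm(u_mid))
```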
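$$\lambda_1 = 0; \quad \lambda_2 = c; \quad \lambda_3 = 2$$
$$x_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, x_2 = \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}, x_3 = \begin{bmatrix} 1 \\ 1 \\ -2 \end{bmatrix}$$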
Is the matrix A diagonalizable and if so, for which value(s) of c? We need enough eigenvectors and they should be independent. They are indeed. More so, they are orthogonal. So, for all c the matrix is diagonalizable
Is A symmetric and if so, for which value(s) of c? The eigenvalues all have to be real values. Thus all real values for c
Is A positive definite and if so, for which values of c? This is a sub-case of symmetric matrices. There are a lot of tests for positive definite matrices. One of the eigenvalues is zero, so it can, at best, be positive semi-definite, for c ≥ 0
Is this a Markov matrix and if so, for which values of c? One of the eigenvalues must be 1 and the others must be smaller. Since $\lambda_3 = 2$, no
Could ½A be a projection matrix? They are symmetric and eigenvalues must be real. Any projection matrix eigenvalues must be 0 and 1
Thus c = 0 or c = 2 will work (for ½A we will have ½λ)
Singular value decomposition
In SVD we have the following
Where U and V are orthogonal matrices and Σ a diagonal matrix
Every matrix has a SVD
V is the eigenvector matrix for $A^TA$. Σ has along its main diagonal the square roots of the eigenvalues. U is similarly calculated as the eigenvector matrix of $AA^T$. There is always, though, a sign issue when choosing V and U
For whichever signs are chosen for V, this forces the signs for U which can be checked against the following
Σ can tell us a lot about A. All values must be ≥ 0. If it contains a 0 along the main diagonal, A is singular
Symmetric AND orthogonal matrices (matrices that are both)
$A = A^T = A^{-1}$
What can be said about the eigenvalues of these? Symmetric matrices have real eigenvalues and the orthogonal matrix eigenvalues must have length 1; ||λ|| = 1
Is A sure to be positive definite? No, as λ can be -1
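$$P = \frac{A}{2}$$
$$P^2 = P$$
$$\therefore \lambda^2 = \lambda$$
$$\therefore \lambda = 0; \quad \lambda = 1$$
$$A = U \Sigma V^T$$
$$A^TA = (V \Sigma^T U^T)(U \Sigma V^T)$$
$$A^TA = V (\Sigma^T \Sigma) V^T$$
$$Av_i = \sigma_i u_i$$
$$AV = U \Sigma$$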
Does it have repeated eigenvalues? Yes (if n ≥ 3, some eigenvalues must be repeated, since only 1 and -1 are available)
Is it diagonalizable? Most definitely
Is it non-singular? Yes (no zero eigenvalues)
Prove that the following is a projection matrix
Squaring it should result in the same
This begs the question, what is $A^2$? Well, if A equals its inverse, $A^2 = I$. As an aside, the eigenvalues of A + I are those of A shifted up by 1 (so 0 or 2), and halving them gives the 0 and 1 of a projection matrix
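A concrete matrix that is both symmetric and orthogonal makes this argument easy to test. The sketch below uses a reflection built as 2P − I from a projection onto a line, which is my own choice of example (the vector `a` is arbitrary), and checks that ½(A + I) squares to itself.

```python
import numpy as np

# Projection onto the line through a (an arbitrary choice of vector)
a = np.array([[3.0], [4.0]])
P = (a @ a.T) / (a.T @ a)

# A reflection is both symmetric and orthogonal: A = A^T = A^-1
A = 2 * P - np.eye(2)

# Candidate projection matrix from the text
half = 0.5 * (A + np.eye(2))

print(np.allclose(A, A.T), np.allclose(A @ A, np.eye(2)))
print(np.allclose(half @ half, half))   # a projection squares to itself
print(np.linalg.eigvalsh(A))            # eigenvalues of A
```

Here ½(A + I) turns out to be the original P again, so the eigenvalues -1 and 1 of A map to 0 and 1.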
Example problems
Example problem 1
Find the eigenvalues and eigenvectors of the following
1: The projection matrix P
2: The matrix Q
3: The matrix R
Solution
1:
In [7]: a = matrix([[3], [4]]) # Not using sympy
P = (a * transpose(a)) / (transpose(a) * a)
P
The eigenvalues of a projection matrix are either 0 or 1
In [8]: eig(P) # eig() gives the eigenvalues and eigenvector matrix
2:
In [9]: Q = matrix([[0.6, -0.8], [0.8, 0.6]])
Q
Note that Q is an orthogonal (rotation) matrix
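$$\frac{1}{2} (A + I)$$
$$\frac{1}{4} (A^2 + 2A + I)$$
$$P = \frac{aa^T}{a^Ta}; \quad a = \begin{bmatrix} 3 \\ 4 \end{bmatrix}$$
$$Q = \begin{bmatrix} 0.6 & -0.8 \\ 0.8 & 0.6 \end{bmatrix}$$
$$R = 2P - I$$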
Out[7]: matrix([[ 0.36, 0.48], [ 0.48, 0.64]])
Out[8]: (array([ 0., 1.]), matrix([[-0.8, -0.6], [ 0.6, -0.8]]))
Out[9]: matrix([[ 0.6, -0.8], [ 0.8, 0.6]])
In [10]: eig(Q) # eigenvalues come in complex conjugate pairs
3:
R will have the same eigenvectors, but (shifted) eigenvalues
In [11]: R = 2 * P - eye(2)
R
In [12]: eig(R)
The eigenvalues of P were 0 and 1
2 × 0 - 1 = -1
2 × 1 - 1 = 1
In [ ]:
Out[10]: (array([ 0.6+0.8j, 0.6-0.8j]), matrix([[ 0.70710678+0.j , 0.70710678-0.j ], [ 0.00000000-0.70710678j, 0.00000000+0.70710678j]]))
Out[11]: matrix([[-0.28, 0.96], [ 0.96, 0.28]])
Out[12]: (array([-1., 1.]), matrix([[-0.8, -0.6], [ 0.6, -0.8]]))
2015/03/29, 8:21 PMIII_10_Final_exam_review
Page 1 of 5http://localhost:8888/nbconvert/html/III_10_Final_exam_review.ipynb?download=false
This notebook is part of Final exam review in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper
Head of Acute Care SurgeryGroote Schuur HospitalUniversity Cape TownEmail me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
css_file = 'style.css'
HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, Matrix, symbols, sqrt, Rational
from sympy.solvers import solve
from numpy import matrix, transpose, sqrt, eye
from numpy.linalg import pinv, inv, det, svd, norm, eig
from scipy.linalg import pinv2
from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
filterwarnings('ignore')
Final examination review
Previous examination questions
Question 1
If A is an m × n matrix of rank r and the following holds
No solution
One solution
How many rows in this matrix? m = 3
What is the rank? If there are no solutions then r < m. If there is only a single solution then the nullspace has only the zero vector and so r = n
How many columns? For one solution (as above) r = n and with m = 3 and r < m we have r = n < 3
Out[1]:
$$Ax = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$$
$$Ax = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$$
Write down a matrix that fits the description above
True or False for the above
Determinant of $A^TA$ is the same as the determinant of $AA^T$
False
$A^TA$ is invertible
If r = n (independent columns of A) then TRUE
$AA^T$ is positive definite
False (it is going to be 3 × 3, but still with only rank 2)
In [4]: A = Matrix([[0, 0], [1, 0], [0, 1]])
A
In [5]: (A.transpose() * A).inv()
In [6]: (A.transpose() * A).det() == (A * A.transpose()).det()
In [7]: A * A.transpose()
Prove that $A^Ty = c$ has at least one solution for every c and in fact infinitely many solutions for every c. It has at least one solution because the new number of rows (n) is equal to r. The dimension of the nullspace of $A^T$ is m - r, which in our example here would be > 0, thus infinitely many solutions
Question 2
Suppose we have a matrix A with columns containing vectors $v_1$, $v_2$, and $v_3$
Solve $Ax = v_1 - v_2 + v_3$. This is simple multiplication by columns
Suppose $v_1 - v_2 + v_3 = 0$. Is the solution unique or are there more?
Uniqueness means nothing in the nullspace except the zero vector, so in this case the solutions are not unique
Suppose the columns are orthonormal (would be called $q_1$, $q_2$, $q_3$)
What combination of $v_1$ and $v_2$ is closest to $v_3$? Zero for each of $v_1$ and $v_2$
Question 3
Consider the Markov matrix
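$$A = \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}$$
Out[4]: $\begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{bmatrix}$
Out[5]: $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
Out[6]: False
Out[7]: $\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
$$x = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}$$
$$\begin{bmatrix} 0.2 & 0.4 & 0.3 \\ 0.4 & 0.2 & 0.3 \\ 0.4 & 0.4 & 0.4 \end{bmatrix}$$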
Calculate the eigenvalues. The matrix is singular (note how ½ of column 1 plus ½ of column 2 equals column 3) so one eigenvalue will be zero. Another must be 1. The trace adds to 0.8 and so must the sum of the eigenvalues, thus the last eigenvalue is -0.2
If for the following the u(0) vector is as indicated, what would the solution be after k steps?
The following will hold
So at ∞ the only term that survives is $c_2 x_2$. Indeed, the key eigenvalue in any Markov matrix is 1
Consider the eigenvector and calculate u at ∞. We already know that we have to use the λ = 1 eigenvalue. The distribution at ∞ will be as follows (see python code below)
In [8]: A = Matrix([[0.2, 0.4, 0.3], [0.4, 0.2, 0.3], [0.4, 0.4, 0.4]])
A
In [9]: A.eigenvects() # Looking for eigenvector of eigenvalue 1
# Have to distribute the totals into 10 (were 10 total initially)
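The steady state can also be obtained numerically. The sketch below, which uses numpy instead of the sympy call above, picks the eigenvector belonging to λ = 1, rescales it so that its entries add up to the initial total of 10, and compares it with a large power of A applied to u(0). The power 100 and the variable names are arbitrary choices of mine.

```python
import numpy as np

A = np.array([[0.2, 0.4, 0.3],
              [0.4, 0.2, 0.3],
              [0.4, 0.4, 0.4]])
u0 = np.array([0.0, 10.0, 0.0])

eigvals, eigvecs = np.linalg.eig(A)

# Pick the eigenvector whose eigenvalue is (numerically) 1
k = np.argmin(np.abs(eigvals - 1.0))
x = np.real(eigvecs[:, k])

# A Markov matrix preserves the total, so scale the eigenvector to sum to 10
u_inf = x * u0.sum() / x.sum()

# Compare with many steps of u_k = A^k u(0)
u_100 = np.linalg.matrix_power(A, 100) @ u0

print(np.sort(np.real(eigvals)))
print(u_inf)
print(u_100)
```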
Question 4
Calculate the projection onto the following line
The projection matrix is
In [10]: a = matrix([[4], [-3]]) # Using numpy
(a * transpose(a)) / (transpose(a) * a)
Consider the matrix with eigenvalues 0 and 3 and the following eigenvectors
We use the following decomposition
In [11]: S = matrix([[1, 2], [2, 1]])L = matrix([[0, 0], [0, 3]])S_inv = inv(S)
In [12]: A = S * L * S_invA
= ; u (0) =uk Ak⎡
⎣⎢⎢
0100
⎤
⎦⎥⎥
= ; u (0) = + +uk Ak c1 λk1x1 c2 λk
2x2 c3 λk3x3
= ; u (0) = 0 + (1) +uk Ak c2 x2 c3 (−0.2)kx32 2
u (∞) =⎡
⎣⎢⎢
334
⎤
⎦⎥⎥
Out[8]: ⎡
⎣⎢⎢
0.20.40.4
0.40.20.4
0.30.30.4
⎤
⎦⎥⎥
Out[9]: ⎡
⎣⎢⎢ ,
⎛
⎝⎜⎜ −0.2, 1,
⎡
⎣⎢⎢
⎡
⎣⎢⎢
−1.01.00
⎤
⎦⎥⎥
⎤
⎦⎥⎥
⎞
⎠⎟⎟ ,
⎛
⎝⎜⎜ 0, 1,
⎡
⎣⎢⎢
⎡
⎣⎢⎢
−0.5−0.51.0
⎤
⎦⎥⎥
⎤
⎦⎥⎥
⎞
⎠⎟⎟
⎛
⎝⎜⎜ 1.0, 1,
⎡
⎣⎢⎢
⎡
⎣⎢⎢
0.750.751.0
⎤
⎦⎥⎥
⎤
⎦⎥⎥
⎞
⎠⎟⎟
⎤
⎦⎥⎥
a = [ ]4−3
P = aaT
aaT
Out[10]: matrix([[ 0.64, -0.48], [-0.48, 0.36]])
0, [ ] 3, [ ]12
21
A = SΛS−1
Out[12]: matrix([[ 4., -2.], [ 2., -1.]])
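The two Question 4 calculations above can also be sketched with plain NumPy arrays and the `@` operator. This is an assumption on my part (the notebook uses the `matrix` class), and the extra checks that P is idempotent and that the rebuilt matrix has eigenvalues 0 and 3 are additions for illustration.

```python
import numpy as np

# Projection onto the line through a = (4, -3): P = a a^T / (a^T a)
a = np.array([[4.0], [-3.0]])
P = (a @ a.T) / (a.T @ a).item()

# Rebuild A from its eigenvalues (0 and 3) and eigenvectors (1, 2) and (2, 1)
S = np.array([[1.0, 2.0], [2.0, 1.0]])  # eigenvectors as columns
Lam = np.diag([0.0, 3.0])               # eigenvalues on the diagonal
A = S @ Lam @ np.linalg.inv(S)

print(P)
print(np.allclose(P @ P, P))   # projecting twice changes nothing
print(A)
print(np.linalg.eigvals(A))
```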
Give a 2 × 2 matrix A such that $A \neq B^T B$ for any B
$B^T B$ is always symmetric, so A can be any non-symmetric matrix
Give a matrix that has orthogonal eigenvectors, but is not symmetric
Any skew-symmetric matrix (transpose = negative of matrix)
$$\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$$
Any orthogonal matrix
$$\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$
Question 5
Consider the following system Ax=b, with the least squares solution shown, and calculate the projection of b onto the columnspace of A
$$Ax = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} c \\ d \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} \hat{c} \\ \hat{d} \end{bmatrix} = \begin{bmatrix} \frac{11}{3} \\ -1 \end{bmatrix}$$
The least squares solution is given, so simply multiply each column by its entry of the solution and add
$$\frac{11}{3} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} - 1 \begin{bmatrix} 0 \\ 1 \\ 2 \end{bmatrix}$$
Calculate a different vector b such that the least squares solution is zero
This requires b to be orthogonal to those columns, such as the following
$$\begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}$$
Question 6 (from recitation)
Consider the 3 × 3 matrix A, with $\lambda_1 = 1$ and $\lambda_2 = 2$ and the first two pivots $d_1 = d_2 = 1$
$$A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 0 \end{bmatrix}$$
Find $\lambda_3$ and $d_3$
The sum of the eigenvalues must equal the trace, thus $\lambda_3 = -1$
Constant multiples of a row subtracted from another won't change the determinant, leaving $d_1 \times d_2 \times d_3 = |A|$ (just watching out for singular matrices, which will have a zero on the main diagonal; here though we have three non-zero eigenvalues, so the matrix is non-singular), leaving $d_3 = -2$ (the product of the eigenvalues is also the determinant of A)
Calculate the smallest $a_{33}$ entry that will make A positive semi-definite
For positive semi-definite the eigenvalues must all be ≥ 0
The determinant must also be ≥ 0
In [13]: a33 = symbols('a33')
         A = Matrix([[1, 0, 1], [0, 1, 1], [1, 1, a33]])
         A
Out[13]: $\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & a_{33} \end{bmatrix}$
In [14]: A.det() # Thus a33 must be greater than or equal to 2
Out[14]: $a_{33} - 2$
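The determinant condition above can be cross-checked numerically. The sketch below is an illustration rather than part of the original notebook: the helper `min_eigenvalue` is a hypothetical name, and the test values 1.9, 2.0 and 2.1 are arbitrary choices on either side of the threshold.

```python
import numpy as np

def min_eigenvalue(a33):
    """Smallest eigenvalue of the symmetric Question 6 matrix with corner entry a33."""
    A = np.array([[1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0],
                  [1.0, 1.0, a33]])
    return np.linalg.eigvalsh(A).min()  # eigvalsh is meant for symmetric matrices

for a33 in (1.9, 2.0, 2.1):
    print(a33, min_eigenvalue(a33))
```

The smallest eigenvalue is negative just below 2, (numerically) zero at 2 and positive just above it, which agrees with $a_{33} \geq 2$.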
Calculate the smallest value of c such that the following is still positive semi-definite
$$A + cI$$
We can find the eigenvalues of A from the determinant of A − cI using sympy (see below, where the symbol c plays the role of λ), and then make use of the fact that adding a constant multiple of the identity matrix will only add that constant to each eigenvalue, leaving the eigenvectors intact
$$1 + c, \quad 2 + c, \quad -1 + c$$
Each must be ≥ 0, so the smallest value of c is 1
In [15]: c = symbols('c')
         A = Matrix([[1, 0, 1], [0, 1, 1], [1, 1, 0]])
         (A - c * eye(3))
Out[15]: $\begin{bmatrix} -1.0c + 1 & 0 & 1 \\ 0 & -1.0c + 1 & 1 \\ 1 & 1 & -1.0c \end{bmatrix}$
In [16]: (A - c * eye(3)).det() # The roots of this determinant (the characteristic
         # polynomial in c) are the eigenvalues of A
Out[16]: $-1.0c^3 + 2.0c^2 + 1.0c - 2$
In [17]: f = -c ** 3 + 2 * c ** 2 + c - 2
         f
Out[17]: $-c^3 + 2c^2 + c - 2$
In [18]: solve(f, c) # solve the equation f = 0 for the variable c
Out[18]: $[-1, 1, 2]$
Consider now one of the starting vectors $u_0$ below and with $u_{k+1} = \frac{1}{2} A u_k$ calculate the limiting behavior of $u_k$ as k approaches ∞
$$u_0 = \begin{bmatrix} 3 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 3 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 3 \end{bmatrix}$$
Notice that ½A is a Markov matrix
We cannot be sure that there will be a steady state as there are zero entries in ½A
Multiplying a matrix by a constant scalar will not change the eigenvectors, but will change the eigenvalues by the same scalar multiple and we will have $\lambda_1 = \frac{1}{2}$ and $\lambda_2 = 1$ and $\lambda_3 = -\frac{1}{2}$
We do have an eigenvalue of 1 and the other two eigenvalues are smaller than 1 in absolute value, so we will reach a steady-state
The eigenvector of $\lambda_2 = 1$ is the following
$$\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$
This already sums to 3, so will be $u_\infty$
In [ ]:
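The limiting behavior described above can be checked by simply iterating. This is a minimal sketch with NumPy arrays; the function name `iterate` and the choice of 60 steps are assumptions made for illustration.

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 0.0]])
M = 0.5 * A  # the Markov matrix (1/2)A; its columns sum to 1

def iterate(u0, steps=60):
    """Apply u_{k+1} = M u_k for a fixed number of steps."""
    u = np.asarray(u0, dtype=float)
    for _ in range(steps):
        u = M @ u
    return u

for u0 in ([3, 0, 0], [0, 3, 0], [0, 0, 3]):
    print(u0, iterate(u0))
```

Because the eigenvalues ½ and -½ shrink with every step, each of the three starting vectors should settle on the same vector, the eigenvector of eigenvalue 1 scaled to sum to 3.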
2015/03/29, 8:15 PMIII_08_Left_and_right_inverses_Pseudoinverses
Page 1 of 4http://localhost:8888/nbconvert/html/III_08_Left_and_right_inverses_Pseudoinverses.ipynb?download=false
This notebook is part of lecture 32 Left-, right-, and pseudoinverses in the OCW MIT course 18.06 by Prof Gilbert Strang [1]Created by me, Dr Juan H Klopper
Head of Acute Care SurgeryGroote Schuur HospitalUniversity Cape TownEmail me with your thoughts, comments, suggestions and corrections (mailto:[email protected])
(http://creativecommons.org/licenses/by-nc/4.0/)Linear Algebra OCW MIT18.06 IPython notebook [2] study notes by Dr Juan H Klopper is licensed under a Creative Commons Attribution-NonCommercial 4.0International License (http://creativecommons.org/licenses/by-nc/4.0/).
[1] OCW MIT 18.06 (http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/index.htm)[2] Fernando Pérez, Brian E. Granger, IPython: A System for Interactive Scientific Computing, Computing in Science and Engineering, vol. 9, no. 3, pp. 21-29, May/June2007, doi:10.1109/MCSE.2007.53. URL: http://ipython.org (http://ipython.org)
In [1]: from IPython.core.display import HTML, Image
        css_file = 'style.css'
        HTML(open(css_file, 'r').read())
In [2]: from sympy import init_printing, Matrix, symbols, sqrt, Rational
        from numpy import matrix, transpose, sqrt
        from numpy.linalg import pinv, inv, det, svd, norm
        from scipy.linalg import pinv2 # pinv2 was removed in newer SciPy releases
        from warnings import filterwarnings
In [3]: init_printing(use_latex = 'mathjax')
        filterwarnings('ignore')
Left- and right-sided inverses and pseudoinverses
The inverse
Recall the four fundamental subspaces
The rowspace (with x) and nullspace in $\mathbb{R}^n$
The columnspace (with Ax) and the nullspace of $A^T$ in $\mathbb{R}^m$
The two-sided inverse gives us the following
$$A A^{-1} = I = A^{-1} A$$
For this we need r = m = n (i.e. full rank)
For a left-inverse we have the following
Full column rank, with r = n (but possibly more rows)
The nullspace contains just the zero vector (columns are independent)
The rows might not all be independent
We thus have either no or only a single solution to Ax=b
$A^T A$ will now also have full rank
From $(A^T A)^{-1} A^T A = I$ follows the fact that $(A^T A)^{-1} A^T$ is a left-sided inverse ($A^{-1}$)
Note, though, that $(A^T A)^{-1} A^T$ is an n × m matrix and A is of size m × n, resulting in an n × n identity matrix
We cannot form $A A^{-1}$ with this left-inverse and get an identity matrix, though; instead the m × m product is a projection matrix (onto the columnspace)
For a right-inverse we have the following
Full row rank, with r = m < n
The nullspace of $A^T$ is the zero vector (rows are independent)
Elimination will result in many solutions to Ax=b (n - m free variables)
Now there will be an $A^{-1}$ to the right of A to give I
$A A^T (A A^T)^{-1} = I$
$A^{-1}$ is now $A^T (A A^T)^{-1}$
Putting the right-inverse on the left is also a projection (onto the rowspace)
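The left- and right-inverse formulas above can be tried on a small matrix. The following is a sketch under stated assumptions: the 3 × 2 matrix is a hypothetical example chosen only because its columns are independent, and the variable names are illustrative.

```python
import numpy as np

# Hypothetical 3 x 2 example with independent columns (full column rank, r = n = 2)
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])

left_inv = np.linalg.inv(A.T @ A) @ A.T   # (A^T A)^{-1} A^T, size n x m
print(left_inv @ A)                       # the n x n identity
P_col = A @ left_inv                      # m x m projection onto the columnspace

# The transpose has full row rank, so it has a right-inverse
B = A.T
right_inv = B.T @ np.linalg.inv(B @ B.T)  # B^T (B B^T)^{-1}
print(B @ right_inv)                      # identity, sized by the number of rows of B
P_row = right_inv @ B                     # projection onto the rowspace of B
```

In both cases the product taken in the "wrong" order is not the identity but a projection, which can be confirmed by checking that it equals its own square.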
The pseudoinverse
Consider a matrix where r is less than m and n
Remember that the rowspace is an r-dimensional subspace of $\mathbb{R}^n$ and the columnspace is an r-dimensional subspace of $\mathbb{R}^m$
The nullspace of A has dimension n - r and the nullspace of $A^T$ has dimension m - r
The rowspace and columnspace have the same dimension and every vector x in the rowspace maps to a different vector Ax in the columnspace (one-to-one)
If y is another vector in the rowspace (not the same as x) then Ax ≠ Ay
The pseudoinverse $A^+$, then, maps Ax (or Ay) in the columnspace back to x (or y) in the rowspace
$$y = A^+ A y$$
Suppose Ax = Ay or A(x-y) = 0
Now (x-y) is in the nullspace and in the rowspace, i.e. it has to be the zero vector
Finding the pseudoinverse $A^+$
One way is to start from the singular value decomposition
$$A = U \Sigma V^T$$
$\Sigma$ has along the main diagonal the r non-zero singular values (the square roots of the eigenvalues of $A^T A$), but has m rows and n columns, which can be more than r
$\Sigma^+$ will have 1 over the square roots of the eigenvalues along the main diagonal and then (possibly) zero values further along, but be of size n × m
$\Sigma \Sigma^+$ will have 1's along the main diagonal, and then 0's (if larger than r)
It will be of size m × m
It is a projection onto the columnspace
$\Sigma^+ \Sigma$ will also have 1's along the main diagonal, but be of size n × n
It is a projection onto the rowspace
We now have the following
$$A^+ = V \Sigma^+ U^T$$
Let's see how easy this is in python™
In [4]: A = matrix([[3, 6], [2, 4]]) # Not sympy
        A, det(A) # The det is zero, so no inverse exists
Out[4]: (matrix([[3, 6], [2, 4]]), 0.0)
In [5]: # The numpy pinv() function uses SVD
        Aplus = pinv(A)
        Aplus
Out[5]: matrix([[ 0.04615385, 0.03076923], [ 0.09230769, 0.06153846]])
In [6]: # The scipy pinv2() function also uses SVD
        # The scipy pinv() function uses least squares to approximate
        # the pseudoinverse and as matrices get BIG, this
        # becomes computationally expensive
        Aplus_sp = pinv2(A)
        Aplus_sp
Out[6]: array([[ 0.04615385, 0.03076923], [ 0.09230769, 0.06153846]])
Example problem
Example problem 1
Calculate the pseudoinverse of A=[1,2]
Calculate $AA^+$
Calculate $A^+A$
If x is in the nullspace of A what is the effect of $A^+A$ on x (i.e. $A^+Ax$)?
If x is in the columnspace of $A^T$ what is $A^+Ax$?
Solution
In [7]: A = matrix([1, 2])
        A
Out[7]: matrix([[1, 2]])
Let's use singular value decomposition
In [8]: U, S, VT = svd(A)
In [9]: U
Out[9]: matrix([[-1.]])
In [10]: S
Out[10]: array([ 2.23606798])
In [11]: VT
Out[11]: matrix([[-0.4472136 , -0.89442719], [-0.89442719, 0.4472136 ]])
Remember,
$$A^+ = V \Sigma^+ U^T$$
$\Sigma^+$ must be of size 2 × 1, though
In [12]: S = matrix([[sqrt(5)], [0]])
In [13]: Aplus = transpose(VT) * S * U
         Aplus
Out[13]: matrix([[ 1.], [ 2.]])
This is not yet the pseudoinverse: $\Sigma^+$ should hold the reciprocal of the singular value, $1/\sqrt{5}$, where $\sqrt{5}$ was entered above
Dividing by the norm ($\sqrt{5}$) once gives the unit vector along (1, 2); dividing by it a second time gives the pseudoinverse (0.2, 0.4), which is what pinv() returns further down
In [14]: norm(Aplus)
Out[14]: 2.2360679775
In [15]: 1 / norm(Aplus) * Aplus
Out[15]: matrix([[ 0.4472136 ], [ 0.89442719]])
In [16]: Aplus = pinv(A)
         Aplus
Out[16]: matrix([[ 0.2], [ 0.4]])
In [17]: A * Aplus
Out[17]: matrix([[ 1.]])
In [18]: Aplus * A
Out[18]: matrix([[ 0.2, 0.4], [ 0.4, 0.8]])
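The SVD route to the pseudoinverse used in this solution can be sketched as a small function. This is an illustration under assumptions, not the notebook's own code: `pinv_from_svd` and its `tol` cutoff are hypothetical names, and plain NumPy arrays are used in place of the `matrix` class.

```python
import numpy as np

def pinv_from_svd(A, tol=1e-12):
    """Pseudoinverse A+ = V Sigma+ U^T built from the singular value decomposition."""
    A = np.atleast_2d(np.asarray(A, dtype=float))
    U, s, VT = np.linalg.svd(A)        # A = U Sigma V^T, s holds the singular values
    m, n = A.shape
    Sigma_plus = np.zeros((n, m))      # Sigma+ has the transposed shape, n x m
    for i, sigma in enumerate(s):
        if sigma > tol:                # invert only the non-zero singular values
            Sigma_plus[i, i] = 1.0 / sigma
    return VT.T @ Sigma_plus @ U.T

A = np.array([[1.0, 2.0]])
Aplus = pinv_from_svd(A)
print(Aplus)       # compare with np.linalg.pinv(A)
print(A @ Aplus)
print(Aplus @ A)
```

The key design point is that $\Sigma^+$ carries the reciprocals of the non-zero singular values and zeros elsewhere, so no separate normalization step is needed.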
Let's create a vector in the nullspace of A
It will be any vector
$$c \begin{bmatrix} -2 \\ 1 \end{bmatrix}$$
Let's choose the constant c = 1
In [19]: x_vect_null_A = matrix([[-2], [1]])
         Aplus * A * x_vect_null_A
Out[19]: matrix([[ 0.], [ 0.]])
This is no surprise, as $A^+A$ projects a vector onto the rowspace of A
We chose x in the nullspace of A, so Ax must be 0 and $A^+Ax = 0$
The columnspace of $A^T$ is any vector
$$c \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$
We'll choose c = 1 again
In [20]: x_vect_null_AT = matrix([[1], [2]]) # a vector in the columnspace of A^T
         Aplus * A * x_vect_null_AT
Out[20]: matrix([[ 1.], [ 2.]])
We recover x again
For fun, let's just check what $A^+$ is when A is invertible
In [21]: A = matrix([[1, 2], [3, 4]])
In [22]: pinv(A)
Out[22]: matrix([[-2. , 1. ], [ 1.5, -0.5]])
In [23]: inv(A)
Out[23]: matrix([[-2. , 1. ], [ 1.5, -0.5]])
In [ ]:
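The three observations on this page (a nullspace vector is sent to zero, a rowspace vector is recovered, and the pseudoinverse equals the inverse for an invertible matrix) can be collected in one short sketch. It uses NumPy arrays and `np.linalg.pinv`; the variable names are illustrative assumptions.

```python
import numpy as np

A = np.array([[1.0, 2.0]])
Aplus = np.linalg.pinv(A)
P_row = Aplus @ A                       # projection onto the rowspace of A

x_null = np.array([[-2.0], [1.0]])      # in the nullspace of A
x_row = np.array([[1.0], [2.0]])        # in the rowspace of A (columnspace of A^T)
print(P_row @ x_null)                   # the zero vector
print(P_row @ x_row)                    # x_row is recovered

B = np.array([[1.0, 2.0], [3.0, 4.0]])  # invertible
print(np.allclose(np.linalg.pinv(B), np.linalg.inv(B)))
```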