Linear Algebra Review (with a Small Dose of Optimization)
Hristo Paskov, CS246
Outline
• Basic definitions
• Subspaces and Dimensionality
• Matrix functions: inverses and eigenvalue decompositions
• Convex optimization
Vectors and Matrices
• Vector $x \in \mathbb{R}^n$: a column of $n$ real entries, $x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}$
• May also write $x = (x_1, x_2, \dots, x_n)$
Vectors and Matrices
• Matrix $A \in \mathbb{R}^{m \times n}$: $m$ rows and $n$ columns of real entries
• Written in terms of rows or columns: $A = \begin{bmatrix} a_1^T \\ \vdots \\ a_m^T \end{bmatrix}$ (rows $a_i^T$) or $A = [\, a^1 \ \cdots \ a^n \,]$ (columns $a^j$)
Multiplication
• Vector-vector (inner product): $x^T y = \sum_{i=1}^n x_i y_i$
• Matrix-vector: $Ax$, where $(Ax)_i = a_i^T x$
Multiplication
• Matrix-matrix: for $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times k}$, $AB \in \mathbb{R}^{m \times k}$; e.g., a $5 \times 3$ matrix times a $3 \times 4$ matrix gives a $5 \times 4$ matrix
Multiplication
• Matrix-matrix entrywise: $(AB)_{ij} = a_i^T b^j$, rows of $A$ dotted with columns of $B$ (see the sketch below)
Multiplication Properties
• Associative: $(AB)C = A(BC)$
• Distributive: $A(B + C) = AB + AC$
• NOT commutative: in general $AB \neq BA$
  – Dimensions may not even be conformable: if $A \in \mathbb{R}^{m \times n}$ and $B \in \mathbb{R}^{n \times k}$ with $k \neq m$, then $BA$ is undefined
Useful Matrices
• Identity matrix $I$: ones on the diagonal, zeros elsewhere; $AI = IA = A$
• Diagonal matrix $D = \mathrm{diag}(d_1, \dots, d_n)$: entries $d_i$ on the diagonal, zeros elsewhere
Useful Matrices
• Symmetric: $A = A^T$
• Orthogonal $Q$: $Q^T Q = Q Q^T = I$, i.e., $Q^{-1} = Q^T$
  – Columns/rows are orthonormal
• Positive semidefinite $A$: $x^T A x \geq 0$ for all $x$
  – Equivalently, there exists $B$ such that $A = B^T B$ (checked numerically below)
Outline
• Basic definitions
• Subspaces and Dimensionality
• Matrix functions: inverses and eigenvalue decompositions
• Convex optimization
Norms
• Quantify the “size” of a vector
• Given $x, y \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$, a norm $\|\cdot\|$ satisfies
  1. $\|x\| = 0$ if and only if $x = 0$
  2. $\|\alpha x\| = |\alpha| \, \|x\|$
  3. $\|x + y\| \leq \|x\| + \|y\|$ (triangle inequality)
• Common norms (computed in the sketch below):
  1. Euclidean $\ell_2$-norm: $\|x\|_2 = \sqrt{\sum_i x_i^2}$
  2. $\ell_1$-norm: $\|x\|_1 = \sum_i |x_i|$
  3. $\ell_\infty$-norm: $\|x\|_\infty = \max_i |x_i|$
Linear Subspaces
• A subspace $S \subseteq \mathbb{R}^n$ satisfies
  1. $0 \in S$
  2. If $x, y \in S$ and $\alpha, \beta \in \mathbb{R}$, then $\alpha x + \beta y \in S$
• Vectors $v_1, \dots, v_k$ span $S$ if every $x \in S$ can be written as $x = \sum_i \alpha_i v_i$
Linear Independence and Dimension
• Vectors $v_1, \dots, v_k$ are linearly independent if $\sum_i \alpha_i v_i = 0$ implies $\alpha_i = 0$ for all $i$
  – Every linear combination of the $v_i$ is unique
• $\dim(S) = k$ if $v_1, \dots, v_k$ span $S$ and are linearly independent
  – If $u_1, \dots, u_m$ span $S$ then $m \geq k$
• If $m > \dim(S)$ then $u_1, \dots, u_m$ are NOT linearly independent
Matrix Subspaces
• A matrix $A \in \mathbb{R}^{m \times n}$ defines two subspaces
  – Column space: $\mathrm{col}(A) = \{Ax : x \in \mathbb{R}^n\} \subseteq \mathbb{R}^m$
  – Row space: $\mathrm{row}(A) = \{A^T y : y \in \mathbb{R}^m\} \subseteq \mathbb{R}^n$
• Nullspace of $A$: $\mathrm{null}(A) = \{x : Ax = 0\}$, the vectors orthogonal to $\mathrm{row}(A)$
  – Analog for the column space: the left nullspace $\{y : A^T y = 0\}$
Matrix Rank
• $\mathrm{rank}(A)$ gives the (shared) dimensionality of the row and column spaces
• If $A \in \mathbb{R}^{m \times n}$ has rank $k$, can decompose $A$ into a product of an $m \times k$ and a $k \times n$ matrix
Properties of Rank
• For $A \in \mathbb{R}^{m \times n}$: $\mathrm{rank}(A) \leq \min(m, n)$ and $\mathrm{rank}(A) = \mathrm{rank}(A^T)$
• $A$ has full rank if $\mathrm{rank}(A) = \min(m, n)$
• If $\mathrm{rank}(A) < m$, the rows are not linearly independent
  – Same for the columns if $\mathrm{rank}(A) < n$
Outline
• Basic definitions
• Subspaces and Dimensionality
• Matrix functions: inverses and eigenvalue decompositions
• Convex optimization
Matrix Inverse
• $A \in \mathbb{R}^{n \times n}$ is invertible iff $\mathrm{rank}(A) = n$
• The inverse $A^{-1}$ is unique and satisfies
  1. $A^{-1} A = A A^{-1} = I$
  2. $(A^{-1})^{-1} = A$
  3. $(A^T)^{-1} = (A^{-1})^T$
  4. If $B$ is invertible then $AB$ is invertible and $(AB)^{-1} = B^{-1} A^{-1}$
Systems of Equations
• Given $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$, wish to solve $Ax = b$
  – A solution exists only if $b \in \mathrm{col}(A)$
• Possibly infinite number of solutions
• If $A$ is invertible then $x = A^{-1} b$
  – Notational device; do not actually invert matrices
  – Computationally, use solving routines like Gaussian elimination (see the sketch below)
Systems of Equations
• What if $b \notin \mathrm{col}(A)$?
• Find $x$ whose image $Ax$ is closest to $b$
  – $Ax$ is the projection of $b$ onto $\mathrm{col}(A)$
  – Also known as (least-squares) regression
• Assume $A$ has linearly independent columns: $x = (A^T A)^{-1} A^T b$
  – $A^T A$ is invertible, and $A (A^T A)^{-1} A^T$ is the projection matrix onto $\mathrm{col}(A)$
Systems of Equations
Worked examples:
$\begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} 0.5 \\ -2.5 \end{bmatrix} = \begin{bmatrix} 0.5 \\ -1.5 \end{bmatrix} \qquad \begin{bmatrix} -1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -1 \\ -0.5 \end{bmatrix} = \begin{bmatrix} 0.5 \\ -1.5 \end{bmatrix}$
Eigenvalue Decomposition
• Eigenvalue decomposition of symmetric $A$ is $A = Q \Lambda Q^T$
  – $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$ contains the eigenvalues of $A$
  – $Q$ is orthogonal and its columns are the eigenvectors of $A$
• If $A$ is not symmetric but diagonalizable, $A = P \Lambda P^{-1}$
  – $\Lambda$ is diagonal but possibly complex
  – $P$ not necessarily orthogonal
Characterizations of Eigenvalues
• Traditional formulation: $Ax = \lambda x$ with $x \neq 0$
  – Leads to the characteristic polynomial $\det(A - \lambda I) = 0$
• Rayleigh quotient (symmetric $A$): $\lambda_{\max}(A) = \max_{x \neq 0} \dfrac{x^T A x}{x^T x}$
Eigenvalue Properties
• For $A \in \mathbb{R}^{n \times n}$ with eigenvalues $\lambda_1, \dots, \lambda_n$:
  1. $\mathrm{tr}(A) = \sum_i \lambda_i$
  2. $\det(A) = \prod_i \lambda_i$
  3. $\mathrm{rank}(A)$ = number of nonzero eigenvalues
• When $A$ is symmetric
  – The eigenvalue decomposition coincides with the singular value decomposition (up to signs)
  – Eigenvectors for nonzero eigenvalues give an orthogonal basis for $\mathrm{row}(A) = \mathrm{col}(A)$
Simple Eigenvalue Proof
• Why is $\det(A - \lambda I) = 0$ when $\lambda$ is an eigenvalue?
• Assume $A$ is symmetric and full rank
  1. $A = Q \Lambda Q^T$
  2. $A - \lambda I = Q \Lambda Q^T - \lambda Q Q^T = Q (\Lambda - \lambda I) Q^T$ (using $Q Q^T = I$)
  3. If $\lambda$ is an eigenvalue of $A$, an eigenvalue of $\Lambda - \lambda I$ is 0
  4. Since $\det$ is the product of eigenvalues, one of the terms is 0, so the product is 0
Outline
• Basic definitions
• Subspaces and Dimensionality
• Matrix functions: inverses and eigenvalue decompositions
• Convex optimization
Convex Optimization
• Find the minimum of a function subject to solution constraints
• Business/economics/game theory
  – Resource allocation
  – Optimal planning and strategies
• Statistics and Machine Learning
  – All forms of regression and classification
  – Unsupervised learning
• Control theory
  – Keeping planes in the air!
Convex Sets
• A set $C$ is convex if $x, y \in C$ and $\theta \in [0, 1]$ imply $\theta x + (1 - \theta) y \in C$
  – Line segment between points in $C$ also lies in $C$
• Examples
  – Intersection of halfspaces
  – $\ell_p$ balls
  – Intersection of convex sets
Convex Functions
• A real-valued function $f$ is convex if $\mathrm{dom}(f)$ is convex and for all $x, y \in \mathrm{dom}(f)$ and $\theta \in [0, 1]$, $f(\theta x + (1 - \theta) y) \leq \theta f(x) + (1 - \theta) f(y)$
  – Graph of $f$ upper bounded by the line segment between $(x, f(x))$ and $(y, f(y))$ on the graph
Gradients
• Differentiable convex $f$ with gradient $\nabla f(x) = \left( \frac{\partial f}{\partial x_1}, \dots, \frac{\partial f}{\partial x_n} \right)$
• Gradient at $x$ gives the linear approximation $f(x + w) \approx f(x) + w^T \nabla f(x)$, which for convex $f$ lower bounds the graph of $f$
Gradient Descent
• To minimize $f$, move down the gradient
  – But not too far!
  – Optimum when $\nabla f(x) = 0$
• Given $f$, learning rate $\eta > 0$, starting point $x_0$: set $x \leftarrow x_0$, then
  do until $\nabla f(x) \approx 0$: $\quad x \leftarrow x - \eta \nabla f(x)$ (runnable sketch below)
Stochastic Gradient Descent
• Many learning problems have extra structure: the objective to minimize is a sum over training examples, $f(\theta) = \sum_{i=1}^N f_i(\theta)$, in the learning parameter $\theta$
• Computing the full gradient requires iterating over all $N$ points, which can be too costly
• Instead, compute the gradient at a single training example
Stochastic Gradient Descent
• Given $f = \sum_i f_i$, learning rate $\eta$, starting point $\theta_0$: set $\theta \leftarrow \theta_0$, then
  do until nearly optimal:
    for $i$ in random order: $\quad \theta \leftarrow \theta - \eta \nabla f_i(\theta)$
• Finds a nearly optimal $\theta$ (sketch below)