Lecture 17
Factor Analysis
Syllabus
Lecture 01 Describing Inverse Problems
Lecture 02 Probability and Measurement Error, Part 1
Lecture 03 Probability and Measurement Error, Part 2
Lecture 04 The L2 Norm and Simple Least Squares
Lecture 05 A Priori Information and Weighted Least Squares
Lecture 06 Resolution and Generalized Inverses
Lecture 07 Backus-Gilbert Inverse and the Trade Off of Resolution and Variance
Lecture 08 The Principle of Maximum Likelihood
Lecture 09 Inexact Theories
Lecture 10 Nonuniqueness and Localized Averages
Lecture 11 Vector Spaces and Singular Value Decomposition
Lecture 12 Equality and Inequality Constraints
Lecture 13 L1, L∞ Norm Problems and Linear Programming
Lecture 14 Nonlinear Problems: Grid and Monte Carlo Searches
Lecture 15 Nonlinear Problems: Newton’s Method
Lecture 16 Nonlinear Problems: Simulated Annealing and Bootstrap Confidence Intervals
Lecture 17 Factor Analysis
Lecture 18 Varimax Factors, Empirical Orthogonal Functions
Lecture 19 Backus-Gilbert Theory for Continuous Problems; Radon’s Problem
Lecture 20 Linear Operators and Their Adjoints
Lecture 21 Fréchet Derivatives
Lecture 22 Exemplary Inverse Problems, incl. Filter Design
Lecture 23 Exemplary Inverse Problems, incl. Earthquake Location
Lecture 24 Exemplary Inverse Problems, incl. Vibrational Problems
Purpose of the Lecture
Introduce Factor Analysis
Work through an example
Part 1
Factor Analysis
[Figure: ocean sediment supplied by source A and source B, with samples s1, s2, s3, s4]
sample matrix S
S arranged row-wise (but we’ll use a column vector s(i) for individual samples)
theory
samples are a linear mixture of sources
S = C F
samples contain “elements”
sources are called “factors”
factors contain “elements”
factor matrix F
F arranged row-wise (but we’ll use a column vector f(i) for individual factors)
coefficients are called “loadings”
loading matrix C
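The forward model S = CF can be sketched in MatLab as follows. All numbers here are hypothetical and chosen only for illustration (p = 2 factors, M = 3 elements, N = 4 samples); they are not taken from the lecture.

```matlab
% Hypothetical forward model: samples as linear mixtures of factors.
F = [0.8 0.1 0.1;    % factor 1: fractions of elements E1, E2, E3
     0.2 0.3 0.5];   % factor 2
C = [1.0 0.0;        % loadings: sample 1 is pure factor 1
     0.7 0.3;        % sample 2 is 70% factor 1, 30% factor 2
     0.4 0.6;
     0.0 1.0];       % sample 4 is pure factor 2
S = C*F;             % N-by-M sample matrix, one sample per row
disp(S);
```

Each row of S is the loading-weighted sum of the rows of F, which is what "arranged row-wise" means in practice.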
inverse problem
given S, find C and F
so that S = CF
very non-unique
given any T with inverse T⁻¹
if S = CF then S = [C T⁻¹][T F] = C′F′
so a priori information needed to select a solution
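A minimal numerical check of this non-uniqueness, assuming hypothetical values for C, F and T (none of them come from the lecture):

```matlab
% Hypothetical C and F (N = 4 samples, p = 2 factors, M = 3 elements).
F = [0.8 0.1 0.1; 0.2 0.3 0.5];
C = [1.0 0.0; 0.7 0.3; 0.4 0.6; 0.0 1.0];
S = C*F;
T  = [2 1; 1 1];     % any invertible p-by-p matrix (det = 1 here)
Cp = C/T;            % C' = C*inv(T), computed without forming inv(T)
Fp = T*F;            % F' = T*F
err = max(max(abs(S - Cp*Fp)));
fprintf('max |S - C''F''| = %g\n', err);
```

The pair (C′, F′) differs from (C, F) yet reproduces S to rounding error, so the data alone cannot distinguish between them.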
simplicity
what is the minimum number of factors needed
call that number p
does S span the full space of M elements?
or just a p-dimensional subspace?
[Figure: two panels showing the samples s1 to s4 and the sources A and B in the space of elements E1, E2, E3]
we know how to answer this question
p is the number of non-zero singular values
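As a sketch, p can be counted from the singular values of a noise-free sample matrix. The matrix below is a hypothetical two-factor mixture, and the tolerance is an assumption modeled on the usual floating-point rank test rather than anything prescribed in the lecture.

```matlab
% Hypothetical noise-free data built from p = 2 factors.
F = [0.8 0.1 0.1; 0.2 0.3 0.5];
C = [1.0 0.0; 0.7 0.3; 0.4 0.6; 0.0 1.0];
S = C*F;
lambda = svd(S);                         % singular values, largest first
tol = max(size(S)) * eps(max(lambda));   % assumed floating-point tolerance
p = sum(lambda > tol);                   % count of "non-zero" singular values
fprintf('p = %d\n', p);
```

In floating point the third singular value is not exactly zero, which is why a tolerance is needed even for noise-free data.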
[Figure: samples s1 to s4, sources A and B, and singular vectors v1, v2, v3 in the space of elements E1, E2, E3]
SVD identifies a subspace
but the SVD factors
f(i) = v(i), i = 1, ..., p
are not unique and usually not the “best”
factor f(1): the v with the largest singular value
usually near the mean sample
sample mean <s>: minimize
eigenvector <v>: minimize
about the same if samples are clustered
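A small sketch of this comparison, using four hypothetical, tightly clustered samples (made-up numbers). It compares the direction of the mean sample with the singular vector belonging to the largest singular value.

```matlab
% Hypothetical clustered samples, one per row (M = 3 elements).
S = [0.50 0.30 0.20;
     0.52 0.29 0.19;
     0.48 0.31 0.21;
     0.50 0.32 0.18];
[U, LAMBDA, V] = svd(S, 0);
v1 = V(:,1);                      % singular vector with the largest singular value
m  = mean(S, 1)';                 % sample mean as a column vector
m  = m / norm(m);                 % compare directions only
if v1'*m < 0, v1 = -v1; end       % the sign of a singular vector is arbitrary
cosang = v1'*m;                   % cosine of the angle between v1 and the mean
fprintf('cos(angle) = %.6f\n', cosang);
```

For samples spread widely about the origin the two directions can differ noticeably, so this check only illustrates the clustered case.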
[Figure: panels (A) and (B) showing the samples s1 to s4 and the factors f(1) and f(2) in the space of elements E1, E2, E3]
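in MatLab:
[U, LAMBDA, V] = svd(S,0);   % economy-size SVD: S = U*LAMBDA*V'
lambda = diag(LAMBDA);       % singular values, largest first
F = V';                      % rows of F are the factors f(i)'
C = U*LAMBDA;                % loadings, so that S = C*F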
“economy” calculation: LAMBDA is M⨉M
since samples have measurement noise
probably no exactly zero singular values
just very small ones
so pick p for which S ≈ CF
is an adequate approximation
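One way to judge whether a given p is adequate is to keep only the first p singular values and look at the relative misfit. The data below are hypothetical (a two-factor mixture plus a small fixed perturbation standing in for noise), and the Frobenius-norm misfit is an assumed choice of error measure.

```matlab
% Hypothetical noisy data: rank-2 mixture plus a small perturbation.
F0 = [0.8 0.1 0.1; 0.2 0.3 0.5];
C0 = [1.0 0.0; 0.7 0.3; 0.4 0.6; 0.0 1.0];
E  = 0.001*[1 -1 0; 0 1 -1; -1 0 1; 1 1 -2];   % stand-in for measurement noise
S  = C0*F0 + E;
[U, LAMBDA, V] = svd(S, 0);
lambda = diag(LAMBDA);
p  = 2;                               % trial number of factors
Fp = V(:,1:p)';                       % first p factors
Cp = U(:,1:p)*LAMBDA(1:p,1:p);        % corresponding loadings
relerr = norm(S - Cp*Fp, 'fro') / norm(S, 'fro');
fprintf('lambda = %s\n', mat2str(lambda', 4));
fprintf('relative misfit with p = %d: %g\n', p, relerr);
```

The absolute misfit equals the root sum of squares of the discarded singular values, so the singular value spectrum itself shows how much is lost for each choice of p.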
Atlantic Rock Dataset
SiO2   TiO2  Al2O3  FeOt   MgO   CaO    Na2O  K2O
51.97  1.25  14.28  11.57  7.02  11.67  2.12  0.07
50.21  1.46  16.41  10.39  7.46  11.27  2.94  0.07
50.08  1.93  15.6   11.62  7.66  10.69  2.92  0.34
51.04  1.35  16.4   9.69   7.29  10.82  2.65  0.13
52.29  0.74  15.06  8.97   8.14  13.19  1.81  0.04
49.18  1.69  13.95  12.11  7.26  12.33  2     0.15
50.82  1.59  14.21  12.85  6.61  11.25  2.16  0.16
49.85  1.54  14.07  12.24  6.95  11.31  2.17  0.15
50.87  1.52  14.38  12.38  6.69  11.28  2.11  0.17
(several thousand more rows)
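As a sketch, the same SVD recipe can be applied to the nine rows listed above. Because only these rows are available here, the singular values will differ from those of the full dataset of several thousand samples.

```matlab
% The nine listed samples; columns are SiO2 TiO2 Al2O3 FeOt MgO CaO Na2O K2O.
S = [51.97 1.25 14.28 11.57 7.02 11.67 2.12 0.07;
     50.21 1.46 16.41 10.39 7.46 11.27 2.94 0.07;
     50.08 1.93 15.60 11.62 7.66 10.69 2.92 0.34;
     51.04 1.35 16.40  9.69 7.29 10.82 2.65 0.13;
     52.29 0.74 15.06  8.97 8.14 13.19 1.81 0.04;
     49.18 1.69 13.95 12.11 7.26 12.33 2.00 0.15;
     50.82 1.59 14.21 12.85 6.61 11.25 2.16 0.16;
     49.85 1.54 14.07 12.24 6.95 11.31 2.17 0.15;
     50.87 1.52 14.38 12.38 6.69 11.28 2.11 0.17];
[U, LAMBDA, V] = svd(S, 0);   % economy-size SVD
lambda = diag(LAMBDA);        % M = 8 singular values, largest first
F = V';                       % factors, one per row
C = U*LAMBDA;                 % loadings
disp(lambda');
```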
[Figure: panels A) to D), plots of the rock data with axes labeled Al2O3, TiO2, SiO2, K2O, FeO, MgO]
[Figure: singular values λi versus index i, for i = 1 to 8; vertical axis from 0 to 5000]
[Figure: factors f(2), f(3), f(4), f(5), shown element by element: SiO2, TiO2, Al2O3, FeOtotal, MgO, CaO, Na2O, K2O]
[Figure: loadings C2, C3, C4]