Sundermeyer, MAR 550, Spring 2020
Laboratory in Oceanography:
Data and Methods
MAR550, Spring 2020
Miles A. Sundermeyer
Empirical Orthogonal Functions
Basic idea of Empirical Orthogonal Functions
EOFs are an orthogonal linear transformation into a new coordinate system such
that the greatest variance of any projection of the data lies along the
first EOF (also called the first principal component), the second greatest
variance along the second EOF, and so on.
• Distinguish patterns/noise
• Reduce dimensionality
• Prediction
• Smoothing
• Also known as:
• Principal Component Analysis (PCA)
• Discrete Karhunen-Loève Transform (KLT)
• The Hotelling Transform
• Proper Orthogonal Decomposition (POD)
• Singular Value Decomposition (SVD) …
Empirical Orthogonal Functions (EOFs)
Principal Component Analysis (PCA)
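As an illustration of the variance-maximizing property (a NumPy sketch, not part of the original slides; the spatial pattern and noise level here are made up for the demonstration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic [time x space] data: one dominant spatial pattern plus weak noise.
t = np.linspace(0, 10, 200)
pattern = np.array([1.0, 0.8, 0.5, 0.2])              # hypothetical spatial pattern
X = np.outer(np.sin(t), pattern) + 0.1 * rng.standard_normal((200, 4))
X = X - X.mean(axis=0)                                # demean each column

U, s, Vt = np.linalg.svd(X, full_matrices=False)
expl_var = s**2 / np.sum(s**2)                        # variance fraction per EOF
# The first EOF captures the dominant pattern, and the variance
# fractions decrease monotonically from there.
```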
SVD: Singular Value Decomposition
• Decomposes any [n x p] matrix X into the form:
X = U S V’
where:
• U is an [n x n] orthonormal matrix
• V is a [p x p] orthonormal matrix
• S is a diagonal [n x p] matrix with elements si,i on the diagonal; the
elements, s, are called the singular values.
• The columns of U and V contain the (left and right) singular vectors of X.
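These properties are easy to verify numerically (a NumPy sketch; any random matrix will do):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 6, 4
X = rng.standard_normal((n, p))

U, s, Vt = np.linalg.svd(X)               # full SVD: U is [n x n], Vt = V' is [p x p]
S = np.zeros((n, p))                      # diagonal [n x p] matrix of singular values
np.fill_diagonal(S, s)

assert np.allclose(X, U @ S @ Vt)         # X = U S V'
assert np.allclose(U.T @ U, np.eye(n))    # U is orthonormal
assert np.allclose(Vt @ Vt.T, np.eye(p))  # V is orthonormal
```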
Empirical Orthogonal Functions: EOFs via SVD
• X is the demeaned data matrix as before.
1) Cx = X’ X = (U S V’)’ (U S V’) = V S’ U’ U S V’ = V S’ S V’
2) Cx = EOFs L EOFs’ (the eigenvalue problem rewritten in matrix form)
• Comparing 1) & 2):
– EOFs = V (up to sign and ordering conventions)
– L = S’ S: the squared singular values are the eigenvalues
– The columns of V contain the eigenvectors of Cx = X’ X; these are our EOFs.
– The columns of U contain the eigenvectors of X X’; scaled by the singular
values, they give the expansion coefficient (normalized) time series.
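This equivalence can be checked numerically (a NumPy sketch; Cx is formed here only to verify the algebra, never in practice):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((50, 5))
X = X - X.mean(axis=0)                 # demeaned data matrix

U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Only to verify the algebra -- in practice Cx is never formed:
Cx = X.T @ X
evals = np.linalg.eigvalsh(Cx)[::-1]   # eigenvalues of Cx, sorted descending

# L = S'S: the squared singular values are the eigenvalues of Cx
assert np.allclose(evals, s**2)
```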
1. Use SVD to find U, S, and V such that X = U S V’
2. Compute the eigenvalues of Cx as the squared singular values (the diagonal of S’ S).
3. The eigenvectors of Cx are the column vectors of V.
• Note: we never have to actually compute Cx!
How to in Matlab:
1. Shape your data, e.g., into [time x space]
2. Demean the data: >> X = detrend(X,0);
3. Perform SVD: >> [U,S,V] = svd(X);
4. Compute eigenvalues: >> EVal = diag(S.^2);
5. Compute explained variance: >> expl_var = EVal/sum(EVal);
6. EOFs are the column vectors of V: >> EOFs = V;
7. Compute expansion coefficients: >> EC = U*S;
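The same recipe in NumPy (a sketch mirroring the Matlab steps; `eof_svd` is a hypothetical helper name, and the economy-size SVD is used so the shapes follow the [time x space] convention):

```python
import numpy as np

def eof_svd(X):
    """EOF analysis of a [time x space] data matrix via SVD."""
    Xd = X - X.mean(axis=0)                            # step 2: demean each column
    U, s, Vt = np.linalg.svd(Xd, full_matrices=False)  # step 3: X = U S V'
    evals = s**2                                       # step 4: eigenvalues
    expl_var = evals / evals.sum()                     # step 5: explained variance
    eofs = Vt.T                                        # step 6: EOFs = columns of V
    ec = U * s                                         # step 7: EC = U S
    return eofs, ec, expl_var

rng = np.random.default_rng(3)
X = rng.standard_normal((30, 6))
eofs, ec, expl_var = eof_svd(X)

# The expansion coefficients and EOFs reconstruct the demeaned data:
assert np.allclose(ec @ eofs.T, X - X.mean(axis=0))
```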
• There are basically two techniques:
• Computing eigenvectors and eigenvalues of the covariance matrix
• Singular Value Decomposition (SVD) of the data
• Both methods give similar results; however:
• There are some differences in dimensionality.
• SVD is much faster, especially for data larger than ~1000 x 1000 points,
since the covariance matrix never has to be formed explicitly.
Options for dealing with data gaps:
• Ignore them; leave them be.
• Introduce randomly generated data to fill the gaps, and test the sensitivity of the results over M realizations
• Fill gaps, e.g., using optimal interpolation
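One minimal option (a hedged sketch, not from the slides): fill each gap with its column mean before the SVD. This is far cruder than optimal interpolation, but it shows the mechanics; `fill_gaps_with_column_means` is a hypothetical helper:

```python
import numpy as np

def fill_gaps_with_column_means(X):
    """Replace NaN gaps with each column's mean (a simple stand-in for
    fancier gap filling such as optimal interpolation)."""
    X = X.astype(float).copy()
    col_means = np.nanmean(X, axis=0)          # per-column mean, ignoring NaNs
    gaps = np.isnan(X)
    X[gaps] = np.take(col_means, np.where(gaps)[1])
    return X

X = np.array([[1.0, 2.0],
              [np.nan, 4.0],
              [3.0, np.nan]])
Xf = fill_gaps_with_column_means(X)
# Gaps are replaced by the column means: column 0 -> 2.0, column 1 -> 3.0
```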
A word about removing the mean
• Removing the mean has nothing to do with the process of finding
eigenvectors, but it allows us to interpret Cx as a covariance matrix, and
hence to interpret our results. Strictly speaking, one can find EOFs
without removing the mean.
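A small check of why demeaning matters for interpretation: after removing the column means, X’X/(n-1) is exactly the sample covariance matrix (a NumPy sketch):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((100, 3)) + np.array([5.0, -2.0, 0.0])  # nonzero means
Xd = X - X.mean(axis=0)                                         # demean

n = X.shape[0]
Cx = Xd.T @ Xd / (n - 1)
# Matches the sample covariance matrix only because the mean was removed:
assert np.allclose(Cx, np.cov(X, rowvar=False))
```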
• Example: Coastal Mixing and Optics shipboard velocity
[Figure slides: velocity time series vs. time (days)]
North et al. (1982) proposed a “rule of thumb” for deciding whether an EOF
is likely subject to large sampling fluctuations.
For large sample size, N, an approximate 95% confidence interval for the
i-th eigenvalue of the sample covariance matrix is:
Confidence Interval = λi ± 1.96 λi √(2/N)
Rule: if the confidence interval is comparable to the spacing between
neighboring eigenvalues, then the corresponding EOFs will be
strongly affected by sampling fluctuations.
Since the confidence interval scales with λi, the CIs have equal width on
a log scale.
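The rule of thumb can be coded directly (a sketch; `north_test` is a hypothetical helper, and N should really be the number of independent samples, not just the record length):

```python
import numpy as np

def north_test(evals, N):
    """Flag eigenvalues whose North et al. (1982) ~95% sampling error
    1.96*lam*sqrt(2/N) is smaller than the gap to the next eigenvalue.
    (A fuller test also compares against the gap to the eigenvalue above.)"""
    evals = np.asarray(evals, dtype=float)     # assumed sorted descending
    err = 1.96 * evals * np.sqrt(2.0 / N)      # ~95% sampling error of each lambda
    gap_below = -np.diff(evals)                # spacing to the next (smaller) eigenvalue
    separated = err[:-1] < gap_below
    return err, separated

evals = np.array([10.0, 5.0, 1.0, 0.9])
err, sep = north_test(evals, N=100)
# The first two eigenvalues are well separated; the 1.0/0.9 pair is not.
```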
Useful Tidbits:
• pca.m - Matlab’s Principal Component Analysis function
References
• R. W. Preisendorfer: Principal Component Analysis in Meteorology and
Oceanography. Elsevier Science, 1988.
• Hans von Storch and Francis W. Zwiers: Statistical Analysis in Climate
Research. Cambridge University Press, 2002.
• North, G. R., T. L. Bell, R. F. Cahalan, and F. J. Moeng: Sampling errors in the
estimation of empirical orthogonal functions. Mon. Wea. Rev., 110, 699-706,
1982.
• Hannachi, A., I. T. Jolliffe, and D. B. Stephenson: Empirical orthogonal
functions and related techniques in atmospheric science: A review.
International Journal of Climatology, 27, 1119-1152, 2007.
Cautionary note from von Storch and Navarra (2002):
“I have learned the following rule to be useful when dealing with
advanced methods. Such methods are often needed to find a
signal in a vast noisy space, i.e. the needle in the haystack. But
after having the needle in our hand, we should be able to identify
the needle by simply looking at it. Whenever you are unable to do
so there is a good chance that something is rotten in the
analysis.”