Matthias Maneck - Journal Club WS 04/05
Independent components analysis of starch deficient pgm mutants
GCB 2004
M. Scholz, Y. Gibon, M. Stitt, J. Selbig

Overview
Introduction
Methods
    PCA – Principal Component Analysis
    ICA – Independent Component Analysis
    Kurtosis
Results
Summary

Introduction – techniques
Visualization techniques
    supervised: use biological background information
    unsupervised: present the major global information
        General questions about the underlying data structure.
        Detect relevant components independent of background knowledge.

Introduction – techniques
PCA
    dimensionality reduction
    extracts relevant information related to the highest variance
ICA
    optimizes an independence condition
    components represent different, non-overlapping information

Introduction – experiments
Microplate assays of enzymes from Arabidopsis thaliana.
    pgm mutant vs. wild type
    continuous night
Data matrix: i enzymes (rows) × j samples (columns)

Introduction – workflow
Data (i enzymes × j samples) → PCA → PCs (PCs × j samples) → ICA → ICs (ICs × j samples) → Kurtosis → ordered ICs (1st IC, 2nd IC, ...)

PCA – principal component analysis
[Figure: scatter plot of the samples, Enzyme 1 (x-axis) against Enzyme 2 (y-axis), both axes from -4 to 4]

PCA – principal component analysis
[Figure: scatter plot of Enzyme 1 against Enzyme 2 with the 1st and 2nd principal components drawn in]

PCA – principal component analysis
[Figure: the samples plotted in principal component coordinates, 1st PC (x-axis) against 2nd PC (y-axis), both axes from -4 to 4]

PCA – calculation
Data matrix (i enzymes × j samples): subtract the mean of each enzyme from its row.
Covariance matrix (i enzymes × i enzymes): computed from the mean-centered data matrix.
Eigenvectors x1 ... xi and eigenvalues λ1 ... λi: obtained from the covariance matrix.

PCA – dimensionality reduction
Reduced Data Matrix (PCs × j samples) = Selected Components (PCs × i enzymes) · Data Matrix (i enzymes × j samples)
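The calculation and reduction steps can be sketched with NumPy. This is a minimal illustration on made-up data for two correlated "enzymes", not the authors' code; the function name `pca_reduce` and the toy data are assumptions introduced here.

```python
import numpy as np

def pca_reduce(data, n_components):
    """PCA on a data matrix of shape (i enzymes, j samples).

    Returns the selected components (PCs x i enzymes), the reduced
    data matrix (PCs x j samples) and all eigenvalues, largest first.
    """
    # Subtract the mean of each enzyme (row).
    centered = data - data.mean(axis=1, keepdims=True)
    # Covariance matrix of the enzymes, shape (i, i); rows are variables.
    cov = np.cov(centered)
    # eigh is meant for symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals = eigvals[order]
    components = eigvecs[:, order].T[:n_components]
    # Reduced data matrix = selected components * data matrix.
    reduced = components @ centered
    return components, reduced, eigvals

# Hypothetical toy data: 2 correlated enzymes, 200 samples.
rng = np.random.default_rng(0)
enzyme1 = rng.normal(size=200)
enzyme2 = 0.8 * enzyme1 + 0.3 * rng.normal(size=200)
data = np.vstack([enzyme1, enzyme2])

components, reduced, eigvals = pca_reduce(data, n_components=1)
print(components.shape, reduced.shape)
```

The variance of the samples along the 1st PC equals the largest eigenvalue, which is why ordering by eigenvalue orders the PCs by explained variance.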

PCA – principal component analysis
[Figure: scatter plot of Enzyme 1 against Enzyme 2 with the 1st and 2nd principal components drawn in, followed by the samples projected onto the 1st PC only]

PCA – principal component analysis
Removes the correlation between components: the PCs are uncorrelated.
Components are orthogonal to each other.
Delivers a transformation matrix that gives the influence of the enzymes on the principal components.
PCs are ordered by the size of the eigenvalues of the covariance matrix.
Reduced Data Matrix (PCs × j samples) = Selected Components (PCs × i enzymes) · Data Matrix (i enzymes × j samples)
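
ICA – independent component analysis
Three speakers (Person 1, Person 2, Person 3) are recorded by three microphones (Microphone 1, Microphone 2, Microphone 3).
The microphone signals are mixed speech signals:
\[
\begin{aligned}
x_1(t) &= a_{11} s_1(t) + a_{12} s_2(t) + a_{13} s_3(t) \\
x_2(t) &= a_{21} s_1(t) + a_{22} s_2(t) + a_{23} s_3(t) \\
x_3(t) &= a_{31} s_1(t) + a_{32} s_2(t) + a_{33} s_3(t)
\end{aligned}
\]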

ICA – independent component analysis
Mixing: Microphone Signals X (microphones × time t) = Mixing Matrix A · Speech Signals S (speakers × time t)
Demixing: Speech Signals S (speakers × time t) = Demixing Matrix A^-1 · Microphone Signals X (microphones × time t)
\[
X = A S, \qquad S = A^{-1} X
\]
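The mixing and demixing relations can be checked numerically. The sketch below uses three hypothetical source signals and a hypothetical mixing matrix A that is known in advance; in a real ICA problem A is unknown and has to be estimated, so this only illustrates the algebra.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 500)

# Hypothetical source signals standing in for the three speakers.
S = np.vstack([
    np.sin(2 * np.pi * 5 * t),             # sine wave
    np.sign(np.sin(2 * np.pi * 3 * t)),    # square wave
    2 * ((7 * t) % 1.0) - 1.0,             # sawtooth
])

# Hypothetical mixing matrix: entry a_mn is the weight of speaker n
# in microphone m.
A = np.array([[1.0, 0.5, 0.3],
              [0.4, 1.0, 0.6],
              [0.2, 0.7, 1.0]])

X = A @ S                          # microphone signals, X = A S
S_recovered = np.linalg.inv(A) @ X # demixing, S = A^-1 X

print(np.allclose(S_recovered, S))
```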

ICA – independent component analysis
[Figure: histograms of three variables, each on the range 0 to 1, and a histogram on the range 0 to 3]
The sum of several distributions of the same type is more Gaussian than each distribution on its own.

ICA – independent component analysis
Maximizes the independence (non-Gaussianity) of the components.
ICA does not work with purely Gaussian distributed data.
Components are not orthogonal to each other.
Delivers a transformation matrix that gives the influence of the PCs on the independent components.
ICs are unordered.
ICs (ICs × j samples) = Demixing Matrix (ICs × PCs) · Data Matrix (PCs × j samples)
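To make the idea of maximizing non-Gaussianity concrete, here is a deliberately simple two-dimensional sketch. It is not the ICA algorithm used in the paper: it whitens the data and then tries a grid of rotation angles, keeping the one with the largest summed absolute kurtosis. The function names, the toy sources and the mixing matrix are all assumptions made for illustration.

```python
import numpy as np

def excess_kurtosis(z):
    """Fourth standardized moment minus 3 (zero for a Gaussian)."""
    z = z - z.mean()
    return np.mean(z ** 4) / np.mean(z ** 2) ** 2 - 3.0

def whiten(X):
    """Center X (signals x samples) and rescale it to identity covariance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc))
    W = np.diag(eigvals ** -0.5) @ eigvecs.T
    return W @ Xc, W

def ica_2d(X, n_angles=180):
    """Return (ICs, demixing matrix) for a 2 x samples matrix X."""
    Z, W = whiten(X)
    best_R, best_score = np.eye(2), -np.inf
    # Rotations by a multiple of 90 degrees only permute or flip the
    # components, so searching [0, pi/2) is sufficient.
    for theta in np.linspace(0.0, np.pi / 2, n_angles, endpoint=False):
        c, s = np.cos(theta), np.sin(theta)
        R = np.array([[c, s], [-s, c]])
        score = sum(abs(excess_kurtosis(y)) for y in R @ Z)
        if score > best_score:
            best_R, best_score = R, score
    return best_R @ Z, best_R @ W

# Hypothetical example: two independent, uniformly distributed sources.
rng = np.random.default_rng(1)
S = rng.uniform(-1.0, 1.0, size=(2, 2000))
A = np.array([[1.0, 0.6], [0.4, 1.0]])   # hypothetical mixing matrix
X = A @ S

ICs, demixing = ica_2d(X)
```

The recovered components match the sources only up to order, sign and scale, which is consistent with the ICs being unordered. The demixing matrix is in general not orthogonal, because it combines the rotation with the whitening step.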

Kurtosis – significant components
Measure of non-Gaussianity:
\[
\mathrm{kurtosis}(z) = \frac{\sum_{i=1}^{n} (z_i - \mu)^4}{(n-1)\,\sigma^4} - 3
\]
z – random variable (IC), μ – mean, σ – standard deviation, n – number of values z_i
positive kurtosis: super-Gaussian
negative kurtosis: sub-Gaussian
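The kurtosis formula translates directly into a few lines of standard-library Python. The slide does not say whether σ is the sample or the population standard deviation; the sketch assumes the sample version. The test distributions are made-up examples.

```python
import random
import statistics

def kurtosis(z):
    """Kurtosis as on the slide: sum((z_i - mu)^4) / ((n - 1) * sigma^4) - 3."""
    n = len(z)
    mu = statistics.mean(z)
    sigma = statistics.stdev(z)  # assumption: sample standard deviation
    return sum((zi - mu) ** 4 for zi in z) / ((n - 1) * sigma ** 4) - 3.0

random.seed(0)
n = 20000
gaussian = [random.gauss(0.0, 1.0) for _ in range(n)]
uniform = [random.random() for _ in range(n)]
# Laplace-like sample: exponential magnitude with a random sign.
laplace = [random.expovariate(1.0) * random.choice((-1, 1)) for _ in range(n)]
# Sum of three independent uniform variables.
summed = [random.random() + random.random() + random.random() for _ in range(n)]

k_gaussian = kurtosis(gaussian)  # close to zero
k_uniform = kurtosis(uniform)    # negative: sub-Gaussian
k_laplace = kurtosis(laplace)    # positive: super-Gaussian
k_summed = kurtosis(summed)      # negative, but closer to zero than k_uniform
print(k_gaussian, k_uniform, k_laplace, k_summed)
```

The sum of three uniform variables has a kurtosis closer to zero than a single uniform variable, which is one way to see that summing makes a distribution more Gaussian.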

Influence Values
Which enzymes have the most influence on the ICs?
Reduced Data Matrix (PCs × j samples) = Selected Components (PCs × i enzymes) · Data Matrix (i enzymes × j samples)
ICs (ICs × j samples) = Demixing Matrix (ICs × PCs) · Reduced Data Matrix (PCs × j samples)

Influence Values
Influence Matrix (ICs × i enzymes) = Demixing Matrix (ICs × PCs) · Selected Components (PCs × i enzymes)
ICs (ICs × j samples) = Influence Matrix (ICs × i enzymes) · Data Matrix (i enzymes × j samples)
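The influence matrix is simply the product of the two transformation matrices, so applying it to the data gives the same ICs as applying PCA and ICA one after the other. The sketch below checks this with random placeholder matrices; the matrix sizes other than 17 enzymes and 125 samples, and all variable names, are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_enzymes, n_samples = 17, 125   # sizes of the pgm data set
n_pcs, n_ics = 5, 5              # hypothetical number of components

# Random placeholders; in practice these come from the data, PCA and ICA.
data = rng.normal(size=(n_enzymes, n_samples))
data -= data.mean(axis=1, keepdims=True)
selected_components = rng.normal(size=(n_pcs, n_enzymes))
demixing = rng.normal(size=(n_ics, n_pcs))

# Influence matrix (ICs x enzymes) = demixing matrix * selected components.
influence = demixing @ selected_components

ics_two_step = demixing @ (selected_components @ data)  # PCA, then ICA
ics_direct = influence @ data                           # one combined step
print(np.allclose(ics_two_step, ics_direct))

# Enzyme indices ordered by absolute influence on the first IC.
ranking = np.argsort(-np.abs(influence[0]))
```

Ranking the absolute entries of one row of the influence matrix is one plausible way to answer which enzymes have the most influence on that IC.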

Results
pgm mutant
    compares wild type and pgm mutant
    17 enzymes, 125 samples
    wild type, pgm mutant
continuous night
    response to carbon starvation
    17 enzymes, 55 samples
    +0, +2, +4, +8, +24, +48, +72, +148 h

Results – pgm mutant
[Figure]

Results – continuous night
[Figure]

Results – combined
[Figures]

Summary
ICA in combination with PCA has higher discriminating power than PCA alone.
Kurtosis is used to select the optimal PCA dimension and to order the ICs.
In the pgm experiment, the 1st IC discriminates between mutant and wild type.
In the continuous night experiment, the 2nd IC represents the time component.
The two most strongly implicated enzymes are identical.
References
Scholz M., Gibon Y., Stitt M., Selbig J.: Independent components analysis of starch deficient pgm mutants.
Scholz M., Gatzek S., Sterling A., Fiehn O., Selbig J.: Metabolite fingerprinting: an ICA approach.
Blaschke T., Wiskott L.: CuBICA: Independent Component Analysis by Simultaneous Third- and Fourth-Order Cumulant Diagonalization. IEEE Transactions on Signal Processing, 52(5):1250-1256. http://itb.biologie.hu-berlin.de/~blaschke/
Hyvärinen A., Karhunen J., Oja E.: Independent Component Analysis. J. Wiley. 2001.