Multivariate Statistics: Summary and Comparison of Techniques
• The key to multivariate statistics is understanding conceptually the relationship among techniques with regard to:
  - The kinds of problems each technique is suited for
  - The objective(s) of each technique
  - The data structure required for each technique
  - Sampling considerations for each technique
  - Underlying mathematical model, or lack thereof, of each technique
  - Potential for complementary use of techniques
Multivariate Techniques

Model and Techniques
• Model: y1 + y2 + ... + yi
  Techniques: Unconstrained Ordination, Cluster Analysis
• Model: y1 + y2 + ... + yi = x
  Techniques: Multivariate ANOVA, Multi-Response Permutation, Analysis of Similarities, Mantel Test, Discriminant Analysis, Logistic Regression, Classification Trees, Indicator Species Analysis
• Model: y1 + y2 + ... + yi = x1 + x2 + ... + xj
  Techniques: Constrained Ordination, Canonical Correlation, Multivariate Regression Trees
Technique and Objective
• Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS): Extract gradients of maximum variation
• Cluster Analysis (family of techniques): Establish groups of similar entities
• Discrimination (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA): Test for and describe differences among groups of entities, or predict group membership
• Constrained Ordination (RDA, CCA, CAP): Extract gradients of variation in dependent variables explainable by independent variables
Technique and Variance Emphasis
• Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS): Emphasizes variation among individual sampling entities by defining gradients of maximum total sample variance; describes the inter-entity variance structure.
• Cluster Analysis (family of techniques): Emphasizes both differences and similarities among individual sampling entities by clustering entities based on inter-entity resemblance.
• Discrimination (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA): Emphasizes variation among groups of sampling entities; describes the inter-group variance structure.
• Constrained Ordination (RDA, CCA, CAP): Emphasizes variation among individual sampling entities by defining gradients of maximum total sample variance explainable by environmental variables.
Technique and Dependence Type
• Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS): Interdependence
• Cluster Analysis (family of techniques): Interdependence
• Discrimination (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA): Dependence
• Constrained Ordination (RDA, CCA, CAP): Dependence (& interdependence?)
Technique and Data Structure
• Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS): One set; >>2 variables (Y)
• Cluster Analysis (family of techniques): One set; >>2 variables (Y)
• Discrimination (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA): Two sets; 1 grouping variable (X), >>2 discriminating variables (Y)
• Constrained Ordination (RDA, CCA, CAP): Two sets; >>2 response (Y) variables, ≥1 explanatory (X) variables
Obs   Group   Y-set                    X-set
1     A       a11 a12 a13 ... a1p      b11 b12 b13 ... b1m
2     A       a21 a22 a23 ... a2p      b21 b22 b23 ... b2m
3     A       a31 a32 a33 ... a3p      b31 b32 b33 ... b3m
.     .       .   .   .   ... .        .   .   .   ... .
.     .       .   .   .   ... .        .   .   .   ... .
n     A       an1 an2 an3 ... anp      bn1 bn2 bn3 ... bnm
n+1   C       c11 c12 c13 ... c1p
n+2   C       c21 c22 c23 ... c2p
n+3   C       c31 c32 c33 ... c3p
.     .       .   .   .   ... .
.     .       .   .   .   ... .
N     C       cn1 cn2 cn3 ... cnp

Data matrix, shown in turn for:
• Unconstrained Ordination (PCA, PCO, CA, DCA, NMDS) and Cluster Analysis (family of techniques)
• Discrimination Techniques (MANOVA, MRPP, ANOSIM, Mantel; DA, LR, CART, ISA)
• Constrained Ordination (RDA, CCA, CAP, COR) and MRT
Technique and Sample Characteristics
• Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS): N (from one or unknown # pop's)
• Cluster Analysis (family of techniques): N (from unknown # pop's)
• Discrimination (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA): N (from known # pop's) or N1, N2, ... (from separate pop's)
• Constrained Ordination (RDA, CCA, CAP): N (from one pop)
If the research objective is to:
• Describe the major ecological gradients of variation among individual sampling entities, and/or to portray sampling entities along "continuous" gradients of maximum sample variation, then use... Unconstrained Ordination
  - Assume linear relationship to ecological gradients... PCA, PCO (MDS)
  - Assume unimodal relationship to ecological gradients... CA (RA), DCA; Alt: PCA*
  - Assume no particular relationship; only monotonic relationship between input and output dissimilarities... NMDS
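As an illustration of the linear case, the sketch below extracts the first PCA axis (the gradient of maximum sample variance) by power iteration on the covariance matrix. It is a minimal pure-Python sketch under stated assumptions, not the procedure of any particular package: the function name `first_principal_axis` and the `sites` data are hypothetical, and the data are assumed to have nonzero variance.

```python
import math
import random

def first_principal_axis(rows, iterations=500, seed=0):
    """Return (axis, scores, variance) for the first principal component.

    rows: list of sampling entities, each a list of p variable values.
    Hypothetical helper for illustration; assumes nonzero total variance.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    # Sample covariance matrix (p x p).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(p)]
           for a in range(p)]
    # Power iteration: repeated multiplication converges to the dominant eigenvector.
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in range(p)]
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Variance along the axis (the leading eigenvalue) and entity scores on it.
    variance = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p)) for a in range(p))
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores, variance

# Hypothetical data: four sampling entities measured on three variables.
sites = [[2.0, 1.0, 0.5], [4.0, 2.1, 0.4], [6.0, 2.9, 0.6], [8.0, 4.2, 0.5]]
axis, scores, variance = first_principal_axis(sites)
print(axis, scores, variance)
```

The scores place each sampling entity along the extracted gradient; later axes would require deflating the covariance matrix, which this sketch omits.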
If the research objective is to:
• Establish artificial classes or groups of similar entities where pre-specified, well-defined groups do not already exist, and/or to portray sampling entities in "discrete" groups, then use... Cluster Analysis
  - Assign entities to a specified number of groups to maximize within-group similarity or form composite clusters... Non-hierarchical Cluster Analysis
  - Assign entities to groups and display relationships among groups as they form... Hierarchical Cluster Analysis
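To make "display relationships among groups as they form" concrete, here is a minimal sketch of agglomerative hierarchical clustering that records each fusion. Single linkage with Euclidean distance is an assumption chosen for brevity (it is only one of the family of fusion strategies), and the names `single_linkage` and `points` are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two entities."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def single_linkage(points):
    """Agglomerative clustering with single linkage.

    Returns the merge history as a list of
    (members_of_cluster_a, members_of_cluster_b, fusion_distance).
    """
    clusters = [[i] for i in range(len(points))]
    history = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters whose closest members are nearest.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(euclidean(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        history.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return history

# Hypothetical data: five entities measured on two variables.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
for a, b, d in single_linkage(points):
    print(a, "+", b, "at distance", round(d, 3))
```

The merge history is the information a dendrogram displays; a non-hierarchical method would instead fix the number of groups in advance and return only the final assignment.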
If the research objective is to:
• Establish artificial classes or groups of entities with similar species composition and abundance where pre-specified, well-defined groups do not already exist, based on measured environmental variables, and/or to portray sampling entities in "discrete" groups representing species assemblages with distinct environmental affinities, then use... Constrained Cluster Analysis (MRT)
If the research objective is to:
• Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to:
  • Test for “significant” differences among groups...
    - Parametric test... MANOVA / DA
    - Nonparametric tests... MRPP, ANOSIM, Mantel
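The nonparametric tests share a permutation logic that a short sketch can illustrate. The code below is written in the spirit of MRPP, not as a reference implementation: it uses the group-size-weighted mean within-group Euclidean distance as the test statistic and reshuffles group labels to build the null distribution. The weighting n_i/N, the distance measure, and the names `mean_within_group_distance` and `mrpp_like_test` are assumptions made for illustration; every group is assumed to have at least two members.

```python
import math
import random

def mean_within_group_distance(points, groups):
    """Group-size-weighted mean of within-group pairwise distances (delta)."""
    n = len(points)
    delta = 0.0
    for g in set(groups):
        members = [p for p, lab in zip(points, groups) if lab == g]
        pairs = [(a, b) for i, a in enumerate(members) for b in members[:i]]
        mean_d = sum(math.dist(a, b) for a, b in pairs) / len(pairs)
        delta += (len(members) / n) * mean_d
    return delta

def mrpp_like_test(points, groups, permutations=999, seed=0):
    """Permutation test: is the observed delta smaller than expected by chance?"""
    rng = random.Random(seed)
    observed = mean_within_group_distance(points, groups)
    hits = 1  # the observed labelling counts as one arrangement
    labels = list(groups)
    for _ in range(permutations):
        rng.shuffle(labels)
        if mean_within_group_distance(points, labels) <= observed:
            hits += 1
    return observed, hits / (permutations + 1)

# Hypothetical data: two groups (A, C) of five entities on two variables.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
groups = ["A"] * 5 + ["C"] * 5
delta, p_value = mrpp_like_test(points, groups)
print(delta, p_value)
```

A small p-value indicates that entities within the pre-specified groups are more similar to each other than random groupings would produce, without assuming multivariate normality.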
If the research objective is to:
• Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to:
  • Describe the major ecological differences among groups...
    - Assume a linear discrimination function... DA (CAD)
    - Assume a logistic discrimination function... LR (MLR)
    - Do not assume any particular function... CART (UCT)
    - Identify “indicators” for each group... ISA
If the research objective is to:
• Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to:
  • Predict group membership of future observations...
    - Linear classification function... DA (LDF)
    - Logistic classification function... LR (MLR)
    - Decision tree classifier... CART (UCT)
    - Other nonparametric classifiers... Kernel, K nearest-neighbor
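Of the classifiers listed, k nearest-neighbor is the simplest to sketch: a new observation is assigned to the group most common among its k closest training entities. The example below is a minimal sketch assuming Euclidean distance and a simple majority vote; the name `knn_predict` and the training data are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train, labels, new_point, k=3):
    """Predict group membership by majority vote among the k nearest entities."""
    # Rank training entities by Euclidean distance to the new observation.
    order = sorted(range(len(train)), key=lambda i: math.dist(train[i], new_point))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training data: two pre-specified groups, A and C.
train = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["A", "A", "A", "C", "C", "C"]
print(knn_predict(train, labels, (0.5, 0.5)))
print(knn_predict(train, labels, (5.5, 5.5)))
```

No discrimination function is estimated here, which is why the method makes no distributional assumption; the cost is that it describes nothing about which variables separate the groups.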
If the research objective is to:
• Explain the variation in a single continuous dependent variable using two or more continuous independent variables, and/or to develop a model for predicting the value of the dependent variable from the values of the independent variables, then use... Multiple Linear Regression
  Alternatives: CART (URT)
If the research objective is to:
• Explain the variation in a single dichotomous dependent (grouping) variable using two or more continuous and/or categorical independent variables, and/or to develop a model for predicting the group membership of a sampling entity from the values of the independent variables, then use... Multiple Logistic Regression
  Alternatives: DA, CART (UCT)
If the research objective is to:
• Describe the major ecological patterns in one set of (response) variables explainable by another set of (explanatory) variables, then use... Constrained Ordination or MRT
  - Assume linear response function of response variables (species) along linear gradients defined by the explanatory variables (environment)... RDA, CAP
  - Assume unimodal response function of response variables (species) along linear gradients defined by the explanatory variables (environment)... CCA, DCCA; Alt: RDA*
  - Do not assume any response function... MRT
If the research objective is to:
• Determine the magnitude of the ecological relationships between two sets of variables expressed as distance matrices; i.e., dissimilarities between samples, then use... Mantel Test
• Determine the magnitude of the ecological relationships between two sets of variables expressed as distance matrices after accounting for a third set of variables (i.e., Y~X|Z), then use... Partial Mantel Test
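A simple Mantel test can be sketched in a few lines: correlate the corresponding off-diagonal entries of the two distance matrices, then permute the rows and columns of one matrix together to judge how unusual the observed correlation is. This is a minimal sketch assuming a Pearson correlation and a one-sided test; the names `mantel`, `pearson`, and `lower_triangle` are hypothetical, and the partial test (Y~X|Z) is not covered.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lower_triangle(d, order):
    """Below-diagonal entries of d after reordering rows and columns by order."""
    return [d[order[i]][order[j]] for i in range(len(order)) for j in range(i)]

def mantel(dy, dx, permutations=999, seed=0):
    """Return (observed correlation, one-sided permutation p-value)."""
    identity = list(range(len(dy)))
    x_vals = lower_triangle(dx, identity)
    r_obs = pearson(lower_triangle(dy, identity), x_vals)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute sampling entities, not individual distances
        if pearson(lower_triangle(dy, order), x_vals) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)

# Hypothetical data: six samples placed along one gradient.
positions = [0, 1, 3, 7, 12, 20]
dy = [[abs(a - b) for b in positions] for a in positions]
dx = [[2 * abs(a - b) for b in positions] for a in positions]
print(mantel(dy, dx))
```

Permuting whole rows and columns keeps the dependence among distances that share a sample, which is why the ordinary significance test for a correlation coefficient is not used here.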
Dependence Techniques

Abbreviations:
CT = Contingency tables
SLR = Simple logistic regression
MLR = Multiple logistic regression
SRA = Simple linear regression
MRA = Multiple linear regression
T-test = T-test
ANOVA = Analysis of variance
T2-test = Hotelling’s T2
MANOVA = Multivariate analysis of variance
DA = Discriminant analysis
ISA = Indicator species analysis
UCT = Univariate classification trees
URT = Univariate regression trees
MRT = Multivariate regression trees
RDA = Redundancy analysis
CCA = Canonical correspondence analysis
COR = Canonical correlation analysis
CAP = Canonical analysis of principal coordinates
[Figure: matrix of dependence techniques, with "Independent Variables" as an axis label. Cell entries include CT, SLR, MLR, SRA, MRA, T-test, ANOVA, T2-test, MANOVA, DA, RDA, CCA, CAP, UCT, URT, MRT, ISA, and COR; the arrangement of cells is not recoverable from the extracted text.]
Advantages of Multivariate Statistics
• Reflect more accurately the true multidimensional, multivariate nature of natural systems.
• Provide a way to handle large data sets with large numbers of variables.
• Provide a way of summarizing redundancy in large data sets.
• Provide rules for combining variables in an "optimal" way.
• Provide a solution to the multiple comparison problem by controlling experimentwise error rate.
• Provide a means of detecting and quantifying truly multivariate patterns that arise out of the correlational structure of the variable set.
• Provide a means of exploring complex data sets for patterns and relationships from which hypotheses can be generated and subsequently tested experimentally.
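The multiple comparison problem named in the list above can be made concrete with a small calculation. Assuming k independent univariate tests, each run at significance level alpha (independence is an assumption introduced here for illustration; correlated variables behave differently), the chance of at least one false positive is 1 - (1 - alpha)^k. The function name `experimentwise_error` is hypothetical.

```python
def experimentwise_error(alpha, tests):
    """Probability of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** tests

# One test per variable at alpha = 0.05, for increasing numbers of variables.
for k in (1, 5, 10, 20):
    print(k, round(experimentwise_error(0.05, k), 3))
```

With ten separate tests the rate is already about 0.40, which is the inflation a single multivariate test avoids.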