
Multivariate Statistics Summary and Comparison of Techniques · 2013. 4. 30.


1

Multivariate Statistics
Summary and Comparison of Techniques

• The key to multivariate statistics is understanding conceptually the relationships among techniques with regard to:

  - The kinds of problems each technique is suited for
  - The objective(s) of each technique
  - The data structure required for each technique
  - Sampling considerations for each technique
  - The underlying mathematical model, or lack thereof, of each technique
  - The potential for complementary use of techniques

2

Multivariate Techniques

Model | Techniques
y1 + y2 + ... + yi | Unconstrained Ordination; Cluster Analysis
y1 + y2 + ... + yi = x | Multivariate ANOVA; Multi-Response Permutation; Analysis of Similarities; Mantel Test; Discriminant Analysis; Logistic Regression; Classification Trees; Indicator Species Analysis
y1 + y2 + ... + yi = x1 + x2 + ... + xj | Constrained Ordination; Canonical Correlation; Multivariate Regression Trees


3

Multivariate Techniques

Technique | Objective
Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS) | Extract gradients of maximum variation
Cluster Analysis (family of techniques) | Establish groups of similar entities
Discrimination (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA) | Test for and describe differences among groups of entities, or predict group membership
Constrained Ordination (RDA, CCA, CAP) | Extract gradients of variation in dependent variables explainable by independent variables

4

Multivariate Techniques

Technique | Variance Emphasis
Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS) | Emphasizes variation among individual sampling entities by defining gradients of maximum total sample variance; describes the inter-entity variance structure.
Cluster Analysis (family of techniques) | Emphasizes both differences and similarities among individual sampling entities by clustering entities based on inter-entity resemblance.
Discrimination (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA) | Emphasizes variation among groups of sampling entities; describes the inter-group variance structure.
Constrained Ordination (RDA, CCA, CAP) | Emphasizes variation among individual sampling entities by defining gradients of maximum total sample variance explainable by environmental variables.

8

Multivariate Techniques

Technique | Dependence Type
Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS) | Interdependence
Cluster Analysis (family of techniques) | Interdependence
Discrimination (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA) | Dependence
Constrained Ordination (RDA, CCA, CAP) | Dependence (& interdependence?)


9

Multivariate Techniques

Technique | Data Structure
Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS) | One set; >>2 variables (Y)
Cluster Analysis (family of techniques) | One set; >>2 variables (Y)
Discrimination (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA) | Two sets; 1 grouping variable (X), >>2 discriminating variables (Y)
Constrained Ordination (RDA, CCA, CAP) | Two sets; >>2 response (Y) variables, ≥1 explanatory (X) variables

10

Multivariate Techniques

Unconstrained Ordination (PCA, PCO, CA, DCA, NMDS)
Cluster Analysis (family of techniques)

Obs   Group   Y-set                  X-set
1     A       a11 a12 a13 ... a1p    b11 b12 b13 ... b1m
2     A       a21 a22 a23 ... a2p    b21 b22 b23 ... b2m
3     A       a31 a32 a33 ... a3p    b31 b32 b33 ... b3m
.     .       .   .   .   ... .      .   .   .   ... .
.     .       .   .   .   ... .      .   .   .   ... .
n     A       an1 an2 an3 ... anp    bn1 bn2 bn3 ... bnm
n+1   C       c11 c12 c13 ... c1p
n+2   C       c21 c22 c23 ... c2p
n+3   C       c31 c32 c33 ... c3p
.     .       .   .   .   ... .
.     .       .   .   .   ... .
N     C       cn1 cn2 cn3 ... cnp


11

Multivariate Techniques

Discrimination Techniques (MANOVA, MRPP, ANOSIM, Mantel; DA, LR, CART, ISA)

Obs   Group   Y-set                  X-set
1     A       a11 a12 a13 ... a1p    b11 b12 b13 ... b1m
2     A       a21 a22 a23 ... a2p    b21 b22 b23 ... b2m
3     A       a31 a32 a33 ... a3p    b31 b32 b33 ... b3m
.     .       .   .   .   ... .      .   .   .   ... .
.     .       .   .   .   ... .      .   .   .   ... .
n     A       an1 an2 an3 ... anp    bn1 bn2 bn3 ... bnm
n+1   C       c11 c12 c13 ... c1p
n+2   C       c21 c22 c23 ... c2p
n+3   C       c31 c32 c33 ... c3p
.     .       .   .   .   ... .
.     .       .   .   .   ... .
N     C       cn1 cn2 cn3 ... cnp

12

Multivariate Techniques

Constrained Ordination (RDA, CCA, CAP, COR)
MRT

Obs   Group   Y-set                  X-set
1     A       a11 a12 a13 ... a1p    b11 b12 b13 ... b1m
2     A       a21 a22 a23 ... a2p    b21 b22 b23 ... b2m
3     A       a31 a32 a33 ... a3p    b31 b32 b33 ... b3m
.     .       .   .   .   ... .      .   .   .   ... .
.     .       .   .   .   ... .      .   .   .   ... .
n     A       an1 an2 an3 ... anp    bn1 bn2 bn3 ... bnm
n+1   C       c11 c12 c13 ... c1p
n+2   C       c21 c22 c23 ... c2p
n+3   C       c31 c32 c33 ... c3p
.     .       .   .   .   ... .
.     .       .   .   .   ... .
N     C       cn1 cn2 cn3 ... cnp
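
The Obs / Group / Y-set / X-set layout can be sketched in code to show which part of the data each family of techniques reads. This is only an illustration: the `Observation` class, the helper names, and all numeric values are hypothetical and do not come from the slides.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Observation:
    obs: int                         # row identifier (Obs column)
    group: str                       # grouping variable (Group column)
    y: List[float]                   # Y-set: p response / discriminating variables
    x: Optional[List[float]] = None  # X-set: m explanatory variables, if measured


def y_only(data: List[Observation]) -> List[List[float]]:
    """One set of variables: input to unconstrained ordination and cluster analysis."""
    return [o.y for o in data]


def y_by_group(data: List[Observation]) -> Dict[str, List[List[float]]]:
    """Grouping variable plus Y-set: input to discrimination techniques."""
    out: Dict[str, List[List[float]]] = {}
    for o in data:
        out.setdefault(o.group, []).append(o.y)
    return out


def y_and_x(data: List[Observation]) -> Tuple[List[List[float]], List[List[float]]]:
    """Paired Y-set and X-set: input to constrained ordination and MRT."""
    rows = [o for o in data if o.x is not None]
    return [o.y for o in rows], [o.x for o in rows]


# Made-up values, purely for illustration.
data = [
    Observation(1, "A", [1.0, 2.0, 3.0], [0.5, 0.1]),
    Observation(2, "A", [1.5, 2.5, 2.0], [0.7, 0.3]),
    Observation(3, "C", [4.0, 0.5, 1.0]),
    Observation(4, "C", [3.5, 1.0, 0.5]),
]
```

Here `y_and_x` keeps only rows that carry an X-set, mirroring the table, where the group C rows have no X-set entries.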


13

Multivariate Techniques

Technique | Sample Characteristics
Unconstrained Ordination (PCA, MDS, CA, DCA, NMDS) | N (from one or an unknown # of populations)
Cluster Analysis (family of techniques) | N (from an unknown # of populations)
Discrimination (MANOVA, MRPP, ANOSIM, Mantel, DA, LR, CART, ISA) | N (from a known # of populations) or N1, N2, ... (from separate populations)
Constrained Ordination (RDA, CCA, CAP) | N (from one population)

14

Multivariate Techniques

If the research objective is to:

• Describe the major ecological gradients of variation among individual sampling entities, and/or to portray sampling entities along "continuous" gradients of maximum sample variation, then use... Unconstrained Ordination

  - Assume linear relationship to ecological gradients... PCA, PCO (MDS)
  - Assume unimodal relationship to ecological gradients... CA (RA), DCA; Alt: PCA*
  - Assume no particular relationship; only monotonic relationship between input and output dissimilarities... NMDS
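
As a rough illustration of the linear case, the sketch below extracts the first gradient of maximum sample variance from a Y-set, PCA-style. It is a minimal standard-library sketch, not the procedure used in the course: the function names are hypothetical, and power iteration on the covariance matrix is chosen here only because it needs no external library.

```python
import math


def center(rows):
    """Subtract each variable's mean so axes pass through the centroid."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [[r[j] - means[j] for j in range(p)] for r in rows]


def covariance(rows):
    """Sample covariance matrix (p x p) of the variables."""
    c = center(rows)
    n, p = len(c), len(c[0])
    return [[sum(r[i] * r[j] for r in c) / (n - 1) for j in range(p)]
            for i in range(p)]


def first_axis(rows, iters=500):
    """Leading eigenvector and eigenvalue of the covariance matrix (power iteration)."""
    cov = covariance(rows)
    p = len(cov)
    v = [1.0 + 0.1 * j for j in range(p)]  # arbitrary, slightly asymmetric start
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("zero variance, or start vector orthogonal to the data")
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(cov[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return v, eigenvalue


def scores(rows, axis):
    """Position of each sampling entity along the extracted gradient."""
    return [sum(a * b for a, b in zip(r, axis)) for r in center(rows)]
```

The eigenvalue is the sample variance captured by the axis, and `scores` places each entity along that "continuous" gradient. Further axes would require deflation or a full eigendecomposition, which this sketch omits.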

18

Multivariate Techniques

If the research objective is to:

• Establish artificial classes or groups of similar entities where pre-specified, well-defined groups do not already exist, and/or to portray sampling entities in "discrete" groups, then use... Cluster Analysis

  - Assign entities to a specified number of groups to maximize within-group similarity or form composite clusters... Non-hierarchical Cluster Analysis
  - Assign entities to groups and display relationships among groups as they form... Hierarchical Cluster Analysis
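
To make the non-hierarchical case concrete, here is a minimal k-means sketch. K-means is one common non-hierarchical method and is an assumption of this example, since the slides do not name a specific algorithm. The function names and the requirement that the caller supply starting centers are likewise illustrative choices.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two entities."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, centers, iters=100):
    """Assign entities to len(centers) groups; return (labels, final centers).

    `centers` are caller-supplied starting centers, so the number of
    groups is specified in advance, as in non-hierarchical clustering.
    """
    centers = [list(c) for c in centers]
    labels = []
    for _ in range(iters):
        # Assignment step: each entity joins its nearest center.
        labels = [min(range(len(centers)),
                      key=lambda k: squared_distance(p, centers[k]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append([sum(col) / len(members)
                                    for col in zip(*members)])
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center
        if new_centers == centers:
            break  # assignments have stabilized
        centers = new_centers
    return labels, centers
```

Minimizing within-group squared distance is one way to "maximize within-group similarity"; results depend on the starting centers, so in practice several starts are usually compared.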


21

Multivariate Techniques

If the research objective is to:

• Establish artificial classes or groups of entities with similar species composition and abundance where pre-specified, well-defined groups do not already exist, based on measured environmental variables, and/or to portray sampling entities in "discrete" groups representing species assemblages with distinct environmental affinities, then use... Constrained Cluster Analysis (MRT)

22

Multivariate Techniques

If the research objective is to:

• Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to:

  • Test for “significant” differences among groups...
    - Parametric test... MANOVA / DA
    - Nonparametric tests... MRPP, ANOSIM, Mantel

24

Multivariate Techniques

If the research objective is to:

• Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to:

  • Describe the major ecological differences among groups...
    - Assume a linear discrimination function... DA (CAD)
    - Assume a logistic discrimination function... LR (MLR)
    - Do not assume any particular function... CART (UCT)
    - Identify “indicators” for each group... ISA

28

Multivariate Techniques

If the research objective is to:

• Differentiate among pre-specified, well-defined classes or groups of sampling entities, and to:

  • Predict group membership of future observations...
    - Linear classification function... DA (LDF)
    - Logistic classification function... LR (MLR)
    - Decision tree classifier... CART (UCT)
    - Other nonparametric classifiers... Kernel, K nearest-neighbor
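
Of the classifiers listed, K nearest-neighbor is the simplest to sketch. The version below is a minimal illustration under stated assumptions: Euclidean distance, simple majority vote, and the name `knn_predict` are choices made for this example and are not specified in the slides.

```python
from collections import Counter


def knn_predict(train_y, train_groups, new_y, k=3):
    """Predict the group of `new_y` from its k nearest training entities.

    train_y      -- list of Y-set rows with known group membership
    train_groups -- group label for each training row
    new_y        -- Y-set row of the future observation
    """
    # Rank training entities by squared Euclidean distance to the new one.
    order = sorted(
        range(len(train_y)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_y[i], new_y)),
    )
    # Majority vote among the k nearest; on a tie, the label seen first
    # among the nearest neighbors wins.
    votes = Counter(train_groups[i] for i in order[:k])
    return votes.most_common(1)[0][0]
```

No discrimination function is fitted; the classifier assumes only that entities close together in Y-space tend to belong to the same group.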

32

Multivariate Techniques

If the research objective is to:

• Explain the variation in a single continuous dependent variable using two or more continuous independent variables, and/or to develop a model for predicting the value of the dependent variable from the values of the independent variables, then use... Multiple Linear Regression

Alternatives: CART (URT)


33

Multivariate Techniques

If the research objective is to:

• Explain the variation in a single dichotomous dependent (grouping) variable using two or more continuous and/or categorical independent variables, and/or to develop a model for predicting the group membership of a sampling entity from the values of the independent variables, then use... Multiple Logistic Regression

Alternatives: DA, CART (UCT)

34

Multivariate Techniques

If the research objective is to:

• Describe the major ecological patterns in one set of (response) variables explainable by another set of (explanatory) variables, then use... Constrained Ordination or MRT

  - Assume linear response function of response variables (species) along linear gradients defined by the explanatory variables (environment)... RDA, CAP
  - Assume unimodal response function of response variables (species) along linear gradients defined by the explanatory variables (environment)... CCA, DCCA; Alt: RDA*
  - Do not assume any response function... MRT

38

Multivariate Techniques

If the research objective is to:

• Determine the magnitude of the ecological relationships between two sets of variables expressed as distance matrices; i.e., dissimilarities between samples, then use... Mantel Test

• Determine the magnitude of the ecological relationships between two sets of variables expressed as distance matrices after accounting for a third set of variables (i.e., Y~X|Z), then use... Partial Mantel Test
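
The simple (not partial) Mantel test can be sketched as a permutation test on the correlation between two distance matrices. This is a minimal illustration, not the course's implementation: the function names, the Pearson correlation, the one-sided p-value, and the default of 999 permutations are all assumptions of this example.

```python
import math
import random


def upper_triangle(d):
    """Flatten the above-diagonal entries of a square distance matrix."""
    n = len(d)
    return [d[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0.0 or vb == 0.0:
        raise ValueError("distances are constant; correlation is undefined")
    return cov / math.sqrt(va * vb)


def mantel(d1, d2, permutations=999, seed=0):
    """Return (Mantel r, one-sided permutation p-value) for two distance matrices."""
    rng = random.Random(seed)
    n = len(d1)
    flat2 = upper_triangle(d2)
    observed = pearson(upper_triangle(d1), flat2)
    count = 1  # the observed arrangement counts as one permutation
    idx = list(range(n))
    for _ in range(permutations):
        rng.shuffle(idx)
        # Permute rows and columns of d1 together, keeping it a valid matrix.
        permuted = [[d1[idx[i]][idx[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), flat2) >= observed:
            count += 1
    return observed, count / (permutations + 1)
```

Rows and columns are shuffled together because the entries of a distance matrix are not independent; shuffling individual distances would misstate the null distribution.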


39

Dependence Techniques

Independent Variables

40

Dependence Techniques

CT      = Contingency tables
SLR     = Simple logistic regression
MLR     = Multiple logistic regression
SRA     = Simple linear regression
MRA     = Multiple linear regression
T-test  = T-test
ANOVA   = Analysis of variance
T2-test = Hotelling’s T2
MANOVA  = Multivariate analysis of variance
DA      = Discriminant analysis
ISA     = Indicator species analysis
RDA     = Redundancy analysis
CCA     = Canonical correspondence analysis
CAP     = Canonical analysis of principal coordinates
COR     = Canonical correlation analysis
UCT     = Univariate classification trees
URT     = Univariate regression trees
MRT     = Multivariate regression trees


41

Dependence Techniques

[Figure: Dependence Techniques grid; axis: Independent Variables; entries: CT, SLR, MLR, SRA, MRA, T-test, ANOVA, T2-test, MANOVA, DA, RDA, CCA, CAP, UCT, URT, MRT, ISA, COR]

42

Advantages of Multivariate Statistics

• Reflect more accurately the true multidimensional, multivariate nature of natural systems.
• Provide a way to handle large data sets with large numbers of variables.
• Provide a way of summarizing redundancy in large data sets.
• Provide rules for combining variables in an "optimal" way.


43

Advantages of Multivariate Statistics

• Provide a solution to the multiple comparison problem by controlling experimentwise error rate.
• Provide a means of detecting and quantifying truly multivariate patterns that arise out of the correlational structure of the variable set.
• Provide a means of exploring complex data sets for patterns and relationships from which hypotheses can be generated and subsequently tested experimentally.
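
The multiple comparison problem can be made concrete with a small calculation. The slides give no formula, so the sketch below assumes k independent univariate tests, each run at level alpha; under that assumption the chance of at least one false rejection is 1 - (1 - alpha)^k. The function names are hypothetical.

```python
def experimentwise_error(alpha, k):
    """Chance of at least one false rejection across k independent tests at level alpha."""
    return 1.0 - (1.0 - alpha) ** k


def per_test_alpha(target, k):
    """Per-test level that holds the experimentwise rate at `target` (Sidak-style)."""
    return 1.0 - (1.0 - target) ** (1.0 / k)


# Ten separate tests at alpha = 0.05 give roughly a 40% chance of a spurious result.
rate = experimentwise_error(0.05, 10)
```

A single multivariate test on all ten variables is run once at the chosen level, which is how it avoids this inflation. Real variables are rarely independent, so the figure is a rough guide rather than an exact rate.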