
Multivariate Statistics: An Ecological Perspective


1

Multivariate Statistics: An Ecological Perspective

Nature is Complex!

2

Advantages of Multivariate Statistics

• Reflect more accurately the true multidimensional, multivariate nature of natural systems.

• Provide a way to handle large data sets with large numbers of variables.

• Provide a way of summarizing redundancy in large data sets.

• Provide rules for combining variables in an "optimal" way.


3

Advantages of Multivariate Statistics

• Provide a solution to the multiple comparison problem by controlling the experimentwise error rate.

• Provide a means of detecting and quantifying truly multivariate patterns that arise out of the correlational structure of the variable set.

• Provide a means of exploring complex data sets for patterns and relationships from which hypotheses can be generated and subsequently tested experimentally.

4

What is Multivariate Statistics?

Model                                      Techniques
y = x1 + x2 + ... + xj                     Regression, Analysis of Variance, Contingency Tables
y1 + y2 + ... + yi = x                     Multivariate ANOVA, Discriminant Analysis, CART, MRPP, MANTEL
y1 + y2 + ... + yi = x1 + x2 + ... + xj    Canonical Correlation Analysis, Redundancy Analysis, Canonical Correspondence Analysis
y1 + y2 + ... + yi                         Ordination, Cluster Analysis

Multivariate Statistics: the models with more than one y variable (the last three rows).


5

Example 1-Environmental Gradient

Data Matrix

Obs   Canopy Cover   Snag Density   Canopy Height
1     80             0.2            35
2     75             0.5            32
3     72             0.8            28
.     .              .              .
.     .              .              .
12    25             0.6            15

[Figure: 3-Dimensional Data Space; Ordination]

6

Example 1-Environmental Gradient

Data Matrix: the same 12 observations of Canopy Cover, Snag Density, and Canopy Height as on the previous slide.

[Figure: 3-Dimensional Data Space; Cluster Analysis]


7

Example 2-Community Structure

Data Matrix

Sample   Species A   Species B   Species C
1        80          1.2         35
2        75          0.5         32
3        72          0.8         28
.        .           .           .
.        .           .           .
12       25          0.6         15

[Figure: 3-Dimensional Species Space, with samples 1-12 plotted as points; Ordination]

8

Example 2-Community Structure

Data Matrix: the same Sample by Species A, B, and C matrix first shown for Example 2.

[Figure: 3-Dimensional Species Space, with samples 1-12 plotted as points; Cluster Analysis]


9

Example 2-Community Structure

Data Matrix: the same Sample by Species A, B, and C matrix first shown for Example 2.

[Figure: 3-Dimensional Sample Space, with species A, B, and C plotted as points; Ordination]

10

Example 2-Community Structure

Data Matrix: the same Sample by Species A, B, and C matrix first shown for Example 2.

[Figure: 3-Dimensional Ordination Space, with species A, B, and C and the numbered samples plotted together; Ordination]


11

Example 3-Niche Separation

Data Matrix

Ind.   Species   Canopy Cover   Snag Density   Canopy Height
1      A         80             1.2            35
2      A         75             0.5            32
3      A         72             0.8            28
.      .         .              .              .
31     B         35             3.3            15
32     B         75             4.1            25
60     B         15             5.0            3
.      .         .              .              .
61     C         5              2.1            5
62     C         8              3.4            2
90     C         25             0.6            15

[Figure: 1-Dimensional Data Space; species A, B, and C positioned along individual axes (X2 = Snag density, X3 = Canopy height)]

12

Example 3-Niche Separation

Data Matrix: the same Ind. by Species, Canopy Cover, Snag Density, and Canopy Height matrix first shown for Example 3.

[Figure: 2-Dimensional Data Space, with individuals of species A, B, and C plotted as points]



14

Example 3-Niche Separation

Data Matrix: the same Ind. by Species, Canopy Cover, Snag Density, and Canopy Height matrix first shown for Example 3.

[Figure: 3-Dimensional Data Space, with individuals of species A, B, and C plotted as points; axes include X2 = Snag density and X3 = Canopy height]



16

Example 4-Habitat Use

Data Matrix

Obs   Group    Canopy Cover   Snag Density   Canopy Height
1     Use      80             1.2            35
2     Use      75             0.5            32
3     Use      72             0.8            28
4     Use      35             3.3            15
.     .        .              .              .
31    Random   5              2.1            5
32    Random   68             3.4            2
33    Random   25             0.6            15
34    Random   70             1.3            33
.     .        .              .              .

[Figure: 3-Dimensional Data Space, with Use and Random observations plotted as two groups; axes include X2 = Snag density and X3 = Canopy height]


17

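Example 5-Constrained Ordination

[Figure: Data Matrix; 3-D Environment Space (axes include X2 = Snag Density, X3 = Canopy Height); 3-D Species Space (axes include X2 = Species B, X3 = Species C); ordination plot showing species A, B, and C, the numbered samples, and the environmental variables Canopy Cover, Snags, and Canopy Height]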

18

Multivariate Statistics: Key Points

• Multivariate statistics deals with cases involving multiple “dependent” variables, or a single set of variables presumed to depend on some underlying (latent) but unknown factors.

• All multivariate problems can be represented as a two-way data matrix in which rows represent sampling entities and columns represent variables; the internal structure of the matrix with respect to groups of sampling entities or dependence relationships among variables distinguishes among the various multivariate techniques.

• All multivariate problems can be conceptualized geometrically as a data cloud in a P-dimensional data space, where the dimensions (or axes) are defined by the variables of interest; it is the shape, clumping, and dispersion of this cloud that multivariate techniques seek to describe.
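
The data-matrix and data-cloud view above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the matrix holds the four rows visible in the Example 1 slide (the elided rows are not filled in), and the names `data`, `centroid`, and `covariance_matrix` are hypothetical. The centroid locates the cloud, and the covariance matrix is one simple summary of its shape and dispersion.

```python
from statistics import mean

# Hypothetical data matrix: rows are sampling entities, columns are variables.
# Values are the four rows visible in the Example 1 slide; elided rows are
# deliberately left out, so the results are illustrative only.
variables = ["canopy_cover", "snag_density", "canopy_height"]
data = [
    [80.0, 0.2, 35.0],
    [75.0, 0.5, 32.0],
    [72.0, 0.8, 28.0],
    [25.0, 0.6, 15.0],
]

def centroid(matrix):
    """Mean of each column: the center of the data cloud."""
    return [mean(col) for col in zip(*matrix)]

def covariance_matrix(matrix):
    """Sample covariance (n - 1 denominator) between every pair of columns."""
    n = len(matrix)
    center = centroid(matrix)
    centered = [[x - c for x, c in zip(row, center)] for row in matrix]
    p = len(center)
    return [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]

# Each row is one point; the number of columns is the dimension P of the space.
n_entities, n_dims = len(data), len(data[0])
center = centroid(data)
cov = covariance_matrix(data)

print(f"{n_entities} points in a {n_dims}-dimensional data space")
print("centroid:", dict(zip(variables, center)))
print("variances:", [cov[i][i] for i in range(n_dims)])
```

The off-diagonal entries of `cov` carry the correlational structure among variables, which is what the ordination techniques on later slides exploit.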


19

Multivariate Description versus Inference

On the Descriptive Side:

• Provide rules for combining the variables in an optimal way. What is meant by 'optimal' may vary from one technique to the next.

On the Inferential Side:

• Provide explicit control over the experimentwise error rate. Many situations in which multivariate techniques are applied could be analyzed through a series of univariate significance tests.
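
To see why a series of univariate tests raises the experimentwise error rate, the following sketch computes the chance of at least one false positive across several tests. It assumes the tests are independent, which is rarely exactly true for correlated ecological variables, and the function names are hypothetical. The Šidák adjustment shown is one standard univariate remedy, not the multivariate approach the slide refers to.

```python
def experimentwise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across n independent tests,
    each run at significance level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests

def sidak_alpha(target_rate, n_tests):
    """Per-test alpha that holds the experimentwise rate at target_rate
    (Sidak adjustment, again assuming independent tests)."""
    return 1.0 - (1.0 - target_rate) ** (1.0 / n_tests)

# How quickly the experimentwise rate grows with the number of variables tested.
for n_tests in (1, 3, 10):
    rate = experimentwise_error_rate(0.05, n_tests)
    print(f"{n_tests:2d} tests at alpha = 0.05 -> experimentwise rate {rate:.3f}")

# Per-test level needed to keep the overall rate at 0.05 across 10 tests.
print("adjusted per-test alpha:", round(sidak_alpha(0.05, 10), 5))
```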

20

Multivariate Confusion

• Which technique to use?
  - Ordination or cluster analysis?
  - Unconstrained or constrained ordination?
  - Polar ordination, principal components analysis, principal coordinates analysis, correspondence analysis, nonmetric multidimensional scaling?


21

Multivariate Confusion

• Alternative Terminology for Techniques
  - Indirect Gradient Analysis or Unconstrained Ordination?
  - Reciprocal Averaging or Correspondence Analysis?
  - Canonical Ordination or Constrained Ordination?
  - Discriminant Analysis or Canonical Variates Analysis?

22

Multivariate Confusion

• Terminology for Variable Labels Based on Data Type and Measurement Scale
  - Categorical Variable
    - Dichotomous
    - Polytomous
      - Ordinal Scale
      - Nominal Scale
  - Continuous Variable
    - Ratio Scale (true zero)
    - Interval Scale (arbitrary zero)
  - Count Variable


23

Multivariate Confusion

• Terminology for Variable Labels Based on the Relationship with Other Variables
  - Independent Variable
    - Variable presumed to be a cause of any change in a dependent variable; often regarded as fixed, either as in experimentation or because the context of the data suggests they play a causal role in the situation under study.
  - Dependent Variable
    - Variable presumed to be responding to a change in an independent variable; variables free to vary in response to controlled conditions.

24

Multivariate Techniques

Technique                                            Objective
Unconstrained Ordination (PCA, PO, CA, DCA, NMDS)    Extract gradients of maximum variation
Cluster Analysis (family of techniques)              Establish groups of similar entities
Discrimination (MRPP, MANTEL, DA, CART, ...)         Test for or describe differences among groups of entities or predict group membership
Constrained Ordination (RDA, CCA, CAPS, CanCorr)     Extract gradients of variation in dependent variables explainable by independent variables
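
As a rough illustration of "extracting a gradient of maximum variation", the sketch below finds the first principal component of a small data matrix with power iteration, using only the Python standard library. It is a minimal sketch under stated assumptions, not the procedure used in the course: the data are the four rows visible in the Example 1 slide, the variables are left unstandardized (so the variable with the largest variance dominates), and all function names are hypothetical. In practice a library routine would normally be used instead.

```python
import math

# Hypothetical data matrix from the four rows visible in the Example 1 slide
# (columns: canopy cover, snag density, canopy height). Elided rows are not
# filled in, so the numbers below are illustrative only.
data = [
    [80.0, 0.2, 35.0],
    [75.0, 0.5, 32.0],
    [72.0, 0.8, 28.0],
    [25.0, 0.6, 15.0],
]

def center_columns(matrix):
    """Subtract each column mean so the data cloud is centered on the origin."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, means)] for row in matrix]

def covariance(centered):
    """Sample covariance matrix (n - 1 denominator) of centered data."""
    n, p = len(centered), len(centered[0])
    return [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]

def mat_vec(matrix, vector):
    """Matrix-vector product for plain nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]

def first_principal_axis(cov, iterations=500):
    """Power iteration: approximates the eigenvector with the largest eigenvalue.

    Assumes the starting vector is not orthogonal to that eigenvector.
    """
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = mat_vec(cov, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(cov, v)))
    return v, eigenvalue

centered = center_columns(data)
cov = covariance(centered)
axis, eigenvalue = first_principal_axis(cov)

# Share of the total variance captured by the first gradient.
total_variance = sum(cov[i][i] for i in range(len(cov)))
proportion = eigenvalue / total_variance

# Ordination scores: position of each observation along the first gradient.
scores = mat_vec(centered, axis)

print("first axis loadings:", [round(x, 3) for x in axis])
print("proportion of variance:", round(proportion, 3))
print("scores:", [round(s, 2) for s in scores])
```

The `scores` list is the one-dimensional ordination of the observations; plotting observations by such scores is what the ordination figures in the earlier examples depict.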


25

Multivariate Techniques

Technique                                            Dependence Type
Unconstrained Ordination (PCA, PO, CA, DCA, NMDS)    Interdependence
Cluster Analysis (family of techniques)              Interdependence
Discrimination (MRPP, MANTEL, DA, CART, ...)         Dependence
Constrained Ordination (RDA, CCA, CAPS, CanCorr)     Dependence

26

Multivariate Techniques

Technique                                            Data Structure
Unconstrained Ordination (PCA, PO, CA, DCA, NMDS)    One set; >>2 variables
Cluster Analysis (family of techniques)              One set; >>2 variables
Discrimination (MRPP, MANTEL, DA, CART, ...)         Two sets; 1 grouping variable, >>2 discriminating variables
Constrained Ordination (RDA, CCA, CAPS, CanCorr)     Two sets; >>2 dependent variables, >>2 independent variables


27

Multivariate Techniques

Obs    Group   X-set                    Y-set
1      A       a11 a12 a13 ... a1p      b11 b12 b13 ... b1m
2      A       a21 a22 a23 ... a2p      b21 b22 b23 ... b2m
3      A       a31 a32 a33 ... a3p      b31 b32 b33 ... b3m
.      .       .   .   .   ... .        .   .   .   ... .
.      .       .   .   .   ... .        .   .   .   ... .
n      A       an1 an2 an3 ... anp      bn1 bn2 bn3 ... bnm
n+1    C       c11 c12 c13 ... c1p
n+2    C       c21 c22 c23 ... c2p
n+3    C       c31 c32 c33 ... c3p
.      .       .   .   .   ... .
.      .       .   .   .   ... .
N      C       cn1 cn2 cn3 ... cnp

28

Multivariate Techniques

Technique                                            Sample Characteristics
Unconstrained Ordination (PCA, PO, CA, DCA, NMDS)    N (from known or unknown number of populations)
Cluster Analysis (family of techniques)              N (from known or unknown number of populations)
Discrimination (MRPP, MANTEL, DA, CART, ...)         N (from known number of populations) or N1, N2, ... (from separate populations)
Constrained Ordination (RDA, CCA, CAPS, CanCorr)     N (from one population)


29

Multivariate Statistics: Key Points

• Multivariate statistics involves both descriptive and inferential statistics, although most applications are exploratory and descriptive in nature.

• Multivariate statistics includes a broad array of techniques and confusing and inconsistent use of terminology – sorry, no way around this.

• Research questions often warrant the use of more than one technique, as the same technique can often be used to answer different questions and the same question can often be answered with different techniques.