Spearman’s Ranked Correlation

Spearman’s Ranked Correlation - My Webspace fileswebspace.ship.edu/pgmarr/Geo441/Lectures/Lec 11... · 2014-03-25 · Spearman’s Ranked Correlation. ... where r s is the correlation


Spearman Ranked Correlation

If the data are not normally distributed, one can use ranked data to determine the correlation coefficient:

$$r_s = 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n^3 - n}$$

where $d_i^2$ is the squared difference in ranks for the $i$th observation and $n$ is the sample size. Remember to rank within the columns.
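As a minimal sketch of the formula above, assuming both columns have already been ranked and contain no ties, the coefficient could be computed as follows. The function name `spearman_rs` and the five example ranks are illustrative and do not come from the lecture.

```python
def spearman_rs(rank_x, rank_y):
    """Spearman's r_s = 1 - 6 * sum(d_i^2) / (n^3 - n) from two rank columns."""
    if len(rank_x) != len(rank_y):
        raise ValueError("rank columns must have the same length")
    n = len(rank_x)
    # d_i is the difference between the two ranks of observation i.
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n ** 3 - n)


# Hypothetical ranks for five observations (illustrative only).
rank_x = [1, 2, 3, 4, 5]
rank_y = [2, 1, 4, 3, 5]
rs_example = spearman_rs(rank_x, rank_y)  # sum of d^2 is 4, so r_s = 1 - 24/120
print(rs_example)
```

Identical rank columns give $r_s = 1$ and exactly reversed columns give $r_s = -1$.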


Remember that when the ranks are tied we assign the average rank to those values.

Technically, if there are many tied ranks we could use the Pearson equation using the ranks as observations, since it is a more powerful test.

However, either one could be used; just recognize that the Spearman test is less powerful with many tied ranks.

If there are few or no tied ranks, the Spearman test is more appropriate.
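The average-rank rule for ties can be sketched in a few lines. This is one possible implementation, not the lecture's; the helper name `average_ranks` is an assumption. Ranks run from 1 for the smallest value upward.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) upward; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Mean of the 1-based positions start+1 .. end+1 occupied by the tied run.
        mean_rank = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


example = average_ranks([10, 20, 20, 30])
print(example)  # the two tied 20s share rank 2.5
```

For instance, twenty tied smallest values occupy positions 1 through 20 and therefore each receive rank 10.5, which is the rank assigned to the twenty Asthma values of 1 in the example data.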

Emergency Room Hospital Admissions for Asthma
PM10 Non-Attainment Event: December 8, 1992

Tests of Normality

| | Kolmogorov-Smirnov (a): Statistic | df | Sig. | Shapiro-Wilk: Statistic | df | Sig. |
|---|---|---|---|---|---|---|
| Asthma | .171 | 69 | .000 | .865 | 69 | .000 |
| PM10 | .211 | 69 | .000 | .904 | 69 | .000 |

a. Lilliefors Significance Correction

Asthma Stem-and-Leaf Plot

Frequency Stem & Leaf

20.00 0 . 11111111111111111111

13.00 0 . 2222222233333

10.00 0 . 4445555555

8.00 0 . 66677777

9.00 0 . 888999999

3.00 1 . 111

2.00 1 . 33

2.00 1 . 44

1.00 1 . 6

1.00 1 . 8

Stem width: 10

Each leaf: 1 case(s)

PM10 Stem-and-Leaf Plot

Frequency Stem & Leaf

3.00 Extremes (=<18)

5.00 0 . 33333

5.00 0 . 55555

4.00 0 . 7777

8.00 0 . 99999999

9.00 1 . 111111111

19.00 1 . 3333333333333333333

13.00 1 . 5555555555555

3.00 1 . 777

Stem width: 100

Each leaf: 1 case(s)

Columns (each row below lists two observations side by side): Asthma, PM10, Asthma Rank, PM10 Rank, $d_i$, $d_i^2$

1 17.5 10.5 2 8.5 72.25 1 134 10.5 44 -33.5 1122.25

2 17.5 24.5 2 22.5 506.25 1 134 10.5 44 -33.5 1122.25

2 17.5 24.5 2 22.5 506.25 1 134 10.5 44 -33.5 1122.25

1 37.5 10.5 6 4.5 20.25 1 134 10.5 44 -33.5 1122.25

1 37.5 10.5 6 4.5 20.25 1 134 10.5 44 -33.5 1122.25

3 37.5 31 6 25 625 2 134 24.5 44 -19.5 380.25

3 37.5 31 6 25 625 3 134 31 44 -13 169

4 37.5 35 6 29 841 4 134 35 44 -9 81

1 57 10.5 11 -0.5 0.25 5 134 39.5 44 -4.5 20.25

1 57 10.5 11 -0.5 0.25 6 134 45 44 1 1

2 57 24.5 11 13.5 182.25 7 134 49 44 5 25

5 57 39.5 11 28.5 812.25 7 134 49 44 5 25

9 57 57.5 11 46.5 2162.25 8 134 53 44 9 81

1 76 10.5 15.5 -5 25 9 134 57.5 44 13.5 182.25

1 76 10.5 15.5 -5 25 9 134 57.5 44 13.5 182.25

6 76 45 15.5 29.5 870.25 11 134 62 44 18 324

7 76 49 15.5 33.5 1122.25 13 134 64.5 44 20.5 420.25

1 95 10.5 21.5 -11 121 16 134 68 44 24 576

1 95 10.5 21.5 -11 121 1 154 10.5 60 -49.5 2450.25

1 95 10.5 21.5 -11 121 2 154 24.5 60 -35.5 1260.25

5 95 39.5 21.5 18 324 2 154 24.5 60 -35.5 1260.25

6 95 45 21.5 23.5 552.25 3 154 31 60 -29 841

8 95 53 21.5 31.5 992.25 4 154 35 60 -25 625

9 95 57.5 21.5 36 1296 5 154 39.5 60 -20.5 420.25

13 95 64.5 21.5 43 1849 5 154 39.5 60 -20.5 420.25

1 115 10.5 30 -19.5 380.25 8 154 53 60 -7 49

1 115 10.5 30 -19.5 380.25 9 154 57.5 60 -2.5 6.25

2 115 24.5 30 -5.5 30.25 9 154 57.5 60 -2.5 6.25

2 115 24.5 30 -5.5 30.25 11 154 62 60 2 4

3 115 31 30 1 1 14 154 66.5 60 6.5 42.25

5 115 39.5 30 9.5 90.25 18 154 69 60 9 81

5 115 39.5 30 9.5 90.25 1 173 10.5 68 -57.5 3306.25

11 115 62 30 32 1024 7 173 49 68 -19 361

14 115 66.5 30 36.5 1332.25 7 173 49 68 -19 361

1 134 10.5 44 -33.5 1122.25

$$
\begin{aligned}
r_s &= 1 - \frac{6\sum_{i=1}^{n} d_i^2}{n^3 - n} \\
r_s &= 1 - \frac{(6)(37846.25)}{69^3 - 69} \\
r_s &= 1 - \frac{227077.5}{328440} \\
r_s &= 1 - 0.6914 \\
r_s &= 0.31
\end{aligned}
$$
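The arithmetic of this worked example can be checked directly. The snippet below only plugs in the sum of squared rank differences (37846.25) and the sample size (69) from the lecture; the variable names are illustrative.

```python
n = 69                # sample size
sum_d2 = 37846.25     # sum of squared rank differences from the table

denominator = n ** 3 - n                  # 69^3 - 69
rs = 1 - 6 * sum_d2 / denominator
print(denominator, round(rs, 2))
```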

To test the significance of Spearman’s correlation coefficient, use the t statistic calculated as:

$$t = r_s\sqrt{n - 1}$$

where $r_s$ is the correlation coefficient and $n$ is the sample size.

$$
\begin{aligned}
t &= r_s\sqrt{n - 1} \\
t &= 0.31\sqrt{69 - 1} \\
t &= 0.31(8.246) \\
t &= 2.56 \\
df &= n - 1 = 68
\end{aligned}
$$

$t_{critical}$ = 1.995. Since 2.56 > 1.995, reject Ho.

Asthma hospital admissions and zipcode average PM10 levels during the 1992 Denver non-attainment event were significantly correlated (t = 2.56, 0.02 > p > 0.01, $r_s$ = 0.31).
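The significance test can be sketched as below, using the lecture's statistic $t = r_s\sqrt{n-1}$ and the critical value 1.995 quoted in the lecture for df = 68. The function name `spearman_t` is an assumption made for illustration.

```python
import math


def spearman_t(rs, n):
    """t statistic in the form used in the lecture: t = r_s * sqrt(n - 1)."""
    return rs * math.sqrt(n - 1)


t_stat = spearman_t(0.31, 69)
t_critical = 1.995  # value quoted in the lecture for df = 68

# Reject Ho when the statistic exceeds the critical value in absolute terms.
reject_h0 = abs(t_stat) > t_critical
print(round(t_stat, 2), reject_h0)
```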

SPSS Output

Correlations

| Spearman's rho | | Asthma | PM10 |
|---|---|---|---|
| Asthma | Correlation Coefficient | 1.000 | .287* |
| | Sig. (2-tailed) | . | .017 |
| | N | 69 | 69 |
| PM10 | Correlation Coefficient | .287* | 1.000 |
| | Sig. (2-tailed) | .017 | . |
| | N | 69 | 69 |

\* Correlation is significant at the 0.05 level (2-tailed).

Our result ($r_s$ = 0.31) differs from the SPSS result ($r_s$ = 0.287) because SPSS applies a correction factor for the tied ranks.

Spearman’s r Correction for Tied Ranks

$$r_s = \frac{\dfrac{n^3 - n}{6} - \sum d_i^2 - T_x - T_y}{\sqrt{\left(\dfrac{n^3 - n}{6} - 2T_x\right)\left(\dfrac{n^3 - n}{6} - 2T_y\right)}}$$

where

$$T_x = \sum_i \frac{t_i^3 - t_i}{12}$$

where $t_i$ is the number of tied values in the $i$th group of X ties, and

$$T_y = \sum_i \frac{t_i^3 - t_i}{12}$$

where $t_i$ is the number of tied values in the $i$th group of Y ties.

The same data sorted by Asthma admissions, so that tied values are adjacent.

Columns (each row below lists two observations side by side): Asthma, PM10, Asthma Rank, PM10 Rank, $d_i$, $d_i^2$

1 17.5 10.5 2 8.5 72.25 4 154 35 60 -25 625

1 37.5 10.5 6 4.5 20.25 5 57 39.5 11 28.5 812.25

1 37.5 10.5 6 4.5 20.25 5 95 39.5 21.5 18 324

1 57 10.5 11 -0.5 0.25 5 115 39.5 30 9.5 90.25

1 57 10.5 11 -0.5 0.25 5 115 39.5 30 9.5 90.25

1 76 10.5 15.5 -5 25 5 134 39.5 44 -4.5 20.25

1 76 10.5 15.5 -5 25 5 154 39.5 60 -20.5 420.25

1 95 10.5 21.5 -11 121 5 154 39.5 60 -20.5 420.25

1 95 10.5 21.5 -11 121 6 76 45 15.5 29.5 870.25

1 95 10.5 21.5 -11 121 6 95 45 21.5 23.5 552.25

1 115 10.5 30 -19.5 380.25 6 134 45 44 1 1

1 115 10.5 30 -19.5 380.25 7 76 49 15.5 33.5 1122.25

1 134 10.5 44 -33.5 1122.25 7 134 49 44 5 25

1 134 10.5 44 -33.5 1122.25 7 134 49 44 5 25

1 134 10.5 44 -33.5 1122.25 7 173 49 68 -19 361

1 134 10.5 44 -33.5 1122.25 7 173 49 68 -19 361

1 134 10.5 44 -33.5 1122.25 8 95 53 21.5 31.5 992.25

1 134 10.5 44 -33.5 1122.25 8 134 53 44 9 81

1 154 10.5 60 -49.5 2450.25 8 154 53 60 -7 49

1 173 10.5 68 -57.5 3306.25 9 57 57.5 11 46.5 2162.25

2 17.5 24.5 2 22.5 506.25 9 95 57.5 21.5 36 1296

2 17.5 24.5 2 22.5 506.25 9 134 57.5 44 13.5 182.25

2 57 24.5 11 13.5 182.25 9 134 57.5 44 13.5 182.25

2 115 24.5 30 -5.5 30.25 9 154 57.5 60 -2.5 6.25

2 115 24.5 30 -5.5 30.25 9 154 57.5 60 -2.5 6.25

2 134 24.5 44 -19.5 380.25 11 115 62 30 32 1024

2 154 24.5 60 -35.5 1260.25 11 134 62 44 18 324

2 154 24.5 60 -35.5 1260.25 11 154 62 60 2 4

3 37.5 31 6 25 625 13 95 64.5 21.5 43 1849

3 37.5 31 6 25 625 13 134 64.5 44 20.5 420.25

3 115 31 30 1 1 14 115 66.5 30 36.5 1332.25

3 134 31 44 -13 169 14 154 66.5 60 6.5 42.25

3 154 31 60 -29 841 16 134 68 44 24 576

4 37.5 35 6 29 841 18 154 69 60 9 81

4 134 35 44 -9 81

20 tied ranks

$$T_x = \frac{(20^3-20)+(8^3-8)+(5^3-5)+(3^3-3)+(7^3-7)+(3^3-3)+(5^3-5)+(3^3-3)+(6^3-6)+(3^3-3)+(2^3-2)+(2^3-2)}{12} = 781.5$$

Do this for each set of tied ranks.

$$T_y = \frac{(3^3-3)+(5^3-5)+(5^3-5)+(4^3-4)+(8^3-8)+(9^3-9)+(19^3-19)+(13^3-13)+(3^3-3)}{12} = 883$$

$$
\begin{aligned}
r_s &= \frac{\dfrac{69^3 - 69}{6} - 37846.25 - 781.5 - 883}{\sqrt{\left(\dfrac{69^3 - 69}{6} - 2(781.5)\right)\left(\dfrac{69^3 - 69}{6} - 2(883)\right)}} \\
r_s &= \frac{15229.25}{\sqrt{(53177)(52974)}} \\
r_s &= \frac{15229.25}{53075.4} \\
r_s &= 0.287
\end{aligned}
$$

This matches the SPSS output: Spearman's rho = .287 (Sig. 2-tailed = .017, N = 69), significant at the 0.05 level (2-tailed).
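The tie correction can be sketched as follows. The tie-group sizes are read off the $T_x$ and $T_y$ expressions (Asthma: 20, 8, 5, ...; PM10: 3, 5, 5, ...), and the function names `tie_term` and `spearman_rs_tied` are assumptions made for illustration. Evaluating the PM10 sum from its listed group sizes gives $T_y = 883$; the corrected coefficient rounds to 0.287.

```python
import math


def tie_term(group_sizes):
    """T = sum(t^3 - t) / 12 over the size t of each group of tied values."""
    return sum(t ** 3 - t for t in group_sizes) / 12


def spearman_rs_tied(sum_d2, n, t_x, t_y):
    """Tie-corrected Spearman coefficient."""
    base = (n ** 3 - n) / 6
    numerator = base - sum_d2 - t_x - t_y
    denominator = math.sqrt((base - 2 * t_x) * (base - 2 * t_y))
    return numerator / denominator


# Sizes of each group of tied values, taken from the T_x and T_y expressions.
asthma_ties = [20, 8, 5, 3, 7, 3, 5, 3, 6, 3, 2, 2]
pm10_ties = [3, 5, 5, 4, 8, 9, 19, 13, 3]

t_x = tie_term(asthma_ties)
t_y = tie_term(pm10_ties)
rs_corrected = spearman_rs_tied(37846.25, 69, t_x, t_y)
print(t_x, t_y, round(rs_corrected, 3))
```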

If there are a significant number of tied ranks (as in our example), Pearson’s correlation method can be used on the ranks and the results will be quite close:

Correlations

| Pearson (on ranks) | | AsthmaRank | PM10Rank |
|---|---|---|---|
| AsthmaRank | Pearson Correlation | 1 | .286* |
| | Sig. (2-tailed) | | .017 |
| | N | 69 | 69 |
| PM10Rank | Pearson Correlation | .286* | 1 |
| | Sig. (2-tailed) | .017 | |
| | N | 69 | 69 |

\* Correlation is significant at the 0.05 level (2-tailed).

For comparison, the SPSS Spearman's rho for Asthma and PM10 is .287 (Sig. 2-tailed = .017, N = 69).
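Applying Pearson's method to ranks simply means feeding the two rank columns into the ordinary Pearson formula. The sketch below uses a small hypothetical pair of rank columns with ties (not the lecture data), and the function name `pearson_r` is an assumption.

```python
import math


def pearson_r(x, y):
    """Ordinary Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical rank columns with ties (average ranks already assigned).
rank_x = [1.5, 1.5, 3, 4, 5]
rank_y = [1, 2.5, 2.5, 5, 4]
r_ranks = pearson_r(rank_x, rank_y)
print(round(r_ranks, 3))
```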

Correlation Among Dichotomous Variables: Cramér Coefficient (φ)


The Cramér φ (pronounced fy as in simplify) statistic is used to determine whether correlation exists between two dichotomous variables:

• Yes, no
• Presence, absence
• Etc…

Such analysis is often called contingency correlation since the data are represented by a 2x2 contingency table.

Just as with Pearson’s correlation, Cramér’s correlation coefficient ranges between -1 and 1, with 0 meaning the complete absence of association.

The variables are positively associated if most of the data fall along the left-to-right diagonal cells and negatively associated if most of the data fall along the right-to-left diagonal cells.

The variables are considered to have no association if most of the data fall along the rows or columns.

Cramér’s φ equation:

$$\phi = \frac{f_{11}f_{22} - f_{12}f_{21}}{\sqrt{C_1 C_2 R_1 R_2}}$$

where f are the cell values, C are the column totals, and R are the row totals.

Always set f11 and f22 to be the frequencies of agreement between the two variables.

| | Present | Absent | Row Total |
|---|---|---|---|
| Present | f11 | f12 | R1 |
| Absent | f21 | f22 | R2 |
| Column Total | C1 | C2 | |


Mining Town Gold Silver

Peru Y N

Argentine N Y

Chihuahua Y Y

Tiger Y N

Swanville Y N

Parkville Y Y

Rexford Y N

Swandyke Y N

Spencerville Y N

Lincoln Y N

Dyersville Y N

Montgomery Y Y

Quartzville N Y

Dudley N Y

Silverheels Y N

Tarryall Y N

Hamilton Y N

Park City N N

Holland Y N

Sacramento N Y

Georgetown Y Y

Montezuma Y N

Breckenridge Y Y

Alma N Y

Fairplay Y Y

Delaware Flats Y Y

Buckskin Joe Y N

Swan City Y N

Wild Irishman N Y

Mitchell Cabins Y Y

Saints John Y Y

Silver Plume N Y

Conger Y Y

Waldorf Y Y

Santiago Y Y

Horseshoe N Y

Mudsill N Y

Leavick N Y

Mosquito Y N

Colorado Mining Towns (late 1800s)

Research Question: Is there a correlation between the occurrence of gold and silver among the mining towns?

Ho: There is no association between the occurrence of gold and silver among the mining towns.

Ha: There is association between the occurrence of gold and silver among the mining towns.


Mining Town Gold Silver Cell Code

Peru Y N 21

Argentine N Y 12

Chihuahua Y Y 11

Tiger Y N 21

Swanville Y N 21

Parkville Y Y 11

Rexford Y N 21

Swandyke Y N 21

Spencerville Y N 21

Lincoln Y N 21

Dyersville Y N 21

Montgomery Y Y 11

Quartzville N Y 12

Dudley N Y 12

Silverheels Y N 21

Tarryall Y N 21

Hamilton Y N 21

Park City N N 22

Holland Y N 21

Sacramento N Y 12

Georgetown Y Y 11

Montezuma Y N 21

Breckenridge Y Y 11

Alma N Y 12

Fairplay Y Y 11

Delaware Flats Y Y 11

Buckskin Joe Y N 21

Swan City Y N 21

Wild Irishman N Y 12

Mitchell Cabins Y Y 11

Saints John Y Y 11

Silver Plume N Y 12

Conger Y Y 11

Waldorf Y Y 11

Santiago Y Y 11

Horseshoe N Y 12

Mudsill N Y 12

Leavick N Y 12

Mosquito Y N 21
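Tallying the cell codes gives the four frequencies of the contingency table. The sketch below encodes each town as a two-letter string (gold first, silver second) in the order of the table above; the names `towns` and `CELL_CODE` are illustrative.

```python
from collections import Counter

# Gold/silver occurrence for the 39 towns, in table order (gold first, silver second).
towns = (
    "YN NY YY YN YN YY YN YN YN YN YN YY NY NY YN YN YN NN YN NY "
    "YY YN YY NY YY YY YN YN NY YY YY NY YY YY YY NY NY NY YN"
).split()

# Cell codes as used in the table: row index follows silver, column index follows gold.
CELL_CODE = {"YY": 11, "NY": 12, "YN": 21, "NN": 22}

counts = Counter(CELL_CODE[pair] for pair in towns)
print(counts[11], counts[12], counts[21], counts[22])
```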

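Contingency Table

| | Gold (YES) | Gold (NO) | Row Total |
|---|---|---|---|
| Silver (YES) | 12 | 10 | 22 |
| Silver (NO) | 16 | 1 | 17 |
| Column Total | 28 | 11 | 39 |

$$\phi = \frac{(12)(1) - (10)(16)}{\sqrt{(28)(11)(22)(17)}} = \frac{-148}{339.4} = -0.436$$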
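The φ calculation for a 2x2 table can be sketched as below, with the row and column totals derived from the four cell frequencies. The function name `phi_coefficient` is an assumption made for illustration.

```python
import math


def phi_coefficient(f11, f12, f21, f22):
    """phi = (f11*f22 - f12*f21) / sqrt(C1*C2*R1*R2) for a 2x2 contingency table."""
    r1, r2 = f11 + f12, f21 + f22   # row totals
    c1, c2 = f11 + f21, f12 + f22   # column totals
    return (f11 * f22 - f12 * f21) / math.sqrt(c1 * c2 * r1 * r2)


# Gold/silver table: f11 = 12, f12 = 10, f21 = 16, f22 = 1.
phi = phi_coefficient(12, 10, 16, 1)
print(round(phi, 3))
```

The negative sign reflects that most towns fall in the disagreement cells f12 and f21.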

The probability of Cramér’s φ can be determined by employing the χ² statistic for the contingency table using the equation:

$$\chi^2 = \sum_i \sum_j \frac{(f_{ij} - \hat{f}_{ij})^2}{\hat{f}_{ij}}$$

where $f_{ij}$ are the observed values for row i and column j, and $\hat{f}_{ij}$ are the expected values for row i and column j, which are calculated for each cell as:

$$\hat{f}_{ij} = \frac{(R_i)(C_j)}{n}$$

Degrees of Freedom (DF) = (R-1)(C-1), where R and C here are the numbers of rows and columns.

| OBSERVED | Gold (YES) | Gold (NO) | Row Total |
|---|---|---|---|
| Silver (YES) | 12 | 10 | 22 |
| Silver (NO) | 16 | 1 | 17 |
| Column Total | 28 | 11 | 39 |

| EXPECTED | Gold (YES) | Gold (NO) |
|---|---|---|
| Silver (YES) | 15.8 | 6.2 |
| Silver (NO) | 12.2 | 4.8 |

$$\hat{f}_{11} = \frac{(22)(28)}{39} = 15.8 \qquad \hat{f}_{12} = \frac{(22)(11)}{39} = 6.2 \qquad \hat{f}_{21} = \frac{(17)(28)}{39} = 12.2 \qquad \hat{f}_{22} = \frac{(17)(11)}{39} = 4.8$$

$$
\begin{aligned}
\chi^2 &= \frac{(12 - 15.8)^2}{15.8} + \frac{(16 - 12.2)^2}{12.2} + \frac{(10 - 6.2)^2}{6.2} + \frac{(1 - 4.8)^2}{4.8} \\
\chi^2 &= 0.91 + 1.18 + 2.33 + 3.00 \\
\chi^2 &= 7.42 \\
df &= (2 - 1)(2 - 1) = 1 \\
\chi^2_{Critical} &= 3.841
\end{aligned}
$$

Since 7.42 > 3.841, reject Ho.

There is an inverse association between the occurrence of gold and silver among the mining towns (χ² = 7.42, 0.01 > p > 0.005, φ = -0.436).
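The χ² calculation can be sketched for any table of observed counts, computing each expected value as row total times column total over n. The function name `chi_square` is an assumption. With unrounded expected values the statistic comes out slightly below the hand calculation, which rounds the expected values to one decimal, but it still rounds to 7.42.

```python
def chi_square(observed):
    """Chi-square statistic and degrees of freedom for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, f_ij in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # (R_i)(C_j) / n
            chi2 += (f_ij - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


# Rows: Silver (YES), Silver (NO); columns: Gold (YES), Gold (NO).
chi2, df = chi_square([[12, 10], [16, 1]])
chi2_critical = 3.841  # critical value quoted in the lecture for df = 1
print(round(chi2, 2), df, chi2 > chi2_critical)
```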

SPSS Results

Symmetric Measures

| | | Value | Approx. Sig. |
|---|---|---|---|
| Nominal by Nominal | Phi | -.436 | .006 |
| | Cramer's V | .436 | .006 |
| N of Valid Cases | | 39 | |

a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.

In SPSS, Cramér’s φ is found under:

Analyze > Descriptive Statistics > Crosstabs > Statistics