Correlation Minium, Clarke & Coladarci, Chapter 7

Association

Univariate vs. Bivariate

one variable vs. two variables

When we have two variables we can ask about the degree to which they co-vary

Is there any relationship between an individual’s score on one variable and his or her score on a second variable?

number of beers consumed and reaction time (RT)

Number of hours of studying and score on an exam

Years of education and salary

Parent’s anxiety (or depression) and child anxiety (or depression)*

The correlation coefficient

“a bivariate statistic that measures the degree of linear association between two quantitative variables.”

The Pearson product-moment correlation coefficient

Bivariate Distributions and Scatterplots

Scatter diagram

Graph that shows the degree and pattern of the relationship between two variables

Horizontal axis

Usually the variable that does the predicting (this is somewhat arbitrary)

e.g., price, studying, income, caffeine intake

Vertical axis

Usually the variable that is predicted

e.g., quality, grades, happiness, alertness

Bivariate Distributions and Scatterplots

Steps for making a scatter diagram

Draw axes and assign variables to them

Determine the range of values for each variable and mark the axes

Mark a dot for each person’s pair of scores
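
The three steps can be sketched in code. What follows is a minimal, text-only illustration that assumes non-negative whole-number scores with no two people sharing the same pair. The function name `text_scatter` is hypothetical, and the five (X, Y) pairs are borrowed from the positive-correlation covariance example later in the chapter.

```python
def text_scatter(xs, ys):
    """Return a text scatter diagram for non-negative integer scores."""
    points = set(zip(xs, ys))  # one (X, Y) pair per person
    # Step 2: determine the range of values for each variable.
    x_max, y_max = max(xs), max(ys)
    lines = []
    # Step 1: Y goes on the vertical axis, X on the horizontal axis.
    for y in range(y_max, -1, -1):
        # Step 3: mark a dot ("*") for each person's pair of scores.
        row = " ".join("*" if (x, y) in points else "." for x in range(x_max + 1))
        lines.append(f"{y:>2} | {row}")
    lines.append("   +" + "-" * (2 * (x_max + 1)))
    lines.append("     " + " ".join(str(x % 10) for x in range(x_max + 1)))
    return "\n".join(lines)


x_scores = [9, 7, 5, 3, 1]    # horizontal axis
y_scores = [13, 9, 7, 11, 5]  # vertical axis
diagram = text_scatter(x_scores, y_scores)
print(diagram)
```

A plotting library would normally do this job; the text grid is used here only so that each step stays visible.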

Bivariate Distributions and Scatterplots

Linear correlation

Pattern on a scatter diagram is a straight line

Example above

Curvilinear correlation

More complex relationship between variables

Pattern in a scatter diagram is not a straight line

Example below

Bivariate Distributions and Scatterplots

Positive linear correlation

High scores on one variable matched by high scores on another

Line slants up to the right

Negative linear correlation

High scores on one variable matched by low scores on another

Line slants down to the right

Bivariate Distributions and Scatterplots

Zero correlation

No line, straight or otherwise, can be fit to the relationship between the two variables

Two variables are said to be “uncorrelated”

Bivariate Distributions and Scatterplots

a. Negative linear correlation

b. Curvilinear correlation

c. Positive linear correlation

d. No correlation

The Covariance

Covariance is a number that reflects the degree and direction of association between two variables.

This is the definition:

$$\mathrm{Cov} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n}$$

Note its similarity to the definition of variance ($S^2$):

$$S^2 = \frac{\sum (X - \bar{X})(X - \bar{X})}{n}$$

The logic of the Covariance
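
The similarity between the two definitions can be made concrete: the covariance of a variable with itself is its variance. The sketch below assumes the divide-by-n form used on the slide; the function name `covariance` is chosen here for illustration, and the scores are borrowed from the positive-correlation example that follows.

```python
from statistics import pvariance


def covariance(xs, ys):
    """Covariance as the mean cross-product of deviation scores (divides by n)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


x = [9, 7, 5, 3, 1]
y = [13, 9, 7, 11, 5]

print(covariance(x, y))  # covariance of X with Y
print(covariance(x, x))  # covariance of X with itself ...
print(pvariance(x))      # ... agrees with the variance of X (also dividing by n)
```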

The Covariance

Example (Positive Correlation) (see p. 109)

[Scatterplot of the example data: Y Values (0 to 15) against X Values (0 to 12)]

Person    X    Y    X-Xm    Y-Ym    (X-Xm)(Y-Ym)
A         9   13      4       4         16
B         7    9      2       0          0
C         5    7      0      -2          0
D         3   11     -2       2         -4
E         1    5     -4      -4         16

n = 5, Xm = 5, Ym = 9, sum = 28

$$\mathrm{Cov} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n} = \frac{28}{5} = 5.6$$

The Covariance

Example (Negative Correlation) (see p. 109)

[Scatterplot of the example data: Y Values (0 to 15) against X Values (0 to 12)]

Person    X    Y    X-Xm    Y-Ym    (X-Xm)(Y-Ym)
A         9    5      4      -4        -16
B         7   11      2       2          4
C         5    7      0      -2          0
D         3    9     -2       0          0
E         1   13     -4       4        -16

n = 5, Xm = 5, Ym = 9, sum = -28

$$\mathrm{Cov} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n} = \frac{-28}{5} = -5.6$$

The Covariance

Example (Zero Correlation) (see p. 109)

[Scatterplot of the example data: Y Values (0 to 15) against X Values (0 to 12)]

Person    X    Y    X-Xm    Y-Ym    (X-Xm)(Y-Ym)
A         9   13      4      2.8       11.2
B         7    9      2     -1.2       -2.4
C         5    7      0     -3.2        0.0
D         3    9     -2     -1.2        2.4
E         1   13     -4      2.8      -11.2

n = 5, Xm = 5, Ym = 10.2, sum = 0

$$\mathrm{Cov} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n} = \frac{0}{5} = 0$$
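
The three worked examples can be checked in a few lines. This is a sketch rather than a prescribed method: the helper names `cross_products` and `covariance` are chosen here for illustration, and the scores are the ones tabulated in the positive, negative, and zero examples.

```python
def cross_products(xs, ys):
    """Per-person products of deviation scores, (X - Xm)(Y - Ym)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]


def covariance(xs, ys):
    """Mean of the cross-products (divides by n, as on the slides)."""
    return sum(cross_products(xs, ys)) / len(xs)


x = [9, 7, 5, 3, 1]  # persons A to E; X is the same in all three examples
examples = {
    "positive": [13, 9, 7, 11, 5],
    "negative": [5, 11, 7, 9, 13],
    "zero": [13, 9, 7, 9, 13],
}

results = {}
for label, y in examples.items():
    # Rounding hides tiny floating-point noise in the zero example.
    results[label] = round(covariance(x, y), 10)
    print(label, [round(p, 1) for p in cross_products(x, y)], results[label])
```

Positive cross-products come from people who sit on the same side of both means, and negative ones from people who sit on opposite sides. The sign of the covariance reflects which kind dominates.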

The Pearson r: the Pearson product-moment coefficient of correlation

Correlation coefficient, r, indicates the precise degree of linear correlation between two variables

Can vary from

-1 (perfect negative correlation)

through 0 (no correlation)

to +1 (perfect positive correlation)

r is more useful than Cov because it is independent of the underlying scales of the two variables

if two variables produce an r of .5, for example, r will still equal .5 after any linear transformation of the two variables

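linear transformation: adding, subtracting, dividing or multiplying by a constant (multiplying or dividing by a negative constant reverses the sign of r but leaves its magnitude unchanged)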

e.g., converting Celsius to Fahrenheit: F = 32 + 1.8C

e.g., converting Fahrenheit to Celsius: C = (F - 32) /1.8

$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y}) / n}{S_X S_Y} = \frac{\mathrm{Cov}}{S_X S_Y}$$
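
The scale independence of r can be illustrated with a short sketch. It assumes standard deviations computed with n in the denominator, to match the covariance formula. The function name `pearson_r` is chosen here for illustration, the scores come from the positive-correlation covariance example, and treating X as a Celsius temperature is purely hypothetical.

```python
from math import isclose, sqrt


def pearson_r(xs, ys):
    """Pearson r as Cov / (S_X * S_Y), using n in every denominator."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (s_x * s_y)


x = [9, 7, 5, 3, 1]
y = [13, 9, 7, 11, 5]
r_original = pearson_r(x, y)

# Pretend X is in degrees Celsius and convert it: F = 32 + 1.8C.
x_fahrenheit = [32 + 1.8 * c for c in x]
r_transformed = pearson_r(x_fahrenheit, y)

# Multiplying by a negative constant keeps the size of r but flips its sign.
r_negated = pearson_r([-1 * value for value in x], y)

print(r_original, r_transformed, r_negated)
print(isclose(r_original, r_transformed))
```

For these scores Cov = 5.6 and both variances equal 8, so r = 5.6 / 8 = .70, while the covariance itself would change with the units of X.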

The Pearson r: the Pearson product-moment coefficient of correlation

[Six scatterplot panels illustrating r = .81, r = .46, r = .16, r = -.75, r = -.42, and r = -.18]

Correlation and Causality

When two variables are correlated, there are three possible directions of causality

1st variable causes 2nd

2nd variable causes 1st

Some 3rd variable causes both the 1st and the 2nd

There is inherent ambiguity in correlations
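
The third-variable case can be illustrated with a small simulation. Everything in this sketch is hypothetical: the variables are random numbers from a seeded generator, and X and Y are each built from a shared variable Z plus independent noise, so neither one influences the other.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r as Cov / (S_X * S_Y), using n in every denominator."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (s_x * s_y)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]  # hypothetical third variable
x = [zi + rng.gauss(0, 1) for zi in z]   # X depends on Z only
y = [zi + rng.gauss(0, 1) for zi in z]   # Y depends on Z only

r_xy = pearson_r(x, y)
print(round(r_xy, 2))  # clearly positive, although X and Y never touch each other
```

The correlation between X and Y alone cannot distinguish this situation from one in which X causes Y or Y causes X.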

Factors influencing the Pearson r

Linearity

“To the extent that a bivariate distribution departs from linearity, r will underestimate that relationship.” (p. 121)

Outliers

“Discrepant data points, or outliers, affect the magnitude of r and the direction of the effect depending on the outlier’s location in the scatterplot.” (p. 122)
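
Both points can be illustrated with small made-up cases. In this sketch the curvilinear data (Y = X squared) and the extra outlying point (X = 20, Y = 0) are hypothetical; the five base scores are those of the positive-correlation covariance example, and `pearson_r` is a name chosen here for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r as Cov / (S_X * S_Y), using n in every denominator."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (s_x * s_y)


# Linearity: Y is completely determined by X, yet the relationship is not linear.
x_curve = [-2, -1, 0, 1, 2]
y_curve = [value ** 2 for value in x_curve]
r_curve = pearson_r(x_curve, y_curve)

# Outliers: the same five people, with and without one discrepant sixth point.
x = [9, 7, 5, 3, 1]
y = [13, 9, 7, 11, 5]
r_without = pearson_r(x, y)
r_with = pearson_r(x + [20], y + [0])

print(r_curve)            # r misses the perfect curvilinear relationship
print(r_without, r_with)  # a single point changes both the size and the sign of r
```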

Factors influencing the Pearson r

Restriction of Range

“Other things being equal, restricted variation in either X or Y will result in a lower Pearson r than would be obtained were variation greater.” (p. 122)
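
A small simulation can illustrate the effect. The data below are hypothetical random numbers from a seeded generator, and the cut-offs of -0.5 and 0.5 are arbitrary choices made for this sketch: r is computed once on all cases and once on only those cases whose X falls in a narrow band.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r as Cov / (S_X * S_Y), using n in every denominator."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (s_x * s_y)


rng = random.Random(1)  # fixed seed so the sketch is repeatable
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [xi + rng.gauss(0, 1) for xi in x]  # same linear relationship for every case

r_full = pearson_r(x, y)

# Keep only cases whose X lies in a narrow band (restricted variation in X).
restricted = [(xi, yi) for xi, yi in zip(x, y) if -0.5 < xi < 0.5]
r_restricted = pearson_r([p[0] for p in restricted], [p[1] for p in restricted])

print(round(r_full, 2), round(r_restricted, 2))
```

The rule linking X and Y is identical in both samples; only the spread of X differs.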

Factors influencing the Pearson r

Context

“Because of the many factors that influence r, there is no such thing as the correlation between two variables. Rather, the obtained r must be interpreted in full view of the factors that affect it and the particular conditions under which it was obtained.” (p. 124)

Judging the Strength of Association

$r^2$: proportion of common variance

The coefficient of determination, $r^2$, is the proportion of common variance shared by two variables.
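
As a brief illustration, the coefficient of determination can be computed for the scores of the positive-correlation covariance example. The sketch assumes the divide-by-n formulas used earlier, and `pearson_r` is a name chosen here for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r as Cov / (S_X * S_Y), using n in every denominator."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    s_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    s_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (s_x * s_y)


x = [9, 7, 5, 3, 1]
y = [13, 9, 7, 11, 5]

r = pearson_r(x, y)
r_squared = r ** 2  # proportion of common variance
print(round(r, 2), round(r_squared, 2))
```

Here r = .70, so $r^2$ = .49: about half of the variance is shared, which is noticeably less than the size of r alone might suggest.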

We will talk more about this when we discuss Chapter 8.