Basic Statistics Correlation

Basic Statistics Correlation Var Relationships Associations

Page 1: Basic Statistics Correlation Var Relationships Associations

Basic Statistics

Correlation

Page 2: Basic Statistics Correlation Var Relationships Associations

[Diagram: several variables (Var) linked by relationships and associations]

Page 3: Basic Statistics Correlation Var Relationships Associations

Information: do the variables covary?

In Research

Independent variables: X1, X2, X3

Dependent variable: Y

Page 4: Basic Statistics Correlation Var Relationships Associations

The Concept of Correlation

Association or relationship between two variables

X Y

Covary: go together

Co-relate → relation → r

Page 5: Basic Statistics Correlation Var Relationships Associations

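Patterns of Covariation

Correlation: X and Y covary (go together)

Positive correlation: X and Y change in the same direction

Negative correlation: X and Y change in opposite directions

Zero or no correlation: changes in X tell us nothing about changes in Y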

Page 6: Basic Statistics Correlation Var Relationships Associations

Scatter plots allow us to visualize the relationships

Scatter Plots

The chief purpose of the scatter diagram is to study the nature of the relationship between two variables

Linear/curvilinear relationship

Direction of relationship

Magnitude (size) of relationship

Page 7: Basic Statistics Correlation Var Relationships Associations

Scatter Plot A: an illustration of a perfect positive correlation

[Scatter plot of Variable Y against Variable X, each axis running from low to high. Each point represents both the X and Y scores; annotation: exact value.]

Page 8: Basic Statistics Correlation Var Relationships Associations

Scatter Plot B: an illustration of a positive correlation

[Scatter plot of Variable Y against Variable X, each axis running from low to high; annotation: estimated Y value.]

Page 9: Basic Statistics Correlation Var Relationships Associations

Scatter Plot C: an illustration of a perfect negative correlation

[Scatter plot of Variable Y against Variable X, each axis running from low to high; annotation: exact value.]

Page 10: Basic Statistics Correlation Var Relationships Associations

Scatter Plot D: an illustration of a negative correlation

[Scatter plot of Variable Y against Variable X, each axis running from low to high; annotation: estimated Y value.]

Page 11: Basic Statistics Correlation Var Relationships Associations

Scatter Plot E: an illustration of a zero correlation

[Scatter plot of Variable Y against Variable X, each axis running from low to high.]

Page 12: Basic Statistics Correlation Var Relationships Associations

Scatter Plot F: an illustration of a curvilinear relationship

[Scatter plot of Variable Y against Variable X, each axis running from low to high.]
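A curvilinear pattern is one reason to look at the scatter plot before relying on a single number. The sketch below uses made-up inverted-U data (not from the slides) and a small helper, `pearson_r`, written here for illustration; it computes the Pearson r that is defined later in these slides. Y is completely determined by X, yet the linear correlation comes out at essentially zero.

```python
import math

def pearson_r(xs, ys):
    """Pearson r from deviation scores (illustrative helper, not from the slides)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return ss_xy / math.sqrt(ss_x * ss_y)

# Made-up data: an inverted U, so Y rises and then falls as X increases.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [9 - x ** 2 for x in xs]

r_curve = pearson_r(xs, ys)
print(f"r for the inverted-U data: {r_curve:.2f}")
```

The rising half and the falling half cancel each other in the cross-products, which is why r describes only the linear part of a relationship.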

Page 13: Basic Statistics Correlation Var Relationships Associations

The Measurement of Correlation

The degree of correlation between two variables can be described by such terms as “strong,” “low,” “positive,” or “moderate,” but these terms are not very precise.

If a correlation coefficient is computed between two sets of scores, the relationship can be described more accurately.

The Correlation Coefficient

A statistical summary of the degree and direction of the relationship or association between two variables can be computed.

Page 14: Basic Statistics Correlation Var Relationships Associations

Pearson’s Product-Moment Correlation Coefficient r

-1.00    -.50    0    +.50    +1.00

Negative correlation (below 0), no relationship (0), positive correlation (above 0)

Direction of relationship: sign (+ or -)

Magnitude: 0 through +1 or 0 through -1

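$$r = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}{\sqrt{\left(\sum X^2 - \dfrac{(\sum X)^2}{n}\right)\left(\sum Y^2 - \dfrac{(\sum Y)^2}{n}\right)}}$$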

Page 15: Basic Statistics Correlation Var Relationships Associations

The Pearson Product-Moment Correlation Coefficient

Recall that the formula for a variance is:

$$S^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{\sum (X - \bar{X})(X - \bar{X})}{n - 1}$$

If we replaced the second X that was squared with a second variable, Y, it would be:

$$S_{xy} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n - 1}$$

This is called a covariance and is an index of the relationship between X and Y.
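The step from variance to covariance can be sketched in a few lines of Python. The function names and the five pairs of scores below are made up for illustration; the only point is that the covariance of a variable with itself is its variance.

```python
def variance(xs):
    """Sample variance: sum of (X - mean)(X - mean), divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    return sum((x - mean_x) * (x - mean_x) for x in xs) / (n - 1)

def covariance(xs, ys):
    """Sample covariance: the second (X - mean) is replaced by (Y - mean)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Made-up scores, not taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

print("variance of X      :", variance(xs))
print("covariance of X, Y :", covariance(xs, ys))
print("covariance of X, X :", covariance(xs, xs))
```

A positive covariance means the deviations of X and Y tend to share a sign; its size depends on the units of X and Y, which is what the correlation coefficient removes.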

Page 16: Basic Statistics Correlation Var Relationships Associations

Conceptual Formula for Pearson r

$$r = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \, \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$$

This formula may be rewritten to reflect the actual method of calculation
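Before moving to the calculation formula, the conceptual formula can be sketched directly in Python. The function name `pearson_r_conceptual` and the scores are assumptions made for illustration, not part of the slides.

```python
import math

def pearson_r_conceptual(xs, ys):
    """Pearson r from deviation scores, following the conceptual formula."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: sum of cross-products of the deviations.
    cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: square root of the product of the two sums of squared deviations.
    squares_x = sum((x - mean_x) ** 2 for x in xs)
    squares_y = sum((y - mean_y) ** 2 for y in ys)
    return cross_products / math.sqrt(squares_x * squares_y)

# Made-up scores, not taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_conceptual = pearson_r_conceptual(xs, ys)
print(f"r = {r_conceptual:.4f}")
```

This version needs two passes over the data (one for the means, one for the deviations), which is the practical reason a rewritten calculation formula is useful when working by hand.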

Page 17: Basic Statistics Correlation Var Relationships Associations

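Calculation of Pearson r

$$r = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}{\sqrt{\left(\sum X^2 - \dfrac{(\sum X)^2}{n}\right)\left(\sum Y^2 - \dfrac{(\sum Y)^2}{n}\right)}}$$

You should notice that this formula is merely the sum of squares for covariance divided by the square root of the product of the sum of squares for X and Y.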
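The calculation formula needs only five running totals, which the following sketch makes explicit. The function name `pearson_r_from_sums` and the data are made up for illustration.

```python
import math

def pearson_r_from_sums(xs, ys):
    """Pearson r from raw totals: n, sum X, sum Y, sum X^2, sum Y^2, sum XY."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    numerator = sum_xy - (sum_x * sum_y) / n        # sum of squares for covariance
    ss_x = sum_x2 - sum_x ** 2 / n                  # sum of squares for X
    ss_y = sum_y2 - sum_y ** 2 / n                  # sum of squares for Y
    return numerator / math.sqrt(ss_x * ss_y)

# Made-up scores, not taken from the slides.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_from_sums = pearson_r_from_sums(xs, ys)
print(f"r = {r_from_sums:.4f}")
```

No means or deviations are computed, so every total can be accumulated in a single pass through the data.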

Page 18: Basic Statistics Correlation Var Relationships Associations

Formulae for Sums of Squares

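$$SS_x = \sum X^2 - \frac{(\sum X)^2}{n}$$

$$SS_y = \sum Y^2 - \frac{(\sum Y)^2}{n}$$

$$SS_{xy} = \sum XY - \frac{(\sum X)(\sum Y)}{n}$$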

Therefore, the formula for calculating r may be rewritten as:

Page 19: Basic Statistics Correlation Var Relationships Associations

Calculation of r Using Sums of Squares

$$r = \frac{SS_{xy}}{\sqrt{SS_x \, SS_y}}$$
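For readers who want the missing step, the following short derivation (a sketch using only the definition of the mean, $\bar{X} = \sum X / n$) shows why the sums of squares computed from raw totals equal the deviation sums in the conceptual formula, and why the $n - 1$ in the variance and covariance drops out of r.

$$\begin{aligned}
\sum (X - \bar{X})^2 &= \sum X^2 - 2\bar{X}\sum X + n\bar{X}^2 = \sum X^2 - \frac{(\sum X)^2}{n} = SS_x \\
\sum (X - \bar{X})(Y - \bar{Y}) &= \sum XY - \bar{Y}\sum X - \bar{X}\sum Y + n\bar{X}\bar{Y} = \sum XY - \frac{(\sum X)(\sum Y)}{n} = SS_{xy} \\
r &= \frac{SS_{xy}/(n-1)}{\sqrt{\dfrac{SS_x}{n-1} \cdot \dfrac{SS_y}{n-1}}} = \frac{SS_{xy}}{\sqrt{SS_x \, SS_y}}
\end{aligned}$$

The same expansion with Y in place of X gives $SS_y$.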

Page 20: Basic Statistics Correlation Var Relationships Associations

An Example

Suppose that a college statistics professor is interested in how the number of hours that a student spends studying is related to the number of errors the student makes on the mid-term examination. To determine the relationship, the professor collects the following data:

Page 21: Basic Statistics Correlation Var Relationships Associations

The Stats Professor’s Data

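| Student | Hours Studied (X) | Errors (Y) | X² | Y² | XY |
|---|---|---|---|---|---|
| 1 | 4 | 15 | 16 | 225 | 60 |
| 2 | 4 | 12 | 16 | 144 | 48 |
| 3 | 5 | 9 | 25 | 81 | 45 |
| 4 | 6 | 10 | 36 | 100 | 60 |
| 5 | 7 | 8 | 49 | 64 | 56 |
| 6 | 7 | 4 | 49 | 16 | 28 |
| 7 | 7 | 6 | 49 | 36 | 42 |
| 8 | 9 | 2 | 81 | 4 | 18 |
| 9 | 9 | 4 | 81 | 16 | 36 |
| 10 | 12 | 3 | 144 | 9 | 36 |
| Total | ΣX = 70 | ΣY = 73 | ΣX² = 546 | ΣY² = 695 | ΣXY = 429 |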

Page 22: Basic Statistics Correlation Var Relationships Associations

The Data Needed to Calculate the Sum of Squares

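Totals: ΣX = 70, ΣY = 73, ΣX² = 546, ΣY² = 695, ΣXY = 429, with n = 10

$$SS_x = \sum X^2 - \frac{(\sum X)^2}{n} = 546 - \frac{70^2}{10} = 546 - 490 = 56$$

$$SS_y = \sum Y^2 - \frac{(\sum Y)^2}{n} = 695 - \frac{73^2}{10} = 695 - 532.9 = 162.1$$

$$SS_{xy} = \sum XY - \frac{(\sum X)(\sum Y)}{n} = 429 - \frac{(70)(73)}{10} = 429 - 511 = -82$$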
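The totals and sums of squares above can be re-derived from the raw scores. The sketch below enters the ten (hours, errors) pairs from the professor's table; the variable names are chosen here for illustration.

```python
# Hours studied (X) and errors (Y) for the ten students in the example.
hours = [4, 4, 5, 6, 7, 7, 7, 9, 9, 12]
errors = [15, 12, 9, 10, 8, 4, 6, 2, 4, 3]

n = len(hours)
sum_x = sum(hours)
sum_y = sum(errors)
sum_x2 = sum(x * x for x in hours)
sum_y2 = sum(y * y for y in errors)
sum_xy = sum(x * y for x, y in zip(hours, errors))

# Sums of squares from the raw totals.
ss_x = sum_x2 - sum_x ** 2 / n
ss_y = sum_y2 - sum_y ** 2 / n
ss_xy = sum_xy - (sum_x * sum_y) / n

print("totals:", sum_x, sum_y, sum_x2, sum_y2, sum_xy)
print(f"SSx = {ss_x:.1f}, SSy = {ss_y:.1f}, SSxy = {ss_xy:.1f}")
```

Recomputing the column totals this way is also a quick check on the X², Y² and XY columns of a hand-built table.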

Page 23: Basic Statistics Correlation Var Relationships Associations

Calculating the Correlation Coefficient

$$r = \frac{SS_{xy}}{\sqrt{SS_x \, SS_y}} = \frac{-82}{\sqrt{(56)(162.1)}} = -0.86$$

Thus, the correlation between hours studied and errors made on the mid-term examination is -0.86, indicating that more time spent studying is related to fewer errors on the mid-term examination. Hopefully an obvious conclusion, but now a statistical one!
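As a check on the arithmetic, the sketch below computes r twice: once from the three sums of squares reported in the example (56, 162.1 and -82), and once from the raw scores with the deviation-score formula. The variable names are chosen here for illustration.

```python
import math

# Route 1: the sums of squares reported in the example.
ss_x, ss_y, ss_xy = 56, 162.1, -82
r_from_ss = ss_xy / math.sqrt(ss_x * ss_y)

# Route 2: deviation scores computed from the raw data in the table.
hours = [4, 4, 5, 6, 7, 7, 7, 9, 9, 12]
errors = [15, 12, 9, 10, 8, 4, 6, 2, 4, 3]
n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(errors) / n
cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, errors))
dev_x = sum((x - mean_x) ** 2 for x in hours)
dev_y = sum((y - mean_y) ** 2 for y in errors)
r_from_deviations = cross / math.sqrt(dev_x * dev_y)

print(f"r from sums of squares : {r_from_ss:.2f}")
print(f"r from deviation scores: {r_from_deviations:.2f}")
```

Both routes agree to rounding, as they should, since the calculation formula is only an algebraic rewrite of the conceptual one.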

Page 24: Basic Statistics Correlation Var Relationships Associations

Pearson Product-Moment Correlation Coefficient r

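-1    0    +1

Perfect negative correlation (-1), zero correlation (0), perfect positive correlation (+1)

Negative correlation lies below 0; positive correlation lies above 0.

$$r = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}{\sqrt{\left(\sum X^2 - \dfrac{(\sum X)^2}{n}\right)\left(\sum Y^2 - \dfrac{(\sum Y)^2}{n}\right)}}$$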

Page 25: Basic Statistics Correlation Var Relationships Associations

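Numerical values

[Figure: number line of r running from negative correlation through zero correlation to positive correlation, marked with example values of magnitude .35 and .73 and the strength labels perfect, strong, and moderate.]

$$r = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{n}}{\sqrt{\left(\sum X^2 - \dfrac{(\sum X)^2}{n}\right)\left(\sum Y^2 - \dfrac{(\sum Y)^2}{n}\right)}}$$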

Page 26: Basic Statistics Correlation Var Relationships Associations

The Pearson r and Marginal Distribution

The marginal distribution of X is simply the distribution of the X’s; the marginal distribution of Y is the frequency distribution of the Y’s.

[Figure: bivariate normal distribution of X and Y, illustrating a bivariate relationship]

Page 27: Basic Statistics Correlation Var Relationships Associations

The marginal distributions of X and Y are precisely the same shape.

[Figure: marginal distributions of the X variable and the Y variable]

Page 28: Basic Statistics Correlation Var Relationships Associations

Interpreting r, the Correlation Coefficient

Recall that r includes two types of information:

The direction of the relationship (+ or -)

The magnitude of the relationship (0 to 1)

However, there is a more precise way to use the correlation coefficient, r, to interpret the magnitude of a relationship: the square of the correlation coefficient, r².

The square of r tells us what proportion of the variance of Y can be explained by X, or vice versa.
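One way to see what "explained" means is to predict Y from X and measure how much of the variation in Y is left over. The sketch below assumes the prediction is made with a least-squares straight line, which the slides do not spell out, and reuses the hours-studied data from the professor's example; the variable names are chosen here for illustration.

```python
import math

# Data from the stats professor's example: hours studied (X) and errors (Y).
hours = [4, 4, 5, 6, 7, 7, 7, 9, 9, 12]
errors = [15, 12, 9, 10, 8, 4, 6, 2, 4, 3]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(errors) / n

ss_x = sum((x - mean_x) ** 2 for x in hours)
ss_y = sum((y - mean_y) ** 2 for y in errors)   # total variation in Y
ss_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, errors))

r = ss_xy / math.sqrt(ss_x * ss_y)

# Assumption: Y is predicted from X with the least-squares straight line.
slope = ss_xy / ss_x
intercept = mean_y - slope * mean_x
predicted = [intercept + slope * x for x in hours]

# Variation in Y that the prediction leaves unexplained.
ss_residual = sum((y - p) ** 2 for y, p in zip(errors, predicted))
proportion_explained = 1 - ss_residual / ss_y

print(f"r squared            = {r ** 2:.4f}")
print(f"proportion explained = {proportion_explained:.4f}")
```

Under that assumption the two printed values coincide: for the example data, about 74% of the variation in errors is accounted for by hours studied.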

Page 29: Basic Statistics Correlation Var Relationships Associations

How does correlation explain variance?

Suppose you wish to estimate Y for a given value of X.

[Scatter plot of Variable Y against Variable X, each axis running from low to high: an illustration of how the squared correlation accounts for variance in X, r = .7, r² = .49. 49% of the variance is explained; the rest is free to vary.]

Page 30: Basic Statistics Correlation Var Relationships Associations

Now, let’s look at some correlation coefficients and their corresponding scatter plots.

Page 31: Basic Statistics Correlation Var Relationships Associations

[Scatter plot: Current Salary (vertical axis) against Beginning Salary (horizontal axis)]

What is your estimate of r?

r = .87, r² = .76 = 76%

Page 32: Basic Statistics Correlation Var Relationships Associations

[Scatter plot of Y against X; the extracted axis labels are Current Salary and Beginning Salary]

What is your estimate of r?

r = -1.00, r² = 1.00 = 100%

Page 33: Basic Statistics Correlation Var Relationships Associations

[Scatter plot of Y against X; the extracted axis labels are Current Salary and Beginning Salary]

What is your estimate of r?

r = +1.00, r² = 1.00 = 100%

Page 34: Basic Statistics Correlation Var Relationships Associations

[Scatter plot: Beginning Salary (vertical axis) against Months since Hire (horizontal axis)]

What is your estimate of r?

r = .04, r² = .002 = .2%

Page 35: Basic Statistics Correlation Var Relationships Associations

[Scatter plot: Vehicle Weight (lbs.) (vertical axis) against Time to Accelerate from 0 to 60 mph (sec) (horizontal axis)]

What is your estimate of r?

r = -.44, r² = .19 = 19%