Introduction to Correlation

Correlation

The news is filled with examples of correlation

• Miles flown in an airplane vs …

• Driving faster than the speed limit vs …

• Women who smoke during pregnancy…

• If you eat only fast food for 30 days…

How Do You Calculate Correlation in Excel?

Make an XY scatterplot of the data, putting one variable on the x-axis and one variable on the y-axis.

Insert a linear trendline on the graph and include the R² value.

Interpret the results.
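For readers who want to see what the trendline and its R² value represent, the following is a minimal sketch in plain Python. It is not what Excel runs internally, only the ordinary least-squares arithmetic behind a straight-line fit. The function name `linear_fit` and the height/weight numbers are made up for illustration and are not taken from the example spreadsheets.

```python
# Illustrative (made-up) data: one variable for the x-axis, one for the y-axis.
heights = [60, 62, 65, 68, 70, 72]
weights = [115, 120, 140, 155, 165, 180]


def linear_fit(xs, ys):
    """Return (slope, intercept, r_squared) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-products around the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (variation left over after the fit) / (total variation in y).
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


slope, intercept, r_squared = linear_fit(heights, weights)
print(f"trendline: y = {slope:.2f}x + {intercept:.2f}, R^2 = {r_squared:.3f}")
```

An R² close to 1 means the points lie close to the trendline; an R² close to 0 means the line explains almost none of the variation in the y values.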

Interpreting the Results

• The higher the R² value, the more closely the data points follow the trendline, and the stronger the linear relationship between the two variables

• If you only have a few data points, then you need a higher R² value in order to conclude there is a correlation

• Crude estimate: if R² > 0.5, most people say there is a correlation; if R² < 0.3, the correlation is essentially non-existent

• R² between 0.3 and 0.5 is a gray area
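The crude estimate above can be written down as a small helper. This is only a sketch of the rule of thumb as stated; the function name `describe_r_squared` and the returned labels are hypothetical, and the function deliberately ignores the number of data points, which the second bullet says also matters.

```python
def describe_r_squared(r_squared):
    """Apply the crude rule of thumb: > 0.5, < 0.3, or the gray area between."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("R^2 must be between 0 and 1")
    if r_squared > 0.5:
        return "correlation"
    if r_squared < 0.3:
        return "essentially no correlation"
    return "gray area"


for value in (0.74, 0.42, 0.10):
    print(value, "->", describe_r_squared(value))
```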

Examples

Look at: CigarettesBirthweight.xls, SpeedLimits.xls, HeightWeight.xls, Grades.xls, WineConsumption.xls, BreastCancerTemperature.xls

How Do We Calculate Correlation in SPSS?

In SPSS, click on Analyze -> Correlate -> Bivariate (two variables).

Select the two columns of data you want to analyze (move them from the left box to the right box).

You can actually pick more than two columns, but stick with bivariate for now.

Make sure the checkbox for Pearson Correlation Coefficients is checked.

Click OK to run the correlation.

You should get an output window something like the one on the following slide.

The correlation between height and weight is 0.861.

The Pearson Correlation value is not the same as Excel’s R-squared value; Pearson’s value can be positive or negative.
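For reference, the two numbers are closely related. The block below gives the standard textbook definition of Pearson's r; the symbols (x_i and y_i for the paired data values, x-bar and y-bar for their means, n for the number of pairs) are notation introduced here, not taken from the slides. For a straight-line trendline with an intercept, R² is simply the square of r, which is why R² cannot be negative and loses the sign.

```latex
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
         {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}},
\qquad
R^2 = r^2
```

Applied to the height and weight result above, r = 0.861 corresponds to R² = 0.861² ≈ 0.741.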

Positive and Negative Correlation (Pearson’s)

Positive correlation: as the values of one variable increase, the values of a second variable increase (values from 0 to 1.0).

Negative correlation: as the values of one variable increase, the values of a second variable decrease (values from 0 to -1.0).
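The sign behavior described above can be checked with a short sketch that computes Pearson's correlation by hand. The helper name `pearson_r` and all three data lists are made up for illustration; SPSS reports the same quantity, but this is not its implementation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable never changes")
    return sxy / math.sqrt(sxx * syy)


# Made-up data: y_rising tends to go up with x, y_falling tends to go down.
x = [1, 2, 3, 4, 5]
y_rising = [52, 60, 71, 75, 88]
y_falling = [90, 85, 70, 72, 60]

print(f"rising:  r = {pearson_r(x, y_rising):+.3f}")   # positive value
print(f"falling: r = {pearson_r(x, y_falling):+.3f}")  # negative value
```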

Interpreting Pearson’s Correlation

Positive and Negative Correlation

There is a negative correlation between TV viewing and class grades: students who spend more time watching TV tend to have lower grades (or, students with higher grades tend to spend less time watching TV).

Positive and Negative Correlation

[Figure: two example plots, one labeled "Positive correlation" and one labeled "Negative correlation"]

Positive and Negative Correlation

When looking for correlation, a positive correlation is not necessarily stronger than a negative correlation. The strength depends on how far the value is from 0, not on its sign.

Which correlation (Pearson) is the strongest?

-0.34, 0.72, -0.81, 0.40, -0.12
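One way to answer the question is to compare the values by their distance from 0 and ignore the sign. A minimal sketch using the five values listed above (the variable names are arbitrary):

```python
# The five Pearson values from the question above.
candidates = [-0.34, 0.72, -0.81, 0.40, -0.12]

# Strength is the absolute value, so compare with key=abs.
strongest = max(candidates, key=abs)
ranked = sorted(candidates, key=abs, reverse=True)

print("strongest:", strongest)          # -0.81
print("ranked by strength:", ranked)    # [-0.81, 0.72, 0.4, -0.34, -0.12]
```

The strongest correlation in the list is negative, which is the point of the question.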

Notice: R² is the same here

You can’t always tell relationships just by looking at the graph.

What Can We Conclude?

If two variables are correlated, then we can predict one based on the other.

But correlation does NOT imply cause!

It might be the case that having more education causes a person to earn a higher income. It might be the case that having a higher income allows a person to go to school more. There could also be a third variable. Or a fourth. Or a fifth…

What Can We Conclude?

Causality – one variable, say A, actually causes the change in B.

Sheer coincidence – A and B really do not have anything to do with each other but happen to go up or down simultaneously.

Common underlying cause or causes – A is correlated with B, but there is a third factor C (the common underlying cause) that causes the changes in both A and B. This is the most important case to keep in mind.

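Example: as ice cream sales go up, so do crime rates. The explanation usually given is that hot weather is the common underlying cause: both tend to rise when temperatures are high, and neither causes the other.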
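A small simulation can make the common-underlying-cause idea concrete. Everything in this sketch is an assumption made for illustration: temperature is chosen as the third factor C, all numbers are generated rather than measured, and the names `temperature`, `ice_cream_sales`, and `crime_reports` are hypothetical. Neither generated series depends on the other; both depend only on temperature plus random noise.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


random.seed(0)  # fixed seed so the run is repeatable

# C: the common underlying cause (simulated daily temperatures).
temperature = [random.uniform(0, 35) for _ in range(200)]
# A and B: each driven by C plus its own independent noise.
ice_cream_sales = [10 * t + random.gauss(0, 20) for t in temperature]
crime_reports = [2 * t + random.gauss(0, 5) for t in temperature]

# A and B look strongly correlated even though neither affects the other.
r_raw = pearson_r(ice_cream_sales, crime_reports)

# Subtract the temperature effect (known here only because we built the data).
ice_cream_leftover = [s - 10 * t for s, t in zip(ice_cream_sales, temperature)]
crime_leftover = [c - 2 * t for c, t in zip(crime_reports, temperature)]
r_adjusted = pearson_r(ice_cream_leftover, crime_leftover)

print(f"correlation of A and B:                   {r_raw:.2f}")
print(f"after removing the effect of temperature: {r_adjusted:.2f}")
```

With real data the effect of C is not known in advance, so separating it out takes more careful analysis than the simple subtraction used here.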
