LSP 121 Introduction to Correlation



LSP 121

Introduction to Correlation


Correlation

• The news is filled with examples of correlation
  – If you eat so many helpings of tomatoes…
  – One alcoholic beverage a day…
  – Driving faster than the speed limit…
  – Women who smoke during pregnancy…

• Often, we can quantify correlation


How Do You Calculate Correlation in Excel?

• Make an XY scatterplot of the data, putting one variable on the x-axis and one variable on the y-axis.
  1. Select the two columns you wish to graph
  2. Choose Insert -> Scatter

• Insert a linear trendline on the graph and include the R² value
  3. Click one of the data points on the chart
  4. Right-click, choose Add Trendline
  5. Check boxes/buttons for: Linear, Display Equation, Display R²

• Interpret the results

• Try it with CigarettesBirthweight.xls
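For readers who want to see what the trendline and its R² value represent, here is a minimal Python sketch of the same calculation. The numbers are made up for illustration and are not the contents of CigarettesBirthweight.xls; the function name `linear_trendline` is also just an illustrative choice.

```python
# Hypothetical data for illustration only; these are NOT the values
# from CigarettesBirthweight.xls.
cigarettes_per_day = [0, 5, 10, 15, 20, 25, 30]
birth_weight_lb = [7.9, 7.6, 7.4, 6.9, 6.8, 6.3, 6.1]


def linear_trendline(xs, ys):
    """Return (slope, intercept, r_squared) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


slope, intercept, r_squared = linear_trendline(cigarettes_per_day, birth_weight_lb)
print(f"y = {slope:.4f}x + {intercept:.4f}, R^2 = {r_squared:.3f}")
```

The printed equation and R² play the same role as the "Display Equation" and "Display R²" options on the chart.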


[Chart: Number of Smokes/day and Birth Weight]


Interpreting the Results

• The higher the R² value, the greater the likelihood that there is a correlation

• Crude estimate: R² > 0.5
  – Most people say there is a correlation

• R² < 0.3
  – Most say correlation is essentially non-existent

• R² between 0.3 and 0.5?
  – Gray area – further analysis is needed

• If you only have a few data points, then you need a higher R² value in order to decide whether or not there is a correlation
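The rule of thumb above can be written as a small Python function. This is only a sketch: the function name is hypothetical, the slide does not say which side the exact values 0.3 and 0.5 fall on (here they are treated as gray area, which is an assumption), and the caveat about having only a few data points is not modeled.

```python
def interpret_r_squared(r_squared):
    """Apply the crude rule of thumb from the slide to an R^2 value."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("R^2 must be between 0 and 1")
    if r_squared > 0.5:
        return "correlation"
    if r_squared < 0.3:
        return "essentially no correlation"
    # Assumption: exactly 0.3 and exactly 0.5 count as gray area.
    return "gray area: further analysis needed"


for value in (0.74, 0.41, 0.08):
    print(value, "->", interpret_r_squared(value))
```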


Examples: Are they correlated?

• Look at:
  – CigarettesBirthweight.xls
  – SpeedLimits.xls (under Older Data)
  – HeightWeight.xls
  – Grades.xls (under Older Data)
  – WineConsumption.xls (under Older Data)
  – BreastCancerTemperature.xls


How Do We Calculate Correlation in SPSS/PASW?

• In SPSS, click on Analyze -> Correlate -> Bivariate

• Select the two columns of data you want to analyze (move them from the left box to the right box)

• You can actually pick more than two columns, but we’ll keep it simple for now


• Make sure the checkbox for ‘Pearson Correlation Coefficients’ is checked

• Click OK to run the correlation

• You should get an output window something like the following slide


The correlation between height and weight is 0.861

The Pearson Correlation value is not the same as Excel’s R-squared value; it can be positive or negative
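To make the difference between the two numbers concrete: for a straight-line trendline with a single x variable, R² is the square of the Pearson value, so squaring throws away the sign. The sketch below uses made-up heights and weights (not the contents of HeightWeight.xls), and `pearson_r` is an illustrative helper, not an SPSS or Excel function.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data for illustration only.
heights_in = [60, 62, 64, 66, 68, 70, 72]
weights_lb = [115, 120, 136, 140, 155, 160, 178]

r = pearson_r(heights_in, weights_lb)
print(f"Pearson r = {r:.3f}, r squared = {r * r:.3f}")

# Flipping the direction of one variable flips the sign of r but not r squared.
r_flipped = pearson_r(heights_in, [-w for w in weights_lb])
print(f"Pearson r = {r_flipped:.3f}, r squared = {r_flipped ** 2:.3f}")

# The slide's value of 0.861 corresponds to an R-squared of about 0.741.
print(round(0.861 ** 2, 3))  # 0.741
```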


Positive and Negative Correlation

• Positive correlation: as the values of one variable increase, the values of a second variable increase (values from 0 to 1.0)

• Negative correlation: as the values of one variable increase, the values of a second variable decrease (values from 0 to -1.0)


Positive vs. Negative Correlation

• There is a negative correlation between TV viewing and class grades—students who spend more time watching TV tend to have lower grades (or, students with higher grades tend to spend less time watching TV).

• There is a negative correlation between exercise and heart disease

• There is a positive correlation between exercise and self-esteem


Positive and Negative Correlation on a graph

[Figure: two graphs, one labeled "Positive correlation" and one labeled "Negative correlation"]


How would you classify these correlations?

[Figure: three graphs, labeled "Negative correlation", "Positive correlation", and "NO correlation"]


Positive and Negative Correlation

• When looking for correlation, positive correlation is not necessarily greater than negative correlation

• Which correlation is the greatest? -.34 .72 -.81 .40 -.12
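One way to check an answer to the question above: the strength of a correlation is its distance from zero, so compare absolute values and ignore the sign. A minimal Python sketch using the five values from the slide:

```python
correlations = [-0.34, 0.72, -0.81, 0.40, -0.12]

# Strength is the distance from zero, so compare absolute values.
strongest = max(correlations, key=abs)
ranked = sorted(correlations, key=abs, reverse=True)

print(strongest)  # -0.81
print(ranked)     # [-0.81, 0.72, 0.4, -0.34, -0.12]
```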


Correlation vs Causation

• Correlation: Two concepts are related in some way.

• Causation: Changing one of the factors also causes a change in the other factor.
  – e.g.: Smoking and cancer are correlated. They also have a causal relationship.
    • If you do something to increase smoking, you increase the chance of cancer
  – e.g.: Ice cream sales and crime rates also have a correlation. However, they do NOT have a causal relationship. (Can you think why they are correlated?)
    • If you do something to increase ice cream sales, you do not see an increase in crime


What Can We Conclude?

• If two variables are correlated, then we can predict one based on the other

• But correlation does NOT imply causation!

• It might be the case that having more education causes a person to earn a higher income. It might be the case that having higher income allows a person to go to school more. There could also be a third variable. Or a fourth. Or a fifth…


Causation (aka ‘Causality’)

• Causation: One variable, A, actually causes a change in another variable, B.

• Here are some examples of correlations that are also causal:
  – Increased smoking -> Increased likelihood of lung cancer
  – Increased exercise -> Decreased likelihood of heart disease

• Key point: Many, many, many things in life have correlations. But this does not mean that they have causation.
  – See next slide


Correlation does NOT imply causation!

• OFTEN (very often!), two items that are correlated are falsely assumed to have a causal relationship.

• Usually, the reason for falsely assuming causation is the presence of a common underlying factor. That is, A may be correlated with B, but this is due to some other factor, C.

• Example: None of these three correlations is a causal relationship. Can you identify the other factor?
  – As ice cream sales go up, so do crime rates
    • Summer! Crime always goes up in the summer. Not surprisingly, more people buy ice cream in the summer as well.
  – People who wear top-hats live longer (An actual study from the Victorian era)
    • Income. Wealthier people wear top hats and can also afford better health care, medicines, doctors, etc.
  – Hormone therapy for breast cancer decreases likelihood of heart disease
    • As with the previous example: socioeconomic status. Hormone therapy in and of itself increases the likelihood of heart disease! However, people who are wealthier are more likely to have better general medical care, resulting in early detection of breast cancer, proper treatments, etc. For this reason, they are also more likely to be more educated about heart disease (eat better, exercise more, smoke less, etc.). So even though hormone therapy causes heart disease, on the whole, the majority of people on this therapy tend to have less heart disease.
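The "common underlying factor" idea can be tried out with a small simulation. The sketch below is built entirely on made-up assumptions: temperature plays the role of factor C, and both ice cream sales (A) and crime (B) are generated from temperature plus random noise, with no link between A and B themselves. All names and numbers are hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(121)  # fixed seed so every run gives the same numbers

# Made-up model: temperature (C) drives both A and B.
# Ice cream sales (A) and crime (B) never influence each other.
temperature = [rng.uniform(0, 35) for _ in range(5000)]
ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temperature]
crime = [1.5 * t + rng.gauss(0, 10) for t in temperature]

r_all_days = pearson_r(ice_cream, crime)

# Hold the common factor nearly constant: keep only days between 20 and 22 degrees.
similar_days = [(a, b) for t, a, b in zip(temperature, ice_cream, crime)
                if 20 <= t < 22]
r_similar_days = pearson_r([a for a, _ in similar_days],
                           [b for _, b in similar_days])

print(f"all days:     r = {r_all_days:.2f}")
print(f"similar days: r = {r_similar_days:.2f}")
```

Across all days the two series look strongly related; among days with nearly the same temperature the relationship largely disappears, which is what one would expect if C, not A, is behind the changes in B.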


Causation or not?

• What do you think of this example?
  – Studies have demonstrated a clear correlation between ease of faculty grading and faculty evaluations. That is, faculty who teach less challenging courses routinely receive better evaluations.


Correlation vs. Causation

• Do not confuse correlation with causation.
  – Just because two things are correlated (e.g. height and weight) does not mean that there is a causal relationship.
  – In other words, making a change in A will not necessarily cause a change in B
  – Giving somebody a top-hat will not make them live longer (see the earlier top-hat example).

• This is an example of where there is a correlation, but there is not causation.

• Very important point – expect 1-2 exam questions on this idea!


What Can We Conclude?

• Sheer coincidence – the two variables have nothing in common, but they produce a strong R or R² value

• Both variables are changing over time – divorce rates are going up and so are drug offenses. Is an increase in divorce causing more people to use drugs (and get caught)?
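The "both variables are changing over time" case can also be simulated. In this sketch the two series are invented: each simply drifts upward month by month with its own random noise, and neither is computed from the other. The starting levels, growth rates, and noise sizes are arbitrary assumptions, not real divorce or drug-offense figures.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def changes(series):
    """Month-to-month change: each value minus the one before it."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


rng = random.Random(121)  # fixed seed so every run gives the same numbers
months = range(240)       # 20 years of made-up monthly figures

# Two unrelated series that both happen to trend upward over time.
divorces = [100 + 0.3 * m + rng.gauss(0, 5) for m in months]
drug_offenses = [50 + 0.2 * m + rng.gauss(0, 5) for m in months]

r_levels = pearson_r(divorces, drug_offenses)
r_changes = pearson_r(changes(divorces), changes(drug_offenses))

print(f"raw values:             r = {r_levels:.2f}")
print(f"month-to-month changes: r = {r_changes:.2f}")
```

The raw values correlate strongly only because both rise with time. Comparing month-to-month changes removes the shared trend, and in this made-up setup little correlation is left.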