Section 6.2: Regression, Prediction, and Causation

Correlation and regression are closely connected; however, correlation does not require you to choose an explanatory variable, and regression does.

Both correlation and regression are strongly affected by outliers…

What do you think Hawaii is known for that is definitely an outlier compared to the other 49 states?

Correlation and Regression

Rainfall …

Correlation: If Hawaii is included, r = 0.195; if Hawaii is not included, r = 0.408.

Regression: If Hawaii is included, the least-squares regression line (LSRL) is the solid line; if Hawaii is not included, the LSRL is the dotted line.
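The effect of a single outlier on both r and the LSRL can be sketched in a few lines of Python. The numbers below are hypothetical and are not the state data from the slide; the helper names `correlation` and `lsrl` are also just illustrative.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def correlation(xs, ys):
    """Pearson correlation r of paired data."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def lsrl(xs, ys):
    """Slope and intercept of the least-squares regression line of y on x."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical data: six points close to a line, plus one outlier.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 2.9, 4.2, 4.8, 6.1, 7.0]
outlier_x, outlier_y = 12, 1.0

r_without = correlation(xs, ys)
slope_without, intercept_without = lsrl(xs, ys)

r_with = correlation(xs + [outlier_x], ys + [outlier_y])
slope_with, intercept_with = lsrl(xs + [outlier_x], ys + [outlier_y])

print(f"without outlier: r = {r_without:.3f}, slope = {slope_without:.3f}")
print(f"with outlier:    r = {r_with:.3f}, slope = {slope_with:.3f}")
```

In this made-up data set, one point far from the pattern is enough to weaken the correlation sharply and tilt the line, which is the same kind of effect Hawaii has in the example above.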

The usefulness of the regression line for prediction depends on the strength of the correlation between the variables.

The square of the correlation is the right measure to use…

r squared is a number between 0 and 1. It is the fraction of the variation in y that is explained by the least-squares regression of y on x, so a higher value means the line is more useful for prediction. Example: r squared = 0.972 means the regression line accounts for 97.2% of the variation in y.
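As a rough check on this interpretation, the sketch below computes the fraction of variation explained directly from the residuals and compares it with the square of r. The data are hypothetical and chosen only for illustration.

```python
from math import sqrt

# Hypothetical paired data, chosen only for illustration.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.8, 3.1, 3.9, 5.2, 5.8, 7.1, 8.2, 8.8]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)  # total variation in y
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Correlation and least-squares line of y on x
r = sxy / sqrt(sxx * syy)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Variation left over after fitting the line (sum of squared residuals)
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Fraction of the variation in y that the line accounts for
fraction_explained = 1 - sse / syy

print(f"r squared          = {r ** 2:.4f}")
print(f"fraction explained = {fraction_explained:.4f}")
```

The two printed values agree up to rounding, which is why the square of the correlation is used to describe how well the line explains y.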

Correlation and Regression

A strong relationship between two variables does not always mean that changes in one variable cause changes in the other.

The relationship between two variables is often influenced by other variables lurking in the background.

The best evidence for causation comes from randomized comparative experiments.

The observed relationship between two variables may be due to direct causation, common response, or confounding.

An observed relationship can be used for prediction without worrying about causation as long as the patterns found in the past data continue to hold true.

Causation

There is a strong relationship between cigarette smoking and death rate from lung cancer. Does smoking cigarettes cause lung cancer?

There is a strong association between the availability of handguns in a nation and that nation’s homicide rate from guns. Does easy access to handguns cause more murders?

Which one do you think is a better case for direct causation?

Causation

Does watching television extend your lifespan?

◦ Countries which are rich enough to have televisions are probably also fortunate enough to have better nutrition, clean water, better health care, etc. than poorer nations.

◦ This was called a “nonsense correlation”. The correlation is real, but the conclusion is nonsense.

Causation

Common response: a lurking variable that influences both x and y creates a high correlation even though there is no direct connection between x and y. Example: obesity in children. An explanatory variable can be TV viewing time, but lurking variables may be inheritance from parents, overeating, or lack of physical activity.
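Common response can be illustrated with a small simulation. In the sketch below, the lurking variable z drives both x and y, and x never enters the formula for y. Everything here is simulated under assumed noise levels; it is not data from the obesity example.

```python
import random
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation r of paired data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


rng = random.Random(0)  # fixed seed so the run is repeatable

# z is the lurking variable. x and y each depend on z plus their own
# independent noise; there is no direct link from x to y.
zs = [rng.gauss(0, 1) for _ in range(2000)]
xs = [z + rng.gauss(0, 0.5) for z in zs]
ys = [z + rng.gauss(0, 0.5) for z in zs]

r_xy = correlation(xs, ys)

# In a simulation z is known exactly, so its influence can be subtracted.
# What remains is only the independent noise in x and y.
x_without_z = [x - z for x, z in zip(xs, zs)]
y_without_z = [y - z for y, z in zip(ys, zs)]
r_after_removing_z = correlation(x_without_z, y_without_z)

print(f"r between x and y:           {r_xy:.3f}")
print(f"r after removing z's effect: {r_after_removing_z:.3f}")
```

The first correlation is strong and the second is close to zero, so the association between x and y comes entirely from their shared response to z. With real data the lurking variable is usually not measured, which is why it cannot be removed this easily.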

Causation

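Confounding: the effect of the explanatory variable on y cannot be separated from the effect of a lurking variable. Example: a child may be overweight not because of the child's own poor eating habits but because the parents provide poor food choices (the parents have bad eating habits themselves).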

Causation

If an experiment is not possible, the evidence should meet the following criteria to establish causation:

1. The association between the variables is strong.

2. The association between the variables is consistent.

Evidence for Causation

If an experiment is not possible, the evidence should meet the following criteria to establish causation:

3. Higher doses are associated with stronger responses.

4. The alleged cause precedes the effect in time.

5. The alleged cause is plausible.

Evidence for Causation (cont’d)

Pages 384-391: #6.34-6.37, 6.42

Homework