warren-howard
Goals for Lecture 5
Recognize when correlation can be misleading
Realize reasons why two variables may be related, without cause-and-effect
Understand non-statistical considerations that can help establish a causal relationship
Thought Question 1
For each of these, is the correlation higher or lower than it would have been without the outlier?
Thought Question 2
There is a strong correlation in Lisbon between weekly sales of hot chestnuts and weekly sales of facial tissues. Does this mean that chestnuts cause people to sneeze?
Thought Question 3
Research has found that countries with higher average fat intake tend to have higher breast cancer rates. Does this provide evidence that dietary fat is a contributing cause of breast cancer?
Problems with Correlations
Outliers can inflate or deflate correlations
Groups combined inappropriately may mask relationships
[Figure: Hours Worked vs. Annual Earnings. Scatterplot of earnings (0 to 90,000) against hours (0 to 60); r = +.53]
[Figure: Hours Worked vs. Annual Earnings. Same axes; r = +.39]
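The effect of a single outlier on r can be checked numerically. The sketch below uses hypothetical hours and earnings values (not the data behind the plots above) and a small hand-written `pearson_r` helper, both of which are assumptions made for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data (assumed for illustration): weekly hours and annual earnings.
hours = [20, 25, 30, 35, 40, 40, 45, 50]
earnings = [30000, 22000, 41000, 28000, 45000, 33000, 38000, 52000]

r_base = pearson_r(hours, earnings)

# One extra point far from the cloud but in line with the trend inflates r.
r_inflated = pearson_r(hours + [80], earnings + [90000])

# One extra point that runs against the trend deflates r.
r_deflated = pearson_r(hours + [5], earnings + [85000])

print(f"no outlier:         r = {r_base:+.2f}")
print(f"outlier with trend: r = {r_inflated:+.2f}")
print(f"outlier vs. trend:  r = {r_deflated:+.2f}")
```

With only eight or nine points, a single observation can move r a long way, which is why a scatterplot should be inspected before a correlation is trusted.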
Combining Groups Can Deceive
Class correlation of weight to height: r = .69
Men’s correlation of weight to height: r = .58
Women’s correlation of weight to height: r = .21
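A pooled correlation can exceed the correlation inside each group when the groups differ in their averages. The sketch below reproduces that pattern with made-up heights (inches) and weights (pounds); the numbers and the `pearson_r` helper are assumptions for illustration, not the class data.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical groups: within each group, height says little about weight.
women_height = [62, 63, 64, 65, 66]
women_weight = [130, 120, 135, 125, 132]
men_height = [68, 69, 70, 71, 72]
men_weight = [175, 160, 180, 165, 178]

r_women = pearson_r(women_height, women_weight)
r_men = pearson_r(men_height, men_weight)

# Pooling the groups adds the gap between group averages to the pattern,
# so the combined correlation is much stronger than either group's own.
r_all = pearson_r(women_height + men_height, women_weight + men_weight)

print(f"women: r = {r_women:.2f}")
print(f"men:   r = {r_men:.2f}")
print(f"all:   r = {r_all:.2f}")
```

Most of the pooled correlation here comes from the difference between the two group averages, not from any relationship inside either group.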
Remember!
Correlation does not imply causation.
(Churches and liquor stores, shoe size and reading ability)
Correlation of variables
When considering relationships between measurement variables, there are two kinds:
Explanatory (or independent) variable: The variable that attempts to explain or is purported to cause (at least partially) differences in the…
Response (or dependent or outcome) variable
Often, chronology is a guide to distinguishing them (examples: baldness and heart attacks, poverty and test scores)
Some reasons why two variables could be related
The explanatory variable is the direct cause of the response variable
Example: pollen counts and percent of population suffering allergies, intercourse and babies
Some reasons two variables could be related
The response variable is causing a change in the explanatory variable
Example: hotel occupancy and advertising spending, divorce and alcohol abuse
Some reasons two variables could be related
The explanatory variable is a contributing -- but not sole -- cause
Example: birth complications and violence, gun in home and homicide, hours studied and grade, diet and cancer
Some reasons two variables could be related
Confounding variables may exist
Example: happiness and heart disease, traffic deaths and speed limits
Some reasons two variables could be related
Both variables may result from a common cause
Example: SAT score and GPA, hot chocolate and tissues, storks and babies, fire losses and firefighters, WWII fighter opposition and bombing accuracy
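A common cause can be simulated directly. In the sketch below, a hypothetical "cold weather" variable drives both hot chocolate sales and tissue sales, which have no effect on each other; the variable names, the noise model, and the sample size are all assumptions. The partial correlation (the correlation left after accounting for the common cause) is computed with the standard first-order formula.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
n = 2000

# Simulated common cause: how cold each week is (standardized units).
cold = [rng.gauss(0, 1) for _ in range(n)]

# Both responses depend on the cold plus independent noise, never on each other.
hot_chocolate = [c + rng.gauss(0, 1) for c in cold]
tissues = [c + rng.gauss(0, 1) for c in cold]

r_xy = pearson_r(hot_chocolate, tissues)
r_xz = pearson_r(hot_chocolate, cold)
r_yz = pearson_r(tissues, cold)

# First-order partial correlation of x and y, holding z fixed.
partial = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

print(f"hot chocolate vs. tissues:       r = {r_xy:.2f}")
print(f"same, controlling for the cold:  r = {partial:.2f}")
```

The raw correlation is clearly positive, yet it nearly vanishes once the common cause is held fixed. In real data the common cause is often unmeasured, so this check is not always available.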
Some reasons two variables could be related
Both variables are changing over time
Example: divorces and drug offenses, divorces and suicides
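Two series that both trend over time correlate strongly even when they are otherwise unrelated. The sketch below simulates two independent upward-trending series (the names, slopes, and noise levels are assumptions) and compares the correlation of the raw levels with the correlation of the period-to-period changes, which removes the shared trend.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def changes(series):
    """Period-to-period differences, which strip out a linear trend."""
    return [b - a for a, b in zip(series, series[1:])]

rng = random.Random(1)  # fixed seed so the sketch is reproducible
periods = range(1000)

# Two simulated series that rise over time with independent noise.
divorces = [100 + 2 * t + rng.gauss(0, 10) for t in periods]
drug_offenses = [50 + 3 * t + rng.gauss(0, 15) for t in periods]

r_levels = pearson_r(divorces, drug_offenses)
r_changes = pearson_r(changes(divorces), changes(drug_offenses))

print(f"levels:  r = {r_levels:.2f}")
print(f"changes: r = {r_changes:.2f}")
```

The levels are almost perfectly correlated because both rise with time, while the changes show essentially no relationship.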
Some reasons two variables could be related
The association may be nothing more than coincidence
Example: clusters of disease, brain cancer from cell phones
So how can we confirm causation?
The only way to confirm is with a designed experiment. But non-statistical evidence of a possible connection may include:
A reasonable explanation of cause and effect.
A connection that happens under varying conditions.
Potential confounding variables ruled out.
Why?
Orchestra conductors tend to live long lives.
Fewer accidents after speed limits were lowered in 1973 due to the oil embargo.
In the week before the 1994 Northridge earthquake, 149 were admitted for heart attacks. In the week after there were 201.