Randomized Experiments
Political Analysis Year 2: Week 5
Professor David Kirk
Outline
Lecture builds on Prof. Ruggeri’s lecture on confounding and the purpose of control variables
Potential outcomes framework
The logic of randomized experiments
Examples of experiments to study voting
Bicycle Thefts in Oxford (OX1)
[Figure: bicycle thefts (vertical axis, 0 to 70); annotation: Hypothetical Police Crackdown Begins]
What is the impact of the police intervention?
1) Positive
2) Negative
3) Zero
4) Not enough information
Bicycle Thefts in Oxford (OX1)
[Figure: bicycle thefts (vertical axis, 0 to 70); annotations: Hypothetical Police Crackdown Begins; Impact of Crackdown?]
Hypothetical Example
A gentleman is feeling ill but does not know whether going to the hospital will make him feel better
Assume there are two options (i.e., treatments): he either goes to the hospital (d = 1) or he does not (d = 0)
There are two potential outcomes: his level of health if he went to the hospital (y1) and his level of health if he does not go to the hospital (y0)
To determine the effect of going to the hospital on his health, we take the difference: y1 – y0
Fundamental Problem of Causal Inference
Causal inference is a missing data problem: We cannot simultaneously observe both y1 and y0 for the same person—one of the values is always missing
Group                     | Y1           | Y0
Treatment Group (D = 1)   | Observable   | Unobservable
Control Group (D = 0)     | Unobservable | Observable
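The missing-data view can be made concrete with a small sketch. All names and numbers below are invented for illustration: both potential outcomes are written down for five hypothetical people, and then the one that could never be observed is masked.

```python
# Hypothetical potential outcomes for five people (all numbers are invented).
# y1: health if the person goes to the hospital; y0: health if they do not.
# d: the treatment the person actually received (1 = hospital, 0 = no hospital).
people = [
    {"name": "A", "y1": 4, "y0": 3, "d": 1},
    {"name": "B", "y1": 3, "y0": 3, "d": 0},
    {"name": "C", "y1": 5, "y0": 4, "d": 0},
    {"name": "D", "y1": 2, "y0": 1, "d": 1},
    {"name": "E", "y1": 4, "y0": 4, "d": 0},
]


def observed_row(person):
    """Keep the potential outcome that matches d; mask the other with None."""
    d = person["d"]
    return {
        "name": person["name"],
        "d": d,
        "y1": person["y1"] if d == 1 else None,
        "y0": person["y0"] if d == 0 else None,
    }


# Individual effects y1 - y0 can be computed here only because both values
# were made up. With real data one of them is always missing.
true_effects = [p["y1"] - p["y0"] for p in people]
average_effect = sum(true_effects) / len(true_effects)

observed = [observed_row(p) for p in people]
for row in observed:
    print(row)
print("Average effect (unknowable in practice):", average_effect)
```

Every printed row has exactly one `None`, which is the table above in miniature: the treated reveal only y1 and the controls reveal only y0.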
Counterfactual
The counterfactual represents the state of the world that individuals would have experienced in the absence of the treatment or program (e.g., had they not gone to the hospital)
Problem: Counterfactual cannot be observed
Solution: We need to “mimic” or construct the counterfactual, and the treatment and control groups should be as similar as possible
Back to the Health Example (Naïve Estimate)
[Figure: Self-Reported Health Status of Individuals by Prior Hospitalization (2005 National Health Interview Study; Angrist and Pischke 2009). Groups: Hospital, No Hospital. Vertical axis: health status, 1 to 5 (labels: Poor, Fair, Good, Very Good, Excellent)]
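Why the naïve comparison can mislead is easier to see when it is written out in the potential outcomes notation used earlier (y1, y0, d). The decomposition below is the standard one from Angrist and Pischke (2009). The labels under the braces are added here for illustration.

```latex
\begin{align*}
\underbrace{E[y \mid d = 1] - E[y \mid d = 0]}_{\text{naive difference in means}}
  &= E[y_1 \mid d = 1] - E[y_0 \mid d = 0] \\
  &= \underbrace{E[y_1 - y_0 \mid d = 1]}_{\text{average effect on the treated}}
   + \underbrace{E[y_0 \mid d = 1] - E[y_0 \mid d = 0]}_{\text{selection bias}}
\end{align*}
```

The second line adds and subtracts E[y0 | d = 1]. If people who go to the hospital would have had different health from non-patients even without treatment, the selection bias term is not zero, and the naïve difference does not equal the effect of hospitalization.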
Control by Matching and Regression
Known sources of variation may be (potentially) eliminated by matching treatment and control groups on known characteristics
Similarly with regression, we “control” for known sources of variation between the treatment and control groups by adding variables to our model
Matching ensures that groups compared are alike on specific known and observable characteristics (in principle, everything we have thought of)
But matching and statistical control are limited: matching and control are only possible on observable characteristics
What we really need is a method of making groups alike not only on all known and observable characteristics, but everything we didn’t know about or were unable to observe
Randomized Experiments
What if we had randomly assigned―e.g., by flipping a coin―which ill individuals will receive medical treatment?
Random assignment facilitates causal inference by making groups (e.g., treatment and control groups) similar to each other, on average, on all variables (known and unknown, observable or not)
The only systematic difference between groups is that one group received the treatment (e.g., medical care)
Randomized Experiments cont.
If the groups are the same prior to medical treatment but have different levels of health after treatment, then the difference is because of the treatment
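The contrast between self-selection and a coin flip can be sketched with a small simulation. Everything here is an assumption made for illustration: the constant treatment effect of 0.5, the distribution of baseline health, and the rule by which sicker people select into the hospital.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

TRUE_EFFECT = 0.5  # assumed constant treatment effect (invented)
N = 20_000

# y0 is each person's health without treatment; y1 is health with treatment.
y0 = [random.gauss(3.0, 1.0) for _ in range(N)]
y1 = [h + TRUE_EFFECT for h in y0]


def difference_in_means(d):
    """Observed mean outcome of the treated minus that of the controls."""
    treated = [y1[i] for i in range(N) if d[i] == 1]
    control = [y0[i] for i in range(N) if d[i] == 0]
    return mean(treated) - mean(control)


# Scenario 1: self-selection. People in worse health are more likely to go.
d_selected = [1 if h + random.gauss(0, 1) < 2.5 else 0 for h in y0]

# Scenario 2: random assignment by a fair coin flip.
d_random = [random.randint(0, 1) for _ in range(N)]

naive_estimate = difference_in_means(d_selected)
experimental_estimate = difference_in_means(d_random)

print(f"True effect:            {TRUE_EFFECT:.2f}")
print(f"Self-selected estimate: {naive_estimate:.2f}")
print(f"Randomized estimate:    {experimental_estimate:.2f}")
```

Under these assumptions the self-selected comparison makes the hospital look harmful, because the treated group started out sicker, while the randomized comparison lands close to the assumed effect.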
Example: Voter Turnout
Some key questions in the study of voting:
– What affects voter turnout?
– Is voting a contagious behaviour?
o Or do individuals likely to vote simply seek out friends who are similar (birds of a feather flock together)?
– Can online mobilization increase turnout?
Two Treatments: Informational Message and Social Message
RM Bond et al. Nature 489, 295-298 (2012) doi:10.1038/nature11421
The Effect on Voting
RM Bond et al. Nature 489, 295-298 (2012) doi:10.1038/nature11421
Another voting experiment: Gerber, Green, and Larimer (2008)
To what extent do social norms cause voter turnout?
An individual’s vote is extremely unlikely to be the deciding vote in an election
So why do people vote?
Hypotheses: because of intrinsic rewards (e.g., feel good about following civic duty) and/or extrinsic rewards (social norms, civic duty, and the possibility of shame)
Gerber, Green, and Larimer: Research Design
Sent mailings to 80,000 households in Michigan prior to the 2006 election. Four different types of mailings:
1) Reminded of civic duty
2) Indicated that researchers would assess their turnout through public records
3) Displayed previous record of voter turnout of household members
4) Displayed previous record of voter turnout of household members and neighbours
In addition to the 4 treatment groups, there was a control group (N=99,999) that did not receive any mailing
Treatment 1: Reminder of Civic Duty
Treatment 2: Vote Examined by Researchers
Treatment 3: Display Voting Record
Treatment 4: Display Voting Record (Household and Neighbours)
Results of Gerber et al.
Features of Randomized Experiments
1) Manipulation: to test relationships, the experimenter deliberately introduces change in an independent variable (i.e., the treatment), and then observes the effects of this change
2) Control: the experimenter controls which unit will receive treatment; because experimenters have the ability to control the conditions of observation, experiments can more effectively eliminate the possibility that confounding variables will influence the validity of results
3) Random Assignment: subjects are randomly assigned to treatment and control groups; besides the receipt of a treatment, these two groups are equivalent
Random Assignment
Random assignment: procedure of assignment to group, where each subject has an equal probability of being in either group (treatment or control)
Random does not mean haphazard or unknown
Random assignment allows us to control pre‐experimental differences across treatment and control groups
Through random assignment, attributes of the units (i.e., subjects) that might confound the experimental results will be evenly distributed between the treatment and control groups
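One way to carry out random assignment and then check that pre-experimental attributes are evenly distributed is sketched below. The subjects, their attributes (`age`, `voted_before`), and the helper names are hypothetical. Shuffling and splitting in half is one of several valid procedures that give each subject an equal probability of ending up in either group.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical subjects with two pre-treatment attributes (simulated values).
subjects = [
    {"id": i, "age": random.randint(18, 90), "voted_before": int(random.random() < 0.4)}
    for i in range(10_000)
]


def assign_at_random(units):
    """Shuffle the units; the first half is treatment, the rest is control."""
    shuffled = list(units)
    random.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


treatment, control = assign_at_random(subjects)


def balance(attribute):
    """Treatment-minus-control difference in the mean of a pre-treatment attribute."""
    return mean(u[attribute] for u in treatment) - mean(u[attribute] for u in control)


age_gap = balance("age")
turnout_gap = balance("voted_before")

print(f"Difference in mean age:        {age_gap:.2f}")
print(f"Difference in prior turnout:   {turnout_gap:.3f}")
```

Both gaps should be close to zero. They will rarely be exactly zero in any single draw, which is why balance holds on average rather than perfectly.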
Criteria for Establishing Causal Relationships
1) Association: T and Y are correlated
2) Time ordering: cause (T) precedes the effect (Y)
3) Nonspuriousness: spuriousness occurs when one or more confounding variables (W) change both the independent (T) and dependent (Y) variables, giving the illusion that the independent variable is causing the dependent variable
[Diagram: T, Y, and the confounding variable W]
What else might explain the relationship between chocolate consumption and winning the Nobel Prize?
[Diagram: chocolate consumption per capita, Nobel prizes awarded per capita, and an unknown variable (?)]
Another example of spuriousness
Does shoe size have a causal effect on literacy?
Or is there a “confounding” variable that is correlated with both shoe size and literacy, thus explaining why there is a correlation?
[Diagram: Shoe Size, Literacy, and an Omitted Variable]
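A spurious correlation of this kind is easy to simulate. The sketch below assumes, purely for illustration, that the omitted variable is a child's age and that age drives both shoe size and literacy. The coefficients and noise levels are invented, and shoe size never enters the literacy equation.

```python
import random
from statistics import mean

random.seed(2)  # fixed seed so the sketch is reproducible


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


# Assumed confounder: age in years. Both outcomes depend on age only.
ages = [random.randint(5, 12) for _ in range(5_000)]
shoe_size = [0.8 * a + random.gauss(0, 1) for a in ages]
literacy = [10 * a + random.gauss(0, 10) for a in ages]

overall = pearson(shoe_size, literacy)

# Hold the confounder fixed by looking only at eight-year-olds.
idx = [i for i, a in enumerate(ages) if a == 8]
within_age = pearson([shoe_size[i] for i in idx], [literacy[i] for i in idx])

print(f"Correlation across all children: {overall:.2f}")
print(f"Correlation among 8-year-olds:   {within_age:.2f}")
```

Across all children the correlation is strong, yet among children of the same age it is close to zero, which is what we would expect if the omitted variable, and not shoe size, explains the association.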
Summary
Causal inference is a missing data problem: We cannot simultaneously observe both potential outcomes (y1 and y0) for the same person
Hence, the task of causal inference is to create a reasonable approximation of the counterfactual condition.
Randomization is generally regarded as the “gold standard” for constructing the counterfactual
We will examine other strategies for constructing the counterfactual in subsequent lectures
Final Thought
Last week Prof. Ruggeri encouraged you, in your final essays, to think about possible confounders (i.e., omitted variables) that may affect both your dependent variable and independent/treatment variable
Similarly, for your essays, think about an ideal experiment to answer a given research question. If an experiment were possible, what would it look like?