Selection Bias with Linear Probability Models (LPM)
Suneel Chatla, Galit Shmueli
Institute of Service Science, National Tsing Hua University, Taiwan
Outline
• Introduction to self-selection
• Popular methods for selection bias correction
  • Two-step methods (2SLS)
  • Matching methods (PSM)
• Incorporating LPM into 2SLS and PSM
• Simulation study
• Conclusions
Quasi-experiments
Like randomized experimental designs, quasi-experiments test causal hypotheses, but they lack random assignment (subjects self-select into treatment).
Pros
• Usable when random assignment is impractical and/or unethical
• Easier to set up, greater external validity
• Minimize threats to ecological validity
Cons
• Estimates are subject to contamination by confounding variables (biased)
• No total control over extraneous variables
Why do we need quasi-experiments?
Two Methods for Addressing Selection Bias
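Two-step methods: Heckman vs. Olsen
Heckman (1977), probit selection model
• Stage 1, selection model (T): $E[T \mid X] = \Phi(X\gamma)$
• Adjustment: $\mathrm{IMR} = \dfrac{\phi(X\gamma)}{\Phi(X\gamma)}$
• Stage 2, outcome model (Y): $Y = X\beta + \delta\,\mathrm{IMR} + \varepsilon$
Olsen (1980), LPM selection model
• Stage 1, selection model (T): $E[T \mid X] = X\gamma$
• Adjustment: $\lambda = X\gamma - 1$
• Stage 2, outcome model (Y): $Y = X\beta + \delta\lambda + \varepsilon$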
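The two adjustment terms above differ only in how the first-stage fit is transformed before it enters the outcome model. A minimal sketch of both, using only the standard library; the function names are hypothetical and the first-stage fitted values are assumed to come from a probit or LPM fitted elsewhere:

```python
import math


def normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def heckman_imr(probit_index):
    """Heckman's adjustment: inverse Mills ratio phi(X*gamma) / Phi(X*gamma).

    probit_index is the fitted linear index X*gamma from a probit selection model.
    Note: for very negative indices Phi underflows towards 0, so a production
    version would need a numerically safer formula.
    """
    return normal_pdf(probit_index) / normal_cdf(probit_index)


def olsen_lambda(lpm_fitted_value):
    """Olsen's adjustment: fitted LPM value X*gamma minus 1."""
    return lpm_fitted_value - 1.0
```

For example, `heckman_imr(0.0)` is about 0.798 and `olsen_lambda(0.25)` is -0.75; either value would then be added as an extra regressor in the stage 2 outcome model.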
Heckman’s
• Bivariate normality
• Inconsistent second stage standard errors
• Identification issues
• Expensive computation
• Convergence issues
Olsen’s
• Linear conditional expectation
• Inconsistent second stage standard errors
• Identification issues
• Cheaper computation
• No convergence issues
In Short: For Continuous Outcome
Open Research Questions
1. Selection model with unequal sample sizes (treat/control) - continuous outcome
2. Binary outcome model – coefficient consistency
3. Selection model with unequal sample sizes (treat/control) + binary outcome model with unequal sample sizes
Simulation Design
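Selection model: $S^* = -0.5 + 0.5x_1 - 0.5x_2 + 1.5x_3 - x_4 + \omega$
$T = \begin{cases} 1 & \text{if } S^* > 0 \\ 0 & \text{otherwise} \end{cases}$
Continuous outcome model: $Y = 0.5 - 1.5x_1 + 0.5x_2 + x_3 + \varepsilon$
Binary outcome: $Y_B = \begin{cases} 1 & \text{if } Y > 0 \\ 0 & \text{otherwise} \end{cases}$
Errors: $(\omega, \varepsilon) \sim N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0.5 & -0.4 \\ -0.4 & 0.5 \end{pmatrix}\right)$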
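The design above can be sketched as a small data generator. This is an illustration under stated assumptions, not the authors' simulation code: the covariates are assumed to be independent standard normals (the slides do not give their distribution), the bivariate normal is assumed to be the joint distribution of the two error terms, and the function name is hypothetical.

```python
import math
import random


def simulate_selection_data(n=1000, seed=0):
    """Draw n observations from the selection and outcome models on the slide."""
    rng = random.Random(seed)
    # Cholesky factor of the error covariance [[0.5, -0.4], [-0.4, 0.5]].
    l11 = math.sqrt(0.5)
    l21 = -0.4 / l11
    l22 = math.sqrt(0.5 - l21 ** 2)
    rows = []
    for _ in range(n):
        # ASSUMPTION: x1..x4 are independent N(0, 1) draws.
        x1, x2, x3, x4 = (rng.gauss(0.0, 1.0) for _ in range(4))
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        omega = l11 * z1              # selection error
        eps = l21 * z1 + l22 * z2     # outcome error, covariance -0.4 with omega
        s_star = -0.5 + 0.5 * x1 - 0.5 * x2 + 1.5 * x3 - x4 + omega
        t = 1 if s_star > 0 else 0
        y = 0.5 - 1.5 * x1 + 0.5 * x2 + x3 + eps
        y_bin = 1 if y > 0 else 0     # binary outcome at cut-off 0
        rows.append({"x": (x1, x2, x3, x4), "t": t, "y": y, "y_bin": y_bin})
    return rows
```

Changing the intercept of the selection equation, or the cut-off used for `y_bin`, is one way to vary the treat/control and outcome imbalance studied in the research questions.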
Q1: Continuous outcome: treat/control sample size ratio has no influence
Q2: Binary outcome - coefficients inconsistent
What about marginal effects?
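Coefficients from a probit and an LPM are on different scales, which is why marginal effects are the natural basis for comparison. A minimal sketch, assuming the binary outcome is fitted once with a probit and once with an LPM (the slides do not spell out the exact estimators); function names are hypothetical:

```python
import math


def normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def probit_average_marginal_effects(rows, beta):
    """Average marginal effects for a probit: mean of phi(x'beta) times beta_j.

    rows: list of covariate vectors, each including 1.0 for the intercept.
    beta: fitted probit coefficients in the same order.
    The entry for the intercept has no marginal-effect interpretation.
    """
    scale = sum(
        normal_pdf(sum(b * x for b, x in zip(beta, row))) for row in rows
    ) / len(rows)
    return [scale * b for b in beta]


def lpm_marginal_effects(beta):
    """In a linear probability model the marginal effect is the coefficient."""
    return list(beta)
```

With a single observation at index 0, `probit_average_marginal_effects([[1.0, 0.0]], [0.0, 2.0])` scales the slope by phi(0), giving roughly 0.798 for the second entry.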
Q3: Binary outcome - divergence of marginals with imbalance ratio
[Figure: 3 × 3 grid of panels. Columns: outcome cut-off 50%, 25%, 5%. Rows: selection cut-off 50%, 25%, 5%.]
Summary: Heckman vs. Olsen
• Continuous outcome: Heckman and Olsen corrections are similar, even when treatment and control groups are unbalanced
• Binary outcome: marginal effects from Heckman and Olsen corrections diverge as imbalance grows
• LPM in both stages provides consistent estimates (OLS)
• But what about probit?
Matching Methods
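Logit, Rosenbaum and Rubin (1985)
• Stage 1, selection model (T): $\mathrm{logit}(E[T \mid X]) = X\gamma$
• Covariate balance: $|p(T=1) - p(T=0)| < \varepsilon$
• Stage 2, outcome model (Y): $Y = X\beta + \varepsilon$
LPM
• Stage 1, selection model (T): $E[T \mid X] = X\gamma$
• Covariate balance: $|p(T=1) - p(T=0)| < \varepsilon$
• Stage 2, outcome model (Y): $Y = X\beta + \varepsilon$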
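The balance condition above pairs a treated unit with a control unit whose propensity score lies within a tolerance. One common way to do this is greedy 1:1 nearest-neighbour matching with a caliper; the sketch below assumes that variant (the slides do not name the matching algorithm), and the scores may come from either a logit or an LPM. The function name is hypothetical.

```python
def caliper_match(treated_scores, control_scores, caliper):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Returns (treated_index, control_index) pairs whose propensity scores
    differ by less than `caliper`. Treated units with no control inside
    the caliper stay unmatched.
    """
    available = set(range(len(control_scores)))
    pairs = []
    for i, p_treated in enumerate(treated_scores):
        best_j, best_gap = None, None
        for j in sorted(available):  # sorted for deterministic tie-breaking
            gap = abs(p_treated - control_scores[j])
            if gap < caliper and (best_gap is None or gap < best_gap):
                best_j, best_gap = j, gap
        if best_j is not None:
            pairs.append((i, best_j))
            available.remove(best_j)  # each control is used at most once
    return pairs
```

For instance, `caliper_match([0.30, 0.80], [0.32, 0.10, 0.79], 0.05)` pairs the first treated unit with control 0 and the second with control 2. Because LPM fitted values are not confined to [0, 1], the caliper is applied to raw fitted values in that case.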
Propensity Score Matching (PSM)
• Only accounts for observable/observed covariates
• Requires large samples and substantial overlap between treatment and control
• What happens to ATE if we use LPM for matching?
Simulation Design
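Selection model: $T \sim \mathrm{Bernoulli}\left(\dfrac{1}{1 + \exp(\cdots)}\right)$
Outcome model: $Y = T + X\beta + \varepsilon$
$\varepsilon \sim N(0, \sigma)$ with $\sigma \in \{0.1, 1, 5\}$
$X \sim N(0, 1)$ and $\beta = 1$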
• Sample size: 1000
• Standard deviation: 0.1, 1, 5
• Bootstrap: 50
• ATE: $\mathrm{mean}(Y_{T=1}) - \mathrm{mean}(Y_{T=0})$
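A single replication of this design with LPM-based matching can be sketched as follows. Several pieces are assumptions rather than the authors' setup: the index inside the logistic function is not recoverable from the slide, so the sketch takes it to be X itself; matching is nearest-neighbour with replacement; and the bootstrap over 50 resamples is omitted. All function names are hypothetical.

```python
import math
import random


def simulate_psm(n=1000, noise_sd=1.0, seed=0):
    """Draw (x, t, y) triples with Y = T + X*beta + eps and beta = 1."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                  # X ~ N(0, 1)
        p = 1.0 / (1.0 + math.exp(-x))           # ASSUMPTION: logistic index is X
        t = 1 if rng.random() < p else 0         # T ~ Bernoulli(p)
        y = t + 1.0 * x + rng.gauss(0.0, noise_sd)
        rows.append((x, t, y))
    return rows


def lpm_scores(rows):
    """Propensity scores from an LPM: OLS of T on X with an intercept."""
    n = len(rows)
    mean_x = sum(x for x, _, _ in rows) / n
    mean_t = sum(t for _, t, _ in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _, _ in rows)
    sxt = sum((x - mean_x) * (t - mean_t) for x, t, _ in rows)
    slope = sxt / sxx
    intercept = mean_t - slope * mean_x
    return [intercept + slope * x for x, _, _ in rows]


def matched_ate(rows, scores):
    """mean(Y | T=1) - mean(Y | T=0) over treated units and their matches."""
    treated = [(s, y) for s, (_, t, y) in zip(scores, rows) if t == 1]
    controls = [(s, y) for s, (_, t, y) in zip(scores, rows) if t == 0]
    matched_control_y = []
    for s_treated, _ in treated:
        # Nearest control on the propensity score, reused if necessary.
        _, y_control = min(controls, key=lambda c: abs(c[0] - s_treated))
        matched_control_y.append(y_control)
    mean_treated = sum(y for _, y in treated) / len(treated)
    mean_control = sum(matched_control_y) / len(matched_control_y)
    return mean_treated - mean_control
```

Calling `matched_ate(rows, lpm_scores(rows))` on `rows = simulate_psm()` should give an estimate in the neighbourhood of the true treatment effect of 1; swapping `lpm_scores` for logit-based scores allows the comparison the slides describe.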
Identical ATE from Logit and LPM matching
Summary & Future Research
• LPM is similar to logit in terms of estimated Average Treatment Effect
• Ongoing work: what about binary outcome models?
• Logit faces problems when overlap between treatment and control is insufficient
• Ongoing work: does LPM have overlap issues?
Thank you!