PUBL0050: Advanced Quantitative Methods
Lecture 5: Panel Data and Difference-in-Differences
Jack Blumenau
1 / 52
Thank you for the feedback!
1. Question: Can we have drop-in hours rather than office hours?โข Response: Speak to Constanza!
2. Question: Can we have more advice on merging/cleaning data in R?โข Response: I will add some material to the website on how toapproach this.
2 / 52
Thank you for the feedback!
3. Question: Can you provide a crib sheet of R functions?โข Response: Anything comprehensive will be difficult, but I will try toadd an overview of the most important functions that we use to thewebsite.
4. Question: Can we have more applied examples of methods?โข Response: I will make sure that there are more applications perlecture, even if these are just links to papers.
2 / 52
Thank you for the feedback!
5. Question: Can all lecture notes be put online now?โข Response: From reading week, I will upload all slides for the futureweeks, though these are subject to change.
6. Question: Can you provide more help with finding data?โข Response: This is a little difficult given the variety of data that youwill all use. I will add some resources of some commonly useddatabases to the website but please do not take this as acomprehensive list! Data sourcing/collection is part of theevaluation.
2 / 52
Thank you for the feedback!
7. Question: What is going to happen with the strike?โข Response: I wish I knew.โข Real response 1: If the strike goes ahead, I will post the videos of lastyearโs lectures online so that you can watch those.
โข Real response 2: We are still trying to work on contingency plans forthe two classes that would be affected by the strike.
2 / 52
Course outline
1. Potential Outcomes and Causal Inference2. Randomized Experiments3. Selection on Observables (I)4. Selection on Observables (II)5. Panel Data and Difference-in-Differences6. Synthetic Control7. Instrumental Variables (I)8. Instrumental Variables (II)9. Regression Discontinuity Designs10. Overview and Review
3 / 52
Was it โthe Sun wot won itโ?
Does the media influence vote choice?
โข Randomized experiment?
โข Hard to persuade newspapers to randomly endorse politicalcandidates
โข Hard to randomly allocate citizens to read certain newspapers
โข Selection on observables?
โข The types of individual who read certain newspapers (i.e. The Sun)are likely different in many ways from those who read othernewspapers
โข Difference-in-differences
โข Collect data on vote choice before & after change in endorsementโข Did people who read The Sun change their vote choice more thanpeople who did not read The Sun?
6 / 52
Lecture Outline
Difference-in-differences
Regression DD
Threats to Validity
Multiple periods
Conclusion
7 / 52
Difference-in-differences setup
DefinitionTwo groups:
โข ๐ท๐ = 1 Treated units
โข ๐ท๐ = 0 Control units
Two periods:
โข ๐๐ = 0 Pre-Treatment period
โข ๐๐ = 1 Post-Treatment period
Potential outcome ๐๐๐(๐ก)
โข ๐1๐(๐ก) outcome of unit ๐ in period ๐ก when treated (at ๐๐ = 1)โข ๐0๐(๐ก) outcome of unit ๐ in period ๐ก when control (at ๐๐ = 1)
8 / 52
Difference-in-differences setup
DefinitionCausal effect for unit ๐ at time ๐ก is
โข ๐๐(๐ก) = ๐1๐(๐ก) โ ๐0๐(๐ก)
For a given unit, in a given time period, the observed outcome ๐๐(๐ก) is:
โข ๐๐(๐ก) = ๐1๐(๐ก) โ ๐ท๐(๐ก) + ๐0๐(๐ก) โ (1 โ ๐ท๐(๐ก))โข If treatment occurs only after ๐ก = 0 we have:
๐๐(1) = ๐0๐(1) โ (1 โ ๐ท๐) + ๐1๐(1) โ ๐ท๐
โ Fundamental problem of causal inference.
Estimand (ATT)๐ATT = ๐ธ[๐1๐(1) โ ๐0๐(1)|๐ท๐ = 1]
ProblemMissing potential outcome: ๐ธ[๐0๐(1)|๐ท = 1], ie. what is the average post-periodoutcome for the treated in the absence of the treatment?
9 / 52
Illustration
Treated vs. control in post-treatment period
t = 0 t=1
E[Yi(0)|Di=0]
E[Yi(1)|Di=0]
E[Yi(0)|Di=1]
E[Yi(1)|Di=1]
โ
โ
โ
โ
Problem: Missing potential outcome: ๐ธ[๐0๐(1)|๐ท๐ = 1]
10 / 52
Illustration
Strategy: Treated vs. control in post-treatment period
t = 0 t=1
E[Yi(0)|Di=0]
E[Yi(1)|Di=0]
E[Yi(0)|Di=1]
E[Yi(1)|Di=1]
E[Yi(1)|Di=0]
E[Yi(1)|Di=1]
DIGM
โ
โ
โ
โ
Assumption: No selection bias.
10 / 52
Illustration
Strategy: Before vs. after for treatment units
t = 0 t=1
E[Yi(0)|Di=0]
E[Yi(1)|Di=0]
E[Yi(0)|Di=1]
E[Yi(1)|Di=1]
E[Yi(0)|Di=1]
E[Yi(1)|Di=1]
Change over time in treatment group
โ
โ
โ
โ
Assumption: No effect of time independent of treatment.
10 / 52
Illustration
Treated vs. control in post-treatment period
t = 0 t=1
E[Yi(0)|Di=0]
E[Yi(1)|Di=0]
E[Yi(0)|Di=1]
E[Yi(1)|Di=1]
E[Yi(0)|Di=0]
E[Yi(1)|Di=0]
Change over time in control group
โ
โ
โ
โ
Treated vs. control in post-treatment period
10 / 52
Illustration
Strategy: Difference-in-differences (1)
t = 0 t=1
E[Yi(0)|Di=0]
E[Yi(1)|Di=0]
E[Yi(0)|Di=1]
E[Yi(1)|Di=1]
E[Yi(0)|Di=1]
E[Y0i(1)|Di=1]
โ
โ
โ
โ
โ
โ
Assumed change over time in treatment group
Assumption: Trend over time is the same for treatment and control.
10 / 52
Illustration
Strategy: Difference-in-differences (1)
t = 0 t=1
E[Yi(0)|Di=0]
E[Yi(1)|Di=0]
E[Yi(0)|Di=1]
E[Yi(1)|Di=1]
E[Yi(0)|Di=1]
E[Yi(1)|Di=1]
E[Y0i(1)|Di=1]
โ
โ
โ
โ
โ
โ
Assumed change over time in treatment group
Difference in differences
Assumption: Trend over time is the same for treatment and control.
10 / 52
Illustration
Strategy: Difference-in-differences (2)
t = 0 t=1
E[Yi(0)|Di=0]
E[Yi(1)|Di=0]
E[Yi(0)|Di=1]
E[Yi(1)|Di=1]
E[Yi(0)|Di=0]
E[Yi(0)|Di=1]
Selection bias in preโtreatment period
โ
โ
โ
โ
Treated vs. control in post-treatment period
10 / 52
Illustration
Strategy: Difference-in-differences (2)
t = 0 t=1
E[Yi(0)|Di=0]
E[Yi(1)|Di=0]
E[Yi(0)|Di=1]
E[Yi(1)|Di=1]
E[Yi(1)|Di=0]
E[Y0i(1)|Di=1]
Assumed selection bias in postโtreatment period
โ
โ
โ
โ
โ
โ
Assumption: Selection bias is stable over time.
10 / 52
Illustration
Strategy: Difference-in-differences (2)
t = 0 t=1
E[Yi(0)|Di=0]
E[Yi(1)|Di=0]
E[Yi(0)|Di=1]
E[Yi(1)|Di=1]
E[Yi(1)|Di=0]
E[Yi(1)|Di=1]
E[Y0i(1)|Di=1]
Assumed selection bias in postโtreatment period
โ
โ
โ
โ
โ
โ
Difference in differences
Assumption: Selection bias is stable over time.
10 / 52
Identification with Difference-in-Differences
Two ways of stating the identifying assumption:
โข Parellel trends
โข If treated units did not receive the treatment, they would havefollowed the same trend as the control units
โข No time-varying confounders
โข Omitted variables related both to treatment and outcome must befixed over time
11 / 52
Identification with Difference-in-Differences
Identification Assumption๐ธ[๐0๐(1) โ ๐0๐(0)|๐ท๐ = 1] = ๐ธ[๐0๐(1) โ ๐0๐(0)|๐ท๐ = 0] (parallel trends)
Identification ResultGiven parallel trends, ๐ATT is identified as:
๐ธ[๐1๐(1) โ ๐0๐(1)|๐ท = 1] = {๐ธ[๐๐(1)|๐ท๐ = 1] โ ๐ธ[๐๐(1)|๐ท๐ = 0]}
โ {๐ธ[๐๐(0)|๐ท๐ = 1] โ ๐ธ[๐๐(0)|๐ท = 0]}
In other words:
๐ATT = {Difference in means in post-treatment period}
โ {Difference in means in pre-treatment period}
12 / 52
Difference-in-Differences: Estimators
Estimand (ATET)๐ธ[๐1(1) โ ๐0(1)|๐ท = 1]
Estimator (Sample Means: Panel)
{ 1๐1
โ๐ท๐=1
๐๐(1) โ 1๐0
โ๐ท๐=0
๐๐(1)}โ{ 1๐1
โ๐ท๐=1
๐๐(0) โ 1๐0
โ๐ท๐=0
๐๐(0)}
= { 1๐1
โ๐ท๐=1
{๐๐(1) โ ๐๐(0)} โ 1๐0
โ๐ท๐=0
{๐๐(1) โ ๐๐(0)}} ,
where ๐1 is the number of treated individual and ๐0 is the number ofnon-treated individuals.
13 / 52
Example
Persuasive Power of the News MediaDid the change in support for the Labour Party by the Sun newspaperincrease the number of people voting Labour? Ladd and Lenz (2009)use the British Election Panel Survey, which includes information onwhether individuals voted for Labour in 1992, whether they votedLabour in 1997, and which newspapers they read.
โข Outcome (๐ , voted_lab): 1 if individual ๐ voted Labour at time ๐ก
โข Treatment (๐ท, reads_sun): 1 if individual ๐ read the Sun (in 1992)
โข Time (๐ , year): Election year (1992 or 1997)
โข ๐ = 1593, ๐1 = 211, ๐0 = 1382
โข Note that this is panel data (repeated observations on the sameindividuals over time)
14 / 52
Example
Typically useful to store data in โlongโ format (here, 2 rows per unit):str(sun)
## 'data.frame': 3186 obs. of 4 variables:## $ reads_sun: int 0 0 0 0 0 0 0 0 0 1 ...## $ voted_lab: int 1 1 0 1 1 1 1 1 1 1 ...## $ year : num 1992 1992 1992 1992 1992 ...## $ id : int 1 2 3 4 5 6 7 8 9 10 ...
table(sun$reads_sun, sun$year)
#### 1992 1997## 0 1382 1382## 1 211 211
Question: Which observations are โtreatedโ?
Answer: Sun readers in 1997.15 / 52
Example
# Untreated, pre-treatmenty_d0_t0 <- mean(sun$voted_lab[sun$reads_sun == 0 & sun$year == 1992])y_d0_t0
## [1] 0.3227207# Treated, pre-treatmenty_d1_t0 <- mean(sun$voted_lab[sun$reads_sun == 1 & sun$year == 1992])y_d1_t0
## [1] 0.3886256# Untreated, post-treatmenty_d0_t1 <- mean(sun$voted_lab[sun$reads_sun == 0 & sun$year == 1997])y_d0_t1
## [1] 0.4305355# Treated, post-treatmenty_d1_t1 <- mean(sun$voted_lab[sun$reads_sun == 1 & sun$year == 1997])y_d1_t1
## [1] 0.582938416 / 52
Example
0.30
0.35
0.40
0.45
0.50
0.55
0.60
% v
otin
g La
bour
1992 1997
Reads the SunDoesn't read the SunAssumed counterfactual trend
ATT
17 / 52
Example
# Parallel trend calculation(y_d1_t1 - y_d1_t0) - (y_d0_t1 - y_d0_t0)
## [1] 0.08649803
# Stable selection bias calculation(y_d1_t1 - y_d0_t1) - (y_d1_t0 - y_d0_t0)
## [1] 0.08649803
Implication: The change in endorsement caused Labour support toincrease by 8.6 percentage points amongst readers of The Sun.
18 / 52
Difference-in-Differences: Regression Estimator
Estimator (Regression 1)Alternatively, the same estimator can be obtained using regression techniques.
๐๐ = ๐ผ + ๐ฝ1 โ ๐ท๐ + ๐ฝ2 โ ๐๐ + ๐ฟ โ (๐ท๐ โ ๐๐) + ๐,where ๐ธ[๐|๐ท๐, ๐๐] = 0. Then, it is easy to show that
๐ธ[๐๐|๐ท๐, ๐๐] ๐๐ = 0 ๐๐ = 1 After - Before๐ท๐ = 0 ๐ผ ๐ผ + ๐ฝ2 ๐ฝ2๐ท๐ = 1 ๐ผ + ๐ฝ1 ๐ผ + ๐ฝ1 + ๐ฝ2 + ๐ฟ ๐ฝ2 + ๐ฟTreated - Control ๐ฝ1 ๐ฝ1 + ๐ฟ ๐ฟ
Thus, the difference-in-differences estimate is given by:
๐ATT = (๐ฝ2 + ๐ฟ) โ ๐ฝ2 = ๐ฟ
Equivalently:๐ATT = (๐ฝ1 + ๐ฟ) โ ๐ฝ1 = ๐ฟ
19 / 52
Example: Regression
dd_mod <- lm(voted_lab ~ reads_sun * as.factor(year),data = sun)
voted_lab
reads_sun 0.066โ
(0.036)as.factor(year)1997 0.108โโโ
(0.018)reads_sun:as.factor(year)1997 0.086โ
(0.050)Constant 0.323โโโ
(0.013)
Observations 3,186R2 0.022
โข ๐ผ = Labour support amongstnon-Sun readers, 1992
โข ๐ฝ1 = difference between Sunand non-Sun readers, 1992
โข ๐ฝ1 + ๐ฟ = difference betweenSun and non-Sun readers, 1997
โข ๐ฟ = ATT
20 / 52
Example: Regression
dd_mod <- lm(voted_lab ~ reads_sun * as.factor(year),data = sun)
voted_lab
reads_sun 0.066โ
(0.036)as.factor(year)1997 0.108โโโ
(0.018)reads_sun:as.factor(year)1997 0.086โ
(0.050)Constant 0.323โโโ
(0.013)
Observations 3,186R2 0.022
โข ๐ผ = Labour support amongstnon-Sun readers, 1992
โข ๐ฝ2 = 1992 to 1997 difference,amongst non-Sun readers
โข ๐ฝ2 + ๐ฟ = 1992 to 1997difference, amongst Sunreaders
โข ๐ฟ = ATT
20 / 52
Cross-sectional regression estimator
A nice feature diff-in-diff is it does not require panel data, i.e. repeatedobservations of the same units. Can also use repeated cross-sections:
โข ๐๐๐๐ก where unit ๐ is only measured at one ๐ก
โข Units fall into treatment based on groups ๐
โข Particularly useful as many โtreatmentsโ vary at some aggregatelevel:
โข e.g. Law changes at the state/region level
21 / 52
Cross-sectional regression estimator
Two options:
โข Individual-level data:
๐๐๐๐ก = ๐ผ + ๐ฝ1๐ท๐ + ๐ฝ2๐๐ + ๐ฝ3(๐ท๐ โ ๐๐) + ๐๐๐๐ก
โข Aggregated data:
๐๐๐ก = ๐ผ + ๐ฝ1๐ท๐ + ๐ฝ2๐๐ + ๐ฝ3(๐ท๐ โ ๐๐) + ๐๐๐ก
Both approaches will give the same result, as the treatment only varies atthe group level (so long as the aggregated version is weighted by cellsize).
22 / 52
Regression estimator advantages
1. Easy to calculate standard errors (though be careful aboutclustering)
2. We can control for other variables
โข Individual-level data, group-level treatment: controlling forindividual covariates may increase precision
โข Time-varying covariates at the group-level may strengthen theparallel trends assumption, but beware of post-treatment bias
3. Simple to extend to multiple groups/periods (more on this later)
4. Can use multi-valued (not just binary) treatments
23 / 52
Difference-in-Differences: First Differences Estimator
Estimator (Regression 2)With panel data we can use regression with first differences:
ฮ๐๐ = ๐ผ + ๐ฟ โ ๐ท๐ + ๐โฒ๐๐ฝ + ๐ข,
where ฮ๐๐ = ๐๐(1) โ ๐๐(0), and ๐ข = ฮ๐.
โข With two periods gives identical result as other regressions
24 / 52
Example: First-difference regression
1 row per unit:head(sun_diff)
## reads_sun voted_lab_92 voted_lab_97## 1 0 1 1## 2 0 1 0## 3 0 0 0## 4 0 1 1## 5 0 1 1## 6 0 1 1
sun_diff$diff <- sun_diff$voted_lab_97 - sun_diff$voted_lab_92head(sun_diff)
## reads_sun voted_lab_92 voted_lab_97 diff## 1 0 1 1 0## 2 0 1 0 -1## 3 0 0 0 0## 4 0 1 1 0## 5 0 1 1 0## 6 0 1 1 0
25 / 52
Example: First-difference regression
first_diff_mod <- lm(diff ~ reads_sun, data = sun_diff)
Table 1: First difference model
diff
reads_sun 0.086โโโ
(0.030)Constant 0.108โโโ
(0.011)
Observations 1,593R2 0.005
26 / 52
Non-parallel trends
Critical identification assumption: treatment units have similar trends tocontrol units in the absence of treatment.
Question: Why is this assumption untestable?
Answer: because of the FPOCI โ we cannot observe potential outcomeunder the control condition for treated units in the post-treatmentperiod.
28 / 52
Potential violations of parallel trends
โข โAshenfelterโs Dipโ
โข Participants in worker training programs may experience decreasedearnings before they enter the program (why are they participating?)
โข If wages revert to the mean, comparing wages of participants andnon-participants leads to an upwardly biased estimate
โข Targeting
โข Policymakers may target units who are most improving
29 / 52
Assessing (non-)parallel trends
What can we do?
โข One treatment/control group
โข Plot results and look at trends in periods before the treatmentโข Is the parallel trends assumption plausible?
โข Multiple treatment/control comparisons
โข Estimate treatment effects at different time points (i.e. placebo tests)โ All estimated treatment effects before the treatment should be 0.
โข Include unit-specific time trends โ โrelaxโ parallel trendsassumption
30 / 52
โGoodโ parallel trends example%
vot
ing
Labo
ur
1983 1987 1992 1997
Reads the SunDoesn't read the Sun
31 / 52
โBadโ parallel trends example%
vot
ing
Labo
ur
1983 1987 1992 1997
Reads the SunDoesn't read the Sun
32 / 52
Parallel and non-parallel trends
Parallel trends Treatment effect
Preโtreatment period Postโtreatment period
Nonโparallel trends No treatment effect
Preโtreatment period Postโtreatment period
Nonโparallel trends Treatment effect
Preโtreatment period Postโtreatment period
34 / 52
Difference-in-Differences: Fixed-effect estimator
Estimator (Fixed-effect regression)We can generalise to multiple groups/time periods using unit and periodfixed-effects (โtwo-wayโ fixed-effect model):
๐๐๐ก = ๐พ๐ + ๐ผ๐ก + ๐ฟ๐ท๐๐ก + ๐๐๐ก
โข ๐พ๐ is a fixed-effect for groups (dummy for each group)
โข ๐ผ๐ก if a fixed-effect for time periods (dummy for each time period)
โข ๐ฟ is the diff-in-diff estimate based on ๐ท๐๐ก, which is 1 for treatedunit-period observations, and 0 otherwise
Very flexible:
โข can replace ๐ท๐๐ก with almost any type of treatment (not only binary)โข can extend easily to multiple periods (i.e. more than 2)โข can have different units treated at different times
35 / 52
Example: Fixed-effect regression
sun$treat <- sun$reads_sun == 1 & sun$year == 1997
fe_model <- lm(voted_lab ~ treat + as.factor(id) + as.factor(year),data = sun)
Table 2: Fixed-effect model
voted_lab
treat 0.086โโโ
(0.030)
Observations 3,186R2 0.826
36 / 52
Why does fixed-effect regression estimate the diff-in-diff?
โข Unit/group FEs mean that we are only using within group variationin Y to calculate the effect of D
โข Removes any omitted variable bias that is constant over time
โข Time FEs means that we remove the effect of any changes to theoutcome variable that affect all units at the same time
โข ๐ฟ โ ๐ATTNote that unit dummies lead to smaller standard errors on our treatmenteffect. Why not always use unit dummies?
โข Fine in panel data, as we have same units at several points in timeโข Not possible with repeated cross-section when we do not have thesame units in each time period
37 / 52
Standard errors for the regression difference in differences
โข Many papers using a DD strategy use data from many periods
โข Treatments typically vary at the group level, while outcomesnormally measured at the individual level
โข E.g. Minimum wage increases (state-level) and employment data(firm-level) in Cark and Krueger
โข Will not bias treatment effect estimates, but will cause problems forvariance estimation when errors are serially correlated
โข Implication: traditional standard errors will tend to be too small.
38 / 52
Standard errors for the regression difference in differences
Solution (Bertrand et al (2004))Use cluster-robust standard errors where clusters are defined at thelevel of the treatment. If the number of groups is:
โข โฆlarge (โช 30), use vcovCL in sandwichโข โฆsmall (โช 30), use block-bootstrap
39 / 52
Example: Multiperiod diff-in-diff
Female role-models in politicsWhen women are promoted to high-office, do they motivate otherwomen to participate more in policymaking? Blumenau (2019) examinesthe participation of female MPs in parliamentary debates between 1997and 2018 as a proxy for policymaking involvement. If high-profilewomen act as role-models, when they are promoted they shouldencourage other women to participate more.
โข Outcome (๐ , prop_words): Proportion of words spoken by women in debate ๐ atin ministry ๐ at time ๐ก
โข Treatment (๐ท, female_minister): 1 if cabinet minister of ministry ๐ at time ๐ก isfemale
โข Unit of analysis: Debates (clustered within about 30 ministries)
โข ๐ โ 15,000 debates
โข Time measured at the month level (i.e. February 2012, etc)
40 / 52
Female role-models โ model specification
๐๐๐๐๐๐๐๐๐ ๐๐๐๐๐๐(๐๐ก) = ๐พ๐ +๐ผ๐ก +๐ฟ โ๐น๐๐๐๐๐๐๐๐๐๐ ๐ก๐๐๐๐ก +๐๐(๐๐ก)
โข ๐พ๐ โ ministry fixed effect
โข Controls for unobserved ministry characteristics that are timeinvariant
โข ๐ผ๐ก โ time (month-year) fixed effect
โข Controls for changes in participation over time that are common toall ministries
โข ๐ฟ โ average effect of switching from a male to a female minister based onwithin-ministry variation, amongst ministries that see a change in ministergender (i.e. ๐ATT)
43 / 52
R example
fe_mod <- lm(prop.women.words ~ minister.gender +as.factor(ministry) +
as.factor(yearmon),data= speeches)
screenreg(list(fe_mod),omit.coef = c(โministry|yearmonโ),digits = 3)
44 / 52
R example
#### ===============================## Model 1## -------------------------------## (Intercept) 0.039## (0.044)## minister.genderF 0.041 ***## (0.005)## -------------------------------## R^2 0.106## Adj. R^2 0.091## Num. obs. 14320## RMSE 0.218## ===============================## *** p < 0.001, ** p < 0.01, * p < 0.05
44 / 52
R example (clustered errors)
fe_mod_clustered <- coeftest(fe_mod,vcov = vcovCL, cluster = ~ ministry)
screenreg(list(fe_mod, fe_mod_clustered),omit.coef = c(โministry|yearmonโ),digits = 3)
45 / 52
R example (clustered errors)
#### ===========================================## Model 1 Model 2## -------------------------------------------## (Intercept) 0.039 0.039## (0.044) (0.048)## minister.genderF 0.041 *** 0.041 ***## (0.005) (0.011)## -------------------------------------------## R^2 0.106## Adj. R^2 0.091## Num. obs. 14320## RMSE 0.218## ===========================================## *** p < 0.001, ** p < 0.01, * p < 0.05
45 / 52
Female role-models โ full results
Interpretation: The appointment of a female minister increasesparticipation of other women byโ 20% relative to male minister baseline
46 / 52
Female role-models โ parallel trends
Hard to provide a visual inspection of the parallel trends assumption astreatment switches on and off over time in different ministries.
Nevertheless, we are still assuming that treated/control ministries wouldhave evolved identically over time in absense of treatment.
One way forward, test for ๐ โlagsโ and ๐ โleadsโ of the treatment:
๐ฆ๐(๐๐ก) = ๐ผ0 +๐
โ๐=โ๐
๐ฟ๐ โ๐น๐๐๐๐๐๐๐๐๐๐ ๐ก๐๐๐(๐ก+๐) +๐พ๐0 +๐ผ๐ก +๐๐(๐๐ก)
โข ๐ฟโ1, ๐ฟโ2, ..., ๐ฟโ๐ are the โleadโ effects which should all be 0
โข ๐ฟ1, ๐ฟ2, ..., ๐ฟ๐ are the โlaggedโ effects which may take differentvalues
Intuition: What is the DD estimate in the Sun example using only datafrom 1992 and 1996? 47 / 52
Data requirements for Diff-in-diff
Data structure:
โข Panel data or repeated cross-sectionโข Single or multiple treatmentsโข Continuous or binary treatmentsโข Works both at individual/aggregate level
Does this require more data?
โข Adding a time dimension can increase the amount of data you needโข No need to control for extensive covariates (so long as they are fixedwithin units over time) which might mean decreased data collection
49 / 52
Examples of diff-in-diff designs
1. Card & Krueger, 1994โข RQ: Do increases in the minimum wage reduce employment?โข Outcome: Employment growth in fast-food restaurantsโข Treatment: Increased minimum wage in New Jersey; no change inPennsylvania
โข Time: Before/after minimum wage changed
2. Dinas et al., 2019โข RQ: What is the effect of refugee arrivals on support for the far right?โข Outcome: Municipal support for far right partyโข Treatment: Refugee arrivals in Greek islandsโข Time: Elections before/after refugee crisis
50 / 52
Examples of diff-in-diff designs
3. Bechtel & Heinmueller, 2011โข RQ: What is the effect of good policy on government support?โข Outcome: Support for the German SPD in parliamentaryconstituencies
โข Treatment: Flooded German regions close to the River Elbeโข Time: Elections before/after 2002, when the Elbe flooded
4. Hainmueller & Hangartner, 2019โข RQ: What is the effect of direct democracy on immigrantassimilation?
โข Outcome: Naturalization rate of immigrants in Swiss municipalitiesโข Treatment: Whether municipality decides on naturalisation requestsvia expert or citizen councils
โข Time: Decisions before/after legal changes to decision making inmunicipalities
50 / 52
Conclusion
The DD design allows for a comparison over time in the treatment group,controlling for concurrent time trends using a control group.
DD requires data on multiple units in multiple periods, but can beapplied to panel data or repeated cross-sectional data.
DD is very widely used, as it is a powerful conditioning strategy thatdoesnโt require endless lists of covariates to strengthen the identifyingassumption.
The identification assumption โ that treatment and control units wouldfollow parallel trends in the absense of treatment โ should beinvestigated with every application!
51 / 52