15
Inference for Average Treatment Effects Kosuke Imai Harvard University STAT186/GOV2002 CAUSAL I NFERENCE Fall 2019 Kosuke Imai (Harvard) Average Treatment Effects Stat186/Gov2002 Fall 2019 1 / 15

Inference for Average Treatment Effects · Sharp bounds based on the entire marginal distributions application of Hoeffding’s lemma (Aronow et al. 2015. Ann. Stat.) Kosuke Imai

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Inference for Average Treatment Effects · Sharp bounds based on the entire marginal distributions application of Hoeffding’s lemma (Aronow et al. 2015. Ann. Stat.) Kosuke Imai

Inference for Average Treatment Effects

Kosuke Imai

Harvard University

STAT186/GOV2002 CAUSAL INFERENCE

Fall 2019

Kosuke Imai (Harvard) Average Treatment Effects Stat186/Gov2002 Fall 2019 1 / 15

Page 2: Inference for Average Treatment Effects · Sharp bounds based on the entire marginal distributions application of Hoeffding’s lemma (Aronow et al. 2015. Ann. Stat.) Kosuke Imai

Motivation

Two limitations of permutation inference:1 causal heterogeneity2 population inference

Fundamental problem of causal inference cannot identify individual causal effects

Neyman’s approach:1 Average treatment effects as causal quantities of interest: SATE

and PATE2 Design-based approach: randomization of treatment assignment,

random sampling3 Asymptotic approximation rather than exact inference

Kosuke Imai (Harvard) Average Treatment Effects Stat186/Gov2002 Fall 2019 2 / 15

Page 3: Inference for Average Treatment Effects · Sharp bounds based on the entire marginal distributions application of Hoeffding’s lemma (Aronow et al. 2015. Ann. Stat.) Kosuke Imai

Social Pressure and Turnout (Gerber, et al. 2008. Am. Political Sci. Rev.)

August 2006 Primary Election in MichiganStatewide elections: Governor, US Senator180,000 householdsSend postcards with different messages

Randomly assign each household to a group (or treatment)1 no message (control group)2 civic duty message3 “you are being studied” message (Hawthorne effect)4 household social pressure message5 neighborhood social pressure message

Kosuke Imai (Harvard) Average Treatment Effects Stat186/Gov2002 Fall 2019 3 / 15

Page 4: Inference for Average Treatment Effects · Sharp bounds based on the entire marginal distributions application of Hoeffding’s lemma (Aronow et al. 2015. Ann. Stat.) Kosuke Imai

Neighborhood Social Pressure Message

Kosuke Imai (Harvard) Average Treatment Effects Stat186/Gov2002 Fall 2019 4 / 15

Page 5: Inference for Average Treatment Effects · Sharp bounds based on the entire marginal distributions application of Hoeffding’s lemma (Aronow et al. 2015. Ann. Stat.) Kosuke Imai

“You are being studied” Message

Kosuke Imai (Harvard) Average Treatment Effects Stat186/Gov2002 Fall 2019 5 / 15

Page 6: Inference for Average Treatment Effects · Sharp bounds based on the entire marginal distributions application of Hoeffding’s lemma (Aronow et al. 2015. Ann. Stat.) Kosuke Imai

Standard Empirical Analysis

Groups Control Civic duty Hawthorne Self NeighborTurnout rate 29.7% 31.5% 32.2% 34.5% 37.5%# of voters 191,243 38,218 38,204 38,218 38,201

Neighborhood social pressure vs. Control

τ̂ = 37.5− 29.7 = 7.8

s.e. =

√37.5× (100− 37.5)

38201+

29.7× (100− 29.7)191243

≈ 0.3

95%CI = [7.8− 1.96× 0.3, 7.8 + 1.96× 0.3] = [7.2, 8.4]

This calculation ignores the fact that some households havemultiple voters: we will discuss this issue later in the courseHow can we justify this standard difference-in-means analysisfrom the randomization perspective?

Kosuke Imai (Harvard) Average Treatment Effects Stat186/Gov2002 Fall 2019 6 / 15

Page 7: Inference for Average Treatment Effects · Sharp bounds based on the entire marginal distributions application of Hoeffding’s lemma (Aronow et al. 2015. Ann. Stat.) Kosuke Imai

Estimation of the Sample Average Treatment Effect

Due to Neyman (1923) Neyman. 1990 (translated to English) Stat. Sci.

Difference-in-means estimator:

τ̂ ≡ 1n1

n∑i=1

TiYi −1n0

n∑i=1

(1− Ti)Yi

Unbiasedness (over repeated treatment assignments):

E(τ̂ | On) =1n1

n∑i=1

E(Ti | On)Yi(1)−1n0

n∑i=1

{1− E(Ti | On)}Yi(0)

=1n

n∑i=1

(Yi(1)− Yi(0)) = SATE

where On = {Yi(0),Yi(1)}ni=1

Kosuke Imai (Harvard) Average Treatment Effects Stat186/Gov2002 Fall 2019 7 / 15

Page 8: Inference for Average Treatment Effects · Sharp bounds based on the entire marginal distributions application of Hoeffding’s lemma (Aronow et al. 2015. Ann. Stat.) Kosuke Imai

The Variance of the Difference-in-Means Estimator

Variance of τ̂ :

V(τ̂ | On) =1n

(n0

n1S2

1 +n1

n0S2

0 + 2S01

),

where for t = 0,1,

S2t =

1n − 1

n∑i=1

(Yi(t)− Y (t))2 sample variance of Yi(t)

S01 =1

n − 1

n∑i=1

(Yi(0)− Y (0))(Yi(1)− Y (1)) sample covariance

The variance is NOT identifiable

Kosuke Imai (Harvard) Average Treatment Effects Stat186/Gov2002 Fall 2019 8 / 15

Page 9: Inference for Average Treatment Effects · Sharp bounds based on the entire marginal distributions application of Hoeffding’s lemma (Aronow et al. 2015. Ann. Stat.) Kosuke Imai

Details of the Variance Derivation

1 Let Xi = Yi(1) + n1Yi(0)/n0 and Di = nTi/n1 − 1, and write

V(τ̂ | On) =1n2 E

(

n∑i=1

DiXi

)2 ∣∣∣∣ On

2 Show

E(Di | On) = 0, E(D2i | On) =

n0

n1,

E(DiDj | On) = − n0

n1(n − 1)

3 Use Ê and Ë to show,

V(τ̂ | On) =n0

n(n − 1)n1

n∑i=1

(Xi − X )2

4 Substitute the potential outcome expressions for Xi

Kosuke Imai (Harvard) Average Treatment Effects Stat186/Gov2002 Fall 2019 9 / 15

Page 10: Inference for Average Treatment Effects · Sharp bounds based on the entire marginal distributions application of Hoeffding’s lemma (Aronow et al. 2015. Ann. Stat.) Kosuke Imai

Conservative Variance Estimator

The usual variance estimator is conservative on average:

V(τ̂ | On) ≤S2

1n1

+S2

0n0

= E

(σ̂2

1n1

+σ̂2

0n0

∣∣∣ On

)where

σ̂t =1

nt − 1

n∑i=1

1{Ti = t}(Yi − Y t)2 for t = 0,1

Under the constant additive unit causal effect assumption, i.e.,Yi(1)− Yi(0) = c for all i ,

S01 =12(S2

1 + S20) and V(τ̂ | On) =

S21

n1+

S20

n0

The optimal treatment assignment rule:

nopt1 =

n1 + S0/S1

, nopt0 =

n1 + S1/S0

Kosuke Imai (Harvard) Average Treatment Effects Stat186/Gov2002 Fall 2019 10 / 15

Page 11: Inference for Average Treatment Effects · Sharp bounds based on the entire marginal distributions application of Hoeffding’s lemma (Aronow et al. 2015. Ann. Stat.) Kosuke Imai

Bounds on the Variance

Use of the Cauchy-Schwartz inequality:1 Upper bound: sample correlation between Yi(1) and Yi(0) is 12 Lower bound: sample correlation between Yi(1) and Yi(0) is −1

n0n1

n

(S1

n1− S0

n0

)2

≤ V(τ̂ | On) ≤n0n1

n

(S1

n1+

S0

n0

)2

Constant additive unit causal effect sample correlation is 1

n0n1

n

(S1

n1+

S0

n0

)2

=S2

1n1

+S2

0n0

Sharp bounds based on the entire marginal distributions application of Hoeffding’s lemma (Aronow et al. 2015. Ann. Stat.)

Kosuke Imai (Harvard) Average Treatment Effects Stat186/Gov2002 Fall 2019 11 / 15

Page 12: Inference for Average Treatment Effects · Sharp bounds based on the entire marginal distributions application of Hoeffding’s lemma (Aronow et al. 2015. Ann. Stat.) Kosuke Imai

Inference for Population Average Treatment Effect

Assumption: simple random sampling from an infinite populationUnbiasedness (over repeated sampling):

E{E(τ̂ | On)} = E(SATE) = PATE

Variance:

V(τ̂) = V{E(τ̂ | On)}+ E{V(τ̂ | On)}

=σ2

1n1

+σ2

0n0

where σ2t is the population variance of Yi(t) for t = 0,1

Unbiased variance estimator:

V̂(τ̂) =σ̂2

1n1

+σ̂2

0n0

where E{V̂(τ̂)} = V(τ̂)

for t = 0,1Kosuke Imai (Harvard) Average Treatment Effects Stat186/Gov2002 Fall 2019 12 / 15

Page 13: Inference for Average Treatment Effects · Sharp bounds based on the entire marginal distributions application of Hoeffding’s lemma (Aronow et al. 2015. Ann. Stat.) Kosuke Imai

Asymptotic Inference for PATE

Hold k = n1/n constant:

τ̂ =1n

n∑i=1

(TiYi(1)

k− (1− Ti)Yi(0)

1− k

)︸ ︷︷ ︸

i.i.d. with mean PATE & variance nV(τ̂)

Consistency via Law of large numbers:

τ̂p−→ PATE

Asymptotic normality via the Central Limit Theorem:

√n(τ̂ − PATE) d−→ N

(0,σ2

1k

+σ2

01− k

)(1− α)× 100% Confidence intervals:

[τ̂ − s.e.× zα/2, τ̂ + s.e.× zα/2]

Kosuke Imai (Harvard) Average Treatment Effects Stat186/Gov2002 Fall 2019 13 / 15

Page 14: Inference for Average Treatment Effects · Sharp bounds based on the entire marginal distributions application of Hoeffding’s lemma (Aronow et al. 2015. Ann. Stat.) Kosuke Imai

Exchange at the Royal Statistiacal Society(Neyman et al. (1935) Suppl. of J. Royal Stat. Soc.)

Neyman: So long as the average yields of any treatments areidentical, the question as to whether these treatments affect separateyields on single plots seems to be uninteresting

Fisher: It may be foolish, but that is what the z test was designed for,and the only purpose for which it has been used.

Neyman: I am considering problems which are important from thepoint of view of agriculture.

Fisher: It may be that the question which Dr. Neyman thinks should beanswered is more important than the one I have proposed andattempted to answer. I suggest that before criticizing previous work it isalways wise to give enough study to the subject to understand itspurpose.

Kosuke Imai (Harvard) Average Treatment Effects Stat186/Gov2002 Fall 2019 14 / 15

Page 15: Inference for Average Treatment Effects · Sharp bounds based on the entire marginal distributions application of Hoeffding’s lemma (Aronow et al. 2015. Ann. Stat.) Kosuke Imai

Summary: Fisher vs. Neyman

Like Fisher, Neyman proposed randomization-based inferenceUnlike Fisher,

1 estimands are average treatment effects2 heterogenous treatment effects are allowed3 population as well as sample inference is possible4 asymptotic approximation is required for inference

Reading: IMBENS AND RUBIN, CHAPTER 6

Kosuke Imai (Harvard) Average Treatment Effects Stat186/Gov2002 Fall 2019 15 / 15