© Michael Lechner, 2006, p. 1 (Non-Bayesian) Discussion (translation) of Principal Stratification for Causal Inference with Extended Partial Compliance

(Non-Bayesian) Discussion (translation) of

Principal Stratification for Causal Inference with Extended Partial Compliance

by

Hui Jin and Don Rubin

Mannheim, ZEW, October 2006

Michael Lechner

SIAW, ZEW, CEPR, IZA


A statistics paper from an econometric perspective

Perspective 1: An exercise in IV estimation (à la IA ’94 and AIR ’96)

Complication: There is only a binary instrument, but we are interested in

the effects of multiple treatments in the form of dose responses.

A binary instrument is not powerful enough for such comparisons.

Perspective 2: We ‘must’ condition on an endogenous variable (an

intermediate outcome) to estimate the effect of interest. The paper shows

how to recover the causal effect of interest (!) in such a framework.

These problems occur although there is an underlying experiment that

assigns people to different treatment states

However: Paper uses different language than econometricians do ...


An artificial example from the training literature... as a translation device

Unemployed want to attend a training programme

UE are randomized into one of 2 programmes, called T and C (Z)

T is a tough programme: not much fun, a lot of work, additional human capital

C is a leisurely social experience, no human capital

Each programme has a duration of 4 weeks; participants may leave

programmes any time (even before they start)

UE have a taste for leisure

Programmes have heterogeneous effects


An artificial example from the training literature... as a translation device (2)

We want to understand the effect of the programmes on the employment

rate 2 years after the start of the programme.

Even more: We may want to understand the effects of a completed

programme compared to the other completed programme.

Problem: If we base the analysis on the subsamples of those who

complete the programme, we may contaminate the causal inference,

because those who realised that they have a low return may have

dropped out already.

This type of selection problem is more likely to occur with the tough

programme.
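The selection problem described above can be illustrated with a small simulation. This is only a sketch under invented assumptions: the latent return `a`, the 0.5 drop-out threshold, the outcome coefficients, and the rule that everybody completes C are all hypothetical and are not taken from the paper or the slides. It contrasts the naive completer-versus-completer comparison with the average effect among people who would complete the tough programme, a quantity that is available only because the simulation knows both potential outcomes.

```python
import random


def mean(values):
    return sum(values) / len(values)


def simulate(n=20000, seed=0):
    """Toy version of the training example; every number here is made up."""
    rng = random.Random(seed)
    naive_t, naive_c, stratum = [], [], []
    for _ in range(n):
        a = rng.random()           # hypothetical latent return, uniform on [0, 1)
        y_c = 0.2 + 0.4 * a        # expected employment after the easy programme C
        completes_t = a > 0.5      # low-return people drop out of the tough programme T
        y_t = y_c + (0.3 * a if completes_t else 0.0)
        z = rng.choice("TC")       # randomized assignment Z
        # Naive analysis: use only the observed completers in each arm.
        if z == "T" and completes_t:
            naive_t.append(y_t)
        elif z == "C":             # assumption: everybody completes C
            naive_c.append(y_c)
        # Stratum analysis: needs both potential outcomes, never jointly observed in real data.
        if completes_t:
            stratum.append(y_t - y_c)
    return {"naive": mean(naive_t) - mean(naive_c), "stratum": mean(stratum)}


result = simulate()
print(result)
```

With these made-up parameters the population value of the naive contrast is 0.325 while the effect among would-be completers of T is 0.225; the gap of 0.1 is pure selection on the baseline outcome.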


Solutions of the identification problems

The paper provides two solutions to this problem (and is VERY clear

about the underlying assumptions)

1) Instead of conditioning on the endogenous observable intermediate

outcome (programme duration), condition on the potential intermediate

outcomes. For example: Compare a person who completed the

tough programme to somebody who participated in the easy

programme but would have completed the tough programme, and

average (unobservable: find other restrictions!).

Here, also require the same propensity to complete the easy programme

2) Devise a hypothetical random experiment that would identify the effects


Assumptions used in paper

Z as something like an instrument for D and d

Standard assumptions: SUTVA, Z randomized

Exclusion of direct effect of Z on Y: If a change in Z does not affect

(potential) terminating behavior in both programmes, then potential

outcomes Y(Z) are the same (plausible in example)

Strong access monotonicity (D(C)=0, d(T)=0): If randomised into the

tough programme, there is no way of participating in parts of the easy

programme, and conversely ... (plausible in strict experiment)

removes 2 of the 4 (partially) unobservable quantities from the playing field:

$$E[Y(T) - Y(C) \mid D(T), D(C)=0, d(T)=0, d(C)] = E[Y(T) - Y(C) \mid D(T), d(C)]$$

Negative side-effect monotonicity ($D(T) \le d(C)$): If UE would have left the nice

programme, UE would have left the tough programme as well (???)

[behavioral assumption]

restricts the remaining unobservable in terms of the observable
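To make the last point concrete, here is a minimal sketch of how negative side-effect monotonicity restricts the one duration that remains unobserved in each arm. It assumes, purely for illustration, that durations are measured in whole weeks from 0 to 4; the function name `admissible_missing` and the constant `MAX_WEEKS` are hypothetical.

```python
MAX_WEEKS = 4  # programme length from the example; whole-week durations are an assumption


def admissible_missing(z, observed_weeks):
    """Return the name of the unobserved potential duration and the values
    it can take if D(T) <= d(C) holds (negative side-effect monotonicity)."""
    if not 0 <= observed_weeks <= MAX_WEEKS:
        raise ValueError("duration must lie between 0 and MAX_WEEKS")
    if z == "T":
        # D(T) is observed; the missing d(C) must be at least as long.
        return "d(C)", list(range(observed_weeks, MAX_WEEKS + 1))
    if z == "C":
        # d(C) is observed; the missing D(T) can be at most as long.
        return "D(T)", list(range(0, observed_weeks + 1))
    raise ValueError("z must be 'T' or 'C'")


print(admissible_missing("T", 3))  # ('d(C)', [3, 4])
print(admissible_missing("C", 1))  # ('D(T)', [0, 1])
```

Somebody who stayed 3 weeks in the tough programme would have stayed 3 or 4 weeks in the easy one; somebody who left the easy programme after 1 week would have left the tough one after at most 1 week.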


Identification and estimation

How does identification work without a Bayesian perspective? The missing equation ...

Paper shows a Bayesian estimation strategy

For a Non-Bayesian, there remain a couple of open points that center around the

equation that is missing in the paper:

Issues:

- What is the role of the different assumptions in the identification step?

- More specifically: For example, what happens if we assume weak (instead of strong)

access monotonicity? In this case, do we identify an interval or a point?

- Frequentist estimation ... which moments of the data are required?

$$E[Y(T) - Y(C) \mid D(T), d(C)] = \text{???} \quad \text{(function of data)}$$
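As a hedged aid for the frequentist question, the display below lists the moments that the two experimental arms deliver directly under SUTVA, randomization of Z and strong access monotonicity. It assumes, for illustration only, that durations take discrete values k and j. The decomposition is simply the law of total expectation combined with D(T) ≤ d(C); it is not taken from the paper.

```latex
\begin{align*}
E[Y \mid Z=T,\, D=k] &= E[Y(T) \mid D(T)=k]
  = \sum_{j \ge k} E[Y(T) \mid D(T)=k,\, d(C)=j]\; P(d(C)=j \mid D(T)=k) \\
E[Y \mid Z=C,\, d=j] &= E[Y(C) \mid d(C)=j]
  = \sum_{k \le j} E[Y(C) \mid D(T)=k,\, d(C)=j]\; P(D(T)=k \mid d(C)=j)
\end{align*}
```

The left-hand sides are moments of the data. The conditional means and mixing weights inside the sums involve the joint distribution of (D(T), d(C)), which neither arm reveals on its own because nobody is observed in both programmes.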


Next target: Dose response

Additional assumption

- Dose depends (only) on a single index which is observable for the control

group, but unobservable for the treated

- Example: There is some underlying variable which influences length of

participation. However, for the nice programme the UE follow this

‘desire’, but for the nasty programme they deviate towards. This

deviation is influenced by the ‘desire’ only and is otherwise random

(hard to justify in this example)

Same questions as before ...

$$E[Y(T) - Y(d(C)) \mid d], \quad E[Y(D(T)) - Y(C) \mid d] = \text{???} \quad \text{(function of data)}$$


Conclusion

Principal stratification could be a very helpful concept in econometrics

It is clearly related to IV estimation – relation could be made even more

explicit

Taking account of the non-Bayesian perspective would greatly enhance

its value for econometricians
