Reanalysis of Perry Preschool Program


James Heckman, Seong Moon, Peter Savelyev, Rodrigo Pinto, and Adam Yavitz

University of Chicago

EPGE, Rio, December 2009

(This draft December 17, 2009)

1 / 65


Social Experiments on Early Childhood Intervention

Many social experiments try to measure the impact of early childhood investment on lifetime outcomes.

The Perry Preschool Program is considered a cornerstone for justifying early interventions.

In a series of papers, we analyze:

1. Average treatment effects
2. Channels through which treatment operates
3. External validity
4. Cost-benefit analysis
5. Impact of the local economy
6. Inference


Outline of This Talk

1. Background on the High/Scope Perry Preschool Program

2. Statistical Challenges

3. Life Course Impacts of Early Childhood Intervention

4. Beyond Treatment Effects (Channels through Which the Program Operates)

5. Conclusion


1. Background: The Perry Preschool Program Study

What: Small randomized experiment (123 children: 58 treatment, 65 control)

Where: Conducted in Ypsilanti, Michigan

When: The early 1960s

Who: Low-IQ, Low-SES African-American children

Measurements: Covering many varieties of activity over the life cycle (ages 3–15, 19, 27, 40)


Eligibility

Eligibility was (loosely) based on Stanford-Binet IQ score and a measure of socioeconomic status:

Those with IQ scores outside the 70–85 range were excluded.

Socioeconomic status was measured as a weighted linear combination of paternal skill level, paternal educational attainment, and number of rooms per person in the home.

Those with socioeconomic status scores above a certain threshold were excluded.


Data

Broadly, interviews included questions about:

Family structure

Marital status

Cognitive and non-cognitive tests

Education

Criminal behavior

Employment

Income


Intervention: High/Scope Perry Program

Early Age: Children were age 3–5 during treatment.

Duration: Program lasted 30 weeks per year, mid-October through May.

Curriculum: A daily 2-1/2 hour classroom session on weekday mornings.

Home Visits: Weekly 1-1/2 hour home visits by the teacher.

Quality: Teachers certified in elementary (early) and special education.

Program cost: $9,785 per participant per year (year-2006 dollars).


Features of Perry Preschool Curriculum

Children are treated as active learners (Piagetian theory).

Teachers encourage children to engage in play activities that involve making choices and problem solving.

Tasks:
1. Plan, execute, and evaluate activities;
2. Foster cooperation among peers;
3. Encourage communication on solving personal conflicts;
4. Foster social skills.

Source: Schweinhart et al. (1993)


Table 1: Perry Sample Characteristics, by Wave

               Sample Size
Birth Year   Treat.   Control   Ages
1958           13       15      4
1959            8        9      3, 4
1960           12       14      3, 4
1961           13       14      3, 4
1962           12       13      3, 4

123 subjects overall.

Initial wave had one year of treatment; remaining waves had two years.


2. Perry Program Evaluation



2. Why Randomized Experiment?

Randomization facilitates program evaluation:

Randomization traditionally sidesteps problems of selection bias.

Difference-in-means statistics provide measures of interest, such as the average treatment effect.


Why Is Randomization Considered Ideal?

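Traditional data are defined by:

Evaluation equation: $Y = Y_1 \cdot D + Y_0 \cdot (1 - D)$, $D = \mathbf{1}[U_V > 0]$

Non-experimental data: $Y_1 \not\perp\!\!\!\perp Y_0 \not\perp\!\!\!\perp U_V$

Simple regression: $Y = \Delta \cdot D + \beta \cdot X + \varepsilon$

OLS estimator: $E[\Delta] = E[Y_1 \mid D = 1, X] - E[Y_0 \mid D = 0, X]$

Randomized trial: $(Y_0, Y_1) \perp\!\!\!\perp D$

Randomized estimator: $E[\Delta] = E[Y_1 - Y_0 \mid X] \;\therefore\; ATE(X)$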
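The contrast between the two estimators can be illustrated with a small simulation. This is only a sketch under assumed values: the constant treatment effect of 1.0, the normal distributions, and the function name `compare_designs` are hypothetical and are not taken from the Perry data.

```python
import random
from statistics import mean


def compare_designs(n=20000, true_effect=1.0, seed=0):
    """Return (self-selected estimate, randomized estimate) of the treatment effect."""
    rng = random.Random(seed)
    selected = {0: [], 1: []}    # observed outcomes grouped by self-selected D
    randomized = {0: [], 1: []}  # observed outcomes grouped by coin-flip D
    for _ in range(n):
        u = rng.gauss(0, 1)         # U_V: unobservable that also shifts outcomes
        y0 = u + rng.gauss(0, 1)    # untreated potential outcome
        y1 = y0 + true_effect       # treated potential outcome
        d_self = 1 if u > 0 else 0  # D = 1[U_V > 0], i.e. selection on U_V
        d_coin = rng.randint(0, 1)  # assignment independent of (Y0, Y1)
        selected[d_self].append(y1 if d_self else y0)
        randomized[d_coin].append(y1 if d_coin else y0)

    def diff_in_means(groups):
        return mean(groups[1]) - mean(groups[0])

    return diff_in_means(selected), diff_in_means(randomized)


if __name__ == "__main__":
    biased, unbiased = compare_designs()
    print(f"self-selection: {biased:.2f}  randomization: {unbiased:.2f}")
```

Under self-selection the difference in means mixes the treatment effect with the difference in U_V between the groups; under the coin flip it stays close to the assumed effect.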



2. Why Reanalyze?

Previous literature fails to account for three statistical features of the Perry study:

(A) Small sample size.

(B) Irregularities in the randomization protocol (main criticism).

(C) An overwhelmingly large set of follow-up measures (more outcome measures than observations).


2. Evaluating the Perry Study: Statistical Challenges

(A) Small sample size

Resampling (permutation-based) testing.

(B) Compromised randomization: two approaches

1. Based on the implemented Perry randomization protocol. Relies on valid conditional exchangeability properties of Perry. Non-parametric block-permutation test that accounts for the compromises.

2. Conservative approach based on a partially identified model, with weaker assumptions that account for latent heterogeneity.

(C) “Cherry picking” (selective reporting of results)

Multiple-hypothesis testing: the stepdown procedure, an algorithm for multiple-hypothesis inference.

2.A Small Sample Size

Classical statistical tests (t, F, and χ² tests) are only asymptotically correct.

Unreliable when the sample size is small and the data have a non-normal distribution.

Lack degrees of freedom when testing too many joint hypotheses.

The conditions producing failure characterize the Perry data.

Table 2: Perry Sample Size by Gender

          Treatment   Control   Total
Female       25         26        51
Male         33         39        72
Total        58         65       123


Problem of Small Sample Size

2.A Solution to Small Sample: Permutation-Based Inference

1. Permutation-based inference (often termed data-dependent) computes p-values conditional on the observed data.

2. Distribution-free: does not require parametric assumptions about the sampling distribution of the data (e.g., normality).

3. More reliable when the sampling distribution is skewed or non-normal, or when the sample is small.

4. Does not rely on asymptotic theory; valid for small samples.

2.A. How Does a Permutation Test Work?

Relies on the statistical exchangeability of a random vector under the null hypothesis (randomization hypothesis).

Randomization Hypothesis: The randomization hypothesis corresponding to $H_0$ assumes that the distribution of $Y$ is invariant to transformations in a permutation group $G$. Formally, under $H_0$,

$\forall g \in G: \; gY \overset{d}{=} Y$

where $g$ is generally defined as:

$gY = (Y_i ; i \in I \mid Y_i = Y_{\pi_g(i)} \text{ and } \pi_g : I \to I \text{ is a bijection})$.

The set of permuted observations in $Y$ has a uniform distribution conditional on the data.

The compromised randomization protocol dictates how to construct the permutation group $G$.
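The simplest case, in which G contains all relabelings of treatment status, can be sketched as a Monte Carlo permutation test on the difference in means. The function name `permutation_pvalue` and the toy data in the usage lines are illustrative assumptions; the Perry analysis restricts G according to the randomization protocol, which this sketch does not do.

```python
import random
from statistics import mean


def permutation_pvalue(y, d, n_perm=5000, seed=0):
    """One-sided Monte Carlo permutation p-value for mean(treated) - mean(control)."""
    rng = random.Random(seed)

    def diff_in_means(labels):
        treated = [yi for yi, di in zip(y, labels) if di == 1]
        control = [yi for yi, di in zip(y, labels) if di == 0]
        return mean(treated) - mean(control)

    observed = diff_in_means(d)
    labels = list(d)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # draw one element g of the permutation group
        if diff_in_means(labels) >= observed:
            hits += 1
    # The +1 terms count the observed assignment itself as one permutation.
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    outcomes = [5, 6, 7, 8, 1, 2, 3, 4]   # hypothetical outcomes
    treatment = [1, 1, 1, 1, 0, 0, 0, 0]  # hypothetical assignment
    print(permutation_pvalue(outcomes, treatment))
```

The p-value is computed conditional on the observed outcomes: only the labels move, never the data.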


Compromised Randomization

2.B. Why Study the Compromises of Randomization?

Exchangeability properties are a consequence of the randomization protocol.

Compromises occur through reassignment of treatment status based on pre-program variables X.

If X affects outcome Y, such a compromise generates several problems:

1. It induces correlation between treatment assignments D and pre-program variables X that operate on outcomes Y.

2. Treatment assignments D affect outcome Y through channels other than the treatment effect itself.

3. Biased inference will occur if the distribution of treatment assignment is wrongly assumed.
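The three problems above can be reproduced in a toy simulation in which the true treatment effect is zero. All numbers are assumptions made for illustration: a 30% share of working mothers, a 0.7 probability that a treated child with a working mother is moved to control, and an outcome shift of 1.0 for X. None of them is estimated from Perry.

```python
import random
from statistics import mean


def simulate_compromise(n=20000, swap_prob=0.7, seed=1):
    """Return (naive difference in means, differences within each value of X)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = 1 if rng.random() < 0.3 else 0   # X: mother works (hypothetical rate)
        d = rng.randint(0, 1)                # initial coin flip
        if d == 1 and x == 1 and rng.random() < swap_prob:
            d = 0                            # compromise: reassignment based on X
        y = 1.0 * x + rng.gauss(0, 1)        # X shifts Y; treatment effect is zero
        rows.append((y, d, x))

    def diff(subset):
        treated = [y for y, d, _ in subset if d == 1]
        control = [y for y, d, _ in subset if d == 0]
        return mean(treated) - mean(control)

    naive = diff(rows)
    within = {xv: diff([r for r in rows if r[2] == xv]) for xv in (0, 1)}
    return naive, within


if __name__ == "__main__":
    naive, within = simulate_compromise()
    print(f"naive: {naive:.2f}  within X=0: {within[0]:.2f}  within X=1: {within[1]:.2f}")
```

The naive comparison is pulled away from zero because D and X are now correlated, while the comparisons within levels of X stay near zero. This is the motivation for conditioning on X in the inference procedure.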




The Perry Randomization Protocol

How Was the Perry Randomization Protocol Conducted?

For each one of the five waves:

Stage 1: Eligibility;

Stage 2: Construction of two groups initially paired by IQ;

Stage 3: Swaps to balance background variables;

Stage 4: Coin flip to assign treatment;

Stage 5: Post-randomization reassignment based on mother’s work status (not random);

Younger siblings: Only the eldest sibling participates in randomization. All younger siblings receive the assignment of the eldest.

Figure 1: Perry Randomization Diagram

[Diagram: the unrandomized entry cohort, ordered by IQ score, passes through the steps below into treatment (T) and control (C) groups; subjects with older siblings in previous waves follow the older sibling.]

Step 0. Subjects with older siblings given the same treatment assignment as the older sibling.

Step 1. Unlabeled groups formed by pairing subjects by successive IQ.

Step 2. Some swaps to balance gender and SES.

Step 3. Randomly assign treatment status to each group.

Step 4. Swap some treatment subjects with working mothers into the control group.

Compromises due to Swaps

Figure 2: IQ at Entry, by Wave and Treatment Group

[Table: counts of control and treatment subjects at each Stanford-Binet entry IQ score, shown separately for Class 1 through Class 5.]

Stanford-Binet IQ at entry (age 3) was used to measure baseline IQ.

Siblings Restriction

Table 3: Siblings in Perry Sample

                 Overall            Male             Female
             All  Ctl.  Trt.   All  Ctl.  Trt.   All  Ctl.  Trt.
Singleton     82   41    41     51   26    25     31   15    16
Eldest        19   12     7     13    9     4      6    3     3
Younger       22   12    10      8    4     4     14    8     6
N            123   65    58     72   39    33     51   26    25

Note: The sample includes 17 pairs, 1 triple, and 1 quadruple of siblings.

Imbalance of pre-program variables

Table 4: Maternal Employment and Paternal Presence (at Study Entry)

                 Maternal Empl.         Paternal Presence
              Yes   No   Prob.(a)     Yes   No   Prob.(a)
Males
  Trt.          2   31     .06         15   18     .45
  Ctl.         11   28     .28         22   17     .56
Females
  Trt.          3   22     .12         17    8     .68
  Ctl.          9   17     .35         11   15     .42

Notes: Measured at study entry (age 4 for wave 1; age 3 for other waves).
(a) Conditional probability estimate (# yes / total).



2.B. Compromised Randomization Protocol

Implementation of the Inference Procedure for Compromised Randomization

We show that some conditional exchangeability properties remain valid under compromised randomization.

Our permutation test adapts the Freedman and Lane (1983) procedure to the characteristics of the Perry randomization protocol.



Byproducts of Our Conditional Permutation Procedure

Respects the impossibility of certain assignments under the original (implemented) randomization protocol (e.g., family clustering);

Prevents imbalance of background variables (SES index, IQ, and gender) across permutations;

Accounts for the dependence between treatment assignment and background variables.

Valid under the Perry compromised randomization.
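The first two properties can be sketched as a restricted permutation test: labels move only within strata, and siblings move as a block. This is a simplified illustration rather than the full procedure. It omits the Freedman and Lane residualization on covariates, and it assumes that each family falls in a single stratum. The function name and the toy data are hypothetical.

```python
import random
from statistics import mean


def restricted_permutation_pvalue(y, d, family, stratum, n_perm=2000, seed=0):
    """One-sided p-value, permuting family-level labels within strata only."""
    rng = random.Random(seed)
    family_label, families_in = {}, {}
    for di, fam, s in zip(d, family, stratum):
        if fam not in family_label:
            family_label[fam] = di           # siblings share one assignment
            families_in.setdefault(s, []).append(fam)

    def diff_in_means(label_of):
        treated = [yi for yi, fam in zip(y, family) if label_of[fam] == 1]
        control = [yi for yi, fam in zip(y, family) if label_of[fam] == 0]
        return mean(treated) - mean(control)

    observed = diff_in_means(family_label)
    hits = 0
    for _ in range(n_perm):
        permuted = {}
        for fams in families_in.values():    # shuffle labels inside each stratum
            labels = [family_label[fam] for fam in fams]
            rng.shuffle(labels)
            permuted.update(zip(fams, labels))
        if diff_in_means(permuted) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    y = [10, 11, 1, 9, 2, 12, 3, 10, 2]       # hypothetical outcomes
    family = [0, 0, 1, 2, 3, 4, 5, 6, 7]      # family 0 has two siblings
    d = [1, 1, 0, 1, 0, 1, 0, 1, 0]
    stratum = [0, 0, 0, 0, 0, 1, 1, 1, 1]     # e.g. SES below / above the median
    print(restricted_permutation_pvalue(y, d, family, stratum))
```

Because labels are shuffled at the family level, no permutation ever splits siblings across groups, and the number of treated families in each stratum is the same in every permutation.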



Multiple-Hypothesis Testing

2.C. Multiple-Hypothesis Testing




2.C. Evidence of Significant Treatment Effects

Table 5: Percentage of Test Statistics Greater than Indicated Significance Level

Data used*:                                All Data   Male Data Only   Female Data Only
Percentage of p-values smaller than 1%        7%            3%                7%

* Based on 715 outcomes in the overall Perry study (see Schweinhart et al. (2005) for a description of the data): 269 outcomes from the period before the age-19 interview, 269 from the age-19 interview, 95 from the age-27 interview, and 55 from the age-40 interview.

inference approaches above, is described in Section 3.3.

3.1 Permutation Testing Procedure

The permutation-based inference in this paper is designed principally to address the small sample size, but to do so in a way that permits us to simultaneously account for compromised randomization.

Advantages of Permutation-Based Inference. A permutation test revolves around restating a null hypothesis in terms of a set of permutations of the data, such that if the null hypothesis is true, then the distribution of the data will be invariant to that set of permutations. In other words, it relies on the assumption of exchangeability of observations under the null hypothesis. Permutation-based tests are often termed data-dependent tests, because the computed p-values are conditioned on the observed data. These tests are also termed distribution-free because they do not rely on assumptions about the parametric distribution from which the data have been sampled. In this sense they do not rely on the normality assumptions mentioned previously. Because permutation tests give accurate p-values even when the sampling distribution is skewed, they are often used when the sample is small and unlikely to fulfill normality conditions. Hayes (1996), among others, shows the advantage of permutation tests over the classical approaches for the case of small samples and non-normal data.

If the null hypothesis of no treatment effect is true and there is no compromise in the randomization protocol, then the distribution of outcomes for treatments and controls should be the same. Notationally, the random variable D that describes treatment assignment should be independent of outcome Y, or Y ⊥⊥ D. Statistics based on assignments D and outcomes Y are thus distribution-invariant or exchangeable under reassignment of treatment labels according to the randomization procedure. For example, under the null hypothesis, the distribution of a statistic such as the difference in means between treatments and controls should not change if treatment status is permuted across observations. The compromised aspects of randomization require more attention in the use of the permutation approach, as the permutations should comply



2.C. Type-I Error in Multiple-Hypothesis Testing

The goal of multiple-hypothesis testing is to control the familywise error rate (FWER) at a given level α:

FWER = P{reject any true null hypothesis} ≤ α

Strong FWER control: controls FWER regardless of which subset of the null hypotheses is actually true.

Weak FWER control: controls FWER only when all null hypotheses are true.
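A short calculation shows why uncorrected testing fails to control FWER. It assumes that the tests are independent and that every null hypothesis is true, which is an illustrative simplification (Perry outcomes are correlated). The helper names are hypothetical, and m = 14 is chosen only because one block of outcomes reported in this talk contains 14 tests.

```python
def fwer_independent(m, alpha):
    """FWER when m independent true nulls are each tested at level alpha."""
    return 1 - (1 - alpha) ** m


def bonferroni_level(m, alpha):
    """Per-test level that keeps FWER at or below alpha under Bonferroni."""
    return alpha / m


if __name__ == "__main__":
    m, alpha = 14, 0.05
    print(f"uncorrected FWER: {fwer_independent(m, alpha):.3f}")
    print(f"Bonferroni FWER:  {fwer_independent(m, bonferroni_level(m, alpha)):.3f}")
```

Bonferroni restores control but ignores dependence among the test statistics, which is the motivation for the stepdown approach.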


The Stepdown Algorithm

We perform multiple-hypothesis testing using the stepdown algorithm:

1. Start by targeting the most significant test statistic.

2. Decide whether it is statistically significant.

3. “Step down” to smaller, less significant test statistics.

Less conservative than the traditional Bonferroni and Holm procedures because it incorporates the dependence structure of the test statistics.

Corrects for pre-testing, “cherry picking,” or “fishing.”

Romano and Wolf (2005) show that stepdown provides strong FWER control under a certain monotonicity condition on the test statistics.
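The three steps can be sketched as a generic max-statistic stepdown adjustment. This is an assumption-laden illustration in the spirit of the procedure, not the authors' implementation. It takes as given a list of observed statistics and a list of joint draws of those statistics under the null (for example, from permutations); both inputs in the usage lines are made-up numbers.

```python
def stepdown_adjusted_pvalues(observed, null_draws):
    """Adjusted p-values; null_draws[b][k] is statistic k in null draw b."""
    # Step 1: order hypotheses from most to least significant statistic.
    order = sorted(range(len(observed)), key=lambda k: observed[k], reverse=True)
    adjusted = [0.0] * len(observed)
    running_max = 0.0
    for step, k in enumerate(order):
        remaining = order[step:]  # hypotheses not yet stepped past
        # Step 2: compare the statistic with the null distribution of the
        # maximum over the remaining hypotheses (this keeps the dependence).
        exceed = sum(
            1 for draw in null_draws
            if max(draw[j] for j in remaining) >= observed[k]
        )
        p = (exceed + 1) / (len(null_draws) + 1)
        # Step 3: step down, enforcing monotone adjusted p-values.
        running_max = max(running_max, p)
        adjusted[k] = running_max
    return adjusted


if __name__ == "__main__":
    observed = [3.0, 1.0, 2.0]  # hypothetical test statistics
    null_draws = [
        [0.5, 0.2, 0.1],
        [2.5, 0.3, 0.4],
        [0.1, 1.5, 0.2],
        [0.3, 0.2, 0.1],
    ]
    print(stepdown_adjusted_pvalues(observed, null_draws))
```

Later steps take the maximum over fewer hypotheses, so their adjusted p-values can be smaller than a single-step adjustment that always uses the maximum over all of them.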

Table Example

Table 6: Earnings & Employment Outcomes (cont.), Females: Effects

                                                        Effect
Outcome                          Age   Ctl. Mean   Uncond.(a)   Cond.(b)
No Job in Past Year               19      0.58       -0.34       -0.37
Jobless Months in Past 2 Yrs.     19     10.42       -5.20       -5.47
Current Employment                19      0.15        0.29        0.23
Monthly Earn., Current Job        19      2.08       -0.61       -0.47
No Job in Past Year               27      0.54       -0.29       -0.25
Current Employment                27      0.55        0.25        0.18
Monthly Earn., Current Job        27      1.13        0.69        0.48
Jobless Months in Past 2 Yrs.     27     10.45       -4.21       -2.14
Yearly Earn., Current Job         27     15.45        4.60        2.18
No Job in Past Year               40      0.41       -0.25       -0.22
Yearly Earn., Current Job         40     19.85        4.35        4.46
Monthly Earn., Current Job        40      1.85        0.21        0.27
Jobless Months in Past 2 Yrs.     40      5.05       -1.05        1.05
Current Employment                40      0.82        0.02       -0.08



Table Notes

Monetary values adjusted to thousands of year-2006 dollars using annual national CPI.

Age-19 measures are conditional on at least some earnings during the period specified; observations with zero earnings are omitted in computing means and regressions.

(a) Unconditional difference in means between the treatment and control groups;

(b) Conditional treatment effect with linear covariates Stanford-Binet IQ, socioeconomic status (SES) score, maternal employment, and father’s presence at study entry; this is also the effect for Freedman and Lane under the full linearity assumption (p-value is in column “Full Lin.”).

Table Example

Table 7: Earnings & Employment Outcomes (cont.), Females: p-Values

                                                          p-values
Outcome                          Age   Naïve(a)   Full Lin.(b)   Partial Lin.(c)   Part. Lin. (adj.)(d)
No Job in Past Year               19     .006        .007            .003               .010
Jobless Months in Past 2 Yrs.     19     .054        .099            .020               .056
Current Employment                19     .023        .045            .032               .064
Monthly Earn., Current Job        19     .750        .701            .725               .725
No Job in Past Year               27     .017        .058            .037               .094
Current Employment                27     .036        .096            .042               .094
Monthly Earn., Current Job        27     .050        .144            .109               .188
Jobless Months in Past 2 Yrs.     27     .077        .285            .165               .241
Yearly Earn., Current Job         27     .169        .339            .277               .277
No Job in Past Year               40     .032        .092            .056               .156
Yearly Earn., Current Job         40     .251        .272            .224               .423
Monthly Earn., Current Job        40     .328        .316            .261               .440
Jobless Months in Past 2 Yrs.     40     .343        .654            .528               .627
Current Employment                40     .419        .727            .615               .615



Table Notes

All p-values are one-sided and correspond to the null hypothesis of no treatment effect.

(a) Based on conditional permutation inference, without orbit restrictions or linear covariates; estimated effect size in the “unconditional effect” column;

(b) Based on the Freedman-Lane procedure, without restricting permutation orbits and assuming linearity in all covariates (maternal employment, paternal presence, socio-economic status index (SES), and Stanford-Binet IQ); estimated effect size in the “conditional effect” column;

(c) Based on the Freedman-Lane procedure, using the linear covariates maternal employment, paternal presence, and Stanford-Binet IQ, and restricting permutation orbits within strata formed by socio-economic status index (SES) being above or below the sample median and permuting siblings as a block;

(d) p-values from the previous column, adjusted for multiple inference using the stepdown procedure.


3. Consequences of Early Childhood Intervention



Empirical Findings: Cognitive Tests

Treatment effect on full-scale IQ fades out

Treatment effects persist in achievement tests

Treatment effects are stronger for females


Figure 3: Stanford-Binet IQ, Males

Age          Entry     4      5      6      7      8      9     10
Treatment     79.2   94.9   95.4   91.5   91.1   88.3   88.4   83.7
Control       77.8   83.1   84.8   85.8   87.7   89.1   89.0   86.0

Treatment effect on IQ fades out

Figure 4: Stanford-Binet IQ, Females

Age          Entry     4      5      6      7      8      9     10
Treatment     80.0   96.4   94.3   90.9   92.5   87.8   86.7   86.8
Control       79.6   83.7   81.7   87.2   86.0   83.6   83.0   81.8

Treatment effect on IQ fades out

Figure 5: Total CAT Scores by Treatment Group, Age 14

[Histograms: number of subjects by CAT percentile (0 to 100), shown separately for control and treatment.]

CAT = California Achievement Test

Treatment: N = 49 (9 missing); Control: N = 46 (19 missing)

Statistically significant treatment effect for both males and females (p-values 0.009 and 0.021, respectively)

Figure 6: Test of Cognitive Skills, Age 28

[Bar charts: control and treatment mean APL scores (scale 0 to 8) for Writing (28), Problem Solving (28), and Reading (28), in two panels titled Male APL Mean Scores and Female APL Mean Scores.]

Cognitive Tests

Table 8: P-values for APL Scores, Ages 19 & 27

                              Age 19             Age 27
Measure                   Male   Female      Male   Female
Community Resources       .566    .040       .814    .318
Occupational Knowledge    .004    .004       .194    .034
Consumer Economics        .174    .217       .141    .650
Health                    .058    .123       .088    .005
Government & Law          .056    .494       .227    .213
ID of Facts and Terms     .175    .115       .690    .135
Reading                   .053    .019       .665    .013
Writing                   .067    .093       .158    .098
Computation               .068    .439       .244    .664
Problem Solving           .292    .091       .019    .053
Joint Test                .032    .027       .101    .034

p-values are from a one-sided test of treatment effect using Mann-Whitney U-statistics, conditioning on maternal employment and paternal presence, and restricting permutations on SES and IQ percentiles, maternal employment, and permuting siblings as a block.


Empirical Findings: Education

Treated females excel at schooling

No significant schooling results for males


Figure 7: Schooling Achievement

[Bar charts: control and treatment values (scale 0 to 1) for HS Graduation (19), Vocational Training (40), and Some College (40), in two panels titled Male Schooling and Female Schooling.]

Empirical Findings: Education

Three varieties of educational outcome are tested:

High-school graduation, GED attainment, and college attendance;

primary and secondary school outcomes; and

primary school problem indicators.

All of these groups of outcomes show a pattern of significant results for females at multiple ages and few or no significant results for males.


Empirical Findings: Employment and Income

Treated adults earn more.

Treated adults experience more employment.

Females show employment effects earlier than males.


Figure 8: Employment Status

[Bar charts: control and treatment values (scale 0 to 1) for current employment by age, including Current Employment (19) and Current Employment (40), in two panels titled Male Employment and Female Employment.]

Economic Outcomes: Property, Financial Services, and Welfare

Treated males outperform control males.

Weak evidence for females.


Empirical Findings: Crime

Treated adults commit less crime

Greater reduction in crime for males


Figure 9: Arrest Counts

[Bar charts: control and treatment values (scale 0 to 1) for Never Arrested (up to 19), Never Arrested (20-27), and Never Arrested (28-40), in two panels titled Male Crime Measures and Female Crime Measures.]

4. Beyond Treatment Effects

1. How do cognitive and noncognitive skills change in response to experimental investment?

2. How do these changes in skills translate into life outcomes?


Mechanism through Which Early Childhood Interventions Work

1. Treatment enhances skills.

2. Some skills are unobserved abilities.

3. Skills can be proxied by cognitive and noncognitive tests (measures).

4. Skills affect all outcomes, but in different ways.

5. Treatment effects operate on outcomes through these enhanced skills.


Estimating the Model

1. We estimate the effects of the experimental intervention on cognitive and noncognitive capabilities (measurement system).

2. Then, we estimate the mediating effects of these capabilities on later life outcomes.


Our measures of Cognitive and Non-cognitive Traits

Human abilities cannot be summarized by a single measure.



The Big Five and Digman’s Personality Traits

The Big Five taxonomy includes openness (O), conscientiousness (C), extraversion (E), agreeableness (A), and neuroticism (N).

The Big Five traits are linked to two robust higher-order traits named “α” (CAN) and “β” (EO) in Digman’s (1997) hierarchical model of personality.

We use Perry noncognitive measures to map Digman’s “α” (Personal Behavior) and “β” (Social Development).

We use IQ scores as measures of cognition.


Table 9: Description of Measures

Factor               # of Measures   Description of Measures
Cognitive                  3         Stanford-Binet IQ;
                                     Peabody Picture Vocabulary Test (PPVT);
                                     Leiter International Performance Scale.
Personal Behavior          4         PBI absences or truancies;
                                     PBI lying or cheating;
                                     PBI steals;
                                     PBI swears or uses obscene words.
Social Development         3         YRS social relationship with classmates;
                                     YRS social relationship with teachers;
                                     YRS extraversion.


Figure 10: General Intelligence Factor Scores, Kernel Density

[Kernel density plots of factor scores (horizontal axis from −3 to 3) for control and treatment.]

(a) Males: pm = .360; pv = .718

(b) Females: pm = .011; pv = .068


Figure 11: Personal Behavior Factor Scores, Kernel Density

[Kernel density plots of factor scores (horizontal axis from −3 to 3) for control and treatment.]

(a) Males: pm = .034; pv = .864

(b) Females: pm = .064; pv = .176


Figure 12: Social Development Factor Scores, Kernel Density

[Kernel density plots of factor scores (horizontal axis from −3 to 3) for control and treatment.]

(a) Males: pm = .128; pv = .366

(b) Females: pm = .003; pv = .068


Outcome Equation

The vector $\theta = [\theta^c, \theta^b, \theta^s]'$ represents a participant’s levels of latent abilities.

$D$ denotes the treatment status, that is, $D = 1$ if a person is in the treatment group, and $D = 0$ otherwise.

Let $\theta_D$ be the capabilities associated with treatment $D$.

Let $\mathcal{K} = \{1, \dots, K\}$ be the set of indices of all $K$ outcomes, $Y_k$; $k \in \mathcal{K}$.

$Y_{k,D} = \tau_{k,D} + \alpha_k \cdot \theta_D + \beta_k \cdot X_D + \varepsilon_{k,D}$, (1)


Decomposing Treatment Effects Into Mediating Variables

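The mean of the conditional treatment effect for outcome $k$, adjusting for $X$:

$\Delta_k \mid X \equiv E(Y_{k,1} - Y_{k,0}) - \beta_k \cdot E(X_1 - X_0)$ (2)

Contributions due to various factors:

$\Delta_k \mid X = \underbrace{\alpha^c_k E(\theta^c_1 - \theta^c_0)}_{\text{cognition}} + \underbrace{\alpha^b_k E(\theta^b_1 - \theta^b_0)}_{\text{personal behavior}} + \underbrace{\alpha^s_k E(\theta^s_1 - \theta^s_0)}_{\text{social development}} + \underbrace{(\tau_{k,1} - \tau_{k,0})}_{\text{other factors}}$. (3)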
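The arithmetic of the decomposition can be sketched for a single outcome k. All inputs are hypothetical numbers chosen for illustration, not Perry estimates: `alpha` stands for the loadings α_k, `theta_gain` for the mean skill changes E(θ_1 − θ_0), and `tau_gain` for τ_{k,1} − τ_{k,0}.

```python
def decompose(alpha, theta_gain, tau_gain):
    """Split a conditional treatment effect into skill and residual components."""
    # Each skill contributes its loading times the mean treatment-induced change.
    contributions = {name: alpha[name] * theta_gain[name] for name in alpha}
    contributions["other factors"] = tau_gain
    total = sum(contributions.values())
    # Shares read as percentages only when all components have the same sign.
    shares = {name: value / total for name, value in contributions.items()}
    return total, contributions, shares


if __name__ == "__main__":
    alpha = {"cognition": 0.2, "personal behavior": 0.5, "social development": 0.3}
    theta_gain = {"cognition": 0.1, "personal behavior": 0.4, "social development": 0.2}
    total, contributions, shares = decompose(alpha, theta_gain, tau_gain=0.12)
    print(f"total effect: {total:.2f}")
    for name, share in shares.items():
        print(f"{name}: {share:.0%}")
```

The shares correspond to the segments of a 0% to 100% stacked bar for one outcome, which is how the decomposition figures in this section are laid out.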



Figure 13: Decomposition of Treatment Effects, Males, Part I

[Stacked bar chart (0% to 100%) splitting each treatment effect into Cognitive Factor, Personal Behavior, Social Development, and Other Factors. The sign of each effect is shown in parentheses.]

Total # of lifetime arrests, age 40 (-)

Total charges of all crimes, age 40 (-)

Ever on welfare, age 40 (-)

# of adult arrests, age 27 (-)

# of felony arrests, age 27 (-)

No tobacco use, age 27 (+)

Monthly earnings, age 27 (+)

CAT total*, age 14 (+)


Figure 14: Decomposition of Treatment Effects, Males, Part II

[Stacked bar chart (0% to 100%) of the contribution of each factor to each treatment effect. The sign of each effect is shown in parentheses.]

Any charge of a crime with victim cost, age 40 (-)

Total charges of property crimes with victim cost, age 40 (-)

Total charges of all crimes with victim cost, age 40 (-)

Total charges of violent crimes with victim cost, age 40 (-)

Total # of felony arrests, age 40 (-)

Total # of misdemeanor arrests, age 40 (-)

Total # of adult arrests, age 40 (-)


Figure 15: Decomposition of Treatment Effects, Females, Part I

[Stacked bar chart (0% to 100%) of the contribution of each factor to each treatment effect. The sign of each effect is shown in parentheses.]

Had savings account, age 27 (+)

Jobless for more than 1 year, age 27 (-)

Employed, age 27 (+)

Highest grade completed, age 19 (+)

High school graduate, age 19 (+)

Ever mentally impaired, age 19 (-)

Any special education, age 14 (-)

CAT total, age 14 (+)


Figure 16: Decomposition of Treatment Effects, Females, Part II

[Stacked bar chart (0% to 100%) of the contribution of each factor to each treatment effect. The sign of each effect is shown in parentheses.]

Total # of lifetime arrests, age 40 (-)

Total charges of crimes, age 40 (-)

Months in marriages, age 40 (+)


5. Conclusions: Inference on Treatment Effects

1. Strong evidence that early childhood intervention matters.

2. Compromised randomization and small sample problems demand a novel statistical approach.

3. These statistical challenges do not preclude the possibility of developing valid formal inferences.

4. Results survive the correction for the statistical features of the Perry data.


5. Conclusions: Beyond Treatment Effects

1. Our analysis decomposes Perry treatment effects on life outcomes into components associated with increments in skills.

2. We contribute to an emerging literature in economics documenting the impact of noncognitive skills on lifetime outcomes.

3. Intervention operates mainly through enhancing noncognitive skills.
