The Effect of Private Tutoring Expenditures on Academic
Performance of Middle School Students: Evidence from the
Korea Education Longitudinal Study
Deockhyun Ryu
Department of Economics, Chung-Ang University, Seoul, South Korea
Changhui Kang†
Department of Economics, Chung-Ang University, Seoul, South Korea
† Corresponding address: Department of Economics, Chung-Ang University, 221 Heukseok-Dong Dongjak-Gu,
Seoul 156-756, South Korea; E-mail: [email protected], Phone: +82-2-820-5862, Fax: +82-2-812-9718.
Abstract
In order to shed light on the effectiveness of educational inputs for student outcomes, this paper examines the effect of private tutoring expenditures on academic performance of middle school students in South Korea. In the face of difficulties in causal estimation, the paper employs IV, first-difference (FD), propensity-score matching and nonparametric bounding methods. We apply these methods to a panel data set from South Korea, the Korea Education Longitudinal Study (KELS). The results show that the true effect of private tutoring remains at most modest. IV (FD) estimates suggest that a 10 percent increase in expenditure raises a test score by 0.029 SD or 1.06 percent (0.003 SD or 0.122 percent). Matching estimates imply that the same amount of increase in expenditure leads to a 0.21 to 0.60 percent higher average test score. The tightest bounds of the effect of tutoring reveal that the same increase in expenditure improves the test score by a low of 0 to a high of 2.03 percent, while statistical tests fail to rule out zero effects. Such modest effects of private tutoring seem fairly comparable to the effects of public school expenditures on test scores and earnings estimated by previous studies.
JEL Classification: I20, C30
Keywords: Private Tutoring, Test Scores, IV, First-difference, Matching, Nonparametric Bounds, South Korea
1 Introduction
The effectiveness of monetary educational expenditures for student academic performance is
one of the most controversial issues in educational research. Papers that summarize the debate
often support conflicting views on the gains from educational investments.
On the one hand, a long list of studies on the effectiveness of public school expenditures
yields different conclusions (Betts, 1996; Card and Krueger, 1996; Hanushek,
1986, 1997, 2003; Krueger, 2003). Recent studies based on natural experiments or randomiza-
tion in developing countries continue to reveal conflicting evidence on the effectiveness of public
school inputs (Banerjee et al., 2007; Glewwe et al., 2004, 2007; Jacob and Lefgren, 2004; Lavy
and Schlosser, 2005; Leuven et al., 2007).
On the other hand, studies focusing on private schools (e.g., Catholic schools) also do not agree
on the impacts of educational inputs (Altonji et al., 2005a, 2005b; Evans and Schwab,
1995; Figlio and Stone, 1999; Goldhaber, 1996; Neal, 1997). Research drawing on private school
voucher experiments in the U.S. and some Latin American countries reveals recent evidence on
impacts of private school attendance (Anand et al., 2009; Angrist et al., 2002; Krueger and Zhu,
2004; McEwan, 2004).
As a third line of research in which the current study also engages, there is a small but
growing literature investigating impacts of private tutoring on students. The studies also have
yet to show unambiguous evidence on causal impacts of private tutoring: Dang (2007), Dang
and Rogers (2008) and Ono (2007) support strong effects of private tutoring; in contrast, Briggs
(2001), Gurun and Millimet (2008) and Kang (2007) document negligible impacts of tutoring
and coaching on education outcomes.
Although informative for furthering our understanding about the effectiveness of educational
investments, recent studies on private tutoring face at least two limitations. First, studies are
based on questionable empirical methods for drawing causal effects of private tutoring. Either
they are based on IVs that are potentially correlated with the outcome, or they fail to explicitly
control for endogeneity of private tutoring.1 Second, the measures of educational outcomes
used by the studies are usually indirect and too coarse for a deep understanding of the causal
impacts of private tutoring.2
1 For example, Dang (2007) relies on a joint Tobit-ordered probit model that involves a fairly complicated likelihood function whereby an identification of the effect of private tutoring comes from an instrumental variable (IV) of tutoring fees charged by the schools in the commune. As the author admits, such a variable is likely to proxy for local living standards, which can be directly correlated with a student’s educational outcome. Ono (2007) examines the effect of ronin (spending additional years upon graduation of high school to enter prestigious colleges in Japan) on college quality, employing the average quality of colleges within the respondent’s prefecture of origin as an IV for the ronin status. As in Dang (2007), such a variable can also be related with a respondent’s college quality as a proxy for unobservable student and family characteristics. Briggs (2001) addresses endogeneity of coaching for standardized admissions tests (SAT or ACT) in the U.S. by means of Heckman’s selection correction methods, but it is unclear where an identification comes from besides functional form assumptions.
2 Dang (2007) employs self-reported academic ranking (poor, average, good and excellent) in school; Ono (2007) the attending college’s mean score of entrance examinations; Gurun and Millimet (2008) whether to attend a university. Briggs (2001) and Kang (2007) alone rely on test scores to measure educational outputs: SAT and ACT scores in the U.S. and the College Scholastic Ability Test (CSAT) in South Korea, respectively.
Given the room for improvement in research and the continuing interest in the effectiveness of
educational inputs for outputs, this paper contributes to the literature by examining the causal
effect of private tutoring expenditures on academic performance of middle school students in
South Korea.3 Sharing with previous studies the idea of examining the effectiveness of educational
inputs by looking into private tutoring, the current study adds to the literature in at least
two dimensions. First, the present study employs a more direct measure of the educational
outcome and more detailed information on private tutoring expenditures than earlier studies
on private tutoring have available. For empirical analysis we rely on a student’s longitudinal
information on test scores of each of three primary academic subjects (Korean, English and
math) and tutoring expenditures for each subject. Using data from the Korean Education and
Employment Panel (KEEP) that has information on overall tutoring expenditures and average
test scores of Korean, English and math subjects, Kang (2007) shows that the true effect of
private tutoring on total scores of the college entrance test for high school students (grade 12)
remains at best modest in South Korea. Given his results on the impact of tutoring on twelfth-graders,
we also extend his study by focusing on middle school students in grades 7 to 9, since
impacts of tutoring can vary by the grade level.
3 In response to a rigid public education system and a lack of independent private schools and in order to supplement public school education, parents in South Korea spend a great deal of money on private tutoring for their children. For an overview of the education system and private tutoring markets in South Korea, see Kang (2007) and Kim and Lee (2001). For an alternative explanation about widespread private tutoring that focuses on the ability-mixing school system of Korea, see Hur and Kang (2007).
Second, in order to overcome endogeneity of tutoring expenditures in estimations, the current
paper employs four different empirical methods that are frequently used in the recent
treatment effects literature to draw causal estimates: an instrumental variable (IV) method, a
first-difference (FD) method, a propensity-score matching method and a nonparametric bounding
method.
In an IV method we rely on whether a student is first-born in the family as an IV for the
amount of tutoring expenditures for that student. A rationale behind this idea is that a student’s
first-born status is determined by nature, while parents usually invest more in the first-born’s
education than in the later-born’s (Black et al. 2005). In an FD method we remove a
student’s unobservable heterogeneity by first-differencing test scores and tutoring expenditures
for a student. A panel structure of the data enables such a longitudinal method.
A propensity-score matching method attempts to construct a counterfactual observation(s)
of the treated observation based on the propensity score (Rosenbaum and Rubin 1983). By
averaging the difference between the treated outcome and counterfactual outcomes for all the
treated observations, the matching method yields causal estimates for the average treatment
effects on the treated (ATT). In the current paper, the treatment is an increasing level of private
tutoring expenditure.
Our fourth method, a nonparametric bounding method, is relatively new and has recently been gaining
popularity in empirical analysis (Blundell et al., 2007; Gerfin and Schellhorn, 2006; Gonzalez,
2005; Kreider and Pepper, 2007; Lechner, 1999; Manski, 1990, 1997; Manski and Pepper, 2000).
Instead of obtaining point estimates that often rely on questionable assumptions, the bounding
method attempts to calculate the lower and upper bounds of the average treatment effect given
a few weaker assumptions.
We apply these four empirical methods to a panel data set from South Korea, the Korea
Education Longitudinal Study (KELS), that has longitudinal information on private tutoring
expenditures and test scores of three primary subjects (Korean, English and math) for students
of grades 7 to 9. A common finding from the four empirical methods is that the true effect of
private tutoring remains at most modest. IV (FD) estimates suggest that a 10 percent increase
in expenditure raises the average overall score by 0.029 SD or 1.06 percent (0.003 SD or 0.122
percent); matching estimates imply that the same amount of increase in expenditure leads to
a 0.21 to 0.60 percent higher average test score; the tightest bounds of the effect of tutoring
reveal that a 10 percent increase in expenditure improves the average test score by a minimum
of 0 to a maximum of 2.03 percent, while statistical tests fail to rule out zero effects of tutoring.
Our current findings for the effect of private tutoring for middle school students generally agree
with the results of Kang (2007) for high school students in grade 12. There is no compelling
evidence that causal impacts of private tutoring are strong and differ by the grade level of the
student.
The rest of the paper is organized as follows. Section 2 outlines the empirical strategy of
the paper. Data are discussed in section 3; empirical results are shown in section 4. Section 5
concludes.
2 Empirical Framework
For empirical analysis we consider a value-added model of the educational production function
expressed by:
$$y_{it} = \beta_0 + y_{it-1}\beta_1 + s_{it}\beta_2 + X_{it}\beta_3 + \alpha_i + u_{it} \qquad (1)$$
where $y_{it}$ is the Z-score of student $i$ ($i = 1, \dots, N$) at year $t$ ($t = 2006, 2007$), normalized
from the raw test score ($Y_{it}$) to have mean zero and variance one in each grade sample; $y_{it-1}$ is
a measure of $i$’s pre-determined academic capability that attempts to control for $i$’s unobserved
and unmeasured characteristics (e.g., cognitive abilities, motivation, perseverance, etc.); $s_{it}$ is
the logarithm of the average monthly expenditures on $i$’s tutoring at year $t$ ($S_{it}$)4; $X_{it}$ is
a vector of $i$’s personal and family backgrounds as well as school characteristics at $t$; $\alpha_i$ is $i$’s
unobserved and unmeasured characteristics that are left uncontrolled by $y_{it-1}$; and $u_{it}$ is the
random error term.
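The two variable transformations behind equation (1) can be sketched as follows. This is only an illustration under stated assumptions: the function names and the toy numbers are hypothetical, the offset of 10 follows the footnote on the log transformation, and the population standard deviation is assumed for the normalization.

```python
import numpy as np

def zscore_by_grade(raw_score, grade):
    """Normalize raw scores to mean 0 and variance 1 within each grade sample."""
    raw_score = np.asarray(raw_score, dtype=float)
    grade = np.asarray(grade)
    z = np.empty_like(raw_score)
    for g in np.unique(grade):
        mask = grade == g
        # Population standard deviation (ddof=0) is an assumption of this sketch.
        z[mask] = (raw_score[mask] - raw_score[mask].mean()) / raw_score[mask].std()
    return z

def log_expenditure(spending, offset=10.0):
    """Log of monthly tutoring spending after adding an offset so zero spending is defined."""
    return np.log(np.asarray(spending, dtype=float) + offset)

# Hypothetical toy data, not taken from KELS.
raw = [40.0, 60.0, 80.0, 50.0, 70.0, 90.0]
grade = [8, 8, 8, 9, 9, 9]
spend = [0.0, 90.0, 190.0]

y = zscore_by_grade(raw, grade)
s = log_expenditure(spend)
```

Because the offset only shifts the argument of the logarithm, a student with zero spending is assigned log(10) rather than an undefined value.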
Because OLS estimates for $\beta_2$ can be biased due to endogeneity of $s_{it}$, we employ four
methods that can address the endogeneity problem. First, an IV method follows an idea of
Black et al. (2005, p.695), using an indicator of whether a student is first-born in the family
($F_i$) as an IV for $s_{it}$. A rationale behind this idea is that a student’s birth order is
determined by nature, while parents usually invest more in the first-born’s education than
in the later-born’s. Given that $\mathrm{Cov}(F_i, s_{it})$ is usually positive, as also empirically found in the
current study, a key assumption for causal estimation in this method is that $\mathrm{Cov}(F_i, \alpha_i + u_{it})$
is equal to zero. Although $F_i$ per se is given exogenously and there are studies showing little
impact of birth order on a child’s educational outcomes (Retherford and Sewell 1991, Rodgers et
al. 2000), such an assumption may be too strong due to potentially non-zero $\mathrm{Cov}(F_i, \alpha_i)$, while
we may suppose $\mathrm{Cov}(F_i, u_{it}) = 0$. Even if it is difficult to assume $\mathrm{Cov}(F_i, \alpha_i) = 0$, however, our
reading of the literature suggests that there is a smaller risk in supposing that $\mathrm{Cov}(F_i, \alpha_i)$
is positive rather than negative, for the following two reasons.
4 To deal with zero spending in the log transformation, a value of 10 is added to every student’s raw value of tutoring expenditure. The value of 10 is used since it is the smallest accounting unit reported in the survey (KRW 10,000). Whether a smaller value (e.g., 1) is added to every expenditure or the level of raw values is employed rather than the log, the results are qualitatively similar.
First, papers that report strong, if any, birth order effects usually show negative rather than
positive effects of birth order on intelligence (Bjerkedal et al. 2007, Black et al. 2007, Zajonc 1976,
Zajonc and Mullally 1997). Namely, the intelligence of older siblings is on average as high as or higher than
that of younger siblings. It is a well-established empirical regularity that a child’s
high intelligence leads to high academic performance in school (Herrnstein and Murray 1994).
Second, previous empirical studies show that parents favor the first-born over the later-born with
respect to educational investments in general (Behrman and Taubman 1986, Black et al. 2005).
To the extent that parents favor the first-born in monetary educational investments, they will
tend to support the same child more over other educational dimensions as well, say, by providing
better emotional and non-financial support for the first-born. Provided that $\mathrm{Cov}(F_i, \alpha_i + u_{it})$
is likely to be positive, we can infer that our 2SLS estimates for $\beta_2$ will overstate causal effects
of private tutoring on test scores rather than understate them. If we find that 2SLS estimates
for $\beta_2$ fail to be substantially different from zero, for instance, we conclude that private tutoring
does not lead to a substantial improvement of a student’s academic performance.
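The direction-of-bias argument can be illustrated with a small simulation. The sketch below is not the paper's estimation: it uses a simple Wald-type IV estimator with a binary instrument and no covariates, and every parameter value (the true effect `beta`, the first-stage strength, and `delta`, which governs Cov(F, α)) is a hypothetical choice made for illustration.

```python
import numpy as np

def wald_iv(y, s, f):
    """IV estimate with one binary instrument and no covariates: Cov(f, y) / Cov(f, s)."""
    return np.cov(f, y)[0, 1] / np.cov(f, s)[0, 1]

def simulate(delta, beta=0.05, n=400_000, seed=0):
    """Simulated scores y, log spending s and first-born indicator f.

    delta > 0 makes the instrument positively correlated with the
    unobserved characteristic alpha, which violates the exclusion restriction.
    """
    rng = np.random.default_rng(seed)
    f = rng.integers(0, 2, size=n).astype(float)     # first-born indicator
    alpha = delta * f + rng.normal(size=n)           # unobserved characteristics
    s = 0.5 * f + 0.5 * alpha + rng.normal(size=n)   # spending rises with f and alpha
    y = beta * s + alpha + rng.normal(size=n)        # test score
    return y, s, f

iv_valid = wald_iv(*simulate(delta=0.0))    # Cov(F, alpha) = 0
iv_invalid = wald_iv(*simulate(delta=0.1))  # Cov(F, alpha) > 0
print(f"true effect 0.05, IV with valid instrument {iv_valid:.3f}, "
      f"IV with positively correlated instrument {iv_invalid:.3f}")
```

In this setup the IV estimate with a positively correlated instrument lies above the true effect, which is why a small 2SLS estimate can be read as an upper limit on the causal effect.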
As a second method of our estimation, we employ an FD method, removing a student’s
time-invariant characteristics by differencing within $i$. As long as non-zero $\mathrm{Cov}(F_i, \alpha_i)$ primarily
yields biased estimates for $\beta_2$, first-differencing is a method that can address an endogeneity
problem in a longitudinal context. To the extent that $\mathrm{Cov}(s_{it} - s_{it-1}, u_{it} - u_{it-1}) = 0$, which is
more justifiable than $\mathrm{Cov}(s_{it}, \alpha_i + u_{it}) = 0$ or $\mathrm{Cov}(F_i, \alpha_i + u_{it}) = 0$, FD estimates for $\beta_2$ can
deliver causal estimates for the effect of private tutoring.
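A minimal simulation shows how differencing within a student removes the time-invariant term. It is a sketch under simplifying assumptions: the lagged score and the covariates of equation (1) are omitted, there are only two periods, and all parameter values are hypothetical.

```python
import numpy as np

def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x with an intercept."""
    x = x - x.mean()
    y = y - y.mean()
    return float((x @ y) / (x @ x))

rng = np.random.default_rng(1)
n, beta = 100_000, 0.05

alpha = rng.normal(size=n)                 # time-invariant unobserved characteristic
s1 = 0.8 * alpha + rng.normal(size=n)      # spending in period 1, correlated with alpha
s2 = 0.8 * alpha + rng.normal(size=n)      # spending in period 2
y1 = beta * s1 + alpha + rng.normal(size=n)
y2 = beta * s2 + alpha + rng.normal(size=n)

# Pooled OLS ignores alpha and is biased upward here.
pooled = ols_slope(np.concatenate([s1, s2]), np.concatenate([y1, y2]))
# First-differencing removes alpha, so the slope is close to beta.
fd = ols_slope(s2 - s1, y2 - y1)
print(f"true effect {beta}, pooled OLS {pooled:.3f}, first-difference {fd:.3f}")
```

The first-difference slope is consistent here only because the simulated time-varying errors are unrelated to changes in spending, which mirrors the condition stated above.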
The third method of estimation is a propensity-score matching method, which is popular in
the recent microeconometric evaluation literature (Heckman et al., 1999; Smith and Todd, 2005).5
For subsequent use, let us discretize the level of tutoring expenditures and define $T_i$ as a treatment
indicator that is equal to zero if the average monthly expenditure on tutoring ($S_i$) is equal
to zero; one if it is greater than zero but less than or equal to $H_1$; and two if it is greater than
$H_1$. In the empirical analysis below, we set $H_1$ equal to KRW 200,000 (USD 195.3) for average overall
tutoring expenditures for three subjects, KRW 30,000 (USD 29.3) for average expenditures
for Korean alone, and KRW 90,000 (USD 87.9) for average expenditures for each of English and
math alone.6 Each student receives treatment $t \in T = \{0, 1, 2\}$. Since there are three discrete
levels of tutoring expenditures in the current estimation, we rely on Lechner (2001), who has
developed general propensity-score matching methods for more than two mutually exclusive
treatments. What follows draws heavily on Larsson (2003) and Lechner (2001, 2002).
5 In the matching method as well as the IV method and a bounding method presented shortly, we treat the data as pooled cross-sectional data.
Given three different levels of expenditures or treatments ($\{0, 1, 2\}$), we denote potential
outcomes by $\{y^0, y^1, y^2\}$. For each student, only one outcome is observable in the data
and the others are counterfactuals. Our evaluation problem is to estimate the average impact
of treatment $m$ compared to treatment $l$ for combinations of $m, l \in \{0, 1, 2\}$ ($m > l$). More
formally, the outcome of interest is:
$$\theta_0^{m,l} = E(y^m - y^l \mid T = m) = E(y^m \mid T = m) - E(y^l \mid T = m).$$
Here, $\theta_0^{m,l}$ is the multiple-treatment version of the average treatment effect on the treated
(ATT), which denotes the average treatment effect of treatment $m$ relative to treatment $l$ for
participants in treatment $m$. To the extent that $E(y^m \mid T = m)$ is easily constructed from the
data, matching methods attempt to construct the unobservable counterfactual $E(y^l \mid T = m)$
under some assumptions. A key assumption employed in the matching literature is the Conditional
Independence Assumption (CIA), which states that treatment assignment and potential
outcomes are independent conditional on a set of the individual’s observable characteristics
$W \equiv (X, y_{t-1})$. With $\perp\!\!\!\perp$ denoting independence, it is formally expressed as:
$$y^0, y^1, y^2 \perp\!\!\!\perp T \mid W$$
Extending Rosenbaum and Rubin (1983) to the multiple-treatment framework, Lechner
(2001) shows that it is not necessary for matching to condition on the multidimensional $W$ but only
on the participation probability conditional on $W$ (the propensity score). Hence,
$$E(y^l \mid T = m) = E_{P^{m|ml}(W)}\left[ E\left(y^l \mid P^{m|ml}(W), T = l\right) \mid T = m \right]$$
where $P^{m|ml}(w) = P^{m|ml}(T = m \mid T = l \text{ or } T = m, W = w)$.
6 Such thresholds are arbitrary. As robustness checks, we construct two alternative $T_i$’s by employing different values as a new threshold between 1 and 2 of $T_i$. The results based on such thresholds are qualitatively similar to those reported in the current paper. The alternative results are available upon request.
We employ a matching protocol for the estimation of $\theta_0^{m,l}$ suggested in Lechner (2001, Table
1) as follows:
Step 1 Estimate the conditional probabilities on the subsample of participants in $m$ and $l$ to
obtain the propensity score $P^{m|ml}(w)$.
Step 2 For a given value of $m$ and $l$, the following steps are performed:
1. Choose one observation in the subsample defined by participation in $m$ and delete it
from that subsample.
2. Find an observation in the subsample of participants in $l$ that is as close as possible to the one
chosen in step 2.(1) in terms of $P^{m|ml}(w)$. Do not remove that observation, so that it can be
used again.
3. Repeat (1) and (2) until no participant is left in subsample $m$.
4. Using the matched comparison group formed in (3), compute the sample mean
$\hat{E}(y^l \mid T = m)$ as an estimate for $E(y^l \mid T = m)$.
Step 3 Repeat step 2 for all combinations of $m$ and $l$.
Step 4 Compute the estimate of $\theta_0^{m,l}$ by $\theta_N^{m,l} = \hat{E}(y^m \mid T = m) - \hat{E}(y^l \mid T = m)$. Obtain the
standard error of $\theta_N^{m,l}$ by generating 500 bootstrap samples.7
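Steps 2 and 4 of the protocol can be sketched in a few lines. This is an illustration only, not the Stata routine used in the paper: the function name and the toy data are hypothetical, the propensity scores from Step 1 are assumed to be given, and the bootstrap standard errors are omitted.

```python
import numpy as np

def att_nearest_neighbor(y, treat, pscore, m, l):
    """ATT of treatment m relative to l by one-to-one nearest-neighbor
    matching on the propensity score, with replacement."""
    y, treat, pscore = (np.asarray(a) for a in (y, treat, pscore))
    y_m, p_m = y[treat == m], pscore[treat == m]
    y_l, p_l = y[treat == l], pscore[treat == l]
    # For each participant in m, find the closest participant in l.
    # Comparison observations stay in the pool, so they can be reused.
    nearest = np.abs(p_m[:, None] - p_l[None, :]).argmin(axis=1)
    # Mean outcome of the treated minus mean outcome of their matches.
    return float(y_m.mean() - y_l[nearest].mean())

# Hypothetical toy data: two students in treatment 1, three in treatment 0.
y = [5.0, 7.0, 3.0, 4.0, 6.0]
treat = [1, 1, 0, 0, 0]
pscore = [0.60, 0.80, 0.50, 0.75, 0.95]

att = att_nearest_neighbor(y, treat, pscore, m=1, l=0)
print(att)  # treated mean 6.0 minus matched comparison mean 3.5
```

Matching with replacement keeps each match as close as possible at the cost of reusing some comparison observations.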
Along with estimates for ATT, we report the average treatment effect of treatment m relative
to treatment l for a participant drawn randomly from the population (ATE). According to
Lechner (2001), the ATE is calculated as follows:
$$\gamma_N^{m,l} = \sum_{j=0}^{2} \left[ \left( \hat{E}(y^m \mid T = j) - \hat{E}(y^l \mid T = j) \right) \cdot P(T = j) \right]$$
Such an ATE is comparable to the average treatment effects drawn from a bounding method
presented below.
The fourth method of estimation is a nonparametric bounding method that has recently received
attention in empirical analysis. The goal is to calculate the lower and upper bounds of the
average treatment effect given a few assumptions. In the following presentation of the method,
we draw heavily on Gonzalez (2005), Kang (2007), Manski (1990) and Manski and Pepper (2000),
among others.
7 A Stata program called PSMATCH2, developed by Leuven and Sianesi (2003), is employed for matching estimations.
Let us define the response function $y_i(\cdot) : T \to Y$, which maps treatments into outcomes.
The realized outcome $y \equiv y(z)$ is the level of $y$ for a student who actually receives treatment
$z$. The latent outcome $y(t)$ ($t \neq z$) describes what level of performance the student would have
achieved had he or she received treatment $t$.
In order to set up bounds for the treatment effects, we first decompose $E[y(t)]$ by
$$E[y(t)] = E[y \mid z = t]\Pr(z = t) + E[y(t) \mid z \neq t]\Pr(z \neq t) \qquad (2)$$
To make bounds analysis feasible, let us suppose that $y$ is bounded by $[K_0, K_1]$.8 Since the
unobservable counterfactual $E[y(t) \mid z \neq t]$ is bounded by $[K_0, K_1]$, we have the worst-case (WC)
bounds of $E[y(t)]$ given by
$$E[y \mid z = t]\Pr(z = t) + K_0 \Pr(z \neq t) \le E[y(t)] \le E[y \mid z = t]\Pr(z = t) + K_1 \Pr(z \neq t) \qquad (3)$$
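The worst-case bounds in (3) use only observed means and treatment shares. The following sketch computes them from sample analogues; the function name, the toy data and the chosen values of K0 and K1 are hypothetical.

```python
import numpy as np

def worst_case_bounds(y, z, t, k0, k1):
    """Worst-case bounds on E[y(t)] when the outcome lies in [k0, k1].

    y: realized outcomes; z: realized treatments; t: treatment of interest.
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z)
    received = z == t
    p_t = received.mean()                                  # Pr(z = t)
    # Observed part of the decomposition: E[y | z = t] Pr(z = t).
    observed = y[received].mean() * p_t if received.any() else 0.0
    # The unobserved counterfactual mean is replaced by k0 or k1.
    return observed + k0 * (1.0 - p_t), observed + k1 * (1.0 - p_t)

# Hypothetical toy data with three treatment levels.
y = [1.0, 3.0, 4.0, 6.0, 7.0, 9.0]
z = [0, 0, 1, 1, 2, 2]

wc_lower, wc_upper = worst_case_bounds(y, z, t=1, k0=0.0, k1=10.0)
print(wc_lower, wc_upper)
```

The width of the interval is (K1 − K0) times the share of students who did not receive treatment t, so these bounds are wide unless most of the sample received t.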
In order to further tighten the bounds of $E[y(t)]$, assumptions are introduced below individually
as well as jointly. The first assumption is monotone treatment response (MTR), which is
specified as follows9:
$$t_l < t_m \longrightarrow y(t_l) \le y(t_m) \qquad (4)$$
Another assumption to be employed is monotone treatment selection (MTS), which is specified
by10:
$$t_l < t_m \longrightarrow E[y(t) \mid z = t_l] \le E[y(t) \mid z = t_m] \qquad (5)$$
8 In fact, however, specific values of $K_0$ and $K_1$ make no difference in our reported results, because the current paper examines the MTR+MTS and MIV+MTR+MTS bounds alone, which are not a function of $K_0$ and $K_1$. Estimated bounds based on other assumptions are available upon request.
9 This assumption is drawn from a theory that there will be non-negative impacts of increased educational spending on a student’s academic performance. A majority of empirical studies support the validity of such an assumption. Although the exact magnitude of a positive effect of educational spending on a student’s performance is hardly agreed upon, it is rare that studies find strong negative impacts of monetary educational investments (see Hanushek (1997, 2003); an exception is a study of Leuven et al. (2007)).
While it specifies a source of endogeneity in conventional OLS methods of examining the
impacts of educational investments, the MTS assumption can make an important contribution
to tightening the bounds of the true effect in combination with MTR. Joining MTS with MTR,
we can obtain the MTR+MTS bounds of $E[y(t)]$ given by
$$\sum_{h<t} E(y \mid z = h)\Pr(z = h) + E(y \mid z = t)\Pr(z \ge t) \le E[y(t)] \le \sum_{h>t} E(y \mid z = h)\Pr(z = h) + E(y \mid z = t)\Pr(z \le t) \qquad (6)$$
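The bounds in (6) can be computed directly from group means and group shares. The sketch below uses raw sample means, whereas the paper estimates the conditional expectations by local linear regression; the function name and the toy data are hypothetical.

```python
import numpy as np

def mtr_mts_bounds(y, z, t):
    """MTR+MTS bounds on E[y(t)] from sample means and shares.

    Assumes treatment level t is observed at least once in z.
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z)
    mean_t = y[z == t].mean()                  # E(y | z = t)
    lower = upper = 0.0
    for h in np.unique(z):
        p_h = (z == h).mean()                  # Pr(z = h)
        mean_h = y[z == h].mean()              # E(y | z = h)
        # Lower bound: own mean for h < t, mean at t for h >= t.
        lower += (mean_h if h < t else mean_t) * p_h
        # Upper bound: own mean for h > t, mean at t for h <= t.
        upper += (mean_h if h > t else mean_t) * p_h
    return lower, upper

# Hypothetical toy data: group means 2, 5 and 8, each with share 1/3.
y = [1.0, 3.0, 4.0, 6.0, 7.0, 9.0]
z = [0, 0, 1, 1, 2, 2]

for t in (0, 1, 2):
    print(t, mtr_mts_bounds(y, z, t))
```

Unlike the worst-case bounds, these expressions do not involve K0 or K1, because every unobserved counterfactual mean is bounded by an observed group mean.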
One can further tighten the preceding MTR+MTS bounds if one finds an IV $\upsilon$ that satisfies
mean independence, $E[y(t) \mid \upsilon = u_1] = E[y(t) \mid \upsilon = u_2]$ where $u_1 \neq u_2$. Under mean independence,
the expected test score of students with $\upsilon = u_1$ is equal to that of students with $\upsilon = u_2$
for any given level of private spending. In practice, however, finding such an IV is extremely
difficult. As an alternative, Manski and Pepper (2000) propose a monotone IV that satisfies
mean monotonicity, $E[y(t) \mid \upsilon = u_1] \le E[y(t) \mid \upsilon = u_2]$ if $u_1 < u_2$. Under mean monotonicity, it
is sufficient that the expected test score of students with $\upsilon = u_1$ is less than or equal to that
of students with $\upsilon = u_2$. We use a first-born indicator $F_i$ as such a monotone IV. Namely, we
suppose that for a given level of tutoring expenditure first-born students ($F_i = 1$) on average
perform as well as or better than students who are later-born in the family ($F_i = 0$).
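Combining MIV with MTR+MTS, the MIV+MTR+MTS bounds of $E[y(t)]$ are given by
$$\begin{aligned}
&\sum_{u \in F} \Pr(F = u) \cdot \left\{ \sup_{u_1 \le u} \left[ \sum_{h<t} E(y \mid F = u_1, z = h)\Pr(z = h \mid F = u_1) + E(y \mid F = u_1, z = t)\Pr(z \ge t \mid F = u_1) \right] \right\} \\
&\le E[y(t)] \le \\
&\sum_{u \in F} \Pr(F = u) \cdot \left\{ \inf_{u_2 \ge u} \left[ \sum_{h>t} E(y \mid F = u_2, z = h)\Pr(z = h \mid F = u_2) + E(y \mid F = u_2, z = t)\Pr(z \le t \mid F = u_2) \right] \right\}
\end{aligned} \qquad (7)$$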
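The outer step of (7), which combines bounds computed within each value of the monotone instrument, can be sketched as below. The function name and the numbers are hypothetical, and the MTR+MTS bounds within each instrument subsample are assumed to have been computed beforehand.

```python
import numpy as np

def miv_bounds(prob, lower, upper):
    """Combine per-instrument-value bounds under a monotone IV.

    prob, lower and upper are ordered by increasing instrument value:
    prob[k] = Pr(F = u_k), and lower[k], upper[k] are the bounds on
    E[y(t)] computed within the subsample F = u_k.
    """
    prob, lower, upper = (np.asarray(a, dtype=float) for a in (prob, lower, upper))
    # sup over u1 <= u of the lower bounds: running maximum from the left.
    sup_lower = np.maximum.accumulate(lower)
    # inf over u2 >= u of the upper bounds: running minimum from the right.
    inf_upper = np.minimum.accumulate(upper[::-1])[::-1]
    return float(prob @ sup_lower), float(prob @ inf_upper)

# Hypothetical bounds for later-born (F = 0) and first-born (F = 1) students.
prob = [0.4, 0.6]
lower = [3.0, 2.0]
upper = [7.0, 6.0]

miv_lower, miv_upper = miv_bounds(prob, lower, upper)
print(miv_lower, miv_upper)
```

In this toy case a simple share-weighted average of the subsample bounds would give [2.4, 6.4], so the monotonicity restriction narrows the interval from both sides.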
We calculate conditional expectations, $E[y(t) \mid \cdot]$, nonparametrically by relying on local linear
regression (Fan 1992) in which the control variable is the logarithm of $i$’s family income
and $E[y(t) \mid \cdot]$ is evaluated at its mean value. In section 4 we report estimated bounds under
MTR+MTS and MIV+MTR+MTS assumptions alone, suppressing those under other assumptions
for brevity. Given the bounds of $E[y(t)]$ under varying assumptions, the lower
bound (LB) of an average treatment effect (ATE), $E[y(t_m) - y(t_l)]$ ($t_m > t_l$), is calculated as
the difference between the lower bound of $E[y(t_m)]$ and the upper bound of $E[y(t_l)]$; the upper
bound (UB) of the ATE is obtained as the difference between the upper bound of $E[y(t_m)]$ and the
lower bound of $E[y(t_l)]$. Along with the bounds of ATEs, we calculate the bootstrap 5th and 95th
percentiles of the lower and upper bound, respectively. The interval between these percentiles
gives a conservative 90% confidence interval for the true effect of private tutoring. The number
of bootstrap samples is 50.
10 This assumption supposes that sorting into treatment is not exogenous but monotone in the sense that the average latent outcome $y(t)$ is greater for those students whose parents spend a large amount of money on private tutoring ($z = t_m$) than for those whose parents spend a small amount ($z = t_l$, $t_l < t_m$). For instance, high-income parents are more likely to spend a large amount of money on private tutoring for their child than low-income parents, while children of high-income parents tend to be more academically able and smarter than those of low-income parents (see, e.g., Haveman and Wolfe (1995)).
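The arithmetic that turns bounds on E[y(t)] into bounds on an ATE, and bootstrap replicates into a conservative interval, can be sketched as follows. The function names and inputs are hypothetical, and the bootstrap replicates are assumed to have been computed elsewhere by re-estimating the bounds on resampled data.

```python
import numpy as np

def ate_bounds(bounds_m, bounds_l):
    """Bounds on E[y(t_m) - y(t_l)] from (lower, upper) bounds on each mean."""
    (lo_m, up_m), (lo_l, up_l) = bounds_m, bounds_l
    # LB = lower bound at t_m minus upper bound at t_l;
    # UB = upper bound at t_m minus lower bound at t_l.
    return lo_m - up_l, up_m - lo_l

def conservative_interval(boot_lower, boot_upper):
    """5th percentile of bootstrapped lower bounds and
    95th percentile of bootstrapped upper bounds."""
    return float(np.percentile(boot_lower, 5)), float(np.percentile(boot_upper, 95))

# Hypothetical bounds on E[y(2)] and E[y(0)].
lb, ub = ate_bounds((5.0, 8.0), (2.0, 5.0))

# Hypothetical bootstrap replicates of the ATE lower and upper bounds.
ci_low, ci_high = conservative_interval(np.linspace(0.0, 1.0, 101),
                                        np.linspace(1.0, 2.0, 101))
print(lb, ub, ci_low, ci_high)
```

Taking the lower percentile from the lower bound and the upper percentile from the upper bound makes the interval at least as wide as one built around either bound alone.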
3 The Data - The Korea Education Longitudinal Study (KELS)
3.1 Description of the Main Sample
For empirical analysis the current study employs the Korea Education Longitudinal Study
(KELS). KELS is an annual longitudinal survey that has been conducted since 2005 by the Korea
Educational Development Institute (KEDI), a government-funded research institute. The
basic structure of KELS follows the National Educational Longitudinal Studies (NELS 88 and
ELS:2002) of the U.S.
The beginning cohort of KELS consists of 6,908 students in grade 7—the first year of middle
school in Korea—in 2005. The sample of the students and schools is drawn by a stratification
method to reflect the national population of 703,914 seventh graders in 2,929 middle schools.
More specifically, at first 150 schools are selected nationwide in consideration of the regional
distribution of schools and students. In each school 50 students are drawn at random, while all
students are drawn if the school is attended by fewer than 50 students. Each of the sampled students
is administered a series of personal, family and school-related questionnaires. In addition,
students’ homeroom teachers, school principals and parents are separately surveyed to collect a
range of background information on the sampled student.
In each wave of KELS student academic performance is measured by achievement tests for
three subjects: Korean, English and mathematics. The test score of each subject is scaled
from 0 (lowest) to 100 (highest). In the subsequent analysis the raw score of each subject is
normalized to have mean zero and variance one. In addition, we calculate a simple average of
the three subject scores—or two subject scores if only two are available. This average score is
also normalized to have mean zero and variance one.
Another important feature of the KELS data is the availability of detailed information on
a student’s private tutoring experience and tutoring expenditures by parents, and the sibling
composition from the parent questionnaire. It enables us to construct main explanatory variables
and the (monotone) instrumental variable of this study. As regards private tutoring for a
student, the parents are asked to report monthly average expenditures on private tutoring
for each subject of Korean, English and math during the survey year. Such expenditures are
reported not only for each subject, but for each type of tutoring methods, such as tutoring in
hakwons (private tutoring institutions), tutoring by individual tutors, tutoring via the Internet
and the broadcasting media, etc. Our measure of private tutoring expenditures is an overall
sum of expenditures for all these types. In the analysis we use either total expenditures on three
subjects as a whole, or expenditures on each subject.
For the current study we employ the first three waves (years 2005 to 2007) of KELS. Since
the test scores of the first wave are employed as a measure of a student’s pre-determined quality
(yt−1), we use test scores and information of the students collected in waves 2 and 3 more
extensively. In the raw samples, there are 6,538 and 6,310 valid test scores of the students
in waves 2 and 3, respectively. After removing observations that have missing values for the
variables used in regressions, we secure a total of 9,461 valid observations from 5,425 individual
students for further analysis. 50.4 percent of the observations are drawn from wave 2; the
remainder is from wave 3. Descriptive statistics of the analysis sample are documented in Table
1.
INSERT TABLE 1 HERE.
3.2 Descriptive Statistics
The mean raw scores of Korean, English and math are 59.7, 56.5 and 52.5, respectively. The
mean of the average raw score of the three subjects is 56.2. Here, differences in the number of
observations across subjects are due to slightly different degrees of score availability. Since the
average score is calculated on the basis of two or three subject scores, the number of observations
is the largest for the average score, while it is smaller for the individual subject scores. The
same is true for tutoring expenditures and prior scores (yt−1) in the table.
If we compare test scores by birth order, the mean scores of Korean, English and math
among the first-born significantly exceed those of the later-born students. And the mean of the
average raw score is also significantly greater for the first-born (58.5) than for the later-born
(53.9). Yet, it is not clear whether these differences between the two groups are causally created
by variations in tutoring expenditures.
As for private tutoring expenditures, parents spend more on tutoring for the first-born than
for the later-born. While the overall average monthly spending on private tutoring is about
KRW 178,900 (approximately USD 174.6), the average spending for the first-born (KRW 200,400) is 27
percent greater than that for the later-born (KRW 157,400). This gap is significantly
different from zero. The proportion of those who have received private tutoring—those with
positive average monthly spending—is also far higher among the first-born (72.3 percent) than
among the later-born (62.7 percent).
When tutoring experiences are subdivided into each of the three individual subjects, English
and math are the primary subjects of private tutoring. While parents spend a monthly average of
KRW 32,700 on Korean tutoring, they spend more than twice as much on each of English
(KRW 79,460) and math (KRW 80,340) tutoring. For each subject, parents spend a significantly
greater amount on tutoring for the first-born than for the later-born.
Besides current academic performance and tutoring expenditures, the first-born
students also have a significantly higher pre-determined quality than the later-born. If we proxy a
student’s pre-determined quality by the test score of the year before, the mean of the average
score as well as that of individual subject scores is significantly higher for the first-born than
for the later-born. The mean of the raw average score of the three subjects is 61.25 (0.186 in
normalized Z-score) for the first-born and 56.83 (−0.040 in Z-score) for the later-born. The
pattern that the first-born have a better pre-determined quality than the later-born remains
the same if the mean of each subject score is employed. In addition, weekly hours of self-study
excluding private tutoring hours are also greater for the first-born than for the later-born.
If we examine other variables, the first-born tend to have background characteristics more
favorable for academic performance than the later-born. For example, the first-born have
fewer siblings and parents with higher average education and income
than the later-born. The preceding comparisons between the first-born and the
later-born cast doubt on the validity of the exogenous IV assumption; rather, they support the
validity of the mean monotonicity assumption. To the extent that mean monotonicity holds,
we can draw some useful information about the causal effect of private tutoring expenditures
from 2SLS estimations as well as from the bounding estimations.
4 Estimation Results
4.1 OLS, 2SLS and FD Results
Table 2 presents OLS, 2SLS and FD estimation results when we employ normalized average
scores of the three subjects for yit and total expenditures on three subjects for sit. Corresponding
estimates based on test scores and expenditures of each subject are reported in Table 3. In both
tables, the square bracket under each of the estimates for β2 shows a percent change in test
score due to a 10 percent increase in tutoring expenditure, which is evaluated at the mean of
the raw test score.
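The conversion behind the bracketed figures can be sketched as follows. This is a minimal illustration, assuming a linear approximation in which a 10 percent rise in expenditure changes the normalized score by 0.1 × β2; the function name is hypothetical, and the SD (20.7) and mean (56.2) are the Table 1 values for the average raw score. Under that assumption the results come close to the bracketed values in Table 2, up to rounding of the published coefficients.

```python
def pct_change_in_score(beta, sd_raw, mean_raw, pct_increase=10.0):
    """Percent change in the raw score for a given percent rise in spending.

    beta is the coefficient on ln(expenditure) in a regression whose
    dependent variable is the normalized (z-score) test score.
    """
    sd_change = beta * pct_increase / 100.0   # linear approximation, in SD units
    raw_change = sd_change * sd_raw           # convert back to raw score points
    return 100.0 * raw_change / mean_raw      # express relative to the mean raw score


# Coefficients from Table 2; SD and mean of the average raw score from Table 1.
for label, beta in [("OLS", 0.057), ("2SLS", 0.288), ("FD", 0.033)]:
    print(label, round(pct_change_in_score(beta, 20.7, 56.2), 3))
```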
INSERT TABLE 2 and TABLE 3 HERE.
From Table 2, the OLS estimate for β2 in column (2) suggests that the association between
the tutoring expenditure and normalized average test score is positive but quite small in mag-
nitude, although it is statistically significantly different from zero. A 10 percent greater overall
expenditure on private tutoring is related to no more than a 0.006 SD higher test score. Such a
magnitude implies that a 10 percent greater expenditure is associated with only a 0.211 percent
higher test score. Such an association, however, may not be consistent and causal due to endo-
geneity of sit. Depending on the value of Cov(sit, αi + uit), the OLS estimate may be biased
either upward or downward.
Other estimates in column (2) show expected signs. For example, hours of self-study, average
test scores of the previous year, an intact family (as opposed to single or divorced parents) and
parents’ average education are positively related to a student’s performance; female students
have higher scores than male students; parents’ average age and having no religion are positively
related to test scores. The number of children, being handicapped and family income, however,
fail to show strong associations with academic performance.
The first-stage results of the 2SLS regression of tutoring expenditures on the IV and explana-
tory variables are presented in column (1) of Table 2. As expected, being first-born significantly
increases private tutoring expenditures for a student. First-born students receive on average
a 17.6 percent greater expenditure on tutoring than later-born students. Such an amount is
significantly different from zero. According to Stock et al. (2002), Fi is a strong IV for tutoring
expenditures for a student, since the F-statistic for the IV (32.97) greatly exceeds proposed
thresholds of weak IVs (e.g., 16.38).
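As a rough plausibility check, with a single excluded instrument the first-stage F-statistic equals the squared t-ratio of the instrument's coefficient (under conventional standard errors). The sketch below applies this to the rounded first-stage estimate and standard error printed in Table 2; because those inputs are rounded, it is expected to land near, not exactly on, the reported 32.97.

```python
coef, se = 0.176, 0.031          # first-stage estimate and SE for the first-born dummy (Table 2)
f_approx = (coef / se) ** 2      # F = t^2 when one instrument is excluded
threshold = 16.38                # weak-IV threshold quoted in the text

print(f"approximate F = {f_approx:.2f}, exceeds threshold: {f_approx > threshold}")
```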
The 2SLS estimates for equation (1) are shown in column (3) of Table 2. The estimate for
β2 suggests that a 10 percent increase in expenditure enhances a student’s performance by 0.029
SD. Evaluated at the mean value of the test score, such an estimate implies that a 10 percent
increase in expenditure raises the average test score by 1.06 percent. Although the estimate is
significantly different from zero, the magnitude of the effect of private tutoring does not seem
to be large. Such a magnitude is much smaller than the amount of improvement in test score
(2.8 to 3.6 percent) due to a 10 percent increase in per-pupil expenditure suggested by Krueger
(2003). It is more analogous to the effect sizes suggested by Guryan (2003) in terms of test
scores (0.77 to 1.15 percent), and by Card and Krueger (1996) and Grogger (1996) in terms
of labor market earnings (0.7 to 1.1 percent). Furthermore, to the extent that there exists a
potentially positive rather than negative correlation between Fi and αi as is implied by the mean
monotonicity assumption, our 2SLS estimate is more likely to even overstate the true effect of
private tutoring rather than understate it.
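The direction of this bias can be illustrated with a small simulation. It is only a sketch under assumed, hypothetical parameter values (none are taken from the paper): the instrument raises both spending and unobserved ability, so the simple IV (Wald) estimate exceeds the true effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
beta = 0.3                                    # hypothetical true effect of spending

first_born = rng.integers(0, 2, size=n)       # binary instrument F_i
# Unobserved ability alpha_i is higher on average for the first-born (Cov(F, alpha) > 0).
alpha = 0.2 * first_born + rng.normal(size=n)
spending = 0.2 * first_born + 0.5 * alpha + rng.normal(size=n)
score = beta * spending + alpha + rng.normal(size=n)


def wald_iv(y, s, z):
    """IV (Wald) estimate with one instrument and no covariates: Cov(y, z) / Cov(s, z)."""
    return np.cov(y, z)[0, 1] / np.cov(s, z)[0, 1]


iv_estimate = wald_iv(score, spending, first_born)
print(f"true beta = {beta:.2f}, IV estimate = {iv_estimate:.2f}")
```

In this setup the IV estimate converges to beta + Cov(F, alpha) / Cov(F, s), which is above beta whenever both covariances are positive.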
A weak effect of private tutoring is also found in the FD estimation. The estimate in column
(4) suggests that a 10 percent increase in expenditure leads to only a 0.003 SD (or 0.122
percent) higher average test score. In sum, the 2SLS and FD estimates suggest small (or modest
at best) effects of private tutoring on academic performance. Our current findings for the effect
of private tutoring for middle school students are in line with the results that Kang (2007) finds
for high school students in grade 12.
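The logic of first-differencing can be sketched with a two-period simulation. This is a simplified illustration with hypothetical parameters and no covariates, not the paper's specification: a time-invariant student effect that is correlated with spending biases pooled OLS, while differencing removes it.

```python
import numpy as np

rng = np.random.default_rng(1)
n, beta = 50_000, 0.03                        # hypothetical sample size and true effect

alpha = rng.normal(size=n)                    # time-invariant student effect alpha_i
s1 = 0.8 * alpha + rng.normal(size=n)         # spending in each year is correlated with alpha_i
s2 = 0.8 * alpha + rng.normal(size=n)
y1 = beta * s1 + alpha + rng.normal(scale=0.5, size=n)
y2 = beta * s2 + alpha + rng.normal(scale=0.5, size=n)


def slope(y, x):
    """Bivariate OLS slope of y on x."""
    return np.cov(y, x)[0, 1] / np.var(x, ddof=1)


ols = slope(np.concatenate([y1, y2]), np.concatenate([s1, s2]))  # pooled OLS, biased by alpha_i
fd = slope(y2 - y1, s2 - s1)                                     # first difference removes alpha_i
print(f"true beta = {beta:.3f}, pooled OLS = {ols:.3f}, FD = {fd:.3f}")
```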
If we disaggregate tutoring expenditures and test scores by subject, the pattern of weak
effects of private tutoring changes little. In Table 3, 2SLS estimates suggest that a 10 percent
increase in expenditure raises the test scores of Korean, English and math by 3.14, 1.25, and
1.28 percent, respectively. As explained earlier, however, such estimated effects are likely to be
overestimates rather than underestimates of the effect of private tutoring. While the 2SLS
estimate of β2 for Korean implies a non-negligible effect of tutoring, the FD estimate suggests
a negligible one. Similar patterns hold for English and math. The FD estimates
suggest that a 10 percent rise in expenditure increases the test scores of Korean, English and
math by a mere 0.14, 0.12, and 0.22 percent, respectively. Moreover, such FD estimates are fairly
precise, as shown by their small standard errors.
4.2 Results of the Matching Method
Table 4 shows the estimation results of the matching method for the average score of the
three subjects as well as for each subject score. The estimates of average treatment effects on
the treated ($\theta^{m,l}_N$, ATT) are reported in columns (1) to (3); average treatment effects
($\gamma^{m,l}_N$, ATE) in columns (4) to (6). The elasticities of ATT and ATE that show a percent change in
test score due to a 10 percent increase in expenditure are presented below the ATT and ATE
estimates, respectively. The elasticity for ATT (and that for ATE analogously) is calculated by
the following formula:
$$\frac{10 \times \theta^{m,l}_N \times SD \times E(S_i \mid t = m)}{S_{ml} \times E(Y_i \mid t = m)} \tag{8}$$
where SD is the standard deviation of the raw test score before normalization (Y): 20.7 for the
overall average, 19.7 for Korean, 25.6 for English, and 25.4 for math; $\theta^{m,l}_N$ is an estimated ATT
of $E[y_m - y_l \mid T = m]$; and $S_{ml} \equiv E(S_i \mid t = m) - E(S_i \mid t = l)$ (m > l; m, l = 0, 1, 2). Since
different treatment levels represent different expenditure levels, elasticities rather than ATT or
ATE estimates are a better measure of comparison. We will focus on the elasticities of ATE
estimates rather than those of ATT estimates, since the former are comparable to those obtained
in the bounding analysis.
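Equation (8) can be sketched in code as follows. The function and argument names are hypothetical. In the example call, the effect (0.166) and SD (20.7) are taken from Table 4 and the text, while the group means of spending and of the raw score are made-up placeholders, since those inputs are not reported in this section.

```python
def elasticity_per_10pct(theta, sd_raw, mean_spend_m, mean_spend_l, mean_score_m):
    """Percent change in test score per 10 percent rise in spending, following equation (8).

    theta         estimated ATT or ATE between levels m and l, in SD units
    sd_raw        standard deviation of the raw test score
    mean_spend_m  E(S | t = m), mean spending at the higher treatment level
    mean_spend_l  E(S | t = l), mean spending at the lower treatment level
    mean_score_m  E(Y | t = m), mean raw score at the higher treatment level
    """
    s_ml = mean_spend_m - mean_spend_l        # S_ml in equation (8)
    if s_ml <= 0:
        raise ValueError("level m must have higher mean spending than level l")
    return 10.0 * theta * sd_raw * mean_spend_m / (s_ml * mean_score_m)


# theta and sd_raw come from the paper; the three means are hypothetical placeholders.
example = elasticity_per_10pct(theta=0.166, sd_raw=20.7,
                               mean_spend_m=150.0, mean_spend_l=0.0,
                               mean_score_m=57.0)
print(round(example, 3))
```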
INSERT TABLE 4 HERE.
Using the matching method, we also fail to find compelling evidence that an increase in
tutoring expenditure yields strong positive causal impacts on the test scores. As regards the
average score of the three subjects, the elasticities imply that a 10 percent increase in
tutoring expenditure raises the average test score by 0.21 to 0.60 percent, depending on where
the effect is evaluated. When tutoring expenditures and test scores are disaggregated by
subject, similar patterns emerge. While the estimates are less precise, a 10 percent increase in
expenditure for Korean tutoring raises the Korean test score by 0.20 to 0.25 percent. Tutoring
for English and math seems to be a bit more effective, but the effect sizes are analogous to the
overall effect. A 10 percent increase in expenditure for English (math) tutoring enhances the
test score by 0.29 to 0.82 (0.69 to 1.02) percent. These estimates are quite precise, as shown
by their small standard errors. Overall, the matching estimates also suggest that the
causal impacts of private tutoring on test scores are modest.
4.3 Results of the Bounds Analysis
The estimated bounds of average treatment effects (ATE) of tutoring expenditures are presented
in Table 5. In order to gain perspective, in the right-most two columns of the table, we convert
the upper bound estimate and its 95th percentile into elasticities representing a percent change
in test score due to a 10 percent increase in expenditure. The elasticity is calculated by equation
(8).
INSERT TABLE 5 HERE.
As for the bounding results for the total expenditures and average score of the three subjects
shown in Panel A, it is difficult to conclude that private tutoring strongly improves a student’s
academic performance, whichever ATEs are employed for interpretations. MTR+MTS upper
bounds suggest that a 10 percent increase in expenditure raises a student’s test score by at most
1.59 to 2.13 percent, depending on where the effect is evaluated. Since the lower bounds fail to
be significantly greater than zero, however, the estimated bounds do not rule out zero causal
effects, suggesting that the true effect of private tutoring is unlikely to be substantial.
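For reference, the MTR+MTS bounds of Manski and Pepper (2000) can be computed from conditional means and treatment shares alone. The sketch below follows their standard formulas for a discrete treatment; the paper's own implementation may differ in detail, and the input values are hypothetical, not taken from Table 5 or Table 6.

```python
import numpy as np


def mtr_mts_bounds(cond_means, probs):
    """Bounds on E[y(t)] for each treatment level t under MTR+MTS.

    cond_means[u] = E[y | z = u], probs[u] = P(z = u), levels ordered from low to high.
    """
    m = np.asarray(cond_means, dtype=float)
    p = np.asarray(probs, dtype=float)
    k = len(m)
    lower, upper = np.empty(k), np.empty(k)
    for t in range(k):
        # Lower: observed means for u < t, and E[y | z = t] for everyone with z >= t.
        lower[t] = np.sum(m[:t] * p[:t]) + m[t] * p[t:].sum()
        # Upper: observed means for u > t, and E[y | z = t] for everyone with z <= t.
        upper[t] = np.sum(m[t + 1:] * p[t + 1:]) + m[t] * p[:t + 1].sum()
    return lower, upper


def ate_bounds(cond_means, probs, s, t):
    """Bounds on E[y(t)] - E[y(s)] for t > s; MTR alone implies a lower bound of zero."""
    lower, upper = mtr_mts_bounds(cond_means, probs)
    return 0.0, upper[t] - lower[s]


# Hypothetical normalized mean scores and shares for three spending levels.
means, shares = [-0.20, 0.05, 0.30], [0.3, 0.4, 0.3]
print(ate_bounds(means, shares, s=0, t=2))
```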
Manski and Pepper (2000, p.1004) suggest an informal method to check the validity of the
joint MTR+MTS hypothesis. Under MTR+MTS, E[y|z = u] must be a weakly increasing
function of u, namely,
$$u' \le u \;\Rightarrow\; E[y \mid z = u'] = E[y(u') \mid z = u'] \overset{\mathrm{MTR}}{\le} E[y(u) \mid z = u'] \overset{\mathrm{MTS}}{\le} E[y(u) \mid z = u] = E[y \mid z = u] \tag{9}$$
To examine the validity of the joint MTR+MTS hypothesis, in Table 6 we present sample
means of E[y(0)], E[y(1)] and E[y(2)] that are calculated nonparametrically as explained in
section 2. In each of the whole and individual subject samples, E[y|z = u] is increasing with u;
hence it seems unlikely that either MTS or MTR is violated.
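The informal check amounts to verifying that the observed conditional means are weakly increasing in the treatment level. A minimal sketch, with a hypothetical helper name and made-up means (not the Table 6 values):

```python
import numpy as np


def is_weakly_increasing(cond_means, tol=0.0):
    """True if E[y | z = u] never decreases as the treatment level u rises."""
    m = np.asarray(cond_means, dtype=float)
    return bool(np.all(np.diff(m) >= -tol))


# Hypothetical conditional means for spending levels 0, 1 and 2.
print(is_weakly_increasing([-0.20, 0.05, 0.30]))
print(is_weakly_increasing([0.10, 0.00, 0.20]))
```

A decreasing step would be evidence against the joint hypothesis; an increasing sequence is consistent with it but does not prove it.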
The tightest MIV+MTR+MTS bounds in Panel A draw a largely similar picture: private
tutoring does not raise a student’s performance substantially. The upper bounds suggest that a
10 percent increase in expenditure improves the average test score by at most 1.52 to 2.03 percent.
The lower bounds, however, fail to significantly rule out zero effects of private tutoring.11
If we examine the bounding results for the expenditure and test score of each individual
subject in Panels B to D of Table 5, the MIV+MTR+MTS bounds also fail to reveal compelling
evidence that private tutoring is strongly effective for raising a student’s performance. While
the upper bounds of the effect of private tutoring drawn for Korean show negligible impacts
of private tutoring, the upper bounds obtained for English and math each indicate
non-negligible effects. At the largest, a 10 percent increase in expenditure raises an English
test score by 1.9 to 2.4 percent, and a math test score by 2.2 to 2.8 percent. Nonetheless,
the lower MIV+MTR+MTS bounds fail to significantly exclude zero effects of tutoring. Hence,
a conservative interpretation would be that there is a lack of evidence that private tutoring
strongly enhances a student’s academic performance in Korean, English and math alike.
To summarize the preceding estimation results, causal impacts of private tutoring on student
academic performance seem to be modest at best. If we take 2SLS, FD and matching estimates
as point estimates of the causal impacts of private tutoring, a 10 percent increase in expenditure
raises the overall average score by 0.12 to 1.06 percent, a Korean subject score by 0.14 to 3.14
percent, an English subject score by 0.12 to 1.25 percent, and a math subject score by 0.22 to
1.3 percent. Moreover, such ranges of the effect of private tutoring in general remain within the
bounds of the ATEs estimated by the nonparametric bounding method.
In order to gain perspective on the effect sizes, Kang (2007) compares the empirical results
of the effects of private tutoring with estimates of the effect of public school expenditures on
student outcomes. Given that our current findings for middle school students are similar to the
results of Kang (2007) for high school students in grade 12, his conclusion in general applies
here: the causal effects of private tutoring expenditures on academic test scores are fairly
comparable to the effects of public school expenditures (Betts, 1995; Card and Krueger, 1992,
1996; Grogger, 1996; Guryan, 2003), while some estimates of the previous studies are slightly
above our estimated effects of private tutoring expenditures (e.g., Banerjee et al., 2007; Krueger,
1999).

11 Employing a different data set of South Korea (the Korean Education and Employment Panel, KEEP), Kang
(2007) implements a similar bounding method to examine the causal effect of private tutoring expenditures on
test scores. While lower limits of the MIV+MTR+MTS bounds fail to exceed zero, their upper limits suggest
that a 10 percent increase in spending raises a student’s test score by 0.53 to 0.76 percent (Table 6).
5 Concluding Remarks
In order to shed light on the effectiveness of educational inputs for student outcomes, this paper
examines a relatively unexplored dimension of educational inputs—private tutoring expendi-
tures. In the face of difficulties in causal estimation, the paper employs IV, FD, propensity-score
matching and nonparametric bounding methods. With these methods we show that the true
effect of private tutoring remains modest at best. IV (FD) estimates suggest that a 10 percent
increase in expenditure raises average overall score by 0.029 SD or 1.06 percent (0.003 SD or
0.122 percent); matching estimates imply that the same amount of increase in expenditure leads
to a 0.21 to 0.60 percent higher average test score; the tightest bounds of the true effect of tu-
toring reveal that a 10 percent increase in expenditure improves the average test score by a low
of 0 to a high of 2.03 percent, while statistical tests fail to rule out zero effects of tutoring. Such
modest impacts of private tutoring, however, are fairly comparable to the effects of public school
expenditures on test scores and earnings estimated by previous studies. In addition, our current
findings for the effect of private tutoring for middle school students agree with the results that
Kang (2007) finds for high school students in grade 12. There is no compelling evidence that
the causal impacts of private tutoring are strong or that they differ by the grade level of the student.
Kang (2007) proposes two potential explanations for modest impacts of private tutoring in
South Korea. We believe they also apply to the current context. First, overall quality of teachers
in the private tutoring sector may be responsible for small effects of private expenditures. In
Korea, full-time public school teachers are tenured up to 62 years of age and enjoy the same
employment benefits as government officials. In contrast, contracts of instructors in private
tutoring institutions (hakwons) are usually short-term in nature and fairly unstable as in other
private small firms. This will cause teachers’ quality in the private sector to be worse than
that in the public sector. Poor teacher quality does not lead to an improvement in student
performance.
Second, peer pressure among parents may explain the lack of the effect. When private
tutoring is a norm in parents’ peer groups, the decision to invest in children’s tutoring may be
based on a subjective/cultural belief about the effectiveness of private tutoring, or the concern
about their being viewed by the peers as neglectful of children’s education. If the decision about
tutoring is based on peer pressure, small effects of private tutoring will hardly be a big surprise.
Although we suggest a couple of potential explanations for the weak impacts of private tutoring
in Korea, searching for empirical foundations for such suggestions as well as for alternative
explanations would be a useful undertaking for future research. In addition, while we find
weak effects of private tutoring in Korea, whether similar monetary educational investments
raise student educational outcomes in different countries and contexts also remains to be fur-
ther examined.
References
Altonji, J.G., Elder, T.E., Taber, C.R., 2005a. Selection on observed and unobserved variables:
Assessing the effectiveness of Catholic schools, Journal of Political Economy 113 (1), 151-
184.
Altonji, J.G., Elder, T.E., Taber, C.R., 2005b. An evaluation of instrumental variable strategies
for estimating the effects of Catholic schooling, Journal of Human Resources 40 (4), 791-
821.
Anand, P., Mizala, A., Repetto, A., 2009. Using school scholarships to estimate the effect of
private education on the academic achievement of low-income students in Chile, Economics
of Education Review, forthcoming.
Angrist, J.D., Bettinger, E., Bloom, E., King, E., Kremer, M., 2002. Vouchers for private
schooling in Colombia: Evidence from a randomized natural experiment, American Eco-
nomic Review 92 (5), 1535-1558.
Behrman, J.R., Taubman, P., 1986. Birth order, schooling, and earnings, Journal of Labor
Economics 4 (3), S121-S145.
Banerjee, A.V., Cole, S., Duflo, E., Linden, L., 2007. Remedying Education: Evidence from Two
Randomized Experiments in India, Quarterly Journal of Economics, 122 (3), 1235-1264.
Betts, J.R., 1995. Does school quality matter? Evidence from the national longitudinal survey
of youth, Review of Economics and Statistics 77 (2), 231-250.
Betts, J.R., 1996. Is there a link between school inputs and earnings? Fresh scrutiny of an old
literature, In: Burtless, G. (Ed.), Does Money Matter? The Effect of School Resources on
Student Achievement and Adult Success, Brookings, Washington, DC, pp. 141-191.
Bjerkedal, T., Kristensen, P., Skjeret, G.A., Brevik, J.I., 2007. Intelligence test scores and birth
order among young Norwegian men (conscripts) analyzed within and between families,
Intelligence 35 (5), 503-514.
Black, S.E., Devereux, P.J., Salvanes, K.G., 2005. The more the merrier? The effect of family
size and birth order on children’s education, Quarterly Journal of Economics 120 (2), 669-
700.
Black, S.E., Devereux, P.J., Salvanes, K.G., 2007. Older and Wiser? Birth Order and IQ of
Young Men, NBER Working Paper 13237.
Blundell, R., Gosling, A., Ichimura, H., Meghir, C., 2007. Changes in the Distribution of Male
and Female Wages Accounting for Employment Composition Using Bounds, Econometrica
75 (2), 323-363.
Briggs, D.C., 2001. The Effect of Admissions Test Preparation: Evidence from NELS:88, Chance
14 (1), 10-18.
Card, D., Krueger, A.B., 1992. Does school quality matter? returns to education and the
characteristics of public schools in the United States, Journal of Political Economy 100 (1),
1-40.
Card, D., Krueger, A. B., 1996. School resources and student outcomes: An overview of the liter-
ature and new evidence from North and South Carolina, Journal of Economic Perspectives
10 (4), 31-50.
Dang, H., 2007. The determinants and impact of private tutoring classes in Vietnam, Economics
of Education Review, 26 (6), 683-698.
Dang, H., Rogers, F.H., 2008. How to interpret the growing phenomenon of private tutoring:
human capital deepening, inequality increasing, or waste of resources?, Policy Research
Working Paper Series of The World Bank, No. 4530.
Evans, W.N., Schwab, R.M., 1995. Finishing high school and starting college: Do Catholic
schools make a difference?, Quarterly Journal of Economics 110 (4), 941-974.
Fan, J., 1992. Design Adaptive Nonparametric Regression, Journal of the American Statistical
Association 87, 998-1004.
Figlio, D.N., Stone, J.A., 1999. Are private schools really better?, In: Polachek, S. (Ed.),
Research in Labor Economics, Vol.18, JAI Press, Stamford, Connecticut, pp. 115-140.
Gerfin, M., Schellhorn, M., 2006. Nonparametric bounds on the effect of deductibles in health
care insurance on doctor visits - Swiss evidence, Health Economics 15 (9), 1011-1020.
Glewwe, P., Kremer, M., Moulin, S., Zitzewitz, E., 2004. Retrospective vs. prospective analyses
of school inputs: the case of flip charts in Kenya, Journal of Development Economics 74,
251-268.
Glewwe, P., Kremer, M., Moulin, S., 2007. Many children left behind? Textbooks and test scores
in Kenya, NBER Working Papers No.13300, National Bureau of Economic Research, Inc.
Goldhaber, D.D., 1996. Public and private high schools: Is school choice an answer to the
productivity problem?, Economics of Education Review 15 (2), 93-109.
Gonzalez, L., 2005. Nonparametric Bounds on the Returns to Language Skills, Journal of Ap-
plied Econometrics 20 (6), 771-795.
Grogger, J., 1996. School expenditures and post-schooling earnings: Evidence from High School
and Beyond, Review of Economics and Statistics 78 (4), 628-637.
Gurun, A., Millimet, D.L., 2008. Does Private Tutoring Payoff, IZA Discussion Paper No. 3637,
The Institute for the Study of Labor (IZA).
Guryan, J., 2003. Does Money Matter? Estimates from Education Finance Reform in
Massachusetts, Mimeo.
Hanushek, E.A., 1986. The economics of schooling: Production and efficiency in public schools,
Journal of Economic Literature 24 (3), 1141-1177.
Hanushek, E.A., 1997. Assessing the effects of school resources on student performance: An
update, Educational Evaluation and Policy Analysis 19 (2), 141-164.
Hanushek, E.A., 2003. The failure of input-based schooling policies, Economic Journal 113 (485),
F64-F98.
Haveman, R., Wolfe, B., 1995. The Determinants of Children’s Attainments: A Review of
Methods and Findings, Journal of Economic Literature 33 (4), 1829-1878.
Heckman, J., LaLonde, R., Smith, J., 1999. The Economics and Econometrics of Active Labour
Market Programme, in Ashenfelter, O. and Card, D. (eds.), The Handbook of Labor Eco-
nomics, Volume III.
Herrnstein, R., Murray, C., 1994. The Bell Curve, The Free Press, New York.
Hur, J., Kang, C., 2007. Educational Implications of School Systems at Different Stages of
Schooling, Working Paper, available at
“http://prof.cau.ac.kr/∼ckang/papers/School%20Systems%20at%20different%20levels.pdf”.
Jacob, B.A., Lefgren, L., 2004. Remedial Education and Student Achievement: A Regression-
Discontinuity Analysis, Review of Economics and Statistics, 86 (1), 226-244.
Kang, C., 2007. The Effect of Private Tutoring Expenditures on Academic Performance: Evi-
dence from a Nonparametric Bounding Method, Working Paper, Department of Economics,
National University of Singapore.
Kim, S., Lee, J.-H., 2001. Demand for education and developmental state: Private tu-
toring in South Korea, Social Science Research Network Electronic Paper Collection:
http://ssrn.com/abstract=268284.
Kreider, B., Pepper, J.V., 2007. Disability and Employment: Reevaluating the Evidence in Light
of Reporting Errors, Journal of the American Statistical Association 102 (478), 432-441.
Krueger, A.B., 1999. Experimental estimates of education production functions, Quarterly Jour-
nal of Economics 114 (2), 497-532.
Krueger, A.B., 2003. Economic considerations and class size, Economic Journal 113 (485), F34-
F62.
Krueger, A., Zhu, P., 2004. Another look at the New York City school voucher experiment,
American Behavioral Scientist 47, 699-717.
Larsson, L., 2003. Evaluation of Swedish youth labour market programmes, Journal of Human
Resources 38 (4), 891-927.
Lavy, V., Schlosser, A., 2005. Targeted Remedial Education for Underperforming Teenagers:
Costs and Benefits, Journal of Labor Economics, 23 (4), 839-874.
Lechner, M., 1999. Nonparametric bounds on employment and income effects of continuous
vocational training in East Germany, Econometrics Journal 2 (1), 1-28.
Lechner, M., 2001. Identification and Estimation of Causal Effects of Multiple Treatments under
the Conditional Independence Assumption, in Lechner, M., Pfeiffer, F. (eds), Econometric
Evaluation of Labour Market Policies, Heidelberg: Physica/Springer, pp. 43-58.
Lechner, M., 2002. Program Heterogeneity and Propensity Score Matching: An Application to
the Evaluation of Active Labor Market Policies, Review of Economics and Statistics 84 (2),
205-220.
Leuven, E., Lindahl, M., Oosterbeek, H., Webbink, D., 2007. The Effect of Extra Funding for
Disadvantaged Pupils on Achievement, Review of Economics and Statistics 89 (4), 721-736.
Leuven, E., Sianesi, B., 2003. PSMATCH2: Stata module to perform full Mahalanobis and
propensity score matching, common support graphing, and covariate imbalance testing, available
at “http://ideas.repec.org/c/boc/bocode/s432001.html”.
Manski, C.F., 1990. Nonparametric Bounds on Treatment Effects, American Economic Review
80 (2), 319-323.
Manski, C.F., 1997. Monotone Treatment Response, Econometrica 65(6), 1311-1334.
Manski, C.F., Pepper, J.V., 2000. Monotone Instrumental Variables, with an Application to the
Returns to Schooling, Econometrica 68 (4), 997-1012.
McEwan, P.J., 2004. The Potential Impact of Vouchers, Peabody Journal of Education 79 (3),
57-80.
Neal, D., 1997. The effects of Catholic secondary schooling on educational achievement, Journal
of Labor Economics 15 (1), 98-123.
Ono, H., 2007. Does examination hell pay off? A cost-benefit analysis of “ronin” and college
education in Japan, Economics of Education Review, 26 (3), 271-284.
Retherford, R.D., Sewell, W.H., 1991. Birth order and intelligence: Further tests of the conflu-
ence model, American Sociological Review 56 (2), 141-158.
Rodgers, J.L., Cleveland, H.H., Oord, E. V.D., Rowe, D.C., 2000. Resolving the debate over
birth order, family size, and intelligence, American Psychologist 55 (6), 599-612.
Rosenbaum, P.R., Rubin, D.B., 1983. The Central Role of the Propensity Score in Observational
Studies for Causal Effects, Biometrika 70 (1), 41-55.
Smith, J.A., Todd, P.E., 2005. Does matching overcome LaLonde’s critique of nonexperimental
estimators?, Journal of Econometrics 125 (1-2), 305-353.
Stock, J.H., Wright, J.H., Yogo, M., 2002. A survey of weak instruments and weak identification
in Generalized Method of Moments, Journal of Business and Economic Statistics 20 (4),
518-529.
Zajonc, R.B., 1976. Family configuration and intelligence, Science 192 (4236), 227-236.
Zajonc, R.B., Mullally, P.R., 1997. Birth order reconciling conflicting effects, American Psy-
chologist 52 (7), 685-699.
Table 1: Descriptive Statistics of the Sample

                                             Total sample   (1) First-born  (2) Later-born    Difference [(1)-(2)]
Variables                                 N    Mean    S.D.    Mean    S.D.    Mean    S.D.    Mean    S.E. T-value
Average score of three tests           9461    56.2    20.7    58.5    20.8    53.9    20.4   4.543   0.423   10.73
Test score of Korean                   9420    59.7    19.7    61.7    19.5    57.6    19.7   4.051   0.403   10.04
Test score of English                  9436    56.5    25.6    59.1    25.6    53.8    25.3   5.269   0.524   10.06
Test score of math                     9433    52.5    25.4    54.6    25.5    50.3    25.1   4.348   0.521    8.34
Total tutoring expenditures (W1,000)   9461   178.9   226.5   200.4   231.5   157.4   219.4  42.974   4.637    9.27
Any tutoring (Yes=1)                   9461   0.675   0.469   0.723   0.448   0.627   0.484   0.096   0.010   10.03
Tutoring expenditures for Korean       8594   32.68   60.70    35.8    60.8    29.6    60.5   6.223   1.308    4.76
Tutoring for Korean (Yes=1)            8594   0.425   0.494   0.468   0.499   0.382   0.486   0.086   0.011    8.11
Tutoring expenditures for English      8799   79.46  106.70    89.9   112.3    69.1    99.7  20.771   2.264    9.17
Tutoring for English (Yes=1)           8799   0.658   0.475   0.711   0.453   0.604   0.489   0.107   0.010   10.65
Tutoring expenditures for math         8865   80.34  110.09    89.6   110.4    71.0   109.0  18.671   2.330    8.01
Tutoring for math (Yes=1)              8865   0.659   0.474   0.713   0.452   0.605   0.489   0.108   0.010   10.78
Prior average score                    9461   0.073   0.992   0.186   0.991  -0.040   0.980   0.226   0.020   11.14
Prior score of Korean                  9420   0.073   0.980   0.164   0.966  -0.017   0.986   0.181   0.020    9.01
Prior score of English                 9452   0.064   0.997   0.174   1.004  -0.045   0.978   0.219   0.020   10.74
Prior score of math                    9433   0.057   0.997   0.152   0.989  -0.037   0.996   0.189   0.020    9.23
Hours of self-study                    9461   5.587   5.161   6.007   5.285   5.167   4.999   0.841   0.106    7.95
Male (Yes=1)                           9461   0.504   0.500   0.488   0.500   0.521   0.500  -0.033   0.010   -3.18
Number of children                     9461   2.214   0.709   2.003   0.651   2.426   0.700  -0.423   0.014  -30.40
Handicap (Yes=1)                       9461   0.035   0.184   0.038   0.191   0.032   0.176   0.006   0.004    1.57
First-born (Yes=1)                     9461   0.500   0.500   1.000   0.000   0.000   0.000   1.000
Intact family (Yes=1)                  9461   0.913   0.282   0.912   0.283   0.913   0.282   0.000   0.006   -0.08
Parents’ average age                   9461   42.31    4.06   40.91    3.74   43.71    3.89  -2.799   0.078  -35.67
Parents’ average education             9461   12.90    2.23   13.17    2.12   12.64    2.31   0.536   0.046   11.78
Have religion (Yes=1)                  9461   0.685   0.464   0.674   0.469   0.696   0.460  -0.022   0.010   -2.31
Family income (W1,000)                 9461  3718.9  3095.9  3792.0  3097.6  3645.7  3092.8   146.2   63.64    2.30
Survey year 2007 (Yes=1)               9461   0.496   0.500   0.495   0.500   0.498   0.500  -0.003   0.010   -0.30
School characteristics
Metropolitan city (Yes=1)              9461   0.457   0.498   0.463   0.499   0.450   0.498   0.013   0.010    1.31
Medium city (Yes=1)                    9461   0.454   0.498   0.465   0.499   0.443   0.497   0.022   0.010    2.12
Rural area (Yes=1)                     9461   0.089   0.285   0.071   0.258   0.107   0.309  -0.035   0.006   -6.00
Private school (Yes=1)                 9461   0.200   0.400   0.192   0.394   0.209   0.407  -0.017   0.008   -2.08
Coed school (Yes=1)                    9461   0.644   0.479   0.643   0.479   0.644   0.479  -0.001   0.010   -0.11
Boys-only school (Yes=1)               9461   0.182   0.386   0.168   0.374   0.197   0.397  -0.029   0.008   -3.65
Girls-only school (Yes=1)              9461   0.174   0.379   0.189   0.392   0.159   0.366   0.030   0.008    3.86
Ln(grade size)                         9461   5.491   0.803   5.545   0.753   5.437   0.848   0.107   0.016    6.52
Table 2: OLS, 2SLS and First-difference Estimates of the Effect of Tutoring Expenditures on Test Scores: Average Scores of the Three Subjects

Dependent variable:                 Ln(Tutoring        Normalized average score
                                    Expenditure)
Estimation method:                                     OLS                2SLS               First-difference
                                    (1)                (2)                (3)                (4)

Ln(Tutoring expenditures)                              0.057 (0.005)**    0.288 (0.089)**    0.033 (0.007)**
                                                       [0.211]            [1.062]            [0.122]
First-born child                    0.176 (0.031)**
Hours of self-study                 0.030 (0.003)**    0.010 (0.001)**    0.003 (0.003)      0.004 (0.002)*
Prior avg score                     0.333 (0.016)**    0.717 (0.007)**    0.638 (0.031)**    -0.405 (0.015)**
Male                                0.210 (0.035)**    -0.115 (0.016)**   -0.163 (0.026)**
No. of children                     -0.080 (0.021)**   0.008 (0.009)      0.034 (0.014)*
Handicapped                         0.108 (0.075)      -0.036 (0.034)     -0.063 (0.040)
Intact family                       0.253 (0.053)**    0.048 (0.024)*     -0.008 (0.035)
Parents’ avg age                    0.000 (0.004)      0.004 (0.002)**    0.006 (0.002)**
Parents’ avg edu.                   0.034 (0.007)**    0.018 (0.003)**    0.010 (0.005)*
Have religion                       0.153 (0.030)**    -0.031 (0.014)*    -0.066 (0.020)**
Ln(Family income)                   0.642 (0.027)**    0.004 (0.013)      -0.143 (0.058)*
Survey year 2007                    0.079 (0.027)**    0.000 (0.012)      -0.018 (0.016)     -0.004 (0.010)
Intercept                           -2.856 (0.280)**   -0.682 (0.122)**   -0.136 (0.251)
School characteristics              Yes                Yes                Yes
F (IV excluded from the 2nd stage)  32.97
R-square                            0.271              0.628
Number of sample                    9,461              9,461              9,461              4,036

Notes: Standard errors are reported in parentheses. * and ** indicate that the estimate is significant
at the 0.05 and 0.01 levels, respectively. The numbers in square brackets are percent changes in test
score due to a 10 percent change in expenditure that are evaluated at mean values.
Table 3: OLS, 2SLS and First-difference Estimates of the Effect of Tutoring Expenditures: Individual Subjects
Dependent variable:          Ln(Tutoring       Normalized test score
                             Expenditure)
Estimation method:           OLS               OLS               2SLS              First-difference
                             (1)               (2)               (3)               (4)
Panel A: Korean
Ln(Tutoring expenditures)                      0.025 (0.008)**   0.851 (0.245)**   0.038 (0.012)**
                                               [0.092]           [3.138]           [0.141]
First-born child             0.110 (0.024)**
F (IV)                       20.72
R-square                     0.096             0.415
Number of sample             8,520             8,520             8,520             3,296
Panel B: English
Ln(Tutoring expenditures)                      0.084 (0.007)**   0.339 (0.112)**   0.032 (0.010)**
                                               [0.310]           [1.249]           [0.117]
First-born child             0.151 (0.025)**
F (IV)                       36.57
R-square                     0.291             0.564
Number of sample             8,764             8,764             8,764             3,506
Panel C: Mathematics
Ln(Tutoring expenditures)                      0.111 (0.007)**   0.348 (0.117)**   0.061 (0.011)**
                                               [0.408]           [1.281]           [0.223]
First-born child             0.156 (0.025)**
F (IV)                       39.10
R-square                     0.293             0.475
Number of sample             8,807             8,807             8,807             3,535
Notes: Other explanatory variables, whose estimates are suppressed here, are the same as in Table 2. Standard errors are reported in parentheses. * and ** indicate that the estimate is significant at the 0.05 and 0.01 levels, respectively. The numbers in square brackets are percent changes in test score due to a 10 percent change in expenditure that are evaluated at mean values.
Table 4: Matching Estimates of the Effect of Tutoring Expenditures
                           Average Treatment Effects           Average Treatment Effects
                           on the Treated (ATT)                (ATE)
Outcome variables          θ_N^{1,0}   θ_N^{2,1}   θ_N^{2,0}   γ_N^{1,0}   γ_N^{2,1}   γ_N^{2,0}
                           (1)         (2)         (3)         (4)         (5)         (6)
A. Average score of        0.128**     0.062       0.170**     0.166**     0.046       0.164**
   the three subjects     (0.030)     (0.033)     (0.039)     (0.024)     (0.026)     (0.031)
   Elasticity:             0.465       0.276       0.546       0.604       0.205       0.528
B. Test score of          -0.033       0.060       0.048       0.061       0.061       0.070**
   Korean                 (0.052)     (0.048)     (0.034)     (0.041)     (0.043)     (0.026)
   Elasticity:            -0.108       0.241       0.152       0.197       0.246       0.222
C. Test score of           0.111**     0.106**     0.226**     0.160**     0.056*      0.217**
   English                (0.034)     (0.031)     (0.040)     (0.026)     (0.026)     (0.034)
   Elasticity:             0.496       0.544       0.858       0.715       0.288       0.823
D. Test score of           0.165**     0.136**     0.230**     0.181**     0.128**     0.253**
   Mathematics            (0.033)     (0.038)     (0.044)     (0.025)     (0.030)     (0.039)
   Elasticity:             0.787       0.736       0.929       0.861       0.693       1.020
Notes: Bootstrap standard errors are reported in parentheses. * and ** indicate that the estimate is significant at the 0.05 and 0.01 levels, respectively. The elasticities are percent changes in test score due to a 10 percent change in expenditure that are evaluated at mean values.
Table 5: Estimated Bounds of the Effect of Tutoring Expenditures
                        LB          UB          LB          UB          UB          UB
                                                5 pctile    95 pctile               95 pctile
Panel A: Average score of the three subjects
MTR+MTS                 Bounds                                          Elasticity
  E[y(1) − y(0)]        0.000       0.503       0.000       0.572       1.827       2.077
  E[y(2) − y(1)]        0.000       0.360       0.000       0.437       1.592       1.936
  E[y(2) − y(0)]        0.000       0.663       0.000       0.751       2.129       2.411
MIV+MTR+MTS             Bounds                                          Elasticity
  E[y(1) − y(0)]        0.000       0.488       0.000       0.544       1.772       1.977
  E[y(2) − y(1)]        0.000       0.344       0.000       0.407       1.523       1.802
  E[y(2) − y(0)]        0.000       0.632       0.000       0.702       2.029       2.255
Panel B: Test score of Korean
MTR+MTS                 Bounds                                          Elasticity
  E[y(1) − y(0)]        0.000       0.067       0.000       0.134       0.216       0.432
  E[y(2) − y(1)]        0.000       0.060       0.000       0.139       0.241       0.559
  E[y(2) − y(0)]        0.000       0.086       0.000       0.169       0.272       0.536
MIV+MTR+MTS             Bounds                                          Elasticity
  E[y(1) − y(0)]        0.000       0.059       0.000       0.131       0.192       0.422
  E[y(2) − y(1)]        0.000       0.056       0.000       0.135       0.226       0.544
  E[y(2) − y(0)]        0.000       0.079       0.000       0.152       0.251       0.483
Panel C: Test score of English
MTR+MTS                 Bounds                                          Elasticity
  E[y(1) − y(0)]        0.000       0.486       0.000       0.552       2.173       2.466
  E[y(2) − y(1)]        0.000       0.381       0.000       0.456       1.950       2.333
  E[y(2) − y(0)]        0.000       0.662       0.000       0.732       2.511       2.778
MIV+MTR+MTS             Bounds                                          Elasticity
  E[y(1) − y(0)]        0.000       0.474       0.000       0.533       2.117       2.383
  E[y(2) − y(1)]        0.000       0.371       0.000       0.442       1.899       2.261
  E[y(2) − y(0)]        0.000       0.639       0.000       0.710       2.426       2.694
Panel D: Test score of Mathematics
MTR+MTS                 Bounds                                          Elasticity
  E[y(1) − y(0)]        0.000       0.518       0.000       0.582       2.471       2.777
  E[y(2) − y(1)]        0.000       0.418       0.000       0.482       2.256       2.603
  E[y(2) − y(0)]        0.000       0.712       0.000       0.788       2.873       3.178
MIV+MTR+MTS             Bounds                                          Elasticity
  E[y(1) − y(0)]        0.000       0.507       0.000       0.575       2.418       2.745
  E[y(2) − y(1)]        0.000       0.411       0.000       0.489       2.220       2.644
  E[y(2) − y(0)]        0.000       0.694       0.000       0.772       2.800       3.114