1 The impact of health resources on education outcomes in rural India CANDICE WEI LING TAN Honours Thesis Bachelor of Commerce (Financial Economics / Business Statistics) Bachelor of Arts (History) Supervisor: Dr. Gautam Bose 27 th October 2008

  • 1

    The impact of health resources on education outcomes in rural India

    CANDICE WEI LING TAN

    Honours Thesis Bachelor of Commerce (Financial Economics / Business Statistics)

    Bachelor of Arts (History)

    Supervisor: Dr. Gautam Bose

    27th October 2008

  • 2

    Declaration

    I hereby declare that this submission is my own work and any contributions or materials

    by other authors used in this thesis have been appropriately acknowledged. This thesis has

    not been previously submitted to any other university or institution as part of the

    requirements for another degree or award.

    CANDICE WEI LING TAN

    27th October 2008

  • 3

    Acknowledgements

    I would like to thank my supervisor Dr. Gautam Bose for his support and assistance

    throughout my honours year. An enormous debt of gratitude is owed to Meliyanni Johar

    for her willingness to help, her unwavering Stata wisdom and encouragement. I would

    also like to thank Professor Denzil Fiebig for his help and insight. Thanks also to Mr.

    Hong il Yoo for graciously answering my random econometrics questions and Dr.

    Valentyn Pachenko and Dr. Shiko Maruyama for their comments.

    Finally, I would like to acknowledge my family: mum, dad and brothers Paul and David

    (thank you for your help on this thesis!) who have helped me in their own important way

    throughout my studies.

  • 4

    Table of Contents

    ABSTRACT……………………………………………………………………………..8

    1. INTRODUCTION……………………………………………………………………..9

    2. BACKGROUND ........................................................................................................ 13

    2.1 THE GLOBAL HEALTH AND EDUCATION CRISIS ................................. 13

    2.2 CASE STUDY OF INDIA................................................................................ 15

    3. LITERATURE REVIEW ........................................................................................... 17

    3.1 THE BENEFITS AND DETERMINANTS OF EDUCATION....................... 17

    3.2 EDUCATION AND HEALTH......................................................................... 18

    3.3 GENDER BIAS IN EDUCATION AND HEALTH........................................ 21

    4. CONCEPTUAL FRAMEWORK ............................................................................... 23

    5. DATA ......................................................................................................................... 27

    5.1.1 DESCRIPTION.............................................................................................. 27

    5.1.2 HANDLING OF DATA ................................................................................ 28

    5.1.3 LIMITATIONS OF DATA............................................................................ 29

    5.2 DEPENDENT VARIABLE.............................................................................. 29

    5.3 EXPLANATORY VARIABLES...................................................................... 33

    6. ECONOMETRIC APPROACH ................................................................................. 42

    6.1 BINARY DEPENDENT VARIABLE ............................................................. 42

    6.2 BIVARIATE PROBIT MODEL WITH SAMPLE SELECTION.................... 43

    6.3 EXTENSION: RANDOM EFFECTS PROBIT MODELS.............................. 47

    7. EMPIRICAL RESULTS............................................................................................. 49

    7.1 BIVARIATE PROBIT WITH SAMPLE SELECTION RESULTS ................ 49

    7.2 UNIVARIATE PROBIT MODEL RESULTS ................................................. 51

    7.3 GENDER-DISAGGREGATED PROBIT MODEL RESULTS ...................... 58

    7.4 PREDICTED PROBABILITIES...................................................................... 64

    7.5 DIAGNOSTICS................................................................................................ 67

    7.6 ROBUSTNESS CHECKS ................................................................................ 70

    7.6.1 RANDOM EFFECTS PROBIT RESULTS........................................................ 70

    7.6.2 ALTERNATIVE VARIABLE DEFINITIONS AND SUB SAMPLE ESTIMATION ............................................................................................................. 71

    8. CONCLUSION........................................................................................................... 73

  • 5

    9. APPENDIX................................................................................................................. 77

    APPENDIX 1: YEARS OF EDUCATION AND AGE DIFFERENCES………..77

    APPENDIX 2: QUALITY OF EDUCATION INDEX VARIABLES…………...78

    APPENDIX 3: DESCRIPTIVE STATISTICS FOR ALL (PARTICIPATION AND PARTICIPATE)…………………………………………………………………..78

    APPENDIX 4: DESCRIPTIVE STATISTICS FOR ALL CHILDREN (CONSISTENT AND INCONSISTENT)………………………………………...79

    APPENDIX 5: CALCULATING PARTIAL EFFECTS OF CONTINUOUS VARIABLE ON RESPONSE PROBABILITY…………………………………. 79

    APPENDIX 6: GENDER-DISAGGREGATED RANDOM EFFECTS PROBIT RESULTS FOR PARTICIPATION………………………………………………80

    APPENDIX 7: GENDER-DISAGGREGATED RANDOM EFFECTS PROBIT RESULTS FOR CONSISTENCY……………………………………………… 81

    10. REFERENCES ......................................................................................................... 82

  • 6

    List of Tables

    1. DEFINITION OF EXPLANATORY VARIABLES …………………………………………34

    2. DESCRIPTIVE STATISTICS FOR ALL SAMPLE CHILDREN …………………………...39

    3. BIVARIATE PROBIT WITH SAMPLE SELECTION RESULTS …………………………..50

    4. PROBIT MODEL RESULTS FOR SCHOOLING PARTICIPATION ………………………52

    5. PROBIT MODEL RESULTS FOR SCHOOLING CONSISTENCY ………………………...54

    6. GENDER-DISAGGREGATED PROBIT MODEL RESULTS FOR SCHOOLING PARTICIPATION ………………………………………………………………..59

    7. GENDER-DISAGGREGATED PROBIT MODEL RESULTS FOR SCHOOLING CONSISTENCY …………………………………………………………………62

    8A. PREDICTED PROBABILITIES FOR SCHOOLING PARTICIPATION (MALES) ……....65

    8B. PREDICTED PROBABILITIES FOR SCHOOLING PARTICIPATION (FEMALES) ……66

    9A. PREDICTED PROBABILITIES FOR SCHOOLING CONSISTENCY (MALES)………………………………………………………………………66

    9B. PREDICTED PROBABILITIES FOR SCHOOLING CONSISTENCY (FEMALES)…………………………………………………………………... 67

    10A. PREDICTION SUCCESSES FROM TABLE 6 (MALES)………………………………...69

    10B. PREDICTION SUCCESSES FROM TABLE 7 (MALES) ………………………………..69

    11A. PREDICTION SUCCESSES FROM TABLE 6 (FEMALES) ……………………….…….69

    11B. PREDICTION SUCCESSES FROM TABLE 7 (FEMALES)……………………………...69

  • 7

    List of Figures

    1. DISTRIBUTION OF SAGE SCORES…………………………………………………………32

    2. GENERAL HEALTH SESSION FREQUENCY AND SCHOOLING OUTCOMES………...41

    3. SOURCE OF WATER AND SCHOOLING OUTCOMES……………………………………41

    4. OBSERVATIONS IN A BIVARIATE PROBIT MODEL WITH SAMPLE SELECTION…..45

  • 8

    Abstract

    The benefits of education for productivity, growth and development have drawn the

    interest of many researchers to the determinants of schooling outcomes.

    Various factors have been considered, including parental education, gender and school

    factors. This thesis contributes to the literature by considering another pertinent factor

    in education outcomes: the health status of the child.

    Many studies have linked the health of a child to their schooling performance because of

    the effect that health has on a child’s immune system, cognitive ability and level of

    concentration. However, these studies focused on individual and anthropometric indicators

    of health. In this thesis, the focus will be on village-level health resources.

    Using detailed household survey data from rural India, this thesis investigates the impact

    that village infrastructure and resources that promote health, such as clean supplies of

    water and proximity to a hospital, have on schooling outcomes. Specifically, it examines

    whether health resources affect the propensity of a child to attend school (participation), as well as

    his or her propensity to keep up with schooling for his or her age (consistency).

    The econometric models used in this thesis take into account the possible relationship

    between these two schooling outcomes, as well as unobserved household effects that may

    impact on a child’s education. The findings of this study indicate that health resources

    have a statistically significant and practically large impact on schooling outcomes after controlling

    for a range of individual, household and other village-level characteristics. The

    effect of preventive health care measures, such as the frequency of general health

    sessions, is particularly robust. Furthermore, the effect of health resources is found to be

    greater for female education outcomes, which indicates that improvements in basic health

    provision would also help reduce the pronounced gender bias that exists in educational

    attainment.

  • 9

    1. Introduction

    Human capital investment through education is widely recognized as an important source

    of economic growth and productivity for a nation. Indeed, governments in developing

    nations often devote substantial shares of their national expenditure to the education

    sector because of the positive flow-on effects a more educated population would have on

    other sectors of the economy. Education has also been considered an effective means to ease

    inequality and to improve the opportunities for socio-economic improvement for people in

    developing countries (World Bank, 2001).

    Despite this, the education levels in developing countries remain low, and improvements

    in education outcomes continue to be needed in order to promote and sustain development

    in the world’s poorer nations. This issue is even more urgent as educational attainment is

    typically the only means for a family to break the poverty cycle. Thus the implications of

    schooling are not only considered for present individuals and families, but for future

    generations as well.

    In India, the importance of education for children has been demonstrated by the many

    programs and policies that have been pursued over the last few decades. However,

    although improvements have been made, this second-most populous nation in the world,

    like many other developing countries, still faces many challenges and obstacles to

    favourable schooling outcomes that need to be addressed.

    Because of the known benefits of education attainment, myriad studies have sought to

    determine the factors that contribute to improved schooling outcomes. This would then aid

    and direct public policy initiatives and investments to strengthen this vital area of

    development. Individual and household characteristics such as gender, household income

    and parental schooling levels are among the many factors that education papers have

    focused on. In the same spirit, my thesis will also analyse the determinants of

    education outcomes for children using detailed survey data from India. However, my

  • 10

    focus is on another potential determinant of schooling that deserves deeper analysis – the

    health status of a child.

    The topics of health and education have been addressed extensively in the economic

    literature and from many perspectives. This study will take the position that health is an

    important determinant of a student’s ability to acquire education: a healthier child

    will be more productive and capable in class, which will positively impact on their

    schooling outcome. Indeed the importance of health and nutrition on the overall welfare of

    a growing child has been widely acknowledged. From its impact on cognitive and learning

    ability, to food absorption and other deficiencies, the potential impact of health on

    schooling outcomes has gained considerable attention.

    Many studies have linked the nutritional status of children to educational achievements

    using data from various developing countries. They find that the health status of a child

    has a significant impact on their education outcomes, measured either by

    participation or by test score variations. However, these papers have focused predominantly

    on individual-level and anthropometric indicators of a child’s health, such as height-for-age

    or weight-for-age measures. In contrast, this study will focus on village-level health

    resources and infrastructure as a measure of a child’s health status.

    It is hypothesized that a child living in a village that is well endowed with quality health

    resources is more likely to achieve favourable education outcomes compared to a child

    from a village that lacks proper health services. Although the World Health Organization

    (WHO) and the United Nations (UN), among others, have frequently cited unsafe water supply and

    sanitation as having severe implications for development, the effect of these and other

    health resources on child education outcomes has not been seriously analysed.

    Using nationally representative household and village survey data from rural India, this

    thesis will investigate the extent to which health services and infrastructure in a village

    affect education outcomes. Two measures of education outcomes will be examined:

    schooling participation as well as schooling consistency (the extent to which, once in school, a

  • 11

    child keeps up with their education for their age). Although schooling participation in

    India has experienced vast improvement over the decades, improvement in the quality of

    the education received by children remains critical (WHO, 2000). Thus, schooling

    consistency is considered another important education outcome, and one that has

    not been commonly addressed in the related literature.

    Limited dependent variable models will be employed to estimate this relationship.

    Although univariate probit models were used for the main results of this thesis, other

    econometric strategies were employed in order to account for two possible sample

    problems in analysing schooling outcomes. First, almost a quarter of the children in the

    sample had not participated in the school system. If a child’s propensity to participate in

    school and a child’s propensity to have consistent schooling were correlated, simply

    estimating schooling consistency on children who have participated in school would

    amount to endogenous sample selection. Studies that have examined these two schooling

    tendencies have often assumed independence between the two, yet from a

    methodological standpoint this possible bias should be addressed in the initial stage of the

    econometric study. Thus, a bivariate probit model with sample selection is estimated.

    Second, because many children in the sample are from the same family, common factors

    may affect siblings in the same household. I employ random effects probit models in

    order to take account of this potential unobserved household heterogeneity and clustering

    effect.
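
    The two adjustments described above can be written compactly. What follows is a textbook-style sketch under assumed notation: the symbols $y_1$ (participation), $y_2$ (consistency), $x_1$, $x_2$, $\rho$ and the household effect $\alpha_h$ are illustrative choices and are not taken from this thesis, whose own specification is developed in its econometric approach section.

```latex
\begin{align*}
% Participation (selection) equation for child i
y_{1i}^{*} &= x_{1i}'\beta_1 + \varepsilon_{1i}, & y_{1i} &= \mathbf{1}[\,y_{1i}^{*} > 0\,] \\
% Consistency (outcome) equation, observed only when y_{1i} = 1
y_{2i}^{*} &= x_{2i}'\beta_2 + \varepsilon_{2i}, & y_{2i} &= \mathbf{1}[\,y_{2i}^{*} > 0\,] \\
% Errors are standard bivariate normal with correlation rho
(\varepsilon_{1i}, \varepsilon_{2i}) &\sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
  \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix} \right) \\
% Log-likelihood over non-participants, consistent and inconsistent participants
\ln L &= \sum_{y_{1i}=0} \ln \Phi(-x_{1i}'\beta_1)
  + \sum_{y_{1i}=1,\,y_{2i}=1} \ln \Phi_2(x_{1i}'\beta_1,\; x_{2i}'\beta_2;\; \rho) \\
  &\quad + \sum_{y_{1i}=1,\,y_{2i}=0} \ln \Phi_2(x_{1i}'\beta_1,\; -x_{2i}'\beta_2;\; -\rho) \\
% Random effects probit for child i in household h
y_{ih}^{*} &= x_{ih}'\beta + \alpha_h + \varepsilon_{ih}, & \alpha_h &\sim N(0, \sigma_\alpha^2)
\end{align*}
```

    In this sketch, $\Phi$ and $\Phi_2$ denote the univariate and bivariate standard normal distribution functions. When $\rho = 0$, $\Phi_2(a, b; 0) = \Phi(a)\Phi(b)$ and the log-likelihood separates into two independent univariate probits, so a test of $\rho = 0$ indicates whether the simpler univariate models are adequate.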

    In addition to investigating the impact of health resources on education outcomes,

    this thesis considers the presence of gender bias, not only in education but also in health

    and other factors. The gender bias in education has been well documented, and studies

    have typically included a single gender dummy in order to capture this effect. My analysis

    will split the sample into gender-disaggregated sub samples in the hope of capturing other

    biases that may arise. In particular, studies have shown that there is a gender bias in health

    outcomes in developing countries. Where health resources are scarce or poor, females

    tend to be more disadvantaged than males. This implies that an improvement in health

  • 12

    resources would have a greater impact on females, which could, in turn, help reduce the gender

    gap in education outcomes.

    The benefits of this rich cross sectional dataset are that several health resources variables

    are available to use as a measure of health status, as well as a range of individual,

    household and village level data to control for other factors that may affect schooling

    outcomes. Accordingly, this thesis will contribute to the literature in two main aspects.

    First, it will add to the studies that examine the determinants of education outcomes by

    focusing on the health status of the child, as well as providing fresh analysis into the effect

    of other commonly considered schooling determinants.

    Second, building from the literature that links individual nutritional and health status of a

    child to their schooling outcomes, this thesis will investigate the impact of village-level

    health resources on education outcomes. Compared to studies taking the individual health

    perspective, which have analysed variation in calorie intake or height-for-age scores, the results from this

    thesis on public infrastructure and resources may allow more direct policy implications

    and suggestions for public expenditure and investment.

    The remainder of this thesis is structured as follows. Section 2 presents some background

    on education and health in developing countries and considers the case study of India.

    Section 3 reviews the literature on education and health with emphasis placed on the

    studies that link health to education outcomes. Section 4 presents the conceptual

    framework of the investigation. Section 5 describes the cross-sectional household and

    village data from India used for the empirical analysis. Section 6 develops the

    econometric strategies employed. Section 7 presents and discusses the empirical results, which seem to

    affirm the hypothesis: preventive care health measures have a statistically

    and practically significant impact on schooling outcomes, most notably the

    positive effect of general health sessions and information. Section 8 concludes with

    suggestions for future research.

  • 13

    2. Background

    2.1 The Global Health and Education Crisis

    There are many paths to development, but the focal areas of this thesis are health and

    education. The importance of health and education was stated in the historic 1948 United

    Nations Universal Declaration of Human Rights.¹

    Everyone has the right to a standard of living adequate for the health and

    well-being of himself and of his family.

    (Article 25)

    Everyone has the right to an education…directed to the full development of the

    human personality and to the strengthening of respect for human rights and

    fundamental freedoms.

    (Article 26)

    These fundamental rights for human beings are basic for socio-economic improvement.

    Indeed, improving the lives of poor and disadvantaged people has always been a worthy

    objective in both developing and developed nations. However, this task takes on greater

    urgency in developing countries where poverty and underdevelopment are widespread,

    with potential to endure and persist through generations.

    More recently, in 2000, the United Nations set out eight international development goals to

    be achieved by the global community. The United Nations and its 189 member states

    agreed to make a concerted effort to meet these United Nations Millennium

    Development Goals (MDG) by the year 2015.² The goals include:

    1 http://www.un.org/Overview/rights.html

    2 http://www.un.org/millenniumgoals/

  • 14

    The United Nations Millennium Development Goals

    1. Eradicate extreme poverty and hunger

    2. Achieve universal primary education

    3. Promote gender equality and empower women

    4. Reduce child mortality

    5. Improve maternal health

    6. Combat HIV/AIDS, malaria and other diseases

    7. Ensure environmental sustainability

    8. Develop a global partnership for development

    Providing basic health resources and achieving good education outcomes would greatly

    assist in achieving a majority of these goals. Moreover, the poor state of education and

    health resources in the world today, largely in developing countries, further highlights the

    need for improvement in these vital areas. Progress reports from developing nations

    indicate that, contrary to the ideals espoused in 1948, the provision of proper health care

    and education are human rights that remain far from being universal.

    The United Nations Educational, Scientific and Cultural Organization (UNESCO)

    reported that 113 million school-age children around the world are not in school

    (UNESCO, 2002). This deficiency is particularly pronounced in developing nations.

    Although education is widely acknowledged as an important source of development, a key

    to poverty reduction and thus a priority for developing nations (World Bank, 2001), the

    acquisition of education in these areas is not guaranteed. In fact, it is estimated that 46 per

    cent of people in developing countries are illiterate, 25 per cent of children aged 6-12 years

    do not receive primary education and 80 per cent of children aged 13-18 years do not

    receive secondary education (Todaro, 2000).

    Health care provision is in a similarly dismal state in developing

    countries. Even in the 21st century, 11 million children die each year from preventable

  • 15

    illnesses (UNICEF, 2002). A lack of clean water supply, poor sanitation practices and

    poor health care contribute to the spread of infections and diseases that, though

    inconsequential in developed countries, prove devastating in developing areas. In fact, 1.3

    million people die every year of malaria and 1.8 million die of diarrheal diseases, and

    90% of these deaths are of children under the age of five (WHO, 2004). The

    relative ease of preventing and curing these low-level ailments further highlights the

    severity of the health crisis facing poor countries.

    2.2 Case Study of India

    Achieving favourable education and health outcomes is of great importance in a nation

    like India, the second most populous nation in the world. The Indian government has

    pledged to achieve the United Nations Millennium Development Goals in order

    to improve the livelihoods of its people. Finance Minister P. Chidambaram expressed this

    commitment during his budget speech in 2004:

    “The countries of the world, India included, have set for themselves the Millennium

    Development Goals. Our date with destiny is not at the end of the millennium, but in the

    year 2015. Will we achieve those goals? In the eleven years that remain, it is in our hands

    to shape our destiny.”³

    On the one hand, great improvements in education and health have been achieved in India

    over the last few decades. Policies and initiatives, from the Integrated Child

    Development Services (ICDS) in 1975 to the recently launched National Rural Health

    Mission, together with high economic growth in recent years, have helped to foster improved

    outcomes in these two vital areas of development. Yet serious challenges remain and

    need to be addressed in order to achieve sustained growth and development.

    3 Sachs, J. (2005). “The End of Poverty: How we can make it happen in our lifetime”, Penguin Books, Great Britain. p. 185.

  • 16

    According to the World Bank, the number of Indian children not in school has been

    reduced from 25 million in 2003 to 9.6 million in 2005-06. Greater equity in schooling has

    also been achieved, and the gaps between genders and between social groups (castes) have also

    been reduced.⁴ However, challenges in education persist. Although there have been

    improvements in schooling participation, the quality of education outcomes has not

    experienced the same progress, with incomplete schooling and drop-outs a considerable

    problem.

    In terms of health, India is also lagging behind in both the provision and quality of health

    resources. In fact, India has one of the highest percentages of undernourished children in

    the world with approximately 60 million children classified as being undernourished

    (Gragnolati et al, 2005). According to UNICEF, in 2006 some 2.1 million children under

    the age of five died in India and this figure has been attributed to India’s poor state of

    health care and delivery (UNICEF, 2006).

    Moreover, in their report on undernourished children in India, Gragnolati et al (2005)

    argue that the country’s child malnutrition problem persists in part because the focus on

    improving nutrition has primarily been on food intake. They instead highlight the role

    that infections and ill-informed health practices have played in India’s malnutrition epidemic.

    Although great improvements in the provision of improved drinking-water sources have

    been achieved (reaching 86% of the Indian population in 2004), unimproved sanitation and a lack of

    general health knowledge remain significant public health threats for

    much of the population (UNICEF, 2002). The extent to which such bleak health conditions

    contribute to poor education outcomes forms the basis of this investigation.

    4 World Bank (2008) India Country Overview 2008. See References.

  • 17

    3. Literature Review

    3.1 The Benefits and Determinants of Education

    Human capital formation through education is fundamental for development and progress.

    This is a particularly important issue in developing countries, where the poor may

    experience persistent inequality and poverty because of credit constraints and a lack of

    opportunity for socio-economic improvement. Through higher post-school earnings, a

    good education provides the opportunity for intergenerational income mobility and a

    breaking of the poverty cycle (Restuccia & Urrutia, 2004). Conversely, Behrman (1990)

    and Bedi and Gaston (1997) argue that children who are poorly educated may have low

    productivity in adult life and end up in poverty.

    However, free provision of education does not necessarily mean free consumption, and a

    vast amount of literature has been devoted to the study of schooling determinants in

    developing countries (Dreze and Kingdon, 2000; Duraisamy, 1992; Sipahimalani, 1997,

    among others). Dreze and Kingdon (2000) examined schooling participation and grade

    attainment in rural north India. Using household survey data, they employed a logit model

    for estimating participation and an ordered logit model for estimating the determinants of

    grade attainment (with three outcomes: not enrolled, enrolled but not completed, and

    completed primary schooling) for separate as well as pooled samples of female and male

    children of primary school age. They found that a range of individual and household

    variables affect participation and grade attainment in school. They particularly

    highlighted the role that parental education and school characteristics play in

    schooling outcomes. Mid-day meals were found to be particularly effective in

    improving participation in school for girls.
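
    For readers unfamiliar with the ordered logit model mentioned above, its three-outcome form can be sketched as follows. This is the standard textbook formulation under assumed notation (the cut-points $\mu_1 < \mu_2$, the latent index $y^{*}$ and the coding 0, 1, 2 are illustrative); it is not a reproduction of the exact specification used by Dreze and Kingdon (2000).

```latex
\begin{align*}
% Latent propensity for grade attainment, with a logistic error term
y_i^{*} &= x_i'\beta + \varepsilon_i, \qquad \Lambda(z) = \frac{e^{z}}{1 + e^{z}} \\
% Observed outcome: 0 = not enrolled, 1 = enrolled but not completed, 2 = completed
y_i &= \begin{cases}
  0 & \text{if } y_i^{*} \le \mu_1 \\
  1 & \text{if } \mu_1 < y_i^{*} \le \mu_2 \\
  2 & \text{if } y_i^{*} > \mu_2
\end{cases} \\
% Implied outcome probabilities, which sum to one
\Pr(y_i = 0 \mid x_i) &= \Lambda(\mu_1 - x_i'\beta) \\
\Pr(y_i = 1 \mid x_i) &= \Lambda(\mu_2 - x_i'\beta) - \Lambda(\mu_1 - x_i'\beta) \\
\Pr(y_i = 2 \mid x_i) &= 1 - \Lambda(\mu_2 - x_i'\beta)
\end{align*}
```

    Under this form, a positive coefficient shifts probability away from the lowest outcome and towards the highest, while the effect on the middle outcome is ambiguous in sign.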

    Other studies on the determinants of education outcomes have also considered a range of

    variables but focus on some key characteristics. For example, Blau and Grossberg (1992)

    find mother’s education to be an important determinant, while Brown and Park

    (2002) focus on the role that wealth and credit constraints play in schooling investments.

  • 18

    Birth order and family size are also commonly found to be significant determinants for

    schooling outcomes. The choice of explanatory variables in the model draws on these

    studies of schooling determinants.

    Another factor that has been considered in relation to education attainment is the role of

    health. The literature on education and health is also quite extensive. Some studies have

    examined them as separate inputs for growth and development. This study aims to

    investigate a relationship between the two.

    3.2 Education and Health

    Studies have compared the benefits of investing in education with the benefits of investing

    in health in the hopes of directing public policy in the area that would most benefit

    economic growth. Knowles and Owen (1995) found that health has a greater impact on

    economic growth than education. Using life expectancy as a proxy for health capital, they

    found that it has a statistically and practically significant impact on income per capita

    compared to education, and they highlighted the importance of including health capital in

    models of growth.

    Webber (2002) is even more emphatic about the apparent trade-off between education and

    health in his paper subtitled “should we invest in health or education?” He addresses this

    question using cross-sectional data from 46 countries. Webber finds that his proxy for

    health, the intake of calories per head, has a statistically insignificant

    effect on economic growth, in contrast to education. He concluded that the results support

    the notion that investing in health has lower returns for a nation than investing in

    education. His suggestions for future research, however, are to investigate other proxies

    for health, in particular health infrastructure such as the supply of clean water and quality

    of health care.

    Economic growth studies have therefore viewed education and health as two separate,

    exogenous and almost opposing inputs for growth and development. This study, however,

  • 19

    will instead consider the intricate relationship between these two important sources of

    social capital whilst focusing on individual education and health outcomes.

    Both channels of the health and education relationship have been examined. Although this

    study is investigating the impact of health status on education outcomes, the opposite

    direction of education’s effect on the health status of an individual has also been of

    interest – that is, the health benefits of acquiring education. Mushkin (1962) argues that

    ignorance delays medical treatment and subsequently increases the strength of infection

    and disease. The inability to read and understand medical information and innovations

    could also be detrimental to a person’s health. More educated people could also be more likely

    to be employed in “safer” white-collar occupations with fewer health risks and generally

    pursue activities that do not endanger their health (Case, 2002; Caldwell, 1986).

    However, this direction of the education and health relationship views education as human

    capital already acquired. Since education attainment (and the ensuing positive flow-on

    effects) is not guaranteed in developing countries, the reverse relationship is of interest in

    this study. The channel that this thesis will investigate – namely, the extent to which health

    status impacts on education outcomes – is analogous to the impact of worker productivity

    on output. In other words, we investigate the impact of the health of a child on his or her

    capacity for participating and exerting effort in school.

    The link between health and education attainment has been well established. Using

    height-for-age, weight-for-age and other anthropometric measures of child health, research across

    a range of developing countries has shown that variations in these indicators have a

    significant impact on schooling outcomes.

    The health and nutritional status of a child has been shown to determine the propensity of

    a child to participate in school. In their study of Nepalese children, Moock and Leslie

    (1986) examined the effect of nutrition status – as measured by height-for-age, weight-for-

    age and weight-for-height – on both schooling participation and grade attainment. They

    estimated a probit model for schooling participation and found that children with better


    nutritional status had a significantly higher probability of attending school compared to

    those with stunted growth. In terms of grade attainment, their ordinary least squares (OLS)

    results also come to the same conclusion on the benefits of nutrition on schooling

    outcomes. However, only 15 per cent of the 350 primary school aged children actually

    participated in school. Their analysis of grade attainment for this 15 per cent of the

    sample therefore does not account for possible sample selection into participation. This

    issue will be addressed in my study, where the proportion of non-participants is also non-trivial.

    Addressing schooling participation, Glewwe and Jacoby (1995) found that children in

    Ghana delayed enrolment in school and also completed fewer years in schooling because

    of malnutrition and poor health (measured by height-for-age). Because of the negative

    impact delayed schooling would have on post-school labour earnings, the authors

    emphasise the importance of child health and nutrition. However, after they control for

    unobserved family variables using random and fixed effects estimation, the effect of

    health is substantially reduced. This paper highlights the importance of accounting for

    unobserved factors that may affect the analysis.

    Studies also link poor health status to poor achievements by children in school as

    measured by variation in test scores. Gorman and Pollitt (1993) found that children with

    better nutrition in Guatemala performed better in cognitive and other school tests.

    Similarly, a study in the Philippines also found that a one standard deviation increase in

    early-age child health increased test scores by almost a third of a standard deviation

    (Glewwe & King, 2001). These studies thus indicate that the health status of a child

    can affect their ability to acquire education through its effects on concentration and on

    cognitive and physical ability. This in turn affects the quantity and quality of

    their education and their potential for socio-economic improvement.

    However, these studies on health and education have largely focused on the impact of

    individual health status on education outcomes. Because of this, these papers have

    primarily used two-stage least squares in order to account for the

    possible endogeneity of a child’s health with respect to their education outcomes. This is because, as


    mentioned previously, the relationship between health and education can run in

    both directions, which gives rise to this issue. Less focus, however, has been placed on

    the role of village health infrastructure and resources as a determinant of health status, and

    its subsequent impact on education outcomes. By using village level health resources as

    an explanatory variable, the threat of endogeneity is minimized. That is, a child’s

    education outcome may impact on their individual health status, but it is unlikely that a

    child’s education outcome would impact on the village’s level of health infrastructure.

    It has been established that there exists a relationship between the quality of sanitation and

    water and other village-level health resources, and the health status of households,

    particularly children. For example, Esrey (1996) found that improved water seemed to

    decrease the prevalence of diarrhoea in children by 6 percentage points when analysed

    across different countries. However, the extent to which these village-level health

    resources affect education outcomes has not been rigorously addressed, and it is this

    gap in the literature that my thesis hopes to fill. While studies have included physical

    infrastructure variables in their models that could proxy for health resources in the village,

    it is usually a single indicator that is used to measure the overall level of development in

    the village. The presence of piped water, for example, is a common proxy that has been

    used in studies of education outcomes (for example, Psacharopoulos and Arriagada, 1989;

    Holmes, 1999). The innovation in this paper is that, besides controlling for village-level

    development, various other health resource variables will be included in order to isolate a

    causal relationship, if any, between health resources and education outcomes.

    3.3 Gender Bias in Education and Health

    Gender bias in education outcomes has been a keen and important area of interest in the

    education literature (Lavy et al., 1996; King and Lillard, 1987, among others), which

    indicates there is a sharp disparity in female and male schooling outcomes. South Asia is a

    region well known for its strong male preferences and discrimination against females and

    this inequality has also been viewed from a health perspective. For example, Rosenzweig

    and Schultz (1982) and Dasgupta (1987) found that there were significant gender


    differences in household health care and resource expenditures in India as a result of

    perceived differences in future earning abilities. Studies of Pakistan found that boys

    received preferential treatment over girls with respect to treatment for illnesses such as

    diarrhoea and fevers as well as acute respiratory infections (Mahmood and Mahmood,

    1995; Filmer et al, 1998). Such bias has implications for childbirth and thus the health of

    future generations – further highlighting the importance of reducing such inequality.

    Females may also be given the task of taking care of ill family members if the

    opportunity cost of keeping them at home is considered lower than for a male counterpart.

    Therefore this issue of gender bias will also be addressed in this study, not only from a

    schooling outcomes perspective but also in terms of health outcomes.

    This thesis will contribute to the literature on the determinants of schooling by analysing a

    range of individual, household and village characteristics. This area of research is

    particularly important for developing countries in which education attainment is

    considered a key to development and poverty reduction. Furthermore, this study will

    extend the literature by focusing on a range of village-level health resources as a measure

    of health status and nutrition that has not been seriously addressed. The empirical methods

    employed in this paper will also take into account the possible sample selection of

    schooling participants that tends to be ignored in the literature, as well as accounting for

    unobserved household effects that could impact on the schooling outcomes for children in

    the same family. Finally, gender bias in education and health outcomes will also be

    examined.


    4. Conceptual Framework

    The primary hypothesis of this thesis is that village-level infrastructure and resources that

    promote health lead, through improved health conditions, to more favourable education

    outcomes. ‘Education outcomes’ here reflect schooling participation as well as schooling

    consistency (the extent to which a student keeps up with schooling according to their age). In

    terms of health resources, these can be categorized into three broad areas (De Ferranti,

    1985):

    Preventive care (patient-related): this includes services that are performed

    on well patients in order to reduce the incidence of adverse health events like

    gastro-intestinal infections, diarrhoea and malaria. They would

    include measures such as food supplements, malaria shots and other

    vaccinations.

    Preventive care (non-patient related): this includes services that are

    provided in a community in order to control the spread of disease and

    infections. These include resources such as clean water, proper sanitation and

    the promotion of good health habits and hygiene.

    Curative care: these include resources such as hospitals and health

    facilities, medical practitioners or traditional healers that act to contain and

    ease illnesses after they occur.

    The ailments that plague school aged children, particularly in developing countries, are

    typically common “low level” diseases that tend to be easily preventable (the occurrence

    of diarrhoea, for example). The prevention and treatment of more complex ailments, such

    as malaria, are also well known. This indicates the importance of investing in basic health

    care and resources.

    I posit that a community that is less endowed with health resources and measures will

    see adverse effects on the education outcomes of the children living in that area. In a

    poorly health-resourced environment, children are more likely to be afflicted periodically


    with low-level ailments that would cause temporary debilitation. The period of time that

    they remain indisposed depends on the severity of the illness, the family’s capacity for care, as

    well as the curative care services available in the area. Therefore the hypothesis of this

    thesis is that the health status of a child and their incidence of sicknesses will then impact

    on their education outcomes by affecting their ability to perform and succeed in school. It

    will also impact on the probability of school entry.

    It is hypothesized that the frequency of illnesses in the years preceding schooling age

    reduces the probability of school participation. Due to the opportunity cost of spending

    time in school, as opposed to utilizing that time in the home or engaging in paid work, the

    payoff from schooling needs to outweigh the cost of the invested time (besides other

    pecuniary schooling costs). This opportunity cost is particularly high in developing

    countries where the resources in households can be heavily constrained.

    However, time spent in school is more effective over consistent time periods rather than

    short bouts of learning. Consistent attendance in school increases the productivity and

    learning ability of the child and also gives them the opportunity to understand ideas and

    concepts that would be important for higher level learning. Therefore because schooling

    requires a long term and consistent time investment in order to provide “profitable”

    returns (in terms of potential earning ability in the future), a child who is prone to illness

    and will consequently have a transient presence in the classroom may be more productive

    in non-schooling activities or work (where returns require a less consistent time

    investment). Alternatively, the poor health status of a child may simply demand time and

    caring in the home rather than in the classroom. As such, a hypothesis of this study is that

    there exists a positive relationship between the probability of a child participating in

    school and the availability and quality of public health infrastructure in the village.

    Similarly, I hypothesize that health status impacts on children who are already enrolled in

    school by affecting their probability of keeping up with their studies. The frequency with

    which the child falls ill, and the extent that they remain ill without timely treatment,

    determines the probability that he or she successfully completes their schooling. Again, by


    missing classes due to illnesses, a child misses out on learning concepts that would be

    used to understand more difficult material. Poor health and nutrition would also affect a

    child’s cognitive ability and capacity for learning and concentrating in the classroom. This

    would increase the probability that they fall behind in their schooling.

    Another consequence of a village being endowed with poor health resources is that the

    health status of other family members, besides the school-aged children, would be poor.

    This could place more responsibility on the children to take care of their ill siblings or

    elders and give less priority to attending or keeping up with their schooling.

    Therefore this relationship between health resources and education outcomes can be

    expressed as a reduced form achievement function:

    $A_i = \beta_1 Z_i + \beta_2 H_i + v_k$ [1]

    where $A_i$ is the education outcome of child $i$, $Z_i$ is a vector of individual, household and

    other village-level characteristics that affect education outcomes, $H_i$ includes the

    village-level health resources available to child $i$, and $v_k$ is a random disturbance term that

    includes unobservable characteristics that would affect schooling outcomes, for example,

    a child’s innate ability.

    Therefore my theory is that the education outcome of a child is, among other things, a

    function of the health resources in the village, through their effects on the health status of the

    child. Implicit in this theory is that all families in a “good health” community will utilize

    the health resources available and alternatively, “bad” health resources will adversely

    impact on all families and the health status of the children in that community. This may be

    considered a strong assumption if some health measures are not available to all families in

    the village. However, the nature of public infrastructure and services is that when they are

    in place, they are available to all residents regardless of the economic means of the

    individual family. For health improvements in particular, the benefits to a whole

    community or village have been well affirmed.


    Bundy et al. (1990) identified a transmission effect from treating

    diseases in school-age children to the rest of the adult community. Hughes et al. (2000)

    and Alderman et al. (2001) also found that village-level health resources had

    spillover effects on the entire village. Therefore I assume that a village with good health

    resources would similarly have positive externalities for the wider community. Likewise,

    a poor health-resourced village would be more conducive to infectious disease and illness

    that could permeate throughout the village because of interactions and poor practices

    across families.


    5. Data

    5.1.1 Description

    The data used in this study was collected by the National Council of Applied Economic

    Research (India) in 1999. This ARIS-REDS data comes from a nationally representative

    sample of rural Indian villages and households. This rich cross-sectional dataset is

    appropriate for this study as it provides detailed individual, household and village-level

    data across a range of socio-economic characteristics including health and education

    information for every family member.

    The ARIS-REDS survey was conducted in several rounds: first in 1969,

    then in 1970, 1971, 1982 and 1999. Because of the gaps between the survey rounds,

    longitudinal analysis of the households and villages would have to contend with

    significant changes in the composition of households and villages. The 1999 data was

    chosen for this analysis as it is the most recent, the data is in Stata format and the

    directories and identifications are presented in a clear layout (relative to the earlier

    datasets in which much of the information on scanned photocopies lacked clarity). This

    dataset covers 9298 families, consisting of 44,999 individuals across 253 Indian villages

    and within 16 rural states.

    Although the majority of the 1999 data is available online (see footnote 5), merging the village and

    household data required village identifiers that are suppressed for privacy reasons.

    Professor Andrew Foster of the Department of Economics and Community Health at

    Brown University manages the full ARIS-REDS data. After obtaining approval from my

    faculty’s Human Research Ethics Advisory Panel and Professor Foster, the secure data

    allowed full merging of the relevant data decks.

    5 http://adfdell.pstc.brown.edu/arisreds_data/


    5.1.2 Handling of Data

    The sample consists of several decks of information. If a family provided answers for

    the deck 2 questionnaire (referring to household composition), then because of full

    enumeration, more detailed information for all members of the family was available in

    subsequent decks. Since the data for this study needed to be merged across household

    and individual levels, deck 2 provided an overview of the family and was considered the

    “master” deck. A household identification (ID) number and specific member ID

    matched more detailed information about the child such as years of education. For

    example, the master deck contains information about a family with ID 7072 with three

    children of schooling age. These children had specific IDs that were matched with more

    detailed information in deck 6 (sons) and deck 7 (daughters).

    However, there were instances of ID inconsistency across individuals within the family.

    For example, a son with an ID of “4” in the master deck may not match the

    specific ID recorded for him in deck 6. These inconsistencies can be attributed to the nature of

    survey data and its susceptibility to human error. In such cases, manual

    re-identifications were necessary. Although a tedious process, this ensured that the data

    remained consistent and informative for this analysis. Finally, the village data needed to

    be merged with the master deck. Secure village IDs were matched with coded identifiers

    in the master deck. Thus a fully merged dataset that consisted of individual information

    within families and across different villages was constructed for analysis.
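    The matching logic described above can be sketched as follows. This is a minimal illustration in Python, not the Stata procedure actually used for this thesis. The field names (hh_id, member_id, age, educ_years), the function merge_decks and all values other than family ID 7072 are hypothetical.

```python
# Illustrative sketch only: field names and values are hypothetical,
# apart from the family ID 7072 used as the example in the text.

def merge_decks(master, detail):
    """Match master-deck rows to detail rows on (household ID, member ID).

    Returns the merged rows and the keys of master rows with no match,
    which would need manual re-identification.
    """
    detail_by_key = {(r["hh_id"], r["member_id"]): r for r in detail}
    merged, unmatched = [], []
    for row in master:
        key = (row["hh_id"], row["member_id"])
        extra = detail_by_key.get(key)
        if extra is None:
            unmatched.append(key)
        else:
            merged.append({**row, **extra})
    return merged, unmatched


master_deck = [  # deck 2: household composition
    {"hh_id": 7072, "member_id": 3, "age": 14},
    {"hh_id": 7072, "member_id": 4, "age": 11},
    {"hh_id": 7072, "member_id": 5, "age": 8},
]
deck6_sons = [
    {"hh_id": 7072, "member_id": 3, "educ_years": 8},
    {"hh_id": 7072, "member_id": 6, "educ_years": 5},  # inconsistent member ID
]
deck7_daughters = [
    {"hh_id": 7072, "member_id": 5, "educ_years": 2},
]

merged, unmatched = merge_decks(master_deck, deck6_sons + deck7_daughters)
print(len(merged), unmatched)
```

    In this sketch the child with member ID 4 has no counterpart in the detailed decks, so it is returned in the unmatched list rather than silently dropped, mirroring the cases that required manual re-identification.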

    My study is based on a sub sample of children of the schooling age 6-18 years.

    Moreover, we are interested in children who were alive at the time of the survey as well

    as children who were living in the family. Therefore the sample did not include children

    listed in the more detailed decks that were not indicated in the master deck. These

    children were generally older, married and lived away from the family village. I was

    interested only in children living in the family village as the village-level health

    resources may only then have relevant impact on their education outcomes. Finally,

    children of families for which education and other household information were missing

    were excluded from the analysis. Therefore families with children, as indicated in the


    master deck, that did not have more detailed information in the other decks were

    excluded. This was done under the assumption that this information was missing at

    random and thus their exclusion would not adversely affect the results of our analysis.

    Therefore our analysis was reduced to a sub sample of 8,668 children of the schooling

    age 6-18 years.

    5.1.3 Limitations of Data

    There are limitations to the information contained in this dataset. First, for village data,

    information on hospitals and schools is only available for the main institution in the

    village. That is, although there may be more than one school in the village, detailed data

    about the number of qualified teachers or availability of textbooks is available for the

    representative school in the village only. This means that it has to be assumed that the

    main school in the village is representative in terms of quality of the other schools. This

    assumption has to also be placed on the village information available for hospitals and

    health centres.

    Second, the cross-sectional nature of the dataset requires the assumption that the children within the

    schooling range of 6-18 years had access to the observed schooling and health resources as they

    were maturing. That is, an 18-year-old who is shown in the data to reside in a village

    with good health resources is assumed to have had this quality of health resources as he

    or she was growing up. Because of India’s emphasis on improving health resources as

    early as the 1950s but more so in early 1980s, this assumption is considered acceptable

    and necessary given the scope and nature of the data.

    5.2 Dependent Variable

    There are different ways to measure education outcomes. Some studies have utilized and

    examined variations in standardized test scores as a measure of education attainment

    (for example, Glewwe & King, 2001; Jamison and Lockheed, 1987). Arguably, this may

    not be an adequate reflection of education achievement as schooling provides a child


    with opportunities of learning social skills through interaction and other positive

    externalities. Moreover, test scores reflect only students who are already enrolled in school –

    that is, human capital already acquired. However, as mentioned above, in many

    developing countries including India, participating in the education system is not

    guaranteed or universal but is affected by various household and socio-economic factors.

    Therefore participation in school and consistency in school are considered as measures

    of endogenous education outcomes for this analysis.

    Measuring schooling participation is uncomplicated. A child who reports at least one

    year of education is considered a school participant. For measuring schooling

    consistency, however, simply comparing the number of schooling years attained across

    different ages would be erroneous. That is, a 5 year old would have fewer years of

    schooling than an 18 year old simply because of their age difference. Although

    controlling for age could correct for this, a crux of this study is measuring the

    completeness or consistency of schooling outcomes. Therefore whether a child keeps up

    with their schooling, given their age, is of particular interest. In order to analyse these

    variations in education for children across different ages, a standardised measure that

    captures both participation and completion is used.

    A standardised measure for education outcomes, called the SAGE score (schooling for

    age), has been used in other studies and is a useful measure because it controls for

    different aged children and encompasses both schooling participation and years of

    schooling completed (Patrinos and Psacharopoulos, 1997; Gitter & Barham, 1999). The

    SAGE score is calculated as follows:

    $\mathrm{SAGE} = \dfrac{S}{A - E} \times 100$ [2]

    where S is the total number of years of schooling completed, A is the age of the child, and E is the age at which

    children officially begin schooling. The age at which children start school in India is 5

    years old. Thus a SAGE score of 100 would mean that a child’s education is consistent

    for their age, whilst a score less than 100 would mean that they have missed some years


    of schooling or not participated at all (a SAGE score of zero). Although the official

    schooling age is 5 years, children may begin some sort of pre schooling and accordingly

    SAGE scores greater than 100 are possible.
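    As a minimal sketch of equation [2], the SAGE score can be computed as below. The function name sage_score and its argument names are my own illustrative choices; the default entry age of 5 follows the official schooling age stated above.

```python
def sage_score(years_completed, age, entry_age=5):
    """Schooling-for-age score from equation [2]: S / (A - E) * 100."""
    if age <= entry_age:
        # The score is undefined at or below the official entry age.
        raise ValueError("age must exceed the official entry age")
    return years_completed / (age - entry_age) * 100


on_track = sage_score(5, 10)       # schooling consistent with age
never_attended = sage_score(0, 12)  # no years of education
pre_schooled = sage_score(3, 6)     # more years than (age - entry age)
print(on_track, never_attended, pre_schooled)
```

    The third case illustrates how a child with some pre-schooling can obtain a score above 100, since the years reported exceed the years elapsed since the official entry age.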

    Again, survey data are not free from error. The dataset used contained discrepancies in

    the child’s age and years of education that needed to be accounted for. Although these

    differences are the interest of this study, including observations in which the difference

    is obviously a case of human error could be adverse to our analysis and lead to biased

    results. Therefore, children with SAGE scores that were confidently regarded as a result

    of human error were altered accordingly. For example, a child aged 10 years is purported

    to have 40 years of education – a difference of 30 years. Changing the education years

    to ‘4’ seemed reasonable. Two other observations were changed in this way as their

    differences in age and years of education were unrealistically great in magnitude (and at

    the same time easy to infer the correct value). Note that the results did not change with

    the exclusion of these three observations. However, there were cases of ambiguity in

    which the difference between age and years of education was very small.

    The minimum cut-off for the difference between age and years of education in this analysis is three years. This means a

    6-year-old child having three years of education is considered reasonable since children

    may have had pre-schooling before the official schooling age. Moreover, the

    observations with differences of less than three years constituted only 0.8 per cent or 68

    observations of the entire sample and thus culling these observations was believed to

    have little impact on our analysis. See Appendix 1 for a table of the age-education year

    differences and the abovementioned changes.
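    The cut-off rule above can be expressed as a simple filter. The sketch below is illustrative only: the constant MIN_GAP_YEARS, the function is_plausible and the three records are hypothetical, and the rule is read as retaining an observation when age minus years of education is at least three.

```python
MIN_GAP_YEARS = 3  # minimum plausible difference between age and years of education


def is_plausible(age, educ_years, min_gap=MIN_GAP_YEARS):
    """Keep an observation only if age - years of education >= min_gap."""
    return age - educ_years >= min_gap


records = [  # hypothetical observations
    {"age": 6, "educ_years": 3},    # gap of 3: retained
    {"age": 9, "educ_years": 8},    # gap of 1: dropped
    {"age": 15, "educ_years": 9},   # gap of 6: retained
]
kept = [r for r in records if is_plausible(r["age"], r["educ_years"])]
print(len(kept))
```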

    Therefore, the final sample for analysis will consist of 8,600 observations of 6-18 year

    old children. The distribution of SAGE scores over this sample is shown in Figure 1.


    Figure 1: Distribution of SAGE scores (histogram; horizontal axis: SAGEX, 0 to 300; vertical axis: per cent, 0 to 25)

    Figure 1 shows that a significant proportion of the observations in the sample had SAGE

    scores of zero. Specifically, 2,028 or 23.57% of the entire sample of 8,600 children have

    had no years of education. Of the 6,572 children who have had at least one year of

    education, 66.84% of them have not kept up with their education according to their age

    (with SAGE scores less than 100). Approximately 6% or 569 children had a SAGE

    score greater than 100, indicating some form of pre schooling. The variation in SAGE

    scores for the sample will inform the estimation methods to be used, which will be

    detailed in the next section.

    Finally, it should be noted that variation in SAGE scores between 0 and 100 does not

    reflect the extent of schooling “completeness”. That is, a higher SAGE score does not

    necessarily mean that the child is keeping up with their schooling “better” than a child

    with a lower score. Given the way the SAGE score is constructed, such variations can be

    attributed to differences in age rather than differences in years completed. For example,


    a 17 year old with 10 years of education and a 12 year old with 5 years of education are

    both behind in their schooling-for-age by two years. Yet, the 17 year old has a higher

    SAGE score of 83 compared to 71 for the 12 year old. However, because an aim of

    this thesis is to analyse the incidence, rather than the extent, of children falling behind in

    their schooling, this particular feature of SAGE scores between 0 and 100 is not of key

    interest in this study.
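    The arithmetic behind this example follows directly from equation [2] with E = 5. Values are rounded to one decimal place on the 0 to 100 scale (equivalently 0.83 and 0.71 when expressed as fractions):

```latex
\begin{align*}
\mathrm{SAGE}_{17} &= \frac{10}{17 - 5} \times 100 \approx 83.3, &
(17 - 5) - 10 &= 2 \text{ years behind},\\
\mathrm{SAGE}_{12} &= \frac{5}{12 - 5} \times 100 \approx 71.4, &
(12 - 5) - 5 &= 2 \text{ years behind}.
\end{align*}
```

    The shortfall in years is identical for both children, but the denominator A - E is larger for the older child, which is why the two scores differ.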

    Therefore, equation [1] will be estimated with two measures of education attainment:

    schooling participation and schooling consistency. A binary outcome of school

    participation (SAGE score > 0) or not and school consistency (SAGE score ≥ 100) will

    describe the education outcomes of the children. Details on the econometric strategy and

    the modelling of these dependent variables will be provided in the next section.
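    The two binary outcomes can be derived from the SAGE score as sketched below. The function name education_outcomes is hypothetical, and the sketch codes both indicators for every child; how non-participants are treated when modelling consistency is a matter for the econometric strategy and is not assumed here.

```python
def education_outcomes(sage):
    """Return (participation, consistency) indicators for a SAGE score.

    participation = 1 if SAGE > 0, consistency = 1 if SAGE >= 100.
    """
    participation = int(sage > 0)
    consistency = int(sage >= 100)
    return participation, consistency


# Illustrative scores: a non-participant, a child behind for their age,
# a child exactly on track, and a child with some pre-schooling.
examples = [education_outcomes(s) for s in (0, 83.3, 100, 300)]
print(examples)
```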

    5.3 Explanatory Variables

    The variables of interest in this study are the village-level health resources. In addition

    to these, our rich dataset also allows for several individual, household and village level

    control variables. Table 1 presents the definitions of the variables used for this

    analysis. The inclusion of the control variables was considered given the practice and

    findings of previous studies.


    Table 1: Definition of explanatory variables

    Variable        Definition

    Individual characteristics
    AGE6_12         dummy, takes value 1 if child is aged 6-12 years
    AGE13_15        dummy, takes value 1 if child is aged 13-15 years
    AGE16_18        dummy, takes value 1 if child is aged 16-18 years
    MALE            dummy, takes value 1 if child is male
    BTHORDER        value 1 for first born, 2 for second born, 3 for third born…etcetera
    ACTIVITY        dummy, takes value 1 if child performs activity non-school related

    Household characteristics
    LAND            dummy, takes value 1 if family owns land
    FAMILY_SIZE     number of family members
    HH_EXP          family expenditure on food and non-food items as reported in 1999 (Rupees) per capita
    HEAD_EDUC       education level of head of the family and spouse

    Village characteristics
    ELEC            dummy, takes value 1 if village is electrified
    SCHOOL_DIST     distance (km) of the main school from the village
    EDUC_QUAL       education quality indicator of value 1-10
    HEALTHDIST      dummy, takes value 1 if health facility is not situated in the village
    HOSPDIST        dummy, takes value 1 if rural hospital is not situated in the village
    CHLORF          dummy, takes value 1 if frequency of well chlorination is at least every 3 months
    MALAEF          dummy, takes value 1 if frequency of malaria spraying is at least every months
    GHEALTH         dummy, takes value 1 if frequency of general health sessions given in the village is at least every 3 months
    WATERSOURCE     dummy, takes value 1 if village has improved source of water
    TOILETQ         dummy, takes value 1 if village has improved sanitation
    HTHGUID         dummy, takes value 1 if village has a health guide

    a) Village-level health resources

    There are eight health variables of interest that are considered to capture the level of

    health resources in a village.


    The proximity of a health facility to the village (health_dist and hosp_dist) was

    deemed an adequate proxy for the extent of curative care available in the village. A proxy

    for the quality of curative care could only be captured by whether the health centre had

    beds on its premises. However this had little variation over the sample and thus was not

    included.

    The variables watersource and toiletq are the indicators of safe drinking water and

    proper sanitation in a village. Following the definitions of the World Health

    Organization and UNICEF, an “improved” source of water in a village is the presence of

    a public tap, hand pump or tube well. An “unimproved” source includes canals, rivers

    and ponds. “Improved sanitation” refers to toilet facilities with a flush or semi flush, and

    “unimproved” sanitation is defined as a service latrine or open fields in the data (WHO

    and UNICEF, 2004).
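    The coding of watersource and toiletq implied by these definitions can be sketched as follows. The category strings and the function classify are hypothetical stand-ins; the actual codes used in the survey data are not reproduced here.

```python
# Categories follow the improved/unimproved definitions given in the text.
# The label strings themselves are illustrative, not the dataset's codes.
IMPROVED_WATER = {"public tap", "hand pump", "tube well"}
UNIMPROVED_WATER = {"canal", "river", "pond"}
IMPROVED_SANITATION = {"flush", "semi flush"}
UNIMPROVED_SANITATION = {"service latrine", "open fields"}


def classify(category, improved, unimproved):
    """Return 1 for an improved category, 0 for an unimproved one."""
    label = category.strip().lower()
    if label in improved:
        return 1
    if label in unimproved:
        return 0
    raise ValueError(f"unrecognised category: {category!r}")


watersource = classify("Hand pump", IMPROVED_WATER, UNIMPROVED_WATER)
toiletq = classify("open fields", IMPROVED_SANITATION, UNIMPROVED_SANITATION)
print(watersource, toiletq)
```

    Raising an error for unrecognised labels, rather than defaulting to zero, is a design choice in this sketch so that miscoded entries are surfaced instead of being counted as unimproved.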

    Other preventive care health resources in the analysis include three frequency measures.
    The frequency of well chlorination, chlorf, is considered an important water quality
    measure: an improved source of water may still contain harmful bacteria and other
    disease-causing organisms, and chlorinating water sources helps kill such organisms and
    reduce the transmission of water-borne diseases (WHO, 2004). The frequency of malaria
    spraying, malaef, helps reduce the incidence of malaria but is also an indicator of the
    other vaccinations and immunizations that the village may perform. The frequency of
    general health sessions, ghealthf, reflects the level of health consciousness in the
    village. Indeed, general health knowledge has been cited as fundamental to improving the
    health of people in developing countries. Basic practices such as hand washing, for
    example, have been acknowledged as an effective means of reducing the spread of disease
    and infection and are increasingly being promoted in developing countries (Reuters,
    2008). Finally, the presence of a health guide in a village, hthguid, is another measure
    of health resources that is thought to improve schooling outcomes by promoting good
    health practices.

    b) Individual and household characteristics

    Age dummies that reflect different levels of schooling (primary, middle and senior
    school) are included in order to capture any possible age cohort effect on schooling
    outcomes. Gender may also play an important role in determining education outcomes.
    Under traditional or cultural norms, males may be considered more important as future
    income-earners for the family (in contrast to females, who will marry and “leave” the
    family). Parents may therefore direct schooling resources towards sons rather than
    daughters.

    Although the empirical evidence has been mixed, the birth order of the child may also
    have an effect on education outcomes. Children who are born earlier (and have a low
    birth order) have fewer siblings to compete with and thus can enjoy a greater
    proportion of household resources (Lindbert, 1977). Older children may also be
    expected to provide for the family and thus be given greater access to schooling
    resources by parents. On the other hand, a family may have greater resources available
    for schooling at later stages of its life cycle. This means that older children may not
    be given the chance for schooling because household resources were more limited when
    they were of school age than they are for later-born children (Parish and Willis, 1993).
    Therefore the expected impact of a child’s birth order on their education outcome is not
    certain.

    The financial resources of the family affect the educational outcomes of children by
    influencing the family’s ability to put them through schooling. Although schooling may
    be provided free of charge or subsidized by the government, there remain other costs to
    schooling, such as textbooks, transport and other miscellaneous expenditures. Moreover,
    an important issue in developing countries is that the opportunity cost of the child’s
    schooling is their labour in the fields and supplemental income. Thus the activity
    status of a child (whether he or she engages in non-school activities, paid or unpaid)
    is included as an explanatory variable.

    Psacharopoulos and Arriagada (1989) found the family’s demand for child labour to be a
    strong indicator of schooling participation.

    Moreover, if a family has a higher income, its willingness to invest in a child’s
    education (which is a long-term investment) rather than in shorter-term child labour
    earnings is expected to be greater. However, as suggested and used by Maitra (2003), the
    log of household expenditure is used as a proxy for permanent income. This is due to
    possible mismeasurement of household income in the data, as well as the transitory
    nature of household income. Households tend to smooth consumption over time and thus
    household expenditure per head is a more appropriate proxy for permanent income. The
    data on household expenditure include food and non-food items.

    Access to credit is another factor that may influence human capital investment. If
    poor families are credit constrained and lack the collateral to borrow against their
    income, investing in human capital would be difficult. Whether or not a family owns
    land is used as a proxy for family assets and the ability to invest in education. It is
    expected that a family with more resources is more likely to access credit and invest
    in education. This is considered important in deciding schooling participation for a
    child, and thus land is used as an exclusion restriction in the econometric strategy,
    which will be detailed later.

    The education level of the head of the household, head_educ, is also included in order
    to capture its expected positive correlation with the child’s education. This may be due
    to a greater appreciation of schooling or a greater ability to assist children in their
    studies. The intergenerational effects of parental education on the schooling outcomes
    of children have been studied previously (for example, see Glick and Sahn, 2000; Brown
    and Park, 2002). Head education could also be considered a proxy for the child’s innate
    ability or IQ.

    c) Other village-level determinants of child education outcomes

    Previous studies have indicated that community-level factors in education, such as the
    distance of the village from schools, teacher-to-pupil numbers, class size and other
    quality indicators, impact on the education outcomes of children (Hanushek, 1995;
    Glewwe, 2002). Supply-side schooling factors are therefore included in this analysis.
    The distance of a village from a school entails time and transportation costs that
    could reduce the demand for schooling as well as the ability to keep up with schooling.
    An index, educ_qual, was also formed from 10 possible indicators of school quality,
    including the availability of mid-day meals, furniture for students and computers in
    the school. See Appendix 2 for the list of variables in the index.
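One way such an index could be built is as a simple count of the indicators a school satisfies. The sketch below assumes exactly that (an unweighted sum of 0/1 indicators); the thesis does not state the aggregation rule here, and the function name and indicator names are hypothetical stand-ins for the list in Appendix 2.

```python
def education_quality_index(indicators):
    """Count how many binary school-quality indicators are satisfied.

    ASSUMPTION: the index is an unweighted sum of 0/1 indicators.
    `indicators` maps a (hypothetical) indicator name to 0 or 1.
    """
    if any(value not in (0, 1) for value in indicators.values()):
        raise ValueError("each indicator must be coded 0 or 1")
    return sum(indicators.values())


# Hypothetical school with three of the ten indicators recorded.
school = {"midday_meals": 1, "student_furniture": 1, "computers": 0}
print(education_quality_index(school))
```

Under this assumption a school meeting all ten indicators would score 10 and a school meeting none would score 0.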

    Other community-level infrastructure factors were also considered as determinants of
    schooling outcomes. The level of development or income in the village could impact on
    both education and health outcomes. Thus appropriate proxy variables needed to be
    included. Unfortunately, the data allowed for limited variable choices in this respect.
    Whether a village is electrified or not, elec, is used to indicate the level of
    development in the village. Other proxies that were considered included the number of
    televisions and telephones in the village. These were poor proxies, however, because
    such counts are not informative unless the proportion of villagers owning these items
    is known. As such, only one indicator of village development is included in the
    analysis (which is one of the limitations of this study).

    Table 2 contains descriptive statistics for the variables used in the empirical model.

    Descriptive statistics comparing children who have participated in school with those
    who have not, and comparing consistently schooled with inconsistently schooled
    children, can be seen in Appendixes 3 and 4 respectively.

    Table 2: Descriptive Statistics for all children (N = 8600)

    Variable       Mean      Standard Dev.   Min      Max
    age6_12        0.539     0.498           0        1
    age13_15       0.230     0.421           0        1
    age16_18       0.230     0.421           0        1
    male           0.547     0.498           0        1
    bthorder       1.993     1.128           1        13
    activity       0.097     0.296           0        1
    familysize     6.423     2.490           2        30
    lnHH_exp       10.080    0.520           7.711    13.851
    headeduc       6.921     5.594           0        44
    land           0.740     0.438           0        1
    elec           0.917     0.276           0        1
    scdist         1.411     2.511           0        9
    educ_qual      3.788     2.180           0        10
    healthdist     0.414     0.493           0        1
    hospdist       0.268     0.443           0        1
    chlorf         0.551     0.497           0        1
    malaef         0.431     0.495           0        1
    ghealthf       0.538     0.499           0        1
    watersource    0.641     0.480           0        1
    toiletq        0.143     0.350           0        1
    hthguid        0.418     0.493           0        1

    Before econometric analysis was performed using the data, simple comparisons were

    made in order to identify any possible correlation to support the hypothesis. A priori it is

    argued that poor health resources lead to poor education outcomes. Therefore it is

    expected that children who have not participated in school or who have been

    inconsistent in their schooling would live in villages that lacked preventive and curative

    care health measures.

    Simple correlations in the data, focusing on two health resources (general health
    sessions and source of water), are shown in Figure 2 and Figure 3. Both indicate that
    this correlation could exist. Note that participation and consistency have each been
    modelled as a binary outcome.

    It can be seen from Figure 2 that a higher percentage of children who have participated
    in school live in villages with frequent (at least every two months) general health
    sessions (57%) compared to children who have not participated in school (43%). In
    terms of schooling consistency, the difference is less pronounced: 60% of children who
    have kept up with their schooling live in villages with frequent general health
    sessions, compared to 56% of children who have fallen behind in their studies.

    Source of water and schooling outcomes are compared in Figure 3. Similarly, a greater
    percentage of children who have good education outcomes live in villages with improved
    sources of water.

    Certainly these “naïve” correlations do not take into account any other factors that may

    impact on schooling outcomes. However, this preliminary exercise gave some indication

    that a relationship may exist, further justified our investigation and necessitated the use

    of econometrics to advance our analysis.

    Figure 2: General Health Session Frequency and Schooling Outcomes

    [Bar chart. Vertical axis: % of children from good health resource village (0 to 70).
    Bars: Participation 57%, Non-Participation 43%, Consistency 60%, Inconsistent 56%.]

    Figure 3: Source of Water and Schooling Outcomes

    [Bar chart. Vertical axis: % of children from good health resource village (0 to 80).
    Categories: Participation, Non-Participation, Consistency, Inconsistent.
    Data labels as extracted: 64%, 69%, 58%, 66%.]

    6. Econometric Approach

    6.1 Binary Dependent Variable

    Our main hypothesis is that a low level of health resources in a village will increase
    the probability that a child falls behind in their schooling. Moreover, it may reduce
    the propensity of a child to attend school. As mentioned earlier, the construction of
    the SAGE score means that variations in scores between zero and 100 do not reflect the
    extent of schooling “completeness”. Therefore, Ordinary Least Squares (OLS) estimation
    of variation in SAGE scores against explanatory variables would not produce meaningful
    results. Instead, the nature of the SAGE scores indicates that a binary dependent
    variable approach would be appropriate.

    A SAGE score equal to or greater than 100 indicates that a child has kept up with their
    schooling, and a SAGE score less than 100 indicates inconsistent schooling for age.
    Similarly, a positive SAGE score indicates schooling participation and a SAGE score of
    zero indicates non-participation. Thus two probit models will estimate equation [1],
    with binary dependent variables for schooling participation and schooling consistency
    respectively. However, the nature of the data indicates that there may be a sample
    selection problem when modelling schooling consistency.
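The coding rule for the two binary outcomes can be sketched as follows. This is a minimal Python illustration rather than the Stata code used for the estimations; the function name `schooling_outcomes` is hypothetical, and representing unobserved consistency as `None` is an assumption made here for clarity.

```python
from typing import Optional, Tuple


def schooling_outcomes(sage: float) -> Tuple[int, Optional[int]]:
    """Map a SAGE score to (participation, consistency).

    Participation is 1 for a positive score and 0 for a score of zero.
    Consistency is 1 for a score of at least 100 and 0 otherwise, but it
    is only defined for children who have participated; for children
    with no schooling it is returned as None (unobserved).
    """
    if sage < 0:
        raise ValueError("a SAGE score cannot be negative")
    if sage == 0:
        return 0, None
    consistency = 1 if sage >= 100 else 0
    return 1, consistency


for score in (0, 60, 100, 125):
    print(score, schooling_outcomes(score))
```

Keeping the consistency outcome undefined for non-participants makes explicit why the consistency model can only be estimated on the schooled sub-sample.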

    From Figure 1, it was shown that a sizable number of the children in the sample have had
    no schooling years at all. Because inconsistency in schooling presupposes participation
    in schooling, ignoring these observations and analysing only the sub-sample of schooled
    children could produce inconsistent estimates. This is because the sub-sample of
    schooled children may not be randomly selected. Since schooling consistency is only
    observed if the child participated in school, sample selectivity bias may arise if the
    probability of not participating in school is not independent of that of being
    inconsistent in schooling.

    Therefore, to address this suspicion of sample selection and interdependency between
    schooling participation and schooling consistency, a bivariate probit model with sample
    selection as well as univariate probit models were employed.

    6.2 Bivariate Probit Model with Sample Selection

    A probit model of the standard form models each of the schooling outcomes,
    participation and consistency. Let the superscript * indicate an unobserved or latent
    variable:

    $$Y_i^* = \beta X_i + e_i$$
    $$P(Y_i = 1 \mid X_i) = P(Y_i^* > 0 \mid X_i) = P[e_i > -(\beta X_i) \mid X_i] = \Phi(\beta X_i) \qquad [3]$$

    where $i$ is an individual subscript, $X_i$ represents a vector of characteristics and
    includes 1, $e_i$ is a standard-normally distributed error term and $\Phi(\cdot)$ is
    the standard normal cumulative distribution function. For more details, see Wooldridge
    (2002). Equation [3] is estimated by maximum likelihood using the econometric program
    Stata 9 SE (as are all estimations in this study).
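As an illustration of equation [3], the predicted probability from a probit model is the standard normal distribution function evaluated at the linear index. The sketch below uses only the Python standard library; the coefficient values are made up for illustration and are not estimates from this study.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(beta, x) -> float:
    """P(Y = 1 | X) = Phi(beta . x); x includes a leading 1 for the intercept."""
    index = sum(b * xi for b, xi in zip(beta, x))
    return std_normal_cdf(index)


# Hypothetical coefficients: an intercept and one dummy regressor.
beta = [0.5, 0.3]
print(probit_probability(beta, [1, 0]))  # dummy switched off
print(probit_probability(beta, [1, 1]))  # dummy switched on
```

Because the distribution function is non-linear, the change in probability from switching the dummy on depends on the values of the other regressors, which is why probit results are usually reported as marginal effects.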

    However, because of the possible sample selection bias from estimating the schooling

    consistency model, it is necessary to begin the analysis using a two-equation approach in

    order to determine if a sample selection problem existed. Thus, a bivariate probit model

    with sample selection was employed.

    This model consists of two simultaneous equations: a selection equation for attending
    school, $Y_{i1}$, and an outcome equation for keeping up with schooling, $Y_{i2}$. In
    other words, the econometric model considers two latent variables representing the
    propensity of a child to be educated and the propensity of a child to keep up with
    their education given their age. Let the superscript * indicate the latent variables,
    with the model specification as follows:

    $$Y_{i1}^* = \beta_1 X_{i1} + e_{i1}$$
    $$Y_{i2}^* = \beta_2 X_{i2} + e_{i2} \qquad [4]$$

    where $i$ is the individual subscript and $X_{ij}$, for $j = 1, 2$, are the vectors of
    individual, household and village-level characteristics that affect child education
    outcomes, each including 1 (the village-level health resources $H_i$ from equation [1]
    are now included in $X_{ij}$). The disturbance terms $(e_{i1}, e_{i2})$ are assumed to
    be bivariate normally distributed with zero means, unit variances and correlation
    coefficient $\rho$: $(e_{i1}, e_{i2}) \sim BVN(0, 0, 1, 1, \rho)$.
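The role of the correlation between the two disturbances can be illustrated with a small simulation. The sketch below draws correlated standard normal errors, keeps only the "selected" observations (those whose selection propensity is positive) and reports the average outcome-equation error among them. The selection intercept of 0.7, the sample size and the function name are all hypothetical choices made for this illustration, not values from the thesis.

```python
import math
import random


def mean_outcome_error_among_selected(rho, intercept=0.7, n=20000, seed=0):
    """Average of e2 over observations with intercept + e1 > 0.

    (e1, e2) are standard normal with correlation rho, built as
    e2 = rho * e1 + sqrt(1 - rho^2) * z with z independent of e1.
    """
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(n):
        e1 = rng.gauss(0.0, 1.0)
        z = rng.gauss(0.0, 1.0)
        e2 = rho * e1 + math.sqrt(1.0 - rho * rho) * z
        if intercept + e1 > 0:  # observation is selected (participates)
            total += e2
            count += 1
    return total / count


print(mean_outcome_error_among_selected(rho=0.0))  # close to zero
print(mean_outcome_error_among_selected(rho=0.8))  # clearly positive
```

When the correlation is zero, the outcome error averages to roughly zero in the selected sub-sample, so a probit on that sub-sample alone is unaffected. When the correlation is non-zero, the selected sub-sample has a non-zero mean error, which is the source of the selectivity bias discussed in the text.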

    The modelling strategy is such that the binary choice variable $Y_{i1}$ takes the value
    1 if the child has had at least one year of education (SAGE score > 0) and 0 if the
    child has had no schooling (SAGE score = 0). The second binary variable, $Y_{i2}$,
    takes the value 1 if the child has had consistent schooling (SAGE score ≥ 100) and 0
    if he/she has fallen behind for their age (0 < SAGE score < 100):

    $$Y_{i1} = 1 \text{ if schooling participation } (Y_{i1}^* > 0); \; = 0 \text{ otherwise} \qquad [5a]$$

    $$Y_{i2} = 1 \text{ if consistent schooling } (Y_{i2}^* > 0); \; = 0 \text{ otherwise} \qquad [5b]$$

    This model is a variant of the standard bivariate probit model with four types of
    observations (see Meng & Schmidt, 1985). Also known as a ‘bivariate probit model with
    partial observability’, it has three types of observations under this particular model
    structure: a child with no schooling, a child with consistent schooling or a child with
    inconsistent schooling. The observations of this two-equation probit model can be
    represented graphically, where n is the number of observations available for each
    equation:

    Figure 4. Three observations in the bivariate probit model with sample selection

    [Tree diagram. All Children (n = 8600) divide into No Participation ($Y_{i1} = 0$) and
    Participation ($Y_{i1} = 1$); Participation (n = 6572) divides into Inconsistent
    ($Y_{i2} = 0$) and Consistent ($Y_{i2} = 1$).]

    The likelihood function is therefore given by:

    $$\ell = \prod \Pr(\text{no schooling}) \cdot \prod \Pr(\text{consistent schooling}) \cdot \prod \Pr(\text{inconsistent schooling})$$

    This implies the log-likelihood function is:

    $$\ln \ell = \sum_{i=1}^{n} (1 - Y_{i1}) \ln[1 - \Phi(X_{i1}\beta_1)]
      + \sum_{i=1}^{n} Y_{i1}(1 - Y_{i2}) \ln\{\Phi(X_{i1}\beta_1) - \Phi_2(X_{i1}\beta_1, X_{i2}\beta_2; \rho)\}
      + \sum_{i=1}^{n} Y_{i1} Y_{i2} \ln \Phi_2(X_{i1}\beta_1, X_{i2}\beta_2; \rho) \qquad [6]$$

    where $\Phi(\cdot)$ and $\Phi_2(\cdot, \cdot; \rho)$ denote the univariate and bivariate
    standard normal cumulative distribution functions. Equation [6] is jointly estimated by
    maximum likelihood using Stata 9 SE.
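The three-part log-likelihood in equation [6] can be written out directly. The sketch below is a standard-library Python illustration, not the Stata routine used in this study. The bivariate normal distribution function is approximated by Simpson's rule, which is a simplification chosen here; all function names and the toy data are hypothetical.

```python
import math


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def bvn_cdf(a, b, rho, steps=800):
    """Approximate Phi_2(a, b; rho) by Simpson's rule applied to
    the integral over x <= a of phi(x) * Phi((b - rho*x) / sqrt(1 - rho^2))."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    lower, upper = -8.0, min(a, 8.0)  # mass beyond +/-8 is negligible
    if upper <= lower:
        return 0.0
    scale = math.sqrt(1.0 - rho * rho)

    def integrand(x):
        return norm_pdf(x) * norm_cdf((b - rho * x) / scale)

    h = (upper - lower) / steps  # steps must be even for Simpson's rule
    total = integrand(lower) + integrand(upper)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(lower + k * h)
    return total * h / 3.0


def dot(x, beta):
    return sum(xi * bi for xi, bi in zip(x, beta))


def log_likelihood(beta1, beta2, rho, data):
    """Log-likelihood [6]; data holds tuples (x1, x2, y1, y2).

    y2 is ignored when y1 == 0, since consistency is then unobserved.
    """
    ll = 0.0
    for x1, x2, y1, y2 in data:
        a, b = dot(x1, beta1), dot(x2, beta2)
        if y1 == 0:  # no schooling: 1 - Phi(a)
            ll += math.log(norm_cdf(-a))
        elif y2 == 1:  # participated and consistent: Phi_2(a, b; rho)
            ll += math.log(bvn_cdf(a, b, rho))
        else:  # participated but inconsistent: Phi(a) - Phi_2(a, b; rho)
            ll += math.log(norm_cdf(a) - bvn_cdf(a, b, rho))
    return ll


# Toy data with an intercept and one dummy in each equation (made-up values).
toy = [([1, 0], [1, 0], 0, None), ([1, 1], [1, 1], 1, 1), ([1, 1], [1, 0], 1, 0)]
print(log_likelihood([0.5, 0.2], [0.1, 0.3], 0.4, toy))
```

In practice this function would be passed to a numerical optimiser over the two coefficient vectors and the correlation; with the correlation fixed at zero the likelihood separates into two univariate probit likelihoods.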

    This econometric strategy was chosen to deal with the potential sample selection problem.

    The need for this model is indicated by the statistical significance of ρ – the correlation

    coefficient of the dual equation errors. If ρ is statistically significant, this indicates that

    there is a relationship between the two schooling propensities.

    Although schooling participation is fully observable, estimating this first probit
    equation on its own would be inefficient if ρ is statistically significant. Moreover,
    the second probit model would suffer from selectivity bias if only the schooled
    sub-sample were analysed. However, if ρ is not statistically different from zero, two
    univariate probit models estimating the probabilities of schooling participation and
    consistency following [3] would be appropriate.

    Identification in a multiple equation probit model has been a source of some debate.

    Maddala (1983) stated that in order to identify the second equation, at least one variable

    needed to be included in the selection equation that is not included in the outcome

    equation. However, Wilde (2000) argues that exclusion restrictions are not needed (and

    that Maddala was considering a specific example) if there is sufficient variation in at least

    one exogenous regressor in each equation. Essentially, the non-linearities in the probit

    models are considered sufficient for identification.

    For prudence’s sake, an exclusion restriction was included in this analysis. The variable,

    land, was considered a good proxy for a family’s assets and potential to invest in

    schooling. This asset proxy is not considered relevant for schooling consistency (with

    household income more pertinent). Using Stata, the bivariate probit with sample selection

    model did not converge without this identifier and thus its inclusion seemed necessary as

    well, despite Wilde’s argument.

    6.3 Extension: Random Effects Probit Models

    Probit analysis of schooling participation and schooling consistency assumes that each
    observation or child is independent. However, the sample consists of 3820 families with
    an average size of 2.3 members. Therefore there may be some characteristics that are
    not specified in the model but are common to children in the same family and household.
    Examples of this clustering effect could be parental competence in assisting children
    with their schoolwork or parental preference for education or health resources. Because
    intra-cluster correlation would create biased parameter estimates, an extension of our
    econometric models to take into account unobserved household heterogeneity is necessary.

    A random effects probit model is used to account for children’s education data being
    clustered at the family level. For a detailed discussion of the model, see Maddala
    (1987). This model considers the household effect to be random. The latent variable
    (for general schooling outcomes) is thus of the form:

    $$Y_{ih}^* = \alpha + X_h \beta + Z_{ih} \gamma + v_{ih} \qquad [7]$$

    where $i$ and $h$ are individual and household subscripts respectively. The $1 \times K$
    vector $X_h$ contains the explanatory variables that vary only at the household level.
    The $1 \times L$ vector $Z_{ih}$ contains the explanatory variables that vary within
    households or clusters.

    The random effects probit model assumes that the error term $v_{ih}$ is composite in
    nature:

    $$v_{ih} = c_h + u_{ih} \qquad [8]$$

    where $i$ and $h$ are individual and household subscripts respectively, $c_h$ is the
    unobserved household effect and $u_{ih}$ is the idiosyncratic error. It is assumed that
    $u_{ih} \sim \text{i.i.d. } N(0, 1)$ and $c_h \sim N(0, \sigma_c^2)$. Thus,

    $$\operatorname{Var}(v_{ih}) = 1 + \sigma_c^2 \qquad [9]$$

    $$\rho = \frac{\sigma_c^2}{1 + \sigma_c^2} \qquad [10]$$

    where $\rho$ is the proportion of the error variance that is due to the unobserved
    household effect.
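The same quantity can also be read as the correlation between the composite errors of two children in the same household. The short derivation below is a sketch that assumes, in addition to the distributions stated above, that the household effect is independent of the idiosyncratic errors and that the idiosyncratic errors are independent across children.

```latex
% Assumption (not stated explicitly in the text): c_h is independent of u_{ih},
% and u_{ih}, u_{jh} are independent for two different children i and j in household h.
\begin{align*}
\operatorname{Cov}(v_{ih}, v_{jh})
  &= \operatorname{Cov}(c_h + u_{ih},\; c_h + u_{jh})
   = \operatorname{Var}(c_h) = \sigma_c^2, \qquad i \neq j, \\
\operatorname{Corr}(v_{ih}, v_{jh})
  &= \frac{\sigma_c^2}{\sqrt{(1 + \sigma_c^2)(1 + \sigma_c^2)}}
   = \frac{\sigma_c^2}{1 + \sigma_c^2} = \rho .
\end{align*}
```

For example, if the household effect had the same variance as the idiosyncratic error (a variance of 1), half of the composite error variance would be attributable to the household, and the value in [10] would be 0.5.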

    The random effects probit model rests on some strong assumptions, notably that there is
    no relationship between the explanatory variables and the unobserved household effect,
    $c_h$. This assumption is needed in order to produce consistent estimates. An
    alternative model to control for unobserved heterogeneity is the fixed effects model.
    However, because our variables of interest are at the non-individual level, using fixed
    effects would effectively

    drop