Assessing the impact of in utero exposure to famine on fecundity: Evidence from the 1959–61 famine in China

Population Studies: A Journal of Demography

Publisher: Routledge

Published online: 15 Mar 2013

To cite this article: Shige Song (2013) Assessing the impact of in utero exposure to famine on fecundity: Evidence from the 1959–61 famine in China, Population Studies: A Journal of Demography, 67:3, 293-308, DOI: 10.1080/00324728.2013.774045

To link to this article: http://dx.doi.org/10.1080/00324728.2013.774045


Shige Song

Queens College of The City University of New York

This study identifies a significant increase in sterility among rural, but not urban, Chinese women who were conceived and born during the 1959–61 famine that resulted from the Great Leap Forward. Applied to data from two large-scale, nationally representative, sample surveys of Chinese women of childbearing age conducted in 1997 and 2001 by the State Family Planning Commission, difference-in-differences analysis revealed that exposure to the famine while in the womb caused an increase in the risk of sterility amongst the adult women surveyed of 1.1 per cent. This is a substantial increase given that the overall prevalence of primary and permanent sterility is only slightly over 1 per cent in China. These findings support the hypothesis that a woman exposed to acute malnutrition while in the womb may experience a long-term negative impact on her reproductive system, which could result in permanently impaired fecundity.

Keywords: developmental origins of health and disease; fecundity; female sterility; mixture model; long-term survivor; China; famine; difference-in-differences analysis

[Submitted April 2011; Final version accepted July 2012]

Introduction

When food consumption falls below a critical minimum level, women stop ovulating and thus cannot conceive; when food levels rise once again ovulation is restored, explaining why acute malnutrition may temporarily decrease fecundity. Recent progress in the search for the foetal, or developmental, origins of health and disease suggests that famine may also have a long-term impact on fecundity by affecting the in utero development of the organs responsible for the production and regulation of reproductive hormones (Lumey and Stein 1997; Gluckman et al. 2005; Gardner et al. 2009). If this is indeed the case, then the demographic consequences of famine are likely to be substantially greater than has previously been believed.

Evidence for the proposed long-term impact of famine exposure on human fecundity is limited and inconsistent. There are only three relevant studies, all of which were of the effects of the major famine resulting from the German occupation of the Netherlands during the Second World War: the ‘Hunger Winter’ of 1944–45 (Stein 1975). Using hospital records and interview data on 700 Dutch women born between 1944 and 1946, Lumey and Stein (1997) reported that girls exposed in utero to the Dutch famine were not significantly different from girls not exposed in this way in age at menarche, age at the birth of first child, number of births, or the proportion who remained childless. In contrast, a second study based on the Dutch famine, albeit with a smaller sample of 473 women, reported that women exposed to the famine while in the womb went on to begin childbearing at a younger age, to have more children, to have more twin births, and to be less likely to remain childless than women who were not exposed (Painter et al. 2008). A third study based on the Dutch famine reached conclusions different from each of the other two. Examining data from 7,941 women born in 1932–41, Elias et al. (2005) showed that women exposed to severe famine during childhood, rather than in utero, grew up to experience a significantly decreased chance of bearing a first or second child, and, for medical reasons, an increased chance of having no children or fewer than desired.

This paper reports results from a study that uses high-quality survey data in conjunction with the difference-in-differences method to identify the long-term effect on the fecundity of women exposed, while in the womb, to China’s 1959–61 famine. The famine was the result of a political campaign (the ‘Great Leap Forward’) to modernize the country’s agrarian economy (Dikotter 2010). The plan of the paper is as follows. I first review the demographic literature on human fecundity, paying special attention to issues of measurement and estimation. In light of this literature, I then discuss some of the shared weaknesses of the Dutch famine studies and assess how these weaknesses may have influenced their findings. Next, I describe the empirical context of the current study and the data, variables, and identification and estimation strategy used. Finally, I present the statistical results and discuss their policy relevance.

Population Studies, 2013, Vol. 67, No. 3, 293–308, http://dx.doi.org/10.1080/00324728.2013.774045

© 2013 Population Investigation Committee

Measuring fecundity: a demographic approach

Fecundity and fertility

Fecundity and fertility are closely related to each other. Fecundity refers to the biological capacity to reproduce, whereas fertility refers to the number of offspring actually produced. Because fecundity is not directly observable, demographers have to infer fecundity from attained fertility. The total number of children borne by a woman can be treated as a proxy measure of her fecundity only among populations with natural fertility, where there is no contraceptive use. There are almost no contemporary populations that practise natural fertility, and therefore the number of children each woman bears seldom reflects her biological capacity to reproduce. In some special cases, however, even in contemporary populations, it is possible to infer the two key aspects of fecundity, fecundability and sterility, from the time between marriage and first birth, or from the lack of any child.

Fecundity, fecundability, and sterility

Fecundability refers to the monthly probability of conceiving amongst women who are in a susceptible state and who do not practise any form of contraception. In populations with natural fertility, it is possible to estimate fecundability using information on birth intervals. In most contemporary populations, however, the widespread use of highly effective contraceptive methods severely distorts the biological pace of childbearing, making estimates of fecundability that are based on birth intervals a less reliable indicator of fecundity. In addition, estimates of fecundability may be influenced by heterogeneity in fertility-related behaviours, such as the frequency of sexual intercourse (Dunson and Zhou 2000).
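As a rough illustration of the definition of fecundability, suppose it is constant over time and identical across women. This is a simplification that the behavioural heterogeneity just mentioned would violate. Under that assumption the waiting time to conception follows a geometric distribution. The sketch below is not part of the study; the function names and the monthly probability of 0.2 used in the usage note are hypothetical.

```python
def prob_conceive_within(p: float, months: int) -> float:
    """Probability of at least one conception within `months` months,
    assuming a constant monthly conception probability p (an assumption)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if months < 0:
        raise ValueError("months must be non-negative")
    return 1.0 - (1.0 - p) ** months


def expected_wait_months(p: float) -> float:
    """Mean number of months until conception under the geometric model (1/p)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return 1.0 / p
```

For example, `prob_conceive_within(0.2, 12)` gives the share of women expected to conceive within a year if the hypothetical monthly probability were 0.2. Contraceptive use breaks the constant-probability assumption, which is why birth intervals become a poor guide to fecundability in contemporary populations.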

Sterility, also known as ‘permanent’, ‘complete’, and ‘primary’ sterility (Wood 1994, pp. 443–4), refers to an irreversible biological state that lasts throughout a woman’s entire reproductive period. Childlessness is a reasonable proxy measure of it. When data on contraceptive use or frequency of sexual intercourse are lacking, the prevalence of sterility, which is primarily influenced by biological factors, may be considered a more reliable measure of fecundity than fecundability (Weinstein and Stark 1994; Wood 1994). The only non-biological factor that must be taken into account when estimating the prevalence of sterility is the extent of a preference for voluntary childlessness within a population, since childlessness by choice cannot be taken to indicate sterility.

In the demographic literature the estimation of fecundability and the estimation of sterility have largely been treated as two separate issues (Trussell and Wilson 1985; Kallan and Udry 1986; Larsen and Menken 1989; Weinstein et al. 1990; Larsen and Vaupel 1993). However, during the past 20 years demographers have begun to realize that these two closely related aspects of fecundity should be analysed jointly because each provides additional information about the other (Wood 1994). A number of statistical procedures, each with different underlying behavioural assumptions and varying degrees of sophistication, have been proposed to estimate sterility and fecundability simultaneously using information on birth intervals (Heckman and Walker 1990; Larsen and Vaupel 1993; Wood 1994; Dunson and Zhou 2000).

Lessons from the Dutch famine studies

The cohort of women born in the Netherlands during the 1944–45 Dutch famine can hardly be described as a natural-fertility population. When these women reached childbearing age in the 1960s, the vast majority of them used highly effective contraceptive measures (Moors 1978; Leridon 1981). As a result, the number of children borne by each of these women depended heavily on choice and could not be used as an indicator of fecundity (although see Lumey and Stein 1997; Painter et al. 2008). Since it is impossible to obtain reliable estimates of fecundability from birth-interval information without knowing the exact timing of contraceptive use, the high level of contraceptive use amongst the Dutch population casts doubt on the relevance of any estimated cohort differences in fecundability based on birth-interval information.

In addition, none of the studies based on the 1944–45 Dutch famine attempted to exclude from their analyses those women who were voluntarily childless, although they comprised up to a quarter of the cohort born during the famine (Den Bandt 1980). Such a high percentage of voluntarily childless individuals in the study populations could easily have biased the results on cohort differences in sterility in unexpected ways.

Finally, none of the studies of the Dutch famine controlled for age at marriage which, although the relationship is not yet well understood, has been shown to be an important correlate of childlessness in historical populations (Trussell and Wilson 1985; Menken et al. 1986).

In summary, none of the studies of the Dutch famine were able to achieve an adequate measure of women’s fecundity. It is therefore necessary to treat with caution the conclusions drawn from these studies about the long-term effect of famine exposure on women’s fecundity.

The current study

The Great Leap Forward, the 1959–61 famine, and urban–rural differences in famine severity

A number of studies have detailed the causes and magnitude of the 1959–61 famine in China (Ashton et al. 1984; Peng 1987; Kung and Lin 2003), and these indicate that the famine affected most parts of the country, causing approximately 30 million excess deaths and an additional 33 million foetal losses, making it one of the most disastrous events in human history. The famine was much more severe in rural areas than in urban ones. Although reliable data on changes in daily calorie intake during the famine are not available, indirect estimates have suggested that grain availability declined much more dramatically in rural areas, exacerbating a pre-existing urban–rural difference in grain availability per head and pushing the daily calorie intake of many rural residents below subsistence level (Peng 1987; Lin and Yang 2000). In contrast, the effects on the urban population were far less severe; starvation rarely occurred in the cities.

Figure 1 shows the annual crude mortality rates for the populations of urban and rural China from 1956 to 1964 (State Statistical Bureau 1991; Lin and Yang 2000). Among the urban population there is no sign of a major increase in mortality during the three famine years 1959, 1960, and 1961. Indeed, in 1959 and 1961 the rate of mortality in urban areas was not much higher than the rate in rural areas during non-famine years. Furthermore, amongst the urban population the rate of mortality in the peak famine year, 1960, was lower than the rate of mortality in rural areas in either 1959 or 1961, and much lower than the peak rate of rural famine mortality in 1960. The Chinese population recovered quickly, however. In 1962, immediately after the famine ended, the national mortality rate returned to its pre-famine level and the country’s fertility rate, which had fallen markedly during the famine, rose well beyond its pre-famine level and subsequently remained high for several years (Peng 1987).

An important reason for the urban–rural difference in famine severity was the different treatment meted out to urban and rural residents by China’s socialist system. Whereas urban residents were granted legal rights to food security through a system of food rationing, rural residents, who produced and supplied the food, were legally bound, under the coercive quotas imposed by the national food-procurement system, to surrender their produce (Lin and Yang 2000). In years of poor harvest, such as 1959, 1960, and 1961, peasants and their families had little food remaining after they had fulfilled quotas. The extreme measures used by the state to extract ‘surplus’ food from the peasants left the latter with no choice but to comply, although resistance and conflicts were not uncommon (Walker 1984; Dikotter 2010). The different impact which the famine had on urban and rural areas, as depicted in Figure 1, contributes crucially to the information needed to identify and estimate the long-term effect of famine exposure on women’s fecundity.

Figure 1 Annual crude mortality rates per 1,000 population for the urban and rural populations of China, 1956–64 (horizontal axis: year; vertical axis: mortality rate per thousand; series: urban population, rural population)
Source: Statistical Yearbook of China 1991 (State Statistical Bureau 1991, pp. 79–80).

The famine resulting from the Great Leap Forward has been used to study both short-term and long-term effects of exposure in utero to acute malnutrition. The outcomes include the following: involuntary foetal loss suffered by the mothers (Cai and Wang 2005); a reduced propensity to have male offspring (Song 2012); and changes in mortality (Song 2010) and disability among children born during the famine (Mu and Zhang 2010). As a source of data on the effects of famine, China’s famine differs strikingly from the Dutch example owing to the unusually long duration of the former, the extent to which it affected the population, and the significant spatial variations in the severity of its impact. Previous studies have shown that the Chinese famine provides an opportunity to investigate outcomes on which the studies based on the Dutch famine failed to provide consistent results (Song 2012).

The one-child policy and changing fertility patterns in China

When the cohorts of Chinese women born during the 1959–61 famine entered their childbearing years in the early 1980s, China was in the process of changing from a ‘later, longer, fewer’ family planning policy to a more rigid policy of ‘one child per couple’. Whereas the earlier policy had encouraged couples to marry late, to have no more than two children, and to have long birth intervals, the new policy prohibited couples from having more than one birth, while at the same time relaxing the regulations about late marriage and long birth intervals. In 1980 a new marriage law made the legal minimum age at marriage 20 for women and 22 for men, both considerably lower than the de facto local standards of the 1970s (Song 2004).

These policy changes, combined with other contemporaneous social and economic changes, such as a shift away from arranged marriage and increased intimacy and sexual activity between spouses, as well as a rapid increase in formal education for both men and women, led to a secular decline in both the age at marriage and the age at first birth (Wang and Yang 1996; Hong 2006). Clearly, these demographic changes were policy-induced and had nothing to do with the 1959–61 famine. Although the ‘later, longer, fewer’ policy in the 1970s and the ‘one-child policy’ that was adopted in the early 1980s both aimed to prevent people from having ‘too many’ children, under neither policy was having one child considered to be having ‘too many’. According to Scharping (2003, pp. 215–6), a long-standing feature of life in rural China is that virtually all women desire to have children. In addition, although the use of contraceptives has been widespread in China since the 1980s, few Chinese women practise contraception before the birth of their first child, particularly in rural areas (Choe and Tsuya 1991; Short et al. 2000). Women who do not proceed to have at least one child are therefore highly likely to be involuntarily childless and suffering from primary sterility.

Thus, in the unique historical context of Chinese society over the past half century, neither the ‘number of children borne’ nor birth spacing can reliably be used to describe cohort trends in women’s fecundity. Being much less sensitive to changes in policy, society, or the economy, women’s sterility, as measured by involuntary childlessness, can, in contrast, be used to capture famine-induced changes in fecundity.

Research hypothesis

Table 1 summarizes the in utero famine exposure status of six selected birth cohorts of Chinese women and the hypothesized effect on their fecundity in adulthood. Of the six birth cohorts considered, the 1957–58 cohort were born before the famine and experienced the famine as young children. Both the 1963–64 and 1965–66 cohorts were born after the famine and thus were not exposed to its pernicious effects. The 1959 and the 1962 cohorts were only partially exposed to the famine in utero, although at different gestational ages. Finally, the 1960–61 cohort were exposed to the famine from the moment of conception to birth.

Table 1 Windows of in utero exposure to the 1959–61 famine in China by birth cohort and the expected effects on fecundity by rural and urban residence

Birth cohort  Prenatal famine exposure  Expected effect, rural women  Expected effect, urban women
1957–58       No exposure               No effect                     No effect
1959          Partial exposure, late    Some damaging effect          Weak to no effect
1960–61       Full exposure             Full damaging effect          Weak to no effect
1962          Partial exposure, early   Some damaging effect          Weak to no effect
1963–64       No exposure               No effect                     No effect
1965–66       No exposure               No effect                     No effect

To make the cohort comparison as sharp and informative as possible, the main statistical analysis of the current study focused primarily on the contrast between the following three cohorts: the pre-famine cohort (1957–58), the famine cohort (1960–61), and the post-famine cohort (1963–64). According to hypotheses about the foetal origins of health and disease, only the period spent in the womb is vital for the development of the woman’s reproductive system and it was therefore hypothesized that individuals from the 1960–61 rural cohort would have significantly reduced fecundity. This was because they had been exposed to the famine throughout their gestation and would have been most affected by the acute malnutrition suffered by their mothers during the famine. It was further hypothesized that members of the 1963–64 rural cohort would exhibit ‘normal’ levels of fecundity during their own childbearing years because they were not exposed to the famine in utero. Finally, it was assumed that women of the 1957–58 rural cohort would also have more or less normal levels of fecundity because they had been exposed to the famine during early childhood rather than in the womb. Cohort variations in fecundity were expected to be much weaker among the urban population because annual differences in exposure to the famine were much less marked in urban areas.
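The logic of this cohort-by-residence contrast can be sketched with raw prevalences. The example below is only an illustration of the difference-in-differences idea, not the estimation procedure used in the study. It assumes, for simplicity, that the pre-famine and post-famine cohorts are averaged to form the non-exposed baseline, and every prevalence figure in it is hypothetical.

```python
def did_estimate(prev: dict) -> float:
    """Difference-in-differences in sterility prevalence (percentage points).

    `prev` maps (residence, cohort) to a prevalence in per cent, where
    residence is 'rural' or 'urban' and cohort is 'pre', 'famine', or 'post'.
    """
    def famine_gap(residence: str) -> float:
        # Famine cohort minus the mean of the two non-exposed cohorts
        # (averaging the two is an assumption made for this sketch).
        baseline = (prev[(residence, "pre")] + prev[(residence, "post")]) / 2.0
        return prev[(residence, "famine")] - baseline

    # The urban gap absorbs cohort trends unrelated to famine exposure.
    return famine_gap("rural") - famine_gap("urban")


# Hypothetical prevalences (per cent), invented purely for illustration.
example = {
    ("rural", "pre"): 1.0, ("rural", "famine"): 2.0, ("rural", "post"): 1.0,
    ("urban", "pre"): 1.5, ("urban", "famine"): 1.5, ("urban", "post"): 1.5,
}
print(did_estimate(example))  # 1.0
```

Subtracting the urban gap removes any change between cohorts that is common to both populations, so what remains is attributed to the sharper famine exposure in rural areas.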

Research design

Data and sample

The analyses undertaken in the current study were based on data from the two most recent, nationally representative, sample surveys conducted by China’s State Family Planning Commission: the 1997 National Population and Reproductive Health Survey and the 2001 National Family Planning and Reproductive Health Survey. The sampling plans and questionnaires used in both surveys were very similar. The questionnaires collected pregnancy and birth-history information from women of childbearing age residing in family households across all 31 of China’s provinces. The 1997 survey interviewed 15,000 women and the 2001 survey 39,586. The surveys provided good population coverage and the data collected are considered to be of high quality (Zhang and Zhao 2006; Chen et al. 2007; Song and Burgard 2011).

Two outcome variables were extracted from the retrospective birth histories and these were then used to estimate fecundability and sterility jointly. The variables were ‘ever had a child’, a binary variable indicating whether or not a respondent had ever given birth, and ‘time to first birth’, a continuous variable measuring, for each woman who had ever given birth, the interval, in years, between her first marriage and first birth.

Most existing studies define a woman as sterile if she has not given birth after 2 years of marriage (Collins et al. 1983; Fang et al. 1993). Some studies have adopted a more rigid definition, classifying a woman as sterile only if she has not given birth after 7 years of marriage (Larsen 2000; Liu et al. 2004). For the current study, women aged 33–40 in the 1997 survey and women aged 37–44 in the 2001 survey were selected for analysis. Because rural women had married, on average, in their early 20s and urban women in their mid-20s (see Table 3), a woman who had not given birth by the time of the surveys had therefore been ‘at risk’ of having a child for considerably more than 7 years, so the definition of sterility used in the current analysis was particularly rigorous.

Table 2 compares the estimated prevalence of sterility amongst the women sampled from the 1997 and 2001 surveys, using each of the three definitions of sterility given above: no child born after 2 years of marriage, no child born after 7 years of marriage, and no child born by 1997 or 2001. The estimates calculated using the last of these definitions were assumed to indicate the true prevalence of primary sterility. When the 2-year cut-off was applied to the 1997 and 2001 data, the true prevalence of women’s sterility was vastly overestimated. When the 7-year cut-off was applied, the true prevalence was again overestimated but by a much smaller margin. The estimates for the three cohorts derived from the 1997 survey data were largely consistent with those derived for the same cohorts from the 2001 survey data. When the 7-year cut-off was applied to the data in the two surveys, approximately 1.8 per cent of women were estimated to be sterile. This was a similar magnitude to the 1.3 per cent rate estimated for China by earlier studies using the same definition of sterility (Liu et al. 2004). However, both these rates are lower than comparable estimates for other societies where 2.1–4 per cent of women were considered to be sterile (Bongaarts 1980; Wrigley 1997; Larsen 2000). The reasons for the lower prevalence of sterility in China are not yet well understood, but may be related to Chinese culture, which is strongly pro-natalist, and actively encourages childless women to seek fertility treatment. However, the aim of the current research was not to estimate the prevalence of women’s sterility in China, but to identify differences in the levels of sterility between birth cohorts, and between urban and rural populations. Thus, unless the so far unexplained ‘Chinese factor’ had a differential impact in urban and rural areas or across birth cohorts, which is considered most unlikely, it will have little influence on either the analysis or the central findings.
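As a consistency check on Table 2, the ‘Overall’ prevalence can be recovered as the sample-size-weighted mean of the cohort-specific prevalences. The short sketch below does this for the ‘no child by time of survey’ definition, using the figures printed in Table 2; the variable and function names are illustrative only.

```python
# (prevalence per 100 women, N) for each birth cohort, in the order
# 1957-58, 1959, 1960-61, 1962, 1963-64, 1965-66, taken from the
# "no child by time of survey" columns of Table 2.
TABLE2_NO_CHILD_BY_SURVEY = {
    1997: [(0.93, 864), (1.05, 285), (1.31, 613),
           (0.57, 528), (1.23, 1216), (2.05, 1073)],
    2001: [(0.97, 2262), (1.35, 814), (1.59, 1636),
           (0.78, 1407), (1.03, 3287), (1.00, 3110)],
}


def pooled_prevalence(rows):
    """N-weighted mean prevalence across cohorts (per 100 women)."""
    total_n = sum(n for _, n in rows)
    return sum(rate * n for rate, n in rows) / total_n


for year, rows in TABLE2_NO_CHILD_BY_SURVEY.items():
    print(year, round(pooled_prevalence(rows), 2))
```

The weighted means round to 1.29 for the 1997 survey and 1.08 for the 2001 survey, matching the ‘Overall’ row of Table 2.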

The key independent variables of interest in the analysis were women’s birth cohort and urban vs. rural residence. As described above, Chinese women born between 1957 and 1966, and surveyed in either 1997 or 2001, were classified into six birth cohorts: a 1957–58 cohort (pre-famine, no in utero exposure to the 1959–61 famine); a 1959 cohort (partial exposure in utero at late gestational age); a 1960–61 cohort (full exposure; whole time in utero spent exposed to famine conditions); a 1962 cohort (partial exposure in utero at early gestational age); and a 1963–64 and a 1965–66 cohort (both post-famine, no exposure). All six cohorts were included in an exploratory analysis (see Table 2) and a sensitivity test (see Table 5) but the main analysis was conducted using only the 1957–58, 1960–61, and 1963–64 cohorts (see Tables 3 and 4). ‘Urban–rural residence’ was taken to be a binary variable indicating where the respondent was living at the time they were surveyed; if they lived in a rural area they were given the code ‘1’, if in an urban area they were given the code ‘0’.

Although it is possible that some women had changed their place of residence between the time of their birth and the time they were interviewed, the impact of such changes on key findings about the effect of in utero famine exposure on fecundity is largely predictable and likely to be small in scale. In China the flow of migration was nearly always from rural to urban areas. Because women born in rural areas were most severely affected by the 1959–61 famine, the presence of a fraction of rural-born women in the urban sample drawn from the 1997 and 2001 surveys may mean that the effects of the famine were somewhat overestimated for the urban sample but not for the rural sample. If so, the difference-in-differences estimate understates the true famine effect and is therefore conservative.
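The direction of this bias can be made concrete with a small numerical sketch. It rests on simplifying assumptions that are not taken from the study: urban-born women experience no famine effect at all, rural-born women who later moved to cities carry the full rural effect, and there is no urban-to-rural migration. The function name and the figures in the usage note are hypothetical.

```python
def observed_did(true_rural_effect: float, rural_born_share_in_urban: float) -> float:
    """Difference-in-differences estimate when part of the urban sample
    was in fact born in rural areas.

    true_rural_effect: famine effect among rural-born women (percentage points)
    rural_born_share_in_urban: fraction of the urban sample born rural, in [0, 1]
    """
    if not 0.0 <= rural_born_share_in_urban <= 1.0:
        raise ValueError("share must lie in [0, 1]")
    # Migrants bring the rural effect into the urban sample, so the
    # measured urban effect is no longer zero.
    observed_urban_effect = rural_born_share_in_urban * true_rural_effect
    # The rural sample is unaffected under the no-return-migration assumption.
    return true_rural_effect - observed_urban_effect
```

With a hypothetical true effect of 1.0 percentage point and a quarter of the urban sample born in rural areas, `observed_did(1.0, 0.25)` returns 0.75, so the estimate is pulled towards zero rather than inflated.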

A number of control variables were included in the analysis. ‘Age at marriage’, which is a known correlate of both fecundability and sterility (Trussell and Wilson 1985; Menken et al. 1986), was included as a continuous variable. ‘Education’ was taken to be a four-category ordinal variable, derived from the six categories used in the surveys. The four categories were: no schooling, primary school, junior high school, and senior high school education and above. Women’s education is an important socio-economic indicator, widely used in population and health studies in less developed countries where data on other socio-economic indicators, such as household income or occupation, are unreliable or unavailable (Desai and Alva 1998). In addition, women’s education can be used to measure self-efficacy, as women with higher education are assumed to be more capable of utilizing modern health services and pursuing health-enhancing activities (Caldwell 1994; Ross and Wu 1995; Desai and Alva 1998; Song and Burgard 2011). Ethnicity was measured as a binary variable, members of the Han ethnic majority being coded ‘1’ and the non-Han ethnic minority being coded ‘0’.

Table 2 Estimated prevalence of sterility (per 100 women) among Chinese women when different definitions of sterility are used; 1997¹ and 2001² surveys

Column labels: ‘2 yrs’ = no child after 2 years of marriage; ‘7 yrs’ = no child after 7 years of marriage; ‘By survey’ = no child by time of survey.

              1997 Survey¹                        2001 Survey²
Birth cohort  2 yrs   7 yrs   By survey  N        2 yrs   7 yrs   By survey  N
1957–58       16.44   1.74    0.93       864      17.20   1.72    0.97       2,262
1959          15.44   1.40    1.05       285      15.97   1.60    1.35       814
1960–61       17.13   1.96    1.31       613      17.24   2.63    1.59       1,636
1962          15.91   1.52    0.57       528      16.92   1.21    0.78       1,407
1963–64       17.11   1.56    1.23       1,216    15.21   2.01    1.03       3,287
1965–66       16.12   2.33    2.05       1,073    14.24   1.67    1.00       3,110
Overall       16.51   1.81    1.29       4,579    15.84   1.84    1.08       12,516

Source: ¹1997 National Population and Reproductive Health Survey and ²2001 National Family Planning and Reproductive Health Survey.

In Table 3, data from the 1997 and 2001 surveys have been combined to produce descriptive statistics for women drawn from the three birth cohorts of the main study sample, cross-classified by place of residence.

Statistical models of fecundity: joint estimation of sterility and fecundability

When estimating women’s sterility from birth-history information, the problem of right censoring was encountered. Women in the cohorts being studied were aged 33–40 at the time of the 1997 survey and 37–44 when the 2001 survey was taken. At such ages, most Chinese women would have stopped giving birth, but would not yet have lost their biological capacity to reproduce (Lavely 1986). Data derived from the two surveys indicate whether each woman had given birth to a child by the time she was surveyed, but not whether she went on to have a birth thereafter. Older cohorts in the sample would therefore almost certainly show a lower level of childlessness than their younger counterparts, not because they were less likely to be sterile, but because they had had more time to bear a child. It is possible to truncate the period in which the older cohorts are being observed in order to achieve better comparability, but this strategy is not optimal because valuable information that can be used to achieve more precise effect estimates is discarded.

Hazard models can be used to handle right-

censored data, but conventional hazard models do

not make a clear distinction between the probability

of an event occurring and the timing of its occur-

rence. If a subset of the population under observa-

tion has never been ‘at risk’ of experiencing a

particular event, conventional hazard models are

no longer sufficient. For this study it was hypothe-

sized that within the sample population there was a

subgroup of women who were not ‘at risk’ of giving

birth because they were biologically sterile, and

in utero exposure to famine potentially increased a

woman’s risk of being in the ‘sterile’ subgroup. The

analytical task was to identify the effect of in utero

famine exposure on adult sterility by comparing the

relative prevalence of sterility across cohorts and

between urban and rural populations. To obtain

unbiased estimates of the prevalence of sterility, it

is crucial to control adequately for differential

exposure to the risk of having a child by imposing

certain parametric assumptions on the baseline

hazard function of ‘time from marriage to first

birth’. In the literature a parallel argument is often

encountered: that because sterile women are not at

risk of giving birth, their presence must be ade-

quately controlled for in order to obtain unbiased

estimates of fecundability (Heckman and Walker

1990; Wood 1994; Dunson and Zhou 2000).

Table 3 Combined descriptive statistics of selected Chinese women interviewed in the 1997¹ and 2001² surveys

                                                        Rural                                 Urban
                                              1957–58    1960–61    1963–64        1957–58    1960–61    1963–64
                                              (N=2,264)  (N=1,545)  (N=3,353)      (N=862)    (N=704)    (N=1,150)
For everybody in the sample
% who had never given birth                   0.75       1.55       0.84           1.51       1.42       1.83
% belonging to the non-Han ethnic minority    8.66       10.36      9.81           5.68       5.97       7.04
Education
  % with no schooling                         39.84      26.21      19.09          5.22       2.56       1.91
  % who attended primary school only          31.67      33.98      34.66          10.56      8.66       7.74
  % who attended junior high                  21.69      27.44      39.70          34.80      26.85      37.74
  % who attended senior high or above         6.80       12.36      6.56           49.42      61.93      52.61
Age at first marriage                         22.10      21.81      21.45          24.07      23.71      23.40
                                              (2.41)     (2.39)     (2.41)         (2.69)     (2.77)     (2.73)
For those who had given birth
Average age at first birth (years)            23.65      23.32      22.90          25.66      25.38      24.98
                                              (2.53)     (2.50)     (2.48)         (2.85)     (2.98)     (2.86)
Average length of first birth interval        1.57       1.52       1.47           1.63       1.71       1.63
  (years)                                     (1.14)     (1.12)     (1.01)         (1.25)     (1.42)     (1.47)

Note: Standard deviation given in parentheses.
Sources: 1 and 2 as for Table 2; the figures from the two surveys have been combined.



A mixture cure fraction model, also known as a ‘cure model’, a ‘long-term survivor model’, a ‘split-population model’, or a ‘mover-stayer’ model (Farewell 1982; Schmidt and Witte 1989; Maller and Zhou 1996) was used to jointly estimate both a probability model that predicted sterility status and a parametric hazard model that predicted the monthly probability of conception for those who were not sterile. Following Sposto (2002) and Lambert et al. (2010), the survival function of a cure model can be defined as

$$S(t) = \pi + (1 - \pi)\,(1 - F(t)) \qquad (1)$$

where $\pi$ represents the fraction of sterile women within a population and $F(t)$ denotes a statistical distribution function for the time from marriage to first birth. The hazard function for the model can be written as

$$h(t) = \frac{(1 - \pi)\, f(t)}{S(t)} \qquad (2)$$

where $f(t)$ is the density function of $F(t)$. Following Wood (1994), a log-normal distribution for $F(t)$ was chosen because ‘time from marriage to first birth’ is known to have an inverse J-shaped hazard function:

$$F(t) = \Phi\left(\log[\lambda t]^{\gamma}\right) \qquad (3)$$

where $\Phi(\cdot)$ denotes the standard normal distribution function.
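As a minimal sketch of equations (1) to (3), the functions below evaluate the survival and hazard functions of a log-normal mixture cure model. The sketch assumes that equation (3) is read as Φ(γ·log(λt)); the function names and the parameter values in the usage lines are hypothetical and are not estimates from this study.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: .cdf is Phi, .pdf is phi


def lognormal_cdf(t, lam, gamma):
    """F(t), assuming equation (3) means Phi(gamma * log(lam * t))."""
    return STD_NORMAL.cdf(gamma * math.log(lam * t))


def lognormal_pdf(t, lam, gamma):
    """f(t), the derivative of F(t) with respect to t."""
    return STD_NORMAL.pdf(gamma * math.log(lam * t)) * gamma / t


def cure_survival(t, pi, lam, gamma):
    """Equation (1): S(t) = pi + (1 - pi) * (1 - F(t))."""
    return pi + (1.0 - pi) * (1.0 - lognormal_cdf(t, lam, gamma))


def cure_hazard(t, pi, lam, gamma):
    """Equation (2): h(t) = (1 - pi) * f(t) / S(t)."""
    return (1.0 - pi) * lognormal_pdf(t, lam, gamma) / cure_survival(t, pi, lam, gamma)


# Hypothetical parameters: 1 per cent sterile, time measured in years.
PI, LAM, GAMMA = 0.01, 0.7, 1.5
for years in (0.5, 1.5, 5.0, 20.0):
    print(years, cure_survival(years, PI, LAM, GAMMA), cure_hazard(years, PI, LAM, GAMMA))
```

Because the sterile fraction never experiences the event, the survival curve levels off at π rather than falling to zero, which is the feature that distinguishes the cure model from a conventional hazard model.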

Covariates can be introduced into both the sterility probability model $\pi$ and the log-normal hazard model $h(t)$. For a binary dependent variable such as sterility, a logit link function has commonly been used to introduce covariates, and the exponentiated coefficients can be interpreted as odds ratios:

$$\log\frac{\pi(x)}{1 - \pi(x)} = \alpha + \beta X. \qquad (4)$$

Probit, complementary log–log, and even linear link functions can also be used. Using a linear link function with a binary dependent variable (commonly known as a ‘linear probability model’) has the important advantage that the interpretation of its coefficients, including those for the interaction terms, is similar to that of a linear regression model. This makes it much easier to construct the difference-in-differences effect estimates from the statistical results. While linear probability models have some weaknesses, such as the assumption that errors are normally distributed, these can be handled by bootstrapping the standard errors or confidence intervals (Mooney et al. 1993).
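The following short sketch illustrates why odds ratios from a logit link are harder to read than coefficients on the probability scale: an odds ratio multiplies the odds, not the probability. The function name, baseline probability, and odds ratio are hypothetical and chosen only for illustration.

```python
def apply_odds_ratio(baseline_probability, odds_ratio):
    """Return the probability implied by multiplying the baseline odds by odds_ratio."""
    odds = baseline_probability / (1.0 - baseline_probability) * odds_ratio
    return odds / (1.0 + odds)


# Hypothetical example: a baseline sterility probability of 1 per cent
# and an odds ratio of 2.
baseline = 0.01
shifted = apply_odds_ratio(baseline, 2.0)
print(shifted)             # probability after applying the odds ratio
print(shifted - baseline)  # change on the probability scale
```

For a rare outcome such as sterility the odds ratio is close to the ratio of probabilities, but the change in probability still depends on the baseline, which a linear probability coefficient does not.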

When using a log-normal hazard model, covariates can be introduced to predict either the scale parameter $\lambda$, the shape parameter $\gamma$, or both. Following Sposto (2002) and De Angelis et al. (1999), in the main analysis the shape parameter was held constant and only the scale parameter was allowed to vary with covariates:

$$\log(\lambda) = \alpha + \beta X + \mu \qquad (5)$$

in which $\mu$ follows a normal distribution with a mean value of zero and a fixed standard deviation. Because the log-normal model is not a member of the proportional hazard model family, its exponentiated coefficients are interpreted as time ratios, rather than hazard ratios. Thus, a one-unit increase in the covariate $X$ multiplies the time it takes for an event to occur, the ‘time-to-event’, by $\exp(\beta)$. If the time ratio of a covariate is lower than one, then that covariate accelerates the process under observation; if higher than one, then the covariate is acting to slow down the occurrence of the event of interest.
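The time-ratio interpretation can be sketched in a few lines. The helper names and the numbers used below (a baseline interval of 1.6 years and time ratios of 0.93 and 1.07) are illustrative assumptions, not results taken from the fitted models.

```python
import math


def time_ratio(beta):
    """Exponentiate an accelerated-failure-time coefficient to obtain a time ratio."""
    return math.exp(beta)


def scaled_time(baseline_time, ratio):
    """Time-to-event after a one-unit increase in a covariate with the given time ratio."""
    return baseline_time * ratio


def percent_change(ratio):
    """Percentage change in time-to-event implied by a time ratio."""
    return (ratio - 1.0) * 100.0


# A ratio below one shortens the interval; a ratio above one lengthens it.
for ratio in (0.93, 1.07):
    print(ratio, scaled_time(1.6, ratio), percent_change(ratio))
```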

More technical details of the mixture cure model, including the derivation of the model likelihood function and numerical maximization strategy, can be found in Sposto (2002) and Lambert et al. (2010). The current analysis employed user-contributed mixture analysis modules in Stata to implement the above procedures (Lambert 2007).
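For readers who want to see how right censoring enters the estimation, the sketch below writes out the standard log-likelihood of a mixture cure model: a woman with an observed first birth contributes (1 − π)·f(t), and a woman still childless at interview contributes S(t) = π + (1 − π)(1 − F(t)). This is not the Stata module's code; the function names, the reading of the log-normal distribution as Φ(γ·log(λt)), and the toy data are assumptions made for illustration.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def lognormal_cdf(t, lam, gamma):
    # Assumed parameterization: F(t) = Phi(gamma * log(lam * t)).
    return STD_NORMAL.cdf(gamma * math.log(lam * t))


def lognormal_pdf(t, lam, gamma):
    return STD_NORMAL.pdf(gamma * math.log(lam * t)) * gamma / t


def cure_loglik(times, events, pi, lam, gamma):
    """Log-likelihood of a mixture cure model without covariates.

    times  -- years from marriage to first birth, or to interview if censored
    events -- 1 if a first birth was observed, 0 if right-censored
    """
    total = 0.0
    for t, observed in zip(times, events):
        if observed:
            # Birth observed: the woman is not sterile and gave birth at time t.
            total += math.log((1.0 - pi) * lognormal_pdf(t, lam, gamma))
        else:
            # Censored: either sterile, or fecund but not yet a mother at time t.
            total += math.log(pi + (1.0 - pi) * (1.0 - lognormal_cdf(t, lam, gamma)))
    return total


# Toy data (hypothetical): two observed births and one woman childless after 15 years.
toy_times = [1.0, 2.0, 15.0]
toy_events = [1, 1, 0]
print(cure_loglik(toy_times, toy_events, pi=0.01, lam=0.7, gamma=1.5))
```

In an actual analysis this function would be maximized numerically over the parameters, with π and λ replaced by functions of covariates as in equations (4) and (5).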

Obtaining difference-in-differences estimates of the effect of in utero famine exposure on women’s sterility in adulthood

The most commonly used strategy for identifying the effect of famine exposure is to compare the outcome of interest between an observed famine cohort and non-famine cohorts, but, as Chen and Zhou (2007) and Huang et al. (2010a) have indicated, such an estimated difference between the two types of cohort may include both a famine effect and a residual cohort effect that is unrelated to famine. These authors suggest that a difference-in-differences strategy should be used to remove the residual cohort effect in order to obtain a more reliable estimate of the famine effect.

Let $C_i$ denote the birth cohort of the $i$th woman in the sample; $C_i = 1$ if she was exposed to the famine while in the womb, otherwise $C_i = 0$. Let $R_i$ denote the place of residence of the $i$th woman; $R_i = 1$ if she was living in a rural area, otherwise $R_i = 0$. A difference-in-differences estimate of the effect of in utero famine exposure on adult sterility can be constructed by taking the second difference between the cohort difference and the urban–rural difference in sterility risk:

$$\delta = \{P[C_i = 1, R_i = 1] - P[C_i = 1, R_i = 0]\} - \{P[C_i = 0, R_i = 1] - P[C_i = 0, R_i = 0]\}. \qquad (6)$$

This is equivalent to including $C_i$, $R_i$ and their interaction term in the logistic regression for sterility in equation (4):

$$\log\frac{\pi(x)}{1 - \pi(x)} = \alpha + \beta_1 C_i + \beta_2 R_i + \beta_3 C_i \times R_i. \qquad (7)$$

As Puhani (2008) has demonstrated, owing to the non-linear nature of the logit link used above, the difference-in-differences estimate of the famine effect is a monotonic nonlinear transformation of $\beta_3$, the statistical significance of which can be assessed using numerical methods. The interpretation can be greatly simplified by estimating an equivalent linear probability model:

$$\pi(x) = \alpha + \beta_1 C_i + \beta_2 R_i + \beta_3 C_i \times R_i \qquad (8)$$

in which $\beta_3$ represents the difference-in-differences estimate of the famine effect, the statistical significance and confidence intervals of which can be assessed in the same manner as those of a linear regression model.
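The second difference in equation (6) can be illustrated with the raw percentages of women who had never given birth reported in Table 3, taking the 1960–61 cohort as exposed and the 1963–64 cohort as unexposed. This is an unadjusted, descriptive calculation offered only as a sketch; it ignores right censoring and covariates and is not the model-based estimate. The function names and the coefficient values passed to the linear and logit helpers are hypothetical.

```python
import math


def did(p_famine_rural, p_famine_urban, p_post_rural, p_post_urban):
    """Equation (6): (rural - urban | famine) minus (rural - urban | non-famine)."""
    return (p_famine_rural - p_famine_urban) - (p_post_rural - p_post_urban)


def did_from_linear(alpha, b1, b2, b3):
    """Under the linear probability model, the four cell means give back b3."""
    return did(alpha + b1 + b2 + b3, alpha + b1, alpha + b2, alpha)


def expit(z):
    return 1.0 / (1.0 + math.exp(-z))


def did_from_logit(alpha, b1, b2, b3):
    """Under the logit model, the same contrast is a nonlinear function of all four terms."""
    return did(expit(alpha + b1 + b2 + b3), expit(alpha + b1), expit(alpha + b2), expit(alpha))


# Raw percentages never having given birth, from Table 3 (per 100 women).
raw_did = did(p_famine_rural=1.55, p_famine_urban=1.42, p_post_rural=0.84, p_post_urban=1.83)
print(raw_did)

# Hypothetical coefficients, only to contrast the two link functions.
print(did_from_linear(0.018, -0.004, -0.010, 0.011))
print(did_from_logit(-4.0, -0.3, -0.8, 1.0))
```

The linear helper returns the interaction coefficient itself, which is why the linear probability specification makes the difference-in-differences estimate easy to read off.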

Analysis

The analysis was conducted in three steps. First, descriptive statistics of the study sample were calculated. Then a series of cure fraction models, which jointly estimated sterility and fecundability, were estimated and compared. Finally, the difference-in-differences estimates of the famine effect on fecundity were constructed based on the best-fit model.

Descriptive analysis

Figure 2 shows the observed pattern of women’s sterility, defined as having no child between first marriage and the survey interview, for single-year birth cohorts in both rural and urban areas of China. For rural women, the impact of in utero famine exposure on sterility is clear. The proportion of sterile rural women rose between the 1958 and the 1959 cohort before reaching its peak level in the 1960 cohort. The proportion of sterile women declined in the 1961 birth cohort and fell to its lowest level in the 1962 cohort. Although both the 1959 and the 1962 rural cohorts were exposed to famine conditions for part of their time in the womb, they show dramatically different patterns of sterility in their adult years. These patterns suggest that either exposure to famine during early gestation had a different effect from exposure later in gestation, something which has not been previously reported in the literature, or that random sampling errors were present. In contrast, it is more difficult to identify a cohort pattern in sterility amongst urban women owing to the considerable year-to-year fluctuations observed.

Did in utero famine exposure influence the fecundity of adult women differently in urban and rural areas?

Table 4 reports results from four mixture cure models that estimated both women’s sterility and fecundability. For each model, the output has been divided into two panels: Panel A shows the results of a logistic regression predicting sterility status, while controlling for right censoring; Panel B shows a log-normal accelerated-failure-time hazard model that estimated the fecundability of those who were not sterile. The model selection was based on AIC and BIC (Raftery 1995; Aitkin 1996), the two widely used measures of relative goodness of fit of statistical models. For both measures, a smaller value indicates a better fit to the data.

Figure 2 Trends in sterility amongst rural and urban Chinese women, by year of birth; 1951–67, as reported in the 1997 and 2001 surveys
[Line chart. Horizontal axis: birth cohort (1950 to 1968). Vertical axis: level of sterility (per hundred), 0 to 2.5. Series: urban women; rural women.]
Note: Sterility is defined as no childbirth between first marriage and the time of the survey.
Source: As for Table 2.

Table 4 Results from four joint sterility and fecundability models (mixture logit–log-normal cure fraction model) for women from the 1957–58, 1960–61, and 1963–64 birth cohorts interviewed in the 1997¹ and 2001² surveys

                                Model 1         Model 2         Model 3         Model 4
Panel A: sterility model (odds ratios)
Year of survey
  1997                          –               –               –               –
  2001                          1.01            1.01            0.87            0.87
                                [0.65, 1.57]    [0.65, 1.57]    [0.57, 1.34]    [0.57, 1.34]
Birth cohort
  1957–58                       0.88            0.89            0.68            0.68
                                [0.42, 1.84]    [0.43, 1.84]    [0.33, 1.39]    [0.33, 1.40]
  1960–61                       0.77            0.77            0.65            0.65
                                [0.34, 1.74]    [0.34, 1.74]    [0.29, 1.46]    [0.29, 1.45]
  1963–64                       –               –               –               –
Rural residence                 0.45*           0.45*           0.78            0.81
                                [0.24, 0.83]    [0.24, 0.83]    [0.40, 1.53]    [0.45, 1.49]
Interaction effect
  1957–58 × rural residence     1.14            1.14            1.17            1.20
                                [0.44, 2.99]    [0.44, 2.98]    [0.45, 3.03]    [0.47, 3.09]
  1960–61 × rural residence     2.72*           2.72*           2.78*           2.81*
                                [1.02, 7.31]    [1.02, 7.31]    [1.04, 7.40]    [1.06, 7.46]
Age at first marriage                                           1.34***         1.33***
                                                                [1.26, 1.42]    [1.26, 1.41]
Education
  No schooling                                                  1.24
                                                                [0.67, 2.29]
  Primary school                                                –               –
  Junior high school                                            1.09
                                                                [0.63, 1.88]
  Senior high school and above                                  0.98
                                                                [0.51, 1.88]
Han ethnic majority                                             1.17
                                                                [0.56, 2.44]
Panel B: fecundability model (time ratios)
Year of survey
  1997                          –               –               –               –
  2001                          0.76***         0.76***         0.77***         0.77***
                                [0.75, 0.78]    [0.75, 0.78]    [0.75, 0.79]    [0.75, 0.79]
Birth cohort
  1957–58                       1.04***         1.02            1.05***         1.05***
                                [1.02, 1.06]    [0.97, 1.06]    [1.02, 1.07]    [1.02, 1.07]
  1960–61                       1.03*           1.05*           1.03*           1.03*
                                [1.01, 1.06]    [1.01, 1.10]    [1.01, 1.06]    [1.01, 1.06]
  1963–64                       –               –               –               –
Rural residence                 0.95***         0.95***         0.93***         0.93***
                                [0.93, 0.97]    [0.92, 0.98]    [0.91, 0.96]    [0.91, 0.96]
Interaction effect
  1957–58 × rural residence                     1.03
                                                [0.98, 1.08]
  1960–61 × rural residence                     0.97
                                                [0.92, 1.03]
Age at first marriage                                           0.98***         0.98***
                                                                [0.98, 0.98]    [0.98, 0.98]
Education
  No schooling                                                  1.04**          1.04**
                                                                [1.01, 1.07]    [1.01, 1.07]
  Primary school                                                –               –
  Junior high school                                            0.99            0.99
                                                                [0.96, 1.01]    [0.96, 1.01]
  Senior high school and above                                  1.07***         1.07***
                                                                [1.04, 1.11]    [1.04, 1.11]

Model 1 included birth cohort, urban–rural residence, and year of survey as variables in both the sterility and fecundability equations and the interaction between urban–rural residence and birth cohort in the sterility equation. Model 2 added interaction terms between birth cohort and urban–rural residence to the fecundability equation. Although AIC does not show a strong preference between the two models, BIC clearly suggests that Model 1 fits the data better than Model 2, indicating that in utero famine exposure status affected sterility amongst women in rural areas differently from those in urban areas, but that there was no differential effect on fecundability. The inclusion of additional control variables such as age at marriage, level of education, and ethnicity in Model 3 resulted in an even better fit to the data, as suggested by both AIC and BIC. Careful inspection of Model 3 reveals that the effects of education and ethnicity are significant in the fecundability equation (Panel B) but not in the sterility equation (Panel A). By excluding education and ethnicity from the sterility equation in Model 4, the fit of the model was further improved, according to both the AIC and BIC, making Model 4 the best-fit model.
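The model comparison described above can be retraced from the AIC and BIC values reported in Table 4. The snippet below is only a bookkeeping sketch: the dictionary layout and variable names are illustrative, and the numbers are copied from the table rather than recomputed from the data.

```python
# (AIC, BIC) for each model, as reported in Table 4.
fit = {
    "Model 1": (68733.8, 68827.3),
    "Model 2": (68733.6, 68841.5),
    "Model 3": (68502.9, 68668.5),
    "Model 4": (68495.7, 68632.4),
}

# For both criteria, a smaller value indicates a better fit.
best_by_aic = min(fit, key=lambda model: fit[model][0])
best_by_bic = min(fit, key=lambda model: fit[model][1])

# Model 1 versus Model 2: AIC is almost indifferent, BIC penalizes the extra terms.
aic_gap_1_vs_2 = round(fit["Model 1"][0] - fit["Model 2"][0], 1)
bic_gap_1_vs_2 = round(fit["Model 1"][1] - fit["Model 2"][1], 1)

print(best_by_aic, best_by_bic, aic_gap_1_vs_2, bic_gap_1_vs_2)
```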

The foregoing model comparison exercise suggests that exposure to the 1959–61 famine while in the womb had a differential influence on the sterility, but not the fecundability, of adult women in urban and rural areas of China.

The biological nature of the relationship between in utero famine exposure and sterility in adulthood

In the best-fit model, Model 4, neither education nor ethnicity has a statistically significant effect on sterility, and statistically significant urban–rural differences in sterility are present only in the famine cohorts and not in the non-famine cohorts, because women in rural areas were exposed to much more severe famine-induced acute malnutrition than women in urban areas.

A key assumption underlying this research was that the childlessness observed among the Chinese women in the two survey populations was caused by biology, not by choice. Such an assumption typically does not hold for late twentieth or twenty-first-century populations because of their widespread use of highly effective contraceptive measures. However, it is possible to identify particular populations, such as the Hutterites (Larsen and Vaupel 1993) or the Amish (Wood et al. 1994), who have very low levels of contraceptive use and voluntary childlessness, and it is possible, as previously discussed, that because of their unique cultural heritage the Chinese population, even in recent years, may resemble the Amish and the Hutterite populations in that all couples desire to have at least one child. It is crucial, when using observed childlessness as an indicator of sterility, that such a universal desire can be assumed amongst the study population. Were this assumption not valid, and some of the childlessness observed was the result of choice, the prevalence of childlessness would be influenced by socio-economic factors, with a clear relationship being discernible between the relevant socio-economic factors and the level of observed childlessness (Bloom and Pebley 1982). The fact that no such relationship is present in the four models reported in Table 4 suggests that the assumption that the observed levels of childlessness were the result of biology, rather than of choice, was a valid one.

The effect of ‘age at marriage’ on sterility warrants further discussion. Inclusion of age at first marriage in Models 3 and 4 shows a strong and positive effect on sterility (Panel A). However, one should not jump to the conclusion, based on the statistically significant coefficients of ‘age at marriage’, that late marriage caused an increase in the risk of sterility. As previously discussed, the model relationship could have been produced by several different mechanisms and additional research is needed to reach a more solid understanding of the factors at play. The inclusion of age at marriage in Models 3 and 4 changed the influence of the ‘rural residence’ variable from statistically significant in Models 1 and 2 to non-significant in Models 3 and 4, suggesting that the observed urban–rural difference in the level of sterility was largely attributable to the difference in the timing of marriage between urban and rural areas (see Table 3). For the purposes of the present study, it should be noted that controlling for ‘age at marriage’ has little impact on the key coefficients.

Table 4 (Continued)

                                Model 1         Model 2         Model 3         Model 4
Han ethnic majority                                             0.91***         0.01***
                                                                [0.88, 0.94]    [0.99, 0.94]
AIC                             68,733.8        68,733.6        68,502.9        68,495.7
BIC                             68,827.3        68,841.5        68,668.5        68,632.4
N                               9,878

*p < 0.05, **p < 0.01, ***p < 0.001.
Note: 95 per cent confidence intervals in square brackets.
Source: 1 and 2 as for Table 2.

Fecundability was influenced by both social and biological factors

In contrast to the sterility results, variables representing educational attainment, ethnicity, place of residence, birth cohort, and age at marriage all demonstrate significant effects on fecundability (Panel B), supporting the claim that fecundability has both biological and behavioural components. The frequency of sexual intercourse is a particularly important behavioural factor (Weinstein and Stark 1994) but, because it is virtually impossible to control for the frequency of intercourse in social surveys, some of its effect on fecundability is attributed to more general socio-economic factors. The estimated cohort difference in fecundability should not, therefore, be interpreted as a result of famine exposure.

Model 4 shows that a secular trend toward a shorter interval between marriage and first birth developed in both the urban and rural populations. This pattern is consistent with other studies of China (Wang and Yang 1996; Hong 2006) and is likely to have been driven by the changes in marriage patterns, family structure, and state family planning policy taking place over the second half of the twentieth century.

Constructing the difference-in-differences estimate of the in utero famine effect on women’s sterility

Although it is possible to extract the difference-in-differences estimate of the in utero famine effect on women’s sterility in adulthood directly from the nonlinear probability models reported in Table 4 (Ai and Norton 2003; Zelner 2009), linear probability models provide much more straightforward alternatives. Table 5 reports such difference-in-differences estimates obtained from the best-fit linear probability model, as well as their 95 per cent confidence intervals.
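Since the confidence intervals in Table 5 were obtained by bootstrapping, a percentile bootstrap for a difference-in-differences of simple proportions is sketched below. This is a simplified stand-in and not the procedure applied to the joint cure model: it resamples women within each cohort-by-residence cell (a stratified design choice made here for simplicity), and the cell sizes and counts are synthetic.

```python
import random


def share(cell):
    """Proportion of 1s (sterile) in a list of 0/1 outcomes."""
    return sum(cell) / len(cell)


def did_estimate(cells):
    """Difference-in-differences of sterile shares across the four cells."""
    return ((share(cells["famine_rural"]) - share(cells["famine_urban"]))
            - (share(cells["post_rural"]) - share(cells["post_urban"])))


def bootstrap_ci(cells, n_boot=500, level=0.95, seed=0):
    """Percentile bootstrap interval, resampling with replacement within each cell."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_boot):
        resampled = {name: rng.choices(cell, k=len(cell)) for name, cell in cells.items()}
        draws.append(did_estimate(resampled))
    draws.sort()
    lower = draws[int((1.0 - level) / 2.0 * n_boot)]
    upper = draws[int((1.0 + level) / 2.0 * n_boot) - 1]
    return lower, upper


def make_cell(n_women, n_sterile):
    """Synthetic cell: n_sterile ones followed by zeros."""
    return [1] * n_sterile + [0] * (n_women - n_sterile)


# Synthetic data: 1,000 women per cell with hypothetical numbers of sterile women.
cells = {
    "famine_rural": make_cell(1000, 15),
    "famine_urban": make_cell(1000, 14),
    "post_rural": make_cell(1000, 9),
    "post_urban": make_cell(1000, 18),
}
point = did_estimate(cells)
ci_low, ci_high = bootstrap_ci(cells)
print(point, ci_low, ci_high)
```

With an outcome this rare, the interval is wide relative to the point estimate, which is consistent with the wide intervals reported for most cohorts.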

To offer a more complete picture of the relationship between in utero famine exposure and sterility in adulthood, the new model included the 1959, 1962, and 1965–66 birth cohorts in addition to the three birth cohorts represented in Table 4. Both the 1959 and 1962 birth cohorts experienced partial exposure to the famine while in the womb, but the 1965–66 cohort, like the 1963–64 cohort, was conceived after the famine was over. As Table 5 shows, four of the five estimated difference-in-differences coefficients are positive and only one of them (1962 × rural residence) is negative. Among the four positive coefficients, three have a 95 per cent confidence interval including zero and only one (1960–61 × rural residence) has a 95 per cent confidence interval excluding zero. The only negative coefficient also has a 95 per cent confidence interval including zero. Based on these results it was concluded that, compared to their counterparts who were not exposed to the famine, female foetuses that spent the whole of their period of gestation under famine conditions faced an increased risk of experiencing sterility as an adult of approximately 0.011 (i.e., an increase of 1.1 sterile cases per 100 women), an effect which is statistically significant. As previously discussed, the presence of some rural-born women in the urban sample means that the difference-in-differences estimate reported in Table 5 is likely to be biased downward. The true effect of in utero famine exposure on sterility may, therefore, be greater than 0.011. It was also concluded that being exposed to famine in early childhood, or for only part of the 9 months of gestation, did not have a statistically significant effect on sterility in adulthood.
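The reading of Table 5 given above can be checked mechanically. The sketch below copies the estimates and bootstrap intervals from Table 5 (the 1963–64 reference cohort has no estimate and is omitted), flags intervals that exclude zero, and converts the 1960–61 estimate to cases per 100 women; the variable names are illustrative.

```python
# Cohort: (estimate, lower bound, upper bound), copied from Table 5.
table5 = {
    "1957-58": (0.0013, -0.0088, 0.0114),
    "1959": (0.0072, -0.0072, 0.0247),
    "1960-61": (0.0109, 0.0006, 0.0248),
    "1962": (-0.0055, -0.0198, 0.0073),
    "1965-66": (0.0040, -0.0065, 0.0175),
}

# An interval that excludes zero lies entirely above or entirely below it.
excludes_zero = [cohort for cohort, (_, low, high) in table5.items() if low > 0 or high < 0]

positive = [cohort for cohort, (estimate, _, _) in table5.items() if estimate > 0]
negative = [cohort for cohort, (estimate, _, _) in table5.items() if estimate < 0]

# Probability-scale effect expressed as additional sterile women per 100.
per_100_women = round(table5["1960-61"][0] * 100, 2)

print(excludes_zero, positive, negative, per_100_women)
```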

The fact that the interaction term between the 1965–66 cohort and rural residence is not significant increases confidence in the difference-in-differences results. It provides additional evidence that in the absence of famine conditions there was no real urban–rural difference in women’s sterility.

Discussion

Using data from two large, nationally representative surveys conducted in China in 1997 and 2001, this study has shown that in utero exposure to the 1959–61 famine had a permanently damaging effect on women’s fecundity. More specifically, exposure to the famine while in the womb increased the risk of women being sterile by 1.1 percentage points. Given that primary and permanent sterility is a rare phenomenon with an overall population prevalence only slightly higher than 1 per cent in China (Liu et al. 2004), this is a substantial and important effect.

The unique strength of the present study comes from the use of data from two large surveys of a population in which the desire for at least one child is universal (Scharping 2003), in combination with the use of a difference-in-differences identification strategy and mixture cure models, which can simultaneously model sterility and fecundability. The significance of the findings is two-fold. First, by showing the presence of a permanently damaging effect of in utero exposure to famine on fecundity, an important gap in our understanding of the developmental origins of health and disease has been bridged. Secondly, the results demonstrate the relevance and usefulness of the developmental origins framework when seeking a better understanding of population phenomena.

It may be pertinent to ask in the context of the 1959–61 famine whether, and to what extent, the results of the analyses may have been influenced by differential selection through mortality, fertility, or both, since the members of the birth cohorts from the famine years who survived to be interviewed in 1997 or 2001 constituted only a fraction of the original cohorts. Selection effects in the context of famine have been shown to be important in shaping the subsequently observed cohort patterns of child and adolescent mortality and of schizophrenia (Song et al. 2009; Huang et al. 2010b; Song 2010). The results shown above indicate that the famine cohort had a higher sterility risk than both the pre-famine and the post-famine cohorts. This suggests two possibilities. The first is that the estimated difference-in-differences effect of in utero famine exposure represents the lower bound of the true effect because the famine survivors were more likely to have been genetically well endowed and therefore to have had higher-than-average reproductive performance over their adult years. The second possibility is that the estimated difference-in-differences coefficient represents the true effect of in utero exposure to famine, in which case the factors that influenced a woman’s reproductive function later in life were unrelated to those determining her chance of surviving the famine. In either case, the key finding that in utero famine exposure increases the likelihood of sterility in adulthood remains valid.

Sterility, as defined in the present study, is a rare phenomenon within a population. Despite the statistically significant coefficients reported in Tables 4 and 5, some reservations remain concerning the substantive importance of such findings. After all, a famine-induced increase in sterility of 1.1 percentage points is unlikely to change the population dynamics in China significantly, so why are the findings important?

Menken and Larsen (1994) provided the best answer to this question. They argued that understanding the potential impact on population dynamics is the fourth most important reason to study sterility. Their three most important reasons were: the need to understand the effects of sterility on the lives of affected individuals; the need to estimate the prevalence of sterility within populations so that appropriate public health intervention can be implemented; and the need to understand the risk factors and causes with which sterility is associated. The findings of the present study shed light on all four areas of enquiry.

In many societies sterility and the resulting involuntary childlessness cause women to be stigmatized, resulting in a great deal of pain and harm (Miall 1985; Whiteford and Gonzalez 1995). Because sterility does not have a clearly defined aetiology, apart from those cases resulting from sexually transmitted diseases, it is often viewed as the woman’s fault, a punishment for her sins and wrongdoings. The stigma is particularly great in societies with strong pro-natalist cultures, such as China. The present research has demonstrated for the first time that of the 15 million women born in the rural areas of China in 1960 and 1961, 110,000 were rendered sterile by exposure to the 1959–61 famine while in the womb. These 110,000 rural women have lived their lives in the shadow of the famine. The effects will follow them into old age because the one-child policy, in combination with the predominantly family-based support system in rural China, means that it will become increasingly difficult to obtain adequate old-age support even for those who have children (Wang 2011) and virtually impossible for those who do not.

Table 5 Difference-in-differences estimates of the effect of in utero exposure to the 1959–61 famine on the risk of sterility in adulthood: results from the joint linear probability–log-normal cure fraction model using data combined from the 1997¹ and 2001² surveys

Cohort × place of residence     Effect on the probability of sterility    95 per cent confidence intervals
1957–58 × rural residence       0.0013                                    [−0.0088, 0.0114]
1959 × rural residence          0.0072                                    [−0.0072, 0.0247]
1960–61 × rural residence       0.0109                                    [0.0006, 0.0248]
1962 × rural residence          −0.0055                                   [−0.0198, 0.0073]
1963–64 × rural residence       –                                         –
1965–66 × rural residence       0.0040                                    [−0.0065, 0.0175]
Observations                    17,095

Note: The 95 per cent confidence intervals were calculated using the bootstrap method.
Source: 1 and 2 as for Table 2.

It should be acknowledged that this study has a number of limitations and weaknesses. One is that the lack of individual-level measures of famine-induced malnutrition makes it difficult to attribute the estimated effects directly to the causal influence of in utero malnutrition. After all, acute malnutrition is not the only adverse event that occurs during famine; infectious disease and heightened stress may have similar effects. While infectious disease is not thought to have been widespread during or after China’s 1959–61 famine because the state closely monitored its incidence even during the famine (Dikotter 2010), the lack of food would certainly have caused heightened anxiety and stress, but there are no direct measures of these factors. It can therefore be said only that in utero famine exposure increased the risk of sterility in women; the extent to which this was a result of malnutrition of the foetus in the womb, or of heightened maternal stress, could not be determined. Another piece of useful information that is missing from the data is information on women’s birth weight and length. Such measures can be used as individual-level proxies for the famine-induced malnutrition (as opposed to birth cohort and urban–rural residence used in this study) to obtain more fine-grained results. To the best of my knowledge, none of the existing data sources have the above-mentioned information and new data collection efforts are necessary.

Notes

1 Shige Song is at Queens College and CUNY Institute for Demographic Research, The City University of New York, 65-30 Kissena Blvd., Queen’s, NY 11367, USA. E-mail: [email protected]

2 An earlier version of this paper was presented at the 2010 annual meeting of the Population Association of America.

References

Ai, C. and E. C. Norton. 2003. Interaction terms in logit and probit models, Economics Letters 80(1): 123–129.

Aitkin, M. 1996. A general maximum likelihood analysis of overdispersion in generalized linear models, Statistics and Computing 6(3): 251–262.

Ashton, B., K. Hill, A. Piazza, and R. Zeitz. 1984. Famine in China, 1958–61, Population and Development Review 10(4): 613–645.

Bloom, D. E. and A. R. Pebley. 1982. Voluntary childlessness: a review of the evidence and implications, Population Research and Policy Review 1(3): 203–224.

Bongaarts, J. 1980. Malnutrition and fecundity, Studies in Family Planning 11(12): 401–406.

Cai, Y. and F. Wang. 2005. Famine, social disruption, and involuntary fetal loss: evidence from Chinese survey data, Demography 42(2): 301–322.

Caldwell, J. C. 1994. How is greater maternal education translated into lower child mortality, Health Transition Review 4(2): 224–229.

Chen, J., Z. Xie, and H. Liu. 2007. Son preference, use of maternal health care, and infant mortality in rural China, 1989–2000, Population Studies 61(2): 161–183.

Chen, Y. and L. A. Zhou. 2007. The long-term health and economic consequences of the 1959–1961 famine in China, Journal of Health Economics 26(4): 659–681.

Choe, M. K. and N. O. Tsuya. 1991. Why do Chinese women practice contraception? The case of rural Jilin Province, Studies in Family Planning 22(1): 39–51.

Collins, J. A., W. Wrixon, L. B. Janes, and E. H. Wilson. 1983. Treatment-independent pregnancy among infertile couples, New England Journal of Medicine 309(20): 1201–1206.

De Angelis, R., R. Capocaccia, T. Hakulinen, B. Soderman, and A. Verdecchia. 1999. Mixture models for cancer survival analysis: application to population-based data with covariates, Statistics in Medicine 18(4): 441–454.

Den Bandt, M. L. 1980. Voluntary childlessness in the Netherlands, Journal of Family and Economic Issues 3(3): 329–349.

Desai, S. and S. Alva. 1998. Maternal education and child health: is there a strong causal relationship?, Demography 35(1): 71–81.

Dikotter, F. 2010. Mao’s Great Famine: The History of China’s Most Devastating Catastrophe, 1958–1962. New York: Walker.

Dunson, D. B. and H. Zhou. 2000. A Bayesian model for fecundability and sterility, Journal of the American Statistical Association 95(452): 1054–1062.

Elias, S. G., P. A. H. van Noord, P. H. M. Peeters, I. den Tonkelaar, and D. E. Grobbee. 2005. Childhood exposure to the 1944–1945 Dutch Famine and subsequent female reproductive function, Human Reproduction 20(9): 2483–2488.

Fang, K., Q. Deng, and E. Gao. 1993. Analysis of infertility rate among firstly married Chinese women during 1976–1985, Reproduction and Contraception 4(2): 148–155.

Farewell, V. T. 1982. The use of mixture models for the analysis of survival data with long-term survivors, Biometrics 38(4): 1041–1046.

Gardner, D. S., S. E. Ozanne, and K. D. Sinclair. 2009.

Effect of the early-life nutritional environment on

fecundity and fertility of mammals, Philosophical

Transactions of the Royal Society B: Biological Sciences

364(1534): 3419�3427.

Gluckman, P. D., M. A. Hanson, H. G. Spencer, and P.

Bateson. 2005. Environmental influences during devel-

opment and their later consequences for health and

disease: implications for the interpretation of empirical

studies, Proceedings of the Royal Society B: Biological

Sciences 272(1564): 671�677.

Heckman, J. J. and J. R. Walker. 1990. Estimating

fecundability from data on waiting times to first

conception, Journal of the American Statistical

Association 85(410): 283�294.

Hong, Y. 2006. Marital decision-making and the timing of

first birth in rural China before the 1990s, Population

Studies 60(3): 329�341.

Huang, C., Z. Li, K. M. Venkat Narayan, D. F. Williamson,

and R. Martorell. 2010a. Bigger babies born to women

survivors of the 1959�1961 Chinese famine: a puzzle

due to survival selection?, Journal of Developmental

Origins of Health and Disease 1(6): 412�418.

Huang, C., Z. Li, M. Wang, and R. Martorell. 2010b. Early

life exposure to the 1959�1961 Chinese famine has

long-term health consequences, Journal of Nutrition

140(10): 1874�1878.

Kallan, J. and J. R. Udry. 1986. The determinants of

effective fecundability based on the first birth interval,

Demography 23(1): 53�66.

Kung, J. K. and J. Y. Lin. 2003. The causes of China’s Great

Leap Famine, 1959�1961, Economic Development and

Cultural Change 52(1): 51�73.

Lambert, P. C. 2007. Modeling of the cure fraction in

survival studies, Stata Journal 7(3): 351�375.

Lambert, P. C., P. W. Dickman, C. L. Weston, and J. R.

Thompson. 2010. Estimating the cure fraction in

population-based cancer studies by using finite mixture

models, Journal of the Royal Statistical Society: Series C

(Applied Statistics) 59(1): 35�55.

Larsen, U. and J. Menken. 1989. Measuring sterility from

incomplete birth histories, Demography 26(2): 185�201.

Larsen, U. and J. W. Vaupel. 1993. Hutterite fecundability

by age and parity: strategies for frailty modeling of

event histories, Demography 30(1): 81�102.

Larsen, U. 2000. Primary and secondary infertility in sub-

Saharan Africa, International Journal of Epidemiology

29(2): 285�291.

Lavely, W. R. 1986. Age patterns of Chinese marital

fertility, 1950�1981, Demography 23(3): 419�434.

Leridon, H. 1981. Fertility and contraception in 12

developed countries, International Family Planning

Perspectives 7(2): 70�78.

Lin, J. Y. and D. T. Yang. 2000. Food availability,

entitlements and the Chinese famine of 1959�61, The

Economic Journal 110(460): 136�158.

Liu, J., U. Larsen, and G. Wyshak. 2004. Prevalence of

primary infertility in China: in-depth analysis of

infertility differentials in three minority province/

autonomous regions, Journal of Biosocial Science

37(1): 55�74.

Lumey, L. H. and A. D. Stein. 1997. In utero exposure to

famine and subsequent fertility: the Dutch Famine

Birth Cohort Study, American Journal of Public Health

87(12): 1962�1966.

Maller, R. A. and X. Zhou. 1996. Survival Analysis with

Long-Term Survivors. New York: Wiley.

Menken, J., J. Trussell, and U. Larsen. 1986. Age and

infertility, Science 233(4771): 1389�1394.

Menken, J. and U. Larsen. 1994. Estimating the incidence

and prevalence and analyzing the correlates of inferti-

lity and sterility, Annals of the New York Academy of

Sciences 709(1): 249�265.

Miall, C. E. 1985. The stigma of involuntary childlessness,

Social Problems 33(4): 268�282.

Mooney, C. Z., R. D. Duval, and R. Duval. 1993.

Bootstrapping: A Nonparametric Approach to Statistical

Inference. Newbury Park, CA: Sage.

Moors, H. G. 1978. The Netherlands Survey on Fertility

and Parenthood Motivation, 1975: a summary of

findings, in International Statistical Institute/Word Ferti-

lity Survey. Voorburg, the Netherlands.

Mu, R. and X. Zhang. 2010. Why does the Great Chinese

Famine affect the male and female survivors differ-

ently? Mortality selection versus son preference,

Economics & Human Biology 9(1): 92�105.

Painter, R. C., R. G. J. Westendorp, S. R. de Rooij, C.

Osmond, D. J. P. Barker, and T. J. Roseboom. 2008.

Increased reproductive success of women after prenatal

undernutrition, Human Reproduction 23(11): 2591�2595.

Peng, X. 1987. Demographic consequences of the Great

Leap Forward in China’s provinces, Population and

Development Review 13(4): 639�670.

Puhani, P. A. 2008. The treatment effect, the cross

difference, and the interaction term in nonlinear ‘differ-

ence-in-differences’ models, IZA Discussion Papers.

Raftery, A. E. 1995. Bayesian model selection in social

research, Sociological Methodology 25: 111�164.


Dow

nloa

ded

by [

Uni

vers

ity o

f C

olor

ado

at B

ould

er L

ibra

ries

] at

12:

16 1

9 D

ecem

ber

2014

Ross, C. E. and C. Wu. 1995. The links between education

and health, American Sociological Review 60(5): 719�745.

Scharping, T. 2003. Birth Control in China 1949�2000:

Population Policy and Demographic Development. New

York: Routledge.

Schmidt, P. and A. D. Witte. 1989. Predicting criminal

recidivism using ‘split population’ survival time models,

Journal of Econometrics 40(1): 141�159.

Short, S. E., L. Ma, and W. Yu. 2000. Birth planning and

sterilization in China, Population Studies 54(3): 279�291.

Song, S. 2004. Marriage Formation in Contemporary China.

Los Angeles: University of California, Los Angeles.

Song, S., W. Wang, and P. Hu. 2009. Famine, death, and

madness: schizophrenia in early adulthood after pre-

natal exposure to the Chinese Great Leap Forward

Famine, Social Science & Medicine 68(7): 1315�1321.

Song, S. 2010. Mortality consequences of the 1959�1961

Great Leap Forward Famine in China: debilitation,

selection, and mortality crossovers, Social Science &

Medicine 71(3): 551�558.

Song, S. and S. A. Burgard. 2011. Dynamics of inequality:

mother’s education and infant mortality in China,

1970�2001, Journal of Health and Social Behavior

52(3): 349�364.

Song, S. 2012. Does famine influence sex ratio at birth?

Evidence from the 1959�1961 Great Leap Forward

Famine in China, Proceedings of the Royal Society B:

Biological Sciences 279(1739): 2883�2890.

Sposto, R. 2002. Cure model analysis in cancer: an

application to data from the Children’s Cancer Group,

Statistics in Medicine 21(2): 293�312.

State Statistical Bureau. 1991. Statistical Yearbook of

China 1991. Beijing: State Statistical Press.

Stein, Z. 1975. Famine and Human Development: The

Dutch Hunger Winter of 1944�1945. New York: Oxford

University Press.

Trussell, J. and C. Wilson. 1985. Sterility in a population

with natural fertility, Population Studies 39(2): 269�286.

Walker, K. R. 1984. Food Grain Procurement and Con-

sumption in China. New York: Cambridge University

Press.

Wang, F. and Q. Yang. 1996. Age at marriage and the first

birth interval: the emerging change in sexual behavior

among young couples in China, Population and Devel-

opment Review 22(2): 299�320.

Wang, F. 2011. The future of a demographic overachiever:

long-term implications of the demographic transition in

China, Population and Development Review 37(s1):

173�190.

Weinstein, M., J. W. Wood, M. A. Stoto, and D. D.

Greenfield. 1990. Components of age-specific fecund-

ability, Population Studies 44(3): 447�467.

Weinstein, M. and M. Stark. 1994. Behavioral and

biological determinants of fecundability, Annals of the

New York Academy of Sciences*Paper Edition 709:

128�144.

Whiteford, L. M. and L. Gonzalez. 1995. Stigma: the

hidden burden of infertility, Social Science & Medicine

40(1): 27�36.

Wood, J. W. 1994. Dynamics of Human Reproduction:

Biology, Biometry, Demography. New York: Aldine.

Wood, J. W., D. J. Holman, A. I. Yashin, R. J. Peterson, M.

Weinstein, and M. C. Chang. 1994. A multistate model

of fecundability and sterility, Demography 31(3): 403�

426.

Wrigley, E. A. 1997. English Population History from

Family Reconstitution, 1580�1837. New York: Cam-

bridge University Press.

Zelner, B. A. 2009. Using simulation to interpret results

from logit, probit, and other nonlinear models, Strategic

Management Journal 30(12): 1335�1348.

Zhang, G. and Z. Zhao. 2006. Reexamining China’s

fertility puzzle: data collection and quality over the

last two decades, Population and Development Review

32(2): 293�321.

