June 14, 2017

Dr. Roeggla and the BMJ Manuscript Committee:

We would like to thank you, the committee, and the reviewers for the thoughtful and detailed review and consideration of our manuscript, Small-for-Gestational-Age: Estimates of Births and Attributable Neonatal Deaths in Low-Middle Income Countries. In response to the Committee's concerns, we have performed a major revision of the manuscript and re-conducted the complete analysis. We have now performed the analysis excluding the older datasets of concern, and have calculated attributable mortality risk using pooled relative risks for neonatal mortality derived with the Intergrowth standard. We have responded point-by-point to the feedback in the letter below, as well as in the revised manuscript in track changes. Because the earlier datasets were removed from the analysis, we did not have an adequate number of studies to generate estimates for post-neonatal infant mortality risk and have thus removed that section from the manuscript.

Please let us know if there are any concerns that we have not adequately addressed. We look forward to hearing from you again regarding this revised analysis and manuscript. We also would like to thank the reviewers, whose feedback has helped to substantially improve the manuscript.

Best regards,
Anne CC Lee
Naoko Kozuki
Joanne Katz
For the CHERG Preterm-SGA Working Group



The committee thought this was a potentially useful paper. We were therefore interested in the topic of your research. The following concerns were mentioned:

• Isn’t this in essence a very similar analysis to the first paper, now using the Intergrowth-21st ‘normal’ values to define SGA? Please explain novelty and importance in more detail.

Response:

This analysis provides the first global estimates of the burden of SGA births and attributable neonatal mortality using the new Intergrowth-21st standard. In 2009, when the working group was formed and our earlier analyses were conducted, our group selected the 1991 United States birth population (Alexander et al, Obstet Gynecol 1996 Feb; 87: 163) as the best available reference, given that it represented a national, multi-ethnic population, had large sample sizes at lower gestational ages, and was the most commonly cited reference in the literature. However, there are multiple limitations to using the U.S. 1991 birth population as a universal reference for global burden estimates. Among them, it describes how fetuses actually grew in a U.S. population in the presence of pregnancy morbidities, and does not serve as a prescriptive standard for optimal fetal growth. Global estimates of the burden of SGA using Intergrowth, the first international, multi-ethnic prescriptive fetal growth standard, are critical in order to more accurately quantify disease burden, track epidemiologic trends, and understand the impact of public health interventions targeting SGA. These are the first global burden estimates of SGA using this new standard, and also the first national estimates for the year 2012, rather than 2010 as in our prior work. Furthermore, we provide estimates of the contribution of SGA to neonatal mortality for the year 2012. Understanding the contribution of SGA as a risk factor for mortality is critical to targeting public health interventions for primary and secondary prevention and to tracking our progress.

• Please explain the provenance of the data in more detail.

Response:

The original datasets contributing to this analysis were identified as part of a larger scope of work for the CHERG SGA-Preterm Working Group. The selection of these datasets is described in full in Katz et al, Lancet. 2013 Aug 3;382(9890):417-25. We have added more detail regarding the provenance of the data in Methods paragraph 1, which is reproduced below. More detail can be added upon request.

“This paper builds on prior analyses of the CHERG on the burden and risk of SGA and preterm birth in LMICs.14 24 25 An investigator group (CHERG SGA-Preterm Birth Working Group) was established in 2009 to conduct a set of analyses related to SGA-preterm birth. To identify datasets to include in the analyses, Medline, WHO regional databases (African Index Medicus, LILACS, EMRO), bibliographies of pulled articles, and grey literature were reviewed to identify LMIC cohorts that collected gestational age and birth weight information, and systematically recorded vital status until 28 days of life. We aimed to include datasets that were population-based, representing most deliveries from certain geographic or catchment areas, whether home- or facility-based. Datasets were excluded that: 1) had greater than 25% missing birth weight or gestational age, or loss to follow up; 2) measured weight only after the first week of life; 3) did not have systematic follow-up of vital status in the first month of life; 4) had inaccurate gestational age (determined in whole months, or deemed ‘inaccurate’ by study investigators). Full details on the literature review process and selection of datasets have been previously published by our research group.14 Principal investigators were requested to either individually conduct analyses, or to share their data with the working group. Fourteen LMIC birth cohorts were included in this new analysis. The original datasets were from prospective studies, including longitudinal birth cohorts (n=4) and pregnancy/neonatal intervention trials (n=10), with systematically collected data on gestational age, birth weight and neonatal outcomes. The original datasets and study descriptions are detailed in eTable 1. The majority of datasets were from 2000 onwards, with three studies from the 1990s.”

• Accuracy is a problem - the need for ethnic specific foetal growth charts has been recognised for a long time.

Response: We acknowledge and understand the viewpoint and ongoing debate regarding the need for ethnic-specific growth charts. The main objective of our work was to describe the global burden of SGA, the contribution of pregnancy risks, and the potential impact of preventive interventions addressing risk factors in order to optimize fetal growth globally. While genetic potential for growth may differ across populations (Buck et al, Am J Obstet Gynecol. 2015 Oct;213(4):449; Kiserud et al, PLoS Med 2017;14(1):e1002220), we believe that in low-middle income country (LMIC) settings, this contribution is relatively minor when accounting for the vast differences in rates of maternal undernutrition and pregnancy morbidity.1 The WHO expert committee on the use and interpretation of anthropometry recommended the use of a single international standard to assess child anthropometrics,1 and the WHO Child Growth Standards were subsequently generated on the prescriptive premise that all children should grow similarly in optimal environments. In a similar vein, our aim was to choose a single prescriptive, international standard to define the global burden of sub-optimal fetal growth.

The Intergrowth-21st standard was chosen for this analysis given that it is the first international, multi-ethnic, multi-country fetal/neonatal growth standard, including 8 geographically defined populations across continents. The Intergrowth standard is intended to capture optimal growth, whereas previous references are simply descriptive statistics of the selected population. The women who contributed births to the Intergrowth standard were pregnant mothers with well-documented health, including environmental and nutritional status, and with gestational age estimated by ultrasound. The Intergrowth group found that among populations “where maternal health and nutritional needs are met”, differences in fetal growth, maternal weight gain, neonatal and postnatal growth among preterm infants were very small using the 3 different comparison strategies used by WHO standards.2 Furthermore, the birth weight standards from Intergrowth at term match the initial components of the WHO child growth standards, demonstrating the biological and temporal robustness of both.3 Thus, the CHERG working group judged Intergrowth to be the most appropriate common standard on which to base SGA burden estimates. The choice of an unselected, national, ethnic population reference would describe fetal growth with current rates of exposure to undernutrition and pregnancy morbidities in LMIC, and would not allow us to target the optimization of fetal growth globally from a population health perspective. Using local references, rather than an international standard, for SGA would result in a predefined prevalence close to 10% for all countries independent of their health and nutritional condition, which fails to capture the deviation of populations from optimal fetal growth. The intention of our larger scope of work is to estimate the global burden of SGA and the potential role of interventions to prevent SGA, and the use of local ethnic references would not allow this estimation.

• The committee was worried about the datasets being quite old, considering there's been great changes in terms of nutrition and provision of prenatal healthcare over the last decades.

Response: We understand the committee’s concern regarding the timing of the cohorts. To address this concern, we have removed the cohorts that were from the 1980s, thus retaining only cohorts from after 1990. The vast majority (80% of the cohorts) were from after the year 2000. We have now re-analyzed the pooled data and SGA estimates without these older datasets.


Comments from Reviewers

Reviewer: 1

Comments: This article is potentially a very interesting and relevant study for health care workers worldwide. However, in the current form and with the methods used, I have strong reservations. It would require at least a major revision.

I. As SGA is used as a proxy of IUGR, the true IUGR cannot be determined. A fetal growth standard created in one obstetric population may be applied to another only when the two populations are similar to each other (Cheng, Y., et al. (2016). "Impact of replacing Chinese ethnicity-specific fetal biometry charts with the INTERGROWTH-21(st) standard." BJOG 123 Suppl 3: 48-55). In that case SGA will be able to more accurately detect the true number of IUGR infants. For that reason a universal fetal growth reference will never be able to estimate the true IUGR rates, and the number of misclassifications will increase as the studied population differs more from the population that the reference was based on. Several studies have already shown misclassifications based on the use of universal fetal growth references compared with ethnic specific references and customised fetal growth standards (Anderson, N. H., et al. (2016). "INTERGROWTH-21st vs customized birthweight standards for identification of perinatal mortality and morbidity." Am J Obstet Gynecol 214(4): 509 e501-507). The competing research group of WHO (Kiserud, T., et al. (2017). "The World Health Organization Fetal Growth Charts: A Multinational Longitudinal Study of Ultrasound Biometric Measurements and Estimated Fetal Weight." PLoS Med 14(1): e1002220) showed a large variation between countries in fetal growth and presented their universal fetal growth charts not as a standard, and also with strong reservations. Another study investigating ethnic differences also concluded that ethnic specific references should be used (Buck Louis, G. M., et al. (2015). "Racial/ethnic standards for fetal growth: the NICHD Fetal Growth Studies." Am J Obstet Gynecol 213(4): 449 e441-449 e441.)

Response: We acknowledge the reviewer’s and committee’s view on using ethnic-specific growth references. Please see our detailed response above to the committee’s concern regarding ethnic specific foetal growth charts. We have better addressed this debate in the Introduction (paragraphs 2-3) as well as in the Discussion of the manuscript (paragraph 3). We believe we have strengthened our reasoning for our choice of the single Intergrowth standard. We also acknowledge the limitations of using a single universal standard and the potential reasons for using ethnic or customized references, and have addressed these limitations in a full paragraph of the Discussion (paragraph 3). We have also included citations to several of the references provided by Reviewer 1 in the corresponding paragraphs in the Introduction and Discussion.

II. The results are presented as if still relevant now. The data used were however quite outdated (1982-2008). As much has changed since then the conclusions cannot be drawn.

Response: We understand the reviewer’s and committee’s concern regarding the timing of the cohorts. To address this concern we have now excluded the older datasets (prior to 1990) from the analysis. The vast majority (80%) of datasets included in the analysis were from after the year 2000.


III. The PAF was not calculated correctly. The PAF is now calculated as a proportion (all SGA neonatal deaths/all neonatal deaths). The PAF-equation = (Incidence neonatal deaths (tot) – Incidence neonatal deaths among non-SGA)/Incidence (tot). It is now unclear how many neonates were born SGA at all, and this influences the PAF considerably.

Response:

In this analysis, we have used the WHO Comparative Risk Assessment (CRA) methodology to estimate the population attributable fraction (PAF) for multiple-category exposures (http://www.who.int/healthinfo/global_burden_disease/metrics_paf/en/). In these methods, the potential impact fraction (PIF) is calculated and used to estimate the PAF for multiple-category exposures.

From the WHO CRA methods: “the PIF equation can be used to estimate the population attributable fraction (PAF), defined as the proportional reduction in death that would occur if exposure to the risk factor were reduced to the counterfactual exposure distribution.” Thus, calculation of the PIF is one method of calculating the PAF, using the equation below.

$$\mathrm{PIF} = \frac{\sum_{i=1}^{n} P_i \, RR_i \; - \; \sum_{i=1}^{n} P'_i \, RR_i}{\sum_{i=1}^{n} P_i \, RR_i}$$

Where
P_i = proportion of population at exposure level i, current exposure (i.e. current prevalence of SGA)
P'_i = proportion of population at exposure level i, counterfactual or ideal level of exposure (theoretical/ideal prevalence of SGA)
RR_i = the risk ratio at exposure level i (i.e. risk of neonatal mortality)
n = the number of exposure levels

Both the current and the “ideal” prevalence of SGA enter this equation and are large determinants of the PIF. The numbers of neonates born SGA are available in Table 1 of the paper. These methods are published in the book Comparative Quantification of Health Risks: Global and Regional Burden of Disease Attributable to Selected Major Risk Factors (WHO 2004), Chapter 25, “Estimating Attributable Burden of Disease from Exposure and Hazard Data”, and elsewhere (Zapata-Diomedi, Br J Sports Med. 2016 Mar 8). These methods are the basis for the Global Burden of Disease estimates in Lim et al. (Lancet. 2012 Dec 15;380(9859):2224-60).

We used the term PAF rather than PIF because it is the term more commonly used by WHO in CRA analyses (http://www.who.int/healthinfo/global_burden_disease/metrics_paf/en/).
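As an illustration only, the PIF calculation described in this response can be sketched in a few lines of Python. The sketch assumes the standard categorical form, PIF = (Σ P_i·RR_i − Σ P'_i·RR_i) / Σ P_i·RR_i, which is our reading of the WHO CRA method cited here. The function name, the two exposure levels, the prevalences, and the risk ratio are all hypothetical and are not taken from the manuscript.

```python
from math import isclose


def potential_impact_fraction(current, counterfactual, risk_ratios):
    """Categorical PIF: (sum(P_i*RR_i) - sum(P'_i*RR_i)) / sum(P_i*RR_i).

    current        -- P_i, current prevalence at each exposure level
    counterfactual -- P'_i, counterfactual ("ideal") prevalence at each level
    risk_ratios    -- RR_i, risk ratio at each level (reference level = 1.0)
    """
    if not (len(current) == len(counterfactual) == len(risk_ratios)):
        raise ValueError("all inputs must have one entry per exposure level")
    for name, dist in (("current", current), ("counterfactual", counterfactual)):
        if not isclose(sum(dist), 1.0, abs_tol=1e-9):
            raise ValueError(f"{name} prevalences must sum to 1")

    burden_now = sum(p * rr for p, rr in zip(current, risk_ratios))
    burden_ideal = sum(p * rr for p, rr in zip(counterfactual, risk_ratios))
    return (burden_now - burden_ideal) / burden_now


# Hypothetical inputs (NOT from the manuscript).
# Exposure levels: [non-SGA (reference), SGA]
current = [0.70, 0.30]         # assumed current prevalence
counterfactual = [0.90, 0.10]  # assumed "ideal" prevalence
risk_ratios = [1.0, 2.0]       # assumed neonatal mortality risk ratios

pif = potential_impact_fraction(current, counterfactual, risk_ratios)
print(f"PIF = {pif:.3f}")
```

Passing the prevalences as lists with one entry per exposure level keeps the same function usable for more than two categories, for example if SGA were further split by low birth weight status.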


We also agree with the reviewer’s definition of PAF and have included this definition in the Methods as well.
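For readers comparing the two definitions, a short derivation may help. It is offered here as an illustration and is not taken from the manuscript. It assumes a binary exposure (SGA vs. non-SGA) with prevalence $P$, an unadjusted risk ratio $RR$, and a counterfactual of zero exposure ($P' = 0$). The symbols $I_0$ (incidence among non-SGA) and $I_{tot}$ (incidence overall) are introduced only for this sketch.

$$
\begin{aligned}
I_{tot} &= P \cdot RR \cdot I_0 + (1 - P)\, I_0 = I_0 \left[ 1 + P\,(RR - 1) \right] \\[4pt]
\mathrm{PAF} &= \frac{I_{tot} - I_0}{I_{tot}} = \frac{P\,(RR - 1)}{1 + P\,(RR - 1)} \\[4pt]
\mathrm{PIF}\big|_{P' = 0} &= \frac{\left[ P \cdot RR + (1 - P) \right] - 1}{P \cdot RR + (1 - P)} = \frac{P\,(RR - 1)}{1 + P\,(RR - 1)}
\end{aligned}
$$

Under these assumptions the two expressions coincide. When the counterfactual prevalence is greater than zero, the numerator becomes $(P - P')(RR - 1)$, so for $RR > 1$ the PIF is smaller than this PAF.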

IV. The current conclusions drawn are interesting as a brain exercise but may be harmful when the figures especially those from South Asia are taken factually. Especially in populations that deviate from the Intergrowth-21st standard, ethnic specific standards will more accurately detect the true growth restricted.

Response: We feel that the Intergrowth standard is useful for describing the current burden of SGA, with an aspirational goal of reaching optimal pregnancy health and fetal growth for all babies worldwide. We recognize that there are different perspectives on this. As detailed above, we agree that there are genetic and ethnic differences in fetal growth; however, we feel that these play a smaller role in LMICs, given the much larger variation and burden of maternal and pregnancy morbidities there, such as maternal undernutrition, hypertensive disorders, and infections. Our remit was to understand the burden of SGA and the potential of preventive interventions to reduce SGA. We agree there may be some misclassification of babies using a universal standard, particularly for South Asia. However, we feel that the Intergrowth standard performs more accurately than the widely used U.S. 1991 birth population. There was likely substantial overestimation of SGA in South Asia in our prior estimates, and the estimates are much lower with the Intergrowth standard. Thus we feel that it is critical to conduct and share these new estimates. In the Discussion we have added the statement:

“Finally, the use of the Intergrowth standard may be questioned by those who prefer national or ethnic standards. If the baseline population differs substantially from the Intergrowth selected population, misclassification of SGA may occur.”

ABSTRACT

1. Page 5, Line 32: “the mortality…”. This seemed to refer to the previous sentence, but it referred to LMIC in general. Please start with “In LMIC,.. ”

Response: We have revised the wording of this phrase as recommended by the reviewer.

2. Page 5, line 40: “born too small in LMICs”. Although SGA is a risk factor it remains a descriptive ‘condition’, because of the heterogeneity in growth, depending on the reference used. “Born too small” is normative and holds a value judgment. I would rather use the descriptive term SGA. The use of the universal Intergrowth-21st standard, has likely led to many misclassifications, even though the new standard already classifies less children as SGA than the previously used reference.

Response: We have revised the wording of this phrase as recommended by the reviewer.

INTRODUCTION

3. Page 7, line 34: “The INTERGROWTH…”. Although the ‘standard’ offers a common reference, the universal use for all ethnic groups is debated. The precise definition of SGA is extremely important.

Response: We have acknowledged this view and debate in the 2nd paragraph of the Introduction, and addressed the reason for our choice of the Intergrowth standard.

4. In fact, the use of a universal reference such as Intergrowth-21st in clinical practice may even be harmful in generally smaller populations such as in South Asia. Many studies have shown that mortality and morbidity in South Asian neonates is discordant with birth size and concordant with ethnic specific references. (Kierans, W. J., et al. (2008). "Does one size fit all? The case for ethnic-specific standards of fetal growth." BMC. Pregnancy. Childbirth 8(1): 1.).

Response: We understand the reviewer’s concern regarding applying a universal reference, particularly in South Asia. We have provided our justification for using a universal prescriptive standard above. We feel that this standard is more appropriate than the U.S. reference used in our earlier estimates, which classified relatively more infants as SGA in South Asia. Using the U.S. 1991 reference, 16.3 million infants in South Asia were classified as SGA, compared to 12.6 million using the Intergrowth standard. We have calculated the mortality risks for SGA babies with the Intergrowth standard, so the mortality risks applied in the PIF/PAF calculations use the same standard.

5. The normative approach by using a universal reference (‘standard’) may lead to many babies with normal fat mass to be classified as SGA, and consequently to supplementary feeding practices with increases in fat as a result.

Response: We acknowledge the reviewer’s concern. Please see response above.

6. Page 8, line 10: “intrauterine programming…”. That is considering that SGA is properly determined. Classifying babies as SGA that actually have an appropriate weight may lead to overfeeding and consequently to overweight and associated diseases.

Response: We acknowledge the reviewer’s concern. Please see response above.
While misclassification is probable, we feel that use of the Intergrowth standard reduces the misclassification of SGA compared to the U.S. 1991 birth population reference.

7. Page 8, line 36: “we also…”. I would recommend at least using ethnic specific references (at least in the case of (South) Asians) as a comparison. The use of universal references is highly arbitrary. Using inappropriate references renders all your results useless and will be highly misleading.

Response: We have performed an estimate comparing the prevalence and number of SGA infants using a local South Asian reference (Bhatia, Indian Ped 1981; 19:647). We have included the following paragraph in the Discussion to address this issue.

“National or ethnic references are also often challenged by the fact that they are developed from the local population and are unselected, including all the pregnancies and pathology of the catchment area population. These describe local patterns of growth, but include pregnancies that are affected by undernutrition and morbidities such as infections or hypertension. Thus SGA simply defines the lowest (<10%) tail of the curve, but does not identify the many more newborns affected by poor growth. Use of local curves results in a predefined SGA prevalence close to 10%, irrespective of a population’s health and nutritional status. For example, using the Bhatia reference that was developed for single liveborn infants in India,53 the number of SGA infants estimated in South Asia for 2012 would be reduced to 3.77 million infants (10.5% of live births), compared to 12.6 million (34%) using the Intergrowth standard.”

8. Compare aims of: Kozuki, N., et al. (2015). "Comparison of US Birth Weight References and the International Fetal and Newborn Growth Consortium for the 21st…"

Response:

The Kozuki et al (2015) paper laid the groundwork for the current analysis. In that paper, we examined the attenuation in prevalence and mortality risk of SGA when comparing the U.S. 1991 reference and the Intergrowth standard. In the current paper, our aim is to provide the first global burden of disease estimates related to SGA, defined by the Intergrowth standard. To our knowledge, these are the first estimates of the numbers of SGA births in LMIC and the numbers of neonatal deaths attributable to SGA using this new standard.

METHODS

9. Page 9, line 5: “…1982 to 2008…”. As mentioned in my introduction, I have large reservations about the datasets used and the presentation of the results. These data do not represent the current populations in these countries adequately. Therefore, it is not justified to draw the conclusions as they are now presented.

Response: We appreciate the reviewer’s concerns. To address this, we excluded the older datasets (before 1990) in the revised analysis. We have conducted all analyses and burden of disease estimates excluding the older studies. Approximately 80% of the datasets included in the current analysis were from after the year 2000.

10. Page 9, line 45: “The Intergrowth study reported that among healthy pregnant women with no known risk factors, fetal growth was comparable across different populations around the globe”. This has received criticism (see my introduction paragraph). Although it is the first ‘standard’ or reference that is intended for universal use, the universal applicability has been debated (Anderson, Am J Obstet Gynecol 214). Customised fetal growth standards have repeatedly been shown to better predict perinatal problems. Also, the P10 of Intergrowth is considerably higher than in babies of South Asian origin in the Netherlands (Visser, G. H., et al. (2009). "New Dutch reference curves for birthweight by gestational age." Early Hum. Dev 85(12): 737-744.). Although I believe Intergrowth will lead to less misclassifications than the US reference, it’s still far from perfect.

Response: We appreciate and acknowledge this debate about the use of ethnic or even customized standards. As described above, our study aim was to determine global disease burden, which required a single prescriptive standard for growth, and Intergrowth was best suited for this purpose. We have attempted to better acknowledge the limitations of Intergrowth and the potential benefits of ethnic or customized charts in the Discussion, paragraphs 2-3.

11. Page 10, line 10: “Low birth weight…”. LBW is a rather crude measure, not taking any other factors into account, such as gestational age, sex, ethnicity, parity, mother’s weight/height, etc. As SGA takes several factors into account it is potentially more accurate. 2500g in one population does not mean the same in another. I believe the more inaccurate measures are being used the less precise the estimations are, rendering these completely arbitrary. Is the addition of a P2 or P3 not better for estimating severe growth restriction? So why was LBW used? What is the added value of the imprecise measure LBW?

Response: We acknowledge the reviewer’s concern. However, low birth weight was not used in this analysis as a single exposure, but rather to further stratify SGA infants into larger (>2500 g) vs. smaller (<2500 g) babies. This additional categorization was applied in an attempt to separate the higher-risk SGA infants (low birth weight) from the lower-risk, higher-weight SGA infants (who may be more likely to be misclassified). In the revised analysis, the non-low birth weight SGA infants did not carry increased neonatal mortality risk in Asia or Latin America.

12. Page 10, line 51: “We conducted…”. I do not understand why the RR were not based on Intergrowth 21-st standard. I believe this should have been done. I also don’t understand why mortality risk estimates do not differ between the used standard/reference, while the IG21 standard and US reference differ that much. Using RR based on US references may introduce considerable bias.

Response:

We agree with the reviewer on this very important point. In the revised manuscript, we have now redone the entire analysis and calculated neonatal mortality RRs for individual datasets using the Intergrowth-21st standard, and pooled RR estimates at the regional level using the Intergrowth RRs. The Intergrowth pooled RRs for neonatal mortality are used in the current PIF/PAF calculations.

RESULTS

13. The PAF was not calculated correctly. The PAF is now calculated as a proportion (all SGA neonatal deaths/all neonatal deaths). The PAF-equation = (Incidence neonatal deaths (tot) – Incidence neonatal deaths among non-SGA)/Incidence (tot). It is now unclear how many neonates were born SGA at all, and this influences the PAF considerably.

Response:

As per the response above:

In this analysis, we have used methods of the WHO Comparative Risk Assessment (CRA) Methodology to estimate population attributable fraction (PAF) for multiple category exposures (http://www.who.int/healthinfo/global_burden_disease/metrics_paf/en/.) In these methods, the Potential Impact Fraction (PIFs) is calculated and used to estimate the population attributable fractions (PAFs) for multiple category exposures.

From the WHO CRA methods “the PIF equation can be used to estimate the population attributable fraction (PAF), defined as the proportional reduction in death that would occur if exposure to the risk factor were reduced to the counterfactual exposure distribution.” Thus, calculation of the potential impact fraction (PIF) is one way of performing the calculation for PAF, using the below equation.

Where

Page 11: Small-for-Gestational-Age: Estimates of Births and ...€¦ · Attributable Neonatal Deaths in Low-Middle Income Countries. In response to the Committee concerns, we have performed

Pi = proportion of population at exposure level i, current exposure

P'i = proportion of population at exposure level i, counterfactual or ideal level of exposure

RR = the risk ratio at exposure level i

n = the number of exposure levels

Thus, the current and “ideal” prevalences of SGA both enter the equation and are large determinants of the PIF. The numbers of neonates born SGA are available in Table 1 of the paper. These methods are published in the book Comparative Quantification of Health Risks: Global and Regional Burden of Disease Attributable to Selected Major Risk Factors (WHO 2004), Chapter 25, “Estimating Attributable Burden of Disease from Exposure and Hazard Data”, and elsewhere (Zapata-Diomedi, Br J Sports Med. 2016 Mar 8). These methods are the basis for the Global Burden of Disease estimates in Lim et al. (Lancet. 2012 Dec 15;380(9859):2224-60).
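As an illustration only (this is not the code used for our analysis), the multiple-category calculation can be sketched as follows: the PIF is the sum over exposure levels of current proportion times RR, minus the same sum under the counterfactual distribution, divided by the first sum. The function name, the three exposure categories, and every prevalence and risk ratio below are hypothetical values chosen for the example, not estimates from the paper.

```python
from typing import Sequence


def potential_impact_fraction(
    current: Sequence[float],
    counterfactual: Sequence[float],
    rr: Sequence[float],
) -> float:
    """PIF = (sum(P_i*RR_i) - sum(P'_i*RR_i)) / sum(P_i*RR_i)."""
    if not (len(current) == len(counterfactual) == len(rr)):
        raise ValueError("all inputs need one entry per exposure level")
    for name, dist in (("current", current), ("counterfactual", counterfactual)):
        if abs(sum(dist) - 1.0) > 1e-9:
            raise ValueError(f"{name} proportions must sum to 1")
    # Average relative risk in the population under each exposure distribution.
    observed = sum(p * r for p, r in zip(current, rr))
    ideal = sum(p * r for p, r in zip(counterfactual, rr))
    return (observed - ideal) / observed


# Hypothetical exposure levels: not SGA (reference, RR = 1), term-SGA, preterm-SGA.
rr = [1.0, 2.0, 10.0]                # illustrative risk ratios only
current = [0.80, 0.18, 0.02]         # illustrative current distribution
counterfactual = [0.90, 0.09, 0.01]  # illustrative "ideal" distribution

pif = potential_impact_fraction(current, counterfactual, rr)
print(f"PIF = {pif:.3f}")
```

With these made-up inputs, halving the prevalence of both SGA categories would avert roughly 13% of deaths; the result moves with the assumed prevalences, which is why the current and counterfactual SGA distributions matter so much.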

We used the term PAF rather than PIF because it is the term more commonly used by the WHO in CRA analyses: http://www.who.int/healthinfo/global_burden_disease/metrics_paf/en/

We also agree with the reviewer’s definition of PAF and have included this definition as well in the methods.
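To make the link between the two definitions explicit, the following is a short worked derivation offered as a sketch, not a result taken from the manuscript. It assumes only two exposure levels (non-SGA as the reference with RR = 1, and SGA with prevalence p and risk ratio RR), a counterfactual in which no infant is SGA, and an RR that reflects a causal effect. The symbols p, I_0 (neonatal mortality among non-SGA infants) and I_tot (neonatal mortality overall) are introduced here for illustration.

```latex
\begin{aligned}
\mathrm{PIF} &= \frac{\bigl[(1-p)\cdot 1 + p\,RR\bigr] - 1}{(1-p)\cdot 1 + p\,RR}
              = \frac{p\,(RR-1)}{1 + p\,(RR-1)}, \\[4pt]
I_{\mathrm{tot}} &= I_0\,\bigl[(1-p) + p\,RR\bigr]
  \;\Longrightarrow\;
  \frac{I_{\mathrm{tot}} - I_0}{I_{\mathrm{tot}}}
  = \frac{p\,(RR-1)}{1 + p\,(RR-1)}.
\end{aligned}
```

Under these assumptions the PIF with a zero-exposure counterfactual and the reviewer's expression, (Incidence (tot) – Incidence among non-SGA)/Incidence (tot), give the same quantity.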

14. Figure 2 and figure 3 show absolute numbers. I would recommend to standardise these numbers or use proportions of the number of births. This would be less suggestive and simplify the interpretation.

Response: We appreciate this suggestion and have revised both figures accordingly. Figure 2 now shows the prevalence of SGA as a percentage of total live births, and Figure 3 shows the percentage of total neonatal deaths attributable to SGA.

15. eTable 2/3: To estimate the effect of a reduction in SGA on neonatal death the best way to do this is by calculating Potential Impact Fraction (PIF). The PIF estimates the proportion of neonatal deaths in the population that can be avoided by a reduction in the rate of the risk factor. In my opinion, the current calculations are invalid.

Response: The current calculations do in fact compute PIFs: the categorical PAF equation above is the PIF equation.

DISCUSSION

16. Page 14, line 51: Good that studies with different conclusions than Intergrowth were mentioned.

Response: Thank you. We have also incorporated discussion and citation of several of the manuscripts that you mentioned.

17. Page 15, line 12 “Intergrowth had very strict…”. The strict eligibility criteria are highly arbitrary and its use implicates that this population was highly selected and therefore biased. This results in fetal growth criteria that are “too perfect”, not representing the studied populations well enough, likely not even the more affluent part.

Response: We have acknowledged in the Discussion, paragraph 2, that Intergrowth was more selective.

18. Page 15, line 16 “The NICHD and…”. I don’t support the criticism. SGA still is a proxy for nutritional status and does not measure it directly. The thin-fat Indian baby is well recognised. Is such a baby growth restricted? And it is also evident that shorter mothers will have shorter and consequently lighter babies. Excluding such mothers from references narrows the eventual derived distribution (SD).

Response: We have removed this reference.

19. Page 15, line 25 “Intergrowth’s 10th percentile birth weight cut-off was generally 150-200g lower across gestation…”. But the cut-off is still 150 gram higher than found in affluent Asian Indian (Hindustani) babies in the Netherlands (Visser, G. H., et al. (2009). "New Dutch reference curves for birthweight by gestational age." Early Hum. Dev 85(12): 737-744).

Response: We have acknowledged this point in the Discussion, paragraphs 3 and 4, and address this specific study (Visser et al 2009) in the second sentence of Discussion paragraph 3 (below). We have also calculated the prevalence of SGA in one of our datasets with a local South Asian growth curve (Bhatia et al 1981), for comparison with the SGA estimate under the Intergrowth standard (Discussion paragraph 4), to further address the effects of curve choice on SGA prevalence estimates.

Discussion, paragraph 3: In the Netherlands, population birthweight reference curves were published by Visser et al, and showed that Dutch Hindustani babies were systematically of lower birthweight than other ethnic groups, up to 300 grams at certain weeks gestation.51

20. Page 16, line 29 “Some of the birth cohorts…1982”. I believe the use of “old databases” is a major limitation as much has changed since the United Nations Millennium Development Goals were generally adopted in 2000. The way the results are presented implies that the results are contemporary while they are not. The current results are therefore irrelevant.

Response: Based on these recommendations we have removed all cohorts before 1990 from the analysis and estimates. Now 80% of our cohorts are after the year 2000.

21. Page 17, line 19 “The primary prevention of IUGR in LMICs…”. I would reduce (summarise) this last section considerably as prevention was not this study’s objective. Also, the practical implications in this section cannot be drawn based on this study, as historical datasets were used. Also because of the use of universal fetal growth references that has many limitations and drawbacks that do not justify drawing conclusions that may impact the health of certain populations (i.e. South Asian) negatively. The named supplementation programmes may even lead to overfeeding in South Asia in the typical “thin-fat” Indian baby, eventually leading to a larger fat mass and early development of cardiometabolic disease.

Response: Thank you, we appreciate this suggestion. We have substantially reduced this paragraph.

Good luck! J.A. de Wilde

Additional Questions:
Please enter your name: J.A. de Wilde

_________________________________________________________________________

Reviewer: 2

Recommendation:

Comments: Overall, this is an important study that raises awareness of SGA and contributes to the overall understanding of the contribution of SGA to global neonatal mortality. The authors do a very nice job explaining the strengths and limitations of the analyses without overstating their results. A few considerations to clarify the study include the following:

Introduction - the authors make the case of Intergrowth as the optimal reference, yet the tables (i.e., Tables 1-3) also include the 1991 US reference standard. If Intergrowth is the standard, not sure adding the 1991 estimates are needed (and not sure what this adds, while potentially taking away from the main message?)

Response: Thank you for this suggestion. We have removed the US 1991 data from the main tables. Those data are now available in the Web Appendix, eTables 3 and 5.

Methods - the authors do not provide a justification for the date range for the studies that were selected for inclusion. It seems like the older studies (i.e., the Nestle funded study from the 1980's) could have substantial differences in methodology that would affect their data quality.

Response: In the original analysis, two cohorts were from the 1980s. Given the reviewers’ concerns, we have removed those studies from the current pooled analysis and included only studies after 1990. The majority (80%) of the studies were done post-2000.

Results - a large quantity of data are presented and comparisons made with various categories. Consider reducing what is the main findings for inclusion in the primary results (but the web-tables are nice for reference)

Response: We have moved the 1991 U.S. population reference estimates to the Web Appendix (eTables 3 and 5) so that the main results focus on the Intergrowth-based estimates.

Discussion - one of the challenges with SGA in LMIC seems to be its detection. While the authors recommend a number of prevention strategies, it is unclear whether they are suggesting that these be provided across all women or pregnant women at risk for SGA? This is especially true given the limited accurate GA dating across LMICs (as the authors note in the discussion).

Response: Thank you, this is a great point. Given your comments and those of reviewer 1, we decided to place less emphasis on primary prevention. The strategies are likely to differ substantially by context, and it is challenging to generalize recommendations. The difficulty of establishing gestational age in most LMICs makes this even more challenging. We have therefore focused somewhat more on the management of SGA and/or preterm infants.

Generally, it could be clarified further what knowing SGA adds to the preterm status.

Response: Being born SGA in addition to preterm substantially, and potentially synergistically, increases the risk of neonatal mortality. We have added a sentence to this effect in the discussion, paragraph 6.

Minor comments - Methods - unclear why Johns Hopkins was the IRB that deemed this to be exempt. No statement of women's consent is given.

Response: We have clarified this in the methods, paragraph 2. In the original studies, maternal consent was obtained with local research ethical approval. However, as this was a secondary data analysis of de-identified datasets, Johns Hopkins deemed the study exempt.

Using the large number of unconventional abbreviations (i.e., PAGA, PSGA, etc.) makes it a bit more tedious to follow the results.

Response: We have attempted to reduce the number of abbreviations or to spell out terms in full where possible. We have used the terms “Term-SGA” and “Preterm-SGA” instead of TSGA and PSGA, respectively, throughout the text.

Additional Questions:
Please enter your name: Elizabeth McClure
Job Title: Senior Research Epidemiologist
Institution: RTI International

__________________________________________________________________________

Reviewer: 3

Recommendation:

Comments:

* Originality - does the work add enough to what is already in the published literature? If so, what does it add? If not, please cite relevant references.

This work presents the first estimate of the burden of SGA in LMICs using the international INTERGROWTH-21st standards. It presents original information on the PAF based on an optimal maternal population.

* Importance of work to general readers - does this work matter to clinicians, patients, teachers, or policymakers? Is a general journal the right place for it?

This work is important to policy makers in global health. Similar to how stunting is used as an indicator of the nutrition and health status of children around the world. Similarly, an evidence-based definition of SGA can compare all babies around the world on an equal basis. This highlights regions where attention to maternal nutrition and health are most needed (i.e. South and South East Asia). With time, it is anticipated that this standardised indicator can be used to compare progress within and between countries and regions.

* Scientific reliability

Research Question - clearly defined and appropriately answered?

Yes the research question is clearly defined and answered

Overall design of study - adequate ?

Given the limitations in routinely collected data for indicators such as birthweight and gestational age at delivery, the authors have used an appropriate strategy. However it is important that they make it clear that these limitations of routine data and the efforts needed to improve the quality of the data.

Response: We have emphasized the limitations of the current datasets, and the data that are generally available in LMICs. In the limitations section of the discussion, we have emphasized the need to increase the quantity and quality of birth weight data, particularly in South Asia and Sub-Saharan Africa, where <50% of babies are weighed at birth. We have also emphasized the need to improve gestational age data in LMICs.

Participants studied - adequately described and their conditions defined?

As this study is attempting to examine large-scale population level trends, this information is not applicable.

Methods - adequately described? Complies with relevant reporting standard - Eg CONSORT for randomised trials ? Ethical ?

Methods are adequately described. Given the lack of source data, modeling is used to obtain the estimates with confidence intervals, which is appropriate.

Results - answer the research question? Credible? Well presented?

The results are presented clearly and address the stated research question. The figures illustrate the differing burden or SGA across regions clearly. Both absolute numbers and percentages for comparison are given.

Interpretation and conclusions - warranted by and sufficiently derived from/focused on the data? Message clear?

The message is clear and derived from the data. The authors highlight the limitations to this sort of analysis with modeled data. More emphasis is needed in improving the data quality for hospital deliveries that do not require much of equipment and simple calibration. This can be reach at almost al levels of care. Did they consider a sensitivity analysis to see whether there is any evidence of temporal change, considering that some of the data is now over 30 years old that they included?

Response:

To address these concerns and similar concerns from the committee, we excluded the 2 older datasets from before 1990 from the revised analysis and estimates. Approximately 80% of the datasets included in the current analysis are from after the year 2000. In the new analysis, the estimates did not change meaningfully compared to the prior estimates that included the older datasets. However, we elected to present the revised analysis excluding the pre-1990 data in the current version of the manuscript.

In the discussion, paragraph 2 for example, there is a comparison of INTERGROWTH-21st Project’s STANDARDS to the NICHD and WHO fetal growth reference charts. There are two problems here: a) standards are clearly difference to references and therefore results will be conceptually different; b) this study is using the INTERGROWTH 21st newborn size at birth standards and no fetal growth data from ultrasound is presented. Hence this comparison is not relevant here as the NIH is fetal growth by ultrasound ethnic specific and WHO is only fetal growth by ultrasound. It seems to be some confusion in the use of these tools.

Response: The NICHD and WHO fetal growth references were included to address the potential for ethnic-specific differences in fetal growth, which have been raised by many. We have moved these data to a separate paragraph and emphasized that they are ultrasound-dated data and that corresponding birth weight data are not available or published for either of these studies.

Finally, the wording when referring to the INTERGROWTH 21ST PROJECT implies that birth-weight was used to determinate whether population data could be pooled to construct international growth standards. This is incorrect as this project explicitly used linear growth i.e. birth length avoiding the use of BW. To be accurate, the authors should change this in the places where is implied.

Response:

Thank you for this comment. We have attempted to correct any inadvertent references to birth weight in this context (for pooling purposes). Please let us know if there is any further inaccuracy in the representation of the project.

References - up to date and relevant? Any glaring omissions?

no

Additional Questions:
Please enter your name: Jose Villar MD, MPH
Job Title: Professor of Perinatal Medicine
Institution: John Radcliffe Hospital

Reviewer: 4

Recommendation:

Comments: Small-for-gestational-age (SGA) infants are at increased risk of morbidity. SGA is defined as weighing below the 10th percentile of birth weight for a specific completed gestational age by sex of a given reference population. Clearly a key issue is which reference population is chosen. Debates are inevitable about the relative merits of global vs national reference standards. It all depends on what the question is.

The INTERGROWTH centile charts were derived from a very health population in which rather little difference was seen between the size of fetuses in 8 countries, including both developed (including UK, USA) and LMIC (including Kenya, India, China). These charts may be seen as prescriptive or aspirational, or depicting optimal fetal groups (p7/40). Most other charts of fetal size were developed from an unselected, or minimally selected, group and may be seen more as descriptive; these would be expected to vary between countries. Sometimes these two types of chart are referred to as standards and references, respectively, but the words seem to be used interchangeably by many/most people.

The authors note that “the population attributable fraction (PAF) reflects the disease burden attributable to a causal risk factor if it were reduced from the current exposure level to a theoretical minimum counterfactual distribution” (8/21). They thus chose to use the INTERGROWTH centiles as the basis for the analysis. Given the definition of PAF, the use of INTERGROWTH standards is appropriate. That choice will inevitably lead to a lower prevalence of SGA than use of other standards with much less selective inclusion criteria such as the US 1991 birthweight reference.

This paper presents valuable and interesting information. Overall I think that the paper is clear with respect to the methods used and the findings. Some parts could be made easier for readers not especially familiar with this literature. For example, on p10-11 I didn’t really understand the following text: “Following the publication of the Intergrowth standard which occurred after the completion of our original analysis, we explored the magnitude of change in the effect size of the mortality risk estimates when comparing SGA defined by the U.S. population reference and the Intergrowth standard 53 (eMethods 1). After finding no significant difference between the reference standard used, the previously published regional RRs were used for PAF analyses in this analysis.” Please clarify what was compared between the 2 standards.

Response: Thank you for requesting this clarification. We have now reanalyzed all the original datasets using the Intergrowth standard across all gestational ages (>33 weeks with Villar et al, Lancet. 2014 Sep 6;384(9946):857-68, and <33 weeks with the more recently published reference charts by Villar et al, Lancet. 2016 Feb 27;387(10021):844-5). For each dataset we calculated RRs for neonatal mortality for SGA defined using the Intergrowth standard across all gestational ages, and we then pooled these RRs at the regional level. These revised Intergrowth RRs were used for the PAF calculations.

Additional Questions:
Please enter your name: Doug Altman
Job Title: Professor of Statistics in Medicine
Institution: Univ of Oxford

**Information for submitting a revision**

Deadline: Your revised manuscript should be returned within one

How to submit your revised article: Log into http://mc.manuscriptcentral.com/bmj and enter your Author Center, where you will find your manuscript title listed under "Manuscripts with Decisions." Under "Actions," click on "Create a Revision." Your manuscript number has been appended to denote a revision. You will be unable to make your revisions on the originally submitted version of the manuscript. Instead, revise your manuscript using a word processing program and save it on your computer. Once the revised manuscript is prepared, you can upload it and submit it through your Author Center.
When submitting your revised manuscript, you will be able to respond to the comments made by the reviewer(s) and Committee in the space provided. You can use this space to document any changes you make to the original manuscript and to explain your responses. In order to expedite the processing of the revised manuscript, please be as specific as possible in your response to the reviewer(s). As well as submitting your revised manuscript, we also require a copy of the manuscript with changes highlighted. Please upload this as a supplemental file with file designation ‘Revised Manuscript Marked copy’. Your original files are available to you when you upload your revised manuscript. Please delete any redundant files before completing the submission. When you revise and return your manuscript, please take note of all the following points about revising your article. Even if an item, such as a competing interests statement, was present and correct in the original draft of your paper, please check that it has not slipped out during revision. Please include these items in the revised manuscript to comply with BMJ style (see:

http://www.bmj.com/about-bmj/resources-authors/article-submission/article-requirements and http://www.bmj.com/about-bmj/resources-authors/forms-policies-and-checklists).

Items to include with your revision (see http://www.bmj.com/about-bmj/resources-authors/article-types/research):

1. What this paper adds/what is already known box (as described at http://resources.bmj.com/bmj/authors/types-of-article/research)
2. Name of the ethics committee or IRB, ID# of the approval, and a statement that participants gave informed consent before taking part. If ethics committee approval was not required, please state so clearly and explain the reasons why (see http://resources.bmj.com/bmj/authors/editorial-policies/guidelines.)
3. Patient confidentiality forms when appropriate (see http://resources.bmj.com/bmj/authors/editorial-policies/copy_of_patient-confidentiality).
4. Competing interests statement (see http://resources.bmj.com/bmj/authors/editorial-policies/competing-interests)
5. Contributorship statement+ guarantor (see http://resources.bmj.com/bmj/authors/article-submission/authorship-contributorship)
6. Transparency statement: (see http://www.bmj.com/about-bmj/resources-authors/forms-policies-and-checklists/transparency-policy)
7. Copyright statement/licence for publication (see http://www.bmj.com/about-bmj/resources-authors/forms-policies-and-checklists/copyright-open-access-and-permission-reuse)
8. Data sharing statement (see http://www.bmj.com/about-bmj/resources-authors/article-types/research)
9. Funding statement and statement of the independence of researchers from funders (see http://resources.bmj.com/bmj/authors/article-submission/article-requirements).
10. Patient involvement statement (see http://www.bmj.com/about-bmj/resources-authors/article-types/research).
11. Please ensure the paper complies with The BMJ’s style, as detailed below:

a. Title: this should include the study design eg "systematic review and meta-analysis.”

b. Abstract: Please include a structured abstract with key summary statistics, as explained below (also see http://resources.bmj.com/bmj/authors/types-of-article/research). For every clinical trial - and for any other registered study- the last line of the abstract must list the study registration number and the name of the register.

c. Introduction: This should cover no more than three paragraphs, focusing on the research question and your reasons for asking it now.

d. Methods: For an intervention study the manuscript should include enough information about the intervention(s) and comparator(s) (even if this was usual care) for reviewers and readers to understand fully what happened in the study. To enable readers to replicate your work or implement the interventions in their own practice please also provide (uploaded as one or more supplemental files, including video and audio files where appropriate) any relevant detailed descriptions and materials. Alternatively, please provide in the manuscript urls to openly accessible websites where these materials can be found.

e. Results: Please report statistical aspects of the study in line with the Statistical Analyses and Methods in the Published Literature (SAMPL) guidelines http://www.equator-network.org/reporting-guidelines/sampl/. Please include in the results section of your structured abstract (and, of course, in the article's results section) the following terms, as appropriate:

i. For a clinical trial: Absolute event rates among experimental and control groups; RRR (relative risk reduction); NNT or NNH (number needed to treat or harm) and its 95% confidence interval (or, if the trial is of a public health intervention, number helped per 1000 or 100,000.)
ii. For a cohort study: Absolute event rates over time (eg 10 years) among exposed and non-exposed groups; RRR (relative risk reduction.)
iii. For a case control study: OR (odds ratio) for strength of association between exposure and outcome.
iv. For a study of a diagnostic test: Sensitivity and specificity; PPV and NPV (positive and negative predictive values.)
v. For a systematic review and/or meta-analysis: Point estimates and confidence intervals for the main results; one or more references for the statistical package(s) used to analyse the data, eg RevMan for a systematic review. There is no need to provide a formal reference for a very widely used package that will be very familiar to general readers eg STATA, but please say in the text which version you used.

For articles that include explicit statements of the quality of evidence and strength of recommendations, we prefer reporting using the GRADE system.

f. Discussion: To minimise the risk of careful explanation giving way to polemic, please write the discussion section of your paper in a structured way. Please follow this structure: i) statement of principal findings of the study; ii) strengths and weaknesses of the study; iii) strengths and weaknesses in relation to other studies, discussing important differences in results; iv) what your study adds (whenever possible please discuss your study in the light of relevant systematic reviews and meta-analyses); v) meaning of the study, including possible explanations and implications for clinicians and policymakers and other researchers; vi) how your study could promote better decisions; vi) unanswered questions and future research

g. Footnotes and statements

Online and print publication: All original research in The BMJ is published with open access. Our open access policy is detailed here: http://www.bmj.com/about-bmj/resources-authors/forms-policies-and-checklists/copyright-open-access-and-permission-reuse. The full text online version of your article, if accepted after revision, will be the indexed citable version (full details are at http://resources.bmj.com/bmj/about-bmj/the-bmjs-publishing-model). The print and iPad BMJ will carry an abridged version of your article. This abridged version of the article is essentially an evidence abstract called BMJ pico, which we would like you to write using the template downloadable at http://resources.bmj.com/bmj/authors/bmj-pico. Publication of research on bmj.com is definitive and is not simply interim "epublication ahead of print", so if you do not wish to abridge your article using BMJ pico, you will be able to opt for online only publication. Please let us know if you would prefer this option. If your article is accepted we will invite you to submit a video abstract, lasting no longer than 4 minutes, and based on the information in your paper’s BMJ pico evidence abstract. The content and focus of the video must relate directly to the study that has been accepted for publication by The BMJ, and should not stray beyond the data.

END