FREE-KNOT SPLINES AND BOOTSTRAPPING FOR NONLINEAR MODELING IN COMPLEX SAMPLES
by
SCOTT W. KEITH
DAVID B. ALLISON, CHAIR CHARLES R. KATHOLI CHARLES D. COWAN
NENGJUN YI OLIVIA THOMAS
EDWARD W. GREGG
A DISSERTATION
Submitted to the graduate faculty of The University of Alabama at Birmingham, in partial fulfillment of the requirements for the degree of
Doctor of Philosophy
BIRMINGHAM, ALABAMA
2008
Copyright by Scott W. Keith
2008
FREE-KNOT SPLINES AND BOOTSTRAPPING FOR NONLINEAR MODELING IN COMPLEX SAMPLES
Scott W. Keith
BIOSTATISTICS
ABSTRACT
Studies on body mass index (BMI) as it relates to headache or mortality have
noted considerable nonlinearity. Approaches to rigorously analyzing these relationships
have been limited to generalized linear models and survival models using either
categorizations or polynomials of BMI. I have designed, evaluated, and implemented a
piecewise linear logistic regression framework for modeling nonlinearity between a
binary outcome and a continuous predictor, such as BMI, adjusted for covariates in
complex samples. Least squares and maximum likelihood estimation methods were used
to numerically optimize free-knot splines. Inference methods utilized both parametric and
nonparametric bootstrapping. Parameter estimates were structured for interpretability by
investigators familiar with logistic regression. Unlike other nonlinear software, this
framework accounts for multistage cross-sectional survey sample designs.
I applied this framework to complex datasets to examine the US population for
headache among women and mortality as they respectively relate to BMI. For headache,
datasets included the National Health Interview Survey (NHIS) and the first National
Health and Nutrition Examination Survey (NHANES I). A common nadir in the BMI-
headache relationship was detected around a BMI of 20, relative to which mild obesity
(BMI of 30) and severe obesity (BMI of 40) were respectively associated with roughly
35% and 80% increased odds of headache. Mortality analyses focused on NHANES III.
BMI showed a checkmark-shaped relationship with odds of mortality, but elevated BMI
did not show significantly increased odds. This was unexpected and the product of a
nascent analysis plan. Thus, this finding should be viewed as preliminary. Waist-to-hip
ratio (WHR) has been used as an anthropometric predictor of mortality risk, but the shape
of the relationship has not been carefully examined. For comparison with BMI, I
investigated WHR in NHANES III. Linear logistic regression methods were sufficient for
the WHR-mortality relationship, but WHR was a significant predictor for women only.
The results of these studies relate broadly to the US population and the methods
provide a flexible logistic regression framework for detecting and characterizing
nonlinear relationships. The estimates may provide impetus for more focused obesity,
headache, and mortality research which might realistically affect long-term public health
policy and risk awareness.
DEDICATION
I owe a great debt of gratitude to my best friend and wife, Aimee A. Dugas. Her
love and tireless support of me in developing my career has been a priceless gift. It is to
her that I dedicate this work.
ACKNOWLEDGMENTS
I would like to thank the members of this dissertation committee for their
guidance, criticism, and support. In particular, I wish to acknowledge my friend and
mentor David B. Allison, whose trust and encouragement have profoundly impacted my
thinking, skill set, and potential for success. Much appreciation also goes to Tapan Mehta
for his assistance in running simulation programs and porting code for high-performance
parallel computing. Thanks to the International Journal of Obesity in which was
published, “Putative contributors to the secular increases in obesity: Exploring the roads
less traveled.” Thanks to Obesity in which was published, “BMI and headache among
women: Results from 11 epidemiologic datasets.” This research was supported in part by
the following NIH Grants: T32HL079888, P30DK056336, and R01DK076771; and
Ortho-McNeil Pharmaceutical, Inc.
TABLE OF CONTENTS
ABSTRACT
DEDICATION
ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF ABBREVIATIONS
INTRODUCTION
    Goals and Objectives
    How These Ideas Developed
    An Overview of the Components of This Research
        Piecewise Linear Free-Knot Splines
        Knot Selection Via Parametric Bootstrapping
        Complex Multistage Probability Samples
        Adjustments to Estimates and Confidence Intervals
        Simulations
        Applications to Public Health Outcomes
    Specific Aims
        Specific Aim 1
        Specific Aim 2
    The Papers
PUTATIVE CONTRIBUTORS TO THE SECULAR INCREASE IN OBESITY: EXPLORING THE ROADS LESS TRAVELED
A FREE-KNOT SPLINE MODELING FRAMEWORK FOR PIECEWISE LINEAR LOGISTIC REGRESSION IN COMPLEX SAMPLES
BMI AND HEADACHE AMONG WOMEN: RESULTS FROM 11 EPIDEMIOLOGIC DATASETS
BODY MASS INDEX AND WAIST-TO-HIP RATIO AS THEY RELATE TO MORTALITY IN NHANES III
CONCLUSION
GENERAL LIST OF REFERENCES
APPENDIX
    A Future Directions: Nonlinear Cox Proportional Hazards Regression
    B On the effects of ignoring sample weights or those with extremely high BMI
LIST OF TABLES
Tables
BMI AND HEADACHE AMONG WOMEN: RESULTS FROM 11 EPIDEMIOLOGIC DATASETS
    1 Description of epidemiologic datasets used
    2 The coding of headache among the 11 datasets
    3 Piecewise logistic regression primary model results
    4 Piecewise logistic regression extended model results
    5 Odds ratios and 95% confidence intervals across selected BMI values for the primary and extended models
BODY MASS INDEX AND WAIST-TO-HIP RATIO AS THEY RELATE TO MORTALITY IN NHANES III
    1 NHANES III data description
    2 Piecewise linear logistic regression model results for relating log-odds of mortality during follow-up to BMI and WHR by gender
    3 Odds ratios and bootstrap 95% CI across selected BMI and WHR values by gender
LIST OF FIGURES
Figures
PUTATIVE CONTRIBUTORS TO THE SECULAR INCREASE IN OBESITY: EXPLORING THE ROADS LESS TRAVELED
    1 Secular changes in a number of key indicators of factors that may be related to the increase in obesity
A FREE-KNOT SPLINE MODELING FRAMEWORK FOR PIECEWISE LINEAR LOGISTIC REGRESSION IN COMPLEX SAMPLES
    1 Plotted B-spline basis functions of order m = 2 having knots at BMI = 21 and BMI = 34
    2 Model selection simulation results for the parametric bootstrap 2 df forward selection procedure
    3 A comparison of knot selection simulation results
BMI AND HEADACHE AMONG WOMEN: RESULTS FROM 11 EPIDEMIOLOGIC DATASETS
    1 Odds ratios for headaches among women by BMI (reference BMI = 20)
BODY MASS INDEX AND WAIST-TO-HIP RATIO AS THEY RELATE TO MORTALITY IN NHANES III
    1 Histograms depicting the distributions of BMI and WHR by gender
    2 Unadjusted proportion of deaths observed among those grouped per unit of BMI and 1/100 unit of WHR, respectively, by gender
    3 Odds ratios plotted for BMI* and WHR by gender
LIST OF ABBREVIATIONS
(Entries are listed alphabetically)
AIC Akaike’s information criterion
alt. alternative or “under the alternative hypothesis”
B-spline basis spline
BARS Bayesian adaptive regression splines
BIC Schwarz’s Bayesian information criterion
BMI body mass index (kg/m²)
BRR balanced repeated replication
CV cross validation
df degree(s) of freedom
GAM generalized additive model
GLM generalized linear model
GCV generalized cross validation
IML integrated matrix language package in SAS
LMF linked mortality file
log(odds) the natural logarithm of the odds in favor of some binary event occurring
LR likelihood ratio
LSE least squares estimate, estimator, or estimation
arg min() argument of the minimum (i.e., minimize)
MARS multivariate adaptive regression splines
MLE maximum likelihood estimate, estimator, or estimation
NCHS National Center for Health Statistics
NHANES National Health and Nutrition Examination Survey
NHIS National Health Interview Survey
NLIN least squares nonlinear optimization package in SAS
NLP nonlinear programming optimization package in SAS
OR odds ratio(s)
P-splines penalized splines
PLS piecewise linear slope(s) (refers to representation of spline parameters)
PROC SAS software procedure
PSU primary sampling unit
RDC Research Data Center at NCHS
rep replicate (i.e., indicating a randomly drawn replicate)
SAS Statistical Analysis Software
SSE sum of squared residual error
SRS simple random sample or sampling
SUDAAN a SAS-callable package for common analyses of complex samples
TPSLINE thin plate spline modeling package in SAS
TRANSREG transformation regression modeling package in SAS
WHO World Health Organization
WHR waist-to-hip ratio
INTRODUCTION
Goals and Objectives
In this dissertation research I investigated an approach to nonlinear modeling
designed to take advantage of complex sample design features common to large
nationally representative observational datasets. This nonlinear modeling framework
focused on:
1) using piecewise linear free-knot splines to estimate the nonlinear relationship
between a binary outcome of interest and a continuous predictor variable adjusted
for relevant covariate information; and
2) applying bootstrap methods to perform two important functions:
a) estimating the complexity of the free-knot spline by determining the
number of knot parameters that provide the optimal piecewise linear fit to
the data; and
b) making appropriate adjustments to parameter estimates, variance
estimates, and confidence intervals by taking into account complex sample
design information.
To demonstrate the properties of the novel aspects of this nonlinear modeling
framework, I have designed and carried out a comparative simulation study. Next, I
focused my efforts on making original contributions to public health knowledge by
applying the framework to real data from the third National Health and Nutrition
Examination Survey (NHANES III) and other large, cross-sectional datasets. In
particular, I have conducted two potentially nonlinear analyses to investigate the
following:
1) the relationship between headaches and body mass index (BMI: weight in kg
divided by the square of height in meters) among women (Keith et al., 2008); and
2) the risk of mortality as it relates to BMI and waist-to-hip ratio (WHR; waist
circumference divided by hip circumference), respectively.
The estimates I have computed may then be used to provide a basis for designing
more focused observational and clinical quantitative research in the areas of obesity,
headache, and mortality. The eventual results from these studies may be expressed by
complex decision models (Parmigiani, 2002) that may realistically affect public health
policy in the long-term.
How These Ideas Developed
The ideas for this research stem primarily from my efforts to model the apparently
nonlinear relationships between headache and BMI among women sampled throughout
the United States in datasets that included the first National Health and Nutrition
Examination Survey (NHANES I: 1971-1975) and the National Health Interview Surveys
(NHIS series 1997-2003). These eight publicly available datasets each had complex,
multistage sample designs that allowed for efficient achievement of samples that
represented the civilian non-institutionalized US population.
I could find no software packages that would fit nonlinear models and make the
appropriate adjustments for the complex sample designs of these large cross-sectional
datasets. Thus, I set out to construct my own nonlinear modeling software capable of
utilizing the strata, primary sampling unit (PSU), and sample weight information
provided by these surveys.
An Overview of the Components of This Research
Here I discuss very briefly the topics and issues involved in this research. Each of
these areas will be discussed in detail in later chapters.
Piecewise Linear Free-Knot Splines
An extensive review of the literature on nonlinear modeling showed that free-knot splines
stood out for their potential for flexibility and interpretability. A free-knot spline may be
loosely described as a nonlinear regression characterized by piecewise polynomials of
order m joined at locations called knots where the adjoining segments typically agree at
their (m-2)th derivative and both the number and locations of the knots are parameters to
be estimated along with other model parameters.
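As a minimal sketch of the piecewise linear case (order m = 2, so adjoining segments agree in value at each knot while the slope is free to change), the log-odds can be written with a truncated power basis. The Python below is an illustration only, not the SAS implementation used in this research; the function names are hypothetical, the knots at BMI = 21 and 34 are borrowed from the example basis plotted in the second paper, and the coefficients are made up for demonstration.

```python
import numpy as np

def truncated_power_basis(x, knots):
    """Design matrix [1, x, (x - k1)+, (x - k2)+, ...] for a piecewise linear spline."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    cols = [np.ones_like(x), x]
    # Each knot adds a hinge term that is zero to the left of the knot.
    cols += [np.clip(x - k, 0.0, None) for k in knots]
    return np.column_stack(cols)

def spline_log_odds(x, beta, knots):
    """Log-odds at x for coefficients beta = (intercept, slope, slope changes)."""
    return truncated_power_basis(x, knots) @ np.asarray(beta, dtype=float)

def odds_ratio(x, ref, beta, knots):
    """Odds ratio at x relative to a reference value of the predictor."""
    return np.exp(spline_log_odds(x, beta, knots) - spline_log_odds(ref, beta, knots))

# Hypothetical values, chosen only to illustrate the mechanics.
knots = [21.0, 34.0]
beta = [2.0, -0.10, 0.13, 0.05]   # slopes by segment: -0.10, then 0.03, then 0.08
bmi = np.array([18.0, 21.0, 30.0, 40.0])
eta = spline_log_odds(bmi, beta, knots)
or_vs_20 = odds_ratio(bmi, 20.0, beta, knots)
```

In a free-knot spline the entries of `knots` would themselves be estimated along with `beta`; here they are held fixed so that the basis construction can be seen in isolation.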
Knot Selection Via Parametric Bootstrapping
Picking the most appropriate free-knot spline model is a complicated problem. I
have devised a unique method of knot selection based on parametric bootstrap
methodology.
Complex Multistage Probability Samples
The complex sample designs, for which I am designing this framework, provide
investigators information necessary to adjust analyses for the planned sampling schemes.
These designs employ multistage probability sampling involving stratification, clustering,
assessment of non-response, and oversampling of specific subpopulations (e.g., age or
race subgroups) that would be difficult to represent well with simple random sampling
(SRS). Ignoring these characteristics can result in biased and possibly misleading
estimates.
Adjustments to Estimates and Confidence Intervals
Research into adjustments for complex, multistage probability sample designs
revealed useful applications of resampling techniques, such as the bootstrap, which may
be conveniently employed to make appropriate adjustments.
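One way such a resampling adjustment can work is sketched below, assuming a rescaling bootstrap in which n_h - 1 primary sampling units (PSUs) are drawn with replacement from the n_h PSUs in each stratum and each unit's weight is multiplied by n_h / (n_h - 1) times the number of times its PSU was drawn. The function name, the toy data, and the use of a weighted mean in place of a fitted model are all assumptions made for illustration; this is not the code used in this research.

```python
import numpy as np

def rescaled_bootstrap_weights(weights, strata, psu, rng):
    """Return one bootstrap replicate of rescaled sampling weights."""
    weights = np.asarray(weights, dtype=float)
    strata = np.asarray(strata)
    psu = np.asarray(psu)
    new_w = np.zeros_like(weights)
    for h in np.unique(strata):
        in_h = strata == h
        psus_h = np.unique(psu[in_h])          # PSU labels are nested in strata
        n_h = len(psus_h)
        if n_h < 2:
            raise ValueError("each stratum needs at least two PSUs")
        draws = rng.choice(psus_h, size=n_h - 1, replace=True)
        for p in psus_h:
            r = np.sum(draws == p)             # times this PSU was resampled
            mask = in_h & (psu == p)
            new_w[mask] = weights[mask] * r * n_h / (n_h - 1)
    return new_w

# Toy survey: 2 strata, PSUs nested within strata, equal base weights.
rng = np.random.default_rng(0)
strata = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
psu = np.array([1, 1, 2, 2, 1, 1, 2, 2, 3, 3])
w = np.full(10, 10.0)
y = np.array([1, 0, 1, 1, 0, 0, 1, 0, 1, 1], dtype=float)

reps = [rescaled_bootstrap_weights(w, strata, psu, rng) for _ in range(200)]
estimates = np.array([np.average(y, weights=r) for r in reps])
se = estimates.std(ddof=1)                     # bootstrap standard error
```

The same replicate weights could be passed to any weighted estimator, so the spread of the replicate estimates reflects the stratified, clustered design rather than an assumed simple random sample.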
Simulations
The knot selection procedure I have devised for the piecewise linear free-knot spline
modeling framework will be carefully evaluated on its performance in analyzing data
simulated under a variety of conditions.
Applications to Public Health Outcomes
Experts tend to agree that obesity is complex (Keith et al., 2006a) and costly
(Allison et al., 1999; WHO, 1998), having relationships with headache (e.g., Keith et al.,
2008; Bigal et al., 2006) and mortality rate (e.g., Flegal et al., 2005; Fontaine et al., 2003;
Calle et al., 1999; Narayan et al., 2007; Keith et al., 2006b), respectively, where many
factors contribute to or modify an individual’s susceptibility to obesity and its correlates.
The contemporary studies of these relationships call for large sample sizes and analytical
tools capable of identifying significant associations that are likely to be unevenly
distributed over groups that vary by age, race, gender, geographic locations,
socioeconomic status, etc. Moreover, statistical tools, such as those which use splines,
that are capable of providing highly flexible models are called for in these settings (Korn
and Graubard, 1999).
The information derived from this research may be useful for
clinicians and biostatisticians. Firstly, BMI is an easily measured and modifiable risk
factor. Thus, clinicians and public health officials could use the results of this study to
advise or counsel patients regarding the benefits associated with remaining “below” or
“above” a given BMI to reduce their risk of headache or mortality. Secondly,
biostatisticians and other quantitative researchers may be interested in more closely
estimating the nonlinear functional form of the association between a continuous
predictor and a binary outcome related to their particular field.
Specific Aims
Specific Aim 1
To complete development of and evaluate a piecewise linear free-knot spline
approach to modeling the nonlinear relationship between a binary outcome and a
continuous prognostic variable in large datasets having complex sampling designs and
covariate information.
In brief, the framework carries forward concepts and ideas developed for least
squares estimation (LSE) to applying maximum likelihood estimation (MLE) in
piecewise linear free-knot splines based upon either the truncated power basis or B-splines
(de Boor, 1978). Statistical software will be employed in this framework to
simultaneously optimize model equations with respect to multiple covariate and knot
parameters. For the purpose of selecting the optimal number of knots and their locations,
I have developed a model selection algorithm that utilizes the binomial probability
distribution assumption under the piecewise linear logistic regression model. This
framework will employ parametric bootstrapping (Davison and Hinkley, 1997) of a two
degree of freedom test of model improvement from adding a knot parameter and a slope
parameter (my “2 df knot testing procedure”). In order to compute accurate standard error
estimates and confidence intervals, another level of specialized nonparametric
bootstrapping must be applied to rescale individual sampling weights according to the
methods outlined by Rao and colleagues (1992). A critical component of this aim
involved evaluating the 2 df knot testing procedure with respect to its efficiency and
qualities. This evaluation involved simulating data under a variety of controlled
conditions, applying the method to the simulated data, and plotting the results for
comparison to the “true” simulated model. Comparison with other popular alternative
methods, such as AIC and BIC, was also an important component of this simulation study,
which demonstrated the advantages and disadvantages of using the proposed method.
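The logic of a parametric bootstrap test for adding one knot and one slope can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure as implemented in this research: it ignores sampling weights and covariates, searches a fixed grid of candidate knots rather than optimizing the knot continuously, uses a standardized predictor, and all function names are hypothetical. Because the knot location is not identified when the added slope is zero, the usual chi-square reference for the likelihood ratio statistic is questionable, so the reference distribution is built by simulating binary outcomes from the fitted smaller model.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25):
    """Newton-Raphson logistic fit; returns (beta, log-likelihood)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        W = p * (1.0 - p)
        # Tiny ridge on the Hessian only, to keep the solve stable.
        H = X.T @ (X * W[:, None]) + 1e-8 * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(H, X.T @ (y - p))
    p = np.clip(1.0 / (1.0 + np.exp(-(X @ beta))), 1e-12, 1 - 1e-12)
    return beta, np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def design(x, knots):
    """Piecewise linear truncated power basis."""
    cols = [np.ones_like(x), x] + [np.clip(x - k, 0.0, None) for k in knots]
    return np.column_stack(cols)

def lr_stat(x, y, grid):
    """2 * (best one-knot log-likelihood - no-knot log-likelihood)."""
    ll0 = fit_logistic(design(x, []), y)[1]
    ll1 = max(fit_logistic(design(x, [k]), y)[1] for k in grid)
    return 2.0 * (ll1 - ll0)

def bootstrap_knot_test(x, y, grid, n_boot, rng):
    """Parametric bootstrap p-value for adding one knot and one slope."""
    observed = lr_stat(x, y, grid)
    beta0, _ = fit_logistic(design(x, []), y)
    p0 = 1.0 / (1.0 + np.exp(-(design(x, []) @ beta0)))
    # Simulate outcomes under the fitted no-knot model and recompute the statistic.
    null_stats = np.array([
        lr_stat(x, rng.binomial(1, p0).astype(float), grid)
        for _ in range(n_boot)
    ])
    p_value = (1 + np.sum(null_stats >= observed)) / (n_boot + 1)
    return observed, p_value

# Simulated data with a true change in slope at x = 0 (illustrative values).
rng = np.random.default_rng(1)
x = rng.uniform(-2.0, 2.0, size=400)
eta = -1.0 - 1.0 * x + 2.5 * np.clip(x, 0.0, None)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta))).astype(float)
grid = np.linspace(-1.0, 1.0, 9)
observed, p_value = bootstrap_knot_test(x, y, grid, n_boot=50, rng=rng)
```

Applied repeatedly, a test of this kind supports forward selection: knots are added one at a time until the bootstrap test no longer indicates improvement. The stopping rule and the number of bootstrap replicates are choices left open in this sketch.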
Specific Aim 2
To apply the framework to nationally representative datasets to carefully examine
1) the risk of headache; and 2) the risk of mortality associated with BMI while adjusting
for the effects of covariates and complex sample designs.
This aim focused on applying my nonlinear modeling framework. Cross-sectional
data analyses of the relationship between BMI and headache outcomes were conducted in
eleven large datasets, many of which had complex sample designs. For estimation and
comparison, nonlinear analyses of the respective relationships of BMI and WHR with
mortality in the United States have been carried out on data from NHANES III. This
dataset was large (over 14,000 adult participants), had a complex, nonignorable
multistage probability cluster sampling design, and contained several detailed measures
of adiposity. When used in conjunction with its Linked Mortality File (LMF), NHANES
III provided information on vital status, BMI, WHR, and an abundance of covariate
measures.
The Papers
This dissertation follows the “three papers” model. The first paper is an extensive
literature review which points out alternative contributors to the obesity epidemic to be
considered alongside the “big two” contributors (i.e., food marketing practices and
reductions in physical activity). This paper illustrates the importance of developing new
ideas and challenging the assumptions we, as scientists, often make in obesity-related
research. With data resources growing in size and scope, so too should our abilities to
draw connections between health outcomes and possible predictors. The statistical
methodology and applications detailed in the following three papers take that general
aim. The second paper describes my unique nonlinear modeling framework for complex
samples and the implementation of B-splines and likelihood-based methods to improve
computational performance and stability. It also includes a simulation study of my novel
knot testing procedure. The third paper applies the framework, based on least-squares
estimation by the Levenberg-Marquardt optimization procedure, to analyze the possibly
nonlinear relationship between BMI and headache among women in 11 large
epidemiologic datasets. The fourth paper details an application of my likelihood-based
modeling framework to NHANES III for the purpose of comparing and contrasting the
predictive capacities of BMI and WHR as they relate to all-cause mortality as a binary
outcome.
PUTATIVE CONTRIBUTORS TO THE SECULAR INCREASE IN OBESITY:
EXPLORING THE ROADS LESS TRAVELED
by
SCOTT W. KEITH, DAVID T. REDDEN, PETER KATZMARZYK, MARY M. BOGGIANO, ERIN C. HANLON, RUTH M. BENCA, DOUGLAS RUDEN,
ANGELO PIETROBELLI, JAMIE BARGER, KEVIN R. FONTAINE, CHENXI WANG, LOUIS J. ARONNE, SUZANNE WRIGHT, MONICA BASKIN,
NIKHIL DHURANDHAR, MARIA C. LIJOI, CARLOS M. GRILO, MARIA DELUCA, ANDREW O. WESTFALL, DAVID B. ALLISON
International Journal of Obesity. 30:1585-94.
Copyright 2006 by
Scott W. Keith
Abstract
Objective: To investigate plausible contributors to the obesity epidemic beyond the two
most commonly suggested factors, reduced physical activity and food marketing
practices.
Design: A narrative review of data and published materials that provide evidence of the
role of additional putative factors in contributing to the increasing prevalence of obesity.
Data: Information was drawn from ecological and epidemiological studies of humans,
animal studies, and studies addressing physiological mechanisms when available.
Results: For at least 10 putative additional explanations for the increased prevalence of
obesity over recent decades, we found supportive (though not conclusive) evidence that
in many cases is as compelling as the evidence for more commonly discussed putative
explanations.
Conclusion: Undue attention has been devoted to two postulated causes for increases in
the prevalence of obesity leading to neglect of other plausible mechanisms and well-
intentioned, but potentially ill-founded proposals for reducing obesity rates.
Key Words: additional explanations, prevalence of obesity, obesity epidemic, body mass
index, food marketing, physical activity.
Introduction
The prevalence of obesity has increased substantially since 1970.1 Although the
causes are uncertain, many contend that environmental changes are almost certainly
responsible and focus overwhelmingly on food marketing practices and technology, and
on institution-driven reductions in physical activity (the “Big Two”), eschewing the
importance of other influences. This has created a hegemony whereby the importance of
the Big Two is accepted as established and other putative factors are not seriously
explored. The result may be well-intentioned but ill-founded proposals for reducing
obesity rates.
We begin by reviewing key facts about the secular increase in obesity (“the
epidemic”). We then highlight evidence showing that the obesogenic influence of the Big
Two is largely ‘circumstantial’, relying heavily on ecological correlations rather than
individual-level epidemiologic data or randomized experiments. Subsequently, we
delineate the evidence for 10 other putative factors for which the evidence is also
circumstantial but in many cases, at least equally compelling. We conclude that undue
attention has been devoted to 2 postulated causes for the epidemic, yielding neglect of
other plausible mechanisms.
The Epidemic
Obesity prevalence in the United States has been increasing for at least 100 years2
with an apparent acceleration in the past 3 decades. The distribution of body mass index
(BMI; kg/m²) has increased modestly in median and moderately in mean. What has
increased far more dramatically is the positive (right-tailed) skewness of the distribution,
such that the most obese segments of the distribution are far more obese than in years
past. Obesity has increased in every age, sex, race, and smoking-status stratum of the
population, which has correctly been taken to indicate that changes in the distribution of
age, race, sex, and smoking status cannot completely account for the epidemic. However,
as we show later, this finding does not indicate that changes in the distribution of these
variables are not contributing to the epidemic.
Evidence for the Big Two
Reduced physical activity,3 particularly from reduced school-based physical
education,4 and specific food manufacturing and marketing practices (e.g., vending
machines in schools,5 increased portion size,6 increased availability of fast-food,3,7,8 use
of high-fructose corn syrup (HFCS)9) comprise the Big Two explanations proffered for
the obesity epidemic and are frequently cited as targets of potential public health
interventions. We do not intend to imply that the Big Two are not salient contributors to
the epidemic. Rather, we offer that the evidence of their role as primary players in
producing the epidemic (as well as the evidence supporting their potential ability to
reverse the trend if manipulated) is both equivocal and largely circumstantial—that is, the
hypothesized effects are underdetermined by the data. Data rarely, if ever, stem from
randomized controlled trials of the effects in population settings and in many cases do not
even include a consistently supportive body of individual-level epidemiologic studies.
The arguments for the effects of each subcomponent tend to rely heavily (though not
exclusively) on presumed mechanisms of action and ecological studies10 in which
associations between the putative factor and obesity rates are shown at the aggregate
population level across times or geographic locations. According to the Food and Drug
Administration,11 because ecological “studies do not examine the relationship between
exposure and disease among individuals, the studies have been traditionally regarded as
useful for generating, rather than definitively testing, a scientific hypothesis.” Consider
several examples. Regarding physical education classes, Pathways, a large, expensive,
and expertly designed childhood obesity prevention program emphasized increasing
frequency and quality of physical education classes and found no effect on BMI.12
Regarding vending machines, a thorough evidence-based review (Faith et al.,
unpublished, 2005) found no published randomized trials, quasi-experiments, or
observational epidemiologic studies evaluating their effects on obesity. Regarding fast-
food availability, although some studies showed associations with obesity, Burdette and
Whitaker13 found no association between being overweight and proximity to fast-food
restaurants in over 7000 children. Regarding HFCS, the leading source (in the United
States) is sweetened beverages and 3 out of 4 studies conducted in children have found
no association between soft drink consumption and BMI when controlling for total
energy intake14-17 raising the issue that there is no independent effect of HFCS calories on
body weight other than its pleasant taste possibly leading to the potential increase in total
caloric intake as would any food.
Regarding TV viewing, a recent meta-analysis concluded “A statistically
significant relationship exists between TV viewing and body fatness among children and
youth although it is likely to be too small to be of substantial clinical relevance. …media-based
[TV-based] inactivity may be unfairly implicated in recent epidemiologic trends of
overweight and obesity among children and youth”.18 Regarding portion size, Rolls has
presented considerable evidence that portion size may increase daily food intake.
Nevertheless, Rolls19 wrote, “…. that adults who are obese eat bigger portions of energy-
dense foods do[es] not prove that portion size plays a role in the etiology of obesity.
Indeed, at this time we know of no data showing such a causal relationship.”
Again, these data and quotations do not disprove the importance of those factors
listed but highlight their less-than-unequivocal evidential basis. Realizing this should
serve as an impetus for more vigorous consideration of additional factors.
Additional Explanations for the Increase in Obesity
We do not review all plausible contributors to the epidemic but select those that
are most interesting and for which the totality of current evidence is strongest. Figure 1
portrays the secular increase in a number of key indicators of these putative causal
influences. For most Additional Explanations, the conclusion that a factor (e.g., X) has
contributed to the epidemic follows logically from acceptance of two
propositions: 1) X has a causal influence on human adiposity and 2) during the past
several decades, the frequency distribution of X has changed such that the relative
frequency of values of X leading to higher adiposity levels has increased. Absent
countervailing forces, if both propositions are true, obesity levels will increase.
Therefore, for postulated factors supported by this line of propositional argument
(Additional Explanations 1-7), we evaluate evidence addressing whether the factor can
increase fatness and whether the factor’s frequency distribution has changed in the
obesogenic direction. For the remaining Additional Explanations, propositional
arguments vary in form and are outlined separately.
Additional Explanation 1: Sleep Debt
Evidence that less sleep can cause increased body weight. For children and adults,
hours of sleep per night is inversely related to BMI and obesity in cross-sectional studies
and incident obesity in longitudinal studies.20,21 In animals, sleep deprivation produces
hyperphagia, offering a mechanism of action.22 Evidence for the physiologic mechanism
includes decreased leptin and thyroid stimulating hormone secretion, increased ghrelin
levels and decreased glucose tolerance, all endocrine changes that occur with sleep
deprivation.23-25 Sleep restriction in humans has recently been shown to produce similar
effects, including increased hunger and appetite.26 These changes are consistent with
chronic sleep deprivation leading to increased risk of obesity.
Has average sleep debt increased? Data clearly show that the average amount of
sleep has steadily decreased among U.S. adults and children during the past several
decades.27,28 Average daily sleep has decreased from over 9 to just over 7 hours among
adults.
We note that future studies examining the association between sleep debt and BMI
or any cause-effect link between them would benefit from utilizing more objective
assessments of sleep duration and sleep quality (vs. self-reporting). A good example is
the measurement of spontaneous physical activity during sleep by microwave radar
detector. Bitz et al. (2005)29 used this technique in finding increased sleep disruptions
among diabetic subjects. Resta et al. (2003)30 found that even in the absence of sleep
apnea, obese subjects were observed to suffer more sleep disruptions defined as higher
sleep latency, a lower percentage of REM sleep, and a lower sleep efficiency (a ratio
between total sleep time and time spent in the bed) than non-obese subjects. The effect
of age should be controlled in such assessments, as it correlates positively with sleep time
activity.29 Large-scale self-report studies could also be improved with subjects’ use of
actigraphy watches to verify self-reported sleep times.
Additional Explanation 2: Endocrine Disruptors
Evidence that endocrine disruptors can increase adiposity. Endocrine disruptors
(EDs) are lipophilic, environmentally stable, industrially produced substances that can
affect endocrine function and include dichlorodiphenyltrichloroethane (DDT), some
polychlorinated biphenyls (PCBs), and some alkylphenols. By disturbing endogenous
hormonal regulation, EDs may fatten through multiple pathways. Consider the effect of
estrogen on white adipose tissue: In rodents white adipose is increased by ovariectomy
and decreased by estrogen replacement therapy.31 Similarly, postmenopausal women
have increased white adipose tissue, which is reduced by estrogen replacement therapy.32
Estrogen receptor-α knockout mice of both sexes have increased white adipose
tissue.33 Some EDs directly bind to nuclear receptors, including the peroxisome
proliferator-activated receptor γ and the retinoid X receptor. Kanayama et al.34 found
that the organotin EDs are high-affinity agonists for the peroxisome proliferator-
activated receptor γ and retinoid X receptor and stimulate adipocyte proliferation.
Other EDs are antagonists of certain nuclear receptors. For example, vinclozolin is a
dicarboximide fungicide and an androgen receptor antagonist.35 Some EDs are anti-
androgens36 and may thereby alter nutrient partitioning toward a more fatty body
composition. EDs can also inhibit aromatases37 and the aromatase knockout mouse has
increased adiposity. In humans, body ED burden and BMI or fat mass are positively
correlated even when normalized to total body triglyceride.38
Evidence that ED exposure has increased. EDs have increased in the food
chain.39,40 One example indicator is that polybrominated diphenyl ether concentration in
Swedish women’s breast milk almost doubled every 5 years from 1972 to 1998.39
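To give a sense of the scale implied by this indicator, the cumulative change can be sketched as compound growth. This is illustrative arithmetic only: it assumes an exact doubling every 5 years, whereas the cited data show concentrations that "almost doubled", so the result is an upper bound rather than a reported value. The function name is hypothetical.

```python
# Illustrative arithmetic only. Assumes an exact doubling every 5 years;
# the source reports "almost doubled", so this is an upper bound.

def fold_increase(start_year: int, end_year: int, doubling_years: float = 5.0) -> float:
    """Cumulative fold increase under constant exponential doubling."""
    return 2.0 ** ((end_year - start_year) / doubling_years)


pbde_fold = fold_increase(1972, 1998)
print(f"Upper-bound fold increase, 1972 to 1998: {pbde_fold:.1f}")
```

Even a modest shortfall from exact doubling in each period compounds over the 26 years, so the true cumulative increase would be correspondingly lower.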
Additional Explanation 3: Reduction in Variability in Ambient Temperature
Evidence that remaining in the thermoneutral zone promotes adiposity. The
thermoneutral zone (TNZ) is the range of ambient temperature in which energy
expenditure is not required for homeothermy. Exposure to ambient temperatures above or
below the TNZ increases energy expenditure, which, all other things being equal,
decreases energy stores (i.e., fat). This effect was shown in short-term controlled human
experiments41,42 and the decreases in adiposity were evidenced in controlled animal
experiments; these effects are widely exploited in livestock husbandry, where selecting
the environment to maximize weight gain is critical.43
Animal44 and human45 studies show that excursions above the TNZ markedly
reduce food intake. Herman45 cited a consumer survey suggesting that after an air-
conditioning breakdown, restaurant sales drop dramatically.
Evidence that time in the TNZ has increased. Humans dwell more in the TNZ
than they did 30 years ago. For example, the average internal U.K. home temperature
increased from 13°C to 18°C between 1970 and 2000.46 The U.S. thermal standard for
winter comfort increased from 18°C in 1923 to 24.6°C in 1986.47,48 The percentage of
U.S. homes with central air conditioning increased from 23% to 47% between 1978 and
1997 while the percentage of homes with no air-conditioning decreased from 44% to
28%. In the southern United States, where some of the highest obesity rates are observed,
the percentage of homes with central air conditioning increased from 37% to 70%
between 1978 and 1997 and the percentage of homes without any air-conditioning
decreased from 26% to 7%.49
Additional Explanation 4: Decreased Smoking
Evidence that smoking reduces weight. Epidemiologic and clinical studies
consistently show that smokers tend to weigh less than nonsmokers and weight gain
follows smoking cessation.50,51 Nicotine has both thermogenic and appetite suppressant
effects and its effects on appetite are enhanced by caffeine.52
Evidence that smoking rates have decreased. Rates of cigarette smoking among
U.S. adults steadily declined during the past several decades.53 Centers for Disease
Control and Prevention scientists estimated that between 1978 and 1990 smoking
cessation was responsible for about one quarter (2.3 of 9.6 percentage points) of the
increase in the prevalence in overweight in men and for about one sixth (1.3 of 8.0
percentage points) of the increase in women.50
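The fractions quoted above can be checked directly from the percentage-point figures. The short sketch below only reproduces that division; the function name is hypothetical and no data beyond the four figures in the text are used.

```python
# Reproduces the shares quoted in the text from the percentage-point figures.

def attributable_share(attributed_points: float, total_points: float) -> float:
    """Fraction of a total prevalence increase attributed to one factor."""
    return attributed_points / total_points


men_share = attributable_share(2.3, 9.6)    # text: "about one quarter"
women_share = attributable_share(1.3, 8.0)  # text: "about one sixth"

print(f"Men:   {men_share:.3f}")
print(f"Women: {women_share:.3f}")
```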
Additional Explanation 5: Pharmaceutical Iatrogenesis
Evidence that certain pharmaceuticals increase weight. Weight gain is induced
by many psychotropic medications (antipsychotics, antidepressants, mood stabilizers),
anticonvulsants, antidiabetics, antihypertensives, steroid hormones and contraceptives,
antihistamines, and protease inhibitors. Selective serotonin reuptake inhibitors
(antidepressants) may also produce weight gain but data are less consistent.54-56 Almost
all atypical antipsychotics produce markedly more weight gain than placebo or than
traditional antipsychotics. For olanzapine and clozapine, mean weight gains were over 4
kg at 10 weeks.57 These drugs are active at many receptors involved in body weight
regulation58 and these findings were reproduced in animal models.59 Most antidiabetics
including insulin, sulfonylureas, and thiazolidinediones also promote adiposity,
especially the newer thiazolidinediones, which promote adipocyte proliferation.60 Beta-
blockers induce a mean weight gain of approximately 1.2 kg.61 Data are less consistent
for oral contraceptives, but one study estimated a mean weight gain of approximately 5
kg at 2 years.62 Antihistamines also appear to induce weight gain, with more potent
antihistamines producing greater weight gain.63 HIV antiretroviral drugs and protease
inhibitors also produce weight gain and increased abdominal adiposity.64
Evidence that use of such pharmaceuticals has increased. Most pharmaceuticals
described above were introduced or had their use dramatically increased in the past 3
decades. In the past 30 years, outpatient prescriptions for atypical antipsychotic
medications have increased from essentially zero to nearly 70% of the prescriptions for
this large patient population.65,66 Oral antidiabetic prescriptions increased more than 2-fold
from 1990 to 2001.67 Similar increases were also observed for use of
anticonvulsants68 and antihypertensives.69 HIV therapies were only introduced in the past
several decades.
Additional Explanation 6: Changes in Distribution of Ethnicity and Age
Evidence that some age and ethnic groups have higher prevalences of obesity
than others. Compared with young European-Americans, middle-aged adults, African-
Americans (when comparing women only), and Hispanic-Americans have a markedly
higher obesity prevalence.1
Evidence that those age and ethnic groups have increased in relative frequency.
As a proportion of U.S. adults, the Hispanic-American population increased from less
than 5% in 1970 to approximately 13% in 2000.70,71 Similarly, from 1970 to 2000, the
proportion of the total U.S. adult population aged 35–44 years and 45–54 years increased
by 43% and 18%, respectively.71 Given that these groups have higher than average
obesity rates, it is likely that these demographic changes in the population are
contributing to the increased prevalence of obesity in at least a small way.
Additional Explanation 7: Increasing Gravida Age
Evidence that greater gravida age increases risk of offspring obesity. Wilkinson
et al.72 studied obese British children and found that a common risk factor was having an
elderly mother. Patterson et al.73 studied girls aged 9–10 years and found that the odds of
obesity increased 14.4% for every 5-year increment in maternal age. Biological data
support these findings. Symonds et al.74 observed a correlation between maternal age and
fat deposition in sheep. This is in part related to an accelerated loss of brown adipose
uncoupling protein 1 in the offspring of adult primiparous mothers after birth, which
may act to increase white adipose tissue deposition in later life.74
Evidence that gravida age is increasing. Gravida age is increasing globally,75,76
rising in mean by 1.4 years in the United Kingdom between 1984 and 199475 and in
median by 2 years in Canada from 1981 to 1987.76 Mean age at first birth increased 2.6
years among U.S. mothers since 1970.77 Given Patterson et al.’s73 finding above, these
increases in maternal age might produce a clinically meaningful ~7% increase in the odds
of obesity.
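The ~7% figure can be reproduced with a short calculation. The sketch below assumes that the 14.4% increase in odds per 5-year increment acts multiplicatively and at a constant rate per year (a log-linear odds model); this is an assumption made here for illustration, and the function name is hypothetical.

```python
# Assumption: the odds ratio per 5-year increment (1.144) scales
# log-linearly, so a fractional increment is a fractional power.

def odds_multiplier(pct_per_step: float, step_years: float, delta_years: float) -> float:
    """Odds multiplier for an age shift of delta_years under a log-linear model."""
    return (1.0 + pct_per_step / 100.0) ** (delta_years / step_years)


# 14.4% per 5 years (Patterson et al.) applied to the 2.6-year U.S. increase.
us_multiplier = odds_multiplier(14.4, 5.0, 2.6)
print(f"Implied increase in odds: {100 * (us_multiplier - 1):.1f}%")
```

A simple linear proration (14.4% × 2.6 / 5) gives a similar value of about 7.5%, so the conclusion does not depend strongly on which scaling is assumed.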
Additional Explanation 8: Intrauterine and Intergenerational Effects
Some influences on obesity may occur in utero or even 2 generations back when
oocytes are formed in the grandmother.78 These may occur partly through epigenetic
(e.g., methylation) events as evidenced by the fact that cloned mice tend to be obese yet
do not pass on this obesity to their offspring.79 Thus, the increases in obesity we see today
may well be due, in part, to environmental changes that affected prior generations.
Obesity, which began increasing at least a century ago,2 may perpetuate its own increase
through a fetally-driven positive feedback loop. Specifically, maternal obesity and
resulting diabetes during gestation and lactation may promote the same conditions in
subsequent generations.80
Animal studies testing the fetal origins hypothesis provide support.81-83 In one
study, offspring from parent rats fed high-fat and low-fat diets were fed a high-fat diet.
Not only were body weight and abdominal adiposity increased in the offspring of high-
fat-fed parents, but the effect remained significant over 3 generations.81,84 Similarly,
overfeeding first generation female pups produced heavier pups as compared with a
control group and effects persisted for 2 subsequent generations.84 In humans, birth
weight positively correlates with adult BMI. However, as Allison et al.85 showed, barring
extreme variations, this association seems to reflect common genetic influences on birth
weight and adult BMI rather than an intrauterine environment that affects both birth
weight and adult obesity. Nevertheless, there may be intrauterine effects on adult BMI
that are not manifested in high birth weight. New evidence suggests that low birth weight
and/or the rapid catch-up growth that often follows it may be a risk factor for later obesity
and its life-shortening sequelae.86 It is then noteworthy that the incidence of low birth
weight in the United States has increased. According to Hamilton et al.,87 low birth
weight increased to 7.8% for 2002, the highest in more than 3 decades; the rate of low
birth weight had declined in the 1970s and early 1980s but has increased since the mid-
1980s. Furthermore, mothers who were themselves low-birth-weight infants are at
increased risk for gestational diabetes,88 which, in turn, places their offspring at increased
obesity risk.89
Thus, it is possible that the extremes of energy imbalance in utero (overfeeding
and low birthweight) may contribute to obesity. We may now be seeing the
transgenerational obesogenic effects of environmental changes initiated one or more
generations ago. Forebodingly, obesity’s prevalence could increase further if children of
the current generation’s overweight or obese parents are thereby predisposed further still.
Additional Explanation 9: Greater BMI is Associated With Greater Reproductive Fitness
Yielding Selection for Obesity-Predisposing Genotypes
Reproductive fitness can be defined as one’s capacity to pass on one’s DNA.
BMI-associated reproductive fitness (viz., natural selection) would increase obesity
prevalence if BMI has a genetic component (i.e., is heritable) and if individuals
genetically predisposed toward higher BMIs reproduce at a higher rate than do
individuals genetically predisposed toward lower BMIs.
Proposition A. BMI has a genetic component. That BMI (or adiposity) has a
heritable component is well supported by animal breeding studies and human twin,
family, and adoption studies90 with an estimated heritability of approximately 65%.91
Proposition B. Individuals with genetic predisposition toward greater adiposity
are reproducing at a higher rate than are individuals with a predisposition toward lesser
adiposity. Number of offspring is positively correlated with BMI among women.91 One
might assume that this is because childbearing or child rearing leads to weight gain.
Although this is plausible, other mechanisms may be contributing to this correlation.
Specifically, mild-to-moderate (but not severe) phenotypic obesity and/or a genotypic
predisposition to obesity may increase fecundity relative to phenotypic thinness and/or a
genetic predisposition to thinness because 1) obesity (at least in women) leads to
socioeconomic falling93 that, in turn, is associated with producing more offspring;94 2)
leanness beyond a certain point impairs fertility in women;95 and 3) other biological,
social, or economic factors may induce a positive correlation between genetic
predisposition to obesity and fecundity. Indeed, evidence shows that the direction of
causation may be from obesity predisposition to fecundity and not only the reverse. First,
although high BMI (> 25) is associated with reduced sperm concentration and total
sperm count, so too is low BMI (< 20), and the reduction is greater among men with low
BMI.96 Moreover, there is an association between parent adiposity and number of offspring
for both fathers and mothers.97 Although this does not rule out that child rearing leads to obesity,
the correlation among fathers obviously cannot be ascribed to the effects of childbearing.
Second, at least one study showed that higher BMI among parents before producing
offspring is associated with subsequent offspring number.97 Finally, animal studies are
supportive: In cattle, calving rate and adiposity have a positive genetic correlation98 and
in male rhesus monkeys, adiposity is positively correlated with siring rate.99
Additional Explanation 10: Assortative Mating and Floor Effects
Assortative mating is a pattern of nonrandom mating that we will use to refer to
positive assortment in which the probability that 2 individuals mate is positively related
to their degree of phenotypic similarity. Assortative mating increases genetic variance in
a population even though it does not affect allele frequencies (it does affect genotype
frequencies). Three propositions imply that assortative mating is contributing to increased
obesity prevalence:100,101 1) human adiposity variations have a genetic component, 2) the
adiposity threshold for defining obesity was historically above the population median,
and 3) humans assortatively mate for adiposity. Moreover, if factors are present that
prevent most people from becoming extremely thin (i.e., floor effects), then the
population distribution of adiposity will become increasingly positively skewed, further
increasing the population mean. The extent of assortative mating does not need to have
increased over time for it to have contributed to increasing prevalence of obesity over
time.
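The role of the threshold in this argument can be illustrated numerically: when the cutoff lies above the center of the distribution, widening the distribution alone raises the proportion above the cutoff, even with the mean held fixed. The sketch below is a minimal illustration under stated assumptions, not an estimate. It assumes a normal distribution of BMI, and the mean (26) and standard deviations (4.0 and 4.5) are hypothetical values chosen only for demonstration. Floor effects and the resulting skew are not modeled.

```python
from statistics import NormalDist

# All numeric inputs below are hypothetical illustration values,
# not estimates from the cited literature.
THRESHOLD = 30.0   # BMI cutoff defining obesity (from the text)
MEAN_BMI = 26.0    # hypothetical population mean, below the threshold
SD_BEFORE = 4.0    # hypothetical spread before assortative mating
SD_AFTER = 4.5     # hypothetical spread after variance has increased


def prevalence_above(threshold: float, mean: float, sd: float) -> float:
    """Proportion of a normal population lying above the threshold."""
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(threshold)


before = prevalence_above(THRESHOLD, MEAN_BMI, SD_BEFORE)
after = prevalence_above(THRESHOLD, MEAN_BMI, SD_AFTER)
print(f"Prevalence with SD {SD_BEFORE}: {before:.3f}")
print(f"Prevalence with SD {SD_AFTER}: {after:.3f}")
```

If the threshold were instead below the mean, the same increase in variance would lower the proportion above it, which is why the second proposition is needed for the argument.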
Evidence that human adiposity variations have a genetic component. This was
discussed in the context of Additional Explanation 9.
The threshold for defining obesity was historically above the population median.
The threshold for defining obesity is currently a BMI of 30. This is above the current and
historical median.1
Humans assortatively mate for adiposity. Extensive research shows that for BMI
and other adiposity indicators, the spousal correlation is small (~0.15) but clearly
statistically significant and cannot be attributed to the effects of cohabitation.102 This
combined evidence strongly suggests that assortative mating has contributed to the
epidemic.100,101 Finally, there are clear floor effects on BMI103 that have likely
accentuated these effects.
Putting It All Together – Interconnections
Having laid out several of these possible contributing factors, it is interesting to
consider what their relative importance may be and whether there are interconnections
among these putative causal variables. Relative importance can be judged in multiple
ways. For example, one could judge importance in
terms of the amount of variance in BMI explained, the magnitude of the mean increase in
BMI, a population attributable fraction, or some other measure of effect. Unfortunately,
we do not believe we are currently at the point where we can confidently say what the
effect size metrics are for each of these putatively causal variables and therefore cannot
confidently evaluate their relative importance on these metrics. Another way to consider
the importance of variables is their potential modifiability. It is unlikely that anyone
would suggest that we should have more people take up smoking as a way of controlling
body weight. Therefore, further consideration of the effects of smoking cessation on
population increases in BMI may be less important than consideration of other factors
that we might be more willing or able to modify. In this regard, factors such as sleep
reduction and increased use of heating and air conditioning might be things that are easily
modifiable and for which modifications in the direction that would hypothetically reduce
obesity levels would also have added benefits (e.g., a healthier and more alert population
and less use of fossil fuels). Thus, these types of putative contributing factors may be
more important in terms of meriting more attention.
It is also noteworthy that there may be interconnections among these putative
contributing factors. For example, Additional Explanation 6 specifies that the average
age of the US adult population has increased relative to the average age of that population
several decades ago. Even if the rates of reproduction within any age category remain
constant, this would not only result in an older adult population who are more likely to be
obese solely by virtue of their own age, but would also result in increasing gravida age on
average (Additional Explanation 7) which may lead to more obesity among offspring.
Moreover, the greater obesity among the parental generation, due in part to increasing
age, may also predispose to greater obesity among the offspring generation as articulated
in Additional Explanation 8. Similarly, it is possible that the effects of assortative
mating, as discussed in Additional Explanation 10, may be accentuated by all other
factors. That is, it is possible that the influence of assortative mating is quite modest
when most people lie within some intermediate range of BMI with very few people being
severely obese. However, as larger proportions of the population become severely obese
as a result of the influence of other factors, it may be that there is a greater pattern of
intermating among these severely obese individuals which may then further accelerate
the increase in obesity levels in subsequent generations. There may yet be additional
connections among these factors that remain to be explored.
Discussion
The evidence for the putative roles of the 10 Additional Explanations in the
epidemic is compelling and in most cases consists of the concurrence of ecological
correlations, epidemiologic study results, model organism studies, and strong theoretical
models or plausible mechanisms of action. Nevertheless, we do not claim that all of the
Additional Explanations definitively are contributors, only that they are as plausibly so as
are the Big Two and deserve more attention and study.
Although the effect of any one factor may be small, the combined effects may be
consequential. Moreover, the Additional Explanations we consider do not exhaust the
possibilities. Other factors potentially involved in the epidemic with varying degrees of
evidential support include an epidemic of adenovirus-36,104 increases in childhood
depression,105 less calcium (or dairy) consumption,106 and hormones in agricultural
species.107 In trying to reduce obesity levels, we consider only factors that have changed
over time and potentially contributed to the epidemic. Other factors such as shift
work108,109 and not breastfeeding110 can contribute to obesity and decreasing them may
alleviate the epidemic even though they may not have contributed to it, because their
rates have not increased in the past 30 years.111,112 Of course, as we consider any
environmental factor it is important to remain cognizant that such factors act in concert
with individual genetic susceptibilities.113
Bray and Champagne114 have recently published a review of five environmental
agents that they found disturb energy balance and cause obesity in susceptible hosts.
While they offer three available strategies for combating the epidemic (nutrition
education, regulation of serving size and food labels, and modification to the food
system), their suggested measures target the Big Two and not the drugs, chemicals,
viruses, or toxins that they have implicated as contributing factors. If the Additional
Explanations we have offered are probable contributors to the epidemic as we believe,
then additional research is warranted to evaluate how much they actually contribute, their
mechanisms of action, their interaction effects, and how they may be countermanded.
While we are not suggesting in this paper that one discount the potential effects of the
Big Two, if the Additional Explanations are valid, the expectations for the likely public
health impact of programs that only target the Big Two might be tempered. Public health
practitioners and clinicians may need to address a broader range of influential factors to
more adequately address the epidemic.
Acknowledgements
Each author contributed to writing one or more sections of the manuscript and
each author edited the entire manuscript. We gratefully acknowledge Richard Forshee,
Ph.D. of the Center for Food and Nutrition Policy at Virginia Polytechnic Institute and
State University for his suggestions. This research was supported in part by NIH grant
P30DK056336. This funding source had no involvement in the writing of or the decision
to submit this paper.
References
1 Hedley AA, Ogden CL, Johnson CL, Carroll MD, Curtin LR, Flegal KM. Prevalence
of overweight and obesity among US children, adolescents, and adults, 1999-2002.
JAMA 2004;291:2847-2850.
2 Heimburger DC, Allison DB, Goran MI, et al. A festschrift for Roland L. Weinsier:
nutrition scientist, educator, and clinician. Obes Res 2003;11:1246-1262.
3 Swinburn B, Egger G. The runaway weight gain train: too many accelerators, not
enough brakes. BMJ 2004;329:736-769.
4 Gabbard C. The need for quality physical education. J Sch Nurs 2001;17:73-75.
5 Sothern MS. Obesity prevention in children: physical activity and nutrition. Nutrition
2004;20:704-708.
6 Matthiessen J, Fagt S, Biltoft-Jensen A, Beck AM, Ovesen L. Size makes a
difference. Public Health Nutr 2003;6:65-72.
7 Ebbeling CB, Sinclair KB, Pereira MA, Garcia-Lago E, Feldman HA, Ludwig DS.
Compensation for energy intake from fast food among overweight and lean adolescents.
JAMA 2004;291:2828-2833.
8 Rogers JH. Living on the fat of the land: How to have your burger and sue it too.
Washington Univ Law Q 2003;81:859-884.
9 Bray GA. The epidemic of obesity and changes in food intake: the fluoride
hypothesis. Physiol Behav 2004;82:115-121.
10 Morgenstern H. Ecologic studies in epidemiology: concepts, principles, and methods.
Annu Rev Public Health 1995;16:61-81.
11 U.S. Food and Drug Administration. Redbook 2000: Toxicological Principles for the
Safety Assessment of Food Ingredients. Available at:
http://vm.cfsan.fda.gov/~redbook/red-vib.html. Accessed March 3, 2005.
12 Caballero B, Clay T, Davis SM, Ethelbah B, Rock BH, Lohman T, Norman J, Story
M, Stone EJ, Stephenson L, Stevens J; Pathways Study Research Group. Pathways: a
school-based, randomized controlled trial for the prevention of obesity in American
Indian schoolchildren. Am J Clin Nutr 2003;78:1030-1038.
13 Burdette HL, Whitaker RC. Neighborhood playgrounds, fast food restaurants, and
crime: relationships to overweight in low-income preschool children. Prev Med
2004;38:57-63.
14 Berkey CS, Rockett HR, Field AE, Gillman MW, Colditz GA. Sugar-added
beverages and adolescent weight change. Obes Res 2004;12:778-788.
15 Field AE, Austin SB, Gillman MW, Rosner B, Rockett HR, Colditz GA. Snack food
intake does not predict weight change among children and adolescents. Int J Obes Relat
Metab Disord 2004;28:1210-1216.
16 Ludwig DS, Peterson KE, Gortmaker SL. Relation between consumption of sugar-
sweetened drinks and childhood obesity: a prospective, observational analysis. Lancet
2001;357:505-508.
17 Newby PK, Peterson KE, Berkey CS, Leppert J, Willett WC, Colditz GA. Beverage
consumption is not associated with changes in weight and body mass index among low-
income preschool children in North Dakota. J Am Diet Assoc 2004;104:1086-1094.
18 Marshall SJ, Biddle SJ, Gorely T, Cameron N, Murdey I. Relationships between
media use, body fatness and physical activity in children and youth: a meta-analysis. Int J
Obes Relat Metab Disord 2004;28:1238-1246.
19 Rolls BJ. The supersizing of America: portion size and the obesity epidemic. Nutr
Today 2003;38:42-53.
20 von Kries R, Toschke AM, Wurmser H, Sauerwald T, Koletzko B. Reduced risk for
overweight and obesity in 5- and 6-y-old children by duration of sleep—a cross-sectional
study. Int J Obes Relat Metab Disord 2002;26:710-716.
21 Gangwisch J, Heymsfield S. Sleep deprivation as a risk factor for obesity: results
based on the NHANES I. North American Association for the Study of Obesity
(NAASO) 2004;Abstract no. 42-OR:A11.
22 Everson CA. Functional consequences of sustained sleep deprivation in the rat. Behav
Brain Res 1995;69:43-54.
23 Spiegel K, Leproult R, Van Cauter E. Impact of sleep debt on metabolic and
endocrine function. Lancet 1999;354:1435-1439.
24 Spiegel K, Leproult R, L'hermite-Baleriaux M, Copinschi G, Penev PD, Van Cauter
E. Leptin levels are dependent on sleep duration: relationships with sympathovagal
balance, carbohydrate regulation, cortisol, and thyrotropin. J Clin Endocrinol Metab
2004;89:5762-5771.
25 Taheri S, Lin L, Austin D, Young T, Mignot E. Short sleep duration is associated
with reduced leptin, elevated ghrelin, and increased body mass index. PLoS Med
2004;1:e62.
26 Spiegel K, Tasali E, Penev P, Van Cauter E. Brief communication: Sleep curtailment
in healthy young men is associated with decreased leptin levels, elevated ghrelin levels,
and increased hunger and appetite. Ann Intern Med 2004;141:846-850.
27 Bonnet MH, Arand DL. We are chronically sleep deprived. Sleep 1995;18:908-911.
28 Iglowstein I, Jenni OG, Molinari L, Largo RH. Sleep duration from infancy to
adolescence: reference values and generational trends. Pediatrics 2003;111:302-307.
29 Bitz C, Harder H, Astrup A. A paradoxical diurnal movement pattern in obese
subjects with type 2 diabetes: a contributor to impaired appetite and glycemic control?
Diabetes Care 2005;28:2040-2041.
30 Resta O, Foschino BMP, Bonfitto P, Giliberti T, Depalo A, Pannacciulli N,
De Pergola G. Low sleep quality and daytime sleepiness in obese patients without
obstructive sleep apnoea syndrome. J Intern Med 2003;253:536-543.
31 Wade GN, Gray JM, Bartness TJ. Gonadal influences on adiposity. Int J Obes
1985;9(suppl 1):83-92.
32 Haarbo J, Marslew U, Gotfredsen A, Christiansen C. Postmenopausal hormone
replacement therapy prevents central fat distribution. Metabolism 1991;40:1323-1326.
33 Heine PA, Taylor JA, Iwamoto GA, Lubahn DB, Cooke PS. Increased adipose tissue
in male and female estrogen receptor-alpha knockout mice. Proc Natl Acad Sci USA
2000;97:12729-12734.
34 Kanayama T, Kobayashi N, Mamiya S, Nakanishi T, Nishikawa J. Organotin
compounds promote adipocyte differentiation as agonists of the peroxisome proliferator-
activated receptor γ/ retinoid X receptor pathway. Mol Pharmacol 2005;67:766-774.
35 Uzumcu M, Suzuki H, Skinner M. Effect of the anti-androgenic endocrine disruptor
vinclozolin on embryonic testis cord formation and postnatal testis development and
function. Reprod Toxicol 2004;18:765-774.
36 Sohoni P, Sumpter JP. Several environmental oestrogens are also anti-androgens. J
Endocrinol 1998;158:327-339.
37 Woodhouse AJ, Cooke GM. Suppression of aromatase activity in vitro by PCBs 28
and 105 and Aroclor 1221. Toxicol Lett 2004;152:91-100.
38 Pelletier C, Imbeault P, Tremblay A. Energy balance and pollution by
organochlorines and polychlorinated biphenyls. Obes Rev 2003;4:17-24.
39 Noren K, Meironyte D. Certain organochlorine and organobromine contaminants in
Swedish human milk in perspective of past 20-30 years. Chemosphere 2000;40:1111-
1123.
40 Nilsson R. Endocrine modulators in the food chain and environment. Toxicol Pathol
2000;28:420-431.
41 Westerterp-Plantenga MS, van Marken Lichtenbelt WD, Cilissen C, Top S. Energy
metabolism in women during short exposure to the thermoneutral zone. Physiol Behav
2002;75:227-235.
42 Saxton C. Effects of severe heat stress on respiration and metabolic rate in resting
man. Aviat Space Environ Med 1981;52:281-286.
43 Mader TL. Environmental stress in confined beef cattle. J Anim Sci 2003;81:E110-
E119.
44 Collin A, van Milgen J, Dubois S, Noblet J. Effect of high temperature on feeding
behaviour and heat production in group-housed young pigs. Br J Nutr 2001;86:63-70.
45 Herman CP. Effects of heat on appetite. In: Marriott BM, ed. Nutritional Needs in
Hot Environments: Applications for Military Personnel in Field Operations. Washington,
DC: National Academy Press, 1993:187-214.
46 EHCS 2000. Housing Research Summary: English House Condition Survey 1996:
Energy Report (No. 120). Office of the Deputy Prime Minister, The Stationery Office,
UK.
47 Understanding comfort, behavior, and productivity. Available at:
http://www.esource.com/public/pdf/Heating.pdf. Accessed March 3, 2005.
48 E Source space heating atlas. Available at:
http://www.esource.com/public/products/atlas_heating.asp. Accessed March 3, 2005.
49 Type of air-conditioning equipment by census region and survey year. Available at:
http://www.eia.doe.gov/emeu/consumptionbriefs/recs/actrends/recs_ac_trends_table2.html.
Accessed March 3, 2005.
50 Flegal KM, Troiano RP, Pamuk ER, Kuczmarski RJ, Campbell SM. The influence of
smoking cessation on the prevalence of overweight in the United States. N Engl J Med
1995;333:1165-1170.
51 Filozof C, Fernandez Pinilla MC, Fernandez-Cruz A. Smoking cessation and weight
gain. Obes Rev 2004;5:95-103.
52 Jessen AB, Buemann B, Toubro S, Skovgaard IM, Astrup A. The appetite-
suppressant effect of nicotine is enhanced by caffeine. Diab Obes Metab 2005;7:327-333.
53 Centers for Disease Control and Prevention. Cigarette smoking among adults—
United States, 2002. MMWR 2004;53:427-431.
54 Fava M. Weight gain and antidepressants. J Clin Psychiatry 2000;61(suppl 11):37-
41.
55 Garland EJ, Remick RA, Zis AP. Weight gain with antidepressants and lithium. J
Clin Psychopharmacol 1988;8:323-330.
56 Sussman N, Ginsberg DL, Bikoff J. Effects of nefazodone on body weight: a pooled
analysis of selective serotonin reuptake inhibitor- and imipramine-controlled trials. J Clin
Psychiatry 2001;62:256-260.
57 Allison DB, Mentore JL, Heo M, et al. Antipsychotic-induced weight gain: a
comprehensive research synthesis. Am J Psychiatry 1999;156:1686-1696.
58 Allison DB, Casey DE. Antipsychotic-induced weight gain: a review of the literature.
J Clin Psychiatry 2001;62(suppl 7):22-31.
59 Cope MB, Nagy TR, Fernandez JR, Geary N, Casey DE, Allison DB. Antipsychotic
drug–induced weight gain: development of an animal model. Int J Obes Relat Metab
Disord 2005;29:607-14.
60 Fonseca V. Effect of thiazolidinediones on body weight in patients with diabetes
mellitus. Am J Med 2003;115(suppl 8A):42S-48S.
61 Sharma AM, Pischon T, Hardt S, et al. Hypothesis: beta-adrenergic receptor blockers
and weight gain. A systematic analysis. Hypertension 2001;37:250-254.
62 Espey E, Steinhart J, Ogburn T, Qualls C. Depo-provera associated with weight gain
in Navajo women. Contraception 2000;62:55-58.
63 Aronne LJ. Drug-induced weight gain: non-CNS medications. In: Aronne LJ, ed. A
Practical Guide to Drug-induced Weight Gain. Minneapolis: McGraw-Hill, 2002:77-91.
64 Stricker RB, Goldberg B. Weight gain associated with protease inhibitor therapy in
HIV-infected patients. Res Virol 1998;149(2):123-126.
65 Daumit GL, Crum RM, Guallar E, Rowe RN, Primm AB, Steinwachs EM, Ford DE.
Outpatient prescriptions for atypical antipsychotics for African Americans, Hispanics and
Whites in the United States. JAMA 2003;60:121-128.
66 Hermann RC, Yang D, Ettner SL, Marcus SC, Yoon C, Abraham M. Prescription of
antipsychotic drugs by office-based physicians in the United States, 1989-1997. Psychiatr
Serv 2002;53:425-430.
67 Wysowski DK, Armstrong G, Governale L. Rapid increase in the use of oral
antidiabetic drugs in the United States, 1990-2001. Diabetes Care 2003;26:1852-1855.
68 Citrome L, Jaffe A, Levine J, Allingham B. Use of mood stabilizers among patients
with schizophrenia, 1994-2001. Psychiatr Serv 2002;53:1212.
69 Psaty BM, Manolio TA, Smith NL, et al. Time trends in high blood pressure control
and use of antihypertensive medications in older adults. Arch Intern Med 2002;162:2325-
2332.
70 Race and Hispanic origin 1790 to 1990. Available at:
http://www.census.gov/population/documentation/twps0056/tab01.pdf. Accessed March
15, 2005.
71 The Hispanic population 2000. Available at:
http://www.census.gov/prod/2001pubs/c2kbr01-3.pdf. Accessed March 15, 2005.
72 Wilkinson PW, Parkin JM, Pearlson J, Philips PR, Sykes P. Obesity in childhood: a
community study in Newcastle upon Tyne. Lancet 1977;1:350-352.
73 Patterson ML, Stern S, Crawford PB, et al. Sociodemographic factors and obesity in
preadolescent black and white girls: NHLBI's Growth and Health Study. J Natl Med
Assoc 1997;89:594-600.
74 Symonds ME, Pearce S, Bispham J, Gardner DS, Stephenson T. Timing of nutrient
restriction and programming of fetal adipose tissue development. Proc Nutr Soc
2004;63:397-403.
75 Armitage B, Babb P. Population review: (4). Trends in fertility. Popul Trends
1996;84:7-13.
76 Wadhera S. Trends in birth and fertility rates, Canada, 1921-1987. Health Rep
1989;1(2):211-223.
77 Mathews TJ, Hamilton BE. Mean age of mother, 1970-2000. Natl Vital Stat Rep
2002;51:1-13.
78 Finch CE, Loehlin JC. Environmental influences that may precede fertilization: a first
examination of the prezygotic hypothesis from maternal age influences on twins. Behav
Genet 1998;28(2):101-106.
79 Inui A. Obesity—a chronic health problem in cloned mice? Trends Pharmacol Sci
2003;24(2):77-80.
80 Levin B, Govek E. Gestational obesity accentuates obesity in obesity-prone progeny.
Am J Physiol 1998;275:R1374-R1379.
81 Wu Q, Mizushima Y, Komiya M, Matsuo T, Suzuki M. Body fat accumulation in the
male offspring of rats fed high-fat diet. J Clin Biochem Nutr 1998;25:71-79.
82 Wu Q, Mizushima Y, Komiya M, Matsuo T, Suzuki M. The effects of high-fat diet
feeding over generations on body fat accumulation with lipoprotein lipase and leptin in
rat adipose tissues. Asia Pacific J Clin Nutr 1999;8:46-52.
83 Lim K, Shimomura Y, Suzuki M. Effect of high-fat diet feeding over generations on
body fat accumulation. Japan Sci Soc Press 1991; 181-190.
84 Diaz J, Taylor EM. Abnormally high nourishment during sensitive periods results in
body weight changes across generations. Obes Res 1998;6:368-374.
85 Allison DB, Paultre F, Heymsfield SB, Pi-Sunyer FX. Is the intra-uterine period
really a critical period for the development of adiposity? Int J Obes Relat Metab Disord
1995;19:397-402.
86 Ozanne SE, Hales CN. Lifespan: catch-up growth and obesity in male mice. Nature
2004;427:411-412.
87 Hamilton BE, Martin JA, Sutton PD. Births: preliminary data for 2003. Natl Vital
Stat Rep 2004;53(9):1-17.
88 Bo S, Marchisio B, Volpiano M, Menato G, Pagano G. Maternal low birth weight and
gestational hyperglycemia. Gynecol Endocrinol 2003;17(2):133-136.
89 Silverman BL, Rizzo TA, Cho NH, Metzger BE. Long-term effects of the intrauterine
environment. The Northwestern University Diabetes in Pregnancy Center. Diabetes Care
1998;21:B142-B149.
90 Allison DB, Pietrobelli A, Faith MS, Fontaine KR, Gropp E, Fernández JR.
Genetic influences on obesity. In: Eckel R, ed. Obesity: Mechanisms & Clinical
Management. New York: Elsevier, 2003:31-74.
91 Segal NL, Allison DB. Twins and virtual twins: bases of relative body weight
revisited. Int J Obes Relat Metab Disord 2002;26:437-441.
92 Weng HH, Bastion LA, Taylor DH, Moser BK, Ostbye T. Number of
children associated with obesity in middle-aged women and men: Results from the Health
and Retirement Study. J Womens Health 2004;13:85-91.
93 Lipowicz A. Effect of husbands' education on fatness of wives. Am J Hum Biol
2003;15:1-7.
94 Salihu HM, Kinniburgh, Aliyu MH, Kirby RS, Alexander GR. Racial disparity in
stillbirth among singleton, twin and triplet gestations in the United States. Obstet Gynecol
2004;104:734-740.
95 Frisch RE. Body fat, menarche, fitness and fertility. Hum Reprod 1987;2:521-533.
96 Jensen TK, Andersson AM, Jorgensen N et al. Body mass index in relation to semen
quality and reproductive hormones among 1,558 Danish men. Fertil Steril 2004;82:863-
70.
97 Ellis L, Haman D. Population increases in obesity appear to be partly due to genetics.
J Biosoc Sci 2004;36:547-559.
98 Splan RK, Cundiff LV, Van Vleck LD. Genetic correlations between male carcass
and female growth and reproductive traits in beef cattle. Available at: http://elib.tiho-
hannover.de/publications/6wcgalp/papers/23274.pdf. Accessed March 4, 2005.
99 Bercovitch FB, Nurnberg P. Socioendocrine and morphological correlates of
paternity in rhesus macaques (Macaca mulatta). J Reprod Fertil 1996;107:59-68.
100 Hebebrand J, Wulftange H, Goerg T, et al. Epidemic obesity: are genetic factors
involved via increased rates of assortative mating? Int J Obes Relat Metab Disord
2000;24:345-353.
101 Katzmarzyk PT, Hebebrand J, Bouchard C. Spousal resemblance in the Canadian
population: implications for the obesity epidemic. Int J Obes Relat Metab Disord
2002;26:241-246.
102 Katzmarzyk PT, Perusse L, Rao DC, Bouchard C. Spousal resemblance and risk of 7-
year increases in obesity and central adiposity in the Canadian population. Obes Res
1999;7:545-551.
103 Henry CJK. Variability in adult body size: uses in defining the limits of human
survival. In: Ulijaszek SJ, Mascie-Taylor CGN, eds. Anthropometry: The Individual and
the Population. New York: Cambridge University Press, 1994:117-129.
104 Atkinson RL, Dhurandhar NV, Allison DB, et al. Human adenovirus-36 is associated
with increased body weight and paradoxical reduction of serum lipids. Int J Obes
2005;29:281-6.
105 Pine DS, Goldstein RB, Wolk S, Weissman MM. The association between childhood
depression and adulthood body mass index. Pediatrics 2001;107:1049-1056.
106 Zemel MB, Thompson W, Milstead A, Morris K, Campbell P. Calcium and dairy
acceleration of weight and fat loss during energy restriction in obese adults. Obes Res
2004;12:582-590.
107 Mayfield R. Hormones in meat—what you should know! News from Dr.
Robin. You Can Feel Good!, No. 6, April 22, 2003. Available at:
http://www.drrobinmayfield.com/newsletters/newsletter-6.html. Accessed March 4, 2005.
108 Di Lorenzo L, De Pergola G, Zocchetti C, et al. Effect of shift work on body mass
index: results of a study performed in 319 glucose-tolerant men working in a Southern
Italian industry. Int J Obes Relat Metab Disord 2003;27:1353-1358.
109 Kivimaeki M, Kuisma P, Virtanen M, Elovainio M. Does shift work lead to poorer
health habits? A comparison between women who had always done shift work with those
who had never done shift work. Work-and-Stress 2001;15:3-13.
110 Arenz S, Ruckerl R, Koletzko B, von Kries R. Breast-feeding and childhood obesity -
a systematic review. Int J Obes 2004;28:1247-1256.
111 Hamermesh DS. The Timing of Work Over Time. Economic J 1999;109. Available
at: http://www.res.org.uk/journals/abstracts.asp?ref=0013-0133&vid=109&iid=452&aid=390.
Accessed March 4, 2005.
112 Breastfeeding by mothers 15-44 years of age by year of baby’s birth, according to
selected characteristics of mother: United States, average annual 1972-74 to 1993-94.
Available at: http://www.cdc.gov/nchs/data/hus/tables/2003/03hus018.pdf. Accessed
March 4, 2005.
113 Friedman JM. A war on obesity, not the obese. Science 2003;299:856-8.
114 Bray GA, Champagne CM. Beyond energy balance: there is more to obesity than
kilocalories. J Am Diet Assoc 2005;105(5 Suppl 1):S17-23.
115 Middleton N, Gunnell D, Whitley E, Dorling D, Frankel S. Secular trends in
antidepressant prescribing in the UK, 1975-1998. J Public Health Med 2001;23:262-267.
Figure 1. Secular changes in a number of key indicators of factors that may be related to
the increase in obesity. These indicators include: mean age of US mothers at first birth;77
antidepressant prescribing in the UK;115 prevalence of AC—the percentage of US
households equipped with air conditioning;49 UK average internal home temperature—
average internal home temperature;46 PBDE concentration—the concentration of
polybrominated diphenyl ethers in the breast milk of Swedish women from 1972 to
1978;39 proportion of US adult population that is Hispanic and/or between 35 and 55
years of age;71 time spent awake;27,28 non-smoker prevalence;50,53 adult obesity
prevalence, U.S. adults only, BMI ≥ 30 indicates obesity.1
A FREE-KNOT SPLINE MODELING FRAMEWORK FOR PIECEWISE LINEAR
LOGISTIC REGRESSION IN COMPLEX SAMPLES
by
SCOTT W. KEITH, DAVID B. ALLISON
In preparation for Statistics in Medicine
Format adapted for dissertation
Summary
This paper details the design, evaluation, and implementation of a framework for
modeling nonlinearity between a binary outcome and a continuous prognostic variable
adjusted for covariates in complex samples. The primary objective of this methodology is
to analyze non-random survey samples by applying sophisticated modeling techniques
capable of detecting nonlinearity and adjusting model flexibility. Providing familiar-
looking parameterizations of output, such as linear slope coefficients and odds ratios, is
the secondary objective. Estimation methods include least squares or maximum
likelihood optimization of piecewise linear free-knot splines formulated as truncated
power bases or B-splines. Correctly specifying the optimal number and positions of the
knots improves the approximating power of a spline model, but has been marked by
computational intensity and numerical instability. Inference methods utilize both
parametric and nonparametric bootstrapping. Unlike other nonlinear modeling packages,
this framework accounts for multistage cross-sectional survey sample designs common to
nationally representative datasets. We conducted a simulation study of our novel
procedure for specifying the optimal number of knots. Under the conditions we
simulated, our method was commonly more accurate than Schwarz’s Bayesian
Information Criterion (BIC) and very similar to Akaike’s Information Criterion (AIC) in
terms of accuracy and precision as long as sample sizes were large. AIC and BIC were
not effective model selection methods when complex sampling weights were
incorporated into the likelihood functions.
Keywords: Free-knot splines, nonlinear logistic regression, bootstrap, complex samples,
body mass index.
Introduction
Large cross-sectional epidemiological health surveys are powerful sources of
observational information for investigating health outcomes as they relate to potentially
predictive or confounding factors. Appropriately analyzing the data from many of these
surveys, such as the National Health and Nutrition Examination Survey (NHANES) and
the National Health Interview Survey (NHIS), is not straightforward as their participants
are not selected by simple random sampling (SRS). Conducting an SRS of a large,
diverse population would be exorbitantly expensive. Instead, the survey designers plan
the sampling of groups of individuals in multiple stages with oversampling of certain
demographic or geographic clusters to collect a complex sample which represents the
population more efficiently than by SRS. There is a drawback in the statistical analysis of
these survey samples. The observations should not be considered independent and
identically distributed (iid) and therefore traditional statistical methods for modeling and
hypothesis testing must be adjusted to account for the correlation induced by the survey
sampling design (Korn and Graubard, 1999; U.S. DHHS NHANES III Analytic and
Reporting Guidelines, 1996).
Specialized software packages, such as SUDAAN or WesVar, have been
designed for conducting many types of statistical analyses on complex samples.
However, such software is not currently available for free-knot spline modeling.
Bessaoud and colleagues (2005) have pointed out the utility of these models for
effectively representing nonlinear associations between continuous predictors and a
binary outcome. Interestingly, they also describe how certain free parameters in their
models can be interpreted as thresholds for distinguishing groups with differing risk
relationships.
Free-knot spline modeling methodology could be very useful in providing an
alternative to traditional quantitative epidemiological methods for characterizing
nonlinear risk relationships (i.e., categorizing the continuous predictor or using
polynomials). A free-knot spline may be loosely described as a nonlinear regression
characterized by piecewise polynomials of order m joined at locations called knots where
the adjoining segments typically agree at their (m-2)th derivative and both the number and
locations of the knots are free parameters estimated along with other model parameters
(de Boor, 1978).
We propose in this paper a free-knot spline framework for conducting piecewise
linear logistic regression in complex multi-stage cross-sectional survey samples using B-
splines and bootstrapping with a focus on likelihood function maximization for model
computation. Piecewise linear representations of parameter estimates and odds ratios are
output for expressing results in a familiar-looking format. Simulation study results will
demonstrate the performance of our procedure for specifying the optimal number of
knots.
Free-Knot Splines: Nonlinear Modeling and Parameter Estimation
What is Available for Nonlinear Modeling
It appears that the literature regarding innovation in nonlinear modeling and
smoothing methods in recent years has been focused in several areas: penalized splines
with fixed knots (P-splines) (e.g., Eilers and Marx, 1996; Ruppert et al., 2003);
multivariate adaptive regression splines (MARS) (Friedman, 1991); incorporating splines
into logistic regression (e.g., Bessaoud et al., 2005; Johnson, 2007) and survival analysis
(e.g., Kooperberg et al., 1995; Gray, 1996; Gray, 1994; Rosenberg, 1995; Molinari et al.,
2001); Bayesian methods that utilize Reversible Jump Markov Chain Monte Carlo such
as penalized free-knot splines (e.g., Lindstrom, 2002) and Bayesian Adaptive Regression
Splines (BARS) (DiMatteo et al., 2001); applying mixed models to smoothing (e.g.,
Ruppert et al., 2003; Wand and Pearce, 2006); generalized additive models (GAM)
(Hastie and Tibshirani, 1986; 1990; Wood, 2006); and free-knot splines (e.g., Lindstrom,
1999; Stone et al., 1997). Some researchers are also applying bootstrapping methods to
spline estimation (Kauermann et al., 2006; Bessaoud et al., 2005; Molinari et al., 2001).
Research in spline methodology continues to be popular as new methods, particularly
those utilizing the increased computing power of today’s technology, are increasingly
important for summarizing information and drawing inferences from data sources that are
growing in number and complexity.
Features of P-splines and GAM. P-splines and GAM are perhaps the most popular
modeling tools available for modeling a binary outcome as a nonlinear function of one or
more continuous predictor variables. P-splines were introduced by Eilers and Marx
(1996) as a semiparametric method for analyzing nonlinear relationships by fitting B-
splines with a relatively large number of fixed knot locations and difference penalties on
adjacent B-spline coefficients. The penalties imposed, often based on Akaike’s
Information Criterion (AIC) (Akaike, 1974), Schwarz’s Bayesian Information Criterion
(BIC) (Schwarz, 1978), cross validation (CV) (e.g., Eilers and Marx, 1996), or
generalized cross validation (GCV) (e.g., Wahba, 1990), adjust the smoothness of the
fitted function. Methods have also been developed to select the number of knots for P-
spline models by fitting a dense map of knots and iteratively adding and removing knots
(Ruppert, 2002).
GAMs were introduced by Hastie and Tibshirani (1986) as a way of additively
relating the mean of a response (outcome) variable to a set of linear predictors in addition
to a set of smoothed predictors. Any GAM reduces to a generalized linear model (GLM)
(Nelder and Wedderburn, 1972; McCullagh and Nelder, 1989) by “zeroing” or
“shrinking” spline parameters. CV or GCV methods are commonly used in GAM to
optimize smoothing parameters by balancing residual and prediction errors. Hence, they
control the dilemma surrounding over- or under-fitting the data. This has been referred to
as the “bias-variance trade-off.” Unpenalized nonlinear modeling based on minimizing
residual sums of squares results in over-fitting to the degree that the model interpolates
the data points themselves. Over-penalizing the nonlinear model will result in excessive
residual error variance and an under-fitted model. For a thorough general discussion on
these issues, see Hastie et al. (2001).
What Free-knot Splines Offer and Why They are Used
Specialized statistical modeling tools are called for in clinical and epidemiological
settings for constructing useful models under circumstances of nonlinearity, non-
normality, and heteroscedasticity which represent departures from GLM assumptions
(Korn and Graubard, 1999). Of particular interest are those models with localized
estimation, such as free-knot splines (Lindstrom, 1999; 2002), which can limit the
influence of observations to particular regions of the fitted model. This property may lead
to a better characterization of associations in the tails of the predictor and response joint
distribution where small proportions of perhaps the most interesting observations exist.
Although there are many nonlinear modeling tools available, free-knot splines offer these
as well as other features, which make them appealing for the applications in health survey
research.
Our nonlinear framework will utilize piecewise linear free-knot splines to build
an additive model of an outcome (or a function of an outcome) as a nonlinear function of
a continuous predictor. The knots will be estimated as free parameters along with other
linear continuous or categorical covariate parameters. Estimating the optimal number and
positions of the knots improves the approximating power (Burchard, 1974) of the model,
but has been marked by computational intensity and numerical instability. Free-knot
splines are very sensitive to local maxima in either the likelihood or residual sums of
squares surfaces. Effort has been made to mitigate these ailments by the introduction and
evaluation of B-splines (de Boor, 1978) and penalties for coalescent knots (Lindstrom,
1999). When free knots coalesce or overlap, the result is poor computational performance
described by Jupp (1978) as the “lethargy property” of free-knot splines.
In our research, we will be restricting our method to nonlinearity between an
outcome and only one independent variable. This will allow us to avoid the “curse of
dimensionality” (Bellman, 1957; Hastie and Tibshirani, 1986) which may be described as
the problem of extremely rapid increases in data sparseness as the dimension of the
nonlinear multivariate space increases. Regardless, the optimization of even one
nonlinear relationship via a free-knot spline has proven to be a difficult task in large
datasets. If the computational demands and numerical instability associated with free-
knot splines can be overcome, the free-knot models may have great potential for optimal
fit to observed data.
A key feature of the framework is that the splines may be represented
algebraically and interpreted according to their piecewise polynomial segments which
gives the output from these models a familiar appearance to researchers accustomed to
interpreting GLM results. This is an important aspect as the framework is intended to be
accessible and attractive for use by epidemiologists and other quantitative researchers.
Interpretability of the Knots can be Biologically or Clinically Useful
Effectively estimating both the numbers and locations of the knots tends to
produce a simpler, low dimensional analytic function than fixing either the number or
locations (or both) of the knots a priori. This is appealing from the perspective of
maximizing the parsimony of the fitted model, but can also provide an interesting
interpretation for the knots. Assuming that the true model has the same order and number
of knots as that estimated, then the model can be considered parametric. Bessaoud et al.
(2005) and Molinari et al. (2001) use this to their advantage by interpreting the knots in
their free-knot spline models as cut-points in a risk relationship that define thresholds
between groups with differential patterns of association with the outcome of interest. We
suggest that this may be inappropriate for cubic, perhaps even quadratic, free-knot splines
as the ability to correctly specify the true model seems to decrease with increasing order.
However, in situations where the aforementioned assumption holds sufficiently well, this
interpretation of the knots can yield compelling biological or clinical insights.
Basis Functions
A spline is constructed from basis functions. A basis function is an element of the
basis for a function space. Each function in a given function space can be expressed as a
linear combination of its basis functions. For example, the class of cubic polynomials
with real-valued coefficients has a basis consisting of $\{1, x, x^2, x^3\}$. Every cubic function
can be written as a linear combination of this basis (i.e., $a \cdot 1 + bx + cx^2 + dx^3$). Basis function
expansions must be explicitly specified in order to calculate free-knot splines. We will
consider two possible bases for our framework: the truncated power basis (Ruppert et al.,
2003) and the B-spline basis (de Boor, 1978).
Truncated power basis. The truncated power basis of order m (Ruppert et al.,
2003) can be expressed as,
\[
g(\mu) = f(x) = \beta_0 + \beta_1 x + \dots + \beta_p x^p + \sum_{i=1}^{K} b_{pi}\,(x - \zeta_i)_+^p, \tag{1}
\]
where some function of an average response, $g(\mu)$, is a nonlinear predictive function of an
independent variable, $f(x)$; $\zeta_i$ is the ith of K knots such that $\zeta_1 \le \zeta_2 \le \dots \le \zeta_K$; and
$u_+ = \max(u, 0)$. Here, we limit our scope to the piecewise linear truncated power basis (order m
= p+1 = 2),
\[
g(\mu) = f(x) = \beta_0 + \beta_1 x + \sum_{i=1}^{K} b_i\,(x - \zeta_i)_+. \tag{2}
\]
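As an illustration only, the piecewise linear truncated power basis in (2) can be evaluated in a few lines of Python (the framework described in this paper is implemented in SAS). The function name and the coefficient and knot values below are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def truncated_power_linear(x, beta0, beta1, b, knots):
    """Evaluate f(x) = beta0 + beta1*x + sum_i b[i] * max(x - knots[i], 0)."""
    if len(b) != len(knots):
        raise ValueError("need exactly one coefficient per knot")
    value = beta0 + beta1 * x
    for b_i, zeta_i in zip(b, knots):
        # (x - zeta_i)_+ is zero to the left of the knot and linear to the right.
        value += b_i * max(x - zeta_i, 0.0)
    return value


# Hypothetical example: two knots on a BMI-like scale.
knots = [20.0, 30.0]
b = [0.15, 0.05]
curve = [truncated_power_linear(x, 1.0, -0.1, b, knots)
         for x in (18.0, 20.0, 25.0, 35.0)]
```

Each knot contributes nothing to the left of its location, so the fitted function is continuous and changes slope only at the knots.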
A piecewise linear representation. We use indicator functions, $I\{\cdot\}$, to then
express the truncated power basis as a piecewise linear function on the ith contiguous
interval of the domain of the predictor, x, delimited by either the knots or bounds of x.
Suppose that $x \in \Re^{+}$ and that we fix knots that will not be estimated at the endpoints a =
min(x) and b = max(x) such that $\zeta_0 = a$ and $\zeta_{K+1} = b$, then we have,
\[
g(\mu) = h[f(x)] = a_0 + a_1 x\, I\{x < \zeta_1\} + \sum_{i=1}^{K} \Big[ a_{i+1}(x - \zeta_i) + \sum_{j=1}^{i} a_j(\zeta_j - \zeta_{j-1}) \Big] I\{\zeta_i \le x < \zeta_{i+1}\}, \tag{3}
\]
where using the coefficients from (2) can give us,
\[
a_l = \beta_1 + \sum_{i=1}^{l-1} b_i, \quad (l = 1, \dots, K+1), \tag{4}
\]
the slope parameter for any observed $x \in [\zeta_{l-1}, \zeta_l)$. Note that although algebraically
equivalent, this basis representation of the piecewise linear space is much less stable for
computation than the truncated power basis.
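The mapping in (4) from truncated power coefficients to segment slopes is a running sum, which the following Python sketch makes explicit. The function names and numerical values are hypothetical. The last line relies on the standard logistic regression fact that, when the linear predictor is on the logit scale, the exponentiated slope is the odds ratio for a one-unit increase in x within that segment.

```python
import math


def segment_slopes(beta1, b):
    """Return [a_1, ..., a_{K+1}] where a_l = beta1 + b_1 + ... + b_{l-1}."""
    slopes = [beta1]
    for b_i in b:
        slopes.append(slopes[-1] + b_i)
    return slopes


def segment_index(x, knots):
    """Return the 1-based index l of the interval [zeta_{l-1}, zeta_l) containing x."""
    return 1 + sum(1 for zeta in knots if x >= zeta)


# Hypothetical coefficients: beta1 = -0.1 and slope changes of 0.15 and 0.05.
slopes = segment_slopes(-0.1, [0.15, 0.05])
odds_ratios = [math.exp(a) for a in slopes]  # per-unit odds ratio within each segment
```

With these values the slope is negative below the first knot and positive above it, so the odds ratio per unit of x falls below one in the first segment and rises above one in the other two.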
To illustrate our model including covariates, suppose that X is an N × 1 vector of
data on some continuous prognostic variable of interest and Z represents an N × (p + 1)
matrix consisting of a column of ones followed by p columns of data on covariates. Let
$\eta(\cdot)$ be a parametric function of p+1 linear covariate predictors multiplied by their
respective logistic regression coefficients ($\boldsymbol{\beta}$) and the K+1 piecewise linear spline
coefficients ($a_1, \dots, a_{K+1}$). For the qth individual (q = 1, …, N),
\[
\begin{aligned}
\eta([\mathbf{Z}_q\ X_q]) ={}& \beta_0 + \beta_1 Z_{q2} + \dots + \beta_p Z_{q(p+1)} + a_1 X_q\, I\{X_q < \zeta_1\} \\
&+ \sum_{i=1}^{K} \Big[ a_{i+1}(X_q - \zeta_i) + \sum_{j=1}^{i} a_j(\zeta_j - \zeta_{j-1}) \Big] I\{\zeta_i \le X_q < \zeta_{i+1}\}.
\end{aligned} \tag{5}
\]
For comparison, consider the simpler truncated power basis expression,
\[
\eta([\mathbf{Z}_q\ X_q]) = \beta_0 + \beta_1 Z_{q2} + \dots + \beta_p Z_{q(p+1)} + b_0 X_q + \sum_{i=1}^{K} b_i\,(X_q - \zeta_i)_+. \tag{6}
\]
B-splines. B-spline bases are easy to incorporate into the framework as de Boor
(1978) describes a recurrence relation for their practical implementation. B-splines are
used extensively throughout the nonlinear modeling literature. Here, we will discuss them
only in brief detail.
Consider a knot sequence,
$\zeta_0 = \zeta_1 = \min(\mathbf{X}) \le \zeta_2 \le \dots \le \zeta_K \le \zeta_{K+1} \le \zeta_{K+2} = \zeta_{K+3} = \max(\mathbf{X})$, such that there are
K interior knots. Let $\boldsymbol{\zeta} = [\zeta_2 \dots \zeta_{K+1}]^{\mathrm{T}}$. By definition of B-splines (see de Boor, 1978), the
jth B-spline of order m = 1 (piecewise constant) is,
\[
B_{j1}(X_q) = \begin{cases} 1, & \text{if } \zeta_j \le X_q < \zeta_{j+1} \\ 0, & \text{otherwise} \end{cases} \tag{7}
\]
and the higher order B-splines may be constructed by this recurrence relation,
\[
B_{jm} = \omega_{jm} B_{j(m-1)} + \left(1 - \omega_{(j+1)m}\right) B_{(j+1)(m-1)}, \tag{8}
\]
where,
\[
\omega_{jm}(X_q) = \frac{X_q - \zeta_j}{\zeta_{j+m-1} - \zeta_j}. \tag{9}
\]
So, the linear B-spline basis of order m = 2 (piecewise linear) we use can be expressed for
any $X_q \in \Re$ as,
\[
B_{j2}(X_q) = \frac{X_q - \zeta_j}{\zeta_{j+1} - \zeta_j}\, I\{\zeta_j \le X_q < \zeta_{j+1}\} + \frac{\zeta_{j+2} - X_q}{\zeta_{j+2} - \zeta_{j+1}}\, I\{\zeta_{j+1} \le X_q < \zeta_{j+2}\}, \quad j = 0, \dots, K+1, \tag{10}
\]
where K is the number of interior knots fitted. Thus we have,
\[
\eta([\mathbf{Z}_q\ X_q]) = \mathbf{Z}_q \boldsymbol{\beta} + \mathbf{B}_q \mathbf{b} = \beta_0 + \beta_1 Z_{q2} + \dots + \beta_p Z_{q(p+1)} + \sum_{i=1}^{K+1} b_i B_{i2}(X_q), \tag{11}
\]
where b1, …, bK+1 are linear regression coefficients corresponding to their respectively
indexed values in the qth row vector, Bq, of the B-spline expansion matrix B. This shows
how η is an additive, linear expression of B-spline parameters. Note that (11) may be
easily transformed to a polynomial expression as piecewise polynomial coefficients are
clearly linear combinations of the B-spline coefficients. Figure 1 shows what B-spline
basis functions of order m = 2 look like for a continuous predictor (e.g., BMI) with two
knots.
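A minimal Python sketch of the recurrence in (7) through (9), used here to build one row of the linear (order m = 2) B-spline expansion, is given below. The function names, the convention that a term with a zero-width denominator is set to zero, and the knot values are assumptions of this sketch and are not taken from the SAS implementation.

```python
def bspline(j, m, x, knots):
    """B_{jm}(x) from the recurrence; zero-width denominators contribute zero."""
    if m == 1:
        # Order 1: indicator of the half-open interval [zeta_j, zeta_{j+1}).
        return 1.0 if knots[j] <= x < knots[j + 1] else 0.0

    def omega(i):
        width = knots[i + m - 1] - knots[i]
        return (x - knots[i]) / width if width > 0 else 0.0

    left = omega(j) * bspline(j, m - 1, x, knots)
    right = (1.0 - omega(j + 1)) * bspline(j + 1, m - 1, x, knots)
    return left + right


def linear_bspline_row(x, interior_knots, lower, upper):
    """Return [B_{02}(x), ..., B_{(K+1)2}(x)] with doubled boundary knots."""
    knots = [lower, lower] + sorted(interior_knots) + [upper, upper]
    return [bspline(j, 2, x, knots) for j in range(len(interior_knots) + 2)]


# Hypothetical BMI-like range [15, 45] with interior knots at 20 and 30.
row = linear_bspline_row(25.0, [20.0, 30.0], 15.0, 45.0)
```

At x = 25, halfway between the two interior knots, only the two middle basis functions are nonzero and each equals 0.5. Because the K + 2 linear B-splines sum to one on [min(X), max(X)), they are collinear with an intercept column, which is consistent with (11) carrying only K + 1 spline coefficients alongside the intercept. Note that the half-open intervals in (7) leave every basis function equal to zero exactly at x = max(X), so an implementation needs some convention for that endpoint.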
Computation Methods
Before we discuss optimizing the fit of the spline to data, we briefly consider
some computational aspects. Mathematicians and computer scientists have demonstrated
that B-splines can have desirable properties, such as local linear independence (de Boor,
1978) and computational stability (Dierckx, 1993). As such, they have been a popular
choice and used extensively for free-knot modeling (e.g., Bessaoud et al., 2005;
Lindstrom, 1999).
We will be fitting nonlinear functions constrained to the class of piecewise linear
free-knot spline functions mapping a continuous independent variable onto the space of
the outcome variable as a projected estimate of the mean response surface. Our goal is to
first find the optimal fit for a given number of knots, K, and then determine which value
of K best represents the data. There are two general approaches to these computations:
1) by minimizing the distance between observed and predicted values (i.e., least
squares estimation or LSE); and
2) by maximizing the likelihood function (i.e., maximum likelihood or MLE
approach).
Least Squares
LSE in this context involves minimizing, with respect to residual sums of squares
(SSE), the distance between the observed outcome or a function of the observed outcome
and nonlinear estimates. This typically requires a method, such as the Gauss-Newton
method with the Levenberg-Marquardt adjustment (Levenberg, 1944; Marquardt, 1963),
which uses derivatives or estimates of derivatives to pick out the optimal fit.
No canned SAS procedures (version 9.1; SAS Institute Inc, Cary, NC) such as
PROC GAM, PROC TRANSREG, or PROC TPSPLINE are capable of fitting free-knot
splines. Thus, we programmed a spline basis using SAS macros and PROC NLIN for
least squares estimation. This involves minimizing a measure of distance between
vectors, $\lVert \mathbf{f} - \hat{\mathbf{f}} \rVert^2$, which represents the nonlinear SSE in a multidimensional space where f
is the collected data and f̂ is a collection of nonlinear estimates as a function of the data,
complex sample weights, and model parameters including free-knots.
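To make the least squares objective concrete, the Python sketch below evaluates a weighted residual sum of squares at a few candidate knot locations. It is a crude profile over one knot with the other coefficients held fixed, not the Gauss-Newton search with the Levenberg-Marquardt adjustment performed by PROC NLIN. The way the sample weights enter the sum (multiplying each squared residual), the function names, and the data are all assumptions made for illustration.

```python
def one_knot_mean(x, beta0, beta1, b1, knot):
    """Piecewise linear mean with a single knot (truncated power form, K = 1)."""
    return beta0 + beta1 * x + b1 * max(x - knot, 0.0)


def weighted_sse(y, fitted, weights):
    """Sum over observations of w * (y - fitted)^2 (assumed weighting scheme)."""
    return sum(w * (obs - fit) ** 2 for obs, fit, w in zip(y, fitted, weights))


# Hypothetical noise-free data generated from a model whose knot is at 25.
x = [18.0, 22.0, 25.0, 28.0, 32.0, 36.0]
weights = [1.0, 2.0, 1.0, 1.5, 1.0, 0.5]
y = [one_knot_mean(v, 2.0, -0.05, 0.12, 25.0) for v in x]

# Profile the objective over candidate knots, other coefficients held fixed.
profile = {
    knot: weighted_sse(y, [one_knot_mean(v, 2.0, -0.05, 0.12, knot) for v in x], weights)
    for knot in (20.0, 25.0, 30.0)
}
best_knot = min(profile, key=profile.get)
```

In this constructed example the objective is exactly zero at the generating knot and positive at the other candidates. With real data the surface over knot locations can have several local minima, which is the sensitivity noted earlier for free-knot splines.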
Maximum Likelihood
For MLE, the nonlinear logistic likelihood function must be numerically
maximized to find the parameter values under which the observed data was most likely
produced. In theory, these estimates might have the asymptotic efficiency and invariance
under reparameterization which makes MLE attractive in general (Casella and Berger,
2002). The invariance property might be important to our framework as we intend to
perform the optimization with B-splines, but to report linear combinations of B-spline
parameters that represent the local piecewise linear slopes as they are easier to interpret in
practice.
The Nelder-Mead simplex (Nelder and Mead, 1965) is a popular and powerful
direct search procedure for likelihood-based optimization. The attraction of this method is
that the simplex does not use any derivatives and does not assume that the objective
function being optimized has continuous derivatives. Nelder-Mead simplex optimization
is the only method currently available in SAS which does not require derivative
calculations to search the parameter space. In cases such as ours (i.e., piecewise linear
splines) we do not expect continuity in the first derivatives at the knot locations.
Therefore an MLE and simplex optimization approach seems more reasonable than the
LSE and residual sum of squares minimization approach. Direct search methods can,
however, be much less efficient or even highly unstable as compared to derivative-based
LSE or MLE methods when sample sizes are as large as the datasets common to complex
survey designs. Hence, as a compromise, we have used “quasi-Newton methods” with
estimated derivatives to perform the MLE.
Nonlinear Logistic Likelihood. Let’s now examine the nonlinear logistic
likelihood function for modeling binary outcomes. The probability of the qth participant
having experienced the outcome of interest, Yq = 1, can be expressed as,
$$\pi(\eta([\mathbf{Z}_q\ X_q])) = P(Y_q = 1 \mid \eta([\mathbf{Z}_q\ X_q])) = \frac{\exp\{\eta([\mathbf{Z}_q\ X_q])\}}{1 + \exp\{\eta([\mathbf{Z}_q\ X_q])\}}, \qquad (12)$$
where $\eta([\mathbf{Z}_q\ X_q])$ may be equation (11). Note that the logit or log(odds) function of this
probability,
$$\mathrm{logit}\{\pi(\eta([\mathbf{Z}_q\ X_q]))\} = \log\left\{\frac{\pi(\eta([\mathbf{Z}_q\ X_q]))}{1 - \pi(\eta([\mathbf{Z}_q\ X_q]))}\right\} = \eta([\mathbf{Z}_q\ X_q]), \qquad (13)$$
may reasonably be modeled piecewise linearly as a function of the variables in [Zq Xq].
We may express a weighted likelihood function,
$$L(\boldsymbol{\theta} \mid [\mathbf{Z}\ \mathbf{X}], \mathbf{W}) = \prod_{q=1}^{n} \left[\pi_q^{y_q} (1 - \pi_q)^{1 - y_q}\right]^{w_q}, \qquad (14)$$
where θ is a vector of all the linear and spline parameters expressed in (11), n = sample
size, πq is defined above in (12), yq is the binary outcome, and wq is the complex sample
weight in the weight vector, W, assigned to the qth participant by the study designers. The
weighted log-likelihood, which is more convenient for use in optimization procedures,
$$\log\{L(\boldsymbol{\theta} \mid [\mathbf{Z}\ \mathbf{X}], \mathbf{W})\} = \sum_{q=1}^{n} w_q \left[\log(1 - \pi_q) + y_q \log\left(\frac{\pi_q}{1 - \pi_q}\right)\right], \qquad (15)$$
may be maximized numerically using PROC NLP in SAS.
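As a minimal illustration of (12) through (15), and not the SAS code used in this work, the weighted log-likelihood can be evaluated directly from the linear predictor values. The function names below are hypothetical, and the sketch assumes the values of η have already been computed for each participant. Because the logit of πq equals ηq by (13), each term of (15) reduces to wq[yq ηq − log(1 + exp(ηq))].

```python
import math


def inv_logit(eta):
    """Event probability pi = exp(eta) / (1 + exp(eta)), as in (12)."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def softplus(eta):
    """Numerically stable log(1 + exp(eta)), which equals -log(1 - pi)."""
    return max(eta, 0.0) + math.log1p(math.exp(-abs(eta)))


def weighted_loglik(etas, ys, ws):
    """Weighted log-likelihood (15): sum of w_q * [log(1 - pi_q) + y_q * eta_q]."""
    return sum(w * (y * eta - softplus(eta)) for eta, y, w in zip(etas, ys, ws))
```

An optimizer would minimize the negative of this quantity over the parameters that determine η.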
Optimization
Our goal is to first find the optimal fit for a given number of knots and leave the
optimization with respect to the number of knots for the next section.
Levenberg-Marquardt adaptation to the Gauss-Newton algorithm for nonlinear
LSE. The nonlinear LSE optimization procedure by the Gauss-Newton method is fairly
straightforward. Consider this nonlinear system of equations that represent our nonlinear
model between a vector of outcomes, Y, and a function of the observed data, X, and
parameters,
$$\mathbf{Y} = F(\boldsymbol{\theta}, \mathbf{X}) + \boldsymbol{\varepsilon}, \qquad (16)$$
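where ε is the error vector. The general approach to solving for the minimum distance
between $\mathbf{Y}$ and $F(\hat{\boldsymbol{\theta}}, \mathbf{X})$, that is, the residual distance $\mathbf{e} = \mathbf{Y} - F(\hat{\boldsymbol{\theta}}, \mathbf{X})$, is to solve the
nonlinear “normal” equations,
$$\mathbf{D}^T F(\hat{\boldsymbol{\theta}}, \mathbf{X}) = \mathbf{D}^T \mathbf{Y}, \qquad (17)$$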
where D represents the gradient matrix,
$$\mathbf{D} = \frac{\partial F(\hat{\boldsymbol{\theta}}, \mathbf{X})}{\partial \hat{\boldsymbol{\theta}}}. \qquad (18)$$
Note that, in practice, we cannot actually calculate D because the derivatives at the knot
locations do not exist. Instead, we used finite difference approximations to D.
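A closed form solution to (17) generally will not exist, so we try to find a solution
by an iterative process beginning with some starting value for the parameters, $\hat{\boldsymbol{\theta}}_{old}$, and
continuing to update $\hat{\boldsymbol{\theta}}_{old}$ to $\hat{\boldsymbol{\theta}}_{new}$ until $\mathbf{e}^T\mathbf{e}$, the residual sum of squares (SSE), shows no
major improvement after reiterating,
$$\mathrm{SSE}(\hat{\boldsymbol{\theta}}_{new}) = \mathrm{SSE}(\hat{\boldsymbol{\theta}}_{old} + k\boldsymbol{\Delta}) < \mathrm{SSE}(\hat{\boldsymbol{\theta}}_{old}), \qquad (19)$$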
where ∆ represents the next “step,” and k is a coefficient that can be adjusted to control
the size of the step.
For this LSE approach to numerically solving the piecewise linear function
optimization problem, SAS software offers several popular iterative algorithms. We
chose the Levenberg-Marquardt updating formula (Levenberg, 1944; Marquardt, 1963)
defined as follows:
$$\boldsymbol{\Delta} = \left(\mathbf{D}^T\mathbf{D} + \lambda\,\mathrm{diag}(\mathbf{D}^T\mathbf{D})\right)^{-} \mathbf{D}^T\mathbf{e}. \qquad (20)$$
This method is a compromise between the Gauss-Newton and steepest descent ($\boldsymbol{\Delta} = \mathbf{D}^T\mathbf{e}$)
methods (Marquardt, 1963) affected by adjusting the magnitude of λ. Lindstrom (1999)
suggested that for estimating free-knot parameter locations, the Levenberg-Marquardt
method increases the chance of finding the global optimum.
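The update in (20), combined with finite difference approximations to D and the acceptance rule in (19), can be sketched as follows. This is a simplified stand-in for what PROC NLIN does internally, not a reproduction of it. The function names, the forward-difference step size, and the factor-of-ten rule for adjusting λ are all assumptions made for illustration. The argument f is a hypothetical model function returning fitted values for a parameter list.

```python
def jacobian_fd(f, theta, h=1e-6):
    """Forward-difference approximation to the gradient matrix D in (18)."""
    base = f(theta)
    columns = []
    for k in range(len(theta)):
        bumped = list(theta)
        bumped[k] += h
        columns.append([(a - b) / h for a, b in zip(f(bumped), base)])
    return [list(row) for row in zip(*columns)], base


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= factor * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        rest = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - rest) / m[r][r]
    return x


def lm_step(f, theta, y, lam):
    """One step of (20): solve (D'D + lam * diag(D'D)) delta = D'e."""
    d, fitted = jacobian_fd(f, theta)
    e = [yi - fi for yi, fi in zip(y, fitted)]
    p = len(theta)
    dtd = [[sum(row[i] * row[j] for row in d) for j in range(p)] for i in range(p)]
    dte = [sum(row[i] * ei for row, ei in zip(d, e)) for i in range(p)]
    for i in range(p):
        dtd[i][i] *= 1.0 + lam  # adds lam * diag(D'D) to the diagonal
    return solve(dtd, dte)


def sse(f, theta, y):
    """Residual sum of squares e'e."""
    return sum((yi - fi) * (yi - fi) for yi, fi in zip(y, f(theta)))


def fit_lm(f, theta0, y, lam=1e-3, max_iter=200):
    """Accept a step only if it lowers the SSE, as in (19); otherwise raise lam."""
    theta, best = list(theta0), sse(f, theta0, y)
    for _ in range(max_iter):
        try:
            step = lm_step(f, theta, y, lam)
        except ZeroDivisionError:  # singular system, e.g. a zero gradient column
            break
        trial = [t + s for t, s in zip(theta, step)]
        trial_sse = sse(f, trial, y)
        if trial_sse < best:
            theta, best, lam = trial, trial_sse, lam / 10.0
        else:
            lam *= 10.0
            if lam > 1e12:
                break
    return theta, best
```

With λ = 0 the step reduces to Gauss-Newton, and larger λ shortens the step and turns it toward a scaled steepest descent direction. Note that a spline coefficient of exactly zero makes the corresponding knot column of D vanish, which this sketch treats as a stopping condition.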
Quasi-Newton methods for MLE. Using quasi-Newton methods worked more
efficiently and produced more stable results than the Nelder-Mead simplex. Quasi-
Newton methods are a class of optimization algorithms which we used to locate minima
in the negative natural logarithm of (14). The particular quasi-Newton procedure we
employed is called the dual Broyden-Fletcher-Goldfarb-Shanno method (DBFGS)
(Broyden, 1970; Fletcher, 1970; Goldfarb, 1970; Shanno, 1970). This is a complicated
procedure, the details of which extend beyond our scope. In brief, DBFGS uses line
searches along feasible descent search directions in combination with estimation of the
Cholesky factor of the Hessian matrix of second derivatives to iteratively update the
overall search for minima. Although this method requires first derivatives, we were able
to calculate derivative estimates by using finite difference approximations as we did for
the LSE methods. In application to large survey datasets, we have found that the MLE
methods suffer fewer problems with convergence than the LSE methods.
Knot Selection
A Novel Parametric Bootstrap-Based Method
We outline in this section a novel method of selecting the optimal number of
knots. Knot locations, linear and nonlinear coefficients, and a common intercept are
parameters optimized simultaneously having complex sample weights incorporated into
the function fitted to achieve unbiased and fully adjusted estimates. Like Bessaoud et al.
(2005) and Molinari et al. (2001) we will be interested in interpreting the fitted knots to
define clinically meaningful groups with differential patterns of risk. It will be very
important to correctly specify a parsimonious number of knots, say 4 or fewer, which
would indicate 5 or fewer different risk groups. Therefore, keeping the framework from
producing models with unnecessary knots is a priority.
Our technique involves a forward selection procedure based on the concept of a
two degree of freedom test for the addition of two parameters, a knot and a slope, to the
piecewise linear model (our “2 df knot testing procedure”). As depicted in (11), we are
considering a set of p linear or categorical covariates for adjustment purposes, but this
procedure is targeted at optimizing the complexity necessary to effectively model the one
potentially nonlinear prognostic variable, X. The test statistic for the LSE framework is
an F-ratio,
$$F = \frac{(\mathrm{SSE}_{reduced} - \mathrm{SSE}_{full})/2}{\mathrm{SSE}_{full}/df_{full}}, \qquad (21)$$
where $df_{full}$ = N-(p+2K) (i.e., one less than the sample size minus the number of free
parameters estimated: the intercept, p linear coefficients, K free-knots, and the K+1 spline
parameters (the piecewise linear slopes)).
We do not know the distribution of F, so we use the parametric bootstrap (Efron,
1982) as described by Davison and Hinkley (1997) to build a hypothetical distribution of
F-ratio test statistics under the null hypothesis that the reduced model having K knots is true
against an alternative having K+1 knots. We draw D1 parametrically resampled replicate
datasets of binary outcomes and compute the F-ratio distribution $\{F^{rep}_1, \ldots, F^{rep}_{D_1}\}$. A
bootstrap p-value representing the probability that adding the (K+1)th knot produces an F-
ratio at least as large as what might be observed by chance alone can be calculated from
this F-ratio distribution as,
$$p_{boot} = \frac{1 + \sum_{j=1}^{D_1} I\{F^{rep}_j \ge F\}}{D_1 + 1}. \qquad (22)$$
This is analogous to integrating the distribution of F to determine the probability of
observing the data given that the null model is true.
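The bootstrap p-value in (22) is a simple count over the replicate statistics. A minimal sketch, with a hypothetical function name:

```python
def bootstrap_p_value(observed, replicates):
    """Bootstrap p-value (22): (1 + #{replicates >= observed}) / (D1 + 1)."""
    exceed = sum(1 for rep in replicates if rep >= observed)
    return (1 + exceed) / (len(replicates) + 1)
```

Adding one to both the numerator and the denominator counts the observed statistic as a member of its own null distribution, so the p-value can never be exactly zero. Its smallest attainable value is 1/(D1 + 1).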
We select a value for α to represent the significance level or inclusion criterion for
this test of the contribution to reducing SSE. That is, the p-value (22) would have to be
smaller than α in order to reject the null hypothesis that the model with K knots is the true
model in favor of the model with K+1 knots. We can control the flexibility of the model
by manipulating α.
The LSE approach to the 2 df knot testing procedure for the null hypothesis that
the “true” model has Knull knots versus an alternative of Kalt. can be specified for model
parameters (θ; including linear and nonlinear free parameters as expressed in (11)) fitted
to a dataset having binary outcome (Y), potentially nonlinear continuous predictor (X),
covariates (Z), and sample weights (W) by these algorithm specifications:
Step 1: Set: $K_{null}$ = 0, $K_{alt.}$ = $K_{null}$ + 1;
Step 2: Input: Y, X, Z, W;
Step 3: Initialize: $\theta^0_{null}$, $\theta^0_{alt.}$;
Step 4: Minimize: $\hat{\theta}_{null}$ = arg min(objective = SSE($\theta_{null}$)), start = $\theta^0_{null}$;
    Compute: SSE($\hat{\theta}_{null}$);
Step 5: Minimize: $\hat{\theta}_{alt.}$ = arg min(objective = SSE($\theta_{alt.}$)), start = $\theta^0_{alt.}$;
    Compute: SSE($\hat{\theta}_{alt.}$);
Step 6: Compute: $F = \dfrac{(\mathrm{SSE}_{null} - \mathrm{SSE}_{alt.})/2}{\mathrm{SSE}_{alt.}/df_{alt.}}$;
Step 7: Parametric Bootstrap:
    for j = 1 to D1 do
        Generate $Y^{rep}_j$ by drawing a random binary outcome for each subject, i = 1,…,N, from Bernoulli($p_i \mid \hat{\theta}_{null}$);
        Repeat Steps 2 through 6 replacing Y with $Y^{rep}_j$;
        Compute: $F^{rep}_j$ of F under $H_o$: θ = $\hat{\theta}_{null}$;
    End do;
Step 9: Compute: $p_{boot} = \dfrac{1}{D_1 + 1}\left(1 + \sum_{j=1}^{D_1} I\{F \le F^{rep}_j\}\right)$;
Step 10: Select the model:
    If $p_{boot}$ ≤ α and $K_{null}$ ≤ 2 then do
        Set $K_{null}$ = $K_{null}$ + 1, $K_{alt.}$ = $K_{alt.}$ + 1;
        Repeat Steps 2 through 9;
    End do;
    Else if $p_{boot}$ ≤ α and $K_{null}$ = 3 then do
        K = 4; $\hat{\theta}$ = $\hat{\theta}_{alt.}$;
    End do;
    Else do;
        K = $K_{null}$; $\hat{\theta}$ = $\hat{\theta}_{null}$;
    End do;
Step 11: Compute: $\hat{\theta}^{PLS}$ from $\hat{\theta}$ where B-spline parameter elements have been
    linearly transformed to piecewise linear slope parameters;
For the MLE, we adopted a similar approach, but with a likelihood ratio (LR) test statistic
in place of the F-ratio statistics.
The MLE approach to the 2 df knot testing procedure algorithm is specified as
follows:
Step 1: Set: $K_{null}$ = 0, $K_{alt.}$ = $K_{null}$ + 1;
Step 2: Input: Y, X, Z, W;
Step 3: Initialize: $\theta^0_{null}$, $\theta^0_{alt.}$;
Step 4: Minimize: $\hat{\theta}_{null}$ = arg min(objective = -ln[L($\theta_{null}$)]), start = $\theta^0_{null}$;
    Compute: -ln[L($\hat{\theta}_{null}$)];
Step 5: Minimize: $\hat{\theta}_{alt.}$ = arg min(objective = -ln[L($\theta_{alt.}$)]), start = $\theta^0_{alt.}$;
    Compute: -ln[L($\hat{\theta}_{alt.}$)];
Step 6: Compute: $LR = \dfrac{-\ln L(\hat{\theta}_{null})}{-\ln L(\hat{\theta}_{alt.})}$;
Step 7: Parametric Bootstrap:
    for j = 1 to D1 do
        Generate $Y^{rep}_j$ by drawing a random binary outcome for each subject, i = 1,…,N, from Bernoulli($p_i \mid \hat{\theta}_{null}$);
        Repeat Steps 2 through 6 replacing Y with $Y^{rep}_j$;
        Compute: $LR^{rep}_j$ of LR under $H_o$: θ = $\hat{\theta}_{null}$;
    End do;
Step 9: Compute: $p_{boot} = \dfrac{1}{D_1 + 1}\left(1 + \sum_{j=1}^{D_1} I\{LR \le LR^{rep}_j\}\right)$;
Step 10: Select the model:
    If $p_{boot}$ ≤ α and $K_{null}$ ≤ 2 then do
        Set $K_{null}$ = $K_{null}$ + 1, $K_{alt.}$ = $K_{alt.}$ + 1;
        Repeat Steps 2 through 9;
    End do;
    Else if $p_{boot}$ ≤ α and $K_{null}$ = 3 then do
        K = 4; $\hat{\theta}$ = $\hat{\theta}_{alt.}$;
    End do;
    Else do;
        K = $K_{null}$; $\hat{\theta}$ = $\hat{\theta}_{null}$;
    End do;
Step 11: Compute: $\hat{\theta}^{PLS}$ from $\hat{\theta}$ where B-spline parameter elements have been
    linearly transformed to piecewise linear slope parameters;
Thus, by either approach, we end with output parameters, $\hat{\theta}^{PLS}$, for an optimal piecewise
linear model expressed in terms of local linear slope coefficients, knot locations, and
covariate coefficients.
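The control flow shared by the LSE and MLE versions of the 2 df knot testing procedure can be sketched as a forward selection loop. The sketch below is schematic. The callables fit, statistic, and simulate are hypothetical placeholders for the model fitting, the F-ratio or LR computation, and the parametric resampling step, and nothing here reproduces the SAS macros themselves.

```python
import random


def select_knots(fit, statistic, simulate, y, alpha=0.10, d1=200, k_max=4, rng=None):
    """Forward selection of the number of knots by parametric bootstrap testing.

    fit(y, k)                   -> fitted model with k knots (hypothetical callable)
    statistic(null_fit, alt_fit) -> test statistic, larger favoring the alternative
    simulate(null_fit, rng)     -> replicate outcomes drawn under the null model
    """
    rng = rng or random.Random(0)
    k_null = 0
    while True:
        null_fit = fit(y, k_null)
        alt_fit = fit(y, k_null + 1)
        observed = statistic(null_fit, alt_fit)
        exceed = 0
        for _ in range(d1):
            y_rep = simulate(null_fit, rng)
            rep = statistic(fit(y_rep, k_null), fit(y_rep, k_null + 1))
            if rep >= observed:
                exceed += 1
        p_boot = (1 + exceed) / (d1 + 1)
        if p_boot > alpha:
            return k_null, null_fit   # keep the null model
        if k_null + 1 == k_max:
            return k_max, alt_fit     # largest model allowed
        k_null += 1                   # accept the extra knot and test the next one
```

Each pass through the loop refits both models on every replicate, which is why the number of optimizer calls grows so quickly with D1.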
Grid Search
The selection of good starting values is critical for iterative optimization
procedures in avoiding locally optimal model parameter settings in favor of converging to
the global optima. It has been difficult identifying such starting values when modeling
with free-knot splines. This issue is ubiquitous in the literature and is particularly
troublesome when the functional surface relating the nonlinear predictor and response is
nearly flat. Not only is it important to place the knots well, but the algorithm must also
start with well placed covariate and spline parameters. To address this, we start spline
coefficient parameters at zero and any covariate coefficient parameters at their
multivariate GLM estimates. For the knots, we objectively search the free-knot parameter
space for plausible knot locations by using a grid search algorithm similar to that
described by Bessaoud et al. (2005). This obviates the need for subjectivity in assigning
starting values, but comes with extreme computational costs as increasing the size of the
grid has a multiplicative effect on the number of times we need to run the bootstrap
testing procedure.
The grid search was implemented in steps 4 and 5 of the 2 df MLE testing
procedure algorithm to locate the best starting values, $\theta^0_{null}$ and $\theta^0_{alt.}$. In the LSE approach
applied to headaches among women (Keith et al., 2008), we had started with values over
a fairly sparse set of plausible locations for where to place the knot values and simply
picked the configurations which minimized the SSE most. By either approach, we set
starting covariate coefficient parameters in $\theta^0$ equal to those estimated by a linear logistic
regression in SAS PROC SURVEYLOGISTIC. For the MLE, we calculate models for L
possible starting locations placed at nearly equal distances throughout the range of the
predictor to avoid getting stuck on local maxima and help prevent coalescent knots. To
ensure that knots did not overlap, we also enforced linear constraints so that a small
minimal distance was maintained between any two knots, including the boundary knots.
As noted, the search can be extremely computationally expensive as for each loop from
step 2 through 9 of the algorithm, we call PROC NLP $\binom{L}{K}$ times. In the most extensive
case, where we reject the model having K = 2, we would require
$$4 + D_1\left\{\binom{L}{0} + 2\left[\binom{L}{1} + \binom{L}{2} + \binom{L}{3} + \binom{L}{4}\right]\right\}$$
PROC NLP calls. For instance, this
quantity can vary from 22,604 when D1 = 200 and L = 6 up to 511,004 when D1 = 1,000
and L = 9. The latter is a large number of calls and is best run in a high performance
parallel processor computing environment.
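To make the size of the search concrete, the sketch below enumerates candidate starting knot configurations on an equally spaced grid and counts optimizer calls. The grid construction and function names are assumptions for illustration. The call-count expression, 4 + D1{C(L,0) + 2[C(L,1) + C(L,2) + C(L,3) + C(L,4)]}, is likewise an assumption: it is one form that reproduces both totals quoted above (22,604 and 511,004).

```python
from itertools import combinations
from math import comb


def grid_points(lower, upper, n_points):
    """n_points equally spaced candidate knot locations strictly inside (lower, upper)."""
    step = (upper - lower) / (n_points + 1)
    return [lower + step * (i + 1) for i in range(n_points)]


def knot_starts(grid, k):
    """All C(L, k) increasing knot configurations; distinct points cannot coalesce."""
    return list(combinations(grid, k))


def nlp_calls(d1, n_points, k_max=4):
    """Optimizer calls under the assumed expression that matches the quoted totals."""
    inner = comb(n_points, 0) + 2 * sum(comb(n_points, k) for k in range(1, k_max + 1))
    return 4 + d1 * inner
```

For example, `knot_starts(grid_points(17.0, 45.0, 6), 2)` lists the 15 two-knot starting configurations available on a six-point grid over a BMI range of 17 to 45.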
Simulations
Evaluating the MLE 2 df Knot Testing Procedure
We have devised a simulation study to assess how well our MLE 2 df knot testing
procedure performs in correctly specifying the optimal number of knots and compare
results to those by,
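$$\mathrm{AIC} = -2\log\{L(\hat{\boldsymbol{\theta}} \mid \mathbf{X}, \mathbf{W})\} + 2r, \qquad (23)$$
and
$$\mathrm{BIC} = -2\log\{L(\hat{\boldsymbol{\theta}} \mid \mathbf{X}, \mathbf{W})\} + r\log(n), \qquad (24)$$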
where r represents the number of parameters in the model and n is the sample size. The
conditions we focused upon in simulating data were the size of the sample (n) and the
proportion of events (po). For each we selected two settings: $n \in \{500, 5000\}$ and
$p_o \in \{0.10, 0.33\}$. The simulated data were generated from randomly selected samples of
n BMI records from the Third National Health and Nutrition Examination Survey
(NHANES III) – a complex sample weighted nationally representative cross-sectional
survey conducted between 1988 and 1994 with mortality follow-up in 2000. 100
simulated sets of n binary outcomes were generated conditional on the n BMI records and
a true log(odds) model. More precisely, for a given set of n BMI values, each one, q = 1,
…, n, was assigned a Bernoulli random variable, Yqj | Xq = BMIq (j = 1, …, 100), based
on the probability of event defined by,
$$\pi(\eta([X_q])) = P(Y_q = 1 \mid \eta([X_q])) = \frac{\exp\{\eta([X_q])\}}{1 + \exp\{\eta([X_q])\}}, \qquad (25)$$
where the η function was specified by a true log(odds) model having Ktrue = 2 knots fixed
at BMI = 25 and BMI = 32 and piecewise slopes fixed at a1 = -0.4, a2 = 0.0, and a3 = 0.2
as in (3). These parameter settings were chosen as they define a functional shape that
characterizes the nonlinear U-shaped BMI relationship with a binary outcome variable
indicating mortality during follow-up among NHANES III participants having 17 ≤ BMI
≤ 45 at baseline. The intercept of the true model, a0, was calculated for each BMI dataset
as it must be conditioned on the desired level of po. To evaluate sensitivity to the
inclusion criterion (α), we ran our MLE 2 df knot testing procedure with settings of
$\alpha \in \{0.10, 0.25\}$ on all combinations of the n and po settings.
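The data-generating step can be sketched as follows. The knot locations and slopes are those stated above. Everything else is an assumption for illustration: the function names, the use of uniformly drawn BMI values in place of NHANES III records, the particular continuous parameterization of the piecewise linear log(odds), and the use of bisection to find the intercept that yields the target proportion of events.

```python
import math
import random

KNOTS = (25.0, 32.0)        # true knot locations (BMI units)
SLOPES = (-0.4, 0.0, 0.2)   # true piecewise slopes a1, a2, a3


def shape(x, knots=KNOTS, slopes=SLOPES):
    """Continuous piecewise linear part of the log(odds), without the intercept."""
    edges = (0.0,) + tuple(knots) + (float("inf"),)
    total = 0.0
    for slope, lo, hi in zip(slopes, edges[:-1], edges[1:]):
        total += slope * (min(max(x, lo), hi) - lo)  # length of x's path in segment
    return total


def inv_logit(eta):
    """Event probability as in (25)."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)


def calibrate_intercept(xs, p_target, lo=-50.0, hi=50.0):
    """Bisection for the intercept a0 giving a mean event probability of p_target."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        mean_p = sum(inv_logit(mid + shape(x)) for x in xs) / len(xs)
        if mean_p < p_target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def simulate_outcomes(xs, a0, rng):
    """One Bernoulli outcome per BMI value under the true model."""
    return [1 if rng.random() < inv_logit(a0 + shape(x)) else 0 for x in xs]
```

A typical use draws `xs`, calls `calibrate_intercept(xs, 0.10)` once per BMI dataset, and then calls `simulate_outcomes` 100 times to obtain the replicate outcome vectors.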
Figure 2 displays the true model (in red) and replicated models (in black)
resulting from the application of the MLE 2 df knot testing procedure to 20 randomly
selected simulated datasets for each combination of settings. When the sample size was
low (n = 500 in Figure 2 frames a) – d)), there was considerable error variance or “noise”
distorting the true model “signal” which generated the binary simulated data and our
procedure did not perform nearly as well as when there was more information available
(n = 5,000 in Figure 2 frames e) – i)). When the proportion of events was elevated (po =
0.33 in frames b), d), f), and h)), it also introduced more information and reduced
uncertainty in where the true model was located. Note that Figure 2 frame i) shows the
only instance in which the sample weights from NHANES III had been included. They
did not greatly impact the performance of the MLE 2 df knot testing procedure, but they
did introduce an extra source of variance and possibly some degree of numerical
instability resulting in increased computation time and more varied results.
In Figure 3 we plotted the frequencies with which each possible number of
knots (i.e., K = 0, 1, 2, 3, or 4) was selected as optimal under each combination of
settings. Each frame has colored points representing the observed frequencies connected
by colored lines representing knot selection results from our method (in red), AIC (in
black), and BIC (in blue). All of these approaches were too conservative when the sample
size was low (n = 500 in Figure 3 frames a) – d)), but BIC was also too conservative
when the sample size was high and the proportion of events was low (n = 5000, po = 0.10
in Figure 3 frames e) and g)). Sample-weighted likelihoods from large survey samples are
not on a scale by which the AIC or BIC penalties would have any effect to curb
overfitting. You can see this result clearly in Figure 3 frame i) where our method was
accurate, but somewhat imprecise while AIC and BIC methods were neither accurate nor
precise. In general, our method worked accurately and very similarly to AIC, in cases
where no sample weights were used, as long as the sample size was large or when the
inclusion criterion was set high (α = 0.25 or n = 5000 in Figure 3 frames c) - h)).
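The scale problem with sample-weighted likelihoods can be seen in a toy calculation. Multiplying every weight by a constant multiplies the weighted log-likelihood by the same constant, while the AIC and BIC penalties in (23) and (24) stay fixed. The data, fitted probabilities, and parameter counts below are invented purely for illustration.

```python
import math


def weighted_loglik(probs, ys, ws):
    """Weighted Bernoulli log-likelihood for fitted probabilities."""
    return sum(
        w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
        for p, y, w in zip(probs, ys, ws)
    )


def aic(loglik, r):
    """AIC as in (23)."""
    return -2.0 * loglik + 2.0 * r


def bic(loglik, r, n):
    """BIC as in (24)."""
    return -2.0 * loglik + r * math.log(n)


YS = [1, 0, 0, 1, 0, 1]
P_SMALL = [0.5] * 6                       # hypothetical fit with r = 1 parameter
P_LARGE = [0.6, 0.4, 0.4, 0.6, 0.4, 0.6]  # hypothetical fit with r = 3 parameters


def delta_aic(scale):
    """AIC(large) - AIC(small) when every observation carries weight `scale`."""
    ws = [scale] * len(YS)
    large = aic(weighted_loglik(P_LARGE, YS, ws), 3)
    small = aic(weighted_loglik(P_SMALL, YS, ws), 1)
    return large - small
```

With unit weights the difference is positive, so AIC prefers the smaller model. With weights of 1,000 each, as when a participant represents about a thousand members of the population, the same fits give a large negative difference and the penalty no longer restrains the larger model.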
Estimating Uncertainty in Parameter Estimates and Expressing Results
To illustrate why using the sample weights is important in such studies, consider
this simple hypothetical example. Suppose that in the population you have 20% African
Americans and 80% Caucasians and that due to planned oversampling, you have drawn a
sample consisting of 50% African Americans and 50% Caucasians. Now suppose that
the effect you are studying is more pronounced in Caucasians than African Americans. If
there is heterogeneity between these two groups and you do not adjust for the additional
weight given to the African Americans, you can expect to misspecify the variability
estimates and you might miss detecting effects or differences because of the bias induced
by over-representation of African Americans. The sampling design structures we run into
in practice are analogous, but more complicated and will be described in some detail
below. For a more general and technical review of the use of sampling weights for
analytic inference about parameters and how to incorporate the weights into statistical
models the interested reader should see Pfeffermann (1993).
Complex Samples
Standard statistical procedures and software typically have the underlying
assumption that the sample to be analyzed was collected by simple random sampling
(SRS). SRS gives equal probability of selection to each unit of the population which
results in a “random sample” of independent observations. Complex samples do not give
equal probability of selection to each unit in the population and are not independently
sampled, thus care must be taken in conducting the statistical analyses required to analyze
these samples appropriately. Analyzing a complex sample with methods designed for
SRS samples will produce incorrect estimates of variances and standard errors, and
possibly incorrect estimates of means and model parameters.
Multistage probability cluster sampling. The data we are considering will be from
samples designed to efficiently represent the US noninstitutionalized population. These
samples are drawn from the population using complex, multistage probability cluster
sampling that achieves the quality of effectively representing the population much more
quickly than the classic simple random sampling (SRS) design (Kish, 1995; U.S. DHHS
NHANES III Analytic and Reporting Guidelines, 1996). There are three components to
the information provided to the analyst to adjust for the unequal probability sampling of
multistage complex sample designs we see in datasets such as NHANES and NHIS. The
components are stratum, primary sampling unit (PSU), and sample weight. The strata are
usually based on geographic area. PSUs are clusters within a stratum and generally given
a probability of being selected for sampling that is proportional to the size of the cluster
(with the exception that some clusters, such as the New York City metropolitan PSU,
are assigned a selection probability of 1). The sample weights can be loosely defined as
giving each sampled subject a weight to indicate what proportion of the population they
represent.
The complex sample design variables actually presented to the researcher are
pseudo-variables. That is, they are false, but useful design variables provided by the
survey designers to mask the true sampling design features in order to protect participant
confidentiality. It is not clear from the pseudo-variables which PSUs have been sampled
with certainty and which have not. For more information on the issues surrounding
confidentiality and complex survey samples, the interested reader may see Lu (2000).
Making adjustments without existing software. As we are not aware of any
available tools, such as SUDAAN software, for nonlinear modeling of survey data with
complex sample designs, we found a way to make appropriate adjustments in our
program. There are two basic approaches to making complex sample adjustments:
linearization and resampling. Linearization is the application of a Taylor’s series
expansion to make first order linear approximations to possibly nonlinear parameters.
Variance estimates are then based on the linear approximations (Rao, 1997). Rao et al.
(1992) provide useful ideas for alternative approaches to this problem based on
resampling. Rao (1997) suggests that,
“An advantage of a resampling method is that it employs a single standard-error formula for all statistics $\hat{\theta}$, unlike the linearization method, which requires the derivation of a separate formula for each statistic $\hat{\theta}$. Moreover, linearization can become cumbersome in handling poststratification and nonresponse adjustments, whereas it is relatively straightforward with resampling methods… As a result, they [software using linearization] cannot handle more complex analyses such as logistic regression with poststratified weights.”
Thus, resampling provides a more general and versatile approach well suited to our
problem.
The resampling methods detailed by Rao et al. (1992) include balanced repeated
replication (BRR), the jackknife, and bootstrap. BRR involves resampling many “half-
sample” replicates by deleting one PSU from each stratum, rescaling the complex sample
weights, calculating a weighted replicate parameter estimate, and computing variance
estimates for the original parameter estimate based on the variability in the BRR
replicates. This method does not work well in cases where we have more than two PSUs
per stratum. The jackknife method deletes one PSU, rescales the sample weights,
calculates a replicate parameter estimate, and repeats this for each PSU within each
stratum. A variance estimate for the original parameter estimate can then be calculated
from these jackknife replicates.
The most convenient resampling approach is to resample the PSUs with
replacement within each stratum by using the nonparametric bootstrap method (Rao et al.,
1992) and appropriately rescale the weights. To be specific, the individual sampling
weights within the hth stratum (h = 1, …, H) are rescaled by the following equation:
$$w^*_{hij} = w_{hij}\left(1 - \sqrt{\frac{d_h}{n_h - 1}} + \sqrt{\frac{d_h}{n_h - 1}} \times \frac{n_h}{d_h} \times r_{hi}\right), \qquad (26)$$
where $w^*_{hij}$ is the rescaled weight for the jth individual in the ith PSU, $w_{hij}$ is the original
weight for the jth individual in the ith PSU, nh and dh are respectively the number of PSUs
and the number of bootstrap samples drawn from this stratum, and rhi is the number of
times the ith PSU is resampled. This is the underlying methodology applied in our
framework to achieve approximately unbiased standard errors and confidence intervals
adjusted for multistage complex sample designs.
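A minimal sketch of the PSU resampling and weight rescaling for a single stratum follows, assuming the rescaling factor of Rao et al. (1992) with square roots on the dh/(nh − 1) terms. The function name and the record layout, a list of (PSU identifier, weight) pairs, are hypothetical.

```python
import math
import random
from collections import Counter


def rescaled_weights(records, rng, draws=None):
    """Bootstrap-rescaled weights for one stratum.

    records: list of (psu_id, weight) pairs for the individuals in the stratum
    draws:   number of PSUs resampled with replacement (d_h); defaults to n_h - 1
    """
    psus = sorted({psu for psu, _ in records})
    n_h = len(psus)
    d_h = n_h - 1 if draws is None else draws
    counts = Counter(rng.choice(psus) for _ in range(d_h))  # r_hi for each PSU
    root = math.sqrt(d_h / (n_h - 1))
    return [
        w * (1.0 - root + root * (n_h / d_h) * counts[psu])
        for psu, w in records
    ]
```

With nh = 2 and dh = 1, the design common to NHANES and NHIS, one PSU per stratum is drawn, its members' weights are doubled, and the other PSU's weights are set to zero. Applying the function to every stratum and refitting the model yields one bootstrap replicate.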
Rao et al. (1992) and Rust & Rao (1996) each discuss in detail this method for
bootstrap adjustment of complex multistage sample weights when the number of PSUs
per stratum is at least 2 (nh ≥ 2). Rao et al. (1992) suggested that this method is valid and
consistent for estimated parameters expressed as either smooth or nonsmooth functions of
totals when nh ≥ 2 and H is relatively large (e.g., H = 49 in NHANES III). Setting nh = 2
is a popular choice (common to both the NHANES and NHIS series) as it provides the
maximum amount of stratification possible for conducting valid variance estimation.
Once we have settled on a model with K knots by application of our 2 df knot
testing procedure, we are prepared to ascertain the certainty in our parameter estimates.
We begin by applying the methods suggested by Rao et al. (1992) described above to
generate D2 nonparametric bootstrap replicate estimates per each parameter of interest.
Keith et al. (2008), in applying the LSE methodology, used the bootstrap-t method
described by DiCiccio and Efron (1996) for calculating 95% CI from D2 = 1000
nonparametric bootstrap replicates. Hall and Wilson (1991) suggest this method as a
general guideline for improving statistical power and the accuracy of coverage
probabilities (i.e. bootstrapping a distribution for an asymptotically pivotal quantity,
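$T = \dfrac{\hat{\theta} - \theta}{\hat{\sigma}}$ by $T^*_i = \dfrac{\hat{\theta}^*_i - \hat{\theta}}{\hat{\sigma}^*_i}$, $i = 1, \ldots, D_2$, where θ is some parameter of interest (say a
particular knot or slope), $\hat{\theta}$ is the original parameter estimate, $\hat{\sigma}$ is the original standard
deviation estimate, and $\hat{\theta}^*_i$ and $\hat{\sigma}^*_i$ are the parameter and standard deviation estimates,
respectively, from the ith bootstrapped sample). Then the bootstrap estimate of the
standard error of $\hat{\theta}$ is,
$$\hat{\sigma}^* = \sqrt{\frac{1}{D_2 - 1}\left(\hat{\boldsymbol{\theta}}^* - \hat{\theta}^*\right)^T\left(\hat{\boldsymbol{\theta}}^* - \hat{\theta}^*\right)}, \qquad (27)$$
where $\hat{\boldsymbol{\theta}}^*$ represents the vector of $\hat{\theta}^*_i$’s estimated from the $D_2$ bootstrap samples and
$\hat{\theta}^* = \frac{1}{D_2}\mathbf{1}^T\hat{\boldsymbol{\theta}}^*$ is the mean of the bootstrap replicates.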
The distribution of T is not necessarily symmetric, so we locate the critical values
at either end of the ordered bootstrapped distribution $T^* = \{T^*_{(1)}, \ldots, T^*_{(D_2)}\}$ such that
$P(T^*_{(\text{lower critical})} < T < T^*_{(\text{upper critical})}) \ge 0.95$, with equal probability in either tail, and applying
some algebra leads to the 95% CI for θ.
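The algebra referred to above inverts the inequality on T: because T = (θ̂ − θ)/σ̂, the upper critical value of T* gives the lower confidence limit and vice versa. A minimal sketch follows. The function name and the nearest-rank rule for choosing the critical order statistics are assumptions made for illustration.

```python
import math


def bootstrap_t_ci(theta_hat, sigma_hat, theta_reps, sigma_reps, level=0.95):
    """Bootstrap-t interval: (theta_hat - t_upper * sigma_hat, theta_hat - t_lower * sigma_hat)."""
    t_star = sorted((t - theta_hat) / s for t, s in zip(theta_reps, sigma_reps))
    d2 = len(t_star)
    tail = (1.0 - level) / 2.0
    # nearest-rank order statistics with equal probability in either tail
    lo_idx = min(max(int(math.floor(tail * d2 + 1e-9)), 0), d2 - 1)
    hi_idx = min(max(int(math.ceil((1.0 - tail) * d2 - 1e-9)) - 1, 0), d2 - 1)
    t_lower, t_upper = t_star[lo_idx], t_star[hi_idx]
    return theta_hat - t_upper * sigma_hat, theta_hat - t_lower * sigma_hat
```

Unlike a percentile interval, this construction needs a standard error estimate from every bootstrap replicate, not only the replicate parameter estimates.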
This method can be more stable and less conservative than using the more basic
percentile methods described by Davison and Hinkley (1997) and applied to free-knot
splines by Bessaoud et al. (2005); however, the standard error estimates, $\hat{\sigma}^*_i$, were drawn
from the optimization procedure (PROC NLIN) and required running the model with the
far less stable piecewise linear basis, depicted in (5), in order to apply them directly to the
bootstrap-t distribution of the slope coefficient parameters. The following specification
outlines this complex sample adjustment procedure.
Nonparametric bootstrap procedure specifications for calculating standard
errors and 95% confidence intervals for parameter estimates by the LSE approach:
Step 1: Input: Y, X, Z, W
Step 2: Nonparametric bootstrap:
    for j = 1 to D2 do
        for h = 1 to H do
            resample with replacement $m_h = n_h - 1$ PSUs from stratum h;
            rescale sample weights;
        End do;
        Minimize: $\hat{\theta}_j$ = arg min(objective = SSE(θ)), start = $\hat{\theta}$;
        Compute: $\hat{\theta}^{PLS}_j$ where B-spline parameter elements have been linearly transformed to piecewise linear slope parameters;
    End do;
Step 3: Let $\Lambda_i^T$ represent the vector transpose of the ith row of the matrix
    $\Lambda_{p \times D_2} = \left[\hat{\theta}^{PLS}_1 \ldots \hat{\theta}^{PLS}_{D_2}\right]$;
Step 4: Compute SE and 95% CI for each model parameter, i:
    for i = 1 to p do
        Compute: $\hat{\theta}^{PLS}_i = \frac{1}{D_2}\mathbf{1}^T\Lambda_i^T$;
        Compute: $\hat{\sigma}^*_i = \sqrt{\frac{1}{D_2 - 1}\left(\Lambda_i^T - \hat{\theta}^{PLS}_i\right)^T\left(\Lambda_i^T - \hat{\theta}^{PLS}_i\right)}$;
        Sort: $\Lambda_i^T$ in ascending order;
        Compute: $T^*_i = \{T^*_{i(1)}, \ldots, T^*_{i(D_2)}\}$ from $\Lambda_i^T$;
        Compute: $P(T^*_{i(\text{lower critical})} < T_i < T^*_{i(\text{upper critical})}) \ge 0.95$ with equal probability in either tail;
        Compute: 95% CI for $\theta^{PLS}_i$ from $T^*_{i(\text{lower critical})}$ and $T^*_{i(\text{upper critical})}$;
    End do;
For the MLE, we decided to implement the more conservative percentile method
of calculating the 95% confidence intervals and do all optimization with the B-spline
basis.
Nonparametric bootstrap procedure specifications for calculating standard
errors and 95% confidence intervals for parameter estimates by the MLE approach:
Step 1: Input: Y, X, Z, W
Step 2: Nonparametric bootstrap:
    for j = 1 to D2 do
        for h = 1 to H do
            resample with replacement $m_h = n_h - 1$ PSUs from stratum h;
            rescale sample weights;
        End do;
        Minimize: $\hat{\theta}_j$ = arg min(objective = -ln[L(θ)]), start = $\hat{\theta}$;
        Compute: $\hat{\theta}^{PLS}_j$ where B-spline parameter elements have been linearly transformed to piecewise linear slope parameters;
    End do;
Step 3: Let $\Lambda_i^T$ represent the vector transpose of the ith row of the matrix
    $\Lambda_{p \times D_2} = \left[\hat{\theta}^{PLS}_1 \ldots \hat{\theta}^{PLS}_{D_2}\right]$;
Step 4: Compute SE and 95% CI for each model parameter, i:
    for i = 1 to p do
        Compute: $\hat{\theta}^{PLS}_i = \frac{1}{D_2}\mathbf{1}^T\Lambda_i^T$;
        Compute: $\hat{\sigma}^*_i = \sqrt{\frac{1}{D_2 - 1}\left(\Lambda_i^T - \hat{\theta}^{PLS}_i\right)^T\left(\Lambda_i^T - \hat{\theta}^{PLS}_i\right)}$;
        Sort: $\Lambda_i^T$ in ascending order;
        Set lower bound for the 95% CI of $\theta^{PLS}_i$ to the 2.5th percentile of $\Lambda_i^T$;
        Set upper bound for the 95% CI of $\theta^{PLS}_i$ to the 97.5th percentile of $\Lambda_i^T$;
    End do;
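Step 4 of the MLE specification reduces to a standard deviation and two order statistics of the replicate estimates for each parameter. A minimal sketch for one parameter is given below. The function names are hypothetical, and the nearest-rank rule for the percentiles is an assumption, since other percentile definitions would give slightly different bounds.

```python
import math


def bootstrap_se(replicates):
    """Standard deviation of the bootstrap replicates with a D2 - 1 divisor."""
    d2 = len(replicates)
    mean = sum(replicates) / d2
    return math.sqrt(sum((r - mean) * (r - mean) for r in replicates) / (d2 - 1))


def percentile_ci(replicates, level=0.95):
    """Percentile interval from the sorted replicates (nearest-rank rule)."""
    ordered = sorted(replicates)
    d2 = len(ordered)
    tail = (1.0 - level) / 2.0
    lo_idx = min(max(int(math.floor(tail * d2 + 1e-9)), 0), d2 - 1)
    hi_idx = min(max(int(math.ceil((1.0 - tail) * d2 - 1e-9)) - 1, 0), d2 - 1)
    return ordered[lo_idx], ordered[hi_idx]
```

Both functions would be applied to each row of the replicate matrix in turn, that is, to the D2 replicate values of one slope, knot, or covariate coefficient.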
Odds ratios
We found odds ratios (OR) to be a powerful way of expressing event risk as a
function of the nonlinear predictor. We choose OR over the log(odds) when models have
been adjusted for covariate information because, unlike log(odds), OR for comparing two
otherwise similar individuals do not depend on the covariates. While computing OR in
our framework is not quite as simple as in a conventional GLM, it is straight-forward.
Assume the basis in equation (5) and, assuming all else is equal between individuals l = 1
and l = 2 except for their respective nonlinear predictor values, X1 and X2, respectively,
we may compute an odds ratio:
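$$\mathrm{OR} = \frac{\exp\left[a_1 X_1 I\{X_1 < \zeta_1\} + \sum_{i=1}^{K}\left\{a_{i+1}(X_1 - \zeta_i) + \sum_{j=1}^{i} a_j(\zeta_j - \zeta_{j-1})\right\} I\{\zeta_i \le X_1 < \zeta_{i+1}\}\right]}{\exp\left[a_1 X_2 I\{X_2 < \zeta_1\} + \sum_{i=1}^{K}\left\{a_{i+1}(X_2 - \zeta_i) + \sum_{j=1}^{i} a_j(\zeta_j - \zeta_{j-1})\right\} I\{\zeta_i \le X_2 < \zeta_{i+1}\}\right]}. \qquad (28)$$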
Graphical representations of this OR may be created if a suitable reference level can be
fixed for X2 while allowing X1 to range.
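Because the intercept and covariate terms cancel, the OR depends only on the difference between the piecewise linear contributions at X1 and X2. The sketch below assumes a continuous piecewise linear log(odds) with slopes a1, …, aK+1 and knots ζ1, …, ζK, measured from zero. The function names are hypothetical, and the knot and slope values in the usage note are illustrative only.

```python
import math


def log_odds_shape(x, knots, slopes):
    """Piecewise linear contribution of the nonlinear predictor to the log(odds)."""
    edges = (0.0,) + tuple(knots) + (float("inf"),)
    return sum(
        a * (min(max(x, lo), hi) - lo)
        for a, lo, hi in zip(slopes, edges[:-1], edges[1:])
    )


def odds_ratio(x1, x2, knots, slopes):
    """OR comparing predictor value x1 with reference value x2, all else equal."""
    return math.exp(log_odds_shape(x1, knots, slopes) - log_odds_shape(x2, knots, slopes))
```

For instance, with knots at 25 and 32 and slopes of -0.4, 0.0, and 0.2, `odds_ratio(40.0, 25.0, (25.0, 32.0), (-0.4, 0.0, 0.2))` evaluates exp(0.2 × 8) = exp(1.6). Evaluating the function over a range of x1 values with x2 fixed gives the curve for a graphical display.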
Possible Future Directions
The next step for the development of the MLE framework is to introduce
penalties for coalescing knots in a fashion similar to that of Lindstrom (1999). If the data
are truly better modeled by jump-discontinuities, then the modeling framework should
allow for this possibility. Inducing penalties to avoid unnecessary overlapping of knots is
thus a more appealing approach to avoiding lethargy problems (Jupp, 1978) than
dropping models in which knots have coalesced (Bessaoud et al., 2005) or by enforcing
linear constraints to ensure a certain amount of space between knots as we have done.
Modeling time to events or censorship during follow-up in complex samples is a
crucial objective for this nonlinear modeling framework. We expect to extend our
likelihood-based MLE methods to modeling nonlinear bases in partial likelihood
formulations (Cox, 1975). This will provide a foundation to begin modeling relative risks
in terms of hazard ratios computed by nonlinear proportional hazards regression in our
framework with some modifications to the design of the MLE approach.
Discussion
The methods described in this paper have been successfully implemented and
applied elsewhere to real complex data on BMI related to headaches among women by
the LSE approach (Keith et al., 2008) and to BMI or waist-to-hip ratio related mortality
by the MLE approach (Keith et al., in preparation). Our MLE 2 df knot testing procedure
for specifying the optimal number of knots performed accurately, and very similarly to AIC,
when the sample size was large or the inclusion criterion was set fairly high
(i.e., n = 5000 or α = 0.25). BIC was too conservative unless both the proportion of
events and the sample size were large (i.e., po = 0.33 and n = 5000). Most sample sizes
among complex nationally representative surveys have at least 5,000 participant records
available for analyses. However, when the data are stratified and analyses are run on
small subsets of the survey data, our methods may not have enough power to precisely or
accurately characterize the “true” model.
Neither AIC nor BIC incurred penalties sufficient to curb overfitting
when complex sampling weights were applied to the likelihood objective functions. The
weights distorted the scale of the likelihood relative to the penalty to the point that the
penalty no longer corrected for overfitting. It is clear that the likelihood and/or the penalty terms
must be normalized in some way before these methods would work correctly in the
complex sample analysis setting.
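A small numerical sketch may make the scaling problem concrete. It is not the procedure used in this study: the outcome vector, the fitted probabilities, and the weights are hypothetical, and rescaling the weights to sum to the number of records is only one possible normalization, assumed here for illustration.

```python
import math

def weighted_loglik(y, p, w):
    """Weighted Bernoulli log-likelihood: sum_i w_i * log f(y_i | p_i)."""
    return sum(wi * (math.log(pi) if yi else math.log(1.0 - pi))
               for yi, pi, wi in zip(y, p, w))

def aic(loglik, n_params):
    return -2.0 * loglik + 2.0 * n_params

def normalize_weights(w):
    """Rescale weights to sum to the number of records (assumed normalization)."""
    scale = len(w) / sum(w)
    return [wi * scale for wi in w]

# Hypothetical data and fitted probabilities from a 1-parameter and a
# 2-parameter model; each record represents 1500 population members.
y = [0, 1, 0, 0, 1, 0]
p_simple = [0.3] * 6
p_complex = [0.25, 0.4, 0.25, 0.25, 0.4, 0.25]
raw_w = [1500.0] * 6

gains = {}
for label, w in (("raw", raw_w), ("normalized", normalize_weights(raw_w))):
    # Positive gain means AIC prefers the larger model.
    gains[label] = (aic(weighted_loglik(y, p_simple, w), 1)
                    - aic(weighted_loglik(y, p_complex, w), 2))
    print(label, round(gains[label], 3))
```

With the raw weights the deviance difference is multiplied by the weight scale while the penalty stays at 2 per parameter, so the larger model is favored; after normalization the penalty is again on the same scale as the likelihood.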
Due to computational demands and long run-times associated with our MLE 2 df
knot testing procedure, we were only able to conduct analyses on 100 simulations for 8
setting combinations with D1 = 200 parametric bootstrap replicates and a coarse grid
search over L = 6 evenly-spaced BMI locations. Bessaoud et al. (2005) suggest D1 =
1000 and L = 9, which would be feasible for this simulation study only with high
performance parallel computing resources. We are currently porting the SAS programs to
R (R Development Core Team, 2005) for parallel processing of a more extensive
simulation study. Additionally, we did not introduce covariates into the simulated
models. In future studies we will examine how correlation structures and collinearities
might influence the performance of our knot selection procedure. We acknowledge these
weaknesses in our current study; however, we feel that the results from the simulations
are valid and characterize several properties of this aspect of the modeling framework
quite well.
Our methods are intended for use on biological data in which the knot parameters
have meaning and thus we expect to be looking for a relatively low number of knots in
data conditions where the number of observations, n, is expected to be much larger than
the number of parameters, p. Furthermore, given the computational intensity of fitting the
framework, it is not recommended for applications where p is close to or greater than n.
With enough computing power, we suggest that the nonlinear bases in our framework can
be easily extended for fitting models with more than one nonlinear predictor, say age and
BMI, as well as multiplicative interactions. We again offer the caveat that the volume of
the multivariate parameter space will increase exponentially with each added nonlinear
predictor (the “curse of dimensionality”; Bellman, 1957; Hastie and Tibshirani, 1986),
as mentioned in the introduction.
We feel that our 2 df testing procedure for the significant contribution to model fit
of adding a knot is general enough to test against polynomial models. That is, the null
and/or alternative models do not necessarily have to be linear logistic or piecewise linear
logistic. The distributions of the likelihood ratio or F-ratio test statistics are constructed
empirically by the parametric bootstrap procedure and do not rely on any particular null or
alternative model specifications. The forward testing routine could then be modified to
test if a smooth polynomial would fit the data significantly better than a piecewise model.
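The logic of building the null distribution empirically can be sketched as follows. To keep the example short and self-contained, the free-knot models are swapped for a deliberately simple pair (a common event probability versus group-specific probabilities) whose maximum likelihood fits have closed forms; the data, function names, and seed are hypothetical. The default of 200 replicates mirrors D1 = 200, and the (count + 1)/(replicates + 1) p-value is a common bootstrap convention assumed here.

```python
import math
import random

def bernoulli_loglik(y, p):
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
    events = sum(y)
    return events * math.log(p) + (len(y) - events) * math.log(1.0 - p)

def lr_statistic(y, group):
    """2 * (log-likelihood of alternative - log-likelihood of null)."""
    ll_null = bernoulli_loglik(y, sum(y) / len(y))
    ll_alt = 0.0
    for g in set(group):
        y_g = [yi for yi, gi in zip(y, group) if gi == g]
        ll_alt += bernoulli_loglik(y_g, sum(y_g) / len(y_g))
    return 2.0 * (ll_alt - ll_null)

def bootstrap_p_value(y, group, n_boot=200, seed=1):
    rng = random.Random(seed)
    observed = lr_statistic(y, group)
    p_null = sum(y) / len(y)  # null model fitted to the observed data
    exceed = 0
    for _ in range(n_boot):
        # Simulate outcomes from the fitted null model, then refit both models.
        y_star = [1 if rng.random() < p_null else 0 for _ in y]
        if lr_statistic(y_star, group) >= observed:
            exceed += 1
    return (exceed + 1) / (n_boot + 1)

# Hypothetical data: 10/50 events in group 0 and 30/50 events in group 1.
group = [0] * 50 + [1] * 50
y = [0] * 40 + [1] * 10 + [0] * 20 + [1] * 30
p_value = bootstrap_p_value(y, group)
print(p_value)
```

Nothing in the loop depends on the form of the two models, which is the property that allows other null or alternative specifications to be substituted.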
AIC and BIC can be used to compare non-nested models. Although our MLE 2
df knot testing procedure performed well in our simulation study in comparison to AIC
and BIC, it is important to note that a problem may exist for our approach to selecting the
optimal number of knots, K. One of the foundational assumptions of the forward
selection is that the model with K knots is nested within the model with K+1 knots. As
Bessaoud et al. (2005) pointed out, free-knot splines in which both the number and
locations of knots are estimated are non-nested, with the notable exception that the linear
model (K=0) is nested in all K-knot models. Although it is hard to imagine a well-fitted
(K+1)-knot model fitting any worse than a K-knot model, this could possibly happen
since the models are not nested. Defining nested models is not an easy task. Clarke
(2001) provides intuitive, but oversimplified definitions of nested and nonnested models:
“Two models are nested if one model can be reduced to the other model by
imposing a set of linear restrictions on the parameter vector.”
“Two models are nonnested, either partially or strictly, if one model cannot be
reduced to the other model by imposing a set of linear restrictions on the
parameter vector.”
This is the basic idea presented in introductory statistics courses as a foundation for the
asymptotic F-test or likelihood ratio test for testing the contribution of parameter subsets
to the overall model fit in regression analysis. Our situation with fitting free knot
parameters is more complicated than regression. When a free knot parameter is added or
removed from these models, the parameters locally fitted to construct the adjoining spline
segments do not maintain their definition. If we compare two piecewise linear free-knot
spline models (say K=1 to K=2) fitted to the same data, we cannot say that the slope
parameter to the right of the knot in the K=1 model (a2) means the same thing as the slope
parameter (a2) between the two knots in the K=2 model (note that it does not mean the
same thing as the parameter (a3) to the right of the second knot either). These models
would be nested if the knot fitted in the K=1 knot model was in a fixed location for the
K=2 knot model, but conditioning added knots on the location of the previous knot
locations undermines the properties we prize in free-knot splines. Even though our 2 df
knot testing procedure appeared to work well in simulations and in application to real
data, developing a more comprehensive approach to selecting the optimal K from among
nonnested candidate models deserves further research.
References
Akaike H. (1974) A new look at the statistical model identification. IEEE Transactions
on Automatic Control;19:716–723.
Bellman RE. (1957) Dynamic Programming. Princeton University Press, Princeton, NJ.
Bessaoud F, Daurès JP, Molinari N. (2005) Free knot splines for
logistic models and threshold selection. Computer Methods and Programs
in Biomedicine;77:1-9.
Broyden CG. (1970) The Convergence of a Class of Double-rank Minimization
Algorithms. Journal of the Institute of Mathematics and Its Applications;6:76-90.
Burchard HG. (1974) Splines (With Optimal Knots) Are Better. Applicable
Analysis;3:309-319.
Casella G, Berger R. (2001) Statistical Inference. 2nd Edition. New York: Duxbury.
Clarke KA. (2001) Testing nonnested models of international relations: reevaluating
realism. American Journal of Political Science; 45:724-44.
Cox DR. (1975) Partial likelihood. Biometrika;62:69-72.
Davison AC, Hinkley DV. (1997) Bootstrap Methods and their Application. New York:
Cambridge University Press.
de Boor C. (1978) A Practical Guide to Splines. New York: Springer-Verlag.
DiCiccio TJ, Efron B. (1996) Bootstrap Confidence Intervals. Stat Sci;11:189-212.
Dierckx P. (1993) Curve and Surface Fitting with Splines. Oxford Science Publications.
DiMatteo I, Genovese CR, Kass RE. (2001) Bayesian curve-fitting with free-knot splines.
Biometrika;88:1055-71.
Efron B. (1982) The Jackknife, the Bootstrap, and Other Resampling Plans.
Philadelphia: SIAM.
Eilers P, Marx B. (1996) Flexible Smoothing with B-splines and Penalties. Statistical
Science;11:89-121.
Fletcher R. (1970) A New Approach to Variable Metric Algorithms. Computer
Journal;13:317-22.
Friedman J. (1991) Invited Paper: Multivariate Adaptive Regression Splines. The Annals
of Statistics;19:1-141.
Gray RJ. (1996) Hazard Rate Regression Using Ordinary Nonparametric Regression
Smoothers. Journal of Computational and Graphical Statistics;5:190-207.
Gray RJ. (1994) Spline-based tests in survival analysis. Biometrics;50:640-52.
Goldfarb D. (1970) A Family of Variable Metric Updates Derived by Variational Means.
Mathematics of Computation;24:23-6.
Hall P, Wilson S. (1991) Two guidelines for bootstrap hypothesis testing.
Biometrics;47:757-762.
Hastie T, Tibshirani R. (1986) Generalized Additive Models (with discussion). Statistical
Science;1:297-318.
Hastie T, Tibshirani R. (1990) Generalized additive models. Chapman and Hall, London.
Hastie T, Tibshirani R, Friedman J. (2001) The Elements of Statistical Learning: Data
Mining, Inference, and Prediction. New York: Springer-Verlag.
Johnson MS. (2007) Modeling dichotomous item responses with free-knot splines.
Computational Statistics and Data Analysis;51:4178-4192.
Jupp DL. (1978) Approximation to data by splines with free knots. SIAM J Numer
Anal;15:328-343.
Kauermann G, Claeskens G, Opsomer JD. (2006). Bootstrapping for Penalized Spline
Regression. Preprint Series #06-01, Department of Statistics, Iowa State
University. Submitted to Statistical Science.
Keith SW, Wang C, Fontaine KR, Allison DB. (2008) Body mass index and headache
among women: Results from 11 epidemiologic datasets. Obesity;16:377-83.
Kish L. (1995) Survey Sampling, Wiley, New York.
Kooperberg C, Stone C, Truong Y. (1995) Hazard Regression. JASA;90:78-94.
Korn EL, Graubard BI. (1999) Analysis of Health Surveys. J. Wiley & Sons, New York.
Levenberg K. (1944) A Method for the Solution of Certain Problems in Least Squares.
Quart. Appl. Math;2:164-168.
Lindstrom MJ. (1999) Penalized estimation of free-knot splines. Journal of
Computational and Graphical Statistics;8:333-352.
Lindstrom MJ. (2002) Bayesian estimation of free-knot splines using reversible jumps.
Computational Statistics and Data Analysis;41:255-269.
Lu WW. (2000) Confidentiality and variance estimation in complex surveys. M.Sc.
Thesis, Simon Fraser University.
Marquardt D. (1963) An Algorithm for Least-Squares Estimation of Nonlinear
Parameters. SIAM J. Appl. Math;11:431-441.
McCullagh P, Nelder JA. (1989) Generalized Linear Models, Second Edition. Chapman
& Hall/CRC, Boca Raton.
Molinari N, Daures JP, Durand JF. (2001) Regression splines for threshold selection in
survival data analysis; 20:237-247.
Nelder JA, Mead R. (1965) A Simplex Method for Function Minimization. Computer
J;7:308-313.
Nelder JA, Wedderburn R. (1972) Generalized Linear Models. J.R. Statisti. Soc.
A;135:370-384.
Pfeffermann D. (1993) The role of sampling weights when modeling survey data.
International Statistical Review;61:317-37.
R Development Core Team (2005). R: A Language and Environment for Statistical
Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-
900051-07-0, URL http://www.R-project.org/.
Rao JNK. (1997) Developments in sample survey theory: an appraisal. The Canadian
Journal of Statistics;25:1-21.
Rao JNK, Wu CFJ, Yue K. (1992) Some recent work on resampling methods for
complex surveys. Survey Methodology;18:209–217.
Rosenberg P. (1995) Hazard Function Estimation Using B-Splines. Biometrics;51:874-
887.
Ruppert D. (2002) Selecting the Number of Knots for Penalized Splines. Journal of
Computational & Graphical Statistics;11:735-57.
Ruppert D, Wand MP, Carroll RJ. (2003) Semiparametric Regression. New York:
Cambridge University Press.
Rust KF, Rao JNK. (1996) Variance estimation for complex surveys using replication
techniques. Statistical Methods in Medical Research;5:283-310.
SAS Institute Inc., Cary, NC, USA; version 9.1.
Schwarz G. (1978) Estimating the dimension of a model. Ann. Stat;6:461-464.
Shanno DF. (1970) Conditioning of Quasi-Newton Methods for Function Minimization.
Mathematics of Computation;24:647-56.
Stone CJ, Hansen MH, Kooperberg C, Truong YK. (1997) Polynomial Splines and Their
Tensor Products in Extended Linear Modeling. The Annals of Statistics;25:1371-
1425.
U.S. Department of Health and Human Services (DHHS). (1996) National Center for
Health Statistics. Third National Health and Nutrition Examination Survey, 1988-
1994, NHANES III Laboratory Data File (CD-ROM). Public Use Data File
Documentation Number 76200. Hyattsville, MD: Centers for Disease Control and
Prevention.
Wahba G. (1990) Spline Models for Observational Data. SIAM, Philadelphia.
Wand MP, Pearce ND. (2006) Penalized splines and reproducing kernel methods. The
American Statistician;60:233-240.
Wood S. (2006) Generalized Additive Models: An Introduction with R. New York:
Chapman & Hall/CRC.
Figure 1. Plotted B-spline basis functions of order m = 2 having knots at BMI =
21 and BMI = 34.
Figure 2. Model selection simulation results for the parametric bootstrap 2 df forward selection procedure. True log(odds) model (K = 2) plotted by BMI (in red) with results from 20 replicate datasets (in black). Each cell a) - i) depicts results from data simulated under various conditions including inclusion criterion, α, the proportion of events, po, and sample size, n.
Panels: a) α=0.10; po=0.10; n=500 b) α=0.10; po=0.33; n=500 c) α=0.25; po=0.10; n=500
d) α=0.25; po=0.33; n=500 e) α=0.10; po=0.10; n=5000 f) α=0.10; po=0.33; n=5000
g) α=0.25; po=0.10; n=5000 h) α=0.25; po=0.33; n=5000 i) α=0.10; po=0.33; n=5000*
* includes complex sample weights
Figure 3. A comparison of knot selection simulation results. Plotted lines connect frequencies for the number of knots fitted to 100 simulated datasets by method: AIC (in black), BIC (in blue), and the parametric 2 df forward selection procedure (in red). Each cell a) - i) depicts results from data simulated under various conditions including inclusion criterion, α, the proportion of events, po, and sample size, n.
Panels: a) α=0.10; po=0.10; n=500 b) α=0.10; po=0.33; n=500 c) α=0.25; po=0.10; n=500
d) α=0.25; po=0.33; n=500 e) α=0.10; po=0.10; n=5000 f) α=0.10; po=0.33; n=5000
g) α=0.25; po=0.10; n=5000 h) α=0.25; po=0.33; n=5000 i) α=0.10; po=0.33; n=5000*
* includes complex sample weights
BMI AND HEADACHE AMONG WOMEN: RESULTS FROM 11 EPIDEMIOLOGIC
DATASETS
by
SCOTT W. KEITH, CHENXI WANG, KEVIN R. FONTAINE, CHARLES D. COWAN, DAVID B. ALLISON
Obesity; 16:377-83.
Copyright 2008 by
Scott W. Keith
Abstract
Objective: To evaluate the association between body mass index (BMI: kg/m²) and
headaches among women.
Research Methods and Procedures: Cross-sectional analysis of 11 datasets identified
after searching for all large publicly available epidemiologic cohort study datasets
containing relevant variables. Datasets included: National Health Interview Survey:
1997-2003; the first National Health and Nutrition Examination Survey; Alameda County
Health Study; Tecumseh Community Health Study; and Women’s Health Initiative. The
women (220,370 in total) were aged 18 years or older and had reported their headache or
migraine status.
Results: Using nonlinear regression techniques and models adjusted for age, race, and
smoking, we found that increased BMI was generally associated with increased
likelihood of headache or severe headache among women. A BMI of approximately 20
was associated with the lowest risk of headache. Relative to a BMI of 20, mild obesity
(BMI of 30) was associated with roughly a 35% increase in odds of headache whereas
severe obesity (BMI of 40) was associated with roughly an 80% increase in odds. Results
were essentially unchanged when models were further adjusted for socioeconomic
variables, alcohol consumption, and hypertension. Being diagnosed with migraine
showed no association with BMI.
Discussion: Among US women, a BMI of approximately 20 (about the 5th percentile)
was associated with the lowest likelihood of headache. Consistently across studies, obese
women had significantly increased risk of headaches. In contrast, risk of diagnosed
migraine headache per se was not obviously related to BMI. The direction of causation
and mechanisms of action remain to be determined.
Key Words: women, headaches, migraines, nonlinear regression, splines.
Introduction
Various forms of headache (e.g., chronic daily headache, tension-type headache,
migraine headache) are disabling conditions (1-3) that, compared to other common pain
conditions, produce the greatest loss of productive time in the US workforce. (2) Because
the prevalence of the different forms of headache varies widely in published studies (e.g.,
1.3% to 86% for tension-type headache (1)), it is difficult both to derive a definitive
estimate and to assess whether headache prevalence has changed over time. (4,5)
Headache has been shown to be associated with breathing disorders, caffeine
consumption, alcohol consumption, hypertension, anxiety and depressive disorders. (6)
Emerging evidence from case-control (6,7) and observational studies (8-10) suggests that
increased body mass index (BMI: kg/m²) might be a risk factor for headache.
In this study we estimate the association between BMI and headache among adult
women using data from several large publicly available epidemiologic datasets. We
restricted our analyses to women because it has been established that headache
prevalence is much higher among women, (3) and preliminary unpublished data
suggested that obesity’s association with headache varied substantially by gender. This is
consistent with the differential associations of obesity and a variety of health issues
observed between men and women. (11-14) Rather than analyzing a single dataset and
issuing the near-ubiquitous call for replication in the discussion, we opted to analyze
multiple datasets using identical statistical procedures so that we could evaluate the
reproducibility of results and how results might change as a function of study-related
factors. This
allowed us to derive estimates of the magnitude of the BMI-headache association across
all publicly available epidemiologic datasets meeting a set of inclusion criteria.
Methods
Inclusion Criteria for Datasets
To rigorously evaluate the association between BMI and headache among
women, we used cross-sectional epidemiologic datasets that met the following
requirements: (i) they must be large enough (i.e., ≥ 500 women) to allow us to generate
reasonably precise estimates across a broad range of BMI; (ii) they must contain the
height and weight of respondents (measured or self-reported) allowing calculation of
BMI; (iii) they must contain respondents’ age, race, and other variables of interest (i.e.,
smoking status, socioeconomic status, and hypertension); and (iv) they must contain
information on the presence/absence of headache.
Dataset Search Procedures
To obtain epidemiologic datasets that met the aforementioned criteria, we
searched the following electronic resources: Inter-University Consortium for Political and
Social Research (ICPSR) [http://www.icpsr.umich.edu/access/index.html], the National
Center for Health Statistics [http://www.cdc.gov/nchs/express.htm], the National Heart,
Lung and Blood Institute (NHLBI) Population Studies Dataset
[http://apps.nhlbi.nih.gov/popstudies], the North Carolina Center for Population Studies
[http://www.cpc.unc.edu/restools/sdf], the Economic and Social Data Service, United
Kingdom [http://www.esds.ac.uk/access/access.asp], and the National Library of
Medicine’s Medline and pre-Medline dataset [http://www.ncbi.nlm.nih.gov]. The search
of these resources yielded 11 datasets that met criteria for inclusion in our analyses.
Overview of Datasets Used
We briefly describe here and in Table 1 the characteristics of the 11 datasets used
in our analysis.
The Alameda County Health Study (ACHS) followed adults selected in 1965 to
represent the non-institutionalized population of Alameda County, California. (15) Data
collected included self-reported demographic information, as well as physical, cognitive,
psychological, and social functioning.
The Tecumseh Community Health Study (TCHS), initiated in 1959, investigates
health and disease determinants in the rural community of Tecumseh, Michigan.
Participants completed extensive questionnaires and medical examinations. (16)
The National Health Interview Survey (NHIS: 1997 to 2003), begun in 1969, is a
continuing nationwide survey of the U.S. civilian non-institutionalized population
conducted in households on a yearly basis. (17) A probability sample of households is
interviewed each year. Detailed information on the health of each living member of the
sample household is obtained.
The First National Health and Nutrition Examination Survey (NHANES I) was
conducted from 1971 to 1975 on a nationwide probability sample of individuals aged 1-
74 years. We analyzed the data from women aged 18 and over. NHANES I collected data
via questionnaire as well as through comprehensive medical and dental examinations.
NHANES I design and sampling methods have been reported previously. (18)
The Women’s Health Initiative (WHI) is a 40-center, national United States study
of risk factors and the prevention of common causes of mortality, morbidity, and
impaired quality of life in women. Post-menopausal women, aged 50 to 79 years,
completed health forms and attended a clinic visit at baseline and three years later.
Details of the sampling design, protocol sampling procedures, and selection criteria have
been previously published. (19)
Study Variables
Predictor. Body mass index (BMI: kg/m²) was the predictor variable of primary
interest and was calculated from either measured or self-reported (NHIS only) weight and
height. Self-reported weight has been shown to correlate very highly with measured
weight. (20) BMI is largely independent of height (r ≈ -0.03), strongly related to weight (r
≈ 0.86), and reasonably correlated with body fatness. (21)
Outcome variables. The datasets varied somewhat with regard to how headache
was assessed (see Table 2). We recoded and dichotomized headache outcomes so that: 0
= absence of an indicator of severe or frequent headache or migraine versus 1 = presence
of an indicator of severe or frequent headache or migraine.
Covariates. Data on age, race, and smoking status were included in the primary
analyses (i.e., the primary models) as covariates. We also included socio-economic status
variables (income, education, and employment status), alcohol consumption, and
hypertension as covariates in our secondary analyses (i.e., the extended models).
Missing Data. Missing data were handled using listwise deletion (22) because
more complex missing data management procedures would impose a significantly greater
computational demand on an already computationally demanding set of analyses.
Furthermore, the complex sampling designs of datasets, most notably the NHIS, would
create additional statistical issues related to imputation. Although there was no reason to
hypothesize that “missingness” was systematically related to the study variables, we
noted two datasets in which some study variables were missing information in at least 5%
of records. In these datasets, we fitted a logistic regression model for each variable
with such missingness (> 5%) to test for a relationship between missingness, coded
as a binary dependent variable, and the other study variables as possible predictors.
Statistical analysis
Traditionally, the association between BMI and dichotomous
outcomes such as mortality or the presence/absence of a given medical condition has
been estimated by treating BMI as either a continuous or categorical variable. (23) Each
approach has advantages and disadvantages. Advantages of treating BMI as a continuous
variable include that it does not degrade the data, tends to preserve power, and does not
impose arbitrary cut-points. Rather, one can adjust for curvature in data via incorporation
of polynomials of BMI into the model. The major advantage of treating BMI as a
categorical variable, with the categories chosen a priori, is the (seeming) ease of
communication of the results and the allowance for marked non-linearity that may not be
easily captured by polynomials. The nonlinear regression we used offers an alternative
that captures advantages of treating BMI as both a continuous and categorical variable.
(24) Specifically, we applied piecewise linear free knot spline logistic regression models
that do not assume a linear relationship between BMI and headache and allow for fitting
“breakpoints” in the logit function at so-called knots that may be interpreted to define
categories. (24) Thus, these data-driven models can take into account potential non-
linearity by determining BMI categories for contiguous BMI groupings of individuals
with like patterns of risk while, at the same time, allowing individuals with different BMI
in a category to have different levels of risk estimated as a function of their BMI. In brief,
we fitted nonlinear models to each dataset, used a parametric bootstrap procedure to
select the optimal spline model for BMI, and then used a nonparametric bootstrap
procedure to calculate accurate standard error estimates and confidence intervals that
adjusted for complex sample design features.
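The resampling scheme is not detailed in this passage, so the following Python sketch should be read as one plausible design-based bootstrap rather than the procedure actually used. It assumes a rescaling bootstrap in the style of Rao and Wu, in which n_h - 1 primary sampling units (PSUs) are drawn with replacement within each stratum h and the weights are rescaled accordingly. A weighted mean stands in for the spline model coefficients, and the records, field names, and seed are hypothetical.

```python
import math
import random
from collections import defaultdict

def rescaled_bootstrap_weights(records, rng):
    """One replicate: draw n_h - 1 PSUs with replacement in each stratum."""
    psus = defaultdict(set)
    for r in records:
        psus[r["stratum"]].add(r["psu"])
    factor = {}
    for stratum, ids in psus.items():
        ids = sorted(ids)
        n_h = len(ids)  # this sketch requires at least 2 PSUs per stratum
        draws = [rng.choice(ids) for _ in range(n_h - 1)]
        for psu in ids:
            # Weight multiplier: times drawn, rescaled by n_h / (n_h - 1).
            factor[(stratum, psu)] = draws.count(psu) * n_h / (n_h - 1)
    return [r["weight"] * factor[(r["stratum"], r["psu"])] for r in records]

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def bootstrap_se(records, n_boot=500, seed=1):
    rng = random.Random(seed)
    y = [r["y"] for r in records]
    reps = [weighted_mean(y, rescaled_bootstrap_weights(records, rng))
            for _ in range(n_boot)]
    center = sum(reps) / n_boot
    return math.sqrt(sum((m - center) ** 2 for m in reps) / (n_boot - 1))

# Hypothetical survey: 2 strata, 3 PSUs per stratum, 4 records per PSU.
records = [{"stratum": s, "psu": p, "weight": 100.0 * (s + 1),
            "y": 1 if k < p + s else 0}
           for s in range(2) for p in range(3) for k in range(4)]
se = bootstrap_se(records)
print(round(se, 4))
```

Resampling whole PSUs within strata, rather than individual records, is what lets the replicate-to-replicate spread reflect the clustering and stratification of the design; percentile or other bootstrap confidence intervals could be formed from the same replicates.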
Two analyses were conducted on each dataset: primary and secondary. In the
primary analyses, we adjusted for age, race, and smoking status. In secondary analyses,
we assessed an extended model that adjusted for the aforementioned covariates as well as
socio-economic variables, alcohol consumption, and hypertension. For purposes of
comparing models within each dataset, the number of knot parameters fitted in the
extended model was fixed at the number of knots found in the primary model.
We excluded BMI values less than 14 and greater than 90 to avoid possible outlier
effects and data recording errors. After exclusions, a total of 220,370 respondents from
the 11 datasets were available for statistical analyses. Results were presented as
parameter estimates with standard errors and 95% confidence intervals. We also plotted
odds ratios for graphically demonstrating the shapes of relationships we found and
calculated odds ratios and confidence intervals at selected BMI values (i.e., 18, 25, 30,
35, 40) as compared to a reference BMI value for each of the 11 datasets.
Results
Table 3 and Figure 1 present the piecewise logistic regression results for the
primary model (i.e., adjusted for age, race, and smoking) for each of the 11 datasets.
Increased BMI was generally associated with increased risk of headache or severe
headache among women. Moreover, results from the NHIS 1997, NHIS 1999, NHIS
2003, and ACHS datasets located “breakpoints” (knots) in the logit function around a
BMI of 20 suggesting that the relationship between increased BMI and the risk of
headache may change significantly at this point. As shown in Figure 1, our models often
predicted that a BMI of approximately 20 was associated with the lowest risk of severe
and/or frequent headaches. The NHIS 1997 data also produced a second knot at a BMI of
about 35. At this point, the increased risk of headache with increased BMI significantly
decelerated suggesting that people having BMI greater than 35 may share the same level
of headache risk with respect to BMI. Risk of being diagnosed with migraine headache
(as assessed in the WHI) or with taking medication for headache (as assessed in
NHANES I) were not significantly related to BMI.
Table 4 and Figure 1 present the results for the extended models (i.e., the primary
models extended to include socio-economic status variables, alcohol consumption, and
hypertension as covariates). As can be seen, these estimates were generally in accord
with those derived from the primary models. The results from the NHANES I data were
also not materially altered when headache was coded either as: [No = 0; Occasionally &
Regularly = 1 OR No & Occasionally = 0; Regularly = 1].
Table 5 presents the odds ratios (ORs) and confidence intervals derived from the
primary and extended models for selected BMI values (i.e., 18, 25, 30, 35, and 40)
compared to the reference BMI of 20. We chose a BMI of 20 as the reference level
because it was the most common nadir (i.e., the value most often associated with the
lowest probability of reporting headache) across the datasets, thus making the computed
ORs greater than 1 in most cases. This allowed us to present results on a consistent scale
for visually comparing ORs across the datasets we examined. Among
women with BMI greater than 20, the ORs were mostly statistically significant and
exhibited a similar increasing pattern in headache risk across 9 of our 11 datasets. For
example, among the NHIS results: as compared to a BMI of 20, we estimated that mild
obesity (BMI of 30) was associated with an increase in odds of reporting headache
ranging from 31% to 65% whereas severe obesity (BMI of 40) was associated with an
increase in odds ranging from 49% to 118%.
We found that only NHANES I and TCHS had variables which were missing
information from at least 5% of records. Smoking status was available for only about 40% of the
NHANES I women participants and was removed from the analyses presented here.
Analyses that included smoking status in NHANES I produced the same non-significant results
(data not shown). The missing smoking data were associated with decreased BMI,
decreased age, income below $20,000, not being a current drinker, and hypertension.
Hypertension was missing in 23% of women in NHANES I and missingness was
associated with taking headache medication, increased age, being white, having attended
graduate school, earning over $20,000, having ever been an alcohol drinker, and
increased BMI (data not shown). In the TCHS data, missing information on income
(22%) was associated with increased age; missing information on hypertension (9%) was
associated with income over $20,000 and decreased BMI; and missing information on
BMI (5%) was associated with having headaches (data not shown).
Discussion
In this set of analyses of 11 different, large datasets collectively containing over
200,000 US women, we found that increased BMI was generally associated with
significantly increased risk of headache, but not diagnosed migraines. We note that our
results across all datasets, with the exceptions of WHI (diagnosed migraines) and
NHANES I (taking headache medication), suggested that, as compared to a BMI of 20,
mild obesity (BMI of 30) was associated with approximately a 35% increase in odds of
reporting headache whereas severe obesity (BMI of 40) was associated with a roughly
80% increase in odds of headache. Across the databases, a BMI of about 20 was
commonly associated with the lowest risk of headache. These results were not materially
altered when socioeconomic variables, alcohol consumption, and hypertension were also
included in the model.
With regard to migraine headache, results from our primary model of the WHI
data, the only dataset that explicitly assessed migraine headache diagnosis, suggested that
BMI may not be associated with migraine, but our extended model revealed a slight
negative association. It is noteworthy that many people with migraines go undiagnosed.
Therefore, the relationship between BMI and diagnosed migraines is not likely to reflect
the BMI relationship with all migraines (both diagnosed and undiagnosed). Our main
conclusion that BMI was associated with increased likelihood of headaches is based on
the following logic: 1) the WHI analysis showed no positive correlation between BMI
and diagnosed migraines, and this is a finding of consequence because of the large WHI
sample; 2) the NHIS analyses showed a positive association between BMI and
“headaches or migraines;” 3) the ACHS and TCHS showed a positive association
between BMI and “headaches” which were probably interpreted by most participants to
include migraines; 4) finding 1) suggested that there was likewise no association between BMI
and diagnosed migraines among the NHIS, ACHS, and TCHS participants; and 5) we therefore
concluded that the findings in NHIS, ACHS, and TCHS suggest that BMI was associated
with non-migraine headaches and possibly undiagnosed migraine.
Most of the databases we analyzed individually provided ample statistical power
to detect the estimated effect sizes we observed. That, along with the consistency of the
results, obviated the need to conduct a formal meta-analysis. Our findings clarify and
accord with previous studies. Specifically, after adjusting for age, gender, race, and
education, Scher and colleagues (6) found that obesity was associated with prevalent
chronic daily headache (OR = 1.34). Similarly, Ohayon (10) found that
overweight/obese (BMI >27) respondents were more likely to report morning headache
than were adults with BMIs of 20-25. Among a sample of nearly 15,000 Australian
women, Brown and colleagues (9) found that obese persons were more likely to report
headache (OR = 1.47). Also, consistent with our primary model analysis of WHI data,
Bigal and colleagues (8), using data from over 30,000 participants, found that BMI was
not associated with migraine prevalence.
Interestingly, we observed some evidence in four datasets (NHIS 1997, 1999,
2003; and ACHS) that unusually low BMI may be associated with increased risk of
headache. These results suggest that increased BMI may be associated with decreased
risk of headache among the category of women having BMI less than 20 and increased
risk of headache among those having BMI over 20. This finding should be interpreted
with caution since the association was statistically significant at the 0.05 level only in the
NHIS 1997 and 1999 datasets. Moreover, because only about 5% of all study
participants had BMI values below 20, we may have lacked sufficient power to reliably
detect the elevated risk levels estimated to be associated with low BMI across studies. To
our knowledge, low BMI, in the absence of major illness (e.g., cancer), has not been
previously associated with reports of headache. Nonetheless, this finding merits further
investigation before definitive conclusions can be drawn.
Nine of the 11 datasets we examined had no study variable with more than 3% of
values missing, so any effects of missingness were likely minimal in these cases. In
NHANES I and TCHS, where we saw higher levels of missing data for some variables, it
was less clear what effects, if any, missing values might have had on our results.
Interestingly, in TCHS, those reporting headache were about 80% more likely to be
missing BMI data than those not reporting headache, but we cannot know how this would
influence the significant linear relationship we detected between BMI and headache.
Considering results from the more complete datasets (i.e., NHIS 1997 - 2003 and ACHS)
and the similarity of their results to those from TCHS, missingness may not have
significantly affected the TCHS results.
The mechanisms that might be responsible for the obesity-headache association
are unclear. However, obesity associates with the metabolic syndrome, a pro-
inflammatory, pro-thrombotic state which may contribute to headache development and
progression. (25,26) Headache is also related to sleep apnea, a condition highly
associated with obesity. (27) Hypertension also associates with headache (28) and obesity
is a major risk factor for hypertension. (29) Moreover, headache is one of the side effects
of many medications, including sibutramine, a medication to treat obesity. (30) Each of
these offers a hypothesis meriting further study.
This study has limitations. First, the headache-related questions in the datasets
differed, in some cases substantially. For example, the WHI headache question focused
on migraine headache and asked, “Has a doctor told you that you have ‘migraine’?” In
contrast, the NHANES headache question did not ask whether the respondent suffered
from headache but, rather, whether they used medication for headache (“During the past
6 months have you used any medicine, drugs or pills for headache?”). Although we coded
the headache variables in the datasets to create uniformity in outcome variables (see
Table 2), these two datasets (WHI and NHANES I) which asked about headache in a way
related to diagnosis or treatment are the only two that did not detect clear and statistically
significant associations. The assessment of headache in the other datasets focused
primarily on the presence/frequency/severity of headaches. Second, we only considered
cross-sectional datasets as they were more widely available and different statistical
methodology would be required to analyze longitudinal data. Hence, our analyses were
restricted to headache or migraine status concurrent with BMI status. We did not look at
data on subjects free of headaches at baseline who were followed prospectively to see if
BMI or changes in BMI would predict headache or migraine occurrence over time.
Follow-up data were available in only the two smallest studies (ACHS and TCHS). In the
future, we recommend analyzing any available longitudinal data on headache and BMI
by using nonlinear methodology similar to that which we have applied to these cross-
sectional databases.
In conclusion, the results of estimating the association between BMI and
headache in large, nationally-representative samples of women indicated that obese
women have significantly higher risk of headaches. Further research is warranted to study
the direction and mechanisms of causation as well as to investigate the possible BMI-
headache relationship among men. The possibility that weight loss may alleviate severe
or chronic headache problems among obese people also warrants investigation.
Acknowledgement
This research was supported by Ortho-McNeil Pharmaceutical, Inc. and NIH
grants P30DK056336, T32HL079888, K23MH066381, and AR49720-01A1.
Conflict of interest statement
The corresponding author, David B. Allison, had full access to all the data in the
study and had final responsibility for the decision to submit for publication. The
investigators have no financial or personal relationships with other people or
organizations that could inappropriately influence (bias) their work. They do wish to
disclose that the work was funded by Ortho-McNeil Pharmaceutical, Inc.
References
1 Schwartz BS, Stewart WF, Simon D, Lipton RB. Epidemiology of tension–type
headache. JAMA. 1998;279:381–83.
2 Stewart WF, Ricci JA, Chee E, Morganstein D, Lipton R. Lost productive time and
cost due to common pain conditions in the US workforce. JAMA. 2003;290:2443–54.
3 Scher AI, Stewart WF, Liberman J, Lipton RB. Prevalence of frequent headache in a
population sample. Headache. 1998;38:497–506.
4 Lipton RB, Diamond M, Freitag FG, Bigal M, Stewart WF, Reed ML. Migraine
prevention patterns in a community sample: Results from the American Migraine
Prevalence and Prevention (AMPP) Study. Headache. 2005;65:792. Abstract F38.
5 Rasmussen BK, Jensen R, Schroll M, Olesen J. Epidemiology of headache in the
general population: a prevalence study. J Clin Epidemiol. 1991;44:1147–57.
6 Scher AI, Stewart WF, Ricci JA, Lipton RB. Factors associated with the onset and
remission of chronic daily headache in a population–based study. Pain. 2003;106:81–
89.
7 Peres MFP, Lerario DDG, Garrido AB, Zukerman E. Primary headache in obese
patients. Arq Neuropsiquiatr. 2005;63:931–33.
8 Bigal ME, Liberman JN, Lipton RB. Obesity and migraine: A population study.
Neurology. 2006;28:545–50.
9 Brown WJ, Mishra G, Kenardy J, Dobson A. Relationships between body mass index
and well-being in young Australian women. Int J Obes. 2000;24:1360–68.
10 Ohayon MM. Prevalence and risk factors of morning headaches in the general
population. Arch Intern Med. 2004;164:97–102.
11 Calle EE, Rodriguez C, Walker-Thurmond BA, Thun MJ. Overweight, obesity and
mortality from cancer in a prospectively studied cohort of US adults. N Engl J Med.
2003;348:1625–38.
12 Haslam DW, James WP. Obesity. Lancet. 2005;366:1197–1209.
13 Klein S, Burke LE, Bray GA, et al. Clinical implications of obesity with specific
focus on cardiovascular disease: a statement for professionals from the American
Heart Association Council on Nutrition, Physical Activity, and Metabolism.
Circulation. 2004;110:2952–67.
14 Pi-Sunyer FX. Comorbidities of overweight and obesity: current evidence and
research issues. Med Sci Sports Exerc. 1999;31:S602–S608.
15 Berkman LF, Breslow L. Health and ways of living: The Alameda County Studies.
New York, NY: Oxford University Press, 1983.
16 Epstein FH, Napier JA, Block WD, et al. The Tecumseh Study: design, progress, and
perspectives. Arch Environ Health. 1970;21:402–07.
17 NCHS. National Health Interview Survey (NHIS). Public–Use Data Files.
http://www.cdc.gov/nchs/products/elec_prods/subject/nhis.htm.
18 Cox CS, Mussolino ME, Rothwell ST, et al. Plan and operation of the NHANES I
Epidemiologic Follow–Up Study, 1992. Vital Health Stat 1. 1997;35:1–231.
19 The Women’s Health Initiative Study Group. Design paper. Design of Women’s
Health Initiative Clinical Trial and Observational Study. Control Clin Trials.
1998;19:61–109.
20 Jeffery RW. Bias in reported body weight as a function of education, occupation,
health, and weight concern. Addict Behav. 1996;21:217–22.
21 Heymsfield SB, Allison DB, Heshka S, Pierson RN. Assessment of human body
composition. In: D.B. Allison, ed. Handbook of assessment methods for eating
behaviors and weight-related problems: Measures, theory, and research. San Diego,
CA: Sage Publications, 1995:515–60.
22 Rao JNK, Wu CFJ, Yue K. Some recent work on resampling methods for complex
surveys. Survey Methodology. 1992;18:209–17.
23 Fontaine KR, Allison DB: Obesity and Mortality Rates. In: G. Bray & C Bouchard
(Eds.), Handbook of Obesity, 2nd Edition. New York, Dekker and Co., 2003:767–85.
24 Bessaoud F, Daures JP, Molinari N. Free knot splines for logistic models and
threshold selection. Comp Meth Prog Biomed. 2005;77:1–9.
25 Lee YH, Pratley RE. The evolving role of inflammation in obesity and the metabolic
syndrome. Curr Diabetes Rep. 2005;5:70–75.
26 Alessi MC, Lijnen HR, Bastelica D, Juhan-Vague I. Adipose tissue and
atherothrombosis. Pathophysiol Haemost Thromb. 2004;33:290–97.
27 Dodick DW, Eross EJ, Parish JM. Clinical, anatomical, and physiologic relationship
between sleep and headache. Headache. 2003;43:282–92.
28 Law M, Morris JK, Jordan R, Wald N. Headaches and treatment of blood pressure:
results from a meta–analysis of 94 randomized placebo–controlled trials with 24,000
participants. Circulation. 2005;112:2301–06.
29 Pi-Sunyer FX. Pathophysiology and long–term management of metabolic syndrome.
Obes Res. 2004;12 Suppl:174S–180S.
30 Loewinger LE, Young WB. Headache preventives: effect on weight. Neurology.
2002;58[7 Suppl 3]:A286.
31 SAS, Version 9.1. SAS Institute. Cary, NC. 2003.
32 Davison AC, Hinkley DV. Bootstrap Methods and their Applications. Cambridge:
Cambridge University Press, 1997.
33 Efron B. The jackknife, the bootstrap, and other resampling plans, in CBMS–NSF
Regional Conf. Series in Applied Mathematics, no. 38. SIAM. 91, 1982.
34 DiCiccio TJ, Efron B. Bootstrap Confidence Intervals. Stat Sci. 1996;11:189–212.
35 Hall P, Wilson SR. Two guidelines for bootstrap hypothesis testing. Biometrics.
1991;47:757–62.
36 Rust KF, Rao JNK. Variance estimation for complex systems using replication
techniques. Stat Methods Med Res. 1996;5:283–310.
37 Lahiri, P. On the impact of Bootstrap in survey sampling and small-area estimation.
Stat Sci. 2003;18:199–210.
Figure 1. Odds ratios for headaches among women by BMI (reference BMI = 20).
Table 1 Description of epidemiologic datasets used

| Study | Composition of Sample | Dates of study | Female (%) | Age at entry (yrs) | Weight & height | White (%) |
|---|---|---|---|---|---|---|
| National Health Interview Survey (NHIS: 1997-2003) | Continuous nationwide household survey of the civilian non-institutionalized US population | Annual | ~52% | ≥ 18 | Self-report | ~72% |
| Women’s Health Initiative Observational Study (WHI) | Women ineligible for clinical trial components enrolled from 40 US centers | 1993-1998 | 100% | 50-79 | Measured | 83% |
| National Health and Nutrition Examination Survey (NHANES I) | Collects information about the health and lifestyle of a nationally representative sample of the civilian non-institutionalized US population | Annual | ~51% | 20+ | Measured | ~60% |
| Alameda County Health Study (ACHS) | Representative sample of Alameda County, CA | 1965-1975 | 54% | 16-94 | Measured | 79% |
| Tecumseh Community Health Study (TCHS) | Representative sample of Tecumseh, MI | 1959-1985 | 52% | 35-69 | Measured | 100% |
Table 2 The coding of headache among the 11 datasets

| Dataset | Question | Response options |
|---|---|---|
| National Health Interview Survey (NHIS: 1997-2003) | “During the PAST THREE MONTHS, did you have… severe headache or migraine?” | Yes; No; Refused; Not ascertained; Don’t know |
| National Health and Nutrition Examination Survey (NHANES I) | “During the past 6 months have you used any medicine, drugs or pills for headache?” Coded and analyzed in two ways: [No = 0; Occasionally & Regularly = 1] [No & Occasionally = 0; Regularly = 1] | Regularly; Occasionally; No; Blank |
| Women’s Health Initiative (WHI) | “Has a doctor told you that you have any of the following conditions?” | Migraine headache |
| Alameda County Health Study (ACHS) | “Have you had frequent headaches during the past 12 months?” | Yes; No; No answer |
| Tecumseh Community Health Study (TCHS) | (If gets headaches) do they bother you just a little or quite a bit? [0,1,2 coded as absence of headache AND 3,4,5,6,7,8 coded as presence of headache] | 0. Never gets headaches; 1. Gets headaches rarely, bother a little; 2. Gets headaches frequently, bother a little; 3. Gets headaches rarely, bother quite a bit; 4. Gets headaches frequently, bother quite a bit; 5. Gets headaches, bother quite a bit; 6. Other headaches, not classifiable; 7. Headaches rarely, not ascertained how bothersome; 8. Headaches frequently, not ascertained how bothersome; 9. Not ascertained |
Table 3 Piecewise logistic regression primary model results*

| Study | Parameter | Sample Size | Estimate | Standard Error | 95% CI (bootstrapped) |
|---|---|---|---|---|---|
| NHIS 1997 | BMI Slope (low)† | 19,727 | -0.151 | 0.115 | (-1.235, -0.017) |
| | knot 1‡ | | 18.97 | 2.236 | (16.45, 20.30) |
| | BMI Slope (mid)† | | 0.041 | 0.093 | (0.033, 0.049) |
| | knot 2 | | 35.08 | 4.212 | (32.74, 48.91) |
| | BMI Slope (high)† | | -0.003 | 0.019 | (-0.061, 0.016) |
| NHIS 1998 | BMI Slope | 17,355 | 0.027 | 0.004 | (0.021, 0.032) |
| NHIS 1999 | BMI Slope (low) | 16,704 | -0.056 | 0.063 | (-0.137, 0.015) |
| | knot | | 20.19 | 3.219 | (18.83, 22.17) |
| | BMI Slope (high) | | 0.030 | 0.020 | (0.025, 0.036) |
| NHIS 2000 | BMI Slope | 17,298 | 0.034 | 0.003 | (0.030, 0.039) |
| NHIS 2001 | BMI Slope | 17,666 | 0.029 | 0.003 | (0.024, 0.033) |
| NHIS 2002 | BMI Slope | 16,280 | 0.031 | 0.004 | (0.025, 0.036) |
| NHIS 2003 | BMI Slope (low) | 16,230 | -0.102 | 0.130 | (-1.030, 0.020) |
| | knot | | 19.61 | 3.471 | (18.04, 21.72) |
| | BMI Slope (high) | | 0.039 | 0.010 | (0.033, 0.044) |
| ACHS | BMI Slope (low) | 3,647 | -0.157 | 0.155 | (-0.333, 0.018) |
| | knot | | 19.42 | 2.087 | (18.21, 22.45) |
| | BMI Slope (high) | | 0.032 | 0.013 | (0.010, 0.052) |
| TCHS§ | BMI Slope | 2,397 | 0.022 | 0.009 | (0.005, 0.038) |
| NHANES I** | BMI Slope | 10,113 | 0.013 | 0.012 | (-0.016, 0.032) |
| WHI | BMI Slope | 82,953 | 0.000 | 0.002 | (-0.004, 0.004) |

* Adjusted for age, race, and smoking status; † BMI was divided into 1, 2, or 3 contiguous segments (depending on the number of knots in the model: 0, 1, or 2, respectively) and will have an estimate of the linear rate of change in log odds of headache per segment: low, mid, high; ‡ Knots were entered into the model and retained only if they contributed significantly to model fit at significance level 0.05; § 100% Caucasian; ** Employment status on men only and not adjusted for smoking as nearly 60% of records are missing that data.
Table 4 Piecewise logistic regression extended model results*

| Study | Parameter | Sample Size | Estimate | Standard Error | 95% CI (bootstrapped) |
|---|---|---|---|---|---|
| NHIS 1997 | BMI Slope (low)† | 18,480 | -0.068 | 0.093 | (-0.224, 0.046) |
| | knot 1‡ | | 20.04 | 1.794 | (17.44, 23.23) |
| | BMI Slope (mid)† | | 0.051 | 0.130 | (0.029, 0.125) |
| | knot 2 | | 26.38 | 6.897 | (10.71, 35.82) |
| | BMI Slope (high)† | | 0.017 | 0.036 | (0.006, 0.030) |
| NHIS 1998 | BMI Slope | 16,132 | 0.020 | 0.004 | (0.014, 0.026) |
| NHIS 1999 | BMI Slope (low) | 15,421 | -0.062 | 0.098 | (-0.898, -0.004) |
| | knot | | 20.19 | 5.497 | (12.30, 22.28) |
| | BMI Slope (high) | | 0.024 | 0.061 | (0.019, 0.032) |
| NHIS 2000 | BMI Slope | 16,205 | 0.029 | 0.003 | (0.024, 0.033) |
| NHIS 2001 | BMI Slope | 16,429 | 0.023 | 0.003 | (0.018, 0.028) |
| NHIS 2002 | BMI Slope | 15,077 | 0.024 | 0.004 | (0.018, 0.031) |
| NHIS 2003 | BMI Slope (low) | 14,985 | -0.096 | 0.151 | (-0.809, 0.040) |
| | knot | | 19.57 | 2.956 | (17.73, 22.34) |
| | BMI Slope (high) | | 0.033 | 0.011 | (0.028, 0.039) |
| ACHS | BMI Slope (low) | 3,397 | -0.146 | 0.198 | (-0.441, 0.042) |
| | knot | | 19.46 | 3.139 | (17.43, 23.81) |
| | BMI Slope (high) | | 0.013 | 0.110 | (-0.011, 0.037) |
| TCHS§ | BMI Slope | 1,624 | 0.027 | 0.011 | (0.006, 0.049) |
| NHANES I** | BMI Slope | 7,427 | -0.015 | 0.015 | (-0.048, 0.014) |
| WHI | BMI Slope | 82,423 | -0.009 | 0.002 | (-0.013, -0.005) |

* Adjusted for age, race, smoking, alcohol, hypertension, and SES; † BMI divided into 1, 2, or 3 contiguous segments (depending on number of knots in model) and has an estimate of the linear rate of change in log odds of headache per segment: low, mid, high; ‡ Extended models fit with same number of knots as the primary model; § 100% Caucasians; ** Employment status on men only and not adjusted for smoking as nearly 60% of records are missing that data.
Table 5 Odds ratios* and 95% confidence intervals across selected BMI values for the primary and extended models

Primary Model: Body Mass Index (BMI: kg/m²)

| Study | 18 | 25 | 30 | 35 | 40 |
|---|---|---|---|---|---|
| NHIS 1997 | 1.11 (0.91, 1.35) | 1.23 (1.10, 1.32) | 1.51 (1.32, 1.73) | 1.86 (1.51, 2.17) | 1.84 (1.54, 2.11) |
| NHIS 1998 | 0.95 (0.93, 0.96) | 1.14 (1.10, 1.19) | 1.31 (1.21, 1.41) | 1.50 (1.34, 1.68) | 1.72 (1.47, 2.00) |
| NHIS 1999 | 1.12 (0.94, 1.32) | 1.14 (1.03, 1.20) | 1.33 (1.19, 1.44) | 1.54 (1.36, 1.73) | 1.78 (1.53, 2.07) |
| NHIS 2000 | 0.93 (0.92, 0.95) | 1.19 (1.15, 1.23) | 1.41 (1.32, 1.51) | 1.68 (1.51, 1.86) | 1.99 (1.74, 2.29) |
| NHIS 2001 | 0.94 (0.93, 0.96) | 1.16 (1.12, 1.19) | 1.33 (1.25, 1.42) | 1.53 (1.40, 1.69) | 1.77 (1.57, 2.02) |
| NHIS 2002 | 0.94 (0.93, 0.95) | 1.17 (1.23, 1.21) | 1.36 (1.26, 1.46) | 1.58 (1.41, 1.77) | 1.85 (1.59, 2.14) |
| NHIS 2003 | 1.16 (0.91, 1.37) | 1.21 (1.12, 1.26) | 1.47 (1.34, 1.59) | 1.79 (1.59, 1.99) | 2.18 (1.87, 2.50) |
| ACHS | 1.23 (0.92, 1.48) | 1.17 (1.02, 1.30) | 1.37 (1.09, 1.69) | 1.60 (1.15, 2.19) | 1.88 (1.20, 2.84) |
| TCHS | 0.96 (0.93, 0.99) | 1.11 (1.03, 1.21) | 1.24 (1.05, 1.47) | 1.38 (1.08, 1.78) | 1.54 (1.11, 2.16) |
| NHANES I | 0.97 (0.93, 1.02) | 1.07 (0.96, 1.21) | 1.14 (0.91, 1.45) | 1.22 (0.87, 1.75) | 1.31 (0.83, 2.11) |
| WHI | 1.00 (0.99, 1.01) | 1.00 (0.98, 1.02) | 1.00 (0.96, 1.04) | 1.00 (0.94, 1.05) | 0.99 (0.92, 1.07) |

Extended Model: Body Mass Index (BMI: kg/m²)

| Study | 18 | 25 | 30 | 35 | 40 |
|---|---|---|---|---|---|
| NHIS 1997 | 1.14 (0.90, 1.37) | 1.29 (1.11, 1.51) | 1.47 (1.25, 1.70) | 1.60 (1.39, 1.98) | 1.74 (1.48, 2.04) |
| NHIS 1998 | 0.96 (0.94, 0.98) | 1.10 (1.06, 1.15) | 1.22 (1.12, 1.33) | 1.35 (1.18, 1.54) | 1.49 (1.25, 1.77) |
| NHIS 1999 | 1.13 (0.93, 1.31) | 1.11 (0.99, 1.19) | 1.26 (1.12, 1.41) | 1.42 (1.24, 1.67) | 1.60 (1.36, 1.88) |
| NHIS 2000 | 0.94 (0.93, 0.96) | 1.15 (1.12, 1.20) | 1.33 (1.24, 1.43) | 1.54 (1.39, 1.71) | 1.79 (1.55, 2.05) |
| NHIS 2001 | 0.96 (0.94, 0.97) | 1.12 (1.08, 1.16) | 1.26 (1.18, 1.35) | 1.41 (1.28, 1.56) | 1.58 (1.38, 1.81) |
| NHIS 2002 | 0.95 (0.94, 0.97) | 1.13 (1.08, 1.18) | 1.28 (1.17, 1.39) | 1.44 (1.26, 1.63) | 1.63 (1.36, 1.92) |
| NHIS 2003 | 1.15 (0.92, 1.39) | 1.18 (1.11, 1.25) | 1.40 (1.28, 1.53) | 1.65 (1.47, 1.86) | 1.95 (1.67, 2.27) |
| ACHS | 1.23 (0.95, 1.49) | 1.06 (0.92, 1.20) | 1.13 (0.88, 1.44) | 1.21 (0.84, 1.73) | 1.28 (0.79, 2.07) |
| TCHS | 0.95 (0.91, 0.99) | 1.14 (1.02, 1.27) | 1.31 (1.04, 1.62) | 1.49 (1.06, 2.07) | 1.71 (1.08, 2.64) |
| NHANES I | 1.03 (0.97, 1.10) | 0.93 (0.79, 1.07) | 0.86 (0.62, 1.14) | 0.80 (0.49, 1.22) | 0.75 (0.39, 1.31) |
| WHI | 1.02 (1.01, 1.03) | 0.96 (0.94, 0.98) | 0.92 (0.88, 0.95) | 0.88 (0.82, 0.93) | 0.84 (0.77, 0.91) |

* Compared to the BMI reference level of 20
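The odds ratios in Table 5 can be related to the segment slopes and knots in Table 3 by accumulating each slope over the part of the BMI range it covers and exponentiating the difference in log odds from the reference BMI of 20. The following is a minimal sketch of that arithmetic, not the software used in the study; the function names are hypothetical, and the inputs are the rounded primary-model estimates from Table 3, so agreement with Table 5 should be expected only to within rounding.

```python
import math


def piecewise_log_odds(bmi, slopes, knots):
    """Log odds at `bmi`, up to an additive constant, for a continuous
    piecewise linear function with len(knots) + 1 segment slopes.
    (Hypothetical helper; covariate terms cancel in the odds ratio.)"""
    if len(slopes) != len(knots) + 1:
        raise ValueError("need exactly one more slope than knots")
    edges = [0.0, *knots, math.inf]
    total = 0.0
    for slope, lo, hi in zip(slopes, edges[:-1], edges[1:]):
        # Length of this segment that lies below `bmi`.
        total += slope * max(0.0, min(bmi, hi) - lo)
    return total


def odds_ratio(bmi, slopes, knots, reference=20.0):
    """Odds ratio at `bmi` relative to the reference BMI."""
    diff = piecewise_log_odds(bmi, slopes, knots) - piecewise_log_odds(
        reference, slopes, knots
    )
    return math.exp(diff)


if __name__ == "__main__":
    # NHIS 1998 primary model: no knots, single slope 0.027 (Table 3).
    for bmi in (18, 25, 30, 35, 40):
        print("NHIS 1998", bmi, round(odds_ratio(bmi, [0.027], []), 2))
    # NHIS 1999 primary model: one knot at 20.19, slopes -0.056 and 0.030.
    for bmi in (18, 25, 30, 35, 40):
        print("NHIS 1999", bmi, round(odds_ratio(bmi, [-0.056, 0.030], [20.19]), 2))
```

Because the reference BMI of 20 falls just below the NHIS 1999 knot, the short stretch from 20 to 20.19 still contributes at the low-segment slope before the high-segment slope takes over.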
BODY MASS INDEX AND WAIST-TO-HIP RATIO AS THEY RELATE TO
MORTALITY IN NHANES III
by
SCOTT W. KEITH, DAVID B. ALLISON
In preparation for The Journal of the American Medical Association
Format adapted for dissertation
Abstract
Context: As body mass index (BMI) and waist-to-hip ratio (WHR) continue to be popular
choices for characterizing obesity, it remains unclear which might better predict mortality
in the general population.
Objective: To analyze BMI and WHR for their respective capacities to predict the odds of
mortality among adults in a recent nationally representative dataset.
Design, Setting, and Participants: Piecewise linear logistic regression was applied to data
from the third National Health and Nutrition Examination Survey (NHANES III: 1988-
1994) to model the possibly nonlinear relationships between the predictors of interest
(BMI and WHR) and mortality among the non-institutionalized United States population
of adults aged at least 25 years. Models were adjusted for baseline age and indicators of
ethnicity, smoking, alcohol, and serious illnesses.
Main Outcome Measures: Mortality indicated as death prior to follow-up in 2000.
Results: We analyzed data from 14,386 participants. The likelihood of mortality related
piecewise linearly to BMI where models indicated thresholds in the odds at about 19 and
23 for women and men, respectively. The shapes of the BMI-mortality relationships were
similar for both men and women suggesting no significant elevation in odds with
increasing BMI. Linear logistic models were adequate to relate WHR to mortality and
suggested that increased WHR linearly increased log(odds) of mortality among women (β
= 2.4; 95% CI (1.3, 3.5)), but not among men.
Conclusions: Weighted linear logistic regression was sufficient to model the WHR-
mortality relationship, but WHR was a significant predictor for women only. BMI related
nonlinearly to mortality with a broken line shape similar among both women and men
and was a significant predictor of mortality only for very low values. This finding was
unexpected and should be viewed with caution as an example of how the nonlinear
framework, restricted to certain settings and a specific a priori analysis plan, would fit
the data, not as a final concluding result. The data suggested that a longer follow-up time
might be required for characterizing mortality at high levels of BMI.
Key words: mortality, BMI, waist-to-hip ratio, NHANES III, piecewise linear logistic
regression.
Introduction
Mortality as a Possibly Nonlinear Function of BMI or WHR
Obesity prevalence has been increasing in the United States (Ogden et al., 2002;
Flegal et al., 2002) along with many plausible contributors (Keith et al., 2006a) to the
obesity epidemic (WHO, 1998). Body mass index (BMI) is a commonly used proxy
measure of adiposity or obesity which has been shown to have a nonlinear J- or U-shaped
relationship with several health outcomes including health-related quality of life (Heo et
al., 2003); headaches (Keith et al., 2008); dementia (Rosengren et al., 2005); and
mortality rate or longevity (e.g., Allison et al., 1997; Bigaard et al., 2004; Calle et al.,
1999; Fontaine et al., 2003; Kaplan et al., 2002; Keith et al., 2006b; Troiano et al., 1996;
Zhou et al., 2002). Much effort has been directed at determining the effects associated
with elevated BMI. Allison et al. (1999a) used estimates of relative risk of mortality in
combination with the distribution of BMI and other factors to estimate the annual deaths
attributable to obesity in the United States. Since then, several studies have conducted
similar analyses and reported estimates ranging from 165,000 (Flegal et al.,
2005) to 365,000 (Mokdad et al., 2004; 2005) deaths, which suggests considerable uncertainty in
how much excess risk might be attributable to obesity. As these estimates provide a basis
for estimating obesity-related healthcare costs (Allison et al., 1999b) and might
influence healthcare budgeting that can affect millions of Americans, the reliability and
quality of risk estimates are of considerable public health import.
Here we consider a different modeling approach to obtaining the mortality risk
estimates in terms of odds ratios that might better account for the nonlinearity in the
BMI-mortality relationship in the presence of covariate information. We also consider
waist-to-hip ratio (WHR) in analyses alongside those for BMI to compare
and contrast these two popular measures, of weight distribution and relative
weight respectively, for their capacities to predict mortality.
Goetghebeur and Pocock (1995) warn that analyzing such relationships with
many conventional methods, such as a quadratic polynomial of BMI, can distort the risk-
relationship and produce misleading results. We hypothesized that mortality and BMI
will have a relationship characterized by upturns such as those warned about by
Goetghebeur and Pocock (1995) as we expect mortality risk may increase substantially
with decreasing BMI below some unknown, but estimable, threshold and may thereafter
flatten-out or begin increasing with increasing BMI above some other unknown
threshold.
We have found little evidence in the literature to suggest that WHR might relate
nonlinearly with mortality. Studies have suggested that linear models depict
WHR-mortality relationships well (e.g., Lindqvist et al., 2006). It remains
unclear if nonlinearity might be detected with a sufficiently flexible modeling framework.
Categorical vs. Continuous Representations of Predictors: Basic Issues
A key assumption in fitting statistical models between a continuous predictor and
an outcome of interest is that there is an underlying quantitative relationship between the
predictor of interest (e.g., BMI) and the outcome (e.g., mortality odds). Furthermore, we
assume that the outcome can be represented well by some estimable function of the
predictor (i.e., a functional form). There are basically three different approaches to
modeling a continuous predictor’s relationship with an outcome: categorize the
observations, or maintain a continuous metric and apply either polynomials or piecewise
polynomial splines.
Many investigators have applied (e.g., Manson et al., 1995; Stevens et al., 1998)
or advocated (e.g., Rothman, 1992) contiguous categories of BMI set a
priori by common standards (e.g., underweight: BMI<18; normal: 18≤BMI<25;
overweight: 25≤BMI<30; and obese: 30≤BMI), quintiles, or some other arbitrary
classification rules in the estimation of mortality relative risk. The reasons for this
treatment of BMI are most likely convenience or convention. Theoretically,
categorization permits an examination of differences in risk or odds of an event occurring
between categories and does not assume linearity or smoothness in the relationship.
However, categorization of BMI has significant disadvantages and limitations. They
include: 1) ignoring within-category BMI information, resulting in decreased statistical
power; 2) insensitivity of trend tests to non-monotonic relationships; 3) trend tests that may
indicate a trend but cannot describe it; 4) treating all individuals within a BMI category
as though they have a uniformly constant risk regardless of their actual BMI level;
5) results that can depend heavily upon how the categories are chosen; and 6) a
priori classification boundaries of BMI that are not likely to represent “true” partitions or
thresholds that would group individuals according to the underlying pattern of mortality
risk or odds within a BMI category.
Treating BMI continuously with polynomial predictor variables does not degrade
data, tends to preserve power, and does not impose arbitrary cut-points. Curvilinear
relationships (U- or J-shaped) have commonly been detected between BMI and mortality
risk when modeled using polynomials of BMI. Modeling substantially nonlinear
relationships via polynomials also has disadvantages and limitations. They include:
1) a lack of flexibility, possibly leading to biased estimation, particularly in the tails of the
BMI distribution; 2) poorly parameterized models; and 3) smoothing over
any cutpoints between BMI groups with different mortality risk relationships.
Zhao and Kolonel (1992) suggest that categorical analysis is an outstanding
exploratory technique. Categorization can help in evaluating the extent to which any
smooth function fitted to the data adequately captures patterns in the data. Following this
exploration, in the overwhelming majority of cases (though perhaps not all), we believe it is
appropriate to return to a continuous metric and model any nonlinearity with appropriate
functions.
Piecewise linear regression models (also called regression spline models; see de
Boor, 1978) may offer some of the best features of the polynomial and categorical
approaches to modeling BMI. When the knots (also called breakpoints, changepoints,
joinpoints, or cutpoints) that connect the linear segments in a spline model are allowed to
be fitted as free model parameters (the so-called free-knots), they may estimate the
boundaries of BMI groups experiencing differential, non-constant risk relationships. The
localized estimation properties of free-knot splines may help pick out the relationships in
the tails of the BMI distribution where polynomial models tend to lack flexibility and
protect central observations from excessive model influence of extreme values.
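To make the idea of a free knot concrete, the following is a minimal sketch of a one-knot piecewise linear logistic model in which the knot is chosen by profiling the (optionally weighted) log-likelihood over a grid of candidate values. It is an illustration under stated assumptions, not the framework used in this work: the grid search stands in for numerical optimization of the knot, the hinge parameterization stands in for a B-spline basis, the simulated data and all function names are hypothetical, and no survey-design adjustment is made.

```python
import numpy as np


def fit_logistic(X, y, w=None, n_iter=50, tol=1e-8):
    """Weighted logistic regression by Newton-Raphson.
    Returns (coefficients, weighted log-likelihood)."""
    n, p = X.shape
    w = np.ones(n) if w is None else np.asarray(w, dtype=float)
    beta = np.zeros(p)
    for _ in range(n_iter):
        eta = X @ beta
        mu = np.exp(-np.logaddexp(0.0, -eta))  # numerically stable logistic
        grad = X.T @ (w * (y - mu))
        hess = (X * (w * mu * (1.0 - mu))[:, None]).T @ X
        # Tiny ridge term only guards against a singular Hessian.
        step = np.linalg.solve(hess + 1e-8 * np.eye(p), grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    eta = X @ beta
    loglik = float(np.sum(w * (y * eta - np.logaddexp(0.0, eta))))
    return beta, loglik


def hinge_design(x, knot):
    """Columns: intercept (log odds at the knot), slope below, slope above."""
    d = x - knot
    return np.column_stack([np.ones_like(d), np.minimum(d, 0.0), np.maximum(d, 0.0)])


def fit_free_knot(x, y, knot_grid, w=None):
    """Profile the log-likelihood over candidate knots; keep the best fit."""
    best = None
    for knot in knot_grid:
        beta, ll = fit_logistic(hinge_design(x, knot), y, w)
        if best is None or ll > best[2]:
            best = (float(knot), beta, ll)
    return best


def simulate(n=20000, seed=1):
    """Hypothetical data: true knot at 25, slopes -0.3 below and 0.15 above."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(15.0, 45.0, n)
    eta = -0.5 - 0.3 * np.minimum(x - 25.0, 0.0) + 0.15 * np.maximum(x - 25.0, 0.0)
    y = (rng.random(n) < 1.0 / (1.0 + np.exp(-eta))).astype(float)
    return x, y


if __name__ == "__main__":
    x, y = simulate()
    knot, beta, ll = fit_free_knot(x, y, np.arange(18.0, 35.5, 0.5))
    print("knot:", knot, "slope below:", beta[1], "slope above:", beta[2])
```

Parameterizing the design by the slope below and the slope above the knot keeps the coefficients directly interpretable as changes in log odds per unit of the predictor within each segment.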
To our knowledge, no studies have employed nonlinear (non-categorical)
statistical methodology to estimate risk or odds of mortality related to BMI or WHR with
appropriate adjustments for complex survey samples which are capable of representing
the US population. This represents a gap in the literature that deserves attention.
Methods
Population and sample design
The Third National Health and Nutrition Examination Survey
(NHANES III) is a complex multistage cross-sectional sample representative of the non-
institutionalized United States population and is described in detail elsewhere (Plan and
operation of the Third National Health and Nutrition Examination Survey, 1994). The
National Center for Health Statistics conducted this survey and in 2000 collected
mortality information on participants based on a probabilistic match with records in the
National Death Index, thus providing up to 13 years of mortality follow-up. NHANES III
was subject to institutional review and obtained informed consent from participants.
Measurements
Table 1 lists the variables we considered in our analyses and respective
distribution summaries by gender. Both reported and measured information are available
in NHANES III. During standardized health examinations men and women were
measured for dimensions including height, weight, waist circumference, and hip
circumference. Participants were asked to report demographic information including
indications of their ethnicity or race (Non-Hispanic black, Non-Hispanic white, or
Mexican-American); tobacco smoking; heavy alcohol use; and personal history of serious
illness or disease which could include congestive heart failure (CHF), myocardial
infarction (MI), stroke, non-skin cancer, or emphysema.
Analyses included only anthropometric variables resulting from technician
measurements, and we applied the appropriate NHANES III sample weights: the “mobile
examination center (MEC) and home examination weights (WTPFHEX6)” (U.S. DHHS
NHANES III Analytic and Reporting Guidelines, 1996). These weights were the most
inclusive for our purposes; that is, of the three general final sample weights, they were
non-zero for the largest proportion of completed examination records.
The predictor variables of interest are anthropometric ratios which,
in the case of BMI, might indicate relative weight or adiposity or, in the case of WHR,
might indicate weight or adipose tissue distribution about the trunk. Note that 650 women
and 500 men were inexplicably missing WHR measurements from their examination
records, but otherwise had complete information on study variables.
Outcome Definition
We are interested in mortality outcomes which have occurred among adult
NHANES III participants aged 25 years or older. Specifically, this outcome was coded as
a binary random variable where participants determined to have died during follow-up
(i.e., prior to NCHS drawing mortality information from NDI on December 31, 2000)
were assigned a ‘1’ while the others assumed to have survived the follow-up period were
assigned a ‘0’.
Statistical Analysis
To model the respective relationships BMI and WHR have with mortality we
incorporated free-knot splines with B-spline bases (de Boor, 1978) into logistic
regression in a way similar to the maximum likelihood-based optimization methodology
described by Bessaoud et al. (2005). The distinctions in our methods focus on knot
selection and adjustments for the complex sample design of NHANES III. We employed
a parametric bootstrap approach (Keith et al., in preparation) to forward selection of the
optimal number of knots or change-points. Bessaoud and colleagues suggested using the
Bayesian Information Criterion (BIC) first described by Schwarz (1978) for this purpose.
However, we found that the complex sample weights inflate and distort the distribution
of the likelihood to the extent that the over-parameterization penalty imposed by the BIC
has no meaningful effect. Bootstrapping allowed us to generate distributions of 200
likelihood replicates to determine the impact of adding knots and spline parameters
relative to the cost of increased model complexity. Another distinction is that our
modeling framework incorporated the analysis of 500 nonparametric bootstrap samples
according to the methods described by Rao et al. (1992) for adjusting model parameters
for the multistage probability cluster sampling design of NHANES III.
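The resampling step cited above can be sketched roughly as follows. This is a minimal illustration in Python rather than the SAS programs used for the analyses, and it assumes the common form of the rescaling bootstrap in which n_h - 1 primary sampling units (PSUs) are drawn with replacement within each stratum h and the weights are rescaled by n_h / (n_h - 1) times the number of draws. The function name, the data layout, and the toy design are hypothetical and are not taken from the NHANES III files.

```python
import random
from collections import defaultdict


def rao_wu_replicate_weights(strata, psus, weights, n_reps, seed=0):
    """Return n_reps lists of rescaled bootstrap weights, one list per replicate.

    strata, psus, and weights are parallel lists with one entry per respondent.
    Assumes every stratum contains at least two PSUs.
    """
    rng = random.Random(seed)
    # Collect the distinct PSUs observed in each stratum.
    psus_by_stratum = defaultdict(set)
    for h, i in zip(strata, psus):
        psus_by_stratum[h].add(i)

    replicates = []
    for _rep in range(n_reps):
        multiplier = {}
        for h, psu_set in psus_by_stratum.items():
            psu_list = sorted(psu_set)
            n_h = len(psu_list)
            # Draw n_h - 1 PSUs with replacement and count how often each is drawn.
            counts = dict.fromkeys(psu_list, 0)
            for _draw in range(n_h - 1):
                counts[rng.choice(psu_list)] += 1
            # Rescale so that the multipliers average one within the stratum.
            for i in psu_list:
                multiplier[(h, i)] = counts[i] * n_h / (n_h - 1)
        replicates.append(
            [w * multiplier[(h, i)] for h, i, w in zip(strata, psus, weights)]
        )
    return replicates


# Hypothetical toy design: 2 strata, 3 PSUs per stratum, 2 respondents per PSU.
strata = [1] * 6 + [2] * 6
psus = [1, 1, 2, 2, 3, 3] * 2
weights = [100.0] * 6 + [250.0] * 6
reps = rao_wu_replicate_weights(strata, psus, weights, n_reps=500)
```

Under this scheme the model would be refitted once per replicate weight vector, and the spread of the refitted parameter estimates would supply the design-adjusted standard errors and confidence intervals.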
Due to apparent interactions between gender and BMI or WHR in their
relationships with mortality, we stratified analyses by gender group. B-splines were used
to help stabilize the computational performance of our programs; however, the raw spline
parameters have been transformed into their piecewise linear polynomial representation
(i.e., localized slopes). This gives the parameters an interpretation common to linear
logistic regression where they represent the estimated increase in log-odds of mortality
per unit increase in predictor (e.g., BMI). The slopes and optimal knot locations have been
estimated along with bootstrap estimates of their standard errors (SE) and bootstrapped
95% confidence intervals (CI). Odds ratios (OR) and bootstrapped 95% CI were
calculated and tabled for selected BMI and WHR values. To depict the bootstrapping
sampling variation, we plotted OR for both BMI and WHR along with the selected model
fitted to 20 randomly selected bootstrap replicated datasets.
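The transformation from raw spline coefficients to localized slopes can be illustrated with a small sketch. It assumes degree-1 (linear) B-splines, for which each coefficient equals the fitted log-odds at the corresponding knot, so that a segment's slope is the change in coefficient divided by the knot spacing. The boundary knots and coefficient values below are hypothetical, chosen only so that the resulting slopes resemble those reported for women in Table 2; the code is Python and is not the SAS implementation used for the analyses.

```python
def segment_slopes(knots, coefs):
    """Slopes of a linear spline computed from its degree-1 B-spline coefficients.

    knots: increasing list [lower boundary, interior knots..., upper boundary]
    coefs: fitted spline value (log-odds) at each of those knots
    """
    if len(knots) != len(coefs):
        raise ValueError("need exactly one coefficient per knot")
    points = list(zip(knots, coefs))
    # Each slope is rise over run between two adjacent knots.
    return [(c1 - c0) / (t1 - t0) for (t0, c0), (t1, c1) in zip(points, points[1:])]


# Hypothetical boundaries at BMI 12 and 60 with one interior knot at BMI 19.
knots = [12.0, 19.0, 60.0]
coefs = [3.92, 0.0, 0.41]
slopes = segment_slopes(knots, coefs)
print(slopes)
```

With these illustrative inputs the two slopes are approximately -0.56 below the knot and 0.01 above it, which is the form in which the parameters are reported in Table 2.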
All programs were written in SAS (version 9.1; SAS Institute Inc, Cary, NC)
using macro utilities, PROC IML, and PROC NLP.
Results
Table 1 shows how the study variables were distributed among the men and
women surveyed in NHANES III. More than twice as many men had smoked tobacco
and about three times as many men regularly consumed at least 0.35 oz/day of alcohol as
compared to women. There were also about 20% more deaths among men than women
during follow-up even though their baseline indications of serious illness were similar.
Mean BMI was higher among women while men generally had higher WHR. Figure 1
shows how the distributions of these anthropometrics differ by gender. Men have less
variation in BMI and WHR and there is a noticeable upward shift in the distribution of
WHR for men as compared to women.
The distributions of age and mortality per unit bins of BMI and per 1/100-unit bins
of WHR are displayed in Figure 2. The plotted values refer to the median age of
participants falling into each bin, rounded and truncated to the nearest decade. The
proportion of deaths in each bin suggests that age and WHR are collinear for both
women and men. The picture is less clear for BMI. U- or J- shaped curves are apparent in
frames a) and c) where both high and low values of BMI seem to translate to higher
proportions of death in these bin groupings across genders. It is interesting to see that
men and women with the highest levels of BMI are relatively young (many in their 30’s
and 40’s) and survived the follow-up period.
We detected one knot in the BMI-mortality relationships for women and men
located at BMI = 19.0 (95% CI (18.1, 20.0)) and BMI = 23.4 (95% CI (23.2, 25.8)),
respectively. Table 2 displays these model parameter estimates as well as the weighted
and covariate-adjusted linear slopes (β’s) in the log-odds of mortality associated with
BMI relative to adjacent knot parameters detected. That is, if a knot has been fitted in the
model, there will be a slope parameter to represent the linear relationship on one side of
the knot and another slope parameter to represent the linear relationship on the other side
of the knot. The knot points we detected were nadirs in the piecewise linear log-odds
models. They may represent biologically meaningful and gender-specific thresholds at
which points the relationships between BMI and mortality inflect from increased BMI
relating to decreased log-odds of mortality (women: β1 = -0.56, 95% CI (-1.02, -0.32);
men: β1 = -0.23, 95% CI (-0.30, -0.01)) to BMI losing any significant capacity to predict
mortality for either gender. Frames a) and c) of Figure 3 show plots of odds ratios (OR) for
BMI and mortality against a reference BMI of 23 among otherwise alike individuals. The
model fitted to the original data (in red) shows our best estimates by gender along with
models fitted to 20 randomly selected bootstrap replicate datasets which illustrate the
sampling variability. Table 3 shows OR results with 95% confidence intervals for
selected BMI levels which are commonly used to categorize BMI. The 95% CIs for
elevated BMI levels (say, 30 and over) in comparison to the BMI = 23 reference all contained 1,
further showing that elevated BMI did not perform well for predicting odds of mortality.
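To make the link between the slopes in Table 2 and the odds ratios in Table 3 concrete, the following Python sketch evaluates a continuous one-knot piecewise linear log-odds function and exponentiates the difference from the BMI = 23 reference. It uses the rounded estimates printed in Table 2, so it reproduces Table 3 only approximately; the function names are hypothetical, and the covariate terms cancel because the comparison is between otherwise alike individuals.

```python
import math


def piecewise_log_odds(x, knot, slope_below, slope_above):
    """Continuous one-knot piecewise linear log-odds, up to an additive constant."""
    return slope_below * min(x, knot) + slope_above * max(x - knot, 0.0)


def odds_ratio(x, reference, knot, slope_below, slope_above):
    """Odds ratio comparing predictor value x with the reference value."""
    diff = piecewise_log_odds(x, knot, slope_below, slope_above) - piecewise_log_odds(
        reference, knot, slope_below, slope_above
    )
    return math.exp(diff)


# Rounded point estimates as printed in Table 2.
women = dict(knot=19.00, slope_below=-0.56, slope_above=0.01)
men = dict(knot=23.43, slope_below=-0.23, slope_above=0.03)

for bmi in (18.5, 25, 30, 35, 40):
    or_women = odds_ratio(bmi, 23, **women)
    or_men = odds_ratio(bmi, 23, **men)
    print(bmi, round(or_women, 2), round(or_men, 2))
```

At BMI = 18.5 this gives roughly 1.27 for women and 2.82 for men, close to the tabled 1.25 and 2.85; the remaining gaps are what one would expect from using estimates rounded to two decimals.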
No knots were necessary to adequately model the adjusted associations between
mortality and WHR among women or men. See Table 2. WHR was, however, a
significant linear predictor of adjusted log-odds of mortality only for women (β = 2.41;
95% CI (1.31, 3.46)). This translates to statistically significant OR (CI not containing 1)
from comparing a reference woman, having WHR = 0.9 (about the average), with
otherwise like women having various WHR (see Table 3). See frame b) of Figure 3 for a
graphical representation of the fitted WHR model (in red) for women. Among men, the
mortality response to WHR was basically flat, which can be seen in the OR plot in frame d)
of Figure 3.
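As a worked check on the WHR results for women, the odds ratio for a given WHR relative to the 0.9 reference under a linear log-odds model is exp(β × (WHR - 0.9)). The short Python sketch below applies this to the rounded slope from Table 2; the function and constant names are hypothetical.

```python
import math

BETA_WHR_WOMEN = 2.41  # rounded slope for women from Table 2
REFERENCE_WHR = 0.9    # reference level used in Table 3


def whr_odds_ratio(whr, beta=BETA_WHR_WOMEN, reference=REFERENCE_WHR):
    """Odds ratio for a given WHR relative to the reference, linear in log-odds."""
    return math.exp(beta * (whr - reference))


for whr in (0.8, 1.0, 1.1, 1.2, 1.3):
    print(whr, round(whr_odds_ratio(whr), 2))
```

For WHR values of 0.8, 1.0, 1.1, and 1.2 this gives 0.79, 1.27, 1.62, and 2.06, which agree with the point estimates for women in Table 3.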
Comments
Our results did not suggest that elevated BMI predicts mortality odds significantly
in NHANES III for either men or women. This finding was unexpected and should be
viewed with caution as an example of how the nonlinear framework, restricted to certain
settings and a nascent a priori analysis plan, would fit the data, not as a final concluding
statement on how elevated BMI relates to mortality risk. It is clear that more analyses
will be required, particularly regarding possible interactions between BMI and age.
We did feel confident, however, in detecting possibly biologically meaningful
thresholds in the BMI-mortality relationships among women and men at BMIs of about
19 and 23, respectively, which suggest the points at which increased BMI no longer
relates to decreased mortality odds. No such thresholds were evident for WHR in either
women or men. The linear relationship of WHR and mortality suggested that as
compared to a woman with an average WHR (0.9), similar women having WHR of 1.0 or
1.1 might have an increase in odds of mortality of 27% or 62%, respectively (Table 3). While these
results should be interpreted with some caution considering that these are observational
results, they do suggest that more focused prospective studies might produce similar
results and be more useful for direct application in clinical care and public health policy.
WHR does not appear to significantly predict mortality odds among men. This
might seem counter to what one might infer from the plots in frames b) and d) of Figure 2
which seem to suggest that the WHR-mortality relationships might be the same for men
and women. These plots are unadjusted and the apparent relationship among men
disappears when adjusted for other covariates possibly as a result of collinearity with age.
The results accord with those of Price et al. (2006) where they also found among their
United Kingdom cohort stronger associations between WHR and mortality for women
than men. However, the multicollinearity apparent in Figure 2 might be causing unstable
results in the model while it attempts to sort out respective contributions to increasing the
likelihood function from age and WHR. Note that removing a possible outlier (the
octogenarian in the upper left-hand corner of Figure 2 frame d) had no effect on the
overall model. Lindqvist et al. (2006) concluded that the WHR-mortality relationship
depended on age in their Swedish cohort. We did not observe such an interaction. At least
one study has also found that waist circumference has been increasing in the US
population over time beyond what might be expected given the increases in BMI
observed over the same period (Elobeid et al., 2007). This would likely affect the
distribution of WHR over time and further complicate effective modeling of the WHR-
mortality relationship.
There are issues surrounding BMI, age, and the length of mortality follow-up which
limited our ability to model the BMI-mortality relationship. The prevalence of obesity
has increased and may be accelerating with passing time for many plausible reasons
(Keith et al., 2006b). Interpretation of Figure 2 suggests that the distribution of BMI
might be changing with calendar time and it is clear that the heaviest individuals in
NHANES III were among the youngest and tended to survive the follow-up period. We
cannot rule out that self-selection bias had entered into this survey as perhaps only the
younger and/or healthier among those having BMI ≥ 45 might have been motivated to
travel for participation in the MEC health examination. Regardless, we had little or no
information in this dataset for statistical analyses on what factors would be associated
with their mortality. These observations were very influential and pulled the adjusted log-
odds of mortality models nearly flat over the upper part of the BMI distribution despite
the fact that older participants with high BMI were showing elevated mortality odds.
Thus, NHANES III might be of only limited use for estimating BMI-mortality risk
associations unless BMI has been categorized.
Flegal et al. (2005) used categorized BMI and Cox proportional hazards modeling
to analyze mortality in NHANES I, II, and III. As compared to our continuous piecewise
linear logistic regression modeling of BMI, these two aspects represent the most obvious
dissimilarities in statistical methodology and probably account for most disparity in
results. To see how results from categorized BMI would compare to those previously
published by Flegal et al. (2005) we used SAS PROC SURVEYLOGISTIC and SAS-Callable
SUDAAN PROC RLOGIST to fit complex survey design weighted logistic
regression models (data not shown) stratified by baseline age groups (1: 25≤age<60; 2:
60≤age<70; and 3: age≥70) with participants categorized as “underweight” if BMI<18.5;
“normal weight” (reference) if 18.5≤BMI<25; “overweight” if 25≤BMI<30; “obese” if
30≤BMI<35; and “severely obese” if BMI≥35. In summary, we noted that the BMI-mortality
odds ratios from these models were quite similar in magnitude to the hazard
ratios presented by Flegal et al. (2005), except among those in their sixties, where we
noted significantly smaller odds of mortality among the obese as compared to normal
weight and an odds ratio below 1 (not statistically significant) for the severely obese. We
can offer no clear explanation for this result at this point, but it is noteworthy that when
the proportion of events is as high as 10 or 15%, odds ratios will not likely estimate
relative risks as accurately as hazard ratios. It might be important to note also that Flegal
et al. (2005) made no adjustments for height or indications of baseline illness and, to be
consistent with the NHANES I and II covariates, they coded Mexican Americans as
having ‘white’ race/ethnicity. However, our categorical BMI analyses were robust to
removal of the illness, height, and race/ethnicity covariates from the models. Thus, these
covariates were not likely the root cause of the heterogeneity between any of our findings
and those of Flegal et al. (2005).
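The earlier remark that odds ratios approximate relative risks less well when events are common can be illustrated numerically. The Python sketch below fixes a purely hypothetical relative risk of 1.5 (not an estimate from this study) and shows how the corresponding odds ratio drifts away from it as the baseline event proportion rises toward the 10 to 15% range.

```python
def odds_ratio_from_relative_risk(baseline_risk, relative_risk):
    """Odds ratio implied by a relative risk at a given baseline event proportion."""
    exposed_risk = baseline_risk * relative_risk
    if not (0.0 < baseline_risk < 1.0 and 0.0 < exposed_risk < 1.0):
        raise ValueError("risks must lie strictly between 0 and 1")
    odds_exposed = exposed_risk / (1.0 - exposed_risk)
    odds_baseline = baseline_risk / (1.0 - baseline_risk)
    return odds_exposed / odds_baseline


# Hypothetical relative risk of 1.5 at three baseline event proportions.
for p0 in (0.01, 0.10, 0.15):
    print(p0, round(odds_ratio_from_relative_risk(p0, 1.5), 3))
```

With a 1% baseline the odds ratio is about 1.51, nearly equal to the relative risk, whereas at baselines of 10% and 15% it is about 1.59 and 1.65, respectively.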
Incorporating follow-up time, say by nonlinear Cox proportional hazards
regression, and stratifying analyses by age groupings as Flegal et al. (2005) had done or
by using time-dependent covariates might provide more information and a better
approach to modeling the sparse mortality information among the most obese
participants. However, there are no software packages of which we are aware that offer
Cox regression model tools capable of incorporating free-knot spline bases and complex
sample designs. We suggest that if NCHS collects another round of mortality data from
the NDI, there might by then be sufficient mortality information to characterize the BMI-
mortality relationship among these individuals.
Some have found evidence from studies of the first three waves of NHANES that
the effects of elevated BMI on mortality risk might be decreasing with time (Flegal et al.,
2005). In some ways, this makes intuitive sense as being overweight or obese has become
highly prevalent and healthy people are commonly reaching these levels of BMI at earlier
ages. If elevated BMI does have deleterious effects which might accumulate with time, it
might take stratification by age cohorts in addition to greater follow-up periods to tease
out the BMI-mortality relationship. On the other hand, if these individuals continue to
survive, then their excess weight may be protective against mortality as they age (Stevens
et al., 1998).
Another limitation is that we have conducted analyses on the entire sample and
our results might be subject to the biasing effects of regression dilution (Clarke et al.,
1999) and what has been called “reverse-causation” (Manson et al., 1987; Willett et al.,
2005) which tend to deflate estimates of mortality related to BMI. Given the structure of
the NHANES III data as depicted in Figure 2, we suggest that the models are being
influenced by extreme values that would likely overshadow any regression dilution
affecting the size of estimates for elevated BMI. We have taken some measures to adjust
for reverse-causation factors such as smoking and baseline illness. These factors have the
potential to confound and spuriously associate low levels of BMI with elevated risk of
death. This can result in deflated relative risks when comparing groups with high BMI to
those with low BMI (Greenberg et al., 2007). Some have suggested removing participants
reporting exposures to reverse-causation factors (e.g., Willett et al., 2005). Others have
suggested that removal of subjects experiencing early deaths or confounding exposures
will not likely remove the bias and is thus not advisable (e.g., Allison et al., 1999c). Our
modeling approach allowed for considerable flexibility over the entire range of BMI. As
opposed to fitting a quadratic or cubic polynomial curve to the data, the piecewise linear
model effectively allowed us to fit well the steep linearly decreasing association observed
between low BMI and mortality and then break abruptly to model the relatively flat
mortality response surface associated with elevated BMI. This approach has placed the
estimated thresholds where they are likely to belong (i.e., at low BMI levels) which
limited the influence of those observations likely to cause bias on those with elevated
BMI. This, in addition to adjusting to some extent for illness and smoking, appeared to
effectively diminish the biasing effects of reverse-causation without removing
participants reporting exposures to reverse-causation factors.
References
Allison DB, Faith MS, Heo M, Kotler DP. (1997) Hypothesis concerning the U-shaped
relation between BMI and mortality. Am. J. Epidemiol;146:339-349.
Allison DB, Fontaine KR, Manson JE, Stevens J, VanItallie TB. (1999a) Annual Deaths
Attributable to Obesity in the United States. JAMA;282:1530-8.
Allison DB, Heo M, Flanders DW, Faith MS, Carpenter KM, Williamson DF. (1999c)
Simulation Study Of The Effects Of Excluding Early Mortality On Risk Factor-Mortality
Analyses In The Presence Of Confounding Due To Occult Disease: The
Example Of Body Mass Index. Annals of Epidemiology;9:132-42.
Allison DB, Zannolli R, Narayan KVM. (1999b) The direct health care costs of obesity in
the United States. American Journal of Public Health;89:1194-9.
Bessaoud F, Daurès JP, Molinari N. (2005) Free knot splines for
logistic models and threshold selection. Computer Methods and Programs
in Biomedicine;77:1-9.
Bigaard J, Frederiksen K, Tjonneland A, Thomsen BL, Overvad K, Heitmann BL,
Sorensen TI. (2004) Body fat and fat-free mass and all-cause mortality. Obes
Res;12:1042-9.
Calle EE, Thun MJ, Petrelli JM, Rodriguez C, Heath CW Jr. (1999) Body mass index and
mortality in a prospective cohort of US adults. N Engl J Med;341:1097–1105.
Clarke R, Shipley M, Lewington S, Youngman L, Collins R, Marmot M et al. (1999)
Underestimation of risk associations due to regression dilution in long-term
follow-up prospective studies. Am J Epidemiol;150:341–353.
de Boor C. (1978) A Practical Guide to Splines. New York: Springer-Verlag.
Elobeid MA, Desmond R, Thomas O, Keith SW, Allison DB. (2007) Waist
circumference values are increasing beyond that expected from body mass index
increases. Obesity;15:2380-3.
Flegal KM, Graubard BI, Williamson DF, Gail MH. (2005) Excess deaths associated
with underweight, overweight, and obesity. JAMA;293:1861-7.
Flegal et al. (2002) Prevalence and Trends in Obesity Among US Adults, 1999-2000.
JAMA;288:1723-7.
Fontaine KR, Redden DT, Wang C, Westfall AO, Allison DB. (2003) Years of life lost
due to obesity. JAMA;289:187-93.
Goetghebeur E, Pocock S. (1995) Detection and Estimation of J-shaped Risk-Response
Relationships. JR Statist Soc A;158:107-122.
Greenberg JA, Fontaine KR, Allison DB. (2007) Putative biases in estimating mortality
attributable to obesity in the US population. IJO;31:1449-55.
Heo M, Allison DB, Faith MS, Zhu S, Fontaine KR. (2003) Obesity and quality of life:
mediating effects of pain and comorbidities. Obes Res;11:209-16.
Kaplan RC, Heckbert SR, Furberg CD, Psaty BM. (2002) Predictors of subsequent
coronary events, stroke, and death among survivors of first hospitalized
myocardial infarction. J Clin Epidemiol;55:654-64.
Keith SW, Wang C, Fontaine KR, Allison DB. (2008) Body mass index and headache
among women: Results from 11 epidemiologic datasets. Obesity;16:377-83.
Keith SW, Desmond R, Allison DB. (2006b) Body fat and mortality: A survival analysis
of the third National Health and Nutrition Examination Study (NHANES III).
Obesity;14 Suppl A262.
Keith SW, Redden DT, Katzmarzyk P, Boggiano MM, Hanlon EC, Benca RM, Ruden D,
Pietrobelli A, Barger J, Fontaine K, Wang C, Arronne L, Wright S, Baskin M,
Dhurandhar N, Lijoi M, Grilo CM, De Luca M, Allison DB. (2006a) Putative
Contributors to the Secular Increase in Obesity: Exploring the Roads Less
Traveled. IJO;30:1585-94.
Lindqvist P, Andersson K, Sundh V, Lissner L, Bjorkelund C, Bengtsson C. (2006)
Concurrent and separate effects of body mass index and waist-to-hip ratio on 24-
year mortality in the Population Study of Women in Gothenburg: Evidence of
age-dependency. European Journal of Epidemiology;21:789-94.
Manson JE, Stampfer MJ, Hennekens CH, Willett WC. (1987) Body weight and
longevity: a reassessment. JAMA;257:353–58.
Manson JE, Willett WC, Stampfer MJ, Colditz GA, Hunter DJ, Hankinson SE,
Hennekens CH, Speizer FE. (1995) Body weight and mortality among
women. New England Journal of Medicine;333:677-685.
Mokdad AH, Marks JS, Stroup DF, Gerberding JL. (2004) Actual causes of death in the
United States, 2000. JAMA;291:1238-45.
Mokdad AH, Marks JS, Stroup DF, Gerberding JL. (2005) Correction: actual causes of
death in the United States, 2000. JAMA;293:293-4.
Ogden et al. (2002) Prevalence and Trends in Overweight Among US Children and
Adolescents, 1999-2000. JAMA;288:1728-32.
Plan and operation of the Third National Health and Nutrition Examination Survey,
1988-1994: series 1: programs and collection procedures. (1994) Vital Health Stat
1;32:1-407.
Price GM, Uauy R, Breeze E, Bulpitt CJ, Fletcher AE. (2006) Weight, shape, and
mortality risk in older persons: elevated waist-hip ratio, not high body mass index,
is associated with a greater risk of death. Am J Clin Nutr;84:449-60.
Rao JNK, Wu CFJ, Yue K. (1992) Some recent work on resampling methods for
complex surveys. Survey Methodology;18:209–217.
Rosengren A, Skoog I, Gustafson D, Wilhelmsen L. (2005) Body mass index, other
cardiovascular risk factors, and hospitalization for dementia. Arch Intern
Med;165:321-6.
Rothman KJ. (1992) Modern Epidemiology. Boston, MA: Little Brown.
Schwarz G. (1978) Estimating the dimension of a model. Ann. Stat;6:461-464.
Stevens J, Cai J, Pamuk ER, Williamson DF, Thun MJ, Wood JL. (1998) The effect of
age on the association between body-mass index and mortality. New Eng J
Med;338:1-7.
Troiano RP, Frongillo EA, Sobal J, Levitsky DA. (1996) The relationship between body
weight and mortality: A quantitative analysis of combined information from
existing studies. Int J Obesity;20:63-75.
U.S. Department of Health and Human Services (DHHS). (1996) National Center for
Health Statistics. Third National Health and Nutrition Examination Survey, 1988-
1994, NHANES III Laboratory Data File (CD-ROM). Public Use Data File
Documentation Number 76200. Hyattsville, MD: Centers for Disease Control and
Prevention.
WHO. (1998) Obesity. Preventing and managing the global epidemic. World Health
Organization, Geneva.
Willett WC, Hu FB, Colditz GA, Manson JE. (2005) Underweight, overweight, obesity,
and excess deaths. JAMA;294:551.
Zhao LP, Kolonel LN. (1992) Efficiency loss from categorizing quantitative exposures
into qualitative exposures in case-control studies. American Journal of
Epidemiology;136:464-74.
Zhou BF. (2002) Effect of body mass index on all-cause mortality and incidence of
cardiovascular diseases-report for meta-analysis of prospective studies open
optimal cut-off points of body mass index in Chinese adults. Biomed Environ
Sci;15:245-52.
Table 1
NHANES III data description.*

Survey baseline years           1988-1994
Mortality follow-up             through 2000
Unweighted sample size          14,386
Women (%)                       7,626 (53)

Baseline Study Variables        Women            Men
BMI: mean (SD)                  27.83 (6.58)     26.84 (4.83)
WHR: mean (SD)                  0.89 (0.08)      0.97 (0.07)
Age: mean (SD)                  52.22 (18.56)    52.83 (18.37)
Height (cm): mean (SD)          159.99 (7.24)    173.40 (7.52)
Ethnicity:
  Black (%)                     2,219 (29)       1,833 (27)
  Mexican-American (%)          1,843 (24)       1,849 (27)
  White (%)                     3,564 (47)       3,078 (46)
Smoking:
  Never (%)                     4,559 (60)       2,192 (32)
  Former (%)                    1,433 (19)       2,514 (37)
  Current (%)                   1,634 (21)       2,054 (31)
Alcohol ≥ 0.35 oz./day (%)      435 (6)          1,255 (19)
Illnesses (CHF, MI, stroke,
  cancer, or emphysema)         1,077 (14)       1,104 (16)

Follow-up Variable              Women            Men
Deaths (%)                      1,235 (16)       1,482 (22)

* Means and frequencies were not weighted or adjusted. They reflect the information available, but may not represent well population parameters.
Table 2
Piecewise linear logistic regression model* results for relating log-odds of mortality during follow-up to BMI and WHR by gender.

Predictor / Parameter           Parameter    Standard Error    95% CI
                                Estimate     (bootstrapped)    (bootstrapped)
BMI among women (n=7,626)
  β1 (slope for BMI<knot)       -0.56        0.18              (-1.02, -0.32)
  knot                          19.00        0.54              (18.10, 20.00)
  β2 (slope for BMI>knot)        0.01        0.01              (-0.01, 0.03)
BMI among men (n=6,760)
  β1 (slope for BMI<knot)       -0.23        0.05              (-0.30, -0.01)
  knot                          23.43        0.87              (23.23, 25.81)
  β2 (slope for BMI>knot)        0.03        0.02              (-0.01, 0.06)
WHR among women (n=6,973)
  β                              2.41        0.54              (1.31, 3.46)
WHR among men (n=6,267)
  β                             -0.37        0.85              (-2.03, 1.34)

* Adjusted for baseline age, age squared, and height, and indicators of ethnicity, smoking status, alcohol use, and serious illness.
Table 3
Odds ratios and bootstrap 95% CI across selected BMI and WHR values by gender.

BMI*     18.5                 25                   30                   35                   40
Women    1.25 (0.91, 1.74)    1.03 (0.99, 1.07)    1.10 (0.96, 1.25)    1.18 (0.94, 1.47)    1.26 (0.91, 1.72)
Men      2.85 (1.55, 3.57)    0.94 (0.68, 0.98)    1.07 (0.69, 1.15)    1.21 (0.69, 1.39)    1.38 (0.68, 1.74)

WHR†     0.8                  1.0                  1.1                  1.2                  1.3
Women    0.79 (0.71, 0.88)    1.27 (1.14, 1.41)    1.62 (1.30, 2.00)    2.06 (1.48, 2.82)    2.63 (1.69, 3.99)
Men      1.04 (0.87, 1.22)    0.96 (0.82, 1.14)    0.93 (0.67, 1.31)    0.90 (0.54, 1.50)    0.86 (0.44, 1.71)

* Compared to a BMI reference level of 23; † compared to a WHR reference level of 0.9.
[Figure 1 panels: a) BMI among women; b) WHR among women; c) BMI among men; d) WHR among men. Vertical axis: Frequency.]
Figure 1. Histograms depicting the distributions of BMI and WHR by gender.
[Figure 2 panels: a) BMI among women; b) WHR among women; c) BMI among men; d) WHR among men. Vertical axis: Proportion.]
Figure 2. Unadjusted proportion of deaths observed among those grouped per unit of BMI and 1/100 unit of WHR, respectively, by gender. Plotted numbers refer to the median age rounded and truncated to the nearest decade of each subunit grouping.
[Figure 3 panels: a) BMI* among women; b) WHR† among women; c) BMI* among men; d) WHR† among men. Vertical axis: Mortality OR.]
* BMI reference level is 23; † WHR reference level is 0.9.
Figure 3. Odds ratios plotted for BMI and WHR by gender. Fitted model (in red) and 20 randomly selected bootstrap replicate models (in black).
CONCLUSIONS
The obesity research field has grown considerably in recent years, but I believe
undue attention has been devoted to two postulated causes for increases in the prevalence
of obesity leading to neglect of other plausible mechanisms and well-intentioned, but
potentially ill-founded proposals for reducing obesity rates. For at least 10 putative
additional explanations for the increased prevalence of obesity over recent decades, in my
first paper I showed supportive (though not conclusive) evidence that, in many cases, is
as compelling as the evidence for more commonly discussed putative explanations.
Although the effect of any one factor may be small, the combined effects may be
consequential.
This paper illustrates the importance of developing new ideas and challenging the
assumptions we make in research. Currently, many researchers are investigating the
connections between the putative contributors we described and the obesity epidemic.
Each of the 10 putative contributors we highlighted deserves more attention, but sleep
debt has stood out as a strong candidate for immediate extensive research. Both sleep
research and obesity research are popular and investigators have been proposing (e.g.,
http://clinicaltrials.gov/ct2/show/NCT00261898?term=obesity+AND+sleep&rank=1) and
conducting observational studies or experiments relating sleep and the regulation of
certain aspects of metabolism including appetite and satiety which may affect excess
adipose tissue accumulation (e.g., Gangwisch et al., 2005; Hasler et al., 2004; Spiegel
et al., 2004). Other putative contributors are being investigated as well, such as studies of
endocrine disruptors (e.g.,
http://crisp.cit.nih.gov/crisp/CRISP_LIB.getdoc?textkey=7229996&p_grant_num=5R21
ES01372402&p_query=(obesity+%26+endocrine+%26+disruptors)&ticket=74028657&
p_audit_session_id=357566744&p_audit_score=14&p_audit_numfound=1&p_keywords
=ob) and environmental temperature (e.g.,
http://clinicaltrials.gov/ct2/show/NCT00521729?term=obesity+AND+temperature&rank
=1). These efforts will likely result in even more cogent investigations in future
interdisciplinary research providing a more comprehensive picture of what might be
driving the epidemic so that efforts to curb the secular increases in obesity may be better
guided.
With data resources growing in size and scope, so too should our abilities to draw
connections between health outcomes and possible predictors. I have provided extensive
details on the design, evaluation, and implementation of a framework for modeling
nonlinearity between a binary outcome and a continuous prognostic variable adjusted for
covariates in complex health survey samples. The primary objective of this methodology
was to analyze non-random survey samples by applying sophisticated modeling
techniques capable of detecting nonlinearity and adjusting model flexibility. It is
important that this methodology be useful in practice, so providing familiar-looking
parameterizations of output, such as linear slope coefficients and odds ratios, was a key
objective. Unlike other nonlinear modeling packages, my framework accounted for
multistage cross-sectional survey sample designs common to nationally representative
datasets. Under the conditions I simulated, my method of selecting the optimal number of
knots was commonly more accurate than Schwarz’s Bayesian Information Criterion
(BIC) (Schwarz, 1978) and very similar to Akaike’s Information Criterion (AIC)
(Akaike, 1974) in terms of accuracy and precision as long as sample sizes were relatively
large. Moreover, AIC and BIC were substantially biased in model selection when
sampling weights were incorporated.
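One way to see why information criteria can break down under sampling weights is a toy calculation. The Python sketch below is not the simulation reported in this dissertation: it uses hypothetical death counts in two groups, assumes the BIC penalty is computed from the unweighted number of observations, and compares a one-parameter model (common death proportion) with a two-parameter model (group-specific proportions), first with unit weights and then with every observation carrying a weight of 1,000.

```python
import math


def weighted_loglik(deaths, survivors, p, weight):
    """Weighted Bernoulli log-likelihood for one group sharing a common weight."""
    return weight * (deaths * math.log(p) + survivors * math.log(1.0 - p))


def bic(loglik, n_params, n_obs):
    """Schwarz's criterion: -2 log-likelihood plus n_params * log(n_obs)."""
    return -2.0 * loglik + n_params * math.log(n_obs)


def compare(weight):
    groups = [(10, 40), (12, 38)]  # hypothetical (deaths, survivors) per group
    n_obs = sum(d + s for d, s in groups)
    pooled = sum(d for d, _ in groups) / n_obs
    ll_small = sum(weighted_loglik(d, s, pooled, weight) for d, s in groups)
    ll_large = sum(weighted_loglik(d, s, d / (d + s), weight) for d, s in groups)
    return bic(ll_small, 1, n_obs), bic(ll_large, 2, n_obs)


bic_small_unw, bic_large_unw = compare(weight=1.0)
bic_small_wtd, bic_large_wtd = compare(weight=1000.0)
print(bic_small_unw < bic_large_unw)  # smaller model preferred with unit weights
print(bic_large_wtd < bic_small_wtd)  # larger model preferred once weights inflate the likelihood
```

Because the weights multiply the log-likelihood while the penalty term stays fixed, even a trivial difference between the groups is enough to overwhelm the penalty in the weighted case, which is the kind of distortion a bootstrap-based selection rule is meant to avoid.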
In the application of my framework to examine the relationship between BMI and
headaches among over 220,000 women, I found that a BMI of approximately 20 was
associated with the lowest risk of headache. Relative to a BMI of 20, mild obesity (BMI
of 30) was associated with roughly a 35% increase in odds of headache whereas severe
obesity (BMI of 40) was associated with roughly an 80% increase in odds. Results were
essentially unchanged when models were further adjusted for socioeconomic variables,
alcohol consumption, and hypertension. Consistently, across the study databases I
analyzed, obese women had significantly increased risk of headaches.
Menopause was a possible confounder of some of the results, but I would suggest
that it is unlikely to have a strong influence on the BMI-migraine association as our
results from WHI were consistent with other studies (e.g., Bigal et al., 2006) which were
not based on older women (as WHI was). From communicating with people in the field
about these results, a common opinion is that the personal volition to take medication for
illness or pain can be highly heterogeneous from person-to-person. While this might not
have confounded our results from NHANES I, per se, it might have added enough
“noise” to the outcome response (coded as headache vs. no headache) such that we could
not detect a significant association even with thousands of observations. Behavior is
almost certainly an important factor over which our analyses had no control. Stress,
physical activity, and diet behaviors each seem to have a strong impact on the frequency
and severity of migraine headaches. Research efforts on these connections and finding
treatment interventions are currently ongoing (e.g., The American Migraine Prevention
Study: http://clinicaltrials.gov/ct2/show/NCT00363506?term=headache+diet&rank=4).
BMI showed a nonlinear association with mortality in NHANES III where models
indicated thresholds in the odds of mortality at BMI values near 19 and 23 for women
and men, respectively. The broken line shapes of the BMI-mortality relationships were
similar for both men and women suggesting no significant elevation in odds with
increasing BMI. The data, however, suggested that a longer follow-up time might be
required for characterizing mortality at high levels of BMI. Linear logistic models were
adequate to relate WHR to mortality and suggested that increased WHR linearly
increased log(odds) of mortality among women (β = 2.4; 95% CI (1.3, 3.5)), but not
among men. For more information on the influence of ignoring the complex sample
design information and secondary analyses of BMI and mortality, see Appendix B.
I suggest that the methodology and information described in this research will be
useful to clinicians, public health scientists, epidemiologists, and biostatisticians. I have
implemented the only free-knot spline logistic regression modeling framework which
makes adjustments for complex sample designs. By linearly transforming the B-spline or
truncated power basis parameters to the more familiar-looking piecewise linear
polynomial parameter representations, I have provided a powerful estimation and
inference tool, the output of which may be understood by non-statisticians. Since BMI is
an easily measured and modifiable risk factor, clinicians and public health officials might
use the results I displayed from the application of my framework to advise or counsel
patients regarding the risk associated with being “below” or “above” a threshold BMI
level I detected to reduce their risk of headache or mortality. My results on the linear
relationship of WHR and mortality suggest that as compared to a woman with an average
WHR (0.9), similar women having WHR of 1.0 or 1.1 might have an increase in odds of
mortality of 25% or 62%, respectively. While these results should be interpreted with
some caution considering that these are observational cross-sectional results, they do
suggest that more focused prospective studies might produce similar results and be more
useful for direct application in clinical care and public health policy.
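The odds ratios quoted for WHR follow from exponentiating the fitted slope multiplied by the change in WHR. A minimal Python sketch of that arithmetic, assuming the rounded coefficient β = 2.4 reported for women (the function name is illustrative and not part of the original framework). Because the published coefficient is rounded, the computed percentages may differ by a point or two from those reported in the text.

```python
import math

def odds_ratio(beta: float, x_new: float, x_ref: float) -> float:
    """Odds ratio for moving a linear logistic predictor from x_ref to x_new."""
    return math.exp(beta * (x_new - x_ref))

beta_whr = 2.4  # rounded slope for WHR among women, taken from the text

# Compare WHR of 1.0 and 1.1 against the average WHR of 0.9.
or_10 = odds_ratio(beta_whr, 1.0, 0.9)
or_11 = odds_ratio(beta_whr, 1.1, 0.9)

print(f"WHR 1.0 vs 0.9: {100 * (or_10 - 1):.0f}% increase in odds")
print(f"WHR 1.1 vs 0.9: {100 * (or_11 - 1):.0f}% increase in odds")
```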
GENERAL LIST OF REFERENCES
Akaike H. (1974) A new look at the statistical model identification. IEEE Transactions
on Automatic Control;19:716–23.
Allison DB, Zannolli R, Narayan KVM. (1999) The direct health care costs of obesity in
the United States. American Journal of Public Health;89:1194-1199.
Bessaoud F, Daurès JP, Molinari N. (2005) Free knot splines for
logistic models and threshold selection. Computer Methods and Programs
in Biomedicine;77:1-9.
Bender R, Augustin T, Blettner M. (2005) Generating survival times to simulate Cox
proportional hazards models. Statistics in Medicine;24:1713-1723.
Bigal ME, Liberman JN, Lipton RB. (2006) Obesity and migraine: A population study.
Neurology;28:545–50.
Calle EE, Thun MJ, Petrelli JM, Rodriguez C, Heath CW Jr. (1999) Body mass index and
mortality in a prospective cohort of US adults. N Engl J Med;341:1097–1105.
Cox DR. (1961) Tests of separate families of hypotheses. Proceedings of the Fourth
Berkeley Symposium on Mathematical Statistics and Probability. Berkeley: University of
California Press, pp. 105–123.
Cox DR. (1962) Further results on tests of separate families of hypotheses. Journal of the
Royal Statistical Society, Series B;24:406–424.
Cox DR. (1972) Regression models and life tables (with Discussion). Journal of the
Royal Statistical Society, Series B;34:187-220.
Cox DR. (1975) Partial likelihood. Biometrika;62:69-72.
Davison AC, Hinkley DV. (1997) Bootstrap Methods and their Application. New York:
Cambridge University Press.
de Boor C. (1978) A Practical Guide to Splines. New York: Springer-Verlag.
Efron B. (1977) The efficiency of Cox’s likelihood function for censored data.
JASA;72:557-65.
Flegal KM, Graubard BI, Williamson DF, Gail MH. (2005) Excess deaths associated
with underweight, overweight, and obesity. JAMA;293:1861-7.
Fontaine KR, Redden DT, Wang C, Westfall AO, Allison DB. (2003) Years of life lost
due to obesity. JAMA;289:187-93.
Gangwisch JE, Malaspina D, Boden-Albala B, Heymsfield SB. (2005) Inadequate sleep
as a risk factor for obesity: analyses of the NHANES I. Sleep;28:1289-96.
Hasler G, Buysse DJ, Klaghofer R, Gamma A, Ajdacic V, Eich D, Rossler W, Angst J.
(2004) The association between short sleep duration and obesity in young adults:
a 13-year prospective study. Sleep;27:661-6.
Keith SW, Wang C, Fontaine KR, Allison DB. (in press) Body mass index and headache
among women: Results from 11 epidemiologic datasets. Obesity;16:377-83.
Keith SW, Desmond R, Allison DB. (2006b) Body fat and mortality: A survival analysis
of the third National Health and Nutrition Examination Study (NHANES III).
Obesity;14 Suppl A262.
Keith SW, Redden DT, Katzmarzyk P, Boggiano MM, Hanlon EC, Benca RM, Ruden D,
Pietrobelli A, Barger J, Fontaine K, Wang C, Arronne L, Wright S, Baskin M,
Dhurandhar N, Lijoi M, Grilo CM, De Luca M, Allison DB. (2006) Putative
Contributors to the Secular Increase in Obesity: Exploring the Roads Less
Traveled. IJO;30:1585-94.
Klein JP, Moeschberger ML. (2003) Survival Analysis, Second Edition. Springer-
Verlag, New York.
Korn EL, Graubard BI. (1999) Analysis of Health Surveys. J. Wiley & Sons, New York.
Molinari N, Daures JP, Durand JF. (2001) Regression splines for threshold selection in
survival data analysis. Statistics in Medicine;20:237-247.
Narayan KM, Boyle JP, Thompson TJ, Gregg EW, Williamson DF. (2007) Effect of BMI
on lifetime risk for diabetes in the U.S. Diabetes Care;30:1562-6.
Parmigiani G. (2002) Measuring uncertainty in complex decision analysis models.
Statistical Methods in Medical Research;11:513-37.
Rao JNK. (1997) Developments in sample survey theory: an appraisal. The Canadian
Journal of Statistics;25:1-21.
Rao JNK, Wu CFJ, Yue K. (1992) Some recent work on resampling methods for
complex surveys. Survey Methodology;18:209–217.
Schwarz G. (1978) Estimating the dimension of a model. Ann. Stat;6:461-464.
Spiegel K, Tasali E, Penev P, Van Cauter E. (2004) Brief communication: Sleep
curtailment in healthy young men is associated with decreased leptin levels,
elevated ghrelin levels, and increased hunger and appetite. Ann Intern
Med;141:846-50.
Therneau TM, Grambsch PM. (2000) Modeling Survival Data: Extending the Cox
Model. New York: Springer.
U.S. Department of Health and Human Services (DHHS). (1996) National Center for
Health Statistics. Third National Health and Nutrition Examination Survey, 1988-
1994, NHANES III Laboratory Data File (CD-ROM). Public Use Data File
Documentation Number 76200. Hyattsville, MD: Centers for Disease Control and
Prevention.
APPENDIX
A. Future Directions: Nonlinear Cox Proportional Hazards Regression
Since the proposed framework can be based on maximizing likelihood functions, I
anticipate that the method may be extended to censored time-to-event data models to
construct nonlinear Cox proportional hazards models that rely on the partial
likelihood theory introduced by Cox (1972; 1975). Suppose that X is an N x 1 vector of
data on some continuous prognostic variable of interest, Z represents an N x (p + 1) matrix
consisting of a column of ones followed by p columns of data on covariates, and t is a
vector of time to event or censoring. The basic Cox model relies on the hazard relation,
λ(t | Z,X) = λ0(t)exp{[Z X]β},
where t represents time to either the event of interest or right censoring (at the end of the
study or otherwise lost to follow-up) and β is a (p + 2) x 1 vector of regression parameter
coefficients which can be estimated without specifying the baseline hazard function, λ0(t),
by choosing β̂ to maximize the partial likelihood (Cox, 1975).
The relationship between $\log\{\lambda(t \mid Z, X)/\lambda_0(t)\}$ and [Z X]β is linear in β and can
be easily modified to include a nonlinear predictor by replacing X with a B-spline basis
and representing $\beta_{p+2}$ with a linear combination of B-spline coefficients.
In general, the values in Z or X can be either fixed under the assumption that they will
not change over time or they can be modeled if they vary with time: Z(t) or X(t). For the
immediately planned applications, I may assume X to be fixed and leave modeling of
time-varying covariates for future research, perhaps to coincide with modeling survival as
a counting process (Therneau and Grambsch, 2000).
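Our method will extend to model hazard ratios nonlinearly by estimating the hazard
by,
$$\lambda(t \mid \mathbf{Z}, \mathbf{X}) = \lambda_0(t)\exp\{\eta([\mathbf{Z}\ \mathbf{X}])\},$$
where for the qth individual,
$$\eta([\mathbf{Z}_q\ X_q]) = \mathbf{Z}_q\boldsymbol{\beta} + \mathbf{B}(X_q)\mathbf{b} = \beta_0 + \beta_1 Z_{q2} + \dots + \beta_p Z_{q(p+1)} + \sum_{i=1}^{K+1} b_i B_{i,2}(X_q).$$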
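To make the structure of η concrete, here is a minimal Python sketch of a piecewise linear predictor for one individual. It is an illustration under stated assumptions, not the SAS implementation described in this work: it swaps in a truncated power basis, which spans the same piecewise linear functions as linear B-splines and also uses K + 1 spline coefficients, and the function names, knot locations, and coefficient values are all hypothetical.

```python
from typing import List, Sequence

def truncated_linear_basis(x: float, knots: Sequence[float]) -> List[float]:
    """Return [x, (x - k_1)_+, ..., (x - k_K)_+], i.e. K + 1 basis terms."""
    return [x] + [max(x - k, 0.0) for k in knots]

def linear_predictor(z: Sequence[float], beta: Sequence[float],
                     x: float, knots: Sequence[float],
                     b: Sequence[float]) -> float:
    """eta = z . beta + basis(x) . b, with z starting with 1 for the intercept."""
    basis = truncated_linear_basis(x, knots)
    if len(z) != len(beta) or len(basis) != len(b):
        raise ValueError("coefficient vectors must match their design rows")
    covariate_part = sum(zi * bi for zi, bi in zip(z, beta))
    spline_part = sum(gi * ci for gi, ci in zip(basis, b))
    return covariate_part + spline_part

# Hypothetical values: one covariate, two knots at BMI 20 and 30.
eta = linear_predictor(z=[1.0, 0.5], beta=[0.2, -0.4],
                       x=27.0, knots=[20.0, 30.0], b=[-0.10, 0.15, 0.05])
print(round(eta, 4))
```

With this parameterization the slope below the first knot is b[0], and each later coefficient is the change in slope at its knot, which is one way to recover the piecewise linear slopes discussed in the text.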
Here I will be estimating parameters by maximizing the nonlinear weighted partial
likelihood,
$$PL(\boldsymbol{\theta} \mid \mathbf{Z}, \mathbf{X}) = \prod_{q=1}^{N}\left[\frac{\exp\{\eta([\mathbf{Z}_q\ X_q])\}}{\sum_{j=1}^{N}\exp\{\eta([\mathbf{Z}_j\ X_j])\}\, I\{t_q \le t_j\}}\right]^{\delta_q w_q},$$
where θ = {β0, β1,…, βp, b1,…, bK+1}, wq is the complex sample weight assigned to the qth
participant, and δq is a censorship variable such that,
$$\delta_q = \begin{cases} 1, & \text{if } t_q \text{ is an event time} \\ 0, & \text{if } t_q \text{ is a censored time.} \end{cases}$$
As it may be more convenient to optimize by maximization in PROC NLP, the nonlinear
log weighted partial likelihood is,
$$\log\{PL(\boldsymbol{\theta} \mid \mathbf{Z}, \mathbf{X})\} = \sum_{q=1}^{N} w_q\delta_q\left[\eta([\mathbf{Z}_q\ X_q]) - \log\left\{\sum_{j=1}^{N}\exp\{\eta([\mathbf{Z}_j\ X_j])\}\, I\{t_q \le t_j\}\right\}\right].$$
It is important to note that I may expect to see a fairly large number of ties in survival
times, so an adjustment to the partial likelihood must be made. Efron (1977) proposed an
adjustment to the partial likelihood for this situation and his method will be incorporated
into our nonlinear framework.
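The weighted log partial likelihood can be evaluated directly once the linear predictors are available. The following Python sketch is only an illustration of that objective function: it assumes the predictors η have already been computed, it applies no correction for tied event times (so Efron's adjustment is not included), and the function name and example values are hypothetical.

```python
import math
from typing import Sequence

def weighted_log_partial_likelihood(eta: Sequence[float],
                                    t: Sequence[float],
                                    delta: Sequence[int],
                                    w: Sequence[float]) -> float:
    """Sum over events q of w_q * (eta_q - log(sum of exp(eta_j) over t_j >= t_q))."""
    n = len(t)
    total = 0.0
    for q in range(n):
        if delta[q] == 0:
            continue  # censored observations contribute only through risk sets
        # Risk set: everyone still under observation at time t_q.
        risk_sum = sum(math.exp(eta[j]) for j in range(n) if t[j] >= t[q])
        total += w[q] * (eta[q] - math.log(risk_sum))
    return total

# Hypothetical toy data: three participants, the last one censored.
value = weighted_log_partial_likelihood(
    eta=[0.0, 0.0, 0.0], t=[1.0, 2.0, 3.0], delta=[1, 1, 0], w=[1.0, 2.0, 1.0])
print(value)
```

The double loop costs O(N²) operations; sorting by time and accumulating the risk-set sums would be the usual refinement for large samples.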
I propose in this section a novel method of selecting the optimal number of knots.
Like Bessaoud et al. (2005) and Molinari et al. (2001), I will be interested in identifying
the fitted knots and evaluating their potential to be interpreted as thresholds defining
clinically (or biologically) meaningful groups having differential patterns of risk. It will
be very important to specify a parsimonious number of knots, say K = 4 or fewer, which
would indicate 5 or fewer different risk groups. Therefore, keeping the framework from
producing overparameterized models having unnecessary knots will be a priority.
Again, I am considering a set of p linear and/or categorical covariates and one
potentially nonlinear prognostic variable. Model selection herein is aimed at selecting the
number of meaningful knots that best suits a given set of complex sample data. Our
technique involves a selection procedure based on Monte Carlo sampling of partial
likelihood ratio statistics for the addition of parameters (i.e., knots and adjoining slopes)
to a given piecewise linear model.
Note here that these free-knot models are generally non-nested (Cox 1961,
1962), which requires testing nearly all possible $\binom{K_{max}+1}{2}$ pairs of models, where Kmax
represents the maximum number of knots (i.e., 4). The exception is that a K = 0 linear
model is nested in every free-knot spline model. So, I would have to conduct all pairwise
tests for models shown to be significantly better than the linear Cox model. That is, the
number of tests will be the Kmax tests against the K = 0 model plus all pairwise tests among
the models found significantly better than the K = 0 model (say S of them):
$$\#\,\text{tests} = K_{max} + I\{S > 1\}\binom{S}{2}.$$
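As a quick check on the size of this testing burden, the count can be computed for any choice of Kmax and S. This is a small illustrative Python sketch; the function name is hypothetical.

```python
import math

def number_of_tests(k_max: int, s: int) -> int:
    """K_max tests against the K = 0 model plus all pairs among the S survivors."""
    pairwise = math.comb(s, 2) if s > 1 else 0
    return k_max + pairwise

for s in range(5):
    print(s, number_of_tests(4, s))
```

When every candidate model beats the linear model (S = Kmax = 4), the count is 4 + 6 = 10, which equals the binomial coefficient of Kmax + 1 choose 2 quoted above.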
The test statistic for the least squares framework will be a partial likelihood ratio (PLR),
$$PLR = \frac{PL(\hat{\boldsymbol{\theta}}_0 \mid \mathbf{X})}{PL(\hat{\boldsymbol{\theta}}_1 \mid \mathbf{X})},$$
where $\hat{\boldsymbol{\theta}}_0$ represents the parameters found to be the best fit for some model having $K_0$
knots, which is then considered to compose the null model (including the intercept, linear
coefficients, free-knots, and the spline parameters (i.e., the piecewise linear slopes)); and
$\hat{\boldsymbol{\theta}}_1$ represents the parameter set that maximized the partial likelihood under some
alternative model having $K_1$ knots such that $K_0 < K_1 \le K_{max}$.
The parametric bootstrap described by Davison and Hinkley (1997) is a tool I can
use to repeatedly simulate Monte Carlo datasets. I use the fitted model parameter
estimates from the null model fitted to the original data to compute a hypothetical
distribution of PLR statistics, under the null hypothesis that the “reduced” null model
having K0 knots is true.
Say I draw D parametrically bootstrapped datasets of outcomes (censored time-to-
event random variables) and compute the PLR distribution $\{PLR^*_1, \dots, PLR^*_D\}$ by fitting
models having K0 and K1 knots, respectively, to each of the D bootstrap datasets. The
bootstrap p-value, representing the probability that adding (K1 - K0) knots to the null
model produces a PLR at least as small as what was observed by chance alone if the K0-
knot null model were true, is
$$p_{boot} = \frac{1}{D}\sum_{i=1}^{D} I\{PLR^*_i \le PLR\}.$$
I let α represent the significance level for this parametric bootstrap test of the
contribution to significantly increasing the partial likelihood. Comparing pboot to α gives
us a likelihood ratio test of size α for the contribution to the likelihood from the K1-knot
model. Simulating a variety of data and fitting models at various levels of α should lead
to a broad understanding of the specificity and sensitivity of the parametric bootstrap
testing and how to properly adjust the framework for optimal parsimony.
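The bootstrap p-value and its comparison with α reduce to a simple proportion once the bootstrap PLR statistics have been computed. The Python sketch below assumes the model fitting has already been done elsewhere; the function name and the numeric values are hypothetical placeholders rather than results.

```python
from typing import Sequence

def bootstrap_p_value(plr_observed: float, plr_boot: Sequence[float]) -> float:
    """Proportion of bootstrap PLR statistics at least as small as the observed PLR."""
    if len(plr_boot) == 0:
        raise ValueError("at least one bootstrap statistic is required")
    return sum(1 for v in plr_boot if v <= plr_observed) / len(plr_boot)

# Hypothetical statistics from D = 5 bootstrap datasets generated under the null model.
p_boot = bootstrap_p_value(0.30, [0.90, 0.50, 0.25, 0.70, 0.10])
alpha = 0.05
reject_null = p_boot <= alpha
print(p_boot, reject_null)
```

In practice D would be much larger than five, since the smallest nonzero p-value this estimator can return is 1/D.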
Parametric bootstrapping for censored time-to-event outcomes
Imputing repeated simulated outcomes by the parametric bootstrap procedure for
free-knot selection in the piecewise logistic regression modeling (see step 4 in Table 1) is
fairly straightforward. The free-knot selection in the piecewise Cox regression modeling
will be more complicated. In order to estimate a distribution of censored time-to-event
outcomes under the null hypothesis of K knots, I will need to not only estimate the linear
and spline parameters, but also the baseline hazard function by Breslow’s estimator
(Klein and Moeschberger, 2003) to estimate the survival probabilities for each subject.
Once I have estimated the model parameters and baseline hazard function, I plan to
generate censored survival times by the methods outlined by Bender et al. (2005).
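The inverse-transform idea behind Bender et al. (2005) can be sketched in a few lines of Python. This is a simplified illustration and not the planned procedure itself: it assumes a constant (exponential) baseline hazard in place of the Breslow estimate, a single fixed censoring time, and hypothetical function and parameter names.

```python
import math
import random
from typing import List, Sequence, Tuple

def simulate_censored_times(eta: Sequence[float], baseline_rate: float,
                            censor_time: float,
                            rng: random.Random) -> List[Tuple[float, int]]:
    """Draw (observed time, event indicator) pairs under a Cox model
    with constant baseline hazard: T = -log(U) / (rate * exp(eta))."""
    outcomes = []
    for e in eta:
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        event_time = -math.log(u) / (baseline_rate * math.exp(e))
        if event_time <= censor_time:
            outcomes.append((event_time, 1))   # event observed
        else:
            outcomes.append((censor_time, 0))  # right censored at study end
    return outcomes

rng = random.Random(2008)  # fixed seed so the sketch is reproducible
sims = simulate_censored_times([0.0, 0.5, -0.5], baseline_rate=0.1,
                               censor_time=10.0, rng=rng)
print(sims)
```

Replacing the constant rate with an inverted cumulative baseline hazard would bring the sketch closer to the Breslow-based approach described above.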
B. On the effects of ignoring sample weights or those with extremely high BMI
To examine how robust our findings might be to ignoring the sample weights and
assuming the data were drawn by simple random sampling, I analyzed some of the data
without the survey weights and design information. Some of the headache study datasets
were not nationally representative and did not include sample weights. In these datasets,
the results were very consistent across studies that ascertained information on the
frequency or severity of headache. When I analyzed the NHIS 2003 data without
sample weights, the results were quite similar, but I noted some obviously deflated
variance estimates in comparison to the complex sample weighted results. For the
mortality study, I used the AIC and BIC methods to select the “best” nonlinear models
for BMI and WHR by gender group. This approach avoided the parametric 2 df testing
procedure allowing me to grid search over 9 BMI and WHR locations (as opposed to 6)
in a reasonable amount of time. The WHR parameter estimates were nearly identical in
magnitude, but the variance estimates were somewhat deflated. BMI, on the other hand,
would allow two knots into the model for women and three knots for the men. Figure 1
shows the results plotted in terms of odds ratios. The model for women maintained the
steep slope for low BMI seen before, and the first knot remained near BMI = 20,
after which the fitted relationship increased linearly until a second knot near BMI = 50
introduced a different, very steep slope reflecting the lack of mortality information
for those with extremely high BMI. The model for men also maintained the steep slope
for low BMI, and the first knot remained near BMI = 23, after which the slope
was flat until a second knot near BMI = 36 specified an increase in
risk for the severely obese. The third knot breaks the relationship again at around BMI =
42, at which point the leverage of those with BMI ≥ 45 showing little or no mortality
information caused our model to show decreased mortality odds with increased BMI
among those with extremely high BMI. These results demonstrate what models are likely
to be fitted with our framework once we have the computing power to grid search over
more potential knot locations in the BMI spectrum.
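The AIC and BIC comparisons described above can be sketched as follows. This is a minimal Python illustration using the standard definitions from Akaike (1974) and Schwarz (1978); the log-likelihood values, parameter counts, and sample size are hypothetical placeholders rather than NHANES III results, and the sketch makes no adjustment for the sample design.

```python
import math
from typing import Dict, Tuple

def aic(log_lik: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 log L."""
    return 2.0 * n_params - 2.0 * log_lik

def bic(log_lik: float, n_params: int, n_obs: int) -> float:
    """Schwarz (Bayesian) information criterion: k log n - 2 log L."""
    return n_params * math.log(n_obs) - 2.0 * log_lik

# Hypothetical candidates: number of knots -> (maximized log-likelihood, parameters).
# Each added knot contributes two parameters (the knot and its adjoining slope).
candidates: Dict[int, Tuple[float, int]] = {
    0: (-520.0, 4),
    1: (-510.0, 6),
    2: (-509.5, 8),
}
n_obs = 1000  # hypothetical sample size

best_by_aic = min(candidates, key=lambda k: aic(*candidates[k]))
best_by_bic = min(candidates, key=lambda k: bic(*candidates[k], n_obs))
print(best_by_aic, best_by_bic)
```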
[Figure 1: two panels, a) BMI* among women and b) BMI* among men; vertical axis: mortality OR. * BMI reference level is 23.]
Figure 1. NHANES III mortality odds ratios plotted for BMI by gender from models selected without consideration for sample design. Models shown were selected by using AIC.
To see what the models would look like if we removed those participants with
extremely high BMI, I ran the analyses over again, but after having removed any subjects
with BMI > 45. Figure 2 shows the results from these models. The results for the women
in this restricted dataset were nearly identical to the results previously reported by our
methods. However, the model fitted to the men in this restricted dataset was quite
different. It specified four knots located at BMI ∈ {25.2, 35.7, 40.4, 41.7}. I am confident
that we are seeing some overfitting in this model, especially at BMI > 40, but the
two knots at 40.4 and 41.7 significantly increased the likelihood that the data came from
this model and were necessary to partition the data to better reflect the increased odds of
mortality we expect to see, given the information presented in the fourth paper, among
men having 35 < BMI < 40.
[Figure 2: two panels, a) BMI* among women and b) BMI* among men; vertical axis: mortality OR. * BMI reference level is 23.]
Figure 2. NHANES III mortality odds ratios calculated by our framework plotted for participants having BMI ≤ 45 by gender.