Health effects modeling
Lianne Sheppard University of Washington
Outline
• Introduction
• Conceptual overview for health effect studies
  • Disease and risk model
  • Exposure and measurement models
• Health effects study designs and relationship to exposure assessment
  • Measured exposure focused through the lens of study design
• Challenges in health modeling
  • Example 1: Cohort study – implications of predicted exposure
  • Example 2: Time series study – understanding the estimated health effect parameter
• Discussion
Introduction
• Epidemiological study interpretation
  • Estimates of association in the context of the particular study
    • Study population
    • Health outcome
    • Exposure metric and data
    • Confounders and other adjustment variables
    • Study design
  • Can’t infer causality from observational studies
• Goal: Understand properties of health effect estimates in epidemiological studies
• Context: Health effects of ambient air pollution
• My interests:
  • Impact and implications of specific study designs in the context of the exposure data – study design as a lens to focus the data
  • Role of exposure assessment and data on health effect estimates
Conceptual framework for health effect studies
• Disease model
  • Relates the true environmental exposure to the disease outcome
  • Includes the health effect parameter(s) of interest
• Exposure model
  • Describes the distribution of exposure over space, time, and individuals
• Measurement model
  • Relates measured exposures to the true unknown exposure
• Study design
  • Sources of exposure variation should frame the design of any epidemiological study
  • Limitations in exposure assessment that will lead to measurement error bias must also be considered
Disease model
• Relates the exposure to the disease outcome, e.g.
  $$E(Y_{it}) = \exp\left(X^P_{it}\beta + Z_{it}\gamma\right)$$
  for the outcome $Y_{it}$ on individual i at time t, personal exposures $X^P_{it}$, and health effect parameter β
• β is the parameter of interest – “toxicity”
• Also includes
  • Confounders and other adjustment variables ($Z_{it}$)
  • A dependence model (as needed)
Risk model
The disease model includes the risk model – a model to reflect risk over time
Under an expanded risk model, the disease model is
$$E(Y_{it}) = \exp\Big(\sum_{s \le t} X^P_{is}\,\beta(t;s) + Z_{it}\gamma\Big)$$
where β(t;s) denotes the influence of exposure at time s on risk at time t.
Risk model examples
• Current risk: Risk at time t is affected by exposure at time t:
  $$E(Y_{it}) = \exp\left(X^P_{it}\beta_1 + Z_{it}\gamma\right)$$
• Cumulative constant risk: Risk is determined by cumulative exposure during the previous m days:
  $$E(Y_{it}) = \exp\Big(\beta_2 \sum_{s=t-(m-1)}^{t} X^P_{is} + Z_{it}\gamma\Big)$$
• Lagged constant risk: Risk is determined by cumulative exposure during the previous m days, lagged n days:
  $$E(Y_{it}) = \exp\Big(\beta_3 \sum_{s=t-n-(m-1)}^{t-n} X^P_{is} + Z_{it}\gamma\Big)$$
• Cumulative time-varying risk: Risk varies over time and is determined by cumulative exposure during the previous m days:
  $$E(Y_{it}) = \exp\Big(\sum_{s=t-(m-1)}^{t} X^P_{is}\beta_s + Z_{it}\gamma\Big)$$
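The four risk models differ only in which days of personal exposure enter the log-risk at time t. A minimal Python sketch of those exposure windows follows. The function name `risk_exposure` and its arguments are hypothetical, and days are assumed to be indexed from 0 in a single person's exposure series. For the first three models the returned sum still has to be multiplied by the health effect parameter; for the time-varying model the per-day coefficients are applied inside the sum.

```python
import numpy as np

def risk_exposure(x, t, model, m=1, n=0, weights=None):
    """Exposure term entering the log-risk at day t for one person.

    x       : 1-D array of daily personal exposures (index = day s)
    model   : "current", "cumulative", "lagged", or "time_varying"
    m       : number of days in the exposure window
    n       : lag in days (used by the lagged model only)
    weights : length-m per-day coefficients, oldest day first
              (used by the time-varying model only)
    """
    x = np.asarray(x, dtype=float)
    if model == "current":
        return float(x[t])
    # Lagged windows end n days before t; all other windows end at t.
    end = t - n if model == "lagged" else t
    start = end - (m - 1)
    if start < 0:
        raise ValueError("exposure window starts before the first day")
    window = x[start:end + 1]
    if model in ("cumulative", "lagged"):
        return float(window.sum())
    if model == "time_varying":
        return float(np.dot(np.asarray(weights, dtype=float), window))
    raise ValueError(f"unknown model: {model}")

# Toy exposure series: exposure on day s equals s.
x = np.arange(10.0)
current = risk_exposure(x, 5, "current")
cumulative = risk_exposure(x, 5, "cumulative", m=3)      # days 3, 4, 5
lagged = risk_exposure(x, 5, "lagged", m=3, n=2)         # days 1, 2, 3
varying = risk_exposure(x, 5, "time_varying", m=3, weights=[0.1, 0.2, 0.3])
```
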
Basic personal air pollution exposure model (e.g. particulate matter – PM)
Total personal exposure = Non-ambient source exposure + Fraction of ambient × Ambient source concentration
$$X^P_{it} = X^N_{it} + \alpha_{it} C_{it}$$
for person i at time t
• Ambient source exposure: $X^A_{it} = \alpha_{it} C_{it}$
• We can measure $C_{it}$ and $X^P_{it}$
• Assume ambient and non-ambient sources are independent
Exposure model component: α
Fraction of ambient concentration experienced as exposure:
$$\alpha_{it} = o_{it} + (1 - o_{it})\,F^{\mathrm{inf}}_{it}$$
• $o_{it}$ is the fraction of time spent outdoors
• $F^{\mathrm{inf}}_{it}$ is the infiltration efficiency (building filter)
• Varies by season, person/building, region, species (or characteristic)
• Note $0 \le \alpha_{it} \le 1$
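As a small worked illustration of the exposure model, the sketch below computes the ambient attenuation α from the fraction of time outdoors and the infiltration efficiency, then the total personal exposure. The function names and all numeric inputs are hypothetical and chosen only for illustration.

```python
def ambient_attenuation(o, f_inf):
    """alpha = o + (1 - o) * F_inf, with o and F_inf both in [0, 1]."""
    if not (0.0 <= o <= 1.0 and 0.0 <= f_inf <= 1.0):
        raise ValueError("o and f_inf must lie in [0, 1]")
    return o + (1.0 - o) * f_inf

def personal_exposure(x_nonambient, o, f_inf, c):
    """Total personal exposure: non-ambient part plus alpha times ambient concentration."""
    return x_nonambient + ambient_attenuation(o, f_inf) * c

# Hypothetical person-day: 10% of time outdoors, infiltration efficiency 0.5,
# ambient concentration 20 and non-ambient exposure 5 (same units).
alpha = ambient_attenuation(0.1, 0.5)
x_p = personal_exposure(5.0, 0.1, 0.5, 20.0)
```

Because α is a weighted average of 1 (outdoors) and the infiltration efficiency (indoors), it always stays between the infiltration efficiency and 1.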
Measurement model
• Needed because typically only measurements of $C_{it}$ are available while $X^P_{it}$ or $X^A_{it}$ are of interest
• The measurement model defines sources of variation:
  • The data don’t capture (“Berkson”)
  • The data capture but aren’t of interest (“classical”)
• Measurement models
  • Are needed to avoid bias
  • Are assumed to not provide additional information about health effects
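The practical difference between the two error types can be seen in a small simulation. The sketch below is not the author's analysis: it assumes a linear outcome model (rather than the log-linear disease model) with unit-variance normal exposure and error, and the variable names are hypothetical. Under these assumptions classical error attenuates the slope toward zero, while Berkson error leaves it roughly unbiased but noisier.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta = 200_000, 2.0

def slope(x, y):
    """Ordinary least squares slope of y on x."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Classical error: measurement = truth + noise (variation the data
# capture that is not of interest).
x_true = rng.normal(0.0, 1.0, n)
y = beta * x_true + rng.normal(0.0, 1.0, n)
w_classical = x_true + rng.normal(0.0, 1.0, n)
slope_classical = slope(w_classical, y)   # roughly beta / 2 in this set-up

# Berkson error: truth = assigned value + noise (variation the data
# don't capture).
w_berkson = rng.normal(0.0, 1.0, n)
x_true_b = w_berkson + rng.normal(0.0, 1.0, n)
y_b = beta * x_true_b + rng.normal(0.0, 1.0, n)
slope_berkson = slope(w_berkson, y_b)     # roughly beta
```
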
Health effect study designs – Ambient source air pollution exposure
• Rely most on short-term temporal exposure variation:
  • Panel studies
  • Time series studies
  • Case-crossover studies
• Rely most on spatial exposure variation:
  • Cohort studies
  • Migration studies
• Rely on either or both temporal and spatial variation:
  • Medium term longitudinal studies
  • Cross-sectional studies
Panel studies
• Enroll a panel of subjects and observe them repeatedly over time
• Strengths
  • Possible to collect comprehensive personal, home indoor, and home outdoor exposure data on every subject
  • Uniquely suited to study personal exposure effects
  • Can directly measure health outcomes
• Challenges
  • High effort for a limited number of subjects
  • Power limited for affordable studies and rare outcomes
  • Significant feasibility issues need to be overcome
  • Can be very difficult to detect small effects because of the large heterogeneity in individual responses and uneven compliance with study protocol (medication use, data collection)
    • Heterogeneity between subjects can swamp the small effects of air pollution
  • Analysis approach can affect conclusions, particularly with typical small panel sizes
Time series studies
• Estimate the association between time-varying ambient concentration and time-varying population event counts
• Rely on temporal exposure variation
• Strengths
  • Simple and inexpensive (use administrative data)
  • Powerful: can target huge populations
  • Appear uniquely suited to estimate acute health effects of ambient pollutants for rare events
  • Bias due to spatial variation in PM is likely to be small
• Challenges
  • Sources of bias not well understood (is an ecological design => possible ecological bias)
    • However, individuals are crossed with time, so ecological biases are much less likely to dominate than when individuals are nested
  • Results can be sensitive to modeling choices (and software)
    • Confounding removed through modeling
  • Don’t capture chronic effects, non-ambient exposures
  • Don’t estimate toxicity (rather estimate attenuated toxicity, attenuated for building characteristics and population behavior)
Case-crossover studies
• Assess acute effects of air pollution by comparing exposures on the day with an event (index day) to days without the event (referent days)
• Essentially time series studies with a different approach to confounding control:
  • Confounding controlled by matching (and modeling) rather than modeling alone
• Some approaches to referent selection lead to biased health effect estimates (overlap bias)
  • Time-stratified referent selection recommended
  • Commonly used symmetric bidirectional referents are subject to overlap bias
• Similar scientific considerations as time series studies
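Time-stratified referent selection can be sketched in a few lines. The slides do not say which strata to use; the sketch below assumes one common choice, in which referents are all other days in the same calendar month and year that fall on the same weekday as the index day. The function name is hypothetical.

```python
from datetime import date, timedelta

def time_stratified_referents(index_day):
    """Referent days for one index day.

    Assumed stratum: same calendar year and month, same day of week.
    The index day itself is excluded.
    """
    referents = []
    day = index_day.replace(day=1)
    while day.month == index_day.month:
        if day.weekday() == index_day.weekday() and day != index_day:
            referents.append(day)
        day += timedelta(days=1)
    return referents

# Index day: Wednesday 12 July 2000; referents are the other
# Wednesdays in July 2000.
referents = time_stratified_referents(date(2000, 7, 12))
```

Because the strata are fixed in advance and do not depend on when the event occurred, every day in a stratum has the same referent set, which is what distinguishes this scheme from referents chosen symmetrically around the index day.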
Cohort studies
• Follow subjects over time to relate some measure of usual exposure to health events
• Rely on variation in exposure over space (shared exposure) and individual (total exposure, including unshared components)
• Incomplete exposure ascertainment implies
  • Need to rely on an exposure prediction model
  • Because of limited exposure assessment, these are semi-individual studies
    • Can’t rule out ecological biases
    • Individuals are nested within areas
• Unclear how to best accumulate exposure over time. What are the implications? e.g.,
  • Average exposure
  • Time-varying risk model
Challenges in analysis and interpretation of epidemiological studies – Bias
• Air pollution health effects are small and thus can be easily swamped by even small biases
• Confounding is
  • A major source of bias
  • Orders of magnitude larger than the air pollution effect of interest
• Other less well understood issues
  • Exposure vs. concentration and attenuation of ambient exposure (recall ambient exposure = ambient concentration × α)
    • Loss of information
    • Bias
    • Policy implications
  • Specification, cross-level, and overlap biases
  • Model selection
Small Effects and Large Confounders: Air pollution signal is an order of magnitude smaller than confounder effects (time series study example)
Courtesy of Francesca Dominici and NMMAPS
Challenges in analysis and interpretation of epidemiological studies – Uncertainty
• Uncertainty of the model: key features
  • Linearity of the exposure-response model
  • Which single or distributed lags in the risk model?
  • Multiple pollutants
  • Confounder control
  • Exposure data, metrics, and measurement error
    • How does measured “exposure” relate to true exposure?
• Additional model selection issues
  • Model selection process often not disclosed
  • Model averaging as an alternative
Exposure data considerations for health effects analyses
• Considerations in study planning:
  • Source of variation needed for study design
  • Measurements available or feasible to collect
  • Predicted exposure required?
• Interpretation of estimated health effects depends on exposure data used in the analysis
  • Example 1: Effect of prediction on cohort study health effect estimates
  • Example 2: Time series study health effects estimates: Interpretation and relevant features of personal exposure when concentration is used in the analysis
Cohort study and predicted exposure example: Simulation set-up
• Realistic setting:
  • Monitored PM2.5 data
  • Outcome model based on cardiovascular events using published estimates (Women’s Health Initiative, Miller et al 2007)
  • Los Angeles geography
• Compare exposure prediction models: Nearest monitor vs. universal kriging
• Simulation structure
  • Simulate spatially dependent exposure for subject residences and monitoring sites
    • Explore a variety of exposure models
  • Use true exposure to generate the health outcome data
  • Predict exposure from monitoring site data only
  • Estimate health effects conditioned on modeled (and true) exposure
• Purpose: To investigate how prediction of pollutants over space affects estimated relative risk in a cohort study
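Of the two prediction approaches compared, nearest-monitor assignment is the simpler one and can be sketched directly. The function name, coordinates, and concentrations below are hypothetical, and the sketch assumes planar coordinates with Euclidean distance. Universal kriging is not shown here because it also requires fitting a mean and covariance model.

```python
import numpy as np

def nearest_monitor_prediction(residence_xy, monitor_xy, monitor_pm):
    """Assign each residence the concentration at its closest monitor.

    residence_xy : (n_res, 2) planar coordinates of residences
    monitor_xy   : (n_mon, 2) planar coordinates of monitors
    monitor_pm   : (n_mon,) measured concentrations
    """
    residence_xy = np.asarray(residence_xy, dtype=float)
    monitor_xy = np.asarray(monitor_xy, dtype=float)
    # Squared Euclidean distance from every residence to every monitor.
    d2 = ((residence_xy[:, None, :] - monitor_xy[None, :, :]) ** 2).sum(axis=2)
    return np.asarray(monitor_pm, dtype=float)[d2.argmin(axis=1)]

# Two hypothetical monitors and three hypothetical residences.
monitors = [(0.0, 0.0), (10.0, 0.0)]
pm_at_monitors = [12.0, 20.0]
homes = [(1.0, 1.0), (9.0, 0.0), (4.0, 0.0)]
predicted = nearest_monitor_prediction(homes, monitors, pm_at_monitors)
```
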
Cohort study and predicted exposure example: Simulation study area
Underlying PM2.5 AQS monitoring data
• PM2.5 Air Quality System (AQS) monitors
  • 22 monitors in five counties in greater Los Angeles
• PM2.5 concentration in year 2000 (black < red < green < blue)
• Spatial analysis to estimate parameters:
  • Mean (using geographic covariates)
  • Variance
    • Range
    • Partial sill
    • Nugget
Exposure (concentration) models
• Multivariate normal distribution with spatial autocorrelation using assumed mean and covariance model parameters
• Realizations of PM2.5 at 2,000 residences and 22 monitoring sites
• Five underlying exposure models using different spatial structures:

True exposure models (TEM). Range, partial sill, nugget, and mean are the initial parameters.

TEM    Source of spatial variability   Range      Partial sill   Nugget   Mean
TEM 1  Geographic characteristics      Short      Small          Small    2nd order
TEM 2  Medium range                    Middle     Large          Small    Constant
TEM 3  Measurement error only          Short      Small          Large    Constant
TEM 4  Short range                     Shortest   Large          Small    Constant
TEM 5  Long range                      Long       Large          Small    Constant
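To make the roles of range, partial sill, and nugget concrete, here is a minimal sketch of drawing one spatially correlated realization from a multivariate normal distribution. The slides do not state the covariance family; an exponential covariance is assumed here. All names and parameter values are hypothetical, and the example uses 200 residences rather than 2,000 to keep it quick.

```python
import numpy as np

def exp_covariance(xy, partial_sill, range_, nugget):
    """Exponential covariance (an assumed family):
    partial_sill * exp(-distance / range_), plus the nugget on the diagonal.
    Locations are assumed to be distinct.
    """
    xy = np.asarray(xy, dtype=float)
    d = np.sqrt(((xy[:, None, :] - xy[None, :, :]) ** 2).sum(axis=2))
    return partial_sill * np.exp(-d / range_) + nugget * np.eye(len(xy))

def simulate_exposure(xy, mean, partial_sill, range_, nugget, rng):
    """One multivariate normal realization at the given locations."""
    cov = exp_covariance(xy, partial_sill, range_, nugget)
    chol = np.linalg.cholesky(cov)            # cov = chol @ chol.T
    z = rng.standard_normal(len(cov))
    return np.broadcast_to(mean, (len(cov),)) + chol @ z

rng = np.random.default_rng(0)
# Hypothetical locations: 200 residences plus 22 monitors on a 100 x 100 grid.
sites = rng.uniform(0.0, 100.0, size=(222, 2))
cov = exp_covariance(sites, partial_sill=4.0, range_=30.0, nugget=0.5)
pm = simulate_exposure(sites, mean=15.0, partial_sill=4.0,
                       range_=30.0, nugget=0.5, rng=rng)
```

In this parameterization a longer range makes nearby sites more alike, the partial sill sets how much of the variance is spatially structured, and the nugget adds site-specific variation that no spatial predictor can recover.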
Examples of spatial surfaces
Spatial surface* of five exposure models (lighter = higher concentration):
[Figure panels: Geographic characteristics; Medium range; Measurement error only; Short range; Long range]
*One realization of each surface
True and predicted PM2.5
Relationship between true and predicted PM2.5 at 2,000 individual sites in one simulation:
[Figure: scatterplots of true vs. nearest, true vs. kriged, and nearest vs. kriged PM2.5 (panel columns) for the geographic characteristics, medium range, measurement error only, short range, and long range exposure models (panel rows)]
Observations:
• Better association between predictions and true values when there is more spatial structure
• Spatial structure can be
  • In the mean model (TEM 1)
  • In the variance model (TEM 5)
• Models 1 and 2 were based on different estimated fits to the same data, with model 1 allowing a spatially varying mean and model 2 assuming a constant mean. Model 1 appears to capture spatial structure better.
Health effect estimates – Geographical characteristics exposure Comparison of β estimates for true and modeled PM2.5
[Figure: pairwise scatterplots of β estimates (true exposure vs. nearest neighbor, true exposure vs. kriged, nearest neighbor vs. kriged), each with the x=y line and the best fit line]
Health effect estimates – Exposures with little spatial structure
Measurement error only
Short Range – Low spatial correlation
Health effect estimates – Spatially dependent exposures only in the variance model
Medium range – medium spatial correlation
Long range – High spatial correlation
Health effect estimates – Summary
True exposure                               Fitted exposure   Bias²   Variance   Mean squared error   Coverage probability
Geographical characteristics                True                  0         56                   56                   0.96
                                            Nearest              34        105                  139                   0.85
                                            Kriging               0        153                  153                   0.92
Medium range – Medium spatial correlation   True                  0         31                   31                   0.95
                                            Nearest              33         58                   91                   0.76
                                            Kriging               1        734                  735                   0.74
Measurement error                           True                  0          9                    9                   0.95
                                            Nearest             465         15                  480                   0.00
                                            Kriging             469        346                  815                   0.54
Short range – Low spatial correlation       True                  0          9                    9                   0.95
                                            Nearest             327         23                  350                   0.03
                                            Kriging             342        778                 1120                   0.58
Long range – High spatial correlation       True                  0         69                   69                   0.95
                                            Nearest              30        125                  155                   0.87
                                            Kriging               1        426                  427                   0.89
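Each row of the summary satisfies mean squared error = bias² + variance (for example 34 + 105 = 139). The sketch below shows one way such summaries could be computed from a set of simulated health effect estimates. The function name and the toy inputs are hypothetical, and coverage is assumed to mean the share of nominal 95% Wald intervals that contain the true parameter.

```python
import numpy as np

def summarize_estimates(beta_hat, se, beta_true, z=1.96):
    """Bias^2, variance, MSE, and interval coverage across simulations."""
    beta_hat = np.asarray(beta_hat, dtype=float)
    se = np.asarray(se, dtype=float)
    bias2 = (beta_hat.mean() - beta_true) ** 2
    variance = beta_hat.var()            # ddof=0, so mse == bias2 + variance
    mse = np.mean((beta_hat - beta_true) ** 2)
    covered = np.abs(beta_hat - beta_true) <= z * se
    return {"bias2": bias2, "variance": variance,
            "mse": mse, "coverage": covered.mean()}

# Toy inputs: three simulated estimates with standard error 1, true value 0.
summary = summarize_estimates([1.0, 2.0, 3.0], [1.0, 1.0, 1.0], beta_true=0.0)
```
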
Conclusions: Impact of predicted exposure on cohort study health effect estimates
• Exposure prediction
  • Kriging prediction gave better estimates of PM2.5 than nearest monitor prediction
    • Less biased
    • Generally smaller prediction error
  • Kriging predictions were less variable than nearest monitor predictions
• Health effect estimates
  • Kriged PM2.5 as compared to nearest monitor PM2.5 had:
    • Better coverage (in most cases)
    • Less biased health effect estimates
    • More variable health effect estimates (and thus worse MSE)
  • Underlying exposure models with higher spatial dependence had better coverage
  • Results more consistent with prior expectations for a Berkson measurement error model
  • Less than 95% coverage with predicted exposure
    • Not incorporating uncertainty of prediction in this analysis
Discussion: Impact of predicted exposure on cohort study health effect estimates
• Other lessons learned:
  • More dense monitoring doesn’t change these results
    • Only 22 monitor measurements
    • Same results for up to 42 monitors
  • Not all the kriging results were believable
    • Spatial statistics is iterative, uses judgment and thus is not well suited to our nonjudgmental approach to the simulations
    • Some realizations of kriging parameter estimates were unacceptably large
  • Universal kriging performed better on average than ordinary kriging
    • Fewer poor estimates of kriging parameters, even when the true exposure had a constant mean
    • Better coverage for health effects
  • Spatial pollution structure best suited to modeling and good health effect estimates:
    • High spatial variability
    • Spatial variability characterized in the mean model
    • Spatial variability in the variance model should have long range and a smaller partial sill so there is relatively small prediction error variance.
Acute Air Pollution Health Effects: Sources of Bias in Time Series Studies
• Use of concentration when exposure is of interest
  • Not estimating toxicity
  • Not accounting for time-varying ambient attenuation
• Substitution of measured for true concentration
  • Classical measurement error
• Dropping the within-day component of exposure variation by using central site concentration measurements
  • Specification bias (small because the effects are small)
  • Cross-level bias (inference on effects in individuals when the data only come from groups)
• Inadequate adjustment for covariates
  • Uncontrolled confounding
  • Multipollutant exposures
Time series study example: Impact of aspects of personal exposure – Set-up
• We conducted simulation studies to assess the behavior of time series study estimates under differing exposure and measurement models
• Assume
  • Acute risk model (same day exposure only)
  • Total personal exposure affects true disease risk
  • Only ambient concentration is measured and used in the time series study analysis
• Simulate individual data; analyze using a time series study design
Time series study example: Impact of aspects of personal exposure – Set-up
• Assume a true individual-level disease model with personal exposure:
  $$E(Y_{it}) = \exp\left(X^P_{it}\beta\right)$$
• Personal exposure model:
  $$X^P_{it} = [\text{non-ambient source}]_{it} + \alpha_{it} C_{it}$$
• Generate NT personal exposures and binary events for N=100,000 individuals on T=1,000 days
• Use a time series study analysis with ambient concentration measurements, i.e. fit
  $$E(Y_t) = \exp\left(C_t\,\widehat{[\alpha\beta]}\right)$$
• Assess the impact of
  • Major independent non-ambient exposure contributions
  • Seasonally varying ambient attenuation α
  • Varying characterizations of daily exposure or concentration measurements
Time series study example: Impact of aspects of personal exposure – Results
• Time series studies estimate αβ – toxicity times ambient attenuation
• Non-ambient source exposure doesn’t affect estimates when it is independent of ambient concentration
• Variation in α affects time series study results when it is seasonal and correlated with ambient concentration (supported by data – see next slide):
  • Larger estimates if α is high when concentration is high
  • Smaller estimates if α is low when concentration is high
• Average concentration from multiple monitors improves estimates slightly (reduction in classical measurement error)
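The first two results can be reproduced in a scaled-down simulation. The sketch below is not the author's code. It assumes a constant α, gamma-distributed ambient concentration and non-ambient exposure, an intercept in the disease model, and a much smaller population than the N=100,000 used in the slides; all parameter values are hypothetical. Individual binary events are generated from total personal exposure, aggregated to daily counts, and fitted with a Poisson log-linear model on ambient concentration only.

```python
import numpy as np

def poisson_slope(x, y, n_iter=25):
    """Poisson log-linear regression of counts y on x via Newton's method.
    Returns the slope coefficient."""
    design = np.column_stack([np.ones_like(x), x])
    b = np.array([np.log(y.mean()), 0.0])
    for _ in range(n_iter):
        mu = np.exp(design @ b)
        hessian = design.T @ (mu[:, None] * design)
        b = b + np.linalg.solve(hessian, design.T @ (y - mu))
    return b[1]

rng = np.random.default_rng(1)
n_days, n_people = 1000, 5000            # scaled down from the slides
beta, alpha = 0.03, 0.5                  # hypothetical toxicity and attenuation
beta0 = np.log(0.005)                    # hypothetical baseline event probability

conc = rng.gamma(6.25, 3.2, size=n_days)            # ambient concentration C_t
counts = np.empty(n_days)
for t in range(n_days):
    x_nonambient = rng.gamma(4.0, 2.5, size=n_people)   # independent of C_t
    x_personal = x_nonambient + alpha * conc[t]
    p = np.minimum(np.exp(beta0 + beta * x_personal), 1.0)
    counts[t] = rng.binomial(1, p).sum()                # daily event count

estimate = poisson_slope(conc, counts)   # should be near alpha * beta, not beta
```

Under these assumptions the fitted coefficient lands near αβ = 0.015 rather than β = 0.03, and the independent non-ambient exposure only shifts the baseline rate. A seasonal α correlated with concentration is not simulated here.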
[Figure: Regression Coefficients for CVD-Related Hospital Admissions vs. Ambient PM10. CVD coefficient (y-axis, 0.0000 to 0.0025) plotted against central air conditioning (%) (x-axis, 0 to 80) for summer peaking cities and winter peaking cities]
Janssen N, Schwartz J, Zanobetti A, Suh H (2002). Environ. Health Perspect.
Slide courtesy of Doug Dockery
↑ => smaller summer α
Time series study example: Impact of aspects of personal exposure – Summary
Measurements’ effect on health effect parameter interpretation:
• Models with concentration as the predictor don’t estimate toxicity alone:
  • When the disease model has a simple form, e.g. $E(Y) = \exp(X^A\beta) = \exp(C\alpha\beta)$
    • β is toxicity
    • Assuming $X^A = C\alpha$, the disease model with ambient concentration has parameter αβ.
  • Differences between estimates of αβ can be due to variations in α (e.g. due to season, region or individual)
• Huge policy implications that variation in time series study health effect estimates is not (only) toxicity
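A short worked version of this argument, as a sketch: the subscripts 1 and 2 label two hypothetical settings (for example two cities or seasons) that share the same toxicity β, and the values 0.4 and 0.7 are illustrative only.

```latex
% Simple disease model with ambient exposure X^A = C\alpha:
\log E(Y) = X^A \beta = C\,\alpha\beta
\quad\Longrightarrow\quad
\frac{\partial \log E(Y)}{\partial C} = \alpha\beta .

% Two settings with the same toxicity \beta but different attenuation:
\frac{\alpha_1 \beta}{\alpha_2 \beta} = \frac{\alpha_1}{\alpha_2},
\qquad \text{e.g. } \alpha_1 = 0.4,\ \alpha_2 = 0.7
\ \Rightarrow\ \frac{\alpha_1\beta}{\alpha_2\beta} \approx 0.57 .
```

So the concentration coefficients in the two settings would differ by a factor of about 0.57 even though toxicity is identical.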
Time series study example: Impact of aspects of personal exposure – Discussion
• Ambient attenuation (α) is not just measurement error
  • In models with concentration as the predictor, it changes the interpretation of the estimated health effect parameter (not just toxicity)
  • α has structure that varies by season, region, person, species (due to e.g. size, reactivity)
  • Averaging exposure over time or area averages over α
• α is not measured and its properties (e.g. seasonality, population patterns) are not well understood – an important area for exposure assessment research
Discussion – Health modeling in the context of exposure data
• These two examples illustrate ways study design and exposure data influence
  • The health effect parameters estimated
  • The characteristics of the health effect estimates
• Design of choice depends on:
  • Health outcome of interest
  • Exposure characteristics of interest (e.g. is exposure usual or unusual?)
    • What sources of variation in exposure do available exposure data capture?
    • If an exposure prediction model is needed, are there sufficient data to produce a good model that captures the key sources of variation?
  • Feasibility
Discussion – Other research directions
• Link health effect parameters from acute and chronic exposures
• Ascertain time-varying risk in cohort studies
• Incorporation of complex risk models into policy estimates
• Effect of exposure structure on estimates in single vs. distributed lag models
• Multipollutant exposures
• More complete estimates of uncertainty. Uncertainty due to:
  • Model selection
  • Exposure assessment and predicted exposure
  • Form of the distributed risk model
  • Confounder selection
  • Subgroup selection