Use of multivariate survival models with common baseline risk under event dependence and unknown number of previous episodes David Moriña, Georgina Casanovas and Albert Navarro December 07 2014, Pisa




Introduction

Recurrent events


• Recurrent event data refers to situations where the subject can experience repeated episodes of the same type of event

• There are many examples such as injuries, nosocomial infections, asthma attacks . . .

• If these phenomena are studied via a cohort, the application of survival methods would seem appropriate

• During the 1980s, at theoretical level, and during the 1990s and early 21st century in practical terms through development of software, various methods have been proposed to tackle events of this type, both non-parametric and parametric


• Recurrent events present two problems which cannot be handled using the standard methods

  • Individual heterogeneity (the unmeasured variability between subjects beyond that of the measured covariates)

  • Within-subject correlation, which can be especially problematic if there is also event dependence (the risk of experiencing the event changes as a function of the number of previous episodes presented by the individual)

• Ignoring this phenomenon and using methods that do not take it into account results in inefficient estimates

• Event dependence is tackled through the application of models employing baseline hazards specific for the episode to which the individual is at risk

• The most widely used is that proposed by Prentice, Williams and Peterson (PWP)

• This is an extension of the Cox proportional hazards model, which estimates baseline hazards appropriate to each episode through stratification by the number of previous episodes


• Use of the PWP model requires knowing at every moment the number of previous episodes suffered by each individual

• To have this detailed information would imply having the complete history of each individual with respect to the event of interest

• With the exception of studies using specific sampling in healthy populations, or studies based on particular interventions and/or events which significantly determine health status and are relatively infrequent (for example cardiovascular events, cancers, etc.), it is not usually possible to have such information

• In public health contexts we are interested in estimating the marginal effect of one or several covariates (exposures) on an event, the previous history of which is often unknown


• Think of examples such as studying episodes of sickness absence in workers of all ages (some of whom may have been working for many years), or studying the occurrence of asthma attacks in a sample including people who already had this problem previously

• When the number of previous episodes suffered by the individual is unknown, we have no method to directly handle occurrence dependence, and the usual practice in such cases is to fit models specified with a common baseline hazard, or frailty models

• The aim of the present study is to assess the performance of two models, as possible alternatives to PWP when we want to estimate the effect of one or several exposures on the risk of presenting a recurrent event affected by event dependence, in situations where the number of previous episodes of each individual is unknown


Models


• All the models we are considering are non-parametric and extensions of the Cox model

  • Prentice, Williams and Peterson (PWP)

  • Andersen-Gill (AG)

  • Shared frailty model (SFM)


Prentice, Williams and Peterson (PWP)

• For recurrent phenomena in situations of event dependence, the survival model of reference is PWP

• It incorporates event dependence through stratifying by the number of previous episodes presented by each individual

• There is a specific baseline hazard for each particular episode to which the individual is at risk

• When the $i$-th individual is at risk of the $k$-th episode, the hazard function is defined as

$$h_{ik}(t) = h_{0k}(t)\, e^{X_i \hat{\beta}},$$

where $h_{0k}(t) = e^{\hat{\beta}_{0k}}$, and $X_i$ and $\hat{\beta}$ represent the vector of covariates and the regression coefficients, respectively

• This model is only applicable if the episode number to which each individual is at risk is known at all times
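As a minimal sketch of the episode-specific hazard defined above, the following Python function evaluates $h_{ik}(t)$ for a given episode number. The function name, the numeric baseline values and the coefficient are illustrative assumptions, not values from the study; reusing the last baseline for later episodes is also an assumption of this sketch.

```python
import math

def pwp_hazard(beta0_by_episode, beta, x, k):
    """Hazard of an individual with covariates x who is at risk of episode k (1-based)."""
    if k < 1:
        raise ValueError("episode number k must be >= 1")
    # Assumption: episodes beyond the last listed stratum reuse the last baseline.
    stratum = min(k, len(beta0_by_episode)) - 1
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return math.exp(beta0_by_episode[stratum]) * math.exp(linear_predictor)

# Hypothetical log baseline hazards for episodes 1, 2 and 3 or later
beta0_by_episode = [math.log(0.002), math.log(0.004), math.log(0.006)]
beta = [0.5]       # hypothetical regression coefficient
exposed = [1]      # single binary covariate

for k in (1, 2, 3, 4):
    print(k, pwp_hazard(beta0_by_episode, beta, exposed, k))
```

Within every stratum the ratio between exposed and unexposed hazards is the same, $e^{\beta}$; only the baseline changes with $k$, which is why the episode number has to be known to evaluate the function.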


Andersen-Gill (AG)

• It is the natural extension of the Cox model for proportional hazards

• It is based on counting processes and assumes that the baseline risk is common to all episodes and independent of the number of previous episodes presented

• When the $i$-th individual is at risk of the $k$-th episode, the hazard function is defined as

$$h_{i}(t) = h_{0}(t)\, e^{X_i \hat{\beta}},$$

where $h_{0}(t) = e^{\hat{\beta}_{0}}$ and is therefore the same for all episodes

• Notice that the PWP model is a stratified AG model
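To make the relation between AG and PWP concrete, here is a hedged sketch of a counting-process data layout, in which each individual contributes one (start, stop] interval per risk period. The column names and the helper function are hypothetical and are not taken from the slides. An AG-type fit would ignore the `episode` column, whereas a PWP-type fit would stratify on it.

```python
def counting_process_rows(event_times, end_of_followup):
    """Split one individual's follow-up into (start, stop] risk intervals.

    Assumes event_times is sorted and every time is <= end_of_followup.
    """
    rows = []
    start = 0.0
    for k, t in enumerate(event_times, start=1):
        # Interval ending with the k-th observed episode
        rows.append({"start": start, "stop": t, "status": 1, "episode": k})
        start = t
    if start < end_of_followup:
        # Final censored interval, at risk of the next episode
        rows.append({"start": start, "stop": end_of_followup, "status": 0,
                     "episode": len(event_times) + 1})
    return rows

for row in counting_process_rows([2.0, 5.0], end_of_followup=8.0):
    print(row)
```

When the number of previous episodes is unknown, the `episode` column cannot be filled in correctly, which is the situation the study addresses.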


Shared frailty model (SFM)

• May be used in contexts of recurrent events, where the different episodes of a given individual share a frailty independent of that of other individuals

• In addition to the observed regressors, this model also accounts for the presence of a latent multiplicative effect on the hazard function:

$$h_{i}(t) = U_i \cdot h_{0}(t)\, e^{X_i \hat{\beta}},$$

where the baseline hazard is specified independently of the episode $k$ to which the individual is exposed, $h_{0}(t) = e^{\hat{\beta}_{0}}$

• $U_i$ is an individual random effect which is not directly estimated from the data, but instead is assumed to have unit mean and finite variance, which is estimated

• Since $U_i$ is a multiplicative effect, we can think of the frailty as representing the cumulative effect of one or more omitted covariates

• Specifically, the model used in this study is the shared gamma frailty model, with $E[U_i] = 1$ and $V[U_i] = \theta$
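For reference, one standard parametrisation that satisfies these two moment conditions is a gamma distribution with shape $1/\theta$ and scale $\theta$. The slides do not state the parametrisation explicitly, so the following is a sketch under that assumption:

```latex
U_i \sim \mathrm{Gamma}\!\left(\text{shape} = \tfrac{1}{\theta},\ \text{scale} = \theta\right),
\qquad
f(u) = \frac{u^{1/\theta - 1}\, e^{-u/\theta}}{\Gamma(1/\theta)\, \theta^{1/\theta}}, \quad u > 0,

E[U_i] = \frac{1}{\theta} \cdot \theta = 1,
\qquad
V[U_i] = \frac{1}{\theta} \cdot \theta^{2} = \theta .
```

As $\theta \to 0$ the frailty distribution concentrates at 1 and the hazard reduces to the AG form $h_0(t)\, e^{X_i \hat{\beta}}$.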


Simulations

Examples


• We illustrate the application of these models reproducing two phenomena described in the literature

  • The frequency of long-term sickness absence in a cohort of Dutch workers, with a baseline hazard of 0.0021 per worker-week

  • The frequency of falls among residents of a geriatric centre, with a baseline hazard of the first fall of 0.0361 per resident-week


Generation of populations

• Eighteen different populations of 250,000 individuals, each with 20 years of follow-up, were generated using the survsim package in R

• These populations are dynamic in the sense of being open on the left, i.e. follow-up of individuals may begin before the start of the study period

• For each individual $i$ the hazard of the next episode $k$ has been simulated through an exponential distribution:

$$h_{ik}(t) = \exp(\beta_{0k} + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3) \cdot \nu_i$$

where $e^{\beta_{0k}}$ is $h_{0k}(t)$, i.e. the baseline hazard for individuals exposed to episode $k$


• The maximum number of episodes which a subject may present has not been fixed, although the baseline hazard has been considered constant when $k \geq 3$

• $X_1$, $X_2$ and $X_3$ are the three covariates which represent the exposure, with $X_j \sim \mathrm{Bernoulli}(0.5)$ for $j = 1, 2, 3$

• $\beta_1$, $\beta_2$ and $\beta_3$ are the parameters of the three covariates which represent the effect, and have been set, independently of the episode $k$ to which the subject is exposed, to $\beta_1 = 0.25$, $\beta_2 = 0.50$ and $\beta_3 = 0.75$ in order to represent effects of different magnitude

• $\nu_i$ is a random effect

• Event dependence has been introduced through using various values of $h_{0k}(t)$ by specifying different $\beta_{0k}$

• Individual heterogeneity was introduced through the random effect $\nu_i$. This is constant over the various episodes of a given individual but differs between individuals


• We established three possibilities:

  • Absence of any random effect

  • $\nu_i \sim$ Gamma with mean 1 and variance 0.1

  • $\nu_i \sim \mathrm{Uniform}(0.5, 1.5)$
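The data-generating mechanism described above can be sketched in a few lines of plain Python. This is not the survsim implementation used by the authors: it is a simplified illustration that assumes exponential gap times between consecutive episodes, and the baseline hazards for the second and later episodes are hypothetical (only 0.0021 per worker-week and the $\beta$ values come from the slides). Function and scenario names are also assumptions.

```python
import math
import random

def draw_frailty(scenario, rng):
    """Draw the individual random effect nu_i under one of the three scenarios."""
    if scenario == "none":
        return 1.0
    if scenario == "gamma":
        # shape * scale = 1 (mean), shape * scale**2 = 0.1 (variance)
        return rng.gammavariate(10.0, 0.1)
    if scenario == "uniform":
        return rng.uniform(0.5, 1.5)
    raise ValueError("unknown scenario: " + scenario)

def simulate_episodes(beta0, beta, x, nu, follow_up, rng):
    """Return the episode times of one individual within [0, follow_up]."""
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    times, t, k = [], 0.0, 1
    while True:
        # The baseline stays constant once k exceeds the last listed episode
        stratum = min(k, len(beta0)) - 1
        hazard = math.exp(beta0[stratum] + linear_predictor) * nu
        t += rng.expovariate(hazard)  # exponential gap time (assumption)
        if t > follow_up:
            return times
        times.append(t)
        k += 1

rng = random.Random(2014)
# Only the first baseline hazard is from the slides; the others are hypothetical
beta0 = [math.log(0.0021), math.log(0.0042), math.log(0.0063)]
beta = [0.25, 0.50, 0.75]
x = [rng.randint(0, 1) for _ in beta]   # three Bernoulli(0.5) exposures
nu = draw_frailty("gamma", rng)
print(simulate_episodes(beta0, beta, x, nu, follow_up=20 * 52, rng=rng))
```

Note that the uniform scenario also has mean 1, with variance $1/12 \approx 0.083$, so the gamma and uniform scenarios carry a similar amount of heterogeneity with different shapes.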


Cohort design

• In practice, follow-up is limited to 1, 3 and 5 years

• At the start of follow-up there are individuals who have been previously exposed

• For each of the generated sub-bases, 500 random samples were drawn with sample sizes n1 = 500, n2 = 1000 and n3 = 3000

• For each selected individual the episodes they present within the effective follow-up period were recorded

• Finally, the proposed models were fitted to each of these samples by means of the coxph function in R
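The effect of restricting follow-up to a window that opens after an individual's history has already begun can be illustrated with a small sketch. The slides do not describe how the window is positioned, so the function below, its name and its arguments are assumptions made purely for illustration.

```python
def observe_window(event_times, window_start, window_length):
    """Keep only the episodes that fall inside the effective follow-up window.

    Returns the observed episode times, measured from the start of the window,
    and the number of earlier episodes, which the analyst would not know.
    """
    window_end = window_start + window_length
    observed = [t - window_start for t in event_times
                if window_start < t <= window_end]
    unknown_prior = sum(1 for t in event_times if t <= window_start)
    return observed, unknown_prior

# Hypothetical history in weeks; follow-up opens at week 100 and lasts 3 years
observed, unknown_prior = observe_window(
    [10.0, 50.0, 120.0, 200.0, 300.0], window_start=100.0, window_length=3 * 52)
print(observed, unknown_prior)
```

In this hypothetical history the first episode seen by the analyst is really the individual's third, which is exactly the information a PWP-type stratification would need and does not have.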


Results

Performance

Model assessment criteria

• Percentage bias: $\delta = \left( \dfrac{\bar{\hat{\beta}} - \beta}{\beta} \right) \cdot 100$

• Coverage: Proportion of times the $100 \cdot (1 - \alpha)\%$ confidence interval $\hat{\beta}_j \pm z_{1-\alpha/2}\, SE(\hat{\beta}_j)$ includes $\beta$, for $j = 1, \ldots, 500$.

• Proportional hazards: Proportion of times that the assumption of proportionality of hazards cannot be rejected, for $j = 1, \ldots, 500$, according to the contrast of Grambsch & Therneau (Biometrika, 1994)
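The first two criteria are simple to compute once the estimates and standard errors from the fitted samples are available. The following is a minimal sketch using only the Python standard library; the function names and the toy numbers are illustrative and do not come from the study.

```python
from statistics import NormalDist, mean

def percentage_bias(estimates, true_beta):
    """Relative difference between the mean estimate and the true value, in %."""
    return (mean(estimates) - true_beta) / true_beta * 100.0

def coverage(estimates, std_errors, true_beta, alpha=0.05):
    """Proportion of Wald confidence intervals that contain the true value."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    hits = sum(1 for b, se in zip(estimates, std_errors)
               if b - z * se <= true_beta <= b + z * se)
    return hits / len(estimates)

# Toy example with three hypothetical fitted samples and true beta = 0.25
estimates = [0.30, 0.35, 0.25]
std_errors = [0.05, 0.04, 0.05]
print(percentage_bias(estimates, 0.25))
print(coverage(estimates, std_errors, 0.25))
```

In the study each criterion would be evaluated over the 500 samples drawn for a given scenario.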


Results

• The results appearing in this section only refer to cohorts with 5 years of follow-up

• The results referring to 1 and 3 years of follow-up are very similar


Bias

• The only differences between AG and SFM are observed in the populations with high levels of occurrence dependence, the percentage of bias being slightly lower for AG

• For these models the average bias is around 10-15% for populations with lower occurrence dependence, and rises to 40-70% for those with higher dependence

• In general there do not appear to be any changes in the effect associated with $\beta$ related to either sample size or whether the population presented heterogeneity


Coverage

• There are no differences in coverage between AG and SFM

• Both models only achieve performances close to 95% for populations with small or moderate occurrence dependence and for $\beta_1 = 0.25$

• For the other scenarios coverage falls notably, worsening with increasing occurrence dependence, effect to estimate and sample size. For example, when estimating $\beta_3$ in the highest occurrence dependence cohorts, the percentage of samples where the 95% CI includes the true value is between 0 and 7% for sample sizes of n = 1000 or n = 3000

• In populations with heterogeneity the average size of the 95% CI increases, which often translates into a rise in level of coverage


Proportional hazards

• SFM seems to present better performance in populations with low or moderate occurrence dependence, although only slightly

• In general model performance worsens with increasing occurrence dependence, effect to estimate and sample size, only reaching levels near 90% for lowest occurrence dependence cohorts with n = 500 or n = 1000


Conclusions


• The PWP model presents much better results than the models with common baseline risk

• The percentage of bias does not reach 10%, and is generally negative, i.e. slightly underestimating the effect

• For populations free of heterogeneity the coverage levels are around 85-95%, but fall in populations with heterogeneity as the effect to estimate and sample size increase

• In this model generally over 85% of the simulated samples comply with the assumption of proportional hazards; however, in certain particular cases when $\beta_3 = 0.75$ and the population is that of greatest dependence, this percentage falls to around 70%


• The performance of the models with common baseline risk worsens as occurrence dependence increases, producing worse coverage and increasing overestimation of the effect

• Members of the exposed group have more events and therefore present more recurrent episodes, and also they suffer these episodes earlier than members of the non-exposed group

• The exposed subjects come to be at risk of a higher baseline hazard sooner and in greater numbers

• By not using specific baseline risks, the increase in baseline hazard is mostly attributed to the exposed group


• As the effect to be estimated increases, performance of models with common baseline hazard worsens

• This leads to part of the effect of the baseline hazard being attributed to exposure

• For these models, coverage is affected by sample size, worsening as sample size increases

• Almost no differences were observed between the AG and SFM models, not even for populations generated with heterogeneity, and regardless of whether the SFM model specified it correctly (gamma) or not (uniform)

• SFM assumes a frailty specific to each individual which can represent a cumulative effect of one or several unmeasured covariates


• If the interest of our analysis was not strictly the marginal estimates, but rather we aimed to construct a prognostic model where the estimation of individual hazard was a priority, the SFM models might perform better than AG models

• If there was any association between the covariates of interest and the unmeasured covariates, perhaps SFM could partly capture it and present better performance than AG

• Regarding level of compliance with the assumption of proportionality of hazards, this declines as occurrence dependence increases

• Although in populations with greater dependence it seems that more of the AG models satisfy the assumptions than SFM, their performance in this area is still not sufficient


• In situations of event dependence the performance of PWP is clearly better than that of models with common baseline risk

• Even so, values of coverage and PH compliance do not achieve the expected levels when event dependence is high, and the effect to be estimated is large

• In the context of health sciences it is common for the phenomenon of study to exhibit recurrence, and also that the risk of suffering an episode changes depending on the number of episodes suffered previously

• Therefore, incorporating information about previous episodes into the analysis would appear to be fundamental

• However, in certain contexts, this is not possible simply because the number of previous episodes is unknown


• The AG and SFM models analysed in this study have achieved low, very similar, performances, making it impossible to recommend one instead of the other

• The only context in which it would seem reasonable to use one of them, in situations involving occurrence dependence, would be when the level of such dependence was low and the effect to be estimated was small

• Although this would produce a somewhat biased estimate, model performance in terms of coverage and PH compliance might be considered acceptable

• In other situations the use of these models is clearly inappropriate: in general they present levels of coverage and PH compliance which are low or extremely low, and blatantly overestimate the effect of the factor


• Currently there are no models available which allow estimating the possible effect of occurrence dependence when the number of previous episodes is unknown, and incorporating this in fitting the model

• Consequently, it is important to find valid alternatives to permit tackling analyses of this type


Centre for Research in Environmental Epidemiology

Parc de Recerca Biomèdica de Barcelona
Doctor Aiguader, 88
08003 Barcelona (Spain)
Tel. (+34) 93 214 70 00
Fax (+34) 93 214 73 02

Grup de Recerca d’Amèrica i Àfrica Llatines
Unitat de Bioestadística, Facultat de Medicina
Universitat Autònoma de Barcelona
www.uab.cat