This article was downloaded by: [Temple University Libraries] On: 12 November 2014, At: 18:49 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK Journal of Statistical Computation and Simulation Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/gscs20 Survival functions for the frailty models based on the discrete compound Poisson process Nihal Ata a & Gamze Özel a Department of Statistics, Faculty of Science, Hacettepe University, Beytepe 06800, Ankara, Turkey Published online: 23 Apr 2012. To cite this article: Nihal Ata & Gamze Özel (2013) Survival functions for the frailty models based on the discrete compound Poisson process, Journal of Statistical Computation and Simulation, 83:11, 2105-2116, DOI: 10.1080/00949655.2012.679943 To link to this article: http://dx.doi.org/10.1080/00949655.2012.679943 PLEASE SCROLL DOWN FOR ARTICLE Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content. 
This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms- and-conditions

Survival functions for the frailty models based on the discrete compound Poisson process


Journal of Statistical Computation and Simulation, 2013
Vol. 83, No. 11, 2105–2116, http://dx.doi.org/10.1080/00949655.2012.679943

Survival functions for the frailty models based on the discrete compound Poisson process

Nihal Ata* and Gamze Özel

Department of Statistics, Faculty of Science, Hacettepe University, Beytepe 06800, Ankara, Turkey

(Received 26 August 2011; final version received 23 March 2012)

Frailty models are often used to model heterogeneity in survival analysis. The distribution of the frailty is generally assumed to be continuous. In some circumstances, it is appropriate to consider discrete frailty distributions. Having zero frailty can be interpreted as being immune, and population heterogeneity may be analysed using discrete frailty models. In this paper, survival functions are derived for the frailty models based on the discrete compound Poisson process. Maximum likelihood estimation procedures for the parameters are studied. We examine the fit of the models to earthquake and traffic accident data sets from Turkey.

Keywords: infinitely divisible distributions; stable distributions; censored data models; estimation; frailty models; compound Poisson distribution

2000 Mathematics Subject Classification codes: 60E07; 62N01; 62N02; 65C50; 68U20

1. Introduction

The proportional hazards model introduced by Cox [1] has been widely used in survival analysis. When individuals have repeated measurements or the survival data come from different groups, the heterogeneity among individuals should be considered. Vaupel et al. [2] introduced univariate frailty models into survival analysis to account for unobserved heterogeneity or missing covariates in the study population. Heterogeneity may be explained by covariates, but if important covariates have not been observed, this leads to unobserved heterogeneity. Omitting the heterogeneity causes bias in the estimated regression coefficients [3,4]. Hougaard [5] and Aalen [6] found that it would lead to the overestimation of the relative hazard rate. Pickles and Crouchley [7] also showed that it would make the estimates of the regression parameters tend to zero.

A frailty is defined as an unobservable, random, multiplicative factor acting on the hazard function, and the aim of introducing frailty is to account for unobserved heterogeneity among individuals. Published studies on the frailty models generally assume a non-negative and continuous frailty random variable. Frequently used frailty distributions include gamma [2,8], inverse Gaussian [8], log-normal [9] or positive stable distribution [10]. From a computational point of view, these distributions are convenient since it is easy to derive the closed forms of survival, density and hazard functions using the Laplace transform. On the other hand, continuous frailty distributions

*Corresponding author. Email: [email protected]

© 2013 Taylor & Francis


do not allow having zero risks. The continuous compound Poisson process (CCPP) for the frailty distribution has been shown by Aalen and Hjort [11] to be useful when there is a subgroup of non-susceptible individuals. It has nice mathematical properties, and the point mass at zero corresponds to the inclusion of a non-susceptible subgroup of individuals. This is often biologically reasonable, since only a subgroup of individuals may have the genetic makeup which makes them vulnerable to a certain disease [11]. An example of fitting CCPP to the incidence of testicular cancer is given in Aalen and Tretli [12].

There are some situations in which a discrete distribution may be appropriate. For example, heterogeneity in lifetime arises because of the presence of a random number of flaws in a unit or because of exposure to damage on a random number of occasions. There are two kinds of discrete frailty models in the literature. One kind of discrete frailty model is constructed by separating the frailty into ones with fixed and random numbers of mass points. Most such references turn out to consider finite mixtures, in which Z is a group or stratum indicator taking a few values, rather than having a probability distribution over a wider range. Xue and Brookmeyer [13] stated that their results also hold for the discrete frailty distributions, as well as continuous ones, but did not pursue that line any further. Li and Zhong [14] referred to the proportional hazards model with discrete frailty meaning finite mixtures as in Xue and Brookmeyer [13]. Recently, Wienke [15] has considered a discrete compound Poisson process (DCPP) for the frailty models. It can be useful for a failure model containing a proportion of units that never fail (long-term survivors). The second kind of discrete frailty model is based on a random number of components and with masses at integers. Although the possibility of this model has been mentioned, it has not been investigated in detail. Recently, Caroni et al. [16] have developed discrete frailty models using geometric, Poisson and binomial distributions. In this study, we consider only discrete frailty models with a random number of components and with masses at integers using DCPP.

The main point of our construction is to view the frailty as having arisen by a cumulative damage process. This means that the amount of damage, or frailty, of an individual is determined by how long the DCPP has been running, and by the stochastic outcome of the process. The rest of the paper is organized as follows. In Section 2, survival functions for the frailty model with the discrete distributions are given. In Section 3, the survival functions are derived based on DCPP. In Section 4, we present the results of application of the models with two real data examples. The conclusion is given in Section 5.

2. Survival functions for the frailty model with discrete distributions

The concept of frailty provides a suitable way to introduce random effects in the model to account for association and unobserved heterogeneity. The frailty model assumes a proportional hazards structure conditional on the random effect, Z. Let Z be a non-negative frailty variable, that is, a random variable indicating the individual level of risk. Then, the frailty model is represented by the following hazard function, given the frailty

λ(t|Z) = Zλ(t), (1)

where λ(t) is equal to λ0(t) exp(β′X) in the proportional hazards model. Note that the baseline hazard function λ0(t) can be chosen non-parametrically or parametrically (Weibull, exponential, gamma, Gompertz, log-normal and log-logistic distributions, etc.).

The frailty model can also be represented by its conditional survival function

$$S(t \mid Z) = \exp\left(-Z \int_0^t \lambda(u)\,du\right) = \exp[-Z\Lambda(t)], \tag{2}$$


where $\Lambda(t) = \int_0^t \lambda(u)\,du$ is the cumulative hazard function. S(t|Z) represents the fraction of individuals surviving until time t, given Z.

The unconditional survival function S(t) can be obtained by integrating Equation (2) over the range of the distribution of Z. The unconditional survival function is calculated by

$$S(t) = \int S(t \mid z)\, g(z)\,dz = E[S(t \mid Z)] = L[\Lambda(t)], \tag{3}$$

where g(z) is the density of Z and L(s) is the Laplace transform of Z. The distribution of Z is known as the frailty distribution and the most common frailty distributions are the gamma [8], inverse Gaussian [8], log-normal [9], and positive stable [10].
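Equation (3) can be illustrated with a small numerical sketch. The example below is not taken from the paper: it assumes a gamma frailty with mean 1 and variance `theta`, whose Laplace transform is (1 + θs)^(−1/θ), and a Weibull-type cumulative hazard Λ(t) = λt^ν. The function names are hypothetical, and the integration routine is only a check of the closed form.

```python
import math

def weibull_cum_hazard(t, lam, nu):
    """Assumed baseline cumulative hazard Lambda(t) = lam * t**nu."""
    return lam * t ** nu

def gamma_frailty_survival(t, theta, lam, nu):
    """Equation (3) for a gamma frailty with mean 1 and variance theta:
    S(t) = L[Lambda(t)], with L(s) = (1 + theta*s)**(-1/theta)."""
    return (1.0 + theta * weibull_cum_hazard(t, lam, nu)) ** (-1.0 / theta)

def survival_by_integration(t, theta, lam, nu, upper=60.0, steps=200_000):
    """Check: integrate exp(-z*Lambda(t)) * g(z) over z with the midpoint rule."""
    shape, scale = 1.0 / theta, theta  # gamma density with mean 1, variance theta
    log_norm = math.lgamma(shape) + shape * math.log(scale)
    big_lambda = weibull_cum_hazard(t, lam, nu)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * h
        log_g = (shape - 1.0) * math.log(z) - z / scale - log_norm
        total += math.exp(log_g - z * big_lambda) * h
    return total
```

The midpoint rule is adequate here only because the chosen density is smooth (shape parameter above 1); it is a sanity check, not a general-purpose integrator.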

The discrete frailty distributions have received little attention in the literature, certainly much less than continuous ones. The unconditional survival function for the discrete frailty distribution can be written as

$$S(t) = \sum_{z=0}^{\infty} S(t \mid z)\, p_Z(z) = E[S(t \mid Z)] = M_Z[S_0(t)], \tag{4}$$

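where Z is a discrete random variable with the probability function $P(Z = z) = p_Z(z)$, and $S_0(t)$ is the baseline survival function. Here $M_Z(s) = E(s^Z)$ is the generating function of Z evaluated at $s = S_0(t)$; equivalently, it is the moment generating function (m.g.f.) of Z evaluated at $\log S_0(t) = -\Lambda(t)$.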

Recently, standard discrete distributions such as geometric, Poisson or negative binomial distributions have been considered by Caroni et al. [16] for the frailty models. Let Z be a Poisson-distributed random variable with parameter μ > 0; the unconditional survival function for the Poisson distribution is then given by

S(t) = exp[μ(S0(t) − 1)]. (5)

If Z is a geometric-distributed random variable with the probability function $p_Z(z) = p(1 - p)^z$, $z = 0, 1, 2, \ldots$, then the unconditional survival function is

$$S(t) = \frac{p}{1 - qS_0(t)}, \tag{6}$$

where p > 0 and p + q = 1.

Let Z be a negative binomial-distributed random variable with the probability function $p_Z(z) = \binom{z-1}{k-1} p^k (1 - p)^{z-k}$, $z = k, k + 1, \ldots$, where parameters k > 0 and p > 0. The unconditional survival function is given by

$$S(t) = \left[\frac{pS_0(t)}{1 - qS_0(t)}\right]^k. \tag{7}$$

It is often convenient to fix the mean of Z at 1 by a suitable constraint on the parameters of the continuous frailty distribution to achieve identifiability with the form zλ0(t) [17]. This entails no loss of generality because the average level of frailty can always be absorbed into the baseline hazard. The same problem does not arise with discrete frailty because the frailty distribution lacks a scale parameter.

In this study, we derive the survival function of the frailty model using the binomial distribution. To obtain the unconditional survival function, let us define Z as a binomial-distributed random variable with the parameters m and p. Using Equation (4) and the m.g.f. of the binomial distribution, the unconditional survival function is obtained as

$$S(t) = [pS_0(t) + q]^m, \tag{8}$$

where p > 0 and p + q = 1.
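The closed forms in Equations (5) to (8) can be checked against a direct evaluation of the sum in Equation (4). The following is a minimal sketch with hypothetical function names; it takes the baseline survival value S0(t) as an input `s0` so that no particular baseline distribution has to be assumed.

```python
import math

def survival_poisson(s0, mu):
    """Equation (5): Poisson frailty with parameter mu."""
    return math.exp(mu * (s0 - 1.0))

def survival_geometric(s0, p):
    """Equation (6): geometric frailty on z = 0, 1, 2, ..."""
    return p / (1.0 - (1.0 - p) * s0)

def survival_negbin(s0, k, p):
    """Equation (7): negative binomial frailty on z = k, k + 1, ..."""
    return (p * s0 / (1.0 - (1.0 - p) * s0)) ** k

def survival_binomial(s0, m, p):
    """Equation (8): binomial frailty with parameters m and p."""
    return (p * s0 + (1.0 - p)) ** m

def survival_by_sum(s0, pmf, z_values):
    """Equation (4) summed term by term: sum over z of S0(t)**z * P(Z = z)."""
    return sum(s0 ** z * pmf(z) for z in z_values)
```

For example, `survival_by_sum(0.7, lambda z: 0.4 * 0.6 ** z, range(400))` should agree with `survival_geometric(0.7, 0.4)` up to truncation and rounding error.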


3. Survival functions for the frailty model with DCPP

In this section, we consider the frailty distributions determined by a DCPP, which is taken to be a process with non-negative, independent and time-homogeneous increments. Let $\{N_t, t \ge 0\}$ be a homogeneous Poisson process with parameter α and let $Z_i$, $i = 1, 2, \ldots$, be i.i.d. discrete random variables, independent of the process $\{N_t, t \ge 0\}$. Then, $\{Z_t, t \ge 0\}$ is a DCPP and given as

$$Z_t = Z_1 + Z_2 + \cdots + Z_{N_t}. \tag{9}$$

One way to think about this process is to imagine an individual that suffers from several hits causing damage. The effect of these hits cumulates over time and increases individual frailty. It should be noted that this process takes place before the follow-up starts, maybe in early life. Note that unlike the standard continuous frailty distribution, $\{Z_t, t \ge 0\}$ allows a positive probability for the risk to be zero. Here, $\{Z_t, t \ge 0\}$ shows the cumulative damage of failures or flaws in a system up to time t.
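The cumulative damage picture can be made concrete by simulation. The sketch below is an illustration under stated assumptions, not the authors' procedure: it draws the number of hits from a Poisson distribution with mean αt and gives each hit a Poisson(μ) amount of damage, using only the standard library. The helper names and the simple multiplication-based Poisson sampler (suitable for small means only) are choices made for this example.

```python
import math
import random

def poisson_draw(rng, mean):
    """Draw from Poisson(mean) by multiplying uniforms (small means only)."""
    limit, count, product = math.exp(-mean), 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def simulate_neyman_a(rng, alpha, t, mu):
    """One draw of Z_t = Z_1 + ... + Z_{N_t} with N_t ~ Poisson(alpha*t)
    and Z_i ~ Poisson(mu); an empty sum gives zero frailty."""
    n_hits = poisson_draw(rng, alpha * t)
    return sum(poisson_draw(rng, mu) for _ in range(n_hits))
```

With many draws, the share of zeros estimates the probability of zero frailty, which for this construction is exp[αt(exp(−μ) − 1)], and the sample mean should be close to αtμ.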

Here, $\{Z_t, t \ge 0\}$ can be constructed as the sum of a Poisson-distributed number of Poisson, binomial, geometric or negative binomial-distributed random variables up to time t. This class of DCPP is discussed by Özel and Inal [18,19] in detail. From Equation (4), the unconditional survival function can be written as

$$S(t) = E\{\exp[-Z_t\Lambda(t)]\} = \sum_{z=0}^{\infty} \exp[-z\Lambda(t)]\, P(Z_t = z), \tag{10}$$

where the baseline survival function $S_0(t)$ equals $\exp[-\Lambda(t)]$.

In particular, if $Z_i^{(1)}$, $i = 1, 2, 3, \ldots$, are Poisson-distributed in Equation (9), $\{Z_t^{(1)}, t \ge 0\}$ is called a Neyman type A process. Assuming that $\{N_t, t \ge 0\}$ is a homogeneous Poisson process with parameter α, the unconditional survival function with the Neyman type A process is derived using Equation (10). We have

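$$S(t) = \sum_{n=0}^{\infty} \sum_{z=0}^{\infty} \exp[-z\Lambda(t)]\, \frac{\exp(-n\mu)(n\mu)^z}{z!}\, \frac{\exp(-\alpha t)(\alpha t)^n}{n!},$$

where $Z_i^{(1)}$, $i = 1, 2, 3, \ldots$, has a Poisson distribution with parameter μ > 0. Then,

$$\begin{aligned}
S(t) &= \exp(-\alpha t) \sum_{n=0}^{\infty} \frac{\exp(-n\mu)(\alpha t)^n}{n!} \sum_{z=0}^{\infty} \frac{(n\mu \exp[-\Lambda(t)])^z}{z!} \\
&= \exp(-\alpha t) \sum_{n=0}^{\infty} \frac{\exp(-n\mu)(\alpha t)^n}{n!} \exp\{n\mu \exp[-\Lambda(t)]\} \\
&= \exp(-\alpha t) \sum_{n=0}^{\infty} \frac{(\alpha t \exp(\mu\{\exp[-\Lambda(t)] - 1\}))^n}{n!}.
\end{aligned}$$

The unconditional survival function for the frailty model with the Neyman type A process takes the form

$$S(t) = \exp[\alpha t(\exp(\mu\{\exp[-\Lambda(t)] - 1\}) - 1)]. \tag{11}$$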
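As a numerical check of this derivation, the closed form in Equation (11) can be compared with a truncated version of the double sum it starts from. This is a sketch with hypothetical function names; the cumulative hazard is passed in as a function so that no baseline form is imposed, and the truncation limits `n_max` and `z_max` are arbitrary choices that suit moderate values of αt and μ.

```python
import math

def neyman_a_survival(t, alpha, mu, cum_hazard):
    """Equation (11): S(t) = exp[alpha*t*(exp(mu*(exp(-Lambda(t)) - 1)) - 1)]."""
    s0 = math.exp(-cum_hazard(t))
    return math.exp(alpha * t * (math.exp(mu * (s0 - 1.0)) - 1.0))

def neyman_a_double_sum(t, alpha, mu, cum_hazard, n_max=60, z_max=120):
    """Truncated double sum over the number of hits n and the total damage z."""
    log_s0 = -cum_hazard(t)
    total = 0.0
    for n in range(n_max + 1):
        # Poisson probability that N_t = n
        weight = math.exp(-alpha * t + n * math.log(alpha * t) - math.lgamma(n + 1))
        if n == 0:
            inner = 1.0  # no hits: Z_t = 0 and exp(-0 * Lambda(t)) = 1
        else:
            # Z_t given N_t = n is Poisson with mean n*mu
            inner = sum(
                math.exp(z * log_s0 - n * mu + z * math.log(n * mu) - math.lgamma(z + 1))
                for z in range(z_max + 1)
            )
        total += weight * inner
    return total
```

Both functions require t > 0 and α > 0 so that the logarithms are defined.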

Let $\{N_t, t \ge 0\}$ be a homogeneous Poisson process with parameter α. If $Z_i^{(2)}$, $i = 1, 2, 3, \ldots$, are binomial distributed, then $\{Z_t^{(2)}, t \ge 0\}$ is called a Neyman type B process. The unconditional survival function for the frailty model with the Neyman type B process is obtained from


Equation (10),

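$$S(t) = \sum_{n=0}^{\infty} \sum_{z=0}^{nm} \exp[-z\Lambda(t)] \binom{nm}{z} p^z q^{nm-z}\, \frac{\exp(-\alpha t)(\alpha t)^n}{n!},$$

where $Z_i^{(2)}$, $i = 1, 2, 3, \ldots$, are binomial-distributed random variables with parameters m and p > 0, p + q = 1. Then, we have

$$S(t) = \exp(-\alpha t) \sum_{n=0}^{\infty} \frac{(\alpha t q^m)^n}{n!} \sum_{z=0}^{nm} \binom{nm}{z} \left(\frac{p \exp[-\Lambda(t)]}{q}\right)^z.$$

Since $(1 + x)^r = \sum_{j=0}^{r} \binom{r}{j} x^j$, it is easy to obtain

$$\begin{aligned}
S(t) &= \exp(-\alpha t) \sum_{n=0}^{\infty} \frac{(\alpha t q^m)^n}{n!} \left(1 + \frac{p \exp[-\Lambda(t)]}{q}\right)^{nm} \\
&= \exp(-\alpha t) \sum_{n=0}^{\infty} \frac{[(\alpha t)(q + p \exp[-\Lambda(t)])^m]^n}{n!}.
\end{aligned}$$

Consequently, the unconditional survival function for the frailty model with the Neyman type B process is given by

$$S(t) = \exp[\alpha t(\{p \exp[-\Lambda(t)] + q\}^m - 1)]. \tag{12}$$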

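Here, $\{Z_t^{(3)}, t \ge 0\}$ is a geometric-Poisson process, where $\{N_t, t \ge 0\}$ is a homogeneous Poisson process with parameter α, and $Z_i^{(3)}$, $i = 1, 2, 3, \ldots$, are geometric distributed with parameter p [14]. The unconditional survival function for the frailty model with the geometric-Poisson process is given by

$$\begin{aligned}
S(t) &= \sum_{n=0}^{\infty} \sum_{z=n}^{\infty} \exp[-z\Lambda(t)] \binom{z-1}{n-1} p^n q^{z-n}\, \frac{\exp(-\alpha t)(\alpha t)^n}{n!} \\
&= \exp(-\alpha t) \sum_{n=0}^{\infty} \left(\frac{p}{q}\right)^n \frac{(\alpha t)^n}{n!} \sum_{z=n}^{\infty} \binom{z-1}{n-1} (q \exp[-\Lambda(t)])^z.
\end{aligned}$$

Since $(1 - q \exp[-\Lambda(t)])^{-n} = \sum_{j=0}^{\infty} \binom{n+j-1}{j} (q \exp[-\Lambda(t)])^j$, we have

$$\begin{aligned}
S(t) &= \exp(-\alpha t) \sum_{n=0}^{\infty} \frac{1}{n!} \left(\frac{\alpha t\, p \exp[-\Lambda(t)]}{1 - q \exp[-\Lambda(t)]}\right)^n \\
&= \exp\left[\alpha t \left(\frac{p \exp[-\Lambda(t)]}{1 - q \exp[-\Lambda(t)]} - 1\right)\right].
\end{aligned} \tag{13}$$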

Defining $Z_i^{(4)}$, $i = 1, 2, 3, \ldots$, as the negative binomial-distributed random variables with parameters k and p, $\{Z_t^{(4)}, t \ge 0\}$ is called a negative binomial-Poisson process. Equation (10) yields


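the unconditional survival function for the frailty distribution with the negative binomial-Poisson process as follows

$$\begin{aligned}
S(t) &= \sum_{n=0}^{\infty} \sum_{z=k}^{\infty} \exp[-z\Lambda(t)] \binom{z-1}{nk-1} p^{nk} q^{z-nk}\, \frac{\exp(-\alpha t)(\alpha t)^n}{n!} \\
&= \exp(-\alpha t) \sum_{n=0}^{\infty} \left(\frac{p}{q}\right)^{nk} \frac{(\alpha t)^n}{n!} \sum_{z=k}^{\infty} \binom{z-1}{nk-1} (q \exp[-\Lambda(t)])^z.
\end{aligned}$$

Since $(1 - q \exp[-\Lambda(t)])^{-nk} = \sum_{j=0}^{\infty} \binom{nk+j-1}{j} (q \exp[-\Lambda(t)])^j$, we have

$$\begin{aligned}
S(t) &= \exp(-\alpha t) \sum_{n=0}^{\infty} \frac{1}{n!} \left[\alpha t \left(\frac{p \exp[-\Lambda(t)]}{1 - q \exp[-\Lambda(t)]}\right)^k\right]^n \\
&= \exp\left[\alpha t \left(\left(\frac{p \exp[-\Lambda(t)]}{1 - q \exp[-\Lambda(t)]}\right)^k - 1\right)\right].
\end{aligned} \tag{14}$$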
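Equations (11) to (14) share one pattern: each has the form S(t) = exp[αt(G(S0(t)) − 1)], where G is the generating function of a single hit. The sketch below collects the four cases under that reading. It is an illustration only: the Weibull baseline S0(t) = exp(−λt^ν) is an assumed parametrization (the text does not spell one out), and the names `HIT_GENERATING_FUNCTIONS` and `dcpp_survival` are hypothetical.

```python
import math

# Generating function G(s) of one hit Z_i, evaluated at s = S0(t).
HIT_GENERATING_FUNCTIONS = {
    "neyman_a": lambda s, mu: math.exp(mu * (s - 1.0)),                    # Eq. (11)
    "neyman_b": lambda s, m, p: ((1.0 - p) + p * s) ** m,                  # Eq. (12)
    "geometric_poisson": lambda s, p: p * s / (1.0 - (1.0 - p) * s),       # Eq. (13)
    "negbin_poisson": lambda s, k, p: (p * s / (1.0 - (1.0 - p) * s)) ** k,  # Eq. (14)
}

def dcpp_survival(t, alpha, model, hit_params, lam, nu):
    """S(t) = exp[alpha*t*(G(S0(t)) - 1)] with an assumed Weibull baseline
    S0(t) = exp(-lam * t**nu)."""
    s0 = math.exp(-lam * t ** nu)
    g = HIT_GENERATING_FUNCTIONS[model](s0, *hit_params)
    return math.exp(alpha * t * (g - 1.0))
```

For instance, `dcpp_survival(2.0, 0.7, "neyman_b", (5, 0.4), 0.5, 0.9)` evaluates Equation (12) with m = 5 and p = 0.4; these parameter values are arbitrary and are not estimates from the paper.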

The method of maximum likelihood is utilized in order to estimate parameters. Let $x_i$ be the true failure time for unit i. Let $c_i$ be the unit's potential censoring time and $d_i$ be the corresponding censoring indicator. Then, observations consist of $t_i = \min(x_i, c_i)$ with $d_i = 1$ if $x_i \le c_i$ (uncensored) and $d_i = 0$ if $x_i > c_i$ (censored). A unit observed to fail at time x contributes f(x) to the likelihood. Similarly, the contribution from a unit whose survival time is censored at c is S(c), the probability of surviving beyond c. Then, the likelihood function is

$$L_n(t_1, \ldots, t_n) = \prod_{i=1}^{n} f(t_i)^{d_i}\, (S(t_i))^{1-d_i}, \tag{15}$$

where f(t) = −dS(t)/dt is the probability density of T. Maximum likelihood estimates of the parameters can be obtained by applying a standard function-optimization routine to the log-likelihood. Standard errors (SEs) for the parameters are derived from the inverse Hessian matrix evaluated at the maximum likelihood estimate.
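The logarithm of Equation (15) is straightforward to code once a survival function is available. The sketch below is one possible arrangement, not the authors' implementation: it accepts any survival function S(t) and approximates the density f(t) = −dS(t)/dt by a central difference, which avoids deriving each density by hand. The function name and the step size `eps` are assumptions made for the example.

```python
import math

def log_likelihood(times, events, survival, eps=1e-6):
    """Log of Equation (15): sum of d_i*log f(t_i) + (1 - d_i)*log S(t_i).

    times    -- observed times t_i = min(x_i, c_i), all larger than eps
    events   -- indicators d_i (1 = failure observed, 0 = censored)
    survival -- function returning S(t) for the model being fitted
    """
    total = 0.0
    for t, d in zip(times, events):
        if d == 1:
            # f(t) = -dS/dt, approximated by a central difference
            density = (survival(t - eps) - survival(t + eps)) / (2.0 * eps)
            total += math.log(density)
        else:
            total += math.log(survival(t))
    return total
```

A function of this kind, with the model parameters bound into `survival`, is what a general-purpose optimizer would maximize; the choice of optimizer and the Hessian-based standard errors are left outside this sketch.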

4. Application

4.1. Application to earthquake data from Turkey

The suggested models are applied to earthquake data from Turkey, since the country is located on the highly active Eurasian Plate, which has caused numerous large-scale earthquakes throughout history. In this study, destructive earthquakes between 1900 and 2011 having magnitudes M ≥ 5.0 within the coordinates 39–42°N and 26–45°E are considered. These data are obtained from an earthquake monitor, which is based on the real-time earthquake list of Kandilli Observatory for Turkey [20]. Our models deal with two kinds of random variable, namely the inter-event time between two consecutive earthquakes, and the censoring time, i.e. the time interval between the present time and the last earthquake. More detailed information about our earthquake data structure can be found in Ata and Özel [21].

The heterogeneity can be modelled by a DCPP in Equation (9). In that case, the discrete frailty is the unknown number of broken faults in a seismic motion, that is, an


Table 1. Estimated parameters and log-likelihood values in the models for earthquake data.

Model                       Parameter   Estimate (SE)   log L      AIC
No-frailty                  λ           0.512 (0.11)    −735.209   1474.418
                            ν           0.873 (0.06)
Neyman type A               λ           0.542 (0.08)    −709.775   1427.551
                            ν           0.146 (0.03)
                            α           0.701 (0.06)
                            μ           1.270 (0.16)
Neyman type B               λ           0.548 (0.17)    −710.780   1431.561
                            ν           0.082 (0.02)
                            α           0.432 (0.07)
                            p           0.451 (0.06)
                            m           5.874 (1.72)
Geometric-Poisson           λ           0.488 (0.15)    −709.579   1427.159
                            ν           0.044 (0.01)
                            α           1.861 (0.22)
                            p           0.555 (0.18)
Negative binomial-Poisson   λ           0.496 (0.08)    −708.700   1427.401
                            ν           1.397 (0.17)
                            α           1.793 (0.19)
                            p           0.361 (0.06)
                            k           3.124 (1.04)

earthquake up to time t. Zero frailty corresponds to a model containing a proportion of faults that never broke (long-term survivors). The Poisson process is frequently used for the occurrence of large earthquakes. That is, a large earthquake is a rare event, and it might be supposed that if we take a series of large samples, the frequencies of large earthquakes would follow a Poisson process [22,23]. In this study, $N_t$ is the number of main shocks that occurred in Turkey between years 1903 and 2011. DCPP frailty models are applied to the data to account for the unobservable heterogeneity in risk. We utilize the frailty model assuming that the frailty term follows the Poisson, binomial, geometric and negative binomial distribution using a Weibull baseline survival function. We tested the fit of the Weibull distribution to our data using the Kolmogorov–Smirnov test of fit and were unable to reject the hypothesis that the earthquake data are Weibull distributed (p > 0.05). Maximum likelihood estimates of the scale and shape parameters are obtained as λ = 0.48 and ν = 0.8, respectively. The hazard function decreases in a non-linear pattern with increasing time since ν < 1. Decreasing hazard rates are less commonly encountered in epidemiologic and medical survival data and in reliability data [24]. However, a decreasing hazard corresponds well to earthquake data, that is, the risk of a recurrence of a large earthquake in a specified region might decrease as time passes. Estimated parameters with their SEs are given in Table 1.
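As a brief check of the statement that the hazard decreases when ν < 1, assume the common Weibull parametrization with cumulative hazard Λ0(t) = λt^ν. The text does not state which parametrization was used, so this form is an assumption:

```latex
\lambda_0(t) = \lambda \nu\, t^{\nu - 1}, \qquad
\frac{d\lambda_0(t)}{dt} = \lambda \nu (\nu - 1)\, t^{\nu - 2} < 0
\quad \text{for } 0 < \nu < 1,\ \lambda > 0,\ t > 0 .
```

Under that assumption, the reported shape estimate ν = 0.8 gives a hazard proportional to t^(−0.2), which falls as t grows.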

We briefly focus on the use of the frailty models in comparison to a Weibull no-frailty model. To provide further insight into model selection, we test for heterogeneity utilizing the proposed frailty models and the Weibull no-frailty model. Using Table 1, −2 log L is computed, resulting in the conclusion that heterogeneity exists. Consequently, the heterogeneity parameter should be included for an adequate description of the data.
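The size of the change in −2 log L can be read off the log-likelihood values reported in Table 1. The arithmetic below compares each frailty model with the no-frailty model; the text does not say which reference distribution or critical value was used, so only the differences themselves are shown:

```latex
\begin{aligned}
\text{Neyman type A:} \quad & 2\,[-709.775 - (-735.209)] = 50.868 \\
\text{Neyman type B:} \quad & 2\,[-710.780 - (-735.209)] = 48.858 \\
\text{Geometric-Poisson:} \quad & 2\,[-709.579 - (-735.209)] = 51.260 \\
\text{Negative binomial-Poisson:} \quad & 2\,[-708.700 - (-735.209)] = 53.018
\end{aligned}
```

Each difference is large relative to the two or three additional parameters, which is in line with the stated conclusion that heterogeneity is present.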


[Figure 1: four panels (Neyman type A, Neyman type B, geometric-Poisson, negative binomial-Poisson), each comparing the Kaplan–Meier (KM) estimate with the fitted survival curve; x-axis: time (years), 1 to 57; y-axis: survival function, 0.00 to 1.00.]

Figure 1. Fitted models for earthquake data.

DCPP frailty models provide a better fit for the earthquake data set. In order to compare the fit of the models, the Akaike information criterion (AIC) [25] can be utilized. The use of the AIC in random effects models is sensitive to the number of degrees of freedom. AIC values of the Weibull no-frailty model and of the Neyman type A, Neyman type B, geometric-Poisson and negative binomial-Poisson models are given in Table 1. Based on these values, DCPP frailty models give a better fit than the Weibull no-frailty model for the earthquake data. We conclude that the geometric-Poisson frailty model, which has the smallest AIC, provides a good fit to the data set considered in this analysis. We also compare the Kaplan–Meier (KM) curve with the survival curves of the proposed models, as depicted in Figure 1. As seen in Figure 1, the Neyman type A, Neyman type B and negative binomial-Poisson frailty models do not appear to improve visually on the fit of the geometric-Poisson frailty model.
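The AIC comparison can be reproduced from the reported log-likelihoods with the usual definition AIC = −2 log L + 2k, where k is the number of estimated parameters. The sketch below uses the log L values and parameter counts from Table 1; the dictionary and function names are hypothetical.

```python
def aic(log_lik, n_params):
    """Akaike information criterion: -2 log L + 2 * (number of parameters)."""
    return -2.0 * log_lik + 2.0 * n_params

# model: (log L, number of estimated parameters), as listed in Table 1
TABLE_1 = {
    "No-frailty": (-735.209, 2),
    "Neyman type A": (-709.775, 4),
    "Neyman type B": (-710.780, 5),
    "Geometric-Poisson": (-709.579, 4),
    "Negative binomial-Poisson": (-708.700, 5),
}

aic_values = {name: aic(ll, k) for name, (ll, k) in TABLE_1.items()}
best_model = min(aic_values, key=aic_values.get)  # smallest AIC is preferred
```

The recomputed values agree with the AIC column of Table 1 to within rounding of the reported log-likelihoods (differences of about 0.001).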

4.2. Application to traffic accidents data of Turkey

Road transport is the dominant mode of passenger and cargo transportation in Turkey, and the level of traffic safety has not kept pace with this dominance, which leads to frequent traffic accidents. Road traffic accidents have become a serious threat to public health in Turkey by killing or injuring many people and resulting in devastating human and economic loss. Numerous dissimilarities exist among drivers in traffic accidents. To account for the unobservable heterogeneity in risk, the frailty models proposed in Section 3 are applied to the traffic accidents data of Turkey. The models are formulated based on the idea that drivers who are most frail will experience the event of interest more than the others. There are a number of known factors that may increase the risk of accident, such as collision type, roadway


Table 2. Estimated parameters and log-likelihood values in the models for truck drivers’ data.

Model                       Parameter   Estimate (SE)   log L      AIC
No-frailty                  λ           0.286 (0.11)    −326.660   657.320
                            ν           0.663 (0.05)
Neyman type A               λ           0.248 (0.08)    −322.120   652.239
                            ν           1.374 (0.12)
                            α           0.090 (0.02)
                            μ           1.529 (0.23)
Neyman type B               λ           0.264 (0.07)    −321.511   653.021
                            ν           1.357 (0.15)
                            α           0.183 (0.07)
                            p           0.640 (0.08)
                            m           8.597 (1.03)
Geometric-Poisson           λ           0.273 (0.09)    −323.436   654.872
                            ν           1.022 (0.10)
                            α           0.351 (0.14)
                            p           0.463 (0.17)
Negative binomial-Poisson   λ           0.271 (0.09)    −322.026   654.048
                            ν           1.249 (0.18)
                            α           0.305 (0.13)
                            p           0.612 (0.09)
                            k           5.298 (1.42)

surface, vehicle speed, alcohol/drug use and restraint use. Undoubtedly, there will also be a large number of still unknown factors that could influence the accident. Among these, previous non-injury accidents of drivers will be prominent. However, the database of Turkish traffic records does not contain detailed information about the number of non-injury traffic accidents that drivers have previously been involved in. To account for this influence, one has to allow for some random quantity to model individual heterogeneity.

In this study, we used the data of a previous study about truck drivers, collected by an anonymous questionnaire given to 500 randomly sampled truck drivers. A total of 315 (63%) truck drivers returned the questionnaire. All the drivers were male. The data of 315 truck drivers within the 10-year follow-up period (2000–2010) are chosen to illustrate the use of DCPP frailty models. In the following analysis, involvement in a personal injury or material loss accident is the endpoint of interest. This variable is measured in years. The failure time is defined as the time interval between the year 2000 and the time at which the truck driver was involved in a personal injury or material loss accident. Truck drivers who had not been involved in a personal injury or material loss accident by the end of the follow-up period were treated as censored observations. The complete data set consists of 51.75% censored observations.

Let $N_t$ be the number of personal injury or material loss accidents of truck drivers that occur between years 2000 and 2010. We define $Z_i$, $i = 1, 2, 3, \ldots$, as the number of personal injury or material loss accidents of the ith truck driver. We utilize the frailty model assuming that $Z_i$, $i = 1, 2, 3, \ldots$, follows the Poisson, binomial, geometric and negative binomial distribution, respectively.


[Figure 2: four panels (Neyman type A, Neyman type B, geometric-Poisson, negative binomial-Poisson), each comparing the Kaplan–Meier (KM) estimate with the fitted survival curve; x-axis: time (years), 0.5 to 10.0; y-axis: survival function, 0.00 to 1.00.]

Figure 2. Fitted models for truck drivers’ data.

The survival time in the truck drivers’ data has a Weibull distribution with parameters λ = 0.17 and ν = 1.75. We obtain an increasing hazard rate with increasing time. The risk of a recurrence of a traffic accident might increase as the age of the truck driver increases. This is consistent with the findings in Bedard et al. [26]. It was found that increasing driver age was associated with fatal injuries, that younger drivers had a lower risk of a fatal injury than drivers aged 50–60 years, and that older drivers had a higher risk. Research on age-related driving concerns has shown that at around the age of 65, drivers face an increased risk of being involved in a vehicle crash. After the age of 75, the risk of driver fatality increases sharply, because older drivers are more vulnerable to both crash-related injury and death.

Using the truck drivers’ data, the Neyman type A, Neyman type B, geometric-Poisson, and negative binomial-Poisson frailty models are compared with a Weibull no-frailty model. The estimated parameters, their SEs, and the AIC values are given in Table 2.
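For reference, the AIC of a fitted model is 2k − 2 log L, where k is the number of estimated parameters and L is the maximized likelihood; smaller values are preferred. The sketch below ranks models by this criterion. The helper names and the log-likelihood and parameter-count values are placeholders invented for illustration and are not the values reported in Table 2.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood

def rank_by_aic(fits):
    """Sort {name: (log_likelihood, n_params)} by ascending AIC."""
    scored = {name: aic(ll, k) for name, (ll, k) in fits.items()}
    return sorted(scored.items(), key=lambda item: item[1])

# Placeholder numbers, NOT the values reported in Table 2.
example_fits = {
    "no-frailty": (-120.0, 2),
    "frailty (candidate 1)": (-112.5, 4),
    "frailty (candidate 2)": (-114.0, 4),
}

for name, value in rank_by_aic(example_fits):
    print(f"{name}: AIC = {value:.1f}")
```

The penalty term 2k means a frailty model is preferred only if its gain in log-likelihood outweighs the extra parameters it introduces.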

According to the AIC values, the DCPP frailty models are more appropriate than the Weibull no-frailty model for the truck drivers’ data. The data indicate that the Neyman type A frailty model provides a good fit relative to the other models. We also compare the KM curve with the survival curves of the proposed models in Figure 2, which leads to the same conclusion as the AIC.
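The KM curve used in this comparison can be computed with the standard product-limit formula. The sketch below is a minimal implementation on a small toy sample; the function name `kaplan_meier` and the toy times and event indicators are hypothetical and unrelated to the truck drivers’ data.

```python
import numpy as np

def kaplan_meier(times, events):
    """Product-limit estimate of S(t) at each distinct observed event time.

    times  : observed times (event or censoring)
    events : 1 if the event was observed, 0 if the time is censored
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=bool)
    event_times = np.unique(times[events])
    surv, s = [], 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)              # still under observation at t
        n_events = np.sum((times == t) & events)  # events occurring exactly at t
        s *= 1.0 - n_events / at_risk
        surv.append(s)
    return event_times, np.array(surv)

# Toy sample: five subjects, two of them censored.
t_km, s_km = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
print(t_km, s_km)
```

Plotting such a step function against a fitted parametric survival curve gives the kind of visual check described above.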

5. Conclusion

Frailty models are extensions of the proportional hazards model, which is the most popular model in survival analysis. Traditionally, continuous frailty models have been used to analyse survival data. Popular frailty models use gamma, inverse Gaussian, positive stable and continuous compound Poisson distributions. In some cases, modelling considerations lead to discrete frailty rather than continuous frailty models. In this study, we have proposed DCPPs for frailty models when heterogeneity can be attributed partly to unmeasured discrete-valued factors.

The goal of this paper is to evaluate and apply DCPP frailty models to time-to-event data. We illustrate the models on an earthquake data set and a traffic accident data set. The survival time in both data sets follows a Weibull distribution. The earthquake data set shows a decreasing hazard rate, and all earthquakes are assumed to be susceptible to breaking a fault. The second application shows an increasing hazard rate, and all truck drivers are assumed to be susceptible to involvement in a non-injury traffic accident. The frailties are assumed to be independent and compound Poisson distributed. Estimation is performed via maximum likelihood techniques. The AIC is used to select the best model among all frailty models considered. We have shown that the DCPP frailty models provide a better fit to the data sets than the no-frailty model. The application to the earthquake data set indicates that the geometric-Poisson process fits the data well, and the graphical illustration yields the same conclusion. The traffic accident data set, collected from information on truck drivers in Turkey, is well described by the Neyman type A process based on the AIC values. Future work should address correlated bivariate DCPP frailty models.

Acknowledgements

The authors thank the editor and anonymous referee for their constructive comments on an earlier version of this manuscript, which resulted in this improved version.

References

[1] D.R. Cox, Regression models and life-tables, J. R. Stat. Soc. Series B 34 (1972), pp. 187–220.

[2] J.W. Vaupel, K. Manton, and E. Stallard, The impact of heterogeneity in individual frailty on the dynamics of mortality, Demography 16 (1979), pp. 439–454.

[3] N. Keyfitz and G. Littman, Mortality in a heterogeneous population, Popul. Stud. 33(2) (1979), pp. 333–342.

[4] T. Lancaster and S. Nickell, The analysis of re-employment probabilities for the unemployed, J. R. Stat. Soc. Series A 143(2) (1980), pp. 141–165.

[5] P. Hougaard, Modeling heterogeneity in survival data, J. Appl. Prob. 28 (1991), pp. 695–701.

[6] O.O. Aalen, Effects of frailty in survival analysis, Stat. Meth. Med. Res. 3 (1994), pp. 227–243.

[7] A.R. Pickles and R. Crouchley, Generalizations and applications of frailty models for survival and event data, Stat. Meth. Med. Res. 3 (1994), pp. 263–278.

[8] P. Hougaard, Life table methods for heterogeneous populations: Distributions describing the heterogeneity, Biometrika 71 (1984), pp. 75–83.

[9] D.M. Santos, R.B. Davies, and B. Francis, Nonparametric hazard versus nonparametric frailty distribution in modelling recurrence of breast cancer, J. Stat. Plan. Infer. 47 (1995), pp. 111–127.

[10] P. Hougaard, A class of multivariate failure time distributions, Biometrika 73 (1986), pp. 671–678.

[11] O.O. Aalen and N.L. Hjort, Frailty models that yield proportional hazards, Stat. Prob. Lett. 58 (2002), pp. 335–342.

[12] O.O. Aalen and S. Tretli, Analyzing incidence of testis cancer by means of a frailty model, Canc. Causes Contr. 10 (1999), pp. 285–292.

[13] X. Xue and R. Brookmeyer, Regression analysis of discrete time survival data under heterogeneity, Stat. Med. 16 (1997), pp. 1983–1993.

[14] H. Li and X. Zhong, Multivariate survival models induced by genetic frailties, with application to linkage analysis, Biostatistics 3 (2002), pp. 57–75.

[15] A. Wienke, Frailty Models in Survival Analysis, Chapman & Hall/CRC Biostatistics Series, Boca Raton, FL, 2010 (e-book).

[16] C. Caroni, M. Crowder, and A. Kimber, Proportional hazards models with discrete frailty, Lifetime Data Anal. 16 (2010), pp. 374–384.

[17] O.O. Aalen, Modelling heterogeneity in survival analysis by the compound Poisson distribution, Ann. Appl. Prob. 4 (1992), pp. 951–972.

[18] G. Özel and C. Inal, The probability function of a geometric Poisson distribution, J. Stat. Comput. Simul. 80(5) (2010), pp. 479–487.

[19] G. Özel and C. Inal, On the probability function of the first exit time for generalized Poisson processes, Pakistan J. Stat. 28(1) (2012), pp. 27–40.

[20] Kandilli Observatory and Earthquake Research Institute of University of Bogazici, Istanbul, Turkey. Available at http://www.koeri.boun.edu.tr (accessed on 14 December 2011).

[21] N. Ata and G. Özel, A multivariate non-parametric hazard model for earthquake occurrences in Turkey, J. Data Sci. 9(4) (2011), pp. 529–548.

[22] G. Özel and C. Inal, The probability function of the compound Poisson process and an application to aftershock sequence in Turkey, Environmetrics 19(1) (2008), pp. 79–85.

[23] G. Özel, A bivariate compound Poisson model for the occurrence of foreshock and aftershock sequences in Turkey, Environmetrics 22(7) (2011), pp. 847–856.

[24] J.P. Klein and M.L. Moeschberger, Survival Analysis Techniques for Censored and Truncated Data, Springer, New York, 2003.

[25] H. Akaike, A new look at the statistical model identification, IEEE Trans. Autom. Cont. 19 (1974), pp. 716–722.

[26] M. Bedard, H.G. Gordon, M.J. Stones, and J.P. Hirdes, The independent contribution of driver, crash, and vehicle characteristics to driver fatalities, Acci. Anal. Prev. 34 (2002), pp. 717–727.
