38
Journal of Data Science 11(2013), 781-818 The Log-Kumaraswamy Generalized Gamma Regression Model with Application to Chemical Dependency Data Marcelino A. R. Pascoa 1 , Claudia M. M. de Paiva 2 , Gauss M. Cordeiro 3 and Edwin M. M. Ortega 4* 1 Universidade Federal de Mato Grosso, 2 Universidade Federal de S˜ao Jo˜ao del Rei, 3 Universidade Federal de Pernambuco and 4 Universidade de S˜ao Paulo Abstract: The five parameter Kumaraswamy generalized gamma model (Pas- coa et al., 2011) includes some important distributions as special cases and it is very useful for modeling lifetime data. We propose an extended version of this distribution by assuming that a shape parameter can take negative values. The new distribution can accommodate increasing, decreasing, bath- tub and unimodal shaped hazard functions. A second advantage is that it also includes as special models reciprocal distributions such as the recipro- cal gamma and reciprocal Weibull distributions. A third advantage is that it can represent the error distribution for the log-Kumaraswamy general- ized gamma regression model. We provide a mathematical treatment of the new distribution including explicit expressions for moments, generating function, mean deviations and order statistics. We obtain the moments of the log-transformed distribution. The new regression model can be used more effectively in the analysis of survival data since it includes as sub- models several widely-known regression models. The method of maximum likelihood and a Bayesian procedure are used for estimating the model pa- rameters for censored data. Overall, the new regression model is very useful to the analysis of real data. Key words: Censored data, generating function, Kumaraswamy generalized gamma distribution, log-gamma generalized regression, moment, survival function. 1. Introduction Standard lifetime distributions usually present very strong restrictions to pro- duce bathtub curves, and thus appear to be unappropriate for data with this characteristic. The gamma distribution is the most popular model for analyzing skewed data. The generalized gamma (GG) (Stacy, 1962) distribution includes * Corresponding author.

782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

Journal of Data Science 11(2013), 781-818

The Log-Kumaraswamy Generalized Gamma Regression Modelwith Application to Chemical Dependency Data

Marcelino A. R. Pascoa1, Claudia M. M. de Paiva2, Gauss M. Cordeiro3

and Edwin M. M. Ortega4∗1Universidade Federal de Mato Grosso, 2Universidade Federal de Sao Joao delRei, 3Universidade Federal de Pernambuco and 4Universidade de Sao Paulo

Abstract: The five parameter Kumaraswamy generalized gamma model (Pas-coa et al., 2011) includes some important distributions as special cases andit is very useful for modeling lifetime data. We propose an extended versionof this distribution by assuming that a shape parameter can take negativevalues. The new distribution can accommodate increasing, decreasing, bath-tub and unimodal shaped hazard functions. A second advantage is that italso includes as special models reciprocal distributions such as the recipro-cal gamma and reciprocal Weibull distributions. A third advantage is thatit can represent the error distribution for the log-Kumaraswamy general-ized gamma regression model. We provide a mathematical treatment ofthe new distribution including explicit expressions for moments, generatingfunction, mean deviations and order statistics. We obtain the moments ofthe log-transformed distribution. The new regression model can be usedmore effectively in the analysis of survival data since it includes as sub-models several widely-known regression models. The method of maximumlikelihood and a Bayesian procedure are used for estimating the model pa-rameters for censored data. Overall, the new regression model is very usefulto the analysis of real data.

Key words: Censored data, generating function, Kumaraswamy generalizedgamma distribution, log-gamma generalized regression, moment, survivalfunction.

1. Introduction

Standard lifetime distributions usually present very strong restrictions to pro-duce bathtub curves, and thus appear to be unappropriate for data with thischaracteristic. The gamma distribution is the most popular model for analyzingskewed data. The generalized gamma (GG) (Stacy, 1962) distribution includes

∗Corresponding author.

Page 2: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

as special models the exponential, Weibull, gamma and Rayleigh distributions,among others. It is suitable for modeling data with hazard rate function of dif-ferent forms (increasing, decreasing, bathtub and unimodal) and then it is usefulfor estimating individual hazard functions and both relative hazards and relativetimes (Cox, 2008). The GG distribution has been used in several research ar-eas such as engineering, hydrology and survival analysis. In fact, Ortega et al.(2003) discussed influence diagnostics in GG regression models, Nadarajah andGupta (2007) applied this distribution to drought data and Cox et al. (2007)presented a parametric survival analysis and taxonomy of GG hazard functions.For X1 and X2 independent GG random variables, Ali et al. (2008) derived theexact distributions of the product X1X2 and quotient X1/X2 and provided ap-plications of their results to drought data from Nebraska. Further, Gomes et al.(2008) focused on parameter estimation, Ortega et al. (2008) compared threetypes of deviance component residuals in GG regression models under censoredobservations, Cox (2008) discussed and compared the F-generalized family withthe GG model, Almpanidis and Kotropoulos (2008) presented a text-independentautomatic phone segmentation algorithm based on this distribution and Nadara-jah (2008a) analyzed some incorrect references with respect to its use in electricaland electronic engineering. More recently, Barkauskas et al. (2009) studied thenoise part of a spectrum as an autoregressive moving average (ARMA) modelwith the innovations having the GG distribution and Malhotra et al. (2009)provided a unified analysis for wireless system over generalized fading channelsthat is modeled by a two parameter GG model. Also, Ortega et al. (2009) pro-posed a modified GG regression model to allow the possibility that long-termsurvivors may be presented in the data and Cordeiro et al. (2010) defined theexponentiated generalized gamma (EGG) distribution. This distribution due toits flexibility in accommodating many forms of the risk function seems to be animportant model that can be used in a variety of problems in survival analysis.

In the last years, new distributions for modeling survival data based on ex-tensions of the Weibull distribution were developed. Mudholkar et al. (1995),Xie and Lai (1995), Lai et al. (2003) and Carrasco et al. (2008) introducedthe exponentiated Weibull (EW), additive Weibull, modified Weibull (MW) andgeneralized modified Weibull (GMW) distributions, respectively. Further, themain motivation for the exponentiated generalized gamma (EGG) distribution(Cordeiro et al., 2010) is that it contains as special cases the GG, EW, expo-nentiated exponential (EE) (Gupta and Kundu, 1999) and generalized Rayleigh(GR) (Kundu and Raqab, 2005) distributions.

The Kumaraswamy generalized gamma (KGG) distribution (Pascoa et al.,2011) can model four types of the failure rate function (i.e. increasing, decreasing,unimodal and bathtub) depending on the values of its parameters. It is also

Page 3: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

The Log-Kumaraswamy Generalized Gamma Regression Model 783

suitable for testing goodness-of-fit of some sub-models, such as the EGG, GG,EW, Weibull and GR distributions. In this paper, we define an extended form ofthe KGG distribution to cope with several reciprocal type distributions and studysome of its structural properties. We extend the model by Pascoa et al. (2011),since a shape parameter can take negative values. The method of maximumlikelihood is used for estimating the model parameters. Unless otherwise stated,all of the results presented in the paper are new and original. It is expected thatthey could encourage further research on the new distribution.

Different forms of regression models have been proposed in survival analysis.Among them, the location-scale regression model (Lawless, 2003) is distinguishedsince it is frequently used in clinical trials. Here, we propose a location-scaleregression model based on the new distribution called the log-Kumaraswamygeneralized gamma (LKGG) regression model, which is a feasible alternative formodeling the four types of failure rate functions.

The rest of the paper is organized as follows. In Section 2, we define anextended version of the KGG distribution. In Section 3, we derive its ordinarymoments, moment generating function (mgf), order statistics and their moments.In Section 4, we define the LKGG distribution and derive an explicit expression forits moments. In Section 5, we propose the LKGG regression model for analysisof censored data. We estimate the model parameters by maximum likelihood,derive the observed information matrix and discuss a Bayesian methodology toestimate the model parameters. In Section 6, we illustrate the potentiality ofthe new regression model by means of a real data set in chemical dependency.Section 7 ends with some concluding remarks.

2. The KGG Distribution

Pascoa et al. (2011) defined the KGG distribution with five positive parame-ters α, τ , k, λ and ϕ to extend the GG and EGG distributions pioneered by Stacy(1962) and Cordeiro et al. (2010), respectively. The KGG probability densityfunction (pdf) (for t > 0) is given by

f(t) =λϕ τ

αΓ(k)

(t

α

)τk−1

exp

[−(t

α

)τ ]{γ1

[k,

(t

α

)τ ]}λ−1

×

(1−

{γ1

[k,

(t

α

)τ ]}λ)ϕ−1

, (1)

where Γ(k) =∫∞

0 wk−1e−wdw (for k > 0) is the gamma function, γ1(k, x) =γ(k, x)/Γ(k) is the incomplete gamma function ratio and γ(k, x) =

∫ x0 w

k−1e−wdwis the incomplete gamma function. The incomplete gamma function defined here

Page 4: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

784 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

for positive real k and x can be developed into the context of holomorphic func-tions for any complex k and x. Complex analysis shows how properties of thereal incomplete gamma functions extend to their holomorphic counterparts. Inthe density function (1), α is a scale parameter and τ , k, λ and ϕ are shapeparameters. The Weibull, GG and EGG distributions are special models of (1)when λ = k = 1, λ = ϕ = 1 and ϕ = 1, respectively. The KGG distributionapproaches the log-normal distribution when λ = ϕ = 1 and k →∞.

Now, we define an extended form of the density function (1) (for t > 0) givenby

f(t) =λϕ |τ |αΓ(k)

(t

α

)τk−1

exp

[−(t

α

)τ ]{γ1

[k,

(t

α

)τ ]}λ−1

×

(1−

{γ1

[k,

(t

α

)τ ]}λ)ϕ−1

, (2)

where τ is not zero and the other parameters are positive. The terminologyKumaraswamy generalized gamma (KGG) distribution is maintained for (2). Fora random variable T having this density, we write T ∼KGG(α, τ, k, λ, ϕ). Forτ > 0, (2) reduces to (1) and for τ > 0 and λ = ϕ = 1, it is identical to the GGdistribution. For τ > 0 and λ = ϕ = k = 1, we obtain the Weibull distribution.The case τ < 0 and λ = ϕ = k = 1 gives the reciprocal Weibull (or inverseWeibull) distribution. For τ = −1 and λ = ϕ = k = 1, we obtain the reciprocalexponential. If τ = −2 and λ = ϕ = k = 1, we have the reciprocal Rayleigh. Forτ = −1 and λ = ϕ = 1, we obtain the reciprocal gamma. The case τ < 0 andλ = ϕ = 1 corresponds to the generalized reciprocal gamma. If α = 1/2, τ = −1,k = p/2 and λ = ϕ = 1, we have the reciprocal chi-square. The values α = 1/

√2,

τ = −2, k = p/2 and λ = ϕ = 1 yield the reciprocal-chi. If α =√

2σ, τ = −2,k = p/2 and λ = ϕ = 1, we obtain the scaled reciprocal-chi. The case τ < 0 andk = 1 gives the Kumaraswamy (“Kum” for short) reciprocal Weibull. For τ = −1and k = 1, we have the Kum reciprocal exponential. If τ = −2 and k = 1, weobtain the Kum reciprocal Rayleigh. The Kum reciprocal gamma corresponds toτ < 0. Several special models of (2) when τ > 0 are discussed by Pascoa et al.(2011). Plots of the KGG density function for selected values of τ > 0 and τ < 0are displayed in Figure 1.

The cumulative distribution function (cdf) corresponding to (2) becomes

F (t) =

1−

{1−

[γ1

(k, ( tα)τ

)]λ}ϕ, if τ > 0,

{1−

[γ1

(k, ( tα)τ

)]λ}ϕ, if τ < 0.

(3)

Page 5: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

The Log-Kumaraswamy Generalized Gamma Regression Model 785

(a) (b) (c)

0.0 0.5 1.0 1.5 2.0 2.5

0.0

0.2

0.4

0.6

0.8

1.0

t

f(t)

α=1; τ=0.7; k=1; λ=1; ϕ=1α=1; τ=1.5; k=1; λ=1; ϕ=1α=1; τ=2; k=1; λ=1; ϕ=1α=1; τ=2.5; k=1; λ=1; ϕ=1α=1; τ=3; k=1; λ=1; ϕ=1

0.0 0.5 1.0 1.5 2.0 2.5

0.0

0.2

0.4

0.6

0.8

1.0

1.2

t

f(t)

α=1; τ=−0.7; k=1; λ=1; ϕ=1α=1; τ=−1.5; k=1; λ=1; ϕ=1α=1; τ=−2; k=1; λ=1; ϕ=1α=1; τ=−2.5; k=1; λ=1; ϕ=1α=1; τ=−3; k=1; λ=1; ϕ=1

0 2 4 6 8 10 12

0.0

0.2

0.4

0.6

0.8

1.0

1.2

t

f(t)

α=4.7; τ=7.2; k=2.8; λ=0.8; ϕ=1.5α=5; τ=2.3; k=3; λ=0.9; ϕ=2.5α=1.9; τ=−5; k=0.4; λ=2.1; ϕ=2α=1.2; τ=−0.6; k=0.4; λ=10; ϕ=0.8α=6; τ=3; k=1; λ=20; ϕ=0.9α=3.1; τ=−10.3; k=2; λ=0.1; ϕ=1.2

Figure 1: The KGG density curves: (a) For some values of τ > 0. (b) For somevalues of τ < 0. (c) For some values of τ > 0 and τ < 0

Let gα,τ,k(t) be the GG(α, τ, k) density function (Stacy and Mihram, 1965)given by (for t > 0)

gα,τ,k(t) =|τ |

αΓ(k)

(t

α

)τk−1

exp

[−(t

α

)τ ]. (4)

The generalized binomial expansion converges for |z| < 1 and any real b

(1− z)b =

∞∑j=0

(−1)j(b

j

)zj , (5)

where(bj

)= Γ(b + 1)/[Γ(b − j + 1)j!]. Then, from (5), the density function (2)

can be expressed as

f(t) =λϕ |τ |αΓ(k)

(t

α

)τk−1

exp

[−(t

α

)τ ]{γ1

[k,

(t

α

)τ ]}λ−1 ∞∑j=0

(−1)j(ϕ− 1

j

)

×

{γ1

[k,

(t

α

)τ ]}λj.

This equation holds for any ϕ > 0. From expansion (27) (given in Appendix A),we obtain

f(t) =λϕ |τ |αΓ(k)

(t

α

)τk−1

exp

[−(t

α

)τ ] ∞∑m,j=0

(−1)j sm(λ− 1)

(ϕ− 1

j

)

×

{γ1

[k,

(t

α

)τ ]}λj+m. (6)

Page 6: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

786 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

For any real λj +m > 0, we can write{γ1

[k,

(t

α

)τ ]}jλ+m

=∞∑l=0

(−1)l(λj +m

l

){1− γ1

[k,

(t

α

)τ ]}l.

Applying the binomial expansion, (6) can be rewritten as

f(t) =λϕ |τ |αΓ(k)

(t

α

)kτ−1

exp

[−(t

α

)τ ] ∞∑j,l,m=0

l∑q=0

(−1)j+l+q sm(λ− 1)

×(ϕ− 1

j

)(jλ+m

l

)(l

q

){γ1

[k,

(t

α

)τ ]}q.

From (31) (given in Appendix A) and replacing∑∞

l=0

∑lq=0 by

∑∞q=0

∑∞l=q, the

density function f(t) can be expressed as a double linear combination of GGdensity functions

f(t) =

∞∑d,q=0

wd,q gα,τ,k(1+q)+d(t), t > 0, (7)

where gα,τ,k(1+q)+d(t) is the GG(α, τ, k(1 + q) + d) density function given by (4)and the coefficients wd,q are

wd,q = wd,q(k, λ, ϕ) = λϕ

∞∑j,m=0

∞∑l=q

(−1)j+l+q Γ[k(1 + q) + d] sm(λ− 1) cq,dΓ(k)q+1

×(ϕ− 1

j

)(λj +m

l

)(l

q

), (8)

where the quantity sm(λ − 1) is given by (28) and the quantities cq,d can bedetermined from the recurrence (30) (see Appendix A). Clearly,

∑∞d,q=0wd,q = 1.

(7) is the main result of this section. Some KGG mathematical properties (forexample, the ordinary, inverse, factorial and incomplete moments and generatingfunction can be obtained from (7) and those GG properties.

3. General Properties

Henceforth, let T and Y be random variables having the densities (2) and (4),respectively.

3.1 Moments

Here, we obtain an infinite representation for the rth ordinary moment of T ,

Page 7: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

The Log-Kumaraswamy Generalized Gamma Regression Model 787

say µ′r = E(T r). The rth moment of the GG(α, τ, k) distribution is

µ′r,GG =αr Γ(k + r/τ)

Γ(k).

Based on (7), µ′r can be expressed as

µ′r = αr λϕ∞∑

d,q,j,m=0

∞∑l=q

(−1)j+l+q Γ[k(1 + q) + d+ r/τ ] sm(λ− 1) cq,dΓ(k)q+1

×(ϕ− 1

j

)(λj +m

l

)(l

q

),

where the quantity sm(λ) is given by (28) and the coefficients cq,d can be deter-mined from (30).

3.2 Generating Function

Here, we provide an explicit expression for the mgf of Y , say Mα,τ,k(s) =E(esY ), for τ > 1, based on the Wright function (Wright, 1935). We are unableto obtain a closed-form expression for Mα,τ,k(s) when τ < 0. We can write

Mα,τ,k(s) =|τ |

ατk Γ(k)

∫ ∞0

exp(st) tτk−1 exp{−(t/α)τ}dt.

Setting u = t/α, we have

Mα,τ,k(s) =|τ |

Γ(k)

∫ ∞0

exp(αsu)uτk−1 exp(−uτ )du.

Now, we assume τ > 0. Expanding the first exponential in Taylor series andusing

∫∞0 ukτ+m−1e−u

τdu = τ−1 Γ(m/τ + k), we obtain

Mα,τ,k(s) =sgn(τ)

Γ(k)

∞∑m=0

Γ(mτ

+ k) (αs)m

m!. (9)

(9) holds for any real τ > 0. However, for τ > 1, it can be simplified by con-sidering the Wright generalized hypergeometric function (Wright, 1935) definedby

pΨq

[ (α1, A1

), · · · ,

(αp, Ap

)(β1, B1

), · · · ,

(βq, Bq)

; x

]=

∞∑m=0

p∏j=1

Γ(αj +Ajm)

q∏j=1

Γ(βj +Bjm)

xm

m!.

Page 8: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

788 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

This function exists if 1 +∑q

j=1Bj −∑p

j=1Aj > 0. By combining the last twoequations, we have

Mα,τ,k(s) =1

Γ(k)1Ψ0

[ (k, τ−1

)− ; s α

]. (10)

Finally, the mgf of T (for τ > 1) can be written from (7) and (10) as

M(s) =

∞∑d,q=0

wd,qΓ(k[1 + q] + d)

1Ψ0

[ (k[1 + q] + d, τ−1

)− ; s α

]. (11)

(9)-(11) are the main result of this section.

3.3 Order Statistics

The density function fi:n(t) of the ith order statistic, say Ti:n, for i = 1, · · · , n,from random variables T1, · · · , Tn having density (2), is given by

fi:n(t) =1

B(i, n− i+ 1)f(t)F (t)i−1 {1− F (t)}n−i,

where f(t) and F (t) are the pdf and cdf of the KGG distribution, respectivelyand B(p, q) = Γ(p)Γ(q)/Γ(p + q) denotes the beta function. We readily obtainusing the binomial expansion

fi:n(t) =1

B(i, n− i+ 1)f(t)

n−i∑j1=0

(−1)j1(n− ij1

)F (t)i+j1−1.

Now, we derive an expression for the density function of the KGG order statisticsin terms of the baseline density function multiplied by a power series of G(t) =γ1(k, (t/α)τ ). Setting ϕ = λ = 1 in (3), it is clear that the cdf of the GG(α, τ, k)distribution is G(t) = γ1(k, (t/α)τ ) for τ > 0 and 1−G(t) = 1− γ1(k, (t/α)τ ) forτ < 0. This result enables us to obtain the ordinary moments of the KGG orderstatistics as infinite weighted sums of convenient quantities defined by

δs,r =

∫ ∞0

ts γ1

[k,

(t

α

)τ]rgα,τ,k(t)dt.

These quantities δs,r can be easily integrated numerically. They represent theprobability weighted moments (PWMs) of Y when τ > 0. We shall consider twodistinct cases: τ > 0 and τ < 0.

Page 9: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

The Log-Kumaraswamy Generalized Gamma Regression Model 789

• For τ > 0

An expansion for F (t)i+j1−1 follows from (3) as

F (t)i+j1−1 =

i+j1−1∑l1=0

(−1)l1(i+ j1 − 1

l1

) [1−G(t)λ

]l1ϕ.

Using the series expansion for[1−G(t)λ

]l1ϕ, we obtain

F (t)i+j1−1 =

i+j1−1∑l1=0

(−1)l1(i+ j1 − 1

l1

) ∞∑m1=0

(−1)m1

(l1ϕ

m1

)G(t)m1λ.

For m1λ > 0, we can write G(t)m1λ =∑∞

r=0 sr(m1λ)G(t)r, where thecoefficients sr(m1λ) can be obtained from (28) as

sr(m1λ) =∞∑l2=r

(−1)l2+r

(m1λ

l2

)(l2r

),

for r = 0, 1, · · · . Thus,

F (t)i+j1−1 =

i+j1−1∑l1=0

(−1)l1(i+ j1 − 1

l1

) ∞∑r=0

υr(l1, λ, ϕ)G(t)r,

where

υr(l1, λ, ϕ) =

∞∑m1=0

(−1)m1

(l1ϕ

m1

)sr(m1λ).

Interchanging the sums, we obtain

F (t)i+j1−1 =

∞∑r=0

pr,i+j1−1(λ, ϕ)G(t)r,

where

pr,u(λ, ϕ) =u∑

l1=0

(−1)l1(u

l1

) ∞∑m1=0

∞∑l2=r

(−1)m1+r+l2

(l1ϕ

m1

)(m1λ

l2

)(l2r

),

for r, u = 0, 1, · · · . The density function of Ti:n becomes

fi:n(t) =1

B(i, n− i+ 1)f(t)

n−i∑j1=0

(−1)j1(n− ij1

) ∞∑r=0

pr,i+j1−1(λ, ϕ)G(t)r,

Page 10: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

790 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

and then

fi:n(t) =1

B(i, n− i+ 1)

∞∑d,q,r=0

n−i∑j1=0

(−1)j1 wd,q pr,i+j1−1(λ, ϕ)

(n− ij1

)γ1

×[k,

(t

α

)τ]rgα,τ,k(1+q)+d(t).

The last equation can be rewritten as

fi:n(t) =∞∑

d,q,r=0

n−i∑j1=0

m(d, q, i, j1, n) pr,i+j1−1(λ, ϕ) tτ(kq+d) γ1

×[k,

(t

α

)τ]rgα,τ,k(t), (12)

where

m(d, q, i, j1, n) =λϕ

B(i, n− i+ 1)

∞∑j,m=0

∞∑l=q

(−1)j+j1+l+q sm(λ− 1) cq,d

ατ(kq+d) Γ(k)q

×(ϕ− 1

j

)(n− ij1

)(λj +m

l

)(l

q

). (13)

Then, for τ > 0, E (T si:n) can be expressed as

E (T si:n) =∞∑

d,q,r=0

n−i∑j1=0

m(d, q, i, j1, n) pr,i+j1−1(λ, ϕ) δs+τ(kq+d),r,

where δs+τ(kq+d),r was defined before.

• For τ < 0

From (3), we can write

F (t)i+j1−1 =[1−G(t)λ

]ϕ(i+j1−1)=

∞∑m1=0

(−1)m1

(ϕ(i+ j1 − 1)

m1

)G(t)m1λ.

For m1λ > 0, we have G(t)m1λ =∑∞

r=0 sr(m1λ)G(t)r, where sr(m1λ) isgiven by (28) for r = 0, 1, · · · . Further,

F (t)i+j1−1 =

∞∑r=0

qr(i, j1, λ, ϕ)G(t)r,

Page 11: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

The Log-Kumaraswamy Generalized Gamma Regression Model 791

where

qr(i, j1, λ, ϕ) =

∞∑m1=0

(−1)m1 sr(m1λ)

(ϕ(i+ j1 − 1)

m1

).

The density function of the KGG order statistics becomes

fi:n(t) =1

B(i, n− i+ 1)f(t)

n−i∑j1=0

(−1)j1(n− ij1

) ∞∑r=0

qr(i, j1, λ, ϕ)G(t)r,

and then

fi:n(t) =1

B(i, n− i+ 1)

∞∑d,q,r=0

n−i∑j1=0

(−1)j1(n− ij1

)wd,q qr(i, j1, λ, ϕ)γ1

×[k,

(t

α

)τ]rgα,τ,k(1+q)+d(t).

The last equation can be rewritten as

fi:n(t) =∞∑

d,q,r=0

n−i∑j1=0

m(d, q, i, j1, n) tτ(kq+d)qr(i, j1, λ, ϕ)γ1

×[k,

(t

α

)τ]rgα,τ,k(t), (14)

where m(d, q, i, j1, n) is given by (13). So, the moments of the KGG orderstatistics for τ < 0 can be expressed as

E (T si:n) =∞∑

d,q,r=0

n−i∑j1=0

m(d, q, i, j1, n)qr(i, j1, λ, ϕ) δs+τ(kq+d),r.

(12) and (14) are the main results of this section.

4. The LKGG Distribution

Henceforth, let T be a random variable following the KGG density function (2)and let Y = log(T ). Setting k = q−2, τ = (σ

√k)−1 and α = exp

[µ− τ−1 log(k)

],

the density function of Y can be expressed as

f(y) =λϕ|q|(q−2)q

−2

σΓ(q−2)exp{q−1(y − µ

σ

)− q−2exp

[q(y − µ

σ

)]}×{γ1

[q−2, q−2exp

{q(y − µ

σ

)}]}λ−1{1−

[γ1

(q−2, q−2exp

{q(y − µ

σ

)})]λ}ϕ−1,

Page 12: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

792 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

where −∞ < y, µ < ∞, σ > 0, λ > 0, ϕ > 0 and q is different from zero. Weconsider an extended form including the case q = 0 (Lawless, 2003). Thus, thedensity of Y can be expressed as

f(y) =

λϕ|q|(q−2)q−2

σΓ(q−2)exp{q−1(y−µσ

)− q−2exp

[q(y−µσ

)]}×{γ1

[q−2, q−2exp

{q(y−µσ

)}]}λ−1

×{

1−[γ1

(q−2, q−2exp

{q(y−µσ

)})]λ}ϕ−1, if q 6= 0,

λϕ√2πσ

exp{− 1

2

(y−µσ

)2}Φ(λ−1)

(y−µσ

)[1− Φλ

(y−µσ

)]ϕ−1, if q = 0,

(15)

where Φ(·) is the standard normal cumulative distribution. We refer to (15)as the log-Kumaraswamy generalized gamma (LKGG) distribution, say Y ∼LKGG(µ, σ, q, λ, ϕ), where µ ∈ R is the location parameter, σ > 0 is the scaleparameter and q, λ and ϕ are shape parameters. For q = 0 and λϕ = 1 and q = 1,we have from (15) the skew normal and extreme value distributions, respectively.For ϕ = 1 and λ = ϕ = 1, we obtain the log-exponentiated generalized gamma(Ortega et al., 2012) and log-gamma generalized distributions, respectively. Thecase λ = ϕ = 1 and q = −1 corresponds to the log-inverse Weibull distribution.Thus,

if T ∼ KGG(α, τ, k, λ, ϕ) then Y = log(T ) ∼ LKGG(µ, σ, q, λ, ϕ).

Setting µ = 0 and σ = 1, the plots of the density function (15) for selectedvalues of λ and ϕ, when q < 0, q > 0 and q = 0, are displayed in Figure 2.These plots clearly indicate that the LKGG distribution could be very flexiblefor modeling its kurtosis.

−4 −3 −2 −1 0 1 2 3

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

y

f(y)

q=0.4; λ=0.6; ϕ=0.9q=4; λ=3; ϕ=0.5q=0.2; λ=0.4; ϕ=1.2q=0.5; λ=1.9; ϕ=0.8q=0.1; λ=0.7; ϕ=5

−3 −2 −1 0 1 2 3 4

0.0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

y

f(y)

q=−0.4; λ=0.6; ϕ=0.9q=−4; λ=3; ϕ=0.5q=−0.2; λ=0.4; ϕ=1.2q=−0.5; λ=1.9; ϕ=0.8q=−0.1; λ=0.7; ϕ=5

−4 −2 0 2 4

0.0

0.1

0.2

0.3

0.4

0.5

y

f(y)

λ=0.6; ϕ=0.9λ=3; ϕ=0.5λ=0.4; ϕ=1.2λ=1.9; ϕ=0.8λ=0.7; ϕ=5

Figure 2: The LKGG density curves: (a) For some values of q > 0. (b) Forsome values of q < 0. (c) For some values of q = 0

Page 13: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

The Log-Kumaraswamy Generalized Gamma Regression Model 793

• For q > 0, the survival function of Y is given by

S(y) = 1− F (y) = P (Y > y) = P (µ+ σZ > y) = P (Z > z)

=λϕq(q−2)q

−2

Γ(q−2)

∫ ∞z

exp{q−1u− q−2exp(qu)

}{γ1

[q−2, q−2exp(qu)

]}λ−1du[

1−{γ1

[q−2, q−2exp(qu)

]}λ]−(ϕ−1);

• For q < 0, the survival function of Y is given by

S(y) = 1− F (y) = P (Y > y) = P (µ+ σZ > y) = P (Z > z)

=−λϕq(q−2)q

−2

Γ(q−2)

∫ ∞z

exp{q−1u− q−2exp(qu)

}{γ1

[q−2, q−2exp(qu)

]}λ−1du[

1−{γ1

[q−2, q−2exp(qu)

]}λ]−(ϕ−1).

These integrals can be reduced to

S(y) =

{1−

[γ1

(q−2, q−2exp

[q(y−µσ )

])]λ}ϕ, if q > 0,

1−{

1−[γ1

(q−2, q−2exp

[q(y−µσ )

])]λ}ϕ, if q < 0,

[1− Φλ

(y−µσ

)]ϕ, if q = 0.

(16)

Now, we derive the rth moment of Y ∼ LKGG(µ, σ, q, λ, ϕ), say µ′r.

• For q 6= 0, we obtain

µ′r =

∫ ∞−∞

yrλϕ|q|

(q−2)q−2

σΓ(q−2)exp

(q−2{q(y − µ

σ

)− exp

[q(y − µ

σ

)]})×(γ1

{q−2, q−2 exp

[q(y − µ

σ

)]})λ−1

×[1−

(γ1

{q−2, q−2 exp

[q(y − µ

σ

)]})λ]ϕ−1dy.

Setting x = q−2 exp[q(y−µ

σ

)]and using (5), µ′r reduces to

µ′r =λϕ sgn(q)

Γ (q−2)

∫ ∞0

q[log(x) + 2 log(|q|)] + µ

}rxq

−2−1e−x

×∞∑j=0

(−1)j(ϕ− 1

j

)[γ1

(q−2, x

)]λ(j+1)−1dx.

Page 14: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

794 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

Applying expansion (27) in Appendix A for[γ1

(q−2, x

)]λ(j+1)−1, we obtain

µ′r =

∫ ∞0

λϕ sgn(q)

Γ (q−2)

q[log(x) + 2 log(|q|)] + µ

}rxq

−2−1e−x

×∞∑

j,m=0

(−1)j sm(λ(j + 1)− 1)

(ϕ− 1

j

)γ1

(q−2, x

)mdx.

Using expansion (31) in Appendix A for γ1

(q−2, x

)m, we have

µ′r =

∫ ∞0

λϕ sgn(q)

Γ (q−2)

q[log(x) + 2 log(|q|)] + µ

}re−x

×∞∑

j,m,i=0

(−1)j sm(λ(j + 1)− 1) cm,iΓ(q−2)m

(ϕ− 1

j

)xq

−2(m+1)+i−1dx.

Note that{2σ

qlog(|q|) + µ+

σ

qlog(x)

}r=

r∑l=0

(r

l

)[2σ

qlog(|q|) + µ

]r−l(σq

)ll log(x),

and then

µ′r =λϕ sgn(q)

Γ(q−2)

∞∑j,m,i=0

r∑l=0

(−1)j sm(λ(j + 1)− 1) cm,iΓ (q−2)m

(ϕ− 1

j

)(r

l

)

×[

qlog(|q|) + µ

]r−l (σq

)ll

∫ ∞0

log(x)e−x xq−2(m+1)+i−1dx.

The last integral is given in Prudnikov et al. (1986, Vol 1, Section 2.6.21)and then

µ′r =λϕ sgn(q)

Γ(q−2)

∞∑j,m,i=0

r∑l=0

(−1)j sm(λ(j + 1)− 1) cm,iΓ (q−2)m

(ϕ− 1

j

)(r

l

)

×[

qlog(|q|) + µ

]r−l [Γ(q−2(m+ 1) + i)

](σq

)ll,

where Γ(p) = ∂Γ(p)/∂p.

• For q = 0, we can write

µ′r = E(Y r) =λϕ√2πσ

∫ ∞−∞

yr exp[− 1

2

(y − µσ

)2]Φ(λ−1)

(y − µσ

)×[1− Φ(λ)

(y − µσ

)]ϕ−1dy.

Page 15: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

The Log-Kumaraswamy Generalized Gamma Regression Model 795

Using (5), we have

µ′r =λϕ√2πσ

∫ ∞−∞

yrexp[− 1

2

(y − µσ

)2] ∞∑m=0

(−1)m(ϕ− 1

m

)Φ(λ(m+1)−1)

×(y − µ

σ

)dy.

The PWM (for j and p non-negative integers) of the standardized normaldistribution is

νj,p =

∫ ∞−∞

zj φ(z) Φp(z)dz,

where φ(z) is the standard normal density. Setting y = µ + σz and using(27) in Appendix A for Φ(λ(m+1)−1)(z), we obtain

µ′r = λϕ∞∑

m,p=0

r∑j=0

(−1)m(ϕ− 1

m

)(r

j

)sp(λ(m+ 1)− 1)σj µr−j νj,p, (17)

The standard cumulative normal can be expressed as

Φ(x) =1

2

{1 + erf

(x√2

)}, x ∈ R.

Using the binomial expansion and interchanging terms, we obtain

νj,p =1

2p√

p∑l=0

(p

l

) ∫ ∞−∞

xje−x2/2 erf

(x√2

)p−ldx.

From the power series for the error function erf(·)

erf(x) =2√π

∞∑m=0

(−1)m x2m+1

(2m+ 1)m!,

the last integral follows from (9)-(11) of Nadarajah (2008b). When j+p− lis even, we have

νj,p = 2j/2π−(p+1)/2p∑l=0

(j+p−l) even

(p

l

)2−l πl/2 Γ

(j + p− l + 1

2

)

×F (p−l)A

(j + p− l + 1

2;1

2, · · · , 1

2;3

2, · · · , 3

2;−1, · · · ,−1

), (18)

Page 16: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

796 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

where F(p−l)A is given in terms of the Lauricella function of type A (Exton, 1978;

Aarts, 2000) defined by

F(n)A (a; b1, · · · , bn; c1, · · · , cn;x1, · · · , xn)

=

∞∑m1=0

· · ·∞∑

mn=0

(a)m1+···+mn (b1)m1· · · (bn)mn

(c1)m1· · · (cn)mn

xm11 · · ·xmnnm1! · · ·mn!

,

and (a)i = a(a + 1) · · · (a + i − 1) is the ascending factorial with the conventionthat (a)0 = 1. Numerical routines for the direct computation of the Lauricellafunction of type A are available, see Exton (1978) and Mathematica (Trott, 2006).

Plots of the skewness and kurtosis for selected values of µ, σ, q, λ and ϕ aredisplayed in Figures 3 and 4, respectively.

2 4 6 8 10

01

23

45

67

λ

Ske

wn

ess

ϕ=0.5ϕ=1ϕ=1.5ϕ=2

0.0 0.2 0.4 0.6 0.8 1.0

2.0

2.5

3.0

3.5

λ

Ku

rto

sis

ϕ=0.5ϕ=1ϕ=1.5ϕ=2

Figure 3: Skewness and kurtosis of the LKGG distribution as a function of λfor some values of ϕ with µ = 1.5, σ = 2 and q = 0.5

0 2 4 6 8 10

−10

−5

05

10

15

20

ϕ

Ske

wness

λ=0.5λ=1λ=1.5λ=2

0.0 0.2 0.4 0.6 0.8 1.0

1.5

2.0

2.5

3.0

3.5

ϕ

Ku

rto

sis

λ=0.5λ=1λ=1.5λ=2

Figure 4: Skewness and kurtosis of the LKGG distribution as a function of ϕfor some values of λ with µ = 1.5, σ = 2 and q = 0.5

Page 17: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

The Log-Kumaraswamy Generalized Gamma Regression Model 797

5. The LKGG Regression Model

In many practical applications, the lifetimes are affected by explanatory vari-ables such as the cholesterol level, blood pressure, weight and many others. Para-metric models to estimate univariate survival functions for censored data regres-sion problems are widely used. A parametric model that provides a good fitto lifetime data tends to yield more precise estimates of the quantities of in-terest. If Y ∼ LKGG(µ, σ, q, λ, ϕ), we define the standardized random variableZ = (Y − µ)/σ having density function given by

f(z) =

λϕ |q|Γ(q−2)

(q−2)q−2

exp{q−1z − q−2exp(qz)

}{γ1

[q−2, q−2exp(qz)

]}λ−1

×{

1−(γ1

[q−2, q−2exp(qz)

])λ}ϕ−1, if q 6= 0,

λϕ√2π

exp(− z2

2

)Φ(λ−1)(z)[1− Φλ(z)]ϕ−1, if q = 0.

(19)

We write Z ∼ LKGG(0, 1, q, λ, ϕ). Further, we propose a linear location-scaleregression model linking the response variable yi and the explanatory variablevector xTi = (xi1, · · · , xip) by

yi = xTi β + σzi, i = 1, · · · , n, (20)

where the random error zi has density function (19), β = (β1, · · · , βp)T , σ > 0,λ > 0 and −∞ < q < ∞ are unknown parameters. The parameter µi = xTi β isthe location of yi. The location parameter vector µ = (µ1, · · · , µn)T is defined bya linear model µ = Xβ, whereX = (x1, · · · ,xn)T is a known model matrix. TheLKGG model (20) opens new possibilities for fitted many different types of data.It contains as special models some well-known regression models. For ϕ = 1,we obtain the log-exponentiated generalized gamma (LEGG) regression model(Ortega et al., 2012). The case λ = 1 leads to the log-exponentiated complementgeneralized gamma (LECGG) regression model, whereas ϕ = λ = 1 yields thelog-gamma generalized (LGG) regression model. For ϕ = λ = q = 1, we obtainthe classical log-Weibull (or extreme value) regression model (see, Lawless, 2003).If σ = 1 and σ = 0.5, in addition to ϕ = λ = q = 1, the new regression modelreduces to the exponential and Rayleigh regression models, respectively. Thecase λ = ϕ = q = 1 refers to the log-exponentiated Weibull (LEW) regressionmodel (Mudholkar et al., 1995). See, also, Cancho et al. (1999, 2009), Ortegaet al. (2006) and Hashimoto et al. (2010). If σ = 1, in addition to q = 1, itcoincides with the log-exponentiated exponential regression model. If σ = 0.5,in addition to q = 1, it gives the log-generalized Rayleigh regression model. Forλ = 1, we have the log-gamma generalized (LGG) regression model (Lawless,2003). Recently, the LGG distribution has been used in several research areas.

Page 18: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

798 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

See, for example, Ortega et al. (2003, 2008, 2009). For q = −1, we obtain thelog-generalized inverse Weibull regression model. If λ = 1, in addition to q = −1,it follows the log-inverse Weibull regression model. For q = 0, we have the log-exponentiated normal regression model. Finally, for λ = 2, the LKGG regressionmodel reduces to the skew normal regression model.

5.1 Maximum Likelihood Estimation

Consider a sample (y1,x1), · · · , (yn,xn) of n independent observations, whereeach random response is defined by yi = min{log(ti), log(ci)}. We assume non-informative censoring such that the observed lifetimes and censoring times areindependent. Let F and C be the sets of individuals for which yi is the log-lifetimeor log-censoring, respectively. Conventional likelihood estimation techniques canbe applied here. The log-likelihood function for the vector of parameters θ =

(q, λ, ϕ, σ,βT )T from model (20) has the form l(θ) =∑i∈F

li(θ) +∑i∈C

l(c)i (θ), where

li(θ) = log[f(yi)], l(c)i (θ) = log[S(yi)], f(yi) is the density function (15) and S(yi)

is the survival function (16) of Yi. The total log-likelihood function for θ can bepartitioned as

l(θ) =

v log[λϕ q (q−2)q

−2

σΓ(q−2)

]+ q−1

∑i∈F

zi − q−2∑i∈F

exp(qzi)

+(λ− 1)∑i∈F

log{γ1[q−2, q−2 exp(qzi)]}

+(ϕ− 1)∑i∈F

log{

1− {γ1[q−2, q−2 exp(qzi)]}λ}

+ϕ∑i∈C

log{

1− {γ1[q−2, q−2 exp(qzi)]}λ}, if q > 0,

v log[λϕ (−q) (q−2)q

−2

σΓ(q−2)

]+ q−1

∑i∈F

zi − q−2∑i∈F

exp(qzi)

+(λ− 1)∑i∈F

log{γ1[q−2, q−2 exp(qzi)]}

+(ϕ− 1)∑i∈F

log{

1− {γ1[q−2, q−2 exp(qzi)]}λ}

+∑i∈C

log{

1−[1− {γ1[q−2, q−2 exp(qzi)]}λ

]ϕ}, if q < 0,

v log[

λϕ

σ√

]− 1

2

∑i∈F

z2i + (λ− 1)

∑i∈F

log[Φ(zi)]

+(ϕ− 1)∑i∈F

log[1− Φλ(zi)] + ϕ∑i∈C

log[1− Φλ(zi)], if q = 0,

(21)

where v is the number of uncensored observations (failures) and zi = (yi−xTi β)/σ.

The maximum likelihood estimate (MLE) θ of the vector of the model parameterscan be calculated by maximizing the log-likelihood (21). Initial values for β andσ are obtained by fitting the Weibull regression model with λ = 1, ϕ = 1 and

Page 19: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

The Log-Kumaraswamy Generalized Gamma Regression Model 799

q = 1. If zi = (yi − xTi β)/σ, the fit of the LKGG model yields the estimatedsurvival function for yi

S(yi; λ, ϕ, σ, q, βT

) =

{1− {γ1[q−2, q−2 exp(qzi)]}λ

}ϕ, if q > 0,

1−{

1− {γ1[q−2, q−2 exp(qzi)]}λ}ϕ, if q < 0,

[1− Φλ(zi)

]ϕ, if q = 0.

(22)

Under conditions that are fulfilled for the parameter vector θ in the interiorof the parameter space but not on the boundary, the asymptotic distribution of√n(θ−θ) is multivariate normal Np+3(0,K(θ)−1), where K(θ) is the information

matrix. The asymptotic covariance matrix K(θ)−1 of θ can be approximated bythe inverse of the (p + 3) × (p + 3) observed information matrix −L(θ). Theelements of −L(θ), namely −Lλλ, −Lλϕ, −Lλσ, −Lλβj , −Lϕϕ, −Lϕσ, −Lϕβj ,−Lσσ, −Lσβj and −Lβjβs for j, s = 1, · · · , p, are given in Appendix B. The

approximate multivariate normal distribution Np+3(0,−L(θ)−1) for θ can beused in the classical way to construct approximate confidence regions for somecomponents of θ.

Further, we can use the likelihood ratio (LR) statistic for comparing somesub-models with the LKGG model. We consider the partition θ = (θT1 ,θ

T2 )T ,

where θ1 is a subset of parameters of interest and θ2 is a subset of remaining

parameters. The LR statistic for testing the null hypothesis H0 : θ1 = θ(0)1 versus

the alternative hypothesis H1 : θ1 6= θ(0)1 is given by w = 2{`(θ)−`(θ)}, where θ

and θ are the estimates under the null and alternative hypotheses, respectively.The statistic w is asymptotically (as n → ∞) distributed as χ2

k, where k is thedimension of the subset of parameters θ1 of interest.

5.2 A Bayesian Analysis

As an alternative analysis, we use the Bayesian method which allows forthe incorporation of previous knowledge of the parameters through informativepriori density functions. When this information is not available, we can consider anoninformative prior. In the Bayesian approach, the information referring to themodel parameters is obtained through a posterior marginal distribution. In thisway, two difficulties usually arise. The first refers to attaining marginal posteriordistribution, and the second to the calculation of the moments of interest. Bothcases require numerical integration that, many times, do not present an analyticalsolution. Here, we use the simulation method of Markov Chain Monte Carlo(MCMC), such as the Gibbs sampler and Metropolis-Hastings algorithm.

Page 20: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

800 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

Since we have no prior information from historical data or from previousexperiment, we assign conjugate but weakly informative prior distributions tothe parameters. Since we assume informative (but weakly) prior distribution,the posterior distribution is a well-defined proper distribution. Here, we considerthat the elements of the parameter vector are independent and that the jointprior distribution of all unknown parameters has density function given by

π(λ, ϕ, σ, q,β) ∝ π(λ)× π(ϕ)× π(σ)× π(q)× π(β). (23)

Here, λ ∼ Γ(a1, b1), ϕ ∼ Γ(a2, b2), σ ∼ Γ(a3, b3), q ∼ N (µ1, σ21) and βs ∼

N (µs, σ2s), where Γ(ai, bi) denotes a gamma distribution with mean ai/bi, variance

ai/b2i and density function

f(υ; ai, bi) =baii υ

ai−1 exp(−υbi)Γ(ai)

,

where υ > 0, ai > 0 and bi > 0, N (µs, σ2s) denotes the normal distribution with

mean µs, variance σ2s and density function given by

f(x;µs, σs) =1√

2πσ2s

exp

[−(x− µs)2

2σ2s

],

where x, µs ∈ R and σ2s > 0. All hyper-parameters are specified. Combining

the likelihood function (21) and the prior distribution (23), the joint posteriordistribution for λ, ϕ, σ, q and βTs reduces to

π(λ, ϕ, σ, q,β|y) ∝

[λϕ(−q)(q−2)q

−2

σΓ(q−2)

]υexp

[q−1

∑i∈F

(yi − xTi β)

σ

]

× exp

(−q−2

∑i∈F

exp

{q

[(yi − xTi β)

σ

]})

×∏i∈F

[γ1

(q−2, q−2 exp

{q

[(yi − xTi β)

σ

]})]λ−1

×∏i∈F

{1−

[γ1

(q−2, q−2 exp

{q

[(yi − xTi β)

σ

]})]λ}ϕ−1

×∏i∈C

1−

{1−

[γ1

(q−2, q−2 exp

{q

[(yi − xTi β)

σ

]})]λ}ϕ(24)

×π(λ, ϕ, σ, q,β).

Page 21: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

The Log-Kumaraswamy Generalized Gamma Regression Model 801

The joint posterior density above is analytically intractable because the in-tegration of the joint posterior density is not easy to perform. So, the inferencecan be based on MCMC simulation methods such as the Gibbs sampler andMetropolis-Hastings algorithm, which can be used to draw samples, from whichfeatures of the marginal distributions of interest can be inferred. In this direction,we first obtain the full conditional distributions of the unknown parameters givenby

π(λ|y, ϕ, σ, q,β) ∝ λυ∏i∈F

[γ1

(q−2, q−2 exp

{q

[(yi − xTi β)

σ

]})]λ

×∏i∈F

{1−

[γ1

(q−2, q−2 exp

{q

[(yi − xTi β)

σ

]})]λ}ϕ−1

×∏i∈C

1−

{1−

[γ1

(q−2, q−2 exp

{q

[(yi − xTi β)

σ

]})]λ}ϕ×π(λ),

π(ϕ|y, λ, σ, q,β) ∝ ϕυ∏i∈F

{1−

[γ1

(q−2, q−2 exp

{q

[(yi − xTi β)

σ

]})]λ}ϕ

×∏i∈C

1−

{1−

[γ1

(q−2, q−2 exp

{q

[(yi − xTi β)

σ

]})]λ}ϕ×π(ϕ),

π(σ|y, λ, ϕ, q,β)

∝ σ−υ exp

[q−1

∑i∈F

(yi − xTi β)

σ

]exp

(−q−2

∑i∈F

exp

{q

[(yi − xTi β)

σ

]})

×∏i∈F

[γ1

(q−2, q−2 exp

{q

[(yi − xTi β)

σ

]})]λ−1

×∏i∈F

{1−

[γ1

(q−2, q−2 exp

{q

[(yi − xTi β)

σ

]})]λ}ϕ−1

×∏i∈C

1−

{1−

[γ1

(q−2, q−2 exp

{q

[(yi − xTi β)

σ

]})]λ}ϕ×π(σ),

Page 22: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

802 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

π(q|y, λ, ϕ, σ,β) ∝

[(−q)(q−2)q

−2

Γ(q−2)

]υexp

[q−1

∑i∈F

(yi − xTi β)

σ

]

× exp

(−q−2

∑i∈F

exp

{q

[(yi − xTi β)

σ

]})

×∏i∈F

[γ1

(q−2, q−2 exp

{q

[(yi − xTi β)

σ

]})]λ−1

×∏i∈F

{1−

[γ1

(q−2, q−2 exp

{q

[(yi − xTi β)

σ

]})]λ}ϕ−1

×∏i∈C

1−

{1−

[γ1

(q−2, q−2 exp

{q

[(yi − xTi β)

σ

]})]λ}ϕ×π(q)

and

π(β|y, λ, ϕ, σ, q) ∝ exp

[q−1

∑i∈F

(yi − xTi β)

σ

]exp

(−q−2

∑i∈F

exp

{q

[(yi − xTi β)

σ

]})

×∏i∈F

[γ1

(q−2, q−2 exp

{q

[(yi − xTi β)

σ

]})]λ−1

×∏i∈F

{1−

[γ1

(q−2, q−2 exp

{q

[(yi − xTi β)

σ

]})]λ}ϕ−1

×∏i∈C

1−

{1−

[γ1

(q−2, q−2 exp

{q

[(yi − xTi β)

σ

]})]λ}ϕ

×π(β).

Since the full conditional distributions for λ, ϕ, σ, q and β do not have closed-form, we require the use of the Metropolis-Hastings algorithm. The joint posteriordensity for q > 0 is defined analogously, considering the likelihood function (21)for q > 0 and the prior distribution (23). Combining the likelihood function (21)and the prior distribution (23), the joint posterior distribution for λ, ϕ, σ andβTs for q = 0 reduces to

Page 23: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

The Log-Kumaraswamy Generalized Gamma Regression Model 803

π(λ, ϕ, σ,β|y) ∝

(λϕ√2π

)υexp

[−1

2

∑i∈F

(yi − xTi β

σ

)2]∏i∈F

(yi − xTi β

σ

)]λ−1

×∏i∈F

{1−

(yi − xTi β

σ

)]λ}ϕ−1∏i∈C

{1−

(yi − xTi β

σ

)]λ}ϕ×π(λ, ϕ, σ,β).

The conditional distributions for these parameters taking q = 0 are given by

π(λ|y, ϕ, σ,β) ∝ λυ∏i∈F

(yi − xTi β

σ

)]λ∏i∈F

{1−

(yi − xTi β

σ

)]λ}ϕ−1

×∏i∈C

{1−

(yi − xTi β

σ

)]λ}ϕ×π(λ),

π(ϕ|y, λ, σ,β) ∝ ϕυ∏i∈F

{1−

(yi − xTi β

σ

)]λ}ϕ∏i∈C

{1−

(yi − xTi β

σ

)]λ}ϕ×π(ϕ),

π(σ|y, λ, ϕ,β) ∝ exp

[−1

2

∑i∈F

(yi − xTi β

σ

)2]∏i∈F

(yi − xTi β

σ

)]λ−1

×∏i∈F

{1−

(yi − xTi β

σ

)]λ}ϕ−1∏i∈C

{1−

(yi − xTi β

σ

)]λ}ϕ×π(λ),

and

π(β|y, λ, ϕ, σ) ∝ exp

[−1

2

∑i∈F

xTi β(1− 2yi)

σ2

]∏i∈F

(yi − xTi β

σ

)]λ−1

×∏i∈F

{1−

(yi − xTi β

σ

)]λ}ϕ−1∏i∈C

{1−

(yi − xTi β

σ

)]λ}ϕ×π(β).

The MCMC computations were implemented in the statistical software pack-age R.

Page 24: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

804 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

6. Application: Chemical Dependency Data

We demonstrate the potentiality of the LKGG regression model applied tochemical dependency patients as investigated by Pascoa (2008). The data wereprovided by the Association Mother Brave, localized in the city of Caratinga, MG,Brazil, where n = 141 residents were considered drug addicts from 2000 to 2005.The response variable was the time spent in the community until the withdrawof the treatment, considering that each resident remains in the community for amaximum period of 270 days without any contact with drugs or free drugs. Theresident who achieves this goal was regarded as a censored observation. Chemicaldependency is a chronic disease of the brain (Kalivas and Volkow, 2005). One ofits main characteristics is the relapse phenomenon, whereby sufferers return totheir consumption of the problem substance, followed by a renewed attempt tostop or reduce this consumption (Brandon et al., 2007; Koob and Le Moal, 1997).Relapse does not mean failure of treatment. Rather it is a part of the rehabili-tation process (Marlatt, 2001). The patient is in relapse when he or she presentsall or part of the dysfunctional behavior patterns that were present before treat-ment. Thus, relapse starts before actual renewed consumption. It is marked bythe occurrence of facilitating stimuli with cognitive, emotional, physical and so-cial aspects (Maisto et al., 2003). Behavioral changes and exposure to high-risksituations generally precede resumed consumption of the substance (Miller et al.,1996). Even the most motivated patients can present relapse episodes (Bakeret al., 2004). Nevertheless, relapse is predictable and avoidable (Brandon et al.,2007). As the time of abstinence increases, the chance of relapse decreases, al-though this risk is never totally eliminated during the lifetime of a person withan addiction. Studies of the risk factors associated with relapse in chemical de-pendency are very important, because their findings can increase the probabilityof suitable early clinical interventions. Furthermore, risk studies allow identifi-cation of protective factors to reduce vulnerability and favor resistance to thetemptation to relapse. The variables involved in the study are:

• ti : the time spent in the community until the withdrawal of treatment(days);

• δi : the censoring indicator (0 = censoring, 1 = lifetime observed);

• xi1 : the marital status (0 = single, 1 = married);

• xi2 : schooling (0 = no education or incomplete bases, 1 = elementaryeducation and higher education);

• xi3: age (0 =< 30 years, 1 =≥ 30 years).

Page 25: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

The Log-Kumaraswamy Generalized Gamma Regression Model 805

We fit the LKGG regression model

yi = β0 + β1xi1 + β2xi2 + β3xi3 + σzi,

where the errors z1, · · · , z141 are independent random variables having densityfunction (19).

Table 1 lists the MLEs of the parameters for the LKGG regression model fittedto the current data using the NLMixed procedure in SAS. This model involvesan extra parameter which gives it more flexibility to fit these data. Initial valuesfor β, q and σ are taken from the fit of the LGG regression model with λ = 1and ϕ = 1. The explanatory variables x1, x2 and x3 are marginally significantfor the LKGG model at the significance level of 5%.

Table 1: MLEs of the parameters, standard errors, p-values and 95% confidenceintervals for the LKGG model fitted to the chemical dependency data

Parameter Estimate SE p-value CI 95%

q -0.8911 0.0001 - (-0.8913; -0.8910)λ 2.1286 0.5408 - (1.0595; 3.1977)ϕ 0.3558 0.0387 - (0.2792; 0.4324)σ 1.9210 0.0299 - (1.8620; 1.9800)β0 6.0879 0.1336 < 0.001 (5.8238; 6.3520)β1 0.4814 0.0279 < 0.001 (0.4263; 0.5365)β2 -0.7097 0.0759 < 0.001 (-0.8598; -0.5597)β3 0.7033 0.0176 < 0.001 (0.6686; 0.7380)

Cox (1972) proposed a very useful regression model for analyzing censoringfailure times, where the random variable of interest represents failure time andthe failures times are assumed identically distributed in some specified form. Henoted that if the proportional hazards assumption holds (or, is assumed to hold)then it is possible to estimate the effect parameter(s) without any considerationof the hazard function (non-parametric approach). This approach to survivaldata is called proportional hazards model. The Cox model may be specialized ifa reason exists to assume that the baseline hazard follows a parametric form. Inthis case, the baseline hazard can be replaced by a parametric density. Typically,we can then maximize the full likelihood which greatly simplifies model-fittingand provides interpretability at the cost of flexibility.

Let R(ti) be the set of individuals at risk at time ti. Conditionally on the risksets, the required likelihood L(β) can be expressed as

L(β) =

n∏i=1

[exp(xTi β)∑

j∈R(ti)

exp(xTi β)

]δi, (25)

Page 26: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

806 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

where δi is the censoring indicator.

The MLE β of β can be calculated by maximizing the likelihood function (25)using the R software. In Table 2, we list the estimates, corresponding standarderrors and p-values for the fitted Cox regression model. Explanatory variables x2

and x3 are marginally significant at the 5% significance level.

Table 2: Estimates of the Cox model fitted to the data of chemical dependencyand corresponding hazard ratios (HRj)

covariate Estimate SE p-value HRj CI 95% (HRj)

β1 -0.1639 0.2164 0.4488 0.8488 (0.5554; 1.2973)β2 0.6074 0.2128 0.0043 1.8356 (1.2096; 2.7855)β3 -0.4916 0.2162 0.0230 0.6117 (0.4004; 0.9345)

Further, Table 3 gives the Akaike Information Criterion (AIC), ConsistentAkaike Information Criterion (CAIC) and Bayesian Information Criterion (BIC)to compare the LKGG model and Cox proportional hazard regression models.The LKGG regression model outperforms the other models irrespective of thecriteria and it can be used effectively in the analysis of these data. So, theproposed model is a great alternative to model survival data.

Table 3: The AIC, CAIC and BIC statistics for comparing the LKGG modeland Cox proportional hazard regression models

Model AIC CAIC BIC

LKGG 462.7 463.8 486.3Cox 877.1 877.5 885.9

In order to assess if the model is appropriate, we fit the LKGG regressionmodel for each explanatory variable. In Figures 5a, b, c, d, we plot the empiricalsurvival function and the estimated survival function (22) for each explanatoryvariable. We can conclude that the LKGG regression model provides a good fitto these data.

Bayesian analysis

The following independent priors were considered to perform the Metropolis-Hastings algorithm: λ ∼ Γ(0.01, 0.01), ϕ ∼ Γ(0.01, 0.01), σ ∼ Γ(0.01, 0.01), q ∼N (0, 10) and βs ∼ N (0, 10), so that we have a vague prior distribution. Consid-ering these prior density functions, we generate two parallel independent runs ofthe Metropolis-Hastings with size 100, 000 for each parameter, disregarding thefirst 10, 000 iterations to eliminate the effect of the initial values and, to avoidcorrelation problems, we consider a spacing of size 10, obtaining a sample of size9, 000 from each chain. To monitor the convergence of the Metropolis-Hastings,

Page 27: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

The Log-Kumaraswamy Generalized Gamma Regression Model 807

0 1 2 3 4 5 6

0.0

0.2

0.4

0.6

0.8

1.0

y

S(y

;x)

Kaplan−Meier (x1=0)Kaplan−Meier (x1=1)LKGG Regression model (x1=0)LKGG Regression model (x1=1)

0 1 2 3 4 5 6

0.0

0.2

0.4

0.6

0.8

1.0

yS

(y;x

)

Kaplan−Meier (x2=0)Kaplan−Meier (x2=1)LKGG Regression model (x2=0)LKGG Regression model (x2=1)

0 1 2 3 4 5 6

0.0

0.2

0.4

0.6

0.8

1.0

y

S(y

;x)

Kaplan−Meier (x3=0)Kaplan−Meier (x3=1)LKGG Regression model (x3=0)LKGG Regression model (x3=1)

Figure 5: Kaplan-Meier curves stratified by explanatory variable and estimatedsurvival functions to the chemical dependency data: (a) The marital statusexplanatory variable. (b) Schooling explanatory variable. (c) Age explanatoryvariable

we perform the methods suggested by Cowles and Carlin (1996). We use thebetween and within sequence information, following the approach developed inGelman and Rubin (1992), to obtain the potential scale reduction, R. In all cases,these values were close to one, indicating the convergence of the chain. The ap-proximate posterior marginal density functions for the parameters are presentedin Figure 6. In Table 4, we report posterior summaries for the parameters of theLKGG model. We note that the values for the means a posteriori (Table 4) arequite close (as expected) to the MLEs given in Table 1. SD denotes the standarddeviation from the posterior distributions of the parameters and HPD representsthe 95% highest posterior density (HPD) intervals.

1.0 2.0 3.0

0.0

0.4

0.8

1.2

λ

Den

sity

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4

ϕ

Den

sity

1.6 1.8 2.0 2.2

01

23

σ

Den

sity

−1.4 −1.0 −0.6

01

23

q

Den

sity

5.0 5.5 6.0 6.5 7.0 7.5

0.0

0.4

0.8

β0

Den

sity

0.0 0.2 0.4 0.6 0.8

01

23

4

β1

Den

sity

−1.5 −1.0 −0.5 0.0

0.0

0.5

1.0

1.5

β2

Den

sity

0.4 0.6 0.8 1.0

01

23

4

β3

Den

sity

1.0 2.0 3.0

0.0

0.4

0.8

1.2

λ

Den

sity

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4

ϕ

Den

sity

1.6 1.8 2.0 2.2

01

23

σ

Den

sity

−1.4 −1.0 −0.6

01

23

q

Den

sity

5.0 5.5 6.0 6.5 7.0 7.5

0.0

0.4

0.8

β0

Den

sity

0.0 0.2 0.4 0.6 0.8

01

23

4

β1

Den

sity

−1.5 −1.0 −0.5 0.0

0.0

0.5

1.0

1.5

β2

Den

sity

0.4 0.6 0.8 1.0

01

23

4

β3

Den

sity

1.0 2.0 3.0

0.0

0.4

0.8

1.2

λ

Den

sity

0.0 0.2 0.4 0.6 0.8 1.0

01

23

4

ϕ

Den

sity

1.6 1.8 2.0 2.2

01

23

σ

Den

sity

−1.4 −1.0 −0.6

01

23

q

Den

sity

5.0 5.5 6.0 6.5 7.0 7.5

0.0

0.4

0.8

β0

Den

sity

0.0 0.2 0.4 0.6 0.8

01

23

4

β1

Den

sity

−1.5 −1.0 −0.5 0.0

0.0

0.5

1.0

1.5

β2

Den

sity

0.4 0.6 0.8 1.0

01

23

4

β3

Den

sity

Figure 6: Approximate posterior marginal densities for the parameters fromthe LKGG model for the chemical dependency data

Page 28: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

808 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

Table 4: Posterior summaries for the parameters from the LKGG model forthe chemical dependency data

Parameter Mean SD HPD (95%) R

q -0.9169 0.1140 (-1.1564; -0.7172) 1.0013

λ 2.1383 0.3423 (1.4634; 2.7906) 1.0006

ϕ 0.3737 0.1076 (0.1777; 0.6277) 1.0069

σ 1.9012 0.1129 (1.6758; 2.1115) 1.0020

β0 6.1413 0.3479 (5.5224; 6.8593) 1.0153

β1 0.4904 0.0962 (0.3015; 0.6741) 1.0043

β2 -0.7138 0.2418 (-1.1878; -0.2378) 0.9998

β3 0.7090 0.0972 (0.5220; 0.9063) 1.0025

7. Concluding Remarks

We introduce an extended form of the Kumaraswamy generalized gamma(KGG) distribution (Pascoa et al., 2011) for which the hazard rate function ac-commodates the four types of shape forms, i.e. increasing, decreasing, bathtuband unimodal. The KGG model includes as special cases several useful lifetimemodels. We derive explicit expressions for its moments, generating function,order statistics and their moments. Further, we also introduce the called log-Kumaraswamy generalized gamma (LKGG) distribution and obtain explicit ex-pressions for its moments. Based on this new distribution, we define the LKGGregression model which is very suitable for modeling censored and uncensoredlifetime data. The new regression model allow us to test as special models thegoodness of fit of some widely known regression models. Hence, it represents agood alternative for lifetime data analysis. We estimate the model parametersusing maximum likelihood and a Bayesian approach. We demonstrate by meansof an application to real data that the LKGG model can produce better fits thansome well-known models.

Acknowledgements

The authors are grateful to the Associate Editor and an anonymous refereeand the Editor for very useful comments and suggestions. This work was sup-ported by CNPq and CAPES.

Appendix A

We derive an expansion for γ1(k, x)λ−1 for any real λ > 0. We can write

Page 29: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

The Log-Kumaraswamy Generalized Gamma Regression Model 809

γ1(k, x)λ−1 = [1− {1− γ1(k, x)}]λ−1 =∞∑j=0

(−1)j(λ− 1

j

){1− γ1(k, x)}j ,

which always converges since 0 < γ1(k, x) < 1. Hence,

γ1(k, x)λ−1 =

∞∑j=0

j∑m=0

(−1)j+m(λ− 1

j

)(j

m

)γ1(k, x)m. (26)

We can substitute∑∞

j=0

∑jm=0 for

∑∞m=0

∑∞j=m to obtain

γ1(k, x)λ−1 =∞∑m=0

sm(λ− 1) γ1(k, x)m, (27)

where

sm(λ− 1) =∞∑j=m

(−1)j+m(λ− 1

j

)(j

m

). (28)

We use the power series for the incomplete gamma ratio function given by

γ1(k, x) =xk

Γ(k)

∞∑d=0

(−x)d

(k + d) d!.

By application a result by Gradshteyn and Ryzhik (2000, Section 0.314) for apower series raised to a positive integer q, we obtain( ∞∑

d=0

ad xd

)q=∞∑d=0

cq,d xd. (29)

Here, the coefficients cq,d (for d = 1, 2, · · · ) are determined from the recurrenceequation

cq,d = (da0)−1d∑p=1

[p (q + 1)− d] ap cq,d−p, (30)

where cq,0 = aq0 and ap = (−1)p/[(k + p)p!]. So, the coefficient cq,d can becalculated from cq,0, · · · , cq,d−1, and then it can be expressed as a function of thequantities a0, · · · , ad, although it is not necessary for programming numericallythe expansions. Further, using (29), we obtain

γ1(k, x)q =xkq

Γ(k)q

∞∑d=0

cq,d xd. (31)

Page 30: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

810 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

Appendix B: Matrix of second derivatives −L(θ)

Here, we provide formulas for the second-order partial derivatives of the log-likelihood function. After some algebraic manipulations, we obtain

Lσ,σ =υ

σ2− q−1

σ2

∑i∈F

zi{−2[1− exp(qzi)] + qzi exp(qzi)} −q−1

σΓ(q−2)

×∑i∈F

wq−2−1i zi exp(qzi − wi)[γ1(q−2, wi)]

−1

{(− 2

σ+ zi exp(qzi)

×{− q−1(q−2 − 1)

σw−1i −

ziσ

+q−1

Γ(q−1)wq

−2−1i zi exp(−wi)[γ1(q−2, wi)]

−1

})

×(λ− 1 + λ(1− ϕ)[γ1(q−2, wi)]

λ{

1− [γ1(q−2, wi)]λ}−1

+ wq−2−1i zi

× exp(qzi − wi)[γ1(q−2, wi)]−1

)}− q−1

σ2Γ(q−2)

∑i∈C

wq−2−1i zi exp(qzi − wi)

×[γ1(q−2, wi)]λ−1

{1− [γ1(q−2, wi)]

λ}ϕ (

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

×{− 2− q−1(q−2 − 1)zi exp(qzi)

{q−2 − 1 + zi +

wq−2−1i

Γ(q−2)exp(−wi)

×[γ1(q−2, wi)]−1

[λ− 1− λ[γ1(q−2, wi)]

λ{

1− [γ1(q−2, wi)]λ}−2

×(ϕ− 1 + ϕ

{1− [γ1(q−2, wi)]

λ}ϕ(

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

)]}},

Lσ,λ = − q−1

σΓ(q−2)

∑i∈F

wq−2−1i zi exp(qzi − wi)[γ1(q−2, wi)]

−1

{1 + (1− ϕ)

×[γ1(q−2, wi)]λ−1

{1− [γ1(q−2, wi)]

λ}−1

[1 + λ log[γ1(q−2, wi)]

×(

1 + [γ1(q−2, wi)]λ{

1− [γ1(q−2, wi)]λ}−1

)]}− ϕq−1

σΓ(q−2)

∑i∈C

wq−2−1i zi

× exp(qzi − wi)[γ1(q−2, wi)]λ−1

{1− [γ1(q−2, wi)]

λ}ϕ−1

×(

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

{1 + λ log[γ1(q−2, wi)]

Page 31: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

The Log-Kumaraswamy Generalized Gamma Regression Model 811

×(

1 + (1− ϕ)[γ1(q−2, wi)]λ{

1− [γ1(q−2, wi)]λ}−1

)− λϕ[γ1(q−2, wi)]

λ

×{

1− [γ1(q−2, wi)]λ}−1 (

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

},

Lσ,ϕ =λq−1

σΓ(q−2)

∑i∈F

wq−2−1i zi exp(qzi − wi)[γ1(q−2, wi)]

λ−1

×{

1− [γ1(q−2, wi)]λ}−1− λq−1

σΓ(q−2)

∑i∈C

wq−2−1i zi exp(qzi − wi)

×[γ1(q−2, wi)]λ−1{

1− [γ1(q−2, wi)]λ}ϕ−1 (

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

×{

1 + ϕ

(log{

1− [γ1(q−2, wi)]λ}

+{

1− [γ1(q−2, wi)]λ}ϕ

×(

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

)},

Lσ,βj = − q

σ2

∑i∈F

xij [(1 + qzi) exp(qzi)− 1]− q−1

σ2Γ(q−2)

∑i∈F

xijwq−2−1i zi exp(qzi)

×[γ1(q−2, wi)]−1

{[− 1 + λ

(1 + (1− ϕ)[γ1(q−2, wi)]

λ

×{

1− [γ1(q−2, wi)]λ}−1

)][− 1− q−1(q−2 − 1)w−1

i zi exp(qzi)− qzi

×(

exp(−wi) + q−2(q−2 − 1)wq−2−1i exp(qzi)

)+

q−1

Γ(q−2)wq

−2−1i zi

× exp(qzi − 2wi)[γ1(q−2, wi)]−1

]− q−1

Γ(q−2)wq

−2−1i zi exp(qzi − wi)

×{

1− [γ1(q−2, wi)]λ}−1

(1 + λ[γ1(q−2, wi)]

2λ−1 {1− [γ1

× (q−2, wi)]λ}−1

)}− λ2ϕ(−q−2)

σ2[Γ(q−2)]2

∑i∈C

xijwq−2−1i exp(qzi − wi)

×[γ1(q−2, wi)]λ−1

{1− [γ1(q−2, wi)]

λ}ϕ−2 (

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

×{− 1 + ϕ

(1 +

{1− [γ1(q−2, wi)]

λ}ϕ(

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

)},

Page 32: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

812 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

Lλ,λ = − υ

λ2+ (1− ϕ)

∑i∈F

{log[γ1(q−2, wi)]

}2[γ1(q−2, wi)]

λ{

1− [γ1(q−2, wi)]λ}−1

×{

1 + [γ1(q−2, wi)]λ{

1− [γ1(q−2, wi)]λ}−1

}+ ϕ

∑i∈C

{log[γ1(q−2, wi)]

}2

×[γ1(q−2, wi)]λ{

1− [γ1(q−2, wi)]λ}ϕ−1 (

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

×{

1 + [γ1(q−2, wi)]λ{

1− [γ1(q−2, wi)]λ}−1

[1− ϕ

(1 + {1− [γ1

× (q−2, wi)]λ}ϕ (

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

)]},

Lλ,ϕ = −∑i∈F

log[γ1(q−2, wi)][γ1(q−2, wi)]λ{

1− [γ1(q−2, wi)]λ}−1

+∑i∈C

log[γ1(q−2, wi)][γ1(q−2, wi)]λ

{1 +

{1− [γ1(q−2, wi)]

λ}ϕ−1

×(

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

[− 1− ϕ log

{1− [γ1(q−2, wi)]

}×(

1 +{

1− [γ1(q−2, wi)]λ}ϕ (

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

)]},

Lλ,βj = − q−1

σΓ(q−2)

∑i∈F

xijwq−2−1i exp(qzi − wi)[γ1(q−2, wi)]

−1

{1 + (1− ϕ)

×[γ1(q−2, wi)]λ{

1− [γ1(q−2, wi)]λ}−1

(1 + log[γ1(q−2, wi)]

)}− q−1

σΓ(q−2)

∑i∈C

xijwq−2−1i exp(qzi − wi)[γ1(q−2, wi)]

λ−1

×{

1− [γ1(q−2, wi)]λ}ϕ−1 (

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

×{

1 + log[γ1(q−2, wi)]

[1 + λ[γ1(q−2, wi)]

λ{

1− [γ1(q−2, wi)]λ}−1

×(

1− ϕ{

1 +{

1− [γ1(q−2, wi)]λ}ϕ(

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

})]},

Lϕ,ϕ = − υ

ϕ2−∑i∈C

(log{

1− [γ1(q−2, wi)]λ})2 {

1− [γ1(q−2, wi)]λ}ϕ

×(

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

{1 +

{1− [γ1(q−2, wi)]

λ}ϕ

×(

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

},

Page 33: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

The Log-Kumaraswamy Generalized Gamma Regression Model 813

Lϕ,βj =λq−1

σΓ(q−2)

∑i∈F

xijwq−2−1i exp(qzi − wi)[γ1(q−2, wi)]

λ−1

×{

1− [γ1(q−2, wi)]λ}−1− λq−1

σΓ(q−2)

∑i∈C

xijwq−2−1i exp(qzi − wi)

×[γ1(q−2, wi)]λ−1

{1− [γ1(q−2, wi)]

λ}ϕ−1(

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

×{

1 + log{

1− [γ1(q−2, wi)]λ}[

ϕ

(1 +

{1− [γ1(q−2, wi)]

λ}ϕ

×(

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

)]},

Lβj ,βs = − 1

σ2

∑i∈F

xijxis exp(qzi)−q−1

σ2Γ(q−2)

∑i∈F

xijxiswq−2−1i exp(qzi − wi)

×[γ1(q−2, wi)]−1

{(− q + q−1 exp(qzi)

{− (q−2 − 1)w−1

i +wq

−2−1i

Γ(q−2)

× exp(−wi)[γ1(q−2, wi)]−1 + 1

})(− 1 + λ

{1 + (1− ϕ)[γ1(q−2, wi)]

λ

×{

1− [γ1(q−2, wi)]λ}−1

})− λq−1

Γ(q−2)wq

−2−1i exp(qzi − wi)

×[γ1(q−2, wi)]λ−1

{1− [γ1(q−2, wi)]

λ}−1

(1 + [γ1(q−2, wi)]

λ

×{

1− [γ1(q−2, wi)]λ}−1

)}− λϕq−1

σ2Γ(q−2)

∑i∈C

xijxiswq−2−1i exp(qzi − wi)

×[γ1(q−2, wi)]λ−1

{1− [γ1(q−2, wi)]

λ}ϕ−1(

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

×{− q + q−1 exp(qzi)

[− (q−2 − 1)w−1

i + 1 +wq

−2−1i

Γ(q−2)exp(−wi)

×[γ1(q−2, wi)]−1

(1− λ

{1 + [γ1(q−2, wi)]

λ{

1− [γ1(q−2, wi)]λ}−1

[1− ϕ

×(

1 +{

1− [γ1(q−2, wi)]λ}ϕ(

1−{

1− [γ1(q−2, wi)]λ}ϕ)−1

)]})]},

where zi = (yi − xTi β)/σ and wi = q−2 exp(qzi).

Page 34: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

814 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

References

Aarts, R. M. (2000). Lauricella functions. From MathWorld, A Wolfram WebResource, created by Eric W. Weisstein. http://mathworld.wolfram.com/LauricellaFunctions.html.

Ali, M. M., Woo, J. and Nadarajah, S. (2008). Generalized gamma variableswith drought application. Journal of the Korean Statistical Society 37,37-45.

Almpanidis, G. and Kotropoulos, C. (2008). Phonemic segmentation using thegeneralized gamma distribution and small sample Bayesian information cri-terion. Speech Communication 50, 38-55.

Barkauskas, D. A., Kronewitter, S. R., Lebrilla, C. B. and Rocke, D. M. (2009).Analysis of MALDI FT-ICR mass spectrometry data: a time series ap-proach. Analytica Chimica Acta 648, 207-214.

Baker, T. B., Piper, M. E., McCarthy, D. E., Majeskie, M. R. and Fiore, M. C.(2004). Addiction motivation reformulated: an effective processing modelof negative reinforcement. Psychological Bulletin 111, 33-51.

Brandon, T. H., Vidrine, J. I. and Litvin, E. B. (2007). Relapse and relapseprevention. Annual Review of Psychology 3, 257-284.

Cancho, V. G., Bolfarine, H. and Achcar, J. A. (1999). A Bayesian analysisfor the exponentiated Weibull distribution. Journal of Applied Statistics 8,227-242.

Cancho, V. G., Ortega, E. M. M. and Bolfarine, H. (2009). The exponentiated-Weibull regression models with cure rate. Journal of Applied Probability &Statistics 4, 125-156.

Cowles, M. K. and Carlin, B. P. (1996). Markov chain Monte Carlo convergencediagnostics: a comparative review. Journal of the American StatisticalAssociation 91, 883-904.

Carrasco, J. M. F., Ortega, E. M. M. and Cordeiro, G. M. (2008). A gener-alized modified Weibull distribution for lifetime modeling. ComputationalStatistics and Data Analysis 53, 450-462.

Cordeiro, G. M., Ortega, E. M. M. and Silva, G. O. (2011). The exponentiatedgeneralized gamma distribution with application to lifetime data. Journalof Statistical Computation and Simulation 81, 827-842.

Page 35: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

The Log-Kumaraswamy Generalized Gamma Regression Model 815

Cox, D. R. (1972) Regression models and life-tables (with discussion). Journalof the Royal Statistical Society, Series B 34, 187-220

Cox, C. (2008). The generalized F distribution: an umbrella for parametricsurvival analysis. Statistics in Medicine 27, 4301-4312.

Cox, C., Chu, H., Schneider, M. F. and Munoz, A. (2007). Parametric sur-vival analysis and taxonomy of hazard functions for the generalized gammadistribution. Statistics in Medicine 26, 4352-4374.

Exton, H. (1978). Handbook of Hypergeometric Integrals: Theory, Applications,Tables, Computer Programs. Halsted Press, New York.

Gelman, A. and Rubin, D. B. (1992). Inference from iterative simulation usingmultiple sequences (with discussion). Statistical Science 7, 457-472.

Gomes, O., Combes, C. and Dussauchoy, A. (2008). Parameter estimation ofthe generalized gamma distribution. Mathematics and Computers in Sim-ulation 79, 955-963.

Gradshteyn, I. S. and Ryzhik, I. M. (2000). Table of Integrals, Series, andProducts, 6th edition. Edited by A. Jeffrey and D. Zwillinger. AcademicPress, New York.

Gupta, R. D. and Kundu, D. (1999). Generalized exponential distributions.Australian & New Zealand Journal of Statistics 41, 173-188.

Hashimoto, E. M., Ortega, E. M. M., Cancho, V. G. and Cordeiro, G. M. (2010).The log-exponentiated Weibull regression model for interval-censored data.Computational Statistics & Data Analysis 54, 1017-1035.

Kalivas, P. W. and Volkow, N. D. (2005). The neural basis of addiction: apathology of motivation and choice. American Journal Psychiatric 162,1403-1413.

Koob, G. F. and Le Moal, M. (1997). Drug abuse: hedonic homeostatic dysreg-ulation. Science 278, 52-58.

Kundu, D. and Raqab, M. Z. (2005). Generalized Rayleigh distribution: dif-ferent methods of estimation. Computational Statistics and Data Analysis49, 187-200.

Lai, C. D., Xie, M. and Murthy, D. N. P. (2003). A modified Weibull distribu-tion. IEEE Transactions on Reliability 52, 33-37.

Page 36: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

816 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

Lawless, J. F. (2003). Statistical Models and Methods for Lifetime Data. Wiley,New York.

Maisto, S. A., Pollock, N. K., Cornelius, J. R., Lynch, K. G. and Martin, C.S. (2003). Alcohol relapse as a function of relapse definition in a clinicalsample of adolescents. Addictive Behaviors 28, 449-459.

Malhotra, J., Sharma, A. K. and Kaler, R. S. (2009). On the performanceanalysis of wireless receiver using generalized-gamma fading model. Annalsof Telecommunications 64, 147-153.

Marlatt, G. A. (2001). Integrating contingency management with relapse pre-vention skills: comment on Silverman et al. (2001). Experimental andClinical Psychopharmacology 9, 33-34.

Miller, W. R., Westerberg, V. S., Harris, R. J. and Tonigan, J. S. (1996). Whatpredicts relapse? Prospective testing of antecedent models. Addiction 91,155-172.

Mudholkar, G. S., Srivastava, D. K. and Freimer, M. (1995). The exponentiatedWeibull family: a reanalysis of the bus-motor-failure data. Technometrics37, 436-445.

Nadarajah, S. (2008a). On the use of the generalised gamma distribution. In-ternational Journal of Electronic 95, 1029-1032.

Nadarajah, S. (2008b). Explicit expressions for moments of order statistics.Statistics & Probability Letters 78, 196-205.

Nadarajah, S. and Gupta, A. K. (2007). A generalized gamma distribution withapplication to drought data. Mathematics and Computers in Simulation74, 1-7.

Ortega, E. M. M., Bolfarine, H. and Paula G. A. (2003). Influence diagnosticsin generalized log-gamma regression models. Computational Statistics andData Analysis 42, 165-186.

Ortega, E. M. M., Cancho, V. G. and Bolfarine, H. (2006). Influence diagnosticsin exponentiated-Weibull regression models with censored data. Statisticsand Operations Research Transactions 30, 172-192.

Ortega, E. M. M., Paula G. A. and Bolfarine, H. (2008). Deviance residualsin generalized log-gamma regression models with censored observations.Journal of Statistical Computation and Simulation 78, 747-764.

Page 37: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

The Log-Kumaraswamy Generalized Gamma Regression Model 817

Ortega, E. M. M., Cancho, V. G. and Paula, G. A. (2009). Generalized log-gamma regression models with cure fraction. Lifetime Data Analysis 15,79-106.

Ortega, E. M. M., Cordeiro, G. M., Pascoa, M. A. R. and Couto, E. V. (2012).The log-exponentiated generalized gamma regression model for censoreddata. Journal of Statistical Computation and Simulation 82, 1169-1189.

Pascoa, M. A. R. (2008). Intervalos de Credibilidade para a Razao de Riscos doModelo de Cox Considerando Estimativas Pontuais Bootstrap. 71 p. Mastertheses (Master of Science) - Universidade Federal de Lavras, Lavras, MinasGerais, Brazil.

Pascoa, M. A. R., Ortega, E. M. M. and Cordeiro, G. M. (2011). The Ku-maraswamy generalized gamma distribution with application in survivalanalysis. Statistical Methodology 8, 411-433.

Prudnikov, A. P., Brychkov, Y. A. and Marichev, O. I. (1986). Integrals andSeries, volume 1. Amsterdam, Gordon and Breach Science Publishers, NewYork.

Stacy, E. W. (1962). A generalization of the gamma distribution. Annals ofMathematical Statistics 33, 1187-1192.

Stacy, E. W. and Mihram, G. A. (1965). Parameter estimation for a generalizedgamma distribution. Technometris 7, 349-358.

Trott, M. (2006). The Mathematica Guidebook for Symbolics. With 1 DVD–ROM (Windows, Macintosh and UNIX). Springer, New York.

Xie, M. and Lai, C. D. (1995). Reliability analysis using an additive Weibullmodel with bathtub-shaped failure rate function. Reliability Engineering &System Safety 52, 87-93.

Wright, E. M. (1935). The asymptotic expansion of the generalized Bessel func-tion. Proceedings of the London Mathematical Society 38, 257-270.

Received June 29, 2012; accepted May 8, 2013.

Marcelino A. R. PascoaDepartamento de EstatısticaUniversidade Federal de Mato Grosso78060-900, Cuiaba, MT, [email protected]

Page 38: 782 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M. as special models the exponential, Weibull, gamma and Rayleigh distributions, among

818 Pascoa, M. A. R., de Paiva, C. M. M., Cordeiro, G. M. and Ortega, E. M. M.

Claudia M. M. de PaivaDepartamento de PsicologiaUniversidade Federal de Sao Joao del Rei36307-352, Sao Joao Del Rei, MG, [email protected]

Gauss M. CordeiroDepartamento de EstatısticaUniversidade Federal de Pernambuco52171-900, Recife, PE, [email protected]

Edwin M. M. OrtegaDepartamento de Ciencias ExatasUniversidade de Sao Paulo13418-900, Piracicaba, SP, [email protected]