
Sankhyā: The Indian Journal of Statistics, 2012, Volume 74-B, Part 1, pp. 1-14. © 2012, Indian Statistical Institute

Extension of the Harrington-Fleming tests to multistate models

Prabhanjan N. Tattar
Dell International Services, Bangalore, India

H.J. Vaman
Central University of Rajasthan, Rajasthan, India

Abstract

Harrington-Fleming tests are popular in the classical time-to-event clinical trial studies. Such weighted tests are sometimes preferred alternatives to the powerful log-rank tests. It is thus natural that the Harrington-Fleming tests should be extended to multistate models. This forms the main objective of this paper. The results of this paper, through a simulation study and applications to real data sets, suggest that such a generalization is indeed worthwhile.

AMS (2000) subject classification. Primary 62N01; Secondary 62N03.

Keywords and phrases. Extended Harrington-Fleming test statistics, multistate models, competing risks model.

1 Introduction

The log-rank, Peto-Peto, and Wilcoxon tests, and the class of Harrington-Fleming tests form a useful array of methodologies in the classical time-to-event studies in clinical trials for testing the null hypothesis that the survival function, equivalently the hazard rate function, equals some specific form of interest. A time-to-event study involves two states: "alive" and "dead." But these studies are a particular case of the general multistate models. A generalization of the tests to the multistate models finds important applications and this forms the core focus of this paper.

Multistate models have been found to have wide applicability in many problems of survival analysis; Health Related Quality of Life (HRQoL) and Competing Risks Models are two examples. Hougaard (2000) contains many applications of the multistate models in survival analysis, and a flowgraph approach to a multistate model with reference to reliability theory is given in Huzurbazar (2005). The counting process approach to inference problems in such models is given in Andersen et al. (1993), abbreviated as ABGK in the rest of this paper. Our main interest is in problems of testing the transition probability matrix (t.p.m.) associated with such models under the assumption of Markov dependence.

Prabhanjan N. Tattar was working at CustomerXPs Software Private Limited up to acceptance of the paper.

A competing risks model is used in studies involving individuals exposed to various identifiable causes of death. For example, in the Necropsy Data on the Radiation-Exposed Male Mice from Example I.3.8 of Andersen et al. (1993), death may be due to thymic lymphoma (cause 1), reticulum cell sarcoma (cause 2), or some other reason (cause 3), and it may be of interest to the clinician to test if all the three causes are equally likely, or follow some other expected pattern. Beyersmann, Allignol, and Schumacher (2012) give a detailed introduction to the competing risks model.

Wherever multistate models are applicable, the data consist of sample paths, some of which are observed until a certain terminal state, and others censored. Tattar and Vaman (2008) have developed, assuming a Markovian setup, a test statistic for the hypothesis that the t.p.m. is equal to a specified one that is based either on the past data pertaining to a standard treatment or on the predictions of the clinician. In this paper, we extend this test statistic, proposed in equation (3.2), by cumulating the differences between the incremental estimated values of the cumulative hazard function and the corresponding hypothesized values through a weight process. Suppose that the state-space of the multistate model consists of $k$ elements. Then the test statistic consists of $k \times (k-1)$ elements, where each element, as defined in equation (3.1), consists of two terms which are stochastic integrals. The first term integrates the weight process under the estimated cumulative hazard function, and is evaluated as the sum of realizations of the weight process over the transition times. The second term on the right-hand side of equation (3.1) integrates the weight process under the hypothesized cumulative transition functions. The choice of weight process is important since every choice gives rise to a distinct family of test statistics. For example, taking the "at-risk process" as the weight process leads to the well-known log-rank test statistic. In this paper, we consider the weight process as a function of the hypothesized transition probabilities in a counting process framework, which yields the family of Harrington-Fleming type test statistics for a multistate model, and in Section 4 we develop the corresponding bootstrap methods. Numerical methods are needed to evaluate the test statistic, and we have developed a program in the R statistical software for this purpose. A simulation study has been carried out to understand the usefulness of the test statistics. A bootstrap version of the test statistic is also discussed. The tests are applied to the Necropsy Data set and to the Ludwig Trial V data provided to us by the International Breast Cancer Study Group.

2 Stochastic framework for the multistate model and the testing problem

In clinical trials, one does not necessarily observe the end point for all the observations in the study, and hence the data are generally censored to a certain degree. If we let $Z_1, Z_2, \cdots, Z_n$ denote the times to "failure" for $n$ individuals, and $C_1, C_2, \cdots, C_n$ the associated censoring variables, the observed data consist of $X_r = Z_r \wedge C_r$ and $\delta_r = I\{Z_r \le C_r\}$, $r = 1, \ldots, n$. In some clinical trials, however, an individual may transit among various states before reaching an absorbing end-point state, and it is possible to unambiguously identify a subject as being in one of $(k+1)$ "states" at a given instant. If the state space is denoted by $S = \{1, 2, \cdots, k, 0\}$, state 0 being an absorbing state, we can use a continuous-time function $V$ for indicating the health state of an individual at time $t$, that is, $V(\cdot): \mathbb{R}^+ \to S$. The nature of the state space $S$ varies according to the type of the study under consideration. For example, in the competing risks model, $1, 2, \cdots, k$ are $k$ different causes of death, while 0 indicates the state of being alive.

Let $\mathbf{N}(t) = \{N_{ij}(t)\}$ denote a matrix-valued counting process, with $N_{ij}(t)$ counting the number of transitions from state $i$ to $j$ at time $t$. The number of individuals "at-risk" at time $t$ and in state $i$ is a continuous time process denoted by $Y_i(t)$. To illustrate the terms here, let us consider the simple case where we have only two subjects and $S = \{1, 2, 3, 0\}$. Suppose that both the subjects are observed in State 1 at $t = 0$, and the sample path for subject 1 is $1 \to 2 \to 3 \to 0$ at times 12, 13, and 24, whereas the path for subject 2 is $1 \to 2 \to 0$ at times 5 and 30. Then at the transition instants, the counting process $N_{ij}(t)$ is $N_{12}(5) = 1$, $N_{12}(12) = 1$, $N_{23}(13) = 1$, $N_{30}(24) = 1$, $N_{20}(30) = 1$, and the at-risk process $Y_i(t)$ is $Y_1(5) = 2$, $Y_1(12) = 1$, $Y_2(13) = 2$, $Y_3(24) = 1$, $Y_2(30) = 1$. We assume that the transitions follow a Markov process. Define $\alpha_{ij}(t)$ to be the transition rate from health state $i$ to health state $j$ at time $t$, that is,

$$\alpha_{ij}(t) = \lim_{h \downarrow 0} \frac{P\{V(t) = j \mid V(t-h) = i\}}{h}.$$

Further, let the matrix of cumulative intensities $A_{ij}(t) = \int_0^t \alpha_{ij}(s)\,ds$ be denoted by $\mathbf{A}(t) = \{A_{ij}(t)\}$, where

$$A_{ii}(t) = -\sum_{j \ne i} A_{ij}(t). \tag{2.1}$$
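The two-subject illustration above can be checked mechanically. The following is a minimal Python sketch, not the authors' R program: the path encoding as `(time, state)` pairs and the helper names `sojourns`, `at_risk`, and `transition_counts` are assumptions made for illustration, and the sketch handles only fully observed paths (no censoring).

```python
from collections import defaultdict

# Hypothetical encoding: each path is a list of (time, state) pairs, starting at time 0.
PATHS = {
    "subject 1": [(0, 1), (12, 2), (13, 3), (24, 0)],
    "subject 2": [(0, 1), (5, 2), (30, 0)],
}


def sojourns(path):
    """Yield (entry_time, exit_time, state, next_state) for each completed stay."""
    for (t_in, state), (t_out, nxt) in zip(path, path[1:]):
        yield t_in, t_out, state, nxt


def at_risk(paths, state, t):
    """Y_state(t): number of subjects occupying `state` just before time t."""
    return sum(
        1
        for path in paths.values()
        for t_in, t_out, s, _ in sojourns(path)
        if s == state and t_in < t <= t_out
    )


def transition_counts(paths):
    """Map (i, j, t) to the number of i -> j transitions observed at time t."""
    counts = defaultdict(int)
    for path in paths.values():
        for _, t_out, i, j in sojourns(path):
            counts[(i, j, t_out)] += 1
    return dict(counts)


counts = transition_counts(PATHS)
risk = {(i, t): at_risk(PATHS, i, t) for (i, _, t) in counts}

# List the jumps in time order together with the at-risk count just before each jump.
for (i, j, t), n in sorted(counts.items(), key=lambda kv: kv[0][2]):
    print(f"t={t}: N{i}{j} jumps by {n}, Y{i}={risk[(i, t)]}")
```

The at-risk count uses the left-continuous convention (a subject who leaves state $i$ at time $t$ is still counted in $Y_i(t)$), which is what reproduces the values quoted in the illustration.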

If $\mathbf{P}(s,t) = \{P_{ij}(s,t)\}$ is the t.p.m. of the nonhomogeneous Markov process associated with the multistate model, then we have the following relation:

$$\mathbf{P}(s,t) = \prod_{(s,t]} \{\mathbf{I} + d\mathbf{A}(u)\},$$

where $\prod$ denotes the product integral.

We consider the problem of testing $H: \mathbf{P}(s,t) = \mathbf{P}_0(s,t)$, $0 \le s \le t \le \tau$, or equivalently $H_A: \mathbf{A}(t) = \mathbf{A}_0(t)$, $0 \le t \le \tau$. The time point $\tau$ is taken as any time instant greater than the largest censored time point. The t.p.m. $\mathbf{P}_0$ may correspond to a placebo or it may be as hypothesized by a clinical expert. We assume throughout the paper that $\mathbf{P}_0$, equivalently $\mathbf{A}_0$, is continuous. The construction of test statistics for this problem, under the assumption of the multiplicative intensity model $\lambda_{ij}(t) = Y_i(t)\alpha_{ij}(t)$, $t > 0$, is developed in the next section, where we also discuss various choices of the weight process. The counting process approach described in this section helps us derive the asymptotic properties of the test statistics by application of standard results on martingales.

3 Multistate version of the Harrington-Fleming family of tests

The problem of testing the transition probability matrix (t.p.m.) is considered here. In the first subsection we describe the construction of the test statistic. This is followed by a discussion of the weight process, and finally the asymptotic distribution of the test statistic.

3.1. Construction of the test statistic. Let $\hat{\mathbf{A}}(t) = (\hat{A}_{ij}(t))_{i,j}$ denote the matrix of Nelson-Aalen integrals, that is, $\hat{A}_{ij}(t) = \int_0^t \frac{J_i(s)}{Y_i(s)}\,dN_{ij}(s)$, where $J_i(s) \equiv I(Y_i(s) > 0)$. Then the t.p.m. $\mathbf{P}(s,t)$ can be estimated by the empirical transition probability matrix

$$\hat{\mathbf{P}}(s,t) = \prod_{(s,t]} \left(\mathbf{I} + d\hat{\mathbf{A}}(u)\right),$$

where $\hat{\mathbf{A}}(t) = (\hat{A}_{ij}(t))_{i,j \in S}$. The estimated t.p.m. $\hat{\mathbf{P}}(s,t)$ is the well-known Aalen-Johansen estimator.
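For the two-subject illustration of Section 2, the Nelson-Aalen increments and the resulting product can be written out in a few lines. This is a hedged Python sketch only: the `EVENTS` tuples hard-code the transition times and at-risk counts quoted in that illustration, and the helper names (`identity`, `matmul`, `aalen_johansen`) are hypothetical, not part of the authors' R program.

```python
STATES = [1, 2, 3, 0]
IDX = {state: pos for pos, state in enumerate(STATES)}

# (time, from_state, to_state, number at risk in from_state), taken from the
# two-subject illustration of Section 2.
EVENTS = [(5, 1, 2, 2), (12, 1, 2, 1), (13, 2, 3, 2), (24, 3, 0, 1), (30, 2, 0, 1)]


def identity(n):
    return [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)] for r in range(n)]


def aalen_johansen(events, s, t):
    """Product of (I + dA_hat(u)) over the transition times u in (s, t]."""
    n = len(STATES)
    p = identity(n)
    for u in sorted({e[0] for e in events}):
        if not (s < u <= t):
            continue
        step = identity(n)
        for time, i, j, y in events:
            if time == u and y > 0:
                inc = 1.0 / y                    # Nelson-Aalen increment dN_ij / Y_i
                step[IDX[i]][IDX[j]] += inc
                step[IDX[i]][IDX[i]] -= inc      # diagonal: minus the row sum, as in (2.1)
        p = matmul(p, step)
    return p


p_hat = aalen_johansen(EVENTS, 0, 24)
print([round(x, 3) for x in p_hat[IDX[1]]])      # row of state 1, ordered as STATES
```

At $t = 24$ one subject has reached state 0 and the other is still in state 2, and the estimated row for state 1 reflects exactly that split.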

As in Tattar and Vaman (2008), we use the fact that $\hat{\mathbf{A}}$ is uniformly consistent and asymptotically normal for $\mathbf{A}$:

$$\sup_{s \in [0,t]} \left\| \hat{\mathbf{A}}(s) - \mathbf{A}(s) \right\| \xrightarrow{P} 0,$$

and

$$\sqrt{n}\left(\hat{\mathbf{A}}(t) - \mathbf{A}(t)\right) \xrightarrow{D} \mathbf{U}(t),$$

as $n \to \infty$, where $\|\cdot\|$ denotes the norm of a matrix: if $\mathbf{B} = ((b_{ij}))$, then $\|\mathbf{B}\| = \sup_i \sum_j |b_{ij}|$. Here, $\mathbf{U}(t)$ is a $(k+1) \times (k+1)$ matrix-valued process with the elements $U_{hj}$, $h \ne j$, $h, j \in S$, being independent Gaussian martingales, and $U_{hh} = -\sum_{j \ne h} U_{hj}$.

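The problem of interest in the paper is the following:

$$H: \mathbf{P}(s,t) = \mathbf{P}_0(s,t), \quad 0 \le s \le t \le \tau.$$

The hypothesis $H$ is also equivalent to $H_A: \mathbf{A}(t) = \mathbf{A}_0(t)$, $0 \le t \le \tau$, with $\mathbf{A}_0$ assumed to be continuous. Now, define $A^*_{ij}(t) = \int_0^t J_i(u)\alpha_{ij}(u)\,du$ and $\mathbf{A}^*(t) = \{A^*_{ij}(t)\}$, and write $\mathbf{A}^*_0 = \{A^*_{0ij}\}$ for the same quantity computed with the hypothesized rates $\alpha_{0ij}$.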

To construct test statistics with reasonably good properties, we need $\hat{\mathbf{A}} - \mathbf{A}^*_0$ to be a locally square-integrable martingale. This property holds in the light of the following lemma.

Lemma 3.1 (Tattar and Vaman, 2008). Under $H_A: \mathbf{A} = \mathbf{A}_0$, $\hat{\mathbf{A}} - \mathbf{A}^*_0$ is a matrix-valued local square-integrable martingale.

A simple test statistic can be $\int_0^t \mathbf{1}^T_{k^2 \times 1}\, d\,\mathrm{vec}(\hat{\mathbf{A}} - \mathbf{A}^*_0)(s)$, where $\mathbf{1}^T_{k^2 \times 1}$ is a row vector of 1's and the operator vec stacks the columns of a matrix into a single column vector: the first column is followed by the second, then the third, and so on. A simpler form of this statistic is $\sum_{i=1}^{k} \sum_{j=1}^{k} \int_0^t d(\hat{A}_{ij} - A^*_{0ij})(s)$. Note that the integral term compares the increments of the Nelson-Aalen estimator $\hat{A}_{ij}$ of the cumulative hazard function with those of its hypothesized value $A^*_{0ij}$; we then sum over all transitions out of the transient state $i$ to other states, and the sum is finally aggregated over $i$. However, we get a fairly general class of test statistics if we introduce some weight processes as the integrand. The weight process may consist of functions of the at-risk process, or of transition probabilities given by the hypothesis of interest.

Let $K_{ij}(s)$, for $i, j \in S$, denote locally bounded nonnegative predictable weight processes associated with the pair $(i, j)$ of health states. Now, consider

$$Z_{ij}(t) = \int_0^t K_{ij}(s)\, d(\hat{A}_{ij} - A^*_{0ij})(s) = \int_0^t K_{ij}(s)\, d\hat{A}_{ij}(s) - \int_0^t K_{ij}(s)\, dA^*_{0ij}(s), \tag{3.1}$$

$\forall i, j \in S$. Since $\hat{A}_{ij} - A^*_{0ij}$ is a local square-integrable martingale and $K_{ij}(s)$ is a predictable process, $Z_{ij}(t)$ is simply a martingale transform and is thus a local square-integrable martingale. It is also shown in Lemma 3.2 of Section 3.3 that $Z_{ij}(t)/\langle Z_{ij} \rangle^{1/2}(t)$ asymptotically follows the standard normal distribution.

Define, for each $i \in S$,

$$\mathbf{Z}_i(t) = \left(Z_{i1}(t), Z_{i2}(t), \cdots, Z_{i(i-1)}(t), Z_{i(i+1)}(t), \cdots, Z_{ik}(t)\right).$$

Note that $\mathbf{Z}_i$ excludes the element $Z_{ii}(t) = \int_0^t K_{ii}(s)\, d(\hat{A}_{ii} - A^*_{0ii})(s)$. This is due to the relation given in (2.1) that $A_{ii} = -\sum_{j \ne i} A_{ij}$, and also because the estimated state occupation probabilities in the Aalen-Johansen estimator are constrained by the off-diagonal elements, $\hat{A}_{ii} = -\sum_{j \ne i} \hat{A}_{ij}$. In other words, the terms $Z_{ii}$ do not give any additional information. We will use the $\mathbf{Z}_i$'s to construct the proposed class of test statistics. Towards this, define

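$$\mathbf{Z}(t) = (\mathbf{Z}_1(t), \mathbf{Z}_2(t), \cdots, \mathbf{Z}_k(t)),$$

and consider

$$\mathbf{Z}(t)\, \langle \mathbf{Z} \rangle^{-1}(t)\, \mathbf{Z}^T(t)$$

as a test statistic, where $\langle \mathbf{Z} \rangle(t)$ is the predictable variation process of $\mathbf{Z}$, and is obtained from the standard results on stochastic calculus for vector martingales. An explicit expression for the test statistic is

$$\begin{aligned}
\mathbf{Z}(t)\, \langle \mathbf{Z} \rangle^{-1}(t)\, \mathbf{Z}^T(t)
&= \sum_{i=1}^{k} \sum_{j \ne i}^{k} \frac{Z_{ij}^2(t)}{\langle Z_{ij} \rangle(t)} \\
&= \sum_{i=1}^{k} \sum_{j \ne i}^{k} \frac{\left(\int_0^t K_{ij}(s)\, d(\hat{A}_{ij} - A^*_{0ij})(s)\right)^2}{\int_0^t K_{ij}^2(s)\, d\langle \hat{A}_{ij} - A^*_{0ij} \rangle(s)},
\end{aligned} \tag{3.2}$$

consisting of $k \times (k-1)$ expressions.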
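A single term of (3.2) can be evaluated numerically once the weight is of the form $K(s) = Y(s)\,w(s)$ with a deterministic function $w$. The Python sketch below is an illustration under simplifying assumptions, not the authors' R implementation: it treats one transition type out of a single transient state, takes $Y(s)$ as the number of subjects whose observed time is at least $s$, and replaces the integrals by a midpoint rule. The function name `weighted_component` and its arguments are hypothetical.

```python
def weighted_component(times, is_event, alpha0, w, t, steps=2000):
    """Return (Z, <Z>) at time t for the weight K(s) = Y(s) * w(s).

    times    : observed exit times from the transient state (event or censoring)
    is_event : True where the exit is a transition of the type being tested
    alpha0   : hypothesized transition rate, a function of time
    w        : deterministic part of the weight, a function of time
    """
    def at_risk(s):
        return sum(1 for x in times if x >= s)

    # First term of (3.1): K / Y = w, summed over the observed transition times.
    jump_sum = sum(w(x) for x, d in zip(times, is_event) if d and x <= t)

    h = t / steps
    compensator = 0.0   # second term of (3.1): integral of Y * w * alpha0
    variation = 0.0     # denominator of (3.2): integral of K^2 * alpha0 / Y
    for m in range(steps):
        s = (m + 0.5) * h
        y = at_risk(s)
        compensator += y * w(s) * alpha0(s) * h
        variation += y * w(s) ** 2 * alpha0(s) * h
    return jump_sum - compensator, variation


# Toy data (made up for illustration): four exits, all of them transitions.
z, var = weighted_component(
    times=[10, 20, 30, 40],
    is_event=[True, True, True, True],
    alpha0=lambda s: 0.01,
    w=lambda s: 1.0,
    t=50,
)
print(z, var, z * z / var)
```

With $w \equiv 1$ the numerator reduces to observed minus expected transitions and the denominator to the expected number, so this toy call is a quick sanity check of the integration.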

3.2. Choice of the weight process. Let $\mathbf{P}_0 = \{P_{0ij}\}$ be the t.p.m. under the null hypothesis, assumed to be continuous. The choice $K_{ij} = Y_i\,[P_{0ij}]^{a_{ij}}\,[1 - P_{0ij}]^{b_{ij}}$, $a_{ij} > 0$, $b_{ij} > 0$, $i, j \in S$, leads to an extension of the Harrington-Fleming family of test statistics for the multistate model:

$$\mathbf{Z}_{HF}(t)\, \langle \mathbf{Z}_{HF} \rangle^{-1}(t)\, \mathbf{Z}_{HF}^T(t) = \sum_{i=1}^{k} \sum_{j \ne i}^{k} \frac{\left(\int_0^t Y_i(s)\,[P_{0ij}(s)]^{a_{ij}}\,[1 - P_{0ij}(s)]^{b_{ij}}\, d(\hat{A}_{ij} - A^*_{0ij})(s)\right)^2}{\int_0^t Y_i^2(s)\,[P_{0ij}(s)]^{2a_{ij}}\,[1 - P_{0ij}(s)]^{2b_{ij}}\, d\langle \hat{A}_{ij} - A^*_{0ij} \rangle(s)}.$$

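Also, if $K_{ij} = Y_i\,[P_{0ij}]^{a_{ij}}$, that is, for the limiting case $b_{ij} \downarrow 0$, for all $i, j \in S$, we get the multistate version of the Wilcoxon statistic:

$$\mathbf{Z}_{W}(t)\, \langle \mathbf{Z}_{W} \rangle^{-1}(t)\, \mathbf{Z}_{W}^T(t) = \sum_{i=1}^{k} \sum_{j \ne i}^{k} \frac{\left(\int_0^t Y_i(s)\,[P_{0ij}(s)]^{a_{ij}}\, d(\hat{A}_{ij} - A^*_{0ij})(s)\right)^2}{\int_0^t Y_i^2(s)\,[P_{0ij}(s)]^{2a_{ij}}\, d\langle \hat{A}_{ij} - A^*_{0ij} \rangle(s)}.$$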

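Further, the log-rank test statistic is a limiting case with $a_{ij} \downarrow 0$ for all $i, j \in S$, leading to the choice $K_{ij} = K_i = Y_i$. If we denote by $E_{0ij}(t)$ the expected number of transitions from state $i$ to state $j$ up to time $t$, then

$$\mathbf{Z}_{LR}(t)\, \langle \mathbf{Z}_{LR} \rangle^{-1}(t)\, \mathbf{Z}_{LR}^T(t) = \sum_{i=1}^{k} \sum_{j \ne i}^{k} \frac{\left(N_{ij}(t) - E_{0ij}(t)\right)^2}{E_{0ij}(t)}.$$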

This is a chi-square goodness-of-fit statistic, and such a statistic has also been suggested by Kalbfleisch and Lawless (1985) for panel data analysis in a parametric framework. Another interesting choice, namely $K_{ij} = K_i = Y_i^{1/2}$, leads us to a generalization of the Tarone-Ware test statistic for the multistate models. Thus, we see that the choice of the weight process gives a wide class of test statistics. The main interest in this paper is the Harrington-Fleming family of test statistics, and we elaborate further after establishing the asymptotic distribution of $\mathbf{Z}(t)\, \langle \mathbf{Z} \rangle^{-1}(t)\, \mathbf{Z}^T(t)$.
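The four weight choices of this subsection differ only in how the at-risk count and the hypothesized transition probability enter $K_{ij}$. A small Python sketch makes the comparison concrete; the function name `weight` and its string labels are hypothetical conveniences, and `y` and `p0` stand for the values $Y_i(s)$ and $P_{0ij}(s)$ at a single time point.

```python
def weight(kind, y, p0=None, a=1.0, b=1.0):
    """Value of the weight K_ij at one time point.

    y  : number at risk in state i at that time, Y_i(s)
    p0 : hypothesized transition probability P_0ij(s) (unused for some choices)
    """
    if kind == "log-rank":
        return float(y)                           # K = Y
    if kind == "tarone-ware":
        return float(y) ** 0.5                    # K = Y^(1/2)
    if kind == "wilcoxon":
        return y * p0 ** a                        # K = Y * P0^a
    if kind == "harrington-fleming":
        return y * p0 ** a * (1.0 - p0) ** b      # K = Y * P0^a * (1 - P0)^b
    raise ValueError(f"unknown weight: {kind}")


for kind in ("log-rank", "tarone-ware", "wilcoxon", "harrington-fleming"):
    print(kind, weight(kind, y=9, p0=0.5, a=0.5, b=2.0))
```

Letting `b` shrink towards zero in the Harrington-Fleming choice recovers the Wilcoxon-type weight numerically, in line with the limiting case described above.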

3.3. Asymptotic null distribution of the test statistic. To derive the asymptotic distribution of the test statistic $\mathbf{Z}(t)\, \langle \mathbf{Z} \rangle^{-1}(t)\, \mathbf{Z}^T(t)$ under $H$, we first prove that $Z_{ij}(t)/\langle Z_{ij} \rangle^{1/2}(t)$, for all $i, j \in S$, is asymptotically distributed as a standard normal variate. Let us assume the multiplicative model $\lambda^{(n)}_{ij} = \alpha^{(n)}_{ij} Y^{(n)}_i$, for each $i, j \in S$, for the intensity function of the counting process $N^{(n)}_{ij}$, $n = 1, 2, \cdots$; the superscript $(n)$ indicates that the expression is the empirical version based on a sample of size $n$. Under $H$, assume that there exist, for each state $i$, a sequence of constants $\{c_{in}\}$ and nonnegative functions $k_{ij}$ and $y_i$ such that for all $i \ne j$, the functions $k_{ij}^2 \alpha_{0ij} / y_i$ are well defined and integrable on $[0, \tau]$, and satisfy the following:

$$c_{in}^2 \int_0^t \frac{\left(K^{(n)}_{ij}(s)\right)^2}{Y^{(n)}_i(s)}\, \alpha_{0ij}(s)\, ds \xrightarrow{P} \int_0^t \frac{\left(k_{ij}(s)\right)^2}{y_i(s)}\, \alpha_{0ij}(s)\, ds = \sigma^2_{0ij}(t) \text{ (say)}, \quad \text{as } n \to \infty, \tag{3.3}$$

and also assume that for all $\varepsilon > 0$ and for all states $i$, as $n \to \infty$, the following is satisfied:

$$c_{in}^2 \int_0^t \frac{\left(K^{(n)}_{ij}(s)\right)^2}{Y^{(n)}_i(s)}\, \alpha_{0ij}(s)\, I\left\{ \left| \frac{\left(K^{(n)}_{ij}(s)\right)^2}{Y^{(n)}_i(s)} \right| > \varepsilon \right\} ds \xrightarrow{P} 0. \tag{3.4}$$

Lemma 3.2. If the conditions stated in (3.3) and (3.4) hold, then as $n \to \infty$,

$$\frac{Z_{ij}(t)}{\langle Z_{ij} \rangle^{1/2}(t)} \sim N(0, 1),$$

with $N(0, 1)$ denoting the standard normal distribution.

Proof. As noted earlier, $Z_{ij}(t)$ is a local square-integrable martingale, and its predictable variation process is

$$\langle Z_{ij} \rangle^{(n)}(t) = \int_0^t \left(K^{(n)}_{ij}(s)\right)^2 d\langle \hat{A}_{ij} - A^*_{0ij} \rangle(s) = \int_0^t \left(K^{(n)}_{ij}(s)\right)^2 \frac{\alpha_{0ij}(s)}{Y_i(s)}\, ds.$$

We see that under the assumptions of (3.3) and (3.4), the conditions of Rebolledo's martingale central limit theorem are satisfied, and hence we conclude that $Z_{ij}(t)/\langle Z_{ij} \rangle^{1/2}(t)$ is asymptotically distributed as a standard normal variate.

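The asymptotic distribution of $Z_{ij}(t)/\langle Z_{ij} \rangle^{1/2}(t)$ can be directly used to obtain the asymptotic distribution of $\mathbf{Z}(t)\, \langle \mathbf{Z} \rangle^{-1}(t)\, \mathbf{Z}^T(t)$, and this is given in the theorem below.

Theorem 3.1. As $n \to \infty$,

$$\mathbf{Z}(t)\, \langle \mathbf{Z} \rangle^{-1}(t)\, \mathbf{Z}^T(t) \sim \chi^2(\nu),$$

where $\nu = k \times (k-1)$.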

This result is proved by a straightforward application of Lemma 3.2. Note that the asymptotic distribution of $\mathbf{Z}(t)\, \langle \mathbf{Z} \rangle^{-1}(t)\, \mathbf{Z}^T(t)$ is independent of the choice of the weight process $K_{ij}$, and the only requirement is that $K_{ij}$ should be a predictable process. Consequently, it follows that the families of statistics due to Tarone-Ware and Harrington-Fleming have this asymptotic property. In Section 5, we illustrate, through a simulation study, the Harrington-Fleming test statistic and other variants, and in Section 6, the application to real data sets: the Necropsy Data of the causes of death in Radiation-Exposed Male Mice, and the Ludwig Trial V data. The local asymptotic power of the proposed tests follows as an extension of the discussion of Section 5 of Tattar and Vaman (2008).
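Theorem 3.1 turns an observed value of the statistic into an asymptotic p-value through the upper tail of a chi-square distribution. As a hedged illustration in Python (the paper's own computations were done in R), the tail probability for integer degrees of freedom can be computed with the standard finite-sum formulas using only `math`; the function name `chi2_sf` and the example value of the statistic are assumptions for illustration.

```python
import math


def chi2_sf(x, df):
    """Upper tail probability P(chi-square with integer df > x)."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r=0}^{df/2 - 1} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= x / (2 * r)
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite correction sum.
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2))            # P(|N(0,1)| > sqrt(x))
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)                  # x^((2r-1)/2) / (1*3*...*(2r-1))
        total += term
    return tail + 2 * density * total


k = 3                                  # number of states, so nu = k * (k - 1)
statistic = 12.6                       # made-up value of the test statistic
print(chi2_sf(statistic, k * (k - 1)))
```

For the competing risks setting of Sections 5 and 6, where only $k$ transitions are possible, the second argument would be $k$ rather than $k(k-1)$.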

4 Bootstrap methods for the testing problem

The log-rank test is asymptotically optimal, in the sense of power, if the sequence of alternatives is from the proportional hazards model, whereas the Harrington-Fleming type tests are asymptotically optimal if the sequence of alternatives is from the family of extreme-value distributions. In general, it is difficult to verify these assumptions. Further, it is also not the case that one class of tests is better than the other under all possible data generating mechanisms. Thus, we may need to resort to resampling techniques for studying the significance of the tests.

Bootstrap techniques are very popular among practitioners and require minimal assumptions about the underlying data generating mechanisms. To illustrate the method, let us recall the simple example stated in Section 2. Again, consider the sample paths: Patient 1 transits the states $1 \to 2 \to 3 \to 0$ at times 12, 13, and 24, whereas for Patient 2 the path is $1 \to 2 \to 0$ at times 5 and 30. We can resample the data on the two patients with replacement, and suppose that we obtain Patient 1 twice. Then, for the bootstrap sample 1, we have $N^1_{12}(12) = 2$, $N^1_{23}(13) = 2$, $N^1_{30}(24) = 2$, whereas $Y^1_1(12) = 2$, $Y^1_2(13) = 2$, $Y^1_3(24) = 2$. Based on the bootstrap sample 1 and the quantities $N^1_{12}(\cdot)$, $Y^1_1(\cdot)$, etc., we compute the test statistic $\mathbf{Z}^1(t)\, \langle \mathbf{Z}^1 \rangle^{-1}(t)\, (\mathbf{Z}^1)^T(t)$. We repeat the process a large number of times, say $B$, and additionally obtain the statistics for the bootstrap samples $2, \ldots, B$. Appropriate statistical inference may be drawn from the sampling distribution of

$$\mathbf{Z}^1(t)\, \langle \mathbf{Z}^1 \rangle^{-1}(t)\, (\mathbf{Z}^1)^T(t),\ \mathbf{Z}^2(t)\, \langle \mathbf{Z}^2 \rangle^{-1}(t)\, (\mathbf{Z}^2)^T(t),\ \ldots,\ \mathbf{Z}^B(t)\, \langle \mathbf{Z}^B \rangle^{-1}(t)\, (\mathbf{Z}^B)^T(t).$$

Note that when we use the classical nonparametric bootstrap mechanism, we are bound to end up with ties. In such a case, the statistical solutions have to be rigorously extended for incorporating the ties. However, in this paper we have not developed elaborate methods for tackling the ties. For each bootstrapped sample path, a bit of noise may be added for breaking the ties, and thus an ad hoc solution may be used. In the R statistical software employed in our illustrations in Sections 5 and 6, noise has been added to the bootstrapped sample paths using the "jitter" function for handling the ties. For example, if three observations are tied at time $t = 16$, then the ties are broken by the jitter function to 15.96, 16.03, and 16.08, say. A rigorous treatment of the ties is a subject of future research.
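The resampling step with ad hoc tie-breaking can be sketched as follows. This is a Python illustration under stated assumptions and does not reproduce R's `jitter`: here every resampled path receives one small uniform shift applied to all of its transition times, which breaks ties between duplicated paths while preserving the order of transitions within a path. The name `bootstrap_sample` and the `amount` parameter are hypothetical.

```python
import random

# The two sample paths of Section 2, as (time, state) pairs starting at time 0.
PATHS = [
    [(0, 1), (12, 2), (13, 3), (24, 0)],
    [(0, 1), (5, 2), (30, 0)],
]


def bootstrap_sample(paths, rng, amount=0.1):
    """Resample whole paths with replacement and shift each copy slightly."""
    sample = []
    for _ in range(len(paths)):
        path = rng.choice(paths)
        shift = rng.uniform(-amount, amount)   # one shift per resampled path
        # Time 0 (the common starting point) is left untouched.
        sample.append([(t + shift if t > 0 else t, s) for t, s in path])
    return sample


rng = random.Random(2012)
B = 3
for b in range(1, B + 1):
    print(b, bootstrap_sample(PATHS, rng))
```

Each bootstrap sample would then be passed to whatever routine computes the test statistic, and the $B$ resulting values summarized as described above.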

5 A simulation study for the extended Harrington-Fleming and other tests

The behavior of the test statistics for different choices of the weight process is briefly investigated here. We consider the case of competing risks and the Necropsy Data on the Radiation-Exposed Male Mice from Andersen et al. (1993). Suppose that the failure times for the three causes are exponentially distributed with failure rates $1/\theta_i$, $i = 1, 2, 3$. For the chosen combination of sample size and hazard rate, a sample is simulated and the values of the log-rank, Wilcoxon, and Harrington-Fleming (two combinations) statistics are obtained, taking the null hypothesis as $\theta_i = 45$, $i = 1, 2, 3$. The associated p-values are obtained taking the null distributions as asymptotically chi-squared with three degrees of freedom. This process is replicated 100 times for each combination and the average p-value is reported in Table 1.

We note that (i) when the three competing risks have failure rates (1/30, 1/45, 1/60), all the test statistics lead to rejection of the null hypothesis, (ii) the p-values corresponding to the log-rank test are smaller than those generated by using the Wilcoxon test and the two chosen Harrington-Fleming tests, and (iii) if the true values of the parameters are all equal to 1/45, or nearly equal, (1/40, 1/45, 1/50), none of the tests leads to rejection of the null hypothesis.

Table 1: A simulation study for the comparison of Log-Rank, Wilcoxon, and Harrington-Fleming tests through average p-values

| n | θ1 | θ2 | θ3 | θ0 | Log-Rank | Wilcoxon | Harrington-Fleming (a=b=1) | Harrington-Fleming (a=0.5, b=2) |
|---|---|---|---|---|---|---|---|---|
| 100 | 30 | 45 | 60 | 45 | 0.0027 | 0.0449 | 0.0221 | 0.0075 |
| 100 | 40 | 45 | 50 | 45 | 0.3390 | 0.4052 | 0.3651 | 0.3625 |
| 100 | 45 | 45 | 45 | 45 | 0.5370 | 0.4912 | 0.4114 | 0.5063 |
| 200 | 30 | 45 | 60 | 45 | 0.0087 | 0.0379 | 0.0259 | 0.0160 |
| 200 | 40 | 45 | 50 | 45 | 0.3694 | 0.3750 | 0.3581 | 0.3655 |
| 200 | 45 | 45 | 45 | 45 | 0.5554 | 0.4393 | 0.5104 | 0.4952 |
| 300 | 30 | 45 | 60 | 45 | 0.0049 | 0.0248 | 0.0213 | 0.0125 |
| 300 | 40 | 45 | 50 | 45 | 0.3742 | 0.3726 | 0.3498 | 0.3535 |
| 300 | 45 | 45 | 45 | 45 | 0.5377 | 0.5128 | 0.5179 | 0.5233 |
| 500 | 30 | 45 | 60 | 45 | 0.0085 | 0.0202 | 0.0163 | 0.0065 |
| 500 | 40 | 45 | 50 | 45 | 0.3608 | 0.3748 | 0.3666 | 0.3356 |
| 500 | 45 | 45 | 45 | 45 | 0.4994 | 0.4724 | 0.5331 | 0.4581 |
| 1000 | 30 | 45 | 60 | 45 | 0.0082 | 0.0576 | 0.0205 | 0.0113 |
| 1000 | 40 | 45 | 50 | 45 | 0.3521 | 0.3987 | 0.3554 | 0.3566 |
| 1000 | 45 | 45 | 45 | 45 | 0.5363 | 0.4675 | 0.5273 | 0.4723 |

Using different values of $a_{ij}$ and $b_{ij}$, one may attempt to capture different types of departure from the null hypothesis. Specifically, one may use $a_{ij} > b_{ij}$ for identifying late departure from the null hypothesis, and the reverse for capturing early departure from the null hypothesis. We ran another simulation where, as before, the failure times were generated from exponential distributions with failure rates (1/30, 1/45, 1/60), and now we tested the hypothesis that $\theta = 30$. Table 2 summarizes the results: the p-values corresponding to the Harrington-Fleming tests are generally smaller than those corresponding to the log-rank test. This indicates that, depending on the context, the practitioner may have to judiciously select a test.

Table 2: A simulation study where the Harrington-Fleming test appears better, in the sense of p-value

| n | θ1 | θ2 | θ3 | θ0 | Log-Rank | Harrington-Fleming (a=0.5, b=2) |
|---|---|---|---|---|---|---|
| 100 | 30 | 45 | 60 | 30 | 0.0003 | 0.0002 |
| 200 | 30 | 45 | 60 | 30 | 0.0001 | 0.0001 |
| 300 | 30 | 45 | 60 | 30 | 0.0001 | 0.0001 |
| 500 | 30 | 45 | 60 | 30 | 0.0003 | 0.0008 |
| 1000 | 30 | 45 | 60 | 30 | 0.0001 | 0.0000 |

Next, the bootstrap method described in Section 4 has been carried out for the following scenario. First, we simulate $n = 1000$ sample paths assuming a competing risks model with three causes of death having an exponential distribution with rates 1/30, 1/45, and 1/60 per month respectively, and subjected to censoring according to a uniform distribution $U(40, 80)$. The generated sample paths are then frozen and bootstrap samples are obtained by sampling with replacement. The number of bootstrap replications is 100. For each bootstrap sample, the log-rank, Wilcoxon, and Harrington-Fleming (two cases: $a = 5$, $b = 2$, and $a = b = 1$) test statistics have been computed. The average p-values obtained on the hundred bootstrap values for the above mentioned test statistics are 2.446089e-12, 1.071884e-05, 3.457435e-12, and 2.640585e-08, respectively. Thus, we see that the earlier inference is confirmed by the use of the bootstrap method.
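The data-generating scenario just described can be sketched in Python. The sketch rests on assumptions that the text does not spell out: the three causes are generated through independent latent exponential failure times, the observed cause is the one with the smallest latent time, and cause 0 marks a censored observation. The log-rank form used is the $(N - E)^2 / E$ statistic of Section 3.2 with a constant hypothesized rate $1/\theta_0$ for every cause; the function names are hypothetical.

```python
import random


def simulate(n, thetas, censor_low, censor_high, rng):
    """Return a list of (observed_time, cause); cause 0 means censored."""
    data = []
    for _ in range(n):
        latent = [rng.expovariate(1.0 / theta) for theta in thetas]
        c = rng.uniform(censor_low, censor_high)
        x = min(latent)
        if x <= c:
            data.append((x, latent.index(x) + 1))   # causes numbered 1, 2, 3
        else:
            data.append((c, 0))
    return data


def log_rank_statistic(data, theta0, n_causes, t):
    """Sum over causes of (observed - expected)^2 / expected up to time t."""
    exposure = sum(min(x, t) for x, _ in data)       # integral of Y(s) ds on [0, t]
    expected = exposure / theta0                     # same null rate for each cause
    stat = 0.0
    for cause in range(1, n_causes + 1):
        observed = sum(1 for x, c in data if c == cause and x <= t)
        stat += (observed - expected) ** 2 / expected
    return stat


rng = random.Random(2012)
sample = simulate(1000, (30, 45, 60), 40, 80, rng)
print(log_rank_statistic(sample, theta0=45, n_causes=3, t=80))
```

The printed value depends on the seed and is not meant to reproduce the figures reported above; the weighted variants would additionally need the numerical integration discussed in Section 1.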

6 Applications of the Harrington-Fleming tests

6.1. Application of the test to the necropsy data. We consider Example I.3.8 of the Causes of Death in Radiation-Exposed Male Mice data from ABGK. Here, we focus on the mice group in the conventional laboratory environment, which consists of 95 mice. In this setup, 22 mice (23%) died due to thymic lymphoma (cause 1), 38 (40%) due to reticulum cell sarcoma (cause 2), while the remaining 35 (37%) died due to other causes (cause 3). Suppose that we are interested in testing the hypothesis that the hazard rate of each of the causes is the same, specifically, in testing $H: P_{i0}(t) = \exp(-0.0025\,t)\, I_{\{t > 0\}}$, that is, $A_{i0}(t) = 0.0025\,t\, I_{\{t > 0\}}$, $i = 1, 2, 3$. Since there are only $k$ possible transitions in the competing risks model, it follows that the asymptotic distribution of $\mathbf{Z}(t)\, \langle \mathbf{Z} \rangle^{-1}(t)\, \mathbf{Z}^T(t)$ is $\chi^2$ with $k$ degrees of freedom only. In this example, we have $k = 3$ degrees of freedom.

The Wilcoxon and Harrington-Fleming test statistics resulted in p-values of 0+, and hence there is very strong evidence in the data against the null hypothesis $H: P_{i0}(t) = \exp(-0.0025\,t)\, I_{\{t > 0\}}$, $i = 1, 2, 3$.

The bootstrap method discussed in Section 4 also leads us to the same inference. The p-value was calculated for each of the bootstrap samples; the total number of bootstrap replicates was 100. The average p-value for these bootstrap replicates is again practically zero.

6.2. Application of the test to the IBCSG Ludwig trial V data. In the Ludwig Trial V data set, provided to the authors by IBCSG, patients, following a surgery, were given a maintenance treatment, which may either be of short duration with a higher dosage or of long duration with a low dosage. Throughout the study, the patients were followed to know their health state, which could be one of Toxicity (TOX), Time Without Symptoms of Disease and Toxicity (TWiST), and Relapse of the disease (REL) or Death. The data set pertains to 1229 patients with about 30% censoring and a median follow-up time of 84 months. Let TOX, TWiST, and REL denote respectively the times at which toxicity, time without symptoms of disease, and relapse end. For each patient, the times spent in the states TOX, TWiST, REL are respectively obtained by TOX, DFS − TOX, OS − DFS, where DFS and OS denote the disease-free and overall survival times.

Suppose that we are interested in testing the hypothesis that the survival functions for TOX, DFS, and OS equal certain specified forms:

$$H: \quad P_{TOX}(t) = \exp(-0.2\,t)\, I_{\{t > 0\}}, \quad P_{DFS}(t) = \exp(-0.02\,t)\, I_{\{t > 0\}}, \quad P_{OS}(t) = \exp(-0.01538\,t)\, I_{\{t > 0\}}.$$

For the Trial V data set, the Harrington-Fleming test statistic

$$\mathbf{Z}_{HF}(t)\, \langle \mathbf{Z}_{HF} \rangle^{-1}(t)\, \mathbf{Z}_{HF}^T(t)$$

with $a_{TOX} = a_{DFS} = a_{OS} = 0.5$ and $b_{TOX} = b_{DFS} = b_{OS} = 2$ yields a p-value of 0+, which suggests rejection of the null hypothesis. The same inference is arrived at when the test is bootstrapped.

Acknowledgement. Thanks are due to Prof. Bernard Cole, Harvard School of Public Health, and the International Breast Cancer Study Group, Boston, for providing the Trial V data.

References

Andersen, P.K., Borgan, O., Gill, R.D. and Keiding, N. (1993). Statistical Models Based on Counting Processes. Springer, New York.

Beyersmann, J., Allignol, A. and Schumacher, M. (2012). Competing Risks and Multistate Models with R. Springer, New York.

Harrington, D.P. and Fleming, T.R. (1982). A class of rank test procedures for censored survival data. Biometrika, 69, 553–566.

Hougaard, P. (2000). Analysis of Multivariate Survival Data. Springer, New York.

Huzurbazar, A. (2005). Flowgraph Models for Multistate Time-to-Event Data. J. Wiley, New York.

Kalbfleisch, J.D. and Lawless, J.F. (1985). The analysis of panel data under a Markov assumption. J. Amer. Statist. Assoc., 80, 863–868.

R Development Core Team (2009). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0, URL http://www.R-project.org.

Tattar, P.N. and Vaman, H.J. (2008). Testing transition probability matrix of a multi-state model with censored data. Lifetime Data Anal., 14, 2, 216–230.

Prabhanjan N. Tattar

Dell International Services

121, 122A, 131A, Divyasree Greens

Koramangala Inner Ring Road

Challaghatta, Varthur Hobli

Bangalore KA 560071, India

E-mail: [email protected]

H.J. Vaman

Department of Statistics

Central University of Rajasthan

NH-8, Bandar Sindri, Ajmer-305801

Rajasthan, India

Paper received: 2 December 2010; revised: 12 March 2012.