
Statistics and Probability Letters 96 (2015) 102–108


Linear increment in efficiency with the inclusion of surrogate endpoint

Buddhananda Banerjee a, Atanu Biswas b,∗

a Indian Institute of Science Education and Research, Kolkata, India
b Indian Statistical Institute, Kolkata, India

Article info

Article history:
Received 10 May 2014
Received in revised form 10 September 2014
Accepted 11 September 2014
Available online 27 September 2014

Keywords:
Surrogate responses
Odds ratio
Risk ratio
Treatment difference

Abstract

In a two-sample clinical trial, a fixed proportion of true-and-surrogate and the remaining only-surrogate responses are observed. We quantify the increase in efficiency to compare the treatments as a linear function of the proportion of available true responses.

© 2014 Elsevier B.V. All rights reserved.

1. Introduction

A surrogate endpoint is chosen as a measure or an indicator of a biological process. Usually it is obtained sooner and at a lesser cost than the true endpoint of health outcome, and is used to arrive at a conclusion about an effect of intervention on the true endpoint. Surrogate endpoints are used with growing interest in medical science. For example, in a trial of a treatment of osteoporosis we might be interested in reduction of the fracture rate, but we measure the bone mineral density (BMD) instead. A change in CD4 cell count in a randomized trial is considered as a surrogate of survival time in the study of HIV affected patients. Again, some damages to the heart muscle due to myocardial infarction can be accurately assessed by an arterioscintography reading. As it is an expensive procedure, the peak cardiac enzyme level in the blood stream, which is more easily obtainable, is used as a surrogate measure of heart vascular damage (see Wittes et al., 1989). Sometimes the observed value of the response variable in the middle of an ongoing experiment is considered as the surrogate endpoint. For example, patients with age-related macular degeneration (ARMD) progressively lose vision. To compare placebo and high-dose interferon-α for its treatment, observations are taken after six months and one year. The observation after six months is considered as a surrogate corresponding to the final outcome.

Two basic problems are studied in the literature of surrogate responses, namely (a) validation of a surrogate and (b) measurement of gain in inference using the surrogate responses. Prentice (1989) gave validation criteria for a surrogate, which were subsequently discussed by Freedman et al. (1992), Reilly and Pepe (1995), Day and Duffy (1996), Buyse and Molenberghs (1998), Buyse et al. (2000), Molenberghs et al. (2001), and Chen et al. (2007). The use of surrogate endpoints is likely to be beneficial, not only in terms of cost or time, but also because it gives more accuracy in the estimation of target parametric functions such as treatment difference and odds ratio. For that purpose we first look at the data structure under consideration. In this article we are interested in measuring the increment of the efficiency with the increase of the proportion of surrogate endpoints, where the surrogate is presumed to be validated.

∗ Corresponding author. Tel.: +91 33 25752818; fax: +91 33 25773104.
E-mail addresses: [email protected] (B. Banerjee), [email protected], [email protected] (A. Biswas).

http://dx.doi.org/10.1016/j.spl.2014.09.015
0167-7152/© 2014 Elsevier B.V. All rights reserved.

Suppose we consider the accumulated data when all of the surrogate responses are known, but only $Q\%$ of the true responses are available. Then, if we do not consider the surrogate data, we need to make inference based on the $Q\%$ true responses only. In the present paper our objective is to use the surrogate data efficiently (which consist of $Q\%$ bivariate data and $(100 - Q)\%$ univariate surrogate-only data) to improve the inference. To use the $(100 - Q)\%$ surrogate data efficiently one needs to identify the dependency structure of true and surrogate responses based on the $Q\%$ bivariate true-and-surrogate data. A real data example which is appropriate to this situation is described in the next section.

Pepe (1992) obtained the distribution of the estimator of the regression parameter when the validation sample fraction has a fixed limiting value, $\rho$ ($= Q/100$), say. Banerjee and Biswas (2011) established that the variance of the estimator of treatment difference is bounded for such a fixed $\rho$. Lin et al. (1997) measured the extent to which a biological marker is a surrogate endpoint for a clinical event, and Wang and Taylor (2002) proposed alternative measures of the proportion explained by the surrogate endpoint. Chen (2000) and Begg and Leung (2000) discussed the inferential improvement by the use of surrogate endpoints. Chen et al. (2003) introduced the concept of information recovery from surrogate endpoints by considering linear models for true and surrogate on covariates.

The proportion of validation sample, $\rho$, naturally plays a key role in the gain in associated inference. The validation samples are true and surrogate paired observations, but the rest of the samples are surrogate responses only. In Section 2 we describe the set up in detail and the data structure under a general probability model for binary true and binary surrogate responses. In Section 3 we establish that the (inverse of) relative efficiency to estimate the treatment success probability by using surrogate endpoints is a linear function of the validation sample proportion, $\rho$. As a simple consequence of that we also prove that the (inverse of) relative efficiency to estimate the treatment difference, log risk ratio and log odds ratio in a two-treatment set up is also linear in $\rho_A$ and $\rho_B$, the validation sample proportions for the two treatments $A$ and $B$, respectively. In Section 4 we demonstrate our results with a data example and conclude.

2. Experimental details and data structure

We consider a set up of two treatments having binary true endpoints with binary surrogates as well. Begg and Leung (2000) pointed out that for binary endpoints the probability of concordance is an indicator of association between true and surrogate endpoints. Suppose $n_A$ and $n_B$ patients are allotted to the treatments $A$ and $B$, respectively; but we get only $m_A$ and $m_B$ true endpoints along with all surrogate endpoints within the stipulated time frame or cost limit, where $m_t \ll n_t$, $t = A, B$. Denote the true and surrogate endpoints for the treatment $t$ by $Y_t$ and $W_t$, where $t = A, B$. All these endpoints are either 1 or 0 for success or failure, respectively. We denote $p_t = P(Y_t = 1)$ as the success probability by the true endpoints for treatment $t$. Furthermore, let us denote

\[
P(W_t = 1 \mid Y_t = 1) = \pi_{t1} \quad \text{and} \quad P(W_t = 0 \mid Y_t = 0) = \pi_{t0}, \tag{1}
\]

which are the sensitivity and specificity of the $2 \times 2$ table for treatment $t$ where the true and surrogate responses are in the two margins. Clearly it is a saturated model with full parameter space. Consequently, the success probabilities by the surrogate responses for the two treatments are

\begin{align*}
r_t &= P(W_t = 1) \\
    &= P(W_t = 1 \mid Y_t = 1) P(Y_t = 1) + P(W_t = 1 \mid Y_t = 0) P(Y_t = 0) \\
    &= \pi_{t1} p_t + (1 - \pi_{t0})(1 - p_t) \\
    &= p_t (\pi_{t1} + \pi_{t0} - 1) + (1 - \pi_{t0}).
\end{align*}
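The marginal surrogate success probability above can be checked numerically. The following is a minimal Python sketch; the function name `surrogate_success_prob` and the parameter values are illustrative assumptions, not taken from the paper.

```python
def surrogate_success_prob(p, pi1, pi0):
    """P(W = 1) for true success probability p, sensitivity pi1, specificity pi0."""
    return p * pi1 + (1.0 - p) * (1.0 - pi0)


# Illustrative (assumed) values, not estimates from the ARMD data.
p, pi1, pi0 = 0.7, 0.9, 0.8

r = surrogate_success_prob(p, pi1, pi0)
# Same quantity written in the factored form p (pi1 + pi0 - 1) + (1 - pi0).
r_factored = p * (pi1 + pi0 - 1.0) + (1.0 - pi0)

print(r, r_factored)
```

Both forms agree, which is all the last line of the display asserts.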

The data corresponding to treatment t can be represented in a table as follows.

True \ Surrogate    W_t = 1    W_t = 0                Total
Y_t = 1             m_{t11}    m_{t10}                Y_{tT}
Y_t = 0             m_{t01}    m_{t00}                m_t − Y_{tT}
Total               W_{tT}     m_t − W_{tT}           m_t
Only surrogate      W_{tS}     n_t − m_t − W_{tS}     n_t − m_t

where $Y_{tT} = \sum_{i=1}^{m_t} Y_{ti}$ and $W_{tT} = \sum_{i=1}^{m_t} W_{ti}$; also we denote $W_{tS} = \sum_{i=m_t+1}^{n_t} W_{ti}$ for $t = A$ and $B$. The notation $(Y_{ti}, W_{ti})$ is specifically used for denoting the response variables corresponding to the $i$th individual under the $t$th treatment. If any marginal is found to be zero, it is customary to add 0.5 to each of the marginals. As an illustration we consider the data set analyzed by Buyse and Molenberghs (1998). This data set is obtained from a randomized clinical trial comparing an experimental treatment, interferon-α at the highest dose of 6 million units daily, to a corresponding placebo in the treatment of patients with age-related macular degeneration (ARMD). Patients with ARMD progressively lose vision. In the trial, a patient's visual acuity is assessed at different time points through the ability to read lines of letters on standardized vision charts. It is examined whether the loss of at least two lines of vision at 6 months (denoted as 1, and 0 otherwise) can be used as a surrogate for the loss of at least three lines of vision at 1 year (denoted as 1, and 0 otherwise), which is a true endpoint with respect to the effect of interferon-α. A total of 87 patients received interferon-α and 103 received placebo.

Treatment procedures are evaluated after the first week, which is considered as the surrogate endpoint ($W_{\text{interferon-}\alpha}$ or $W_{\text{placebo}}$), and after the second week to assess the helpfulness by true endpoints ($Y_{\text{interferon-}\alpha}$ or $Y_{\text{placebo}}$). These data are used in Section 4 for illustration. If we assume that 40% of the patients allocated to the interferon-α drug will turn up with both true and surrogate endpoints, that is $\rho_{\text{interferon-}\alpha} = 0.4$, then the data set for the drug interferon-α and placebo can be represented as

t               m_{t11}   m_{t10}   m_{t01}   m_{t00}   W_{tS}   n_t − m_t − W_{tS}
Interferon-α    15        4         4         12        28       24
Placebo         12        4         4         24        22       47

3. Variance reduction using surrogate responses

Our basic objective is to quantify the increase in efficiency or reduction in variance of the estimator of any specified parametric function by properly using the information extracted from the surrogate endpoints. Consider the likelihood for the treatment $t$,

\[
\begin{aligned}
L(\xi_t) ={}& \binom{m_t}{y_{tT}} p_t^{y_{tT}} q_t^{m_t - y_{tT}}
\binom{y_{tT}}{m_{t11}} \pi_{t1}^{m_{t11}} (1 - \pi_{t1})^{y_{tT} - m_{t11}} \\
&\times \binom{m_t - y_{tT}}{m_{t01}} (1 - \pi_{t0})^{m_{t01}} \pi_{t0}^{m_{t00}}
\binom{n_t - m_t}{w_{tS}} r_t^{w_{tS}} (1 - r_t)^{n_t - m_t - w_{tS}},
\end{aligned} \tag{2}
\]

where $\xi_t = (p_t, \pi_{t1}, \pi_{t0})$ and $q_t = 1 - p_t$. The Fisher information matrix is $I(\xi_t)$, and the $(1, 1)$th element of $[I(\xi_t)]^{-1}$, denoted by $[I(\xi_t)]^{-1}_{11}$, gives the variance of the maximum likelihood estimator $\hat{p}^{(S)}_t$. Here we define the inverse of efficiency, which is the measure of improvement to estimate $p_t$ when surrogate augmented analysis is conducted. So the measure of improvement is

\[
\frac{[I(\xi_t)]^{-1}_{11}}{p_t q_t / m_t}.
\]

Furthermore, denote $m_t / n_t = \rho_t \in (0, 1]$. We note that $\rho_t = 0$ is possible only when $m_t = 0$, indicating that no true response is available. This is not of any statistical interest. Using $m_t = \rho_t n_t$ we get $I(\xi_t) = n_t I_{\xi_t}(\rho_t)$. Hence the measure of improvement, given by the proportional variance, reduces to

\[
G_{\xi_t}(\rho_t) = \frac{n_t^{-1} [I_{\xi_t}(\rho_t)]^{-1}_{11}}{m_t^{-1} p_t q_t} = \frac{\rho_t [I_{\xi_t}(\rho_t)]^{-1}_{11}}{p_t q_t}.
\]

Now we have the following theorem, proof of which is given in Appendix A.1.

Theorem 1. (a) The relative gain, denoted by the inverse of efficiency $G_{\xi_t}(\rho_t)$, from using surrogate endpoints with a proportion $\rho_t$ of available true responses is a linear function of $\rho_t$, that is

\[
G_{\xi_t}(\rho_t) = C_t + (1 - C_t)\rho_t = \rho_t + (1 - \rho_t) C_t, \tag{3}
\]

which is a straight line joining the points $(0, C_t)$ and $(1, 1)$, whose intercept and slope add to unity, with

\[
C_t = \frac{q_t \pi_{t0}(1 - \pi_{t0}) + p_t \pi_{t1}(1 - \pi_{t1})}{r_t (1 - r_t)}.
\]

(b) Further,

\[
C_t = \frac{E(\mathrm{Var}(W_t \mid Y_t))}{\mathrm{Var}(W_t)} \in [0, 1].
\]
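The linearity in Theorem 1 can be checked numerically. The sketch below builds the per-patient information matrix as a mixture of a validation part (weight $\rho_t$) and a surrogate-only part (weight $1 - \rho_t$), inverts it by cofactors, and compares $\rho_t [I_{\xi_t}(\rho_t)]^{-1}_{11} / (p_t q_t)$ with the line in (3). All function names and parameter values are illustrative assumptions; this is a plain-Python check under the binary model of Section 2, not the authors' code.

```python
def surrogate_prob(p, pi1, pi0):
    """r = P(W = 1) under the binary model of Section 2."""
    return p * pi1 + (1 - p) * (1 - pi0)


def c_coefficient(p, pi1, pi0):
    """Intercept C_t of Theorem 1."""
    q = 1 - p
    r = surrogate_prob(p, pi1, pi0)
    return (q * pi0 * (1 - pi0) + p * pi1 * (1 - pi1)) / (r * (1 - r))


def unit_information(p, pi1, pi0, rho):
    """Per-patient Fisher information for (p, pi1, pi0)."""
    q = 1 - p
    r = surrogate_prob(p, pi1, pi0)
    v = r * (1 - r)
    grad = (pi1 + pi0 - 1, p, -q)  # dr/dp, dr/dpi1, dr/dpi0
    diag = (1 / (p * q), p / (pi1 * (1 - pi1)), q / (pi0 * (1 - pi0)))
    # Surrogate-only patients contribute a rank-one block, weight (1 - rho).
    info = [[(1 - rho) * grad[i] * grad[j] / v for j in range(3)] for i in range(3)]
    # Validation patients contribute a diagonal block, weight rho.
    for i in range(3):
        info[i][i] += rho * diag[i]
    return info


def inverse_11(a):
    """(1, 1) element of the inverse of a 3x3 matrix, by cofactors."""
    det = (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
           - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
           + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    return (a[1][1] * a[2][2] - a[1][2] * a[2][1]) / det


def g_from_information(p, pi1, pi0, rho):
    """G computed directly from the information matrix."""
    return rho * inverse_11(unit_information(p, pi1, pi0, rho)) / (p * (1 - p))


def g_linear(p, pi1, pi0, rho):
    """G from the closed form (3)."""
    return rho + (1 - rho) * c_coefficient(p, pi1, pi0)


# Assumed illustrative parameters.
p, pi1, pi0 = 0.7, 0.9, 0.8
for rho in (0.2, 0.5, 0.8, 1.0):
    print(rho, g_from_information(p, pi1, pi0, rho), g_linear(p, pi1, pi0, rho))
```

The two columns should agree up to floating-point error for every $\rho_t \in (0, 1]$, and both equal 1 at $\rho_t = 1$.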

Remark. The expression in (3) is amazingly simple in the context of the fairly general joint distribution of the true and surrogate endpoints. To the best of our knowledge, this expression and the following expressions are new in the literature where true and surrogate are both binary endpoints. The intercept of the line in Eq. (3) is $C_t = \frac{E(\mathrm{Var}(W_t \mid Y_t))}{\mathrm{Var}(W_t)}$ and the slope is $(1 - C_t) = \frac{\mathrm{Var}(E(W_t \mid Y_t))}{\mathrm{Var}(W_t)}$, and they add up to 1. It is also evident from Eq. (3) that the value of $G$ is 1 when $\rho_t = 1$, irrespective of the value of $C_t$. The slope gives the rate of increment of efficiency with the inclusion of true endpoints. If we let $\rho_t$ increase, that is, if we let $m_t$ increase while keeping $n_t$ fixed, then $(1 - C_t)$ indicates the improvement in efficiency. The intercept gives the lower bound of efficiency gain. Clearly $C_t = 1$ when true and surrogate endpoints are independent, and $C_t = 0$ when the conditional distribution of the surrogate endpoint given the true one is degenerate, for treatment $t$. The total variability of $W_t$ is partitioned into two parts. The variability of the regression of $W_t$ on $Y_t$ relative to the variability of $W_t$ (i.e. the correlation ratio) serves as the slope, and the variability of the unexplained part in the regression relative to the total variability is the intercept. Theorem 1 can be effectively used to obtain the gain in efficiencies for the standard measures of treatment difference, like the log odds ratio, difference in success probabilities and relative risk ratio in the two-treatment set up. We present these in Theorem 2 and Corollaries 1–3.


Theorem 2. The inverse of efficiency in variance estimation by using surrogate endpoints to estimate the log odds ratio (OR), $\theta = \log \frac{p_A q_B}{q_A p_B}$, is given by a plane, that is

\[
G_{OR}(\rho_A, \rho_B) = \frac{(m_A p_A q_A)^{-1} G_{\xi_A}(\rho_A) + (m_B p_B q_B)^{-1} G_{\xi_B}(\rho_B)}{(m_A p_A q_A)^{-1} + (m_B p_B q_B)^{-1}}. \tag{4}
\]

The proof is immediate with the use of the delta method and Theorem 1.
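For readers who want the delta-method step spelled out, a sketch follows. It assumes the two treatment arms are independent, uses a first-order expansion of $\log(p_t / q_t)$, and writes $\hat{\theta}_{T}$ and $\hat{\theta}_{S}$ (our notation, not the paper's) for the estimators based on true endpoints only and on surrogate-augmented data. The variance of the surrogate-augmented estimator of $p_t$ is taken as $G_{\xi_t}(\rho_t)\, p_t q_t / m_t$, which is the definition of $G_{\xi_t}$ in Section 3.

```latex
% Delta-method sketch for Theorem 2 (independent arms assumed).
\begin{align*}
\theta &= \log\frac{p_A}{q_A} - \log\frac{p_B}{q_B},
\qquad
\frac{d}{dp_t}\log\frac{p_t}{q_t} = \frac{1}{p_t q_t}, \\
\operatorname{Var}(\hat{\theta}_{T})
 &\approx \sum_{t \in \{A, B\}} \frac{1}{(p_t q_t)^2} \cdot \frac{p_t q_t}{m_t}
 = (m_A p_A q_A)^{-1} + (m_B p_B q_B)^{-1}, \\
\operatorname{Var}(\hat{\theta}_{S})
 &\approx \sum_{t \in \{A, B\}} \frac{1}{(p_t q_t)^2} \cdot \frac{p_t q_t}{m_t}\, G_{\xi_t}(\rho_t)
 = (m_A p_A q_A)^{-1} G_{\xi_A}(\rho_A) + (m_B p_B q_B)^{-1} G_{\xi_B}(\rho_B), \\
G_{OR}(\rho_A, \rho_B)
 &= \frac{\operatorname{Var}(\hat{\theta}_{S})}{\operatorname{Var}(\hat{\theta}_{T})}.
\end{align*}
```

The same argument with derivatives $1$ and $1/p_t$ in place of $1/(p_t q_t)$ gives the weights for the treatment difference and the log risk ratio.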

Corollary 1. When $\rho_A = \rho_B = \rho$, the inverse of efficiency by using surrogate endpoints to estimate the log odds ratio is a linear function of $\rho$, that is

\[
G_{OR}(\rho, \rho) = G^*_{OR}(\rho) = \rho + (1 - \rho)\, \frac{(n_A p_A q_A)^{-1} C_A + (n_B p_B q_B)^{-1} C_B}{(n_A p_A q_A)^{-1} + (n_B p_B q_B)^{-1}}.
\]

Corollary 2. The inverse of efficiency by using surrogate endpoints to estimate $\theta = p_A - p_B$, the treatment difference (TD), is a plane given by

\[
G_{TD}(\rho_A, \rho_B) = \frac{m_A^{-1} p_A q_A G_{\xi_A}(\rho_A) + m_B^{-1} p_B q_B G_{\xi_B}(\rho_B)}{m_A^{-1} p_A q_A + m_B^{-1} p_B q_B},
\]

and, for $\rho_A = \rho_B = \rho$, we get the line given by

\[
G_{TD}(\rho, \rho) = G^*_{TD}(\rho) = \rho + (1 - \rho)\, \frac{n_A^{-1} p_A q_A C_A + n_B^{-1} p_B q_B C_B}{n_A^{-1} p_A q_A + n_B^{-1} p_B q_B}.
\]

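Corollary 3. The inverse of efficiency by using surrogate endpoints to estimate $\theta = \log \frac{p_A}{p_B}$, the log risk ratio (RR), is a plane given by

\[
G_{RR}(\rho_A, \rho_B) = \frac{(m_A p_A)^{-1} q_A G_{\xi_A}(\rho_A) + (m_B p_B)^{-1} q_B G_{\xi_B}(\rho_B)}{(m_A p_A)^{-1} q_A + (m_B p_B)^{-1} q_B},
\]

and when $\rho_A = \rho_B = \rho$ it reduces to a linear function of $\rho$ given by

\[
G_{RR}(\rho, \rho) = G^*_{RR}(\rho) = \rho + (1 - \rho)\, \frac{(n_A p_A)^{-1} q_A C_A + (n_B p_B)^{-1} q_B C_B}{(n_A p_A)^{-1} q_A + (n_B p_B)^{-1} q_B}.
\]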

In Fig. 1 we plot $G_{OR}(\rho_A, \rho_B)$, given in (4), against $\rho_A$ and $\rho_B$. We find that, for suitable $\rho_A$ and $\rho_B$, there is considerable gain in estimation of OR. Fig. 2 illustrates the special case of $\rho_A = \rho_B$ where $G_{\xi_A}$, $G_{\xi_B}$, $G^*_{OR}$, $G^*_{TD}$, $G^*_{RR}$ are all straight lines. In fact, $G_{OR}(\cdot, \cdot)$, $G_{TD}(\cdot, \cdot)$ and $G_{RR}(\cdot, \cdot)$ are convex combinations of $G_{\xi_A}$ and $G_{\xi_B}$, and so lie between these two straight lines. We have taken $m_A = m_B = 34$ and $p_A = 0.7$, $p_B = 0.8$ in these two figures, for illustration.
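The convex-combination structure of Theorem 2 and Corollaries 1–3 can be sketched in a few lines of Python. The values $p_A = 0.7$, $p_B = 0.8$ and $m_A = m_B = 34$ follow the text; the sensitivities and specificities are not reported for the figures, so the ones used here are assumptions chosen only for illustration, as are all function names.

```python
def c_coefficient(p, pi1, pi0):
    """Intercept C_t of Theorem 1 for one treatment arm."""
    q = 1.0 - p
    r = p * pi1 + q * (1.0 - pi0)
    return (q * pi0 * (1.0 - pi0) + p * pi1 * (1.0 - pi1)) / (r * (1.0 - r))


def g_single(c, rho):
    """Line of Theorem 1: G = rho + (1 - rho) * C."""
    return rho + (1.0 - rho) * c


def arm_weight(measure, p, m):
    """Weight of one arm: its true-endpoint-only variance contribution."""
    q = 1.0 - p
    if measure == "OR":  # log odds ratio, Theorem 2
        return 1.0 / (m * p * q)
    if measure == "TD":  # treatment difference, Corollary 2
        return p * q / m
    if measure == "RR":  # log risk ratio, Corollary 3
        return q / (m * p)
    raise ValueError(f"unknown measure: {measure}")


def g_two_sample(measure, arm_a, arm_b):
    """Weighted average of the single-arm lines; each arm is (p, m, C, rho)."""
    total, weighted = 0.0, 0.0
    for p, m, c, rho in (arm_a, arm_b):
        w = arm_weight(measure, p, m)
        total += w
        weighted += w * g_single(c, rho)
    return weighted / total


# Assumed sensitivities and specificities (not values from the paper).
c_a = c_coefficient(0.7, pi1=0.90, pi0=0.80)
c_b = c_coefficient(0.8, pi1=0.85, pi0=0.75)
rho = 0.4
results = {
    name: g_two_sample(name, (0.7, 34, c_a, rho), (0.8, 34, c_b, rho))
    for name in ("OR", "TD", "RR")
}
for name, value in results.items():
    print(name, round(value, 4))
```

Because the weights are positive, each of the three values lies between `g_single(c_a, rho)` and `g_single(c_b, rho)`, which is the sense in which the two-treatment measures lie between the two single-arm lines.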

4. Data example and discussions

We obtain the treatment difference (TD) between interferon-α and placebo allotted to the ARMD patients (see Buyse and Molenberghs, 1998). The estimated treatment difference is 0.162. The observed values are $C_A = C_{\text{interferon-}\alpha} = 0.66$ and $C_B = C_{\text{placebo}} = 0.58$. So the equation is $G^*_{TD}(\rho) = \rho + 0.62(1 - \rho)$, where we assume equal values of $\rho_{\text{interferon-}\alpha}$ and $\rho_{\text{placebo}}$, say $\rho$. For different values of $\rho \in \{0.2, 0.3, \ldots, 0.9\}$, TD is calculated based only on true endpoints and on surrogate augmented data as well. The ratios of their bootstrap variances are plotted along with the line $G^*_{TD}(\rho)$ (see Corollary 2). We find that the observed measures of efficiencies are close to the straight line $G^*_{TD}(\rho)$, see Fig. 3.

In this article we discussed the impact of the proportion of available true endpoints relative to a large number of surrogate endpoints on the reduction of variance in estimating treatment success probabilities and related parametric functions like the treatment difference, log odds ratio and log risk ratio. We found that the proportion of variance reduction for the estimate of the treatment success probability for any treatment is a linear function $G_{\xi_t}(\rho_t)$ of the sample proportion of true endpoints ($\rho_t$) when compared to the total (surrogate) endpoints. The proportions of variance for the parametric functions corresponding to two treatments are convex combinations of $G_{\xi_A}(\rho_A)$ and $G_{\xi_B}(\rho_B)$, and the weights are proportional to the variance of the estimator of the parametric function for individual treatments with true endpoints only.

When true and surrogate endpoints are independent, that is $C_A = 1 = C_B$, there is no reduction in sample size. On the contrary, when the conditional distribution of the surrogate endpoint given the true one is degenerate, i.e. $C_A = 0 = C_B$, the proportion of sample size reduction is $(1 - \rho)$ for $\rho_A = \rho_B = \rho$.

Fig. 1. Proportion of variance against $\rho_A$ and $\rho_B$.

Fig. 2. $G_{\xi_A}$, $G_{\xi_B}$, $G^*_{OR}$, $G^*_{TD}$, $G^*_{RR}$ against $\rho$.

For estimation we suggest using the MLE in this paper, which is iterative for the problem under consideration. For practical implementation of a surrogate-augmented procedure, Banerjee and Biswas (2011) used EM-based estimates of $p_A$ and $p_B$. Alternative estimates based on conditional expectations may be (see Banerjee and Biswas, 2014)

\[
\hat{p}^{(S)}_t = \hat{Y}_t / n_t = n_t^{-1} \left\{ Y_{tT} + \frac{m_{t11}}{W_{tT}} W_{tS} + \frac{m_{t10}}{m_t - W_{tT}} (n_t - m_t - W_{tS}) \right\}. \tag{5}
\]

Here the three terms within braces on the right hand side correspond to the observed number of successes from the true responses, the estimate of the true successes out of the $W_{tS}$ surrogate successes for which true responses are unobserved, and the estimate of true successes out of the $(n_t - m_t - W_{tS})$ surrogate failures for which true responses are unobserved. Our detailed simulation studies show that the behavior of the two estimators, the EM-based one and the one given in (5), is almost similar to that of the MLE. As an immediate application, these results can be used to obtain closed form expressions for the allocation proportions of two treatments when adaptive allocation is used to compare the two treatments. Other studies involving two-treatment binary responses, like the Cochran–Mantel–Haenszel test, can also be modified for the surrogate augmented set up. These expressions are extremely useful to determine the sample size needed to attain a certain level and power of a test when the surrogate data are present. As a far reaching application, these results may be extremely helpful when $\rho_A$ and $\rho_B$ are random variables depending on time in an ongoing treatment process. It will help to obtain the variance of the estimators of the functions of success probabilities more easily.

Fig. 3. $G^*_{TD}(\rho)$ and observed efficiencies against $\rho$.
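The conditional-expectation estimate in (5) can be sketched as a small Python function. The function and argument names are hypothetical; the counts are the interferon-α row of the table in Section 2. The sketch assumes non-zero marginals and does not implement the 0.5 correction mentioned there.

```python
def conditional_expectation_estimate(m11, m10, m01, m00, w_s, f_s):
    """Estimate of p_t in the form of Eq. (5).

    m11, m10, m01, m00: validation counts indexed as (true, surrogate).
    w_s: surrogate successes among patients with no true response (W_tS).
    f_s: surrogate failures among those patients (n_t - m_t - W_tS).
    """
    m = m11 + m10 + m01 + m00   # validation sample size m_t
    n = m + w_s + f_s           # total sample size n_t
    y_true = m11 + m10          # observed true successes Y_tT
    w_true = m11 + m01          # surrogate successes in validation sample W_tT
    # Expected true successes among surrogate-only successes and failures.
    from_successes = m11 / w_true * w_s
    from_failures = m10 / (m - w_true) * f_s
    return (y_true + from_successes + from_failures) / n


# Interferon-alpha row of the data table in Section 2.
p_hat_interferon = conditional_expectation_estimate(15, 4, 4, 12, 28, 24)
print(round(p_hat_interferon, 4))
```

The estimate is a simple plug-in: each surrogate-only patient is assigned the observed proportion of true successes among validation patients with the same surrogate value.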

Acknowledgments

The authors wish to thank two anonymous referees for their careful reading and constructive suggestions, which led to some improvements over the earlier version of the manuscript.

Appendix

A.1. Proof of Theorem 1

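(a) From the likelihood Eq. (2), it is immediate that

\[
n_t I_{\xi_t}(\rho_t) \equiv n_t
\begin{pmatrix}
\rho_t d_{t1} + (1 - \rho_t) d_{t2} & (1 - \rho_t) c_{t1} & (1 - \rho_t) c_{t2} \\
(1 - \rho_t) c_{t1} & \rho_t d_{t3} + (1 - \rho_t) d_{t4} & (1 - \rho_t) c_{t3} \\
(1 - \rho_t) c_{t2} & (1 - \rho_t) c_{t3} & \rho_t d_{t5} + (1 - \rho_t) d_{t6}
\end{pmatrix},
\]

where

\[
d_{t1} = \frac{1}{p_t q_t}, \quad
d_{t2} = \frac{(\pi_{t1} + \pi_{t0} - 1)^2}{r_t (1 - r_t)}, \quad
d_{t3} = \frac{p_t}{\pi_{t1}(1 - \pi_{t1})}, \quad
d_{t4} = \frac{p_t^2}{r_t (1 - r_t)}, \quad
d_{t5} = \frac{q_t}{\pi_{t0}(1 - \pi_{t0})}, \quad
d_{t6} = \frac{q_t^2}{r_t (1 - r_t)},
\]

and

\[
c_{t1} = \frac{\pi_{t0}}{1 - r_t} - \frac{1 - \pi_{t0}}{r_t} = \frac{p_t (\pi_{t1} + \pi_{t0} - 1)}{r_t (1 - r_t)}, \quad
c_{t2} = \frac{1 - \pi_{t1}}{1 - r_t} - \frac{\pi_{t1}}{r_t} = -\frac{q_t (\pi_{t1} + \pi_{t0} - 1)}{r_t (1 - r_t)}, \quad
c_{t3} = -\frac{p_t q_t}{r_t (1 - r_t)}.
\]

Now observing that

\begin{align*}
d_{t6} d_{t4} &= c_{t3}^2, \\
d_{t2} d_{t4} &= c_{t1}^2, \\
d_{t6} d_{t2} &= c_{t2}^2, \\
d_{t2} d_{t4} d_{t5} + d_{t1} d_{t4} d_{t6} + d_{t2} d_{t3} d_{t6} &= d_{t1} c_{t3}^2 + d_{t3} c_{t2}^2 + d_{t5} c_{t1}^2, \\
d_{t2} d_{t4} d_{t6} + 2 c_{t1} c_{t2} c_{t3} &= d_{t2} c_{t3}^2 + d_{t4} c_{t2}^2 + d_{t6} c_{t1}^2, \\
d_{t1} d_{t3} d_{t5} = d_{t1} d_{t4} d_{t5} + d_{t2} d_{t3} d_{t5} + d_{t1} d_{t3} d_{t6} &= \frac{1}{\pi_{t1}(1 - \pi_{t1}) \pi_{t0}(1 - \pi_{t0})}.
\end{align*}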

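So we get

\[
C_t = \frac{d_{t6}}{d_{t5}} + \frac{d_{t4}}{d_{t3}} = \frac{q_t \pi_{t0}(1 - \pi_{t0}) + p_t \pi_{t1}(1 - \pi_{t1})}{r_t (1 - r_t)},
\]

and

\[
G_{\xi_t}(\rho_t) = \frac{n_t^{-1} [I_{\xi_t}(\rho_t)]^{-1}_{11}}{m_t^{-1} p_t q_t} = \frac{\rho_t [I_{\xi_t}(\rho_t)]^{-1}_{11}}{p_t q_t} = \rho_t + (1 - \rho_t) C_t = C_t + (1 - C_t) \rho_t. \tag{6}
\]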


For (b) we observe that $\mathrm{Var}(W_t \mid Y_t = j) = \pi_{tj}(1 - \pi_{tj})$ for $j = 0, 1$ and $t = A, B$. Hence

\[
E(\mathrm{Var}(W_t \mid Y_t)) = q_t \pi_{t0}(1 - \pi_{t0}) + p_t \pi_{t1}(1 - \pi_{t1}),
\]

and consequently we get

\[
C_t = \frac{E(\mathrm{Var}(W_t \mid Y_t))}{\mathrm{Var}(W_t)} = \frac{E(\mathrm{Var}(W_t \mid Y_t))}{E(\mathrm{Var}(W_t \mid Y_t)) + \mathrm{Var}(E(W_t \mid Y_t))} \in [0, 1].
\]

Clearly $C_t = 1$ when true and surrogate endpoints are independent, and $C_t = 0$ when the conditional distribution of the surrogate endpoint given the true one is degenerate.
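The variance decomposition behind part (b), and the two boundary cases just mentioned, can be verified with a short Python sketch. The function name and the parameter values are illustrative assumptions.

```python
def variance_parts(p, pi1, pi0):
    """Return (E Var(W|Y), Var E(W|Y), Var(W)) for the binary model."""
    q = 1.0 - p
    r = p * pi1 + q * (1.0 - pi0)
    within = p * pi1 * (1.0 - pi1) + q * pi0 * (1.0 - pi0)
    # E(W|Y=1) = pi1 and E(W|Y=0) = 1 - pi0, a two-point variable.
    between = p * q * (pi1 + pi0 - 1.0) ** 2
    total = r * (1.0 - r)
    return within, between, total


# Generic case (assumed values): the two parts add up to Var(W).
within, between, total = variance_parts(0.7, 0.9, 0.8)
c_generic = within / total

# Independence: P(W=1|Y=1) = P(W=1|Y=0), i.e. pi1 = 1 - pi0, so C = 1.
w_ind, b_ind, t_ind = variance_parts(0.7, 0.6, 0.4)
c_independent = w_ind / t_ind

# Degenerate conditional distribution: pi1 = pi0 = 1, so C = 0.
w_deg, b_deg, t_deg = variance_parts(0.7, 1.0, 1.0)
c_degenerate = w_deg / t_deg

print(c_generic, c_independent, c_degenerate)
```

The generic value lies strictly between the two boundary cases, and `1 - c_generic` equals `between / total`, the slope of the line in (3).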

References

Banerjee, B., Biswas, A., 2011. Estimating treatment difference for binary responses in the presence of surrogate end points. Stat. Med. 30, 186–196.
Banerjee, B., Biswas, A., 2014. Odds ratio for 2 × 2 tables: Mantel–Haenszel estimator, profile likelihood and presence of surrogate responses. J. Biopharm. Statist. 24 (3), 649–659.
Begg, C.B., Leung, D.H.Y., 2000. On the use of surrogate endpoints in randomized trials. J. Roy. Statist. Soc. Ser. A 163, 15–28.
Buyse, M., Molenberghs, G., 1998. Criteria for the validation of surrogate endpoints in randomized experiments. Biometrics 54, 1014–1029.
Buyse, M., Molenberghs, G., Burzykowski, T., Renard, D., Geys, H., 2000. The validation of surrogate endpoints in meta-analysis of randomized experiments. Biometrics 56, 49–67.
Chen, Y.H., 2000. A robust imputation method for surrogate outcome data. Biometrika 87, 711–716.
Chen, H., Geng, Z., Jia, J., 2007. Criteria for surrogate endpoints. J. R. Stat. Soc. Ser. B 69, 919–932.
Chen, S.X., Leung, D.H.Y., Qin, J., 2003. Information recovery in a study with surrogate endpoints. J. Amer. Statist. Assoc. 98, 1052–1062.
Day, N.E., Duffy, S.W., 1996. Trial design based on surrogate endpoints—applications to comparison of different breast screening frequencies. J. Roy. Statist. Soc. Ser. A 159, 49–60.
Freedman, L.S., Graubard, B.I., Schatzkin, A., 1992. Statistical validation of intermediate endpoints for chronic diseases. Stat. Med. 11, 167–178.
Lin, D.Y., Fleming, T.R., Gruttola, V.D., 1997. Estimating the proportion of treatment effect explained by a surrogate marker. Stat. Med. 16, 1515–1527.
Molenberghs, G., Geys, H., Buyse, M., 2001. Evaluation of surrogate endpoints in randomized experiments with mixed discrete and continuous outcomes. Stat. Med. 20, 3023–3038.
Pepe, M., 1992. Inference using surrogate outcome data and validation sample. Biometrika 79, 355–365.
Prentice, R.L., 1989. Surrogate endpoints in clinical trials: definition and operational criteria. Stat. Med. 8, 431–440.
Reilly, M., Pepe, M.S., 1995. A mean score method for missing and auxiliary covariate data in regression models. Biometrika 82, 299–314.
Wang, Y., Taylor, J.M.G., 2002. A measure of the proportion of treatment effect explained by a surrogate marker. Biometrics 58, 803–812.
Wittes, J., Lakatos, E., Probstfield, J., 1989. Surrogate endpoints in clinical trials: cardiovascular disease. Stat. Med. 8, 415–425.