Fitzmaurice G.M., Lipsitz S.R., Ibrahim J.G. - Estimation in Regression Models for Longitudinal Binary Data With Outcome-Dependent Follow-up(2006)(17)


  • Biostatistics (2006), 7, 3, pp. 469–485. doi:10.1093/biostatistics/kxj019. Advance Access publication on January 20, 2006.

    Estimation in regression models for longitudinal binary data with outcome-dependent follow-up

    GARRETT M. FITZMAURICE

    Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue and Brigham and Women's Hospital, Boston, MA, USA

    [email protected]

    STUART R. LIPSITZ

    Medical University of South Carolina, Charleston, SC, USA

    JOSEPH G. IBRAHIM

    School of Public Health, University of North Carolina, Chapel Hill, NC, USA

    RICHARD GELBER

    Dana Farber Cancer Institute, Boston, MA, USA

    STEVEN LIPSHULTZ

    University of Miami School of Medicine, Miami, FL, USA

    SUMMARY

    In many observational studies, individuals are measured repeatedly over time, although not necessarily at a set of pre-specified occasions. Instead, individuals may be measured at irregular intervals, with those having a history of poorer health outcomes being measured with somewhat greater frequency and regularity. In this paper, we consider likelihood-based estimation of the regression parameters in marginal models for longitudinal binary data when the follow-up times are not fixed by design, but can depend on previous outcomes. In particular, we consider assumptions regarding the follow-up time process that result in the likelihood function separating into two components: one for the follow-up time process, the other for the outcome measurement process. The practical implication of this separation is that the follow-up time process can be ignored when making likelihood-based inferences about the marginal regression model parameters. That is, maximum likelihood (ML) estimation of the regression parameters relating the probability of success at a given time to covariates does not require that a model for the distribution of follow-up times be specified. However, to obtain consistent parameter estimates, the multinomial distribution for the vector of repeated binary outcomes must be correctly specified. In general, ML estimation requires specification of all higher-order moments and the likelihood for a marginal model can be intractable except in cases where the number of repeated measurements is relatively small. To circumvent these difficulties, we propose a pseudolikelihood for estimation of the marginal model parameters. The pseudolikelihood uses a linear approximation for the conditional distribution of the response at any occasion, given the history of previous responses. The appeal of this approximation is that the conditional distributions are functions of the first two moments of the binary responses only. When the follow-up times depend only on the previous outcome, the pseudolikelihood requires correct specification of the conditional distribution of the current outcome given the outcome at the previous occasion only. Results from a simulation study and a study of asymptotic bias are presented. Finally, we illustrate the main results using data from a longitudinal observational study that explored the cardiotoxic effects of doxorubicin chemotherapy for the treatment of acute lymphoblastic leukemia in children.

    To whom correspondence should be addressed.

    © The Author 2006. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: [email protected].

  • 470 G. M. FITZMAURICE ET AL.

    Keywords: Follow-up time process; Generalized estimating equations; Maximum likelihood; Multinomial distribution; Pseudolikelihood.

    1. INTRODUCTION

    In many observational studies, individuals are followed over time, and the outcome variable of interest is measured repeatedly at different follow-up times. Unlike designed longitudinal studies, individuals are not necessarily measured at a set of common time points specified by the study protocol, but instead are measured at irregular intervals. Furthermore, the follow-up times may depend on the previous outcome measures. For example, consider a recent longitudinal observational study (Lipshultz et al., 1995) that explored the cardiotoxic effects of doxorubicin chemotherapy for the treatment of acute lymphoblastic leukemia in childhood. In this study, 111 children with acute lymphoblastic leukemia were treated according to one of five protocols that included doxorubicin in total doses ranging from 45 to 550 mg per square meter of body-surface area. Although doxorubicin has proven successful in curing the leukemia (Schorin et al., 1994), long-term survivors of childhood cancer often develop progressive abnormalities of the heart (Lipshultz et al., 1991, 1995). In this study, the median time between the initial and final follow-up visit was 6 years. The primary outcome of interest is a binary response denoting normal or abnormal left ventricular mass, as determined by echocardiogram. Table 1 provides illustrative data from 10 of the 111 patients enrolled in the study. The majority of children (90%) had at least one follow-up echocardiogram; however, there was no explicit design for the patients to receive echocardiograms at regular intervals. Instead, the decision to have a follow-up measurement taken was at the discretion of each patient's physician. Consequently, the follow-up times are unequally spaced and the number and timing of follow-up measurements varied from individual to individual. In particular, including the initial visit, 3 patients (2.7%) were assessed seven times, 1 patient was assessed six times, 9 patients (8.1%) were assessed five times, 11 patients (9.9%) were assessed four times, 16 patients (14.4%) were assessed three times, 31 patients (27.9%) were assessed twice, and 40 patients (36%) were assessed only at the initial visit. The maximum follow-up time for an echocardiogram was 13.9 years from the initial visit.

    In this particular study, each individual has a set of random follow-up times and it is reasonable to conjecture that the follow-up times might depend on previous outcomes. That is, the physician of a patient with an abnormal left ventricular mass measurement may demand more frequent echocardiograms and at shorter follow-up intervals. For example, in this study, the median time to the first follow-up visit was 2.6 years for patients who initially had normal left ventricular mass, and 1.7 years for patients who initially had abnormal left ventricular mass (log-rank test, p-value < 0.025).

    The focus of this paper is on estimation of regression parameters in marginal models for longitudinal binary data where the follow-up times are not fixed by design, but can depend on previous outcomes. We discuss assumptions about the distribution of follow-up times that result in the likelihood function for the observed data separating into two components: one for the follow-up time process, the other for the outcome measurement process. The practical implication of this separation of the likelihood function is that the former process can be ignored when making likelihood-based inferences about the latter. That is, maximum likelihood (ML) estimation of the regression parameters does not require that a model for the distribution of follow-up times be specified. If the follow-up time process is considered to be a form of selection, then there are very close parallels with ignorable missing data mechanisms (Little and Rubin, 1987). However, to obtain consistent regression parameter estimates, the multinomial distribution for the vector of repeated binary outcomes must be correctly specified. ML estimation requires specification of all higher-order moments and it is prohibitively difficult to compute the likelihood function for marginal models except in cases where the number of repeated measurements is small. When the likelihood is intractable, one alternative is the generalized estimating equations (GEE) approach (Liang and Zeger, 1986).

  • Estimation in regression models for longitudinal binary data 471

    Table 1. Selected data on 10 patients from the longitudinal study of cardiotoxicity

    Patient   Time (years)      Wall        Dose      Age at end     Gender
              since treatment   thickness   (in mg)   of treatment
    1          7.281            0           405        3.3           M
    1          8.849            0           405        3.3           M
    1          9.703            0           405        3.3           M
    1         10.112            0           405        3.3           M
    1         11.144            0           405        3.3           M
    1         11.770            0           405        3.3           M
    2          7.514            0           440        4.8           F
    2          8.400            0           440        4.8           F
    2          9.862            0           440        4.8           F
    2         10.997            1           440        4.8           F
    2         11.906            1           440        4.8           F
    2         13.003            1           440        4.8           F
    2         13.256            0           440        4.8           F
    3          2.064            1           280       15.7           M
    3          2.175            1           280       15.7           M
    4          5.270            0           294        2.7           F
    4          7.863            1           294        2.7           F
    4         11.886            1           294        2.7           F
    5          4.300            0           335        7.6           M
    5         11.447            1           335        7.6           M
    6          9.122            1           450        3.2           F
    6         10.501            0           450        3.2           F
    6         12.410            1           450        3.2           F
    7          9.892            1           446        3.8           M
    7         12.236            0           446        3.8           M
    8          6.755            0           330        5.2           F
    8          8.197            0           330        5.2           F
    8          9.966            0           330        5.2           F
    8         10.929            1           330        5.2           F
    8         12.199            1           330        5.2           F
    8         13.203            0           330        5.2           F
    9          3.012            0           323       15.5           F
    9          4.592            0           323       15.5           F
    9          6.799            1           323       15.5           F
    10         2.229            0           337        9.4           M
    10         8.715            1           337        9.4           M
    10        10.643            0           337        9.4           M


    However, when the follow-up times depend on previous outcomes, the standard GEE approach yields biased parameter estimates (Lipsitz et al., 2002). To circumvent problems with intractable likelihoods for marginal models and the potential bias of standard GEE methods, we propose a pseudolikelihood to estimate the regression parameters when the follow-up times depend on previous outcomes. The pseudolikelihood uses a linear approximation to the conditional distribution of the response at any occasion, given the history of previous responses. The appeal of this approximation is that the conditional distributions are functions of the first two moments of the binary responses only. Recently, Lin et al. (2004) have proposed a weighted GEE approach for handling outcome-dependent follow-up that also avoids specification of the joint distribution for the vector of repeated outcomes. In their weighted GEE, contributions to the estimating equations are weighted by the inverse of the intensity-of-visit process. The weighted GEE approach makes weaker assumptions about the follow-up time process but requires correct specification of a model for the follow-up time process. In addition, it can incorporate auxiliary, internal time-dependent covariates into the model for the follow-up time process. In contrast, the proposed pseudolikelihood estimator makes stronger assumptions about the follow-up time process but does not require specification of a model for the follow-up time process.

    In the following, we refer to the model for the outcome of interest as the "measurement model". Furthermore, given that the follow-up times are generated from some underlying distribution, we also refer to the "follow-up time process". In Section 2, we introduce some notation and consider models for the measurement and follow-up time processes. We discuss assumptions regarding the follow-up time process and consider likelihood-based inferences about the measurement model. A pseudolikelihood approach is proposed to circumvent difficulties with intractable likelihoods for marginal models for longitudinal binary data. In Section 3, we demonstrate how misspecification of the model for the within-subject association among the repeated binary responses will, in general, result in regression parameter estimates that are asymptotically biased. The results of a simulation study, assessing the potential magnitude of the bias in finite samples, and also comparing estimators in terms of variance and coverage probabilities, are presented in Section 3. Finally, we illustrate the main results using data from the longitudinal study of the cardiotoxic effects of doxorubicin chemotherapy introduced earlier.

    2. MODELS FOR MEASUREMENT AND FOLLOW-UP PROCESSES

    2.1 Notation

    Suppose that $n$ independent individuals are to be followed up intermittently over an interval $[0, \tau]$, where the constant $\tau > 0$ is determined with the knowledge that outcome measurements could potentially be observed up to time $\tau$. It is assumed that the outcome at time $t$, denoted by $Y_i(t)$, is binary, i.e. $Y_i(t)$ equals 1 if the $i$th individual has response 1 (say "success") at time $t$, and 0 otherwise. Associated with $Y_i(t)$, each individual has a $(J \times 1)$ covariate vector $X_i(t)$ which can include both time-stationary and time-varying covariates.

    A logistic regression model is assumed for the marginal mean of $Y_i(t)$,

    \[ p_i(t) = E\{Y_i(t) \mid X_i(t)\} = \mathrm{pr}\{Y_i(t) = 1 \mid X_i(t)\} = \frac{\exp\{X_i(t)'\beta\}}{1 + \exp\{X_i(t)'\beta\}}, \tag{2.1} \]

    where $\beta$ is the vector of logistic regression parameters. Although a logit link function is adopted, in principle any suitable link function can be used. Note that if $X_i(t)$ includes time-varying covariates, they are assumed to be external covariates in the sense described by Kalbfleisch and Prentice (1980). Specifically, it is assumed that any time-varying covariate process at time $t$ is conditionally independent of all previous responses, given the past history of the covariate process. When this assumption is violated, the validity and interpretability of the model given by (2.1) is questionable (see, for example, Fitzmaurice et al., 1993; Pepe and Anderson, 1994; Robins et al., 1999). Since $Y_i(t)$ is binary, the variance function over time is determined completely by the marginal mean, i.e.

    \[ \mathrm{Var}\{Y_i(t)\} = p_i(t)\{1 - p_i(t)\}. \tag{2.2} \]

    We denote the correlation function by

    \[ \rho_i(u, t) = \mathrm{Corr}\{Y_i(t), Y_i(t - u)\}, \tag{2.3} \]

    and it depends upon the higher-order moments; this is specified in terms of an additional set of parameters, $\alpha$.
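    The three ingredients of the measurement model, (2.1) to (2.3), can be sketched numerically as follows. This is only an illustration: the coefficient values, the covariate layout, and the exponential form chosen for the correlation function are assumptions made for the example, not values taken from the study.

```python
import numpy as np

def marginal_prob(x, beta):
    """Marginal success probability p_i(t) under the logistic model (2.1)."""
    eta = np.dot(x, beta)
    return 1.0 / (1.0 + np.exp(-eta))

def binary_variance(p):
    """Variance function (2.2): p(1 - p)."""
    return p * (1.0 - p)

def exponential_corr(u, alpha):
    """One possible correlation function for (2.3): alpha ** |u| (an assumed form)."""
    return alpha ** np.abs(u)

# Hypothetical covariate vector (intercept, group, time, group x time)
# and hypothetical coefficients; neither comes from the paper.
beta = np.array([-1.0, 0.5, 0.2, 0.1])
x = np.array([1.0, 1.0, 2.0, 2.0])

p = marginal_prob(x, beta)
print(p, binary_variance(p), exponential_corr(1.5, 0.4))
```

    Because the response is binary, only the mean and the correlation need separate models; the variance follows from the mean.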

    Next, let $N_i(t)$ denote a counting process for the number of follow-up measurements on the $i$th individual by continuous time point $t \geq 0$. The indicator random variable $dN_i(t)$ equals 1 if a follow-up measurement is obtained on the $i$th individual at follow-up time $t$, and equals 0 otherwise. Note that all the information about the times of follow-up is contained in the counting process $N_i(t)$, or equivalently, in $dN_i(t)$. Thus, the $i$th individual ($i = 1, \ldots, n$) has a set of binary responses at $M_i = N_i(\tau)$ random follow-up times.

    Finally, we consider the distribution of the follow-up times and their dependence on the outcome measurement function $\{Y_i(t)\}$. Let $\bar{Y}_i(t) = \{Y_i(s): 0 \leq s \leq t\}$ and $\bar{N}_i(t) = \{N_i(s): 0 \leq s \leq t\}$ denote the history of the outcome measurement and the follow-up time process through time $t$; further, let $\bar{X}_i(t) = \{X_i(s): 0 \leq s \leq t\}$. Let $\bar{Y}_i^O(t) = \{Y_i(s): dN_i(s) = 1, 0 \leq s \leq t\}$ denote the history of the observed outcome measurement process through time $t$. Also, let $\underline{Y}_i(t) = \{Y_i(s): t \leq s \leq \tau\}$. Next, we assume that the conditional distribution of $dN_i(t)$, given $\bar{N}_i(t)$ and $\bar{Y}_i(t)$, depends only on the history of the observed outcome measurements at the previous times, and perhaps on $\bar{X}_i(t)$. For example, patients with a history of poor health outcomes on previous visits may be expected to have shorter times of follow-up. Specifically, we assume that

    \[ dN_i(t) \perp \underline{Y}_i(t) \mid \bar{N}_i(t), \bar{Y}_i^O(t), \bar{X}_i(t). \tag{2.4} \]

    Thus, (2.4) implies that follow-up at time $t$ depends only on the previously observed data (i.e. the previously observed outcomes and the times at which they were observed). Equation (2.4) is identifiable only to the extent that it asserts other possible endogenous covariates, say $W$, do not affect the timing of the measurements. In principle, the conditional distribution of the follow-up times can follow any time-to-event distribution. For reasons that will become apparent in Section 2.2, we do not consider particular families of distributions for $dN_i(t)$.

    2.2 ML estimation

    Recall that in most longitudinal studies the parameters of primary interest are $\beta$, and sometimes $\alpha$; the parameters of the follow-up time distribution, say $\gamma$, are usually regarded as nuisance parameters. Indeed, in many longitudinal studies, both $\alpha$ and $\gamma$ might be regarded as nuisance parameters. In this section, we discuss estimation of $(\beta, \alpha)$.

    The assumption about the conditional distribution of the follow-up times given by (2.4) implies the following conditional independence relationship for the distribution of the outcome measurement process at time $t$,

    \[ Y_i(t) \perp \bar{N}_i(t) \mid \bar{Y}_i^O(t), \bar{X}_i(t), \tag{2.5} \]

    which follows by expressing

    \[ f(Y(t) \mid \bar{Y}^O(t), \bar{N}(t), \bar{X}(t)) = \frac{f(\bar{N}(t) \mid Y(t), \bar{Y}^O(t), \bar{X}(t)) \, f(Y(t), \bar{Y}^O(t) \mid \bar{X}(t))}{f(\bar{N}(t) \mid \bar{Y}^O(t), \bar{X}(t)) \, f(\bar{Y}^O(t) \mid \bar{X}(t))} \]

    and substituting from (2.4). That is, the conditional distribution of the outcome measurement process at time $t$, given the history of the observed outcome measurement process and the visit history, is the same for those individuals who were followed up as for those who were not. Note, however, that $f(Y(t) \mid \bar{N}(t), \bar{X}(t)) \neq f(Y(t) \mid \bar{X}(t))$ due to the assumed dependence of visit times on the history of the observed outcome measurement process.

    Since (2.5) holds under the assumption about the conditional distribution of the follow-up time process given by (2.4), the log-likelihood for the observed data is

    \[ \ell(\beta, \alpha, \gamma) = \sum_{i=1}^{n} \int_0^{\tau} \log\{ f(dN_i(t) \mid \bar{N}_i(t), \bar{Y}_i^O(t), \bar{X}_i(t); \gamma) \} \, dN_i(t) + \sum_{i=1}^{n} \int_0^{\tau} \log\{ f(Y_i(t) \mid \bar{Y}_i^O(t), \bar{X}_i(t); \beta, \alpha) \} \, dN_i(t). \tag{2.6} \]

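    Assuming that $\gamma$ and $(\beta, \alpha)$ are separable, the ML estimate of $(\beta, \alpha)$ is obtained by maximizing the second term on the right-hand side of (2.6) and solving

    \[ s(\hat{\beta}, \hat{\alpha}) = \left\{ \frac{\partial \ell(\beta, \alpha, \gamma)}{\partial (\beta, \alpha)} \right\}_{(\beta = \hat{\beta}, \alpha = \hat{\alpha})} = \frac{\partial}{\partial (\beta, \alpha)} \left\{ \sum_{i=1}^{n} \int_0^{\tau} \log\{ f(Y_i(t) \mid \bar{Y}_i^O(t), \bar{X}_i(t); \beta, \alpha) \} \, dN_i(t) \right\}_{(\beta = \hat{\beta}, \alpha = \hat{\alpha})} = 0, \tag{2.7} \]

    for $(\hat{\beta}, \hat{\alpha})$.

    Thus, given (2.4) and the assumption that $\gamma$ and $(\beta, \alpha)$ are functionally distinct sets of parameters, $(\beta, \alpha)$ can be estimated ignoring the distribution of the follow-up times by maximizing

    \[ \ell(\beta, \alpha) = \sum_{i=1}^{n} \int_0^{\tau} \log\{ f(Y_i(t) \mid \bar{Y}_i^O(t), \bar{X}_i(t); \beta, \alpha) \} \, dN_i(t). \tag{2.8} \]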

    Using standard results from likelihood theory, and under suitable regularity conditions, it can be shown that the ML estimate $(\hat{\beta}, \hat{\alpha})$ is consistent and asymptotically multivariate normal, with asymptotic covariance matrix approximated by the inverse of the Fisher information matrix.

    To formulate the likelihood, we must specify the conditional distributions

    \[ f(Y_i(t) \mid \bar{Y}_i^O(t), \bar{X}_i(t); \beta, \alpha). \tag{2.9} \]

    In general, this requires specification of the success probabilities (2.1) and correlations (2.3), in addition to higher-order moments. In principle, given $(\beta, \alpha)$, specification of these higher-order moments is relatively straightforward (e.g. using the Bahadur, 1961, representation for the multinomial distribution). However, because the number of higher-order moments increases rapidly with the number of repeated measurements, in practice, it is prohibitively difficult to compute the likelihood function for marginal models except in cases where the number of repeated measurements is relatively small.

    2.3 Pseudo ML estimation

    Although ML estimation yields consistent parameter estimates when the follow-up times depend on previous outcomes, it does require that the multinomial distribution for the vector of repeated binary outcomes be correctly specified. Because the likelihood is intractable except when $M_i$ is small, we propose a pseudolikelihood (PL) approach to circumvent these difficulties. The proposed pseudolikelihood is a modification of the likelihood to reduce its complexity and is based on approximating the conditional distributions in (2.8). Specifically, the proposed pseudo log-likelihood is

    \[ \ell_p(\beta, \alpha) = \sum_{i=1}^{n} \int_0^{\tau} [Y_i(t) \log\{\mu_i(t)\} + \{1 - Y_i(t)\} \log\{1 - \mu_i(t)\}] \, dN_i(t), \tag{2.10} \]

    where $\mu_i(t) = \mathrm{pr}(Y_i(t) = 1 \mid \bar{Y}_i^O(t), \bar{X}_i(t); \beta, \alpha)$ is approximated by

    \[ \mu_i(t) \approx p_i(t) + \mathrm{Cov}\{Y_i(t), \bar{Y}_i^O(t)\} [\mathrm{Var}\{\bar{Y}_i^O(t)\}]^{-1} [\bar{Y}_i^O(t) - E\{\bar{Y}_i^O(t) \mid \bar{X}_i(t)\}], \tag{2.11} \]

    where $p_i(t)$ is the marginal probability at time $t$ given by (2.1). This is a "Gaussian" approximation for $\mu_i(t)$, with $\mu_i(t)$ taking the same form as the conditional mean from the multivariate normal distribution. Thus, the pseudolikelihood uses a linear approximation for the conditional distribution of the response at any occasion, given the history of previous responses. The appeal of this approximation is that the conditional distributions are functions of the first two moments of the binary responses only.
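    The linear approximation (2.11) needs only marginal probabilities and a covariance matrix, so it can be sketched in a few lines. The function names, the visit times, the probabilities, and the exponential correlation used to build the covariance matrix are assumptions chosen for illustration.

```python
import numpy as np

def covariance_from_corr(p, times, alpha):
    """Covariance matrix of binary responses with marginal probabilities p
    and an assumed exponential correlation alpha ** |t_j - t_k|."""
    p = np.asarray(p, dtype=float)
    times = np.asarray(times, dtype=float)
    sd = np.sqrt(p * (1.0 - p))
    corr = alpha ** np.abs(times[:, None] - times[None, :])
    return corr * np.outer(sd, sd)

def linear_conditional_mean(p_t, p_hist, y_hist, cov_t_hist, var_hist):
    """Linear approximation (2.11):
    p_t + Cov(Y_t, hist) Var(hist)^{-1} (y_hist - p_hist)."""
    resid = np.asarray(y_hist, dtype=float) - np.asarray(p_hist, dtype=float)
    # Var(hist) is symmetric, so solving Var(hist) w = Cov gives the weights.
    weights = np.linalg.solve(np.asarray(var_hist, dtype=float),
                              np.asarray(cov_t_hist, dtype=float))
    return float(p_t + weights @ resid)

# Hypothetical subject: three visits, approximate the mean at the third visit
# given the two earlier observed responses (1, then 0).
times = [0.0, 1.0, 2.5]
p = [0.3, 0.4, 0.5]
S = covariance_from_corr(p, times, alpha=0.4)
mu = linear_conditional_mean(p[2], p[:2], [1, 0], S[2, :2], S[:2, :2])
print(mu)
```

    With a single previous response the linear form reproduces the exact conditional probability, which is the bivariate case; with longer histories it is only an approximation and is not guaranteed to stay inside (0, 1).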

    An interesting special case of this pseudolikelihood arises when an exponential correlation, $\rho_i(u, t) = \alpha^{|u|}$, is assumed. Then, under (2.11), the conditional distribution of $Y_i(t)$, given the history of previous responses, $\bar{Y}_i(t)$, depends only on the most recently observed value of the response prior to $t$, $Y_i^O(t) = Y_i(\max\{s: 0 \leq s < t, dN_i(s) = 1\})$. This is a first-order Markov-type dependence, and it can be shown that the pseudo log-likelihood,

    \[ \sum_{i=1}^{n} \int_0^{\tau} \log\{ f(Y_i(t) \mid Y_i^O(t), \bar{X}_i(t); \beta, \alpha) \} \, dN_i(t), \tag{2.12} \]

    yields consistent estimators of $(\beta, \alpha)$ under the following conditional independence assumption,

    \[ dN_i(t) \perp Y_i(t) \mid Y_i^O(t), \bar{X}_i(t). \tag{2.13} \]

    Assumption (2.13) is stronger than assumption (2.4) and is not verifiable from the data at hand (see Appendix) unless further parametric modeling assumptions are made [e.g. with additional modeling assumptions of the kind discussed in Diggle and Kenward, 1994, assumption (2.13) becomes weakly testable]. It implies that the conditional distribution of the outcome measurement process at time $t$, given the previous observed outcome, is the same for those individuals who were followed up at time $t$ as for those who were not. Thus, when assumption (2.13) holds, maximization of (2.12) yields pseudo maximum likelihood estimates (PMLEs), $(\hat{\beta}, \hat{\alpha})$, which are consistent (see Appendix) provided the true conditional distribution $f(Y_i(t) \mid Y_i^O(t), \bar{X}_i(t); \beta, \alpha)$ has been correctly specified in (2.12), even though

    \[ f(Y_i(t) \mid \bar{Y}_i^O(t), \bar{X}_i(t); \beta, \alpha) \neq f(Y_i(t) \mid Y_i^O(t), \bar{X}_i(t); \beta, \alpha). \]

    When the true conditional distribution does satisfy the first-order Markov-type assumption,

    \[ f(Y_i(t) \mid \bar{Y}_i^O(t), \bar{X}_i(t); \beta, \alpha) = f(Y_i(t) \mid Y_i^O(t), \bar{X}_i(t); \beta, \alpha), \]

    then the PMLE is the ML estimate of $(\beta, \alpha)$.
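    Under the exponential correlation, the linear form (2.11) reduces to a one-step update from the most recent observed response, so one subject's pseudo log-likelihood contribution can be sketched as below. Treating the first visit through its marginal probability and clipping the conditional mean away from 0 and 1 are assumptions made for this illustration, as are the example data.

```python
import numpy as np

def first_order_pseudo_loglik(y, times, p, alpha):
    """Pseudo log-likelihood contribution of one subject under first-order
    dependence with correlation alpha ** |u| between consecutive visits.

    For visit k > 0 with previous visit s, the conditional mean is
    p_k + alpha**|t_k - t_s| * sqrt(v_k / v_s) * (y_s - p_s), v = p(1 - p).
    """
    y = np.asarray(y, dtype=float)
    times = np.asarray(times, dtype=float)
    p = np.asarray(p, dtype=float)
    v = p * (1.0 - p)
    mu = p.copy()  # first visit: marginal probability (assumption)
    for k in range(1, len(y)):
        rho = alpha ** abs(times[k] - times[k - 1])
        mu[k] = p[k] + rho * np.sqrt(v[k] / v[k - 1]) * (y[k - 1] - p[k - 1])
    # The linear form is not constrained to (0, 1); clip as a numerical guard.
    mu = np.clip(mu, 1e-10, 1.0 - 1e-10)
    return float(np.sum(y * np.log(mu) + (1.0 - y) * np.log(1.0 - mu)))

# Hypothetical subject with three irregularly spaced visits.
print(first_order_pseudo_loglik([1, 0, 1], [0.0, 1.0, 2.5], [0.3, 0.4, 0.5], 0.4))
```

    In a full fit, `p` would be computed from the logistic model (2.1) at each visit time and the sum of these contributions over subjects would be maximized jointly in the regression and correlation parameters.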

    Next, for the more general case given by (2.10) and (2.11), we derive the pseudoscore vector denoted by

    \[ s_p(\beta, \alpha) = \sum_{i=1}^{n} s_{pi}(\beta, \alpha) = \frac{\partial \ell_p}{\partial (\beta, \alpha)} = \sum_{i=1}^{n} \left[ \int_0^{\tau} D_i(t) \, v_i(t)^{-1} \{Y_i(t) - \mu_i(t)\} \, dN_i(t) \right], \]

    where $\mu_i(t) = \mu_i(t; \beta, \alpha) = E[Y_i(t) \mid \bar{Y}_i^O(t), \bar{X}_i(t); \beta, \alpha]$ is specified by (2.11), $D_i(t) = D_i(t; \beta, \alpha) = \partial \mu_i(t) / \partial (\beta, \alpha)$, and $v_i(t) = v_i(t; \beta, \alpha) = \mathrm{Var}[Y_i(t) \mid \bar{Y}_i^O(t), \bar{X}_i(t); \beta, \alpha] = \mu_i(t)(1 - \mu_i(t))$. We note that PL estimators can be derived from (2.10) and (2.11) under more general assumptions about the correlation structure, e.g. second-order antedependence. The PMLE $(\hat{\beta}, \hat{\alpha})$ is obtained as the solution to $s_p(\hat{\beta}, \hat{\alpha}) = 0$, e.g. using the Newton–Raphson algorithm. However, in general, the inverse of the pseudo-Fisher information matrix does not yield a valid estimate of the asymptotic variance. Instead, a so-called "empirical" or "robust" estimator of variance is required. The appropriate adjustment takes the usual form of the "sandwich" estimator,

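    \[ n^{1/2} [(\hat{\beta}, \hat{\alpha}) - (\beta, \alpha)] \xrightarrow{L} N(0, \Sigma), \]

    where

    \[ \Sigma = \left[ \frac{1}{n} E\left\{ \frac{\partial s_p(\beta, \alpha)}{\partial (\beta, \alpha)} \right\} \right]^{-1} \left[ \frac{1}{n} \sum_{i=1}^{n} E\{ [s_{pi}(\beta, \alpha)] [s_{pi}(\beta, \alpha)]' \} \right] \left[ \frac{1}{n} E\left\{ \frac{\partial s_p(\beta, \alpha)}{\partial (\beta, \alpha)} \right\} \right]^{-1}. \tag{2.14} \]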

    The variance estimate is obtained by replacing $(\beta, \alpha)$ with $(\hat{\beta}, \hat{\alpha})$ in (2.14).

    It must be recognized that the PL estimator of $\beta$ (and $\alpha$) is consistent only if it can be shown that $E[s_p(\beta, \alpha)] = 0$ at the true $\beta$ and $\alpha$. Under the assumption about the conditional distribution of the follow-up times given by (2.4), $s_p(\beta, \alpha)$ has mean equal to 0 at the true $\beta$ and $\alpha$ provided the conditional mean $\mu_i(t; \beta, \alpha)$ is a linear function of the previous responses. In the bivariate case, the linear approximation of $\mu_i(t)$ can be shown to be exact provided that the covariance matrix has been correctly specified. However, in the multivariate case this no longer holds and the adequacy of the approximation will depend on two factors: (i) how well $\mu_i(t)$ is approximated by a linear function of the previous responses and (ii) how closely $\mathrm{Cov}(Y_i(t), \bar{Y}_i^O(t))$ approximates the true underlying covariance of the responses. We examine the adequacy of the approximation in Section 3 with studies of asymptotic and finite-sample bias.
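    The empirical version of the sandwich form (2.14) can be sketched generically. The function below assumes the per-subject pseudoscore contributions and the derivative of the summed pseudoscore have already been evaluated at the estimate; the function name and the toy Bernoulli example are illustrative assumptions, not part of the paper.

```python
import numpy as np

def sandwich_covariance(scores, hessian):
    """Empirical sandwich estimate of Cov(theta_hat).

    scores  : (n, q) array, row i is subject i's pseudoscore at the estimate.
    hessian : (q, q) array, derivative of the summed pseudoscore at the estimate.

    Returns H^{-1} (sum_i s_i s_i') H^{-T}, which corresponds to Sigma / n
    with the expectations in (2.14) replaced by sample quantities.
    """
    scores = np.asarray(scores, dtype=float)
    bread = np.linalg.inv(np.asarray(hessian, dtype=float))
    meat = scores.T @ scores
    return bread @ meat @ bread.T

# Toy check with an estimating equation for a single Bernoulli mean:
# s_i(p) = y_i - p, so the summed score has derivative -n.
y = np.array([1.0, 0.0, 0.0, 1.0, 1.0])
p_hat = y.mean()
V = sandwich_covariance((y - p_hat)[:, None], np.array([[-float(len(y))]]))
print(V)
```

    For the pseudolikelihood, each row of `scores` would be the integral of $D_i(t) v_i(t)^{-1} \{Y_i(t) - \mu_i(t)\}$ over subject $i$'s visit times.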

    Finally, we note that although the proposed PL estimator requires a Gaussian approximation of the conditional mean $\mu_i(t)$, $\beta$ and $\alpha$ are estimated jointly by maximizing (2.10). An alternative approach is to make a Gaussian approximation for $\mu_i(t)$ and, in addition, decouple estimation of $\beta$ from the estimation of $\alpha$ (see, for example, Hand and Crowder, 1996). This is precisely the approach taken in the modified GEE based on Gaussian estimation, as suggested by Lipsitz et al. (1992), Lee et al. (1999, unpublished manuscript), and Lipsitz et al. (2000). The modified GEE uses the multivariate normal estimating equations for estimation of the correlation parameters, $\alpha$, combined with the standard GEE for the marginal mean regression parameters, $\beta$, thereby decoupling estimation of $\beta$ from estimation of $\alpha$. The use of the multivariate normal estimating equations for $\alpha$ ensures that the estimated correlation matrix is non-negative definite, while avoiding full specification of the multinomial joint distribution of the responses. Lipsitz et al. (2000) have shown that this modified GEE yields little bias when there are missing data that are assumed to be missing at random. As mentioned earlier, if the follow-up time process is considered to be a form of selection, then there are very close parallels with missing at random missing data mechanisms (Little and Rubin, 1987). In Sections 3 and 4, we explore the properties of the proposed PL estimator and compare it to the standard and modified (or decoupled) GEE estimators of the regression parameters in marginal models.

    3. STUDIES OF BIAS

    In this section, we consider the asymptotic and finite-sample bias resulting from the different methods when follow-up times are random and depend on previous outcomes. Our main concern is with the potential bias in estimating $\beta$. We consider the PL estimator and compare it to both the standard and modified (or decoupled) GEE estimators (Lipsitz et al., 2000), with possible bias arising because of the nature of the follow-up mechanism.


    3.1 Study of asymptotic bias

    We consider a two-group longitudinal design configuration with a binary response measured on four occasions. We assume that half of the individuals are assigned to each of the two groups, i.e. the group indicator variable, $x_i$, equals 0 or 1 with $\mathrm{pr}(x_i = 1) = 0.5$. The marginal model for the mean of $Y_i(t)$ is

    \[ p_i(t) = E[Y_i(t) \mid x_i] = \frac{\exp[\beta_0 + \beta_1 x_i + \beta_2 t + \beta_3 x_i t]}{1 + \exp[\beta_0 + \beta_1 x_i + \beta_2 t + \beta_3 x_i t]}, \tag{3.15} \]

    and an autoregressive correlation is assumed,

    \[ \rho_i(u, t) = \mathrm{Corr}[Y_i(t), Y_i(t - u) \mid x_i] = \alpha^{|u|}, \tag{3.16} \]

    for $\alpha = 0.2$ or $0.4$.

    It is assumed that each individual is seen four times, with the first time point set equal to $T_{i1} = 0$. A detailed description of the specification of the joint distribution of the responses and the follow-up times can be found as supplementary material available at Biostatistics online (www.biostatistics.oupjournals.org). We specify the conditional distributions of the follow-up times as "first-order", "second-order", or "third-order", depending on how many of the previous outcomes the follow-up time depends on. Finally, the distribution for the measurement model is specified through a series of conditional distributions, given the past history of responses, via the Bahadur representation (Bahadur, 1961); the Bahadur representation of those distributions is specified such that (3.15) and (3.16) hold. We formulate two sets of conditional distributions. The first set is the "first-order dependence" model, in which the distribution of the outcome at a given time depends only on the previous outcome; a Markov-type dependence. The second set is the "full Bahadur" model, in which the distribution of the outcome at a given time depends on all previous outcomes. A more detailed description can be found as supplementary material available at Biostatistics online.

    In the study of asymptotic bias, we fix $(\beta_0, \beta_1, \beta_2, \beta_3) = (1, 1, 1, 1)$ and set $\alpha = 0.2$ or $0.4$. Thus, we specify two models for the outcome distribution, "full Bahadur" and "first-order dependence", and three models for the follow-up time distribution, "first-order", "second-order", and "third-order".

    Letting $\hat{\beta}$ denote an estimate from one of the methods described in Section 2.3, $\hat{\beta} \xrightarrow{P} \beta^*$, where $\beta^*$ is not necessarily equal to $\beta$. Here we are primarily interested in assessing the asymptotic biases, $(\beta^* - \beta)$. Following Rotnitzky and Wypij (1994), the asymptotic bias can be ascertained by simply considering an artificial sample comprised of one suitably weighted observation for each possible realization of the outcomes and follow-up times. Then, we can solve for $\beta^*$ in the usual way, except that each pseudoindividual's contribution to the estimating equations is weighted by its respective probability. We examine the asymptotic bias of the various estimators of $\beta$, and explore how different specifications of the working correlation with GEE can affect the asymptotic bias. The three working correlation structures considered are independence [$\rho_i(u, t) = 0$], exchangeable [$\rho_i(u, t) = \alpha$], and the correct model, first-order autoregressive.
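    The weighted pseudo-sample idea can be illustrated in a deliberately small setting. The sketch below is not the four-occasion design of this section: it assumes two occasions, a second measurement that is taken with a probability depending on the first outcome, and a naive working-independence estimate of the second-occasion mean. All numerical values are hypothetical.

```python
import itertools
import numpy as np

def joint_prob(y1, y2, p1, p2, rho):
    """Bivariate Bernoulli probability with marginals p1, p2 and correlation rho."""
    cov = rho * np.sqrt(p1 * (1 - p1) * p2 * (1 - p2))
    p11 = p1 * p2 + cov
    table = {(1, 1): p11, (1, 0): p1 - p11,
             (0, 1): p2 - p11, (0, 0): 1 - p1 - p2 + p11}
    return table[(y1, y2)]

def limiting_naive_mean(p1, p2, rho, visit_prob):
    """Limit of the naive estimate of pr(Y2 = 1) when the second measurement
    is taken with probability visit_prob[y1].

    Each possible realization (y1, y2, observed) enters the estimating
    equation once, weighted by its probability.
    """
    num = den = 0.0
    for y1, y2 in itertools.product((0, 1), repeat=2):
        w = joint_prob(y1, y2, p1, p2, rho) * visit_prob[y1]
        num += w * y2
        den += w
    return num / den

p2 = 0.4
limit = limiting_naive_mean(0.3, p2, 0.4, {0: 0.3, 1: 0.9})
print("percent relative bias:", 100.0 * (limit - p2) / p2)
```

    When the visit probability does not depend on the first outcome, or the two responses are uncorrelated, the limit equals the true marginal probability and the bias vanishes.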

    Table 2 displays the asymptotic bias for the group × time interaction effect ($\beta_3$), the effect usually of most interest in a longitudinal study. It can be seen that the PL estimator is unbiased under a first-order dependence measurement model, regardless of the nature of the follow-up time process. The PL estimator has relatively small bias (


    Table 2. Results from the study of asymptotic bias: percent relative bias, $\{(\beta_3^* - \beta_3)/\beta_3\} \times 100\%$, for the group × time interaction effect ($\beta_3 = 1$)

    Measurement    Follow-up      Correlation      GEE-IND   GEE-EXCH   GEE-AR1   PMLE
    model          model          parameter (α)
    first-order    first-order    0.2                8.77      7.82      0.41     0.00
                                  0.4               16.01     13.21      0.27     0.00
    first-order    second-order   0.2               12.72     10.54      3.92     0.00
                                  0.4               23.24     17.23      2.44     0.00
    first-order    third-order    0.2               12.09      9.97      3.68     0.00
                                  0.4               22.39     16.47      2.48     0.00
    Full Bahadur   first-order    0.2                9.73      8.75      0.91     0.92
                                  0.4               15.98     13.08      0.20     0.82
    Full Bahadur   second-order   0.2               15.54     13.20      5.13     2.81
                                  0.4               26.32     19.89      3.84     4.47
    Full Bahadur   third-order    0.2               14.80     12.54      4.79     2.69
                                  0.4               23.47     17.14      3.24     2.62

    PMLE: pseudo maximum likelihood estimate.

    model for the correlation with the modified GEE appears to reduce greatly the bias (when compared to the GEE with incorrect correlation structure). Overall, the results in Table 2 indicate that the proposed pseudolikelihood approach has little bias, even in the configurations when it might be expected to have bias (full Bahadur and second-order or third-order follow-up). In addition, the results indicate that the modified GEE requires careful modeling of the correlation structure in order to minimize bias.

    3.2 Simulation study

    To complement the study of asymptotic bias reported in Section 3.1, we conducted a simulation study using a similar design configuration. Specifically, we considered a two-group longitudinal design configuration with a binary response measured on three occasions, as opposed to four occasions in the study of asymptotic bias. We considered samples of size $n = 400$, with half of the individuals assigned to each of the two groups. The marginal model for the mean of $Y_i(t)$ is

    \[ p_i(t) = E[Y_i(t) \mid x_i] = \frac{\exp[\beta_0 + \beta_1 x_i + \beta_2 t + \beta_3 x_i t]}{1 + \exp[\beta_0 + \beta_1 x_i + \beta_2 t + \beta_3 x_i t]}, \]

    where $x_i$ is the group indicator variable and $(\beta_0, \beta_1, \beta_2, \beta_3) = (1, 1, 1, 1)$. An autoregressive correlation was assumed,

    \[ \rho_i(u) = \mathrm{Corr}[Y_i(t), Y_i(t - u) \mid x_i] = \alpha^{|u|}, \]

    for $\alpha = 0.2$, $0.4$, or $0.6$. Following the design used in Section 3.1, we specified two models for the outcome distribution, "first-order" and "full Bahadur", and two models for the follow-up distribution, "first-order" and "second-order". For each combination of the outcome and follow-up time distributions, we performed 1000 replicate simulations.
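    A first-order dependence measurement model of this kind can be simulated with a short loop. The sketch below is an assumption-laden illustration rather than the simulation code used for the study: it uses fixed measurement times, hypothetical regression coefficients, and linear conditional means, and it does not simulate the outcome-dependent follow-up process, whose specification is given only in the supplementary material.

```python
import numpy as np

def simulate_subject(x, times, beta, alpha, rng):
    """Draw one binary series with logistic marginal means and correlation
    alpha ** |u|, using a first-order (Markov-type) dependence in which the
    conditional mean is linear in the previous response."""
    times = np.asarray(times, dtype=float)
    eta = beta[0] + beta[1] * x + beta[2] * times + beta[3] * x * times
    p = 1.0 / (1.0 + np.exp(-eta))
    v = p * (1.0 - p)
    y = np.empty(len(times), dtype=int)
    y[0] = rng.random() < p[0]
    for k in range(1, len(times)):
        rho = alpha ** abs(times[k] - times[k - 1])
        mu = p[k] + rho * np.sqrt(v[k] / v[k - 1]) * (y[k - 1] - p[k - 1])
        if not 0.0 <= mu <= 1.0:
            raise ValueError("marginal means and correlation are incompatible")
        y[k] = rng.random() < mu
    return y

# Hypothetical coefficients (not the values used in the paper).
rng = np.random.default_rng(2006)
beta = np.array([-0.5, 0.3, 0.2, -0.1])
sample = np.array([simulate_subject(x, [0, 1, 2], beta, 0.4, rng)
                   for x in (0, 1) for _ in range(200)])
print(sample.mean(axis=0))
```

    Because each conditional mean is linear in the previous response, the implied correlation between responses two occasions apart is the product of the adjacent correlations, which matches the autoregressive form.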

    Tables 3, 4, and 5 display the relative bias, variance, and coverage probabilities of nominal 95% confidence intervals for the group × time interaction effect ($\beta_3$). In terms of bias (see Table 3), there is very discernible bias for the GEE under the naive assumption of independence and under the incorrectly

    Table 3. Results of the simulation study: (a) Percent relative bias of the group × time interaction effect ($\beta_3 = 1$)

    Measurement   Follow-up     Correlation           GEE-IND  GEE-EXCH  GEE-AR1  PMLE
    model         model         parameter ($\alpha$)
    first-order   first-order   0.2                        29        19        8     9
                                0.4                        93        51        3     9
                                0.6                       530       151        6     4
    first-order   second-order  0.2                        22        14        2     6
                                0.4                        99        41        3     7
                                0.6                       483       102       12     5
    Full Bahadur  first-order   0.2                        27        20        7     9
                                0.4                       116        58        2    10
                                0.6                       490       144        4     0
    Full Bahadur  second-order  0.2                        29        19        5    10
                                0.4                       115        48        4    11
                                0.6                       468       101       11     0

    Table 4. Results of the simulation study: (b) Variance of the group × time interaction effect

    Measurement   Follow-up     Correlation           GEE-IND  GEE-EXCH  GEE-AR1   PMLE
    model         model         parameter ($\alpha$)
    first-order   first-order   0.2                     1.609     0.271    0.121  0.087
                                0.4                     8.014     1.480    0.086  0.084
                                0.6                    44.014     2.795    0.039  0.017
    first-order   second-order  0.2                     0.950     0.251    0.081  0.066
                                0.4                     9.143     0.883    0.060  0.076
                                0.6                    41.636     1.353    0.028  0.017
    Full Bahadur  first-order   0.2                     1.359     0.378    0.101  0.078
                                0.4                    11.057     1.969    0.089  0.081
                                0.6                    41.805     2.810    0.041  0.017
    Full Bahadur  second-order  0.2                     1.597     0.477    0.089  0.083
                                0.4                    10.462     1.013    0.049  0.073
                                0.6                    40.864     1.465    0.030  0.014

    specified exchangeable correlation structure. In particular, the bias increases with larger values of the correlation parameter $\alpha$. For $\alpha = 0.4$, the relative bias of the working independence GEE is approximately 100%; when $\alpha = 0.6$, the relative bias exceeds 400%. For $\alpha = 0.4$, the relative bias of the working exchangeable correlation GEE is approximately 50%; when $\alpha = 0.6$, the relative bias exceeds 100%. In contrast, the finite-sample bias of the working autoregressive correlation GEE and pseudolikelihood estimates is modest, with relative bias less than 10% for almost all the design configurations. These results complement the earlier results from the study of asymptotic bias, where it was found that the proposed pseudolikelihood approach, or use of the correct correlation model with the modified GEE, greatly reduces the bias when compared to the GEE with incorrect correlation structure. The results of the simulation study suggest that the bias of the GEE with incorrect correlation structure can be very substantial in finite samples.
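    The text does not spell out how percent relative bias was computed. A minimal sketch, assuming the usual definition (mean of the replicate estimates minus the true value, divided by the true value, times 100), might look as follows; the function name and the replicate values are hypothetical.

```python
from statistics import mean

def percent_relative_bias(estimates, true_value):
    """Return 100 * (mean of estimates - true value) / true value."""
    if true_value == 0:
        raise ValueError("relative bias is undefined when the true value is 0")
    return 100.0 * (mean(estimates) - true_value) / true_value

# Hypothetical replicate estimates of a coefficient whose true value is 1.
replicates = [1.8, 2.1, 1.9, 2.2]
bias_pct = percent_relative_bias(replicates, 1.0)
```

    Under this definition, a relative bias of 100% means that the average estimate is twice the true value.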

    Table 5. Results of the simulation study: (c) Coverage probabilities for 95% confidence intervals

    Measurement   Follow-up     Correlation           GEE-IND  GEE-EXCH  GEE-AR1   PMLE
    model         model         parameter ($\alpha$)
    first-order   first-order   0.2                     96.83     94.94    91.96  95.64
                                0.4                     93.10     89.10    87.90  94.70
                                0.6                     64.40     51.70    82.90  95.00
    first-order   second-order  0.2                     95.58     94.11    92.14  96.76
                                0.4                     91.18     86.77    85.78  94.51
                                0.6                     61.72     52.84    75.77  95.21
    Full Bahadur  first-order   0.2                     97.03     96.54    92.87  97.62
                                0.4                     92.76     88.40    88.63  95.95
                                0.6                     79.50     72.10    90.08  90.25
    Full Bahadur  second-order  0.2                     96.02     94.29    92.49  96.77
                                0.4                     91.76     86.02    88.99  96.94
                                0.6                     65.44     56.11    77.95  95.73

    In terms of variance (see Table 4), the GEE under the naive assumption of independence and under the incorrectly specified exchangeable correlation structure performed poorly, especially for larger values of $\alpha$. In particular, the working independence GEE estimator was found to be very inefficient, with large increases in the variance as $\alpha$ increased from 0.2 to 0.6. In contrast, the variances of the working autoregressive correlation GEE and pseudolikelihood estimators were discernibly smaller. In terms of both variance and mean-squared error, the pseudolikelihood estimator, in general, performed somewhat better than the working autoregressive correlation GEE, although there were a few configurations where the working autoregressive correlation GEE performed better.
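    Mean-squared error is mentioned here but not tabulated. As a hedged sketch, it can be obtained from replicate estimates through the standard decomposition MSE = variance + bias squared; the helper name and the inputs below are hypothetical, and the use of the sample variance (n - 1 denominator) is an assumption.

```python
from statistics import mean, variance

def mc_variance_and_mse(estimates, true_value):
    """Return (sample variance, variance + squared bias) of replicate estimates."""
    var = variance(estimates)            # sample variance, n - 1 denominator
    bias = mean(estimates) - true_value  # absolute (not relative) bias
    return var, var + bias ** 2

# Hypothetical replicate estimates centered on the true value 1.
var_hat, mse_hat = mc_variance_and_mse([0.9, 1.1, 1.0, 1.2, 0.8], 1.0)
```

    When the estimator is unbiased, as in this toy input, the two quantities coincide; a biased estimator can have small variance and still a large mean-squared error.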

    Finally, in terms of coverage probabilities for nominal 95% confidence intervals, the pseudolikelihood estimator had adequate coverage in 11 of the 12 configurations considered; in one configuration the coverage probability was approximately 90%. Coverage probabilities for the working autoregressive correlation GEE were less impressive. In all configurations, the coverage probabilities were less than 95%, dropping below 90% in 7 of the 12 configurations. Coverage probabilities for the GEE with incorrect correlation structure were poor, with coverage probabilities decreasing with increasing $\alpha$. In particular, when $\alpha = 0.6$, coverage probabilities ranged from 51.7 to 79.5%.
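    To put "adequate coverage" in context, the sketch below computes empirical coverage from replicate estimates and the Monte Carlo standard error of a coverage estimate. It assumes Wald-type intervals (estimate ± 1.96 SE), which the text does not confirm; the function names and inputs are hypothetical.

```python
import math

def wald_coverage(estimates, std_errors, true_value, z=1.96):
    """Percent of replicates whose interval estimate +/- z*SE contains the true value."""
    hits = sum(
        1
        for est, se in zip(estimates, std_errors)
        if est - z * se <= true_value <= est + z * se
    )
    return 100.0 * hits / len(estimates)

def coverage_mc_se(nominal_pct, n_reps):
    """Monte Carlo standard error (in percent) of an estimated coverage probability."""
    p = nominal_pct / 100.0
    return 100.0 * math.sqrt(p * (1.0 - p) / n_reps)

# With 1000 replicates, a true 95% coverage is estimated with an SE of about 0.69%.
se_pct = coverage_mc_se(95.0, 1000)
```

    Under these assumptions, empirical coverages within roughly 1.4 percentage points of 95% are compatible with nominal coverage, whereas values near 90% or below are not.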

    4. APPLICATION: LONGITUDINAL STUDY OF CARDIOTOXICITY

    In this section, we illustrate the key results using data from the longitudinal observational study that motivated this paper. Doxorubicin chemotherapy has been very effective in the treatment of acute lymphoblastic leukemia in children, with 5-year survival rates of at least 80% (Schorin et al., 1994). However, late cardiotoxic effects of doxorubicin are increasingly a problem for patients who survive childhood cancer. Cardiotoxicity is often progressive, and some patients have disabling symptoms. A recent observational study was conducted (Lipshultz et al., 1995) to identify risk factors for cardiotoxicity. The outcome measured over time was the left ventricular mass of the heart. As seen in Table 1, many of the patients in this study have abnormal left ventricular mass. When this study began in the 1980s, many of the patients had completed doxorubicin chemotherapy as much as 14 years earlier. We let $t$ denote the years since the completion of doxorubicin chemotherapy. Cardiologists consider time since the end of chemotherapy as a key measure in predicting heart function.

    Abnormalities of the left ventricular mass of the heart were determined by examining echocardiograms from $n = 111$ children and adults who had received cumulative doses of between 45 and 550 mg of doxorubicin per square meter of body-surface area for the treatment of acute lymphoblastic leukemia in childhood. The main covariates of interest are the cumulative dose (dichotomized at 300 mg; 1 = 300 mg or more, 0 = less than 300 mg), age at the end of treatment (ranging from 1.5 to 20 years), time since the end of chemotherapy to a given heart left ventricular mass measurement (ranging from 0 to 13.9 years), and gender (1 = female, 0 = male). We note that clinical investigators suspect that the previous outcome is the main determinant of the timing of future visits. Thus, the timing of the patient visit is assumed to depend on the previous left ventricular mass status. This is an important assumption that is required for appropriate inference from subsequent analyses.

    We fit the following logistic regression model for abnormal left ventricular mass of the heart at time $t$,

    \[ \mathrm{logit}(p_i(t)) = \beta_0 + \beta_1 \mathrm{female}_i + \beta_2 \mathrm{age}_i + \beta_3 \log(\mathrm{dose}_i) + \beta_4 t + \beta_5 [\mathrm{age}_i \times t]. \]

    To minimize the possible bias in estimating $\beta$, we fit a highly parameterized model for the within-subject association. However, attempts to obtain the ML estimates for the full Bahadur model with over 100 parameters for the within-subject association resulted (not surprisingly) in convergence problems for the algorithms used (Newton–Raphson and Nelder–Mead). Instead, we fit the conditional probability approximation given in (2.11), with autoregressive (or, more precisely, exponential) correlation pattern

    \[ \rho_i(u, t) = \mathrm{Corr}[Y_i(t), Y_i(t - u) \mid x_i(t), x_i(t - u)] = \alpha^{|u|}. \]

    For illustrative purposes, we also used the GEE approach, implemented using the SAS procedure PROC GLIMMIX, with the following correlation structures: independence, exchangeable, and autoregressive (2.1).
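    Equation (2.11) is not reproduced in this section, so the following Python sketch should be read as an assumption about its general form rather than the authors' exact expression. It combines the exponential correlation for an irregular gap of u years with the standard first-two-moment expression for the probability of the current binary outcome given the previous one, which is linear in the previous outcome. The marginal probabilities and the gap are hypothetical; the value 0.938 is the pseudolikelihood estimate of the correlation parameter reported for these data.

```python
import math

def exp_corr(u, alpha):
    """Exponential correlation for two measurements taken u years apart."""
    return alpha ** abs(u)

def cond_prob_given_previous(p_t, p_prev, y_prev, corr):
    """P(Y(t) = 1 | previous outcome), using only the first two moments.

    p_t + corr * sqrt(p_t q_t / (p_prev q_prev)) * (y_prev - p_prev)
    """
    scale = math.sqrt(p_t * (1.0 - p_t) / (p_prev * (1.0 - p_prev)))
    prob = p_t + corr * scale * (y_prev - p_prev)
    if not 0.0 <= prob <= 1.0:
        raise ValueError("means and correlation do not give a valid probability")
    return prob

def log_pl_contribution(y_t, p_t, p_prev, y_prev, corr):
    """Log pseudolikelihood contribution of one visit given the previous visit."""
    prob = cond_prob_given_previous(p_t, p_prev, y_prev, corr)
    return math.log(prob if y_t == 1 else 1.0 - prob)

# Hypothetical marginals (0.4 at both visits) and a gap of 1.7 years.
corr = exp_corr(1.7, 0.938)
p_after_abnormal = cond_prob_given_previous(0.4, 0.4, 1, corr)
p_after_normal = cond_prob_given_previous(0.4, 0.4, 0, corr)
```

    The explicit range check reflects the fact that not every combination of marginal probabilities and correlation corresponds to a valid joint distribution for binary outcomes.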

    Table 6 displays the estimates of $\beta$ obtained using the pseudolikelihood and GEE approaches. For the GEE with a working independence correlation structure, standard errors of the estimates of $\beta$ were based on the empirical or so-called robust variance estimator (Huber, 1967; White, 1982) in order to correct for the potential misspecification of the covariance. Overall, all approaches yielded similar estimates for the gender and dose effects. However, the approaches produced discernibly different estimates for the interaction between age and time. In particular, there is a 42% difference between the PL-AR1 and GEE-IND estimates; a 61% difference between the PL-AR1 and GEE-CS estimates; and a 25% difference between the PL-AR1 and GEE-AR1 estimates.

    Also, for these data, $\hat{\alpha} = 0.938$ using the pseudolikelihood and $\hat{\alpha} = 0.926$ using the GEE approach with AR1 correlation. Thus, as might be expected, observations close in time are very similar, and there is strong evidence that the repeated measures are not independent. Because there was empirical evidence that follow-up time appeared to depend on the previous outcomes (e.g. the median time to the first follow-up visit was 1.7 years for patients with abnormal initial left ventricular mass measurements, and 2.6 years for patients with normal initial left ventricular mass measurements), it should not be surprising that the estimates of the interaction effect differ when using the pseudolikelihood and GEE approaches.

    5. CONCLUSION

    In this paper, we consider ML estimation of regression parameters in marginal models for longitudinal binary data when the follow-up times are random and depend on previous outcomes. Under this follow-up time process, we have shown that ML estimation of the regression parameters does not require that a model for the distribution of follow-up times be specified. However, the model for the entire joint distribution of the repeated measures must be correctly specified. Thus, any misspecification of the higher-order moments can result in biased estimates of the regression parameters.

    When follow-up depends on previous outcomes, ML estimation requires specification of all higher-order moments. However, it is prohibitively difficult to compute the likelihood function for marginal models for longitudinal binary data except in cases where the number of repeated measurements is relatively small. To circumvent these difficulties, we propose a pseudolikelihood to estimate the regression parameters. The pseudolikelihood uses a linear approximation to the conditional distribution of the response at any occasion, given the history of previous responses. The appeal of this approximation is that the conditional distributions are functions of the first two moments of the binary responses only. When the follow-up times depend only on the previous outcome, the pseudolikelihood requires correct specification of the conditional distribution of the current outcome given the outcome at the previous occasion only. The results from a study of asymptotic bias and a simulation study demonstrate that the proposed pseudolikelihood approach, or use of the correct correlation model with modified GEEs, greatly reduces the bias when compared to the GEE with incorrect correlation structure. This underscores the need to model carefully the correlation structure when follow-up depends on previous outcomes.

    A caveat of the proposed pseudolikelihood approach is that it requires strong assumptions on either the follow-up time process [assumption (2.13)] or the measurement model. When the follow-up times depend only on the previous outcome, the PMLE is consistent provided the true conditional distribution of the current outcome given the outcome at the previous occasion, $f(Y_i(t) \mid Y_i^O(t), X_i(t); \beta, \alpha)$, has been correctly modeled, even though this distribution need not equal the conditional distribution given the entire history of previous responses. On the other hand, when the true conditional distribution does satisfy the first-order Markov-type dependence, so that these two conditional distributions coincide,

    Table 6. Regression parameter ($\beta$) estimates and standard errors (SE) for the logistic regression model for the cardiotoxicity data

    Effect     Method   Estimate  SE     Z     p-value
    Intercept  GEE-IND  9.0680    1.683  5.39

    the PMLE is the ML estimate and is consistent even when assumption (2.13) does not hold, but the follow-up times depend on any of the previous outcomes [assumption (2.4)]. We note that the Bahadur dependence discussed in the study of asymptotic bias may not always be adequate for correctly specifying the conditional distribution of the current response, given the past history of responses. For example, when the true binary measurement process depends on random effects, this would induce a non-Markov dependence marginally (when averaged over the distribution of the random effects). However, we conjecture that the bias of the PL estimator may be modest unless any departures from these assumptions are relatively extreme. The simulation study results lend partial support to this conjecture.

    Finally, we note that the results for both ML and PL estimation presented in this paper can be readily extended to the case where the longitudinal outcome is categorical with more than two levels (e.g. ordinal and nominal responses).

    ACKNOWLEDGMENTS

    We thank the Associate Editor and the referees for their helpful comments and suggestions. We are grateful for the support provided by National Institutes of Health grants GM 29745, GM 70335, HL 69800, HL 53392, HL 78522, HL 72705, AI 60373, CA 74015, CA 70101, CA 68484, CA 79060, and HR 96041. We also thank the Dana-Farber Cancer Institute Childhood Acute Lymphoblastic Leukemia Consortium for permission to use the data from the cardiotoxicity study. Conflict of Interest: None declared.

    APPENDIX

    Under assumption (2.13), the PMLE of $(\beta, \alpha)$ is consistent.

    Sketch of Proof. The pseudoscore function $s(\beta, \alpha)$ can be expressed as

    \[ s(\beta, \alpha) = \sum_{i=1}^{n} \left[ \int_0^{\infty} D_i(t)\, \nu_i(t)^{-1} \{Y_i(t) - \mu_i(t)\}\, dN_i(t) \right], \]

    where

    \[ \mu_i(t) = \mu_i(t; \beta, \alpha) = E[Y_i(t) \mid Y_i^O(t), X_i(t); \beta, \alpha], \qquad D_i(t) = D_i(t; \beta, \alpha) = \frac{\partial \mu_i(t)}{\partial (\beta, \alpha)}, \]

    \[ \nu_i(t) = \nu_i(t; \beta, \alpha) = \mathrm{Var}[Y_i(t) \mid Y_i^O(t); \beta, \alpha] = \mu_i(t)(1 - \mu_i(t)). \]

    The PMLE of $\beta$ (and $\alpha$) is consistent, under suitable regularity conditions, if it can be shown that $E[s(\beta, \alpha)] = 0$ at the true $(\beta, \alpha)$. Recall that assumption (2.13) implies the following conditional independence relationship,

    \[ f(Y_i(t) \mid Y_i^O(t), X_i(t), dN_i(t) = 1) = f(Y_i(t) \mid Y_i^O(t), X_i(t)), \]

    and thus

    \[ E(Y_i(t) \mid Y_i^O(t), X_i(t), dN_i(t) = 1) = E(Y_i(t) \mid Y_i^O(t), X_i(t)) = \mu_i(t; \beta, \alpha). \]

    Hence, under assumption (2.13) and provided the model for $\mu_i(t; \beta, \alpha)$ has been correctly specified, $s(\beta, \alpha)$ has mean equal to 0 at the true $\beta$ and $\alpha$.
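    Because the counting process only jumps at the times a subject is actually observed, the integral in the pseudoscore reduces to a sum over that subject's observed visits. The sketch below evaluates the pseudoscore in that form; the data layout (a list of visits per subject, each holding the outcome, the conditional mean, and the gradient of the mean) and the function name are hypothetical choices for illustration.

```python
def pseudoscore(subjects):
    """Sum over subjects and observed visits of D * (y - mu) / v, with v = mu * (1 - mu).

    `subjects` is a list of subjects; each subject is a list of visits
    (y, mu, grad), where grad is the derivative of mu with respect to
    the parameter vector.
    """
    n_params = len(subjects[0][0][2])
    score = [0.0] * n_params
    for visits in subjects:
        for y, mu, grad in visits:
            weight = (y - mu) / (mu * (1.0 - mu))
            for k in range(n_params):
                score[k] += grad[k] * weight
    return score

# Hypothetical data: one subject, two visits, a two-parameter model.
toy = [[(1, 0.5, [1.0, 2.0]), (0, 0.5, [1.0, 2.0])]]
toy_score = pseudoscore(toy)
```

    Each term has conditional mean zero when the conditional mean model is correct, which is the property the consistency argument relies on.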

    Finally, we note that because $f(Y_i(t) \mid Y_i^O(t), X_i(t), dN_i(t) = 0)$ is clearly not estimable from the observed data, assumption (2.13) is not verifiable from the data at hand.

    REFERENCES

    BAHADUR, R. R. (1961). A representation of the joint distribution of responses to n dichotomous items. In Solomon, H. (ed), Studies in Item Analysis and Prediction. Stanford Mathematical Studies in the Social Sciences VI. Palo Alto, CA: Stanford University Press, pp. 158–168.

    DIGGLE, P. J. AND KENWARD, M. G. (1994). Informative dropout in longitudinal data analysis (with discussion). Applied Statistics 43, 49–93.

    FITZMAURICE, G. M., LAIRD, N. M. AND ROTNITZKY, A. (1993). Regression models for discrete longitudinal responses (with discussion). Statistical Science 8, 284–309.

    HAND, D. J. AND CROWDER, M. J. (1996). Practical Longitudinal Data Analysis. London: Chapman and Hall.

    HUBER, P. J. (1967). The behavior of maximum likelihood estimates under nonstandard conditions. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1. Berkeley, CA: University of California Press, pp. 221–233.

    KALBFLEISCH, J. D. AND PRENTICE, R. L. (1980). The Statistical Analysis of Failure Time Data. New York: John Wiley.

    LEE, H., LAIRD, N. M. AND JOHNSTON, G. (1999). Combining GEE and REML for estimation of generalized linear models with incomplete multivariate data (Unpublished).

    LIANG, K. Y. AND ZEGER, S. L. (1986). Longitudinal data analysis using generalized linear models. Biometrika 73, 13–22.

    LIN, H., SCHARFSTEIN, D. O. AND ROSENHECK, R. A. (2004). Analysis of longitudinal data with irregular, outcome-dependent follow-up. Journal of the Royal Statistical Society Series B 66, 791–813.

    LIPSHULTZ, S. E., COLAN, S. D., GELBER, R. D., PEREZ-ATAYDE, A. R., SALLAN, S. E. AND SANDERS, S. P. (1991). Late cardiac effects of doxorubicin therapy for acute lymphoblastic leukemia in childhood. New England Journal of Medicine 324, 808–815.

    LIPSHULTZ, S. E., LIPSITZ, S. R., MONE, S. M., GOORIN, A. M., SALLAN, S. E. AND SANDERS, S. P. (1995). Female sex and higher drug dose as risk factors for late cardiotoxic effects of doxorubicin therapy for childhood cancer. New England Journal of Medicine 332, 1738–1743.

    LIPSITZ, S. R., FITZMAURICE, G. M., IBRAHIM, J. G., GELBER, R. AND LIPSHULTZ, S. (2002). Parameter estimation in longitudinal studies with outcome-dependent follow-up. Biometrics 58, 50–59.

    LIPSITZ, S. R., LAIRD, N. M. AND HARRINGTON, D. P. (1992). A three-stage estimator for studies with repeated and possibly missing binary outcomes. Applied Statistics 41, 203–213.

    LIPSITZ, S. R., MOLENBERGHS, G., FITZMAURICE, G. M. AND IBRAHIM, J. (2000). GEE with Gaussian estimation of the correlations when data are incomplete. Biometrics 56, 528–536.

    LITTLE, R. J. AND RUBIN, D. B. (1987). Statistical Analysis with Missing Data. New York: Wiley & Sons.

    PEPE, M. S. AND ANDERSON, G. L. (1994). A cautionary note on inference for marginal regression models with longitudinal data and general correlated response data. Communications in Statistics 23, 939–951.

    ROBINS, J. M., GREENLAND, S. AND HU, F.-C. (1999). Estimation of the causal effect of a time-varying exposure on the marginal mean of a repeated binary outcome. Journal of the American Statistical Association 94, 687–712.

    ROTNITZKY, A. AND WYPIJ, D. (1994). A note on the bias of estimators with missing data. Biometrics 50, 1163–1170.

    SCHORIN, M. A., BLATTNER, S., GELBER, R. D., TARBELL, N. J., DONNELLY, M., DALTON, V., COHEN, H. J. AND SALLAN, S. E. (1994). Treatment of childhood acute lymphoblastic leukemia: results of Dana-Farber Cancer Institute/Children's Hospital Acute Lymphoblastic Leukemia Consortium Protocol 85-01. Journal of Clinical Oncology 12, 740–747.

    WHITE, H. (1982). Maximum likelihood estimation of misspecified models. Econometrica 50, 1–26.

    [Received May 29, 2003; first revision May 25, 2004; second revision February 11, 2005; third revision June 16, 2005; fourth revision December 8, 2005; fifth revision January 4, 2006; accepted for publication January 18, 2006]