

MAIN PAPER

Bayesian joint modelling of benefit and risk in drug development

Maria J. Costa1 | Thomas Drury2

1 GlaxoSmithKline Research and Development, Stevenage, UK
2 Integral Statistics Limited, London, UK

Correspondence: Maria J. Costa, GlaxoSmithKline Research and Development, Stevenage, UK. Email: [email protected]

Received: 20 March 2017; Revised: 31 October 2017; Accepted: 8 January 2018
DOI: 10.1002/pst.1852
Pharmaceutical Statistics. 2018;1–16.

To gain regulatory approval, a new medicine must demonstrate that its benefits outweigh any potential risks, ie, that the benefit-risk balance is favourable towards the new medicine. For transparency and clarity of the decision, a structured and consistent approach to benefit-risk assessment that quantifies uncertainties and accounts for underlying dependencies is desirable. This paper proposes two approaches to benefit-risk evaluation, both based on the idea of joint modelling of mixed outcomes that are potentially dependent at the subject level. Using Bayesian inference, the two approaches offer interpretability and efficiency to enhance qualitative frameworks. Simulation studies show that accounting for correlation leads to a more accurate assessment of the strength of evidence to support benefit-risk profiles of interest. Several graphical approaches are proposed that can be used to communicate the benefit-risk balance to project teams. Finally, the two approaches are illustrated in a case study using real clinical trial data.

KEYWORDS

Bayesian inference, benefit-risk, copula, decision-making, generalised linear mixed models, joint modelling, uncertainty

1 | INTRODUCTION

To progress through the development pathway, a medicine must demonstrate that its benefits outweigh its risks. This assessment is a complex task that depends on factors such as disease prevalence, prognosis for patients, severity of safety signals, and the benefit-risk (BR) profile of existing therapies, among others.

A range of statistical techniques and visual tools for the separate analysis of the efficacy and safety profiles of new drugs is available. However, the same cannot be said regarding joint quantitative BR assessments. In general, sponsors and regulators agree that improving transparency, reproducibility, and communication of BR assessments requires a structured approach, focusing on key efficacy and safety outcomes.1-3 Recent efforts have attempted to move beyond the structuring step to a more quantitative framework that allows sponsors and regulators to gain further insight into specific aspects of a drug's BR profile. Techniques such as multicriteria decision analysis (MCDA), decision contours, or weighted BR scores have proved useful tools for quantitative BR assessments.4-7 A comprehensive review of quantitative approaches for BR evaluations can be found in Mt-Isa et al.8

An additional challenge of any BR assessment is the fact that exposure to an investigational drug creates the potential for efficacy and safety outcomes to be connected at the subject level. In this scenario, analysing efficacy and safety responses separately could lead to misleading results. Often, these outcomes will be of a different nature, for example, a continuous efficacy measure and a binary safety event. Approaches for joint modelling have been developed in the literature, notably when linking continuous and longitudinal or continuous and time-to-event data.9,10

Copyright © 2018 John Wiley & Sons, Ltd. wileyonlinelibrary.com/journal/pst


This paper explores two different approaches to joint modelling within a Bayesian framework applied to a range of scenarios where key efficacy and safety endpoints have been identified as part of an initial qualitative BR framework exercise. The examples illustrate situations where specific safety concerns have been identified (for example, risks associated with the drug's mechanism of action) and a desire exists to understand the extent to which these risks could be offset by observed efficacy. These situations will usually be specific to the asset and have low dimensionality. The Bayesian framework is particularly useful for BR assessments as uncertainty can be expressed in terms of probabilities, which are simple to communicate to nonstatisticians.11 Decision-making under uncertainty is described by the EMA and FDA as one of the key challenges associated with BR evaluations.2,3 In addition, the Bayesian approach allows for the formal utilisation of prior information and repeated updates based on accumulating evidence as the drug progresses through its lifecycle. The use of prior knowledge can be particularly important when assessing safety, as clinical trials may not be adequately powered to detect safety signals.12 The methods and visualisations investigated in this paper can be added to the "toolbox" of analyses available to support decision-making (internal or external) as part of BR assessments.

This paper is organised as follows. Section 2 outlines the two proposed methods to evaluate the BR profile quantitatively. Section 3 presents the results from simulation studies to assess the performance of the two approaches at different stages of the development lifecycle of a drug. Section 4 illustrates the use of these techniques on a real data example. Potential extensions are discussed in Section 5.

2 | METHODOLOGY

In a clinical trial, it is often the case that measures of benefit and risk are different in the nature of their sampling distribution. For example, the primary measure of efficacy may vary on a continuous scale, whereas a key safety risk may be binary, representing the occurrence or not of an adverse event of special interest. These multiple measurements on an individual subject can be seen as a multivariate response variable. This paper outlines two approaches for joint modelling of multivariate data where the components of the response vector can differ in the nature of their sampling distribution. The first is through shared random effects (or latent variables), a special type of generalised linear mixed model (GLMM), and the second is through copulas, which build multivariate distributions from different, potentially dependent types of data using copula marginal regression models.13

Although a key property of the proposed approach is its quantitative and objective nature, it also allows for subjective judgement to be incorporated using clinical thresholds. Using information from physicians, patients, or competitor data, regions of interest for the joint posterior distribution of parameter values can be defined to assess the probability that a new drug has the desirable BR profile, similar to the probability of technical success (POTS).9

2.1 | Method 1: GLMMs

Generalised linear mixed models are a powerful tool for modelling multiple responses of different types on the same individual or unit in a coherent way. They can be seen as an extension of linear mixed models to situations where either the response variable is not modelled using a Gaussian distribution, or the multiple responses follow different probabilistic distributions.

Let $y_i = (y_{i1}, \ldots, y_{iJ})^t$, $i = 1, \ldots, n$, be the vector of $J$ responses for the $i$-th subject. A GLMM has the form:

$$g_j(\mu_{ij}) = X_i \beta_j + Z_i u_i, \qquad u_i \sim N(0, G(X_i)), \tag{1}$$

where

• $\mu_i = (\mu_{i1}, \ldots, \mu_{iJ})^t$ is a $J$-dimensional vector of mean responses such that $\mu_{ij} = E[y_{ij}]$, $y_{ij} \sim \text{Exponential Family}(\mu_{ij}, \sigma_{ij})$, with dispersion parameter $\sigma_{ij}$,

• $\beta_j = (\beta_{j1}, \ldots, \beta_{jK})^t$ is a $K$-dimensional vector representing the fixed covariate effects for the $j$-th response and $u_i$ the random effects, with design matrices $X_i$ and $Z_i$, respectively,

• $g_j(\cdot)$ is the link function connecting the linear predictor for the $j$-th response, $\eta_{ij} = X_i \beta_j + Z_i u_i$, to the mean $\mu_{ij}$ of $y_{ij}$ (the link functions may be different across the $J$ responses),

• $G(X_i)$ is the covariance matrix for the random effects $u_i$.


When shared among the different components of $y_i$, the random effects $u_i$ in the GLMM induce correlation among the multiple responses, the exact form of which depends on the distribution of the $y_{ij}$'s and the link function(s) used. In addition, the fixed effects $\beta_{jk}$ should be interpreted conditional on the $u_i$'s whenever the link functions in $g$ are not the identity, as in this case $g(E[y_{ij}]) \neq E[g(y_{ij})]$. In what follows, $\beta_j$ represents marginal (population) effects, whereas $\beta_j^*$ are subject-specific effects. Generally, the former are of interest to allow direct comparison across the different responses.

For illustration purposes, consider two responses on the same subject, $y_i = (y_{i1}, y_{i2})$, with $y_{i1}$ representing a continuous, normally distributed efficacy response, for example, change from baseline in lung function, and $y_{i2}$ representing a binary safety outcome such as the occurrence or not of an adverse event (AE) of special interest, assumed to follow a Bernoulli distribution. The subject-level correlation between these two measurements is modelled by a subject-specific random effect, $u_i$, similar to standard linear mixed models. For the continuous response, $y_{i1}$, the usual identity link function is chosen, whereas for the binary response, $y_{i2}$, a probit link function is used. The latter allows closed form expressions for the vector of marginal covariate effects $\beta_2$. If a logistic link function is to be used, approximations exist to derive population effects (see supplementary information (SI)). For each subject $i$, the GLMM in (1) takes the form

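$$\begin{bmatrix} y_{i1} \\ \Phi^{-1}\left[P(y_{i2} = 1 \mid u_i)\right] \end{bmatrix} = \begin{pmatrix} X_i \beta_1 + u_i + \epsilon_i \\ X_i \beta_2^* + u_i \end{pmatrix}, \qquad \epsilon_i \sim N(0, \upsilon_i^2), \qquad u_i \sim N(0, \tau_i^2), \tag{2}$$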

where $\Phi^{-1}$ in (2) is the inverse of the standard normal cumulative distribution function (CDF). For simplicity, the means of both responses $y_{i1}$ and $y_{i2}$ are assumed to depend on the same set of covariates $X_i$. However, the parameters modulating this relationship, $\beta_1$ and $\beta_2^*$, can differ. The non-linear link function associated with the binary response in (2) implies that to obtain marginal covariate effects, $\beta_2$, the random effects need to be integrated out, for example, using Monte Carlo approaches. However, choosing the probit link $\Phi^{-1}$ for this model allows these to be obtained in closed form as

$$p_2 = \Phi(X_i \beta_2) = P(y_{i2} = 1) = \int P(y_{i2} = 1 \mid u_i)\, f(u_i)\, du_i = \Phi\left(\frac{X_i \beta_2^*}{\sqrt{1 + \tau_i^2}}\right).$$

By inspection, therefore, $\beta_2 = \beta_2^* / \sqrt{1 + \tau_i^2}$.14
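The closed-form marginalisation can be checked numerically. The sketch below is illustrative only and is not taken from the paper: it compares the closed form with a plain Monte Carlo average over the random effect. The function names and the values of the conditional linear predictor and of $\tau$ are hypothetical choices.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution: cdf() and inv_cdf()


def marginal_prob_closed_form(xb_star, tau):
    """P(y = 1) after integrating u ~ N(0, tau^2) out of Phi(xb_star + u)."""
    return STD_NORMAL.cdf(xb_star / math.sqrt(1.0 + tau ** 2))


def marginal_prob_monte_carlo(xb_star, tau, n_draws=100_000, seed=1):
    """Same quantity, approximated by averaging Phi(xb_star + u) over draws of u."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        u = rng.gauss(0.0, tau)              # one draw of the shared random effect
        total += STD_NORMAL.cdf(xb_star + u)
    return total / n_draws


# Hypothetical subject-specific linear predictor and random-effect SD.
xb_star, tau = -0.5, 1.2
closed = marginal_prob_closed_form(xb_star, tau)
simulated = marginal_prob_monte_carlo(xb_star, tau)
print(round(closed, 3), round(simulated, 3))
```

The two numbers should agree up to Monte Carlo error, which is the sense in which the probit link removes the need for numerical integration.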

2.2 | Method 2: copula marginal regression

Copulas are functions that enable univariate distribution functions to be combined into multivariate distributions while preserving the characteristics of each univariate marginal component. Copula marginal regression uses copula functions to incorporate separate (marginal) regression models into a single joint model. Modelling continuous variables with copulas is well established15-17; however, recent developments have extended copula modelling to discrete and mixtures of continuous and discrete variables. Some of this work has focussed on the multivariate Gaussian copula,13,18,19 while others have developed the methodology of paired copula construction (also known as Vines) to build multivariate models using conditioning and bivariate copulas.20-22 To the best of the authors' knowledge, joint modelling via copulas has not been applied to clinical BR assessments, although the technique has been applied to model data from preclinical toxicology,13 public health,22 and health economics.19

Recall from Section 2.1 that $y_{ij}$ represents the $j$-th response on the $i$-th subject with mean response $\mu_{ij}$ and, if required, dispersion $\sigma_{ij}$, ie, $y_{ij} \sim f_j(\mu_{ij}, \sigma_{ij})$. A copula marginal regression model is defined as

$$g_j(\mu_{ij}) = X_i \beta_j,$$

$$F_{1 \ldots J}(y_{i1}, \ldots, y_{iJ} \mid \theta) = C\left(F_1(y_{i1} \mid \mu_{i1}, \sigma_{i1}), \ldots, F_J(y_{iJ} \mid \mu_{iJ}, \sigma_{iJ}) \mid \theta\right),$$

where

• $g_j(\cdot)$, $\beta_j$ and $X_i$ are the link functions, fixed covariate effects and design matrix, respectively, as described in Section 2.1,

• $F_j(y_{ij} \mid \mu_{ij}, \sigma_{ij})$ is the CDF of the $j$-th response,


• $C(\cdot, \ldots, \cdot \mid \theta)$ is a $J$-dimensional copula distribution function with dependency parameter vector $\theta$,

• $F_{1 \ldots J}(y_{i1}, \ldots, y_{iJ} \mid \theta)$ is the resulting multivariate CDF.

In practice, the joint distribution $F_{1 \ldots J}(y_{i1}, \ldots, y_{iJ} \mid \theta)$ is modelled as a mass or density function generated using derivatives for continuous variables and finite differencing for discrete variables. The resulting function is then used to build the likelihood function (see SI).

Using copulas to build multivariate models has several advantages. In addition to allowing different types of response variables to be jointly modelled while accounting for any existing dependency, the fixed effect parameters are directly interpretable as population based estimates. The models are also flexible, with different copulas allowing diverse patterns of dependency to be modelled, including asymmetric or strong tail dependencies.

In this paper, the Gaussian copula is used to illustrate copula marginal regression. This choice is due to its ability to incorporate positive and negative dependency and the fact that it is easily tractable due to its elliptical nature.13

The $J$-dimensional Gaussian copula is defined as

$$C_J^{\text{Gauss}}(y_{i1}, \ldots, y_{iJ} \mid \theta) = \Phi_J\left(\Phi^{-1}\{F_1\}, \ldots, \Phi^{-1}\{F_J\} \mid \theta\right),$$

where each CDF $F_j = F_j(y_{ij} \mid \mu_{ij}, \sigma_{ij})$ is transformed into a "normal-score" through the inverse of the standard normal CDF and then evaluated as a $J$-dimensional standard multivariate normal CDF.

To evaluate the impact of modelling dependency, the $J$-dimensional Independence copula is also implemented. As the name suggests, it assumes the data are completely independent and is defined as

$$C_J^{\text{Indep}}(y_{i1}, \ldots, y_{iJ}) = \prod_{j=1}^{J} F_j(y_{ij} \mid \mu_{ij}, \sigma_{ij}).$$

The deviance information criterion (DIC)23 is used to assess which copula provides the best representation of the data.
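To make the "derivative for the continuous margin, finite difference for the discrete margin" construction concrete, the following sketch evaluates the log-likelihood contribution of one subject under a bivariate Gaussian copula with a normal efficacy margin and a Bernoulli safety margin. It is an illustration under stated assumptions rather than the likelihood given in the SI: the function name `joint_loglik` and the example values are hypothetical, and the code requires $|\theta| < 1$ and $0 < p < 1$.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def joint_loglik(y1, y2, mu, sigma, p, theta):
    """Log-likelihood of one (continuous y1, binary y2) pair under a Gaussian copula."""
    z1 = (y1 - mu) / sigma                            # normal score of the continuous margin
    log_f1 = math.log(NormalDist(mu, sigma).pdf(y1))  # marginal log-density of y1
    # Derivative of the copula with respect to its first argument, evaluated at
    # F2(0) = 1 - p. This equals P(y2 = 0 | y1).
    q0 = STD_NORMAL.inv_cdf(1.0 - p)
    prob_zero = STD_NORMAL.cdf((q0 - theta * z1) / math.sqrt(1.0 - theta ** 2))
    # Finite difference over the binary margin: F2(1) = 1 and F2(0) = 1 - p.
    prob = prob_zero if y2 == 0 else 1.0 - prob_zero
    return log_f1 + math.log(prob)


# Hypothetical subject with efficacy response -40 and an adverse event.
ll_dependent = joint_loglik(-40.0, 1, mu=-50.0, sigma=100.0, p=0.4, theta=0.7)
ll_independent = joint_loglik(-40.0, 1, mu=-50.0, sigma=100.0, p=0.4, theta=0.0)
print(ll_dependent, ll_independent)
```

Setting `theta=0.0` recovers the Independence copula, so the same function can be used to compare the two specifications on a given data set.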

2.3 | Joint modelling, BR, and decision‐making

Typically, in BR assessments, the parameters of interest are treatment comparisons, for example, the difference in the average lung function response between an active drug and placebo. Once the joint posterior distribution of the parameters in either the GLMM or copula model is obtained, the next step is to assess the level of evidence (ie, posterior probability) associated with BR profiles of interest. For illustration purposes, consider the scenario where a single efficacy response and a single safety response are of interest for the BR assessment of a new drug compared to placebo. The different BR profiles of interest can be set up using a range of clinically meaningful efficacy and safety thresholds, $\Delta_E$ and $\Delta_S$, respectively, representing minimum improvements in efficacy and maximum increases in risk with the new drug relative to placebo. These are independent and set by the project team and can be viewed as clinical Go/No-go boundaries. Although these thresholds do not explicitly account for the relative importance of the corresponding endpoints for the BR assessment, the low dimensionality of the problems tackled here means that it can be reasonably assumed that the endpoints included in the joint model are comparable in their relevance for the BR assessment. If the team wishes to include explicit weighting, then approaches such as clinical utility index could be considered.24 Expert knowledge can be used to define appropriate ranges for $\Delta_E$ and $\Delta_S$ to understand the BR trade-off in terms of the posterior probability:

$$\text{Prob}\left(H_E(\mu_E) \geq \Delta_E \text{ and } H_S(\mu_S) \leq \Delta_S \mid \text{Data}, \text{Prior}\right). \tag{3}$$

The functions $H_E(\cdot)$ and $H_S(\cdot)$ in (3) represent the treatment comparison of interest based on the mean response vectors $\mu_E$ and $\mu_S$ for the efficacy and safety responses, respectively. The probability in (3) represents the POTS for the new drug versus placebo.9 For the purposes of the BR assessment, the probability in (3) can be plotted against pairs $(\Delta_E, \Delta_S)$ using a contour plot. This is a key step in assessing the BR profile of an intervention as it provides different stakeholders (sponsors, regulators, HTA bodies, and patients) with the opportunity to assess whether the existing data support their own BR perspective.
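Given joint posterior draws of the two treatment comparisons, the probability in (3) is simply the share of draws falling in the region of interest, and the contour plot is that share evaluated over a grid of thresholds. The sketch below assumes this Monte Carlo estimate; the draws are synthetic stand-ins for MCMC output (not results from the paper) and all names are hypothetical.

```python
import math
import random


def br_probability(eff_diff, risk_diff, delta_e, delta_s):
    """Share of joint draws with efficacy difference >= delta_e and risk difference <= delta_s."""
    hits = sum(1 for e, s in zip(eff_diff, risk_diff) if e >= delta_e and s <= delta_s)
    return hits / len(eff_diff)


def br_surface(eff_diff, risk_diff, delta_e_grid, delta_s_grid):
    """Probability for every threshold pair: one row per delta_s, one column per delta_e."""
    return [[br_probability(eff_diff, risk_diff, de, ds) for de in delta_e_grid]
            for ds in delta_s_grid]


# Synthetic, positively correlated draws standing in for posterior samples
# of H_E (efficacy difference) and H_S (risk difference).
rng = random.Random(2018)
eff_draws, risk_draws = [], []
for _ in range(5000):
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    eff_draws.append(100.0 + 14.0 * z1)
    risk_draws.append(0.30 + 0.06 * (0.5 * z1 + math.sqrt(0.75) * z2))

delta_e_grid = [60.0, 80.0, 100.0, 120.0]
delta_s_grid = [0.25, 0.30, 0.35, 0.40]
surface = br_surface(eff_draws, risk_draws, delta_e_grid, delta_s_grid)
for ds, row in zip(delta_s_grid, surface):
    print(ds, [round(prob, 2) for prob in row])
```

The resulting grid can be passed to any contouring routine. Because the draws are used jointly, any posterior dependence between the two comparisons is carried into the probabilities automatically.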

3 | SIMULATED DATA EXAMPLES

The two approaches outlined above will be illustrated using simulated data to explore their potential to aid decision‐making in BR assessments.


3.1 | Example 1: single efficacy and safety responses

Recall the two responses scenario described in Section 2.1 for assessing the BR profile of a new active drug compared to placebo. For illustration purposes, assume treatment is the only covariate of interest. In this case, the design matrix $X_i$ in (2) is a 2-dimensional vector indicating whether subject $i$ receives active drug or placebo. For each subject, the efficacy response $y_{i1}$ is assumed to follow a normal distribution with mean $\mu_t$ and variance $\sigma_t^2$ ($t = 1$ for the placebo group, $t = 2$ for the active group), whereas the safety response $y_{i2}$ is a binary outcome that can be modelled using a Bernoulli distribution with parameter $p_t$ representing the probability that a subject will experience the AE of interest. The true values for the model parameters were chosen so that they represent a real respiratory clinical trial setting. In particular, $\mu_1 = -150$ ($\sigma_1^2 = 100^2$), $p_1 = 0.1$ and $\rho_1 = 0.1$ in the placebo group, and $\mu_2 = -50$ ($\sigma_2^2 = 100^2$), $p_2 = 0.4$ and $\rho_2 = 0.6$ in the active group, with $\rho_t$ representing the correlation between the two responses for treatment $t$. For this scenario, $H_E(\cdot)$ and $H_S(\cdot)$ in (3) take the form $H_E(\mu_1, \mu_2) = \mu_2 - \mu_1$ and $H_S(p_1, p_2) = p_2 - p_1$, representing the difference in the mean efficacy response and the difference in the probability of an adverse event between the active and placebo groups. Later, other forms for $H_E$ and $H_S$ will be considered. A total of 1000 data sets with 100 subjects per treatment group were generated using NORTA random vectors to ensure that the resulting responses have the correct correlation structure.25 Bayesian inference was performed using Markov chain Monte Carlo (MCMC) to obtain samples from the posterior distributions of interest assuming noninformative priors for all the model parameters (see SI).
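A NORTA ("normal to anything") generator draws correlated standard normals and pushes each component through the inverse CDF of its target margin. The sketch below shows one way this could look for the active arm; it is not the authors' simulation code. The latent correlation of 0.76 is an assumed value chosen to give an observed correlation of roughly 0.6 when $p_2 = 0.4$, and the sample size is inflated far beyond the 100 subjects per arm used in the paper so that the empirical check is stable.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def simulate_arm(n, mu, sigma, p, theta, rng):
    """NORTA-style draw of n (efficacy, AE) pairs with latent normal correlation theta."""
    cut = STD_NORMAL.inv_cdf(1.0 - p)   # latent threshold so that P(AE = 1) = p
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = theta * z1 + math.sqrt(1.0 - theta ** 2) * rng.gauss(0.0, 1.0)
        # Inverse-CDF transforms: normal margin for efficacy, Bernoulli margin for the AE.
        pairs.append((mu + sigma * z1, 1 if z2 > cut else 0))
    return pairs


def pearson(xs, ys):
    """Sample Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(25)
active = simulate_arm(20_000, mu=-50.0, sigma=100.0, p=0.4, theta=0.76, rng=rng)
eff = [e for e, _ in active]
ae = [a for _, a in active]
print(round(sum(ae) / len(ae), 2), round(pearson(eff, ae), 2))
```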

3.1.1 | The GLMM approach

In the simple scenario described above, the GLMM in (2) reduces to

$$\begin{bmatrix} y_{i1} \\ \Phi^{-1}(p_t) \end{bmatrix} = \begin{pmatrix} x_{i1} \beta_{11} + x_{i2} \beta_{12} + \upsilon_t u_i + \epsilon_i \\ x_{i1} \beta_{21}^* + x_{i2} \beta_{22}^* + u_i \end{pmatrix}, \tag{4}$$

where the covariates $x_{it}$, $t = 1, 2$, are indicator variables for treatment $t$ and $\beta_{jt}$ are the parameters representing treatment effects on the mean efficacy and safety responses, ie, $\mu_1 = \beta_{11}$ and $\mu_2 = \beta_{12}$, while $p_1 = \Phi(\beta_{21})$ and $p_2 = \Phi(\beta_{22})$ are the marginal estimates for the parameter $p_t$ after integrating over the shared random effects $u_i$. The variance of $u_i$, $\tau_i^2$, is such that $\tau_i^2 = \tau_t^2$, therefore allowing any correlation between the efficacy and safety responses to differ between treatment groups. Similarly, the variance of $\epsilon_i$ is also modelled separately as $\upsilon_i^2 = \upsilon_t^2$. Since the continuous and binary responses are different in scale, the shared random effect $u_i$ in the continuous linear predictor is scaled by the standard deviation of the residual term, $\upsilon_t$.14 This ensures that the model in (4) is identifiable (two variance parameters estimated using the two observations per subject). Note that the reparameterisation in (4) implies that $\text{Var}(y_{i1}) = \sigma_t^2 = \upsilon_t^2 \tau_t^2 + \upsilon_t^2$.
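The variance identity can be written out step by step. The derivation below is a sketch that assumes $u_i$ and $\epsilon_i$ are independent; the second part additionally uses the standard latent-variable representation of the probit model, with a hypothetical latent response $y_{i2}^{\dagger}$ and a unit-variance error $e_i$ that are not part of the paper's notation.

```latex
% Variance of the continuous response in (4), assuming u_i and \epsilon_i independent
\mathrm{Var}(y_{i1})
  = \mathrm{Var}(\upsilon_t u_i + \epsilon_i)
  = \upsilon_t^2 \, \mathrm{Var}(u_i) + \mathrm{Var}(\epsilon_i)
  = \upsilon_t^2 \tau_t^2 + \upsilon_t^2
  = \upsilon_t^2 (1 + \tau_t^2).

% Latent probit scale: y_{i2} = 1 when y_{i2}^{\dagger} = \eta_{i2} + u_i + e_i > 0,
% with \eta_{i2} the fixed-effect part of the second row of (4) and e_i \sim N(0, 1)
\mathrm{Corr}\left(y_{i1}, y_{i2}^{\dagger}\right)
  = \frac{\upsilon_t \tau_t^2}{\sqrt{\upsilon_t^2 (1 + \tau_t^2)} \, \sqrt{1 + \tau_t^2}}
  = \frac{\tau_t^2}{1 + \tau_t^2}.
```

Under these assumptions the latent-scale correlation induced by the shared random effect lies in $[0, 1)$, so this parameterisation represents non-negative dependence only.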

3.1.2 | The copula approach

For the same scenario above, the copula marginal regression model is

$$\begin{bmatrix} \mu_t \\ \Phi^{-1}(p_t) \end{bmatrix} = \begin{pmatrix} x_{i1} \beta_{11} + x_{i2} \beta_{12} \\ x_{i1} \beta_{21} + x_{i2} \beta_{22} \end{pmatrix}, \qquad \sigma_t = x_{i1} s_1 + x_{i2} s_2, \qquad \theta_t = x_{i1} \omega_1 + x_{i2} \omega_2, \tag{5}$$

where $x_{it}$ and $\beta_{jt}$ are treatment indicators and effect parameters, respectively, $s_t$ are the dispersion parameters of the efficacy response for subjects on treatment $t$, and $\omega_t$ are the copula dependency parameters for treatment $t$. The parameters for the means, dispersions and dependency in (5) are combined using the marginal distribution functions and selected copula to create a joint likelihood function (full details of the joint likelihood are given in SI).

When using Gaussian copulas to jointly model normal and binary data, the copula dependency parameter $\theta_t$ corresponds to the poly-serial correlation, ie, the Pearson correlation between the normal response and a latent normal variable underlying the binary response.26 Although useful in its own right, the dependency parameter can also be transformed into an estimate of the Pearson correlation $\rho_t$ between the normal and binary responses using the following relationship,27

$$\rho_t = \text{Corr}(y_{i1}, y_{i2}) = \frac{\theta_t \, \phi\left[\Phi^{-1}(p_t)\right]}{\sqrt{p_t (1 - p_t)}},$$

where $\phi$ is the density of the standard normal distribution.
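The relationship can be applied in both directions. The sketch below, with hypothetical function names, converts between the copula dependency parameter and the Pearson correlation, and also reports the largest Pearson correlation attainable for a given event probability (the value at $\theta_t = 1$), which follows directly from the formula.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def pearson_from_latent(theta, p):
    """Pearson correlation between the normal and binary responses for dependency theta."""
    return theta * STD_NORMAL.pdf(STD_NORMAL.inv_cdf(p)) / math.sqrt(p * (1.0 - p))


def latent_from_pearson(rho, p):
    """Inverse map: copula dependency parameter implied by Pearson correlation rho."""
    return rho * math.sqrt(p * (1.0 - p)) / STD_NORMAL.pdf(STD_NORMAL.inv_cdf(p))


def max_pearson(p):
    """Upper bound on the Pearson correlation, reached when theta = 1."""
    return pearson_from_latent(1.0, p)


# True values from the simulation set-up of Section 3.1.
theta_placebo = latent_from_pearson(0.1, 0.1)
theta_active = latent_from_pearson(0.6, 0.4)
print(round(theta_placebo, 2), round(theta_active, 2), round(max_pearson(0.4), 2))
```

The bound explains why the Pearson correlation between a normal and a binary response cannot reach 1 even under perfect latent dependence.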


3.1.3 | Simulated data results

The results from the simulation study can be found in Table 1 and Figure 1. Table 1 shows some of the model parameter estimates obtained using the GLMM and Copula approaches, together with estimated bias and mean squared error (MSE). These are similar across the two methods. Moreover, the high correlation between the two responses in the active arm does not lead to substantial bias in the estimates for the fixed effects in either the efficacy or safety regression models. This is in contrast with previous reports of marked increases in bias for moderate to high correlation settings.9

The results from a single simulated sample are presented in Figure 1 using the GLMM approach. Figure 1A shows the joint posterior distribution of $H_E(\mu_1, \mu_2) = \mu_2 - \mu_1$ and $H_S(p_1, p_2) = p_2 - p_1$ using the GLMM model. The fact that, compared to the placebo group, subjects in the active treatment group who experience an AE are more likely to benefit from the treatment is visible in the elliptical shape of the distribution. Through joint modelling of the efficacy and safety endpoints, it is possible to incorporate this information into the BR assessment.

The complete BR profile for the sample simulated data set is represented by the BR contour plot in Figure 1B. It shows that most evidence supports a BR profile where the difference between active and placebo in the risk of an AE is at most 0.35 and in efficacy at least 80. This scenario has 84% posterior probability. If a difference of at least 90 in efficacy is deemed to be clinically meaningful, then to maintain the same level of posterior probability (or POTS) the maximum difference in the risk of an AE must be increased to close to 0.4. Conversely, if 70% is the minimum posterior probability necessary to progress the drug through its development pathway, then Figure 1B shows that this constrains the maximum difference in risk to be at least 0.33 or the minimum difference in efficacy to be at most 95. Different stakeholders may therefore use the information in the BR contour plot of Figure 1B to assess the extent to which the data collected and any prior information used in the analysis support their desired BR profile.

The same simulation study was performed for different values of $\rho_2$ while keeping all other parameters fixed. Table 2 shows the impact that $\rho_2$ has on the average posterior probability of the BR profile $H_E(\mu_1, \mu_2) = \mu_2 - \mu_1 > 100$ and $H_S(p_1, p_2) = p_2 - p_1 < 0.3$ across the simulations for both modelling approaches. This profile corresponds to the bottom right quadrant in Figure 1A as defined by the dashed lines. In this case, as the value of $\rho_2$ increases, the joint distribution of $\mu_2 - \mu_1$ and $p_2 - p_1$ becomes increasingly stretched along the top right and bottom left quadrants and therefore the posterior probability associated with the BR profile above decreases. The specific relationship between correlation in the observed responses and the posterior probability of a BR profile of interest will depend on the nature of the correlation, ie, whether this is positive or negative.

TABLE 1 Summary of key generalised linear mixed model (GLMM) and copula model parameters averaged across 1000 simulations, together with bias and mean squared error (MSE) estimates

Model                   Parameter   Mean      SD      2.5%      50%       97.5%     Bias    MSE
GLMM                    μ1          −150.27   10.10   −170.04   −150.27   −130.49   0.27    105.78
GLMM                    μ2          −50.14    10.11   −69.93    −50.14    −30.35    0.14    109.52
GLMM                    p1          0.10      0.03    0.05      0.10      0.16      <0.01   <0.01
GLMM                    p2          0.40      0.05    0.31      0.40      0.49      <0.01   <0.01
GLMM                    ρ1          0.08      0.07    <0.01     0.07      0.22      0.02    <0.01
GLMM                    ρ2          0.59      0.05    0.49      0.60      0.68      0.01    <0.01
Gaussian copula model   μ1          −150.27   10.13   −170.10   −150.28   −130.44   0.27    105.71
Gaussian copula model   μ2          −50.13    10.04   −69.77    −50.13    −30.50    0.13    109.37
Gaussian copula model   p1          0.10      0.03    0.05      0.10      0.16      <0.01   <0.01
Gaussian copula model   p2          0.40      0.05    0.31      0.40      0.49      <0.01   <0.01
Gaussian copula model   ρ1          0.09      0.09    −0.09     0.09      0.27      0.01    0.01
Gaussian copula model   ρ2          0.58      0.05    0.47      0.58      0.68      0.02    <0.01
Gaussian copula model   θ1          0.15      0.16    −0.16     0.16      0.46      0.02    0.02
Gaussian copula model   θ2          0.74      0.07    0.60      0.74      0.86      0.03    0.01

FIGURE 1 Results from a sample simulated dataset. (A) Joint and marginal posterior distribution for $H_E(\mu_1, \mu_2) = \mu_2 - \mu_1$ and $H_S(p_1, p_2) = p_2 - p_1$. (B) Benefit-risk contour plot representing the probability of technical success defined by $\text{Prob}(H_E(\mu_1, \mu_2) \geq \Delta_E \text{ and } H_S(p_1, p_2) \leq \Delta_S \mid \text{Data})$, for varying values of $\Delta_E$ and $\Delta_S$. Each contour represents a region with constant posterior probability. The colour scale on the right represents the level of joint posterior probability

TABLE 2 Average posterior probability as a function of $\rho_2$ across the 1000 simulations for the BR profile $H_E(\mu_1, \mu_2) = \mu_2 - \mu_1 > 100$ and $H_S(p_1, p_2) = p_2 - p_1 < 0.3$

ρ2      GLMM Model, %   Gaussian Copula Model, %
0       25.19           25.48
0.2     23.28           23.26
0.4     21.06           21.29
0.6     19.02           19.48
0.75    17.00           17.60

3.2 | Example 2: BR and optimal dose selection

Identifying the dose with optimal BR profile is one of the most challenging hurdles in drug development. A dose that is too high may result in an unacceptable risk profile, while a dose that is too low may decrease the chances of achieving the desired level of efficacy in a phase 3 trial. These considerations suggest a need to address the problem of selecting the optimal dose as a BR assessment one.28 In particular, allowing for observed efficacy and toxicity levels to be linked at the subject level can be particularly informative. Generally, dose-response modelling attempts to characterise the relationship between drug concentration and efficacy and/or safety response. For efficacy endpoints, standard models include the family of Emax models.29 Emax models can accommodate a variety of different dose-response relationships such as exponential or sigmoid. Dose-response models for safety have been less well characterised in humans, partly due to challenges in linking the underlying biological mechanisms to toxicology levels. Nevertheless, it is plausible to consider a model where small increments in dose or concentration lead to small increases in toxicity up to a dose level beyond which toxicity increases substantially with dose.

Let μd = (μ1d, μ2d) represent the mean response vector for the efficacy (μ1d) and safety (μ2d) responses at dose d. Also, define the minimum effective dose (MED) as the smallest dose d that produces an improvement of size ΔE or larger compared to placebo (d = 0) with posterior probability greater than p, ie,

$$\mathrm{MED} = \arg\min_d \{\mathrm{Prob}(H_E(\mu_{1d}, \mu_{10}) \geq \Delta_E \mid \mathrm{Data}) \geq p\}.$$

In addition, define the critical effective dose (CED) as the largest dose d that produces an increase in toxicity no greater than ΔS compared to placebo with posterior probability greater than p,30 ie,

$$\mathrm{CED} = \arg\max_d \{\mathrm{Prob}(H_S(\mu_{2d}, \mu_{20}) \leq \Delta_S \mid \mathrm{Data}) \geq p\}.$$

The functions HE and HS were introduced in Section 2.3 and represent the comparison of interest of the efficacy and safety mean responses, respectively, between a particular dose and placebo. For simplicity, the probability threshold p in the definitions of MED and CED has been kept the same but this can differ between the two.
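The two definitions can be illustrated with a minimal sketch, assuming posterior draws of HE(μ1d, μ10) and HS(μ2d, μ20) are already available for each dose on a grid. The function names, the dictionary layout and the toy draws below are illustrative assumptions and are not taken from the paper.

```python
def post_prob(draws, condition):
    """Share of posterior draws that satisfy `condition` (Monte Carlo estimate)."""
    return sum(1 for x in draws if condition(x)) / len(draws)


def med_ced(eff_diff, saf_diff, delta_e, delta_s, p):
    """Return (MED, CED) over a grid of doses.

    eff_diff[d] and saf_diff[d] hold posterior draws of HE and HS for dose d
    (hypothetical data layout). None means no dose on the grid qualifies.
    """
    effective = [d for d, x in eff_diff.items()
                 if post_prob(x, lambda v: v >= delta_e) >= p]
    tolerable = [d for d, x in saf_diff.items()
                 if post_prob(x, lambda v: v <= delta_s) >= p]
    med = min(effective) if effective else None
    ced = max(tolerable) if tolerable else None
    return med, ced


# Toy posterior draws for three doses (made-up numbers, for illustration only).
eff = {1: [40, 60, 70, 90], 4: [85, 95, 110, 120], 6: [90, 100, 115, 130]}
saf = {1: [0.00, 0.05, 0.10, 0.10], 4: [0.20, 0.25, 0.30, 0.35],
       6: [0.40, 0.45, 0.50, 0.55]}
med, ced = med_ced(eff, saf, delta_e=80, delta_s=0.3, p=0.7)
print(med, ced)
```

In practice the draws would come from the MCMC output of the joint model, and separate thresholds p could be passed for efficacy and safety if the two are allowed to differ.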


To illustrate the proposed approach for dose selection the scenario of Section 3.1 is revisited and extended to a dose‐response setting by assuming that 5 different active doses and placebo are included in a trial with the aim of finding the dose with optimal BR profile. This section will focus on the copula approach although the ideas described here could be adapted to the GLMM method too. Assume the selected active doses take values d2 = 0.3, d3 = 0.7, d4 = 1, d5 = 4, and d6 = 6 units with d1 = 0 for placebo. In this example, a 3‐parameter Emax model is used to model the relationship between dose d and μ1d, ie,

$$\mu_{1d} = E_0 + \frac{E_{\max}\, d}{ED_{50} + d}, \qquad (6)$$

where the parameters E0, Emax, and ED50 represent the placebo response, the maximum achievable increase in response over placebo, and the dose which produces 50% of the Emax effect, respectively. For the safety response, a linear regression model is used on the probit scale, as follows:

$$\Phi^{-1}(p_d) = a + b\,d, \qquad (7)$$

with μ2d = pd representing the probability of an adverse event of special interest at dose d. Similar approaches for modelling safety dose‐response curves have been used elsewhere.5 To generate simulated data, the values of E0, Emax, ED50, a, and b are set to −150, 150, 0.5, Φ⁻¹(0.1), and (Φ⁻¹(0.6) − Φ⁻¹(0.1))/6, respectively. The choice for a and b implies that a = −1.28 (and pd1 = 0.1) and b = 0.26 (with pd6 = 0.6). The variance for the continuous response is such that σ² = 100² across all doses. In a dose‐response setting, it is plausible to assume that any correlation between the efficacy and safety outcomes at the subject level may vary with dose. To illustrate this, the correlation ρd between the efficacy and safety endpoints at dose d was defined as ρd = (d/(d + 1))³. Hence, for placebo (d = d1 = 0), ρd1 = 0, and for the highest dose (d = d6 = 6), ρd6 = 0.63. Remaining values are ρd2 = 0.01, ρd3 = 0.07, ρd4 = 0.13, and ρd5 = 0.51. A total of 50 subjects are allocated to each dose group. The Gaussian copula model used is similar to that in Section 3.1.2, with the exception of the marginal model for the mean in (5), which is now given in (6) and (7) (details of the likelihood function are given in SI). MCMC was used to obtain samples from the posterior distributions of interest using noninformative prior distributions (see SI for full results). The copula approach proved particularly useful in the dose‐response setting as it enabled the flexibility of modelling the safety profile as a continuous function of dose as defined in (7). In contrast, because the variance τi² of the random effect ui in the GLMM model (4) would need to change with dose to reflect the increased correlation, to obtain the marginal parameters pdk defined in (4) dose would need to be treated as a categorical covariate.
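The simulation truth implied by these settings can be tabulated with a short sketch. It uses only the parameter values quoted above; the function and constant names are illustrative, and the code reproduces the true curves only, not the data generation or the model fit.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

# Simulation settings quoted in the text.
E0, EMAX, ED50 = -150.0, 150.0, 0.5
A = STD_NORMAL.inv_cdf(0.1)
B = (STD_NORMAL.inv_cdf(0.6) - STD_NORMAL.inv_cdf(0.1)) / 6
DOSES = [0, 0.3, 0.7, 1, 4, 6]


def mean_efficacy(d):
    """Three-parameter Emax curve, equation (6)."""
    return E0 + EMAX * d / (ED50 + d)


def prob_adverse_event(d):
    """Probit-linear safety curve, equation (7)."""
    return STD_NORMAL.cdf(A + B * d)


def correlation(d):
    """Dose-dependent efficacy-safety correlation, rho_d = (d / (d + 1))^3."""
    return (d / (d + 1)) ** 3


for d in DOSES:
    print(f"d={d:<4} mu_1d={mean_efficacy(d):8.2f} "
          f"p_d={prob_adverse_event(d):.3f} rho_d={correlation(d):.3f}")
```

Three decimals are printed for ρd because the value at d = 1 is exactly 0.125, which the text reports as 0.13.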

Let HE (μ1d, μ10) = μ1d − μ10 represent the difference in mean response between dose d and placebo, whereas HS (μ2d, μ20) = pd − p1 is the difference in risk of an AE in dose d versus placebo. Once the data are collected, to select the dose d with optimal BR profile given ΔE, ΔS and p in the definitions of MED and CED one can proceed as follows: if MED < CED, then any dose in the range [MED, CED] will satisfy the desired BR profile. If MED = CED, then this corresponds to the single optimal dose. If MED > CED, then a decision will need to be made either to modify ΔE, ΔS or p, or on whether the clinical programme is still viable. As in Section 3.1.3, it is therefore important to understand the impact that these parameters have on the selection of dose. Graphical approaches can aid in this process.
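The three-way rule above is simple enough to write down directly. The following sketch is one possible encoding; the function name, the returned messages and the use of `None` for "no dose met the criterion" are added conventions, not part of the paper.

```python
def dose_decision(med, ced):
    """Classify a (MED, CED) pair following the rule described in the text.

    None stands for "no dose met the criterion" (an added convention).
    """
    if med is None or ced is None or med > ced:
        return "no dose meets the BR profile: revisit thresholds, p or viability"
    if med == ced:
        return f"single optimal dose: {med}"
    return f"any dose in [{med}, {ced}] satisfies the BR profile"


print(dose_decision(2.5, 4))
print(dose_decision(4, 4))
print(dose_decision(5, 4))
```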

The plot in Figure 2A displays pairs (MED, CED) for particular values of ΔE, ΔS and p for a sample simulated data set. The values for ΔE and ΔS may be defined based on desired clinically meaningful thresholds. Although ultimately the objective is always not to incur any increase in toxicity (ie, to have large evidence/probability when ΔS = 0), this may not always be possible due to the complexity of the disease and/or the drug's mechanism of action. In this case, the plot in Figure 2A can be used to explore the limits of the available evidence for various values of ΔE, ΔS and p. For example, for this sample data set, it shows that, for the BR profile ΔE = 80, ΔS = 0.3, and p = 70%, the MED is 2.5 units and the CED is 4 units. Therefore, any dose in the range [2.5, 4] can be considered “optimal.” If a risk difference of 0.3 is deemed too high and the clinical team instead considers ΔS = 0.05 as the maximum allowed difference between active and placebo in the probability of an adverse event then, to achieve MED ≤ CED, the value of posterior probability p needs to be lower than 30%, ie, there is considerably more uncertainty associated with the second, more stringent, BR profile. In general, as the value of ΔE increases (from left to right panel), the MED increases for a given value of ΔS and p. Conversely, the CED decreases as the value of ΔS decreases for fixed ΔE and p. Therefore, as ΔE increases and ΔS decreases, the condition MED ≤ CED is only achieved by lowering the posterior probability p, ie, by accepting increased uncertainty on the BR profile. In addition, it is interesting to note that when ΔE = 100 the value of p will not increase beyond 70% for the range of ΔS considered. The plot in Figure 2A can thus be used to make explicit the impact that the trade‐off between the quantities ΔE, ΔS, and p has on the relationship between MED and CED in the presence of correlation and uncertainty. If the MED > CED, ie, if the points in Figure 2A are below the diagonal, then there is no optimal dose for the chosen values of ΔE, ΔS and p.

FIGURE 2 (A) MED vs CED for different values of ΔE (efficacy threshold, E1), ΔS (safety threshold, S1), and p (probability, %) for a sample simulated dataset. Diagonal line represents the MED = CED line. (B) Power vs dose for different benefit‐risk scenarios. Gaussian copula model approach

Typically, in dose‐response clinical trials, the clinical development team will want to evaluate the ability of the chosen study design to successfully deliver a dose with the desired BR profile prior to study start. In addition, from a planning perspective, it may also be of interest to understand which doses are more likely to be carried forward to phase 3. If the BR profile is defined as Prob(HE(μ1d, μ10) ≥ ΔE and HS(μ2d, μ20) ≤ ΔS | Data) ≥ p, then one can evaluate the operating characteristics of this joint success criterion given a range of values for ΔE, ΔS, and p, chosen functions HE and HS, and assumptions for the joint model in (6) and (7). The plot in Figure 2B illustrates the results for the simulation study described in this section. The term “Power” represents the probability of achieving the chosen BR profile across the 1000 simulated data sets. Overall, the power for a BR scenario follows a concave pattern, increasing with dose up to the point of maximum power, which can be defined as the optimal dose, and then decreasing in a steep manner towards zero for fixed ΔE, ΔS, and p, reflecting the fact that the desired BR profile is less likely as the dose increases due to the constraints imposed on the safety profile. If the team wishes to achieve the standard 80% power, then Figure 2B shows that, for the scenario considered here, this is only achievable if ΔE ≤ 80 and ΔS ≥ 0.2, unless one is prepared to carry substantial uncertainty to phase 3 by setting p < 50%. When ΔE = 80, ΔS = 0.6 and p = 80% (bottom left panel), any dose in the range [2.3, 5.8] achieves power ≥80%, with 5 being the optimal dose for the chosen BR profile (given the model assumptions) at which power is approximately 100%.
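The notion of power used here can be sketched as follows, assuming that each simulated trial has already been analysed and reduced to one posterior probability of the BR profile per dose. The function names and the toy numbers are hypothetical; only the definition of power as the share of simulated data sets meeting the criterion comes from the text.

```python
def br_power(posterior_probs, p):
    """Share of simulated data sets whose posterior probability of the
    BR profile reaches the threshold p."""
    return sum(prob >= p for prob in posterior_probs) / len(posterior_probs)


def doses_with_power(power_by_dose, target=0.8):
    """Doses whose power reaches the target, plus the dose of maximum power."""
    adequate = sorted(d for d, pw in power_by_dose.items() if pw >= target)
    best = max(power_by_dose, key=power_by_dose.get)
    return adequate, best


# Made-up posterior probabilities from five simulated trials at each of three doses.
sims = {1: [0.35, 0.55, 0.62, 0.81, 0.90],
        4: [0.82, 0.85, 0.88, 0.91, 0.97],
        6: [0.10, 0.25, 0.40, 0.83, 0.95]}
power = {d: br_power(probs, p=0.8) for d, probs in sims.items()}
print(power, doses_with_power(power))
```

With 1000 simulated data sets per scenario, as in the paper, the same two functions would be applied to each panel's combination of ΔE, ΔS and p.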

3.3 | Example 3: single efficacy and two safety outcomes

As the drug under investigation proceeds through its development pathway, new safety concerns may arise that are likely to have substantial impact on the drug's BR profile. Here, a scenario where a single efficacy response and two safety responses per subject are modelled jointly is considered. To illustrate this situation, the example in Section 3.1 is again revisited. For the efficacy (yi1) and the initial safety responses (yi2), the parameter specifications are the same as described in Section 3.1 (a normal and Bernoulli distribution, respectively). The additional safety response, yi3, represents an event rate and is assumed to follow a negative binomial distribution with dispersion φt = 0.5 and mean rate rt, with r1 = 3 and r2 = 2 for the placebo and active groups, respectively. This means that for the new safety risk the event rate is lower in those subjects receiving active drug. The two different types of AEs will be represented by S1 (binary) and S2 (event rate), whereas E1 represents the efficacy response. As before, 1000 simulated data sets were generated using the NORTA method25 with different correlation matrices for subjects in the active and placebo groups as follows:

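$$R_{\mathrm{placebo}} = \begin{bmatrix} 1 & 0.1 & 0.1 \\ 0.1 & 1 & 0.3 \\ 0.1 & 0.3 & 1 \end{bmatrix}, \qquad R_{\mathrm{active}} = \begin{bmatrix} 1 & 0.6 & 0.3 \\ 0.6 & 1 & 0.6 \\ 0.3 & 0.6 & 1 \end{bmatrix}.$$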
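A simplified, NORTA-style sketch of how one subject's three responses could be generated is given below. It is an illustration under stated assumptions rather than the authors' simulation code: the matrix is used directly as the latent Gaussian correlation (the full NORTA method solves for the latent correlations that reproduce the target correlations of the outcomes), the negative binomial variance is assumed to be r + φr², and the values of `mu`, `sigma` and `p` are placeholders because the Section 3.1 settings are not repeated here.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

R_ACTIVE = [[1.0, 0.6, 0.3],
            [0.6, 1.0, 0.6],
            [0.3, 0.6, 1.0]]


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def negbin_quantile(u, mean, dispersion):
    """Smallest y with CDF(y) >= u, assuming variance = mean + dispersion * mean**2."""
    size = 1.0 / dispersion
    prob = size / (size + mean)
    y, cdf = 0, 0.0
    while True:
        log_pmf = (math.lgamma(y + size) - math.lgamma(size) - math.lgamma(y + 1)
                   + size * math.log(prob) + y * math.log(1.0 - prob))
        cdf += math.exp(log_pmf)
        if cdf >= u or y > 10_000:  # guard against u numerically equal to 1
            return y
        y += 1


def draw_subject(rng, chol, mu, sigma, p, rate, dispersion):
    """One subject's (efficacy, binary AE, AE count) via a Gaussian copula."""
    e = [rng.gauss(0.0, 1.0) for _ in range(3)]
    z = [sum(chol[i][k] * e[k] for k in range(i + 1)) for i in range(3)]
    u = [STD_NORMAL.cdf(v) for v in z]
    y1 = mu + sigma * z[0]                        # normal margin
    y2 = 1 if u[1] > 1.0 - p else 0               # Bernoulli margin
    y3 = negbin_quantile(u[2], rate, dispersion)  # negative binomial margin
    return y1, y2, y3


rng = random.Random(2018)
chol = cholesky(R_ACTIVE)
# mu, sigma and p are placeholders; rate=2 and dispersion=0.5 are quoted in the text.
sample = [draw_subject(rng, chol, mu=-50.0, sigma=100.0, p=0.3,
                       rate=2.0, dispersion=0.5) for _ in range(2000)]
```

Each margin is obtained by pushing a correlated standard normal through its own inverse distribution function, which is what allows continuous, binary and count responses to share one dependency structure.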

Bayesian inference was performed on the simulated data via MCMC assuming noninformative priors (see SI). To assess the impact of accounting for dependency, the Gaussian and Independence copula models were constructed and compared using DIC. The 3‐dimensional copula model takes the form:

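$$\begin{bmatrix} \mu_t \\ \Phi^{-1}(p_t) \\ \log(r_t) \end{bmatrix} = \begin{pmatrix} x_{i1}\beta_{11} + x_{i2}\beta_{12} \\ x_{i1}\beta_{21} + x_{i2}\beta_{22} \\ x_{i1}\beta_{31} + x_{i2}\beta_{32} \end{pmatrix}, \qquad (8)$$

$$\begin{bmatrix} \sigma_t \\ \varphi_t \end{bmatrix} = \begin{pmatrix} x_{i1}s_{11} + x_{i2}s_{12} \\ x_{i1}s_{31} + x_{i2}s_{32} \end{pmatrix},$$

$$\begin{bmatrix} \theta_{12t} \\ \theta_{13t} \\ \theta_{23t} \end{bmatrix} = \begin{pmatrix} x_{i1}\omega_{121} + x_{i2}\omega_{122} \\ x_{i1}\omega_{131} + x_{i2}\omega_{132} \\ x_{i1}\omega_{231} + x_{i2}\omega_{232} \end{pmatrix},$$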

with xit and βjt in (8) representing the treatment indicators and covariate effect parameters, respectively. Note that j = 1 represents the efficacy response E1, whereas j = 2, 3 represents safety response S1 and S2, respectively. The sjt's form the dispersion parameters of the j‐th response for subjects on treatment t, and, for the Gaussian copula model, the ωjkt represent the dependency parameters between responses yij and yik (j ≠ k) for subjects on treatment t.

The simulation study results show that both the Gaussian and Independence copula models estimate the marginal parameters with good precision (see SI). The average DIC is 3332.11 for the Gaussian copula, compared with 3435.77 for the Independence model, suggesting that allowing for dependency via the Gaussian copula approach provides a better fit to the data.
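For readers less familiar with the criterion, the following sketch shows how DIC is usually assembled from MCMC output, using the standard definition DIC = D̄ + pD with pD = D̄ − D(θ̄).23 The function name and the toy deviance values are illustrative assumptions; they are not the values behind the averages reported above.

```python
def dic(deviance_draws, deviance_at_posterior_mean):
    """Return (DIC, pD), where DIC = Dbar + pD and pD = Dbar - D(theta_bar).

    deviance_draws: deviance evaluated at each posterior draw.
    deviance_at_posterior_mean: deviance evaluated at the posterior mean
    of the parameters.
    """
    d_bar = sum(deviance_draws) / len(deviance_draws)
    p_d = d_bar - deviance_at_posterior_mean
    return d_bar + p_d, p_d


# Toy deviances (made-up numbers): Dbar = 100, pD = 4, DIC = 104.
print(dic([102.0, 98.0, 100.0], 96.0))
```

When two models are fitted to the same data, the one with the smaller DIC is preferred, which is the sense in which the Gaussian copula is said to fit better here.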

Following from Section 2.3, let ΔE1 represent minimum differences in mean efficacy response on active compared to placebo, while ΔS1 and ΔS2 represent increases in risk for active compared to placebo for each safety response. Also, let HE1(μ1, μ2) = μ2 − μ1, HS1(p1, p2) = p2 − p1 and HS2(r1, r2) = r2/r1. The posterior probability of achieving thresholds ΔE1, ΔS1 and ΔS2 given the data can be obtained from the joint posterior distribution.
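As a minimal sketch of that last step, assuming the joint posterior is available as a list of MCMC draws, the probability is the share of draws in which all three conditions hold at once. The dictionary layout, function name and toy draws are hypothetical.

```python
def joint_br_probability(draws, delta_e1, delta_s1, delta_s2):
    """Monte Carlo estimate of
    Prob(mu2 - mu1 >= delta_e1 and p2 - p1 <= delta_s1 and r2 / r1 <= delta_s2 | Data).

    Each draw is a dict with keys mu1, mu2, p1, p2, r1, r2 (hypothetical layout).
    """
    hits = sum(
        1 for d in draws
        if d["mu2"] - d["mu1"] >= delta_e1
        and d["p2"] - d["p1"] <= delta_s1
        and d["r2"] / d["r1"] <= delta_s2
    )
    return hits / len(draws)


# Four made-up joint draws, for illustration only.
draws = [
    dict(mu1=-150, mu2=-60, p1=0.1, p2=0.4, r1=3.0, r2=2.0),
    dict(mu1=-140, mu2=-70, p1=0.2, p2=0.5, r1=3.0, r2=2.7),
    dict(mu1=-150, mu2=-100, p1=0.1, p2=0.3, r1=3.0, r2=2.0),
    dict(mu1=-155, mu2=-50, p1=0.1, p2=0.3, r1=3.2, r2=2.0),
]
print(joint_br_probability(draws, delta_e1=60, delta_s1=0.4, delta_s2=0.8))
print(joint_br_probability(draws, delta_e1=100, delta_s1=0.4, delta_s2=0.8))
```

The conditions are checked jointly within each draw rather than multiplied across separate marginal probabilities, so any posterior dependence between the parameters carries through to the result.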

The plot in Figure 3 shows the posterior probability for 3 different thresholds for efficacy together with different thresholds for both safety responses for the Gaussian copula model for a typical simulated dataset. The left panel in Figure 3 shows that there is 96% posterior probability associated with the following BR profile when comparing active with placebo: The true difference in mean efficacy is at least 60, and the true difference in the probability of event S1 is at most 0.4 and the true risk ratio of event S2 is at most 0.8 (dashed/dotted line). The decrease in posterior probability associated with an increase in the efficacy threshold from 60 to 80 (middle panel) is not substantial. However, when the efficacy threshold is set to 100 (right panel), there is a large decrease in posterior probability, thus suggesting that any BR profile that requires a difference in the efficacy response of 100 or more is less likely to be attainable. In general, larger thresholds for efficacy combined with smaller thresholds for safety lead to lower posterior probabilities for the Gaussian copula compared with the Independence copula model (see SI). Hence, ignoring dependency can lead to overestimating the asset's chance of achieving the desired BR profile.

FIGURE 3 Posterior probability plot representing Prob[(μ2 − μ1 ≥ ΔE1) and (p2 − p1 ≤ ΔS1) and (r2/r1 ≤ ΔS2) | Data] for a sample simulated dataset for different thresholds ΔE1, ΔS1 and ΔS2 using the Gaussian copula model

4 | A REAL DATA EXAMPLE

Compound X was a potential treatment for new‐onset type 1 diabetes that aimed to halt the decline of C‐peptide levels over time. A clinical trial was conducted to assess the efficacy and safety of treatment X in patients with new‐onset type 1 diabetes over an 18‐month period. The key primary efficacy endpoint was change from baseline in C‐peptide levels at 6 months. Key adverse events of interest included infection and cytokine release syndrome (CRS). A total of 73 subjects had C‐peptide levels recorded at 6 months (39 received treatment X, 34 placebo). Details of the study have been anonymised. To illustrate the use of both the GLMM and Copula approaches in a real data example, the GLMM method will be used to assess BR when a single efficacy endpoint and a single safety endpoint have been identified, whereas the copula method will be applied when more than 2 endpoints are included in the BR assessment.

4.1 | Single efficacy and safety endpoints

The change from baseline in C‐peptide levels at 6 months is modelled as a continuous response y1 following a normal distribution with mean μt and variance σt² (t = 1 for the placebo group, t = 2 for the treatment X group). For the infection type adverse events, a binary variable y2 indicating the occurrence of at least one infection during the clinical trial is created and assumed to follow a Bernoulli distribution with parameter pt. The Pearson correlation between the efficacy and safety responses among those subjects receiving treatment X is −0.23 (0.05 on placebo), suggesting that smaller changes from baseline in C‐peptide levels (indicative of treatment effect) are associated with the occurrence of at least one infection event in this group of patients. The GLMM model in (4) was used to obtain the joint posterior distribution of interest through MCMC simulations. To accommodate the observed negative correlation, the response variable change from baseline in C‐peptide level was transformed from y1 to −y1 (see SI).14 The results can be found in Table 3 and Figure 4A. They show that, in patients receiving treatment X, both the risk of infection and the average increase in C‐Peptide levels at 6 months compared to baseline are higher than in those receiving placebo.

The BR contour plot in Figure 4A provides an overall assessment of the trade‐off between improvements in C‐Peptide levels and the probability of experiencing an infection under treatment X compared to placebo. It suggests that BR profiles with high posterior probability correspond to scenarios with a substantial increase ΔS in the probability of infection over placebo for only low to modest differences ΔE in C‐Peptide levels. The data do not support BR profiles for which ΔE > 0.8 or ΔS < 0.1.

TABLE 3 Benefit‐risk analysis for treatment X. Posterior parameter estimates using the generalised linear mixed model method

Parameter    Mean     SD      2.5%     50%      97.5%
μ1           −0.32    0.14    −0.60    −0.32    −0.04
μ2           0.32     0.12    0.09     0.32     0.55
p1           0.67     0.08    0.06     0.68     0.24
p2           0.92     0.04    0.15     0.93     0.38
μ2 − μ1      0.63     0.18    0.27     0.63     0.99
p2 − p1      0.25     0.09    0.07     0.24     0.42
σ1²          0.67     0.18    0.41     0.64     1.08
σ2²          0.53     0.13    0.33     0.51     0.84

FIGURE 4 Benefit‐risk analysis for treatment X. (A) Benefit‐risk contour plot representing Prob(μ2 − μ1 ≥ ΔE and p2 − p1 ≤ ΔS | Data) for varying values of ΔE and ΔS. (B) Covariate profile plot for baseline C‐peptide level (dots represent the minimum, 20th percentile, 40th percentile, 60th percentile, 80th percentile and maximum of the observed distribution of baseline C‐peptide level)

To assess the impact that baseline C‐Peptide level has on the BR profile the model in (4) is extended to include an interaction term between treatment and baseline C‐Peptide level. The question of interest is, “Given a patient's baseline C‐peptide level, what is his/her likely BR profile with treatment X compared to placebo?” The plot in Figure 4B represents the change in the mean C‐Peptide level and probability of an infection as a function of baseline C‐Peptide levels for each treatment group. Patients in the placebo group with lower baseline C‐Peptide levels tend to improve or remain constant compared to those with higher levels. In addition, Figure 4B also suggests that for these patients, the probability of experiencing an infection increases with increasing baseline C‐Peptide levels. For those patients receiving treatment X, both the probability of an infection and the improvements in C‐Peptide at 6 months remain roughly constant across baseline C‐Peptide levels. Thus, the BR profile of treatment X is robust to a patient's baseline C‐Peptide level, whereas within the placebo group, subjects with lower baseline C‐Peptide levels have a more favourable BR profile.

4.2 | Single efficacy and two safety endpoints

The analysis from the previous section was extended to include information on the risk of CRS in addition to infection. Cytokine release syndrome is represented as an indicator variable, y3, for each subject and is assumed to follow a Bernoulli distribution (with y3 = 1 if the event occurs). The 3 responses, y1, y2, and y3, were jointly modelled using a 3‐dimensional Gaussian copula marginal regression model with covariates of treatment group and baseline C‐Peptide level (y1 mean only). Probit link functions were used for y2 and y3. The dependency matrices were allowed to differ between treatment groups. The joint posterior distribution was obtained through MCMC simulation using noninformative priors (see SI for details of the likelihood function and prior distributions).

As discussed in Section 3.1.2, using the Gaussian copula to jointly model normal and Bernoulli responses implies that the copula dependency parameters correspond to poly‐serial (normal‐binary) and poly‐choric (binary‐binary) correlation estimates.26 This provides a convenient method for comparing dependency parameters across treatments and responses.

As expected, the posterior estimates for the mean C‐peptide changes from baseline and probability of experiencing an infection agree with those from Section 4.1 (see SI for full details). The posterior mean estimate for the probability of experiencing CRS is 0.1 for placebo and 0.94 for treatment X. The posterior means for the copula dependency parameters suggest there is no substantial association between change from baseline in C‐peptide levels and probability of CRS for either treatment X or placebo treated subjects (θ13Placebo = −0.18 and θ13Active = −0.02), and a reasonable level of positive association between probability of Infection and probability of CRS for subjects receiving treatment X (θ23Active = 0.45), but no association in the placebo treated subjects (θ23Placebo = −0.02).

The plot in Figure 5 illustrates the probability of success for various efficacy and safety thresholds and can be used to communicate the strength of evidence of different BR profiles to different stakeholders. For this trial, the posterior probability is generally very low unless the potential for up to a 0.9 increase in the risk of CRS versus placebo is accepted (S2 threshold of p22 − p21 ≤ 0.9). Given that most of the CRS events were deemed by the clinical team involved in the trial to be tolerable and transient, such a large increase in risk may not be unreasonable. The plot also suggests that a 0.5 increase in the risk of infection compared to placebo (S1 threshold of p12 − p11 ≤ 0.5) would need to be accepted to achieve BR profiles with greater than 25% posterior probability. For these safety thresholds, the mean treatment difference in change from baseline in C‐Peptide levels would need to be at least 0.4 compared to placebo (E1 threshold of μ2 − μ1 ≥ 0.4) to yield a BR profile with approximately 70% posterior probability. However, if a larger magnitude of difference in the efficacy mean response is required for a clinically meaningful outcome, for example 0.7, then the posterior probability is at most approximately 25%. The above example illustrates how the discussions around a new treatment's BR balance can enable clinical teams and other stakeholders to assess whether the available evidence justifies further development.

5 | DISCUSSION

This paper proposes two approaches to Bayesian joint modelling for BR assessment across different stages of the drug development pathway. The techniques described here, GLMMs and Copulas, can help clinical teams gain insight into the problem of BR assessment in the presence of multidimensional measures that are different in nature. By conducting inference within the Bayesian framework, the uncertainty around the BR profile is translated into intuitive probabilistic statements that can be used to communicate the extent to which the existing evidence supports BR statements of interest. In addition, prior information can be incorporated in a straightforward manner so that the BR profile can be evaluated by contrasting newly acquired information with previous evidence. For example, the study team may be interested in exploring how the BR balance changes from phase 2 to phase 3 or, if relevant, from phase 3 to phase 4 and beyond. This feature can be particularly useful to assess changes in the safety profile. Although not explored here, another important feature of the Bayesian framework is the ability to predict the efficacy and safety responses of future patients conditional on the data observed so far. This is straightforward under the Bayesian paradigm through the posterior predictive distribution.

FIGURE 5 Benefit‐risk analysis for treatment X: posterior probability plot representing Prob[(μ2 − μ1 ≥ ΔE1) and (p12 − p11 ≤ ΔS1) and (p22 − p21 ≤ ΔS2) | Data] for varying values of ΔE1, ΔS1 and ΔS2

The use of clinical thresholds allows for the assessment of BR profiles that are specific to the different stakeholders involved in decision‐making, from sponsors all the way to patients. It also allows one to explore the limits of the evidence generated by, for example, assessing the probability of BR profiles corresponding to best and worst case scenarios. In the 2‐dimensional setting, this assessment can be easily represented through BR contour plots as described in Section 3.1. For 3‐dimensional examples, plots like those in Figures 3 and 5 can be used. Beyond 3 dimensions, visual representations of BR become more complex and difficult to interpret.

Both the GLMM and Copula approaches account for correlation among the multiple responses. From a clinical perspective, the existence of correlation at the subject level is a plausible assumption in many diseases/therapies, where through exposure to an active drug the degree to which a patient benefits from a treatment is related to how likely he/she is to experience an adverse event. The simulation studies presented here have shown that accounting for existing correlation results in more accurate probabilistic statements about BR.

One important distinction between the two approaches presented here is the fact that, while Copula models automatically retrieve marginal effects of covariates, which are important for BR assessments at the population level, with the GLMM approach these need to be derived, typically through numerical integration (although closed form solutions or approximations exist for special cases as described here). However, if interest lies in making subject‐specific BR statements,7 the output from the GLMM approach can be used.
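The marginalisation step can be illustrated with a minimal sketch, assuming a binary response with a probit link and a single normal random intercept u ~ N(0, τ²). This is an illustrative special case and not necessarily the exact form of model (4). The marginal probability is obtained by integrating Φ(η + u) over the random-effect distribution, and for this case it agrees with the known closed form Φ(η/√(1 + τ²)).

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def marginal_prob_numeric(eta, tau, n_points=2001, width=8.0):
    """Integrate Phi(eta + u) against the N(0, tau^2) density of u
    using the trapezoid rule on [-width * tau, width * tau]."""
    random_effect = NormalDist(0.0, tau)
    lo, hi = -width * tau, width * tau
    step = (hi - lo) / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        u = lo + i * step
        weight = 0.5 if i in (0, n_points - 1) else 1.0
        total += weight * STD_NORMAL.cdf(eta + u) * random_effect.pdf(u)
    return total * step


def marginal_prob_closed_form(eta, tau):
    """Closed form for a probit link with a normal random intercept."""
    return STD_NORMAL.cdf(eta / math.sqrt(1.0 + tau ** 2))


# Illustrative values: subject-specific linear predictor eta and random-effect SD tau.
eta, tau = -1.28, 0.8
print(marginal_prob_numeric(eta, tau), marginal_prob_closed_form(eta, tau))
```

For links without a closed form, such as the logit, only the numerical route (or an approximation) is available, which is the extra step the GLMM approach requires for population-level statements.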

Other multivariate approaches for BR assessment have been proposed. MCDA has been identified by the EMA's BR methodology project as having potential for BR assessment.3 However, MCDA relies on the identification of appropriate weights and utility functions for each of the endpoints. Although this allows one to incorporate multiple endpoints into a single BR metric, there is a risk that the weights rather than the data drive the BR assessment and conclusions.11 In addition, it may be challenging to understand the impact that dependency among the multiple endpoints has on the conclusions. Recent developments have attempted to overcome some of these limitations;31 however, these methods do not account for potential correlation among the multiple attributes, possibly due to complexities associated with defining such high‐dimensional dependency structures. The approaches proposed here could be used in conjunction with MCDA. If a small number of attributes is deemed influential in the conclusions from the MCDA analysis, then the Bayesian joint modelling approach could be applied to investigate their relationship in more detail and how this impacts the BR assessment.

The BR methods presented here can be expanded to higher dimensions. If more than three measured responses are deemed relevant for BR assessment, then both the GLMM and Copula approaches can be extended to accommodate such scenarios, although care is needed when specifying the covariance structure in GLMMs or the dependency structure in Copulas to ensure identifiability of the relevant model parameters. With GLMMs, questions of identifiability can be overcome by assuming the same shared random effect across all responses.10 However, it is important to note that the complexity of the model increases substantially with each additional dimension and thus the number of responses considered should be kept low, typically no higher than four. Clinical trial design can also be formulated in terms of a BR assessment problem. Given a range of clinical thresholds of interest as defined in Section 2.3, simulations can be used to assess the operating characteristics of a given design relative to a BR profile of interest. Section 3.2 explored this idea from a dose‐response perspective, but the ideas developed there can be extended to the other scenarios as well.

The methods proposed here are computationally simple and can be readily implemented using available software. It is important to note that they are not intended to be exhaustive and other analyses, qualitative and quantitative, would be expected for a thorough assessment of the BR profile.

ACKNOWLEDGEMENTS

The authors would like to thank Nigel Dallow, Graeme Archer, Nicky Best, and James Roger for constructive discussions, and two anonymous referees and the associate editor for helpful comments that improved the content of the paper.


ORCID

Maria J. Costa http://orcid.org/0000-0002-0563-8877
Thomas Drury http://orcid.org/0000-0003-1443-5999

REFERENCES

1. Coplan PM, Noel RA, Levitan BS, Ferguson J, Mussen F. Development of a framework for enhancing the transparency, reproducibility and communication of the benefit‐risk balance of medicines. Clin Pharmacol Therapeut. 2011;89(2):312‐315.

2. Structured approach to benefit‐risk assessment in drug regulatory decision‐making. Draft PDUFA V implementation plan ‐ February 2013.http://www.fda.gov/downloads/ForIndustry/UserFees/PrescriptionDrugUserFee/UCM329758.pdf

3. European Medicines Agency. Benefit‐Risk Methodology Project Work Package 3 Report: Field Tests. London: European Medicines Agency;2011.

4. Waddingham E, Mt‐Isa S, Nixon R, Ashby D. A Bayesian approach to probabilistic sensitivity analysis in structured benefit risk assessment. Biom J. 2015;58:28‐42.

5. Thall P, Cook J. Dose‐finding based on efficacy–toxicity trade‐offs. Biometrics. 2004;60:684‐693.

6. Zhao Y, Zalkikar J, Tiwari RC, LaVange LM. Bayesian approach for benefit‐risk assessment. Statist Biopharmaceut Res. 2014;6:326‐337.

7. Cui S, Zhao Y, Tiwari RC. Bayesian approach to personalized benefit‐risk assessment. Statist Biopharmaceut Res. 2016;8(3):316‐324.

8. Mt‐Isa S, Owens M, Robert V, Gebel M, Schacht A, Hirsch I. Structured benefit‐risk assessment: a review of key publications and initiatives on frameworks and methodologies. Pharm Stat. 2015;15:324‐332.

9. He W, Cao X, Xu L. A framework for joint modeling and joint assessment of efficacy and safety endpoints for probability of success evaluation and optimal dose selection. Stat Med. 2012;31(5):401‐419.

10. He W, Fu B. Benefit‐risk evaluation using a framework of joint modeling and joint evaluations of multiple efficacy and safety endpoints. In: Jiang Q, He W, eds. Benefit‐Risk Assessment Methods in Drug Development: Bridging Qualitative and Quantitative Assessments. CRC Press; 2016:175‐196.

11. Costa MJ, He W, Jemiai Y, Zhao Y, Di Casoli C. The case for a Bayesian approach to benefit‐risk assessment: overview and future directions. Therapeut Innovat Regulat Sci. 2017;51:568‐574.

12. Ma H, Jiang Q, Chuang‐Stein C, et al. Considerations on endpoint selection, weighting determination, and uncertainty evaluation in the benefit‐risk assessment of medical product. Statist Biopharmaceut Res. 2016;8(4):417‐425.

13. de Leon AR, Wu B, Withanage N. Joint analysis of mixed discrete and continuous outcomes via copula models. In: de Leon AR, Chough KC, eds. Analysis of Mixed Data: Methods & Applications. Chapman and Hall/CRC; 2013.

14. Teixeira‐Pinto A, Harezlak J. Factorization and latent variable models for joint analysis of binary and continuous outcomes. In: de Leon AR, Chough KC, eds. Analysis of Mixed Data: Methods & Applications. Chapman and Hall/CRC; 2013.

15. Joe H. Multivariate Models and Dependence Concepts. Chapman and Hall/CRC; 1997.

16. Nelsen RB. An Introduction to Copulas. New York: Springer‐Verlag; 2006.

17. Trivedi PK, Zimmer DM. Copula Modeling: an Introduction for Practitioners. now publishers; 2007.

18. Song PX‐K. Correlated Data Analysis: Modeling, Analytics, and Applications. New York: Springer‐Verlag; 2007.

19. Zimmer DM. Analysis of mixed outcomes in econometrics: applications in health economics. In: de Leon AR, Chough KC, eds. Analysis of Mixed Data: Methods & Applications. Chapman and Hall/CRC; 2013.

20. Kurowicka D, Joe H. Dependence Modeling—Vine Copula Handbook. World Scientific Publishing Company; 2010.

21. Stober JM, Hong HG, Czado C, Ghosh P. Comorbidity of chronic diseases in the elderly: patterns identified by a copula design for mixed responses. Computat Statist Data Anal. 2015;88:28‐39.

22. Stober JM. 2013. Regular vine copulas with the simplifying assumption, time‐variation, and mixed discrete and continuous margins [PhD thesis]. Technical University of Munich. http://d-nb.info/1037198476

23. Spiegelhalter DJ, Best NG, Carlin BP, van der Linde A. Bayesian measures of model complexity and fit. J R Stat Soc Ser B. 2002;64(4):583‐639.

24. Ouellet D. Benefit‐risk assessment: the use of clinical utility index. Expert Opin Drug Saf. 2010;9(2):289‐300.

25. Cario MC, Nelson BL. Modeling and generating random vectors with arbitrary marginal distributions and correlation matrix. Technical Report, Department of Industrial Engineering and Management Science, Northwestern University, Evanston, IL, 1997.

26. Joe H. Dependence Modeling with Copulas. Chapman and Hall/CRC; 2014.

27. Demirtas H. Joint generation of binary and nonnormal continuous data. J Biometr Biostatist. 2014;S12:001.

28. Tao A, Lin Y, Pinheiro J, Shih WJ. Dose finding methods in joint modeling of efficacy and safety endpoints in phase II studies. Intern J Stat Probabil. 2015;4:33‐45.

29. Greco WR, Bravo G, Parsons JC. The search for synergy: a critical review from a response surface perspective. Pharmacol Rev. 1995;47:331‐385.

30. Slob W. Dose response modelling of continuous endpoints. Toxicol Sci. 2002;66(2):298‐312.

31. Saint‐Hilary G, Cadour S, Robert V, Gasparini M. A simple way to unify multicriteria decision analysis (MCDA) and stochastic multicriteria acceptability analysis (SMAA) using a Dirichlet distribution in benefit‐risk assessment. Biom J. 2017;59(3):567‐578.

SUPPORTING INFORMATION

Additional Supporting Information may be found online in the supporting information tab for this article.

How to cite this article: Costa MJ, Drury T. Bayesian joint modelling of benefit and risk in drug development. Pharmaceutical Statistics. 2018;1–16. https://doi.org/10.1002/pst.1852