Increasing efficiency for estimating treatment-biomarker ... · 9/6/2014 · I Overexpression of ETS transcription factors ERG and ETV1 is characterizes a fusion of the androgen-sensitive

Increasing efficiency for estimatingtreatment-biomarker interactions with

historical data

Philip S. Boonstra

University of Michigan20 May 2014

1

Predictive biomarkers in randomized phase II trialsTargeted therapies are, by design, expected not to work equally wellfor all patients. Biomarker(s) may be predictive for treatmentresponse, e.g. higher levels mean higher treatment sensitivity.

2

A statistical challenge for evaluating predictivebiomarkers

Can characterize “predictiveness” through treatment-biomarkerinteraction in statistical model for response, but modest size of typicalphase II trials makes estimating interaction difficult.

3

Randomized Phase II trial at Michigan

I Overexpression of ETS transcription factors ERG and ETV1 ischaracterizes a fusion of the androgen-sensitive promoter from aprostate-specific gene and an ETS gene [Tomlins et al., 2005].

I Advanced prostate cancer patients evaluated for ETStranscription factors in the metastatic lesion.

4


5


I ETS-mediated oncogenic features such as metastasis and tumorgrowth depend on PARP1 [Brenner et al., 2011].

I Thus a PARP1-inhibitor can specifically target ETS-positiveprostate cancers

6


The trial will evaluate the addition of PARP1-inhibitor to a standardtreatment regime (Abiraterone).

7


I Advanced prostate cancerI ETS fusion +: Randomize Abiraterone +/- PARP1-inhibitorI ETS fusion -: Randomize Abiraterone +/- PARP1-inhibitorI n = 120I Need large sample sizes to estimate interactions

8

Historical data to increase efficiency

Historical data may increase statistical efficiency for estimatingtreatment-biomarker interactions in randomized phase II trials. Twosuch types consist of

I patients who all received the standard of care but for whichbiomarker measurements are available

I patients who received either the standard of care or experimentaltreatment but for which biomarker measurements are missing

9

Historical data to increase efficiency

I Strategies exist to estimate main effect of treatment [e.g., Thalland Simon, 1990, Neuenschwander et al., 2010]

I Bias in treatment effect estimates is possible if historical andprospective data are incompatible [DuMouchel and Harris, 1983,Cuffe, 2011]. Will discuss this more at the end.

10

Outline

1. Define prospective data and 2 types of historical data2. Continuous response/biomarker setting

2.1 Formulate response model2.2 Propose a relative efficiency (RE) metric based on Fisher

Information to quantify gain from historical data2.3 Evaluate RE both numerically and analytically

3. Binary response/biomarker setting – Preliminary results

4. Interpretation and caveats

11

Data collected

I Response Y (continuous, binary)I Prognostic Covariates W ∈ Rq

I Treatment T ∈ 0, 1I Biomarker V (continuous, binary)

12

Sources of data

Group 1 Prospective randomized trial: n1 observationsconsisting of Y,W,T,V.

Group 2 Previous biomarker study, lacking the experimentaltherapy: n2 observations consisting ofY,W,T ≡ 0,V.

Group 3 Previous observational or clinical trial of experimentaltherapy: n3 observations consisting of Y,W,T.

13

Response model – continuous setting

I Response Y ∈ RI Prognostic Covariates W ∈ Rq

I Treatment T ∈ 0, 1I Biomarker V ∈ R

A Gaussian-linear model for the response Y is

Y|X ∼ N(X>β, σ2),

with X = 1,W>,T,V,T × V>, β = β0,β>W, βT , βV , θ> and

βW = βW1, βW2, . . . , βWq>.

E(Y|X) = β0 + β>WW + βTT + βVV + θTV

θ is parameter of interest

14

Response model – continuous setting

I Response Y ∈ RI Prognostic Covariates W ∈ Rq

I Treatment T ∈ 0, 1I Biomarker V ∈ R

A Gaussian-linear model for the response Y is

Y|X ∼ N(X>β, σ2),


βW = βW1, βW2, . . . , βWq>.

E(Y|X) = β0 + β>WW + βTT + βVV + θTV

θ is parameter of interest

14

Biomarker model – continuous setting

To account for missing data in group 3, we assume a biomarker model

V|W ∼ N(γ0 + γ>W, τ 2)

and marginalize the response model over V . Conditionalindependence: T ⊥ V|W.

Note: Groups 2 and 3 by themselves provide no information about θitself.

15

Biomarker model – continuous setting


V|W ∼ N(γ0 + γ>W, τ 2)

and marginalize the response model over V . Conditionalindependence: T ⊥ V|W.

Note: Groups 2 and 3 by themselves provide no information about θitself.

15

Differences between groups

Two assumptions allow for population-level differences in groups:

1. Prognostic covariates (W), which are available in all groups,explain some differences between populations.

2. For differences not accounted for by W, we allow intercepts, β0and γ0, to differ between populations.

16

Variance estimate of θ

1. Log-likelihood contributions from each group: `1, `2, `3

2. Ik = E

[∂`k∂β

∂`k∂β>

∂`k∂β

∂`k∂γ>

∂`k∂γ

∂`k∂β>

∂`k∂γ

∂`k∂γ>

]3. I obtained from linear combination of I1, I2 and I3

4. Var(θ) obtained from diagonal element of (I−1)

17

Fisher information: group 1

Expected information contribution from one observation in group 1:

I1 =1σ2

1 · · · · · · · ·W WW> · · · · · · ·T TW> T · · · · · ·V VW> TV V2 · · · · ·

TV TVW> TV TV2 TV2 · · · ·0 0>q 0 0 0 1/(2σ2) · · ·0 0>q 0 0 0 0 σ2/τ 2 · ·0q 0q0>q 0q 0q 0q 0q (σ2/τ 2)W (σ2/τ 2)WW> ·0 0>q 0 0 0 0 0 0>q σ2/(2τ 4)

[ β0 βW βT βV θ σ2 γ0 γ τ 2 ]

18



I2 =1σ2

1 · · · · · · · ·W WW> · · · · · · ·0 0>q 0 · · · · · ·V VW> 0 V2 · · · · ·0 0>q 0 0 0 · · · ·0 0>q 0 0 0 1/(2σ2) · · ·0 0>q 0 0 0 0 σ2/τ 2 · ·0q 0q0>q 0q 0q 0q 0q (σ2/τ 2)W (σ2/τ 2)WW> ·0 0>q 0 0 0 0 0 0>q σ2/(2τ 4)


19



I3 =1δ2

1 · · · · · · · ·W WW> · · · · · · ·T TW> T · · · · · ·V VW> TV V2+2τ 4D2/δ2 · · · · ·

TV TVW> TV TV2+2Tτ 4D2/δ2 TV2+2Tτ 4D2/δ2 · · · ·0 0>q 0 τ 2D/δ2 Tτ 2D/δ2 1/(2δ2) · · ·D DW> TD VD TVD 0 D2 · ·

DW DWW> TDW VDW TVDW 0q D2W D2WW> ·0 0>q 0 τ 2D3/δ2 Tτ 2D3/δ2 D2/(2δ2) 0 0 D4/(2δ2)


with D = βV + Tθ, V = E[V|W] = γ0 + W>γ, andδ = (τ 2D2 + σ2)1/2. The additional off-diagonals result frommarginalizing over the missing V .

20

Numerical study

q = 5 prognostic covariates. For each group, 100,000 draws of

1. W ∼ N(0q, Iq),

2. T|W ∼ Bin(1/2) (group 1)or T|W ∼ Bin(expitα0 + α>W) (group 3),

3. V|W ∼ N(γ0 + γ>W, τ 2) (for groups 1 and 2),

and calculate the empirical average values of I1, I2, and I3.

Choose α0 and γ0 so that E[T] = 0 and E[V] = 0. Modify the signalwith coefficients of determination, R2

σ and R2τ .

21

Numerical study

Define the following measures of relative efficiency for estimating θ,given some non-negative number k:

RE2(k) = (I1 + kI2)−1θ/I−11 θ,

RE3(k) = (I1 + kI3)−1θ/I−11 θ,

RE23(k) = (I1 + 0.5k[I2 + I3])−1θ/I−11 θ,

REmax(k) = (I1 + kI1)−1θ/I−11 θ = 1/(1 + k).

k is the sample size ratio, e.g. n2/n1 or n3/n1, for the historical datacompared to the prospective data.

22

Relative Efficiencies

Sample size ratio (k)

RE 0.3

0.5

0.7

0.9(0.046)

= Rτ2 0.1

=

Rσ2

0.1

0 1 2 3

(0.015)

= Rτ2 0.5

=

Rσ2

0.1

0 1 2 3

(0.303) = Rτ

2 0.1

=

Rσ2

0.5

0.3

0.5

0.7

0.9(0.116)

= Rτ2 0.5

=

Rσ2

0.5

RE2 RE3 RE23 REmax

Efficiency gains from historical data under varying R2’s for response(row, R2

σ) and biomarker (column, R2τ ) models; partial-R2 from θ is in

upper corner of each panel. 23

Interaction type



RE 0.3

0.5

0.7

0.9(0.026)

= Rτ2 0.1

=

Rσ2

0.1

0 1 2 3

(0.009)

= Rτ2 0.5

=

Rσ2

0.1

0 1 2 3

(0.201) = Rτ

2 0.1

=

Rσ2

0.5

0.3

0.5

0.7

0.9(0.078)

= Rτ2 0.5

=

Rσ2

0.5

RE2 RE3 RE23 REmax




Interaction type



RE 0.3

0.5

0.7

0.9(0.004)

= Rτ2 0.1

=

Rσ2

0.1

0 1 2 3

(0.001)

= Rτ2 0.5

=

Rσ2

0.1

0 1 2 3

(0.035) = Rτ

2 0.1

=

Rσ2

0.5

0.3

0.5

0.7

0.9(0.014)

= Rτ2 0.5

=

Rσ2

0.5

RE2 RE3 RE23 REmax




Interaction type

Analytical evaluation of group 2 contribution

Group 2: Previous biomarker study, lacking experimental therapy: n2observations consisting of Y,T ≡ 0,V,W.

Write variance of θ from group 1 + group 2 as

vθ(µT , n1, k,Ω)

≡ (1/n1)(I1 + kI2)−1θ= . . .

=

(1n1

)(σ2

γ>ΣWγ + τ 2

)︸︷︷︸

model parameters

(1 + k

µT(1 + k)− µ2T

)︸︷︷︸

design parameters

,

µT ≡ E[T] ∈ (0, 1) (group 1), k = n2/n1, and Ω denotes all otherparameters.

26

Analytical evaluation of group 2 contribution

Group 2: Previous biomarker study, lacking experimental therapy: n2observations consisting of Y,T ≡ 0,V,W.

Write variance of θ from group 1 + group 2 as

vθ(µT , n1, k,Ω)

≡ (1/n1)(I1 + kI2)−1θ= . . .

=

(1n1

)(σ2

γ>ΣWγ + τ 2

)︸︷︷︸

model parameters

(1 + k

µT(1 + k)− µ2T

)︸︷︷︸

design parameters

,

µT ≡ E[T] ∈ (0, 1) (group 1), k = n2/n1, and Ω denotes all otherparameters.

26

Maximum possible efficiency gain from group 2

limk→∞

vθ(µT , n1, k,Ω)

vθ(µT , n1, 0,Ω)= lim

k→∞

(1 + k)(µT − µ2T)

µT(1 + k)− µ2T

= 1− µT .

When µT = 1/2 (an equal-arm prospective trial), group 2 can reducethe variance of θ by no more than 1/2.

27

Approaching a single-arm trial

vθ(µT , n1, k,Ω) is minimized with respect to µT by usingµ

optT = 1/2 + k/2 = 1/2 + n2/(2n1). If n2 > n1, then prospective

trial should have single arm: the experimental therapy. Group 2provides all the controls.

Theoretical result – caveats at end of talk.

28

Approaching a single-arm trial

vθ(µT , n1, k,Ω) is minimized with respect to µT by usingµ

optT = 1/2 + k/2 = 1/2 + n2/(2n1). If n2 > n1, then prospective

trial should have single arm: the experimental therapy. Group 2provides all the controls.

Theoretical result – caveats at end of talk.

28

Response model – binary setting

I Response Y ∈ 0, 1I Prognostic Covariates W ∈ Rq

I Treatment T ∈ 0, 1I Biomarker V ∈ 0, 1

A logistic model for the response Y is

Pr(Y = 1|X) = expitX>β,


βW = βW1, βW2, . . . , βWq>.

29

Biomarker model – binary setting


Pr(V = 1|W) = expit[1,W>]γ

30

Fisher Information

Similar strategy as before: numerically determine Ik, invert to obtainlarge-sample variance of θ. When V is measured, we have

I(β,γ) =

(r2

YXX> rYrVX[1,W>]

· r2V [1,W>]>[1,W>]

),

where rY = (Y − expitX>β) and rV = (V − expit[1,W>]γ);Fisher information is now a function of both the response and theparameters.

When V is unmeasured, required to first marginalize over missing V .

31

Fisher Information

Similar strategy as before: numerically determine Ik, invert to obtainlarge-sample variance of θ. When V is measured, we have

I(β,γ) =

(r2

YXX> rYrVX[1,W>]

· r2V [1,W>]>[1,W>]

),

where rY = (Y − expitX>β) and rV = (V − expit[1,W>]γ);Fisher information is now a function of both the response and theparameters.

When V is unmeasured, required to first marginalize over missing V .

31

Numerical study

q = 5 prognostic covariates. For each group, 100,000 draws of

1. W ∼ N(0q, Iq),

2. T|W ∼ Bin(1/2) (group 1)or T|W ∼ Bin(expitα0 + α>W) (group 3),

3. V|W ∼ Bin(expit[1,W>]γ) (for groups 1 and 2),

4. Y|X ∼ Bin(expitX>β)and calculate the empirical average values of I1, I2, and I3.

Choose α0 and γ0 so that E[T] = 0 and E[V] = 0.2.

32



RE 0.3

0.5

0.7

0.9

= θ 0.5 =

β V

−0.

5

0 1 2 3

= θ 1.5

=

β V−

0.5

0 1 2 3

= θ 0.5

=

β V0

0.3

0.5

0.7

0.9

= θ 1.5

=

β V0

RE2 RE3 RE23 REmax

33

Interpretation, caveats

1. Initial efficiency gain observed from both types ofhistorical data.

2. Efficiency gains depend on models being “similarenough”∗, although not identical.

*“similar enough” means that βV and θ are equal betweengroups. However, may be able to relax this through aBayesian analysis, i.e. commensurate or power priors.

34





34





34


3. Not always realistic or desirable to prescribe a single-armtrial; interactions are not be-all/end-all

4. A multiplicative interaction parameter is neither sufficient(may not be clinically meaningful) nor necessary (rather amore complicated subgroup effect) for a predictivebiomarker

35


3. Not always realistic or desirable to prescribe a single-armtrial; interactions are not be-all/end-all

4. A multiplicative interaction parameter is neither sufficient(may not be clinically meaningful) nor necessary (rather amore complicated subgroup effect) for a predictivebiomarker

35


5. Large-sample investigation demonstrates a proof ofprinciple: more data can increase efficiency.

6. Next step is to investigate small-sample properties ofestimators.

7. Also worthwhile to extend to censored outcomes.

36





36





36

Acknowledgments

Collaborators: Jeremy M G Taylor, Bhramar Mukherjee

Funding: Eli Lilly and Company

37

J Chad Brenner, Bushra Ateeq, Yong Li, et al. Mechanistic rationale for inhibition ofpoly(ADP-ribose) polymerase in ETS gene fusion-positive prostate cancer. Cancer Cell, 19:664–678, 2011.

Robert L Cuffe. The inclusion of historical control data may reduce the power of a confirmatorystudy. Stat Med, 30:1329–1338, 2011.

William H DuMouchel and Jeffrey E Harris. Bayes methods for combining the results of cancerstudies in humans and other species. J Am Statist Assoc, 78:293–308, 1983.

Beat Neuenschwander, Gorana Capkun-Niggli, Michael Branson, et al. Summarizing historicalinformation on controls in clinical trials. Clin Trials, 7:5–18, 2010.

Peter F Thall and Richard Simon. Incorporating historical control data in planning phase IIclinical trials. Stat Med, 9:215–228, 1990.

Scott A Tomlins, Daniel R Rhodes, Sven Perner, et al. Recurrent fusion of TMPRSS2 and ETStranscription factor genes in prostate cancer. Science, 310:644–648, 2005.

38

Documents

Increasing efficiency for estimating treatment-biomarker ... · 9/6/2014 · I Overexpression of ETS transcription factors ERG and ETV1 is characterizes a fusion of the androgen-sensitive