Optimizing Group Sequential Designs that Allow Changes to the Population Sampled Based on Interim Data Michael Rosenblum Department of Biostatistics Johns Hopkins Bloomberg School of Public Health Joint Work with Mark J. van der Laan Division of Biostatistics University of California, Berkeley


Page 1: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Optimizing Group Sequential Designs that Allow Changes to the Population Sampled Based on Interim Data

Michael Rosenblum
Department of Biostatistics
Johns Hopkins Bloomberg School of Public Health

Joint Work with Mark J. van der Laan
Division of Biostatistics
University of California, Berkeley

Page 2: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

“Death is caused by swallowing small amounts of saliva over a long period of time.” --George Carlin

Page 3: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Outline of Talk

• Motivation: Why Change the Population Sampled in Group Sequential Designs?
• Simple Example
• Our Method
• Our Method Applied to Example
• Use Results to Compare Power of Adaptive vs. Non-Adaptive Designs

Page 4: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Adaptive Clinical Trial Designs

• FDA is Interested:

“A large effort has been under way at FDA during the past several years to encourage the development and use of new trial designs, including enrichment designs.”

Page 5: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Adaptive Clinical Trial Designs

• Pharmaceutical Companies are Interested:

“An adaptive clinical trial conducted by Merck saved the company $70.8 million compared with what a hypothetical traditionally designed study would have cost…”

Page 6: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Why Use Adaptive Designs?

Benefits:
– Can Give More Power to Confirm Effective Drugs and Determine Subpopulations who Benefit Most
– Can Reduce Cost, Duration, and Number of Subjects of Trials

Designs Must:
– Control Probability of False Positive Results

Page 7: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Group Sequential Randomized Trial Designs

• Participants Enrolled over Time
• At Interim Points, Can Change Sampling in Response to Accrued Data.

Page 8: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

FDA Draft Guidance on Adaptive Designs

Focus is AW&C (adequate and well-controlled) trials.

Distinguishes well understood vs. less well understood adaptations.

Explains chief concerns: Type I error, bias, interpretability.

Page 9: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

FDA Draft Guidance on Adaptive Designs

Well Understood Adaptations:

• Adapt Study Eligibility Criteria Using Only Pre-randomization Data.
• Adapt to Maintain Study Power Based on Blinded Interim Analyses of Aggregate Data (or Based on Data Unrelated to Outcome).
• Adaptations Not Dependent on Within-Study, Between-Group Outcome Differences

Page 10: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

FDA Draft Guidance on Adaptive Designs

Well Understood Adaptations:

• Group Sequential Methods (i.e. Early Stopping)

Page 11: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

FDA Draft Guidance on Adaptive Designs

Less Well Understood Adaptations:

• Adaptive Dose Selection
• Response-Adaptive Randomization
• Sample Size Adaptation Based on Interim Effect-Size Estimates
• Adaptation of Patient Population Based on Treatment-Effect Estimates
• Adaptive Endpoint Selection

Page 12: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

FDA Draft Guidance on Adaptive Designs

Adaptation of Patient Population Based on Treatment-Effect Estimates

“These designs are less well understood, pose challenges in avoiding introduction of bias, and generally call for statistical adjustment to avoid increasing the Type I error rate.”

Page 13: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Example

Population: Subjects with depression.

Research Questions: How does a new antidepressant compare to placebo in effect on change in Hamilton Rating Scale for Depression (HRSD) score after 6 weeks?

Does it differ depending on initial severity of depression?

Prior Data Suggest: There may be a treatment benefit only for those with severe initial depression.

Page 14: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Mean Drug-Placebo Difference as a Function of Initial Severity

(from meta-analysis of Kirsch et al. 2008)

Page 15: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Some Possible Fixed Designs

• Enroll from total population (both those with moderate initial depression and severe initial depression): Subpopulation 1 and Subpopulation 2

• Enroll only from those with severe initial depression: Subpopulation 2

Page 16: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Enrichment Design Recruitment Procedure

[Diagram: Stage 1, Decision, Stage 2]

Stage 1: Recruit Both Populations (Subpopulation 1 and Subpopulation 2)

Decision and Stage 2:
• If Treatment Effect Strong in Total Pop.: Recruit Both Pop. (Subpopulation 1 and Subpopulation 2)
• Else, if Treatment Effect Stronger in Subpop. 1: Recruit Only Subpop. 1
• Else, if Treatment Effect Stronger in Subpop. 2: Recruit Only Subpop. 2

Page 17: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Problem and Goals

• Problem: Analyze Group Sequential Designs that Allow Changes to Population Sampled at Interim Points, Based on Earlier Data (Including Outcomes Data)

• Goals:
  1. Make Inferences about Selected Populations
  2. Control Asymptotic, Worst-case, Familywise Type I Error Rate, Making No Model Assumptions

• Our Contribution: General Method for Reducing Problem to Optimization Problem that Can Be Easy to Solve with Standard Statistical Software

Page 18: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Some Related Work

• Adapt Treatments and/or Population Sampled

[Thall, Simon, Ellenberg 1988; Schaid, Wieand, Therneau 1990; Wittes and Brittain 1990; Follmann 1997; Bauer and Köhne 1994; Bauer and Kieser 1999; Liu, Proschan, Pledger 2002; Stallard and Todd 2003; Sampson and Sill 2005; Bischoff and Miller 2005; Freidlin and Simon 2005; Jennison and Turnbull 2003, 2006, 2007; Wang, Hung, O’Neill 2009]

Page 19: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Our Contribution

Theorem that Gives Reduction of Problem of Computing Asymptotic, Worst-Case, Familywise Error Rate to Optimization Problem that Often Can be Solved Numerically with Standard Statistical Software.

Reduces Problem to Computations of Probability that a Multivariate Normal is in a Set of Regions (that Depend on Design).

Page 20: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Scope of Method

Can be applied to studies with:
• Any number of pre-planned treatment arms
• Any number of pre-planned stages
• Sequential testing
• Adaptation of Randomization Probabilities

Page 21: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Asymptotic, Worst-case familywise Type I Error

• We get sharp bounds on the asymptotic, worst-case, Type I error.

• For sample size n and P' the set of possible data generating distributions, this is defined as:

$$\limsup_{n \to \infty} \, \sup_{P \in P'} P(\text{Reject at least one true null})$$

Page 22: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Example from Earlier

• 2-Stage Design, 2 Subpopulations of Interest
• Three Null Hypotheses:
  H00: Mean Effect in Total (Mixture) Population is ≤ 0.
  H01: Mean Effect in Subpopulation 1 (moderate) is ≤ 0.
  H02: Mean Effect in Subpopulation 2 (severe) is ≤ 0.
• Stage 1: Recruit Equal Number of Subjects from Each Subpopulation. Get t-statistics: T10, T11, T12.
• Stage 2: if T11 > T12 or T11 > 0.3, recruit as in Stage 1. Else, recruit from the severely depressed only.
• Test Procedure: if the final test statistic exceeds 1.65 [formula for the statistic not recoverable from the extracted slide text], then reject null for the population enrolled in stage 2.
• Our Method Shows Asympt. Worst Case FWER ≤ 0.05.
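The two-stage procedure on this slide can be sketched in a few lines of Python. The enrollment rule (T11 > T12 or T11 > 0.3) and the threshold 1.65 come from the slide. The form of the final statistic is an assumption made here for illustration: an equal-weight combination of the stage 1 and stage 2 t-statistics for the enrolled population. The function names and the `stage2_stat` callable are hypothetical.

```python
import math

def stage2_population(t11: float, t12: float, cutoff: float = 0.3) -> str:
    """Enrollment rule from the slide: keep recruiting both subpopulations
    if T11 > T12 or T11 > cutoff; otherwise recruit only subpopulation 2
    (severe initial depression)."""
    if t11 > t12 or t11 > cutoff:
        return "total"
    return "subpop2"

def reject_null(t_stage1: float, t_stage2: float, threshold: float = 1.65) -> bool:
    """ASSUMPTION: the final statistic is (stage 1 + stage 2) / sqrt(2).
    The slide's own formula is not recoverable, so this is illustrative."""
    return (t_stage1 + t_stage2) / math.sqrt(2.0) > threshold

def run_trial(t10: float, t11: float, t12: float, stage2_stat):
    """Return (population enrolled in stage 2, whether its null is rejected).

    `stage2_stat` is a hypothetical callable that returns the stage 2
    t-statistic for the population chosen at the interim decision."""
    population = stage2_population(t11, t12)
    t_stage1 = t10 if population == "total" else t12
    return population, reject_null(t_stage1, stage2_stat(population))
```

For instance, `run_trial(2.0, 0.1, 2.0, lambda pop: 1.0)` selects subpopulation 2 (since T11 is neither above T12 nor above 0.3) and then tests H02 only.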

Page 23: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

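Computation of Worst-Case Asymptotic Familywise Error Rate

Consider Case of μ1 > 0 (H01 false), and μ2 = 0 (H02 true).

Type I Error only when Subpopulation 2 Selected for Recruitment in Stage 2, AND Final Statistic exceeds 1.65.

By our Theorem, we have uniformly* in μ1, σ1, σ2: the difference between P(Reject H02) and a multivariate normal probability converges to 0 as n → ∞ [displayed expression, which involves the threshold 1.65 and n, μ1, σ1, not recoverable from the extracted slide text].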

Page 24: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Computation of Worst-Case Asymptotic Familywise Error Rate

Worst-Case Familywise Error Rate for case: μ1 > 0 (H01 false), and μ2 = 0 (H02 true):

$$\limsup_{n \to \infty} \, \sup_{\mu_1, \sigma_1} P(\text{Reject } H_{02}) = 0.0302$$

[Intermediate expressions in the displayed chain of equalities not recoverable from the extracted slide text.]

Thus, in the Case: μ1 > 0 (H01 false), and μ2 = 0 (H02 true), we have Asymptotic, Worst-Case, FWER at most 0.0302.
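A numerical check of the worst-case value for this case can be sketched as follows. The limit model is an assumption made here, not taken from the slides: the stage 1 statistics behave like independent normals Z11 ~ N(drift, 1) and Z12 ~ N(0, 1), the stage 2 statistic for subpopulation 2 is an independent Z22 ~ N(0, 1), and H02 is rejected when (Z12 + Z22)/√2 > 1.65. The parameter `drift` stands in for the standardized effect in subpopulation 1 and is a hypothetical name.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

def type1_error_limit(drift, threshold=1.65, cutoff=0.3,
                      lo=-8.0, hi=8.0, steps=2000):
    """Limiting P(subpopulation 2 selected AND H02 rejected) when mu2 = 0.

    Assumed limit model (illustrative, see lead-in):
      select subpop 2  iff  Z11 <= Z12 and Z11 <= cutoff
      reject H02       iff  (Z12 + Z22) / sqrt(2) > threshold
    The probability is integrated over the value x of Z12 (trapezoid rule).
    """
    c = threshold * math.sqrt(2.0)
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        p_select = STD.cdf(min(x, cutoff) - drift)  # P(Z11 <= min(x, cutoff))
        p_reject = 1.0 - STD.cdf(c - x)             # P(Z22 > c - x)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * STD.pdf(x) * p_select * p_reject
    return total * h

# Grid over the drift; drift = 0 is the boundary approached as mu1 -> 0+.
drifts = [0.1 * k for k in range(51)]
worst = max(type1_error_limit(d) for d in drifts)
print(worst)
```

Under these assumptions the error probability decreases as the drift grows, so the maximum sits at the boundary drift = 0, and the value obtained is close to the 0.0302 reported on the slide.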

Page 25: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Computation of Worst-Case Asymptotic Familywise Error Rate

More generally, for more complex designs, need to solve optimization problems of the form:

$$\sup_{(\lambda_1, \ldots, \lambda_J) \in R} P_{\lambda_1, \ldots, \lambda_J}(X \in S),$$

where X is a multivariate normal random variable with mean vector and covariance matrix that are functions of λ1,...,λJ, for R and S fixed regions.

Solve by grid search combined with bound on approximation error of grid search.
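The idea of pairing a grid search with a bound on its approximation error can be illustrated with a one-parameter toy problem. Everything specific below is an assumption chosen for illustration, not the authors' construction: X = (X1, X2) with X1 ~ N(λ, 1) and X2 ~ N(0, 1) independent, S = {x1 ≤ 0.3, x2 > 1.65}, and R = [0, 5]. The error bound uses the fact that shifting the mean of a unit-variance normal by δ changes the probability of any event by at most 2Φ(δ/2) − 1 ≤ δ/√(2π).

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

def grid_sup_with_bound(prob, lo, hi, steps, lipschitz):
    """Maximise prob(lam) over an evenly spaced grid on [lo, hi].

    Returns (grid maximum, certified upper bound on the true supremum).
    Every point of [lo, hi] lies within h/2 of a grid point, so the true
    supremum exceeds the grid maximum by at most lipschitz * h / 2.
    """
    h = (hi - lo) / steps
    grid_max = max(prob(lo + i * h) for i in range(steps + 1))
    return grid_max, grid_max + lipschitz * h / 2.0

def toy_prob(lam):
    """P_lam(X in S) for the toy region S = {x1 <= 0.3, x2 > 1.65}."""
    return STD.cdf(0.3 - lam) * (1.0 - STD.cdf(1.65))

# Lipschitz constant of lam -> P_lam(X in S), valid for any event S here,
# because only the mean of the unit-variance coordinate X1 depends on lam.
LIPSCHITZ = 1.0 / math.sqrt(2.0 * math.pi)

grid_max, upper = grid_sup_with_bound(toy_prob, 0.0, 5.0, 500, LIPSCHITZ)
print(grid_max, upper)
```

Refining the grid shrinks the gap between `grid_max` and `upper`, which is what makes the grid search result usable as a guaranteed bound rather than only an estimate.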

Page 26: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Power Comparison

Consider 3 scenarios:
a) New antidepressant works only for those with severe initial depression, and mean benefit modest (1.8 HRSD points)
b) Same as (a) but benefit strong (3 pts.)
c) New antidepressant works equally well for those with moderate and with severe initial depression, and benefit modest.

For each, consider:
1. first stage enrolls equally from each subpop.
2. 75% of first stage enrolled from those with moderate depression.

Page 27: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Power Comparison

Page 28: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Power Comparison

Page 29: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

When Our Method Can Be Applied

Our Main Theorem:

• Gives Reduction of Problem of Computing Worst-Case, Asymptotic Familywise Error Rate to Optimization Problem that Often Can be Solved with Standard Statistical Software

• Relies on:

1. Prespecification of Adaptation Algorithm

2. Centered Statistics from Each Stage Asymptotically Normal, and Independent of Past Stage Data Given Previous Design Decision.

3. Uniform Convergence in (2).

4. Selection and Rejection Regions Simple to Compute.

Page 30: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Assumptions

To Get Uniform Asymptotic Normality of Test Statistics, Conditioned on Prior Design Selection, Need to Assume:

• Variances Bounded Away from Zero
• Sample Size at Each Stage Goes to Infinity

To Efficiently Compute Asymptotic FWER:

• Decision Regions for Between-Stage Adaptations and Rejection Regions Easy to Specify and Compute.

• Asymptotic Distribution of Test Statistics Depends on Finite Dimensional Parameter of Data Generating Distribution (e.g. Moments)

Page 31: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Limitations of Adaptation

• In General No Guarantee of Power Gain; May Lead to Loss of Efficiency

• Enrichment Designs Result in Reduced Generalizability of Results

• May Encourage Poorly Planned Designs (with Hope Adaptation will Fix Everything)

Page 32: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Open Problems

Generalizing Our Method to Designs with:
• Adaptation of Monitoring Frequency
• Adaptation of List of Covariates Collected
• Survival Outcomes

Confidence Intervals

Page 33: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Conclusions

General Method for Reducing Problem of Computing Worst-case, Asymptotic, Familywise Error Rate to Optimization Problem that Often Can Be Easy to Solve with Standard Statistical Software.

Page 34: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

References

Rosenblum, M. and van der Laan, M.J. Optimizing Randomized Trial Designs to Distinguish which Subpopulations Benefit from Treatment (under review)

http://people.csail.mit.edu/mrosenblum/adaptive_subpop.pdf

Kirsch, I., Deacon, B., Huedo-Medina, T., Scoboria, A., Moore, T., et al. (2008). Initial severity and antidepressant benefits: A meta-analysis of data submitted to the Food and Drug Administration. PLoS Med 5, e45. doi:10.1371/journal.pmed.0050045.

FDA (2010). Draft Guidance for Industry. Adaptive Design Clinical Trials for Drugs and Biologics. http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/UCM201790.pdf

Page 35: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Comparison to Alternative Analysis Method

Method based on [Bauer and Köhne 1994]: Use Closed Test Principle [Marcus 1976] to Deal with Multiple Hypotheses, and Prespecified Function to Combine Stagewise p-values.

• Extremely Useful, Flexible Method
• Facilitates Proving Bounds on FWER
• Restricted to Cases where p-values in Each Stage Dominate U[0,1] under Intersection Null Hypotheses.
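As an illustration of what a prespecified function for combining stagewise p-values can look like, the sketch below uses Fisher's product criterion for two stages. The slide does not say which combination function is meant, so this choice is an assumption, and the function names are hypothetical. The closed test step for multiple hypotheses is not shown.

```python
import math

def fisher_combined_p(p1: float, p2: float) -> float:
    """Combined p-value for two independent stagewise p-values.

    If both p-values are uniform on [0, 1] under the null, then
    -2 * ln(p1 * p2) is chi-square with 4 degrees of freedom, whose upper
    tail is exp(-x / 2) * (1 + x / 2). With q = p1 * p2 this gives
    q * (1 - ln q).
    """
    if not (0.0 < p1 <= 1.0 and 0.0 < p2 <= 1.0):
        raise ValueError("p-values must lie in (0, 1]")
    q = p1 * p2
    return q * (1.0 - math.log(q))

def reject_combined(p1: float, p2: float, alpha: float = 0.05) -> bool:
    """Reject when the combined p-value is at most alpha."""
    return fisher_combined_p(p1, p2) <= alpha
```

When the stagewise p-values are stochastically larger than uniform under the null, which is the domination condition in the last bullet above, this rule is conservative.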

Page 36: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

Group Sequential Designs We Consider

• K stages (though may stop early)
• Population Sampled at Stage m is Prespecified Function of Data Collected in Past Stages.
• Allow Early Stopping for Efficacy, Safety, or Futility
• Prespecify: Primary Outcome, Null Hypotheses, Test Statistics, Rejection Regions

Page 37: Michael Rosenblum Department of Biostatistics  Johns Hopkins Bloomberg School of Public Health

When Our Method Can Be Applied

Our Main Theorem:

• Gives Reduction of Problem of Computing Worst-Case, Asymptotic Familywise Error Rate to Optimization Problem that Often Can be Solved with Standard Statistical Software

• Relies on:

1. Prespecification of Adaptation Algorithm

2. For Each Possible Design Selection Before a Stage, Resulting Centered Test-Statistics from that Stage Asymptotically Normal (e.g. t-test or log-rank test)

3. Uniform Convergence in (2).

4. Selection and Rejection Regions Simple to Compute.