Towards a Complete Solution for Cost/Effectiveness in Oncology: Handling Heterogeneity, Variability and Censoring Gerhardt Pohl Eli Lilly and Company

Page 1: Gerhardt Pohl Eli Lilly and Company

Towards a Complete Solution for Cost/Effectiveness in Oncology: Handling Heterogeneity, Variability and Censoring

Gerhardt Pohl
Eli Lilly and Company

Page 2: Gerhardt Pohl Eli Lilly and Company

Objective

We view a “complete” solution to the problem of calculating the Incremental Cost Effectiveness Ratio (ICER) in oncology as one simultaneously addressing the key issues of (1) Heterogeneity, (2) Variability and (3) Censoring.

The talk will discuss a stratified, bootstrap approach and draw links to the concept of “local propensity.”

Page 3: Gerhardt Pohl Eli Lilly and Company

What is an ICER?

• Incremental Cost Effectiveness Ratio

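• Ratio of the difference in mean cost to the difference in mean effectiveness:

$$\mathrm{ICER} = \frac{\overline{\mathrm{Cost}}_T - \overline{\mathrm{Cost}}_C}{\overline{\mathrm{Eff}}_T - \overline{\mathrm{Eff}}_C}$$

where T denotes the new treatment and C the control.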
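As a minimal sketch of this definition, the ratio can be computed directly from patient-level costs and effectiveness values. The function name `icer` and all numbers below are hypothetical and serve only as illustration.

```python
from statistics import mean


def icer(cost_t, eff_t, cost_c, eff_c):
    """Difference in mean cost divided by difference in mean effectiveness (T minus C)."""
    d_cost = mean(cost_t) - mean(cost_c)
    d_eff = mean(eff_t) - mean(eff_c)
    if d_eff == 0:
        raise ZeroDivisionError("difference in mean effectiveness is zero; ICER undefined")
    return d_cost / d_eff


# Hypothetical patient-level data: costs in $, effectiveness in QALYs.
cost_t = [52000, 48000, 50000]   # new treatment, mean 50,000
eff_t = [1.5, 2.0, 2.5]          # mean 2.0
cost_c = [30000, 28000, 32000]   # control, mean 30,000
eff_c = [1.0, 1.5, 2.0]          # mean 1.5

example_icer = icer(cost_t, eff_t, cost_c, eff_c)
print(example_icer)  # 20,000 extra dollars per 0.5 extra QALY -> 40000.0
```

The explicit check on the denominator reflects the practical difficulty with ratio statistics: when the effectiveness difference is near zero, the ICER becomes unstable.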

Page 4: Gerhardt Pohl Eli Lilly and Company

Impact

• ICER is the primary tool used in cost-effectiveness comparisons by HTA (Health Technology Assessment) bodies around the world.

• ICERs allow comparison of therapies both within and across disease states, supporting appropriate choices in national health expenditures.

Page 5: Gerhardt Pohl Eli Lilly and Company

ICER Graphically

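[Figure: the ICE (incremental cost-effectiveness) plane, with axes QALY (Quality Adjusted Life Years) and $. The four quadrants are labeled: More Expensive and Less Effective; More Expense for More Effectiveness; Cheaper but Less Effective; Cheaper and Better.]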

Page 6: Gerhardt Pohl Eli Lilly and Company

NICE Thresholds

National Institute for Health and Clinical Excellence

[Figure: threshold lines at £30,000/QALY and £50,000/QALY (End of Life Status), separating the “Approvable” region from the “Not Approvable” region.]

“End of Life Status” is granted only for treatments that are life-extending (>3 months) for small patient populations (<7,000 patients) with short life expectancy (<24 months).

Page 7: Gerhardt Pohl Eli Lilly and Company

Analysis Goal

Create a bootstrapped display of variability in the ICE plane where each iteration is based on risk-adjusted and censored estimates of the cost and survival.

Frick KD, et al. “Modeled cost-effectiveness of the experience corps Baltimore based on a pilot randomized trial.” Journal of Urban Health 2004; 81:106-117.

Page 8: Gerhardt Pohl Eli Lilly and Company

Addressing Heterogeneity

Page 9: Gerhardt Pohl Eli Lilly and Company

Propensity Score Pictorially

[Figure: patients plotted by Age and Comorbidities and divided into Propensity Score Strata, ranging from “More likely to receive red treatment” to “More likely to receive blue treatment”.]

Page 10: Gerhardt Pohl Eli Lilly and Company

A Downside of Propensity Scoring

• Patients with identical propensity scores may have very different covariate levels.

[Figure: Age versus Comorbidities, contrasting “Young and Sick” patients with “Old But Healthy” patients.]

Page 11: Gerhardt Pohl Eli Lilly and Company

Blocking

• Grid the factor space into blocks (unordered strata) of similar patients.

• This may be thought of as a many-to-many matching directly in the covariate space.

[Figure: the Age by Comorbidities plane divided into a 3 × 3 grid of blocks, Stratum 1 through Stratum 9.]

Page 12: Gerhardt Pohl Eli Lilly and Company

General Approach

• Whatever the original dimension of the covariate space, this reduces the problem to a cross-classification of treatments versus strata.

[Figure: 3 × 3 grid of blocks, Stratum 1 through Stratum 9.]

             Stratum 1   Stratum 2   …   Total
Treatment    nT1         nT2             NT
Control      nC1         nC2             NC
Total        N1          N2              N

Page 13: Gerhardt Pohl Eli Lilly and Company

Calculate Within-Stratum Treatment Differences

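             Stratum 1   Stratum 2   Etc.   Total
Treatment    nT1         nT2                NT
Control      nC1         nC2                NC
Total        N1          N2                 N

Cost, for stratum j: $\Delta\mathrm{Cost}_j = \overline{\mathrm{Cost}}_{Tj} - \overline{\mathrm{Cost}}_{Cj}$

Effectiveness, for stratum j: $\Delta\mathrm{Eff}_j = \overline{\mathrm{Eff}}_{Tj} - \overline{\mathrm{Eff}}_{Cj}$

How to pool across strata?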

Page 14: Gerhardt Pohl Eli Lilly and Company

Stratum Weighting

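For the overall mean difference, pool relative to the size of the strata:

$$\Delta\mathrm{Cost} = \sum_j \frac{N_j}{N}\,\Delta\mathrm{Cost}_j, \qquad \Delta\mathrm{Eff} = \sum_j \frac{N_j}{N}\,\Delta\mathrm{Eff}_j$$

where $\Delta\mathrm{Cost}_j$ and $\Delta\mathrm{Eff}_j$ are the within-stratum treatment-minus-control differences in mean cost and mean effectiveness.

Definition of Stratified ICER:

$$\mathrm{ICER}_{\mathrm{strat}} = \frac{\sum_j N_j\,\Delta\mathrm{Cost}_j}{\sum_j N_j\,\Delta\mathrm{Eff}_j}$$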
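A minimal sketch of the stratum-size pooling, assuming each patient record carries a stratum label, a treatment arm, a cost and an effectiveness value. The function name `stratified_icer`, the record layout and the data are hypothetical. Dropping strata that lack one of the two arms is an assumption of this sketch, not something the slides specify.

```python
from collections import defaultdict
from statistics import mean


def stratified_icer(records):
    """records: iterable of (stratum, arm, cost, effectiveness), arm in {"T", "C"}.

    Returns (pooled cost difference, pooled effectiveness difference, stratified ICER),
    pooling within-stratum T-minus-C differences with weights N_j / N.
    """
    cells = defaultdict(lambda: {"T": [], "C": []})
    for stratum, arm, cost, eff in records:
        cells[stratum][arm].append((cost, eff))
    # Assumption: a stratum with only one arm has no within-stratum difference and is skipped.
    usable = [c for c in cells.values() if c["T"] and c["C"]]
    n_total = sum(len(c["T"]) + len(c["C"]) for c in usable)
    d_cost = d_eff = 0.0
    for c in usable:
        weight = (len(c["T"]) + len(c["C"])) / n_total
        d_cost += weight * (mean(p[0] for p in c["T"]) - mean(p[0] for p in c["C"]))
        d_eff += weight * (mean(p[1] for p in c["T"]) - mean(p[1] for p in c["C"]))
    return d_cost, d_eff, d_cost / d_eff


# Hypothetical data: two strata of four patients each.
records = [
    (1, "T", 40000, 2.0), (1, "T", 50000, 2.0), (1, "T", 60000, 2.0), (1, "C", 30000, 1.0),
    (2, "T", 90000, 3.0), (2, "C", 70000, 2.5), (2, "C", 80000, 2.5), (2, "C", 90000, 2.5),
]
d_cost, d_eff, ratio = stratified_icer(records)
print(d_cost, d_eff, ratio)  # 15000.0 0.75 20000.0
```

In this made-up data set the unstratified comparison would show the new treatment as cheaper on average, because it is concentrated in the low-cost stratum; the stratified estimate removes that imbalance.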

Page 15: Gerhardt Pohl Eli Lilly and Company

Pros and Cons

Blocking

• Non-parametric

• Provides better matches of underlying covariates

• Able to capture complex interactions of covariates and likelihood of treatment

Propensity Score Matching

• Reduces complexity of covariate space down to one dimension

• Can maintain structure/ordering of covariate levels

• Can borrow information across blocks

• Potentially uses full richness of continuous data

Page 16: Gerhardt Pohl Eli Lilly and Company

Addressing Variability

Page 17: Gerhardt Pohl Eli Lilly and Company

Variability in the ICER Estimate

• Many methods have been proposed to incorporate variability into ICER-based inference:

– Univariate Sensitivity Analyses (Tornado Diagrams)
– Confidence Intervals (Fieller’s Theorem, 2-dimensional boxes, ellipses, wedges, bootstrapping…)
– Simulation
– Cost-Effectiveness Acceptability Curves
– Net Monetary Benefit
– And various combinations thereof…

• The fundamental technical issue is that the ratio of two normal variables is not normal, nor very tractable. For example, what if the confidence interval for the denominator includes zero?

• Evidence in the literature suggests that bootstrapping provides robust and consistent inference. Bootstrapping is also relatively assumption-free and easily explained heuristically.

Page 18: Gerhardt Pohl Eli Lilly and Company

Bootstrapping the Stratified ICER

• Pre-specify strata (factors and cutoff levels).

• Sample individual patient cost/effectiveness pairs.

• Draw samples with replacement proportional to the size of the treatment groups. Ignore stratification in drawing the samples.

• Use the fixed, pre-specified boundaries to divide each sample into strata.

• Calculate the stratified difference in cost, the stratified difference in effectiveness and the stratified ICER for each sample.
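The steps above can be sketched as follows. This is an illustration under stated assumptions, not the speaker's implementation: a single hypothetical age cutoff stands in for the pre-specified strata, the patient data are simulated, and "proportional to the size of the treatment groups" is read as resampling each treatment group at its original size. All function names are hypothetical.

```python
import math
import random
from statistics import mean

AGE_CUTOFFS = (65,)  # hypothetical pre-specified boundary, fixed before any resampling


def stratum_of(age):
    """Index of the pre-specified stratum containing this age."""
    return sum(age >= cut for cut in AGE_CUTOFFS)


def stratified_stats(sample):
    """sample: list of (arm, age, cost, effectiveness). Pools T-minus-C differences by N_j / N."""
    cells = {}
    for arm, age, cost, eff in sample:
        cells.setdefault(stratum_of(age), {"T": [], "C": []})[arm].append((cost, eff))
    usable = [c for c in cells.values() if c["T"] and c["C"]]  # assumption: skip one-arm strata
    if not usable:
        return None
    n_total = sum(len(c["T"]) + len(c["C"]) for c in usable)
    d_cost = d_eff = 0.0
    for c in usable:
        weight = (len(c["T"]) + len(c["C"])) / n_total
        d_cost += weight * (mean(p[0] for p in c["T"]) - mean(p[0] for p in c["C"]))
        d_eff += weight * (mean(p[1] for p in c["T"]) - mean(p[1] for p in c["C"]))
    icer = d_cost / d_eff if d_eff != 0 else math.nan
    return d_cost, d_eff, icer


def bootstrap_stratified_icer(patients, n_boot, seed=0):
    rng = random.Random(seed)
    treated = [p for p in patients if p[0] == "T"]
    controls = [p for p in patients if p[0] == "C"]
    draws = []
    for _ in range(n_boot):
        # Resample patients with replacement within each treatment group, ignoring strata.
        sample = rng.choices(treated, k=len(treated)) + rng.choices(controls, k=len(controls))
        stats = stratified_stats(sample)  # strata are re-formed from the fixed cutoffs
        if stats is not None:
            draws.append(stats)
    return draws


def simulate_patients(n, seed=1):
    """Hypothetical data in which younger patients are more likely to get the new treatment."""
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        age = rng.uniform(40, 85)
        arm = "T" if rng.random() < (0.7 if age < 65 else 0.3) else "C"
        cost = max(0.0, rng.gauss(60000 if arm == "T" else 40000, 8000))
        eff = max(0.1, rng.gauss(2.0 if arm == "T" else 1.6, 0.5))
        patients.append((arm, age, cost, eff))
    return patients


patients = simulate_patients(400)
draws = bootstrap_stratified_icer(patients, n_boot=200)
```

Each element of `draws` is one bootstrap replicate of (stratified cost difference, stratified effectiveness difference, stratified ICER), which is the cloud of points one would plot in the ICE plane.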

Page 19: Gerhardt Pohl Eli Lilly and Company

Interpreting the Bootstrap Samples

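[Figure, two panels, each on the ICE plane with axes QALY and $. “Windshield Wiper Regions”: the ICER line with its lower and upper 95% confidence limits. “Proportion in Each Quadrant of ICE Plane”: the bootstrap samples fall in the four quadrants in proportions 50.5%, 48.0%, 0.5% and 1.0%.]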
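The quadrant tabulation can be sketched in a few lines, assuming the bootstrap output is a list of (effectiveness difference, cost difference) pairs. The quadrant names follow the earlier ICE plane slide; the function names, the handling of points lying exactly on an axis, and the four sample points are hypothetical.

```python
from collections import Counter

QUADRANTS = (
    "More Expense for More Effectiveness",
    "Cheaper and Better",
    "More Expensive and Less Effective",
    "Cheaper but Less Effective",
)


def quadrant(d_eff, d_cost):
    """ICE-plane quadrant of one bootstrap replicate (zero counted as non-negative: an assumption)."""
    if d_eff >= 0:
        return QUADRANTS[0] if d_cost >= 0 else QUADRANTS[1]
    return QUADRANTS[2] if d_cost >= 0 else QUADRANTS[3]


def quadrant_proportions(draws):
    """draws: list of (d_eff, d_cost) pairs. Returns the share of replicates in each quadrant."""
    counts = Counter(quadrant(d_eff, d_cost) for d_eff, d_cost in draws)
    return {name: counts[name] / len(draws) for name in QUADRANTS}


# Four hypothetical replicates.
props = quadrant_proportions([(0.5, 1000.0), (0.4, 2000.0), (0.3, -500.0), (-0.1, 700.0)])
for name, share in props.items():
    print(f"{name}: {share:.1%}")
```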

Page 20: Gerhardt Pohl Eli Lilly and Company

Effects of Bootstrapping against Fixed Strata Boundaries

• The fraction of patients treated with one or the other treatment within a stratum changes from sample to sample.

• This captures (some of) the component of variability due to estimating the propensity.

• Fixed boundaries avoid the technical problems that violate the assumptions needed for the bootstrap to converge. (Abadie and Imbens. “On the Failure of the Bootstrap for Matching Estimators”. Econometrica, Vol. 76, Issue 6, pp. 1537-1557, Nov. 2008.)

Page 21: Gerhardt Pohl Eli Lilly and Company

Stratification Redux

• Whatever the original dimension of the covariate space, reduce the problem to a cross-classification of treatments versus strata.

[Figure: 3 × 3 grid of blocks, Stratum 1 through Stratum 9.]

             Stratum 1   Stratum 2   …   Total
Treatment    nT1         nT2             NT
Control      nC1         nC2             NC
Total        N1          N2              N

Page 22: Gerhardt Pohl Eli Lilly and Company

Local Propensity Score

• Define the “Local Propensity Score” as the fraction of patients treated with treatment A within each stratum.

• Blocking can be thought of as fitting a step function for the estimated propensity.

                   Stratum 1   Stratum 2   …   Total
Treatment          nT1         nT2             NT
Control            nC1         nC2             NC
Total              N1          N2              N
Propensity pj =    nT1/N1      nT2/N2

Page 23: Gerhardt Pohl Eli Lilly and Company

Inverse Propensity Weighting

Page 24: Gerhardt Pohl Eli Lilly and Company

A Curious Equivalence

Page 25: Gerhardt Pohl Eli Lilly and Company

Some Algebra

Sum by blocks

Within-block Average

Sum of Weights is Total Sample Size
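One derivation consistent with the three annotations above is sketched here. It assumes a normalized (ratio-form) inverse-propensity-weighted mean with the local propensity $p_j = n_{Tj}/N_j$ as the weight; the notation $Y_i$, $j(i)$ and $\hat{\mu}_T^{\mathrm{IPW}}$ is introduced for illustration and is not necessarily the speaker's own.

```latex
% Assumed notation: Y_i is the outcome (cost or effectiveness) of patient i,
% j(i) is the block containing patient i, and p_j = n_{Tj}/N_j is the local propensity.
\begin{align*}
\hat{\mu}_T^{\mathrm{IPW}}
  &= \frac{\sum_{i \in T} Y_i / p_{j(i)}}{\sum_{i \in T} 1 / p_{j(i)}} \\
  &= \frac{\sum_j \frac{N_j}{n_{Tj}} \sum_{i \in T,\ j(i) = j} Y_i}
          {\sum_j \frac{N_j}{n_{Tj}}\, n_{Tj}}
     && \text{(sum by blocks)} \\
  &= \frac{\sum_j N_j\, \bar{Y}_{Tj}}{\sum_j N_j}
     && \text{(within-block average)} \\
  &= \sum_j \frac{N_j}{N}\, \bar{Y}_{Tj}
     && \text{(sum of weights is total sample size)}
\end{align*}
```

The same steps applied to the control group with weight $1/(1 - p_j) = N_j/n_{Cj}$ give $\sum_j (N_j/N)\,\bar{Y}_{Cj}$, so under these assumptions the IPW difference equals the stratum-size-weighted sum of within-stratum differences.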

Page 26: Gerhardt Pohl Eli Lilly and Company

Consequences of Equivalence

• (For a class of statistics…)

• Consider two individuals from different blocks but with the same propensity score. They enter the IPW statistic identically, with the same weight.

• Therefore, even if matched via propensity score, the summary statistic remains the same.

• Stated another way, matching within strata and then calculating local propensity is equivalent to deriving local propensity and then matching via that propensity.
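The equivalence can be checked numerically on a toy example. This sketch assumes a normalized IPW difference in means with the local propensity as the weight; the block data and function names are hypothetical, and exact fractions are used so the comparison is not blurred by rounding.

```python
from fractions import Fraction

# Hypothetical outcomes per block and treatment arm.
blocks = {
    1: {"T": [4, 6, 8], "C": [3]},
    2: {"T": [12], "C": [5, 7, 9]},
}


def local_propensity(block):
    """Fraction of the block's patients who received the treatment."""
    n_t, n_c = len(block["T"]), len(block["C"])
    return Fraction(n_t, n_t + n_c)


def ipw_difference(blocks):
    """Normalized IPW difference in means, weighting by 1/p (treated) and 1/(1-p) (controls)."""
    num_t = den_t = num_c = den_c = Fraction(0)
    for block in blocks.values():
        p = local_propensity(block)
        for y in block["T"]:
            num_t += Fraction(y) / p
            den_t += 1 / p
        for y in block["C"]:
            num_c += Fraction(y) / (1 - p)
            den_c += 1 / (1 - p)
    return num_t / den_t - num_c / den_c


def stratified_difference(blocks):
    """Within-block T-minus-C mean differences pooled with weights N_j / N."""
    n_total = sum(len(b["T"]) + len(b["C"]) for b in blocks.values())
    total = Fraction(0)
    for b in blocks.values():
        n_j = len(b["T"]) + len(b["C"])
        diff = Fraction(sum(b["T"]), len(b["T"])) - Fraction(sum(b["C"]), len(b["C"]))
        total += Fraction(n_j, n_total) * diff
    return total


print(ipw_difference(blocks), stratified_difference(blocks))  # 4 4
```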

Page 27: Gerhardt Pohl Eli Lilly and Company

The Downside of PS Does Not Apply to Stratified Local Propensity Scoring

• Underlying differences in covariates are irrelevant to the summary statistic.

[Figure: Age versus Comorbidities, contrasting “Young and Sick” patients with “Old But Healthy” patients.]

Page 28: Gerhardt Pohl Eli Lilly and Company

Three Possible Weightings

             Stratum 1   Stratum 2   Etc.   Total
Treatment    nT1         nT2                NT
Control      nC1         nC2                NC
Total        N1          N2                 N

Weighting Within-Stratum Differences

• Marginal wrt Treatment: $w_j = n_{Tj}/N_T$

• Marginal wrt Control: $w_j = n_{Cj}/N_C$

• Marginal wrt Population: $w_j = N_j/N$

Page 29: Gerhardt Pohl Eli Lilly and Company

Three Possible Weightings

             Stratum 1   Stratum 2   Etc.   Total
Treatment    nT1         nT2                NT
Control      nC1         nC2                NC
Total        N1          N2                 N

Weighting Within-Stratum Differences           IPW/Causal Inference
Marginal wrt Treatment ($n_{Tj}/N_T$):         Average Treatment Effect among the Treated (ATT)
Marginal wrt Control ($n_{Cj}/N_C$):           Average Treatment Effect among Controls (ATT for Control Group)
Marginal wrt Population ($N_j/N$):             Average Treatment Effect (ATE)

Page 30: Gerhardt Pohl Eli Lilly and Company

Relationship among Weightings

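$$\underbrace{\frac{N_j}{N}}_{\text{ATE}} = \frac{N_T}{N}\,\underbrace{\frac{n_{Tj}}{N_T}}_{\text{ATT}} + \frac{N_C}{N}\,\underbrace{\frac{n_{Cj}}{N_C}}_{\text{ATT for Control Group}}$$

• ATE: Average Treatment Effect

• ATT: Average Treatment Effect among the Treated

• ATT for Control Group: Average Treatment Effect among Controls

I.e., the ATE weighting is a convex combination of the ATT weightings for Treatment and Control, proportional to the size of the groups.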

Page 31: Gerhardt Pohl Eli Lilly and Company

Implications

ATT and ATE weightings are the same if and only if there is uniform propensity to treat with regard to strata.

Therefore, differences under ATT and ATE weightings inform on the impact of directed prescribing (a.k.a. propensity).
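A short numerical check of these two statements, using hypothetical cell counts. The function name `weightings` and the two example tables are illustrative assumptions; the weights themselves are the stratum shares of the treated group, the control group and the whole population.

```python
from fractions import Fraction


def weightings(counts):
    """counts: {stratum: (n_T, n_C)}. Returns {stratum: (ATT, ATT-for-controls, ATE) weights}."""
    n_t = sum(t for t, _ in counts.values())
    n_c = sum(c for _, c in counts.values())
    n = n_t + n_c
    return {
        s: (Fraction(t, n_t), Fraction(c, n_c), Fraction(t + c, n))
        for s, (t, c) in counts.items()
    }


def is_convex_combination(counts):
    """Check ATE weight = (N_T/N) * ATT weight + (N_C/N) * control weight in every stratum."""
    n_t = sum(t for t, _ in counts.values())
    n_c = sum(c for _, c in counts.values())
    share_t = Fraction(n_t, n_t + n_c)
    return all(
        ate == share_t * att + (1 - share_t) * atc
        for att, atc, ate in weightings(counts).values()
    )


directed = {1: (3, 1), 2: (1, 3)}   # propensity 3/4 in stratum 1, 1/4 in stratum 2
uniform = {1: (2, 4), 2: (3, 6)}    # propensity 1/3 in both strata

print(weightings(directed))  # ATT and ATE weights differ
print(weightings(uniform))   # all three weightings coincide
```

With directed prescribing the treated patients sit mostly in stratum 1, so the ATT weighting favors that stratum while the ATE weighting does not; with uniform propensity the three sets of weights are identical.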

Page 32: Gerhardt Pohl Eli Lilly and Company

Population-Based Weightings Are Not the Same as ANOVA Weightings

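• SAS Type I: weights differ between the treatment and control groups, so not collapsible to a function of the within-stratum differences

• SAS Type II: weights inversely proportional to the variance of the within-stratum difference

• SAS Type III: weights uniform across strata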

Page 33: Gerhardt Pohl Eli Lilly and Company

Addressing Censoring

Page 34: Gerhardt Pohl Eli Lilly and Company

Censored Survival

• A recent paper reviewed the survival component of 45 Health Technology Assessments (HTAs) submitted to the National Institute for Health and Clinical Excellence (NICE) in the cancer disease area.

• Nicholas R. Latimer. “Survival Analysis for Economic Evaluations alongside Clinical Trials—Extrapolation with Patient-Level Data: Inconsistencies, Limitations, and a Practical Guide”. Medical Decision Making, published online 22 January 2013.

• A variety of methods were noted as having been used to estimate mean survival in the assessments:

– Restricted means analyses, i.e., area under the K-M curve up until a certain point
– Parametric modeling (exponential, Weibull, Gompertz, etc.)
– Partial Likelihood Regression/Proportional Hazards (which accounts for heterogeneity)
– Reliance on estimates external to the study

• STRONG PREFERENCE of author for parametric modeling: “a lifetime horizon is usually advocated, particularly for interventions that affect survival. Therefore, in the presence of censoring, extrapolation is required to predict the complete survival impact of the new intervention, which may be summarized as the mean survival benefit.”

Page 35: Gerhardt Pohl Eli Lilly and Company

Is Censoring Really So Bad?

• “In 17 (38%) [H]TAs, extrapolation was not performed, with the survival analysis based purely on the observed trial data (restricted means analysis). Appropriately, this was generally only the case when there was relatively little censoring in the survival data from the trial.”

• Consider a study with 1,000 patients followed until death. Now add data from 1,000 more patients whose follow-up is incomplete (censored). The supplemented data set has 50% censoring but more information content. Is it really worse than a study half the size with no censoring?

Page 36: Gerhardt Pohl Eli Lilly and Company

Our Approach

• The restricted K-M mean has the advantage of using all the data and being non-parametric. So long as the time horizon is long enough to fully characterize the survival function, censoring should not be a problem.

• We calculate the K-M mean within each stratum as the area under the curve up until the last death. This is conservative as regards the ICER as it tends to underestimate the denominator.
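A minimal sketch of the restricted K-M mean, computed as the area under the Kaplan-Meier step function from time zero to the last observed death. The function name `km_restricted_mean` and the five-patient data set are hypothetical; patients censored at the same time as a death are counted as still at risk at that time, which is the usual convention but is stated here as an assumption.

```python
def km_restricted_mean(times, events):
    """Area under the Kaplan-Meier curve from 0 to the last observed death.

    times: follow-up time per patient; events: 1 = death, 0 = censored.
    """
    data = list(zip(times, events))
    death_times = sorted({t for t, e in data if e == 1})
    if not death_times:
        raise ValueError("no deaths observed; the restricted mean is undefined here")
    surv, prev_t, area = 1.0, 0.0, 0.0
    for t in death_times:
        area += surv * (t - prev_t)                    # rectangle under the current step
        at_risk = sum(1 for u, _ in data if u >= t)    # still under observation just before t
        deaths = sum(1 for u, e in data if u == t and e == 1)
        surv *= 1 - deaths / at_risk                   # Kaplan-Meier product-limit update
        prev_t = t
    return area


# Hypothetical stratum: deaths at 2, 4 and 6; censored observations at 4 and 8.
times = [2, 4, 4, 6, 8]
events = [1, 1, 0, 1, 0]
rmean = km_restricted_mean(times, events)
print(round(rmean, 3))  # 1.0*2 + 0.8*2 + 0.6*2 = 4.8
```

Because the area stops at the last death, any survival beyond that point is not counted, which is the source of the conservatism noted above.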

Page 37: Gerhardt Pohl Eli Lilly and Company

Censored Costs

• D. Y. Lin; E. J. Feuer; R. Etzioni; Y. Wax. “Estimating Medical Costs from Incomplete Follow-Up Data”. Biometrics, Vol. 53, No. 2. (Jun., 1997), pp. 419-434.

• The basic concept is to calculate expected cost via conditioning, i.e., as the sum over all intervals of the probability of survival to the start of an interval times the average cost incurred during the interval by patients alive at the start of the interval.

Page 38: Gerhardt Pohl Eli Lilly and Company

Calculation of Censored Cost

Input Dataset (One Record per Patient)

Patient   Event Time   Death or Censor   Cost1    Cost2    Cost3    …
1         22           1                 $1,533   $3,742   $4,899
2         18           1                 $5,426   $2,538   $3,745
3         39           0                 $6,792   $4,407   $3,890
…

Event Time: Time to Death or Censoring
Cost1, Cost2, Cost3, …: Total Cost in Interval (E.g., Weekly Cost)

Page 39: Gerhardt Pohl Eli Lilly and Company

Calculation of Censored Cost

Input Dataset (One Record per Patient)

Patient   Event Time   Death or Censor   Cost1    Cost2    Cost3    …
1         22           1                 $1,533   $3,742   $4,899
2         18           1                 $5,426   $2,538   $3,745
3         39           0                 $6,792   $4,407   $3,890
…

Kaplan-Meier (One Record per Time Interval)

Week   S
1      0.98
2      0.87
3      0.66
…

S: Probability of Survival until Time Period

Calculate Averages (One Record per Time Interval)

Week   Avg Cost
1      $4,583
2      $3,562
3      $4,178
…

Avg Cost: Average Period Cost among Patients Alive at Start of Time Period

Page 40: Gerhardt Pohl Eli Lilly and Company

Calculation of Censored Cost

Input Dataset (One Record per Patient)

Patient   Event Time   Death or Censor   Cost1    Cost2    Cost3    …
1         22           1                 $1,533   $3,742   $4,899
2         18           1                 $5,426   $2,538   $3,745
3         39           0                 $6,792   $4,407   $3,890
…

Kaplan-Meier (One Record per Time Interval)

Week   S
1      0.98
2      0.87
3      0.66
…

S: Probability of Survival until Time Period

Calculate Averages (One Record per Time Interval)

Week   Avg Cost
1      $4,583
2      $3,562
3      $4,178
…

Avg Cost: Average Period Cost among Patients Alive at Start of Time Period

Expected (Censoring-Adjusted) Cost: $10,348
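The final step can be sketched as a weighted sum: each week's average cost among patients alive at the start of the week is multiplied by the Kaplan-Meier probability of surviving to that week. The function name `expected_cost` is hypothetical, and the example uses only the three weeks of values shown on the slide.

```python
def expected_cost(survival, avg_cost):
    """Sum over intervals of P(alive at start of interval) * average cost in the interval."""
    if len(survival) != len(avg_cost):
        raise ValueError("need one survival probability per cost interval")
    return sum(s * c for s, c in zip(survival, avg_cost))


# Values from the slide: K-M survival and average weekly cost for weeks 1 to 3.
survival = [0.98, 0.87, 0.66]
avg_cost = [4583, 3562, 4178]

total = expected_cost(survival, avg_cost)
print(f"${total:,.0f}")  # $10,348
```

The three displayed weeks reproduce the $10,348 figure on the slide, which suggests the worked example there sums exactly these terms.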

Page 41: Gerhardt Pohl Eli Lilly and Company

Pulling It All Together

• To adjust for heterogeneous risk, stratify.

• Address censoring by calculating censored mean survival and expected cost within strata.

• Pool strata relative to stratum size.

• Bootstrap at the individual patient level against the pre-specified strata boundaries to incorporate variability arising from outcomes and propensity estimates.

• All together, this yields a complete method to calculate the ICER in oncology.