
Available online at www.sciencedirect.com

journal homepage: www.elsevier.com/locate/jval

The Role of Noncomparative Evidence in Health Technology Assessment Decisions

Elizabeth A. Griffiths, BSc1,*, Richard Macaulay, BSc, PhD1, Nirma K. Vadlamudi, BA, BS, MPH2, Jasim Uddin, BSc, PhD1, Ebony R. Samuels, BSc, PhD1

1PAREXEL Access Consulting, PAREXEL International, London, UK; 2Faculty of Pharmaceutical Sciences, University of British Columbia, Vancouver, British Columbia, Canada

ABSTRACT

Background: Many health technology assessment (HTA) agencies express a preference for randomized controlled trial evidence when appraising health technologies; nevertheless, it is not always feasible or ethical to conduct such comparative trials. Objectives: To assess the role of noncomparative evidence in HTA decision making. Methods: The Web sites of the National Institute for Health and Care Excellence (NICE) in the United Kingdom, the Canadian Agency for Drugs and Technologies in Health (CADTH) in Canada, and the Institute for Quality and Efficiency in Health Care (Institut für Qualität und Wirtschaftlichkeit im Gesundheitswesen [IQWiG]) in Germany were searched for single HTA reports (published between January 2010 and December 2015). The product, indication, outcome, and clinical evidence presented (comparative/noncomparative) were double-extracted, with any discrepancies reconciled. A noncomparative study was defined as any study not presenting results against another treatment (including placebo or best supportive care), regardless of phase or setting, including dose-ranging studies. Results: A total of 549 appraisals were extracted. Noncomparative evidence was considered in 38% (45 of 118) of NICE submissions, 13% (34 of 262) of CADTH submissions, and 12% (20 of 169) of IQWiG submissions. Evidence submissions based exclusively on noncomparative evidence were presented in only 4% (5 of 118) of NICE appraisals, 6% (16 of 262) of CADTH appraisals, and 4% (6 of 169) of IQWiG appraisals. Most drugs appraised solely on the basis of noncomparative evidence were indicated for cancer or hepatitis C. Positive outcome rates (encompassing recommended/restricted/added-benefit decisions) for submissions presenting only noncomparative evidence were similar to overall recommendation rates for CADTH (69% vs. 68%, respectively), but were numerically lower for NICE (60% vs. 84%, respectively) and IQWiG (17% vs. 38%, respectively) (P > 0.05 for all). Conclusions: Noncomparative studies can be viewed as acceptable clinical evidence by HTA agencies when these study designs are justifiable and when treatment effect can be convincingly demonstrated, but their use is currently limited.

Keywords: clinical effectiveness, decision making, evidence-based medicine, health technology assessment.

Copyright © 2017, International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc.

Introduction

Health technology assessment (HTA) can be broadly defined as the evaluation of health care interventions in the context of their implications for the wider health system. HTA aims to systematically assess the clinical value (i.e., comparative health benefits) and the economic value (i.e., value for money) of interventions to inform decisions regarding their reimbursement and uptake [1].

To assess the clinical value of a new intervention, it is necessary to compare it against currently available interventions in terms of patient-relevant outcomes such as efficacy, safety, and health-related quality of life. To ensure accurate and objective comparisons between interventions, the best available clinical evidence must be used. Randomized controlled trials (RCTs) have historically been considered the gold standard in the hierarchy of clinical evidence, surpassed only by meta-analyses of RCTs, whereas nonrandomized studies and uncontrolled studies are considered weaker evidence (Fig. 1) [2]. Compared with non-RCTs, RCTs minimize the likelihood of confounding factors influencing the results and therefore produce a more robust and less biased estimation of treatment effect. For this reason, HTA agencies usually express a preference for RCT evidence to assess comparative effectiveness [3–5].

Nevertheless, there are situations in which it is not ethical, feasible, or practical to conduct an RCT [6,7]. For example, it may be unethical to offer a placebo or an intervention that is hypothesized to be less effective than the intervention under evaluation (e.g., in immediately life-threatening disorders). Alternatively, the disease may be so rare that it would be difficult to recruit enough participants to detect statistically significant differences between treatment arms (e.g., rare genetic disorders), or there simply may be no established treatment options to compare against (e.g., some advanced cancers). In such cases, noncomparative studies may provide the "best available" evidence. In the clinical trial setting, noncomparative studies encompass a range of designs including dose-ranging studies, single-arm trials, case series, and case reports [8–11]. In the "real-world" setting (i.e., outside of the typical clinical trial setting), this may include registry studies, claims data, and some observational designs [12]. Although considered less robust than RCTs, noncomparative studies can—and do—inform health care decision making.

1098-3015$36.00 – see front matter Copyright © 2017, International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc.
http://dx.doi.org/10.1016/j.jval.2017.06.015
E-mail: [email protected].
* Address correspondence to: Elizabeth A. Griffiths, PAREXEL Access Consulting, PAREXEL International, 160 Euston Road, London NW1 2DX, UK.
VALUE IN HEALTH 20 (2017) 1245–1251

In the era of biomarker-based, "personalized" medicine and conditional regulatory approvals based on immature clinical data [13], there may be cases in which reimbursement decisions will need to be made on the basis of noncomparative studies such as phase 1 or phase 2 trials (e.g., dose-ranging or single-arm trials), with this trend anticipated to continue [14,15]. Both the US Food and Drug Administration (FDA) and the European Medicines Agency (EMA) have developed mechanisms to facilitate earlier patient access to promising medicines, such as breakthrough status (FDA), the accelerated approval pathway (FDA), and the adaptive pathways pilot (EMA). For example, the FDA approved ceritinib for non–small cell lung cancer and pembrolizumab for melanoma in 2014 under both the FDA breakthrough status and accelerated approval processes, supported by only phase 1 data [16,17]. Although these expedited regulatory pathways can ensure earlier market authorization, to achieve patient access, public reimbursement must also be achieved, which typically requires a prior recommendation by an HTA body.

It will therefore be important to understand how HTA bodies react to noncomparative evidence, and whether a positive outcome is achievable with such data. This study aimed to address the following research questions to characterize the role of noncomparative evidence in HTA decision making, which may help inform future HTA submissions:

1. What are HTA agencies' perceptions of noncomparative evidence, on the basis of recommendation and nonrecommendation rates, and what are the key differences in these rates between agencies?

2. Do recommendation rates for submissions presenting noncomparative evidence differ from rates for submissions presenting RCT evidence?

3. Are there any differences in recommendation rates/perceptions by parameters such as disease area and size of the patient population?

For this research, three HTA bodies were selected: the National Institute for Health and Care Excellence (NICE) in the United Kingdom, the Canadian Agency for Drugs and Technologies in Health (CADTH) in Canada, and the Institute for Quality and Efficiency in Health Care (Institut für Qualität und Wirtschaftlichkeit im Gesundheitswesen [IQWiG]) in Germany. These agencies were chosen because they represent key jurisdictions that use varying criteria to inform decision making in different settings, and release publicly available and transparent appraisal documents for all interventions that are evaluated.

Methods

Data Sources

The Web sites of the three jurisdictions—NICE (https://www.nice.org.uk/), CADTH (https://www.cadth.ca/), and IQWiG (https://www.iqwig.de/)—were systematically searched for publicly available HTA appraisals published between January 2010 and December 2015. This date range was chosen because it allowed for a wide range of appraisals across the three agencies to be analyzed and it reflects recent decision-making trends; the evolution of HTA processes is such that decisions published before 2010 may be less relevant to today's reimbursement landscape. The IQWiG and the CADTH pan-Canadian Oncology Drug Review (pCODR) did not publish their first decisions until 2011 and 2012, respectively; in addition, all three agencies revisit their methods every few years.

Appraisal Selection

The inclusion criteria encompassed all single technology appraisals for pharmaceutical interventions, irrespective of indication or outcome. Multiple technology appraisals, appraisals for vaccines and devices, requests for advice, and health economic dossiers were excluded.

Data Extraction

For each appraisal meeting the inclusion criteria, the indication, date issued, outcome, and clinical evidence presented were extracted. Appraisals were classified as recommended, restricted, or not recommended. Detailed definitions of these outcomes are described in Appendix Table 1 in Supplemental Materials found at http://dx.doi.org/10.1016/j.jval.2017.06.015.

The clinical evidence was defined as the data presented to support the clinical case, for efficacy, safety, or quality-of-life outcomes, as categorized by the HTA agency. Data used to inform economic modeling were not included. The clinical evidence was first categorized into two categories, comparative study or noncomparative study, using the following definitions:

1. Comparative study: Any study against an active or placebo comparator:

• Active-controlled RCT: randomized controlled study against a relevant active comparator (note that best supportive care was considered to fall under this category);

• Placebo-controlled RCT: randomized controlled study against a placebo comparator (including vehicle-controlled studies, sham injections, and studies in which the active intervention + X was compared against placebo + X);

• Other comparative study: study designs encompassing nonrandomized, controlled trials.

2. Noncomparative study: Any study that did not compare against an active comparator or placebo. This included, but was not limited to, single-arm trials, dose-ranging studies, registry studies, compassionate use programs, and uncontrolled extension studies.
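The two-level scheme above can be sketched in code. This is an illustrative reconstruction, not part of the original study; the `EvidenceCategory` names and the `classify_study` helper are hypothetical:

```python
from enum import Enum
from typing import Optional

class EvidenceCategory(Enum):
    ACTIVE_CONTROLLED_RCT = "active-controlled RCT"
    PLACEBO_CONTROLLED_RCT = "placebo-controlled RCT"
    OTHER_COMPARATIVE = "other comparative study"
    NONCOMPARATIVE = "noncomparative study"

def classify_study(randomized: bool, comparator: Optional[str]) -> EvidenceCategory:
    """Map a study's design features onto the extraction categories.

    comparator: "active" (including best supportive care), "placebo"
    (including vehicle controls and sham injections), or None when the
    study has no comparator treatment arm.
    """
    if comparator is None:
        # Single-arm trials, dose-ranging studies, registry studies,
        # compassionate use programs, uncontrolled extensions, etc.
        return EvidenceCategory.NONCOMPARATIVE
    if randomized and comparator == "active":
        return EvidenceCategory.ACTIVE_CONTROLLED_RCT
    if randomized and comparator == "placebo":
        return EvidenceCategory.PLACEBO_CONTROLLED_RCT
    # Controlled but nonrandomized designs.
    return EvidenceCategory.OTHER_COMPARATIVE
```

Note that under this scheme a randomized dose-ranging study with no comparator treatment still falls into the noncomparative category, matching the definition used in the paper.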

Noncomparative studies were further classified into single-arm trials, single-arm extensions of comparative trials, dose-ranging studies, and other (see Appendix Table 2 in Supplemental Materials found at http://dx.doi.org/10.1016/j.jval.2017.06.015). Categories were not mutually exclusive and more than one evidence type could be recorded per appraisal if multiple studies were presented. In addition, submissions were classified by the International Statistical Classification of Diseases and Related Health Problems, Tenth Revision (ICD-10) category to determine whether any trends could be identified by disease area and by orphan versus nonorphan designation (using the EMA definition of an orphan disease, i.e., affecting ≤5 in 10,000 people in the European Union).

Fig. 1 – Evidence hierarchy of clinical studies. Adapted from Akobeng [2].

All appraisals were extracted independently by two separate authors, with any discrepancies resolved by a third author.

Data Analyses

The κ statistic [18] was used to evaluate inter-reviewer reliability. The Cohen κ was 0.75, reflecting good concordance between reviewers.
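As a sketch of that agreement check, Cohen's κ for two raters is the observed agreement corrected for the agreement expected by chance; the labels below are invented for illustration, not the study's data:

```python
from collections import Counter

def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters."""
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    # Observed agreement: proportion of items given identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if the two raters labeled independently.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / n**2
    return (p_o - p_e) / (1 - p_e)

# Invented example: 1 = "noncomparative evidence present", 0 = absent.
a = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
b = [1, 0, 0, 0, 1, 0, 1, 1, 0, 1]
print(round(cohen_kappa(a, b), 2))  # 0.6
```

On the conventional Landis and Koch scale, values between 0.61 and 0.80 indicate substantial agreement, consistent with the 0.75 reported above.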

Descriptive statistics were presented as proportions for the submission and recommendation rates by study design. Odds ratios [18] were calculated to determine the odds of a positive submission outcome by evidence category. A two-tailed Fisher exact test was used to evaluate the effect of presenting noncomparative evidence on recommendation rates for each agency (recommended and restricted vs. not recommended), for submissions presenting noncomparative evidence versus those not presenting noncomparative evidence.
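The odds ratios in Table 1 are plain 2x2 cross-product ratios. As a worked check, using the CADTH "noncomparative evidence alone" row (11 of 16 positive outcomes) against all other CADTH submissions (167 of 246 positive, derived from the Table 1 totals):

```python
# 2x2 table: rows = CADTH submissions with noncomparative evidence alone
# vs. all other CADTH submissions; columns = positive outcome
# (recommended/restricted) vs. not recommended. Counts from Table 1.
a, b = 11, 5      # noncomparative evidence alone: positive, negative
c, d = 167, 79    # all other submissions: positive, negative

odds_ratio = (a * d) / (b * c)  # cross-product ratio
print(round(odds_ratio, 2))  # 1.04, matching the OR reported in Table 1
```

The two-tailed P value on the same 2x2 table can be obtained with scipy.stats.fisher_exact, which also returns this odds ratio.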

Results

A total of 549 appraisals were extracted on the basis of the inclusion criteria: 118 for NICE, 262 for CADTH, and 169 for IQWiG. Overall, NICE had the highest proportion of submissions with favorable outcomes (57% recommended, with a further 27% recommended with restrictions), followed by CADTH (10% recommended and 58% recommended with restrictions) and IQWiG (14% recommended and 24% restricted).

Noncomparative evidence was considered in 38% (45 of 118) of NICE submissions, 13% (34 of 262) of CADTH appraisals, and 12% (20 of 169) of IQWiG appraisals. Evidence submissions based exclusively on noncomparative evidence were presented in only 4% (5 of 118) of NICE appraisals, 6% (16 of 262) of CADTH appraisals, and 4% (6 of 169) of IQWiG appraisals (Table 1).

Numerically, the proportion of recommended, restricted, and not recommended submissions varied depending on the type of evidence presented for all agencies. Positive outcome rates (encompassing recommended, restricted, and added-benefit decisions) for submissions presenting only noncomparative evidence were similar to overall recommendation rates for CADTH (69% [11 of 16] vs. 68% [178 of 262], respectively), but were notably lower for NICE (60% [3 of 5] vs. 84% [99 of 118], respectively) and IQWiG (17% [1 of 6] vs. 38% [65 of 169], respectively).

Nevertheless, sample sizes were very small for submissions presenting noncomparative evidence alone, and the difference in proportions did not meet statistical significance for any agency (P > 0.05). Outcome rates for submissions including noncomparative evidence alongside an RCT were both numerically and statistically similar to the overall outcome rates for all agencies (Table 1).

Overall, the majority of the noncomparative evidence presented in the reviewed HTA submissions consisted of single-arm studies, followed by randomized dosing studies and single-arm extension studies (Fig. 2). For NICE and CADTH submissions, most of these studies were used to support efficacy and/or safety, with a smaller proportion supporting quality-of-life outcomes. For IQWiG, most of the noncomparative studies were not considered relevant for the assessment (Fig. 2).

In terms of disease area, most submissions presenting noncomparative evidence were treatments indicated for cancer or infection (specifically hepatitis C), with a substantial proportion also indicated for an orphan disease within these therapy areas (Fig. 3).

No meaningful time trends in the proportion of submissions presenting noncomparative evidence were observed (Fig. 4). The absolute number of submissions presenting noncomparative evidence for all jurisdictions did tend to increase over time, but this reflected an increase in the number of appraisals overall. Similarly, recommendation rates for submissions including noncomparative evidence or presenting noncomparative evidence alone did not vary notably over time, compared with overall recommendation rates.

Table 1 – Overall recommendation rates and recommendations by study design.

Submission outcome                                          CADTH             IQWiG             NICE
Total submissions, N                                        262               169               118
  Recommended, n (%)                                        27 (10)           24 (14)           67 (57)
  Restricted, n (%)                                         151 (58)          41 (24)           32 (27)
  Not recommended, n (%)                                    84 (32)           104 (62)          19 (16)
All submissions including noncomparative evidence, N (%)    34 (13)           21 (12)           45 (38)
  Recommended, n (%)                                        6 (18)            2 (10)            24 (53)
  Restricted, n (%)                                         16 (47)           5 (25)            14 (31)
  Not recommended, n (%)                                    12 (35)           13 (65)           7 (16)
  Odds of a positive* outcome, OR (95% CI)                  0.85 (0.40–1.80)  0.84 (0.32–2.24)  1.07 (0.39–2.95)
  P value†                                                  0.70              0.81              1.00
Submissions with noncomparative evidence alone, N (%)       16 (6)            6 (4)             5 (4)
  Recommended, n (%)                                        2 (13)            0 (0)             2 (40)
  Restricted, n (%)                                         9 (56)            1 (17)            1 (20)
  Not recommended, n (%)                                    5 (31)            5 (83)            2 (40)
  Odds of a positive* outcome, OR (95% CI)                  1.04 (0.35–3.10)  0.31 (0.04–2.71)  0.27 (0.04–1.71)
  P value†                                                  1.00              0.41              0.18

CADTH, Canadian Agency for Drugs and Technologies in Health; CI, confidence interval; HTA, health technology assessment; IQWiG, Institute for Quality and Efficiency in Health Care; NICE, National Institute for Health and Care Excellence; OR, odds ratio.
* Positive defined as recommended or recommended with restrictions, vs. submissions not presenting noncomparative evidence.
† Two-tailed Fisher test, recommended + restricted vs. not recommended, for submissions presenting noncomparative evidence vs. those not presenting noncomparative evidence.

In terms of the critique given by different agencies, NICE and CADTH were critical of the lower quality of noncomparative evidence, highlighting the higher risk of bias, but were prepared to consider this evidence in the absence of more robust studies. IQWiG was generally not prepared to consider noncomparative studies, with ledipasvir/sofosbuvir for hepatitis C virus infection being the only drug given a favorable recommendation on the basis of noncomparative evidence alone in 2015 (and indeed the only submission in which noncomparative evidence was considered acceptable) [19].

Appendix Table 3 in Supplemental Materials found at http://dx.doi.org/10.1016/j.jval.2017.06.015 describes the drugs that were appraised solely on the basis of noncomparative evidence, including reasons for considering a clinical argument based on noncomparative evidence alone. These reasons could be broadly assigned to one of four main categories. In most of the submissions in which noncomparative evidence alone was considered acceptable (NICE, 5 of 5; CADTH, 16 of 16; and IQWiG, 1 of 6), a lack of effective treatment alternatives to compare against or unmet clinical need for new options was listed as one of the contributing factors (NICE, 5 of 5; CADTH, 15 of 16; and IQWiG, 0 of 1). The second most frequently listed reason was that the treatment under evaluation was licensed in a small patient population (and a comparative trial could therefore not be powered for statistical significance) (NICE, 2 of 5; CADTH, 6 of 16; and IQWiG, 0 of 1); the third, that the anticipated magnitude of treatment effect was sufficiently large that it would be unethical to conduct a comparative trial (NICE, 2 of 5; CADTH, 4 of 16; and IQWiG, 1 of 1). The fourth reason was that the disease was life-threatening and it would be unethical to compare against a placebo or a less effective comparator (NICE, 1 of 5; CADTH, 2 of 16; and IQWiG, 0 of 1). In many submissions, more than one of these reasons was listed.

[Figure 2 appears here: grouped bar charts of the number of appraisals for NICE (n=45), CADTH (n=34), and IQWiG (n=21).]

Fig. 2 – (A) Type of noncomparative evidence and (B) role of that evidence. Note. Categories are not mutually exclusive (e.g., a study may provide both efficacy and safety evidence, and more than one type of study may be presented in each submission). "Other" for type of noncomparative evidence includes individual patient data, case series, and retrospective audits; no noncomparative "real-world" studies (e.g., registry studies) were identified. "Other" for role of evidence includes bioequivalence data. "Not considered" refers to submissions in which the agency explicitly stated that the evidence was not considered in their assessment. "Unclear" refers to submissions in which the agency did not explicitly state how the evidence was used for the assessment. CADTH, Canadian Agency for Drugs and Technologies in Health; IQWiG, Institute for Quality and Efficiency in Health Care; NICE, National Institute for Health and Care Excellence; QoL, quality of life.

Fig. 3 – Submissions presenting noncomparative evidence, by disease area. Note. Orphan drugs and disease categories are not mutually exclusive (e.g., a drug may be categorized under both neoplasms and orphan drugs). CADTH, Canadian Agency for Drugs and Technologies in Health; IQWiG, Institute for Quality and Efficiency in Health Care; NICE, National Institute for Health and Care Excellence.

[Figure 4 appears here: line chart of the proportion of submissions (0%–50%) by year, 2010–2015, for NICE, CADTH, and IQWiG, each shown as submissions including noncomparative evidence and submissions with noncomparative evidence only.]

Fig. 4 – Trends over time in the proportion of submissions presenting noncomparative evidence. CADTH, Canadian Agency for Drugs and Technologies in Health; IQWiG, Institute for Quality and Efficiency in Health Care; NICE, National Institute for Health and Care Excellence.

Table 2 – NICE, CADTH, and IQWiG guidance for the inclusion of noncomparative evidence within submissions.

NICE [3]:
• The problems of confounding, lack of blinding, incomplete follow-up, and lack of a clear denominator and end point occur more commonly in nonrandomized studies and uncontrolled trials than in RCTs.
• Inferences about relative treatment effects drawn from studies without randomization or control will necessarily be more circumspect than those from RCTs.
• The potential biases of observational studies should be identified and ideally quantified and adjusted for. When possible, more than one independent source of such evidence should be examined to gain some insight into the validity of any conclusions.
• Evidence from sources other than RCTs is also often used for parameters such as the valuation of health effects over time into QALYs and for costs. Study quality can vary, and so systematic review methods, critical appraisal, and sensitivity analyses are as important for review of these data as they are for reviews of data on relative treatment effects from RCTs.

CADTH [4]:
• If no head-to-head trial(s) have been conducted vs. another available drug and the manufacturer is assuming different clinical benefit (efficacy or effectiveness or safety) of the submitted drug compared with other drugs in the class, an indirect comparison is required. The evidence to support these claims should be provided in detail.
• Reference to other studies that may provide information on clinical benefits other than efficacy and safety, or postmarketing data that provide information on longer term safety, should be supplied as required.
• For resubmissions, new efficacy data must be from an RCT. (If the new information is in support of improved safety, case-control or cohort studies will be accepted if RCTs are unavailable.)

IQWiG [5]:
• Conclusions for benefit assessments are usually inferred only from the results of direct comparative studies. An RCT represents the criterion standard in the assessment of drug and nondrug interventions.
• As a rule, the institute therefore considers RCTs in the benefit assessment of drugs and uses nonrandomized intervention studies or observational studies only in justified exceptional cases.
• Reasons for exception include nonfeasibility of an RCT (e.g., if the therapist and/or patient has a strong preference for a specific therapy alternative) or if other study types may also provide sufficient certainty of results for the research question posed.
• For diseases that would be fatal within a short period of time without intervention, several consistent case reports may provide sufficient certainty of results that a particular intervention prevents this otherwise inevitable course (dramatic effect).
• The institute disapproves of the use of nonadjusted indirect comparisons (i.e., the naive use of single-study arms).

Note. Guidance is copied or paraphrased from the respective method's guide. CADTH, Canadian Agency for Drugs and Technologies in Health; IQWiG, Institute for Quality and Efficiency in Health Care; NICE, National Institute for Health and Care Excellence; QALY, quality-adjusted life-year; RCT, randomized controlled trial.

Discussion

Overall, a minority of HTA submissions gained positive recommendations on the basis of noncomparative evidence alone: 11 for CADTH (4%), 3 for NICE (3%), and 1 for IQWiG (0.6%). Noncomparative evidence was nevertheless used as supporting evidence in 45 (38%) NICE submissions, 34 (13%) CADTH submissions, and 20 (12%) IQWiG submissions. NICE and CADTH were generally critical of the lower quality of noncomparative evidence, but recognized that there are situations in which noncomparative trial designs could be acceptable, in line with their method guidance (Table 2). IQWiG places strong emphasis on demonstrating additional therapeutic benefit from direct comparative trials versus the appropriate comparator and was highly critical of submissions presenting noncomparative evidence. This was reflected in a correspondingly lower recommendation rate for submissions based purely on noncomparative data than that seen for all submissions assessed. Nevertheless, although recommendation rates did vary between appraisals presenting comparative evidence and those presenting noncomparative evidence, sample sizes were small and no significant differences were observed.

Most of the submissions presenting noncomparative evidence were single-arm studies used to support safety and efficacy, with dose-ranging studies and extension studies also represented. The single-arm studies tended to reflect interventions in disease areas in which placebo-controlled trials were inappropriate given the life-threatening nature of the disease (e.g., cancers at late lines of therapy), interventions for diseases with a small patient population in which trials could not be powered to detect statistically significant differences between treatment arms (e.g., rare genetic disorders), and interventions for diseases in which preliminary data suggested a magnitude of benefit sufficiently great that comparative trials would not be ethical (e.g., hepatitis C). Other examples included terminal or rare conditions in which a high unmet need or a lack of alternative treatment options to compare against would also make noncomparative evidence acceptable. Accordingly, most therapies that were recommended on the basis of noncomparative evidence were in oncology or orphan disease indications, or offered curative potential (specifically in hepatitis C infection).

No notable time trends were observed with respect to the role of noncomparative evidence in HTA decision making. Nevertheless, in an era of increasingly targeted medicine and in which conditional approvals are becoming more common, it is likely that noncomparative evidence will begin to play a greater role in HTA. Payers will therefore need to adapt if they are to permit patient access to innovative new drugs, and they are already considering how to do this. For example, although IQWiG considered very few noncomparative studies to be relevant for the benefit assessment, the 2015 IQWiG-Herbst Symposium (an annual conference that explores current and controversial topics in health care) focused on the role of real-world data (including noncomparative data) in benefit assessments [20]. The first positive recommendation by IQWiG in 2015 on the basis of noncomparative evidence alone (sofosbuvir/ledipasvir) could also be considered an indicative milestone.

Nevertheless, challenges remain, and HTA agencies remain critical of noncomparative evidence. The use of such evidence introduces an element of uncertainty in estimating treatment effect; granting access to innovative medicines without a robust supporting evidence base could expose patients to inefficacious or even harmful medicines. For example, fewer than half of oncology medicines are successful at phase 3 [21], yet a considerable number have been approved on the basis of phase 2 or even phase 1 single-arm data. Furthermore, many drugs have received accelerated market authorization conditional on subsequent follow-up data being made available, yet these continue to be available to patients despite the manufacturer not fulfilling this requirement [22].

Despite limitations associated with RCTs, such as uncertain generalizability to clinical practice and limited duration [23], they remain the most objective tool for quantifying treatment effect. It is therefore unlikely that the RCT will be displaced as the gold standard by noncomparative study designs in the foreseeable future. Instead, opportunities exist to better define the role of noncomparative evidence: acceptable when RCT designs are not practical, feasible, or ethical, and useful to strengthen and validate RCT data in the context of real-world and long-term evaluation.

A key limitation of this analysis was the extraction of information from publicly available summary reports, rather than from manufacturers' submissions (which are generally not publicly available). The summary reports may have left out additional supporting evidence included in the manufacturer's submission or information on the role or type of evidence. Nevertheless, it is likely that any noncomparative study that played a role in the decision-making process would have been described in the appraisal summary document. It should also be noted that this study focused on the role of noncomparative evidence to support the clinical value proposition, but that such evidence can also play a role in supporting an economic value proposition (e.g., the use of single-arm extension data to validate long-term cost-effectiveness model survival predictions).

To our knowledge, this is the first published study to explicitly and systematically assess the role of noncomparative evidence in HTA, and it may be useful to inform future evidence generation strategies for new medicines. The study reflects the priorities of three internationally recognized and often-referenced jurisdictions; nevertheless, different HTA bodies have varying criteria, and individualized approaches are needed when determining a submission strategy. Further research could expand the scope of this study to include additional HTA agencies.

Conclusions

The role of noncomparative evidence in decision making by HTA bodies is currently small, but it may be accepted in cases in which:

1. there is a high unmet need for treatment or there are no established alternative treatment options to compare against (e.g., rare genetic disorders and rare cancers);

2. the disease is rare and it is not possible to design a trial with sufficient power to detect statistically significant differences between treatment arms (e.g., ultra-orphan indications);

3. preliminary clinical data suggest a clinical effect that is sufficiently large in magnitude that a direct comparative trial versus a treatment alternative would not be considered ethical (e.g., hepatitis C);

4. the illness is severe and life-threatening with no efficacious treatment alternatives, and randomizing patients to a placebo or investigator's choice arm would not be considered ethical (e.g., very advanced cancers).

In the aforementioned categories, noncomparative evidence may be viewed as acceptable clinical evidence by some HTA bodies as long as the study design is justifiable and meets the jurisdiction's reference case, and when treatment benefit can be convincingly demonstrated. In future, the increasing trend toward personalized medicine and conditional regulatory approvals may necessitate a greater role for noncomparative evidence in HTA decision making.

Source of financial support: This study was financially supported by PAREXEL International in the form of salaries for E. A. Griffiths, R. Macaulay, J. Uddin, and E. R. Samuels. E. A. Griffiths, R. Macaulay, J. Uddin, and E. R. Samuels were employed by PAREXEL International at the time of writing this article. N. K. Vadlamudi was previously employed by PAREXEL International but did not receive any financial incentive toward this article from PAREXEL International after leaving in August 2016. The funders played a role in the study design, data collection and analysis, decision to publish, and preparation of the manuscript.

Supplemental Materials

Supplemental material accompanying this article can be found in the online version as a hyperlink at http://dx.doi.org/10.1016/j.jval.2017.06.015 or, if a hard copy of the article, at www.valueinhealthjournal.com/issues (select volume, issue, and article).

REFERENCES

[1] Drummond MF, Schwartz JS, Jönsson B, et al. Key principles for the improved conduct of health technology assessments for resource allocation decisions. Int J Technol Assess Health Care 2008;24:244–58.

[2] Akobeng AK. Understanding randomised controlled trials. Arch Dis Child 2005;90:840–4.

[3] National Institute for Health and Care Excellence. Guide to the methods of technology appraisal 2013. Available from: https://www.nice.org.uk/process/pmg9/chapter/foreword. [Accessed March 1, 2016].

[4] Canadian Agency for Drugs and Technologies in Health. Procedure for the CADTH Common Drug Review 2014. Available from: https://www.cadth.ca/media/cdr/process/Procedure_for_CADTH_CDR.pdf. [Accessed March 1, 2016].

[5] Institute for Quality and Efficiency in Health Care. IQWiG general methods version 4.2 2016. Available from: https://www.iqwig.de/download/IQWiG_General_Methods_Version_%204-2.pdf. [Accessed March 1, 2016].

[6] Williams BA. Perils of evidence-based medicine. Perspect Biol Med 2010;53:106–20.

[7] Grossman J, Mackenzie FJ. The randomized controlled trial: gold standard, or merely standard? Perspect Biol Med 2005;48:516–34.

[8] BMJ Publishing Group Ltd. Glossary. Evid Based Nurs 2009;12:128.

[9] Elsevier. Glossary of methodologic terms 2017. Available from: https://www.elsevier.com/__data/promis_misc/apmrglossary.pdf. [Accessed March 1, 2017].

[10] Wolters Kluwer. Glossary 2017. Available from: http://journals.lww.com/jlgtd/Pages/Study-Design-Glossary.aspx. [Accessed March 1, 2017].

[11] ClinicalTrials.gov. Glossary of common site terms. 2017. Available from: https://clinicaltrials.gov/ct2/about-studies/glossary. [Accessed March 7, 2017].

[12] Dekkers OM, Egger M, Altman DG, Vandenbroucke JP. Distinguishing case series from cohort studies. Ann Intern Med 2012;156(Pt 1):37–40.

[13] US Food and Drug Administration. Paving the way for personalized medicine: FDA's role in a new era of medical product development 2013. Available from: https://www.fda.gov/downloads/scienceresearch/specialtopics/personalizedmedicine/ucm372421.pdf. [Accessed March 1, 2016].

[14] Moloney R, Mohr P, Hawe E, et al. Payer perspectives on future acceptability of comparative effectiveness and relative effectiveness research. Int J Technol Assess Health Care 2015;31:90–8.

[15] Milne C-P, Cohen JP, Felix A, Chakravarthy R. Impact of postapproval evidence generation on the biopharmaceutical industry. Clin Ther 2015;37:1852–8.

[16] Chabner BA. Approval after phase I: ceritinib runs the three-minute mile. Oncologist 2014;19:577–8.

[17] US Food and Drug Administration. FDA approves Keytruda for advanced non-small cell lung cancer 2015. Available from: https://www.fda.gov/newsevents/newsroom/pressannouncements/ucm465444.htm. [Accessed March 1, 2016].

[18] Fleiss JL, Levin B, Paik MC. The comparison of proportions from several independent samples. In: Statistical Methods for Rates and Proportions [online]. Hoboken, NJ: John Wiley & Sons, Inc., 2003:187–233. Available from: http://onlinelibrary.wiley.com/doi/10.1002/0471445428.ch9/summary. [Accessed May 1, 2017].

[19] Institute for Quality and Efficiency in Health Care. Ledipasvir/sofosbuvir—benefit assessment according to §35a Social Code Book V (dossier assessment) 2015. Available from: https://www.ncbi.nlm.nih.gov/books/NBK385695/. [Accessed March 7, 2017].

[20] Institute for Quality and Efficiency in Health Care. Autumn Symposium 2015: real-world data for benefit assessments: How can registries and routine data contribute? 2015. Available from: https://www.iqwig.de/en/events/autumn-symposium/symposium-2015.6883.html. [Accessed October 1, 2016].

[21] Hay M, Thomas DW, Craighead JL, et al. Clinical development success rates for investigational drugs. Nat Biotechnol 2014;32:40–51.

[22] Macaulay R. Downfalls of the FDA accelerated approval pathway—stringent controls must be invoked to ensure prompt submission of follow-up confirmatory trial data. Value Health 2014;17:A98.

[23] Silverman SL. From randomized controlled trials to observational studies. Am J Med 2009;122:114–20.
