The Critical Path: Opportunities for Efficiency in Development

Robert J. Temple, M.D.
Associate Director for Medical Policy
Center for Drug Evaluation and Research
U.S. Food and Drug Administration

FDA Science Board Advisory Committee Meeting
April 22, 2004

What Would Represent Efficiency?

Two major possibilities

1. Decreasing the cost of studies or the number of studies
• Simpler, lower cost studies – collect less data
• Trials offshore
• Develop pre-existing trial networks, more use of central IRBs
• More efficient study designs
• Use existing flexibility; no talk of lowering standards, but
– fewer patients in life-threatening diseases
– better assessment and utilization of valid or reasonable surrogate endpoints

Real possibilities here and I’ll discuss some, but there is a second, greater efficiency

What Would Represent Efficiency?

2. Improve the quality of development to obtain valid answers earlier; i.e., get the right answer in phase 2

• Terminate development of ineffective or unsafe drugs sooner

• Proceed into full development only with drugs likely to succeed

• Not lose effective and safe drugs because of the wrong dose or other design defects

This is the area of greatest potential gain to industry and the public and is primarily what I will consider

Startling Fact

PhRMA says that of drugs completing phase 2, about 50% fail in phase 3, often because of lack of effectiveness

But phase 2 is supposed to demonstrate effectiveness in some defined population. What is going wrong?

Well, we don’t know yet and PhRMA’s number 1 task should be to find out

But we at FDA have some ideas and a commitment to try to help: the Critical Path Initiative, studies of how to improve translational research

Specific FDA Efforts

1. Under PDUFA 3, pilot 2

Much more intense early interaction with sponsors for selected drugs. Goal is greater efficiency, fewer omissions of critical data, etc.

2. “Track IV GRP” and related efforts – develop checklists and guidance for critical meetings to be sure needed data of all kinds are discussed. Premise (my view): you can’t only respond to sponsors’ questions; sponsors may not ask everything that should be asked because of a) anxiety about answer, b) unawareness of an issue. FDA staff need to ask the omitted questions.


Specific FDA Efforts (cont.)

3. Guidance on specific clinical areas and more general guidance, such as:

• Exposure-response
• QT evaluation
• Evaluation of hepatotoxicity

Specific FDA Efforts (cont.)

Again, FDA’s efforts will help only somewhat until phase 3 failures are better understood as:

• Unavoidable
− Surprise infrequent adverse effect
− No valid biomarker, so no possible early insight until phase 3 (oral IIb/IIIa inhibitors)
− Adverse effects showing up with longer exposure

Specific FDA Efforts (cont.)

• Avoidable (partial list)
– Failure to use biomarkers in dose evaluation
– Study too narrow a dose range
– Overoptimism regarding less than adequate phase 2 (leading to “confirm” before you’ve “learned” enough)
– Failure to continue dose finding in phase 3, choosing wrong single dose or regimen based on too little data
– Inadequate metabolic, QT or interaction work-up
– Subset chasing

Focus on Phase 2

I will examine ways that phase 2 controlled trials can be redesigned to give more unequivocal answers than they apparently now do

There are design possibilities, infrequently used, that may be more efficient, i.e., giving a surer answer with less effort, and may provide added important data. These include:

• Enrichment approaches - larger effect sizes give surer answers

• Reversing the sequence - the randomized withdrawal study

• Better dose finding - a useful titration design (Sheiner) and attention throughout phase 3 to dose


Enrichment

Enrichment of a population is any selection maneuver that makes the population more likely to be able to participate properly in the study, have the endpoint of interest, or respond to treatment, all of which increase study power for showing an effect of treatment by increasing effect size, increasing the number of events or decreasing heterogeneity. All studies (almost) are enriched to some degree, but some enrichments are more controversial than others, and provoke concerns about “generalizability,” i.e., relevance to the population that will get the drug
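The link between effect size and study power can be made concrete with a rough calculation. The sketch below is illustrative only: it uses the standard normal-approximation sample size formula for a two-arm comparison of means, and it assumes (these are assumptions, not figures from the talk) a standardized effect of 0.5 SD in true responders, zero effect in non-responders, and an unchanged standard deviation. The function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate patients per arm for a two-arm comparison of means.

    effect_size is the standardized drug-placebo difference (in SD units).
    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


def diluted_effect(responder_effect, responder_fraction):
    # Assumption: non-responders contribute zero effect and the SD is unchanged,
    # so the average effect in a mixed population shrinks in proportion.
    return responder_effect * responder_fraction


if __name__ == "__main__":
    # Hypothetical responder fractions: fully enriched, half, one quarter.
    for fraction in (1.0, 0.5, 0.25):
        d = diluted_effect(0.5, fraction)
        print(f"responders {fraction:.0%}: effect size {d:.3f}, "
              f"n per group {n_per_group(d)}")
```

Under these assumptions the required sample size grows with the inverse square of the responder fraction, which is the arithmetic behind "larger effect sizes give surer answers."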

Enrichment (cont.)

A. Practical – routine, almost always acceptable
• Find (prospectively) likely compliers
• Choose people who will not drop out
• Eliminate placebo-responders in a lead-in period
• Eliminate people who give inconsistent treadmill results
• Eliminate people with diseases likely to lead to early death
• Eliminate people on drugs with same effect as test drug

Enrichment (cont.)

B. Pathophysiological - based on understanding of disease, also almost always acceptable

• Edema can result from hepatic, renal or cardiac causes. Choose the last for study of an inotrope

• CHF can result from systolic or diastolic dysfunction. Choose the former for study of an inotrope

• We distinguish (some) causes of pain: angina, vasospastic angina, migraine, menstrual pain, etc., where we believe etiologies are distinct and particular pharmacologic effects are pertinent

Enrichment (cont.)

B. Pathophysiological - based on understanding of disease

• Hypertension can be high-renin or low-renin. Could study BB’s, ACE’s, or AIIB’s in the former. There is no doubt a high renin population would show a much larger effect than a mixed population

• A well-established genetically determined difference could be the basis for a pathophysiologically selected enriched population. In some cases, a marker associated with a particular tumor characteristic or even found retrospectively to predict response could be a basis for selection. Most convincing so far are tumor genetics: Herceptin for Her2+ breast tumors; selection of ER+ breast tumors for anti-estrogen treatment

Enrichment (cont.)

C. Clinical Response - potentially more controversial - examining patient response before entry to identify likely responders, who would then be studied in a rigorous trial

1. Response other than trial response

CAST carried out in people with > 70% VPB suppression (study endpoint was survival)

Trial of topical nitrate in people with BP response to sublingual NTG

Enrichment (cont.)

2. Screen for true drug response, then randomize only the responders. Particularly powerful where a low response rate is expected

a. Past examples
• Oates, Woosley, Roden antiarrhythmic studies
• Trial of nitrate patch after screen for treadmill response to NTG
• History of response to a class

Enrichment (cont.)

b. Other possibilities

In a clinical screen, you would give drug to a group, find apparent responders, take the drug away, then randomize to drug vs. control when the sign or symptom reappears (what Oates et al. did). Promising in any setting where effectiveness is hard to show because only some of the population responds

• GI disease - notorious difficulty in showing effect of motility-modifying drugs (cisapride, domperidone). Consider open screen, then randomize the responders to placebo-controlled trial for their next episode

• Pulmonary drugs - anti-asthmatics other than beta agonists and steroids difficult to show effect. Suppose open screen to find responders, then study (Cromolyn anecdote)

Enrichment (cont.)

3. Picking people likely to have the event

• Automatic in treating symptoms or lab abnormalities – 100% of patients thought to have the condition

• Regularly used in outcome trials to find high risk patients (cholesterol trials, BP trials, CHF trials started with highest risk patients). In fact, any secondary prevention trial, whether in AMI, stroke, or breast cancer, is a high risk population

• Genetic risk factors (for cancers, Alzheimer’s Disease) can be used in this way, as could “proteomic” risk factors (high CRP, PlGF)

• Known response to provocation - Successful meal-associated heartburn trials used people who responded to a provocative meal (pizza with everything and unlimited Chianti wine). This gave a well-defined population, 100% of whom had the condition to be treated.
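For outcome trials, the gain from choosing high-risk patients can be sketched with the Schoenfeld approximation for the number of events a time-to-event comparison needs (1:1 randomization). This is a rough illustration, not a trial-planning tool: the hazard ratio of 0.75 and the event probabilities of 5% and 20% are hypothetical values chosen for the example, and the function names are invented here.

```python
from math import ceil, log
from statistics import NormalDist


def events_required(hazard_ratio, alpha=0.05, power=0.80):
    """Schoenfeld approximation for a 1:1 randomized time-to-event trial:
    d = 4 * (z_{1-alpha/2} + z_{power})^2 / (ln HR)^2."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(4 * z_sum ** 2 / log(hazard_ratio) ** 2)


def patients_required(events, event_probability):
    # event_probability: assumed average chance that an enrolled patient
    # has the endpoint event during the trial (hypothetical input).
    return ceil(events / event_probability)


if __name__ == "__main__":
    d = events_required(0.75)  # hypothetical treatment effect
    print("events needed:", d)
    for p in (0.05, 0.20):     # hypothetical low-risk vs. high-risk populations
        print(f"event probability {p:.0%}: about {patients_required(d, p)} patients")
```

The number of events is fixed by the effect to be detected; what enrichment changes is how many patients must be enrolled to observe that many events.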


Enrichment (cont.)

An enriched population must be recognized for what it is, but studies using these approaches can demonstrate an effect in the group studied far more efficiently, providing “proof of principle,” clear evidence of clinical effectiveness. Phase 3 then refines that evidence for different doses, populations, etc., but with firm assurance that the drug is effective (at least in some people)

Randomized Withdrawal

Amery in 1975 proposed a “more ethical” design for angina trials, which then often ran 8 weeks to 6 months in patients with frequent attacks (before regular CABG and angioplasty)

Patients initially receive open treatment with the test drug, then are randomized to test drug (at one or more doses) or placebo. Endpoint can be time to failure (early escape) or conventional measure (attacks per week)

Randomized Withdrawal

Now standard for anti-depressant maintenance studies, where it is the obvious choice, but useful whenever you want to assess long-term effectiveness but would not want to use a long-term placebo (BP, cholesterol, possibly blood sugar, CHF). Attractive in pediatric studies because of the short period of symptoms without treatment. [Note that in depression, where standard studies fail 50% of the time, randomized withdrawal studies almost never fail.]


Randomized Withdrawal

Illustrations

1. Vasospastic Angina

2. Cataplexy


Patients on treatment with sodium oxybate for cataplexy with narcolepsy for 7-44 months were randomized to continued treatment or placebo

Median attacks/2 weeks

Group (n)              Baseline    Change in Rate
Placebo (29)           4.0         +21.0
Sodium oxybate (26)    1.9         0

p<0.001

Clearly demonstrated persisting long-term effect

Randomized Withdrawal (cont.)

Other Possibilities

1. Confirm a subset observation

One potential source of erroneous progression to phase 3 is “over-interpretation” of a subset finding in phase 2. These are known to be “treacherous.” Such a finding could, however, be confirmed by a randomized withdrawal trial in the responder subset. A favorable response would be strong evidence of active drug

Randomized Withdrawal (cont.)

Other Possibilities

2. Confirm a dramatic response

Studies are designed to show average responses and, usually, the range of responses seen on drug and placebo is similar. That means you cannot really use the size of individual responses easily to conclude that the drug has very large effects in some people.

But a WD study might do this

When Lotronex was found to cause ischemic colitis and “surgical constipation,” one possibility, I thought, was a randomized WD trial in the “hyperresponders,” people previously disabled by IBS, still on drug, who might have been willing to help save it by their participation. I believe the study could have been completed in a few months at most. Never done.

Randomized Withdrawal (cont.)

Design has major advantages

• Efficient: “enriched,” with larger drug-placebo difference

• Efficient: patients already exist and are known, e.g., as part of an open or access protocol

• Ethical: can stop as soon as failure criterion met, very attractive in pediatrics

• Can verify a subgroup finding when overall study result is negative. Reassuring for phase 3 effort

• Potential for rigorous evaluation of good responders; possible Lotronex study of people with dramatic improvements (either from trials or open experience)

• Can easily incorporate D/R


Dose-Finding

We won’t know until failures are well-examined, but I believe an important cause of phase 3 failure is not getting the dose or dose-interval right, leading to unacceptable toxicity or inadequate effectiveness

Better Dose-Finding

Before considering designs:

The impression that dose-finding is largely completed in phase 2 is a terrible error

Phase 2 studies almost never can detect small differences in effect, and cannot give useful information on safety except for the most common events

Having all or most phase 3 studies be D/R is usual for antihypertensives and antidepressants, anti-migraines, and anti-psychotics. This should be more common


Efficiency in D/R

A. Use PD information and efficient designs to narrow range of doses to study clinically:

Where PD mechanism is well understood (ACEI’s, AIIB’s, beta blockers, inhibitors of platelet function) use the PD information, with particular attention to duration, to identify dose range (but don’t just believe it; test the expectation; sometimes clinical effect has different duration)

Efficiency

• It is well-recognized that field studies of antihistamines require large patient samples (often > 200 per group) and fail regularly nonetheless. Chamber studies (antigen introduced, don’t depend on pollen, winds) are a kind of PD study (but with a standard clinical endpoint) and require much smaller numbers

Will also need field studies, but initial dose finding surely should be in chamber studies (also better for time of onset, duration of effect). Then try to confirm in field studies

Efficiency

Consider conducting dose response studies for effectiveness in known responders to the drug or drug class to increase sensitivity, or identify responders pharmacologically, if possible. The only effect of including non-responders is to obscure (flatten) the dose-response relationship. Note, though, that non-responders may have adverse effects and cannot be ignored.

It would usually be important to test non-responders separately to see if they merely have a shift in D/R, an important discovery, if true, and, if responders are not identifiable, studies in a non-selected population would be needed to assess overall B/R.
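The flattening described above can be shown with a small deterministic sketch. It assumes a simple Emax dose-response curve in responders and no drug effect at any dose in non-responders; the Emax of 20, ED50 of 10, the doses, and the 30% responder fraction are all hypothetical numbers used only for illustration.

```python
def emax_response(dose, emax=20.0, ed50=10.0):
    """Responder dose-response under a simple Emax model (hypothetical parameters)."""
    return emax * dose / (ed50 + dose)


def population_mean_response(dose, responder_fraction, emax=20.0, ed50=10.0):
    # Assumption: non-responders show zero drug effect at every dose,
    # so the population mean is the responder curve scaled down.
    return responder_fraction * emax_response(dose, emax, ed50)


def dose_step_difference(low, high, responder_fraction):
    """Difference in mean response between two doses in a given population."""
    return (population_mean_response(high, responder_fraction)
            - population_mean_response(low, responder_fraction))


if __name__ == "__main__":
    for fraction in (1.0, 0.3):
        diff = dose_step_difference(10, 40, fraction)
        print(f"responder fraction {fraction:.0%}: "
              f"mean difference between 10 and 40 units = {diff:.1f}")
```

In this sketch the between-dose difference that the study must detect shrinks in direct proportion to the responder fraction, while patient-to-patient variability does not shrink with it.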


Efficiency

Other areas (in addition to hypertension) where looking at responders in dose response studies could be useful:

• Asthma drugs (Cromolyn anecdote)
• All symptomatic GI conditions
• Anti-arrhythmics
• Anti-depressants
• BPH drugs


Efficiency

B. Study a full range of doses in phase 3 to establish dose response for both favorable and unfavorable effects and to locate less than fully effective dose that may still be useful. Also, if possible, try to see whether sub-effective dose represents some people responding fully or all people responding a little. Doing this may need individual dose response curves and crossover designs


Efficiency

C. Examine maintenance dose

When dose-finding occurs, it is almost always during initial treatment. For long half-life drugs particularly, but others too, examining the maintenance dose response could be very useful

• If 20 mg of fluoxetine works acutely and drug and metabolites have half lives well > 1 week, the maintenance dose is surely well under 20 mg, based solely on PK arguments. Lower maintenance doses could lead to a wide range of safety advantages. This has never been studied
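The PK argument can be illustrated with the textbook accumulation ratio for repeated dosing. This is only a sketch under strong assumptions: a one-compartment model with first-order elimination, once-daily dosing, and a 7-day half-life used as a hypothetical lower bound for "well > 1 week." It is not a model of fluoxetine itself, and the function names are invented here.

```python
from math import exp, log


def accumulation_ratio(half_life_days, dosing_interval_days=1.0):
    """Steady-state exposure relative to the first dose:
    R = 1 / (1 - exp(-k * tau)), with k = ln 2 / half-life."""
    k = log(2) / half_life_days
    return 1.0 / (1.0 - exp(-k * dosing_interval_days))


def fraction_of_steady_state(days_on_treatment, half_life_days):
    """Fraction of the eventual steady-state level reached after a given time."""
    k = log(2) / half_life_days
    return 1.0 - exp(-k * days_on_treatment)


if __name__ == "__main__":
    half_life = 7.0  # hypothetical, in days
    print(f"accumulation ratio: {accumulation_ratio(half_life):.1f}")
    for day in (7, 14, 28):
        print(f"day {day}: {fraction_of_steady_state(day, half_life):.0%} of steady state")
```

With these assumptions, daily dosing builds up to roughly ten times the exposure of a single dose, and longer half-lives give larger ratios, so exposure on a fixed daily dose keeps rising for weeks after the drug first appears to work.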

Efficiency

• It would not be surprising if the dose needed to treat acute exacerbations of mania, depression, and other diseases was larger than the dose needed to maintain patients. Perhaps alternate day dosing would work. (Could this be true for Lotronex? Would lower doses have given less constipation or even less ischemic colitis?)

• Astemizole has a long half-life but was used acutely in seasonal allergies. It could have been used as a loading dose of 10 mg with subsequent lower doses < 3 mg. That would have placed dose at about 1/5 of QT prolonging dose, instead of at 1/2. The drug might still be available


Efficiency

Maintenance dose-response studies are easy. Use randomized withdrawal design. People on treatment, doing well, are randomized to placebo and several doses of the drug


Lots of Possibilities

The alternative study designs I’ve shown have all been used

They may make people nervous but they have one feature in common. They are all rigorous RCTs so they definitely tell you something. There could be debates about generalizability, effects in the “rest” of the population, etc., but they are worth considering

As early “proof of concept” trials, there will be little regulatory controversy but greater assurance


Efficiency

D. Consider Sheiner optional titration design with mixed effects modeling (NONMEM), a “learn-confirm” approach (although he didn’t mention it in his learn-confirm paper)

Titration studies are a recognized problem (ICH E4) because they confound increased dose and increased duration, are hard to use for safety evaluation, and do not represent randomized comparisons, leading, if examined naively, to odd results (umbrella-shaped dose response because only poor responders are up-titrated)

Sheiner, Beal, and Sambol solved some of that problem by looking at individual dose response information within the group
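The "umbrella-shaped" artifact of a naive titration analysis can be reproduced with a toy example. The sketch below is not Sheiner's method and does not use mixed effects modeling; it only shows why grouping patients by final titrated dose misleads. The five patients, their individual Emax and ED50 values, the dose steps, and the response target are all hypothetical.

```python
from collections import defaultdict

DOSES = [10, 20, 40]   # hypothetical titration steps
TARGET = 10.0          # hypothetical response at which up-titration stops

# Hypothetical patients: (individual Emax, individual ED50).
# Every individual curve rises with dose; patients differ in sensitivity.
PATIENTS = [(12, 1), (30, 25), (28, 20), (10, 20), (6, 10)]


def response(emax, ed50, dose):
    """Individual response under a simple Emax model."""
    return emax * dose / (ed50 + dose)


def titrate(emax, ed50):
    """Return (final dose, response at final dose) under optional titration."""
    for dose in DOSES:
        r = response(emax, ed50, dose)
        if r >= TARGET:
            break  # good enough response: stay at this dose
    return dose, r


def naive_mean_by_final_dose():
    """Naive analysis: average response grouped by the dose each patient ended on."""
    groups = defaultdict(list)
    for emax, ed50 in PATIENTS:
        dose, r = titrate(emax, ed50)
        groups[dose].append(r)
    return {dose: sum(v) / len(v) for dose, v in sorted(groups.items())}


if __name__ == "__main__":
    for dose, mean in naive_mean_by_final_dose().items():
        print(f"final dose {dose}: naive mean response {mean:.1f}")
```

Here the naive means rise from the lowest to the middle dose and then fall at the highest dose, even though no patient responds less at a higher dose; the top-dose group simply collects the poor responders. Looking at each patient's own dose-response data, as the individual-level approach does, avoids that selection effect.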

Efficiency

Sheiner didn’t use a placebo, so some of the effect seen is not drug effect (tends to underestimate, i.e., left-shift, the ED50)

But a placebo controlled titration design should be useful in any stable condition (BP, cholesterol, chronic pain, e.g., osteoarthritis) in which a symptomatic, not disease-modifying, treatment is used

Much more efficient than randomized, parallel dose-response, because you can study many doses in only 2 groups; number of doses is limited only by rate of dose escalation and acceptable duration of placebo period

Could be very valuable initial study (phase 2)
• Shows effect vs. placebo; i.e., one A&WC study
• Locates definitive dose-response study

NEVER USED


Conclusions

This is a sample of designs that can be used to give a more certain answer about effectiveness in phase 2. It is just one of many possible approaches to improving translational efforts and making drug development more efficient
