Elegant Alternatives to Randomized Trials for Determining Treatment Efficacy (or Harm)
Thomas B. Newman, MD, MPH
Professor of Epidemiology and Biostatistics and Pediatrics, UCSF

Lecture Outline
- Background
- Instrumental variables and natural experiments
- Measuring additional unrelated variables to estimate bias
- Propensity scores
- Illustrations using phototherapy for jaundice

Background
- Why do RCTs?
  - Assemble comparable groups (avoid confounding)
  - Allow blinding (to avoid placebo effect, cointerventions, and bias in measuring the outcome variable)
- Observational studies
  - May be able to assemble comparable groups or use statistical adjustment
  - Won't be blinded

Why is it hard to assemble comparable groups without randomizing?
- People who get treated differ from those who don't
- Important differences are with respect to risk of the outcome
  - Treated people are often at higher risk (confounding by indication for treatment)
  - Treated people may be at lower risk (selection bias)

Pre-test
"Observational studies can never establish causation. Proof of causality requires randomized trials."
True or false?
FALSE

When causal inference from observational studies is easy
- Outcomes are not related to indications for treatment, are highly localized in time or space, or are well understood biologically
  - Liver failure following acetaminophen overdose
  - Fluids for dehydration
  - Skin sloughing after infiltrate of calcium infusion
  - Birth defects from isotretinoin
  - Land mines and limb injuries

Post-test
"Observational studies can never establish causation. Proof of causality requires randomized trials."
True or false?
FALSE

When it's hard:
- Outcomes are related to indications or selection for treatment
  - Learning disabilities in children treated with anticonvulsants
  - Suicide in users of antidepressants
  - Mortality after surgery for gastroesophageal reflux in children

Natural Experiments and Instrumental Variables
- Find a time or place where receipt of treatment was unlikely to be related to prognosis
  - E.g., time-series analyses where something changed (e.g., a new intervention became available)
- Instrumental variables (IV): measurable factors that influence the probability of treatment and are not otherwise associated with the outcome

Use of large databases
- Allows use of (weak) surrogate measures for the actual predictor
- Biased towards the null
- Achieve statistical significance with large sample size
- Algebraically reverse the bias towards the null (with various assumptions)

Delayed Effects of the Military Draft on Mortality
- Origin of study: Agent Orange concern
- Design: "Randomized natural experiment" using the draft lottery
- Data source: computerized death certificate registries, CA and PA
- Predictor variable of interest: military service
Hearst N, Newman TB, Hulley SB. NEJM 1986;314:620-24

Why not compare outcomes according to the predictor variable of interest?
- Biased comparison: those who serve in the military start out healthier
- "Healthy warrior effect"

Delayed Effects of the Military Draft on MortalityThe instrumental variable measured: draft lottery number below cutoff (based on date of birth) IV associated with predictor variable of interest, not independently associated with outcome

Year of Birth   Highest Number Drafted
1950            195
1951            125
1952            95

BUT: Having an eligible number was a poor measure of military service:

Lottery category   Proportion serving
Eligible           25.6%
Not eligible       9.3%

Algebraic Correction 1:
- Assume death rates in eligible ($R_I$) and ineligible ($R_C$) men are weighted averages of the rates among those serving ($A$) and not serving ($B$)
- Then if $p_1$ and $p_2$ are the proportions serving in the eligible and not eligible groups, $R_I$ and $R_C$ are:

$$R_I = p_1 A + (1 - p_1) B$$
$$R_C = p_2 A + (1 - p_2) B$$

Algebraic Correction 2:
- What we want to know is the relative risk for military service ($A/B$)
- What we have is the relative risk for draft eligibility ($R_I/R_C$)
- Then with algebra it can be shown that:

$$\frac{A}{B} = \frac{1 - R_I/R_C}{p_2 (R_I/R_C) - p_1} + 1$$
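The "with algebra" step can be spelled out. The following is a sketch of one way to get there from the two weighted-average equations; the shorthand r = R_I/R_C and x = A/B is introduced here for brevity and is not from the slides.

```latex
\begin{align*}
r = \frac{R_I}{R_C}
  &= \frac{p_1 A + (1 - p_1) B}{p_2 A + (1 - p_2) B}
   = \frac{p_1 x + (1 - p_1)}{p_2 x + (1 - p_2)} \\
r\,p_2\,x + r\,(1 - p_2) &= p_1 x + (1 - p_1) \\
x\,(r\,p_2 - p_1) &= (1 - r) + (r\,p_2 - p_1) \\
x = \frac{A}{B} &= \frac{1 - r}{r\,p_2 - p_1} + 1
\end{align*}
```

When eligibility has no effect (r = 1), the expression reduces to A/B = 1, as it should.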

Results

Cause of Death            Observed RR   Projected RR
Suicide                   1.13**        1.86
Motor-vehicle accidents   1.08*         1.53
All causes                1.04*         1.25

** P < 0.01; * P < 0.05
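To make the projection concrete, here is a minimal Python sketch that applies the algebraic correction, A/B = (1 - RI/RC) / (p2 × RI/RC - p1) + 1, to the observed relative risks, using the proportions serving from the lottery table (25.6% and 9.3%). The function and variable names are illustrative only and do not come from the original study.

```python
def projected_rr(observed_rr, p_eligible, p_ineligible):
    """Relative risk for military service (A/B), given the relative risk
    for draft eligibility (RI/RC) and the proportions serving in the
    eligible (p1) and ineligible (p2) groups."""
    r = observed_rr
    return (1.0 - r) / (p_ineligible * r - p_eligible) + 1.0


P_ELIGIBLE = 0.256    # proportion serving among men with eligible numbers
P_INELIGIBLE = 0.093  # proportion serving among men without eligible numbers

# Observed RRs for draft eligibility, as reported on the Results slide
observed = {
    "Suicide": 1.13,
    "Motor-vehicle accidents": 1.08,
    "All causes": 1.04,
}

projected = {
    cause: projected_rr(rr, P_ELIGIBLE, P_INELIGIBLE)
    for cause, rr in observed.items()
}

for cause, rr in projected.items():
    print(f"{cause}: projected RR = {rr:.2f}")
```

With these rounded inputs the sketch gives about 1.86 for suicide and 1.25 for all causes, matching the slide. For motor-vehicle accidents it gives about 1.51 rather than 1.53; the likely explanation (an assumption, not stated in the slides) is that the published figure was computed from unrounded inputs.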

Health effects of breast feeding
- Can't do RCT of breast-feeding
- Can do RCT of breast-feeding PROMOTION
  - Need VERY large sample size
  - Algebraic correction

Promotion of Breastfeeding Intervention Trial (PROBIT)
- Cluster-randomized trial at 31 sites in Belarus
- Subjects: 17,046 term singleton infants >2500 g, initially breastfed
- Intervention: WHO/UNICEF Baby Friendly Hospital Initiative
- Outcomes: BF at 3, 6, 9, and 12 months, and allergic, gastrointestinal, and respiratory disease
- F/U to 12 months on 16,491 (96.7%)
Kramer MS, et al. JAMA 2001;285:413-20.

PROBIT, cont'd
- RQ#1: Does a Baby Friendly Hospital increase exclusive breastfeeding?
  - Predictor = Group assignment
  - Outcome = Exclusive breast feeding
  - Intention-to-treat (ITT) analysis is fine
- RQ#2: How much does exclusive breastfeeding reduce the risk of atopic eczema in the infant? (What is NNEBF*?)
  - Predictor = Exclusive breast feeding
  - Outcome = Atopic eczema
  - ITT won't work: too much misclassification
* Number Needed to Exclusively Breast Feed

Results
- Exclusive BF at 3 months (rounded): 40% vs 5% (P < 0.001)
- Eczema: 3.3% vs 6.3%; adjusted OR = 0.54 (95% CI 0.31-0.95, based on GLIMMIX)
- Question: if the risk difference and risk ratio in this study are 3% and 0.54, what can we say about the values for exclusive breast feeding (as opposed to treatment allocation)?

Question:
- What is the true effect of breast feeding, undiluted by misclassification bias?
- Might be relevant for helping a working mother decide whether to breastfeed exclusively (NNEBF)

Algebraic correction (simplified) 1
Assume:
- There is a rate of eczema for breast-fed infants ($A$) and a different rate for formula-fed infants ($B$)
- These rates are not dependent on group assignment
Then if $p_1$ and $p_2$ are the proportions breastfed in the intervention and control groups, the observed rates of eczema in the two groups, $R_I$ and $R_C$, are:

$$R_I = p_1 A + (1 - p_1) B$$
$$R_C = p_2 A + (1 - p_2) B$$

Algebraic correction (simplified) 2
To obtain the risk difference, we first subtract the two equations:

$$R_I = p_1 A + (1 - p_1) B$$
$$R_C = p_2 A + (1 - p_2) B$$
$$R_C - R_I = (p_2 - p_1) A - (p_2 - p_1) B$$
$$R_C - R_I = (p_2 - p_1)(A - B)$$
$$B - A = \frac{R_C - R_I}{p_1 - p_2}$$

So the difference in risk of eczema for exclusive BF is:

$$\frac{6.3\% - 3.3\%}{40\% - 5\%} = 8.6\%$$

NNEBF and caveat
- Since the estimated risk difference is 8.6%, the NNEBF to prevent 1 case of eczema is about 12
- Caveats:
  - Results are for the effect of breastfeeding in response to the intervention
  - Assumes the only effect of the Baby Friendly Hospital is via the difference in exclusive breastfeeding
  - Similarly, effects of the draft lottery only apply to those who served as a result of the lottery
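The PROBIT arithmetic can be checked with a few lines of Python. This is a minimal sketch using the rounded figures from the Results slide (eczema 6.3% vs 3.3%, exclusive breastfeeding 40% vs 5%); the function name is illustrative, not from the trial report.

```python
def corrected_risk_difference(risk_control, risk_intervention,
                              p_intervention, p_control):
    """B - A = (RC - RI) / (p1 - p2): the risk difference attributable to
    the exposure itself, scaled up from the difference between arms."""
    return (risk_control - risk_intervention) / (p_intervention - p_control)


# Rounded values from the PROBIT Results slide
rd = corrected_risk_difference(
    risk_control=0.063,       # eczema in control arm (RC)
    risk_intervention=0.033,  # eczema in intervention arm (RI)
    p_intervention=0.40,      # exclusively breastfed at 3 months, intervention (p1)
    p_control=0.05,           # exclusively breastfed at 3 months, control (p2)
)
nnebf = 1.0 / rd  # Number Needed to Exclusively Breast Feed

print(f"Risk difference for exclusive BF: {rd:.1%}")
print(f"NNEBF: {nnebf:.1f}")
```

The intention-to-treat difference of 3 percentage points is divided by the 35-point difference in exposure between arms, which is why the corrected estimate is nearly three times larger.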

Summary/other examples
- If variables known NOT to be associated with the outcome are associated with the treatment of interest, consider this approach.
- Generalizes to many natural experiments.
- E.g., an intervention is intermittently available, or only available to certain groups: different outcome by day of the week, etc.

More natural experiments:
- Costs of discontinuity of care: increased laboratory test ordering in patients transferred to a different team the next morning*
- Effect of ER copay: rate of appendicitis perforation unchanged after increase in copay**
- Aircraft cabin air recirculation and symptoms of the common cold: no difference by type of air recirculation in aircraft***
* Help me find this article! It was from a VA
** Hsu J, et al. Presented at Bay Area Clinical Research Symposium 10/17/03
*** Zitter JN et al. JAMA 2002;288:483-6

Unrelated variables to estimate bias or confounding
- Measure an outcome that WOULD be affected by bias, but not by the intervention (and see if it is)
- Measure a predictor that WOULD cause the same bias as the predictor of interest (and see if it does)

Observational study of screening sigmoidoscopy
- Possible bias: patients who agree to sigmoidoscopy are likely to be different
- Solution: measure an outcome that would be similarly affected by bias
- Results:
  - Decreased deaths from cancers within the reach of the sigmoidoscope (OR = 0.41)
  - No effect on deaths from more proximal cancers (OR = 0.96)
Selby et al. NEJM 1992;326:653-7

Effect of British breathalyser crackdown
- Abrupt drop in accidents occurring during weekend nights (when pubs are open)
- Measure an outcome that would be affected by bias: accidents during other times
- Result: no change in accidents occurring during other hours
See Cook and Campbell: Quasi-Experimentation. Boston: Houghton Mifflin, p. 219

Calcium Channel Blockers (CCB) and AMI
- Population-based case-control study at Group Health
- Progressive increase in risk of AMI with higher doses of CCB (P

Suicide Risk in Bipolar Disorder During Treatment With Lithium and Divalproex
- Retrospective cohort study of Kaiser Permanente and Group Health patients with bipolar disorder
- Compared with no treatment, patients treated with divalproex had 2.1 times the suicide risk
- Concern: confounding by indication
- Results: suicides per 1000 person-years
  - 31.3 for treatment with divalproex
  - 15 for no treatment (P

Initial Mood Stabilizer Prescription by Year of Initial Diagnosis
Goodwin et al. JAMA 2003;290:1467-1473

Estimating biases: Cautionary Tale
- Nurses' Health Study*
  - Vitamin E assoc. with decreased risk of CHD (RR ~0.6)
  - No significant effect of multiple vitamins
- Health Professionals Study**
  - Vitamin E assoc. with decreased risk of CHD (RR ~0.6)
  - No significant effect of Vitamin C
* N Engl J Med. 1993 May 20;328:1444-9
** N Engl J Med. 1993 May 20;328:1450-6

Propensity Scores - 1
- Big picture: want to know if the association between treatment and outcome is CAUSAL
- Recall competing explanation = confounding by indication for treatment:
  - Factor must be associated with outcome
  - Factor must be associated with treatment
- Traditional approach: adjust for factors associated with outcome

Propensity Scores - 2
- Alternative approach: create a new variable, propensity to be treated with the intervention
- Then match, stratify, or include it in multivariable analyses
- Advantages:
  - Better power to control for covariables (because receipt of the intervention may be much more common than occurrence of the outcome)
  - You can more easily tell when treated and untreated groups are not comparable

How Much Overlap In The Propensity Scores Do We Want?

Example: Aspirin use and all-cause mortality among patients being evaluated for known or suspected Coronary Artery Disease
- RQ: Does aspirin reduce all-cause mortality in patients with coronary disease?
- Design: Cohort study
- Subjects: 6174 consecutive patients getting stress echocardiograms
- Predictor: ASA use
- Outcome: All-cause mortality
JAMA 2001;286:187

Analysis using Propensity Scores
- Two multivariable analyses:
  - Predictors of aspirin use
  - Predictors of death
- Predictors of ASA use turned into a propensity score
- Users and non-users of ASA matched on ASA propensity score
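As an illustration of what "matched on propensity score" can mean in practice, here is a minimal Python sketch of greedy 1:1 nearest-neighbor matching within a caliper. This is one common approach and is an assumption here: the slides do not say which matching algorithm the aspirin study used. The subject IDs, scores, and the 0.05 caliper are made up for the example.

```python
def greedy_match(treated, untreated, caliper=0.05):
    """Pair each treated subject with the closest still-unused untreated
    subject, provided their propensity scores differ by at most `caliper`.

    treated, untreated: dicts mapping subject id -> propensity score.
    Returns a list of (treated_id, untreated_id) pairs.
    """
    available = dict(untreated)  # copy, so the caller's dict is unchanged
    pairs = []
    # Process treated subjects in order of increasing propensity score
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        best_id = min(available, key=lambda u: abs(available[u] - t_score))
        if abs(available[best_id] - t_score) <= caliper:
            pairs.append((t_id, best_id))
            del available[best_id]  # match without replacement
    return pairs


# Hypothetical propensity scores (probability of ASA use)
asa_users = {"u1": 0.30, "u2": 0.55, "u3": 0.95}
non_users = {"n1": 0.10, "n2": 0.32, "n3": 0.52, "n4": 0.60}

pairs = greedy_match(asa_users, non_users)
print(pairs)
```

In this toy example the user with propensity 0.95 has no non-user within the caliper and is dropped, which is why a matched analysis includes fewer subjects than the full cohort.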

Survival in Propensity-Matched Patients
Recall total N = 6174

Limitations
- Can only compare subjects whose propensity scores overlap
- Can only generalize to subjects who could have received either treatment
- Important variables may be missing from your model

Illustration: Phototherapy for Neonatal Jaundice
- RQ: How effective is phototherapy in babies with TSB 20-22.9 mg/dL?
- Subjects: Newborns at NC-KPMCP 2000 g, 34 wks with TSB 20-22.9 mg/dL at 48 hr (N=1777)
- Intervention: Phototherapy within 8 hours of TSB 20-22.9 mg/dL (N=635, 36%)
- Outcome: TSB 25 mg/dL (N=21, 1.2%)

Logistic regression
- Phototherapy only: OR = 0.30 (P = 0.05)
- Phototherapy + gest age: OR = 0.28 (P = 0.04)
- Phototherapy + gest age + rate of rise: OR = 0.12 (P = 0.002)

Propensity analysis
- Step 1: predictors of PT within 8 hr of TSB 20-22.9
  - Rate of rise of TSB, gestational age, race, sex, maternal age, hospital of birth, etc.
- Generate new variable, propensityPT = predicted probability of PT
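The "predicted probability of PT" comes from the fitted logistic model: the linear predictor is passed through the inverse logit. A minimal sketch follows, assuming the model has already been fitted. The coefficient values and covariate names below are invented purely for illustration and are not the study's estimates.

```python
import math


def propensity(intercept, coefficients, covariates):
    """Predicted probability of treatment from a fitted logistic model:
    p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(
        coefficients[name] * covariates[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear))


# HYPOTHETICAL coefficients (log odds); not taken from the phototherapy study
INTERCEPT = 6.0
COEFS = {"rate_of_rise": 4.0, "gest_age_wks": -0.2}

slow_riser_term = {"rate_of_rise": 0.2, "gest_age_wks": 38}
fast_riser_preterm = {"rate_of_rise": 0.5, "gest_age_wks": 35}

propensity_low = propensity(INTERCEPT, COEFS, slow_riser_term)
propensity_high = propensity(INTERCEPT, COEFS, fast_riser_preterm)

print(f"propensityPT, slow rise at 38 wks: {propensity_low:.2f}")
print(f"propensityPT, fast rise at 35 wks: {propensity_high:.2f}")
```

Each newborn gets a single number between 0 and 1 that summarizes all the measured predictors of treatment, and that number can then be used for matching, stratification, or adjustment.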

Propensity by whether PT received

Logistic Regression With Propensity Score
- Phototherapy only: OR = 0.30 (P = 0.05)
- Phototherapy + gest age: OR = 0.28 (P = 0.04)
- Phototherapy + gest age + rate of rise: OR = 0.12 (P = 0.002)
- Phototherapy + propensity score: OR = 0.13 (P = 0.002)

Efficacy of Phototherapy (PT) for Neonatal Jaundice
- Large interfacility practice variation in use of phototherapy in the NC KPMCP
- Hospital of birth is thus an IV for phototherapy use
- We can use individual-level data to adjust for other risk factors for TSB 20 mg/dL

Instrumental variable
- N too small for IV for TSB 25 mg/dL, so predict TSB 20 mg/dL
- For each hospital, calculate the proportion of newborns in group C (AAP "consider PT" group) who received phototherapy
- Use this proportion as a predictor of TSB 20 mg/dL in individual-level analyses
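The data-preparation step described here (a hospital-level phototherapy rate computed from group C, then attached to every newborn as a predictor) can be sketched in Python as follows. The record layout, field names, and toy data are hypothetical; the subsequent regression is not shown.

```python
from collections import defaultdict


def hospital_pt_proportion(records):
    """Proportion of group C ("consider PT") newborns who received
    phototherapy, computed separately for each hospital."""
    treated = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        if r["group"] == "C":
            total[r["hospital"]] += 1
            treated[r["hospital"]] += int(r["phototherapy"])
    return {h: treated[h] / total[h] for h in total}


def attach_instrument(records, proportions):
    """Add the hospital-level PT rate to each individual record so it can
    be used as a predictor in an individual-level model."""
    return [
        dict(r, hospital_pt_rate=proportions[r["hospital"]])
        for r in records
        if r["hospital"] in proportions
    ]


# Toy data (hypothetical): two hospitals with different practice styles
records = (
    [{"hospital": "H1", "group": "C", "phototherapy": True}]
    + [{"hospital": "H1", "group": "C", "phototherapy": False}] * 3
    + [{"hospital": "H1", "group": "R", "phototherapy": True}]
    + [{"hospital": "H2", "group": "C", "phototherapy": True}] * 3
    + [{"hospital": "H2", "group": "C", "phototherapy": False}]
)

proportions = hospital_pt_proportion(records)
with_iv = attach_instrument(records, proportions)
print(proportions)
```

The point of the design is that a newborn's value of `hospital_pt_rate` depends on where the baby was born, not on that baby's own prognosis, so it can stand in for individual treatment in the outcome model.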

Group R: AAP RECOMMENDS phototherapy
Group C: AAP says CONSIDER phototherapy
Atkinson L, Escobar G, Takayama J, Newman TB. Pediatrics 2003;111:e555-61

[Figure: Phototherapy Use When Eligible for Consider or Recommended. Percent receiving phototherapy, by facility, shown separately for Group R and Group C.]

Table 1: AAP Guideline for Phototherapy Use
Columns: Age; No Phototherapy; Consider Phototherapy; Recommend Phototherapy

Rate of hyperbilirubinemia by PT use in 11 hospitals, 1995-6

IV analysis to Predict TSB 20 mg/dL

Columns: Odds Ratio; P
Gestational age (wks): 0.651

Summary
- RCTs are most definitive, but
  - Not always feasible or necessary
- Look for opportunities to answer questions with observational studies
  - Instrumental variables
  - Natural experiments
  - Propensity scores
  - Estimating biases

Potential instrumental variables analysis: look at suicide attempts by YEAR, both overall and in treatment groups separately.
Situation A: nothing to compare; you're done
Situation B: no need for propensity score; direct matching will do; little or no selection bias
Situation C: near-optimal situation for PS; overlap in covariates allows matching to substantially improve the selection bias problem
Slide courtesy of Thomas Love
Codes indicating which facility is which may be obtained from Chiefs of Pediatrics.
