Confounding and effect modification Epidemiology 511 W. A. Kukull November 23 2004


Confounding

• “A function of the complex interrelationships between various exposures and disease”.

• Occurs when the disease - exposure association under study is “mixed” with the effect of another factor

Example (after Rothman, 1998)

• Is frequent beer consumption associated with rectal cancer?

• Beer consumption is associated with consumption of pizza

• Is pizza consumption a confounder?
– Is pizza, by itself, causally associated with rectal Ca?
– If yes, then it's a confounder; otherwise not

Beer and Rectal Ca

            Rectal Ca   Control
Beer           630        770
No Beer        770        630

OR = 0.67 (0.58 - 0.78)

Pizza consumption?

                 Pizza: Yes               Pizza: No
            Rectal Ca   Control      Rectal Ca   Control
Beer           350        700           280         70
No Beer         70        280           700        350
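The crude and pizza-stratified tables can be compared numerically. The sketch below is illustrative only: it assumes the stratified layout in which the rows are Beer / No Beer and each pizza stratum has Rectal Ca and Control columns (the layout whose cells add up to the crude table), and it uses Woolf's logit method for the confidence interval, which is an assumption because the slides do not say how the interval was computed. The helper names `odds_ratio` and `woolf_ci` are hypothetical.

```python
import math

def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: a, b = exposed cases, controls; c, d = unexposed cases, controls."""
    return (a * d) / (b * c)

def woolf_ci(a, b, c, d, z=1.96):
    """Approximate 95% CI from the standard error of ln(OR) (Woolf's logit method)."""
    log_or = math.log(odds_ratio(a, b, c, d))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return math.exp(log_or - z * se), math.exp(log_or + z * se)

# Crude table: beer vs. rectal Ca
crude = (630, 770, 770, 630)
# Assumed stratum layout (cells sum to the crude table)
pizza_yes = (350, 700, 70, 280)
pizza_no = (280, 70, 700, 350)

crude_or = odds_ratio(*crude)
crude_lo, crude_hi = woolf_ci(*crude)
or_pizza_yes = odds_ratio(*pizza_yes)
or_pizza_no = odds_ratio(*pizza_no)

print(f"crude OR = {crude_or:.2f} ({crude_lo:.2f} - {crude_hi:.2f})")
print(f"OR, pizza yes = {or_pizza_yes:.2f}; OR, pizza no = {or_pizza_no:.2f}")
```

Under this layout the two stratum ORs agree with each other but sit on the opposite side of 1.0 from the crude OR, which is the pattern the later slides describe as confounding.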

Confounding (after Rothman, 1998)

• Confounding factor must be a risk factor for disease (causally associated)

• Confounding factor must be associated with exposure in the source (study) population

• Confounding factor must not be affected by exposure or disease
– it cannot be the result of exposure
– it cannot be an intermediate step in the causal path

Confounding (after Koepsell & Weiss, 2003)

• A factor that occurs only as a consequence of the exposure cannot distort (confound) the disease-exposure association.

• To be a confounder, the factor would have to give rise to the exposure or be associated with something that did.

• “No matter how strongly a variable is related to exposure status, if it is not also related to the occurrence of the disease in question, it cannot be a confounder.”

Confounder – Exposure – Disease: some finer distinctions (Koepsell & Weiss, 2003)

• A confounder can be an actual cause of disease.

• A confounder can be associated with a cause of disease that, in the context of the study, cannot be measured (e.g., genotype).

• A variable can be a confounder if it is related to the recognition of the disease even if it has no relationship to the actual occurrence of disease (e.g., frequency of screening tests for disease).

[Diagram: a Confounder linked to both Exposure and Disease; the Exposure - Disease association is marked "?". Legend: line styles distinguish non-causal from causal links.]

Confounding: diagram examples (in each, the exposure - outcome association is the one in question)

• Country (exp) and Mortality: Age distribution (conf)

• Sexual activity (exp) and Mortality: General health (conf)

• Ca channel blockers (exp) and GI bleeding: Other meds (conf)

• Vitamin C intake (exp) and Colon cancer: Diet, SES, lifestyle (conf)

• Low fat diet (exp) and Heart disease: Cholesterol (conf ?) is a consequence of exposure

• Smoking and Lung Ca: Weight loss (conf ?) is a consequence of disease

• Quetelet index and Type II diabetes: Abdominal skinfold (conf ?); skinfold is a surrogate measure of body mass

• Red meat diet and Colon Ca: Tax ID number (conf ?) has no plausible association with disease

Confounder or consequence?

• Studying the decreased risk of MI due to moderate alcohol consumption

• Higher HDL cholesterol is independently associated with lower risk of MI

• HDL increases as a result of moderate alcohol use

• Is HDL a Confounder?

Controlling confounding in the design of a study

• Randomization: tends to distribute known and unknown confounders evenly across study groups (on average)

• Restriction: limit subjects to one category of a confounder, e.g. if sex confounds, use only men

• Matching: equalize groups on confounder (must follow matched analysis)

Evaluating confounder, disease and exposure

• Construct tables for
– confounder and disease
– confounder and exposure

• Examine odds ratios (or other effect estimates)
– are the associations “strong”?
– are they likely to be “causal”?
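As a rough illustration of these two tables, the sketch below reuses the beer and pizza example. The counts assume the stratified layout in which rows are Beer / No Beer and each pizza stratum has Rectal Ca and Control columns; the choice to look at the confounder-disease association among the unexposed and the confounder-exposure association among controls is one common convention, not something the slides prescribe.

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table laid out row-wise as [[a, b], [c, d]]."""
    return (a * d) / (b * c)

# Confounder and disease, among non-beer drinkers (assumed counts):
#   rows = pizza yes / pizza no, columns = rectal Ca / control
pizza_disease_or = odds_ratio(70, 280, 700, 350)

# Confounder and exposure, among controls (assumed counts):
#   rows = pizza yes / pizza no, columns = beer / no beer
pizza_beer_or = odds_ratio(700, 280, 70, 350)

print(f"pizza - rectal Ca OR (non-beer drinkers): {pizza_disease_or:.3f}")
print(f"pizza - beer OR (controls): {pizza_beer_or:.1f}")
```

Both odds ratios are far from 1.0 in this teaching data set, so pizza is associated with the disease and with the exposure; whether either association is causal still has to be judged on subject-matter grounds.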

Stratification in analysis: adjusting for confounding

• Computing the crude OR from a 2x2 table

• Stratification breaks the crude table into separate 2x2 tables for each level of the confounding factor
– analogous to “standardization”
– many factors and many levels can cause tables with empty cells

Is there Confounding?

• Do stratum-specific RR estimates differ from the crude estimate?

• Does the “adjusted” RR estimate differ from the crude estimate?
– Mantel-Haenszel
– Multivariate modeling

• As a rule of thumb, a change of more than 10% in the RR when the factor is included in the model suggests that confounding is present
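The 10% rule of thumb can be written as a small check. This is a sketch under stated assumptions: the change is expressed relative to the crude estimate (some authors divide by the adjusted estimate instead), and the function names and the example values of 2.0, 1.7 and 1.9 are hypothetical, not taken from any study in these slides.

```python
def percent_change(crude, adjusted):
    """Absolute change from crude to adjusted estimate, as a percentage of the crude."""
    return 100.0 * abs(adjusted - crude) / crude

def suggests_confounding(crude, adjusted, threshold=10.0):
    """True when the estimate moves by more than `threshold` percent after adjustment."""
    return percent_change(crude, adjusted) > threshold

# Hypothetical values for illustration only
flag_large_shift = suggests_confounding(crude=2.0, adjusted=1.7)  # about a 15% change
flag_small_shift = suggests_confounding(crude=2.0, adjusted=1.9)  # about a 5% change
print(flag_large_shift, flag_small_shift)
```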

Confounding in stratified analyses

• Stratify by the potential confounder

• Compute stratum-specific OR estimates

• If uniform but different from the crude OR, then confounding is probably present:
– calculate an adjusted OR (e.g., use Mantel-Haenszel)

• If NOT uniform across strata, then “effect modification” (interaction) may be present
– report stratum-specific estimates; do not adjust

Is toluene exposure associated with Diabetes?

                      Diabetes   Ctrl
Exposed to Toluene       30       18
Not Exposed              70       82

Crude OR = 1.95 (1.0 - 3.8)

Does Age confound the diabetes - toluene association?

               Age <40               Age >40
           Diabetes   Ctrl       Diabetes   Ctrl
Tolu.          5        8           25       10
Not           45       72           25       10

OR(1) = 1.0 (0.3 - 3.1)          OR(2) = 1.0 (0.4 - 2.8)

Why? Age confounds because it is associated with diabetes, regardless of toluene exposure

          Toluene exposed          No Toluene
           Diab    Ctrl           Diab    Ctrl
>40         25      10             25      10
<40          5       8             45      72

OR = 4.0 (1.1 - 14.7)            OR = 4.0 (1.8 - 9.0)

Stratification example 1

• Crude OR = 1.95

• OR in each age group is 1.0
– when the stratum ORs are roughly equal to each other, but different from the crude OR, it indicates confounding

• Age is a confounder

• We should adjust for Age in the analysis
– Mantel-Haenszel adjusted OR (you will not need to memorize the formula)

ETOH and MI

                 MI    No MI
Alcohol  Yes     71     52
         No      29     48

OR = 2.26 (1.26 - 4.04)

Stratify by smoking

              Non smokers            Smokers
              MI     Ctrl          MI     Ctrl
ETOH  Yes      8      16           63      36
      No      22      44            7       4

OR = 1.0 (0.38 - 2.65)           OR = 1.0 (0.29 - 3.45)

Physical Activity and Stroke

               Stroke   No Stroke
P. A.  High     190       266
       Low      176       157

OR = 0.64 (0.48 - 0.85)

Stratify by Gender

                  Men                  Women
             Stroke   Ctrl        Stroke   Ctrl
P.A.  Hi      141     208           49      58
      Lo      144     112           32      45

OR = 0.53 (0.38 - 0.73)          OR = 1.19 (0.65 - 2.16)

Controlling Confounding in the Analysis: Adjusted odds ratio

• Stratified analysis (examine strata ORs)
– Mantel-Haenszel adjusted OR: a weighted average of stratum-specific ORs
– OR_MH = sum over strata of (ad / N), divided by sum over strata of (bc / N)
– where N = total subjects in each sub-table

Stratum 1:   a  b          Stratum 2:   a  b
             c  d   N1                  c  d   N2

Mantel-Haenszel Adjusted OR

\[
\widehat{OR}_{MH} = \frac{(a_1 d_1)/N_1 + (a_2 d_2)/N_2 + \cdots}{(b_1 c_1)/N_1 + (b_2 c_2)/N_2 + \cdots}
\]

Trisomy 21 and spermicide use: Case-Control Study

          Down's    Ctrl
Sp +         4       109
Sp -        12      1145
                            N = 1270

OR =

Stratify by Maternal Age

             Age <35               Age 35+
           Down    Ctrl          Down    Ctrl
Sp +         3      104            1       5
Sp -         9     1059            3      86
             N1 = 1175             N2 = 95

OR =                             OR =

Mantel-Haenszel Adjusted OR

\[
\widehat{OR}_{MH} = \frac{(a_1 d_1)/N_1 + (a_2 d_2)/N_2 + \cdots}{(b_1 c_1)/N_1 + (b_2 c_2)/N_2 + \cdots}
= \frac{[(3)(1059)/1175] + [(1)(86)/95]}{[(9)(104)/1175] + [(3)(5)/95]}
= 3.8
\]
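The arithmetic above can be checked with a few lines of code. The following is a minimal sketch, not a replacement for a statistics package; the function name `mantel_haenszel_or` is hypothetical, and each stratum is given as (a, b, c, d) = (exposed cases, exposed controls, unexposed cases, unexposed controls), read from the maternal-age tables.

```python
def odds_ratio(a, b, c, d):
    """OR for a single 2x2 table."""
    return (a * d) / (b * c)

def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled OR: sum(a*d/N) / sum(b*c/N) over strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator

# Spermicide use and trisomy 21, stratified by maternal age
under_35 = (3, 104, 9, 1059)   # N1 = 1175
over_35 = (1, 5, 3, 86)        # N2 = 95

or_mh = mantel_haenszel_or([under_35, over_35])
stratum_ors = [odds_ratio(*under_35), odds_ratio(*over_35)]

print(f"stratum ORs: {stratum_ors[0]:.2f}, {stratum_ors[1]:.2f}")
print(f"Mantel-Haenszel OR: {or_mh:.2f}")
```

The pooled value rounds to the 3.8 shown on the slide, and it lies between the two stratum-specific odds ratios, as expected of a weighted average.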

Multivariate Statistics

Linear: y = b0 + b1x1 + b2x2 + . . . + bkxk

Logistic: exp(b) gives you the adjusted OR
log(odds) = b0 + b1x1 + b2x2 + . . . + bkxk

for x1 coded as a [0,1] variable, $OR_{x_1} = e^{b_1}$ (adjusted for all other xi)

Cox: exp(b) gives you the adjusted RR (hazard ratio)
log(hazard) = log(baseline hazard) + b1x1 + b2x2 + . . . + bkxk

Logistic Regression: Coding Variables

• Continuous x causes b to be interpreted as: increase in log odds per unit change in x

• Interaction of two variables is represented by a single product term: x1x2 (with only one b)
– interpretation of models which include interaction and continuous terms can be tricky
– consult a friendly Biostatistician
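One reason interaction models are tricky is that the odds ratio for x1 is no longer a single number. The sketch below assumes a model log(odds) = b0 + b1 x1 + b2 x2 + b3 x1 x2 with both variables coded [0,1]; the coefficient values are hypothetical and chosen only to show the arithmetic.

```python
import math

# Hypothetical coefficients, for illustration only
b1 = 0.7   # main effect of x1
b2 = 0.5   # main effect of x2
b3 = 0.4   # product (interaction) term x1 * x2

# OR for x1 (1 vs 0) depends on the level of x2
or_x1_when_x2_is_0 = math.exp(b1)
or_x1_when_x2_is_1 = math.exp(b1 + b3)

print(f"OR for x1 when x2 = 0: {or_x1_when_x2_is_0:.2f}")
print(f"OR for x1 when x2 = 1: {or_x1_when_x2_is_1:.2f}")
```

Here exp(b3) is the ratio of the two stratum-specific odds ratios, so b3 = 0 corresponds to no interaction on the multiplicative scale.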

Recognizing Confounding in logistic regression models

• Logistic Regression:
– ln[Y/(1-Y)] = a + b1X1 + b2X2 + … + bnXn, where Y is the probability of disease
– $e^{b_i} = OR(x_i)$ (per unit change in Xi)
– does bi change when Xk factor(s) are added?
– does the crude OR differ from the adjusted OR?
– does the model “log-likelihood” change (likelihood ratio test)?

Logistic coefficients and ORs

Variable (x)              Coefficient (b)    Odds Ratio
intercept                     -4.56             ----
gender (1=M, 0=F)              1.31             3.71
smoking (1=yes, 0=no)          0.70             2.01
HTN (1=yes, 0=no)              0.51             1.67

$e^{b} = OR$
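The odds-ratio column can be reproduced by exponentiating each coefficient. This short sketch uses only the coefficients listed in the table; the dictionary keys are just labels chosen for the example.

```python
import math

# Coefficients from the table above (intercept omitted: it has no OR)
coefficients = {
    "gender (1=M, 0=F)": 1.31,
    "smoking (1=yes, 0=no)": 0.70,
    "HTN (1=yes, 0=no)": 0.51,
}

# exp(b) converts a log-odds coefficient into an adjusted odds ratio
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

for name, value in odds_ratios.items():
    print(f"{name}: OR = {value:.2f}")
```

Because the model is additive on the log-odds scale, the odds ratios multiply: with no interaction term, a male smoker compared with a female non-smoker would have OR = exp(1.31 + 0.70).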

Interaction (Effect Modification)

• Statistical, Biological and Social semantic meanings differ.

• Does the RR estimate “differ” at each level of a third variable? Homogeneity of RR

• Biological reasoning: is there something about the third factor that changes the way the Exposure-Disease association works?

Stratification Example: Crude table

                                 Hepatocellular carcinoma
                                    Case      Control
Hepatitis C Virus infection  Yes     63         24
                             No     102        357

Crude OR = 9.2 (5.5 - 15.4)

Stratify by HBV infection
Are the stratum-specific odds ratios statistically different?

                 HBV+                   HBV-
             Case    Ctrl           Case    Ctrl
HepC +        37       1             26      23
HepC -        40      28             62     329

OR(1) = 25.9 (4.2 - * )          OR(2) = 6.0 (3.2 - 11.1)

M-H “adjusted odds ratio” OR = 8.1

ORs are not statistically different: should we adjust or report strata ORs?
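One way to examine whether two stratum odds ratios differ statistically is Woolf's test of homogeneity, sketched below. This is a swapped-in technique: the slides do not say which test was used, and other choices (for example the Breslow-Day test) exist. The table layout is assumed to be HBV+ (37, 1, 40, 28) and HBV- (26, 23, 62, 329), which reproduces the two stratum ORs shown. With a cell count of 1, the large-sample approximation is rough.

```python
import math

def log_or_and_variance(a, b, c, d):
    """ln(OR) and its approximate variance (sum of reciprocal cell counts)."""
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def woolf_homogeneity(stratum_1, stratum_2):
    """Woolf's chi-square (1 df for two strata) and its p-value."""
    (log_or_1, var_1) = log_or_and_variance(*stratum_1)
    (log_or_2, var_2) = log_or_and_variance(*stratum_2)
    w1, w2 = 1 / var_1, 1 / var_2
    pooled = (w1 * log_or_1 + w2 * log_or_2) / (w1 + w2)
    chi2 = w1 * (log_or_1 - pooled) ** 2 + w2 * (log_or_2 - pooled) ** 2
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

hbv_positive = (37, 1, 40, 28)    # assumed layout: HepC+ case, ctrl; HepC- case, ctrl
hbv_negative = (26, 23, 62, 329)

chi2, p_value = woolf_homogeneity(hbv_positive, hbv_negative)
print(f"Woolf chi-square = {chi2:.2f}, p = {p_value:.2f}")
```

Under these assumptions the p-value is well above 0.05, in line with the slide's statement that the two odds ratios are not statistically different despite looking far apart.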

Stratification Example 2: HBV, HepC and Liver Ca

• The ORs in the HBV strata look quite different
– Does this indicate “effect modification”?
– Effect modification is a finding in the data that needs to be elaborated; it is a natural phenomenon that exists independently
– Confounding is a nuisance that needs to be eliminated (by adjusting, matching, restriction, etc.)

Effect Modification (also known as “interaction”)

• When the measure of effect differs between strata
– can apply to RR or risk difference (AR) measures

• Presumed additive or multiplicative effect model depends on biology of disease and factor

• Synergy: when effect exceeds that expected under the chosen model
– RR (A+B) >> RR (A) + RR (B)
– RR (A x B) >> RR (A) x RR (B)

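Schematic of additive model for case control data (Szklo & Nieto, 2000)

[Bar chart: each bar starts from the baseline (BL, OR = 1.0). Factor “A” alone: OR = 2.0. Factor “Z” alone: OR = 3.0. “A” and “Z” together, expected: OR = 4.0. “A” and “Z” together, observed: OR = 7.0, the difference being the excess joint increase.]

Additive model effects: Expected = OR(A) + OR(Z) - 1.0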
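The expected joint effect in the schematic can be worked out directly. The sketch below reads OR(A) = 2.0, OR(Z) = 3.0 and an observed joint OR of 7.0 from the figure; the multiplicative expectation is added for comparison and is not part of that figure.

```python
or_a = 2.0               # factor A alone
or_z = 3.0               # factor Z alone
observed_joint = 7.0     # A and Z together, as observed

# Additive model: excess over baseline adds up
expected_additive = or_a + or_z - 1.0
# Multiplicative model: odds ratios multiply
expected_multiplicative = or_a * or_z

excess_over_additive = observed_joint - expected_additive
print(f"expected (additive) = {expected_additive}, excess = {excess_over_additive}")
print(f"expected (multiplicative) = {expected_multiplicative}")
```

With these values the observed joint OR exceeds the expectation under both models, so the conclusion about synergy does not depend on which scale is chosen here.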

RR estimates in strata: “guidelines” for heterogeneity [Szklo & Nieto 2000]

Suspected E-M       Suspected E-M       Adjust or report
factor absent       factor present      strata RRs
     2.3                 2.6            Adjust
     2.0                20.0            Report
     0.5                 3.0            Report (qualitative difference)
     3.0                 4.5            Maybe both

Stratified analysis flow chart

• Is there an association between risk factor (X) and disease (Y)? If YES, continue

• Is it affected by bias?
– Yes: estimate the magnitude and direction of the effect on the RR
– No: continue

• Are the stratum RRs different from the “crude” RR?
– No: no confounding by the strata factor
– Yes, and stratum RRs are similar to each other: confounding; adjust for the stratum factor
– Yes, and stratum RRs are statistically different from each other: interaction/effect modification; report strata RRs, don't adjust

Considerations

• Collect data on potential confounders
– if you don’t get it you can’t control for it

• Try to reason what the potential effect of confounding might be
– magnitude and direction (as with bias)
– coffee drinking and MI: smoking may be a positive confounder because smokers are at increased risk of MI

Generally speaking...

• A “strong” association is less likely to be explained by confounding than a weak one

• For an observed association to be the sole result of confounding by another factor:
– the factor must have a stronger association with disease than the one observed
– if RR = 10.0 for smoking and lung Ca, then a confounder would need RR > 10.0

Logistic Regression

• Allows simultaneous adjustment for several confounders (also allows “interactions”)
– multiple variables to predict disease status (dichotomous outcome)

• Odds ratios can be obtained directly from the regression coefficients

• “Standard” method seen in most case-control study analyses (matched and unmatched analyses)

Conclusion

• What is confounding?
– How do we recognize, evaluate and control it?

• What is effect modification?
– How do we recognize and evaluate it?
– Why is it important?
– [also known as “interaction”, “effect measure modification”, etc.]