Page 1:

Biostat Didactic Seminar Series

Analyzing Binary Outcomes:

An Introduction to Logistic Regression

Robert Boudreau, PhD

Co-Director of Methodology Core

PITT-Multidisciplinary Clinical Research Center

for Rheumatic and Musculoskeletal Diseases

Core Director for Biostatistics

Center for Aging and Population Health

Dept. of Epidemiology, GSPH

10/8/2010

Page 2:

Flow chart for group comparisons

Measurements to be compared:

continuous: Distribution approx. normal or N ≥ 20?
    No: Non-parametrics
    Yes: T-tests

discrete (binary, nominal, ordinal with few values):
    Chi-square, Fisher’s Exact
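The flow chart can also be read as a small decision rule. The following is only a sketch in Python (not part of the SAS analysis); the function name `choose_group_test` and its arguments are hypothetical, and the returned strings are simply the chart's own labels.

```python
def choose_group_test(measurement, approx_normal=False, n=0):
    """Return the test family the flow chart points to.

    measurement: "continuous" or "discrete"
                 (discrete = binary, nominal, ordinal with few values).
    approx_normal, n: only consulted for continuous measurements.
    """
    if measurement == "discrete":
        return "Chi-square / Fisher's Exact"
    if measurement == "continuous":
        # Chart question: distribution approx normal or N >= 20?
        if approx_normal or n >= 20:
            return "T-tests"
        return "Non-parametrics"
    raise ValueError("measurement must be 'continuous' or 'discrete'")


print(choose_group_test("continuous", approx_normal=False, n=12))
print(choose_group_test("continuous", approx_normal=False, n=50))
print(choose_group_test("discrete"))
```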

Page 3:

Flow chart for regression models

(includes adjusted group comparisons)

Outcome variable continuous or dichotomous?

Dichotomous (binary): Time-to-event available (or relevant)?
    No: Multiple logistic regression
    Yes: Cox proportional hazards regression

continuous: Predictor variable categorical?
    No: Multiple linear regression
    Yes (e.g. groups): ANCOVA (Multiple linear regression using dummy variable(s) for categorical var(s))
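The same chart can be written as a lookup. This is a rough Python sketch only; `choose_regression_model` and its argument names are hypothetical, and the returned labels are copied from the chart.

```python
def choose_regression_model(outcome, time_to_event=False, categorical_predictor=False):
    """Return the model the regression flow chart points to.

    outcome: "dichotomous" or "continuous".
    time_to_event: consulted only for dichotomous outcomes.
    categorical_predictor: consulted only for continuous outcomes.
    """
    if outcome == "dichotomous":
        if time_to_event:
            return "Cox proportional hazards regression"
        return "Multiple logistic regression"
    if outcome == "continuous":
        if categorical_predictor:
            return "ANCOVA (multiple linear regression with dummy variables)"
        return "Multiple linear regression"
    raise ValueError("outcome must be 'dichotomous' or 'continuous'")


# Knee OA (yes/no) with no time-to-event information, as in this seminar:
print(choose_regression_model("dichotomous", time_to_event=False))
```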

Page 4:

Analysis From Last Didactic …

In Health, Aging and Body Composition Knee-OA Substudy:

Examine association between SxRxKOA (knee OA) and CRP, adjusted for BMI.

Motivation: Sowers M, Hochberg M, et al. C-reactive protein as a biomarker of emergent osteoarthritis. Osteoarthritis and Cartilage, Volume 10, Issue 8, August 2002, Pages 595-601.

Conclusion: “CRP is highly associated with Knee OA; however, its high correlation with obesity limits its utility as an exclusive marker for knee OA”

Page 5:

Logistic Regression

Outline for today

- Definition and interpretation of odds-ratio for binary outcome
- Essential equivalence of odds-ratio ↔ testing for group differences in rates (or percentages) when evaluated using 2 x 2 table, chi-square and p-values
- Logistic regression as “binary outcome” version of multiple linear regression: group (and covariate adjustment) effects are interpreted as odds-ratios affecting the binary outcome
- Detailed example: relating obesity to odds of knee OA
  - adjusted for race and gender

Page 6:

HABC: Obese x KneeOA

Obese: BMI > 30

Chi-square P < 0.0001

Obese=1: Odds of kneeOA = p/(1-p) = 0.2444/0.7556 = 0.32345
Obese=0: Odds of kneeOA = p/(1-p) = 0.0911/0.9089 = 0.10023
Obesity odds-ratio for kneeOA: OR = 0.32345/0.10023 = 3.225
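The odds and odds-ratio arithmetic can be checked with a few lines of Python (a sketch for checking only, not part of the SAS analysis). The two proportions are the rounded values shown on the slide, so the ratio comes out near 3.227 rather than exactly 3.225; the slide's value presumably comes from the unrounded cell counts.

```python
def odds(p):
    """Convert a probability (proportion) to odds p/(1-p)."""
    return p / (1.0 - p)


p_obese = 0.2444      # proportion with knee OA among Obese=1 (from the slide)
p_nonobese = 0.0911   # proportion with knee OA among Obese=0 (from the slide)

odds_obese = odds(p_obese)
odds_nonobese = odds(p_nonobese)
odds_ratio = odds_obese / odds_nonobese

print(f"odds (obese)     = {odds_obese:.5f}")
print(f"odds (non-obese) = {odds_nonobese:.5f}")
print(f"odds ratio       = {odds_ratio:.3f}")
```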

Page 7:

HABC: Obese x KneeOA

proc logistic data=worst_knee_vs_noOA;

model kneeOA(event="1")=obese;

run;

Note OR and C.I.

Confidence Interval (C.I.) (2.56, 4.04) doesn’t cover 1.0 => stat. signif.
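To see why an interval that excludes 1.0 goes with a small p-value, the standard error of the log odds-ratio can be backed out of the interval. This Python sketch assumes the reported interval is a 95% Wald interval, exp(b ± 1.96·SE), and uses OR = 3.225 from the 2 x 2 table slide; the recovered SE, z and p are approximations, not SAS output.

```python
import math

or_hat = 3.225                 # odds ratio from the 2 x 2 table slide
ci_low, ci_high = 2.56, 4.04   # 95% C.I. reported on this slide

# Assumption: Wald interval exp(b +/- 1.96*SE), so SE follows from its width.
se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
z = math.log(or_hat) / se
p_two_sided = math.erfc(abs(z) / math.sqrt(2))   # normal-approximation p-value

excludes_one = not (ci_low <= 1.0 <= ci_high)

print(f"SE of log-OR ~ {se:.3f}, z ~ {z:.1f}, p ~ {p_two_sided:.1e}")
print("C.I. excludes 1.0:", excludes_one)
```

The resulting p-value is far below 0.0001, in line with the chi-square result on the 2 x 2 table slide.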

Page 8:

HABC: Obese x KneeOA

Prob[kneeOA│obese=0] = exp(-2.3)/(1+exp(-2.3)) = 0.0911

Prob[kneeOA│obese=1] = exp(-2.3+1.17)/(1+exp(-2.3+1.17)) = 0.2444

Page 9:

HABC: Obese x KneeOA

Obese: BMI > 30

Chi-square P < 0.0001

Prob[kneeOA│obese=0] = exp(-2.3)/(1+exp(-2.3)) = 0.0911

Prob[kneeOA│obese=1] = exp(-2.3+1.17)/(1+exp(-2.3+1.17)) = 0.2444

General logistic regression form:

Prob[kneeOA│obese] = exp(int + b*obese)/(1+exp(int + b*obese)), where int = -2.3 is the intercept and b = 1.17 is the coefficient for obese
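The conversion from fitted coefficients back to probabilities can be sketched in Python as follows. The intercept and coefficient are the rounded values shown on the slide, so the obese=1 probability comes out near 0.2442 rather than exactly 0.2444; the helper name `inv_logit` is just a label chosen here.

```python
import math


def inv_logit(x):
    """Inverse logit: exp(x) / (1 + exp(x))."""
    return math.exp(x) / (1.0 + math.exp(x))


intercept = -2.3   # intercept as shown on the slide (rounded)
b_obese = 1.17     # coefficient for obese as shown on the slide (rounded)

for obese in (0, 1):
    p = inv_logit(intercept + b_obese * obese)
    print(f"Prob[kneeOA | obese={obese}] = {p:.4f}")

# Exponentiating the coefficient gives the odds ratio for obese.
print(f"exp(b_obese) = {math.exp(b_obese):.3f}")
```

Exponentiating the obese coefficient gives about 3.22, which agrees with the odds ratio of 3.225 from the 2 x 2 table up to rounding.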

Page 10:

Gender x PAD

Page 11:

Gender x PAD (referent=female)

proc logistic data=pad;

model y1ppad(event="1")=male;

run;

Page 12:

Gender x PAD (ref=male)

proc logistic data=pad;

model y1ppad(event="1")=female;

run;

Page 13:

Gender x PAD (compare models: ref=female vs ref=male)

Male OR (vs females) = 1.891

Female OR (vs males) = 0.529 = 1/1.891
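Switching the referent group inverts the odds ratio, which is the same as flipping the sign of the log odds-ratio. A short Python check, using only the Male OR from the slide:

```python
import math

or_male_vs_female = 1.891                      # from the ref=female model
or_female_vs_male = 1.0 / or_male_vs_female    # what the ref=male model reports

b_male = math.log(or_male_vs_female)
b_female = math.log(or_female_vs_male)

print(f"Female OR (vs males) = {or_female_vs_male:.3f}")
print(f"log-OR male = {b_male:+.4f}, log-OR female = {b_female:+.4f}")
```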

Page 14:

CHD x KneeOA

CHD x Knee OA association not statistically significant

C.I. = (0.79, 1.34) covers 1.0

Page 15:

Self-reported rheumatoid arthritis as binary outcome (or covariate) for analyses ?

(NOT ?#!)

Page 16:

White Females: Obesity x KneeOA

Page 17:

White vs Black Females

Obesity x KneeOA: Similar OR’s

White Females

Black Females

Page 18:

Black females have about two times higher odds of kneeOA than white women

proc logistic data=worst_knee_vs_noOA;

model kneeOA(event="1")= black;

where female;

run;

Page 19:

Obesity odds-ratio is same for white and black women (interaction term is NS)

proc logistic data=worst_knee_vs_noOA;

model kneeOA(event="1")=obese black obese_x_black;

where female;

run;

Page 20:

Non-obese black women have higher odds of knee OA than non-obese white women (OR=1.53), and obesity is associated with increased odds of knee OA (OR=3.61) that applies within each race
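Without an interaction term, the two odds ratios combine multiplicatively. The Python sketch below assumes both ORs come from the same main-effects model (kneeOA = black obese) and takes non-obese white women as the reference group; the function name `or_vs_reference` is hypothetical.

```python
import math

or_black = 1.53   # black vs white, adjusted for obesity (from the slide)
or_obese = 3.61   # obese vs non-obese, adjusted for race (from the slide)

b_black, b_obese = math.log(or_black), math.log(or_obese)


def or_vs_reference(black, obese):
    """Odds ratio vs non-obese white women under a no-interaction model."""
    return math.exp(b_black * black + b_obese * obese)


for black in (0, 1):
    for obese in (0, 1):
        print(f"black={black} obese={obese}: OR vs reference = "
              f"{or_vs_reference(black, obese):.2f}")
```

The obese-to-non-obese ratio is 3.61 within each race, which is what "applies within each race" means here.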

Page 21:

Obesity explains some, but not all of the difference in rates of knee OA between black and white females

(Note the “black race” OR attenuation from 2.08 to 1.53 after “adjusting” for obesity)

model kneeOA= black

model kneeOA= black obese

Page 22:

White Females: Continuous CRP

Difference in average logCRP: 0.76 – 0.43 = 0.33

                  Knee OA                        P-value
          No (n=752)     Yes (n=92)
          Mean (SD)      Mean (SD)      Equal vars    Unequal
logCRP    0.43 (0.83)    0.76 (0.58)    0.0002        < 0.0001

logCRP SD’s were signif. diff. (p<0.0001) => Use Satterthwaite unequal variance test
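The two t statistics behind the "Equal vars" and "Unequal" columns can be approximated from the summary statistics alone. This Python sketch uses the rounded means and SDs from the table, so the results will differ slightly from the SAS output; p-values are left out because they need the t distribution.

```python
import math

# Summary statistics from the table (rounded as shown on the slide).
n0, mean0, sd0 = 752, 0.43, 0.83   # no knee OA
n1, mean1, sd1 = 92, 0.76, 0.58    # knee OA

diff = mean1 - mean0

# Equal-variance (pooled) t statistic.
sp2 = ((n0 - 1) * sd0**2 + (n1 - 1) * sd1**2) / (n0 + n1 - 2)
t_pooled = diff / math.sqrt(sp2 * (1 / n0 + 1 / n1))

# Unequal-variance t statistic with Satterthwaite degrees of freedom.
v0, v1 = sd0**2 / n0, sd1**2 / n1
t_unequal = diff / math.sqrt(v0 + v1)
df_satt = (v0 + v1) ** 2 / (v0**2 / (n0 - 1) + v1**2 / (n1 - 1))

print(f"pooled t = {t_pooled:.2f} on {n0 + n1 - 2} df")
print(f"unequal-variance t = {t_unequal:.2f} on about {df_satt:.0f} df")
```

Because the smaller group (knee OA) also has the smaller SD, the unequal-variance statistic is larger than the pooled one, consistent with the smaller p-value in the "Unequal" column.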

Page 23:

All White Females in HABC (N=844) [includes SxRxKOA (n=93); also rest of parent study cohort]

N=5 had CRP > 30 (max=63.2)

Page 24:

log CRP

Page 25:

White Females: Continuous CRP as predictor of kneeOA

Standardized var: mean-centered, divided by SD

logCRP_perSD = (logCRP-0.4728)/0.8625

Units of standardized logCRP are SDs
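As a quick illustration of the standardization, the Python sketch below applies the same formula to the two group means from the earlier t-test slide. The function name `log_crp_per_sd` is hypothetical; the mean and SD constants are the ones in the formula.

```python
MEAN_LOGCRP = 0.4728   # mean of logCRP used in the slide's formula
SD_LOGCRP = 0.8625     # SD of logCRP used in the slide's formula


def log_crp_per_sd(log_crp):
    """Mean-center logCRP and divide by its SD, as in logCRP_perSD."""
    return (log_crp - MEAN_LOGCRP) / SD_LOGCRP


for value in (0.43, 0.76):   # group means from the earlier t-test slide
    print(f"logCRP = {value:.2f} -> {log_crp_per_sd(value):+.3f} SD")
```

A value equal to the mean maps to 0, and a value one SD above the mean maps to 1.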

Page 26:

White Females: Per SD higher logCRP, odds of knee OA are higher by a factor of 1.5 (OR=1.5)

proc logistic data=worst_knee_vs_noOA3;

model kneeOA(event="1")=logCRP_perSD ;

where female and white;

run;
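An odds ratio per SD can be re-expressed on other scales by working with the log odds-ratio. This Python sketch uses OR = 1.5 per SD and the SD of 0.8625 from the standardization formula; since the slide's OR is rounded, the results are approximate.

```python
import math

or_per_sd = 1.5      # OR per 1 SD higher logCRP (from the slide)
sd_logcrp = 0.8625   # SD used to build logCRP_perSD

b_per_sd = math.log(or_per_sd)   # log odds-ratio per SD

or_per_2sd = math.exp(2 * b_per_sd)            # two SDs higher
or_per_unit = math.exp(b_per_sd / sd_logcrp)   # one logCRP unit higher

print(f"OR per 2 SD        = {or_per_2sd:.2f}")
print(f"OR per logCRP unit = {or_per_unit:.2f}")
```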

Page 27:

Thank you

Questions, comments, suggestions or insights?

Remaining time: Open consultation …