Assessing Binary Outcomes: Logistic Regression
Peter T. Donnan
Professor of Epidemiology and Biostatistics
Statistics for Health Research

Objectives of Session

• Understand what is meant by a binary outcome
• Understand how analyses of binary outcomes are implemented in a logistic regression model
• Understand when a logistic model is appropriate
• Be able to implement a logistic model in SPSS
• Interpret logistic model output

Binary Outcome

Extremely common in health research:

• Dead / Alive
• Hospitalisation (Yes / No)
• Diagnosis of diabetes (Yes / No)
• Met target, e.g. total cholesterol < 5.0 mmol/l (Yes / No)

N.B. Any code can be used, such as 1 / 2, but it is mathematically easier to use 0 / 1
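As a small illustration of the coding note, the Python sketch below recodes an outcome stored as 1 / 2 into 0 / 1. The sample values, and the assumption that 2 means "Yes", are invented for illustration and are not taken from the LDL data. With 0 / 1 coding, the mean of the variable is simply the proportion with the event.

```python
# Hypothetical outcome coded 1 = No, 2 = Yes (assumed coding, invented values).
raw_outcome = [1, 2, 2, 1, 2, 2, 2, 1]

# Recode so that 0 = No and 1 = Yes.
outcome = [1 if value == 2 else 0 for value in raw_outcome]

# With 0/1 coding, the mean is the proportion of events.
proportion = sum(outcome) / len(outcome)

print(outcome)
print(proportion)
```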

How is the relationship formulated?

For a linear model the simplest equation is:

$$y = a + bx + e_i$$

y is the outcome; a is the intercept;

b is the slope related to x, the explanatory variable; and

e is the error term or random ‘noise’

Can we fit y as a probability in the range 0 to 1?

$$y = a + bx + e_i$$

Not quite!

y as continuous takes any value from -∞ to +∞

The outcome is a probability of an event, π (or p), on the scale 0 – 1

Certain transformations of p can give the required scale

Probit is a normal transformation of p, but the results are not easy to interpret

The logit transformation works!

We can now fit p as a probability in the range 0 to 1

And y in the range -∞ to +∞

$$y = \operatorname{logit}(p) = a + bx + e_i$$

$$\log\left(\frac{p}{1-p}\right) = a + bx + e_i$$
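To make the change of scale concrete, here is a minimal Python sketch of the logit transformation and its inverse. The function names `logit` and `inverse_logit` are chosen here for illustration; the probabilities in the loop are arbitrary.

```python
import math

def logit(p):
    """Log odds of a probability p, defined for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inverse_logit(y):
    """Map any real value y back to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-y))

# Probabilities below 0.5 give negative logits, above 0.5 positive logits.
for p in (0.1, 0.5, 0.8):
    y = logit(p)
    print(p, round(y, 3), round(inverse_logit(y), 3))
```

Whatever value the linear part a + bx takes, applying the inverse logit returns a value strictly between 0 and 1, which is why the model can be fitted on the unbounded scale.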

Logistic Regression Model

$$\log\left(\frac{p}{1-p}\right) = a + bx + e_i$$

This has very useful properties

The term p/(1-p) is called the ‘Odds’ of an event

Note: not the same as the probability of an event, p

If x is binary, coded 0/1, then

exp (b) = ODDS RATIO for the outcome in those coded 1 relative to those coded 0

e.g. Odds of death in men (1) vs. women (0)
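Why exp (b) is an odds ratio can be seen by writing the model out for the two codes of x. The short derivation below ignores the error term; the symbols p_0 and p_1 (the probability of the event when x = 0 and when x = 1) are notation introduced here for illustration.

```latex
\begin{align*}
x = 0:&\quad \log\left(\frac{p_0}{1-p_0}\right) = a \\
x = 1:&\quad \log\left(\frac{p_1}{1-p_1}\right) = a + b \\
\text{so}&\quad b = \log\left(\frac{p_1}{1-p_1}\right) - \log\left(\frac{p_0}{1-p_0}\right)
          = \log\left(\frac{p_1/(1-p_1)}{p_0/(1-p_0)}\right) \\
\text{and}&\quad \exp(b) = \frac{p_1/(1-p_1)}{p_0/(1-p_0)}
          = \text{odds ratio (code 1 vs. code 0)}
\end{align*}
```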

Logistic Regression Model

Consider the LDL data.

It has two binary outcomes:

1) LDL target achieved
2) Cholesterol target achieved

For example, consider gender as a predictor: Male = 1 and Female = 2

For a binary x we can express results as odds ratios (available in Crosstabs)

LDL target achieved
          No    Yes
Male      140   563
Female    149   531

Odds of yes (Male) = 563/140
Odds of yes (Female) = 531/149

Odds of yes (Male) = 563/140 = 4.02
Odds of yes (Female) = 531/149 = 3.56

Odds ratio = 3.56 / 4.02

OR = 0.886 Female cf. Male

N.B. Odds is different from probability: for men, p = 563/(140+563) = 0.80 or 80%
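The arithmetic for the 2 x 2 table can be checked with a few lines of Python. This is only a sketch using the four counts from the LDL crosstab; the variable names are chosen here for illustration.

```python
# Counts from the crosstab: LDL target achieved (No, Yes) by gender.
male_no, male_yes = 140, 563
female_no, female_yes = 149, 531

# Odds of achieving the target in each group: yes count divided by no count.
odds_male = male_yes / male_no
odds_female = female_yes / female_no

# Odds ratio for females relative to males.
odds_ratio_f_vs_m = odds_female / odds_male

# Probability is yes count divided by the group total, not by the no count.
prob_male = male_yes / (male_no + male_yes)

print(round(odds_male, 2))          # 4.02
print(round(odds_female, 2))        # 3.56
print(round(odds_ratio_f_vs_m, 3))  # 0.886
print(round(prob_male, 2))          # 0.8
```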

Odds ratio from Crosstabs

Obtain odds ratios for 2 x 2 tables from Crosstabs by selecting the option ‘Risk’

Results from Crosstabs

Odds ratios for achieving LDL target in females vs. males

N.B. OR given for Female vs. Male = 0.886

Fit Logistic Regression Model

Dependent is the binary outcome: LDL target met (Yes = 1, No = 0)

Independent: Gender (1 = M, 2 = F)

Should get the same result as Crosstabs

Select Analyze / Regression / Binary Logistic

Select the option of 95% CI for exp (b)
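Outside SPSS, the same fit can be sketched in plain Python. The code below is not SPSS's procedure; it is a minimal Newton-Raphson maximum-likelihood fit for one predictor on grouped data, written for illustration. It assumes gender is recoded 0 = male (reference) and 1 = female, and uses the crosstab counts for LDL target achieved. The function name `fit_logistic` and the fixed iteration count are choices made here.

```python
import math

# Grouped data from the crosstab: (x, number achieving target, group size).
# Assumed recoding: x = 0 for male (reference), x = 1 for female.
groups = [(0, 563, 140 + 563), (1, 531, 149 + 531)]

def fit_logistic(groups, iterations=25):
    """Fit logit(p) = a + b*x by Newton-Raphson; return a, b and the SE of b."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g_a = g_b = i_aa = i_ab = i_bb = 0.0
        for x, events, n in groups:
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            resid = events - n * p      # observed minus expected events
            w = n * p * (1.0 - p)       # binomial variance weight
            g_a += resid
            g_b += x * resid
            i_aa += w
            i_ab += x * w
            i_bb += x * x * w
        # Solve the 2 x 2 system (information matrix) by hand.
        det = i_aa * i_bb - i_ab * i_ab
        a += (i_bb * g_a - i_ab * g_b) / det
        b += (i_aa * g_b - i_ab * g_a) / det
    se_b = math.sqrt(i_aa / det)
    return a, b, se_b

a, b, se_b = fit_logistic(groups)
lower = math.exp(b - 1.96 * se_b)
upper = math.exp(b + 1.96 * se_b)
print(round(math.exp(b), 3), round(lower, 3), round(upper, 3))
```

With a single binary predictor the fitted exp (b) should agree with the odds ratio calculated directly from the 2 x 2 table, which is the point of the check against Crosstabs.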

Regression / Binary Logistic…

Odds ratio from logistic model results for a binary predictor

EXP (B) = Odds ratio F vs. M

Note that OR for Men vs. Women = 1/0.886 = 1.13

Fit Logistic Regression Model – continuous predictor

Dependent is the binary outcome: LDL target met

Independent: continuous predictor, Adherence

B represents the change in the log odds for a 1 unit increase in adherence, so exp (B) is the ODDS RATIO for a 1 unit increase

B x 10 represents the change in the log odds for a 10 unit increase in adherence, so exp (10 x B) is the ODDS RATIO for a 10 unit increase

Odds ratio from logistic model results for a continuous predictor

EXP (B) = Odds ratio for 1% increase in Adherence

OR for 10% increase is exp(10 x 0.010) = 1.105

i.e. a 10.5% increase in odds of meeting LDL target for each 10% increase in adherence
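The scaling of the odds ratio for a continuous predictor can be checked numerically. This sketch uses only the coefficient 0.010 quoted above; the variable names are chosen here for illustration.

```python
import math

b = 0.010  # coefficient for adherence, per 1% increase

or_per_1 = math.exp(b)        # odds ratio for a 1% increase
or_per_10 = math.exp(10 * b)  # odds ratio for a 10% increase

# Percentage increase in the odds for a 10% increase in adherence.
percent_increase = (or_per_10 - 1) * 100

print(round(or_per_1, 3))          # 1.01
print(round(or_per_10, 3))         # 1.105
print(round(percent_increase, 1))  # 10.5
```

Note that the coefficient is multiplied by 10 before exponentiating; equivalently, the 1% odds ratio is raised to the power 10, not multiplied by 10.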

Fit Logistic Regression Model – categorical predictor

Dependent is the binary outcome: LDL target met

Independent: APOE genotype (1 – 6)

Choose a reference category; in this case the worst outcome is genotype 6, so choose 6 to give ORs > 1

exp (B) represents the OR for each category relative to the reference category
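Choosing a reference category amounts to representing the predictor by a set of 0/1 indicator variables, one for each non-reference category. The sketch below shows that coding for genotypes 1 to 6 with 6 as the reference. The function name `indicator_columns` is hypothetical, and this illustrates the general idea rather than SPSS's internal implementation.

```python
def indicator_columns(genotype, categories=(1, 2, 3, 4, 5, 6), reference=6):
    """Return one 0/1 indicator per non-reference category."""
    return [1 if genotype == c else 0 for c in categories if c != reference]

# Genotype 2 switches on the second indicator only.
print(indicator_columns(2))  # [0, 1, 0, 0, 0]

# The reference genotype has every indicator set to 0.
print(indicator_columns(6))  # [0, 0, 0, 0, 0]
```

Each indicator gets its own coefficient, so every odds ratio compares one genotype with genotype 6.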

Regression / Binary Logistic…

Choose Categorical

Odds ratios from logistic model results for a categorical predictor

EXP (B) = Odds ratio for APOE (2) vs. APOE (6)

OR = 4.381 (95% CI 1.742, 11.021)
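The confidence interval is not symmetric around the odds ratio because it is calculated on the log scale and then exponentiated. The sketch below assumes the usual Wald form exp(B ± 1.96 x SE), which is an assumption about how the interval was produced. It backs out B and its standard error from the three quoted numbers and checks that the odds ratio sits at the midpoint of the limits on the log scale.

```python
import math

odds_ratio, lower, upper = 4.381, 1.742, 11.021  # APOE (2) vs. APOE (6)

# B is the log of the odds ratio.
b = math.log(odds_ratio)

# Assuming a Wald interval, the limits are B -/+ 1.96 x SE on the log scale.
se = (math.log(upper) - math.log(lower)) / (2 * 1.96)

# Midpoint of the limits on the log scale, transformed back.
midpoint = math.exp((math.log(lower) + math.log(upper)) / 2)

print(round(b, 3), round(se, 3), round(midpoint, 3))
```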

Other binary models

The logistic model is only applicable when the length of follow-up is the same for each individual, e.g. 5-yr follow-up of a cohort

For binary outcomes where censoring occurs, i.e. people leave the cohort through death or migration, the length of follow-up varies and survival models such as the Cox Proportional Hazards model are needed

Summary

• Logistic model easily fitted in SPSS
• Clear link with ODDS RATIOS
• Common model for case-control and cohort studies, as well as development of clinical prediction models