8/3/2019 Psy524 Lecture 18 Logistic
Logistic Regression
Psy 524
Ainsworth
What is Logistic Regression?
A form of regression that allows the prediction of discrete variables by a mix of continuous and discrete predictors. It addresses the same questions that discriminant function analysis and multiple regression do, but with no distributional assumptions on the predictors (the predictors do not have to be normally distributed, linearly related, or have equal variance in each group).
What is Logistic Regression?
Logistic regression is often used because the relationship between the DV (a discrete variable) and a predictor is non-linear.
Example from the text: the probability of heart disease changes very little with a ten-point difference among people with low blood pressure, but a ten-point change can mean a drastic change in the probability of heart disease in people with high blood pressure.
Questions
Can the categories be correctly predicted given a set of predictors?
Usually, once this is established, the predictors are manipulated to see if the equation can be simplified.
Can the solution generalize to predicting new cases?
Comparison of the equation with predictors plus intercept to a model with just the intercept.
Questions
What is the relative importance of each predictor?
How does each variable affect the outcome?
Does a predictor make the solution better or worse, or have no effect?
Questions
Are there interactions among predictors?
Does adding interactions among predictors (continuous or categorical) improve the model?
Continuous predictors should be centered before interactions are made, in order to avoid multicollinearity.
Can parameters be accurately predicted?
How good is the model at classifying cases for which the outcome is known?
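The centering advice above can be sketched with made-up data (the variables `x` and `z` and the helper `corr` are illustrative, not from the lecture): an uncentered predictor far from zero is strongly correlated with its own interaction term, while the centered version is not.

```python
# Sketch with hypothetical data: centering before forming an interaction
# term reduces the correlation between a predictor and the product term.
import random

random.seed(0)
x = [random.uniform(10, 20) for _ in range(200)]  # predictor far from zero
z = [random.uniform(10, 20) for _ in range(200)]  # second predictor

def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a)
    vb = sum((bi - mb) ** 2 for bi in b)
    return cov / (va * vb) ** 0.5

raw_inter = [xi * zi for xi, zi in zip(x, z)]           # uncentered X*Z
mx, mz = sum(x) / len(x), sum(z) / len(z)
cen_inter = [(xi - mx) * (zi - mz) for xi, zi in zip(x, z)]  # centered X*Z

print(abs(corr(x, raw_inter)))  # high: X and X*Z are nearly collinear
print(abs(corr(x, cen_inter)))  # much lower after centering
```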
Questions
What is the prediction equation in the presence of covariates?
Can prediction models be tested for relative fit to the data?
So-called goodness-of-fit statistics.
What is the strength of association between the outcome variable and a set of predictors?
Often in model comparison you want non-significant differences, so strength of association is reported even for non-significant effects.
Assumptions
The only real limitation on logistic regression is that the outcome must be discrete.
Assumptions
If the distributional assumptions are met, then discriminant function analysis may be more powerful, although it has been shown to overestimate the association when using discrete predictors.
If the outcome is continuous, then multiple regression is more powerful, given that its assumptions are met.
Assumptions
Ratio of cases to variables: using discrete variables requires that there are enough responses in every given category.
If there are too many cells with no responses, the parameter estimates and standard errors will likely blow up.
Empty cells can also make the groups perfectly separable, which will make maximum likelihood estimation impossible.
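A minimal sketch of why perfect separation breaks estimation, using a hypothetical six-case dataset (nothing here is from the lecture): when the two groups do not overlap on the predictor, the log-likelihood keeps improving as the slope grows, so maximum likelihood has no finite solution to converge to.

```python
# Sketch: with perfectly separated data, the log-likelihood approaches 0
# (its best possible value) as the slope grows without bound.
import math

x = [1, 2, 3, 6, 7, 8]  # hypothetical predictor values
y = [0, 0, 0, 1, 1, 1]  # outcome: perfectly separated at x = 4.5

def loglik(b0, b1):
    """Bernoulli log-likelihood of a logistic model with the given coefficients."""
    ll = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        ll += math.log(p) if yi == 1 else math.log(1.0 - p)
    return ll

# Each steeper slope (intercept centered at the gap) fits "better",
# so the estimates blow up instead of converging:
for b1 in (1, 5, 10):
    print(b1, loglik(-4.5 * b1, b1))
```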
Assumptions
Linearity in the logit: the regression equation should have a linear relationship with the logit form of the DV. There is no assumption about the predictors being linearly related to each other.
Background
Odds are like probability. Odds are usually written out, as in "4 to 1 odds against," which is equivalent to 1 out of 5, a .20 probability, a 20% chance, etc.
The problem with probabilities is that they are non-linear.
Going from .10 to .20 doubles the probability, but going from .80 to .90 barely increases the probability.
Background
Odds ratio: the ratio of the probability over 1 minus the probability (P/Q); the probability of winning over the probability of losing. A .20 probability (4-to-1 odds against) equates to an odds ratio of .20/.80 = .25.
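The arithmetic above can be checked with a small helper (the function name `odds` is ours, not the lecture's), which also shows why odds change faster than probabilities at the extremes:

```python
# Sketch: converting a probability P to odds P / Q, where Q = 1 - P.
def odds(p):
    return p / (1 - p)

print(odds(0.20))  # 0.25, i.e. 4-to-1 against
print(odds(0.10))  # ~0.111: doubling the probability (to .20) more than doubles the odds
print(odds(0.80))  # ~4.0
print(odds(0.90))  # ~9.0: the same .10 step changes the odds far more up here
```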
Background
Logit: the natural log of an odds ratio; often called a log odds, even though it really is a log odds ratio. The logit scale is linear and functions much like a z-score scale.
Background
Logits are continuous, like z-scores:
p = 0.50, then logit = 0
p = 0.70, then logit = 0.84
p = 0.30, then logit = -0.84
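A quick check of those values (the helper `logit` is illustrative); note that logit(.70) is 0.847 before rounding, and that the scale is symmetric around p = .50:

```python
# Sketch: the logit is the natural log of the odds; it is 0 at p = .50
# and symmetric around it, much like a z-score.
import math

def logit(p):
    return math.log(p / (1 - p))

print(logit(0.50))            # 0.0
print(round(logit(0.70), 3))  # 0.847
print(round(logit(0.30), 3))  # -0.847
```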
Plain old regression
Y = a binary response (DV):
1 = positive response (success), probability P
0 = negative response (failure), probability Q = (1 - P)
MEAN(Y) = P, the observed proportion of successes
VAR(Y) = PQ, maximized when P = .50; the variance depends on the mean (P)
Xj = any type of predictor: continuous, dichotomous, polytomous
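The mean-variance link can be verified numerically (the helper `bernoulli_var` is ours): PQ peaks at P = .50 and shrinks symmetrically toward the extremes, which is why the variance is tied to the mean for binary outcomes.

```python
# Sketch: for a binary outcome, MEAN(Y) = P and VAR(Y) = PQ,
# which is largest at P = .50 and shrinks toward 0 and 1.
def bernoulli_var(p):
    return p * (1 - p)

for p in (0.10, 0.30, 0.50, 0.70, 0.90):
    print(p, bernoulli_var(p))
```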
Plain old regression
Y|X = B0 + B1X1 + e
and it is assumed that the errors (e) are normally distributed, with mean = 0 and constant variance (i.e., homogeneity of variance).
Plain old regression
An expected value is a mean, so
E(Y|X) = B0 + B1X1
The predicted value equals the proportion of observations for which Y|X = 1; P is the probability of Y = 1 (a success) given X, and Q = 1 - P (a failure) given X:
Ŷ = π = P(Y = 1 | X)
Plain old regression
For any value of X, only two errors (e) are possible: 1 - π and 0 - π, which occur at rates P|X and Q|X and with variance (P|X)(Q|X).
Plain old regression
Every respondent is given a probability of success and failure, which leads to every person having drastically different variances (because they depend on the mean in discrete cases), causing a violation of the homoskedasticity assumption.
Plain old regression
Long story short: you can't use regular old regression when you have discrete outcomes, because you don't meet homoskedasticity.
An alternative: the ogive function
An ogive is a curved, S-shaped function; the most common is the logistic function, which looks like:
The logistic function
Ŷi = e^u / (1 + e^u)
where Ŷi is the estimated probability that the ith case is in a category and u is the regular linear regression equation:
u = A + B1X1 + B2X2 + ... + BkXk
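A sketch of the function in code (the intercept and slope inside `u` are hypothetical, chosen to match the figure values used later in the lecture): any linear score u is squashed into a probability between 0 and 1.

```python
# Sketch: the logistic function maps a linear regression score u
# onto a probability strictly between 0 and 1.
import math

def logistic(u):
    return math.exp(u) / (1 + math.exp(u))

def u(x, a=-4.0, b1=0.05):  # hypothetical intercept and slope
    return a + b1 * x

print(logistic(u(80)))   # u = 0  -> 0.5
print(logistic(u(0)))    # u = -4 -> small probability
print(logistic(u(160)))  # u = +4 -> probability near 1
```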
The logistic function
Written with the regression coefficients for a single predictor:
πi = e^(b0 + b1X1) / (1 + e^(b0 + b1X1))
The logistic function
The change in probability is not constant (linear) with constant changes in X. This means that the probability of a success (Y = 1) given the predictor variable (X) is a non-linear function, specifically a logistic function.
The logistic function
It is not obvious how the regression coefficients for X are related to changes in the dependent variable (Y) when the model is written this way.
The change in Y (in probability units) given X depends on the value of X. Look at the S-shaped function.
The logistic function
The values in the regression equation, b0 and b1, take on slightly different meanings:
b0: the regression constant (moves the curve left and right)
b1: the regression slope (controls how steeply the curve rises)
Logistic Function
Figure: logistic curves with a constant regression constant and different slopes, plotted for X from 30 to 100 against probability (0 to 1):
v2: b0 = -4.00, b1 = 0.05 (middle)
v3: b0 = -4.00, b1 = 0.15 (top)
v4: b0 = -4.00, b1 = 0.025 (bottom)
Logistic Function
Figure: logistic curves with constant slopes and different regression constants, plotted for X from 30 to 100 against probability (0 to 1):
v2: b0 = -3.00, b1 = 0.05 (top)
v3: b0 = -4.00, b1 = 0.05 (middle)
v4: b0 = -5.00, b1 = 0.05 (bottom)
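The parameter sets above can be evaluated directly to confirm that b1 controls steepness while b0 shifts the curve (the evaluation points X = 90 and X = 80 are arbitrary choices inside the plotted range):

```python
# Sketch: evaluating the logistic function for the parameter sets
# behind the two figures. b1 sets steepness; b0 shifts the curve.
import math

def pi(x, b0, b1):
    return math.exp(b0 + b1 * x) / (1 + math.exp(b0 + b1 * x))

# Same constant (b0 = -4.00), different slopes: larger b1 rises faster
for b1 in (0.025, 0.05, 0.15):
    print("b1 =", b1, "->", round(pi(90, -4.0, b1), 3))

# Same slope (b1 = 0.05), different constants: smaller b0 shifts the curve right
for b0 in (-3.0, -4.0, -5.0):
    print("b0 =", b0, "->", round(pi(80, b0, 0.05), 3))
```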
The Logit
By algebraic manipulation, the logistic regression equation can be written in terms of an odds ratio for success:
π / (1 - π) = P(Y = 1|X) / (1 - P(Y = 1|X)) = exp(b0 + b1X1)
The Logit
Odds ratios range from 0 to positive infinity.
Odds ratio: P/Q is an odds ratio; less than 1 means less than a .50 probability, greater than 1 means greater than a .50 probability.
The Logit
Finally, taking the natural log of both sides, we can write the equation in terms of logits (log-odds). For a single predictor:
ln[π / (1 - π)] = ln[P(Y = 1|X) / (1 - P(Y = 1|X))] = b0 + b1X1
The Logit
For multiple predictors:
ln[π / (1 - π)] = b0 + b1X1 + b2X2 + ... + bkXk
The Logit
Log-odds are a linear function of the predictors.
The regression coefficients go back to their old interpretation (kind of):
b0: the expected value of the logit (log-odds) when X = 0.
b1: called a logit difference; the amount the logit (log-odds) changes with a one-unit change in X, i.e., the amount the logit changes in going from X to X + 1.
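These interpretations can be checked numerically with hypothetical coefficients (`b0` and `b1` below are made up): the logit changes by exactly b1 per unit of X, and equivalently the odds multiply by exp(b1).

```python
# Sketch: b1 is a logit difference, so exp(b1) is the factor by which
# the odds multiply for a one-unit increase in X.
import math

b0, b1 = -4.0, 0.05  # hypothetical coefficients

def log_odds(x):
    return b0 + b1 * x

def odds(x):
    return math.exp(log_odds(x))

print(log_odds(51) - log_odds(50))  # equals b1 (up to float rounding)
print(odds(51) / odds(50))          # equals exp(b1), about 1.051
```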
Conversion
EXP(logit), i.e. e^logit, = odds ratio
Probability = odds ratio / (1 + odds ratio)
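A sketch of the conversion chain (the helper name `logit_to_prob` is ours): exponentiate the logit to get the odds, then divide the odds by one plus the odds to get the probability.

```python
# Sketch: converting a logit back to an odds and then to a probability.
import math

def logit_to_prob(logit):
    odds = math.exp(logit)     # EXP(logit) = odds
    return odds / (1 + odds)   # probability = odds / (1 + odds)

print(logit_to_prob(0.0))             # 0.5
print(round(logit_to_prob(0.847), 2)) # ~0.7, inverting logit(.70)
```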