STA 517 – Chp4: Introduction to Generalized Linear Models


4.3 GENERALIZED LINEAR MODELS FOR COUNTS

Count data: assume a Poisson distribution. Poisson GLMs cover two settings:

counts in contingency tables with categorical response variables;

count or rate data for a single discrete response variable.


4.3.1 Poisson Loglinear Models

The Poisson distribution has a positive mean µ. Although a GLM can model a positive mean using the identity link, it is more common to model the log of the mean.

Like the linear predictor α + βx, the log mean can take any real value.

The log mean is the natural parameter for the Poisson distribution, and the log link is the canonical link for a Poisson GLM.

A Poisson loglinear GLM assumes a Poisson distribution for Y and uses the log link.
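To see why the log mean is the natural parameter, it may help to write the Poisson probability function in exponential-family form. The sketch below uses only the standard Poisson formula; the symbol y for an observed count is an assumption of notation, not something shown on the slides.

```latex
f(y;\mu) = \frac{e^{-\mu}\mu^{y}}{y!}
         = \exp(-\mu)\,\frac{1}{y!}\,\exp\bigl(y \log \mu\bigr),
\qquad y = 0, 1, 2, \ldots
```

The factor multiplying y inside the last exponential is log µ, so setting log µ equal to the linear predictor gives the canonical link.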


Log linear model

The Poisson loglinear model with explanatory variable X is

log µ = α + βx.

For this model, the mean satisfies the exponential relationship

µ = exp(α + βx) = e^α (e^β)^x.

A 1-unit increase in x has a multiplicative impact of e^β on µ: the mean at x + 1 equals the mean at x multiplied by e^β.
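The multiplicative interpretation can be checked numerically. The following is a minimal sketch, assuming the model log µ = α + βx; the coefficient values and the function name `loglinear_mean` are hypothetical and chosen only for illustration.

```python
import math

def loglinear_mean(x, alpha, beta):
    """Mean of a Poisson loglinear model: mu = exp(alpha + beta * x)."""
    return math.exp(alpha + beta * x)

# Hypothetical coefficients, chosen only for illustration.
alpha, beta = -3.0, 0.15

# Ratio of the mean at x + 1 to the mean at x, for several values of x.
ratios = [
    loglinear_mean(x + 1, alpha, beta) / loglinear_mean(x, alpha, beta)
    for x in (22.0, 26.0, 30.0)
]

# Each ratio should agree with exp(beta), whatever the starting x.
for r in ratios:
    print(round(r, 6), round(math.exp(beta), 6))
```

The ratio does not depend on x, which is what distinguishes the log link from the identity link, where a 1-unit increase adds a constant to the mean instead.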


4.3.2 Horseshoe Crab Mating Example


The data come from a study of nesting horseshoe crabs. Each female horseshoe crab had a male crab resident in her nest.

AIM: investigate factors affecting whether the female crab had any other males, called satellites, residing nearby.

Explanatory variables: C, the female crab's color; S, spine condition; Wt, weight; W, carapace width.

Outcome: number of satellites (Sa) of a female crab.

For now, we study only W (carapace width).


number of satellites (Sa) = f(W)

Scatter plot: weakly linear? (N = 173)

Grouped plot: to get a clearer picture, we grouped the female crabs into width categories and calculated the sample mean number of satellites for female crabs in each category.

Figure 4.4 plots these sample means against the sample mean width for crabs in each category.

The sample means show a strong increasing trend.

WHY?


SAS code

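/* Read the crab data: color, spine condition, width, weight, satellites */
data table4_3;
  input C S W Wt Sa @@;   /* @@ allows several records per input line */
  cards;
2 3 28.3 3.05 8 3 3 22.5 …
;

/* Poisson GLM with identity link: mu = alpha + beta*W */
proc genmod data=table4_3;
  model Sa = W / dist=poisson link=identity;
  ods output ParameterEstimates=PE1;   /* save estimates for plotting */
run;

/* Poisson loglinear GLM: log(mu) = alpha + beta*W */
proc genmod data=table4_3;
  model Sa = W / dist=poisson link=log;
  ods output ParameterEstimates=PE2;
run;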


/* Store the fitted coefficients of each model in macro variables */
data _NULL_; set PE1;
  if Parameter="Intercept" then call symput("intercp1", Estimate);
  if Parameter="W" then call symput("b1", Estimate);
run;

data _NULL_; set PE2;
  if Parameter="Intercept" then call symput("intercp2", Estimate);
  if Parameter="W" then call symput("b2", Estimate);
run;

/* Evaluate both fitted mean curves on a grid of widths */
data tmp;
  do W=22 to 32 by 0.01;
    mu1=&intercp1 + &b1*W;        /* identity link */
    mu2=exp(&intercp2 + &b2*W);   /* log link */
    output;
  end;
run;
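The grid evaluation in the last data step can be sketched outside SAS to show how the two fitted curves differ. The coefficient values below are hypothetical stand-ins for the macro variables `&intercp1`, `&b1`, `&intercp2` and `&b2`, not estimates taken from these slides, and the function names are illustrative.

```python
import math

# Hypothetical values standing in for &intercp1, &b1 (identity link)
# and &intercp2, &b2 (log link); they are not fitted estimates.
intercp1, b1 = -11.5, 0.55
intercp2, b2 = -3.3, 0.16

def mu_identity(w):
    """Fitted mean under the identity link: a straight line in w."""
    return intercp1 + b1 * w

def mu_log(w):
    """Fitted mean under the log link: always positive."""
    return math.exp(intercp2 + b2 * w)

# An integer loop index avoids accumulating floating-point error
# in the 0.01 step between 22 and 32.
grid = [i / 100 for i in range(2200, 3201)]
curve = [(w, mu_identity(w), mu_log(w)) for w in grid]

print(len(curve))
print(curve[0], curve[-1])
# Outside the plotted range the identity-link line can drop below zero.
print(mu_identity(20.0), mu_log(20.0))
```

With these illustrative values the straight line is negative at a width of 20, while the exponential curve stays positive for every width. This is one reason the log link is the more common choice for a Poisson mean.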


Graphs

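/* Combine the observed data with the fitted curves, ordered by width */
proc sort data=table4_3; by W; run;
data tmp1; merge table4_3 tmp; by W; run;

/* Plot styles: identity-link fit, log-link fit, observed counts */
symbol1 i=join line=1 color=green value=none;
symbol2 i=join line=2 color=red value=none;
symbol3 i=none line=3 value=circle;

proc gplot data=tmp1;
  plot mu1*W mu2*W Sa*W / overlay;
run;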


Group data

/* Assign each crab to a unit-wide width category labeled by its midpoint */
data table4_3a; set table4_3;
  W_g=round(W-0.75)+0.75;
  *if W<23.25 then W_g=22.5;
  *if W>29.25 then W_g=30.5;
run;

/* Per category: number of crabs, total satellites, sample mean and variance */
proc sql;
  create table table4_3g as
  select W_g, count(W_g) as Num_of_Cases,
         sum(Sa) as Num_of_Satellites,
         mean(Sa) as Sa_g, var(Sa) as Var_SA
  from table4_3a group by W_g;
quit;

proc print; run;
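The grouping rule `round(W-0.75)+0.75` and the per-group summary can be sketched as follows. The records below are hypothetical (W, Sa) pairs, not the real data set, and `width_group` is an illustrative name. The sketch assumes that rounding halves upward matches the SAS `round` function for the positive values involved.

```python
import math
from collections import defaultdict
from statistics import mean, variance

def width_group(w):
    """Mimic round(W - 0.75) + 0.75, rounding halves upward."""
    return math.floor(w - 0.75 + 0.5) + 0.75

# Hypothetical (W, Sa) records, used only to exercise the grouping rule.
records = [(28.3, 8), (28.9, 4), (22.5, 0), (23.1, 2), (26.0, 9), (25.6, 3)]

# Collect the satellite counts that fall in each width category.
groups = defaultdict(list)
for w, sa in records:
    groups[width_group(w)].append(sa)

# Number of cases, total, sample mean and sample variance per category;
# the variance is undefined (None) for a category with a single crab.
summary = {
    g: {
        "n": len(v),
        "total": sum(v),
        "mean": mean(v),
        "var": variance(v) if len(v) > 1 else None,
    }
    for g, v in sorted(groups.items())
}

for g, stats in summary.items():
    print(g, stats)
```

Each category label is the midpoint of a unit-wide interval, so widths from 28.25 up to just under 29.25 all map to 28.75.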


SAS output

Obs     W_g   Num_of_Cases   Num_of_Satellites      Sa_g    Var_SA
  1   20.75              1                   0   0.00000         .
  2   21.75              1                   0   0.00000         .
  3   22.75             12                  14   1.16667    3.0606
  4   23.75             14                  20   1.42857    8.8791
  5   24.75             28                  67   2.39286    6.5437
  6   25.75             39                 105   2.69231   11.3765
  7   26.75             22                  63   2.86364    6.8853
  8   27.75             24                  93   3.87500    8.8098
  9   28.75             18                  71   3.94444   16.8791
 10   29.75              9                  53   5.88889    9.8611
 11   30.75              2                   6   3.00000    0.0000
 12   31.75              2                   6   3.00000    2.0000
 13   33.75              1                   7   7.00000         .
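For a Poisson variable the variance equals the mean, so the Sa_g and Var_SA columns above should be roughly equal within each width category if the Poisson assumption held. A minimal sketch of that comparison, using the numbers from the output above; the cutoff of at least 9 crabs per category is an arbitrary choice made here to skip the very small groups.

```python
# (W_g, cases, satellites, sample mean, sample variance) from the SAS output;
# None marks a variance that is missing because the group has one crab.
rows = [
    (20.75, 1, 0, 0.00000, None),
    (21.75, 1, 0, 0.00000, None),
    (22.75, 12, 14, 1.16667, 3.0606),
    (23.75, 14, 20, 1.42857, 8.8791),
    (24.75, 28, 67, 2.39286, 6.5437),
    (25.75, 39, 105, 2.69231, 11.3765),
    (26.75, 22, 63, 2.86364, 6.8853),
    (27.75, 24, 93, 3.87500, 8.8098),
    (28.75, 18, 71, 3.94444, 16.8791),
    (29.75, 9, 53, 5.88889, 9.8611),
    (30.75, 2, 6, 3.00000, 0.0000),
    (31.75, 2, 6, 3.00000, 2.0000),
    (33.75, 1, 7, 7.00000, None),
]

total_cases = sum(n for _, n, _, _, _ in rows)
total_sats = sum(s for _, _, s, _, _ in rows)
overall_mean = total_sats / total_cases

# Variance-to-mean ratio for categories with at least 9 crabs.
ratios = {
    w: var / m
    for w, n, _, m, var in rows
    if n >= 9 and var is not None
}

print(total_cases, total_sats, round(overall_mean, 3))
for w, r in ratios.items():
    print(w, round(r, 2))
```

Every ratio for these categories exceeds 1, so the counts vary more than a Poisson distribution with the same mean would allow.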


Graphs

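/* Attach the grouped sample means to the fitted curves */
data tmp2; merge table4_3g(rename=(W_g=W)) tmp; by W; run;

/* Plot styles: identity-link fit, log-link fit, grouped sample means */
symbol1 i=join line=1 color=green value=none;
symbol2 i=join line=2 color=red value=none;
symbol3 i=none line=3 value=circle;

proc gplot data=tmp2;
  plot mu1*W mu2*W Sa_g*W / overlay;
run;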


4.3.3 Overdispersion for Poisson GLMs


Solution?


4.3.4 Negative binomial GLMs


/* Fit a negative binomial GLM with identity link to account for overdispersion */
proc genmod data=table4_3;
  model Sa = W / dist=NEGBIN link=identity;
  ods output ParameterEstimates=PE3;
run;
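The negative binomial fit above allows the variance to exceed the mean. A brief sketch of the mean-variance relationship, written with a dispersion parameter k; the symbol k is an assumption of notation here, and some texts parameterize the same variance as µ + µ²/k instead.

```latex
E(Y) = \mu, \qquad \operatorname{var}(Y) = \mu + k\mu^{2}, \qquad k > 0 .
```

As k approaches 0 the variance approaches µ, which recovers the Poisson case; larger k corresponds to stronger overdispersion.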


4.3.6 Poisson GLM of independence in I × J contingency tables
