STA 517 – Chp4: Introduction to Generalized Linear Models
4.3 GENERALIZED LINEAR MODELS FOR COUNTS
count data - assume a Poisson distribution
counts in contingency tables with categorical response variables.
modeling count or rate data for a single discrete response variable.
4.3.1 Poisson Loglinear Models
The Poisson distribution has a positive mean µ. Although a GLM can model a positive mean using the
identity link, it is more common to model the log of the mean.
Like the linear predictor, the log mean can take any real value.
The log mean is the natural parameter for the Poisson distribution, and the log link is the canonical link for a Poisson GLM.
A Poisson loglinear GLM assumes a Poisson distribution for Y and uses the log link.
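To see why the log mean is the natural parameter, the Poisson probability mass function can be written in exponential-family form. The symbol f is notation assumed here, not taken from the slides:

```latex
f(y;\mu) = \frac{e^{-\mu}\mu^{y}}{y!}
         = \exp(-\mu)\,\frac{1}{y!}\,\exp\bigl(y \log \mu\bigr),
\qquad y = 0, 1, 2, \ldots
```

The term multiplying y in the exponent is log µ, so the natural parameter is log µ and the canonical link is g(µ) = log µ.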
Log linear model
The Poisson loglinear model with explanatory variable X is $\log \mu = \alpha + \beta x$.
For this model, the mean satisfies the exponential relationship $\mu = \exp(\alpha + \beta x) = e^{\alpha}(e^{\beta})^{x}$.
A 1-unit increase in x has a multiplicative impact of $e^{\beta}$ on µ:
the mean at x+1 equals the mean at x multiplied by $e^{\beta}$.
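The multiplicative effect follows in one step from the loglinear form log µ = α + βx. Writing µ(x) for the mean at x (notation assumed here for illustration):

```latex
\mu(x+1) = \exp\bigl(\alpha + \beta(x+1)\bigr)
         = e^{\beta}\exp(\alpha + \beta x)
         = e^{\beta}\,\mu(x)
```

So β > 0 gives a mean that increases with x, β < 0 a decreasing mean, and β = 0 a mean that does not depend on x. For a purely hypothetical β = 0.1, each 1-unit increase in x multiplies the mean by e^0.1 ≈ 1.105, an increase of about 10.5%.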
4.3.2 Horseshoe Crab Mating Example
A study of nesting horseshoe crabs. Each female horseshoe crab had a male
crab resident in her nest. AIM: investigate factors affecting whether the
female crab had any other males, called satellites, residing nearby.
Explanatory variables: C - the female crab’s color, S - spine condition, Wt - weight, W - carapace width.
Outcome: number of satellites (Sa) of a female crab.
For now, we only study W (carapace width)
number of satellites (Sa) = f (W)
Scatter plot – weakly linear? (N=173)
Grouped plot: To get a clearer picture, we grouped the female crabs into width categories
and calculated the sample mean number of satellites for female crabs in each category.
Figure 4.4 plots these sample means against the sample mean width for crabs in each category.
The sample means show a strong increasing trend.
WHY?
SAS code
data table4_3;
input C S W Wt Sa@@; cards;
2 3 28.3 3.05 8 3 3 22.5 …
;
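/* Poisson GLM with identity link: mean = intercept + b*W */
proc genmod data=table4_3;
model Sa=W/dist=poisson link=identity;
ods output ParameterEstimates=PE1;
run;
/* Poisson loglinear GLM: log(mean) = intercept + b*W */
proc genmod data=table4_3;
model Sa=W/dist=poisson link=log;
ods output ParameterEstimates=PE2;
run;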
data _NULL_; set PE1;
if Parameter="Intercept" then
call symput("intercp1", Estimate);
if Parameter="W" then call symput("b1", Estimate);
data _NULL_; set PE2;
if Parameter="Intercept" then
call symput("intercp2", Estimate);
if Parameter="W" then call symput("b2", Estimate);
run;
data tmp;
do W=22 to 32 by 0.01;
mu1=&intercp1 + &b1*W;
mu2=exp(&intercp2 + &b2*W);
output;
end;
run;
Graphs
proc sort data=table4_3; by W;
data tmp1; merge table4_3 tmp; by W; run;
symbol1 i=join line=1 color=green value=none;
symbol2 i=join line=2 color=red value=none;
symbol3 i=none line=3 value=circle;
proc gplot data=tmp1;
plot mu1*W mu2*W Sa*W / overlay;
run;
Group data
/*group data*/
data table4_3a; set table4_3;
W_g=round(W-0.75)+0.75;
*if W<23.25 then W_g=22.5;
*if W>29.25 then W_g=30.5;
run;
proc sql;
create table table4_3g as
select W_g, count(W_g) as Num_of_Cases,
sum(Sa) as Num_of_Satellites,
mean(Sa) as Sa_g, var(sa) as Var_SA
from table4_3a group by W_g;
quit;
proc print; run;
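The grouping formula W_g=round(W-0.75)+0.75 assigns each width to an interval of length 1 labelled by its midpoint; for example, widths from 23.25 up to (but not including) 24.25 receive W_g=23.75. A minimal sketch for checking this in the SAS log follows. The three test widths and the data step itself are illustrative assumptions, not part of the original slides:

```sas
/* Illustrative check of the grouping rule (test widths are hypothetical). */
data _null_;
  do W = 22.5, 23.25, 28.3;
    /* same formula as in data step table4_3a */
    W_g = round(W - 0.75) + 0.75;
    /* write the width and its group label to the log */
    put W= W_g=;
  end;
run;
```

Each test width should be reported with the midpoint of the interval that contains it.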
SAS output
Obs    W_g    Num_of_Cases    Num_of_Satellites       Sa_g     Var_SA
  1  20.75               1                    0    0.00000          .
  2  21.75               1                    0    0.00000          .
  3  22.75              12                   14    1.16667     3.0606
  4  23.75              14                   20    1.42857     8.8791
  5  24.75              28                   67    2.39286     6.5437
  6  25.75              39                  105    2.69231    11.3765
  7  26.75              22                   63    2.86364     6.8853
  8  27.75              24                   93    3.87500     8.8098
  9  28.75              18                   71    3.94444    16.8791
 10  29.75               9                   53    5.88889     9.8611
 11  30.75               2                    6    3.00000     0.0000
 12  31.75               2                    6    3.00000     2.0000
 13  33.75               1                    7    7.00000          .
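Under a Poisson distribution the variance equals the mean, so within each width group Var_SA should be close to Sa_g. A minimal sketch for comparing the two from the grouped table follows. The data set name table4_3r and the variable Ratio are hypothetical names introduced here for illustration:

```sas
/* Sketch: ratio of sample variance to sample mean in each width group. */
/* table4_3r and Ratio are hypothetical names, not from the slides.     */
data table4_3r;
  set table4_3g;
  /* skip groups with a zero mean or a missing variance (single crab) */
  if Sa_g > 0 and Var_SA ne . then Ratio = Var_SA / Sa_g;
run;
proc print data=table4_3r;
  var W_g Num_of_Cases Sa_g Var_SA Ratio;
run;
```

From the output above, the group with W_g=25.75 has a ratio of 11.3765/2.69231, about 4.2, and every group with at least 9 crabs has a ratio above 1, which suggests more variability than a Poisson model allows.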
Graphs
data tmp2; merge table4_3g(rename=(W_g=W)) tmp; by W; run;
symbol1 i=join line=1 color=green value=none;
symbol2 i=join line=2 color=red value=none;
symbol3 i=none line=3 value=circle;
proc gplot data=tmp2;
plot mu1*W mu2*W Sa_g*W / overlay;
run;
4.3.3 Overdispersion for Poisson GLMs
Solution?
4.3.4 Negative binomial GLMs
/*fit negative binomial with identity link to account for overdispersion*/
proc genmod data=table4_3;
model Sa=W/dist=NEGBIN link=identity;
ods output ParameterEstimates=PE3;
run;
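For comparison with the Poisson loglinear fit, the same negative binomial model can also be fitted with the log link. This is a sketch only; the slides show the identity-link fit, and the output data set name PE4 is a hypothetical name chosen to follow PE1 to PE3:

```sas
/* Sketch: negative binomial GLM with the log link, for comparison. */
/* PE4 is a hypothetical data set name, following PE1-PE3 above.    */
proc genmod data=table4_3;
  model Sa=W / dist=negbin link=log;
  ods output ParameterEstimates=PE4;
run;
```

In the PROC GENMOD parameterization the negative binomial variance is µ + kµ², where k is the dispersion parameter reported in the parameter estimates table; an estimate of k well above 0 is consistent with overdispersion relative to the Poisson.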
4.3.6 Poisson GLM of independence in I × J contingency tables