Analysis of Variance
Analysis of variance and regression course
http://staff.pubhealth.ku.dk/~lts/regression10_2/index.html
Marc Andersen, mja@statgroup.dk
Analysis of variance and regression for health researchers, November 25, 2010
1 / 68
Outline
Comparison of several groups
Model checking
Two-way ANOVA
Interaction
Advanced designs
2 / 68
Acknowledgements
Written by Lene Theil Skovgaard (1), 2006, 2007
Updated by Julie Lyng Forman (1), 2008, November 2009
Updated by Marc Andersen (2), April 2009, April 2010, November 2010
(1) Dept. of Biostatistics
(2) StatGroup
3 / 68
Comparison of 2 or more groups
Number of groups   Different individuals          Same individual
2                  unpaired t-test                paired t-test
≥2                 one-way analysis of variance   two-way analysis of variance
4 / 68
One-way analysis of variance
◮ Do the distributions differ between the groups?
◮ Do the levels differ between the groups?
5 / 68
Example: ventilation during anaesthesia
Data: 22 bypass patients randomised to 3 different kinds of ventilation during anaesthesia
Outcome: measurement of red cell folate

Group I    50% N2O, 50% O2 for 24 hours
Group II   50% N2O, 50% O2 during operation
Group III  30–50% O2 (no N2O) for 24 hours

        Gr.I    Gr.II   Gr.III
n       8       9       5
Mean    316.6   256.4   278.0
SD      58.7    37.1    33.8
6 / 68
Example: ventilation during anaesthesia
[Figure: red cell folate (axis 200 to 400) plotted by group (I, II, III)]
7 / 68
One-way ANOVA
One-way
because we only have one criterion for classification of the observations, here ventilation method

ANalysis Of VAriance
because we compare the variance between groups with the variance within groups
8 / 68
The one-way ANOVA model
Notation
The j'th observation from group i is described by:

$Y_{ij} = \mu_i + \varepsilon_{ij}$

where $Y_{ij}$ is the j'th observation in group no. i, $\mu_i$ is the mean of group i, and $\varepsilon_{ij}$ is the individual deviation,

i.e. as consisting of the mean of the group plus an individual deviation, with $\varepsilon_{ij} \sim N(0, \sigma^2)$ or equivalently $Y_{ij} \sim N(\mu_i, \sigma^2)$.

Assumptions
Observations are assumed to be independent and to follow a normal distribution with mean $\mu_i$ within group i, with the same variance $\sigma^2$ in all groups.
Model assumptions should be investigated!
9 / 68
Hypothesis testing
Investigate difference between groups
◮ Null hypothesis: group means are equal, $H_0: \mu_i = \mu$ for all i
◮ Alternative hypothesis: group means are not all equal
◮ We conclude that the means are not equal when we reject the null hypothesis of equality (ref DGA, 8.5 Hypothesis Testing)
10 / 68
ANOVA math: Sums of squares
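Decomposition of 'deviation from grand mean'

$y_{ij} - \bar{y}_{\cdot} = (y_{ij} - \bar{y}_i) + (\bar{y}_i - \bar{y}_{\cdot})$

Decomposition of variation (sums of squares)

$\underbrace{\sum_{i,j} (y_{ij} - \bar{y}_{\cdot})^2}_{\text{total variation}} = \underbrace{\sum_{i,j} (y_{ij} - \bar{y}_i)^2}_{\text{within groups}} + \underbrace{\sum_{i,j} (\bar{y}_i - \bar{y}_{\cdot})^2}_{\text{between groups}}$

$y_{ij}$: j'th observation in i'th group
$\bar{y}_i$: average in i'th group
$\bar{y}_{\cdot}$: overall average, or 'grand mean'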
11 / 68
Decomposition of variation
total = between + within

$SS_{total} = SS_{between} + SS_{within}$

$(n - 1) = (k - 1) + (n - k)$

where n is the total number of observations and k is the number of groups.

F-test statistic

$F = \frac{MS_{between}}{MS_{within}} = \frac{SS_{between}/(k - 1)}{SS_{within}/(n - k)}$

Hypothesis test
Reject the null hypothesis if F is large, i.e. if the variation between groups is too large compared to the variation within groups.
12 / 68
Analysis of variance table
ANOVA table
Variation   df      SS      MS        F         P
Between     k − 1   SSb     SSb/dfb   MSb/MSw   P(F(dfb, dfw) > Fobs)
Within      n − k   SSw     SSw/dfw
Total       n − 1   SStot

F test statistic
Under the null hypothesis, the F test statistic follows an F-distribution with dfb and dfw degrees of freedom: Fobs ∼ F(dfb, dfw).
13 / 68
Analysis of variance table - Anaesthesia example
ANOVA table

          df   SS         MS       F      P
Between   2    15515.77   7757.9   3.71   0.04
Within    19   39716.09   2090.3
Total     21   55231.86

F test statistic

F = 3.71 ∼ F(2, 19) ⇒ P = 0.04

Interpretation
Weak evidence of non-equality of the three means
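As a rough cross-check of the table above, the sums of squares can be rebuilt from the group summary statistics (n, mean, SD) given on the example slide. This is only a sketch: the inputs are rounded, so the results differ slightly from the SAS values, and the closed-form P-value used here, $(1 + 2F/df_w)^{-df_w/2}$, is valid only when the numerator has exactly 2 degrees of freedom (three groups). The variable names are illustrative.

```python
# Summary statistics from the ventilation example: (n, mean, SD) per group
groups = {
    "I": (8, 316.6, 58.7),
    "II": (9, 256.4, 37.1),
    "III": (5, 278.0, 33.8),
}

n_total = sum(n for n, _, _ in groups.values())
k = len(groups)
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total

# Between-group variation: group means around the grand mean, weighted by n
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
# Within-group variation: recovered from each group's SD
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())

df_between, df_within = k - 1, n_total - k
f_obs = (ss_between / df_between) / (ss_within / df_within)

# Upper tail of F(2, df_within); this closed form holds only for df_between == 2
p_value = (1 + 2 * f_obs / df_within) ** (-df_within / 2)

print(f"SS between = {ss_between:.2f}, SS within = {ss_within:.2f}")
print(f"F = {f_obs:.2f} on ({df_between}, {df_within}) df, P = {p_value:.4f}")
```

The small discrepancies from the SAS sums of squares come from the means and SDs being rounded to one decimal.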
14 / 68
Analysis of variance in SAS
To define the anaesthesia data in SAS, we write

data ex_redcell;
  input grp redcell;
  cards;
1 243
1 251
1 275
. .
. .
. .
3 293
3 328
;
run;

The variable redcell contains all the measurements of the outcome and grp contains the method of ventilation for each individual.
15 / 68
Analysis of variance program
proc glm data=ex_redcell;
  class grp;
  model redcell=grp / solution;
run;

General Linear Models Procedure
Dependent Variable: REDCELL

                        Sum of        Mean
Source            DF    Squares       Square      F Value   Pr > F
Model             2     15515.7664    7757.8832   3.71      0.0436
Error             19    39716.0972    2090.3209
Corrected Total   21    55231.8636

R-Square   C.V.       Root MSE   REDCELL Mean
0.280921   16.14252   45.7200    283.227

Source   DF   Type I SS     Mean Square   F Value   Pr > F
GRP      2    15515.7664    7757.8832     3.71      0.0436

Source   DF   Type III SS   Mean Square   F Value   Pr > F
GRP      2    15515.7664    7757.8832     3.71      0.0436
16 / 68
Parameter estimates
The option solution outputs parameter estimates
                                T for H0:                  Std Error of
Parameter    Estimate           Parameter=0    Pr > |T|    Estimate
INTERCEPT    278.0000000 B      13.60          0.0001      20.44661784
GRP 1         38.6250000 B       1.48          0.1548      26.06442584
    2        -21.5555556 B      -0.85          0.4085      25.50141290
    3          0.0000000 B        .             .            .

NOTE: The X'X matrix has been found to be singular and a generalized inverse was used to solve the normal equations. Estimates followed by the letter 'B' are biased, and are not unique estimators of the parameters.

◮ Group 3 (the last group) is the reference group
◮ The estimates for the other groups refer to differences to this reference group
17 / 68
PROC glm box plot
18 / 68
Interpreting the estimates
◮ What is the scientific question?
◮ Clinical significance
◮ Statistical significance
◮ Provide confidence interval
◮ Does it make sense?
19 / 68
Multiple comparisons
The F-test shows that there is a difference, but where?

Pairwise t-tests are not suitable due to the risk of mass significance.

A significance level of α = 0.05 means a 5% chance of wrongfully rejecting a true hypothesis (type I error).

The chance of at least one type I error goes up with the number of tests.

(For k groups, we have m = k(k − 1)/2 possible pairwise tests; the actual significance level can be as bad as $1 - (1 - \alpha)^m$, e.g. for k = 5: m = 10, giving 0.40.)
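The worst-case figure quoted above can be reproduced with a few lines. This is a minimal sketch of the formula $1 - (1 - \alpha)^m$, which assumes the m tests are independent; the function name is illustrative.

```python
def familywise_error(k, alpha=0.05):
    """Return (m, level): the number of pairwise tests among k groups and
    the chance of at least one type I error if the tests were independent."""
    m = k * (k - 1) // 2
    return m, 1 - (1 - alpha) ** m


for k in (3, 5, 10):
    m, level = familywise_error(k)
    print(f"k = {k:2d}: m = {m:2d} tests, actual level up to {level:.2f}")
```

For the three ventilation groups (k = 3, m = 3) the level is already about 0.14 rather than 0.05.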
20 / 68
Addressing multiplicity
There is no completely satisfactory solution.

Approximate solutions

1. Select a (small) number of relevant comparisons in the planning stage.
2. Make a graph of the average ± 2 × SEM and judge visually (!), perhaps supplemented with F-tests on subsets of groups.
3. Modify the t-tests by multiplying the P-values with the number of tests, the so-called Bonferroni correction (conservative).
4. Use a correction for multiple testing (Dunnett, Tukey) or a (prespecified) multiple testing procedure.
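The Bonferroni correction in point 3 amounts to one multiplication per P-value. The sketch below uses made-up unadjusted P-values purely for illustration (they are not taken from the ventilation data), and caps the adjusted values at 1.

```python
def bonferroni(p_values):
    """Multiply each P-value by the number of tests, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical unadjusted P-values for three pairwise comparisons
raw = [0.012, 0.15, 0.40]
adjusted = bonferroni(raw)

for p, p_adj in zip(raw, adjusted):
    print(f"unadjusted P = {p:.3f}  ->  Bonferroni P = {p_adj:.3f}")
```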
21 / 68
Tukey: multiple comparisons in SAS
proc glm data=ex_redcell;
  class grp;
  model redcell=grp / solution;
  lsmeans grp / adjust=tukey pdiff cl;
run;

The GLM Procedure
Least Squares Means
Adjustment for Multiple Comparisons: Tukey-Kramer

Least Squares Means for effect grp
Pr > |t| for H0: LSMean(i)=LSMean(j)
Dependent Variable: redcell

i/j    1         2         3
1                0.0355    0.3215
2      0.0355              0.6802
3      0.3215    0.6802

Least Squares Means for Effect grp

             Difference      Simultaneous 95%
             Between         Confidence Limits for
i    j       Means           LSMean(i)-LSMean(j)
1    2        60.180556        3.742064    116.619047
1    3        38.625000      -27.590379    104.840379
2    3       -21.555556      -86.340628     43.229517
22 / 68
Visual assessment (1/3)
The bars represent 95 % confidence intervals for the means using the standard deviation for each group (std2mjt in the symbol1 statement).

proc gplot data=ex_redcell;
  plot redcell*grp / haxis=axis1 vaxis=axis2 frame;
  axis1 order=(1 to 3 by 1) offset=(8,8)
        label=(H=3) value=(H=2) minor=NONE;
  axis2 offset=(1,1) value=(H=2) minor=NONE
        label=(A=90 R=0 H=3);
  symbol1 v=circle i=std2mjt l=1 h=2 w=2;
run;
[Figure: red cell folate (axis 200 to 400) by group (I, II, III), with bars from the group-specific standard deviations]
23 / 68
Visual assessment (2/3)
The bars represent 95 % confidence intervals for the means using the pooled standard deviation for each group (std2mpjt in the symbol1 statement).

proc gplot data=ex_redcell;
  plot redcell*grp / haxis=axis1 vaxis=axis2 frame;
  axis1 order=(1 to 3 by 1) offset=(8,8)
        label=(H=3) value=(H=2) minor=NONE;
  axis2 offset=(1,1) value=(H=2) minor=NONE
        label=(A=90 R=0 H=3);
  symbol1 v=circle i=std2mpjt l=1 h=2 w=2;
run;
[Figure: red cell folate (axis 200 to 400) by group (I, II, III), with bars from the pooled standard deviation]
24 / 68
Visual assessment (3/3)
The bars represent 95 % confidence intervals for the means using the pooled standard deviation for each group obtained from PROC glm.
25 / 68
Model checking
Check if the assumptions are reasonable. (If not, the analysis is unreliable!)

◮ Variance homogeneity may be checked by performing Levene's test (or Bartlett's test).
◮ In case of variance inhomogeneity, we may also perform a weighted analysis (Welch's test), just as in the t-test.
◮ Normality may be checked through probability plots (or histograms) of residuals, or by a numerical test on the residuals.
◮ In case of non-normality, we may use the nonparametric Kruskal-Wallis test.

Transformation (often logarithms) may help to achieve variance homogeneity as well as normality.
26 / 68
Check of variance homogeneity and normality in SAS
proc glm data=ex_redcell;
  class grp;
  model redcell=grp;
  means grp / hovtest=levene welch;
  output out=model p=predicted r=residual;
run;

Store residuals in a dataset for further model checking

proc univariate data=model normal;
  var residual;
  histogram residual / normal(mu=0);
  ppplot residual / normal(mu=0) square;
run;
27 / 68
Output from proc glm: Test for variance homogeneity
Levene's Test for Homogeneity of redcell Variance
ANOVA of Squared Deviations from Group Means

               Sum of      Mean
Source   DF    Squares     Square     F Value   Pr > F
grp      2     18765720    9382860    4.14      0.0321
Error    19    43019786    2264199

Weighted ANOVA in case of variance heterogeneity:

Welch's ANOVA for redcell

Source   DF        F Value   Pr > F
grp      2.0000    2.97      0.0928
Error    11.0646

So we are not too sure concerning the group differences...
28 / 68
Test for normality
Output from proc univariate
Tests for Normality
Test                  --Statistic---     -----p Value-----
Shapiro-Wilk          W      0.965996    Pr < W      0.6188
Kolmogorov-Smirnov    D      0.107925    Pr > D      >0.1500
Cramer-von Mises      W-Sq   0.043461    Pr > W-Sq   >0.2500
Anderson-Darling      A-Sq   0.263301    Pr > A-Sq   >0.2500

The 4 tests focus on different aspects of non-normality.

◮ For small data sets, we rarely get significance
◮ For large data sets, we almost always get significance
◮ Could look at a probability plot instead
29 / 68
Output from proc univariate: Histogram and probability plot
30 / 68
Non-parametric ANOVA, the Kruskal-Wallis test
SAS code
proc npar1way wilcoxon;
  exact;
  class grp;
  var redcell;
run;

Wilcoxon Scores (Rank Sums) for Variable redcell
Classified by Variable grp

             Sum of    Expected    Std Dev      Mean
grp    N     Scores    Under H0    Under H0     Score
-------------------------------------------------------
1      8     120.0      92.00      14.651507    15.000000
2      9      77.0     103.50      14.974979     8.555556
3      5      56.0      57.50      12.763881    11.200000

Kruskal-Wallis Test
Chi-Square                    4.1852
DF                            2
Asymptotic Pr > Chi-Square    0.1234
Exact Pr >= Chi-Square        0.1233
Again, we have ’lost’ the significance....
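The Kruskal-Wallis statistic can be recomputed by hand from the rank sums in the listing, using $H = \frac{12}{N(N+1)} \sum_i R_i^2/n_i - 3(N+1)$. The sketch below applies no correction for ties (an assumption; it nevertheless reproduces the printed value here) and uses the fact that the upper tail of a chi-square distribution with 2 degrees of freedom is $e^{-H/2}$, so the P-value line only works for three groups.

```python
import math

# Group sizes and rank sums copied from the Wilcoxon scores listing
rank_sums = {1: (8, 120.0), 2: (9, 77.0), 3: (5, 56.0)}

n_total = sum(n for n, _ in rank_sums.values())

# Kruskal-Wallis statistic without tie correction
h = (12 / (n_total * (n_total + 1))
     * sum(r ** 2 / n for n, r in rank_sums.values())
     - 3 * (n_total + 1))

# Asymptotic P-value: chi-square upper tail, closed form for 2 df only
df = len(rank_sums) - 1
p_asymptotic = math.exp(-h / 2)

print(f"H = {h:.4f} on {df} df, asymptotic P = {p_asymptotic:.4f}")
```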
31 / 68
Two-way analysis of variance
Two criteria for subdividing observations, A and B

Data in two-way layout:

           B
A      1     2     · · ·    c
1                  · · ·
2                  · · ·
...
r                  · · ·

◮ Effect of both factors
◮ Perhaps even interaction (effect modification)

One factor may be 'individuals' or "experimental units" (e.g. different treatments tried on the same person)
32 / 68
Repeated measurements
Example: Short term effect of enalaprilate on heart rate
           Time
Subject    0        30       60       120      average
1          96       92       86       92       91.50
2          110      106      108      114      109.50
3          89       86       85       83       85.75
4          95       78       78       83       83.50
5          128      124      118      118      122.00
6          100      98       100      94       98.00
7          72       68       67       71       69.50
8          79       75       74       74       75.50
9          100      106      104      102      103.00
average    96.56    92.56    91.11    92.33    93.14
33 / 68
Line plot (“Spaghettiogram”)
Ideally the time courses are parallel.
34 / 68
The additive model
The two effects (subject s and time t) work in an additive way.

$Y_{st} = \mu + \alpha_s + \beta_t + \varepsilon_{st}$

The $\varepsilon_{st}$'s are assumed to be independent and normally distributed with mean 0 and identical variances, $\varepsilon_{st} \sim N(0, \sigma^2)$.
(This assumption should be investigated!)

Variational decomposition:

$SS_{total} = SS_{subject} + SS_{time} + SS_{residual}$
35 / 68
Analysis of variance table - enalaprilate example
           df    SS        MS        F        P
Subjects   8     8966.6    1120.8    90.64    <0.0001
Times      3     151.0     50.3      4.07     0.0180
Residual   24    296.8     12.4
Total      35    9414.3

◮ Highly significant difference between subjects (not very interesting)
◮ Significant time differences.
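Because the design is balanced (one observation per subject and time), the table above can be reproduced directly from the heart-rate data using row means, column means and the grand mean. The following is a sketch of that decomposition; the variable names are illustrative, and no P-values are computed.

```python
# Heart rate per subject at times 0, 30, 60, 120 (from the data table)
times = [0, 30, 60, 120]
hrate = {
    1: [96, 92, 86, 92],
    2: [110, 106, 108, 114],
    3: [89, 86, 85, 83],
    4: [95, 78, 78, 83],
    5: [128, 124, 118, 118],
    6: [100, 98, 100, 94],
    7: [72, 68, 67, 71],
    8: [79, 75, 74, 74],
    9: [100, 106, 104, 102],
}

rows = list(hrate.values())
n_subj, n_time = len(rows), len(times)
grand = sum(sum(r) for r in rows) / (n_subj * n_time)
subj_means = [sum(r) / n_time for r in rows]
time_means = [sum(r[t] for r in rows) / n_subj for t in range(n_time)]

# Sums of squares for the additive model (balanced design)
ss_total = sum((y - grand) ** 2 for r in rows for y in r)
ss_subject = n_time * sum((m - grand) ** 2 for m in subj_means)
ss_time = n_subj * sum((m - grand) ** 2 for m in time_means)
ss_resid = ss_total - ss_subject - ss_time

df_subject, df_time = n_subj - 1, n_time - 1
df_resid = df_subject * df_time
ms_resid = ss_resid / df_resid
f_subject = (ss_subject / df_subject) / ms_resid
f_time = (ss_time / df_time) / ms_resid

print(f"Subjects: SS = {ss_subject:.1f}, F = {f_subject:.2f}")
print(f"Times:    SS = {ss_time:.1f}, F = {f_time:.2f}")
print(f"Residual: SS = {ss_resid:.1f} on {df_resid} df")
```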
36 / 68
Two-way ANOVA in SAS
proc glm data=ex_pulse;
  class subject times;
  model hrate=subject times / solution;
run;

General Linear Models Procedure
Class Level Information

Class      Levels    Values
SUBJECT    9         1 2 3 4 5 6 7 8 9
TIMES      4         0 30 60 120

Number of observations in data set = 36
37 / 68
Two-way ANOVA output
General Linear Models Procedure
Dependent Variable: HRATE

                        Sum of        Mean
Source            DF    Squares       Square        F Value   Pr > F
Model             11    9117.52778    828.86616     67.03     0.0001
Error             24    296.77778     12.36574
Corrected Total   35    9414.30556

R-Square   C.V.       Root MSE   HRATE Mean
0.968476   3.775539   3.51650    93.1389

Source    DF   Type I SS     Mean Square   F Value   Pr > F
SUBJECT   8    8966.55556    1120.81944    90.64     0.0001
TIMES     3    150.97222     50.32407      4.07      0.0180

Source    DF   Type III SS   Mean Square   F Value   Pr > F
SUBJECT   8    8966.55556    1120.81944    90.64     0.0001
TIMES     3    150.97222     50.32407      4.07      0.0180
38 / 68
Parameter estimates
                                 T for H0:                  Std Error of
Parameter      Estimate          Parameter=0    Pr > |T|    Estimate
INTERCEPT      102.1944444 B      50.34         0.0001      2.03024963
SUBJECT 1      -11.5000000 B      -4.62         0.0001      2.48653783
        2        6.5000000 B       2.61         0.0152      2.48653783
        3      -17.2500000 B      -6.94         0.0001      2.48653783
        4      -19.5000000 B      -7.84         0.0001      2.48653783
        5       19.0000000 B       7.64         0.0001      2.48653783
        6       -5.0000000 B      -2.01         0.0557      2.48653783
        7      -33.5000000 B     -13.47         0.0001      2.48653783
        8      -27.5000000 B     -11.06         0.0001      2.48653783
        9        0.0000000 B        .            .           .
TIMES   0        4.2222222 B       2.55         0.0177      1.65769189
        30       0.2222222 B       0.13         0.8945      1.65769189
        60      -1.2222222 B      -0.74         0.4681      1.65769189
        120      0.0000000 B        .            .           .

NOTE: The X'X matrix has been found to be singular and a generalized inverse was used to solve the normal equations. Estimates followed by the letter 'B' are biased, and are not unique estimators of the parameters.

◮ Subject 9 at time 120 minutes is the reference
39 / 68
Expected values and residuals
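Expected value for subject=3, times=30

$\hat{y}_{st} = \hat{\mu} + \hat{\alpha}_s + \hat{\beta}_t$
$= 102.19 - 17.25 + 0.22$
$= 85.16$

Residuals

$r_{st} = \text{observed} - \text{expected} = y_{st} - \hat{y}_{st} \approx \varepsilon_{st}$

Residual for subject 3, time 30: $r_{32} = 86 - 85.16 = 0.84$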
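The same calculation can be scripted from the SAS parameter estimates. This sketch copies only the estimates needed for subject 3 at time 30; with unrounded estimates the expected value is 85.17 and the residual 0.83, so the 85.16 and 0.84 above reflect rounding of the inputs to two decimals.

```python
# Parameter estimates copied from the SAS output (reference: subject 9, time 120)
intercept = 102.1944444
subject_effect = {3: -17.25, 9: 0.0}
time_effect = {0: 4.2222222, 30: 0.2222222, 60: -1.2222222, 120: 0.0}

observed = 86  # heart rate for subject 3 at time 30

# Expected value under the additive model, and the corresponding residual
expected = intercept + subject_effect[3] + time_effect[30]
residual = observed - expected

print(f"expected = {expected:.2f}, residual = {residual:.2f}")
```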
40 / 68
Model checking
Look for:
◮ Differences in variances (systematic?)
◮ Non-normality
◮ Lack of additivity (interaction). Can only be tested if there is more than one observation for each combination
◮ Serial correlation? (Neighboring observations look more alike)
41 / 68
Residual based diagnostics
Use the residuals for model checking
◮ Probability plot of residuals.
◮ Plot residuals vs expected values.
◮ Plot residuals vs group.
◮ Look for outliers (a large residual means observed and expected values deviate a lot).
42 / 68
Enalaprilate example
No systematic patterns should be present.
43 / 68
Interaction
Example of two criteria for subdividing individuals: sex and smoking habits
Outcome: FEV1
Here, we see an interaction between sex and smoking.
44 / 68
Possible explanations for interaction
◮ Biologically different effects of smoking on males and females
◮ Perhaps the women do not smoke as much as the men
◮ Perhaps the effect is relative (to be expressed in %)
45 / 68
Example: The effect of smoking on birth weight
46 / 68
Example: The effect of smoking on birth weight
47 / 68
Interpreting interaction
◮ There is an effect of smoking, but only for those who have been smoking for a long time.
◮ There is an effect of duration, and this effect increases with the amount of smoking.

The effect of duration depends upon the amount of smoking,
and the effect of amount depends upon the duration of smoking.
48 / 68
Example: Fibrinogen after spleen operation
34 rats are randomized, in 2 ways
◮ 17 have their spleen removed (splenectomy=yes/no)
◮ 8/17 in each group are kept in altitude chambers (15,000 ft) (place=altitude/control)

Outcome
Fibrinogen level in mg% at day 21
49 / 68
Example: Fibrinogen after spleen operation
[Figure: fibrinogen (axis 100 to 600) plotted by group (no_altitude, no_control, yes_altitude, yes_control)]
50 / 68
ANOVA model with interaction
The usual additive model:
$Y_{spr} = \mu + \alpha_s + \beta_p + \varepsilon_{spr}, \quad \varepsilon_{spr} \sim N(0, \sigma^2)$

splenectomy (s=yes/no) and place (p=altitude/control) have an additive effect.

Model with interaction

$Y_{spr} = \mu + \alpha_s + \beta_p + \gamma_{sp} + \varepsilon_{spr}, \quad \varepsilon_{spr} \sim N(0, \sigma^2)$

Here, we specify an interaction between splenectomy and place, i.e. the effect of living at a high altitude may be thought to depend upon whether or not you have an intact spleen, and vice versa.
51 / 68
Two-way ANOVA with interaction in SAS
proc glm data=ex_fibrinogen;
  class splenectomy place;
  model fibrinogen=place splenectomy place*splenectomy / solution;
  output out=model p=predicted r=residual;
run;

The GLM Procedure

Class Level Information

Class         Levels    Values
splenectomy   2         no yes
place         2         altitude control

Number of observations 34
52 / 68
Output: two-way ANOVA table
Dependent Variable: fibrinogen
                        Sum of
Source            DF    Squares        Mean Square   F Value   Pr > F
Model             3     139439.2067    46479.7356    8.32      0.0004
Error             30    167573.7639    5585.7921
Corrected Total   33    307012.9706

R-Square   Coeff Var   Root MSE   fibrinogen Mean
0.454180   20.99213    74.73816   356.0294

Source              DF   Type I SS      Mean Square    F Value   Pr > F
place               1    67925.25531    67925.25531    12.16     0.0015
splenectomy         1    69662.38235    69662.38235    12.47     0.0014
splenectomy*place   1    1851.56904     1851.56904     0.33      0.5691

Source              DF   Type III SS    Mean Square    F Value   Pr > F
place               1    67925.25531    67925.25531    12.16     0.0015
splenectomy         1    68093.92198    68093.92198    12.19     0.0015
splenectomy*place   1    1851.56904     1851.56904     0.33      0.5691
53 / 68
Output: Parameter estimates
                                                    Standard
Parameter                        Estimate           Error          t Value   Pr > |t|
Intercept                        261.6666667 B      24.91271904    10.50     <.0001
place altitude                   104.3333333 B      36.31621657     2.87     0.0074
place control                      0.0000000 B        .              .        .
splenectomy no                   104.4444444 B      35.23190514     2.96     0.0059
splenectomy yes                    0.0000000 B        .              .        .
splenectomy*place no altitude    -29.5694444 B      51.35888601    -0.58     0.5691
splenectomy*place no control       0.0000000 B        .              .        .
splenectomy*place yes altitude     0.0000000 B        .              .        .
splenectomy*place yes control      0.0000000 B        .              .        .

NOTE: The X'X matrix has been found to be singular, and a generalized inverse was used to solve the normal equations. Terms whose estimates are followed by the letter 'B' are not uniquely estimable.
54 / 68
Computing expected values
The reference levels are place=control, splenectomy=yes (SAS chooses the last level in alphabetical order as the reference level),

so the expected fibrinogen level for these animals is the intercept, 261.67.

For all other groups, we have to add one or more extra estimates, as shown in the table below:
55 / 68
Expected fibrinogen levels
              place
splenectomy   control                      altitude
yes           261.67                       261.67 + 104.33 = 366.00
no            261.67 + 104.44 = 366.11     261.67 + 104.44 + 104.33 - 29.57 = 440.87

Note: the expected value for splenectomy=no, place=altitude is affected by rounding of the estimates.
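The table of expected values can also be generated from the unrounded parameter estimates. In this sketch every parameter that SAS sets to zero (the reference levels) is simply left out of the interaction lookup; the helper name `expected` is illustrative. With unrounded estimates the last cell comes out as 440.875, which explains the rounding note.

```python
# Estimates copied from the SAS output (reference: splenectomy=yes, place=control)
intercept = 261.6666667
place_effect = {"altitude": 104.3333333, "control": 0.0}
splenectomy_effect = {"no": 104.4444444, "yes": 0.0}
# Only one interaction parameter is non-zero; all others are reference levels
interaction = {("no", "altitude"): -29.5694444}


def expected(splenectomy, place):
    """Expected fibrinogen level (mg%) for one combination of the two factors."""
    return (intercept
            + splenectomy_effect[splenectomy]
            + place_effect[place]
            + interaction.get((splenectomy, place), 0.0))


for s in ("yes", "no"):
    for p in ("control", "altitude"):
        print(f"splenectomy={s:3s} place={p:8s} expected = {expected(s, p):.2f}")
```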
56 / 68
Model checking
Variance homogeneity may be judged from a one-way ANOVA

The GLM Procedure
Class Level Information

Class   Levels   Values
group   4        no_altitude no_control yes_altitude yes_control

Number of observations 34

Levene's Test for Homogeneity of fibrinogen Variance
ANOVA of Squared Deviations from Group Means

               Sum of      Mean
Source   DF    Squares     Square      F Value   Pr > F
group    3     1.9078E8    63594756    1.55      0.2222
Error    30    1.2314E9    41045352

No reason to suspect inhomogeneity
57 / 68
Normality assumption for residuals
Result from proc univariate (option normal)

Tests for Normality

Test                  --Statistic---     -----p Value------
Shapiro-Wilk          W      0.964518    Pr < W      0.3276
Kolmogorov-Smirnov    D      0.126665    Pr > D      >0.1500
Cramer-von Mises      W-Sq   0.091627    Pr > W-Sq   0.1424
Anderson-Darling      A-Sq   0.490958    Pr > A-Sq   0.2140

Conclusion
No reason to suspect non-normality
58 / 68
Model simplification
In the two-way ANOVA, the interaction was not significant (P=0.57), so we omit it from the model:

proc glm data=ex_fibrinogen;
  class splenectomy place;
  model fibrinogen=place splenectomy / solution clparm;
run;
Dependent Variable: fibrinogen
                        Sum of
Source            DF    Squares        Mean Square   F Value   Pr > F
Model             2     137587.6377    68793.8188    12.59     <.0001
Error             31    169425.3329    5465.3333
Corrected Total   33    307012.9706

R-Square   Coeff Var   Root MSE   fibrinogen Mean
0.448149   20.76455    73.92789   356.0294

Source        DF   Type III SS    Mean Square    F Value   Pr > F
place         1    67925.25531    67925.25531    12.43     0.0013
splenectomy   1    69662.38235    69662.38235    12.75     0.0012
59 / 68
Assessing the main effects
                                     Standard
Parameter         Estimate           Error          t Value   Pr > |t|
Intercept         268.6241830 B      21.54935559    12.47     <.0001
place altitude     89.5486111 B      25.40104253     3.53     0.0013
place control       0.0000000 B        .              .        .
splenectomy no     90.5294118 B      25.35705800     3.57     0.0012
splenectomy yes     0.0000000 B        .              .        .

Parameter         95% Confidence Limits
Intercept         224.6739825    312.5743835
place altitude     37.7428433    141.3543789
place control       .              .
splenectomy no     38.8133510    142.2454725
splenectomy yes     .              .

◮ Removal of the spleen leads to a decrease in fibrinogen of approx. 90.53 mg% at day 21
◮ Placing in altitude leads to an increase in fibrinogen of approx. 89.55 mg% at day 21
60 / 68
Residual plots
[Figures: a residual plot for assessing normality, and residuals (axis -200 to 200) plotted against expected values (axis 260 to 460) for assessing variance homogeneity]
61 / 68
More complicated analyses of variances
◮ Three- or more-sided analysis of variance.
◮ Latin squares

       1   2   3
I      A   B   C
II     B   C   A
III    C   A   B

(Cochran & Cox (1957): Experimental Designs, 2.ed., Wiley)

◮ Cross-over designs
◮ Variance component models
62 / 68
Example of a latin square: A rabbit experiment
63 / 68
Example of a latin square: A rabbit experiment
◮ 6 rabbits
◮ Vaccination at 6 different spots on the back
◮ 6 different orders of vaccination
◮ Swelling is area of blister (cm²)

spot   rabbit   order   swelling
1      1        3       7.9
1      2        5       8.7
1      3        4       7.4
1      4        1       7.4
.
.
6      4        4       5.8
6      5        1       6.4
6      6        3       7.7
64 / 68
Illustrations
[Figures: swelling (axis 5 to 10) plotted against spot (a to f), and swelling (axis 5 to 10) plotted against order (1 to 6); points are labelled with the digits 1 to 6]
65 / 68
3-way analysis of variance, with additive effects
proc glm;
  class rabbit spot order;
  model swelling=rabbit spot order;
run;

The GLM Procedure

Class Level Information

Class    Levels   Values
rabbit   6        1 2 3 4 5 6
spot     6        a b c d e f
order    6        1 2 3 4 5 6

Number of observations 36
66 / 68
3-way analysis of variance
Dependent Variable: swelling
                        Sum of
Source            DF    Squares        Mean Square   F Value   Pr > F
Model             15    17.23000000    1.14866667    1.75      0.1205
Error             20    13.13000000    0.65650000
Corrected Total   35    30.36000000

R-Square   Coeff Var   Root MSE   swelling Mean
0.567523   10.99883    0.810247   7.366667

Source   DF   Type III SS    Mean Square   F Value   Pr > F
rabbit   5    12.83333333    2.56666667    3.91      0.0124
spot     5    3.83333333     0.76666667    1.17      0.3592
order    5    0.56333333     0.11266667    0.17      0.9701

The design is balanced, so the test of the effect of one variable (covariate) does not depend on which of the others are still in the model.
67 / 68
How about possible interactions?
proc glm;
  class rabbit spot order;
  model swelling=rabbit spot order spot*order;
run;

Dependent Variable: swelling

                        Sum of
Source            DF    Squares        Mean Square   F Value   Pr > F
Model             35    30.36000000    0.86742857    .         .
Error             0     0.00000000     .
Corrected Total   35    30.36000000

Source       DF   Type I SS      Mean Square   F Value   Pr > F
rabbit       5    12.83333333    2.56666667    .         .
spot         5    3.83333333     0.76666667    .         .
order        5    0.56333333     0.11266667    .         .
spot*order   20   13.13000000    0.65650000    .         .

There is no room for interaction, since there is only one observation for each combination of spot and order!
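The degrees of freedom in the two listings make the same point. As a sketch of the bookkeeping, with 36 observations there are 35 degrees of freedom in total, and the interaction term absorbs exactly the 20 that were the residual of the additive model:

```latex
\begin{align*}
\text{additive model:} \quad & \underbrace{36 - 1}_{\text{total}} = \underbrace{5}_{\text{rabbit}} + \underbrace{5}_{\text{spot}} + \underbrace{5}_{\text{order}} + \underbrace{20}_{\text{residual}} \\
\text{with spot*order:} \quad & 35 = 5 + 5 + 5 + \underbrace{20}_{\text{spot*order}} + \underbrace{0}_{\text{residual}}
\end{align*}
```

With zero residual degrees of freedom there is no estimate of the error variance, which is why SAS prints missing values for every F test.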
68 / 68