Epidemiology 9509 oneway
Epidemiology 9509: Principles of Biostatistics
Chapter 15 - One way analysis of variance
John Koval
Department of Epidemiology and Biostatistics
University of Western Ontario
What is being covered
1. oneway analysis of variance
2. posthoc tests (multiple comparisons)
ANOVA
ANalysis Of VAriance
2 or more samples
1. may be naturally occurring, as in a population study:
   non-smokers, occasional smokers, current smokers
2. may be experimental, as in a Randomized Comparative Trial (RCT):
   Placebo, Acetaminophen (AC), Acetylsalicylic Acid (ASA)
   or
   Placebo, ASA at 200 mg, ASA at 400 mg
ANOVA - history
◮ agriculture: levels of fertilizer, R.A. Fisher (1925, 1934)
◮ psychology
◮ clinical: dosage of drugs
Definitions
◮ factor - measured variable, often discrete
◮ levels - values of the factor
◮ treatment - particular combination of levels of one or more factors; here equivalent to a level of a factor
Analysing variability
variability measured by
\[ \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y})^2 \]
$k$ groups; $y_{ij}$ is the $j$'th observation in group $i$, and $\bar{y}$ is the overall mean
Analysing variability II
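SST - total sum of squares; partition into two parts
1. within treatment groups (SSW or SSE)
\[ \mathrm{SSW} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 \]
2. between groups (SSB or SSTr)
\[ \mathrm{SSB} = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (\bar{y}_i - \bar{y})^2 = \sum_{i=1}^{k} n_i (\bar{y}_i - \bar{y})^2 \]
so that SST = SSW + SSB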
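The partition can be checked numerically. The following is a minimal Python sketch, assuming the observations are stored as one list per group; the function name `sums_of_squares` and the small data set are illustrative only and do not come from the course notes.

```python
def sums_of_squares(groups):
    """Return (SST, SSW, SSB) for a list of groups of observations."""
    all_obs = [y for g in groups for y in g]
    grand_mean = sum(all_obs) / len(all_obs)

    # Total: squared deviations of every observation from the overall mean.
    sst = sum((y - grand_mean) ** 2 for y in all_obs)

    ssw = 0.0
    ssb = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        # Within: deviations of observations from their own group mean.
        ssw += sum((y - group_mean) ** 2 for y in g)
        # Between: n_i times the squared deviation of the group mean.
        ssb += len(g) * (group_mean - grand_mean) ** 2
    return sst, ssw, ssb


# Illustrative data only (not from the notes): three groups of three.
groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [5.0, 5.0, 8.0]]
sst, ssw, ssb = sums_of_squares(groups)
print(sst, ssw, ssb)
```

For these made-up numbers the within and between parts add up to the total, which is the identity the slide relies on.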
ANOVA Table
ANOVA table for the oneway design

Source    Sum of Squares   Degrees of freedom   Mean Square         Variance ratio
Between   SSB              k-1                  MSB = SSB/(k-1)     MSB/MSW
Within    SSW              N-k                  MSW = SSW/(N-k)
Total     SST              N-1
Mean Square
thanks to Manny
ANOVA - example
temperature reduction (°C) for three drugs

                        placebo   AC      ASA
                         0.6      1.0     1.0
                         0.4      0.5     1.5
                         0.0      0.0     0.5
                        -0.4      0.3     1.2
                        -0.6      0.7     0.8
$T_i$ (total)            0.0      2.5     5.0
$\bar{y}_i$ (mean)       0.0      0.5     1.0
$s_i^2$ (variance)       0.26     0.145   0.145
$s_i$ (std deviation)    0.51     0.38    0.38
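The summary rows of the table can be reproduced from the raw observations. This is a small sketch using Python's standard `statistics` module; the dictionary layout and variable names are choices made for this illustration.

```python
import statistics

# Temperature reductions (degrees C) copied from the example table.
data = {
    "placebo": [0.6, 0.4, 0.0, -0.4, -0.6],
    "AC": [1.0, 0.5, 0.0, 0.3, 0.7],
    "ASA": [1.0, 1.5, 0.5, 1.2, 0.8],
}

summary = {}
for drug, values in data.items():
    summary[drug] = {
        "total": sum(values),
        "mean": statistics.mean(values),
        "var": statistics.variance(values),  # sample variance, divisor n - 1
        "sd": statistics.stdev(values),      # square root of the sample variance
    }

for drug, s in summary.items():
    print(drug, s)
```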
getting entries for ANOVA table
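\[ \bar{y} = \frac{\sum n_i \bar{y}_i}{\sum n_i} = \frac{5(0.0) + 5(0.5) + 5(1.0)}{5 + 5 + 5} = \frac{7.5}{15} = 0.5 \]
\[
\begin{aligned}
\mathrm{SSB} &= \sum n_i (\bar{y}_i - \bar{y})^2 \\
 &= 5(0.0 - 0.5)^2 + 5(0.5 - 0.5)^2 + 5(1.0 - 0.5)^2 \\
 &= 5(-0.5)^2 + 0 + 5(0.5)^2 \\
 &= 1.25 + 0 + 1.25 = 2.5
\end{aligned}
\]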
getting entries for ANOVA table (continued)
\[
\begin{aligned}
\mathrm{SSW} &= \sum (n_i - 1) s_i^2 \\
 &= 4(0.26) + 4(0.145) + 4(0.145) \\
 &= 1.04 + 0.58 + 0.58 = 2.20
\end{aligned}
\]
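The same arithmetic can be done from the group summaries alone, without the raw data. A short Python sketch, with variable names chosen for this illustration:

```python
# Summary statistics taken from the example table.
n = [5, 5, 5]
means = [0.0, 0.5, 1.0]
variances = [0.26, 0.145, 0.145]

N = sum(n)

# Overall mean is the n-weighted average of the group means.
grand_mean = sum(ni * yi for ni, yi in zip(n, means)) / N

# Between-groups sum of squares: n_i times squared deviation of each group mean.
ssb = sum(ni * (yi - grand_mean) ** 2 for ni, yi in zip(n, means))

# Within-groups sum of squares: (n_i - 1) times each group variance.
ssw = sum((ni - 1) * s2 for ni, s2 in zip(n, variances))

sst = ssb + ssw
print(grand_mean, ssb, ssw, sst)
```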
ANOVA for example
temperature reduction (°C) for three drugs

Source    Sum of Squares   Degrees of freedom   Mean Square   Variance ratio
Between   2.50             2                    1.25          6.82
Within    2.20             12                   0.183
Total     4.70             14
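The table entries follow mechanically from the two sums of squares. Below is a hedged Python sketch; `anova_table` is a hypothetical helper written for this illustration, not a library function.

```python
def anova_table(ssb, ssw, k, n_total):
    """Build the oneway ANOVA table entries from SSB, SSW, k groups, N observations."""
    df_between = k - 1
    df_within = n_total - k
    msb = ssb / df_between
    msw = ssw / df_within
    return {
        "between": {"ss": ssb, "df": df_between, "ms": msb},
        "within": {"ss": ssw, "df": df_within, "ms": msw},
        "total": {"ss": ssb + ssw, "df": n_total - 1},
        "variance_ratio": msb / msw,
    }


table = anova_table(ssb=2.50, ssw=2.20, k=3, n_total=15)
print(table)
```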
Hypothesis test
Ho: none of the treatments (groups) is different from the others
Under Ho, the variance ratio (vr) $\sim F_{2,12}$
in general, vr $\sim F_{k-1,N-k}$
in this case, $0.025 > \Pr(F_{2,12} > 6.82) > 0.010$ (Table A.4, page A.7)
At α = 0.05, reject Ho
claim HA: at least one of the treatments (groups) is different from the others
which treatments??
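The table lookup can be cross-checked without a statistics library, because an F distribution with exactly 2 numerator degrees of freedom has a closed-form upper tail, $\Pr(F_{2,d} > f) = (1 + 2f/d)^{-d/2}$. The sketch below relies on that special case only (it does not work for other numerator degrees of freedom); the function name is made up for this illustration.

```python
def f_upper_tail_df1_2(f, df2):
    """Pr(F > f) for an F distribution with 2 and df2 degrees of freedom.

    Closed form valid only when the numerator degrees of freedom equal 2.
    """
    return (1.0 + 2.0 * f / df2) ** (-df2 / 2.0)


# Variance ratio from the example ANOVA table: MSB / MSW.
vr = 1.25 / (2.20 / 12)
p_value = f_upper_tail_df1_2(vr, 12)
reject_h0 = p_value < 0.05
print(round(vr, 2), p_value, reject_h0)
```

The computed tail probability falls between 0.010 and 0.025, in agreement with the bounds read from the table.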
Post hoc test
three comparisons of interest
1. $|\bar{y}_1 - \bar{y}_2|$
2. $|\bar{y}_1 - \bar{y}_3|$
3. $|\bar{y}_2 - \bar{y}_3|$
Post hoc test II
which test? independent samples t-test
\[ \frac{|\bar{y}_1 - \bar{y}_2|}{\sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} \]
Multiple comparisons
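$\bar{y}_1 = 0.0$, $\bar{y}_2 = 0.5$, $\bar{y}_3 = 1.0$
\[ q_{12} = \frac{|0.0 - 0.5|}{\sqrt{0.183 \left( \frac{1}{5} + \frac{1}{5} \right)}} = 1.848 \]
$q_{13} = 3.696$, $q_{23} = 1.848$
if we compare to the critical value $t_{12,0.025}$ (= 2.179), this is the LSD (Least Significant Difference) procedure:
$q_{13}$ is significant, $q_{12}$ and $q_{23}$ are not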
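The three pairwise statistics can be computed in a loop. This Python sketch uses the rounded pooled variance 0.183 and the tabled critical value 2.179 exactly as quoted on the slide; the helper name `pairwise_stat` is an assumption of this illustration.

```python
from itertools import combinations
from math import sqrt

means = {1: 0.0, 2: 0.5, 3: 1.0}   # group means from the example
n = {1: 5, 2: 5, 3: 5}             # group sizes
s2p = 0.183                        # pooled variance (MSW), rounded as on the slide
t_crit = 2.179                     # t_{12,0.025}, value quoted on the slide


def pairwise_stat(i, j):
    """t-type statistic for comparing groups i and j with the pooled variance."""
    se = sqrt(s2p * (1 / n[i] + 1 / n[j]))
    return abs(means[i] - means[j]) / se


q = {(i, j): pairwise_stat(i, j) for i, j in combinations(means, 2)}
significant = {pair: stat > t_crit for pair, stat in q.items()}
print(q)
print(significant)
```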
Probability of making mistake
General - and for LSD
1. 5% for each of three tests
   ERPC - error rate per comparison (α) = 5%
2. expected number of errors
   Error rate per experiment: ERPE = np = 3(0.05) = 0.15
3. probability of at least one error
   experimentwise error rate ($\alpha_E$): EWER $= 1 - (1 - \alpha)^c = 1 - (1 - 0.05)^3 = 1 - (0.95)^3 = 1 - 0.857375 = 0.142625 \approx$ ERPE
EWER is less if the comparisons are correlated, and they are (ERPE is an expected count, so it stays at 0.15)
10 comparisons
1. ERPC = 0.05
2. ERPE = np = 10(0.05) = 0.5
3. EWER $= 1 - (1 - \alpha)^c = 1 - (1 - 0.05)^{10} = 1 - (0.95)^{10} = 1 - 0.5987 = 0.4013$
   ERPE is not such a good approximation to EWER here,
   and this EWER is an upper bound (it assumes independence)
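The two error rates are easy to tabulate for any number of comparisons. A minimal Python sketch, in which `error_rates` is a name invented for this illustration and the EWER formula is the independence-based one used on the slides:

```python
def error_rates(alpha, c):
    """Return (ERPE, EWER) for c comparisons each tested at level alpha.

    ERPE is the expected number of false rejections, c * alpha.
    EWER uses 1 - (1 - alpha)^c, which treats the comparisons as independent.
    """
    erpe = c * alpha
    ewer = 1 - (1 - alpha) ** c
    return erpe, ewer


erpe3, ewer3 = error_rates(0.05, 3)
erpe10, ewer10 = error_rates(0.05, 10)
print(erpe3, ewer3)
print(erpe10, ewer10)
```

With 3 comparisons the two numbers are close; with 10 they are clearly different, which is why ERPE stops being a useful stand-in for EWER.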
more conservative procedure - Bonferroni
ERPE and EWER too large
divide α by number of possible comparisons, $\binom{3}{2} = 3$
α = 0.05/3 = 0.01667
1. ERPC = α = 0.01667
2. ERPE = n(0.01667) = 3(0.01667) = 0.05
3. EWER $= 1 - (1 - 0.01667)^3 = 1 - (0.98333)^3 = 1 - 0.95082 = 0.04918$
   this is upper bound
more conservative procedure - Bonferroni - moderate n
number of possible comparisons: $\binom{10}{2} = 45$
α = 0.05/45 = 0.00111
1. ERPC = α = 0.00111
2. ERPE = 45(0.00111) = 0.05
3. EWER $= 1 - (1 - 0.00111)^{45} = 1 - (0.99889)^{45} = 1 - 0.95125 = 0.04875$
   this is upper bound
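The Bonferroni adjustment for all pairwise comparisons among k groups can be sketched as follows. The function name `bonferroni` is an assumption of this illustration; `math.comb` requires Python 3.8 or later. Because the code keeps the unrounded per-comparison α, the 45-comparison bound comes out near 0.0488 rather than the 0.04875 obtained from the rounded 0.00111.

```python
from math import comb


def bonferroni(alpha, k):
    """Per-comparison alpha and independence-based EWER bound for k groups."""
    c = comb(k, 2)                       # number of pairwise comparisons
    alpha_c = alpha / c                  # Bonferroni-adjusted level (ERPC)
    ewer_bound = 1 - (1 - alpha_c) ** c  # still assumes independence
    return c, alpha_c, ewer_bound


c3, alpha3, ewer3 = bonferroni(0.05, 3)
c10, alpha10, ewer10 = bonferroni(0.05, 10)
print(c3, alpha3, ewer3)
print(c10, alpha10, ewer10)
```

In both cases the bound stays just under the nominal 0.05.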
Bonferroni - example
n = 3
α = 0.01667
$t_{12,0.00833} = 2.779$
same conclusions as for LSD
less conservative? - Scheffe
compare q with $\sqrt{(k - 1) F_{k-1,N-k}}$
critical value is $\sqrt{2 F_{2,12,0.05}} = \sqrt{2(3.885)} = \sqrt{7.770} = 2.787$
same conclusions as for others
similar to Bonferroni, but more conservative than Bonferroni
WHY?? handles all possible linear contrasts
eg Ho: $\frac{1}{2}(\mu_1 + \mu_2) = \mu_3$
use $\frac{\bar{y}_1 + \bar{y}_2}{2} - \bar{y}_3$
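As a worked illustration, the Scheffe critical value and the contrast on this slide can be evaluated for the example data. The sketch below makes two assumptions that are not in the notes: it uses the closed-form F quantile that exists only for 2 numerator degrees of freedom (so no table is needed), and it uses the usual contrast standard error $\sqrt{\mathrm{MSW} \sum c_i^2 / n_i}$.

```python
from math import sqrt


def f_crit_df1_2(alpha, df2):
    """Upper-alpha critical value of F with 2 and df2 degrees of freedom.

    Inverts Pr(F > f) = (1 + 2f/df2)^(-df2/2); valid only for 2 numerator df.
    """
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)


k, N = 3, 15
msw = 2.20 / (N - k)                      # within mean square from the ANOVA table
f_crit = f_crit_df1_2(0.05, N - k)
scheffe_crit = sqrt((k - 1) * f_crit)

# Contrast (mu1 + mu2)/2 - mu3, estimated from the group means.
means = [0.0, 0.5, 1.0]
n = [5, 5, 5]
coef = [0.5, 0.5, -1.0]
estimate = sum(c * m for c, m in zip(coef, means))
se = sqrt(msw * sum(c ** 2 / ni for c, ni in zip(coef, n)))
contrast_stat = abs(estimate) / se

print(f_crit, scheffe_crit, contrast_stat)
```

Under these assumptions the contrast statistic exceeds the Scheffe critical value, so the average of placebo and AC would be judged different from ASA.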
better - Tukey
Tukey (1953) defined the Studentized Range Statistic
\[ \frac{\text{range of means}}{\sqrt{s^2 / n}} \]
and got its distribution, $Q_{k,N-k}$
not quite what we want; only for equal sample size
compare $q_{ij}$ with $q_{k,N-k,0.05} / \sqrt{2}$
in our case, $q_{3,12,0.05} / \sqrt{2} = 3.783 / 1.414 = 2.675$
same conclusions as for previous tests
summary
Critical values for multiple comparison tests

Name           formula                                  α = 0.05
t-test (LSD)   $t_{N-k,1-(\alpha/2)}$                   $t_{12,0.025} = 2.179$
Bonferroni     $t_{N-k,1-(\alpha/[k(k-1)])}$            $t_{12,0.00833} = 2.779$
Scheffe        $\sqrt{(k-1) F_{k-1,N-k,1-\alpha}}$      $\sqrt{2 F_{2,12,0.05}} = \sqrt{2(3.885)} = 2.787$
Tukey (HSD)    $q_{k,N-k,1-\alpha} / \sqrt{2}$          $q_{3,12,0.05} / \sqrt{2} = 3.783/1.414 = 2.675$
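To see the four procedures side by side, the pairwise statistics from the example can be compared against each critical value in the table. This is a small Python sketch that simply reuses the numbers quoted in these notes; nothing is recomputed from a distribution.

```python
# Pairwise statistics from the example (values quoted in the notes).
q = {(1, 2): 1.848, (1, 3): 3.696, (2, 3): 1.848}

# Critical values at alpha = 0.05 from the summary table.
critical = {
    "LSD": 2.179,
    "Bonferroni": 2.779,
    "Scheffe": 2.787,
    "Tukey": 2.675,
}

# For each procedure, list the pairs declared significantly different.
results = {
    name: [pair for pair, stat in q.items() if stat > crit]
    for name, crit in critical.items()
}

for name, pairs in results.items():
    print(name, pairs)
```

Every procedure flags only the comparison of groups 1 and 3, which matches the "same conclusions" remarks on the individual slides.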
other tests
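Multiple Range tests: SNK (Student, Newman, Keuls)
◮ put means in ascending order: $\bar{y}_{[1]}, \ldots, \bar{y}_{[k]}$
◮ for $|\bar{y}_{[1]} - \bar{y}_{[k]}|$ use $Q_{k,N-k,0.05}$
◮ for $|\bar{y}_{[1]} - \bar{y}_{[k-1]}|$ and $|\bar{y}_{[2]} - \bar{y}_{[k]}|$ use $Q_{k-1,N-k,0.05}$
◮ etc
◮ for $|\bar{y}_{[1]} - \bar{y}_{[2]}|, \ldots, |\bar{y}_{[k-1]} - \bar{y}_{[k]}|$ use $Q_{2,N-k,0.05}$, like independent samples t-test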
SNK example
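◮ put means in ascending order: $\bar{y}_{[1]}, \bar{y}_{[2]}, \bar{y}_{[3]}$
◮ for $|\bar{y}_{[1]} - \bar{y}_{[3]}|$ use $Q_{3,N-3,0.05}$
◮ for $|\bar{y}_{[1]} - \bar{y}_{[2]}|$ and $|\bar{y}_{[2]} - \bar{y}_{[3]}|$ use $Q_{2,N-3,0.05}$, like independent samples t-test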
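The SNK steps for three groups can be sketched in Python as below. Several things here are assumptions of the illustration rather than statements from the notes: the studentized ranges use the standard error $\sqrt{\mathrm{MSW}/n}$, the value $Q_{3,12,0.05} = 3.783$ is the one quoted earlier in these notes, and $Q_{2,12,0.05}$ is obtained as $\sqrt{2}\, t_{12,0.025}$, which is the relation behind the "like independent samples t-test" remark.

```python
from math import sqrt

means = sorted([0.0, 0.5, 1.0])   # group means in ascending order
n = 5                             # common group size
msw = 2.20 / 12                   # within mean square from the ANOVA table
se = sqrt(msw / n)                # standard error used by the studentized range

Q3 = 3.783                        # Q_{3,12,0.05} as quoted in the notes
Q2 = sqrt(2) * 2.179              # Q_{2,12,0.05} = sqrt(2) * t_{12,0.025}

# Step 1: widest range, spanning all three ordered means.
range_all = (means[2] - means[0]) / se
overall_sig = range_all > Q3

# Step 2: adjacent pairs are examined only if the enclosing range was significant.
adjacent = [(means[i + 1] - means[i]) / se for i in range(2)]
adjacent_sig = [overall_sig and stat > Q2 for stat in adjacent]

print(range_all, overall_sig)
print(adjacent, adjacent_sig)
```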
conclusion
1. for this data set, all tests have the same result
2. t-test (LSD) is anti-conservative
3. Bonferroni and Scheffe are conservative
4. Tukey (HSD) is best for equal sample size
unequal sample size
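Tukey studentised range statistic - equal sample size
\[ q = \frac{|\bar{y}_1 - \bar{y}_2|}{\sqrt{s_p^2 / n}} \]
Kramer (1956) modified for unequal sample size
\[ q = \frac{|\bar{y}_1 - \bar{y}_2|}{\sqrt{\frac{s_p^2}{2} \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} \]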
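A quick way to see that Kramer's version generalises Tukey's is to code both and compare them at equal sample sizes. This is a minimal Python sketch; the function names are invented here, and the unequal sizes 4 and 6 are hypothetical values chosen only to show the effect.

```python
from math import isclose, sqrt


def tukey_q(mean1, mean2, s2p, n):
    """Studentized range statistic for two means with common sample size n."""
    return abs(mean1 - mean2) / sqrt(s2p / n)


def tukey_kramer_q(mean1, mean2, s2p, n1, n2):
    """Kramer's modification for unequal sample sizes n1 and n2."""
    return abs(mean1 - mean2) / sqrt((s2p / 2) * (1 / n1 + 1 / n2))


# With equal sizes the two formulas agree.
q_equal = tukey_kramer_q(0.0, 1.0, 0.183, 5, 5)
q_tukey = tukey_q(0.0, 1.0, 0.183, 5)

# Hypothetical unequal sizes, same total of 10 observations.
q_unequal = tukey_kramer_q(0.0, 1.0, 0.183, 4, 6)

print(q_tukey, q_equal, q_unequal)
```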
unequal sample size II
SAS uses a different version (sometimes), based on the harmonic mean
\[ n_h = \frac{k}{\sum n_i^{-1}} \]
so that
\[ q = \frac{|\bar{y}_1 - \bar{y}_2|}{\sqrt{s_p^2 / n_h}} \]
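The harmonic-mean version can be sketched in the same way. The function names below and the group sizes 4, 5 and 6 are hypothetical, introduced only for illustration; the code implements the formula on this slide and makes no claim about how SAS computes it internally.

```python
from math import sqrt


def harmonic_mean_size(sizes):
    """Harmonic mean of the group sizes: k divided by the sum of 1/n_i."""
    return len(sizes) / sum(1 / n for n in sizes)


def q_harmonic(mean1, mean2, s2p, sizes):
    """Studentized range statistic using n_h in place of a common n."""
    return abs(mean1 - mean2) / sqrt(s2p / harmonic_mean_size(sizes))


nh_equal = harmonic_mean_size([5, 5, 5])     # reduces to the common n
nh_unequal = harmonic_mean_size([4, 5, 6])   # hypothetical unequal sizes
q_h = q_harmonic(0.0, 1.0, 0.183, [4, 5, 6])

print(nh_equal, nh_unequal, q_h)
```

Unlike Kramer's formula, this statistic uses the same effective sample size for every pair of groups.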