Analysis of Variance
Yuantao Hao
26 Oct. 2009
Chapter 5
Review:
1. The basic steps and logic of hypothesis testing;
2. One-sample t test;
3. Two-sample t test;
4. Paired-sample t test;
5. F test for homogeneity of variances;
6. Z test for the parameters of binomial distribution and
Poisson distribution when sample size is large
enough.
Main steps:
(1) Set up the statistical hypotheses
$H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$
(2) Select a statistic and calculate its current value
$$t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$$
(3) Determine the P-value
$$P = P(|t| \ge |\text{current value of the } t \text{ statistic}|)$$
(4) Decision and conclusion
Compare the P-value with the pre-assigned
small probability $\alpha$: if $P \le \alpha$, reject $H_0$;
otherwise, do not reject $H_0$. Finally, state the
conclusion in the context of the problem.
P value
The P-value is defined as the probability that
the current situation, or an even more extreme
situation in the direction of $H_1$, appears when the
population is as specified by $H_0$.
$$P = P(|t| \ge 2.8345)$$
The P-value can also be thought of as the
probability of obtaining a test statistic as
extreme as or more extreme than the actual
test statistic obtained, given that the null
hypothesis is true.
$$P = P(|t| \ge 2.8345)$$
Question:
1. One-sample;
2. Two-sample;
3. Paired-sample;
4. Three or more samples?
Analysis of variance (ANOVA) :
One-way ANOVA is used to test for
differences among two or more independent
groups.
Typically, however, the one-way ANOVA is
used to test for differences among at least
three groups, since the two-group case can
be covered by a t-test.
5.1 One-Way ANOVA for the Completely Random Design
The completely random design
For this design, there is only one treatment
factor with G (≥2) levels. The term level refers
to the possible status planned for the treatment
factor.
Example 5.1 Randomly assign 12
laboratory blood specimens (experimental
units) into three groups with 4 blood
specimens in each group.
How to assign the 12 units into three
groups randomly?
Table 11.1 Randomized grouping result
Unit number        1   2   3   4   5   6   7   8   9  10  11  12
Random number     39  90  22  00  66  82  89  08  92  72  36  60
Rank (R)           5  11   3   1   7   9  10   2  12   8   4   6
Grouping result    2   3   1   1   2   3   3   1   3   2   1   2
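The grouping in Table 11.1 can be reproduced with a short sketch. It assumes the rule that the table appears to follow (the slides do not state it explicitly): rank the units by their random numbers, then send ranks 1 to 4 to group 1, ranks 5 to 8 to group 2 and ranks 9 to 12 to group 3. The variable names are illustrative only.

```python
# Random numbers for units 1..12, copied from Table 11.1.
random_numbers = [39, 90, 22, 0, 66, 82, 89, 8, 92, 72, 36, 60]
n_groups = 3
group_size = len(random_numbers) // n_groups  # 4 units per group

# Rank each unit by its random number (rank 1 = smallest number).
order = sorted(range(len(random_numbers)), key=lambda u: random_numbers[u])
ranks = [0] * len(random_numbers)
for rank, unit in enumerate(order, start=1):
    ranks[unit] = rank

# ASSUMED rule, inferred from the table: ranks 1-4 -> group 1,
# ranks 5-8 -> group 2, ranks 9-12 -> group 3.
groups = [(r - 1) // group_size + 1 for r in ranks]

for unit, (num, r, g) in enumerate(zip(random_numbers, ranks, groups), start=1):
    print(f"unit {unit:2d}: random number {num:02d}, rank {r:2d}, group {g}")
```

In practice the random numbers would be drawn fresh (for example from a random number table) rather than typed in; they are fixed here only so that the output can be compared with the table.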
Example 5.2 The 12 blood specimens are randomly assigned to three groups according to Table 11.1. Group 1 receives anticoagulant A; group 2 receives anticoagulant B; and group 3 receives anticoagulant C.
For each blood specimen, the erythrocyte sedimentation rate (ESR) is measured after the treatment. The aim is to test whether the three mean ESRs differ significantly. The results are shown in Table 11.2.
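Table 11.2 Erythrocyte sedimentation rate (ESR, mm/h)
Anticoagulant   ESR ($X_{ij}$)   $\sum_j X_{ij}$   $\sum_j X_{ij}^2$   $n_i$   $\bar{X}_i$   $S_i^2$
A               17 16 16 15      64                1026                4       16.0          0.67
B               10 11 12 12      45                 509                4       11.3          0.92
C               11  9  8  9      37                 347                4        9.3          1.58
Total                            146               1882                12      12.2          3.17
In the Total row, $\sum_{ij} X_{ij} = 146$, $\sum_{ij} X_{ij}^2 = 1882$, $N = 12$ and $\bar{X} = 12.2$; the value 3.17 is the sum of the three $S_i^2$, so the pooled variance is $S_c^2 = 3.17/3 = 1.06$.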
$H_0$: $\mu_1 = \mu_2 = \mu_3$,
$H_1$: $\mu_1$, $\mu_2$ and $\mu_3$ are not all the same
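$$SS_T = \sum_{i=1}^{G}\sum_{j=1}^{n_i}(X_{ij}-\bar{X})^2$$
$$SS_B = \sum_{i=1}^{G} n_i(\bar{X}_i-\bar{X})^2$$
$$SS_E = \sum_{i=1}^{G}\sum_{j=1}^{n_i}(X_{ij}-\bar{X}_i)^2$$
$$SS_T = SS_B + SS_E$$
$$\nu_B = G-1, \quad \nu_T = N-1, \quad \nu_E = N-G$$
$$\nu_T = \nu_B + \nu_E$$
$$MS_T = SS_T/\nu_T, \quad MS_B = SS_B/\nu_B, \quad MS_E = SS_E/\nu_E$$
$$F = \frac{MS_B}{MS_E}$$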
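If $F \ge F_{\nu_1,\nu_2,1-\alpha}$, then reject $H_0$.
If $F < F_{\nu_1,\nu_2,1-\alpha}$, then do not reject $H_0$.
Here $\nu_1 = \nu_B$ and $\nu_2 = \nu_E$.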
$H_0$: $\mu_1 = \mu_2 = \mu_3$,
$H_1$: $\mu_1$, $\mu_2$ and $\mu_3$ are not all the same
Table 11.3 The table for one-way analysis of variance
Source                   DF              SS                     MS                    F             P
Between groups           $\nu_B = G-1$   $SS_B$                 $MS_B = SS_B/(G-1)$   $MS_B/MS_E$
Within groups (errors)   $\nu_E = N-G$   $SS_E = SS_T - SS_B$   $MS_E = SS_E/(N-G)$
Total                    $N-1$           $SS_T$
Table 11.4 Table of one-way ANOVA for the effects of the anticoagulants
Source                       DF   SS       MS      F       P
Between the anticoagulants    2    96.17   48.09   45.37   <0.05
Errors                        9     9.50    1.06
Total                        11   105.67
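As a check on Table 11.4, the sums of squares can be recomputed directly from the raw ESR values in Table 11.2. This is a minimal sketch in plain Python; the variable names are illustrative, and no P-value is computed because that needs an F distribution routine outside the standard library.

```python
# ESR observations (mm/h) from Table 11.2, one list per anticoagulant.
esr = {
    "A": [17, 16, 16, 15],
    "B": [10, 11, 12, 12],
    "C": [11, 9, 8, 9],
}

all_values = [x for xs in esr.values() for x in xs]
N = len(all_values)          # total number of observations
G = len(esr)                 # number of groups
grand_mean = sum(all_values) / N

# Sums of squares: total, between groups, within groups (error).
ss_total = sum((x - grand_mean) ** 2 for x in all_values)
ss_between = sum(
    len(xs) * (sum(xs) / len(xs) - grand_mean) ** 2 for xs in esr.values()
)
ss_error = sum((x - sum(xs) / len(xs)) ** 2 for xs in esr.values() for x in xs)

# Degrees of freedom, mean squares and the F statistic.
df_between, df_error = G - 1, N - G
ms_between = ss_between / df_between
ms_error = ss_error / df_error
f_stat = ms_between / ms_error

print(f"SS_B = {ss_between:.2f}, SS_E = {ss_error:.2f}, SS_T = {ss_total:.2f}")
print(f"MS_B = {ms_between:.2f}, MS_E = {ms_error:.2f}, F = {f_stat:.2f}")
```

With unrounded mean squares the F statistic comes out near 45.55; the 45.37 in Table 11.4 is what results from dividing the rounded values 48.09 and 1.06.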
Two assumptions of analysis of variance:
1. $X_{ij}$ follows a normal distribution $N(\mu_i, \sigma_i^2)$, $i = 1, 2, \ldots, G$;
2. homogeneity of variances, $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_G^2$.
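Bartlett's test:
$H_0$: $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_G^2$
$H_1$: $\sigma_1^2, \sigma_2^2, \ldots, \sigma_G^2$ are not all equal
$$\chi^2 = \frac{\sum_{i=1}^{G}(n_i-1)\ln S_c^2 - \sum_{i=1}^{G}(n_i-1)\ln S_i^2}{m}, \quad \nu = G-1$$
$$m = 1 + \frac{1}{3(G-1)}\left[\sum_{i=1}^{G}\frac{1}{n_i-1} - \frac{1}{\sum_{i=1}^{G}(n_i-1)}\right]$$
where $S_c^2$ is the pooled variance of the $G$ groups.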
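Example 5.2 (cont.)
Test the homogeneity of variances for the
three populations in Table 11.2.
$$m = 1 + \frac{1}{3(3-1)}\left[\frac{3}{4-1} - \frac{1}{3(4-1)}\right] = 1.148$$
With equal group sizes the pooled variance is $S_c^2 = 3.17/3 = 1.06$, so
$$\chi^2 = \frac{3(4-1)\ln(3.17/3) - (4-1)(\ln 0.67 + \ln 0.92 + \ln 1.58)}{1.148} = 0.50$$
$$\nu = 3 - 1 = 2$$
$\chi^2_{0.10,2} = 4.61$, so $P > 0.10$.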
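The Bartlett calculation for Example 5.2 can be retraced from the raw ESR data. The sketch below assumes the usual form of the statistic, with the pooled variance weighted by the degrees of freedom $n_i - 1$; the helper name `sample_variance` is illustrative.

```python
import math

# ESR observations (mm/h) from Table 11.2.
esr = {
    "A": [17, 16, 16, 15],
    "B": [10, 11, 12, 12],
    "C": [11, 9, 8, 9],
}

def sample_variance(xs):
    """Unbiased sample variance with divisor n - 1."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

G = len(esr)
dfs = [len(xs) - 1 for xs in esr.values()]           # n_i - 1 for each group
variances = [sample_variance(xs) for xs in esr.values()]

# Pooled variance S_c^2, weighted by degrees of freedom.
pooled = sum(df * s2 for df, s2 in zip(dfs, variances)) / sum(dfs)

# Correction factor m and the chi-square statistic with G - 1 degrees of freedom.
m = 1 + (sum(1 / df for df in dfs) - 1 / sum(dfs)) / (3 * (G - 1))
chi2 = (
    sum(dfs) * math.log(pooled)
    - sum(df * math.log(s2) for df, s2 in zip(dfs, variances))
) / m

print(f"S_c^2 = {pooled:.3f}, m = {m:.3f}, chi-square = {chi2:.2f}, df = {G - 1}")
```

The unrounded statistic is about 0.51 (0.50 with the rounded variances used on the slide), well below the critical value 4.61 quoted above, so homogeneity of variances is not rejected.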
Test for normality and transformations:
$$Y = \log(X + a)$$
$$Y = \sqrt{X}$$
$$Y = \sin^{-1}\sqrt{p}$$
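The three transformations can be written as small helper functions. This is only a sketch: the function names are hypothetical, the slides do not say which logarithm base is meant (base 10 is assumed here), and the constant $a$ must be chosen by the analyst so that $X + a$ stays positive.

```python
import math

def log_transform(x, a=0.0):
    """Y = log(X + a); base 10 is an ASSUMPTION, the slides only write 'log'."""
    return math.log10(x + a)

def sqrt_transform(x):
    """Y = sqrt(X), commonly applied to count data."""
    return math.sqrt(x)

def arcsine_transform(p):
    """Y = arcsin(sqrt(p)) for a proportion p between 0 and 1 (result in radians)."""
    return math.asin(math.sqrt(p))

print(log_transform(99, a=1))      # log10(100)
print(sqrt_transform(9))           # sqrt(9)
print(arcsine_transform(0.25))     # arcsin(0.5), i.e. pi / 6
```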
5.2 Multiple comparisons
1. To examine whether two specified means
are equal: LSD-t test.
2. To examine, pair by pair, whether the means of all the
comparison groups are equal: SNK-q
test.
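LSD-t test
(least significant difference t test)
$$d_{ij} = \bar{X}_i - \bar{X}_j$$
$$S_{d_{ij}} = \sqrt{MS_E\left(\frac{1}{n_i}+\frac{1}{n_j}\right)}$$
$H_0$ is rejected if:
$$|\bar{X}_i - \bar{X}_j| \ge t_{\alpha,\nu}\, S_{d_{ij}}, \quad \nu = \nu_E$$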
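To illustrate the LSD-t rule, the sketch below applies it to the three anticoagulant groups of Example 5.2. The group means and $MS_E = 9.5/9$ come from the ESR data; the critical value 3.250 (two-sided, $\alpha = 0.01$, 9 degrees of freedom) is an assumption taken from a standard t table and is not given in the slides.

```python
import math
from itertools import combinations

# Group means and sizes from the ESR data of Example 5.2.
means = {"A": 16.0, "B": 11.25, "C": 9.25}
sizes = {"A": 4, "B": 4, "C": 4}
ms_error = 9.5 / 9   # MS_E = SS_E / (N - G)

# ASSUMPTION: two-sided critical t value for alpha = 0.01 with 9 degrees of
# freedom, read from a standard t table (not stated in the slides).
t_crit = 3.250

lsd_rejected = {}
for gi, gj in combinations(means, 2):
    d = means[gi] - means[gj]                                    # d_ij
    s_d = math.sqrt(ms_error * (1 / sizes[gi] + 1 / sizes[gj]))  # S_d_ij
    lsd_rejected[(gi, gj)] = abs(d) >= t_crit * s_d
    print(f"{gi} vs {gj}: d = {d:.2f}, threshold = {t_crit * s_d:.2f}, "
          f"reject H0: {lsd_rejected[(gi, gj)]}")
```

Under this assumed critical value, A differs from both B and C, while B and C are not distinguished.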
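SNK-q test
(Student-Newman-Keuls q test)
$$S_{\bar{X}_{ij}} = \sqrt{\frac{MS_E}{n}}$$
When the group sizes are equal, $n_1 = n_2 = \cdots = n_G = n$; otherwise
$$n = \frac{1}{G-1}\left(N - \frac{\sum_{i=1}^{G} n_i^2}{N}\right)$$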
All means should be sorted from the smallest to the
largest to form contrasts.
Each contrast spans $a$ means, $a = 2, 3, \ldots, G$.
In Example 5.2, the means of the three groups are sorted as
9.3, 11.3 and 16.0; if 9.3 and 11.3 are selected to form a
contrast, $a = 2$.
With the parameters $a$ and $\nu$, the critical value $q_{\alpha,a,\nu}$ of the
SNK-q test can be found in Table 11 of Appendix 2.
$H_0$ is rejected if:
$$|\bar{X}_i - \bar{X}_j| \ge q_{\alpha,a,\nu}\, S_{\bar{X}_{ij}}$$
For $a = 2$, $q_{0.01,2,9} = 4.60$, $q_{0.01,2,9}\, S_{\bar{X}} = 4.60 \times 0.51 = 2.35$.
For $a = 3$, $q_{0.01,3,9} = 5.43$, $q_{0.01,3,9}\, S_{\bar{X}} = 5.43 \times 0.51 = 2.77$.
Anticoagulants   C      B      A
Means of ESR     9.3    11.3   16.0
Grouping         (C, B)        (A)
Figure 11.1 The grouping of the effects of the three anticoagulants
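The grouping in Figure 11.1 can be retraced with the critical values quoted above ($q_{0.01,2,9} = 4.60$, $q_{0.01,3,9} = 5.43$). The sketch is simplified: it checks every pair of sorted means directly, whereas a full SNK procedure works stepwise from the widest range inward. Variable names are illustrative.

```python
import math

# Group means from the ESR data; equal group size n = 4; MS_E = 9.5 / 9.
means = {"A": 16.0, "B": 11.25, "C": 9.25}
n = 4
ms_error = 9.5 / 9

# Critical values q_{0.01, a, 9} as quoted in the slides.
q_crit = {2: 4.60, 3: 5.43}

se = math.sqrt(ms_error / n)               # standard error, about 0.51
ordered = sorted(means, key=means.get)     # smallest to largest mean: C, B, A

snk_rejected = {}
for i in range(len(ordered)):
    for j in range(i + 1, len(ordered)):
        a = j - i + 1                      # number of means spanned by the contrast
        diff = means[ordered[j]] - means[ordered[i]]
        snk_rejected[(ordered[i], ordered[j])] = diff >= q_crit[a] * se
        print(f"{ordered[i]} vs {ordered[j]}: a = {a}, diff = {diff:.2f}, "
              f"threshold = {q_crit[a] * se:.2f}")
```

Only the C versus B contrast falls below its threshold, which matches the grouping (C, B) (A).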
5.3 Two-Way ANOVA for the Randomized Complete-Block Design
There are n blocks and each block contains G
experimental units, which receive the G treatments at
random. The total number of observations is N = nG.
Example 5.4 Twelve mice are grouped into 4 blocks according to their birth litters, with 3 mice in each block. The 3 kinds of food are randomly assigned to the 3 mice in each block.
Table 11.8 Random allocation of the randomized complete-block design
Block   Unit No.   Random No. (Rank)         Unit No. (Treatment)
1       1  2  3    28 (1)  65 (3)  62 (2)    1 (A)  2 (C)  3 (B)
2       1  2  3    79 (3)  21 (2)  05 (1)    1 (C)  2 (B)  3 (A)
3       1  2  3    81 (2)  51 (1)  94 (3)    1 (B)  2 (A)  3 (C)
4       1  2  3    19 (1)  90 (3)  76 (2)    1 (A)  2 (C)  3 (B)
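The allocation in Table 11.8 follows from ranking the random numbers within each block. The sketch below assumes the mapping the table appears to use (rank 1 receives treatment A, rank 2 receives B, rank 3 receives C); the names are illustrative.

```python
# Random numbers per block, copied from Table 11.8 (units 1, 2, 3 in order).
random_numbers = {
    1: [28, 65, 62],
    2: [79, 21, 5],
    3: [81, 51, 94],
    4: [19, 90, 76],
}
treatments = "ABC"  # ASSUMED mapping: rank 1 -> A, rank 2 -> B, rank 3 -> C

allocation = {}
for block, nums in random_numbers.items():
    # Rank the units inside this block by their random numbers.
    order = sorted(range(len(nums)), key=lambda u: nums[u])
    ranks = [0] * len(nums)
    for rank, unit in enumerate(order, start=1):
        ranks[unit] = rank
    allocation[block] = [treatments[r - 1] for r in ranks]
    print(f"block {block}: ranks {ranks}, treatments {allocation[block]}")
```

Randomizing separately inside each block is what guarantees that every block receives each treatment exactly once.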
The advantage of this design over
the completely random design is that it reduces the
effect of the variation among the experimental
units when the difference among blocks is a
main source of variation.
The disadvantage is that the size of
every block (that is, the number of
experimental units in the block) should
equal the number of treatments; otherwise,
the statistical analysis will be difficult.
The observations of the randomized complete-block design
Block            Treatment                              Total
                 1     2     …    j     …    G
1                X11   X12   …    X1j   …    X1G        B1
2                X21   X22   …    X2j   …    X2G        B2
…                …                                      …
i                Xi1   Xi2   …    Xij   …    XiG        Bi
…                …                                      …
n                Xn1   Xn2   …    Xnj   …    XnG        Bn
Total            T1    T2    …    Tj    …    TG
Sum of squares   Q1    Q2    …    Qj    …    QG
Example 5.5 An investigator used a randomized block design to compare the anti-tumor effects of three drugs, A, B and C, on mouse sarcoma. Fifteen mice of the same strain were selected and divided into 5 blocks of 3; within each block, the three drugs were randomly allocated to the 3 mice.
The observed sarcoma weights are shown in Table 11.11. Test whether the effects of the three anti-tumor drugs are different.
Table 11.11 The weight of mice sarcoma with different drugs (g)
Block                  Drug A   Drug B   Drug C   Total (Bi)
1                      0.82     0.65     0.51     1.98
2                      0.73     0.54     0.23     1.50
3                      0.43     0.34     0.28     1.05
4                      0.41     0.21     0.31     0.93
5                      0.68     0.43     0.24     1.35
Total (Tj)             3.07     2.17     1.57     6.81
Sum of squares (Qj)    2.02     1.06     0.55     3.63
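The table of analysis of variance for randomized complete-block design
Source      DF           SS                                                MS                           F                P
Treatment   G-1          $SS_{B1} = \frac{1}{n}\sum_{j=1}^{G} T_j^2 - C$   $MS_{B1} = SS_{B1}/(G-1)$    $MS_{B1}/MS_E$
Block       n-1          $SS_{B2} = \frac{1}{G}\sum_{i=1}^{n} B_i^2 - C$   $MS_{B2} = SS_{B2}/(n-1)$    $MS_{B2}/MS_E$
Error       (G-1)(n-1)   $SS_E = SS_T - SS_{B1} - SS_{B2}$                 $MS_E = SS_E/[(G-1)(n-1)]$
Total       Gn-1         $SS_T = \sum_{i=1}^{n}\sum_{j=1}^{G} X_{ij}^2 - C$
where
$$C = \frac{1}{nG}\left(\sum_{i=1}^{n}\sum_{j=1}^{G} X_{ij}\right)^2$$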
Analysis of variance for the anti-tumor effects
Source      DF   SS      MS      F        P
Treatment    2   0.228   0.114   11.937   <0.004
Block        4   0.228   0.057    5.978   <0.016
Error        8   0.076   0.010
Total       14   0.532
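The entries of this table can be recomputed from the sarcoma weights in Table 11.11 using the correction term $C$ and the row and column totals. The following is a minimal sketch in plain Python with illustrative variable names; P-values are omitted because they need an F distribution routine outside the standard library.

```python
# Sarcoma weights (g) from Table 11.11: one row per block, columns = drugs A, B, C.
weights = [
    [0.82, 0.65, 0.51],
    [0.73, 0.54, 0.23],
    [0.43, 0.34, 0.28],
    [0.41, 0.21, 0.31],
    [0.68, 0.43, 0.24],
]
n = len(weights)       # number of blocks
G = len(weights[0])    # number of treatments

grand_total = sum(sum(row) for row in weights)
C = grand_total ** 2 / (n * G)                      # correction term

block_totals = [sum(row) for row in weights]                              # B_i
treat_totals = [sum(row[j] for row in weights) for j in range(G)]         # T_j

ss_total = sum(x * x for row in weights for x in row) - C
ss_treat = sum(t * t for t in treat_totals) / n - C
ss_block = sum(b * b for b in block_totals) / G - C
ss_error = ss_total - ss_treat - ss_block

ms_treat = ss_treat / (G - 1)
ms_block = ss_block / (n - 1)
ms_error = ss_error / ((G - 1) * (n - 1))
f_treat = ms_treat / ms_error
f_block = ms_block / ms_error

print(f"SS: treatment {ss_treat:.3f}, block {ss_block:.3f}, "
      f"error {ss_error:.3f}, total {ss_total:.3f}")
print(f"F: treatment {f_treat:.3f}, block {f_block:.3f}")
```

The total sum of squares is $\sum X_{ij}^2 - C$, which is about 0.53 here; the uncorrected $\sum X_{ij}^2$ alone is 3.624.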
Summary:
1. Basic logic of ANOVA;
2. ANOVA for completely random design data;
3. Multiple comparison;
4. ANOVA for randomized complete-block design
data.
THE END
THANKS !