Economics 173: Business Statistics
Lectures 9 & 10
Summer, 2001
Professor J. Petry
Analysis of Variance
Chapter 14
14.1 Introduction
• Analysis of variance helps compare two or more populations of quantitative data.
• Specifically, we are interested in the relationships among the population means (are they equal or not?).
• The procedure works by analyzing the sample variance.
• Analysis of variance is a procedure that tests whether differences exist among two or more population means.
• To do this, the technique analyzes the variance of the data. How can analyzing variance help us understand the relationships among population means?
• If the ratio of the variance between samples to the variance within samples is large, then the means of the samples are likely to be unequal (i.e., reject H0).
14.2 Single-Factor (One-Way) Analysis of Variance: Independent Samples
• Example 14.1
– An apple juice manufacturer is planning to develop a new product: a liquid concentrate.
– The marketing manager has to decide how to market the new product.
– Three strategies are considered:
• Emphasize the convenience of using the product.
• Emphasize the quality of the product.
• Emphasize the product's low price.
• Example 14.1 - continued
– An experiment was conducted as follows:
• In three cities an advertising campaign was launched.
• In each city only one of the three characteristics (convenience, quality, and price) was emphasized.
• The weekly sales were recorded for twenty weeks following the beginning of the campaigns.
Convenience  Quality  Price
529          804      672
658          630      531
793          774      443
514          717      596
663          679      602
719          604      502
711          620      659
606          697      689
461          706      675
529          615      512
498          492      691
663          719      733
604          787      698
495          699      776
485          572      561
557          523      572
353          584      469
557          634      581
542          580      679
614          624      532
• Example 14.1 - continued
– Data: weekly sales (see file XM14-01).
• Solution
– The data are quantitative.
– Our problem objective is to compare sales in three cities.
– We hypothesize on the relationships among the three mean weekly sales:

H0: μ1 = μ2 = μ3
H1: At least two means differ
To perform the analysis of variance we need to build an “F” statistic.
To more easily follow the process we use the following notation:
Independent samples are drawn from k populations. Each population is called a "treatment".

Treatment:      1            2            ...  k
Observations:   x11          x12          ...  x1k
                x21          x22          ...  x2k
                ...          ...               ...
                x(n1,1)      x(n2,2)      ...  x(nk,k)
Sample size:    n1           n2           ...  nk
Sample mean:    x̄1           x̄2           ...  x̄k

Here x11 is the first observation of the first sample, x22 is the second observation of the second sample, and so on. X is the "response variable"; its values are called "responses". The average of the sample means is the "grand mean".
• The Test Statistic
• The test stems from the following rationale:
– If the null hypothesis is true, we would expect all the sample means to be close to one another (and, as a result, to the grand mean).
– If the alternative hypothesis is true, at least some of the sample means would differ from one another.
[Dot plots of two data sets, each with three treatments whose sample means are x̄1 = 10, x̄2 = 15, x̄3 = 20]

A small variability within the samples makes it easier to draw a conclusion about the population means. In the second data set the sample means are the same as before, but the larger within-sample variability (= noise = "error") makes it harder to draw a conclusion about the population means.

If the between-sample variance is large relative to the within-sample variance, then the sample means are likely unequal.
• The variability among the sample means is measured as the sum of squared distances between each mean and the grand mean. This sum is called the Sum of Squares for Treatments (SST):

SST = Σ_{j=1..k} n_j (x̄_j − x̄)²

where there are k treatments, n_j is the size of sample j, x̄_j is the mean of sample j, and x̄ is the grand mean. In our example the treatments are the different advertising strategies.

Note: when the sample means are close to one another, their distances from the grand mean are small, leading to a small SST. Thus a large SST indicates large variation among the sample means.

The grand mean is calculated by

x̄ = (n1·x̄1 + n2·x̄2 + ... + nk·x̄k) / (n1 + n2 + ... + nk)
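As a quick check, the grand mean of Example 14.1 can be computed as the weighted average of the sample means; a minimal sketch using the sample sizes and means quoted in these slides:

```python
# Grand mean as a weighted average of the sample means (Example 14.1 values)
sizes = [20, 20, 20]              # n1, n2, n3
means = [577.55, 653.00, 608.65]  # sample mean weekly sales per strategy

grand_mean = sum(n * m for n, m in zip(sizes, means)) / sum(sizes)
# With equal sample sizes this reduces to the simple average of the means.
```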
• The variability within samples is measured by adding up the squared distances between each observation and its sample mean. This sum is called the Sum of Squares for Error (SSE):

SSE = Σ_{j=1..k} Σ_{i=1..n_j} (x_ij − x̄_j)²

In our example this is the sum of all squared differences between sales in city j and the sample mean of city j, over all three cities.
Calculation of SST:

SST = Σ_{j=1..k} n_j (x̄_j − x̄)²
    = 20(577.55 − 613.07)² + 20(653.00 − 613.07)² + 20(608.65 − 613.07)²
    = 57,512.23

Calculation of SSE:

SSE = Σ_{j=1..k} Σ_{i=1..n_j} (x_ij − x̄_j)² = (n1 − 1)s1² + (n2 − 1)s2² + (n3 − 1)s3²
    = (20 − 1)(10,774.44) + (20 − 1)(7,238.61) + (20 − 1)(8,669.47)
    = 506,967.88
To perform the test we need to calculate the mean sums of squares as follows:

Calculation of MST (Mean Square for Treatments):
MST = SST/(k − 1) = 57,512.23/(3 − 1) = 28,756.12

Calculation of MSE (Mean Square for Error):
MSE = SSE/(n − k) = 506,967.88/(60 − 3) = 8,894.17

The test statistic is

F = MST/MSE = 28,756.12/8,894.17 = 3.23

with degrees of freedom v1 = k − 1 and v2 = n − k. We require that:
1. The populations tested are normally distributed.
2. The variances of all the populations tested are equal.

H0: μ1 = μ2 = ... = μk
H1: At least two means differ
Test statistic: F = MST/MSE
Rejection region: F > F_{α, k−1, n−k}
Specifically, in our advertisement problem, the hypothesis test looks like:

H0: μ1 = μ2 = μ3
H1: At least two means differ

Test statistic: F = MST/MSE = 28,756.12/8,894.17 = 3.23
Rejection region: F > F_{α, k−1, n−k} = F_{.05, 2, 57} ≈ 3.15

Since 3.23 > 3.15, there is sufficient evidence to reject H0 in favor of H1, and argue that at least one of the mean sales differs from the others.
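The whole one-way ANOVA calculation above can be sketched in a few lines of standard-library Python using the Example 14.1 data; the variable names are illustrative only:

```python
from statistics import mean, variance

# Weekly sales from Example 14.1, one list per advertising strategy
convenience = [529, 658, 793, 514, 663, 719, 711, 606, 461, 529,
               498, 663, 604, 495, 485, 557, 353, 557, 542, 614]
quality     = [804, 630, 774, 717, 679, 604, 620, 697, 706, 615,
               492, 719, 787, 699, 572, 523, 584, 634, 580, 624]
price       = [672, 531, 443, 596, 602, 502, 659, 689, 675, 512,
               691, 733, 698, 776, 561, 572, 469, 581, 679, 532]

samples = [convenience, quality, price]
k = len(samples)
n = sum(len(s) for s in samples)
grand = sum(sum(s) for s in samples) / n

# Between-samples variation: SST = sum of n_j * (xbar_j - xbar)^2
sst = sum(len(s) * (mean(s) - grand) ** 2 for s in samples)
# Within-samples variation: SSE = sum of (n_j - 1) * s_j^2
sse = sum((len(s) - 1) * variance(s) for s in samples)

mst = sst / (k - 1)
mse = sse / (n - k)
f_stat = mst / mse   # compare with F(.05, k-1, n-k)
```

Using exact (unrounded) sample variances gives SSE ≈ 506,983.5 and F ≈ 3.23, matching the Excel printout.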
Anova: Single Factor

SUMMARY
Groups    Count  Sum    Average  Variance
Convnce   20     11551  577.55   10774.997
Quality   20     13060  653      7238.1053
Price     20     12173  608.65   8670.2395

ANOVA
Source of Variation  SS         df  MS         F          P-value   F crit
Between Groups       57512.233  2   28756.117  3.2330414  0.046773  3.1588456
Within Groups        506983.5   57  8894.4474
Total                564495.73  59
SS(Total) = SST + SSE
Checking the required conditions
• The F test requires that the populations are normally distributed with equal variances.
• From the Excel printout we compare the sample variances: 10774, 7238, 8670. It seems plausible that the population variances are equal (see Section 14.8 for Bartlett's test of equality of variances).
• To check normality, observe the histogram of each sample.
[Histograms of weekly sales for Convenience, Quality, and Price, each on a scale from 450 to 850+]
All the distributions seem to be normal.
14.3 Analysis of Variance Models
• Several elements may distinguish one experimental design from another.
– The number of factors.
• Each characteristic investigated is called a factor.
• Each factor has several levels.
[Diagram: a one-way ANOVA records a response for each of several treatments of a single factor; a two-way ANOVA records a response for each combination of the levels of two factors, e.g. Factor A with levels 1-2 and Factor B with levels 1-3.]
– Independent samples or blocks.
• Groups of matched observations are formed into blocks in order to remove the effects of "noise" variability.
• By doing so we improve the chances of detecting the variability of interest.
– Fixed- and random-effects models.
• If all levels of a factor included in our analysis are predetermined, we have a fixed-effects ANOVA.
– The conclusion of a fixed-effects ANOVA applies only to the levels studied.
• If the levels included in our analysis represent a random sample of all the possible levels, we have a random-effects ANOVA.
– The conclusion of a random-effects ANOVA applies to all the levels (not only those studied).
• The calculation of the test statistic for fixed and random effects may differ for some ANOVA models.
14.4 Single-Factor Analysis of Variance: Randomized Blocks
• The purpose of designing a randomized block experiment is to reduce the within-treatments variation, thus increasing the relative among-treatments variation.
• This helps in detecting differences among the treatment means more easily.
[Diagram: observations with some commonality across treatments are grouped into blocks.]

Blocked samples from k populations (treatments):

           Treatment
Block      1      2      ...  k      Block mean
1          x11    x12    ...  x1k    x̄[B]1
2          x21    x22    ...  x2k    x̄[B]2
...
b          xb1    xb2    ...  xbk    x̄[B]b
Treatment
mean:      x̄[T]1  x̄[T]2  ...  x̄[T]k
• The sum of squares total is partitioned into three sources of variation:
– Treatments (sum of squares for treatments, SST)
– Blocks (sum of squares for blocks, SSB)
– Within samples (sum of squares for error, SSE)

SS(Total) = SST + SSB + SSE

Recall that for independent samples, SS(Total) = SST + SSE.

To perform hypothesis tests for treatments and blocks we need:
• Mean square for treatments: MST = SST/(k − 1)
• Mean square for blocks: MSB = SSB/(b − 1)
• Mean square for error: MSE = SSE/(n − k − b + 1)

Test statistic for treatments: F = MST/MSE
Test statistic for blocks: F = MSB/MSE
• Example 14.2
– A radio station manager wants to know whether the amount of time his listeners spend listening to the radio per day is about the same every day of the week.
– 200 teenagers were asked to record how long they spend listening to the radio each day of the week.
• Solution
– The problem objective is to compare seven populations.
– The data are quantitative.
– Each day of the week can be considered a treatment.
– Each set of 7 data points (per person) can be blocked, because they belong to the same person.
– This procedure eliminates the variability in "radio time" among teenagers, and helps detect differences in the mean times teenagers listen to the radio among the days of the week.
Checking the required conditions
• Observing the histograms of the seven populations, we can assume that all the distributions are approximately normal.

[Histograms of daily listening times for Sunday and Monday, on a scale from 40 to 120+ minutes]
• The population variances seem to be equal. See the sample variances:

Sunday     462.9742
Monday     502.1718
Tuesday    506.2758
Wednesday  540.7065
Thursday   483.7455
Friday     484.6227
Saturday   481.6128

ANOVA
Source of Variation        SS        df    MS        F         P-value   F crit
Rows (blocks, b−1)         209834.6  199   1054.445  2.627722  1.04E-23  1.187531
Columns (treatments, k−1)  28673.73  6     4778.955  11.90936  5.14E-13  2.106162
Error                      479125.1  1194  401.2773
Total                      717633.5  1399

Here F for treatments = MST/MSE and F for blocks = MSB/MSE.
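A minimal sketch of how the F ratios in the printout are obtained from the sums of squares and degrees of freedom (values copied from the table above):

```python
# Mean squares and F ratios for the randomized block design of Example 14.2
ss_blocks, df_blocks = 209834.6, 199   # Rows: 200 teenagers -> b - 1 = 199
ss_treat,  df_treat  = 28673.73, 6     # Columns: 7 days -> k - 1 = 6
ss_error,  df_error  = 479125.1, 1194  # n - k - b + 1 = 1400 - 7 - 200 + 1

msb = ss_blocks / df_blocks   # mean square for blocks
mst = ss_treat / df_treat     # mean square for treatments
mse = ss_error / df_error     # mean square for error

f_treat = mst / mse   # compare with F(.05, 6, 1194)
f_block = msb / mse   # compare with F(.05, 199, 1194)
```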
Conclusion: At the 5% significance level there is sufficient evidence to reject the null hypothesis, and infer that mean "radio time" differs on at least one day of the week.
14.5 Two-Factor Analysis of Variance: Independent Samples

• Example 14.3
– Suppose that in Example 14.1 two factors are to be examined:
• The effects of the marketing approach on sales.
– Emphasis on convenience
– Emphasis on quality
– Emphasis on price
• The effects of the selected media on sales.
– Advertise on TV
– Advertise in newspapers

• Example 14.3 - continued
– The combinations of levels, one from each factor, define the treatments. For example:
1. Emphasize convenience + advertise on TV.
2. Emphasize convenience + advertise in newspaper.
– The hypotheses tested are:

H0: μ1 = μ2 = μ3 = μ4 = μ5 = μ6
H1: At least two means differ.

– We assume that the only existing levels are those studied. Thus, this is a fixed-effects factorial experiment.

• We can design the experiment as follows:

City 1           City 2        City 3      City 4           City 5        City 6
Convenience & TV Quality & TV  Price & TV  Convenience &    Quality &     Price &
                                           Newspaper        Newspaper     Newspaper
• This is a one-way ANOVA experimental design. The p-value = .045, so we conclude that there is strong evidence that differences exist in the mean weekly sales.
• Are these differences caused by differences in the marketing approach?
• Are these differences caused by differences in the medium used for advertising?
• Are there combinations of these two factors that interact to affect the weekly sales?
• A new experimental design is needed to answer these questions.
A two-way layout for the new design:

              Convenience    Quality        Price
TV            City 1 sales   City 3 sales   City 5 sales
Newspapers    City 2 sales   City 4 sales   City 6 sales

Factor A: Marketing strategy (columns). Factor B: Advertising media (rows).

Are there differences in the mean sales caused by different marketing strategies? Test whether the mean sales of "Convenience", "Quality", and "Price" differ significantly from one another.
[Same two-way layout: TV and Newspapers rows; Convenience, Quality, Price columns; one city's sales per cell]

Are there differences in the mean sales caused by different advertising media? Test whether the mean sales of "TV" and "Newspapers" differ significantly from one another. Use SS(B).
[Same two-way layout: TV and Newspapers rows; Convenience, Quality, Price columns; one city's sales per cell]

Are there differences in the mean sales caused by interaction between marketing strategy and advertising medium? Test whether the mean sales of certain cells differ from the level expected from the two factors separately.
Graphical description of the possible relationships between factors A and B. Each panel plots the mean response against the levels (1, 2, 3) of factor A, with one line per level of factor B:
• Difference among the levels of factor A; no difference among the levels of factor B (the lines for levels 1 and 2 of factor B coincide).
• Difference among the levels of factor A, and difference among the levels of factor B; no interaction (parallel lines).
• No difference among the levels of factor A; difference among the levels of factor B.
• Interaction (the lines cross or diverge).
Sums of squares (a and b are the numbers of levels of factors A and B; r is the number of observations per cell):

SS(A) = rb Σ_{i=1..a} (x̄[A]_i − x̄)²
      = rb{(x̄_conv. − x̄)² + (x̄_quality − x̄)² + (x̄_price − x̄)²}

SS(B) = ra Σ_{j=1..b} (x̄[B]_j − x̄)²
      = ra{(x̄_TV − x̄)² + (x̄_Newspaper − x̄)²}

SS(AB) = r Σ_{i=1..a} Σ_{j=1..b} (x̄[AB]_ij − x̄[A]_i − x̄[B]_j + x̄)²

SS(E) = Σ_{i=1..a} Σ_{j=1..b} Σ_{k=1..r} (x_ijk − x̄[AB]_ij)²
F tests for the two-way ANOVA

• Test for differences among the levels of the main factors A and B:

F = MS(A)/MSE          F = MS(B)/MSE

Rejection regions: F > F_{α, a−1, n−ab} and F > F_{α, b−1, n−ab}

• Test for interaction between factors A and B:

F = MS(AB)/MSE

Rejection region: F > F_{α, (a−1)(b−1), n−ab}

where MS(A) = SS(A)/(a − 1), MS(B) = SS(B)/(b − 1), MS(AB) = SS(AB)/[(a − 1)(b − 1)], and MSE = SSE/(n − ab).
• Example 14.3 - continued
– Test of the difference in mean sales among the three marketing approaches:

H0: μconv. = μquality = μprice
H1: At least two mean sales differ

F = MS(Marketing strategy)/MSE = 5.33 (see computer printout below)
Fcritical = F_{α, a−1, n−ab} = F_{.05, 3−1, 60−(3)(2)} ≈ 3.17

– At the 5% significance level there is evidence to infer that differences in weekly sales exist among the marketing strategies.
• Example 14.3 - continued
– Test of the difference in mean sales between the two advertising media:

H0: μTV = μNewspaper
H1: The two mean sales differ

F = MS(Media)/MSE = 1.42 (see computer printout below)
Fcritical = F_{α, b−1, n−ab} = F_{.05, 2−1, 60−(3)(2)} ≈ 4.00

– At the 5% significance level there is insufficient evidence to infer that differences in weekly sales exist between the two advertising media.
• Example 14.3 - continued
– Test for interaction between factors A and B:

H0: μTV*conv. = μTV*quality = ... = μnewsp.*price
H1: At least two means differ

F = MS(Marketing*Media)/MSE = .09 (see computer printout below)
Fcritical = F_{α, (a−1)(b−1), n−ab} = F_{.05, (3−1)(2−1), 60−(3)(2)} ≈ 3.17

– At the 5% significance level there is insufficient evidence to infer that the two factors interact to affect the mean weekly sales.
            Convenience  Quality  Price
TV          491          677      575
TV          712          627      614
TV          558          590      706
TV          447          632      484
TV          479          683      478
TV          624          760      650
TV          546          690      583
TV          444          548      536
TV          582          579      579
TV          672          644      795
Newspaper   464          689      803
Newspaper   559          650      584
Newspaper   759          704      525
Newspaper   557          652      498
Newspaper   528          576      812
Newspaper   670          836      565
Newspaper   534          628      708
Newspaper   657          798      546
Newspaper   557          497      616
Newspaper   474          841      587
The two-way ANOVA Excel solution (Factor A = marketing strategies, Factor B = advertising media):

ANOVA
Source of Variation    SS        df  MS        F         P-value   F crit
Sample (B: media)      13172.02  1   13172.02  1.419351  0.23872   4.01954
Columns (A: strategy)  98838.63  2   49419.32  5.32518   0.007748  3.168246
Interaction            1609.633  2   804.8167  0.086723  0.917058  3.168246
Within (error)         501136.7  54  9280.309
Total                  614757    59
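The mean squares and F ratios in the printout can be reproduced from the SS and df columns alone; a minimal sketch, with values copied from the table above:

```python
# Two-way ANOVA F ratios for Example 14.3
a, b = 3, 2   # levels of factor A (strategy) and factor B (media)
n = 60        # total observations (10 per cell)

ss_a, ss_b, ss_ab, ss_e = 98838.63, 13172.02, 1609.633, 501136.7

ms_a  = ss_a / (a - 1)                # MS(A)
ms_b  = ss_b / (b - 1)                # MS(B)
ms_ab = ss_ab / ((a - 1) * (b - 1))   # MS(AB)
mse   = ss_e / (n - a * b)            # MSE, df = n - ab = 54

f_a  = ms_a / mse    # strategies: compare with F(.05, 2, 54)
f_b  = ms_b / mse    # media: compare with F(.05, 1, 54)
f_ab = ms_ab / mse   # interaction: compare with F(.05, 2, 54)
```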
14.7 Multiple Comparisons
• When the null hypothesis is rejected, it may be desirable to find which mean(s) differ, and to rank the means.
• Three statistical inference procedures geared at doing this are presented:
– Fisher's least significant difference (LSD) method
– Bonferroni adjustment
– Tukey's multiple comparison method
• Two means are considered different if the difference between the corresponding sample means is larger than a critical number. The larger sample mean is then believed to be associated with a larger population mean.
• Conditions common to all the methods presented here:
– The ANOVA model is the independent-samples, single-factor model.
– The conditions required to perform the ANOVA are satisfied.
– The experiment is fixed-effects.
Fisher's Least Significant Difference (LSD) Method
• This method builds on the equal-variances t-test of the difference between two means.
• The test statistic is improved by using MSE rather than sp².
• We can conclude that μi and μj differ (at the α significance level) if |x̄i − x̄j| > LSD, where

LSD = t_{α/2} √(MSE(1/n_i + 1/n_j)),   d.f. = n − k
The Bonferroni Adjustment
• Fisher's method may result in an increased probability of committing a type I error (α).
• The Bonferroni adjustment determines the required type I error probability per pairwise comparison so as to secure a predetermined overall α_E.
• The procedure:
– Compute the number of pairwise comparisons: C = k(k − 1)/2, where k is the number of populations.
– Set α = α_E/C, where α_E is the probability of making at least one type I error (called the experimentwise type I error).
– We can conclude that μi and μj differ (at the α_E experimentwise significance level) if

|x̄i − x̄j| > t_{α_E/(2C)} √(MSE(1/n_i + 1/n_j)),   d.f. = n − k
• Example 14.1 - continued
– Rank the effectiveness of the marketing strategies (based on mean weekly sales).
– Use Fisher's method and the Bonferroni adjustment method.
• Solution (Fisher's method)
– The sample mean sales were 577.55, 653.00, 608.65. Then,

|x̄1 − x̄2| = |577.55 − 653.00| = 75.45
|x̄1 − x̄3| = |577.55 − 608.65| = 31.10
|x̄2 − x̄3| = |653.00 − 608.65| = 44.35

LSD = t_{.05/2} √(MSE(1/n_i + 1/n_j)) = t_{.025, 57} √(8894(1/20 + 1/20)) = 59.71

Only |x̄1 − x̄2| = 75.45 exceeds the LSD, so the significant difference is between μ1 and μ2.
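A small sketch of the LSD comparison, assuming t(.025, 57) ≈ 2.002 from a t-table; the dictionary labels are illustrative:

```python
from math import sqrt

# Fisher's LSD for Example 14.1; t(.025, 57) ~ 2.002 is an assumed table value
mse, t_crit = 8894, 2.002
ni = nj = 20
means = {"convenience": 577.55, "quality": 653.00, "price": 608.65}

lsd = t_crit * sqrt(mse * (1 / ni + 1 / nj))

# Flag each pair whose sample means differ by more than the LSD
pairs = [("convenience", "quality"), ("convenience", "price"), ("quality", "price")]
differ = {(x, y): abs(means[x] - means[y]) > lsd for x, y in pairs}
```

Only the convenience-quality pair (difference 75.45) is flagged, matching the slide's conclusion.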
• Solution (the Bonferroni adjustment)
– We calculate C = k(k − 1)/2 = 3(2)/2 = 3.
– We set α = .05/3 = .0167; thus t_{.00833, 60−3} = 2.467 (Excel).

t_{α/2} √(MSE(1/n_i + 1/n_j)) = 2.467 √(8894(1/20 + 1/20)) = 73.54

|x̄1 − x̄2| = 75.45,  |x̄1 − x̄3| = 31.10,  |x̄2 − x̄3| = 44.35

Again, the significant difference is between μ1 and μ2.
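The Bonferroni-adjusted comparison can be sketched the same way; t = 2.467 is the value the slides report from Excel:

```python
from math import sqrt

# Bonferroni adjustment for Example 14.1
k = 3
C = k * (k - 1) // 2      # 3 pairwise comparisons
alpha_e = 0.05
alpha = alpha_e / C       # per-comparison significance level ~ .0167
t_crit = 2.467            # t(.00833, 57), as reported from Excel

mse, ni, nj = 8894, 20, 20
critical = t_crit * sqrt(mse * (1 / ni + 1 / nj))

# Pairwise differences: 75.45, 31.10, 44.35; only 75.45 exceeds the critical value
```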
The Tukey Multiple Comparisons
• The test procedure:
– Find a critical number ω as follows:

ω = q_α(k, ν) √(MSE/n_g)

k = the number of samples
ν = degrees of freedom = n − k
n_g = number of observations per sample (recall, all the sample sizes are the same)
α = significance level
q_α(k, ν) = a critical value (the studentized range) obtained from a table

– Select a pair of means. Calculate the difference between the larger and the smaller mean, x̄max − x̄min.
– If x̄max − x̄min > ω, there is sufficient evidence to conclude that μmax > μmin.
– Repeat this procedure for each pair of samples. Rank the means if possible.

If the sample sizes are different, use the above procedure provided the sizes are similar. For n_g use the harmonic mean:

n_g = k / (1/n1 + 1/n2 + ... + 1/nk)
• Example 14.1 - continued
– We had three populations (three marketing strategies): k = 3. The sample sizes were equal: n1 = n2 = n3 = 20; ν = n − k = 60 − 3 = 57; MSE = 8894.

ω = q_{.05}(3, 57) √(MSE/n_g) = q_{.05}(3, 57) √(8894/20) = 71.70

(Take q_{.05}(3, 60) from the table.)

City 1 vs. City 2: 653.00 − 577.55 = 75.45
City 1 vs. City 3: 608.65 − 577.55 = 31.10
City 2 vs. City 3: 653.00 − 608.65 = 44.35

Population        Mean
Sales - City 1    577.55
Sales - City 2    653.00
Sales - City 3    608.65

Only the City 1 vs. City 2 difference (75.45) exceeds ω = 71.70.
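A sketch of the Tukey procedure; q.05(3, 60) ≈ 3.40 is assumed from a studentized-range table:

```python
from math import sqrt

# Tukey multiple comparisons for Example 14.1
k, n = 3, 60
nu = n - k                # 57 degrees of freedom
mse, n_g = 8894, 20
q_crit = 3.40             # q(.05)(3, 60), assumed studentized-range table value

omega = q_crit * sqrt(mse / n_g)

# Differences between the larger and smaller mean of each pair
diffs = {"1 vs 2": 653.00 - 577.55,
         "1 vs 3": 608.65 - 577.55,
         "2 vs 3": 653.00 - 608.65}
significant = {pair: d > omega for pair, d in diffs.items()}
```

Note that ω = 71.70 is larger than the LSD (59.71): Tukey's procedure controls the experimentwise error rate, so it is more conservative for any single pair.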
14.8 Bartlett’s Test
• This procedure is conducted when testing
• The test statistic is
differsiancevaroneleastAt:H
...:H
1
2k
22
210
kn1
1n1
)1k(31
1C
where
)sln()1n()MSEln()kn(C1
B
k
1i i
k
1i
2ii
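Bartlett's statistic for the three samples of Example 14.1 can be sketched directly from the definitions above; the sample variances are taken from the Excel printout:

```python
from math import log

# Bartlett's test for Example 14.1
variances = [10774.997, 7238.105, 8670.240]   # sample variances from the printout
sizes = [20, 20, 20]
k = len(variances)
n = sum(sizes)

# MSE is the pooled variance
mse = sum((ni - 1) * s2 for ni, s2 in zip(sizes, variances)) / (n - k)

c = 1 + (1 / (3 * (k - 1))) * (sum(1 / (ni - 1) for ni in sizes) - 1 / (n - k))
b_stat = (1 / c) * ((n - k) * log(mse)
                    - sum((ni - 1) * log(s2) for ni, s2 in zip(sizes, variances)))

# b_stat is well below chi-square(.05, 2) = 5.99, so we do not reject
# the hypothesis of equal variances, consistent with Section 14.2's check.
```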