
12-1 Chapter Twelve McGraw-Hill/Irwin © 2005 The McGraw-Hill Companies, Inc., All Rights Reserved


Chapter Twelve: Analysis of Variance

GOALS
When you have completed this chapter, you will be able to:

ONE List the characteristics of the F distribution.

TWO Conduct a test of hypothesis to determine whether the variances of two populations are equal.

THREE Discuss the general idea of analysis of variance.

FOUR Organize data into a one-way and a two-way ANOVA table.

Chapter Twelve continued: Analysis of Variance

GOALS (continued)

FIVE Define and understand the terms treatments and blocks.

SIX Conduct a test of hypothesis among three or more treatment means.

SEVEN Develop confidence intervals for the difference between treatment means.

EIGHT Conduct a test of hypothesis to determine if there is a difference among block means.

Characteristics of the F Distribution

There is a "family" of F distributions. Each member of the family is determined by two parameters: the numerator degrees of freedom and the denominator degrees of freedom.

F cannot be negative, and it is a continuous distribution.

The F distribution is positively skewed.

Its values range from 0 to ∞. As F → ∞, the curve approaches the X-axis but never touches it.
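The characteristics listed above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `f_pdf` and `f_area` and the choice of 9 and 7 degrees of freedom are illustrative assumptions, not part of the original slides. It evaluates the standard F density, confirms that the area under the curve is essentially 1, and shows that more than half of the area lies below the mean, which is what positive skew implies.

```python
import math

def f_pdf(x, d1, d2):
    """Density of the F distribution with d1 (numerator) and d2 (denominator) df."""
    if x <= 0:
        return 0.0  # F cannot be negative (the boundary x = 0 is treated as 0 here)
    log_beta = math.lgamma(d1 / 2) + math.lgamma(d2 / 2) - math.lgamma((d1 + d2) / 2)
    log_pdf = (0.5 * d1 * math.log(d1) + 0.5 * d2 * math.log(d2)
               + (0.5 * d1 - 1) * math.log(x)
               - 0.5 * (d1 + d2) * math.log(d2 + d1 * x)
               - log_beta)
    return math.exp(log_pdf)

def f_area(d1, d2, lo, hi, steps=100_000):
    """Midpoint-rule approximation of the area under the F density from lo to hi."""
    width = (hi - lo) / steps
    return width * sum(f_pdf(lo + (i + 0.5) * width, d1, d2) for i in range(steps))

d1, d2 = 9, 7                      # hypothetical member of the F "family"
mean = d2 / (d2 - 2)               # mean of an F distribution when d2 > 2
total_area = f_area(d1, d2, 0.0, 400.0)
area_below_mean = f_area(d1, d2, 0.0, mean)
print(f_pdf(-1.0, d1, d2), round(total_area, 3), area_below_mean > 0.5)
```

The upper limit of 400 stands in for infinity; the density is so close to the X-axis by then that the remaining area is negligible for this pair of degrees of freedom.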

Test for Equal Variances of Two Populations

For the two-tail test, the test statistic is given by

$$F = \frac{s_1^2}{s_2^2}$$

$s_1^2$ and $s_2^2$ are the sample variances for the two samples. The larger sample variance is placed in the numerator.

The degrees of freedom are $n_1 - 1$ for the numerator and $n_2 - 1$ for the denominator.

The null hypothesis is rejected if the computed value of the test statistic is greater than the critical value.

Example 1

Colin, a stockbroker at Critical Securities, reported that the mean rate of return on a sample of 10 internet stocks was 12.6 percent with a standard deviation of 3.9 percent.

The mean rate of return on a sample of 8 utility stocks was 10.9 percent with a standard deviation of 3.5 percent. At the .05 significance level, can Colin conclude that there is more variation in the internet stocks?

Example 1 continued

Step 1: The hypotheses are

$$H_0: \sigma_I^2 \le \sigma_U^2$$
$$H_1: \sigma_I^2 > \sigma_U^2$$

where I denotes the internet stocks and U the utility stocks.

Step 2: The significance level is .05.

Step 3: The test statistic follows the F distribution.

Example 1 continued

Step 4: H0 is rejected if F > 3.68 or if p < .05. The degrees of freedom are $n_1 - 1$ or 9 in the numerator and $n_2 - 1$ or 7 in the denominator.

Step 5: The value of F is computed as follows.

$$F = \frac{(3.9)^2}{(3.5)^2} = 1.2416$$

The p(F > 1.2416) is .3965.

H0 is not rejected. There is insufficient evidence to show more variation in the internet stocks.
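The arithmetic in Example 1 can be reproduced in a few lines. This is a sketch only: `variance_ratio_test` is a hypothetical helper name, and the critical value 3.68 is taken from the slide rather than computed, since looking it up requires an F table or statistical software.

```python
def variance_ratio_test(s_num, n_num, s_den, n_den, f_critical):
    """F test on two sample standard deviations.

    The sample suspected of having the larger variance goes in the numerator.
    f_critical must be looked up for the chosen significance level.
    """
    f_stat = s_num ** 2 / s_den ** 2
    df = (n_num - 1, n_den - 1)
    return f_stat, df, f_stat > f_critical

# Example 1: 10 internet stocks (s = 3.9) versus 8 utility stocks (s = 3.5).
# 3.68 is the critical value quoted on the slide for 9 and 7 degrees of freedom.
f_stat, df, reject = variance_ratio_test(3.9, 10, 3.5, 8, f_critical=3.68)
print(round(f_stat, 4), df, reject)  # 1.2416 (9, 7) False
```

Because the computed F of 1.2416 is below 3.68, `reject` is False, matching the decision not to reject H0.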

The ANOVA Test of Means

The F distribution is also used for testing whether two or more sample means came from the same or equal populations.

This technique is called analysis of variance or ANOVA.

The null and alternate hypotheses for four sample means are given as:

$$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$$
H1: The population means are not all equal.

Underlying assumptions for ANOVA

ANOVA requires the following conditions:

The sampled populations follow the normal distribution.

The populations have equal standard deviations.

The samples are independent.

ANOVA Test of Means

$$F = \frac{\text{Estimate of the population variance based on the differences among the sample means}}{\text{Estimate of the population variance based on the variation within the samples}}$$

Degrees of freedom for the F statistic in ANOVA

If there are k populations being sampled, the numerator degrees of freedom is k - 1.

If there are a total of n observations, the denominator degrees of freedom is n - k.

ANOVA Test of Means

ANOVA divides the Total Variation into the variation due to the treatment, Treatment Variation, and to the error component, Random Variation.

In the following table,
i stands for the ith observation
$\bar{X}_G$ is the overall or grand mean
k is the number of treatment groups

ANOVA Table

Source of Variation   Sum of Squares                                  Degrees of Freedom   Mean Square         F
Treatments (k)        SST = $\sum_k n_k(\bar{X}_k - \bar{X}_G)^2$     k - 1                MST = SST/(k - 1)   MST/MSE
Error                 SSE = $\sum_i \sum_k (X_{i,k} - \bar{X}_k)^2$   n - k                MSE = SSE/(n - k)
Total                 TSS = $\sum_i (X_i - \bar{X}_G)^2$              n - 1

SST is the treatment variation, SSE is the random variation, and TSS is the total variation.
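The reason the table can be filled in by subtraction is that the total variation splits exactly into the treatment and random parts, and the degrees of freedom split the same way. A short worked statement of that identity follows, written with the same symbols as the table; here $n_k$ is assumed to denote the number of observations in treatment k, a definition the slides use but do not spell out.

```latex
\begin{aligned}
\underbrace{\sum_{k}\sum_{i}\left(X_{i,k}-\bar{X}_G\right)^2}_{\text{TSS}}
  &= \underbrace{\sum_{k} n_k\left(\bar{X}_k-\bar{X}_G\right)^2}_{\text{SST}}
   + \underbrace{\sum_{k}\sum_{i}\left(X_{i,k}-\bar{X}_k\right)^2}_{\text{SSE}} \\
n-1 &= (k-1) + (n-k)
\end{aligned}
```

The cross term that would otherwise appear vanishes because the deviations $X_{i,k}-\bar{X}_k$ sum to zero within each treatment.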

Example 2

Rosenbaum Restaurants specialize in meals for families. Katy Polsby, President, recently developed a new meat loaf dinner. Before making it a part of the regular menu she decides to test it in several of her restaurants.

She would like to know if there is a difference in the mean number of dinners sold per day at the Aynor, Loris, and Lander restaurants. Use the .05 significance level.

Example 2 continued

Number of Dinners Sold by Restaurant

Day     Aynor   Loris   Lander
Day 1   13      10      18
Day 2   12      12      16
Day 3   14      13      17
Day 4   12      11      17
Day 5                   17

Example 2 continued

Step One: State the null hypothesis and the alternate hypothesis.

$$H_0: \mu_{Aynor} = \mu_{Loris} = \mu_{Lander}$$
H1: The treatment means are not all equal.

Step Two: Select the level of significance. This is given in the problem statement as .05.

Step Three: Determine the test statistic. The test statistic follows the F distribution.

Example 2 continued

Step Four: Formulate the decision rule. The numerator degrees of freedom, k - 1, equal 3 - 1 or 2. The denominator degrees of freedom, n - k, equal 13 - 3 or 10. The critical value of F at 2 and 10 degrees of freedom is 4.10. Thus, H0 is rejected if F > 4.10 or if p < .05.

Step Five: Select the sample, perform the calculations, and make a decision.

Using the data provided, the ANOVA calculations follow.

Computation of SSE

$$SSE = \sum_i \sum_k (X_{i,k} - \bar{X}_k)^2$$

Aynor # sold   SS(Aynor)      Loris # sold   SS(Loris)     Lander # sold   SS(Lander)
13             (13-12.75)^2   10             (10-11.5)^2   18              (18-17)^2
12             (12-12.75)^2   12             (12-11.5)^2   16              (16-17)^2
14             (14-12.75)^2   13             (13-11.5)^2   17              (17-17)^2
12             (12-12.75)^2   11             (11-11.5)^2   17              (17-17)^2
                                                           17              (17-17)^2
Sum            2.75                          5                             2

$\bar{X}_k$: 12.75 (Aynor), 11.5 (Loris), 17 (Lander)

$\bar{X}_G$: 14.00

SSE: 2.75 + 5 + 2 = 9.75

Example 2 continued: Computation of TSS

$$TSS = \sum_i (X_i - \bar{X}_G)^2$$

Aynor # sold   TSS(Aynor)   Loris # sold   TSS(Loris)   Lander # sold   TSS(Lander)
13             (13-14)^2    10             (10-14)^2    18              (18-14)^2
12             (12-14)^2    12             (12-14)^2    16              (16-14)^2
14             (14-14)^2    13             (13-14)^2    17              (17-14)^2
12             (12-14)^2    11             (11-14)^2    17              (17-14)^2
                                                        17              (17-14)^2
Sum            9.00                        30                           47

$\bar{X}_G$: 14.00

TSS: 9.00 + 30 + 47 = 86.00

SSE: 9.75

Example 2 continued: Computation of SST

$$SST = \sum_k n_k(\bar{X}_k - \bar{X}_G)^2$$

Restaurant   Mean    SST
Aynor        12.75   4(12.75-14)^2
Loris        11.50   4(11.50-14)^2
Lander       17.00   5(17.00-14)^2
Total                76.25

Shortcut: SST = TSS - SSE = 86 - 9.75 = 76.25

Example 2 continued

ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square        F
Treatments            76.25            3 - 1 = 2            76.25/2 = 38.125   38.125/.975 = 39.103
Error                 9.75             13 - 3 = 10          9.75/10 = .975
Total                 86.00            13 - 1 = 12
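The hand calculations for Example 2 can be cross-checked with a short script. This is a minimal sketch in plain Python, not the method used to produce the slides; the function name `one_way_anova` and the dictionary layout are assumptions made for illustration. It follows the same definitions of SST, SSE, and TSS as the ANOVA table.

```python
def one_way_anova(groups):
    """Return one-way ANOVA quantities for a dict of {treatment name: observations}."""
    observations = [x for values in groups.values() for x in values]
    n, k = len(observations), len(groups)
    grand_mean = sum(observations) / n

    sst = 0.0  # treatment variation
    sse = 0.0  # random (within-treatment) variation
    for values in groups.values():
        group_mean = sum(values) / len(values)
        sst += len(values) * (group_mean - grand_mean) ** 2
        sse += sum((x - group_mean) ** 2 for x in values)
    tss = sum((x - grand_mean) ** 2 for x in observations)

    mst = sst / (k - 1)
    mse = sse / (n - k)
    return {"SST": sst, "SSE": sse, "TSS": tss,
            "df": (k - 1, n - k), "MST": mst, "MSE": mse, "F": mst / mse}

# Dinners sold per day, as given in the Example 2 data table.
dinners = {
    "Aynor": [13, 12, 14, 12],
    "Loris": [10, 12, 13, 11],
    "Lander": [18, 16, 17, 17, 17],
}
table = one_way_anova(dinners)
print(table["SST"], table["SSE"], table["TSS"], table["df"], round(table["F"], 3))
# 76.25 9.75 86.0 (2, 10) 39.103
```

Computing TSS directly, rather than as SST + SSE, gives an independent check that the two parts add up to the total.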

Example 2 continued

Since an F of 39.103 > the critical F of 4.10, and the p of .000018 < α of .05, the decision is to reject the null hypothesis and conclude that at least two of the treatment means are not the same.

The p(F > 39.103) is .000018.

The mean number of meals sold at the three locations is not the same.

The ANOVA tables on the next two slides are from the Minitab and EXCEL systems.
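When the numerator has exactly 2 degrees of freedom, as it does here, the upper-tail area of the F distribution has a simple closed form, so the p-value and critical value can be checked without tables. The sketch below relies on that special case only; the function names are hypothetical, and for other numerator degrees of freedom a statistics package would be needed.

```python
def f_upper_tail_2df(f_stat, df_den):
    """P(F > f_stat) when the numerator has exactly 2 degrees of freedom."""
    return (1 + 2 * f_stat / df_den) ** (-df_den / 2)

def f_critical_2df(alpha, df_den):
    """Value of F whose upper-tail area is alpha, for 2 numerator degrees of freedom."""
    return (df_den / 2) * (alpha ** (-2 / df_den) - 1)

p_value = f_upper_tail_2df(38.125 / 0.975, df_den=10)
critical = f_critical_2df(0.05, df_den=10)
print(p_value, round(critical, 2))
```

The p-value comes out just under .00002 and the critical value rounds to 4.10, consistent with the decision to reject H0.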


Example 2 continued

Analysis of Variance

Source DF SS MS F P

Factor 2 76.250 38.125 39.10 0.000

Error 10 9.750 0.975

Total 12 86.000

Individual 95% CIs For Mean

Based on Pooled StDev

Level N Mean StDev ---------+---------+---------+-------

Aynor 4 12.750 0.957 (---*---)

Loris 4 11.500 1.291 (---*---)

Lander 5 17.000 0.707 (---*---)

---------+---------+---------+-------

Pooled StDev = 0.987 12.5 15.0 17.5

Example 2 continued: EXCEL output

Anova: Single Factor

SUMMARY
Groups   Count   Sum   Average   Variance
Aynor    4       51    12.75     0.92
Loris    4       46    11.50     1.67
Lander   5       85    17.00     0.50

ANOVA
Source of Variation   SS      df   MS      F       P-value   F crit
Between Groups        76.25   2    38.13   39.10   2E-05     4.10
Within Groups         9.75    10   0.98
Total                 86.00   12

Inferences About Treatment Means

When I reject the null hypothesis that the means are equal, I want to know which treatment means differ.

One of the simplest procedures is through the use of confidence intervals around the difference in treatment means.

Confidence Interval for the Difference Between Two Means

$$(\bar{X}_1 - \bar{X}_2) \pm t\sqrt{MSE\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

t is obtained from the t table with degrees of freedom (n - k).

MSE = [SSE/(n - k)]

If the confidence interval around the difference in treatment means includes zero, there is not a difference between the treatment means.

EXAMPLE 3

95% confidence interval for the difference in the mean number of meat loaf dinners sold in Lander and Aynor

$$(17 - 12.75) \pm 2.228\sqrt{.975\left(\frac{1}{4} + \frac{1}{5}\right)} = 4.25 \pm 1.48 = (2.77, 5.73)$$

Can Katy conclude that there is a difference between the two restaurants?

Example 3 continued

Because zero is not in the interval, we conclude that this pair of means differs.

The mean number of meals sold in Aynor is different from Lander.
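The interval in Example 3 follows directly from the confidence interval formula. Here is a small sketch of that calculation; `difference_interval` is a hypothetical helper, and the t value of 2.228 (10 degrees of freedom, 95% confidence) is taken from the slide rather than computed.

```python
import math

def difference_interval(mean_1, n_1, mean_2, n_2, mse, t_value):
    """Confidence interval for the difference between two treatment means."""
    diff = mean_1 - mean_2
    margin = t_value * math.sqrt(mse * (1 / n_1 + 1 / n_2))
    return diff - margin, diff + margin

# Lander (mean 17, n = 5) versus Aynor (mean 12.75, n = 4), MSE = .975.
low, high = difference_interval(17.0, 5, 12.75, 4, mse=0.975, t_value=2.228)
means_differ = not (low <= 0 <= high)
print(round(low, 2), round(high, 2), means_differ)  # 2.77 5.73 True
```

The same function can be applied to the other two pairs of restaurants by swapping in their means and sample sizes.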

Two-Factor ANOVA

Sometimes there are other causes of variation. For the two-factor ANOVA we test whether there is a significant difference among the treatment means and whether there is a difference among the block means (the blocking variable acts as a second treatment variable).

In the following ANOVA table, all sums of squares are computed as before, with the addition of the SSB.

$$SSB = r\sum_b (\bar{X}_b - \bar{X}_G)^2$$

where r is the number of observations in each block (equal to the number of treatments, k)

$\bar{X}_b$ is the sample mean of block b

$\bar{X}_G$ is the overall or grand mean

Two-factor ANOVA table

Source of Variation   Sum of Squares          Degrees of Freedom   Mean Square                  F
Treatments (k)        SST                     k - 1                MST = SST/(k - 1)            MST/MSE
Blocks (b)            SSB                     b - 1                MSB = SSB/(b - 1)            MSB/MSE
Error                 SSE = TSS - SST - SSB   (k - 1)(b - 1)       MSE = SSE/[(k - 1)(b - 1)]
Total                 TSS                     n - 1

Example 4

The Bieber Manufacturing Co. operates 24 hours a day, five days a week. The workers rotate shifts each week. Todd Bieber, the owner, is interested in whether there is a difference in the number of units produced when the employees work on various shifts. A sample of five workers is selected and their output recorded on each shift.

At the .05 significance level, can we conclude there is a difference in the mean production by shift and in the mean production by employee?

Example 4 continued

Employee    Day Output   Evening Output   Night Output
McCartney   31           25               35
Neary       33           26               33
Schoen      28           24               30
Thompson    30           29               28
Wagner      28           26               27

Example 4 continued

Treatment Effect

Step 1: State the null hypothesis and the alternate hypothesis.

$$H_0: \mu_1 = \mu_2 = \mu_3$$
H1: Not all means are equal.

Step 2: Select the level of significance. Given as .05.

Step 3: Determine the test statistic. The test statistic follows the F distribution.

Step 4: Formulate the decision rule. H0 is rejected if F > 4.46 (the degrees of freedom are 2 and 8) or if p < .05.

Step 5: Perform the calculations and make a decision.

Example 4 continued

Block Effect

Step 1: State the null hypothesis and the alternate hypothesis.

$$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$$
H1: Not all means are equal.

Step 2: Select the level of significance. Given as α = .05.

Step 3: Determine the test statistic. The test statistic follows the F distribution.

Step 4: Formulate the decision rule. H0 is rejected if F > 3.84, df = (4, 8), or if p < .05.

Step 5: Perform the calculations and make a decision.

Block Sums of Squares

Effects of time of day and worker on productivity

Employee    Day   Evening   Night   Employee mean   SSB
McCartney   31    25        35      30.33           3(30.33-28.87)^2 = 6.42
Neary       33    26        33      30.67           3(30.67-28.87)^2 = 9.68
Schoen      28    24        30      27.33           3(27.33-28.87)^2 = 7.08
Thompson    30    29        28      29.00           3(29.00-28.87)^2 = .05
Wagner      28    26        27      27.00           3(27.00-28.87)^2 = 10.49

SSB = 6.42 + 9.68 + 7.08 + .05 + 10.49 = 33.73

Note: $\bar{X}_G$ = 28.87

Example 4 continued

Compute the remaining sums of squares as before:

TSS = 139.73
SST = 62.53
SSE = 43.47 (139.73 - 62.53 - 33.73)

df(block) = 4 (b - 1)
df(treatment) = 2 (k - 1)
df(error) = 8 (k - 1)(b - 1)

Example 4 continued

ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square       F
Treatments (k)        62.53            2                    62.53/2 = 31.27   31.27/5.43 = 5.75
Blocks (b)            33.73            4                    33.73/4 = 8.43    8.43/5.43 = 1.55
Error                 43.47            8                    43.47/8 = 5.43
Total                 139.73           14
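The two-factor table for Example 4 can be reproduced from the raw output figures. The following is a sketch under the same definitions used in the slides (SSE obtained by subtraction, error degrees of freedom (k - 1)(b - 1)); the function name `two_way_anova` and the data layout are illustrative assumptions.

```python
def two_way_anova(rows):
    """Two-factor ANOVA without replication.

    rows maps each block (here, an employee) to one observation per treatment
    (here, a shift), with the treatments in the same order for every block.
    """
    data = list(rows.values())
    b, k = len(data), len(data[0])   # number of blocks, number of treatments
    n = b * k
    grand_mean = sum(sum(r) for r in data) / n

    treatment_means = [sum(r[j] for r in data) / b for j in range(k)]
    block_means = [sum(r) / k for r in data]

    sst = b * sum((m - grand_mean) ** 2 for m in treatment_means)
    ssb = k * sum((m - grand_mean) ** 2 for m in block_means)
    tss = sum((x - grand_mean) ** 2 for r in data for x in r)
    sse = tss - sst - ssb

    mse = sse / ((k - 1) * (b - 1))
    return {"SST": sst, "SSB": ssb, "SSE": sse, "TSS": tss,
            "F_treatment": (sst / (k - 1)) / mse,
            "F_block": (ssb / (b - 1)) / mse}

# Units produced on the day, evening, and night shifts.
output = {
    "McCartney": [31, 25, 35],
    "Neary": [33, 26, 33],
    "Schoen": [28, 24, 30],
    "Thompson": [30, 29, 28],
    "Wagner": [28, 26, 27],
}
result = two_way_anova(output)
print({name: round(value, 2) for name, value in result.items()})
```

Rounded to two decimals, the results agree with the table: SST 62.53, SSB 33.73, SSE 43.47, TSS 139.73, and F values of 5.75 for shifts and 1.55 for employees.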

Example 4 continued

Block Effect

Since the computed F of 1.55 < the critical F of 3.84, and the p of .28 > α of .05, H0 is not rejected. There is no significant difference in the average number of units produced for the different employees.

Treatment Effect

Since the computed F of 5.75 > the critical F of 4.46, and the p of .03 < α of .05, H0 is rejected. There is a difference in the mean number of units produced for the different time periods.
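The two p-values quoted above can be checked without tables because both degrees of freedom are even in each test. In that special case the upper-tail area of the F distribution reduces to a finite binomial sum. The sketch below uses that identity; `f_upper_tail_even` is a hypothetical name, and the shortcut does not apply when either degrees of freedom is odd (as in Example 1).

```python
from math import comb

def f_upper_tail_even(f_stat, df_num, df_den):
    """P(F > f_stat) when both degrees of freedom are even.

    Uses the identity between the regularized incomplete beta function with
    integer parameters and a binomial tail probability.
    """
    if df_num % 2 or df_den % 2:
        raise ValueError("this shortcut needs even degrees of freedom")
    a, b = df_den // 2, df_num // 2
    y = df_den / (df_den + df_num * f_stat)
    m = a + b - 1
    return sum(comb(m, j) * y ** j * (1 - y) ** (m - j) for j in range(a, m + 1))

mse = 43.47 / 8
p_shift = f_upper_tail_even((62.53 / 2) / mse, df_num=2, df_den=8)
p_worker = f_upper_tail_even((33.73 / 4) / mse, df_num=4, df_den=8)
print(round(p_shift, 3), round(p_worker, 3))
```

The shift p-value falls below .05 and the worker p-value does not, in line with the two decisions stated above.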


Example 4 continued

Two-way ANOVA: Units versus Worker, Shift

Analysis of Variance for Units

Source DF SS MS F P

Worker 4 33.73 8.43 1.55 0.276

Shift 2 62.53 31.27 5.75 0.028

Error 8 43.47 5.43

Total 14 139.73

Minitab output

Example 4 continued: Output Using EXCEL

Anova: Two-Factor Without Replication

SUMMARY     Count   Sum   Average   Variance
Day         5       150   30.0      4.5
Evening     5       130   26.0      3.5
Night       5       153   30.6      11.3

McCartney   3       91    30.33     25.33
Neary       3       92    30.67     16.33
Schoen      3       82    27.33     9.33
Thompson    3       87    29.00     1
Wagner      3       81    27.00     1

ANOVA
Source of Variation   SS       df   MS      F      P-value   F crit
Rows                  62.53    2    31.27   5.75   0.03      4.46
Columns               33.73    4    8.43    1.55   0.28      3.84
Error                 43.47    8    5.43
Total                 139.73   14