Go to Table of Content Analysis of Variance: Randomized Blocks Farrokh Alemi Ph.D. Kashif Haqqi M.D.

Analysis of Variance: Randomized Blocks

Farrokh Alemi Ph.D.

Kashif Haqqi M.D.

Additional Reading

• For additional reading, see Chapter 13 in Michael R. Middleton’s Data Analysis Using Excel, Duxbury Thomson Publishers, 2000.

• The example described in this lecture is based in part on Chapter 14, Sections 3 through 5, of Keller and Warrack’s Statistics for Management and Economics, Fifth Edition, Duxbury Thomson Learning Publisher, 2000.

• Read any introductory statistics book about Analysis of Variance.

Which Approach Is Appropriate When?

• The Analysis of Variance described here extends single factor ANOVA to multiple factors and to the comparison of more than 2 matched groups.

• Choosing the right method for the data is the key statistical expertise that you need to have.

• You might want to review a decision tool that we have organized for you to help you in choosing the right statistical method.

Do I Need to Know the Formulas?

• You do not need to know exact formulas.

• You do need to know where they are in your reference book.

• You do need to understand the concept behind them and the general statistical concepts embedded in the use of the formulas.

• You do not need to be able to do Analysis of Variance by hand. You must be able to do it on a computer using Excel or other software.

Table of Content

• Objectives

• Randomized Block Design

• Repeated Measure Design

• Sources of Variance

• Test Statistic

• An Example

• Assumptions

• Results of ANOVA

• Understanding Blocking

• ANOVA with replication

• Factorial Experimental Design

• An Example

• How to Analyze Data From Factorial Designs?

• Take Home Lesson

Objectives

• To learn the assumptions and the interpretation of Analysis of Variance for randomized block design.

• To learn assumptions and the interpretation of Analysis of Variance for multifactor models.

• To use Excel to do Analysis of Variance.

Single and Multiple Factors

• The ANOVA we discussed so far applies to a single factor and one quantitative response variable.

• We have seen in paired matched studies how making sure that the same or similar subjects receive the treatments reduces variation and allows more informative tests.

• We now extend the ANOVA model described earlier to situations where more than 2 populations are matched or, in our new terminology, to situations where there is a “randomized block design”.

Randomized Block Design

• If the subjects who receive a particular treatment are the same, or essentially the same, then we have a randomized block design.

• For example, if different treatments are provided to patients in low, medium and high severity groups, then severity is used to create a block design.

• A block design removes differences among the experimental subjects within a particular treatment and therefore reduces the variation in the response variable.

Repeated Measure Design

• Is a special form of randomized block design in which the same subjects receive different treatments.

• For example, surveying the same patients at monthly intervals is a repeated measure design.

• The same patients receive different treatments.

• Repeated measures reduce variation due to differences of subjects across treatment programs.

Sources of Variance

In randomized block design we partition the total variation in the data (i.e. the sum of squared differences between each observation and the grand mean) into three sources:

– Sum of squares for treatments, SST

– Sum of squares for errors, SSE

– Sum of squares for blocks, SSB

SS(Total) = SST + SSB + SSE

Calculation of Sources of Variance

| Source | Formula | Degrees of freedom |
|---|---|---|
| SS(total) | Sum across all observations of the squared difference between each observation and the grand mean | n-1 |
| SST | Sum across treatments of (b * squared difference between the treatment mean and the grand mean) | k-1 |
| SSB | Sum across blocks of (k * squared difference between the block mean and the grand mean) | b-1 |
| SSE | SS(total) - SST - SSB | n-k-b+1 |

b is the number of blocks, k is the number of treatments, n is the number of observations. With one observation per block and treatment, n = k * b, so the error degrees of freedom n-k-b+1 equal (k-1)(b-1).
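
The partition above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own procedure: the function name `sums_of_squares` is hypothetical, and the data are only the four patients and three months (Month 1, Month 2, Month 7) shown in the sample data of this lecture, so the numbers do not reproduce the full 200-patient analysis.

```python
# Rows are blocks (patients 1-4); columns are treatments (Month 1, Month 2, Month 7).
# Only this excerpt of the data is shown in the lecture; it is used here for illustration.
data = [
    [65, 40, 110],
    [90, 85, 100],
    [30, 30, 70],
    [72, 52, 94],
]


def sums_of_squares(table):
    """Partition total variation into treatment, block and error sums of squares."""
    b = len(table)        # number of blocks (rows)
    k = len(table[0])     # number of treatments (columns)
    n = b * k             # one observation per block and treatment
    grand_mean = sum(sum(row) for row in table) / n

    treatment_means = [sum(row[j] for row in table) / b for j in range(k)]
    block_means = [sum(row) / k for row in table]

    ss_total = sum((x - grand_mean) ** 2 for row in table for x in row)
    sst = b * sum((m - grand_mean) ** 2 for m in treatment_means)
    ssb = k * sum((m - grand_mean) ** 2 for m in block_means)
    sse = ss_total - sst - ssb   # error is what remains after treatments and blocks
    return {"SS(total)": ss_total, "SST": sst, "SSB": ssb, "SSE": sse}


result = sums_of_squares(data)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Because SSE is obtained by subtraction, any variation that the blocks explain is removed from the error term.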

Calculation of Mean Sources of Variance

| Mean square | Formula | Degrees of freedom |
|---|---|---|
| MST | SST/(k-1) | k-1 |
| MSB | SSB/(b-1) | b-1 |
| MSE | SSE/(n-k-b+1) | n-k-b+1 |

b is the number of blocks, k is the number of treatments, n is the number of observations.

Test Statistic

• The test statistic for treatments is MST/MSE, distributed as an F distribution with k-1 and n-k-b+1 degrees of freedom.

• The test statistic for the effect of blocks is MSB/MSE, distributed as an F distribution with b-1 and n-k-b+1 degrees of freedom.
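
As a quick check on the degrees of freedom used by these F tests, the sketch below computes them for a design with one observation per block and treatment. The function name `randomized_block_df` is hypothetical; the values k = 7 and b = 200 are taken from the nursing home example in this lecture.

```python
def randomized_block_df(k, b):
    """Degrees of freedom for a randomized block design without replication."""
    n = k * b  # one observation per block and treatment
    df = {
        "treatments": k - 1,
        "blocks": b - 1,
        "error": (k - 1) * (b - 1),  # the same number as n - k - b + 1
        "total": n - 1,
    }
    # The three components must add up to the total degrees of freedom.
    assert df["treatments"] + df["blocks"] + df["error"] == df["total"]
    return df


df = randomized_block_df(k=7, b=200)
print(df)
```

The error degrees of freedom returned here are the denominator degrees of freedom for both F tests.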

An Example in Health Care

• 200 patients at a nursing home were followed for seven months.

• Each month we recorded their daily living activity score (measured on an interval scale).

• A sample of the data is shown; the full data can be downloaded.

• Did patients’ daily living activity change over time?

| Patient | Month 1 | Month 2 | Month 7 |
|---|---|---|---|
| 1 | 65 | 40 | 110 |
| 2 | 90 | 85 | 100 |
| 3 | 30 | 30 | 70 |
| 4 | 72 | 52 | 94 |
| ... | ... | ... | ... |
| 196 | 75 | 58 | 69 |
| 197 | 90 | 67 | 84 |
| 198 | 67 | 25 | 31 |
| 199 | 60 | 48 | 83 |
| 200 | 80 | 95 | 120 |

Displaying the data

We need to see if the apparent changes in some months are real or due to random chance

[Figure: Daily Living Activity Score (vertical axis, 0 to 100) for Month 1 through Month 7]

Components of ANOVA

• Response variable is daily living activity score.

• Treatments are the months.

• The experimental plan is randomized block design.

• We use two factor ANOVA without replication. (“With replication” is used when there are multiple observations for each combination of block and treatment.)

What Are the Null Hypotheses?

Means for all patients are the same, and means for all 7 months are equal:

$\mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 = \mu_6 = \mu_7$

What Are the Alternative Hypotheses?

At least two months have different means. At least two patients have different means.

Assumptions

• The variable of interest is quantitative.

• The problem is to compare 2 or more means.

• The experimental plan is a blocked randomized design.

• Treatment observations are distributed according to a Normal distribution.

• The variances of the samples are equal.

Verifying Assumptions

• Response variable is quantitative.

• The Problem is comparison of seven means.

• Assumption of blocked sample design is appropriate as repeated measures are used.

– Same subjects are rated across the seven months.

Verifying Assumptions (Continued)

• Samples have Normal distribution. Month one data is shown. Other months were also Normal but not displayed.

[Figure: Histogram for Month 1. Horizontal axis: Bin (0, 19, 37, 56, 75, 94, 112, More). Vertical axis: Frequency (0 to 50).]

Verifying Assumptions (Continued)

Equality of variances will be examined after the ANOVA is done.

Excel Setup For ANOVA

• Prepare data so that columns correspond to treatment and rows to blocks.

• Select Tools, Data Analysis, then Anova: Two-Factor Without Replication.

• Include as input the column corresponding to blocks and all treatment columns.

Results of ANOVA

• First part shows averages and variances for each block (in this case patients).

• First 10 patients are shown in this slide.

• There are 7 observations per patient over the 7 months.

• Means differ, but are the differences significant?

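| SUMMARY | Count | Sum | Average | Variance |
|---|---|---|---|---|
| 1 | 7 | 430 | 61.42857 | 677.2857 |
| 2 | 7 | 638 | 91.14286 | 230.8095 |
| 3 | 7 | 265 | 37.85714 | 365.4762 |
| 4 | 7 | 527 | 75.28571 | 281.5714 |
| 5 | 7 | 501 | 71.57143 | 163.619 |
| 6 | 7 | 498 | 71.14286 | 523.4762 |
| 7 | 7 | 440 | 62.85714 | 321.8095 |
| 8 | 7 | 652 | 93.14286 | 326.4762 |
| 9 | 7 | 560 | 80 | 310.3333 |
| 10 | 7 | 508 | 72.57143 | 269.9524 |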

Results of ANOVA (Continued)

• Next, treatment data are described.

• The assumption of equal variances is met, as the variances are in the same range.

• Means differ, but are the differences significant?

| SUMMARY | Count | Sum | Average | Variance |
|---|---|---|---|---|
| Month 1 | 200 | 13825 | 69.125 | 462.9742 |
| Month 2 | 200 | 13919 | 69.595 | 502.1718 |
| Month 3 | 200 | 14075 | 70.375 | 506.2758 |
| Month 4 | 200 | 13559 | 67.795 | 540.7065 |
| Month 5 | 200 | 14123 | 70.615 | 483.7455 |
| Month 6 | 200 | 15104 | 75.52 | 484.6227 |
| Month 7 | 200 | 16347 | 81.735 | 481.6128 |
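
One simple way to see that the monthly variances are "in the same range" is to compare the largest with the smallest. The sketch below does only that, using the variances reported in the summary above. The lecture does not give a numeric cutoff, so the ratio is offered as a descriptive summary rather than a formal test.

```python
# Monthly variances copied from the treatment summary in this lecture.
monthly_variances = {
    "Month 1": 462.9742,
    "Month 2": 502.1718,
    "Month 3": 506.2758,
    "Month 4": 540.7065,
    "Month 5": 483.7455,
    "Month 6": 484.6227,
    "Month 7": 481.6128,
}

largest = max(monthly_variances, key=monthly_variances.get)
smallest = min(monthly_variances, key=monthly_variances.get)
ratio = monthly_variances[largest] / monthly_variances[smallest]

print(f"Largest variance:  {largest} ({monthly_variances[largest]})")
print(f"Smallest variance: {smallest} ({monthly_variances[smallest]})")
print(f"Ratio of largest to smallest: {ratio:.3f}")
```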

Result of ANOVA (Continued)

• Next, the sum of squares table is shown.

• Rows correspond to patients, columns to months.

• Note total variation = SST+SSB+SSE.

• Note the mean sum of squares is calculated by dividing the sum of squares by its degrees of freedom.

| Source of Variation | SS | df | MS |
|---|---|---|---|
| Rows | 209834.6 | 199 | 1054.445 |
| Columns | 28673.73 | 6 | 4778.955 |
| Error | 479125.1 | 1194 | 401.2773 |
| Total | 717633.5 | 1399 | |

Result of ANOVA (Continued)

• The test statistic for rows (patients) is 2.6, which is larger than the critical value of 1.19. The probability of observing an F value this high if the null hypothesis were true is essentially 0 (1.04E-23).

• Reject the hypothesis that patients had the same means.

• Similarly, reject the hypothesis of the same means across the months.

ANOVA

| Source of Variation | F | P-value | F crit |
|---|---|---|---|
| Rows | 2.627722 | 1.04E-23 | 1.187531 |
| Columns | 11.90936 | 5.14E-13 | 2.106162 |
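
The F values reported by Excel can be recomputed from the sums of squares and degrees of freedom in the ANOVA output of this lecture. The sketch below is only a hand check of that arithmetic. The dictionary keys are illustrative labels, and the critical values are copied from the Excel output rather than computed.

```python
# Sums of squares and degrees of freedom copied from the ANOVA output.
anova_table = {
    "rows":    {"SS": 209834.6, "df": 199},   # blocks (patients)
    "columns": {"SS": 28673.73, "df": 6},     # treatments (months)
    "error":   {"SS": 479125.1, "df": 1194},
}

# Mean square = sum of squares divided by its degrees of freedom.
mean_squares = {name: row["SS"] / row["df"] for name, row in anova_table.items()}

# Each F statistic compares a mean square with the mean square error.
f_rows = mean_squares["rows"] / mean_squares["error"]
f_columns = mean_squares["columns"] / mean_squares["error"]

# Critical values as reported by Excel in the lecture.
f_crit = {"rows": 1.187531, "columns": 2.106162}
reject_rows = f_rows > f_crit["rows"]
reject_columns = f_columns > f_crit["columns"]

print(f"F for rows:    {f_rows:.4f}, reject null: {reject_rows}")
print(f"F for columns: {f_columns:.4f}, reject null: {reject_columns}")
```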

Understanding Blocking

• Blocking is the extension of matched pair design to more than 2 populations.

• Blocking reduces variation and improves our ability to detect differences in treatment. You can see this in the formula for total sum of squares = SST+SSB+SSE.

• In the absence of blocking, SSB will be added to SSE, which makes the error term larger and the F test for treatments less sensitive.
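
To make the effect of blocking concrete, the sketch below compares the F statistic for months with and without blocking, using the sums of squares from the ANOVA output in this lecture. The "without blocking" case is an illustrative assumption: it treats the same data as a single factor ANOVA, so the block sum of squares and its degrees of freedom are folded into the error term.

```python
# Values copied from the ANOVA output in this lecture.
ss_blocks, df_blocks = 209834.6, 199      # rows (patients)
ss_treat, df_treat = 28673.73, 6          # columns (months)
ss_error, df_error = 479125.1, 1194

ms_treat = ss_treat / df_treat

# With blocking: patient-to-patient variation is removed from the error term.
f_blocked = ms_treat / (ss_error / df_error)

# Without blocking: SSB and its degrees of freedom are absorbed into the error.
ss_error_unblocked = ss_error + ss_blocks
df_error_unblocked = df_error + df_blocks
f_unblocked = ms_treat / (ss_error_unblocked / df_error_unblocked)

print(f"F for months with blocking:    {f_blocked:.2f}")
print(f"F for months without blocking: {f_unblocked:.2f}")
```

The treatment mean square is the same in both cases; only the denominator changes, which is why the blocked design gives the larger F statistic here.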

ANOVA with replication

• It is possible to have multiple blocks.

• For each possible block and treatment combination there may be multiple observations (replicated measures).

• How would we use ANOVA for these circumstances?

Factorial Experimental Design

• In designing data collection it is important to create as much efficiency as possible.

• An optimal design is a factorial experimental design (typically analyzed using ANOVA with replication or multiple regression).

How to Create Factorial Designs?

• For each factor (or block), take two levels: the maximum and the minimum.

• Examine all possible combinations of the factors. For a 3 factor model, this leads to 2 to the power of 3, or 8, possible combinations. For a 4 factor model, this leads to 2 to the power of 4, or 16, possible combinations.

• Measure the response variable for all possible combinations with replication.

A Factorial Design for 3 Factors

• Note there are eight unique cases. No two cases have the same combination of levels for the three factors.

• The combinations were created by switching the level every 4 cases for factor one, every 2 cases for factor two and every case for factor three.

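| Factor 1 | Factor 2 | Factor 3 | Response |
|---|---|---|---|
| Minimum | Minimum | Minimum | |
| Minimum | Minimum | Maximum | |
| Minimum | Maximum | Minimum | |
| Minimum | Maximum | Maximum | |
| Maximum | Minimum | Minimum | |
| Maximum | Minimum | Maximum | |
| Maximum | Maximum | Minimum | |
| Maximum | Maximum | Maximum | |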
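
The same list of cases can be generated programmatically. The sketch below uses Python's `itertools.product`, which varies the last factor fastest and so reproduces the order of the table above. The function name `full_factorial` and the dictionary layout are illustrative choices, not part of the lecture.

```python
from itertools import product


def full_factorial(levels_by_factor):
    """Return every combination of factor levels as a list of dictionaries."""
    names = list(levels_by_factor)
    combos = product(*(levels_by_factor[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


# Two levels per factor, as in the table above.
design = full_factorial({
    "Factor 1": ["Minimum", "Maximum"],
    "Factor 2": ["Minimum", "Maximum"],
    "Factor 3": ["Minimum", "Maximum"],
})

for case in design:
    print(case)
```

With two levels per factor the number of cases doubles with each added factor, so a fourth factor would give 16 cases.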

An Example In Health Care

• Three factors are assumed to affect consumer satisfaction: waiting time, travel time and bed side manner. Design an experiment to understand the relative influence of the three factors.

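| Waiting time | Travel time | Bed side manner | Satisfaction ratings of n patients |
|---|---|---|---|
| Short | Short | Poor | |
| Short | Short | Good | |
| Short | Long | Poor | |
| Short | Long | Good | |
| Long | Short | Poor | |
| Long | Short | Good | |
| Long | Long | Poor | |
| Long | Long | Good | |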

How to Analyze Data From Factorial Designs?

• Data can be analyzed using ANOVA with replications, if for each combination of factors there are repeated measures.

• Excel provides a method for analyzing 2 factors with all levels of the factors specified. This is a limited method of analysis. An easier, more general approach is to analyze the data using multiple regression, a concept we introduce later.

Take Home Lesson

• Experimental design affects the method of the analysis.

• An effective approach is randomized block design (an extension of the matched pair t-test). In these circumstances we use two factor ANOVA without replication.

• An optimal design is factorial experimental design. In these circumstances an ANOVA with replication is appropriate.