ANOVA - Pennsylvania State University

ANOVA

An Old Research Question

The impact of TV on high-school grade

Watch or not watch

Two groups

The impact of TV hours on high-school grade

Exactly how much TV watching would make a difference?

Multiple groups

Not watch, watch a little, watch regularly

Then we could have

something like this

What Should We Do?

Should t-Test Be Used?

Multiple comparison

Increasing the chance of

Type I error
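To make the inflation concrete, here is a minimal sketch of how the chance of at least one Type I error grows with the number of tests. It assumes the comparisons are independent and each uses the same alpha; the function name `familywise_error_rate` is illustrative and not part of the lecture.

```python
def familywise_error_rate(alpha: float, comparisons: int) -> float:
    """Chance of at least one Type I error across independent tests."""
    # Probability that every single test avoids a Type I error
    no_error_anywhere = (1.0 - alpha) ** comparisons
    return 1.0 - no_error_anywhere


# Print the rate for a few numbers of comparisons at alpha = .05
for c in (1, 3, 10, 15, 50):
    print(f"{c:>2} comparisons: {familywise_error_rate(0.05, c):.3f}")
```

Real pairwise comparisons on the same data are not fully independent, so this is an approximation of the trend rather than an exact rate.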

[Figure: chart; vertical axis from 0 to 1, horizontal axis from 0 to 50]

Multiple Comparison Is

Common

In particular, in factorial designs

Single factor

Multiple levels: previous example

Multiple factors

Impact of TV watching and library visit

Terminology

Factor

The independent variable that designates the

groups being compared

TV watching and library visit

Levels

Individual conditions or values that make up

a factor

Factorial design

A study that combines two or more factors

The research study uses two factors

One factor uses two levels of therapy technique (I

versus II)

The second factor uses three levels of time

(before, after, and 6 months after).

Figure 12.2

Two-Factor Research Design

Also notice that the therapy factor uses two

separate groups (independent measures)

and the time factor uses the same group for

all three levels (repeated measures).

We have 15 comparisons!
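The count of 15 presumably comes from treating each of the 2 × 3 = 6 cells of the design as a separate condition and comparing every pair. Under the simplifying assumption that the 15 tests are independent and each uses α = .05, the chance of at least one Type I error is already above one half:

$$\binom{6}{2} = \frac{6 \cdot 5}{2} = 15, \qquad 1 - (1 - .05)^{15} \approx .54$$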

How to deal with this

problem?

Analysis of Variance

Analysis of variance

Also called ANOVA

Used to evaluate mean differences between two or more treatments (an advantage over the t-test)

Uses sample data as basis for drawing

general conclusions about populations

Analysis of Variance

Null hypothesis: the level or value on the

factor does not affect the dependent variable

In the population, this is equivalent to saying that

the means of the groups do not differ from each

other

Alternative hypothesis: There is at least one

mean difference among the populations

All means are different from every other mean

Some means are not different from some others,

but other means do differ from some means

$$H_0: \mu_1 = \mu_2 = \mu_3$$

ANOVA: Statistics

F test

F-ratio: based on variance instead of sample mean

difference

Numerator: variance caused by differences among sample means

Denominator: variance expected if there is no treatment effect

$$t = \frac{\text{obtained difference between sample means}}{\text{difference expected by chance}}$$

$$F = \frac{\text{variance (differences) between sample means}}{\text{variance (differences) expected with no treatment effect (error by chance)}}$$

Logic of ANOVA

A study with three

treatments

Sources of Variability

Between Treatments

Systematic differences caused

by treatments

Random, unsystematic differences

Individual differences

Experimental (measurement) error

What if the null

hypothesis is true?

F-Ratio

The ratio of the variance between treatments

to the variance within treatments

(treatment effects + chance) / (chance)

If there is no treatment effect, F should be close to 1

Otherwise, F should be noticeably larger than 1.

Experimental Design

Simple experiments

Single factor

Between-subjects design

Within-subjects design

Factorial experiments

More factors

2 x 2

These designs all involve multiple treatments, so ANOVA would be needed.

Numerator of F-ratio

Denominator of F-ratio

Logic of Repeated-Measures

ANOVA

Comparing variance

Between-treatments vs. within-treatments

Removing the difference between subjects

$$F = \frac{\text{variance between treatments (without individual differences)}}{\text{variance expected by chance (without individual differences)}}$$

$$F = \frac{\text{variance between treatments}}{\text{variance expected by chance}}$$

ANOVA

Notation and Formulas

k: the number of treatments

n: the number of scores in each treatment

N: the number of total scores in the study

ΣX or T: the sum of the scores for each treatment

G: the sum of all the scores in the study

G = Σ(ΣX) = ΣT

ΣX², SS, s², df
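As a small illustration of the notation, the sketch below computes k, n, N, T, G, and ΣX² for a set of scores. The scores are hypothetical (three treatments, five scores each) and are not taken from the lecture.

```python
# Hypothetical scores: three treatments with five scores each
treatments = [
    [0, 1, 3, 1, 0],
    [4, 3, 6, 3, 4],
    [1, 2, 2, 0, 0],
]

k = len(treatments)                      # number of treatments
n = len(treatments[0])                   # scores per treatment (equal n assumed)
N = sum(len(t) for t in treatments)      # total number of scores
T = [sum(t) for t in treatments]         # treatment totals (sum of X per treatment)
G = sum(T)                               # grand total of all scores
sum_x2 = sum(x * x for t in treatments for x in t)  # sum of squared scores

print(k, n, N, T, G, sum_x2)
```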

Figure 12.4 ANOVA

Calculation Structure and

Sequence

Figure 12.5 Partitioning SS

for Independent-measures

ANOVA

ANOVA equations

$$SS_{total} = \Sigma X^2 - \frac{G^2}{N}$$

$$SS_{within\ treatments} = \Sigma SS_{inside\ each\ treatment}$$

$$SS_{between\ treatments} = \Sigma \frac{T^2}{n} - \frac{G^2}{N}$$
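The partition of SS can be sketched in a few lines. This assumes the computational formulas SS_total = ΣX² − G²/N, SS_within = the sum of the SS inside each treatment, and SS_between = Σ(T²/n) − G²/N. The helper names and the scores are hypothetical.

```python
def sum_sq_dev(scores):
    """SS for one set of scores: sum(X^2) - (sum X)^2 / n."""
    return sum(x * x for x in scores) - sum(scores) ** 2 / len(scores)


def partition_ss(treatments):
    """Return (SS_total, SS_between, SS_within) for a list of treatment score lists."""
    all_scores = [x for t in treatments for x in t]
    N = len(all_scores)
    G = sum(all_scores)
    ss_total = sum(x * x for x in all_scores) - G ** 2 / N
    ss_within = sum(sum_sq_dev(t) for t in treatments)
    ss_between = sum(sum(t) ** 2 / len(t) for t in treatments) - G ** 2 / N
    return ss_total, ss_between, ss_within


# Hypothetical scores, not from the lecture
scores = [[0, 1, 3, 1, 0], [4, 3, 6, 3, 4], [1, 2, 2, 0, 0]]
ss_total, ss_between, ss_within = partition_ss(scores)
print(ss_total, ss_between, ss_within)
```

A useful check on any hand calculation is that SS_between and SS_within add up to SS_total.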

Degrees of Freedom Analysis

Total degrees of freedom

$df_{total} = N - 1$

Within-treatments degrees of freedom

$df_{within} = N - k$

Between-treatments degrees of freedom

$df_{between} = k - 1$

Figure 12.6 Partitioning

Degrees of Freedom

Mean Squares and F-ratio

$$MS_{within} = s^2_{within} = \frac{SS_{within}}{df_{within}}$$

$$MS_{between} = s^2_{between} = \frac{SS_{between}}{df_{between}}$$

$$F = \frac{s^2_{between}}{s^2_{within}} = \frac{MS_{between}}{MS_{within}}$$
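The last stage of the calculation, from SS and df to the mean squares and F, might look like the sketch below. The function name `f_ratio` and the input values (SS_between = 30, SS_within = 16, k = 3, N = 15) are hypothetical and chosen only for illustration.

```python
def f_ratio(ss_between: float, ss_within: float, k: int, N: int):
    """Return (MS_between, MS_within, F) from the SS values and the design size."""
    df_between = k - 1          # between-treatments degrees of freedom
    df_within = N - k           # within-treatments degrees of freedom
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Hypothetical values: 3 treatments, 15 scores in total
ms_between, ms_within, f_value = f_ratio(ss_between=30.0, ss_within=16.0, k=3, N=15)
print(ms_between, ms_within, f_value)
```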

ANOVA Summary Table

Source                SS    df    MS    F
Between Treatments    40     2    20    10
Within Treatments     20    10     2
Total                 60    12

• Concise method for presenting ANOVA results

• Helps organize and direct the analysis process

• Convenient for checking computations

• “Standard” statistical analysis program output
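As a check on the summary table, each entry can be reproduced from the SS and df columns, and the between and within rows add up to the total row:

$$MS_{between} = \frac{40}{2} = 20, \qquad MS_{within} = \frac{20}{10} = 2, \qquad F = \frac{20}{2} = 10$$

$$SS_{total} = 40 + 20 = 60, \qquad df_{total} = 2 + 10 = 12$$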

Distribution of F-ratios

If the null hypothesis is true, the value of F

will be around 1.00

Because F-ratios are computed from two

variances, they are always positive numbers

Table of F values is organized by two df

df numerator (between) shown in table columns

df denominator (within) shown in table rows

Figure 12.7

Distribution of F-ratios

ANOVA Test

Uses the same four steps that have been

used in earlier hypothesis tests.

Computation of the test statistic F is done

in stages

Compute SS_total, SS_between, SS_within

Compute df_total, df_between, df_within

Compute MS_between, MS_within

Compute F

Measuring Effect size for

ANOVA

Compute percentage of variance accounted

for by the treatment conditions

In published reports of ANOVA, effect size is usually called η² (“eta squared”)

r² concept (proportion of variance explained)

$$\eta^2 = \frac{SS_{between\ treatments}}{SS_{total}}$$

In the Literature

Treatment means and standard deviations

are presented in text, table or graph

Results of ANOVA are summarized, including

F and df

p-value

η²

• E.g., F(3, 20) = 6.45, p < .01, η² = 0.492
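A minimal sketch of the effect-size calculation follows. The first function is the definition, SS_between / SS_total. The second recovers η² from a reported F and its df, using the identity η² = F·df_between / (F·df_between + df_within), which follows from SS = MS × df. Both function names are illustrative.

```python
def eta_squared(ss_between: float, ss_total: float) -> float:
    """Proportion of total variability accounted for by the treatments."""
    return ss_between / ss_total


def eta_squared_from_f(f_value: float, df_between: int, df_within: int) -> float:
    """Recover eta squared from a reported F-ratio and its degrees of freedom."""
    explained = f_value * df_between      # proportional to SS_between
    return explained / (explained + df_within)


# Values from the summary table: SS_between = 40, SS_total = 60
print(round(eta_squared(40, 60), 3))

# Values from the reporting example: F(3, 20) = 6.45
print(round(eta_squared_from_f(6.45, 3, 20), 3))
```

The second call gives the η² = 0.492 quoted in the reporting example, so the reported F, df, and effect size are consistent with one another.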

Example

For each

experiment

N = 14

Experiment A

Source SS df MS F

Between Treatments

Within Treatments

Total

Experiment B

Source SS df MS F

Between Treatments

Within Treatments

Total

post hoc Tests

ANOVA compares all individual mean

differences simultaneously, in one test

A significant F-ratio indicates that at least one

difference in means is statistically significant

Does not indicate which means differ significantly

from each other!

post hoc tests are follow up tests done to

determine exactly which mean differences

are significant, and which are not

Tukey’s Honestly Significant

Difference

A single value that determines the minimum difference between treatment means that is necessary to claim statistical significance: a difference large enough that $p < \alpha_{experimentwise}$

Honestly Significant Difference (HSD)

$$HSD = q\sqrt{\frac{MS_{within}}{n}}$$

A vs. B: |MA – MB| = 2.44 > HSD, significant

B vs. C: |MB – MC| = 1.56 < HSD, not significant

A vs. C: |MA – MC| = 4.00 > HSD, significant
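The HSD procedure can be sketched as below. The treatment means (3, 5.44, 7) and HSD = 2.36 are the values reported at the end of these slides. The inputs to `tukey_hsd` (q = 3.53, MS_within = 4.0, n = 9) are assumptions that do not appear in the slides: q is normally looked up in a Studentized range table, and n = 9 is inferred from df = 2, 24 with equal group sizes. The function names are illustrative.

```python
from itertools import combinations
from math import sqrt


def tukey_hsd(q: float, ms_within: float, n: int) -> float:
    """Minimum mean difference needed for significance: q * sqrt(MS_within / n)."""
    return q * sqrt(ms_within / n)


def compare_pairs(means: dict, hsd: float) -> dict:
    """Map each pair of treatments to True if their mean difference exceeds HSD."""
    results = {}
    for a, b in combinations(means, 2):
        results[(a, b)] = abs(means[a] - means[b]) > hsd
    return results


# Means and HSD as reported in the slides
means = {"A": 3.0, "B": 5.44, "C": 7.0}
decisions = compare_pairs(means, hsd=2.36)
print(decisions)

# Assumed inputs (not given in the slides); the result is about 2.35
print(round(tukey_hsd(q=3.53, ms_within=4.0, n=9), 2))
```

With these assumed inputs the formula gives a value close to, but not exactly, the reported HSD of 2.36. The pairwise decisions are the same either way.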

The Scheffé Test

The Scheffé test is one of the safest of all

possible post hoc tests

Uses an F-ratio to evaluate significance of the

difference between two treatment conditions

$$F_{A \text{ versus } B} = \frac{MS_{between}\ (\text{calculated with } SS \text{ of two groups})}{MS_{within}}$$

Between A & B

A & B F(2,24) = 3.36

B & C F(2,24) = 1.36

A & C F(2,24) = 9.00

df = 2, 24 and α = .05

the critical value for F: 3.40

Only the difference between A&C is significant.
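The Scheffé comparisons can be sketched as follows. The treatment totals (27, 49, 63), n = 9 per group, and MS_within = 4.0 are assumptions: they are not stated in the slides but are consistent with the reported means (3, 5.44, 7) and with df = 2, 24. The critical value 3.40 is the one given above. SS_between is computed from the two groups being compared, while df_between stays at k − 1 as in the overall ANOVA.

```python
from itertools import combinations


def scheffe_f(t_a: float, t_b: float, n: int, k: int, ms_within: float) -> float:
    """F-ratio for one pair: SS_between from two groups, df_between = k - 1."""
    g = t_a + t_b                                   # total of the two groups
    ss_between = t_a ** 2 / n + t_b ** 2 / n - g ** 2 / (2 * n)
    ms_between = ss_between / (k - 1)               # overall df, not 1
    return ms_between / ms_within


# Assumed values, consistent with the reported means and df
totals = {"A": 27, "B": 49, "C": 63}
N_PER_GROUP, K, MS_WITHIN = 9, 3, 4.0
F_CRITICAL = 3.40                                   # df = 2, 24 and alpha = .05

scheffe = {
    (a, b): scheffe_f(totals[a], totals[b], N_PER_GROUP, K, MS_WITHIN)
    for a, b in combinations(totals, 2)
}
for pair, f_value in scheffe.items():
    print(pair, round(f_value, 2), f_value > F_CRITICAL)
```

Using k − 1 rather than 1 for df_between is what makes the test conservative: the pairwise MS_between is smaller than it would be in a two-group ANOVA.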

Relationship between ANOVA

and t tests

For two independent samples, either t or F

can be used

Both always result in the same decision

F = t²

For any value of α, (t_critical)² = F_critical
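The identity F = t² for two independent samples can be checked numerically. The sketch below uses a pooled-variance independent-measures t and a one-way ANOVA F. The two samples are hypothetical and the function names are illustrative.

```python
from statistics import mean


def sum_of_squares(xs):
    """Sum of squared deviations from the sample mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


def independent_t(a, b):
    """Independent-measures t statistic using the pooled variance."""
    df = len(a) + len(b) - 2
    pooled_var = (sum_of_squares(a) + sum_of_squares(b)) / df
    std_error = (pooled_var / len(a) + pooled_var / len(b)) ** 0.5
    return (mean(a) - mean(b)) / std_error


def one_way_f(groups):
    """F-ratio for a single-factor, independent-measures ANOVA."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum(sum_of_squares(g) for g in groups)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical samples, not from the lecture
sample_a = [0, 1, 3, 1, 0]
sample_b = [4, 3, 6, 3, 4]
t_value = independent_t(sample_a, sample_b)
f_value = one_way_f([sample_a, sample_b])
print(t_value ** 2, f_value)
```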

Figure 12.10

Distribution of t and F statistics

Independent Measures ANOVA

Assumptions

The observations within each sample must

be independent

The populations from which the samples are selected must be normal

The populations from which the samples are

selected must have equal variances

(homogeneity of variance)

Violating the assumption of homogeneity of

variance risks invalid test results

To Report ANOVA Result

The subjects averaged MA = 3, MB = 5.44, and MC = 7 in the three treatments, respectively. ANOVA indicated a significant difference, F(2, 24) = 9.15, p < .05, η² = ….

Post hoc analysis (Tukey’s HSD) indicated significant differences between Treatments A and B, as well as between Treatments A and C (HSD = 2.36).

or

Post hoc analysis (Scheffé) indicated a significant difference between Treatments A and C only, F(2, 24) = 9.00 for A vs. C, p < .05.

Homework

12.22