Today’s Lecture

• One more test for normality
– Shapiro-Wilk Test

• Testing variances
– Equality of Variance via the F-Distribution
– Levene’s Test for Equality of Variances



Reference Material

• Shapiro and Wilk, 1965. Biometrika 52(3/4), pp. 591-611.

• Burt and Barber, page 325

• Levene, 1960. In Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling, I. Olkin et al. eds., Stanford University Press, pp. 278-292.


More Pretests

• The tests presented in today’s lecture are pretests that help verify the assumptions of a parametric hypothesis test

• The first (Shapiro-Wilk) is one of the most powerful tests for normality

• The second (the variance ratio) is one of the simplest tests for determining if a pooled or non-pooled variance t-test is required

• The last (Levene’s test) allows for a comparison of variances in a multiple category layout like the analysis of variance


Shapiro-Wilk

• The Shapiro-Wilk is one of the strangest tests that I have encountered thus far in my statistical explorations

• But it is either the best or the second best test of normality in existence

• It excels at testing small samples for normality and is the definitive test for n<30


Curiouser and Curiouser

• A brief rundown of the strangeness associated with the Shapiro-Wilk

– You fail to reject the null when your observed value is greater than your critical value (that’s right, the critical region on this test is in the small tail)

– The test actually pairs observations from within the sample to determine normality

– The number of pairs is determined by nearly the same equation that you would use to locate the median


So How Does It Work?

• The W-Statistic:

$$W = \frac{b^2}{(n-1)s^2}$$

• Recall that the variance of a sample is s²:

$$s^2 = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n-1}$$

• So really all we are required to give is the sum of the squared deviations from the mean (plus this term b²):

$$(n-1)s^2 = \sum_{i=1}^{n}(x_i - \bar{x})^2$$

• b² is a bit more complex, but it is more odd than difficult
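The denominator of W is just the sum of squared deviations from the mean, which equals (n-1) times the sample variance. A minimal Python sketch of that identity follows; the sample values and the function name `sum_sq_dev` are illustrative assumptions, not the lecture's Excel data.

```python
import statistics

def sum_sq_dev(data):
    """Sum of squared deviations from the sample mean."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** 2 for x in data)

sample = [1, 2, 3, 4, 5, 100]  # illustrative values only
n = len(sample)

direct = sum_sq_dev(sample)
# statistics.variance divides by n - 1, so multiplying back recovers the sum of squares
via_variance = (n - 1) * statistics.variance(sample)

print(direct, via_variance)
```

Both numbers should agree up to floating-point rounding, which is why the denominator of W can be built either way.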


Getting to B-Squared

• The b term is actually a weighted comparison of all the pairs within the sample

• The way that it works is that you sort all of your data from least to greatest

• Then you create k pairs from the sample, with k=n/2 if n is even and k=(n+1)/2 if n is odd (note that k marks the position of the median in the sorted sample; for odd n the middle value is paired with itself, so its difference is zero)

• Each value has a companion that is from the other end of the sample

• Example: Given the following set of numbers (1, 2, 3, 4, 5, 100), your pairs would be as follows:

• 100 and 1, 5 and 2, and 4 and 3
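The pairing step can be sketched in a few lines of Python. This is only an illustration of the sort-and-pair idea described on this slide; the function name `make_pairs` is an assumption, not part of any library.

```python
def make_pairs(data):
    """Sort the data, then pair each large value with its companion from the other end."""
    x = sorted(data)
    n = len(x)
    # k = n/2 for even n, (n+1)/2 for odd n (the middle value then pairs with itself)
    k = n // 2 if n % 2 == 0 else (n + 1) // 2
    return [(x[n - 1 - i], x[i]) for i in range(k)]

pairs = make_pairs([1, 2, 3, 4, 5, 100])
differences = [big - small for big, small in pairs]

print(pairs)        # [(100, 1), (5, 2), (4, 3)]
print(differences)  # [99, 3, 1]
```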


Once You Have Your Pairs

• The pairs are important because you will be taking the difference between the large value and the small value (100-1=99, 5-2=3, 4-3=1)

• Once you have all your differences, you then assign them weights (from a W-weight table)

• Once you have the weights, you multiply each one by the difference for its pair and then sum the products

• This sum is b, which you then square


Strange, don’t you think?

• Let’s go to Excel

• But first let’s show you the equation for b

$$b = \sum_{i=1}^{k} a_i \,(x_{n-i+1} - x_i)$$

– $(x_{n-i+1} - x_i)$: the big and little pairs

– $a_i$: the weight (from math that you don’t want to have to learn); basically the weights are the result of an expected normal distribution and its resulting covariance matrix

– $k$: the median (the number of pairs)
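Putting the pieces together, here is a hedged Python sketch of the whole W computation: sort, difference the big and little pairs, weight, sum to get b, square, and divide by the sum of squared deviations. The function name `shapiro_wilk_w` is an assumption, and the three n=6 weights are quoted from the published Shapiro-Wilk coefficient table as best I can reproduce them here, so confirm them against your own copy of the table before relying on them.

```python
def shapiro_wilk_w(data, weights):
    """W = b^2 / sum of squared deviations, with b = sum_i a_i * (x_(n-i+1) - x_(i)).

    `weights` must be the tabled coefficients a_1..a_k for this sample size,
    largest weight first (it applies to the most extreme pair).
    """
    x = sorted(data)
    n = len(x)
    # b: weighted sum of the big-minus-little pair differences
    b = sum(a * (x[n - 1 - i] - x[i]) for i, a in enumerate(weights))
    mean = sum(x) / n
    ss = sum((v - mean) ** 2 for v in x)  # same as (n - 1) * s^2
    return b ** 2 / ss

# Assumed table values for n = 6; verify against a printed coefficient table.
weights_n6 = [0.6431, 0.2806, 0.0875]

w = shapiro_wilk_w([1, 2, 3, 4, 5, 100], weights_n6)
print(round(w, 4))
```

With the 100 sitting far from the rest of the sample, W comes out at roughly 0.53, which is well into the small tail.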


Results

• W=0.952165

• This isn’t very small, so we are going to fail to reject

• H0: Normal HA: Not Normal (note the wording here: we are not saying that this test shows that the data are normal, only that it fails to show that the data are not normal)

• W(critical) for 0.05 and n=20 is 0.905

• Note that the sampling distribution of W is severely skewed, so our result of 0.952 has a p-value of around 0.40

• This sample is suitable for parametric analysis
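Because the critical region sits in the small tail, the decision rule runs the opposite way from most tests. A tiny Python sketch, using the W and critical value quoted on this slide (the function name is an assumption):

```python
def shapiro_wilk_decision(w_observed, w_critical):
    """Reject normality only when W falls in the small tail (W < critical value)."""
    return "reject H0" if w_observed < w_critical else "fail to reject H0"

decision = shapiro_wilk_decision(0.952165, 0.905)
print(decision)  # fail to reject H0
```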


Shapiro-Wilk Tables

[Tables not reproduced: pair coefficients (weights) and critical levels for significance]


Equality of Variance via Ratio

• Assumptions:

– $s_1^2$ and $s_2^2$ are independent estimates of $\sigma^2$

– The population from which the samples are drawn is normal (this means you had better check for normality first)

• H0: $\sigma_1^2 = \sigma_2^2$   Ha: $\sigma_1^2 \neq \sigma_2^2$

• Statistic: $F = s_1^2 / s_2^2$ (I typically place the larger variance in the numerator of the equation, but it doesn’t matter for two-tailed tests)

• Once you compute the statistic, you find the F-distribution table in the appendix of your book (page 613) and then use n1-1 and n2-1 for your degrees of freedom


Example

• A couple of weeks ago we used two samples in a t-test. The first sample had n=12 and a variance of 17.3; the second sample had n=10 and a variance of 18.9

• F = 18.9/17.3 = 1.092

• A look at our tables with 11 and 9 degrees of freedom at alpha=0.05 (two-tailed) gives a critical value of 3.96 (we have to use the 10 column, because there is no 11 column)

• Strictly, because the larger variance (18.9, from the n=10 sample) is in the numerator, the numerator has 9 degrees of freedom and the denominator has 11; the conclusion here does not change

• Since 1.09<3.96, we fail to reject the null
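The arithmetic in this example can be sketched in Python as follows. The function name `variance_ratio` is an assumption; the critical value is simply the table value quoted in the example, since the standard library has no F-distribution lookup. The function reports degrees of freedom in numerator, denominator order.

```python
def variance_ratio(var_a, n_a, var_b, n_b):
    """Return (F, numerator df, denominator df) with the larger variance on top."""
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1

f_stat, df_num, df_den = variance_ratio(17.3, 12, 18.9, 10)

f_critical = 3.96  # table value quoted in the example; look up your own df in practice
reject = f_stat > f_critical

print(round(f_stat, 3), df_num, df_den, reject)
```

Here the ratio is about 1.09, far below the critical value, so the null of equal variances is not rejected.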


Levene’s L-Statistic

• Test for the equality of variance in multiple categories

• H0: $\sigma_1^2 = \sigma_2^2 = \dots = \sigma_k^2$

• Ha: $\sigma_i^2 \neq \sigma_j^2$ for at least one pair (i,j).

• The statistic is run on the absolute deviations of each observation from its category mean but is very similar to the ANOVA in terms of computation

• The test uses the F-distribution to determine significance


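The Equation

$$L = \frac{\sum_{j=1}^{k} n_j \,(\bar{Z}_{.j} - \bar{Z}_{..})^2 \,/\, (k-1)}{\sum_{j=1}^{k} \sum_{i=1}^{n_j} (Z_{ij} - \bar{Z}_{.j})^2 \,/\, (N-k)}$$

$$Z_{ij} = |x_{ij} - \bar{x}_{.j}|$$

– $Z_{ij}$: all data in each category is “differenced” by its category mean (Levene’s test takes the absolute value of each difference)

– $\bar{Z}_{.j}$: this is a categorical mean of differences

– $\bar{Z}_{..}$: this is the global mean of differences

– Numerator: this is a sum of squares between, but on the xij differences, divided by dfb = k-1

– Denominator: this is a sum of squares within, but on the xij differences, divided by dfw = N-k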
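Before heading to Excel, it may help to see the same computation as a short Python sketch: difference each observation from its category mean (in absolute value), then run the usual between-over-within ratio on those differences. The function name `levene_l` and the small data set are hypothetical, chosen only so the arithmetic is easy to check by hand.

```python
def levene_l(groups):
    """Levene's L: an ANOVA-style F ratio computed on Z_ij = |x_ij - mean of category j|."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)

    # Absolute deviation of every observation from its own category mean
    z = []
    for g in groups:
        mean_g = sum(g) / len(g)
        z.append([abs(x - mean_g) for x in g])

    z_means = [sum(zj) / len(zj) for zj in z]      # categorical means of differences
    z_grand = sum(sum(zj) for zj in z) / n_total   # global mean of differences

    # Sum of squares between and within, both on the differences
    ss_between = sum(len(zj) * (m - z_grand) ** 2 for zj, m in zip(z, z_means))
    ss_within = sum((v - m) ** 2 for zj, m in zip(z, z_means) for v in zj)

    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

# Hypothetical two-category data, not the lecture's Excel data
l_value = levene_l([[1, 2, 3], [2, 4, 6]])
print(round(l_value, 3))
```

For this toy data the between term is 2/3 and the within term is 5/6, so L works out to 0.8. Categories that are just shifted copies of each other give L = 0, since their differences are identical.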


Off to Excel


Results

• After all of our computations, we find an L value of 2.41

• Since our degrees of freedom are k-1=2 and N-k=12 an alpha of 0.05 would require a critical value for L of 3.88

• Since 2.41<3.88 we fail to reject the null of equal variances between categories

• This data set is suitable for parametric analysis via an ANOVA


Homework

• Given two data sets, test for normality using the Shapiro-Wilk test and then test for equality of variance via the ratio (F) test.

• Once you have completed both tests, recommend the correct test for comparing the samples.

• Your choices are the T-Test (pooled variance), T-Test (non-pooled variance) and the Wilcoxon Rank-Sum Test
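One common way to read the two pretests together is sketched below in Python. This is an assumed decision flow consistent with how the pretests were introduced in this lecture, not an answer key; the function name and argument names are hypothetical.

```python
def recommend_test(both_normal, equal_variances):
    """Pick a two-sample comparison based on the outcome of the two pretests.

    both_normal: True if Shapiro-Wilk failed to reject normality for both samples.
    equal_variances: True if the variance-ratio test failed to reject equal variances.
    """
    if not both_normal:
        # Parametric assumptions are in doubt, so fall back to the rank-based test
        return "Wilcoxon Rank-Sum Test"
    if equal_variances:
        return "T-Test (pooled variance)"
    return "T-Test (non-pooled variance)"

print(recommend_test(both_normal=True, equal_variances=True))
```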