Lecture 11 - The T-test C2 Foundation Mathematics (Standard Track) Dr Linda Stringer Dr Simon Craik [email protected] [email protected] INTO City/UEA London


Hypothesis testing

We use hypothesis testing to compare the mean of a very large data set (a population mean) with the mean of a sample data set (a sample mean).

Z-test question: A lightbulb company says their lightbulbs last a mean time of 1000 hours with a standard deviation of 50 hours. We think their lightbulbs last longer than this and propose a test at a 5% level of significance. We buy 75 lightbulbs and they last a mean time of 1022 hours.

The population mean is 1000 hours (A = 1000).
The sample is the 75 lightbulbs that we test (n = 75).
The sample mean is 1022 hours (x̄ = 1022).


Z-test example

A lightbulb company says their lightbulbs last a mean time of 1000 hours with a standard deviation of 50 hours. We think their lightbulbs last longer than this and propose a test at a 5% level of significance. We buy 75 lightbulbs and they last a mean time of 1022 hours.

• Hypotheses: H0 : µ = 1000, H1 : µ > 1000
• Critical value: +1.65
• Test statistic:

$$\frac{\bar{x} - A}{\sigma/\sqrt{n}} = \frac{1022 - 1000}{50/\sqrt{75}} = 3.81 \text{ to 2 d.p.}$$

• Decision: 3.81 > 1.65 so reject H0

• Conclusion: The sample provides sufficient evidence at the 5% significance level to reject the null hypothesis; the lightbulbs last longer than 1000 hours.
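
As a check on the arithmetic, the Z-test statistic for the lightbulb example can be recomputed in a few lines of Python. This is only a sketch: the function name `z_statistic` is our own label, and the critical value 1.65 is typed in from the slide rather than computed.

```python
import math

def z_statistic(sample_mean, pop_mean, pop_sd, n):
    """Z-test statistic: (sample mean - A) / (sigma / sqrt(n))."""
    standard_error = pop_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error

# Lightbulb example: A = 1000, sigma = 50, n = 75, sample mean = 1022
z = z_statistic(1022, 1000, 50, 75)
critical_value = 1.65          # one-tailed, 5% significance, from the slide
reject_h0 = z > critical_value

print(round(z, 2), reject_h0)  # 3.81 True
```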


Z-test summary

• You will be given
  1. Population mean, A
  2. Population standard deviation, σ
  3. Significance level (1% or 5%)
  4. Sample mean, x̄
  5. Sample size, n
  6. Quantifying word.

• You have to work out
  1. Null hypothesis, alternative hypothesis
  2. Critical value(s)
  3. Test statistic
  4. Decision - accept/reject H0 (sketch a picture if it helps)
  5. Conclusion


The difference between a Z-test and a T-test

In a Z-test the sample is large (n ≥ 25). You are given the sample mean and the population or sample standard deviation.

In a T-test the sample is small (n < 25). You usually have to work out the sample mean and the sample standard deviation. Also in a T-test you have to work out the degrees of freedom to use in the critical value table.

d.o.f. = n − 1


T-test summary

• You will be given
  1. Population mean, A
  2. Significance level
  3. Sample data set
  4. Quantifying word.

• You have to work out
  1. Null hypothesis (H0 : µ = A) and alternative hypothesis
  2. Degrees of freedom, d.o.f. = n − 1
  3. Critical value(s), look this up in the table
  4. Sample mean, $\bar{x} = \frac{\sum x}{n}$
  5. Sample standard deviation, $s = \sqrt{\frac{\sum x^2 - n\bar{x}^2}{n - 1}}$ (MAKE SURE YOU CALCULATE s, not σ)
  6. Test statistic, $\frac{\bar{x} - A}{s/\sqrt{n}}$
  7. Decision, accept/reject H0 (sketch a picture if it helps)
  8. Conclusion, write this in words


The null hypothesis and the alternative hypothesis for the Z-test and T-test

• The null hypothesis is initially assumed to be true. It is

H0 : µ = A

where µ is the population mean and A is the hypothetical value of the population mean.

• The alternative hypothesis is either

H1 : µ ≠ A or H1 : µ < A or H1 : µ > A

• Sample data is collected and tested to see if it is consistent with the null hypothesis. If the sample mean is significantly different from the hypothetical population mean, H0 is rejected in favour of the alternative hypothesis, H1.


Significance level

• The null hypothesis will always be tested to a given level of significance.

• A 5% level of significance means we are testing to see if the probability of getting a sample mean at least as extreme as ours, assuming the null hypothesis is true, is less than 0.05. If the probability is less, we reject the null hypothesis in favour of the alternative hypothesis.

• A 1% level of significance translates to a probability of 0.01.
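
To make the idea of "probability less than 0.05" concrete, the sketch below works out that probability for the Z-test lightbulb figures. It assumes, as the Z-test does, that under H0 the sample mean follows a normal distribution with mean A and standard deviation σ/√n; `NormalDist` is from Python's standard `statistics` module, and the variable names are ours.

```python
from math import sqrt
from statistics import NormalDist

# Z-test lightbulb figures from the slides
A, sigma, n, x_bar = 1000, 50, 75, 1022

# Assumed model under H0: sample mean ~ N(A, (sigma / sqrt(n))^2)
sampling_dist = NormalDist(mu=A, sigma=sigma / sqrt(n))

# One-tailed test (H1: mu > A): chance of a sample mean of 1022 or more
p_value = 1 - sampling_dist.cdf(x_bar)

print(p_value < 0.05)  # compare with the 5% level
print(p_value < 0.01)  # compare with the 1% level
```

Comparing this probability with 0.05 leads to the same decision as comparing the test statistic with the critical value.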


Critical value (T-test)

The critical value is the boundary (or boundaries) of the rejection region(s). In a T-test this depends on the alternative hypothesis, significance level and degrees of freedom (d.o.f. = n − 1, where n is the number of values in the data set).

          5% significance level      1% significance level
d.o.f.    One-tailed   Two-tailed    One-tailed   Two-tailed
2         2.92         4.30          6.97         9.93
3         2.35         3.18          4.54         5.84
4         2.13         2.78          3.75         4.60
5         2.02         2.57          3.37         4.03
6         1.94         2.45          3.14         3.71
7         1.90         2.37          3.00         3.50
8         1.86         2.31          2.90         3.36
9         1.83         2.26          2.82         3.25
10        1.81         2.23          2.76         3.17
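
Looking up a critical value is a matter of choosing the right row and column. A minimal sketch of that lookup follows; the values are copied from the table above, while the names `T_TABLE` and `critical_value` and the way the arguments are encoded are our own choices for illustration.

```python
# Critical values copied from the table, keyed by degrees of freedom.
# Each row: (5% one-tailed, 5% two-tailed, 1% one-tailed, 1% two-tailed)
T_TABLE = {
    2:  (2.92, 4.30, 6.97, 9.93),
    3:  (2.35, 3.18, 4.54, 5.84),
    4:  (2.13, 2.78, 3.75, 4.60),
    5:  (2.02, 2.57, 3.37, 4.03),
    6:  (1.94, 2.45, 3.14, 3.71),
    7:  (1.90, 2.37, 3.00, 3.50),
    8:  (1.86, 2.31, 2.90, 3.36),
    9:  (1.83, 2.26, 2.82, 3.25),
    10: (1.81, 2.23, 2.76, 3.17),
}

def critical_value(dof, significance, tails):
    """Return the positive critical value.

    significance is 5 or 1 (per cent); tails is 1 or 2.
    """
    column = {(5, 1): 0, (5, 2): 1, (1, 1): 2, (1, 2): 3}[(significance, tails)]
    return T_TABLE[dof][column]

print(critical_value(8, 5, 2))  # 2.31
print(critical_value(8, 5, 1))  # 1.86
```

For H1 : µ < A the critical value is the negative of the number returned; for a two-tailed test both signs are used.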


Degrees of freedom (T-test)

• The degrees of freedom of a set of data is a way of measuring how the tests affect each other.

• If the data has size n and each sample does not affect any others, the degree of freedom is n − 1. (This is usually the case with our data.)

• Consider a bag containing 10 stones.
  • If as a sample we pick out 10 stones our degree of freedom is 0, because the choice of the first one constrains the possibilities for all the others and the final one is left with no choices.
  • If as a sample we pick out 7 stones our degree of freedom is 3, because if we take out three stones before we start, the choice of stones is unique.
  • If as a sample we just pick out 1 stone our degree of freedom is 9.


H1 : µ ≠ A

If our alternative hypothesis is H1 : µ ≠ A we are doing a two-tailed test and we have 2 critical values, one negative and one positive.
The critical value is the boundary of the rejection region.
For a 5% level of significance we have the following picture:

[Figure: distribution curve with both tails shaded, bounded by the critical values −2.31 and 2.31]

The rejection (shaded) regions have a combined area of 0.05.


H1 : µ > A

If our alternative hypothesis is H1 : µ > A we are doing a one-tailed test and we have 1 critical value, which is positive.
The critical value is the boundary of the rejection region.
For a 5% level of significance we have the following picture:

[Figure: distribution curve with the right-hand tail shaded, bounded by the critical value 1.86]

The rejection region has an area of 0.05.


H1 : µ < A

If our alternative hypothesis is H1 : µ < A we are doing a one-tailed test and we have 1 critical value, which is negative.
The critical value is the boundary of the rejection region.
For a 5% level of significance we have the following picture:

[Figure: distribution curve with the left-hand tail shaded, bounded by the critical value −1.86]

The rejection region has an area of 0.05.


Sample mean and sample standard deviation

• The sample mean, x̄, is the mean

$$\bar{x} = \frac{\sum x}{n}$$

• The sample standard deviation, s, of a set of data is slightly different from the standard deviation σ. It is important to use the correct formula.

• For a T-test question ALWAYS use the sample variance formula

$$s^2 = \frac{\sum x^2 - n\bar{x}^2}{n - 1}$$

• For a T-test question DO NOT USE the variance formula from Lecture 10

$$\sigma^2 = \frac{\sum (x - \bar{x})^2}{n} = \frac{\sum x^2}{n} - \bar{x}^2$$

• (We are using the sample data to work out an approximation to the population variance)
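
The difference between the two formulas is easy to see numerically. The sketch below applies both to the six bag counts used later in Example 4 and checks them against Python's standard `statistics` module, where `stdev` divides by n − 1 and `pstdev` divides by n. The variable names are ours.

```python
import math
import statistics

data = [19, 16, 18, 19, 22, 14]   # bag counts from Example 4
n = len(data)
x_bar = sum(data) / n
sum_sq = sum(x * x for x in data)

# Sample variance (use this for a T-test): divide by n - 1
s_squared = (sum_sq - n * x_bar ** 2) / (n - 1)

# Lecture 10 variance (do NOT use for a T-test): divide by n
sigma_squared = sum_sq / n - x_bar ** 2

# Cross-check against the standard library
assert math.isclose(math.sqrt(s_squared), statistics.stdev(data))
assert math.isclose(math.sqrt(sigma_squared), statistics.pstdev(data))

print(round(s_squared, 1), round(sigma_squared, 3))  # 7.6 6.333
```

Dividing by n − 1 always gives the larger of the two values, and the gap matters most for small samples like the ones used in a T-test.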


Test statistic and conclusion

• The test statistic is the difference between the sample mean, x̄, and the (hypothetical) population mean A, divided by the standard error.

• The standard error is σ/√n for the Z-test and s/√n for the T-test, where n is the sample size, σ is the population standard deviation and s is the sample standard deviation.

• The T-test statistic is

$$\frac{\bar{x} - A}{s/\sqrt{n}}$$

• If the test statistic lies beyond the critical value(s) (in the rejection region) we reject H0. We say THERE IS SUFFICIENT EVIDENCE TO REJECT H0.

• If the test statistic does not lie beyond the critical value, we accept H0. We say THERE IS NOT SUFFICIENT EVIDENCE TO REJECT H0.
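
The decision rule depends on which alternative hypothesis is being tested. One way to write it down is sketched below; the function name `decide` and the labels "not equal", "greater" and "less" are hypothetical, and two of the three statistics are made up purely to exercise each branch.

```python
def decide(test_statistic, critical_value, alternative):
    """Return 'reject H0' or 'accept H0'.

    critical_value is the positive value from the table;
    alternative is 'not equal', 'greater' or 'less' (the form of H1).
    """
    if alternative == "not equal":    # two-tailed: reject in either tail
        reject = abs(test_statistic) > critical_value
    elif alternative == "greater":    # one-tailed: reject in the right tail
        reject = test_statistic > critical_value
    elif alternative == "less":       # one-tailed: reject in the left tail
        reject = test_statistic < -critical_value
    else:
        raise ValueError("unknown alternative hypothesis")
    return "reject H0" if reject else "accept H0"

print(decide(3.81, 1.65, "greater"))    # Z-test lightbulb example: reject H0
print(decide(0.5, 2.31, "not equal"))   # made-up statistic: accept H0
print(decide(-2.0, 1.86, "less"))       # made-up statistic: reject H0
```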


Normal distribution X ∼ N(µ, σ²) and the theory behind the Z-test and the T-test

If samples of size n are taken from a population with mean A and standard deviation σ, then the sample means are distributed normally (at least approximately, when n is large), with mean A and standard deviation σ/√n.

[Figure: bell-shaped curve plotted for x from −4 to 4, with the y-axis running from 0 to 0.5]

When we calculate the test statistic, we are calculating the Z-score of the sample mean. The critical value is the Z-score of a sample mean which we have only a 5% (or 1%) probability of reaching or exceeding if the null hypothesis is true. For further information, try a statistics book from the library, or the Khan Academy videos on YouTube.
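
The claim about the spread of sample means can be tried out by simulation. The sketch below is only an illustration: it assumes the population itself is normal (`random.gauss`), borrows A = 1000, σ = 50 and n = 75 from the lightbulb example, and the number of repeated samples (2000) and the seed are arbitrary choices.

```python
import math
import random
import statistics

random.seed(11)               # fixed seed so the run is repeatable

A, sigma, n = 1000, 50, 75    # population mean, population s.d., sample size
num_samples = 2000            # arbitrary number of repeated samples

sample_means = []
for _ in range(num_samples):
    sample = [random.gauss(A, sigma) for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

observed_mean = statistics.fmean(sample_means)
observed_sd = statistics.stdev(sample_means)
predicted_sd = sigma / math.sqrt(n)

print(round(predicted_sd, 2))        # 5.77
print(observed_mean, observed_sd)    # should be close to 1000 and 5.77
```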


T-test - Example 1

A light bulb company claim their light bulbs last an average of 1000 hours. We want to test whether this is true to a 5% level of significance.

• H0 : µ = 1000.
  H1 : µ ≠ 1000.

We test a sample of 10 light bulbs. Their lifetimes in hours are listed below.

1020, 860, 987, 1109, 1015, 952, 964, 1007, 1082, 1017

• Degrees of freedom: d.o.f. = n − 1 = 10 − 1 = 9
• Critical values: We are doing a two-tailed test as our alternative hypothesis says µ ≠ 1000. Look up 5% with 9 degrees of freedom for the critical value.
• Our critical values are −2.26 and 2.26.


T-test - Example 1 (continued)

• Sample mean: x̄ = 1001.3.
• Sample standard deviation:

$$s^2 = \frac{\sum x^2 - n\bar{x}^2}{n - 1} = 4768.9$$

$$s = \sqrt{4768.9} = 69.057$$

• Test statistic:

$$\frac{\bar{x} - A}{s/\sqrt{n}} = \frac{1001.3 - 1000}{69.057/\sqrt{10}} = 0.06$$

• Decision: −2.26 < 0.06 < 2.26. The test statistic is not in the rejection region so we accept the null hypothesis.

• Conclusion: The sample of 10 light bulbs does not provide sufficient evidence at a 5% significance level to reject the light bulb company's claim; the data are consistent with an average bulb lifetime of 1000 hours.


T-test - Example 2

An average person is said to be able to run 100m in 14.2 seconds. We think that this is a bit on the slow side. We decide to test at a 5% level of significance.

• H0 : µ = 14.2
  H1 : µ < 14.2.

We ask 7 people to run 100m. Their times are as follows:

12.6, 13.2, 11.7, 14.6, 11.3, 12.0, 13.5

• The degree of freedom of this set is 7 − 1 = 6
• We are doing a one-tailed test as our alternative hypothesis says µ < 14.2. Look up 5% with 6 degrees of freedom for the critical value.
• The critical value is −1.94.


T-test - Example 2 (continued)

• Sample mean: x̄ = 12.7.
• Sample standard deviation:

$$s^2 = \frac{\sum x^2 - n\bar{x}^2}{n - 1} = 1.327$$

$$s = \sqrt{1.327} = 1.152$$

• Test statistic:

$$T = \frac{\bar{x} - A}{s/\sqrt{n}} = \frac{12.7 - 14.2}{1.152/\sqrt{7}} = -3.45$$

• Decision: −3.45 < −1.94 so we reject the null hypothesis.
• Conclusion: The data collected provides sufficient evidence at a 5% significance level to reject the claim that the average person runs 100m in 14.2s; people run faster than this.


T-test - Example 3

• An average person has an IQ of 100. We think that we are cleverer than this so we test at a 1% level of significance.
• H0 : µ = 100.
• H1 : µ > 100.
• We got 8 people to take an IQ test. Their marks were as follows:

117, 106, 93, 142, 110, 114, 120, 126

• The degree of freedom of this set is 8 − 1 = 7
• We are doing a one-tailed test as our alternative hypothesis says µ > 100. Look up 1% with 7 degrees of freedom for the critical value.
• The critical value is 3.00.


T-test - Example 3 (continued)

• Sample mean: x̄ = 116.
• Sample standard deviation:

$$s^2 = \frac{\sum x^2 - n\bar{x}^2}{n - 1} = 208.857$$

$$s = \sqrt{208.857} = 14.452$$

• Test statistic:

$$\frac{\bar{x} - A}{s/\sqrt{n}} = \frac{116 - 100}{14.452/\sqrt{8}} = 3.13$$

• Decision: 3.00 < 3.13 so we reject the null hypothesis.
• Conclusion: The sample of 8 people provides sufficient evidence at a 1% significance level to reject the claim that the average IQ is 100; people are more intelligent than this.


T-test - Example 4

• The chocolate company claims that a bag of Maltesers has an average of 20 Maltesers inside. In the name of science we buy 6 bags to see if this is right to a 1% level of significance. The bags have the following number of Maltesers:

19, 16, 18, 19, 22, 14

• H0 : µ = 20.
• H1 : µ ≠ 20.
• Degree of freedom is 6 − 1 = 5.
• We are doing a two-tailed test as our alternative hypothesis says µ ≠ 20. Look up 1% with 5 degrees of freedom for the critical values.
• The critical values are −4.03 and 4.03.


T-test - Example 4 (continued)

• Sample mean: x̄ = 18.
• Sample standard deviation:

$$s^2 = \frac{\sum x^2 - n\bar{x}^2}{n - 1} = 7.6$$

$$s = \sqrt{7.6} = 2.757$$

• Test statistic:

$$T = \frac{\bar{x} - A}{s/\sqrt{n}} = \frac{18 - 20}{2.757/\sqrt{6}} = -1.78$$

• Decision: −4.03 < −1.78 < 4.03 so we accept the null hypothesis.

• Conclusion: The sample of 6 bags of Maltesers does not provide sufficient evidence at a 1% significance level to reject the chocolate company's claim; the data are consistent with an average of 20 Maltesers per bag.
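
The four worked examples can be re-checked in one go. This is a minimal sketch using Python's standard `statistics` module, where `statistics.stdev` divides by n − 1 and so matches s. The helper `t_statistic`, the dictionary keys and the tail labels "both", "upper" and "lower" are hypothetical names chosen for illustration, and the critical values are typed in from the table rather than computed.

```python
import math
import statistics

def t_statistic(data, A):
    """One-sample T-test statistic (sample mean - A) / (s / sqrt(n))."""
    n = len(data)
    x_bar = statistics.fmean(data)
    s = statistics.stdev(data)       # sample s.d., divides by n - 1
    return (x_bar - A) / (s / math.sqrt(n))

# name: (data, A, positive critical value from the table, rejection tail)
examples = {
    "bulbs": ([1020, 860, 987, 1109, 1015, 952, 964, 1007, 1082, 1017],
              1000, 2.26, "both"),
    "sprint": ([12.6, 13.2, 11.7, 14.6, 11.3, 12.0, 13.5], 14.2, 1.94, "lower"),
    "iq": ([117, 106, 93, 142, 110, 114, 120, 126], 100, 3.00, "upper"),
    "maltesers": ([19, 16, 18, 19, 22, 14], 20, 4.03, "both"),
}

results = {}
for name, (data, A, crit, tail) in examples.items():
    t = t_statistic(data, A)
    if tail == "both":
        reject = abs(t) > crit
    elif tail == "upper":
        reject = t > crit
    else:                            # "lower"
        reject = t < -crit
    results[name] = (round(t, 2), reject)
    decision = "reject H0" if reject else "accept H0"
    print(name, "d.o.f. =", len(data) - 1, "t =", round(t, 2), decision)
```

The printed statistics should agree with the hand calculations: 0.06, −3.45, 3.13 and −1.78.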