
Module 7: Hypothesis Testing I
Statistics (OA3102)

Professor Ron Fricker
Naval Postgraduate School
Monterey, California

Reading assignment: WM&S chapter 10.1-10.5

Revision: 2-12

Goals for this Module

• Introduction to testing hypotheses
  – Steps in conducting a hypothesis test
  – Types of errors
  – Lots of terminology
• Large-sample tests for means and proportions
  – Aka z-tests
  – Type II error probability calculations
  – Sample size calculations

A Simple Example

• You hypothesize a coin is fair
  – To test, take a coin and start flipping it
  – If it is fair, you expect that about half of the flips will be heads and half tails
  – After a large number of flips, if you see either a large fraction of heads or a large fraction of tails you tend to disbelieve your hypothesis
• This is an informal hypothesis test!
• How to decide when the fraction is too large?

Is the Coin Fair?

• Back to the in-class exercise from Module 1: flip a coin 10 times and count the number of heads
• Assuming the coin is fair (and flips are independent), the number of heads has a binomial distribution with n=10 and p=0.5
• So, if our assumption is true, the distribution of the number of heads is:

[Figure: binomial distribution of the number of heads for n=10, p=0.5]

• And, if our assumption is not true, what do we expect to see? Either too few or too many heads

Setting Up a Test for the Coin

• Idea: Make a rule, based on the number of heads observed, from which we will conclude either that our assumption (p=0.5) is true or false
• Here’s one rule: Assume p=0.5 if 3 ≤ x ≤ 7
  – Otherwise, conclude p≠0.5
• If p=0.5, what’s our chance of making a mistake? ~11%
• If p=0.8, what’s our chance of detecting the biased coin? ~68%
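
The two percentages can be checked directly from the binomial distribution. The sketch below is a minimal illustration, not part of the original slides; it takes the acceptance region to be 3 to 7 heads inclusive (the reading that reproduces the ~11% and ~68% figures), and the helper names `binom_pmf` and `reject_prob` are made up for this example.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def reject_prob(n, p, accept_lo, accept_hi):
    """P(X falls outside [accept_lo, accept_hi]) when the true heads probability is p."""
    accept = sum(binom_pmf(k, n, p) for k in range(accept_lo, accept_hi + 1))
    return 1 - accept

# Assumed rule: keep p = 0.5 when 3 <= x <= 7 heads out of 10 flips
coin_alpha = reject_prob(10, 0.5, 3, 7)  # chance of wrongly rejecting a fair coin
coin_power = reject_prob(10, 0.8, 3, 7)  # chance of detecting a coin with p = 0.8

print(f"Pr(reject | p=0.5) = {coin_alpha:.3f}")
print(f"Pr(reject | p=0.8) = {coin_power:.3f}")
```

The first number is the chance of a mistake when the coin is fair; the second is the chance of catching the biased coin.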

CIs vs. Hypothesis Testing

• Previously we would have answered this question with a confidence interval:
  – Say we observed $y/n = 0.7$: we’re 95% confident that the interval [0.5, 0.9] covers the true p
• Now we look to answer the question using a hypothesis test:
  – If the true probability of heads is p = 0.5 (i.e., the coin is fair), how unlikely would it be to see 7 heads out of ten flips?
• If we see an outcome inconsistent with our hypothesis (the coin is fair), then we reject it

Why Do Hypothesis Tests?

• Confidence intervals provide more information than hypothesis tests
• But often we need to test a particular theory
  – E.g., “The mod decreases the mean down-time.”
• Sometimes confidence intervals are hard/impossible
  – When the theory you’re testing has several dimensions
    • E.g., regression slope and intercept
  – When “intervals” don’t make sense
    • E.g., are two categorical variables independent?

Elements of a Statistical Test

• Null hypothesis
  – Denoted H0
  – We believe the null unless/until it is proven false
• Alternative hypothesis
  – Denoted Ha
  – It’s what we would like to prove
• Test statistic
  – The test statistic is the empirical evidence from the data
• Rejection region
  – If the test statistic falls in the rejection region, we say “reject the null hypothesis” or “we’ve proven the alternative”
  – Otherwise, we “fail to reject the null”

Errors in Hypothesis Testing

                        Conclusion
True situation      Don’t reject H0      Reject H0
H0 true             no error             Type I error
Ha true             Type II error        no error

Probability of a Type I Error

• α = Pr(Type I error) = Pr(H0 is rejected when it is true)
• Also called the significance level or the level of the test
• Experimenter chooses α prior to the test
  – Conventions: α=0.10, 0.05, and 0.01
  – Can increase or decrease depending on particular problem
• Choice of α defines rejection region (or size of “p-value”) that results in rejection of null

Probability of a Type II Error

• β = Pr(Type II error) = Pr(not rejecting H0 when it is false)
• It is a function of the actual alternative distribution and the sample size
• 1-β is called the power of a test
  – It’s Pr(rejecting H0 when it is false)
• Ideal is both small α and β (i.e., high power), but for a fixed sample size they trade off
  – By convention, we control α by choice and β with sample size (bigger sample, more power)

Choosing the Null and Alternative Based on the Severity of Error

• In hypothesis testing, we get to control the Type I error
  – So, if one error is more severe than the other, set the test up so that it’s the Type I error
• E.g., in the US, we consider sending an innocent person to jail more serious than letting a guilty person go free
  – Hence, the burden of proof of guilt is placed on the prosecution at trial (“innocent until proven guilty”)
  – I.e., the null hypothesis is a person is innocent, and the trial process controls the chance of sending an innocent person to jail

Example 10.1

• Consider a political poll of n=15 people
  – Want to test H0: p = 0.5 vs. Ha: p < 0.5, where p is the proportion of the population favoring a candidate
  – The test statistic is Y, the number of people in the sample favoring the candidate
• If the RR={y ≤ 2}, find α
• Solution:
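
One way to work Example 10.1 numerically is sketched below. It assumes the rejection region is two or fewer supporters (y ≤ 2) and that Y is Binomial(15, 0.5) under the null; `binom_cdf` is a helper written for this illustration, not something from the slides.

```python
from math import comb

def binom_cdf(y, n, p):
    """P(Y <= y) for Y ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(y + 1))

n_poll, p_null = 15, 0.5

# alpha = Pr(Y in RR | H0 true) = Pr(Y <= 2 | p = 0.5)
alpha_10_1 = binom_cdf(2, n_poll, p_null)
print(f"alpha = {alpha_10_1:.4f}")
```

With such a small rejection region the test very rarely rejects a true null, which is why α comes out well below the conventional levels.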

Example 10.2

• Continuing with the previous problem, if p=0.3 what is the probability of a Type II error (β)?
  – I.e., what’s the probability of concluding that the candidate will win?
• Solution:

Example 10.3

• Still continuing with the previous problem, if p=0.1 then what is the probability of a Type II error (β)?
• Solution:
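
Examples 10.2 and 10.3 can be sketched the same way. Assuming the rejection region is still y ≤ 2 with n=15, a Type II error occurs whenever Y lands in the acceptance region (y ≥ 3) even though the true p is below 0.5. The function names here are illustrative only.

```python
from math import comb

def binom_cdf(y, n, p):
    """P(Y <= y) for Y ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(y + 1))

def type_ii_error(n, rr_upper, p_true):
    """beta = Pr(Y > rr_upper | p = p_true) for a lower-tailed RR = {y <= rr_upper}."""
    return 1 - binom_cdf(rr_upper, n, p_true)

beta_p03 = type_ii_error(15, 2, 0.3)  # Example 10.2
beta_p01 = type_ii_error(15, 2, 0.1)  # Example 10.3

print(f"beta at p=0.3: {beta_p03:.3f}")
print(f"beta at p=0.1: {beta_p01:.3f}")
```

The comparison illustrates that β depends on the particular alternative: the farther the true p is from 0.5, the less likely the test is to miss it.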

Example 10.4

• Still continuing the problem, if RR={y ≤ 5}:
  – What is the level of the test (α)?
  – If p = 0.3, what is β?
• Solution:
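
A short sketch for Example 10.4, assuming the enlarged rejection region is y ≤ 5 and the same n=15 poll. It also recomputes the y ≤ 2 values so the trade-off between α and β at a fixed sample size is visible side by side; the variable names are made up for this illustration.

```python
from math import comb

def binom_cdf(y, n, p):
    """P(Y <= y) for Y ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(y + 1))

n_poll = 15
results_10_4 = {}
for rr_upper in (2, 5):
    alpha = binom_cdf(rr_upper, n_poll, 0.5)     # Pr(reject | p = 0.5)
    beta = 1 - binom_cdf(rr_upper, n_poll, 0.3)  # Pr(fail to reject | p = 0.3)
    results_10_4[rr_upper] = (alpha, beta)
    print(f"RR = {{y <= {rr_upper}}}: alpha = {alpha:.3f}, beta(p=0.3) = {beta:.3f}")
```

Widening the rejection region raises α and lowers β, which is the trade-off described on the Type II error slide.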

Terminology

• Null and alternative hypotheses
  – We will believe the null until it is proven false
• Acceptance vs. rejection region
  – The null is proven false if the test statistic falls in the rejection region
• Type I vs. Type II error
  – Type I: Rejecting the null hypothesis when it is really true
  – Type II: Accepting the null hypothesis when it is really false
• Significance level or level of the test (α)
  – Probability of a Type I error
• Pr(Type II error) = β and 1-β is called the power
  – It’s a function of the actual alternative distribution

Another Example

• As a program manager, you want to decrease mean down-time μ for a type of equipment
  – Manufacturer suggests modification
  – Goal is to see if mod actually does decrease μ
• Implement modification on a sample of equipment (n=25) and measure the down-time (say, per month)
  – Currently, equipment down-time is 75 mins/month
  – Standard deviation is σ=9 mins

Intuition Behind the Test

[Figure: sampling distribution of the sample mean $\bar{X}$ under the null, centered at μ=75 with standard error σ/√n = 9/5]

1. We will assume the status quo is true…
   a. In this case, that there is no change in the mean down-time
2. …unless there is sufficient evidence to contradict it
   a. In this case, meaning we observe a much smaller sample mean than we would expect to see assuming μ=75

Steps in Hypothesis Testing

1. Identify the parameter of interest
2. State null and alternative hypotheses
3. Determine form of test statistic
4. Calculate rejection region
5. Calculate test statistic
6. Determine test outcome by comparing test statistic to rejection region

1. Identify Parameter of Interest

• In hypothesis tests, we are testing the parameter of a distribution
  – E.g., μ or σ for a normal distribution
  – E.g., p for a binomial distribution
  – E.g., α and β for a gamma distribution
• So, the first step is to identify the parameter of interest
  – Often we’ll be testing the mean μ of a normal, since the CLT often applies to the sample mean
• E.g., in the equipment down-time example, we’re interested in testing the mean down-time
  – For the coin and election examples, we’re testing p

2. State Null and Alternative Hypotheses

• H0: The null hypothesis is a specific theory about the population that we wish to disprove
  – We will believe the null until it is disproved
  – Example: Mean down-time is equal to 75
    • In notation, $H_0: \mu = 75$
• Ha: The alternative hypothesis is what we want to prove
  – What we will believe if the null is rejected
  – Example: Mean down-time is less than 75
    • In notation, $H_a: \mu < 75$

Null and Alternative Hypotheses are Fundamentally Different

• The null hypothesis is what you have assumed
  – Generally, it’s the status quo or less desirable test outcome
  – Failing to prove the alternative does not mean the null is true, only that you don’t have enough evidence to reject it
• The alternative is proven based on empirical evidence
  – It’s the desired test outcome and/or the outcome upon which the burden of proof rests
  – The significance level (α) is set so that the chance of incorrectly “proving” the alternative is tolerably low
• Having proved the alternative is a much stronger outcome than failing to reject the null
  – Thus, structure the test so the alternative is what needs proving

In the Example

• In the example, what we want to test is whether the mod decreases mean down-time
  – So, the null hypothesis is the status quo and the alternative carries the burden of proof to show a decrease
  – We write this out as $H_0: \mu = 75$ versus $H_a: \mu < 75$
• The other possibilities are $H_0: \mu = 75$ versus $H_a: \mu > 75$, and $H_0: \mu = 75$ versus $H_a: \mu \neq 75$
  – What would you be testing with these hypotheses?

Expressing the Null as an Equality

• We will express the null hypothesis as an equality and the alternative as an inequality
  – E.g., $H_0: \mu = 75$ versus $H_a: \mu < 75$
• In reality, the hypotheses divide the real line into two separate regions
  – E.g., $H_0: \mu \geq 75$ versus $H_a: \mu < 75$
• However, the most powerful test occurs when the null hypothesis is at the boundary of its region
  – Hence, we write the null as an equality

3. Determine Test Statistic (and its Sampling Distribution)

• The test statistic is (a function of) the sample statistic corresponding to the population parameter you are testing
  – Population and sample statistic examples:
    • Population mean → Sample mean
    • Difference of two population means → Difference of two sample means
  – It is sometimes “a function of” the sample statistic as we may rescale the sample statistic
• We use the sampling distribution to determine whether the observed statistic is “unusual”

In the Example

• In the example, we are testing the mean μ so the obvious choice for the sample statistic is $\bar{X}$
• In this case, it’s easier to make the test statistic the rescaled sample statistic, $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$, where μ0 is the null hypothesis mean
• Why? Because, assuming the null hypothesis is true, we know the sampling distribution of Z: Z~N(0,1)
  – This is exactly true if X has a normal distribution and approximately true via the CLT for large sample sizes

4. Calculate Rejection Region

• Rejection region depends on the alternative hypothesis
• Set the significance level α so that Pr(fall in rejection region | null hyp. is true) = α
• Means you will have to determine the appropriate quantile or quantiles of the sampling distribution

The Way to Think About It (for a “two-sided” test)

• Rejection region – unlikely under the null (i.e., probability α)
  – If test statistic falls in this region, reject the null
• Acceptance region – likely under the null hypothesis
  – If test statistic falls in this region, fail to reject null

In the Example

• In our example, the test statistic Z~N(0,1)
• In addition, we know $H_a: \mu < 75$, so this is a “one-tailed” or “one-sided” test
• So, we need to find zα so that Pr(Z < zα) = α
• Choosing α=0.05:
  – From Table 4 we get Pr(Z < -1.645) = 0.05
  – Using Excel: =NORMSINV(0.05)
  – And in R: qnorm(0.05)

Picturing the Rejection Region

[Figure: Z~N(0,1) pdf with the region below -1.645 shaded]

• For the rescaled test statistic, the rejection region is in red
• Probability of falling in that region, assuming the null is true, is α=0.05

[Figure: $\bar{X}$~N(75, 81/25) pdf with the region below 75 - 1.645 × 9/5 ≈ 72 shaded]

• Here’s the equivalent test without rescaling
• Probability of falling in that region, assuming the null is true, is still α=0.05
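
The two cutoffs can be reproduced with Python's standard library in place of the table, Excel, or R lookups. This is a minimal sketch using the values stated for the down-time example (μ0=75, σ=9, n=25, α=0.05); the variable names are illustrative.

```python
from statistics import NormalDist

mu_null, sigma, n_units, alpha_level = 75.0, 9.0, 25, 0.05

# Lower-tail quantile of the standard normal: Pr(Z < z_cut) = alpha
z_cut = NormalDist().inv_cdf(alpha_level)

# Same cutoff expressed on the scale of the sample mean
std_err = sigma / n_units**0.5
xbar_cut = mu_null + z_cut * std_err

print(f"reject H0 if z    < {z_cut:.3f}")
print(f"reject H0 if xbar < {xbar_cut:.2f}")
```

Both statements describe the same rejection region; only the scale of the test statistic differs.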

5. Calculate the Test Statistic

• Now, plug the necessary quantities into the formula and calculate the test statistic
• E.g., in the example, imagine we tested 25 pieces of equipment for a month, measuring the total down-time for each:
• Calculating the sample mean gives $\bar{X} = 68.6$
• Thus, the test statistic is $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{68.6 - 75}{9/\sqrt{25}} = -3.8$

6. Determine the Outcome

• Now, compare the test statistic with the rejection region
  – If it falls within the rejection region you have “rejected the null hypothesis”
    • Equivalently, “proven the alternative”
  – If it falls in the acceptance region, you have “failed to reject the null hypothesis”
    • Equivalently, “failed to prove the alternative”

In the Example

• We observed $z = -3.8$ which falls in the rejection region
• The picture:

[Figure: Z~N(0,1) pdf with the rejection region below -1.645 shaded and the observed z = -3.8 marked: “Very unusual to see this, if the null is true”]

• Thus, conclude that the mod is effective at reducing mean down-time for the equipment

One Way to Think About It

• The test statistic tells us our observation was 3.8 standard errors away from the hypothesized mean
  – The calculation: $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{68.6 - 75}{9/\sqrt{25}} = -3.8$
• How likely is this?
  – If the null were true, it would be very unlikely: $\Pr(Z \leq -3.8 \mid H_0 \text{ is true}) = \Phi(-3.8) = 0.000072$
  – Another way to think about it is about one chance in 14,000 to see something like this or more extreme (i.e., 1/0.000072)
  – Pretty convincing evidence that the mod decreases mean down-time
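
The six steps for the down-time example can be strung together in a few lines. The sketch below is an illustration under the slide's stated inputs (sample mean 68.6, μ0=75, σ=9, n=25, α=0.05). Evaluating the formula with exactly those inputs gives a statistic of roughly -3.56 rather than the -3.8 quoted on the slides, so the tail probability is printed for both values; either one lies well inside the rejection region.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Steps 1-2: parameter is the mean down-time; H0: mu = 75 vs. Ha: mu < 75
mu_null, sigma, n_units, alpha_level = 75.0, 9.0, 25, 0.05
xbar_obs = 68.6

# Steps 3-4: rescaled statistic Z ~ N(0,1) under H0; lower-tailed rejection region
z_crit = std_normal.inv_cdf(alpha_level)

# Step 5: test statistic from the stated inputs
z_obs = (xbar_obs - mu_null) / (sigma / n_units**0.5)

# Step 6: compare with the rejection region
reject_null = z_obs < z_crit

p_from_inputs = std_normal.cdf(z_obs)  # tail area at the computed statistic
p_at_slide_z = std_normal.cdf(-3.8)    # tail area at the slide's quoted value

print(f"z = {z_obs:.2f}, critical value = {z_crit:.3f}, reject H0: {reject_null}")
print(f"Pr(Z <= {z_obs:.2f}) = {p_from_inputs:.6f}")
print(f"Pr(Z <= -3.80) = {p_at_slide_z:.6f}")
```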

Logic of Hypothesis Testing

It’s proof by contradiction:

• Null hypothesis: Suppose I am a general/flag officer
• What I expect if null hypothesis is true: People would salute me on Tuesdays
• What I see is inconsistent with what I expect: No one ever salutes me
• My initial assumption must be wrong: I must not be a general/flag officer

Now, for the Example…

• Null hypothesis: Assume μ is 75
• What you expect to see if null is true: Sample mean (n=25) normally distributed w/ mean 75, s.e. $9/\sqrt{25} = 1.8$
• Not very likely to happen if null is true (p=0.00007): Observe a sample mean of 68.6 that is 3.8 s.e.s below assumed mean
• Null must be false: Real mean μ must be less than 75

Large-Sample or z-Tests

• The statistic $\hat{\theta}$ is (approximately) normally distributed
• The rescaled test statistic is $Z = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$
• The null hypothesis is $H_0: \theta = \theta_0$
• Three possible alternative hypotheses and tests:

Alternative Hypothesis           Rejection Region for Level α Test
$H_a: \theta > \theta_0$         $z > z_\alpha$ (upper-tailed test)
$H_a: \theta < \theta_0$         $z < -z_\alpha$ (lower-tailed test)
$H_a: \theta \neq \theta_0$      $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$ (two-tailed test)
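
The three cases in the table can be wrapped in one small helper. This is a sketch only; the function name `z_test` and its `tail` argument are inventions for this illustration, and the standard error is assumed to be supplied by the caller.

```python
from statistics import NormalDist

def z_test(estimate, null_value, std_error, alpha=0.05, tail="two"):
    """Large-sample z-test. tail is 'upper', 'lower', or 'two'. Returns (z, reject)."""
    z = (estimate - null_value) / std_error
    std_normal = NormalDist()
    if tail == "upper":
        reject = z > std_normal.inv_cdf(1 - alpha)       # z > z_alpha
    elif tail == "lower":
        reject = z < std_normal.inv_cdf(alpha)           # z < -z_alpha
    elif tail == "two":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)  # |z| > z_{alpha/2}
    else:
        raise ValueError("tail must be 'upper', 'lower', or 'two'")
    return z, reject

# Usage with the down-time example: estimate 68.6, null value 75, s.e. 9/sqrt(25)
z_downtime, reject_downtime = z_test(68.6, 75.0, 9.0 / 25**0.5, alpha=0.05, tail="lower")
print(f"z = {z_downtime:.2f}, reject H0: {reject_downtime}")
```

Only the comparison against the quantile changes between the three tests; the statistic itself is computed the same way each time.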

Picturing z-Tests

[Figure: rejection regions for the upper-tailed test, lower-tailed test, and two-tailed test]

* Figure from Probability and Statistics for Engineering and the Sciences, 7th ed., Duxbury Press, 2008.

Example 10.5

• VP claims mean contacts/week is less than 15
  – Data collected on random sample of n=36 people
  – Given $\bar{y} = 17$ and s²=9, is there evidence to refute the claim at a significance level of α=0.05?
• Specify the hypotheses to be tested:
• Specify the test statistic and the rejection region
• Conduct the test and state the conclusion
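
One possible working of Example 10.5 is sketched below. It assumes the hypotheses are H0: μ = 15 versus Ha: μ > 15 (evidence against the VP's claim means a larger mean) and uses the sample standard deviation in place of σ, as is usual for a large-sample test.

```python
from statistics import NormalDist

mu_claim, n_people, ybar, s_squared, alpha_level = 15.0, 36, 17.0, 9.0, 0.05

# Test statistic: z = (ybar - mu0) / (s / sqrt(n))
se_10_5 = (s_squared / n_people) ** 0.5
z_10_5 = (ybar - mu_claim) / se_10_5

# Upper-tailed rejection region: z > z_alpha
z_alpha = NormalDist().inv_cdf(1 - alpha_level)
reject_10_5 = z_10_5 > z_alpha

print(f"z = {z_10_5:.2f}, z_alpha = {z_alpha:.3f}, reject H0: {reject_10_5}")
```

Under these assumptions the statistic falls in the rejection region, so the data contradict the claim at the 0.05 level.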

Large Sample Tests for Population Proportion (p)

• If we have a large sample, then via the CLT, $\hat{p} = y/n$ has an approximate normal distribution
• For the null hypothesis H0: p = p0 there are three possible alternative hypotheses

Alternative Hypothesis      Rejection Region for Level α Test
$H_a: p > p_0$              $z > z_\alpha$ (upper-tailed test)
$H_a: p < p_0$              $z < -z_\alpha$ (lower-tailed test)
$H_a: p \neq p_0$           $z > z_{\alpha/2}$ or $z < -z_{\alpha/2}$ (two-tailed test)

where the test statistic is $z = \frac{\hat{p} - p_0}{\sqrt{p_0 (1 - p_0)/n}}$

What’s a “Large Sample”?

• Remember that the y in $\hat{p} = y/n$ is a sum of indicator variables
  – If you sum enough, then the CLT “kicks in” and you can assume $\hat{p}$ has an approximately normal distribution
• So use the same rule we used back in the CLT lecture: $n > 9 \, \frac{\max(p_0, q_0)}{\min(p_0, q_0)}$, where $q_0 = 1 - p_0$

Example 10.6

• Machine must be repaired if it produces more than 10% defectives
  – A random sample of n=100 items has 15 defectives
  – Is there evidence to support the claim that the machine needs repairing at a significance level of α=0.01?
• Specify the hypotheses to be tested:
• Specify the test statistic and the rejection region
• Conduct the test and state the conclusion
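
A possible working of Example 10.6, assuming H0: p = 0.10 versus Ha: p > 0.10 and the large-sample rule from the previous slide. The variable names are illustrative.

```python
from statistics import NormalDist

p_null, n_items, defectives, alpha_level = 0.10, 100, 15, 0.01
q_null = 1 - p_null

# Large-sample check: n > 9 * max(p0, q0) / min(p0, q0)
min_n = 9 * max(p_null, q_null) / min(p_null, q_null)
large_enough = n_items > min_n

# Test statistic: z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
p_hat = defectives / n_items
z_10_6 = (p_hat - p_null) / (p_null * q_null / n_items) ** 0.5

# Upper-tailed rejection region at level alpha
z_alpha_10_6 = NormalDist().inv_cdf(1 - alpha_level)
reject_10_6 = z_10_6 > z_alpha_10_6

print(f"n = {n_items} vs. required n > {min_n:.0f}: {large_enough}")
print(f"z = {z_10_6:.3f}, z_alpha = {z_alpha_10_6:.3f}, reject H0: {reject_10_6}")
```

Under these assumptions the statistic does not reach the 0.01 cutoff, so the sample does not provide enough evidence at that level that the machine needs repair.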

Example 10.7

• Reaction time study on men and women conducted
  – Data on independent random samples of 50 men and 50 women collected giving $\bar{y}_{men} = 3.6$, $s^2_{men} = 0.18$, $\bar{y}_{women} = 3.8$, and $s^2_{women} = 0.14$
  – Is there evidence to suggest a difference in the true mean reaction times between men and women at the α=0.01 level?
• Specify the hypotheses to be tested:
• Specify the test statistic and the rejection region
• Conduct the test and state the conclusion
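
A possible working of Example 10.7, assuming a two-tailed test of H0: μ_men − μ_women = 0 and the usual large-sample standard error for a difference of two independent sample means.

```python
from statistics import NormalDist

n_men, ybar_men, s2_men = 50, 3.6, 0.18
n_women, ybar_women, s2_women = 50, 3.8, 0.14
alpha_level = 0.01

# Standard error of the difference of two independent sample means
se_diff = (s2_men / n_men + s2_women / n_women) ** 0.5

# Test statistic for H0: mu_men - mu_women = 0
z_10_7 = (ybar_men - ybar_women - 0.0) / se_diff

# Two-tailed rejection region: |z| > z_{alpha/2}
z_half_alpha = NormalDist().inv_cdf(1 - alpha_level / 2)
reject_10_7 = abs(z_10_7) > z_half_alpha

print(f"z = {z_10_7:.2f}, z_alpha/2 = {z_half_alpha:.3f}, reject H0: {reject_10_7}")
```

Under these assumptions the statistic falls just inside the acceptance region, so at the 0.01 level the test fails to reject the null of equal means.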

Calculating Type II Error Probabilities

• The probability of a Type II error (β) is the probability a test fails to reject the null when the alternative hypothesis is true
  – Note that it depends on a particular alternative hypothesis
  – We can write it mathematically as β = Pr(fail to reject H0 | Ha is true with θ=θa)
  – To determine, first figure out the rejection region (a function of the null hypothesis), then calculate the probability of falling in the acceptance region when θ=θa

Calculating Type II Error Probabilities, continued

• For example, consider the test $H_0: \theta = \theta_0$ versus $H_a: \theta > \theta_0$
  – Then the rejection region is of the form $RR = \{\hat{\theta} : \hat{\theta} > k\}$ for some value of k
• So, the probability of a Type II error is

  $\beta = \Pr(\hat{\theta} \text{ not in RR when } H_a \text{ true with } \theta = \theta_a)$
  $\phantom{\beta} = \Pr(\hat{\theta} \leq k \mid \theta = \theta_a)$
  $\phantom{\beta} = \Pr\left(\frac{\hat{\theta} - \theta_a}{\sigma_{\hat{\theta}}} \leq \frac{k - \theta_a}{\sigma_{\hat{\theta}}}\right)$

Example 10.8

• Returning to Example 10.5, find β if μa=16
  – Remember that μ0=15, α=0.05, n=36, $\bar{y} = 17$, and s²=9
• Pictorially, we have:
• Now solving:
• An alternative way to solve:
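
A sketch of the two-step recipe for Example 10.8: first find the cutoff k on the scale of the sample mean under the null, then find the probability of landing below k when the true mean is 16. It assumes the upper-tailed test from Example 10.5 and uses s in place of σ.

```python
from statistics import NormalDist

mu_null, mu_alt, n_people, s_squared, alpha_level = 15.0, 16.0, 36, 9.0, 0.05
se_mean = (s_squared / n_people) ** 0.5

# Step 1: rejection region under H0 is ybar > k
k_cut = mu_null + NormalDist().inv_cdf(1 - alpha_level) * se_mean

# Step 2: beta = Pr(ybar <= k | mu = mu_a)
beta_10_8 = NormalDist(mu=mu_alt, sigma=se_mean).cdf(k_cut)
power_10_8 = 1 - beta_10_8

print(f"k = {k_cut:.4f}")
print(f"beta = {beta_10_8:.4f}, power = {power_10_8:.4f}")
```

A table-based calculation that rounds the z value to two decimals will differ slightly in the third decimal place.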

Alternatively, the Power of a Test

• Remember that the power of a test is the probability a test will reject the null for a particular alternative hypothesis
  – It’s just 1-β
• Why is this important?
  – Prior to doing a test, you naturally want to make sure you have sufficient power to prove “interesting” alternatives
  – Sometimes after a test results in a null result, you might want to know the probability of rejecting at the observed level

Sample Size Calculations

• The sample size n for which a level α test also has Type II error probability β at the alternative value μa is

  $n = \left[\frac{\sigma (z_\alpha + z_\beta)}{\mu_0 - \mu_a}\right]^2$ for a one-tailed test
  $n = \left[\frac{\sigma (z_{\alpha/2} + z_\beta)}{\mu_0 - \mu_a}\right]^2$ for a two-tailed test (approximate sol'n)

  – Here zα and zβ are the quantiles of the normal distribution for α and β

Example 10.9

• Returning to Example 10.5, assuming σ²=9, what sample size n is required to test $H_0: \mu = 15$ vs. $H_a: \mu = 16$ with α=β=0.05?
• Solution:
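
A sketch of the one-tailed sample size formula applied to Example 10.9, taking the variance to be 9 and treating z_α and z_β as upper-tail quantiles of the standard normal. The result is rounded up because n must be a whole number.

```python
from math import ceil
from statistics import NormalDist

mu_null, mu_alt, variance = 15.0, 16.0, 9.0
alpha_level, beta_level = 0.05, 0.05

z_a = NormalDist().inv_cdf(1 - alpha_level)  # upper-tail quantile for alpha
z_b = NormalDist().inv_cdf(1 - beta_level)   # upper-tail quantile for beta

# n = [sigma * (z_alpha + z_beta) / (mu0 - mu_a)]^2
n_exact_10_9 = variance * (z_a + z_b) ** 2 / (mu_null - mu_alt) ** 2
n_required_10_9 = ceil(n_exact_10_9)

print(f"n = {n_exact_10_9:.1f}, so use n = {n_required_10_9}")
```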

Another Example

• Consider a test: H0: μ=30,000 vs Ha: μ ≠ 30,000, where
  – The desired significance level is α=0.01
  – The population has a normal distribution with σ=1,500
• Find the required sample size so that at μa=31,000, the probability of a Type II error is 0.1:

  $n = \left[\frac{\sigma (z_{\alpha/2} + z_\beta)}{\mu_0 - \mu_a}\right]^2$
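
The two-tailed version of the formula can be evaluated the same way. This sketch uses the values stated on the slide (α=0.01, β=0.1, σ=1,500, μ0=30,000, μa=31,000) and rounds up to the next whole unit; as the formula slide notes, the two-tailed expression is an approximation.

```python
from math import ceil
from statistics import NormalDist

mu_null, mu_alt, sigma = 30_000.0, 31_000.0, 1_500.0
alpha_level, beta_level = 0.01, 0.10

z_half_a = NormalDist().inv_cdf(1 - alpha_level / 2)  # z_{alpha/2}
z_b = NormalDist().inv_cdf(1 - beta_level)            # z_beta

# n = [sigma * (z_{alpha/2} + z_beta) / (mu0 - mu_a)]^2
n_exact_two_tailed = (sigma * (z_half_a + z_b) / (mu_null - mu_alt)) ** 2
n_required_two_tailed = ceil(n_exact_two_tailed)

print(f"n = {n_exact_two_tailed:.2f}, so use n = {n_required_two_tailed}")
```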

Relationship Between Hypothesis Tests and Confidence Intervals

• The large sample hypothesis test (z-test) is based on the statistic

  $Z = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}}$

  where the acceptance region is

  $\overline{RR} = \left\{ -z_{\alpha/2} \leq \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\theta}}} \leq z_{\alpha/2} \right\}$

  which we can write as

  $\overline{RR} = \left\{ \hat{\theta} - z_{\alpha/2}\,\sigma_{\hat{\theta}} \leq \theta_0 \leq \hat{\theta} + z_{\alpha/2}\,\sigma_{\hat{\theta}} \right\}$

Relationship Between Hypothesis Tests and Confidence Intervals

• Now, remember the general form for a two-sided large sample confidence interval:

  $100(1-\alpha)\% \text{ CI} = \left[ \hat{\theta} - z_{\alpha/2}\,\sigma_{\hat{\theta}},\ \hat{\theta} + z_{\alpha/2}\,\sigma_{\hat{\theta}} \right]$

• Note the similarities to the acceptance region:

  $\overline{RR} = \left\{ \hat{\theta} - z_{\alpha/2}\,\sigma_{\hat{\theta}} \leq \theta_0 \leq \hat{\theta} + z_{\alpha/2}\,\sigma_{\hat{\theta}} \right\}$

• So, if $\theta_0 \in \left[ \hat{\theta} - z_{\alpha/2}\,\sigma_{\hat{\theta}},\ \hat{\theta} + z_{\alpha/2}\,\sigma_{\hat{\theta}} \right]$ we would fail to reject the hypothesis test
  – Can interpret 100(1-α)% CI as the set of all values for θ0 for which H0: θ=θ0 is “acceptable” at level α
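
The equivalence can be checked numerically. The sketch below borrows the down-time numbers (estimate 68.6, standard error 1.8) purely for illustration and applies a two-sided test at α=0.05, which is not the one-sided test used earlier for that example; the candidate null values and function names are made up.

```python
from statistics import NormalDist

def two_sided_ci(estimate, std_error, alpha):
    """100(1 - alpha)% large-sample confidence interval."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return estimate - z * std_error, estimate + z * std_error

def fails_to_reject(estimate, null_value, std_error, alpha):
    """True when the two-tailed z-test keeps H0: theta = null_value."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs((estimate - null_value) / std_error) <= z_crit

estimate, std_error, alpha_level = 68.6, 1.8, 0.05
ci_low, ci_high = two_sided_ci(estimate, std_error, alpha_level)
print(f"95% CI: [{ci_low:.2f}, {ci_high:.2f}]")

# Each candidate null value is inside the CI exactly when the test fails to reject it
agreement = []
for theta0 in (64.0, 66.0, 70.0, 72.0, 75.0):
    in_ci = ci_low <= theta0 <= ci_high
    kept = fails_to_reject(estimate, theta0, std_error, alpha_level)
    agreement.append(in_ci == kept)
    print(f"theta0 = {theta0}: in CI = {in_ci}, fail to reject = {kept}")
```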

What We Covered in this Module

• Introduced hypothesis testing
  – Steps in conducting a hypothesis test
  – Types of errors
  – Lots of terminology
• Large-sample tests for means and proportions
  – Aka z-tests
  – Type II error probability calculations
  – Sample size calculations

Homework

• WM&S chapter 10
  – Required: 2, 3, 20, 21, 27, 34, 38, 41, 42
  – Extra credit: None
• Useful hints:
  – Always first write out the null and alternative hypotheses. I also recommend drawing the null distribution and then highlighting the rejection region. This can be particularly helpful when calculating β...
  – Exercises 21 and 27: We didn’t do these types of problems in class, but just use what you learned with two-sample confidence intervals…
    • The relevant hypotheses are $H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 \neq 0$ and $H_0: p_1 - p_2 = 0$ vs. $H_a: p_1 - p_2 \neq 0$