1 Lecture 3 • Miscellaneous details about hypothesis testing • Type II error • Practical significance vs. statistical significance • Chapter 12.2: Inference about mean when s.d. is unknown.


Lecture 3

• Miscellaneous details about hypothesis testing
• Type II error
• Practical significance vs. statistical significance
• Chapter 12.2: Inference about mean when s.d. is unknown.


Relation between p-value and rejection region methods

• Compare the p-value to $\alpha$. Reject the null hypothesis only if p-value < $\alpha$.

• Ex. 11.1: Estimated mean 177.9965, hypothesized mean 170, t ratio 2.3392873316, p-value 0.0099065919, sample size = 400.

[Figure: density curve for Ex. 11.1; horizontal axis X from 150 to 190, vertical axis Y from 0.00 to 0.14.]


Null Hypothesis in One-Sided Test

• We start by defining H1 because this is the focus of our test.

• Example 11.1: $H_1: \mu > 170$

• The null hypothesis $H_0: \mu \le 170$ is more logically satisfying than $H_0: \mu = 170$.

• However, only the parameter value in H0 that is closest to H1 influences the form of the test.

• We therefore take $H_0: \mu = 170$ for simplicity.


Calculating the Probability of a Type II Error

• To properly interpret the results of a test of hypothesis, we need to
– specify an appropriate significance level or judge the p-value of a test;
– understand the relationship between Type I and Type II errors.

• How do we compute the probability of a Type II error?


Calculation of the Probability of a Type II Error

• A Type II error occurs when a false H0 is not rejected.

• To calculate the probability of a Type II error, $\beta$, we need to
– express the rejection region directly, in terms of the parameter hypothesized (not standardized);
– specify the alternative value under H1.

• Let us revisit Example 11.1.
– The rejection region was $\bar{x} > 175.34$ with $\alpha = .05$.
– Let the alternative value be $\mu = 180$ (rather than just $\mu > 170$).

$$\beta = P(\bar{x} < 175.34 \text{ given that } \mu = 180) = P\left(z < \frac{175.34 - 180}{65/\sqrt{400}}\right) = 0.0764$$
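The calculation above can be checked numerically. The following is a minimal sketch using only Python's standard library; the variable names are illustrative, and the numbers are the ones quoted for Example 11.1 (rejection boundary 175.34, alternative mean 180, standard deviation 65, n = 400).

```python
from math import sqrt
from statistics import NormalDist

# Values quoted on the slide for Example 11.1.
x_bar_L = 175.34   # boundary of the rejection region (reject H0 when x-bar > x_bar_L)
mu_alt = 180.0     # alternative value of the population mean
sigma = 65.0       # population standard deviation
n = 400            # sample size

std_error = sigma / sqrt(n)            # standard deviation of the sample mean
z = (x_bar_L - mu_alt) / std_error     # standardize the boundary under the alternative
beta = NormalDist().cdf(z)             # P(x-bar < x_bar_L given mu = mu_alt)

print(f"z = {z:.2f}, beta = {beta:.4f}")
```

The slide rounds z to -1.43 before looking up the normal table, so the printed value can differ from 0.0764 in the third or fourth decimal place.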


Judging the Test

• A hypothesis test is effectively defined by the significance level $\alpha$ and by the sample size n.

• A measure of effectiveness is the probability of a Type II error, $\beta$. Typically we want to keep $\beta$ as small as possible.

• If the probability of a Type II error is judged to be too large, we can reduce it by
– increasing $\alpha$, and/or
– increasing the sample size.
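To illustrate the trade-off between the two error probabilities, here is a small sketch that recomputes the Type II error probability for Example 11.1 at several significance levels. The function name `type_ii_error` and the choice of levels 0.01, 0.05 and 0.10 are assumptions made for illustration; the test is taken to be the upper-tailed z-test with hypothesized mean 170, alternative 180, standard deviation 65 and n = 400.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(mu0, mu_alt, sigma, n, alpha):
    """Beta for the upper-tailed z-test of H0: mu = mu0 when the true mean is mu_alt."""
    std_error = sigma / sqrt(n)
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    x_bar_L = mu0 + z_alpha * std_error          # reject H0 when x-bar > x_bar_L
    # Probability that x-bar falls in the non-rejection region under the alternative.
    return NormalDist(mu_alt, std_error).cdf(x_bar_L)

for alpha in (0.01, 0.05, 0.10):
    beta = type_ii_error(170, 180, 65, 400, alpha)
    print(f"alpha = {alpha:.2f}  beta = {beta:.4f}")
```

A larger significance level moves the rejection boundary toward the hypothesized mean, so the printed Type II error probabilities shrink as the level grows.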


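• Increasing the sample size reduces $\beta$.

Recall: $z_\alpha = \dfrac{\bar{x}_L - \mu}{\sigma/\sqrt{n}}$, thus $\bar{x}_L = \mu + z_\alpha \dfrac{\sigma}{\sqrt{n}}$.

By increasing the sample size, the standard deviation of the sampling distribution of the mean decreases. Thus, $\bar{x}_L$ decreases.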


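• In Example 11.1, suppose n increases from 400 to 1000.

$$\bar{x}_L = \mu + z_\alpha \frac{\sigma}{\sqrt{n}} = 170 + 1.645 \cdot \frac{65}{\sqrt{1000}} = 173.38$$

$$\beta = P\left(Z < \frac{173.38 - 180}{65/\sqrt{1000}}\right) = P(Z < -3.22) \approx 0$$

• $\alpha$ remains 5%, but the probability of a Type II error drops dramatically.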
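The effect of the sample size can be reproduced with a short sketch. The helper names `critical_mean` and `beta_at` are hypothetical; the inputs are the Example 11.1 values (hypothesized mean 170, alternative 180, standard deviation 65, 5% level).

```python
from math import sqrt
from statistics import NormalDist

def critical_mean(mu0, sigma, n, alpha):
    """Rejection boundary x_bar_L = mu0 + z_alpha * sigma / sqrt(n) for an upper-tailed test."""
    return mu0 + NormalDist().inv_cdf(1 - alpha) * sigma / sqrt(n)

def beta_at(mu_alt, x_bar_L, sigma, n):
    """P(x-bar < x_bar_L) when the true mean is mu_alt."""
    return NormalDist().cdf((x_bar_L - mu_alt) / (sigma / sqrt(n)))

for n in (400, 1000):
    boundary = critical_mean(170, 65, n, 0.05)
    beta = beta_at(180, boundary, 65, n)
    print(f"n = {n:4d}  x_bar_L = {boundary:.2f}  beta = {beta:.4f}")
```

Both the boundary and the standard error shrink as n grows, which is why the Type II error probability falls so quickly.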


• Another way of expressing how well a test performs is to report its power.
– The power of a test is defined as $1 - \beta$.
– It represents the probability of rejecting the null hypothesis when it is false.


Planning Studies

• Power calculations are important in planning studies.

• Using a hypothesis test with low power makes it unlikely that you will reject H0 even if the truth is far from the null hypothesis.

• The operating characteristic curve is a plot of $\beta$ versus the alternative value of $\mu$ for a fixed sample size n and a fixed significance level $\alpha$.
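As one way to use power when planning a study, the sketch below searches for the smallest sample size that reaches a target power. The target of 0.99 and the function names are assumptions chosen for illustration; the test is the upper-tailed z-test of Example 11.1 (hypothesized mean 170, alternative 180, standard deviation 65, 5% level).

```python
from math import sqrt
from statistics import NormalDist

def power(mu0, mu_alt, sigma, n, alpha):
    """Power (1 - beta) of the upper-tailed z-test when the true mean is mu_alt."""
    std_error = sigma / sqrt(n)
    x_bar_L = mu0 + NormalDist().inv_cdf(1 - alpha) * std_error
    return 1 - NormalDist(mu_alt, std_error).cdf(x_bar_L)

def smallest_n(mu0, mu_alt, sigma, alpha, target_power, n_max=100_000):
    """Smallest n whose power reaches target_power (simple linear search)."""
    for n in range(1, n_max + 1):
        if power(mu0, mu_alt, sigma, n, alpha) >= target_power:
            return n
    raise ValueError("target power not reached within n_max")

n_needed = smallest_n(170, 180, 65, 0.05, 0.99)   # 0.99 is a hypothetical target
print(n_needed, round(power(170, 180, 65, n_needed, 0.05), 4))
```

A linear search is used because it is easy to verify; for this test the power increases with n, so the first n that reaches the target is the answer.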


Operating Characteristic Curve for Example 11.1

[Figure: operating characteristic curves. Horizontal axis: population mean (170 to 190). Vertical axis: probability of a Type II error (0.0 to 0.8). One curve each for n = 100, n = 400, n = 1000 and n = 2000.]
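The values behind curves of this kind can be tabulated with a short sketch. It assumes the Example 11.1 setup (upper-tailed z-test, hypothesized mean 170, standard deviation 65, 5% level) and the four sample sizes named in the legend; the grid of alternative means is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist

MU0, SIGMA, ALPHA = 170.0, 65.0, 0.05   # Example 11.1 values

def beta(mu_alt, n):
    """P(Type II error) for the upper-tailed z-test when the true mean is mu_alt."""
    std_error = SIGMA / sqrt(n)
    x_bar_L = MU0 + NormalDist().inv_cdf(1 - ALPHA) * std_error
    return NormalDist(mu_alt, std_error).cdf(x_bar_L)

sample_sizes = (100, 400, 1000, 2000)
print("mean   " + "  ".join(f"n={n:<5d}" for n in sample_sizes))
for mu in range(170, 191, 5):
    # One row per alternative mean, one column per sample size.
    print(f"{mu:<6d} " + "  ".join(f"{beta(mu, n):<7.3f}" for n in sample_sizes))
```

At the hypothesized mean itself each curve starts at 1 minus the significance level, and for any fixed alternative the larger sample sizes give the smaller Type II error probability.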


Problem 11.54

Many Alpine ski centers base their projections of revenues and profits on the assumption that the average Alpine skier skis 4 times per year. To investigate the validity of this assumption, a random sample of 63 skiers is drawn and each is asked to report the number of times they skied the previous year. Assume that the population standard deviation is 2, and the sample mean is 4.84. Can we infer at the 10% level that the assumption is wrong?
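One possible way to work this problem is a two-sided z-test, since the question is whether the assumed mean of 4 is wrong in either direction. The sketch below is an illustration under that reading, not the official solution; the variable names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Values stated in Problem 11.54.
mu0, sigma, n, x_bar, alpha = 4.0, 2.0, 63, 4.84, 0.10

z = (x_bar - mu0) / (sigma / sqrt(n))            # test statistic
p_value = 2 * (1 - NormalDist().cdf(abs(z)))     # two-sided p-value
z_crit = NormalDist().inv_cdf(1 - alpha / 2)     # rejection region: |z| > z_crit
reject = abs(z) > z_crit

print(f"z = {z:.2f}, p-value = {p_value:.4f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

Both methods agree: the statistic falls in the rejection region and the p-value is below 0.10.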


Problem 11.54 follow-up

• What is the probability of making a Type II error if the average Alpine skier skis 4.2 times per year?
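A sketch of this calculation, assuming the same two-sided z-test at the 10% level as in Problem 11.54: the rejection region is first expressed in terms of the sample mean, and the Type II error probability is the chance that the sample mean lands between the two boundaries when the true mean is 4.2. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_beta(mu0, mu_alt, sigma, n, alpha):
    """P(do not reject H0: mu = mu0) for a two-sided z-test when the true mean is mu_alt."""
    std_error = sigma / sqrt(n)
    margin = NormalDist().inv_cdf(1 - alpha / 2) * std_error
    lower, upper = mu0 - margin, mu0 + margin    # do not reject when lower < x-bar < upper
    sampling = NormalDist(mu_alt, std_error)     # distribution of x-bar under the alternative
    return sampling.cdf(upper) - sampling.cdf(lower)

beta = two_sided_beta(mu0=4.0, mu_alt=4.2, sigma=2.0, n=63, alpha=0.10)
print(f"beta = {beta:.3f}")
```

Because 4.2 is close to the hypothesized value relative to the standard error, the test has little power against this alternative.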


Problem: Effects of SAT Coaching

• Suppose that SAT mathematics scores in the absence of coaching have a normal distribution with mean $\mu = 475$ and standard deviation 100. Suppose further that coaching may change the mean but not the standard deviation. Calculate the p-value for the test of $H_0: \mu = 475$ versus $H_1: \mu > 475$ for each of the following three situations:

(a) A coaching service coaches 100 students; their SAT-M scores average $\bar{x} = 478$.

(b) By the next year, the coaching service has coached 1000 students; their SAT-M scores average $\bar{x} = 478$.

(c) An advertising campaign brings the total number of students coached to 10,000; their average score is still $\bar{x} = 478$.
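The three p-values can be computed with a brief sketch. It assumes an upper-tailed z-test (alternative: mean greater than 475) with known standard deviation 100; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_p_value(x_bar, mu0, sigma, n):
    """One-sided p-value P(Z > z) for the z-test of H0: mu = mu0 against mu > mu0."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    return 1 - NormalDist().cdf(z)

for label, n in (("a", 100), ("b", 1000), ("c", 10_000)):
    p = upper_tail_p_value(478, 475, 100, n)
    print(f"({label}) n = {n:6d}  p-value = {p:.4f}")
```

The sample mean is the same in all three cases; only the sample size changes, and with it the standard error and the p-value.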


Practical Significance vs. Statistical Significance

• An increase in the average SAT-M score from 475 to 478 is of little importance in seeking admission to college, but a large enough sample size will always declare very small effects statistically significant.

• A confidence interval provides information about the size of the effect and should always be reported. The two-sided 95% confidence intervals for the SAT coaching problem are $478 \pm (1.96)(100/\sqrt{n})$. Thus, for (a): (458.4, 497.6); (b): (471.8, 484.2); (c): (476.04, 479.96).

• For large samples, the CI says “Yes, the mean score is higher after coaching but only by a small amount.”
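The intervals quoted above follow directly from the formula. This sketch recomputes them with the same multiplier of 1.96; the function name is a hypothetical helper.

```python
from math import sqrt

def z_interval(x_bar, sigma, n, z=1.96):
    """Two-sided confidence interval x_bar +/- z * sigma / sqrt(n) for known sigma."""
    half_width = z * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width

for label, n in (("a", 100), ("b", 1000), ("c", 10_000)):
    low, high = z_interval(478, 100, n)
    print(f"({label}) n = {n:6d}  95% CI = ({low:.2f}, {high:.2f})")
```

The interval narrows around 478 as n grows, which shows that the effect is small even when it is statistically significant.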


Chapter 12

• In this chapter we utilize the approach developed before to describe a population.
– Identify the parameter to be estimated or tested.
– Specify the parameter’s estimator and its sampling distribution.
– Construct a confidence interval estimator or perform a hypothesis test.


12.2 Inference About a Population Mean When the Population Standard Deviation Is Unknown

Recall that when $\sigma$ is known we use the following statistic to estimate and test a population mean:

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$$

When $\sigma$ is unknown, we use its point estimator s, and the z-statistic is then replaced by the t-statistic.


t-Statistic

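$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

• When the sampled population is normally distributed, the t statistic is Student t distributed with n-1 degrees of freedom.

• Confidence interval: $\bar{x} \pm t_{\alpha/2,\,n-1} \dfrac{s}{\sqrt{n}}$, where $t_{\alpha/2,\,n-1}$ is the $1 - \alpha/2$ quantile of the Student t-distribution with n-1 degrees of freedom (the value with area $\alpha/2$ to its right).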


The t-Statistic

$$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

The t distribution is mound-shaped and symmetrical around zero.

The “degrees of freedom” (a function of the sample size) determine how spread out the distribution is (compared to the normal distribution).

[Figure: two t density curves centered at 0, one with d.f. = v1 and one with d.f. = v2, where v1 < v2.]


Degrees of Freedom   t.100   t.05    t.025    t.01     t.005
1                    3.078   6.314   12.706   31.821   63.657
2                    1.886   2.920   4.303    6.965    9.925
.                    .       .       .        .        .
20                   1.325   1.725   2.086    2.528    2.845
.                    .       .       .        .        .
200                  1.286   1.653   1.972    2.345    2.601
∞                    1.282   1.645   1.960    2.326    2.576

[Figure: t density curve with the area A = .05 shaded to the right of the value tA.]
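Entries of a t table like this one can be reproduced numerically. The sketch below is one possible approach using only the standard library: it integrates the Student t density with Simpson's rule to get the upper-tail area and then finds the critical value by bisection. The function names, step counts and search bounds are assumptions made for illustration, not part of the lecture.

```python
from math import exp, lgamma, log, pi

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))

def upper_tail(t, df, steps=1000):
    """P(T > t) for t >= 0: 0.5 minus the Simpson's-rule integral of the density on [0, t]."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(area, df, lo=0.0, hi=100.0):
    """Value t_A with P(T > t_A) = area, found by bisection (the tail area decreases in t)."""
    for _ in range(40):
        mid = (lo + hi) / 2
        if upper_tail(mid, df) > area:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for area, df in ((0.05, 20), (0.025, 20), (0.025, 200)):
    print(f"t_{area} with {df} d.f. = {t_critical(area, df):.3f}")
```

In practice a statistics package or a printed table would be used; the point of the sketch is only that each table entry is the value with the stated area to its right.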


Testing when $\sigma$ is unknown

• Example 12.1
– In order to determine the number of workers required to meet demand, the productivity of newly hired trainees is studied.
– It is believed that trainees can process and distribute more than 450 packages per hour within one week of hiring.
– Fifty trainees were observed for one hour. In this sample of 50 trainees, the mean number of packages processed is 460.38 and s = 38.82.
– Can we conclude that the belief is correct, based on the productivity observation of 50 trainees?
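A sketch of the test for this example, using the rejection region method. Two things are assumptions not stated on the slide: a 5% significance level, and the critical value 1.677 for 49 degrees of freedom, which is taken from a standard t table. The hypotheses are H0: mean = 450 against H1: mean > 450.

```python
from math import sqrt

# Summary statistics given in Example 12.1.
n, x_bar, s, mu0 = 50, 460.38, 38.82, 450.0

t_stat = (x_bar - mu0) / (s / sqrt(n))   # t statistic with n - 1 = 49 degrees of freedom

# ASSUMPTION: alpha = .05, with t_{.05, 49} of about 1.677 read from a standard t table.
t_crit = 1.677
reject = t_stat > t_crit                 # upper-tailed rejection region

print(f"t = {t_stat:.2f}, critical value = {t_crit}, reject H0: {reject}")
```

Under these assumptions the statistic falls in the rejection region, which supports the belief that trainees process more than 450 packages per hour.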


Checking the required conditions

• In deriving the test and confidence interval, we have made two assumptions: (i) the sample is a random sample from the population; (ii) the distribution of the population is normal.

• The t test is robust – the results are still approximately valid as long as the population is not extremely nonnormal. Also if the sample size is large, the results are approximately valid.

• A rough graphical approach to examining normality is to look at the sample histogram.

[Figure: histogram of Packages; horizontal axis from 350 to 550.]


JMP Example

• Problem 12.45: Companies that sell groceries over the Internet are called e-grocers. Customers enter their orders, pay by credit card, and receive delivery by truck. A potential e-grocer analyzed the market and determined that to be profitable the average order would have to exceed $85. To determine whether an e-grocer would be profitable in one large city, she offered the service and recorded the size of the order for a random sample of customers. Can we infer from the data that e-grocery will be profitable in this city at significance level 0.05?


Practice Problems

• 11.68, 11.84, 12.40, 12.46