Hypothesis Testing for Continuous Variables, Yuantao Hao, 19th Oct., 2009, Chapter 4

Chapter 4(1) Basic Logic


Page 1: Chapter 4(1) Basic Logic

Hypothesis Testing for

Continuous Variables

Yuantao Hao

19th Oct., 2009

Chapter 4

Page 2: Chapter 4(1) Basic Logic

Methods of statistical inference :

Parameter estimation: interval estimation

Hypothesis testing

Page 3: Chapter 4(1) Basic Logic

4.1

Specific logic and

main steps of hypothesis testing

Page 4: Chapter 4(1) Basic Logic

4.1 Specific logic and main steps of hypothesis testing

Example 4.1: Twenty cases are randomly selected from patients with a certain disease. The sample mean of blood sedimentation (mm/h) is 9.15 and the sample standard deviation is 2.13. Estimate the 95% and 99% confidence intervals of the population mean, assuming that blood sedimentation in patients with this disease follows a normal distribution.

Page 5: Chapter 4(1) Basic Logic

Solution:

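$\bar{x} \pm t_{0.05/2,\,19}\, s/\sqrt{n} = 9.15 \pm 2.093 \times 2.13/\sqrt{20} = 8.15$ and $10.15$

$\bar{x} \pm t_{0.01/2,\,19}\, s/\sqrt{n} = 9.15 \pm 2.861 \times 2.13/\sqrt{20} = 7.79$ and $10.51$

the 95% confidence interval is (8.15, 10.15),

the 99% confidence interval is (7.79, 10.51).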
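
The interval arithmetic above can be checked with a short script. This is a minimal sketch: the function name `t_confidence_interval` is illustrative, and the critical values 2.093 and 2.861 are the t-table entries for 19 degrees of freedom used in the solution, entered by hand rather than computed.

```python
import math

def t_confidence_interval(mean, sd, n, t_crit):
    """Return (lower, upper) for mean +/- t_crit * sd / sqrt(n)."""
    half_width = t_crit * sd / math.sqrt(n)
    return mean - half_width, mean + half_width

# Example 4.1: n = 20, sample mean 9.15, sample SD 2.13.
# t_crit values are table entries for 19 degrees of freedom.
ci95 = t_confidence_interval(9.15, 2.13, 20, t_crit=2.093)
ci99 = t_confidence_interval(9.15, 2.13, 20, t_crit=2.861)

print(f"95% CI: ({ci95[0]:.3f}, {ci95[1]:.3f})")
print(f"99% CI: ({ci99[0]:.3f}, {ci99[1]:.3f})")
```

The wider 99% interval reflects the larger critical value: more confidence is bought with less precision.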

Page 6: Chapter 4(1) Basic Logic

Other consideration:

However, researchers often have preconceived

ideas about what these parameters might be

and wish to test whether the data conform with

these ideas.

Page 7: Chapter 4(1) Basic Logic

Question:

Is the population mean equal to 10.50, the value reported in the literature?

This is a typical hypothesis-testing problem.

Page 8: Chapter 4(1) Basic Logic

Sample mean $\bar{x} = 9.15 \neq 10.50$, the reported value of $\mu$

How to explain this difference?

Two guesses

Page 9: Chapter 4(1) Basic Logic

4.1.1 Set up the statistical hypotheses

$H_0: \mu = 10.50$ (null hypothesis)

$H_1: \mu \neq 10.50$ (alternative hypothesis)

Page 10: Chapter 4(1) Basic Logic

4.1.2 Select the test statistic and calculate its current value

$t = \dfrac{\bar{X} - 10.50}{S/\sqrt{n}} \sim t$ distribution, $\nu = n - 1$

$t = \dfrac{9.15 - 10.50}{2.13/\sqrt{20}} = -2.8345, \quad \nu = 20 - 1 = 19$
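
As a check on the arithmetic, the statistic can be computed directly from the summary data. This is a small sketch; the helper name `one_sample_t` is an illustrative choice, not part of the slides.

```python
import math

def one_sample_t(mean, sd, n, mu0):
    """Return (t, degrees of freedom) for testing H0: mu = mu0."""
    standard_error = sd / math.sqrt(n)      # S / sqrt(n)
    return (mean - mu0) / standard_error, n - 1

# Example 4.1: sample mean 9.15, SD 2.13, n = 20, hypothesized mean 10.50.
t_value, df = one_sample_t(mean=9.15, sd=2.13, n=20, mu0=10.50)
print(f"t = {t_value:.4f}, degrees of freedom = {df}")
```

The sign of t shows the direction of the difference: it is negative because the sample mean lies below the hypothesized value.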

Page 11: Chapter 4(1) Basic Logic

[Fig. 4.1 Demonstration of the current value of t and the P-value: a t distribution curve, symmetric around 0, with -2.8345 and 2.8345 marked on the horizontal axis]

Page 12: Chapter 4(1) Basic Logic

4.1.3 Determine the P value

The P-value is defined as the probability that the current situation, or an even more extreme one, would appear if $H_0$ held in the population.

Page 13: Chapter 4(1) Basic Logic

The P-value can also be thought of as the

probability of obtaining a test statistic as

extreme as or more extreme than the actual

test statistic obtained, given that the null

hypothesis is true.

$P = P(|t| \geq 2.8345)$
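
When only a printed t-table is available, the P-value is usually bracketed between two tabulated tail probabilities rather than computed exactly. The sketch below mimics that lookup. The dictionary holds standard two-sided table entries for 19 degrees of freedom, and the function name `bracket_two_sided_p` is illustrative.

```python
# Two-sided critical values of t for 19 degrees of freedom, keyed by the
# two-sided tail probability (standard t-table entries, entered by hand).
T_TABLE_DF19 = {0.10: 1.729, 0.05: 2.093, 0.02: 2.539, 0.01: 2.861}

def bracket_two_sided_p(t_value, table):
    """Return (lower, upper) bounds on the two-sided P-value from a t-table."""
    lower, upper = 0.0, 1.0
    for prob, crit in table.items():
        if abs(t_value) >= crit:
            upper = min(upper, prob)    # at least as extreme as crit: P <= prob
        else:
            lower = max(lower, prob)    # less extreme than crit: P > prob
    return lower, upper

low, high = bracket_two_sided_p(-2.8345, T_TABLE_DF19)
print(f"{low} < P <= {high}")
```

Because |t| = 2.8345 falls between the tabulated values 2.539 and 2.861, the P-value lies between 0.01 and 0.02.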

Page 14: Chapter 4(1) Basic Logic

[Fig. 4.1 Demonstration of the current value of t and the P-value: the points -2.8345 and 2.8345 mark the current situation, the tails beyond them mark the more extreme situations, and 0.01 < P < 0.02]

Page 15: Chapter 4(1) Basic Logic

4.1.4 Decision and conclusion

In general, the decision rule is:

When $P \leq \alpha$, reject $H_0$;

otherwise, do not reject $H_0$.

A negligibly small probability $\alpha$ should be defined in advance, such as $\alpha = 0.05$.

Page 16: Chapter 4(1) Basic Logic

Statements:

For convenience, “reject $H_0$” is often stated as “there is a statistically significant difference” or “the difference is statistically significant”, but this does not mean that the difference is big or obvious;

Page 17: Chapter 4(1) Basic Logic

Statements:

accordingly, “not reject $H_0$” is often stated as “there is no statistically significant difference” or “the difference is not statistically significant”. It means that there is not enough evidence to reject $H_0$; it does not straightforwardly mean “accept $H_0$”.

Page 18: Chapter 4(1) Basic Logic

Conclusion:

The result of the above example might be reported as: t = -2.8345, P < 0.02, reject $H_0$; that is, there is a statistically significant difference between the population mean and 10.50 mm/h, the value reported in the literature.

Taking the background into account, it can be considered that the blood sedimentation (mm/h) of this kind of patient might be lower than 10.50 on average.

Page 19: Chapter 4(1) Basic Logic

Two Errors:

Type I error: $H_0$ is true, but it is rejected.

Type II error: $H_0$ is not true, but it is not rejected.

Page 20: Chapter 4(1) Basic Logic

Table 1 Two by Two Table

Decision         | Truth: H0                                   | Truth: H1
-----------------+---------------------------------------------+---------------------------------------------
Not reject H0    | Correct conclusion (1-α)                    | False non-rejection of H0: Type II error (β),
                 |                                             | false negative result
Reject H0        | False rejection of H0: Type I error (α),    | Correct conclusion, Power = (1-β)
                 | false positive result                       |

Power = (1-β): the probability of detecting a predefined statistically significant difference.

Making Type I or Type II errors often results in monetary and nonmonetary costs.
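
The two error probabilities can be estimated by simulation. The sketch below draws repeated samples of n = 20 from a normal population and counts how often a two-sided test at α = 0.05 rejects $H_0: \mu = 10.50$. Several inputs are assumptions made only for illustration: the population standard deviation is taken to be 2.13, the alternative population mean 9.15 is borrowed from the sample of Example 4.1, and the name `rejection_rate`, the seed and the number of trials are arbitrary.

```python
import math
import random

def rejection_rate(true_mean, mu0=10.50, sd=2.13, n=20,
                   t_crit=2.093, trials=10000, seed=1):
    """Fraction of simulated samples in which |t| reaches t_crit."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        mean = sum(sample) / n
        var = sum((x - mean) ** 2 for x in sample) / (n - 1)   # sample variance
        t_value = (mean - mu0) / math.sqrt(var / n)
        if abs(t_value) >= t_crit:      # two-sided test at alpha = 0.05, 19 df
            rejections += 1
    return rejections / trials

alpha_hat = rejection_rate(true_mean=10.50)   # H0 true: estimates alpha
power_hat = rejection_rate(true_mean=9.15)    # H0 false: estimates power (1 - beta)
beta_hat = 1.0 - power_hat                    # estimated Type II error rate

print(f"estimated alpha = {alpha_hat:.3f}")
print(f"estimated power = {power_hat:.3f}, estimated beta = {beta_hat:.3f}")
```

When $H_0$ is true the rejection rate should sit close to the chosen α. When it is false the rejection rate is the power, which depends on how far the true mean is from 10.50 and on the sample size.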

Page 21: Chapter 4(1) Basic Logic

4.2

The t Test for One Group of Data under

Completely Randomized Design

Page 22: Chapter 4(1) Basic Logic

4.2 The t Test for One Group of Data under Completely Randomized Design

Based on the mean and standard deviation of a sample of n individuals randomly selected from a normal distribution, if one wants to judge whether the population mean $\mu$ is equal to a given constant $\mu_0$, the t test for one group of data under completely randomized design can be used.

Page 23: Chapter 4(1) Basic Logic

main steps:

(1) Set up the statistical hypotheses

$H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$

(2) Select the test statistic and calculate its current value

$t = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$

(3) Determine the P-value

$P = P(|t| \geq |\text{current value of statistic } t|)$

Page 24: Chapter 4(1) Basic Logic

(4) Decision and conclusion

Compare the P-value with the pre-assigned small probability $\alpha$: if $P \leq \alpha$, reject $H_0$; otherwise, do not reject $H_0$. Finally, state the conclusion in the light of the background.
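
The four steps can be put together in one routine. This is a sketch under stated assumptions, not a reference implementation: it works from summary statistics, obtains the tail area by integrating the t density numerically with Simpson's rule (a statistics library would normally supply this), and the names `t_pdf`, `upper_tail` and `one_sample_t_test` as well as the `alternative` labels are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail(t_value, df, steps=2000):
    """P(T >= t_value), integrating the density with Simpson's rule."""
    b = abs(t_value)
    h = b / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    tail = 0.5 - total * h / 3          # area beyond |t_value| in one tail
    return tail if t_value >= 0 else 1.0 - tail

def one_sample_t_test(mean, sd, n, mu0, alternative="two-sided", alpha=0.05):
    """Steps (2) to (4): statistic, P-value and decision from summary data."""
    t_value = (mean - mu0) / (sd / math.sqrt(n))
    df = n - 1
    if alternative == "two-sided":      # H1: mu != mu0
        p = 2.0 * upper_tail(abs(t_value), df)
    elif alternative == "greater":      # H1: mu > mu0
        p = upper_tail(t_value, df)
    elif alternative == "less":         # H1: mu < mu0
        p = upper_tail(-t_value, df)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    decision = "reject H0" if p <= alpha else "do not reject H0"
    return t_value, df, p, decision

# Example 4.1: mean 9.15, SD 2.13, n = 20, mu0 = 10.50, two-sided test.
t_value, df, p, decision = one_sample_t_test(9.15, 2.13, 20, 10.50)
print(f"t = {t_value:.4f}, df = {df}, P = {p:.4f}, {decision}")
```

The routine stops at the statistical decision. The final conclusion still has to be worded with the subject-matter background in mind, as step (4) requires.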

Page 25: Chapter 4(1) Basic Logic

Example 4.2

A large-scale survey reported that the mean pulse rate of healthy males is 72 times/min. A physician randomly selected 25 healthy males in a mountainous area and measured their pulses, obtaining a sample mean of 75.2 times/min and a standard deviation of 6.5 times/min. Can one conclude that the mean pulse rate of healthy males in the mountainous area is higher than that in the general population?

Page 26: Chapter 4(1) Basic Logic

Solution: step1

$H_0: \mu = 72$

$H_1: \mu \neq 72$ or $H_1: \mu > 72$

$\alpha = 0.05$

Page 27: Chapter 4(1) Basic Logic

One-sided & two-sided tests:

$H_1: \mu \neq \mu_0$ (two-sided test)

$H_1: \mu > \mu_0$ or $H_1: \mu < \mu_0$ (one-sided test)

Page 28: Chapter 4(1) Basic Logic

Definition:

A two-sided test is a test in which the values

of the parameter being studied under the

alternative hypothesis are allowed to be

either greater than or less than the values

of the parameter under the null hypothesis.

Page 29: Chapter 4(1) Basic Logic

Definition:

A one-sided test is a test in which the values of the parameter being studied under the alternative hypothesis are allowed to be either greater than or less than the values of the parameter under the null hypothesis, but not both.
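
For the t test, the two definitions lead to P-values that are simply related, because the t distribution is symmetric around 0. In the sketch below, $T$ denotes a variable following the t distribution with $\nu$ degrees of freedom under $H_0$ and $t$ the current value of the statistic; the symbol $T$ is introduced here only for illustration.

$$P_{\text{two-sided}} = P(|T| \geq |t|) = 2\,P(T \geq |t|)$$

$$P_{\text{one-sided}} = \begin{cases} P(T \geq t), & H_1: \mu > \mu_0 \\ P(T \leq t), & H_1: \mu < \mu_0 \end{cases}$$

So when the current value of t falls on the side indicated by $H_1$, the one-sided P-value is half the two-sided one.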

Page 30: Chapter 4(1) Basic Logic

Solution:

$H_0: \mu = 72$, $H_1: \mu > 72$

$\alpha = 0.05$

t = 2.69, 0.005 < P < 0.01

Conclusion: the mean pulse rate of healthy males in the mountainous area is higher than that in the general population.
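
The decision can also be reached by comparing t with the one-sided critical value instead of bracketing P. The sketch below does this for 24 degrees of freedom, using the table entry 1.711 for the upper 5% point. Note that the summary values as stated (mean 75.2, SD 6.5, n = 25) give t = 3.2/1.3, about 2.46, rather than the 2.69 shown in the solution; a sample mean of 75.5 would reproduce 2.69. Both values are checked, and the names used are illustrative.

```python
import math

T_CRIT_ONE_SIDED_05_DF24 = 1.711   # upper 5% point of t, 24 degrees of freedom

def t_statistic(mean, sd, n, mu0):
    """One-sample t statistic from summary data."""
    return (mean - mu0) / (sd / math.sqrt(n))

# Value implied by the summary data as stated in Example 4.2.
t_from_data = t_statistic(mean=75.2, sd=6.5, n=25, mu0=72)

for label, t_value in [("from stated data", t_from_data), ("value in the solution", 2.69)]:
    # H1: mu > 72, so only large positive t counts as evidence against H0.
    verdict = "reject H0" if t_value >= T_CRIT_ONE_SIDED_05_DF24 else "do not reject H0"
    print(f"{label}: t = {t_value:.2f} -> {verdict}")
```

Either value exceeds 1.711, so the decision at α = 0.05 is the same in both cases.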

Page 31: Chapter 4(1) Basic Logic

[Fig. 4.1 Demonstration of the current value of t and the P-value: -2.69 and 2.69 are marked on the horizontal axis; the one-sided P-value is the area in a single tail, 0.005 < P < 0.01]

Page 32: Chapter 4(1) Basic Logic

Exercise 1:

Suppose we want to test the hypothesis that mothers with low socioeconomic status (SES) deliver babies whose birth-weights are lower than “normal”.

To test this hypothesis, a list is obtained of birth-weights from 100 consecutive, full-term, live-born deliveries from the maternity ward of a hospital in a low-SES area.

Page 33: Chapter 4(1) Basic Logic

The mean birth-weight is found to be 115 oz, with a sample standard deviation of 24 oz.

Suppose we know from a nationwide survey based on millions of deliveries that the mean birth-weight in the United States is 120 oz.

Can we actually say that the underlying mean birth-weight from this hospital is lower than the national average?

Page 34: Chapter 4(1) Basic Logic

Questions:

1. How should the hypothesis be tested?

2. What are the Type I and Type II errors for these data? What consequences would each error have?

Page 35: Chapter 4(1) Basic Logic

Solution: step1

$H_0: \mu = 120$

$H_1: \mu < 120$

$\alpha = 0.05$

Page 36: Chapter 4(1) Basic Logic

Step2:

$t = \dfrac{115 - 120}{24/\sqrt{100}} = -2.08, \quad \nu = 100 - 1 = 99$

Page 37: Chapter 4(1) Basic Logic

[Fig. 4.1 Demonstration of the current value of t and the P-value: -2.08 and 2.08 are marked on the horizontal axis; the one-sided P-value is the area in a single tail, 0.01 < P < 0.05]

Page 38: Chapter 4(1) Basic Logic

Step3:

We can reject H0 at a significance level of

0.05.

The true mean birth-weight is significantly

lower in this hospital than in the general

population.
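
With 99 degrees of freedom the t distribution is close to the standard normal, so the one-sided P-value for this exercise can be approximated without a t-table. The sketch below uses the normal approximation as a stand-in, which is an assumption made for convenience; it slightly understates the exact t-based value.

```python
import math
from statistics import NormalDist

# Exercise 1: sample mean 115 oz, SD 24 oz, n = 100, national mean 120 oz.
t_value = (115 - 120) / (24 / math.sqrt(100))
df = 100 - 1

# H1: mu < 120, so the P-value is the lower-tail area below t_value.
# Approximation: the standard normal is used in place of t with 99 df.
p_approx = NormalDist().cdf(t_value)

print(f"t = {t_value:.2f}, df = {df}, approximate one-sided P = {p_approx:.3f}")
```

The approximate P-value falls inside the interval 0.01 < P < 0.05 read from the table, which supports rejecting $H_0$ at the 0.05 level.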

Page 39: Chapter 4(1) Basic Logic

Two Errors:

A Type I error would be deciding that the mean birth-weight in the hospital was lower than 120 oz when in fact it was 120 oz.

If a Type I error is made, a special-care nursery will be recommended, with all the extra costs involved, when in fact it is not needed.

Page 40: Chapter 4(1) Basic Logic

A Type II error would be deciding that the mean birth-weight was 120 oz when in fact it was lower than 120 oz.

If a Type II error is made, a special-care nursery will not be recommended when in fact it is needed. The nonmonetary cost of this decision is that low-birthweight babies may not survive without the unique equipment in a special-care nursery.

Page 41: Chapter 4(1) Basic Logic

THE END

THANKS!