Confidence Intervals and Hypothesis Testing Using x̄ and p̂

Two Statistical Procedures

Confidence Intervals

“which is an interval of values that the researcher is fairly “confident” covers the true value of the population.”

Hypothesis Testing

(or significance testing)

“uses sample data to reject the notion that chance alone can explain the sample results.”

Remember Empirical Rule

Going out 1, 2, and 3 standard deviations (1·2.5, 2·2.5, 3·2.5) from the mean, 69, tells us roughly how much of the data falls in each interval: about 68%, 95%, and 99.7%.
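The Empirical Rule intervals can be checked numerically. The sketch below assumes the data are roughly normal with mean 69 and standard deviation 2.5 (the 2.5 is read off the "1·2.5, 2·2.5, 3·2.5" on the slide); the function name is only illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1


def empirical_rule(mean, sd):
    """Return (k, low, high, share) for k = 1, 2, 3 standard deviations."""
    rows = []
    for k in (1, 2, 3):
        low, high = mean - k * sd, mean + k * sd
        # Share of a normal distribution within k SDs of its mean
        share = std_normal.cdf(k) - std_normal.cdf(-k)
        rows.append((k, low, high, share))
    return rows


# Mean 69 is from the slide; SD 2.5 is an assumption read from the multipliers
for k, low, high, share in empirical_rule(69, 2.5):
    print(f"within {k} SD: {low} to {high} holds about {share:.1%} of the data")
```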

Confidence Level

P(-z* < Z < z*) = .90 z* = 1.645

P(-z* < Z < z*) = .95 z* = 1.960

P(-z* < Z < z*) = .99 z* = 2.576

Confidence Level can vary. Since we want to be fairly sure about our confidence intervals, we usually use confidence levels in the 90+% range.
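The critical values above come from the standard normal distribution: z* is the point that leaves half of the leftover probability in each tail. A minimal sketch using Python's standard library (the helper name `z_star` is an assumption for illustration):

```python
from statistics import NormalDist


def z_star(confidence_level):
    """Critical value z* such that P(-z* < Z < z*) = confidence_level."""
    tail = (1 - confidence_level) / 2  # probability left in each tail
    return NormalDist().inv_cdf(1 - tail)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z* = {z_star(level):.3f}")
```

Raising the confidence level pushes z* farther from 0, which is why higher confidence gives wider intervals.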

What is a Confidence Interval?

It is a range of values calculated from a sample(s), based on the sample mean or proportion, sample standard deviation, sample size, and level of confidence.

We use it to infer possible (plausible) values for the true population parameter (µ or p).

Example: 95% CI for µ is 45.67 to 53.83

Example: 90% CI for p is 0.245 to 0.265

How is a Confidence Interval constructed?

__% Confidence Interval = statistic ± (critical value)(standard error of statistic)

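Example for Sample Proportion:

$$\text{\_\_\% CI} = \hat{p} \pm (z^{*})\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

Example for Sample Mean:

$$\text{\_\_\% CI} = \bar{x} \pm (t^{*})\frac{s}{\sqrt{n}}$$

$$\text{\_\_\% CI} = \bar{x} \pm (z^{*})\frac{\sigma}{\sqrt{n}}$$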

Margin of Error

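$$\text{\_\_\% CI} = \hat{p} \pm (z^{*})\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

$$\text{Margin of Error} = (z^{*})\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$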

95% CI of # of Tails/10 flips

Sample Prop SE Prop 95.0 % C.I.

T% 0.51594 0.01902 (0.47865, 0.55323)

[Figure: histogram of H% (x-axis 0.2 to 0.8) versus Frequency]

Interpretation: Based on this sample, we are 95% confident that the true population proportion (p) of tails in 10 flips is between 0.479 and 0.553.
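The SE and interval in the output above can be reproduced by hand. This sketch assumes the sample is the 356 tails in 690 flips used in the coin example elsewhere in these slides (356/690 matches the reported proportion 0.51594); the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence_level=0.95):
    """Return p-hat, its standard error, the margin of error, and the CI."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)  # critical value z*
    se = sqrt(p_hat * (1 - p_hat) / n)  # standard error of p-hat
    margin = z * se  # margin of error
    return p_hat, se, margin, (p_hat - margin, p_hat + margin)


# Assumed data: 356 tails out of 690 flips
p_hat, se, margin, (low, high) = proportion_ci(356, 690)
print(f"p-hat = {p_hat:.5f}, SE = {se:.5f}, margin of error = {margin:.5f}")
print(f"95% CI = ({low:.5f}, {high:.5f})")
```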

95% CI of My Age

N MEAN STDEV SE MEAN 95 % C.I.

Age 69 36.130 5.843 0.703 (34.726, 37.534)

[Figure: histogram of Age (x-axis 20 to 50) versus Frequency]

Interpretation: Based on this sample, we are 95% confident that the true population mean (µ) of my age is between 34.7 and 37.5.

Revisit CLT

The first graph represents X (number of G’s in story).

The second graph represents x̄ (average number of G’s in 3 attempts).

What 3 important facts must you remember about CLT?

[Figure: histogram of G2 (X), x-axis 0 to 100, versus Frequency]

[Figure: histogram of AvgG (x̄), x-axis 0 to 100, versus Frequency]

Properties of CI

1) The sample statistic is the center of the CI, for it is our best (and sometimes our only) estimate of the unknown population parameter of interest.

2) The larger the level of confidence, the wider the confidence interval.

3) The larger the sample size, the narrower the confidence interval.

4) The sampling distribution must be approximately normally distributed (CLT; np̂ and n(1 − p̂) ≥ 10).
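Properties 2 through 4 can be seen numerically. The sketch below uses a hypothetical sample proportion of 0.46 and made-up sample sizes purely for illustration; the helper names are assumptions.

```python
from math import sqrt
from statistics import NormalDist


def ci_width(p_hat, n, confidence_level):
    """Full width (upper minus lower) of a z-interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return 2 * z * sqrt(p_hat * (1 - p_hat) / n)


def normal_approx_ok(p_hat, n):
    """Property 4: both n*p-hat and n*(1 - p-hat) should be at least 10."""
    return n * p_hat >= 10 and n * (1 - p_hat) >= 10


p_hat = 0.46  # hypothetical sample proportion

# Property 2: higher confidence, wider interval (n fixed at 50)
for level in (0.90, 0.95, 0.99):
    print(f"n = 50, {level:.0%} confidence: width = {ci_width(p_hat, 50, level):.3f}")

# Property 3: larger sample, narrower interval (confidence fixed at 95%)
for n in (50, 200, 800):
    print(f"n = {n}, 95% confidence: width = {ci_width(p_hat, n, 0.95):.3f}")

print("normal approximation reasonable for n = 50:", normal_approx_ok(p_hat, 50))
```

Because the standard error has √n in the denominator, quadrupling the sample size only halves the width.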

How Many G’s?

N MEAN STDEV SEMEAN

AvgG 69 61.46 8.48 1.02

95% C.I. = $\bar{x} \pm (t^{*})\frac{s}{\sqrt{n}}$

= 61.46 ± 1.997(1.02)

= (59.423, 63.497)
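The same arithmetic as a short sketch. The critical value t* = 1.997 is taken from the slide rather than computed, and the function name is an assumption. The standard error is computed from s and n without rounding, so the endpoints can differ from the slide in the third decimal place.

```python
from math import sqrt


def mean_ci(mean, sd, n, t_star):
    """t-interval for a mean, given a critical value t* looked up elsewhere."""
    se = sd / sqrt(n)  # standard error of the mean
    margin = t_star * se
    return se, (mean - margin, mean + margin)


# Summary statistics from the AvgG output; t* = 1.997 as used on the slide
se, (low, high) = mean_ci(mean=61.46, sd=8.48, n=69, t_star=1.997)
print(f"SE = {se:.2f}")
print(f"95% CI = ({low:.2f}, {high:.2f})")
```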

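95% CI of Brown M&Ms %

Suppose you sampled a bag of 50 M&M’s and found 23 brown candies. Construct a 95% confidence interval.

$$\text{95\% CI} = \hat{p} \pm (z^{*})\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = \frac{23}{50} \pm (z^{*})\sqrt{\frac{\frac{23}{50}\left(1-\frac{23}{50}\right)}{50}}$$

95% CI = (0.322, 0.598)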
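The intermediate arithmetic, worked step by step with z* = 1.960 for 95% confidence (values rounded for display):

```latex
\begin{align*}
\hat{p} &= \frac{23}{50} = 0.46 \\
n\hat{p} &= 23 \ge 10, \qquad n(1-\hat{p}) = 27 \ge 10 \\
\text{SE} &= \sqrt{\frac{(0.46)(0.54)}{50}} = \sqrt{0.004968} \approx 0.0705 \\
\text{Margin of Error} &= (1.960)(0.0705) \approx 0.138 \\
\text{95\% CI} &= 0.46 \pm 0.138 = (0.322,\ 0.598)
\end{align*}
```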

95% CI of Brown M&Ms %

[Figure: First 20 Conf Intervals, plotted on a proportion axis from 0.000 to 0.700]

20 bags of M&M’s were used to calculate a sample proportion of brown candies per bag.

Each sample proportion was used to construct a 95% CI which was graphed.

How many CI do NOT contain 30%?
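A small simulation gives a feel for how often such intervals miss. This is only a sketch under assumptions that are not in the slides: the true proportion of brown candies is taken to be 0.30, every bag holds 50 candies, and the random seed is arbitrary.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_intervals(true_p=0.30, bag_size=50, bags=20,
                       confidence_level=0.95, seed=1):
    """Draw `bags` random bags and build one z-interval per bag."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    intervals = []
    for _ in range(bags):
        # Each candy is brown with probability true_p (an assumed value)
        brown = sum(rng.random() < true_p for _ in range(bag_size))
        p_hat = brown / bag_size
        margin = z * sqrt(p_hat * (1 - p_hat) / bag_size)
        intervals.append((p_hat - margin, p_hat + margin))
    return intervals


def count_misses(intervals, value):
    """Number of intervals that do NOT contain `value`."""
    return sum(not (low <= value <= high) for low, high in intervals)


intervals = simulate_intervals()
print(f"{count_misses(intervals, 0.30)} of {len(intervals)} intervals miss 0.30")
```

With only 20 intervals the count of misses varies from run to run; over many bags, roughly 5% of 95% intervals should miss the true value.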

Various CI’s w/90% Conf. Level

Example Exam Questions

Which of the following BEST describes what 95% confidence means in a 95% CI for µ of (7.8, 9.4)?

A. There is a 95% probability that µ is between 7.8 and 9.4.

B. In repeated sampling, µ will fall between 7.8 and 9.4 about 95% of the time.

C. In repeated sampling, 95% of the confidence intervals fall between 7.8 and 9.4.

D. In repeated sampling, the confidence intervals will contain µ about 95% of the time.

Hypothesis Testing

Use hypothesis testing when you can state a hypothesis based on an assumption about the population parameter.

Null hypothesis, Ho, states that the population parameter is equal to an assumed true value.

Alternative hypothesis, Ha, states that the population parameter is <, >, or ≠ the assumed true value.

Ho: µ = 35

Ha: µ < 35

Example of Hypothesis Testing

Is there a statistical difference between my age and students’ sample mean of my age?

Ho: µ = 35

Ha: µ > 35

TEST OF MU = 35.000 VS MU G.T. 35.000

N MEAN STDEV SEMEAN T P VALUE

Age 69 36.130 5.843 0.703 1.61 0.056

[Figure: distribution plot with horizontal axis from -4 to 4]

What exactly is a p-value?

1. Bias in sampling

2. Luck of the draw—misrepresentative sample

3. Significant difference

Indication of significance!

What exactly is a p-value?

It is the probability, assuming Ho is true, of getting data like this or ‘worse’ (more like the alternative hypothesis).

It is the probability of obtaining a statistic this extreme or more extreme when Ho is true.

You compare the p-value to alpha, α.

If p-value < α, reject Ho.

If p-value > α, fail to reject Ho.

What exactly is a p-value in context of problem?

It is the probability, assuming Ho (my age is 35) is true, of getting data like this or ‘worse’ (more like the alternative hypothesis, which claims I am older than 35).

It is the probability of obtaining a sample mean of 36.130 or more from a sample of 69 students when Ho (my age is 35) is true.

If alpha is 0.05 and the p-value (probability) is 0.056, then since p-value > alpha we fail to reject Ho (my age is 35).
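The test statistic and p-value for the age example can be recomputed from the summary statistics. Python's standard library has no t distribution, so this sketch integrates the t density numerically with Simpson's rule; that numerical approach and the function names are assumptions made here for illustration, not how the software output above was produced.

```python
from math import exp, lgamma, log, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_density = (lgamma((df + 1) / 2) - lgamma(df / 2)
                   - 0.5 * log(df * pi)
                   - (df + 1) / 2 * log(1 + x * x / df))
    return exp(log_density)


def upper_tail_p(t_stat, df, steps=2000):
    """P(T > t_stat), using Simpson's rule on [0, |t_stat|] (steps must be even)."""
    b = abs(t_stat)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area under the density between 0 and |t_stat|
    return 0.5 - area if t_stat >= 0 else 0.5 + area


# Summary statistics from the output: Ho: mu = 35 vs Ha: mu > 35
n, mean, sd, mu0, alpha = 69, 36.130, 5.843, 35, 0.05
se = sd / sqrt(n)
t_stat = (mean - mu0) / se
p_value = upper_tail_p(t_stat, df=n - 1)
decision = "reject Ho" if p_value < alpha else "fail to reject Ho"
print(f"t = {t_stat:.2f}, p-value = {p_value:.3f}, decision: {decision}")
```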

Example of Hypothesis Testing

Determine whether the coin flipped was fair.

Ho: p = .5

Ha: p ≠ .5

1-PropZTest (not equal)

n = 690

p̂ = 356/690 = .5159

z = .8375

p-value = .4023

[Figure: distribution plot with horizontal axis from -4 to 4]
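The calculator result can be checked by hand. A minimal sketch of a two-sided one-proportion z-test, with an illustrative function name; note that the standard error is computed under Ho (using p = .5), not from p̂.

```python
from math import sqrt
from statistics import NormalDist


def one_prop_z_test(successes, n, p0):
    """Two-sided one-proportion z-test of Ho: p = p0."""
    p_hat = successes / n
    se0 = sqrt(p0 * (1 - p0) / n)  # standard error assuming Ho is true
    z = (p_hat - p0) / se0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # both tails
    return p_hat, z, p_value


p_hat, z, p_value = one_prop_z_test(356, 690, 0.5)
print(f"p-hat = {p_hat:.4f}, z = {z:.4f}, p-value = {p_value:.4f}")
```

Since the p-value is far above any usual alpha, we fail to reject Ho: the data are consistent with a fair coin.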

Power, α, and β

Let confidence level be represented by L.

Then 1 – L = α (OR 1 – α = L).

α is the probability of a Type I error. β is the probability of a Type II error.

Power is equal to 1 – β.

All 3 of these are probabilities (likelihoods of occurrence).
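To make these three quantities concrete, here is a sketch for a two-sided test of Ho: p = .5 with n = 690 flips. The true proportion 0.55 is a hypothetical value chosen only for illustration, and the calculation relies on the normal approximation; the function name is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sided_prop(p0, p_true, n, alpha=0.05):
    """Approximate probability of rejecting Ho: p = p0 when the truth is p_true."""
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    cutoff = z_star * sqrt(p0 * (1 - p0) / n)  # reject when |p-hat - p0| > cutoff
    # Sampling distribution of p-hat if the true proportion is p_true
    sampling = NormalDist(mu=p_true, sigma=sqrt(p_true * (1 - p_true) / n))
    return sampling.cdf(p0 - cutoff) + (1 - sampling.cdf(p0 + cutoff))


alpha = 0.05
power = power_two_sided_prop(p0=0.5, p_true=0.55, n=690, alpha=alpha)  # 0.55 is hypothetical
beta = 1 - power
print(f"alpha = {alpha}, beta = {beta:.3f}, power = {power:.3f}")
```

When the true proportion equals the null value, the same function returns alpha, because rejecting a true Ho is exactly a Type I error.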

Hypothesis Testing like a court case

Assume innocent

Null hypothesis is true: the person is actually innocent

Null hypothesis is false: the person is actually guilty

Fail to reject null hypothesis: found innocent

Confidence Level (1 – α): found innocent, when innocent

Type II Error (β): found innocent, when guilty

Reject null hypothesis: found guilty

Type I Error (α): found guilty, when innocent

Power (1 – β): found guilty, when guilty

What are the Type I and II Errors?

Your boss asked you to test a new marketing scheme to potentially increase profits for the company.

But the new scheme also increases the expense of marketing the product.

What are the two potential ERRORs in the context of this problem?

P-Value Question

Which of the following statements is correct?

A. A large p-value indicates that the data is consistent with the alternative hypothesis.

B. The smaller the p-value, the weaker the evidence against the null hypothesis.

C. The p-value measures the probability of making a Type II error.

D. The p-value measures the probability that the hypothesis is true.

E. An extremely small p-value indicates that the actual data differs from the expected if the null hypothesis were true.

Power Question

Power in this situation refers to

A. the ability to detect a difference when in fact there is no difference.

B. the ability to not detect a difference when in fact there is no difference.

C. the ability to detect a difference when in fact there is a difference.

D. the ability to not detect a difference when in fact there is a difference.

E. the ability to make a correct decision regardless if there is a difference or not.

Sample Question

The government requires that water be sampled and tested for safety every day. If the worker responsible for this testing uses a null hypothesis that means the water is safe, which of the following statements correctly describes a Type I and Type II error?

A. A Type I error would mean that the citizens were not warned of contaminated water, but a Type II error would mean that the citizens would panic although the water is ok.

B. A Type II error would mean that the citizens were not warned of contaminated water, but a Type I error would mean that the citizens would panic although the water was ok.

C. A Type II error would mean that the citizens were not warned of contaminated water, but a Type I error would mean that the citizens would not panic although the water is ok.

D. A Type I error would mean that the citizens were not warned of contaminated water, but a Type II error would mean that the citizens would panic although the water is not ok.

E. Both errors would cause unnecessary panic since a mistake means something went wrong but everything would work out.
