Two-Sample Inference Procedures with Means

Two-Sample Inference Procedures with Means. Remember: We will be interested in the difference of means, so we will use this to find standard error



Remember:

$$\mu_{x-y} = \mu_x - \mu_y$$

$$\sigma_{x-y} = \sqrt{\sigma_x^2 + \sigma_y^2}$$

We will be interested in the difference of means, so we will use this to find standard error.


Suppose we have a population of adult men with a mean height of 71 inches and standard deviation of 2.6 inches. We also have a population of adult women with a mean height of 65 inches and standard deviation of 2.3 inches. Assume heights are normally distributed.

Describe the distribution of the difference in heights between males and females (male - female).

Normal distribution with $\mu_{x-y} = 6$ inches & $\sigma_{x-y} = 3.471$ inches
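The arithmetic behind those two values can be checked with a short sketch. It assumes, as the example does, that the two heights are independent; the variable names are illustrative only.

```python
import math

# Population parameters from the example (inches).
mu_men, sigma_men = 71.0, 2.6
mu_women, sigma_women = 65.0, 2.3

# Means subtract; variances add because the two heights are independent.
mu_diff = mu_men - mu_women
sigma_diff = math.sqrt(sigma_men**2 + sigma_women**2)

print(f"mean of difference = {mu_diff:.1f} inches")
print(f"SD of difference   = {sigma_diff:.3f} inches")
```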


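[Figure: normal curves for female heights (centered at 65) and male heights (centered at 71), and the distribution of the difference = male - female, centered at 6 with $\sigma = 3.471$.]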


a) What is the probability that the height of a randomly selected man is at most 5 inches taller than the height of a randomly selected woman?

b) What is the 70th percentile for the difference (male-female) in heights of a randomly selected man & woman?

a) $P(x_M - x_F < 5)$ = normalcdf(-∞, 5, 6, 3.471) = .3866

b) $x_M - x_F$ = invNorm(.7, 6, 3.471) = 7.82 inches
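For readers without a graphing calculator, the same two answers can be sketched with Python's standard library. `NormalDist` stands in here for the calculator's normalcdf and invNorm; the variable names are illustrative.

```python
from statistics import NormalDist

# Distribution of (male - female) for one man and one woman.
diff = NormalDist(mu=6.0, sigma=3.471)

p_at_most_5 = diff.cdf(5.0)   # plays the role of normalcdf(-inf, 5, 6, 3.471)
pct_70 = diff.inv_cdf(0.70)   # plays the role of invNorm(.7, 6, 3.471)

print(f"P(difference < 5) = {p_at_most_5:.4f}")
print(f"70th percentile   = {pct_70:.2f} inches")
```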


a) What is the probability that the mean height of 30 men is at most 5 inches taller than the mean height of 30 women?

b) What is the 70th percentile for the difference (male-female) in mean heights of 30 men and 30 women?

a) $P(\bar{x}_M - \bar{x}_W < 5) = .0573$

b) 6.332 inches
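A minimal sketch of where these numbers come from: with samples of 30, each variance is divided by the sample size before the two are added. The names are illustrative, and `NormalDist` is again used in place of the calculator functions.

```python
import math
from statistics import NormalDist

n = 30  # men and women sampled, 30 each

# Standard deviation of (mean of 30 men) - (mean of 30 women).
sd_of_mean_diff = math.sqrt(2.6**2 / n + 2.3**2 / n)
mean_diff = NormalDist(mu=6.0, sigma=sd_of_mean_diff)

p_at_most_5 = mean_diff.cdf(5.0)
pct_70 = mean_diff.inv_cdf(0.70)

print(f"SD of difference in means = {sd_of_mean_diff:.4f}")
print(f"P(difference < 5)         = {p_at_most_5:.4f}")
print(f"70th percentile           = {pct_70:.3f} inches")
```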


Two-Sample Procedures with Means

• The goal of these inference procedures is to compare the responses to two treatments or to compare the characteristics of two populations.

• We have INDEPENDENT samples from each treatment or population

When we compare, what are we interested in?


Assumptions:

• Have two SRS’s from the populations or two randomly assigned treatment groups

• Samples are independent

• Both distributions are approximately normal
  – Have large sample sizes
  – Graph BOTH sets of data

• σ’s unknown


Hypothesis Statements:

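$H_0: \mu_1 - \mu_2 = 0$

$H_a: \mu_1 - \mu_2 < 0$

$H_a: \mu_1 - \mu_2 > 0$

$H_a: \mu_1 - \mu_2 \neq 0$

Equivalently:

$H_0: \mu_1 = \mu_2$

$H_a: \mu_1 < \mu_2$

$H_a: \mu_1 > \mu_2$

$H_a: \mu_1 \neq \mu_2$

Be sure to define BOTH $\mu_1$ and $\mu_2$!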


Formulas

Since in real life we will NOT know both σ’s, we will do t-procedures.


Hypothesis Test:

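$$\text{Test statistic} = \frac{\text{statistic} - \text{parameter}}{\text{SD of statistic}}$$

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$

Since we usually assume H0 is true, $\mu_1 - \mu_2$ then equals 0 – so we can usually leave it out.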
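The test statistic can be written as a small function. This is a sketch from summary statistics; the function name, the `hypothesized_diff` parameter, and the sample numbers in the usage lines are made up for illustration and are not from the slides.

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """Two-sample t statistic computed from summary statistics.

    hypothesized_diff is the value of mu1 - mu2 under H0 (usually 0).
    """
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return ((mean1 - mean2) - hypothesized_diff) / standard_error

# Made-up summary statistics, for illustration only.
t = two_sample_t(mean1=10.0, sd1=3.0, n1=9, mean2=8.0, sd2=4.0, n2=16)
print(f"t = {t:.3f}")
```

Leaving `hypothesized_diff` at its default of 0 corresponds to dropping the $\mu_1 - \mu_2$ term.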


Degrees of Freedom

Option 1: use the smaller of the two values n1 – 1 and n2 – 1

This will produce conservative results – higher p-values & lower confidence.

Option 2: approximation used by technology

$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$$

Calculator does this automatically!
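Both options are easy to sketch in code. The function names and the sample numbers below are illustrative assumptions; the second function implements the approximation formula for Option 2.

```python
def conservative_df(n1, n2):
    """Option 1: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)

def approximate_df(sd1, n1, sd2, n2):
    """Option 2: the approximation that technology uses (usually not a whole number)."""
    v1 = sd1**2 / n1
    v2 = sd2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# Made-up summary statistics, for illustration only.
print(conservative_df(9, 16))
print(round(approximate_df(3.0, 9, 4.0, 16), 2))
```

Option 2 never gives fewer degrees of freedom than Option 1, which is why Option 1 is the conservative choice.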


Dr. Phil’s Survey

VS

If there were such a thing as a personality test, do you think that the guys’ personalities would be different from the girls’?


Two competing headache remedies claim to give fast-acting relief. An experiment was performed to compare the mean lengths of time required for bodily absorption of brand A and brand B. Assume the absorption time is normally distributed. Twelve people were randomly selected and given an oral dosage of brand A. Another 12 were randomly selected and given an equal dosage of brand B. The length of time in minutes for the drugs to reach a specified level in the blood was recorded. The results follow:

Brand    mean    SD     n
A        20.1    8.7    12
B        18.9    7.5    12

Is there sufficient evidence that these drugs differ in the speed at which they enter the blood stream?


Hypotheses & define variables!

$H_0: \mu_A = \mu_B$

$H_a: \mu_A \neq \mu_B$

Where $\mu_A$ is the true mean absorption time for Brand A & $\mu_B$ is the true mean absorption time for Brand B

State assumptions!

Have 2 independent randomly assigned treatments
Given the absorption time is normally distributed
σ’s unknown

Formula & calculations

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{20.1 - 18.9}{\sqrt{\frac{8.7^2}{12} + \frac{7.5^2}{12}}} = .361$$

p-value = .7210, df = 21.53, α = .05

Conclusion in context

Since p-value > α, I fail to reject H0. There is not sufficient evidence to suggest that these drugs differ in the speed at which they enter the blood stream.
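The headache-remedy calculation can be reproduced without a calculator. The sketch below uses only the standard library, so the two-sided p-value is found by integrating the t density numerically (Simpson's rule). That integration is a stand-in chosen for this example, not how the calculator works, and the helper names are hypothetical.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution; df may be fractional."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * (area under the density from 0 to |t|)."""
    b = abs(t)
    h = b / steps  # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * (h / 3.0) * total

# Summary statistics from the problem.
mean_a, sd_a, n_a = 20.1, 8.7, 12
mean_b, sd_b, n_b = 18.9, 7.5, 12

v_a, v_b = sd_a**2 / n_a, sd_b**2 / n_b
t = (mean_a - mean_b) / math.sqrt(v_a + v_b)
df = (v_a + v_b) ** 2 / (v_a**2 / (n_a - 1) + v_b**2 / (n_b - 1))
p_value = two_sided_p(t, df)

print(f"t = {t:.3f}, df = {df:.2f}, p-value = {p_value:.4f}")
```

The result should agree with the slide's df = 21.53 and p-value = .7210. The statistic comes out near .362, so the slide's .361 appears to be truncated rather than rounded.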


Suppose that the sample mean of Brand B is 16.5, then is Brand B faster?

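$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{20.1 - 16.5}{\sqrt{\frac{8.7^2}{12} + \frac{7.5^2}{12}}} = 1.085$$

p-value = .2896, df = 21.53, α = .05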

No, I would still fail to reject the null hypothesis.


A modification has been made to the process for producing a certain type of time-zero film (film that begins to develop as soon as the picture is taken). Because the modification involves extra cost, it will be incorporated only if sample data indicate that the modification decreases true average development time by more than 1 second. Should the company incorporate the modification?

Original: 8.6 5.1 4.5 5.4 6.3 6.6 5.7 8.5
Modified: 5.5 4.0 3.8 6.0 5.8 4.9 7.0 5.7


Assume we have 2 independent SRS of film
Both distributions are approximately normal due to approximately symmetrical boxplots
σ’s unknown

$H_0: \mu_O - \mu_M = 1$

$H_a: \mu_O - \mu_M > 1$

Where $\mu_O$ is the true mean developing time for original film & $\mu_M$ is the true mean developing time for modified film

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(6.3375 - 5.3375) - 1}{\sqrt{\frac{1.5146^2}{8} + \frac{1.0636^2}{8}}} = 0$$

p-value = .5, df = 7, α = .05

Since p-value > α, I fail to reject H0. There is not sufficient evidence to suggest that the company should incorporate the modification.
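The summary statistics for the film data can be recomputed from the raw values. This is a sketch using the standard library; variable names are illustrative. The hypothesized difference is 1 second, so it is not dropped from the numerator here.

```python
import math
from statistics import mean, stdev

original = [8.6, 5.1, 4.5, 5.4, 6.3, 6.6, 5.7, 8.5]
modified = [5.5, 4.0, 3.8, 6.0, 5.8, 4.9, 7.0, 5.7]
hypothesized_diff = 1.0  # H0: mu_O - mu_M = 1

x_orig, x_mod = mean(original), mean(modified)
s_orig, s_mod = stdev(original), stdev(modified)  # sample standard deviations

standard_error = math.sqrt(s_orig**2 / len(original) + s_mod**2 / len(modified))
t = ((x_orig - x_mod) - hypothesized_diff) / standard_error

print(f"means: {x_orig:.4f}, {x_mod:.4f}")
print(f"SDs:   {s_orig:.4f}, {s_mod:.4f}")
print(f"|t| =  {abs(t):.3f}")
```

The observed difference in means is exactly 1 second, so t is 0 (up to floating-point rounding). A t distribution is symmetric about 0, so the one-sided p-value is .5 no matter which degrees of freedom are used.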


Confidence intervals:

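$$\text{CI} = \text{statistic} \pm (\text{critical value}) \times (\text{SD of statistic})$$

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

The square-root term is called the standard error.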


Two competing headache remedies claim to give fast-acting relief. An experiment was performed to compare the mean lengths of time required for bodily absorption of brand A and brand B. Assume the absorption time is normally distributed. Twelve people were randomly selected and given an oral dosage of brand A. Another 12 were randomly selected and given an equal dosage of brand B. The length of time in minutes for the drugs to reach a specified level in the blood was recorded. The results follow:

Brand    mean    SD     n
A        20.1    8.7    12
B        18.9    7.5    12

Find a 95% confidence interval for the difference in mean lengths of time required for bodily absorption of each brand.


State assumptions!

Have 2 independent randomly assigned treatments
Given the absorption time is normally distributed
σ’s unknown

Formula & calculations

$$(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (20.1 - 18.9) \pm 2.080 \sqrt{\frac{8.7^2}{12} + \frac{7.5^2}{12}} = (-5.685, 8.085)$$

df = 21.53

From calculator df = 21.53, use t* for df = 21 & 95% confidence level. Think “Price is Right”! Closest without going over.

The interval shown is the calculator's, which uses df = 21.53. Working by hand with the table value t* = 2.080 gives a slightly wider interval, about (-5.697, 8.097).

Conclusion in context

We are 95% confident that the true difference in mean lengths of time required for bodily absorption of each brand is between –5.685 minutes and 8.085 minutes.
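The by-hand interval can be sketched as a small function that takes t* from the table as an input. The function name is an illustrative assumption. With t* = 2.080 (df = 21) it gives roughly (-5.697, 8.097), a little wider than the calculator's (-5.685, 8.085), which is based on df = 21.53.

```python
import math

def two_sample_ci(mean1, sd1, n1, mean2, sd2, n2, t_star):
    """Confidence interval for mu1 - mu2 from summary statistics and a given t*."""
    standard_error = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    margin = t_star * standard_error
    diff = mean1 - mean2
    return diff - margin, diff + margin

# Brand A vs. Brand B, with t* read from the table for df = 21 and 95% confidence.
low, high = two_sample_ci(20.1, 8.7, 12, 18.9, 7.5, 12, t_star=2.080)
print(f"({low:.3f}, {high:.3f})")
```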


In an attempt to determine if two competing brands of cold medicine contain, on the average, the same amount of acetaminophen, twelve different tablets from each of the two competing brands were randomly selected and tested for the amount of acetaminophen each contains. The results (in milligrams) follow.

Brand A               Brand B
517, 495, 503, 491    493, 508, 513, 521
503, 493, 505, 495    541, 533, 500, 515
498, 481, 499, 494    536, 498, 515, 515

Compute a 95% confidence interval for the difference in mean amount of acetaminophen in Brand A and Brand B.


Input Brand A data into L1 and Brand B data into L2.

I am computing a 2-sample T interval for means at a 95% confidence level.

Confidence interval: (-28.48, -7.189) with 17 df

I am 95% confident that the true difference of means (Brand A - Brand B) in the amount of acetaminophen is between -28.48 mg and -7.189 mg; that is, Brand A contains on average between 7.2 and 28.5 mg less than Brand B.
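The calculator's interval can be reproduced from the raw data. The sketch below uses only the standard library: t* is found by bisection on a numerically integrated t distribution, which is a stand-in chosen for this example rather than the calculator's own method, and the helper names are hypothetical.

```python
import math
from statistics import mean, variance

brand_a = [517, 495, 503, 491, 503, 493, 505, 495, 498, 481, 499, 494]
brand_b = [493, 508, 513, 521, 541, 533, 500, 515, 536, 498, 515, 515]

def t_pdf(x, df):
    """Density of Student's t distribution; df may be fractional."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_cdf(x, df, steps=2000):
    """CDF for x >= 0: 0.5 plus a Simpson's-rule integral of the density on [0, x]."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + (h / 3.0) * total

def t_critical(confidence, df):
    """Find t* by bisection so that the central area equals `confidence`."""
    target = 1.0 - (1.0 - confidence) / 2.0
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

n_a, n_b = len(brand_a), len(brand_b)
v_a, v_b = variance(brand_a) / n_a, variance(brand_b) / n_b
standard_error = math.sqrt(v_a + v_b)
df = (v_a + v_b) ** 2 / (v_a**2 / (n_a - 1) + v_b**2 / (n_b - 1))

diff = mean(brand_a) - mean(brand_b)
t_star = t_critical(0.95, df)
low, high = diff - t_star * standard_error, diff + t_star * standard_error

print(f"df = {df:.2f}, interval = ({low:.2f}, {high:.2f})")
```

The computed df is a little above 17, and the interval should agree with the calculator's (-28.48, -7.189) to the precision shown.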


Note: confidence interval statements

• Matched pairs – refer to “mean difference”

• Two-Sample – refer to “difference of means”


Pooled procedures:

• Used for two populations with the same variance

• When you pool, you average the two sample variances to estimate the common population variance.

• DO NOT use on AP Exam! We do NOT know the variances of the populations, so ALWAYS tell the calculator NO for pooling!
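To make the "average" concrete, here is a sketch of the pooled variance (an average of the sample variances weighted by their degrees of freedom) next to the unpooled standard error. It is shown only to illustrate what pooling does; the function name is an illustrative assumption, and the numbers are the headache-remedy summary statistics.

```python
import math

def pooled_variance(sd1, n1, sd2, n2):
    """Average of the two sample variances, weighted by degrees of freedom."""
    return ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)

sd_a, n_a = 8.7, 12
sd_b, n_b = 7.5, 12

s2_pooled = pooled_variance(sd_a, n_a, sd_b, n_b)
se_pooled = math.sqrt(s2_pooled * (1 / n_a + 1 / n_b))
se_unpooled = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)

print(f"pooled variance = {s2_pooled:.2f}")
print(f"pooled SE       = {se_pooled:.4f}")
print(f"unpooled SE     = {se_unpooled:.4f}")
```

With equal sample sizes the two standard errors coincide, and only the degrees of freedom differ (n1 + n2 - 2 = 22 for the pooled procedure versus 21.53 unpooled).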


Robustness:

• Two-sample procedures are more robust than one-sample procedures

• BEST to have equal sample sizes! (but not necessary)