sandra-merrill
Chapter 11
Comparing Two Populations or Treatments
Suppose we have a population of adult men with a mean height of 71 inches and standard deviation of 2.5 inches. We also have a population of adult women with a mean height of 65 inches and standard deviation of 2.3 inches. Assume heights are normally distributed.
Suppose we take a random sample of 30 men and a random sample of 25 women from their respective populations and calculate the difference in the sample mean heights (men’s mean height – women’s mean height).
If we did this many times, what would the distribution of differences be like?
On the next slide we will investigate this distribution.
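[Figure: population distributions of male heights ($\mu_M = 71$, $\sigma_M = 2.5$) and female heights ($\mu_F = 65$, $\sigma_F = 2.3$); the population means differ by 71 – 65 = 6 inches]
Suppose we took repeated samples of size n = 30 from the population of male heights and calculated the sample means. We would have the sampling distribution of $\bar{x}_M$, with standard deviation $\sigma_{\bar{x}_M} = \frac{2.5}{\sqrt{30}}$.
Suppose we took repeated samples of size n = 25 from the population of female heights and calculated the sample means. We would have the sampling distribution of $\bar{x}_F$, with standard deviation $\sigma_{\bar{x}_F} = \frac{2.3}{\sqrt{25}}$.
Randomly take one of the sample means for the males and one of the sample means for the females and find the difference in mean heights. Doing this repeatedly, we will create the sampling distribution of $(\bar{x}_M - \bar{x}_F)$, with standard deviation
$\sigma_{\bar{x}_M - \bar{x}_F} = \sqrt{\frac{2.5^2}{30} + \frac{2.3^2}{25}}$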
Heights Continued . . .
• Describe the sampling distribution of the difference in mean heights between men and women.
• What is the probability that the difference in mean heights of a random sample of 30 men and a random sample of 25 women is less than 5 inches?
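The sampling distribution is normally distributed with
$\mu_{\bar{x}_M - \bar{x}_F} = 71 - 65 = 6$
$\sigma_{\bar{x}_M - \bar{x}_F} = \sqrt{\frac{2.5^2}{30} + \frac{2.3^2}{25}}$
$P\big((\bar{x}_M - \bar{x}_F) < 5\big) = .0614$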
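The probability above can be checked numerically. The following is a minimal sketch using only Python's standard library; the variable names are illustrative and not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist

# Population parameters and sample sizes from the heights example.
mu_m, sigma_m, n_m = 71.0, 2.5, 30   # men
mu_f, sigma_f, n_f = 65.0, 2.3, 25   # women

# Mean and standard deviation of the sampling distribution of (xbar_M - xbar_F).
mean_diff = mu_m - mu_f
sd_diff = sqrt(sigma_m**2 / n_m + sigma_f**2 / n_f)

# P(xbar_M - xbar_F < 5) under the normal sampling distribution.
prob_less_than_5 = NormalDist(mean_diff, sd_diff).cdf(5)

print(f"mean = {mean_diff:.1f}, sd = {sd_diff:.3f}, P(diff < 5) = {prob_less_than_5:.4f}")
```

The standard deviation works out to about 0.648 inches, so a difference of 5 inches lies roughly 1.54 standard deviations below the mean of 6.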
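Properties of the Sampling Distribution of $\bar{x}_1 - \bar{x}_2$
If the random samples on which $\bar{x}_1$ and $\bar{x}_2$ are based are selected independently of one another, then
1. Mean value of $\bar{x}_1 - \bar{x}_2$: $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_{\bar{x}_1} - \mu_{\bar{x}_2} = \mu_1 - \mu_2$
The sampling distribution of $\bar{x}_1 - \bar{x}_2$ is always centered at the value of $\mu_1 - \mu_2$, so $\bar{x}_1 - \bar{x}_2$ is an unbiased statistic for estimating $\mu_1 - \mu_2$.
2. $\sigma^2_{\bar{x}_1 - \bar{x}_2} = \sigma^2_{\bar{x}_1} + \sigma^2_{\bar{x}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$ and $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
The variance of the difference is the sum of the variances.
3. If n1 and n2 are both large or the population distributions are (at least approximately) normal, $\bar{x}_1$ and $\bar{x}_2$ each have (at least approximately) normal distributions. This implies that the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is also (approximately) normal.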
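These properties can be illustrated by simulation. The sketch below reuses the heights example (normal populations with means 71 and 65 and standard deviations 2.5 and 2.3); the seed, the number of repetitions, and the helper name `sample_mean` are arbitrary choices made for illustration.

```python
import random
from statistics import mean, stdev

random.seed(11)  # arbitrary seed so the run is repeatable

def sample_mean(mu, sigma, n):
    """Mean of one random sample of size n from a normal population."""
    return sum(random.gauss(mu, sigma) for _ in range(n)) / n

# Repeatedly draw independent samples and record the difference in sample means.
diffs = [sample_mean(71, 2.5, 30) - sample_mean(65, 2.3, 25) for _ in range(5000)]

sim_mean = mean(diffs)   # should be close to 71 - 65 = 6
sim_sd = stdev(diffs)    # should be close to sqrt(2.5**2/30 + 2.3**2/25), about 0.648

print(f"simulated mean = {sim_mean:.3f}, simulated sd = {sim_sd:.3f}")
```

With a few thousand repetitions the simulated center and spread should sit close to the theoretical values, and a histogram of `diffs` would look approximately normal.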
The properties of the sampling distribution of $\bar{x}_1 - \bar{x}_2$ imply that $\bar{x}_1 - \bar{x}_2$ can be standardized to obtain a variable with a sampling distribution that is approximately the standard normal (z) distribution.
When two random samples are independently selected and n1 and n2 are both large or the population distributions are (at least approximately) normal, the distribution of
$z = \frac{\bar{x}_1 - \bar{x}_2 - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
is described (at least approximately) by the standard normal (z) distribution.
We must know $\sigma_1$ and $\sigma_2$ in order to use this procedure. If $\sigma_1$ and $\sigma_2$ are unknown, we must use t distributions.
Two-Sample t Test for Comparing Two Populations
Null Hypothesis: H0: $\mu_1 - \mu_2$ = hypothesized value
The hypothesized value is often 0, but there are times when we are interested in testing for a difference that is not 0.
Test Statistic:
$t = \frac{\bar{x}_1 - \bar{x}_2 - \text{hypothesized value}}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
The appropriate df for the two-sample t test is
$df = \frac{(V_1 + V_2)^2}{\frac{V_1^2}{n_1 - 1} + \frac{V_2^2}{n_2 - 1}}$ where $V_1 = \frac{s_1^2}{n_1}$ and $V_2 = \frac{s_2^2}{n_2}$
The computed number of df should be truncated to an integer.
A conservative estimate of the P-value can be found by using the t-curve with the number of degrees of freedom equal to the smaller of (n1 – 1) or (n2 – 1).
Two-Sample t Test for Comparing Two Populations Continued . . .
Null Hypothesis: H0: $\mu_1 - \mu_2$ = hypothesized value
Alternative Hypothesis and P-value:
Ha: $\mu_1 - \mu_2$ > hypothesized value: area under the appropriate t curve to the right of the computed t
Ha: $\mu_1 - \mu_2$ < hypothesized value: area under the appropriate t curve to the left of the computed t
Ha: $\mu_1 - \mu_2$ ≠ hypothesized value: 2(area to the right of the computed t) if t is positive, or 2(area to the left of the computed t) if t is negative
Another Way to Write Hypothesis Statements:
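When the hypothesized value is 0, we can rewrite these hypothesis statements:
H0: $\mu_1 - \mu_2 = 0$ becomes H0: $\mu_1 = \mu_2$
Ha: $\mu_1 - \mu_2 < 0$ becomes Ha: $\mu_1 < \mu_2$
Ha: $\mu_1 - \mu_2 > 0$ becomes Ha: $\mu_1 > \mu_2$
Ha: $\mu_1 - \mu_2 \neq 0$ becomes Ha: $\mu_1 \neq \mu_2$
Be sure to define BOTH $\mu_1$ and $\mu_2$!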
Two-Sample t Test for Comparing Two Populations Continued . . .
Assumptions:
1) The two samples are independently selected random samples from the populations of interest
2) The sample sizes are large (generally 30 or larger) or the population distributions are (at least approximately) normal.
When comparing two treatment groups, use the following assumptions:
1) Individuals or objects are randomly assigned to treatments (or vice versa)
2) The sample sizes are large (generally 30 or larger) or the treatment response distributions are approximately normal.
Are women still paid less than men for comparable work? A study was carried out in which salary data was collected from a random sample of men and from a random sample of women who worked as purchasing managers and who were subscribers to Purchasing magazine. Annual salaries (in thousands of dollars) appear below (the actual sample sizes were much larger). Use $\alpha$ = .05 to determine if there is convincing evidence that the mean annual salary for male purchasing managers is greater than the mean annual salary for female purchasing managers.
Men: 81 69 81 76 76 74 69 76 79 65
Women: 78 60 67 61 62 73 71 58 68 48
State the hypotheses:
H0: $\mu_1 - \mu_2 = 0$
Ha: $\mu_1 - \mu_2 > 0$
where $\mu_1$ = mean annual salary for male purchasing managers and $\mu_2$ = mean annual salary for female purchasing managers
If we had defined $\mu_1$ as the mean salary for female purchasing managers and $\mu_2$ as the mean salary for male purchasing managers, then the correct alternative hypothesis would be that the difference in the means is less than 0.
[Figure: comparative boxplots of annual salary for men and women, axis running from 60 to 80]
Salary War Continued . . .
Verify the assumptions:
1) Given two independently selected random samples of male and female purchasing managers. Even though these are samples from subscribers of Purchasing magazine, the authors of the study believed it was reasonable to view the samples as representative of the populations of interest.
2) Since the sample sizes are small, we must determine if it is plausible that the salary distributions for the two populations are approximately normal. Since the boxplots are reasonably symmetrical with no outliers, it is plausible that the population distributions are approximately normal.
Salary War Continued . . .
Compute the test statistic and P-value.
Test Statistic:
$t = \frac{74.6 - 64.6 - 0}{\sqrt{\frac{5.4^2}{10} + \frac{8.6^2}{10}}} = 3.11$
To find the P-value, first find the appropriate df.
$df = \frac{(2.916 + 7.396)^2}{\frac{2.916^2}{9} + \frac{7.396^2}{9}} = 15.14$
Truncate (round down) this value, so df = 15.
Now find the area to the right of t = 3.11 in the t-curve with df = 15.
P-value = .004, $\alpha$ = .05
Since the P-value < $\alpha$, we reject H0. There is convincing evidence that the mean salary for male purchasing managers is higher than the mean salary for female purchasing managers.
What potential type of error could we have made with this conclusion?
Type I
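The salary calculation can be reproduced from the raw data. This is a minimal sketch using only the standard library: because the standard library has no t distribution, the upper-tail area is approximated by numerically integrating the t density with Simpson's rule, which stands in for the t table or statistical software. The helper names `t_pdf` and `t_upper_tail` are illustrative.

```python
import math
from statistics import mean, variance

men = [81, 69, 81, 76, 76, 74, 69, 76, 79, 65]
women = [78, 60, 67, 61, 62, 73, 71, 58, 68, 48]

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# V1 = s1^2 / n1 and V2 = s2^2 / n2, as defined for the two-sample t test.
v1 = variance(men) / len(men)
v2 = variance(women) / len(women)

t_stat = (mean(men) - mean(women) - 0) / math.sqrt(v1 + v2)
df = int((v1 + v2) ** 2 / (v1**2 / (len(men) - 1) + v2**2 / (len(women) - 1)))  # truncated
p_value = t_upper_tail(t_stat, df)  # Ha is "greater than", so use the right tail

print(f"t = {t_stat:.2f}, df = {df}, P-value = {p_value:.4f}")
```

The unrounded sample standard deviations are about 5.40 and 8.62, so the results agree with the slide's values (t = 3.11, df = 15, P-value of about .004) up to rounding.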
The Two-Sample t Confidence Interval for the Difference Between Two Population or Treatment Means
The general formula for a confidence interval for $\mu_1 - \mu_2$ when
1) The two samples are independently selected random samples from the populations of interest
2) The sample sizes are large (generally 30 or larger) or the population distributions are (at least approximately) normal.
is
$(\bar{x}_1 - \bar{x}_2) \pm (t \text{ critical value})\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
The t critical value is based on
$df = \frac{(V_1 + V_2)^2}{\frac{V_1^2}{n_1 - 1} + \frac{V_2^2}{n_2 - 1}}$ where $V_1 = \frac{s_1^2}{n_1}$ and $V_2 = \frac{s_2^2}{n_2}$
df should be truncated to an integer.
For a comparison of two treatments, use the following assumptions:
1) Individuals or objects are randomly assigned to treatments (or vice versa)
2) The sample sizes are large (generally 30 or larger) or the treatment response distributions are approximately normal.
In a study on food intake after sleep deprivation, men were randomly assigned to one of two treatment groups. The experimental group were required to sleep only 4 hours on each of two nights, while the control group were required to sleep 8 hours on each of two nights. The amount of food intake (Kcal) on the day following the two nights of sleep was measured. Compute a 95% confidence interval for the true difference in the mean food intake for the two sleeping conditions.
4-hour sleep:
3585 4470 3068 5338 2221 4791 4435 3099
3187 3901 3868 3869 4878 3632 4518
8-hour sleep:
4965 3918 1987 4993 5220 3653 3510 3338
4100 5792 4547 3319 3336 4304 4057
Find the mean and standard deviation for each treatment.
$\bar{x}_4 = 3924$, $s_4 = 829.67$, $\bar{x}_8 = 4069.27$, $s_8 = 952.90$
Assumptions:
1) Men were randomly assigned to two treatment groups
Food Intake Study Continued . . .
Verify the assumptions.
2) The assumption of normal response distributions is plausible because both boxplots are approximately symmetrical with no outliers.
[Figure: comparative boxplots of food intake (Kcal) for the 4-hour and 8-hour sleep groups]
Food Intake Study Continued . . .
Calculate the interval.
$(3924 - 4069.27) \pm 2.052\sqrt{\frac{829.67^2}{15} + \frac{952.90^2}{15}} \rightarrow (-814.1, 523.6)$
Interpret the interval in context.
We are 95% confident that the true difference in the mean food intake for the two sleeping conditions is between -814.1 Kcal and 523.6 Kcal.
Based upon this interval, is there a significant difference in the mean food intake for the two sleeping conditions?
No, since 0 is in the confidence interval, there is not convincing evidence that the mean food intakes for the two sleep conditions are different.
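The interval can be recomputed from the summary statistics. The sketch below takes the t critical value 2.052 from the slide as a given rather than deriving it; variable names are illustrative.

```python
from math import sqrt

# Summary statistics from the food intake study.
xbar_4, s_4, n_4 = 3924.0, 829.67, 15     # 4-hour sleep group
xbar_8, s_8, n_8 = 4069.27, 952.90, 15    # 8-hour sleep group
t_critical = 2.052                         # value used on the slide (95% confidence)

v1 = s_4**2 / n_4
v2 = s_8**2 / n_8

# Degrees of freedom, truncated to an integer.
df = int((v1 + v2) ** 2 / (v1**2 / (n_4 - 1) + v2**2 / (n_8 - 1)))

margin = t_critical * sqrt(v1 + v2)
lower = (xbar_4 - xbar_8) - margin
upper = (xbar_4 - xbar_8) + margin

print(f"df = {df}, interval = ({lower:.1f}, {upper:.1f})")
```

Computed this way the endpoints land within about 1 Kcal of the slide's (-814.1, 523.6); the small gap is likely due to the slide's interval coming from software that does not truncate df. Either way the interval contains 0.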
Pooled t Test
• Used when the variances of the two populations are equal ($\sigma_1 = \sigma_2$)
• Combines information from both samples to create a “pooled” estimate of the common variance which is used in place of the two sample standard deviations
• Is not widely used due to its sensitivity to any departure from the equal variance assumption
When the population variances are equal, the pooled t procedure is better at detecting deviations from H0 than the two-sample t test.
P-values computed using the pooled t procedure can be far from the actual P-value if the population variances are not equal.
Suppose that an investigator wants to determine if regular aerobic exercise improves blood pressure. A random sample of people who jog regularly and a second random sample of people who do not exercise regularly are selected independently of one another.
Can we conclude that the difference in mean blood pressure can be attributed to jogging?
What about other factors like weight?
One way to avoid these difficulties would be to pair subjects by weight, then assign one of the pair to jogging and the other to no exercise.
Summary of the Paired t test for Comparing Two Population or Treatment Means
Null Hypothesis: H0: $\mu_d$ = hypothesized value
where $\mu_d$ is the mean of the differences in the paired observations. The hypothesized value is usually 0, meaning that there is no difference.
Test Statistic:
$t = \frac{\bar{x}_d - \text{hypothesized value}}{s_d / \sqrt{n}}$
where n is the number of sample differences and $\bar{x}_d$ and $s_d$ are the mean and standard deviation of the sample differences. This test is based on df = n – 1.
Alternative Hypothesis and P-value:
Ha: $\mu_d$ > hypothesized value: area to the right of calculated t
Ha: $\mu_d$ < hypothesized value: area to the left of calculated t
Ha: $\mu_d$ ≠ hypothesized value: 2(area to the right of t) if t is positive, or 2(area to the left of t) if t is negative
Summary of the Paired t test for Comparing Two Population or Treatment Means Continued . . .
Assumptions:
1. The samples are paired.
2. The n sample differences can be viewed as a random sample from a population of differences.
3. The number of sample differences is large (generally at least 30) or the population distribution of differences is (at least approximately) normal.
Is this an example of paired samples?
An engineering association wants to see if there is a difference in the mean annual salary for electrical engineers and chemical engineers. A random sample of electrical engineers is surveyed about their annual income. Another random sample of chemical engineers is surveyed about their annual income.
No, there is no pairing of individuals; you have two independent samples.
Is this an example of paired samples?
A pharmaceutical company wants to test its new weight-loss drug. Before giving the drug to volunteers, company researchers weigh each person. After a month of using the drug, each person’s weight is measured again.
Yes, you have two observations on each individual, resulting in paired data.
Can playing chess improve your memory? In a study, students who had not previously played chess participated in a program in which they took chess lessons and played chess daily for 9 months. Each student took a memory test before starting the chess program and again at the end of the 9-month period.
Student 1 2 3 4 5 6 7 8 9 10 11 12
Pre-test 510 610 640 675 600 550 610 625 450 720 575 675
Post-test 850 790 850 775 700 775 700 850 690 775 540 680
Difference -340 -180 -210 -100 -100 -225 -90 -225 -240 -55 35 -5
First, find the differences pre-test minus post-test.
State the hypotheses.
H0: $\mu_d = 0$
Ha: $\mu_d < 0$
where $\mu_d$ is the mean memory score difference between students with no chess training and students who have completed chess training
If we had subtracted Post-test minus Pre-test, then the alternative hypothesis would be that the mean difference is greater than 0.
Playing Chess Continued . . .
Verify assumptions:
1) Although the sample of students is not a random sample, the investigator believed that it was reasonable to view the 12 sample differences as representative of all such differences.
2) A boxplot of the differences is approximately symmetrical with no outliers so the assumption of normality is plausible.
Playing Chess Continued . . .
Compute the test statistic and P-value.
Test Statistic:
$t = \frac{-144.6 - 0}{109.74 / \sqrt{12}} = -4.56$
P-value ≈ 0, df = 11, $\alpha$ = .05
State the conclusion in context.
Since the P-value < $\alpha$, we reject H0. There is convincing evidence to suggest that the mean memory score after chess training is higher than the mean memory score before training.
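The paired t calculation can be checked from the pre-test and post-test scores. As a sketch under the same stdlib-only assumption, the tail area is approximated by Simpson's rule on the t density instead of a t table; `t_pdf` and `t_upper_tail` are illustrative helper names.

```python
import math
from statistics import mean, stdev

pre = [510, 610, 640, 675, 600, 550, 610, 625, 450, 720, 575, 675]
post = [850, 790, 850, 775, 700, 775, 700, 850, 690, 775, 540, 680]

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Differences are pre-test minus post-test, matching the hypotheses.
diffs = [a - b for a, b in zip(pre, post)]
n = len(diffs)
xbar_d = mean(diffs)
s_d = stdev(diffs)

t_stat = (xbar_d - 0) / (s_d / math.sqrt(n))
df = n - 1

# Ha is "less than", so the P-value is the left-tail area; by symmetry of the
# t curve this equals the right-tail area beyond -t_stat (t_stat is negative here).
p_value = t_upper_tail(-t_stat, df)

print(f"mean diff = {xbar_d:.1f}, s_d = {s_d:.2f}, t = {t_stat:.2f}, P-value = {p_value:.5f}")
```

The P-value comes out well below .001, which is why the slide reports it as approximately 0.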
Paired t Confidence Interval for $\mu_d$
When
1. The samples are paired.
2. The n sample differences can be viewed as a random sample from a population of differences.
3. The number of sample differences is large (generally at least 30) or the population distribution of differences is (at least approximately) normal.
the paired t interval for $\mu_d$ is
$\bar{x}_d \pm (t \text{ critical value})\frac{s_d}{\sqrt{n}}$
where df = n – 1
Playing Chess Revisited . . .
Compute a 90% confidence interval for the mean difference in memory scores before chess training and the memory scores after chess training.
$-144.6 \pm 1.796\left(\frac{109.74}{\sqrt{12}}\right) \rightarrow (-201.5, -87.69)$
We are 90% confident that the true mean difference in memory scores before chess training and the memory scores after chess training is between -201.5 and -87.69.
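A short check of this interval, assuming the t critical value 1.796 (df = 11, 90% confidence) is taken from the slide rather than computed:

```python
from math import sqrt
from statistics import mean, stdev

# Pre-test minus post-test differences for the 12 students.
diffs = [-340, -180, -210, -100, -100, -225, -90, -225, -240, -55, 35, -5]
t_critical = 1.796  # value used on the slide

n = len(diffs)
margin = t_critical * stdev(diffs) / sqrt(n)
lower = mean(diffs) - margin
upper = mean(diffs) + margin

print(f"90% CI for the mean difference: ({lower:.1f}, {upper:.2f})")
```

Both endpoints are negative, which is consistent with the test's conclusion that post-test scores tend to be higher than pre-test scores.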
Large-Sample Inferences Concerning the Difference Between Two Population or Treatment Proportions
Investigators at Madigan Army Medical Center tested using duct tape to remove warts versus the more traditional freezing treatment.
Suppose that the duct tape treatment will successfully remove 50% of warts and that the traditional freezing treatment will successfully remove 60% of warts.
Some people seem to think that duct tape can fix anything . . . even remove warts!
Let’s investigate the sampling distribution of $\hat{p}_{freeze} - \hat{p}_{tape}$.
$p_{freeze}$ = the true proportion of warts that are successfully removed by freezing; $p_{freeze} = .6$
$p_{tape}$ = the true proportion of warts that are successfully removed by using duct tape; $p_{tape} = .5$
Suppose we repeatedly treated 100 warts using the traditional freezing treatment and calculated the proportion of warts that are successfully removed. We would have the sampling distribution of $\hat{p}_{freeze}$, centered at .6 with standard deviation $\sigma_{\hat{p}_{freeze}} = \sqrt{\frac{.6(.4)}{100}}$.
Suppose we repeatedly treated 100 warts using the duct tape method and calculated the proportion of warts that are successfully removed. We would have the sampling distribution of $\hat{p}_{tape}$, centered at .5 with standard deviation $\sigma_{\hat{p}_{tape}} = \sqrt{\frac{.5(.5)}{100}}$.
Randomly take one of the sample proportions for the freezing treatment and one of the sample proportions for the duct tape treatment and find the difference. Doing this repeatedly, we will create the sampling distribution of $(\hat{p}_{freeze} - \hat{p}_{tape})$, centered at .1 with standard deviation
$\sigma_{\hat{p}_{freeze} - \hat{p}_{tape}} = \sqrt{\frac{.6(.4)}{100} + \frac{.5(.5)}{100}}$
Properties of the Sampling Distribution of $\hat{p}_1 - \hat{p}_2$
If two random samples are selected independently of one another, the following properties hold:
1. $\mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2$
This says that the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is centered at $p_1 - p_2$, so $\hat{p}_1 - \hat{p}_2$ is an unbiased statistic for estimating $p_1 - p_2$.
2. $\sigma_{\hat{p}_1 - \hat{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$
3. If both n1 and n2 are large (that is, if $n_1 p_1 > 10$, $n_1(1 - p_1) > 10$, $n_2 p_2 > 10$, and $n_2(1 - p_2) > 10$), then $\hat{p}_1$ and $\hat{p}_2$ each have a sampling distribution that is approximately normal, and their difference $\hat{p}_1 - \hat{p}_2$ also has a sampling distribution that is approximately normal.
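As an illustration, the three properties can be evaluated for the wart scenario introduced earlier (success proportions .6 for freezing and .5 for duct tape, 100 warts per treatment). This is a minimal sketch; the variable names are illustrative.

```python
from math import sqrt

p_freeze, n_freeze = 0.6, 100
p_tape, n_tape = 0.5, 100

# Property 1: center of the sampling distribution of the difference.
mean_diff = p_freeze - p_tape

# Property 2: standard deviation of the difference.
sd_diff = sqrt(p_freeze * (1 - p_freeze) / n_freeze + p_tape * (1 - p_tape) / n_tape)

# Property 3: large-sample condition for approximate normality.
counts = [
    n_freeze * p_freeze,
    n_freeze * (1 - p_freeze),
    n_tape * p_tape,
    n_tape * (1 - p_tape),
]
large_enough = all(c > 10 for c in counts)

print(f"mean = {mean_diff:.1f}, sd = {sd_diff:.2f}, large enough: {large_enough}")
```

The standard deviation is the square root of .0024 + .0025 = .0049, which is .07.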
When performing a hypothesis test, we will use the null hypothesis that $p_1$ and $p_2$ are equal. We will not know the common value for $p_1$ and $p_2$.
Since the values of $p_1$ and $p_2$ are unknown, we will combine $\hat{p}_1$ and $\hat{p}_2$ to estimate the common value of $p_1$ and $p_2$.
Use:
$\hat{p}_c = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$
Summary of Large-Sample z Test for p1 – p2 = 0
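Null Hypothesis: H0: $p_1 - p_2 = 0$
Test Statistic:
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\hat{p}_c(1 - \hat{p}_c)}{n_1} + \frac{\hat{p}_c(1 - \hat{p}_c)}{n_2}}}$, where $p_1 - p_2 = 0$ under H0
Use:
$\hat{p}_c = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$
Alternative Hypothesis and P-value:
Ha: $p_1 - p_2 > 0$: area to the right of calculated z
Ha: $p_1 - p_2 < 0$: area to the left of calculated z
Ha: $p_1 - p_2 \neq 0$: 2(area to the right of z) if z is positive, or 2(area to the left of z) if z is negative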
Another Way to Write Hypothesis statements:
H0: $p_1 - p_2 = 0$ becomes H0: $p_1 = p_2$
Ha: $p_1 - p_2 > 0$ becomes Ha: $p_1 > p_2$
Ha: $p_1 - p_2 < 0$ becomes Ha: $p_1 < p_2$
Ha: $p_1 - p_2 \neq 0$ becomes Ha: $p_1 \neq p_2$
Be sure to define both $p_1$ and $p_2$!
Summary of Large-Sample z Test for p1 – p2 = 0 Continued . . .
Assumptions:
1) The samples are independently chosen random samples or treatments were assigned at random to individuals or objects
2) Both sample sizes are large: $n_1 p_1 > 10$, $n_1(1 - p_1) > 10$, $n_2 p_2 > 10$, $n_2(1 - p_2) > 10$
Since $p_1$ and $p_2$ are unknown we must use $\hat{p}_1$ and $\hat{p}_2$ to verify that the samples are large enough.
Investigators at Madigan Army Medical Center tested using duct tape to remove warts. Patients with warts were randomly assigned to either the duct tape treatment or to the more traditional freezing treatment. Those in the duct tape group wore duct tape over the wart for 6 days, then removed the tape, soaked the area in water, and used an emery board to scrape the area. This process was repeated for a maximum of 2 months or until the wart was gone. The data follow:
Treatment                   n     Number with wart successfully removed
Liquid nitrogen freezing    100   60
Duct tape                   104   88
Do these data suggest that freezing is less successful than duct tape in removing warts?
Duct Tape Continued . . .
H0: $p_1 - p_2 = 0$
Ha: $p_1 - p_2 < 0$
where $p_1$ is the true proportion of warts that would be successfully removed by freezing and $p_2$ is the true proportion of warts that would be successfully removed by duct tape
Assumptions:
1) Subjects were randomly assigned to the two treatments.
2) The sample sizes are large enough because:
$n_1\hat{p}_1 = 100(.6) = 60 > 10$, $n_1(1 - \hat{p}_1) = 100(.4) = 40 > 10$
$n_2\hat{p}_2 = 104(.85) \approx 88 > 10$, $n_2(1 - \hat{p}_2) = 104(.15) \approx 16 > 10$
Duct Tape Continued . . .
$\hat{p}_c = \frac{60 + 88}{100 + 104} = .73$
$z = \frac{.6 - .85 - 0}{\sqrt{\frac{.73(.27)}{100} + \frac{.73(.27)}{104}}} = -4.03$
P-value ≈ 0, $\alpha$ = .01
Since the P-value < $\alpha$, we reject H0. There is convincing evidence to suggest the proportion of warts successfully removed is lower for freezing than for the duct tape treatment.
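The z test for the wart data can be sketched as follows, using only the standard library. Note one difference from the hand calculation: this sketch keeps unrounded proportions throughout, so its z is roughly -3.9 rather than the slide's -4.03, which is built from the rounded values .85, .73, and .27. The conclusion is the same either way.

```python
from math import sqrt
from statistics import NormalDist

n1, x1 = 100, 60    # liquid nitrogen freezing: sample size, successes
n2, x2 = 104, 88    # duct tape: sample size, successes

p1_hat = x1 / n1
p2_hat = x2 / n2

# Pooled estimate of the common proportion under H0: p1 - p2 = 0.
p_c = (x1 + x2) / (n1 + n2)

se = sqrt(p_c * (1 - p_c) / n1 + p_c * (1 - p_c) / n2)
z = (p1_hat - p2_hat - 0) / se

# Ha: p1 - p2 < 0, so the P-value is the area to the left of z.
p_value = NormalDist().cdf(z)

print(f"p_c = {p_c:.3f}, z = {z:.2f}, P-value = {p_value:.6f}")
```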
A Large-Sample Confidence Interval for p1 – p2
When
1) The samples are independently chosen random samples or treatments were assigned at random to individuals or objects
2) Both sample sizes are large: $n_1\hat{p}_1 > 10$, $n_1(1 - \hat{p}_1) > 10$, $n_2\hat{p}_2 > 10$, $n_2(1 - \hat{p}_2) > 10$
a large-sample confidence interval for $p_1 - p_2$ is
$(\hat{p}_1 - \hat{p}_2) \pm (z \text{ critical value})\sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$
The article “Freedom of What?” (Associated Press, February 1, 2005) described a study in which high school students and high school teachers were asked whether they agreed with the following statement: “Students should be allowed to report controversial issues in their student newspapers without the approval of school authorities.” It was reported that 58% of students surveyed and 39% of teachers surveyed agreed with the statement. The two samples – 10,000 high school students and 8000 high school teachers – were selected from schools across the country.
Compute a 90% confidence interval for the difference in proportion of students who agreed with the statement and the proportion of teachers who agreed with the statement.
Newspaper Problem Continued . . .
$\hat{p}_1 = .58$, $\hat{p}_2 = .39$
1) Assume that it is reasonable to regard these two samples as being independently selected and representative of the populations of interest.
2) Both sample sizes are large enough: $n_1\hat{p}_1 = 10000(.58) > 10$, $n_1(1 - \hat{p}_1) = 10000(.42) > 10$, $n_2\hat{p}_2 = 8000(.39) > 10$, $n_2(1 - \hat{p}_2) = 8000(.61) > 10$
$(.58 - .39) \pm 1.645\sqrt{\frac{.58(.42)}{10000} + \frac{.39(.61)}{8000}} \rightarrow (.178, .202)$
We are 90% confident that the difference in proportion of students who agreed with the statement and the proportion of teachers who agreed with the statement is between .178 and .202.
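This interval can be reproduced with a few lines of Python. The sketch below obtains the z critical value from the standard normal distribution instead of a table; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p1_hat, n1 = 0.58, 10000   # students who agreed
p2_hat, n2 = 0.39, 8000    # teachers who agreed
confidence = 0.90

# z critical value that leaves (1 - confidence)/2 in each tail.
z_critical = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Unpooled standard error, as used for the confidence interval.
se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)

lower = (p1_hat - p2_hat) - z_critical * se
upper = (p1_hat - p2_hat) + z_critical * se

print(f"z* = {z_critical:.3f}, interval = ({lower:.3f}, {upper:.3f})")
```

Unlike the hypothesis test, the interval uses the two sample proportions separately in the standard error because no common value of p1 and p2 is being assumed.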
Based on this confidence interval, does there appear to be a significant difference in proportion of students who agreed with the statement and the proportion of teachers who agreed with the statement? Explain.