© 2010 Pearson Prentice Hall. All rights reserved Confidence Intervals for the Population Mean When the Population Standard Deviation is Unknown



As we discussed earlier, a confidence interval is nothing more than a range of numbers we might expect the population parameter to take on. This range was constructed by the general formula:

Point estimate ± margin of error.

In this instance, our point estimate will be the sample mean, $\bar{x}$.

Because we do not have data on the entire population, we cannot know the true population standard deviation, so we will have to estimate its value with the sample standard deviation.

Student’s t-Distribution

Suppose that a simple random sample of size n is taken from a population. If the population from which the sample is drawn follows a normal distribution, the distribution of

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

follows Student’s t-distribution with n-1 degrees of freedom, where $\bar{x}$ is the sample mean and s is the sample standard deviation.

Parallel Example 1: Comparing the Standard Normal Distribution to the t-Distribution Using Simulation

a) Obtain 1,000 simple random samples of size n = 5 from a normal population with $\mu = 50$ and $\sigma = 10$.

b) Determine the sample mean and sample standard deviation for each of the samples.

c) Compute $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$ and $t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$ for each sample.

d) Draw a histogram for both z and t.
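Steps (a) through (c) of the simulation can be sketched in Python using only the standard library. This is a minimal illustration, not the software used for the slides: the function names `simulate` and `tail_fraction`, the seed, and the cutoff of 2 are assumptions made here. In place of the histograms in step (d), it reports the fraction of values falling outside ±2, which should be noticeably larger for t than for z.

```python
import math
import random
import statistics

# Values taken from the example: normal population, mu = 50, sigma = 10, n = 5.
MU, SIGMA, N, NUM_SAMPLES = 50, 10, 5, 1000


def simulate(seed=0):
    """Draw NUM_SAMPLES samples of size N and return the z and t statistics."""
    rng = random.Random(seed)  # the seed is an arbitrary choice for repeatability
    z_values, t_values = [], []
    for _ in range(NUM_SAMPLES):
        sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
        xbar = statistics.mean(sample)
        s = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
        z_values.append((xbar - MU) / (SIGMA / math.sqrt(N)))
        t_values.append((xbar - MU) / (s / math.sqrt(N)))
    return z_values, t_values


def tail_fraction(values, cutoff=2.0):
    """Fraction of values whose absolute value exceeds the cutoff."""
    return sum(abs(v) > cutoff for v in values) / len(values)


z_values, t_values = simulate()
print("fraction of |z| > 2:", tail_fraction(z_values))
print("fraction of |t| > 2:", tail_fraction(t_values))
```

The two lists can be passed to any plotting tool to draw the histograms asked for in step (d).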

[Figure: Histogram for z]

[Figure: Histogram for t]

CONCLUSIONS:

• The histogram for z is symmetric and bell-shaped with the center of the distribution at 0 and virtually all the rectangles between -3 and 3. In other words, z follows a standard normal distribution.

• The histogram for t is also symmetric and bell-shaped with the center of the distribution at 0, but the distribution of t has longer tails (i.e., t is more dispersed), so it is unlikely that t follows a standard normal distribution. The additional spread in the distribution of t can be attributed to the fact that we use s to find t instead of $\sigma$. Because the sample standard deviation is itself a random variable (rather than a constant such as $\sigma$), we have more dispersion in the distribution of t.

Properties of the t-Distribution

1. The t-distribution is different for different degrees of freedom.

2. The t-distribution is centered at 0 and is symmetric about 0.

3. The area under the curve is 1. The area under the curve to the right of 0 equals the area under the curve to the left of 0, which equals 1/2.

4. As t increases without bound, the graph approaches, but never equals, zero. As t decreases without bound, the graph approaches, but never equals, zero.

5. The area in the tails of the t-distribution is a little greater than the area in the tails of the standard normal distribution, because we are using s as an estimate of $\sigma$, thereby introducing further variability into the t-statistic.

6. As the sample size n increases, the density curve of t gets closer to the standard normal density curve. This result occurs because, as the sample size n increases, the values of s get closer to the values of $\sigma$, by the Law of Large Numbers.
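Properties 5 and 6 can be checked numerically. The sketch below assumes SciPy is available and uses its `scipy.stats.t.sf` and `scipy.stats.norm.sf` survival functions; the helper name `upper_tail_areas`, the cutoff of 2, and the list of degrees of freedom are illustrative choices, not part of the original slides.

```python
from scipy import stats


def upper_tail_areas(cutoff=2.0, dfs=(4, 10, 30, 100, 1000)):
    """Area to the right of `cutoff` under the t-distribution for each df."""
    return {df: stats.t.sf(cutoff, df) for df in dfs}


# Area to the right of 2 under the standard normal curve, for comparison.
normal_tail = stats.norm.sf(2.0)
t_tails = upper_tail_areas()

for df, area in t_tails.items():
    print(f"df = {df:>4}: P(t > 2) = {area:.5f}")
print(f"normal   : P(z > 2) = {normal_tail:.5f}")
```

Each t tail area should exceed the normal tail area (property 5), and the gap should shrink as the degrees of freedom grow (property 6).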


Parallel Example 2: Finding t-values

Find the t-value such that the area under the t-distribution to the right of the t-value is 0.2, assuming 10 degrees of freedom. That is, find $t_{0.20}$ with 10 degrees of freedom.

Solution

The figure shows the graph of the t-distribution with 10 degrees of freedom. The unknown value of t is labeled, and the area under the curve to the right of t is shaded. The value of $t_{0.20}$ with 10 degrees of freedom is 0.8791.
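Instead of a printed t-table, the same value can be looked up in software. A minimal sketch, assuming SciPy is installed: `scipy.stats.t.isf` returns the value with a given area to its right, and the wrapper name `t_critical` is introduced here only for illustration.

```python
from scipy import stats


def t_critical(area_right, df):
    """t-value with `area_right` of the area under the curve to its right."""
    return stats.t.isf(area_right, df)


t_020 = t_critical(0.20, 10)
print(round(t_020, 4))
```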

Constructing a $(1-\alpha) \cdot 100\%$ Confidence Interval for $\mu$, $\sigma$ Unknown

Suppose that a simple random sample of size n is taken from a population with unknown mean $\mu$ and unknown standard deviation $\sigma$. A $(1-\alpha) \cdot 100\%$ confidence interval for $\mu$ is given by

Lower bound: $\bar{x} - t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$

Upper bound: $\bar{x} + t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$

where $t_{\alpha/2}$ is computed with n-1 degrees of freedom.

Note: The interval is exact when the population is normally distributed. It is approximately correct for nonnormal populations, provided that n is large enough.
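The lower and upper bound formulas translate directly into code. The sketch below assumes SciPy is available for the critical value (`scipy.stats.t.ppf`); the function name `t_confidence_interval` and the summary numbers in the usage lines are hypothetical and serve only to show the call.

```python
import math

from scipy import stats


def t_confidence_interval(xbar, s, n, confidence=0.95):
    """Return (lower, upper) bounds of a t-interval for the population mean.

    xbar: sample mean, s: sample standard deviation, n: sample size.
    """
    alpha = 1 - confidence
    # t_{alpha/2} with n - 1 degrees of freedom: area alpha/2 lies to its right.
    t_crit = stats.t.ppf(1 - alpha / 2, n - 1)
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Hypothetical summary statistics, used only to demonstrate the call.
lower, upper = t_confidence_interval(xbar=20.0, s=4.0, n=16, confidence=0.95)
print(f"({lower:.2f}, {upper:.2f})")
```

Working from summary statistics rather than raw data keeps the function aligned with the formula; the normality and outlier checks described in the note still have to be done on the raw data.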


Parallel Example 3: Constructing a Confidence Interval about a Population Mean

The pasteurization process reduces the amount of bacteria found in dairy products, such as milk. The following data represent the counts of bacteria in pasteurized milk (in CFU/mL) for a random sample of 12 glasses of pasteurized milk. Data courtesy of Dr. Michael Lee, Professor, Joliet Junior College.

Construct a 95% confidence interval for the mean bacteria count.

NOTE: Each observation is in tens of thousands. So, 9.06 represents $9.06 \times 10^4$.

Solution: Checking Normality and Existence of Outliers

[Figure: Normal probability plot for CFU/mL]

[Figure: Boxplot of CFU/mL]

$\bar{x} = 6.41$ and $s = 4.55$

$\alpha = 0.05$, $n = 12$, so $t_{0.05/2} = 2.201$

Lower bound: $6.41 - 2.201 \cdot \frac{4.55}{\sqrt{12}} = 3.52$

Upper bound: $6.41 + 2.201 \cdot \frac{4.55}{\sqrt{12}} = 9.30$


The 95% confidence interval for the mean bacteria count in pasteurized milk is (3.52, 9.30).

Interpretation: We are 95% confident that the true mean bacteria count in pasteurized milk is between 3.52 and 9.30 (in tens of thousands of CFU/mL).


Parallel Example 5: The Effect of Outliers

Suppose a student miscalculated the amount of bacteria and recorded a result of $2.3 \times 10^5$. We would include this value in the data set as 23.0.

What effect does this additional observation have on the 95% confidence interval?

Solution: Checking Normality and Existence of Outliers

[Figure: Boxplot of CFU/mL]

Solution

$\bar{x} = 7.69$ and $s = 6.34$

$\alpha = 0.05$, $n = 13$, so $t_{0.05/2} = 2.179$

Lower bound: $7.69 - 2.179 \cdot \frac{6.34}{\sqrt{13}} = 3.86$

Upper bound: $7.69 + 2.179 \cdot \frac{6.34}{\sqrt{13}} = 11.52$

The 95% confidence interval for the mean bacteria count in pasteurized milk, including the outlier, is (3.86, 11.52).

CONCLUSIONS:

• With the outlier, the sample mean is larger because the sample mean is not resistant.

• With the outlier, the sample standard deviation is larger because the sample standard deviation is not resistant.

• Without the outlier, the width of the interval decreased from 7.66 to 5.78.

                   Sample mean   s      95% CI
Without outlier    6.41          4.55   (3.52, 9.30)
With outlier       7.69          6.34   (3.86, 11.52)
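The comparison in the table can be reproduced from the summary statistics alone. The sketch below uses only the Python standard library and takes the critical values 2.201 and 2.179 as given on the slides rather than computing them; the function name `interval_from_summary` is an illustrative assumption.

```python
import math


def interval_from_summary(xbar, s, n, t_crit):
    """t-interval bounds from summary statistics and a tabulated critical value."""
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin


# Summary statistics and critical values as reported in the two examples.
without_outlier = interval_from_summary(6.41, 4.55, 12, 2.201)
with_outlier = interval_from_summary(7.69, 6.34, 13, 2.179)

for label, (low, high) in [("Without outlier", without_outlier),
                           ("With outlier", with_outlier)]:
    print(f"{label}: ({low:.2f}, {high:.2f}), width {high - low:.2f}")
```

The width is twice the margin of error, so the larger s in the sample with the outlier widens the interval even though n is larger and the critical value is slightly smaller.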