Section 9.1 Introduction to Inference


Question for you…

With replacement or without replacement?


Statistical Inferences

Draw conclusions about a population based on data about a sample.

Ask questions about a number that describes a population.

Numbers which describe populations are called… Parameters

To estimate a parameter, choose a sample from the population and use a statistic (a number calculated from the sample).


Sample Proportions

Population proportion: p

Sample proportion: $\hat{p}$ (also known as p-hat). We've seen this before: it's the count in the sample for which the characteristic applied (i.e., who said yes) divided by the sample size.

Example: The 2001 Youth Risk Behavioral Survey questioned a nationally representative sample of 12,960 students in grades 9-12. Of these, 3340 said they had smoked cigarettes at least one day in the past month.

Who is the population? How do you write the sample proportion?


The Answers

The population is high school students in the United States.

The sample is the 12,960 students surveyed.

The sample proportion (p-hat) …

Give all answers to 4 decimal places.

$$\hat{p} = \frac{\text{count in the sample}}{\text{size of the sample}} = \frac{3340}{12{,}960} \approx 0.2577$$
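The same calculation can be sketched in a few lines of Python. The variable names here are illustrative assumptions, not part of the original slides.

```python
# Sample proportion for the survey example: 3340 "yes" answers out of 12,960 students.
count_yes = 3340      # students who said they smoked in the past month
sample_size = 12960   # students surveyed

p_hat = count_yes / sample_size
p_hat_rounded = round(p_hat, 4)    # the slides ask for 4 decimal places

print(f"p-hat = {p_hat_rounded}")  # prints: p-hat = 0.2577
```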


What it means…

The parameter is the proportion that have smoked cigarettes in the past month. (This is usually an unknown value).

Using the value of p-hat, we can make generalized statements about the population.

Based on this sample, we can estimate that the proportion of all high school students who have smoked cigarettes in the past month is about 25.77%.


Confidence Intervals

In the last example, the estimate was about 25.77%. To capture the true population parameter in 95% of all samples, use a 95% confidence interval…

Remember that under the 68-95-99.7 rule, 95% corresponds to 2 standard deviations in each direction.

This is the formula we will use…

$$\hat{p} \pm 2\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$


Confidence Intervals

Using this formula will give two endpoints for a confidence interval. For 95% of all samples, the interval built this way will contain the true population proportion.

In this formula, the square root term represents the standard deviation of p-hat. n stands for the number in the sample.


Example

Remember, p-hat=0.2577 and n=12960.


The answer…

This interval catches the true unknown population proportion in 95% of all samples.

In other words, we are 95% confident that the true proportion of high school students who have smoked cigarettes at least one day in the past month is between 25.01% and 26.53%.
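The endpoints quoted here can be reproduced with the following arithmetic. This is a sketch that assumes the standard deviation is rounded to 4 decimal places before multiplying by 2, in line with the earlier instruction to give answers to 4 decimal places.

$$\sqrt{\frac{0.2577(1-0.2577)}{12960}} \approx 0.0038$$

$$0.2577 \pm 2(0.0038) = 0.2577 \pm 0.0076 \quad\Rightarrow\quad (0.2501,\ 0.2653)$$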


Another way you may see it…

Estimate ± Margin of Error

What would the confidence interval be here?


Critical Values

So far, we have used 68% = 1 std. dev., 95% = 2 std. dev., and 99.7% = 3 std. dev.

These were approximations. More accurate values are critical values, denoted z* (see the table on page 502).
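If the textbook table is not at hand, critical values can also be computed from the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`. The helper name `z_star` is an assumption made for illustration, and the printed values may differ from the textbook table in the last digit because of rounding.

```python
from statistics import NormalDist

def z_star(confidence: float) -> float:
    """Critical value z*: the standard normal area between -z* and +z* equals `confidence`."""
    # The central area `confidence` leaves (1 - confidence) / 2 in each tail,
    # so z* is the point with cumulative area (1 + confidence) / 2 below it.
    return NormalDist().inv_cdf((1 + confidence) / 2)

levels = [0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99, 0.999]
for level in levels:
    print(f"{level:.1%}: z* = {z_star(level):.3f}")
```

For 95% confidence this gives approximately 1.960, which is why the "2 standard deviations" rule is only an approximation.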


So this changes the equation…

z* now takes the place of the number of standard deviations (the 2) in front of the square root.

$$\hat{p} \pm z^{*}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$


Critical Values Example…using the same data as before.
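A minimal sketch of this example in Python, using p-hat = 0.2577, n = 12960, and z* = 1.96. The function name `proportion_ci` and the `round_sd` flag are assumptions introduced for illustration. The flag mimics rounding the standard deviation to 4 decimal places (0.0038) before multiplying, which appears to be how the slides arrive at their endpoints.

```python
from math import sqrt

def proportion_ci(p_hat: float, n: int, z_star: float, round_sd: bool = False):
    """Return (lower, upper) for p-hat +/- z* * sqrt(p-hat * (1 - p-hat) / n)."""
    sd = sqrt(p_hat * (1 - p_hat) / n)   # standard deviation of p-hat
    if round_sd:
        sd = round(sd, 4)                # assumption: intermediate rounding to 4 places
    margin = z_star * sd                 # margin of error
    return p_hat - margin, p_hat + margin

lower, upper = proportion_ci(0.2577, 12960, 1.96)
lower_r, upper_r = proportion_ci(0.2577, 12960, 1.96, round_sd=True)

print(f"unrounded sd: ({lower:.4f}, {upper:.4f})")      # (0.2502, 0.2652)
print(f"rounded sd:   ({lower_r:.4f}, {upper_r:.4f})")  # (0.2503, 0.2651)
```

The two results differ only in the fourth decimal place. The difference comes entirely from when the rounding happens, not from the formula.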


Conclusions

We are 95% confident that the true proportion of high school students who have smoked cigarettes at least one day in the past month is between 25.03% and 26.51%.

Earlier, using “2” for the number of standard deviations instead of “1.96” as the critical value z*, we stated:

We are 95% confident that the true proportion of high school students who have smoked cigarettes at least one day in the past month is between 25.01% and 26.53%.


Conclusions

While this is a very small difference, the benefit of using critical values is that there are more confidence levels to choose from (8) than the 3 offered by the 68-95-99.7 rule.

Critical Values z* can be used for the following confidence levels: 50%, 60%, 70%, 80%, 90%, 95%, 99%, and 99.9%.

Standard Deviations can be used for 68%, 95%, and 99.7%


Comparing critical values…

Remember, the larger the critical value, the wider the interval. If you don't need a high level of confidence, you can use a smaller one (like 50%). A 50% interval is narrower, but it captures the true unknown population proportion in only 50% of all samples.

50%: (0.2551, 0.2603)
60%: (0.2545, 0.2609)
70%: (0.2537, 0.2617)
80%: (0.2528, 0.2626)
90%: (0.2514, 0.2640)
95%: (0.2503, 0.2651)
99%: (0.2478, 0.2676)
99.9%: (0.2451, 0.2704)

So the narrowest interval is 25.51% to 26.03% and the widest interval is 24.51% to 27.04%.
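The widening can be checked with a short Python sketch that builds all eight intervals from the survey data. It assumes unrounded values for p-hat, the standard deviation, and z* (computed with `statistics.NormalDist`), so an endpoint may differ from the list above in the last decimal place.

```python
from math import sqrt
from statistics import NormalDist

n = 12960
p_hat = 3340 / n
sd = sqrt(p_hat * (1 - p_hat) / n)   # standard deviation of p-hat

intervals = {}
for level in [0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99, 0.999]:
    z = NormalDist().inv_cdf((1 + level) / 2)   # critical value z* for this level
    intervals[level] = (p_hat - z * sd, p_hat + z * sd)
    lo, hi = intervals[level]
    print(f"{level:.1%}: ({lo:.4f}, {hi:.4f})  width = {hi - lo:.4f}")
```

Each step up in confidence level produces a wider interval, which is the trade-off described on this slide.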


Quiz…

What is the formula for p-hat?

$$\hat{p} = \frac{\text{count in the sample}}{\text{size of the sample}}$$


Quiz…

What is the formula for a confidence interval, using z* (just the basic formula—no numbers)?

$$\hat{p} \pm z^{*}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$


Quiz…

What does n stand for in the equation?

The sample size.


Quiz…

How many critical values (z* values) are there to choose from?

8: the z* values for the 50%, 60%, 70%, 80%, 90%, 95%, 99%, and 99.9% confidence levels.


Homework

Pages 495-496, #9.1-9.5

Page 505, #9.9, 9.10