Chapter 18 – Sampling Distribution Models

Page 1: Chapter 18 – Sampling Distribution Models

Chapter 18 – Sampling Distribution Models

Page 2: Chapter 18 – Sampling Distribution Models

How accurate is our sample?

Sometimes different polls show different results for the same question.

Since each poll samples a different group of people, we should expect some variation in the results.

We could try drawing lots of samples and looking at the variation amongst those samples.

Page 3: Chapter 18 – Sampling Distribution Models

Experiment: Simulating a sample

A recent US Census Bureau study (source) reports that about 30% of Americans 25 or older have a Bachelor’s degree.

Open a blank Minitab worksheet and generate some random data:

Calc > Random Data > Bernoulli

Enter 200 rows

Store in columns C1-C20

Event probability: 0.3
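For readers without Minitab, the same random data can be sketched in Python. This is only an illustrative stand-in, assuming NumPy is available; the names `rng`, `P`, `N_ROWS`, `N_COLS`, and `data` are made up for this sketch and do not come from the slides.

```python
import numpy as np

# Illustrative names only; nothing here is part of the Minitab workflow.
rng = np.random.default_rng(seed=18)  # fixed seed so the run is repeatable

P = 0.3        # event probability (about 30% hold a Bachelor's degree)
N_ROWS = 200   # each row will later be treated as one sample
N_COLS = 20    # plays the role of columns C1-C20

# A Bernoulli(p) draw is a binomial draw with a single trial.
data = rng.binomial(n=1, p=P, size=(N_ROWS, N_COLS))

print(data.shape)   # rows by columns
print(data.mean())  # overall fraction of 1s, expected to be near 0.3
```

Each 1 stands for a person with a Bachelor's degree and each 0 for a person without one.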

Page 4: Chapter 18 – Sampling Distribution Models

Proportion estimates for samples of size 5

We can treat each row as a sample and calculate the proportion of each sample using the mean.

Samples of size 5: Calc > Row Statistics > Mean

Input Variables: C1 – C5 Store result in: C21

Look at these sample proportions. Are they close to the population proportion of 30%?

Draw a histogram of the sample proportions in C21
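A rough Python analogue of this step, assuming NumPy and a simulated 200 x 20 grid of 0/1 values standing in for C1-C20 (the names `data`, `p_hat_5`, `values`, and `counts` are illustrative):

```python
import numpy as np

rng = np.random.default_rng(seed=18)
data = rng.binomial(n=1, p=0.3, size=(200, 20))  # stand-in for C1-C20

# The row mean of the first 5 columns is the sample proportion for a
# sample of size 5 (the analogue of storing row means of C1-C5 in C21).
p_hat_5 = data[:, :5].mean(axis=1)

# With n = 5 only six values are possible: 0, 0.2, 0.4, 0.6, 0.8, 1.0.
values, counts = np.unique(p_hat_5, return_counts=True)
for v, c in zip(values, counts):
    print(f"{v:.1f} {'*' * int(c)}")  # crude text histogram
```

Because only a handful of values are possible, individual sample proportions can land far from 0.30 when n is this small.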

Page 5: Chapter 18 – Sampling Distribution Models

Proportion estimates for samples of size 10

Samples of size 10: Calc > Row Statistics > Mean

Input Variables: C1 – C10 Store result in: C22

Look at these sample proportions. Are they close to the population proportion of 30%?

Draw a histogram of the sample proportions in C22

Page 6: Chapter 18 – Sampling Distribution Models

Proportion estimates for samples of size 20

Samples of size 20: Calc > Row Statistics > Mean

Input Variables: C1 – C20 Store result in: C23

Look at these sample proportions. Are they close to the population proportion of 30%?

Draw a histogram of the sample proportions in C23
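To compare the three sample sizes side by side, a sketch like the following may help. It assumes NumPy and simulated data in place of the Minitab worksheet; `summary` and the other names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=18)
data = rng.binomial(n=1, p=0.3, size=(200, 20))  # stand-in for C1-C20

summary = {}
for n in (5, 10, 20):
    # Sample proportions from the first n columns (analogues of C21-C23).
    p_hat = data[:, :n].mean(axis=1)
    summary[n] = (p_hat.mean(), p_hat.std(ddof=1))
    print(f"n={n:>2}: mean of p-hats = {summary[n][0]:.3f}, "
          f"SD of p-hats = {summary[n][1]:.3f}")
```

The means should all sit near 0.30, while the spread of the sample proportions shrinks as the sample size grows.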

Page 7: Chapter 18 – Sampling Distribution Models

Sampling Distribution Model for a Proportion

Our histogram of the sample proportions started to look like a Normal model

The larger our sample size gets, the better the Normal model works

Assumptions:

Independence: sampled values must be independent of each other

Sample Size: n must be large enough

Page 8: Chapter 18 – Sampling Distribution Models

Conditions to check for assumptions

Randomization Condition: Experiments should have treatments randomly assigned. Survey samples should be a simple random sample or an otherwise representative, unbiased sample.

10% Condition: Sample size n must be no more than 10% of the population.

Success/Failure Condition: Sample size needs to be large enough to expect at least 10 successes and 10 failures.
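The two numeric conditions can be sketched as a small helper. The function name `check_proportion_conditions` and the population size of 1,000,000 are assumptions made for illustration; the Randomization Condition depends on how the data were collected and cannot be checked by arithmetic.

```python
def check_proportion_conditions(n, p, population_size):
    """Check the 10% and Success/Failure conditions for a sample proportion."""
    expected_successes = n * p
    expected_failures = n * (1 - p)
    return {
        "ten_percent": n <= 0.10 * population_size,
        "success_failure": expected_successes >= 10 and expected_failures >= 10,
    }

# Largest sample from the simulation: n = 20, p = 0.3, so only about
# 6 successes are expected and the Success/Failure condition fails.
print(check_proportion_conditions(n=20, p=0.3, population_size=1_000_000))

# A larger sample (n = 50) expects about 15 successes and 35 failures.
print(check_proportion_conditions(n=50, p=0.3, population_size=1_000_000))
```

This suggests why the histograms for samples of size 5, 10, and 20 only started to look Normal.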

Page 9: Chapter 18 – Sampling Distribution Models

Sampling Distribution Model for a Proportion

If the sampled values are independent and the sample size is large enough, the sampling distribution of $\hat{p}$ is modeled by a Normal model with:

Mean: $\mu(\hat{p}) = p$

Standard deviation: $SD(\hat{p}) = \sqrt{\dfrac{pq}{n}}$, where $q = 1 - p$
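As a quick illustration, the standard deviation $\sqrt{pq/n}$ (with $q = 1 - p$) can be evaluated for the sample sizes used in the earlier simulation, where p = 0.3. The helper name `sd_of_proportion` is made up for this sketch.

```python
import math

def sd_of_proportion(p, n):
    """Standard deviation of a sample proportion: sqrt(p*q/n), with q = 1 - p."""
    return math.sqrt(p * (1 - p) / n)

for n in (5, 10, 20):
    print(n, round(sd_of_proportion(0.3, n), 3))
```

Quadrupling the sample size from 5 to 20 halves the standard deviation, since n sits under a square root.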

Page 10: Chapter 18 – Sampling Distribution Models

Example: Proportion of Vegetarians

7% of the US population is estimated to be vegetarian. If a random sample of 200 people resulted in 20 people reporting themselves as vegetarians, is this an unusually high proportion?

Conditions: Randomization, 10% Condition, Success/Failure
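One way to work through these checks, using only the numbers given in the example: the sample is described as random, 200 people is far less than 10% of the US population, and the expected counts of successes and failures are:

```latex
np = 200(0.07) = 14 \ge 10, \qquad nq = 200(0.93) = 186 \ge 10
```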

Page 11: Chapter 18 – Sampling Distribution Models

Vegetarians Example continued

Since our conditions were met, it’s ok to use a Normal model.

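$\hat{p} = 20/200 = 0.10$

$E(\hat{p}) = p = 0.07$

$SD(\hat{p}) = \sqrt{\dfrac{pq}{n}} = \sqrt{\dfrac{(0.07)(0.93)}{200}} \approx 0.018$

$z = \dfrac{\hat{p} - p}{SD(\hat{p})} = \dfrac{0.10 - 0.07}{0.018} \approx 1.67$

This result is within 2 SDs of the mean, so it is not unusual.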
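The same arithmetic can be checked in a few lines of Python. This is a minimal sketch using the numbers from the example; the variable names are illustrative.

```python
import math

p = 0.07        # population proportion of vegetarians from the example
n = 200         # sample size
p_hat = 20 / n  # observed sample proportion, 0.10

sd = math.sqrt(p * (1 - p) / n)  # SD of the sample proportion
z = (p_hat - p) / sd             # how many SDs p_hat lies above p

print(round(sd, 3), round(z, 2))
print("unusual" if abs(z) > 2 else "not unusual")
```

Using the unrounded SD gives a z-score of about 1.66 rather than 1.67; either way the result is within 2 SDs of the mean.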

Page 12: Chapter 18 – Sampling Distribution Models

68-95-99.7 Rule with Vegetarians

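[Figure: Normal model for the sample proportion, centered at p with the axis marked in steps of σ from −3σ to 3σ; the central regions are labeled 68%, 95%, and 99.7%]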
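Applied to the vegetarians example (p = 0.07, SD of about 0.018), the 68-95-99.7 Rule gives the following ranges. This is a sketch with illustrative names, not output from the slides.

```python
p, sd = 0.07, 0.018  # values from the vegetarians example

intervals = []
for k, pct in [(1, "68%"), (2, "95%"), (3, "99.7%")]:
    lo, hi = p - k * sd, p + k * sd
    intervals.append((lo, hi))
    print(f"about {pct} of sample proportions fall in [{lo:.3f}, {hi:.3f}]")
```

The observed proportion of 0.10 lies inside the 2-SD range but outside the 1-SD range.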

Page 13: Chapter 18 – Sampling Distribution Models

Sampling Distribution of a Mean

Rolling dice simulation: 10,000 individual rolls recorded

Figure from DeVeaux, Intro to Stats

Page 14: Chapter 18 – Sampling Distribution Models

Sampling Distribution of a Mean

Rolling 2 dice 10,000 times and averaging the dice

Figure from DeVeaux, Intro to Stats

Page 15: Chapter 18 – Sampling Distribution Models

Sampling Distribution of a Mean

Rolling 3 dice 10,000 times and averaging dice

Figure from DeVeaux, Intro to Stats

Page 16: Chapter 18 – Sampling Distribution Models

Sampling Distribution of a Mean

Rolling 5 dice 10,000 times and averaging

Figure from DeVeaux, Intro to Stats

Page 17: Chapter 18 – Sampling Distribution Models

Sampling Distribution of a Mean

Rolling 20 dice 10,000 times and averaging

Once again, as the sample size increases, the Normal model appears.

Figure from DeVeaux, Intro to Stats
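The dice simulations on the preceding slides can be approximated with a short sketch, assuming NumPy and fair six-sided dice. The names `results` and `N_TRIALS` are illustrative, and this is not the code behind the textbook figures.

```python
import numpy as np

rng = np.random.default_rng(seed=6)
N_TRIALS = 10_000

results = {}
for n_dice in (1, 2, 3, 5, 20):
    # Each row is one trial: roll n_dice fair dice (faces 1 to 6).
    rolls = rng.integers(1, 7, size=(N_TRIALS, n_dice))
    means = rolls.mean(axis=1)  # average of the dice in each trial
    results[n_dice] = (means.mean(), means.std(ddof=1))
    print(f"{n_dice:>2} dice: mean of averages = {results[n_dice][0]:.3f}, "
          f"SD of averages = {results[n_dice][1]:.3f}")
```

The averages stay centered near 3.5 while their spread narrows as more dice are averaged; a histogram of `means` for each case shows the shape moving from flat toward bell-shaped.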

Page 18: Chapter 18 – Sampling Distribution Models

Central Limit Theorem

The sampling distribution of any mean becomes more nearly Normal as the sample size grows.

The larger the sample, the better the approximation will be.

Observations need to be independent and collected with randomization.
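To see the "any mean" part of the theorem in action, one can start from a population that is far from Normal. The sketch below assumes NumPy and uses a strongly right-skewed Exponential population, which is a choice made for this illustration and does not appear in the slides; the helper `skewness` is also illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=2024)
N_TRIALS = 10_000

def skewness(x):
    """Sample skewness: close to 0 for a symmetric, Normal-looking shape."""
    centered = x - x.mean()
    return (centered**3).mean() / x.std()**3

skew = {}
for n in (1, 5, 50):
    # Each row is one independent random sample of size n.
    samples = rng.exponential(scale=1.0, size=(N_TRIALS, n))
    sample_means = samples.mean(axis=1)
    skew[n] = skewness(sample_means)
    print(f"n={n:>2}: skewness of sample means = {skew[n]:.2f}")
```

The skewness of the sample means drops toward 0 as n grows, which is one sign that their distribution is becoming more nearly Normal.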

Page 19: Chapter 18 – Sampling Distribution Models

CLT Assumptions

Assumptions:

Independence: sampled values must be independent

Sample Size: sample size must be large enough

Conditions:

Randomization

10% Condition

Large enough sample

Page 20: Chapter 18 – Sampling Distribution Models

Which Normal Model to use?

The Normal Model depends on a mean and sd

Sampling Distribution Model for a Mean

When a random sample is drawn from any population with mean $\mu$ and standard deviation $\sigma$, its sample mean $\bar{y}$ has a sampling distribution with:

Mean: $\mu$

Standard deviation: $SD(\bar{y}) = \dfrac{\sigma}{\sqrt{n}}$

Page 21: Chapter 18 – Sampling Distribution Models

Example: CEO compensation

800 CEOs: Mean (in thousands) = 10,307.31, SD (in thousands) = 17,964.62

Samples of size 50 were drawn with: Mean = 10,343.93, SD = 2,483.84

Samples of size 100 were drawn with: Mean = 10,329.94, SD = 1,779.18

According to the CLT, what should the theoretical mean and SD be?

Example from DeVeaux, Intro to Stats
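One way to answer the question is to apply the sampling distribution model for a mean (mean µ, standard deviation σ/√n) to the population figures given on the slide. The sketch below is illustrative; the names `pop_mean`, `pop_sd`, `observed`, and `theory_sd` are not from the source.

```python
import math

pop_mean = 10_307.31  # population mean, in thousands, from the slide
pop_sd = 17_964.62    # population SD, in thousands, from the slide

# Simulated (mean, SD) of the sample means reported on the slide.
observed = {50: (10_343.93, 2_483.84), 100: (10_329.94, 1_779.18)}

theory_sd = {}
for n, (sim_mean, sim_sd) in observed.items():
    theory_sd[n] = pop_sd / math.sqrt(n)  # sigma / sqrt(n)
    print(f"n={n:>3}: theoretical mean = {pop_mean:,.2f} (simulated {sim_mean:,.2f}), "
          f"theoretical SD = {theory_sd[n]:,.2f} (simulated {sim_sd:,.2f})")
```

The theoretical SDs come out to roughly 2,540.58 for n = 50 and 1,796.46 for n = 100, both close to the simulated values.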