Sampling Distributions of Proportions

•Remember the Skittles example.

•We calculated the proportion of orange Skittles & marked it on the dot plots on the board.

What shape did the n=40 dot plot have?

The dotplot was a partial graph of the sampling distribution of all sample proportions of sample size 40. If we found all the possible sample proportions, this would be approximately normal!

Sampling Distribution

• Choose an SRS of size n from a large population with population proportion p having some characteristic of interest. Let p-hat be the proportion of the sample having that characteristic.

• We need to come up with some formulas for the mean and standard deviation.

Suppose we have a population of six people: Melissa, Jake, Charles, Kelly, Mike, & Brian

What is the proportion of females? 1/3

What is the parameter of interest in this population? Proportion of females

Draw samples of two from this population.

How many different samples are possible? $_6C_2 = 15$

Find the 15 different samples that are possible & find the sample proportion of females in each sample.

Melissa & Jake .5
Melissa & Charles .5
Melissa & Kelly 1
Melissa & Mike .5
Melissa & Brian .5
Jake & Charles 0
Jake & Kelly .5
Jake & Mike 0
Jake & Brian 0
Charles & Kelly .5
Charles & Mike 0
Charles & Brian 0
Kelly & Mike .5
Kelly & Brian .5
Mike & Brian 0

Find the mean & standard deviation of all p-hats.

$\mu_{\hat{p}} = \frac{1}{3}$ & $\sigma_{\hat{p}} = 0.29814$

How does the mean of the sampling distribution ($\mu_{\hat{p}}$) compare to the population parameter (p)? $\mu_{\hat{p}} = p$
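The six-person example is small enough to enumerate by machine. The following is a minimal sketch, assuming a 0/1 coding for male/female (the dictionary and variable names are illustrative, not from the slides). It lists every sample of size two and computes the mean and standard deviation of the resulting p-hats.

```python
from itertools import combinations
from statistics import mean, pstdev

# Population from the slides; 1 marks a female, 0 a male (coding is an assumption).
population = {"Melissa": 1, "Jake": 0, "Charles": 0, "Kelly": 1, "Mike": 0, "Brian": 0}

# Every possible sample of size 2 (6C2 of them).
samples = list(combinations(population, 2))

# Sample proportion of females in each sample.
p_hats = [sum(population[name] for name in sample) / 2 for sample in samples]

mu_p_hat = mean(p_hats)        # mean of the sampling distribution
sigma_p_hat = pstdev(p_hats)   # population SD, since we hold ALL possible samples

print(len(samples), round(mu_p_hat, 5), round(sigma_p_hat, 5))
```

`pstdev` (population standard deviation) is used rather than `stdev` because the list contains every possible sample, not a sample of samples.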

Formulas:

$\hat{p} = \frac{X}{n}$

$\mu_{\hat{p}} = p$ (the mean of the sampling distribution)

$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$ (the standard deviation of the sampling distribution)

Does the standard deviation of the sampling distribution equal the equation? NO:

$\sigma_{\hat{p}} = \sqrt{\frac{\frac{1}{3}\left(\frac{2}{3}\right)}{2}} = \frac{1}{3} \neq 0.29814$

WHY?

We are sampling more than 10% of our population!

So, in order to calculate the standard deviation of the sampling distribution, we MUST be sure that our sample size is less than 10% of the population!
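One way to see why the 10% condition matters is to apply the finite population correction, a standard adjustment for sampling without replacement that the slides do not cover. The sketch below is offered under that assumption: multiplying the formula by sqrt((N - n)/(N - 1)) recovers the standard deviation found for the six-person population. The variable names are illustrative.

```python
from math import sqrt

N = 6        # population size
n = 2        # sample size (one third of the population, far above 10%)
p = 2 / 6    # population proportion of females

# Formula from the slides; it assumes the population is much larger than the sample.
sd_formula = sqrt(p * (1 - p) / n)

# Finite population correction (not on the slides) for sampling without replacement.
fpc = sqrt((N - n) / (N - 1))
sd_corrected = sd_formula * fpc

print(round(sd_formula, 5), round(sd_corrected, 5))
```

When N is at least 10 times n, the correction factor is close to 1, which is why the uncorrected formula is considered safe under the 10% rule of thumb.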

Assumptions (Rules of Thumb)

• Use this formula for standard deviation when the population is sufficiently large, at least 10 times as large as the sample.

• Sample size must be large enough to ensure a normal approximation can be used. We can use the normal approximation when

np > 10 & n(1 – p) > 10

Why does the second assumption ensure an approximate normal distribution?

Suppose n = 10 & p = 0.1 (probability of a success). A histogram of this distribution is strongly skewed right!

Remember back to binomial distributions.

Now use n = 100 & p = 0.1 (now np = 10!). While the histogram is still skewed right, look what happens to the tail!

np > 10 & n(1-p) > 10 ensures that the sample size is large enough to have a normal approximation!
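The change in shape between n = 10 and n = 100 can be checked numerically. This is a rough sketch that builds the binomial probabilities directly and reports the skewness coefficient (1 - 2p)/sqrt(np(1 - p)), a standard result for the binomial distribution that is not on the slides. The function names are hypothetical.

```python
from math import comb, sqrt

def binomial_pmf(n, p):
    """Return P(X = k) for k = 0..n when X ~ Binomial(n, p)."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def skewness(n, p):
    """Skewness of Binomial(n, p); 0 would mean perfectly symmetric."""
    return (1 - 2 * p) / sqrt(n * p * (1 - p))

for n in (10, 100):
    pmf = binomial_pmf(n, 0.1)
    mode = max(range(n + 1), key=lambda k: pmf[k])  # most likely success count
    print(f"n={n:3d}  np={n * 0.1:g}  most likely count={mode}  "
          f"skewness={skewness(n, 0.1):.3f}")
```

The skewness shrinks as n grows with p fixed, which is the numerical counterpart of the tail pulling in on the histogram.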

Based on past experience, a bank believes that 7% of the people who receive loans will not make payments on time. The bank recently approved 200 loans.

What are the mean and standard deviation of the proportion of clients in this group who may not make payments on time?

Are assumptions met?

What is the probability that over 10% of these clients will not make payments on time?

$\mu_{\hat{p}} = .07$

$\sigma_{\hat{p}} = \sqrt{\frac{.07(.93)}{200}} = .01804$

Yes: np = 200(.07) = 14 & n(1 - p) = 200(.93) = 186

Ncdf(.10, 1E99, .07, .01804) = .0482
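For readers without a calculator, the bank calculation can be reproduced with Python's standard library. This is a sketch only: `NormalDist` stands in for the calculator's Ncdf, and the variable names are assumptions.

```python
from math import sqrt
from statistics import NormalDist

p = 0.07   # believed proportion of clients who pay late
n = 200    # loans approved

mu = p
sigma = sqrt(p * (1 - p) / n)

# Large-counts check from the slides.
assert n * p > 10 and n * (1 - p) > 10

# P(p-hat > 0.10) under the normal approximation.
prob_over_10_percent = 1 - NormalDist(mu, sigma).cdf(0.10)

print(round(sigma, 5), round(prob_over_10_percent, 4))
```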

Example #1

A polling organization asks an SRS of 1500 first year college students whether they applied for admission to any other college. In fact, 35% of all first-year students applied to colleges besides the one they are attending. What is the probability that the random sample of 1500 students will give a result within 2 percentage points of the true value?

STATE, PLAN, DO, CONCLUDE

STATE: We want to know the probability that a random sample yields a result within 2 percentage points of the true proportion.

We want to determine $P(.33 \leq \hat{p} \leq .37)$

PLAN:

We have drawn an SRS of size 1500 from the population of interest.

The mean of the sampling distribution of p-hat is 0.35:

$\mu_{\hat{p}} = 0.35$

We can assume that the population of first-year college students is over 15,000, so we are safe to use the standard deviation formula:

$\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{(0.35)(0.65)}{1500}} = 0.0123$

In order to use a normal approximation for the sampling distribution, the expected number of successes and failures must be sufficiently large:

$np > 10$ and $n(1-p) > 10$

$1500(.35) > 10$ and $1500(.65) > 10$

Therefore,

$\hat{p} \sim N(0.35, 0.0123)$

DO: Perform a normal distribution calculation to find the desired probability: $P(.33 \leq \hat{p} \leq .37) = .8961$

CONCLUDE: About 90% of all SRSs of size 1500 will give a result within 2 percentage points of the true proportion.
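The DO step for the polling example can be sketched in Python as the difference of two cumulative probabilities. The variable names are illustrative. Because the code keeps the unrounded standard deviation, its result may differ from .8961 in the last decimal places; the slide value corresponds to the rounded 0.0123.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.35, 1500

# Standard deviation of the sampling distribution of p-hat.
sigma = sqrt(p * (1 - p) / n)

# Approximate sampling distribution: N(0.35, sigma).
sampling_dist = NormalDist(mu=p, sigma=sigma)

# P(0.33 <= p-hat <= 0.37), i.e. within 2 percentage points of p.
prob_within_2_points = sampling_dist.cdf(0.37) - sampling_dist.cdf(0.33)

print(round(sigma, 4), round(prob_within_2_points, 4))
```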

Suppose one student tossed a coin 200 times and found only 42% heads. Do you believe that this is likely to happen?

Find $\mu_{\hat{p}}$ & $\sigma_{\hat{p}}$ using the formulas.

np = 200(.5) = 100 & n(1-p) = 200(.5) = 100
Since both > 10, I can use a normal curve!

$\text{ncdf}\left(-\infty,\ .42,\ .5,\ \sqrt{\frac{.5(.5)}{200}}\right) = .0118$

No. Since there is approximately a 1% chance of this happening, I do not believe the student did this.
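For the coin-toss question, the normal approximation can be compared with the exact binomial probability of 84 or fewer heads in 200 tosses. The exact calculation is an addition here, not part of the slides, and the names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 200, 0.5
max_heads = 84   # 42% of 200 tosses

# Exact: P(X <= 84) for X ~ Binomial(200, 0.5); every outcome has weight 1/2^200.
exact = sum(comb(n, k) for k in range(max_heads + 1)) / 2**n

# Normal approximation used on the slides: P(p-hat <= 0.42).
approx = NormalDist(p, sqrt(p * (1 - p) / n)).cdf(0.42)

print(round(exact, 4), round(approx, 4))
```

Both figures should land near 1%, so the conclusion does not depend on which method is used.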

Example #2

Assume that 30% of the students at HH wear contacts. In a sample of 100 students, what is the probability that more than 35% of them wear contacts?

Check assumptions!

$\mu_{\hat{p}} = .3$ & $\sigma_{\hat{p}} = .045826$

np = 100(.3) = 30 & n(1-p) =100(.7) = 70

Ncdf(.35, 1E99, .3, .045826) = .1376
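The same steps (check the large-counts condition, compute the standard deviation, take an upper-tail normal probability) can be wrapped in one small helper. `upper_tail` is a hypothetical name, and refusing to answer when the condition fails is a design choice of this sketch rather than something the slides prescribe.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail(p, n, cutoff):
    """P(p-hat > cutoff) by normal approximation, after checking large counts."""
    if n * p <= 10 or n * (1 - p) <= 10:
        raise ValueError("np and n(1 - p) must both exceed 10")
    sigma = sqrt(p * (1 - p) / n)
    return 1 - NormalDist(p, sigma).cdf(cutoff)

# Contacts example: p = .30, n = 100, probability that p-hat exceeds .35.
print(round(upper_tail(0.30, 100, 0.35), 4))
```

Calling it with n = 10 and p = 0.1 raises an error, which mirrors the earlier point that such a distribution is too skewed for a normal curve.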

Example #3

Example #4 (Your turn)

• About 11% of American adults are black. Therefore, the proportion of blacks in an SRS of 1500 adults should be close to .11.

If a national sample contains only 9.2% black, should we suspect that the sampling procedure is somehow under-representing blacks?
