Chapter 9: Sampling Distributions

“It has been proved beyond a shadow of a doubt that smoking is one of the leading causes of statistics.”

Fletcher Knebel

9.1 Sampling Distributions (pp. 456-469)

Samples: examined in order to come to a reasonable conclusion about the population from which the sample is chosen

To glean meaningful information, one must be statistically literate

Must have an awareness of what the sample results tell us and don’t tell us

A statistic calculated from a sample may suffer from bias or high variability

Such a statistic is not a good estimate of a population parameter

Vocabulary Review

Parameter: an index that is related to a population

Statistic: an index that is related to a sample

Sampling distribution of a statistic: the distribution of values of a statistic taken from all possible samples of a specific size

A statistic is unbiased if the mean of the sampling distribution is equal to the true value of the parameter being estimated

Rules to Review:

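Variance formula for a POPULATION (N = population size):

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$

Variance formula for a SAMPLE (n = sample size):

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$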

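Consider the 3-element population

P = {1, 2, 3}

N = 3, μ = 2

x_i    x_i − μ    (x_i − μ)²
1      −1         1
2       0         0
3       1         1

Σ(x_i − μ)² = 2

$$\sigma^2 = \frac{2}{3} \approx 0.6666667 \qquad \sigma = \sqrt{\frac{2}{3}} \approx 0.81649658$$

These values are parameters since they are derived from a population.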
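As a quick check of the population parameters for P = {1, 2, 3}, the short sketch below uses Python's standard `statistics` module. The variable names are illustrative only; `pvariance` and `pstdev` divide by N, which matches the population formula.

```python
import statistics

population = [1, 2, 3]

mu = statistics.mean(population)             # population mean
sigma_sq = statistics.pvariance(population)  # population variance (divides by N)
sigma = statistics.pstdev(population)        # population standard deviation

print("mu      =", mu)
print("sigma^2 =", round(sigma_sq, 7))
print("sigma   =", round(sigma, 8))
```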

Now consider all possible samples of size 2, with replacement.

There would be $3^2 = 9$ samples.

Order is important!

Complete the Chart

Sample Sample Mean Sample Variance Sample S.D.

1, 1

1, 2

1, 3

2, 1

2, 2

2, 3

3, 1

3, 2

3, 3

MEANS

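Completed Chart

(Sample variances and standard deviations use the divisor n − 1.)

Sample    Sample Mean    Sample Variance    Sample S.D.
1, 1      1              0                  0
1, 2      1.5            0.5                0.7071
1, 3      2              2                  1.4142
2, 1      1.5            0.5                0.7071
2, 2      2              0                  0
2, 3      2.5            0.5                0.7071
3, 1      2              2                  1.4142
3, 2      2.5            0.5                0.7071
3, 3      3              0                  0
MEANS     2              0.6667             0.6285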
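The chart can be reproduced by enumerating every ordered sample of size 2 drawn with replacement from P = {1, 2, 3}. The sketch below assumes the sample variance uses the divisor n − 1, which is what `statistics.variance` and `statistics.stdev` compute. The variable names are illustrative.

```python
from itertools import product
from statistics import variance, stdev

population = [1, 2, 3]

# All ordered samples of size 2 with replacement: 3**2 = 9 of them.
samples = list(product(population, repeat=2))

sample_means = [sum(s) / len(s) for s in samples]
sample_vars = [variance(s) for s in samples]  # divisor n - 1 (assumption)
sample_sds = [stdev(s) for s in samples]      # square root of the variance

for s, m, v, sd in zip(samples, sample_means, sample_vars, sample_sds):
    print(s, m, round(v, 4), round(sd, 4))

# Averages over all 9 samples (the MEANS row of the chart).
avg_mean = sum(sample_means) / len(samples)
avg_var = sum(sample_vars) / len(samples)
avg_sd = sum(sample_sds) / len(samples)
print("MEANS", avg_mean, round(avg_var, 4), round(avg_sd, 4))
```

The averages of the mean and variance columns should match μ = 2 and σ² ≈ 0.6667, while the average of the standard deviation column falls below σ ≈ 0.8165.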

The chart shows that…

The mean of the distribution of sample means is the mean (μ) of the population

This illustrates that a sample mean is an unbiased estimator of the population mean

The distribution of sample means “centers” around the mean of the population

The mean of the distribution of sample variances (s²) is equal to the variance (σ²) of the population

This illustrates that a sample variance (s²), computed with the divisor n − 1, is an unbiased estimator of the population variance

The distribution of sample variances “centers” around the variance of the population

Take Note:

A sample standard deviation is NOT an unbiased estimator of the population standard deviation

In the above example, the mean of the sample standard deviations is 0.628539, and the standard deviation for the population is 0.81649658

The distribution of sample standard deviations does not center around the standard deviation of the population
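The arithmetic behind the 0.628539 figure can be written out as follows, assuming each sample standard deviation is computed with the divisor n − 1. Of the 9 samples, 3 have s = 0, 4 have s = √0.5, and 2 have s = √2:

```latex
\[
\text{mean of } s
  = \frac{3(0) + 4\sqrt{0.5} + 2\sqrt{2}}{9}
  = \frac{4\sqrt{2}}{9}
  \approx 0.628539
  \;<\;
  0.816497 \approx \sqrt{\tfrac{2}{3}} = \sigma
\]
```

Taking a square root before averaging pulls the result down, which is why s underestimates σ on average even though s² is unbiased for σ².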

9.2 Sample Proportions (pp. 472-477)

The normal distribution curve is often extremely useful in analyzing sample proportions. This section provides insights into the circumstances that allow for use of normal distribution properties.

Consider an SRS of 1000 people from a large population

X represents the number in this sample who are Republicans. There are 1001 possible values for X: 0, 1, …, 1000.

P-hat represents the proportion of Republicans in the sample. There are 1001 possible values of p-hat: 0/1000, 1/1000, …, 1000/1000.

We could choose many SRSs and calculate a p-hat for each.

We would expect the distribution of p-hat to be approximately normal.

If we choose an SRS of size n from a large population with population proportion p having some characteristic of interest, and if p-hat is the proportion of the sample having that characteristic, then:

The sampling distribution of p-hat is approximately normal.

The mean of the sampling distribution is p (the population parameter).

The standard deviation of the sampling distribution is $\sqrt{\frac{p(1-p)}{n}}$

It is reasonable to use the previous statements when:

The population is at least 10 times as large as the sample. Rule of Thumb #1

np is at least 10 and n(1-p) is at least 10. Rule of Thumb #2
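The formulas and the two rules of thumb can be bundled into one small helper. This is only a sketch: the function name `phat_normal_approx`, its parameter names, and the example values (n = 100, p = 0.3, population of 5000) are hypothetical and chosen for illustration.

```python
import math

def phat_normal_approx(n, p, population_size):
    """Return (mean, sd, rules_ok) for the sampling distribution of p-hat."""
    mean = p                                    # center of the distribution
    sd = math.sqrt(p * (1 - p) / n)             # standard deviation of p-hat
    rule1 = population_size >= 10 * n           # Rule of Thumb #1
    rule2 = n * p >= 10 and n * (1 - p) >= 10   # Rule of Thumb #2
    return mean, sd, rule1 and rule2

# Hypothetical example: SRS of 100 from a population of 5000 with p = 0.3.
mean, sd, ok = phat_normal_approx(n=100, p=0.3, population_size=5000)
print(mean, round(sd, 4), ok)
```

If `rules_ok` comes back `False`, the normal approximation should not be relied on for that combination of n, p, and population size.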

Suppose it is known that 60% of the registered voters in a district of over 20,000 people are Republicans. IF YOU CHOOSE AN SRS OF 1000 REGISTERED VOTERS:

What is the probability that the proportion of Republicans in the sample is between 58% and 62%?

What is the probability that the sample will contain more than 550 Republicans?

Are both rules of thumb satisfied?

Convert p-hat = .55 to its z-score.

Interpret.

Is this a rare occurrence?

.000628 is approximately 1/1592. So if we had 1600 random samples of size 1000, how many of them would we “expect” to have 550 or fewer Republicans?
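The voter questions can be worked numerically with the normal approximation. The sketch below builds the standard normal CDF from `math.erf` and uses no continuity correction. The helper name `normal_cdf` and the variable names are illustrative. The tail probability it prints should land close to the .000628 quoted above, with any small difference coming from how z is rounded.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, p = 1000, 0.60
sd = math.sqrt(p * (1 - p) / n)  # standard deviation of p-hat

# P(0.58 < p-hat < 0.62)
z_low = (0.58 - p) / sd
z_high = (0.62 - p) / sd
prob_between = normal_cdf(z_high) - normal_cdf(z_low)

# 550 Republicans out of 1000 corresponds to p-hat = 0.55
z_550 = (0.55 - p) / sd
prob_at_most_550 = normal_cdf(z_550)
prob_more_than_550 = 1 - prob_at_most_550

# Expected number of samples, out of 1600, with 550 or fewer Republicans
expected_in_1600 = 1600 * prob_at_most_550

print("z for p-hat = 0.55:", round(z_550, 3))
print("P(0.58 < p-hat < 0.62):", round(prob_between, 4))
print("P(more than 550):", round(prob_more_than_550, 6))
print("Expected samples out of 1600:", round(expected_in_1600, 2))
```

Both rules of thumb hold here: the population (over 20,000) is at least 10 × 1000, and np = 600 and n(1 − p) = 400 are both at least 10.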

9.3 Sample Means (pp. 481-494)

This section contains one of the most important of all statistical theorems, the Central Limit Theorem of Statistics.

It also emphasizes the convention that the Greek letters μ and σ denote the population mean and standard deviation, while x-bar and s denote the mean and standard deviation of a sample.

The Central Limit Theorem

Consider an SRS of size n from any population with mean μ and standard deviation σ. When n is large, the sampling distribution of x-bar has the following properties:

It is approximately normal.

The mean of the distribution is μ.

The standard deviation of the distribution is $\sigma / \sqrt{n}$.
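A small simulation can make the theorem concrete. The sketch below is an illustration under stated assumptions, not part of the original notes: it draws repeated samples from a strongly right-skewed exponential population (rate 1, so μ = 1 and σ = 1), with a hypothetical sample size of 40 and 5000 repetitions, and compares the simulated center and spread of x-bar with μ and σ/√n.

```python
import math
import random
import statistics

random.seed(0)   # fixed seed so the run is repeatable

n = 40           # sample size (hypothetical choice)
reps = 5000      # number of simulated samples (hypothetical choice)

# Exponential population with rate 1: right-skewed, mu = 1, sigma = 1.
mu, sigma = 1.0, 1.0

# One x-bar per simulated sample.
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(reps)
]

mean_of_means = sum(sample_means) / reps
sd_of_means = statistics.pstdev(sample_means)
predicted_sd = sigma / math.sqrt(n)

print("mean of x-bar:", round(mean_of_means, 3), "vs mu =", mu)
print("sd of x-bar:  ", round(sd_of_means, 3), "vs sigma/sqrt(n) =", round(predicted_sd, 3))
```

A histogram of `sample_means` would look roughly bell-shaped even though the population itself is far from normal.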

Consider the population

P = {2, 4, 6}

Now consider all possible samples of size 2 WITH REPLACEMENT. There would be 3 × 3 or 9 such samples.

$$\mu = 4 \qquad \sigma = \sqrt{\frac{(2-4)^2 + (4-4)^2 + (6-4)^2}{3}} = 1.632993162$$

Sample    Sample Mean
2, 2      2
2, 4      3
2, 6      4
4, 2      3
4, 4      4
4, 6      5
6, 2      4
6, 4      5
6, 6      6

The mean of the sample means is equal to 4, which is equal to μ. This illustrates the second part of the Central Limit Theorem.

The standard deviation of the sample means is equal to 1.154700538. This illustrates the third part of the Central Limit Theorem:

$$\frac{\sigma}{\sqrt{n}} = \frac{1.632993162}{\sqrt{2}} = 1.154700538$$
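Both results for P = {2, 4, 6} can be verified by listing all 9 ordered samples of size 2 (drawn with replacement) and treating their means as a complete distribution. The variable names below are illustrative.

```python
from itertools import product
import math
import statistics

population = [2, 4, 6]
n = 2

mu = statistics.mean(population)
sigma = statistics.pstdev(population)   # population SD (divides by N)

# All ordered samples of size 2 with replacement: 3 x 3 = 9.
samples = list(product(population, repeat=n))
sample_means = [sum(s) / n for s in samples]

mean_of_means = sum(sample_means) / len(sample_means)
sd_of_means = statistics.pstdev(sample_means)  # SD of the 9 sample means

print("mean of sample means:", mean_of_means, "vs mu =", mu)
print("sd of sample means:  ", round(sd_of_means, 9))
print("sigma / sqrt(n):     ", round(sigma / math.sqrt(n), 9))
```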
