Part V From the Data at Hand to the World at Large

Page 1: Part V From the Data at Hand to the World at Large

Part V

From the Data at Hand to the World at Large

Page 2: Part V From the Data at Hand to the World at Large

Chapter 18: Sampling Distribution Models

Modeling the distribution of sample proportions

From 1000 randomly selected voters (2004):

Poll 1: John Kerry 49%
Poll 2: John Kerry 45.9%

Page 3: Part V From the Data at Hand to the World at Large

Assumptions and Conditions

Assumptions

The sampled values must be independent of each other.

The sample size, n, must be large enough.

Conditions

10% Condition: if drawing without replacement, the sample size n must be no larger than 10% of the population.

Success/Failure Condition: the sample size has to be big enough that both np and nq are at least 10.
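The two conditions above can be checked mechanically. The following is a minimal sketch, not a definitive implementation: the function name `check_proportion_conditions`, its parameters, and the population size in the usage line are all hypothetical, and the cutoff of 10 is treated as "at least 10", the usual textbook convention.

```python
def check_proportion_conditions(n, p, population_size=None, threshold=10):
    """Check the 10% Condition and the Success/Failure Condition.

    n               -- sample size
    p               -- assumed proportion of successes
    population_size -- size of the population (None means "effectively infinite")
    threshold       -- minimum expected successes and failures (assumed: 10)
    """
    q = 1 - p
    # 10% Condition: the sample is no larger than 10% of the population.
    ten_percent_ok = population_size is None or n <= 0.10 * population_size
    # Success/Failure Condition: expect at least `threshold` of each outcome.
    success_failure_ok = (n * p >= threshold) and (n * q >= threshold)
    return {"ten_percent": ten_percent_ok, "success_failure": success_failure_ok}


# Hypothetical usage: a poll of 1000 voters drawn from a population of 100 million.
poll_check = check_proportion_conditions(1000, 0.49, population_size=100_000_000)
```

Both entries of `poll_check` are `True` here, since 1000 is far below 10% of the population and both np = 490 and nq = 510 exceed 10.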

Page 4: Part V From the Data at Hand to the World at Large

The Sampling Distribution Model for a Proportion

Provided that the sampled values are independent and the sample size is large enough, the sampling distribution of $\hat{p}$ is modeled by a normal model with mean

$\mu(\hat{p}) = p$

and standard deviation

$SD(\hat{p}) = \sqrt{\dfrac{pq}{n}}$
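As a rough illustration of the model, the sketch below computes the mean and standard deviation of the sampling distribution for a given p and n. The function name `proportion_sampling_model` is hypothetical, and the value p = 0.5 is an assumption chosen only for the example; the slides do not state a true population proportion.

```python
import math


def proportion_sampling_model(p, n):
    """Return (mean, sd) of the normal model for the sample proportion."""
    q = 1 - p
    mean = p                      # the model is centered at the true proportion
    sd = math.sqrt(p * q / n)     # SD(p-hat) = sqrt(pq / n)
    return mean, sd


# Assumed values: true proportion 0.5, samples of 1000 voters.
model_mean, model_sd = proportion_sampling_model(0.5, 1000)
```

With these assumed values the standard deviation is about 0.016, so sample proportions from polls of this size would typically vary by a point or two from poll to poll.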

Page 5: Part V From the Data at Hand to the World at Large

Models for proportions

Exercise 10 page 424

Page 6: Part V From the Data at Hand to the World at Large

Means: The Fundamental Theorem of Statistics

Central Limit Theorem (CLT)

The sampling distribution of any mean becomes more nearly normal as the sample size grows, provided the observations are independent.

As the sample size n increases, the mean of n independent values has a sampling distribution that tends toward a normal model with mean equal to the population mean,

$\mu(\bar{y}) = \mu$

and standard deviation

$SD(\bar{y}) = \dfrac{\sigma}{\sqrt{n}}$
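A small simulation can make the theorem concrete. The sketch below is an illustration under stated assumptions, not part of the original slides: it draws samples from an exponential population with mean 1 and standard deviation 1 (a strongly skewed choice made only for the example), and compares the spread of the sample means with the predicted value σ/√n. The function name, sample size, repetition count, and seed are all hypothetical.

```python
import random
import statistics


def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size `n` from Exponential(1); return their means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


sample_size = 50
means = simulate_sample_means(n=sample_size, reps=5000)

observed_mean = statistics.fmean(means)    # should be near the population mean, 1
observed_sd = statistics.stdev(means)      # should be near sigma / sqrt(n)
predicted_sd = 1.0 / sample_size ** 0.5    # sigma = 1 for Exponential(1)
```

Plotting a histogram of `means` would show a roughly symmetric, bell-shaped distribution even though the individual values come from a skewed population.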

Page 7: Part V From the Data at Hand to the World at Large

CLT

Page 8: Part V From the Data at Hand to the World at Large

Assumptions and Conditions

Random Sampling Condition: the values must be sampled randomly.

Independence Assumption

10% Condition: the sample size is less than 10% of the population.

Page 9: Part V From the Data at Hand to the World at Large

Exercise

Step-by-step page 418

Ex.36 page 426

Page 10: Part V From the Data at Hand to the World at Large

Standard Error

When we estimate the standard deviation of a sampling distribution using statistics found from the data, the estimate is called a standard error.

For a proportion:

$SE(\hat{p}) = \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$

For the sample mean:

$SE(\bar{y}) = \dfrac{s}{\sqrt{n}}$
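Both standard errors are short computations. The sketch below shows one way to write them; the function names `se_proportion` and `se_mean` and the small data list are hypothetical, made up only to exercise the formulas.

```python
import math
import statistics


def se_proportion(p_hat, n):
    """Standard error of a sample proportion: sqrt(p-hat * q-hat / n)."""
    q_hat = 1 - p_hat
    return math.sqrt(p_hat * q_hat / n)


def se_mean(values):
    """Standard error of a sample mean: s / sqrt(n), with s the sample SD."""
    s = statistics.stdev(values)  # sample standard deviation (n - 1 divisor)
    return s / math.sqrt(len(values))


# Hypothetical data, for illustration only.
example_values = [2, 4, 4, 4, 5, 5, 7, 9]
example_se_mean = se_mean(example_values)
example_se_prop = se_proportion(0.5, 100)
```

Note that both functions use only quantities available from the sample itself, which is what distinguishes a standard error from the standard deviation of the sampling model.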

Page 11: Part V From the Data at Hand to the World at Large

Don't confuse the sampling distribution with the distribution of the sample

Distribution of the sample

Take a sample.
Look at the distribution on a histogram.
Calculate summary statistics.

Sampling distribution

Models an imaginary collection of the values that a statistic might have taken from all the samples that you didn't get.

We use the sampling distribution model to make statements about how the statistic varies.

Page 12: Part V From the Data at Hand to the World at Large

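Confidence Intervals for Proportions

Example: infected sea fan corals at Las Redes Reef (LRR)

$\hat{p} = \dfrac{54}{104} = 51.9\%$

$SD(\hat{p}) = \sqrt{\dfrac{pq}{n}}$

$SE(\hat{p}) = \sqrt{\dfrac{\hat{p}\hat{q}}{n}} = \sqrt{\dfrac{(0.519)(0.481)}{104}} = 0.049 = 4.9\%$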
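The arithmetic for the sea fan example can be reproduced in a few lines. This is a minimal sketch using the counts from the slide (54 infected out of 104 sampled); the variable names are chosen here for illustration.

```python
import math

infected = 54   # infected sea fans in the sample (from the slide)
sampled = 104   # total sea fans sampled (from the slide)

p_hat = infected / sampled                       # sample proportion
q_hat = 1 - p_hat                                # sample proportion not infected
se_p_hat = math.sqrt(p_hat * q_hat / sampled)    # SE(p-hat) = sqrt(p-hat * q-hat / n)
```

Rounded to three decimals, `p_hat` is 0.519 and `se_p_hat` is 0.049, matching the 51.9% and 4.9% on the slide.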

Page 13: Part V From the Data at Hand to the World at Large

Confidence Intervals

About 68% of all samples will have $\hat{p}$ within 1 SE of p, and about 95% of all samples will have $\hat{p}$ within 2 SE of p.

We know that for 95% of random samples, $\hat{p}$ will be no more than 2 SE away from p.

Now look at it from $\hat{p}$'s point of view: we can be 95% confident that p is no more than 2 SE away from $\hat{p}$.
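The 95% claim can be checked by simulation. The sketch below is illustrative only: it assumes a true proportion of p = 0.5 and samples of size 104 (both values are assumptions, not from the slides), and because p is known in a simulation it uses the model standard deviation sqrt(pq/n) rather than a standard error. The function name and seed are hypothetical.

```python
import random


def coverage_within_two_sd(p, n, reps, seed=0):
    """Fraction of simulated samples whose p-hat lands within 2 SD of p."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    sd = (p * (1 - p) / n) ** 0.5      # SD(p-hat) = sqrt(pq / n)
    hits = 0
    for _ in range(reps):
        # Count successes in one simulated sample of size n.
        successes = sum(rng.random() < p for _ in range(n))
        p_hat = successes / n
        if abs(p_hat - p) <= 2 * sd:
            hits += 1
    return hits / reps


fraction = coverage_within_two_sd(p=0.5, n=104, reps=10_000)
```

The resulting `fraction` should come out close to 0.95. It will not match exactly, both because of simulation noise and because counts are discrete while the normal model is continuous.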

Page 14: Part V From the Data at Hand to the World at Large

Confidence Interval

We are 95% confident that between 42.1% and 61.7% of LRR sea fans are infected.

Margin of Error: Certainty vs. Precision

Estimate ± M.E.

The margin of error for our 95% confidence interval was 2 SE.
For 99.7% confidence, the margin of error is 3 SE.
100% confident: 0% to 100%.
Low confidence: 51.8% to 52.0%.
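The trade-off between certainty and precision shows up directly when the interval is computed for 1, 2, and 3 standard errors. The sketch below uses the sea fan counts from the example (54 of 104); the helper name `interval_for` and the dictionary layout are hypothetical choices for illustration.

```python
import math

infected, sampled = 54, 104                      # counts from the sea fan example
p_hat = infected / sampled
se = math.sqrt(p_hat * (1 - p_hat) / sampled)


def interval_for(multiplier):
    """Return (low, high) for estimate +/- multiplier * SE."""
    margin_of_error = multiplier * se
    return p_hat - margin_of_error, p_hat + margin_of_error


# Approximate confidence levels from the 68-95-99.7 rule.
intervals = {
    "68%": interval_for(1),
    "95%": interval_for(2),
    "99.7%": interval_for(3),
}
```

The 95% entry rounds to (0.421, 0.617), the interval quoted on the slide. Each step up in confidence widens the interval, so more certainty costs precision.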

Page 15: Part V From the Data at Hand to the World at Large

Critical Values

z*: the number of standard errors to move away from the mean of the sampling distribution to correspond to the specified level of confidence.

Find z* (the critical value) for 98% confidence.

For 95%?
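One way to find these critical values without a table is the inverse CDF of the standard normal model. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the function name `critical_value` is a hypothetical helper.

```python
from statistics import NormalDist


def critical_value(confidence):
    """Return z* such that the central `confidence` area lies within +/- z*."""
    tail_area = (1 - confidence) / 2           # area left in each tail
    return NormalDist().inv_cdf(1 - tail_area)  # standard normal quantile


z_98 = critical_value(0.98)
z_95 = critical_value(0.95)
```

For 98% confidence this gives z* of about 2.33, and for 95% confidence about 1.96, which is the more exact value behind the "2 SE" rule of thumb.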

Page 16: Part V From the Data at Hand to the World at Large

Confidence Interval (One-Proportion z-Interval)

$\hat{p} \pm z^* \times SE(\hat{p})$, where $SE(\hat{p}) = \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$

The critical value z* depends on the particular confidence level we specify.

Assumptions: Independence

Conditions: Randomization, 10% Condition
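Putting the pieces together, a one-proportion z-interval might be computed as in the sketch below. The function name `one_proportion_z_interval` and its signature are hypothetical; the usage line reuses the sea fan counts (54 of 104) from the earlier example, and the code assumes the independence, randomization, and 10% requirements have already been checked.

```python
import math
from statistics import NormalDist


def one_proportion_z_interval(successes, n, confidence=0.95):
    """Return (low, high) for p-hat +/- z* * SE(p-hat)."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)               # SE(p-hat)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value
    margin_of_error = z_star * se
    return p_hat - margin_of_error, p_hat + margin_of_error


sea_fan_low, sea_fan_high = one_proportion_z_interval(54, 104, confidence=0.95)
```

This interval is slightly narrower than the 42.1% to 61.7% obtained with the 2 SE rule of thumb, because the exact 95% critical value is a little under 2.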

Page 17: Part V From the Data at Hand to the World at Large

Exercise

#13 Page 444

Page 18: Part V From the Data at Hand to the World at Large

Chapter 20: Testing Hypotheses About Proportions

Example: a metal manufacturer's ingots

20% of ingots are defective (cracks).

After changes in the casting process: 400 ingots, and only 17% defective.

Is this a result of natural sampling variability, or is there a reduction in the cracking rate?

Page 19: Part V From the Data at Hand to the World at Large

Hypotheses

We begin by assuming that a hypothesis is true (as in a jury trial).

Data consistent with the hypothesis: retain the hypothesis.

Data inconsistent with the hypothesis: we ask whether they are unlikely beyond reasonable doubt.

If the results seem consistent with what we would expect from natural sampling variability, we retain the hypothesis. But if the probability of seeing results like our data is really low, we reject the hypothesis.

Page 20: Part V From the Data at Hand to the World at Large

Testing Hypotheses

Null Hypothesis H0

Specifies a population model parameter of interest and proposes a value for this parameter.

Usually: no change from the traditional value, no effect, no difference.

In our example, H0: p = 0.20. How likely is it to get 0.17 from sampling variation?
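One way to put a number on that question is to use the normal sampling model under H0. The sketch below is an illustration under stated assumptions, not a calculation taken from the slides: it assumes the normal model applies, uses the standard deviation computed from the hypothesized p = 0.20, and treats the question as one-sided (a cracking rate at or below 17%). The variable names are hypothetical.

```python
import math
from statistics import NormalDist

p0 = 0.20      # hypothesized cracking rate under H0 (from the slide)
p_hat = 0.17   # observed cracking rate (from the slide)
n = 400        # number of ingots cast (from the slide)

# Under H0 the sampling model uses the hypothesized proportion, not p-hat.
sd_p_hat = math.sqrt(p0 * (1 - p0) / n)

# How many standard deviations below p0 is the observed proportion?
z = (p_hat - p0) / sd_p_hat

# Assumed one-sided question: probability of a sample proportion this low or lower.
p_value = NormalDist().cdf(z)
```

Under these assumptions the observed rate sits 1.5 standard deviations below the hypothesized 20%, and a result at least that low would occur in roughly 7% of samples from sampling variation alone.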