shawn-johnston
Confidence Intervals and Hypothesis Testing
Using x̄ and p̂
Two Statistical Procedures
Confidence Intervals
“which is an interval of values that the researcher is fairly “confident” covers the true value of the population.”
Hypothesis Testing
(or significance testing)
“uses sample data to reject the notion that chance alone can explain the sample results.”
Remember Empirical Rule
Going out 1, 2, and 3 standard deviations (1·2.5, 2·2.5, 3·2.5) from the mean, 69, tells us roughly how much of the data falls in each range (interval): about 68%, 95%, and 99.7%.
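The Empirical Rule intervals can be worked out with a short sketch. It assumes the slide's numbers mean a mean of 69 and a standard deviation of 2.5, and that the data are roughly normal; the variable names are illustrative only.

```python
from statistics import NormalDist

mean, sd = 69, 2.5  # read from the slide; treating 2.5 as the SD is an assumption

std_normal = NormalDist()  # standard normal: mean 0, sd 1
intervals = []
for k in (1, 2, 3):
    low, high = mean - k * sd, mean + k * sd
    # Share of a normal population within k standard deviations of the mean
    share = std_normal.cdf(k) - std_normal.cdf(-k)
    intervals.append((low, high, share))
    print(f"{k} SD: ({low}, {high}) holds about {share:.1%} of the data")
```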
Confidence Level
P(-z* < Z < z*) = .90 z* = 1.645
P(-z* < Z < z*) = .95 z* = 1.960
P(-z* < Z < z*) = .99 z* = 2.576
Confidence Level can vary. Since we want to be fairly sure about our confidence intervals, we usually use confidence levels in the 90+% range.
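The critical values listed above can be recovered from the standard normal distribution. The following is a minimal sketch using Python's standard library; the helper name `z_star` is illustrative, not part of the original slides.

```python
from statistics import NormalDist

def z_star(confidence_level):
    """Return z* such that P(-z* < Z < z*) equals confidence_level."""
    tail_area = (1 - confidence_level) / 2  # area left in each tail
    return NormalDist().inv_cdf(1 - tail_area)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z* = {z_star(level):.3f}")
```

Raising the confidence level pushes z* further out, which is why higher confidence produces a wider interval.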
What is a Confidence Interval?
It is a range of values calculated from a sample, based on the sample mean or proportion, sample standard deviation, sample size, and level of confidence.
We use it to infer possible (plausible) values for the true population parameter (μ or p).
Example: 95% CI for μ is 45.67 to 53.83
Example: 90% CI for p is 0.245 to 0.265
How is a Confidence Interval constructed?
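__% Confidence Interval = statistic ± (critical value)(standard error of statistic)
Example for Sample Proportion:
__% CI = $\hat{p} \pm (z^*)\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Example for Sample Mean:
__% CI = $\bar{x} \pm (t^*)\frac{s}{\sqrt{n}}$
__% CI = $\bar{x} \pm (z^*)\frac{\sigma}{\sqrt{n}}$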
Margin of Error
__% CI = $\hat{p} \pm (z^*)\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Margin of Error = $(z^*)\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
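As a rough illustration of the margin of error for a sample proportion, the sketch below uses a hypothetical sample (50% successes in 100 trials), which is not from the slides; the function name is also an assumption made for this example.

```python
from math import sqrt

def margin_of_error(p_hat, n, z_star):
    """Margin of error for a sample proportion: z* times the standard error."""
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    return z_star * standard_error

# Hypothetical sample (not from the slides): 50% successes in 100 trials,
# 95% confidence, so z* = 1.960
p_hat, n = 0.5, 100
me = margin_of_error(p_hat, n, 1.960)
ci = (p_hat - me, p_hat + me)  # interval = estimate plus or minus the margin
print(f"margin of error = {me:.3f}, CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```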
95% CI of # of Tails/10 flips
        Sample Prop    SE Prop    95.0 % C.I.
T%      0.51594        0.01902    (0.47865, 0.55323)
[Figure: histogram of H%, horizontal axis 0.2 to 0.8; vertical axis: Frequency, 0 to 20]
Interpretation: Based on this sample, we are 95% confident that the true population proportion (p) of tails in 10 flips is between 0.479 and 0.553.
95% CI of My Age
        N     MEAN      STDEV    SE MEAN    95 % C.I.
Age     69    36.130    5.843    0.703      (34.726, 37.534)
[Figure: histogram of Age, horizontal axis 20 to 50; vertical axis: Frequency, 0 to 20]
Interpretation: Based on this sample, we are 95% confident that the true population mean (µ) of my age is between 34.7 and 37.5.
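The interval in the table can be reproduced from the summary statistics. This sketch assumes a t critical value of about 1.9955 for 95% confidence with 68 degrees of freedom (read from a t table rather than computed here); the function name is illustrative.

```python
from math import sqrt

def mean_ci(x_bar, s, n, t_star):
    """Confidence interval for a mean: x-bar plus or minus t* times s/sqrt(n)."""
    standard_error = s / sqrt(n)
    margin = t_star * standard_error
    return x_bar - margin, x_bar + margin

# Assumption: t* for 95% confidence with df = 69 - 1 = 68, taken from a t table
T_STAR_DF68 = 1.9955

low, high = mean_ci(x_bar=36.130, s=5.843, n=69, t_star=T_STAR_DF68)
print(f"95% CI: ({low:.3f}, {high:.3f})")
```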
Revisit CLT
The first graph represents X (number of G’s in story).
The second graph represents x̄ (average number of G’s in 3 attempts).
What 3 important facts must you remember about CLT?
[Figure: histogram of G2, horizontal axis 0 to 100; vertical axis: Frequency, 0 to 20]
[Figure: histogram of AvgG, horizontal axis 0 to 100; vertical axis: Frequency, 0 to 30]
Properties of CI
1) The sample statistic is the center of the CI; it is our best (and sometimes our only) estimate of the unknown population parameter of interest.
2) The larger the level of confidence, the wider the confidence interval.
3) The larger the sample size, the narrower the confidence interval.
4) The sampling distribution must be approximately normal (CLT; np̂ ≥ 10 and n(1 - p̂) ≥ 10).
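Properties 2 through 4 can be checked numerically. The sketch below uses a hypothetical sample proportion of 0.5 and hypothetical sample sizes; the helper names are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

def ci_width(p_hat, n, confidence_level):
    """Full width of a z-interval for a proportion (twice the margin of error)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return 2 * z * sqrt(p_hat * (1 - p_hat) / n)

def success_failure_ok(p_hat, n):
    """Property 4 check: n*p-hat and n*(1 - p-hat) are both at least 10."""
    return n * p_hat >= 10 and n * (1 - p_hat) >= 10

p_hat = 0.5  # hypothetical sample proportion

# Property 2: same sample, higher confidence level, wider interval
widths_by_level = [ci_width(p_hat, 100, level) for level in (0.90, 0.95, 0.99)]

# Property 3: same confidence level, larger sample, narrower interval
widths_by_n = [ci_width(p_hat, n, 0.95) for n in (50, 200, 800)]

print("by level:", [round(w, 3) for w in widths_by_level])
print("by n:    ", [round(w, 3) for w in widths_by_n])
```

Because the width depends on 1/√n, quadrupling the sample size halves the interval width.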
How Many G’s?
        N     MEAN     STDEV    SEMEAN
AvgG    69    61.46    8.48     1.02
95 % C.I. = $\bar{x} \pm (t^*)\frac{s}{\sqrt{n}}$
= 61.46 ± 1.997(1.02)
= (59.423, 63.497)
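95% CI of Brown M&Ms %
Suppose you sampled a bag of 50 M&M’s and found 23 brown candies. Construct a 95% confidence interval.
95% CI = $\hat{p} \pm (z^*)\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
95% CI = $\frac{23}{50} \pm (z^*)\sqrt{\frac{\frac{23}{50}\left(1-\frac{23}{50}\right)}{50}}$
95% CI = (0.322, 0.598)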
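The M&M interval can be verified step by step. This is a minimal sketch using only the counts given above (23 brown out of 50); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

brown, n = 23, 50
p_hat = brown / n                          # sample proportion of brown candies
z = NormalDist().inv_cdf(0.975)            # z* for 95% confidence
standard_error = sqrt(p_hat * (1 - p_hat) / n)
margin = z * standard_error

low, high = p_hat - margin, p_hat + margin
print(f"p-hat = {p_hat:.2f}, 95% CI = ({low:.3f}, {high:.3f})")
```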
95% CI of Brown M&Ms %
[Figure: First 20 Conf Intervals; horizontal axis: proportion, 0.000 to 0.700]
20 bags of M&M’s were used to calculate a sample proportion of brown candies per bag.
Each sample proportion was used to construct a 95% CI which was graphed.
How many CI do NOT contain 30%?
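The idea behind the 20-bag chart can be explored with a small simulation. This sketch assumes, purely for illustration, that the true proportion of brown candies is 30% and that every bag holds 50 candies; the seed and function names are arbitrary choices, and the simulated bags are not the data from the slide.

```python
import random
from math import sqrt

def simulate_intervals(true_p, bag_size, n_bags, seed=0):
    """Draw n_bags simulated bags and build a 95% CI from each one."""
    rng = random.Random(seed)
    z_star = 1.960
    intervals = []
    for _ in range(n_bags):
        # Each candy is brown with probability true_p
        brown = sum(rng.random() < true_p for _ in range(bag_size))
        p_hat = brown / bag_size
        margin = z_star * sqrt(p_hat * (1 - p_hat) / bag_size)
        intervals.append((p_hat - margin, p_hat + margin))
    return intervals

def count_misses(intervals, true_p):
    """Count the intervals that do NOT contain the true proportion."""
    return sum(not (low <= true_p <= high) for low, high in intervals)

# Hypothetical setup: true proportion 0.30, 20 bags of 50 candies each
intervals = simulate_intervals(true_p=0.30, bag_size=50, n_bags=20)
print("intervals missing 30%:", count_misses(intervals, 0.30), "of", len(intervals))
```

With a 95% confidence level, about 1 interval in 20 is expected to miss the true value in the long run, although any single set of 20 can have more or fewer misses.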
Various CI’s w/90% Conf. Level
Example Exam Questions
Which of the following BEST describes what 95% confidence means in a 95% CI for μ of (7.8, 9.4)?
A. There is a 95% probability that μ is between 7.8 and 9.4.
B. In repeated sampling, μ will fall between 7.8 and 9.4 about 95% of the time.
C. In repeated sampling, 95% of the confidence intervals fall between 7.8 and 9.4.
D. In repeated sampling, the confidence intervals will contain μ about 95% of the time.
Hypothesis Testing
Use hypothesis testing when you can state a hypothesis based on an assumption about the population parameter.
Null hypothesis, Ho, states that the population parameter is equal to an assumed true value.
Alternative hypothesis, Ha, states that the population parameter is less than (<), greater than (>), or not equal to (≠) the assumed true value.
Ho: μ = 35
Ha: μ < 35
Example of Hypothesis Testing
Is there a statistically significant difference between my age and the students’ sample mean of my age?
Ho: μ = 35
Ha: μ > 35
TEST OF MU = 35.000 VS MU G.T. 35.000
        N     MEAN      STDEV    SEMEAN    T       P VALUE
Age     69    36.130    5.843    0.703     1.61    0.056
[Figure: horizontal axis from -4 to 4]
What exactly is a p-value?
1. Bias in sampling
2. Luck of the draw—misrepresentative sample
3. Significant difference
Indication of significance!
What exactly is a p-value?
It is the probability, assuming Ho is true, of getting data like this or ‘worse’ (more like the alternative hypothesis).
It is the probability of obtaining a statistic at least this extreme when Ho is true.
You compare the p-value to alpha, α.
If p-value < α, reject Ho.
If p-value > α, fail to reject Ho.
What exactly is a p-value in context of problem?
It is the probability, assuming Ho (my age is 35) is true, of getting data like this or ‘worse’ (more like the alternative hypothesis, which claims I am older than 35).
It is the probability of obtaining a statistic this extreme (sample mean of 36.130 from a sample of 69 students) or more extreme when Ho (my age is 35) is true.
If alpha is 0.05 and the p-value (probability) is 0.056, then since p-value > alpha we fail to reject Ho (my age is 35).
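The test statistic and the decision for the age example can be sketched as follows. The p-value of 0.056 is taken from the software output in the slides rather than recomputed, and the function names are illustrative.

```python
def t_statistic(x_bar, mu_0, se_mean):
    """How many standard errors the sample mean lies from the hypothesized mean."""
    return (x_bar - mu_0) / se_mean

def decide(p_value, alpha):
    """Decision rule from the slides: reject Ho only when p-value < alpha."""
    return "reject Ho" if p_value < alpha else "fail to reject Ho"

t = t_statistic(x_bar=36.130, mu_0=35.0, se_mean=0.703)

# p-value as reported in the output table, not recomputed here
decision = decide(p_value=0.056, alpha=0.05)
print(f"t = {t:.2f}, decision: {decision}")
```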
Example of Hypothesis Testing
Determine whether the coin flipped was fair.
Ho: p = .5
Ha: p ≠ .5
1PropZTest (not equal)
p̂ = 356/690 = .5159
n = 690
z = .8375
p-value = .4023
[Figure: horizontal axis from -4 to 4]
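The calculator output for the coin example can be reproduced with a two-sided one-proportion z-test. This is a minimal sketch; the function name is an assumption, and the standard error uses the null value p = .5, as is standard for this test.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_z_test(successes, n, p0):
    """Two-sided one-proportion z-test; returns (p_hat, z, p_value)."""
    p_hat = successes / n
    se_null = sqrt(p0 * (1 - p0) / n)  # standard error computed under Ho
    z = (p_hat - p0) / se_null
    # Two-sided: area in both tails beyond |z|
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_hat, z, p_value

p_hat, z, p_value = one_prop_z_test(successes=356, n=690, p0=0.5)
print(f"p-hat = {p_hat:.4f}, z = {z:.4f}, p-value = {p_value:.4f}")
```

Since the p-value is far above any usual alpha, we fail to reject Ho: the data are consistent with a fair coin.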
Power, α, and β
Let confidence level be represented by L.
Then 1 – L = α (OR 1 - α = L).
α is the probability of a Type I error. β is the probability of a Type II error.
Power is equal to 1 - β.
All 3 of these are probabilities (likelihoods of occurrence).
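To see how the three quantities fit together, here is a rough sketch for a one-sided (greater than) z-test. The size of the true effect, expressed in standard errors, is a hypothetical input, and the function name is an assumption made for this example.

```python
from statistics import NormalDist

def power_one_sided_z(alpha, shift_in_se):
    """Power of a one-sided (greater than) z-test.

    shift_in_se is how far the true parameter sits above the null value,
    measured in standard errors (a hypothetical input).
    """
    std_normal = NormalDist()
    critical_z = std_normal.inv_cdf(1 - alpha)       # reject Ho when Z > critical_z
    beta = std_normal.cdf(critical_z - shift_in_se)  # P(fail to reject | Ho false)
    return 1 - beta

alpha = 0.05
for shift in (0.0, 1.0, 2.5):
    print(f"shift = {shift} SE: power = {power_one_sided_z(alpha, shift):.3f}")
```

When the shift is zero, Ho is actually true and the rejection probability is just α; as the true difference grows, β shrinks and power rises.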
Hypothesis Testing like a court case
Assume innocent
Null hypothesis is true: the person is actually innocent
Null hypothesis is false: the person is actually guilty
Fail to reject null hypothesis: found innocent
Confidence Level (1 - α): found innocent, when innocent
Type II Error (β): found innocent, when guilty
Reject null hypothesis: found guilty
Type I Error (α): found guilty, when innocent
Power (1 - β): found guilty, when guilty
What are the Type I and II Errors?
Your boss asked you to test a new marketing scheme to potentially increase profits for the company.
But the new scheme also increases the expense of marketing the product.
What are the two potential ERRORs in the context of this problem?
P-Value Question
Which of the following statements is correct?
A. A large p-value indicates that the data is consistent with the alternative hypothesis.
B. The smaller the p-value, the weaker the evidence against the null hypothesis.
C. The p-value measures the probability of making a Type II error.
D. The p-value measures the probability that the hypothesis is true.
E. An extremely small p-value indicates that the actual data differs from the expected if the null hypothesis were true.
Power Question
Power in this situation refers to
A. the ability to detect a difference when in fact there is no difference.
B. the ability to not detect a difference when in fact there is no difference.
C. the ability to detect a difference when in fact there is a difference.
D. the ability to not detect a difference when in fact there is a difference.
E. the ability to make a correct decision regardless if there is a difference or not.
Sample Question
The government requires that water be sampled and tested for safety every day. If the worker responsible for this testing uses a null hypothesis that means the water is safe, which of the following statements correctly describes a Type I and Type II error?
A. A Type I error would mean that the citizens were not warned of contaminated water, but a Type II error would mean that the citizens would panic although the water is ok.
B. A Type II error would mean that the citizens were not warned of contaminated water, but a Type I error would mean that the citizens would panic although the water was ok.
C. A Type II error would mean that the citizens were not warned of contaminated water, but a Type I error would mean that the citizens would not panic although the water is ok.
D. A Type I error would mean that the citizens were not warned of contaminated water, but a Type II error would mean that the citizens would panic although the water is not ok.
E. Both errors would cause unnecessary panic since a mistake means something went wrong but everything would work out.