richard-ferreria
Chapter 12
Significance Tests in Practice
12.1 TESTS ABOUT POPULATION MEAN
Student’s t-distribution
• Published in 1908
• Used to describe the sampling distribution when the population std dev is unknown
TEST STATISTIC (σ unknown)
$t_{n-1} = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Student’s t-distribution
• Since this is just another significance test: use PHANTOMS
• Differences:
– We are using a t distribution with n-1 degrees of freedom
– Use “tcdf(lower, upper, df)”
– The t-distribution is not resistant to outliers when sample size is small (less than 30)
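For readers without a TI calculator, the area that “tcdf(lower, upper, df)” reports can be approximated in plain Python. This is only a sketch, not the calculator's actual algorithm: the function name `tcdf` mirrors the calculator command, while the `steps` parameter and the numerical method (a tangent substitution followed by Simpson's rule) are assumptions chosen for illustration.

```python
import math

def tcdf(lower, upper, df, steps=2000):
    """Approximate area under the t density (df degrees of freedom, df >= 1)
    between lower and upper."""
    # Substituting t = sqrt(df) * tan(theta) turns the t integral into
    # c * integral of cos(theta) ** (df - 1), whose limits stay finite
    # even when lower or upper is infinite.
    a = math.atan(lower / math.sqrt(df))
    b = math.atan(upper / math.sqrt(df))
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(math.pi)

    # Composite Simpson's rule; steps must be even.
    h = (b - a) / steps
    total = math.cos(a) ** (df - 1) + math.cos(b) ** (df - 1)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * math.cos(a + i * h) ** (df - 1)
    return c * total * h / 3

# Upper-tail area beyond t = 2.0 with 9 degrees of freedom.
upper_tail = tcdf(2.0, float("inf"), 9)
```

On the calculator an infinite bound is usually typed as 1E99; here either `float("inf")` or `1e99` should behave the same way.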
Student’s t-distribution
Assumptions
• Simple Random Sample
• Independence: N > 10n
• Normality
The sample must be approx. Normal to indicate the Normality of the sampling distribution
(1) Histogram: single peak, symmetric (note: slight skew is OK, but must be mentioned)
(2) Normal probability plot: approx. linear
(3) NO OUTLIERS
Example 12.2
Tasters use a “sweetness scale” of 1 to 10. Cola is rated before and after a month of storage in high temperature. The differences are shown below. The bigger the difference, the greater the loss of sweetness. Does the data indicate that sweetness was lost after the storage interval?
2.0, 0.4, 0.7, 2.0, -0.4, 2.2, -1.3, 1.2, 1.1, 2.3
Example 12.2
Parameter
• Let μ = the population mean sweetness lost after a month of storage at high temperature
• Let xbar = the sample (n = 10) mean sweetness loss after a month of storage at high temperature
Example 12.2
Hypotheses
• H0: μ = 0
This indicates that there is no sweetness loss
• Ha: μ > 0
This indicates that there is sweetness loss
Example 12.2
Assumptions
• Simple Random Sample
We are not told that our data represent an SRS. We must assume that this sample is an SRS (or acts like an SRS) and proceed.
• Independence
We can be assured that the population of cola is greater than 10(10) = 100
Example 12.2
Assumptions (cont.)
• Normality
Our histogram is single peaked with a slight left skew
The Normal probability plot is approximately linear
There are no outliers
Example 12.2
Test Statistic
$n = 10,\ \bar{x} = 1.02,\ s = 1.1961,\ df = 10 - 1 = 9$
$t_9 = \dfrac{1.02 - 0}{1.1961/\sqrt{10}} = 2.697$
P-value
$\text{P-val} = P(t_9 > 2.697) = 0.0123$
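The summary statistics in this example can be recomputed from the ten differences listed in the problem. The following is a minimal sketch using only Python's standard library; the variable names are illustrative and not part of the original slides.

```python
import math
import statistics

# Sweetness-loss differences from Example 12.2.
diffs = [2.0, 0.4, 0.7, 2.0, -0.4, 2.2, -1.3, 1.2, 1.1, 2.3]

n = len(diffs)
x_bar = statistics.mean(diffs)   # sample mean
s = statistics.stdev(diffs)      # sample standard deviation (divides by n - 1)
df = n - 1

# One-sample t statistic against the null value mu_0 = 0.
mu_0 = 0
t = (x_bar - mu_0) / (s / math.sqrt(n))

print(f"n={n}, x_bar={x_bar:.2f}, s={s:.4f}, df={df}, t={t:.3f}")
```

The P-value is then the upper-tail area of the t distribution with 9 degrees of freedom beyond this t, which the slides obtain with the calculator.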
Example 12.2
Make a decision
We are going to “reject” (our p-value is small)
Summarize
If there were truly no sweetness loss, a sample of size 10 would produce a mean sweetness loss of at least 1.02 only approximately 1% of the time.
Since the p-value is smaller than a presumed α = 0.05, we reject the null hypothesis.
We have evidence to conclude that the mean sweetness loss is greater than 0. Our new estimate for the average sweetness loss is 1.02.
Paired t-tests
• When a sample is produced using a matched pair design, the data used in the significance test is the difference of the two measurements
• Some typical examples of a paired t-test would be a “pre-test and post-test” as well as the previous example.
• The important thing here is to recognize the matched pairs design and to work with the differences of the scores (and not the scores themselves)
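As a rough illustration of working with differences rather than raw scores, the sketch below runs the first steps of a paired t-test on a pre-test/post-test setup. The scores are hypothetical numbers invented for this illustration; they do not come from the slides or the textbook.

```python
import math
import statistics

# Hypothetical pre-test and post-test scores for five subjects (illustration only).
pre = [72, 65, 80, 58, 90]
post = [75, 70, 78, 66, 93]

# A matched pairs design is analysed on the per-subject differences,
# not on the two columns of raw scores.
diffs = [after - before for before, after in zip(pre, post)]

n = len(diffs)
d_bar = statistics.mean(diffs)   # mean of the differences
s_d = statistics.stdev(diffs)    # standard deviation of the differences

# From here on it is an ordinary one-sample t-test with H0: mu = 0.
t = (d_bar - 0) / (s_d / math.sqrt(n))
df = n - 1
```

Once the differences are formed, the two original columns play no further role in the test.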
Example 12.5
We will work through the first few steps and leave the rest for you on your own.
Parameter
• μ = the population mean difference in time to complete the maze
• xbar = the sample (n = 21) mean difference in time to complete the maze
Example 12.5
Hypotheses
• H0: μ = 0
• Ha: μ > 0 (the scented mask decreases average time to complete maze)
• Remember: we are looking at the “difference” column only!
Example 12.5
Assumptions
• Simple Random Sample
The data come from a randomized matched pairs design; we will have to assume that this is an SRS of the population and proceed with the test
• Independence
We must assume that the population is greater than 10(21) = 210 and that the scented and unscented trials are independent; we will proceed as though this condition is satisfied
Example 12.5
Assumptions (cont.)
• Normality
Our histogram is single peaked with a slight right skew
The Normal probability plot is approximately linear
There are no outliers
Example 12.5
Name of the Test
• We will use a “paired t-test for a mean”
Test Statistic
$n = 21,\ \bar{x} = 0.9567,\ s = 12.5479,\ df = 20$
$t_{20} = \dfrac{0.9567 - 0}{12.5479/\sqrt{21}} = 0.349$
• You can do the rest, yeah?
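The arithmetic behind this test statistic can be checked from the summary statistics alone. The helper name `t_statistic` below is a hypothetical choice for this sketch; only the four numbers passed to it come from the slide.

```python
import math

def t_statistic(x_bar, mu_0, s, n):
    """One-sample (or paired) t statistic computed from summary statistics."""
    standard_error = s / math.sqrt(n)
    return (x_bar - mu_0) / standard_error

# Summary statistics of the "difference" column in Example 12.5.
t = t_statistic(x_bar=0.9567, mu_0=0, s=12.5479, n=21)
df = 21 - 1

print(f"t = {t:.3f} with {df} degrees of freedom")
```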
Robustness
• t-procedures are robust against non-Normal populations except in the presence of outliers
Guidelines for using t-procedures
• n < 15: data must be approx. Normal, no outliers
• n ≥ 15: data can have slight skew, no outliers
• n > 30: data can have skew
Assignment 12.1
• Page 745 #1, 3, 6, 9, 10, 12, 16, 20
12.2 TESTS ABOUT A POPULATION PROPORTION
z-tests for proportion
• Again, we have already introduced most of the material; this is just another significance test.
• Unlike tests for means, tests for proportions will always be a z-test
• We will review some of the key information:
Assumptions for proportions
• Simple Random Sample
• Independence: N > 10n
• Normality (of sampling distribution): np > 10 and nq > 10
Remember that np and nq are just the numbers of responses in each category
Test Statistic for proportions
If H0: p = p0
$z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}}$
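The test statistic above, together with the Normality condition from the assumptions slide, can be sketched as a small Python function. The name `one_prop_z` and the choice to raise an error when the condition fails are assumptions made for this illustration, not something the slides prescribe.

```python
import math

def one_prop_z(successes, n, p0):
    """z statistic for H0: p = p0, using the null value p0 in the standard error."""
    q0 = 1 - p0
    # Normality condition from the slides: np > 10 and nq > 10 (checked with p0).
    if n * p0 <= 10 or n * q0 <= 10:
        raise ValueError("Normality condition fails: need n*p0 > 10 and n*q0 > 10")
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * q0 / n)

# Example call: 32 of 100 responses in the category of interest, null value 0.25.
z = one_prop_z(32, 100, 0.25)
```

Note that the standard error is built from p0 and q0, not from the sample proportion, because the calculation is carried out under the assumption that H0 is true.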
Example 12.8
A random sample of 100 workers from a large chain restaurant were asked whether or not work stress had a negative impact on their personal lives. Thirty-two of them responded “No.” A large national survey reported that 25% of restaurant workers did not feel that stress exerted a negative impact.
Does this large chain restaurant have the same work stress as the nation?
Example 12.8
• We are going to use the national survey as our known population proportion.
Parameter
p = the proportion of all workers at this chain who feel that work stress does not have a negative impact on their personal lives.
p-hat = the proportion of the sample of 100 workers who feel that work stress does not have a negative impact on their personal lives.
Example 12.8
Hypotheses
H0: p = 0.25
Ha: p ≠ 0.25
Assumptions
Simple Random Sample
We are told that our sample is a random sample. We will treat this as an SRS.
Independence
Although we are not told, we will make the assumption that there are more than 10(100) = 1000 workers for this national chain
Normality
np0 = 100(0.25) = 25 > 10 and nq0 = 100(0.75) = 75 > 10
Our sampling distribution is approximately Normal
Example 12.8
Name of Test
We will conduct a 1-proportion z-test
(Note that this will be a two-tailed test)
Test Statistic
$z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0 q_0}{n}}} = \dfrac{0.32 - 0.25}{\sqrt{\dfrac{(0.25)(0.75)}{100}}} = 1.617$
Example 12.8
P-value
$\text{P-val} = 2P(z > 1.617) = 0.1059$
Decision
Fail to reject
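The two-tailed P-value can be reproduced with the standard library's `statistics.NormalDist`. This is a sketch of the arithmetic only; the cutoff `alpha = 0.05` is the presumed significance level used in the summary of this example.

```python
from statistics import NormalDist

z = 1.617
alpha = 0.05  # presumed significance level

# Two-tailed test: double the area beyond |z| under the standard Normal curve.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(f"P-value = {p_value:.4f}: {decision}")
```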
Example 12.8
Summarize
If the true proportion were 0.25, a sample of size 100 would produce a proportion at least as extreme as 0.32 approximately 11% of the time.
Since this is not less than a presumed α = 0.05, we fail to reject H0.
We do not have enough evidence to conclude that the proportion of workers of this national chain who feel that work stress does not negatively affect their personal lives differs from 0.25.
Power
The preceding example should make you feel a bit uncomfortable: that p-value was not large. We don’t really expect the proportion to be exactly equal to the national proportion!
Most likely, the restaurant’s proportion was not that different from the national proportion: maybe just a few percentage points greater.
Our test didn’t have enough power to detect the difference between the two proportions!
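To make “not enough power” concrete, the sketch below estimates the power of the two-tailed 1-proportion z-test with n = 100 and p0 = 0.25 under the Normal approximation. The alternative value 0.28 is a hypothetical “few percentage points greater” chosen purely for illustration, and the function name `power_one_prop` is likewise an assumption.

```python
import math
from statistics import NormalDist

def power_one_prop(p_true, p0=0.25, n=100, alpha=0.05):
    """Approximate probability that the two-tailed z-test rejects H0: p = p0
    when the true proportion is p_true (Normal approximation)."""
    z_star = NormalDist().inv_cdf(1 - alpha / 2)
    se_null = math.sqrt(p0 * (1 - p0) / n)

    # The test rejects when p-hat falls outside these cutoffs.
    lower_cut = p0 - z_star * se_null
    upper_cut = p0 + z_star * se_null

    # Sampling distribution of p-hat when the true proportion is p_true.
    sampling = NormalDist(mu=p_true, sigma=math.sqrt(p_true * (1 - p_true) / n))
    return sampling.cdf(lower_cut) + (1 - sampling.cdf(upper_cut))

# Hypothetical true proportion a few points above the national value.
power_at_028 = power_one_prop(0.28)
print(f"Power against p = 0.28: {power_at_028:.3f}")
```

Under these assumptions the test would reject only a small fraction of the time, which is the sense in which a sample of 100 lacks power against a difference this small.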
Confidence Intervals and Significance Tests
• Let’s calculate the confidence interval for our proportion in the preceding example.
$\text{CI} = \hat{p} \pm z^* \sqrt{\dfrac{\hat{p}\hat{q}}{n}}$
Notice that the confidence interval uses p-hat and q-hat for the standard error
$\text{CI} = 0.32 \pm z^* \sqrt{\dfrac{(0.32)(0.68)}{100}}$
CI = (0.2433, 0.3967)
Our interval contains 0.25, so 0.25 is a plausible value for the proportion; i.e. the proportion could be 0.25!
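The interval can be recomputed with a short Python sketch. The slides do not state the confidence level; the endpoints 0.2433 and 0.3967 correspond to z* ≈ 1.645, so the 90% level used below is inferred from the numbers rather than taken from the source. The function name `one_prop_ci` is hypothetical.

```python
import math
from statistics import NormalDist

def one_prop_ci(successes, n, confidence):
    """Confidence interval for a proportion, using p-hat in the standard error."""
    p_hat = successes / n
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 90% is an inferred level (see the note above), not one stated on the slide.
low, high = one_prop_ci(32, 100, confidence=0.90)
contains_null = low < 0.25 < high

print(f"CI = ({low:.4f}, {high:.4f}); contains 0.25: {contains_null}")
```

Unlike the test statistic, which uses p0 and q0, the interval has no hypothesized value to lean on, so its standard error is built from the sample proportion.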
Calculator Usage
As you may have already noticed, the TI calculators automate many of these calculations.
Of course, this does not excuse you from writing out the PHANTOMS or PANIC procedures, or even showing your calculations!
Calculator Usage
TI83/84: Since these functions are menu driven, we will just list the tests and their usage
[STAT] → “TESTS”
Z-Test = one or two tailed z-test for means
T-Test = one or two tailed t-test for means
1-PropZTest = one or two tailed test for proportions
ZInterval = confidence interval for mean (σ known)
TInterval = confidence interval for mean (σ unknown)
1-PropZInt = confidence interval for proportions