justina-martin
Lab 5
Hypothesis testing and Confidence Interval
Outline
One sample t-test
Two sample t-test
Paired t-test
Lab 5
One-sample t-test
The hypotheses
One sided:
H0 : μ ≤ μ0 against H1 : μ > μ0
H0 : μ ≥ μ0 against H1 : μ < μ0
Two sided:
H0 : μ = μ0 against H1 : μ ≠ μ0
Test statistic
t = (x̄ − μ0) / (s / √n) ~ t(n−1)
Conclusion
Compare the test statistic with the critical value
… Compare the p-value with the level of significance α (e.g. 0.05, 0.1)
Reject H0 if p-value < α (enough evidence)
Cannot reject H0 if p-value ≥ α (not enough evidence)
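As a rough sketch of the two decision rules above, the test statistic, critical value and p-value can be computed by hand for a two-sided test. The vector x below is hypothetical illustration data (it is not the biotest.txt sample), and mu0 and alpha are chosen only for the example.

```r
# Hypothetical sample, used only for illustration (not the biotest.txt data)
x <- c(112, 118, 109, 121, 115, 117, 110, 123, 114, 119)
mu0 <- 115      # hypothesised mean under H0
alpha <- 0.05   # level of significance

n <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))   # one-sample t statistic

# Rule 1: compare |t| with the two-sided critical value of t(n - 1)
t_crit <- qt(1 - alpha / 2, df = n - 1)
reject_by_critical_value <- abs(t_stat) > t_crit

# Rule 2: compare the two-sided p-value with alpha
p_value <- 2 * pt(-abs(t_stat), df = n - 1)
reject_by_p_value <- p_value < alpha
```

Both rules should lead to the same decision; the p-value route is the one read off the t.test() output.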
Example
Download the biotest.txt data file
Read into R using function read.table()
Extract the 1st column and store as ‘X1’
Store the 2nd column as ‘X2’
> X1 = read.table("biotest.txt")[, 1]
> X2 = read.table("biotest.txt")[, 2]
Example
Take ‘X1’ as the sample in this case.
Test H0 : μ = 115 against H1 : μ ≠ 115
at significance level α = 0.05
[R] command
t.test()
Syntax: t.test(x = data, alternative = "two.sided" / "less" / "greater", mu = μ0)
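Besides printing a summary, t.test() returns an object of class "htest" whose components can be extracted by name. A minimal sketch, using a hypothetical vector x rather than the lab data:

```r
# Hypothetical sample, used only for illustration
x <- c(112, 118, 109, 121, 115, 117, 110, 123, 114, 119)

res <- t.test(x, alternative = "two.sided", mu = 115)

res$statistic   # the t statistic
res$parameter   # degrees of freedom (n - 1)
res$p.value     # p-value of the test
res$conf.int    # confidence interval for the mean
res$estimate    # sample mean
```

Extracting the components this way is handy when the p-value or interval is needed in later calculations.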
Example 1
> t.test(X1, alternative = "two.sided", mu = 115)
One Sample t-test
data: X1
t = 0.1841, df = 9, p-value = 0.858
alternative hypothesis: true mean is not equal to 115
95 percent confidence interval:
108.2257 122.9743
sample estimates:
mean of x
115.6
The p-value (0.858) is larger than 0.05
Cannot reject H0 at 0.05 level of significance
μ0 = 115 lies inside the 95% CI (108.2257, 122.9743), which agrees with not rejecting H0 at the 0.05 level
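The link between the two-sided test and the confidence interval can be checked directly: μ0 falls inside the (1 − α) interval exactly when the two-sided p-value exceeds α. The sketch below assumes a hypothetical sample x, not the lab data.

```r
# Hypothetical sample, used only for illustration
x <- c(112, 118, 109, 121, 115, 117, 110, 123, 114, 119)
mu0 <- 115
alpha <- 0.05
n <- length(x)

res <- t.test(x, mu = mu0, conf.level = 1 - alpha)

# The same interval by hand: mean +/- critical value * standard error
ci_by_hand <- mean(x) + c(-1, 1) * qt(1 - alpha / 2, df = n - 1) * sd(x) / sqrt(n)

inside_ci <- res$conf.int[1] < mu0 && mu0 < res$conf.int[2]
not_rejected <- res$p.value > alpha
```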
Example 2
Test H0 : μ ≤ 108 against H1 : μ > 108
at significance level α = 0.05
> t.test(X1, alternative = "greater", mu = 108)
One Sample t-test
data: X1
t = 2.3314, df = 9, p-value = 0.02232
alternative hypothesis: true mean is greater than 108
95 percent confidence interval:
109.6243 Inf
sample estimates:
mean of x
115.6
The p-value (0.02232) is smaller than 0.05
Reject H0 at 0.05 level of significance
Conclude that the population mean is significantly greater than 108
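For a one-sided alternative of the "greater" kind, the p-value is the upper-tail probability of the t statistic, and the confidence interval has only a lower bound. A small sketch under the assumption of a hypothetical sample x (not the lab data):

```r
# Hypothetical sample, used only for illustration
x <- c(112, 118, 109, 121, 115, 117, 110, 123, 114, 119)
mu0 <- 108
n <- length(x)
se <- sd(x) / sqrt(n)

t_stat <- (mean(x) - mu0) / se

# Upper-tail p-value for H1: mu > mu0
p_greater <- pt(t_stat, df = n - 1, lower.tail = FALSE)

# One-sided 95% interval: (lower_bound, Inf)
lower_bound <- mean(x) - qt(0.95, df = n - 1) * se

res <- t.test(x, alternative = "greater", mu = mu0)
```

This is why the t.test() output for a "greater" alternative reports Inf as the upper end of the interval.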
Statistical significance vs. practical significance
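One way to see the distinction: with a very large sample, even a tiny departure from μ0 can give a very small p-value. The sketch below uses constructed, hypothetical data (10,000 values alternating around 100.1) tested against μ0 = 100; whether a shift of 0.1 matters in practice is a subject-matter question the test cannot answer.

```r
# Constructed, hypothetical data: mean 100.1, standard deviation about 1, n = 10000
x <- 100.1 + rep(c(-1, 1), times = 5000)

res <- t.test(x, mu = 100)

effect <- mean(x) - 100   # estimated shift from mu0
res$p.value               # very small, because n is large
```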
Confidence Interval
By default, the function t.test() includes a 95% confidence interval
Question: Can we change the confidence level?
Yes, with the conf.level argument, e.g. for a 99% confidence interval
> t.test(X1, alternative = "greater", mu = 108, conf.level = 0.99)
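To see the effect of conf.level, the intervals at two levels can be compared. The sketch below uses a hypothetical sample x and the default two-sided alternative, so both ends of the interval are finite (with alternative = "greater", as in the call above, the upper end is Inf).

```r
# Hypothetical sample, used only for illustration
x <- c(112, 118, 109, 121, 115, 117, 110, 123, 114, 119)

ci95 <- t.test(x, mu = 108)$conf.int                     # default: 95%
ci99 <- t.test(x, mu = 108, conf.level = 0.99)$conf.int  # 99%

width95 <- ci95[2] - ci95[1]
width99 <- ci99[2] - ci99[1]   # a higher confidence level gives a wider interval
```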
Lab 5
Two-sample t-test
Testing the population means of two independent samples
Two-sided
H0 : μA = μB against H1 : μA ≠ μB
i.e. H0 : μA − μB = 0 against H1 : μA − μB ≠ 0
One-sided
H0 : μA ≤ μB against H1 : μA > μB
H0 : μA ≥ μB against H1 : μA < μB
Example 3
Consider the two samples X1 and X2
Want to test if there is a significant difference between the mean of X1 and the mean of X2.
Two-sided test
H0 : μ1 = μ2 against H1 : μ1 ≠ μ2
at 0.05 level of significance
Assuming equal variance
> t.test(X1, X2, alternative = "two.sided", var.equal = TRUE)
Two Sample t-test
data: X1 and X2
t = -0.9052, df = 18, p-value = 0.3773
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-15.940831 6.340831
sample estimates:
mean of x mean of y
115.6 120.4
Not assuming equal variance?
> t.test(X1, X2, alternative = "two.sided", var.equal = FALSE)
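The two settings can be compared side by side. With var.equal = TRUE the pooled-variance statistic with n1 + n2 − 2 degrees of freedom is used; with var.equal = FALSE (which is also the default of t.test()) R applies the Welch approximation, whose degrees of freedom are generally not an integer. The vectors a and b below are hypothetical illustration data.

```r
# Hypothetical samples, used only for illustration
a <- c(112, 118, 109, 121, 115, 117, 110, 123, 114, 119)
b <- c(125, 108, 131, 119, 140, 112, 127, 135, 104, 122)
na <- length(a)
nb <- length(b)

pooled <- t.test(a, b, var.equal = TRUE)    # pooled-variance t-test
welch  <- t.test(a, b, var.equal = FALSE)   # Welch t-test

# Pooled statistic by hand
sp2 <- ((na - 1) * var(a) + (nb - 1) * var(b)) / (na + nb - 2)
t_pooled <- (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))

pooled$parameter   # na + nb - 2
welch$parameter    # Welch degrees of freedom, at most na + nb - 2
```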
Lab 5
Paired t-test
Two-sample problem, but the samples are no longer independent
Example:
Measurements taken twice, at different time points, from the same group of subjects
Blood pressure before and after some treatment
Want to test the difference of the means
If we take the difference of the measurements of each subject, the problem reduces to a one-sample problem.
The rest is the same as a one-sample t-test.
x1 − y1 = d1
x2 − y2 = d2
x3 − y3 = d3
x4 − y4 = d4
Example 4
Consider again the datasets X1 and X2, and assume they are paired observations
Test the equality of the means, i.e. test if the difference in means = 0
H0 : μ1 = μ2 against H1 : μ1 ≠ μ2
at 0.05 level of significance
> t.test(X1, X2, alternative = "two.sided", paired = TRUE)
Paired t-test
data: X1 and X2
t = -3.3247, df = 9, p-value = 0.008874
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-8.066013 -1.533987
sample estimates:
mean of the differences
-4.8
Alternatively…
> t.test(X1 - X2, alternative = "two.sided")
One Sample t-test
data: X1 - X2
t = -3.3247, df = 9, p-value = 0.008874
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-8.066013 -1.533987
sample estimates:
mean of x
-4.8
EXACTLY THE SAME RESULT!!
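The equivalence can also be checked programmatically rather than by eye. A minimal sketch, with hypothetical before/after measurements standing in for X1 and X2:

```r
# Hypothetical paired measurements, used only for illustration
before <- c(120, 135, 110, 142, 128, 117, 150, 125, 133, 139)
after  <- c(123, 140, 112, 148, 132, 118, 157, 128, 138, 143)

paired   <- t.test(before, after, paired = TRUE)   # paired t-test
one_samp <- t.test(before - after)                 # one-sample t-test on the differences

# Statistic, degrees of freedom and p-value agree
same_t  <- isTRUE(all.equal(unname(paired$statistic), unname(one_samp$statistic)))
same_df <- unname(paired$parameter) == unname(one_samp$parameter)
same_p  <- isTRUE(all.equal(paired$p.value, one_samp$p.value))
```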
Final Remarks
Notice that the conclusions from the two-sample t-test and the paired t-test are different, even though we are looking at the same data set.
Check whether the two samples are independent or not before choosing the test.
Using the wrong test leads either to a loss of sensitivity or to an invalid analysis.
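The loss of sensitivity can be illustrated with paired data in which subjects differ a lot from each other but each subject changes by a similar, small amount. The measurements below are hypothetical; treating them as independent samples hides a difference that the paired test picks up.

```r
# Hypothetical paired measurements: large between-subject spread, small consistent change
before <- c(120, 135, 110, 142, 128, 117, 150, 125, 133, 139)
after  <- c(123, 140, 112, 148, 132, 118, 157, 128, 138, 143)

p_paired   <- t.test(before, after, paired = TRUE)$p.value      # uses within-subject differences
p_unpaired <- t.test(before, after, var.equal = TRUE)$p.value   # wrongly assumes independence

c(paired = p_paired, unpaired = p_unpaired)
```

Here the paired p-value is much smaller than the unpaired one, because the between-subject variability no longer enters the standard error.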