Lab 5 Hypothesis testing and Confidence Interval

Outline

One sample t-test

Two sample t-test

Paired t-test

Lab 5

One-sample t-test

One sample t-test

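The hypotheses:

One sided

$$H_0: \mu \le \mu_0 \quad \text{against} \quad H_1: \mu > \mu_0$$

$$H_0: \mu \ge \mu_0 \quad \text{against} \quad H_1: \mu < \mu_0$$

Two sided

$$H_0: \mu = \mu_0 \quad \text{against} \quad H_1: \mu \ne \mu_0$$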

One sample t-test

Test statistic

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \sim t_{n-1} \quad \text{under } H_0$$
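To make the one-sample test statistic concrete, here is a minimal sketch that computes t = (x̄ − μ0) / (s / √n) by hand and compares it with what `t.test()` reports. The vector `x` and the value `mu0` are made-up illustration values, not the biotest.txt data.

```r
# Hypothetical sample (illustration only; not the lab data)
x   <- c(112, 121, 108, 117, 125, 103, 119, 114, 122, 110)
mu0 <- 115                                        # hypothesised mean under H0

n        <- length(x)
t_manual <- (mean(x) - mu0) / (sd(x) / sqrt(n))   # test statistic
df       <- n - 1                                 # degrees of freedom

# t.test() should report the same statistic and degrees of freedom
res <- t.test(x, mu = mu0)
print(c(manual = t_manual, t.test = unname(res$statistic)))
```

Here `sd()` is the sample standard deviation (divisor n − 1), which is the `s` in the formula.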

One sample t-test

Conclusion

Compare the test statistic with the critical value, or compare the p-value with the level of significance α (e.g. 0.05, 0.1):

Reject H0 if p-value < α (enough evidence)

Cannot reject H0 if p-value ≥ α (not enough evidence)
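The two decision rules can be sketched side by side. The numbers below are the ones this lab reports for Example 1 (t = 0.1841, df = 9, p-value = 0.858); the variable names are only for illustration.

```r
alpha <- 0.05

# Values as reported in this lab for Example 1 (two-sided test)
t_obs <- 0.1841
df    <- 9
p_val <- 0.858

# Critical-value approach: reject H0 if |t| exceeds the two-sided critical value
t_crit <- qt(1 - alpha / 2, df)
reject_by_critical <- abs(t_obs) > t_crit

# p-value approach: reject H0 if the p-value is below alpha
reject_by_pvalue <- p_val < alpha

print(c(critical = reject_by_critical, p.value = reject_by_pvalue))
```

For a given test both approaches lead to the same decision; here neither rejects H0.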

Example

Download the biotest.txt data file

Read into R using the function read.table()

Extract the 1st column and store as ‘X1’

Store the 2nd column as ‘X2’
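One possible way to carry out these steps is to read the file once and then pull both columns from the resulting data frame. The sketch below writes a small stand-in file with made-up values so that it runs on its own; it assumes the real file has no header row and whitespace-separated columns, which is not confirmed by the slides.

```r
# Stand-in file so the sketch runs on its own; the values are made up.
# With the real data, point read.table() at "biotest.txt" instead.
path <- tempfile(fileext = ".txt")
writeLines(c("112 118", "121 126", "108 111"), path)

biotest <- read.table(path)   # assumes no header row, whitespace-separated
X1 <- biotest[, 1]            # first column
X2 <- biotest[, 2]            # second column

str(biotest)                  # quick check of what was read
```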

Example

> X1 = read.table("biotest.txt")[, 1]

> X2 = read.table("biotest.txt")[, 2]

Example

Take ‘X1’ as the sample in this case,

Test H0 : μ = 115 against H1 : μ ≠ 115

at significance level α = 0.05

[R] command

t.test()

Syntax: t.test(x = data, alternative = "two.sided" / "less" / "greater", mu = μ0)
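As a small illustration of the `alternative` argument, the sketch below runs all three variants on the same made-up sample (not the lab data) and shows how their p-values relate.

```r
# Hypothetical sample (illustration only)
x <- c(112, 121, 108, 117, 125, 103, 119, 114, 122, 110)

p_two     <- t.test(x, alternative = "two.sided", mu = 115)$p.value
p_less    <- t.test(x, alternative = "less",      mu = 115)$p.value
p_greater <- t.test(x, alternative = "greater",   mu = 115)$p.value

# The one-sided p-values sum to 1, and the two-sided p-value
# is twice the smaller of the two.
print(c(two.sided = p_two, less = p_less, greater = p_greater))
```

If `alternative` is omitted, `t.test()` uses "two.sided".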

Example 1

> t.test(X1, alternative = "two.sided", mu = 115)

        One Sample t-test

data:  X1
t = 0.1841, df = 9, p-value = 0.858
alternative hypothesis: true mean is not equal to 115
95 percent confidence interval:
 108.2257 122.9743
sample estimates:
mean of x
    115.6

The p-value (0.858) is larger than 0.05

Cannot reject H0 at the 0.05 level of significance

μ0 = 115 lies inside the 95% CI (108.2257, 122.9743), which agrees with not rejecting H0 at the 0.05 level
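The link between the confidence interval and the two-sided test can be checked from the printed output alone. This sketch backs out the standard error from the reported interval and recomputes the statistic and p-value; only the numbers come from the Example 1 output, the variable names are illustrative.

```r
# Numbers copied from the Example 1 output
xbar  <- 115.6
lower <- 108.2257
upper <- 122.9743
df    <- 9
mu0   <- 115

# Back out the standard error from the width of the 95% interval
se <- (upper - lower) / (2 * qt(0.975, df))

t_stat <- (xbar - mu0) / se          # should be close to the reported 0.1841
p_val  <- 2 * pt(-abs(t_stat), df)   # should be close to the reported 0.858

# mu0 inside the 95% interval  <=>  two-sided p-value above 0.05
inside <- mu0 > lower && mu0 < upper
print(c(t = t_stat, p = p_val, inside = inside))
```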

Example 2

Test H0 : μ ≤ 108 against H1 : μ > 108

at significance level α = 0.05

Example 2

> t.test(X1, alternative = "greater", mu = 108)

        One Sample t-test

data:  X1
t = 2.3314, df = 9, p-value = 0.02232
alternative hypothesis: true mean is greater than 108
95 percent confidence interval:
 109.6243      Inf
sample estimates:
mean of x
    115.6

The p-value (0.02232) is smaller than 0.05

Reject H0 at the 0.05 level of significance

Example 2

Conclude that the population mean is significantly greater than 108
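For a one-sided test the p-value is a single tail area and the confidence interval is one-sided as well. The sketch below recomputes both from the numbers printed in the Example 2 output; the variable names are illustrative.

```r
# Numbers copied from the Example 2 output
t_stat <- 2.3314
df     <- 9
xbar   <- 115.6
mu0    <- 108

# Upper-tail p-value for H1: mu > 108
p_val <- pt(t_stat, df, lower.tail = FALSE)   # close to the reported 0.02232

# One-sided 95% lower confidence bound; the upper end is Inf
se          <- (xbar - mu0) / t_stat
lower_bound <- xbar - qt(0.95, df) * se       # close to the reported 109.6243

print(c(p = p_val, lower = lower_bound))
```

Since 108 lies below the lower bound, the interval and the test agree on rejecting H0.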

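Statistical significance vs. practical significance: a small p-value shows that the difference from 108 is unlikely to be due to chance alone, not that the difference is large enough to matter in practice.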
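A rough way to see the distinction is to simulate a very large sample whose true mean is only slightly above the null value. Everything in this sketch (the seed, the sample size, the true mean 108.2 and the standard deviation 10) is an assumption chosen for illustration.

```r
set.seed(1)                      # arbitrary seed, for reproducibility

# Simulated data: true mean is 108.2, only 0.2 above the null value
x <- rnorm(1e5, mean = 108.2, sd = 10)

res <- t.test(x, alternative = "greater", mu = 108)

res$p.value                      # expected to be very small with n this large
mean(x) - 108                    # estimated difference, roughly 0.2 units
(mean(x) - 108) / sd(x)          # standardised effect size, roughly 0.02
```

The test rejects H0 comfortably, yet the estimated difference is tiny relative to the spread of the data, so whether it matters has to be judged on subject-matter grounds.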

Confidence Interval

By default, the function t.test() includes a 95% confidence interval

Question: Can we change the confidence level?

Confidence Interval

e.g. want a 99% confidence interval

> t.test(X1, alternative = "greater", mu = 108, conf.level = 0.99)
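The interval can also be pulled out of the result object through its `conf.int` component. The sketch below compares the default 95% interval with a 99% one on a made-up sample (not the lab data), using the two-sided form so that both intervals have a finite width.

```r
# Hypothetical sample (illustration only)
x <- c(112, 121, 108, 117, 125, 103, 119, 114, 122, 110)

ci95 <- t.test(x)$conf.int                     # default: 95%
ci99 <- t.test(x, conf.level = 0.99)$conf.int  # 99%

# A higher confidence level gives a wider interval around the same mean
print(rbind(ci95 = as.numeric(ci95), ci99 = as.numeric(ci99)))
```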

Lab 5

Two-sample t-test

Two-sample t-test

Testing the difference between the population means of two independent samples

Two-sample t-test

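Two-sided

$$H_0: \mu_A = \mu_B \quad \text{against} \quad H_1: \mu_A \ne \mu_B$$

or, equivalently,

$$H_0: \mu_A - \mu_B = 0 \quad \text{against} \quad H_1: \mu_A - \mu_B \ne 0$$

One-sided

$$H_0: \mu_A \le \mu_B \quad \text{against} \quad H_1: \mu_A > \mu_B$$

$$H_0: \mu_A \ge \mu_B \quad \text{against} \quad H_1: \mu_A < \mu_B$$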

Example 3

Consider the two samples X1 and X2

Want to test if there is a significant difference between the mean of X1 and the mean of X2.

Example 3

Two sided test

H0 : μ1 = μ2 against H1 : μ1 ≠ μ2

at 0.05 level of significance

Assuming equal variance

Example 3

> t.test(X1, X2, alternative = "two.sided", var.equal = TRUE)

        Two Sample t-test

data:  X1 and X2
t = -0.9052, df = 18, p-value = 0.3773
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -15.940831   6.340831
sample estimates:
mean of x mean of y
    115.6     120.4

Example 3

Not assuming equal variance?

> t.test(X1, X2, alternative = "two.sided", var.equal = FALSE)
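With `var.equal = FALSE`, which is also the default, `t.test()` runs the Welch version of the two-sample test. The sketch below contrasts it with the pooled-variance version on two made-up samples (not the lab data).

```r
# Hypothetical independent samples (illustration only)
a <- c(112, 121, 108, 117, 125, 103, 119, 114, 122, 110)
b <- c(118, 131, 109, 127, 140, 101, 129, 116, 135, 112)

pooled <- t.test(a, b, var.equal = TRUE)    # pooled-variance t-test
welch  <- t.test(a, b, var.equal = FALSE)   # Welch t-test (the default)

# Pooled test uses df = n1 + n2 - 2; Welch uses an approximate,
# usually non-integer df that is never larger than the pooled df.
print(c(pooled_df = unname(pooled$parameter),
        welch_df  = unname(welch$parameter)))
print(c(pooled_p = pooled$p.value, welch_p = welch$p.value))
```

When the two samples have the same size, as here, both versions give the same t statistic and differ only in the degrees of freedom.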

Lab 5

Paired t-test

Paired t-test

A two-sample problem, but the samples are no longer independent

Example:

Measurements taken twice, at different time points, from the same group of subjects

Blood pressure before and after some treatment

Want to test the difference of the means

Paired t-test

If we take the difference of the measurements for each subject, the problem reduces to a one-sample problem.

The rest is the same as a one-sample t-test.

$$d_i = x_i - y_i, \quad i = 1, \dots, 4$$
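The reduction to a one-sample problem can be sketched directly: form the per-subject differences and apply the one-sample formula to them. The `before` and `after` vectors are made-up paired measurements, not the lab data.

```r
# Hypothetical paired measurements on the same subjects (illustration only)
before <- c(110, 125, 102, 131, 118, 140,  95, 122, 113, 128)
after  <- c(114, 131, 107, 134, 125, 145,  99, 128, 118, 133)

d <- before - after                               # per-subject differences
t_manual <- mean(d) / (sd(d) / sqrt(length(d)))   # one-sample t on d, mu0 = 0

# Same statistic from the paired test and from a one-sample test on d
t_paired <- unname(t.test(before, after, paired = TRUE)$statistic)
t_onesam <- unname(t.test(d, mu = 0)$statistic)

print(c(manual = t_manual, paired = t_paired, one.sample = t_onesam))
```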

Example 4

Consider again the dataset X1 and X2, and assume they are pairwise observations

Test the equality of the means

i.e. test if difference in mean = 0

H0 : μ1 = μ2 against H1 : μ1 ≠ μ2

at 0.05 level of significance

Example 4

> t.test(X1, X2, alternative = "two.sided", paired = TRUE)

        Paired t-test

data:  X1 and X2
t = -3.3247, df = 9, p-value = 0.008874
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -8.066013 -1.533987
sample estimates:
mean of the differences
                   -4.8

Alternatively…

> t.test(X1 - X2, alternative = "two.sided")

        One Sample t-test

data:  X1 - X2
t = -3.3247, df = 9, p-value = 0.008874
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
 -8.066013 -1.533987
sample estimates:
mean of x
     -4.8

Exactly the same result as the paired t-test!

Final Remarks

Notice that the conclusions from the two-sample t-test (p-value = 0.3773) and the paired t-test (p-value = 0.008874) are different even though we are looking at the same data set.

We should check whether the two samples are independent or not.

Using the wrong test leads to either a loss of sensitivity or an invalid analysis.
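The loss of sensitivity can be illustrated with made-up before/after data in which subjects differ a lot from one another but each subject shifts by a similar small amount. The vectors below are hypothetical, chosen only to make the effect visible.

```r
# Hypothetical before/after data: large between-subject spread,
# small but consistent within-subject change (illustration only)
before <- c(110, 125, 102, 131, 118, 140,  95, 122, 113, 128)
after  <- before + c(4, 6, 5, 3, 7, 5, 4, 6, 5, 5)

p_two_sample <- t.test(before, after, var.equal = TRUE)$p.value
p_paired     <- t.test(before, after, paired = TRUE)$p.value

# Treating paired data as independent hides the consistent shift
# behind the large between-subject spread.
print(c(two.sample = p_two_sample, paired = p_paired))
```

On these data the two-sample test does not reject H0 at the 0.05 level while the paired test does, the same pattern seen in Examples 3 and 4.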