22
Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: Null hypothesis - Statement regarding the value(s) of unknown parameter(s). Typically will imply no association between explanatory and response variables in our applications (will always contain an equality) Alternative hypothesis - Statement contradictory to the null hypothesis (will always contain an inequality) Test statistic - Quantity based on sample data and null hypothesis used to test between null and alternative hypotheses Rejection region - Values of the test statistic for which we reject the null in favor of the alternative hypothesis

Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Embed Size (px)

Citation preview

Page 1: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Hypothesis Testing

• Goal: Make statement(s) regarding unknown population parameter values based on sample data

• Elements of a hypothesis test:– Null hypothesis - Statement regarding the value(s) of unknown

parameter(s). Typically will imply no association between explanatory and response variables in our applications (will always contain an equality)

– Alternative hypothesis - Statement contradictory to the null hypothesis (will always contain an inequality)

– Test statistic - Quantity based on sample data and null hypothesis used to test between null and alternative hypotheses

– Rejection region - Values of the test statistic for which we reject the null in favor of the alternative hypothesis

Page 2: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Hypothesis Testing

Test Result –

True State

H0 True H0 False

H0 True CorrectDecision

Type I Error

H0 False Type II Error CorrectDecision

)()( ErrorIITypePErrorITypeP

• Goal: Keep , reasonably small

Page 3: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Example - Efficacy Test for New drug

• Drug company has new drug, wishes to compare it with current standard treatment

• Federal regulators tell company that they must demonstrate that new drug is better than current treatment to receive approval

• Firm runs clinical trial where some patients receive new drug, and others receive standard treatment

• Numeric response of therapeutic effect is obtained (higher scores are better).

• Parameter of interest: New - Std

Page 4: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Example - Efficacy Test for New drug

• Null hypothesis - New drug is no better than standard trt

00:0 StdNewStdNewH

• Alternative hypothesis - New drug is better than standard trt

0: StdNewAH

• Experimental (Sample) data:

StdNew

StdNew

StdNew

nn

ss

yy

Page 5: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Sampling Distribution of Difference in Means

• In large samples, the difference in two sample means is approximately normally distributed:

2

22

1

21

2121 ,~nn

NYY

• Under the null hypothesis, 1-2=0 and:

)1,0(~

2

22

1

21

21N

nn

YYZ

• 12 and 2

2 are unknown and estimated by s12 and s2

2

Page 6: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Example - Efficacy Test for New drug

• Type I error - Concluding that the new drug is better than the standard (HA) when in fact it is no better (H0). Ineffective drug is deemed better.

– Traditionally = P(Type I error) = 0.05

• Type II error - Failing to conclude that the new drug is better (HA) when in fact it is. Effective drug is deemed to be no better.

– Traditionally a clinically important difference ( is assigned and sample sizes chosen so that:

= P(Type II error | 1-2 = ) .20

Page 7: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Elements of a Hypothesis Test

• Test Statistic - Difference between the Sample means, scaled to number of standard deviations (standard errors) from the null difference of 0 for the Population means:

2

22

1

21

21:..

ns

ns

yyzST obs

• Rejection Region - Set of values of the test statistic that are consistent with HA, such that the probability it falls in this region when H0 is true is (we will always set =0.05)

645.105.0:.. zzzRR obs

Page 8: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

P-value (aka Observed Significance Level)

• P-value - Measure of the strength of evidence the sample data provides against the null hypothesis:

P(Evidence This strong or stronger against H0 | H0 is true)

)(: obszZPpvalP

Page 9: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Large-Sample Test H0:1-2=0 vs H0:1-2>0

• H0: 1-2 = 0 (No difference in population means

• HA: 1-2 > 0 (Population Mean 1 > Pop Mean 2)

)(:

:..

:..

2

22

1

21

21

obs

obs

obs

zZPvalueP

zzRR

ns

ns

yyzST

• Conclusion - Reject H0 if test statistic falls in rejection region, or equivalently the P-value is

Page 10: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Example - Botox for Cervical Dystonia

• Patients - Individuals suffering from cervical dystonia • Response - Tsui score of severity of cervical dystonia

(higher scores are more severe) at week 8 of Tx• Research (alternative) hypothesis - Botox A

decreases mean Tsui score more than placebo• Groups - Placebo (Group 1) and Botox A (Group 2)• Experimental (Sample) Results:

354.37.7

336.31.10

222

111

nsy

nsy

Source: Wissel, et al (2001)

Page 11: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Example - Botox for Cervical Dystonia

0024.)82.2(:

645.1:..

82.285.0

4.2

35)4.3(

33)6.3(

7.71.10:..

0:

0:

05.

22

21

210

ZPvalP

zzzRR

zST

H

H

obs

obs

A

Test whether Botox A produces lower mean Tsui scores than placebo ( = 0.05)

Conclusion: Botox A produces lower mean Tsui scores than placebo (since 2.82 > 1.645 and P-value < 0.05)

Page 12: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

2-Sided Tests

• Many studies don’t assume a direction wrt the difference 1-2

• H0: 1-2 = 0 HA: 1-2 0

• Test statistic is the same as before• Decision Rule:

– Conclude 1-2 > 0 if zobs z=0.05 z2=1.96)

– Conclude 1-2 < 0 if zobs -z=0.05 -z2= -1.96)

– Do not reject 1-2 = 0 if -zzobs z

• P-value: 2P(Z |zobs|)

Page 13: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Power of a Test

• Power - Probability a test rejects H0 (depends on 1- 2)

– H0 True: Power = P(Type I error) = – H0 False: Power = 1-P(Type II error) = 1-

· Example: · H0: 1- 2 = 0 HA: 1- 2 > 0

=

n1 = n2 = 25

· Decision Rule: Reject H0 (at =0.05 significance level) if:

326.2645.12

2121

2

22

1

21

21

yy

yy

nn

yyzobs

Page 14: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Power of a Test

• Now suppose in reality that 1-2 = 3.0 (HA is true)

• Power now refers to the probability we (correctly) reject the null hypothesis. Note that the sampling distribution of the difference in sample means is approximately normal, with mean 3.0 and standard deviation (standard error) 1.414.

• Decision Rule (from last slide): Conclude population means differ if the sample mean for group 1 is at least 2.326 higher than the sample mean for group 2

• Power for this case can be computed as:

)414.10.2,3(~)326.2( 2121 NYYYYP

Page 15: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Power of a Test

6844.)48.041.1

3326.2()326.2( 21

ZPYYPPower

• All else being equal:

• As sample sizes increase, power increases

• As population variances decrease, power increases

• As the true mean difference increases, power increases

Page 16: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Power of a Test

Distribution (H0) Distribution (HA)

Page 17: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Power of a Test

Power Curves for group sample sizes of 25,50,75,100 and varying true values 1-2 with 1=2=5.

• For given 1-2 , power increases with sample size

• For given sample size, power increases with 1-2

Page 18: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Sample Size Calculations for Fixed Power• Goal - Choose sample sizes to have a favorable chance of

detecting a clinically meaning difference

• Step 1 - Define an important difference in means:– Case 1: approximated from prior experience or pilot study - dfference

can be stated in units of the data

– Case 2: unknown - difference must be stated in units of standard deviations of the data

21

• Step 2 - Choose the desired power to detect the the clinically meaningful difference (1-, typically at least .80). For 2-sided test:

2

22/

21

2

zz

nn

Page 19: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Example - Rosiglitazone for HIV-1 Lipoatrophy

• Trts - Rosiglitazone vs Placebo• Response - Change in Limb fat mass• Clinically Meaningful Difference - 0.5 (std dev’s)• Desired Power - 1- = 0.80• Significance Level - = 0.05

63

)5.0(

84.096.12

84.96.1

2

2

21

20.2/

nn

zzz

Source: Carr, et al (2004)

Page 20: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Confidence Intervals

• Normally Distributed data - approximately 95% of individual measurements lie within 2 standard deviations of the mean

• Difference between 2 sample means is approximately normally distributed in large samples (regardless of shape of distribution of individual measurements):

2

22

1

21

2121 ,~nn

NYY

• Thus, we can expect (with 95% confidence) that our sample mean difference lies within 2 standard errors of the true difference

Page 21: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

(1-)100% Confidence Interval for 1-2

2

22

1

21

2/21 n

s

n

szyy

• Large sample Confidence Interval for 1-2:

• Standard level of confidence is 95% (z.025 = 1.96 2)

• (1-)100% CI’s and 2-sided tests reach the same conclusions regarding whether 1-2= 0

Page 22: Hypothesis Testing Goal: Make statement(s) regarding unknown population parameter values based on sample data Elements of a hypothesis test: –Null hypothesis

Example - Viagra for ED

• Comparison of Viagra (Group 1) and Placebo (Group 2) for ED

• Data pooled from 6 double-blind trials

• Subjects - White males

• Response - Percent of succesful intercourse attempts in past 4 weeks (Each subject reports his own percentage)

2403.425.23

2643.412.63

222

211

nsy

nsy

95% CI for 1- 2:

)0.47,4.32(3.77.39240

)3.42(

264

)3.41(96.1)5.232.63(

22

Source: Carson, et al (2002)