From diagnostic test to hypothesis test - MICEapps · From diagnostic test to hypothesis test. Plan • Revise concepts related to diagnostic tests • Learn concepts related to hypothesis

Kameshwar Prasad

Professor of Neurology, Former Chief, Neurosciences Centre

All India Institute of Medical Sciences, New Delhi

From diagnostic test to

hypothesis test

Plan

• Revise concepts related to diagnostic

tests

• Learn concepts related to hypothesis

testing

How do I teach sample size

calculation?

Dichotomous Outcome (2 Independent Samples)

• Test H0: p1 = p2 vs. HA: p1 ≠ p2

• Assuming two-sided alternative and equal allocation

***Always Round Up To Nearest Integer!

$$
n_{\text{per group}} = \frac{\left( z_{1-\alpha/2}\sqrt{2pq} + z_{1-\beta}\sqrt{p_1 q_1 + p_2 q_2} \right)^2}{\Delta^2}
$$

p1, p2 = projected true probabilities of “success” in the

two groups

q1 = 1 – p1, q2 = 1 – p2

Δ = p1 – p2

p = (p1 + p2)/2, q = 1 – p

z1-α/2 is the N(0,1) cutoff corresponding to α

z1-β is the N(0,1) cutoff corresponding to β
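
The per-group formula on this slide can be evaluated directly. The following is a minimal sketch, not part of the original slides: the function name `n_per_group_proportions` and its defaults (two-sided alpha = 5%, power = 80%) are assumptions, and the N(0,1) cutoffs come from Python's standard-library `statistics.NormalDist`.

```python
import math
from statistics import NormalDist


def n_per_group_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for two independent proportions.

    Two-sided alternative, equal allocation. The name and the default
    alpha/power are illustrative assumptions, not taken from the slides.
    """
    q1, q2 = 1 - p1, 1 - p2
    delta = p1 - p2
    p = (p1 + p2) / 2
    q = 1 - p
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # z_{1-alpha/2}
    z_beta = NormalDist().inv_cdf(power)           # z_{1-beta}
    numerator = (z_alpha * math.sqrt(2 * p * q)
                 + z_beta * math.sqrt(p1 * q1 + p2 * q2)) ** 2
    return math.ceil(numerator / delta ** 2)       # always round up


# Illustrative inputs: "success" probabilities of 40% and 20%.
print(n_per_group_proportions(0.40, 0.20))
```

For these inputs the function returns 82 per group; the order of p1 and p2 does not matter because Δ enters only as a square.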

Dichotomous Outcome(2 Independent Samples)

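$$
\text{Power} = \Phi\left( \frac{\sqrt{n}\,|\Delta| - z_{1-\alpha/2}\sqrt{2pq}}{\sqrt{p_1 q_1 + p_2 q_2}} \right)
$$

where Φ is the probability from a standard normal distribution and n is the sample size per group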
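
The power expression is the sample size formula solved for the N(0,1) probability. A minimal sketch of that calculation follows; the function name `power_two_proportions` is hypothetical, `n` is taken to be the per-group size, and a two-sided alpha of 5% is assumed as the default.

```python
import math
from statistics import NormalDist


def power_two_proportions(n, p1, p2, alpha=0.05):
    """Approximate power for two independent proportions, n per group.

    Illustrative helper (not from the slides): two-sided test,
    equal allocation, normal approximation.
    """
    q1, q2 = 1 - p1, 1 - p2
    p = (p1 + p2) / 2
    q = 1 - p
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # z_{1-alpha/2}
    z = (math.sqrt(n) * abs(p1 - p2)
         - z_alpha * math.sqrt(2 * p * q)) / math.sqrt(p1 * q1 + p2 * q2)
    return NormalDist().cdf(z)                     # Phi(z)


# Illustrative inputs: 82 per group, 40% vs 20%.
print(round(power_two_proportions(82, 0.40, 0.20), 3))
```

Power rises with n, so the smallest n whose power reaches the target is the required per-group sample size.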

Continuous Outcome(2 Independent Samples)

• Test H0: μ1 = μ2 vs. HA: μ1 ≠ μ2

• Two-sided alternative and equal allocation

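• Assume outcome normally distributed with:

mean μ1 and variance σ1² in Group 1

mean μ2 and variance σ2² in Group 2

$$
n_{\text{per group}} = \frac{\left(\sigma_1^2 + \sigma_2^2\right)\left(z_{1-\alpha/2} + z_{1-\beta}\right)^2}{\Delta^2}, \qquad \Delta = \mu_1 - \mu_2
$$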
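
For a continuous outcome the per-group size is (σ1² + σ2²)(z1-α/2 + z1-β)² / Δ², with Δ = μ1 − μ2. A minimal sketch of that calculation is given here; the function name `n_per_group_means`, the defaults (alpha = 5%, power = 80%) and the example inputs are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def n_per_group_means(delta, var1, var2, alpha=0.05, power=0.80):
    """Per-group sample size for comparing two independent means.

    Illustrative helper (not from the slides): two-sided test,
    equal allocation, normally distributed outcome.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # z_{1-alpha/2}
    z_beta = NormalDist().inv_cdf(power)           # z_{1-beta}
    n = (var1 + var2) * (z_alpha + z_beta) ** 2 / delta ** 2
    return math.ceil(n)                            # always round up


# Illustrative inputs: difference of 5 units, SD of 10 in both groups.
print(n_per_group_means(delta=5, var1=10 ** 2, var2=10 ** 2))
```

With these inputs the result is 63 per group.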

For RCTs

Sample size with grade 3 math

Randomized Controlled Trial (RCT)

Two groups of equal size

Parallel groups

• Hypothesis : In patients with hypertensive brain

haemorrhage, surgery reduces 30-day mortality from

40% to 20%.

Best medical management alone : 40% (Pc) to

Surgery + best medical management : 20% (Pe)

RCT : Superiority Hypothesis

Find out their average = 30% (p)

Pe = 20%

Pc = 40%

Find out the difference between Pe & Pc

= 20% (d)


Average of Pe & Pc = 30% (p)

Difference between Pe & Pc = 20% (d)

Sample size per group = 16p(100 - p) / (d x d)

= (16 x 30 x 70) / (20 x 20) = 84 per group

Total N = 168



16p(100 - p) / (d x d)

This is for a study with two equal parallel groups.

p = average of Pe & Pc

d = difference between Pe and Pc
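
The "grade 3 math" rule, 16p(100 - p) / (d x d), is easy to script. The sketch below reproduces the brain haemorrhage example; the function name `rule_of_16_percent` is hypothetical, and inputs are percentages as on the slides.

```python
import math


def rule_of_16_percent(pe, pc):
    """Rule-of-thumb per-group sample size, inputs in percent.

    Illustrative helper: two equal parallel groups, with alpha = 5%
    and power = 80% built into the factor 16.
    """
    p = (pe + pc) / 2          # average of Pe and Pc
    d = abs(pe - pc)           # difference between Pe and Pc
    return math.ceil(16 * p * (100 - p) / (d * d))


n = rule_of_16_percent(pe=20, pc=40)
print(n, 2 * n)  # 84 per group, 168 in total
```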

Exercise: calculate sample size for the following hypothesis:

Dexamethasone adjunctive therapy reduces mortality from 15% to 5% in children with neonatal meningitis.


16p(100 - p) / (d x d)

This is for a study with two equal parallel groups.

Thank You

By now, you have at least one

question?

• Where does the ‘16’ come from?


• Before we address this, I will take your

test…….

Truth revealed by Gold standard

TEST      Disease +         Disease -
+         True positive     False positive
-         False negative    True negative

Truth revealed by Gold standard

TEST      Disease +   Disease -
+         190         40
-         10          160
Total     200         200

Hypothetical example of a study with sample size of 400


TEST      Disease +              Disease -
+         190 (True positive)    40 (False positive)
-         10 (False negative)    160 (True negative)
Total     200                    200




Truth revealed by Gold standard

TEST      Disease +   Disease -
+         95%         20%
-         5%          80%






TEST      Disease +   Disease -
+         TP rate     FP rate
-         FN rate     TN rate

What we call rate is actually probability.
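
Turning the counts of the hypothetical 400-person study into these four probabilities is a matter of dividing each cell by its column total. A minimal sketch, with variable names chosen here for illustration:

```python
# Counts from the hypothetical study (200 Disease +, 200 Disease -).
tp, fp = 190, 40   # test positive: true positives, false positives
fn, tn = 10, 160   # test negative: false negatives, true negatives

diseased = tp + fn       # column total, Disease +
non_diseased = fp + tn   # column total, Disease -

rates = {
    "TP rate": tp / diseased,
    "FN rate": fn / diseased,
    "FP rate": fp / non_diseased,
    "TN rate": tn / non_diseased,
}

for name, value in rates.items():
    print(f"{name}: {value:.0%}")
```

Each column sums to 100%, so within a column one rate fixes the other.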


Truth revealed by Gold standard

TEST      Disease +                   Disease -
+         95% (True positive rate)    20% (False positive rate)
-         5% (False negative rate)    80% (True negative rate)







Giving names: ‘terms’


Truth revealed by Gold standard

TEST      Disease +                           Disease -
+         Sensitivity (True positive rate)    (False positive rate)
-         (False negative rate)               Specificity (True negative rate)

Think of relationship between

false-negative rate and sensitivity

2 by 2 table: sensitivity

Test      Disease +              Disease -
+         a (True positives)
-         c (False negatives)

Sensitivity = a / (a + c)

Proportion of people with the disease who have a positive test result.

So, a test with 84% sensitivity means that the test identifies 84 out of 100 people WITH the disease.
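
The relationship between the false-negative rate and sensitivity follows from the Disease + column alone, since a and c are its only two cells. Written out as a short worked equation (using the same a and c as the table):

$$
\text{False-negative rate} = \frac{c}{a + c} = 1 - \frac{a}{a + c} = 1 - \text{Sensitivity}
$$

For the test with 84% sensitivity, the false-negative rate is therefore 16%.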

Conducting a study is like doing a

diagnostic test

• Want to know (diagnose) the truth

• But there is no ‘gold standard’ to reveal

the truth, which is known only to ‘God’

• Any study, like a diagnostic test, is an

attempt to find the truth

Truth about New Treatment

Study finds that treatment    Truth: Works (+)    Truth: Does not Work (-)
Works (+)                     True positive       False positive
Does not Work (-)             False negative      True negative

Four Possible Results

• Where do results go wrong?

• Where do errors occur?





Study finds that treatment    Truth: Works (+)        Truth: Does not Work (-)
Works (+)                     True positive           False positive error
Does not Work (-)             False negative error    True negative



Study finds that treatment    Truth: Works (+)                  Truth: Does not Work (-)
Works (+)                     True positive                     Type I error (False positive)
Does not Work (-)             Type II error (False negative)    True negative

Types of errors

False positive result: Type I error

False negative result: Type II error

Cannot plan for error free results



Study finds that treatment    Truth: Works (+)                    Truth: Does not Work (-)
Works (+)                     True positive probability           False positive error probability
Does not Work (-)             False negative error probability    True negative probability

Consultation with prof of

biostatistics

• Sir, can you help with sample size

calculation of my thesis (RCT)?

• And, so on….



Study finds that treatment    Truth: Works (+)                           Truth: Does not Work (-)
Works (+)                     True positive probability                  False positive error probability (alpha)
Does not Work (-)             False negative error probability (beta)    True negative probability

Think of relationship between

false-negative rate and sensitivity

2 by 2 table: sensitivity

Test      Disease +              Disease -
+         a (True positives)
-         c (False negatives)

Sensitivity = a / (a + c)

Proportion of people with the disease who have a positive test result.

So, a test with 84% sensitivity means that the test identifies 84 out of 100 people WITH the disease.

What is the true probability when beta is varying?



What is the other name for true positive probability?


Study finds that treatment    Truth: Works (+)                           Truth: Does not Work (-)
Works (+)                     True positive probability: POWER           False positive error probability (alpha)
Does not Work (-)             False negative error probability (beta)    True negative probability

How much risk of errors you want to take?

Type I error rate

5%

Type II error rate

5%, 10%, 20%

Probability of Errors

Probability of FP (Type I ) error: α

Probability of FN (Type II) error: β

Almost always α=5%

Usually β = 20 %, 10%.

With 20%, power = ?, With 10%, power = ?

Source for 16

α = 5%, β = 20%, Power = 80%

Allocation ratio = 1:1

Dichotomous Outcome (2 Independent Samples)

• Test H0: p1 = p2 vs. HA: p1 ≠ p2

• Assuming two-sided alternative and equal allocation

***Always Round Up To Nearest Integer!

$$
n_{\text{per group}} = \frac{\left( z_{1-\alpha/2}\sqrt{2pq} + z_{1-\beta}\sqrt{p_1 q_1 + p_2 q_2} \right)^2}{\Delta^2}
$$

p1, p2 = projected true probabilities of “success” in the two groups

q1 = 1 – p1, q2 = 1 – p2

Δ = p1 – p2

p = (p1 + p2)/2, q = 1 – p

z1-α/2 is the N(0,1) cutoff corresponding to α

z1-β is the N(0,1) cutoff corresponding to β
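
One way to see where the ‘16’ comes from is sketched here. It rests on two simplifications that are assumptions of this sketch: p1q1 + p2q2 is replaced by 2pq (the two differ only by Δ²/2), and the N(0,1) cutoffs for α = 5% and β = 20% are rounded to 1.96 and 0.84.

$$
\begin{aligned}
n_{\text{per group}} &\approx \frac{\left(z_{1-\alpha/2} + z_{1-\beta}\right)^2 \cdot 2pq}{\Delta^2} \\
&= \frac{2\,(1.96 + 0.84)^2\, pq}{\Delta^2} = \frac{15.68\, pq}{\Delta^2} \approx \frac{16\, pq}{\Delta^2}
\end{aligned}
$$

With p and Δ expressed in percent, this is 16p(100 - p) / (d x d).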

ANY QUESTION?

Table: Multiplication factors for frequently used power and alpha*

Power    Multiplication factor    Multiplication factor
         for alpha = 5%           for alpha = 1%
80%      16                       23
90%      21                       30
95%      26                       36
99%      37                       48

* Rounded to the nearest whole number.
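
The factors in this table are consistent with 2 x (z1-α/2 + z1-β)², rounded to the nearest whole number. A minimal sketch that recomputes them, assuming a two-sided alpha; the function name `multiplication_factor` is hypothetical:

```python
from statistics import NormalDist


def multiplication_factor(alpha, power):
    """Unrounded factor 2 * (z_{1-alpha/2} + z_{1-beta})**2.

    Illustrative helper; assumes a two-sided alpha and two equal groups.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2


for power in (0.80, 0.90, 0.95, 0.99):
    at_5 = multiplication_factor(0.05, power)
    at_1 = multiplication_factor(0.01, power)
    print(f"{power:.0%}  {round(at_5)}  {round(at_1)}")
```

To plan for 90% power at alpha = 5%, for example, the 16 in the rule of thumb is replaced by 21.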

Summary (what have we learnt)

• Concepts of hypothesis testing are

similar to those used in studies of

diagnostic tests

• Type I error/ Type II error

• Alpha/ beta

• Power

• ‘God is great’

Thank You

Sample size for a diagnostic test

study

• How many cases (Disease +) you

need?

• What is your expected (for the test to be

useful) sensitivity? Say 90% (= p)

• Within what range you want to estimate

this? Say +/- 5%

• d= difference between upper and lower

limit of the range 10%

• Now do the mental math.

Sample size for a diagnostic test study

• How many controls (Disease -) you need?

• What is your expected (for the test to be

useful) specificity? Say 80% (= p)

• Within what range you want to estimate

this? Say +/- 5%

• d= 10%

• Now do the mental math.
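
The mental math for both slides can be checked with a short script. This sketch assumes that the same 16p(100 - p) / (d x d) rule is the one intended here, with p the expected sensitivity or specificity in percent and d the full width of the range (upper limit minus lower limit); the function name `rule_of_16_estimate` is hypothetical.

```python
import math


def rule_of_16_estimate(p, d):
    """Rule-of-thumb sample size, p and d in percent.

    Assumption of this sketch: the slides' 16p(100 - p)/(d x d) rule is
    applied with d equal to the full width of the desired range.
    """
    return math.ceil(16 * p * (100 - p) / (d * d))


cases = rule_of_16_estimate(p=90, d=10)      # Disease +, sensitivity 90%
controls = rule_of_16_estimate(p=80, d=10)   # Disease -, specificity 80%
print(cases, controls)  # 144 256
```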

What kind of outcome measure

does your study have?

• Two category (dichotomous, binary)

• Numerical (continuous)

What kind of outcome measures

are there in these statements?

• Rosiglitazone reduces blood sugar level in diabetes.

• Clopidogrel reduces incidence of myocardial infarction.

• Carotid angioplasty prevents stroke.

• Nifedipine controls BP effectively in hypertensive

emergencies.

• Statins control cholesterol level in high-risk individuals.

• Steroids induce remission in SLE.

• Ramipril improves LV ejection fraction.

RCT-Superiority hypothesis

Continuous outcome measure

• RCT : Two groups of equal size.

• Formula : 16x. How much effect are you interested in?

• The value of ‘x’ depends on size of effect.

Effect size    x     Sample
Small          25    16 x 25 = 400 per group
Moderate       4     16 x 4 = 64 per group
Large          2     16 x 2 = 32 per group

Effect Size

• How much is the difference with respect

to its variation

• Difference d

• Variation s.d.

• Effect size = d / s.d.

• Large 0.8, Moderate 0.5, Small 0.2
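
Since effect size = d / s.d., the rule 16 s²/d² can be written as 16 / (effect size)², so the ‘x’ in the table corresponds to 1 / (effect size)². A minimal sketch under that reading; the function name `n_from_effect_size` is hypothetical:

```python
import math


def n_from_effect_size(effect_size):
    """Per-group size from the rule of thumb 16 / effect_size**2.

    Illustrative helper; round() guards against floating-point noise
    before rounding up to a whole number.
    """
    return math.ceil(round(16 / effect_size ** 2, 9))


for label, es in (("Small", 0.2), ("Moderate", 0.5), ("Large", 0.8)):
    print(label, es, n_from_effect_size(es))
```

For a large effect (0.8) the unrounded x is about 1.56, which gives 25 per group; rounding x up to 2, as in the table, gives the more conservative 32.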

Example

• You are planning a study to improve

LVEF using stem cells in acute MI

• You expect moderate effect, hence the

sample size will be 16x4 = 64 per group

• What is the general formula?

• n per group = 16 s²/d²

Variable                          Placebo (N=92)    BMC (N=95)
Global LVEF (%)
  Baseline
    Mean                          46.9±10.4         48.3±9.2
    Median                        47.5              50.6
  4 Mo
    Mean                          49.9±13.0         53.8±10.2
    Median                        53.2              54.7
  Absolute difference
    Mean                          3.0±10            5.5±10
    Median                        4.0               5.0
Death, recurrence of MI, & any
revascularization procedure       40 / 103          23 / 101

One MI Study using stem cells

Calculating sample size for LVEF change

• Need two things :

Standard deviation (s)

Difference expected (d)

s = 10%

d = 5%

Sample size = 16s² / d²

= (16 x 10 x 10) / (5 x 5)

= 64 per group

k      n1       n2       n1+n2
1      n        n        2n
2      0.75n    1.5n     2.25n
3      0.67n    2.0n     2.67n
4      0.62n    2.5n     3.12n
5      0.60n    3.0n     3.60n
10     0.55n    5.5n     6.05n
100    0.50n    50.5n    51.0n

Table: Study sizes necessary to achieve approximately the same power in a trial with two groups, of which one contains k times as many individuals as the other.
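
The rows of this table are consistent with keeping 1/n1 + 1/n2 equal to 2/n, which gives n1 = n(k + 1) / (2k) and n2 = k x n1. The sketch below assumes that relation (it is not stated on the slide); the function name `unequal_allocation` is hypothetical.

```python
def unequal_allocation(n, k):
    """Group sizes for a k:1 allocation with about the same power.

    Assumption of this sketch: 1/n1 + 1/n2 is held at 2/n, where n is
    the per-group size of the equal-allocation design.
    """
    n1 = n * (k + 1) / (2 * k)   # smaller group
    n2 = k * n1                  # larger group
    return n1, n2


# Example: the equal-allocation design needs 84 per group (168 in total).
for k in (1, 2, 3):
    n1, n2 = unequal_allocation(84, k)
    print(k, round(n1), round(n2), round(n1 + n2))
```

The total grows with k, so unequal allocation costs extra participants for the same power.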

Reference for the formula

• Lehr R. Sixteen S-squared over D-squared: a relation for crude sample size estimates. Statistics in Medicine 1992;11:1099-1102.

Disclaimer

• The sample size formula discussed in the given time

works only for –

RCT with superiority hypothesis and dichotomous

outcomes

NOT for case control, cohort or cross-sectional

studies

NOT for non-inferiority or equivalence hypothesis

Thank You


In %: 16p(100 - p) / (d x d)

In decimals: 16p(1 - p) / (d x d)

p = average of Pe & Pc

d = difference between Pe and Pc
