Kameshwar Prasad
Professor of Neurology, Former Chief, Neurosciences Centre
All India Institute of Medical Sciences, New Delhi
From diagnostic test to
hypothesis test
Plan
• Revise concepts related to diagnostic
tests
• Learn concepts related to hypothesis
testing
How do I teach sample size
calculation?
Dichotomous Outcome (2 Independent Samples)
• Test H0: p1 = p2 vs. HA: p1 ≠ p2
• Assuming two-sided alternative and equal allocation
$$ n_{\text{per group}} = \frac{\left( z_{1-\alpha/2}\sqrt{2pq} + z_{1-\beta}\sqrt{p_1 q_1 + p_2 q_2} \right)^2}{\Delta^2} $$
***Always Round Up To Nearest Integer!
p1, p2 = projected true probabilities of “success” in the two groups
q1 = 1 – p1, q2 = 1 – p2
Δ = p1 – p2
p = (p1 + p2)/2, q = 1 – p
z_{1-α/2} is the N(0,1) cutoff corresponding to α/2
z_{1-β} is the N(0,1) cutoff corresponding to β
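As a rough illustration, the sample size formula on this slide can be sketched in Python. The function name and the default values (alpha = 5%, power = 80%) are assumptions made for the sketch, and the inputs 0.40 and 0.20 are illustrative only.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for two proportions (two-sided test, 1:1 allocation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # N(0,1) cutoff for alpha/2
    z_beta = NormalDist().inv_cdf(power)           # N(0,1) cutoff for beta = 1 - power
    q1, q2 = 1 - p1, 1 - p2
    p = (p1 + p2) / 2
    q = 1 - p
    delta = p1 - p2
    numerator = (z_alpha * sqrt(2 * p * q) + z_beta * sqrt(p1 * q1 + p2 * q2)) ** 2
    # Always round up to the next integer
    return ceil(numerator / delta ** 2)


print(n_per_group_two_proportions(0.40, 0.20))
```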
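Dichotomous Outcome (2 Independent Samples)
$$ \text{Power} = \Phi\left( \frac{\sqrt{n}\,|\Delta| - z_{1-\alpha/2}\sqrt{2pq}}{\sqrt{p_1 q_1 + p_2 q_2}} \right) $$
where Φ is the probability from a standard normal distribution and n is the sample size per group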
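The power formula can be sketched the same way. This is a minimal sketch under the slide's assumptions (two-sided test, equal allocation, normal approximation); the function name and the example inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(n, p1, p2, alpha=0.05):
    """Approximate power with n subjects per group (two-sided test, 1:1 allocation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    p = (p1 + p2) / 2
    q = 1 - p
    delta = abs(p1 - p2)
    z = (sqrt(n) * delta - z_alpha * sqrt(2 * p * q)) / sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    # Phi(z): cumulative probability of the standard normal distribution
    return NormalDist().cdf(z)


# Illustrative values: 84 per group, 40% vs. 20%
print(round(power_two_proportions(84, 0.40, 0.20), 2))
```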
Continuous Outcome (2 Independent Samples)
• Test H0: μ1 = μ2 vs. HA: μ1 ≠ μ2
• Two-sided alternative and equal allocation
• Assume outcome normally distributed with:
mean μ1 and variance σ1² in Group 1
mean μ2 and variance σ2² in Group 2
$$ n_{\text{per group}} = \frac{\left( z_{1-\alpha/2} + z_{1-\beta} \right)^2 \left( \sigma_1^2 + \sigma_2^2 \right)}{\Delta^2} $$
where Δ = μ1 – μ2
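A minimal Python sketch of the continuous-outcome formula follows. The function name, the defaults (alpha = 5%, power = 80%) and the example inputs (standard deviation 10 in both groups, difference 5) are assumptions chosen for illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_means(sd1, sd2, delta, alpha=0.05, power=0.80):
    """Per-group sample size for two means (two-sided test, 1:1 allocation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    n = (z_alpha + z_beta) ** 2 * (sd1 ** 2 + sd2 ** 2) / delta ** 2
    return ceil(n)  # round up to the next integer


print(n_per_group_two_means(10, 10, 5))
```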
For RCTs
Sample size with grade 3 math
Randomized Controlled Trial (RCT)
Two groups of equal size
Parallel groups
• Hypothesis : In patients with hypertensive brain
haemorrhage, surgery reduces 30-day mortality from
40% to 20%.
Best medical management alone: 40% (Pc)
Surgery + best medical management: 20% (Pe)
RCT : Superiority Hypothesis
Pe = 20%
Pc = 40%
Find out their average = 30% (p)
Find out the difference between Pe & Pc = 20% (d)
RCT : Superiority Hypothesis
Average of Pe & Pc = 30% (p)
Difference between Pe & Pc = 20% (d)
Sample size per group = 16p(100 - p) / (d x d)
= (16 x 30 x 70) / (20 x 20) = 84 per group
Total N = 168
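The arithmetic above can be checked with a few lines of Python. This is only a sketch of the shortcut with percentages as inputs; the function name is hypothetical.

```python
from math import ceil


def rule_of_16_percent(pe, pc):
    """Per-group sample size from 16 * p * (100 - p) / d^2, with Pe and Pc in percent."""
    p = (pe + pc) / 2   # average of the two percentages
    d = abs(pe - pc)    # difference between the two percentages
    return ceil(16 * p * (100 - p) / (d * d))


n = rule_of_16_percent(pe=20, pc=40)
print(n, "per group,", 2 * n, "in total")
```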
RCT : Superiority Hypothesis
Sample size per group = 16p(100 - p) / (d x d)
This is for a study with two equal parallel groups.
p = average of Pe & Pc
d = difference between Pe and Pc
Exercise: calculate sample size for the following hypothesis:
Dexamethasone adjunctive therapy reduces mortality from 15% to 5% in children with neonatal meningitis.
Sample size per group = 16p(100 - p) / (d x d)
This is for a study with two equal parallel groups.
Thank You
By now, you have at least one
question?
• Where does the ‘16’ come from?
• Before we address this, I will test you...
Truth revealed by Gold standard

TEST    Disease +         Disease -
+       True positive     False positive
-       False negative    True negative
Truth revealed by Gold standard

TEST    Disease +              Disease -
+       190 (True positive)    40 (False positive)
-       10 (False negative)    160 (True negative)
Total   200                    200

Hypothetical Example of a study with sample size of 400
Truth revealed by Gold standard

TEST    Disease +    Disease -
+       95%          20%
-       5%           80%

Hypothetical Example of a study with sample size of 400
Truth revealed by Gold standard

TEST    Disease +    Disease -
+       TP rate      FP rate
-       FN rate      TN rate

What we call rate is actually probability.
Truth revealed by Gold standard

TEST    Disease +                    Disease -
+       95% (True positive rate)     20% (False positive rate)
-       5% (False negative rate)     80% (True negative rate)

Hypothetical Example of a study with sample size of 400
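The four rates in this hypothetical example follow directly from the counts (190, 40, 10, 160). A short Python sketch, with variable names chosen here for illustration:

```python
# Counts from the hypothetical 2 by 2 table (200 with disease, 200 without)
tp, fp = 190, 40   # test positive: diseased, non-diseased
fn, tn = 10, 160   # test negative: diseased, non-diseased

tp_rate = tp / (tp + fn)   # sensitivity
fn_rate = fn / (tp + fn)   # 1 - sensitivity
fp_rate = fp / (fp + tn)   # 1 - specificity
tn_rate = tn / (fp + tn)   # specificity

print(f"TP rate {tp_rate:.0%}, FN rate {fn_rate:.0%}")
print(f"FP rate {fp_rate:.0%}, TN rate {tn_rate:.0%}")
```

Each column adds up to 100%, which is why the false-negative rate is simply 1 minus the sensitivity.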
Giving names: ‘terms’

Truth revealed by Gold standard

TEST    Disease +                           Disease -
+       Sensitivity (True positive rate)    (False positive rate)
-       (False negative rate)               Specificity (True negative rate)
Think of relationship between false-negative rate and sensitivity

2 by 2 table: sensitivity

Test    Disease +              Disease -
+       a (True positives)
-       c (False negatives)

Sensitivity = a / (a + c)
Proportion of people with the disease who have a positive test result.
So, a test with 84% sensitivity means that the test identifies 84 out of 100 people WITH the disease.
Conducting a study is like doing a
diagnostic test
• Want to know (diagnose) the truth
• But there is no ‘gold standard’ to reveal
the truth, which is known only to ‘God’
• Any study, like a diagnostic test, is an
attempt to find the truth
Truth about New Treatment

Study finds that treatment    Works (+)         Does not Work (-)
Works (+)                     True positive     False positive
Does not Work (-)             False negative    True negative
Four Possible Results
• Where do results go wrong?
• Where do errors occur?
Four Possible Results

Truth about New Treatment

Study finds that treatment    Works (+)               Does not Work (-)
Works (+)                     True positive           False positive error
Does not Work (-)             False negative error    True negative
Four Possible Results

Truth about New Treatment

Study finds that treatment    Works (+)                         Does not Work (-)
Works (+)                     True positive                     Type I error (false positive)
Does not Work (-)             Type II error (false negative)    True negative
Types of errors
False positive result: Type I error
False negative result: Type II error
We cannot plan for error-free results
Four Possible Results

Truth about New Treatment

Study finds that treatment    Works (+)                           Does not Work (-)
Works (+)                     True positive probability           False positive error probability
Does not Work (-)             False negative error probability    True negative probability
Consultation with prof of
biostatistics
• Sir, can you help with sample size
calculation of my thesis (RCT)?
• And, so on….
Four Possible Results

Truth about New Treatment

Study finds that treatment    Works (+)                                  Does not Work (-)
Works (+)                     True positive probability                  False positive error probability (alpha)
Does not Work (-)             False negative error probability (beta)    True negative probability
Think of relationship between
false-negative rate and sensitivity
What is the true positive probability when beta is varying?
What is the other name for true positive probability?

Truth about New Treatment

Study finds that treatment    Works (+)                                  Does not Work (-)
Works (+)                     True positive probability: POWER           False positive error probability (alpha)
Does not Work (-)             False negative error probability (beta)    True negative probability
How much risk of error do you want to take?
Type I error rate
5%
Type II error rate
5%, 10%, 20%
Probability of Errors
Probability of FP (Type I ) error: α
Probability of FN (Type II) error: β
Almost always α=5%
Usually β = 20 %, 10%.
With 20%, power = ?, With 10%, power = ?
Source for 16
α = 5%, β = 20%, Power = 80%
Allocation ratio = 1:1
Dichotomous Outcome (2 Independent Samples)
• Test H0: p1 = p2 vs. HA: p1 ≠ p2
• Assuming two-sided alternative and equal allocation
$$ n_{\text{per group}} = \frac{\left( z_{1-\alpha/2}\sqrt{2pq} + z_{1-\beta}\sqrt{p_1 q_1 + p_2 q_2} \right)^2}{\Delta^2} $$
***Always Round Up To Nearest Integer!
p1, p2 = projected true probabilities of “success” in the two groups
q1 = 1 – p1, q2 = 1 – p2
Δ = p1 – p2
p = (p1 + p2)/2, q = 1 – p
z_{1-α/2} is the N(0,1) cutoff corresponding to α/2
z_{1-β} is the N(0,1) cutoff corresponding to β
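One way to see where the 16 comes from is to simplify this formula. The steps below are a sketch, assuming α = 5% (two-sided), power = 80%, and the approximation that p1q1 + p2q2 is close to 2pq (the two differ by exactly Δ²/2):

$$
\begin{aligned}
n_{\text{per group}} &\approx \frac{\left( z_{1-\alpha/2} + z_{1-\beta} \right)^2 \, 2pq}{\Delta^2} \\
&= \frac{2\,(1.96 + 0.84)^2 \, pq}{\Delta^2} \\
&= \frac{15.68 \, pq}{\Delta^2} \approx \frac{16 \, pq}{\Delta^2}
\end{aligned}
$$

With p and Δ written in percent, pq becomes p(100 - p) and Δ becomes d, which gives 16p(100 - p) / (d x d).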
ANY QUESTION?
Table: Multiplication factors for frequently used power and alpha*

Power    Multiplication factor    Multiplication factor
         for alpha = 5%           for alpha = 1%
80%      16                       23
90%      21                       30
95%      26                       36
99%      37                       48

* Rounded to the nearest whole number.
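The factors in this table appear to follow from 2 x (z for alpha + z for power) squared, with a two-sided alpha. A minimal Python sketch under that assumption (the function name is hypothetical):

```python
from statistics import NormalDist


def multiplication_factor(alpha, power):
    """2 * (z_{1-alpha/2} + z_{1-beta})^2 for a two-sided test with 1:1 allocation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2


# One row per power: factor for alpha = 5%, then for alpha = 1%
for power in (0.80, 0.90, 0.95, 0.99):
    row = [round(multiplication_factor(alpha, power)) for alpha in (0.05, 0.01)]
    print(f"{power:.0%}", *row)
```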
Summary (what have we learnt)
• Concepts of hypothesis testing are
similar to those used in studies of
diagnostic tests
• Type I error/ Type II error
• Alpha/ beta
• Power
• ‘God is great’
Thank You
Sample size for a diagnostic test study
• How many cases (Disease +) do you need?
• What is your expected (for the test to be useful) sensitivity? Say 90% (= p)
• Within what range do you want to estimate this? Say +/- 5%
• d = difference between upper and lower limit of the range = 10%
• Now do the mental math.
Sample size for a diagnostic test study
• How many controls (Disease -) do you need?
• What is your expected (for the test to be useful) specificity? Say 80% (= p)
• Within what range do you want to estimate this? Say +/- 5%
• d = 10%
• Now do the mental math.
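The mental math can be checked in Python. This sketch assumes the slides intend the same 16p(100 - p) / (d x d) shortcut here, with d as the full width of the range; that reading is an assumption, and the function name is hypothetical.

```python
from math import ceil


def n_for_proportion_percent(p, d):
    """Subjects needed to estimate a percentage p within a range of total width d."""
    return ceil(16 * p * (100 - p) / (d * d))


cases = n_for_proportion_percent(p=90, d=10)      # sensitivity 90%, +/- 5%
controls = n_for_proportion_percent(p=80, d=10)   # specificity 80%, +/- 5%
print(cases, "cases,", controls, "controls")
```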
What kind of outcome measure
does your study have?
• Two category (dichotomous, binary)
• Numerical (continuous)
What kind of outcome measures
are there in these statements?
• Rosiglitazone reduces blood sugar level in diabetes.
• Clopidogrel reduces incidence of myocardial infarction.
• Carotid angioplasty prevents stroke.
• Nifedipine controls BP effectively in hypertensive
emergencies.
• Statins control cholesterol level in high-risk individuals.
• Steroids induce remission in SLE.
• Ramipril improves LV ejection fraction.
RCT - Superiority hypothesis
Continuous outcome measure
• RCT : Two groups of equal size.
• Formula : 16x. How much effect are you interested in?
• The value of ‘x’ depends on size of effect.

Effect size    x     Sample
Small          25    16 x 25 = 400 per group
Moderate       4     16 x 4 = 64 per group
Large          2     16 x 2 = 32 per group
Effect Size
• How much is the difference with respect
to its variation
• Difference d
• Variation s.d.
• Effect size = d / s.d.
• Large 0.8, Moderate 0.5, Small 0.2
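The link between effect size and sample size can be sketched in Python, assuming x = 1 / (effect size squared), which is what 16 s²/d² implies when effect size = d / s.d. The function name is hypothetical.

```python
from math import ceil


def n_per_group_from_effect_size(effect_size):
    """Per-group sample size from 16 / effect_size^2 (alpha 5%, power 80%, 1:1)."""
    x = 1 / effect_size ** 2
    # round() guards against floating-point noise before rounding up
    return ceil(round(16 * x, 9))


for label, es in (("Small", 0.2), ("Moderate", 0.5), ("Large", 0.8)):
    print(label, es, n_per_group_from_effect_size(es))
```

For a large effect (0.8) the exact value of x is about 1.56, which gives 25 per group; rounding x up to 2 gives the more conservative 32 per group.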
Example
• You are planning a study to improve
LVEF using stem cells in acute MI
• You expect moderate effect, hence the
sample size will be 16x4 = 64 per group
• What is the general formula?
• n per group = 16 s² / d²
Variable                            Placebo (N=92)    BMC (N=95)
Global LVEF (%)
  Baseline
    Mean                            46.9±10.4         48.3±9.2
    Median                          47.5              50.6
  4 Mo
    Mean                            49.9±13.0         53.8±10.2
    Median                          53.2              54.7
  Absolute difference
    Mean                            3.0±10            5.5±10
    Median                          4.0               5.0
Death, recurrence of MI, & any
revascularization procedure         40/103            23/101

One MI Study using stem cells
Calculating sample size for LVEF change
• Need two things :
Standard deviation (s)
Difference expected (d)
s = 10%
d = 5%
Sample size = 16 s² / d²
= (16 x 10 x 10) / (5 x 5)
= 64 per group
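The same calculation as a short Python sketch (the function name is hypothetical; s and d must be in the same units):

```python
from math import ceil


def rule_of_16_continuous(s, d):
    """Per-group sample size from 16 * s^2 / d^2 (alpha 5%, power 80%, 1:1)."""
    return ceil(16 * s * s / (d * d))


print(rule_of_16_continuous(s=10, d=5), "per group")
```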
k n1 n2 n1+n2
1 n n 2n
2 0.75n 1.5n 2.25n
3 0.67n 2.0n 2.67n
4 0.62n 2.5n 3.12n
5 0.60n 3.0n 3.60n
10 0.55n 5.5n 6.05n
100 0.50n 50.n 50.0n
Table: Study sizes necessary to achieve approximately
the same power in trial with two groups, of which one
contains k times as many individuals as the other.
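The pattern in this table can be reproduced with a short Python sketch. It assumes the usual argument that power depends on 1/n1 + 1/n2, so that keeping this sum equal to 2/n gives n1 = n(k + 1) / (2k) and n2 = k x n1. The function name is hypothetical, and results are in units of n.

```python
def unequal_allocation(k):
    """Group sizes (in units of n) giving about the same power as n per group."""
    n1 = (k + 1) / (2 * k)   # smaller group
    n2 = k * n1              # larger group, k times as many individuals
    return n1, n2, n1 + n2


for k in (1, 2, 3, 4, 5, 10):
    n1, n2, total = unequal_allocation(k)
    print(f"k={k}: n1={n1:.3f}n, n2={n2:.3f}n, total={total:.3f}n")
```

The total grows steadily with k, so very unequal allocation costs many extra subjects for the same power.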
Reference for the formula
• Lehr R. Sixteen S-squared over D-squared: a relation for crude sample size estimates. Statistics in Medicine 1992;11:1099-1102
Disclaimer
• The sample size formula discussed in the given time
works only for –
RCT with superiority hypothesis and dichotomous
outcomes
NOT for case control, cohort or cross-sectional
studies
NOT for non-inferiority or equivalence hypothesis
Thank You
RCT : Superiority Hypothesis
In %: 16p(100 - p) / (d x d)
In decimals: 16p(1 - p) / (d x d)
p = average of Pe & Pc
d = difference between Pe and Pc