Biostatistics Case Studies 2014. Session 1: Sample Size & Power for Inequality and Equivalence Studies I. Youngju Pak, PhD. Biostatistician [email protected]. Class Schedule. Announcements. All class materials will be uploaded in the following website - PowerPoint PPT Presentation
Biostatistics Case Studies 2014
Youngju Pak, PhD.
Biostatistician
Session 1:
Sample Size & Power for Inequality and Equivalence Studies I
Class Schedule
Date | Topic | Related Paper
Session 1, Sept 16 | Sample Size & Power for Inequality and Equivalence Studies I | Howard Paper, Gilchrist Paper, Williamson Paper
Session 2, Sept 23 | Sample Size & Power for Inequality and Equivalence Studies II | Diestelhorst Paper
Session 3, Sept 30 | Research Study Designs | To Be Determined
Session 4, Oct 7 | Regression Models and Multivariate Analyses | TBD
Session 5, Oct 14 | Survival Analysis Fundamentals | TBD
Session 6, Oct 21 | Free Topics & Discussion | TBD
Announcements
• All class materials will be uploaded to the following website: http://research.labiomed.org/Biostat/Education/CaseStudies_Fall2014/CaseStudies2014Outline.htm
• Try to read the posted articles before each session as best you can, paying particular attention to the statistical components.
• Send me an e-mail ([email protected]) so I can communicate with you if necessary.
• Send me a copy of any article that you want to discuss, if you have one. It might be used for the last session.
Inequality study:
• Two or more treatments are assumed equal (H0), and the study is designed to find overwhelming evidence of a difference (superiority and/or inferiority).
• Most common comparative study type.
• It is rare to assess only one of superiority or inferiority (“one-sided” statistical tests), unless one of them is biologically impossible.
• Hypotheses:
  H0: | mean(treatment) - mean(control) | = 0
  Ha: | mean(treatment) - mean(control) | ≠ 0
Insignificant p-values for Inequality tests
• Insignificant p-values (> 0.05) usually mean that you did not find statistically sufficient evidence to support Ha; this does not necessarily mean that H0 is true.
• H0 might or might not be true => Your study is still “INCONCLUSIVE”.
• Insignificant p-values do NOT prove your null!
Equivalence Study:
• Two treatments are assumed to differ (H0), and the study is designed to find overwhelming evidence that they are equal.
• Usually, the quantity of interest is a measure of biological activity or potency (the amount of drug required to produce an effect), and “treatments” are drugs or lots or batches of drugs.
• AKA bioequivalence.
• Sometimes used to compare clinical outcomes for two active treatments if neither treatment can be considered standard or accepted. This usually requires LARGE numbers of subjects.
Hypotheses for equivalence tests
• Ha: mean(trt 1) – mean(trt 2) = 0
• H0: mean(trt 1) – mean(trt 2) ≠ 0
• With a finite sample size, it is very hard to show that two group means are exactly the same.
• So we set a tolerance for equivalence, AKA the equivalence margin, usually denoted Δ.
• Practical hypotheses would be:
  Ha: Δ1 < mean(trt 1) – mean(trt 2) < Δ2
  H0: mean(trt 1) – mean(trt 2) ≤ Δ1 or mean(trt 1) – mean(trt 2) ≥ Δ2
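One common way to act on the practical hypotheses above is to check whether a confidence interval for the difference falls entirely inside the margins (Δ1, Δ2). The slides do not say which test is used, so the following is only a minimal sketch under assumptions: a normal approximation, a 90% interval (two one-sided 5% tests), and made-up numbers. The function name `equivalent` is hypothetical.

```python
from statistics import NormalDist

def equivalent(diff, se, lower, upper, alpha=0.05):
    """True if the (1 - 2*alpha) confidence interval for the mean
    difference lies strictly inside the margins (lower, upper).
    Assumes a normal approximation; diff and se are supplied by the user."""
    z = NormalDist().inv_cdf(1 - alpha)   # one-sided critical value
    ci_low = diff - z * se
    ci_high = diff + z * se
    return lower < ci_low and ci_high < upper

# Hypothetical numbers: margins of -3 and +3, standard error of 1.
print(equivalent(0.5, 1.0, -3.0, 3.0))   # interval sits inside the margins
print(equivalent(2.0, 1.0, -3.0, 3.0))   # interval crosses the upper margin
```

Note that a small observed difference is not enough on its own: the whole interval has to clear both margins, which is why equivalence studies tend to need large N.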
Non-inferiority
Today, we are going to learn how to determine sample size for inequality tests using software, for three papers.
Then we will discuss some of the logic.
Paper #1
How was N=498 determined?
What reduction in CVD events can 224 + 224 subjects detect? Nevertheless
How many subjects would be needed to detect this Δ?
Software Output for % of CVD Events
224 + 224 → detect 6.7% vs. 1.13%, i.e., 88% ↓.
Need 3115 + 3115 to detect 25% ↓ from 6.7% to 5%, i.e., a total of (3115+3115)/0.9 = 6922.
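The software's method is not shown in the slides. As a rough check, the sketch below uses the standard normal-approximation formula for comparing two proportions, with an optional Fleiss-type continuity correction. Assuming two-sided α = 0.05 and 80% power (both assumptions here), it lands close to the 224 per group quoted above and within about 1% of 3115. The function name is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group_proportions(p1, p2, alpha=0.05, power=0.80, continuity=True):
    """Approximate per-group N to detect p1 vs p2 with a two-sided test.
    Normal approximation; the continuity correction is an assumption
    about what the course software does, not something the slides state."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    d = abs(p1 - p2)
    n = (z_a * sqrt(2 * p_bar * (1 - p_bar))
         + z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / d ** 2
    if continuity:
        # Fleiss-type continuity correction
        n = n / 4 * (1 + sqrt(1 + 4 / (n * d))) ** 2
    return ceil(n)

print(n_per_group_proportions(0.067, 0.0113))  # large effect, small N
print(n_per_group_proportions(0.067, 0.05))    # 25% reduction, large N
```

The contrast between the two calls is the point of the slide: a modest 25% reduction in a rare event needs thousands of subjects per group.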
From earlier design paper (Russell 2007):
Δ = 0.85(0.05) mm = 0.0425 mm
Software Output for Mean IMT
Each group N for 10% Dropout → 0.9N = 224
→ N = 224/0.9 = 249. Total study size = 2(249)=498
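The dropout adjustment above is simple arithmetic: enroll enough subjects that the expected number of completers equals the required N. A minimal sketch (the function name is hypothetical):

```python
from math import ceil

def enroll_per_group(n_completers, dropout_rate):
    """Subjects to enroll per group so that, after the expected
    dropout, about n_completers remain."""
    return ceil(n_completers / (1 - dropout_rate))

per_group = enroll_per_group(224, 0.10)
total = 2 * per_group
print(per_group, total)
```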
Paper #2
Williamson paper
Software Output - Percentages
Software Output - Means
Can detect 0.4 SDs. Units? Since the normal range ≈ 6 SD, this corresponds to a shift of ~0.4/6 ≈ 7% of the normal range.
Applies to any continuously measured outcome.
Paper #3
From Nance paper
Δ ≈ 8%
$$\frac{\Delta}{SD\sqrt{1/N_1 + 1/N_2}} = 2.82$$
Solve for SD to get SD ≈ 6.8%
Software Output for Gilchrist Paper
Some Logic
How was 498 determined?
Back to:
How IMT Change Comparison Will be Made
Strength of Treatment Effect:
Signal:Noise Ratio
$$t = \frac{\text{Observed } \Delta}{SD\sqrt{1/N_1 + 1/N_2}}$$
Δ = Aggressive - Standard Mean Diff in IMT changes
SD = Std Dev of within group IMT changes
N1 = N2 = Group size
| t | > ~1.96 ↔ p < 0.05
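The signal-to-noise ratio can be computed directly. The sketch below plugs in the Paper #1 planning values that appear elsewhere in these slides (Δ = 0.0425 mm, SD = 0.16, 224 per group); the function name is hypothetical.

```python
from math import sqrt

def signal_to_noise(delta, sd, n1, n2):
    """t = delta / (sd * sqrt(1/n1 + 1/n2)) for two independent groups."""
    return delta / (sd * sqrt(1 / n1 + 1 / n2))

# Planning values for the IMT comparison, taken from the slides
t = signal_to_noise(0.0425, 0.16, 224, 224)
print(round(t, 2))
```

With these inputs t comes out near 2.8, comfortably above 1.96, which is the margin that the power calculation is designed to provide.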
Could Solve for N
$$t = \frac{\text{Observed } \Delta}{SD\sqrt{1/N_1 + 1/N_2}} \ge \sim 1.96$$
if (with N = N1 = N2):
$$\Delta \ge 1.96\,SD\sqrt{2/N} \quad \text{or} \quad N \ge \frac{2SD^2}{\Delta^2}(1.96)^2$$
This is not quite right.
The Δ is the actual observed difference.
This sample Δ will vary from the real Δ in “everyone”.
Need to increase N in case the sample happens to have a Δ that is lower than the real Δ (50% possibility).
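A small simulation illustrates the "50% possibility". If N is chosen so that the real Δ sits exactly at the 1.96 threshold, the observed Δ falls below the threshold in about half of all studies. This is only a sketch: it treats SD as known and draws the observed difference from a normal distribution. The function name, trial count and seed are arbitrary choices.

```python
import random
from math import sqrt

def fraction_significant(delta, sd, n, trials=20000, seed=1):
    """Simulate many studies with true difference delta and n per group;
    return the fraction with |t| > 1.96 (SD treated as known)."""
    rng = random.Random(seed)
    se = sd * sqrt(2 / n)                   # standard error of the difference
    hits = 0
    for _ in range(trials):
        observed = rng.gauss(delta, se)     # sample delta varies around the real one
        if abs(observed / se) > 1.96:
            hits += 1
    return hits / trials

# N from the "not quite right" formula: 2*SD^2*(1.96)^2 / delta^2
n_half = 2 * 0.16**2 * 1.96**2 / 0.0425**2
frac = fraction_significant(0.0425, 0.16, n_half)
print(round(n_half), frac)
```

The fraction of significant studies comes out close to one half, which is why N has to be increased to reach a more useful power.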
Need to Increase N for Power
$$N = \frac{2SD^2}{\Delta^2}(1.96)^2 \quad \text{for 50\% power.}$$
Need to increase N to:
$$N = \frac{2SD^2}{\Delta^2}(1.96 + 0.842)^2 \quad \text{for 80\% power.}$$
$$N = \frac{2SD^2}{\Delta^2}(1.96 + 1.282)^2 \quad \text{for 90\% power.}$$
(0.842 and 1.282 are from Normal Tables.)
Power is the probability that p<0.05 if Δ is the real effect, incorporating the possibility that the Δ in our sample could be smaller.
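The formulas for different power levels differ only in the normal quantile that is added to 1.96. The sketch below looks those quantiles up with Python's `statistics.NormalDist` rather than a table and applies them to the IMT planning values from the slides (Δ = 0.0425, SD = 0.16). It is a normal approximation; dedicated software may add a small t-distribution adjustment, so results can differ by a subject or two. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist

def n_per_group_means(delta, sd, alpha=0.05, power=0.80):
    """Per-group N for comparing two means:
    N = 2*SD^2 * (z_alpha + z_power)^2 / delta^2 (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # 1.96 for alpha = 0.05
    z_power = NormalDist().inv_cdf(power)           # 0 at 50%, 0.842 at 80%, 1.282 at 90%
    return ceil(2 * sd**2 * (z_alpha + z_power) ** 2 / delta**2)

for power in (0.50, 0.80, 0.90):
    print(power, n_per_group_means(0.0425, 0.16, power=power))
```

Going from 50% to 80% power roughly doubles the required N, and 90% power costs about a third more again.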
Info Needed for Study Size: Comparing Means
1. Effect
2. Subject variability
3. Type I error (1.96 for α=0.05; 2.58 for α=0.01)
4. Power (0.842 for 80% power; 1.645 for 95% power)
$$N = \frac{(1.96 + 0.842)^2 \, 2SD^2}{\Delta^2}$$
Same four quantities, but different formula, if comparing %s, hazard ratios, odds ratios, etc.
$$N = \frac{(1.96 + 0.842)^2 \, 2(0.16)^2}{(0.0425)^2} \approx 223$$
(The software output gives 224 per group.)
Each group N for 10% Dropout → 0.9N = 224
→ N = 224/0.9 = 249. Total study size = 2(249)=498
$$N = \frac{2SD^2}{\Delta^2}(1.96 + 0.842)^2$$
Change Effect Size to be Detected
SD Estimate Could be Wrong
Should examine SD as study progresses.
May need to increase N if SD was underestimated.
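Because N is proportional to SD², even a modest underestimate of SD matters. The sketch below recomputes the per-group N for the IMT example with the planning SD of 0.16 and two hypothetical larger values (0.18 and 0.20 are illustrative only, not from the papers). It uses the normal approximation with 1.96 + 0.842; the function name is hypothetical.

```python
from math import ceil

def n_per_group(delta, sd, z_sum=1.96 + 0.842):
    """Per-group N = 2*SD^2 * z_sum^2 / delta^2 (alpha = 0.05, 80% power)."""
    return ceil(2 * sd**2 * z_sum**2 / delta**2)

# 0.16 is the planning SD; 0.18 and 0.20 are hypothetical "what if" values
for sd in (0.16, 0.18, 0.20):
    print(sd, n_per_group(0.0425, sd))
```

An SD that is 25% larger than planned (0.20 instead of 0.16) raises the required N by about 56%, since (1.25)² ≈ 1.56.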
Some Study Size Software
Free Study Size Software
www.stat.uiowa.edu/~rlenth/Power
Study Size Software in GCRC Lab
ncss.com ~$500
nQuery - Used by Most Drug Companies