Clinical Trial Investigation
Interpretation of Results: “to p or not to p”
Ferran Torres
Hospital Clínic Barcelona / Universitat Autònoma Barcelona.
EMA:
Scientific Advice Working Party (SAWP)
Biostatistics Working Party (BSWP).
Today’s talk is on statistics
Statistics Considerations
Basic statistics
– Why Statistics?
– Samples and populations
– P-Value
– Random and systematic errors
– Statistical errors
– Sample size
– Confidence Intervals
– Interpretation of CI: superiority, non-inferiority, equivalence
The role of statistics
“Thus statistical methods are no substitute for common sense and objectivity. They should never aim to confuse the reader, but instead should be a major contributor to the clarity of a scientific argument.”
Pocock SJ. The role of statistics. Br J Psychiat 1980; 137:188-190
Variability
Why Statistics?
Medicine is a quantitative science but not an exact one
– Not like physics or chemistry
Variation characterises much of medicine
Statistics is about handling and quantifying variation and uncertainty
Humans differ in response to exposure to adverse effects
– Example: not every smoker dies of lung cancer; some non-smokers die of lung cancer
Humans differ in response to treatment
– Example: penicillin does not cure all infections
Humans differ in disease symptoms
– Example: sometimes cough and sometimes wheeze are presenting features for asthma
Why Statistics Are Necessary
Statistics can tell us whether events could have happened by chance, and help us make decisions
We need to use Statistics because of variability in our data
Generalize: can what we know help to predict what will happen in new and different situations?
Population and Samples
Target Population
Population of the Study
Sample
Extrapolation
Sample
Population
Inferential analysis
Statistical Tests
Confidence Intervals
Study Results
“Conclusions”
Statistical Inference
Statistical Tests => p-value
Confidence Intervals
Valid samples?
Population
Likely to occur
Unlikely to occur
Invalid Sample and Conclusions
P-value
The p-value is a “tool” to answer the question:
– Could the observed results have occurred by chance*?
– Remember:
  Decision given the observed results in a SAMPLE
  Extrapolating results to the POPULATION
*: accounts exclusively for random error, not bias
p < .05: “statistically significant”
P-value: an intuitive definition
The p-value is the probability of observing data at least as extreme as ours when the null hypothesis is true (no differences exist)
Steps:
1) Calculate the treatment difference in the sample (A-B)
2) Assume that both treatments are equal (A=B) and then…
3) …calculate the probability of obtaining a difference at least as large as the one observed, given the assumption in step 2
4) Conclude according to that probability:
a. p<0.05: the differences are unlikely to be explained by chance – we assume that the treatment explains the differences
b. p>0.05: the differences could be explained by chance – we cannot rule out chance as the explanation
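The four steps can be sketched as a permutation test. This is only an illustration under stated assumptions: the function name `permutation_p_value`, the use of the difference in means, and the number of shuffles are choices made here, not something the slides specify.

```python
import random


def _mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation p-value for the difference in means (A - B)."""
    rng = random.Random(seed)
    # Step 1: treatment difference observed in the sample.
    observed = abs(_mean(group_a) - _mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        # Step 2: if A = B, the group labels are interchangeable, so shuffle them.
        rng.shuffle(pooled)
        diff = abs(_mean(pooled[:n_a]) - _mean(pooled[n_a:]))
        # Step 3: count differences at least as large as the observed one.
        if diff >= observed - 1e-12:
            at_least_as_extreme += 1
    # Step 4 is left to the reader: compare the returned value with 0.05.
    return at_least_as_extreme / n_permutations
```

Identical groups give a p-value of 1, while two clearly separated groups give a value close to 0.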
Factors influencing statistical significance
• Signal: Difference
• Noise (background): Variance (SD)
• Quantity: Quantity of data
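How the three factors combine can be sketched with a simple z-test for two means. The helper `two_sided_p` and the assumption of equal SD and equal group sizes are illustrative choices, not part of the slides.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(difference, sd, n_per_group):
    """Two-sided p-value of a z-test for two means (equal SD, equal group sizes)."""
    standard_error = sd * sqrt(2 / n_per_group)  # noise, reduced by the quantity of data
    z = difference / standard_error              # signal divided by noise
    return 2 * (1 - NormalDist().cdf(abs(z)))


print(two_sided_p(5, 10, 20))   # baseline
print(two_sided_p(10, 10, 20))  # larger difference -> smaller p
print(two_sided_p(5, 20, 20))   # larger SD -> larger p
print(two_sided_p(5, 10, 80))   # more data -> smaller p
```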
Random vs Systematic error
[Figure: repeated measurements 01 to 05 on a 130 to 170 scale. Random error: measurements scattered around the true value. Systematic error (bias): measurements shifted away from the true value.]
Example: Systolic Blood Pressure (mm Hg)
P-value
A “statistically significant” result (p<.05)
tells us NOTHING about clinical or scientific importance. It only tells us that the results are unlikely to be due to chance alone.
A p-value does NOT account for bias, only for random error
STAT REPORT
P-value
A “very low” p-value does NOT imply:
– Clinical relevance (NO!!!)
– Magnitude of the treatment effect (NO!!)
With ↑ n or ↓ variability ⇒ ↓ p
• Please never compare p-values!! (NO!!!)
RCT from a statistical point of view
1 homogeneous population => Randomisation => 2 distinct populations
– Treatment A
– Treatment B (control)
• Statistics can never PROVE anything beyond any doubt, just beyond reasonable doubt!!
• … because of working with samples and random error
Type I & II Error & Power
                         Reality (Population)
Conclusion (sample)      A=B                  A≠B
“A=B” (p>0.05)           OK                   Type II error (β)
“A≠B” (p<0.05)           Type I error (α)     OK
Utility of Believing in the Existence of God (according to Pascal)
                          Reality
Pascal's decision         God exists            God does not exist
God exists                Correct               No penalty
God does not exist        Eternal damnation     Correct
H0: God does not exist
H1: God exists
Type I & II Error & Power
Type I Error (α)
– False positive
– Rejecting the null hypothesis when in fact it is true
– Standard: α=0.05
– In words, chance of finding statistical significance when in fact there truly was no effect
Type II Error (β)
– False negative
– Failing to reject the null hypothesis when in fact the alternative is true
– Standard: β=0.20 or 0.10
– In words, chance of not finding statistical significance when in fact there was an effect
Sample Size
The planned number of participants is calculated on the basis of:
– Expected effect of treatment(s): ↗ effect ↘ number
– Variability of the chosen endpoint: ↗ variability ↗ number
– Accepted risks in conclusion: ↗ risk ↘ number
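The three ingredients can be sketched with the usual normal-approximation formula for comparing two means, n per group = 2·((z(1−α/2) + z(1−β))·SD / effect)². The function name `n_per_group` and the default risks (α = 0.05, power = 0.80) are assumptions made for this illustration.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect, sd, alpha=0.05, power=0.80):
    """Approximate participants per arm to detect a mean difference `effect`."""
    z = NormalDist().inv_cdf
    # Accepted risks: two-sided type I error (alpha) and type II error (1 - power).
    risk_term = z(1 - alpha / 2) + z(power)
    return ceil(2 * (risk_term * sd / effect) ** 2)


print(n_per_group(effect=5, sd=10))               # reference case
print(n_per_group(effect=10, sd=10))              # larger effect -> fewer participants
print(n_per_group(effect=5, sd=20))               # more variability -> more participants
print(n_per_group(effect=5, sd=10, alpha=0.10))   # more accepted risk -> fewer participants
```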
[Figure: three histograms of height (ALTURA; frequency on the y-axis), N = 2000 each: mean 165.1, SD 25.54; mean 165.0, SD 26.94; mean 165.1, SD 32.27]
Sample Size: accepted risks in conclusion
                         Reality (Population)
Conclusion (sample)      A=B                  A≠B
“A=B” (p>0.05)           OK                   Type II error (β)
“A≠B” (p<0.05)           Type I error (α)     POWER (1 − β)
Interval Estimation
Confidence interval: from the lower confidence limit to the upper confidence limit, around the sample statistic (point estimate)
“A probability that the population parameter falls somewhere within the interval”
95% CI
Better than p-values…
– …use the data collected in the trial to give an estimate of the treatment effect size, together with a measure of how certain we are of our estimate
A CI is a range of values within which the “true” treatment effect is believed to be found, with a given level of confidence.
– A 95% CI is built so that, over repeated trials, 95% of such intervals contain the ‘true’ treatment effect
Generally, a 95% CI is calculated as
– Sample Estimate ± 1.96 x Standard Error
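As a minimal sketch of “Sample Estimate ± 1.96 x Standard Error”, the function below computes a normal-approximation CI for a difference in means. The name `mean_difference_ci` is hypothetical, and for small samples a t-based interval would normally be preferred.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_difference_ci(group_a, group_b, confidence=0.95):
    """Normal-approximation CI for the difference in means (A - B)."""
    estimate = mean(group_a) - mean(group_b)
    standard_error = sqrt(
        stdev(group_a) ** 2 / len(group_a) + stdev(group_b) ** 2 / len(group_b)
    )
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    return estimate - z * standard_error, estimate + z * standard_error


lower, upper = mean_difference_ci([5, 6, 7, 8], [1, 2, 3, 4])
print(lower, upper)
```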
Superiority study
95% CI (IC95%) of the difference d:
– d > 0: + effect (test better)
– d = 0: no differences
– d < 0: - effect (control better)
Interpretation of the 95% CI of Treatment-Control
[Figure: 95% CIs plotted against 0 and the lower and upper equivalence boundaries; right: treatment more effective, left: treatment less effective]
– Statistical superiority
– Non-inferiority
– Equivalence
– Inferiority
– Statistical and clinical superiority
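The way a CI is read against 0 and the equivalence boundaries can be sketched as follows. The function `interpret_ci`, the symmetric `margin`, and the convention that higher values favour the treatment are assumptions made for this sketch; a real trial fixes the margin and the hypothesis in the protocol.

```python
def interpret_ci(lower, upper, margin):
    """Label a 95% CI for (treatment - control); higher is better, margin > 0."""
    labels = []
    if lower > margin:
        labels.append("statistically and clinically superior")
    if lower > 0:
        labels.append("statistically superior")
    if lower > -margin:
        labels.append("non-inferior")
    if lower > -margin and upper < margin:
        labels.append("equivalent")
    if upper < 0:
        labels.append("inferior")
    return labels or ["inconclusive"]


print(interpret_ci(1, 3, margin=2))
print(interpret_ci(-1, 1, margin=2))
print(interpret_ci(-3, 3, margin=2))
```

Several labels can apply to the same interval, which is why the function returns a list rather than a single verdict.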
Scales for measuring the effect (risks)
P0       P1       Abs. diff.   Rel. diff.   RR      OR
80.0%    75.0%    -5.0%        -6.3%        0.938   0.750
15.0%    10.0%    -5.0%        -33.3%       0.667   0.630
15.0%    14.0%    -1.0%        -6.7%        0.933   0.922
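The four scales in the table can be reproduced from the two risks. This is a small sketch; the function name `effect_measures` and the dictionary keys are illustrative.

```python
def effect_measures(p0, p1):
    """Absolute difference, relative difference, RR and OR for risks p0 (control) and p1."""

    def odds(p):
        return p / (1 - p)

    return {
        "abs_diff": p1 - p0,
        "rel_diff": (p1 - p0) / p0,
        "rr": p1 / p0,
        "or": odds(p1) / odds(p0),
    }


for p0, p1 in [(0.80, 0.75), (0.15, 0.10), (0.15, 0.14)]:
    print(p0, p1, {name: round(value, 3) for name, value in effect_measures(p0, p1).items()})
```

The first two rows share the same absolute difference (-5.0%) yet differ widely on the relative scales, which is the point of the table.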
Calculating RR and OR
RR or OR > 1: risk factor
RR or OR = 1: no ‘effect’
RR or OR < 1: protective factor
Calculating RR and OR
[Figure: exposed and unexposed groups, with the diseased subjects marked]
Proportion in the exposed: 0.50
Proportion in the unexposed: 0.25
RR = 2
Odds in the exposed: 2/2 => 1
Odds in the unexposed: 1/3
OR = 3
           Diseased    Not diseased
Exp        8           1000000
No Exp     4           1000000

$$OR = \frac{a/b}{c/d} = \frac{8/1000000}{4/1000000} = 2$$

$$RR = \frac{a/(a+b)}{c/(c+d)} = \frac{8/1000008}{4/1000004} \approx 2$$
           Diseased    Not diseased
Exp        524288      1000000
No Exp     262144      1000000

$$OR = \frac{a/b}{c/d} = \frac{524288/1000000}{262144/1000000} = 2$$

$$RR = \frac{a/(a+b)}{c/(c+d)} = \frac{524288/1524288}{262144/1262144} = 1.6560$$
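The two tables can be checked with a few lines of code. The function name `rr_and_or` and the cell labels a, b, c, d (exposed diseased, exposed not diseased, unexposed diseased, unexposed not diseased) follow the usual 2x2 convention and are assumed here.

```python
def rr_and_or(a, b, c, d):
    """Relative risk and odds ratio from a 2x2 table.

    a, b: diseased / not diseased among the exposed
    c, d: diseased / not diseased among the unexposed
    """
    relative_risk = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a / b) / (c / d)
    return relative_risk, odds_ratio


print(rr_and_or(8, 1_000_000, 4, 1_000_000))            # rare disease: OR and RR almost equal
print(rr_and_or(524_288, 1_000_000, 262_144, 1_000_000))  # common disease: OR overstates RR
```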
Let's be critical
Obtaining the results
Is the statistical technique used appropriate?
[Figure: line chart of Survey A and Survey B over 7 time points, y-axis from 0 to 35]
• T-Test
• Repeated-measures ANOVA
Let's be critical
Claims made without specifying the results
Percentages without the denominator
Means without a confidence interval
Can I trust the value?
Let's be critical: one more example
A patient is advised to undergo surgery and asks about the probability of surviving.
The surgeon answers that in the 30 operations he has performed, no patient has died.
Which values of P(dying) are compatible with this information, with 95% confidence?
Let's be critical: solution
Upper limit of the 95% CI for p=0 with n=30: Pr(X=0, n=30, ps) = 0.025
The approximate solution does not work. Exact solution, based on the binomial distribution:
{0; 0.116}
Even if mortality is 11.6%, no death will be observed in 30 operations with Pr=0.025
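The exact limit follows from solving (1 − p)^n = 0.025 for p. A minimal sketch, with a hypothetical function name:

```python
def upper_limit_zero_events(n, confidence=0.95):
    """Exact two-sided upper confidence limit for a proportion when 0 of n events occur."""
    alpha = 1 - confidence
    # Solve Pr(X = 0 | n, p) = (1 - p) ** n = alpha / 2 for p.
    return 1 - (alpha / 2) ** (1 / n)


upper = upper_limit_zero_events(30)
print(round(upper, 3))            # 0.116
print(round((1 - upper) ** 30, 3))  # 0.025
```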
Let's be critical
If data are available...
... they must not be wasted. Data that are ‘tortured’ well enough will sing in the end.
p<0.05 !!!
... And what about the denominator? The famous fantastic dog
Because afterwards, what happens is what happens
Key statistical issues
– Multiplicity
– Subgroups: interaction & confounding
– Superiority and non-inferiority (and Δ)
– Adjustment by covariates
– Missing data
– Others
  – Interim analyses
  – Meta-analysis vs one pivotal study
  – Flexible designs
MULTIPLICITY
Roland Garros Tournament 1999, 1st Round
Carlos Moyá vs Markus Hipfl
                                Moyá               Hipfl
Total games won                 22                 24
Total points won                147                146
1st serve                       62%                69%
Aces                            5                  3
Double faults                   4                  5
% points won on 1st serve       63 of 95 = 66%     61 of 96 = 64%
% points won on 2nd serve       25 of 58 = 43%     20 of 44 = 45%
Winners (including serve)       30                 56
Unforced errors                 62                 75
Break points won                6 of 21 = 29%      6 of 27 = 22%
Net approaches                  48 of 71 = 68%     29 of 41 = 71%
Fastest serve speed             200 KPH            193 KPH
Average 1st serve speed         157 KPH            141 KPH
Average 2nd serve speed         132 KPH            126 KPH

Set             1   2   3   4   5
Carlos Moyá     3   1   6   6   6
Markus Hipfl    6   6   4   4   4
Lancet 2005; 365: 1591–95
To say it colloquially, torture the data until they speak...
Torturing data…
– Investigators examine additional endpoints, manipulate group comparisons, do many subgroup analyses, and undertake repeated interim analyses.
– Investigators should report all analytical comparisons implemented. Unfortunately, they sometimes hide the complete analysis, handicapping the reader’s understanding of the results.
Lancet 2005; 365: 1591–95
Design Conduction Results
Multiplicity
K independent hypotheses: $H_{01}, H_{02}, \ldots, H_{0K}$
S significant results ($p < \alpha$)

$$\Pr(S \ge 1 \mid H_{01} \cap H_{02} \cap \ldots \cap H_{0K} = H_{0.}) = 1 - \Pr(S = 0 \mid H_{0.}) = 1 - (1 - \alpha)^K$$

K     Pr(S>=1|H0.)     K      Pr(S>=1|H0.)
1     0.0500           10     0.4013
2     0.0975           15     0.5367
3     0.1426           20     0.6415
4     0.1855           25     0.7226
5     0.2262           30     0.7854
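The table can be regenerated directly from 1 − (1 − α)^K. A short sketch, with a function name chosen here for illustration:

```python
def prob_at_least_one_false_positive(k, alpha=0.05):
    """Pr(S >= 1) for k independent tests at level alpha when every null is true."""
    return 1 - (1 - alpha) ** k


for k in (1, 2, 3, 4, 5, 10, 15, 20, 25, 30):
    print(k, round(prob_at_least_one_false_positive(k), 4))
```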
Some examples
                        case A     case B     case C
Variables               2          5          5
Times                   2          4          4
Subgroups               2          3          3
Comparisons             1          1          3
Total                   8          60         180
False positive rate     33.66%     95.39%     99.99%
Multiplicity
Bonferroni correction (simplified version)
– K tests with a global significance level of α
– Each test can be tested at the α/K level
Example:
– 5 independent tests
– Global level of significance = 5%
– Each test should be tested at the 1% level: 5% / 5 => 1%
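The simplified Bonferroni rule amounts to one division and one comparison per test. A minimal sketch; the function name and the example p-values are made up for illustration.

```python
def bonferroni(p_values, alpha=0.05):
    """Flag each p-value as significant at the Bonferroni-adjusted level alpha / K."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


# Five tests at a global 5% level: each is compared with 1%.
print(bonferroni([0.001, 0.02, 0.009, 0.04, 0.3]))
```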
But this is the simplified version for the general public
Cautionary Example
RCT to treat rheumatoid arthritis
Basic Clin Med 1981, 15: 445
Several end-points repeated at various timepoints and various subdivisions
48 of these gave p-values < 0.05
But… expect 5% of 850 = 850/20 = 42.5
=> so finding 48 is not very impressive
Some strategies to deal with multiple contrasts
Handling Multiplicity in Variables
Scenario 1: One Primary Variable
– Identify one primary variable -- other variables are secondary
– Trial is positive if and only if the primary variable shows significant (p < 0.05), positive results
Handling Multiplicity in Variables
Scenario 2: Divide Type I Error
– Identify two (or more) co-primary variables
– Divide the 0.05 experiment-wise Type I error over these co-primary variables, e.g., 0.04 for the 1st, and 0.01 for the 2nd co-primary variable
– Trial is positive if at least one of the co-primary variables shows significant, positive results
Handling Multiplicity in Variables
Scenario 3: Sequentially Rejective Procedure
– Identify n co-primary variables, e.g., n = 3
– Order obtained p-values
  Interpret the variable with the highest p-value at the 0.05 level;
  if significant, then interpret the variable with the 2nd highest p-value at the 0.05/2 level;
  if positive, then interpret the variable with the smallest p-value at the 0.05/3 level.
  Test procedure stops when a test is not significant.
Handling Multiplicity in Variables
Scenario 4: Hierarchy
– Prespecify hierarchy among n co-primary variables
– All tested at the same level:
  interpret 1st variable at 0.05 level; if significant, then interpret 2nd variable at 0.05 level; if positive, then interpret 3rd variable at 0.05 level; …
  Test procedure stops when a test is not significant.
– Trial is positive if first co-primary variable shows significant, positive result
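The hierarchical procedure can be sketched as a loop that stops at the first non-significant test. The function name `fixed_sequence` and the example p-values are hypothetical; the order of the list stands for the prespecified hierarchy.

```python
def fixed_sequence(p_values_in_order, alpha=0.05):
    """Test variables in their prespecified order, each at the full alpha level.

    Testing stops at the first non-significant result; later variables
    cannot be claimed, whatever their p-values.
    """
    results = []
    still_testing = True
    for p in p_values_in_order:
        significant = still_testing and p < alpha
        results.append(significant)
        if not significant:
            still_testing = False
    return results


print(fixed_sequence([0.01, 0.20, 0.001]))  # [True, False, False]
```

In the example the third variable has the smallest p-value but cannot be claimed, because the second one failed.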
Secondary Variables
Secondary variables can be claimed if and only if
– the primary variable shows significant results, and
– the comparisons related to the secondary variables are also protected under the same Type I error rate as the primary variable.
Similar procedures to those already discussed can be used to protect Type I error
Handling Multiplicity in Treatments
Similar procedures to those used to handle multiplicity in variables.
Additional procedures are available, mainly geared to very specific settings of the statistical hypotheses.
– Dunnett, Scheffé, REGW, Williams …
SUBGROUPS
Subgroups
Indiscriminate subgroup analyses pose serious multiplicity concerns. Problems reverberate throughout the medical literature. Even after many warnings, some investigators doggedly persist in undertaking excessive subgroup analyses.
Lancet 2000; 355: 1033–34
Lancet 2005; 365: 1657–61
Interaction
Age < 45 years    Age >= 45 years
d=5%
d=0.7%    d=11.5%
Confounding factors
Non-smokers    Smokers
d=6%
d=0%
d=0%
Subgroups & Simpson’s Paradox
                      Experimental     Control
                      n (%)            n (%)
ALL       Success     70 (70%)         60 (60%)
          Failure     30 (30%)         40 (40%)
          Total       100              100

Subgroups & Simpson’s Paradox (cont.)
                      Experimental     Control
                      n (%)            n (%)
MALE      Success     10 (33%)         24 (40%)
          Failure     20 (67%)         36 (60%)
          Total       30               60
FEMALE    Success     60 (86%)         36 (90%)
          Failure     10 (14%)         4 (10%)
          Total       70               40
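The reversal can be verified from the counts in the tables. A small sketch; the `COUNTS` dictionary and helper names are illustrative.

```python
# (successes, total) per sex and arm, taken from the tables above.
COUNTS = {
    "male": {"experimental": (10, 30), "control": (24, 60)},
    "female": {"experimental": (60, 70), "control": (36, 40)},
}


def rate(successes, total):
    return successes / total


def pooled_rate(arm):
    """Success rate for one arm after adding up the subgroups."""
    successes = sum(COUNTS[sex][arm][0] for sex in COUNTS)
    total = sum(COUNTS[sex][arm][1] for sex in COUNTS)
    return rate(successes, total)


for sex, arms in COUNTS.items():
    print(sex, {arm: round(rate(*counts), 2) for arm, counts in arms.items()})
print("all", {arm: round(pooled_rate(arm), 2) for arm in ("experimental", "control")})
```

Control does better within each sex, yet experimental does better overall, because the experimental arm contains many more women, who have higher success rates in both arms.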
Subgroups
ISIS-2: Vascular death by star signs
                   Gemini/Libra               Other star signs
                   Aspirin      Placebo       Aspirin      Placebo
Vascular death     150          147           654          868
Total              1357         1442          7228         7157
                   11.1%        10.2%         9.0%         12.1%
                   p=0.42045, d=-0.9          p<0.0001, d=3.1
Interaction p = 0.019
Lancet 1988; 2: 349–60.
Changes from ISIS-2 results
Lancet 2005; 365: 1657–61
“The answer to a randomized controlled trial that does not confirm one’s beliefs is not the conduct of several subanalyses until one can see what one believes. Rather, the answer is to re-examine one’s beliefs carefully.”
– BMJ 1999; 318: 1008–09.
Lancet 2005; 365: 1657–61
The question is NOT: ‘Is the treatment effect in this subgroup statistically significantly different from zero?’
BUT… are there any differences in the treatment effect between the various subgroups?
The correct statistical procedures are either a test of heterogeneity or a test for interaction
Subgroups
Recommendations:
– 1) Examine the global effect
– 2) Test for the interaction
– 3) Plan adjustments for confirmatory analyses
– 4) Some points which increase the credibility:
  Pre-specification
  Biologic plausibility
Lancet 2005; 365: 176–86
MULTIPLE INSPECTIONS
Interim Analyses in the CDP
[Figure: Z value (from -2 to +2) plotted against month of follow-up (10 to 100), shown in two panels]
(Month 0 = March 1966, Month 100 = July 1974)
Coronary Drug Project Mortality Surveillance. Circulation. 1973;47:I-1
http://clinicaltrials.gov/ct/show/NCT00000483;jsessionid=C4EA2EA9C3351138F8CAB6AFB723820A?order=23
Lancet 2005; 365: 1657–61
Types of sequential design
1) Sample size re-estimation
2) Group sequential methods
3) Alpha-spending function approach
4) Repeated confidence intervals
5) Stochastic curtailment
6) Bayesian methods
7) Continuous boundaries (likelihood function)
Design NOT suited to a sequential method
Analysis?
Total duration
Recruitment
Design suited to a sequential method
Analysis
Total duration
Recruitment
Group sequential methods
– Pocock (1977)
– Repeated significance tests
– K = maximum number of inspections to be performed
– K fixed a priori
– Analysis with classical statistical tests (χ², t-test, ...)
Group Sequential Methods
      O'Brien & Fleming     Peto                Pocock
K     z        α'           z        α'         z        α'
1     2.782    0.005        2.576    0.010      2.178    0.029
2     1.967    0.049        1.969    0.049      2.178    0.029

1     3.438    0.001        2.576    0.010      2.289    0.022
2     2.431    0.015        2.576    0.010      2.289    0.022
3     1.985    0.047        1.969    0.049      2.289    0.022

1     4.084    0.000        3.291    0.001      2.361    0.018
2     2.888    0.004        3.291    0.001      2.361    0.018
3     2.358    0.018        3.291    0.001      2.361    0.018
4     2.042    0.041        1.969    0.049      2.361    0.018

1     4.555    0.000        3.291    0.001      2.413    0.016
2     3.221    0.001        3.291    0.001      2.413    0.016
3     2.630    0.009        3.291    0.001      2.413    0.016
4     2.277    0.023        3.291    0.001      2.413    0.016
5     2.037    0.042        1.969    0.049      2.413    0.016
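Each pair of columns in the table links a critical z value to a nominal two-sided significance level. That link can be sketched as below; the function name `nominal_alpha` is an assumption, and the code only converts boundaries, it does not derive them.

```python
from statistics import NormalDist


def nominal_alpha(z_boundary):
    """Two-sided nominal significance level matching a critical z value."""
    return 2 * (1 - NormalDist().cdf(z_boundary))


# Two inspections: Pocock uses the same boundary at both looks,
# O'Brien & Fleming is strict at the first look and close to 0.05 at the last.
print(round(nominal_alpha(2.178), 3))  # Pocock, both looks
print(round(nominal_alpha(2.782), 3))  # O'Brien & Fleming, look 1
print(round(nominal_alpha(1.967), 3))  # O'Brien & Fleming, look 2
```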
Two-sided triangular model
CPMP/EWP/482/99: PtC on Switching between Superiority and Non-Inferiority
&
CPMP/EWP/2158/99: PtC on the Choice of Delta
RANDOMIZATION & COVARIATES
Adjustment
– The objective should not be to compensate for imbalance (randomisation) but to improve the precision
– Avoid adjusting for post-randomization variables
– In RCTs, never use this widespread strategy: “adjust for any baseline variable that is significant (5% or 10% level)”
Stratification
– A priori
– May desire to have treatment groups balanced with respect to prognostic or risk factors (covariates)
– For large studies, randomization “tends” to give balance
– For smaller studies a better guarantee may be needed
– Useful only to a limited extent (especially for small trials): avoid too many variables (i.e. many empty or partly filled strata)
Testing for “baseline homogeneity”
– All observed differences are known with certainty to be due to chance.
– We must not test for it: there is no alternative hypothesis whose truth can be supported by such a test.
– If significant, the estimator is still unbiased
– Balance:
  – Decreases the variance and increases the power.
  – It has no effect on type I error.
Observed imbalance…
NEVER justifies post-hoc adjustment:
– Randomization is more important
– The treatment effect is unbiased without adjustment (randomization)
– The type I error level already accounts for “chance error”
– Post-hoc: data-driven analyses
– Multiplicity issues: allowing a post-hoc adjustment increases type I error
Adjusted Analyses
‘When the potential value of an adjustment is in doubt, it is often advisable to nominate the unadjusted analysis as the one for primary attention, the adjusted analysis being supportive.’
Adjustment for covariates
– Defined a priori
– The appearance of baseline imbalances does NOT justify adjustment per se:
  – Randomisation is given more importance
  – Danger of post-hoc analyses
  – Multiplicity
– As a general strategy, adjusting for baseline variables that turn out significant (e.g. p<0.1 or p<0.05) is NOT valid
Definition of the different populations of a study
Objective: To evaluate the efficacy of a weight-reduction programme compared with the usual advice
Design: Randomised clinical trial
Candidates: 790
Obese: 320
Intervention group: 161
Control group: 159
Refusal: 59    Spontaneous request: 54
Completed: 102    Completed: 105
Which populations should be analysed?
             Intervention group     Control group
Option A     161                    159
Option B     102                    105
Option C     59                     54
Option D     156                    164
MISSING DATA
Ex: LOCF & linear extrapolation
[Figure: ADAS-Cog score (4 to 36; higher is worse, lower is better) against time (0 to 18 months), comparing LOCF with linear regression extrapolation; the gap between the two is labelled as bias]
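LOCF (last observation carried forward) can be sketched in a few lines. The function name `locf`, the use of `None` for a missing visit, and the example scores are illustrative assumptions.

```python
def locf(values):
    """Replace each missing value (None) with the last observed value before it."""
    filled = []
    last_observed = None
    for value in values:
        if value is not None:
            last_observed = value
        filled.append(last_observed)
    return filled


# A patient whose score worsens by 2 points per visit and who drops out after visit 2:
observed = [20, 22, None, None]
print(locf(observed))  # [20, 22, 22, 22]
```

In a progressive condition the carried-forward score stays frozen while the true score would keep worsening, which is where the bias in the figure comes from.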
Ex: Early drop-out due to AE
[Figure: ADAS-Cog score (4 to 36; higher is worse, lower is better) against time (0 to 18 months) for Placebo and Active]
Bias: favours Active
Ex: Early drop-out due to lack of efficacy
[Figure: ADAS-Cog score (4 to 36; higher is worse, lower is better) against time (0 to 18 months) for Placebo and Active]
Bias: favours Placebo
Drop-outs and missing data
[Figure: patients randomised (RND) to A or B and followed from Baseline through Visit 1 and Visit 2 to the Last Visit; drop-outs occur with different frequencies (≠ frequencies) in the two arms]
Drop-outs and missing data
[Figure: patients randomised (RND) to A or B and followed from Baseline through Visit 1 and Visit 2 to the Last Visit; drop-outs occur at different times (≠ timing) in the two arms]
MDMD e incorrecto uso de poblaciones e incorrecto uso de poblaciones (1)(1)
DiseñoDiseño Cirugía vs Tratamiento Médico en Cirugía vs Tratamiento Médico en
estenosis carotidea bilateral (Sackket et estenosis carotidea bilateral (Sackket et al., 1985)al., 1985)
Variable principalVariable principal: Número de pacientes : Número de pacientes que presenten TIA, ACV o muerteque presenten TIA, ACV o muerte
Distribución de los pacientes:Distribución de los pacientes: Pacientes randomizados:Pacientes randomizados: 167167 Tratamiento quirúrgico: Tratamiento quirúrgico: 94 94 Tratamiento médico:Tratamiento médico: 73 73
– Pacientes que no completaron el Pacientes que no completaron el estudio debido a ACV en las fases estudio debido a ACV en las fases iniciales de hospitalización: iniciales de hospitalización:
Tratamiento quirúrgico: 15 pacientesTratamiento quirúrgico: 15 pacientesTratamiento médico:Tratamiento médico: 01 pacientes 01 pacientes
MD and incorrect use of analysis populations (2)
First analysis performed:
Per-Protocol (PP) population: patients who completed the study
Analysis
– Surgical treatment: 43 / (94 - 15) = 43 / 79 = 54%
– Medical treatment: 53 / (73 - 1) = 53 / 72 = 74%
– Risk reduction: 27%, p = 0.02
MD and incorrect use of analysis populations (3)
The definitive analysis is as follows:
Intention-to-Treat (ITT) population: all randomised patients
Analysis
– Surgical treatment: 58 / 94 = 62%
– Medical treatment: 54 / 73 = 74%
– Risk reduction: 18%, p = 0.09 (PP: 27%, p = 0.02)
Conclusions:
– The correct analysis population is the ITT.
– Surgical treatment has not been shown to be significantly superior to medical treatment.
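The per-protocol and intention-to-treat figures can be recomputed from the counts given on these slides. This is a sketch of the arithmetic only; it does not reproduce the p-values, and the function names are illustrative. Computed from unrounded proportions, the relative reductions come out at roughly 26% and 17%, a point or so away from the rounded slide figures.

```python
def event_rate(events: int, n: int) -> float:
    """Proportion of patients with the primary event (TIA, stroke or death)."""
    return events / n


def relative_risk_reduction(rate_test: float, rate_control: float) -> float:
    """Relative reduction in risk of the test arm versus the control arm."""
    return 1 - rate_test / rate_control


# Per protocol: patients with an early stroke are dropped from both
# the numerator and the denominator.
pp_surgical = event_rate(43, 94 - 15)
pp_medical = event_rate(53, 73 - 1)

# Intention to treat: all randomised patients are counted, so the early
# strokes come back as events (43 + 15 = 58 and 53 + 1 = 54).
itt_surgical = event_rate(43 + 15, 94)
itt_medical = event_rate(53 + 1, 73)

rrr_pp = relative_risk_reduction(pp_surgical, pp_medical)
rrr_itt = relative_risk_reduction(itt_surgical, itt_medical)

print(f"PP : surgical {pp_surgical:.0%}, medical {pp_medical:.0%}, reduction {rrr_pp:.1%}")
print(f"ITT: surgical {itt_surgical:.0%}, medical {itt_medical:.0%}, reduction {rrr_itt:.1%}")
```

Excluding the 15 early strokes in the surgical arm removes exactly the events that count against surgery, which is why the per-protocol result looks more favourable.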
Handling of MD
Methods for imputation:
– Many techniques
– No gold standard for every situation
– In principle, all methods may be valid:
  Simple methods to more complex:
  – From LOCF to multiple imputation methods
  – Worst Case, “Mean methods”
  Multiple Imputation
  But their appropriateness has to be justified
Statistical approaches less sensitive to MD:
– Mixed models
– Survival models
They assume no relationship between treatment and the missing outcome, and generally this cannot be assumed.
CONCLUSION
JAMA 2002; 287: 1807-1814
Effect Size & Sample Size

Relative effect size (%)    Power* (%)    Absolute difference (mmHg)
---------------------------------------------------------------------
  0%                         4.9%          0.0
 10%                         5.9%          0.2
 20%                         8.5%          0.4
 30%                        13.3%          0.6
 40%                        20.2%          0.8
 50%                        28.2%          1.0
 60%                        39.3%          1.2
 70%                        49.3%          1.4
 80%                        61.1%          1.6
 90%                        71.0%          1.8
100%                        80.4%          2.0
---------------------------------------------------------------------
*Statistical power assuming constant variability (SD = 20 mmHg)
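The shape of this table can be approximated with a normal-approximation power formula for comparing two means. The sketch below is not the method used for the slide: the sample size of 1570 per group is an assumption, back-calculated because it gives about 80% power at 2 mmHg with SD = 20 mmHg, and the slide's own values (e.g. 4.9% at zero effect) may come from simulation, so agreement is only approximate.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample(delta: float, sd: float, n_per_group: int,
                     alpha: float = 0.05) -> float:
    """Approximate power of a two-sided z-test for a difference in means.

    delta: true difference between group means
    sd: common standard deviation
    n_per_group: patients in each of the two groups
    """
    z = NormalDist()
    se = sd * sqrt(2.0 / n_per_group)   # standard error of the difference
    z_crit = z.inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = 0.05
    shift = delta / se
    # Probability of rejecting in either tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


N_PER_GROUP = 1570   # assumption, not stated on the slide
SD = 20.0            # mmHg, from the slide footnote
FULL_EFFECT = 2.0    # mmHg, the 100% row of the table

for pct in range(0, 101, 10):
    delta = FULL_EFFECT * pct / 100
    print(f"{pct:3d}%  {delta:.1f} mmHg  power = {power_two_sample(delta, SD, N_PER_GROUP):.1%}")
```

Halving the true effect cuts the power from about 80% to under 30%, which is the point of the table: a trial sized for an optimistic effect is badly underpowered for a smaller one.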
CPMP/EWP/482/99: PtC on Switching between Superiority and Non-Inferiority
&
CPMP/EWP/2158/99: PtC on the Choice of Delta
NON-INFERIORITY TRIALS
NEED
– Legal implications.
– Methodological implications.
– Ethical and practical limitations on the use of placebo.
– Practical limitations on showing superiority over an active control.
– Need for comparative information.
– Possible added value.
Power approach (classical test + power calculation)
Lancet 2001,356: 1668-75
Added value
– Dosing: once daily
– Route: oral
– Safety: adverse events
– Special populations: elderly, paediatrics
– Interactions
Equivalence trials
– Bioequivalence trials (generic vs marketed product)
– Our product is not worse and may offer other advantages (safety, dosing convenience …)
  – Non-inferiority
SUPERIORITY STUDY
[Figure: 95% CI drawn on the axis of the difference d. d < 0: negative effect; d = 0: no difference; d > 0: positive effect. Axis labels: "Test better", "Control better".]
INTERVAL ESTIMATION (SUPERIORITY STUDY)
The difference is statistically significant.
[Figure: 95% CI drawn on the axis of the difference d. d < 0: negative effect; d = 0: no difference; d > 0: positive effect. Axis labels: "Test better", "Control better".]
INTERVAL ESTIMATION (SUPERIORITY STUDY)
The difference is statistically significant with P = 0.05 (right at the limit).
[Figure: 95% CI drawn on the axis of the difference d. d < 0: negative effect; d = 0: no difference; d > 0: positive effect. Axis labels: "Test better", "Control better".]
EQUIVALENCE STUDY
[Figure: axis of the difference d (d < 0: negative effect; d = 0: no difference; d > 0: positive effect) with the region of clinical equivalence lying between -Δ and +Δ.]
Delta (Δ):
• the largest difference without clinical relevance
or
• the smallest difference with clinical relevance
THERAPEUTIC NON-INFERIORITY
[Figure: difference axis marked at -Δ and 0, split into regions labelled "Non-inferiority" and "No non-inferiority". Axis labels: "Test better", "Control better".]
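The confidence-interval reading of superiority, equivalence and non-inferiority can be written as three simple comparisons. This is a minimal sketch under stated assumptions: d is defined as test minus control with positive values favouring the test treatment, the interval is a two-sided 95% CI, and the margin Δ was fixed before the trial. The function name is illustrative.

```python
def interpret_ci(lower: float, upper: float, margin: float) -> dict:
    """Classify a confidence interval for d = test - control.

    Assumes positive d favours the test treatment and margin (delta) > 0.
    """
    if margin <= 0:
        raise ValueError("margin must be positive")
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    return {
        # Whole interval above zero: test is better than control.
        "superior": lower > 0,
        # Whole interval above -margin: test is not unacceptably worse.
        "non_inferior": lower > -margin,
        # Whole interval inside (-margin, +margin): no relevant difference.
        "equivalent": lower > -margin and upper < margin,
    }


# Hypothetical intervals, all judged against a margin of 3 units.
print(interpret_ci(1.0, 5.0, 3.0))    # superior and non-inferior, not equivalent
print(interpret_ci(-2.0, 2.5, 3.0))   # non-inferior and equivalent, not superior
print(interpret_ci(-4.0, 1.0, 3.0))   # none of the three can be claimed
```

The same interval answers all three questions; only the reference line it is compared against changes.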
Main efficacy End-Point
[Bar chart, y-axis 0% to 100%: Active 40%, Placebo 10%; difference 30%. Markers A, B and P; annotation "1/2 ? 1/3 ?".]
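One way to read the "1/2 ? 1/3 ?" annotation is as candidate non-inferiority margins expressed as a fraction of the active-versus-placebo difference. That reading is an assumption, and the sketch below only does the arithmetic for the 40% and 10% response rates shown; it is not a rule for choosing delta.

```python
from fractions import Fraction


def candidate_margin(active_rate: Fraction, placebo_rate: Fraction,
                     fraction: Fraction) -> Fraction:
    """Margin taken as a fraction of the active-vs-placebo difference."""
    return (active_rate - placebo_rate) * fraction


active = Fraction(40, 100)    # response rate on active, from the slide
placebo = Fraction(10, 100)   # response rate on placebo, from the slide

half = candidate_margin(active, placebo, Fraction(1, 2))
third = candidate_margin(active, placebo, Fraction(1, 3))

print(f"1/2 of the effect: {float(half):.0%}")
print(f"1/3 of the effect: {float(third):.0%}")
```

A smaller fraction gives a stricter margin and therefore a larger trial.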
Main efficacy End-Point
[Bar chart, y-axis 0% to 100%, with bars for Active 1, Active 2, Active 3, Placebo 1, Placebo 2 and Placebo 3. Data labels as extracted: 40%, 15%, 45%, 40%, 20%, 10%.]
Main efficacy End-Point
[Bar chart, y-axis 0% to 100%, with bars for Active 1, Active 2, Active 3, Placebo 1, Placebo 2 and Placebo 3. Data labels as extracted: 33%, 8%, 33%, 13%, 3%, 40%, 15%, 58%, 40%, 20%, 10%, 47%, 22%, 65%, 47%, 27%, 17%, 51%.]
Main efficacy End-Point
[Bar chart, y-axis 0% to 100%, with bars for Active REF, Placebo and Active Test. Data labels: 40% and 10%; difference 30%.]