DESCRIPTION
Lecture on validation measures in epidemiology for master's students in public health and epidemiology at Karolinska Institutet in Stockholm, Sweden, on 24 October 2013.
ACCURACY (AND OTHER VALIDATION MEASURES)
Adina L. Feldman, M.Sc.
Karolinska Institutet Department of Medical Epidemiology and Biostatistics
e-mail: [email protected] tel. 08 5248 2313
24 October 2013 Adina L. Feldman 1
[Figure: 2×2 diagram with SYSTEMATIC ERROR (Low to High) on one axis and RANDOM ERROR (Low to High) on the other; labelled "Validity"]
Accuracy reflects systematic error (potential bias)
(Random error/precision is related to power, e.g. size of study sample)
Validity is what we call the certainty (accuracy) of a proxy measure/test
Why is knowing the validity of a measure important?
Consider these examples:
What is the validity of breast cancer screening (mammography)?
What is the validity of home pregnancy tests?
What is the validity of self-reported height? …weight?
What is the validity of register-based Parkinson’s disease diagnoses?
Gold Standard
= The best available measure against which the measure under study is validated
Discuss: What gold standard was used in these validations?
Breast cancer screening (mammography)?
Home pregnancy tests?
Self-reported height? …weight?
Register-based Parkinson’s disease diagnoses?
                            Gold Standard
                            Binary        Continuous
Test measure   Binary
               Continuous
Discuss: Where do these validations fit in?
Breast cancer screening (mammography)?
Home pregnancy tests?
Self-reported height? …weight?
Register-based Parkinson’s disease diagnoses?
                            Gold Standard
                            Binary                     Continuous
Test measure   Binary       Reg PDx                    X
               Continuous   Preg test, BC screening    Height, Weight
Discuss: Where do these validations fit in?
Breast cancer screening (mammography)?
Home pregnancy tests?
Self-reported height? …weight?
Register-based Parkinson’s disease diagnoses?
                            Gold Standard
                            Binary                            Continuous
Test measure   Binary       Sensitivity, Specificity, etc.    X
               Continuous   ROC-curves                        Correlations, Bland-Altman plots
Different validation methods are used for different types of validation studies!
These are covered (or at least mentioned) today
Validity measures for binary outcomes
(Print and pin to your office wall!)

                              Gold Standard
                              Positive +             Negative -
Outcome measure  Positive +   True Positive (TP)     False Positive (FP)    Positive Predictive Value (PPV) = TP / (TP + FP)
                 Negative -   False Negative (FN)    True Negative (TN)     Negative Predictive Value (NPV) = TN / (TN + FN)
                              Sensitivity            Specificity
                              = TP / (TP + FN)       = TN / (TN + FP)
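The four formulas in the table can be sketched in a few lines of Python. This is only an illustration: the function name `validity_measures` and the cell counts below are made up for the example and do not come from the lecture.

```python
def validity_measures(tp, fp, fn, tn):
    """Return the four standard validity measures from 2x2 cell counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of gold-standard positives detected
        "specificity": tn / (tn + fp),  # share of gold-standard negatives cleared
        "ppv": tp / (tp + fp),          # share of test positives that are true
        "npv": tn / (tn + fn),          # share of test negatives that are true
    }


# Hypothetical counts, chosen only for illustration
measures = validity_measures(tp=40, fp=10, fn=20, tn=130)
for name, value in measures.items():
    print(f"{name}: {value:.1%}")
```

Note that sensitivity and specificity are computed down the gold-standard columns, while PPV and NPV are computed along the outcome-measure rows.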
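These are less commonly used measures, but still good to know

                              Gold Standard
                              Positive +             Negative -
Outcome measure  Positive +   True Positive (TP)     False Positive (FP)        False Positive Rate (FPR), cPPV (= 1 - PPV) = FP / (TP + FP)
                 Negative -   False Negative (FN)    True Negative (TN)         False Negative Rate (FNR), cNPV (= 1 - NPV) = FN / (TN + FN)
                              True Positive Rate     FPR (NB!) (= 1 - Spec.)
                              = Sens.                = FP / (FP + TN)

Accuracy = (TP + TN) / (TP + TN + FP + FN)

NB: "FPR" appears twice with different denominators. FP / (TP + FP) is the complement of PPV (elsewhere often called the false discovery rate); FP / (FP + TN) is the complement of specificity.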
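A short Python sketch makes the difference between the two quantities labelled "FPR" concrete. The function name `complement_measures` and the counts are hypothetical, used only for illustration.

```python
def complement_measures(tp, fp, fn, tn):
    """Return accuracy and the complement measures from 2x2 cell counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,        # all correct classifications
        "one_minus_spec": fp / (fp + tn),     # FPR as the complement of specificity
        "cppv": fp / (tp + fp),               # 1 - PPV
        "cnpv": fn / (tn + fn),               # 1 - NPV
    }


# Hypothetical counts, chosen only for illustration
comp = complement_measures(tp=40, fp=10, fn=20, tn=130)
for name, value in comp.items():
    print(f"{name}: {value:.1%}")
```

With these counts the two "false positive" quantities differ clearly (10/140 against 10/50), which is why it matters to state the denominator.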
Misclassification
FN and FP are misclassifications
Consider the cause of misclassification:
FN: Why are some cases not detected?
FP: Why are some noncases given erroneous diagnoses?
Differential misclassification: FN and FP are not randomly distributed with regard to the exposure
Misclassification
Discuss: What could be the cause of FP and FN in these validations? What could be the consequences of misclassification here?
Breast cancer screening (mammography)?
Home pregnancy tests?
Self-reported height? …weight?
Register-based Parkinson’s disease diagnoses?
Fictional Example 1
Cohort study of 10,000 participants (random population-based sample)
Binary proxy measure, e.g. self-reported myocardial infarction (“heart attack”) ever/never
Binary Gold Standard, e.g. myocardial infarction confirmed according to best clinical practice
Example 1

                              Gold Standard
                              Positive +    Negative -
Outcome measure  Positive +   90            5             PPV?
                 Negative -   10            9,895         NPV?
                              Sens.?        Spec.?        GS prev.?
                                                          OM prev.?
Example 1: ↓ prevalence, ↑ PPV, ↑ Sens.

                              Gold Standard
                              Positive +    Negative -
Outcome measure  Positive +   90            5                PPV 94.7%
                 Negative -   10            9,895            NPV 100% (99.899%)
                              Sens.         Spec.            GS prev. 1.0%
                              90.0%         100% (99.95%)    OM prev. 0.95%
Fictional Example 2
Cohort study of 10,000 participants (random population-based sample)
Binary proxy measure, e.g. self-reported influenza during one winter season yes/no
Binary Gold Standard, e.g. laboratory-confirmed infection with influenza virus
Example 2

                              Gold Standard
                              Positive +    Negative -
Outcome measure  Positive +   1,950         1,400         PPV?
                 Negative -   50            6,600         NPV?
                              Sens.?        Spec.?        GS prev.?
                                                          OM prev.?
Example 2: ↑ prevalence, ↓ PPV, ↑ Sens.

                              Gold Standard
                              Positive +    Negative -
Outcome measure  Positive +   1,950         1,400         PPV 58.2%
                 Negative -   50            6,600         NPV 99.2%
                              Sens.         Spec.         GS prev. 20.0%
                              97.5%         82.5%         OM prev. 33.5%
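The PPV values in Examples 1 and 2 can also be recovered from sensitivity, specificity and gold-standard prevalence alone, which shows how PPV depends on all three. The sketch below uses the standard relation PPV = Sens × prev / (Sens × prev + (1 − Spec) × (1 − prev)); the function name `ppv_from_prevalence` and the third, low-prevalence scenario are illustrative assumptions, not part of the lecture.

```python
def ppv_from_prevalence(sens, spec, prev):
    """PPV implied by sensitivity, specificity and gold-standard prevalence."""
    true_pos = sens * prev                # expected share of true positives
    false_pos = (1 - spec) * (1 - prev)   # expected share of false positives
    return true_pos / (true_pos + false_pos)


# Example 1: TP=90, FP=5, FN=10, TN=9,895
ex1 = ppv_from_prevalence(sens=90 / 100, spec=9895 / 9900, prev=100 / 10000)

# Example 2: TP=1,950, FP=1,400, FN=50, TN=6,600
ex2 = ppv_from_prevalence(sens=1950 / 2000, spec=6600 / 8000, prev=2000 / 10000)

# Hypothetical scenario: the Example 2 test applied where prevalence is only 1%
ex2_low_prev = ppv_from_prevalence(sens=1950 / 2000, spec=6600 / 8000, prev=0.01)

print(f"Example 1 PPV: {ex1:.1%}")
print(f"Example 2 PPV: {ex2:.1%}")
print(f"Example 2 test at 1% prevalence: {ex2_low_prev:.1%}")
```

The first two values reproduce the PPVs in the tables (94.7% and 58.2%). The third illustrates that the same sensitivity and specificity give a much lower PPV when the condition is rare.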
Discussion points
Many validation studies have access to only one of the following:
Only Gold Standard positive cases
Only proxy outcome positive cases
What validity measures can be calculated in each instance?
Two-phase screening is a very common approach to diagnosing disease, e.g. breast cancer (mammography followed by ultrasound, cytology)
What type of validity is most important in each phase?
Measures with discrimination threshold for binary outcomes
[Figure: frequency of cases (y-axis) against a continuous measure, e.g. biomarker concentration in blood (x-axis), with separate distributions for GS- and GS+; the discrimination threshold splits them into TN, FN, FP and TP]
Gold Standard: Reduced insulin sensitivity based on established clinical index cutoff
Proxy test: Appendicular lean body mass (LBM) index (kg/m²)
The threshold for LBM is varied and for each step the sensitivity and 1-specificity for the GS are calculated and plotted
The goal is to determine the optimal threshold for LBM in predicting reduced insulin sensitivity
AUC = Area Under the Curve (%) (Bigger = Better)
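The procedure described above (vary the threshold, compute sensitivity and 1-specificity at each step, then summarise with the AUC) can be sketched as follows. Everything here is an assumption made for illustration: the scores and gold-standard labels are invented, higher scores are taken to indicate gold-standard positives, and Youden's J (sensitivity + specificity - 1) is used as one common criterion for an "optimal" threshold, since the slide does not name one.

```python
def roc_rows(scores, gold):
    """Return (threshold, 1 - specificity, sensitivity) per candidate threshold.

    A subject counts as test-positive when score >= threshold (an assumption;
    flip the comparison if low values indicate disease).
    """
    n_pos = sum(gold)
    n_neg = len(gold) - n_pos
    # Threshold above every score: nobody is test-positive
    rows = [(float("inf"), 0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, g in zip(scores, gold) if s >= threshold and g == 1)
        fp = sum(1 for s, g in zip(scores, gold) if s >= threshold and g == 0)
        rows.append((threshold, fp / n_neg, tp / n_pos))
    return rows


def auc(rows):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (_, x0, y0), (_, x1, y1) in zip(rows, rows[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def youden_threshold(rows):
    """Threshold maximising sensitivity + specificity - 1 (Youden's J)."""
    return max(rows, key=lambda r: r[2] - r[1])[0]


# Invented data: 10 subjects, gold standard coded 1 = positive, 0 = negative
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
gold = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

rows = roc_rows(scores, gold)
area = auc(rows)
best = youden_threshold(rows)
print(f"AUC = {area:.2f}, threshold with highest Youden's J = {best}")
```

Other criteria for choosing the threshold are possible, for example weighting false negatives more heavily than false positives in a screening setting.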
Pearson correlation coefficient overall = 0.61
Afternoon group exercise: Ad hoc study of the validity of self-reported height
Define Gold Standard
Method of ascertainment of self-reported height
Collect data
Proxy
Gold Standard
Using Excel
Plot correlation (scatter plot)
Bland-Altman plot
Draw conclusion
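The exercise is meant to be done in Excel; for readers who prefer code, the same two analyses can be sketched in Python instead. The heights below are invented for illustration, and the limits of agreement use the conventional mean difference ± 1.96 standard deviations, which is an assumption since the slides do not spell out the calculation.

```python
from math import sqrt
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def bland_altman(proxy, gold):
    """Return plot coordinates, mean difference (bias) and limits of agreement."""
    diffs = [p - g for p, g in zip(proxy, gold)]        # y-axis of the plot
    means = [(p + g) / 2 for p, g in zip(proxy, gold)]  # x-axis of the plot
    bias = mean(diffs)
    sd = stdev(diffs)
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return means, diffs, bias, limits


# Invented heights in cm: proxy = self-reported, gold standard = measured
self_reported = [170, 182, 165, 178, 160, 175]
measured = [168.5, 181.0, 165.5, 176.0, 159.0, 174.5]

r = pearson_r(self_reported, measured)
means, diffs, bias, limits = bland_altman(self_reported, measured)
print(f"Pearson r = {r:.3f}")
print(f"Mean difference (self-reported minus measured) = {bias:.2f} cm")
print(f"Limits of agreement: {limits[0]:.2f} to {limits[1]:.2f} cm")
```

A high correlation only shows that the two measures rank people similarly. The Bland-Altman bias shows whether self-report is systematically too high or too low, which the correlation cannot reveal.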
Welcome to my PhD dissertation defence, 10 January 2014 at 9 am in Andreas Vesalius, Karolinska Institutet Campus Solna
Dissertation title: “If I Only Had a Brain – Epidemiological Studies of Parkinson’s Disease”
Thank You! (See you this afternoon)