#CMIMI18
Real-World Performance of Deep-Learning-Based System for Intracranial Hemorrhage Detection
Sehyo Yune, MD MPH MBA
Hyunkwang Lee, Stuart Pomerantz, Javier Romero, Shahmir Kamalian, Ramon Gonzalez, Michael Lev, Synho Do
Department of Radiology, Massachusetts General Hospital
Laboratory of Medical Imaging and Computation
Promise vs. Reality
Intracranial Hemorrhage Detection System
SAH 99.99%
IPH 0.24%
SDH 0.08%
EDH 0.06%
IVH 0.01%
Data Collection
Searched the institutional research database for non-contrast head CT scans acquired February 2005 - August 2017
Only used 5-mm axial images
Exclusion criteria:
  History of brain surgery
  Intracranial tumor
  Intracranial device placement
  Skull fracture
  Cerebral infarct
Data Collection
Data annotation
  Development set: consensus of 5 neuroradiologists at slice level
  Test set: confirmation of clinical report by 1 neuroradiologist at case level

          Development dataset          Test dataset
          # of cases   # of images     # of cases   # of images
ICH (+)   625          5,240           100          3,525
ICH (-)   279          9,518           100          3,411
Total     904          14,758          200          6,936
Performance of ICH Detection System
AUC, 0.993; 98% sensitivity; 95% specificity
Dots denote radiologist performance: 1st year resident, 2nd year resident, 3rd year resident, attending radiologist (9-year experience), attending radiologist (20-year experience)
ROC curve for detection of ICH
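As a reminder of what the two kinds of numbers on this slide mean, here is a minimal sketch. AUC summarizes the whole ROC curve, while a sensitivity/specificity pair is a single operating point at one threshold. The scores and the 0.5 threshold below are made-up toy values for illustration only; they are not outputs of the system described in these slides.

```python
def auc(scores_pos, scores_neg):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def operating_point(scores_pos, scores_neg, threshold):
    """Sensitivity and specificity when scores >= threshold are called ICH (+)."""
    sensitivity = sum(s >= threshold for s in scores_pos) / len(scores_pos)
    specificity = sum(s < threshold for s in scores_neg) / len(scores_neg)
    return sensitivity, specificity


# Hypothetical case-level scores (toy data, not from the study).
toy_pos = [0.95, 0.80, 0.60, 0.30]
toy_neg = [0.70, 0.20, 0.10, 0.05]

print("AUC:", auc(toy_pos, toy_neg))
print("Operating point at 0.5:", operating_point(toy_pos, toy_neg, 0.5))
```

Moving the threshold trades sensitivity against specificity along the same curve, which is why a radiologist can be plotted as a single dot against the model's ROC curve.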
Will This Work in the Real-World?
Real-world data collection
All consecutive cases of non-contrast head CT acquired at the emergency department from September – November 2017
2,606 cases collected
Labeled by using natural language processing of clinical reports
163 ICH (+), 2,443 ICH (-)
Model Performance Comparison
Selected test dataset (reference: clinical report + expert confirmation)
                     Model prediction
                     ICH (+)       ICH (-)
Reference ICH (+)    98            2             Sensitivity: 98%
Reference ICH (-)    5             95            Specificity: 95%
                     PPV: 95.1%    NPV: 97.9%

Real-world test dataset (reference: clinical report)
                     Model prediction
                     ICH (+)       ICH (-)
Reference ICH (+)    142           21            Sensitivity: 87.1%
Reference ICH (-)    1,018         1,425         Specificity: 58.3%
                     PPV: 12.2%    NPV: 98.5%

NPV, negative predictive value; PPV, positive predictive value
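The four summary figures in each table follow directly from the confusion-matrix counts. The sketch below recomputes them from the counts shown above. The class and function names are illustrative, not part of the authors' system. The last helper applies Bayes' rule to show how PPV depends on prevalence as well as on sensitivity and specificity, which matters here because the selected set is 50% ICH (+) while the real-world set is 163 of 2,606.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # reference ICH (+), model ICH (+)
    fn: int  # reference ICH (+), model ICH (-)
    fp: int  # reference ICH (-), model ICH (+)
    tn: int  # reference ICH (-), model ICH (-)

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    def prevalence(self) -> float:
        return (self.tp + self.fn) / (self.tp + self.fn + self.fp + self.tn)


def ppv_at_prevalence(sens: float, spec: float, prev: float) -> float:
    """PPV implied by a given sensitivity, specificity, and prevalence (Bayes' rule)."""
    true_pos = sens * prev
    false_pos = (1.0 - spec) * (1.0 - prev)
    return true_pos / (true_pos + false_pos)


# Counts copied from the two tables above.
selected = Confusion(tp=98, fn=2, fp=5, tn=95)
real_world = Confusion(tp=142, fn=21, fp=1018, tn=1425)

for name, cm in [("Selected", selected), ("Real-world", real_world)]:
    print(f"{name}: sens={cm.sensitivity():.1%} spec={cm.specificity():.1%} "
          f"PPV={cm.ppv():.1%} NPV={cm.npv():.1%} prevalence={cm.prevalence():.1%}")

# Hypothetical scenario: selected-set sensitivity/specificity at real-world prevalence.
print(f"{ppv_at_prevalence(0.98, 0.95, real_world.prevalence()):.1%}")
```

The final line is a what-if calculation, not a reported result: it shows that PPV would fall at real-world prevalence even if sensitivity and specificity had stayed at their selected-set values.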
Model Performance Comparison
ROC curve on selected dataset (n=200)
ROC curve on real-world dataset (n=2,606)
Why?
Difference in Vendor Distribution
Data distribution (number of cases)
             Development          Selected test        Real-world test
CT vendor    ICH (+)   ICH (-)    ICH (+)   ICH (-)    ICH (+)   ICH (-)
A            725       233        68        87         96        1,445
B            101       1          32        12         67        998
C            55        12         -         -          -         -
D            23        33         -         1          -         -

Vendor A performance in real-world dataset
                           Model prediction
                           ICH (+)    ICH (-)
Clinical report ICH (+)    79         17         Sensitivity: 82.3%
Clinical report ICH (-)    383        1,062      Specificity: 73.5%

Vendor B performance in real-world dataset
                           Model prediction
                           ICH (+)    ICH (-)
Clinical report ICH (+)    63         4          Sensitivity: 94.0%
Clinical report ICH (-)    635        363        Specificity: 36.4%
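The per-vendor breakdown can be checked against the pooled real-world result with a short script. This is a minimal sketch using the counts from the two vendor tables. The variable and function names are illustrative. It confirms that the vendor A and vendor B matrices add up to the pooled real-world matrix (142, 21, 1,018, 1,425) and reports each vendor's share of false positives next to its share of ICH (-) cases.

```python
# Per-vendor real-world counts as (tp, fn, fp, tn), copied from the tables above.
vendor_counts = {
    "A": (79, 17, 383, 1062),
    "B": (63, 4, 635, 363),
}


def sens_spec(tp, fn, fp, tn):
    """Return (sensitivity, specificity) for one confusion matrix."""
    return tp / (tp + fn), tn / (tn + fp)


# Summing element-wise across vendors should reproduce the pooled matrix.
pooled = tuple(sum(column) for column in zip(*vendor_counts.values()))
print("Pooled (tp, fn, fp, tn):", pooled)

total_fp = pooled[2]
total_negatives = pooled[2] + pooled[3]

for vendor, (tp, fn, fp, tn) in vendor_counts.items():
    sens, spec = sens_spec(tp, fn, fp, tn)
    print(f"Vendor {vendor}: sens={sens:.1%} spec={spec:.1%} "
          f"share of ICH (-) cases={(fp + tn) / total_negatives:.1%} "
          f"share of false positives={fp / total_fp:.1%}")
```

Comparing the last two columns of the printout is a quick way to see whether one vendor contributes more false positives than its share of negative cases would suggest.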
Review of False Negative Cases
21 FN cases reviewed by a neuroradiologist with > 20 years of experience
  8 no acute bleeding (report hedging)
  11 small bleeding not visualized on axial CT images
  2 small (3 mm, 10 mm) acute subdural hematoma
Review of False Positive Cases
1,018 false-positive cases (5,269 slices) split into 5 sets to be reviewed by 5 neuroradiologists
  Hyperdense falx or tentorium (1,580 slices / 420 cases)
  CT artifacts (1,545 slices / 463 cases)
  Bleeding (875 slices / 92 cases)
  Other non-bleed pathology (663 slices / 149 cases)
  Calcification (348 slices / 130 cases)
  Others (743 slices / 373 cases)
False Positive Case
Falx hyperdensity
Hyperdense Falx/tentorium
False-positive case True-negative case True-positive case
CT Artifact
Motion, streak, beam hardening, head tilt, etc.
Beam Hardening Artifact in Dentate Nucleus
False-positive case True-negative case
Other False Positive Cases
Bleeding (875 slices / 92 cases)
  Chronic ICH, extracranial bleeding, hemorrhagic tumor
Other non-bleed pathology (663 slices / 149 cases)
  Encephalomalacia, meningioma, metastatic mass, vasogenic edema, post-surgical change, old infarct
Others (743 slices / 373 cases)
  Dense blood vessels, deep sulcus, subdural hygroma
Review of False Positive Cases
Number of cases acquired by scanners from the two vendors by FP category

Vendor   Falx/tentorium   CT artifacts   Bleeding     Other pathology   Calcification   Others        Total
A        113 (26.9%)      121 (26.1%)    43 (46.7%)   76 (51.0%)        72 (55.4%)      128 (34.3%)   383 (37.6%)
B        307 (73.1%)      342 (73.9%)    49 (53.3%)   73 (49.0%)        58 (44.6%)      245 (65.7%)   635 (62.4%)
Total    420              463            92           149               130             373           1,018

Red text indicates significantly larger numbers. Statistical significance was determined by Pearson's χ2 test.
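A Pearson χ2 test on this table can be sketched in a few lines of standard-library Python. The slides do not state which contingency tables were tested, so the layout here is an assumption: one 2x2 table per FP category, with vendor (A or B) against whether the false-positive case falls in that category. Categories are tested separately because a case can appear in more than one (the category totals add up to more than 1,018). No continuity correction is applied.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square test (1 degree of freedom, no continuity correction)
    for the 2x2 table [[a, b], [c, d]]. Returns (statistic, p_value)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# False-positive cases per vendor, copied from the table's Total column.
fp_total = {"A": 383, "B": 635}

# (vendor A cases, vendor B cases) per FP category, copied from the table.
fp_by_category = {
    "Falx/tentorium": (113, 307),
    "CT artifacts": (121, 342),
    "Bleeding": (43, 49),
    "Other pathology": (76, 73),
    "Calcification": (72, 58),
    "Others": (128, 245),
}

for category, (in_a, in_b) in fp_by_category.items():
    # Assumed layout: rows = vendor A / B, columns = in category / not in category.
    stat, p = pearson_chi2_2x2(in_a, fp_total["A"] - in_a,
                               in_b, fp_total["B"] - in_b)
    print(f"{category}: chi2={stat:.2f}, p={p:.3g}")
```

Under a different table layout, for example comparing against each vendor's full set of ICH (-) cases rather than only its false positives, the statistics would differ, so the output should be read as illustrative rather than as a reproduction of the reported significance marks.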
Future Work
Validate the labels assigned by NLP of clinical reports
Improve the model by re-training the CNNs
  Distinguish chronic vs. subacute vs. acute bleeding
  Recognize other pathologies
Validate the improved model in a new setting (different CT manufacturers, image acquisition/reconstruction protocols, patient populations)
Optimize parameters for each clinical setting before deployment
Take Home Message
Know the reality, use it accordingly