#CMIMI18

Real-World Performance of Deep-Learning-Based System for Intracranial Hemorrhage Detection

Sehyo Yune, MD MPH MBA
Hyunkwang Lee, Stuart Pomerantz, Javier Romero, Shahmir Kamalian, Ramon Gonzalez, Michael Lev, Synho Do

Department of Radiology, Massachusetts General Hospital
Laboratory of Medical Imaging and Computation

2018. 9. 26.




Promise vs. Reality


Intracranial Hemorrhage Detection System

SAH 99.99%

IPH 0.24%

SDH 0.08%

EDH 0.06%

IVH 0.01%


Data Collection

  • Searched the institutional research database for non-contrast head CT scans acquired February 2005 - August 2017
  • Only used 5-mm axial images
  • Exclusion criteria:
      • History of brain surgery
      • Intracranial tumor
      • Intracranial device placement
      • Skull fracture
      • Cerebral infarct


Data Collection

Data annotation
  • Development set: consensus of 5 neuroradiologists at slice level
  • Test set: confirmation of clinical report by 1 neuroradiologist at case level

           Development dataset         Test dataset
           # of cases  # of images     # of cases  # of images
ICH (+)       625         5,240           100         3,525
ICH (-)       279         9,518           100         3,411
Total         904        14,758           200         6,936


Performance of ICH Detection System

AUC, 0.993
98% sensitivity
95% specificity

Dots denote radiologist performance: 1st-year resident, 2nd-year resident, 3rd-year resident, attending radiologist (9 years of experience), attending radiologist (20 years of experience)

ROC curve for detection of ICH


Will This Work in the Real World?


Real-world data collection

  • All consecutive cases of non-contrast head CT acquired at the emergency department from September to November 2017
  • 2,606 cases collected
  • Labeled by using natural language processing of clinical reports
  • 163 ICH (+), 2,443 ICH (-)
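The real-world set is far less balanced than the selected test set (163 of 2,606 cases positive versus 100 of 200), and that alone changes predictive values. The sketch below applies Bayes' rule to show the effect, under the assumption that the 98% sensitivity and 95% specificity reported for the selected test set carried over unchanged. The function name `predictive_values` is illustrative and not part of the presented system.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = sensitivity * prevalence              # true-positive fraction
    fn = (1 - sensitivity) * prevalence        # false-negative fraction
    tn = specificity * (1 - prevalence)        # true-negative fraction
    fp = (1 - specificity) * (1 - prevalence)  # false-positive fraction
    return tp / (tp + fp), tn / (tn + fn)


# Prevalence taken from the case counts on the slides.
selected_prevalence = 100 / 200
real_world_prevalence = 163 / 2606

# Assumption: selected-test-set sensitivity and specificity hold everywhere.
ppv_selected, npv_selected = predictive_values(0.98, 0.95, selected_prevalence)
ppv_real, npv_real = predictive_values(0.98, 0.95, real_world_prevalence)

print(f"prevalence {selected_prevalence:.1%}: PPV {ppv_selected:.1%}, NPV {npv_selected:.1%}")
print(f"prevalence {real_world_prevalence:.1%}: PPV {ppv_real:.1%}, NPV {npv_real:.1%}")
```

Even with unchanged sensitivity and specificity, PPV falls substantially at the lower prevalence, so part of any PPV gap between the two datasets is expected from case mix alone.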


Model Performance Comparison

Selected test dataset (reference: clinical report + expert confirmation)

                       Model prediction
                       ICH (+)   ICH (-)
Reference  ICH (+)        98         2      Sensitivity: 98%
           ICH (-)         5        95      Specificity: 95%
                       PPV: 95.1%   NPV: 97.9%

Real-world test dataset (reference: clinical report)

                       Model prediction
                       ICH (+)   ICH (-)
Reference  ICH (+)       142        21      Sensitivity: 87.1%
           ICH (-)     1,018     1,425      Specificity: 58.3%
                       PPV: 12.2%   NPV: 98.5%

NPV, negative predictive value; PPV, positive predictive value
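The four summary metrics on this slide follow directly from the cell counts of each confusion matrix. A minimal sketch of that arithmetic, using the counts shown above; the `ConfusionMatrix` class is a hypothetical helper, not something from the presented system:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    tp: int  # reference ICH (+), model ICH (+)
    fn: int  # reference ICH (+), model ICH (-)
    fp: int  # reference ICH (-), model ICH (+)
    tn: int  # reference ICH (-), model ICH (-)

    @property
    def sensitivity(self):
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self):
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self):
        return self.tp / (self.tp + self.fp)

    @property
    def npv(self):
        return self.tn / (self.tn + self.fn)


selected = ConfusionMatrix(tp=98, fn=2, fp=5, tn=95)
real_world = ConfusionMatrix(tp=142, fn=21, fp=1018, tn=1425)

for name, cm in [("selected", selected), ("real-world", real_world)]:
    print(
        f"{name}: sensitivity {cm.sensitivity:.1%}, specificity {cm.specificity:.1%}, "
        f"PPV {cm.ppv:.1%}, NPV {cm.npv:.1%}"
    )
```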


Model Performance Comparison

ROC curve on selected data set (n=200)
ROC curve on real-world data set (n=2,606)


Why?


Difference in Vendor Distribution

Data distribution

             Development        Selected test      Real-world test
CT vendor    ICH(+)   ICH(-)    ICH(+)   ICH(-)    ICH(+)   ICH(-)
A              725      233       68       87        96     1,445
B              101        1       32       12        67       998
C               55       12
D               23       33                 1

Vendor A performance in real-world dataset

                             Model prediction
                             ICH (+)   ICH (-)
Clinical report  ICH (+)        79        17      Sensitivity: 82.3%
                 ICH (-)       383     1,062      Specificity: 73.5%

Vendor B performance in real-world dataset

                             Model prediction
                             ICH (+)   ICH (-)
Clinical report  ICH (+)        63         4      Sensitivity: 94.0%
                 ICH (-)       635       363      Specificity: 36.4%
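The per-vendor matrices on this slide can be checked against the pooled real-world matrix (142 / 21 / 1,018 / 1,425), since every real-world case came from vendor A or B. A small sketch of that stratified computation, using the slide's counts; the variable names are illustrative:

```python
# Real-world counts per vendor, ordered as (TP, FN, FP, TN).
vendor_counts = {
    "A": (79, 17, 383, 1062),
    "B": (63, 4, 635, 363),
}


def sens_spec(tp, fn, fp, tn):
    """Return (sensitivity, specificity) for one confusion matrix."""
    return tp / (tp + fn), tn / (tn + fp)


# Summing the strata cell by cell should reproduce the pooled matrix.
pooled = tuple(sum(cells) for cells in zip(*vendor_counts.values()))

per_vendor = {vendor: sens_spec(*cells) for vendor, cells in vendor_counts.items()}
for vendor, (sens, spec) in per_vendor.items():
    print(f"vendor {vendor}: sensitivity {sens:.1%}, specificity {spec:.1%}")
print("pooled (TP, FN, FP, TN):", pooled)
```

Stratifying this way makes it visible that the pooled specificity is an average of two quite different operating points rather than a property of either scanner population.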



Review of False Negative Cases

21 FN cases reviewed by a neuroradiologist with more than 20 years of experience

  • 8 no acute bleeding (report hedging)
  • 11 small bleeding not visualized on axial CT images
  • 2 small (3 mm, 10 mm) acute subdural hematoma
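The review attributes 8 of the 21 false negatives to hedged report wording rather than to missed acute bleeds. As a rough what-if, the sketch below recomputes sensitivity and specificity on the assumption that those 8 cases would be relabeled ICH (-) and that no other labels change. The slides do not report these adjusted figures, and the false-positive cases are not relabeled here, so this is only illustrative arithmetic.

```python
# Real-world confusion matrix as reported (reference: NLP-labeled clinical report).
tp, fn, fp, tn = 142, 21, 1018, 1425

# Assumption: the 8 "report hedging" cases are truly ICH (-), so they move
# from the false-negative cell to the true-negative cell.
relabeled = 8
fn_adjusted = fn - relabeled
tn_adjusted = tn + relabeled

sensitivity_reported = tp / (tp + fn)
sensitivity_adjusted = tp / (tp + fn_adjusted)
specificity_adjusted = tn_adjusted / (tn_adjusted + fp)

print(f"reported sensitivity: {sensitivity_reported:.1%}")
print(f"adjusted sensitivity: {sensitivity_adjusted:.1%}")
print(f"adjusted specificity: {specificity_adjusted:.1%}")
```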



Review of False Positive Cases

1,018 false-positive cases (5,269 slices) split into 5 sets to be reviewed by 5 neuroradiologists

  • Hyperdense falx or tentorium (1,580 slices / 420 cases)
  • CT artifacts (1,545 slices / 463 cases)
  • Bleeding (875 slices / 92 cases)
  • Other non-bleed pathology (663 slices / 149 cases)
  • Calcification (348 slices / 130 cases)
  • Others (743 slices / 373 cases)


False Positive Case

Falx hyperdensity


Hyperdense Falx/tentorium

Panels: false-positive case, true-negative case, true-positive case


CT Artifact

Motion, streak, beam hardening, head tilt, etc.


Beam Hardening Artifact in Dentate Nucleus

Panels: false-positive case, true-negative case


Other False Positive Cases

  • Bleeding (875 slices / 92 cases): chronic ICH, extracranial bleeding, hemorrhagic tumor
  • Other non-bleed pathology (663 slices / 149 cases): encephalomalacia, meningioma, metastatic mass, vasogenic edema, post-surgical change, old infarct
  • Others (743 slices / 373 cases): dense blood vessels, deep sulcus, subdural hygroma


Review of False Positive Cases

Number of cases acquired by scanners from the two vendors, by FP category

Vendor  Falx/tentorium  CT artifacts  Bleeding    Other pathology  Calcification  Others       Total
A       113 (26.9%)     121 (26.1%)   43 (46.7%)  76 (51.0%)       72 (55.4%)     128 (34.3%)  383 (37.6%)
B       307 (73.1%)     342 (73.9%)   49 (53.3%)  73 (49.0%)       58 (44.6%)     245 (65.7%)  635 (62.4%)
Total   420             463           92          149              130            373          1,018

Red text indicates significantly larger numbers. Statistical significance was determined by Pearson's χ2 test.
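The slide does not state which contingency table was passed to Pearson's χ2 test. One plausible framing, shown here purely as an assumption, is a 2×2 table per category: vendor (A or B) against whether a false-positive case fell into that category, using the 383 and 635 vendor totals. The sketch computes the uncorrected Pearson statistic and its 1-degree-of-freedom p-value with the standard library only; `pearson_chi2_2x2` is a hypothetical helper name.

```python
import math


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# False-positive case counts per category as (vendor A, vendor B), from the table.
fp_by_category = {
    "Falx/tentorium": (113, 307),
    "CT artifacts": (121, 342),
    "Bleeding": (43, 49),
    "Other pathology": (76, 73),
    "Calcification": (72, 58),
    "Others": (128, 245),
}
total_a, total_b = 383, 635  # all false-positive cases per vendor

results = {}
for category, (in_a, in_b) in fp_by_category.items():
    # Rows: vendor A, vendor B. Columns: in category, not in category.
    results[category] = pearson_chi2_2x2(in_a, total_a - in_a, in_b, total_b - in_b)
    chi2, p_value = results[category]
    print(f"{category}: chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Because a case can appear in more than one category (the category counts sum to more than 1,018), each category is tested separately here rather than as one 2×6 table.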


Future Work

  • Validate the labels assigned by NLP of clinical reports
  • Improve the model by re-training the CNNs
      • Distinguish chronic vs. subacute vs. acute bleeding
      • Recognize other pathologies
  • Validate the improved model in a new setting (different CT manufacturers, image acquisition/reconstruction protocols, patient populations)
  • Optimize parameters for each clinical setting before deployment


Take Home Message

Know the reality, use it accordingly