4
The Frontier of AI-Assisted Care Scientific Symposium at Stanford (2019) AI-Assisted Thyroid Malignancy Prediction From Whole-Slide Images David Dov * , Shahar Z. Kovalsky , Jonathan Cohen , Danielle Elliott Range § , Ricardo Henao*, Lawrence Carin* With a rapidly increasing incidence rate, thyroid cancer is among the most common cancers [1]. Among the various tests used to diagnose thyroid malignancy prior to surgery, a key test is fine needle aspiration biopsy (FNAB). In this test, follicular (thyroid) cells are smeared onto a glass slide, stained and manually inspected under a microscope by an expert in cytopathology [2]. In many cases, however, FNABs result in indeterminate diagnoses leading to unnecessary surgeries. In this work, we establish a dataset of whole-slide digital thyroid cytopathology images and propose a deep- learning-based algorithm for computational preoperative prediction of thyroid malignancy. Our algorithm was trained on 799 whole-slide scans and tested on a set of 109 scans never seen during the training procedure. Experimental results show that the proposed algorithm achieves human-level thyroid malignancy predictions (compared to three expert cytopathologists). In particular, the algorithm provides a correct and conclusive diagnosis, with zero false (either positive or negative) predictions, for 35% of the cases, demonstrating its potential use as a screening tool. We further show that the proposed algorithm may be used as an assistive diagnostic tool, helping to improve pathologists’ decisions in indeterminate cases. The full description of the medical problem we address in this research and the mathematical details of the proposed algorithm are presented in [3] and [4], respectively. Background The Bethesda System for Reporting Thyroid Cytopathology (TBS) is the universally accepted system for thyroid FNAB analysis [5]. It provides guidelines for the analysis of an FNAB slide according to which a cytopathologist assigns a score, known as a TBS category, indicating the risk of thyroid malignancy. This assessment is based on the structure of groups of follicular (thyroid) cells as well as the characterization of individual cells (e.g., size, nuclear features, etc.). There are six TBS categories: TBS 1 indicates an inadequately prepared slide, and is excluded from this work; TBS 2 indicates benign findings and requires no treatment but surveillance; TBS 6 indicates malignancy and is referred to surgery; lastly, the intermediate categories TBS 3, 4, and 5 indicate inconclusive findings with an increased risk of malignancy. Figure 1: (Top left) Whole-slide scan. (Bottom left) Heat map of the pre- dictions of regions containing follicular cells. (Right) Zoom-in on the region marked by the red rectangle. We propose a supervised deep- learning approach for the prediction of thyroid malignancy in cytopathol- ogy images. Related deep-learning ap- proaches have been studied for the pre- diction of thyroid malignancy in ul- trasound imaging [6–11] as well as for histopathologicl analysis performed af- ter surgery [12]. [13–18] are concerned with the computational analysis of pre- operative thyroid cytopathology; their scope, however, is restricted to re- gions (or individual cells) carefully pre- selected by an expert. As such, many of the challenges associated with fully automated cytopathological analysis of whole-slide thyroid FNAB images re- main open. * D. David, R. Henao and L. Carin are with the Department of Electrical and Computer Engineering, Duke University, Durham, NC 27708, USA (e-mail: [email protected]; [email protected]; [email protected]). S. Kovalsky is with the Department of Mathematics, Duke University, Durham, NC 27708, USA (e-mail: [email protected]). J. Cohen is with the Department of Surgery, Duke University Medical Center, Durham, NC 27710, USA (e-mail: [email protected]). § D. Elliott Range is with the Department of Pathology, Duke University Medical Center, Durham, NC 27710, USA (e-mail: [email protected]).

AI-Assisted Thyroid Malignancy Prediction From Whole-Slide ... · surgeries. In this work, we establish a dataset of whole-slide digital thyroid cytopathology images and propose a

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: AI-Assisted Thyroid Malignancy Prediction From Whole-Slide ... · surgeries. In this work, we establish a dataset of whole-slide digital thyroid cytopathology images and propose a

The Frontier of AI-Assisted Care Scientific Symposium at Stanford (2019)

AI-Assisted Thyroid Malignancy Prediction From Whole-Slide Images

David Dov∗, Shahar Z. Kovalsky†, Jonathan Cohen‡, Danielle Elliott Range§, Ricardo Henao*, Lawrence Carin*

With a rapidly increasing incidence rate, thyroid cancer is among the most common cancers [1]. Among the varioustests used to diagnose thyroid malignancy prior to surgery, a key test is fine needle aspiration biopsy (FNAB). In thistest, follicular (thyroid) cells are smeared onto a glass slide, stained and manually inspected under a microscope by anexpert in cytopathology [2]. In many cases, however, FNABs result in indeterminate diagnoses leading to unnecessarysurgeries. In this work, we establish a dataset of whole-slide digital thyroid cytopathology images and propose a deep-learning-based algorithm for computational preoperative prediction of thyroid malignancy. Our algorithm was trainedon 799 whole-slide scans and tested on a set of 109 scans never seen during the training procedure. Experimentalresults show that the proposed algorithm achieves human-level thyroid malignancy predictions (compared to threeexpert cytopathologists). In particular, the algorithm provides a correct and conclusive diagnosis, with zero false(either positive or negative) predictions, for 35% of the cases, demonstrating its potential use as a screening tool. Wefurther show that the proposed algorithm may be used as an assistive diagnostic tool, helping to improve pathologists’decisions in indeterminate cases. The full description of the medical problem we address in this research and themathematical details of the proposed algorithm are presented in [3] and [4], respectively.

Background

The Bethesda System for Reporting Thyroid Cytopathology (TBS) is the universally accepted system for thyroidFNAB analysis [5]. It provides guidelines for the analysis of an FNAB slide according to which a cytopathologistassigns a score, known as a TBS category, indicating the risk of thyroid malignancy. This assessment is based on thestructure of groups of follicular (thyroid) cells as well as the characterization of individual cells (e.g., size, nuclearfeatures, etc.). There are six TBS categories: TBS 1 indicates an inadequately prepared slide, and is excluded fromthis work; TBS 2 indicates benign findings and requires no treatment but surveillance; TBS 6 indicates malignancyand is referred to surgery; lastly, the intermediate categories TBS 3, 4, and 5 indicate inconclusive findings with anincreased risk of malignancy.

Figure 1: (Top left) Whole-slide scan. (Bottom left) Heat map of the pre-dictions of regions containing follicular cells. (Right) Zoom-in on the regionmarked by the red rectangle.

We propose a supervised deep-learning approach for the predictionof thyroid malignancy in cytopathol-ogy images. Related deep-learning ap-proaches have been studied for the pre-diction of thyroid malignancy in ul-trasound imaging [6–11] as well as forhistopathologicl analysis performed af-ter surgery [12]. [13–18] are concernedwith the computational analysis of pre-operative thyroid cytopathology; theirscope, however, is restricted to re-gions (or individual cells) carefully pre-selected by an expert. As such, manyof the challenges associated with fullyautomated cytopathological analysis ofwhole-slide thyroid FNAB images re-main open.

∗D. David, R. Henao and L. Carin are with the Department of Electrical and Computer Engineering, Duke University, Durham, NC27708, USA (e-mail: [email protected]; [email protected]; [email protected]).†S. Kovalsky is with the Department of Mathematics, Duke University, Durham, NC 27708, USA (e-mail: [email protected]).‡J. Cohen is with the Department of Surgery, Duke University Medical Center, Durham, NC 27710, USA (e-mail:

[email protected]).§D. Elliott Range is with the Department of Pathology, Duke University Medical Center, Durham, NC 27710, USA (e-mail:

[email protected]).

Page 2: AI-Assisted Thyroid Malignancy Prediction From Whole-Slide ... · surgeries. In this work, we establish a dataset of whole-slide digital thyroid cytopathology images and propose a

The Frontier of AI-Assisted Care Scientific Symposium at Stanford (2019)

Methods

Dataset and sample selection We have established a dataset including all FNABs with a subsequent thyroidectomysurgery from June 2008 through June 2017, as documented in the institutional databases. The dataset comprises 908samples each consisting of a whole slide scan with a typical resolution of ∼ 150, 000 × 100, 000 pixels; postoperativehistopathology diagnosis used as the gold standard (ground truth) in this study; and preoperative TBS categoryassigned by a cytopathologist, as recorded in the medical files. The dataset was split into a training and a test set of799 and 109 samples, respectively. To compare the algorithm to human performance, the slides in the test set wereannotated by three expert cytopathologists.

Proposed Algorithm Most indicative for the diagnosis of thyroid malignancy are groups of follicular cells, whosearchitecture, texture size and color are among the main characteristics determining the TBS category. Folliculargroups, however, are only sparsely distributed on the slide whereas most of the slide is covered by blood cells, whichare diagnostically irrelevant and are considered background. Inspired by the work of pathologists, the proposedalgorithm addresses this challenge via a cascade of two convolutional neural networks. The first network is trainedusing a supervised procedure to identify informative regions of the scan containing follicular cells, distinguishing themfrom background regions. The second network aggregates local decisions, based on the informative regions, into aglobal, slide-level, prediction of thyroid malignancy.

We further propose to simultaneously predict TBS category along with the malignancy prediction from a singleoutput of the network. Specifically, the TBS category is predicted by comparing the output of the network to aset of threshold values via an ordinal regression framework that associates higher TBS categories with increasedprobabilities of malignancy. These TBS predictions regularize the training process, improve the prediction of thyroidmalignancy, and could be further employed as diagnostic screening tool.

Results

0.0 0.2 0.4 0.6 0.8 1.0FPR

0.0

0.2

0.4

0.6

0.8

1.0

TPR

Human level (MR TBS) , auc=0.931Human level (Expert 1 TBS) , auc=0.909Human level (Expert 2 TBS) , auc=0.917Human level (Expert 3 TBS) , auc=0.931Proposed, auc=0.932

0.0 0.2 0.4 0.6 0.8 1.0Recall

0.0

0.2

0.4

0.6

0.8

1.0Pr

ecisi

on

Human level (MR TBS) , ap=0.808Human level (Expert 1 TBS) , ap=0.759Human level (Expert 2 TBS) , ap=0.848Human level (Expert 3 TBS) , ap=0.848Proposed, ap=0.872

Figure 2: Comparison of the proposed algorithm to human level thyroid malig-nancy prediction. (Left) ROC curves. (Right) Precision-recall curves.

Fig. 1 demonstrates how thefirst neural network successfullyidentifies follicular cells (brightcolors in the heat map), distin-guishing them from backgroundregions. We further comparethe proposed algorithm to hu-man decisions based on medicalrecords (MR TBS), as well asthose of three expert cytopathol-ogists (Expert 1 to 3). To evalu-ate human performance, we referto TBS categories 2 to 6 assignedby the cytopathologist as corre-sponding to malignancy predic-tions with increasing probabilities. The results are presented in Fig. 2 in the form of receiver operating characteristic(ROC) and precision-recall curves. Compared to the cytopathologists, the proposed algorithm achieves comparableAUC scores and improved AP scores.

We further advocate the use of TBS predictions obtained with the proposed algorithm as a screening tool. Out of109 tested cases, the algorithm provides 29 predictions of TBS 2 and 10 predictions of TBS 6, all of which (39 total)correctly correspond to benign and malignant cases, respectively. Moreover, these 39 cases include 11 cases that werepreviously classified as indeterminate by a pathologist; thus suggesting that the proposed algorithm can be used asan assistive diagnostic tool, helping pathologists resolve indeterminate cases.

Implications for improving the value of care

While we plan to validate the performance of the proposed algorithm on a substantially increased cohort, currentresults already demonstrate potential for improving the value of care as follows: i) The Bethesda categories predicted

Page 3: AI-Assisted Thyroid Malignancy Prediction From Whole-Slide ... · surgeries. In this work, we establish a dataset of whole-slide digital thyroid cytopathology images and propose a

The Frontier of AI-Assisted Care Scientific Symposium at Stanford (2019)

by the algorithm may be used as a screening tool: all 39 cases predicted by the algorithm as TBS 2 and 6 are indeedbenign and malignant, respectively. ii) The algorithm may be used in remote areas where the availability of expertcytopathologists is limited. iii) The algorithm may improve pathologists’ decisions in indeterminate cases.

References

[1] Briseis Aschebrook-Kilfoy, Rebecca B Schechter, Ya-Chen Tina Shih, Edwin L Kaplan, Brian C-H Chiu, PeterAngelos, and Raymon H Grogan, “The clinical and economic burden of a sustained increase in thyroid cancerincidence,” Cancer Epidemiology and Prevention Biomarkers, 2013.

[2] G. Popoveniuc and J. Jonklaas, “Thyroid nodules,” Medical Clinics, vol. 96, no. 2, pp. 329–349, 2012.

[3] D. Dov, S. Kovalsky, J. Cohen, D. Range, R. Henao, and L. Carin, “Thyroid Cancer Malignancy PredictionFrom Whole Slide Cytopathology Images,” arXiv e-prints, Mar. 2019.

[4] D. Dov, S. Z. Kovalsky, J. Cohen, D. Range Elliotte, R. Henao, and L. Carin, “A Deep-Learning Algorithm forThyroid Malignancy Prediction From Whole Slide Cytopathology Images,” arXiv e-prints, Apr. 2019.

[5] E. S. Cibas and S. Z. Ali, “The bethesda system for reporting thyroid cytopathology,” American journal ofclinical pathology, vol. 132, no. 5, pp. 658–665, 2009.

[6] Tianjiao Liu, Shuaining Xie, Jing Yu, Lijuan Niu, and Weidong Sun, “Classification of thyroid nodules inultrasound images using deep model based transfer learning and hybrid features,” in Acoustics, Speech andSignal Processing (ICASSP), 2017 IEEE International Conference on. IEEE, 2017, pp. 919–923.

[7] Jianning Chi, Ekta Walia, Paul Babyn, Jimmy Wang, Gary Groot, and Mark Eramian, “Thyroid noduleclassification in ultrasound images by fine-tuning deep convolutional neural network,” Journal of digital imaging,vol. 30, no. 4, pp. 477–486, 2017.

[8] Jinlian Ma, Fa Wu, Qiyu Zhao, Dexing Kong, et al., “Ultrasound image-based thyroid nodule automatic segmen-tation using convolutional neural networks,” International journal of computer assisted radiology and surgery,vol. 12, no. 11, pp. 1895–1910, 2017.

[9] Jinlian Ma, Fa Wu, Jiang Zhu, Dong Xu, and Dexing Kong, “A pre-trained convolutional neural network basedmethod for thyroid nodule diagnosis,” Ultrasonics, vol. 73, pp. 221–230, 2017.

[10] Hailiang Li, Jian Weng, Yujian Shi, Wanrong Gu, Yijun Mao, Yonghua Wang, Weiwei Liu, and Jiajie Zhang,“An improved deep learning approach for detection of thyroid papillary cancer in ultrasound images,” Scientificreports, vol. 8, 2018.

[11] Wenfeng Song, Shuai Li, Ji Liu, Hong Qin, Bo Zhang, Zhang Shuyang, and Aimin Hao, “Multi-task cascadeconvolution neural networks for automatic thyroid nodule detection and recognition,” IEEE journal of biomedicaland health informatics, 2018.

[12] John A Ozolek, Akif Burak Tosun, Wei Wang, Cheng Chen, Soheil Kolouri, Saurav Basu, Hu Huang, andGustavo K Rohde, “Accurate diagnosis of thyroid follicular lesions from nuclear morphology using supervisedlearning,” Medical image analysis, vol. 18, no. 5, pp. 772–780, 2014.

[13] Antonis Daskalakis, Spiros Kostopoulos, Panagiota Spyridonos, Dimitris Glotsos, Panagiota Ravazoula, MariaKardari, Ioannis Kalatzis, Dionisis Cavouras, and George Nikiforidis, “Design of a multi-classifier system for dis-criminating benign from malignant thyroid nodules using routinely h&e-stained cytological images,” Computersin biology and medicine, vol. 38, no. 2, pp. 196–203, 2008.

[14] Alexandra Varlatzidou, Abraham Pouliakis, Magdalini Stamataki, Christos Meristoudis, Niki Margari, GeorgePeros, John G Panayiotides, and Petros Karakitsos, “Cascaded learning vector quantizer neural networks forthe discrimination of thyroid lesions,” Anal Quant Cytol Histol, vol. 33, no. 6, pp. 323–334, 2011.

Page 4: AI-Assisted Thyroid Malignancy Prediction From Whole-Slide ... · surgeries. In this work, we establish a dataset of whole-slide digital thyroid cytopathology images and propose a

The Frontier of AI-Assisted Care Scientific Symposium at Stanford (2019)

[15] Balasubramanian Gopinath and Natesan Shanthi, “Computer-aided diagnosis system for classifying benign andmalignant thyroid nodules in multi-stained fnab cytological images,” Australasian physical & engineering sciencesin medicine, vol. 36, no. 2, pp. 219–230, 2013.

[16] Parikshit Sanyal, Tanushri Mukherjee, Sanghita Barui, Avinash Das, and Prabaha Gangopadhyay, “Artificialintelligence in cytopathology: A neural network to identify papillary carcinoma on thyroid fine-needle aspirationcytology smears,” Journal of pathology informatics, vol. 9, 2018.

[17] Hayim Gilshtein, Michal Mekel, Leonid Malkin, Ofer Ben-Izhak, and Edmond Sabo, “Computerized cytometryand wavelet analysis of follicular lesions for detecting malignancy: A pilot study in thyroid cytology,” Surgery,vol. 161, no. 1, pp. 212–219, 2017.

[18] Rajiv Savala, Pranab Dey, and Nalini Gupta, “Artificial neural network model to distinguish follicular adenomafrom follicular carcinoma on fine needle aspiration of thyroid,” Diagnostic cytopathology, vol. 46, no. 3, pp.244–249, 2018.