
Automated classification of bone marrow cells in microscopic images for diagnosis of leukemia: A comparison of two classification schemes with respect to the segmentation quality

Sebastian Krappe (1), Michaela Benz (1), Thomas Wittenberg (1), Torsten Haferlach (2), Christian Münzenmayer (1)

(1) Image Processing and Medical Engineering Department, Fraunhofer Institute for Integrated Circuits IIS, Erlangen, Germany

(2) MLL Munich Leukemia Laboratory, Munich, Germany

ABSTRACT

The morphological analysis of bone marrow smears is fundamental for the diagnosis of leukemia. Currently, the counting and classification of the different types of bone marrow cells is done manually using a bright-field microscope. This is a time-consuming, partly subjective and tedious process. Furthermore, repeated examinations of a slide yield intra- and inter-observer variances. For this reason, an automation of morphological bone marrow analysis is pursued. This analysis comprises several steps: image acquisition and smear detection, cell localization and segmentation, feature extraction and cell classification. The automated classification of bone marrow cells depends on the automated cell segmentation and the choice of adequate features extracted from different parts of the cell. In this work we focus on the evaluation of support vector machines (SVMs) and random forests (RFs) for the differentiation of bone marrow cells into 16 different classes, including immature and abnormal cell classes. Data sets of different segmentation quality are used to test the two approaches. Automated solutions for the morphological analysis of bone marrow smears could use such a classifier to pre-classify bone marrow cells and thereby shorten the examination duration.

Keywords: automated classification, bone marrow cells, diagnosis of leukemia, support vector machines, random forests, segmentation quality

1. INTRODUCTION

The morphological analysis of bone marrow slides is fundamental for the diagnosis of leukemia. This cytological examination serves to clarify variations in a blood smear differential. It is also used for the clarification of anemia, as a means to rule out involvement of the bone marrow by a lymphoma, and when leukemia is suspected. The morphological evaluation of bone marrow cells is the basis for a patient’s diagnosis and for decision support for the subsequent treatment. For the conventional cytological analysis the bone marrow aspirate smear is stained and examined by means of a light microscope. At first the cell density, the bone marrow fat content and qualitative changes of the cells are observed at a mid-level (e.g. 5-fold) magnification. Afterwards, cells of different types are identified and counted. This step is time-consuming, partly subjective, error-prone and tedious. Furthermore, repeated examinations of a slide may yield intra- and inter-observer variances. For that reason an automation of the bone marrow classification is pursued.

Difficulties and challenges of automated image-based analysis of bone marrow samples are the high staining variability, the diversity of the smear quality of the samples, and especially the segmentation of cells in clusters and the challenging differentiation of immature cells. The analysis pipeline comprises several steps: image acquisition and smear detection, cell localization and segmentation, feature extraction and cell classification. The automated classification of bone marrow cells depends strongly on the automated segmentation and the choice of adequate features, which are extracted from different parts of the cell, such as the whole cell, the cell plasma and the cell nucleus.

There exist only a few publications on the topic of automated bone marrow analysis. Wu et al. propose a multispectral imaging approach for the analysis of blood and bone marrow images [1]. Osowski et al. present the application of a genetic algorithm and a support vector machine for the recognition of bone marrow cells [2]. Theera-Umpon et al. propose morphological granulometric features to characterize nuclei for bone marrow cell classification [3]. Up to now neither a prototype nor a commercial product on the market is capable of a fully automated morphological bone marrow analysis.

Medical Imaging 2015: Computer-Aided Diagnosis, edited by Lubomir M. Hadjiiski, Georgia D. Tourassi, Proc. of SPIE Vol. 9414, 94143I · © 2015 SPIE · CCC code: 1605-7422/15/$18 · doi: 10.1117/12.2081946

Proc. of SPIE Vol. 9414 94143I-1

Downloaded From: http://proceedings.spiedigitallibrary.org/ on 08/08/2015 Terms of Use: http://spiedigitallibrary.org/ss/TermsOfUse.aspx

[Fig. 1 workflow stages, in order: Image Acquisition and Smear Detection; Cell Localization and Segmentation; Feature Extraction; Classification]

In this work we focus on the evaluation of two classification schemes (support vector machines and random forests) for the differentiation of bone marrow cells into 16 different classes, including immature and abnormal cell classes. Data sets of different segmentation quality are used to test these two approaches. Automated solutions for the morphological analysis of bone marrow smears could potentially apply such a classifier to pre-classify bone marrow cells and thereby shorten the examination time.

2. MATERIALS AND METHODS

Fig. 1: Bone marrow cell recognition workflow

An overview of the workflow from the bone marrow smear on a microscopic slide to an automatic determination of the cell class distribution is depicted in Fig. 1. High-resolution microscopic bone marrow images are acquired in relevant regions of the slide (Section 2.1). For a captured image the cell centers are determined and used as seed information for the segmentation of the nucleus and plasma parts of each cell (Section 2.2). After the segmentation step, each cell is characterized by a variety of features (Section 2.3), which are used to solve the 16-class bone marrow classification problem (Section 2.4). In the following sections the individual steps of the image processing pipeline are explained in detail.

2.1 Image Acquisition and Smear Detection

Bone marrow smears are digitized with an automated microscope in several steps. At first the complete slide is captured at low magnification (1-fold) to obtain an overview image. For an automatic system, and to minimize the scanning duration, detection of the bone marrow smear on the microscopic slide is necessary. The contour of the smear is identified by a combination of thresholding and k-means clustering methods. In order to include regions at the boundary of the smear, the convex hull is used. Then the bone marrow smear region is determined and digitized at a mid-level (5-fold) magnification. Relevant regions are selected and scanned at high magnification (40-fold with oil immersion) for the morphological cell analysis. All images used for the evaluation were captured with a CCD camera mounted on a bright-field microscope (Zeiss Axio Imager Z2). The dimensions of the original images are 2452 × 2056 pixels and the pixel size of the camera is 3.45 × 3.45 µm.

2.2 Cell Localization and Segmentation

For the segmentation of single cells, a Fast-Marching approach [4] has been extended by a different determination of potential cell centers [5]. These seed points are then used for the subsequent cell separation. Each segmented cell is afterwards divided into nucleus and plasma parts by applying a threshold to the color-transformed image of the whole cell.
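The division of a segmented cell into nucleus and plasma parts can be illustrated with a minimal sketch. The color transform and the way the threshold is chosen are not specified in this paper, so the sketch makes two loudly labeled assumptions: the HSV saturation channel stands in for the color transform, and Otsu's method stands in for the threshold selection. The function names `otsu_threshold` and `split_nucleus_plasma` are hypothetical, and the rule that the nucleus is the more saturated part is likewise an assumption.

```python
import colorsys


def otsu_threshold(values, bins=256):
    """Otsu threshold for values in [0, 1] (assumed threshold selection)."""
    hist = [0] * bins
    for v in values:
        hist[min(int(v * bins), bins - 1)] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_bin, best_var = 0, -1.0
    w0 = 0    # number of values in bins 0..t
    sum0 = 0  # sum of bin indices for those values
    for t in range(bins):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue
        w1 = total - w0
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2  # between-class variance (unnormalized)
        if between > best_var:
            best_var, best_bin = between, t
    return (best_bin + 1) / bins


def split_nucleus_plasma(rgb_image, cell_mask):
    """Split the masked cell pixels into nucleus and plasma pixel sets.

    rgb_image: rows of (R, G, B) tuples with values 0..255.
    cell_mask: rows of booleans, True for pixels of the segmented cell.
    Assumption: the nucleus is the more saturated part of the cell.
    """
    saturation = {}
    for r, row in enumerate(rgb_image):
        for c, (red, green, blue) in enumerate(row):
            if cell_mask[r][c]:
                _, s, _ = colorsys.rgb_to_hsv(red / 255, green / 255, blue / 255)
                saturation[(r, c)] = s
    threshold = otsu_threshold(list(saturation.values()))
    nucleus = {p for p, s in saturation.items() if s >= threshold}
    plasma = set(saturation) - nucleus
    return nucleus, plasma
```

Working on pixel sets rather than arrays keeps the sketch free of external dependencies; a production pipeline would more likely operate on image arrays.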

Cell Class               Percentage
Band Neutrophils         6.8
Segmented Neutrophils    21.2
Lymphocytes              17.1
Monocytes                2.5
Eosinophils              3.8
Basophils                0.2
Metamyelocytes           1.8
Myelocytes               3.9
Promyelocytes            7.5
Blasts                   8.6
Plasma Cells             7.1
Proerythroblasts         1.7
Erythroblasts            1.7
Normoblasts              15.9
Hairy cells              0.3
immature Lymphocytes     0.2


2.3 Feature Extraction

The extracted regions of interest (nucleus and whole cell) of different cell types are characterized by shape, texture and color features. For the characterization of the 16 considered bone marrow cell types various shape features are used: area, Zernike moments, normalized central moments, and Hu’s seven invariant moments. The texture of a relevant cell region is described by numerous texture features: first and second order statistical features, color-enhanced second order statistical features, features for the characterization of heterogeneity and granularity, statistical geometric features, gray level run length based features, granulometric features, textural features corresponding to visual properties of texture, and the fractal dimension. The color components of the cells are described by RGB histogram statistic features, central moments in the RGB and HSV color spaces and different moments in the HSV color space.

2.4 Classification

The feature selection and classification task is performed by two different classification schemes, which are evaluated and compared, namely support vector machines (SVMs) and random forests (RFs). The first scheme (SVMs) uses 16 two-class classifiers (each one-vs.-rest) to determine the class for unseen data. For each two-class classifier, class-specific features are selected by a forward selection procedure. In the next step, these features are used for the training of the individual support vector machines. A feature vector is assigned by means of the 16 classifiers to the class with the highest class probability. The second approach is to apply decision trees for the classification of bone marrow cells. For this evaluation random forests have been employed for the differentiation of all classes at once. With random forests the feature selection is done in the process of building the classifier. The minimum sample count required at a leaf node for it to be split is set to 10 and the maximum number of trees in the forest is set to 200.
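The two building blocks of the SVM scheme, forward feature selection and assignment to the class with the highest probability, can be sketched as follows. This is an illustration under assumptions, not the implementation used in this work: the selection criterion and the stopping rule are not stated in the text, so the sketch takes a caller-supplied `score` function (for example a validation accuracy) and stops when no candidate improves it. The names `forward_select`, `assign_class` and `score` are hypothetical.

```python
def forward_select(n_features, score, max_features):
    """Greedy forward selection over feature indices 0..n_features-1.

    score(subset) returns a quality value for a list of feature indices;
    higher is better. The criterion and stop rule are assumptions.
    """
    selected = []
    best_score = float("-inf")
    while len(selected) < max_features:
        candidates = [f for f in range(n_features) if f not in selected]
        if not candidates:
            break
        # Evaluate every subset that extends the current one by one feature.
        scored = [(score(selected + [f]), f) for f in candidates]
        top_score, top_feature = max(scored, key=lambda sf: sf[0])
        if top_score <= best_score:
            break  # no candidate improves the score any further
        selected.append(top_feature)
        best_score = top_score
    return selected


def assign_class(class_probabilities):
    """Pick the class whose one-vs.-rest classifier reports the highest probability.

    class_probabilities maps a class name to the probability output of
    that class's two-class classifier for one feature vector.
    """
    return max(class_probabilities, key=class_probabilities.get)
```

In the scheme described above, `forward_select` would be run once per class, so that each of the 16 two-class classifiers is trained on its own feature subset before `assign_class` combines their outputs.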

Fig. 2: Representative images for different segmentation qualities: (a) and (b) represent “good” segmentations (the contour of the whole cell is largely correct, the nucleus segmentation is partly a bit too small); (c) and (d) stand for “acceptable” segmentations (segmentation of the nucleus is identical with the segmentation of the whole cell (c), or segmentation of the nucleus is correct and segmentation of the whole cell leaks a bit (d)); (e) and (f) represent “deformed” segmentations (only the cell nucleus is detected (e); segmentation of the whole cell leaks (f)); (g) and (h) stand for “erroneous” segmentations (segmentation of the whole cell touches the image border or includes neighboring cells).


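[Fig. 3, right panel: bar chart of cell counts; bar labels as extracted, in their original order: 118, 266, 2117, 1686, 3443, 30261, 355, 29708, 3065, 5040, 2270, 8394, 14428, 12551, 7732, 32640; horizontal axis from 0 to 35,000 in steps of 5,000]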

3. RESULTS

3.1 Cell Segmentation

In order to obtain training and test sets for the classification task, the quality of the automatic cell segmentation procedure was evaluated visually by a human expert for more than 150,000 manually classified cells. Each segmented cell was manually assigned to one of four quality levels: “good”, “acceptable”, “deformed” or “erroneous” segmentation (see Fig. 2). The percentage segmentation quality distribution per cell class and the cell count distribution of the 16 analyzed classes are depicted in Fig. 3. The ratio of well-segmented cells differs among the cell classes. The best segmentation results were achieved for basophils and immature lymphocytes, cf. Fig. 2 top row and Fig. 3 bottom left. The classes with the smallest ratio of well-segmented cells are the segmented and band neutrophils, cf. Fig. 2 bottom row and Fig. 3 top left.

Fig. 3: Left: percentage segmentation quality distribution per cell class. Classes are sorted according to ascending good segmentation percentage. Right: cell count distribution for the 16 bone marrow cell classes.
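As a quick consistency check, the sixteen bar labels that survive from the right panel of Fig. 3 can be summed and compared with the statement in Section 3.1 that more than 150,000 cells were evaluated. The sketch assumes that these sixteen numbers are the per-class cell counts; which value belongs to which class is not asserted here.

```python
# Bar labels extracted from the right panel of Fig. 3, in their extracted order.
# Assumption: each value is the cell count of one of the 16 classes.
fig3_counts = [118, 266, 2117, 1686, 3443, 30261, 355, 29708,
               3065, 5040, 2270, 8394, 14428, 12551, 7732, 32640]

total_cells = sum(fig3_counts)
largest_share = max(fig3_counts) / total_cells

print(total_cells)             # 154074
print(f"{largest_share:.1%}")  # 21.2%
```

Under this assumption the total of 154,074 cells agrees with the figure of more than 150,000 cells given in the text.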

3.2 Classification

For the training step of the two classifiers a set of 10,269 automatically segmented cells of “good” segmentation quality was used. These cells were collected from 479 different bone marrow samples acquired from routine examinations in the Munich Leukemia Laboratory (MLL). For each cell a set of 1,330 features of the types mentioned in Section 2.3 was extracted from different cell parts (nucleus and whole cell) and used to build the classifier. The remaining 140,000 cells were used for the testing step and were grouped into the four quality classes. The SVM classification scheme and the random forests were applied to evaluate data sets of different segmentation quality. For 52,991 cells of “good” segmentation quality from the test data set an average classification rate of 64% was achieved for the 16-class classification problem with SVMs and 69% with random forests. These test cells were collected from 850 different slides, also obtained from the MLL. The average classification rate for the 25,044 cells of “acceptable” segmentation quality extracted from 839 slides is 44% with the SVM framework and 52% with random forests. For the 29,652 “deformed” cell segmentations, which were collected from 825 slides, the average classification rate is 27% with SVMs and 40% with random forests. The average classification rate over the cells of all three segmentation qualities combined was also evaluated with both classifiers for each class. With the random forests the average classification rate for such cells is 57%; with the SVM framework a true positive rate of 49% is achieved. The classification rates for the 16 different classes are visualized in Fig. 4.

[Fig. 3, left panel: horizontal axis from 0% to 100%; class labels as extracted, top to bottom: immature …, Basophils, Erythroblasts, Proerythroblasts, Eosinophils, Lymphocytes, Hairy cells, Normoblasts, Monocytes, Myelocytes, Metamyelocytes, Plasma Cells, Blasts, Promyelocytes, Band Neutrophils, Segmented …; legend: good segmentation, acceptable segmentation, deformed segmentation, erroneous segmentation]

Fig. 4: Average classification rates for the 16 different bone marrow cell classes in %

[Fig. 4 bar chart: horizontal axis from 0% to 100%; classes: Band Neutrophils, Segmented Neutrophils, Lymphocytes, Monocytes, Eosinophils, Basophils, Metamyelocytes, Myelocytes, Promyelocytes, Blasts, Plasma Cells, Proerythroblasts, Erythroblasts, Normoblasts, Hairy cells, immature Lymphocytes; series: good segmentations, acceptable segmentations, deformed segmentations, and good, acceptable and deformed segmentations combined, each with SVM and with Random Forest]
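The combined rates reported in Section 3.2 can be cross-checked against the per-quality rates. The sketch below assumes that the combined "average classification rate" is pooled over cells, i.e. that it equals the mean of the three per-quality rates weighted by the number of test cells in each quality group. The input rates are the rounded percentages from the text, so the check can only be approximate; the function name `pooled_rate` is hypothetical.

```python
# Test cell counts per segmentation quality, taken from Section 3.2.
cells = {"good": 52991, "acceptable": 25044, "deformed": 29652}

# Reported (rounded) average classification rates per quality level.
rates = {
    "SVM": {"good": 0.64, "acceptable": 0.44, "deformed": 0.27},
    "Random Forest": {"good": 0.69, "acceptable": 0.52, "deformed": 0.40},
}


def pooled_rate(rate_by_quality, count_by_quality):
    """Cell-count weighted mean of the per-quality rates (assumed pooling)."""
    total = sum(count_by_quality.values())
    weighted = sum(rate_by_quality[q] * n for q, n in count_by_quality.items())
    return weighted / total


for name, rate_by_quality in rates.items():
    print(f"{name}: {pooled_rate(rate_by_quality, cells):.1%}")
# SVM: 49.2%
# Random Forest: 57.1%
```

Both values round to the reported 49% and 57%, which is consistent with a pooled rate over the 107,687 test cells of the three quality levels.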


4. DISCUSSION

In this paper we have focused on the automatic classification of bone marrow cells in microscopic images for leukemia diagnosis. Two classification schemes were evaluated on data sets of different segmentation quality. The results show that a better cell segmentation quality leads to better classification rates in the majority of cases. Random forests can be applied successfully to the bone marrow classification task. For the tested data set the classification rates of random forests are higher than the rates of the SVM classifier framework for each segmentation quality level.

Next research activities will include the evaluation of classification trees with the incorporation of more expert knowledge, the quality improvement of the overall automatic segmentation and the evaluation of different bone marrow specific features.

ACKNOWLEDGMENTS

This work was funded through the “AutoMorLeu” project from the German Federal Ministry of Education and Research and through the MAVO-project “MultiNaBel” from the Fraunhofer-Gesellschaft.

REFERENCES

[1] Wu, Q., Zeng, L., Ke, H., Xie, W., Zheng, H., Zhang, Y., "Analysis of blood and bone marrow smears using multispectral imaging analysis techniques," Proc. SPIE 5747, 1872-1882 (2005)

[2] Osowski, S., Siroic, R., Markiewicz, T., Siwek, K., "Application of Support Vector Machine and Genetic Algorithm for Improved Blood Cell Recognition," IEEE Transactions on Instrumentation and Measurement 58(7), 2159-2168 (2009)

[3] Theera-Umpon, N., Dhompongsa, S., "Morphological granulometric features of nucleus in automatic bone marrow white blood cell classification," IEEE Trans Inf Technol Biomed 11(3), 353-359 (2007)

[4] Zerfass, T., Haßlmeyer, E., Schlarb, T., Elter, M., "Segmentation of leukocyte cells in bone marrow smears," Computer-Based Medical Systems (CBMS), 267-272 (2010)

[5] Krappe, S., Macijewski, K., Eismann, E., Ziegler, T., Wittenberg, T., Haferlach, T., Münzenmayer, C., "Lokalisierung von Knochenmarkzellen für die automatisierte, morphologische Analyse von Knochenmarkpräparaten," Bildverarbeitung für die Medizin 2014, 403-408 (2014)
