LOCAL INTENSITY MODEL: AN OUTLIER DETECTION FRAMEWORK WITH APPLICATIONS TO WHITE MATTER HYPERINTENSITY SEGMENTATION

Parnesh Raniga a, Pierre Schmitt a,b, Pierrick Bourgeat a, Jurgen Fripp a, Victor L. Villemagne c,d, Christopher C. Rowe d, Olivier Salvado a

a CSIRO Preventative Health National Research Flagship ICTC, The Australian e-Health Research Centre-BioMedIA, Royal Brisbane and Women's Hospital, Herston, QLD, Australia.

b Ecole Nationale Supérieure de Télécommunications, Paris, France

c The Mental Health Research Institute, University of Melbourne, Parkville, VIC, Australia

d Department of Nuclear Medicine and Centre for PET, and Department of Medicine, University of Melbourne, Austin Hospital, Melbourne, VIC, Australia

ABSTRACT

Automatic segmentation of white matter hyperintensities (WMH) from T2-weighted and FLAIR MRI is a common task in the analysis of many different diseases. A method to segment WMH is proposed whereby a local intensity model (LIM) of normal tissue is generated and WMH are detected as outliers from this model. The LIM enables accurate modeling of intensity variations, thus reducing false positives. Moreover, only scans with normal tissues are required to create the model. Twelve normal scans were used to generate the LIM and validation was conducted on a set of 46 scans. Similarity indices between the proposed approach and manual segmentations were 0.59±0.15, 0.65±0.08 and 0.77±0.08 for subjects with small, moderate and large volumes of lesions respectively. The proposed approach performed better than support vector machines on the same dataset and compared favorably to approaches in the literature.

Index Terms— White matter hyperintensities, Alzheimer’s disease, local intensity model, segmentation, outlier detection

1. INTRODUCTION

Detection of deviations from the norm is one of the most important applications of medical image analysis as such deviations generally represent pathologies of interest. White matter hyperintensities (WMH) are one such class of pathologies that are present in several neurological conditions including multiple sclerosis and Alzheimer’s disease (AD). WMH can be distinguished from normal appearing white matter (WM) due to their brighter appearance on T2-weighted and fluid attenuated inversion recovery (FLAIR) MRI.

The use of machine learning [1], [2] and pattern recognition [3] methods has been proposed for the segmentation of WMH. Machine learning techniques such as support vector machines (SVM) require a large dataset of manually segmented scans to be able to distinguish WMH from normal WM. Pattern recognition methods such as fuzzy c-means (FCM) are sensitive to the relative cluster sizes [4]. As the volume of lesions varies greatly, the FCM algorithm can be run on a slice by slice basis to limit this sensitivity. This may result in inconsistencies between slices.

Furthermore, although the FLAIR sequence has been shown to be the most sensitive at detecting lesions [5], the brighter appearance of the temporal and entorhinal cortices [6] can result in a high number of false positives. To reduce the number of false positives, multiple MR sequences are utilized.

In this paper we propose the segmentation of WMH using an outlier detection framework with a local intensity model (LIM). The method builds a LIM, a voxel by voxel model of the normal appearance of FLAIR scans. A new scan that is to be segmented is compared to this model and all voxels that are deemed to be outliers are segmented as WMH. This has the advantage that only scans without pathology are required. By accounting for the brighter intensity of the temporal and entorhinal cortices, the method is able to reduce false positives.

We built a LIM using 12 cases and tested it on a dataset of 46 scans. Furthermore we compared the proposed method to segmentation using support vector machines [2].

2. MATERIALS AND METHODS

2.1. Data

MRI data from fifty-eight participants from the Australian Imaging Biomarkers and Lifestyle (AIBL) study [7] were used in this study. Participants included healthy elderly subjects (n=44) as well as subjects with mild cognitive impairment (n=10) and Alzheimer’s disease (n=4). There were 26 males (mean age: 73.1 ± 6.2) and 32 females (mean age: 75.1 ± 7.8) in the cohort.

978-1-4244-4128-0/11/$25.00 ©2011 IEEE, ISBI 2011

T1-weighted magnetization prepared rapid gradient echo (MPRAGE) MRI and fluid attenuated inversion recovery (FLAIR) scans were acquired for all the subjects. The scans were acquired on a 3 Tesla (3T) Siemens Magnetom Trio scanner (Siemens, Germany). For the T1-weighted images, the image size was 160x240x256 voxels with a voxel spacing of 1.2x1x1 mm in the sagittal, coronal and axial directions respectively (TR = 2300 ms, TE = 2.98 ms, flip angle = 9°). For the FLAIR scans, the image size was 176x240x256 voxels with a voxel spacing of 0.90x0.97x0.97 mm (TR = 6000 ms, TE = 421 ms, flip angle = 120°, TI = 2100 ms).

2.2. Method

We assume that a database of MRI scans is available with enough healthy subjects having no or very few lesions. For this study we built the LIM using scans from 12 healthy subjects with minimal WMH based on visual inspection.

The model was then used to test the algorithm on the remaining 46 subjects from our dataset. The steps of the algorithm were as follows:

1) Pre-processing: Both T1-weighted and FLAIR scans were corrected for bias field effects [8]. FLAIR scans were smoothed using anisotropic diffusion [9] and co-registered to their corresponding T1-weighted scans using a rigid transformation [10].

T1-weighted scans were segmented into gray matter (GM), white matter (WM) and cerebrospinal fluid (CSF) using an expectation maximization approach with prior probabilities [11]. Intensity normalization of the images to a randomly selected template from the training set was performed by aligning the peaks of the CSF and WM distributions in the individual images to those of the template. CSF and WM masks from the T1-weighted segmentation were propagated to the FLAIR space to generate the intensity distributions of WM and CSF.
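The peak alignment used for intensity normalization can be illustrated with a minimal sketch. It assumes a linear two-point mapping that sends the CSF and WM histogram peaks of a scan onto those of the template; the text does not state the form of the mapping, so the linear choice, the function names `histogram_peak` and `align_peaks`, and the 128-bin peak estimate are all illustrative assumptions.

```python
import numpy as np

def histogram_peak(values, bins=128):
    """Return the centre of the most populated histogram bin (the mode)."""
    counts, edges = np.histogram(values, bins=bins)
    k = int(np.argmax(counts))
    return 0.5 * (edges[k] + edges[k + 1])

def align_peaks(image, csf_mask, wm_mask, template_csf_peak, template_wm_peak):
    """Linearly rescale `image` so that its CSF and WM peaks land on the
    template peaks (assumed linear mapping, not confirmed by the paper)."""
    csf_peak = histogram_peak(image[csf_mask])
    wm_peak = histogram_peak(image[wm_mask])
    if wm_peak == csf_peak:
        raise ValueError("CSF and WM peaks coincide; cannot define a scale")
    scale = (template_wm_peak - template_csf_peak) / (wm_peak - csf_peak)
    return (image - csf_peak) * scale + template_csf_peak
```

Here `csf_mask` and `wm_mask` stand for the boolean tissue masks propagated from the T1-weighted segmentation into FLAIR space, and the two template peaks would be measured once on the template scan with `histogram_peak`.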

2) Co-registration to database-specific elderly atlas: A population-specific atlas was generated from the T1-weighted scans with an approach similar to Rohlfing et al. [12]. The atlas was generated from 100 scans from the AIBL study [7]. An intermediate average affine atlas was generated by co-registering all T1-weighted scans to a representative scan. All scans were then non-linearly registered to this intermediate atlas using a mutual information based free form deformation (FFD) algorithm [13]. An average atlas was then generated and this was repeated for five iterations. All the T1-weighted and FLAIR scans were thus non-linearly registered to the elderly atlas.

3) Local intensity model: A model of the normal distribution of tissue intensities at each voxel of the atlas was generated from our training set. This was done by generating a histogram of normal tissue intensities within a 3x3x3 neighbourhood window around each voxel of each scan. Dilated manual segmentations of WMH were used to exclude WMH and reduce partial volume effects in the generation of the LIM. Histograms were generated with 128 bins as this was found to give the best compromise between memory usage and model accuracy. Histograms were also normalized so that their bins summed to unity, i.e. a probability distribution was generated at each voxel.

4) Outlier detection: FLAIR scans were co-registered to T1-weighted scans, which in turn were non-linearly co-registered to the elderly atlas as explained above. The correspondence established by these co-registrations was used to compute the location of the relevant histograms in the LIM. Detection of voxels as outliers from the LIM was done as a two-step process.

i) WM voxels were marked as outliers if their intensity was greater than Tch percent of voxels at the particular location. Only bright voxels (brighter than the WM mean + 3 standard deviations) were considered.

ii) A degree of abnormality was computed for each voxel detected as an outlier above by computing the number of bins from the threshold Tch to the voxel intensity. To generate a binary segmentation, a threshold Td was applied to the degree of abnormality.
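Steps 3 and 4 can be sketched as follows, assuming that all scans are already intensity normalized and resampled into atlas space as 3D arrays of identical shape. The function names, the fixed intensity range `[lo, hi]` used for binning, the reading of Tch as a quantile of the per-voxel cumulative histogram, and the use of "degree of abnormality ≥ Td" for the binary decision are assumptions made for illustration; the text does not specify these details.

```python
import numpy as np
from itertools import product

N_BINS = 128

def to_bins(image, lo, hi, n_bins=N_BINS):
    """Map intensities in [lo, hi] to integer bin indices 0..n_bins-1."""
    scaled = (image - lo) / (hi - lo) * n_bins
    return np.clip(scaled.astype(int), 0, n_bins - 1)

def build_lim(scans, exclude_masks, lo, hi, n_bins=N_BINS):
    """Per-voxel normalized histograms of the intensities found in the
    3x3x3 neighbourhood of each voxel, pooled over all training scans.
    `exclude_masks` are boolean arrays (e.g. dilated WMH) to leave out."""
    shape = scans[0].shape
    hist = np.zeros(shape + (n_bins,), dtype=np.float64)
    idx = np.indices(shape)
    for scan, exclude in zip(scans, exclude_masks):
        bins = np.pad(to_bins(scan, lo, hi, n_bins), 1, mode="edge")
        # Neighbours outside the volume or inside the exclusion mask are skipped.
        valid = np.pad(~exclude, 1, mode="constant", constant_values=False)
        for dx, dy, dz in product(range(3), repeat=3):
            sl = (slice(dx, dx + shape[0]),
                  slice(dy, dy + shape[1]),
                  slice(dz, dz + shape[2]))
            b, v = bins[sl], valid[sl]
            np.add.at(hist, (idx[0][v], idx[1][v], idx[2][v], b[v]), 1.0)
    total = hist.sum(axis=-1, keepdims=True)
    # Normalize so that each voxel's bins sum to one (zero where no data).
    return np.divide(hist, total, out=np.zeros_like(hist), where=total > 0)

def detect_outliers(scan, lim, wm_mask, wm_mean, wm_std, lo, hi,
                    t_ch=0.88, t_d=6):
    """Two-step detection: (i) flag bright WM voxels above the Tch quantile
    of their local histogram, (ii) threshold the bin distance at Td."""
    bins = to_bins(scan, lo, hi, lim.shape[-1])
    cdf = np.cumsum(lim, axis=-1)
    # First bin at which the cumulative probability reaches Tch
    # (bin 0 for voxels whose histogram is empty).
    thresh_bin = np.argmax(cdf >= t_ch - 1e-12, axis=-1)
    candidates = wm_mask & (scan > wm_mean + 3.0 * wm_std)
    outlier = candidates & (bins > thresh_bin)
    degree = np.where(outlier, bins - thresh_bin, 0)
    return degree, degree >= t_d
```

A dense array of 128 bins per voxel is large at full image resolution, which is consistent with the memory trade-off mentioned for the choice of bin count; a practical implementation would likely use a compact data type or restrict the model to a brain mask.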

WM voxels were identified using an average WM segmentation in atlas space that excluded the brain stem and cerebellar white matter. As post-processing, clusters of 10 voxels or fewer (computed using connected component analysis) were removed.

2.3. Validation

The proposed method was validated against manual segmentations. Manual segmentations of all the scans were performed by P.R. using MRIcro software. The test group was split according to WMH volume (WMHV) as computed from manual segmentations.

i) Large lesion volumes (LLV, n=18) (WMHV > 10 ml).
ii) Moderate lesion volumes (MLV, n=18) (3 ml < WMHV ≤ 10 ml).
iii) Small lesion volumes (SLV, n=10) (WMHV ≤ 3 ml).
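As a small worked example of this grouping, the sketch below converts a binary manual segmentation into a volume in millilitres and assigns the group. It assumes the volume is measured on the native FLAIR grid with the voxel spacing given in Section 2.1; the space in which WMHV was actually computed is not stated, and the function names are illustrative.

```python
import numpy as np

def wmh_volume_ml(mask, spacing_mm=(0.90, 0.97, 0.97)):
    """Volume of a binary mask in ml (1 ml = 1000 mm^3)."""
    voxel_mm3 = float(np.prod(spacing_mm))
    return float(np.count_nonzero(mask)) * voxel_mm3 / 1000.0

def lesion_group(volume_ml):
    """Assign the lesion-load group using the 3 ml and 10 ml cut-offs."""
    if volume_ml > 10.0:
        return "LLV"
    if volume_ml > 3.0:
        return "MLV"
    return "SLV"
```

With this spacing one voxel is about 0.85 mm^3, so the 3 ml cut-off corresponds to roughly 3,500 voxels.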

Comparisons were conducted by computing the similarity index (SI) as well as the overlap fraction (OF), extra fraction (EF) and missed fraction (MF) [14]. The SI is equivalent to the Dice coefficient [15]. These indices were computed using:

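\[
\mathrm{SI} = \frac{2\,\mathrm{TP}}{2\,\mathrm{TP} + \mathrm{FP} + \mathrm{FN}}, \qquad
\mathrm{OF} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \qquad
\mathrm{EF} = \frac{\mathrm{FP}}{\mathrm{TP} + \mathrm{FN}}, \qquad
\mathrm{MF} = \frac{\mathrm{FN}}{\mathrm{TP} + \mathrm{FN}}
\qquad (1)
\]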

where TP is the number of true positive voxels, TN is the number of true negative voxels, FP is the number of false positive voxels and FN is the number of false negative voxels.
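A minimal sketch of how these indices can be computed from two binary masks is given below. It assumes the definitions commonly used with [14], in which OF, EF and MF are all normalized by the size of the manual segmentation (TP + FN), so that OF + MF = 1 and EF can exceed 1; the function name is illustrative.

```python
import numpy as np

def overlap_indices(auto, manual):
    """Return SI, OF, EF and MF for an automatic vs. a manual binary mask."""
    auto = np.asarray(auto, dtype=bool)
    manual = np.asarray(manual, dtype=bool)
    tp = np.count_nonzero(auto & manual)    # lesion voxels found
    fp = np.count_nonzero(auto & ~manual)   # extra voxels
    fn = np.count_nonzero(~auto & manual)   # missed voxels
    ref = tp + fn                           # size of the manual segmentation
    if ref == 0:
        raise ValueError("manual segmentation is empty")
    return {
        "SI": 2.0 * tp / (2.0 * tp + fp + fn),
        "OF": tp / ref,
        "EF": fp / ref,
        "MF": fn / ref,
    }
```

None of these four indices depends on TN, which makes them insensitive to the large number of background voxels in a brain volume.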

The threshold parameters Tch and Td were estimated as those that gave the best overall performance in terms of SI. This was done using a grid search with a search range of 80 to 98% in steps of 2% for Tch and 1 to 10 bins for Td. Furthermore, the correlation between lesion volumes computed from the manual segmentations and those computed from the automatic segmentations (using the selected values for Tch and Td) was assessed.

2.4. Comparison to SVM classifier

The proposed method was compared to the SVM method [2], which was implemented using LIBSVM software [16]. The SVM was trained using feature sets generated using the manual segmentations. The feature set consisted of all voxel intensities in a 3x3x3 neighbourhood as well as the x, y, z coordinates in atlas space. The pre-processing steps used in the proposed algorithm were also applied to the SVM method. Furthermore, only voxels brighter than the WM mean plus 3 standard deviations were considered.
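The feature set described above gives 27 neighbourhood intensities plus 3 coordinates, i.e. 30 values per voxel. A minimal sketch of assembling such vectors is shown below; the function name, the edge padding at the image border and the ordering of the features are assumptions, and the classifier itself is not shown.

```python
import numpy as np
from itertools import product

def voxel_features(image, coords):
    """Build one 30-dimensional feature vector per voxel: the 27 intensities
    of its 3x3x3 neighbourhood followed by its x, y, z atlas coordinates.
    `coords` is an (n, 3) array of integer voxel indices."""
    coords = np.asarray(coords)
    # Pad by one voxel so neighbourhoods at the border stay in bounds.
    padded = np.pad(image, 1, mode="edge")
    feats = [
        padded[coords[:, 0] + dx, coords[:, 1] + dy, coords[:, 2] + dz]
        for dx, dy, dz in product(range(3), repeat=3)
    ]
    feats += [coords[:, 0], coords[:, 1], coords[:, 2]]
    return np.column_stack(feats).astype(float)
```

In practice `coords` would be restricted to the candidate voxels, i.e. those brighter than the WM mean plus 3 standard deviations.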

A 5-fold cross-validation was performed on the dataset. For each fold, 40,000 samples, consisting of equal numbers of positive and negative samples, were used for training. The five SI values for each case were averaged.
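Drawing a balanced training set of this kind can be sketched as below. Sampling without replacement, the fixed random seed and the fallback to a smaller sample when one class has too few voxels are assumptions; the text only states the total of 40,000 samples and the equal class split.

```python
import numpy as np

def balanced_sample(labels, n_total=40000, seed=0):
    """Return shuffled indices of a sample with equal numbers of positive
    (WMH) and negative (non-WMH) voxels, at most n_total in total."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels, dtype=bool)
    pos = np.flatnonzero(labels)
    neg = np.flatnonzero(~labels)
    n_each = min(n_total // 2, pos.size, neg.size)
    chosen = np.concatenate([
        rng.choice(pos, n_each, replace=False),
        rng.choice(neg, n_each, replace=False),
    ])
    rng.shuffle(chosen)
    return chosen
```

Balancing the classes in this way keeps the much more numerous normal voxels from dominating the training set.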

3. RESULTS

The grid search revealed that the best overall performance of the classifier was achieved at a Tch of 88% and a Td of 6. The results of the method are presented in Table 1 below.

Table 1. Results of the proposed method with Tch = 88% and Td = 6.

Lesion Load (mL)   OF (μ±σ)      EF (μ±σ)      MF (μ±σ)      SI (μ±σ)
SLV (0-3)          0.58 ± 0.16   0.60 ± 1.23   0.42 ± 0.16   0.59 ± 0.15
MLV (3-10)         0.68 ± 0.13   0.44 ± 0.38   0.32 ± 0.13   0.65 ± 0.08
LLV (>10)          0.69 ± 0.11   0.09 ± 0.11   0.31 ± 0.11   0.77 ± 0.08

An example of the segmentation achieved with the proposed approach is presented in Fig. 1 below. As can be seen the segmentations of the proposed approach are very similar to manual segmentations.

The results of the SVM classifier are presented in Table 2 below. The similarity indices of the SVM approach were lower than those of the LIM approach, and their standard deviations were higher. The SVM approach had a larger OF and smaller MF for the SLV and MLV groups (and equal values for the LLV group), but its much larger EF resulted in a worse SI.

Fig. 1. Example of lesion detection with the described method on a subject with large volume of lesions (top row) and small volume of lesions (bottom row). An axial slice of the FLAIR scan is presented in (A), the corresponding manual segmentation in (B), the automated binary segmentation with a Tch of 88% and a Td of 6 (C) and corresponding distance map (D).

Table 2. Results of the SVM classifier.

Lesion Load (mL)   OF (μ±σ)      EF (μ±σ)      MF (μ±σ)      SI (μ±σ)
SLV (0-3)          0.67 ± 0.14   2.45 ± 4.89   0.33 ± 0.14   0.49 ± 0.20
MLV (3-10)         0.71 ± 0.12   1.14 ± 1.00   0.29 ± 0.12   0.54 ± 0.14
LLV (>10)          0.69 ± 0.13   0.17 ± 0.28   0.31 ± 0.13   0.75 ± 0.11

4. DISCUSSION

An automatic approach to segment WMH using just FLAIR images was presented. Building a model of normal FLAIR intensities allows for the accurate segmentation of WMH while reducing false positives. The results of the proposed method were similar to those in the literature. Anbeek et al. [1] reported SI of 0.50 for small lesions (largest lesion < 3 mm in diameter), 0.75 for moderate lesions (largest lesion between 3 and 10 mm in diameter) and 0.85 for large lesions (largest lesion > 10 mm in diameter) using KNN classification with five different MR sequences. Dyrby et al. [17] reported SI of 0.45 ± 0.15 for WMHV < 10 ml, 0.62 ± 0.11 for WMHV 10-30 ml and 0.65 ± 0.15 for WMHV > 30 ml using a neural network with T1W, T2W and FLAIR sequences. However, the performance of Dyrby’s classifier deteriorated when only the FLAIR sequence was used (0.21 ± 0.13, 0.47 ± 0.11 and 0.57 ± 0.14 respectively).

As also reported in other studies [3], the proposed approach tended to under-segment lesions compared to manual segmentations. However, the approach produced fewer false positives (lower mean EF) than the SVM approach. This was achieved by modelling at a local level and thus capturing more local differences, as opposed to the global modelling approach of the SVM. Therefore the proposed algorithm is more specific but not as sensitive. To allow more flexibility with the sensitivity and specificity, the proposed method can be used in a semi-supervised manner. This is the motivation behind having two thresholds, namely Tch and Td. The Td threshold can be chosen by an observer to best segment a particular case. In this manner, by picking the Td that gave the best SI for each case, the mean SIs for the three groups were increased to 0.80 ± 0.06, 0.69 ± 0.06 and 0.64 ± 0.10 for subjects with LLV, MLV and SLV respectively.

The proposed approach requires only normal anatomy for training. This is an advantage over traditional training based classifiers which require manual segmentation of pathology. Although we currently use manual segmentations to exclude WMH voxels it would be possible to generate the LIM without the need for manual segmentations if enough normal scans are available. One approach could be to use boot-strapping. By applying the outlier detection on the training set, it may be possible to remove WMH until only normal voxels were left. Moreover, the proposed approach is general enough to be applied to the segmentation of other pathologies.

5. CONCLUSION

Although the FLAIR sequence has been shown to be the most sensitive at detecting WMH, overlap in intensities between GM regions and WMH results in false positives when classifying WMH. The proposed method is able to reduce these. Moreover, since an outlier based approach is used, the LIM is built from only normal cases, thus requiring minimal or no manual segmentations, unlike machine learning approaches. The approach performed better than SVM on the same dataset and comparably to methods published in the literature which used multiple modalities. The proposed approach is promising not only for WMH detection but also for other medical image segmentation applications.

6. ACKNOWLEDGEMENTS

Data used in this article was obtained from the AIBL study funded by the CSIRO (www.aibl.csiro.au).

7. REFERENCES

[1] P. Anbeek et al., “Probabilistic segmentation of white matter lesions in MR imaging,” NeuroImage, vol. 21, no. 3, pp. 1037-1044, Mar. 2004.
[2] Z. Lao et al., “Computer-Assisted Segmentation of White Matter Lesions in 3D MR Images Using Support Vector Machine,” Academic Radiology, vol. 15, no. 3, pp. 300-313, Mar. 2008.
[3] F. Admiraal-Behloul et al., “Fully automatic segmentation of white matter hyperintensities in MR images of the elderly,” NeuroImage, vol. 28, no. 3, pp. 607-617, Nov. 2005.
[4] J. C. Noordam et al., “Multivariate image segmentation with cluster size insensitive Fuzzy C-means,” Chemometrics and Intelligent Laboratory Systems, vol. 64, no. 1, pp. 65-78, Oct. 2002.
[5] P. Anbeek et al., “Probabilistic segmentation of brain tissue in MR imaging,” NeuroImage, vol. 27, no. 4, pp. 795-804, Oct. 2005.
[6] T. Hirai et al., “Limbic Lobe of the Human Brain: Evaluation with Turbo Fluid-attenuated Inversion-Recovery MR Imaging,” Radiology, vol. 215, no. 2, pp. 470-475, May 2000.
[7] K. A. Ellis et al., “The Australian Imaging, Biomarkers and Lifestyle (AIBL) study of aging: methodology and baseline characteristics of 1112 individuals recruited for a longitudinal study of Alzheimer's disease,” International Psychogeriatrics / IPA, pp. 1-16, May 2009.
[8] O. Salvado et al., “Method to correct intensity inhomogeneity in MR images for atherosclerosis characterization,” IEEE Transactions on Medical Imaging, vol. 25, no. 5, pp. 539-552, May 2006.
[9] P. Perona and J. Malik, “Scale-space and edge detection using anisotropic diffusion,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 7, pp. 629-639, 1990.
[10] S. Ourselin et al., “Reconstructing a 3D structure from serial histological sections,” Image and Vision Computing, vol. 19, no. 1, pp. 25-31, Jan. 2001.
[11] O. Acosta et al., “Automated voxel-based 3D cortical thickness measurement in a combined Lagrangian-Eulerian PDE approach using partial volume maps,” Medical Image Analysis, vol. 13, no. 5, pp. 730-743, Oct. 2009.
[12] T. Rohlfing et al., “Evaluation of atlas selection strategies for atlas-based image segmentation with application to confocal microscopy images of bee brains,” NeuroImage, vol. 21, no. 4, pp. 1428-1442, Apr. 2004.
[13] D. Rueckert et al., “Nonrigid registration using free-form deformations: application to breast MR images,” IEEE Transactions on Medical Imaging, vol. 18, no. 8, pp. 712-721, Aug. 1999.
[14] R. Stokking et al., “Automatic Morphology-Based Brain Segmentation (MBRASE) from MRI-T1 Data,” NeuroImage, vol. 12, no. 6, pp. 726-738, Dec. 2000.
[15] L. R. Dice, “Measures of the Amount of Ecologic Association between Species,” Ecology, vol. 26, no. 3, pp. 297-302, 1945.
[16] C. Chang and C. Lin, LIBSVM: a library for support vector machines. 2001.
[17] T. B. Dyrby et al., “Segmentation of age-related white matter changes in a clinical multi-center study,” NeuroImage, vol. 41, no. 2, pp. 335-345, 2008.
