MASTER RESEARCH ARTICLE OF ANNE KASPERS, BIOMEDICAL IMAGE SCIENCES, UNIVERSITY MEDICAL CENTRE UTRECHT
Abstract—Accurate and precise brain segmentations of Magnetic Resonance (MR) brain images from patients after aneurysmal subarachnoid hemorrhage (aSAH) are hard to acquire with an automated routine because of various cerebral abnormalities, such as enlarged ventricles. Available routines did not deal with these abnormalities, were not suited for MR images acquired at high magnetic field strength, or used techniques with limited accuracy and precision. To perform accurate and precise brain volume measurements on 3 T aSAH MR images, we created a new routine designed to deal with these cerebral abnormalities. Measurements of intracranial volume, total brain, lateral ventricles and peripheral cerebrospinal fluid were performed on T1- and T2-weighted MR images of 39 patients and 25 control participants using k-Nearest Neighbor (kNN) classification. Evaluation showed a fractional Similarity Index (fSI) of 0.98, 0.93 and 0.92 for intracranial volume, total brain and lateral ventricles, respectively, which is comparable to the inter-observer results.
Index Terms—Aneurysmal Subarachnoid Hemorrhage; k-
Nearest Neighbor classification; Magnetic Resonance imaging;
Segmentation
I. INTRODUCTION
ANEURYSMAL SUBARACHNOID HEMORRHAGE (aSAH) is a type of stroke caused by a ruptured intracranial aneurysm [1]. The annual incidence of non-traumatic aSAH is 6 to 8 cases per 100,000 person-years [2]. Almost half of the patients die within thirty days [3], and almost half of the survivors suffer from significant neurological or cognitive deficits after a year [4]. It is assumed that the severity of the neuropsychological deficits commonly detected after treatment of ruptured intracranial aneurysms is associated with the loss of cerebral volume [5].
A study by Bendel showed enlargement of cerebrospinal fluid (CSF) and ventricular volume in patients after aSAH, using voxel-based morphometry (VBM) [6]. However, the accuracy and precision of VBM are limited, since its measurements are based on an average brain, which is not specific for aSAH patients [7]. Existing routines based on training data of Magnetic Resonance (MR) brain images were not suited to measure significant volume differences in scans of patients after aSAH. This is partly because they were made for MR image data acquired at too low a magnetic field strength, and partly because they lacked the cerebral abnormalities present in patients after aSAH, such as enlarged ventricles. k-Nearest Neighbor-based probabilistic segmentation (kNN) [8] is a supervised pattern recognition method that can perform precise and accurate brain volume measurements [7], and for which training data can be obtained from different high-resolution MR brain scans containing a variety of cerebral abnormalities.
We therefore aimed to design a new, automatic routine for quantification of cerebral structure volumes in patients after aSAH, based on kNN with manually segmented MR image training data.
II. MATERIALS AND METHODS
A. Data
Ten scans for training and 12 scans for validation were included, from patients after aSAH and from age- and sex-matched control participants. All scans were obtained between 2005 and 2007. Patients who had been screened for aneurysms were included as control participants.
Patients were excluded if they had additional aneurysms
treated with neurosurgical clips that either contained
ferromagnetic material or were located less than 20 mm from
the coiled aneurysm, had a cardiac pacemaker, were
claustrophobic or younger than 18 years [9].
MRI scans were acquired on a 3T Philips magnetic
resonance imaging system using a standardized protocol (24
contiguous slices, voxel size: 0.45 × 0.45 × 4.0 mm) and
consisted of an axial T1-weighted (repetition time in ms [TR]:
500, echo time in ms [TE]: 10) and T2-weighted sequence
(TR: 3000, TE: 80).
B. Image processing
Routine steps
In figure 1, all routine steps from provided images to
resulting probability maps are schematically visualized.
Automated Measurement of Brain Volume
in Patients after Aneurysmal Subarachnoid
Hemorrhage Anne Kaspers, Biomedical Image Sciences, University Medical Centre Utrecht
Fig. 1. Flow chart of the Volume Measurement Routine
First, the T1-weighted image was rigidly registered to the
T2-weighted image by using Elastix [10].
To exclude hyper-intense non-brain structures such as skull and fatty tissue, a brain mask was created by an automated routine based on the k-means algorithm [11], which used both the T1- and T2-weighted image (figure 2A). The first non-empty slice was included 5 times to provide more hyper-intense background information for k-means clustering. Prior to full k-means clustering, a foreground mask was created using k-means clustering on a small sample set (figure 2B). Scan inhomogeneities were corrected by a shading correction algorithm using a multiplicative 4th-order correction model on all voxels covered by the foreground mask [12]. In full k-means clustering, all shading-corrected T1 and T2 intensities were taken as samples in a 2D feature space, which contained only intensity parameters. The algorithm searched for 10 means that minimized the sum of the Euclidean distances of all samples to their nearest mean. Each voxel was assigned the cluster number of its nearest mean, which resulted in 10 brain clusters and 1 background cluster, the latter derived from the foreground mask (figure 2C).
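The clustering step can be illustrated with a minimal sketch of the standard k-means (Lloyd) iteration on (T1, T2) intensity pairs. The function name `kmeans_2d`, the deterministic initialisation and the fixed number of iterations are assumptions made for this illustration and are not taken from the routine itself; the centroid update used here is the textbook one, which minimizes squared Euclidean distances.

```python
def kmeans_2d(samples, k, iterations=20):
    """Cluster (T1, T2) intensity pairs with Lloyd's k-means algorithm."""
    ordered = sorted(samples)
    step = len(ordered) / k
    # Assumption: deterministic initialisation with k evenly spaced samples.
    means = [ordered[int((j + 0.5) * step)] for j in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: give each sample the number of its nearest mean.
        labels = [
            min(range(k),
                key=lambda j: (t1 - means[j][0]) ** 2 + (t2 - means[j][1]) ** 2)
            for t1, t2 in samples
        ]
        # Update step: move each mean to the centroid of its members.
        for j in range(k):
            members = [s for s, lab in zip(samples, labels) if lab == j]
            if members:  # an empty cluster keeps its previous mean
                means[j] = (sum(m[0] for m in members) / len(members),
                            sum(m[1] for m in members) / len(members))
    return means, labels
```

In the setting described above, `samples` would hold the shading-corrected intensities of all foreground voxels and `k` would be 10; the toy scale of this sketch is only meant to show the two alternating steps.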
To select the clusters suitable for the brain mask, cluster numbers were counted over a fixed selection of approximately 1/3 of the voxels, located in the center of the cluster image. The 4 largest clusters, plus any additional clusters whose size exceeded a threshold, were combined into a basic mask (figure 2D).
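The selection rule can be sketched as follows. The function name `select_brain_clusters` and the threshold value of 5% of the central voxels are hypothetical; the text does not state the threshold that was actually used, and the sketch assumes that the background cluster has already been left out of `center_labels`.

```python
from collections import Counter

def select_brain_clusters(center_labels, n_largest=4, size_threshold=0.05):
    """Return the cluster numbers to include in the basic brain mask.

    center_labels: cluster numbers of the voxels in the central region.
    size_threshold: hypothetical minimum fraction of the central voxels.
    """
    counts = Counter(center_labels)
    ranked = [label for label, _ in counts.most_common()]
    selected = set(ranked[:n_largest])  # the largest clusters are always kept
    total = len(center_labels)
    # Add any further cluster that is still large enough.
    selected.update(label for label in ranked[n_largest:]
                    if counts[label] / total > size_threshold)
    return selected
```

The basic mask is then the set of voxels whose cluster number is in the returned selection.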
To exclude the remaining non-brain structures and to fill holes, a number of morphological operations were performed. An erosion with a round, 11-voxel-wide kernel separated non-brain structures from the brain. These structures were removed by segmenting groups of connected mask voxels, further referred to as blobs, and keeping only the largest blob. Dilation with the same kernel as used for the erosion restored the old borders (figure 2E). A set of 6 dilations with a round, 9-voxel-wide kernel filled holes while keeping the shape of the mask edge intact. The mask was brought back within its original borders by 7 erosions with the same kernel (figure 2F). Taking the maximum of the brain mask with holes and the eroded mask restored the old borders while the holes remained filled (figure 2G). At the end of the routine, 3 dilations with a round, 7-voxel-wide kernel increased the margin to include all CSF below the skull. Since only the cerebral volume was important for our study, the cerebellum was manually segmented (figure 2H).
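The hole-filling step (n dilations, n + 1 erosions, then a voxelwise maximum with the unclosed mask) can be sketched on a toy 2D grid. The 3x3 cross-shaped kernel, the 2D setting and the function names are assumptions chosen to keep the example short; the routine itself used round kernels of 9 voxels wide on the image volume.

```python
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 3x3 cross-shaped kernel

def dilate(mask):
    """Binary dilation of a 2D grid of 0/1 values."""
    rows, cols = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in NEIGHBOURS:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and mask[rr][cc]:
                    out[r][c] = 1
    return out

def erode(mask):
    """Binary erosion; voxels outside the grid count as background."""
    rows, cols = len(mask), len(mask[0])
    out = [row[:] for row in mask]
    for r in range(rows):
        for c in range(cols):
            for dr, dc in NEIGHBOURS:
                rr, cc = r + dr, c + dc
                if not (0 <= rr < rows and 0 <= cc < cols) or not mask[rr][cc]:
                    out[r][c] = 0
    return out

def fill_holes_keep_border(mask, n_dilations):
    """Fill holes without moving the outer border of the mask."""
    closed = mask
    for _ in range(n_dilations):
        closed = dilate(closed)
    for _ in range(n_dilations + 1):  # one more erosion than dilations
        closed = erode(closed)
    # The voxelwise maximum restores the original border; holes stay filled.
    return [[max(a, b) for a, b in zip(row_m, row_c)]
            for row_m, row_c in zip(mask, closed)]
```

The extra erosion keeps the closed mask strictly inside the original border, so the maximum can only add voxels in the interior, which is why border detail is preserved.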
The T2 image and the registered T1 image were multiplied
voxelwise by their corresponding mask including cerebellum
and inhomogeneities were corrected [12], resulting in brain
extracted shading corrected images, which were used for kNN
classification (figure 1, processing routine).
Fig. 2. k-Means mask routine
As post-processing, small groups of connected voxels with non-zero probability, further referred to as blobs, were transferred from the lateral ventricles to the peripheral CSF probability map; only the largest blob was not transferred. Afterwards, a visual check was done to move back wrongly transferred blobs.
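A minimal sketch of this blob transfer on a toy 2D probability map is given below. The function name `transfer_small_blobs`, the use of 4-connectivity in 2D and the choice to add the transferred probability to the peripheral CSF map are assumptions for illustration; the connectivity used in the routine is not stated.

```python
from collections import deque

def transfer_small_blobs(ventricle_map, csf_map):
    """Move all but the largest blob from ventricle_map to csf_map (in place)."""
    rows, cols = len(ventricle_map), len(ventricle_map[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if ventricle_map[r][c] > 0 and not seen[r][c]:
                # Breadth-first search collects one 4-connected blob.
                blob, queue = [], deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    blob.append((y, x))
                    for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and ventricle_map[ny][nx] > 0 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                blobs.append(blob)
    if not blobs:
        return
    largest = max(blobs, key=len)
    for blob in blobs:
        if blob is not largest:
            for y, x in blob:
                # Assumption: the probability is added to the CSF map.
                csf_map[y][x] += ventricle_map[y][x]
                ventricle_map[y][x] = 0.0
```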
To remove background outside the brain that was misclassified as subcortical structures or cortical grey matter, the mask was eroded 2 times with a round, 7-voxel-wide kernel, and voxels of subcortical structures and cortical grey matter outside the eroded mask were excluded. Infarcts, drain trajectories, meningiomas and similar abnormalities significantly degraded the classification outcome and were manually segmented and removed from the probability maps. In figure 3, an example classification outcome of one participant is shown.
Routine choices
In this study, volume measurements of subcortical structures, cortical grey matter, peripheral CSF and lateral ventricles were performed. Besides these structures, the masked area contained other structures, further referred to as background, which needed to be included in the training data to prevent misclassification. Assigning all unclassified voxels in the training data to the background would incorrectly assign partial volume voxels of brain structures to the background. Assigning only hypo-intense voxels to the background would cause hyper-intense background to be misclassified as nearby brain structures with similar intensity. Therefore, we put a manual selection of non-partial hypo- and hyper-intense background in the training data. Remaining skull and fatty tissue that was misclassified as subcortical structures or cortical grey matter was removed if it was located within 6 voxels of the edge of the brain mask, under the assumption that only peripheral CSF could be located there.
The provided T1- and T2-weighted MR brain images contained a shading artefact, which reduced the intensity homogeneity of each brain structure. We applied inhomogeneity correction [12], assuming that its effect on the classification could be large, since the orientation of the shaded area differs between scans, which makes it hard for kNN to handle. Removing the shading beforehand seemed better than including a representative selection of all shading areas in the training data, which would enlarge the overlap of structure samples in feature space. In figure 4, T1- and T2-weighted intensities of samples from a training data patient with numerous parenchymal high-signal intensity lesions on T2-weighted MRI are shown before and after inhomogeneity correction. Both the T1- and T2-weighted image added information, as shown by the different ranges of the structures on the x- and y-axis. After correction, the intensities of all structures were more concentrated and distinctive. Cortical grey matter, peripheral CSF and parenchymal lesion intensities were better separated from each other, while there was still overlap between subcortical structures and cortical grey matter, which could be explained by the unclear border between them in both the T1- and T2-weighted image. The effect of inhomogeneity correction on cortical grey matter classification is shown in figure 5 for one participant scan with little shading and one with significant shading. After correction, cortical grey matter was better classified in the shaded area, which made the segmentation more uniform.
To create a proper brain mask, we designed an automated routine based on the k-means algorithm [11]. It was extended with cluster selection and a set of morphological operations to fill holes, caused by the exclusion of small clusters in the brain, while the original borders were maintained. Parameters for cluster selection were determined by testing, on our training data, values close to the settings used in a study by Jongen [13]. In contrast to the mask routine used by Jongen, we automated cluster selection by setting a cluster size threshold, which provided good cluster selection for 9 of the 10 training data images. After cluster selection, a number of small dilations, followed by one more small erosion than the number of dilations, was used instead of a single large morphological closing, to fill large holes without loss of border detail. Holes close to the border were filled, while the original border was kept intact, by taking voxelwise the maximum of the unclosed mask and the closed, eroded mask.
For a selection of participants, the results of k-means and the Brain Extraction Tool (BET) were compared [14]. In normal cases BET performed similarly to k-means, but in cases with large infarcts k-means performed better. In k-means we could determine the number and selection of clusters to be classified.
Fig. 3. A registered T1 and T2 weighted image and corresponding kNN probability maps of subcortical structures, cortical grey matter, peripheral CSF and lateral ventricles.
This allowed us to include large infarcts and exclude hyper-intense background. BET often considered infarcts as non-brain structures, which caused large gaps in the mask. Since a large part of the patients after aSAH had infarcts (n = 40), we chose to use k-means instead of BET.
All blobs in the lateral ventricles probability map except the largest were transferred to the peripheral CSF probability map, under the assumption that all lateral ventricle voxels are connected to each other. However, this assumption was not valid in all cases because of the large slice thickness. Manual adjustment was needed for some posterior and inferior ventricle horns. Nevertheless, this operation was a simple way to improve the results.
Since we were only interested in volume measurements of brain structures in the cerebrum, we needed to segment the cerebellum. However, the presence of subcortical structures, cortical grey matter and peripheral CSF in both cerebrum and cerebellum complicated kNN classification, and the search for better methods exceeded the project scope, so we segmented the cerebellum manually. Because the border between cerebrum and cerebellum was unclear, specific segmentation rules had to be defined to guarantee consistency.
C. Training data routine
The training data consisted of non-partial volume segmentations of 10 participant scans (JB). They formed a representative selection of the dataset (Appendix A), composed of scans of patients after aSAH and control participants, which varied in modified Rankin Scale [15] and in size of the lateral ventricles. The segmentations contained background and 4 brain structures: subcortical structures, cortical grey matter, peripheral CSF and lateral ventricles. For all training data participants, pre-processing was performed (section B). A fixed, random selection of 40% of the manually segmented structure and background voxels was stored with their brain-extracted, shading-corrected T1- and T2-weighted intensities and spatial parameters. With these samples, the kNN algorithm could calculate distances in feature space to obtain structure probabilities for partial volume samples.
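How kNN turns distances in feature space into structure probabilities can be sketched as follows: the probability of a structure for a query voxel is the fraction of its k nearest training samples that carry that structure's label. The function name `knn_probabilities`, the value of `k` and the unweighted Euclidean distance are assumptions for this illustration; feature scaling, which matters when intensities and spatial coordinates are combined, is left out.

```python
from collections import Counter

def knn_probabilities(train_features, train_labels, query, k=5):
    """Return {label: fraction of the k nearest training samples with it}."""
    # Rank the training samples by squared Euclidean distance to the query.
    order = sorted(
        range(len(train_features)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_features[i], query)),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return {label: count / k for label, count in votes.items()}
```

Because the training samples are non-partial voxels, a query voxel that lies between two structures in feature space receives a mixture of labels among its neighbors, which yields a fractional probability for each structure.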
D. Validation routine
Right or left hemispheres were selected randomly throughout the brain from 12 participant scans, of which 6 were from the training data and 6 from other data. Subcortical structures, cortical grey matter, peripheral CSF and lateral ventricles in these slices were manually segmented by 2 observers. The observers could indicate multiple structures per voxel, so in contrast to the training data, the validation data also contained partial volume voxels.
Fig. 4. A. Scatter plot of voxel intensities of the original T2W image relative to the registered original T1WFFE image of one patient from the training data. Five structures are indicated: subcortical structures (SCS), cortical grey matter (CGM), peripheral (per.) CSF, lateral (lat.) ventricles and parenchymal (par.) lesions. B. Same for shading-corrected intensities.
Since there could be multiple structures per voxel, manual fractions could be computed, both for single observers and for combined observers. A uniform distribution of structures and of observer certainty was assumed for each voxel, since no information about the distribution was provided. For a single observer $o$, the manual fraction for voxel $v$ and structure $s$ is defined as

$$ m_o(v, s) = \frac{b_o(v, s)}{n_o(v)} \tag{1} $$

where $b_o(v, s)$ is the binary value for voxel $v$ and structure $s$ of the observer and $n_o(v)$ is the number of structures classified in voxel $v$ by the observer. To enlarge the range of manual fractions, the uncertainty of both observers was combined. For combined observers, the manual fraction is equal to the average of both observers' manual fractions:

$$ m(v, s) = \frac{m_1(v, s) + m_2(v, s)}{2} \tag{2} $$
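For calculation of the manual fraction of total brain, subcortical structures and cortical grey matter were merged, and for the manual fraction of total CSF, peripheral CSF and lateral ventricles were merged. With $m_o(v, s)$ denoting the manual fraction of a single observer $o$ for voxel $v$ and structure $s$, the manual fractions for voxel $v$ of total brain (TB) and total CSF (TCSF), respectively, are defined as

$$ m_o(v, \mathrm{TB}) = m_o(v, \mathrm{SCS}) + m_o(v, \mathrm{CGM}) $$

and

$$ m_o(v, \mathrm{TCSF}) = m_o(v, \mathrm{pCSF}) + m_o(v, \mathrm{LV}), $$

where SCS, CGM, pCSF and LV denote subcortical structures, cortical grey matter, peripheral CSF and lateral ventricles. For combined observers, the average of the total brain and total CSF fractions of both observers was taken. The manual value of intracranial volume is binary for a single observer, since it is 1 for all structures and 0 for the background, and fractional for combined observers, for which the average of the binary values of both observers was taken.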
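The manual fractions for one voxel can be illustrated with a short sketch. The function names, the structure abbreviations and the handling of a voxel in which an observer marked no structure (all fractions set to 0, so that both observers still get an equal share) are assumptions for this illustration.

```python
STRUCTURES = ("SCS", "CGM", "pCSF", "LV")

def manual_fractions(marked):
    """Single observer: every marked structure gets an equal share of the voxel."""
    n = len(marked)
    # Assumption: a voxel without any marked structure gets fraction 0 everywhere.
    return {s: (1.0 / n if s in marked else 0.0) for s in STRUCTURES}

def combined_fractions(marked_1, marked_2):
    """Combined observers: average of the two single-observer fractions."""
    f1, f2 = manual_fractions(marked_1), manual_fractions(marked_2)
    return {s: (f1[s] + f2[s]) / 2.0 for s in STRUCTURES}
```

For example, if one observer marks cortical grey matter and peripheral CSF in a voxel and the other marks only cortical grey matter, the combined fractions are 0.75 for cortical grey matter and 0.25 for peripheral CSF.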
Fig. 5. Example of an image with a significant shading artefact (top) and a small shading artefact (bottom), with their cortical grey matter classifications using shading-corrected (SC) training data on the SC image (middle) and using uncorrected training data on the uncorrected image (right).
E. Evaluation
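The agreement between the observer segmentations and the automatic segmentation acquired by kNN classification, and the inter-observer agreement, were measured by a variant of the Dice similarity index (SI) [16, 17]. The SI formula assumes binary values for both the reference and the segmentation. It is defined as

$$ \mathrm{SI} = \frac{2\,(\mathrm{Ref} \cap \mathrm{Seg})}{\mathrm{Ref} + \mathrm{Seg}} = \frac{2 \sum_{v} \mathrm{Ref}(v)\,\mathrm{Seg}(v)}{\sum_{v} \mathrm{Ref}(v) + \sum_{v} \mathrm{Seg}(v)} $$

where "Ref" denotes the volume of the binary reference, "Seg" is the volume of the binary segmentation, "Ref ∩ Seg" denotes the volume of the intersection of the binary reference and binary segmentation, $\mathrm{Ref}(v)$ and $\mathrm{Seg}(v)$ are the binary values of voxel $v$, $\sum_{v} \mathrm{Ref}(v)$ is the sum over all voxels in the binary reference (and likewise $\sum_{v} \mathrm{Seg}(v)$ for the binary segmentation), and $\sum_{v} \mathrm{Ref}(v)\,\mathrm{Seg}(v)$ is the sum over all voxels whose value equals 1 in both the binary reference and the binary segmentation.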
Because we calculated manual fractions for the observer segmentations, and kNN classification provided probabilistic segmentations, the fractional Similarity Index (fSI) was measured [18]. It is defined as

$$ \mathrm{fSI} = \frac{2 \sum_{v} \min\bigl(\mathrm{Ref}(v), \mathrm{Seg}(v)\bigr)}{\sum_{v} \mathrm{Ref}(v) + \sum_{v} \mathrm{Seg}(v)} $$

where $\mathrm{Ref}(v)$ is the manual fraction of voxel $v$, computed for single observers (formula 1) or combined observers (formula 2), and $\mathrm{Seg}(v)$ is the fractional value of the segmentation in voxel $v$. Notice that when binary values are substituted for the fractional values, the fSI formula is equal to the SI formula. The agreement of the probabilistic manual segmentations with the automatic segmentation and the inter-observer agreement were measured with the fSI.
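As a small numerical illustration, the fSI can be computed with the sum of voxelwise minima as the fractional overlap, consistent with the definition of true positives used for the sensitivity. The function name and the flat-list representation of the voxels are assumptions for this sketch.

```python
def fractional_similarity_index(ref, seg):
    """fSI of two equally long lists of voxel fractions in [0, 1]."""
    # Fractional overlap: the part of each voxel claimed by both inputs.
    overlap = sum(min(r, s) for r, s in zip(ref, seg))
    return 2.0 * overlap / (sum(ref) + sum(seg))
```

With binary inputs the minimum equals the intersection, so the function returns the ordinary Dice similarity index in that case.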
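Besides the fSI, the sensitivity and specificity were also measured; these are more common quality indicators and therefore make the validation outcome comparable to other studies. They are defined as

$$ \mathrm{sensitivity} = \frac{\sum_{v} \min\bigl(\mathrm{Ref}(v), \mathrm{Seg}(v)\bigr)}{\sum_{v} \mathrm{Ref}(v)} $$

and

$$ \mathrm{specificity} = \frac{N - \sum_{v} \max\bigl(\mathrm{Ref}(v), \mathrm{Seg}(v)\bigr)}{N - \sum_{v} \mathrm{Ref}(v)} $$

where $\mathrm{Ref}(v)$ and $\mathrm{Seg}(v)$ are the reference and segmentation probabilities of voxel $v$, $N$ is the number of voxels, $\sum_{v} \min(\mathrm{Ref}(v), \mathrm{Seg}(v))$ is the sum of the minima of the reference and segmentation probabilities, equivalent to the sum of true positives, $\sum_{v} \mathrm{Ref}(v)$ is the sum of reference probabilities, equivalent to the sum of true positives and false negatives, $N - \sum_{v} \max(\mathrm{Ref}(v), \mathrm{Seg}(v))$ is the number of voxels minus the sum of the maxima of the reference and segmentation probabilities, equivalent to the sum of true negatives, and $N - \sum_{v} \mathrm{Ref}(v)$ is the number of voxels minus the sum of reference probabilities, equivalent to the sum of true negatives and false positives.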
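The fractional sensitivity and specificity can be sketched in the same way, using voxelwise minima for the true positives and voxelwise maxima for the true negatives as described above. The function names and the flat-list representation of the voxels are assumptions for this illustration.

```python
def fractional_sensitivity(ref, seg):
    """Sum of minima (true positives) over the sum of reference fractions."""
    return sum(min(r, s) for r, s in zip(ref, seg)) / sum(ref)

def fractional_specificity(ref, seg):
    """(N - sum of maxima) over (N - sum of reference fractions)."""
    n = len(ref)
    true_negatives = n - sum(max(r, s) for r, s in zip(ref, seg))
    return true_negatives / (n - sum(ref))
```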
The reference and segmented volumes were determined by multiplying $\sum_{v} \mathrm{Ref}(v)$ and $\sum_{v} \mathrm{Seg}(v)$, respectively, by the volume of 1 voxel in milliliters. The difference was examined to detect over- or under-segmentation of the automated structure volumes.
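As a worked example of this volume computation: with the voxel size of the scanning protocol (0.45 × 0.45 × 4.0 mm), one voxel measures 0.81 mm³, which is 0.00081 ml. The function name and the default argument below are assumptions for this sketch.

```python
VOXEL_SIZE_MM = (0.45, 0.45, 4.0)  # voxel size stated in the scanning protocol

def structure_volume_ml(probabilities, voxel_size_mm=VOXEL_SIZE_MM):
    """Volume in ml: sum of voxel probabilities times the volume of one voxel."""
    dx, dy, dz = voxel_size_mm
    voxel_volume_ml = dx * dy * dz / 1000.0  # 1 ml equals 1000 cubic mm
    return sum(probabilities) * voxel_volume_ml
```

A structure whose probabilities sum to 1000 voxels would thus have a volume of about 0.81 ml.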
Inter-observer and routine fSI and sensitivity scores of subcortical structures, cortical grey matter, peripheral CSF, lateral ventricles, total brain, total CSF and intracranial volume were analyzed. To investigate whether inclusion of training data in the validation data improved the validation scores, the fSI scores of a validation set consisting only of training data were compared with those of a validation set of non-training data.
III. RESULTS
Table I shows the inter-observer validation results for all structures. Apart from peripheral CSF, the fSI scores of all structures are good, with a score of 0.82 for cortical grey matter and total CSF, 0.95 for lateral ventricles and total brain, and 0.98 for intracranial volume. In contrast to its high fSI score, the sensitivity of cortical grey matter is moderate, with a score of 0.77.
Table II shows the routine validation results for all structures. Intracranial volume, total brain and lateral ventricles scored well, with fSI scores of 0.98, 0.93 and 0.92, respectively, and similar sensitivity scores. Subcortical structures scored lower, with an fSI score of 0.83 and a sensitivity score of 0.88. Total CSF, cortical grey matter and peripheral CSF scored moderately, with fSI scores of 0.77, 0.76 and 0.71, respectively.
IV. DISCUSSION
In this paper we proposed a kNN-based routine to segment subcortical structures, cortical grey matter, peripheral CSF and lateral ventricles on 3T T1- and T2-weighted MR brain images of patients after aSAH. To measure subtle differences in brain volumes, high accuracy and precision were required. Therefore, we based our routine on the kNN algorithm, which is an accurate and precise method, used accurate training data segmented by an expert, and automated most routine steps for optimal precision. The fSI scores of intracranial volume, total brain and lateral ventricles were good, while the scores of subcortical structures, total CSF, cortical grey matter and peripheral CSF were lower.
A. Classification issues
The low scores of cortical grey matter, peripheral CSF and total CSF are partially explained by the slice thickness (4 mm), which exceeded the thickness of cortical grey matter (2-4 mm) and peripheral CSF (± 2 mm) [19], so that these structures largely consisted of partial volume voxels. Subcortical structures and especially cortical grey matter both have a lower fSI score than total brain. This is partly explained by the large overlapping area between subcortical structures and cortical grey matter, where partial volume correction caused rounding errors, and partly by the perivascular spaces, which were misclassified as cortical grey matter (figure 6).
Several studies showed that fluid attenuation inversion recovery (FLAIR) images are more suitable for classification of parenchymal high-signal intensity lesions on T2-weighted MRI, since FLAIR shows these lesions hyper-intense and the ventricles hypo-intense [20]–[23]. In a study by Anbeek, the optimal SI score decreased from 0.81 to 0.63 when FLAIR images were excluded from the training data, which consisted of inversion recovery (IR), proton-density (PD), T1- and T2-weighted images [24]. Because we did not have FLAIR images, good segmentation of these lesions was not feasible, since parenchymal high-signal intensity lesions and lateral ventricles are both hyper-intense on T2-weighted MRI and located close to each other, and the lesions occur at different locations and in different amounts. Therefore, the lesions were combined with subcortical structures, to which they belong anatomically.
B. Validation issues
To fully exploit the observer segmentations, they were combined into manual fractions, which take partial volume into account. Both observers got an equal share, even if one observer did not assign any structure. The observers did not indicate how multiple structures were distributed within a voxel, so we considered all structures equally important. For example, in the case of one observer, three structures in a voxel each got a probability of 1/3. In reality, one of the three structures could be dominant and should have a higher probability. For all partial volume voxels in which structures were not equally distributed, the manual fractions deviated, which caused lower classification scores. However, we assumed that in a voxel, dominant structures will always be noticed by both observers and minor structures could be missed by one observer, which compensates for some of the deviation.

TABLE I
INTER-OBSERVER VALIDATION RESULTS
Tissue type              Sensitivity   Specificity   fSI
Subcortical structures   0.89          0.99          0.87
Cortical grey matter     0.77          0.99          0.82
Peripheral CSF           0.87          0.99          0.77
Lateral ventricles       0.95          1.00          0.95
Total brain              0.93          1.00          0.95
Total CSF                0.90          0.99          0.82
Intracranial             0.98          1.00          0.98

TABLE II
ROUTINE VALIDATION RESULTS
Tissue type              Sensitivity   Specificity   fSI
Subcortical structures   0.88          0.98          0.83
Cortical grey matter     0.70          0.98          0.76
Peripheral CSF           0.74          0.99          0.71
Lateral ventricles       0.92          1.00          0.92
Total brain              0.92          0.99          0.93
Total CSF                0.80          0.99          0.77
Intracranial             0.98          0.99          0.98

Fig. 6. Example of perivascular spaces misclassified as cortical grey matter.
Manual fractions could only take a limited number of values, while the kNN output had a wide range. Hence an error margin was always added, which decreased our fSI scores. We chose not to restrict the kNN output to the values of the manual fractions, because that would alter the results solely for validation purposes, while the unadjusted results were used for volume measurement.
Using fSI instead of SI is an improvement because it deals better with partial volume. The probabilistic outcome of our kNN routine did not have to be rounded, and the information on multiple structures from both observers could be utilized effectively. However, fSI scores have not been used in other studies so far and could therefore not be compared. Measurement of both the SI and the fSI between observers was possible, since their segmentations are binary and could be transformed to fractions. The relation of fSI to SI scores could therefore be examined. Generally, the fSI scores were lower than the SI scores, especially for structures with much partial volume, like peripheral CSF and total CSF, because the SI formula does not account for partial volume. Usually an SI of 0.80 or higher is considered a good segmentation, and given that fSI is probably stricter than SI, we applied the same criterion to fSI. Compared to the optimal SI values of the kNN-based routine used by Anbeek, which were based on PD-, T1- and T2-weighted scans, the present routine scored similarly, and even higher for lateral ventricles. This holds even though fSI is stricter and PD-weighted images were not included [24]. The high fSI score for lateral ventricles could be explained by the larger ventricle volume of patients after aSAH. Larger ventricles consist mostly of non-partial voxels, which can be classified better than partial volume voxels. An even lower optimal SI for cortical grey matter, compared to our fSI score, indicated that our routine did not fail but performed well given the kNN algorithm and the provided imagery.
Validation scores of the single observers versus the automatic routine were approximately the same as those of the combined observers versus the automatic routine. Adding the extra information on uncertainty did not improve the scores. Leaving training data out of the validation data did not change the scores significantly, which indicated good classification quality for new participant scans.
C. Application
The present routine is based on the kNN algorithm, which can deliver precise and accurate results while being simple and fast. Apart from the quality of the images, its quality depends strongly on the composition of the training data, in which cerebral abnormalities were included. Samples of the training data were used consistently by kNN for precise classification. Because kNN effectively measured spatial and intensity distances in feature space, a small training set of non-partial voxels was enough to deal with partial volume. The k-means algorithm, which was used for brain mask creation, is also simple and provides precise cluster images, under the assumption that sufficient samples are taken. With our defined set of morphological operations, cluster images could be transformed into closed masks with their original borders unchanged. Hence, the core of our routine is clear and simple, so we could focus on application-specific processing to improve the kNN results. Apart from cerebellum segmentation, all steps in our routine were automated. Selection of appropriate training data may require many expensive man-hours, although a study by Vrooman showed that automatic training with kNN is possible and that routine steps need only little adaptation for general use [25]. Hence, application of the routine is feasible, and additions and changes can be tested without much human intervention.
D. Strengths and limitations
The strength of the present study is the use of non-partial volume samples in the training data for kNN classification. Accuracy of the brain volume measurements was evaluated using small, representative manual segmentations that contained partial volume information, while other brain volume measurement studies use binary manual segmentations. Precision of the brain volume measurements could be evaluated because data was selected from a significant number of scans with a variety of cerebral abnormalities. For optimal precision, a standardized scanning protocol was used for acquiring the images of the data set. Automated routine steps ensured consistency, whereas manual steps, like cerebellum segmentation, were performed consistently.
A limitation of the present routine is that many cerebral abnormalities, like infarcts and perivascular spaces, could not be processed automatically. However, we had accurate manual segmentations of those cerebral abnormalities at our disposal, so this limitation did not hinder accurate brain volume measurements. The small number of observers limited the evaluation, because only 6 different values could be assigned to the manual fractions while kNN probabilities could have 100 different values, but this is still better than using binary manual values.
V. CONCLUSION
In this paper, we proposed an automated routine for brain volume measurements on MR brain images from patients after aSAH. We extended kNN classification with processing steps, which we described and evaluated. Lateral ventricles, total brain and intracranial volume had good validation scores, while structures with more partial volume scored worse. This could be explained by limitations of the validation, since visual inspection showed good performance for structures with much partial volume, like peripheral CSF.
VI. FUTURE PROSPECTS
Most cerebral abnormalities present in patients after aSAH were manually segmented, but their segmentation could be automated after more study or under other conditions. For accurate automatic cerebellum segmentation, sagittal images may be needed, since they show the border between cerebrum and cerebellum more clearly. Validation scores of structures with much partial volume should increase with the number of observers, because more observers make the manual fraction more accurate. These assumptions need to be addressed in further studies.
APPENDIX A
A.1 Data
For cross-sectional volume measurements, 39 patients after
aSAH and 30 control participants from the COMET study
were selected. Inclusion criteria were mentioned in chapter
Materials and Methods, section Data. Additionally, control
participants with symptomatic ischemia were excluded. One
control participant had a large infarct because of a
neurotrauma and 3 control participants had clinically manifest
infarcts.
A.2 Cross-sectional routine
For all participants in the SAH database, pre-processing was
performed as mentioned. In two cases, only 3 clusters were
taken in k-means and in 5 cases an extra cluster was added
when a good cluster image initially did not result in a good
mask. For some masks, eyes were removed, moderate
imperfections were adjusted or k-means was performed with
fewer clusters because of movement artifacts, infarcts,
bleedings or without clear reason.
Post-processing of the kNN probability maps was performed.
In 18 cases, one or two ventricle horns, whose voxels
were not attached to the lateral ventricle voxels, had to be
manually moved back from peripheral CSF to lateral
ventricles.
Automated segmented volumes of all structures were
calculated by multiplying the sum of all probabilities by the
volume of one voxel in milliliters. For the validation
data, the difference between the automated and manual
volume and the average volume for all validation participants
were calculated.
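The volume calculation described above can be illustrated with a minimal sketch. It is not the code used in this study: the function name, the 1 mm isotropic voxel size, the probability values and the manual volume are hypothetical and chosen only for illustration.

```python
from math import prod

def structure_volume_ml(probabilities, voxel_size_mm):
    """Volume in ml: sum of voxel probabilities times the volume of one voxel."""
    # 1 ml equals 1000 cubic millimeters.
    voxel_volume_ml = prod(voxel_size_mm) / 1000.0
    return sum(probabilities) * voxel_volume_ml

# Hypothetical probability map, flattened to a list:
# 64 voxels fully inside the structure, 16 partial-volume voxels, 920 outside.
probabilities = [1.0] * 64 + [0.5] * 16 + [0.0] * 920
automated_ml = structure_volume_ml(probabilities, (1.0, 1.0, 1.0))

# Hypothetical manual volume for the same structure (not study data).
manual_ml = 0.070
difference_ml = automated_ml - manual_ml
print(f"automated: {automated_ml:.3f} ml, difference: {difference_ml:+.3f} ml")
```

Because the probabilities are summed rather than thresholded, voxels with partial volume contribute fractionally to the total.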
The results of the probabilistic classification of all
structures were visually checked for all participants, and
incorrectly classified images were excluded. Total brain
and total CSF volume were also calculated. The mean and standard
deviation of the total brain, total CSF, subcortical structures,
cortical grey matter, peripheral CSF, and lateral ventricular
volume were measured for patients after aSAH and control
participants.
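The group statistics can be sketched as follows. This is an illustrative example only: the volumes are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev

def summarize(volumes_ml):
    """Return (mean, sample standard deviation) of a list of volumes in ml."""
    return mean(volumes_ml), stdev(volumes_ml)

# Hypothetical total brain volumes per group (not study data).
groups = {
    "control": [950.0, 1000.0, 1050.0],
    "aSAH": [900.0, 960.0, 1020.0],
}

summary = {name: summarize(volumes) for name, volumes in groups.items()}
for name, (m, sd) in summary.items():
    print(f"{name}: {m:.0f} ± {sd:.1f} ml")
```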
A.3 Cross-sectional volume measurements
Table A.I shows the mean and standard deviation of
automated volume measurements for control participants and
patients after aSAH. As expected, patients after aSAH had
larger lateral ventricles and infarcts than control participants.
TABLE A.I
MEAN VOLUMES AND STANDARD DEVIATION OF VOLUMES IN PATIENTS WITH SAH AND CONTROL PARTICIPANTS
Volume (ml)            Peripheral CSF   Lateral ventricles   Total brain   Total CSF    Intracranial   Infarct¹
Control participants   232 ± 52.5       26.6 ± 10.6          978 ± 80.8    259 ± 57.4   1235 ± 125     1.10 [0.67, 1.53]
Patients with SAH      200 ± 40.4       48.0 ± 25.4          956 ± 112     248 ± 39.4   1194 ± 134     5.92 [1.49, 20.8]
Data are unadjusted mean brain volumes ± SD or ¹ median infarct volumes and interquartile range.
ACKNOWLEDGMENT
My special thanks go to Jeroen de Bresser for his pleasant
supervision and for his approachability during the project, to
Koen Vincken and Hugo Kuijf for their suggestions during the
meetings, to Nelly Anbeek for her suggestions between
meetings and to Bart Waalewijn and Ekke Kaspers for
reviewing my article.
REFERENCES
1. van Gijn J, Rinkel GJE (2001) Subarachnoid haemorrhage: diagnosis, causes and management. Brain 124:249-278
2. Linn FH, Rinkel GJ, Algra A, van Gijn J (1996) Incidence of subarachnoid
hemorrhage: role of region, year, and rate of computed tomography: a meta-analysis. Stroke
3. Broderick JP, Brott TG, Duldner JE, Tomsick T, Leach A (1994) Initial
and recurrent bleeding are the major causes of death following subarachnoid hemorrhage. Stroke; a journal of cerebral circulation
4. Hackett ML, Anderson CS (2000) Health outcomes 1 year after
subarachnoid hemorrhage: An international population-based study. The Australian Cooperative Research on Subarachnoid Hemorrhage Study
Group. Neurology
5. Bendel P, Koivisto T, Niskanen E, Kononen M, Aikia M, Hanninen T, Koskenkorva P, Vanninen R (2009) Brain atrophy and
neuropsychological outcome after treatment of ruptured anterior cerebral
artery aneurysms: a voxel-based morphometric study. Neuroradiology
51:711-722
6. Bendel P, Koivisto T, Aikia M, Niskanen E, Kononen M, Hanninen T,
Vanninen R (2009) Atrophic enlargement of CSF volume after subarachnoid hemorrhage: correlation with neuropsychological
outcome. American Journal of Neuroradiology 31:370-376
7. de Bresser J, Portegies MP, Leemans A, Biessels GJ, Kappelle LJ, Viergever MA (2010) A comparison of MR based segmentation
methods for measuring brain atrophy progression. Neuroimage 2:760-768
8. Cover T, Hart P (1967) Nearest neighbor pattern classification. IEEE
Transactions on Information Theory 13:21-27
9. Schaafsma JD, Velthuis BK, Majoie CB, van den Berg R, Brouwer PA, Barkhof F, Eshghi O, de Kort GA, Lo RT, Witkamp TD, Sprengers ME,
van Walderveen MA, Bot JC, Sanchez E, Vandertop WP, van Gijn J,
Buskens E, van der Graaf Y, Rinkel GJ (2010) Intracranial aneurysms treated with coil placement: test characteristics of follow-up MR
angiography--multicenter study. Radiology 1:209-218
10. Klein S, Staring M, Murphy K, Viergever MA, Pluim JP (2009) elastix: a toolbox for intensity-based medical image registration. IEEE Trans
Med Imaging
11. MacQueen J (1965) Some methods for classification and analysis of
multivariate observations.
12. Likar B, Viergever MA, Pernus F (2001) Retrospective correction of MR intensity inhomogeneity by information minimization. IEEE Trans
Med Imaging 20:1398-1410
13. Jongen C, van der Grond J, Kappelle LJ, Biessels GJ, Viergever MA,
Pluim JP (2007) Automated measurement of brain and white matter lesion volume in type 2 diabetes mellitus. Diabetologia 50:1509-1516
14. Smith SM (2002) Fast robust automated brain extraction. Human Brain
Mapping 3:
15. van Swieten JC, Koudstaal PJ, Visser MC, Schouten HJ, van Gijn J (1988)
Interobserver agreement for the assessment of handicap in stroke
patients. Stroke 19:604-607
16. Zijdenbos AP, Dawant BM, Margolin RA, Palmer AC (1994)
Morphometric analysis of white matter lesions in MR images: method
and validation. IEEE Trans Med Imaging 4:716-724
17. Dice LR (1945) Measures of the Amount of Ecologic Association
Between Species. Ecology 26:297-302
18. Crum WR, Camara O, Hill DL (2006) Generalized overlap measures for evaluation and validation in medical image analysis. IEEE Trans Med
Imaging 25:1451-1461
19. Kandel ER, Schwartz JH, Jessell TM (2000) Principles of Neural Science Fourth Edition. McGraw-Hill Medical
20. Admiraal-Behloul F, van den Heuvel DM, Olofsen H, van Osch MJ, van
der Grond J, van Buchem MA, Reiber JH (2005) Fully automatic segmentation of white matter hyperintensities in MR images of the
elderly. Neuroimage 3:607-617
21. Anbeek P, Vincken KL, van Osch MJ, Bisschops RH, van der Grond J (2004) Probabilistic segmentation of white matter lesions in MR imaging.
Neuroimage
22. Murray AD, Staff RT, Shenkin SD, Deary IJ, Starr JM, Whalley LJ (2005) Brain white matter hyperintensities: relative importance of
vascular risk factors in nondemented elderly people. Radiology 1:251-257
23. Wen W, Sachdev PS, Li JJ, Chen X, Anstey KJ (2009) White matter
hyperintensities in the forties: their prevalence and topography in an epidemiological sample aged 44-48. Human Brain Mapping 4:1155-1167
24. Anbeek P, Vincken KL, van Bochove GS, van Osch MJ, van der Grond J (2005) Probabilistic segmentation of brain tissue in MR imaging.
Neuroimage 4:795-804
25. Vrooman HA, Cocosco CA, van der Lijn F, Stokking R, Ikram MA, Vernooij MW, Breteler MM, Niessen WJ (2007) Multi-spectral brain
tissue segmentation using automatically trained k-Nearest-Neighbor
classification. Neuroimage 1:71-81