Liver Tumor Segmentation

Accepted Manuscript

Semi-automatic liver tumor segmentation with hidden Markov measure field

model and non-parametric distribution estimation

Yrjö Häme, Mika Pollari

PII: S1361-8415(11)00093-4

DOI: 10.1016/j.media.2011.06.006

Reference: MEDIMA 621

To appear in: Medical Image Analysis

Received Date: 13 August 2010

Revised Date: 13 June 2011

Accepted Date: 16 June 2011

Please cite this article as: Häme, Y., Pollari, M., Semi-automatic liver tumor segmentation with hidden Markov

measure field model and non-parametric distribution estimation, Medical Image Analysis (2011), doi: 10.1016/

j.media.2011.06.006


http://dx.doi.org/10.1016/j.media.2011.06.006



Semi-automatic liver tumor segmentation with hidden Markov measure field model and non-parametric distribution estimation

Yrjö Häme1,∗, Mika Pollari

Department of Biomedical Engineering and Computational Science, Aalto University School of Science, P.O. Box 12200, FI-00076 AALTO, Finland

Abstract

A novel liver tumor segmentation method for CT images is presented. The aim of this work was to reduce the manual labor and time required in the treatment planning of radiofrequency ablation (RFA), by providing accurate and automated tumor segmentations reliably. The developed method is semi-automatic, requiring only minimal user interaction. The segmentation is based on non-parametric intensity distribution estimation and a hidden Markov measure field model, with application of a spherical shape prior. A post-processing operation is also presented to remove the overflow to adjacent tissue. In addition to the conventional approach of using a single image as input data, an approach using images from multiple contrast phases was developed. The accuracy of the method was validated with two sets of patient data, and artificially generated samples. The patient data included preoperative RFA images and a public data set from "3D Liver Tumor Segmentation Challenge 2008". The method achieved very high accuracy with the RFA data, and outperformed other methods evaluated with the public data set, receiving an average overlap error of 30.3% which represents an improvement of 2.3 percentage points to the previously best performing semi-automatic method. The average volume difference was 23.5%, and the average, the RMS, and the maximum surface distance errors were 1.87, 2.43, and 8.09 mm, respectively. The method produced good results even for tumors with very low contrast and ambiguous borders, and the performance remained high with noisy image data.

Keywords: Liver tumor segmentation, Semi-automatic segmentation, Hidden Markov measure field model

1. Introduction

Liver tumor segmentation has several applications, such as treatment planning and evaluation, and computer-assisted surgery. Manual delineation of tumors is time-consuming and laborious, and the results depend on the observer. For these reasons, there has been increasing research interest directed at segmentation methods that take advantage of existing computing capabilities. The need for method development is underlined by the fact that liver cancer is among the five cancers causing the most deaths worldwide, and metastatic lesions are also common in the liver (Friedman et al., 2003).

The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 223877.

∗Corresponding author. Email addresses: [email protected] (Yrjö Häme), [email protected] (Mika Pollari). URL: http://users.tkk.fi/yhame/ (Yrjö Häme)

1Present address: Dept. of Biomedical Engineering, Columbia University, New York, NY, USA

Contrast-enhanced computed tomography (CECT) is most commonly used for liver lesion evaluation and staging after initial ultrasound imaging (Hann et al., 2000). The imaging is commonly performed in two or three phases that correspond to the different times at which the contrast agent arrives at the liver through the dual blood supply of the organ (Baron, 1994). In addition, native computed tomography (CT) imaging is commonly performed.

The correct timing of the CECT imaging phases is difficult due to variability between patients, which often makes the image data sub-optimal. Typical CT data also has a relatively high level of noise, and as the contrast between the tumor and parenchyma is often low, the tumor may be difficult to detect, and even more so to reliably delineate. In addition to the limitations of the imaging method, liver tumor segmentation is also complicated by tumor variability in size and structure, and tumors may appear practically anywhere within the organ.

State-of-the-art segmentation methods offer reductions in the amount of required user interaction, repeatable results and accuracy comparable with manual segmentations. For the overall treatment process, these traits reduce expenses and increase the process reliability. Segmentation methods that require only minimal initial user interaction, i.e. semi-automatic methods, have been the recent focus of research. They have proved to be able to provide reliable results with accuracy similar to interactive methods (Deng and Du, 2008). Fully automatic methods generally suffer from lower accuracy and robustness, as well as a significantly higher computational cost.

The semi-automatic method by Smeets et al. (2009) is based on a level set method fitted on a fuzzy classification of the image data. The method performed well in the 3D Liver Tumor Segmentation Challenge 2008 (LTS08) (Deng and Du, 2008) but its accuracy declines if the tumor has a low-contrast edge. In addition, since the classification assumes normal distributions for the classes, it does not perform as well if the tumor is adjacent to structures other than healthy liver tissue.

Another semi-automatic method by Moltz et al. (2008) estimates typical tumor and parenchyma intensities based on input from the user, and defines thresholds for region growing based on these estimates. The result is post-processed with morphological operations. The method also performed well with the LTS08 data, but it encounters difficulties with tumors that have inhomogeneous intensity distributions.

A more general approach to lesion segmentation presented by Jolly and Grady (2008) also estimates the intensity distribution based on user-given points, and the segmentation is based on a fuzzy connectedness algorithm that finds a cost value for every image point. The method is able to segment various kinds of tumors, as shown by an extensive evaluation. However, the segmentation accuracy leaves room for improvement and it does not perform well with heterogeneous tumors.

Other related work includes the semi-automatic method by Li et al. (2006), where tumor boundaries are located with a machine learning-based classifier, and the liver structure segmentation method by Freiman et al. (2008), which uses a multi-class Bayesian classifier and morphological operations for adjustments. In addition, a recent publication includes a benchmark study of three semi-automatic methods (Zhou et al., 2010).

Tumors with low contrast are challenging for these tumor segmentation methods, especially if the image has a high level of noise. Ambiguous borders cause difficulties in particular for boundary-based segmentation methods. This has created a need for a method that is able to perform reliably and accurately with these characteristics present, without increasing the amount of required user interaction.

The target application of this work was radiofrequency ablation (RFA) treatment planning, where reliable tumor segmentations are needed for accurate needle placement. Tumors treated with RFA are typically relatively small in size, with diameters of less than 5 cm, and they have a generally spherical shape (Gazelle et al., 2000). A spherical shape is typical also for metastatic lesions in the liver (Halvorsen et al., 1982) and for single nodular hepatocellular carcinoma (Kanai et al., 1987).

A novel semi-automatic method was developed for segmenting liver tumors from low-quality CT data. The method is based on non-parametric intensity distribution estimation and the hidden Markov measure field (HMMF) model (Marroquin et al., 2003). The HMMF model adds a continuous-valued measure field estimation step to the classical Markov Random Field (MRF). The application of the measure field provides a smooth cost function that is simple and efficient to optimize, improving the MRF by removing difficulties with local minima and oscillating behavior. Also, the measure field captures the classification uncertainty by reducing the weight for points with uncertain classifications in the field cliques.

The method assumes a roughly spherical shape for tumors, and that in general, the intensity distributions of tumors and the adjacent tissue do not necessarily follow any particular statistical distribution. A multivolume approach is also presented for using all the available image data.

The developed method was evaluated using two sets of patient data, the publicly available data set of LTS08 and a data set that consisted of pre-operative images of patients treated with RFA. Also, a novel framework of creating artificial data with ground truth segmentations was developed. The artificial data is used for analyzing performance with different levels of contrast.

Following this introduction, the developed method and the data used for training and evaluation are described in Section 2. The evaluation results are reported in Section 3, and Section 4 concludes the paper with a discussion.


2. Methods

2.1. Segmentation task formulation

This general formulation follows the outline of the original HMMF model (Marroquin et al., 2003). Some significant modifications have also been introduced, most importantly in the probability distribution $P(q)$ of the measure field $q$, and in the non-parametric intensity distribution estimates that are kept static in the segmentation process.

Let the observed image be $I$. The segmentation task consists of finding the label field $f$ that maximizes the posterior probability $P(f|I)$. Let $\Omega$ represent the image domain, with $r \in \Omega$ representing image points (voxels). Then $f(r) \in Z_M = \{1, \dots, M\}$, where $M$ is the number of different classes for the segmentation task. For the purposes of liver tumor segmentation, $M = 2$ (see discussion in Section 2.2).

The label field $f$ is found in two steps. The first step consists of generating a Markov random vector field $q$ with distribution

$$P(q) = \frac{Q(q)}{K} \exp\left[ S_D(q) - \sum_{C} W_C(q) \right], \qquad (1)$$

where $Q(q)$ is a class-dependent prior probability function, $K$ is a positive normalizing constant, $S_D(q)$ is a shape prior dependent on the input data $D$, $C$ are the cliques of a given neighborhood system, and $W_C$ are potential functions. The shape prior $S_D$ is similar to the approach introduced in Flach and Schlesinger (2008). The $M$-dimensional vector $q(r)$ is also constrained by

$$\sum_{k=1}^{M} q_k(r) = 1, \quad q_k(r) \ge 0, \qquad (2)$$

where $q_k(r)$ is the $k$th component of $q(r)$. Here the constraint becomes $q_1(r) + q_2(r) = 1$; $q_1, q_2 \ge 0$.

The label field $f$ is generated from $q$ in the second step, each $f(r)$ being an independent sample from the distribution $q(r)$, with:

$$P(f|q) = \prod_{r \in \Omega} q_{f(r)}(r),$$

where the component $q_{f(r)}(r)$ of vector $q(r)$ corresponds to class $f(r)$.

For finding the optimal estimator $q^*$ for the vector field in the first step, the MAP estimator is computed by

$$q^* = \arg\max_{q} P(q|I),$$

with the constraint (2) applied. Using the Bayes rule, the posterior distribution $P(q|I)$ is defined as:

$$P(q|I) = \frac{1}{R} P(I|q) P(q), \qquad (3)$$

where $R$ is a positive normalizing constant. The conditional distribution is defined as (see Marroquin et al. (2003) for proof):

$$P(I|q) = \prod_{r \in \Omega} \sum_{k=1}^{M} P(I(r) \mid f(r) = k) \, q_k(r). \qquad (4)$$

For brevity, we denote the observation likelihood functions as $P(I(r) \mid f(r) = k) = v_k(r)$ and $P(I(r) \mid f(r)) = v(r)$. The sum term in (4) can then be expressed as $v(r) \cdot q(r)$, or $v_1(r) q_1(r) + v_2(r) q_2(r)$ for $M = 2$.

Combining (1), (3), and (4) results in

$$P(q|I) = \frac{1}{KR} \exp[-U(q)],$$

where

$$U(q) = -\sum_{r \in \Omega} \log(v(r) \cdot q(r)) - \log(Q(q)) - S_D(q) + \sum_{C} W_C(q). \qquad (5)$$

As $1/KR > 0$, $q^*$ is found simply by computing the minimum of $U(q)$. The details of this process are presented in Section 2.5.

After obtaining the MAP estimator $q^*$, the optimal estimator $f^*$ for the label field $f$ is found by maximizing $P(f \mid q = q^*, I)$. This is done by finding the mode for each $q^*(r)$:

$$f^*(r) = \arg\max_{k} q_k^*(r). \qquad (6)$$

With $M = 2$, this amounts to assigning each point to the class with the larger of $q_1^*(r)$ and $q_2^*(r)$.

2.2. Method overview

The segmentation is performed in four stages:

1. Preprocessing and user input
2. Estimation of observation likelihood functions
3. HMMF segmentation
4. Post-processing

The first stage involves input from the user, after which all the subsequent stages are performed automatically. Here, the stages are described briefly, with a more thorough presentation in the following sections. The process is illustrated in Fig. 1.

Figure 1: Stages of the segmentation method

Given an image or multiple images of a patient as input to the method, the user selects two points indicating the location of the tumor. The method then performs the preprocessing steps based on the user input.

The second stage uses the image data and regions defined in the previous stage to estimate intensity distributions for the segmentation classes. Using these estimates, each image point $r$ is then assigned a likelihood value $v_k(r)$, indicating how probable such an observation $I(r)$ is for each class $k$. The result is passed on to the third stage of the method.

The third stage performs the actual segmentation using the formulation presented in Section 2.1.

The final post-processing stage modifies the segmentation objects by removing overflown sections. This is done by comparing the tumor object shape with the center of the tumor as defined by user input. The output of the post-processing stage is the final segmentation.

A segmentation in two classes was selected in the implementation, since the number of actual tissue types around the tumor is unknown in general. Two classes simply indicate whether a point is part of the tumor or not. The nonparametric intensity distribution estimates are very useful in cases with several tissue types around the tumor.

In the following, $k = 1$ represents the class corresponding to the tumor. The stages of the method are illustrated with an example in Fig. 2.

All of the parameter values presented here were selected based on results from training data and used for all evaluation data. The training data was provided by the LTS08 competition.

2.3. Preprocessing and user input

From image $I$, the user selects the axial slice view where the tumor appears the largest. Then, the user selects two points on opposite edges of the tumor, so that if a line were drawn between the points, it would pass approximately through the center of the tumor as observed in the slice view. Let the selected points be $l_1$ and $l_2$.

The next step is to construct a ROI $\Omega_R \subset \Omega$. The segmentation is performed only for points $r \in \Omega_R$. In addition, training samples $T_k \subset \Omega_R$ need to be determined for each class $k$ for estimating the observation likelihood functions in the next step.

Using the user-defined points, the following variables are determined:

1. ROI center $r_c = \frac{1}{2}(l_1 + l_2)$
2. tumor radius $d_T = \frac{1}{2}|l_1 - l_2|$
3. ROI radius $d_{ROI} = \max(1.5 d_T, d_{min})$

The ROI radius is limited to at least $d_{min} = 8$ mm, to prevent it from becoming too small for very small tumors. In addition, the width of the ROI edge $d_e$ is assigned a value of 1.5 mm.
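The three quantities above can be sketched in a few lines. This is only an illustration: the function name `roi_from_points` and the assumption that the input points are given in millimetres are not from the paper.

```python
import math

D_MIN_MM = 8.0   # minimum ROI radius d_min stated in the text
D_EDGE_MM = 1.5  # ROI edge width d_e stated in the text

def roi_from_points(l1, l2):
    """Return (r_c, d_T, d_ROI) for two user points (assumed to be in mm)."""
    r_c = tuple(0.5 * (a + b) for a, b in zip(l1, l2))  # midpoint of the two points
    d_t = 0.5 * math.dist(l1, l2)                       # half the point distance
    d_roi = max(1.5 * d_t, D_MIN_MM)                    # ROI radius with lower bound
    return r_c, d_t, d_roi

# Hypothetical input: two points 12 mm apart in the same axial slice.
r_c, d_t, d_roi = roi_from_points((10.0, 20.0, 30.0), (10.0, 32.0, 30.0))
```

For very small tumors the lower bound takes over: two points 2 mm apart give a tumor radius of 1 mm but an ROI radius of 8 mm.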

To simplify notation, let $x = (x_1, x_2, x_3)$ represent coordinate points with origin at $r_c$, so that $x = r - r_c$. The terms $x_1$, $x_2$, and $x_3$ are coordinates along the sagittal, coronal, and axial axis, respectively. Then $\Omega_R$ is defined as a sphere, with center at $r_c$ and radius $d_{ROI}$:

$$\Omega_R = \{ x \in \Omega \mid x_1^2 + x_2^2 + x_3^2 \le d_{ROI}^2 \}.$$

The training samples $T_1$ are determined as the set of points within an ellipsoidal area centered at $r_c$:

$$T_1 = \left\{ x \in \Omega_R \;\middle|\; \frac{{x'_1}^2}{a_1^2} + \frac{{x'_2}^2}{a_2^2} + \frac{x_3^2}{a_2^2} \le 1 \right\},$$

where the coordinates $x'_1$ and $x'_2$ are obtained by rotating $x_1$ and $x_2$ around the $x_3$ axis with center of rotation at $r_c$, so that the long axis of the ellipsoid passes through the input points $l_1$ and $l_2$. The variables $a_1$ and $a_2$ are assigned values depending on the tumor radius: $a_1 = 2 d_T$ and $a_2 = 0.8 d_T$.

The training samples $T_2$ are determined as the set of points at the ROI edge as defined by $d_e$:

$$T_2 = \{ x \in \Omega_R \mid d(x, r_c) \ge d_{ROI} - d_e \},$$

where $d(x, r_c)$ is the Euclidean distance between $x$ and $r_c$.

Figure 2: Main stages of the method illustrated with an example: a) User input and ROI construction, with the following markers: outer ring for ROI border, ellipsoid for sampling area of tumor training data, x-markers for input points, small circle for $r_c$, b) observation likelihood function for class 1, c) observation likelihood function for class 2, d) measure field MAP estimate $q^*$, e) axial slice visualization of the segmentation result, f) 3D visualization of the segmentation result

An example of the selected points, the resulting ROI and the training sample regions is visualized in Fig. 2(a).

The chosen approach for user interaction and intensity distribution estimation is similar to the solutions used by Smeets et al. (2009) and Moltz et al. (2008). In all of these methods, the intensity distributions are estimated directly from the image using the location information provided by the user. Also, all of the methods use a region of interest (ROI) or maximal radius to restrict the segmentation.

2.4. Estimation of observation likelihood functions

As the class intensity distributions do not necessarily follow a specific statistical distribution, a non-parametric estimation method is used. An estimate $\hat{v}_k(r)$ for $v_k(r)$ is obtained separately for each class $k$ using the Parzen windows method (Parzen, 1962). Given the training data $T_k$ for class $k$, there are $n_k$ training samples $T_k(t_i)$, $i = 1, \dots, n_k$ at points $t \in T_k$. Then, $\hat{v}_k(r)$ is defined as:

$$\hat{v}_k(r) = \frac{1}{h n_k} \sum_{i=1}^{n_k} K\left( \frac{I(r) - T_k(t_i)}{h} \right), \qquad (7)$$

where $h$ is a free parameter controlling the smoothness of the estimate, and $K$ is a Gaussian kernel:

$$K(x) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[ -\frac{(x - \mu)^2}{2\sigma^2} \right].$$

The variables $\sigma^2$ and $\mu$ are the variance and the mean of the kernel, respectively. Here we use values $\sigma^2 = 1$ and $\mu = 0$, so (7) can be written as

$$\hat{v}_k(r) = \frac{1}{h n_k \sqrt{2\pi}} \sum_{i=1}^{n_k} \exp\left[ -\frac{(I(r) - T_k(t_i))^2}{2h^2} \right].$$

An empirically chosen value $h = 2.2$ is used here.

The above likelihood estimation is used only for points having intensity values higher than 0, since lower values do not usually occur in tumors in CT images (Dowsett et al., 1998). Points $r$ with intensities of 0 or less, i.e. $I(r) \le 0$, are assigned likelihood values of $v_1(r) = 0$ (tumor class) and $v_2(r) = v_m$, where $v_m$ is the highest likelihood value assigned to any point for class 2.
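A minimal sketch of the Parzen estimate (7) with the unit-variance Gaussian kernel and of the rule for non-positive intensities may help here. The function names and the sample intensities are illustrative assumptions; `v_max` stands for the value $v_m$, which the caller is assumed to have computed beforehand.

```python
import math

H = 2.2  # kernel bandwidth h reported in the text

def parzen_likelihood(intensity, samples, h=H):
    """Parzen-window estimate of the likelihood of `intensity` given training samples."""
    norm = 1.0 / (h * len(samples) * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-((intensity - t) ** 2) / (2.0 * h * h)) for t in samples)

def class_likelihoods(intensity, tumor_samples, background_samples, v_max):
    """Return (v_1, v_2); intensities <= 0 get v_1 = 0 and v_2 = v_max as in the text."""
    if intensity <= 0:
        return 0.0, v_max
    return (parzen_likelihood(intensity, tumor_samples),
            parzen_likelihood(intensity, background_samples))

# Hypothetical training intensities for the tumor and background regions.
tumor, background = [58.0, 60.0, 62.0], [98.0, 100.0, 102.0]
v1, v2 = class_likelihoods(61.0, tumor, background, v_max=0.1)
```

With these made-up samples an intensity of 61 is far more likely under the tumor class than under the background class.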

2.5. Segmentation with HMMF model

To find the MAP estimate $q^*$, the minimum of $U(q)$ (5) is computed. Here the terms of $U(q)$ are defined and the process is described in detail.

The class-dependent prior probability $Q(q)$ is used to add sensitivity to the segmentation method by giving a lower prior probability to class 2 (not tumor). This way, in uncertain cases a point is more likely to be classified as part of the tumor than as background. However, it was noted that if the tumor contrast is very low, adding sensitivity may cause poor results. For this reason, an adaptive $Q$ was used, based on the overlap of the intensity distribution estimates.

The function is defined as:

$$Q(q) = \prod_{r \in \Omega_R} \sum_{k=1}^{M} \gamma_k q_k(r),$$

where the weights $\gamma_k$ control the prior probability for each class $k$. The used values were $\gamma_1 = 1$ and $\gamma_2 = 1.08 - 0.55\alpha$, where $\alpha$ is a measure of separation of the two intensity distribution estimates:

$$\alpha = \frac{1}{2} \int_{r} |v_1(r) - v_2(r)|. \qquad (8)$$

The value of $\alpha$ varies from 0 for identical distributions, to 1 for completely separated distributions.
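The adaptive weight can be illustrated numerically. The sketch below assumes that the integral in (8) is evaluated over intensity values, with both density estimates sampled on the same regular grid and approximated by a Riemann sum; the function names are hypothetical.

```python
def separation(v1, v2, step):
    """Half the integrated absolute difference of two densities sampled with spacing `step`."""
    return 0.5 * step * sum(abs(a - b) for a, b in zip(v1, v2))

def gamma2(alpha):
    """Class-2 weight gamma_2 = 1.08 - 0.55 * alpha from the text."""
    return 1.08 - 0.55 * alpha

# Two toy densities on a grid with unit spacing.
identical = separation([0.5, 0.5], [0.5, 0.5], step=1.0)  # no separation
disjoint = separation([1.0, 0.0], [0.0, 1.0], step=1.0)   # complete separation
```

Under these assumptions $\gamma_2$ ranges from 1.08 for identical distributions down to 0.53 for fully separated ones, and drops below $\gamma_1 = 1$ once $\alpha$ exceeds roughly 0.145, so the extra sensitivity toward the tumor class is only applied when the classes are reasonably separable.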

Next, the shape prior $S_D$ is defined. It takes advantage of the location information provided by the user, essentially indicating that an area around the center $r_c$ should be segmented as the tumor and the region at the ROI edge should not be a part of the tumor. The shape prior is defined as:

$$S_D(q) = \lambda_s \sum_{r \in \Omega_R} \sum_{k \in Z_M} s_k(r, D) \, q_k(r),$$

where $\lambda_s = 3.0$ is a weighting constant. The function $s$ for class $k = 1$ is defined inside the ROI as:

$$s_1(r, D) = \begin{cases} 1 - (1 + a_D(r))^{-1}, & \text{if } d(r, r_c) < d_{ROI} - d_e \\ -1, & \text{otherwise} \end{cases}$$

with

$$a_D(r) = \exp\left[ -\beta \left( \frac{d(r, r_c)}{d_T} - d_s \right) \right],$$

where $\beta = 20$ controls the slope of the function and $d_s$ is a modifier parameter controlling the size of the center region with respect to tumor radius $d_T$, chosen here as $d_s = 0.55$. For class 2, $s_2(r) = -s_1(r)$.

The function $s$ takes the form of a logistic function scaled with the tumor radius $d_T$. For class $k = 1$, $s$ is close to a value of one at the ROI center $r_c$, and approaches zero when the distance to $r_c$ is larger than $d_T d_s$. At the ROI edge, it is given a value of $-1$. An example of the shape prior is shown in Fig. 3.

Figure 3: Example of shape prior $s_1$, with input points shown with x-markers
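To make the behavior of $s_1$ concrete, the following sketch evaluates it as a function of the distance to $r_c$. The function name and the example radii ($d_T = 6$ mm, $d_{ROI} = 9$ mm) are illustrative assumptions; the constants are the ones given in the text.

```python
import math

BETA, D_S, D_EDGE = 20.0, 0.55, 1.5  # beta, d_s and d_e from the text

def shape_prior_s1(dist, d_t, d_roi):
    """s_1 at a point whose distance to the ROI center is `dist` (lengths in mm)."""
    if dist >= d_roi - D_EDGE:
        return -1.0                       # ROI edge: pushed away from the tumor class
    a = math.exp(-BETA * (dist / d_t - D_S))
    return 1.0 - 1.0 / (1.0 + a)          # logistic profile scaled by the tumor radius

center = shape_prior_s1(0.0, d_t=6.0, d_roi=9.0)
midpoint = shape_prior_s1(3.3, d_t=6.0, d_roi=9.0)   # dist = d_s * d_T
at_radius = shape_prior_s1(6.0, d_t=6.0, d_roi=9.0)
edge = shape_prior_s1(8.0, d_t=6.0, d_roi=9.0)
```

In this example the prior is almost 1 at the center, passes through 0.5 at a distance of $d_s d_T = 3.3$ mm, is practically 0 at the tumor radius, and equals $-1$ in the edge shell.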

The potential functions $W_C$ enforce the smoothness of $q$. Here, pairwise cliques $C$ in the 26-neighborhood of each point $r$ are used to compute the potentials. The potential functions are defined as:

$$W_{r_1 r_2} = \lambda_W \exp\left[ -\frac{d(r_1, r_2)^2}{2\sigma_W^2} \right] \sum_{k=1}^{M} (q_k(r_1) - q_k(r_2))^2,$$

where $d(r_1, r_2)$ is the Euclidean distance between the two points, $\sigma_W = 1.5$ is the standard deviation of the exponential term used to modify the weights of the neighboring points and $\lambda_W = 20$ is a weighting constant.
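As an illustration, the distance-dependent weights of the 26 neighbors can be tabulated once and reused for every point. The sketch assumes that distances are measured in the units of the voxel spacing (unit spacing by default), which the text does not state; the function names are hypothetical.

```python
import itertools
import math

LAMBDA_W, SIGMA_W = 20.0, 1.5  # weighting constant and standard deviation from the text

def neighbor_weights(spacing=(1.0, 1.0, 1.0)):
    """Map each of the 26 neighbor offsets to lambda_W * exp(-d^2 / (2 sigma_W^2))."""
    weights = {}
    for offset in itertools.product((-1, 0, 1), repeat=3):
        if offset == (0, 0, 0):
            continue  # a point is not its own neighbor
        d2 = sum((o * s) ** 2 for o, s in zip(offset, spacing))
        weights[offset] = LAMBDA_W * math.exp(-d2 / (2.0 * SIGMA_W ** 2))
    return weights

def pair_potential(q_a, q_b, weight):
    """Pairwise potential: weight times the squared difference of the measure vectors."""
    return weight * sum((a - b) ** 2 for a, b in zip(q_a, q_b))

weights = neighbor_weights()
```

With unit spacing, face neighbors receive a weight of about 16.0, edge neighbors about 12.8, and corner neighbors about 10.3, so closer neighbors are smoothed more strongly.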

The function $U$ is minimized using the gradient descent optimization method. After this, the label field $f$ is found as described in (6).
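The minimization can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes a 1D chain of points instead of the 3D 26-neighborhood, made-up likelihood values, a fixed class-2 weight of 0.8, a fixed step size, and it enforces constraint (2) by writing $q_2 = 1 - q_1$ and clipping $q_1$ to $[0, 1]$ (projected gradient descent).

```python
import math

LAMBDA_S, LAMBDA_W, SIGMA_W = 3.0, 20.0, 1.5
W = LAMBDA_W * math.exp(-1.0 / (2.0 * SIGMA_W ** 2))  # neighbors one unit apart

def energy(q1, v1, v2, s1, g1, g2):
    """U(q) of Eq. (5) for M = 2 on a chain, with q2 = 1 - q1 and s2 = -s1."""
    u = 0.0
    for i, q in enumerate(q1):
        u -= math.log(v1[i] * q + v2[i] * (1.0 - q))  # likelihood term
        u -= math.log(g1 * q + g2 * (1.0 - q))        # class prior term
        u -= LAMBDA_S * s1[i] * (2.0 * q - 1.0)       # shape prior term
        if i + 1 < len(q1):
            u += W * 2.0 * (q - q1[i + 1]) ** 2       # pairwise smoothness term
    return u

def gradient(q1, v1, v2, s1, g1, g2):
    """Derivative of the energy above with respect to each q1[i]."""
    grad = []
    for i, q in enumerate(q1):
        g = -(v1[i] - v2[i]) / (v1[i] * q + v2[i] * (1.0 - q))
        g -= (g1 - g2) / (g1 * q + g2 * (1.0 - q))
        g -= 2.0 * LAMBDA_S * s1[i]
        if i > 0:
            g += 4.0 * W * (q - q1[i - 1])
        if i + 1 < len(q1):
            g += 4.0 * W * (q - q1[i + 1])
        grad.append(g)
    return grad

def minimize(v1, v2, s1, g1=1.0, g2=0.8, step=1e-3, iters=5000):
    """Projected gradient descent starting from the uninformative field q1 = 0.5."""
    q1 = [0.5] * len(v1)
    for _ in range(iters):
        grad = gradient(q1, v1, v2, s1, g1, g2)
        q1 = [min(1.0, max(0.0, q - step * g)) for q, g in zip(q1, grad)]
    return q1

# Toy chain: tumor-like likelihoods in the middle, shape prior +1 near the
# center and -1 at both ends (all values are illustrative).
n, c = 41, 20
v1 = [0.15 if abs(i - c) <= 6 else 0.02 for i in range(n)]
v2 = [0.02 if abs(i - c) <= 6 else 0.15 for i in range(n)]
s1 = [1.0 if abs(i - c) <= 3 else (-1.0 if abs(i - c) >= 19 else 0.0) for i in range(n)]
q1 = minimize(v1, v2, s1)
labels = [1 if q >= 0.5 else 2 for q in q1]  # Eq. (6) for M = 2
```

The step size is chosen small relative to the curvature of the smoothness term so that the energy decreases steadily; a real 3D implementation would need its own step-size choice and stopping criterion.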

2.6. Post-processing

Structures adjacent to the tumor with similar intensity distributions may cause the segmentation to overflow outside the tumor. This often happens through a narrow passage, and results in a segmentation object that has a 'handle' attached to the spherical tumor mass. These handles are removed in the post-processing stage, along with any objects that were classified as tumor, but not attached to the actual tumor object.

The handle removal is based on comparing two different distance values of points from the ROI center $r_c$. The first value is the Euclidean distance $d_E(r, r_c)$, and the second one is a weighted distance value $d_W(r, r_c)$, which approximates the distance to be traveled inside the segmentation object to connect the two points. After obtaining the two values, the difference is observed. If the difference is large, it can be deduced that $r$ belongs to a handle, since there is no direct path from $r_c$ to $r$ when advancing inside the segmentation object.

The implementation of this is done using the Fast Marching Method (Sethian, 1996), so that a front with a spatially varying speed function $F(r)$ is advanced starting from the center point $r_c$. The passing time $t_r$ of the front at each point $r$ is assigned as its distance value $d(r, r_c)$.

To do this, the Eikonal equation is solved: $F(r) |\nabla d(r, r_c)| = 1$, where $\nabla d(r, r_c)$ is the gradient of the distance function. To compute the Euclidean distance, $F(r) = 1, \forall r \in \Omega_R$. For computing the weighted distance, $F(r) = 1$ if $f(r) = 1$ and $F(r) = 0.1$ otherwise. In the case of the weighted distance, the advance of the front is significantly slower outside the segmentation object. Using the distance values, a probability volume $\mu$ is then computed: $\mu(r) = \exp[-(d_E(r, r_c) - d_W(r, r_c))^2]$ if $f(r) = 1$, and $\mu(r) = 0$ otherwise. The final segmentation $f'$ is then found by thresholding, so that if $\mu(r) > 0.8$, then $f'(r) = 1$, otherwise $f'(r) = 0$.
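The idea can be tried out on a small 2D grid. Note that the sketch below swaps the Fast Marching Method for Dijkstra's algorithm on an 8-connected grid, which only approximates the Eikonal solution; both distances are computed with the same solver so that they remain comparable. Function names and the toy mask are assumptions.

```python
import heapq
import math

def travel_time(speed, start):
    """Dijkstra arrival times on an 8-connected grid; a step costs length / speed."""
    rows, cols = len(speed), len(speed[0])
    dist = [[math.inf] * cols for _ in range(rows)]
    dist[start[0]][start[1]] = 0.0
    heap = [(0.0, start)]
    while heap:
        d, (i, j) = heapq.heappop(heap)
        if d > dist[i][j]:
            continue  # stale queue entry
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                ni, nj = i + di, j + dj
                if (di or dj) and 0 <= ni < rows and 0 <= nj < cols:
                    nd = d + math.hypot(di, dj) / speed[ni][nj]
                    if nd < dist[ni][nj]:
                        dist[ni][nj] = nd
                        heapq.heappush(heap, (nd, (ni, nj)))
    return dist

def remove_handles(mask, center, slow=0.1, threshold=0.8):
    """Keep object points whose weighted and unweighted distances nearly agree."""
    rows, cols = len(mask), len(mask[0])
    d_e = travel_time([[1.0] * cols for _ in range(rows)], center)
    d_w = travel_time([[1.0 if m else slow for m in row] for row in mask], center)
    return [[bool(mask[i][j]) and
             math.exp(-(d_e[i][j] - d_w[i][j]) ** 2) > threshold
             for j in range(cols)] for i in range(rows)]

# Toy object: a disk with a corridor that leaves radially and then bends sideways.
n, c = 21, (10, 10)
disk = [[(i - 10) ** 2 + (j - 10) ** 2 <= 16 for j in range(n)] for i in range(n)]
mask = [row[:] for row in disk]
for i in range(2, 6):
    mask[i][10] = True   # straight radial part of the corridor
for j in range(10, 17):
    mask[2][j] = True    # part that bends away from the radial direction
cleaned = remove_handles(mask, c)
```

In this toy case the disk is kept and the bent part of the corridor is removed, because reaching it from the center inside the object takes a detour. The straight radial part survives, since the path inside the object is as short as the direct one.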

The operations of the post-processing stage are illustrated in Fig. 4 using an artificial example. The figure shows that the handles are removed in the process, while leaving the spherical object and the protrusions in the bottom part intact.

Fig. 5 illustrates the post-processing stage using a real CT volume example. The figure shows a large tumor, where the segmentation has slightly overflown. The overflown region and a disconnected object are clearly seen in the 3D object of Fig. 5(b). The post-processing operation removes the handle and the disconnected object while keeping the tumor segmentation intact.

2.7. Multiphase segmentation

To incorporate information from multiple CECT phase images, all of the available images were registered to a common coordinate system. The registration was performed with an in-house registration program, which was similar to the IRTK software (Rueckert et al., 1999) with a few modifications.

The native CT image of each patient was selected as a common target and all CECT images of the same patient were registered pair-wise with the target. The positional differences were corrected with a rigid transformation model and local deformations with a B-spline model (Rueckert et al., 1999; Rohlfing et al., 2003). During the non-rigid registration, the control point distance was hierarchically refined from isotropic 40 mm to 20 mm. These values were chosen based on a previous study (Rohlfing et al., 2004), where similar transformation models were successfully used to estimate respiratory motion in the liver.

As the energy function, the negative of normalized mutual information (Studholme et al., 1999) was used:

$$E = -\frac{H(I_S) + H(I_T)}{H(I_S, I_T)},$$

where $H(I)$ is the marginal entropy and $H(I_S, I_T)$ the joint entropy for source $I_S$ and target $I_T$ images. For non-rigid registration, smoothness (Wahba, 1990; Rueckert et al., 1999) and incompressibility (Rohlfing et al., 2003) constraints were tested but not used, since they did not result in any increase of registration accuracy.
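A small sketch of this energy for already discretized intensities is given below. It assumes that both images are passed as equally long flat sequences of binned intensity values taken at corresponding points; how the registration program bins or interpolates intensities is not described in the text, and the function names are hypothetical.

```python
import math
from collections import Counter

def entropy(counts, total):
    """Shannon entropy (natural logarithm) of a histogram given as a Counter."""
    return -sum(c / total * math.log(c / total) for c in counts.values())

def nmi_energy(source, target):
    """E = -(H(S) + H(T)) / H(S, T) for two sequences of discrete intensities."""
    n = len(source)
    h_s = entropy(Counter(source), n)
    h_t = entropy(Counter(target), n)
    h_st = entropy(Counter(zip(source, target)), n)  # joint histogram
    return -(h_s + h_t) / h_st

aligned = nmi_energy([0, 1, 2, 3], [0, 1, 2, 3])    # identical images
unrelated = nmi_energy([0, 0, 1, 1], [0, 1, 0, 1])  # statistically independent
```

The energy reaches its minimum of $-2$ for identical images and rises to $-1$ for independent ones, so minimizing it drives the images toward alignment. A constant image has zero joint entropy and would need special handling.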

The energy in rigid registration was minimized with the downhill simplex method (Nelder and Mead, 1965), and in the non-rigid case with the conjugate gradient method (Press, 2007). For both optimization methods an implementation from Press (2007) was used. To speed up the computation, two-level multiresolution optimization was used. We used isotropic voxel dimensions of 1.0 and 2.0 mm$^3$ for the fine and coarse resolution levels, respectively. After registration, the CECT images were transferred and resampled to the target domain with the computed transformations. Resampling was performed with trilinear interpolation.

The segmentation process mostly remains the same as in the single-volume case, with the only difference being how the observation likelihood functions are estimated. The user can use any image for providing input, but this should be done using the one with the highest tumor contrast. The subsequent steps in the preprocessing and user input stage remain the same as in the single-volume case.

The estimation of observation likelihood functions is initially done separately for each image. Then, a separation measure of the intensity distribution estimates (8) for each image is computed. The separation measure is used as a weighting factor for a joint estimate. With this weighting factor, images with high contrast are weighted more and subsequently given more influence on the resulting segmentation.

Figure 4: Post-processing operation illustrated with an artificial example, where tumor center $r_c$ indicated with x-marker: a) spherical binary object with protrusions at the bottom and an overflown section at the top, b) weighted distance $d_W(r, r_c)$, c) probability volume $\mu(r)$, and d) final segmentation $f'$

Figure 5: Example of a post-processing result: a) segmentation overlaid on slice image before post-processing, b) 3D view of segmentation object before post-processing, c) slice view after post-processing and d) 3D view after post-processing

Let $N$ be the number of available images and $v_k^{I_i}(r)$ the intensity distribution estimate generated from image $I_i$ for class $k$. Then the joint estimate can be expressed as:

$$v_k(r) = \frac{\sum_{i=1}^{N} \alpha_i v_k^{I_i}(r)}{\sum_{i=1}^{N} \alpha_i},$$

where $\alpha_i$ is the separation measure (8) for image $I_i$.

Using all three images adds robustness to the result and may prevent overflow occurring with the single phase result. An example of the effect is displayed in Fig. 6, showing a single phase segmentation with slight overflow that is corrected with the multiphase segmentation. In this example, the multiphase segmentation result is less sensitive than the single phase alternative.
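The joint estimate defined above is a weighted average, which can be written directly. The function name and the numbers are illustrative; the sketch handles one class at one image point.

```python
def joint_likelihood(per_image_values, alphas):
    """Separation-weighted average of per-image likelihood values for one class."""
    total = sum(alphas)  # assumed non-zero, i.e. at least one image separates the classes
    return sum(a * v for a, v in zip(alphas, per_image_values)) / total

# Hypothetical likelihoods of one point in three phase images and their separations.
joint = joint_likelihood([0.2, 0.1, 0.0], [0.5, 0.25, 0.25])
```

An image whose two intensity distributions overlap completely has $\alpha_i = 0$ and therefore contributes nothing to the joint estimate.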

The multiphase segmentation requires only slightly more computation than the single-phase approach, since the most computationally expensive operation of estimating the MAP field remains the same. However, the registration step involves a high computational cost, making the multiphase segmentation impractical unless the registered image volumes are already available. In our target application of RFA treatment planning, the images are registered for other purposes of the treatment planning system.

2.8. Evaluation measures and data

In the evaluation of the method, the following five measures were computed by comparing each segmentation with its reference segmentation (see Deng and Du (2008)):

I) Volumetric overlap error [%] (100% minus the percentage of points in the union of the two segmentations that also lie in their intersection) (OE)
II) Relative absolute volume difference [%] (VD)
III) Average symmetric surface distance [mm] (SD)
IV) Root mean square (RMS) symmetric surface distance [mm] (RD)
V) Maximum symmetric surface distance [mm] (MD)

For each measure, a value of 0 corresponds to an exact match with the reference segmentation and all are larger than or equal to zero. The average human rater variability has been reported for the LTS08 data as (Deng and Du, 2008): I) 12.94%, II) 9.64%, III) 0.40 mm, IV) 0.72 mm, and V) 4.0 mm.

Figure 6: Example of multiphase segmentation effect: a) slightly overflown segmentation result using single portal vein phase image, b)-d) multiphase segmentation result using respective portal vein, arterial and native images.
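The two volumetric measures in the list above can be sketched as follows. The definitions used here (intersection over union for OE, difference relative to the reference volume for VD) are the ones commonly used in segmentation challenges and are an assumption as far as this text is concerned; segmentations are represented as sets of voxel indices.

```python
def overlap_error(seg, ref):
    """Volumetric overlap error in percent: 100 * (1 - |A and B| / |A or B|)."""
    return 100.0 * (1.0 - len(seg & ref) / len(seg | ref))

def volume_difference(seg, ref):
    """Relative absolute volume difference in percent, relative to the reference."""
    return 100.0 * abs(len(seg) - len(ref)) / len(ref)

# Two toy segmentations of equal volume, shifted by one voxel.
seg = {(0, 0, 0), (0, 0, 1), (0, 0, 2)}
ref = {(0, 0, 1), (0, 0, 2), (0, 0, 3)}
oe, vd = overlap_error(seg, ref), volume_difference(seg, ref)
```

The example shows why both measures are reported: a shifted segmentation of the correct size has zero volume difference but a 50% overlap error.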

The first evaluation was made with artificial data, which provided a ground truth segmentation for reference. It also enabled controlling the tumor contrast.

Due to the autocorrelation of noise in CT images, it is difficult to construct an artificial image that resembles a real CT image of a liver with a tumor. For this reason, artificial data was created by altering the intensities of a region of the parenchyma from a real native CT image, to imitate the appearance of a tumor.

An artificial tumor object was created by constructing a small volume including a spherical region in the middle with value $-1$, and value 0 outside. All the generated artificial tumors were of the same size with a diameter of 2.0 cm, resembling the size of a small tumor, typical for RFA treatment. Gaussian filtering with standard deviation 1.5 was performed on this object to simulate the point spread of the imaging device. The artificial tumor object is shown in Fig. 7(b).

Two regions from separate CT images were then extracted (see example in Fig. 7(a)). Five locations representing tumor center points were selected from both regions, bringing the total to ten locations. A single artificial tumor sample was generated by multiplying the artificial tumor object with the desired contrast value and then adding the object to one of the ten locations. An example result is shown in Fig. 7(c). This way, ten artificial tumors were generated for each contrast level, one at each location. Four different contrast levels (in Hounsfield units) were used: 20, 15, 10, and 7.5.
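The generation procedure can be sketched with NumPy. Several details are assumptions rather than facts from the text: isotropic 1 mm voxels (so that a radius of 10 voxels gives the 2.0 cm diameter), a filter standard deviation expressed in voxels, a Gaussian kernel truncated at four standard deviations, and a uniform stand-in volume in place of real liver tissue.

```python
import numpy as np

def gaussian_smooth(volume, sigma):
    """Separable Gaussian smoothing with a truncated, normalized kernel."""
    radius = int(np.ceil(4 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    for axis in range(volume.ndim):
        # mode="same" keeps the line length as long as it is not shorter than the kernel
        volume = np.apply_along_axis(
            lambda line: np.convolve(line, kernel, mode="same"), axis, volume)
    return volume

def tumor_object(size, radius, sigma=1.5):
    """Sphere of value -1 on a zero background, smoothed to imitate point spread."""
    grid = np.indices((size,) * 3) - (size - 1) / 2.0
    sphere = -(np.sum(grid ** 2, axis=0) <= radius ** 2).astype(float)
    return gaussian_smooth(sphere, sigma)

def add_tumor(image, obj, corner, contrast):
    """Add contrast * obj to a copy of `image`, with the object placed at `corner`."""
    out = image.astype(float).copy()
    z, y, x = corner
    s = obj.shape[0]
    out[z:z + s, y:y + s, x:x + s] += contrast * obj
    return out

obj = tumor_object(size=31, radius=10)
liver = np.full((40, 40, 40), 60.0)  # uniform stand-in for healthy tissue
sample = add_tumor(liver, obj, corner=(4, 4, 4), contrast=15.0)
```

Because the object is negative, the artificial tumor appears darker than the surrounding tissue by the chosen contrast at its center and blends smoothly into the background at its border.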

Figure 7: Example of generation of artificial evaluation data: a) a section of healthy liver tissue from CT data, b) a spherical tumor object, and c) sum of image and tumor object with contrast of 15 Hounsfield units.

The second data set used for evaluation was the publicly available LTS08 competition data, which has already been used for evaluation of several other methods (Deng and Du, 2008). The data set included training data of four images with 10 reference tumor segmentations, and test data with a total of 13 images. A total of 20 tumors were segmented from the test data. The reference segmentations for the test data were not available to the authors, and the evaluation was conducted by sending the final segmentations to the competition organizer, who provided the evaluation measures and scores. The segmentations were given scores on a scale from 0 to 100, with 100 corresponding to an exact match, and 90 points corresponding to a segmentation with error values equal to interobserver variability in manual delineations.
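The scoring scale can be illustrated with a short sketch. The text fixes only two anchor points (100 for an exact match, 90 for an error equal to the interobserver variability) and a scale that ends at 0; the linear map between them used below is an assumption, and `metric_score` is a hypothetical name rather than the organizers' code.

```python
def metric_score(error, reference_error):
    """Assumed linear score: 100 at zero error, 90 at the reference error, floored at 0."""
    if reference_error <= 0:
        raise ValueError("reference_error must be positive")
    return max(0.0, 100.0 - 10.0 * abs(error) / reference_error)


# Interobserver reference values reported for the LTS08 data (Deng and Du, 2008).
REFERENCE_ERRORS = {
    "overlap_error_pct": 12.94,
    "volume_difference_pct": 9.64,
    "avg_surface_distance_mm": 0.40,
    "rms_surface_distance_mm": 0.72,
    "max_surface_distance_mm": 4.0,
}
```

Under this assumption, an error ten times the interobserver value or larger yields a score of zero.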

To set the conditions similar to other evaluations with the LTS08 data, the user was allowed to modify the segmentation input if the output of the method was not satisfactory. The modification was done simply by choosing input points from another location. The modified input provided the method with a different set of training data, giving an alternate result. Only the final segmentation was compared with the reference segmentation.


The third data set was provided by University of Leipzig, Department of Diagnostic and Interventional Radiology. The images were preoperative images of nine patients undergoing RFA treatment. Images from three phases were available for all but one patient, for whom only two images were acquired. Since the RFA treatment is generally used only for relatively small tumors, all of the tumors in this data set were smaller than 5 cm in diameter. Patient 1 had three tumors, and the rest had a single tumor, bringing the total to 11 tumors. Three of the tumors had previously been treated with transcatheter arterial chemoembolization (TACE). The volumes had axial dimensions of 512 × 512 points with an in-plane resolution between 0.68 mm and 0.89 mm. The resolution between the axial image slices was between 2 mm and 3 mm. The error values were computed by comparing the segmentations produced by the method to manual segmentations.
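As an illustration of how two of the relative error values can be computed against a manual segmentation, the sketch below uses the commonly used definitions of volumetric overlap error and absolute relative volume difference. The exact definitions are given in Section 2.8, which is outside this excerpt, so these formulas and the function names are assumptions. Segmentations are represented as sets of voxel index tuples.

```python
def overlap_error(seg, ref):
    """Volumetric overlap error in percent: 100 * (1 - |A & B| / |A | B|)."""
    union = len(seg | ref)
    if union == 0:
        raise ValueError("both segmentations are empty")
    return 100.0 * (1.0 - len(seg & ref) / union)


def volume_difference(seg, ref):
    """Absolute relative volume difference in percent, relative to the reference."""
    if not ref:
        raise ValueError("reference segmentation is empty")
    return 100.0 * abs(len(seg) - len(ref)) / len(ref)
```

Two segmentations of equal volume that are shifted against each other give a volume difference of zero but a non-zero overlap error, which is one reason both measures are reported.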

The average tumor contrasts for the data sets were measured by computing the absolute difference of the medians of the two training sets for each tumor and taking the average of the difference values over the data set.
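The contrast measure can be written down directly. In this minimal sketch the two training sets are assumed to be intensity samples (in Hounsfield units) from the tumor and from the surrounding tissue; the function names are hypothetical.

```python
from statistics import mean, median


def tumor_contrast(tumor_samples, background_samples):
    """Absolute difference of the medians of the two training sets."""
    return abs(median(tumor_samples) - median(background_samples))


def dataset_contrast(training_pairs):
    """Average of the per-tumor contrasts over a data set."""
    return mean(tumor_contrast(t, b) for t, b in training_pairs)
```

Using medians rather than means keeps the per-tumor value insensitive to a few outlying voxels in either training set.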

3. Results

3.1. Artificial data

The average evaluation results for the segmentations of ten tumors with different contrast levels are listed in Table 1. The results show that the method generated very accurate segmentations at the highest contrast level (20). The accuracy clearly dropped with decreasing contrast. With a contrast of 10 Hounsfield units, the accuracy was approximately the same as for the RFA patient data (Table 2).

These results can be used for estimating the reliability of the segmentation result of real tumors. For example, if an average overlap error of less than 30% is desired, the tumor contrast should be more than 10 Hounsfield units. However, it should be noted that the artificial samples did not include any heterogeneity that is often present in patient data, and that the shape used is especially suitable for the segmentation method. For these reasons, the results could be interpreted as an upper-bound estimate for the method performance, rather than a measure of the average accuracy.
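The rule of thumb above can be made concrete by interpolating the average overlap errors of Table 1. The sketch below assumes piecewise-linear behavior between the four tabulated contrast levels, which the paper does not claim; `contrast_for_overlap_error` is a hypothetical helper.

```python
# (contrast in Hounsfield units, average overlap error in %) from Table 1.
TABLE1_OE = [(7.5, 38.65), (10.0, 31.47), (15.0, 21.13), (20.0, 15.12)]


def contrast_for_overlap_error(target_oe, table=TABLE1_OE):
    """Contrast at which the interpolated average overlap error equals target_oe."""
    pts = sorted(table)
    for (c0, e0), (c1, e1) in zip(pts, pts[1:]):
        if e1 <= target_oe <= e0:
            return c0 + (c1 - c0) * (e0 - target_oe) / (e0 - e1)
    raise ValueError("target outside the tabulated range")
```

For a 30% target this gives roughly 10.7 Hounsfield units, in line with the "more than 10" stated in the text. Because the artificial data represents an upper bound on performance, real tumors would be expected to need more contrast than this estimate.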

3.2. LTS08 data

A summary of the LTS08 data evaluation results is shown together with the Leipzig data results in Table 2. The scores received in the evaluation are illustrated with a boxplot in Fig. 8. The evaluation included all 20 tumors of the test data. The training data evaluation is not included in the results.

Table 1: Evaluation result averages for artificial data, using four different contrast levels for 10 artificial tumors. For descriptions of error measures, see Section 2.8.

Cont.   OE (%)   VD (%)   SD (mm)   RD (mm)   MD (mm)
20      15.12    12.33    0.44      0.62      2.13
15      21.13    16.50    0.64      0.84      2.99
10      31.47    26.09    1.02      1.22      3.24
7.5     38.65    35.53    1.32      1.50      3.33

Table 2: Evaluation results for the LTS08 and Leipzig data sets, including multiphase (Leipzig M.) and single-phase (Leipzig S.) segmentation results. For descriptions of error measures, see Section 2.8.

                        OE (%)   VD (%)   SD (mm)   RD (mm)   MD (mm)
LTS08        Mean       30.35    23.53    1.87      2.43      8.09
             Std. dev.  11.03    13.97    1.17      1.41      4.49
             Worst      53.62    51.69    4.75      5.87      19.39
             Best       15.23    0.67     0.43      0.71      2.60
Leipzig M.   Mean       28.59    17.85    0.86      1.20      4.65
             Std. dev.  5.94     10.60    0.33      0.45      2.41
             Worst      35.83    30.48    1.54      2.15      9.52
             Best       15.52    2.00     0.61      0.85      2.32
Leipzig S.   Mean       29.60    17.75    0.89      1.24      5.12
             Std. dev.  5.61     11.40    0.31      0.42      2.75
             Worst      36.66    37.18    1.56      2.14      11.06
             Best       17.72    2.05     0.58      0.78      2.20

The average tumor contrast for the test data set was 39 Hounsfield units. The post-processing stage left five of the segmentations unaltered and only four of the segmentations were altered by more than 1% in volume, with the maximum change being 8.12%. These four tumors were among the five largest tumors of the data set. In six cases, the user chose to have the tumor resegmented. Four of the tumors were resegmented once and two of them twice. The resegmentation was done by choosing the input points at another location. Only the final segmentation was evaluated with the reference segmentation.

The average total score ± standard deviation in the evaluation was 70.3 ± 14.3 points, with a median of 73 points. The segmentation results for very large tumors were significantly poorer than the average results. The data set included three very large tumors, for which the mean score was 52.7. For these three tumors, the relative error measures (overlap error and volume difference) were close to the average, but the surface distance errors were very high, the mean average symmetric surface distance being 3.51 mm, for example. This was partly caused by the heterogeneous appearance of the largest tumors. In addition, the largest tumors often occupy regions close to the liver border, making overflow to adjacent structures more likely.

Figure 8: Scores of the LTS08 data set presented as boxplots (vertical axis: score from 0 to 100; boxes: overlap error, volume difference, average surface distance, RMS surface distance, maximum surface distance, and total). Score of average interobserver variability (90) is shown with dashed line for reference.

The computation time depended greatly on the size of the tumor. The smallest tumors were computed in less than 30 seconds, but the larger ones required up to 15 minutes with the current implementation. However, the program code used for evaluation was non-optimized and used only a single processor core. The computationally most intensive part was the iterative process of finding the MAP estimate q∗, which can also be computed in parallel.

In Table 3 the evaluation results are compared with the results of the previously best-performing semi-automatic method for the LTS08 data set, and the interobserver variability. The presented method receives a higher score and lower values for all of the evaluation measures, with the exception of the relative volume difference. The previously highest score for the LTS08 data was by an interactive method (Stawiaski et al., 2008), with 70.0 points.

Table 3: Comparison of validation results and respective scores with previously best-performing semi-automatic method (Smeets et al., 2009) and manual delineations evaluated on the LTS08 data set (Deng and Du, 2008). For descriptions of error measures, see Section 2.8.

                   OE     VD     SD     RD     MD     Total
Meas.    Hame      30.3   23.5   1.9    2.4    8.1
         Smeets    32.6   17.9   2.0    2.6    10.1
         Manual    12.9   9.6    0.4    0.7    4.0
Scores   Hame      76.6   75.6   53.8   66.2   79.7   70.3
         Smeets    74.8   81.5   52.6   63.3   74.6   69.4
         Manual    90     90     90     90     90     90

3.3. Leipzig data

The segmentation was performed using first a single image and then multiple images from different phases. Both the multiphase and single phase segmentations were generated using the same user input data for each patient data set. The single phase segmentation was performed on the image observed as having the best tumor contrast. In the multiphase evaluation, the used patient data included native, arterial and portal vein phase images, except for Patient 1, for which the portal vein image was not available.

Tumor contrast was evaluated in the single phase version, and the average value for non-TACE tumors was 21 Hounsfield units. Only one of the segmentations was altered in the post-processing stage, with a volume reduction of 1.31%.

The volumes were aligned with the registration method described in Section 2.7. The registration accuracy was visually evaluated by overlaying the resampled source image on the target image. A transparency of 40-60% and an intensity window from −100 to 150 Hounsfield units were used. For one subject (Patient 5, arterial phase) a registration error of a few millimeters at the top of the liver was detected. All the other registrations were visually evaluated as successful.

The evaluation results are listed in Table 2. The surface distance error values are notably lower than for the LTS08 data set. Most of the average error values for the multiphase segmentation results are slightly lower than for the single-phase results. The worst-case results show that the multiphase version is more robust, reducing some of the highest error values. Examples of multiphase segmentation results are visualized in Fig. 9.

The average computational time for the single volume segmentation after initial user input was 33 seconds per tumor, of which 7.8 seconds were taken by the iteration of the HMMF model, on average. The large difference between the total processing time and the computationally most expensive step of the HMMF model is due to disk operations that could be removed in optimized program code. The equivalent average time taken for manual contouring was 254 seconds per tumor, or 7.7 times more than the automated method. The registration step of the multiphase version is computationally very costly and would dominate the processing time estimates. As noted above, the multiphase segmentation is impractical unless the images are registered for other purposes.

Figure 9: Examples of multiphase segmentation results for Leipzig data (panels: Patient 1c, Patient 3, Patient 4, Patient 9).

4. Discussion

The developed method was shown to provide a successful framework for liver tumor segmentation. The capabilities of the developed method were validated with a varied collection of tumors. For the LTS08 data, the method outperformed all other methods that have previously been tested on the same data set. The average overlap error was improved by 2.3 percentage points. The evaluation with the Leipzig data set showed that the method produced excellent results even for tumors with very low contrast and ambiguous borders, and the performance remained high with noisy image data.

The HMMF model enabled an effective inclusion of prior information, a spatially smooth segmentation and a computationally efficient way to find the optimal solution for the cost function. Learning intensity distributions directly from the available image data and adjusting the model accordingly proved to be a good approach, providing adaptivity and robustness.

A framework for creating artificial evaluation data was also presented. The samples were made to resemble tumors treated with RFA. Artificial data with ground truth segmentations provided a reliable estimate of the method upper-bound performance with different contrast levels. With a contrast of 20 Hounsfield units, the average overlap error was 15.12%, and with a contrast of 10 Hounsfield units it was 31.47%.

Extremely good results were obtained for the Leipzig data set, which included only relatively small tumors. The average contrast of non-TACE tumors was only 21 Hounsfield units, about half of the LTS08 data set value. The surface distance error measures for the Leipzig data set were significantly lower than for the LTS08 data. For relative measures, the error values were similar between the data sets, since relative measures are sensitive for small objects. It was noted that the method performance deteriorates with larger tumors and high levels of heterogeneity. The training data used for the non-parametric intensity distribution estimation may not represent the heterogeneous tumors sufficiently in all cases.

The multiphase segmentation results with the RFA patient data were slightly better on average than the single volume results. However, such small differences in the average errors did not conclusively indicate the superiority of the multiphase approach, even though there were notable differences in the individual results. In this work, the method parameter values were selected based on the single volume training data, and the multiphase version might have performed better if the parameter values were optimized for it. The multiphase method seemed to add some robustness to the process.

The developed method has many parameters, but only a few of them are significant for optimizing method performance. The important ones are the weights and the parameters controlling the adaptive prior in the cost function used for MAP estimation. The remainder of the parameters control basic functionalities, and the method is relatively insensitive to their values. A drawback of the method is its iterative nature, which causes a relatively high computational cost for large tumors.

The edge of the ROI poses a hard limit for the segmentation area. This may cause an undersegmentation if the shape of the tumor deviates significantly from a sphere and its longest axis is close to perpendicular to the image plane used for selecting the input points. In the conducted evaluation, the ROIs were sufficiently large to include the shape variation present in the data set. However, for ellipsoidal tumors the input method should be modified to allow more freedom in determining the two input points in 3D space. This would not require any modification to the actual segmentation method.

The introduced post-processing method provided an effective approach for removing overflown regions, but was not able to entirely eliminate the erroneous area in all cases. In most cases, the post-processing had very little effect on the segmentation result; in particular, only one of the Leipzig data segmentations was altered. This indicates that the main segmentation method is robust, especially for relatively small tumors. In the presented framework, it could be possible to include the shape analysis of the post-processing stage in the HMMF model. This way, the post-processing stage would be unnecessary and the whole process would be included in the optimized cost function of the model.

The presented method is best suited for small and medium-sized liver tumors, for which the segmentation accuracy is high and computational cost remains modest. The method performs reliably even for tumors with low contrast, high levels of noise and ambiguous borders. These traits and the reduction in expensive manual labor make the method ideal for RFA treatment planning.

Acknowledgements

We would like to thank Dr. Xiang Deng of Siemens Ltd. China for data evaluations, our research partners Dr. Daniel Seider and Dr. Michael Moche of University of Leipzig, Department of Diagnostic and Interventional Radiology for patient data, Bernhard Kainz and Judith Muehl of Graz University of Technology, Institute for Computer Graphics and Vision for technical assistance, Mikko Lilja of Aalto University for proofreading, Prof. Dr. Tuomas Hame of VTT for comments, as well as our other partners: Medical University of Graz; Fraunhofer Gesellschaft, Institute for Applied Information Technology FIT; University of Oxford, Institute of Biomedical Engineering; and NUMA Engineering Services Ltd.

References

Baron, R., 1994. Understanding and optimizing use of contrast material for CT of the liver. American Journal of Roentgenology 163, 323–331.

Deng, X., Du, G., 2008. Editorial: 3D Segmentation in the Clinic: A Grand Challenge II – Liver Tumor Segmentation. MICCAI Workshop Proceedings.

Dowsett, D., Kenny, P., Johnston, R., 1998. The Physics of Diagnostic Imaging. Chapman & Hall Medical, London.

Flach, B., Schlesinger, D., 2008. Combining shape priors and MRF-segmentation. Structural, Syntactic, and Statistical Pattern Recognition, 177–186.

Freiman, M., Eliassaf, O., Taieb, Y., Joskowicz, L., Sosna, J., 2008. A bayesian approach for liver analysis: Algorithm and validation study. Medical Image Computing and Computer-Assisted Intervention–MICCAI 2008, 85–92.

Friedman, S., Grendell, J., McQuaid, K., 2003. Current Diagnosis & Treatment in Gastroenterology. McGraw-Hill Medical.

Gazelle, G., Goldberg, S., Solbiati, L., Livraghi, T., 2000. Tumor Ablation with Radio-frequency Energy. Radiology 217, 633.

Halvorsen, R., Korobkin, M., Ram, P., Thompson, W., 1982. CT appearance of focal fatty infiltration of the liver. American Journal of Roentgenology 139, 277.

Hann, L., Winston, C., Brown, K., Akhurst, T., 2000. Diagnostic imaging approaches and relationship to hepatobiliary cancer staging and therapy. Journal of Surgical Oncology 19, 94–115.

Jolly, M., Grady, L., 2008. 3D general lesion segmentation in CT, in: IEEE ISBI, pp. 796–799.

Kanai, T., Hirohashi, S., Upton, M., Noguchi, M., Kishi, K., Makuuchi, M., Yamasaki, S., Hasegawa, H., Takayasu, K., Moriyama, N., et al., 1987. Pathology of small hepatocellular carcinoma. A proposal for a new gross classification. Cancer 60, 810–819.

Li, Y., Hara, S., Shimura, K., 2006. A machine learning approach for locating boundaries of liver tumors in CT images, in: Proc. ICPR, pp. 400–403.

Marroquin, J., Santana, E., Botello, S., 2003. Hidden Markov measure field models for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 25, 1380–1387.

Moltz, J., Bornemann, L., Dicken, V., Peitgen, H., 2008. Segmentation of liver metastases in CT scans by adaptive thresholding and morphological processing, in: Workshop on 3D Segmentation in the Clinic: A Grand Challenge II. Liver Tumor Segmentation Challenge. MICCAI, New York, USA.

Nelder, J., Mead, R., 1965. The downhill simplex method. Computer Journal 7, 308.

Parzen, E., 1962. On estimation of a probability density function and mode. The Annals of Mathematical Statistics 33, 1065–1076.

Press, W., 2007. Numerical recipes: the art of scientific computing. Cambridge University Press.

Rohlfing, T., Maurer Jr, C., Bluemke, D., Jacobs, M., 2003. Volume-preserving nonrigid registration of MR breast images using free-form deformation with an incompressibility constraint. IEEE Transactions on Medical Imaging 22.

Rohlfing, T., Maurer Jr, C., O'Dell, W., Zhong, J., 2004. Modeling liver motion and deformation during the respiratory cycle using intensity-based nonrigid registration of gated MR images. Medical Physics 31, 427.

Rueckert, D., Sonoda, L., Hayes, C., Hill, D., Leach, M., Hawkes, D., 1999. Nonrigid registration using free-form deformations: application to breast MR images. IEEE Transactions on Medical Imaging 18.

Sethian, J., 1996. A fast marching level set method for monotonically advancing fronts. Proceedings of the National Academy of Sciences of the United States of America 93, 1591.


Smeets, D., Loeckx, D., Stijnen, B., De Dobbelaer, B., Vandermeulen, D., Suetens, P., 2009. Semi-Automatic Level Set Segmentation of Liver Tumors combining a Spiral Scanning Technique with Supervised Fuzzy Pixel Classification. Medical Image Analysis.

Stawiaski, J., Decencière, E., Bidault, F., 2008. Interactive liver tumor segmentation using graph cuts and watershed, in: Workshop on 3D Segmentation in the Clinic: A Grand Challenge II. Liver Tumor Segmentation Challenge. MICCAI, New York, USA.

Studholme, C., Hill, D., Hawkes, D., 1999. An overlap invariant entropy measure of 3D medical image alignment. Pattern Recognition 32, 71–86.

Wahba, G., 1990. Spline models for observational data. Society for Industrial Mathematics.

Zhou, J., Wong, D., Ding, F., Venkatesh, S., Tian, Q., Qi, Y., Xiong, W., Liu, J., Leow, W., 2010. Liver tumour segmentation using contrast-enhanced multi-detector CT data: performance benchmarking of three semiautomated methods. European Radiology, 1–11.


- Method produces accurate segmentations of liver tumors from low-quality data
- Only minimal user interaction is required
- Highest score for benchmark data set, with average overlap error of 30.35%
- Multiphase segmentation uses several images and adds robustness to the method
- Novel post-processing method removes extraneous regions
