
Deep Active Lesion Segmentation
Ali Hatamizadeh1*, Assaf Hoogi2*, Debleena Sengupta1, Wuyue Lu1, Brian Wilcox2, Daniel Rubin2†, and Demetri Terzopoulos1†

1 Computer Science Department, University of California, Los Angeles, CA, USA
2 Department of Biomedical Data Science, Stanford University, Stanford, CA, USA

∗Co-primary authorship, †Co-senior authorship

Motivation

Lesion segmentation is an important problem in computer-assisted diagnosis that remains challenging due to the prevalence of low contrast and irregular boundaries that are unamenable to shape priors. We introduce Deep Active Lesion Segmentation (DALS), which effectively leverages the strengths of convolutional neural networks (CNNs) and active contour models (ACMs) to create a fully automated segmentation framework.

Figure 1: Segmentation comparison

Overview / Methodology

DALS: Best of Both Worlds

Results / Discussion

How Does it Work?

• An input image is fed into the encoder-decoder to localize the lesion and produce a segmentation probability map Y_prob that specifies the probability that any point (x,y) lies in the interior of the lesion.

• The Transformer converts Y_prob to a Signed Distance Map, Y_SDM, which initializes the level-set ACM, and estimates the parameter functions λ1(x,y) and λ2(x,y) in the energy functional (1).

• During training, Y_prob and the ground truth map Y_gt are fed into a Dice loss function and the error is back-propagated accordingly.

• During inference, a forward pass through the framework produces a final SDM, which is converted back into a probability map, producing the final segmentation map Y_out (see the sketch after this list).
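The sketch below illustrates this pass in Python/NumPy. It is a minimal illustration, not the authors' released code: encoder_decoder and levelset_acm are hypothetical stand-ins for the CNN backbone and the ACM refinement, the probability-to-SDM conversion uses a Euclidean distance transform, and the λ maps follow the formulas given with the energy functional below.

```python
# Minimal sketch of the DALS inference pass and Dice loss, under the
# assumptions stated above (not the authors' released implementation).
import numpy as np
from scipy.ndimage import distance_transform_edt


def prob_to_sdm(y_prob, threshold=0.5):
    """'Transformer' step: turn a probability map into a signed distance map.

    Sign convention (an assumption here): positive inside the lesion,
    negative outside.
    """
    mask = y_prob > threshold
    return distance_transform_edt(mask) - distance_transform_edt(~mask)


def dice_loss(y_prob, y_gt, eps=1e-6):
    """Soft Dice loss between Y_prob and the ground truth map Y_gt (training)."""
    inter = np.sum(y_prob * y_gt)
    return 1.0 - (2.0 * inter + eps) / (np.sum(y_prob) + np.sum(y_gt) + eps)


def dals_inference(image, encoder_decoder, levelset_acm):
    """Forward pass: CNN -> probability map -> SDM + parameter maps -> ACM."""
    y_prob = encoder_decoder(image)                   # lesion probability map Y_prob
    lam1 = np.exp((2.0 - y_prob) / (1.0 + y_prob))    # per-pixel parameter maps
    lam2 = np.exp((1.0 + y_prob) / (2.0 - y_prob))
    phi = levelset_acm(image, prob_to_sdm(y_prob), lam1, lam2)  # contour evolution
    return (phi > 0).astype(np.float32)               # final segmentation map Y_out
```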

How Does it Perform?

Figure 3: Comparison of the output segmentation of our DALS (red) against the U-Net [9] (yellow) and ground truth (green) segmentations

MLS Dataset

The dataset captures variation in:
• Organ types (brain, liver, lung)
• Size
• Modality (MR, CT)
• Lesion shapes

The dataset will be publicly released.

Figure 2: The DALS Framework: Multiscale Encoder-decoder + Level Set ACM

Table 1: MLS dataset statistics. GC: Global Contrast; GH: Global Heterogeneity.

• Boundary Delineation: The DALS segmentation contours conform well to the irregular shapes of the lesion boundaries (Fig. 3), since the learned parameter maps, λ1(x,y) and λ2(x,y), provide the flexibility to accommodate boundary irregularities. DALS performs well for different image characteristics, including low contrast and heterogeneous lesions.

• Parameter functions: The learned maps λ1(x,y) and λ2(x,y) act as an attention mechanism (see Fig. 5).

Conclusion

The DALS framework includes a custom encoder-decoder that feeds a level-set ACM with per-pixel parameter functions. We evaluated our framework on the challenging task of lesion segmentation with a new dataset, MLS, which includes a variety of images of lesions of various sizes and textures in different organs, acquired through multiple imaging modalities. Our results affirm the effectiveness of our DALS framework.

Multiscale Encoder-Decoder

Level-Set ACM With Parameter Functions

• The encoder consists of dense blocks that concatenate the feature maps from layers 0 to l-1 before feeding them to successive dense blocks.

• The last dense block in the encoder is fed into a custom multiscale dilation block. This, along with the dense connectivity, assists in capturing local and global context for highly accurate lesion localization (a sketch of such a block follows below).
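For intuition, here is a short PyTorch sketch of a multiscale dilation block. It is an assumption-laden illustration rather than the authors' architecture: the rates 2, 4, 8, and 16 shown in Fig. 2 are taken to be dilation rates of parallel 3×3 convolutions whose outputs are concatenated, and the channel widths are placeholders.

```python
# Illustrative PyTorch sketch of a multiscale dilation block, assuming the
# rates 2, 4, 8, 16 in Fig. 2 are dilation rates of parallel 3x3 convolutions
# whose outputs are concatenated; channel widths are illustrative.
import torch
import torch.nn as nn


class MultiscaleDilationBlock(nn.Module):
    def __init__(self, in_ch, out_ch, rates=(2, 4, 8, 16)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                # padding = dilation keeps the spatial size of a 3x3 conv unchanged.
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=r, dilation=r),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])

    def forward(self, x):
        # Each branch sees a different receptive field; concatenating them mixes
        # local and global context before the decoder.
        return torch.cat([branch(x) for branch in self.branches], dim=1)


# Example: a block with 64-channel branches maps [N, 64, H, W] to [N, 256, H, W].
block = MultiscaleDilationBlock(in_ch=64, out_ch=64)
```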

Organ | Modality | # Samples | Mean GC | Var GC | Mean GH | Var GH | Lesion Radius (pixels)
Brain | MRI | 369 | 0.56 | 0.029 | 0.901 | 0.003 | 17.42 ± 9.516
Lung | CT | 87 | 0.315 | 0.002 | 0.901 | 0.004 | 15.15 ± 5.777
Liver | CT | 112 | 0.825 | 0.072 | 0.838 | 0.002 | 20.48 ± 10.37
Liver | MRI | 164 | 0.448 | 0.041 | 0.891 | 0.003 | 5.459 ± 2.027

The DALS framework benefits from:
• A novel multiscale encoder-decoder CNN that learns an initialization probability map and parameter maps for the ACM.
• An improved level-set ACM formulation with a per-pixel-parameterized energy functional.

Brain MR
Model | Dice | CI | Hausdorff | CI | BoundF
U-Net | 0.776 ± 0.214 | 0.090 | 2.988 ± 1.238 | 0.521 | 0.826
CNN Backbone | 0.824 ± 0.193 | 0.078 | 2.755 ± 1.216 | 0.490 | 0.892
Manual Level-set | 0.796 ± 0.095 | 0.038 | 2.927 ± 0.992 | 0.400 | 0.841
DALS | 0.888 ± 0.076 | 0.030 | 2.322 ± 0.824 | 0.332 | 0.944

Lung CT
Model | Dice | CI | Hausdorff | CI | BoundF
U-Net | 0.817 ± 0.098 | 0.080 | 2.289 ± 0.650 | 0.530 | 0.898
CNN Backbone | 0.822 ± 0.115 | 0.094 | 2.254 ± 0.762 | 0.622 | 0.900
Manual Level-set | 0.789 ± 0.078 | 0.064 | 3.270 ± 0.553 | 0.451 | 0.879
DALS | 0.869 ± 0.113 | 0.092 | 2.095 ± 0.623 | 0.508 | 0.937

Liver MR
Model | Dice | CI | Hausdorff | CI | BoundF
U-Net | 0.769 ± 0.162 | 0.093 | 1.645 ± 0.598 | 0.343 | 0.920
CNN Backbone | 0.805 ± 0.193 | 0.110 | 1.347 ± 0.671 | 0.385 | 0.939
Manual Level-set | 0.739 ± 0.102 | 0.056 | 2.227 ± 0.576 | 0.317 | 0.954
DALS | 0.894 ± 0.036 | 0.036 | 1.298 ± 0.434 | 0.239 | 0.987

Liver CT
Model | Dice | CI | Hausdorff | CI | BoundF
U-Net | 0.698 ± 0.149 | 0.133 | 4.422 ± 0.969 | 0.866 | 0.662
CNN Backbone | 0.801 ± 0.178 | 0.159 | 3.813 ± 1.701 | 1.600 | 0.697
Manual Level-set | 0.765 ± 0.039 | 0.034 | 3.153 ± 0.825 | 0.737 | 0.761
DALS | 0.846 ± 0.090 | 0.081 | 3.113 ± 0.747 | 0.667 | 0.773

Table 1: Segmentation metrics for model evaluations. CI denotes the confidence interval.

Figure 4: Box and whisker plots of Hausdorff (left) and Dice (right) scores

We evaluate our lesion segmentation model on a new Multiorgan Lesion Segmentation (MLS) dataset. Our results demonstrate favorable performance compared to competing methods, especially for small training datasets.

Figure 5 panels: (a) Labeled Image, (b) Level-set, (c) DALS, (d) λ1(x,y), (e) λ2(x,y)

Link to code repository

Figure 5: The contribution of the parameter functions was validated by comparing DALS (c) against a manually initialized level-set ACM with scalar parameter constants (b). The learned maps λ1(x,y) and λ2(x,y) serve as an attention mechanism that provides additional degrees of freedom for the contour to adjust itself precisely to regions of interest.

The level-set ACM minimizes the energy functional

E(\phi) = \int_\Omega \delta_\epsilon(\phi(x,y,t)) \Big[ \mu\,|\nabla\phi(x,y,t)| + \int_{W_s} F(\phi(u,v,t))\, du\, dv \Big]\, dx\, dy,   (1)

where \mu penalizes the length of the contour (we set \mu = 0.1) and the energy density is

F(\phi) = \lambda_1(u,v)\,(I(u,v) - m_1(x,y))^2\, H_\epsilon(\phi(u,v,t)) + \lambda_2(u,v)\,(I(u,v) - m_2(x,y))^2\,\big(1 - H_\epsilon(\phi(u,v,t))\big),

with I the image intensity, m_1(x,y) and m_2(x,y) the mean intensities inside and outside the contour within the local window W_s, and H_\epsilon and \delta_\epsilon smoothed Heaviside and Dirac functions. The per-pixel parameter functions are derived from the CNN's probability map:

\lambda_1(x,y) = \exp\!\left(\frac{2 - Y_{prob}(x,y)}{1 + Y_{prob}(x,y)}\right), \qquad \lambda_2(x,y) = \exp\!\left(\frac{1 + Y_{prob}(x,y)}{2 - Y_{prob}(x,y)}\right).
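For intuition, the NumPy sketch below shows one gradient-descent step on the data term of this energy. It is an illustrative sketch, not the authors' solver: it assumes φ > 0 inside the contour, takes m1 and m2 as local means inside and outside the contour within the window W_s, and uses an illustrative smoothed Heaviside, window size, and step size.

```python
# Illustrative NumPy sketch of one gradient-descent step on the data term of
# the energy above (not the authors' solver). Assumptions: phi > 0 inside the
# contour; m1/m2 are local means inside/outside within the window W_s; the
# smoothed Heaviside, window size, and step size are illustrative choices.
import numpy as np
from scipy.ndimage import uniform_filter


def heaviside(phi, eps=1.0):
    """Smoothed Heaviside: ~1 where phi > 0 (inside the contour)."""
    return 0.5 * (1.0 + (2.0 / np.pi) * np.arctan(phi / eps))


def dirac(phi, eps=1.0):
    """Derivative of the smoothed Heaviside."""
    return (eps / np.pi) / (eps ** 2 + phi ** 2)


def acm_step(image, phi, lam1, lam2, window=15, dt=0.5):
    h = heaviside(phi)
    # Local mean intensities inside (m1) and outside (m2) the contour,
    # computed over the window W_s centered at each pixel.
    m1 = uniform_filter(image * h, window) / (uniform_filter(h, window) + 1e-8)
    m2 = uniform_filter(image * (1.0 - h), window) / (uniform_filter(1.0 - h, window) + 1e-8)
    # Pointwise descent on the data term; the length term weighted by mu = 0.1
    # (a curvature flow) is omitted for brevity.
    force = -lam1 * (image - m1) ** 2 + lam2 * (image - m2) ** 2
    return phi + dt * dirac(phi) * force
```

Note that with the formulas above, λ1 is largest where Y_prob ≈ 0 (about e² versus e^0.5 where Y_prob ≈ 1), while λ2 behaves the opposite way, so the two maps emphasize complementary regions, consistent with the attention interpretation in Fig. 5.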

• We compared the DALS framework to a manually initialized level-set ACM with scalar λ parameters, U-Net, and DALS's standalone backbone CNN.
• DALS achieves superior accuracy under all metrics and on all datasets (Table 1).

Figure 2 legend: Input; Dense Block (series of 3×3 Conv + BN + ReLU); 2×2 Conv + BN + ReLU; 2×2 Maxpool; 2×2 Avg Pool Transition; 2×2 Transpose Conv + BN; Multiscale Dilation Block (rates 2, 4, 8, 16); Concatenation; Sigmoid; Transformer; Level-set ACM. Feature-map shapes run from [256, 256, 1] at the input down to [8, 8, 1176] at the bottleneck and back up to [256, 256, 1] at the output; intermediate quantities include Y_prob, Y_SDM, Y_gt, Y_out, λ1(x,y), and λ2(x,y).

Panel labels for the qualitative comparisons (Figs. 1 and 3): (a) Brain MR, (b) Liver MR, (c) Liver CT, (d) Lung CT; rows: Ground Truth, DALS Output, U-Net Output.


Figure 4 legend: models CNN Backbone, DALS, Level-set, and U-Net, grouped by dataset (Brain, Liver CT, Liver MR, Lung); the Dice axis spans 0.2–1.0 and the Hausdorff axis spans 0–7.