A Semiautomated Deep Learning Approach for Pancreas

Research ArticleA Semiautomated Deep Learning Approach forPancreas Segmentation

Meixiang Huang 1 Chongfei Huang1 Jing Yuan2 and Dexing Kong 1

113e School of Mathematical Sciences Zhejiang University Hangzhou 310027 China213e School of Mathematics and Statistics Xidian University Xirsquoan 710069 China

Correspondence should be addressed to Dexing Kong dxkongzjueducn

Received 25 April 2021 Revised 28 May 2021 Accepted 21 June 2021 Published 3 July 2021

Academic Editor Jialin Peng

Copyright copy 2021 Meixiang Huang et al is is an open access article distributed under the Creative Commons AttributionLicense which permits unrestricted use distribution and reproduction in any medium provided the original work isproperly cited

Accurate pancreas segmentation from 3D CT volumes is important for pancreas diseases therapy It is challenging to accuratelydelineate the pancreas due to the poor intensity contrast and intrinsic large variations in volume shape and location In thispaper we propose a semiautomated deformable U-Net ie DUNet for the pancreas segmentation e key innovation of ourproposed method is a deformable convolution module which adaptively adds learned offsets to each sampling position of 2Dconvolutional kernel to enhance feature representation Combining deformable convolution module with U-Net enables ourDUNet to flexibly capture pancreatic features and improve the geometric modeling capability of U-Net Moreover a nonlinearDice-based loss function is designed to tackle the class-imbalanced problem in the pancreas segmentation Experimental resultsshow that our proposed method outperforms all comparison methods on the same NIH dataset

1 Introduction

Pancreatic diseases are relatively hidden and difficult todetect and cure especially for pancreatic cancers which havehigh mortality rate worldwide [1] Accurate pancreas seg-mentation from 3D CT scans can provide assistance todoctors in the diagnosis of pancreas diseases such as vol-umetric measurement and analysis for diabetic patients aswell as surgical guidance for clinicians [2] However it ischallenging to segment the pancreas due to the large ana-tomical variability in pancreas position size and shapeacross patients (as shown in Figure 1) Moreover the am-biguous boundaries around the pancreas with its adjacentstructures further increase the difficulty of pancreasdelineation

Traditional methods on abdominal pancreas segmen-tation mainly have statistical shape models [3 4] or multi-atlas techniques [5 6] Wolz et al proposed a fully auto-mated method based on a hierarchical atlas registration andweighting scheme for abdominal multiorgan segmentation[6]is method was evaluated on a database of 150 CTscansand achieved Dice score of 70 for the pancreas Karasawa

et al exploited the vasculature around the pancreas to betterselect atlases for pancreas segmentation [7] is methodwas evaluated on 150 abdominal CT scans and obtained anaverage Dice score of 785 However the performance ofatlas-based methods highly relies on the selection of atlasesand the accuracy of the image registration algorithm Aboveall it is difficult to select atlases that are general enough tocover all variabilities in the pancreas across differentpatients

Convolutional networks [8 9] have achieved greatsuccess in medical image segmentation which also boost theperformance of pancreas segmentation U-Net [10] a se-mantic segmentation architecture attracted great attentionsfrom researchers by exploiting multilevel feature fusion eskip connections in U-Net are used to incorporate high-resolution low-level feature maps from the encoding branchinto the decoding branch of U-Net to alleviate the importantinformation loss caused by successive downsampling andthen refine and recover target details Namely using skipconnections to fuse multilevel feature tensors can effectivelylocalize and segment target organs [11] Many works [12ndash14]have demonstrated that U-Net is a good framework for

HindawiJournal of Healthcare EngineeringVolume 2021 Article ID 3284493 10 pageshttpsdoiorg10115520213284493

semantic segmentation tasks especially for small datasetsSince the pancreas is a small soft organ in the abdomenmost pancreas segmentation algorithms based on con-volutional neural network (CNN) provide iterative algo-rithms [15] in a coarse-to-fine manner to relieve theinterference of complex background Roth et al first pro-posed a probabilistic bottom-up coarse-to-fine approach forpancreas segmentation [16] where a multilevel deep Con-vNet model is utilized to learn robust pancreas features Twosubsequent holistically nested segmentation networks[17 18] advanced this previous work [16] Zhou et alpresented a two-stage fixed-point approach for the pancreassegmentation which utilized the predicted segmentationsfrom coarse model to localize and obtain smaller pancreasregions which were further refined by another model [14]Yu et al presented the recurrent saliency transformationnetwork to tackle the challenge of small organ segmentationwhere a saliency transformation module is utilized toconnect coarse and fine stage to realize joint optimization[19] Cai et al designed a convolutional neural networkequipped with convolutional LSTM to impose spatial con-textual consistency constraints on successive image slices[20] Cai et al [21] further improved the pancreas initialsegmentation in [20] by aggregating the multiscale low-levelfeatures and strengthened the pancreatic shape continuity bybidirection recurrent neural network (BiRNN) Liu et al [22]used superpixel-based approach to obtain coarse pancreassegmentations which were then used to train five same-architecture fully convolutional networks (FCNs) withdifferent loss functions to achieve accurate pancreas seg-mentations is method is evaluated on 82 public CTvolumes and achieved a Dice coefficient of 8410 plusmn 491Man et al [23] proposed a two-stage method composed ofdeep Q network (DQN) and deformable U-Net for thepancreas segmentation in which DQN is used to obtaincontext-adaptive coarse pancreas segmentations whichwere then input to deformable U-Net for refinement Zhuet al [24] proposed a 3D coarse-to-fine network to segmentthe pancreas is 3D method outperformed the 2D

counterpart due to the full usage of the rich spatial infor-mation along the long axial dimension Some commontechniques such as dense connection [25] residual blockand sparse convolution [26 27] are also widely utilized tosegment the pancreas

Google DeepMind proposed a spatial transformer [28]which is the first work to allow neural networks learn thetransformation matrix from data and transform featuremaps spatially Specifically spatial transformer network(STN) can globally deform feature maps through learnedtransformations such as scaling cropping rotation as wellas nonrigid deformation Recently Dai et al proposed adeformable convolution to get over the limitation of fixedreceptive field in standard convolution [29] In detailconvolutional kernel with explicit offsets learned from theprevious feature maps can adaptively change predefinedreceptive field in order to extract more target features especific deformable convolution is shown in Figure 2 inwhich some standard convolution layers are first utilized tolearn and regress the deformation displacements for eachsampling point in the image and then the learned dis-placements are added to original sampling positions of the2D convolution to enable network extract relevant and richfeatures far from original fixed neighborhood [30] Differentfrom STN [28] deformable convolution adopts a local anddense instead of global manner to warp feature mapsMoreover deformable convolution focuses on learningexplicit offset for each neuron instead of kernel weightsSince the pancreas has various scales and shapes acrosspatients and traditional convolutional kernel cannot addresswell on organs with high deformation due to the fixed re-ceptive field we believe deformable convolution is moresuitable for the task of pancreas segmentation [31]

In this paper we propose a semiautomated deformableU-Net model utilizing the power of U-Net and Deformable-ConvNets e proposed architecture for pancreas seg-mentation has two merits First deep segmentation net-works such as FCN [9] U-Net [10] and DeepLab [32] easilysuffer from confusion by the large irrelevant background

(a) (b) (c)

Figure 1 Examples of 2D CT slices with pancreas annotations (red regions) showing the highly variable shape and size of pancreas elargest area of pancreas is less than 08 of entire slice while the smallest area is less than 01 (best viewed in color)

2 Journal of Healthcare Engineering

information due to the small size of the pancreas in the entireabdominal CT volume Motivated by [14] we take a similarstrategy ie first manually shrink the size of input imageand then refine the extracted pancreas regions by the pro-posed deformable U-Net e proposed method has thecapability to extract the geometry-aware features of thepancreas with the help of deformable convolution Secondwe propose a novel loss function focal generalized Dice loss(FGDL) function to balance the size of foreground andbackground and enhance the ability of network for smallorgan segmentation A conference version of this work waspublished in ISICDM 2019 [33] In this extended version weprovide a more comprehensive description of literaturereview and detailed analysis of the proposed method andexperimental investigation e main modifications includepresenting and analyzing the difference between standardconvolution block and deformable convolution block (asshown in Figure 3) adding and analyzing the visualizationresults of the proposed DUNet (as shown in Figures 4 and 5)as well as the comparison results between the proposedDUNet and two baseline methods on the NIH dataset [34](as shown in Figure 6 and Table 1) adding more evaluationmetrics for testing the performance of the proposed DUNet(as shown in (9)ndash(11)) conducting new experiment todemonstrate the effectiveness of the proposed loss functionfor pancreas segmentation (as shown in Table 2) discussingadvantages and limitations of the proposed DUNet andadding more references

2 Materials and Methods

In this section a semiautomated deformable U-Net isproposed to segment the pancreas Our method is built uponU-Net which employed skip connections to aggregatemultiple feature maps with the same resolution from dif-ferent levels to recover the grained details lost in decoderbranch and thus strengthen the representative capability ofnetwork Since the pancreas only occupies a small fraction ofthe whole scan and the large and complex backgroundinformation tends to interfere or confuse semantic seg-mentation framework such as U-Net [10] we followedcascade-based methods [5 12 14] ie first localize targetregions and then refine the extracted regions Specifically wefirst estimate the maximum and minimum coordinates of

the pancreas to approximately locate its and then input theextracted pancreas regions to the refinement segmentationmodel to improve segmentation accuracy Here we designeda deformable U-Net (abbreviated as DUNet) as the re-finement model e key component in DUNet is de-formable convolution which can adaptively augment thesampling grid by learning 2D offsets from each image pixelaccording to the preceding feature maps Incorporatingdeformable convolution into the baseline U-Net can im-prove the geometry-aware capability of U-Net e overallstructure of the proposed method is shown in Figure 7

21 Network Architecture Our approach is an encoder-decoder structure designed for pancreas segmentation Asshown in Figures 7 and 3 the proposed architecture includesthe standard convolution block deformable convolutionblock skip connection downsampling and upsamplingConsidering that the deformable convolution block requiresa little more computing resources and the aim of deformableconvolution block is to help the network capture low-leveldiscriminative details at various shapes and scales in orderto balance the efficiency and accuracy we experimentallyapply the deformable convolution in the second and thirdlayers of U-Net Specifically we replaced the standardconvolution block of the second and third layers in theencoder as well as the counterpart layers in the decoder withdeformable convolution block Figure 3(b) shows thecomponent of deformable convolution block Concretelyeach deformable convolution block is composed of con-volutional offset layer followed by convolution layer BN[35] and ReLU layer in which convolutional offset layerplays an important role in telling U-Net how to deform andsample feature maps [36] e advantage of deformableconvolution block is to utilize changeable receptive fields toeffectively learn pancreas features with various shapes andscales

Here we describe the standard convolution and de-formable convolution in detail On the one hand thestandard 2D convolution can be seen as the weighed sumover a regular 2D sampling grid with weight W For the 3 times 3sized kernel with the dilation value of 1 (as shown inFigure 8(a)) the sampling grid G in standard convolutiondefines the receptive field size and can be given by

Whole image

Input patch Offset field Offsets Offset kernel

Deformable convolutionOutput feature map

Figure 2 Illustration of 3 times 3 deformable convolution Offset field is generated from the preceding feature maps and the number of outputchannels is 2N

Journal of Healthcare Engineering 3

G (minus1 minus1) (minus1 0) (0 1) (1 1) (1)

e value of each location p0 on the output feature mapY can be calculated as

Y p0( 1113857 1113944pnisinG

W pn( 1113857 middot X p0 + pn( 1113857(2)

where pn enumerates all locations in 2D sampling gridG Onthe other hand rather than using the predefined samplinggrid deformable convolution automatically learns offsetpn

to augment the regular sampling grid and is calculated as

Y p0( 1113857 1113944pnisinG

W pn( 1113857 middot X p0 + pn +pn( 1113857(3)

(a) (b) (c) (d)

Figure 4 Comparisons of 2D pancreas segmentations from the proposed DUNet with the manual segmentations e first second andthird columns denote the CT slices with their segmentations and bounding boxes of pancreas (red) the manual segmentations and thenetwork predictions respectively e last column denotes the overlapped maps between the network predictions and manual seg-mentations with overlapped regions marked by magenta (a) Original (b) Groundtruth (c) Prediction (d) Overlapped

Standardconvolution

block

Conv

BN

ReLU

(a)

Deformableconvolution

block

ConvOffset

Conv

BN

ReLU

(b)

Figure 3 e comparison between (a) standard convolution block and (b) deformable convolution block


(a) (b) (c)

Figure 5 Comparisons of 3D pancreas segmentations from the proposed DUNet with the manual segmentations e first second andthird columns denote the manual segmentations the network predictions and the overlapped maps between the network predictionsand manual segmentations respectively e manual segmentations are shown in red and the network predictions are shown in lightgreen (a) Label (b) Prediction (c) Overlapped

(a) (b) (c) (d) (e)

Figure 6 Comparison of segmentation results between different models on the NIH dataset (a) Original images with their segmentations andbounding boxes of pancreas (red) (b) e ground truths (cndashe) e predictions generated by our DUNet U-Net and Deformable-ConvNetrespectively


Table 1 Quantitative comparisons between the three different models on the NIH dataset Bold denotes the best

Model F-measure Recall Precision Mean DSCModified Deformable-ConvNet 08201 08084 08378 08203U-Net 08738 09010 08499 08670DUNet(Ours) 08878 08997 08898 08725

Table 2 Comparison of the DUNet with Dice loss (DL) and the proposed loss (DSC) Bold denotes the best

Method Min DSC Max DSC Mean DSCDUNet +DL 6865 9318 8629 plusmn 433DUNet + FGDL(Ours) 7703 9329 8725 plusmn 327

Input Output

Standard convolution block

deformable convolution block

Downsample

Upsample

Skip connection

Figure 7 An overview of the proposed DUNet Input data are progressively convolved and downsampled or upsampled by factor of 2 ateach scale in both encoding and decoding branches Schematic of the standard convolution block and deformable convolution block isshown in Figure 3

(a) (b) (c) (d)

Figure 8 Comparisons of the sampling points in 3 times 3 standard and deformable convolution (a) Sampling points (marked as blue) ofstandard convolution (b) Deformed sampling points (marked as red) with learned displacements (pink arrows) in deformable convolution(c-d) Two cases of (b) illustrating that the learned displacements contain translation and rotation transformations


In particular the 2D deformable convolution can bemathematically formalized as follows

WdegX( 1113857(i j) 1113944

1

mminus11113944

1

nminus1W(i j) times X i minus m + δverticleijmn j minus n + δhorizontalijmn1113872 1113873 foralli 1 H forallj 1 N (4)

where deg denotes the deformable convolution operation W isa 3times 3 kernel with pad 1 and stride 1 X is the image withheightH and width N and (i j) denotes the location of pixelin image δverticleijmn and δhorizontalijmn denote the vertical offset andthe horizontal offset respectively which are learned by ad-ditional convolution on the preceding feature maps Since thelearned offset is usually not an integer we performed bilinearinterpolation on the output of the deformable convolutionallayers to enable gradient back-propagation available

22 Loss Function Since the pancreas occupies a small re-gion relative to the large background and Dice loss is rel-atively insensitive to class-imbalanced problem mostpancreas segmentation works adopt soft binary Dice loss tooptimize pancreas segmentation and it is defined as follows

L(P G) 1 minus1113936

Ni1 pigi + ε

1113936Ni1 pi + gi + ε

minus1113936

Ni1 1 minus pi( 1113857 1 minus gi( 1113857 + ε1113936

Ni1 2 minus pi minus gi( 1113857 + ε

(5)

where gi isin 0 1 and pi isin [0 1] correspond to the probabilityvalue of a voxel in the manual annotation G and the networkprediction P respectively N and ϵ denote the total number ofvoxels in the image and numerical factor for stable trainingrespectively However Dice loss does not consider the impactof region size on Dice score To balance the voxel frequencybetween the foreground and background Sudre et al [37]proposed the generalizedDice loss which is defined as follows

GDL 1 minus 21113936

2l1 wl 1113936

Ni pligli

11139362l1 wl 1113936

Ni pli + gli

(6)

where coefficient wl 1(1113936Ni1 gli) is a weight for balancing

the size of regionPancreas boundary plays an important role in dealineating

the shape of pancreas However the pixels around theboundaries of the pancreas are hard samples which are difficultto delineate due to the ambiguous contrast with the sur-rounding tissues and organs Inspired by the focal loss [38 39]we propose a new loss function the focal generalized Dice loss(FGDL) function to alleviate class-imbalanced problem in thepancreas segmentation and allow network to concentrate thelearning on those hard samples such as boundary pixels efocal generalized Dice loss function can be defined as follows

FGDL 11139442

l11 minus 2

wl 1113936Ni pligli + ε

wl 1113936Ni pli + gli + ε

1113888 1113889

1c

(7)

where c varies in the range [1 3] We experimentally setc 43 during training

3 Experiments

31 Dataset and Evaluation We validated the performanceof our algorithm on 82 abdominal contrast-enhanced CTimages which come from the NIH pancreas segmentationdataset [34] e original size of each CT scan is 512 times 512with the number of slices from 181 to 460 as well as the slicethickness from 05mm to 10mm e image intensity ofeach scan is truncated to [minus100 240] HU to filter out theirrelevant details and further normalized to [0 1] In thisstudy we cropped each slice to [192 256] For fair com-parisons we trained and evaluated the proposed model with4-fold cross validation

Four metrics including the Dice Similarity Coefficient(DSC) Precision Recall and F-measure (abbreviated as F1)[40] are used to quantitatively evaluate the performance ofdifferent methods

(1) Dice Similarity Coefficient (DSC) measures thevolumetric overlap ratio between the ground truthsand network predictions It is defined as follows [41]

DSC 2 Vgt capVseg

Vgt

+ Vseg

(8)

(2) Precision measures the proportion of truly positivevoxels in the predictions It is defined as follows

Precision Vgt capVseg

Vseg

(9)

(3) Recall measures the proportion of positives that arecorrectly identified It is defined as follows

Recall Vgt capVseg

Vseg

(10)

(4) F-measure shows the similarity and diversity oftesting data It is defined as follows

F-measure 2 middotPrecision middot RecallPrecision + Recall

(11)

where Vgt and Vseg represent the voxel sets of manualannotations and network predictions respectively For DSCthe experimental results are all reported as the mean withstandard deviation over all 82 samples For Precision Recalland F-measure metrics we just reported the mean score overall 82 samples


32 Implementation Details e proposed method wasimplemented on the Keras and TensorFlow platforms andtrained using Adam optimizers for 10 epochs on a NVIDIATesla P40 with 24GB GPU e learning rate and batch sizewere set to 00001 and 6 for training respectively In totalthe trainable parameters in the proposed DUNet are 644 Mand the average inference time of our DUNet per volume is0143 seconds

33 Qualitative and Quantitative Segmentation ResultsTo assess the effectiveness of deformable convolution in thepancreas segmentation we compared the three modelsDeformable-ConvNet U-Net and DUNet To make theoutput size of Deformable-ConvNet to be the same as inputwe make modification on Deformable-ConvNet [29] bysubstituting the original fully connected layers withupsampling layers Figure 6 qualitatively shows the im-provements brought by deformable convolution It can beobserved that our DUNet focuses more on the details of thepancreas which demonstrates that deformable convolutioncan extract more pancreas information and enhance thegeometric recognition capability of U-Net

e quantitative comparisons of different models interms of the Precision Recall F1 and mean DSC are re-ported in Table 1 It can be observed that our DUNetoutperforms the modified Deformable-ConvNet and U-Netwith improvements of average DSC up to 522 and 055Furthermore it is worth noting that our proposed DUNetreported the highest average F-measure with 8878 whichdemonstrates that the proposed DUNet is a high-qualitysegmentation model and more robust than other two ap-proaches Figures 4 and 5 visualize the 2D and 3D overlap ofsegmentations from the proposed DUNet with respect to themanual segmentations respectively Visual inspection of theoverlapping maps shows that the proposed DUNet can fitthe manual segmentations well which further demonstratesthe effectiveness of our method

34 Impact ofLossFunction To assess the effectiveness of theproposed loss function we test standard Dice loss and theproposed loss with DUNet ie Dice loss and the proposedfocal generalized Dice loss (FGDL) the segmentation per-formance of the DUNet with different loss function is re-ported in Table 2 It can be noted that DUNet with theproposed FGDL improves mean DSC by 096 and minDSC by 838 compared with Dice loss

35 Comparison with Other Methods We compared thesegmentation performance of the proposed DUNet withseven approaches [14 16 17 21ndash24] on the NIH dataset [34]Note that the experimental results of other seven methodswere obtained directly from their corresponding literaturesAs shown in Table 3 our method achieves the min DSC of7703 max DSC of 9329 and mean DSC of8725 plusmn 327 which outperforms all comparisonmethods Moreover the proposed DUNet performed thebest in terms of both standard deviation and the worst casewhich further demonstrates the reliability of our method inclinical applications

4 Discussion

e pancreas is a very important organ in the body whichplays a crucial role in the decomposition and absorption ofblood sugar and many nutrients To handle the challenges oflarge shape variations and fuzzy boundaries in the pancreassegmentation we propose a semiautomated DUNet toadaptively learn the intrinsic shape transformations of thepancreas In fact DUNet is an extension of U-Net bysubstituting the standard convolution block of the secondand third layers in the encoder and counterpart layers in thedecoder of U-Net with deformable convolution e mainadvantage of the proposed DUNet is that DUNet utilizes thechangeable receptive fields to automatically learn the in-herent shape variations of the pancreas then extract robustfeatures and thus improve the accuracy of pancreassegmentation

ere are several limitations in this work First duringdata processing we first need radiologists to approximatelyannotate the minimum and maximum coordinates of thepancreas in each slice in order to localize it and thus reducethe interference brought by complex background is workmay be laborious Second the trainable parameters arerelatively excessive In future work we will further improvepancreas segmentation performance from two aspects Firstwe will explore and adopt attention mechanism to eliminatelocalization module and construct a lightweight networkSecond we will consider how to fuse prior knowledge (egshape constraint) to the network

5 Conclusions

In this paper we proposed a semiautomated DUNet tosegment the pancreas especially for the challenging caseswith large shape variation Specifically the deformable

Table 3 Comparison with other segmentation methods on the NIH dataset (DSC) Bold denotes the best

Method Min DSC Max DSC Mean DSCRoth et al MICCAIrsquo2015 [16] 2399 8629 7142 plusmn 1011Roth et al MICCAIrsquo2016 [17] 3411 8865 7801 plusmn 820Zhou et al MICCAIrsquo2017 [14] 6243 9085 8237 plusmn 568Cai et al 2019 [21] 5900 9100 8370 plusmn 510Liu et al IEEE access 2019 [22] NA NA 8410 plusmn 491Zhu et al 3DVrsquo2018 [24] 6962 9145 8459 plusmn 486Man et al IEEE T MED IMAGING 2019 [23] 7432 9134 8693 plusmn 492DUNet(Ours) 7703 9329 8725 plusmn 327


convolution andU-Net structure are integrated to adaptivelycapture meaningful and discriminative features en anonlinear Dice-based loss function is introduced to super-vise the DUNet training and enhance the representativecapability of DUNet Experimental results on the NIHdataset show that the proposed DUNet outperforms all thecomparison methods

Data Availability

Pancreas CT images used in this paper were from a publicavailable pancreas CT dataset which can be obtained fromhttpdoiorg107937K9TCIA2016tNB1kqBU

Disclosure

An earlier version of our study has been presented as aconference paper in the following link httpsdoiorg10114533648363364894

Conflicts of Interest

e authors declare that there are no conflicts of interest

Acknowledgments

is work was supported by the National Natural ScienceFoundation of China (Grant nos 12090020 and 12090025)and Zhejiang Provincial Natural Science Foundation ofChina (Grant no LSD19H180005)

References

[1] P Ghaneh E Costello and J P Neoptolemos ldquoBiology andmanagement of pancreatic cancerrdquo Postgraduate MedicalJournal vol 84 no 995 pp 478ndash497 2008

[2] S V DeSouza R G Singh H D Yoon R MurphyL D Plank and M S Petrov ldquoPancreas volume in health anddisease a systematic review andmeta-analysisrdquo Expert Reviewof Gastroenterology amp Hepatology vol 12 no 8 pp 757ndash7662018

[3] J J Cerrolaza R M Summers and M G Linguraru ldquoSoftmulti-organ shape models via generalized pca a generalframeworkmrdquo in Medical Image Computing And Computer-Assisted InterventionSpringer Berlin Germany 2016

[4] A Saito S Nawano and A Shimizu ldquoJoint optimization ofsegmentation and shape prior from level-set-based statisticalshape model and its application to the automated segmen-tation of abdominal organsrdquo Medical Image Analysis vol 28pp 46ndash65 2016

[5] M Oda N Shimizu H R Roth et al ldquo3D FCN feature drivenregression forest-based pancreas localization and segmenta-tionrdquo in Proceedings of the Deep Learning in Medical ImageAnalysis and Multimodal Learning for Clinical DecisionSupport pp 222ndash230 Quebec Canada September 2017

[6] R Wolz C Chu K Misawa M Fujiwara K Mori andD Rueckert ldquoAutomated abdominal multi-organ segmen-tation with subject-specific atlas generationrdquo IEEE Transac-tions on Medical Imaging vol 32 no 9 pp 1723ndash1730 2013

[7] K I Karasawa M Oda T Kitasaka et al ldquoMulti-atlaspancreas segmentation atlas selection based on vesselstructurerdquo Medical Image Analysis vol 39 pp 18ndash28 2017

[8] A Krizhevsky I Sutskever and G Hinton ldquoImageNetclassification with deep convolutional neural networksrdquoAdvances in Neural Information Processing Systems vol 2pp 1097ndash1105 2012

[9] J Long E Shelhamer and T Darrell ldquoFully convolutionalnetworks for semantic segmentationrdquo IEEE Conference onComputer Vision and Pattern Recognition vol 7-12pp 3431ndash3440 2015

[10] O Ronneberger P Fischer and T Brox ldquoU-net convolu-tional networks for biomedical image segmentationrdquo LectureNotes in Computer Science vol 9351 pp 234ndash241 2015

[11] C Lyu G Hu and D Wang ldquoHRED-net high-resolutionencoder-decoder network for fine-grained image segmenta-tionrdquo IEEE access vol 8 pp 38210ndash38220 2020

[12] A Farag L Lu H R Roth J Liu E Turkbey andR M Summers ldquoA bottom-up approach for pancreas seg-mentation using cascaded superpixels and (deep) image patchlabelingrdquo IEEE Transactions on Image Processing vol 26no 1 pp 386ndash399 2017

[13] X Li H Chen Q Dou C-W Fu and P-A Heng ldquoH-DenseUNet hybrid densely connected UNet for liver andtumor segmentation from CTvolumesrdquo IEEE Transactions onMedical Imaging vol 37 no 12 pp 2663ndash2674 2018

[14] Y Zhou L Xie W Shen Y Wang E K Fishman andA L Yuille ldquoA fixed-point model for pancreas segmentationin abdominal ct scansrdquo Medical Image Computing andComputer Assisted InterventionmdashMICCAI 2017 vol 10pp 693ndash701 2017

[15] P J Hu X Li Y Tian et al ldquoAutomatic pancreas segmen-tation in CT images with distance-based saliency-awareDenseASPP networkrdquo IEEE journal of biomedical and healthinformatics vol 25 no 5 pp 1601ndash1611 2020

[16] H R Roth L Lu A Farag et al ldquoDeepOrgan multi-leveldeep convolutional networks for automated pancreas seg-mentationrdquo in Proceedings of the Medical Image ComputingAnd Computer Assisted Intervention pp 556ndash564 MunichGermany June 2015

[17] H R Roth L Lu A Farag A Sohn and R M SummersldquoSpatial aggregation of holistically-nested networks for au-tomated pancreas segmentationrdquo Medical Image Computingand Computer-Assisted InterventionndashMICCAI 2016 vol 9901pp 451ndash459 2016

[18] H R Roth L Lu N Lay et al ldquoSpatial aggregation of ho-listically-nested convolutional neural networks for automatedpancreas localization and segmentationrdquo Medical ImageAnalysis vol 45 pp 94ndash107 2018

[19] Q Yu L Xie Y Wang et al ldquoRecurrent saliency transfor-mation network incorporating multi-stage visual cues forsmall organ segmentationrdquo in Proceedings of the IEEECVFConference on Computer Vision and Pattern Recognitionpp 8280ndash8289 Salt Lake UT USA June 2018

[20] J Cai L Lu Y Xie F Xing and L Yang ldquoImproving deeppancreas segmentation in ct and mri images via recurrentneural contextual learning and direct loss functionrdquo inMedical Image Computing And Computer-Assisted Inter-ventionSpringer Berlin Germany 2017

[21] J Cai L Lu F Xing and L Yang ldquoPancreas segmentation inCT and MRI via task-specific network design and recurrentneural contextual learningrdquo in Deep Learning and Convolu-tional Neural Networks for Medical Imaging and ClinicalInformaticsSpringer Berlin Germany 2019

[22] S Liu X Yuan R Hu S Liang and S Feng ldquoAutomaticpancreas segmentation via coarse location and ensemblelearningrdquo IEEE Access vol 8 pp 2906ndash2914 2019


[23] Y Man Y Huang J Feng X Li and F Wu ldquoDeep Q learningdriven CT pancreas segmentation with geometry-awareU-netrdquo IEEE Transactions on Medical Imaging vol 38 no 8pp 1971ndash1980 2019

[24] Z Zhu Y Xia W Shen E Fishman and A Yuille ldquoA 3Dcoarse-to-fine framework for volumetric medical imagesegmentationrdquo in Proceedings of the International Conferenceon 3D Vision pp 682ndash690 Verona Italy September 2018

[25] E Gibson F Giganti Y Hu et al ldquoTowards image-guidedpancreas and biliary endoscopy automatic multi-organ seg-mentation on abdominal CT with dense dilated networksrdquoMedical Image Computing and Computer Assisted Inter-ventionmdashMICCAI 2017 vol 10 pp 728ndash736 2017

[26] M P Heinrich M Blendowski and O Oktay ldquoTernaryNetfaster deep model inference without GPUs for medical 3Dsegmentation using sparse and binary convolutionsrdquo Inter-national Journal of Computer Assisted Radiology and Surgeryvol 13 no 9 pp 1311ndash1320 2018

[27] M P Heinrich and O Oktay ldquoBRIEFnet deep pancreassegmentation using binary sparse convolutionsrdquo MedicalImage Computing and Computer Assisted Intervention minus

MICCAI 2017 vol 435 pp 329ndash337 2017[28] M Jaderberg K Simonyan A Zisserman and

K Kavukcuoglu ldquoSpatial transformer networksrdquo Advances inNeural Information Processing Systems vol 2015 pp 2017ndash2025 2015

[29] M F Dai H Qi Y Xiong et al ldquoDeformable convolutionalnetworksrdquo in Proceedings of the IEEE International Confer-ence on Computer Vision pp 764ndash773 Venice Italy October2017

[30] Y Wang J Yang L Wang et al ldquoLight field image super-resolution using deformable convolutionrdquo IEEE Transactionson Image Processing vol 30 pp 1057ndash1071 2021

[31] S A Siddiqui M I Malik S Agne A Dengel and S AhmedldquoDeCNT deep deformable CNN for table detectionrdquo IEEEaccess vol 6 pp 74151ndash74161 2018

[32] L Chen G Papandreou I Kokkinos K Murphy andA Yuille ldquoSemantic image segmentation with deep con-volutional nets and fully connected CRFsrdquo InternationalConference on Learning Representations vol 40 2015

[33] M X Hunag C F Huang J Yuan and D X Kong ldquoFixed-point deformable U-net for pancreas CT segmentationrdquo inProceedings of the 3rd International Symposium on ImageComputing and Digital Medicine pp 283ndash287 Xian ChinaAugust 2019

[34] H R Roth A Farag E B Turkbey et al ldquoData from pancreas-CTrdquo 13e Cancer Imaging Archive vol 32 2016

[35] S Ioffe and C Szegedy ldquoBatch normalization acceleratingdeep network training by reducing internal covariate shiftrdquo inProceedings of the 32nd International Conference on MachineLearning pp 448ndash456 July 2015

[36] W Liu Y Song D Chen et al ldquoDeformable object trackingwith gated fusionrdquo IEEE Transactions on Image Processingvol 28 no 8 pp 3766ndash3777 2019

[37] C H Sudre W Li T Vercauteren S Ourselin and M JorgeCardoso ldquoGeneralised dice overlap as a deep learning lossfunction for highly unbalanced segmentationsrdquo DeepLearning in Medical Image Analysis and Multimodal Learningfor Clinical Decision Support vol 55 pp 240ndash248 2017

[38] T-Y Lin P Goyal R Girshick K He and P Dollar ldquoFocalloss for dense object detectionrdquo IEEE Transactions on PatternAnalysis and Machine Intelligence vol 42 no 2 pp 318ndash3272020

[39] N Abraham and N M Khan ldquoA novel focal tversky lossfunction with improved attention U-net for lesion segmen-tationrdquo in Proceedings of the IEEE 16th International Sym-posium on Biomedical Imaging pp 683ndash687 Venice ItalyApril 2019

[40] N Lazarevic-Mcmanus J R Renno D Makris andG A Jones ldquoAn object-based comparative methodology formotion detection based on the F-Measurerdquo Computer Visionand Image Understanding vol 111 no 1 pp 74ndash85 2008

[41] L R Dice ldquoMeasures of the amount of ecologic associationbetween speciesrdquo Ecology vol 26 no 3 pp 297ndash302 1945


semantic segmentation tasks especially for small datasetsSince the pancreas is a small soft organ in the abdomenmost pancreas segmentation algorithms based on con-volutional neural network (CNN) provide iterative algo-rithms [15] in a coarse-to-fine manner to relieve theinterference of complex background Roth et al first pro-posed a probabilistic bottom-up coarse-to-fine approach forpancreas segmentation [16] where a multilevel deep Con-vNet model is utilized to learn robust pancreas features Twosubsequent holistically nested segmentation networks[17 18] advanced this previous work [16] Zhou et alpresented a two-stage fixed-point approach for the pancreassegmentation which utilized the predicted segmentationsfrom coarse model to localize and obtain smaller pancreasregions which were further refined by another model [14]Yu et al presented the recurrent saliency transformationnetwork to tackle the challenge of small organ segmentationwhere a saliency transformation module is utilized toconnect coarse and fine stage to realize joint optimization[19] Cai et al designed a convolutional neural networkequipped with convolutional LSTM to impose spatial con-textual consistency constraints on successive image slices[20] Cai et al [21] further improved the pancreas initialsegmentation in [20] by aggregating the multiscale low-levelfeatures and strengthened the pancreatic shape continuity bybidirection recurrent neural network (BiRNN) Liu et al [22]used superpixel-based approach to obtain coarse pancreassegmentations which were then used to train five same-architecture fully convolutional networks (FCNs) withdifferent loss functions to achieve accurate pancreas seg-mentations is method is evaluated on 82 public CTvolumes and achieved a Dice coefficient of 8410 plusmn 491Man et al [23] proposed a two-stage method composed ofdeep Q network (DQN) and deformable U-Net for thepancreas segmentation in which DQN is used to obtaincontext-adaptive coarse pancreas segmentations whichwere then input to deformable U-Net for refinement Zhuet al [24] proposed a 3D coarse-to-fine network to segmentthe pancreas is 3D method outperformed the 2D

counterpart due to the full usage of the rich spatial infor-mation along the long axial dimension Some commontechniques such as dense connection [25] residual blockand sparse convolution [26 27] are also widely utilized tosegment the pancreas

Google DeepMind proposed a spatial transformer [28]which is the first work to allow neural networks learn thetransformation matrix from data and transform featuremaps spatially Specifically spatial transformer network(STN) can globally deform feature maps through learnedtransformations such as scaling cropping rotation as wellas nonrigid deformation Recently Dai et al proposed adeformable convolution to get over the limitation of fixedreceptive field in standard convolution [29] In detailconvolutional kernel with explicit offsets learned from theprevious feature maps can adaptively change predefinedreceptive field in order to extract more target features especific deformable convolution is shown in Figure 2 inwhich some standard convolution layers are first utilized tolearn and regress the deformation displacements for eachsampling point in the image and then the learned dis-placements are added to original sampling positions of the2D convolution to enable network extract relevant and richfeatures far from original fixed neighborhood [30] Differentfrom STN [28] deformable convolution adopts a local anddense instead of global manner to warp feature mapsMoreover deformable convolution focuses on learningexplicit offset for each neuron instead of kernel weightsSince the pancreas has various scales and shapes acrosspatients and traditional convolutional kernel cannot addresswell on organs with high deformation due to the fixed re-ceptive field we believe deformable convolution is moresuitable for the task of pancreas segmentation [31]

In this paper we propose a semiautomated deformableU-Net model utilizing the power of U-Net and Deformable-ConvNets e proposed architecture for pancreas seg-mentation has two merits First deep segmentation net-works such as FCN [9] U-Net [10] and DeepLab [32] easilysuffer from confusion by the large irrelevant background

(a) (b) (c)

Figure 1 Examples of 2D CT slices with pancreas annotations (red regions) showing the highly variable shape and size of pancreas elargest area of pancreas is less than 08 of entire slice while the smallest area is less than 01 (best viewed in color)








Whole image







Y p0( 1113857 1113944pnisinG

W pn( 1113857 middot X p0 + pn( 1113857(2)



Y p0( 1113857 1113944pnisinG


(a) (b) (c) (d)


Standardconvolution

block

Conv

BN

ReLU

(a)


block

ConvOffset

Conv

BN

ReLU

(b)



(a) (b) (c)


(a) (b) (c) (d) (e)







Input Output



Downsample

Upsample

Skip connection


(a) (b) (c) (d)




WdegX( 1113857(i j) 1113944

1

mminus11113944

1




L(P G) 1 minus1113936

Ni1 pigi + ε

1113936Ni1 pi + gi + ε

minus1113936



(5)



2l1 wl 1113936

Ni pligli

11139362l1 wl 1113936

Ni pli + gli

(6)




FGDL 11139442

l11 minus 2



1113888 1113889

1c

(7)


3 Experiments




DSC 2 Vgt capVseg

Vgt

+ Vseg

(8)



Vseg

(9)


Recall Vgt capVseg

Vseg

(10)



(11)








4 Discussion



5 Conclusions






Data Availability


Disclosure




Acknowledgments


References



















































Whole image







Y p0( 1113857 1113944pnisinG

W pn( 1113857 middot X p0 + pn( 1113857(2)



Y p0( 1113857 1113944pnisinG


(a) (b) (c) (d)


Standardconvolution

block

Conv

BN

ReLU

(a)


block

ConvOffset

Conv

BN

ReLU

(b)



(a) (b) (c)


(a) (b) (c) (d) (e)







Input Output



Downsample

Upsample

Skip connection


(a) (b) (c) (d)




WdegX( 1113857(i j) 1113944

1

mminus11113944

1




L(P G) 1 minus1113936

Ni1 pigi + ε

1113936Ni1 pi + gi + ε

minus1113936



(5)



2l1 wl 1113936

Ni pligli

11139362l1 wl 1113936

Ni pli + gli

(6)




FGDL 11139442

l11 minus 2



1113888 1113889

1c

(7)


3 Experiments




DSC 2 Vgt capVseg

Vgt

+ Vseg

(8)



Vseg

(9)


Recall Vgt capVseg

Vseg

(10)



(11)








4 Discussion



5 Conclusions






Data Availability


Disclosure




Acknowledgments


References















































Y p0( 1113857 1113944pnisinG

W pn( 1113857 middot X p0 + pn( 1113857(2)



Y p0( 1113857 1113944pnisinG


(a) (b) (c) (d)


Standardconvolution

block

Conv

BN

ReLU

(a)


block

ConvOffset

Conv

BN

ReLU

(b)



(a) (b) (c)


(a) (b) (c) (d) (e)







Input Output



Downsample

Upsample

Skip connection


(a) (b) (c) (d)




WdegX( 1113857(i j) 1113944

1

mminus11113944

1




L(P G) 1 minus1113936

Ni1 pigi + ε

1113936Ni1 pi + gi + ε

minus1113936



(5)



2l1 wl 1113936

Ni pligli

11139362l1 wl 1113936

Ni pli + gli

(6)




FGDL 11139442

l11 minus 2



1113888 1113889

1c

(7)


3 Experiments




DSC 2 Vgt capVseg

Vgt

+ Vseg

(8)



Vseg

(9)


Recall Vgt capVseg

Vseg

(10)



(11)








4 Discussion



5 Conclusions






Data Availability


Disclosure




Acknowledgments


References













































(a) (b) (c)


(a) (b) (c) (d) (e)







Input Output



Downsample

Upsample

Skip connection


(a) (b) (c) (d)




WdegX( 1113857(i j) 1113944

1

mminus11113944

1




L(P G) 1 minus1113936

Ni1 pigi + ε

1113936Ni1 pi + gi + ε

minus1113936



(5)



2l1 wl 1113936

Ni pligli

11139362l1 wl 1113936

Ni pli + gli

(6)




FGDL 11139442

l11 minus 2



1113888 1113889

1c

(7)


3 Experiments




DSC 2 Vgt capVseg

Vgt

+ Vseg

(8)



Vseg

(9)


Recall Vgt capVseg

Vseg

(10)



(11)








4 Discussion



5 Conclusions






Data Availability


Disclosure




Acknowledgments


References

















































Input Output



Downsample

Upsample

Skip connection


(a) (b) (c) (d)




WdegX( 1113857(i j) 1113944

1

mminus11113944

1




L(P G) 1 minus1113936

Ni1 pigi + ε

1113936Ni1 pi + gi + ε

minus1113936



(5)



2l1 wl 1113936

Ni pligli

11139362l1 wl 1113936

Ni pli + gli

(6)




FGDL 11139442

l11 minus 2



1113888 1113889

1c

(7)


3 Experiments




DSC 2 Vgt capVseg

Vgt

+ Vseg

(8)



Vseg

(9)


Recall Vgt capVseg

Vseg

(10)



(11)








4 Discussion



5 Conclusions






Data Availability


Disclosure




Acknowledgments


References














































WdegX( 1113857(i j) 1113944

1

mminus11113944

1




L(P G) 1 minus1113936

Ni1 pigi + ε

1113936Ni1 pi + gi + ε

minus1113936



(5)



2l1 wl 1113936

Ni pligli

11139362l1 wl 1113936

Ni pli + gli

(6)




FGDL 11139442

l11 minus 2



1113888 1113889

1c

(7)


3 Experiments




DSC 2 Vgt capVseg

Vgt

+ Vseg

(8)



Vseg

(9)


Recall Vgt capVseg

Vseg

(10)



(11)








4 Discussion



5 Conclusions






Data Availability


Disclosure




Acknowledgments


References


















































4 Discussion



5 Conclusions






Data Availability


Disclosure




Acknowledgments


References














































Data Availability


Disclosure




Acknowledgments


References


































































Documents

A Semiautomated Deep Learning Approach for Pancreas