
AUTOMATIC SEGMENTATION OF SMALL PULMONARY NODULES IN COMPUTED TOMOGRAPHY DATA USING A RADIAL BASIS FUNCTION NEURAL NETWORK WITH APPLICATION TO VOLUME ESTIMATION

Dissertation

Submitted to

The School of Engineering of the

UNIVERSITY OF DAYTON

In Partial Fulfillment of the Requirements for

The Degree

Doctor of Philosophy in Electrical Engineering

by

Timothy Ryan Tuinstra

UNIVERSITY OF DAYTON

Dayton, Ohio

December 2008


AUTOMATIC SEGMENTATION OF SMALL PULMONARY NODULES IN COMPUTED TOMOGRAPHY DATA USING A RADIAL BASIS FUNCTION NEURAL NETWORK WITH APPLICATION TO VOLUME ESTIMATION

APPROVED BY:

Russell C. Hardie, Ph.D.
Advisor, Committee Chairman
Professor, Electrical and Computer Engineering Department

John S. Loomis, Ph.D.
Committee Member
Associate Professor, Electrical and Computer Engineering Department

Raul Ordonez, Ph.D.
Committee Member
Associate Professor, Electrical and Computer Engineering Department

Julie A. Skipper, Ph.D.
Committee Member
Assistant Professor, Department of Biomedical, Industrial and Human Factors Engineering, Wright State University

Malcolm W. Daniels, Ph.D.
Associate Dean
School of Engineering

Joseph E. Saliba, Ph.D., P.E.
Dean
School of Engineering


ABSTRACT

AUTOMATIC SEGMENTATION OF SMALL PULMONARY NODULES IN COMPUTED TOMOGRAPHY DATA USING A RADIAL BASIS FUNCTION NEURAL NETWORK WITH APPLICATION TO VOLUME ESTIMATION

Name: Tuinstra, Timothy Ryan
University of Dayton

Advisor: Dr. Russell C. Hardie

Lung cancer continues to be the leading cause of cancer death in the United States. The automatic detection and characterization of this deadly form of cancer is an area of ongoing research. This dissertation focuses primarily on the characterization of pulmonary nodules once they have been detected. This characterization includes the accurate segmentation of nodules within three-dimensional (3D) computed tomography (CT) data as well as the development of accurate volume estimates from these segmentations. In this dissertation we present a novel approach to the segmentation of pulmonary nodules from CT data in which we compute a set of candidate segmentations, each characterized by a set of measured features. We then apply a trained artificial neural network to estimate the quality of the candidate segmentations, and the highest quality candidate is kept as the winner. In addition, we propose two techniques for reducing the computational complexity of the segmentation algorithm: we apply both simulated annealing and golden section search as intelligent ways of exploring the solution space. Finally, techniques for the accurate estimation of nodule volume are discussed. We discuss existing volume estimation approaches and introduce a new variation. We provide experimental results for the segmentation and volume estimation algorithms that we present, including a comparison of our segmentation algorithm to segmentations created by board-certified radiologists.


For my loving wife Kelly, my son Iain, and my dear parents who taught me the importance of life-long learning.


ACKNOWLEDGMENTS

I would like to thank Dr. Russell Hardie for all his help as my dissertation advisor in completing this degree. Thanks also to my committee members for reading my dissertation and for participating in my defense. I would also like to thank the Weill Medical College of Cornell University for making their ELCAP CT data publicly available on the Internet. Thanks go to Dr. Randy D. Ernst, MD for making the UTMB dataset available to me. Thanks go to the Lung Imaging Database Consortium for making radiologist-truthed data available to researchers such as myself. I would also like to thank Dr. Metin Gurcan of The Ohio State University for his helpful input. Dr. Steven Rogers of the Air Force Research Laboratory also provided several useful brainstorming sessions about neural nets, for which I am extremely thankful. I appreciate Dr. John Loomis, Dr. Raul Ordonez, and Dr. Julie Skipper for participating on my committee. Thanks go to the Dayton Area Graduate Studies Institute for their financial support of my degree. I also need to thank my department chair and my colleagues at Cedarville University for giving me extra time, and the administration, which provided additional funds to help me pay for tuition. I certainly could not have done this without the loving support of my wife Kelly, who put up with me working on this dissertation for the first six years of our marriage. My son Iain will get his daddy back! Finally, all glory goes to my Lord Jesus Christ who has given me the ability to persevere. Soli Deo Gloria!


TABLE OF CONTENTS

Abstract

Dedication

Acknowledgments

List of Figures

List of Tables

CHAPTERS:

I. Introduction
   1.1 A CT Primer
   1.2 CT and Lung Cancer
   1.3 Dissertation Overview

II. The Segmentation Engine
   2.1 Survey of Existing Image Segmentation Algorithms
   2.2 The Segmentation Engine

III. Using artificial neural networks to characterize segmentation quality
   3.0.1 Basic Approach
   3.1 ANN Features

IV. Intelligent Search of the R and T Solution Space
   4.1 Efficiency Analysis for the Exhaustive Search
   4.2 Simulated Annealing
   4.3 Golden Section Search

V. Estimation of Pulmonary Nodule Volume
   5.1 2-dimensional Methods
   5.2 3D Methods

VI. Experimental Results
   6.1 Nodule Segmentation
      6.1.1 Data
      6.1.2 Neural Network Training
      6.1.3 Segmentation Performance
   6.2 Volume Estimator Performance
      6.2.1 Nodule Segmentation and Datasets
      6.2.2 Phantom Data
      6.2.3 ELCAP Data
      6.2.4 UTMB Data

VII. Discussion and Conclusions

Vita


LIST OF FIGURES

1.1 CT projection acquisition
1.2 Original backprojection example image
1.3 Sinogram (Radon Transform) of original image
1.4 Reconstruction using backprojection
1.5 CT scanner
1.6 1G CT Scanner
1.7 2G CT Scanner
1.8 3G CT Scanner
1.9 4G CT Scanner
2.1 Segmentation Engine
3.1 Radial Basis Function ANN
3.2 GUI interface
3.3 Segmentation Algorithm
3.4 Illustration of Mean Convergence Index
4.1 Sampling in the T-R space
4.2 Simulated annealing cooling regime
4.3 Illustration of the Golden Ratio
4.4 Illustration of golden section search
4.5 Golden section search convergence
4.6 Segmentation engine function calls
4.7 Feature computation function calls
5.1 Illustration of area method of nodule volume estimation
5.2 Minimax and perimeter method height estimates
6.1 Forward sequential selection of features
6.2 LIDC overlap results
6.3 Overlap vs. radiologist-rated features
6.4 LIDC segmentation example 1
6.5 LIDC segmentation example 2
6.6 System selection of T
6.7 System selection of R
6.8 Kostis overlap performance
6.9 Simulated Annealing Trajectory
6.10 Golden section search example 1
6.11 Golden section search example 2
6.12 Muenster segmentation example
6.13 ELCAP segmentation example
6.14 UTMB example segmentation
6.15 K parameter histogram
6.16 Apparent Magnification comparison
6.17 Volume estimate comparison
6.18 Volume estimate error comparison


LIST OF TABLES

4.1 Computational Complexity
6.1 Summary of Datasets
6.2 Radiologist inter-observer variability
6.3 Test results for the LIDC data
6.4 Intelligent search overlap performance
6.5 Timing analysis
6.6 Summary of Datasets Used
6.7 ELCAP standard deviation results
6.8 ELCAP mean results
6.9 ELCAP standard deviation results for volume estimates
6.10 ELCAP mean results for volume estimates
6.11 UTMB error statistics


CHAPTER I

Introduction

Lung cancer remains the leading cause of cancer death in the United States of America. The American Cancer Society estimates that in 2008, 215,000 new cases of lung cancer will be diagnosed in the U.S. [1]. In addition, it is estimated that 90,810 men and 71,030 women will die of the disease and related complications this year. Furthermore, approximately 150,000 small pulmonary nodules are detected every year in the United States [2]. This makes the battle against lung cancer one of the primary fronts in the ongoing war against cancer.

The accurate characterization and quantification of malignant pulmonary nodules, as well as other lung lesions, is extremely important for the diagnosis and clinical management of lung health. Physicians, scientists, and engineers are increasingly turning to tomographic medical imaging modalities such as X-ray computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET) as tools for the non-invasive characterization of cancer in general. X-ray CT is becoming an especially helpful tool for the non-invasive characterization and monitoring of pulmonary nodules. The increasing resolution of CT imagery makes it an effective detection and monitoring tool for radiologists and oncologists alike. In addition, computer aided detection and diagnosis (CADD) tools are playing an increasingly large role in the analysis of the vast quantities of data currently being produced by radiology units in numerous hospitals.

Programs such as the International Early Lung Cancer Action Program (I-ELCAP) are helping health care providers to diagnose lung cancer much earlier than has previously been possible. Individuals who are more susceptible to lung cancer, such as smokers and those who have been exposed to asbestos or high levels of radon gas, may be screened annually as a part of such programs, and treatment regimens may be started for those who test positive. By detecting the disease earlier, effective treatment may begin earlier and the mortality rate may be lowered. Excellent results have been reported in [3].

In this chapter, a brief overview of the physics and image formation process for X-ray CT will be given in Section 1.1. Understanding the origins of a given image is essential for understanding the processing of the resulting images. Following this brief primer, the use of CT in the analysis of pulmonary nodules will be introduced in Section 1.2. Finally, an overview of the current dissertation research will be presented in Section 1.3.

1.1 A CT Primer

No discussion of an image processing algorithm for CT images would be complete without a brief introduction to the physics and phenomenology of CT. To understand the images, we must understand their origin. The mathematical underpinnings that enable CT technology were first proposed by Johann Radon (1887-1956) in 1917. Radon showed that it would be possible to reconstruct a 2D function based on a set of its 1D projections. The mathematical operation necessary for creating a set of these projections from a 2D function has since been dubbed the "Radon Transform". However, the computational machinery necessary to implement a practical CT system was not available until the advent of digital computers. It was for this reason that X-ray CT for use in medicine was developed primarily in the late 1960's to early 1970's. Godfrey Hounsfield (1919-2004) and Allan Cormack (1924-1998) received the Nobel Prize in Medicine in 1979 for their independent, pioneering work in the development of X-ray CT [4]. The first commercial CT imaging system was deployed in the United States in 1973 at the Mayo Clinic in Rochester, Minnesota.

The fundamental operating principle of any tomographic imaging modality is that a two-dimensional function f(x, y) is completely determined by a complete set of radial projections derived from it. The idea of projections of a 2D function is illustrated in Figure 1.1. Essentially, what we are doing is performing line integrals along each of the parallel lines shown in the figure. In X-ray CT, the function of interest is the 2D function describing the density of the tissue in a single slice. Multiple slices may then be acquired, yielding a full 3D perspective of the internal anatomy. We can define these lines parametrically based on their perpendicular distance, l, from the origin and their angle, θ, with respect to the x axis. That is, any line in the x-y plane can be defined as x cos(θ) + y sin(θ) = l. The projections (the Radon Transform) are then defined mathematically as

$$g(l, \theta) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\, \delta(x \cos(\theta) + y \sin(\theta) - l)\, dx\, dy \qquad (1.1)$$

where δ(·) is the Dirac delta function. Sometimes the Radon Transform is displayed as an image with θ as one axis and l as the other axis. This image is called a sinogram. How does the line integral idea follow from the physics of CT? We answer this question next.
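To make (1.1) concrete, here is a minimal discrete sketch of the Radon transform; the function name radon_sinogram and the rotate-and-sum discretization are our own illustrative choices, not part of the original development.

```python
import numpy as np
from scipy.ndimage import rotate

def radon_sinogram(image, angles_deg):
    """Discrete approximation to eq. (1.1): for each angle theta, rotate the
    image by -theta and sum along columns, i.e., a discrete line integral."""
    return np.stack([rotate(image, -a, reshape=False, order=1).sum(axis=0)
                     for a in angles_deg], axis=1)

# Toy example: a bright square phantom, one projection per degree over 180 degrees.
img = np.zeros((64, 64))
img[24:40, 24:40] = 1.0
sino = radon_sinogram(img, np.arange(0.0, 180.0))  # rows index l, columns index theta
```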

Figure 1.1: Illustration of the acquisition of a single projection.

Matter has the tendency to attenuate X-ray radiation that is propagating through it. The primary phenomena that cause this attenuation include Compton scattering and the photoelectric effect. Compton scattering occurs when an incident X-ray photon collides with an atom and frees a valence shell electron. A lower energy photon called a Compton photon is released along with the ejected electron. The photoelectric effect describes the ejection of an electron from an inner orbital shell (usually the K-shell) of an atom after it has been hit by an incident photon. The macro effect of these radiation/matter interactions is the attenuation of the X-ray beam. That is, fewer photons exit the object than were incident upon it. High density matter has a greater attenuation effect than lower density matter. It is this property of matter that allows us to differentiate among various types of tissue in medical imaging. For example, bone has higher density than fatty tissue and therefore absorbs more photons. Various types of matter are characterized by their linear attenuation coefficient µ, which is a function of the density. The linear attenuation coefficient describes the rate at which the intensity decreases per unit distance traveled in a medium. That is,

$$\frac{dI}{ds} = -\mu I \qquad (1.2)$$

where s is the variable describing position in the medium. For this derivation of image formation, we will assume a monoenergetic (single frequency) X-ray source since it simplifies the math. In reality, the X-ray beam originating from an X-ray tube in a CT scanner is polyenergetic. If an object is comprised of a single material such that µ is the same throughout, then the intensity of the beam which emerges from an object is related to the incident radiation intensity by the exponential relationship

$$I = I_0 e^{-\mu D} \qquad (1.3)$$

which is simply the solution of (1.2), where I_0 is the incident beam intensity (the initial condition), D is the width of the material, and I is the intensity of the emerging beam. If the attenuation coefficient is allowed to vary as a function of the distance traveled, s, through the medium, we may find the intensity of the exiting beam by computing a line integral:

$$I = I_0 e^{-\int_0^D \mu(s)\, ds} \qquad (1.4)$$

where D is the distance the beam travels through the object. If the incident intensity, I_0, is known ahead of time and the exiting intensity, I, is measured, then we have the line integral

$$\int_0^D \mu(s)\, ds = -\ln \frac{I}{I_0}. \qquad (1.5)$$

This is the physical basis for the line-integral description of CT given in the analysis above. In CT, it is desirable to measure the linear attenuation coefficient for every point in the slice being imaged. The reconstruction algorithm described below will accomplish this objective.

One practical difficulty with CT imaging is that various CT scanners will produce different values for µ. This is clearly an undesirable situation, especially when we would like to compare imagery derived from various scanners. For this reason, a normalized unit is used for CT in which the attenuation coefficient is normalized with respect to the attenuation coefficient of water. This unit is called the Hounsfield Unit (HU) and is defined as

$$h = 1000\, \frac{\mu - \mu_{water}}{\mu_{water}} \qquad (1.6)$$

where µ_water is the linear attenuation coefficient of water.
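As a quick illustration of (1.6), the snippet below converts attenuation coefficients to Hounsfield units; the value used for µ_water is an assumed, merely illustrative number.

```python
MU_WATER = 0.19  # assumed linear attenuation of water (cm^-1), illustrative only

def to_hounsfield(mu):
    """Normalize a linear attenuation coefficient to Hounsfield units, eq. (1.6)."""
    return 1000.0 * (mu - MU_WATER) / MU_WATER

print(to_hounsfield(MU_WATER))  # water maps to 0 HU
print(to_hounsfield(0.0))       # air (mu ~ 0) maps to -1000 HU
```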

Given the Radon transform of a function, $g(l, \theta) = \mathcal{R}\{f(x, y)\}$, it is possible to create an estimate of the original function f(x, y). We allow 0° ≤ θ ≤ 180° in order to obtain a complete set of projections, realizing that g(−l, θ) = g(l, θ + 180°). Essentially, prior to image reconstruction, the data collected by a CT system is the Radon transform of the desired image. The question is how to perform the inverse Radon transform, $f(x, y) = \mathcal{R}^{-1}\{g(l, \theta)\}$. Algorithms for computing the inverse Radon transform fall into several broad classes. Backprojection is the primary reconstruction algorithm used in X-ray CT. A Fourier transform technique is also sometimes used. A brief description of backprojection is given here to illustrate the image formation process.

The backprojection CT reconstruction algorithm assumes that we have a set of projections available, g(l, θ) for 0° ≤ θ ≤ 180°. In practice, we sample θ in some finite increments. Each projection is "back-projected" by "smearing" it across the image plane at its angle θ. In other words, we create one back projection $b_\theta(x, y) = g(x \cos(\theta) + y \sin(\theta), \theta)$. We then compute the sum of all of these back projection images,

$$f_b(x, y) = \int_0^{\pi} b_\theta(x, y)\, d\theta \qquad (1.7)$$

in order to create an estimate of the original slice function f(x, y). While there are numerous improvements that can be made to this algorithm, including filtered backprojection, the backprojection algorithm is the basis for most modern CT reconstruction techniques. An example of CT image formation and reconstruction is shown below. The original image, corresponding sinogram, and the reconstructed image are shown in Figures 1.2, 1.3, and 1.4, respectively.
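The following is a minimal sketch of unfiltered backprojection per (1.7), the discrete companion to the sinogram sketch above; the smear-and-rotate formulation is our illustrative assumption, not a production reconstruction.

```python
import numpy as np
from scipy.ndimage import rotate

def backproject(sino, angles_deg):
    """Discrete, unfiltered backprojection, eq. (1.7): smear each projection
    b_theta across the image plane at its angle and average over all angles."""
    n = sino.shape[0]
    recon = np.zeros((n, n))
    for i, a in enumerate(angles_deg):
        smear = np.tile(sino[:, i], (n, 1))       # constant along the ray direction
        recon += rotate(smear, a, reshape=False, order=1)
    return recon / len(angles_deg)
```

Applied to the toy sinogram above, this yields the characteristically blurred estimate that filtered backprojection then sharpens.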

Figure 1.2: Original backprojection example image.

Figure 1.3: Sinogram (Radon Transform) of original image.

Revolutionary technologies usually experience a period of evolution during which a series of marked improvements is made. X-ray CT development is no exception to this rule. The simultaneous development of digital computing from the 1970's to the present time, as well as the advances made in digital signal processing (DSP) algorithms, has revolutionized the entire medical imaging field [5]. A typical X-ray CT scanner is shown in Figure 1.5.

Figure 1.4: Image reconstructed using backprojection.

It is generally agreed that CT has gone through at least four distinct generations as imager geometries have changed and the resulting imaging speeds have improved [4-6]. First Generation (1G) scanners consisted of a single radiation source collimated as a pencil beam and a single detector. The source and the detector move linearly to create a single projection of the subject. This is followed by a rotation of the source and detector combination to the next angular position, where the source and the detector once again move linearly, creating a second projection. This series of steps is repeated until the desired number of projections is created. The collection of these projections constitutes the data necessary for reconstruction of a single slice. If more than one slice is desired, as in the case of 3D imaging, then the table on which the patient is lying is automatically shifted to the next position for collection of the next set of projections. An illustration of 1G scanner operation is given in Figure 1.6.

Second Generation (2G) scanners consist of a single fan-beam source, which allows a linear array of detectors to be used. This way, a single projection can be created in a shorter time than with 1G scanners, and multiple projections can be acquired during the same linear scan time. Imaging time is a fundamental consideration, especially for thoracic CT, since all images need to be collected during a single breath hold. CT scanners utilizing a fan beam must implement a more sophisticated reconstruction algorithm since the X-ray beams are no longer parallel. An illustration of the operation of a 2G scanner is given in Figure 1.7.

Figure 1.5: Typical X-ray CT scanner (courtesy of Siemens).

Figure 1.6: Operation of a First Generation CT Scanner.

Third Generation (3G) scanners have a sufficiently sized detector array such that no linear scanning is required. The fan-beam is wide enough to completely illuminate the patient from a single angular position. Once again, the benefit is increased scan speed. An illustration of the operation of a 3G scanner is shown in Figure 1.8. Fourth Generation (4G) scanners have a ring of detectors which completely surround the patient, requiring no detector motion during scanning. Only the fan-beam collimated source must rotate. The mechanical simplification inherent in the 4G scanner makes it attractive. An illustration is shown in Figure 1.9.

Figure 1.7: Operation of a Second Generation CT Scanner.

Later generations of scanners include developments such as cone beams and helical scanning. In helical scanning, the patient table slides at a constant velocity through the source and detector gantry as the source repeatedly circles the body of the patient. This produces a source trajectory that forms a helix with respect to the patient. Sophisticated reconstruction algorithms are required to create individual slice images using this system.

1.2 CT and Lung Cancer

X-ray CT has become a vital tool in lung cancer diagnosis and clinical management. First of all, CT is now being widely used to aid in the early detection of pulmonary nodules. While radiologists are analyzing volumes of CT data, new algorithms for the automatic detection of nodules are being developed. Such algorithms employ sophisticated pattern recognition techniques to identify whether nodules are present and, if so, where.

Figure 1.8: Operation of a Third Generation CT Scanner.

Figure 1.9: Operation of a Fourth Generation CT Scanner.

CT is also becoming a tool of choice for automated nodule analysis and characterization. Commercial CT scanners may now be purchased with software for lung nodule analysis. For example, Siemens has developed the syngo LungCARE software package for use in measuring pulmonary nodule size as well as other parameters of clinical significance [7]. GE Healthcare offers Lung VCAR software, which may be used from the early diagnosis and detection stage all the way through clinical management of the disease [8]. Once a nodule is detected, as much information as possible must be extracted from it to determine whether it is malignant or benign and what type of threat it poses to the patient. A large number of abnormalities that appear as nodules in a chest radiograph are actually the result of other causes. Follow-up CT scans can go a long way toward more accurate differentiation. Erasmus et al. provide a good overview of the appearance of various lung abnormalities in CT in their work [2]. Greater understanding of a given nodule and its pathology based on analysis of imagery could potentially reduce the need for invasive and inherently dangerous procedures, such as the lung biopsies sometimes ordered to determine nodule pathology [9]. The majority of detected pulmonary nodules are benign. The growth rate of pulmonary nodules is a parameter of critical importance to health-care providers. Nodules that do not grow do not pose nearly the threat to the patient that malignant, rapidly growing nodules do. Nodule growth rate is typically measured in terms of doubling time, the time it takes for the nodule to double in volume. A number of algorithms for computing nodule sizes from imagery have been suggested in the literature [2; 9-20]. Broadly, they may be divided into 2D and 3D techniques. Two-dimensional techniques require 2D nodule segmentations, while 3D techniques rely on segmentation in three dimensions.
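Since doubling time is defined as the time for a nodule's volume to double, it can be computed from two volume measurements under the usual exponential-growth assumption; the formula and helper below are our illustrative addition, not a method from this dissertation.

```python
import math

def doubling_time(v1, v2, dt_days):
    """Doubling time DT = dt * ln(2) / ln(v2 / v1), assuming exponential growth
    between volumes v1 and v2 measured dt_days apart."""
    return dt_days * math.log(2.0) / math.log(v2 / v1)

print(round(doubling_time(100.0, 160.0, 90)))  # ~133 days for 60% growth over 90 days
```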

Health-care providers are also interested in the morphological properties of nodules. The shape of a pulmonary nodule is often strongly related to its pathology. Shape features such as spiculation and lobulation provide clues about whether the lesion is malignant or benign. It is important that nodule segmentation algorithms capture this information as accurately as possible. Both the measurement of doubling time and an understanding of nodule morphology require an accurate 3D segmentation of the nodule within the CT dataset. That is, it must be determined which voxels are primarily nodule tissue and which voxels are primarily composed of background tissue. The 3D pulmonary nodule segmentation as well as the computation of nodule volume are the primary problems considered in this dissertation research. We present a novel nodule segmentation algorithm which makes use of a trained artificial neural network (ANN). In addition, we discuss the details of how the resulting 3D segmentations may be used to obtain accurate estimates of nodule volume, a critical part of growth rate estimation.

Before continuing, let us briefly discuss how CADD algorithms such as an automatic pulmonary nodule segmentation tool should be evaluated. To do so, it is important for us to address the reason that such algorithms are created. The most obvious purpose is that the automatic algorithm should be able to "stand in" for a human observer. That is, when presented with a set of segmented nodules, some of which are segmented by radiologists and some of which are segmented by machines, it should be difficult if not impossible to differentiate the two. This does not imply that all of the segmentations are exactly the same. In fact, among radiologists there will always be some level of disagreement as to the actual nodule boundary. These differences result from the training level and experience of the radiologist as well as many other intangible factors. We propose evaluating CADD algorithms in terms of the inter-observer variability that is observed when radiologists perform a similar task.

1.3 Dissertation Overview

In Chapter II a description of the basic segmentation engine is presented. This segmentation engine is the core of the automated segmentation algorithm described in this dissertation. It has been designed to be simple and to make use of fundamental image processing operations, including thresholding and morphological processing. The segmentation engine requires the selection of two parameters, either manually (interactively) or automatically. When used in an automated mode, the segmentation engine is used to create candidate segmentations, the best of which is then selected by the ANN.

Chapter III is a discussion of the automatic selection of the segmentation parameters which are the input to the segmentation engine. An ANN is introduced as the solution to the selection of the correct segmentation from a set of candidates. The ANN is trained to recognize the difference between low-quality and high-quality segmentations. The highest quality segmentation is then selected from the set of candidates. We also address the critical issue of the selection of appropriate features that are strongly correlated with segmentation quality. We show how a set of good features may be selected from a large field of potential features.

In Chapter IV, we tackle the issue of computational complexity. While an exhaustive search of the segmentation parameter space works, it requires searching many potential combinations of segmentation parameters in order to find the one that works best. That is, we must create candidate segmentations for a large number of parameter combinations. However, it is possible to limit the search space so that fewer candidates need be created. Intelligent searching is more desirable, and several approaches for doing so are presented, including a simulated annealing technique as well as a form of golden section search.

As previously stated, pulmonary nodule volume is a parameter that may be estimated once a suitable nodule segmentation is computed. This, however, is not a trivial problem. Issues such as partial volume effects, slice thickness and spacing, as well as other considerations complicate this matter. In Chapter V we present an overview of the state-of-the-art for nodule volume computation and present a new variation of an existing algorithm which proves useful. In Chapter VI, the results of all experimental work are presented. We quantify the performance of the nodule segmentations by comparing them to manually produced "truth" segmentations that have been created by board-certified radiologists. We make use of the overlap measure to quantify the difference between two different segmentation fields. We show that our automatic segmentation algorithm produces good segmentation results. Such results are well within the inter-observer variability exhibited among radiologists. It is our view that CADD algorithms that could stand in for an experienced radiologist should be the goal of algorithm development. We also present a comparison of various volume estimation results. Finally, Chapter VII includes a discussion of both the segmentation results and the volume estimation portion of the system.


CHAPTER II

The Segmentation Engine

Accurate 3D segmentation is the first step in the automatic quantification of lung nodule size, in addition to being useful for understanding the morphological properties of a given nodule. Image segmentation is a topic that has been extensively researched and widely discussed in the literature. In image segmentation, the basic goal is to classify pixels (or in this case voxels) according to some property or characteristic. In this case, voxels that primarily contain nodule tissue are to be separated from background structures such as the lungs themselves, air, vessels, and the lung parenchyma.

2.1 Survey of Existing Image Segmentation Algorithms

Image segmentation is a major topic within the discipline of image processing. Images are segmented based on intensity, color, texture, shape, and even motion, to name just a few potential characteristics. Essentially, segmentation is a process of dividing an image into regions that share some characteristic in common. Numerous algorithms have been devised to handle various segmentation problems, and these algorithms are typically coupled with morphological processing to achieve the final result. Common approaches to segmentation include intensity thresholding, k-means clustering, edge detection, region growing, active contours, and numerous other techniques. Most image processing texts contain at least one chapter devoted to image segmentation.

Intensity thresholding is a common segmentation tool. In thresholding, pixels are selected whose numerical value is greater than the threshold or, if multiple thresholds are involved, pixels lying between thresholds are selected. Thresholds may be chosen in a number of different ways. One way is to analyze the histogram of image intensities to determine which modes are present. The threshold(s) may be chosen in such a way as to separate the dominant intensity modes. Thresholds may also be chosen adaptively such that the final segmentation has some desirable property.

K-means clustering is an iterative algorithm that requires prior knowledge of the number of classes into which we desire to divide the image [21]. Estimates of the number of classes may be made by manual or automated analysis of the histogram. A set of initial property means is chosen, and the pixels are grouped according to which mean they are closest to. The means are then recomputed, and this process continues until the number of pixels in each class remains the same, which constitutes convergence of the algorithm. One drawback of this algorithm is that the number of segmentation classes must be known ahead of time, and this is simply not always the case.
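As a concrete illustration of the procedure just described, here is a minimal 1D k-means on pixel intensities; this toy sketch is ours and omits the refinements a real segmenter would need.

```python
import numpy as np

def kmeans_intensity(pixels, k, iters=100):
    """Toy 1D k-means: assign each pixel to its nearest class mean, recompute
    the means, and repeat until the assignments stop changing."""
    means = np.linspace(pixels.min(), pixels.max(), k)  # simple initial guesses
    for _ in range(iters):
        labels = np.abs(pixels[:, None] - means[None, :]).argmin(axis=1)
        new_means = np.array([pixels[labels == j].mean() if np.any(labels == j)
                              else means[j] for j in range(k)])
        if np.allclose(new_means, means):
            break
        means = new_means
    return labels, means
```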

Edge detection algorithms have also found wide use in segmentation problems. Edge detection usually involves the use of 2D derivative operators such as the gradient or Laplacian operator, combined with a thresholding of the resulting derivative image. Post processing can then be used to find the resulting closed contours, which may constitute a single object. Edge linking algorithms may be used to fill the inevitable gaps in the edge contours. One example of edge linking is the Hough transform, which performs edge linking for edges that are approximately linear. Edge detection is typically less helpful for edges that are not well-defined. Such "fuzzy" edges are the edges that we often see when working with pulmonary nodules.

Region growing and active contours are a more recent segmentation development. With region growing, we begin with a seed pixel or region within the object we wish to segment. Pixels are then added to the region based on whether they meet some criteria. Active contour algorithms are similar, except we now introduce "forces" which tend to push a contour in or out [22-24]. When the forces are equal and opposite, the contour does not expand. The analogy is the blowing up of a balloon inside a jar, where the balloon grows to fill the jar but the jar keeps the balloon from growing any larger. Forces are computed based on gradient, local object shape, or other desired segmentation characteristics. Active contour algorithms have been applied to the pulmonary nodule segmentation problem with some success.

The automatic segmentation of pulmonary nodules is an area of ongoing and extensive research. Researchers have presented a number of automated algorithms suitable for implementation in software. These algorithms are specially tuned to the nodule segmentation problem and its unique difficulties. Commercial CT systems often come with proprietary software for automating the segmentation process, as was previously mentioned. The need for such software is precipitated by the vast quantities of data that need to be analyzed by radiologists. Any automation of these analysis tasks decreases the workload and has the potential to provide "computer-in-the-loop" benefits to the analysis task. A good overview of segmentation techniques used in medical imaging is given in [25].

Coleman et al. have developed a technique which segments pulmonary nodules by looking for a segmentation mask which minimizes the total variation within the segmented image intensities [26]. In other words, one should expect that the best nodule candidates will have the least intensity variation within the nodule boundary. They make use of edge detection and edge linking to produce candidate boundaries. These are then evaluated according to the variation minimization approach. Xu et al. have suggested an automated technique which considers the CT imagery on a slice-by-slice basis [27]. They do so by fitting predefined shapes such as circles and ellipses to the nodule at various scales and finding those that provide the best approximation to the nodule shape. Fan et al. propose a method which takes a variety of shape templates and overlays them on the object of interest. The cross-correlation between the template and object is computed to determine which shape provides the best fit [28]. Zhao et al. have introduced a two-dimensional (2D) technique in which a set of features measured from candidate segmentations is used to select the best segmentation [29]. They measure features such as gradient strength and compactness to make an automatic selection from a set of candidates. In addition, they have introduced an algorithm suitable for use on 3D data sets using similar principles [30]. Mullally et al. evaluate several pulmonary nodule segmentation techniques, including fixed thresholding, variable thresholding, and shape-based segmentation techniques [11]. These algorithms were tested using both phantoms and clinical data. More recently, Kuhnigk et al. have discussed an algorithm for the segmentation and volume estimation of larger pulmonary nodules by making use of image morphology techniques [19]. Another technique that has been applied to this problem is segmentation using active contours and "snakes." Given a seed point, the segmentation is allowed to grow based on internal and external force functions that depend on image parameters such as gradient and on a priori knowledge, including the common morphological shapes of nodules. Examples include the work of Elmoataz et al. [23] and Way et al. [31]. Finally, Wang et al. have proposed a unique segmentation solution in which they perform a 3D-to-2D transformation of the nodule volume of interest and do the processing in the 2D image domain [32]. They then convert the resulting 2D solution back to the desired 3D segmentation and compare their results to radiologist-truthed nodule segmentations.

A simple and commonly used approach to 3D nodule segmentation is based on thresholding and morphological processing [12; 29; 30]. While more computationally complex algorithms have certainly been proposed, the purpose in this dissertation is to study the capabilities and limitations of the thresholding-opening approach and present an effective automated way to optimally select the parameters for such a system. In this dissertation, a novel segmentation algorithm utilizing an Artificial Neural Network (ANN) is presented. This work builds on the work of Zhao et al. in that it too creates segmentation candidates which are evaluated for their quality based on a set of measured features [30]. In their approach, the best candidate is determined by using an objective function of the computed features. Kostis et al. use thresholding and morphological processing and attempt to select an optimum threshold and structuring element radius suitable for all nodules. They do concede that in practice this radius may need to be adjusted based on the nodule under consideration. In our approach, the segmentation candidates are evaluated using the ANN. The introduction of the ANN allows the algorithm to be trained using nodules segmented manually. This training process encodes information about how the nodules should be segmented, allowing the algorithm to closely approximate the work of trained radiologists.

2.2 The Segmentation Engine

Our segmentation engine is roughly based on the segmenter of Zhao et al. in [30]. Unlike their algorithm, we introduce a variable-valued structuring element radius. Our segmentation algorithm assumes that a pulmonary nodule has been located by manual or automatic nodule detection. In either case, a local cue point would be passed from the detection system to the segmentation algorithm indicating the x, y, and z coordinates of the nodule within some volume of interest (VOI). We refer to x and y as the in-plane coordinates, while z references the axial direction in the CT dataset. Segmentation itself is an algorithm which produces a binary, 3D scalar field in which voxels that are "turned on" or set to one contain predominately nodule tissue, and voxels that are "turned off" or set to zero represent background voxels. In the case of thoracic CT, the background consists primarily of air in the lungs as well as other lung structures. We denote segmentation fields as s(x, y, z).

Intensity thresholding finds a place in many image segmentation techniques and is a fundamental building block of our pulmonary nodule segmentation algorithm. Specifically, we combine intensity thresholding with morphological processing to define a segmentation algorithm that lends itself to automation. The segmentation algorithm we have designed requires the appropriate selection of two parameters using either an interactive manual selection process or automatic processing. The two parameters which must be selected are the intensity threshold T and the radius of a disk-shaped morphological structuring element R. The parameter T is specified in Hounsfield units (HU). Voxels whose intensities are greater than the specified value for T are "turned on" during the thresholding phase of the algorithm. Mathematically, for a given threshold T_0, we have

$$s_T(x, y, z) = \begin{cases} 1 & v(x, y, z) > T_0 \\ 0 & \text{otherwise.} \end{cases} \qquad (2.1)$$

Since intensity in CT is proportional to tissue density, this process selects structures whose densities are greater than those of the background. The parameter R in our algorithm is used to create the disk-shaped structuring element that we use to perform morphological opening in the (x, y) plane. Clearly, a structuring element radius of R = 0 has no effect on the thresholded binary image, while increasingly large structuring elements tend to remove larger extremities from the segmentation mask. Very large values of R can result in excessive degradation of the segmentation. All steps in our segmentation algorithm assume that a lung mask is available which prohibits segmentations that include the lung wall. Several lung segmentation algorithms have been presented in the literature [33-37]. We have found that thresholding with a constant threshold followed by morphological opening with a large structuring element produces reasonable lung masks in most cases. This is done for each VOI to disconnect the nodule from the pleural surface.

At a given threshold T, we apply a 3D connectivity requirement to the resulting binary image stack which removes thresholded voxels that are not connected to the cue point. This has the effect of removing regions which appear bright, such as the lung wall, vessels, or even other nodules. We have imposed a 6-connected 3D connectivity requirement for our algorithm because it is a stringent requirement and results in cleaner segmentation. Following the application of this connectivity requirement, we use 2D morphological opening in each CT layer to remove residual structures such as vessels which may be attached to the segmented nodule. We use 2D morphological processing because we desire the algorithm to be applicable to thick-slice data; using a 3D structuring element would be detrimental to nodules that are apparent in only one slice. Finally, a 3D connectivity requirement is again imposed to ensure that only voxels connected to the cue-point voxel are included in the segmentation. Manual experiments show that numerous nodules can be segmented by the appropriate selection of the T and R parameters. A block diagram of this segmentation engine is shown in Figure 2.1. Mathematically, our system is represented by

$$s(x, y, z) = \mathcal{F}(v(x, y, z), R, T) \qquad (2.2)$$

where v(x, y, z) is a 3D scalar field of CT image intensities and represents the VOI under consideration. The function $\mathcal{F}(\cdot)$ represents the segmentation engine described above and shown in Figure 2.1.
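A minimal sketch of the engine $\mathcal{F}(v, R, T)$ follows, assuming SciPy's ndimage morphology; the axis conventions, helper names, and the assumption that the cue point survives thresholding are our simplifications of the pipeline in Figure 2.1, not the dissertation's actual code.

```python
import numpy as np
from scipy import ndimage

SIX_CONNECTED = ndimage.generate_binary_structure(3, 1)  # 6-connectivity in 3D

def segment(voi, cue, T, R):
    """Sketch of s = F(v, R, T), eq. (2.2). voi is a 3D array of HU values
    (axis 0 = z); cue is a (z, y, x) tuple assumed to survive thresholding."""
    s = voi > T                                           # eq. (2.1): threshold

    # Keep only voxels 6-connected in 3D to the cue point.
    labels, _ = ndimage.label(s, structure=SIX_CONNECTED)
    s = labels == labels[cue]

    # 2D morphological opening, slice by slice, with a disk of radius R.
    if R > 0:
        yy, xx = np.mgrid[-R:R + 1, -R:R + 1]
        disk = (xx ** 2 + yy ** 2) <= R ** 2
        s = np.stack([ndimage.binary_opening(sl, structure=disk) for sl in s])

    # Re-impose 6-connectivity to the cue point after opening.
    labels, _ = ndimage.label(s, structure=SIX_CONNECTED)
    return labels == labels[cue]
```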

Figure 2.1: Block diagram of the basic segmentation engine s(x, y, z) = F(v(x, y, z), R, T): threshold (T), connectivity requirement, morphological opening (R), connectivity requirement.

CHAPTER III

Using artificial neural networks to characterize segmentation quality

3.0.1 Basic Approach

The system shown in Figure 2.1 may easily be used to create a set of candidate segmentations. That is, the kth candidate is given by

$$s_k(x, y, z) = \mathcal{F}(v(x, y, z), R_k, T_k) \qquad (3.1)$$

where R_k and T_k are specific values of R and T, and k = 1, 2, ..., K for a total of K parameter combinations. The key to the automated segmentation strategy is to determine which candidate segmentation is optimal in some sense. The "goodness" of a given segmentation is conceivably a function of many things, including how well vessels are trimmed from the nodule, how well the segmentation matches the boundary of a given nodule, etc. These factors may be measured in the form of features, and some function of the features may be used to predict the quality, or lack thereof, and ultimately select the proper segmentation candidate.

In order to discuss the notion of segmentation quality, it is helpful to suppose the ex-

istence of an “ideal” segmentation which we may call s(x, y, z). Given that there exists

such a segmentation field, we can now conceive of a measure of quality that quantifies the


similarities between sk(x, y, z) and s(x, y, z). In comparing segmentations there are two

sources of error. There are voxels that consist of nodule tissue but are not segmented, and

there are voxels that are segmented and do not in fact contain nodule tissue. The former

are false negatives and the latter are false positives. Any measure of error between segmen-

tation fields must account for both kinds of error. A commonly used quantity is the overlap

which is defined as

d(k) = ∑x,y,z [s(x, y, z) ⋂ sk(x, y, z)] / ∑x,y,z [s(x, y, z) ⋃ sk(x, y, z)] . (3.2)

Clearly, a perfect segmentation will yield d(k) = 1. For comparing a test segmentation

with a truth segmentation, d(k) forms a good measure of the quality.
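The overlap of equation (3.2) is straightforward to compute from two binary masks; a minimal sketch follows (the function name is illustrative).

```python
import numpy as np

def overlap(truth, test):
    """Overlap d(k) of equation (3.2): intersection over union
    of two binary segmentation masks."""
    truth, test = truth.astype(bool), test.astype(bool)
    union = np.logical_or(truth, test).sum()
    return np.logical_and(truth, test).sum() / union if union else 0.0
```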

The problem, of course, is that if one had access to the ideal segmentation, s(x, y, z),

then the segmentation problem would be solved. Since s(x, y, z) is not accessible, it is

necessary to devise an approximation to d(k), which we shall call d(xk), a function

of measurable segmentation features xk, an N -element feature vector defined as

xk = [xk,1, xk,2, . . . , xk,N ]T . (3.3)

We will refer to this function, d(xk), as the quality function. Once this function is available,

we may compute sk∗ where

k∗ = arg maxk

d(xk) (3.4)

is the argument that maximizes the quality function and the desired segmentation is

sk∗ = F(v(x, y, z), Rk∗ , Tk∗). (3.5)

To determine which of a set of segmentations is the best, we select a set of features for

each candidate segmentation that serve as good predictors of segmentation quality. These


features then become the arguments to the quality function which performs the mapping

ℝ^N → ℝ and which can then be maximized over the set of K candidate segmentations as

described above in (3.4). The segmentation whose predicted overlap is the greatest, sk∗ ,

may then be returned as the optimal segmentation.

The challenge at this point is to determine the form for the quality function d(xk). One

possible approach to this task is to design a function based on the common-sense rules that

a radiologist or image analyst might apply in segmenting a nodule by manually adjusting R

and T . For example, one might try to determine a function that favors a medium T , a low

R, and a high gradient strength on the boundary of the nodule. In addition we might try

to add the requirement that R must increase as T decreases. The exact functional form is,

however, elusive at best. Particularly, as features interact with one another and the number

of features in the candidate description gets large, determining a closed form for d(xk) is

simply intractable.

Another possible solution is to use examples of good segmentations and let these form

the input to a “learning” system such as a neural network. In other words, a good set of

training data should take us a long way toward creating an appropriate quality function.

Our approach is to make use of a trained radial basis function (RBF) ANN to estimate the

quality (predict d(k)) for a given segmentation. RBF neural network systems are known

for their ability to approximate nonlinear functions based on a set of sample inputs and

outputs. They train very quickly and easily and always yield the same network for the same

set of training data. Excellent introductions to neural nets including radial basis function

architectures are found in [38–40]. Chen et al. presented a seminal paper on radial basis

function networks [41].


The RBF network that we have used is shown in Figure 3.1. This network forms the

mapping

d(xk) = ∑_{j=1}^{M} wj φj(‖xk − cj‖2), (3.6)

where M is the number of centers chosen, ‖ · ‖2 denotes the Euclidean norm, and cj is the jth center. The quantity φj(·) is a radial basis function that maps ℝ^N → ℝ. One

of the most common forms for φj(·) is the N dimensional Gaussian function and we have

chosen this function for our system. To train this system, M training vectors are chosen

and we set cj = xj , where xj is the jth training vector. The weights, wj , are set to the

training target value d(j). Setting the centers and weights in this way creates a generalized

regression neural network (GRNN). The one additional parameter that is required for such

a network is the spread parameter which controls the width of the Gaussian radial basis

function.
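A minimal sketch of evaluating such a network is given below. Note that equation (3.6) is written as an unnormalized weighted sum; the sketch uses the conventional GRNN normalization by the sum of the basis responses. All names are illustrative assumptions.

```python
import numpy as np

def grnn_predict(x, centers, targets, spread):
    """GRNN evaluation: Gaussian RBFs at the training vectors
    (centers c_j = x_j) weighted by the training targets d(j);
    `spread` controls the width of the Gaussian basis function."""
    d2 = np.sum((centers - x) ** 2, axis=1)    # squared distances to centers
    phi = np.exp(-d2 / (2.0 * spread ** 2))    # Gaussian responses
    return np.dot(phi, targets) / np.sum(phi)  # normalized weighted average
```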

In order to specify the network weights, a set of training data is required. To accom-

plish this, each of the nodules in the training set is segmented by hand using an interactive

graphical user interface (GUI) which allows the user to interactively choose R and T while

observing the resulting segmentation. An image of this GUI is shown in Figure 3.2. Once

a good segmentation is found, it is saved along with the correspond-

ing values of R and T . These manual segmentations are used to approximate the “ideal”

segmentations s discussed earlier.

Each nodule in the training set is segmented for every R and T combination on an

evenly spaced grid in the T − R space. The overlap measurement between these K seg-

mentations and the manual segmentations is then computed. After the set of candidate

segmentations is created, the features that correspond to these test segmentations also are


[Diagram: inputs xk,1, xk,2, . . . , xk,N feed radial basis units φ1, φ2, . . . , φM, whose outputs are weighted by w1, w2, . . . , wM and summed to produce d(xk).]

Figure 3.1: Diagram of the RBF ANN used to approximate the quality function, d(xk).

computed as previously described. That is, a set of feature vectors xk are created for

k = 1, 2, . . . K. These features are then presented to the neural network as inputs with

the actual overlap with the ideal (manual) segmentation, d(k), being presented to the network as the target output. After training, the ANN may be used for the segmentation of

new nodules. The system used for doing so is shown in Figure 3.3. The system in Figure


Figure 3.2: GUI interface used for manual segmentations with R and T. A slice by slice view is given as well as a 3D rendering. As the user adjusts the R and T sliders, the segmentation is updated.

3.3 must be used to compute the output d(xk) for every value of k. We then select k∗ to be

the value for k that yields the greatest d(xk).

Computing candidate segmentations for every possible value for k and evaluating these

constitutes an exhaustive search. Performing an exhaustive search is computationally ex-

pensive. We have had reasonable success using simulated annealing to search the

T −R solution space [42]. More will be said about this topic in the next chapter. The crit-

ical issue to be determined now is how best to populate the feature vector xk with features

that are useful for predicting quality in terms of overlap. It is to this topic that we now turn.


[Block diagram: v(x, y, z), Rk, and Tk → Segmentation Engine → sk(x, y, z) → Feature Calculation → xk → Artificial Neural Network → d(xk).]

Figure 3.3: The segmentation algorithm setup following neural network training. All that is required here is the segmentation engine, the feature calculation, and the trained neural network. As with the training setup, the output d(xk) must be computed for k = 1, 2, . . . , K. The desired segmentation can then be found by computing sk∗ = F(v(x, y, z), Rk∗ , Tk∗) where k∗ = arg maxk d(xk).

3.1 ANN Features

One of the most critical and challenging tasks in any problem of this nature is the

selection of appropriate features that, when combined in the ANN, will be capable of pre-

dicting the overlap. We began with a list of about 50 potential features. These included

features that were primarily morphological in nature such as sphericity and compactness

while other features were functions of the underlying image data such as gradient strength

and the standard deviation of the image data inside the segmentation mask. Some of these


potential features are highly correlated, indicating that we may be able to reduce the dimen-

sionality of the feature space significantly. Ideally, every combination of possible features

should be investigated in order to make an appropriate selection. Doing so, however, would

be computationally prohibitive. We were able to make use of Sequential Forward Selection

(SFS) to choose an adequate subset of features [43]. SFS calls for the selection of features

one at a time as long as the addition of new features contributes to an increase in the objec-

tive function. In our case, the desired objective function is mean overlap produced for any

given set of features. By using SFS, we succeeded in reducing the number of features to

four.
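A generic sketch of the SFS loop is given below; `score` is a placeholder for evaluating the objective (here, the mean overlap achieved with a candidate feature subset), and all names are illustrative.

```python
def sequential_forward_selection(candidates, score):
    """Greedy SFS: repeatedly add the feature that most improves the
    objective; stop when no remaining feature helps."""
    selected, best = [], float("-inf")
    remaining = list(candidates)
    while remaining:
        trial = {f: score(selected + [f]) for f in remaining}
        f_best = max(trial, key=trial.get)
        if trial[f_best] <= best:
            break                      # no feature improves the objective
        best = trial[f_best]
        selected.append(f_best)
        remaining.remove(f_best)
    return selected
```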

Image gradient information has been shown historically to be very useful in segmenta-

tion problems [4]. Since the intensity gradient is a measure of the spatial rate of change of

intensity, image gradients provide important data about the locations of changes in image

intensity located at the boundaries between structures in an image. The 3D gradient is generally defined as

g(x, y, z) = [ ∂v/∂x, ∂v/∂y, ∂v/∂z ]T . (3.7)

Since we are dealing with 3D sampled data that is anisotropic in the axial direction, we simply scale the directional gradients by the corresponding sample spacings. Three

different features involving image gradients were selected during the SFS process.

The first of these features is the mean convergence index (MCI). Convergence index

was first introduced by Kobatake et al. [44]. While magnitude information is often useful,

the information of importance to us here is the angle of g and so we normalize the gradient

by dividing by the magnitude yielding a unit vector in the gradient direction. That is,

gu = g/|g|. The convergence index is a measure of the amount of agreement between


the gradient angle ∠gu(x, y, z) and the angle of a unit magnitude vector field ∠ru(x, y, z)

pointing radially toward the cue point. We assume that for lung nodules, the gradient

vectors within the nodule generally point toward the middle of the nodule, particularly

at the nodule boundary. This is due to the fact that the nodules are generally brighter in

the center and then decrease in intensity away from the center due to density changes and

partial volume effects. Clearly, the closer the cue point is to the middle of the nodule, the

better will be the performance. However, we realize that cue point locations derived from

both manual and automated detection will be random variables. Taking the inner product

of these two vector fields yields a scalar field whose intensity represents the amount of

agreement between the angles of the two fields. We call this scalar field h(x, y, z) and

define it as

h(x, y, z) = gu(x, y, z) · ru(x, y, z). (3.8)

In order to exploit the convergence index to create a feature, we propose computing the

sample mean of h(x, y, z) over the voxels contained in the segmentation candidate under

consideration. That is, the MCI may be expressed as

MCI = (1/P) ∑_{x,y,z ∈ w} h(x, y, z), (3.9)

where w = {x, y, z : s(x, y, z) = 1} and s(x, y, z) is the segmentation field, and P is the

number of voxels within s(x, y, z) that are turned on. If the vectors in gu tend to point

in random directions, the inner product h(x, y, z) will tend to go toward zero. However,

more structured gradient directions such as what we would expect to see resulting from an

object such as a lung nodule would yield a meaningful non-zero inner product. We may


then expect that reasonable segmentation candidates will have a higher value for MCI while

poor segmentation candidates will have low values for the MCI feature. Figure 3.4 presents

a 2D visualization of the convergence index idea.

Figure 3.4: a) The original nodule image. b) Radial vector field from cue, −ru. c) Normalized gradient field, −gu. d) Convergence index image h(x, y, z) = gu(x, y, z) · ru(x, y, z). Note the bright area in the center corresponding to an area of high convergence.
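A sketch of computing the MCI feature of equations (3.8) and (3.9) is given below; the (z, y, x) axis ordering, the spacing handling, and the names are illustrative assumptions, and the value shown is the raw MCI before any cross-candidate normalization.

```python
import numpy as np

def mean_convergence_index(vol, mask, cue, spacing=(1.0, 1.0, 1.0)):
    """MCI: mean, over segmented voxels, of the inner product between
    the unit gradient field and the unit radial field toward the cue."""
    g = np.stack(np.gradient(vol.astype(float), *spacing), axis=-1)
    g /= np.linalg.norm(g, axis=-1, keepdims=True) + 1e-12  # unit gradients
    coords = np.stack(np.meshgrid(*[np.arange(n) for n in vol.shape],
                                  indexing="ij"), axis=-1).astype(float)
    r = np.asarray(cue, dtype=float) - coords               # toward the cue
    r /= np.linalg.norm(r, axis=-1, keepdims=True) + 1e-12
    h = np.sum(g * r, axis=-1)                              # equation (3.8)
    return h[mask.astype(bool)].mean()                      # equation (3.9)
```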

Gradient strength and radial deviation features were also indicated by SFS analysis to

be salient features. Gradient strength is simply the mean gradient magnitude along the


boundary of a structure while radial deviation indicates how much the angle of the gradient

along the boundary of the object deviates from the radial direction as measured from the

center of the nodule. The final feature selected was a measure of contrast between voxels

in the segmentation versus voxels outside the segmentation in the z direction.


CHAPTER IV

Intelligent Search of the R and T Solution Space

The segmentation algorithm presented so far essentially requires an exhaustive search

of the R−T space to find the best segmentation candidate. In other words, we need to sam-

ple the R−T space on a tight grid and to create the candidate segmentation corresponding

to each of these combinations. This process is computationally expensive and therefore

time consuming. In addition to the generation of candidate segmentations, features for

each candidate must be computed, as well as processed using the trained neural network

in order to compute the predicted quality (overlap) for the candidate under consideration.

This raises the question of efficiency. Can the algorithm be made more computationally

efficient? Can the R and T space be searched in a more intelligent and purposeful manner?

Is it possible to consider fewer candidate segmentations? What computational cost savings

may be achieved? We have considered two different options for more efficient searches of

the solution space. Other methods undoubtedly exist as well.

4.1 Efficiency Analysis for the Exhaustive Search

Let us begin by attempting to quantify the complexity of the exhaustive search. This

will be done in terms of function calls to the primary software functions involved when

programming the algorithm. Let us assume for the sake of analysis that we allow the


intensity threshold, T , to span a range of H Hounsfield Units and the structuring element

radius, R, to span a range of W voxels. If we apply uniform sampling in this space then this

implies that T will take on NT different values and R will take on NR different values where

NT = H/∆T + 1 and NR = W/∆R + 1, where ∆T and ∆R represent the sample spacing in the T and R directions respectively. That is, we will sample the R−T space on an NT × NR grid as shown

in Figure 4.1. For a single nodule then, a total of NT NR candidate segmentations must be

created using NT NR calls to the segmentation engine. Following the creation of candidate

segmentations, a set of features must be computed for every candidate. This would require

NT NR calls to the feature computation software. The last major step then is to pass these

features through the trained neural network, and once again this requires NT NR calls to

the neural network code. Clearly, the computational complexity increases linearly with

increasing NT or increasing NR and it increases quadratically if we desire to increase both

NT and NR. We consider next both a simulated annealing approach and a Golden Section

Search approach to reducing the complexity of this segmentation algorithm.

4.2 Simulated Annealing

Simulated annealing is a stochastic optimization algorithm that finds its origins in met-

allurgy. Its goal is to find the minimum of some objective function without having to eval-

uate the function for every potential solution in the solution space. In metallurgy, a given

metal specimen is annealed or tempered by first heating the metal to a high temperature.

While the metal is maintained at this high temperature, the atoms of the material move more

freely and tend to move toward an equilibrium state. This intense heating is followed by a

cooling regime which allows the metal to set and any weaknesses to be strengthened. In an


[Figure: the T − R space sampled on a uniform grid; T spans H HU with spacing ∆T and R spans W voxels with spacing ∆R.]

Figure 4.1: Sampling in the T −R space.

analogous manner, the simulated annealing algorithm begins with simulated “heating” of

the solution space. That is, the solution trajectory is allowed to take large random steps in

the solution space, some of which actually lead to an increase in the value of the objective

function. Taking random steps in the solution space some of which result in moving uphill

is sometimes referred to as Metropolis sampling [45]. If the new output from the objective

function is smaller than the previous output, the new solution is always accepted for the

next iteration. However, if the new output is larger than the previous output, the new one is

only retained with some probability. The larger the “uphill” step in the objective function

value, the lower the probability that it will be accepted. This probability of accepting uphill


solutions decreases as the system is allowed to “cool”. Keeping these uphill solutions on

occasion in the early stages of algorithm operation prevents the solution from converging

to a local minimum. However, as the algorithm progresses, it is desired that the solution

not jump out of the solution bowl. A popular cooling regime was developed by Geman

and Geman in [46]. This cooling regime has been shown to have outstanding convergence

properties. An excellent overview of simulated annealing in general may be found in [47].

Given an objective function y = G(u) which maps ℝ^M → ℝ where M is the dimen-

sionality of the solution space and u is a vector in that space, the basic flow of simulated

annealing proceeds as follows:

1. Begin with an initial guess at the solution, u0 and evaluate the objective function at

that point, y0 = G(u0).

2. Take a random step in the solution space to u1 and reevaluate the objective function

y1 = G(u1).

3. Continue iterating for a specified number of steps or until yk = G(uk) < threshold.

4. At the kth iteration, if yk < yk−1 then update the current solution to uk.

5. If at any iteration yk > yk−1 then keep uk as the solution with probability Pk, where Pk = e^(−∆/C), ∆ = |yk − yk−1|, and C = τ/ln(k + 1). τ is a tuning parameter called the

annealing constant.

6. The final solution is uk such that yk is the smallest value that we achieved at any point

in the algorithm. Since the algorithm may take “uphill” paths, the final position in


the state space is not always the optimal answer. We retain the best solution visited

over the course of iterating.
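A compact Python sketch of this loop, using the cooling schedule C = τ/ln(k + 1) and acceptance probability P = e^(−∆/C), is given below; the `objective` and `step` interfaces and the parameter names are illustrative.

```python
import math
import random

def simulated_annealing(objective, u0, step, tau=5.0, i_min=20, i_max=100,
                        threshold=None):
    """objective(u): value to minimize (e.g. 1 - predicted overlap);
    step(u): random neighbor proposal; tau: annealing constant."""
    u, y = u0, objective(u0)
    best_u, best_y = u, y
    for k in range(1, i_max + 1):
        u_new = step(u)
        y_new = objective(u_new)
        # accept downhill moves always; uphill moves with probability P
        if y_new < y or random.random() < math.exp(
                -abs(y_new - y) * math.log(k + 1) / tau):
            u, y = u_new, y_new
        if y < best_y:
            best_u, best_y = u, y
        if threshold is not None and k >= i_min and best_y < threshold:
            break                      # Imin reached and threshold met
    return best_u, best_y
```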

For us, the solution space is two dimensional in that u = [R, T ]T . In addition, our

objective function is computed by first creating the set of features xk that result from the

vector uk. In other words xk is really a function of uk. Since we have posed the optimization

problem from the standpoint of minimization, we must subtract our predicted quality from

one such that we are actually looking for the minimum of this new function. Ultimately

then, the objective function we wish to use is 1 − d(x(uk)), where d(x(uk)) is the neural network response to the candidate segmentation. What we desire to avoid is the computation of xk for numerous

values of T and R as well as the computation of d(xk) for all of these combinations.

We must next address the computation of the features at each step of the simulated

annealing algorithm. Unfortunately, the computation of the MCI feature requires a normal-

ization with respect to the largest MCI value found for a given nodule. This implies that we

cannot compute a given MCI value without having some indication of the range of conver-

gence index values for the nodule. That is, we need to arrive at some nominal value for this

maximum MCI value for a given nodule. To do so we create a subset of candidate segmen-

tations on a coarse grid and compute the MCI feature value for each one. The largest one

is used to normalize the measurement for input into the neural network. As the simulated

annealing process proceeds, if a higher MCI value is found for the nodule in question, then

future MCI measurements are normalized by this new value. The same needs to be said for

the radial deviation feature and the gradient strength feature. This pre-computation of a set

of NMCI values requires additional segmentation candidates prior to actually running the


simulated annealing algorithm. A similar procedure is necessary for the Golden Section

Search as well.

Applied to our segmentation problem the simulated annealing algorithm proceeds by

first selecting at random a starting point in the solution space which we will denote as

u0 = [R0, T0]T . At the kth step, the solution is uk = [Rk, Tk]T . The algorithm has one

tuning parameter τ which is called the annealing constant. The parameter τ may be used to

control how quickly the system is allowed to “cool” or converge to a solution. This value

must be set carefully. If τ is too small, the solution may become caught in a local minimum. On the other hand, if τ is too large, the solution may never converge at all. At each step, a

random step is taken in the solution space. If 1 − d(xk) < 1 − d(xk−1), then this solution

is always accepted as the current solution. However, if 1 − d(xk) > 1 − d(xk−1), then the

new solution is accepted with probability

P = e^(−∆/C) (4.1)

where C = τ/ln(k + 1) and ∆ = |d(xk) − d(xk−1)|. Clearly C decreases with increasing k. This has the effect of reducing the probability P that an “uphill” step will be accepted at

later stages of the algorithm. An illustration of this cooling regime is given in Figure 4.2.

Curves are shown for various values of τ .

There are a couple of different stopping criteria for this algorithm. The first is to stop

when the function 1− d(xk) falls below some pre-determined threshold. However, it is not

always possible to know how to set this threshold since for any given nodule, the absolute

minimum value of the objective function will be different. In this case, the algorithm

may be set to operate over a set number of iterations and then stop. Once the algorithm


[Plot: probability of accepting an uphill step vs. iteration k, for τ = 10, τ = 5, and τ = 2.]

Figure 4.2: Illustration of the simulated annealing cooling regime for several values of the annealing constant τ.

has stopped, the solution, Rk and Tk, is selected which produced the smallest value for

1 − d(xk) over the course of all the iterations. A combination of these stopping criteria

was used in our implementation. A minimum and maximum number of iterations were

specified. We will call these Imin and Imax respectively. The algorithm was set up to

always perform Imin iterations even if the objective function falls below a threshold. If an

objective function value less than a given threshold is found prior to reaching Imin, the algorithm

stops upon reaching Imin. Otherwise, it would continue on to complete a total of Imax

iterations. This guarantees that the algorithm will always stop and places an upper limit on

the complexity.

We must now answer the question about the computational cost savings of this search

technique over the exhaustive search. The worst case scenario occurs when all nodules

require Imax iterations. This will involve the creation of Imax + NMCI candidate segmen-

tations, the computation of Imax sets of features, and finally Imax calls to the trained neural

network. Assuming that Imax ≪ NT NR, the computational savings may be significant.


Clearly, however, there is a trade off, since with simulated annealing as with other stochas-

tic search techniques, we are not guaranteed to find the minimum solution, especially with

a small number of iterations. Computational savings must therefore be balanced with the

desired segmentation accuracy. Some experimental results are provided in Chapter VI.

4.3 Golden Section Search

The second approach used to attempt to minimize the computational complexity re-

quired by an exhaustive search of the solution space was the so-called Golden Ratio Search

or Golden Section Search. This search technique was first proposed by statistician Jack

Kiefer in 1953 [48]. A related search is known as the Fibonacci search. Such a search is

used to find the minimum of a 1D unimodal function. That is, the Golden Section Search is

designed to find the minimum of a one dimensional function that has exactly one minimum

within the region of interest. We propose using this search along the T dimension of the

segmentation solution space. In other words, for each value of R, we will find the value for

T that approximately minimizes the error function. We may then select the radius R that

yields the smallest of these NR results. While the assumption of a single minimum does

not hold true in general for our objective function, we have found that the search works

well enough to yield reasonable segmentations for a reduced computational cost.

To explain the application, let us first introduce the notion of the Golden Ratio. The

Golden Ratio is yet another mathematical constant like the constants π or e. Specifically, it

is derived from the formula (A + B)/A = A/B = φ. (4.2)


This formula may be illustrated geometrically as shown in Figure 4.3. The ratio of the

entire length of the line segment to A is equal to the ratio of A to B. It can be shown that φ

is also the positive root of the polynomial x^2 − x − 1 = 0. Solving equation (4.2) for A/B = φ yields the irrational number φ = 1.618….

[Figure: a line segment divided into lengths A and B.]

Figure 4.3: Illustration of the Golden Ratio.

How is the Golden Ratio applied to the minimization of a function? The basic idea is to

successively bracket the minimum of the function between two points gradually homing in

on the solution. Assume for a moment that the one dimensional objective function is G(x).

We begin by evaluating the function at the end points which we could call x0 and x1. We

then pick a third point x2 such that x2 = x0 + (1/φ)(x1 − x0). We evaluate the objective function

at all three points yielding G(x0), G(x1), and G(x2). Clearly, based on the assumption that

the function is unimodal, we will have G(x2) < G(x0) and G(x2) < G(x1). We now probe

the function at a further point x3 where x3 = x2 − (1/φ)(x2 − x0). If G(x3) > G(x2) then it is

clear that the solution, call it x∗ is between x3 and x1. In this case x3, x2, and x1 form a new

set of points and the algorithm is iterated. Otherwise, if G(x3) < G(x2), then the solution,

x∗, lies between x0 and x2. The new triplet here would be x0, x3, and x2. The only other

interesting question is why the search interval should be partitioned into sub-intervals that

satisfy the Golden Ratio. The reason for this is that it can be shown that doing so yields the


most rapid convergence for the search. This is because it ensures that the interval from x0

to x2 is the same length as the interval from x3 to x1. An illustration of this algorithm is

shown in Figure 4.4.

[Plot: an example objective G(x) with probe values G(x0), G(x1), G(x2), and G(x3); the new search range has width 10 · (1/φ) ≈ 6.18.]

Figure 4.4: Illustration of the golden section search for an example objective function G(x).
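A minimal golden-section minimizer is sketched below. It uses the standard two-interior-probe bookkeeping, which is equivalent to the x0 . . . x3 description above; for clarity the objective is re-evaluated on each pass rather than cached, and the names are illustrative. In our setting, G(T) would be one minus the predicted overlap for a fixed R.

```python
INV_PHI = (5 ** 0.5 - 1) / 2            # 1/phi, about 0.618

def golden_section_min(G, x0, x1, tol=1.0):
    """Bracket the minimum of a unimodal 1D function G over [x0, x1];
    the bracket shrinks by a factor of 1/phi per iteration."""
    a, b = x0, x1
    while (b - a) > tol:
        c = b - INV_PHI * (b - a)       # lower interior probe
        d = a + INV_PHI * (b - a)       # upper interior probe
        if G(c) < G(d):
            b = d                       # minimum lies in [a, d]
        else:
            a = c                       # minimum lies in [c, b]
    return (a + b) / 2.0
```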

In the case of the Golden Section Search, we are able to determine ahead of time the

number of steps required for convergence. For an initial T search interval of width H , it

can be shown that the required number of iterations necessary to obtain a solution tolerance

less than one is given by

Igss = 2 + log_{1/φ}(1/H). (4.3)


This implies that the total number of candidate segmentations required as well as feature

computations and neural network evaluations is given as NRIgss. Again, since we are not

guaranteed to find the absolute minimum based on the fact that we are violating the assump-

tions of the search to some extent, we must be willing to trade computational complexity

for final segmentation quality. We give some experimental numbers in Chapter VI. Golden

section convergence proceeds as (1/φ)^k. This is illustrated in Figure 4.5.

[Plot: width of the solution bracket vs. iteration k.]

Figure 4.5: Golden section search convergence shown as a fraction of the original search region width. For example, the search region is less than 10% of its original value after only 5 iterations.

We now apply this approach to our segmentation algorithm. We do so by considering a

given value of R and evaluating the objective (error) function by probing various T values.

For each value of R we get one minimum value for the objective function.

A summary of the computational complexity of each of the three search algorithms

is given in Table 4.1 in terms of calls to the major functional elements of the software. In

addition, a plot of computational complexity in terms of number of calls to the segmentation


Table 4.1: Computational Complexity.

Function Calls        | Exhaustive Search | Simulated Annealing | Golden Section
Segmentation Engine   | NR NT             | NMCI + Imax         | NMCI + NR (2 + log_{1/φ}(1/H))
Feature Computation   | NR NT             | Imax                | NR (2 + log_{1/φ}(1/H))
Neural Net            | NR NT             | Imax                | NR (2 + log_{1/φ}(1/H))

engine is shown in Figure 4.6. Complexity in terms of calls to the feature computation

software and calls to the neural network is shown in Figure 4.7. In these plots we assume

that NR = 4, Imax = 0.25NRNT , NMCI = 40 and that H is divided into increments of 10

HU. Observe that both the exhaustive search method and the simulated annealing search

increase in computational complexity linearly with the size of the search space for a fixed

NR. The golden section search on the other hand increases as the logarithm of the size of

the search space for fixed NR. This makes it extremely desirable from a complexity point

of view. In fact, for the searching of a large range of thresholds, it is unquestionably the

most computationally efficient algorithm. We will also see in the results section that the

Golden Section Search provided for the fastest segmentations.


[Plot: number of segmentation engine calls vs. width of threshold search in HU, for Brute Force, Simulated Annealing, and Golden Section Search.]

Figure 4.6: Number of calls to the segmentation engine as a function of threshold search size.

[Plot: number of feature computations and neural net calls vs. width of threshold search in HU, for Brute Force, Simulated Annealing, and Golden Section Search.]

Figure 4.7: Number of calls to the feature computation module and neural network as a function of threshold search size.


CHAPTER V

Estimation of Pulmonary Nodule Volume

The measurement of pulmonary nodule volume is critical to accurate diagnosis of lung

cancer. Numerous lesions visible in chest radiographs and CT scans are not cancerous. It is

therefore critical for radiologists to be able to distinguish between non-threatening benign

lesions and the more dangerous malignant nodules. Such differentiation from CT data is

currently an active research objective. Typically, differentiation follows from an estimate

of how fast the lesion is growing. Growth rates are typically determined based on a com-

parison of follow-up volume estimate to a baseline volume estimate of the nodule. Nodule

growth rates are most often discussed in terms of volume doubling times. Small volume

doubling times are indicative of a possibly malignant lesion while long doubling times are

often associated with benign tumors. In computing volume doubling times an exponential

growth model is typically assumed. Such a model may be represented mathematically as

V = V0 e^(αt) (5.1)

where V0 is the original nodule volume, α is the growth rate, and t is time. After a time

period of ∆t has elapsed a second volume measurement is made and then according to the

model we have V = V0 e^(α∆t). The first step in determining the doubling time is to compute


the parameter α. Clearly we may write

α = ln(V/V0) / ∆t (5.2)

The volume doubling time tD may then be calculated as

tD = ln(2)/α (5.3)

What is clear is that accurate volume estimation is necessary for accurate doubling time

measurement. In fact, using sensitivity analysis we may show that equation (5.3) is

extremely sensitive to errors in volume measurement especially for smaller nodules.
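As a worked illustration of equations (5.2) and (5.3), the sketch below computes a doubling time from two volume measurements; the numbers are hypothetical.

```python
import math

def doubling_time(v0, v1, dt):
    """Volume doubling time under V = V0*exp(alpha*t):
    alpha from equation (5.2), tD from equation (5.3)."""
    alpha = math.log(v1 / v0) / dt      # growth rate, equation (5.2)
    return math.log(2.0) / alpha        # doubling time, equation (5.3)

# e.g. a nodule growing from 500 mm^3 to 650 mm^3 over 90 days:
# doubling_time(500.0, 650.0, 90.0) -> about 238 days
```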

A number of different techniques are used in clinical practice to estimate the volume of

pulmonary nodules. These techniques may be broadly divided into 2D and 3D techniques.

In many cases, 3D methods are preferred because 2D methods do not necessarily account

for the possible asymmetric growth of nodules. However, if the volumes are estimated by

hand, 2D methods are the tool of choice since they may be estimated by looking at a single

slice. An excellent comparison of 2D vs. 3D volume estimation techniques is presented

in [49]. The 3D techniques must attempt to take into consideration the slice thickness and

slice spacing of the data involved. Among the common techniques used in practice are the

area method [15; 16], the minor axis method [14], and the perimeter method [13]. Winer-

Muram et al. have also presented a technique which seeks to compensate for the apparent

magnification that occurs for thick slice CT data [18]. Ko et al. present experimental results

of a technique that compensates for partial volume effects by utilizing the fact that partially

filled voxels will exhibit a lower attenuation than voxels that are completely filled with

nodule tissue [10]. The fraction of the voxel filled by tissue is assumed to equal the ratio of the observed voxel intensity to the intensity of a fully filled voxel. Finally, Gurcan et al. have presented


a novel minimax algorithm which attempts to account for the overestimation due to partial

volume effects [17]. At the end of this chapter, we present a novel approach to computing

the true nodule volume based on the compensation model of Winer-Muram et al. Let us

consider each of these algorithms in detail. Experimental results and a comparison of these

estimation techniques is presented in Chapter VI.

5.1 2D Methods

The area method volume estimate is computed by

Varea = (4π/3) (√(A/π))^3, (5.4)

where A = πr1r2 is the area of an ellipse with r1 and r2 as the major and minor axis radii of the nodule as measured in plane, as shown in Figure 5.1. The volume computed is the volume of a sphere whose cross-sectional area is A. This

technique operates on the slice with the largest nodule cross-section and effectively treats

the nodule as a sphere. The minor axis method is computed by

Vminor = (4π/3) r1 r2^2 . (5.5)

where r1 and r2 are measured the same way as in the area method. This technique assumes

that the nodule is an ideal ellipsoid. Obviously, these assumptions simplify the

computations, but also introduce error into the volume measurements.

These methods only require simple in-plane measurements in a single slice. Essentially,

the radiologist finds the slice with the largest nodule cross-section and then measures r1

and r2. Alternatively, this may be done automatically using software. In thick-slice data,


[Figure: an ellipse with major and minor radii r1 and r2 enclosing area A.]

Figure 5.1: Illustration of area method of nodule volume estimation.

they avoid the problems of severe partial volume effects in the axial dimension. How-

ever, the volume estimates may be highly inaccurate because they neglect possible nodule

variation in the axial dimension caused by asymmetric growth.
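Both 2D estimates reduce to one-line formulas; a minimal sketch (function names illustrative):

```python
import math

def v_area(r1, r2):
    """Area method, equation (5.4): the volume of a sphere whose
    cross-sectional area matches the ellipse area A = pi*r1*r2."""
    A = math.pi * r1 * r2
    return (4.0 * math.pi / 3.0) * (A / math.pi) ** 1.5

def v_minor(r1, r2):
    """Minor axis method, equation (5.5): an ellipsoid with radii
    r1, r2, and r2."""
    return (4.0 * math.pi / 3.0) * r1 * r2 * r2
```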

5.2 3D Methods

The most straightforward of the 3D methods is the perimeter method described in [13]

which simply sums the volumes for all of the segmented voxels yielding

Vper = Nv δx δy s, (5.6)

where Nv is the total number of segmented voxels in the nodule, δx and δy are the in-plane

voxel dimensions, and s is the slice spacing. This is a straightforward and intuitive 3D vol-

ume estimate. It is accurate when CT slices do not overlap and when partial volume effects


are negligible such as for cases when s is small. However, in many practical situations, it

tends to overestimate volume because of partial volume effects. In addition, in a number of

cases, the slices overlap. That is, s < Ts, where Ts is the slice thickness. This partial volume

overestimation is particularly pronounced with thick slice data [10; 17; 50].

One possible way to deal with partial volume overestimation is the recently proposed

minimax approach. The minimax nodule volume estimation technique was first proposed

in [50] and [17]. The minimax technique focuses on partial volume effects in the axial

dimension, as these are generally the largest source of volume estimation error for standard

thick slice images. The technique attempts to estimate the actual height of the segmented

object at each point given the number of segmented voxels in each object column. In fact, it can be shown that the minimax algorithm forms a maximum likelihood estimate of the object

height. This height is used along with the number of columns to estimate the volume. In

particular, the column-based volume estimate is given by

Vcol = δx δy ∑_{i=1}^{Nc} h(ni), (5.7)

where Nc is the total number of contiguous columns in the segmentation mask, ni is the

number of segmented slices associated with column i, and h(ni) are the height estimates

for each column as a function of the number of slices in that column.

Note that the perimeter method can be cast into this framework of the column-based

estimator by letting

hper(ni) = nis. (5.8)

The problem with this height estimation method is that it does not take into account par-

tial volume effects, nor does it account for the redundant volume that results when slice


spacing is less than slice thickness. Therefore, this method tends to overestimate volume

in many practical imaging scenarios [10; 17; 50].

The problem boils down to the fact that there is no unique mapping between the number

of segmented voxels in a column and the true height of the underlying object. Due to partial

volume effects, the number of segmented voxels for an object of a given height depends

on the slice thickness, slice spacing, density of the object, segmentation threshold, and

position of the object relative to the CT slices. Many of the same effects occur for in-plane

measurements, but these tend to be small compared with the axial errors when the slice

thickness is larger than the in-plane pixel spacing.

The minimax height estimator attempts to take many of these factors into account. The

estimator is based on an observation model that assumes that an object that spans more than

hmin of a voxel's axial extent will be segmented. Using this model, it can be shown

that the maximum height of an object detected in ni slices of thickness T and spacing s is

given by

hu(ni) = (ni + 1)s − T + 2hmin. (5.9)

Similarly, the minimum height of an object detected in ni slices is given by

hl(ni) = max{(ni − 1)s − T + 2hmin, hmin}. (5.10)

The minimax height estimate (which minimizes the maximum absolute height error [51])

is given by the average of hu(ni) and hl(ni) yielding

hmm(ni) = { ni s − (T − 2hmin),                     ni ≥ 1 + (T − hmin)/s
          { ni s − [(ni − 1)s + T − 3hmin]/2,       otherwise.          (5.11)


In other words, the estimate is the midpoint between the maximum possible object height

hu(ni) and minimum possible height hl(ni) that could have caused the observed segmen-

tation according to our model. This is a maximum likelihood estimator.

Note that the estimate in (5.11) can be viewed as the perimeter method height, nis,

minus a “correction” factor that depends on hmin, ni, T , and s. As expected, when hmin

decreases, so does the height and corresponding volume estimate using this method. In fact,

when ni ≥ 1 + (T − hmin)/s and hmin < 0.5T, the minimax volume estimate is guaranteed to

be less than the perimeter method estimate. A demonstration of this is provided in Figure

5.2. This shows the minimax height estimate, height bounds, and the perimeter method

height estimate as a function of the number of segmented slices, ni, for s = T = 5mm

and hmin = 0.2T . Note that the minimax estimate is consistently lower than the perimeter

method. This difference becomes more pronounced as T increases or hmin decreases.

[Plot: object height (mm) vs. ni, showing hu(ni) (upper limit), ni s (perimeter method), hmm(ni) (minimax estimator), and hl(ni) (lower limit).]

Figure 5.2: Height estimates for the minimax and perimeter methods along with the height bounds defined by the observation model as a function of the number of slices, ni. Here s = T = 5 mm and hmin = 0.2T.
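A sketch of the minimax column-height estimate and the resulting column-based volume of equation (5.7) follows; the names are illustrative.

```python
def h_minimax(n_i, s, T, h_min):
    """Minimax height of equation (5.11), written as the midpoint of
    the bounds h_u (5.9) and h_l (5.10) for a column segmented in
    n_i slices of thickness T and spacing s."""
    h_u = (n_i + 1) * s - T + 2.0 * h_min
    h_l = max((n_i - 1) * s - T + 2.0 * h_min, h_min)
    return 0.5 * (h_u + h_l)

def v_col(columns, dx, dy, s, T, h_min):
    """Column-based volume of equation (5.7); `columns` holds the
    number of segmented slices n_i for each in-plane column."""
    return dx * dy * sum(h_minimax(n, s, T, h_min) for n in columns)
```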


Winer-Muram et al. have found that the apparent magnification in size of a tumor due to

using the perimeter method is directly proportional to the slice thickness and inversely pro-

portional to the cube root of the tumor volume [18]. They have developed a compensatory

equation to be applied to perimeter method volume estimates. This equation was derived

by using spherical phantoms of known volume and comparing these actual volumes with

the volume estimates. Their equation takes the form

M = K Ts / Vt^(1/3) + 1, (5.12)

where Ts is the slice thickness, Vt is the true nodule volume, M is the magnification re-

sulting from partial volume effects, and K is a scaling parameter. While equation (5.12)

is based primarily on empirical data, it is interesting to note that it also makes sense from

a geometric point of view. To see this requires thinking of each nodule in terms of its

surface area and its volume. If the surface area of the nodule is large when compared with

the volume, then there are numerous voxels which could potentially exhibit partial volume

overestimation. However, if the volume is large compared with the surface area, then there

are many more “full” voxels than partially-filled voxels. This would translate into little or

no magnification. This formulation suggests an equation of the form

M = K S Ts / V + 1 (5.13)

where K is a scaling constant, S is the nodule surface area, and V is the nodule volume.

Assuming spherical nodules leads to an equation of the form of (5.12).

Using the magnification model given in (5.12) we observe that


Vo = M Vt = (K Ts / Vt^(1/3) + 1) Vt, (5.14)

where Vo is the observed tumor volume including magnification. We refer to equation

(5.14) as the forward magnification model. That is, it allows us to predict the observed

volume given that we know the true volume along with the imaging parameters and the

constant K. We now wish to solve for Vt in terms of Vo, but first a suitable value for K must be determined for the dataset in question. We address this issue in

Section 6.2.

In order to solve for Vt given Vo requires solving the equation

K Ts / Vt^(1/3) + 1 − Vo/Vt = 0. (5.15)

Equation (5.15) can be algebraically manipulated to produce a cubic equation which with

additional work may be solved analytically. Specifically, we let x = Vt^(1/3), which yields the cubic equation

x^3 + K Ts x^2 − Vo = 0. (5.16)

We can now solve this cubic using Cardano’s method for finding cubic roots [52]. In

general, cubic polynomials have three roots, two of which may be complex. However, it

can be shown that for the polynomial x^3 + ax^2 + bx + c = 0, where b = 0, c < 0 and a > 0, there will be only one positive root. The other two roots are either complex or

negative. Solving this cubic equation yields Vt in terms of Vo. We do not elaborate here on

Cardano’s method for solving cubics but give the solution

Vt = ( p/(3u) − u − K Ts/3 )^3 (5.17)


where p = −(K Ts)^2/3 and

u = ( (−Vo + 2(K Ts)^3/27)/2 + ( (−Vo + 2(K Ts)^3/27)^2/4 + (K Ts)^6/729 )^(1/2) )^(1/3). (5.18)

Numerical methods such as Newton’s method could also be used to find the roots of this

equation [53]. Solving this cubic either analytically or numerically is somewhat different from, and we believe more straightforward than, the method proposed in [18]. We will refer to

this method of solving for Vt as the modified Winer-Muram (MWM) method.
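A numerical version of this inversion is compact; the sketch below finds the single positive real root of the cubic (5.16) with NumPy rather than evaluating (5.17) and (5.18) directly. Function and argument names are illustrative.

```python
import numpy as np

def mwm_true_volume(v_obs, K, T_s):
    """Modified Winer-Muram inversion: solve x^3 + K*T_s*x^2 - V_o = 0
    for x = V_t^(1/3) and return V_t = x^3."""
    roots = np.roots([1.0, K * T_s, 0.0, -v_obs])
    # keep the single positive real root guaranteed by the analysis above
    x = max(r.real for r in roots if abs(r.imag) < 1e-9 and r.real > 0)
    return x ** 3
```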


CHAPTER VI

Experimental Results

6.1 Nodule Segmentation

6.1.1 Data

Three different datasets were used in the training and testing of the segmentation algo-

rithm that has been proposed. VOIs containing nodules were extracted from these full lung

studies such that each VOI contained one nodule. This caused a significant reduction in

computational complexity. These VOIs were 129 × 129 voxels in-plane and spanned a sufficient number

of slices in the axial direction to fully contain all the nodules in the set. Obviously, the thin

slice data required more slices than the thick slice data.

The first dataset is a publicly available set acquired through the Early Lung Cancer

Action Program (ELCAP) from the Weill Medical College of Cornell University [54]. This

dataset contains full lung studies for 50 patients. These scans were collected using a GE

Medical Systems LightSpeed Ultra scanner set to a tube voltage of 120 kVp and operating

in helical mode. The slice spacing and thickness were both 1.25 mm.

The second dataset was provided with IRB approval by the Department of Radiol-

ogy at the University of Texas Medical Branch (UTMB) in Galveston and contains CT lung


studies from 57 patients reconstructed to both 2.5 mm and 5 mm slice thicknesses. This

dataset was collected using a GE Medical Systems LightSpeed QX/i scanner operating at

120 kVp and a tube current of 160-270 mA. This imager was also operated in helical mode

during the data collection.

The third dataset was provided through the publicly available National Cancer Imaging

Archive website. In this case, the Lung Image Database Consortium (LIDC) dataset was used. The LIDC data contains 84 full

lung CT studies with data at various slice spacings from 0.75 mm to 3 mm. The data in

this set was collected using a number of different scanners and scanner settings. The peak

voltage ranged from 120-140 kVp while the tube current ranged from 40-422 mA. In ad-

dition, various convolution kernels were used during reconstruction. This dataset includes

radiologist segmentations for a number of nodules. Each nodule has been hand segmented

by up to four radiologists. For the purpose of our study, we used nodules segmented by at

least 3 radiologists. In addition, for the purpose of comparison, we made use of a 50%

consensus criterion meaning that 50% or more of the segmenting radiologists must agree

that a given voxel is part of the nodule for it to be included in the truth segmentation mask.

This method of combining segmentations from multiple radiologists into a single truth is

a common practice [31; 55]. Table 6.1 provides a summary of the datasets as well

as giving a breakdown of how they were divided up into a training set and a testing set for

evaluating the performance of the neural network segmentation approach.

An interesting thing to study is the variation among radiologists who manually segment

the same nodule. Studying this variation gives some insight into the performance level that

might be expected from automated segmentation systems. We performed such a study by


Table 6.1: Summary of Datasets.

Name   | Slice Spacing | Training Nodules | Testing Nodules
ELCAP  | 1.25 mm       | 72               | 0
UTMB   | 2.5 mm        | 76               | 0
UTMB   | 5 mm          | 76               | 0
LIDC   | 0.75 mm       | 0                | 1
LIDC   | 1 mm          | 0                | 1
LIDC   | 1.25 mm       | 0                | 4
LIDC   | 1.8 mm        | 0                | 39
LIDC   | 2 mm          | 0                | 3
LIDC   | 2.5 mm        | 0                | 4
LIDC   | 3 mm          | 0                | 17
Total  |               | 224              | 69

computing the mean overlap between a given radiologist's segmentation and other segmen-

tations of the same nodules performed by other radiologists. The results of this analysis are

shown in Table 6.2. In this table, the radiologist on the left represents the “truth” segmen-

tation for the purpose of comparison.

Table 6.2: Comparison of inter-observer variability measured in terms of overlap values between radiologists for the LIDC dataset.

               | Radiologist 1 | Radiologist 2 | Radiologist 3 | Radiologist 4
Radiologist 1  | 100%          | 57.5%         | 71.0%         | 61.8%
Radiologist 2  | 57.5%         | 100%          | 80.0%         | 67.2%
Radiologist 3  | 71.0%         | 80.0%         | 100%          | 63.4%
Radiologist 4  | 61.8%         | 67.2%         | 63.4%         | 100%


6.1.2 Neural Network Training

The neural network was trained with a training set consisting of nodules from the EL-

CAP and UTMB datasets. Approximately half of these were used for training and half

were used to accomplish SFS. Nodules from the LIDC dataset were not used in training or

feature selection. All of the nodules described in Table 6.1 except for the LIDC nodules

were segmented manually by the principal author to create truth segmentations s(x, y, z)

using a GUI tool created for this purpose. To complete the creation of the training data,

it was necessary to create candidate segmentations using R and T values selected from a

uniformly sampled grid. We allowed R to take on the values 0, 1, 2, and 3. We discovered

that larger values for R caused excessive degradation of the segmentation. We allowed T to take on the values

{−1024,−1014, . . . ,−4} for a total of 103 different values of T . Such a wide range for T

is critical especially when extremely low density nodules are encountered as in the LIDC

dataset. This resulted in a total of K = 412 different candidate segmentations for every

nodule. This was repeated for each of the L = 100 nodules in the training set.

After the candidate segmentations were created, the values for the overlap, d(j) for

j = 1, 2, . . . , K × L were computed. This yielded a total of K × L = 51,088 sample points

in the function we desired to encode using the ANN. However, some of these combinations

resulted in no segmentation at all if the threshold was too high for example. These were

discarded leaving a total of about 26,000 input-output combinations. A random subset of

2,000 of these 26,000 test vectors were presented to the neural network together with the

corresponding quality measurement d(k) during training. As discussed previously, the SFS

technique was used to select the best features. Figure 6.1 shows the mean overlap for the


first few features determined during SFS. The four features selected for training an operational network, in order of salience, were MCI, surface gradient, contrast in the z direction, and radial deviation.

[Plot: mean overlap (0.65–0.72) versus number of features (1–5).]

Figure 6.1: The results of forward sequential feature selection for the first 5 features.
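The SFS procedure itself reduces to a greedy loop. A minimal sketch is given below, assuming a score(features) helper (ours, not shown) that trains the RBF network with the given feature subset and returns the resulting mean overlap on the feature-selection split.

    def sequential_forward_selection(all_features, score, n_select=4):
        # Greedy SFS: repeatedly add the single feature that most
        # improves the score of the currently selected subset.
        selected = []
        remaining = list(all_features)
        while remaining and len(selected) < n_select:
            best = max(remaining, key=lambda f: score(selected + [f]))
            selected.append(best)
            remaining.remove(best)
        return selected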

6.1.3 Segmentation Performance

The performance of this segmentation algorithm was quantified by comparing the automatically generated segmentations to the radiologist-truthed nodules from the LIDC dataset. In particular, the overlap value between the truth and the test segmentations was computed, followed by the calculation of a sample mean and sample standard deviation across all of the nodules available in each test set. The overlap performance for the LIDC


data is shown in Table 6.3, with a corresponding histogram shown in Figure 6.2. Specifically, we show the results both when R and T are chosen manually and when they are chosen automatically by the ANN with an exhaustive search.

Table 6.3: Test results for the LIDC data.

                                    LIDC Manual   LIDC Automated
    Mean Overlap                    72%           61%
    Standard Deviation of Overlap   16%           20%

[Histogram: number of occurrences versus overlap.]

Figure 6.2: A histogram showing the overlap results for the LIDC data.


In addition to manually segmenting the nodules in the LIDC database, the radiologists were asked to characterize the nodules in terms of their shapes and features. Four features of interest are spiculation, lobulation, sphericity, and margin. These were rated on a scale of 1 to 5, with low values indicating little or no presence of the feature in question and high values indicating that the feature is definitely present. The margin rating indicates how sharp or diffuse the boundary of the nodule is. For each of the 69 LIDC nodules that we extracted, we computed the mean radiologist rating for each of these four features in order to quantify the performance of our segmentation algorithm as a function of them. Plots indicating this performance are shown in Figure 6.3. Two examples of segmentations of the LIDC data are shown in Figures 6.4 and 6.5, the former being an example of a good segmentation and the latter an example of a poorer segmentation. We also present histograms indicating the distribution of T and R chosen by the network in Figures 6.6 and 6.7, respectively.

Finally, in order to compare our algorithm to an existing pulmonary nodule segmentation algorithm, we reproduced a technique very similar to that of Kostis et al. as a baseline [12]. Kostis et al. perform isotropic resampling in the axial dimension to produce uniformly sized voxels. They follow this with thresholding and then 3D morphological opening with a 3D spherical kernel. They make use of a constant threshold value, selected based upon volume computation experiments done with a set of phantoms. In addition, they make use of a constant structuring element radius of three voxels, while making the point that this radius may need to be adjusted during segmentation. We applied the Kostis algorithm to the LIDC dataset for a number of


[Four panels: overlap (0–1) versus mean radiologist margin, spiculation, lobulation, and sphericity ratings (1–5).]

Figure 6.3: Plot showing overlap performance as a function of mean radiologist-rated features.

different thresholds and for a structuring element radius of three. A plot showing the overlap as a function of threshold for the Kostis algorithm is shown in Figure 6.8. It can be seen that this algorithm achieves a maximum overlap of 51% at a threshold of −600.
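Our reading of this baseline can be summarized in a few lines. The sketch below is our approximation of the procedure, not Kostis et al.'s published code; ball() is the helper defined in the earlier candidate-generation sketch, and the fixed parameter values shown are the ones used in this experiment.

    from scipy import ndimage

    def kostis_baseline(volume_hu, spacing_mm, T=-600, R=3):
        # Resample to isotropic voxels, threshold at a fixed T (HU),
        # then apply a 3D opening with a radius-R spherical kernel.
        iso = min(spacing_mm)
        factors = [s / iso for s in spacing_mm]
        resampled = ndimage.zoom(volume_hu, factors, order=1)
        return ndimage.binary_opening(resampled > T, structure=ball(R))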

Intelligent Search of the Solution Space

As discussed previously in Chapter 4, a method to reduce the computational complexity of the segmentation algorithm is extremely desirable, and two such approaches were considered. The first approach was the method of simulated annealing. This approach was tested using the LIDC dataset, and the results are shown in Table 6.4. In addition, an example trajectory through the R−T solution space is shown


[Slices 1–5: automated and radiologist segmentations.]

Figure 6.4: An example of the segmentation of an LIDC nodule. This example exhibited an overlap of 83%.

in Figure 6.9. Reasonably good segmentation results were also achieved using the Golden Section Search; two examples of the convergence of this search are shown in Figures 6.10 and 6.11. Again, this search technique yields approximately the same mean overlap as the exhaustive search. The results are shown in Table 6.4.
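For reference, a compact version of the annealing search is sketched below. The cost being minimized is one minus the ANN-predicted quality of the candidate generated at (R, T); the proposal distribution, cooling schedule, and iteration count are illustrative choices of ours, and predicted_quality stands in for the trained network.

    import math
    import random

    def anneal_RT(predicted_quality, temp=1.0, alpha=0.95, n_iter=200):
        # Simulated annealing over the (R, T) solution space.
        R, T = random.choice((0, 1, 2, 3)), random.randint(-1024, -4)
        cost = 1.0 - predicted_quality(R, T)
        best = (R, T, cost)
        for _ in range(n_iter):
            R2 = min(3, max(0, R + random.choice((-1, 0, 1))))
            T2 = min(-4, max(-1024, T + random.randint(-50, 50)))
            c2 = 1.0 - predicted_quality(R2, T2)
            # Accept improvements always; accept worse moves with a
            # probability that shrinks as the temperature cools.
            if c2 < cost or random.random() < math.exp((cost - c2) / temp):
                R, T, cost = R2, T2, c2
                if cost < best[2]:
                    best = (R, T, cost)
            temp *= alpha
        return best[0], best[1]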

In Chapter 4, we presented a theoretical analysis of the computational complexity of this segmentation algorithm. If the range of T to be searched is large, the Golden Section Search provides the most efficient search while the exhaustive search clearly provides the least


[Slices 1–3: automated and radiologist truth segmentations.]

Figure 6.5: A second example of the segmentation of an LIDC nodule. This example exhibited an overlap of 42%. Note that the main problem is that slice 1 was missed by the automated algorithm.

efficient search. We tested this by segmenting five nodules with each of the methods and measuring the time consumed in computing the segmentations. Specifically, we recorded the mean time spent creating candidate segmentations as well as the mean time spent computing features, as these two processes are the most time-consuming components of the algorithm. Table 6.5 gives the results of this experiment and indicates agreement with the theoretical results given in Chapter 4.
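A sketch of the golden section refinement over T (for a fixed R) follows. It assumes the cost 1 − predicted quality is approximately unimodal in T; as Figure 6.10 illustrates, this does not always hold, and the search may then return a local minimum.

    import math

    def golden_section_T(cost, lo=-1024.0, hi=-4.0, tol=1.0):
        # Golden section search for the T minimizing cost(T) on [lo, hi].
        invphi = (math.sqrt(5.0) - 1.0) / 2.0   # about 0.618
        a, b = lo, hi
        while b - a > tol:
            c = b - invphi * (b - a)
            d = a + invphi * (b - a)
            if cost(c) < cost(d):
                b = d          # minimum lies in [a, d]
            else:
                a = c          # minimum lies in [c, b]
        return round(0.5 * (a + b))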


Figure 6.6: A histogram showing the values of T chosen by the neural network for the LIDC data.

Figure 6.7: A histogram showing the values of R chosen by the neural network for the LIDC data.


Figure 6.8: Performance of the Kostis algorithm as a function of threshold. A structuring element radius of 3 was used in this case.

Table 6.4: Comparison of segmentation performance for the intelligent search techniques along with the results for the exhaustive search.

                                    Exhaustive Search   Simulated Annealing   Golden Section Search
    Mean Overlap                    61%                 63%                   61%
    Standard Deviation of Overlap   20%                 19%                   21%

6.2 Volume Estimator Performance

6.2.1 Nodule Segmentation and Datasets

Since the objective of this part of our research was to analyze and compare the accuracy of volume estimation techniques, the goal of the segmentation portion of this study was to obtain the most accurate nodule segmentations possible.


[Plot: solution trajectory in the radius–threshold plane; background shading from lowest to highest quality.]

Figure 6.9: Portrayal of an example trajectory resulting from the simulated annealing. The image in the background represents the levels of the error function, white representing high error and black representing low error. The green dot represents the start point while the red dot represents the final solution.

Table 6.5: Mean computation times for creating segmentation candidates and computing features.

                            Candidate creation   Feature computation
    Exhaustive Search       82 s                 85 s
    Golden Section Search   18 s                 24 s
    Simulated Annealing     24 s                 30 s


[Plot: threshold T versus iteration, showing the threshold trajectory and the actual minimum.]

Figure 6.10: Golden section search example 1. Note that the system converges to a local minimum.

Cue points were provided for each of the datasets so that the positions of nodules were known in advance. This knowledge of nodule location could then be easily incorporated into the segmentation process.

A number of segmentation techniques for solid pulmonary nodules have been suggested in the literature [11; 29; 30; 41]. For our work, each of the segmentations was accomplished by the author using an interactive supervised process based on the work of Kostis et al. [12] and described previously in this dissertation. This interactive process involved both intensity thresholding and morphological processing of the resulting binary image stack, as described in Chapter 2. Prior to segmentation, the region of interest was displayed using a window width of 2000 HU and a window level of −700 HU, which are the standard window parameters for viewing lung data.

Three different datasets were used to compare the volume estimators. The first dataset contained 13 nodule phantoms of known volumes imaged at four different slice thicknesses.


[Plot: threshold T versus iteration, showing the threshold trajectory and the actual minimum.]

Figure 6.11: Golden section search example 2.

The phantoms were embedded in Plexiglas. These data were acquired using the Siemens Volume Zoom scanner at 120 kVp and a tube current of 100 mA. The slice thicknesses were 1.25, 5, 8, and 10 mm, respectively. A wide-band B70 reconstruction filter was used to maintain sharp boundaries within the images. The nodule phantoms were all spherical in shape. For these data, the slice spacing was always less than the slice thickness, resulting in some slice overlap.

The second dataset is a publicly available set acquired through ELCAP from the Weill Medical College of Cornell University. This dataset contains full lung studies for 50 patients. These scans were collected using a GE Medical Systems LightSpeed Ultra scanner set at a tube voltage of 120 kVp and operating in helical mode. The slice spacing was 1.25 mm. Each study contains a number of nodules. Since thick-slice data were not available for these studies, the 1.25 mm data were filtered and then down-sampled in the


axial direction to produce data equivalent to 2.5 mm and 5 mm slice thicknesses. A total of

66 nodules from this dataset were used.

The third dataset was provided with IRB approval by the Department of Radiology at the University of Texas Medical Branch (UTMB) in Galveston and contains CT lung studies from 57 patients reconstructed to both 2.5 mm and 5 mm slice thicknesses. This dataset was collected using a GE Medical Systems LightSpeed QX/i scanner operating at 120 kVp and a tube current of 160-270 mA. This imager was also operated in helical mode during the data collection. Table 6.6 gives a summary of the datasets used for computing the experimental results in this section; the phantom nodules of known volumes described above appear in the table as the Muenster data (Datasets 1-4).

Table 6.6: Summary of Datasets Used.

    Name       Dataset   Slice Thickness   Slice Spacing
    Muenster   1         1.25 mm           0.6 mm
    Muenster   2         5 mm              4 mm
    Muenster   3         8 mm              6.1 mm
    Muenster   4         10 mm             8 mm
    ELCAP      5         1.25 mm           1.25 mm
    ELCAP      6         2.5 mm            2.5 mm
    ELCAP      7         5 mm              5 mm
    UTMB       8         2.5 mm            2.5 mm
    UTMB       9         5 mm              5 mm

Each of the nodules in these datasets was segmented at the slice thicknesses available. Examples of segmented phantom, ELCAP, and UTMB data are shown in Figures 6.12, 6.13, and 6.14, respectively. Notice the effect of increased slice thickness that appears in the 3-D renderings of the nodules. This demonstrates the difficulty of obtaining accurate volume estimates at greater slice thicknesses.

Figure 6.12: Example segmentation of the Muenster phantom data for 1.25, 5, 8, and 10 mm slice thicknesses.

Figure 6.13: Example segmentation of an ELCAP nodule showing one slice and 3-D renderings for 1.25 mm, 2.5 mm, and 5 mm slice spacings.

Figure 6.14: Example segmentation of a UTMB nodule showing one slice and 3-D renderings for 2.5 mm and 5 mm slice spacings.

6.2.2 Phantom Data

Using the phantom data from Datasets 1-4 was a logical place to start in testing the performance of the various nodule volume estimators. In these datasets, the distance between slices is less than the slice thickness, indicating a degree of slice overlap. These nodules were segmented manually using the segmentation technique described in Section 6.2.1. Following segmentation, volume estimation was performed for all phantoms in Datasets 1-4.

Consider first the MWM technique. To determine a value for the parameter K, we

made use of the phantom data in Datasets 1-4 together with the known volumes Vt and the measured volumes Vo. A histogram of the K parameters for each of the phantom nodules is shown in Figure 6.15. To determine a single K value, we averaged the values computed from all combinations of the known volumes Vt and the observed volumes Vo obtained using the perimeter method. This yielded K = 0.902. Figure 6.16 shows scatter plots comparing the MWM model presented in Chapter 5 against the actual measured magnifications at each slice spacing. These plots show reasonably good agreement between the measured magnifications and the model.
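The per-phantom K values behind Figure 6.15 can be computed directly once a magnification model is fixed. The actual MWM model is developed in Chapter 5; the sketch below assumes, purely for illustration, a Winer-Muram-style model in which the observed effective diameter is inflated by K times the slice thickness, and phantom_measurements is a hypothetical list of (Vt, Vo, slice thickness) triples.

    import numpy as np

    def k_for_phantom(v_true, v_obs, slice_mm):
        # Illustrative model: Vo = (pi/6) * (d_true + K * s)^3, so K is
        # the diameter inflation per millimeter of slice thickness.
        d_true = (6.0 * v_true / np.pi) ** (1.0 / 3.0)
        d_obs = (6.0 * v_obs / np.pi) ** (1.0 / 3.0)
        return (d_obs - d_true) / slice_mm

    # One K per (phantom, slice thickness) combination; the single
    # operating value is their average (0.902 in our experiments).
    # K = np.mean([k_for_phantom(vt, vo, s)
    #              for vt, vo, s in phantom_measurements])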

[Histogram: number of occurrences versus K.]

Figure 6.15: Histogram of the values of the K parameter calculated from the phantom data, Datasets 1-4.


[Four panels (a)–(d): magnification versus nodule volume (mm³); each panel compares the measured magnification with the Modified Winer–Muram model.]

Figure 6.16: Comparison of measured magnification and MWM-predicted magnification for (a) 1.25, (b) 5, (c) 8, and (d) 10 mm thickness phantom data.


In order to optimize the performance of the minimax method, we applied it to the computation of the nodule volumes over a range of hmin values, computing the average relative error for each hmin value. The range of hmin values used was between 0 and 2.5 mm for the 2.5 mm data and between 0 and 5 mm for the 5 mm data. A similar test was conducted for the 5 mm slice thickness ELCAP data, and based on these experiments the hmin values chosen were 0.4 mm and 0.1 mm for the 2.5 mm data and the 5 mm data, respectively.

Figure 6.17 shows the volume estimates for both the MWM and minimax methods, along with the truth volumes, for all the slice thicknesses. We are particularly interested here in comparing the 3D volume estimators. Since the phantom nodules in these data are spherical, we would expect the 2-D methods to perform very well in this case. The plot in Figure 6.18 shows a comparison of the perimeter, minimax, and MWM estimators based on the absolute percent error (APE) in the estimates. The minimax estimator was used here with hmin = 0.1 mm, while the parameter K = 0.902 was used for the MWM algorithm.

Based on these data, we conclude that the MWM estimator yields the best results in this case and appears to be a good way to account for partial volume over-estimation in the perimeter method. It can be argued that the MWM estimator outperforms the minimax method for large slice thicknesses; we observed that this is especially true when there is moderate slice overlap in the data. However, both methods significantly lower the volume estimation error resulting from partial volume effects.


[Four panels (a)–(d): estimated volume (mm³) for each of the 13 phantom nodules, comparing truth, Modified Winer–Muram, and minimax estimates.]

Figure 6.17: Comparison of the 3D volume estimates for (a) Dataset 1, (b) Dataset 2, (c) Dataset 3, and (d) Dataset 4.


[Plot: absolute percent error versus slice thickness (mm) for the perimeter, minimax, and Modified Winer–Muram methods.]

Figure 6.18: Absolute percent error comparison of various volume estimates for Datasets 1-4.

6.2.3 ELCAP Data

The primary objective in analyzing in vivo data is to compare the performance of various nodule volume estimation techniques on real nodules. For the in vivo data from the Cornell University ELCAP dataset, the actual nodule volumes were unknown, raising the question of how such comparisons can be made without the availability of truth volumes. A number of techniques have been proposed in the literature, particularly within the medical field, for determining the agreement of new measurement techniques with accepted ones. Some researchers use correlation methods to determine the correlation coefficient between data declared to be the truth and the results of the new technique. However, these correlation methods naturally produce non-zero correlation coefficients, because the two techniques are attempting to measure the same quantity.


An alternative method has been suggested by Bland and Altman [56]. Their method applies to situations in which there is a clinically accepted measurement technique for some quantity of interest and a new measurement technique that is to be evaluated for its interchangeability with the accepted method. Their technique involves plotting the difference between the accepted method and the test method against their mean and examining the spread of this difference. The standard deviation of this difference gives a quantitative measure of what Bland and Altman call the “agreement” between the two methods. The judgment about whether sufficient agreement exists between two measurement techniques is a function of the specific application under consideration. Here we are interested in measuring the relative agreement among the nodule volume estimators being compared.

The Bland-Altman technique does not require that either of the techniques be declared the “gold standard.” For the ELCAP data, we use both the 1.25 mm perimeter method volume estimates and the 1.25 mm MWM volume estimates as the “accepted” volume estimates in the Bland-Altman sense. The lower the standard deviation, the greater the agreement, or consistency, between the two methods. The mean of these differences is also of interest, as it indicates the bias of the new technique with respect to the existing technique. Following Bland and Altman, we note that a known bias may be corrected by adjusting the final estimates accordingly. We focus therefore on the spread of the error as a measure of consistency with the accepted volume measurement technique.

The problem with applying Bland-Altman analysis to the raw data is that there is naturally a strong relationship between the size of the nodules and the absolute difference between the volume and the mean. This may cause an unacceptable skewing of the standard deviation. Bland and Altman suggest that the solution to this problem is to transform the data by taking the base-10 logarithm prior to analysis, which tends to “decouple” the error from the size of the nodule. We elected instead to approach the problem by considering the percent error between the new technique and the existing technique.

The next issue to be addressed is the spread of the difference. To calculate this spread, the standard deviation of the percent errors was computed. This quantity allows us to assign a measure of “goodness” to the estimates when compared with the existing, accepted method: its magnitude indicates the agreement between the methods, with smaller magnitudes indicating a greater level of agreement. The standard deviation and mean calculations for the 66 segmented nodules from the ELCAP data, with the 1.25 mm perimeter method as the reference, are shown in Tables 6.7 and 6.8, respectively. The percent error statistics using the 1.25 mm MWM estimates as the reference are shown in Tables 6.9 and 6.10, respectively. These results suggest that the MWM volume estimator used for the 2.5 mm and 5 mm slice thicknesses agrees most closely with the “accepted” methods.

Table 6.7: Standard Deviation of Percent Error Results for the ELCAP Data.

    Slice Thickness   Perimeter   MWM   Area   Minor Axis   Minimax
    2.5 mm            44%         9%    58%    45%          30%
    5 mm              130%        9%    100%   86%          69%


Table 6.8: Mean of Percent Error Results for the ELCAP Data.

    Slice Thickness   Perimeter   MWM    Area   Minor Axis   Minimax
    2.5 mm            24%         -42%   3%     -13%         -12%
    5 mm              126%        -62%   12%    -7%          19%

Table 6.9: Standard Deviation of Percent Error Results for the ELCAP Data with Respect to the 1.25 mm MWM Estimates.

    Slice Thickness   Perimeter   MWM   Area   Minor Axis   Minimax
    2.5 mm            73%         38%   77%    63%          50%
    5 mm              217%        88%   134%   118%         114%

Table 6.10: Mean of Percent Error Results for the ELCAP Data with Respect to the 1.25 mm MWM Estimates.

    Slice Thickness   Perimeter   MWM   Area   Minor Axis   Minimax
    2.5 mm            70%         -1%   40%    18%          20%
    5 mm              215%        36%   53%    26%          66%
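The agreement statistics reported in Tables 6.7 through 6.10 reduce to a few lines of computation. A minimal sketch follows, assuming paired volume estimates from the reference (“accepted”) method and the method under test; the function name is ours.

    import numpy as np

    def percent_error_agreement(v_ref, v_test):
        # Bland-Altman-style agreement on percent error: the mean is the
        # bias of the test method relative to the reference, and the
        # sample standard deviation measures the spread (disagreement).
        v_ref = np.asarray(v_ref, dtype=float)
        v_test = np.asarray(v_test, dtype=float)
        pct_err = 100.0 * (v_test - v_ref) / v_ref
        return pct_err.mean(), pct_err.std(ddof=1)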


6.2.4 UTMB Data

Again, with the UTMB data there is no truth for the volumes of the nodules, since these are in vivo data. Additionally, we have access only to 2.5 mm and 5 mm data. We therefore set out to compare and contrast the consistency and bias of the various estimators across the slice thicknesses. It is clear that the 2-D techniques produce consistent (though not necessarily accurate) volume estimates across slice thicknesses, since the nodule cross section does not change much with slice thickness.

For this part of the work, 77 nodules were segmented at both the 2.5 mm and 5 mm slice thicknesses, and all of the volume estimation algorithms were used to estimate their volumes. The percent error in the volume estimates between the 5 mm and 2.5 mm slice thicknesses was computed. To determine the bias between slice thicknesses, the mean of these percent errors was computed; ideally, an unbiased estimator would have a mean percent error near zero. To determine the consistency, the standard deviation of the percent error across all nodules was computed. Both the bias and consistency data are tabulated in Table 6.11 below.

Table 6.11: Percent error statistics between the 2.5 mm and 5 mm slice thickness data.

                                     Perimeter   MWM   Area   Minor Axis   Minimax
    Mean of % error                  28%         -7%   -16%   -8%          -6%
    Standard deviation of % error    57%         41%   30%    31%          35%

What we notice from the measurements in Table 6.11 is that the minimax and MWM estimators appear to be the least biased across slice thicknesses, doing as well as or better than the 2-D techniques. On the other hand, the area and minor axis methods are the most consistent estimators, which is what we would expect. The minimax and MWM methods have similar consistency measures. We conclude that the minimax and MWM algorithms are helpful for accounting for partial volume effects.


CHAPTER VII

Discussion and Conclusions

We have presented a novel automated pulmonary nodule segmentation algorithm for use with thoracic CT data. This algorithm combines the simplicity of a straightforward segmentation engine with the computational capabilities of an ANN. We presented experimental results showing that good overlap values may be obtained using four features to predict the quality of a given segmentation.

In Chapter 6, we presented mean overlap results for the test data containing nodules segmented by board-certified radiologists in the LIDC dataset. High mean overlap values were achieved for these data, indicating that we are able to accurately predict the quality of test segmentations using the selected features. We consider this to be an important proof of concept for this algorithm. It is particularly important to note that our segmentation algorithm was trained on the ELCAP and UTMB datasets, which are completely independent of the LIDC dataset. We find this encouraging.

When comparing our results to those of the LIDC radiologists, we compared against a 50% consensus-level mask. A mean overlap of 63% was achieved. This is only slightly lower than the 64% mean overlap reported by Wang et al. for the same dataset using their spiral-scanning technique, also compared against a gold standard computed at a 50% consensus level [32]. Our result, however, is considerably better than that of Tachibana and Kido, who


achieved a 51% overlap [57], and that of Way et al., who achieved a mean overlap of 58% [31]. We also desired to know how well the T and R segmentation engine itself could perform. To find out, we manually selected T and R to obtain as close a match to the radiologist gold standard as possible. We were able to achieve a mean overlap of 72% in this way. This indicates that our segmentation engine is capable of producing candidates the best of which will outperform any of the other cited techniques. Optimization of this algorithm therefore involves improving the mechanism that selects the correct candidates.

In order to compare against a similar algorithm, we created an implementation which, to the best of our knowledge, approximates the Kostis algorithm [12]. We tested this algorithm for a range of threshold values, as shown in Figure 6.8. The algorithm achieves a maximum mean overlap of 51% when applied to the LIDC dataset. We are convinced that the reason for this performance is that there is simply no single threshold and structuring element radius suitable for segmenting all nodules. Our mean overlap of 63% for the LIDC data shows that there is a great advantage in adaptively adjusting T and R.

The performance of this algorithm was also studied in terms of nodule type. Figure 6.3 depicts this performance. Our results indicate that the algorithm's performance is not significantly affected by spiculation or lobulation as perceived by the radiologists who rated the data. On the other hand, there is an obvious connection to sphericity and to margin rating. Clearly, the more spherical a nodule is, the better the segmentation achieved. Nodules with sharp boundaries (high margin ratings) also tend to achieve better segmentations than those with fuzzy boundaries (low margin ratings). This is consistent with our observations. In addition, we investigated the performance when the nodules were juxtapleural or were adjacent to moderate vasculature. We achieved a


mean overlap of 63% for juxtapleural nodules, 65% for juxtavascular nodules, and 63% for

nodules with no obvious connected structures.

We also streamlined the segmentation algorithm by using efficient searches of the R−T solution space. We found that the simulated annealing approach yielded particularly good results. Having used a T spacing of 10 HU for our exhaustive search test, we found that simulated annealing was actually capable of producing better results than the exhaustive search. This is because simulated annealing can produce any integer value of T, including those not considered in the exhaustive search, and it does so while computing fewer segmentation candidates. In addition, the timing studies described in Chapter 6 indicate that both the Golden Section Search and the simulated annealing search are vastly more efficient than the exhaustive search. In fact, if we select simulated annealing because of its overlap performance, the computational time required is improved by a factor of approximately three, and segmentations using this approach take well under a minute.

There are several aspects of this pulmonary nodule segmentation algorithm that make it attractive. The first is that the T and R framework is conceptually simple, which makes it relatively easy to implement software for manual segmentation in order to create the training data. A second feature that commends this algorithm is its ability to “learn” how to perform the segmentation by observing segmentations done manually. A third advantage is that additional segmentation features can easily be added to the framework. This is an area on which we continue to work; perhaps there are additional features that would significantly improve segmentation quality across the board.


The second contribution involves the modification of an existing 3D nodule volume estimator. We have presented and compared several nodule volume estimators for CT data, including the minimax technique and a new variation of a previously published volume estimation technique (the MWM technique). The MWM approach involves simply the computation of the roots of a cubic polynomial, something that is easy to do either numerically or using an analytic solution. The results indicate that the MWM compensation technique performs best when compared with other 3D methods on spherical phantom nodules. The minimax algorithm also performs fairly well when compared to the perimeter method, even when the slice thickness differs from the slice spacing; the perimeter method breaks down in this case.

Based on the results from the in vivo studies, the 3D volume estimators yield better volume estimates across slice spacings than do the 2D techniques. In addition, the MWM and minimax techniques perform well for Datasets 5-7, the ELCAP data, when the 1.25 mm perimeter estimate is used as the “accepted” measurement; they significantly outperform the 2D techniques in that case. For Datasets 8 and 9, the results are not as conclusive, with the perimeter method, the Winer-Muram method, and the minimax method all showing about equal agreement when the accepted method was the 2.5 mm perimeter measurement. Comparing 5 mm data with 2.5 mm data is difficult, since a large amount of partial volume magnification is already present within the estimates.

Based on our results from in vivo data, we recommend that when large slice thicknesses and spacings are employed, the 2-D volume estimation techniques not be used if accurate volume estimates are desired. Instead, techniques which attempt to compensate for such volume magnification, such as the MWM or minimax techniques, should be used. Applications requiring such measurements include the estimation of volume doubling times, a critical part of cancer diagnosis. If accurate volume estimates can be made at greater slice thicknesses, then the effective dose of ionizing radiation that the patient receives can be made smaller, which is a significant benefit.

Finally, we would encourage further investigation of the segmentation algorithm in an attempt to further quantify its usefulness for various clinical applications. The author would also point out that the neural network segmenter presented here could, with some effort, be moved to a fuzzy logic approach to segmentation. We would also encourage further development of compensation algorithms to further increase the accuracy of nodule volume estimates at greater CT slice thicknesses. The numerical results presented here are promising, and we hope that the work presented here will find a place in software that can be deployed in clinical practice.


BIBLIOGRAPHY

[1] American Cancer Society, “Cancer facts and figures 2008,” 2008.

[2] J. J. Erasmus, J. E. Connolly, H. P. McAdams, and V. L. Roggli, “Solitary pulmonary

nodules: Part I. Morphologic evaluation for differentiation of benign and malignant

lesions,” Radiographics, vol. 20, no. 1, pp. 43–58, Jan. 2000.

[3] C. I. Henschke, D. F. Yankelevitz, D. M. Libby, and M. W. Pasmantier, “Survival of

patients with stage I lung cancer detected on CT screening,” New England Journal of

Medicine, vol. 355, no. 17, pp. 1763–1772, Oct. 2006.

[4] R. C. Gonzalez and R. E. Woods, Digital Image Processing. Upper Saddle River, NJ: Prentice

Hall, 2007.

[5] W. R. Hendee and E. R. Ritenour, Medical Imaging Physics. New York, New York:

Wiley, 2002.

[6] J. L. Prince and J. M. Links, Medical Imaging Signals and Systems. Upper Saddle

River, New Jersey: Pearson, 2006.

[7] “syngo lungcare CT,” http://www.medical.siemens.com/webapp/wcs/stores/servlet

/ProductDisplay q catalogId e -11 a catTree e 100010,1007660,12752,1008405,

1008410 a langId e -11 a productId e 11611 a storeId e 10001.htm.


[8] “Advanced clinical applications,” http://www.gehealthcare.com/usen/ct/clin app/

products/lunganalysis.html.

[9] D. Wormanns and S. Diederich, “Characterization of small pulmonary nodules by

CT,” European Radiology, vol. 14, no. 8, pp. 1380–1391, Aug. 2004.

[10] J. P. Ko, H. Rusinek, E. L. Jacobs, J. S. Babb, M. Betke, G. McGuinness, and D. P.

Naidich, “Small pulmonary nodules: Volume measurement at chest CT - phantom

study,” Radiology, vol. 228, no. 3, pp. 864–870, Sept. 2003.

[11] W. Mullally, M. Betke, and J. Wang, “Segmentation of nodules on chest computed

tomography for growth assessment,” Medical Physics, vol. 31, no. 4, pp. 839–848,

Apr. 2004.

[12] W. J. Kostis, A. P. Reeves, D. F. Yankelevitz, and C. I. Henschke, “Three-dimensional

segmentation and growth-rate estimation of small pulmonary nodules in helical CT

images,” IEEE Transactions on Medical Imaging, vol. 22, no. 10, pp. 1259–1274,

Oct. 2003.

[13] H. T. Winer-Muram, S. G. Jennings, R. D. Tarver, A. M. Aisen, M. Tann, D. J. Conces,

and C. A. Meyer, “Volumetric growth rate of stage 1 lung cancer prior to treatment:

Serial CT scanning,” Radiology, vol. 223, no. 3, pp. 798–805, June 2002.

[14] M. Hasegawa, S. Sone, S. Takashima, F. Li, Z. Yang, Y. Maruyama, and T. Watan-

abe, “Growth rate of small lung cancers detected on mass CT screening,” The British

Journal of Radiology, vol. 73, no. 876, pp. 1252–1259, Dec. 2000.


[15] D. F. Yankelevitz, R. Gupta, B. Zhao, and C. I. Henschke, “Small pulmonary nodules:

Evaluation with repeat CT-preliminary experience,” Radiology, vol. 212, no. 2, pp.

561–566, Aug. 2003.

[16] D. F. Yankelevitz, A. P. Reeves, W. J. Kostis, B. Zhao, and C. I. Henshke, “Small

pulmonary nodules: Volumetrically determined growth rates based on CT evaluation,”

Radiology, vol. 217, no. 1, pp. 251–256, Oct. 2000.

[17] M. N. Gurcan, R. C. Hardie, B. H. Allen, S. K. Rogers, D. E. Dozer, R. V. Burns, and

J. W. Hoffmeister, “Automated nodule volume estimation from CT images: Minimax-

based,” in PACS 2004, Mar. 2004.

[18] H. T. Winer-Muram, S. G. Jennings, C. A. Meyer, Y. Liang, A. M. Aisen, R. D.

Tarver, and R. C. McGarry, “Effect of varying CT section width on volumetric mea-

surement of lung tumors and application of compensatory equations,” Radiology, vol.

229, no. 1, pp. 184–194, Oct. 2003.

[19] J. Kuhnigk, V. Dicken, L. Bornemann, A. Bakai, D. Wormanns, S. Krass, and H. Peit-

gen, “Morphological segmentation and partial volume analysis for volumetry of solid

pulmonary lesions in thoracic CT scans,” IEEE Transactions on Medical Imaging,

vol. 25, no. 4, pp. 417–434, Apr. 2006.

[20] A. P. Reeves, A. B. Chan, D. F. Yankelevitz, C. I. Henscheke, B. Kressler, and W. J.

Kostis, “On measuring the change in size of pulmonary nodules,” IEEE Transactions

on Medical Imaging, vol. 25, no. 4, pp. 435–450, Apr. 2006.


[21] G. B. Coleman and H. C. Andrews, “Image segmentation by clustering,” Proceedings

of the IEEE, vol. 67, pp. 773–785, May 1979.

[22] H. Wang and B. Ghosh, “Geometric active deformable models in shape models,”

IEEE Transactions on Image Processing, vol. 9, no. 2, pp. 459–466, Feb. 2000.

[23] A. Elmoataz, S. Schupp, and D. Bloyet, “Fast and simple discrete approach for active

contours for biomedical applications,” International Journal of Pattern Recognition

and Artificial Intelligence, vol. 15, no. 7, pp. 1201–1212, July 2001.

[24] D. L. Vilarno, D. Cabello, X. M. Pardo, and V. M. Brea, “Cellular neural networks

and active contours: a tool for image segmentation,” Image and Vision Computing,

vol. 21, no. 2, pp. 189–204, Feb. 2003.

[25] D. L. Pham, C. Xu, and J. L. Prince, “Current methods in medical image segmenta-

tion,” Annual Review of Biomedical Engineering, vol. 2, pp. 315–337, 2000.

[26] T. F. Coleman, Y. Li, and A. Mariano, “Segmentation of pulmonary nodule images using total variation minimization,” Tech. Rep. TR98-1704, 8, 1998. [Online]. Available: citeseer.ist.psu.edu/coleman98segmentation.html

[27] N. Xu, N. Ahuja, and R. Bansal, “Automated lung nodule segmentation using dy-

namic programming and em based classification,” Proceedings of SPIE, vol. 4684,

pp. 666–676, 2002.

[28] L. Fan, J. Qian, B. Odry, and H. Shen, Proceedings of SPIE.


[29] B. Zhao, D. Yankelevitz, A. Reeves, and C. Henschke, “Two-dimensional multi-

criterion segmentation of pulmonary nodules on helical CT images,” Medical Physics,

vol. 26, no. 6, pp. 889–895, June 1999.

[30] B. Zhao, A. P. Reeves, D. F. Yankelevitz, and C. I. Henschke, “Three-dimensional

multicriterion automatic segmentation of pulmonary nodules of helical computed to-

mography images,” Optical Engineering, vol. 38, no. 8, pp. 1340–1347, Aug. 1999.

[31] T. Way, L. Hadjiiski, B. Sahiner, H. Chan, P. Cascade, E. Kazerooni, N. Bogot, and

C. Zhou, “Computer-aided diagnosis of pulmonary nodules on CT scans: segmenta-

tion and classification using 3D active contours,” Medical Physics, vol. 33, no. 7, pp.

2323–2337, July 2007.

[32] J. Wang, R. Engelmann, and Q. Li, “Segmentation of pulmonary nodules in three-

dimensional CT images by use of a spiral-scanning technique,” Medical Physics,

vol. 34, no. 12, pp. 4678–4689, Dec. 2007.

[33] S. Hu, E. Hoffman, and J. Reinhardt, “Automatic lung segmentation for accurate

quantitation of volumetric x-ray CT images,” IEEE Transactions on Medical Imaging,

vol. 20, no. 6, pp. 490–498, June 2001.

[34] J. Reinhardt and W. Higgins, “Paradigm for shape-based image analysis,” Optical

Engineering, vol. 37, no. 2, pp. 570–581, Feb. 1998.

[35] M. S. Brown, M. F. McNitt-Gray, N. J. Mankovich, J. G. Goldin, J. Hiller, L. S. Wil-

son, and D. R. Aberie, “Method for segmenting chest CT image data using an anatom-

ical model: preliminary results,” IEEE Transactions on Medical Imaging, vol. 16,


no. 6, pp. 828–839, Dec. 1997.

[36] J. Leader et al., “Automated lung segmentation in x-ray computed tomography: devel-

opment and evaluation of a heuristic threshold-based scheme,” Academic Radiology,

vol. 10, no. 11, pp. 1224–1236, Nov. 2003.

[37] S. G. Armato and W. F. Sensakovic, “Automated lung segmentation for thoracic CT:

impact on computer-aided diagnosis,” Academic Radiology, vol. 11, no. 9, pp. 1011–

1021, Sept. 2004.

[38] B. Kosko, Neural Networks and Fuzzy Systems: A Dynamical Systems Approach to

Machine Intelligence. Englewood Cliffs, NJ: Prentice Hall, 1992.

[39] F. M. Ham and I. Kostanic, Principles of Neurocomputing for Science and Engineer-

ing. New York, New York: McGraw Hill, 2001.

[40] S. K. Rogers and M. Kabrisky, An Introduction to Biological and Artificial Neural

Networks for Pattern Recognition. Bellingham, WA: SPIE Press, 1991.

[41] C. W. Chen, J. Luo, and K. J. Parker, “Image segmentation via adaptive k-mean clus-

tering and knowledge-based morphological operations with biomedical applications,”

IEEE Transactions on Image Processing, vol. 7, no. 12, pp. 1673–1683, Dec. 1998.

[42] S. Kirkpatrick, C. Gelatt, and M. Vecchi, “Optimization by simulated annealing,”

Science, vol. 220, no. 4598, pp. 671–680, May 1983.

[43] K. Fukunaga, Introduction to Statistical Pattern Recognition. San Diego, CA: Aca-

demic Press, 1990.


[44] H. Kobatake and S. Hashimoto, “Convergence index filter for vector fields,” IEEE

Transactions on Image Processing, vol. 8, no. 8, pp. 1029–1038, Aug. 1999.

[45] N. Metropolis, A. W. Rosenbluth, A. H. Teller, and E. Teller, “Equation of state cal-

culations by fast computing machines,” Journal of Chemical Physics, vol. 21, pp.

1087–1092, 1953.

[46] S. Geman and D. Geman, “Stochastic relaxation, Gibbs distributions, and the Bayesian

restoration of images,” IEEE Transactions on Pattern Analysis and Machine Intelli-

gence, vol. 6, pp. 721–741, Nov. 1984.

[47] M. Tekalp, Digital Video Processing. Upper Saddle River, NJ: Prentice Hall, 1995.

[48] J. Kiefer, “Sequential minimax search for a maximum,” Proceedings of the American

Mathematical Society, vol. 4, pp. 502–506, Apr. 1953.

[49] M. Revel, C. Lefort, A. Bissery, M. Bienvenu, L. Aycard, G. Chatellier, and G. Frija,

Radiology, no. 2, May.

[50] M. N. Gurcan, R. C. Hardie, B. H. Allen, S. K. Rogers, D. E. Dozer, R. V. Burns,

and J. W. Hoffmeister, “Accurate nodule volume estimation from helical CT im-

ages: Comparison of slice-based and volume-based methods,” in Radiology Society

of North America (RSNA) 2004, McCormick Place, Chicago, IL (poster), Dec. 2002.

[51] D. P. Bertsekas, Nonlinear Programming. Belmont, MA: Athena Scientific, 1995.

[52] P. J. Nahin, An Imaginary Tale: The Story of √−1. Princeton, NJ: Princeton Uni-

versity Press, 1998.


[53] E. W. Cheney and D. Kincaid, Numerical Mathematics and Computing. Pacific

Grove, CA: Brooks/Cole, 1999.

[54] http://www.via.cornell.edu/databases/lungdb.html.

[55] W. Wang, P. James, and D. Partridge, “Assessing the impact of input features in a

feedforward neural network,” Neural Computing and Applications, no. 9, pp. 101–

112, Sept. 2000.

[56] J. M. Bland and D. G. Altman, “Statistical methods for assessing agreement between

two methods of clinical measurement,” Lancet, vol. 1, no. 2, pp. 307–310, Feb. 1986.

[57] R. Tachibana and S. Kido, “Automatic segmentation of pulmonary nodules on CT

images by use of NCI Lung Image Database Consortium,” Proceedings of SPIE, vol.

6144, no. 6144-1-6144M-9, 2006.


VITA

September 7, 1973 Born - Kalamazoo, Michigan

1996 B.S.E.E., Cedarville University, Cedarville, Ohio

1996-1998 Research Assistant, University of Dayton

1998 M.S.E.E. University of Dayton, Dayton, Ohio

1998-2002 Senior Engineer, General Dynamics

2002-present Assistant Professor, Cedarville University

2008 Ph.D., University of Dayton
