Iterative Reconstruction in CT and MRIweb.eecs.umich.edu/~fessler/papers/lists/files/talk/15/f3d-keynote.pdf · Iterative Reconstruction in CT and MRI (and a bit of PET and SPECT)

Iterative Reconstruction in CT and MRI

(and a bit of PET and SPECT)

Jeffrey A. Fessler

EECS Dept., BME Dept., Dept. of RadiologyUniversity of Michigan

web.eecs.umich.edu/fessler

Fully 3D Conference

02 June 2015

1 / 74

web.eecs.umich.edu/~fessler

Disclosure

Research support from GE Healthcare Supported in part by NIH grants P01 CA-87634, U01 EB018753 Equipment support from Intel Corporation

Acknowledgment:many collaborators and many students and post-docs

2 / 74

Outline

WhatCTMRI

WhyWhy CT iterativeWhy MRI iterative

HowOptimization transferSeparable quadratic surrogatesMomentumOrdered subsets

Parallelization

3 / 74

Outline

WhatCTMRI



Parallelization

4 / 74

X-ray CT scans

CT image reconstruction problem:Determine unknown attenuation map x given sinogram data yusing system matrix A.cf. SPECT with orbiting gamma camera

5 / 74

MRI scans

(No moving partsto animate)

MR image reconstruction problem:Determine unknown magnetization image x given k-space data yusing system matrix ADefer motion for now...

6 / 74

Inverse problems

Unknownobject

x Imagingsystem

Datay Recon

Imagex

How to reconstruct object x from data y?Non-iterative methods: analytical / direct Filtered back-projection (FBP) for CT (textbook: Radon transform) Inverse FFT for MRI (textbook: FFT)

idealized description of the system (textbook model) geometry / sampling disregards noise and simplifies physics

typically fastIterative methods: model-based / statistical based on reasonably accurate models for physics and statistics usually much slower

7 / 74

Statistical image reconstruction: CT example

A picture is worth 1000 words (and perhaps several 1000 seconds of computation?)

Thin-slice FBP ASIR (denoise) StatisticalSeconds A bit longer Much longer

(Same sinogram, so all at same dose)

8 / 74

Outline

WhatCTMRI



Parallelization

9 / 74

Why statistical/iterative methods for CT?

Accurate physics models X-ray spectrum, beam-hardening, scatter, ...

= reduced artifacts? quantitative CT? X-ray detector spatial response, focal spot size, ...

= improved spatial resolution? detector spectral response (e.g., photon-counting detectors)

= improved contrast between distinct material types?

Nonstandard geometries transaxial truncation (wide patients) long-object problem in helical CT irregular sampling in next-generation geometries coarse angular sampling in image-guidance applications limited angular range (tomosynthesis) missing data, e.g., bad pixels in flat-panel systems

10 / 74

Why iterative for CT ... continued

Appropriate models of (data dependent) measurement statistics weighting reduces influence of photon-starved rays (cf. FBP)

= reducing image noise or X-ray dose

Object constraints / priors nonnegativity object support piecewise smoothness object sparsity (e.g., angiography) sparsity in some basis motion models dynamic models ...

Henry Gray, Anatomy ofthe Human Body, 1918,Fig. 413.

Constraints may help reduce image artifacts or noise or dose.

Similar motivations/benefits in PET and SPECT.

11 / 74

Disadvantages of iterative methods for CT?

I Computation timeI Must reconstruct entire FOVI Complexity of models and softwareI Algorithm nonlinearities Difficult to analyze resolution/noise properties (cf. FBP) Tuning parameters Challenging to characterize performance / assess IQ

12 / 74

Sub-mSv example

3D helical X-ray CT scan of abdomen/pelvis:100 kVp, 25-38 mA, 0.4 second rotation, 0.625 mm slice, 0.6 mSv.

FBP ASIR Statistical

13 / 74

MBIR example: Chest CT

Helical chest CT study with dose = 0.09 mSv.Typical CXR effective dose is about 0.06 mSv.(Health Physics Soc.: http://www.hps.org/publicinformation/ate/q2372.html)

FBP MBIRVeo (MBIR) images courtesy of Jiang Hsieh, GE Healthcare

14 / 74

http://www.hps.org/publicinformation/ate/q2372.html

History: Statistical reconstruction for X-ray CT

Iterative method for X-ray CT (Hounsfield, 1968) ART for tomography (Gordon, Bender, Herman, JTB, 1970) ... Roughness regularized LS for tomography (Kashyap & Mittal, 1975) Poisson likelihood (transmission) (Rockmore and Macovski, TNS, 1977) EM algorithm for Poisson transmission (Lange and Carson, JCAT, 1984) Iterative coordinate descent (ICD) (Sauer and Bouman, T-SP, 1993) Ordered-subsets algorithms (Manglos et al., PMB 1995)

(Kamphuis & Beekman, T-MI, 1998)(Erdogan & Fessler, PMB, 1999)

... Commercial OS for Philips BrightView SPECT-CT (2010) Commercial ICD for GE CT scanners (circa 2010) FDA 510(k) clearance of Veo (Sep. 2011) First Veo installation in USA (at UM) (Jan. 2012)

( numerous omissions, including many denoising methods)

15 / 74

Statistical image reconstruction for CT: Formulation

Optimization problem formulation: x = arg minx0 (x)

(x) cost

function

,12 y Ax

2W

data-fit termphysics & statistics

+N

j=1

kNj

(xj xk) regularizer

prior models

y : measured data (sinogram)A : system matrix (physics / geometry)W : weighting matrix (statistics)x : unknown image (attenuation map) : regularization parameter(s)Nj : neighborhood of jth voxel : edge-preserving potential function(piece-wise smoothness / gradient sparsity)

16 / 74

Statistical image reconstruction for CT: Research

x = arg minx0

(x), (x) , 12 y Ax2W +

j

k

j,k (xj xk)

Apparent topics: regularization design / parameter selection , jk statistical modeling W , system modeling A optimization algorithms (arg min) assessing IQ of x

Other topics: system design motion spectral dose ...

17 / 74

MRI: Why iterative reconstruction?

Inverse FFT is fast (like FBP). Why change?(Joint work with D. Noll, J. Nielsen, ...)

Recall rationale for CT/PET/SPECT:I physics modeling reduce artifacts improve resolution improve contrast

I noise modeling: (dose, variability)I sampling: non-standard geometriesI constraints on object

Which of these matter for MRI?

18 / 74

MRI why iterative: PhysicsPhysics modeling (e.g., field inhomogeneity) = reduced artifacts

Example: T2*-weighted imaging (Sutton et al., IEEE T-MI, 03)

uncorrected traditional iterative field map

x = arg minx

12 y Ax

22 + R(x)

System matrix A depends on (measured) field map:

aij = ej ti e2~i ~rj

No analytical inverse of A. cf. nonuniform attenuation correction in SPECT19 / 74

http://dx.doi.org/10.1109/TMI.2002.808360

MRI why iterative: PhysicsJoint estimation of field map and magnetization image x:

(x, ) = arg minx,

12 y A()x

22 + 1 R1(x) +2 R2()

Useful when field map drifts in dynamic imaging.(Sutton et al., MRM 04) (Olafsson et al., T-MI 08)

cf. joint estimation of attenuation map and activity image in SPECT, PET and TOF-PET.(Censor et al., T-NS 79) (Clinthorne et al., NSS 91) (Rezaei, Defrise, Nuyts, T-MI 14)

20 / 74

MRI why iterative: Physics

RF pulse design

RF pulseb Bloch Eqn

Excited magnetizationm

Small-tip approximation: m AbIterative RF pulse design (with RF power regularization):

arg minbm Ab22 + b

22

Minimize using CG. (Yip et al., MRM, Oct. 2005)

d. Non-iterative:e. Iterative:

21 / 74

MRI why iterative: Noise

I MRI measurements: (complex) AWGN = easy !?

I Variance of image phase depends on image magnitude.I Image phase useful in some applications, e.g., B1 mapping:

Unregularized vs regularized phase estimate. (Zhao et al., T-MI 14)

22 / 74

MRI why iterative: Noise

I MRI measurements: (complex) AWGN = easy !?I Variance of image phase depends on image magnitude.I Image phase useful in some applications, e.g., B1 mapping:

Unregularized vs regularized phase estimate. (Zhao et al., T-MI 14)23 / 74

MRI why iterative: Sampling

I Reducing k-space sampling = reduced scan timeI Especially compelling for dynamic imaging (cf. CT and SPECT)I Popular under-sampled patterns: (cf. sparse-view CT)

Random Cartesian

kx

kyRadial

I Solution strategies Multiple receive coils Object model assumptions (e.g., sparsity) iterative reconstruction (compressed sensing)

24 / 74

Parallel MRI

Under-sampled Cartesian k-space: use multiple receive coils withindividual spatial sensitivity patterns. (Pruessmann et al., MRM, 1999)

Array coil images

1 64

1

64 A =

FS1FS2...

Compressed sensing parallel MRI (random) under-samplingLustig et al., IEEE Sig. Proc. Mag., Mar. 2008cf. multiple-source CT (speed) or multi-camera SPECT (counts)

25 / 74

http://dx.doi.org/10.1002/(SICI)1522-2594(199911)42:53.0.CO;2-Shttp://dx.doi.org/10.1109/MSP.2007.914728

Model-based image reconstruction in parallel MRI

Regularized estimator:

x = arg minx

12 y FSx

22

data fit

+ Rxp sparsity

.

F is under-sampled DFT matrix (wide)Features: coil sensitivity matrix S is block diagonal F F is circulant (for Cartesian sampling)

Challenges: Data-fit Hessian S F FS is highly shift variant due to coil

sensitivity maps Non-quadratic (edge-preserving) regularization p Non-smooth regularization 1 (cf. sparse view CT) Complex quantities Large problem size (if 3D or dynamic or many coils)

26 / 74

2.5D parallel MR image reconstruction

Example of compressed sensing MRI reconstruction:

0.4

32

63.6

95.1

126.7

158.2

189.8

221.3

252.9

284.4

316

347.5

0.4

32

63.6

95.1

126.7

158.2

189.8

221.3

252.9

284.4

316

347.5

0.4

32

63.6

95.1

126.7

158.2

189.8

221.3

252.9

284.4

316

347.5

0

8.4

16.8

25.2

33.5

41.9

50.3

58.7

67

75.4

83.8

92.2

original k-space IFFT iterative difference

Fully sampled body coil image of human brain (144 128) Poisson-disk-based k-space sampling, 16% sampling (acceleration 6.25) Square-root of sum-of-squares inverse FFT of zero-filled k-space data for 8 coils Regularized reconstruction x ()

combined TV and `1 norm of two-level undecimated Haar wavelets Difference image magnitude

(Sathish Ramani & JF, IEEE T-MI, Mar. 2011)

27 / 74


Summary of What and Why

I CT and MRI both involve inverse problemsI Some similarities in motivations and formulationsI Some similarities in computation challengesI Some opportunities for cross-fertilizationI Caution: MRI reconstruction field is crowded!

28 / 74

Outline

WhatCTMRI



Parallelization

29 / 74

SIR for CT: Optimization challenges

x = arg minx0

(x), (x) , 12 y Ax2W +

Nj=1

k

j,k (xj xk)

Optimization challenges: large problem size: x R512512600, y R888647000 A is sparse but still too large to store; compute Ax on-the-fly W has enormous dynamic range (1 to exp(9) 1.2 104) Gram matrix AWA highly shift variant is non-quadratic but convex (and often smooth) nonnegativity constraint data size grows: dual-source CT, spectral CT, wide-cone CT, ... Moores law insufficient

30 / 74

Optimization transfer (Majorize-Minimize) methods: 1D(x) (x)

(n)(x)

x(n)

x(n+1)

x

Surrogate function (majorizer)Cost function

(n)(x (n)) = (x (n))(n)(x) (x)

cf. ML-EM

x (n+1) = arg minx

(n)(x)

31 / 74

Optimization transfer (Majorize-Minimize) methods: 2D

32 / 74

Outline

WhatCTMRI



Parallelization

33 / 74

Separable Quadratic Surrogates (SQS): Math

L(x) = 12 y Ax2W

= L(x (n)

)+ L

(x (n)

)(x x (n)) + 12

(x x (n)

)AWA (x x (n)) non-separable

L(x (n)

)+ L

(x (n)

)(x x (n)) + 12

(x x (n)

)D (x x (n)) separable

, (n)L (x), a SQS,

where AWA D = diag{AWA1} . (De Pierro, T-MI, Mar. 1995)Proofs: Convexity of x2 Gersgorin disk theorem Cauchy-Schwarz inequality

34 / 74

Separable Quadratic Surrogates (SQS): Pictures

Find minimizer of L(x): challenging Find minimizer of (n)L (x): easy (separate 1D problems)

35 / 74

WLS-SQS: Iteration

General optimization transfer (majorize-minimize) method:

x (n+1) = arg minx

(n)L (x)

For SQS:

(n)L (x) = L(x (n)

)+ L

(x (n)

)(x x (n)) + 12

(x x (n)

)D (x x (n))(n)L (x) = L

(x (n)

)+D

(x x (n)

)0 = (n)L

(x (n+1)

)= L

(x (n)

)+D

(x (n+1) x (n)

)x (n+1) = x (n) D1 L

(x (n)

)diagonally preconditioned gradient descent

(Erdogan & JF, PMB, 1999)

36 / 74

SQS versus GD: Math

Ordinary gradient descent (GD) for WLS:

x (n+1) = x (n) L(x (n)

)= x (n) AW (Ax (n) y),

where textbook step size is reciprocal of Lipschitz constant:

= 1max(AWA)

.

WLS-GD is equivalent to WLS-SQS with isotropic majorizerHessian:

D = max(AWA)I.

Drawbacks: max(AWA) usually impractical to compute (in CT) Usually slower convergence due to smaller step sizes

37 / 74

SQS versus GD: Pictures

38 / 74

SQS versus GD: Pictures

39 / 74

Outline

WhatCTMRI



Parallelization

40 / 74

Classical gradient descent (GD)Assumptions: is convex (need not be strictly convex) has non-empty set of global minimizers

x X ={

x (?) RN : (x (?)) (x), x RN}

is smooth (differentiable with L-Lipschitz gradient)(x)(z)2 L x z2 , x, z RN

GD with step size 1/L ensures monotonic descent of :

x (n+1) = x (n) 1L (x (n)

).

Drori & Teboulle (2014) derive tightest inaccuracy bound:

(x (n)

)

(x (?)

) inaccuracy

Lx (0) x (?)22

4n + 2 .

For a Huber-like function , GD achieves that (tight) bound.O(1/n) rate is undesirably slow.

41 / 74

Nesterovs fast gradient method (FGM1)

Nesterov (1983) iteration: Initialize: t0 = 1, z (0) = x (0)

z (n+1) = x (n) 1L (x (n)

)(usual GD update)

tn+1 =12

(1 +

1 + 4t2n

)(magic momentum factors)

x (n+1) = z (n+1) + tn 1tn+1

(z (n+1) z (n)

)(update with momentum)

I Reverts to GD if tn = 1, n.I Comparable computation as GDI Store one additional image-sized vector z (n)

42 / 74

FGM1 properties

FGM1 shown by Nesterov to be O(1/n2) for primary sequence:

(z (n)

)

(x (?)

)

2Lx (0) x (?)22(n + 1)2 .

Nesterov constructed a function such that any first-ordermethod achieves

332L

x (0) x (?)22(n + 1)2

(x (n)

)

(x (?)

).

Thus O(1/n2) rate of FGM1 is optimal.Donghwan Kim (2014) analyzed secondary sequence:

(x (n)

)

(x (?)

)

2Lx (0) x (?)22(n + 2)2 .

43 / 74

SQS plus momentum for parallel MRI

I Traditional iterative soft thresholding algorithm (ISTA)uses (global) Lipschitz constant of data-fit term:

2 12 y FS22 = S

F FS S S maxI, max = maxj[S S

]j,j

max is maximum sum-of-squares value of sensitivity maps.I Augmented Lagrangian (AL) methods converge faster than

ISTA, FISTA, MFISTA (Ramani & JF, T-MI, 2011)I BARISTA (B1-based, adaptive restart, ISTA)

(Muckley, Noll, JF, T-MI, 2015)For synthesis operator x = Qz with z sparse:

2 12 y FSQ22 = Q

S F FSQ QS SQ D

for a suitable diagonal matrix D. (cf., SQS)I D1 becomes voxel-dependent step size, akin to SQS in CT

44 / 74


BARISTA convergence rates

Compressed sensing MRI reconstruction:Total variation (TV) regularizer Undecimated Haar Wavelets

Corresponding D for each of the two cases:BARISTA requires no algorithm parameter tuning, unlike AL.Includes momentum with adaptive restart of ODonoghue andCandes (2014).

45 / 74

Generalizing Nesterovs FGM

FGM1 is in the general class of first-order methods:

x (n+1) = x (n) 1L

nk=0

hn+1,k (x (k)

)where the step-size factors {hn,k} are

1 0 0 0 0 00 1.25 0 0 0 00 0.10 1.40 0 0 00 0.05 0.20 1.50 0 00 0.03 0.11 0.29 1.57 0...

. . .

Use of previous gradients = momentumIs this the optimal choice for {hn,k} ?Can we improve on the constant 2 in worst-case convergence rate?Drori & Teboulle (2014) numerically found 2 better {hn,k}

46 / 74

Optimized gradient method (OGM1)New approach by optimizing {hn,k} analyticallyInitialize: t0 = 1, z (0) = x (0) (Donghwan Kim and JF; 2014, 2015)

z (n+1) = x (n) 1L (x (n)

)(usual GD update)

tn+1 =12

(1 +

1 + 4t2n

)(momentum factors)

x (n+1) = z (n+1) + tn 1tn+1

(z (n+1) z (n)

)+ tntn+1

(z (n+1) x (n)

)

new momentum

Smaller (worst-case) convergence bound than Nesterov by 2:

(z (n)

)

(x (?)

)

1Lx (0) x (?)22(n + 1)2 .

Recently DK found a Huber-like function for which OGM1 achieves that upper bound(thus tight), inspired by numerical work of Taylor et al. (2015).

47 / 74

Example: Image restoration (!?)

Truex

Blurryy

Restoredx

Rate

0 50 100 150 200

102

100

GM

FGM

OGM

(x (n))(x) vs iteration n

48 / 74

Outline

WhatCTMRI



Parallelization

49 / 74

Ordered subsets approximation

I Data decomposition (aka incremental gradients, cf. stochastic GD):

(x) =M

m=1m(x), m(x) ,

12 ym Amx

2Wm

1/Mth of measurements

+ 1M R(x)

I Key idea. For x far from minimizer: (x) Mm(x)I SQS:

x (n+1) = x (n) D1(x (n)

)I OS-SQS:

for n = 0, 1, . . . (iteration)for m = 1, . . . ,M (subset)

k = nM + m (subiteration)

xk+1 = xk D1Mm(xk)

less work

I Coil-wise in parallel MRI (Muckley, Noll, JF, ISMRM 2014)50 / 74

Ordered subsets version of OGM1

For more acceleration, combine OGM1 with ordered subsets (OS).

OS-OGM1:Initialize: t0 = 1, z (0) = x (0)for n = 0, 1, . . . (iteration)

for m = 1, . . . ,M (subset)

k = nM + m (subiteration)

zk+1 =[xk D1Mm

(xk)]

+(typical OS-SQS)

tk+1 =12

(1 +

1 + 4t2k

)xk+1 = zk+1 + tk 1tk+1

(zk+1 zk

)+ tktk+1

(zk+1 xk

)

51 / 74

OS-OGM1 properties

I Approximate convergence rate for : O(

1n2M2

)(Donghwan Kim and JF; CT 2014)

I Same compute per iteration as other OS methods(One forward / backward projection and M regularizer gradients per iteration)

I Same memory as OGM1 (two more images than OS-SQS)I Guaranteed convergence for M = 1I No convergence theory for M > 1 unstable for large M small M preferable for parallelization

I Now fast enough to show X-ray CT examples...

52 / 74

OS-OGM1 results: data

3D cone-beam helical X-ray CT scan pitch 0.5 image x: 512 512 109 with 70 cm FOV and 0.625 mm slices sinogram : y 888 detectors 32 rows 7146 views

53 / 74

OS-OGM1 results: convergence rate

Root mean square difference (RMSD) between x (n) and x () overROI (in HU), versus iteration.(Compute times per iteration are very similar.)

54 / 74

OS-OGM1 results: images

At iteration n = 10 with M = 12 subsets.

55 / 74

OS divergence example

1 1

4

one-pixel image three intersecting rays

A =

114

x = 2, y = Ax =

228

condition number of AA = 1 consistent system of eqns.

56 / 74

OS divergence example

OS-SQS-LS for M = 3 subsets:

xnew = xold D13mxold = xold D13A(Axold y)

D = diag{AA1} = 12 + 12 + 42 = 18After 3 updates:

x (n+1) x =(

1 31812)(

1 31812)(

1 31842) (

x (n) x)

= 2(15/18)3(x (n) x

)= 125108

(x (n) x

)Divergence of OS-SQS-LS is possible even in well-conditioned,consistent case

57 / 74

Outline

WhatCTMRI



Parallelization

58 / 74

Amazon Cloud version of OS-OGMDistribute long object (320 useful slices) into (overlapping) slabs(128 slices each) across 5 separate clusters, each with 10 nodeshaving 16 cores.Use MPI (message passing interface) for within-clustercommunication:

Forward Projection

Back Projection

Regularization UpdateForward

Projection

. . .Broadcast

CommunicationBroadcast

Communication

1 16 324864 8096 128 160 192 2241

16

32

48

64

80

96

128

160

192

224

Numbers of Cores

Sp

ee

du

p

Ideal Speedup

Observed Speedup

1 32 64 96 128 160 192 22410

1

102

103

Number of Cores

Tim

e p

er

Ite

rati

on

(s

ec

)

FP BP Reg Update Comm0

5

10

15

20

25

30

35

Tim

e (

se

c)

10 nodes

1 node

Linearly scaled10 nodes

Rosen, Wu, Wenisch, JF (Fully 3D, 2013) Overlapping slabs is inefficient Communication time (within cluster, after every subset) is

serious bottleneck59 / 74

Block-separable surrogates for distributed reconstruction

Conventional OS approach uses a voxel-wise SQS:

(x) (x (n)

)+

(x (n)

)(x x (n)) + 12(x x

(n))D(x x (n))

= (x (n)

)+

Nj=1

xj(x (n)

)(xj x (n)j ) +

12 dj

(xj x (n)j

)2Diagonal matrix D majorizes the Hessian of : 2 (x) D.Distributed computing alternative: slab-separable surrogate:

(x)(x (n)

)

Bb=1

b(xb)

b(xb) , xb (x (n)

)(xb x (n)b ) +

12(xb x (n)b

)Hb(xb x (n)b

)Block diagonal matrix H = diag{H1, . . . ,HB} majorizes 2 (x) .

60 / 74

BSS continued

b(xb) , xb (x (n)

)(xb x (n)b ) +

12(xb x (n)b

)Hb(xb x (n)b

)

Hb , AbW bAb, b , diag{A1 Ab1b}

Updates parallelizable across blocks (slabs):

x (n+1)b , arg minxb0

b(xb) .

I Reduces communication.I (Apply favorite optimization method within slab.)I (Donghwan Kim and JF; Fully 3D, 2015) [Mo18]

61 / 74

Block-separable surrogate (BSS) OS-OGM

62 / 74

BSS OS-OGM: data

256 256 160 XCAT phantom (Segars et al., 2008) Simulated helical CT, 444 32 492 M = 12 subsets, B = 10 blocks, L = 5 inner iterations Matlab emulation

FBP initializer x (0) Converged x ()

63 / 74

BSS OS-OGM: rates

Outer loop interrupts momentum= BSS is slower per iteration than OS-OGM

Reduced communication reduces overall time

64 / 74

BSS OS-OGM: images

Comparable images Algorithm designed for distributed computation

65 / 74

Duality approach for using GPU Data transfer between system RAM and GPU can be bottleneck Hide communication time by overlapping with computation

Algorithm synopsis: (Madison McGaffin and JF; Fully 3D, 2015) [Wed. AM] Write cost function (x) in terms of dual variables v and u for

data-fit and regularizer:

(x) =M

i=1hi ([Ax]i ) +

k([Cx]k)

x (n+1) = arg minx

supu,v(

A u +C v) x M

i=1hi (ui )

k(vk) +

2x x (n)22

hi and denote convex conjugates of hi and Alternate between updating several projection view dual variables {ui} dual variables for one regularization direction {vk}

Using dual variables decouples regularizer and data terms OS-like method with convergence theorem More details in Dr. McGaffins talk

66 / 74

Duality-GPU: data

3D cone-beam helical X-ray CT scan pitch 0.5 image x: 512 512 109 with 70 cm FOV and 0.625 mm slices sinogram : y 888 detectors 32 rows 7146 views OpenCL on aging NVIDIA GTX 480 GPU with 2.5 GB RAM

FBP initializer x (0) Converged x ()

67 / 74

Duality-GPU: timing results

Algorithm designed specifically for GPU architecturecharacteristics Future work: combine with BSS for multiple nodes ?

68 / 74

Duality-GPU: image results

69 / 74

Summary

I Model-based image reconstruction can improve image quality for low-dose X-ray CT enable faster MRI scans via under-sampling

I Much more: dynamic image reconstruction, motioncompensation, ...

I Computation time remains a significant challengeI Moores law will not solve the problemI Algorithms designed for distributed computation are essential Block-separable surrogates to reduce communication

(Donghwan Kim and JF; Fully 3D, 2015) [Mo18] Duality approach to overlap communication with

computationAlso provides a OS-like algorithm with convergence theory(Madison McGaffin and JF; Fully 3D, 2015) [Wed. AM]

70 / 74

Bibliography I

* G. Hounsfield, A method of apparatus for examination of a body by radiation such as x-ray or gammaradiation, US Patent 1283915. British patent 1283915, London., 1972.

* R. Gordon, R. Bender, and G. T. Herman, Algebraic reconstruction techniques (ART) for thethree-dimensional electron microscopy and X-ray photography, J. Theor. Biol., vol. 29, no. 3, 47181,Dec. 1970.

* R. Gordon and G. T. Herman, Reconstruction of pictures from their projections, Comm. ACM, vol.14, no. 12, 75968, Dec. 1971.

* G. T. Herman, A. Lent, and S. W. Rowland, ART: mathematics and applications (a report on themathematical foundations and on the applicability to real data of the algebraic reconstructiontechniques), J. Theor. Biol., vol. 42, no. 1, 132, Nov. 1973.

* R. Gordon, A tutorial on ART (algebraic reconstruction techniques), IEEE Trans. Nuc. Sci., vol. 21,no. 3, 7893, Jun. 1974.

* R. L. Kashyap and M. C. Mittal, Picture reconstruction from projections, IEEE Trans. Comp., vol. 24,no. 9, 91523, Sep. 1975.

* A. J. Rockmore and A. Macovski, A maximum likelihood approach to transmission imagereconstruction from projections, IEEE Trans. Nuc. Sci., vol. 24, no. 3, 192935, Jun. 1977.

* K. Lange and R. Carson, EM reconstruction algorithms for emission and transmission tomography, J.Comp. Assisted Tomo., vol. 8, no. 2, 30616, Apr. 1984.

* K. Sauer and C. Bouman, A local update strategy for iterative reconstruction from projections, IEEETrans. Sig. Proc., vol. 41, no. 2, 53448, Feb. 1993.

* S. H. Manglos, G. M. Gagne, A. Krol, F. D. Thomas, and R. Narayanaswamy, Transmissionmaximum-likelihood reconstruction with ordered subsets for cone beam CT, Phys. Med. Biol., vol. 40,no. 7, 122541, Jul. 1995.

71 / 74

Bibliography II

* C. Kamphuis and F. J. Beekman, Accelerated iterative transmission CT reconstruction using an orderedsubsets convex algorithm, IEEE Trans. Med. Imag., vol. 17, no. 6, 10015, Dec. 1998.

* H. Erdogan and J. A. Fessler, Ordered subsets algorithms for transmission tomography, Phys. Med.Biol., vol. 44, no. 11, 283551, Nov. 1999.

* E. Hansis, J. Bredno, D. Sowards-Emmerd, and L. Shao, Iterative reconstruction for circular cone-beamCT with an offset flat-panel detector, in Proc. IEEE Nuc. Sci. Symp. Med. Im. Conf., 2010, 222831.

* B. P. Sutton, D. C. Noll, and J. A. Fessler, Fast, iterative image reconstruction for MRI in the presenceof field inhomogeneities, IEEE Trans. Med. Imag., vol. 22, no. 2, 17888, Feb. 2003.

* ,Dynamic field map estimation using a spiral-in / spiral-out acquisition, Mag. Res. Med., vol. 51,no. 6, 1194204, Jun. 2004.

* V. T. Olafsson, D. C. Noll, and J. A. Fessler, Fast joint reconstruction of dynamic R2 and field maps infunctional MRI, IEEE Trans. Med. Imag., vol. 27, no. 9, 117788, Sep. 2008.

* Y. Censor, D. E. Gustafson, A. Lent, and H. Tuy, A new approach to the emission computerizedtomography problem: simultaneous calculation of attenuation and activity coefficients, IEEE Trans.Nuc. Sci., vol. 26, no. 2, 27759, Apr. 1979.

* N. H. Clinthorne, J. A. Fessler, G. D. Hutchins, and W. L. Rogers, Joint maximum likelihood estimationof emission and attenuation densities in PET, in Proc. IEEE Nuc. Sci. Symp. Med. Im. Conf., vol. 3,1991, 192732.

* H. Erdogan and J. A. Fessler, Algorithms for joint estimation of attenuation and emission images inPET, in Proc. IEEE Conf. Acoust. Speech Sig. Proc., vol. 6, 2000, 37836.

* A. Rezaei, M. Defrise, and J. Nuyts, ML-reconstruction for TOF-PET with simultaneous estimation ofthe attenuation factors, IEEE Trans. Med. Imag., vol. 33, no. 7, 156372, Jul. 2014.

* C. Yip, J. A. Fessler, and D. C. Noll, Iterative RF pulse design for multidimensional, small-tip-angleselective excitation, Mag. Res. Med., vol. 54, no. 4, 90817, Oct. 2005.

72 / 74

Bibliography III

* F. Zhao, J. A. Fessler, S. M. Wright, and D. C. Noll, Regularized estimation of magnitude and phase ofmulti-coil B1 field via Bloch-Siegert B1 mapping and coil combination optimizations, IEEE Trans.Med. Imag., vol. 33, no. 10, 202030, Oct. 2014.

* S. S. Vasanawala, M. T. Alley, B. A. Hargreaves, R. A. Barth, J. M. Pauly, and M. Lustig, Improvedpediatric MR imaging with compressed sensing, Radiology, vol. 256, 60716, 2010.

* K. P. Pruessmann, M. Weiger, M. B. Scheidegger, and P. Boesiger, SENSE: sensitivity encoding forfast MRI, Mag. Res. Med., vol. 42, no. 5, 95262, Nov. 1999.

* M. Lustig, D. L. Donoho, J. M. Santos, and J. M. Pauly, Compressed sensing MRI, IEEE Sig. Proc.Mag., vol. 25, no. 2, 7282, Mar. 2008.

* S. Ramani and J. A. Fessler, Parallel MR image reconstruction using augmented Lagrangian methods,IEEE Trans. Med. Imag., vol. 30, no. 3, 694706, Mar. 2011.

* A. R. De Pierro, A modified expectation maximization algorithm for penalized likelihood estimation inemission tomography, IEEE Trans. Med. Imag., vol. 14, no. 1, 1327, Mar. 1995.

* Y. Drori and M. Teboulle, Performance of first-order methods for smooth convex minimization: A novelapproach, Mathematical Programming, vol. 145, no. 1-2, 45182, Jun. 2014.

* Y. Nesterov, A method for unconstrained convex minimization problem with the rate of convergenceO(1/k2), Dokl. Akad. Nauk. USSR, vol. 269, no. 3, 5437, 1983.

* ,Smooth minimization of non-smooth functions, Mathematical Programming, vol. 103, no. 1,12752, May 2005.

* D. Kim and J. A. Fessler, Optimized first-order methods for smooth convex minimization, arxiv1406.5468, 2014.

* ,Optimized first-order methods for smooth convex minimization, Mathematical Programming,2015, Submitted.

73 / 74

Bibliography IV

* M. J. Muckley, D. C. Noll, and J. A. Fessler, Fast parallel MR image reconstruction via B1-based,adaptive restart, iterative soft thresholding algorithms (BARISTA), IEEE Trans. Med. Imag., vol. 34,no. 2, 57888, Feb. 2015.

* B. ODonoghue and E. Candes, Adaptive restart for accelerated gradient schemes, Found.Computational Math., 2014.

* D. Kim and J. A. Fessler, An optimized first-order method for image restoration, in Proc. IEEE Intl.Conf. on Image Processing, To appear., 2015.

* A. B. Taylor, J. M. Hendrickx, and Francois. Glineur, Smooth strongly convex interpolation and exactworst-case performance of first- order methods, arxiv 1502.05666, 2015.

* M. Muckley, D. C. Noll, and J. A. Fessler, Accelerating SENSE-type MR image reconstructionalgorithms with incremental gradients, in Proc. Intl. Soc. Mag. Res. Med., 2014, p. 4400.

* D. Kim and J. A. Fessler, Optimized momentum steps for accelerating X-ray CT ordered subsets imagereconstruction, in Proc. 3rd Intl. Mtg. on image formation in X-ray CT, 2014, 1036.

* J. M. Rosen, J. Wu, T. F. Wenisch, and J. A. Fessler, Iterative helical CT reconstruction in the cloudfor ten dollars in five minutes, in Proc. Intl. Mtg. on Fully 3D Image Recon. in Rad. and Nuc. Med,2013, 2414.

* D. Kim and J. A. Fessler, Distributed block-separable ordered subsets for helical X-ray CT imagereconstruction, in Proc. Intl. Mtg. on Fully 3D Image Recon. in Rad. and Nuc. Med, To appear., 2015.

* W. P. Segars, M. Mahesh, T. J. Beck, E. C. Frey, and B. M. W. Tsui, Realistic CT simulation using the4D XCAT phantom, Med. Phys., vol. 35, no. 8, 38008, Aug. 2008.

* M. G. McGaffin and J. A. Fessler, Fast GPU-driven model-based X-ray CT image reconstruction viaalternating dual updates, in Proc. Intl. Mtg. on Fully 3D Image Recon. in Rad. and Nuc. Med, Toappear., 2015.

74 / 74

WhatCTMRI



Parallelization

fd@rm@0: fd@rm@1:

Documents

Iterative Reconstruction in CT and MRIweb.eecs.umich.edu/~fessler/papers/lists/files/talk/15/f3d-keynote.pdf · Iterative Reconstruction in CT and MRI (and a bit of PET and SPECT)