
Analysis of Diffusion MRI Data in the Presence of Noise

and Complex Fibre Architectures

by

Ryan Fobel

A thesis submitted in conformity with the requirements

for the degree of Master of Science

Graduate Department of Medical Biophysics

University of Toronto

Copyright © 2008 by Ryan Fobel

Abstract

Analysis of Diffusion MRI Data in the Presence of Noise and Complex Fibre Architectures

Ryan Fobel

Master of Science

Graduate Department of Medical Biophysics

University of Toronto

2008

This thesis examines the advantages of nonlinear least-squares (NLS) fitting of diffusion-weighted MRI data over the commonly used linear least-squares (LLS) approach. A modified fitting algorithm is proposed which accounts for the positive bias experienced in magnitude images at low SNR. For b-values in the clinical range (≈1000 s/mm²), the increase in precision of FA and fibre orientation estimates is almost negligible, except at very high anisotropy. The optimal b-value for estimating tensor parameters was slightly higher for NLS. The primary advantage of NLS was improved performance at high b-values, for which complex fibre architectures were more easily resolved. This was demonstrated using a model-selection classifier based on higher-order diffusion models. Using a b-value of 3000 s/mm² and magnitude-corrected NLS fitting, at least 10% of voxels in the brain exhibited diffusion profiles which could not be represented by the tensor model.


Acknowledgements

First of all, I'd like to thank my supervisor and mentor, Greg Stanisz, for giving me the freedom to find my own way, but always being available when I needed help, and for teaching me to think like a scientist. To the other members of our research group, Sharon, Wendy, Colleen, Kim, Lisa, Voytek and Emidio, for maintaining a lighthearted mood in the office and exposing me to lots of interesting science along the way.

To the members of my supervisory committee, Simon Graham and Chuck Cunningham, for all of their support and thoughtful feedback. To my fellow lab mates, Gord, Patrick, General, Garry, Mathieu, Helen and Rachel; whether it was frisbee in the park, hitting the pub after a hard day's work, or long debates in the sixth floor lounge, you all helped to make these past couple of years at Sunnybrook both fun and memorable. To Sadie Yancey, my wonderful lab/roommate, for her generous spirit and for providing a welcome source of distractions. To the other DTI researchers at Sunnybrook, Nancy and Sofia, thanks for sharing your insight and always being willing to lend an ear.

Thanks to my good friends and thrill seekers, Jon Lovell, Dave Crane, and Matt Ellis. Tuckerman's and Burning Man were legendary experiences, and really lit up my imagination. To my younger brother, Christian, for our frequent nerdy conversations, and for not rubbing it in too much when he finished his thesis before me. To my best friend and companion, Aislinn Clancy, for her cheesy jokes, love of life, and for never failing to bring a smile to my face. Finally, a huge thanks to my parents, Richard and Maribeth, for all of their love and support over the years. None of this would have been possible without them.


Contents

Abstract
Acknowledgements
Contents
List of Tables
List of Figures
List of Abbreviations

1 Introduction
1.1 Diffusion-weighted MRI
1.2 Diffusion Tensor Imaging
1.3 Image artifacts
1.4 Optimal imaging parameters
1.5 Thesis statement

2 Comparison of least-squares fitting methods
2.1 Introduction
2.2 Theory
2.2.1 Linear least-squares
2.2.2 Weighted linear least-squares
2.2.3 Nonlinear least-squares
2.2.4 Correcting for magnitude bias
2.2.5 Comparing fitting algorithms
2.2.6 Optimal b-value
2.2.7 Number of non-diffusion-weighted images
2.2.8 Rotational bias
2.3 Methods
2.3.1 Fitting performance under clinical conditions
2.3.2 Optimal b-value
2.3.4 Rotational bias
2.4 Results
2.4.1 Fitting performance under clinical conditions
2.4.2 Optimal b-value
2.4.4 Rotational bias
2.5 Discussion

3 Testing the validity of the tensor model
3.1 Introduction
3.2 Theory
3.2.1 Hypothesis testing
3.2.2 Model selection
3.2.3 Fitting higher-order models
3.3 Methods
3.3.1 Experiment
3.3.2 Simulations
3.4 Results
3.4.1 Experiment
3.4.2 Simulations
3.5 Discussion

4 Conclusions and future work
4.1 Magnitude-correction
4.2 Noise characterization
4.3 Improving SNR
4.4 Tractography
4.5 Conclusions

Bibliography

List of Tables

2.1 b-values optimized for FA
2.2 b-values optimized for ε1
2.3 Number of b0 images optimized for FA
3.1 Elements of a rank-4 generalized DT

List of Figures

1.1 Brownian motion
1.2 Stejskal-Tanner PGSE sequence
1.3 Isotropic vs. anisotropic diffusion
1.4 Diffusion Tensor
1.5 Displacement vs. ADC profile
1.6 Trace, FA and fibre orientation maps
1.7 Fibre tracking
1.8 Eddy currents
1.9 Eddy current distortions
1.10 Twice-refocused spin echo sequence
2.1 Log-transform of the diffusion signal
2.2 TEmin and TRmin vs. b-value
2.3 Brain masking algorithm
2.4 SNR, trace, and FA histograms from experiment
2.5 Reduced chi-squared, FA, and directional statistics
2.6 Reduced chi-squared, FA, and directional statistic histograms
2.7 Root mean squared error in Fractional Anisotropy vs. b
2.8 Near-optimal range of b-values
2.9 Root mean squared error in ε1 vs. b-value
2.10 Rotational bias in σFA versus number of unique directions
2.11 Rotational variance of σε1 versus number of unique directions
2.12 Rotational variance of σFA vs. FA
2.13 Rotational variance of σε1 vs. FA
3.1 Crossing fibres schematic
3.2 Crossing fibres vs. b-value
3.3 Hypothesis testing
3.4 Model selection by F-test
3.5 Selected locations demonstrating non-Gaussian diffusion
3.6 Voxel classification maps vs. SNR
3.7 Voxel classification maps vs. b-value
3.8 ADC profiles vs. b-value
3.9 Crossing fibre-detection simulations vs. b-value (SNR=30 at b=1000 s/mm²)
3.10 Crossing fibre-detection simulations vs. b-value (SNR=70 at b=1000 s/mm²)
3.11 Crossing fibre-detection simulations at FA=0.7

List of Abbreviations

ADC Apparent Diffusion Coefficient

DT Diffusion Tensor

DTI Diffusion Tensor Imaging

EPI Echo Planar Imaging

FA Fractional Anisotropy

GDT Generalized Diffusion Tensor

LLS Linear Least-Squares

MCNLS Magnitude-Corrected Nonlinear Least-Squares

MR Magnetic Resonance

MRI Magnetic Resonance Imaging

NEX Number of Excitations

NLS Nonlinear Least-Squares

pdf probability density function

RF radio frequency


RMSE Root Mean Squared Error

SH Spherical Harmonics

SNR Signal-to-Noise Ratio

TE Echo Time

TR Repetition Time

WLLS Weighted Linear Least-Squares


Chapter 1

Introduction

The brain is often classified into two distinct tissue types: grey matter and white matter. Grey matter is the part of the brain responsible for synthesizing and processing information. White matter, on the other hand, constitutes the physical “wiring” of the brain. It is largely composed of axonal fibre bundles which transmit signals between various brain regions. The primary application for the work presented in this thesis is the study of these white matter fibres. Both the physical layout and the viability of white matter connection pathways are of great interest to researchers and clinicians.

The ability to non-invasively probe the structural organization of white matter is made possible by a specific type of Magnetic Resonance Imaging (MRI) called diffusion-weighted MRI. This chapter introduces the historical development of this technique, its underlying theory, and a broad overview of this rapidly developing field of research.


1.1 Diffusion-weighted MRI

In 1905, Einstein showed that the random motion of spherical particles suspended in fluid, a phenomenon known as Brownian motion, was the result of thermal energy [1]. The following equation describes the displacement probability for a single molecule along direction x at time t:

\[ P(x,t) = \frac{1}{\sqrt{4\pi D t}} \exp\left(-\frac{x^2}{4Dt}\right) \tag{1.1} \]

where the self-diffusion coefficient, D, describes the rate at which molecules tend to spread out in a fluid medium. $D = \mu_p k_B T$, where $\mu_p$ is the mobility of the particles (related to the particle size and viscosity), $k_B$ is the Boltzmann constant, and T is the absolute temperature. In the case of free diffusion, the shape of the displacement probability function, P(x, t), is Gaussian, and the average (root-mean-square) displacement is equal to $\sqrt{2Dt}$, as demonstrated in Fig. 1.1.
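The relationship between the Gaussian displacement profile and the $\sqrt{2Dt}$ spread can be checked with a short random-walk simulation. This is a minimal sketch, not the simulation used for Fig. 1.1: the function name `simulate_displacements`, the step count, and the values of D and t are illustrative assumptions.

```python
import numpy as np

def simulate_displacements(D, t, n_steps=100, n_particles=20000, seed=0):
    """Return the net 1-D displacement of each particle after time t.

    Each of the n_steps increments is drawn from a zero-mean Gaussian with
    variance 2*D*dt, the discrete analogue of free diffusion (Eq. 1.1).
    """
    rng = np.random.default_rng(seed)
    dt = t / n_steps
    steps = rng.normal(0.0, np.sqrt(2.0 * D * dt), size=(n_particles, n_steps))
    return steps.sum(axis=1)

D = 2.0e-3  # mm^2/s (illustrative value, not taken from the text)
t = 0.05    # s
x = simulate_displacements(D, t)

rms_simulated = np.sqrt(np.mean(x**2))   # spread of the simulated walkers
rms_expected = np.sqrt(2.0 * D * t)      # prediction from Eq. 1.1
print(f"simulated RMS displacement: {rms_simulated:.5f} mm")
print(f"sqrt(2Dt):                  {rms_expected:.5f} mm")
```

The mean of the simulated displacements stays close to zero, while their root-mean-square value approaches $\sqrt{2Dt}$ as the number of particles grows.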

Figure 1.1: (a) Brownian motion of a single water molecule modeled by a random walk simulation. (b) Displacement probability, P(x, t), along the x-axis for a single molecule starting at initial position x=0 (Eq. 1.1). <x> is the average displacement after time t.

Almost fifty years after Einstein's discovery, Hahn [2] and Carr and Purcell [3] showed how the self-diffusion coefficient, D, could be measured using Nuclear Magnetic Resonance (NMR). A modification of this approach by Stejskal and Tanner used pulsed-gradients to achieve a more accurate measurement [4]. This Pulsed-Gradient Spin Echo (PGSE) sequence, shown in Fig. 1.2, is widely used in diffusion-weighted MRI to measure the displacement of water molecules in tissue.

Figure 1.2: The Stejskal-Tanner Pulsed-Gradient Spin Echo (PGSE) sequence [4]. The image is not to scale. δ is the pulse duration, Δ is the diffusion time, and TE is the time to echo (or echo time).

The PGSE sequence resembles a standard spin echo [2] with the addition of two large diffusion-weighting gradients after the 90° and 180° Radio Frequency (RF) pulses. The effect of this sequence can be explained in terms of a single hydrogen atom, or spin. The first gradient causes the spin to accumulate phase in relation to its position along the axis of measurement, x. After the gradient is turned off, the spin has accumulated phase $\phi_1 = \gamma G_x \delta x_1$ (assuming negligible gradient ramp times), where γ is the gyromagnetic ratio, $G_x$ is the gradient amplitude, δ is the gradient duration, and $x_1$ is the initial position of the spin along the x-axis. A second identical diffusion gradient is applied after time Δ, but because these gradients are separated by a 180° RF pulse, the phase accumulation now has the opposite sign, $\phi_2 = -\gamma G_x \delta x_2$, where $x_2$ is the position of the spin along the x-axis after the second gradient pulse. The total phase accumulated by the spin is therefore:

\[ \phi_1 + \phi_2 = \gamma G_x \delta (x_1 - x_2) \tag{1.2} \]

and is proportional to the spin displacement along the x-axis, $(x_1 - x_2)$. If the spin remains stationary during the time between the two gradients (i.e. $x_1 = x_2$), the phase terms cancel out and the echo amplitude is unaffected. If, however, the position of the spin along the x-axis changes during diffusion time Δ, the phase terms do not cancel. Comparing the signal to a reference signal without the diffusion gradients allows quantitative measurement of the spin displacement.

In reality, it is not possible to measure the signal from a single spin. Instead, the measured signal represents the net magnetization of all spins in the imaging voxel. If the spins remain stationary throughout the experiment or if no diffusion gradients are used, the signal is described by:

\[ S_0 = \frac{1}{N} \sum_{j=1}^{N} \mu_j \exp(-i\varphi_j) \tag{1.3} \]

where $\mu_j$ and $\varphi_j$ represent the magnetic moment and phase of spin j, respectively. Vector $r_j$ describes the displacement of spin j along the direction of measurement, and therefore the net diffusion-weighted signal is:

\[ S = \frac{1}{N} \sum_{j=1}^{N} \mu_j \exp(-i\varphi_j) \exp(i\gamma G \delta r_j) \tag{1.4} \]

The signal can also be expressed as an integral in terms of the spin displacement probability density function, $P(r,\Delta)$:

\[ S = S_0 \int P(r,\Delta) \exp(i\gamma G \delta r)\, dr \tag{1.5} \]

where $S_0$ is the signal in the absence of any diffusion gradients. A simple change of variables, $q = \gamma G \delta / 2\pi$, reveals the Fourier relationship between S and $P(r,\Delta)$:

\[ S = S_0 \int P(r,\Delta) \exp(i 2\pi q r)\, dr \tag{1.6} \]

This is referred to as the q-space formalism. In the case of free diffusion, where $P(r,\Delta)$ is Gaussian (Eq. 1.1), the inverse Fourier transform describes the net loss in magnetization due to the diffusing spins [5]:

\[ S = S_0 \exp\left[-(2\pi q)^2 \Delta D\right] = S_0 \exp\left[-\gamma^2 G^2 \delta^2 \Delta D\right] \tag{1.7} \]
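The step from the Fourier integral to the monoexponential decay can be verified numerically. The sketch below integrates Eq. 1.6 for the Gaussian pdf of Eq. 1.1 and compares the result with $\exp[-(2\pi q)^2 \Delta D]$, which equals $\exp[-\gamma^2 G^2 \delta^2 \Delta D]$ when $q = \gamma G \delta / 2\pi$. The function name `qspace_signal`, the integration grid, and the parameter values are assumptions chosen for illustration.

```python
import numpy as np

def qspace_signal(q, D, Delta, n=20001):
    """Numerically evaluate S/S0 = integral of P(r, Delta) exp(i 2 pi q r) dr."""
    sigma = np.sqrt(2.0 * D * Delta)                  # std. dev. of the Gaussian pdf
    r = np.linspace(-10.0 * sigma, 10.0 * sigma, n)   # pdf is negligible beyond 10 sigma
    dr = r[1] - r[0]
    P = np.exp(-r**2 / (4.0 * D * Delta)) / np.sqrt(4.0 * np.pi * D * Delta)
    # Simple Riemann sum; accurate here because the integrand vanishes at both ends
    return np.sum(P * np.exp(1j * 2.0 * np.pi * q * r)) * dr

D = 2.0e-3     # mm^2/s (illustrative)
Delta = 0.04   # s (illustrative)
q = 20.0       # 1/mm (illustrative)

numeric = qspace_signal(q, D, Delta)
analytic = np.exp(-(2.0 * np.pi * q)**2 * Delta * D)
print(numeric.real, analytic)
```

The imaginary part of the numerical result vanishes because the Gaussian pdf is symmetric about r = 0.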

To account for diffusion that occurs during the finite pulse width, Δ is replaced with the term $\Delta - \delta/3$, or the “effective” diffusion time:

\[ S = S_0 \exp\left[-\gamma^2 G^2 \delta^2 \left(\Delta - \frac{\delta}{3}\right) D\right] \tag{1.8} \]

The sequence parameters are commonly represented in the MRI literature by a single diffusion parameter, b:

\[ b = \gamma^2 G^2 \delta^2 \left(\Delta - \frac{\delta}{3}\right) \tag{1.9} \]

which reduces Eq. 1.8 to the form:

\[ S = S_0 \exp[-bD] \tag{1.10} \]
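As a worked example of Eqs. 1.9 and 1.10, the following sketch computes a b-value from pulse sequence parameters and the resulting signal attenuation. The helper name `b_value` and the gradient amplitude, timings, and diffusion coefficient are illustrative assumptions rather than values from this thesis; the proton gyromagnetic ratio is the standard physical constant.

```python
import math

GAMMA = 2.675e8  # proton gyromagnetic ratio in rad s^-1 T^-1

def b_value(G, delta, Delta, gamma=GAMMA):
    """Eq. 1.9: b in s/m^2 for G in T/m and delta, Delta in seconds."""
    return gamma**2 * G**2 * delta**2 * (Delta - delta / 3.0)

# Illustrative (assumed) sequence parameters
G = 40e-3       # T/m
delta = 20e-3   # s
Delta = 40e-3   # s

b_si = b_value(G, delta, Delta)   # s/m^2
b_mm2 = b_si * 1e-6               # s/mm^2, the unit used in the text

D = 0.7e-3                        # mm^2/s (illustrative ADC)
attenuation = math.exp(-b_mm2 * D)  # S/S0 from Eq. 1.10
print(f"b = {b_mm2:.0f} s/mm^2, S/S0 = {attenuation:.3f}")
```

Keeping the calculation in SI units and converting once at the end avoids mixing mT/m, ms, and mm in the same expression.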

By performing a logarithmic transform of Eq. 1.10 and rearranging, a linear relationship between the log-normalized MRI signal, $\log(S/S_0)$, and the self-diffusion coefficient, D, is established:

\[ \log(S/S_0) = -bD \tag{1.11} \]

The two unknowns in this equation are $S_0$, the echo amplitude without diffusion gradients, and the diffusion coefficient, D. b is an independent experimental variable controlled by pulse sequence parameters. If the signal is measured at two different b-values, calculation of the remaining parameters, $S_0$ and D, is straightforward.
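The two-point calculation follows directly from Eq. 1.11: subtracting the log-signals at the two b-values eliminates $S_0$. A minimal sketch, assuming noise-free measurements; the function name `fit_two_point` and the sample values are hypothetical.

```python
import math

def fit_two_point(b1, S1, b2, S2):
    """Solve S = S0 * exp(-b * D) for S0 and D from two measurements."""
    D = math.log(S1 / S2) / (b2 - b1)   # slope of log-signal vs. b
    S0 = S1 * math.exp(b1 * D)          # extrapolate back to b = 0
    return S0, D

# Synthetic, noise-free example (assumed values)
true_S0, true_D = 100.0, 0.7e-3         # arbitrary units, mm^2/s
b1, b2 = 200.0, 1000.0                  # s/mm^2
S1 = true_S0 * math.exp(-b1 * true_D)
S2 = true_S0 * math.exp(-b2 * true_D)

S0_est, D_est = fit_two_point(b1, S1, b2, S2)
print(S0_est, D_est)
```

With noisy data the two-point estimate is exact only on average, which is why measurements at more than two points and least-squares fitting are used in practice.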

Discussion to this point has been limited to the case of free diffusion, in which water molecules can freely move in any direction. In this case, the displacement probability depends only on the temperature and mobility of the water molecules, and the signal attenuation caused by diffusing spins is monoexponential. However, diffusion in biological tissue is complicated by the various barriers to water movement, including cell membranes and organelles. While the self-diffusion coefficient, D, may be the same in both cases, the measured diffusion coefficient in tissue is reduced. The difference provides information about the cellular microstructure in the vicinity of the water molecules and its ability to hinder and/or restrict the movement of water. To address this distinction, the term Apparent Diffusion Coefficient (ADC) [6] is typically used. A further implication of diffusion in tissue is that the displacement probability function is no longer necessarily Gaussian. The result is that the diffusion signal in biological tissue is not monoexponential, though for b-values in the clinical range of DTI (b≈1000 s/mm²), this is a reasonable approximation [7].

The Apparent Diffusion Coefficient (ADC) describes the mobility of water molecules in tissue, and this information can be used to develop insight into tissue microstructure. This is illustrated in Fig. 1.3, which shows a schematic of diffusing water molecules in two different environments. In the first case (Fig. 1.3a), there is no preferential direction for the water molecules to move, meaning that the apparent diffusion coefficient is approximately the same in every direction. This is referred to as isotropic diffusion and it is typical of tissues that lack a dominant directional organization (e.g. grey matter). Fig. 1.3b shows cylindrical tubes that represent a simplified model of white matter with axons running in parallel. The water molecules are relatively free to diffuse along the length of the axons, but their movement is highly restricted perpendicular to the fibre orientation. This results in what is referred to as anisotropic diffusion.

Diffusion in the majority of tissue types is only weakly anisotropic. Therefore, measuring the Apparent Diffusion Coefficient (ADC) along a single direction is usually sufficient. In highly organized tissues such as white matter and muscle, it is necessary to sample the ADC along several directions. Although some of the early research on diffusion-weighted MRI recognized differences in the ADC along the x, y and z-axes in white matter [8], these three measurements did not provide enough information to completely describe the diffusion profile.

1.2 Diffusion Tensor Imaging

In 1994, Basser et al. introduced Diffusion Tensor Imaging (DTI) [9]. DTI enables full three-dimensional characterization of the diffusion process in vivo, making it practical to obtain detailed knowledge of white matter fibre orientation in the human brain. The diffusion tensor is represented by a 3×3 symmetric matrix:

\[ \mathbf{D} = \begin{pmatrix} D_{xx} & D_{xy} & D_{xz} \\ D_{yx} & D_{yy} & D_{yz} \\ D_{zx} & D_{zy} & D_{zz} \end{pmatrix} \tag{1.12} \]

Figure 1.3: Schematic of diffusing water molecules in (a) isotropic (e.g. grey matter) and (b) anisotropic (e.g. white matter) media. (c) Spin displacement probability density function for the case of free diffusion (i.e. no restriction). (d) Restricted displacement probability density function along directions parallel and (e) perpendicular to the fibres.

The three diagonal terms represent the apparent diffusion along the x, y, and z axes in the laboratory frame of reference. The off-diagonal terms describe the degree of correlation between the diffusion along the primary axes. Because the matrix is symmetric (i.e. $D_{ij} = D_{ji}$), there are only six unique elements of the diffusion tensor, D. Assuming that the molecular displacement probability density function is Gaussian, it can be related to the diffusion tensor by the following equation:

\[ P(\mathbf{r},t) = \frac{1}{\sqrt{(4\pi t)^3 |\mathbf{D}|}} \exp\left(-\frac{\mathbf{r}^T \mathbf{D}^{-1} \mathbf{r}}{4t}\right) \tag{1.13} \]

The measured signal along direction r is related to the diffusion tensor by the equation:

\[ S = S_0 \exp(-b\, \mathbf{r}^T \mathbf{D}\, \mathbf{r}) \tag{1.14} \]
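Eq. 1.14 can be evaluated directly for any measurement direction. The sketch below is a forward model only: it predicts the signal for a given tensor and does not perform any fitting. The function name `dw_signal` and the tensor values are illustrative assumptions.

```python
import numpy as np

def dw_signal(S0, b, g, D):
    """Eq. 1.14: signal for unit gradient direction g and 3x3 tensor D."""
    g = np.asarray(g, dtype=float)
    g = g / np.linalg.norm(g)          # make sure the direction is a unit vector
    return S0 * np.exp(-b * (g @ D @ g))

# Assumed anisotropic tensor aligned with the x-axis, in mm^2/s
D = np.diag([1.7e-3, 0.3e-3, 0.3e-3])
b = 1000.0                             # s/mm^2

s_parallel = dw_signal(1.0, b, [1, 0, 0], D)       # along the fibre
s_perpendicular = dw_signal(1.0, b, [0, 1, 0], D)  # across the fibre
s_oblique = dw_signal(1.0, b, [1, 1, 0], D)        # 45 degrees in the x-y plane
print(s_parallel, s_perpendicular, s_oblique)
```

The signal is most attenuated along the direction of fastest diffusion, which is the basis for inferring fibre orientation from a set of directional measurements.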

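Eigenvalue decomposition of D determines the apparent diffusion coefficients $\lambda_1$, $\lambda_2$, and $\lambda_3$, corresponding to eigenvectors $\varepsilon_1$, $\varepsilon_2$, and $\varepsilon_3$. The primary eigenvector, $\varepsilon_1$, gives the estimated fibre orientation. If the eigenvalue and eigenvector terms are written as 3×3 matrices, Λ and E,

\[ \mathbf{\Lambda} = \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix} \tag{1.15} \]

\[ \mathbf{E} = \begin{pmatrix} \varepsilon_{1x} & \varepsilon_{2x} & \varepsilon_{3x} \\ \varepsilon_{1y} & \varepsilon_{2y} & \varepsilon_{3y} \\ \varepsilon_{1z} & \varepsilon_{2z} & \varepsilon_{3z} \end{pmatrix} \tag{1.16} \]

then the following equation demonstrates their relationship to the diffusion tensor, D:

\[ \mathbf{D} = \mathbf{E} \mathbf{\Lambda} \mathbf{E}^{-1} \tag{1.17} \]

E is the rotation matrix relating the diagonalized tensor, Λ, to the laboratory frame. Fig. 1.4 shows a graphical representation of the three-dimensional isoprobability displacement profile for an anisotropic tensor in its standard form and aligned with the laboratory axes.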
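The decomposition in Eq. 1.17 can be reproduced with a standard symmetric eigensolver. This is a minimal sketch using NumPy: a tensor is built from assumed eigenvalues and an assumed 30° rotation about the z-axis, and then decomposed again. Note that `numpy.linalg.eigh` returns eigenvalues in ascending order, so they are re-sorted to match the convention $\lambda_1 \geq \lambda_2 \geq \lambda_3$.

```python
import numpy as np

theta = np.pi / 6.0   # assumed fibre rotation about the z-axis
E_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
Lam_true = np.diag([1.7e-3, 0.4e-3, 0.2e-3])   # mm^2/s, illustrative

# Eq. 1.17; for an orthonormal E the inverse equals the transpose
D = E_true @ Lam_true @ E_true.T

# Eigendecomposition of the symmetric tensor
evals, evecs = np.linalg.eigh(D)
order = np.argsort(evals)[::-1]      # sort descending: lambda1 first
evals = evals[order]
evecs = evecs[:, order]              # columns are eps1, eps2, eps3

e1 = evecs[:, 0]                     # estimated fibre orientation
print(evals, e1)
```

The recovered principal eigenvector matches the first column of the rotation matrix up to sign, since an eigenvector and its negative describe the same fibre axis.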

Figure 1.4: The diffusion ellipsoid represents the isoprobability displacement surface. $\varepsilon_1$ is the vector representing the estimated fibre orientation. The 3×3 eigenvector matrix, E, rotates the diffusion tensor, D, such that its axes are aligned with the laboratory frame, Λ.

Fig. 1.5a and b illustrate the relationship between the diffusion tensor and its corresponding Apparent Diffusion Coefficient (ADC) profile (i.e. a plot of the ADC values versus orientation). The reason that the ADC profile is “peanut” shaped is that it is a projection of the average displacement along the axis of measurement. Fig. 1.5c shows that in the case of multiple fibre bundles, the orientation of the composite fibres is not obvious from the ADC profile.


Figure 1.5: (a) Diffusion tensor ellipsoid representing the average water displacement profile and (b) the corresponding ADC profile. The "peanut" shape of the ADC profile stems from the fact that it is a projection of the average diffusion along a given axis. (c) For a pair of crossing fibres, fibre orientation cannot be easily deduced from the ADC profile.


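Tensor eigenvalues can be used to calculate several rotationally invariant scalar indices, including the trace and mean diffusivity, $\langle\lambda\rangle$:

\[ \mathrm{trace}(\mathbf{D}) = \lambda_1 + \lambda_2 + \lambda_3 \tag{1.18} \]

\[ \langle\lambda\rangle = \frac{\lambda_1 + \lambda_2 + \lambda_3}{3} = \frac{1}{3}\mathrm{trace}(\mathbf{D}) \tag{1.19} \]

There are also several indices that describe anisotropy [10]. The most commonly used is the Fractional Anisotropy (FA), which is calculated according to the following equation:

\[ \mathrm{FA} = \sqrt{\frac{3\left[(\lambda_1 - \langle\lambda\rangle)^2 + (\lambda_2 - \langle\lambda\rangle)^2 + (\lambda_3 - \langle\lambda\rangle)^2\right]}{2\left(\lambda_1^2 + \lambda_2^2 + \lambda_3^2\right)}} \tag{1.20} \]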
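Eqs. 1.18 to 1.20 translate into a few lines of code. A minimal sketch, assuming the eigenvalues have already been obtained and are not all zero; the function name `tensor_indices` and the example eigenvalues are hypothetical.

```python
import numpy as np

def tensor_indices(eigenvalues):
    """Return (trace, mean diffusivity, FA) from the three tensor eigenvalues."""
    lam = np.asarray(eigenvalues, dtype=float)
    trace = lam.sum()                                   # Eq. 1.18
    mean_diffusivity = trace / 3.0                      # Eq. 1.19
    fa = np.sqrt(1.5 * np.sum((lam - mean_diffusivity)**2)
                 / np.sum(lam**2))                      # Eq. 1.20
    return trace, mean_diffusivity, fa

# Assumed example eigenvalues in mm^2/s
trace_iso, md_iso, fa_iso = tensor_indices([0.8e-3, 0.8e-3, 0.8e-3])  # isotropic
trace_wm, md_wm, fa_wm = tensor_indices([1.7e-3, 0.3e-3, 0.3e-3])     # anisotropic
print(fa_iso, fa_wm)
```

FA is 0 when all three eigenvalues are equal and approaches 1 as diffusion becomes confined to a single axis, whereas the trace does not depend on how the diffusivity is distributed among the eigenvalues.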

These indices are often displayed using parametric maps. The orientation of the

principal eigenvector is visualized using a colour-coded map for which the x, y, and z

vector components are mapped to the red, green, and blue channels respectively, and

weighted by FA [11]. Examples of each of these maps are shown in Fig. 1.6.
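The colour-coding scheme can be sketched for a single voxel as follows. Taking the absolute value of each component is an assumption made here because an eigenvector and its negative represent the same orientation; the function name `direction_to_rgb` is hypothetical.

```python
import numpy as np

def direction_to_rgb(e1, fa):
    """Map a principal eigenvector to an FA-weighted (red, green, blue) triple."""
    e1 = np.asarray(e1, dtype=float)
    e1 = e1 / np.linalg.norm(e1)        # unit-length orientation
    # |x| -> red, |y| -> green, |z| -> blue, scaled by FA (assumed to lie in [0, 1])
    return np.clip(np.abs(e1) * fa, 0.0, 1.0)

rgb_left_right = direction_to_rgb([-1.0, 0.0, 0.0], fa=0.8)  # pure red
rgb_oblique = direction_to_rgb([1.0, 1.0, 0.0], fa=0.5)      # red-green mix
print(rgb_left_right, rgb_oblique)
```

Weighting by FA darkens voxels with low anisotropy, where the principal direction is poorly defined.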

It is also possible to process tensor information using fibre tracking algorithms. These algorithms can be divided into two broad categories: streamline [12] and probabilistic tractography [13]. Fig. 1.7 shows a schematic describing a simple streamline tracking approach. These algorithms can be used to define three-dimensional representations of white matter fibre pathways. Probabilistic tractography algorithms are an extension of this approach. They apply statistical tools and measures of uncertainty to estimate the probability that two different brain regions are connected.
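The streamline idea can be illustrated with a toy two-dimensional example. This is a simplified sketch and not the algorithm of [12]: it advances in fixed-length steps rather than updating direction exactly at voxel boundaries, and the function name `track_streamline`, the step size, and the thresholds are all assumptions.

```python
import numpy as np

def track_streamline(seed, e1_field, fa_field, step=0.5, fa_min=0.2,
                     max_angle_deg=45.0, max_steps=1000):
    """Follow the principal eigenvector from a seed point through a 2-D voxel grid."""
    pos = np.asarray(seed, dtype=float)
    path = [pos.copy()]
    prev_dir = None
    cos_max = np.cos(np.radians(max_angle_deg))
    for _ in range(max_steps):
        idx = tuple(np.floor(pos).astype(int))   # voxel containing the point
        # Terminate if the tract has left the grid
        if any(i < 0 or i >= n for i, n in zip(idx, fa_field.shape)):
            break
        # Terminate if anisotropy drops below the threshold
        if fa_field[idx] < fa_min:
            break
        d = e1_field[idx] / np.linalg.norm(e1_field[idx])
        if prev_dir is not None:
            # Eigenvectors have no sign, so keep the heading consistent
            if np.dot(d, prev_dir) < 0:
                d = -d
            # Terminate if the change in direction is too great
            if np.dot(d, prev_dir) < cos_max:
                break
        pos = pos + step * d
        path.append(pos.copy())
        prev_dir = d
    return np.array(path)

# Toy data: fibres run along x; FA falls off for x >= 6
e1_field = np.zeros((10, 10, 2))
e1_field[..., 0] = 1.0
fa_field = np.full((10, 10), 0.8)
fa_field[6:, :] = 0.1

path = track_streamline((0.5, 5.5), e1_field, fa_field)
print(path[-1], len(path))
```

In this toy field the tract runs straight along x and stops at the first voxel whose FA is below the threshold.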

The sensitivity of DTI to subtle changes in the cellular microstructure such as axonal loss, demyelination, and/or inflammation makes it a valuable tool in the study of white matter disease [14]. Because of its non-invasive nature, DTI is also well suited for a host of research applications including the study of brain development [15] and aging [16].

Figure 1.6: (a) Trace(D) map, (b) FA map and (c) colour-coded fibre orientation map for a single slice of a normal human brain. These are the most commonly used metrics for evaluating white matter integrity and pathology. Note the strong contrast between white and grey matter in the FA map, and the lack of contrast in the trace map.

1.3 Image artifacts

The magnetic field in the scanner is constantly changing due to rapidly switching gradients. These changing magnetic fields induce electrical currents in conducting materials within the scanner. The induced currents are referred to as eddy currents, and result in magnetic field gradients whose direction is opposite to the change in field as shown in Fig. 1.8b and c [17]. Eddy currents build up during the time varying part of the gradient waveform and decay during stationary phases. The resulting waveform looks as though it has gone through a low-pass filter (Fig. 1.8d).
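The low-pass behaviour can be illustrated with a simplified model in which the eddy current gradient is the negative of the gradient's time derivative convolved with a single decaying exponential. The single time constant `tau`, the amplitude factor `alpha`, and the function name `eddy_current_gradient` are assumptions made for this sketch; real systems typically exhibit several time constants.

```python
import numpy as np

def eddy_current_gradient(G, dt, alpha=0.2, tau=1.0e-3):
    """Eddy current gradient for waveform G sampled every dt seconds.

    Assumed model: g(t) = -alpha * (dG/dt convolved with exp(-t / tau)).
    """
    dGdt = np.gradient(G, dt)                  # builds up while G is changing
    t = np.arange(len(G)) * dt
    kernel = np.exp(-t / tau)                  # decays while G is constant
    return -alpha * np.convolve(dGdt, kernel)[:len(G)] * dt

dt = 1.0e-5                  # 10 microsecond sampling
G = np.zeros(2000)           # 20 ms waveform, normalized amplitude
G[500:] = 1.0                # gradient switched on at 5 ms

g = eddy_current_gradient(G, dt)
G_effective = G + g          # gradient actually experienced by the spins
print(g.min(), G_effective[-1])
```

Immediately after the switch the eddy current term opposes the change, so the effective gradient rises gradually toward its nominal value instead of stepping instantly.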

Figure 1.7: Fibre tracking initiated at two different seed voxels. The grey value of each voxel indicates its Fractional Anisotropy. Starting at a seed point (*), fibre tracts follow the principal eigenvector of whatever voxel they are contained within. When they reach a voxel boundary, their direction is updated. Tracts are terminated when either the change in direction is too great, or the FA drops below a certain threshold [12].

The gradients resulting from eddy currents are usually classified according to their spatial dependence. $B_0$ eddy current gradients are spatially constant over the imaging volume. Linear eddy current gradients, $g_x(t)$, $g_y(t)$ and $g_z(t)$, vary linearly with position in the x, y, and z direction, respectively. Each of these terms represents a different type of phase error, reflected by their characteristic distortions shown in Fig. 1.9. The net effect is a combination of all of these distortion types.

The diffusion-weighting gradients are very large relative to those used in most other imaging sequences. When combined with the usual Echo Planar Imaging (EPI) readout and its associated low bandwidth in the phase-encoding direction, this makes DTI particularly susceptible to eddy current artifacts [17]. Because the tensor model is fit on a per-voxel basis, any translation or deformation of the imaging volume can cause the voxel-grouped measurements to originate from different physical locations. If the misalignment is too severe, it can render a data set useless.

To address this problem, Reese et al. developed a twice-refocused spin echo sequence (Fig. 1.10), which uses an additional refocusing pulse and carefully timed pulsed-gradients to null the dominant eddy-currents [18]. This substantially reduces image distortion relative to the standard PGSE sequence.

Figure 1.8: (a) Gradient waveform, $G_x(t)$, (b) the first derivative of the gradient waveform, $dG_x(t)/dt$ and (c) one of the induced eddy current terms, $g_x(t)$.

Like all forms of MRI, DTI is subject to partial volume effects. This can affect tensor estimation if voxels containing white matter also contain a mixture of grey matter and/or cerebrospinal fluid. Crossing, bending, and diverging fibres also pose a problem because the tensor model is insufficient to describe the diffusion profiles resulting from these complex geometries [19]. The detection of voxels for which the tensor model is invalid is the subject of Chapter 3.

1.4 Optimal imaging parameters

The quality of DTI results depends on many factors, including equipment, acquisition parameters, and post-processing methods. Much work has been done to optimize pulse sequence parameters and gradient orientations to minimize the effects of systematic noise and improve the accuracy of anisotropy and fibre orientation estimates. Jones et al. used first-order error propagation methods to derive an analytical expression of the error in the tensor parameters [20]. This allows solving for the optimal b-value with respect to the pulse sequence echo time (TE) and properties of the sample (T2-relaxation and the apparent diffusion coefficient). Alexander et al. examined the same problem using Monte Carlo simulations [21]. Both of these studies suggest that a b-value in the neighbourhood of 1000 s/mm² is optimal for healthy white matter, though higher b-values may be useful for elucidating more complex structures [21, 22, 23]. The ratio of the number of diffusion-weighted ($N_{DW}$) to non-diffusion-weighted ($N_{b0}$) images is also important. An optimal value for the ratio $N_{DW}$:$N_{b0}$ depends on the parameter that is being measured. For example, to estimate the principal direction of diffusion, only diffusion-weighted images should be acquired [21] (i.e. $N_{DW}$:$N_{b0}$ should be as high as possible). For estimating the fractional anisotropy and/or trace, ratios in the range of 5:1 [21] to 8:1 [20] have been proposed. It must be stressed that these and other “optimal” parameters apply only to the specific conditions under which they were designed. Even then, they make a number of assumptions which may not always be valid. They are intended to serve only as a guideline.

Figure 1.9: Distortion artifacts related to the different eddy current gradient terms. (a) For an Echo Planar Imaging (EPI) readout, the $B_0$ term causes an image shift in the phase-encode direction. The $g_z(t)$ term results in an image shift that is dependent on slice position. (b) The $g_y(t)$ term stretches or contracts the image along the phase-encode direction. (c) The $g_x(t)$ term produces image shear.

Figure 1.10: The twice-refocused spin echo sequence [18]. Gradient timings are adjusted to cancel out the dominant eddy current gradients.

There is an extensive body of research concerning the orientation of diffusion gradients [24, 25, 20, 26, 27, 28]. Because fibre orientation is unknown a priori, it is important that the gradient sampling scheme performs similarly across all possible fibre orientations. This property is referred to as rotational invariance. While it is impossible to design a truly rotationally invariant scheme [29], for a given number of gradient orientations, rotational bias is minimized by spreading out the gradient orientations evenly on a spherical shell. This is typically performed using either the electrostatic repulsion algorithm [20, 27] or using the vertices of a set of geometric volumes known as the platonic solids [25, 30].
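The idea behind electrostatic repulsion can be sketched as follows. This is not the implementation used in [20, 27]: it is a simple projected gradient descent with a decaying, normalized step, and the function names, iteration count, and step size are assumptions. Each direction is paired with its antipode because a diffusion measurement along g is equivalent to one along −g.

```python
import numpy as np

def min_angle_deg(v):
    """Smallest angle (degrees) between any two axes, ignoring sign."""
    c = np.abs(v @ v.T)
    np.fill_diagonal(c, 0.0)
    return np.degrees(np.arccos(np.clip(c.max(), 0.0, 1.0)))

def repulsion_directions(v0, n_iter=2000, step0=0.05):
    """Spread unit vectors over the sphere by Coulomb-like repulsion."""
    v = v0 / np.linalg.norm(v0, axis=1, keepdims=True)
    for k in range(n_iter):
        pts = np.vstack([v, -v])                     # charges at +v and -v
        diff = v[:, None, :] - pts[None, :, :]       # shape (n, 2n, 3)
        dist = np.linalg.norm(diff, axis=2)
        dist[dist == 0.0] = np.inf                   # ignore self-interaction
        force = (diff / dist[..., None]**3).sum(axis=1)
        # Keep only the component tangent to the sphere
        force -= np.sum(force * v, axis=1, keepdims=True) * v
        max_force = np.linalg.norm(force, axis=1).max()
        if max_force < 1e-12:
            break
        step = step0 * (1.0 - k / n_iter)            # shrink the step over time
        v = v + step * force / max_force
        v /= np.linalg.norm(v, axis=1, keepdims=True)
    return v

rng = np.random.default_rng(0)
start = rng.normal(size=(6, 3))
start /= np.linalg.norm(start, axis=1, keepdims=True)

directions = repulsion_directions(start)
print(min_angle_deg(start), min_angle_deg(directions))
```

The minimum angle between any two axes serves as a simple measure of how evenly the directions are spread; it should increase from the random starting set to the optimized set.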

While only six gradient directions are necessary to estimate all of the parameters of the diffusion tensor, additional directions can reduce rotational bias. Monte Carlo simulations by Papadakis et al. [27] show that increasing the number of gradient orientations reduces rotational bias in the mean and standard deviation of FA. Jones demonstrated that increasing the number of directions (with an equivalent total imaging time) reduces the rotational bias in fibre orientation, mean FA and trace [26]. Both of these studies reported that for fitting a single tensor model, there is a negligible benefit to using more than thirty directions.

1.5 Thesis statement

One aspect of DTI that has been largely neglected by the research community is the impact of data analysis on DTI results. Once diffusion-weighted images have been collected from the MR scanner, computer algorithms are used to fit the tensor model for each voxel in the data set. This thesis examines the implementation and differences between various fitting algorithms, and proposes a new method to compensate for signal bias at low SNR.

Chapter 2 addresses the following questions:

1. How important is the choice of fitting method to estimates of anisotropy and fibre orientation?

2. Under what experimental conditions are the differences between fitting algorithms most significant?

3. Do optimal imaging parameters depend on the choice of fitting algorithm?

Chapter 3 explores a method for identifying voxels for which the diffusion tensor model is invalid (i.e. those likely to contain complex fibre geometries). It extends an existing model-selection framework [31] through the development of nonlinear fitting algorithms for higher-order diffusion models. The resulting classifier is appropriate for high b-values (above 1000 s/mm²), for which complex fibre geometries are more easily resolved [19, 22].

Chapter 2

Comparison of least-squares fitting methods

2.1 Introduction

Since the introduction of Diffusion Tensor Imaging in 1994 by Basser et al. [9], much progress has been made in designing more robust and accurate DTI acquisition strategies. However, clear consensus within the DTI community is lacking with regard to the choice of fitting algorithm. Historically, most studies have used simple and efficient linear regression techniques [9, 25, 20, 26, 28]. A small minority have opted for more sophisticated and computationally expensive nonlinear methods [32, 27]. From a theoretical standpoint, nonlinear techniques are more sound. The added computational time has probably been the major barrier to their adoption. Furthermore, the long history of using linear regression methods in the field, combined with their wide availability in popular DTI software packages (e.g. DTI Studio [33] and FSL [34]), has also meant that a transition to nonlinear fitting has yet to gain wide acceptance. This leads to an important question:


How much of an impact does the choice of fitting algorithm have on DTI results? For b-values in the clinical range (b≈1000 s/mm²), Koay et al. recently demonstrated that nonlinear techniques offer a modest improvement in the accuracy of Fractional Anisotropy (FA) and trace measurements for simulated data [35]. For tractography applications, the principal direction of diffusion, ε1, is of critical importance, yet the effect of different fitting algorithms on the estimation of this parameter has not yet been studied.

Furthermore, bias introduced by the magnitude operation as signals approach the noise floor has been shown to affect DTI acquisitions with low SNR, high diffusivity and/or high b-values [32]. This chapter describes the implementation of a tensor fitting algorithm that compensates for this magnitude bias. This approach is compared to linear, nonlinear and weighted least-squares fitting methods using both simulations and in vivo data. The effect of fitting algorithms on the estimation of Fractional Anisotropy and fibre orientation over a range of SNR, b-value, and gradient orientation schemes is investigated to identify experimental conditions under which differences between fitting algorithms are significant.

Chapter 1 introduced several previous studies that examine optimization of the DTI experiment. These focused on the b-value [21, 20], the ratio of diffusion-weighted to non-diffusion-weighted images (NDW:Nb0) [21, 20], and the number of gradient orientations necessary to achieve relative rotational invariance [26, 27]. All but one of these studies [27] employ simulations and analytical expressions based on linear least-squares regression. This chapter reevaluates the optimization of these parameters using Monte Carlo simulations to determine whether or not the results are dependent on the choice of fitting algorithm.


2.2 Theory

The three fitting methods commonly used to fit diffusion tensor data are linear least-squares (LLS), weighted linear least-squares (WLLS), and nonlinear least-squares (NLS). These are sometimes referred to as linear, weighted linear, and nonlinear regression methods. All of these methods are constructed in a similar fashion. For each voxel in the imaging volume, there are six unique tensor elements as well as the non-diffusion-weighted signal value, S0, that need to be estimated. These parameters are represented by a 7×1 column vector:

$$ \mathbf{x} = [D_{xx}, D_{yy}, D_{zz}, D_{xy}, D_{xz}, D_{yz}, \ln(S_0)]^T \tag{2.1} $$

Each gradient sampling direction is represented by a three-dimensional unit vector, ri, which has x, y, and z components: rix, riy, and riz. Each image also has an associated b-value, bi. In most cases, this b-value is the same for all diffusion-weighted images, but it is also possible to use multiple b-values in a single experiment. One or more non-diffusion-weighted images are necessary to normalize the signal. These non-diffusion-weighted images are commonly referred to as b0 images. The gradient directions and b-values are used to construct an N×7 experimental design matrix, B. Each row of this matrix, Bi, corresponds to one of the N images.

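$$ \mathbf{B}_i = [-b_i r_{ix}^2,\; -b_i r_{iy}^2,\; -b_i r_{iz}^2,\; -2 b_i r_{ix} r_{iy},\; -2 b_i r_{ix} r_{iz},\; -2 b_i r_{iy} r_{iz},\; 1] \tag{2.2} $$

$$ \mathbf{B} = \begin{bmatrix} \mathbf{B}_1 \\ \vdots \\ \mathbf{B}_i \\ \vdots \\ \mathbf{B}_N \end{bmatrix} \tag{2.3} $$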

The same design matrix can be used for any of the fitting techniques. The linear least-squares, weighted linear least-squares and nonlinear least-squares algorithms are described in the following sections.
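As an illustration of Eqs. 2.1 to 2.3, the following Python sketch builds a design matrix for a small, made-up acquisition and evaluates the noise-free signal for an isotropic tensor. The function names, the six-direction scheme and the tensor values are assumptions chosen for the example and are not taken from the dwi-toolbox package. b-values are expressed in ms/µm² (1 ms/µm² = 1000 s/mm²) so that all matrix entries are of order one.

```python
import math

def design_matrix(directions, b_values):
    """Return the N x 7 design matrix B (Eqs. 2.2 and 2.3), one row per image."""
    rows = []
    for (rx, ry, rz), b in zip(directions, b_values):
        rows.append([
            -b * rx * rx, -b * ry * ry, -b * rz * rz,
            -2 * b * rx * ry, -2 * b * rx * rz, -2 * b * ry * rz,
            1.0,
        ])
    return rows

def model_signal(row, x):
    """Noise-free signal exp(B_i x) for one row of the design matrix."""
    return math.exp(sum(r * p for r, p in zip(row, x)))

# Hypothetical acquisition: one b0 image followed by six diffusion-weighted images.
s = 1.0 / math.sqrt(2.0)
directions = [(0, 0, 0),  # the direction is irrelevant for the b0 image
              (1, 0, 0), (0, 1, 0), (0, 0, 1), (s, s, 0), (s, 0, s), (0, s, s)]
b_values = [0.0] + [1.0] * 6  # ms/um^2

B = design_matrix(directions, b_values)

# Parameter vector x (Eq. 2.1) for an isotropic tensor with D = 0.7 um^2/ms and S0 = 100.
x = [0.7, 0.7, 0.7, 0.0, 0.0, 0.0, math.log(100.0)]
signals = [model_signal(row, x) for row in B]
print([round(v, 2) for v in signals])
```

For an isotropic tensor every diffusion-weighted signal is attenuated by the same factor, exp(−bD), regardless of gradient direction, which provides a quick sanity check on the row construction.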

2.2.1 Linear least-squares

Linear least-squares is both the simplest and most widely used fitting algorithm applied to DTI data. Measurements are first transformed by taking their natural logarithm. This converts the monoexponential diffusion signal (Eq. 1.14) into a simple linear relationship as illustrated by Fig. 2.1. If the log-transformed measurements are written as a column vector,

$$ \mathbf{Y} = [\ln(S_1), \ln(S_2), \cdots, \ln(S_N)]^T \tag{2.4} $$

the linear relationship between the design matrix, B, and the model parameters, x,

is described in matrix form by the equation:

$$ \mathbf{Y} = \mathbf{B}\mathbf{x} + \ln(\mathbf{e}) \tag{2.5} $$

The ln(e) term is a column vector representing the residuals between the tensor

model and the log-signal measurements. The linear least-squares algorithm seeks to

minimize the sum of the squares of these residual terms, thus solving for x. This


is accomplished by calculating the Moore-Penrose pseudoinverse [36] of the design

matrix, B+, and multiplying it by the log-transformed measurements.

$$ \mathbf{x} = (\mathbf{B}^T\mathbf{B})^{-1}\mathbf{B}^T\mathbf{Y} = \mathbf{B}^+\mathbf{Y} \tag{2.6} $$

One performance advantage of the LLS approach is that the pseudoinverse of the

design matrix only needs to be calculated once. It is then multiplied by the log-

transformed signal from each voxel in the image.

In order for the linear least-squares solution to be optimal, there are three necessary conditions. All of the residuals must be uncorrelated, have a mean of zero, and have equal variances [37]. Although the errors should be independent in most cases, satisfying the first condition, the second and third conditions are violated by a pair of operations performed on DTI data: the magnitude and logarithmic transforms.

Figure 2.1: Simulated diffusion signal (a) before and (b) after the log-transformation. The noisy data points give an indication of the noise behaviour across the full range of b-values (SNR=20 at b=0 s/mm²). The measured mean is positively biased due to the magnitude operation. These graphs represent an Apparent Diffusion Coefficient of 2×10⁻³ mm²/s, which is close to the maximum observed in healthy white matter.

Diffusion-weighted images have a real and imaginary component, though it is common for the MR reconstruction software to apply the magnitude operation (i.e. $|S_i| = \sqrt{\mathrm{Re}(S_i)^2 + \mathrm{Im}(S_i)^2}$). With sufficiently high SNR, noise follows a roughly Gaussian distribution, and it is common practice to treat it as such. In reality, the signal in magnitude images follows a Rician distribution [38], and measurements approaching the noise floor exhibit a positive bias. This is clearly illustrated in Fig. 2.1. Although this problem exists in many other fields, its applicability to MRI was first demonstrated by Henkelman [39]. More recently, Jones and Basser examined the effect of magnitude bias in the context of DTI [32]. Overestimation of the diffusion-weighted signal causes an underestimation of the Apparent Diffusion Coefficient (ADC), and since this effect is most prominent for those measurements with a high ADC, it leads to a "squashed peanut" effect [32].
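The positive bias of the magnitude operation can be reproduced with a short simulation. The Python sketch below is only an illustration: the function name, the seed and the sample size are arbitrary choices, and the noise standard deviation is set to one so that the true signal equals the SNR.

```python
import math
import random

def mean_magnitude(true_signal, sigma, n, rng):
    """Average magnitude of n complex measurements with Gaussian noise on both channels."""
    total = 0.0
    for _ in range(n):
        real = true_signal + rng.gauss(0.0, sigma)
        imag = rng.gauss(0.0, sigma)
        total += math.hypot(real, imag)  # magnitude operation
    return total / n

rng = random.Random(0)
sigma = 1.0
for true_signal in (20.0, 5.0, 2.0, 1.0, 0.0):
    measured = mean_magnitude(true_signal, sigma, 20000, rng)
    print(f"true = {true_signal:5.1f}   mean magnitude = {measured:6.3f}   "
          f"bias = {measured - true_signal:+.3f}")
```

At high SNR the mean magnitude is close to the true signal, whereas with no signal at all it settles near σ√(π/2), the mean of the noise-only magnitude distribution.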

The logarithmic transformation also introduces other problems to the linear least-squares solution. This operation results in the variance being dependent on the magnitude of the signal, and therefore on the b-value and Apparent Diffusion Coefficient (ADC). At low SNR, noise is amplified as demonstrated in Fig. 2.1b. The expected value for a log-transformed signal is also negatively biased and the distribution is no longer symmetric about the mean. These effects become increasingly important as SNR decreases.

Considering the properties of both operations, the linear least-squares algorithm can be expected to perform reasonably well at high SNR and in voxels where the signal magnitude is relatively consistent across the entire set of measurements (nearly isotropic diffusion). The following algorithms aim to address the limitations inherent in the linear least-squares approach.


2.2.2 Weighted linear least-squares

The weighted linear least-squares algorithm improves on the standard LLS method by accounting for differences in variance about each of the signal measurements. Following the application of the log-transform, the variance of each measurement is approximately equal to the square of the noise divided by the square of the signal magnitude, so each measurement is weighted by the reciprocal of this quantity. These weights form the diagonal of the matrix W, which is the inverse of the covariance matrix of the log-transformed data. Because individual measurements are independent, all of the off-diagonal terms are equal to zero.

$$ \mathbf{W} = \mathrm{diag}(S_i^2/\sigma_i^2) \tag{2.7} $$

The model parameters are then calculated according to the equation:

$$ \mathbf{x} = (\mathbf{B}^T\mathbf{W}\mathbf{B})^{-1}(\mathbf{B}^T\mathbf{W})\mathbf{Y} \tag{2.8} $$

Notice that Eq. 2.7 includes noise terms, σi. Although it is possible to estimate this parameter, if the noise can be assumed to be consistent for all measurements, its value will have no effect on the fit. Therefore, it can simply be set to one. Because the diagonal entries of W depend on noisy signal measurements, it is advantageous to recalculate W following the initial fit with the updated signal estimates:

$$ \mathbf{S} = \exp(\mathbf{B}\mathbf{x}) \tag{2.9} $$

Performing a second iteration of Eq. 2.8 with the updated matrix W produces a better solution. Although the weighted least-squares method compensates for the unequal variance caused by the logarithmic transform, it does not address the non-symmetric (negatively skewed) residuals. It is also susceptible to the magnitude bias.

Computational demands for the WLLS algorithm are slightly higher than those

of LLS because pseudoinversion of the weighted design matrix must be performed

twice for each voxel in the image, whereas for LLS, the pseudoinverse only needs to

be calculated once for the entire experiment.
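The LLS and WLLS fits of Eqs. 2.6 to 2.9 can be sketched in a few lines of standard-library Python. This is not the Matlab implementation used in this work: for brevity it solves the normal equations by Gaussian elimination rather than forming a pseudoinverse, and the acquisition scheme, tensor values and function names are hypothetical. Units are ms/µm² for b and µm²/ms for the tensor elements.

```python
import math

def solve(A, y):
    """Solve the square system A z = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [A[i][:] + [y[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        z[i] = (M[i][n] - sum(M[i][j] * z[j] for j in range(i + 1, n))) / M[i][i]
    return z

def weighted_fit(B, Y, w):
    """Solve (B^T W B) x = B^T W Y (Eq. 2.8) with W = diag(w)."""
    n, p = len(B), len(B[0])
    BtWB = [[sum(w[i] * B[i][a] * B[i][c] for i in range(n)) for c in range(p)]
            for a in range(p)]
    BtWY = [sum(w[i] * B[i][a] * Y[i] for i in range(n)) for a in range(p)]
    return solve(BtWB, BtWY)

def lls_fit(B, S):
    """Unweighted fit to the log-signal (Eq. 2.6)."""
    return weighted_fit(B, [math.log(s) for s in S], [1.0] * len(S))

def wlls_fit(B, S):
    """Two-pass weighted fit: weights from the data, then from the fitted signal."""
    Y = [math.log(s) for s in S]
    x = weighted_fit(B, Y, [s * s for s in S])                 # Eq. 2.7 with sigma = 1
    S_hat = [math.exp(sum(r * p for r, p in zip(row, x))) for row in B]  # Eq. 2.9
    return weighted_fit(B, Y, [s * s for s in S_hat])

def design_row(b, r):
    """One row of the design matrix (Eq. 2.2)."""
    rx, ry, rz = r
    return [-b * rx * rx, -b * ry * ry, -b * rz * rz,
            -2 * b * rx * ry, -2 * b * rx * rz, -2 * b * ry * rz, 1.0]

# Hypothetical acquisition: one b0 image and nine directions at b = 1 ms/um^2.
s = 1.0 / math.sqrt(2.0)
dirs = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (s, s, 0), (s, -s, 0),
        (s, 0, s), (s, 0, -s), (0, s, s), (0, s, -s)]
B = [design_row(0.0, (0, 0, 0))] + [design_row(1.0, r) for r in dirs]

x_true = [1.5, 0.4, 0.3, 0.1, 0.0, -0.05, math.log(100.0)]  # um^2/ms and ln(S0)
S = [math.exp(sum(r * p for r, p in zip(row, x_true))) for row in B]  # noise-free
print([round(v, 4) for v in wlls_fit(B, S)])
```

With noise-free data both fits return the original parameters; the two methods only differ once noise is added, because the weights then change the relative influence of each measurement.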

2.2.3 Nonlinear least-squares

The nonlinear least-squares method differs from both WLLS and LLS by fitting the model directly to the diffusion-weighted signal without any need for the logarithmic transform. As a result, residuals have equal variance and an expected value equal to zero in all situations where the magnitude bias is not significant. In matrix form, the signal equation is:

$$ \mathbf{S} = [S_1, S_2, \cdots, S_N]^T \tag{2.10} $$

$$ \mathbf{S} = \exp(\mathbf{B}\mathbf{x}) + \mathbf{e} \tag{2.11} $$

A problem with nonlinear fitting is that there is no analytical solution. Since it is not possible to solve the system of equations algebraically, an iterative minimization of the sum of squared residuals is performed.

$$ \min \sum_{i=1}^{N} e_i^2 = \min \sum_{i=1}^{N} [S_i - \exp(\mathbf{B}_i\mathbf{x})]^2 \tag{2.12} $$

A consequence of nonlinearity is that there is no guarantee of finding the global minimum to the optimization problem. The search is usually initialized with either the LLS or WLLS solution, ensuring that the NLS solution is at least as good. The NLS algorithm also requires an efficient means for searching the parameter space. Several parameter search algorithms are available. The Levenberg-Marquardt algorithm is widely used in the DTI community [27, 32, 40], though it has been suggested that others may perform moderately better [41].

2.2.4 Correcting for magnitude bias

Jones and Basser proposed a correction scheme that introduces an additional noise-estimation parameter to the nonlinear fitting algorithm [32]. This method was used to independently fit ADCs along a single direction using multiple b-values. By adopting the same objective function, $f_{\mathrm{MCNLS}}$ (Eq. 2.13), it is possible to fit the tensor model. In addition, the noise parameter can be measured from a background region of the image and incorporated as a fixed value. This reduces the number of model parameters, improving the accuracy of the fit and computational efficiency.

$$ f_{\mathrm{MCNLS}}(\mathbf{x}) = \sum_{i=1}^{N} \left[ S_i - \sqrt{\exp^2(\mathbf{B}_i\mathbf{x}) + \sigma^2} \right]^2 \tag{2.13} $$

The MR signal consists of two independent channels, one real and one imaginary.

Assuming that they are both normally distributed with a standard deviation of σ, it

is possible to estimate σ from a background region of the magnitude image using the

following equation [39]:

$$ \sigma = \frac{\mathrm{mean}(S_{\mathrm{background}})}{\sqrt{\pi/2}} \tag{2.14} $$
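Eq. 2.14 can be checked numerically by simulating a signal-free background region. In the Python sketch below, the true noise level, the random seed and the function name are arbitrary choices for illustration; the sample size of 21 600 matches the number of background voxels used in section 2.3.1.

```python
import math
import random

def estimate_sigma(background):
    """Eq. 2.14: sigma = mean(S_background) / sqrt(pi / 2)."""
    return (sum(background) / len(background)) / math.sqrt(math.pi / 2.0)

# Simulated background: magnitude of pure complex Gaussian noise (no signal).
rng = random.Random(1)
true_sigma = 3.0
background = [math.hypot(rng.gauss(0.0, true_sigma), rng.gauss(0.0, true_sigma))
              for _ in range(21600)]
sigma_hat = estimate_sigma(background)
print(f"true sigma = {true_sigma}, estimated sigma = {sigma_hat:.3f}")
```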

In regions of high SNR, $\sqrt{\exp^2(\mathbf{B}_i\mathbf{x}) + \sigma^2} \approx \exp(\mathbf{B}_i\mathbf{x})$, and the magnitude-corrected fit reduces to the standard nonlinear least-squares fit.
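The two objective functions of Eqs. 2.12 and 2.13 differ only in the model term, which the following Python sketch makes explicit. It evaluates the objectives rather than minimizing them (no Levenberg-Marquardt search is included), and the single-direction example, the parameter values and the function names are assumptions made for illustration.

```python
import math

def model_signal(row, x):
    """Noise-free signal exp(B_i x) for one row of the design matrix."""
    return math.exp(sum(r * p for r, p in zip(row, x)))

def f_nls(B, S, x):
    """Sum of squared residuals of Eq. 2.12."""
    return sum((s - model_signal(row, x)) ** 2 for row, s in zip(B, S))

def f_mcnls(B, S, x, sigma):
    """Magnitude-corrected objective of Eq. 2.13 with a fixed noise level sigma."""
    return sum((s - math.sqrt(model_signal(row, x) ** 2 + sigma ** 2)) ** 2
               for row, s in zip(B, S))

# Hypothetical example: diffusion along x only, S0 = 100, sigma = 5 (SNR = 20 at b = 0).
x = [1.5, 0.0, 0.0, 0.0, 0.0, 0.0, math.log(100.0)]   # Dxx = 1.5 um^2/ms
sigma = 5.0
for b in (0.0, 1.0, 3.0):                              # ms/um^2
    row = [-b, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0]           # gradient along x
    clean = model_signal(row, x)
    corrected = math.sqrt(clean ** 2 + sigma ** 2)
    print(f"b = {b:.0f}: exp(B_i x) = {clean:7.3f}, corrected model = {corrected:7.3f}")
```

The corrected and uncorrected model values are nearly identical at b = 0, but diverge as the signal approaches the noise floor at high b-values.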


2.2.5 Comparing fitting algorithms

The chi-squared statistic, χ², is a measure of the overall "goodness" of fit [42]. A lower chi-squared value indicates better agreement between the data and the model. If the model is appropriate and the noise term, σ, is known, the expected value for χ² is equal to the degrees of freedom in the system, ν. The degrees of freedom is the number of measurements minus the number of model parameters (i.e. for fitting the diffusion tensor model, ν = N − 7). Normalizing χ² by the degrees of freedom gives the reduced chi-squared statistic, $\chi^2_r$. This reduced form has the advantage that it allows comparison between fits with different numbers of measurements and/or model parameters. The expected value of $\chi^2_r$ is equal to one.

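$$ \chi^2 = \sum_{i=1}^{N} \frac{[S_i - \exp(\mathbf{B}_i\mathbf{x})]^2}{\sigma_i^2} = \frac{1}{\sigma^2} \sum_{i=1}^{N} e_i^2 \tag{2.15} $$

$$ \chi^2_r = \frac{\chi^2}{\nu} = \frac{\chi^2}{(N - 7)} \tag{2.16} $$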

In addition to its expected value, the distribution of the chi-squared statistic is also well characterized [42]. This makes it possible to calculate the probability of an obtained chi-squared value. For the purpose of comparing different fitting algorithms, assuming that the model is correct and using the same data, the best algorithm is the one that produces the smallest chi-squared value on average.
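As a worked example of Eqs. 2.15 and 2.16, the Python sketch below computes the reduced chi-squared for a made-up voxel in which every residual is exactly one noise standard deviation. The function name and the data are hypothetical, and a single noise level σ is assumed for all measurements.

```python
def reduced_chi_squared(S, S_model, sigma, n_params=7):
    """Reduced chi-squared (Eqs. 2.15 and 2.16) for a common noise level sigma."""
    nu = len(S) - n_params                      # degrees of freedom
    if nu <= 0:
        raise ValueError("need more measurements than model parameters")
    chi2 = sum((s - m) ** 2 for s, m in zip(S, S_model)) / sigma ** 2
    return chi2 / nu

# Hypothetical voxel: 25 images, every residual exactly one noise standard deviation.
sigma = 5.0
S_model = [100.0 - 3.0 * i for i in range(25)]
S = [m + sigma for m in S_model]
print(reduced_chi_squared(S, S_model, sigma))   # chi2 = 25, nu = 18
```

The result exceeds one because this artificial data set has 25 residuals of size σ but only 18 degrees of freedom; a real fit absorbs part of the noise into the seven fitted parameters.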

Using computer simulations, the error in Fractional Anisotropy (FA) and fibre orientation, ε1, can be determined because their true values are known. Metrics such as the root mean square error (RMSE) or mean absolute error (MAE) can therefore be used to compare the relative performance of different fitting algorithms. In the case of in vivo experiments, it is not possible to calculate these metrics because the ground truth is unavailable. Instead, the variation in these parameters over repeated experiments allows for estimation of experimental precision.

The variables σFA and σε1 represent the experimental precision in FA and fibre orientation. σFA is defined as the standard deviation in FA measurements over repeated experiments. σε1 is the root mean squared angular difference between the measured principal eigenvector, ε1, and the dyadic tensor average, 〈ε1〉 [43]. This is a parametric analogue to Jones' cone of uncertainty [44], where 2σε1 corresponds to a 95% directional confidence interval. Mathematical expressions for these parameters are:

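$$ \sigma_{FA} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} (FA_i - \langle FA \rangle)^2} \tag{2.17} $$

$$ \sigma_{\varepsilon_1} = \sqrt{\frac{1}{n-1} \sum_{i=1}^{n} \left( \arccos |\varepsilon_{1i} \cdot \langle \varepsilon_1 \rangle| \right)^2} \tag{2.18} $$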

When comparing the performance of different fitting algorithms for the same data, the best method is the one that minimizes $\chi^2_r$, σFA, and/or σε1.
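Eqs. 2.17 and 2.18 can be sketched as follows. The dyadic tensor average is taken here to be the principal eigenvector of the mean of the outer products ε1 ε1ᵀ, found by power iteration; this reading of [43], the function names and the two-vector example are assumptions made for illustration.

```python
import math

def sigma_fa(fa_values):
    """Sample standard deviation of FA over repeated experiments (Eq. 2.17)."""
    n = len(fa_values)
    mean = sum(fa_values) / n
    return math.sqrt(sum((fa - mean) ** 2 for fa in fa_values) / (n - 1))

def mean_dyadic_direction(vectors, iterations=200):
    """Principal eigenvector of the mean dyadic tensor <e e^T>, by power iteration."""
    n = len(vectors)
    M = [[sum(v[a] * v[b] for v in vectors) / n for b in range(3)] for a in range(3)]
    e = list(vectors[0])                      # assumed close to the dominant direction
    for _ in range(iterations):
        e = [sum(M[a][b] * e[b] for b in range(3)) for a in range(3)]
        norm = math.sqrt(sum(c * c for c in e))
        e = [c / norm for c in e]
    return e

def sigma_e1(vectors):
    """Angular precision in radians (Eq. 2.18); the sign of each eigenvector is ignored."""
    mean_dir = mean_dyadic_direction(vectors)
    total = 0.0
    for v in vectors:
        cos_angle = min(1.0, abs(sum(a * b for a, b in zip(v, mean_dir))))
        total += math.acos(cos_angle) ** 2
    return math.sqrt(total / (len(vectors) - 1))

# Hypothetical repeated measurements: two eigenvectors 10 degrees either side of z,
# one of them reported with its sign flipped.
theta = math.radians(10.0)
e1 = [(math.sin(theta), 0.0, math.cos(theta)), (math.sin(theta), 0.0, -math.cos(theta))]
print(math.degrees(sigma_e1(e1)), sigma_fa([0.5, 0.6, 0.7]))
```

Averaging the dyadic products rather than the vectors themselves avoids cancellation between eigenvectors that point in opposite directions but describe the same fibre axis.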

2.2.6 Optimal b-value

The most common approach to collecting DTI data is to acquire one or more b0 images, followed by a set of diffusion-weighted images at a single b-value. This can be thought of as acquiring points on a spherical shell. Increasing the b-value improves contrast between signal measurements, but it also results in a loss of SNR, because the diffusion signal decays in proportion to exp(−bD). Several groups have examined this trade-off in the context of tensor estimation [21, 20]. They reported that the optimal b-value depends heavily on the mean diffusivity, and to a lesser extent, the anisotropy. For healthy white matter, a b-value in the neighbourhood of 1000 s/mm² is recommended.

For the standard Stejskal-Tanner PGSE sequence (Fig. 1.2), b is a function of three pulse sequence parameters: the pulsed gradient amplitude, G, gradient pulse width, δ, and the diffusion time, ∆. Assuming negligible ramp times (i.e. high slew rates), this relationship is described by Eq. 1.9. There are several important considerations when increasing the value of b. First of all, the pulse width should be kept as small as possible to minimize the amount of diffusion that occurs during application of the gradients, enabling use of the short pulse-width approximation [5]. Secondly, b varies linearly with the diffusion time, so increasing the diffusion time has less of an impact relative to changing the gradient width or amplitude, which both have a second-order relationship. Increasing the diffusion time also results in a longer echo time (TE), which leads to reduced SNR due to T2-relaxation effects. Finally, if the diffusion time is too long, intra/extracellular exchange and restricted diffusion effects can become significant [45]. For these reasons, it is preferable to increase b by adjusting the gradient amplitude parameter, G. However, the maximum available amplitude depends on scanner hardware, and this limitation is commonly reached. Therefore, achieving very high b-values typically involves some form of compromise [7].
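Eq. 1.9 is not reproduced in this section. The Python sketch below assumes that it has the standard Stejskal-Tanner form for rectangular gradient pulses, b = γ²G²δ²(∆ − δ/3), and uses illustrative timings that are not the parameters of the protocol used in this work.

```python
GAMMA = 2.675e8  # proton gyromagnetic ratio in rad s^-1 T^-1

def b_value(G, delta, Delta):
    """b-value in s/mm^2 for rectangular pulses (assumed form of Eq. 1.9).

    G is the gradient amplitude in T/m; delta and Delta are in seconds.
    """
    b_si = (GAMMA * G * delta) ** 2 * (Delta - delta / 3.0)   # s/m^2
    return b_si * 1e-6                                        # s/mm^2

base = b_value(0.040, 0.020, 0.040)          # 40 mT/m, delta = 20 ms, Delta = 40 ms
print(round(base))
print(b_value(0.080, 0.020, 0.040) / base)   # effect of doubling G
print(b_value(0.040, 0.020, 0.080) / base)   # effect of doubling Delta
```

Under this assumed form, doubling G quadruples b, while doubling ∆ roughly doubles it; the scaling with ∆ is only exactly linear in the short pulse-width limit, where the δ/3 term is negligible.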

In order to evaluate the performance of tensor estimation across a range of b-values, it is important to compensate for changes in relative SNR caused by T2-relaxation and variable echo times. While previous studies use analytical models relating b to the minimum achievable TE [21, 20], this relationship is more convoluted for the twice-refocused spin echo sequence [18] commonly used to reduce eddy current artifacts. For this reason, minimum TE values were collected directly from the 3T GE Signa console. For a given T2, SNR was normalized by exp(−TE/T2) in simulations to account for T2-relaxation effects.

Higher b-values are also associated with an increased scan time due to longer TRs. For a fixed imaging time, this limits the total number of images that can be acquired. Previous groups have ignored this effect [21, 20], but assuming that the SNR is proportional to the square root of the total imaging time, SNR in simulations can be normalized by √TR. The minimum TE, TR and the relative SNR factors are shown in Fig. 2.2 for a T2 of 80 ms, which corresponds to healthy white matter [46].

Figure 2.2: SNR normalization curves for the 3T GE Signa scanner based on scan parameters described in section 2.3. (a) Minimum TE for a range of b-values. (b) Minimum TR for a range of b-values. (c) SNR relative to b=1000 s/mm² for a T2 of 80 ms, accounting for changes in minimum TE only, and for a combination of minimum TE and minimum TR.
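The two normalization factors can be written as a short function. The Python sketch below reflects one reading of the text: the TE factor is the ratio of exp(−TE/T2) terms, and the TR factor is taken as √(TRref/TR), on the assumption that a longer TR reduces the number of images that fit into a fixed imaging time. The reference TE and TR are those quoted in section 2.3.1 for b=1000 s/mm²; the second protocol is hypothetical.

```python
import math

def relative_snr(te, tr, te_ref, tr_ref, t2=80.0):
    """SNR factors relative to a reference protocol (all times in ms).

    Returns (TE-only factor, combined TE and TR factor).
    """
    te_factor = math.exp(-(te - te_ref) / t2)   # T2 decay relative to the reference TE
    tr_factor = math.sqrt(tr_ref / tr)          # fewer images fit in a fixed scan time
    return te_factor, te_factor * tr_factor

# Reference: TE = 84.5 ms and TR = 12 000 ms at b = 1000 s/mm^2 (section 2.3.1).
# The second protocol is hypothetical and only illustrates the trend at a higher b-value.
print(relative_snr(84.5, 12000.0, 84.5, 12000.0))
print(relative_snr(100.0, 14000.0, 84.5, 12000.0))
```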

2.2.7 Number of non-diffusion-weighted images

Non-diffusion-weighted images are extremely important because they provide the reference to which all diffusion-weighted images are normalized. As such, this measurement has a large impact on all of the tensor parameters. Acquiring multiple b0 images can improve tensor precision, but for a fixed imaging time, this comes at the cost of reducing the number of gradient directions. The number of b0 images can be expressed as a ratio of the number of diffusion-weighted images to non-diffusion-weighted images, NDW:Nb0, allowing this attribute to scale for experiments with a different number of total image acquisitions.

Jones et al. tested a range of NDW:Nb0 ratios in a water phantom experiment [20]. Their results indicated that a ratio of 8.33 minimized the standard deviation in the trace of the diffusion tensor, and minimized the Fractional Anisotropy (for a water phantom, FA should be equal to zero). However, this NDW:Nb0 ratio may not be optimal for a clinical DTI exam, since the experiment measured isotropic diffusion in fluid water and used a relatively small b-value (b=453 s/mm²). Alexander et al. performed Monte Carlo simulations under conditions typical of clinical practice [21]. For determining the principal eigenvector, they showed that the optimal strategy is to collect no b0 images at all. This is probably because the eigenvector does not depend on the value of the tensor parameters, just their relative sizes (i.e. normalization is not important). For a good trade-off between directional information and size and shape indices, they suggested an NDW:Nb0 ratio of about 5. Both of these experiments used the linear least-squares fitting algorithm.

2.2.8 Rotational bias

Rotational bias refers to differences in the accuracy and precision of tensor-based parameter estimation relative to fibre orientation. These differences are a form of systematic error, i.e. they are not simply the product of random noise fluctuations, but have an underlying, repeatable cause. Rotational bias applies to all tensor parameters, but its effects on Fractional Anisotropy and fibre orientation are the most significant for the majority of applications. There are two categories of rotational bias to consider. First, there is rotational bias in the mean. This is a question of accuracy, and whether or not the level of accuracy depends on the fibre orientation. The second type has to do with the degree of uncertainty. In other words, is the precision in Fractional Anisotropy or fibre orientation dependent on fibre orientation?

Papadakis et al. examined both types of bias with respect to anisotropy indices [27]. They showed that rotational bias in the mean value of these indices was relatively small compared to rotationally dependent differences in precision. Both were reduced with the addition of more gradient directions, but there was negligible benefit beyond 18-21 directions. Interestingly, this early study employed nonlinear fitting methods, but the question remains as to whether or not the results would have been different using LLS. It is also important to point out that the b-value used in these simulations, b=1570 s/mm², is higher than that used in standard clinical practice. Finally, this study makes no mention of rotational bias in fibre orientation estimates.

This second question was addressed by simulations performed by Jones [26]. These were designed to examine the trade-off between increasing the number of gradient directions versus acquiring a smaller number of directions multiple times, for an equivalent total imaging time. This study used LLS fitting and b=1000 s/mm². Increasing the number of unique directions dramatically reduced rotational bias in both mean Fractional Anisotropy and uncertainty in the principal eigenvector, and the bias became increasingly significant at higher FA values. Jones concluded that to reduce rotational bias effects to negligible levels, 20 directions were necessary for FA and 30 for the fibre orientation.


2.3 Methods

2.3.1 Fitting performance under clinical conditions

Experimental DTI data were obtained from 3 healthy volunteers using a 3T GE Signa system. Imaging parameters were as follows: 23 gradient orientations based on the electrostatic-repulsion algorithm [20], b=1000 s/mm², 2 b0 images, 2.6 mm isotropic voxels, 48 slices, and a single excitation EPI readout. The maximum gradient amplitude was 40 mT/m, and echo time and repetition time were 84.5 and 12 000 ms respectively. The scan was repeated eight times to estimate experimental precision in tensor parameters for a total scan time of 45 minutes. A twice-refocused spin echo sequence was used to reduce eddy-current effects [18]. The noise parameter, σ, was calculated from a 15×15 background region using all b0 images across all slices (15×15×2×48 = 21 600 voxels). The region was carefully placed to avoid zero-padding at the edges of the field of view, as well as regions potentially affected by Nyquist ghosting [17]. All analysis was performed using Matlab (Mathworks, Natick, MA) and the immoptibox fitting routines [47]. Complete source code is available as the dwi-toolbox package [48].

In order to facilitate comparison with simulations across a range of anisotropy values, whole-brain masks were designed to avoid voxels likely to be affected by susceptibility artifacts and those containing CSF. First, b0 images were thresholded at S0 > 13σ and a 3×3 median filter was applied to remove isolated voxels [49]. This was the minimum threshold for which no voxels were selected from outside of the brain. An upper threshold of S0 > 30σ was used to select voxels containing CSF [31]. In this case, the threshold was increased as far as possible such that the ventricles were clearly removed. A 3×3 median filter was also applied to the CSF mask. Finally, edge detection was performed on the resulting binary mask, and this edge mask was convolved with a 3×3 box function. This defined a mask of boundary voxels, i.e. brain/air, brain/bone or brain/ventricle borders. These voxels were removed in order to reduce the impact of susceptibility artifacts and partial volume effects on the whole-brain analysis. A graphical representation of this process is shown in Fig. 2.3. Note that although this masking algorithm is quite aggressive in removing voxels from analysis, its purpose is to automatically define a large set of voxels for which artifacts are small. All of the fitting algorithms were compared using the same data. The goal of this experiment was to evaluate the differences between fitting algorithms in a best-case scenario.

Figure 2.3: (a) Binary brain mask (SNR>13) filtered with a 3×3 median filter. (b) CSF mask (SNR>30) filtered with a 3×3 median filter. (c) Brain mask minus CSF. (d) Edge detection of c convolved with a 3×3 box function. (e) c minus d.

Diffusion tensors were fit for all voxels using the LLS, WLLS, NLS and MCNLS routines. Reduced chi-squared ($\chi^2_r$), Fractional Anisotropy (FA), and fibre orientation (ε1) were calculated for each voxel. The means and standard deviations of these parameters were estimated over the set of eight repeated measurements. Voxels were binned by their mean Fractional Anisotropy values (as calculated by NLS) over the eight repetitions, with bins centered at 0.1, 0.2, 0.3, ..., 0.9.

Monte Carlo simulations were performed across the same range of FA values for comparison. The b-value, number of b0 images, and gradient orientations were matched to the experiment. SNR of the b0 images was set to 20. For each FA value, 100 instances of a reference tensor, D0, with trace 2.1 µm²/ms, were oriented uniformly in 3-D space. Diffusion-weighted signals were calculated and complex noise was added in quadrature. The simulation was repeated 1 000 times producing 100 000 data sets (100 orientations × 1 000 repetitions) at each FA value.
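One way to generate a single simulated measurement is sketched below in Python. The text does not specify the shape of the reference tensor, so the sketch assumes an axially symmetric tensor whose eigenvalues are derived from the target FA and trace; the function names and the choice of seed are also assumptions. The trace is expressed in µm²/ms (2.1 µm²/ms is equivalent to 2.1×10⁻³ mm²/s) and b in ms/µm².

```python
import math
import random

def eigenvalues_from_fa(fa, trace):
    """Eigenvalues (parallel, perpendicular, perpendicular) of an axially symmetric tensor."""
    md = trace / 3.0
    a = fa / math.sqrt(3.0 - 2.0 * fa * fa)
    return md * (1.0 + 2.0 * a), md * (1.0 - a), md * (1.0 - a)

def fractional_anisotropy(l1, l2, l3):
    """Fractional Anisotropy computed from three eigenvalues."""
    md = (l1 + l2 + l3) / 3.0
    num = (l1 - md) ** 2 + (l2 - md) ** 2 + (l3 - md) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    return math.sqrt(1.5 * num / den)

def noisy_magnitude(s0, b, gradient, fibre, eigenvalues, sigma, rng):
    """Magnitude signal for one gradient direction with complex Gaussian noise."""
    l_par, l_perp, _ = eigenvalues
    cos_angle = sum(g * f for g, f in zip(gradient, fibre))
    adc = l_perp + (l_par - l_perp) * cos_angle ** 2      # g^T D g
    clean = s0 * math.exp(-b * adc)
    return math.hypot(clean + rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))

rng = random.Random(0)
eig = eigenvalues_from_fa(0.9, 2.1)          # trace in um^2/ms
s0, sigma = 100.0, 5.0                       # SNR = 20 for the b0 images
along = noisy_magnitude(s0, 1.0, (0, 0, 1), (0, 0, 1), eig, sigma, rng)
across = noisy_magnitude(s0, 1.0, (1, 0, 0), (0, 0, 1), eig, sigma, rng)
print(eig, along, across)
```

Repeating the last step over all gradient directions, fibre orientations and noise realizations yields the simulated data sets to which each fitting algorithm is applied.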


Monte Carlo simulations were performed as in the previous section over a range of b-values from b=300 to 3000 s/mm² and FA values from 0 to 0.9. The full set of simulations was repeated three times: once with no SNR normalization (i.e. the same SNR for all b-values), once normalizing for minimum TE, and once normalizing for the minimum TE and TR (based on Fig. 2.2). Optimal b-values were defined as those which minimized the root mean squared error in Fractional Anisotropy (FA) or fibre orientation. b-values for which the root mean squared error was within 5% of its optimal value were also calculated, giving the range of near optimal performance.
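The selection of the optimal b-value and its near-optimal range amounts to a simple search over the simulated error curve. In the Python sketch below, the function name is hypothetical and the error values are invented purely to exercise the code; they are not results from this work.

```python
def near_optimal_range(b_values, rmse, tolerance=0.05):
    """Return (optimal b, lowest b, highest b) with RMSE within `tolerance` of the minimum."""
    best = min(rmse)
    b_opt = b_values[rmse.index(best)]
    within = [b for b, e in zip(b_values, rmse) if e <= (1.0 + tolerance) * best]
    return b_opt, min(within), max(within)

# Invented error curve, for illustration only.
b_values = [500, 1000, 1500, 2000]          # s/mm^2
rmse_fa = [0.060, 0.050, 0.052, 0.058]
print(near_optimal_range(b_values, rmse_fa))
```

This sketch reports only the extremes of the set of near-optimal b-values, which is adequate when the error curve has a single minimum.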


Simulations were performed for a total of N=25 images, with the number of b0 images (Nb0) incremented from 1 to 10. The remaining images (NDW = N − Nb0) were used for diffusion-weighting at a b-value of 1000 s/mm². The optimal number of b0 images was defined as that which minimized the root mean squared error in FA or ε1. As in the previous experiment, the range of Nb0 values for which the root mean squared error was within 5% of its optimal value was also calculated.



Following Jones [26], a total of 60 diffusion-weighted images were broken into seven possible scenarios: 6, 10, 12, 15, 20, 30, and 60 unique directions, with 10, 6, 5, 4, 3, 2, and 1 NEX respectively. In each case, 10 b0 images were simulated, resulting in an NDW:Nb0 ratio of 6:1. Each experiment represents an equivalent total imaging time: 60 DW images plus 10 b0 images, for a total of 70 images. The remaining simulation parameters were as follows: b=1000 s/mm², trace=2.1 µm²/ms, and SNR of the b0 images was set to 20.

For each value of FA between 0 and 0.9, 10 000 repetitions were performed over 100 evenly spaced tensor orientations [50]. More repetitions were required relative to previous experiments to obtain adequate precision in parameter values at each tensor orientation. Simulated data were fit using the LLS, WLLS, NLS and MCNLS algorithms. Images repeated along the same direction were included as separate points in each fit.

2.4 Results

2.4.1 Fitting performance under clinical conditions

Fig. 2.4 shows the SNR, trace, and Fractional Anisotropy (FA) histograms, averaged across eight repetitions for one of the subjects. The average trace and FA statistics correspond to the NLS fit. The FA histogram had a peak at 0.2 and less than 5% of the voxels had an FA greater than 0.7. Simulations used SNR and trace values corresponding roughly to the peaks of the experimental distributions (SNR=20 and trace=2.1 µm²/ms).

Fig. 2.5 shows the average reduced chi-squared, σFA, and σε1 over a range of FA values from simulations and for the combination of all 3 subjects. WLLS, NLS, and MCNLS were virtually indistinguishable under these conditions. WLLS, NLS, and MCNLS reduced chi-squared statistics were in agreement with the expected value of one for all simulations. For the LLS fit, deviation from the expected value increased with FA (Fig. 2.5a). This is consistent with other recent findings [35]. Similar, though slightly higher, chi-squared values were observed in vivo. Experimentally measured uncertainty in Fractional Anisotropy (FA) and fibre orientation agreed qualitatively with simulations. For both parameters, LLS showed greater uncertainty versus other fit types, and this difference became more pronounced with increased anisotropy (Fig. 2.5).

Figure 2.4: Whole-brain mask (a) SNR, (b) trace and (c) Fractional Anisotropy (FA) histograms, averaged across eight repetitions for a single subject.

Figure 2.5: Mean reduced chi-squared, χ²_r, from (a) simulations and (b) combined data from 3 test subjects binned by Fractional Anisotropy (FA). Standard deviation in FA, σ_FA, from (c) simulations and (d) experiment. Standard angular deviation in the principal eigenvector, σ_ε1, from (e) simulations and (f) experiment. Imaging parameters: b=1000 s/mm², 23 gradient orientations, and SNR≈20. Error bars show ± one standard deviation confidence intervals for the sample means.

Figure 2.6: Simulation histograms for FA=0.9. (a) Reduced chi-squared, χ²_r, (b) Fractional Anisotropy (FA), and (c) angular difference of ε1 from the mean dyadic tensor in degrees. σ_ε1 is also labeled, representing the root mean squared angular difference between the fibre orientation and the mean dyadic tensor.

Fig. 2.6 gives probability density functions for the reduced chi-squared, Fractional Anisotropy, and fibre orientation from simulations with FA=0.9. This is the situation under which differences between the fitting algorithms are most apparent. WLLS, NLS and MCNLS are indistinguishable from one another under these conditions, so only the linear least-squares and nonlinear least-squares results are shown. WLLS, NLS, and MCNLS algorithms followed the theoretical reduced chi-squared distribution, while the LLS values were slightly higher. Uncertainty in measuring Fractional Anisotropy was also increased using LLS, as evidenced by the broader distribution in Fig. 2.6b. Fig. 2.6c shows a histogram of the angular difference between the estimated direction, ε1, and the mean dyadic tensor, ⟨ε1⟩ [43]. The further this distribution shifts to the right, the greater the degree of uncertainty in estimating the fibre orientation, which corresponds to an increase in σ_ε1.
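The root mean squared angular difference behind σ_ε1 can be illustrated with a short sketch. It assumes that the reference direction (the principal eigenvector of the mean dyadic tensor) has already been computed and is supplied as a unit vector; the function name and the three example vectors are hypothetical and are not taken from the simulation code [48].

```python
import math

def rms_angular_deviation(directions, reference):
    """Root mean squared angle (degrees) between unit vectors and a reference.

    Fibre directions are antipodally symmetric, so v and -v are treated as
    the same orientation by taking the absolute value of the dot product.
    """
    squared = []
    for v in directions:
        dot = abs(sum(a * b for a, b in zip(v, reference)))
        dot = min(1.0, dot)  # guard against rounding slightly above 1
        squared.append(math.degrees(math.acos(dot)) ** 2)
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical estimates scattered around the z-axis.
ref = (0.0, 0.0, 1.0)
tilt = math.radians(10.0)
estimates = [
    (math.sin(tilt), 0.0, math.cos(tilt)),    # 10 degrees away
    (0.0, -math.sin(tilt), -math.cos(tilt)),  # flipped sign, still 10 degrees
    (0.0, 0.0, 1.0),                          # perfectly aligned
]
sigma_e1 = rms_angular_deviation(estimates, ref)
```

Taking the absolute value of the dot product keeps a sign flip of the eigenvector from being counted as a large angular error.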

The average computational time for the four fitting algorithms was 0.01, 1.5, 4.9 and 5.2 ms/tensor for the LLS, WLLS, NLS and MCNLS fits, respectively, using Matlab 7.0 on a 2.13 GHz Intel Core 2 PC. The WLLS, NLS, and MCNLS fits were of the same order of magnitude, while the LLS fit was more than 100 times faster. All Matlab code was optimized where possible, though some differences may be related to implementation.



Fig. 2.7 shows the root mean squared error in Fractional Anisotropy (FA) plotted against the maximum b-value for three values of FA: 0.1, 0.5, and 0.9. Fig. 2.7a-c represent the case where SNR is consistent across all b-values, while Fig. 2.7d-f show the case where the SNR is normalized for minimum TE and minimum TR. Normalizing for just TE produced results that were intermediate to the two. For low anisotropy, although the optimal b-value is the same across all fit types, NLS outperforms all other algorithms for b-values greater than 1500 s/mm². The benefit of the MCNLS fitting algorithm is only apparent at high b-values in the highly anisotropic case, and it is relatively small (Fig. 2.7c,f).

Fig. 2.8a-c shows the range of b-values for which the root mean squared error in Fractional Anisotropy is within 5% of its minimum value under the three SNR scenarios (SNR=20, normalized for minimum TE, and normalized for minimum TE and minimum TR). MCNLS and WLLS (not shown) had results similar to NLS. Full details are available in Table 2.1. The major difference between the fitting algorithms relates to their behaviour relative to diffusion anisotropy. At low anisotropy, optimal b-values are consistent across all fitting algorithms. However, for the LLS fit, the optimal b-value decreases with increased anisotropy. This is consistent with previous reports [21]. The other fitting algorithms show the opposite trend, with the optimal b-value increasing with anisotropy. When trying to select a b-value that performs well across the full range of anisotropies, the consequence is that the maximum b-value for LLS is limited by performance at high anisotropies, while the other fitting algorithms are limited by their performance at low anisotropies.
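The "within 5% of its minimum value" criterion can be sketched as follows. This is only an illustration: the function name is hypothetical, the error curve is made up rather than taken from the simulations, and the sketch assumes that the near-optimal b-values form one contiguous interval.

```python
def near_optimal_range(b_values, rmse, tolerance=0.05):
    """Return (b_low, b_high, b_best): the span of sampled b-values whose
    RMSE is within `tolerance` of the minimum, plus the minimizing b-value."""
    best = min(rmse)
    limit = best * (1.0 + tolerance)
    within = [b for b, e in zip(b_values, rmse) if e <= limit]
    b_best = b_values[rmse.index(best)]
    return min(within), max(within), b_best

# Made-up RMSE curve, for illustration only (not data from this study).
b = [400, 600, 800, 1000, 1200, 1400, 1600]
err = [0.080, 0.062, 0.051, 0.050, 0.052, 0.058, 0.070]
low, high, best = near_optimal_range(b, err)
```

Reporting a range rather than a single minimizing b-value reflects how flat the error curve is near its optimum.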

Table 2.1:

Fig. 2.8d-f, Fig. 2.9 and Table 2.2 show the analogous results for minimizing the root mean squared error in ε1. In general, b-values are slightly lower than those optimized for Fractional Anisotropy (FA). In addition, SNR normalization has a stronger impact on reducing the optimal b-value than it did in the case of FA. For estimating ε1, the NLS fit has performance equivalent to or better than all other algorithms at all b-values tested. The MCNLS fit shows no advantage for estimating fibre orientation even at high b-values. LLS performs significantly worse than all other algorithms in cases of high b-value combined with high anisotropy.


Table 2.2:


Table 2.3 shows the number of b0 images that minimize the root mean squared error in Fractional Anisotropy across a range of anisotropies. Although WLLS, NLS, and MCNLS showed a tendency towards a higher number of b0 images, this difference was never greater than one. Lower anisotropies required a smaller relative number of b0 images, i.e. a higher N_DW:N_b0 ratio.


Table 2.3:


Fig. 2.10 shows σ_FA (actual FA=0.9) plotted as a function of tensor orientation for the LLS and NLS algorithms under 3 scenarios: 6 directions with 10 NEX, 10 directions with 6 NEX, and 60 directions with 1 NEX. The results for NLS, WLLS, and MCNLS were indistinguishable under these simulation conditions, so only the NLS and LLS results are shown. In the case of 6 directions repeated 10 times, all fitting algorithms had comparable performance, both in terms of the mean uncertainty in Fractional Anisotropy and in the standard deviation of σ_FA with respect to tensor orientation (Fig. 2.10g and h).

The standard deviation with respect to tensor orientation is a measure of how much the uncertainty changes with respect to fibre orientation, indicating the rotational bias in precision. Increasing the number of directions beyond 6 reduced the mean value of σ_FA by about 20% in the case of NLS, WLLS, and MCNLS, whereas no effect was observed for the LLS fit (Fig. 2.10g). The rotational bias in σ_FA was reduced with an increase in the number of directions for all fitting algorithms, although the LLS asymptote was reached with 30 directions, while the asymptotic value for all other algorithms was reached with just 15 directions (Fig. 2.10h).

The results for fibre orientation were similar. Moving beyond 6 unique directions reduced the mean value of σ_ε1 by about 25% in the case of NLS, WLLS, and MCNLS, whereas the mean value for LLS did not change (Fig. 2.11g). Rotational bias in σ_ε1 was minimized with 30 directions for the LLS fit, and about 20 directions for all other fit types. Fig. 2.12 and 2.13 show the mean value and rotational bias in σ_FA and σ_ε1 plotted versus the number of unique directions for a range of diffusion anisotropies.


Figure 2.7: Root mean squared error in Fractional Anisotropy plotted vs. b, for (a) FA=0.1, (b) FA=0.5, and (c) FA=0.9, with SNR of the b0 images held constant at 20 for all b-values. (d-f) The corresponding plots where SNR is normalized for TE and imaging time.

Figure 2.8: Range of b-values for which the root mean squared error in Fractional Anisotropy is within 5% of its minimum value. (a) SNR=20 for all b-values, (b) SNR normalized for minimum TE, and (c) SNR normalized for minimum TE and minimum TR. (d-f) The corresponding range of b-values for which the root mean squared error in fibre orientation is within 5% of its minimum value.

Figure 2.9: Root mean squared error in ε1 plotted vs. b, for (a) FA=0.5, (b) FA=0.7, and (c) FA=0.9, with SNR of the b0 images held constant at 20 for all b-values. (d-f) The corresponding plots where SNR is normalized for minimum TE and imaging time. FA values below 0.5 are not shown because directional estimation is much less reliable at low anisotropy.

Figure 2.10: Uncertainty in Fractional Anisotropy as a function of fibre orientation using the LLS fit for (a) 6 directions, (b) 10 directions and (c) 60 directions. (d-f) Analogous results for NLS fitting. (g) Mean uncertainty in FA (averaged over all tensor orientations) versus unique number of directions. (h) Standard deviation in σ_FA over all fibre orientations, representing the rotational bias in uncertainty. Note that the vertical scale of g is 10x that of h.

Figure 2.11: Uncertainty in ε1 as a function of fibre orientation using the LLS fit for (a) 6 directions, (b) 10 directions, and (c) 60 directions. (d-f) Analogous results for the NLS fit. (g) Mean uncertainty in ε1 (averaged over all tensor orientations) versus unique number of directions. (h) Standard deviation in σ_ε1 over all fibre orientations, representing rotational bias in uncertainty. Note that the vertical scale of g is 10x that of h.

Figure 2.12: Mean value of σ_FA averaged over all fibre orientations versus the number of unique directions for an FA of (a) 0.1, (b) 0.5, and (c) 0.9. Standard deviation in σ_FA over all tensor orientations for an FA of (d) 0.1, (e) 0.5, and (f) 0.9 respectively. The vertical scale of a-c is 10x that of d-f.

Figure 2.13: Mean value of σ_ε1 averaged over all fibre orientations versus the number of unique directions for an FA of (a) 0.5, (b) 0.7, and (c) 0.9. (d-f) Standard deviation in σ_ε1 over all tensor orientations for an FA of 0.5, 0.7, and 0.9 respectively. The vertical scale of a-c is 10x that of d-f.


2.5 Discussion

This study demonstrated that at b=1000 s/mm², DTI data analysis using the LLS algorithm resulted in reduced directional reliability and precision in Fractional Anisotropy (FA), especially for highly anisotropic voxels. At b=1000 s/mm², there was very little difference between any of the alternatives to LLS. Results in vivo qualitatively agreed with the corresponding simulations. The average reduced chi-squared values, χ²_r, were slightly elevated in vivo, indicating the presence of voxels for which the diffusion tensor model may have been insufficient. Slightly higher uncertainty in both FA and fibre orientation was also present in the in vivo data relative to simulations, especially at high anisotropy. This was probably due to outliers, which had a disproportionately large effect at high FA because there were relatively few voxels with FA values in this range (Fig. 2.4c). Reduced precision in experimental results relative to simulations could also be partially explained by variations in trace and SNR over the brain volumes (in simulations, these variables were fixed).

Fig. 2.7, 2.8, and 2.9 clearly show that the optimal b-value depends on the fitting algorithm being used. In general, the b-value that minimized the root mean squared error in Fractional Anisotropy (FA) was slightly higher than the b-value optimized for ε1. A b-value between 800 and 1000 s/mm² offered a good trade-off between directional and anisotropy information for LLS (if SNR was normalized for minimum TE and TR), while for WLLS, NLS, and MCNLS, the optimal range of b-values was slightly higher, somewhere in the range of 900 to 1200 s/mm².

Gradients with a higher maximum amplitude have potential to shorten the minimum TE necessary to achieve a given b-value (Fig. 1.2), which would reduce SNR losses associated with higher b-values. This could shift the optimal b-value towards the unnormalized SNR case. Ignoring SNR losses due to differences in TE and TR, the upper limit for LLS fitting was around 1100 s/mm² for ε1, and 1400 s/mm² for FA. In the case of the WLLS, NLS and MCNLS fits, the b-value could potentially be raised as high as 1800 s/mm² while still maintaining near optimal accuracy in parameters. In general, it seems that an added benefit of the more advanced fitting techniques is that they were better able to adapt to high b-values, a property that will be exploited in Chapter 3.

The MCNLS algorithm essentially modulates signal measurements based on their amplitude, i.e. the degree to which they are likely to experience a positive magnitude bias. Low signals are especially prone to magnitude bias, so they are suppressed more, whereas higher signals are modulated less. This technique estimated Fractional Anisotropy more accurately than NLS for highly anisotropic voxels at high b-values, but at low anisotropy, NLS showed better performance (Fig. 2.7a,d). A possible explanation is that in the isotropic case, all measurements were suppressed almost equally, which is analogous to a slight signal loss. The implication is that it may be better to use the magnitude-compensation only for highly anisotropic voxels, possibly using some sort of thresholding approach.
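The positive magnitude bias that motivates this correction can be reproduced with a small Monte Carlo sketch. It does not implement the MCNLS algorithm itself; it only assumes the usual picture of a magnitude image formed from real and imaginary channels with independent Gaussian noise, and the signal levels, noise level, sample count and function name are arbitrary choices for illustration.

```python
import math
import random

def mean_magnitude(true_signal, sigma, n_samples, rng):
    """Average magnitude of a complex signal with Gaussian noise of standard
    deviation `sigma` added independently to the real and imaginary channels."""
    total = 0.0
    for _ in range(n_samples):
        real = true_signal + rng.gauss(0.0, sigma)
        imag = rng.gauss(0.0, sigma)
        total += math.hypot(real, imag)
    return total / n_samples

rng = random.Random(0)  # fixed seed so the sketch is repeatable
sigma = 1.0
bias_low = mean_magnitude(1.0, sigma, 20000, rng) - 1.0     # signal near the noise floor
bias_high = mean_magnitude(10.0, sigma, 20000, rng) - 10.0  # signal well above it
```

In this sketch the measured magnitude overestimates the true signal in both cases, but the overestimate is much larger when the signal is comparable to the noise, which is why low signals are the ones that need the strongest compensation.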

For a total of 25 images, 2-3 b0 images offered good performance across the full range of anisotropies using LLS fitting, corresponding to a N_DW:N_b0 ratio between 7 and 12. For WLLS, NLS, and MCNLS, 3-4 b0 images was optimal, which corresponds to a N_DW:N_b0 ratio between 5 and 8. While these guidelines may be useful for minimizing the error in Fractional Anisotropy measurements, the acquisition of any non-diffusion-weighted images at the expense of more sampling directions increases the error in estimating fibre orientation. In this respect, these N_DW:N_b0 ratios represent the minimum values that should be considered. Higher ratios improve the ability to estimate the direction of ε1 at the expense of FA, which may be desirable in some situations (e.g. tractography).
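The quoted ratios follow directly from the fixed total of 25 images, as the arithmetic below shows. The helper name is hypothetical.

```python
def dw_to_b0_ratio(total_images, n_b0):
    """Ratio of diffusion-weighted images to b0 images for a fixed total."""
    return (total_images - n_b0) / n_b0

ratios = {n_b0: dw_to_b0_ratio(25, n_b0) for n_b0 in (2, 3, 4)}
# 2 b0 images give 23/2 = 11.5, 3 give 22/3 (about 7.3), 4 give 21/4 = 5.25
```

Rounded outward, these values give the ranges of 7 to 12 (2-3 b0 images) and 5 to 8 (3-4 b0 images).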


WLLS, NLS, and MCNLS showed reduced rotational bias relative to LLS, and reached their asymptotic values with fewer directions (15 for FA versus 30 with LLS, 20 for ε1 versus 30 with LLS). All of the alternatives to LLS also demonstrated lower mean uncertainty in both fibre orientation and Fractional Anisotropy (FA) for any number of directions greater than 6. For both FA and fibre orientation, the reduction in mean uncertainty between LLS and the other algorithms was of a similar magnitude to the reduction in the standard deviation versus orientation realized by acquiring an increased number of unique directions (Fig. 2.12 and 2.13). This implies that using one of the more advanced fitting algorithms had about the same degree of improvement in results as using 20-30 directions versus 6; one reduced overall uncertainty, while the other reduced rotational bias. Obviously, the greatest benefit was achieved by using one of the advanced fitting algorithms and acquiring more than 20 directions.

The combined results from this chapter illustrate several reasons for moving beyond the framework of linear least-squares fitting. Improved precision in tensor-based parameters and reduced rotational bias both translate into increased confidence in DTI results. Although these performance gains are modest, the only expense is computational time. With today's standard PC workstation, the time to fit DTI data using nonlinear methods is roughly equivalent to the time it takes to acquire the data from the MR scanner. Processor speed is no longer an excuse for suboptimal analysis practices.

This chapter also demonstrated that a migration from LLS fitting requires a rethinking of data collection itself. Scan parameters that may have been optimal for LLS algorithms would benefit from minor adjustments. These results suggest that slightly higher b-values and reduced N_DW:N_b0 ratios are necessary to take full advantage of the WLLS, NLS and MCNLS approaches.

It should be stressed that the optimal parameters reported in this thesis are only valid for data that can be fully described by the diffusion tensor model. Optimizing diffusion-weighted MRI for the purpose of resolving crossing or bending fibres would obviously require a different approach. Even in the simplest case of parallel fibre bundles, the phrase "optimal parameters" can be a bit misleading. Fitting performance depends on a number of variables including, but not limited to, mean diffusivity, b-value, tissue anisotropy, T2 relaxation, and pulse sequence timing. The fact that the optimal b-value and N_DW:N_b0 ratio vary considerably across the range of FA values implies that it is impossible to optimize for voxels with low and high anisotropy simultaneously. A possible solution to this problem may involve the design of sampling strategies that utilize multiple b-values. This is an area that deserves future study.

It should also be stressed that any choice of imaging parameters necessarily involves trade-offs, and choosing the best parameters involves a careful consideration of the entire imaging system. The simulations presented in this chapter examined some of these trade-offs under a standard set of conditions. While the general trends should translate to other situations, the specific "optimal" values may change. Differences in scanner hardware will affect the SNR normalization factors used, and will therefore have an effect on the recommended parameter values. It is also important to consider biological differences in the subject population, which could change, for example, mean diffusivity or T2 values. Source code for these simulations is freely available [48], and the interested reader is encouraged to adjust it to suit their own needs.

The optimization problem is further complicated by a host of potential image artifacts that include cardiac pulsatility, patient motion, susceptibility, eddy currents, and partial volume effects. None of these effects were considered in the simulation studies but each can have a major impact on experimental results. Once again, reducing these artifacts typically involves some form of trade-off. For example, averaging multiple excitations can improve SNR, but this increases the potential for subject motion and misregistration. A more thorough look at scan parameter optimization is probably warranted, specifically to evaluate performance in vivo, where variability beyond random noise exists. In any case, optimization of data collection should take into account the intended fitting algorithm.

The presence of these additional sources of variation highlights a problem common to all least-squares methods: they are extremely sensitive to data outliers [51]. Other groups have implemented robust fitting techniques which are less susceptible to outlier-related artifacts. Mangin et al. used the Geman-McClure M-estimator to fit the diffusion tensor [52]. They showed improved performance in the presence of corrupted data points, but the quality of fitting in the absence of outliers was reduced. Chang et al. used a similar technique to perform automated outlier-rejection, after which they applied standard nonlinear fitting to the processed data [53]. The application of robust statistics to the field of DTI shows promise, but there are several concerns that still need to be addressed. Robust techniques must be able to differentiate between outliers caused by image artifacts and those that reflect true deviations from the tensor model, for example, those due to complex tissue structure and partial volume effects. The latter case reflects useful information that should not be discarded.

Another area of interest that has not yet been explored is the impact of fitting algorithms on bootstrap analysis [54]. Bootstrapping is a statistical technique that involves resampling of experimental data to generate probability distributions for estimated model parameters. Model-based techniques such as the wild bootstrap [55] and residual bootstrap [54] make an assumption that the residuals are either symmetric or similar across data points. These assumptions are not valid for LLS or even WLLS fitting, though the degree to which this might influence bootstrap results is unknown. The results of this study suggest that it may be problematic for voxels with high anisotropy and/or high b-values.

Finally, there are two important advantages to nonlinear fitting beyond those explicitly tested in these experiments. First of all, NLS fitting produced chi-squared statistics that were truer to the theoretical chi-squared distribution, as evidenced by Fig. 2.5 and 2.6. This is important from the perspective of model validation, as the chi-squared value is indicative of the agreement between the data and the model. A fitting algorithm that artificially inflates this statistic could obviously lead to problems in this context. Secondly, while the performance of the LLS algorithm drops off dramatically at high b-values, nonlinear methods are less susceptible to this effect. High b-values are of great interest because they accentuate non-Gaussian diffusion processes for which the tensor model is insufficient [19, 22]. This suggests that an attempt to characterize non-Gaussian voxels may be facilitated by a move to higher b-values. The combination of these two properties makes nonlinear fitting particularly well suited for the study of model validation. This will be the focus of Chapter 3.

Chapter 3

Testing the validity of the tensor model

"All models are wrong, some are useful."

-George Box

3.1 Introduction

It is well known that the tensor model is limited to describing a single population of fibres running in parallel. While this is generally sufficient for the majority of voxels in white matter, more complex tissue architectures including crossing, bending, and diverging fibres, are known to exist [19]. Applying the tensor model in these situations can lead to misinterpretation of DTI results, as demonstrated in Fig. 3.1. Heterogeneous fibre populations within a single voxel can result in erroneous fibre directions and reduced Fractional Anisotropy (FA), effects which could be falsely attributed to pathology or cause premature termination of tractography algorithms. Therefore, it is important to realize the limitations of DTI and avoid making inferences based on the tensor model in situations where it is not justified.

Figure 3.1: (a,b) ADC profiles for two distinct fibre populations separated by a 45-degree angle. (c) ADC profile from a voxel containing equivalent proportions of these two fibre populations. (d) The diffusion tensor fit to the ADC profile in c. (e) FA map with overlaid ellipsoids representing the displacement profile for each voxel. Regions i and ii contain homogeneous populations of fibres a and b, respectively. Region iii contains a mixed population, resulting in an estimated fibre orientation that matches neither of the included fibre populations, and an apparent reduction in FA.

The inadequacy of the diffusion tensor model has motivated the development of advanced reconstruction schemes including q-space [5], q-ball [56], and Persistent Angular Structure (PAS) MRI [50]. These techniques require the acquisition of more data, usually in the form of additional sampling directions. Although increased data allows the fitting of more complex models, the question remains as to whether or not these higher-order models are always necessary, or if the data quality is sufficient to warrant the fit. In voxels for which the underlying architecture is adequately described by a tensor, the use of higher-order models can actually introduce additional error due to over-fitting (i.e. fitting the noise). Therefore, even in those cases where it is possible to fit higher-order models, one should always ask whether or not the increased complexity is supported by the data.

Alexander et al. proposed a model-selection algorithm for classifying voxels into categories of isotropic Gaussian, non-isotropic Gaussian (equivalent to the diffusion tensor) and non-Gaussian diffusion [31]. This technique involves the fitting of a hierarchical set of models based on the spherical harmonic series, and selecting the most appropriate model using an F-test. It performs reasonably well for fibres crossing at 90 degrees and in equal volume fractions, but suffers as the separation angle is reduced and/or the volume fractions become more unbalanced.

Because non-Gaussian diffusion is more apparent at high b-values [19, 22], it may be possible to improve classifier performance by imaging at b-values greater than 1000 s/mm² (Fig. 3.2). However, the linear least-squares algorithm commonly used to fit the spherical harmonic models performs poorly if the b-value is increased too much, analogous to the results presented for the diffusion tensor in the previous chapter. In addition, high b-values combined with high diffusivity can result in signal measurements close to the noise floor. In this situation, magnitude bias can result in a "squashed peanut" artifact [32], which itself resembles non-Gaussian diffusion. Based on the results of Chapter 2, a possible solution to this problem is an extension of the magnitude-corrected nonlinear least-squares (MCNLS) fitting algorithm to higher-order models. The ability to fit spherical harmonic models at high b-values should result in an improved ability to detect voxels for which the diffusion tensor is invalid.

Figure 3.2: ADC profiles of two fibres crossing at 90 degrees for 3 different b-values: (a) b=1000 s/mm², (b) b=2000 s/mm², and (c) b=3000 s/mm².

Note that throughout this chapter, diffusion profiles for isotropic media and those with parallel fibres are said to exhibit Gaussian diffusion. Technically speaking, all diffusion in biological tissue is non-Gaussian. Separate water compartments (e.g. intra/extracellular water), exchange, and restrictions to water mobility all violate the Gaussian diffusion model [45]. However, deviation from Gaussian behaviour is relatively minor for the aforementioned tissue types in the range of b-values available on most clinical scanners (≤3000 s/mm²) [7]. Under these conditions, the Gaussian/non-Gaussian distinction provides a useful way to differentiate between the voxels for which the diffusion tensor is valid (e.g. isotropic diffusion and parallel fibre bundles), and those with more complex geometries.

3.2 Theory

3.2.1 Hypothesis testing

Hypothesis testing is a commonly employed statistical technique to compare evidence for a given hypothesis with the probability that the results are a product of random chance. In the context of this chapter, the test is whether or not the diffusion profile in a given voxel is Gaussian in shape; that is, whether or not the tensor model is valid. The first step is to formulate the null hypothesis: that the diffusion profile is Gaussian in shape. Next, a metric that relates to this hypothesis is needed, for example, the chi-squared statistic. The chi-squared value is calculated according to Eq. 2.15, and it indicates how well the tensor model fits the data.

The probability density function (pdf) of the metric should change if the null hypothesis is false (e.g. in the case of non-Gaussian diffusion, the chi-squared value should be higher). The pdf may be shifted, stretched, or have an entirely different shape. Ideally, it would be completely separated from that of the null hypothesis, in which case it would be easy to separate the two scenarios. More likely, there will be some degree of overlap and no matter where the threshold is drawn, some voxels will be misclassified. In this case, there are two types of error, false positives and false negatives, as demonstrated in Fig. 3.3.

Figure 3.3: Hypothesis test schematic. The solid curve represents the pdf of the metric if the null hypothesis (Gaussian diffusion) is true. The dashed curve represents the pdf of the metric if the null hypothesis is false. The metric could be any measurement that would be expected to differ between the two cases (e.g. chi-squared statistic). If the metric is greater than the decision threshold, the null hypothesis is rejected. False positive and false negative errors are shown.

The theoretical distribution of the chi-squared statistic can be calculated based on the number of model parameters and the number of data points. It is then possible to base the decision threshold on a specific p-value, for example, 0.01. This means that if a voxel is imaged multiple times with the null hypothesis being true, its chi-squared value would be less than the threshold 99% of the time. Any voxel with a chi-squared value above this threshold would lead to a rejection of the null hypothesis.
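A threshold of this kind can be computed numerically, as in the sketch below. It assumes the test statistic follows a chi-squared distribution with a known number of degrees of freedom; the value of 10 degrees of freedom and both function names are arbitrary choices for illustration, and the cumulative distribution is evaluated with a standard series for the regularized incomplete gamma function rather than any routine used in this thesis.

```python
import math

def chi2_cdf(x, dof):
    """Chi-squared cumulative distribution via the series expansion of the
    regularized lower incomplete gamma function P(dof/2, x/2)."""
    if x <= 0.0:
        return 0.0
    a, z = dof / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def chi2_threshold(dof, p_value=0.01):
    """Chi-squared value exceeded with probability `p_value` under the
    null hypothesis, found by bisection on the CDF."""
    low, high = 0.0, 10.0 * dof + 50.0
    for _ in range(200):
        mid = 0.5 * (low + high)
        if chi2_cdf(mid, dof) < 1.0 - p_value:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

threshold = chi2_threshold(dof=10)  # dof = 10 is an arbitrary example
```

A voxel whose chi-squared value exceeds `threshold` would then be flagged as inconsistent with the null hypothesis at the chosen p-value.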

While a hypothesis test based on chi-squared values can successfully identify some voxels with complex architecture, performance is relatively poor except in the case of a large separation angle. A more sophisticated approach involves reframing the question of model validation as one of model selection. This strategy is examined in the following section.

3.2.2 Model selection

The previously described work of Alexander et al. [31] utilized a hierarchical set of models based on the spherical harmonic series. Assuming that diffusion is antipodally symmetric, only the even-ordered terms are required. The 0th-order term alone represents isotropic diffusion. Truncating the series at the 2nd-order produces a model that is equivalent to the diffusion tensor. The 4th-order model has fifteen parameters and is capable of describing shapes that resemble crossing fibres [57]. Fig. 3.4 shows a graphical representation of these models fit to four different simulated test cases: isotropic diffusion, prolate diffusion, oblate diffusion, and a pair of fibre bundles crossing at 90 degrees. Once all of the models have been fit to each case, the simplest model that can adequately describe the data is selected.

Figure 3.4: Noisy ADC profile and 0th, 2nd and 4th-order SH models fit to 4 test cases. The model selected by the F-test algorithm in each case is surrounded by a black box. Notice that each of the selected models has a similar shape to its corresponding noisy ADC profile, and that increasing the model order further doesn't significantly change the shape.

The method described by Alexander et al. relies on an F-test for model selection. The F-test compares the chi-squared statistics from a pair of models and the relative number of parameters in each [42]. It is calculated according to Eq. 3.1, where p_a and p_b are the number of parameters for models a and b (p_b > p_a), χ²_a and χ²_b are the corresponding chi-squared values, and N is the number of signal measurements. The F-statistic can be thought of as a measure of the relative improvement in the reduced chi-squared value afforded by the additional model parameters.

\[ F_{a,b} = \frac{(N - p_b - 1)(\chi_a^2 - \chi_b^2)}{(p_b - p_a)\,\chi_b^2} \tag{3.1} \]

To utilize the F-statistic in a hypothesis testing framework, the null hypothesis is that both models describe the data equally well; that is, there is no advantage to using the more complex model. In this case, the F-statistic follows a known theoretical distribution, and a threshold, T_{a,b}, is defined such that F_{a,b} is less than the threshold 99% of the time. Alternatively, thresholds can be based on performance criteria from simulation studies. If the F-statistic is above the threshold, the null hypothesis is rejected and the more complex model is deemed necessary.
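As a worked instance of Eq. 3.1, the sketch below evaluates the F-statistic for a 2nd-order fit (6 parameters) against a 4th-order fit (15 parameters). The chi-squared values and the number of measurements are invented for illustration, and the function name is hypothetical.

```python
def f_statistic(chi2_a, chi2_b, p_a, p_b, n_points):
    """F-statistic of Eq. 3.1 comparing a simpler model a with a more
    complex model b (p_b > p_a) fit to the same n_points measurements."""
    if p_b <= p_a:
        raise ValueError("model b must have more parameters than model a")
    numerator = (n_points - p_b - 1) * (chi2_a - chi2_b)
    denominator = (p_b - p_a) * chi2_b
    return numerator / denominator

# Illustrative numbers only: 61 measurements, 2nd-order (6 parameters)
# versus 4th-order (15 parameters) spherical harmonic fits.
f_2_4 = f_statistic(chi2_a=60.0, chi2_b=45.0, p_a=6, p_b=15, n_points=61)
```

With these numbers the statistic is (45 × 15) / (9 × 45) = 5/3, which would then be compared against the chosen threshold.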

The model selection algorithm works as follows. Each voxel is initially assigned the 0th-order isotropic classification. If F_{0,2} is greater than the T_{0,2} threshold, the voxel is assigned to the 2nd-order class. Next, F_{2,4} is calculated and compared to the T_{2,4} threshold. If it exceeds this threshold, its classification is incremented to 4th-order. When the F-statistic fails to cross its respective threshold, the algorithm is terminated.
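The sequential classification can be sketched in a few lines. The function name is hypothetical, and the F-statistics and thresholds below are placeholders rather than values used in this work.

```python
def select_model_order(f_0_2, f_2_4, t_0_2, t_2_4):
    """Sequential F-test classification into SH order 0, 2 or 4."""
    order = 0                 # every voxel starts as 0th-order (isotropic)
    if f_0_2 <= t_0_2:
        return order          # 2nd-order model not justified: stop here
    order = 2
    if f_2_4 <= t_2_4:
        return order          # 4th-order model not justified: stop here
    return 4

# Threshold values below are placeholders, not those used in the thesis.
labels = [
    select_model_order(1.2, 9.0, t_0_2=4.0, t_2_4=3.0),   # stays isotropic
    select_model_order(25.0, 1.1, t_0_2=4.0, t_2_4=3.0),  # tensor suffices
    select_model_order(25.0, 6.5, t_0_2=4.0, t_2_4=3.0),  # non-Gaussian
]
```

Note that in the first case the voxel remains isotropic even though the second F-statistic is large, because the procedure terminates as soon as one test fails.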

3.2.3 Fitting higher-order models

The spherical harmonic (SH) series is a set of basis functions represented by Y_{l,m}(θ, ϕ), where θ and ϕ are the colatitude and longitude angles on the sphere, l = 0, 1, 2, ... is the function order, and m = −l, ..., 0, ..., l are the indexed functions of order l. Any complex-valued spherical function can be written as a linear combination of these SH functions multiplied by a set of complex coefficients, c_{l,m}:

\[ f(\theta, \phi) = \sum_{l=0}^{\infty} \sum_{m=-l}^{l} c_{l,m} Y_{l,m}(\theta, \phi) \tag{3.2} \]

Truncating this series at order l, and keeping only the even-order terms, leaves (l+1)(l+2)/2 parameters.
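The parameter count can be checked by summing the 2l + 1 functions contributed by each even order and comparing the result with the closed form (l+1)(l+2)/2. This is a small verification sketch with a hypothetical function name.

```python
def sh_parameter_count(max_order):
    """Number of SH coefficients when only even orders 0, 2, ..., max_order
    are kept; each order l contributes 2l + 1 functions (m = -l, ..., l)."""
    if max_order < 0 or max_order % 2:
        raise ValueError("max_order must be a non-negative even integer")
    return sum(2 * l + 1 for l in range(0, max_order + 1, 2))

counts = {l: sh_parameter_count(l) for l in (0, 2, 4, 6)}
closed_form = {l: (l + 1) * (l + 2) // 2 for l in (0, 2, 4, 6)}
```

Both give 1, 6, 15 and 28 parameters for orders 0, 2, 4 and 6, matching the isotropic model, the diffusion tensor and the fifteen-parameter 4th-order model.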

Solving for the SH coe�cients is similar to �tting the di�usion tensor, described

in section 2.2. The SH coe�cients are written as a column vector, C.

$$\mathbf{C} = [c_{0,0},\, c_{2,-2},\, c_{2,-1},\, c_{2,0},\, c_{2,1},\, \ldots,\, c_{l_{max},l_{max}}]^T \tag{3.3}$$

Each row of the experimental design matrix is set up using the b-values, $b_i$, and SH basis functions, $Y_{l,m}(\theta_i, \phi_i)$, that correspond to the applied diffusion gradients for each of the N images:

$$\mathbf{X}_i = [b_i Y_{0,0}(\theta_i, \phi_i),\, b_i Y_{2,-2}(\theta_i, \phi_i),\, \ldots,\, b_i Y_{l_{max},m_{max}}(\theta_i, \phi_i),\, 1] \tag{3.4}$$

The complete experimental design matrix is therefore:

$$\mathbf{X} = \begin{bmatrix} \mathbf{X}_1 \\ \vdots \\ \mathbf{X}_i \\ \vdots \\ \mathbf{X}_N \end{bmatrix} \tag{3.5}$$

To solve for the SH coefficients using linear least-squares regression, Y is defined as an N×1 column vector of the log-transformed signal measurements,

$$\mathbf{Y} = [\ln(S_1),\, \ln(S_2),\, \cdots,\, \ln(S_N)]^T \tag{3.6}$$

resulting in the following equation:


Y = XC (3.7)

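The LLS solution is calculated by taking the pseudoinverse of the design matrix, $\mathbf{X}^+$, and multiplying it by Y.

$$\mathbf{C} = (\mathbf{X}^*\mathbf{X})^{-1}\mathbf{X}^*\mathbf{Y} = \mathbf{X}^+\mathbf{Y} \tag{3.8}$$

$\mathbf{X}^*$ is the conjugate transpose of matrix X.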

It is also possible to calculate the weighted linear least-squares solution using the same weighting matrix as in the diffusion tensor case (Eq. 2.7).

$$\mathbf{C} = (\mathbf{X}^*\mathbf{W}\mathbf{X})^{-1}\mathbf{X}^*\mathbf{W}\mathbf{Y} \tag{3.9}$$
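To make the normal-equation solution concrete, the sketch below fits the simplest possible case, an isotropic model $\ln(S) = \ln(S_0) - bD$, whose design-matrix rows are $[-b_i, 1]$. It is an illustration only: the function name is hypothetical, the 2×2 system is solved in closed form, and the weights $w_i = S_i^2$ used in the example are a common choice assumed here, which may differ from the weighting matrix of Eq. 2.7.

```python
import math


def wlls_isotropic(b_values, signals, weights=None):
    """Fit ln(S) = ln(S0) - b*D by (weighted) linear least squares.

    Rows of the design matrix are [-b_i, 1]; the unknowns are [D, ln(S0)].
    With weights=None every measurement is weighted equally (plain LLS).
    Returns (D, S0).
    """
    y = [math.log(s) for s in signals]
    w = weights if weights is not None else [1.0] * len(y)
    # Elements of the 2x2 normal matrix X^T W X and of the vector X^T W y.
    a11 = sum(wi * b * b for wi, b in zip(w, b_values))
    a12 = -sum(wi * b for wi, b in zip(w, b_values))
    a22 = sum(w)
    r1 = -sum(wi * b * yi for wi, b, yi in zip(w, b_values, y))
    r2 = sum(wi * yi for wi, yi in zip(w, y))
    det = a11 * a22 - a12 * a12
    d = (a22 * r1 - a12 * r2) / det
    ln_s0 = (a11 * r2 - a12 * r1) / det
    return d, math.exp(ln_s0)


# Noise-free synthetic data (b in s/mm^2, D in mm^2/s): the fit should
# recover the parameters used to generate the signals.
b_values = [0.0, 1000.0, 2000.0, 3000.0]
signals = [100.0 * math.exp(-b * 0.7e-3) for b in b_values]
d_fit, s0_fit = wlls_isotropic(b_values, signals, weights=[s * s for s in signals])
```

The higher-order models follow the same pattern with more columns in the design matrix; a general linear solver would replace the closed-form 2×2 inverse.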

Nonlinear fitting is complicated by the fact that the SH coefficients have an imaginary component. If signal measurements are written as an N×1 column vector, S, the diffusion equation takes the following form:

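$$\mathbf{S} = \exp\left( \begin{bmatrix} 0 & 0 & 1 \\ \vdots & \vdots & \vdots \\ -b_i\,\mathrm{Re}(\mathbf{X}_i) & -b_i\,\mathrm{Im}(\mathbf{X}_i) & 1 \\ \vdots & \vdots & \vdots \\ -b_N\,\mathrm{Re}(\mathbf{X}_N) & -b_N\,\mathrm{Im}(\mathbf{X}_N) & 1 \end{bmatrix} \begin{bmatrix} \mathrm{Re}(\mathbf{C}) \\ \mathrm{Im}(\mathbf{C}) \\ \ln(S_0) \end{bmatrix} \right) \tag{3.10}$$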

Dividing the SH coefficients into their real and imaginary components doubles the number of unknowns. The reason this problem is not encountered in the case of LLS or WLLS is that these methods implicitly constrain S to be real-valued. While it is possible to fit the system described in Eq. 3.10 using a nonlinear algorithm with N constraints, there is an alternative method that has several advantages.

There is a family of models based on higher-order tensors, or Generalized Diffusion Tensors (GDTs) [58], that are theoretically equivalent to the spherical harmonic models. Descoteaux et al. showed that it is trivial to convert between the two [59]. The advantage to working with GDTs is that the real and imaginary components are completely separable. Therefore, if the measured signal is real-valued, all of the GDT parameters are also real-valued, meaning that the GDT models can be fit using nonlinear algorithms without the need to apply constraints. This makes fitting GDTs conceptually easier and reduces computational demands relative to the spherical harmonic models.

The fitting of a rank-$l$ tensor will be described using the formalism of Özarslan et al. [60]. The apparent diffusion coefficient along direction r for a rank-$l$ tensor is described by the equation:

$$D^{(l)}(\mathbf{r}) = \sum_{i_1=1}^{3} \sum_{i_2=1}^{3} \cdots \sum_{i_l=1}^{3} D_{i_1 i_2 \ldots i_l}\, r_{i_1} r_{i_2} \cdots r_{i_l} \tag{3.11}$$

where the indices $i_j$ represent the x, y, and z direction components respectively. To maintain antipodal symmetry, $l$ is forced to be an even number, as was the case with the spherical harmonic models. Eq. 3.11 implies that there are $3^l$ terms for a rank-$l$ tensor; however, this number is greatly reduced by symmetry. This is analogous to the case of the standard rank-2 tensor, where $D_{ij}$ is recognized as being equivalent to $D_{ji}$. For the general case, this property is expressed by the following equation:

$$D_{i_1 i_2 \ldots i_n} = D_{(i_1 i_2 \ldots i_n)} \tag{3.12}$$

where $(i_1 i_2 \ldots i_n)$ represents all permutations of the tensor indices. The number of unique elements is $N_l = \frac{(l+1)(l+2)}{2}$, which is the same as the number of parameters in a spherical harmonic model of order $l$. Each of the $N_l$ unique elements is repeated $\mu$ times, where:

$$\mu = \binom{l}{n_x}\binom{l - n_x}{n_y} = \frac{l!}{n_x!\, n_y!\, n_z!} \tag{3.13}$$

nx, ny, and nz represent the number of x, y, and z indices in the tensor subscript.
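As an illustration of Eq. 3.13, the sketch below enumerates the unique elements of a fully symmetric tensor together with their multiplicities. The function name and the labelling scheme (`Dxxxy` and so on) are illustrative assumptions; the only facts used are the symmetry of the tensor and the multinomial formula above.

```python
from itertools import combinations_with_replacement
from math import factorial


def unique_gdt_elements(rank):
    """List (label, multiplicity) for each unique element of a fully
    symmetric tensor of the given rank in three dimensions."""
    elements = []
    for idx in combinations_with_replacement("xyz", rank):
        nx, ny, nz = idx.count("x"), idx.count("y"), idx.count("z")
        mu = factorial(rank) // (factorial(nx) * factorial(ny) * factorial(nz))
        elements.append(("D" + "".join(idx), mu))
    return elements


rank4 = unique_gdt_elements(4)
n_unique = len(rank4)                       # (4 + 1)(4 + 2) / 2 unique elements
n_total = sum(mu for _, mu in rank4)        # multiplicities add up to 3**4 terms
```

Summing the multiplicities recovers the full $3^l$ terms of Eq. 3.11, which is a convenient sanity check when building the design matrix.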

The diffusion equation can now be written in terms of a rank-$l$ GDT:

$$S = S_0 \exp\left[ -b \sum_{k=1}^{N_l} \mu_k D_k \prod_{p=1}^{l} r_{k(p)} \right] \tag{3.14}$$

where $\mu_k$ is the number of repeated instances of tensor element $D_k$, and $k(p)$ represents the p-th index in the subscript string for $D_k$. As an example, Table 3.1 shows the unique elements for a rank-4 GDT and the number of times they are repeated in Eq. 3.11.

Table 3.1: Elements of a rank-4 generalized DT

To solve for the GDT parameters, the unique tensor elements are written as a column vector, $\mathbf{x}^{(l)}$, where the bracketed superscript defines the tensor rank. Note that $\mathbf{x}^{(2)}$ is exactly the same as that used in Chapter 2 for fitting the standard diffusion tensor.


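$$\mathbf{x}^{(0)} = [D,\, \ln(S_0)]^T \tag{3.15}$$

$$\mathbf{x}^{(2)} = [D_{xx},\, D_{yy},\, D_{zz},\, D_{xy},\, D_{xz},\, D_{yz},\, \ln(S_0)]^T \tag{3.16}$$

$$\mathbf{x}^{(4)} = [D_{xxxx},\, D_{yyyy},\, D_{zzzz},\, D_{xxxy},\, D_{xyyy},\, \ldots,\, \ln(S_0)]^T \tag{3.17}$$

$$\mathbf{x}^{(l)} = [D_{i_1 i_2 i_3 \ldots i_n},\, \ldots,\, \ln(S_0)]^T \tag{3.18}$$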

The experimental design matrix, $\mathbf{B}^{(l)}$, has N rows corresponding to the set of signal measurements and $(N_l + 1)$ columns, one for each unique tensor element plus one for $\ln(S_0)$. The first $N_{b0}$ rows represent the non-diffusion-weighted images, and all elements in the rightmost column are equal to one. The remaining elements are constructed from the $b$, $\mu_k$, and $r_{k(p)}$ terms in Eq. 3.14. The element in the i-th row and the k-th column of the design matrix has the form:

$$-b_i\, \mu_k \prod_{p=1}^{l} r_{i\,k(p)} \tag{3.19}$$

The design matrix for the rank-4 GDT has the following form:

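$$\mathbf{B} = \begin{bmatrix} 0 & 0 & 0 & 0 & 0 & \cdots & 1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & & \vdots \\ -b_i r_{ix}^4 & -b_i r_{iy}^4 & -b_i r_{iz}^4 & -4 b_i r_{ix}^3 r_{iy} & -4 b_i r_{ix}^3 r_{iz} & \cdots & 1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & & \vdots \\ -b_N r_{Nx}^4 & -b_N r_{Ny}^4 & -b_N r_{Nz}^4 & -4 b_N r_{Nx}^3 r_{Ny} & -4 b_N r_{Nx}^3 r_{Nz} & \cdots & 1 \end{bmatrix} \tag{3.20}$$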


From this point forward, fitting a GDT is exactly the same as fitting a standard rank-2 tensor. The magnitude-corrected nonlinear least-squares (MCNLS) fit can be applied to GDTs by minimizing Eq. 3.21, where σ is a noise-estimation parameter measured from a background region of the image. Setting σ to zero results in the standard nonlinear least-squares fit.

$$f_{MCNLS}(\mathbf{x}) = \sum_{i=1}^{N} \left[ S_i - \sqrt{\exp^2(\mathbf{B}_i \mathbf{x}) + \sigma^2} \right]^2 \tag{3.21}$$

The estimated GDT parameters can be converted to the equivalent SH model [59],

or alternatively, the model selection algorithm can be applied to the GDT models

themselves.
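The objective function of Eq. 3.21 is straightforward to evaluate for any design matrix. The sketch below is an illustration rather than the implementation used in this work: the function name, the isotropic two-parameter example and all numerical values are assumptions, and a general-purpose nonlinear optimizer would be needed to actually minimize the function.

```python
import math


def mcnls_objective(x, design_rows, signals, sigma):
    """Sum of squared residuals of Eq. 3.21.

    x           -- parameter vector (tensor elements followed by ln(S0))
    design_rows -- rows B_i of the design matrix, each the same length as x
    signals     -- measured magnitude signals S_i
    sigma       -- noise parameter; sigma = 0 gives the standard NLS objective
    """
    total = 0.0
    for row, s in zip(design_rows, signals):
        predicted = math.exp(sum(bij * xj for bij, xj in zip(row, x)))
        total += (s - math.sqrt(predicted ** 2 + sigma ** 2)) ** 2
    return total


# Hypothetical isotropic example: rows are [-b_i, 1] and x = [D, ln(S0)].
b_values = [0.0, 1000.0, 2000.0, 3000.0]
rows = [[-b, 1.0] for b in b_values]
x_true = [0.7e-3, math.log(100.0)]
sigma = 5.0
# Synthetic magnitude data that already contain the noise-floor bias.
biased = [math.sqrt(math.exp(-b * x_true[0] + x_true[1]) ** 2 + sigma ** 2)
          for b in b_values]
at_truth_mcnls = mcnls_objective(x_true, rows, biased, sigma)  # essentially zero
at_truth_nls = mcnls_objective(x_true, rows, biased, 0.0)      # clearly positive
```

In this idealized example the true parameters minimize the magnitude-corrected objective but not the uncorrected one, which is the mechanism by which the correction reduces bias at low SNR.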

3.3 Methods

3.3.1 Experiment

DTI data was obtained from four healthy volunteers using a 3T GE Signa system.

Voxels were 2.6 mm isotropic, and 42 slices were acquired to cover the entire brain.

10 b0 images and 55 gradient orientations were used. Half of the subjects were imaged

using a quadrature birdcage coil while the other half used an 8-channel head coil.

Three data sets were collected sequentially from each subject with b-values of 1000,

2000, and 3000 s/mm2 using the twice-refocused spin echo sequence [18]. TE and

TR were both minimized for each b-value resulting in TE/TR values of 85/12 000,

97/14 300 and 105/15 900 ms respectively. A brain mask was created by thresholding the b=1000 s/mm² images at ≥ 10σ.

0th, 2nd and 4th-order GDT models were fit to the data using linear least-squares, nonlinear least-squares and magnitude-corrected nonlinear least-squares fits. A series of F-tests was performed to select the most appropriate model for each voxel. F-test thresholds were set independently for each fit type, such that 99% of simulated diffusion tensors with FA≤0.9 were correctly classified by Monte Carlo simulations. This is equivalent to fixing the false positive error rate to ≤1%.
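One simple way to derive such a threshold from simulations is to take an empirical quantile of the F-statistics obtained when the simpler model is known to be true. The sketch below assumes this quantile approach; the function name and the toy input are hypothetical and do not reproduce the simulations performed here.

```python
def threshold_from_null(null_f_values, false_positive_rate=0.01):
    """Smallest simulated value T such that at most `false_positive_rate`
    of the null F-statistics are strictly greater than T."""
    if not null_f_values:
        raise ValueError("need at least one simulated F-statistic")
    ordered = sorted(null_f_values)
    n = len(ordered)
    # Number of simulated values allowed to exceed the threshold.
    allowed = int(false_positive_rate * n + 1e-9)
    return ordered[n - allowed - 1]


# Toy null distribution: 1000 distinct values, so exactly 10 may exceed T.
toy_null = list(range(1, 1001))
t_toy = threshold_from_null(toy_null, false_positive_rate=0.01)
n_above = sum(1 for v in toy_null if v > t_toy)
```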

3.3.2 Simulations

Monte Carlo simulations were performed for a two-tensor model with relative fibre population ratios ranging from 1:9 to 5:5, and separation angles from 10 to 90 degrees. Each tensor had a trace of 2.1 µm²/ms, random orientation, and a Fractional Anisotropy (FA) of 0.9 unless otherwise stated. $N_{b0}$ and the gradient orientations were matched to the experimental setup. Each relative fibre population/separation angle pair was simulated 10 000 times for six permutations of SNR (30 and 70 at b=1000 s/mm²) and b-value (1000, 2000, and 3000 s/mm²). SNR was normalized for minimum TE relative to the b=1000 s/mm² case, as in section 2.3.2. GDT fitting and model selection were performed as in the experiments.
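A single noisy measurement from a two-tensor voxel can be simulated along the following lines. This is a sketch under several stated assumptions: both tensors are axially symmetric, the trace is taken as 2.1×10⁻³ mm²/s so that it is compatible with b-values in s/mm², the noise level is expressed relative to a unit b=0 signal, and all function names are hypothetical.

```python
import math
import random


def axial_eigenvalues(fa, trace):
    """Eigenvalues (parallel, perpendicular) of an axially symmetric tensor
    with the requested fractional anisotropy and trace."""
    mean = trace / 3.0
    a = fa / math.sqrt(3.0 - 2.0 * fa ** 2)
    return mean * (1.0 + 2.0 * a), mean * (1.0 - a)


def fractional_anisotropy(l1, l2, l3):
    """Standard FA definition, used here to check axial_eigenvalues."""
    mean = (l1 + l2 + l3) / 3.0
    num = (l1 - mean) ** 2 + (l2 - mean) ** 2 + (l3 - mean) ** 2
    den = l1 ** 2 + l2 ** 2 + l3 ** 2
    return math.sqrt(1.5 * num / den)


def two_tensor_signal(b, g, v1, v2, fraction, fa=0.9, trace=2.1e-3, s0=1.0):
    """Noise-free signal for unit gradient direction g from two axially
    symmetric tensors with unit-vector axes v1 and v2."""
    lam_par, lam_perp = axial_eigenvalues(fa, trace)

    def adc(v):
        cos_angle = sum(gi * vi for gi, vi in zip(g, v))
        return lam_perp + (lam_par - lam_perp) * cos_angle ** 2

    return s0 * (fraction * math.exp(-b * adc(v1))
                 + (1.0 - fraction) * math.exp(-b * adc(v2)))


def add_rician_noise(signal, sigma, rng):
    """Magnitude of the signal after adding Gaussian noise to both channels."""
    real = signal + rng.gauss(0.0, sigma)
    imag = rng.gauss(0.0, sigma)
    return math.hypot(real, imag)


# Example: 90-degree crossing with a 5:5 population ratio at b = 3000 s/mm^2.
rng = random.Random(0)
g = (1.0, 0.0, 0.0)
clean = two_tensor_signal(3000.0, g, (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), fraction=0.5)
noisy = add_rician_noise(clean, sigma=1.0 / 30.0, rng=rng)
```

Repeating the last two lines over all gradient directions and many noise realizations gives the kind of data set to which the fitting and model selection steps are then applied.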

3.4 Results

3.4.1 Experiment

Fig. 3.5 demonstrates three regions with clusters of voxels classified as 4th-order at b=3000 s/mm² using the MCNLS fit and the 8-channel coil. Label 1 points to a region in the pons where the inferior-superior pyramidal tracts are crossed by the transpontine tracts, which run in a lateral direction. Label 2 indicates the crossing of the anterior-posterior optic radiation and the lateral fibres of the corpus callosum. Label 3 points to fibre crossings in the corona radiata. The colour-coded direction maps in the first column help to anatomically identify the crossing fibre tracts. The second column shows the $F_{2,4}$-statistic maps, which compare the 2nd and 4th-order models. The higher this F-statistic (i.e. the brighter the voxel), the more likely that the voxel represented non-Gaussian diffusion. The labeled regions demonstrate elevated F-statistics in the neighbouring voxels. These F-statistic maps were thresholded to produce the classification maps in column three. Images on the far right show sagittal and coronal views of these locations.

All of the labeled regions were clearly present for both subjects using the 8-channel head coil. Figure 3.6a, c and e show the same slices from one of the subjects imaged with the quadrature coil. 4th-order clusters at regions 1 and 3 are visible, though noticeably smaller. The cluster corresponding to label 2 was absent from the quadrature coil data.

Fig. 3.7 shows classification maps for the same slices in Fig. 3.5, but for three different b-values. Clearly, the size of the 4th-order clusters increases with b. Across the entire brain, the average proportion of voxels classified as 4th-order was 1.9%, 3.9%, and 3.2% for b=1000, 2000, and 3000 s/mm² using the quadrature coil. For the 8-channel coil, the mean 4th-order classification rates were 4.4%, 10.8%, and 11.3% at b=1000, 2000, and 3000 s/mm².

The average SNR across the entire brain masks for the b0 images was 29, 25, and 24 at b=1000, 2000, and 3000 s/mm² using the quadrature coil, and 71, 59, and 55 for the 8-channel coil. The reduction in SNR at the higher b-values can be attributed to T2-relaxation effects resulting from the longer echo times. Fig. 3.8 demonstrates noisy ADC profiles for the voxels labeled in Fig. 3.5.


Figure 3.5: Selected locations demonstrating clusters of voxels labelled as 4th-order. The top row is slice 7 (a-e). The middle row is slice 15 (f-j), and the bottom row is slice 30 (k-o). All images are from the b=3000 s/mm² data set using the MCNLS fit and the 8-channel coil. The first column (a,f,k) shows colour-coded directional maps of the primary eigenvector. Column two (b,g,l) shows the $F_{2,4}$-statistics. Column three (c,h,m) gives the classification maps and the fourth column shows sagittal (d,i,n) and coronal slices (e,j,o) of the labeled voxel locations.


Figure 3.6: Voxel classification maps for the same three slices as in Fig. 3.5 at b=3000 s/mm² using MCNLS fitting and (a,c,e) the quadrature birdcage coil (SNR≈24), (b,d,f) the 8-channel coil (SNR≈55).


Figure 3.7: Voxel classification maps for the same three slices as in Fig. 3.5 for (a,d,g) b=1000, (b,e,h) 2000, and (c,f,i) 3000 s/mm² using the MCNLS fit and the 8-channel head coil.


Figure 3.8: Noisy ADC profiles for voxels labelled in Fig. 3.5 at 3 different b-values using the 8-channel head coil. (a-c) are the voxel labeled as 1, (e-g) 2, and (h-j) 3. Note the increasing complexity of these shapes as the b-value is increased.


3.4.2 Simulations

Fig. 3.9 shows the percentage of crossing fibres correctly identified as 4th-order for three different b-values (SNR=30 at b=1000 s/mm²). This corresponds roughly to the experimental SNR measured from the images acquired with the quadrature coil. Fig. 3.10 shows the corresponding results for a reference SNR of 70, similar to the 8-channel coil experiments.

At both SNR levels, all fitting algorithms showed similar performance at b=1000 s/mm². The LLS-based classifier improved at b=2000 s/mm² but at b=3000 s/mm² it was essentially ineffective. Both NLS and MCNLS improved dramatically when the b-value was increased to 2000 s/mm², though MCNLS showed a slight advantage over NLS at low separation angles. When the b-value was increased to 3000 s/mm², the advantage of the higher b-value appeared to be partially cancelled by the loss in SNR. NLS performance was equivalent to, or slightly reduced relative to, b=2000 s/mm² in all cases.

For MCNLS, the results were mixed. There was slightly better performance for the highly unbalanced fibre populations at b=2000 s/mm², while at b=3000 s/mm², the classification algorithm seemed to do moderately better for smaller separation angles. The performance gain between b=2000 and b=3000 s/mm² was very small relative to the difference between b=1000 and 2000 s/mm².

The difference between the SNR=30 and SNR=70 results clearly demonstrates the SNR-dependence of the algorithm. Higher SNR allowed the classifier to reliably detect smaller separation angles and unbalanced volume fractions. Fig. 3.11 shows detection rates using the MCNLS algorithm across the three b-values when the Fractional Anisotropy (FA) of the component tensors was reduced from 0.9 to 0.7. The lower anisotropy dramatically reduced the ability to detect crossing fibres and illustrates the need for high SNR in this application.


Figure 3.9: Percentage of crossing fibres correctly classified as 4th-order in simulations for SNR=30 at b=1000 s/mm². SNR at b=2000 and 3000 s/mm² was normalized for minimum TE as in section 2.3.2. F-test thresholds were set to limit the false positive rate to less than 1%. Results for (a) b=1000, (b) 2000, and (c) 3000 s/mm² using LLS fitting. (d-f) Analogous results using NLS and (g-i) MCNLS fitting. Results are accurate to ±2%.


Figure 3.10: Percentage of crossing fibres correctly classified as 4th-order in simulations for SNR=70 at b=1000 s/mm². This corresponds approximately to the experimental SNR using the 8-channel head coil. SNR at b=2000 and 3000 s/mm² was normalized for minimum TE. Results for (a) b=1000, (b) 2000, and (c) 3000 s/mm² using LLS fitting. (d-f) Analogous results using NLS and (g-i) MCNLS fitting. Results are accurate to ±2%.


Figure 3.11: Percentage of fibre crossings correctly classified as 4th-order when the Fractional Anisotropy (FA) of the component tensors was reduced to 0.7. Results at (a) b=1000 (SNR=30), (b) 2000 (SNR=25.7), and (c) 3000 s/mm² (SNR=23.1) using the MCNLS fit. SNR was normalized for minimum TE. (d-f) Analogous results for SNR=70 (at b=1000 s/mm²). Results are accurate to ±2%.


3.5 Discussion

The model selection framework introduced by Alexander et al. [31] provides an automated statistical approach to identify voxels that cannot be adequately represented by the diffusion tensor. This has important applications both for avoiding misinterpretation of DTI results and providing evidence to justify more complex modeling/reconstruction procedures. The primary limitation of this method, one that it shares with the majority of procedures for resolving complex fibre architectures, is its reduced performance for fibres with small separation angles and/or unbalanced relative fibre populations.

It has been previously reported that increasing the b-value can accentuate non-Gaussian diffusion processes [19, 22]. A major reason that higher b-values have not been widely adopted for this application is the limitation imposed by linear least-squares fitting algorithms. The primary contribution of the work presented in this chapter is the development of nonlinear and magnitude-corrected fitting algorithms for the Spherical Harmonic and Generalized Diffusion Tensor models. These algorithms enable robust modeling of higher-order diffusion processes at b-values much greater than 1000 s/mm².

The result of applying Magnitude-Corrected Nonlinear Least-Squares (MCNLS) fitting to Alexander et al.'s model selection framework is a significant improvement in the ability to detect non-Gaussian diffusion, especially for voxels with reduced separation angles and/or unbalanced volume fractions. The F-test thresholds were very conservative (false positive rate ≤1%), and as such, the model order is generally underestimated in the results presented here. It is entirely possible to trade increased detection of crossing fibres against more false positives (more diffusion tensors falsely classified as 4th-order). More sophisticated thresholding methods may also be desirable to account for the multiple comparisons being performed [61].

The poor performance of the LLS-based classifier was the result of its F-test threshold increasing with the b-value to account for voxels that appear to show non-Gaussian diffusion due to fitting error. This also explains the slight advantage of magnitude-corrected nonlinear fitting versus standard nonlinear least-squares. The primary advantage to MCNLS is that it can reduce false positives, or for the same number of false positives, can improve the detection rate. In addition, Chapter 2 showed that at high b-values, MCNLS fitting reduces the error in estimating tensor-derived parameters relative to LLS, including FA and fibre orientation.

Increasing the b-value accentuates non-Gaussian diffusion, but increasing it too much leads to SNR losses. Although it can be concluded from these results that the b-value should be greater than 1000 s/mm², a specific "optimal" b-value for detecting non-Gaussian diffusion cannot be determined from these data. The SNR/b-value relationship is primarily the result of exponential signal decay, exp(−bD), but there is also a contribution from the longer echo time required to achieve higher b-values. This effect is hardware dependent and improved gradients may alleviate these SNR losses somewhat.

Classifier performance was greatly influenced by SNR. This effect was demonstrated in simulations and experimentally by the differences between the subjects imaged with the quadrature and 8-channel head coils. Although it is possible to increase SNR through signal averaging, time constraints usually force a trade-off between the number of directions and the number of repeated acquisitions. Optimization of the b-value, SNR, and number of directions is beyond the scope of this work, but it deserves further study.

Using the 8-channel head coil at b=3000 s/mm², 11.4% of voxels in the brain were classified as 4th-order on average. This is almost three times the number of 4th-order voxels classified at b=1000 s/mm² (4.4%). The percentage of 4th-order voxels at b=1000 s/mm² was comparable to the 5% reported by Alexander et al. [31]. Experimental differences (SNR, number of gradient directions, etc.) and differences in the model selection algorithm (fitting algorithm and F-test thresholds) do not allow for a direct comparison of results, but the same clusters of 4th-order voxels were seen in both studies.

Even with these improved fitting techniques and higher b-values, the simulation results suggest that many voxels with complex fibre geometries may go undetected, especially if the component fibres have low anisotropy, low separation angles, and/or unbalanced fibre populations. Relatively speaking, this class of false negatives poses a less significant problem for DTI analysis because they have a much lower impact on estimates of anisotropy and fibre orientation than crossings with high separation angles and well balanced fibre populations.

Imaging at higher b-values has several implications that should be noted. Non-monoexponential diffusion resulting from distinct populations of crossing fibres contributes to the non-Gaussian profiles detected by this algorithm. Even for those voxels without crossing fibres, non-monoexponential diffusion can become significant at high b-values due to the multiple water compartments in biological tissue [7]. This means that the shapes of ADC profiles vary with the b-value. As a consequence, tensor parameters can also change with b. For this reason, caution should be exercised when comparing studies that use different b-values.

The extended diffusion times necessary to achieve high b-values also increase the relative effects of restricted diffusion and exchange [5]. While these effects may be significant at very high b-values, they are unlikely to have a major impact on the current experiment since the increase in diffusion time between b=1000 and b=3000 s/mm² was less than 20 ms, corresponding to an average net water displacement of a few microns. The effects of restricted diffusion and exchange may limit the upper value of b, but once again, a higher maximum gradient amplitude could overcome this limitation.

Finally, this study looked specifically at non-Gaussian diffusion profiles caused by complex fibre architectures (i.e. partial volume effects). There are several other sources that can potentially contribute to perceived non-Gaussian diffusion. These include cardiac pulsatility, head motion, and eddy currents. While there are certainly many established ways to reduce the overall impact of these contributions, including gating and image registration, others deserve consideration. First of all, robust fitting techniques, which are less susceptible to outliers, have been applied to the fitting of diffusion tensors [52, 53]. There is no reason that they could not also be extended to higher-order models. Secondly, the antipodal symmetry of the ADC profile may provide a means for separating true non-Gaussian diffusion effects from these confounding artifacts.

The results of this study show that magnitude-corrected nonlinear fitting allows for improved estimation of the SH and GDT models at high b-values. Combined with the classification scheme of Alexander et al. [31], this significantly improves the ability to detect regions of complex fibre architecture for which the diffusion tensor is insufficient. This information is critical for correctly interpreting DTI results, and in providing justification for the use of higher-order models on a voxel-by-voxel basis. This fitting technique should also be useful to other areas of the diffusion community that utilize the SH and/or GDT basis functions, e.g. spherical deconvolution and q-ball imaging.

Chapter 4

Conclusions and future work

Chapter 2 compared several algorithms used to fit the diffusion tensor model, while Chapter 3 addressed the problem of model validation and the fitting of higher-order models. This final chapter will describe the overarching conclusions that can be drawn from this work and attempt to highlight specific areas that require further study.

4.1 Magnitude-correction

Both of the preceding chapters shared a common approach to the problem of Rician noise, termed here magnitude-corrected nonlinear least-squares fitting. Although research on bias in magnitude MR images has a long history [62, 63, 64, 39, 65, 66], relatively little attention has been paid to this issue in the context of DTI. Dietrich et al. [67] examined the effect of magnitude bias in estimating ADC values, while DTI artifacts caused by Rician noise were first reported by Jones and Basser [32]. The latter study examined bias in the measurement of Apparent Diffusion Coefficient profiles at low SNR and proposed a simple correction method that could partially compensate for this artifact. This method was adapted to fit the diffusion tensor in Chapter 2 (and higher-order models in Chapter 3) rather than single ADC values. The noise parameter was based on characteristics of the image background and fixed for model fitting. While simulation studies have shown that this approach is effective in certain situations, it does have several important limitations. This section will describe some of these limitations and suggest possible ways in which they may be addressed.

The following objective function is minimized in MCNLS fitting:

$$\sum_{i=1}^{N} \left[ S_i - \sqrt{\hat{S}_i^{\,2} + \sigma^2} \right]^2 \tag{4.1}$$

where $S_i$ is the magnitude of the measured signal, $\hat{S}_i$ is the signal estimate based on the diffusion model, and σ is the standard deviation of the Gaussian noise, assumed to be equivalent in the real and imaginary channels before application of the magnitude operation. This method is similar in nature to an approximation originally proposed by Gudbjartsson and Patz [64], which estimated signal amplitude, A, from the magnitude signal, M. Their method, demonstrated in Eq. 4.2, can be rearranged to solve for M (Eq. 4.3). The MCNLS correction scheme is based on a substitution of this result into the standard least-squares diffusion signal equation (Eq. 2.12).

$$A = \sqrt{\left| M^2 - \sigma^2 \right|} \tag{4.2}$$

$$M = \sqrt{A^2 + \sigma^2} \tag{4.3}$$

The Gudbjartsson and Patz approximation is known to overestimate the mean signal

at very low SNR (≤1), and therefore the MCNLS algorithm could reasonably be

expected to show reduced performance when one or more of the signal measurements


are below this level.
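The behaviour of Eqs. 4.2 and 4.3 at the noise floor can be illustrated with a few lines of code. The function names are hypothetical, and the example applies Eq. 4.2 to the mean magnitude of a pure-noise voxel (the Rayleigh mean), which is one simple way to see the residual bias.

```python
import math


def gp_amplitude(m, sigma):
    """Eq. 4.2: estimate the signal amplitude A from a magnitude value M."""
    return math.sqrt(abs(m * m - sigma * sigma))


def gp_magnitude(a, sigma):
    """Eq. 4.3: magnitude M predicted for a true amplitude A."""
    return math.sqrt(a * a + sigma * sigma)


sigma = 1.0
# Pure-noise voxel (A = 0): the mean magnitude is the Rayleigh mean.
rayleigh_mean = sigma * math.sqrt(math.pi / 2.0)
a_from_noise = gp_amplitude(rayleigh_mean, sigma)   # about 0.76 * sigma, not 0
# At higher signal levels the two equations are exact inverses of each other.
round_trip = gp_amplitude(gp_magnitude(3.0, sigma), sigma)
```

The non-zero amplitude recovered from pure noise shows why the approximation should not be relied upon when individual measurements fall near or below the noise level.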

Several other correction schemes have been proposed in the literature, and it may be possible to integrate one or more into an improved fitting approach for diffusion models. Miller and Joseph described a correction method for power images (i.e. magnitude images that have been squared) [66]. This correction is described by the following equation:

$$\overline{M^2} = A^2 + 2\sigma^2 \tag{4.4}$$

Note that $\overline{M^2} \neq \overline{M}^2$, and therefore applying this correction scheme to diffusion models necessitates fitting the power signal directly. This can be accomplished using the equation [66]:

$$S^2 = S_0^2 \exp(-2bD) \tag{4.5}$$

The advantage to this approach is that, in theory, extracting the true signal even at very low SNR is possible. The problem is that the distribution of the corrected power signal at low SNR is markedly non-Gaussian, making it unsuitable for least-squares fitting [64]. Furthermore, this approach is not valid for multiple receiver coils or multiexponential signals [66].
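The power-image relation of Eq. 4.4 can be checked numerically with a small Monte Carlo experiment. The sketch below is illustrative only: the function name, sample size and seed are arbitrary choices, and a single receiver channel with equal noise in the real and imaginary parts is assumed.

```python
import random


def mean_power_corrected(a, sigma, n_samples, seed=0):
    """Monte Carlo estimate of A^2 obtained by averaging M^2 - 2*sigma^2
    over noisy magnitude samples with true amplitude a."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        real = a + rng.gauss(0.0, sigma)
        imag = rng.gauss(0.0, sigma)
        total += real * real + imag * imag - 2.0 * sigma * sigma
    return total / n_samples


# SNR = 1: the corrected mean power should still be close to A^2 = 1.
estimate = mean_power_corrected(a=1.0, sigma=1.0, n_samples=20000)
```

The average is unbiased even at SNR = 1, but individual corrected samples scatter widely and asymmetrically around it, which is the non-Gaussian behaviour noted above.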

Dietrich et al. [67] described an exact means for correcting magnitude images

based on the theoretical Rician distribution:

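$$\overline{M} = \sigma\sqrt{\frac{\pi}{2}}\, \exp\left(-\frac{A^2}{4\sigma^2}\right) \times \left[ \left(1 + \frac{A^2}{2\sigma^2}\right) I_0\left(\frac{A^2}{4\sigma^2}\right) + \frac{A^2}{2\sigma^2}\, I_1\left(\frac{A^2}{4\sigma^2}\right) \right] \tag{4.6}$$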

where $I_0$ and $I_1$ are the zeroth- and first-order modified Bessel functions of the first kind. By measuring the average of the magnitude signal, $\overline{M}$, as well as the noise parameter from the image background, they showed that it is possible to obtain a numerical estimate of A. This result could be applied to the fitting of diffusion models by implementing a method based on the following:

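$$\min \sum_{i=1}^{N} \left[ S_i - \sigma\sqrt{\frac{\pi}{2}}\, \exp\left(-\frac{\hat{S}_i^{\,2}}{4\sigma^2}\right) \times \left[ \left(1 + \frac{\hat{S}_i^{\,2}}{2\sigma^2}\right) I_0\left(\frac{\hat{S}_i^{\,2}}{4\sigma^2}\right) + \frac{\hat{S}_i^{\,2}}{2\sigma^2}\, I_1\left(\frac{\hat{S}_i^{\,2}}{4\sigma^2}\right) \right] \right]^2 \tag{4.7}$$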

though this has not yet been attempted.
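For readers who wish to experiment with this idea, the expected magnitude of Eq. 4.6 can be evaluated with the standard library alone. The sketch below is an assumption-laden illustration, not a tested fitting routine: the function names are hypothetical, the modified Bessel functions are computed from their power series, and the exponential factor limits the usable range to roughly A/σ below 50 before overflow occurs.

```python
import math


def bessel_i(order, x):
    """Modified Bessel function of the first kind, I_order(x), for x >= 0,
    evaluated from its power series (adequate for moderate x)."""
    term = (x / 2.0) ** order / math.factorial(order)
    total = term
    k = 0
    while term > 1e-17 * total:
        k += 1
        term *= (x / 2.0) ** 2 / (k * (k + order))
        total += term
    return total


def rician_mean(a, sigma):
    """Mean magnitude of Eq. 4.6 for true amplitude a and noise level sigma."""
    z = a * a / (4.0 * sigma * sigma)
    u = a * a / (2.0 * sigma * sigma)
    return (sigma * math.sqrt(math.pi / 2.0) * math.exp(-z)
            * ((1.0 + u) * bessel_i(0, z) + u * bessel_i(1, z)))


# Compare the exact mean with the simpler expression of Eq. 4.3 (sigma = 1).
for snr in (0.0, 1.0, 3.0, 10.0):
    exact = rician_mean(snr, 1.0)
    approx = math.sqrt(snr ** 2 + 1.0)
    print(f"A/sigma={snr:4.1f}  Eq. 4.6={exact:.4f}  Eq. 4.3={approx:.4f}")
```

Replacing the square-root term in the MCNLS objective with `rician_mean` evaluated at the model prediction would give an objective of the form of Eq. 4.7.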

Koay and Basser recently presented a similar correction scheme which can simultaneously estimate the unbiased mean and variance from magnitude images [65]. This approach requires estimates of both the mean and standard deviation of the magnitude signal, which necessitates multiple acquisitions. It is therefore not practical for the purposes of this work.

One way to avoid the issue of magnitude bias altogether is to fit the complex signal itself. Unfortunately, complex data and the necessary reconstruction algorithms are not widely available on clinical scanners. The use of complex diffusion-weighted images would not only solve the problems associated with magnitude bias, but would also enable the measurement of non-symmetric diffusion profiles [68]. If the demand for complex DTI data were to increase, vendors might be convinced to provide this capability.

4.2 Noise characterization

All of the prospects for magnitude correction that have been described to this point

rely on accurate characterization of signal noise. If it is assumed that the signal in the


background region of the images follows a Rayleigh distribution, the noise parameter

can be calculated from either the mean or standard deviation of the signal [39]:

$$\mathrm{mean}(S_{background}) = \sqrt{\frac{\pi}{2}}\,\sigma \tag{4.8}$$

$$\mathrm{std}(S_{background}) = \sqrt{\frac{4 - \pi}{2}}\,\sigma \tag{4.9}$$

For the 4-channel, 3T GE Signa system used in this study, the background signal

showed a distribution that was in relatively good agreement with the Rayleigh dis-

tribution. Another assumption was also made in this work: that the noise level was

equivalent for all voxels in the image set.
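Inverting Eqs. 4.8 and 4.9 gives two independent estimates of the noise parameter from a background region. The sketch below demonstrates this on synthetic Rayleigh-distributed values; the function names, seed and sample size are illustrative assumptions, and in practice `background` would hold voxel intensities from a signal-free region of the image.

```python
import math
import random
import statistics


def sigma_from_background_mean(values):
    """Invert Eq. 4.8: sigma = mean / sqrt(pi / 2)."""
    return statistics.mean(values) / math.sqrt(math.pi / 2.0)


def sigma_from_background_std(values):
    """Invert Eq. 4.9: sigma = std / sqrt((4 - pi) / 2)."""
    return statistics.stdev(values) / math.sqrt((4.0 - math.pi) / 2.0)


# Synthetic background: magnitude of two Gaussian channels with sigma = 5.
rng = random.Random(1)
true_sigma = 5.0
background = [math.hypot(rng.gauss(0.0, true_sigma), rng.gauss(0.0, true_sigma))
              for _ in range(20000)]
sigma_a = sigma_from_background_mean(background)
sigma_b = sigma_from_background_std(background)
```

Agreement between the two estimates is a quick check of the Rayleigh assumption; a marked discrepancy suggests that the background does not follow this distribution.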

There are several scenarios under which these assumptions may be invalid. Images reconstructed from multiple receiver coils show spatial variation in the noise level and background signal characteristics [69]. A similar situation exists for parallel imaging techniques [70, 71], meaning that more sophisticated noise estimation methods are required, including those that can estimate noise on a per-voxel basis [72]. It must be stressed that these and any other changes that affect the noise properties of images (e.g. reconstruction filters) may also force modifications to the magnitude-correction algorithm. This applies not only to image acquisition and reconstruction, but also to post-processing steps. It is common practice to perform image registration and eddy-current correction on diffusion-weighted images. Rohde et al. showed that the interpolation step inherent in these processes alters signal variance properties [73]. This implies that estimating noise properties from post-processed images can lead to erroneous results.


4.3 Improving SNR

Robust detection of voxels with complex fibre architectures requires a high Signal-to-Noise Ratio. There are several strategies which could potentially increase SNR. Improved imaging hardware (i.e. higher field magnets, stronger gradients, better coils) offers one avenue towards achieving this goal. In addition, parallel imaging techniques can reduce imaging time and therefore allow for a greater degree of temporal averaging. Parallel imaging also has the added benefit of mitigating several bandwidth-related image artifacts [71]. As the quality of diffusion-weighted images improves, so too will the performance of the proposed techniques.

4.4 Tractography

The model selection algorithm in Chapter 3 was used to construct classification maps identifying voxels for which the diffusion tensor was unsuitable. This was achieved through an automated model selection based on F-statistics. These F-statistics could also be integrated into an adaptive tractography algorithm that would modify its behaviour relative to the evidence for the different models. For example, strong support for the isotropic model could be used as a stopping criterion. Evidence for a 4th-order diffusion model could result in the application of a two-tensor model for selected voxels within the brain.

Tractography offers a complementary source of information which may help to identify sites of fibre crossings. Because the fibre bundles in each voxel do not exist in isolation (they connect to fibres in the surrounding voxels), fibre orientations from voxels in the immediate neighbourhood may provide additional data which could be integrated into the model selection algorithm. Other groups have attempted to resolve complex fibre geometries based on this principle using Independent Component Analysis [74].

4.5 Conclusions

The results of this work suggest that it is time to abandon linear least-squares as a method for fitting diffusion models. In the case of the diffusion tensor, nonlinear algorithms offer reduced rotational bias and reduced uncertainty in Fractional Anisotropy and fibre orientation. The only real drawback is added computational time, which at this point is equivalent to or less than the time of image acquisition.

Nonlinear fitting has been shown to be especially important for high diffusivity
and high b-value applications. Furthermore, a simple modification to the nonlinear
fitting algorithm reduces bias caused by magnitude images with low SNR. This MC-NLS
fitting algorithm is primarily intended for high diffusivity and/or high b-value
applications, where low SNR is commonly encountered. In addition to demonstrating
the advantages of nonlinear fitting, the results of this work show that optimal
imaging parameters (e.g. number of gradient directions and b-value) depend on the
chosen fitting algorithm, and therefore need to be adapted accordingly.
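The effect of the magnitude noise floor on a fitted diffusivity can be illustrated with a small, deterministic sketch. It is not the MC-NLS implementation used in this work: it assumes a single diffusivity, a known unweighted signal S0, synthetic data set equal to the root-mean-square magnitude sqrt(S^2 + 2*sigma^2) (which follows from the second moment of the Rician distribution), and a golden-section search standing in for the nonlinear optimizer. All function names and parameter values are hypothetical.

```python
import math


def rms_magnitude(signal, sigma):
    """RMS magnitude for noise of std sigma in the real and imaginary channels."""
    return math.sqrt(signal ** 2 + 2.0 * sigma ** 2)


def sse(diffusivity, b_values, data, s0, sigma):
    """Sum of squared residuals; sigma = 0 gives the uncorrected model."""
    total = 0.0
    for b, measured in zip(b_values, data):
        predicted = rms_magnitude(s0 * math.exp(-b * diffusivity), sigma)
        total += (measured - predicted) ** 2
    return total


def golden_section_minimize(f, lo, hi, tol=1e-9):
    """Minimise a one-dimensional function on the interval [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi, d = d, c
            c = hi - inv_phi * (hi - lo)
        else:
            lo, c = c, d
            d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)


def fit_diffusivity(b_values, data, s0, sigma, d_max=5.0e-3):
    """Nonlinear least-squares estimate of the diffusivity (mm^2/s)."""
    objective = lambda d: sse(d, b_values, data, s0, sigma)
    return golden_section_minimize(objective, 0.0, d_max)


if __name__ == "__main__":
    d_true, s0, sigma = 1.0e-3, 1.0, 0.05       # illustrative values
    b_values = [500.0, 1000.0, 2000.0, 3000.0]  # s/mm^2
    data = [rms_magnitude(s0 * math.exp(-b * d_true), sigma) for b in b_values]
    d_plain = fit_diffusivity(b_values, data, s0, sigma=0.0)
    d_corrected = fit_diffusivity(b_values, data, s0, sigma=sigma)
    print("uncorrected:", d_plain, "corrected:", d_corrected)
```

In this idealised setting the uncorrected fit underestimates the diffusivity because the noise floor inflates the high b-value signals, while the fit that includes the noise term recovers the value used to generate the data.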

Finally, nonlinear fitting and magnitude-correction have been extended to higher-order
models including the spherical harmonic series and generalized diffusion tensors.
This enables improved fitting performance at high b-values, facilitating the
identification of voxels with complex fibre geometries. Looking ahead, beyond the diffusion
tensor, it seems likely that attempts at resolving these complex fibre trajectories will
increasingly involve high b-values, for which nonlinear fitting methods are essential.
The recommendations presented in this thesis, as a whole, will improve the overall
robustness of Diffusion Tensor Imaging, inspiring greater confidence in results and
therefore increasing the utility of DTI as a research and diagnostic tool.

Bibliography

[1] A. Einstein. Investigations on the Theory of the Brownian Movement. Courier Dover Publications, 1956.

[2] E. L. Hahn. Spin echoes. Phys Rev, 80(4):580–594, 1950.

[3] H. Y. Carr and E. M. Purcell. Effects of diffusion on free precession in Nuclear Magnetic Resonance Experiments. Phys Rev, 94(3):630, 1954.

[4] E. O. Stejskal and J. E. Tanner. Spin diffusion measurements: Spin echoes in the presence of a time-dependent field gradient. J. Chem. Phys., 42(1):288–292, 1965.

[5] P. T. Callaghan. Principles of Nuclear Magnetic Resonance Microscopy. Oxford University Press, USA, 1993.

[6] D. Le Bihan, E. Breton, D. Lallemand, P. Grenier, E. Cabanis, and M. Laval-Jeantet. MR imaging of intravoxel incoherent motions: application to diffusion and perfusion in neurologic disorders. Radiology, 161(2):401–7, 1986.

[7] Y. Cohen and Y. Assaf. High b-value q-space analyzed diffusion-weighted MRS and MRI in neuronal tissues - a technical review. NMR Biomed, 15(7-8):516–542, 2002.


[8] M. E. Moseley, Y. Cohen, J. Mintorovitch, L. Chileuitt, H. Shimizu, J. Kucharczyk, M. F. Wendland, and P. R. Weinstein. Early detection of regional cerebral ischemia in cats: comparison of diffusion- and T2-weighted MRI and spectroscopy. Magn Reson Med, 14(2):330–46, 1990.

[9] P. J. Basser, J. Mattiello, and D. Le Bihan. Estimation of the effective self-diffusion tensor from the NMR spin echo. J Magn Reson, Ser B, 103(3):247, 1994.

[10] N. G. Papadakis, D. A. Xing, G. C. Houston, J. M. Smith, M. I. Smith, M. F. James, A. A. Parsons, C. L.-H. Huang, L. D. Hall, and T. A. Carpenter. A study of rotationally invariant and symmetric indices of diffusion anisotropy. Magn Reson Imaging, 17(6):881, 1999.

[11] S. Pajevic and C. Pierpaoli. Color schemes to represent the orientation of anisotropic tissues from diffusion tensor data: application to white matter fiber tract mapping in the human brain. Magn Reson Med, 42(3):526–540, 1999.

[12] S. Mori, B. J. Crain, V. P. Chacko, and P. C. van Zijl. Three-dimensional tracking of axonal projections in the brain by magnetic resonance imaging. Ann Neurol, 45(2):265–9, 1999.

[13] G. J. M. Parker, H. A. Haroon, and C. A. M. Wheeler-Kingshott. A framework for a streamline-based probabilistic index of connectivity (PICo) using a structural interpretation of MRI diffusion measurements. J Magn Reson Imaging, 18(2):242, 2003.

[14] M. A. Horsfield and D. K. Jones. Applications of diffusion-weighted and diffusion tensor MRI to white matter diseases - a review. NMR Biomed, 15(7-8):570, 2002.


[15] J. Neil, J. Miller, P. Mukherjee, and P. S. Hueppi. Diffusion tensor imaging of normal and injured developing human brain - a technical review. NMR Biomed, 15(7-8):543–552, 2002.

[16] D. H. Salat, D. S. Tuch, D. N. Greve, A. J. W. van der Kouwe, N. D. Hevelone, A. K. Zaleta, B. R. Rosen, B. Fischl, S. Corkin, H. D. Rosas, et al. Age-related alterations in white matter microstructure measured by diffusion tensor imaging. Neurobiology of Aging, 26(8):1215–1227, 2005.

[17] M. A. Bernstein, K. F. King, and X. J. Zhou. Handbook of MRI Pulse Sequences. Academic Press, 2004.

[18] T. G. Reese, O. Heid, R. M. Weisskoff, and V. J. Wedeen. Reduction of eddy-current-induced distortion in diffusion MRI using a twice-refocused spin echo. Magn Reson Med, 49(1):177–182, 2003.

[19] A. L. Alexander, K. M. Hasan, M. Lazar, J. S. Tsuruda, and D. L. Parker. Analysis of partial volume effects in diffusion-tensor MRI. Magn Reson Med, 45(5):770, 2001.

[20] D. K. Jones, M. A. Horsfield, and A. Simmons. Optimal strategies for measuring diffusion in anisotropic systems by magnetic resonance imaging. Magn Reson Med, 42(3):515, 1999.

[21] D. C. Alexander and G. J. Barker. Optimal imaging parameters for fiber-orientation estimation in diffusion MRI. Neuroimage, 27(2):357–367, 2005.

[22] L. R. Frank. Anisotropy in high angular resolution diffusion-weighted MRI. Magn Reson Med, 45(6):935–939, 2001.


[23] D. S. Tuch, R. M. Weisskoff, J. W. Belliveau, and V. J. Wedeen. High angular resolution diffusion imaging of the human brain. In Proc. ISMRM 7th Annual Meeting, page 321, 1999.

[24] P. G. Batchelor, D. Atkinson, D. L. G. Hill, F. Calamante, and A. Connelly. Optimisation of direction schemes for tensor imaging: Rotational invariance of the normal matrix. In Workshop on Diffusion MRI: Biophysical Issues, pages 230–232, 2002.

[25] K. M. Hasan, D. L. Parker, and A. L. Alexander. Comparison of gradient encoding schemes for diffusion-tensor MRI. J Magn Reson Imaging, 13(5):769, 2001.

[26] D. K. Jones. The effect of gradient sampling schemes on measures derived from diffusion tensor MRI: A Monte Carlo study. Magn Reson Med, 51(4):807, 2004.

[27] N. G. Papadakis, C. D. Murrills, L. D. Hall, C. L. H. Huang, and T. A. Carpenter. Minimal gradient encoding for robust estimation of diffusion anisotropy. Magn Reson Imaging, 18(6):671, 2000.

[28] S. Skare, M. Hedehus, M. E. Moseley, and T. Li. Condition number as a measure of noise performance of diffusion tensor data acquisition schemes with MRI. J Magn Reson, 147(2):340, 2000.

[29] D. K. Jones, C. G. Koay, and P. J. Basser. It is not possible to design a rotationally invariant sampling scheme for DT-MRI. In Proc. ISMRM 15th Annual Meeting, page 5, 2007.

[30] R. Muthupallai, C. A. Holder, A. W. Song, and W. T. Dixon. In Proc. ISMRM 7th Annual Meeting, page 1825, 1999.


[31] D. C. Alexander, G. J. Barker, and S. R. Arridge. Detection and modeling of non-Gaussian apparent diffusion coefficient profiles in human brain data. Magn Reson Med, 48(2):331–340, 2002.

[32] D. K. Jones and P. J. Basser. "Squashing peanuts and smashing pumpkins"?: How noise distorts diffusion-weighted MR data. Magn Reson Med, 52(5):979–993, 2004.

[33] H. Jiang, P. C. M. van Zijl, J. Kim, G. D. Pearlson, and S. Mori. DtiStudio: resource program for diffusion tensor computation and fiber bundle tracking. Computer Methods and Programs in Biomedicine, 81(2):106–116, 2006.

[34] S. M. Smith, M. Jenkinson, M. W. Woolrich, C. F. Beckmann, T. E. J. Behrens, H. Johansen-Berg, P. R. Bannister, M. De Luca, I. Drobnjak, D. E. Flitney, et al. Advances in functional and structural MR image analysis and implementation as FSL. Neuroimage, 23:208–219, 2004.

[35] C. G. Koay, J. D. Carew, A. L. Alexander, P. J. Basser, and M. E. Meyerand. Investigation of anomalous estimates of tensor-derived quantities in diffusion tensor imaging. Magn Reson Med, 55(4):930, 2006.

[36] R. Penrose. A generalized inverse for matrices. Proc. Cambridge Philos. Soc, 51(1955):406–413, 1955.

[37] A. C. Aitken. On least squares and linear combination of observations. Proc R Soc Edin, pages 42–8, 1935.

[38] S. O. Rice. Mathematical analysis of random noise-conclusion. Bell Systems Tech. J., 24:46–156, 1945.


[39] R. M. Henkelman. Measurement of signal intensities in the presence of noise in MR images. Med Phys, 12(2):232–233, 1985.

[40] P. B. Kingsley. Introduction to diffusion tensor imaging mathematics: Part III. Tensor calculation, noise, simulations, and optimization. Concepts in Magnetic Resonance. Part A, Bridging education and research, 28A(2):155, 2006.

[41] C. G. Koay, L. C. Chang, J. D. Carew, C. Pierpaoli, and P. J. Basser. A unifying theoretical and algorithmic framework for least squares methods of estimation in diffusion tensor imaging. J Magn Reson, 182(1):115, 2006.

[42] P. R. Bevington. Data Reduction and Error Analysis. 1969.

[43] P. J. Basser and S. Pajevic. Statistical artifacts in diffusion tensor MRI (DT-MRI) caused by background noise. Magn Reson Med, 44(1):41–50, 2000.

[44] D. K. Jones. Determining and visualizing uncertainty in estimates of fiber orientation from diffusion tensor MRI. Magn Reson Med, 49(1):7, 2003.

[45] G. J. Stanisz. Diffusion MR in biological systems: tissue compartments and exchange. Israel Journal of Chemistry, 43(1):33–44, 2003.

[46] K. P. Whittall, A. L. MacKay, D. A. Graeb, R. A. Nugent, D. K. B. Li, and D. W. Paty. In vivo measurement of T2 distributions and water contents in normal human brain. Magn Reson Med, 37(1):34–43, 1997.

[47] H. B. Nielsen. immoptibox v.1.6. http://www2.imm.dtu.dk/~hbn/immoptibox [Retrieved 5/29/2007], 2006.

[48] R. Fobel. dwi-toolbox v.1.0. http://dwi-toolbox.sourceforge.net [Retrieved 11/25/2007], 2007.


[49] K. Najarian and R. Splinter. Biomedical Signal And Image Processing. CRC Press, 2006.

[50] K. M. Jansons and D. C. Alexander. Persistent angular structure: new insights from diffusion magnetic resonance imaging data. Inverse Problems, 19(5):1031–1046, 2003.

[51] P. J. Huber. Robust Statistics. Wiley-Interscience, 2004.

[52] J-F. Mangin, C. Poupon, C. Clark, D. Le Bihan, and I. Bloch. Distortion correction and robust tensor estimation for MR diffusion imaging. Medical Image Analysis, 6(3):191, 2002.

[53] L. Chang, D. K. Jones, and C. Pierpaoli. RESTORE: robust estimation of tensors by outlier rejection. Magn Reson Med, 53(5):1088, 2005.

[54] S. W. Chung, Y. Lu, and R. G. Henry. Comparison of bootstrap approaches for estimation of uncertainties of DTI parameters. Neuroimage, 33(2):531–541, 2006.

[55] B. Whitcher, D. S. Tuch, and L. Wang. The wild bootstrap to quantify variability in diffusion tensor MRI. In Proc. ISMRM 13th Annual Meeting, page 1333, 2005.

[56] D. S. Tuch, T. G. Reese, M. R. Wiegell, and V. J. Wedeen. Diffusion MRI of complex neural architecture. Neuron, 40(5):885–895, 2003.

[57] L. R. Frank. Characterization of anisotropy in high angular resolution diffusion-weighted MRI. Magn Reson Med, 47(6):1083, 2002.

[58] C. Liu, R. Bammer, B. Acar, and M. Moseley. Generalized Diffusion Tensor Imaging (GDTI) using Higher Order Tensor (HOT) statistics. In Proc. ISMRM 11th Annual Meeting, page 242, 2003.


[59] M. Descoteaux, E. Angelino, S. Fitzgibbons, and R. Deriche. Apparent diffusion coefficients from high angular resolution diffusion imaging: Estimation and applications. Magn Reson Med, 56:395–410, 2006.

[60] E. Ozarslan and T. H. Mareci. Generalized diffusion tensor imaging and analytical relationships between diffusion tensor imaging and high angular resolution diffusion imaging. Magn Reson Med, 50(5):955, 2003.

[61] C. R. Genovese, N. A. Lazar, and T. Nichols. Thresholding of statistical maps in functional neuroimaging using the false discovery rate. Neuroimage, 15(4):870–878, 2002.

[62] M. A. Bernstein, D. M. Thomasson, and W. H. Perman. Improved detectability in low signal-to-noise ratio magnetic resonance images by means of a phase-corrected real reconstruction. Med Phys, 16:813, 1989.

[63] W. A. Edelstein, P. A. Bottomley, and L. M. Pfeifer. A signal-to-noise calibration procedure for NMR imaging systems. Med Phys, 11:180, 1984.

[64] H. Gudbjartsson and S. Patz. The Rician distribution of noisy MRI data. Magn Reson Med, 34(6):910–14, 1995.

[65] C. G. Koay and P. J. Basser. Analytically exact correction scheme for signal extraction from noisy magnitude MR signals. J Magn Reson, 179(2):317, 2006.

[66] A. J. Miller and P. M. Joseph. The use of power images to perform quantitative analysis on low SNR MR images. Magn Reson Imaging, 11(7):1051–6, 1993.

[67] O. Dietrich, S. Heiland, and K. Sartor. Noise correction for the exact determination of apparent diffusion coefficients at low SNR. Magn Reson Med, 45(3):448–453, 2001.


[68] C. Liu, R. Bammer, B. Acar, and M. E. Moseley. Characterizing non-Gaussian diffusion by using generalized diffusion tensors. Magn Reson Med, 51(5):924–937, 2004.

[69] C. D. Constantinides, E. Atalar, and E. R. McVeigh. Signal-to-noise measurements in magnitude images from NMR phased arrays. Magn Reson Med, 38(5):852–7, 1997.

[70] R. Bammer, M. Auer, S. L. Keeling, M. Augustin, L. A. Stables, R. W. Prokesch, R. Stollberger, M. E. Moseley, and F. Fazekas. Diffusion tensor imaging using single-shot SENSE-EPI. Magn Reson Med, 48:128–136, 2002.

[71] T. Jaermann, G. Crelier, K. P. Pruessmann, X. Golay, T. Netsch, A. M. C. van Muiswinkel, S. Mori, P. C. M. van Zijl, A. Valavanis, S. Kollias, and P. Boesiger. SENSE-DTI at 3 T. Magn Reson Med, 51(2):230–236, 2004.

[72] P. Kellman and E. R. McVeigh. Image reconstruction in SNR units: a general method for SNR measurement. Magn Reson Med, 54(6):1439–1447, 2005.

[73] G. K. Rohde, A. S. Barnett, P. J. Basser, and C. Pierpaoli. Estimating intensity variance due to noise in registered images: Applications to diffusion tensor MRI. Neuroimage, 26(3):673–684, 2005.

[74] S. Kim, J. W. Jeong, and M. Singh. Estimation of multiple fiber orientations from Diffusion Tensor MRI using Independent Component Analysis. IEEE Transactions on Nuclear Science, 52:266–273, 2005.
