Shallow Learning with Kernels
for Dictionary-Free Magnetic Resonance Fingerprinting
Gopal Nataraj∗, Mingjie Gao∗, Jakob Assländer†, Clayton Scott∗, & Jeffrey A. Fessler∗
ISMRM Workshop on Magnetic Resonance Fingerprinting
∗Dept. of Electrical Engineering and Computer Science, University of Michigan; †Center for Biomedical Imaging, NYU School of Medicine
Problem Statement
Given: at every voxel, a measurement vector y = s(x) + ε
[Figure: MRF "component" images y (a.u.; more on these later...), alongside estimated T1 (600–2000 ms) and T2 (50–200 ms) maps, illustrating the pipeline y → x̂(·) → x̂(y)]
Task: design a fast voxel-by-voxel estimator x̂(·)
that scales well with the number of unknowns per voxel, L (a toy sketch of this per-voxel setup follows below)
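To make the setup concrete, here is a minimal sketch, not the authors' code: a hypothetical stand-in signal model maps L = 2 parameters (T1, T2) to a D-dimensional measurement, noise is added, and an estimator is applied independently at every voxel. The function `toy_signal`, the dimensions, and all variable names are illustrative assumptions.

```python
import numpy as np

def toy_signal(x, D=6):
    # hypothetical stand-in for the MRF signal model s(x):
    # maps (T1, T2) in ms to a D-dimensional measurement
    t = np.linspace(0.1, 3.5, D)  # pseudo acquisition times, in seconds
    t1, t2 = x
    return (1 - np.exp(-1000 * t / t1)) * np.exp(-1000 * t / t2)

rng = np.random.default_rng(0)
x_true = np.array([800.0, 80.0])                         # (T1, T2) in ms
y = toy_signal(x_true) + 0.01 * rng.standard_normal(6)   # y = s(x) + eps

def apply_voxelwise(x_hat, Y):
    # apply a scalar estimator y -> x_hat(y) independently at each voxel;
    # Y: (n_voxels, D) component images
    return np.array([x_hat(y_n) for y_n in Y])
```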
Machine Learning at Different "Depths" for QMRI
Idea: "learn" separate scalar estimators x̂1(y), . . . , x̂L(y) from simulated training data
Deep Learning
• promising for QMRI [Cohen et al., 2017, Virtue et al., 2017]
• needs many training points to avoid overfitting
• trained via non-convex optimization
• limited theoretical basis
Shallow Learning
• simpler structure needs fewer training points
• fast training via convex optimization
Shallow Learning with Kernels for QMRI
Idea: "learn" separate scalar estimators x̂1(y), . . . , x̂L(y) from simulated training data
• sample (x1, ε1), . . . , (xN, εN) and simulate y1, . . . , yN via signal model s
• design nonlinear functions x̂l(·) := gl(·) + bl that seek to map each yn to xl,n; the plain least-squares fit is ill-posed, so regularize:
\[
(\hat{g}_l, \hat{b}_l) \in \arg\min_{g_l \in \mathcal{G},\; b_l \in \mathbb{R}} \; \frac{1}{N} \sum_{n=1}^{N} \big( g_l(y_n) + b_l - x_{l,n} \big)^2 + \rho_l \, \| g_l \|_{\mathcal{G}}^2 \tag{1}
\]
Solution: Parameter Estimation via Regression with Kernels (PERK) [Nataraj et al., 2017b, arXiv:1710.02441]
• restrict optimization to a certain rich function space $\mathcal{G}$ with kernel k
• the optimal $g_l \in \mathcal{G}$ takes the form $g_l(\cdot) = \sum_{n=1}^{N} a_{l,n} \, k(\cdot, y_n)$ [Schölkopf et al., 2001]
Fast, simple implementation: nonlinear lifting + high-dimensional linear regression (see the sketch below)
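Via the representer theorem, solving (1) over G reduces to a linear system in the coefficients a_{l,n}. Below is a minimal sketch of this reduction with a Gaussian kernel; the bias b_l is omitted for brevity, and the training data and all names here are placeholders, not the authors' code.

```python
import numpy as np

def gauss_kernel(A, B, bw):
    # Gaussian kernel matrix: K[i, j] = exp(-||A[i] - B[j]||^2 / (2 bw^2))
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw ** 2))

rng = np.random.default_rng(0)
N, D = 500, 6
Y_train = rng.standard_normal((N, D))  # placeholder; would come from signal model s
x_l = rng.uniform(600, 2000, N)        # placeholder regressands, e.g. T1 in ms
rho, bw = 1e-3, 1.0                    # regularization weight and kernel bandwidth

# representer theorem: g_l(.) = sum_n a_{l,n} k(., y_n), so (1) becomes the
# finite-dimensional kernel ridge regression system (K + N rho I) a = x_l
K = gauss_kernel(Y_train, Y_train, bw)
a = np.linalg.solve(K + N * rho * np.eye(N), x_l)

# evaluate the learned estimator at a new measurement y
y = rng.standard_normal(D)
x_hat = gauss_kernel(y[None, :], Y_train, bw).ravel() @ a
```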
PERK for Magnetic Resonance Fingerprinting (MRF)
To control the lifting dimension, it is desirable for y to be low-dimensional.
[Figure: spiral k-space readouts (kx, ky) for flips 1, . . . , 840 of a pseudo-SSFP flip-angle train spanning 0–3500 ms [Assländer et al., 2017], and the resulting component images (a.u.)]
Data-sharing across flips, gridding, FFT, and PCA yield a temporal subspace $V \in \mathbb{C}^{840 \times 6}$; compressed component images then solve
\[
\hat{Y} \in \arg\min_{Y \in \mathbb{C}^{n_{\text{voxels}} \times 6}} \big\| k - A(Y V^H) \big\|_2^2
\]
(a sketch of the subspace construction follows below).
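A minimal sketch, under illustrative assumptions, of one way to build the temporal subspace V: take the leading 6 right singular vectors of an ensemble of simulated fingerprints. The system operator A and the undersampled k-space fit are beyond this sketch, so the compression is shown on fully sampled image-domain data; all arrays here are placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)

# placeholder fingerprint ensemble: rows are time courses over 840 flips;
# in practice these would be simulated from the signal model
F = rng.standard_normal((1000, 840))

# temporal subspace: leading 6 right singular vectors of the ensemble
_, _, Vh = np.linalg.svd(F, full_matrices=False)
V = Vh[:6].conj().T                   # V in C^(840 x 6)

# compress fully sampled component images X (n_voxels x 840) to 6 dimensions;
# with undersampled data, Y would instead solve min_Y ||k - A(Y V^H)||_2^2
X = rng.standard_normal((4096, 840))  # placeholder component images
Y = X @ V                             # Y in C^(n_voxels x 6)
```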
In vivo results
[Figure: spiral k-space data (kx, ky) over flips 1, . . . , 840; T1 maps (600–2000 ms) and T2 maps (50–200 ms) from dictionary-based grid search vs. dictionary-free PERK]
Timing:
• dictionary-based grid search: 28 s/slice; 50 slices: ∼1400 s
• dictionary-free PERK: 4 s train, 0.2 s/slice test; 50 slices: 4 s train, ∼10 s test
Summary
Contributions:
• PERK: a fast, dictionary-free ML method for QMRI [Nataraj et al., 2017b]
• demonstrated PERK for in vivo MRF T1, T2 estimation
Future Work:
• QMRI problems involving more unknowns, for which we expect orders-of-magnitude computational gains [Nataraj et al., 2017a, #5076]
• comparison with other ML methods, e.g., deep learning
References
Assländer, J., Lattanzi, R., Sodickson, D. K., and Cloos, M. A. (2017). Relaxation in spherical coordinates: analysis and optimization of pseudo-SSFP based MR-fingerprinting. arXiv:1703.00481.
Cohen, O., Zhu, B., and Rosen, M. (2017). Deep learning for fast MR fingerprinting reconstruction. In Proc. Intl. Soc. Mag. Res. Med., page 0688.
Nataraj, G., Nielsen, J.-F., and Fessler, J. A. (2017a). Myelin water fraction estimation from optimized steady-state sequences using kernel ridge regression. In Proc. Intl. Soc. Mag. Res. Med., page 5076.
Nataraj, G., Nielsen, J.-F., Scott, C., and Fessler, J. A. (2017b). Dictionary-free MRI PERK: Parameter estimation via regression with kernels. IEEE Trans. Med. Imag. Submitted.
Rahimi, A. and Recht, B. (2007). Random features for large-scale kernel machines. In NIPS.
Schölkopf, B., Herbrich, R., and Smola, A. J. (2001). A generalized representer theorem. In Proc. Computational Learning Theory (COLT), pages 416–426. LNCS 2111.
Virtue, P., Yu, S. X., and Lustig, M. (2017). Better than real: Complex-valued neural nets for MRI fingerprinting. In Proc. IEEE Intl. Conf. on Image Processing. To appear.
PERK solution
Closed-form solution for each l ∈ {1, . . . , L}:
\[
\hat{x}_l(\cdot) = \mathbf{x}_l^T \left( \frac{1}{N} \mathbf{1}_N + \mathbf{M} \big( \mathbf{M} \mathbf{K} \mathbf{M} + N \rho_l \mathbf{I}_N \big)^{-1} \Big( \mathbf{k}(\cdot) - \frac{1}{N} \mathbf{K} \mathbf{1}_N \Big) \right) \tag{2}
\]
• $\mathbf{x}_l := [x_{l,1}, \ldots, x_{l,N}]^T$: training-point regressands
• $\mathbf{K} := [k(y_i, y_j)]_{i,j=1}^{N}$: Gram matrix
• $\mathbf{M} := \mathbf{I}_N - \frac{1}{N} \mathbf{1}_N \mathbf{1}_N^T$: de-meaning operator
• $\mathbf{k}(\cdot) := [k(\cdot, y_1), \ldots, k(\cdot, y_N)]^T$: nonlinear kernel embedding
Can we scale computation with L more gracefully?
• Yes: (2) is separable in l ∈ {1, . . . , L} by construction
• However, explicitly computing K may be undesirable... (a direct implementation of (2) is sketched below)
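A minimal sketch implementing (2) directly, again assuming a Gaussian kernel and placeholder names; since the system matrix is symmetric, $\mathbf{x}_l^T \mathbf{M} (\mathbf{M K M} + N \rho_l \mathbf{I}_N)^{-1}$ is computed as a single solve.

```python
import numpy as np

def gauss_kernel(A, B, bw):
    # K[i, j] = exp(-||A[i] - B[j]||^2 / (2 bw^2))
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw ** 2))

def perk_fit(Y_train, x_l, rho, bw):
    # closed-form PERK per (2) for one parameter index l
    N = len(x_l)
    K = gauss_kernel(Y_train, Y_train, bw)    # Gram matrix
    M = np.eye(N) - np.ones((N, N)) / N       # de-meaning operator
    # alpha = (M K M + N rho I)^{-1} M x_l (symmetric system)
    alpha = np.linalg.solve(M @ K @ M + N * rho * np.eye(N), M @ x_l)
    return alpha, x_l.mean(), K.mean(axis=1)  # coefficients, bias, (1/N) K 1_N

def perk_eval(y, Y_train, alpha, bias, Kbar, bw):
    # x_hat_l(y) = bias + alpha^T (k(y) - (1/N) K 1_N), matching (2)
    ky = gauss_kernel(y[None, :], Y_train, bw).ravel()
    return bias + alpha @ (ky - Kbar)
```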
Backup: PERK as High-Dimensional Bayesian Linear Regression
Suppose there exists an "approximate feature mapping" $z : \mathcal{Y} \to \mathbb{R}^Z$ such that $\mathbf{Z} := [z(y_1), \ldots, z(y_N)]$ satisfies, for $\dim(\mathcal{Y}) \ll Z \ll N$,
\[
\mathbf{K} \approx \mathbf{Z}^T \mathbf{Z}. \tag{3}
\]
Plugging (3) into PERK solution (2) and rearranging gives
\[
\hat{x}_l(\cdot) \approx \frac{1}{N} \mathbf{x}_l^T \mathbf{1}_N + \frac{1}{N} \mathbf{x}_l^T \mathbf{M} \mathbf{Z}^T \Big( \frac{1}{N} \mathbf{Z} \mathbf{M} \mathbf{Z}^T + \rho_l \mathbf{I}_Z \Big)^{-1} \Big( \mathbf{z}(\cdot) - \frac{1}{N} \mathbf{Z} \mathbf{1}_N \Big) = \hat{m}_{x_l} + \hat{\mathbf{c}}_{x_l z}^T \big( \hat{\mathbf{C}}_{zz} + \rho_l \mathbf{I}_Z \big)^{-1} \big( \mathbf{z}(\cdot) - \hat{\mathbf{m}}_z \big), \tag{4}
\]
with sample means $\hat{m}_{x_l}, \hat{\mathbf{m}}_z$ and sample (cross-)covariances $\hat{\mathbf{c}}_{x_l z}, \hat{\mathbf{C}}_{zz}$: regularized ("ridge") Z-dimensional affine regression!
Does such a z exist and work well in practice?
• Yes, e.g., for kernels of the form k(y, y′) ≡ k(y − y′) [Rahimi and Recht, 2007] (see the sketch after this list)
• In such cases, computation reduces from ∼N² to ∼NZ
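A minimal sketch of the Rahimi–Recht random Fourier feature construction for a Gaussian kernel, followed by the ridge regression of (4). Dimensions, placeholder data, and all variable names are illustrative assumptions.

```python
import numpy as np

def rff_map(Y, W, phi):
    # random Fourier features: z(y) = sqrt(2/Z) cos(W y + phi), chosen so that
    # E[z(y)^T z(y')] ~ exp(-||y - y'||^2 / (2 bw^2)) [Rahimi and Recht, 2007]
    Zdim = W.shape[0]
    return np.sqrt(2.0 / Zdim) * np.cos(Y @ W.T + phi)

rng = np.random.default_rng(0)
N, D, Zdim, bw, rho = 10000, 6, 300, 1.0, 1e-3
Y_train = rng.standard_normal((N, D))  # placeholder simulated measurements
x_l = rng.uniform(600, 2000, N)        # placeholder regressands for one x_l

# draw the feature map once: rows of W ~ N(0, I / bw^2), phi ~ U[0, 2 pi)
W = rng.standard_normal((Zdim, D)) / bw
phi = rng.uniform(0, 2 * np.pi, Zdim)

# sample statistics for the Z-dimensional affine (ridge) regression of (4)
Z = rff_map(Y_train, W, phi)           # (N, Z) lifted training data
m_z, m_x = Z.mean(axis=0), x_l.mean()
Zc = Z - m_z
C_zz = Zc.T @ Zc / N                   # sample covariance of z
c_xz = Zc.T @ (x_l - m_x) / N          # sample cross-covariance
w = np.linalg.solve(C_zz + rho * np.eye(Zdim), c_xz)

def x_hat(y):
    # test-time evaluation per (4): cost per voxel scales with Z, not N
    return m_x + w @ (rff_map(y[None, :], W, phi).ravel() - m_z)
```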