Iterative Hard Thresholding: Theory and Practice
T. Blumensath
Institute for Digital Communications
Joint Research Institute for Signal and Image Processing
The University of Edinburgh
February 2009
What to Expect

I) Iterative Hard Thresholding for Sparse Inverse Problems: What's the problem?
II) Theory and Practice: Nice theory, bad attitude
III) Stability Beyond the Grave (RIP): Operating beyond the limits
IV) Results: Up there with the best
PART 1
Iterative Hard Thresholding for Sparse Inverse Problems
The Problem and Solution

THE PROBLEM: Given
$$y = \Phi x + e,$$
where $y \in \mathbb{R}^M$, $x \in \mathbb{R}^N$, $\Phi \in \mathbb{R}^{M \times N}$ and $e$ is observation noise, estimate $x$ given $y$ and $\Phi$ when $M \ll N$ but $x$ is approximately $K$-sparse.

THE SOLUTION: The Iterative Hard Thresholding (IHT) algorithm uses the iteration
$$x^{n+1} = P_K\big(x^n + \Phi^T (y - \Phi x^n)\big),$$
where $P_K$ is a hard thresholding operator that keeps the largest (in magnitude) $K$ elements of a vector (or, more generally, a projector onto the closest element in the model).
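To make the iteration concrete, here is a minimal NumPy sketch (my own illustration, not code from the slides; the function name, the zero initialisation and the fixed iteration count are assumptions):

```python
import numpy as np

def iht(y, Phi, K, n_iter=100):
    """Sketch of Iterative Hard Thresholding: x <- P_K(x + Phi^T (y - Phi x))."""
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        x = x + Phi.T @ (y - Phi @ x)        # gradient step for ||y - Phi x||_2^2
        x[np.argsort(np.abs(x))[:-K]] = 0.0  # P_K: zero all but the K largest entries
    return x
```

As the next slide notes, this plain iteration is only guaranteed to converge when $\|\Phi\|_2 \le 1$, so $\Phi$ may need rescaling before the loop is run.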
PART 2
Theory and Practice
Convergence and Recovery Performance

CONVERGENCE: IHT is guaranteed to converge to a local minimum of $\|y - \Phi x\|_2^2$ s.t. $\|x\|_0 \le K$ whenever $\|\Phi\|_2 \le 1$.

RECOVERY: If $\delta_{3K}(\Phi) \le 1/\sqrt{32}$, then after at most $\left\lceil \log_2\left( \|x_K\|_2 / \tilde{\varepsilon}_K \right) \right\rceil$ iterations, the IHT approximation $x^\star$ satisfies
$$\|x^\star - x\|_2 \le 7 \tilde{\varepsilon}_K,$$
where $\tilde{\varepsilon}_K = \|x - x_K\|_2 + \frac{\|x - x_K\|_1}{\sqrt{K}} + \|e\|_2$ and where $\delta_K(\Phi)$ is the smallest constant for which
$$(1 - \delta_K(\Phi)) \|x\|_2^2 \le \|\Phi x\|_2^2 \le (1 + \delta_K(\Phi)) \|x\|_2^2$$
holds for all $K$-sparse $x$.
But

[Figure: probability of exact recovery vs. sparsity $K/M$; curves for $\ell_1$, for IHT with $\|\Phi\|_2 < 1$, and for IHT with $\|\Phi\|_2$ scaled to the RIP.]
PART 3
Stability Beyond RIP
The Normalised Iterative Hard Thresholding Algorithm

The Normalised Iterative Hard Thresholding (NIHT) algorithm uses the iteration
$$x^{n+1} = P_K\big(x^n + \mu^n \Phi^T (y - \Phi x^n)\big),$$
where $P_K$ is a hard thresholding operator that keeps the largest (in magnitude) $K$ elements of a vector (or, more generally, a projector onto the closest element in the model) and $\mu^n$ is a step size.
Calculating the step size
Assume the support of $x^n$ is $\Gamma^n$ and that the support of $x^{n+1}$ satisfies $\Gamma^{n+1} = \Gamma^n$; then the optimal step size (in terms of reduction in squared approximation error) is
$$\mu^n = \frac{g_{\Gamma^n}^T g_{\Gamma^n}}{g_{\Gamma^n}^T \Phi_{\Gamma^n}^T \Phi_{\Gamma^n} g_{\Gamma^n}},$$
where $g = \Phi^T (y - \Phi x^n)$.

However, if $\Gamma^{n+1} \neq \Gamma^n$, this step size might not be optimal and does not guarantee convergence. For guaranteed convergence we require that
$$\mu \le (1 - c) \frac{\|x^{n+1} - x^n\|_2^2}{\|\Phi (x^{n+1} - x^n)\|_2^2}$$
for some $c > 0$.

Hence, if $\Gamma^{n+1} \neq \Gamma^n$, calculate $\omega = (1 - c) \frac{\|x^{n+1} - x^n\|_2^2}{\|\Phi (x^{n+1} - x^n)\|_2^2}$ and, if $\mu^n > \omega$, set $\mu^n \leftarrow \mu^n / (\kappa (1 - c))$ or, alternatively, set $\mu^n \leftarrow \omega$.
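Putting the last two slides together, here is a minimal NumPy sketch of NIHT (again my own illustration, under stated assumptions: zero initialisation, a fixed iteration budget, and the $\mu^n \leftarrow \mu^n/(\kappa(1-c))$ backtracking variant):

```python
import numpy as np

def hard_threshold(x, K):
    """P_K: keep the K largest-magnitude entries of x, zero the rest."""
    out = x.copy()
    out[np.argsort(np.abs(out))[:-K]] = 0.0
    return out

def niht(y, Phi, K, n_iter=100, c=0.1, kappa=2.0):
    """Sketch of Normalised IHT: x <- P_K(x + mu * Phi^T (y - Phi x))."""
    x = np.zeros(Phi.shape[1])
    for _ in range(n_iter):
        g = Phi.T @ (y - Phi @ x)
        # Optimal step size restricted to the current support Gamma^n
        # (the support is empty at the start, so fall back to the full gradient).
        gs = np.where(x != 0, g, 0.0) if x.any() else g
        Pg = Phi @ gs
        if (Pg @ Pg) == 0.0:
            break                            # restricted gradient is zero: stationary point
        mu = (gs @ gs) / (Pg @ Pg)
        x_new = hard_threshold(x + mu * g, K)
        # If the support changed, shrink mu until mu <= omega.
        while not np.array_equal(x_new != 0, x != 0):
            dx = x_new - x
            Pd = Phi @ dx
            if mu <= (1 - c) * (dx @ dx) / (Pd @ Pd):
                break                        # convergence condition satisfied
            mu /= kappa * (1 - c)            # backtrack: mu <- mu / (kappa (1 - c))
            x_new = hard_threshold(x + mu * g, K)
        x = x_new
    return x
```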
Why the Hassle?

CONVERGENCE: NIHT is guaranteed to converge to a local minimum of $\|y - \Phi x\|_2^2$ s.t. $\|x\|_0 \le K$, with no scaling condition on $\Phi$.

RECOVERY: Suppose $\Phi$ satisfies $0 < \alpha_{2K} \le \frac{\|\Phi x\|_2}{\|x\|_2} \le \beta_{2K}$ for all $x$ with $\|x\|_0 \le 2K$. Given a noisy observation $y = \Phi x + e$, where $x$ is an arbitrary vector, let $x_K$ be the best $K$-term approximation to $x$. If
$$\gamma_{2K} = \max\left\{ 1 - \frac{\alpha_{2K}^2}{\kappa \beta_{2K}^2},\ \frac{\beta_{2K}^2}{\alpha_{2K}^2} - 1 \right\} < 1/8,$$
then after at most
$$n^\star = \left\lceil \log_2\left( \|x_K\|_2 / \tilde{\varepsilon}_K \right) \right\rceil$$
iterations, IHT$_K$ estimates $x$ with accuracy given by
$$\|x - x^{n^\star}\|_2 \le 9 \tilde{\varepsilon}_K, \tag{1}$$
where
$$\tilde{\varepsilon}_K = \|x - x_K\|_2 + \frac{\|x - x_K\|_1}{\sqrt{K}} + \frac{1}{\alpha_{2K}} \|e\|_2. \tag{2}$$

Note that $\gamma_{2K}$ is invariant to rescaling of $\Phi$: unlike the RIP-based result for plain IHT, this guarantee only constrains the ratio of $\beta_{2K}$ to $\alpha_{2K}$, not the absolute scale of $\Phi$.
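As a usage sketch (a synthetic setup of my own, not an experiment from the slides), the following runs the `niht` function above on an unnormalised Gaussian $\Phi$ with $\|\Phi\|_2 \gg 1$, exactly the regime where plain IHT would first have to be rescaled:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, K = 128, 512, 8
Phi = rng.standard_normal((M, N))                  # ||Phi||_2 >> 1 here
x_true = np.zeros(N)
x_true[rng.choice(N, size=K, replace=False)] = rng.standard_normal(K)
y = Phi @ x_true + 0.01 * rng.standard_normal(M)   # noisy observation y = Phi x + e

x_hat = niht(y, Phi, K)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```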
PART 4
Results
Before and After
[Figure: probability of exact recovery vs. sparsity $K/M$.]
Comparison to other Algorithms
[Figure: probability of exact recovery vs. sparsity $K/M$.]
Comparison to other Algorithms
[Figure: probability of exact recovery vs. sparsity $K/M$.]
Speed Comparison
[Figure: computation time vs. sparsity $K/M$.]
Robustness to Noise

[Figure: estimation SNR in dB vs. sparsity $K/M$, for observation SNRs of 0, 10, 20 and 30 dB.]
Robustness to Non-Exact-Sparseness

[Figure: estimation SNR in dB vs. sparsity $K/M$, for coefficients decaying as $x_n = n^{-1}$, $x_n = n^{-2}$ and $x_n = n^{-4}$.]
Larger Problems
[Figure: original / reconstruction, Haar wavelet transform, and frequency-domain observation.]
Larger Problems

[Figure: PSNR in dB vs. ratio of measurements to signal dimension, comparing IHT, OMP, GP and $\ell_1$.]
Who wants my job?

Sparse representations for the processing of large scale data. Compressed sensing theory and algorithms, with particular application to dynamic MRI.

School of Engineering and Electronics, The University of Edinburgh. 36 months, starting as soon as possible after 1st June 2009.

Salary scale: £28,290 - £33,779 p.a. Informal enquiries to Prof. Mike Davies ([email protected]).

For further details please go to: http://www.jobs.ed.ac.uk/