A Study of the K-SVD Algorithm for Designing Overcomplete Dictionaries for Sparse Representation

Lijun Xu
School of Mathematical Sciences, Dalian University of Technology

Supervised by Bo Yu (DUT) and Yin Zhang (Rice University)

March 6, 2012

Page 2: A Study of the K-SVD Algorithm for Designing Overcomplete

This talk is about an effective method of training overcomplete dictionaries for sparse signal representation, proposed in the following two papers:

1. M. Aharon, M. Elad, and A. Bruckstein. The K-SVD: An algorithm for designing of overcomplete dictionaries for sparse representation. IEEE Trans. on Signal Processing, vol. 54, no. 11, pp. 4311-4322, 2006.

2. R. Rubinstein, M. Zibulevsky, and M. Elad. Efficient implementation of the K-SVD algorithm using batch orthogonal matching pursuit. Technical Report CS-2008-08, Technion - Israel Institute of Technology, 2008.

Page 3: A Study of the K-SVD Algorithm for Designing Overcomplete

• Complete:

a system $\{y_i\}$ in a Banach space $X$ is complete if every element of $X$ can be approximated arbitrarily well in norm by linear combinations of elements of $\{y_i\}$.

• Overcomplete:

a complete system $\{y_i\}$ is overcomplete if removing an element from it still leaves a complete system.

Approximation to arbitrary accuracy in norm can be thought of as a form of representation.

Page 4: A Study of the K-SVD Algorithm for Designing Overcomplete

• In recent years there has been a growing interest in the study of sparse representation of signals which is based on an overcomplete dictionary.

• Applications that use sparse representation are many and include compression, regularization in inverse problems, feature extraction, and more.

• Sparsity in overcomplete dictionaries is the basis for a wide variety of highly effective signal and image processing techniques.

Page 5: A Study of the K-SVD Algorithm for Designing Overcomplete

Model: Given a dictionary $D$ and a signal $y$,

$$\min_x \|x\|_0 \quad \text{s.t.} \quad y = Dx$$

or

$$\min_x \|x\|_0 \quad \text{s.t.} \quad \|y - Dx\|_2 \le \epsilon$$

$D \in R^{n \times K}$: an overcomplete dictionary matrix,
$y \in R^n$: a signal that can be represented as a sparse linear combination of columns of $D$,
$x \in R^K$: a vector containing the representation coefficients of the signal $y$.

How to choose D?

Page 6: A Study of the K-SVD Algorithm for Designing Overcomplete

• Modeling Approach

Choose a predesigned transform, such as
Wavelet [Mallat],
Curvelet [Candes & Donoho],
Contourlet [Do & Vetterli],
Bandlet [Mallat & Le-Pennec],
and others ...

These are simpler, pre-specified, fast transforms. Their success depends on how well they suit the signals at hand. How can they be adapted to other signals? They are a poor fit for “smaller” signal families.

Page 7: A Study of the K-SVD Algorithm for Designing Overcomplete

• Training Approach

Train the dictionary directly on the given examples, optimizing w.r.t. sparsity and other desired properties (normalized atoms, etc.).

Examples:
ML (maximum likelihood),
MAP (maximum a-posteriori probability),
MOD (method of optimal directions),
ICA-like (independent component analysis),
and others …

Advantages:

• Fits the data,

• Design fits the true objectives,

• Can be adapted to small families.

Drawbacks:

• No fast version,

• Slow training.

Page 8: A Study of the K-SVD Algorithm for Designing Overcomplete

• Model:

$$\min_{D,X} \|Y - DX\|_F^2 \quad \text{subject to} \quad \forall i,\ \|x_i\|_0 \le S$$

where

$Y = [y_1 \ \cdots \ y_P] \in R^{N \times P}$ represents the training signal set (known),

$D \in R^{N \times K}$ is the designed dictionary matrix with $K$ atoms,

$X \in R^{K \times P}$ is the sparse representation matrix with sparsity level $S$.

[Aharon, Elad, & Bruckstein (2004)]

Page 9: A Study of the K-SVD Algorithm for Designing Overcomplete

• K-SVD Algorithm: two steps

(i) Sparse Coding: producing the sparse representation matrix X, given the current dictionary D.

(ii) Dictionary Update (the main innovation): updating the dictionary atoms, given the current sparse representations.

Page 10: A Study of the K-SVD Algorithm for Designing Overcomplete

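• Fix D,

• Aim to find the best coefficient matrix X,

• Use an approximate solving method.

$$\min_X \|Y - DX\|_F^2 \quad \text{s.t.} \quad \forall i,\ \|x_i\|_0 \le S$$

This decouples into P ordinary sparse coding problems, one per signal:

$$\min_{x_i} \|y_i - Dx_i\|_2^2 \quad \text{s.t.} \quad \|x_i\|_0 \le S, \qquad i = 1, 2, \dots, P$$

Approximate solvers:
Orthogonal Matching Pursuit (OMP),
Basis Pursuit (BP),
Focal Under-determined System Solver (FOCUSS),
and others…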
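As an illustration of the sparse coding stage, here is a minimal NumPy sketch of OMP applied column by column. It is not the implementation used in the cited papers; the function names `omp` and `sparse_code` are hypothetical, and the dictionary columns are assumed to have unit norm.

```python
import numpy as np

def omp(D, y, S):
    """Greedy S-sparse approximation of y over the columns of D (assumed unit-norm)."""
    residual = y.astype(float)
    support, coef = [], np.zeros(0)
    for _ in range(S):
        # Select the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break
        support.append(k)
        # Least-squares refit of y on all atoms selected so far.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
        if np.linalg.norm(residual) < 1e-12:
            break
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

def sparse_code(D, Y, S):
    """Solve the P decoupled problems: one OMP run per column of Y."""
    return np.column_stack([omp(D, y, S) for y in Y.T])
```

Each column of the returned matrix has at most S nonzeros, which is the constraint in the problem above; OMP only approximates the minimizer.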

Page 11: A Study of the K-SVD Algorithm for Designing Overcomplete

Goal: search for a better dictionary,

Scheme: update one column of D at a time,

—fixing all columns in D except one, dk ,

—update it by optimizing the target function.

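For the kth atom (column of D),

$$\|Y - DX\|_F^2 = \Big\|Y - \sum_{j=1}^{K} d_j g_j^T\Big\|_F^2 = \Big\|\Big(Y - \sum_{j \ne k} d_j g_j^T\Big) - d_k g_k^T\Big\|_F^2 = \|E_k - d_k g_k^T\|_F^2$$

where

$D = [d_1 \ d_2 \ \cdots \ d_K] \in R^{N \times K}$, $\quad E_k = Y - \sum_{j \ne k} d_j g_j^T$,

$X = [x_1 \ x_2 \ \cdots \ x_P] = [g_1 \ g_2 \ \cdots \ g_K]^T \in R^{K \times P}$.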

Page 12: A Study of the K-SVD Algorithm for Designing Overcomplete

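• Update the kth atom of D:

$$\min_{d_k} \|E_k - d_k g_k^T\|_F^2, \qquad E_k = Y - \sum_{j \ne k} d_j g_j^T \ \text{(the residual)}$$

We can do better than this by also optimizing over $g_k^T$, the kth row of $X$:

$$\min_{d_k, g_k} \|E_k - d_k g_k^T\|_F^2$$

What about sparsity?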

Page 13: A Study of the K-SVD Algorithm for Designing Overcomplete

Objective: $\|E_k - d_k g_k^T\|_F^2$

To preserve sparsity, only consider those columns of $E_k$ that use $d_k$.

Let $I$ denote the indices of the signals in $Y$ which use the kth atom.

Update $d_k$ and $g_k$ by optimizing:

$$\min_{d_k, g_k} \|Y_I - DX_I\|_F^2 = \min_{d_k, g_k} \|E_k - d_k g_k^T\|_F^2$$

where

$I = (i_1, \dots, i_{|I|})$, and $A_I = [a_{i_1} \ \cdots \ a_{i_{|I|}}]$ denotes the submatrix of $A$ indexed by $I$,

$E_k = Y_I - \sum_{j \ne k} d_j X_{j,I}$ is the error matrix without the kth atom,

$g_k^T = X_{k,I}$ is the kth row of $X_I$, namely the nonzeros of the kth row of $X$.

Page 14: A Study of the K-SVD Algorithm for Designing Overcomplete

The simple rank-1 problem:

$$\min_{d_k, g_k} \|E_k - d_k g_k^T\|_F^2 \quad \text{s.t.} \quad \|d_k\|_2 = 1$$

Solve directly via SVD:

$$E_k = U \Sigma V^T, \qquad d_k = u_1, \quad g_k = \sigma_1 v_1$$

Update all K atoms of D column by column in the same fashion to get a new dictionary $D^+$. Then fix $D^+$ and go back to the sparse coding stage, and so on, until the stopping rule is reached.
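The single-atom update can be sketched in NumPy as follows. This is a hedged illustration, not the authors' code: `update_atom_svd` is a hypothetical name, the arrays are modified in place, and leaving an unused atom untouched is an assumption of this sketch.

```python
import numpy as np

def update_atom_svd(Y, D, X, k):
    """Rank-1 SVD update of atom k, restricted to the signals that use it."""
    I = np.flatnonzero(X[k, :])          # indices of signals using atom k
    if I.size == 0:
        return D, X                      # assumption: unused atoms are left as they are
    # Error matrix without the contribution of atom k, restricted to I.
    E = Y[:, I] - D @ X[:, I] + np.outer(D[:, k], X[k, I])
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    D[:, k] = U[:, 0]                    # d_k = u_1 (unit norm)
    X[k, I] = s[0] * Vt[0, :]            # g_k = sigma_1 * v_1, written only on I
    return D, X
```

Because only the entries indexed by I are overwritten, the zero pattern of the kth row of X is preserved, and the restricted fit cannot get worse, since the previous pair is a feasible point of the rank-1 problem.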

Page 15: A Study of the K-SVD Algorithm for Designing Overcomplete

Flow: Initialize D → Sparse Coding (use MP or BP…) → Dictionary Update (column-by-column, by SVD computation) → back to Sparse Coding.

Page 16: A Study of the K-SVD Algorithm for Designing Overcomplete

K-SVD algorithm:

1: Input: Signal set $Y$, initial dictionary $D_0$, target sparsity $S$, number of iterations $L$.
2: Output: Dictionary $D$ and sparse matrix $X$ such that $Y \approx DX$.
3: Init: Set $D := D_0$
4: for $n = 1, \dots, L$ do
5:     $\forall i:\ x_i := \arg\min_x \|y_i - Dx\|_2^2$ subject to $\|x\|_0 \le S$
6:     for $j = 1, \dots, K$ do
7:         $d_j := 0$
8:         $I :=$ {indices of the signals in $Y$ whose representations use $d_j$}
9:         $E := Y_I - DX_I = Y_I - \sum_{l \ne j} d_l X_{l,I}$
10:        $\{d, g\} := \arg\min_{d,g} \|E - dg^T\|_F^2$ subject to $\|d\|_2 = 1$
11:        $d_j := d$
12:        $X_{j,I} := g^T$
13:    end for
14: end for

Sparse coding process: step 5. Dictionary update process: steps 6-13.
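The whole loop can be sketched compactly in NumPy. This is a minimal, unoptimized illustration under stated assumptions, not the reference implementation: `omp` and `ksvd` are hypothetical names, OMP is the chosen pursuit method, the initial dictionary is normalized column-wise, and unused atoms are simply skipped. The comments give the step numbers of the listing.

```python
import numpy as np

def omp(D, y, S):
    """Greedy S-sparse coding of y over unit-norm columns of D."""
    support, coef, r = [], np.zeros(0), y.copy()
    for _ in range(S):
        k = int(np.argmax(np.abs(D.T @ r)))
        if k in support:
            break
        support.append(k)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        r = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

def ksvd(Y, D0, S, L):
    D = D0 / np.linalg.norm(D0, axis=0)                        # step 3, normalized
    for _ in range(L):                                         # step 4
        X = np.column_stack([omp(D, y, S) for y in Y.T])       # step 5
        for j in range(D.shape[1]):                            # step 6
            I = np.flatnonzero(X[j, :])                        # step 8
            if I.size == 0:
                continue                                       # assumption: skip unused atoms
            D[:, j] = 0                                        # step 7
            E = Y[:, I] - D @ X[:, I]                          # step 9
            U, s, Vt = np.linalg.svd(E, full_matrices=False)   # step 10
            D[:, j] = U[:, 0]                                  # step 11
            X[j, I] = s[0] * Vt[0, :]                          # step 12
    return D, X
```

A call such as `ksvd(Y, Y[:, :K], S, L)` would initialize the dictionary with the first K training signals; that choice of columns is only one possible convention.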

Page 17: A Study of the K-SVD Algorithm for Designing Overcomplete

The K-SVD algorithm is an iterative method that alternates between sparse coding of the examples based on the current dictionary, and a process of updating the dictionary atoms to better fit the data.

It is flexible and can work with any pursuit method (e.g. BP, MP, or FOCUSS) in the sparse coding stage. The dictionaries designed by the K-SVD perform well on both synthetic data and real images in applications.

However, the K-SVD algorithm is computationally demanding, especially when the dimensions of the dictionary increase or the number of training signals becomes large.

Page 18: A Study of the K-SVD Algorithm for Designing Overcomplete

Rubinstein, Zibulevsky & Elad (08)

In practice, the exact solution of

$$\{d, g\} := \arg\min_{d,g} \|E - dg^T\|_F^2 \quad \text{subject to} \quad \|d\|_2 = 1$$

can be computationally difficult, as the size of E is proportional to the number of training signals.

Indeed, the goal of K-SVD is really to improve a given initial dictionary, not to find an optimal one.

Therefore, a much quicker approach is to use an approximate solution, as long as it ensures a reduction of the final target function.

Page 19: A Study of the K-SVD Algorithm for Designing Overcomplete

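For the dictionary update step

$$\{d, g\} := \arg\min_{d,g} \|E - dg^T\|_F^2 \quad \text{subject to} \quad \|d\|_2 = 1,$$

use a single iteration of alternating optimization over the atom $d$ and the coefficient row $g^T$, which is given by:

Fix $g$: $\quad \min_d \|E - dg^T\|_F^2 \ \ \text{s.t.} \ \ \|d\|_2 = 1 \quad \Rightarrow \quad d := Eg / \|Eg\|_2$

Fix $d$: $\quad \min_g \|E - dg^T\|_F^2 \quad \Rightarrow \quad g := E^T d$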
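A small NumPy sketch of this single alternating step, with the hypothetical name `update_atom_approx`; the explicit matrix E is formed here only for clarity, and the guard against a zero vector is an assumption of the sketch.

```python
import numpy as np

def update_atom_approx(E, g):
    """One alternating-optimization step for min ||E - d g^T||_F^2 with ||d||_2 = 1."""
    d = E @ g
    norm = np.linalg.norm(d)
    if norm > 0:                 # assumption: leave d unnormalized if E g vanishes
        d = d / norm             # d := E g / ||E g||_2
    g_new = E.T @ d              # g := E^T d
    return d, g_new
```

For a unit-norm starting atom, the objective after this step is no larger than before it, and no smaller than the optimum given by the SVD, which is the trade-off the approximate method accepts.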

Page 20: A Study of the K-SVD Algorithm for Designing Overcomplete

Approximate K-SVD algorithm:

1: Input: Signal set $Y$, initial dictionary $D_0$, target sparsity $S$, number of iterations $L$.
2: Output: Dictionary $D$ and sparse matrix $X$ such that $Y \approx DX$.
3: Init: Set $D := D_0$
4: for $n = 1, \dots, L$ do
5:     $\forall i:\ x_i := \arg\min_x \|y_i - Dx\|_2^2$ subject to $\|x\|_0 \le S$
6:     for $j = 1, \dots, K$ do
7:         $d_j := 0$
8:         $I :=$ {indices of the signals in $Y$ whose representations use $d_j$}
9:         $g := X_{j,I}^T$
10:        $d := Y_I g - DX_I g$
11:        $d := d / \|d\|_2$
12:        $g := Y_I^T d - (DX_I)^T d$
13:        $d_j := d$
14:        $X_{j,I} := g^T$
15:    end for
16: end for

Corresponding steps of the K-SVD algorithm:

9: $E := Y_I - DX_I = Y_I - \sum_{l \ne j} d_l X_{l,I}$
10: $\{d, g\} := \arg\min_{d,g} \|E - dg^T\|_F^2$ subject to $\|d\|_2 = 1$
11: $d_j := d$
12: $X_{j,I} := g^T$

Page 21: A Study of the K-SVD Algorithm for Designing Overcomplete

• Eliminates the need to explicitly compute the matrix E.

• Saves time and memory.

• For large signal sets, it is faster and more efficient to use Batch-OMP instead of OMP in the sparse coding step.

• Complexity of one iteration:

Assuming $S \ll K \sim N \ll P$,

$$T_{K\text{-}SVD} \approx P\,(S^2 K + 2NK),$$

where $S$ is the sparsity, $K$ the number of atoms of the dictionary, $P$ the number of training signals, and $N$ the dimension of a signal.

Page 23: A Study of the K-SVD Algorithm for Designing Overcomplete

• K-SVD Algorithm: two steps

(i) Sparse Coding: producing the sparse representation matrix X, given the current dictionary D.

(ii) Dictionary Update: updating the dictionary atoms, given the current sparse representations.

Flow: Initialize D → Sparse Coding (use MP or BP…) → Dictionary Update (column-by-column, by SVD computation) → back to Sparse Coding.

Page 28: A Study of the K-SVD Algorithm for Designing Overcomplete

The simple rank-1 problem:

$$\min_{d_k, g_k} \|E_k - d_k g_k^T\|_F^2 \quad \text{s.t.} \quad \|d_k\|_2 = 1$$

Solve directly via SVD ($d_k = u_1$, $g_k = \sigma_1 v_1$) or by approximate methods.

Update all K atoms of D column by column in the same fashion to get a new dictionary $D^+$. Then fix $D^+$ and go back to the sparse coding stage, and so on, until the stopping rule is reached.

Page 29: A Study of the K-SVD Algorithm for Designing Overcomplete

• Generation of the data to train on:

Create a 20×50 random matrix D with normalized columns.

Produce 1500 data signals $\{y_i\}_{i=1}^{1500}$, each created as a linear combination of 3 different columns of D, with added white Gaussian noise at varying SNR (signal-to-noise ratio).

• Applying the K-SVD to solve

$$\min_{D,X} \|Y - DX\|_F^2 \quad \text{s.t.} \quad \forall i,\ \|x_i\|_0 \le S$$

with initial dictionary $D_0$ (initialized with signals), sparsity $S = 3$, and number of iterations $L = 80$, using OMP in the sparse coding step.
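The data generation can be sketched as follows. Several details are assumptions of this sketch rather than facts from the slides: Gaussian dictionary entries and coefficients, an SNR defined over the whole data set, and the hypothetical name `make_synthetic`.

```python
import numpy as np

def make_synthetic(N=20, K=50, P=1500, S=3, snr_db=20.0, seed=0):
    """Random normalized dictionary, S-sparse codes, and noisy signals."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((N, K))
    D /= np.linalg.norm(D, axis=0)                  # unit-norm atoms
    X = np.zeros((K, P))
    for i in range(P):
        idx = rng.choice(K, size=S, replace=False)  # S different atoms per signal
        X[idx, i] = rng.standard_normal(S)          # assumption: Gaussian coefficients
    clean = D @ X
    if snr_db is None:                              # the "no noise" case
        return D, X, clean
    noise = rng.standard_normal(clean.shape)
    # Scale the noise so that 20*log10(||clean|| / ||noise||) equals snr_db.
    noise *= np.linalg.norm(clean) / (np.linalg.norm(noise) * 10 ** (snr_db / 20))
    return D, X, clean + noise
```

Calling it with `snr_db` set to 10, 20, 30, or `None` gives one data set per noise level.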

Page 30: A Study of the K-SVD Algorithm for Designing Overcomplete

• Compare the resulting dictionary $\hat{D}$ with the known D:

For every column $d_i$ of D, find the closest column $\hat{d}_j$ in $\hat{D}$, measured via $1 - |d_i^T \hat{d}_j|$.

If $1 - |d_i^T \hat{d}_j| < 0.01$, it is considered a success.

• Repeat 50 trials for each noise level:

Compute the number of successes in each trial.

• Results:

Compare with other algorithms:

MOD (Method of Optimal Directions) and

MAP-based (Maximum A-posteriori Probability)
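The success count can be written in a few lines of NumPy. The name `count_recovered` is hypothetical; columns are renormalized inside the function so that the inner product is a cosine, and the absolute value makes the measure insensitive to the sign of an atom.

```python
import numpy as np

def count_recovered(D_true, D_learned, tol=0.01):
    """Number of true atoms whose closest learned atom is within tol."""
    A = D_true / np.linalg.norm(D_true, axis=0)
    B = D_learned / np.linalg.norm(D_learned, axis=0)
    dist = 1.0 - np.abs(A.T @ B)     # dist[i, j] = 1 - |d_i^T d_j|
    return int(np.sum(dist.min(axis=1) < tol))
```

Dividing the count by the number of atoms gives the ratio of recovered atoms.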

Page 31: A Study of the K-SVD Algorithm for Designing Overcomplete

The results for noise levels of 10dB,20dB,30dB and no noise:

Page 32: A Study of the K-SVD Algorithm for Designing Overcomplete

• Training Data:

11,000 examples of 8×8 block patches taken from a database of face images, at various locations. A random collection of 500 blocks is presented in Figure 4.

Page 33: A Study of the K-SVD Algorithm for Designing Overcomplete

• Every 8×8 block forms one column of the 64×11000 matrix Y. For a block patch of 8×8 pixels,

$y_i = [y_{i1} \ y_{i2} \ \cdots \ y_{i64}]^T$

• Running the K-SVD:

size of D: 64×441

sparsity of X: 10

sparse coding algorithm: OMP

• Comparison dictionaries: the overcomplete Haar dictionary and the overcomplete DCT dictionary

Page 34: A Study of the K-SVD Algorithm for Designing Overcomplete

• Dictionary size: 64×441. Every column represents a block patch of 8×8 pixels.

Page 35: A Study of the K-SVD Algorithm for Designing Overcomplete

• Application: Filling-In Missing Pixels

Test image: one random full face image consisting of 594 blocks (none of them were used for training).

For each 8×8 block B:

1) Delete a fraction r of the pixels at random locations (set them to zero).

2) Apply OMP to the corrupted block under the learned D, the overcomplete Haar dictionary, and the overcomplete DCT dictionary, in each case actually using a decimated dictionary with the rows of the deleted pixels removed.

Page 36: A Study of the K-SVD Algorithm for Designing Overcomplete

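Denote the resulting coefficient vector by $x_B$ (the corrupted block is approximated by the decimated D times $x_B$).

3) The reconstructed block: $B' = D\,x_B$.

4) The reconstruction error:

$$\sqrt{\|B - B'\|_F^2 / 64},$$

where 64 is the number of pixels in B.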
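The four steps can be sketched for one block as below. This is an illustration under assumptions, not the experiment's code: `omp` and `fill_in` are hypothetical names, the block is passed as a flattened vector with a boolean mask of known pixels, and renormalizing the decimated atoms before OMP (then rescaling the coefficients) is a choice made here.

```python
import numpy as np

def omp(D, y, S):
    """Greedy S-sparse coding of y over unit-norm columns of D."""
    support, coef, r = [], np.zeros(0), y.copy()
    for _ in range(S):
        k = int(np.argmax(np.abs(D.T @ r)))
        if k in support:
            break
        support.append(k)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        r = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

def fill_in(block, mask, D, S):
    """block: flattened pixels; mask: True where the pixel survived."""
    Dm = D[mask, :]                                   # decimated dictionary
    norms = np.linalg.norm(Dm, axis=0)
    norms[norms == 0] = 1.0                           # guard for atoms wiped out by the mask
    x_B = omp(Dm / norms, block[mask].astype(float), S) / norms
    rec = D @ x_B                                     # B' = D x_B with the full dictionary
    err = np.sqrt(np.sum((block - rec) ** 2) / block.size)
    return rec, err
```

For an 8×8 block, `block.size` is 64, so `err` matches the reconstruction error defined above.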

Page 37: A Study of the K-SVD Algorithm for Designing Overcomplete

• Compute the mean reconstruction error over all blocks for each corruption rate; the results are displayed in Figure 6.

Page 38: A Study of the K-SVD Algorithm for Designing Overcomplete

• Two corrupted images and their reconstructions

Page 39: A Study of the K-SVD Algorithm for Designing Overcomplete

Some experiments on K-SVD & SeMF

Review:

K-SVD Algorithm (two steps) for

$$\min_{D,X} \|Y - DX\|_F^2 \quad \text{subject to} \quad \forall i,\ \|x_i\|_0 \le S$$

• Sparse coding (OMP): given Y and D, produce an X with sparsity S.

• Dictionary updating: update D column by column, using SVD or approximate approaches.

Page 40: A Study of the K-SVD Algorithm for Designing Overcomplete

• Original model

• Model with splitting variables:

Splitting variable S1 separates W from T1 (similarly for S2).

Separation facilitates alternating direction methods.

Page 41: A Study of the K-SVD Algorithm for Designing Overcomplete

• Alternating Direction Method FrameworkAugmented Lagrangian (AL):

Alternately minimizing it over one variable:

Page 42: A Study of the K-SVD Algorithm for Designing Overcomplete

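Some experiments on K-SVD & SeMF

Structure-enforcing Matrix Factorization (SeMF):

• Splitting variables:

$$\min_{D,X} \|Y - DX\|_F^2 \quad \text{subject to} \quad D = S_1 \in T_1,\ X = S_2 \in T_2$$

• ADM for the augmented Lagrangian (AL): the function $L_Y$ of $D$, $X$, $S$ and the multipliers adds to $\|Y - DX\|_F^2$ squared Frobenius-norm penalty terms on $D - S_1$ and $X - S_2$ and terms linear in $(D - S_1)$ and $(X - S_2)$.

Notes: $T_1$ (normalized) and $T_2$ (sparse) allow easy projections.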

Page 43: A Study of the K-SVD Algorithm for Designing Overcomplete

Some experiments on K-SVD & SeMF

1. Synthetic experiments

D: random 20×50 matrix with $\|d_i\|_2 = 1$,

X: sparse 50×1500 matrix, sparsity 3, with random locations and values,

Y: DX + noise, giving the signals $\{y_i\}_{i=1}^{1500}$.

Solve

$$\min_{D,X} \|Y - DX\|_F^2 \quad \text{s.t.} \quad \forall i,\ \|x_i\|_0 \le 3$$

with SeMF and K-SVD.

Denote the learned dictionary by $\hat{D}$.

Measure the distance via $1 - |d_i^T \hat{d}_j|$.

If $1 - |d_i^T \hat{d}_j| < 0.01$, we count it as a success.

Compute the number of successful columns.

Page 44: A Study of the K-SVD Algorithm for Designing Overcomplete

Some experiments on K-SVD & SeMF

Dictionary size: 20 x 50
Sparsity: 3
Number of examples: 1500

elapsed time (SeMF): 3.08729 (s), iteration(SeMF): 200
elapsed time (ksvd): 3.98188 (s), iteration(ksvd): 66

Ratio of recovered atoms(SeMF): 100.00%
Ratio of recovered atoms(ksvd): 100.00%

Page 45: A Study of the K-SVD Algorithm for Designing Overcomplete

Some experiments on K-SVD & SeMF

The results for noise levels of 10 dB, 20 dB, 30 dB, and no noise:

50 trials were run at each noise level; the results show the mean number of detected columns in groups of 10 experiments.

Page 46: A Study of the K-SVD Algorithm for Designing Overcomplete

The results for different number of samples:

D:20*50, sparsity: 3, the number of samples: L=150:50:1400

Some experiments on K-SVD & SeMF

Page 47: A Study of the K-SVD Algorithm for Designing Overcomplete

Number of samples needed to achieve 95% recovery at different sparsity levels.

D: 20×50, sparsity: k=1:6, number of samples: L=200:10:2000.

Repeat 5 times for each k and average the resulting L.

Some experiments on K-SVD & SeMF

Page 48: A Study of the K-SVD Algorithm for Designing Overcomplete

2. Applications on denoising

Some experiments on K-SVD & SeMF

a) Given a noisy image y, we can clean it by solving (with OMP)

$$\hat{\alpha} = \arg\min_{\alpha} \|y - D\alpha\|_F^2 \quad \text{s.t.} \quad \|\alpha\|_0 \le k, \qquad \hat{x} = D\hat{\alpha}$$

b) This needs a dictionary D (K-SVD & SeMF).

c) Training data Y:

Use the corrupted image itself! Simply sweep through all N×N patches (with overlaps) and use them for training. An image of size 1000×1000 pixels gives ~10^6 examples, which is more than enough.

Using this Y, compute a dictionary D by K-SVD and by SeMF, respectively.
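Sweeping through all overlapping patches to build the training matrix Y can be sketched with NumPy's `sliding_window_view`. The function name `extract_patches` and the row-major flattening of each patch are assumptions of this sketch.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def extract_patches(image, n):
    """Return an (n*n) x (number of patches) matrix of all overlapping n x n patches."""
    windows = sliding_window_view(image, (n, n))   # shape (H-n+1, W-n+1, n, n)
    return windows.reshape(-1, n * n).T            # one flattened patch per column
```

For a 1000×1000 image and 8×8 patches this gives 993 × 993 columns, consistent with the roughly 10^6 examples mentioned above.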

Page 49: A Study of the K-SVD Algorithm for Designing Overcomplete
Page 50: A Study of the K-SVD Algorithm for Designing Overcomplete

Some experiments on K-SVD & SeMF

Page 51: A Study of the K-SVD Algorithm for Designing Overcomplete

1. M. Aharon, M. Elad, and A. Bruckstein. The K-SVD: An algorithm for designing of overcomplete dictionaries for sparse representation, IEEE Trans. on Signal Processing, vol. 54, no. 11, pp. 4311-4322, 2006.

2. R. Rubinstein, M. Zibulevsky, and M. Elad. Efficient implementation of the K-SVD algorithm using batch orthogonal matching pursuit. Technical Report CS-2008-08, Technion - Israel Institute of Technology, 2008.

3. J.A. Tropp, A.C. Gilbert. Signal recovery from random measurements via orthogonal matching pursuit. IEEE Trans. Inform. Theory, 53 (12) (2007), pp. 4655–4666