New Algorithms for Sparse Representation of Discrete ...jyan/Publications/PacRim2011 slides.pdf · New Algorithms for Sparse Representation of Discrete Signals Based on ‘p-‘2

New Algorithms for Sparse Representation of Discrete Signals Based on $\ell_p$-$\ell_2$ Optimization

Jie Yan and Wu-Sheng Lu

Department of Electrical and Computer Engineering
University of Victoria, Victoria, BC, Canada

August 25, 2011

OUTLINE

1 INTRODUCTION

2 PRELIMINARIES

3 ALGORITHMS FOR $\ell_p$-$\ell_2$ OPTIMIZATION

4 SIMULATIONS

5 CONCLUSIONS

INTRODUCTION

Motivation

A central problem in sparse signal processing is to seek an approximate solution to an ill-posed linear system while requiring that the solution have the fewest nonzero entries.

Many applications lead to minimizing the following $\ell_1$-$\ell_2$ function:

$$F(s) = \|x - \Psi s\|_2^2 + \lambda \|s\|_1.$$

$F(s)$ is convex, so its global minimizer can be identified.

Motivation Cont’d

For the $\ell_1$-$\ell_2$ problem, iterative-shrinkage algorithms have emerged as a family of highly effective numerical methods.

Of particular interest is FISTA/MFISTA, a state-of-the-art algorithm developed by A. Beck and M. Teboulle.

Chartrand and Yin have proposed algorithms for $\ell_p$-$\ell_2$ minimization with $0 < p < 1$ and demonstrated improved performance relative to $\ell_1$-$\ell_2$ minimization.

Contribution

New algorithms for sparse representation based on $\ell_p$-$\ell_2$ optimization are proposed.

Our algorithms are built on MFISTA with several major changes.

The soft-shrinkage step in MFISTA is replaced by a global solver for the minimization of a 1-D nonconvex $\ell_p$-$\ell_2$ problem.

Two efficient techniques for solving the 1-D $\ell_p$-$\ell_2$ problem are proposed.

PRELIMINARIES

Sparse representations in overcomplete bases

A typical sparse representation problem can be stated as finding the sparsest representation of a discrete signal $x$ under a (possibly overcomplete) dictionary $\Psi$.

The problem can be described as minimizing $\|s\|_0$ subject to $x = \Psi s$ or $\|x - \Psi s\|_2 \le \varepsilon$. Unfortunately, this problem is NP-hard.

A popular approach is to consider the relaxed, unconstrained convex $\ell_1$-$\ell_2$ problem

$$\min_s \; F(s) = \|x - \Psi s\|_2^2 + \lambda \|s\|_1.$$

Iterative shrinkage-thresholding algorithm (ISTA)

ISTA can be viewed as an extension of the classical gradient algorithm. Owing to its simplicity, it is suitable for solving large-scale problems.

A key step in its $k$th iteration is to approximate $F(s)$ by an easy-to-handle convex upper bound (up to a constant):

$$\hat{F}(s) = \frac{L}{2}\|s - c_k\|_2^2 + \lambda \|s\|_1.$$

The minimizer of $\hat{F}(s)$ is a soft shrinkage of the vector $c_k$ with constant threshold $\lambda/L$, namely $s_k = T_{\lambda/L}(c_k)$.

ISTA provides a convergence rate of $O(1/k)$.
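The ISTA iteration described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes the standard choice $c_k = s_{k-1} - \frac{1}{L}\nabla f(s_{k-1})$ with $f(s) = \|x - \Psi s\|_2^2$ (the slides do not spell out $c_k$), and it takes $L$ as a user-supplied upper bound on the Lipschitz constant of $\nabla f$. The function names and the use of plain lists are hypothetical.

```python
def soft_threshold(c, tau):
    """Entrywise soft shrinkage T_tau(c) = sign(c) * max(|c| - tau, 0)."""
    return [(abs(ci) - tau) * (1.0 if ci > 0 else -1.0) if abs(ci) > tau else 0.0
            for ci in c]


def matvec(A, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def ista(Psi, x, lam, L, iters=200):
    """Sketch of ISTA for min_s ||x - Psi s||_2^2 + lam * ||s||_1.

    Assumption: L >= 2 * lambda_max(Psi^T Psi), the Lipschitz constant
    of the gradient of the quadratic term.
    """
    PsiT = [list(col) for col in zip(*Psi)]
    s = [0.0] * len(Psi[0])
    for _ in range(iters):
        residual = [a - b for a, b in zip(matvec(Psi, s), x)]   # Psi s - x
        grad = [2.0 * g for g in matvec(PsiT, residual)]        # 2 Psi^T (Psi s - x)
        c = [si - gi / L for si, gi in zip(s, grad)]            # gradient step
        s = soft_threshold(c, lam / L)                          # shrinkage step
    return s
```

For an orthonormal dictionary such as the 2-by-2 identity, `ista([[1.0, 0.0], [0.0, 1.0]], [3.0, 0.2], lam=1.0, L=2.0)` shrinks each entry of `x` by $\lambda/L = 0.5$ and zeroes the entry whose magnitude is below that threshold.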

FISTA and MFISTA

FISTA is built on ISTA with an extra step in each iteration that, with the help of a sequence of scaling factors $t_k$, creates an auxiliary iterate $b_{k+1}$ by moving the current iterate $s_k$ along the direction $s_k - s_{k-1}$ so as to improve the subsequent iterate $s_{k+1}$.

Monotone FISTA (MFISTA) adds a further step to FISTA so that the objective values decrease monotonically.

FISTA and MFISTA possess a much improved convergence rate of $O(1/k^2)$.
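The extra steps can be made concrete with a short Python sketch. The update rules follow the MFISTA scheme of Beck and Teboulle written in the notation of these slides. The function signature, the generic `prox` argument (the shrinkage step passed in as a callable), and the use of plain lists are assumptions made for illustration.

```python
import math


def mfista(grad_f, F, prox, L, s0, iters=100):
    """Sketch of MFISTA for min_s F(s) = f(s) + g(s).

    grad_f(s): gradient of the smooth term f
    F(s):      full objective, used for the monotonicity test
    prox(c):   minimizer of (L/2)||s - c||^2 + g(s), e.g. soft shrinkage
    """
    s_prev = list(s0)
    b = list(s0)          # auxiliary iterate b_k
    t = 1.0               # scaling factor t_k
    for _ in range(iters):
        g = grad_f(b)
        c = [bi - gi / L for bi, gi in zip(b, g)]
        z = prox(c)                                    # candidate iterate
        t_next = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        # Monotone step: keep the previous iterate if the candidate is worse.
        s = z if F(z) <= F(s_prev) else s_prev
        # Auxiliary iterate built from z, s_k and s_{k-1}.
        b = [si + (t / t_next) * (zi - si) + ((t - 1.0) / t_next) * (si - spi)
             for si, zi, spi in zip(s, z, s_prev)]
        s_prev, t = s, t_next
    return s_prev
```

If the candidate `z` is always accepted, the middle term vanishes and the update reduces to the FISTA rule $b_{k+1} = s_k + \frac{t_k - 1}{t_{k+1}}(s_k - s_{k-1})$.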

ALGORITHMS FOR $\ell_p$-$\ell_2$ OPTIMIZATION

An interesting development in sparse representation and compressive sensing is to investigate a nonconvex variant of basis pursuit obtained by replacing the $\ell_1$-norm term with an $\ell_p$ norm with $0 < p < 1$.

Naturally, an $\ell_p$-$\ell_2$ counterpart can be formulated as

$$\min_s \; F(s) = \|x - \Psi s\|_2^2 + \lambda \|s\|_p^p.$$

The algorithms we propose in this paper are developed within the framework of FISTA/MFISTA, in that

$$s_k = \arg\min_s \left\{ \frac{L}{2}\|s - c_k\|_2^2 + \lambda \|s\|_p^p \right\}. \tag{1}$$

With $0 < p < 1$, the setting is closer to the $\ell_0$-norm problem, hence an improved sparse representation is expected.

However, soft shrinkage fails to work because (1) is nonconvex.

The computation of $s_k$ reduces to solving $M$ one-dimensional (1-D) minimization problems, each of which has the form

$$s^* = \arg\min_s \left\{ u(s) = \frac{L}{2}(s - c)^2 + \lambda |s|^p \right\}. \tag{2}$$

We propose two techniques to find the global solution of (2) with $0 < p < 1$.
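To see why (1) decouples, note that both terms of its objective are sums over the entries of $s$. Writing $s_i$ and $c_{k,i}$ for the $i$th entries of $s$ and $c_k$ (index notation introduced here for illustration), and taking $M$ to be the length of $s$:

```latex
\frac{L}{2}\|s - c_k\|_2^2 + \lambda \|s\|_p^p
  = \sum_{i=1}^{M} \left[ \frac{L}{2}\,(s_i - c_{k,i})^2 + \lambda\,|s_i|^p \right]
```

Each summand depends on a single entry $s_i$, so it can be minimized independently, which gives (2) with $c = c_{k,i}$.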

Method 1: When p is rational

Suppose $p = a/b$ with $a$, $b$ positive integers and $a < b$. Let us first consider $s \ge 0$; with the substitution $s = z^b$, the problem is converted to minimizing

$$v(z) = u(s)\big|_{s = z^b} = \frac{L}{2}(z^b - c)^2 + \lambda z^a$$

whose gradient is

$$\nabla v(z) = L b z^{2b-1} - L c b z^{b-1} + \lambda a z^{a-1}.$$

The global minimizer $z_+^*$ must be either 0 or one of the stationary points where $\nabla v(z) = 0$. The MATLAB function roots was applied to find all the roots of the polynomial $\nabla v(z)$. After identifying $z_+^*$, we have $s_+^* = (z_+^*)^b$ as the solution that minimizes $u(s)$ for $s \ge 0$.
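As a worked instance of the substitution, take $p = 1/2$, i.e., $a = 1$ and $b = 2$ (a value chosen here purely for illustration):

```latex
v(z) = \frac{L}{2}(z^2 - c)^2 + \lambda z, \qquad
\nabla v(z) = 2L z^3 - 2Lc\, z + \lambda
```

This is a cubic in $z$, of degree $2b - 1 = 3$. Since $s = z^b$ with $z \ge 0$, only its real nonnegative roots, together with $z = 0$, need to be compared when evaluating $v$.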

Method 1: When p is rational Cont’d

In a similar way, the global minimizer $s_-^*$ that minimizes $u(s)$ for $s \le 0$ can be computed, and the global minimizer $s^*$ is obtained as $s^* = \arg\min_s \{u(s) : s = s_+^*, s_-^*\}$.

The above $\ell_p$ solver is incorporated into a FISTA/MFISTA-type algorithm.

In each iteration, the computational complexity is $O(M(2b-1)^3)$.

The method works well whenever $p$ is rational with a small integer denominator, such as $p \in \{1/4, 1/3, 1/2, 2/3, 3/4\}$.

Method 2: When p is an arbitrary real in (0, 1)

[Figure: plots of $a(s) = L(s-c)^2/2$, $b(s) = \lambda |s|^p$, and $u(s) = a(s) + b(s)$ for $s$ between about $-2$ and $5$.]

Let us examine the function to minimize, i.e., $u(s) = \frac{L}{2}(s - c)^2 + \lambda |s|^p$.

If $c = 0$, then $s^* = 0$. Next, we consider the case $c > 0$.

Method 2: When p is an arbitrary real in (0, 1) Cont’d

It can be observed that the global minimizer $s^*$ lies in $[0, c]$, where the function of interest becomes

$$u(s) = \frac{L}{2}(s - c)^2 + \lambda s^p \quad \text{for } s \in [0, c].$$

The convexity of $u(s)$ can be analyzed by examining its second-order derivative, i.e.,

$$u''(s) = L + \lambda p(p - 1)s^{p-2}.$$

Method 2: When p is an arbitrary real in (0, 1) Cont’d

The inflection point, at which $u''(s) = 0$, is $s_c = \left[\frac{\lambda p(1-p)}{L}\right]^{1/(2-p)}$.

For $0 \le s < s_c$, $u(s)$ is concave as $u''(s) < 0$; for $s > s_c$, $u(s)$ is convex as $u''(s) > 0$.
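The expression for $s_c$ follows directly from setting the second derivative to zero; the intermediate steps, which the slides omit, are:

```latex
u''(s) = L - \lambda p(1-p)\,s^{p-2} = 0
\;\Longleftrightarrow\;
s^{2-p} = \frac{\lambda p(1-p)}{L}
\;\Longleftrightarrow\;
s_c = \left[\frac{\lambda p(1-p)}{L}\right]^{1/(2-p)}
```

Because $s^{p-2}$ decreases as $s$ grows, $u''$ is negative below $s_c$ and positive above it.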

Case (a): $s_c \ge c$

[Figure: the points 0, $c$, and $s_c$ marked on the $s$-axis.]

$u(s)$ is concave in $[0, c]$. As a result, $s^*$ must be either 0 or $c$; namely, $s^* = \arg\min_s \{u(s) : s = 0, c\}$.

Case (b): $s_c < c$

[Figure: the points 0, $s_c$, and $c$ marked on the $s$-axis.]

$u(s)$ is concave in $[0, s_c]$ and convex in $[s_c, c]$. We argue that $s^*$ must be either the point $s_t$ that minimizes the convex function $u(s)$ in $[s_c, c]$ or the boundary point 0.
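Cases (a) and (b) translate into a short scalar routine, sketched below in Python. Several details are assumptions rather than the authors' stated procedure. The slides do not say how $s_t$ is located, so bisection on $u'(s)$ over $[s_c, c]$ is used here. The case $c < 0$ is handled through the symmetry of $u$ under $(s, c) \to (-s, -c)$. $\lambda > 0$ and $L > 0$ are assumed, and the name `lp_shrink` is hypothetical.

```python
def lp_shrink(c, L, lam, p, iters=200):
    """Sketch: global minimizer of u(s) = (L/2)(s - c)^2 + lam*|s|^p, 0 < p < 1.

    Assumes L > 0 and lam > 0.
    """
    if c == 0.0:
        return 0.0
    sign = 1.0 if c > 0 else -1.0
    c = abs(c)                      # reduce to c > 0 by symmetry

    def u(s):
        return 0.5 * L * (s - c) ** 2 + lam * s ** p

    def du(s):                      # u'(s), valid for s > 0
        return L * (s - c) + lam * p * s ** (p - 1.0)

    # Inflection point: u is concave on [0, s_c] and convex beyond it.
    s_c = (lam * p * (1.0 - p) / L) ** (1.0 / (2.0 - p))

    if s_c >= c:
        # Case (a): u is concave on [0, c], so compare the two end points.
        return sign * (0.0 if u(0.0) <= u(c) else c)

    # Case (b): u' attains its minimum at s_c; if it is nonnegative there,
    # u is nondecreasing on (0, c] and the minimizer is 0.
    if du(s_c) >= 0.0:
        return 0.0

    # Otherwise u'(s_c) < 0 < u'(c): bisect for the stationary point s_t.
    lo, hi = s_c, c
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if du(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    s_t = 0.5 * (lo + hi)
    return sign * (s_t if u(s_t) < u(0.0) else 0.0)
```

Applying `lp_shrink` entrywise to $c_k$ plays the role that soft shrinkage plays in the $\ell_1$ case. With `c=2.0, L=1.0, lam=1.0, p=0.5` the routine falls into case (b) and returns a value slightly above 1.6, whereas `c=0.2` with the same parameters falls into case (a) and returns 0.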

So far, we have proposed two techniques for the global minimization of the 1-D nonconvex $\ell_p$ subproblem.

Based on these, an MFISTA-type algorithm for the proposed $\ell_p$-$\ell_2$ problem can be developed by replacing the shrinkage step of the conventional MFISTA with the above 1-D $\ell_p$ solver.

The algorithm we developed will be referred to as the modified MFISTA.

SIMULATIONS

Test signal x: Bumps signal of length N = 256.

[Figure: plot of the Bumps test signal over samples 0 to 250, with values between 0 and 1.]

Our objective is to find a representation vector $s \in \mathbb{R}^{3N \times 1}$ for signal $x$ such that $x \approx \Psi s$ with $s$ as sparse as possible.

The dictionary adopted is a combination of three orthonormal bases, $\Psi = [\Psi_1 \; \Psi_2 \; \Psi_3] \in \mathbb{R}^{N \times 3N}$, where $\Psi_1$ is the Dirac basis, $\Psi_2$ is the 1-D DCT basis, and $\Psi_3$ is the wavelet basis generated by the orthogonal Daubechies wavelet D8.

To this end, we solve the $\ell_p$-$\ell_2$ problem with $p = 1, 0.95, 0.9, 0.85, 0.8$, and $0.75$, respectively.

[Figure: $Z$ (percentage of zeros, axis from 0.8 to 0.94) versus $R$ (relative equation error, axis from 0.01 to 0.08), one curve each for $p = 1, 0.95, 0.9, 0.85, 0.8, 0.75$.]

Comparison of $\ell_p$-$\ell_2$ sparse representations of the “bumps” signal for $p = 1, 0.95, 0.9, 0.85, 0.8, 0.75$ in terms of relative equation error and signal sparsity in the dictionary domain.


Several observations

1 For a fixed relative equation error, the sparsity improves as a smaller power $p$ is used;

2 For a fixed level of sparsity, the relative equation error decreases as a smaller power $p$ is used;

3 The performance improvement appears to be nonlinear with respect to the change in power $p$.

[Figure: two panels, “Sparse signal computed with p=1” and “Sparse signal computed with p=0.75”, each plotted over indices 100 to 700 with values between −0.03 and 0.02.]

Sparse representation of the “bumps” signal based on $\ell_1$ and $\ell_{0.75}$ reconstruction.

For a fair comparison, both solutions yield the same relative equation error of 0.00905.

The sparsity achieved was 87.24% for $p = 0.75$ versus 81.77% for $p = 1$.
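The two figures of merit can be computed as sketched below. The slides do not define them explicitly, so the definitions here are assumptions: $R = \|x - \Psi s\|_2 / \|x\|_2$ for the relative equation error, and $Z$ as the percentage of zero entries of $s$. Under that reading, and with $3N = 768$ coefficients, the reported percentages are consistent with 670 and 628 zero entries, respectively; these counts are inferred here and are not stated in the slides.

```python
import math


def relative_equation_error(Psi, s, x):
    """Assumed definition: ||x - Psi s||_2 / ||x||_2 (Psi is a list of rows)."""
    residual = [xi - sum(a * b for a, b in zip(row, s)) for row, xi in zip(Psi, x)]
    return math.sqrt(sum(r * r for r in residual)) / math.sqrt(sum(xi * xi for xi in x))


def percentage_of_zeros(s, tol=0.0):
    """Assumed definition: share of entries with |s_i| <= tol, in percent."""
    return 100.0 * sum(1 for si in s if abs(si) <= tol) / len(s)


# Arithmetic check against the reported sparsity levels, assuming 3N = 768 entries.
n_coeffs = 3 * 256
print(round(100.0 * 670 / n_coeffs, 2))  # 87.24
print(round(100.0 * 628 / n_coeffs, 2))  # 81.77
```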

CONCLUSIONS

New algorithms for sparse representation based on $\ell_p$-$\ell_2$ optimization with $0 < p < 1$ are proposed.

In particular, the soft-shrinkage step in MFISTA is replaced by a global solver for the minimization of a 1-D nonconvex $\ell_p$ problem.

Two efficient techniques for solving the 1-D $\ell_p$ problem in question are proposed.

Simulation studies for sparse representations are presented to evaluate the performance of the proposed algorithms with various values of $p$ and to compare them with the basis pursuit (BP) benchmark with $p = 1$.
