
TWO DIMENSIONAL SYNTHESIS SPARSE MODEL

Na Qi 1, Yunhui Shi 1, Xiaoyan Sun 2, Jingdong Wang 2, Baocai Yin 1

1 Beijing Key Laboratory of Multimedia and Intelligent Software Technology, College of Computer Science and Technology, Beijing University of Technology, Beijing, China
2 Microsoft Research Asia, Beijing, China

[email protected], {syhzm,ybc}@bjut.edu.cn, {xysun,jingdw}@microsoft.com

ABSTRACT

Sparse representation has been proved to be very efficient in machine learning and image processing. Traditional image sparse representation formulates an image into a one dimensional (1D) vector, which is then represented by a sparse linear combination of the basis atoms from a dictionary. This 1D representation ignores the local spatial correlation inside an image. In this paper, we propose a two dimensional (2D) sparse model to more efficiently exploit the horizontal and vertical features, which are represented by two dictionaries simultaneously. The corresponding sparse coding and dictionary learning algorithms are also presented in this paper. The 2D synthesis model is further evaluated in image denoising. Experimental results demonstrate that our 2D synthesis sparse model outperforms the state-of-the-art 1D model in terms of both objective and subjective quality.

Index Terms— Synthesis Sparse Model, Sparse Representation, 2D-KSVD, Dictionary Learning, Image Denoising

1. INTRODUCTION

Sparse representation has been widely studied and has provided promising performance in numerous signal processing tasks such as image denoising, texture synthesis, audio processing, and image classification. Modeling a signal x ∈ R^d by sparse representation involves two approaches, synthesis sparse modeling and analysis sparse modeling [1]. The synthesis model can be defined as:

x = Db, s.t. ‖b‖_0 = k, (1)

where D ∈ R^{d×n} is an over-complete dictionary in which each column denotes an atom, and b ∈ R^n is a sparse vector. The sparsity is measured by the l_0 norm ‖·‖_0, which counts the number of nonzero entries k of a vector. In this model, x is synthesized by a linear combination of certain atoms from D [2]. Correspondingly, the analysis sparse model is defined as:

b = Ωx, s.t. ‖b‖_0 = p − l, (2)

where Ω ∈ R^{p×d} is a linear operator (also called a dictionary), and l denotes the co-sparsity of the signal x. The analysis representation vector b is sparse with l zeros. The zeros in b denote the low-dimensional subspace to which the signal x belongs.

The dictionaries in both the analysis and synthesis models play an important role in sparse representation. They can be roughly classified into two categories, analytic dictionaries and learned dictionaries, according to the generation method. The analytic dictionaries are predefined and built from mathematical models. Typical predefined dictionaries for natural images include Wavelets [3], Curvelets [4], Contourlets [5], and Bandelets [6]. In contrast, learned dictionaries are produced from training samples. Compared with the analytic dictionaries, which have limited expressiveness, the learned dictionaries can adaptively represent a wider range of signal characteristics.

Different learning algorithms, e.g., K-SVD [7] and sparse coding [8], have been proposed to generate learned dictionaries from a set of samples. Sparse coding, as a fundamental method in dictionary learning, provides a class of algorithms for finding sparse representations of a training set. Many algorithms, such as matching pursuit (MP), orthogonal MP (OMP) [9], Lasso, and proximal methods [10], have been proposed to solve the sparse pursuit problem. Lee et al. [8] formulate the sparse coding algorithms as a combination of two convex optimization problems: the L1-regularized least squares problem, solved by feature-sign search to learn the coefficients, and the L2-constrained least squares problem, solved by a Lagrange dual method to learn the bases, for any sparsity penalty function. The K-SVD is an iterative method that alternates between sparse coding of the examples based on the current dictionary and an updating process of the dictionary atoms so as to better fit the data [7]. It is a two-phase block coordinate-relaxation approach.

We observe that all the previous sparse models with learned dictionaries treat an input signal as a 1D vector. When dealing with 2D signals, they are reshaped into 1D vectors. For natural images, this kind of reshaping breaks the local correlation inherent inside images. To fully make use of the local correlation inside natural images, we propose modeling the horizontal and vertical image features with a 2D synthesis sparse model. Although Duarte et al. have proposed Kronecker compressive sensing [11] and Caiafa et al. have proposed the block sparse representation of tensors using Kronecker bases [12], the dictionaries in that related work are not learned. Our key contribution in this paper lies in learning the dictionaries and computing the coefficients. We validate the power of the model on image denoising, achieving advanced performance compared with the state-of-the-art 1D synthesis models.

Fig. 1. Illustration of the 1D synthesis sparse model. An image patch X is reshaped to a vector x in the column direction. In the 1D model, x is modeled as x = Db; that is, x can be represented by k atoms of a given dictionary D.

The rest of this paper is organized as follows. Our 2D synthesis sparse model (2D SSM) and the corresponding 2D dictionary learning algorithm (2D DL) are described in Section 2. The performance of our 2D SSM in image denoising is demonstrated in Section 3. Finally, Section 4 concludes this paper.

2. 2D SYNTHESIS SPARSE MODEL

2.1. Notation

Before presenting our 2D synthesis sparse model, we define the notation used in our model for easy understanding. In the following, X denotes a matrix, and x is the reshaped vector of the matrix X. We define the l_q-norm of a vector x ∈ R^m as ‖x‖_q = (∑_{j=1}^{m} |x_j|^q)^{1/q}, where x_j denotes the j-th coordinate of x. ‖·‖_0 denotes the l_0-norm, valued by the number of non-zero entries of a vector. The Frobenius norm of X is ‖X‖_F = (∑_{i=1}^{m} ∑_{j=1}^{n} x_{ij}^2)^{1/2}, where x_{ij} denotes the entry of X at the i-th row and j-th column. The operator ⊗ represents the Kronecker product. We also define a as the vector form of the matrix A, and A as a set of matrices. d_j^{(1)} and d_j^{(2)} are the j-th columns of the two dictionaries D1 and D2, respectively.

Fig. 2. Illustration of our proposed 2D SSM. An image patch is modeled as X = D1 B^T D2^T, which denotes that it can be directly composed by two dictionaries D1, D2 and a sparse matrix B. D1 and D2 are the horizontal and vertical dictionaries, respectively. The curves on the x-ordinate and y-ordinate reflect the distinct features in the image patch.

2.2. Our Proposed 2D Model

Fig. 1 illustrates the traditional 1D sparse model. Given a 2D patch X ∈ R^{d1×d1}, the 1D model first reshapes the patch X to a vector x. This vector x satisfies x = Db in the 1D sparse model, where b is the sparse representation, D ∈ R^{d×n} is a dictionary, and ‖b‖_0 = k denotes the sparsity. In other words, the vector x can be composed by k atoms of the dictionary D, where d = d1^2 and n ≫ d. As shown in this figure, the sparsity is only exploited in one direction, which ignores the 2D correlation inside the image patch.

To efficiently exploit the 2D correlation inside images, we propose the 2D sparse model. Our 2D synthesis sparse model is defined as follows:

X = D1 B^T D2^T,
s.t. ‖B‖_0 = k,
X = D1 A1, ‖A1‖_0 = p, (3)
X^T = D2 A2, ‖A2‖_0 = q,

where D1 ∈ R^{d1×n1} and D2 ∈ R^{d1×n2} are the horizontal and vertical dictionaries, respectively. A1 ∈ R^{n1×d1} and A2 ∈ R^{n2×d1} are the corresponding horizontal and vertical sparse representations of the image patch X, respectively. As shown in Fig. 2, the image patch X can be composed by the two directional dictionaries and a sparse representation matrix B ∈ R^{n2×n1}. The sparsity of a matrix Z is defined as ‖Z‖_0, where ‖·‖_0 is the count of the non-zeros in the matrix.

In the following, we describe our 2D SSM in detail. Given an image patch X, there is a horizontal dictionary D1 such that each column x_j of the patch X can be composed by a sparse linear combination of the columns of D1. More precisely, the dictionary D1 is learned along with a matrix of decomposition coefficients A1 = [a_1^{(1)}, a_2^{(1)}, . . . , a_{d1}^{(1)}] such that x_j = D1 a_j^{(1)} for every column of the matrix X. So the image patch X can be denoted as X = D1 A1. In the same way, X^T = D2 A2.

Either X = D1 A1 or X^T = D2 A2 takes only 1D correlation into consideration; we should take the correlation of both directions into account. The sparse representation coefficient A1 has some redundancy in the vertical direction, and D2 reflects the correlation in the vertical direction. So A1^T, the transpose of A1, must be composed as a linear combination of a few atoms from the vertical dictionary D2, which can be expressed as A1^T = D2 B, that is, X = D1 B^T D2^T. Here, B is the sparse coefficient in our proposed model. Obviously, A2^T = D1 B^T and X^T = D2 B D1^T. When both D1 and D2 are fixed, our 2D sparse model can be easily converted to the 1D sparse model x = Db, with D = D1 ⊗ D2.
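To make the relation between the 2D model and its 1D Kronecker form concrete, the following NumPy sketch synthesizes a patch with X = D1 B^T D2^T and checks that the same patch is obtained from the Kronecker dictionary. The dictionary sizes and random data are illustrative assumptions, and the factor ordering of the Kronecker product depends on the vectorization convention; the sketch uses row-major flattening, for which D = D1 ⊗ D2 matches the statement above.

import numpy as np

rng = np.random.default_rng(0)
d1, n1, n2, k = 8, 16, 16, 4                      # illustrative sizes: 8x8 patches, 8x16 dictionaries

# Random stand-ins for the learned dictionaries D1, D2, with unit-norm atoms.
D1 = rng.standard_normal((d1, n1))
D1 /= np.linalg.norm(D1, axis=0)
D2 = rng.standard_normal((d1, n2))
D2 /= np.linalg.norm(D2, axis=0)

# A k-sparse coefficient matrix B of size n2 x n1.
B = np.zeros((n2, n1))
support = rng.choice(n2 * n1, size=k, replace=False)
B.flat[support] = rng.standard_normal(k)

# 2D synthesis model: X = D1 B^T D2^T, as in Eq. (3).
X = D1 @ B.T @ D2.T

# Equivalent 1D synthesis with the Kronecker dictionary D = D1 kron D2,
# where x and b are the row-major vectorizations of X and B^T.
D = np.kron(D1, D2)
x = X.reshape(-1)
b = B.T.reshape(-1)
print(np.allclose(x, D @ b))                      # True: both forms give the same patch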

2.3. Dictionary Learning

This subsection presents the dictionary training method for 2D SSM, that is, how to obtain the dictionaries D1 and D2 from training image patches.

Given a training set I = [Y1, Y2, . . . , Y_M] ∈ R^{d1×M0}, where Y_j is an image patch of size d1 × d1 and M0 = d1 × M, the training set consists of M patches Y_j ∈ R^{d1×d1}. We assume every sample Y_j is contaminated by noise and satisfies the model Y_j = X_j + V_j, where the original patch X_j satisfies X_j ≈ D1 B^T D2^T, i.e., ‖D1 B^T D2^T − X_j‖ ≤ ε with ‖B‖_0 = L, and V_j is a zero-mean white Gaussian additive noise patch. The image patch X_j belongs to the (ε, L, D1, D2)-model. We aim at learning the two directional dictionaries from the training set. The optimization task can be formulated as follows:

{D1, D2, B} = arg min_{D1,D2,B} ‖I − Î‖_F^2,
s.t. D1 B_j^T D2^T = X_j, 1 ≤ j ≤ M, (4)
‖B_j^T‖_0 = k,

where Î denotes the reconstruction [X_1, . . . , X_M] of the training set, B is the set of sparse coefficients for all samples, and B_j is the sparse coefficient matrix for the image patch X_j.

As in classical dictionary learning, the optimization problem is not jointly convex in (D, A), but it is convex with respect to D when A is fixed and vice versa [13]. So the above problem can be solved using a two-phase block-coordinate-relaxation approach. In the first phase, we optimize B with D1 and D2 fixed; in the second phase, we update D1 and D2 using the computed B. The process repeats until some stopping criterion is satisfied.

Table 1 shows our 2D synthesis dictionary learning algorithm. The sparse coding step and the dictionary update step are discussed in detail in Sections 2.3.1 and 2.3.2, respectively.

Table 1. 2D Synthesis Dictionary Learning Algorithm

Input: training set I, number of iterations num
Initialization: set the horizontal dictionary D1 and the vertical dictionary D2 to the redundant DCT dictionary.
For n = 1 : num
  Sparse coding step:
    Compute D by D = D1 ⊗ D2.
    Compute B with (5) or (6) and obtain A1, A2.
  Dictionary update step:
    For each column in D1, update it by (7).
    For each column in D2, update it by (8).
End For
Output: learned dictionaries D1, D2

2.3.1. Optimization of B with the Dictionaries D1, D2 Fixed

When the two directional dictionaries D1, D2 are given, updating the matrix B amounts to solving M independent optimization problems, one for each image patch Y_j. For each Y_j, the following optimization problem should be solved:

B_j = arg min_{B_j} ‖Y_j − D1 B_j^T D2^T‖_F^2 + λ‖B_j‖_0, (5)

where B is the sparse coefficient set of the whole training set and B_j^T is the sparse coefficient matrix of the image patch Y_j. The above problem is a 2D sparse-coding, or sparse-pursuit, problem, which can be solved by 2D orthogonal matching pursuit [14]. Fang et al. have proved that 2D-OMP is in fact equivalent to 1D-OMP: both algorithms output exactly the same results. However, the complexity and memory usage of 2D-OMP are much lower than those of 1D-OMP. Thus, 2D-OMP can be used as an alternative to 1D-OMP in 2D sparse signal recovery.

What is the complexity of our proposed sparse coding step? The problem denoted in (5) can be reformulated as:

b_j = arg min_{b_j} ‖y_j − D b_j‖_2^2 + λ‖b_j‖_0, (6)

where b_j and y_j denote the reshaped vectors of B_j^T and Y_j, respectively. Following the analysis in [14], if the problem is solved directly in the form of (6), the complexity of 1D-OMP is O(d × n), where d = d1 × d1 and n = n1 × n2. The complexity of using 2D-OMP to solve this problem is O(d1 × n), roughly 1/d1 of that of 1D-OMP. Note that our 2D sparse coding only needs dictionary storage of size d1 × q, where q = n1 + n2, whereas the 1D one needs d × n, where n ≥ q. Clearly, our 2D sparse-coding step has lower complexity and much lower memory usage than the 1D one.
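Since 2D-OMP and 1D-OMP return the same coefficients, the minimal sketch below solves the sparse-pursuit step with a plain greedy OMP on the Kronecker dictionary D = D1 ⊗ D2. It is a reference implementation under assumed sizes, not the lower-cost 2D-OMP of [14], and the random dictionary in the check is purely illustrative.

import numpy as np

def omp(D, y, k):
    # Greedy orthogonal matching pursuit: find a k-sparse b with y ~= D b.
    residual = y.copy()
    support = []
    b = np.zeros(D.shape[1])
    coef = np.zeros(0)
    for _ in range(k):
        # Pick the atom most correlated with the current residual.
        j = int(np.argmax(np.abs(D.T @ residual)))
        support.append(j)
        # Least-squares fit on the chosen support, then update the residual.
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    b[support] = coef
    return b

# Illustrative check: code a synthetic 4-sparse signal against a random 64x256
# dictionary, which plays the role of D = D1 kron D2 when coding one 8x8 patch (Eq. (6)).
rng = np.random.default_rng(1)
D = rng.standard_normal((64, 256))
D /= np.linalg.norm(D, axis=0)
b_true = np.zeros(256)
b_true[rng.choice(256, size=4, replace=False)] = rng.standard_normal(4)
y = D @ b_true
b_hat = omp(D, y, k=4)
print(np.linalg.norm(y - D @ b_hat))              # close to zero
# The recovered vector reshapes back to the sparse matrix as B_j = b_hat.reshape(n1, n2).T
# (with n1 = n2 = 16 in this illustration).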

2.3.2. Optimization of D1, D2 with the Sparse Coefficient B Fixed

Given the sparse coefficient B, we then update the 2D dictionaries D1 and D2. Regarding D1 and D2 as relatively exact dictionaries, we have A1 = [(D2 B_1)^T, (D2 B_2)^T, . . . , (D2 B_M)^T] and A2 = [(D1 B_1^T)^T, (D1 B_2^T)^T, . . . , (D1 B_M^T)^T]. We use A1 and A2 to update the 2D dictionaries D1 and D2 by:

D1 = arg min_{D1} ‖I − D1 A1‖_F^2, (7)

D2 = arg min_{D2} ‖J − D2 A2‖_F^2, (8)

where J = [Y_1^T, Y_2^T, . . . , Y_M^T] ∈ R^{d1×M0}.

To solve (7), we update the dictionary D1 together with the nonzero coefficients in A1. Assuming both I and A1 are fixed, we consider only one column of the dictionary, d_j^{(1)}, and its corresponding coefficients, the j-th row of A1, denoted as A_{1T}^j. Equation (7) can then be rewritten as:

‖I − D1 A1‖_F^2 = ‖I − ∑_{j=1}^{n1} d_j^{(1)} A_{1T}^j‖_F^2
= ‖(I − ∑_{j≠k} d_j^{(1)} A_{1T}^j) − d_k^{(1)} A_{1T}^k‖_F^2
= ‖E_k − d_k^{(1)} A_{1T}^k‖_F^2. (9)

We have decomposed the multiplication D1 A1 into a sum of n1 rank-1 matrices. Among these, n1 − 1 terms are assumed to be fixed, and one, the k-th, remains in question. The matrix E_k stands for the error over all M examples when the k-th atom is removed. Here, we use K-SVD [7] to find the alternative d_k^{(1)} and A_{1T}^k. K-SVD finds the closest rank-1 matrix that approximates E_k, which effectively minimizes the error defined in (9). In such an update of d_k^{(1)}, the sparsity constraint is not enforced. The details of K-SVD can be found in [7].

We now discuss the complexity as well as the memory usage of our dictionary update step. The complexity of computing the SVD of an m × n matrix is O(min{mn^2, m^2 n}) [15]. In the traditional 1D dictionary learning process, updating each atom of the dictionary needs an SVD, so the complexity of updating each atom is O(d^2 M) (d ≪ M), and the complexity of updating all the atoms of D is O(n d^2 M). In our 2D dictionary update process, however, the matrix on which the SVD is performed is of size d1 × N, where N = d1 × M. So the complexity of updating each atom of the dictionary D1 is O(d1^2 N) = O(d^{3/2} M). Both D1 and D2 have n1 atoms to update, so the complexity of one dictionary update step in our proposal is O(n^{1/2} d^{3/2} M). Furthermore, the memory cost of the dictionaries in our 2D update is 2 d1 n1 = 2 d^{1/2} n^{1/2} pixels, whereas the dictionary size in the traditional 1D sparse model is d n pixels. Obviously, both the time complexity and the memory usage of our 2D model are much lower than those of the 1D model.
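As a concrete illustration of the atom update in (9), the sketch below computes the best rank-1 approximation of an error matrix E_k via the SVD, which is the core of the K-SVD-style update described above. The matrix sizes and values are illustrative assumptions, and, as noted in the text, no sparsity constraint is enforced in this step.

import numpy as np

def rank1_atom_update(E_k):
    # Best rank-1 approximation of the error matrix E_k (Eckart-Young):
    # returns a unit-norm atom d_k and a coefficient row a_k minimizing
    # ||E_k - d_k a_k||_F, as in Eq. (9).
    U, s, Vt = np.linalg.svd(E_k, full_matrices=False)
    d_k = U[:, 0]
    a_k = s[0] * Vt[0, :]
    return d_k, a_k

# Illustrative check on a random error matrix of size d1 x M0 (values assumed).
rng = np.random.default_rng(0)
E_k = rng.standard_normal((8, 512))
d_k, a_k = rank1_atom_update(E_k)
print(np.linalg.norm(E_k - np.outer(d_k, a_k)))   # the smallest achievable rank-1 residual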

Table 2. Summary of the denoising PSNR results in [dB]. In each cell, the three values are the results of Portilla et al. [16], Elad et al. [17], and the proposed method, in that order. The methods in [16] and [17] use a dictionary of size 64 × 256; the proposed method uses two dictionaries of size 8 × 16.

σ (noisy PSNR) | 2 (42.11) | 5 (34.15) | 10 (28.13)
Lena    | 42.23 / 43.58 / 43.58 | 38.49 / 38.60 / 38.55 | 35.61 / 35.47 / 35.37
Barb    | 43.29 / 43.67 / 43.64 | 37.79 / 38.08 / 38.05 | 34.03 / 34.42 / 34.02
Boats   | 42.09 / 43.14 / 43.11 | 36.97 / 38.08 / 37.16 | 33.58 / 33.64 / 33.56
Fgprt   | 43.05 / 42.99 / 42.93 | 36.68 / 36.65 / 36.53 | 32.45 / 32.39 / 32.33
House   | 44.07 / 44.47 / 44.38 | 38.65 / 39.37 / 39.14 | 35.35 / 35.98 / 35.59
Peppers | 43.00 / 43.33 / 43.37 | 37.31 / 37.78 / 37.93 | 33.77 / 34.28 / 34.26

3. EXPERIMENTAL RESULTS

We validate our proposed 2D SSM in this section. The effectiveness of our 2D SSM as well as of the dictionary learning algorithm is evaluated in image denoising.

A noisy image V is denoted as V = U + W, where W is a zero-mean white Gaussian additive noise image and U is the original version, in which every patch belongs to the (ε, L, D1, D2)-model. The image denoising problem is formulated as:

{B_ij, U} = arg min_{B_ij, U} ‖U − V‖_F^2 + ∑_{i,j} µ_ij ‖B_ij‖_0 + ∑_{i,j} ‖D1 B_ij^T D2^T − U_ij‖_2^2. (10)

The first term in (10) is a log-likelihood global force on the proximity between the images U and V, with the constraint ‖U − V‖_F^2 ≤ const · σ^2. The second and third terms in (10) are local constraints making every patch U_ij of size d1 × d1 at location (i, j) have a representation with bounded error ‖D1 B_ij^T D2^T − U_ij‖_2^2 ≤ T, where µ_ij is a Lagrange multiplier. Equation (10) can be solved similarly to the method proposed in [17]. The dictionary D is computed as D = D1 ⊗ D2.
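To show how the local and global terms of (10) combine in practice, the following sketch performs the patch-averaging step that fuses per-patch estimates with the noisy image, in the spirit of the closed-form averaging used in [17]. The weight lam, the patch size, and the placeholder patch estimates are illustrative assumptions; in the actual scheme each estimate would be D1 B_ij^T D2^T from the 2D sparse coding step.

import numpy as np

def aggregate_patches(V, patch_estimates, p, lam):
    # Weighted average of overlapping patch estimates with the noisy image V,
    # mirroring the balance between the global and local terms of (10).
    num = lam * V.copy()
    den = lam * np.ones_like(V)
    for (i, j), X_hat in patch_estimates.items():
        num[i:i + p, j:j + p] += X_hat
        den[i:i + p, j:j + p] += 1.0
    return num / den

# Toy usage with placeholder estimates (here simply the noisy patches themselves);
# each estimate would actually be D1 @ B_ij.T @ D2.T from the sparse coding step.
rng = np.random.default_rng(0)
V = rng.standard_normal((32, 32))
p, lam = 8, 0.5
estimates = {(i, j): V[i:i + p, j:j + p].copy()
             for i in range(0, 32 - p + 1, 4)
             for j in range(0, 32 - p + 1, 4)}
U_hat = aggregate_patches(V, estimates, p, lam)
print(U_hat.shape)                                # (32, 32)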

In the following, our proposed 2D SSM is used to solve the image denoising problem defined in (10). Six grey images, 'Lena', 'Peppers', 'House', 'Barbara', 'Fingerprint', and 'Boats', which are also used in [17], are involved in our denoising test. The noisy images are generated by adding white Gaussian noise at different standard deviations σ. In all the tests, the image patch X is of size 8 × 8 pixels, and the two dictionaries D1 and D2 are of size 8 × 16. The dictionary D generated by the Kronecker product is then of size 64 × 256.

Fig. 3. Denoising results of Lena by [17] (a, 30.09 dB) and by our proposed 2D model (b, 38.55 dB), using dictionaries of size 64 × 4 and 8 × 16, respectively.

The performance of our 2D SSM in image denoising is evaluated in comparison with the 1D sparse synthesis models proposed in [16] and [17]. Table 2 shows the denoising results in terms of PSNR. Note that the sizes of the dictionaries used to generate the results are quite different. The schemes in [16] and [17] require dictionaries of size 64 × 256, whereas our scheme needs only two dictionaries of size 8 × 16. Though only 1/64 of the dictionary size is used, our 2D model achieves results comparable to the other schemes and sometimes provides the best among all three evaluated schemes.

We further study the performance of our scheme when dictionaries of similar sizes are used. Table 3 shows the denoising results of [17] using dictionaries of different sizes, all larger than the size used in our scheme. The noise standard deviation is σ = 5, and the noisy images have a PSNR of 34.15 dB. As shown in this table, the denoising result varies with the size of the dictionary: the larger the size, the higher the PSNR. Although the 64 × 16 1D dictionary is four times the size of our 2D dictionaries, its denoising results are still much worse than those of our proposal, and about 11 dB is lost on average when a dictionary of the same size (64 × 4) is used.

A visual comparison is given in Figure 3. It presents the denoising results of 'Lena' generated by the 1D sparse model and the 2D sparse model with dictionaries of similar sizes, 64 × 4 and 2 × 8 × 16 = 256, respectively. Clearly, our scheme provides a much cleaner and more vivid reconstructed image using small dictionaries.

Table 3. Results by [17] with dictionaries of different sizes, in PSNR [dB].

image \ DictSize | 64 × 4 | 64 × 16 | 64 × 64
Lena    | 30.09 | 35.81 | 38.20
Barb    | 24.40 | 30.14 | 37.74
Boats   | 27.08 | 33.04 | 37.14
Fgprt   | 24.72 | 33.50 | 34.25
House   | 29.35 | 36.15 | 39.34
Peppers | 25.73 | 32.23 | 37.29

Fig. 4. Exemplified dictionaries in our 2D model: (a) our dictionaries D1, D2, in which each column is an atom of one directional signal of the image patch; (b) the Kronecker product D, in which every square is an atom of size 8 × 8.

Exemplified dictionaries used in our scheme are illustrated in Figure 4. We show the dictionaries D, D1, and D2 of 'Peppers'. The left two images present the dictionaries D1 and D2; every column is an atom of one directional signal of the image patch. The right image is the generated dictionary D, of which every square denotes an atom of size 8 × 8. The 2D dictionaries D1 and D2 represent features of different dimensions, which can be used for feature selection and pattern recognition, while the Kronecker product D fully represents the image spatial correlation.

We also study the convergence property of our 2D SSM in Figure 5. The mean squared error (MSE), ‖U − Û‖_F^2 / n, is computed at different iterations. We observe that the error decreases as the number of iterations increases, so our proposed dictionary learning method converges.

Fig. 5. Error analysis of our dictionary learning. The y-axis is the MSE, ‖U − Û‖_F^2 / n, where n is the size of the image; the x-axis is the iteration number.

4. CONCLUSION

In this paper, we propose a novel 2D sparse synthesis model to fully make use of the 2D correlation inside 2D signals, e.g., images. We also present a new formulation and an efficient algorithm for learning 2D dictionaries. The effectiveness of our 2D sparse model is demonstrated in image denoising. Our 2D model saves 98% of the memory compared with the 1D model while achieving similar results, and it significantly outperforms the 1D model when a similar memory cost is required.

Our 2D model exploits the directional sparse features of both the horizontal and vertical directions while obtaining the complete features of an image. It also provides a lightweight solution with lower complexity and much lower memory cost compared with 1D solutions. We believe that it will benefit many image-related tasks, such as image denoising, super resolution, and classification. We would also like to extend our model to the high dimensional case, e.g., video, in the future.

5. ACKNOWLEDGEMENT

This work was supported by the 973 Program (2011CB302703) and the National Natural Science Foundation of China (No. 61033004, 61170103, U0935004).

6. REFERENCES

[1] M. Elad, P. Milanfar, and R. Rubinstein, “Analysis versus synthesis in signal priors,” Inverse Problems, vol. 23, no. 3, pp. 947, 2007.

[2] A. M. Bruckstein, D. L. Donoho, and M. Elad, “From sparse solutions of systems of equations to sparse modeling of signals and images,” SIAM Review, vol. 51, no. 1, pp. 34–81, 2009.

[3] Y. Meyer and D. H. Salinger, Wavelets and Operators, vol. 2, Cambridge Univ. Press, 1992.

[4] E. Candes, L. Demanet, D. Donoho, and L. Ying, “Fast discrete curvelet transforms,” Multiscale Modeling & Simulation, vol. 5, no. 3, pp. 861–899, 2006.

[5] M. N. Do and M. Vetterli, “The contourlet transform: an efficient directional multiresolution image representation,” IEEE Transactions on Image Processing, vol. 14, no. 12, pp. 2091–2106, 2005.

[6] E. Le Pennec and S. Mallat, “Sparse geometric image representations with bandelets,” IEEE Transactions on Image Processing, vol. 14, no. 4, pp. 423–438, 2005.

[7] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Transactions on Signal Processing, vol. 54, pp. 4311–4322, Nov. 2006.

[8] H. Lee, A. Battle, R. Raina, and A. Y. Ng, “Efficient sparse coding algorithms,” Advances in Neural Information Processing Systems, vol. 19, pp. 801, 2007.

[9] Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad, “Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition,” in Conference Record of the Twenty-Seventh Asilomar Conference on Signals, Systems and Computers. IEEE, 1993, pp. 40–44.

[10] P. Weiss, L. Blanc-Feraud, et al., “A proximal method for inverse problems in image processing,” in EUSIPCO, 2009.

[11] M. F. Duarte and R. G. Baraniuk, “Kronecker compressive sensing,” IEEE Transactions on Image Processing, vol. 21, no. 2, pp. 494–504, 2012.

[12] C. F. Caiafa and A. Cichocki, “Block sparse representations of tensors using Kronecker bases,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012, pp. 2709–2712.

[13] L. Benoit, J. Mairal, F. Bach, and J. Ponce, “Sparse image representation with epitomes,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2011, pp. 2913–2920.

[14] Y. Fang, J. J. Wu, and B. M. Huang, “2D sparse signal recovery via 2D orthogonal matching pursuit,” Science China Information Sciences, vol. 55, no. 4, pp. 889–897, 2012.

[15] M. Holmes, A. Gray, and C. Isbell, “Fast SVD for large-scale matrices,” in Workshop on Efficient Machine Learning at NIPS, 2007.

[16] J. Portilla, V. Strela, M. J. Wainwright, and E. P. Simoncelli, “Image denoising using scale mixtures of Gaussians in the wavelet domain,” IEEE Transactions on Image Processing, vol. 12, no. 11, pp. 1338–1351, 2003.

[17] M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” IEEE Transactions on Image Processing, vol. 15, no. 12, pp. 3736–3745, 2006.