
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 58, NO. 5, MAY 2012 3115

Recovery of Sparsely Corrupted Signals

Christoph Studer, Member, IEEE, Patrick Kuppinger, Graeme Pope, Student Member, IEEE, and Helmut Bölcskei, Fellow, IEEE

Abstract—We investigate the recovery of signals exhibiting a sparse representation in a general (i.e., possibly redundant or incomplete) dictionary that are corrupted by additive noise admitting a sparse representation in another general dictionary. This setup covers a wide range of applications, such as image inpainting, super-resolution, signal separation, and recovery of signals that are impaired by, e.g., clipping, impulse noise, or narrowband interference. We present deterministic recovery guarantees based on a novel uncertainty relation for pairs of general dictionaries and we provide corresponding practicable recovery algorithms. The recovery guarantees we find depend on the signal and noise sparsity levels, on the coherence parameters of the involved dictionaries, and on the amount of prior knowledge about the signal and noise support sets.

Index Terms—Coherence-based recovery guarantees, greedy algorithms, ℓ1-norm minimization, signal restoration, signal separation, uncertainty relations.

I. INTRODUCTION

WE consider the problem of identifying the sparse vector x ∈ C^{n_a} from the linear and nonadaptive measurements collected in the vector

z = Ax + Be   (1)

where A ∈ C^{m×n_a} and B ∈ C^{m×n_b} are known deterministic and general (i.e., not necessarily of the same cardinality, and possibly redundant or incomplete) dictionaries, and e ∈ C^{n_b} represents a sparse noise vector. The support set of e and the corresponding nonzero entries can be arbitrary; in particular, e may also depend on x and/or the dictionary A.

This recovery problem occurs in many applications, some ofwhich are described next.

Manuscript received February 21, 2011; revised October 09, 2011; accepted October 24, 2011. Date of publication December 14, 2011; date of current version April 17, 2012. This work was supported in part by the Swiss National Science Foundation under Grant PA00P2-134155. The material in this paper was presented in part in [1] at the 2010 IEEE International Symposium on Information Theory.

C. Studer was with the Department of Information Technology and Electrical Engineering, ETH Zürich, CH-8092 Zürich, Switzerland. He is now with the Department of Electrical and Computer Engineering, Rice University, Houston, TX 77005 USA (e-mail: [email protected]).

P. Kuppinger was with the Department of Information Technology and Electrical Engineering, ETH Zürich, CH-8092 Zürich, Switzerland. He is now with UBS Zürich, CH-8098 Zürich, Switzerland (e-mail: [email protected]).

G. Pope and H. Bölcskei are with the Department of Information Technology and Electrical Engineering, ETH Zürich, CH-8092 Zürich, Switzerland (e-mail: [email protected]; [email protected]).

Communicated by J. Romberg, Associate Editor for Signal Processing.
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TIT.2011.2179701

• Clipping: Nonlinearities in (power-)amplifiers or in analog-to-digital converters often cause signal clipping or saturation [2]. This impairment can be cast into the signal model (1) by setting B = I_m, where I_m denotes the m×m identity matrix, and rewriting (1) as z = Ax + e with e = z − Ax. Concretely, instead of the m-dimensional signal vector of interest Ax, the device in question delivers z = clip_λ(Ax), where the function clip_λ(·) realizes entry-wise signal clipping to the interval [−λ, λ]. The vector e will be sparse, provided the clipping level λ is high enough. Furthermore, in this case, the support set E of e can be identified prior to recovery, by simply comparing the absolute values of the entries of z to the clipping threshold λ. Finally, we note that, here, it is essential that the noise vector e be allowed to depend on the vector x and/or the dictionary A.

• Impulse noise: In numerous applications, one has to deal with the recovery of signals corrupted by impulse noise [3]. Specific applications include, e.g., reading out from unreliable memory [4] or recovery of audio signals impaired by click/pop noise, which typically occurs during playback of old phonograph records. The model in (1) is easily seen to incorporate such impairments. Just set B = I_m and let e be the impulse-noise vector. We would like to emphasize the generality of (1), which allows impulse noise that is sparse in general dictionaries B.

• Narrowband interference: In many applications, one is interested in recovering audio, video, or communication signals that are corrupted by narrowband interference. Electric hum, as it may occur in improperly designed audio or video equipment, is a typical example of such an impairment. Electric hum typically exhibits a sparse representation in the Fourier basis, as it (mainly) consists of a tone at some base frequency and a series of corresponding harmonics, which is captured by choosing B in (1) as the m×m discrete Fourier transform (DFT) matrix defined below in (2).

• Super-resolution and inpainting: Our framework also encompasses super-resolution [5], [6] and inpainting [7] for images, audio, and video signals. In both applications, only a subset of the entries of the (full-resolution) signal vector Ax is available and the task is to fill in the missing entries of the signal vector. The missing entries are accounted for by choosing the vector e such that the entries of z = Ax + e corresponding to the missing entries in Ax are set to some (arbitrary) value, e.g., 0. The missing entries of Ax are then filled in by first recovering x from z and then computing Ax. Note that in both applications, the support set E is known (i.e., the locations of the missing entries can easily be identified) and the dictionary A is typically redundant (see, e.g., [8] for a corresponding discussion), i.e., A has more dictionary elements (columns) than rows, which demonstrates the need for recovery results that apply to general (i.e., possibly redundant) dictionaries.

0018-9448/$26.00 © 2011 IEEE

• Signal separation: Separation of (audio or video) signals into two distinct components also fits into our framework. A prominent example of this task is the separation of texture from cartoon parts in images (see [9], [10], and references therein). In the language of our setup, the dictionaries A and B are chosen such that they allow for sparse representation of the two distinct features; x and e are the corresponding coefficients describing these features (sparsely). Note that here the vector e no longer plays the role of (undesired) noise. Signal separation then amounts to simultaneously extracting the sparse vectors x and e from the observation (e.g., the image) z.

Naturally, it is of significant practical interest to identify fundamental limits on the recovery of x (and e, if appropriate) from z in (1). For the noiseless case e = 0, such recovery guarantees are known [11]–[13] and typically set limits on the maximum allowed number of nonzero entries of x or, more colloquially, on the "sparsity" level of x. These recovery guarantees are usually expressed in terms of restricted isometry constants (RICs) [14], [15] or in terms of the coherence parameter [11]–[13], [16] of the dictionary A. In contrast to coherence parameters, RICs can, in general, not be computed efficiently. In this paper, we focus exclusively on coherence-based recovery guarantees. For the case of unstructured noise, i.e., z = Ax + n with no constraints imposed on n apart from ‖n‖₂ ≤ ε, coherence-based recovery guarantees were derived in [16]–[20]. The corresponding results, however, do not guarantee perfect recovery of x, but only ensure that either the recovery error is bounded above by a function of ε or only guarantee perfect recovery of the support set of x. Such results are to be expected, as a consequence of the generality of the setup in terms of the assumptions on the noise vector n.

A. Contributions

In this paper, we consider the following questions. 1) Under which conditions can the vector x (and the vector e, if appropriate) be recovered perfectly from the (sparsely corrupted) observation z = Ax + Be, and 2) can we formulate practical recovery algorithms with corresponding (analytical) performance guarantees? Sparsity of the signal vector x and the error vector e will turn out to be key in answering these questions. More specifically, based on an uncertainty relation for pairs of general dictionaries, we establish recovery guarantees that depend on the number of nonzero entries in x and e, and on the coherence parameters of the dictionaries A and B. These recovery guarantees are obtained for the following different cases: I) the support sets of both x and e are known (prior to recovery); II) the support set of only x or only e is known; III) the number of nonzero entries of only x or only e is known; IV) nothing is known about x and e. We formulate efficient recovery algorithms and derive corresponding performance guarantees. Finally, we compare our analytical recovery thresholds to numerical results and we demonstrate the application of our algorithms and recovery guarantees to an image inpainting example.

B. Outline of the Paper

The remainder of this paper is organized as follows. In Section II, we briefly review relevant previous results. In Section III, we derive a novel uncertainty relation that lays the foundation for the recovery guarantees reported in Section IV. A discussion of our results is provided in Section V and numerical results are presented in Section VI. We conclude in Section VII.

C. Notation

Lowercase boldface letters stand for column vectors and uppercase boldface letters designate matrices. For the matrix M, we denote its transpose and conjugate transpose by M^T and M^H, respectively, its (Moore–Penrose) pseudoinverse by M^†, its kth column by m_k, and the entry in the kth row and ℓth column by [M]_{k,ℓ}. The kth entry of the vector m is [m]_k. The space spanned by the columns of M is denoted by R(M). The n×n identity matrix is denoted by I_n, the m×n all-zeros matrix by 0_{m×n}, and the all-zeros vector of dimension n by 0_n. The n×n DFT matrix F is defined as

[F]_{k,ℓ} = (1/√n) ω^{(k−1)(ℓ−1)}   (2)

where ω = e^{−2πj/n}. The Euclidean (or ℓ2) norm of the vector x is denoted by ‖x‖₂, ‖x‖₁ stands for the ℓ1-norm of x, and ‖x‖₀ designates the number of nonzero entries in x. Throughout this paper, we assume that the columns of the dictionaries A and B have unit ℓ2-norm. The minimum and maximum eigenvalues of the positive-semidefinite matrix M are denoted by λ_min(M) and λ_max(M), respectively. The spectral norm of the matrix M is ‖M‖₂ = √(λ_max(M^H M)). Sets are designated by uppercase calligraphic letters; the cardinality of the set S is |S|. The complement of a set S (in some superset T) is denoted by S^c. For two sets A and B, the sumset A + B consists of all elements of the form a + b, where a ∈ A and b ∈ B. The support set of the vector x is designated by supp(x). The matrix M_S is obtained from M by retaining the columns of M with indices in S; the vector x_S is obtained analogously. We define the diagonal (projection) matrix P_S for the set S as follows:

[P_S]_{k,k} = 1 if k ∈ S, and [P_S]_{k,ℓ} = 0 otherwise.

For S = ∅, we set P_S = 0_{n×n}.
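The DFT matrix in (2) and the projection matrix P_S are straightforward to build numerically; a minimal sketch (the function names are ours, and we assume the unitary normalization stated above):

```python
import numpy as np

def dft_matrix(n):
    """Unitary n x n DFT matrix as in (2):
    [F]_{k,l} = omega^{(k-1)(l-1)} / sqrt(n), omega = exp(-2*pi*j/n)."""
    k = np.arange(n)
    omega = np.exp(-2j * np.pi / n)
    return omega ** np.outer(k, k) / np.sqrt(n)

def projection_matrix(S, n):
    """Diagonal matrix P_S with [P_S]_{k,k} = 1 for k in S, 0 otherwise."""
    P = np.zeros((n, n))
    idx = list(S)
    P[idx, idx] = 1.0
    return P

F = dft_matrix(8)
assert np.allclose(F.conj().T @ F, np.eye(8))   # unitary: unit l2-norm columns
P = projection_matrix({0, 3}, 4)
assert np.allclose(P @ np.array([5.0, 6.0, 7.0, 8.0]), [5.0, 0.0, 0.0, 8.0])
```

The unitarity check confirms that the columns of F have unit ℓ2-norm, as required of all dictionaries in this paper.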

II. REVIEW OF RELEVANT PREVIOUS RESULTS

Recovery of the vector x from the sparsely corrupted measurement z = Ax + Be corresponds to a sparse-signal recovery problem subject to structured (i.e., sparse) noise. In this section, we briefly review relevant existing results for sparse-signal recovery from noiseless measurements, and we summarize the results available for recovery in the presence of unstructured and structured noise.

A. Recovery in the Noiseless Case

Recovery of x from z = Ax where A is redundant (i.e., n_a > m) amounts to solving an underdetermined linear system of equations. Hence, there are infinitely many solutions, in general. However, under the assumption of x being sparse, the situation changes drastically. More specifically, one can recover x from the observation z by solving

(P0): minimize ‖x̂‖₀ subject to z = A x̂.

This approach results, however, in prohibitive computational complexity, even for small problem sizes. Two of the most popular and computationally tractable alternatives to solving (P0) by an exhaustive search are basis pursuit (BP) [11]–[13], [21]–[23] and orthogonal matching pursuit (OMP) [13], [24], [25]. BP is essentially a convex relaxation of (P0) and amounts to solving

(BP): minimize ‖x̂‖₁ subject to z = A x̂.

OMP is a greedy algorithm that recovers the vector x by iteratively selecting the column of A that is most "correlated" with the difference between z and its current best (in ℓ2-norm sense) approximation.

The questions that arise naturally are: Under which conditions does (P0) have a unique solution, and when do BP and/or OMP deliver this solution? To formulate the answer to these questions, define n_x = ‖x‖₀ and the coherence of the dictionary A as

μ_a = max_{k,ℓ: k≠ℓ} |a_k^H a_ℓ|.   (3)

As shown in [11]–[13], a sufficient condition for x to be the unique solution of (P0) applied to z = Ax and for BP and OMP to deliver this solution is

n_x < (1/2)(1 + 1/μ_a).   (4)
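The coherence (3) and the threshold (4) are easy to evaluate numerically. A minimal sketch (the function names and the two-ONB example are our own choices):

```python
import numpy as np

def coherence(D):
    """Coherence (3): max |<d_k, d_l>| over distinct columns (unit-norm assumed)."""
    G = np.abs(D.conj().T @ D)
    np.fill_diagonal(G, 0.0)
    return G.max()

def max_recoverable_sparsity(mu):
    """Largest integer n_x satisfying (4): n_x < (1 + 1/mu) / 2."""
    bound = (1.0 + 1.0 / mu) / 2.0
    return int(np.ceil(bound)) - 1

# Example: the concatenation of the identity and the unitary DFT matrix
# has coherence 1/sqrt(n).
n = 16
k = np.arange(n)
F = np.exp(-2j * np.pi * np.outer(k, k) / n) / np.sqrt(n)
D = np.hstack([np.eye(n), F])
mu = coherence(D)
assert np.isclose(mu, 1 / np.sqrt(n))
assert max_recoverable_sparsity(mu) == 2   # n_x < (1 + 4)/2 = 2.5
```

Computing the coherence requires only a Gram matrix, in contrast to RICs, whose computation is in general intractable.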

B. Recovery in the Presence of Unstructured Noise

Coherence-based recovery guarantees in the presence of unstructured (and deterministic) noise, i.e., for z = Ax + n, with no constraints imposed on n apart from ‖n‖₂ ≤ ε, were derived in [16]–[20] and the references therein. Specifically, it was shown in [16] that a suitably modified version of BP, referred to as BP denoising, recovers an estimate x̂ satisfying ‖x − x̂‖₂ ≤ C ε, provided that (4) is met. Here, the constant C depends on the coherence μ_a and on the sparsity level of x. Note that the support set of the estimate x̂ may differ from that of x. Another result, reported in [17], states that OMP delivers the correct support set (but does not perfectly recover the nonzero entries of x) provided that

(5)

where |x_min| denotes the absolute value of the component of x with smallest nonzero magnitude. The recovery condition (5) yields sensible results only if ε/|x_min| is small. Results similar to those reported in [17] were obtained in [18], [19]. Recovery guarantees in the case of stochastic noise can be found in [19], [20]. We finally point out that perfect recovery of x is, in general, impossible in the presence of unstructured noise. In contrast, as we shall see in the following, perfect recovery is possible under structured noise according to (1).

C. Recovery Guarantees in the Presence of Structured Noise

As outlined in Section I, many practically relevant signal recovery problems can be formulated as (sparse) signal recovery from sparsely corrupted measurements, a problem that seems to have received comparatively little attention in the literature so far and does not appear to have been developed systematically.

A straightforward way of arriving at recovery guarantees in the presence of structured noise, as in (1), follows from rewriting (1) as

z = Dw   (6)

with the concatenated dictionary D = [A B] and the stacked vector w = [x^T e^T]^T. This formulation allows us to invoke the recovery guarantee in (4) for the concatenated dictionary D, which delivers a sufficient condition for w (and hence, x and e) to be the unique solution of (P0) applied to z = Dw and for BP and OMP to deliver this solution [11], [12]. However, the so-obtained recovery condition

n_x + n_e < (1/2)(1 + 1/μ_d)   (7)

with the dictionary coherence defined as

μ_d = max_{k,ℓ: k≠ℓ} |d_k^H d_ℓ|   (8)

ignores the structure of the recovery problem at hand, i.e., it is agnostic to i) the fact that D consists of the dictionaries A and B with known coherence parameters μ_a and μ_b, respectively, and ii) knowledge about the support sets of x and/or e that may be available prior to recovery. As shown in Section IV, exploiting these two structural aspects of the recovery problem yields superior (i.e., less restrictive) recovery thresholds. Note that condition (7) guarantees perfect recovery of x (and e) independently of the ℓ2-norm of the noise vector, i.e., ‖Be‖₂ may be arbitrarily large. This is in stark contrast to the recovery guarantees for noisy measurements in [16] and (5) (originally reported in [17]).

Special cases of the general setup (1), explicitly taking into account certain structural aspects of the recovery problem, were considered in [3], [14], [26]–[30]. Specifically, in [26], it was shown that for A = F, B = I_n, and knowledge of the support set of e, perfect recovery of the n-dimensional vector x is possible if

(9)

where n_x = ‖x‖₀ and n_e = ‖e‖₀. In [27], [28], recovery guarantees based on the RIC of the matrix A for the case where A is an orthonormal basis (ONB), and where the support set of e is either known or unknown, were reported; these recovery guarantees are particularly handy when A is, for example, i.i.d. Gaussian [31], [32]. However, results for the case of A and B both being general (and deterministic) dictionaries, taking into account prior knowledge about the support sets of x and e, seem to be missing in the literature. Recovery guarantees for e i.i.d. nonzero-mean Gaussian, B = I_n, and the support sets of x and e unknown were reported in [29]. In [30], recovery guarantees under a probabilistic model on both x and e and for unitary A and B were reported, showing that x can be recovered perfectly with high probability (and independently of the ℓ2-norms of x and e). The problem of sparse-signal recovery in the presence of impulse noise (i.e., B = I_n) was considered in [3], where a particular nonlinear measurement process combined with a nonconvex program for signal recovery was proposed. In [14], signal recovery in the presence of impulse noise based on ℓ1-norm minimization was investigated. The setup in [14], however, differs considerably from the one considered in this paper, as in [14] A needs to be tall (i.e., m > n_a) and the vector to be recovered is not necessarily sparse.

We conclude this literature overview by noting that this paper is inspired by [26]. Specifically, we note that the recovery guarantee (9) reported in [26] is obtained from an uncertainty relation that puts limits on how sparse a given signal can simultaneously be in the Fourier basis and in the identity basis. Inspired by this observation, we start our discussion by presenting an uncertainty relation for pairs of general dictionaries, which forms the basis for the recovery guarantees reported later in this paper.

III. GENERAL UNCERTAINTY RELATION FOR ε-CONCENTRATED VECTORS

We next present a novel uncertainty relation, which extends the uncertainty relation in [33, Lemma 1] for pairs of general dictionaries to vectors that are ε-concentrated rather than perfectly sparse. As shown in Section IV, this extension constitutes the basis for the derivation of recovery guarantees for BP.

A. The Uncertainty Relation

Define the mutual coherence between the dictionaries A and B as

μ_m = max_{k,ℓ} |a_k^H b_ℓ|.

Furthermore, we will need the following definition, which appeared previously in [26].

Definition 1: A vector x is said to be ε-concentrated to the set X if ‖x − P_X x‖₂ ≤ ε‖x‖₂, where P_X is the projection matrix defined in Section I-C. We say that the vector x is perfectly concentrated to the set X and, hence, |X|-sparse if ε = 0, i.e., if x = P_X x.
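The concentration level of a vector is a one-line computation; a small sketch, assuming the relative ℓ2 form of Definition 1 given above (the function name and the example vectors are ours):

```python
import numpy as np

def concentration(x, X):
    """Smallest eps such that x is eps-concentrated to X under the
    (assumed) relative l2 criterion ||x - P_X x||_2 <= eps * ||x||_2."""
    mask = np.zeros(len(x), dtype=bool)
    mask[list(X)] = True
    return np.linalg.norm(x[~mask]) / np.linalg.norm(x)

x = np.array([3.0, 4.0, 0.1, 0.0])
assert concentration(x, {0, 1, 2, 3}) == 0.0      # trivially perfect
assert concentration(x, {0, 1}) < 0.02            # nearly concentrated on {0, 1}
assert concentration(np.array([1.0, 0.0, 2.0]), {0, 2}) == 0.0  # perfectly sparse
```

A perfectly concentrated vector (ε = 0) is exactly a vector whose support is contained in X.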

We can now state the following uncertainty relation for pairs of general dictionaries and for ε-concentrated vectors.

Theorem 1: Let A be a dictionary with coherence μ_a, B a dictionary with coherence μ_b, and denote the mutual coherence between A and B by μ_m. Let s be a vector that can be represented as a linear combination of columns of A and, similarly, as a linear combination of columns of B. Concretely, there exists a pair of vectors x and e such that s = Ax = Be (we exclude the trivial case where x = 0 and e = 0).¹ If x is ε_x-concentrated to X and e is ε_e-concentrated to E, then (10), shown at the bottom of the page, holds.

Proof: The proof follows closely that of [33, Lemma 1], which applies to perfectly concentrated vectors x and e. We therefore only summarize the modifications to the proof of [33, Lemma 1]. Instead of using the perfect concentration of x to arrive at [33, Eq. 29], we invoke the ε_x-concentration of x to arrive at the following inequality, valid for ε_x-concentrated vectors x:

(11)

Similarly, the ε_e-concentration of e is used to replace [33, Eq. 30] by

(12)

The uncertainty relation (10) is then obtained by multiplying (11) and (12) and suitably normalizing the resulting inequality.

In the case where both x and e are perfectly concentrated, i.e., ε_x = ε_e = 0, Theorem 1 reduces to the uncertainty relation reported in [33, Lemma 1], which we restate next for the sake of completeness.

Corollary 2 [33, Lemma 1]: If Ax = Be with x ≠ 0 and e ≠ 0, the following holds:

n_x n_e ≥ [1 − μ_a(n_x − 1)]₊ [1 − μ_b(n_e − 1)]₊ / μ_m²   (13)

where [q]₊ = max{q, 0}.

As detailed in [33], [34], the uncertainty relation in Corollary 2 generalizes the uncertainty relation for two ONBs found in [23]. Furthermore, it extends the uncertainty relations provided in [35] for pairs of square dictionaries (having the same number of rows and columns) to pairs of general dictionaries A and B.

B. Tightness of the Uncertainty Relation

In certain special cases, it is possible to find signals that satisfy the uncertainty relation (10) with equality. As in [26], consider A = F and B = I_n, so that μ_a = μ_b = 0 and μ_m = 1/√n, and define the comb signal s containing √n equidistant spikes of unit height as

[s]_k = 1 for k ∈ {1, √n + 1, 2√n + 1, …, n − √n + 1}, and [s]_k = 0 otherwise

where we shall assume that √n divides n. It can be shown that the vectors x and e, both having √n nonzero entries, satisfy Ax = Be = s. If X = supp(x) and E = supp(e), the vectors x and e are perfectly concentrated to X and E, respectively, i.e., ε_x = ε_e = 0. Since n_x = n_e = √n and μ_m = 1/√n, it follows that n_x n_e = 1/μ_m² and, hence, s satisfies (10) with equality.

¹The uncertainty relation continues to hold if either x = 0 or e = 0, but does not apply to the trivial case x = 0 and e = 0. In all three cases, we have s = 0.

(10)
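The comb-signal construction above can be verified numerically (toy size n = 16; the variable names are our own):

```python
import numpy as np

# Comb signal with sqrt(n) unit spikes: it is invariant under the unitary
# DFT, so with A = F and B = I_n it yields Ax = Be with
# n_x = n_e = sqrt(n), i.e., n_x * n_e = n = 1/mu_m^2 (equality in the
# uncertainty relation).
n = 16
r = int(np.sqrt(n))                    # sqrt(n) = 4
s = np.zeros(n)
s[::r] = 1.0                           # spikes at positions 1, 5, 9, 13 (1-based)

k = np.arange(n)
F = np.exp(-2j * np.pi * np.outer(k, k) / n) / np.sqrt(n)   # unitary DFT

assert np.allclose(F @ s, s)           # the comb is DFT-invariant
assert np.count_nonzero(s) == r        # sqrt(n) nonzero entries
assert np.count_nonzero(s) ** 2 == n   # n_x * n_e = n, matching 1/mu_m^2
```

DFT-invariance is what makes the comb simultaneously maximally sparse in the Fourier basis and in the identity basis.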

We will next show that for pairs of general dictionaries A and B, finding signals that satisfy the uncertainty relation (10) with equality is NP-hard. For the sake of simplicity, we restrict ourselves to the case of perfectly concentrated vectors. Consider the problem of minimizing the product ‖x‖₀‖e‖₀ subject to Ax = Be; since we are interested in the minimum for nonzero vectors x and e, constraints are imposed to exclude the case where x = 0 and/or e = 0. It then follows that, for a particular choice of the dictionaries, the problem reduces to one that is equivalent to (P0), which is NP-hard [36], in general. We can therefore conclude that finding a pair x and e satisfying the uncertainty relation (10) with equality is NP-hard.

IV. RECOVERY OF SPARSELY CORRUPTED SIGNALS

Based on the uncertainty relation in Theorem 1, we next derive conditions that guarantee perfect recovery of x (and of e, if appropriate) from the (sparsely corrupted) measurement z = Ax + Be. These conditions will be seen to depend on the number of nonzero entries of x and e, and on the coherence parameters μ_a, μ_b, and μ_m. Moreover, in contrast to (5), the recovery conditions we find will not depend on the ℓ2-norm of the noise vector Be, which is, hence, allowed to be arbitrarily large. We consider the following cases: I) the support sets of both x and e are known (prior to recovery); II) the support set of only x or only e is known; III) the number of nonzero entries of only x or only e is known; and IV) nothing is known about x and e. The uncertainty relation in Theorem 1 is the basis for the recovery guarantees in all four cases considered. To simplify notation, motivated by the form of the right-hand side (RHS) of (13), we define the function

f(n, μ) = [1 − μ(n − 1)]₊.

In the remainder of the paper, n_x denotes ‖x‖₀ and n_e stands for ‖e‖₀. We furthermore assume that the dictionaries A and B are known perfectly to the recovery algorithms. Moreover, we assume that² μ_m > 0.

A. Case I: Knowledge of X and E

We start with the case where both X and E are known prior to recovery. The values of the nonzero entries of x and e are unknown. This scenario is relevant, for example, in applications requiring recovery of clipped band-limited signals with known spectral support X. Here, we would have B = I_m, and E can be determined as follows: compare the measurements [z]_k, k = 1, …, m, to the clipping threshold λ; if |[z]_k| = λ, add the corresponding index k to E.

Recovery of x from z is then performed as follows. We first rewrite the input–output relation in (1) as

z = D_S w_S

with the concatenated dictionary D_S = [A_X B_E] and the stacked vector w_S = [x_X^T e_E^T]^T. Since X and E are known, we can recover the stacked vector w_S perfectly and, hence, the nonzero entries of both x and e, if the pseudoinverse (D_S)^† exists. In this case, we can obtain x_X and e_E as

[x_X^T e_E^T]^T = (D_S)^† z.   (14)
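With both supports known, the recovery step (14) is a single pseudoinverse application. A minimal numerical sketch (dictionary sizes, supports, and values are illustrative choices of ours, using random unit-norm dictionaries):

```python
import numpy as np

rng = np.random.default_rng(0)
m, na, nb = 64, 80, 128
A = rng.standard_normal((m, na)); A /= np.linalg.norm(A, axis=0)
B = rng.standard_normal((m, nb)); B /= np.linalg.norm(B, axis=0)

X, E = [3, 17, 40], [5, 90]            # known support sets
x = np.zeros(na); x[X] = [1.0, -2.0, 0.5]
e = np.zeros(nb); e[E] = [10.0, -7.0]  # the corruption may be arbitrarily large
z = A @ x + B @ e

# Eq. (14): invert the concatenated active columns D_S = [A_X  B_E].
D_S = np.hstack([A[:, X], B[:, E]])    # full column rank for this draw
w = np.linalg.pinv(D_S) @ z
x_hat = np.zeros(na); x_hat[X] = w[:len(X)]
e_hat = np.zeros(nb); e_hat[E] = w[len(X):]

assert np.allclose(x_hat, x) and np.allclose(e_hat, e)
```

Since z lies exactly in R(D_S), the pseudoinverse returns the nonzero entries of both x and e whenever D_S has full column rank.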

The following theorem states a sufficient condition for D_S to have full (column) rank, which implies existence of the pseudoinverse (D_S)^†. This condition depends on the coherence parameters μ_a, μ_b, and μ_m of the involved dictionaries A and B and on X and E through the cardinalities n_x and n_e, i.e., the number of nonzero entries in x and e, respectively.

Theorem 3: Let z = Ax + Be with X = supp(x) and E = supp(e). Define n_x = |X| and n_e = |E|. If

n_x n_e < [1 − μ_a(n_x − 1)]₊ [1 − μ_b(n_e − 1)]₊ / μ_m²   (15)

then the concatenated dictionary D_S = [A_X B_E] has full (column) rank.

Proof: See Appendix A.

For the special case A = F and B = I_n (so that μ_a = μ_b = 0 and μ_m = 1/√n), the recovery condition (15) reduces to n_x n_e < n, a result obtained previously in [26]. Tightness of (15) can be established by noting that there exist two distinct pairs {x, e} and {x′, e′}, each with support sizes satisfying (15) with equality, that lead to the same measurement outcome z [34].

It is interesting to observe that Theorem 3 yields a sufficient condition on n_x and n_e for any (m − n_e) × n_x submatrix of A to have full (column) rank. To see this, consider the special case B = I_m and, hence, D_S = [A_X I_E]. Condition (15) then characterizes pairs (n_x, n_e) for which all matrices D_S with |X| = n_x and |E| = n_e are guaranteed to have full (column) rank. Hence, the submatrix consisting of all rows of A_X with row index in E^c must have full (column) rank as well. Since the result holds for all support sets X and E with |X| = n_x and |E| = n_e, all possible (m − n_e) × n_x submatrices of A must have full (column) rank.

²If μ_m = 0, the space spanned by the columns of A is orthogonal to the space spanned by the columns of B. This makes the separation of the components Ax and Be given z straightforward. Once this separation is accomplished, x can be recovered from Ax using (P0), BP, or OMP, if (4) is satisfied.

B. Case II: Only X or Only E Is Known

Next, we find recovery guarantees for the case where either only X or only E is known prior to recovery.

1) Recovery When E Is Known and X Is Unknown: A prominent application for this setup is the recovery of clipped band-limited signals [27], [37], where the signal's spectral support, i.e., X, is unknown. The support set E can be identified as detailed previously in Section IV-A. Further application examples for this setup include inpainting and super-resolution [5]–[7] of signals that admit a sparse representation in A (but with unknown support set X). The locations of the missing elements in z are known (and correspond, e.g., to missing paint elements in frescos), i.e., the set E can be determined prior to recovery. Inpainting and super-resolution then amount to reconstructing the vector x from the sparsely corrupted measurement z and computing Ax.

The setting of known E and unknown X was considered previously in [26] for the special case A = F and B = I_n. The recovery condition (18) in Theorem 4 below extends the result in [26, Theorems 5 and 9] to pairs of general dictionaries A and B.

Theorem 4: Let z = A x + B e, where E is known. Consider the problem

(16)

and the convex program

(17)

If n_x and n_e satisfy

(18)

then the unique solution of (16) applied to z is given by x, and the convex program (17) will deliver this solution.

Proof: See Appendix B.

Solving (16) requires a combinatorial search, which results in prohibitive computational complexity even for moderate problem sizes. The convex relaxation (17) can, however, be solved more efficiently. Note that the constraint in (16) and (17) reflects the fact that any error component supported on E yields consistency, on account of E being known (by assumption). For n_e = 0 (i.e., the noiseless case), the recovery threshold (18) reduces to the well-known recovery threshold (4) guaranteeing recovery of the sparse vector x through (P0) and BP applied to z = A x. We finally note that RIC-based guarantees for recovering x from z (i.e., recovery in the absence of (sparse) corruptions) that take into account partial knowledge of the signal support set were developed in [38], [39].

Tightness of (18) can be established by an explicit construction. Specifically, one can exhibit two distinct pairs (x, e) and (x', e') that both satisfy (18) with equality, verify that both are in the admissible set specified by the constraints in (16) and (17), and check that both lead to the same measurement outcome z. Hence, (16) and (17) cannot distinguish between (x, e) and (x', e') based on z. For a detailed discussion of this example, we refer to [34].

Rather than solving (16) or (17), we may attempt to recover the vector x by exploiting more directly the fact that B_E is known (since B and E are assumed to be known) and projecting the measurement outcome z onto the orthogonal complement of the range of B_E. This approach eliminates the (sparse) noise component B e and leaves us with a standard sparse-signal recovery problem for the vector x. We next show that this ansatz is guaranteed to recover the sparse vector x provided that condition (18) is satisfied. Let us detail the procedure. If the columns of B_E are linearly independent, the pseudoinverse B_E^† exists, and the projector onto the orthogonal complement of the range of B_E is given by

P = I − B_E B_E^†.    (19)

Applying P to the measurement outcome z yields

P z = P A x    (20)

where we used the fact that P B_E = 0. We are now left with the standard problem of recovering x from the modified measurement outcome P z. What comes to mind first is that computing the standard recovery threshold (4) for the modified dictionary P A should provide us with a recovery threshold for the problem of extracting x from P z. It turns out, however, that the columns of P A will, in general, not have unit ℓ2-norm, an assumption underlying (4). What comes to our rescue is that under condition (18), we have (as shown in Theorem 5 below) ||P a_i||_2 > 0 for i = 1, …, n_a. We can, therefore, normalize the modified dictionary by rewriting (20) as

P z = (P A D^{−1})(D x)    (21)

where D is the diagonal matrix with diagonal elements [D]_{i,i} = ||P a_i||_2, i = 1, …, n_a, and x̃ = D x. Now, P A D^{−1} plays the role of the dictionary (with normalized columns) and x̃ is the unknown sparse vector that we wish to recover. Obviously, x̃ has the same support set as x, and x can be recovered from x̃ according to x = D^{−1} x̃.3 The following theorem shows that (18) is sufficient to guarantee the following: i) the columns of B_E are linearly independent, which guarantees the existence of B_E^†; ii) ||P a_i||_2 > 0 for i = 1, …, n_a; and iii) no vector u ≠ 0 with ||u||_0 ≤ 2 n_x lies in the kernel of P A. Hence, (18) enables perfect recovery of x from (21).

3 If [D]_{i,i} ≠ 0 for i = 1, …, n_a, then the diagonal matrix D corresponds to a one-to-one mapping.
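The projection-and-normalization procedure in (19)–(21) can be sketched numerically. The following is a minimal illustration with hypothetical problem sizes and random Gaussian dictionaries (neither is prescribed by the paper), and with a textbook OMP routine standing in for the recovery step:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (illustrative only): m measurements, dictionaries
# A (m x na) and B (m x nb), signal sparsity nx, noise sparsity ne.
m, na, nb, nx, ne = 128, 128, 128, 3, 3

A = rng.standard_normal((m, na)); A /= np.linalg.norm(A, axis=0)
B = rng.standard_normal((m, nb)); B /= np.linalg.norm(B, axis=0)

x = np.zeros(na); X = rng.choice(na, nx, replace=False)
x[X] = rng.choice([-3.0, -2.0, 2.0, 3.0], nx)
e = np.zeros(nb); E = rng.choice(nb, ne, replace=False)
e[E] = rng.choice([-3.0, -2.0, 2.0, 3.0], ne)
z = A @ x + B @ e                        # sparsely corrupted measurement

# (19): projector onto the orthogonal complement of range(B_E).
BE = B[:, E]
P = np.eye(m) - BE @ np.linalg.pinv(BE)  # satisfies P @ BE == 0
# (21): normalize the columns of the modified dictionary P A.
d = np.linalg.norm(P @ A, axis=0)
G = (P @ A) / d                          # unit-norm columns

def omp(D, y, k):
    """Textbook OMP: greedily select k atoms, least squares on the support."""
    r, S, coef = y.copy(), [], np.zeros(0)
    for _ in range(k):
        S.append(int(np.argmax(np.abs(D.T @ r))))
        coef, *_ = np.linalg.lstsq(D[:, S], y, rcond=None)
        r = y - D[:, S] @ coef
    out = np.zeros(D.shape[1]); out[S] = coef
    return out

x_tilde = omp(G, P @ z, nx)   # recover x~ = D x from (21)
x_hat = x_tilde / d           # undo the normalization: x = D^{-1} x~
print(np.linalg.norm(x_hat - x))
```

Since E is known, the sparse corruption is annihilated exactly by P; recovery then reduces to a standard sparse problem in the modified dictionary, in the spirit of Theorem 5.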


Theorem 5: If (18) is satisfied, the unique solution of (P0) applied to P z and the dictionary P A D^{−1} is given by x̃ = D x. Furthermore, BP and OMP applied to P z and P A D^{−1} are guaranteed to recover the unique (P0)-solution.

Proof: See Appendix C.

Since condition (18) ensures that ||P a_i||_2 > 0, i = 1, …, n_a, the vector x can be obtained from x̃ according to x = D^{−1} x̃. Furthermore, (18) guarantees the existence of B_E^†, and hence the nonzero entries of e can be obtained from z − A x according to e_E = B_E^† (z − A x).

Theorem 5 generalizes the results in [26, Theorems 5 and 9], obtained for the special case where B is the identity matrix, to pairs of general dictionaries and additionally shows that OMP delivers the correct solution provided that (18) is satisfied.

It follows from (21) that other sparse-signal recovery algorithms, such as iterative thresholding-based algorithms [40], CoSaMP [41], or subspace pursuit [42], can be applied to recover x̃.4 Finally, we note that the idea of projecting the measurement outcome onto the orthogonal complement of the space spanned by the active columns of B and investigating the effect on the RICs, instead of on the coherence parameter (as was done in Appendix C-C), was put forward in [27] and [43], along with RIC-based recovery guarantees that apply to random matrices and guarantee the recovery of x with high probability (with respect to x and irrespective of the locations of the sparse corruptions).

2) Recovery When X Is Known and E Is Unknown: A possible application scenario for this situation is the recovery of spectrally sparse signals with known spectral support that are impaired by impulse noise with unknown impulse locations.

It is evident that this setup is formally equivalent to that discussed in Section IV-B1, with the roles of A and B (and of x and e) interchanged. In particular, we may apply the projection matrix onto the orthogonal complement of the range of A_X to the corrupted measurement outcome z to obtain a standard recovery problem for e, in which a diagonal matrix containing the norms of the projected columns of B again takes care of the normalization. The corresponding unknown vector is the normalized version of e. The following corollary is a direct consequence of Theorem 5.

Corollary 6: Let z = A x + B e, where X is known. If the numbers of nonzero entries in x and e, i.e., n_x and n_e, satisfy

(22)

then the unique solution of (P0) applied to the projected measurement outcome is given by the normalized version of e. Furthermore, BP and OMP applied to the projected measurement outcome and the correspondingly normalized dictionary recover the unique (P0)-solution. Once we have e, the vector x can be obtained easily, since the support set X is known and the nonzero entries of x are given by x_X = A_X^† (z − B e).

4 Finding analytical recovery guarantees for these algorithms remains an interesting open problem.

Since (22) ensures that the columns of A_X are linearly independent, the pseudoinverse A_X^† is guaranteed to exist. Note that tightness of the recovery condition (22) can be established analogously to the case of known E and unknown X (discussed in Section IV-B1).

C. Case III: Cardinality of X or of E Known

We next consider the case where neither X nor E is known, but knowledge of either n_x or n_e is available (prior to recovery). An application scenario for unknown X and E but known n_e would be the recovery of a sparse pulse stream with unknown pulse locations from measurements that are corrupted by electric hum with unknown base frequency but known number of harmonics (e.g., determined by the base frequency of the hum and the acquisition bandwidth of the system under consideration). We state our main result for the case of known n_e and unknown n_x. The case where n_x is known and n_e is unknown can be treated similarly.

Theorem 7: Let z = A x + B e, define n_x = ||x||_0 and n_e = ||e||_0, and assume that n_e is known. Consider the problem

(23)

where P denotes the collection of subsets of {1, …, n_b} of cardinality less than or equal to n_e. The unique solution of (23) applied to z is given by (x, e) if

(24)

Proof: See Appendix D.

We emphasize that the problem (23) exhibits prohibitive (concretely, combinatorial) computational complexity, in general. Unfortunately, replacing the ℓ0-norm in the minimization in (23) by the ℓ1-norm does not lead to a computationally tractable alternative either, as the support-set constraint in (23) specifies a nonconvex set, in general. Nevertheless, the recovery threshold in (24) is interesting, as it completes the picture on the impact of knowledge about the support sets of x and e on the recovery thresholds. We refer to Section V-A for a detailed discussion of this matter. Note, though, that greedy recovery algorithms, such as OMP [13], [24], [25], CoSaMP [41], or subspace pursuit [42], can be modified to incorporate prior knowledge of the individual sparsity levels of x and/or e. Analytical recovery guarantees corresponding to the resulting modified algorithms do not seem to be available.

We finally note that tightness of (24) can also be established by an explicit construction. Specifically, one can construct a pair (x, e) and an alternative pair (x', e') such that both are in the admissible set of (23), both satisfy (24) with equality, and both lead to the same measurement outcome z. Therefore, (23) cannot distinguish between (x, e) and (x', e') (we refer to [34] for details).


D. Case IV: No Knowledge About the Support Sets

Finally, we consider the case of no knowledge (prior to recovery) about the support sets X and E. A corresponding application scenario would be the restoration of an audio signal (whose spectrum is sparse with unknown support set) that is corrupted by impulse noise, e.g., click or pop noise occurring at unknown locations. Another typical application can be found in the realm of signal separation, e.g., the decomposition of images into two distinct features, i.e., into a part that exhibits a sparse representation in the dictionary A and another part that exhibits a sparse representation in B. Decomposition of the image then amounts to performing sparse-signal recovery based on z = A x + B e with no knowledge about the support sets X and E available prior to recovery. The individual image features are given by A x and B e.

Recovery guarantees for this case follow from the results in [33]. Specifically, by rewriting (1) as z = D s with the concatenated dictionary D = [A B] and the stacked vector s, as in (6), we can employ the recovery guarantees in [33], which are explicit in the coherence parameters μ_a and μ_b and in the coherence of the concatenated dictionary D. For the sake of completeness, we restate the following result from [33].

Theorem 8 ([33, Theorem 2]): Let z = D s with D = [A B] and with the coherence parameters μ_a and μ_b and the dictionary coherence of D as defined in (8). A sufficient condition for the vector s to be the unique solution of (P0) applied to z is

(25)

where the remaining quantities in (25) are as defined in [33].

Obviously, once the vector s has been recovered, we can extract x and e. The following theorem, originally stated in [33], guarantees that BP and OMP deliver the unique solution of (P0) applied to z, and the associated recovery threshold, as shown in [33], is only slightly more restrictive than that for (P0) in (25).

Theorem 9 ([33, Corollary 4]): A sufficient condition for BP and OMP to deliver the unique solution of (P0) applied to z = D s is given by

(26)

where the remaining quantities are as defined in [33]. We emphasize that both thresholds (25) and (26) are more restrictive than those in (15), (18), (22), and (24) (see also Section V-A), which is consistent with the intuition that additional knowledge about the support sets X and E should lead to higher recovery thresholds. Note that tightness of (25) and (26) was established in [44] and [33], respectively.

V. DISCUSSION OF THE RECOVERY GUARANTEES

The aim of this section is to provide an interpretation of the recovery guarantees found in Section IV. Specifically, we discuss the impact of support-set knowledge on the recovery thresholds we found, and we point out limitations of our results.

A. Factor of Two in the Recovery Thresholds

Comparing the recovery thresholds (15), (18), (22), and (24) (Cases I–III), we observe that the price to be paid for not knowing the support set X or E is a reduction of the recovery threshold by a factor of two (note that in Case III, both X and E are unknown, but the cardinality of either X or E is known). For example, consider the recovery thresholds (15) and (18). For given n_e, solving (15) for n_x yields

Similarly, still assuming n_e given and solving (18) for n_x, we get

Hence, knowledge of X prior to recovery allows for the recovery of a signal with twice as many nonzero entries in x compared to the case where X is not known. This factor-of-two penalty has the same roots as the well-known factor-of-two penalty in spectrum-blind sampling [45]–[47]. Note that the same factor-of-two penalty can be inferred from the RIC-based recovery guarantees in [15], [39] when comparing the recovery threshold specified in [39, Theorem 1] for signals where partial support-set knowledge is available (prior to recovery) to that given in [15, Theorem 1.1], which does not assume prior support-set knowledge.

We illustrate the factor-of-two penalty in Figs. 1 and 2, where the recovery thresholds (15), (18), (22), (24), and (26) are shown. In Fig. 1, the threshold for X and E both known is twice the threshold obtained when only X or only E is known, which, in turn, is twice the threshold obtained when only n_e is known. Note furthermore that in Case IV, where no knowledge about the support sets is available, the recovery threshold is more restrictive than in the case where only n_e is known.

In Fig. 2, we show the recovery thresholds for a second set of coherence parameters. We see that all threshold curves are straight lines. This behavior can be explained by noting that (in contrast to the assumptions underlying Fig. 1) the dictionaries A and B have nonzero in-dictionary coherences μ_a and μ_b, and the corresponding recovery thresholds are essentially dominated by the numerator of the RHS expressions in (15), (18), (22), and (24), which depends on both n_x and n_e. More concretely, if


Fig. 1. Recovery thresholds (15), (18), (22), (24), and (26) for a first set of coherence parameters. Note that the curves for "only X known" and "only E known" are on top of each other.

Fig. 2. Recovery thresholds (15), (18), (22), (24), and (26) for a second set of coherence parameters.

, then the recovery threshold for Case II (where the support set E is known) becomes

(27)

which reflects the behavior observed in Fig. 2.

B. The Square-Root Bottleneck

The recovery thresholds presented in Section IV hold for all signal and noise realizations x and e and for all dictionary pairs (with given coherence parameters). However, as is well known in the sparse-signal recovery literature, coherence-based recovery guarantees are, in contrast to RIC-based recovery guarantees, fundamentally limited by the so-called square-root bottleneck [48]. More specifically, in the noiseless case (i.e., for e = 0), the threshold (4) states that recovery can be guaranteed only for a number of nonzero entries in x on the order of the square root of the number of measurements. Put differently, for a fixed number of nonzero entries in x, i.e., for a fixed sparsity level, the number of measurements required to recover x through (P0), BP, or OMP is on the order of the sparsity level squared.

As in the classical sparse-signal recovery literature, the square-root bottleneck can be broken by performing a probabilistic analysis [48]. This line of work, albeit interesting, is outside the scope of this paper and is further investigated in [33], [49], [50].

C. Trade-Off Between n_x and n_e

We next illustrate a trade-off between the sparsity levels of x and e. Following the procedure outlined in [12], [51], we construct a dictionary A consisting of a concatenation of ONBs and a dictionary B consisting of a concatenation of ONBs with known coherence parameters (built from a prime dimension). Now, let us assume that the error sparsity level n_e scales according to a power of the dimension, parametrized by an exponent. For the case where only E is known but X is unknown (Case II), we find from (18) that any signal with a correspondingly reduced (order-wise) number of nonzero entries (ignoring lower-order terms) can be reconstructed. Hence, there is a trade-off between the sparsity levels of x and e (here quantified through the exponent), and both sparsity levels scale with the dimension.

VI. NUMERICAL RESULTS

We first report simulation results and compare them to the corresponding analytical results in this paper. We will find that even though the analytical thresholds are pessimistic in general, they correctly reflect the numerically observed recovery behavior. In particular, we will see that the factor-of-two penalty discussed in Section V-A can also be observed in the numerical results. We then demonstrate, through a simple inpainting example, that perfect signal recovery in the presence of sparse errors is possible even if the corruptions are significant (in terms of the ℓ2-norm of the sparse noise vector B e). In all numerical results, OMP is performed with a predetermined number of iterations [13], [24], [25]; i.e., for Case II and Case IV, we set the number of iterations to n_x and n_x + n_e, respectively. To implement BP, we employ SPGL1 [52], [53].

A. Impact of Support-Set Knowledge on Recovery Thresholds

We first compare simulation results to the recovery thresholds (15), (18), (22), and (26). For a given pair of dictionaries A and B, we generate signal vectors x and error vectors e as follows: We first fix n_x and n_e; then, the support sets of the n_x-sparse vector x and the n_e-sparse vector e are chosen uniformly at random among all possible support sets of cardinality n_x and n_e, respectively. Once the support sets have been chosen, we generate the nonzero entries of x and e by drawing from i.i.d. zero-mean, unit-variance Gaussian random variables. For each pair of support-set cardinalities n_x and n_e, we perform 10 000 Monte Carlo trials and declare success of recovery whenever the recovered vector x̂ satisfies

(28)

We plot the 50% success-rate contour, i.e., the border between the region of pairs (n_x, n_e) for which (28) is satisfied in at least 50% of the trials and the region where (28) is satisfied in less


than 50% of the trials. The recovered vector x̂ is obtained as follows.

• Case I: When X and E are both known, we perform recovery according to (14).

• Case II: When either only X or only E is known, we apply BP and OMP using the modified dictionary as detailed in Theorem 5 and Corollary 6, respectively.

• Case IV: When neither X nor E is known, we apply BP and OMP to the concatenated dictionary D = [A B] as described in Theorem 9.

Note that for Case III, i.e., the case where the cardinality of one support set is known, we only have uniqueness results but no analytical recovery guarantees (as pointed out in Section IV-C), neither for BP nor for greedy recovery algorithms that make use of the separate knowledge of n_x or n_e (whereas, e.g., standard OMP makes use of knowledge of n_x + n_e, rather than knowledge of n_x and n_e individually). This case is, therefore, not considered in the simulation results in the following.
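For concreteness, the Monte Carlo protocol described above can be sketched as follows for Case I, where (14) (not reproduced here) is read as a least-squares solve on the known supports; all sizes are illustrative, and the dictionaries are drawn at random rather than fixed as in the experiments reported below:

```python
import numpy as np

rng = np.random.default_rng(1)
m, na, nb, nx, ne, trials = 32, 32, 32, 4, 4, 200

def trial():
    # Random unit-norm dictionaries (illustrative; the paper fixes A and B).
    A = rng.standard_normal((m, na)); A /= np.linalg.norm(A, axis=0)
    B = rng.standard_normal((m, nb)); B /= np.linalg.norm(B, axis=0)
    X = rng.choice(na, nx, replace=False)    # support sets chosen uniformly
    E = rng.choice(nb, ne, replace=False)
    x = np.zeros(na); x[X] = rng.standard_normal(nx)
    e = np.zeros(nb); e[E] = rng.standard_normal(ne)
    z = A @ x + B @ e
    # Case I: both supports known; least squares on [A_X B_E].
    coef, *_ = np.linalg.lstsq(np.hstack([A[:, X], B[:, E]]), z, rcond=None)
    x_hat = np.zeros(na); x_hat[X] = coef[:nx]
    # Success criterion in the spirit of (28): small relative error.
    return np.linalg.norm(x_hat - x) <= 1e-6 * np.linalg.norm(x)

rate = np.mean([trial() for _ in range(trials)])
print(rate)
```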

1) Recovery Performance for the Hadamard–Identity Pair Using BP and OMP: We let A be the Hadamard ONB [54] and set B = I, which results in μ_a = μ_b = 0 and μ_m = 1/√m. Fig. 3 shows 50% success-rate contours under different assumptions of support-set knowledge. For perfect knowledge of X and E, we observe that the 50% success-rate contour lies significantly above the sufficient condition (guaranteeing perfect recovery) provided in (15).5 When either only X or only E is known, the recovery performance is essentially independent of whether X or E is known. This is also reflected by the analytical thresholds (18) and (22) when evaluated for μ_a = μ_b = 0 (see also Fig. 1). Furthermore, OMP is seen to outperform BP. When neither X nor E is known, OMP again outperforms BP.

It is interesting to see that the factor-of-two penalty discussed in Section V-A is reflected in Fig. 3 between Cases I and II. Specifically, we can observe that for full support-set knowledge (Case I), the 50% success rate is achieved at roughly twice the signal sparsity level at which OMP achieves a 50% success rate when only X or only E is known (Case II). Note that the results for BP in Fig. 3 do not seem to reflect the factor-of-two penalty. For lack of an efficient recovery algorithm (making use of knowledge of n_e), we do not show numerical results for Case III.
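The coherence values quoted for this pair are easy to verify numerically. The sketch below (dimension m = 128 chosen arbitrarily; the paper's value is not reproduced here) builds a Sylvester Hadamard ONB and checks that μ_a = μ_b = 0 and μ_m = 1/√m:

```python
import numpy as np

def hadamard(m):
    """Sylvester construction of an m x m Hadamard matrix (m a power of two)."""
    H = np.ones((1, 1))
    while H.shape[0] < m:
        H = np.block([[H, H], [H, -H]])
    return H

m = 128                       # illustrative dimension (a power of two)
A = hadamard(m) / np.sqrt(m)  # Hadamard ONB: unit-norm, mutually orthogonal columns
B = np.eye(m)                 # identity basis

mu_a = np.max(np.abs(A.T @ A - np.eye(m)))  # in-dictionary coherence of A
mu_b = np.max(np.abs(B.T @ B - np.eye(m)))  # in-dictionary coherence of B
mu_m = np.max(np.abs(A.T @ B))              # mutual coherence

print(mu_a, mu_b, mu_m)       # approximately 0, 0, and 1/sqrt(m)
```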

2) Impact of the Coherence Parameters: We generate the dictionaries A and B as follows. Using the alternating projection method described in [56], we generate an approximate equiangular tight frame (ETF) consisting of 160 columns. We split this frame into two sets of 80 elements (columns) each and organize them in the matrices A and B; the coherence parameters μ_a, μ_b, and μ_m are then determined by this construction. Fig. 4 shows the 50% success-rate contour under four different assumptions of support-set knowledge. In the case where either only X or only E is known and in the case where X and E are unknown, we use OMP and BP for recovery. It is interesting to note that

5 For the Hadamard–identity pair, it was proven in [55] that a set of columns chosen randomly from both A and B is linearly independent (with high probability) given that the total number of chosen columns, i.e., n_x + n_e here, does not exceed a constant proportion of the dimension m.

Fig. 3. Impact of support-set knowledge on the 50% success-rate contour of OMP and BP for the Hadamard–identity pair.

Fig. 4. Impact of support-set knowledge on the 50% success-rate contour of OMP and BP performing recovery in pairs of approximate ETFs, each of dimension 64 × 80.

the graphs for the cases where only X or only E is known are symmetric with respect to the line n_x = n_e. This symmetry is also reflected in the analytical thresholds (18) and (22) (see also Fig. 2 and the discussion in Section V-A).

We finally note that in all cases considered earlier, the numerical results show that recovery is possible for significantly higher sparsity levels n_x and n_e than indicated by the corresponding analytical thresholds (15), (18), (22), and (26) (see also Figs. 1 and 2). The underlying reasons are i) the deterministic nature of the results, i.e., the recovery guarantees in (15), (18), (22), and (26) are valid for all dictionary pairs (with given coherence parameters) and all signal and noise realizations (with given sparsity levels), and ii) we plot the 50% success-rate contour, whereas the analytical results guarantee perfect recovery in 100% of the cases.

B. Inpainting Example

In transform coding, one is typically interested in maximallysparse representations of a given signal to be encoded [57].


Fig. 5. Recovery results using the DCT basis for the signal dictionary and the identity basis for the noise dictionary, for the cases where X and E are known, only E is known, and no support-set knowledge is available. (Picture origin: ETH Zürich/Esther Ramseier.)

Fig. 6. Recovery results for the signal dictionary given by the Haar wavelet basis and the noise dictionary given by the identity basis, for the cases where both X and E are known, only E is known, and no support-set knowledge is available. (Picture origin: ETH Zürich/Esther Ramseier.)

In our setting, this would mean that the dictionary A should be chosen so that it leads to maximally sparse representations of a given family of signals. We next demonstrate, however, that in the presence of structured noise, the signal dictionary A should additionally be incoherent to the noise dictionary B. This extra requirement can lead to very different criteria for designing transform bases (frames).

To illustrate this point, and to show that perfect recovery can be guaranteed even when the ℓ2-norm of the noise term B e is large, we consider the recovery of a sparsely corrupted 512 × 512-pixel grayscale image of the main building of ETH Zürich. The dictionary A is taken to be either the 2-D discrete cosine transform (DCT) or the Haar wavelet basis decomposed on three octaves [58]. We first "sparsify" the image by retaining the 15% largest entries of the image's representation in A. We then corrupt (by overwriting with text) 18.8% of the pixels in the sparsified image by setting them to the brightest grayscale value; this means that the errors are sparse in the identity basis B = I and that the noise is structured but may have large ℓ2-norm. Image recovery is performed according to (14) if X and E are known. (Note, however, that knowledge of X is usually not available in inpainting applications.) We use BP when only E is known and when neither X nor E is known. The recovery results are evaluated by computing the mean-square error (MSE) between the sparsified image and its recovered version.

Figs. 5 and 6 show the corresponding results. As expected, the MSE increases as the amount of knowledge about the support sets decreases. More interestingly, we note that even though Haar wavelets often yield a smaller approximation error in classical transform coding compared to the DCT, here the wavelet transform performs worse than the DCT. This behavior is due to the fact that sparsity is not the only factor determining the performance of a transform-coding basis (or frame) in the presence of structured noise. Rather, the mutual coherence between the dictionary used to represent the signal and that used to represent the structured noise becomes highly relevant. Specifically, in the example at hand, the mutual coherence between the Haar wavelet basis and the identity basis equals 1/2, whereas that between the DCT basis and the identity basis is much smaller. The dependence of the analytical thresholds (15), (18), (22), (24), and (26) on the mutual coherence explains the performance difference between the Haar wavelet basis and the DCT basis. An intuitive explanation for this behavior is as follows: the Haar wavelet basis contains only four nonzero entries in the columns associated with fine scales, which is reflected in the high mutual coherence between the Haar wavelet basis and the identity basis. Thus, when projecting onto the orthogonal complement of the range of B_E, it is likely that all nonzero entries of such columns are deleted, resulting in columns of all zeros. Recovery of the corresponding nonzero entries of x is, thus, not possible. In summary, we see that the choice of the transform basis (frame) for a sparsely corrupted signal should not only aim at sparsifying the signal as much as possible but should also take into account the mutual coherence between the transform basis (frame) and the noise sparsity basis (frame).


VII. CONCLUSION

The setup considered in this paper, in its generality, appears to be new, and a number of interesting extensions are possible. In particular, developing (coherence-based) recovery guarantees for greedy algorithms such as CoSaMP [41] or subspace pursuit [42] for all cases studied in this paper is an interesting open problem. Note that probabilistic recovery guarantees for the case where nothing is known about the signal and noise support sets (i.e., Case IV) readily follow from the results in [33]. Probabilistic recovery guarantees for the other cases studied in this paper are in preparation [50]. Furthermore, an extension of the results in this paper that accounts for measurement noise (in addition to sparse noise) and applies to approximately sparse signals can be found in [59].

APPENDIX A
PROOF OF THEOREM 3

We prove the full (column-)rank property of [A_X B_E] by showing that under (15) there is a unique pair (x, e) with supp(x) ⊆ X and supp(e) ⊆ E satisfying z = A x + B e. Assume that there exists an alternative pair (x', e') such that z = A x' + B e' with supp(x') ⊆ X and supp(e') ⊆ E (i.e., the support sets of x' and e' are contained in X and E, respectively). This would then imply that

A x + B e = A x' + B e'

and thus

A(x − x') = B(e' − e).

Since both x and x' have support in X, it follows that x − x' also has support in X, which implies ||x − x'||_0 ≤ n_x. Similarly, we get ||e' − e||_0 ≤ n_e. Defining the difference vectors u = x − x' and v = e' − e, we obtain the following chain of inequalities:

(29)

(30)

where (29) follows by applying the uncertainty relation in Theorem 1 (since both u and v are perfectly concentrated to X and E, respectively) and (30) is a consequence of ||u||_0 ≤ n_x and ||v||_0 ≤ n_e. Obviously, (30) contradicts the assumption in (15), which completes the proof.

APPENDIX B
PROOF OF THEOREM 4

We begin by proving that x is the unique solution of (16) applied to z. Assume that there exists an alternative vector x' ≠ x that satisfies the constraints in (16) with ||x'||_0 ≤ ||x||_0. This would imply the existence of a vector e' with supp(e') ⊆ E, such that

A x + B e = A x' + B e'

and hence

A(x − x') = B(e' − e).

Since supp(e) ⊆ E and supp(e') ⊆ E, we have supp(e' − e) ⊆ E and hence ||e' − e||_0 ≤ n_e. Furthermore, since both x and x' have at most n_x nonzero entries, we have ||x − x'||_0 ≤ 2 n_x. Defining u = x − x' and v = e' − e, we obtain the following chain of inequalities:

(31)

(32)

where (31) follows by applying the uncertainty relation in Theorem 1 (since both u and v are perfectly concentrated to their support sets) and (32) is a consequence of ||u||_0 ≤ 2 n_x and ||v||_0 ≤ n_e. Obviously, (32) contradicts the assumption in (18), which concludes the first part of the proof.

We next prove that x is also the unique solution of (17) applied to z. Assume that there exists an alternative vector x' ≠ x that satisfies the constraints in (17) with ||x'||_1 ≤ ||x||_1. This would imply the existence of a vector e' with supp(e') ⊆ E, such that

A x + B e = A x' + B e'

and hence

A(x − x') = B(e' − e).

Defining u = x − x', we obtain the following lower bound for the ℓ1-norm of x':

(33)

where (33) is a consequence of the reverse triangle inequality. Now, the ℓ1-norm of x' can be smaller than or equal to that of x only if the portion of u supported on X dominates, in the ℓ1-sense, the portion supported outside X. This would then imply that the difference vector u needs to be at least 50%-concentrated to the set X (of cardinality n_x). Defining v = e' − e, and noting that supp(v) ⊆ E and ||v||_0 ≤ n_e, it follows that v is perfectly concentrated to E. This leads to the following chain of inequalities:

(34)

Page 13: Recovery of Sparsely Corrupted Signals

STUDER et al.: RECOVERY OF SPARSELY CORRUPTED SIGNALS 3127

(35)

where (34) follows from the uncertainty relation in Theorem 1 applied to the difference vectors u and v (since u is at least 50%-concentrated to X and since v is perfectly concentrated to E) and (35) is a consequence of |X| = n_x and ||v||_0 ≤ n_e. Rewriting (35), we obtain

(36)

Since (36) contradicts the assumption in (18), this proves that x is the unique solution of (17) applied to z.

APPENDIX C
PROOF OF THEOREM 5

We first show that condition (18) ensures that the columns of B_E are linearly independent. Then, we establish that ||P a_i||_2 > 0 for i = 1, …, n_a. Finally, we show that the unique solution of (P0), BP, and OMP applied to P z and the dictionary P A D^{−1} is given by x̃ = D x.

1) The Columns of B_E Are Linearly Independent: Condition (18) can only be satisfied if 1 − μ_b (n_e − 1) > 0, which implies that n_e < 1 + 1/μ_b. It was shown in [11]–[13] that for a dictionary B with coherence μ_b, no fewer than 1 + 1/μ_b columns of B can be linearly dependent. Hence, the columns of B_E must be linearly independent.

2) ||P a_i||_2 > 0 for i = 1, …, n_a: We have to verify that condition (18) implies ||P a_i||_2 > 0 for i = 1, …, n_a. Since P is a projector and, therefore, Hermitian and idempotent, it follows that

(37)

(38)

where (37) is a consequence of P being Hermitian and idempotent, and (38) follows from the reverse triangle inequality and ||a_i||_2 = 1, i = 1, …, n_a. Next, we derive an upper bound on the subtracted term according to

(39)

(40)

where (39) follows from the Rayleigh–Ritz theorem [60, Theorem 4.2.2] and (40) results from the definition of the mutual coherence μ_m.

Next, applying Geršgorin's disc theorem [60, Theorem 6.1.1], we arrive at

(41)

Combining (38), (40), and (41) leads to the following lower bound on ||P a_i||_2:

(42)

Note that if condition (18) holds for6 n_x ≥ 1, it follows that the RHS of (42) is strictly positive. This ensures that the diagonal matrix D defines a one-to-one mapping. We next show that, moreover, condition (18) ensures that for every vector u ≠ 0 satisfying ||u||_0 ≤ 2 n_x, the vector A u has a nonzero component that is orthogonal to the range of B_E.
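The Geršgorin step can be checked numerically: for any dictionary B with unit-norm columns and coherence μ_b, the eigenvalues of the Gram matrix of any n_e selected columns lie within (n_e − 1) μ_b of 1. The sketch below (sizes illustrative) verifies this for a random dictionary:

```python
import numpy as np

rng = np.random.default_rng(4)
m, nb, ne = 64, 80, 6                         # illustrative sizes
B = rng.standard_normal((m, nb)); B /= np.linalg.norm(B, axis=0)

mu_b = np.max(np.abs(B.T @ B - np.eye(nb)))   # coherence of B

E = rng.choice(nb, ne, replace=False)
lam = np.linalg.eigvalsh(B[:, E].T @ B[:, E]) # spectrum of the Gram matrix

# Gersgorin discs: every eigenvalue is within (ne - 1) * mu_b of 1.
assert lam.min() >= 1 - (ne - 1) * mu_b - 1e-12
assert lam.max() <= 1 + (ne - 1) * mu_b + 1e-12
print(lam.min(), lam.max())
```

When (n_e − 1) μ_b < 1, the lower disc edge is positive, which is precisely the linear-independence condition used in Appendix C-1.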

3) Unique Recovery Through (P0), BP, and OMP: We now need to verify that (P0), BP, and OMP (applied to P z and the dictionary P A D^{−1}) recover the vector x̃ = D x provided that (18) is satisfied. This will be accomplished by deriving an upper bound on the coherence of the modified dictionary P A D^{−1}, which, via the well-known coherence-based recovery guarantee [11]–[13]

(43)

leads to a recovery threshold guaranteeing perfect recovery of x̃. This threshold is then shown to coincide with (18). More specifically, the well-known sparsity threshold in (4) guarantees that the unique solution of (P0) applied to P z is given by x̃ and, furthermore, that this unique solution can be obtained through BP and OMP if (43) holds. It is important to note that x̃ has the same support set as x. With

we obtain

(44)

Next, we upper bound the RHS of (44) by upper-bounding its numerator and lower-bounding its denominator. For the numerator, we have

(45)

(46)

(47)

where (45) follows from P being Hermitian and idempotent, (46) is obtained through the triangle inequality, and (47) follows from the definitions of the coherence parameters. Next, we derive an upper bound according to

(48)

(49)

6 The case n_x = 0 is not interesting, as n_x = 0 corresponds to x = 0, and hence only recovery of x = 0 could be guaranteed.


where (48) follows from the Cauchy–Schwarz inequality and (49) from the Rayleigh–Ritz theorem [60, Theorem 4.2.2]. We further have

We obtain an upper bound using the same steps that were used in (39)–(41):

(50)

Combining (47) and (50) leads to the following upper bound:

(51)

Next, we derive a lower bound on the denominator of the RHS of (44). To this end, we note that

(52)

where (52) follows from (42). Finally, combining (51) and (52), we arrive at

(53)

Inserting (53) into the recovery threshold in (43), we obtain the following threshold guaranteeing recovery of x̃ from P z through (P0), BP, and OMP:

(54)

Since , we can transform (54) into

(55)

Rearranging terms in (55) finally yields

which proves that (18) guarantees recovery of the vector (and thus also of ) through , BP, and OMP.
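To make the role of the coherence concrete, here is a minimal sketch (the dictionary is a hypothetical textbook example, and the function name `coherence` is ours) that computes the coherence of a column-normalized dictionary and the resulting standard sparsity threshold of the form (1/2)(1 + 1/μ):

```python
import numpy as np

def coherence(D):
    """Coherence mu(D): largest absolute inner product between
    distinct normalized columns of the dictionary D."""
    Dn = D / np.linalg.norm(D, axis=0)   # normalize columns
    G = np.abs(Dn.conj().T @ Dn)         # magnitudes of the Gram matrix
    np.fill_diagonal(G, 0.0)             # ignore self inner products
    return G.max()

# Hypothetical dictionary: identity concatenated with a unitary DFT
# basis, a classic pair with coherence 1/sqrt(n).
n = 16
F = np.fft.fft(np.eye(n)) / np.sqrt(n)
D = np.hstack([np.eye(n), F])

mu = coherence(D)                        # 1/sqrt(16) = 0.25
threshold = 0.5 * (1.0 + 1.0 / mu)       # standard sparsity threshold
print(mu, threshold)
```

For this dictionary the threshold evaluates to 2.5, so any representation with at most two nonzero entries is guaranteed unique and recoverable under the standard coherence result.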

APPENDIX D
PROOF OF THEOREM 7

Assume that there exists an alternative vector that satisfies

P (with P )

with . This implies the existence of a vector with such that

and therefore

From and it follows that . Similarly, and imply . Defining and , and, similarly, and , we arrive at

(56)

(57)

where (56) follows from the uncertainty relation in Theorem 1 applied to the difference vectors and (with since both and are perfectly concentrated to and , respectively) and (57) is a consequence of and . Obviously, (57) is in contradiction to (24), which concludes the proof.
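For orientation, the classical uncertainty relation for a pair of orthonormal bases [23], which Theorem 1 generalizes to pairs of general dictionaries, bounds the sparsity levels of the two representations of a nonzero vector in terms of the mutual coherence; in generic placeholder notation (n_a, n_b, and μ are our symbols):

```latex
% Elad-Bruckstein uncertainty relation for two orthonormal bases [23]
% (n_a, n_b, and \mu are generic placeholder symbols):
n_a \, n_b \;\geq\; \frac{1}{\mu^2},
\qquad\text{and hence}\qquad
n_a + n_b \;\geq\; \frac{2}{\mu}.
```

The contradiction argument above plays the analogous relation from Theorem 1 against assumption (24).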

ACKNOWLEDGMENT

The authors would like to thank C. Aubel, A. Bracher, and G. Durisi for interesting discussions, and the anonymous reviewers for their valuable comments, which helped to improve the exposition.

REFERENCES

[1] C. Studer, P. Kuppinger, G. Pope, and H. Bölcskei, “Sparse signal recovery from sparsely corrupted measurements,” in Proc. IEEE Int. Symp. Inf. Theory, St. Petersburg, Russia, Aug. 2011, pp. 1422–1426.

[2] J. S. Abel and J. O. Smith, III, “Restoring a clipped signal,” in Proc. IEEE Int. Conf. Acoust. Speech Signal Process., May 1991, vol. 3, pp. 1745–1748.

[3] R. E. Carrillo, K. E. Barner, and T. C. Aysal, “Robust sampling and reconstruction methods for sparse signals in the presence of impulsive noise,” IEEE J. Sel. Topics Signal Process., vol. 4, no. 2, pp. 392–408, Apr. 2010.

[4] C. Novak, C. Studer, A. Burg, and G. Matz, “The effect of unreliable LLR storage on the performance of MIMO-BICM,” in Proc. 44th Asilomar Conf. Signals, Syst., Comput., Pacific Grove, CA, Nov. 2010, pp. 736–740.

[5] S. G. Mallat and G. Yu, “Super-resolution with sparse mixing estimators,” IEEE Trans. Image Process., vol. 19, no. 11, pp. 2889–2900, Nov. 2010.

[6] M. Elad and Y. Hel-Or, “Fast super-resolution reconstruction algorithm for pure translational motion and common space-invariant blur,” IEEE Trans. Image Process., vol. 10, no. 8, pp. 1187–1193, Aug. 2001.

[7] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester, “Image inpainting,” in Proc. 27th Annu. Conf. Comp. Graph. Int. Technol., 2000, pp. 417–424.


[8] M. Aharon, M. Elad, and A. M. Bruckstein, “K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Trans. Signal Process., vol. 54, no. 11, pp. 4311–4322, Nov. 2006.

[9] M. Elad, J.-L. Starck, P. Querre, and D. L. Donoho, “Simultaneous cartoon and texture image inpainting using morphological component analysis (MCA),” Appl. Comput. Harmon. Anal., vol. 19, pp. 340–358, Nov. 2005.

[10] D. L. Donoho and G. Kutyniok, “Microlocal analysis of the geometric separation problem,” Feb. 2010 [Online]. Available: http://arxiv.org/abs/1004.3006v1

[11] D. L. Donoho and M. Elad, “Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization,” Proc. Natl. Acad. Sci. USA, vol. 100, no. 5, pp. 2197–2202, Mar. 2003.

[12] R. Gribonval and M. Nielsen, “Sparse representations in unions of bases,” IEEE Trans. Inf. Theory, vol. 49, no. 12, pp. 3320–3325, Dec. 2003.

[13] J. A. Tropp, “Greed is good: Algorithmic results for sparse approximation,” IEEE Trans. Inf. Theory, vol. 50, no. 10, pp. 2231–2242, Oct. 2004.

[14] E. J. Candès and T. Tao, “Decoding by linear programming,” IEEE Trans. Inf. Theory, vol. 51, no. 12, pp. 4203–4215, Dec. 2005.

[15] E. J. Candès, “The restricted isometry property and its implications for compressed sensing,” C. R. Math. Acad. Sci. Paris, Ser. I, vol. 346, pp. 589–592, 2008.

[16] T. T. Cai, L. Wang, and G. Xu, “Stable recovery of sparse signals and an oracle inequality,” IEEE Trans. Inf. Theory, vol. 56, no. 7, pp. 3516–3522, Jul. 2010.

[17] D. L. Donoho, M. Elad, and V. N. Temlyakov, “Stable recovery of sparse overcomplete representations in the presence of noise,” IEEE Trans. Inf. Theory, vol. 52, no. 1, pp. 6–18, Jan. 2006.

[18] J. J. Fuchs, “Recovery of exact sparse representations in the presence of bounded noise,” IEEE Trans. Inf. Theory, vol. 51, no. 10, pp. 3601–3608, Oct. 2005.

[19] J. A. Tropp, “Just relax: Convex programming methods for identifying sparse signals in noise,” IEEE Trans. Inf. Theory, vol. 52, no. 3, pp. 1030–1051, Mar. 2006.

[20] Z. Ben-Haim, Y. C. Eldar, and M. Elad, “Coherence-based performance guarantees for estimating a sparse vector under random noise,” IEEE Trans. Signal Process., vol. 58, no. 10, pp. 5030–5043, Oct. 2010.

[21] S. S. Chen, D. L. Donoho, and M. A. Saunders, “Atomic decomposition by basis pursuit,” SIAM J. Sci. Comput., vol. 20, no. 1, pp. 33–61, 1998.

[22] D. L. Donoho and X. Huo, “Uncertainty principles and ideal atomic decomposition,” IEEE Trans. Inf. Theory, vol. 47, no. 7, pp. 2845–2862, Nov. 2001.

[23] M. Elad and A. M. Bruckstein, “A generalized uncertainty principle and sparse representation in pairs of bases,” IEEE Trans. Inf. Theory, vol. 48, no. 9, pp. 2558–2567, Sep. 2002.

[24] Y. C. Pati, R. Rezaifar, and P. S. Krishnaprasad, “Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition,” in Proc. 27th Asilomar Conf. Signals, Syst., Comput., Pacific Grove, CA, Nov. 1993, pp. 40–44.

[25] G. Davis, S. G. Mallat, and Z. Zhang, “Adaptive time-frequency decompositions,” Opt. Eng., vol. 33, no. 7, pp. 2183–2191, Jul. 1994.

[26] D. L. Donoho and P. B. Stark, “Uncertainty principles and signal recovery,” SIAM J. Appl. Math., vol. 49, no. 3, pp. 906–931, Jun. 1989.

[27] J. N. Laska, P. T. Boufounos, M. A. Davenport, and R. G. Baraniuk, “Democracy in action: Quantization, saturation, and compressive sensing,” Appl. Comput. Harmon. Anal., vol. 31, no. 3, pp. 429–443, 2011.

[28] J. N. Laska, M. A. Davenport, and R. G. Baraniuk, “Exact signal recovery from sparsely corrupted measurements through the pursuit of justice,” in Proc. 43rd Asilomar Conf. Signals, Syst., Comput., Pacific Grove, CA, Nov. 2009, pp. 1556–1560.

[29] J. Wright and Y. Ma, “Dense error correction via ℓ1-minimization,” IEEE Trans. Inf. Theory, vol. 56, no. 7, pp. 3540–3560, Jul. 2010.

[30] N. H. Nguyen and T. D. Tran, “Exact recoverability from dense corrupted observations via ℓ1 minimization,” Feb. 2011 [Online]. Available: http://arxiv.org/abs/1102.1227v3

[31] D. L. Donoho, “Compressed sensing,” IEEE Trans. Inf. Theory, vol. 52, no. 4, pp. 1289–1306, Apr. 2006.

[32] E. J. Candès and J. Romberg, “Sparsity and incoherence in compressive sampling,” Inverse Problems, vol. 23, no. 3, pp. 969–985, 2007.

[33] P. Kuppinger, G. Durisi, and H. Bölcskei, “Uncertainty relations and sparse signal recovery for pairs of general signal sets,” IEEE Trans. Inf. Theory, vol. 58, no. 1, pp. 263–277, Jan. 2012.

[34] P. Kuppinger, General Uncertainty Relations and Sparse Signal Recovery, H. Bölcskei, Ed. Konstanz, Germany: Hartung-Gorre Verlag, 2011, vol. 8, Series in Communication Theory.

[35] S. Ghobber and P. Jaming, “On uncertainty principles in the finite dimensional setting,” Feb. 2010 [Online]. Available: http://arxiv.org/abs/0903.2923

[36] G. Davis, S. G. Mallat, and M. Avellaneda, “Adaptive greedy algorithms,” Constr. Approx., vol. 13, pp. 57–98, 1997.

[37] A. Adler, V. Emiya, M. G. Jafari, M. Elad, R. Gribonval, and M. D. Plumbley, “A constrained matching pursuit approach to audio declipping,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Prague, Czech Republic, May 2011, pp. 329–332.

[38] N. Vaswani and W. Lu, “Modified-CS: Modifying compressive sensing for problems with partially known support,” IEEE Trans. Signal Process., vol. 58, no. 9, pp. 4595–4607, Sep. 2010.

[39] L. Jacques, “A short note on compressed sensing with partially known signal support,” EURASIP J. Signal Process., vol. 90, no. 12, pp. 3308–3312, Dec. 2010.

[40] A. Maleki and D. L. Donoho, “Optimally tuned iterative reconstruction algorithms for compressed sensing,” IEEE J. Sel. Topics Signal Process., vol. 4, no. 2, pp. 330–341, Apr. 2010.

[41] D. Needell and J. A. Tropp, “CoSaMP: Iterative signal recovery from incomplete and inaccurate samples,” Appl. Comput. Harmon. Anal., vol. 26, no. 3, pp. 301–321, May 2009.

[42] W. Dai and O. Milenkovic, “Subspace pursuit for compressive sensing signal reconstruction,” IEEE Trans. Inf. Theory, vol. 55, no. 5, pp. 2230–2249, May 2009.

[43] M. A. Davenport, P. T. Boufonos, and R. G. Baraniuk, “Compressive domain interference cancellation,” in Proc. Workshop Signal Process. Adapt. Sparse Struct. Represent., Saint-Malo, France, Apr. 2009.

[44] A. Feuer and A. Nemirovski, “On sparse representations in pairs of bases,” IEEE Trans. Inf. Theory, vol. 49, no. 6, pp. 1579–1581, Jun. 2003.

[45] P. Feng and Y. Bresler, “Spectrum-blind minimum-rate sampling and reconstruction of multiband signals,” in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process., Atlanta, GA, May 1996, vol. 3, pp. 1689–1692.

[46] Y. Bresler, “Spectrum-blind sampling and compressive sensing for continuous-index signals,” in Proc. Inf. Theory Appl. Workshop, San Diego, CA, Jan. 2008, pp. 547–554.

[47] M. Mishali and Y. C. Eldar, “Blind multi-band signal reconstruction: Compressed sensing for analog signals,” IEEE Trans. Signal Process., vol. 57, no. 3, pp. 993–1009, Mar. 2009.

[48] J. A. Tropp, “On the conditioning of random subdictionaries,” Appl. Comput. Harmon. Anal., vol. 25, pp. 1–24, Jul. 2008.

[49] P. Kuppinger, G. Durisi, and H. Bölcskei, “Where is randomness needed to break the square-root bottleneck?,” in Proc. IEEE Int. Symp. Inf. Theory, Jun. 2010, pp. 1578–1582.

[50] G. Pope, A. Bracher, and C. Studer, “Probabilistic recovery guarantees for sparsely corrupted signals,” in preparation.

[51] A. R. Calderbank, P. J. Cameron, W. M. Kantor, and J. J. Seidel, “Z4-Kerdock codes, orthogonal spreads, and extremal Euclidean line-sets,” Proc. London Math. Soc., vol. 75, no. 2, pp. 436–480, 1997.

[52] E. van den Berg and M. P. Friedlander, “Probing the Pareto frontier for basis pursuit solutions,” SIAM J. Sci. Comput., vol. 31, no. 2, pp. 890–912, 2008.

[53] E. van den Berg and M. P. Friedlander, “SPGL1: A solver for large-scale sparse reconstruction,” 2007 [Online]. Available: http://www.cs.ubc.ca/labs/scl/spgl1

[54] S. S. Agaian, Hadamard Matrices and Their Applications. New York: Springer, 1985, vol. 1168, Lecture Notes in Mathematics.

[55] J. A. Tropp, “On the linear independence of spikes and sines,” J. Fourier Anal. Appl., vol. 14, no. 5, pp. 838–858, 2008.

[56] J. A. Tropp, I. S. Dhillon, R. W. Heath, Jr., and T. Strohmer, “Designing structured tight frames via an alternating projection method,” IEEE Trans. Inf. Theory, vol. 51, no. 1, pp. 188–209, Jan. 2005.

[57] N. S. Jayant and P. Noll, Digital Coding of Waveforms: Principles and Applications to Speech and Video. Englewood Cliffs, NJ: Prentice-Hall, 1984.

[58] S. G. Mallat, A Wavelet Tour of Signal Processing. San Diego, CA: Academic Press, 1998.

[59] C. Studer and R. G. Baraniuk, “Stable restoration and separation of approximately sparse signals,” Appl. Comput. Harmon. Anal., Jul. 2011, submitted for publication.

[60] R. A. Horn and C. R. Johnson, Matrix Analysis. New York: Cambridge Univ. Press, 1985.


Christoph Studer (S’06–M’10) was born in Solothurn, Switzerland, on December 25, 1979. He received his M.Sc. and Dr. sc. degrees in information technology and electrical engineering from ETH Zürich, Switzerland, in 2005 and 2009, respectively. In 2005, he was a Visiting Researcher with the Smart Antennas Research Group, Information Systems Laboratory, Stanford University, Stanford, CA, USA. From 2006 to 2009, he was a Research Assistant jointly at the Integrated Systems Laboratory (IIS) and the Communication Technology Laboratory (CTL), ETH Zürich. In 2008, he was a consultant for Celestrius, an ETH spin-off specialized in the field of multi-antenna wireless communication. From 2009 to 2011, Dr. Studer was a Postdoctoral Researcher at CTL, ETH Zürich, and since 2011, he has been a Postdoctoral Research Associate at the Department of Electrical and Computer Engineering, Rice University, Houston, TX. His research interests include the design of VLSI circuits and systems, wireless communication, as well as signal and image processing.

Dr. Studer was the recipient of an ETH Medal in 2005 and in 2011 for his M.S. Thesis and Ph.D. Thesis, respectively. He was the winner of the Student Paper Contest of the 2007 Asilomar Conf. on Signals, Systems, and Computers and received a Best Student Paper Award at the 2008 IEEE Int. Symp. on Circuits and Systems. He received (jointly with Mr. Fateh and Dr. Seethaler) the Swisscom/ICTnet Innovations Award 2010 for the work on “VLSI Implementation of Soft-Input Soft-Output Minimum Mean-Square Error Parallel Interference Cancellation.” In 2011, Dr. Studer was awarded a two-year fellowship for Advanced Researchers of the Swiss National Science Foundation (SNSF).

Patrick Kuppinger (S’07) was born in Freiburg, Germany, on May 5, 1983. He received the B.Sc. degree in electrical engineering and information technology in 2005 from ETH Zürich, Switzerland, the M.Sc. degree with distinction in electrical engineering in 2006 from Imperial College London, London, UK, and the Dr. sc. degree in 2011 from ETH Zürich. Since 2011, he has been with UBS, Opfikon, Switzerland.

He held a scholarship from the Klaus Murmann Fellowship Programme (Stiftung der Deutschen Wirtschaft) (2004–2006) and was the recipient of an IDEA League scholarship during his stay at Imperial College London.

Graeme Pope (S’05) was born in Canberra, Australia, on April 3, 1984. He completed a B.Sc. (Pure Mathematics) with 1st Class Honours in 2006 and a B.Eng. (Electrical Engineering) with 1st Class Honours in 2007, both at the University of Sydney, Australia. In 2009, he received an M.Sc. degree in computer science from ETH Zürich, Switzerland, where he is currently working towards the Dr. sc. degree in electrical engineering.

He was the recipient of the University Medal (Sydney University) for both his B.Sc. and his B.Eng. Honours degrees. He held an Excellence Scholarship during his studies for the M.Sc. degree at ETH Zürich.

Helmut Bölcskei (M’98–SM’02–F’09) was born in Mödling, Austria, on May 29, 1970, and received the Dipl.-Ing. and Dr. techn. degrees in electrical engineering from Vienna University of Technology, Vienna, Austria, in 1994 and 1997, respectively. In 1998, he was with Vienna University of Technology. From 1999 to 2001, he was a postdoctoral researcher in the Information Systems Laboratory, Department of Electrical Engineering, and in the Department of Statistics, Stanford University, Stanford, CA. He was in the founding team of Iospan Wireless Inc., a Silicon Valley-based startup company (acquired by Intel Corporation in 2002) specialized in multiple-input multiple-output (MIMO) wireless systems for high-speed Internet access, and was a co-founder of Celestrius AG, Zürich, Switzerland. From 2001 to 2002, he was an Assistant Professor of Electrical Engineering at the University of Illinois at Urbana-Champaign. He has been with ETH Zürich since 2002, where he is Professor of Electrical Engineering. He was a visiting researcher at Philips Research Laboratories Eindhoven, The Netherlands, ENST Paris, France, and the Heinrich Hertz Institute Berlin, Germany. His research interests are in information theory, mathematical signal processing, and applied and computational harmonic analysis.

He received the 2001 IEEE Signal Processing Society Young Author Best Paper Award, the 2006 IEEE Communications Society Leonard G. Abraham Best Paper Award, the 2010 Vodafone Innovations Award, and the ETH “Golden Owl” Teaching Award, is a Fellow of the IEEE and a EURASIP Fellow, and was an Erwin Schrödinger Fellow (1999–2001) of the Austrian National Science Foundation (FWF). He was a plenary speaker at several IEEE conferences and served as an associate editor of the IEEE TRANSACTIONS ON INFORMATION THEORY, the IEEE TRANSACTIONS ON SIGNAL PROCESSING, the IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, and the EURASIP Journal on Applied Signal Processing. He is currently editor-in-chief of the IEEE TRANSACTIONS ON INFORMATION THEORY and serves on the editorial boards of Foundations and Trends in Networking and of the IEEE SIGNAL PROCESSING MAGAZINE. He was TPC co-chair of the 2008 IEEE International Symposium on Information Theory and served on the Board of Governors of the IEEE Information Theory Society.