
Algorithm 890: Sparco: A Testing Framework for Sparse Reconstruction

EWOUT VAN DEN BERG, MICHAEL P. FRIEDLANDER, GILLES HENNENFENT, FELIX J. HERRMANN, RAYAN SAAB, and ÖZGÜR YILMAZ
University of British Columbia

Sparco is a framework for testing and benchmarking algorithms for sparse reconstruction. It includes a large collection of sparse reconstruction problems drawn from the imaging, compressed sensing, and geophysics literature. Sparco is also a framework for implementing new test problems and can be used as a tool for reproducible research. Sparco is implemented entirely in MatLab, and is released as open-source software under the GNU Public License.

Categories and Subject Descriptors: G.4 [Mathematics of Computing]: Mathematical Software—Certification and testing; E.4 [Data]: Coding and Information Theory—Data compaction and compression

General Terms: Design, Experimentation, Standardization

Additional Key Words and Phrases: Compressed sensing, sparse recovery, linear operators

ACM Reference Format:

van den Berg, E., Friedlander, M. P., Hennenfent, G., Herrmann, F. J., Saab, R., and Yılmaz, Ö. 2009. Algorithm 890: Sparco: A testing framework for sparse reconstruction. ACM Trans. Math. Softw. 35, 4, Article 29 (February 2009), 16 pages. DOI = 10.1145/1462173.1462178. http://doi.acm.org/10.1145/1462173.1462178.

1. INTRODUCTION

Sparco is a suite of problems for testing and benchmarking algorithms for sparse signal reconstruction. It is also an environment for creating new test problems; a large library of standard operators is provided from which new test problems can be assembled.

Research supported by Collaborative Research and Development grant number 334810-05 from the Natural Sciences and Engineering Research Council of Canada.
Authors' addresses: E. van den Berg, M. P. Friedlander (contact author), G. Hennenfent, F. J. Herrmann, R. Saab, and Ö. Yılmaz, University of British Columbia, Vancouver, Canada; email: [email protected].
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or direct commercial advantage and that copies show this notice on the first page or initial screen of a display along with the full citation. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, to redistribute to lists, or to use any component of this work in other works requires prior specific permission and/or a fee. Permissions may be requested from the Publications Dept., ACM, Inc., 2 Penn Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or [email protected].
© 2009 ACM 0098-3500/2009/02-ART29 $5.00 DOI: 10.1145/1462173.1462178. http://doi.acm.org/10.1145/1462173.1462178.



Recent theoretical breakthroughs on the recovery of sparse signals from incomplete information (see, e.g., Donoho [2006b], Donoho and Tanner [2005], Chen et al. [1998], Candès et al. [2006b]) have led to a flurry of activity in developing algorithms for solving the underlying recovery problem. Sparco originated from the need to extensively test the sparse-reconstruction software package SPGL1 [van den Berg and Friedlander 2008]. Because the test problems and the infrastructure needed to generate them are useful in their own right, these were gathered into a common framework that could be used to test and benchmark related algorithms.

Collections of test problem sets, such as those provided by Matrix Market [Boisvert et al. 1997], COPS [Dolan and Moré 2000], and CUTEr [Gould et al. 2003], are routinely used in the numerical linear algebra and optimization communities as a means of testing and benchmarking algorithms. These collections are invaluable because they provide a common and easily identifiable reference point. A situation in which each research group shares or contributes to a common set of test problems leads to a more transparent research process. Our aim is to provide a software package that can be used as one of the tools of reproducible research; we follow the example set by SparseLab [Donoho et al. 2007].

Sparco is implemented entirely in MatLab. It is built around a comprehensive library of linear operators that implement the main functions used in signal and image processing. These operators can be assembled to define new test problems. The toolbox is self-contained, and all external packages are optional.

The outline of this document is as follows. Section 2 gives background on the sparse signal recovery problem. Section 3 describes the problem instantiation mechanism of Sparco. The operators provided by Sparco, and an example of how to prototype an application using these operators, are given in Sections 4 and 5.

2. SPARSE SIGNAL RECOVERY

A fundamental feature of digital technology is that analog signals, such as images and audio, are stored and processed digitally. This dichotomy makes it challenging to design signal processing techniques that can find efficient digital representations of analog signals. The ubiquity of digital technology makes this a vital problem of practical interest. In the classical approach, the first step is to sample the analog signal by collecting measurements. The required resolution determines the number of needed measurements, which can be huge. The second step is source coding, namely, the quantization of the measured values followed by compression of the resulting bit-stream. Frequently, transform-coding methods are employed to compress quantized measurements. These methods reduce the dimensionality of the signal by decomposing it as a linear combination of a small number of fundamental building blocks. Mathematically, the aim is to find a nice basis for the finite, but high-dimensional, space in which the signal has a sparse representation, that is, has only a few basis coefficients that are relatively large in magnitude. This sparsity is then exploited for various purposes, including compression.

The classical two-stage approach—first sample densely and then sparsify and compress—is the basis for several applications that we use in our daily lives. The JPEG image format, for example, uses the discrete cosine basis to successfully sparsify, and thus compress, natural images. Although this approach has so far proved successful in signal processing technology, there are certain restrictions it imposes. We briefly describe these and discuss how they lead to the so-called sparse recovery problem.

Source coding and redundant dictionaries. In various settings, it is possible to obtain much sparser representations by expanding the underlying signals with respect to a redundant dictionary rather than a basis [Chen et al. 1998]. Such dictionaries can be obtained, for example, as unions of different bases (e.g., wavelet, discrete cosine transform (DCT), and Fourier) for the underlying space. In this case, signals no longer have unique representations, and finding the sparsest representation is nontrivial. In order for transform coding methods based on redundant dictionaries to constitute a practical alternative, we need computationally tractable ways of obtaining sparse expansions with respect to redundant dictionaries.

Number of measurements versus reconstruction quality. As noted before, the number of measurements required to achieve a target resolution can be very large. A 1024-by-1024 image, for example, needs 2^20 pixel values, which as a raw image would correspond to a file size in the order of megabytes. In (exploration) seismology, the situation is even more extreme because datasets are measured in terabytes.

In the compression stage, however, most of the collected data is discarded. Indeed, a typical JPEG file that stores a 1024-by-1024 image has a size in the order of hundreds of kilobytes, and the conversion to JPEG discards up to 90% of the collected data. The fact that we need to collect a huge amount of data, only to throw it away in the next stage, seems rather inefficient. Furthermore, in many settings (e.g., seismic and magnetic resonance imaging), making a large number of measurements is expensive and physically challenging, if not impossible, thus limiting the maximum achievable resolution by the corresponding digital representation. All these concerns lead to the question of whether we can possibly reconstruct the original signal from fewer than the "required" number of measurements, without compromising the quality of the digitized signal. Note that this would correspond to combining the sampling and compression stages into a single compressed sensing [Donoho 2006a] or compressive sampling [Candès 2006] stage.

2.1 The Sparse Recovery Problem

The core of the sparse recovery problem can be described as follows. Consider the linear system

b = Ax + r, (1)


where A is an m-by-n matrix, r is an unknown m-vector of additive noise, and the m-vector b is the set of observed measurements. The goal is to find an appropriate x that is a solution of (1). Typically, m < n, and this set of equations is underdetermined and has infinitely many solutions. It is well known how to find the solution of (1) that has the smallest two-norm, provided that A has full rank (e.g., see Daubechies [1992]). However, the aim in the sparse recovery problem is to find a (possibly approximate) solution that is in some sense sparse, so that either x or the gradient of the resulting image has few nonzero elements.

Some fundamental questions regarding the sparse recovery problem are: (i) Is there a unique sparsest solution of (1), particularly when r = 0? Under what conditions does this hold? (ii) Can one find a sparse solution in a computationally tractable way that is stable in the presence of noise, namely, when r ≠ 0? (iii) Is there a fast algorithm that gives a solution that is guaranteed to be sparse in some sense? (iv) How should one choose the matrix A so that the answers to (i)–(iii) are affirmative? In the following sections we present a brief overview of the literature that addresses these questions and describe different settings that lead to the sparse recovery problem.

2.2 Redundant Representations and Basis Pursuit

As discussed earlier, one of the important goals in signal processing is to obtain sparse decompositions of a signal with respect to a given redundant dictionary. This problem was addressed in Chen et al. [1998]. Using the notation in (1), the aim is to express a signal b as a linear combination of a small number of columns (atoms) from the redundant dictionary consisting of the columns of A; that is, we aim to find a sparse vector x which satisfies (1). Note that, ideally, the solution of this problem can be found by solving the optimization problem

    minimize_x  ‖x‖0  subject to  ‖Ax − b‖2 ≤ σ,     (2)

where the function ‖x‖0 counts the number of nonzero components of the vector x, and the parameter σ ≥ 0 prescribes the desired fit in the set of equations Ax ≈ b. This problem is combinatorial and NP-hard [Natarajan 1995]. Basis pursuit denoise [Chen et al. 1998] aims to find a sparse solution of (1) by replacing the discrete optimization problem (2) with the convex optimization problem

    minimize_x  ‖x‖1  subject to  ‖Ax − b‖2 ≤ σ.     (3)

Methods for solving this and closely related problems include interior-point algorithms [Chen et al. 1998; Kim et al. 2007; Candès and Romberg 2007], gradient projection [Figueiredo et al. 2007], iterative soft thresholding [Daubechies et al. 2004; Hale et al. 2007; Combettes and Wajs 2005], and duality-based root-finding [van den Berg and Friedlander 2008]. The homotopy [Osborne et al. 2000a, 2000b; Malioutov et al. 2005] and LARS [Efron et al. 2004] approaches can be used to find solutions of (3) for all σ ≥ 0. Alternatively, greedy approaches such as orthogonal matching pursuit (OMP) [Pati et al. 1993] and Stagewise OMP [Donoho et al. 2006] can be used to find sparse solutions to (1) in certain cases.
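As a concrete illustration of how such a solver is invoked, the following minimal sketch (our own, assuming the SPGL1 package is installed and on the MatLab path) solves (3) for a Sparco test problem; Sparco's function-handle operators can be passed directly to spg_bpdn.

P = generateProblem(5);          % DCT/Dirac test problem (see Section 3.1)
sigma = 1e-3;                    % misfit level in (3)
x = spg_bpdn(P.A, P.b, sigma);   % approximately solve (3) with SPGL1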

Note that, from a transform coding point of view, it is not essential that the obtained sparse solution is the unique sparsest solution. In that context, the actual signal is b, and x is only a representation of b with respect to the given redundant dictionary.

2.3 Compressed Sensing

The sparse recovery problem is also at the heart of the recently developed theory of compressed sensing. Here, the goal is to reconstruct a signal from fewer than the "necessary" number of measurements, without compromising the resolution of the end product [Donoho 2006a; Candès et al. 2006a, 2006b; Candès and Tao 2006].

Suppose a signal f ∈ R^n admits a sparse representation with respect to a basis B, namely, f = Bx where x has a few nonzero entries. Now, let M be an m × n measurement matrix, and suppose we sample f by measuring

    b = Mf.     (4)

Under what conditions on M and the level of sparsity of f with respect to B can we recover the coefficient vector x from the measurements b? Clearly, once we recover x, we also have the signal f = Bx. Using the notation of (1), this corresponds to setting A = MB. In other words, we wish to solve the optimization problem (2), where b is the possibly noisy measurement vector, σ denotes the noise level, and A = MB. Note that in this context it is essential that we recover the correct set of coefficients: the vector x denotes the coefficients of the signal f with respect to a basis, thus there is a one-to-one correspondence between the set of coefficients and the signal.
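This construction is immediate with the Sparco operators described in Section 4. A minimal sketch, assuming a Gaussian measurement ensemble and a DCT sparsity basis (operator names as in Tables III and IV):

n = 256; m = 100;
B = opDCT(n);                % sparsity basis B
M = opGaussian(m,n);         % measurement ensemble M
A = opFoG(M,B);              % A = MB
x = zeros(n,1);
x([3 17 60]) = randn(3,1);   % sparse coefficient vector
b = A(x,1);                  % noiseless measurements b = MBx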

2.3.1 Sparse Signals and Exact Reconstruction. Let r = 0 in (1), and suppose that x is sparse. For a given matrix A, if x is sufficiently sparse (i.e., ‖x‖0 is sufficiently small), then it is possible to recover x exactly by solving the convex optimization problem (3) with σ = 0. For example, Donoho and Huo [2001], Donoho and Elad [2003], and Gribonval and Nielsen [2003] derive conditions on the sparsity of x that are based on the mutual coherence of A, namely, the absolute value of the largest-in-magnitude off-diagonal element of the Gram matrix A^T A. In contrast, Candès et al. [2006a; 2006b] and Candès and Tao [2006] derive conditions on the sparsity of x that are based on the restricted isometry constants of A, which are determined by the smallest and largest singular values of certain submatrices of A.
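For an explicit matrix the mutual coherence is straightforward to compute. The following sketch is our own illustration (not part of Sparco) and assumes the columns of A have been normalized to unit norm:

A = randn(64,256);                          % explicit test matrix
A = A ./ repmat(sqrt(sum(A.^2,1)),64,1);    % normalize the columns
G = abs(A'*A);                              % Gram matrix magnitudes
mu = max(G(~eye(256)));                     % largest off-diagonal entry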

It is important to note that both families of sufficiency conditions mentioned earlier are quite pessimistic and difficult to verify for a given matrix A. Fortunately, for certain types of random matrices one can prove that these, or even stronger, conditions hold with high probability; see, for example, Candès and Tao [2006; 2005], Donoho and Tanner [2005], and Mendelson et al. [2008].

Greedy algorithms such as OMP provide an alternative for finding the sparsest solution of (1) when r = 0. When the signal is sufficiently sparse, OMP can recover a signal exactly, and the sufficient conditions in this case depend on the (cumulative) mutual coherence of A [Tropp 2004]. Interestingly, many of these greedy algorithms are purely algorithmic in nature, and do not explicitly minimize a cost function.

2.3.2 Compressible Signals and Noisy Observations. For sparse recovery techniques to be of practical interest, it is essential that they can be extended to more realistic scenarios where the signal is not exactly sparse but is instead compressible. This means that its sorted coefficients, with respect to some basis, decay rapidly. Moreover, the measurements gathered in a practical setting will typically be noisy. It is established in Candès et al. [2006b], Donoho [2006a], and Tropp [2006] that the solution of (3) approximates a compressible signal surprisingly well. In fact, the approximation obtained from m random measurements turns out to be almost as good as the approximation we would obtain if we knew the largest m coefficients of the signal. Similarly, the results on OMP can be generalized to the case where the underlying signal is compressible [Tropp 2004].

2.3.3 Applications. Motivated by the results outlined previously, compressed sensing ideas have been applied to certain problems in signal processing. Next we list some examples of such applications.

When the measurement operator M in (4) is replaced by a restriction operator, the sparse reconstruction problem can be viewed as an interpolation problem. In particular, suppose that M = RmMn, where Mn is an n × n (invertible) measurement matrix and Rm is an m × n restriction operator that removes certain rows from Mn. Equivalently, we have access only to m samples of the n-dimensional signal f, given by Mf, which we wish to interpolate to obtain the full signal f. This problem can be phrased as a sparse approximation problem, and thus the original signal can be obtained by solving (3). In exploration seismology, this approach was successfully applied by Sacchi et al. [1998], where a Bayesian method that exploits sparsity in the Fourier domain is developed, and by Hennenfent and Herrmann [2006; 2005], where sparsity of seismic data in the curvelet domain is promoted.
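In Sparco's operator language this setting is easily reproduced. A minimal sketch, taking Mn to be the identity, so that Rm simply discards samples, and assuming sparsity in a DCT basis:

n = 512; m = 300;
p = randperm(n);
idx = sort(p(1:m));          % indices of the m retained samples
R = opRestriction(n,idx);    % Rm (here Mn = I)
B = opDCT(n);                % sparsity basis
A = opFoG(R,B);              % A = RmB, as in (1)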

Compressed sensing has also led to promising new methods for image reconstruction in medical imaging, which are largely founded on results given in Candès et al. [2006a]. MRI images are often sparse in the wavelet domain, whereas angiograms are inherently sparse in image space [Lustig et al. 2007b; 2007c]. Moreover, they often have smoothly varying pixel values, which can be enforced during reconstruction by including a total-variation penalty term [Rudin et al. 1992]. In this case, the aim is to find a balance between the variation in Bx and the sparsity of x via the optimization problem

    minimize_x  ‖x‖1 + γ TV(Bx)  subject to  ‖MBx − b‖2 ≤ σ,

where the function TV(·) measures the total variation in the reconstructed signal Bx.


3. SPARCO TEST PROBLEMS

The collection of Sparco test problems covers the wide variety of sparse recovery problems just described. These problems are made available through a consistent interface, and a single gateway routine returns a problem structure that contains all components of a test problem, including the constituent operators. Sparco includes a suite of fast operators which are used to define a set of test problems of the form (1). In order to define a consistent interface to a set of test problems that covers a broad range of applications, the operator A is always formulated as the product

    A = MB.

The measurement operator M describes how the signal is sampled; the operator B describes a basis in which the signal can be sparsely represented. Examples of measurement operators M are the Gaussian ensemble and the restricted Fourier operator. Examples of sparsity bases B include wavelets, curvelets, and dictionaries that stitch together common bases.

Every operator in the Sparco framework is a function that takes a vector as an input, and returns the operator's action on that vector. As shown in Section 4, operators are instantiated using the natural dimensions of the signal, and automatically reshape input and output vectors as needed.

3.1 Generating Problems

The gateway function generateProblem is the single interface to all problems in the collection. Each problem can be instantiated by calling generateProblem with a unique problem-identifier number, and optional flags and keyword-value pairs.

P = generateProblem(id, [[key1,value1|flag1],...]);

This command has a single output structure P, which encodes all parts of the instantiated problem. The details of the problem structure are discussed in Section 3.2. The optional keyword-value pairs control various aspects of the problem formulation, such as the problem size or the noise level. In this way, a single problem definition can yield a whole family of related problem instances.

For instance, Sparco problem five generates a signal of length n from a few atoms in a DCT/Dirac dictionary. Two versions of this problem can be generated as follows.

P1 = generateProblem(5);            % Default signal length
P2 = generateProblem(5,'n',2048);   % Signal of length n = 2048

Documentation for each problem, including a list of available keyword-value pairs, the problem source, and relevant citations, can be retrieved with the following call.

generateProblem(id, 'help');


Table I. Basic Fields in a Problem Structure

Field                   Description
A, B, M                 function handles for the operators A, B, and M
b                       the observed signal b
op                      summary of all operators used
reconstruct             function handle to reconstruct a signal from coefficients in x
info                    problem information structure
signalSize              dimensions of the actual signal, b0 := Bx
sizeA, sizeB, sizeM     dimensions of the operators A, B, and M

Table II. List of Problem Sources

Seismic imaging           Hennenfent and Herrmann [2008; 2007]
Blind source separation   Wang et al. [2008]
MRI imaging               Lustig et al. [2007b], Candès et al. [2006a]
Basis pursuit             Chen et al. [1998], Donoho and Johnstone [1994]
Compressed sensing        Candès and Romberg [2007], Takhar et al. [2006]
Image processing          Takhar et al. [2006], Figueiredo et al. [2007]

Other commonly supported flags are 'show', for displaying example figures associated with the problem, and 'noseed', to suppress the initialization of the random number generators.
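For example, using problem five from before:

P = generateProblem(5,'show');     % display the example figures
P = generateProblem(5,'noseed');   % keep the current random state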

A full list of all available problem identifiers in the collection can be generated via the following command.

pList = generateProblem('list');

This is useful, for example, when benchmarking an algorithm on a series of test problems or for checking which problem identifiers are valid. The code fragment

pList = generateProblem('list');
for id = pList
  P = generateProblem(id);
  disp(P.info.title);
end

prints the names of every problem in the collection. The field info.title is common to all problem structures.

3.2 Problem Structure

A single data structure defines all aspects of a problem, and is designed to give flexible access to specific components. For the sake of uniformity, there are a small number of fields guaranteed to exist in every problem structure. These include function handles for the operators A, B, and M, the data for the observed signal b, and the dimensions of the (perhaps unobserved) actual signal b0 := Bx; these fields are summarized in Table I.

Other useful fields include op, info, and reconstruct. The field op provides a structure with all individual operators used to define A, B, and M, as well as a string that describes each operator. The field info provides background information on the problem, such as citations and associated figures. The field reconstruct provides a method that takes a vector of coefficients x and returns a matrix (or vector) b0 that has been reshaped to correspond to the original signal.

Many of the problems also contain the fields signal, which holds the target test problem signal or image; noise, which holds the noise vector r (see (1)); and, if available, x0, which holds the true expansion of the signal in the sparsity basis B. See Table II for a list of problem sources.
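Together, these fields keep a typical benchmarking loop short. A minimal sketch, in which mySolver is a hypothetical placeholder for any sparse recovery solver:

P = generateProblem(6);            % instantiate a test problem
x = mySolver(P.A, P.b);            % hypothetical solver call
b0 = P.reconstruct(x);             % map coefficients to the signal domain
err = norm(b0(:) - P.signal(:));   % error with respect to the target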

4. SPARCO OPERATORS

At the core of the Sparco architecture is a large library of linear operators. Where possible, specialized code is used for fast evaluation of matrix-vector multiplications. Once an operator has been created, such as

D = opDCT(128);

matrix-vector products with the created operator can be accessed as follows.

y = D(x,1);   % gives y := Dx
x = D(y,2);   % gives x := D^T y

A full list of the basic operators available in the Sparco library is given inTables III and IV.

MatLab classes can be used to overload the operations commonly used for matrices so that the objects in that class behave exactly like explicit matrices. Sparco provides a mechanism for overloading MatLab's built-in matrix operator. The function classOp takes a Sparco operator op as input and returns a MatLab matrix-like object for which the main matrix-vector operations are defined. For example, we have the following.

C = classOp(op);            % create matrix-like object C from op
C = classOp(op,'nCprod');   % use global counter vector nCprod

In its second form, the classOp function accepts an optional string argument and creates a global variable that keeps track of the number of multiplications with C and C^T; the variable can be accessed from MatLab's base workspace. The following example illustrates the use of classOp to create a matrix-like object.

F = opFFT(128);   % create an FFT operator
G = classOp(F);   % convert to a matrix-like object
g1 = F(y,2);      % gives g1 := F^T y
g2 = G'*y;        % gives g2 := G^T y ≡ F^T y

The vectors g1 and g2 are identical.
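The counter is useful, for example, for comparing the cost of different solvers. A sketch, under the assumption that nCprod accumulates the number of products with C and its transpose:

F = opFFT(128);
C = classOp(F,'nCprod');   % counter created in the base workspace
y = C*randn(128,1);        % one product with C
z = C'*y;                  % one product with C'
disp(nCprod);              % inspect the accumulated counts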

4.1 Meta-Operators

Several tools are available for conveniently assembling more complex operators from the basic operators. The seven meta-operators opFoG, opSum, opDictionary, opStack, opTranspose, opBlockDiag, and opKron take one or more Sparco operators or matrix-like objects as input, and assemble them into a single operator.

Table III. The Operators in the Sparco Library

MatLab function    Description
opBinary           binary (0/1) ensemble
opBlockDiag        compound operator with operators on the diagonal
opBlur             two-dimensional blurring operator
opColumnRestrict   restriction on matrix columns
opConvolve1d       one-dimensional convolution operator
opCrop             crop part of operator
opCurvelet2d       two-dimensional curvelet operator
opDCT              one-dimensional discrete cosine transform
opDiag             scaling operator
opDictionary       compound operator with operators abutted
opDirac            identity operator
opFFT              one-dimensional FFT
opFFT2d            two-dimensional FFT
opFFT2C            centralized two-dimensional FFT
opFoG              subsequent application of a set of operators
opGaussian         Gaussian ensemble
opHaar             one-dimensional Haar wavelet transform
opHaar2d           two-dimensional Haar wavelet transform
opHadamard         Hadamard matrix
opHeaviside        Heaviside matrix operator
opKron             Kronecker product of two operators
opMask             vector entry selection mask
opMatrix           wrapper for matrices
opPadding          pad and unpad operators equally around each side
opReal             discard imaginary components
opRestriction      vector entry restriction
opSign             sign-ensemble operator
opStack            vertical dictionary
opSum              summation of operators
opToepGauss        Toeplitz matrix with Gaussian entries
opToepSign         Toeplitz matrix with ±1 entries
opTranspose        transpose of operator
opWavelet          wavelet operator
opWindowedOp       overcomplete windowed operator

H = opFoG(A1,A2,...);          % H := A1 * A2 * ... * An
H = opSum(A1,A2,...);          % H := A1 + A2 + ... + An
H = opDictionary(A1,A2,...);   % H := [A1 | A2 | ... | An]
H = opStack(A1,A2,...);        % H := [A1' | A2' | ... | An']'
H = opBlockDiag(A1,A2,...);    % H := diag(A1, A2, ..., An)
H = opKron(A1,A2);             % H := A1 ⊗ A2
H = opTranspose(A);            % H := A'

An eighth meta-operator, opWindowedOp, is a mixture between opDictionary and opBlockDiag in which blocks can partially overlap. Unlike opDictionary and opBlockDiag, opWindowedOp only takes a single operator as input (and optionally a diagonal weighting matrix to apply to each block).

Table IV. Operators Grouped by Type

Operator type    MatLab function
Ensembles        opBinary, opSign, opGaussian
Selection        opMask, opColumnRestrict, opRestriction, opCrop, opPadding
Matrix           opDiag, opDirac, opMatrix, opToMatrix
Fast operators   opBlur, opCurvelet, opConvolve1d, opCurvelet2d, opDCT, opFFT,
                 opFFT2d, opFFT2C, opHaar, opHaar2d, opHadamard, opHeaviside,
                 opToepGauss, opToepSign, opWavelet
Meta operators   opBlockDiag, opDictionary, opFoG, opKron, opStack, opSum,
                 opWindowedOp
Nonlinear        opReal, opSplitComplex

4.2 Ensemble Operators and General Matrices

The three ensemble operators (see Table IV) can be instantiated by simply specifying their dimensions and a mode that determines the normalization of the ensembles. Unlike the other operators in the collection, the ensemble operators can be instantiated as explicit matrices (requiring O(mn) storage), or as implicit operators. When instantiated as implicit operators, the random number seeds are saved, and rows and columns are generated on the fly during multiplication, requiring only O(n) storage for the normalization coefficients.
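For example, a Gaussian ensemble might be instantiated as follows; the exact values of the mode argument are assumptions here, and are documented in the operator's help text.

Ge = opGaussian(100,256,1);   % one mode gives an explicit matrix
Gi = opGaussian(100,256,2);   % another generates rows on the fly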

4.3 Selection Operators

There are five selection operators: opRestriction, opMask, opColumnRestrict, opCrop, and opPadding. In forward mode, the restriction operator selects certain entries from the given vector and returns a correspondingly shortened vector. In contrast, the mask operator evaluates the dot-product with a binary vector, thus zeroing out the entries instead of discarding them, and returns a vector of the same length as the input vector. The column restriction operator treats the vector as the representation of a matrix and discards blocks of rows corresponding to columns in the matrix. The padding operator embeds a given operator in a larger operator consisting of all zeros. The crop operator does the opposite and extracts a range of rows and columns of a given operator to form a new operator.
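The difference between restriction and masking is easiest to see on a small vector. A minimal sketch:

x = (1:8)';                      % test vector
R = opRestriction(8,[1 3 5]);    % keep entries 1, 3, and 5
M = opMask([1;0;1;0;1;0;0;0]);   % zero out the other entries
yR = R(x,1);                     % [1; 3; 5]                (length 3)
yM = M(x,1);                     % [1; 0; 3; 0; 5; 0; 0; 0] (length 8)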

4.4 Fast Operators

Sparco also provides support for operators with a special structure for which fast algorithms are available. Such operators in the library include the Fourier, discrete cosine, wavelet, and two-dimensional curvelet transforms; Hadamard and random Toeplitz matrices; and one-dimensional convolution of a signal with a kernel.

Fig. 1. Plots of the explicit matrices for: (a) the partial Fourier measurement matrix; and (b) the DCT-Dirac dictionary.

As an example, the following code generates a partial 128 × 128 Fourier measurement operator (F), a masked version with 30% of the rows randomly zeroed out (M), and a dictionary (B) consisting of an FFT and a scaled Dirac basis (D) with d_ii = 0.1 (see also Figure 1).

m = 128;                                % Operator size
D = opDiag(m,0.1);                      % Diagonal operator D
F = opFFT(m);                           % Fourier operator F
M = opFoG(opMask(rand(m,1) < 0.7),F);   % Masked version of F
B = opDictionary(F,D);                  % B = [F D]

4.5 Matrix Operators and Utilities

For general matrices there are three operators: opDirac, opDiag, and opMatrix. The Dirac operator coincides with the identity matrix of the desired size. Diagonal matrices can be generated using opDiag, which takes either a size and a scalar, or a vector containing the diagonal entries. General matrix operators can be created using opMatrix with a matrix as an argument. This last operator is useful if explicit matrices need to be combined with other operators from the Sparco library.

The opToMatrix utility function has the opposite functionality of opMatrix: it takes as input an implicit linear operator, and forms and returns an explicit matrix. Figure 1 shows the results of using this utility function on the operators M and B defined in Section 4.4.

Mexplicit = opToMatrix(M); imagesc(Mexplicit);

Bexplicit = opToMatrix(B); imagesc(Bexplicit);

5. FAST PROTOTYPING USING SPARCO OPERATORS

In this section we give an example of how Sparco operators can be used for fast prototyping of ideas and for solving real-life problems. This example, whose source code can be found in $SPARCO/examples/exSeismic.m, inspired test problem 901.


Fig. 2. Seismic wavefield reconstruction: (a) input data with missing traces; and (b) interpolated data using SPGL1 to solve a sparse recovery problem in the curvelet domain.

Fig. 3. Prototyping a seismic imaging application using the Sparco operators.

The set of observed measurements is a spatiotemporal sampling at the Earth's surface of the reflected seismic waves caused by a seismic source. Due to practical and economic constraints, the spatial sampling is incomplete (Figure 2(a)). The data needs to be interpolated before further processing can be done to obtain an image of the Earth's subsurface. A successful approach to the seismic wavefield reconstruction problem is presented in Hennenfent and Herrmann [2006] and Herrmann and Hennenfent [2008], where the authors form a sparse recovery problem in the curvelet domain [Candès et al. 2007] that is approximately solved using a solver for large-scale one-norm regularized least squares. In this example, we want to form the same sparse recovery problem and solve it using SPGL1. Figure 3 shows the code to load the data, set up the operators, and solve the interpolation problem.
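The essential steps are sketched below. This is our own reconstruction of the formulation (the actual script is $SPARCO/examples/exSeismic.m); the direction convention assumed for the curvelet operator and the misfit level are assumptions. Here D holds the measured data (time samples by traces) and idx indexes the recorded samples.

[nt,nr] = size(D);                % time samples by traces
R = opRestriction(nt*nr,idx);     % sampling: keep the recorded entries
C = opCurvelet2d(nt,nr);          % curvelet transform
B = opTranspose(C);               % synthesis: signal = Bx (assumed)
A = opFoG(R,B);                   % A = RB, as in (1)
b = R(D(:),1);                    % observed (incomplete) data
x = spg_bpdn(A,b,1e-3*norm(b));   % sparse recovery with SPGL1
f = reshape(B(x,1),nt,nr);        % interpolated wavefield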

6. CONCLUSIONS

Sparco is organized as a flexible framework, providing test problems for sparse signal reconstruction as well as a library of operators and tools. The problem suite currently contains 26 problems and 34 operators. We will continue to add more problems in the near future. Contributions from the community are very welcome!

ACKNOWLEDGMENTS

The authors would like to thank M. Lustig and S. Wright, who kindly contributed code: M. Lustig contributed code from the SparseMRI toolbox [Lustig et al. 2007a], and S. Wright contributed code from test problems packaged with the GPSR solver [Figueiredo et al. 2007]. We also thank M. Saunders and S. Wright for courageously testing early versions of Sparco, and for their thoughtful feedback. We extend thanks to an anonymous referee and to T. Hopkins for their valuable comments. We are delighted to acknowledge the open-source package Rice Wavelet Toolbox [Baraniuk et al. 1993], which was invaluable in developing Sparco.

This research was carried out as part of the SINBAD project with support secured through the Industry Technology Facilitator and from the following organizations: BG Group, BP, Chevron, ExxonMobil, and Shell.

REFERENCES

BARANIUK, R., CHOI, H., FERNANDES, F., HENDRICKS, B., NEELAMANI, R., RIBEIRO, V., ROMBERG, J., GOPINATH, R., GUO, H.-T., LANG, M., ODEGARD, J. E., AND WEI, D. 1993. Rice wavelet toolbox. http://www.dsp.rice.edu/software/rwt.shtml.

VAN DEN BERG, E. AND FRIEDLANDER, M. P. 2008. Probing the Pareto frontier for basis pursuit solutions. SIAM J. Sci. Comput. 31, 2, 890–912.

BOISVERT, R., POZO, R., REMINGTON, K., BARRETT, R., AND DONGARRA, J. 1997. Matrix Market: A Web resource for test matrix collections. In The Quality of Numerical Software: Assessment and Enhancement, R. F. Boisvert, Ed. Chapman & Hall, London, 125–137.

CANDÈS, E. J. 2006. Compressive sampling. In Proceedings of the International Congress of Mathematicians.

CANDÈS, E. J., DEMANET, L., DONOHO, D. L., AND YING, L.-X. 2007. CurveLab. http://www.curvelet.org/.

CANDÈS, E. J. AND ROMBERG, J. 2007. ℓ1-magic. http://www.l1-magic.org/.

CANDÈS, E. J., ROMBERG, J., AND TAO, T. 2006a. Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information. IEEE Trans. Inf. Theory 52, 2, 489–509.

CANDÈS, E. J., ROMBERG, J., AND TAO, T. 2006b. Stable signal recovery from incomplete and inaccurate measurements. Commun. Pure Appl. Math. 59, 8, 1207–1223.

CANDÈS, E. J. AND TAO, T. 2005. Decoding by linear programming. IEEE Trans. Inf. Theory 51, 12, 4203–4215.

CANDÈS, E. J. AND TAO, T. 2006. Near-optimal signal recovery from random projections: Universal encoding strategies? IEEE Trans. Inf. Theory 52, 12, 5406–5425.

CHEN, S. S., DONOHO, D. L., AND SAUNDERS, M. A. 1998. Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 20, 1, 33–61.

COMBETTES, P. L. AND WAJS, V. R. 2005. Signal recovery by proximal forward-backward splitting. Multiscale Model. Simul. 4, 4, 1168–1200.

DAUBECHIES, I. 1992. Ten Lectures on Wavelets. SIAM, Philadelphia, PA.

DAUBECHIES, I., DEFRISE, M., AND DE MOL, C. 2004. An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Commun. Pure Appl. Math. 57, 1413–1457.

DOLAN, E. D. AND MORÉ, J. J. 2000. Benchmarking optimization software with COPS. Tech. rep. ANL/MCS-246, Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL. Revised January 2001.

DONOHO, D. L. 2006a. Compressed sensing. IEEE Trans. Inf. Theory 52, 4, 1289–1306.

DONOHO, D. L. 2006b. For most large underdetermined systems of linear equations the minimal ℓ1-norm solution is also the sparsest solution. Commun. Pure Appl. Math. 59, 6, 797–829.

DONOHO, D. L. AND ELAD, M. 2003. Optimally sparse representation in general (nonorthogonal) dictionaries via ℓ1 minimization. Proc. Natl. Acad. Sci. USA 100, 5, 2197–2202.

DONOHO, D. L. AND HUO, X. 2001. Uncertainty principles and ideal atomic decomposition. IEEE Trans. Inf. Theory 47, 7, 2845–2862.

DONOHO, D. L. AND JOHNSTONE, I. M. 1994. Ideal spatial adaptation by wavelet shrinkage. Biometrika 81, 3, 425–455.

DONOHO, D. L., STODDEN, V. C., AND TSAIG, Y. 2007. SparseLab. http://sparselab.stanford.edu/.

DONOHO, D. L. AND TANNER, J. 2005. Sparse nonnegative solution of underdetermined linear equations by linear programming. Proc. Natl. Acad. Sci. USA 102, 27, 9446–9451.

DONOHO, D. L., TSAIG, Y., DRORI, I., AND STARCK, J.-L. 2006. Sparse solution of underdetermined linear equations by stagewise orthogonal matching pursuit. Tech. rep. 2006-2, Department of Statistics, Stanford University. April.

EFRON, B., HASTIE, T., JOHNSTONE, I., AND TIBSHIRANI, R. 2004. Least angle regression. Ann. Statist. 32, 2, 407–499.

FIGUEIREDO, M., NOWAK, R., AND WRIGHT, S. J. 2007. Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems. IEEE J. Selected Topics Signal Process. 1, 4, 586–597.

GOULD, N. I. M., ORBAN, D., AND TOINT, PH. L. 2003. CUTEr and SifDec: A constrained and unconstrained testing environment, revisited. ACM Trans. Math. Softw. 29, 4, 373–394.

GRIBONVAL, R. AND NIELSEN, M. 2003. Sparse representations in unions of bases. IEEE Trans. Inf. Theory 49, 12, 3320–3325.

HALE, E., YIN, W., AND ZHANG, Y. 2007. A fixed-point continuation method for ℓ1-regularized minimization with applications to compressed sensing. Tech. rep. TR07-07, Department of Computational and Applied Mathematics, Rice University, Houston, TX.

HENNENFENT, G. AND HERRMANN, F. J. 2005. Sparseness-constrained data continuation with frames: Applications to missing traces and aliased signals in 2/3-D. In SEG International Exposition and 75th Annual Meeting.

HENNENFENT, G. AND HERRMANN, F. J. 2006. Application of stable signal recovery to seismic interpolation. In SEG International Exposition and 76th Annual Meeting.

HENNENFENT, G. AND HERRMANN, F. J. 2007. Random sampling: New insights into the reconstruction of coarsely-sampled wavefields. In SEG International Exposition and 77th Annual Meeting.

HENNENFENT, G. AND HERRMANN, F. J. 2008. Simply denoise: Wavefield reconstruction via jittered undersampling. Geophys. 73, 3, V19–V28.

HERRMANN, F. J. AND HENNENFENT, G. 2008. Non-parametric seismic data recovery with curvelet frames. Geophys. J. Int. 173, 233–248.

KIM, S.-J., KOH, K., LUSTIG, M., BOYD, S., AND GORINEVSKY, D. 2007. An interior-point method for large-scale ℓ1-regularized least squares. IEEE J. Selected Topics Signal Process. 1, 4, 606–617.

LUSTIG, M., DONOHO, D. L., AND PAULY, J. M. 2007a. Sparse MRI Matlab code. http://www.stanford.edu/~mlustig/SparseMRI.html.

LUSTIG, M., DONOHO, D. L., AND PAULY, J. M. 2007b. Sparse MRI: The application of compressed sensing for rapid MR imaging. Magnetic Resonance Med. 58, 6, 1182–1195.

LUSTIG, M., DONOHO, D. L., SANTOS, J. M., AND PAULY, J. M. 2007c. Compressed sensing MRI. IEEE Signal Process. Mag. 25, 2, 72–82.

MALIOUTOV, D., ÇETIN, M., AND WILLSKY, A. S. 2005. A sparse signal reconstruction perspective for source localization with sensor arrays. IEEE Trans. Signal Process. 53, 8, 3010–3022.

MENDELSON, S., PAJOR, A., AND TOMCZAK-JAEGERMANN, N. 2008. Uniform uncertainty principle for Bernoulli and subGaussian ensembles. Constructive Approximation. To appear.

NATARAJAN, B. K. 1995. Sparse approximate solutions to linear systems. SIAM J. Comput. 24, 2, 227–234.

OSBORNE, M. R., PRESNELL, B., AND TURLACH, B. A. 2000a. On the Lasso and its dual. J. Comput. Graph. Statist. 9, 319–337.

OSBORNE, M. R., PRESNELL, B., AND TURLACH, B. A. 2000b. A new approach to variable selection in least squares problems. IMA J. Numer. Anal. 20, 3, 389–403.

PATI, Y. C., REZAIIFAR, R., AND KRISHNAPRASAD, P. S. 1993. Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition. In Proceedings of the 27th Annual Asilomar Conference on Signals, Systems and Computers, vol. 1, 40–44.

RUDIN, L., OSHER, S., AND FATEMI, E. 1992. Nonlinear total variation based noise removal algorithms. Physica D 60, 259–268.

SACCHI, M. D., ULRYCH, T. J., AND WALKER, C. J. 1998. Interpolation and extrapolation using a high-resolution discrete Fourier transform. IEEE Trans. Signal Process. 46, 1, 31–38.

TAKHAR, D., LASKA, J. N., WAKIN, M., DUARTE, M., BARON, D., SARVOTHAM, S., KELLY, K. K., AND BARANIUK, R. G. 2006. A new camera architecture based on optical-domain compression. In Proceedings of the IS&T/SPIE Symposium on Electronic Imaging: Computational Imaging, vol. 6065.

TROPP, J. A. 2004. Greed is good: Algorithmic results for sparse approximation. IEEE Trans. Inf. Theory 50, 10, 2231–2242.

TROPP, J. A. 2006. Just relax: Convex programming methods for identifying sparse signals in noise. IEEE Trans. Inf. Theory 52, 3, 1030–1051.

WANG, D., SAAB, R., YILMAZ, Ö., AND HERRMANN, F. 2008. Bayesian wavefield separation by transform-domain sparsity promotion. Geophys. To appear.

Received October 2007; accepted June 2008
