

A Survey on Compressive Sensing and Applications

René M. A. Teixeira (first-year doctoral student: 48-107410)
Dept. of Information and Communication Engineering
Aizawa Yamasaki Laboratory, The University of Tokyo

Abstract—Compressive sensing, sometimes referred to as compressed sampling, is a recent breakthrough in information theory and related areas. It is an elegant technique that combines information theory and statistics in order to perfectly, or nearly perfectly, reconstruct a sampled signal. The recovery is achieved even though far fewer samples are taken than the Shannon-Nyquist theorem would require. In this paper we introduce the theory behind compressive sensing and survey different works regarding its applications and related algorithms, especially those related to signal reconstruction, which currently represents the bottleneck.

I. Introduction

The traditional sampling theorem, as established by Claude Shannon [19] and Harry Nyquist [15], states that if we want to discretely sample a given function and perfectly recover its original “continuous” shape afterwards, the sampling frequency must be at least twice the highest frequency component of the original signal. This has been the basis of much of the progress in electronics, telecommunications and related fields.

In spite of the correctness of the sampling theorem, it is not efficient in all cases. Often the result of the sampling process is a huge amount of data that is inconvenient to process or store. For this reason, compression is usually applied to the data in order to reduce their volume without losing too much significant information.

Here lies the paradox of the traditional method: too much data is collected and too much data is discarded. There should be a way of acquiring only the information that is useful, and this is the kernel of compressive sensing.

A common example is the case of digital photographs. We can interpret each pixel as one measurement: the more measurements, the bigger the image gets. Yet during the compression step many of those pixels turn out to be irrelevant, so ideally they should not have been sampled in the first place.

Compressive sensing (CS), or compressed sampling, performs a sub-Nyquist sampling of the target signal. The number of samples is much smaller than the one specified by traditional sampling; in the worst case, as many samples as in traditional sampling are obtained. Its development started with imaging processes, but it is now an important theory in different fields. This paper will focus on image/video applications, as they represent the current area of interest.

The paper is organized as follows: Section II overviews the development of compressive sensing and the history behind it. Section III gives the mathematical rationale as well as an overview of the basic algorithms. In Section IV we review previous works related to compressive sensing itself and its applications, with a focus on image and video processing.

II. Overview

A maxim that has held over the years is that the more data you get, the better. Examples are pictures taken with huge numbers of pixels or oversampled music. It is better to have an abundance of information than a lack of it.

The trouble is that getting more information implies transmitting more data and needing larger containers to store it. And one very important detail is that once the data is sampled, it is often compressed with some sort of algorithm. The meaning of this paradigm is that you acquire a lot of data, but you also discard a lot of data afterwards.

A precursor of the compressive sensing technique dates back to 1970 [10], when seismologists were able to recover complete images of underground scans in the absence of a complete sampling set that would satisfy the Shannon-Nyquist criterion. The possibilities of the new technique were not clear at the time, mainly due to its lack of mathematical formalization.

However, in 2004 the sampling paradigm was about to change, based on the works of Emmanuel Candès of Caltech, Terence Tao of the University of California at Los Angeles, Justin Romberg of Georgia Tech and David Donoho of Stanford University.

They were able to show, with due formalism, that a signal can be completely restored from an incomplete set of sampled data as long as certain constraints are satisfied. These constraints concern the minimum number of measurements of the original signal, how the measurement is conducted and the recovery method utilized.

III. Formalization

The works of Baraniuk [1] and Romberg [17] are used here as references for the theory of compressive sensing due to their introductory character. Please refer to Donoho [6] for the proofs of the theorems.


Let x be a signal in an N-dimensional space R^N which is sparse in some basis. A K-sparse vector is one with K elements different from zero; the u × u identity matrix, for instance, has sparsity K = u. Signals of interest in compressive sensing have K ≪ N, where N is the size of the signal vector.

The vector x can be represented in a basis Ψ that expands to Ψ := [ψ1|ψ2| . . . |ψN]. Equations 1 and 2 show the representation of a signal in a different basis:

x = Σ_{i=1}^{N} s_i ψ_i    (1)

or, equivalently, in matrix notation:

x = Ψs    (2)

where s_i is the inner product¹:

s_i = ⟨x, ψ_i⟩ = ψ_iᵀ x    (3)

A well-known transformation is the Fourier basis, where each ψ_i represents a frequency component.

We now have a signal x with a sparse representation s, but we are still missing a method to sample this signal. If we define a measurement matrix Φ, it can be used to generate a compressively sampled signal following the equation:

y = Φx = ΦΨs = Θs    (4)
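To make the pipeline concrete, here is a minimal Python/NumPy sketch of equation (4), assuming Ψ is an orthonormal inverse-DCT basis and Φ an i.i.d. Gaussian matrix; all sizes and variable names are illustrative, not taken from the surveyed papers:

```python
import numpy as np
from scipy.fft import idct

rng = np.random.default_rng(0)
N, K, M = 256, 8, 64            # signal length, sparsity, number of measurements

# K-sparse coefficient vector s in the DCT domain
s = np.zeros(N)
s[rng.choice(N, K, replace=False)] = rng.standard_normal(K)

# Psi: orthonormal inverse-DCT basis, so x = Psi @ s is sparse in the DCT domain
Psi = idct(np.eye(N), axis=0, norm="ortho")
x = Psi @ s

# Phi: i.i.d. Gaussian measurement matrix; y = Phi x = Theta s (equation 4)
Phi = rng.standard_normal((M, N)) / np.sqrt(M)
Theta = Phi @ Psi
y = Phi @ x
assert np.allclose(y, Theta @ s)
```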

Although abstract, Figures 1 and 2 represent this process. In the former, each of the matrices is laid out, as well as each of the signals involved. In the latter we have the same representation in terms of the compressive sensing matrix Θ = ΦΨ. The highlighted vertical rectangles represent the desired number of measurements M ≈ K.

Fig. 1. Compressive sensing measurement with a Gaussian matrix. Source: [1].

Two elements are still missing: the definition of the matrix Φ and a way to recover the signal. The design of Φ should guarantee that the important (salient) information of the original signal, an image in our case, is correctly sampled. This is important to ensure correct recovery.

The reconstruction of the signal is just a matter of linear algebra: from the vector y we must recover the vector x. The problem is that in most cases we end up with more variables than equations, creating an under-determined system. That would not be a problem if we knew the exact location of each non-zero element. Supposing the feasibility of this approach, a sufficient condition for the system to have a stable solution is that, for any such vector v:

1 − ε ≤ ‖Θv‖₂ / ‖v‖₂ ≤ 1 + ε    (5)

¹The symbol ᵀ denotes the (Hermitian) transpose operator.

Fig. 2. Compressive sensing measurement in terms of the matrix Θ. Source: [1].

Unfortunately this approach is not practical, as it is not possible to know in advance the position of each non-zero element in the original signal. As a solution, Candès et al. [4] showed that a stable Θ matrix should obey the restricted isometry property (RIP).

An alternative is to create a measurement matrix Φ that is incoherent with Ψ. Although that can be a difficult problem in general, in compressive sensing the measurement matrix is simply defined as a random matrix following a Gaussian distribution. With high probability, such a matrix is incoherent with any fixed basis Ψ, for instance with the basis of delta spikes Ψ = I. Under this property, Θ = ΦI = Φ satisfies the RIP condition and we are able to recover the original signal if M ≥ cK log(N/K) ≪ N, where c is a small constant.
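A quick numerical sanity check of this claim (a sketch under the same illustrative assumptions as above; the constant c = 4 is our arbitrary choice, not a value from the literature): for random K-sparse vectors v, the ratio in equation (5) concentrates around 1.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K = 256, 8
M = int(4 * K * np.log(N / K))      # M >= cK log(N/K), with c = 4 chosen arbitrarily

# Theta: i.i.d. Gaussian, normalized so that E||Theta v||^2 = ||v||^2
Theta = rng.standard_normal((M, N)) / np.sqrt(M)

ratios = []
for _ in range(1000):
    v = np.zeros(N)
    v[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
    ratios.append(np.linalg.norm(Theta @ v) / np.linalg.norm(v))

# Both extremes stay within a modest epsilon of 1, as in equation (5)
print(min(ratios), max(ratios))
```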

A. Sparsity and compressibility

It is not always possible to determine the sparsity of a vector, due to the difficulty of finding the basis in which the sparsity occurs. It is well known that images are compressible in the frequency domain; see Figures 3 and 4. The former shows the result of a discrete cosine transform and the latter the result of a wavelet transform. It is observable that at high frequencies the amplitudes of the components are very low, and they are usually ignored. Standard algorithms in which this happens are the JPEG and JPEG2000 compression algorithms for images, and MPEG-2 and MPEG-4/H.264 for video.

Despite the hardships imposed by the determination of sparsity in the optical domain, it is easy to do through domain transformations. For this purpose, it is assumed that a compressible vector is also approximately sparse, in the sense that it holds similar properties that are useful for CS.

Another interesting point is that images that are sparse in the domain of random samples do not need sparsifying transforms. Hence, the decoding process is faster, as noted by Pudlewski and Melodia [16].

Fig. 3. DCT of an image. Green regions represent low-energy frequency components whereas red regions represent high-energy frequencies.

Fig. 4. Wavelet transform. Blue areas represent very low energywavelet components.

B. Signal reconstruction

Until now we have covered the sampling process, but without a way to rebuild the original signal the whole process becomes irrelevant.

Let us first define the lp-norm; Equation 6 shows its generalized form. For the special case p = 2 we have the well-known Euclidean norm (l2-norm). For p = 1 we have the l1-norm, also known as the Manhattan or Taxicab norm. Although not a true norm, the l0-norm, or zero norm, is defined as the number of non-zero elements in a vector.

‖x‖_p = ( Σ_i |x_i|^p )^{1/p}    (6)
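In code these norms are one-liners; the following NumPy sketch (the helper names are ours) also shows the l0 “norm” as a simple count of non-zero entries:

```python
import numpy as np

def lp_norm(x: np.ndarray, p: float) -> float:
    """Generalized lp-norm of equation (6); p = 2 is Euclidean, p = 1 Manhattan."""
    return float(np.sum(np.abs(x) ** p) ** (1.0 / p))

def l0_norm(x: np.ndarray) -> int:
    """Not a true norm: the number of non-zero elements."""
    return int(np.count_nonzero(x))

x = np.array([0.0, 3.0, 0.0, -4.0])
print(lp_norm(x, 2), lp_norm(x, 1), l0_norm(x))   # 5.0, 7.0, 2
```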

The reconstruction is an optimization problem in which the sparsest solution is sought. The traditional way of solving this kind of under-determined system has been l2-norm minimization:

s = argmin ‖s′‖₂  subject to  Θs′ = y    (7)

which has the closed-form solution:

s = Θᵀ(ΘΘᵀ)⁻¹ y    (8)
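The following sketch evaluates equation (8) directly (NumPy; sizes and names are illustrative) and demonstrates the problem discussed next: the constraints are satisfied, but the minimum-energy solution is dense rather than sparse.

```python
import numpy as np

rng = np.random.default_rng(2)
M, N, K = 64, 256, 8
Theta = rng.standard_normal((M, N))

s_true = np.zeros(N)
s_true[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
y = Theta @ s_true

# Minimum-l2-norm solution of Theta s' = y (equation 8)
s_l2 = Theta.T @ np.linalg.inv(Theta @ Theta.T) @ y

print(np.allclose(Theta @ s_l2, y))    # True: the constraints hold ...
print(np.count_nonzero(s_l2))          # ... but the solution is dense (all N entries non-zero)
print(np.linalg.norm(s_l2 - s_true))   # and far from the true K-sparse vector
```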

Although very fast, this solution is almost never correct for K-sparse signals: a solution is found, but it is hardly ever the sparsest one. The norm that gives the best solution is the l0-norm. It is able to recover the sparsest signal, but at the cost of being extremely complex: the resulting program is intractable and is believed to belong to the class of NP-hard problems.

On the other hand, the convex optimization problem using the l1-norm can recover the signal if M ≥ cK log(N/K) for random Gaussian measurements. To this end, a linear program called basis pursuit is used; see [6] for reference.
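Basis pursuit can be cast as a linear program by introducing auxiliary variables t with −t ≤ s′ ≤ t and minimizing Σ t. A sketch using scipy.optimize.linprog follows; this is our own illustrative formulation, not the solver used in [6]:

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(Theta: np.ndarray, y: np.ndarray) -> np.ndarray:
    """min ||s'||_1 subject to Theta @ s' = y, as an LP over z = [s; t]:
    minimize sum(t) subject to -t <= s <= t and Theta @ s = y."""
    M, N = Theta.shape
    c = np.concatenate([np.zeros(N), np.ones(N)])   # objective: sum of t
    I = np.eye(N)
    A_ub = np.block([[I, -I], [-I, -I]])            # s - t <= 0 and -s - t <= 0
    b_ub = np.zeros(2 * N)
    A_eq = np.hstack([Theta, np.zeros((M, N))])     # Theta @ s = y
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=y,
                  bounds=[(None, None)] * N + [(0, None)] * N)
    return res.x[:N]
```

With Theta, y and s_true as in the previous sketch, basis_pursuit(Theta, y) recovers s_true up to solver tolerance whenever M is large enough.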

Figure 5 shows the geometrical interpretation of l1-norm recovery and why it works. The geometry of the l1-norm can be seen as the left-most ball in the figure; it is clearly anisotropic. In the middle picture, α0 is a sparse vector and the line H is the set of all α that produce the same measurements as α0. The intersection of the l1 ball and the line H is the solution of the minimization problem, as it is the point of H with the minimum l1-norm.

Let us now consider the l2-norm, which leads to a least-squares problem. As shown in the right-most picture, the intersection of the l2 ball and H need not be sparse; in fact, the probability of it being sparse is very low.

Figure 6 shows a comparison of image recovery using the method described in this section, as shown in Romberg [17]. The blue line is the recovery error versus the number of measurements for linear DCT acquisition. The red line shows the same type of result using compressive imaging (the method of this section). The green line shows the result for a method called DCT imaging augmented with total-variation minimization; please refer to the paper for a complete explanation of the methods. Despite the impressive results shown in the figure, compressive methods are still very slow, as will be seen in Section IV.

IV. Related Works

A. Compressive sensing algorithms

Before reviewing applications of the compressive sensing paradigm, we introduce some of the works aimed at improving the compressive sensing framework itself. Works that fall into this category have mostly dealt with the reconstruction algorithm, focusing on aspects like reducing the number of required samples and the computational time.

One solid development was made by Gilbert et al. [9]. Their intention was the creation of a single algorithm that would satisfy four basic requirements: “the measurement ensemble succeeds for all signals”; an optimal number of measurements; polynomial running time in the amount of data but poly-logarithmic in the signal length; and, finally, robustness to errors. The newly developed method, named HHS Pursuit (Heavy Hitters on Steroids), requires a minimum number of measurements of (m/ε²) polylog(d/ε), with a running time of (m²/ε⁴) polylog(d/ε).


Fig. 5. Geometry of the reconstruction. Source: [17]

Fig. 6. Coded imaging simulation. Source: [17]

As seen in Section III-B, the first methods used in the reconstruction of CS-encoded signals were based on the geometry of the problem. Another class of algorithms relies on combinatorial methods. Berinde et al. [3] proposed a method that combines both classes of solutions. To this end they generalized the restricted isometry property (RIP) to the lp-norm, for p ≈ 1.

Although very good results were obtained regarding the number of measurements and the error tolerance, the encoding time remained similar to previous works and, as expected, decoding was just a matter of linear programming (LP).

Following the same line, Baraniuk et al. [2], one of the creators of the original theory, proposed a model-based approach to compressive sensing. The merit of this extension to their original work is the introduction of a framework that can be used to develop new decoding algorithms with performance guarantees.

The insight is that different types of signals, which imply different patterns of sparsity, require different types of approaches instead of one generic program. In a sense, this is the opposite of the idea behind previous works, such as the ones presented earlier in this section.

Most of the analysis is conducted by comparing the model-based recovery method to an algorithm called CoSaMP, an iterative recovery algorithm introduced by Needell and Tropp [14] that provides the “same guarantees as the best optimization-based approaches”. Despite the high efficiency of CoSaMP, it was outperformed in most of the test cases by the model-based methods. Figure 7 shows examples of the test results. The top left image is the original piecewise smooth HeaviSine test signal of length N = 1024 to be recovered; it is compressible in a connected wavelet-tree model. The top right shows the result of the CoSaMP algorithm. The bottom left is standard l1-norm minimization via linear programming. Finally, the bottom right image shows the result of the recovery using a model-based version of CoSaMP.

Fig. 7. Comparison of the recovery of a piecewise smooth signal using different methods (test signal, CoSaMP, l1-norm minimization, model-based recovery). Source: [2].

B. Applications

In this section we survey reference works that utilize the theory of compressive sensing. Although CS has been applied to many different fields of science, herein we focus on image and video processing. For in-depth information and detailed results, the reader is kindly asked to refer to the respective paper of interest.

1) Single-pixel Imaging Via Compressive Sampling: The best known example of CS in action is the prototype of the “single-pixel camera” created by Duarte et al. [7]. The operation can be described as follows: the light from the target object passes through the lens and reaches an array of micro-mirrors (DMD). Each micro-mirror can either reflect the light towards the photo-sensor or away from it, and each mirror is controlled by a random function. Although there are many mirrors, there is only one photo-sensor (a photodiode), so the “image” captured by the photo-sensor at each instant is a random combination of samples created by the mirrors. The sensor signal is digitized and transmitted for later reconstruction. Please see Figure 8 for the schematic of the camera; Figure 9 shows an image captured by the single-pixel camera.
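In software, this acquisition can be mimicked as follows (a sketch only; the scene size, measurement count and 0/1 pattern statistics are illustrative stand-ins for the actual DMD hardware of [7]):

```python
import numpy as np

rng = np.random.default_rng(4)
scene = rng.random((64, 64))        # stand-in for the optical image
x = scene.ravel()                   # N = 4096 "pixels"
M = x.size // 50                    # roughly 50x sub-Nyquist, in the spirit of Figure 9

# Each row is one random mirror configuration: 1 = reflect toward the
# photodiode, 0 = reflect away. Each reading is one inner product with the scene.
patterns = rng.integers(0, 2, size=(M, x.size)).astype(float)
y = patterns @ x                    # M photodiode readings
```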

Fig. 8. Schematic of the single-pixel camera. Source: [1].

Fig. 9. Conventional image (black and white 256 × 256 px) on theleft. On the right side, the resulting image captured by the single-pixel camera with 1300 measurements (50× sub-Nyquist). Source:[7].

2) Compressive Imaging of Color Images: Nagesh and Li [13] extended the concept of the single-pixel camera to support color images. The schematic can be seen in Figure 10.

As a first idea, one might be tempted to implement three versions of the single-pixel camera, one for each color channel (red, green and blue, RGB), and then combine the outputs. The problem with this approach is that it does not consider the redundancies that exist between the channels. To avoid this, Nagesh and Li [13] used a Bayer color filter for the acquisition of the channels, followed by a joint RGB reconstruction scheme.

Fig. 10. Color single-pixel camera. Source: [13].

3) Compressive Imaging for Video Representation and Coding: In this work, Wakin et al. [22] proposed algorithms and hardware for compressed video sampling. Based on the single-pixel camera, their system “directly acquires random projections of the light field without first collecting the pixels/voxels”.

Although substantially similar to the previous works dealing with the single-pixel camera, this one innovates with the concept of three-dimensional sparsity of the video. As a video stream can be seen as a succession of 2D frames, we get a 3D structure once the temporal dimension is considered. The authors used a 3D wavelet basis to represent the video stream.

One disadvantage of this method is the assumption that each frame is not very different from its predecessor; in practical terms it means that the events depicted in the video change very slowly. In general, the simulation results showed better reconstruction using 3D wavelets than 2D wavelets.

4) Compressive Rendering: A Rendering Application of Compressed Sensing: Although not directly related to video processing, the work of Sen and Darabi [18] has interesting results that can be used in 3D video acquisition and processing. They proposed an “application of compressed sensing by using it to accelerate ray-traced rendering in a manner that exploits the sparsity of the final image in the wavelet basis”.

The problem with rendering is very similar to the fundamental problem that led to the development of CS: the final image can be compressed with any desired algorithm, but there is no reason to spend processing power rendering every pixel if many of them are going to be discarded. In this work, only a portion of the pixels was rendered; the remaining ones were merely estimated.

As observed by the authors, the objective of the work was not to improve the speed of ray-tracing but to obtain better quality for the final image with the given samples. The algorithm executes in two basic steps. In the first, the scene is rendered as usual, but using only a subset of k pixels instead of all of them, under the assumption that pixels are independent from each other. In the second step, the ray-traced pixels are used as the observation vector y, from which the full image is reconstructed using the regularized orthogonal matching pursuit (ROMP) algorithm.

Results showed that this method is faster than two of the best rendering techniques (Delaunay interpolation and ACT) and requires between 5 and 10% fewer samples. Compressive rendering also outperformed the other methods in image regions with sharp edges. An adaptive version was also developed; a comparison can be seen in Figure 11.

5) Image Compression and Recovery Through Compressive Sampling and Particle Swarm: While most works dealing with the reconstruction of sparse signals apply a variation of either l1 minimization or some kind of greedy algorithm, Sturgill et al. [21] used Particle Swarm Optimization (PSO) to reconstruct the sparse signal. Initial tests on synthetic signals were promising, leading the authors to test their method on real-world image signals.

Fig. 11. Compressive rendering. The top row shows the measurement locations and the bottom row shows the resulting images; the images on the right are the results after many iterations. Source: [18].

For the tests, images were split into 8 × 8 blocks and each block was separately encoded as a signal in R⁶⁴. After taking the DCT coefficients of each block, their sparsity was calculated to determine the best number of samples. The authors call this method “approximate sparsity”.

Another approach is “exact sparsity”. After taking the DCT, small coefficients are zeroed to force the signal to be k-sparse, for the desired k. The IDCT is then applied and the reconstructed image block is compressively sampled.
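A sketch of this “exact sparsity” preprocessing step, assuming SciPy's dctn/idctn and an 8 × 8 block (the function name and parameters are ours, for illustration):

```python
import numpy as np
from scipy.fft import dctn, idctn

def force_k_sparse(block: np.ndarray, k: int) -> np.ndarray:
    """Zero all but the k largest-magnitude DCT coefficients of an image block
    (ties at the threshold may leave slightly more than k survivors)."""
    coeffs = dctn(block, norm="ortho")
    threshold = np.sort(np.abs(coeffs).ravel())[-k]   # k-th largest magnitude
    coeffs[np.abs(coeffs) < threshold] = 0.0
    return idctn(coeffs, norm="ortho")

block = np.random.default_rng(3).random((8, 8))
sparse_block = force_k_sparse(block, k=10)            # now (approximately) k-sparse in DCT
```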

Although exact recovery outperformed approximate recovery, the results are still far from what is obtained, for example, with JPEG.

6) Compressive Video Sampling: Stankovic et al. [20] presented a more efficient method to perform video sampling. As in the work of the previous section (IV-B5), a DCT is applied to the initial image and the coefficients with low amplitude are zeroed. The next step determines which image blocks are candidates for compressive sensing. The premise here is that not every region of the image should be compressively sampled: compressive sampling is applied only to the blocks that are already sparse.

The initial step of the process consists in the full sampling of a reference frame. This frame is used to predict the sparsity of blocks in the successive frames; if a block is determined to be sparse, it will be compressively sampled.
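A plausible form of this per-block sparsity test, as a hedged sketch (the exact rule and thresholds used in [20] may differ; C and T are tuning parameters we introduce for illustration):

```python
import numpy as np
from scipy.fft import dctn

def is_sparse_block(block: np.ndarray, C: float, T: int) -> bool:
    """Flag a block for compressive sampling when at least T of its DCT
    coefficients have magnitude below C, i.e. the block is nearly sparse."""
    coeffs = dctn(block, norm="ortho")
    return int(np.sum(np.abs(coeffs) < C)) >= T
```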

Compressively sampled frames are reconstructed using the orthogonal matching pursuit (OMP) algorithm, which does not give the best possible results but has low complexity and can thus be used for real-time processing. The algorithm is used only for the blocks that were CS-encoded. A diagram of the process can be seen in Figure 12.
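For reference, a compact NumPy sketch of OMP (illustrative only; the paper does not show the implementation used in [20]): each of the K iterations selects the column of Θ most correlated with the residual, then re-fits by least squares on the selected support.

```python
import numpy as np

def omp(Theta: np.ndarray, y: np.ndarray, K: int) -> np.ndarray:
    """Orthogonal matching pursuit: greedy recovery of a K-sparse s from y = Theta s."""
    N = Theta.shape[1]
    residual = y.copy()
    support: list[int] = []
    s = np.zeros(N)
    for _ in range(K):
        # column most correlated with the current residual
        idx = int(np.argmax(np.abs(Theta.T @ residual)))
        if idx not in support:
            support.append(idx)
        # least-squares fit on the current support, then update the residual
        coef, *_ = np.linalg.lstsq(Theta[:, support], y, rcond=None)
        residual = y - Theta[:, support] @ coef
    s[support] = coef
    return s
```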

The authors obtained good reconstruction quality with up to 50% savings in acquisition. Also, low-complexity scenes required far fewer reference frames, increasing efficiency. Thanks to the feedback link from the decoder, this system is expected to work for on-line streaming.


Fig. 12. Compressive video sampling. Source: [20].

7) Joint Compressive Video Coding and Analysis: A classic approach to video processing can be divided into three steps: acquisition, coding and analysis. These steps used to be performed in separate stages but, after the development of CS, acquisition and coding have merged. Still, analysis remains a separate step, and the topic presented by Cossalter et al. [5] is how to merge all these stages.

The premise is that not all information in the optical domain is relevant to all applications. The paper focuses on surveillance videos, in which certain types of information (e.g. the background) are irrelevant while, conversely, motion direction and speed are probably very important. With this knowledge, the presented system does not reconstruct the whole frame, but only the foreground. Also, the decoder utilizes prior information, such as the size and position of the objects in the scene, as clues to help reduce the number of required measurements. Figure 13 summarizes the encoding and decoding process; please refer to the paper for the details of each method depicted.

Tests showed that joint compressive coding and analysis resulted in better quality of the foreground when compared to disjoint coding and analysis. Despite the good results, they are still not as good as H.264/AVC, according to the comparisons performed.

Fig. 13. Joint Compressive Video Coding and Analysis. Source: [5].

8) On the Performance of Compressive Video Streaming for Wireless Multimedia Sensor Networks: Pudlewski and Melodia [16] investigated the use of CS in wireless multimedia sensor networks (WMSN), focusing on key video parameters (quantization, samples per frame and channel encoding rate) and how each parameter affects the video quality received through the wireless network.

In their tests, they used gray-scale images sparsified through a wavelet transform. Instead of using a random acquisition system (see Section IV-B1), a scrambled block Hadamard ensemble was used [8].

During the tests, results similar to MPEG-2 and other modern codecs were obtained. One observation from the authors is that only the spatial correlation between frames was used (difference frames); therefore, the implementation of temporal correlation, motion vectors or Huffman coding could improve the results even further.

Perhaps the most noteworthy result of this work is the resistance of CS-encoded video to channel errors. The authors observed essentially zero loss in image quality for bit error rates (BER) up to 10⁻⁴ and low degradation up to 10⁻³. They also showed that forward error correction is not beneficial to CS-encoded video, and they proposed a new adaptive parity scheme.

Figure 14 shows the received Lena image under different BER conditions for this method; the bottom row shows the same BER conditions for JPEG encoding.

Fig. 14. Top: CS-encoded Lena with 10⁻⁵, 10⁻⁴ and 10⁻³ BER, respectively. Bottom: the same image encoded using JPEG, also with 10⁻⁵, 10⁻⁴ and 10⁻³ BER. Source: [16].

9) Compressive Image Fusion: Usually, in order to fuse two images into one using well-known methods, prior knowledge of the original images is required. As CS does not assume any prior knowledge of the signal being encoded, it might be useful for image fusion.

Taking advantage of this fact, Wan et al. [23] first analyzed the impact of different sampling patterns in the Fourier domain on the reconstruction process. The three sampling patterns used for the Φ matrix (Section III) can be seen in Figure 15; the white lines indicate the regions where frequencies are sampled, and different patterns generate different measurements. The best performance in the tests was achieved by the double-star-shaped pattern, due to its good balance of low and high frequencies in the Fourier domain.

The authors observed that by using about 50% fewer compressive measurements than reconstructed pixels, the results were very similar to those obtained when the full set of pixels was sampled. Unfortunately, a perceptual analysis of the resulting images was not shown.

Fig. 15. CS sampling patterns used for image fusion. From left to right: star shape, double-star shape and star-circle shape. Source: [23].

10) Compressive Coded Aperture Video Reconstruction: One of the natural applications of CS is super-resolution and its use in the reconstruction of images. Building on the preliminary super-resolution work shown in [12], Marcia and Willett [11] extended their framework to support video reconstruction. Their work relies on “a combination of coded aperture sensing and wavelet-based reconstruction algorithms”.

Two points of interest in this work are the use of masks adapted for coded apertures to support compressive sensing, and the use of the gradient projection for sparse reconstruction (GPSR) algorithm to solve the optimization problems. An interesting finding from the simulations is that, for a desired accuracy, the processing time is generally smaller when the picture block size is larger. The authors also reviewed several sparse-representation algorithms.

V. Conclusion

In this paper we reviewed a number of works related to the theory of compressive sensing and recent advances towards more efficient reconstruction algorithms, which until now represent the bottleneck of the CS process. Different works applying compressive sensing theory were also reviewed. Although the fields in which CS has been applied are diverse, we focused on works related to image and video coding.

It is possible to summarize the pros and cons of compressive sensing as follows.

Pros:
• Less bandwidth required for transmission.
• Less storage space needed.
• Very robust to channel errors.
• High potential for scalability.

Cons:
• Increased complexity of the decoder.
• Difficult to further process the data once it is compressed.

None of the works reviewed is efficient in terms of processing time; they tend to be slow compared to current “traditional” methods, which leaves room for much improvement. Given how recent the theory is compared to Shannon-Nyquist sampling, more applications and further developments are expected in both the short and the long term.

References

[1] Richard Baraniuk. Compressive sensing. Lecture notes in IEEE Signal Processing Magazine, 24(4):118–120, 2007.

[2] Richard G. Baraniuk, Volkan Cevher, Marco F. Duarte, and Chinmay Hegde. Model-based compressive sensing. 2008. URL http://arxiv.org/abs/0808.3572.

[3] R. Berinde, A. C. Gilbert, P. Indyk, H. Karloff, and M. J. Strauss. Combining geometry and combinatorics: A unified approach to sparse signal recovery. In 2008 46th Annual Allerton Conference on Communication, Control, and Computing, pages 798–805, 2008.

[4] E. J. Candès, J. Romberg, and T. Tao. Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. IEEE Transactions on Information Theory, 52(2):489–509, 2006. ISSN 0018-9448. doi: 10.1109/TIT.2005.862083.

[5] Michele Cossalter, Giuseppe Valenzise, Marco Tagliasacchi, and Stefano Tubaro. Joint compressive video coding and analysis. IEEE Transactions on Multimedia, 12(3):168–183, 2010. ISSN 1520-9210. doi: 10.1109/TMM.2010.2041105.

[6] David L. Donoho. Compressed sensing. IEEE Transactions on Information Theory, 52(4):1289–1306, 2006. ISSN 0018-9448. doi: 10.1109/TIT.2006.871582.

[7] Marco F. Duarte, Mark A. Davenport, Dharmpal Takhar, Jason N. Laska, Ting Sun, Kevin F. Kelly, and Richard G. Baraniuk. Single-pixel imaging via compressive sampling. IEEE Signal Processing Magazine, 25(2):83–91, 2008.

[8] L. Gan, T. T. Do, and T. D. Tran. Fast compressive imaging using scrambled Hadamard ensemble. In Proc. EUSIPCO, 2008.

[9] A. C. Gilbert, M. J. Strauss, J. A. Tropp, and R. Vershynin. One sketch for all: fast algorithms for compressed sensing. In Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing, pages 237–246, San Diego, California, USA, 2007. ACM. ISBN 978-1-59593-631-8. doi: 10.1145/1250790.1250824.

[10] Brian Hayes. The best bits. American Scientist, 97(4):276, 2009. ISSN 0003-0996. doi: 10.1511/2009.79.276. URL http://www.americanscientist.org/issues/num2/the-best-bits/1.

[11] R. Marcia and R. M. Willett. Compressive coded aperture video reconstruction. In Proc. European Signal Processing Conference (EUSIPCO), 2008.

[12] R. F. Marcia and R. M. Willett. Compressive coded aperture superresolution image reconstruction. In Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pages 833–836, 2008.

[13] Pradeep Nagesh and Baoxin Li. Compressive imaging of color images. In Proceedings of the 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 1261–1264. IEEE Computer Society, 2009. ISBN 978-1-4244-2353-8.

[14] D. Needell and J. A. Tropp. CoSaMP: iterative signal recovery from incomplete and inaccurate samples. Applied and Computational Harmonic Analysis, 26(3):301–321, 2009.

[15] Harry Nyquist. Certain topics in telegraph transmission theory. Proceedings of the IEEE, 90(2):280–305, 2002. ISSN 0018-9219. doi: 10.1109/5.989875.

[16] Scott Pudlewski and Tommaso Melodia. On the performance of compressive video streaming for wireless multimedia sensor networks. In Proc. of IEEE Int. Conf. on Communications (ICC), May 2010.

[17] Justin Romberg. Imaging via compressive sampling. IEEE Signal Processing Magazine, 25(2):14–20, 2008.

[18] Pradeep Sen and Soheil Darabi. Compressive rendering: A rendering application of compressed sensing. IEEE Transactions on Visualization and Computer Graphics, 2010. ISSN 1077-2626. doi: 10.1109/TVCG.2010.46.

[19] Claude E. Shannon. Communication in the presence of noise. Proceedings of the IEEE, 86(2):447–457, 1998. ISSN 0018-9219. doi: 10.1109/JPROC.1998.659497.

[20] V. Stankovic, L. Stankovic, and S. Cheng. Compressive video sampling. In Proc. 16th European Signal Processing Conference (EUSIPCO), Lausanne, Switzerland, 2008.

[21] David B. Sturgill, Benjamin Van Ruitenbeek, and Robert J. Marks. Image compression and recovery through compressive sampling and particle swarm. In Proceedings of the 2009 IEEE International Conference on Systems, Man and Cybernetics, pages 1821–1826, 2009.

[22] M. Wakin, J. N. Laska, M. F. Duarte, D. Baron, S. Sarvotham, D. Takhar, K. F. Kelly, and R. G. Baraniuk. Compressive imaging for video representation and coding. In Picture Coding Symposium, 2006.

[23] Tao Wan, N. Canagarajah, and A. Achim. Compressive image fusion. In 15th IEEE International Conference on Image Processing (ICIP), pages 1308–1311, 2008. ISSN 1522-4880. doi: 10.1109/ICIP.2008.4712003.