Foundations of Smart Sensing: Compressive Sensing
MSc in Statistics for Smart Data - ENSAI
Aline Roumy, Inria Rennes (France)
December 2018
1/ 66
Outline
1. Part 1 - Why compressive sensing?
2. Part 2 - Compressive sensing: how does it work?
   Notations (Reminder)
   Sensing process
   Reconstruction process
   Good sensing matrices? First insights
   Compressive sensing vs other schemes
3. Part 3 - Compressive sensing: why does it work?
   A magic tool: concentration inequality
   Conditions for exact reconstruction with P1
   Restricted isometry property
   RIP ⇒ NSP ⇔ recovery
   Concentration ⇒ RIP and condition on m
   Gaussian ⇒ Concentration
4. Part 4 - Compressive sensing: what is it good for?
2/ 66
Organization
• This course is the second and last part of the course Foundations of Smart Sensing.
• Teachers:
  Nancy Bertin (CNRS/IRISA, Panama): audio and acoustics
  Cedric Herzet (Inria, Fluminance): geophysics
  Aline Roumy* (Inria, Sirocco): image and video
3/ 66
Course schedule (tentative)
Compressive sensing (CS): a self-sufficient course with a lot of connections to sparse approximations
• Dec 5: Lecture (CS: how it works + why it works )
• Dec 6: Lecture (CS: what it is good for) + Lab
• Dec 7: Lab, summaries
Quiz (within the course), no grade.
• connect to socrative.com and login as student
• room name ALINER
• you can put your name, but if you prefer you can use any other name
4/ 66
Course grading
• Quiz: (within course), no grade.
• Project:
  - Computer lab using Matlab, R, Python...
  - Write a report on the lab activities (max 2 pages of comments, excluding proofs and figures); deadline TO BE DISCUSSED; send the pdf file + Matlab (or other) code files via email to [email protected]
  - You will get a grade from the evaluation of your report and from the lab session.
• Final Exam:
  - written exam (Dec 13th) covering the whole course
  - 2 hours
5/ 66
Course material
• G. Kutyniok, Theory and Applications of Compressed Sensing,GAMM Mitteilungen 36 (2013), 79-101.
• S. Foucart, Notes on compressed sensing, 2009. (pdf)
For more information:
• S. Foucart, H. Rauhut, A Mathematical Introduction to Compressive Sensing, Birkhäuser, 2013. Chapters 6 (RIP) and 9 (RIP probability).
• M.A. Davenport, M.F. Duarte, Y.C. Eldar, G. Kutyniok, Introduction to compressed sensing. In Compressed Sensing: Theory and Applications, Y.C. Eldar and G. Kutyniok eds., Dekker, 2012. (pdf)
6/ 66
Outline
1. Part 1 - Why compressive sensing?
2. Part 2 - Compressive sensing: how does it work?
   Notations (Reminder)
   Sensing process
   Reconstruction process
   Good sensing matrices? First insights
   Compressive sensing vs other schemes
3. Part 3 - Compressive sensing: why does it work?
   A magic tool: concentration inequality
   Conditions for exact reconstruction with P1
   Restricted isometry property
   RIP ⇒ NSP ⇔ recovery
   Concentration ⇒ RIP and condition on m
   Gaussian ⇒ Concentration
4. Part 4 - Compressive sensing: what is it good for?
7/ 66
Part 1 - Why compressive sensing?
8/ 66
Classical: sampling + compression
Compressive sensing: sampling AND compression
9/ 66
Classical: Sampling + compression
Acquisition (sampling/sensing)
• For instance, HD video: 1920x1080 = 2.07 Mpixels/image, 25 images/s, 12 (= 8+2+2) bits/pixel
→ 0.6 Gbit/s
Compression
• For instance, MPEG-5 (2013): 0.6 Gbit/s → 2 Mbit/s, i.e. a compression ratio of 300:1, or 0.6 Gbit/s → 0.04 bit/pixel
Compressive sensing: can we acquire less data in the first place?
To answer this question, let us first look at the classical sampling + compression technique.
10/ 66
Sampling: (1) Fourier representation
The frequency viewpoint (Fourier 1822): signals can be built as a sum of harmonic functions.
[Figures: Wikipedia; irenevigueguix.wordpress.com]

x(t): signal;  X(f): Fourier coefficient;  e^{2πift}: harmonic function

Analysis:   X(f) = ∫_{−∞}^{+∞} x(t) e^{−2πift} dt
Synthesis:  x(t) = ∫_{−∞}^{+∞} X(f) e^{2πift} dt
11/ 66
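To make the analysis/synthesis pair concrete, here is a minimal Python sketch (an illustrative example, not part of the course material) that uses the DFT as the discrete analogue of the continuous formulas; the signal and frequencies are arbitrary choices.

import numpy as np

n, fs = 256, 256.0                     # number of samples and sampling frequency (Hz), illustrative
t = np.arange(n) / fs
x = 2.0 * np.sin(2 * np.pi * 5 * t) + 0.5 * np.cos(2 * np.pi * 40 * t)

X = np.fft.fft(x)                      # analysis: coefficients on the harmonics e^{2*pi*i*f*t}
x_back = np.fft.ifft(X).real           # synthesis: rebuild the signal from its coefficients

print(np.allclose(x, x_back))          # True: analysis followed by synthesis is lossless
freqs = np.fft.fftfreq(n, d=1/fs)
print(sorted(set(np.abs(freqs[np.argsort(np.abs(X))[-4:]]))))   # dominant frequencies: [5.0, 40.0]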
Sampling: (2) optimal sampling rate
Nyquist–Shannon (Whittaker–Kotelnikov–Shannon) sampling theorem:
"Exact reconstruction of a continuous-time signal from discrete samples is possible if the signal is bandlimited and the sampling frequency is greater than twice the highest frequency."
Sampling below this rate introduces aliasing.
[Figure, courtesy Mike Davies (IDCOM, University of Edinburgh): the subspace of bandlimited signals inside the signal space, a sampling operator mapping it to the observation space, and a linear reconstruction operator mapping back.]
12/ 66
Sampling: (3) optimal sampling rate
Sampling below this rate introduces aliasing, i.e. signal ambiguity.
[Figures: SVI.nl; Wikipedia]
13/ 66
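A tiny numerical illustration of aliasing (values chosen for illustration only): a 9 Hz sinusoid sampled at 10 Hz, i.e. below its Nyquist rate of 18 Hz, produces exactly the same samples as a 1 Hz sinusoid.

import numpy as np

fs = 10.0                              # sampling frequency (Hz), below 2 x 9 Hz
t = np.arange(0, 2, 1 / fs)            # 2 seconds of samples
x_fast = np.cos(2 * np.pi * 9 * t)     # 9 Hz signal
x_slow = np.cos(2 * np.pi * 1 * t)     # 1 Hz signal

print(np.allclose(x_fast, x_slow))     # True: the two signals are indistinguishable from their samples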
Compression: (1) image decomposition
with the 2D discrete cosine transform (DCT) (orthogonal basis)
1. Split the image into blocks of size N1 × N2.
2. For each N1 × N2 image block (x_{n1,n2}), compute the N1 × N2 block of the transformed image (c_{k1,k2}) with

   c_{k1,k2} = Σ_{n1=0}^{N1−1} Σ_{n2=0}^{N2−1} x_{n1,n2} cos[ (π/N1) (n1 + 1/2) k1 ] cos[ (π/N2) (n2 + 1/2) k2 ]

   where the basis function is Φ_{n1,n2}(k1, k2) = cos[ (π/N1) (n1 + 1/2) k1 ] cos[ (π/N2) (n2 + 1/2) k2 ].

Example: 8x8 DCT transform. The top-left matrix is (Φ_{n1,n2}(k1 = 0, k2 = 0))_{n1,n2}.
Quiz 1, 2, 3
14/ 66
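As a sanity check of the formula above, here is a small Python sketch (illustrative, not the lab code) that computes the transform of one 8x8 block directly from the definition and compares it with scipy's DCT-II; the factor 4 accounts for scipy's unnormalized convention (a factor 2 per dimension).

import numpy as np
from scipy.fft import dctn

N1, N2 = 8, 8
rng = np.random.default_rng(0)
x = rng.normal(size=(N1, N2))                      # one image block x_{n1,n2}

n1, n2 = np.arange(N1), np.arange(N2)
c = np.empty((N1, N2))
for k1 in range(N1):
    for k2 in range(N2):
        # basis function Phi_{n1,n2}(k1,k2) = cos[pi/N1 (n1+1/2) k1] * cos[pi/N2 (n2+1/2) k2]
        phi = np.outer(np.cos(np.pi / N1 * (n1 + 0.5) * k1),
                       np.cos(np.pi / N2 * (n2 + 0.5) * k2))
        c[k1, k2] = np.sum(x * phi)                # c_{k1,k2} = <x, Phi(k1,k2)>

print(np.allclose(4 * c, dctn(x, type=2)))         # True: same transform up to scipy's scaling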
Compression: (1) image decomposition
Left: image; Right: discrete cosine transform of image
Key concept: few degrees of freedom in the transform domain
15/ 66
Compression: (2) image approximation
s-term approximation: keep the s coefficients ĉ_s with largest absolute value and set x̂ = Φ^{−1} ĉ_s.
Left: 1% kept; Right: 5% kept
Quiz 4
16/ 66
Can we sample signals at the “Information Rate”?
Yes, we can: compressive sensing
Wikipedia.
E. J. Candes and T. Tao, 2005, “Decoding by linear programming”
Wikipedia.
D. L. Donoho, 2006, “Compressed sensing”
When compressing a signal we typically take lots of samples, move to a transform domain, and then throw most of the coefficients away! Can we just sample what we need?
Yes! . . . and more surprisingly we can do this non-adaptively.
17/ 66
Compressive sensing overview
Observe x ∈ Rn via m measurements, with m ≪ n. More precisely, y = Mx where y ∈ Rm.
Assumptions:
- the signal is approximately s-sparse
- use approximately m ≥ O(s log(n/s)) random projections as measurements
- reconstruct by a nonlinear mapping
[Figure, courtesy Mike Davies (IDCOM, University of Edinburgh): compressed sensing assumes a compressible set of signals, observed through a random projection and recovered by a nonlinear approximation; many practical algorithms with guaranteed performance exist, e.g. l1 minimization, OMP, CoSaMP, IHT.]
THIS is the course SUMMARY!!!
18/ 66
Outline
1. Part 1 - Why compressive sensing?
2. Part 2 - Compressive sensing: how does it work?
   Notations (Reminder)
   Sensing process
   Reconstruction process
   Good sensing matrices? First insights
   Compressive sensing vs other schemes
3. Part 3 - Compressive sensing: why does it work?
   A magic tool: concentration inequality
   Conditions for exact reconstruction with P1
   Restricted isometry property
   RIP ⇒ NSP ⇔ recovery
   Concentration ⇒ RIP and condition on m
   Gaussian ⇒ Concentration
4. Part 4 - Compressive sensing: what is it good for?
19/ 66
Part 2 - Maths of compressive sensing - how does it work?
Notations (Reminder)
20/ 66
Norms

Definition (lp-norm)
The lp-norm of x ∈ Rn is defined as
  ||x||_p = ( Σ_{i=1}^n |x_i|^p )^{1/p}   for p ∈ [1, ∞)
  ||x||_∞ = max_i |x_i|   for p = ∞
If p < 1, the definition is still valid, but the triangle inequality is not satisfied ⇒ quasi-norm.

Definition (inner product)
  ⟨x, z⟩ = z^T x = Σ_{i=1}^n x_i z_i

See the textbook by F&R for the extension to Cn.
21/ 66
Definition (support and l0-norm)
The support of a vector x is the index set of its non-zero entries, i.e.
  supp(x) = {j ∈ [n] : x_j ≠ 0}, where [n] = {1, 2, ..., n}
The l0-norm of x is defined as
  ||x||_0 = card( supp(x) )
||x||_0 counts the number of non-zero entries of x. ||·||_0 is not even a quasi-norm.
22/ 66
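A few one-line numerical illustrations of these definitions, as a Python sketch (the vector is arbitrary):

import numpy as np

x = np.array([0.0, 3.0, 0.0, -4.0, 0.0, 1.0])

def lp_norm(x, p):
    # ||x||_p = (sum_i |x_i|^p)^(1/p) for p in [1, inf); max_i |x_i| for p = inf
    return np.max(np.abs(x)) if np.isinf(p) else np.sum(np.abs(x) ** p) ** (1.0 / p)

support = np.flatnonzero(x)            # supp(x): indices of the non-zero entries
l0 = support.size                      # ||x||_0 = card(supp(x))

print(lp_norm(x, 1), lp_norm(x, 2), lp_norm(x, np.inf))   # 8.0  ~5.10  4.0
print(support, l0)                                         # [1 3 5]  3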
Sparsity definition
Definition (s-sparse)
A signal x ∈ Rn is said to be s-sparse if it has at most s non-zero entries, i.e. ||x||_0 ≤ s.
Definition (Σs)
We define Σs as the set containing all s-sparse signals, i.e.Σs = {x ∈ Rn : ||x ||0 ≤ s}.
Note 1: Sparsity is a highly nonlinear model (Σs is not a linear space)
Quiz 1
Note 2: in many practical cases, x is not sparse itself, but it has a sparse representation in some basis Φ. We still say that x is s-sparse, with the understanding that we can write x = Φu with ||u||_0 ≤ s.
23/ 66
Approximate sparsity
• A sparse signal can be represented exactly by giving the positions and values of its s nonzero components
• Real-world signals are rarely exactly sparse. We need to
  - generalize the definition from "sparse" to "compressible" signals,
  - describe the representation error, i.e. the error incurred by representing just s components of the signal.
24/ 66
Best s-term approximation
The best s-term approximation picks the s components that minimize the representation error
Definition (best s-term approximation)
For p > 0, the lp-error incurred by the best s-term approximation to a vector x ∈ Rn is given by
  σ_s(x)_p = min_{x̂ ∈ Σ_s} ||x − x̂||_p
• If x ∈ Σs , then σs(x)p = 0 for any p.
25/ 66
Compressible signal
Optimal strategy to compute the best s-term approximation: thresholding
• Reorder the elements of x by decreasing magnitude
• Pick the first s elements, set all others to zero.
Definition (compressible signal)
A signal x ∈ Rn is said to be compressible if the error of its best s-term approximation decays quickly in s, i.e. if ∃ C1, q > 0 such that |x_i| ≤ C1 i^{−q}, when the coefficients have been ordered such that |x_1| ≥ |x_2| ≥ ... ≥ |x_n|.
26/ 66
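A minimal Python sketch of the best s-term approximation by thresholding and of the error σ_s(x)_p, on a synthetic compressible vector with |x_i| ~ i^{-q} (size and q are illustrative):

import numpy as np

def best_s_term(x, s):
    # keep the s entries of largest magnitude, set all others to 0
    x_hat = np.zeros_like(x)
    keep = np.argsort(np.abs(x))[-s:]
    x_hat[keep] = x[keep]
    return x_hat

def sigma_s(x, s, p):
    # sigma_s(x)_p = min over s-sparse x_hat of ||x - x_hat||_p, attained by thresholding
    return np.linalg.norm(x - best_s_term(x, s), ord=p)

rng = np.random.default_rng(0)
n, q = 1000, 1.5
x = rng.permutation(np.arange(1, n + 1) ** (-q))   # compressible: sorted magnitudes decay as i^{-q}

for s in (10, 50, 200):
    print(s, sigma_s(x, s, 2))                     # the approximation error decays quickly with s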
Sparsity support
Suppose x ∈ Rn. Let S ⊂ [n] and Sc = [n] \ S.
• S: sparsity support of x, i.e. the locations of the nonzero coefficients of x
• Sc: set of locations of the 0 coefficients
• S for a compressible signal: set of locations of the coefficients belonging to the best s-term approximation of x
Notation
xS vector obtained by setting the entries of x indexed by Sc to 0.
MS matrix obtained by setting the columns of M indexed by Sc to 0.
• The same notation is used to denote vectors/matrices where the elements/columns have been removed, instead of being set to 0
27/ 66
Outline
1. Part 1 - Why compressive sensing?
2. Part 2 - Compressive sensing: how does it work?
   Notations (Reminder)
   Sensing process
   Reconstruction process
   Good sensing matrices? First insights
   Compressive sensing vs other schemes
3. Part 3 - Compressive sensing: why does it work?
   A magic tool: concentration inequality
   Conditions for exact reconstruction with P1
   Restricted isometry property
   RIP ⇒ NSP ⇔ recovery
   Concentration ⇒ RIP and condition on m
   Gaussian ⇒ Concentration
4. Part 4 - Compressive sensing: what is it good for?
28/ 66
Part 2 - Maths of compressive sensing - how does it work?
Sensing process
29/ 66
Compressive sensing
Goal of compressive sensing (CS): achieve results similar to the best s-term approximation using a nonadaptive acquisition/encoder.
Definition (CS)
Let x ∈ Rn be a signal. Let M ∈ Rm×n, with m < n, be a sensing matrix. We take linear measurements of the signal as
  y = Mx
with y ∈ Rm. Also, y_j = M_j x, where M_j is the j-th row of M.
FIRST, assume that x is sparse.
30/ 66
Sensing process
[Figure: the sensing process y = Mx; a short, fat m × n matrix M maps the long vector x to the short measurement vector y.]
• How do we reconstruct x from y , given M?
31/ 66
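A minimal Python sketch of the sensing step only (the sizes and the Gaussian sensing matrix are illustrative choices, not prescribed by the definition):

import numpy as np

rng = np.random.default_rng(0)
n, m, s = 128, 32, 4

x = np.zeros(n)
support = rng.choice(n, size=s, replace=False)
x[support] = rng.normal(size=s)                # s-sparse signal in R^n

M = rng.normal(size=(m, n)) / np.sqrt(m)       # sensing matrix, one row M_j per measurement
y = M @ x                                      # y_j = M_j x

print(y.shape)                                 # (32,): m numbers describe an n-dimensional signal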
Part 2 - Maths of compressive sensing - how does it work?
Reconstruction process
32/ 66
Reconstruction
Now we have taken measurements
y = Mx ,
and want to reconstruct x from y , given knowledge of M .
• M ∈ Rm×n is not square ⇒ not invertible
• Underdetermined system of equations ⇒ infinitely many solutions
• Which one should we pick out of all z’s satisfying y = Mz?
33/ 66
Least squares solution and pseudoinverse

Tall matrix + full column rank: A is injective (one-to-one); overdetermined system (at most 1 solution)
  x̂ = argmin_z ||y − Az||_2 = A^+ y = (A^T A)^{−1} A^T y
  P_{Im(A)} y = ŷ = A A^+ y
Recall: used in OMP

Fat matrix + full row rank: A is surjective (onto); underdetermined/overcomplete system (infinitely many solutions)
  x̂ = argmin_{z : y = Az} ||z||_2 = A^+ y = A^T (A A^T)^{−1} y
34/ 66
What's wrong with the pseudoinverse?
Example: x ∈ R128×1, s = 3, m = 10, M ∈ R10×128 i.i.d. Gaussian
[Figure: the 3-sparse vector x (left) and the minimum-l2-norm estimate M^+ y (right); the pseudoinverse solution satisfies the measurements but spreads its energy over all 128 entries, so it is not sparse at all.]
35/ 66
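Reproducing the spirit of this example in Python (same sizes as the slide; the random seed and scaling are arbitrary): the minimum-l2-norm solution fits the measurements exactly but is not sparse.

import numpy as np

rng = np.random.default_rng(0)
n, m, s = 128, 10, 3

x = np.zeros(n)
x[rng.choice(n, s, replace=False)] = rng.normal(size=s)
M = rng.normal(size=(m, n))                    # i.i.d. Gaussian sensing matrix
y = M @ x

x_pinv = np.linalg.pinv(M) @ y                 # = M^T (M M^T)^{-1} y for a full-row-rank M
print(np.count_nonzero(x), np.count_nonzero(np.abs(x_pinv) > 1e-10))   # 3 vs 128
print(np.allclose(M @ x_pinv, y))              # True: it does satisfy the measurements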
Minimum l0-norm solution
x̂ = argmin_{z ∈ Rn} ||z||_0   subject to   Mz = y

Complexity?
• The problem is non-convex
• The problem is NP-hard: for a given s, try all (n choose s) possible supports, estimate the s nonzero values of x, check if the constraint is satisfied
  ⇒ infeasible for practical problem sizes
36/ 66
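For a toy size, the combinatorial search behind the l0 problem can be written down directly; this Python sketch (illustrative sizes) already solves (20 choose 3) = 1140 small least-squares problems, which is exactly the blow-up that makes the approach infeasible in practice.

import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n, m, s = 20, 10, 3

x = np.zeros(n)
x[rng.choice(n, s, replace=False)] = rng.normal(size=s)
M = rng.normal(size=(m, n))
y = M @ x

x_hat = None
for T in combinations(range(n), s):            # try every support of size s
    cols = list(T)
    z_T, *_ = np.linalg.lstsq(M[:, cols], y, rcond=None)
    if np.linalg.norm(M[:, cols] @ z_T - y) < 1e-10:   # constraint Mz = y satisfied?
        x_hat = np.zeros(n)
        x_hat[cols] = z_T
        break

print(np.allclose(x_hat, x))                   # True for this toy example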
Practical philosophies
x̂ = argmin_{z ∈ Rn} ||z||_0   subject to   Mz = y

• Greedy algorithms: focus on ||x||_0
• Thresholding algorithms: focus on y ≈ Mx
• Convex relaxation algorithms: solve a nicer problem

Courtesy N. Bertin
37/ 66
Convex relaxation: minimum l1-norm
Basis Pursuit (BP)
x̂ = argmin_{z ∈ Rn} ||z||_1   subject to   Mz = y

• This problem has polynomial complexity, O(n^3)
• It can be solved using linear programming techniques ⇒ efficient solvers available
• Solutions to the l1 problem still tend to be rather sparse
• Equivalence of solutions between the l1 and l0 problems? (See N. Bertin's course)
38/ 66
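Basis Pursuit can indeed be written as a linear program by splitting z = u − v with u, v ≥ 0. A minimal Python sketch using scipy's generic LP solver (sizes, seed and the Gaussian sensing matrix are illustrative; a dedicated l1 solver would be used in practice):

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, m, s = 128, 40, 5

x = np.zeros(n)
x[rng.choice(n, s, replace=False)] = rng.normal(size=s)   # s-sparse signal
M = rng.normal(size=(m, n)) / np.sqrt(m)                   # Gaussian sensing matrix
y = M @ x

# min sum(u) + sum(v)  subject to  M(u - v) = y,  u >= 0, v >= 0   (i.e. min ||z||_1 s.t. Mz = y)
c = np.ones(2 * n)
A_eq = np.hstack([M, -M])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * n))
x_hat = res.x[:n] - res.x[n:]

print(np.linalg.norm(x - x_hat))               # ~0: exact recovery up to solver tolerance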
Signal sparse in transform domain
Real signals are rarely directly sparse... but rather sparse in a transform domain.
[Figure: original image (left) and its DCT coefficients, i.e. the original image in the transform domain (right).]
39/ 66
Signal sparse in transform domain
Let us assume that
• x is not sparse
• x sparse in a transform domain: ∃Φ s.t. x = Φu, u sparse
Then the sensing problem is exactly the same.
• We sense y = Mx , with x nonsparse
Since y = Mx = MΦu, with u sparse, reconstruction becomes
• "Solve" y = MΦu for u:
    û = argmin_{z ∈ Rn} ||z||_1   subject to   MΦz = y
• We obtain x̂ = Φû
• Make sure that MΦ (and not M) is a “good” sensing matrix
40/ 66
Signal sparse vs signal sparse in a transform domain

x sparse:
  SENSING: y = Mx
  RECONSTRUCTION: x̂ = argmin_{z ∈ Rn} ||z||_1 subject to Mz = y

x = Φu, u sparse:
  SENSING: y = Mx
  RECONSTRUCTION: û = argmin_{z ∈ Rn} ||z||_1 subject to MΦz = y, then x̂ = Φû

In conclusion, sparse vs sparse in a transform domain: the same!
• same sensing
• similar reconstruction problem
41/ 66
Part 2 - Maths of compressive sensing - how does it work?
Good sensing matrices? First insights
42/ 66
Sensing process
[Figure: the sensing process y = Mx; a short, fat m × n matrix M maps the long vector x to the short measurement vector y.]
• How should we choose a "good" matrix M with m ≪ n?
43/ 66
Sensing matrices that are not good
[Figure: a sparse signal x and a sparse measurement row M_k whose non-zero entries miss those of x.]
The vector y is all zero!
→ If x is sparse, M must be non-sparse
→ We need M to be "different" from x ⇒ incoherence
44/ 66
Part 2 - Maths of compressive sensing - how does it work?
Compressive sensing vs other schemes
45/ 66
Linear approximation
• Linear transform (DCT, wavelet, ...)
• Approximate the coefficients in a linear way: retain certain fixed coefficients
• Send the non 0 coefficients
+ Send only transform coefficients (support is known)
- Reconstruction not efficient (not all vectors have same supports)
||x − x̂||_2 = D requires R = 64 s_LA (if 64 bits are used per coefficient)
46/ 66
Nonlinear approximation
• Linear transform (DCT, wavelet, ...)
• Approximate the coefficients in a nonlinear way: retain the coefficients that best approximate x (best s-term approximation)
• Send support and values of transform coefficients
+ Adaptive representation, Φ(S)
- Computation not efficient (acquire a lot of coeff and set many to 0)
||x − x̂||_2 = D requires R = 64 s_NLA + s_NLA log2 n, with s_NLA ≤ s_LA
46/ 66
Sparse approximation
• Approximate the coefficients in a nonlinear way (best s-term approximation)
• Send support and values of transform coefficients
+ Adaptive representation, D(S)
+/- Less non 0 coeff but higher cost for position (D fat)
- High complexity at encoder
||x − x̂||_2 = D requires R = 64 s_SA + s_SA log2 d, with s_SA ≪ s_NLA ≤ s_LA and d ≫ n
46/ 66
Compressive sensing
• Acquire directly “few” numbers in a nonadaptive way
+ Universal acquisition
+ Low complexity at encoder
- High complexity at decoder
? Can universal acquisition perform as well as an adaptive model?
→ Surprisingly yes, to some extent
||x − x̂||_2 = D requires R = 64 m
46/ 66
More on Compressive sensing (CS) vs Sparse approximation (SA)
47/ 66
CS vs SA (cont'd)

Non-linear solvers:
CS: Given y and M, find x̂ sparse such that Mx̂ ≈ y. Return x̂ with the guarantee that ||x − x̂|| is small.
SA: Given x and D, find ĉ sparse such that x̂ = Dĉ ≈ x. Return x̂ with the guarantee that ||x − x̂|| = ||D(c − ĉ)|| is small.

Same decomposition algorithms, different criteria.

Root-finding algorithm:
CS: Given y = 0 and f, find x̂ such that y = 0 ≈ f(x̂). Return x̂ with the guarantee that ||x − x̂|| is small.
SA: Given y = 0 and f, find x̂ such that y = 0 ≈ ŷ = f(x̂). Return ŷ with the guarantee that ||f(x̂) − 0|| is small.

CS: proximity to the true root
SA: proximity to zero in the range of the function

48/ 66
CS vs SA (cont'd)

[Figure: a function y = f(x) near its root; an ε-band around y = 0 marks the solutions of SA, an ε-band around the root marks the solutions of CS.]

Root-finding algorithm:
CS: Given y = 0 and f, find x̂ such that y = 0 ≈ f(x̂). Return x̂ with the guarantee that ||x − x̂|| is small.
SA: Given y = 0 and f, find x̂ such that y = 0 ≈ ŷ = f(x̂). Return ŷ with the guarantee that ||f(x̂) − 0|| is small.

CS: proximity to the true root
SA: proximity to zero in the range of the function
49/ 66
Part 3 - Compressive sensing - why does it work?

Note: the analysis is performed for the more general inverse problem with a generic matrix A instead of M.
50/ 66
The problem: invert y = Ax

A square:
  ∃ a reconstruction map: Rm → Rn, y ↦ x = A^{−1} y
  ⇕ condition on the matrix:
  rank(A) = m = n
  ker(A) = {z : Az = 0} = {0}
51/ 66
The problem: invert y = Ax

A square:
  ∃ a reconstruction map: Rm → Rn, y ↦ x = A^{−1} y
  ⇕ condition on the matrix:
  rank(A) = m = n
  ker(A) = {z : Az = 0} = {0}

A fat:
  ∃ a reconstruction map: Rm → Rn, y ↦ x = ??
  ⇕ condition on the matrix: ???
  NEW: reduce the domain of definition of A to s-sparse vectors
51/ 66
Part 3 - Maths of CS: why does it work?
A magic tool: concentration inequality
52/ 66
Part 3 - Maths of CS: why does it work?
Conditions for exact reconstruction with P1
53/ 66
Null space property (NSP): definition (1/3)
Definition (NSP(s))
Let A be an m × n matrix. Then A has the null space property (NSP) of order s if, for all v ∈ ker(A) \ {0} and for all index sets S s.t. |S| ≤ s,
  ||v_S||_1 < ||v_{Sc}||_1,   or equivalently   ||v_S||_1 < (1/2) ||v||_1
54/ 66
NSP: Uniqueness of P1 (2/3)
Definition (NSP(s))
Let A be an m × n matrix. Then A has the null space property (NSP) of order s if, for all v ∈ ker(A) \ {0} and for all index sets S s.t. |S| ≤ s,
  ||v_S||_1 < ||v_{Sc}||_1,   or equivalently   ||v_S||_1 < (1/2) ||v||_1

Theorem (F&R Th. 4.5: NSP ⇔ recovery by P1)
Let A be an m × n matrix, and let s ≤ m. Then these conditions are equivalent:
(i) If a solution x̂ of (P1) satisfies ||x̂||_0 ≤ s, then it is the unique solution.
(ii) A satisfies the null space property of order s.
55/ 66
NSP: Stronger NSP (3/3)
Definition (NSP(s))
Let A be an m × n matrix. Then A has the null space property (NSP) of order s if, for all v ∈ ker(A) \ {0} and for all index sets S s.t. |S| ≤ s,
  ||v_S||_1 < ||v_{Sc}||_1,   or equivalently   ||v_S||_1 < (1/2) ||v||_1

Definition (mixed-norm NSP(s))
Let A be an m × n matrix. Then A has the strong l2 null space property (NSP) of order s if, for all v ∈ ker(A) \ {0} and for all index sets S s.t. |S| ≤ s,
  ||v_S||_2 < (1 / (2√s)) ||v||_1

Theorem (mixed-norm NSP(s) ⇒ NSP(s))
The standard NSP(s) is implied by the mixed-norm NSP(s).
56/ 66
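Checking the NSP exactly is combinatorial, but a Monte Carlo sketch in Python can probe it: sample vectors in ker(A) and test the worst index set, which is always the s entries of largest magnitude. Passing on samples only suggests the NSP holds; a single failing sample disproves it. Sizes and the Gaussian matrix are illustrative.

import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(0)
m, n, s = 20, 40, 3
A = rng.normal(size=(m, n))

N = null_space(A)                              # orthonormal basis of ker(A), shape (n, n - m)
ok = True
for _ in range(10000):
    v = N @ rng.normal(size=N.shape[1])        # random vector in ker(A) \ {0}
    worst = np.sort(np.abs(v))[-s:].sum()      # max over |S| <= s of ||v_S||_1
    ok &= bool(worst < 0.5 * np.abs(v).sum())  # NSP test: ||v_S||_1 < ||v||_1 / 2

print(ok)                                      # True on every sample: consistent with NSP(s)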
The restricted isometry property (RIP) (1/2)
Change of notation: s → k , since reference is now Foucart CS-Notes
Definition (Foucart CS-Notes Def. 8.1. RI Ratio)
The k-th order restricted isometry ratio of the measurement matrix A is
  γ_k(A) = β_k(A) / α_k(A) ≥ 1
where α_k(A) and β_k(A) are the largest and smallest positive constants α and β s.t.
  ∀v ∈ Σ_k,  α ||v||²_2 ≤ ||Av||²_2 ≤ β ||v||²_2
• A preserves the Euclidean norm of k-sparse vectors
57/ 66
RIP: alternative definition RIP (2/2)
Definition (RIP and RI constant)
A satisfies the k-th order RIP with constant δ ∈ (0, 1) if
  ∀v ∈ Σ_k,  (1 − δ) ||v||²_2 ≤ ||Av||²_2 ≤ (1 + δ) ||v||²_2

The smallest such constant δ_k(A) is the RI constant of the matrix A.

Note that
  γ_k(A) = (1 + δ_k(A)) / (1 − δ_k(A))
58/ 66
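Computing δ_k(A) exactly requires examining every support, so it is intractable for realistic sizes; the Python sketch below (illustrative sizes) only produces an empirical lower bound on δ_k by sampling random k-sparse unit vectors.

import numpy as np

rng = np.random.default_rng(0)
m, n, k = 64, 256, 5
A = rng.normal(size=(m, n)) / np.sqrt(m)       # scaling so that E ||Av||_2^2 = ||v||_2^2

delta_lb = 0.0
for _ in range(20000):
    v = np.zeros(n)
    v[rng.choice(n, k, replace=False)] = rng.normal(size=k)
    v /= np.linalg.norm(v)                     # unit-norm k-sparse vector
    delta_lb = max(delta_lb, abs(np.linalg.norm(A @ v) ** 2 - 1.0))

print(delta_lb)                                # empirical lower bound on the RI constant delta_k(A)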
RIP ⇒ NSP ⇔ recovery

Theorem (RIP ⇒ mixed-norm NSP(s) ⇒ NSP(s) ⇔ recovery)
If A satisfies the RIP with RI ratio γ_2s = β_2s / α_2s, i.e.
  ∀v ∈ Σ_2s,  α_2s ||v||²_2 ≤ ||Av||²_2 ≤ β_2s ||v||²_2,
then
  ∀v ∈ ker(A) \ {0}, ∀S s.t. |S| ≤ s,  ||v_S||_2 ≤ (η / √s) ||v||_1   (1)
where η = (γ_2s − 1) / 2 = (1/2) (β_2s − α_2s) / α_2s.

So if γ_2s < 2 then A satisfies the mixed-norm NSP, i.e. every s-sparse vector x ∈ Rn is the unique solution of (P1) with y = Ax.

Foucart CS-Notes Th. 8.2. Exercise.
59/ 66
Concentration ⇒ RIP and condition on m
Matrices that satisfy the RIP

Theorem (Foucart CS-Notes Th. 9.1. concentration ⇒ RIP)
Let A be an m × n random matrix drawn according to a probability distribution satisfying the concentration inequality
  ∀x ∈ Rn, ∀ε ∈ (0, 1),  P( | ||Ax||²_2 − ||x||²_2 | > ε ||x||²_2 ) ≤ 2 exp(−c(ε) m)   (2)
where c(ε) is a constant depending only on ε. Then, for each δ ∈ (0, 1), there exist constants c0(δ), c1(δ) > 0, depending on δ and on the probability distribution of A, such that
  P( ∃x ∈ Σ_k : | ||Ax||²_2 − ||x||²_2 | > δ ||x||²_2 ) ≤ 2 exp(−c0(δ) m)   (3)
provided that
  m ≥ c1(δ) k ln(e n / k).
60/ 66
Matrices that satisfy the concentration inequality

Theorem (Gaussian ⇒ concentration around the mean)
Suppose that the entries a_ij of the m × n random matrix A are independent realizations of centered Gaussian random variables with variance σ². Then for each x ∈ Rn and ε ∈ (0, 1),
  P( | ||Ax||²_2 − m σ² ||x||²_2 | > ε m σ² ||x||²_2 ) ≤ 2 exp( −(ε²/4 − ε³/6) m ).

Foucart CS-Notes Section 9.3. It also holds for
• any subgaussian i.i.d. matrices,
• subsampled Fourier matrices (cf. Computer Lab).
61/ 66
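An empirical check of this concentration, as a Python sketch (sizes, σ and ε are illustrative): the observed deviation probability should stay below the exponential bound of the theorem as reconstructed above.

import numpy as np

rng = np.random.default_rng(0)
m, n, sigma, eps = 200, 50, 1.0, 0.2
x = rng.normal(size=n)
target = m * sigma**2 * np.linalg.norm(x) ** 2

trials, hits = 2000, 0
for _ in range(trials):
    A = rng.normal(scale=sigma, size=(m, n))   # fresh Gaussian matrix for each trial
    hits += abs(np.linalg.norm(A @ x) ** 2 - target) > eps * target

bound = 2 * np.exp(-(eps**2 / 4 - eps**3 / 6) * m)
print(hits / trials, "<=", bound)              # empirical deviation probability vs the theorem's bound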
Summary
• If A follows a subgaussian distribution with m = O(s ln(n/s))
  [easy construction / easy to verify...]
  then A satisfies the RIP of order 2s with probability at least 1 − 2 exp(−c0(δ) m),
  and hence the NSP, and hence exact reconstruction under P1.
• Gaussian, Bernoulli (Rademacher entries), ..., and subsampled Fourier matrices satisfy the RIP.
62/ 66
Part 4 - Compressive sensing - what is it good for?
63/ 66
How to spot a compressive sensing system?
Case 1
• Think about systems that use a raster mode for sampling, then think of physical ways to perform multiplexing instead
• Once you perform the multiplexing, use compressive sensing solvers to reconstruct the signal
• Does it work better, or as well with fewer measurements?
64/ 66
Example: single pixel camera

Classical acquisition: i = 1, 2, 3, 4, 5, ..., 1000, ..., 10000000
(one measurement per pixel, scanned one after the other)

Compressive sensing: i = 1, 2, 3
if the image is 3-sparse, then ∼ 3 measurements are enough

65/ 66
How to spot a compressive sensing system?
Case 2
• Look for acquisition schemes that already multiplex a signal
• Is the signal produced by this system sparse in some basis?
• If yes, subsample the acquisition and use compressive sensing solvers to reconstruct the signal
• Does it work better, or as well with fewer measurements?
66/ 66