Sampling graph signals via determinantal point processes
Nicolas Tremblay, Simon Barthelme, Pierre-Olivier Amblard
CNRS, Gipsa-lab, Grenoble
Introduction
The graph sampling problem and existing methods
Sampling via DPP
Conclusion
What's a graph signal?
Why sample?
General reason: reduce dimensions (and thereby costs) for
• statistics estimation: mean, moments, etc.
• perfect or lossy reconstruction (compression)
• running costly algorithms in smaller dimensions (e.g. community detection)
• cases where measuring each node is costly (e.g. sensor networks)
• etc.
Three useful matrices
The adjacency matrix, the degree matrix, and the Laplacian matrix of the example graph:

W = \begin{pmatrix} 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \quad
D = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad
L = D - W = \begin{pmatrix} 2 & -1 & -1 & 0 \\ -1 & 3 & -1 & -1 \\ -1 & -1 & 2 & 0 \\ 0 & -1 & 0 & 1 \end{pmatrix}
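For concreteness, a small numpy sketch (my own illustration, not part of the slides) building these three matrices for the 4-node example graph:

```python
import numpy as np

# Adjacency matrix of the 4-node example graph
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)

D = np.diag(W.sum(axis=1))  # degree matrix
L = D - W                   # combinatorial Laplacian, matches the matrix above
```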
Notations
L = D − W = UΛUᵀ
• U is a Fourier basis of the graph [Hammond ’11]
• the Fourier transform of a signal x reads: x̂ = Uᵀx
• Λ = diag(λ₁, λ₂, …, λ_N) is the spectrum

(Figure: a low-frequency Fourier mode vs. a high-frequency Fourier mode on the graph.)
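A minimal sketch of the graph Fourier transform defined by this notation (numpy, dense diagonalization, fine for small graphs only):

```python
import numpy as np

W = np.array([[0, 1, 1, 0], [1, 0, 1, 1], [1, 1, 0, 0], [0, 1, 0, 0]], dtype=float)
L = np.diag(W.sum(axis=1)) - W

lam, U = np.linalg.eigh(L)        # L = U diag(lam) U^T, eigenvalues in ascending order
x = np.random.randn(4)            # an arbitrary graph signal
x_hat = U.T @ x                   # graph Fourier transform
assert np.allclose(U @ x_hat, x)  # U is orthonormal, so the inverse GFT recovers x
```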
Given a graph signal x ∈ R^N, sampling consists in:
1. choosing a subset of nodes S = {s₁, …, s_m}
2. measuring x on S: y = Mx + n ∈ R^m.
How to reconstruct the original signal x from its measurement y?
Basically, we need :
1. a (low-dimensional) model for the signal to sample
2. a method to choose the nodes to sample
3. a “decoder” that exactly recovers the signal given its samples
M = \begin{pmatrix} \delta_{s_1}^\intercal \\ \delta_{s_2}^\intercal \\ \vdots \\ \delta_{s_m}^\intercal \end{pmatrix} \in \mathbb{R}^{m \times N} \qquad (1)
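As an illustration of Eq. (1) (a sketch with made-up node indices), M simply stacks the canonical basis vectors δ_{s_i}ᵀ of the sampled nodes:

```python
import numpy as np

N = 8
S = [2, 5, 7]                       # hypothetical sampled nodes s_1, ..., s_m
M = np.eye(N)[S]                    # row i is delta_{s_i}^T, shape (m, N)

x = np.random.randn(N)              # a graph signal
n = 0.01 * np.random.randn(len(S))  # measurement noise
y = M @ x + n                       # noisy measurement y = Mx + n on S
```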
Low-dimensional model: smoothness assumption
In 1D signal processing, a smooth signal has most of its energy at low frequencies.
(Figure: a smooth signal in time and its Fourier transform, whose energy is concentrated at low frequencies.)
Definition (Bandlimited graph signal [Chen '15, Anis '16, Marques '16, Puy '16]).
A k-bandlimited signal x ∈ R^N on G is a signal that satisfies, for some α ∈ R^k:

x = U_k \alpha = \sum_{i=1}^{k} \alpha_i u_i
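A minimal sketch of generating such a signal, assuming a dense Laplacian L is at hand (the coefficients α are arbitrary here):

```python
import numpy as np

def bandlimited_signal(L, k, rng=np.random.default_rng(0)):
    """Return x = U_k alpha, a k-bandlimited signal on the graph with Laplacian L."""
    lam, U = np.linalg.eigh(L)          # eigenvalues in ascending order
    Uk = U[:, :k]                       # first k Fourier modes (lowest frequencies)
    alpha = rng.standard_normal(k)      # arbitrary low-frequency coefficients
    return Uk @ alpha, Uk
```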
The sampling problem
Recall that :
1. y = Mx + n ∈ Rm is the noisy measurement of x on the sampled nodes S.
2. x is supposed approximately bandlimited: x = U_k α + n′, such that:
y = MU_k α + ñ,
where ñ = Mn′ + n encompasses the measurement noise and the distance to the model.
Our objective: tight sampling for perfect reconstruction.
Two cases:
• U_k is computable.
• U_k is too expensive to compute: only partial information is accessible.
Known Uk case
Decoder. x_rec = argmin_{z ∈ span(U_k)} ‖Mz − y‖₂ = U_k (MU_k)^† y.

Theorem. Reconstruction (up to the noise level) is perfect iff σ₁(MU_k) > 0. In this case, (MU_k)^† = (U_kᵀ Mᵀ M U_k)⁻¹ U_kᵀ Mᵀ, and:
x_rec = x + (MU_k)^† n.
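In numpy this decoder is a few lines (a sketch assuming U_k has been computed; pinv stands in for the closed-form inverse above):

```python
import numpy as np

def decode(Uk, S, y):
    """Recover a k-bandlimited signal from its (noisy) samples y on the node set S."""
    MUk = Uk[S, :]                   # M @ U_k: rows of U_k at the sampled nodes
    alpha = np.linalg.pinv(MUk) @ y  # (M U_k)^dagger y, valid when sigma_1(M U_k) > 0
    return Uk @ alpha                # x_rec = U_k (M U_k)^dagger y
```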
Choosing m = k, there are in general many possible sets S such that σ1(MUk ) > 0.
Optimality measures [Chen '15, Anis '16, Tsitsvero '16]:

1. Worst-case error: S_WCE = argmax_{S : |S|=k} σ₁²,
2. Mean-square error: S_MSE = argmin_{S : |S|=k} \sum_{i=1}^{k} 1/σ_i²,
3. Maximum volume: S_MV = argmax_{S : |S|=k} \prod_{i=1}^{k} σ_i² = argmax_{S : |S|=k} (det(MU_k))².
Practical algorithm. These combinatorial problems are NP-complete [Civril '09].
→ Greedy approximate solutions S are computed with a cost of O(Nk⁴) (a naive sketch of this greedy baseline follows below).
✓ 1st contribution: a DPP-based algorithm finding an approximate S_MV in O(Nk²).
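For reference, a naive sketch of the greedy max-volume baseline (my own illustration, not the authors' implementation): at each step it adds the node whose row of U_k most increases the volume of the sampled submatrix.

```python
import numpy as np

def greedy_max_volume(Uk):
    """Greedily select k = Uk.shape[1] nodes approximately maximizing det(M U_k)^2."""
    N, k = Uk.shape
    S = []
    for _ in range(k):
        best_i, best_vol = None, -np.inf
        for i in set(range(N)) - set(S):
            rows = Uk[S + [i], :]
            vol = np.linalg.det(rows @ rows.T)  # squared volume of the selected rows
            if vol > best_vol:
                best_i, best_vol = i, vol
        S.append(best_i)
    return S
```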
Unknown Uk case
First option : Uncorrelated random sampling [Puy ’16, Tremblay ’16].
1. Associate to each node i the probability p_i = ‖U_kᵀ δ_i‖₂² / k of drawing this node.
2. Draw the sampling set S of size m independently with replacement from p.
3. Theorem (Restricted Isometry Property). With high probability, reconstruction (up to the noise) is perfect, provided that m > O(k log k).
✓ Up to the log factor, it is tight.
✓ There is an efficient algorithm that estimates p in O(|E| log N).
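A sketch of this first option when U_k happens to be available, just to make the probabilities p_i concrete (the fast O(|E| log N) estimation of p is not shown):

```python
import numpy as np

def leverage_score_sampling(Uk, m, rng=np.random.default_rng(0)):
    """Draw m nodes i.i.d. with replacement, with p_i = ||U_k^T delta_i||_2^2 / k."""
    N, k = Uk.shape
    p = (Uk ** 2).sum(axis=1) / k                 # graph leverage scores, sum to 1
    S = rng.choice(N, size=m, replace=True, p=p)  # i.i.d. sampling with replacement
    return S, p
```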
Second option : approximate the “known Uk” algorithms with spectral proxies.
1. approximate the greedy algorithms, e.g. [Anis '16], in O(mk|E|);
2. 2nd contribution: approximate the DPP sampling algorithm, in O(m|E| + Nm²). See also [Chamon '17].
In all cases, regularized decoder (O(|E|) with gradient descent):

x_rec = argmin_{z ∈ R^N} ‖P_Ω^{-1/2}(Mz − y)‖₂² + γ zᵀ L^r z.
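A dense sketch of this regularized decoder (solving the normal equations directly rather than by gradient descent; P_Ω is the diagonal matrix of the sampling probabilities restricted to S, and r = 1 by default; these choices are mine for illustration):

```python
import numpy as np

def regularized_decode(L, S, y, p, gamma=1e-2, r=1):
    """Solve (M^T P^-1 M + gamma L^r) z = M^T P^-1 y (small graphs, dense algebra)."""
    N = L.shape[0]
    M = np.eye(N)[S]                              # sampling matrix
    P_inv = np.diag(1.0 / p[S])                   # P_Omega^{-1}
    A = M.T @ P_inv @ M + gamma * np.linalg.matrix_power(L, r)
    b = M.T @ P_inv @ y
    return np.linalg.solve(A, b)
```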
Determinantal point processes (DPP)
Definition [Kulesza '12]. Consider a point process, i.e., a process that randomly draws a subset A ⊆ [N]. It is determinantal if, for every S,

P(S ⊆ A) = det(K_S),

where K ∈ R^{N×N} is a positive semidefinite matrix such that 0 ⪯ K ⪯ I.
Negative correlation: P(x_i and x_j ∈ A) = K_ii K_jj − K_ij² ≤ P(x_i ∈ A) P(x_j ∈ A).
Illustration with a Gaussian kernel in the plane: K_ij = exp(−‖x_i − x_j‖² / σ²).

(Figure – a: uniform i.i.d. points, b: DPP with Gaussian kernel, c: distribution of interpoint distances for DPP vs. i.i.d. sampling.)
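A quick numerical check of the negative-correlation property with a Gaussian kernel on random points in the plane (hypothetical data; det(K_S) ≤ K_ii K_jj is exactly the inequality above):

```python
import numpy as np

rng = np.random.default_rng(0)
pts = rng.uniform(size=(50, 2))                          # random points in the unit square
d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
K = np.exp(-d2 / 1.0 ** 2)                               # Gaussian kernel, sigma = 1

i, j = 3, 7
p_pair = np.linalg.det(K[np.ix_([i, j], [i, j])])        # det(K_S) = K_ii K_jj - K_ij^2
print(p_pair <= K[i, i] * K[j, j])                       # True: nearby points repel each other
```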
A DPP kernel for graph sampling
In the context of graph sampling, consider the kernel

K_k = U_k U_kᵀ
Recall that:

S_MV = argmax_{S : |S|=k} \prod_{i=1}^{k} σ_i² = argmax_{S : |S|=k} det(M U_k U_kᵀ Mᵀ) = argmax_{S : |S|=k} det(K_{k,S})
⇒ SMV is the most probable sample from the DPP associated to Kk .
Case 1: known U_k. Sample from K_k: the result is close to S_MV, but faster to compute: O(Nk²) vs O(Nk⁴).

Case 2: unknown U_k. Two possible options:
• find spectral proxies to approximate the DPP sampling with K_k, in O(m|E| + Nm²);
• use loop-erased random walks to efficiently sample from a DPP with a kernel close to K_k [Tremblay '17].
DPP sampling with projective kernel K = U_k U_kᵀ

Input: K = U_k U_kᵀ
S ← ∅, let p₀ = diag(K) ∈ R^N, p ← p₀
for n = 1, …, k do:
· draw s_n with probability P(s) = p(s) / Σ_i p(i)
· S ← S ∪ {s_n}
· update p: ∀i, p(i) = p₀(i) − K_{S,i}ᵀ K_S^{-1} K_{S,i}
end for
Output: S.
Cost: O(n³ + Nn²) at step n, i.e. O(Nk³) overall.

Equivalent algorithm

Input: K = U_k U_kᵀ = [k₁, …, k_N]
S ← ∅, let p = diag(K) ∈ R^N
for n = 1, …, k do:
· draw s_n with probability P(s) = p(s) / Σ_i p(i)
· S ← S ∪ {s_n}
· compute f_n = k_{s_n} − Σ_{l=1}^{n−1} f_l f_l(s_n)
· normalize f_n ← f_n / √(f_n(s_n))
· update p: ∀i, p(i) ← p(i) − f_n(i)²
end for
Output: S.
Cost: O(Nn) at step n, i.e. O(Nk²) overall.
1st advantage: a complexity gain of a factor k, for an added memory cost of O(Nk).
2nd advantage: this formulation enables polynomial approximations. All we need is:
• an estimate of diag(K),
• estimates of m columns of K.
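Here is a numpy sketch of the right-hand O(Nk²) algorithm, assuming U_k is available (my reading of the pseudocode above, not the authors' released code):

```python
import numpy as np

def sample_projective_dpp(Uk, rng=np.random.default_rng(0)):
    """Draw |S| = k nodes from the projection DPP with kernel K = U_k U_k^T."""
    N, k = Uk.shape
    p = (Uk ** 2).sum(axis=1)              # p = diag(K)
    F = np.zeros((k, N))                   # stores f_1, ..., f_k
    S = []
    for n in range(k):
        s = rng.choice(N, p=p / p.sum())
        S.append(s)
        # f_n = k_s - sum_{l<n} f_l f_l(s), where k_s = K delta_s = U_k (U_k^T delta_s)
        f = Uk @ Uk[s, :] - F[:n].T @ F[:n, s]
        f /= np.sqrt(f[s])                 # normalize: f_n <- f_n / sqrt(f_n(s_n))
        F[n] = f
        p = np.maximum(p - f ** 2, 0.0)    # update residual leverage scores (floor at 0)
    return S
```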
Polynomial approximation
1. Note that:

k_i = K δ_i = U_k U_kᵀ δ_i = U h_{λ_k}(Λ) Uᵀ δ_i,

with h_{λ_k} the ideal low-pass filter cutting at λ_k.

(Figure: the ideal low-pass response h_{λ_k}(λ) and its polynomial approximations of increasing order.)
The polynomial approximation h̃_{λ_k}(λ) = \sum_{i=1}^{p} α_i λ^i enables to write:

k_i = U h_{λ_k}(Λ) Uᵀ δ_i ≃ U h̃_{λ_k}(Λ) Uᵀ δ_i = \sum_{i=1}^{p} α_i L^i δ_i

→ Estimating a column of K costs O(|E|p).
2. Consider a Gaussian random matrix R ∈ R^{N×r} with entries of mean 0 and variance 1/r, and its approximate filtered version R_h ≃ KR. One has, if r = O(log N):

Tr(R_h R_hᵀ) ≃ Tr(K R Rᵀ K) ≃ Tr(K) = k.

→ In O(p|E| log N), one may estimate λ_k by dichotomy, and diag(K).
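A rough sketch of step 1, estimating one column of K by polynomial filtering of a Kronecker delta (the coefficients α come here from a naive least-squares fit of the ideal response on a grid; in practice Chebyshev or Jackson-Chebyshev coefficients would be used, and lam_max should be a cheap upper bound on the spectrum of L):

```python
import numpy as np

def approx_kernel_column(L, lam_k, i, p=10, lam_max=None):
    """Estimate k_i = K delta_i as sum_j alpha_j L^j delta_i, using only L @ vector products."""
    N = L.shape[0]
    if lam_max is None:
        lam_max = np.linalg.eigvalsh(L).max()    # in practice: a cheap spectral upper bound
    # Fit alpha to the ideal low-pass response h(lambda) = 1_{lambda <= lam_k} on a grid
    grid = np.linspace(0.0, lam_max, 200)
    V = np.vander(grid, p + 1, increasing=True)  # columns: 1, lambda, ..., lambda^p
    alpha, *_ = np.linalg.lstsq(V, (grid <= lam_k).astype(float), rcond=None)

    # Accumulate col = sum_j alpha_j L^j delta_i; cost O(|E| p) when L is sparse
    v = np.zeros(N); v[i] = 1.0
    col = np.zeros(N)
    for a in alpha:
        col += a * v
        v = L @ v
    return col
```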
Exact sampling from K_k = U_k U_kᵀ

Input: K_k = U_k U_kᵀ = [k₁, …, k_N]
S ← ∅, let p = diag(K) ∈ R^N
for n = 1, …, k do:
· draw s_n with probability P(s) = p(s) / Σ_i p(i)
· S ← S ∪ {s_n}
· compute f_n = k_{s_n} − Σ_{l=1}^{n−1} f_l f_l(s_n)
· normalize f_n ← f_n / √(f_n(s_n))
· update p: ∀i, p(i) ← p(i) − f_n(i)²
end for
Output: S.
Cost: O((|E|k + Nk²)I) for the partial diagonalization + O(Nk²) for the sampling = O((|E| + Nk)kI).

Approximate sampling from K_k

Input: L, k, p, m
Estimate λ_k, the k-th eigenvalue of L
S ← ∅, estimate p ≃ diag(K) ∈ R^N
for n = 1, …, m do:
· draw s_n with probability P(s) = p(s) / Σ_i p(i)
· S ← S ∪ {s_n}
· estimate k_{s_n} ≃ K δ_{s_n}
· compute f_n = k_{s_n} − Σ_{l=1}^{n−1} f_l f_l(s_n)
· normalize f_n ← f_n / √(f_n(s_n))
· update p(i) ← p(i) − f_n(i)²
end for
Output: S of size m.
Cost: O(p|E|I log N) to estimate λ_k and diag(K) + O(p|E|m) to estimate m columns of K + O(Nm²) additional sampling cost = O(p|E|I + Nm²).
Speed / accuracy trade-off:
1. At small k, very good approximation, but not much faster than exact sampling.
2. At large k, speed is much improved, but the approximation error increases.
Toy experiments
Known U_k case, on the SBM: comparison of greedy vs. DPP with K_k (m = k samples); reconstruction error ‖x_rec − x‖₂ as a function of ε/ε_c.

(Figure – Median reconstruction performance over 100 realisations of 10-bandlimited signals, as a function of the number of samples m, on a) a transportation graph, b) a 3D mesh graph, c) a realisation of the SBM. Legend: uniform i.i.d., i.i.d. with p = diag(K_k), i.i.d. with p ≃ diag(K_k), our method (Alg. 3), DPP with K_k.)
To sum up: various strategies to sample k-bandlimited graph signals with the objective of perfect reconstruction.
diago. cost                  | algorithm              | # nec. samples                       | sampling cost
Known U_k: O(|E|k + Nk²I)    | leverage score         | m = O(k log k) †                     | O(Nm)
                             | greedy w/ WCE          | m = k †                              | O(Nk⁴)
                             | greedy w/ MSE          | m = k †                              | O(Nk⁴)
                             | greedy w/ MV           | m = k †                              | O(Nk⁴)
                             | DPP w/ K_k             | m = k †                              | O(Nk²)
Unknown U_k: N/A             | random uniform         | m = O(N max_i ‖U_kᵀδ_i‖₂² log k) †   | O(m)
                             | lev. score w/ proxies  | m ≳ O(k log k) ∗                     | O(|E| log N + Nm)
                             | greedy w/ proxies      | m ≳ k ∗                              | O(mk|E|)
                             | K_k-DPP w/ proxies     | m ≳ k ∗                              | O(m|E| + Nm²)
                             | LERW                   | m ≳ k ∗                              | ≃ O(|E|d/q)

∗: heuristics, †: provably
Conclusion
✓ DPP with K_k improves over greedy optimization of the "max volume" set S_MV
✓ One can use polynomial approximations to avoid computing U_k explicitly
✓ One can add prior information on the noise structure to improve reconstruction
✓ The authors of [Chamon '17] show empirically very good performance (in fact, close to the exact solution) on different types of graphs
× In the approximate sampling framework, lack of control over the speed/precision trade-off + lack of reconstruction guarantees
Future questions
1. truly "graph-based" algorithms?
2. distributed algorithms?
3. applying this work to related clustering problems, such as coresets.
References
[Hammond '11] Wavelets on graphs via spectral graph theory, ACHA
[Chen '15] Discrete Signal Processing on Graphs: Sampling Theory, TSP
[Anis '16] Efficient Sampling Set Selection for Bandlimited Graph..., TSP
[Marques '16] Sampling of graph signals with successive local..., TSP
[Puy '16] Random sampling of bandlimited signals..., ACHA
[Tsitsvero '16] Signals on Graphs: Uncertainty Principle and Sampling, TSP
[Civril '09] On selecting a maximum volume..., Theoretical Comp. Science
[Chamon '17] Greedy Sampling of Graph Signals, arXiv:1704.01223
[Tremblay '17] Graph sampling with determinantal processes, EUSIPCO
[Tremblay '16] Compressive spectral clustering, ICML
[Kulesza '12] Determinantal Point Processes..., Found. and Trends in ML
[Avena '13] Random spanning forests, Markov matrix..., arXiv:1310.1723