
Graph signal processing for clustering

Nicolas Tremblay

PANAMA Team, INRIA Rennes, with Rémi Gribonval; Signal Processing Laboratory 2, EPFL, Lausanne, with Pierre Vandergheynst.

Outline: Introduction · Graph signal processing... · ... applied to clustering · Conclusion

Rennes, 13th of January 2016

What's clustering?


Given a series of N objects:

1/ Find adapted descriptors

2/ Cluster


After step 1, one has:

• N vectors in d dimensions (the descriptor dimension): $x_1, x_2, \cdots, x_N \in \mathbb{R}^d$;

• and their distance matrix $D \in \mathbb{R}^{N \times N}$.

The goal of clustering is to assign a label $c(i) \in \{1, \cdots, k\}$ to each object $i$ in order to organize / simplify / analyze the data.

There exist two general types of methods:

• methods directly based on the $x_i$ and/or $D$, like k-means or hierarchical clustering;

• graph-based methods.


Graph construction from the distance matrix D

Create a graph $G = (V, E)$:

• each node in $V$ is one of the $N$ objects;

• a pair of nodes $(i, j)$ is connected if the associated distance $D(i, j)$ is small enough.

For example, two connectivity possibilities (see the sketch after this list):

• Gaussian kernel:

1. connect all pairs of nodes with links of weight $\exp(-D(i,j)/\sigma)$;

2. remove all links of weight smaller than $\epsilon$.

• k nearest neighbors: connect each node to its k nearest neighbors.
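As a concrete illustration, here is a minimal NumPy sketch of both constructions; the bandwidth sigma, the threshold eps and the number of neighbors k are assumptions to be tuned to the data:

```python
import numpy as np

def gaussian_kernel_graph(D, sigma, eps):
    """Weighted adjacency matrix: connect everything with weight
    exp(-D(i,j)/sigma), then remove links lighter than eps."""
    W = np.exp(-D / sigma)
    np.fill_diagonal(W, 0.0)   # no self-loops
    W[W < eps] = 0.0           # sparsify
    return W

def knn_graph(D, k):
    """Binary adjacency matrix: connect each node to its k nearest
    neighbors, then symmetrize to get an undirected graph."""
    N = D.shape[0]
    W = np.zeros((N, N))
    for i in range(N):
        neighbors = np.argsort(D[i])[1:k + 1]  # skip D[i,i] = 0
        W[i, neighbors] = 1.0
    return np.maximum(W, W.T)
```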


The problem now reads:

Given the graph $G$ representing the similarity between the $N$ objects, find a partition of all nodes into $k$ clusters.

Many methods exist [Fortunato '10]:

• modularity (or other cost-function) optimisation methods [Newman '04];

• random walk methods [Delvenne '10];

• methods inspired by statistical physics [Krzakala '12] or information theory [Rosvall '07];

• spectral methods;

• ...


Three useful matrices

The adjacency matrix $W$, the degree matrix $S$, and the Laplacian matrix $L = S - W$.

Unweighted example:

$$W = \begin{pmatrix} 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \quad S = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad L = S - W = \begin{pmatrix} 2 & -1 & -1 & 0 \\ -1 & 3 & -1 & -1 \\ -1 & -1 & 2 & 0 \\ 0 & -1 & 0 & 1 \end{pmatrix}$$

Weighted example:

$$W = \begin{pmatrix} 0 & .5 & .5 & 0 \\ .5 & 0 & .5 & 4 \\ .5 & .5 & 0 & 0 \\ 0 & 4 & 0 & 0 \end{pmatrix}, \quad S = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 5 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 4 \end{pmatrix}, \quad L = S - W = \begin{pmatrix} 1 & -.5 & -.5 & 0 \\ -.5 & 5 & -.5 & -4 \\ -.5 & -.5 & 1 & 0 \\ 0 & -4 & 0 & 4 \end{pmatrix}$$
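In code, the three matrices follow from one another in two lines; a sketch for the unweighted example above:

```python
import numpy as np

W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)

S = np.diag(W.sum(axis=1))  # degree matrix: row sums on the diagonal
L = S - W                   # combinatorial graph Laplacian
```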

The classical spectral clustering algorithm [Von Luxburg '06]:

Given the $N$-node graph $G$ with adjacency matrix $W$:

1. Compute $U_k = (u_1|u_2|\cdots|u_k)$, the first $k$ eigenvectors of $L = S - W$.

2. Consider each node $i$ as a point in $\mathbb{R}^k$: $f_i = U_k^\top \delta_i$.

3. Run k-means with the Euclidean distance $D_{ij} = \|f_i - f_j\|$ and obtain $k$ clusters.
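A direct transcription of the three steps, as a sketch: SciPy's `eigsh` computes the k smallest eigenpairs (step 1), the rows of $U_k$ are the feature vectors $f_i$ (step 2), and scikit-learn's k-means does step 3:

```python
import numpy as np
from scipy.sparse.linalg import eigsh
from sklearn.cluster import KMeans

def spectral_clustering(L, k):
    # Step 1: the k eigenvectors of L with smallest eigenvalues
    _, U_k = eigsh(L, k=k, which='SM')
    # Steps 2-3: row i of U_k is f_i; cluster the rows with k-means
    return KMeans(n_clusters=k, n_init=10).fit_predict(U_k)
```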


What's the point of using a graph?

[Figure: N points in d = 2 dimensions; the result with k-means (k = 2).]

[Figure: the result after creating a graph from the N points' interdistances and running the spectral clustering algorithm (with k = 2).]


Computation bottlenecks of the spectral clustering algorithm

When N and/or k become too large, there are two main bottlenecks in the algorithm:

1. the partial eigendecomposition of the Laplacian;

2. k-means.

Our goal: circumvent both!


What's graph signal processing?

What's a graph signal?

[Figure: two classical signals as a warm-up: monthly temperatures over a year (J F M A M J J A S O N D, in °C) and temperatures over the hours of a day (in °C).]

A graph signal is simply a vector $f \in \mathbb{R}^N$ assigning one value to each of the $N$ nodes of a graph.

What's the graph Fourier matrix? [Hammond '11]

The "classical" graph:

$$L_{cl} = \begin{pmatrix} 2 & -1 & 0 & \cdots & 0 & -1 \\ -1 & 2 & -1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & & \vdots \\ 0 & 0 & 0 & \cdots & 2 & -1 \\ -1 & 0 & 0 & \cdots & -1 & 2 \end{pmatrix}$$

All classical Fourier modes are eigenvectors of $L_{cl}$.

Any graph: by analogy, any graph's Fourier modes are the eigenvectors of its Laplacian matrix $L$.

The graph Fourier matrix

$L = S - W$.

Its eigenvectors $U = (u_1|u_2|\cdots|u_N)$ form the graph Fourier orthonormal basis.

Its eigenvalues $0 = \lambda_1 \le \lambda_2 \le \cdots \le \lambda_N$ represent the graph frequencies: $\lambda_i$ is the squared frequency associated to the Fourier mode $u_i$.

Illustration

[Figure: a low-frequency Fourier mode (smooth across neighboring nodes) next to a high-frequency one (strongly oscillating between neighbors).]

The Fourier transform

• Given $f \in \mathbb{R}^N$ a signal on a graph of size $N$.

• Its Fourier transform $\hat{f}$ is obtained by decomposing $f$ on the eigenvectors $u_i$:

$$\hat{f} = \begin{pmatrix} \langle u_1, f \rangle \\ \langle u_2, f \rangle \\ \langle u_3, f \rangle \\ \vdots \\ \langle u_N, f \rangle \end{pmatrix}, \quad \text{i.e. } \hat{f} = U^\top f.$$

• Inversely, the inverse Fourier transform reads: $f = U \hat{f}$.

• The Parseval theorem stays valid: $\forall (g, h)$, $\langle g, h \rangle = \langle \hat{g}, \hat{h} \rangle$.
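A small sketch of the transform pair, reusing the Laplacian L built earlier (NumPy's `eigh` returns eigenvalues in ascending order, so the columns of U are the Fourier modes ordered by frequency):

```python
import numpy as np

lam, U = np.linalg.eigh(L)     # 0 = lam[0] <= lam[1] <= ..., modes in columns
f = np.random.randn(L.shape[0])

f_hat = U.T @ f                # forward graph Fourier transform
f_rec = U @ f_hat              # inverse transform
assert np.allclose(f, f_rec)   # U is orthonormal: perfect reconstruction
```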

Filtering

Given a filter function $g$ defined in the Fourier space.

[Figure: a filter response $g(\lambda)$ decaying from 1 to 0 as $\lambda$ grows.]

In the Fourier space, the signal filtered by $g$ reads:

$$\hat{f}_g = \begin{pmatrix} \hat{f}(1)\, g(\lambda_1) \\ \hat{f}(2)\, g(\lambda_2) \\ \hat{f}(3)\, g(\lambda_3) \\ \vdots \\ \hat{f}(N)\, g(\lambda_N) \end{pmatrix} = \hat{G} \hat{f} \quad \text{with} \quad \hat{G} = \begin{pmatrix} g(\lambda_1) & 0 & \cdots & 0 \\ 0 & g(\lambda_2) & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & g(\lambda_N) \end{pmatrix}.$$

In the node space, the filtered signal $f_g$ therefore reads:

$$f_g = U \hat{G}\, U^\top f = G f.$$
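Continuing the sketch, exact filtering is a diagonal scaling in the Fourier basis; the exponential profile below is just an assumed example of a low-pass g:

```python
import numpy as np

lam, U = np.linalg.eigh(L)           # L: the Laplacian from earlier
f = np.random.randn(L.shape[0])

g = lambda lm: np.exp(-5.0 * lm)     # example low-pass filter (assumption)
G_hat = np.diag(g(lam))              # \hat{G}: one gain per Fourier mode
f_g = U @ G_hat @ U.T @ f            # filtered signal, back in node space
```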

So where's the link?

Remember: the classical spectral clustering algorithm

Given the $N$-node graph $G$ with adjacency matrix $W$:

1. Compute $U_k = (u_1|u_2|\cdots|u_k)$, the first $k$ eigenvectors of $L = S - W$.

2. Consider each node $i$ as a point in $\mathbb{R}^k$: $f_i = U_k^\top \delta_i$.

3. Run k-means with the Euclidean distance $D_{ij} = \|f_i - f_j\|$ and obtain $k$ clusters.

Let's work on the first bottleneck: estimate $D_{ij}$ without partially diagonalizing the Laplacian matrix.

Ideal low-pass filtering. 1st step: assume we know $U_k$ and $\lambda_k$

Given $h_{\lambda_k}$ an ideal low-pass filter,
$$H_{\lambda_k} = U \hat{H}_{\lambda_k} U^\top = U_k U_k^\top$$
is its filter matrix.

[Figure: the ideal low-pass response $h_{\lambda_k}(\lambda)$: equal to 1 for $\lambda \le \lambda_k$, 0 beyond.]

Let $R = (r_1|r_2|\cdots|r_\eta) \in \mathbb{R}^{N \times \eta}$ be a random Gaussian matrix.

We define $\tilde{f}_i = (H_{\lambda_k} R)^\top \delta_i \in \mathbb{R}^\eta$ and $\tilde{D}_{ij} = \|\tilde{f}_i - \tilde{f}_j\|$.

Norm conservation theorem for the ideal filter

Let $\epsilon > 0$. If $\eta > \eta_0 \sim \frac{\log N}{\epsilon^2}$, then, with probability $> 1 - 1/N$, we have:

$$\forall (i,j) \in [1,N]^2, \quad (1-\epsilon)\, D_{ij} \le \tilde{D}_{ij} \le (1+\epsilon)\, D_{ij}.$$
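A toy check of the random-feature construction with the exact $U_k$ (small eta and k are assumptions; the 1/sqrt(eta) scaling keeps the distances comparable in expectation):

```python
import numpy as np

k, eta = 2, 50
N = L.shape[0]                        # L: the Laplacian from earlier
lam, U = np.linalg.eigh(L)

H = U[:, :k] @ U[:, :k].T             # ideal low-pass filter matrix U_k U_k^T
R = np.random.randn(N, eta) / np.sqrt(eta)
F = H @ R                             # row i is the random feature f~_i

# ||F[i] - F[j]|| now approximates D_ij = ||f_i - f_j|| for all pairs
```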

Non-ideal low-pass filtering. 2nd step: assume all we know is $\lambda_k$

In practice, we use a polynomial approximation of order $m$ of $h_{\lambda_k}$:

$$\tilde{h}_{\lambda_k}(\lambda) = \sum_{l=1}^{m} \alpha_l\, \lambda^l \simeq h_{\lambda_k}(\lambda).$$

[Figure: the ideal response and its polynomial approximations for m = 100, m = 20 and m = 5; the cut-off at $\lambda_k$ sharpens as m grows.]

Indeed, in this case, filtering a vector $x$ reads:

$$\tilde{H}_{\lambda_k} x = U \tilde{h}_{\lambda_k}(\Lambda)\, U^\top x = U \left( \sum_{l=1}^{m} \alpha_l \Lambda^l \right) U^\top x = \sum_{l=1}^{m} \alpha_l\, L^l x.$$

• This does not require the knowledge of $U_k$.

• It only involves matrix-vector multiplications [costs $O(m|E|)$].

The theorem stays (more or less) valid with this non-ideal filtering!
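The whole point is that $\sum_l \alpha_l L^l x$ needs only repeated matrix-vector products, never the eigenvectors; a minimal sketch, where the coefficients `alpha` are assumed given (in practice obtained by fitting, e.g., a Chebyshev approximation of the ideal filter):

```python
import numpy as np

def poly_filter(L, alpha, x):
    """Apply sum_{l=1}^{m} alpha[l-1] * L^l @ x with m
    matrix-vector products, never forming the powers L^l."""
    out = np.zeros_like(x)
    v = x.copy()
    for a in alpha:
        v = L @ v          # v is now L^l @ x at step l
        out += a * v
    return out
```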

Last step: estimate $\lambda_k$

Goal: given $L$, estimate its $k$-th eigenvalue as fast as possible.

We use eigencount techniques (also based on polynomial filtering of random vectors!):

• given an interval $[0, b]$, get an approximation of the number of enclosed eigenvalues;

• then find $\lambda_k$ by dichotomy on $b$.
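A sketch of the dichotomy, assuming a counting oracle `eigencount(L, b)` (in the approach above, that oracle is itself approximated by polynomial filtering of random vectors rather than computed exactly):

```python
import numpy as np

def estimate_lambda_k(L, k, eigencount, tol=1e-3):
    """Binary search for lambda_k: the smallest b such that
    [0, b] encloses at least k eigenvalues of L."""
    lo, hi = 0.0, 2.0 * np.max(np.diag(L))  # 2*max degree bounds the spectrum
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if eigencount(L, mid) >= k:
            hi = mid
        else:
            lo = mid
    return hi
```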

Accelerated spectral algorithm

Given the $N$-node graph $G$ with adjacency matrix $W$:

1. Estimate $\lambda_k$, the $k$-th eigenvalue of $L$.

2. Generate $\eta$ random graph signals, stacked in a matrix $R \in \mathbb{R}^{N \times \eta}$.

3. Filter them with $\tilde{H}_{\lambda_k}$ and treat each node $i$ as a point in $\mathbb{R}^\eta$: $\tilde{f}_i^\top = \delta_i^\top \tilde{H}_{\lambda_k} R$.

4. Run k-means with the Euclidean distance $\tilde{D}_{ij} = \|\tilde{f}_i - \tilde{f}_j\|$ and obtain $k$ clusters.

Let's work on the second bottleneck: avoid running k-means on a possibly very large number $N$ of points (step 4).

Fast spectral algorithm?

Given the $N$-node graph $G$ with adjacency matrix $W$:

1. Estimate $\lambda_k$, the $k$-th eigenvalue of $L$.

2. Generate $\eta$ random graph signals, stacked in a matrix $R \in \mathbb{R}^{N \times \eta}$.

3. Filter them with $\tilde{H}_{\lambda_k}$ and treat each node $i$ as a point in $\mathbb{R}^\eta$: $\tilde{f}_i^\top = \delta_i^\top \tilde{H}_{\lambda_k} R$.

4. Randomly sample $\rho \simeq k \log k \ll N$ nodes out of $N$, with sampling matrix $M \in \mathbb{R}^{\rho \times N}$:

$$\tilde{f}^{\,r}_i = (M \tilde{H}_{\lambda_k} R)^\top \delta^r_i.$$

5. Run k-means in this reduced space with the Euclidean distance $\tilde{D}^r_{ij} = \|\tilde{f}^{\,r}_i - \tilde{f}^{\,r}_j\|$ and obtain $k$ clusters.

6. Interpolate the cluster indicator functions $c^r_l$ on the whole graph:

$$c_l = \arg\min_{x \in \mathbb{R}^N} \|Mx - c^r_l\|_2^2 + \mu\, x^\top L x.$$
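Step 6 is a ridge-type linear system; a sketch of one interpolation, with M a row-selection matrix (so $M^\top M$ is diagonal) and mu an assumed regularization weight:

```python
import numpy as np

def interpolate_indicator(L, sample_idx, c_r, mu=1e-2):
    """Solve min_x ||Mx - c_r||^2 + mu * x^T L x, where M selects
    the rows in sample_idx. Normal equations: (M^T M + mu*L) x = M^T c_r."""
    N = L.shape[0]
    MtM = np.zeros((N, N))
    MtM[sample_idx, sample_idx] = 1.0   # M^T M: ones where nodes were sampled
    rhs = np.zeros(N)
    rhs[sample_idx] = c_r               # M^T c_r: zero on unsampled nodes
    return np.linalg.solve(MtM + mu * L, rhs)
```

Each node is then assigned to the cluster whose interpolated indicator $c_l$ is largest.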

Compressive spectral clustering: a summary

1. generate a feature vector for each node by low-pass filtering a few random Gaussian signals on $G$;

2. subsample the set of nodes;

3. cluster the reduced set of nodes;

4. interpolate the cluster indicator vectors back to the complete graph.
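Putting the sketches together (reusing `poly_filter` and `interpolate_indicator` from above; `alpha`, `eta` and the default sampling size are assumptions, not the published parameter choices):

```python
import numpy as np
from sklearn.cluster import KMeans

def compressive_spectral_clustering(L, k, alpha, eta=50, rho=None, mu=1e-2):
    N = L.shape[0]
    rho = rho or max(int(k * np.log(k + 1)) + 1, k)
    # 1. features: low-pass filter eta random Gaussian signals
    R = np.random.randn(N, eta) / np.sqrt(eta)
    F = poly_filter(L, alpha, R)               # N x eta feature matrix
    # 2. subsample rho nodes
    idx = np.random.choice(N, size=rho, replace=False)
    # 3. cluster the reduced set of nodes
    labels_r = KMeans(n_clusters=k, n_init=10).fit_predict(F[idx])
    # 4. interpolate each indicator, assign each node by argmax
    C = np.column_stack([
        interpolate_indicator(L, idx, (labels_r == l).astype(float), mu)
        for l in range(k)])                    # N x k indicator estimates
    return C.argmax(axis=1)
```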

This work was done in collaboration with:

• Gilles PUY and Rémi GRIBONVAL from the PANAMA team (INRIA);

• Pierre VANDERGHEYNST from EPFL.

Part of this work has been published (or submitted):

• circumventing the first bottleneck has been accepted to ICASSP 2016;

• interpolation of $k$-bandlimited graph signals has been submitted to ACHA in November (an application of which helps us circumvent the second bottleneck).

Perspectives and difficult questions

Two difficult questions (among others):

1. Given a positive semi-definite matrix, how to estimate its $k$-th eigenvalue, and only that one, as fast as possible?

2. How to subsample $\rho$ nodes out of $N$ while ensuring that clustering them into $k$ classes yields the result one would have obtained by clustering all $N$ nodes?

Perspectives

1. What if nodes are added one by one?

2. Rational filters instead of polynomial filters?

3. Approximating other spectral clustering algorithms?
