February 19th, 2009, Dagstuhl seminar on "Similarity-based learning on structures", Schloss Dagstuhl, Germany
Topographic graph clustering with kernel and dissimilarity methods
Nathalie Villa-Vialaneix
http://www.nathalievilla.org
& Fabrice Rossi
University of Toulouse
Seminar on Similarity-based learning on structures, Dagstuhl, 16-20 February, 2009
Dagstuhl (16-20 February, 2009) Nathalie Villa & Fabrice Rossi Kernel & dissimilarity SOM 1 / 16
A short introduction
A weighted undirected graph $G$ with vertices $V = \{x_1, \ldots, x_n\}$ and weights $W$
A (rectangular) grid with neurons $\{1, \ldots, M\}$, a distance $d$ on the grid and prototypes $(p_j)$
Table of contents
1 Dissimilarity and kernel based organizing maps
2 Applications
Dissimilarity and kernel based organizing maps
SOM, dissimilarity SOM and kernel SOM
Original SOM algorithm (batch): $x_1, \ldots, x_n \in \mathbb{R}^d$
1 Initialization: initialize randomly $p_1^0, \ldots, p_M^0$ in $\mathbb{R}^d$
2 For $l = 1, \ldots, L$ do
3 Assignment: for all $i = 1, \ldots, n$ do
$$f^l(x_i) = \arg\min_{j=1,\ldots,M} \|x_i - p_j^{l-1}\|_{\mathbb{R}^d}$$
4 Representation: for all $j = 1, \ldots, M$,
$$p_j^l = \arg\min_{p \in \mathbb{R}^d} \sum_{i=1}^n h^l(f^l(x_i), j)\, \|x_i - p\|^2_{\mathbb{R}^d}$$
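The batch iteration above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the names `grid_coordinates` and `batch_som`, the Gaussian form of the neighborhood $h^l$ and its annealing schedule are not given on the slides. The representation step uses the fact that the minimizer of a weighted sum of squared Euclidean distances is the weighted mean.

```python
import numpy as np

def grid_coordinates(rows, cols):
    """Coordinates of the M = rows * cols neurons of a rectangular grid."""
    return np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)

def batch_som(X, rows, cols, n_iter=20, seed=0):
    """Batch SOM on the rows of X; returns (prototypes p^L, assignments f^L)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    coords = grid_coordinates(rows, cols)
    M = coords.shape[0]
    # Squared distances between neurons on the grid.
    grid_d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2)
    # Step 1: random initialization (assumption: M observations drawn at random).
    P = X[rng.choice(n, size=M, replace=n < M)].astype(float)
    sigma0 = max(rows, cols) / 2.0
    for l in range(1, n_iter + 1):
        # Step 3: assign each observation to its closest prototype.
        d2 = ((X[:, None, :] - P[None, :, :]) ** 2).sum(axis=2)
        f = d2.argmin(axis=1)
        # Assumed annealing: Gaussian neighborhood whose radius shrinks to 0.5.
        sigma = sigma0 * (0.5 / sigma0) ** ((l - 1) / max(n_iter - 1, 1))
        H = np.exp(-grid_d2 / (2.0 * sigma ** 2))  # H[u, j] = h^l(u, j)
        # Step 4: the weighted least-squares minimizer is the weighted mean.
        Wt = H[f]  # Wt[i, j] = h^l(f^l(x_i), j)
        P = (Wt.T @ X) / Wt.sum(axis=0)[:, None]
    return P, f
```

With a single neuron the neighborhood is constant, so the prototype reduces to the mean of the data, which gives a quick sanity check of the update.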
Dissimilarity SOM (batch): $x_i \in G$ described by a dissimilarity relation $\delta(x_i, x_j)$
1 Initialization: initialize randomly $p_1^0, \ldots, p_M^0$ in $(x_i)_i$
2 For $l = 1, \ldots, L$ do
3 Assignment: for all $i = 1, \ldots, n$ do
$$f^l(x_i) = \arg\min_{j=1,\ldots,M} \delta(x_i, p_j^{l-1})$$
4 Representation: for all $j = 1, \ldots, M$,
$$p_j^l = \arg\min_{p \in (x_i)_i} \sum_{i=1}^n h^l(f^l(x_i), j)\, \delta(x_i, p)$$
[Kohonen and Somervuo, 1998, Kohonen and Somervuo, 2002]
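Because prototypes are restricted to the observations themselves, the dissimilarity SOM only needs the matrix of pairwise dissimilarities. The sketch below is a minimal illustration, not the authors' implementation: the function name `dissimilarity_som`, the Gaussian neighborhood and the fixed radius `sigma` (no annealing) are assumptions made to keep the example short.

```python
import numpy as np

def dissimilarity_som(D, grid_d2, n_iter=10, sigma=1.0, seed=0):
    """Batch dissimilarity SOM.

    D       : (n, n) matrix of dissimilarities delta(x_i, x_j).
    grid_d2 : (M, M) squared grid distances between neurons.
    Returns (proto, f): the index of the observation used as prototype of
    each neuron, and the neuron assigned to each observation.
    """
    rng = np.random.default_rng(seed)
    n, M = D.shape[0], grid_d2.shape[0]
    # Step 1: prototypes are observations chosen at random.
    proto = rng.choice(n, size=M, replace=n < M)
    # Assumed neighborhood: Gaussian with a fixed radius.
    H = np.exp(-grid_d2 / (2.0 * sigma ** 2))
    for _ in range(n_iter):
        # Step 3: closest prototype for each observation.
        f = D[:, proto].argmin(axis=1)
        # Step 4: cost[j, c] = sum_i h(f(x_i), j) * delta(x_i, x_c),
        # minimized over candidate observations x_c.
        cost = H[f].T @ D
        proto = cost.argmin(axis=1)
    return proto, f
```

With one neuron the representation step returns the observation with the smallest total dissimilarity to all others (a medoid), which is an easy case to check by hand.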
Relational Topographic Mapping or Kernel SOM (batch): $x_i \in G$ described by a kernel relation $K(x_i, x_j)$
$\Rightarrow \exists\, \phi : G \to (\mathcal{H}, \langle ., . \rangle_{\mathcal{H}})$ such that $K(x, x') = \langle \phi(x), \phi(x') \rangle_{\mathcal{H}}$
1 Initialization: initialize randomly $p_j^0 = \sum_{i=1}^n \gamma_{ji}^0 \phi(x_i)$
2 For $l = 1, \ldots, L$ do
3 Assignment: for all $i = 1, \ldots, n$ do
$$f^l(x_i) = \arg\min_{j=1,\ldots,M} \|\phi(x_i) - p_j^{l-1}\|_{\mathcal{H}}$$
4 Representation: for all $j = 1, \ldots, M$,
$$\gamma_j^l = \arg\min_{\gamma \in \mathbb{R}^n} \sum_{i=1}^n h^l(f^l(x_i), j) \left\| \phi(x_i) - \sum_{k=1}^n \gamma_k \phi(x_k) \right\|^2_{\mathcal{H}}$$
Written with the kernel only (the term $K(x_i, x_i)$ does not depend on $j$ and is dropped), steps 3 and 4 become:
3 Assignment: for all $i = 1, \ldots, n$ do
$$f^l(x_i) = \arg\min_{j=1,\ldots,M} \sum_{k,k'=1}^n \gamma_{jk}^{l-1} \gamma_{jk'}^{l-1} K(x_k, x_{k'}) - 2 \sum_{k=1}^n \gamma_{jk}^{l-1} K(x_i, x_k)$$
4 Representation: for all $j = 1, \ldots, M$ and $k = 1, \ldots, n$,
$$\gamma_{jk}^l = \frac{h^l(f^l(x_k), j)}{\sum_{k'=1}^n h^l(f^l(x_{k'}), j)}$$
[Villa and Rossi, 2007, Hammer and Hasenfuss, 2007]
Online versions by [Lau et al., 2006]
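The kernel form of the two steps translates directly into matrix operations on the kernel matrix and the coefficient matrix $(\gamma_{jk})$. The following is a hedged sketch: the function name `kernel_som`, the random convex initialization of the coefficients and the fixed-radius Gaussian neighborhood are assumptions, not details taken from the slides.

```python
import numpy as np

def kernel_som(K, grid_d2, n_iter=10, sigma=1.0, seed=0):
    """Batch kernel SOM.

    K       : (n, n) kernel matrix K(x_i, x_k).
    grid_d2 : (M, M) squared grid distances between neurons.
    Returns (G, f) where G[j, k] = gamma_jk and f[i] is the neuron of x_i.
    """
    rng = np.random.default_rng(seed)
    n, M = K.shape[0], grid_d2.shape[0]
    # Step 1 (assumption): random convex combinations of the phi(x_i).
    G = rng.random((M, n))
    G /= G.sum(axis=1, keepdims=True)
    # Assumed neighborhood: Gaussian with a fixed radius.
    H = np.exp(-grid_d2 / (2.0 * sigma ** 2))
    for _ in range(n_iter):
        # Step 3: ||p_j||^2 - 2 <phi(x_i), p_j>, computed from K only.
        norm2 = np.einsum("jk,kl,jl->j", G, K, G)
        score = norm2[None, :] - 2.0 * (K @ G.T)  # score[i, j]
        f = score.argmin(axis=1)
        # Step 4: gamma_jk proportional to h(f(x_k), j), normalized over k.
        G = H[f].T
        G = G / G.sum(axis=1, keepdims=True)
    return G, f
```

Each row of `G` is non-negative and sums to one, so every prototype stays a convex combination of the mapped observations $\phi(x_k)$.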
Which kernels?
For a graph with vertices $V = \{x_1, \ldots, x_n\}$ and weights $(w_{i,j})_{i,j=1,\ldots,n}$ (positive, symmetric), the Laplacian [Kondor and Lafferty, 2002] is $L = (L_{i,j})_{i,j=1,\ldots,n}$ where
$$L_{i,j} = \begin{cases} -w_{i,j} & \text{if } i \neq j \\ d_i = \sum_{j \neq i} w_{i,j} & \text{if } i = j \end{cases}$$
1 Diffusion matrix [Kondor and Lafferty, 2002]: for $\beta > 0$, $K^\beta = e^{-\beta L} = \sum_{k=0}^{+\infty} \frac{(-\beta L)^k}{k!}$
$\Rightarrow$ $K^\beta : V \times V \to \mathbb{R}$, $(x_i, x_j) \mapsto K^\beta_{i,j}$ is the heat kernel (or diffusion kernel).
2 Generalized inverse of the Laplacian [Fouss et al., 2007]: $K = L^+$.
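Both kernels can be computed from the weight matrix with standard linear algebra. The sketch below is an illustration with hypothetical function names (`graph_laplacian`, `heat_kernel`, `laplacian_pseudoinverse`); it assumes a symmetric weight matrix and ignores any self-loops. The heat kernel is obtained from the eigendecomposition of the symmetric Laplacian rather than from the power series.

```python
import numpy as np

def graph_laplacian(W):
    """Laplacian L = D - W of a symmetric weight matrix (self-loops ignored)."""
    W = np.asarray(W, dtype=float)
    W = W - np.diag(np.diag(W))
    return np.diag(W.sum(axis=1)) - W

def heat_kernel(W, beta):
    """Diffusion kernel exp(-beta * L), via the eigendecomposition of L."""
    lam, U = np.linalg.eigh(graph_laplacian(W))
    # Scale each eigenvector (column of U) by exp(-beta * eigenvalue).
    return (U * np.exp(-beta * lam)) @ U.T

def laplacian_pseudoinverse(W):
    """Moore-Penrose generalized inverse L^+ of the Laplacian."""
    return np.linalg.pinv(graph_laplacian(W))
```

Since the rows of the Laplacian sum to zero, the rows of the heat kernel sum to one, which is a convenient check on any implementation.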
Applications
Tuning the parameters
From a practical point of view, how should the following be tuned?
• Parameter of the kernel (if there is one);
• Size of the grid;
• Other parameters: kind of neighborhood on the grid, annealing scheme of the SOM, ...
Quality criteria that can be used
• Kaski-Lagus quality criterion:
$$KL = \frac{1}{n} \sum_{i=1}^n \left( \|\phi(x_i) - p^L_{f^L(x_i)}\| + \min_{(j_0,\ldots,j_q) \in C_i} \sum_{k=0}^{q-1} \|p^L_{j_k} - p^L_{j_{k+1}}\| \right)$$
where $j_0$ is the best matching unit for $x_i$, $j_q$ is the second best matching unit for $x_i$ and $C_i$ is the set of paths from $j_0$ to $j_q$ made of direct neighbors on the prior structure.
Advantage: organization quality criterion.
Drawback: cannot be used to choose the type of kernel or to tune its parameter.
• Modularity:
$$Q = \frac{1}{2m} \sum_{i,j=1}^n \left( w_{ij} - \frac{d_i d_j}{2m} \right) \mathbb{I}_{\{f^L(x_i) = f^L(x_j)\}}$$
Advantage: true quality measure for graphs (can be used to tune all parameters).
Drawback: clustering measure only; it says nothing about the organization quality.
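The modularity formula can be evaluated directly from the weight matrix and the final assignment. The sketch below is a minimal illustration (the function name `modularity` is hypothetical); it assumes a symmetric weight matrix and takes $d_i = \sum_j w_{ij}$ and $2m = \sum_i d_i$, the usual convention for this criterion, which the slide does not spell out.

```python
import numpy as np

def modularity(W, labels):
    """Modularity Q of the partition `labels` for the weight matrix W."""
    W = np.asarray(W, dtype=float)
    labels = np.asarray(labels)
    d = W.sum(axis=1)          # degrees d_i
    two_m = d.sum()            # 2m = total degree
    # same[i, j] is True when x_i and x_j are in the same cluster.
    same = labels[:, None] == labels[None, :]
    return ((W - np.outer(d, d) / two_m) * same).sum() / two_m
```

For a graph made of two disjoint edges, putting each edge in its own cluster gives $Q = 0.5$, while a single cluster containing all vertices gives $Q = 0$.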
A first example: a medieval social network
Example from [Boulet et al., 2008]. In Cahors (Lot, France) is kept a large corpus of 5000 agrarian contracts
• coming from 4 seigneuries (about 25 small villages) in the south west of France;
• established between 1240 and 1520 (just before and after the Hundred Years' War).
Simplification of this network by kernel SOM
Kernel SOM with heat kernel ($\beta$ chosen with modularity); the size of the grid and the kind of initialization (random vs PCA) are optimized with the Kaski-Lagus criterion.
Modularity : 0.597 ; KL = 0.531
A brief comparison with spectral clustering
                               Kernel SOM   Spectral clustering   Spectral clustering
                                            (first slide)         (second slide)
Number of clusters:            35           50                    29
Maximum size of the clusters:  255          268                   325
Modularity:                    0.597        0.420                 0.433
Organizing map as a basis for a full representation of the graph [Truong et al., 2007]
Classification and drawing criteria
The previous tuning method is unsatisfactory: a method is needed to evaluate the organization of the graph on the map as well as its clustering.
We propose to combine a graph drawing quality measure and a graph clustering quality measure:
1 the percentage of pairs of edges that cross on the map;
2 the modularity.
Combining these two measures gives Pareto points that can help the user make the tradeoff between the quality of the representation of the graph on the map and the quality of the clustering.
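The first criterion can be illustrated with a simple geometric test. The sketch below rests on assumptions that the slides do not state: every vertex has a 2D position on the map, edges are drawn as straight segments, and two edges sharing an endpoint are not counted as crossing. The names `segments_cross` and `crossing_fraction` are hypothetical.

```python
from itertools import combinations

def _orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a)."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)

def segments_cross(p, q, r, s):
    """True if segments pq and rs intersect at a single interior point."""
    return (_orient(p, q, r) * _orient(p, q, s) < 0
            and _orient(r, s, p) * _orient(r, s, q) < 0)

def crossing_fraction(pos, edges):
    """Fraction of pairs of edges that cross, given vertex positions `pos`."""
    pairs = list(combinations(edges, 2))
    if not pairs:
        return 0.0
    crossed = sum(
        1
        for (a, b), (c, d) in pairs
        # Edges with a common endpoint are skipped (assumption).
        if len({a, b, c, d}) == 4
        and segments_cross(pos[a], pos[b], pos[c], pos[d])
    )
    return crossed / len(pairs)
```

On a unit square, the two diagonals form one pair of edges and they cross, so the fraction is 1; two opposite sides give 0.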
A second example: a collaboration network
Example from [Newman, 2006]: a weighted connected graph with 379 vertices.
Results shown on the successive maps:
Map    Modularity    % cut edges
1      0.827         0.094
2      0.816         0.005
3      0.771         0.044
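The three (modularity, % cut edges) pairs reported for this example can serve to illustrate how Pareto points are selected. This is a sketch under an assumption: the reported "% cut edges" is treated as the drawing criterion to be minimized, while modularity is maximized. The function name `pareto_points` is hypothetical.

```python
def pareto_points(candidates):
    """Keep the (modularity, cut) pairs that no other pair dominates.

    A pair dominates another when its modularity is at least as high and
    its cut value at least as low, with one of the two strictly better.
    """
    front = []
    for i, (q_i, c_i) in enumerate(candidates):
        dominated = any(
            q_j >= q_i and c_j <= c_i and (q_j > q_i or c_j < c_i)
            for j, (q_j, c_j) in enumerate(candidates)
            if j != i
        )
        if not dominated:
            front.append((q_i, c_i))
    return front

# Values read from the slides for the collaboration network.
maps = [(0.827, 0.094), (0.816, 0.005), (0.771, 0.044)]
print(pareto_points(maps))  # keeps the first two pairs
```

The third pair is dominated by the second one (higher modularity and lower cut value), so only the first two remain as candidate tradeoffs for the user.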
Conclusion
• Organizing maps for graphs with several hundred vertices.
• Based on kernel methods.
• Simplified representation of the graph.
• Tuning the parameters of the algorithm is very important to obtain meaningful maps.
• Which kernel? Sometimes the heat kernel seems to work well and sometimes it provides poor solutions.
References
Boulet, R., Jouve, B., Rossi, F., and Villa, N. (2008). Batch kernel SOM and related Laplacian methods for social network analysis. Neurocomputing, 71(7-9):1257–1273.
Fouss, F., Pirotte, A., Renders, J., and Saerens, M. (2007). Random-walk computation of similarities between nodes of a graph, with application to collaborative recommendation. IEEE Transactions on Knowledge and Data Engineering, 19(3):355–369.
Hammer, B. and Hasenfuss, A. (2007). Relational topographic maps. Technical Report IfI-07-01, Clausthal University of Technology.
Kohonen, T. and Somervuo, P. (1998). Self-organizing maps of symbol strings. Neurocomputing, 21:19–30.
Kohonen, T. and Somervuo, P. (2002). How to make large self-organizing maps for nonvectorial data. Neural Networks, 15(8):945–952.
Kondor, R. and Lafferty, J. (2002). Diffusion kernels on graphs and other discrete structures. In Proceedings of the 19th International Conference on Machine Learning, pages 315–322.
Lau, K., Yin, H., and Hubbard, S. (2006). Kernel self-organising maps for classification. Neurocomputing, 69:2033–2040.
Newman, M. (2006). Finding community structure in networks using the eigenvectors of matrices. Physical Review E, 74:036104.
Truong, Q., Dkaki, T., and Charrel, P. (2007). An energy model for the drawing of clustered graphs. In Proceedings of Vème colloque international VSST, Marrakech, Maroc.
Villa, N. and Rossi, F. (2007). A comparison between dissimilarity SOM and kernel SOM for clustering the vertices of a graph. In Proceedings of the 6th Workshop on Self-Organizing Maps (WSOM 07), Bielefeld, Germany.