Pattern Recognition and Machine Learning Summer School 2014

Spectral clustering


Page 1: Spectral clustering

Pattern Recognition and Machine Learning Summer School 2014

Page 2: Spectral clustering

• Hierarchical methods

– Agglomerative clustering

– Divisive clustering

• Iterative methods

– k‐means clustering

– EM algorithm

– Mean‐shift algorithm

• Spectral clustering

– Normalized cut

– Ratio cut

– Graph‐cut

Page 3: Spectral clustering

Clustering based on the spectrum of the graph, i.e., the multiset of eigenvalues of the graph Laplacian matrix.

Treats clustering as a graph partitioning problem, without making specific assumptions about the form of the clusters.

Clusters points using eigenvectors of matrices derived from the data.

Maps the data to a low‐dimensional space in which the clusters are well separated and can be easily found.

L = D (degree matrix) − W (adjacency matrix)
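
A minimal sketch of this construction (NumPy; the toy graph below is illustrative, not from the slides):

import numpy as np

def unnormalized_laplacian(W):
    # Degree matrix D: d_i is the total edge weight incident on node i.
    D = np.diag(W.sum(axis=1))
    # Unnormalized graph Laplacian L = D - W.
    return D - W

# Toy graph: two tightly connected pairs, weakly linked to each other.
W = np.array([[0.0, 1.0, 0.1, 0.0],
              [1.0, 0.0, 0.0, 0.1],
              [0.1, 0.0, 0.0, 1.0],
              [0.0, 0.1, 1.0, 0.0]])
L = unnormalized_laplacian(W)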

Page 4: Spectral clustering

The edge weight w_ij encodes the affinity or similarity of the two nodes i and j.

• Affinity matrix

• Laplacian matrix

• Similarity measures

– Cosine measure

– Bhattacharyya coefficient

• Distance measures

– Euclidean distance

– Manhattan distance

– Maximum distance …
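
As an illustration, one common way to build the affinity matrix combines the Euclidean distance above with a Gaussian kernel; the bandwidth sigma is a hypothetical parameter, not fixed by the slides:

import numpy as np

def gaussian_affinity(X, sigma=1.0):
    # Pairwise squared Euclidean distances between the rows of X.
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    # Affinity w_ij = exp(-||x_i - x_j||^2 / (2 sigma^2)).
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    # No self-loops: zero the diagonal.
    np.fill_diagonal(W, 0.0)
    return W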

Page 5: Spectral clustering

Average association: find a label vector x.

But the discrete labeling problem is NP‐hard, so convert it to the continuous domain.
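
In one standard formulation (Shi and Malik, 2000), the average association objective and its continuous relaxation read:

\[
\max_{A,B}\ \frac{\operatorname{assoc}(A,A)}{|A|} + \frac{\operatorname{assoc}(B,B)}{|B|},
\qquad
\operatorname{assoc}(A,A) = \sum_{i \in A,\ j \in A} w_{ij}.
\]

Relaxing the binary labels x to real values turns this into

\[
\max_{x}\ \frac{x^{\top} W x}{x^{\top} x},
\]

which is maximized by the eigenvector of W with the largest eigenvalue.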

Page 6: Spectral clustering

The label vector x is divided into 0 and 1: in the relaxed solution, the entries corresponding to points in the dominant cluster are non‐zero.

Page 7: Spectral clustering

Cut criterion: the sum of the weights of the cut edges.

But it favors small and isolated clusters.
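
Written out, the criterion being minimized is

\[
\operatorname{cut}(G_1, G_2) = \sum_{i \in G_1,\ j \in G_2} w_{ij}.
\]

Since only the cut edges are counted, detaching a single outlier node can already minimize the cost, which explains the bias.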

Page 8: Spectral clustering

Find the second smallest eigenvector.

cut(G1, G2) = assoc(G1, G) − assoc(G1, G1)

(D − W) 1 = 0 · 1

The eigenvector of the smallest eigenvalue is the all‐ones vector 1, so it carries no partitioning information; the second smallest eigenvector gives the partition.
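
A minimal sketch of this bipartition step, assuming a dense symmetric affinity matrix W (the sign-based split of the Fiedler vector is one common choice):

import numpy as np

def fiedler_bipartition(W):
    # Unnormalized Laplacian L = D - W.
    D = np.diag(W.sum(axis=1))
    L = D - W
    # For symmetric matrices, eigh returns eigenvalues in ascending order;
    # vecs[:, 0] is the constant vector with eigenvalue 0.
    vals, vecs = np.linalg.eigh(L)
    # The second smallest eigenvector (Fiedler vector) encodes the split.
    fiedler = vecs[:, 1]
    return fiedler >= 0  # boolean cluster labels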

Page 9: Spectral clustering

y: binary vector representing the cluster association.

Normalized cut is based on the edge weights and favors partitioning into equal‐size segments.

Exact minimization is NP‐complete, so relax to the continuous domain and take the eigenvector of the second smallest eigenvalue.

Find z in D^{−1/2}(D − W)D^{−1/2} z = λz, where z = D^{1/2} y.
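
A sketch of the relaxed normalized-cut solution, assuming SciPy is available (scipy.linalg.eigh also solves the generalized symmetric problem) and that all node degrees are positive:

import numpy as np
from scipy.linalg import eigh

def ncut_bipartition(W):
    D = np.diag(W.sum(axis=1))
    L = D - W
    # Generalized eigenproblem (D - W) y = lambda D y, eigenvalues ascending.
    vals, vecs = eigh(L, D)
    # The eigenvector of the second smallest eigenvalue is the relaxed y.
    y = vecs[:, 1]
    # Threshold the continuous solution back to a binary labeling.
    return y >= 0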

Page 10: Spectral clustering

Pros

• Generic framework; can be used with many different features.

Cons

• High storage requirement and time complexity.

• Bias towards partitioning into equal segments.

• Needs the number of clusters as a parameter.

Page 11: Spectral clustering

• Incremental partitioning

– Partition using only one eigenvector at a time.

– Apply the procedure recursively.

• Batch partitioning

– Use k eigenvectors.

– Directly compute a k‐way partitioning, for example by k‐means clustering (see the sketch after this list).

– Usually performs better.
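
A sketch of the batch variant in the spirit of Ng, Jordan, and Weiss (2002), assuming scikit-learn's KMeans for the final step; the slides do not fix this exact recipe:

import numpy as np
from sklearn.cluster import KMeans

def batch_spectral_clustering(W, k):
    # Symmetrically normalized Laplacian: L_sym = I - D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # Embedding: the k eigenvectors with the smallest eigenvalues.
    vals, vecs = np.linalg.eigh(L_sym)
    U = vecs[:, :k]
    # Normalize each row to unit length, then cluster in the embedding.
    U = U / np.linalg.norm(U, axis=1, keepdims=True)
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)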

Page 12: Spectral clustering

Find a low‐dimensional embedding by eigen‐decomposition.

The embedding separates the data in the low‐dimensional space, which allows effective clustering of non‐convex data.
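
As a usage sketch, scikit-learn's SpectralClustering packages these steps; two interleaved half-moons are a standard non-convex test case:

from sklearn.cluster import SpectralClustering
from sklearn.datasets import make_moons

# Two non-convex, interleaved clusters that k-means alone cannot separate.
X, _ = make_moons(n_samples=300, noise=0.05, random_state=0)
labels = SpectralClustering(n_clusters=2, affinity='nearest_neighbors',
                            n_neighbors=10, random_state=0).fit_predict(X)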

Page 13: Spectral clustering

Thank you!