A Tutorial on Spectral Clustering
Ulrike von Luxburg, Max Planck Institute for Biological Cybernetics
Statistics and Computing, Dec. 2007, Vol. 17, No. 4
Presented by Yongjin Kwon, 2011-07-22


Page 1: A Tutorial on Spectral Clustering

A Tutorial on Spectral Clustering

Ulrike von Luxburg

Max Planck Institute for Biological Cybernetics

Statistics and Computing, Dec. 2007, Vol. 17, No. 4

2011-07-22

Presented by Yongjin Kwon

Page 2: A Tutorial on Spectral Clustering

Outline

Introduction

Spectral Clustering Algorithms

Two Explanations of Spectral Clustering

Graph Partitioning Point of View

Random Walks Point of View

Conclusion


Page 3: A Tutorial on Spectral Clustering

Introduction

Clustering Algorithms

k-means / k-means++

Mixture of Gaussians (MoG)

Hierarchical Clustering (Centroid-based, MST, Average Distance)

DBSCAN

ROCK

BIRCH

CURE


Page 4: A Tutorial on Spectral Clustering

Introduction (Cont’d)

Spectral Clustering

A simple but powerful method of clustering

Requires fewer assumptions about the form of the clusters

Often outperforms traditional approaches such as k-means clustering


(http://www.squobble.com/academic/publications/FFF_MIMO/node4.html)

Page 5: A Tutorial on Spectral Clustering

Introduction (Cont’d)

Spectral Clustering?

Spectrum

Spectral analysis

– Scientific or mathematical methods of analyzing something, such as light or waves, and finding the basis for them


Page 6: A Tutorial on Spectral Clustering

Introduction (Cont’d)

Spectral Clustering?

Spectral analysis in linear algebra

– Basic features of matrices: eigenpairs (eigenvalue, eigenvector)

– Methods of using the eigenpairs to solve given problems

Spectral Clustering!

Methods of using the eigenvectors of certain matrices derived from the data to find a partition of the data such that points in the same group are similar


Page 7: A Tutorial on Spectral Clustering

Spectral Clustering Algorithms

Similarity Graphs

ε-neighborhood graph

– Connect all points whose pairwise distances are smaller than ε.

k-nearest neighbor graph

– Connect two points if one is among the k nearest neighbors of the other (for the mutual k-nearest neighbor graph, require that each is among the k nearest neighbors of the other).

– Each edge is weighted by the similarity of their endpoints.

fully connected graph

– Connect all pairs of points and weight all edges by the similarity of their endpoints (a construction sketch follows below).
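To make these constructions concrete, here is a minimal NumPy sketch of a k-nearest-neighbor similarity graph. This is not from the slides: the function name, the brute-force O(n²) distance computation, and the Gaussian edge weighting with bandwidth sigma are illustrative choices.

```python
import numpy as np

def knn_similarity_graph(X, k, sigma, mutual=False):
    """Sketch of a (mutual) k-nearest-neighbor similarity graph.

    X, k and sigma are placeholders; distances use an O(n^2)
    broadcast for clarity, not speed. Edges are weighted by the
    Gaussian similarity of their endpoints.
    """
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)                 # no self-edges
    kth = np.sort(dist, axis=1)[:, k - 1][:, None] # distance to the k-th neighbor
    nn = dist <= kth                               # nn[i, j]: j among k nearest of i
    adj = (nn & nn.T) if mutual else (nn | nn.T)   # mutual vs. ordinary kNN graph
    W = np.where(adj, np.exp(-dist ** 2 / (2 * sigma ** 2)), 0.0)
    return W
```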


Page 8: A Tutorial on Spectral Clustering

Spectral Clustering Algorithms (Cont’d)

In spectral clustering, the Gaussian similarity function s(x_i, x_j) = exp(-||x_i - x_j||^2 / (2σ^2)) is commonly used to represent local neighborhood relationships.

W = (w_ij) : adjacency matrix of the similarity graph

D = diag(d_1, …, d_n) : degree matrix, where d_i = Σ_j w_ij
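As an illustration (the toy data and σ are placeholders, not from the slides), a minimal NumPy sketch of W and D for the fully connected graph with Gaussian weights:

```python
import numpy as np

def gaussian_similarity_graph(X, sigma):
    """Fully connected graph with w_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)                      # convention: no self-loops
    return W

X = np.random.default_rng(0).normal(size=(8, 2))  # toy data
W = gaussian_similarity_graph(X, sigma=1.0)       # adjacency matrix of the graph
D = np.diag(W.sum(axis=1))                        # degree matrix, d_i = sum_j w_ij
```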


Page 9: A Tutorial on Spectral Clustering

Spectral Clustering Algorithms (Cont’d)

Graph Laplacian

unnormalized graph Laplacian : L = D − W

normalized graph Laplacians :

L_sym = D^{-1/2} L D^{-1/2} = I − D^{-1/2} W D^{-1/2}

L_rw = D^{-1} L = I − D^{-1} W (related to random walks)

Example: [graph figure omitted; all edge weights are assumed to be 1]
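For reference, a small NumPy helper (the function name is mine) computing the three Laplacians exactly as defined above, assuming a symmetric W in which every vertex has positive degree:

```python
import numpy as np

def graph_laplacians(W):
    """Return L, L_sym and L_rw for a symmetric weight matrix W
    whose vertices all have positive degree."""
    d = W.sum(axis=1)
    L = np.diag(d) - W                      # unnormalized: L = D - W
    d_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_sym = d_inv_sqrt @ L @ d_inv_sqrt     # symmetric: D^{-1/2} L D^{-1/2}
    L_rw = np.diag(1.0 / d) @ L             # random walk: D^{-1} L
    return L, L_sym, L_rw
```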

Page 10: A Tutorial on Spectral Clustering

Spectral Clustering Algorithms (Cont’d)

Unnormalized Spectral Clustering
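The algorithm box on the original slide did not survive the export. Following the tutorial, the unnormalized algorithm computes L = D − W, takes the first k eigenvectors of L as the columns of a matrix U, and runs k-means on the rows of U. A minimal sketch (scikit-learn's KMeans is my choice of k-means implementation):

```python
import numpy as np
from sklearn.cluster import KMeans

def unnormalized_spectral_clustering(W, k):
    """Embed with the first k eigenvectors of L = D - W,
    then cluster the rows of the embedding with k-means."""
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)            # eigenvalues in ascending order
    U = vecs[:, :k]                        # first k eigenvectors as columns
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)
```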


Page 11: A Tutorial on Spectral Clustering

Spectral Clustering Algorithms (Cont’d)

Normalized Spectral Clustering [Shi2000]
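This slide's algorithm box is likewise missing. Per the tutorial, [Shi2000] differs from the unnormalized version only in the embedding: it uses the first k generalized eigenvectors of Lu = λDu (equivalently, eigenvectors of L_rw). A sketch under the same assumptions as above:

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def spectral_clustering_shi_malik(W, k):
    """Embed with the first k generalized eigenvectors of L u = lambda D u
    (the eigenvectors of L_rw), then k-means on the rows."""
    D = np.diag(W.sum(axis=1))
    L = D - W
    _, vecs = eigh(L, D)                   # generalized symmetric eigenproblem
    U = vecs[:, :k]
    return KMeans(n_clusters=k, n_init=10).fit_predict(U)
```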


Page 12: A Tutorial on Spectral Clustering

Spectral Clustering Algorithms (Cont’d)

Normalized Spectral Clustering [Ng2002]
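Again as a sketch of the missing algorithm box: [Ng2002] embeds with the first k eigenvectors of L_sym and adds one extra step, normalizing each row of the embedding to unit length before k-means:

```python
import numpy as np
from sklearn.cluster import KMeans

def spectral_clustering_ng(W, k):
    """Embed with the first k eigenvectors of L_sym, normalize the rows
    to unit length, then k-means on the normalized rows."""
    d = W.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L_sym = np.eye(len(d)) - d_inv_sqrt @ W @ d_inv_sqrt
    _, vecs = np.linalg.eigh(L_sym)
    U = vecs[:, :k]
    T = U / np.linalg.norm(U, axis=1, keepdims=True)  # the extra normalization step
    return KMeans(n_clusters=k, n_init=10).fit_predict(T)
```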


Page 13: A Tutorial on Spectral Clustering

Two Explanations of Spectral Clustering

Graph partitioning point of view

Based on mincut problem in a similarity graph

Find a partition such that the edges between clusters have very low weight and the edges within a cluster have high weight.

Random walks point of view

Based on random walks on the similarity graph

Find a partition such that the random walk stays long within the same cluster and seldom jumps to other clusters.


Page 14: A Tutorial on Spectral Clustering

Graph Partitioning Point of View

Mincut problem

Given a number k, find a partition A_1, …, A_k which minimizes

cut(A_1, …, A_k) = ½ Σ_{i=1}^{k} W(A_i, Ā_i), where W(A, B) = Σ_{i∈A, j∈B} w_ij

In practice, the mincut problem often results in a size imbalance in the partition (typically it just separates an individual vertex from the rest of the graph).

Page 15: A Tutorial on Spectral Clustering

Graph Partitioning Point of View (Cont’d)

Two common objective functions

RatioCut(A_1, …, A_k) = Σ_{i=1}^{k} cut(A_i, Ā_i) / |A_i|

Ncut(A_1, …, A_k) = Σ_{i=1}^{k} cut(A_i, Ā_i) / vol(A_i), where vol(A) = Σ_{i∈A} d_i

Normalized mincut problem: given a number k, find a partition A_1, …, A_k which minimizes RatioCut(A_1, …, A_k) or Ncut(A_1, …, A_k).
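For k = 2 (a set A and its complement), both objectives can be evaluated directly. A sketch, where mask is a boolean indicator of A (the helper names are mine):

```python
import numpy as np

def cut_value(W, mask):
    """cut(A, A-bar): total weight of edges leaving A (mask indicates A)."""
    return W[np.ix_(mask, ~mask)].sum()

def ratio_cut(W, mask):
    c = cut_value(W, mask)
    return c / mask.sum() + c / (~mask).sum()      # balanced by sizes |A|, |A-bar|

def ncut(W, mask):
    d = W.sum(axis=1)
    c = cut_value(W, mask)
    return c / d[mask].sum() + c / d[~mask].sum()  # balanced by volumes vol(A), vol(A-bar)
```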

Page 16: A Tutorial on Spectral Clustering

Graph Partitioning Point of View (Cont’d)

Represent a partition A_1, …, A_k by k indicator vectors h_1, …, h_k, where h_{i,j} = 1 if v_i ∈ A_j and h_{i,j} = 0 otherwise.

Relationship between the mincut problem and the graph Laplacian: for each indicator vector, h_j^T L h_j = cut(A_j, Ā_j).

The mincut problem is converted into

min Tr(H^T L H) over all indicator matrices H = (h_1, …, h_k)

Page 17: A Tutorial on Spectral Clustering

Graph Partitioning Point of View (Cont’d)

Rescaling the indicator vectors, the normalized mincut problems can likewise be expressed with the graph Laplacian:

RatioCut (h_{i,j} = 1/√|A_j| if v_i ∈ A_j): minimize Tr(H^T L H) subject to H^T H = I

Ncut (h_{i,j} = 1/√vol(A_j) if v_i ∈ A_j): minimize Tr(H^T L H) subject to H^T D H = I

Page 18: A Tutorial on Spectral Clustering

Graph Partitioning Point of View (Cont’d)

The optimization problems are NP-hard.

Relax the discreteness condition on the vectors h_1, …, h_k, allowing H to take arbitrary real values.

By the Rayleigh–Ritz theorem, the solutions of the above problems are the matrices containing the first k eigenvectors (those with the k smallest eigenvalues) of the unnormalized graph Laplacian L and of the normalized graph Laplacian (equivalently, the first k generalized eigenvectors of Lu = λDu), respectively.

Page 19: A Tutorial on Spectral Clustering

Graph Partitioning Point of View (Cont’d)

Note that the solutions of the relaxed optimization problems do NOT directly indicate which nodes are included in which groups!

However, we hope that if the data are well separated, the eigenvectors of the graph Laplacians are close to piecewise constant (and hence close to indicator vectors).

Thinking of the rows of the solution matrices as another representation of the data points, k-means clustering is a way of finding an appropriate group for each point.

Page 20: A Tutorial on Spectral Clustering

Random Walks Point of View

Transition Probability Matrix

The random walk on the similarity graph has transition matrix P = D^{-1} W, i.e. p_ij = w_ij / d_i.

If the graph is connected and non-bipartite, then the random walk always possesses a unique stationary distribution π = (π_1, …, π_n), with π_i = d_i / vol(V).

Example: [graph figure omitted; all edge weights are assumed to be 1]
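A minimal sketch of both quantities (the helper names are mine; W is the weight matrix as before):

```python
import numpy as np

def transition_matrix(W):
    """P = D^{-1} W, i.e. p_ij = w_ij / d_i."""
    return W / W.sum(axis=1, keepdims=True)

def stationary_distribution(W):
    """pi_i = d_i / vol(V); unique when the graph is connected and non-bipartite."""
    d = W.sum(axis=1)
    return d / d.sum()

# Sanity check of stationarity, pi P = pi:
# pi = stationary_distribution(W); assert np.allclose(pi @ transition_matrix(W), pi)
```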

Page 21: A Tutorial on Spectral Clustering

Random Walks Point of View (Cont’d)

Relationship between the Ncut problem and random walks

For the random walk started in its stationary distribution, define, for disjoint vertex sets A and B, P(B | A) = P(X_1 ∈ B | X_0 ∈ A).

The problem of finding a partition such that the random walk does not have many opportunities to jump between clusters is equivalent to the Ncut problem, due to: Ncut(A, Ā) = P(Ā | A) + P(A | Ā).

Relationship between L_rw and P: L_rw = I − P, so u is an eigenvector of L_rw with eigenvalue λ if and only if u is an eigenvector of P with eigenvalue 1 − λ.

Page 22: A Tutorial on Spectral Clustering

Conclusion

Spectral clustering has been made popular by several researchers and has been extended to many non-standard settings.

Spectral clustering has many advantages.

Requires fewer assumptions about the form of the clusters

Is simple and efficient to implement

Does not suffer from getting stuck in local minima

However, there are some issues when applying it.

It is not trivial to choose a good similarity graph

Results can be unstable under different parameter settings (e.g., ε, k, or σ)