University of Joensuu, Dept. of Computer Science, P.O. Box 111, FIN-80101 Joensuu, Tel. +358 13 251 7959, fax +358 13 251 7955, www.cs.joensuu.fi

K-MST-based clustering

Caiming Zhong

Pasi Fränti

Outline

• Minimum spanning tree (MST)

• MST-based clustering

• K-MST

• K-MST-based clustering

• Fast approximate MST

Minimum Spanning Tree

• Spanning tree

[Figure: a given graph, a spanning tree of it, and a non-spanning tree]

Minimum Spanning Tree

• Minimize the sum of edge weights (Kruskal's or Prim's algorithm)

Given graph G = (V, E) with MST T:

$$w(T) = \sum_{(u,v) \in T} w(u,v)$$
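As a minimal sketch of how the sum of weights can be minimized, the following uses Kruskal's algorithm with a union-find structure. The function name `kruskal_mst`, the `(weight, u, v)` edge format, and the small example graph are assumptions made for illustration; they do not come from the slides.

```python
def kruskal_mst(num_vertices, edges):
    """Return (total_weight, tree_edges) for a connected, undirected graph.

    edges: iterable of (weight, u, v), vertices numbered 0..num_vertices-1.
    """
    parent = list(range(num_vertices))

    def find(x):
        # Path halving keeps the union-find trees shallow.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree, total = [], 0.0
    for w, u, v in sorted(edges):      # cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                   # edge joins two components, so no cycle
            parent[ru] = rv
            tree.append((u, v, w))
            total += w
    return total, tree


# Hypothetical 4-vertex example graph.
edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (5, 1, 3), (3, 2, 3)]
total, tree = kruskal_mst(4, edges)
print(total, tree)
```

The returned `total` is w(T), the sum of w(u, v) over the tree edges.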

MST-based clustering

• Most commonly used method 1: removing long MST edges
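A possible sketch of this method: build the exact MST of the points, drop the k − 1 longest edges, and label the remaining connected components. The names `prim_mst` and `cluster_by_cutting`, the use of Euclidean distance, and the toy data are assumptions for illustration only.

```python
import math


def prim_mst(points):
    """Exact MST of the complete Euclidean graph in O(N^2); returns (dist, i, j) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known connection cost to the tree
    link = [-1] * n            # tree vertex that realises that cost
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if link[u] >= 0:
            edges.append((best[u], link[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], link[v] = d, u
    return edges


def cluster_by_cutting(points, k):
    """Remove the k-1 longest MST edges and return one cluster label per point."""
    edges = sorted(prim_mst(points))
    kept = edges[:max(0, len(edges) - (k - 1))]
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _, i, j in kept:
        parent[find(i)] = find(j)
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(len(points))]


# Two well-separated toy groups (hypothetical data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(cluster_by_cutting(points, 2))
```

This works when clusters are separated by clearly longer edges than those inside them; it has no way to tell a long edge inside a sparse cluster from a bridge between clusters.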

MST-based clustering

• Removing long MST edges doesn't always work

MST-based clustering

• Most commonly used method 2: removing inconsistent edges

A tree edge AB whose weight W(AB) is significantly larger than the average of the nearby edge weights on both sides of AB should be deleted.
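The inconsistency test can be sketched as follows. The slide does not say how far "nearby" reaches or how large "significantly larger" is, so the `depth` and `factor` parameters, their default values, and the treatment of leaf vertices (an empty side is ignored) are all assumptions of this sketch.

```python
from collections import defaultdict


def inconsistent_edges(tree_edges, depth=2, factor=2.0):
    """Return tree edges (u, v, w) whose weight exceeds `factor` times the
    average nearby edge weight on both sides. `depth` and `factor` are
    illustrative parameters, not values taken from the slides."""
    adj = defaultdict(list)
    for u, v, w in tree_edges:
        adj[u].append((v, w))
        adj[v].append((u, w))

    def nearby_weights(start, blocked):
        # Weights of tree edges within `depth` steps of `start`,
        # never crossing over to the `blocked` endpoint.
        weights, frontier, seen = [], [start], {start, blocked}
        for _ in range(depth):
            nxt = []
            for x in frontier:
                for y, w in adj[x]:
                    if y not in seen:
                        seen.add(y)
                        weights.append(w)
                        nxt.append(y)
            frontier = nxt
        return weights

    result = []
    for u, v, w in tree_edges:
        sides = [nearby_weights(u, v), nearby_weights(v, u)]
        # A side with no edges (leaf vertex) gives nothing to compare against.
        larger = all(not s or w > factor * sum(s) / len(s) for s in sides)
        if larger and any(sides):
            result.append((u, v, w))
    return result


# Hypothetical path-shaped MST with one long edge in the middle.
tree = [(0, 1, 1), (1, 2, 1), (2, 3, 5), (3, 4, 1), (4, 5, 1)]
print(inconsistent_edges(tree))
```

Unlike a global length threshold, this test is local, so it can separate clusters of different densities as long as the connecting edge stands out from its neighbourhood.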

K-MST

• What is K-MST?

– Let G = (V, E) denote the complete graph.

– Let MST1 denote the MST of G, computed as MST1 = mst(V, E).

– Then MST2 denotes the second-round MST of G: MST2 = mst(V, E - MST1).

– In general, MSTk = mst(V, E - MST1 - … - MST(k-1)).
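The round-by-round definition above can be sketched as repeated runs of Prim's algorithm, each time excluding the edges already used by earlier rounds. The function names `mst_round` and `k_mst` and the Euclidean complete graph over points are assumptions for illustration. If the remaining graph becomes disconnected, this sketch simply returns a spanning forest for that round.

```python
import math


def mst_round(points, excluded):
    """Prim's algorithm on the complete Euclidean graph minus `excluded` edges.

    Edges are (i, j) tuples with i < j. Returns the set of tree edges
    (a spanning forest if the remaining graph is disconnected)."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n
    link = [-1] * n
    edges = set()
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=best.__getitem__)
        in_tree[u] = True
        if link[u] >= 0:
            edges.add((min(u, link[u]), max(u, link[u])))
        for v in range(n):
            if in_tree[v] or (min(u, v), max(u, v)) in excluded:
                continue
            d = math.dist(points[u], points[v])
            if d < best[v]:
                best[v], link[v] = d, u
    return edges


def k_mst(points, k):
    """Return [MST1, ..., MSTk], where MSTi = mst(V, E - MST1 - ... - MST(i-1))."""
    used, rounds = set(), []
    for _ in range(k):
        tree = mst_round(points, used)
        rounds.append(tree)
        used |= tree
    return rounds


# Four collinear points (hypothetical data).
points = [(0, 0), (1, 0), (2, 0), (3, 0)]
mst1, mst2 = k_mst(points, 2)
print(sorted(mst1), sorted(mst2))
```

The union of the rounds gives the edge set of the K-MST-based graph.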

K-MST

• K-MST-based graph

K-MST

• Typical clustering problems

– Separated problems and touching problems.

– Separated problems include distance-separated problems and density-separated problems.

K-MST-based clustering

• Definition of edge weight for separated problems

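$$w(e_{ab}) = 1 - \frac{\min\big(\mathrm{avg}(E_a - \{e_{ab}\}),\ \mathrm{avg}(E_b - \{e_{ab}\})\big)}{\rho(e_{ab})}$$

Here ρ(e_ab) is the length of edge e_ab, E_a and E_b are the sets of graph edges incident to a and b, and avg is the mean edge length.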
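A small sketch of this weight computation, reading the definition as w(e_ab) = 1 − min(avg(E_a − {e_ab}), avg(E_b − {e_ab})) / ρ(e_ab), with ρ taken to be the Euclidean edge length. That reading, the function name `edge_weights`, and the handling of endpoints with no other incident edges (that side is skipped, and the weight is 0 if both sides are empty) are assumptions of this sketch. Points are assumed distinct so that no edge has zero length.

```python
import math
from collections import defaultdict


def edge_weights(points, graph_edges):
    """Map each edge (a, b) to 1 - min(avg(E_a - {e}), avg(E_b - {e})) / length(e)."""
    length = {e: math.dist(points[e[0]], points[e[1]]) for e in graph_edges}
    incident = defaultdict(set)
    for e in graph_edges:
        incident[e[0]].add(e)
        incident[e[1]].add(e)

    def avg_without(vertex, e):
        # Mean length of the edges at `vertex`, excluding e itself.
        others = incident[vertex] - {e}
        return sum(length[o] for o in others) / len(others) if others else None

    weights = {}
    for e in graph_edges:
        avgs = [a for a in (avg_without(e[0], e), avg_without(e[1], e))
                if a is not None]
        weights[e] = 1.0 - min(avgs) / length[e] if avgs else 0.0
    return weights


# Two tight triangles joined by one long edge (hypothetical data).
points = [(0, 0), (1, 0), (0, 1), (10, 0), (11, 0), (10, 1)]
graph = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (1, 3)]
weights = edge_weights(points, graph)
print(max(weights, key=weights.get))
```

An edge that is much longer than its neighbours on either side gets a weight close to 1, while an edge no longer than its neighbours gets a weight of 0 or below.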

Three good features: (1) The weights of inter-cluster edges are considerably larger than those of intra-cluster edges. (2) The inter-cluster edges are distributed approximately equally between T1 and T2 (the first- and second-round MSTs). (3) Apart from inter-cluster edges, most edges with large weights come from T2.

K-MST-based clustering

• Touching problems

Partition(cut1) and Partition(cut3) are similar; Partition(cut2) and Partition(cut3) are similar.

Fast approximate MST (FAMST)

• Traditional MST algorithms take O(N^2) time, which is impractical for large data sets.

• In practical applications, FAMST generally gives the same result as the exact MST.

• A FAMST can be found in O(N^1.55) time.

Fast approximate MST (FAMST)

• Scheme: Divide-and-Conquer
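To illustrate the divide-and-conquer idea only, here is a rough sketch: split the points into about √N subsets, compute an exact MST inside each subset, then join neighbouring subsets by their shortest connecting edge. The slide does not specify how the data is partitioned or how the sub-trees are connected, so sorting by coordinate and chaining adjacent chunks is a stand-in chosen for brevity, not the authors' scheme. The names `prim` and `approx_mst` are hypothetical.

```python
import math


def prim(points, idx):
    """Exact MST over the points whose indices are listed in idx; returns (i, j) edges."""
    best = {v: (math.dist(points[idx[0]], points[v]), idx[0]) for v in idx[1:]}
    edges = []
    while best:
        v = min(best, key=lambda x: best[x][0])   # closest vertex outside the tree
        _, u = best.pop(v)
        edges.append((u, v))
        for x in best:
            d = math.dist(points[v], points[x])
            if d < best[x][0]:
                best[x] = (d, v)
    return edges


def approx_mst(points):
    """Approximate MST: exact MSTs on ~sqrt(N)-sized chunks, then chain the chunks."""
    n = len(points)
    size = max(1, math.isqrt(n))
    # Stand-in partitioning (an assumption): sort by coordinates, cut into chunks.
    order = sorted(range(n), key=lambda i: points[i])
    chunks = [order[s:s + size] for s in range(0, n, size)]
    edges = []
    for chunk in chunks:                           # conquer: MST inside each chunk
        edges += prim(points, chunk)
    for left, right in zip(chunks, chunks[1:]):    # combine: shortest edge between neighbours
        edges.append(min(((i, j) for i in left for j in right),
                         key=lambda e: math.dist(points[e[0]], points[e[1]])))
    return edges


pts = [(i, (i * 7) % 5) for i in range(16)]       # hypothetical data
tree = approx_mst(pts)
print(len(tree))
```

The result is always a spanning tree with N − 1 edges, but its total weight can exceed that of the exact MST, because edges crossing chunk borders are chosen only between consecutive chunks.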

Fast approximate MST (FAMST)

• Performance
