Community detection in complex networks

Vinh Loc DAO

Summary

1 Introduction

2 Datasets and Benchmarks

3 Community detection in network

4 Evaluating partition quality

5 Objectives and Perspectives

2/24 04/05/2016 Vinh Loc DAO Community detection in Complex Networks

1 Introduction

What is a network or a graph ?

Some notions

Structure properties of real networks

What is a network or a graph ?

Examples : the Internet, transport networks, power grids, food webs, social networks

Figure – A very simple network G(V, E) with |V| vertices and |E| edges

Node : Entity in real life

Edge : Relation between the two entities it connects

A natural language to describe complex systems.
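
The node and edge vocabulary above can be made concrete with a small sketch. It assumes an undirected graph stored as an adjacency list; the function name `build_graph` and the toy edge list are hypothetical and do not come from the slides.

```python
from collections import defaultdict

def build_graph(edges):
    """Build an undirected adjacency list from (u, v) pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return dict(adjacency)

# Hypothetical toy network: four entities (nodes) and four relations (edges).
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
graph = build_graph(edges)

num_vertices = len(graph)  # |V|
# Each undirected edge appears in two neighbor sets, hence the division by 2.
num_edges = sum(len(neighbors) for neighbors in graph.values()) // 2  # |E|
print(num_vertices, num_edges)  # 4 4
```

An adjacency list keeps neighbor lookups cheap, which is what most community detection methods need.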

A local mesh network - very small network

Node : Computer

Edge : Connection between two computers

Student network in a university - relatively small network

Node : Student

Edge : Relation (Could be any kind of relation)

The French highway network - large network

Node : City

Edge : Highway

The Internet - international network

Some notions

Complex network : Graph with non-trivial topological features

Network analysis : Study of graphs to extract non-trivial features

Community detection algorithm : Divides nodes into groups called communities, whose members are densely connected.

Figure – Uncover graph modules without specifying clusters’ size

Structure properties of real networks

- Graphs representing real systems are normally neither regular nor random.

- The degree distribution (degree = number of connections of a node) often follows a power law, as connections often follow preferential patterns.

- Nodes are often found to cluster into high-density groups.

Figure – Regular lattice graph

Figure – Graph with modular structure
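
As a rough illustration of the degree distribution mentioned above, the sketch below compares a regular ring with a hub-dominated star. Both toy graphs and the helper name `degree_distribution` are assumptions made for the example; neither graph is large enough to exhibit a power law.

```python
from collections import Counter

def degree_distribution(adjacency):
    """Return {degree: fraction of nodes having that degree}."""
    degrees = [len(neighbors) for neighbors in adjacency.values()]
    counts = Counter(degrees)
    return {k: counts[k] / len(degrees) for k in sorted(counts)}

# Hypothetical toy graphs (not from the slides).
# Regular ring: every node has exactly two neighbors.
ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
# Star: one hub connected to four leaves, a very uneven degree sequence.
star = {"hub": {"a", "b", "c", "d"},
        "a": {"hub"}, "b": {"hub"}, "c": {"hub"}, "d": {"hub"}}

ring_dist = degree_distribution(ring)
star_dist = degree_distribution(star)
print(ring_dist)  # {2: 1.0}
print(star_dist)  # {1: 0.8, 4: 0.2}
```

Real networks tend to sit between these extremes: most nodes have few connections while a handful of hubs have many.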

Why community detection ?

- Comprehend the global organization of the network

- Reveal modular structures

- Reveal hidden properties between nodes

- Understand how information diffuses through the network

APPLICATIONS :

- Detect web clients with similar interests

- Prevent epidemic transmission

- Manage collaboration networks

- etc., you name it

2 Datasets and Benchmarks

Datasets - Real networks

Datasets - Artificial networks

Datasets - Real networks

- Social network of friendships between 34 members of a karate club at a US university in the 1970s. (left)

- An identity graph with 25 vertices and 31 edges. An identity graph has a single graph automorphism, the trivial one. (right)

Figure – Zachary karate network

Figure – Walther network

Datasets - Artificial networks

Parameters : Graph size, node distribution, link distribution, density distribution, etc.

Figure – GN benchmark network

Figure – LFR benchmark network

3 Community detection in network

Dense structure in modular network

From dense structure to community

An example of graph partitioning by K-means

Apply K-means graph partitioning to the karate club network (K = 2)

Figure – The Zachary karate network

Figure – K-means partition of the Zachary karate club

- Good solution ?

- How do we know the club has 2 communities ?

- Wait, what is a community again ?

Did we say a "community" ? Isn’t it a "cluster" ?

- What is a community ? Answer : "I know it when I see it"

- No universally accepted definition.

- More edges inside the community than edges linking its vertices with the rest of the network.

- Many detection methods : overlapping/non-overlapping, fast/slow, single-scale/multi-scale.

Figure – A graph division

Figure – The karate club split after a conflict between the coach and the president
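
The informal criterion above (more edges inside a community than edges leaving it) can be checked directly. The sketch below is only an illustration: the helper name `internal_external_edges` and the toy graph of two triangles joined by one edge are assumptions, not material from the slides.

```python
def internal_external_edges(edges, community):
    """Count edges fully inside `community` and edges crossing its boundary."""
    members = set(community)
    internal = external = 0
    for u, v in edges:
        if u in members and v in members:
            internal += 1      # both endpoints inside the group
        elif u in members or v in members:
            external += 1      # exactly one endpoint inside the group
    return internal, external

# Hypothetical toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by edge (2, 3).
two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

inside, outside = internal_external_edges(two_triangles, {0, 1, 2})
print(inside, outside)  # 3 1
```

Here the triangle has three internal edges and one external edge, so it fits the informal definition. A group such as {2, 3} would not, with one internal edge against four external ones.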

4 Evaluating partition quality

Partition quality

- What is a good partition of a network into modules ?

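- Quality function : assigns a score to each partition of a graph

- The most popular quality function is modularity

$$Q = \frac{1}{2|E|} \sum_{ij} \left(A_{ij} - P_{ij}\right) \delta(C_i, C_j)$$

$A_{ij}$ : adjacency matrix. $P_{ij}$ : expected number of edges between $i$ and $j$ in a random graph. $\delta(C_i, C_j)$ : 1 if $i$ and $j$ belong to the same community, 0 otherwise.

Q = fraction of edges within communities minus the expected fraction of such edges in a random graph

Modularity favors intra-community links.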
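
A minimal sketch of the modularity computation follows. The slide does not specify the null model, so the code assumes the common configuration-model choice, in which the expected number of edges between i and j is k_i k_j / (2|E|) with k_i the degree of node i, and it normalizes by twice the number of edges as in the standard definition. The function name and the toy graph are hypothetical.

```python
from collections import defaultdict

def modularity(edges, membership):
    """Modularity of a partition of a simple undirected graph.

    Assumes the configuration null model P_ij = k_i * k_j / (2m), where m is
    the number of edges. Summing per community gives
    Q = sum_c [ L_c / m - (d_c / (2m))**2 ]
    with L_c the internal edges and d_c the total degree of community c.
    """
    m = len(edges)
    degree = defaultdict(int)
    internal = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1
    total_degree = defaultdict(int)
    for node, k in degree.items():
        total_degree[membership[node]] += k
    return sum(internal[c] / m - (total_degree[c] / (2 * m)) ** 2
               for c in total_degree)

# Hypothetical toy graph: two triangles joined by the edge (2, 3).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
by_triangle = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
single_group = {node: "A" for node in range(6)}

q_split = modularity(toy_edges, by_triangle)    # 2 * (3/7 - 1/4) = 5/14
q_single = modularity(toy_edges, single_group)  # 7/7 - 1 = 0
print(round(q_split, 4), round(q_single, 4))
```

Splitting the graph along the bridge scores about 0.357, while putting every node in one community scores 0. The score rewards partitions that keep edges inside communities.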

Community detection on the karate club

Edge betweenness : O(|V| |E|^2)

Fast greedy : O(|V| |E| log |V|)

Label propagation : O(|V| + |E|)

Louvain method : O(|V| log |V|)

Leading eigenvector : O(|V|^2 + |E|)

Modularity optimization : NP-complete

Infomap : O(|V| (|V| + |E|))

Walktrap : O(|E| |V|^2)
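
To give a feel for the cheapest method in the list, here is a simplified sketch of asynchronous label propagation. It is an assumption-laden illustration rather than the implementation used for the slides: the function name, the rule of keeping the current label on ties, and the sweep limit are all choices made for this example. Each sweep visits every node and its neighbors once, which is where the near-linear cost comes from.

```python
import random
from collections import Counter

def label_propagation(adjacency, seed=0, max_sweeps=100):
    """Asynchronous label propagation on an undirected adjacency list."""
    rng = random.Random(seed)
    labels = {node: node for node in adjacency}  # every node starts alone
    nodes = list(adjacency)
    for _ in range(max_sweeps):
        rng.shuffle(nodes)  # visit nodes in a random order each sweep
        changed = False
        for node in nodes:
            if not adjacency[node]:
                continue  # isolated nodes keep their own label
            counts = Counter(labels[nbr] for nbr in adjacency[node])
            best = max(counts.values())
            candidates = [lab for lab, c in counts.items() if c == best]
            # Keep the current label when it is already a most frequent one;
            # otherwise adopt one of the most frequent neighbor labels.
            if labels[node] not in candidates:
                labels[node] = rng.choice(candidates)
                changed = True
        if not changed:
            break  # every node agrees with a majority of its neighbors
    return labels

# Hypothetical toy graph: two disconnected triangles.
triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1},
             3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
found = label_propagation(triangles)
print(found)
```

On this toy graph each triangle ends up with a single shared label. On connected graphs the result can depend on the visiting order and on tie-breaking, so different seeds may return different partitions.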

Community detection on the Walther network

Many meaningful divisions are possible on a less modular network.

Is quality relative ? Goodness is subjective

- Community characteristics :

Community density

Community connectedness

Robustness to perturbation, etc.

Question : How do we choose an appropriate method that satisfies given characteristics while using as much of the available information as possible ?

Ex : Which method should we choose to :

- Divide students into the most cohesive groups.

- Establish geographic sites to minimize remote work among collaborators.

- Compromise between community density and computation time.

- Maximize the range of ages in a dancing community.
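
One of the characteristics listed above, community density, can be measured in several ways. The sketch below assumes one common definition, the internal edge density: the number of edges inside the group divided by the number of possible pairs of members. The helper name and the toy graph are hypothetical.

```python
def internal_density(edges, community):
    """Internal edges divided by the number of possible member pairs."""
    members = set(community)
    n = len(members)
    if n < 2:
        return 0.0  # density is undefined for fewer than two nodes; use 0
    internal = sum(1 for u, v in edges if u in members and v in members)
    return 2 * internal / (n * (n - 1))

# Hypothetical toy graph: two triangles joined by the edge (2, 3).
toy = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

dense = internal_density(toy, {0, 1, 2})      # 3 edges over 3 pairs
looser = internal_density(toy, {0, 1, 2, 3})  # 4 edges over 6 pairs
print(dense, round(looser, 3))
```

The triangle is fully connected (density 1.0), and adding node 3 lowers the density to about 0.667. An indicator like this lets a user compare candidate partitions against a specific goal such as cohesion.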

Objectives and Perspectives

Provide a guide for choosing the best methods based on expected quality indicators, graph characteristics and available resources

Create a generative model to summarize community characteristics

Create a benchmark based on the generative model

Construct criteria for evaluating partition quality based on the end-user point of view

Propose methods to improve detection quality

Tools :

- R : network analysis, data analysis, visualization

- Gephi : visualization
