A scalable force-directed method for the visualization of large graphs

Fabien Jourdan, Guy Melançon
[email protected], [email protected]

LIRMM: Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
CNRS - Université Montpellier II
Département Informatique Fondamentale et Applications


http://www.lirmm.fr/~fjourdan/Publication/WSPresentation.pdf


Plan

1. Introduction
2. Force-directed layout methods
   • Description
   • Algorithm
3. Partitioning nodes
   • Observation
   • Spreading activation metric
   • Extracting nodes by layers
4. The Layout algorithm
   • Virtual graphs
   • Algorithm
   • Complexity of the algorithm
5. Experimentation
   • Energy computation
   • Tests
6. Conclusion and future work

1. Introduction

• Graph Visualization

Definition 1. Subfield of Information Visualization dealing with relational data.

  – Areas of application
    • social network visualization
    • class browsers (object-oriented systems)
    • metabolic pathways (post-genomic data)
    • etc.
  – Graph Visualization challenges
    • choosing an abstraction of a graph that provides a starting point for exploration
    • providing a good view of a graph while keeping the computing time low
    • reducing the amount of data displayed without omitting key information

2. Force-directed layout methods

• Description

  – Physical metaphor
    • Nodes → physical bodies
    • Edges → attractors (e.g. springs with a preferred length)
    • Between neighbours → attraction forces
    • Between each pair of nodes → repulsion forces
    • Attempt to find a minimum-energy configuration of this system
  – Advantages
    • Require no specific graph properties
    • Respect aesthetic criteria such as edge length, space coverage...
    • Good readability
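The physical metaphor can be made concrete with two small force functions. The slides do not give force formulas, so the choices below (a Hooke-like spring for attraction, inverse-square repulsion, and the parameter names `rest_length`, `stiffness` and `strength`) are illustrative assumptions rather than the authors' model.

```python
import math

def spring_force(p, q, rest_length=1.0, stiffness=1.0):
    """Force on node p from a neighbour q: a spring with a preferred length.

    Assumed model (not from the slides): Hooke's law. The force pulls p
    toward q when the edge is longer than rest_length, pushes otherwise.
    """
    dx, dy = q[0] - p[0], q[1] - p[1]
    dist = math.hypot(dx, dy) or 1e-9          # avoid division by zero
    magnitude = stiffness * (dist - rest_length)
    return (magnitude * dx / dist, magnitude * dy / dist)

def repulsion_force(p, q, strength=1.0):
    """Force on node p from any other node q: pushes p away from q.

    Assumed model (not from the slides): inverse-square repulsion.
    """
    dx, dy = p[0] - q[0], p[1] - q[1]
    dist = math.hypot(dx, dy) or 1e-9
    magnitude = strength / (dist * dist)
    return (magnitude * dx / dist, magnitude * dy / dist)
```

A minimum-energy configuration is one where, for every node, the sum of these forces vanishes.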


2. Force-directed layout methods

Data: a graph $G = (V, E)$
Result: a drawing of the graph

give every node a random position
repeat a linear number of times
(stop when the system energy gets under a lower bound)
begin
  1  for each node $v$ do
       compute a vector obtained by summing up all attractive
       and repulsive forces acting on $v$
  2  for each node $v$ do
       apply the previously computed vector to $v$
       to obtain the node's new position
end

Its a priori time complexity is $O(N^3)$, where $N$ is the number of nodes.

⇒ Too slow to be embedded in an interactive environment.
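The two-step loop above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the force model (unit-stiffness springs, inverse-square repulsion), the iteration factor, and the parameters `step` and `max_move` are all hypothetical choices. The all-pairs inner loop costs $O(N^2)$ per iteration, which with a linear number of iterations gives the cubic bound.

```python
import math
import random

def force_directed_layout(nodes, edges, iterations=None, rest_length=1.0,
                          repulsion=1.0, step=0.05, max_move=0.5, seed=0):
    """Basic force-directed layout; returns {node: (x, y)}."""
    rng = random.Random(seed)
    pos = {v: [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for v in nodes}
    neighbours = {v: set() for v in nodes}
    for u, w in edges:
        neighbours[u].add(w)
        neighbours[w].add(u)
    if iterations is None:
        iterations = 50 * len(nodes)   # "a linear number of times"; factor is arbitrary

    for _ in range(iterations):
        # Step 1: sum the attractive and repulsive forces acting on each node.
        force = {}
        for v in nodes:
            fx = fy = 0.0
            for u in nodes:
                if u == v:
                    continue
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                pull = -repulsion / (dist * dist)    # every pair repels
                if u in neighbours[v]:
                    pull += dist - rest_length       # neighbours attract (spring)
                fx += pull * dx / dist
                fy += pull * dy / dist
            force[v] = (fx, fy)
        # Step 2: move every node along its force vector (capped for stability).
        for v in nodes:
            fx, fy = force[v]
            norm = math.hypot(fx, fy)
            scale = step if norm * step <= max_move else max_move / norm
            pos[v][0] += scale * fx
            pos[v][1] += scale * fy
    return {v: (x, y) for v, (x, y) in pos.items()}
```

Forces are computed for all nodes before any node moves, which mirrors the separation between steps 1 and 2 of the pseudocode.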

3. Partitioning nodes

Nodes and edges in the figure on the right have been colored according to their associated Spreading Activation value.

3. Partitioning nodes

• Spreading activation

  – Signals are sent through the network (the graph)
  – How does the signal spread through this graph?
  – Hogg and Huberman's mathematical model (1987):
    • Nodes are assigned a sequence of values computed iteratively following a simple recurrence:

      $$A(N) = C(N) + \bigl((1 - \gamma) I + \alpha R\bigr) A(N - 1) \qquad (1)$$

    • Hogg and Huberman give conditions under which this iterative process converges
    • They evaluate the speed of convergence according to $\alpha$ and $\gamma$
  – Nodes with the highest spreading activation values should provide a skeleton for the graph
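A short sketch of such an iterative computation is given below. It assumes the usual form of the Hogg and Huberman recurrence, $A(N) = C + ((1-\gamma)I + \alpha R)A(N-1)$, with a constant input $C$ placed on a single source node and with $R$ spreading each node's activation equally among its neighbours. The choice of source, the parameter values and the function name are hypothetical; the slides do not specify them.

```python
def spreading_activation(adj, source, alpha=0.5, gamma=0.7, iterations=50):
    """Iterate A(N) = C + ((1 - gamma) I + alpha R) A(N - 1).

    adj    : dict mapping each node to a list of its neighbours (undirected)
    source : node receiving a constant external input of 1 (an assumption)
    alpha  : fraction of activation passed on to neighbours
    gamma  : decay (relaxation) rate of a node's own activation
    With this R, the iteration converges when alpha < gamma.
    """
    c = {v: (1.0 if v == source else 0.0) for v in adj}
    a = dict(c)                                   # A(0) = C
    for _ in range(iterations):
        nxt = {}
        for v in adj:
            # Each neighbour u shares its activation equally among its own neighbours.
            inflow = sum(a[u] / len(adj[u]) for u in adj[v])
            nxt[v] = c[v] + (1.0 - gamma) * a[v] + alpha * inflow
        a = nxt
    return a

# Usage: on a path a - b - c with the source at "a",
# activation decreases with distance from the source.
values = spreading_activation({"a": ["b"], "b": ["a", "c"], "c": ["b"]}, "a")
```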

3. Partitioning nodes

• Extracting nodes by layers

  – Partition: let $G = (V, E)$ be a graph and let $\mu$ be the map assigning to each node its spreading activation value. Let $I_1, \dots, I_k$ denote a partition of the range of $\mu$ into $k$ consecutive and distinct intervals. Then we define the partition $V_1, \dots, V_k$ of $V$ associated with $I_1, \dots, I_k$ by $V_j = \{ v \in V : \mu(v) \in I_j \}$ for $j = 1, \dots, k$.
  – The size of the extracted layers of nodes is imposed by the limitations of the force-directed method
  – Partition depth has to be controlled
  – Layers of size […] (with […])
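The definition above amounts to bucketing nodes by the interval that contains their metric value. A minimal sketch, assuming the intervals are half-open and given by their interior cut points (the function name and the `boundaries` parameter are hypothetical):

```python
from bisect import bisect_right

def partition_by_intervals(values, boundaries):
    """Split nodes into layers according to consecutive value intervals.

    values     : dict mapping each node to its metric value
    boundaries : sorted interior cut points c_1 < ... < c_{k-1}; interval j
                 is [c_{j-1}, c_j), the first and last being unbounded
    Returns a list of k layers (lists of nodes), lowest values first.
    """
    layers = [[] for _ in range(len(boundaries) + 1)]
    for node, value in values.items():
        layers[bisect_right(boundaries, value)].append(node)
    return layers

# Usage: three intervals split at 0.3 and 0.6.
layers = partition_by_intervals({"a": 0.9, "b": 0.5, "c": 0.1, "d": 0.55}, [0.3, 0.6])
# layers == [["c"], ["b", "d"], ["a"]]
```

Whether the layout starts from the highest or the lowest interval is a separate choice; reversing the returned list switches between the two.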


3. Partitioning nodes

• Partitioning the interval of values

  – Consider the statistical distribution of values in the graph
  – Let $f$ denote the statistical density function associated with the spreading activation metric on $G$, and $F$ the corresponding cumulative distribution function
  – It is possible to compute a unique inverse value $F^{-1}(y)$ for each $y$ in the range of $F$ (simple adjustments are needed to compute an inverse value in the general case)
  – The partition $I_1, \dots, I_k$ must be such that […]

  Table (row labels lost in extraction):

    500   1000   1500   2000   2500   5000   10000
     12     22     30     40     45     83     154
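One way to read this slide: if every layer must hold a bounded number of nodes, the interval boundaries can be read off the empirical inverse cumulative distribution, that is, from order statistics of the metric values. The sketch below assumes equal-count layers; the exact constraint on the slide is not recoverable, and `layer_size` is a hypothetical parameter.

```python
def equal_count_boundaries(values, layer_size):
    """Cut points such that each interval holds at most layer_size values.

    values     : iterable of metric values, one per node
    layer_size : assumed target number of nodes per layer
    Returns interior cut points c_1 < c_2 < ...; interval j is [c_{j-1}, c_j).
    Ties across a cut point would need the "simple adjustments" the slide
    mentions; this sketch assumes distinct values.
    """
    ordered = sorted(values)
    return [ordered[i] for i in range(layer_size, len(ordered), layer_size)]

# Usage: seven distinct values, layers of at most three nodes.
cuts = equal_count_boundaries([5, 1, 4, 2, 3, 6, 7], 3)
# cuts == [4, 7], giving the layers {1, 2, 3}, {4, 5, 6}, {7}
```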


3. Partitioning nodes

• Filtration

  – Gajer, Goodrich and Kobourov (2000)
  – Not based on numerical values but on neighborhood relationships
  – Used to speed up force-directed layout methods for large graphs
  – Their approach only works well for graphs with strong regularities (a grid, for instance)

[Figure labels: vertex of degree 3, vertex of degree 4, vertex of degree 2]

4. The Layout algorithm

• Introduction

  – Assume that the spreading activation metric has been computed on the whole graph
  – Let FD denote an instance of a force-directed layout algorithm
      FD : $G$ → positions for every node $v \in V$
  – Let FD′ denote a modified instance of the algorithm FD
      FD′ : $(G, F)$ → positions for every node $v \in V \setminus F$
    • Nodes $v \in F$ are fixed but act on nodes $v \in V \setminus F$
    • Nodes $v \in V \setminus F$ can move


4. The Layout algorithm

• Virtual graphs

  – Let $V' \subset V$ and $G[V']$ the induced subgraph
  – Goal: take into account paths between two nodes in $V'$ when computing FD′
  – The virtual edges are only used to induce additional attractive forces between nodes in $V'$ and will not be drawn in the final layout for $G$
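The slides do not spell out how virtual edges are chosen. As one possible construction, stated purely as an assumption, the sketch below links two nodes of the subset whenever they are not adjacent but share a neighbour outside the subset, which is the shortest kind of path that leaves the induced subgraph. The function name is hypothetical.

```python
def virtual_edges(adj, subset):
    """Virtual edges between subset nodes joined by a 2-step path outside the subset.

    adj    : dict mapping each node to a collection of its neighbours
    subset : nodes of the induced subgraph
    Assumption (not from the slides): only paths through a single outside
    node are considered. Returned edges are for force computation only and
    are not meant to be drawn.
    """
    subset = set(subset)
    virtual = set()
    for w in adj:
        if w in subset:
            continue                      # w must lie outside the subset
        inside = sorted(u for u in adj[w] if u in subset)
        for i, u in enumerate(inside):
            for v in inside[i + 1:]:
                if v not in adj[u]:       # skip pairs already joined by a real edge
                    virtual.add((u, v))
    return virtual

# Usage: "a" and "b" are only connected through "x", which is outside the subset.
adj = {"a": {"x"}, "b": {"x", "c"}, "c": {"b"}, "x": {"a", "b"}}
extra = virtual_edges(adj, {"a", "b", "c"})
# extra == {("a", "b")}
```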


4. The Layout algorithm

Data: a graph $G = (V, E)$
Result: a drawing of the graph $G$

begin
  1  Compute a partition $V_1, \dots, V_k$ of $V$
     Compute the virtual graph associated with $V_1$
     Run the force-directed layout on this virtual graph
     Record the positions of nodes $v \in V_1$ and mark them as fixed
  2  for each level $V_i$ ($i \geq 2$) do
       Compute the induced subgraph for level $V_i$
       Compute the associated virtual graph
       Run the modified force-directed layout on this virtual graph,
       keeping the nodes already marked as fixed in place
       Record the positions of nodes $v \in V_i$ and mark them as fixed
  return the resulting drawing of $G$
end
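The layer-by-layer scheme can be sketched as follows. Several points are assumptions made for illustration: each pass works on the subgraph induced by the previous layer and the current one, the previous layer is the fixed set, virtual edges are left out, and the force model and parameters are arbitrary. The names `constrained_pass` and `layered_layout` are hypothetical.

```python
import math
import random

def constrained_pass(nodes, edges, pos, fixed, rng, iterations=200,
                     step=0.05, max_move=0.5):
    """One pass of a modified layout: nodes in `fixed` exert forces but never move."""
    members = set(nodes)
    neighbours = {v: set() for v in nodes}
    for u, w in edges:
        if u in members and w in members:        # induced subgraph only
            neighbours[u].add(w)
            neighbours[w].add(u)
    for v in nodes:
        if v not in pos:                          # new nodes start at random
            pos[v] = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
    movable = [v for v in nodes if v not in fixed]
    for _ in range(iterations):
        force = {}
        for v in movable:
            fx = fy = 0.0
            for u in nodes:                       # fixed nodes still act on v
                if u == v:
                    continue
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                pull = -1.0 / (dist * dist)       # repulsion between every pair
                if u in neighbours[v]:
                    pull += dist - 1.0            # unit-length spring on edges
                fx += pull * dx / dist
                fy += pull * dy / dist
            force[v] = (fx, fy)
        for v in movable:
            fx, fy = force[v]
            norm = math.hypot(fx, fy)
            scale = step if norm * step <= max_move else max_move / norm
            pos[v][0] += scale * fx
            pos[v][1] += scale * fy

def layered_layout(layers, edges, seed=0):
    """Lay out the layers in order, freezing each layer once it is placed."""
    rng = random.Random(seed)
    pos = {}
    previous = []
    for layer in layers:
        constrained_pass(previous + list(layer), edges, pos, set(previous), rng)
        previous = list(layer)
    return {v: tuple(p) for v, p in pos.items()}
```

Because each pass only involves two layers, its cost depends on the layer size rather than on the size of the whole graph, which is why the layer size has to be bounded.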

4. The Layout algorithm

• Complexity of the algorithm

  – Theorem 1. The algorithm runs in time […].
  – Claim 1. The partition $V_1, \dots, V_k$ can be computed in linear time $O(N)$.
  – Claim 2. Let $V_i$, $V_{i+1}$ be two consecutive layers in the partition of $V$. Under the assumption that […], the induced subgraph is computed in linear time $O(N)$. This also holds true for the first layer. Moreover, the additional time to compute the virtual subgraph is constant on average. Thus the virtual graph can be computed in time $O(N)$ as well.

5. Experimentation

• Energy computation

  – Basalaj's PhD thesis (2000)
  – Computing the energy associated with a drawing $D$ measures how close the Euclidean distances between points in $D$ are to the actual distances between the corresponding nodes in the graph.
    • $d_D(u, v)$ denotes the Euclidean distance in a drawing $D$
    • $d_G(u, v)$ denotes the distance in $G$

    $$E(D) = \frac{\sum_{u < v} \bigl(d_G(u, v) - d_D(u, v)\bigr)^2}{\sum_{u < v} d_G(u, v)^2} \qquad (2)$$
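A small sketch of such an energy measure follows. It assumes a normalized-stress form, the sum of squared differences between graph distance and drawn distance divided by the sum of squared graph distances, with graph distance taken as the number of edges on a shortest path. This is one common formulation and may differ in detail from the one used in the experiments; the function names are hypothetical.

```python
import math
from collections import deque

def graph_distances(adj, source):
    """Shortest-path distances (in edges) from source, by breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return dist

def layout_energy(adj, pos):
    """Normalized stress of a drawing; 0 means all distances are preserved.

    adj : dict mapping each node to its neighbours (graph assumed connected)
    pos : dict mapping each node to an (x, y) position
    """
    nodes = list(adj)
    numerator = denominator = 0.0
    for i, u in enumerate(nodes):
        d_graph = graph_distances(adj, u)
        for v in nodes[i + 1:]:
            d_draw = math.dist(pos[u], pos[v])
            numerator += (d_graph[v] - d_draw) ** 2
            denominator += d_graph[v] ** 2
    return numerator / denominator

# Usage: a path a - b - c drawn on a straight line with unit spacing
# preserves every graph distance, so its energy is zero.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
energy = layout_energy(path, {"a": (0, 0), "b": (1, 0), "c": (2, 0)})
```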

5. Experimentation

• Tests

  – Tests on a variety of graphs
  – Java API Royere

        Graphs
   |V|    |E|   Ref   Algorithm 2      GEM   Random    Ring
    59     87     1         0.049   0.0139    0.349   0.388
   216    528     3         0.112   0.154     0.203   0.224
   250    490     4         0.220   0.130     0.223   0.236
   296    824     5         0.210   0.132     0.213   0.228
   300    639     6         0.217   0.169     0.228   0.243
   350    668     7         0.206   0.135     0.220   0.233
   400    992     8         0.200   0.130     0.220   0.224
   450    865     9         0.191   0.165     0.219   0.234
   500    784    10         0.203   0.196     0.214   0.233
   550   1172    11         0.190   0.128     0.210   0.221
   600   1182    12         0.175   0.189     0.210   0.221


6. Conclusion and future work

• Conclusion
  – Using a partition based on metric values
  – […]

• Future work
  – The drawing could serve as the input for an animated force-directed layout
  – Choose another metric