http://www.lirmm.fr/~fjourdan/Publication/WSPresentation.pdf
A scalable force-directed method for the visualization of large graphs
Fabien Jourdan, Guy Melançon
[email protected], [email protected]
LIRMM: Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
CNRS - Université Montpellier II
Département Informatique Fondamentale et Applications
Plan
1. Introduction
2. Force-directed layout methods
   • Description
   • Algorithm
3. Partitioning nodes
   • Observation
   • Spreading activation metric
   • Extracting nodes by layers
4. The Layout algorithm
   • Virtual graphs
   • Algorithm
   • Complexity of the algorithm
5. Experimentation
   • Energy computation
   • Tests
6. Conclusion and future work
1. Introduction: Graph Visualization
Definition 1. Graph visualization is the subfield of Information Visualization dealing with relational data.
– Areas of application
  • social network visualization
  • class browsers (object-oriented systems)
  • metabolic pathways (post-genomic data)
  • etc.
– Graph visualization challenges
  • choosing an abstraction of a graph that provides a starting point for exploration
  • providing a good view of a graph while keeping the computing time low
  • reducing the amount of data displayed without omitting key information
2. Force-directed layout methods: Description
– Physical metaphor
  • Nodes → physical bodies
  • Edges → attractors (e.g. springs with a preferred length)
  • Between neighbours → attraction forces
  • Between each pair of nodes → repulsion forces
  • Attempt to find a minimum-energy configuration of this system
– Advantages
  • Require no specific graph properties
  • Respect aesthetic criteria such as edge length, space coverage...
  • Good readability
2. Force-directed layout methods
Data: A graph $G = (V, E)$
Result: A drawing of the graph
give every node a random position
repeat a linear number of times (stop when the system energy falls below a lower bound):
begin
  1 for each node $v$ do
      compute a vector $F_v$ obtained by summing up all attractive and repulsive forces acting on $v$
  2 for each node $v$ do
      apply the previously computed vector $F_v$ to obtain the node’s new position
end
Its a priori time complexity is $O(N^3)$ with $N = |V|$: each of the $O(N)$ iterations evaluates forces between all $O(N^2)$ pairs of nodes.
• Too slow to be embedded in an interactive environment.
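The loop above can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the force laws (repulsion `k*k/d`, attraction `d*d/k`), the cooling schedule and all names (`force_directed_layout`, `k`, `t0`) are assumptions borrowed from the Fruchterman-Reingold family, and the energy-based stopping test is replaced by a fixed iteration count.

```python
import math
import random


def force_directed_layout(nodes, edges, iterations=None, k=1.0, t0=0.1, seed=0):
    """Pairwise repulsion plus spring-like attraction on edges (assumed force laws)."""
    nodes = list(nodes)
    rng = random.Random(seed)
    # Give every node a random position.
    pos = {v: (rng.random(), rng.random()) for v in nodes}
    if iterations is None:
        iterations = 10 * len(nodes)  # a linear number of iterations

    for it in range(iterations):
        # Largest move allowed in this iteration; it shrinks linearly to zero.
        temperature = t0 * (1.0 - it / iterations)

        # Step 1: sum all forces acting on each node.
        force = {v: [0.0, 0.0] for v in nodes}
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                d = math.hypot(dx, dy) or 1e-9
                rep = k * k / d  # repulsion between every pair of nodes
                force[u][0] += rep * dx / d
                force[u][1] += rep * dy / d
                force[v][0] -= rep * dx / d
                force[v][1] -= rep * dy / d
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            d = math.hypot(dx, dy) or 1e-9
            att = d * d / k  # attraction between neighbours
            force[u][0] -= att * dx / d
            force[u][1] -= att * dy / d
            force[v][0] += att * dx / d
            force[v][1] += att * dy / d

        # Step 2: apply the stored vectors to obtain the new positions.
        new_pos = {}
        for v in nodes:
            fx, fy = force[v]
            norm = math.hypot(fx, fy)
            if norm > 0:
                move = min(norm, temperature)
                new_pos[v] = (pos[v][0] + fx / norm * move,
                              pos[v][1] + fy / norm * move)
            else:
                new_pos[v] = pos[v]
        pos = new_pos
    return pos
```

The double loop over node pairs is what makes one iteration quadratic; repeating it a linear number of times gives the cubic cost that motivates the layered approach of the following slides.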
3. Partitioning nodes
Nodes and edges in the figure on the right have been colored according to their associated spreading activation value.
3. Partitioning nodes: Spreading activation
– Signals are sent through the network (the graph)
– How does the signal spread through this graph?
– Hogg and Huberman mathematical model (1987):
  • Nodes are assigned a sequence of values computed iteratively following a simple recurrence:
    […] (1)
  • Hogg and Huberman give conditions under which this iterative process converges
  • They evaluate the speed of convergence according to the two parameters of the recurrence
– Nodes with the highest spreading activation values should provide a skeleton for the graph
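The iterative computation of the metric can be illustrated with a short sketch. It assumes a recurrence of the form $A(t) = C + M\,A(t-1)$ with $M = (1-\gamma) I + \alpha R$, which is how the Hogg and Huberman model is often written; this form, the flow matrix $R$ (each node shares its value equally among its neighbours), the constant input `source` and the parameter values are all assumptions, not taken from the slide.

```python
def spreading_activation(adj, alpha=0.5, gamma=0.8, source=1.0, iterations=50):
    """Iterate A(t) = C + ((1 - gamma) I + alpha R) A(t - 1) on an undirected graph.

    adj maps each node to the list of its neighbours. The recurrence form and
    the default parameters are assumptions made for this illustration.
    """
    c = {v: source for v in adj}  # constant input C, the same for every node
    a = dict(c)                   # A(0) = C
    for _ in range(iterations):
        nxt = {}
        for v in adj:
            # Each neighbour w sends an equal share of its value along its edges.
            inflow = sum(a[w] / len(adj[w]) for w in adj[v])
            nxt[v] = c[v] + (1.0 - gamma) * a[v] + alpha * inflow
        a = nxt
    return a
```

With these choices the iteration converges when `1 - gamma + alpha < 1`. On a star graph the centre ends up with a larger value than the leaves, which matches the idea that high-valued nodes form a skeleton.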
3. Partitioning nodes: Extracting nodes by layers
– Partition:
  Let $G = (V, E)$ be a graph and let $\mu : V \to [a, b]$ be the map assigning to each node its spreading activation value.
  Let $I_1, \dots, I_k$ denote a partition of $[a, b]$ into $k$ consecutive and distinct intervals.
  Then, we define the partition $V_1, \dots, V_k$ associated with $I_1, \dots, I_k$ by
  $V_i = \{ v \in V : \mu(v) \in I_i \}$ for $i = 1, \dots, k$.
– The size of the extracted layers of nodes is imposed by the limitations of the force-directed method
– Partition depth has to be controlled
– Layers of size […] (with […])
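The definition of the layers translates directly into code. The sketch below assumes half-open intervals described by their sorted endpoints; the names `partition_by_intervals`, `values` and `bounds` are hypothetical.

```python
from bisect import bisect_right


def partition_by_intervals(values, bounds):
    """Group nodes by the interval that contains their metric value.

    values maps node -> metric value; bounds is the sorted list of endpoints
    [b_0, b_1, ..., b_k]. Layer i receives the nodes whose value lies in
    [b_i, b_{i+1}); the last interval is closed so that b_k is included.
    """
    k = len(bounds) - 1
    layers = [[] for _ in range(k)]
    for node, x in values.items():
        i = bisect_right(bounds, x) - 1
        i = min(max(i, 0), k - 1)  # clamp values sitting on the outer endpoints
        layers[i].append(node)
    return layers
```

The function returns the layers in increasing order of value; since the highest values are meant to form the skeleton, a caller would presumably process the list in reverse.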
3. Partitioning nodes: Partitioning the interval of values
– Consider the statistical distribution of values in the graph
– $f$ denotes the statistical density function associated with the spreading activation metric on $G$
  […]
– It is possible to compute a unique inverse value […] for […].
  (Simple adjustments are needed to compute an inverse value in the general case)
– The partition $I_1, \dots, I_k$ must be such that […]

  |V|     500   1000   1500   2000   2500   5000   10000
  […]      12     22     30     40     45     83     154
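If the goal is layers of comparable size, one way to choose the interval bounds is to read them off the empirical distribution of the values, that is, to invert the cumulative count at evenly spaced levels. The sketch below is only an illustration of that idea under this assumption; `equal_count_bounds` is a hypothetical name and the exact criterion used on the slide is not reproduced.

```python
def equal_count_bounds(values, k):
    """Return k + 1 interval endpoints so that each interval holds about n / k values.

    The endpoints are empirical quantiles: the i-th inner endpoint is the value
    found at rank i * n // k in the sorted list.
    """
    xs = sorted(values)
    n = len(xs)
    bounds = [xs[0]]
    for i in range(1, k):
        bounds.append(xs[(i * n) // k])
    bounds.append(xs[-1])
    return bounds
```

When many nodes share the same value, two consecutive endpoints can coincide. This is the situation in which the inverse is no longer unique and some adjustment, for example merging the empty interval with its neighbour, is needed.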
3. Partitioning nodes: Filtration
– Gajer, Goodrich and Kobourov (2000)
– Not based on numerical values but on neighborhood relationships
– Used to speed up force-directed layout methods for large graphs
– Their approach only works well for graphs with strong regularities (a grid, for instance)
[Figure labels: vertex of degree 3, vertex of degree 4, vertex of degree 2]
4. The Layout algorithm: Introduction
– Assume that the spreading activation metric has been computed on the whole graph
– $FD$ denotes an instance of a force-directed layout algorithm
  $FD : G \mapsto$ positions for every node $v$
– $FD'$ denotes a modified instance of the algorithm $FD$
  $FD' : (G, F) \mapsto$ positions for every node $v \in V \setminus F$
  • Nodes $v \in F$ are fixed but act on nodes $v \in V \setminus F$
  • Nodes $v \in V \setminus F$ can move
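A layout routine with fixed nodes can be sketched as below: fixed nodes contribute to the forces but are never displaced. As before, the force laws, the cooling schedule and the names (`layout_with_fixed`, `fixed_pos`, `k`, `t0`) are assumptions made for illustration, not the authors' code.

```python
import math
import random


def layout_with_fixed(free, fixed_pos, edges, iterations=100, k=1.0, t0=0.1, seed=0):
    """Place the `free` nodes while the nodes of `fixed_pos` keep their positions."""
    rng = random.Random(seed)
    free = list(free)
    pos = dict(fixed_pos)
    for v in free:
        pos[v] = (rng.random(), rng.random())  # random start for movable nodes

    def add_force(force, target, other, strength):
        # Add to force[target] a vector pointing away from `other`;
        # a negative strength therefore attracts.
        dx = pos[target][0] - pos[other][0]
        dy = pos[target][1] - pos[other][1]
        d = math.hypot(dx, dy) or 1e-9
        s = strength(d)
        force[target][0] += s * dx / d
        force[target][1] += s * dy / d

    for it in range(iterations):
        temperature = t0 * (1.0 - it / iterations)
        force = {v: [0.0, 0.0] for v in free}
        for v in free:
            for w in pos:  # fixed nodes repel free nodes too
                if w != v:
                    add_force(force, v, w, lambda d: k * k / d)
        for u, v in edges:
            if u in force:
                add_force(force, u, v, lambda d: -d * d / k)
            if v in force:
                add_force(force, v, u, lambda d: -d * d / k)
        for v in free:  # only free nodes are displaced
            fx, fy = force[v]
            norm = math.hypot(fx, fy)
            if norm > 0:
                move = min(norm, temperature)
                pos[v] = (pos[v][0] + fx / norm * move,
                          pos[v][1] + fy / norm * move)
    return pos
```

Because forces are accumulated only for the free nodes, the cost of one iteration is proportional to the number of free nodes times the total number of nodes, which is the point of laying out small layers one at a time.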
4. The Layout algorithm: Virtual graphs
– Let $U \subseteq V$ and $G[U]$ the induced subgraph
– Goal: take into account paths between two nodes in $U$ when computing $FD'$
– The virtual edges are only used to induce additional attractive forces between nodes in $U$ and will not be drawn in the final layout for $G$
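One possible way to build such virtual edges is sketched below. It assumes the following definition, which is an interpretation rather than something stated on the slide: two nodes of the subset receive a virtual edge when they are not adjacent but are joined by a path whose intermediate nodes all lie outside the subset. The name `virtual_edges` is hypothetical, and node labels are assumed to be sortable.

```python
from collections import deque
from itertools import combinations


def virtual_edges(adj, subset):
    """Return virtual edges between nodes of `subset` linked through outside nodes.

    adj maps each node to its neighbours (undirected graph).
    """
    subset = set(subset)
    seen = set()
    virtual = set()
    for start in adj:
        if start in subset or start in seen:
            continue
        # Explore one connected component of the nodes outside the subset and
        # collect the subset nodes that touch it.
        border = set()
        queue = deque([start])
        seen.add(start)
        while queue:
            x = queue.popleft()
            for y in adj[x]:
                if y in subset:
                    border.add(y)
                elif y not in seen:
                    seen.add(y)
                    queue.append(y)
        # Any two border nodes are joined by a path through this component.
        for u, w in combinations(sorted(border), 2):
            if w not in adj[u]:
                virtual.add((u, w))
    return virtual
```

Connecting all border nodes of a component pairwise can create many edges when a component touches a large part of the subset; a practical version would probably cap the path length or the number of virtual edges per node.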
4. The Layout algorithm
Data: A graph $G = (V, E)$.
Result: A drawing of the graph $G$
begin
  1 Compute a partition $V_1, \dots, V_k$ of $V$
    Compute the virtual graph $\tilde{G}[V_1]$ associated with $V_1$
    Run $FD$ on $\tilde{G}[V_1]$
    Record the positions of the nodes of $V_1$ and mark them as fixed
  2 for each level $V_i$ ($i \geq 2$) do
      Compute the induced subgraph $G[V_{i-1} \cup V_i]$
      Compute the virtual graph $\tilde{G}[V_{i-1} \cup V_i]$
      Run $FD'$ on $\tilde{G}[V_{i-1} \cup V_i]$ with the nodes of $V_{i-1}$ fixed
      Record the positions of the nodes of $V_i$ and mark them as fixed
  return the resulting drawing of $G$
end
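The outer loop of the algorithm can be sketched as follows, under one reading of the pseudocode: at each level only the previous layer is kept fixed and the subgraph is induced by the previous and the current layer. Virtual edges are left out to keep the example short. The names `layered_layout` and `place` are hypothetical, and `barycenter_place` is a deliberately crude stand-in for the force-directed routine with fixed nodes, used only to make the example runnable.

```python
import random


def layered_layout(adj, layers, place):
    """Lay out the layers one after the other.

    adj maps a node to its neighbours, layers is the ordered partition, and
    place(free, fixed_pos, edges) stands for the layout routine with fixed nodes.
    """
    drawing = {}
    previous = []
    for layer in layers:
        active = set(previous) | set(layer)
        # Edges of the subgraph induced by the previous and the current layer.
        edges = set()
        for u in active:
            for v in adj[u]:
                if v in active and (v, u) not in edges:
                    edges.add((u, v))
        fixed_pos = {v: drawing[v] for v in previous}
        positions = place(list(layer), fixed_pos, edges)
        # Record the new positions; they are treated as fixed from now on.
        for v in layer:
            drawing[v] = positions[v]
        previous = layer
    return drawing


def barycenter_place(free, fixed_pos, edges, seed=0):
    """Hypothetical stand-in: put each free node near its fixed neighbours."""
    rng = random.Random(seed)
    pos = dict(fixed_pos)
    for v in free:
        anchors = [fixed_pos[b] for a, b in edges if a == v and b in fixed_pos]
        anchors += [fixed_pos[a] for a, b in edges if b == v and a in fixed_pos]
        jitter = (rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1))
        if anchors:
            cx = sum(p[0] for p in anchors) / len(anchors)
            cy = sum(p[1] for p in anchors) / len(anchors)
        else:
            cx, cy = rng.random(), rng.random()
        pos[v] = (cx + jitter[0], cy + jitter[1])
    return pos
```

A call such as `layered_layout(adj, layers, barycenter_place)` returns one position per node. Swapping `barycenter_place` for a real force-directed routine with fixed nodes leaves the driver unchanged.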
4. The Layout algorithm: Complexity of the algorithm
– Theorem 1. The algorithm runs in time […].
– Claim 1. The partition $V_1, \dots, V_k$ can be computed in linear time $O(N)$, with $N = |V|$.
– Claim 2. Let $V_i, V_{i+1}$ be two consecutive layers in the partition for $V$. Under the assumption that $|E| = O(N)$, the induced subgraph $G[V_i \cup V_{i+1}]$ is computed in linear time $O(N)$. This also holds true for $G[V_1]$. Moreover, the additional time to compute the virtual subgraph $\tilde{G}[V_i \cup V_{i+1}]$ is constant on average. Thus the graph $\tilde{G}[V_i \cup V_{i+1}]$ can be computed in time $O(N)$ as well.
5. Experimentation: Energy computation
– Basalaj PhD thesis (2000)
– Computing the energy associated with a drawing $D$ measures how close the Euclidean distances between points in $D$ are to the actual distances between the corresponding nodes in the graph.
  • $d_D(u, v)$ denotes the Euclidean distance for a drawing $D$
  • $d_G(u, v)$ denotes the distance in $G$
  $E(D) = \frac{\sum_{u, v} \left( d_D(u, v) - d_G(u, v) \right)^2}{\sum_{u, v} d_G(u, v)^2}$ (2)
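The energy of a drawing can be computed with a few lines of Python. The sketch assumes that the energy is the normalized stress, that is, the sum of squared differences between drawn and graph distances divided by the sum of squared graph distances, and that the graph distance is the number of edges on a shortest path; `bfs_distances` and `layout_energy` are hypothetical names.

```python
import math
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def layout_energy(adj, pos):
    """Normalized stress of the drawing `pos` (node -> (x, y)); 0 means a perfect fit."""
    nodes = list(adj)
    num = den = 0.0
    for i, u in enumerate(nodes):
        dist = bfs_distances(adj, u)
        for v in nodes[i + 1:]:
            if v in dist:  # pairs in different components are skipped
                d_graph = dist[v]
                d_draw = math.dist(pos[u], pos[v])
                num += (d_draw - d_graph) ** 2
                den += d_graph ** 2
    return num / den if den else 0.0
```

For a path of three nodes drawn on a line with unit spacing the energy is 0, and it is 1 when all nodes are drawn at the same point. Running one breadth-first search per node makes this evaluation quadratic or worse, so it is a measurement tool rather than part of the layout itself.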
5. Experimentation: Tests
– Tests on a variety of graphs
– Java API Royere

  Graphs
  |V|    |E|     Ref    Algorithm 2    GEM       Random    Ring
  59     87      1      0.049          0.0139    0.349     0.388
  216    528     3      0.112          0.154     0.203     0.224
  250    490     4      0.220          0.130     0.223     0.236
  296    824     5      0.210          0.132     0.213     0.228
  300    639     6      0.217          0.169     0.228     0.243
  350    668     7      0.206          0.135     0.220     0.233
  400    992     8      0.200          0.130     0.220     0.224
  450    865     9      0.191          0.165     0.219     0.234
  500    784     10     0.203          0.196     0.214     0.233
  550    1172    11     0.190          0.128     0.210     0.221
  600    1182    12     0.175          0.189     0.210     0.221
6. Conclusion and future work
Conclusion
– Using a partition based on metric values
– […]
Future work
– The drawing could serve as the input for an animated force-directed layout
– Choose another metric