Key-node-separated graph clustering and visualization
Takayuki Itoh, Ochanomizu University, Japan
China-Japan Joint Visualization Workshop, 2017/7/24
Itoh Laboratory, Ochanomizu University
Career of the speaker
• 1992 Researcher of IBM Tokyo Research Lab.
– 1992 M.S. at Waseda Univ.
– 1997 Ph.D. at Waseda Univ.
– 2000 Visiting researcher at Carnegie Mellon Univ. (6 months)
– 2003 (Concurrent) researcher at Kyoto Univ. (2 years)
CAD / SciVis / InfoVis / Distributed Computing
• 2005 Professor at Ochanomizu Univ.
– 2005 Associate professor
– 2008 Visiting researcher at Univ. California Davis (2 months)
– 2011 Full professor
InfoVis / Multimedia / HCI / CG applications
Organizing …
• International
– 2014, 2018 IEEE PacificVis Organizing/General chair
– 2015 VINCI General Chair
– 2012-2015 ACM SAC Multimedia & Visualization Track Chair
– 2016 ACM Advanced Visual Interface Associate Program Chair
– 2018 ACM Intelligent User Interface Student Volunteer Chair
• Japanese Domestic
– 2014-2016 Director of Society for Art and Science
– 2015-2017 Chief Program Committee of Interaction Symposium
– 2018-2019 Director of SIG on Interactive System & Software
Women-viewpoint projects
Skin measurement and synthesis
Music user interface
Cartoon icon generation
Crowdsourcing of photo retouch
Apparel product recommendation
Crowdsourcing of women’s appearance evaluation
Still implementing by myself
• Hierarchical data visualization
Itoh et al., Hierarchical Data Visualization Using a Fast Rectangle-Packing Algorithm, TVCG 2004
Itoh et al., Hierarchical Visualization of Network Intrusion Detection Data in the IP Address Space, CG&A 2006
Still implementing by myself
• Network data visualization
Itoh et al., Key-node-Separated Graph Clustering and Layout for Human Relationship Graph Visualization, CG&A 2015
Itoh et al., A Hybrid Space-Filling and Force-Directed Layout Method for Visualizing Multiple-Category Graphs, PacificVis 2009
Still implementing by myself
• High-dimensional data visualization
Itoh et al., High-dimensional data visualization by interactive construction of low-dimensional parallel coordinate, JVLC 2017
Contents
• Graph Visualization Overview
• Multiple-Category Graph Visualization
• Key-Node-Separated Graph Visualization
– Concept & Algorithm
– Experiment
– On-going work
• General Discussion
Graph Visualization
• Computer-powered graph drawing
Graph drawing … Long history
Euler’s drawing in 1736; Ball’s abstract drawing in 1892
Graph drawing … Hard work
Computer saved graph drawing
• Automatic
• Quick
• Interactive
• Publishable
Visual representation
• Node-link diagram
• Matrix representation
[Figure: a graph with nodes A to E drawn as a node-link diagram and as a matrix with rows and columns A to E]
Data types
[Figure: 2×2 grid of example graphs; columns: Undirected, Directed; rows: Unfixed, Fixed node positions]
Applications: Social analysis
Applications: Bioinformatics
Applications: Traffic/Communication
Hairball problem
Node layout (force-directed)
Node clustering as a preprocess
Edge bundling as a postprocess
Interaction for graph visualization
• Immersive environment
• Zooming interface, Focus+Context
Complex data
Time-varying or interactive addition / removal of nodes or edges
Associated time-varying / multivariate values at nodes / edges
Coordinate view for complex data
Graph visualization: summary
• Computer-powered graph drawing
• Techniques
– Node layout
– Node clustering
– Edge bundling
– Interactions
• Data types
– Direction, Node positions
• Applications
– Social analysis, Bioinformatics, Traffic, Communication, …
Contents
T. Itoh, C. Muelder, K.-L. Ma, J. Sese, A Hybrid Space-Filling and Force-Directed Layout Method for Visualizing Multiple-Category Graphs, IEEE Pacific Visualization Symposium, pp. 121-128, 2009.
Third place in number of citations among the papers presented at IEEE PacificVis!
Definition: Multiple-Category Graph
• Graphs consisting of nodes belonging to one or more categories
Graph: $G = \{N, L\}$
Nodes: $N = \{n_1, \ldots, n_{n_N}\}$
Links: $L = \{l_1, \ldots, l_{n_L}\}$
A node: $n_i = \{b_1, \ldots, b_m\}$, category belonging information (array of boolean values)
A link: $l_i = \{n_p, n_q\}$
Drawing example (color=category)
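The definition above can be held in a very small data structure. The following is a minimal sketch, not the implementation used in the paper: the class name `MultiCategoryGraph`, its method names, and the three example categories are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class MultiCategoryGraph:
    """G = {N, L}: every node carries m boolean category flags."""

    num_categories: int  # m
    node_flags: list[list[bool]] = field(default_factory=list)  # n_i = {b_1, ..., b_m}
    links: list[tuple[int, int]] = field(default_factory=list)  # l_i = {n_p, n_q}

    def add_node(self, categories: set[int]) -> int:
        """Add a node belonging to the given category indices; return its index."""
        flags = [c in categories for c in range(self.num_categories)]
        self.node_flags.append(flags)
        return len(self.node_flags) - 1

    def add_link(self, p: int, q: int) -> None:
        """Add an undirected link between nodes p and q."""
        self.links.append((p, q))

    def is_categorized(self, i: int) -> bool:
        """A node is categorized if it belongs to at least one category."""
        return any(self.node_flags[i])

    def share_category(self, p: int, q: int) -> bool:
        """True if nodes p and q have at least one category in common."""
        return any(a and b for a, b in zip(self.node_flags[p], self.node_flags[q]))


# Hypothetical example with three categories (0, 1, 2).
g = MultiCategoryGraph(num_categories=3)
alice = g.add_node({0, 1})   # belongs to two categories
bob = g.add_node({1})        # belongs to one category
carol = g.add_node(set())    # non-categorized node
g.add_link(alice, bob)
g.add_link(bob, carol)
```

Storing the flags as a boolean array per node mirrors the slide's "array of boolean values" and makes the categorized / non-categorized split a one-line test.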
Easy example: Social Networking
• Node = Person
• Link = Friendship
• Category = Community/Keyword
Example categories: Cake, Violin, Soccer
Tight sub-network
Multi-community person
Hub person
Requirements for Visualization
Multiple-category graph specific:
• Place common-category nodes closer
General:
• Reduce:
– sum of lengths of edges
– number of intersections among edges
• Avoid the cluttering of nodes
• Maximize screen space utilization
• Reduce the computation time
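Two of the requirements above, the sum of edge lengths and the number of edge intersections, are directly measurable for any given layout. The sketch below shows one straightforward way to compute them; the function names and the tiny four-node layout are illustrative assumptions, and the brute-force pairwise crossing test is only suitable for small graphs.

```python
import math
from itertools import combinations


def total_edge_length(pos, edges):
    """Sum of Euclidean lengths of all edges for node positions `pos`."""
    return sum(math.dist(pos[p], pos[q]) for p, q in edges)


def _orient(a, b, c):
    """Sign (-1, 0, 1) of the cross product (b - a) x (c - a)."""
    v = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (v > 0) - (v < 0)


def count_crossings(pos, edges):
    """Count pairs of edges that properly cross; edges sharing a node are skipped."""
    crossings = 0
    for (a, b), (c, d) in combinations(edges, 2):
        if len({a, b, c, d}) < 4:
            continue  # the two edges share an endpoint
        pa, pb, pc, pd = pos[a], pos[b], pos[c], pos[d]
        if (_orient(pa, pb, pc) * _orient(pa, pb, pd) < 0
                and _orient(pc, pd, pa) * _orient(pc, pd, pb) < 0):
            crossings += 1
    return crossings


# Hypothetical layout: the two diagonals of a unit square plus one side.
pos = {0: (0.0, 0.0), 1: (1.0, 1.0), 2: (0.0, 1.0), 3: (1.0, 0.0)}
edges = [(0, 1), (2, 3), (0, 2)]
length = total_edge_length(pos, edges)
crossings = count_crossings(pos, edges)
```

Here the two diagonals cross once, so `crossings` is 1, and `length` is two diagonals plus one unit side.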
Requirements Satisfaction
• Place common-category nodes closer
• Reduce:
– sum of lengths of edges
– number of intersections among edges
• Avoid the cluttering of nodes
• Maximize screen space utilization
• Reduce the computation time
Force-directed
Space-filling
Hybrid Space-Filling and Force-Directed Method
Hybrid Approach: Overview
Step 1: Hierarchical clustering
– Categorized & Non-categorized
– Category & Connection based
Step 2: Layout
– Rectangle packing for Non-categorized nodes
– Hybrid for categorized nodes
Step 3: Interaction
– Focus+Context
– Category selection
Step 1: Hierarchical Clustering
(1) Root
(2) Top of categorized nodes
(3) Clusters of categorized nodes
(4) Clusters of non-categorized nodes
(5) Categorized nodes (6) Non-categorized nodes
Category & connection based clustering
Connection based clustering
Step 2: Layout
[Figure: data structure (Root; Top of categorized nodes; Clusters of categorized nodes; Clusters of non-categorized nodes) and the corresponding layout of the categorized and non-categorized regions]
Rectangle Packing Technique
• Originally for tree visualization [Itoh04][Itoh06]
• Treemap-like, but better on:
– Aspect ratio
– Flexible cluster positioning (by referring to templates)
[Figure: input (tree, optional template) and the resulting layout]
Switching Layout Algorithms
[Figure: data structure with Root, Top of categorized nodes, Clusters of categorized nodes, and Clusters of non-categorized nodes]
• Clusters of non-categorized nodes: simply apply rectangle packing
• Clusters of categorized nodes: consider both connection & category
– Rectangle packing is not always good
– New approach (hybrid of rectangle packing & force-directed)
Hybrid Layout for Categorized Nodes
(a) Clustering of categorized nodes
(b) Force-directed layout for cluster graph
– Edge weight is proportional to the number of links
– Connect if their category is common, even if there are no links
(c) Positions as a template
(d) Rectangle packing
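The cluster graph used in step (b) can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the function name `build_cluster_graph`, the input dictionaries, and in particular the constant weight `category_weight` given to category-only connections are all hypothetical; the slide only states that link-based weights are proportional to the number of links and that clusters with a common category are connected even without links.

```python
from collections import Counter
from itertools import combinations


def build_cluster_graph(cluster_of, cluster_categories, links, category_weight=1.0):
    """Return {frozenset({cluster_a, cluster_b}): weight} for the cluster graph.

    cluster_of:         node id -> cluster id
    cluster_categories: cluster id -> set of category labels
    links:              iterable of (node, node) pairs
    category_weight:    assumed weight for cluster pairs that share a category
                        but have no link between them
    """
    weights = Counter()
    # Weight proportional to the number of links between two clusters.
    for p, q in links:
        cp, cq = cluster_of[p], cluster_of[q]
        if cp != cq:
            weights[frozenset((cp, cq))] += 1.0
    # Connect clusters with a common category even if there are no links.
    for a, b in combinations(sorted(cluster_categories), 2):
        if cluster_categories[a] & cluster_categories[b]:
            key = frozenset((a, b))
            if key not in weights:
                weights[key] = category_weight
    return dict(weights)


# Hypothetical input: three clusters of categorized nodes.
cluster_of = {0: "A", 1: "A", 2: "B", 3: "C"}
cluster_categories = {"A": {"cake"}, "B": {"cake", "violin"}, "C": {"violin"}}
links = [(0, 2), (1, 2), (0, 1)]
cluster_graph = build_cluster_graph(cluster_of, cluster_categories, links)
```

In this example clusters A and B get weight 2.0 from their two links, B and C are connected only through the shared category, and A and C stay unconnected. The resulting weighted graph is what a force-directed layout would then position to produce the template for rectangle packing.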
Drawing Nodes
• Colors of nodes denote categories
• Categorized nodes are drawn as colored circles
• Non-categorized nodes are drawn as gray dots
• Radii of nodes denote number of links
• Three colors denote that the node belongs to three categories
• There are 3 levels of thickness and transparency of links
Result (1) Zooming categorized nodes
No cluster-cluttering
A, B, C: Good concentration of common-category nodes
D: Good for discovery of isolated categories
Comparison
Space-Filling (1)    Space-Filling (2)    Proposed
1.3 (sec.)           1.2 (sec.)           4.7 (sec.)
* Force-directed … 267.5 (sec.)
Comparison
• Three criteria
– Node distance, edge length, & number of intersections
[Bar chart: node distance, edge length, and number of intersections for S.-F. (1), S.-F. (2), and Proposed]
* Relative values (1 for proposed technique)
Focus+Context
Non-categorized node concentration
Space distortion for focus+context
Applied Active Biological Data
• Data
– 6,152 genes (as nodes)
– 7,564 gene-gene interactions (as edges)
– 10 conditions of gene expression (as categories)
• Goal
– Hub / Multi-functional gene discovery
– Sub-network discovery by conditions
Visualization Example
[A] Hub gene discovery
[B] Multi-functional gene discovery
[C] Well-divided sub-network discovery
Visualization Example
[D,E,F] Separation of common-condition gene clusters
Contents
IEEE Computer Graphics & Applications, 2015; organized by Xiaoru, Baoquan, Koji, & Issei
T. Itoh, K. Klein, Key-node-Separated Graph Clustering and Layout for Human Relationship Graph Visualization, IEEE Computer Graphics and Applications, Vol. 35, No. 6, pp. 30-40, 2015.
Human Relationship Graph
• Node=Human, Edge=Relationship
– Paper co-authorship
– Friendship on SNS
• Characteristics
– Authority persons
– Topic-based clusters
Issues on node clustering
• Community finding schemes are widely applied
– Extracting subgraphs which have dense edges
• Issue: “key nodes” are involved in large clusters
• Often we want to separate such nodes from clusters
(a) Clustering based on density of connections
(b) High-level drawing of the clustered graph (a)
Key node and many edges are hidden inside clusters
(c) Clustering based on commonality of neighbors
(d) High-level drawing of the clustered graph (c)
• Key nodes are more visible
• Many edges are bundled
Visualization with key-node-aware clustering
• Two metrics for node-to-node distances
– (Dis-)Commonality of connected nodes (commonly connected nodes = authority persons)
– (Dis-)Similarity of feature vectors (similar feature vectors = persons with similar topics)
• Used for:
– Node clustering
– Node layout
Data Structure & Node Distance
Data structure:
Graph = {Node, Edge}
Node = {n1, n2, …}
Edge = {e1, e2, …}
A node has a feature vector: ni = {ai, …}
An edge connects two nodes: ei = {ni1, ni2}
Node distance:
$d_{\mathrm{vec}} \propto 1.0 - \dfrac{a_i \cdot a_j}{|a_i|\,|a_j|}$, where $a_i$ is the feature vector of a node → Similarity of topics
$d_{\mathrm{adj}} \propto \dfrac{1.0}{1 + n_{\mathrm{adj}}}$, where $n_{\mathrm{adj}}$ is the number of common adjacent nodes → Connection to same persons
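The two node distances can be sketched in a few lines. This is a hedged illustration: the function names are made up here, the feature term is taken to be one minus the cosine similarity, and the blend factor `alpha` that combines the two terms is an assumption, since the slide does not state how the metrics are weighted.

```python
import math


def feature_distance(a, b):
    """1.0 - cosine similarity of two feature vectors (1.0 if either is all zero)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def adjacency_distance(neighbors, i, j):
    """1.0 / (1 + number of adjacent nodes that i and j have in common)."""
    common = len((neighbors[i] & neighbors[j]) - {i, j})
    return 1.0 / (1.0 + common)


def node_distance(features, neighbors, i, j, alpha=0.5):
    """Assumed blend of the two metrics; alpha is a hypothetical parameter."""
    return (alpha * feature_distance(features[i], features[j])
            + (1.0 - alpha) * adjacency_distance(neighbors, i, j))


# Hypothetical graph: nodes 0 and 1 are not linked but share neighbors 2, 3, 4.
neighbors = {0: {2, 3, 4}, 1: {2, 3, 4}, 2: {0, 1}, 3: {0, 1}, 4: {0, 1}}
features = {0: [1.0, 0.0], 1: [1.0, 0.0], 2: [0.0, 1.0], 3: [0.0, 1.0], 4: [1.0, 1.0]}
d_keys = node_distance(features, neighbors, 0, 1)
```

Nodes 0 and 1 come out close to each other although no edge connects them, because they are connected to the same persons. A distance of this kind groups nodes by commonality of neighbors rather than by density of connections.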
Clustering & Layout
Hierarchical clustering → Cluster Layout → Node Layout
• Centroid method
• MDS & Stress minimization
• Laplacian smoothing
• Swapping in a circle
Edge Bundling
(a) Edge bundling between two clusters of nodes (Before / After)
(b) Placement of control points of Bezier curves (Node, Center of cluster, Center of cluster, Node)
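One way to read the control-point placement is as a cubic Bezier curve whose four control points are the source node, the center of its cluster, the center of the destination cluster, and the destination node. The sketch below follows that reading; it is an assumption about the figure, and the function names and sample coordinates are illustrative only.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Point on a cubic Bezier curve with control points p0..p3 at parameter t in [0, 1]."""
    u = 1.0 - t
    weights = (u ** 3, 3.0 * u * u * t, 3.0 * u * t * t, t ** 3)
    points = (p0, p1, p2, p3)
    x = sum(w * p[0] for w, p in zip(weights, points))
    y = sum(w * p[1] for w, p in zip(weights, points))
    return (x, y)


def bundled_edge(node_a, center_a, center_b, node_b, samples=16):
    """Polyline approximating the bundled edge from node_a to node_b.

    Assumed control points: node, center of its cluster, center of the other
    cluster, node. Edges between the same two clusters then share the middle
    part of their curves, which is what makes them look bundled.
    """
    return [cubic_bezier(node_a, center_a, center_b, node_b, k / samples)
            for k in range(samples + 1)]


# Hypothetical coordinates for one edge between two clusters.
curve = bundled_edge((0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0))
midpoint = curve[len(curve) // 2]
```

The curve starts and ends exactly at the two nodes and is pulled toward the two cluster centers in between.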
Example Dataset
• Paper co-authorship data
– 564 papers by NBAF (NERC Biomolecular Analysis Facilities)
– 1,821 nodes (=authors), 11,097 edges (=co-authorships)
• Feature vectors from paper titles
– Frequency of 12 words for each author → 12-dim. vector
• Computation time
– 2.5 sec. for clustering
– 8.8 sec. for node layout
Genetic (Red), Molecular (Orange), Loci (Yellow), Microsatellites (Yellow green), Isolation (Green), Inbreeding (Blue green), Transcriptomics (Sky blue), Expression (Blue), Bacterial (Indigo), Breeding (Purple), Polymorphic (Pink)
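Building per-author feature vectors from title words can be sketched as below. This is a simplified illustration, not the pipeline used for the NBAF data: the function name, the whitespace tokenization, the three-word keyword subset, and the sample papers are all assumptions.

```python
from collections import defaultdict


def author_feature_vectors(papers, keywords):
    """Count, for each author, how often each keyword occurs in their paper titles.

    papers:   iterable of (title, [author, ...]) pairs
    keywords: list of lowercase words; the vector has one dimension per word
    """
    index = {word: k for k, word in enumerate(keywords)}
    vectors = defaultdict(lambda: [0] * len(keywords))
    for title, authors in papers:
        words = title.lower().split()  # naive tokenization, assumed for brevity
        for author in authors:
            vec = vectors[author]  # creates an all-zero vector for new authors
            for word in words:
                if word in index:
                    vec[index[word]] += 1
    return dict(vectors)


# Hypothetical titles and authors; only three of the keywords are used here.
keywords = ["genetic", "molecular", "loci"]
papers = [
    ("Genetic diversity of polymorphic loci", ["A", "B"]),
    ("Molecular genetic markers", ["B"]),
    ("Field notes", ["C"]),
]
vectors = author_feature_vectors(papers, keywords)
```

With the full word list each author would get a vector with one dimension per word, and the dominant dimension could drive the node color.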
Node Layout Example
• Clusters as circles → Nodes inside the circles
• Colors according to feature vectors
• Edge display control
Clusters of nodes
Feature-based colors
Edges of a node
Key-node Separation from Large Clusters
170 clusters, by our algorithm
* Color=degree of a node
Two key nodes are separated from large clusters
Key-node Separation from Large Clusters
159 clusters, by common algorithm
* Color=degree of a node
Two key nodes are involved in a large cluster
Case Study with Co-authorship Graph
Expression
Genetic
Isolation
Polymorphic
Molecular
* Colors are based on feature vectors
Genetic (Red), Molecular (Orange), Loci (Yellow), Microsatellites (Yellow green), Isolation (Green), Inbreeding (Blue green), Transcriptomics (Sky blue), Expression (Blue), Bacterial (Indigo), Breeding (Purple), Polymorphic (Pink)
Case Study with Co-authorship Graph
* Colors are based on degree of nodes
Cluster A
Case Study with Co-authorship Graph
One of the key persons in cluster A: many connections with people in particular fields
Expression
Genetic
Molecular
Polymorphic
Breeding
Transcriptomics
Loci
Case Study with Co-authorship Graph
Another key person in cluster A: a wider variety of connections with people in many fields
Expression
Genetic
Molecular
Polymorphic
Breeding
Transcriptomics
Loci
Case Study with Co-authorship Graph
* Colors are based on degree of nodes
Cluster B
Cluster C
Case Study with Co-authorship Graph
A key person in cluster B: many connections with people in other fields
Case Study with Co-authorship Graph
A key person in cluster C: many connections with people in uncolored fields
Numeric comparison
            Num. clusters   Num. nodes of clusters of two key nodes   Num. edges inside clusters
Ours (1)    813             4, 4                                      5964
Ours (2)    354             4, 4                                      5421
Ours (3)    264             4, 4                                      5868
Ours (4)    170             9, 9                                      6141
Common      159             33, 54                                    8214

Successfully separated key nodes from large clusters
Fewer edges are hidden inside clusters
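The "edges inside clusters" measure counts the edges whose two endpoints fall in the same cluster, i.e. the edges that disappear in a high-level drawing. A minimal sketch, with a made-up function name and a toy graph rather than the co-authorship data:

```python
def edges_inside_clusters(cluster_of, edges):
    """Number of edges whose two endpoints belong to the same cluster."""
    return sum(1 for p, q in edges if cluster_of[p] == cluster_of[q])


# Hypothetical graph: node 0 is a key node linked to nodes 1, 2, and 3.
edges = [(0, 1), (0, 2), (0, 3), (1, 2)]

# Density-based result: the key node is absorbed into one large cluster.
common = {0: "a", 1: "a", 2: "a", 3: "a"}
# Key-node-separated result: the key node forms its own cluster.
separated = {0: "k", 1: "a", 2: "a", 3: "a"}

hidden_common = edges_inside_clusters(common, edges)
hidden_separated = edges_inside_clusters(separated, edges)
```

In this toy case, separating the key node reduces the hidden edges from 4 to 1, which is the same kind of effect the table reports at a larger scale.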
Subjective Evaluation
Question                                               Data 1 (ours)   Data 1 (common)   Data 2 (ours)   Data 2 (common)
Interested in key nodes?                               9               4                 10              3
Find the clusters connected to key nodes?              9               4                 7               6
Find the number of nodes connected to key nodes?       10              3                 8               5

Participants: 13 university students in computer science
On-going work: for directed graphs
[Toeda16]
Three types of bundling for directed graphs
A. Edges between two clusters of nodes
B. Edges in the pair of bidirectional bundles connecting to the same pair of clusters
C. Edges of pairs of bundles which start or end at the same cluster
[Toeda16]
Potential Applications
Paper Citation Network: Graph Drawing, Time-Varying, Visual Analytics, GPU-based, Tree, Evaluation
Music Ordering: Pops, Rock, Slow Ballad, Blues, Dance, Jazz
Current research situation
• Slowly growing research field
– Constant number of papers every year
– But many essential problems still remain
– Recent meetings of theoretical and application people (e.g. NII Shonan meeting in 2015/01 and 2016/08)
• Mostly used by professional people
– Social analysis, bioinformatics, system monitoring, …
– Few popular applications for general users
• General software vs. Application-oriented tool
Directions I am interested in
• Numeric evaluation of comprehensibility
– Are mathematically beautiful visualization results really good?
• Automatic algorithm selection
– Especially node clustering and layout algorithms
• User interfaces
– Newer devices (touch panels, VR/AR, …)
– Voice processing, Natural language processing
• New application development