Special Topics on Graph Algorithms
Finding the Diameter in Real-World Graphs: Experimentally Turning a Lower Bound into an Upper Bound
F96943167 施信瑋, F97943070 方劭云, R98943086 莊舜翔, R98943090 曹蕙芳,
R98943088 周邦彥, R98921072 金 蘊, R99921040 林國偉, R99942061 葉書豪
Outline
Introduction
Previous Work
Finding the Diameter in Real-World Graphs
Conclusion and Future Work
Other Related Topics
R98943086 莊舜翔
R98943090 曹蕙芳, R98943088 周邦彥
R99921040 林國偉, R99942061 葉書豪, R98943086 莊舜翔
F96943167 施信瑋, F97943070 方劭云
R98921072 金 蘊
Diameter
The length of the "longest shortest path" between any two vertices in a graph or a tree.
Given a connected graph G = (V, E) with n = |V| vertices and m = |E| edges, the diameter is D = max d(u, v) over all u, v in V, where d(u, v) denotes the distance between vertices u and v.
[Figure: two edge-weighted examples. A tree, D = 13. A graph, D = 9.]
Diameter of a Tree
The diameter of a tree can be computed by applying the double-sweep algorithm:
1. Choose a random vertex r, run a BFS from r, and find a vertex a farthest from r.
2. Run a BFS from a and find a vertex b farthest from a.
3. Return D = d(a, b).
[Figure: double sweep on the example tree. Distance labels from r identify a; distance labels from a identify b.]
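The three double-sweep steps can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the tree is unweighted (the slide's figure carries edge weights, which plain BFS ignores), it is stored as an adjacency dictionary, and the names `bfs_distances`, `double_sweep` and the sample `tree` are hypothetical.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def double_sweep(adj, r):
    """BFS from r to find a farthest vertex a, then BFS from a to find b.

    Returns (a, b, d(a, b)). On a tree, d(a, b) is the diameter.
    """
    dist_r = bfs_distances(adj, r)
    a = max(dist_r, key=dist_r.get)
    dist_a = bfs_distances(adj, a)
    b = max(dist_a, key=dist_a.get)
    return a, b, dist_a[b]

# Hypothetical unweighted tree with edges 0-1, 1-2, 1-3, 3-4, 4-5.
tree = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1, 4], 4: [3, 5], 5: [4]}
a, b, diameter = double_sweep(tree, 1)
```

On this sample tree the longest path has four edges, and the sketch returns that value whichever vertex is used as r.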
Diameter of a Graph
The double-sweep algorithm might not correctly compute the diameter of a general graph.
It provides a lower bound instead.
[Figure: double sweep on the example graph starting from r, with the vertices a and b marked; actual diameter D = 9.]
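A small unweighted example makes the gap concrete. The graph below is a hypothetical construction (it is not the graph in the slide's figure): starting from `r`, the unique farthest vertex is `x`, whose eccentricity is 3, while the true diameter is 4 (between `p0` and `p4`). The function names are illustrative only.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def eccentricity(adj, v):
    return max(bfs_distances(adj, v).values())

def double_sweep_lower_bound(adj, r):
    """Eccentricity of a vertex farthest from r (two BFS runs)."""
    dist_r = bfs_distances(adj, r)
    a = max(dist_r, key=dist_r.get)
    return eccentricity(adj, a)

def exact_diameter(adj):
    """Reference value: one BFS per vertex."""
    return max(eccentricity(adj, v) for v in adj)

# Hypothetical graph: a path p0..p4 plus three extra vertices r, y, x.
edges = [("p0", "p1"), ("p1", "p2"), ("p2", "p3"), ("p3", "p4"),
         ("r", "p1"), ("r", "p2"), ("r", "p3"),
         ("y", "p1"), ("y", "p3"), ("x", "y")]
adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)

lower = double_sweep_lower_bound(adj, "r")
exact = exact_diameter(adj)
```

Here `lower` is 3 and `exact` is 4, so the double sweep from `r` is a valid lower bound but not the diameter.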
Naïve Algorithm
Perform n breadth-first searches (BFS), one from each vertex, to obtain the distance matrix of the graph: Θ(n(n+m)) time and Θ(m) space.
By using matrix multiplication, the distance matrix can be computed in O(M(n) log n) time and Θ(n^2) space [Seidel, ACM STC’92].
M(n): the complexity of matrix multiplication involving small integers only, O(n^2.376).
This is too slow for massive graphs and has a prohibitive space cost.
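The BFS-based naïve approach can be sketched as follows. This is a minimal illustration assuming an unweighted, connected graph stored as an adjacency dictionary; `diameter_naive` and the sample graphs are hypothetical names. It keeps only one distance map at a time, which is how the Θ(m) working space is achieved when only the diameter (not the full matrix) is needed.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def diameter_naive(adj):
    """Run one BFS per vertex and keep the largest eccentricity seen."""
    best = 0
    for v in adj:
        best = max(best, max(bfs_distances(adj, v).values()))
    return best

# Hypothetical inputs: a cycle on 6 vertices and a path on 5 vertices.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
```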
All Pairs Shortest Path
Compute the distances between all pairs of vertices without resorting to matrix products:
[Feder, ACM STC’91]: Θ(n^3 / log n) time and O(n^2) space
[Chan, ACM-SIAM’06]: O(n^2 (log log n)^2 / log n) time and O(n^2) space
Still too slow and too space-consuming for massive graphs.
All Pairs Almost Shortest Path
Compute almost-shortest paths between all pairs of vertices [Dor, ECCC’97].
Additive error 2 (apasp2): treat high-degree vertices and low-degree vertices separately.
O(min(n^(3/2) m^(1/2), n^(7/3)) log n) time and Θ(n^2) space.
Still too expensive.
[Figure: paths between u and v through intermediate vertices w and w’.]
Self-checking Heuristics
Obtaining the exact value, or an accurate estimate, of the diameter is too expensive for massive graphs.
Instead, empirically establish a lower bound L and an upper bound U by executing a suitably small number of BFSes, so that L ≤ D ≤ U.
When L = U, the actual value of D for G has been obtained; hence the name self-checking heuristics.
There is no guarantee of success for every feasible input, but:
1) Few BFSes are required in practice, so the complexity is linear [Magnien, JEA’09].
2) An empirical upper bound is possible.
3) Large graphs can be analyzed, since BFS has a good external-memory implementation [Mayer, AESA’02] and works on graphs stored in compressed format [Vigna, IWWWC’04].
A Work for Comparison
“Fast Computation of Empirically Tight Bounds for the Diameter of Massive Graphs” [Magnien, JEA’09]
Various bounds to confine the solution range: trivial bounds, double sweep lower bound, tree upper bound.
An iterative algorithm to obtain the actual diameter.
Trivial Bounds
The eccentricity of any vertex v gives trivial bounds on the diameter: ecc(v) ≤ D ≤ 2·ecc(v).
Trivial bounds can be computed in Θ(m) space and time, where m is the number of edges in the graph.
Proof of D ≤ 2·ecc(v): let a and b be two vertices with d(a, b) = D. Going through v, D = d(a, b) ≤ d(a, v) + d(v, b) ≤ 2·ecc(v). Assuming D > 2·ecc(v) would contradict this inequality; therefore D ≤ 2·ecc(v).
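As an illustration, the trivial bounds from a single BFS can be computed like this. The sketch assumes an unweighted, connected graph in an adjacency dictionary; `trivial_bounds` and `path5` are hypothetical names.

```python
from collections import deque

def bfs_distances(adj, source):
    """Return hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist

def trivial_bounds(adj, v):
    """Return (ecc(v), 2 * ecc(v)), which bracket the diameter."""
    ecc = max(bfs_distances(adj, v).values())
    return ecc, 2 * ecc

# Hypothetical path on 5 vertices; its diameter is 4.
path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
from_middle = trivial_bounds(path5, 2)
from_end = trivial_bounds(path5, 0)
```

From the middle vertex the upper bound is tight, from an end vertex the lower bound is tight; a single BFS cannot tell which situation applies, which motivates the sharper bounds discussed next.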
Double Sweep Lower Bound
On chordal graphs, AT-free graphs, and trees, if a vertex v is chosen such that d(u, v) = ecc(u) for some vertex u (i.e. v is among the vertices at maximal distance from u), then D = ecc(v) [Corneil’01, Handler’73].
On such graphs the diameter may therefore be computed by a BFS from any vertex u followed by a BFS from a vertex at maximal distance from u, thus in Θ(m) space and time, where m is the number of edges.
In general, the value obtained in this way may differ from the diameter, but it is still at least as good as the trivial lower bound.
Double Sweep Lower Bound: An Example
[Figure: an example graph labeled with BFS distances; the double sweep gives D = 2, while the actual diameter is D = 4.]
Tree Upper Bound
The diameter of any connected spanning subgraph of G is larger than or equal to the diameter of G.
The diameter of a tree can be obtained in Θ(m) time and space [Handler’73], where m is the number of edges in G, so spanning trees of G are good candidates for obtaining an upper bound.
A tree upper bound is the diameter of a BFS tree rooted at a vertex.
It is never worse than the corresponding trivial upper bound.
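A tree upper bound can be sketched by building the BFS tree from its parent pointers and then running a double sweep on that tree. This is an illustrative sketch for unweighted, connected graphs; `bfs`, `tree_upper_bound` and the sample graphs are hypothetical names.

```python
from collections import deque

def bfs(adj, source):
    """Return (dist, parent) dictionaries of a BFS from source."""
    dist, parent = {source: 0}, {source: None}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y], parent[y] = dist[x] + 1, x
                queue.append(y)
    return dist, parent

def tree_upper_bound(adj, v):
    """Diameter of the BFS tree rooted at v, an upper bound on diam(G)."""
    _, parent = bfs(adj, v)
    tree = {x: [] for x in parent}
    for child, par in parent.items():
        if par is not None:
            tree[child].append(par)
            tree[par].append(child)
    # Double sweep is exact on trees.
    dist_v, _ = bfs(tree, v)
    a = max(dist_v, key=dist_v.get)
    dist_a, _ = bfs(tree, a)
    return max(dist_a.values())

# Hypothetical inputs: a cycle on 6 vertices (diameter 3) and a short path.
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
path3 = {0: [1], 1: [0, 2], 2: [1]}
```

On the 6-cycle every BFS tree is a path through all six vertices, so the bound is 5 from every start vertex even though the diameter is 3; on a tree the bound is exact.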
Tree Upper Bound: An Example
[Figure: a BFS tree of an example graph labeled with distances; the tree upper bound is D’ = 5, while the actual diameter is D = 4.]
Tighten the Bounds
Iteratively choose different initial vertices to obtain tighter tree upper bounds.
Random tree upper bound (rtub): iterate the tree upper bound from random vertices.
Highest degree tree upper bound (hdtub): consider vertices in decreasing order of degree when iterating the algorithm.
The Iterative Algorithm
Iterate the double sweep lower bound and the highest degree tree upper bound until the difference between the best bounds obtained is lower than or equal to a given threshold value.
There are multiple choices for this threshold: it may depend on the graph considered or on the desired quality of the bounds, or it may be set as a given precision (e.g. (D’ - D)/D < p).
All heuristics have a Θ(m) time complexity and a Θ(m+n) space complexity.
Does the tree upper bound eventually converge to the exact diameter?
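The loop described above might look like the following sketch. It is an assumption-laden illustration, not the implementation from [Magnien, JEA’09]: double sweeps start from random vertices drawn with a seeded generator, tree upper bounds visit vertices by decreasing degree, the graph is unweighted and connected, and all function names are hypothetical.

```python
import random
from collections import deque

def bfs(adj, source):
    """Return (dist, parent) dictionaries of a BFS from source."""
    dist, parent = {source: 0}, {source: None}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y], parent[y] = dist[x] + 1, x
                queue.append(y)
    return dist, parent

def double_sweep_lower_bound(adj, r):
    dist_r, _ = bfs(adj, r)
    a = max(dist_r, key=dist_r.get)
    dist_a, _ = bfs(adj, a)
    return max(dist_a.values())

def tree_upper_bound(adj, v):
    _, parent = bfs(adj, v)
    tree = {x: [] for x in parent}
    for child, par in parent.items():
        if par is not None:
            tree[child].append(par)
            tree[par].append(child)
    return double_sweep_lower_bound(tree, v)  # exact on a tree

def iterate_bounds(adj, threshold=0, seed=0):
    """Tighten (lower, upper) until upper - lower <= threshold or vertices run out."""
    rng = random.Random(seed)
    vertices = list(adj)
    by_degree = sorted(vertices, key=lambda x: len(adj[x]), reverse=True)
    lower, upper, rounds = 0, float("inf"), 0
    for v in by_degree:
        rounds += 1
        lower = max(lower, double_sweep_lower_bound(adj, rng.choice(vertices)))
        upper = min(upper, tree_upper_bound(adj, v))
        if upper - lower <= threshold:
            break
    return lower, upper, rounds

# Hypothetical inputs: a star with 4 leaves and a cycle on 6 vertices.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
```

On the star the bounds meet after one round. On the 6-cycle they never meet: the lower bound stays at 3 and the upper bound at 5, so the loop stops only after trying every vertex.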
Possibly Unmatching Upper Bound
There is no guarantee of obtaining the exact diameter, as all the tree upper bounds may be strictly larger than D.
E.g. if G is a cycle of n vertices, its diameter is ⌊n/2⌋ and the tree upper bound is n-1 whichever vertex one starts from.
Is there an algorithm that provides more matching upper bounds?
[Figure: a cycle example with D = 3 and D’ = 5.]
The Fringe Algorithm
The fringe method is used to improve the upper bound U and possibly match the lower bound L obtained by the double sweep method.
Let G = (V, E) be an unweighted, undirected and connected graph.
For any vertex u ∈ V, Tu denotes an unordered BFS tree rooted at u.
The eccentricity ecc(u) is the height of Tu.
=> 2·ecc(u) ≥ diam(G)
The Fringe Algorithm
Proof of 2·ecc(u) ≥ diam(G), i.e. ecc(u) ≥ diam(G)/2:
Suppose ecc(u) < diam(G)/2, and let a, b be vertices with diam(G) = d(a, b).
Then d(u, v) < diam(G)/2 for all v ∈ V, so d(u, a) < diam(G)/2 and d(u, b) < diam(G)/2
=> d(u, a) + d(u, b) < d(a, b)
Contradiction with the triangle inequality.
∴ 2·ecc(u) ≥ diam(G)
[Figure: a diameter path between a and b, and a vertex u.]
The Fringe Algorithm
Tu denotes an unordered BFS tree rooted at u.
Tu is a spanning subgraph of G, so the distance between any two vertices in Tu is at least their distance in G
=> diam(G) ≤ diam(Tu)
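The Fringe Algorithm
The fringe of u, denoted F(u), is the set of vertices v ∈ V such that d(u, v) = ecc(u).
[Figure: a BFS tree rooted at u with three deepest vertices A, B and C, so |F(u)| = 3.]
Run BFS(A) => ecc(A), BFS(B) => ecc(B), BFS(C) => ecc(C).
B(u) = max {ecc(A), ecc(B), ecc(C)}, i.e. the maximum eccentricity over the vertices of F(u).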
The Fringe Algorithm
Lemma. U(u) ≥ D, where D is the diameter of G and U(u) is the upper bound computed by the fringe method from u.
The Fringe Algorithm
Case 1: |F(u)| = 1 => U(u) = diam(Tu)
Case 2: |F(u)| > 1, B(u) = 2ecc(u) => U(u) = diam(Tu)
Case 3: |F(u)| > 1, B(u) = 2ecc(u)-1 => U(u) = 2ecc(u)-1
Case 4: |F(u)| > 1, B(u) < 2ecc(u)-1 => U(u) = 2ecc(u)-2
The Fringe Algorithm
Case 1: |F(u)| = 1
[Figure: a BFS tree rooted at u with a single deepest vertex.]
The Fringe Algorithm
Case 2: |F(u)| > 1, B(u) = 2ecc(u)
Example: ecc(u) = 3 and diam(Tu) = 6, so the diameter upper bound is 6.
B(u) provides a lower bound
=> if B(u) = 2·ecc(u), the lower bound meets the upper bound
∴ diameter = diam(Tu)
The Fringe Algorithm
Case 3: |F(u)| > 1, B(u) = 2ecc(u)-1
Non-leaf nodes a, b: d(a, u) ≤ ecc(u)-1 and d(b, u) ≤ ecc(u)-1, so upper bound = 2ecc(u)-2
Leaf nodes: upper bound = 2ecc(u)
if B(u) = 2ecc(u)-1
=> diameter = 2ecc(u)-1
The Fringe Algorithm
Case 4: |F(u)| > 1, B(u) < 2ecc(u)-1
Non-leaf nodes a, b: d(a, u) ≤ ecc(u)-1 and d(b, u) ≤ ecc(u)-1, so upper bound = 2ecc(u)-2
Leaf nodes: upper bound = 2ecc(u)
if B(u) < 2ecc(u)-1
=> diameter ≤ 2ecc(u)-2
The Fringe Algorithm
The fringe algorithm correctly computes an upper bound for the diameter of the input graph G, using at most |F(u)|+3 BFSes.
The Fringe Algorithm
Let r, a, and b be the vertices identified by the double sweep (using two BFSes).
Find the vertex u that is halfway along the path connecting a and b inside the BFS tree Ta.
Compute the BFS tree Tu and its eccentricity ecc(u).
If |F(u)| > 1, find the BFS tree Tz for each z ∈ F(u) and compute B(u).
If B(u) = 2ecc(u)-1, return 2ecc(u)-1.
If B(u) < 2ecc(u)-1, return 2ecc(u)-2.
Return diam(Tu).
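The steps of the fringe method can be sketched as below. This is a hedged illustration for unweighted, connected graphs in adjacency-dictionary form: the function names, the choice of walking floor(d(a, b)/2) steps from b to locate the midpoint, and the returned pair (double sweep lower bound, fringe upper bound) are all assumptions of this sketch.

```python
from collections import deque

def bfs(adj, source):
    """Return (dist, parent) dictionaries of a BFS from source."""
    dist, parent = {source: 0}, {source: None}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y], parent[y] = dist[x] + 1, x
                queue.append(y)
    return dist, parent

def tree_diameter(parent):
    """Diameter of the tree described by BFS parent pointers (double sweep)."""
    tree = {x: [] for x in parent}
    for child, par in parent.items():
        if par is not None:
            tree[child].append(par)
            tree[par].append(child)
    dist, _ = bfs(tree, next(iter(tree)))
    a = max(dist, key=dist.get)
    dist_a, _ = bfs(tree, a)
    return max(dist_a.values())

def fringe_bounds(adj, r):
    """Return (lower, upper): double sweep lower bound and fringe upper bound."""
    dist_r, _ = bfs(adj, r)
    a = max(dist_r, key=dist_r.get)
    dist_a, parent_a = bfs(adj, a)
    b = max(dist_a, key=dist_a.get)
    lower = dist_a[b]
    u = b
    for _ in range(lower // 2):        # walk halfway from b towards a in T_a
        u = parent_a[u]
    dist_u, parent_u = bfs(adj, u)
    ecc_u = max(dist_u.values())
    fringe = [z for z, d in dist_u.items() if d == ecc_u]
    if len(fringe) > 1:
        b_u = max(max(bfs(adj, z)[0].values()) for z in fringe)
        if b_u == 2 * ecc_u - 1:
            return lower, 2 * ecc_u - 1
        if b_u < 2 * ecc_u - 1:
            return lower, 2 * ecc_u - 2
    return lower, tree_diameter(parent_u)

# Hypothetical inputs: a path on 5 vertices and cycles on 5 and 6 vertices.
path5 = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
cycle5 = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
```

On the path and on the 5-cycle the two bounds coincide (4 and 2 respectively). On the 6-cycle the fringe of u is a single vertex, so the sketch falls back to diam(Tu) = 5, which is a valid but unmatched upper bound for the true diameter 3.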
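Example (1/2)
[Figure: a grid with row = 3 and column = 6, additional vertices x1 … xp and y1, and two marked vertices A and B; Diameter = 6.]
When the number p is large, we choose x1 as r.
Double sweep (DS): x1->A = 3, x1->B = 3, x1->y1 = 4, so y1 is chosen as a.
From y1: y1->A = 4, y1->B = 4, y1->x1 = 4, so b is chosen among A, B and x1.
The double sweep reports diameter = 4, which is wrong.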
Example (2/2)
[Figure: the same grid (row = 3, column = 6) with x1 … xp, y1 and the vertex u.]
Fringe
I. Use DS to find a and b: x1 as a, y1 as b.
II. Find a vertex u that is halfway along the path connecting a and b.
Case 1: III. ecc(u) = 4, |F(u)| > 1, B(u) = 6. IV. B(u) < 2ecc(u)-1, since 6 < (2*4)-1; return 2ecc(u)-2, so diameter = 6.
Case 2: III. ecc(u) = 3, |F(u)| > 1, B(u) = 6. IV. B(u) = 2ecc(u), since 6 = 2*3; return 2ecc(u), so diameter = 6.
A Bad Case for Fringe
[Figures: the double sweep selects r and a, then a and b with midpoint u; the fringe F(u) is highlighted.]
ecc(u) = 3, B(u) = 3
B(u) < 2ecc(u)-1 = 5, so the algorithm returns 2ecc(u)-2 = 4.
Real diameter = 3
∴ The fringe upper bound fails to match the diameter.
Experimental Results (1/2)
Results (44 graphs in total):
Approach   Matches   Failures
fub        37        7
mtub       13        31
hdtub      10        34
rtub       7         37
Implemented in C on a 2.93 GHz Linux workstation with 24 GB of memory.
44 real-world graphs are tested, each with 4,000 to 50 million nodes and 20,000 to 3,000 million edges.
The real diameter is found by exhaustive search to check the obtained upper bounds.
Experimental Results (2/2)
Benchmark   D      fub    mtub   hdtub   rtub
CAH2        18     20     20     20      20
CITP        26     28     30     29      31
DBLP        22     24     24     24      25
P2PG        11     14     15     14      15
ROA1        865    987    987    1047    988
ROA2        794    803    803    873     832
ROA3        1064   1079   1079   1166    1128
For the 7 mismatches, the proposed method (fub) yields an upper bound at least as tight as those of the approaches in previous work.
Finding the Diameter on Weighted Graphs
Consider a large complete graph in which every edge has weight 1 except for a single edge.
The eccentricities of most vertices are 1; however, the diameter of the graph is larger than 1.
The fringe algorithm may not efficiently find tight diameter bounds for weighted graphs.
[Figure: a complete graph on four vertices in which five edges have weight 1 and one edge has weight 1.5.]
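The weighted example can be checked numerically with Dijkstra's algorithm in place of BFS. The sketch below is illustrative only: `dijkstra` and `complete_graph` are hypothetical helpers, and the graph is a complete graph on four vertices with one edge of weight 1.5, matching the weights shown in the figure.

```python
import heapq
from itertools import combinations

def dijkstra(adj, source):
    """Return weighted shortest-path distances from source."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist[x]:
            continue  # stale heap entry
        for y, w in adj[x]:
            nd = d + w
            if nd < dist.get(y, float("inf")):
                dist[y] = nd
                heapq.heappush(heap, (nd, y))
    return dist

def complete_graph(n, heavy_edge, heavy_weight):
    """Complete graph on 0..n-1; every edge weighs 1 except heavy_edge = (u, v) with u < v."""
    adj = {v: [] for v in range(n)}
    for u, v in combinations(range(n), 2):
        w = heavy_weight if (u, v) == heavy_edge else 1.0
        adj[u].append((v, w))
        adj[v].append((u, w))
    return adj

adj = complete_graph(4, (0, 1), 1.5)
ecc = {v: max(dijkstra(adj, v).values()) for v in adj}
diameter = max(ecc.values())
```

Only the two endpoints of the heavy edge have eccentricity 1.5; every other vertex has eccentricity 1, so a search started from a typical vertex sees nothing that hints at the larger diameter.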
Minimum Diameter Spanning Trees
Minimum diameter spanning tree (MDST) problem:
Given a graph G = (V, E) with edge weights $w(e) \in \mathbb{R}$, $e \in E$,
find a spanning tree T for G such that $\max_{\text{simple path } p \in T} \sum_{e \in p} w(e)$ is minimized.
[Figure: a weighted graph with two spanning trees, one with Diameter = 5 and the MDST with Diameter = 3.]
Outline
Introduction
Previous Work
Finding the Diameter in Real-World Graphs
Conclusion and Future Work
Other Related Topics
Geometric MDST
MDST
R98943086 莊舜翔
R98943090 曹蕙芳, R98943088 周邦彥
F96943167 施信瑋
F97943070 方劭云
R98921072 金 蘊
R99921040 林國偉, R99942061 葉書豪, R98943086 莊舜翔
Geometric MDST
Geometric MDST (GMDST): given a set of n points in Euclidean space, find a spanning tree connecting these points such that the length of its diameter is minimized.
GMDST corresponds to finding an MDST on a complete graph whose edge weights are the Euclidean distances between the points.
Monopolar and Dipolar
A spanning tree is said to be monopolar if there exists a point (called the monopole) such that all remaining points are connected to it.
A spanning tree is said to be dipolar if there exist two points (called the dipole) such that all remaining points are connected to one of the two points in the dipole.
[Figure: a monopolar spanning tree with its monopole, and a dipolar spanning tree with its dipole.]
GMDST with a Simple Topology
Theorem: there exists a GMDST of a set S of n points which is either monopolar (n ≥ 3) or dipolar (n ≥ 4).
[Figure: all monopolar spanning trees of 4 points, and 4 dipolar spanning trees of the same 4 points.]
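Center Edge
An edge $(a_{i-1}, a_i)$ is a center edge of a path $P = (a_0, a_1, \ldots, a_k)$ if $\max\{dist_P(a_0, a_{i-1}),\ dist_P(a_i, a_k)\}$ is minimized.
[Figure: the path a_0, a_1, …, a_{i-1}, a_i, …, a_{k-1}, a_k, with dist(a_0, a_{i-1}) and dist(a_i, a_k) marked on either side of the center edge.]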
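The definition of a center edge translates directly into a linear scan over the path. In this sketch the path is given by its list of edge lengths, where `weights[j]` is the length of edge (a_j, a_{j+1}); the function name and this input format are assumptions made for illustration.

```python
def center_edge(weights):
    """Return (i, value) where (a_{i-1}, a_i) is a center edge of the path.

    value is max(dist(a_0, a_{i-1}), dist(a_i, a_k)) for that edge.
    Ties are broken in favour of the smallest index i.
    """
    total = sum(weights)
    best_i, best_val = None, float("inf")
    prefix = 0  # dist(a_0, a_{i-1})
    for i in range(1, len(weights) + 1):
        suffix = total - prefix - weights[i - 1]  # dist(a_i, a_k)
        val = max(prefix, suffix)
        if val < best_val:
            best_i, best_val = i, val
        prefix += weights[i - 1]
    return best_i, best_val

# Hypothetical path with k = 5 edges of lengths 1, 1, 4, 1, 1.
i, value = center_edge([1, 1, 4, 1, 1])
```

For this path the center edge is the long middle edge (a_2, a_3): both sides then have length 2, and any other choice leaves a side of length at least 6.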
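Lemma
Let $(a_{i-1}, a_i)$ be a center edge of a path $P = (a_0, a_1, \ldots, a_k)$; then:
(1) $dist_P(a_0, a_{i-1}) \le dist_P(a_{i-1}, a_k)$, and
(2) $dist_P(a_i, a_k) \le dist_P(a_0, a_i)$.
Otherwise, the center edge is not $(a_{i-1}, a_i)$. For (1), write $A = dist(a_0, a_{i-1})$ and $B = dist(a_{i-1}, a_k)$, and let $e_{i-2}$ and $e_{i-1}$ be the lengths of the edges $(a_{i-2}, a_{i-1})$ and $(a_{i-1}, a_i)$. If A > B, then $\max\{A,\ B - e_{i-1}\} > \max\{A - e_{i-2},\ B\}$, so $(a_{i-2}, a_{i-1})$ would be a better choice.
[Figure: the path a_0, …, a_k with the edges e_{i-2} and e_{i-1} on either side of a_{i-1}, and the two distances A and B.]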
Proof of the Theorem
Theorem: there exists a GMDST of a set S of n points which is either monopolar or dipolar.
Proof, Case 1: given any optimal GMDST T whose diameter is composed of only two edges, i.e. D(T) = (a_0, a_1, a_2) of size D_T, a monopolar spanning tree T’ can be constructed with the same diameter.
For any pair of points u and v in T’:
$$dist_{T'}(u, v) = |u, a_1| + |v, a_1| \le dist_T(u, a_1) + dist_T(v, a_1) \le |a_0, a_1| + |a_1, a_2| = D_T$$
[Figure: the optimal tree T and the monopolar tree T’, with points u and v attached to a_1.]
Proof of the Theorem (cont’d)
Case 2: given any optimal GMDST T with diameter D(T) = (a_0, a_1, …, a_k) of size D_T, k ≥ 3, a dipolar spanning tree T’’ can be constructed with the same diameter.
Let (a_{i-1}, a_i) be the center edge of D(T).
Connect all points in the subtree T_{i-1} to a_{i-1}, and connect all points in the subtree T_i to a_i.
[Figure: the optimal tree T with its center edge and the subtrees T_{i-1} and T_i, next to the dipolar tree T’’.]
Proof of the Theorem (cont’d)
For any pair of points u and v, if the two points are in different subtrees, their distance in T’’ is clearly no more than D_T.
If u and v are in the same subtree, say T_i:
$$dist_{T''}(u, v) = |u, a_i| + |v, a_i| \le dist_T(u, a_i) + dist_T(v, a_i) \le dist_T(a_i, a_k) + dist_T(a_i, a_k) \le dist_T(a_0, a_i) + dist_T(a_i, a_k) = D_T$$
[Figure: the optimal tree T and the dipolar tree T’’, with points u and v in the subtrees T_{i-1} and T_i.]
Finding a Geometric MDST
Theorem: there exists a GMDST of a set S of n points which is either monopolar or dipolar.
By enumerating all monopolar and dipolar spanning trees of a given set of points, an optimal GMDST can be found.
The enumeration process can be done in Θ(n^3) time.
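To give a feel for the enumeration, the sketch below covers only the monopolar half: it tries every point as the monopole and evaluates the resulting star. The dipolar half, which is needed for a complete GMDST search, is deliberately left out. The function name and the use of `math.dist` (Python 3.8 or later) are choices made for this illustration.

```python
import math

def best_monopolar_tree(points):
    """Return (diameter, index of the monopole) of the best monopolar spanning tree.

    In a star, the diameter is the sum of the two longest spokes.
    """
    best = (float("inf"), None)
    for i, center in enumerate(points):
        spokes = sorted(
            (math.dist(center, p) for j, p in enumerate(points) if j != i),
            reverse=True,
        )
        diameter = sum(spokes[:2])
        if diameter < best[0]:
            best = (diameter, i)
    return best

# Hypothetical inputs: three collinear points and the corners of a unit square.
collinear = [(0, 0), (2, 0), (1, 0)]
square = [(0, 0), (1, 0), (0, 1), (1, 1)]
```

For the collinear points the middle point is the best monopole, giving diameter 2. For the unit square every corner gives the same monopolar diameter, 1 + √2.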
Introduction
[Ho et al., SIAMJ’91] solved the geometric MDST problem in O(n^3) time.
In fact, the general MDST problem is identical to the absolute 1-center problem (A1CP).
The absolute 1-center problem: for $G = (V, E)$ with $|V| = n$ and $|E| = m$, let $F_G(x) = \max_{v \in V} d_G(x, v)$; find $x = x^*$ such that $F_G(x)$ is minimized.
Equivalence of A1CP and MDST
Theorem: if x* is an absolute 1-center (over the continuum set), then SPT(x*) is a minimum diameter spanning tree.
The Proof of Equivalence
Proof idea:
Consider the metric-space solution over the continuum set, SPT(y*).
The diameter of SPT(x*) equals that of SPT(y*).
As SPT(y*) is minimum, the MDST problem is solved.
Continuum set:
Let the graph be rectifiable.
Refer to interior points on an edge by their distances from the two end nodes.
Property of Continuum Set
For a given tree T, the diameter D(T) equals $2 \cdot F_T(y^*)$, where y* is the absolute 1-center of T.
[Figure: a weighted tree whose absolute 1-center lies at equal distance from the two ends of the diameter.]
Proof
Assume that z* is the absolute 1-center of G.
By the property of the continuum set, $D(SPT(z^*)) \le 2 \max_{v \in V} d_{SPT(z^*)}(z^*, v)$.
Since the tree is the shortest path tree rooted at z*, $2 \max_{v \in V} d_{SPT(z^*)}(z^*, v) = 2 \max_{v \in V} d_G(z^*, v)$.
Since z* is the absolute 1-center of G, $2 \max_{v \in V} d_G(z^*, v) \le 2 \max_{v \in V} d_G(u, v)$ for all u.
For any tree T_i rooted at u, $2 \max_{v \in V} d_G(u, v) \le 2 \max_{v \in V} d_{T_i(u)}(u, v)$.
It implies that, for any spanning tree T_j, $2 \max_{v \in V} d_G(z^*, v) \le D(T_j)$, and hence $D(SPT(z^*)) \le D(T_j)$.
Conclusion
The concepts of monopolar and dipolar trees [Ho et al., SIAMJ’91] are exactly the same as the proved result.
By using all-pairs shortest distances [Fredman and Tarjan, JACM’87], the A1CP can be solved in O(mn + n^2 log n) time.
[Figure: a monopole and a dipole, each marked with the absolute 1-center.]
Conclusion
In today’s presentation, we have
Introduced the difference between finding the diameter of a tree and finding the diameter of a general graph
Given some naïve algorithms for finding the diameter of a graph
Presented the double sweep algorithm introduced in the previous work
Presented the fringe algorithm, which extends the double sweep algorithm
Compared the double sweep algorithm and the fringe algorithm
Conclusion (cont’d)
Besides, we further
Identified the difference between finding the diameter of an unweighted graph and finding the diameter of a weighted graph
Presented two algorithms that find minimum diameter spanning trees on weighted graphs
Future Work
Another interesting related topic is the design methodology for directed graphs with minimum diameter:
“Design to minimize diameter on building-block network”, Makoto Imase and Masaki Itoh
“A design for directed graphs with minimum diameter”, Makoto Imase and Masaki Itoh
Given the number of nodes and upper bounds on the in- and out-degree, design a directed graph such that the diameter is minimized.
[Figure: an example with n = 9, d = 2.]
Future Work (cont’d)
How to find the diameter (or tight upper and lower bounds) of a weighted graph is still an open problem.
[Figure: two weighted examples, one with Diameter = 1.5 and one with Diameter = 9.]