Upload
others
View
12
Download
0
Embed Size (px)
Citation preview
1 / 27
The Structure of Sparse Random Bipartite Graphs
Reinhard Kutzelnigg∗
Workshop on Discrete Mathematics
Vienna, November 22, 2008
based on joint work with Michael Drmota
∗Research supported by the Austrian Science Foundation FWF, project S9604 and by theEU FP6-NEST-Adventure Programme, Contract number 028875 (NEMO).
Random Bipartite Graphs
Introduction• Random BipartiteGraphs
• First Results
Generating Functions
The ComponentStructure
Application
2 / 27
• We consider multigraphs with two types of labelled nodes.• Each labelled edge connects nodes of different types and
is chosen uniformly at random.• We concentrate on (relatively) sparse graphs.
Random Bipartite Graphs
Introduction• Random BipartiteGraphs
• First Results
Generating Functions
The ComponentStructure
Application
2 / 27
• We consider multigraphs with two types of labelled nodes.• Each labelled edge connects nodes of different types and
is chosen uniformly at random.• We concentrate on (relatively) sparse graphs.
1 1
2 2
3 3
4 4
5 5
Random Bipartite Graphs
Introduction• Random BipartiteGraphs
• First Results
Generating Functions
The ComponentStructure
Application
2 / 27
• We consider multigraphs with two types of labelled nodes.• Each labelled edge connects nodes of different types and
is chosen uniformly at random.• We concentrate on (relatively) sparse graphs.
1 1
2 2
3 3
4 4
5 5
a
Random Bipartite Graphs
Introduction• Random BipartiteGraphs
• First Results
Generating Functions
The ComponentStructure
Application
2 / 27
• We consider multigraphs with two types of labelled nodes.• Each labelled edge connects nodes of different types and
is chosen uniformly at random.• We concentrate on (relatively) sparse graphs.
1 1
2 2
3 3
4 4
5 5
a
b
Random Bipartite Graphs
Introduction• Random BipartiteGraphs
• First Results
Generating Functions
The ComponentStructure
Application
2 / 27
• We consider multigraphs with two types of labelled nodes.• Each labelled edge connects nodes of different types and
is chosen uniformly at random.• We concentrate on (relatively) sparse graphs.
1 1
2 2
3 3
4 4
5 5
a
b
c
Random Bipartite Graphs
Introduction• Random BipartiteGraphs
• First Results
Generating Functions
The ComponentStructure
Application
2 / 27
• We consider multigraphs with two types of labelled nodes.• Each labelled edge connects nodes of different types and
is chosen uniformly at random.• We concentrate on (relatively) sparse graphs.
1 1
2 2
3 3
4 4
5 5
a
b
c
d
Random Bipartite Graphs
Introduction• Random BipartiteGraphs
• First Results
Generating Functions
The ComponentStructure
Application
2 / 27
• We consider multigraphs with two types of labelled nodes.• Each labelled edge connects nodes of different types and
is chosen uniformly at random.• We concentrate on (relatively) sparse graphs.
1 1
2 2
3 3
4 4
5 5
a
b
c
de
Random Bipartite Graphs
Introduction• Random BipartiteGraphs
• First Results
Generating Functions
The ComponentStructure
Application
2 / 27
• We consider multigraphs with two types of labelled nodes.• Each labelled edge connects nodes of different types and
is chosen uniformly at random.• We concentrate on (relatively) sparse graphs.
1 1
2 2
3 3
4 4
5 5
a
b
c
de
f
Random Bipartite Graphs
Introduction• Random BipartiteGraphs
• First Results
Generating Functions
The ComponentStructure
Application
2 / 27
• We consider multigraphs with two types of labelled nodes.• Each labelled edge connects nodes of different types and
is chosen uniformly at random.• We concentrate on (relatively) sparse graphs.
1 1
2 2
3 3
4 4
5 5
a
b
de
f
cg
First Results
Introduction• Random BipartiteGraphs
• First Results
Generating Functions
The ComponentStructure
Application
3 / 27
Consider a random bipartite graph G, consisting of m nodesof each type and n edges.
Theorem 1. The probability, that G contains only tree andunicyclic components, provided that n = ⌊(1 − ε)m⌋ andε ∈ (0, 1), equals
1 − (2ε2 − 5ε + 5)(1 − ε)3
12(2 − ε)2ε3
1
m+ O
(
1
m2
)
.
Theorem 2. Assume m equals n. The probability, that Gcontains only tree and unicyclic components, equals√
2/3 + O(1).
Introduction• Random BipartiteGraphs
• First Results
Generating Functions
The ComponentStructure
Application
4 / 27
• Devroye and Morin showed 1 − O(1/m) in 2001.• The analytic structure of generating functions for bipartite
random graphs is more difficult than that of usual randomgraphs.
• Nevertheless the results look the same, cf. Janson et al.1993. Thus, one can expect that most properties of randomgraphs have a counterpart in random bipartite graphs (birthof giant component etc.).
Generating Functions
Introduction
Generating Functions
• Bipartite Trees
• Cyclic Components
• Trees and Unicycles
• Asymptotic Analysis
• The Saddle PointMethod
• The “critical” case
The ComponentStructure
Application
5 / 27
Bipartite Trees
Introduction
Generating Functions
• Bipartite Trees
• Cyclic Components
• Trees and Unicycles
• Asymptotic Analysis
• The Saddle PointMethod
• The “critical” case
The ComponentStructure
Application
6 / 27
• We use the generating functions t1(x, y) and t2(x, y) forbipartite rooted trees, where the root is contained in firstrespectively second subset of nodes.
• This generating functions are given by
t1(x, y) = xet2(x,y), t2(x, y) = yet1(x,y).
• Let t̃(x, y) denote the generating function of unrootedbipartite trees.
• Furthermore, one can show:
t̃(x, y) = t1(x, y) + t2(x, y) − t1(x, y)t2(x, y)
Cyclic Components
Introduction
Generating Functions
• Bipartite Trees
• Cyclic Components
• Trees and Unicycles
• Asymptotic Analysis
• The Saddle PointMethod
• The “critical” case
The ComponentStructure
Application
7 / 27
• Of course, a cycle has to have an even number of nodes,say 2k, where k nodes are of type 1 and the other k nodesof type 2.
• A cyclic node of type 1 can be considered as the root of arooted tree of type 1 and similarly, for type 2.
• The product of the generating functions of this trees isdivided by 2k, to account for cyclic order and change oforientation.
Introduction
Generating Functions
• Bipartite Trees
• Cyclic Components
• Trees and Unicycles
• Asymptotic Analysis
• The Saddle PointMethod
• The “critical” case
The ComponentStructure
Application
8 / 27
• Consequently the generating function of a connected graphwith exactly one cycle is given by
c(x, y) =∑
k≥1
1
2kt1(x, y)kt2(x, y)k
=1
2log
1
1 − t1(x, y)t2(x, y).
Trees and Unicycles
Introduction
Generating Functions
• Bipartite Trees
• Cyclic Components
• Trees and Unicycles
• Asymptotic Analysis
• The Saddle PointMethod
• The “critical” case
The ComponentStructure
Application
9 / 27
• Let g◦(x, y) denote the generating function of bipartitegraphs consisting only of tree and unicyclic components.
• Note that such a graph possesses exactly 2m − n treecomponents.
• Hence, we get
g◦(x, y) =1
(2m − n)!
t̃(x, y)2m−n
√
1 − t1(x, y)t2(x, y).
• We are interested in [xmym]g◦(x, y).
Asymptotic Analysis
10 / 27
[xmym]g◦(x, y)
=−(m!)2
4π(2m − n)!
∫
|x|=x0
∫
|y|=y0
t̃(x, y)2m−n
√
1 − t1(x, y)t2(x, y)
dx dy
(xy)m+1
This is in fact an integral that can be asymptotically evaluated with helpof a (double) saddle point method. It turns out, that if n = (1− ε)m andε ∈ (0, 1) is fixed, the saddle point is given by
x0 = y0 =n
me−
nm = (1 − ε)eε−1 <
1
e.
The Saddle Point Method
11 / 27
Lemma 1. f(x, y) and g(x, y) analytic functions in a ball around (0, 0)(+ technical assumptions):
[xm1ym2 ]g(x, y)f(x, y)k
=g(x0, y0)f(x0, y0)
k
2πxm1
0 ym2
0 k√
∆
(
1 +h
24∆3
1
k+ O
(
1
k2
))
,
where x0 and y0 are uniquely defined by
m1
k=
x0
f(x0, y0)
[
∂
∂xf(x, y)
]
(x0,y0)
m2
k=
y0
f(x0, y0)
[
∂
∂yf(x, y)
]
(x0,y0)
.
(m1, m2, and k have to be of the same order of magnitude)
12 / 27
Generally, let the cummulants κij and κij be
κij =
[
∂i
∂ui
∂j
∂vjlog f(x0e
u, y0ev)
]
(0,0)
κij =
[
∂i
∂ui
∂j
∂vjlog g(x0e
u, y0ev)
]
(0,0)
.
Further let ∆ = κ20κ02 − κ211, then h is a constant depending on
κ02, κ11, κ20, κ03, κ12, κ21, κ30, κ01, κ10, κ02, κ11, and κ20.
The “critical” case
Introduction
Generating Functions
• Bipartite Trees
• Cyclic Components
• Trees and Unicycles
• Asymptotic Analysis
• The Saddle PointMethod
• The “critical” case
The ComponentStructure
Application
13 / 27
• Here, we consider the special case ε = 0.• The proof of Theorem 2 follows the same idea.• However, the saddle point x0 = y0 = 1/e coalesces with
the singularity of the denominator of
t̃(x, y)2m−n
√
1 − t1(x, y)t2(x, y).
• We use the following series of #G◦m,m,m and consider
each summand separately,
(m!)2∑
k≥0
(
2k
k
)
1
4k[xmym]t̃(x, y)mt1(x, y)kt2(x, y)k.
14 / 27
• Using Lagrange’s Inversion Theorem, we get
[xmym]t̃(x, y)mt1(x, y)kt2(x, y)k =1
m[umym]f(u, y)ml(u, y)h(u, y).
Hereby, we use the following functions:
f(u, y) = (u + yeu(1 − u)) exp (yeu) ,
l(u, y) = uk (yeu)k ,
h(u, y) = umu − myeuu2 + ku + kyeu + ku2 − ku2yeu
u (u + yeu(1 − u)).
Introduction
Generating Functions
• Bipartite Trees
• Cyclic Components
• Trees and Unicycles
• Asymptotic Analysis
• The Saddle PointMethod
• The “critical” case
The ComponentStructure
Application
15 / 27
• The saddle point now equals u0 = 1, y0 = 1/e.• We have to handle the integral
∞∫
0
se− 2
3s3+ 2ζ
3√
mks
dt ds.
• This function is related to the Lommel function of secondkind, that is a solution of the inhomogeneous Besseldifferential equation
x2 d2y
dx2+ x
dy
dx+ (x2 − ν2)y = xµ+1.
The Component Structure
Introduction
Generating Functions
The ComponentStructure
• Number of Cycles
• Trees with fixed size• Nodes in all cyclicComponents
• Remarks
• Proof: Step 1
• Proof: Step 2
Application
16 / 27
Number of Cycles
Introduction
Generating Functions
The ComponentStructure
• Number of Cycles
• Trees with fixed size• Nodes in all cyclicComponents
• Remarks
• Proof: Step 1
• Proof: Step 2
Application
17 / 27
Suppose that ε ∈ (0, 1) is fixed and that n = ⌊(1 − ε)m⌋.Then a labelled random bipartite multigraph with 2 × mvertices and n edges satisfies the following properties:
• The number of unicyclic components with cycle length 2khas in limit a Poisson distribution Po(λk) with parameter
λk =1
2k(1 − ε)2k ,
and the number of unicyclic components has in limit aPoisson distribution Po(λ), too, with parameter
λ = −1
2log
(
1 − (1 − ε)2)
.
Trees with fixed size
18 / 27
• Denote the number of tree components with k vertices by tk. Meanand variance of this random variable are asymptotically equal to
mµ = 2mkk−2(1 − ε)k−1ek(ε−1)
k!,
respectively
mσ2 = mµ−2me2k(ε−1)k2k−4(1 − ε)2k−3(k2ε2 + k2ε − 4kε + 2)
(k!)2.
Furthermore tk satisfies a central limit theorem of the form
tk − µ
σ→ N(0, 1).
Nodes in all cyclic Components
Introduction
Generating Functions
The ComponentStructure
• Number of Cycles
• Trees with fixed size• Nodes in all cyclicComponents
• Remarks
• Proof: Step 1
• Proof: Step 2
Application
19 / 27
• Furthermore, the expected value of the number of nodes inunicyclic components is asymptotically given by
(1 − ε)2
ε (1 − (1 − ε)2),
and its variance by
(1 − ε)2(ε2 − 3ε + 4)
ε2 (1 − (1 − ε)2)2 .
Remarks
Introduction
Generating Functions
The ComponentStructure
• Number of Cycles
• Trees with fixed size• Nodes in all cyclicComponents
• Remarks
• Proof: Step 1
• Proof: Step 2
Application
20 / 27
• Because of Theorem 1, it is sufficient to consider graphsthat contain tree and unicylic components only. Recall thecorresponding generating function
g◦(x, y) =t̃(x, y)2m−n
(2m − n)!exp(c(x, y)),
where c(x, y) denotes the generating function of anunicyclic component.
• Similar results hold for “usual” random graphs too.
Proof: Step 1
21 / 27
Introduce a “new” Variable w to mark the Parameter of interest:
• Number of cycles
g◦1(x, y, w) =
t̃(x, y)2m−n
(2m − n)!exp(wc(x, y))
• Trees possessing k nodes
g◦2(x, y, w) =
(
t̃(x, y) + (w − 1)t̃k(x, y))2m−n
(2m − n)!exp(c(x, y))
• Nodes in all cyclic Components
g◦3(x, y, w) =
t̃(x, y)2m−n
(2m − n)!exp(c(wx,wy))
Proof: Step 2
Introduction
Generating Functions
The ComponentStructure
• Number of Cycles
• Trees with fixed size• Nodes in all cyclicComponents
• Remarks
• Proof: Step 1
• Proof: Step 2
Application
22 / 27
Calculate the l-th factorial Moment
Ml =[xmym]
[
∂l
∂wl g◦(x, y, w)
]
w=1
[xmym]g◦t (x, y, 1)
,
or the characteristic function
φ(s) =[xmym]g◦(x, y, eis)
[xmym]g◦t (x, y, 1)
.
The calculation itself is again performed using the saddle pointmethod.
Application
Introduction
Generating Functions
The ComponentStructure
Application
• Cuckoo Hashing
23 / 27
Cuckoo Hashing
24 / 27
• Hash table data structure introduced by Pagh and Rodler in 2001.• Offers constant worst case search time.• Uses two tables and two different hash functions h1 and h2, both
determine a unique position in each table.• Resolve conflicts by rearranging keys.• Algorithm can be modelled by a random bipartite graph.
Cuckoo Hashing
24 / 27
• Hash table data structure introduced by Pagh and Rodler in 2001.• Offers constant worst case search time.• Uses two tables and two different hash functions h1 and h2, both
determine a unique position in each table.• Resolve conflicts by rearranging keys.• Algorithm can be modelled by a random bipartite graph.
1 1
2 2
3 3
4 4
5 5
Cuckoo Hashing
24 / 27
• Hash table data structure introduced by Pagh and Rodler in 2001.• Offers constant worst case search time.• Uses two tables and two different hash functions h1 and h2, both
determine a unique position in each table.• Resolve conflicts by rearranging keys.• Algorithm can be modelled by a random bipartite graph.
1 1
2 2
3 3
4 4
5 5
a aa
Cuckoo Hashing
24 / 27
• Hash table data structure introduced by Pagh and Rodler in 2001.• Offers constant worst case search time.• Uses two tables and two different hash functions h1 and h2, both
determine a unique position in each table.• Resolve conflicts by rearranging keys.• Algorithm can be modelled by a random bipartite graph.
1 1
2 2
3 3
4 4
5 5
a aa
b bb
Cuckoo Hashing
24 / 27
• Hash table data structure introduced by Pagh and Rodler in 2001.• Offers constant worst case search time.• Uses two tables and two different hash functions h1 and h2, both
determine a unique position in each table.• Resolve conflicts by rearranging keys.• Algorithm can be modelled by a random bipartite graph.
1 1
2 2
3 3
4 4
5 5
a aa
b bb
c cc
Cuckoo Hashing
24 / 27
• Hash table data structure introduced by Pagh and Rodler in 2001.• Offers constant worst case search time.• Uses two tables and two different hash functions h1 and h2, both
determine a unique position in each table.• Resolve conflicts by rearranging keys.• Algorithm can be modelled by a random bipartite graph.
1 1
2 2
3 3
4 4
5 5
a aa
b bb
c cc
d d
d
Cuckoo Hashing
24 / 27
• Hash table data structure introduced by Pagh and Rodler in 2001.• Offers constant worst case search time.• Uses two tables and two different hash functions h1 and h2, both
determine a unique position in each table.• Resolve conflicts by rearranging keys.• Algorithm can be modelled by a random bipartite graph.
1 1
2 2
3 3
4 4
5 5
a aa
b bb
c cc
e eed d
d
Cuckoo Hashing
24 / 27
• Hash table data structure introduced by Pagh and Rodler in 2001.• Offers constant worst case search time.• Uses two tables and two different hash functions h1 and h2, both
determine a unique position in each table.• Resolve conflicts by rearranging keys.• Algorithm can be modelled by a random bipartite graph.
1 1
2 2
3 3
4 4
5 5
a aa
b bb
c cc
e e ed d
df f
f
Cuckoo Hashing
24 / 27
• Hash table data structure introduced by Pagh and Rodler in 2001.• Offers constant worst case search time.• Uses two tables and two different hash functions h1 and h2, both
determine a unique position in each table.• Resolve conflicts by rearranging keys.• Algorithm can be modelled by a random bipartite graph.
1 1
2 2
3 3
4 4
5 5
a aa
b bb
e e ed d
df f
f
cg c
c
g
25 / 27
References (1)
26 / 27
L. Devroye and P. Morin. Cuckoo hashing: Further analysis. InformationProcessing Letters, 86(4):215–219, 2003.
Michael Drmota. A bivariate asymptotic expansion of coefficients ofpowers of generating functions. European Journal of Combinatorics,15(2):139–152, 1994.
Dimitris Fotakis, Rasmus Pagh, Peter Sanders, and Paul G. Spirakis.Space efficient hash tables with worst case constant access time.Theory Comput. Syst., 38(2):229–248, 2005.
Svante Janson, Donald E. Knuth, Tomasz Łuczak, and Boris Pittel. Thebirth of the giant component. Random Structures and Algorithms,4(3):233–359, 1993.
I. B. Kalugin. The number of components of a random bipartite graph.Discrete Mathematics and Applications, 1(3):289–299, 1991.
References (2)
27 / 27
Donald E. Knuth. The Art of Computer Programming, Volume III: Sortingand Searching. Addison-Wesley, Boston, second edition, 1998.
R. Kutzelnigg. Bipartite random graphs and cuckoo hashing. InProceedings of the 4th Colloquium on Mathematics and ComputerScience, DMTCS, pages 403–406, 2006.
R. Kutzelnigg. An improved version of cuckoo hashing: Average caseanalysis of construction cost and search operations. In Proceedings ofthe 19th IWOCA, pages 253–266, 2008.
Reinhard Kutzelnigg. Random Bipartite Graphs and their Application toCuckoo Hashing. PhD thesis, Vienna University of Technology, 2008.
R. Pagh and F. F. Rodler. Cuckoo hashing. Journal of Algorithms,51(2):122–144, 2004.