The Structure of Sparse Random Bipartite Graphsdmg.tuwien.ac.at/nfn/WDM2008/talks/wdm_cuckoo.pdf · Random Bipartite Graphs Introduction • Random Bipartite Graphs • First Results

1 / 27

The Structure of Sparse Random Bipartite Graphs

Reinhard Kutzelnigg∗

Workshop on Discrete Mathematics

Vienna, November 22, 2008

based on joint work with Michael Drmota

∗Research supported by the Austrian Science Foundation FWF, project S9604 and by theEU FP6-NEST-Adventure Programme, Contract number 028875 (NEMO).

Random Bipartite Graphs

Introduction• Random BipartiteGraphs

• First Results

Generating Functions

The ComponentStructure

Application

2 / 27

• We consider multigraphs with two types of labelled nodes.• Each labelled edge connects nodes of different types and

is chosen uniformly at random.• We concentrate on (relatively) sparse graphs.



• First Results



Application

2 / 27



1 1

2 2

3 3

4 4

5 5



• First Results



Application

2 / 27



1 1

2 2

3 3

4 4

5 5

a



• First Results



Application

2 / 27



1 1

2 2

3 3

4 4

5 5

a

b



• First Results



Application

2 / 27



1 1

2 2

3 3

4 4

5 5

a

b

c



• First Results



Application

2 / 27



1 1

2 2

3 3

4 4

5 5

a

b

c

d



• First Results



Application

2 / 27



1 1

2 2

3 3

4 4

5 5

a

b

c

de



• First Results



Application

2 / 27



1 1

2 2

3 3

4 4

5 5

a

b

c

de

f



• First Results



Application

2 / 27



1 1

2 2

3 3

4 4

5 5

a

b

de

f

cg

First Results


• First Results



Application

3 / 27

Consider a random bipartite graph G, consisting of m nodesof each type and n edges.

Theorem 1. The probability, that G contains only tree andunicyclic components, provided that n = ⌊(1 − ε)m⌋ andε ∈ (0, 1), equals

1 − (2ε2 − 5ε + 5)(1 − ε)3

12(2 − ε)2ε3

1

m+ O

(

1

m2

)

.

Theorem 2. Assume m equals n. The probability, that Gcontains only tree and unicyclic components, equals√

2/3 + O(1).


• First Results



Application

4 / 27

• Devroye and Morin showed 1 − O(1/m) in 2001.• The analytic structure of generating functions for bipartite

random graphs is more difficult than that of usual randomgraphs.

• Nevertheless the results look the same, cf. Janson et al.1993. Thus, one can expect that most properties of randomgraphs have a counterpart in random bipartite graphs (birthof giant component etc.).


Introduction


• Bipartite Trees

• Cyclic Components

• Trees and Unicycles

• Asymptotic Analysis

• The Saddle PointMethod

• The “critical” case


Application

5 / 27

Bipartite Trees

Introduction


• Bipartite Trees







Application

6 / 27

• We use the generating functions t1(x, y) and t2(x, y) forbipartite rooted trees, where the root is contained in firstrespectively second subset of nodes.

• This generating functions are given by

t1(x, y) = xet2(x,y), t2(x, y) = yet1(x,y).

• Let t̃(x, y) denote the generating function of unrootedbipartite trees.

• Furthermore, one can show:

t̃(x, y) = t1(x, y) + t2(x, y) − t1(x, y)t2(x, y)

Cyclic Components

Introduction


• Bipartite Trees







Application

7 / 27

• Of course, a cycle has to have an even number of nodes,say 2k, where k nodes are of type 1 and the other k nodesof type 2.

• A cyclic node of type 1 can be considered as the root of arooted tree of type 1 and similarly, for type 2.

• The product of the generating functions of this trees isdivided by 2k, to account for cyclic order and change oforientation.

Introduction


• Bipartite Trees







Application

8 / 27

• Consequently the generating function of a connected graphwith exactly one cycle is given by

c(x, y) =∑

k≥1

1

2kt1(x, y)kt2(x, y)k

=1

2log

1

1 − t1(x, y)t2(x, y).

Trees and Unicycles

Introduction


• Bipartite Trees







Application

9 / 27

• Let g◦(x, y) denote the generating function of bipartitegraphs consisting only of tree and unicyclic components.

• Note that such a graph possesses exactly 2m − n treecomponents.

• Hence, we get

g◦(x, y) =1

(2m − n)!

t̃(x, y)2m−n

√

1 − t1(x, y)t2(x, y).

• We are interested in [xmym]g◦(x, y).

Asymptotic Analysis

10 / 27

[xmym]g◦(x, y)

=−(m!)2

4π(2m − n)!

∫

|x|=x0

∫

|y|=y0

t̃(x, y)2m−n

√

1 − t1(x, y)t2(x, y)

dx dy

(xy)m+1

This is in fact an integral that can be asymptotically evaluated with helpof a (double) saddle point method. It turns out, that if n = (1− ε)m andε ∈ (0, 1) is fixed, the saddle point is given by

x0 = y0 =n

me−

nm = (1 − ε)eε−1 <

1

e.

The Saddle Point Method

11 / 27

Lemma 1. f(x, y) and g(x, y) analytic functions in a ball around (0, 0)(+ technical assumptions):

[xm1ym2 ]g(x, y)f(x, y)k

=g(x0, y0)f(x0, y0)

k

2πxm1

0 ym2

0 k√

∆

(

1 +h

24∆3

1

k+ O

(

1

k2

))

,

where x0 and y0 are uniquely defined by

m1

k=

x0

f(x0, y0)

[

∂

∂xf(x, y)

]

(x0,y0)

m2

k=

y0

f(x0, y0)

[

∂

∂yf(x, y)

]

(x0,y0)

.

(m1, m2, and k have to be of the same order of magnitude)

12 / 27

Generally, let the cummulants κij and κij be

κij =

[

∂i

∂ui

∂j

∂vjlog f(x0e

u, y0ev)

]

(0,0)

κij =

[

∂i

∂ui

∂j

∂vjlog g(x0e

u, y0ev)

]

(0,0)

.

Further let ∆ = κ20κ02 − κ211, then h is a constant depending on

κ02, κ11, κ20, κ03, κ12, κ21, κ30, κ01, κ10, κ02, κ11, and κ20.

The “critical” case

Introduction


• Bipartite Trees







Application

13 / 27

• Here, we consider the special case ε = 0.• The proof of Theorem 2 follows the same idea.• However, the saddle point x0 = y0 = 1/e coalesces with

the singularity of the denominator of

t̃(x, y)2m−n

√

1 − t1(x, y)t2(x, y).

• We use the following series of #G◦m,m,m and consider

each summand separately,

(m!)2∑

k≥0

(

2k

k

)

1

4k[xmym]t̃(x, y)mt1(x, y)kt2(x, y)k.

14 / 27

• Using Lagrange’s Inversion Theorem, we get

[xmym]t̃(x, y)mt1(x, y)kt2(x, y)k =1

m[umym]f(u, y)ml(u, y)h(u, y).

Hereby, we use the following functions:

f(u, y) = (u + yeu(1 − u)) exp (yeu) ,

l(u, y) = uk (yeu)k ,

h(u, y) = umu − myeuu2 + ku + kyeu + ku2 − ku2yeu

u (u + yeu(1 − u)).

Introduction


• Bipartite Trees







Application

15 / 27

• The saddle point now equals u0 = 1, y0 = 1/e.• We have to handle the integral

∞∫

0

se− 2

3s3+ 2ζ

3√

mks

dt ds.

• This function is related to the Lommel function of secondkind, that is a solution of the inhomogeneous Besseldifferential equation

x2 d2y

dx2+ x

dy

dx+ (x2 − ν2)y = xµ+1.

The Component Structure

Introduction



• Number of Cycles

• Trees with fixed size• Nodes in all cyclicComponents

• Remarks

• Proof: Step 1

• Proof: Step 2

Application

16 / 27

Number of Cycles

Introduction





• Remarks

• Proof: Step 1

• Proof: Step 2

Application

17 / 27

Suppose that ε ∈ (0, 1) is fixed and that n = ⌊(1 − ε)m⌋.Then a labelled random bipartite multigraph with 2 × mvertices and n edges satisfies the following properties:

• The number of unicyclic components with cycle length 2khas in limit a Poisson distribution Po(λk) with parameter

λk =1

2k(1 − ε)2k ,

and the number of unicyclic components has in limit aPoisson distribution Po(λ), too, with parameter

λ = −1

2log

(

1 − (1 − ε)2)

.

Trees with fixed size

18 / 27

• Denote the number of tree components with k vertices by tk. Meanand variance of this random variable are asymptotically equal to

mµ = 2mkk−2(1 − ε)k−1ek(ε−1)

k!,

respectively

mσ2 = mµ−2me2k(ε−1)k2k−4(1 − ε)2k−3(k2ε2 + k2ε − 4kε + 2)

(k!)2.

Furthermore tk satisfies a central limit theorem of the form

tk − µ

σ→ N(0, 1).

Nodes in all cyclic Components

Introduction





• Remarks

• Proof: Step 1

• Proof: Step 2

Application

19 / 27

• Furthermore, the expected value of the number of nodes inunicyclic components is asymptotically given by

(1 − ε)2

ε (1 − (1 − ε)2),

and its variance by

(1 − ε)2(ε2 − 3ε + 4)

ε2 (1 − (1 − ε)2)2 .

Remarks

Introduction





• Remarks

• Proof: Step 1

• Proof: Step 2

Application

20 / 27

• Because of Theorem 1, it is sufficient to consider graphsthat contain tree and unicylic components only. Recall thecorresponding generating function

g◦(x, y) =t̃(x, y)2m−n

(2m − n)!exp(c(x, y)),

where c(x, y) denotes the generating function of anunicyclic component.

• Similar results hold for “usual” random graphs too.

Proof: Step 1

21 / 27

Introduce a “new” Variable w to mark the Parameter of interest:

• Number of cycles

g◦1(x, y, w) =

t̃(x, y)2m−n

(2m − n)!exp(wc(x, y))

• Trees possessing k nodes

g◦2(x, y, w) =

(

t̃(x, y) + (w − 1)t̃k(x, y))2m−n

(2m − n)!exp(c(x, y))

• Nodes in all cyclic Components

g◦3(x, y, w) =

t̃(x, y)2m−n

(2m − n)!exp(c(wx,wy))

Proof: Step 2

Introduction





• Remarks

• Proof: Step 1

• Proof: Step 2

Application

22 / 27

Calculate the l-th factorial Moment

Ml =[xmym]

[

∂l

∂wl g◦(x, y, w)

]

w=1

[xmym]g◦t (x, y, 1)

,

or the characteristic function

φ(s) =[xmym]g◦(x, y, eis)

[xmym]g◦t (x, y, 1)

.

The calculation itself is again performed using the saddle pointmethod.

Application

Introduction



Application

• Cuckoo Hashing

23 / 27

Cuckoo Hashing

24 / 27

• Hash table data structure introduced by Pagh and Rodler in 2001.• Offers constant worst case search time.• Uses two tables and two different hash functions h1 and h2, both

determine a unique position in each table.• Resolve conflicts by rearranging keys.• Algorithm can be modelled by a random bipartite graph.

Cuckoo Hashing

24 / 27



1 1

2 2

3 3

4 4

5 5

Cuckoo Hashing

24 / 27



1 1

2 2

3 3

4 4

5 5

a aa

Cuckoo Hashing

24 / 27



1 1

2 2

3 3

4 4

5 5

a aa

b bb

Cuckoo Hashing

24 / 27



1 1

2 2

3 3

4 4

5 5

a aa

b bb

c cc

Cuckoo Hashing

24 / 27



1 1

2 2

3 3

4 4

5 5

a aa

b bb

c cc

d d

d

Cuckoo Hashing

24 / 27



1 1

2 2

3 3

4 4

5 5

a aa

b bb

c cc

e eed d

d

Cuckoo Hashing

24 / 27



1 1

2 2

3 3

4 4

5 5

a aa

b bb

c cc

e e ed d

df f

f

Cuckoo Hashing

24 / 27



1 1

2 2

3 3

4 4

5 5

a aa

b bb

e e ed d

df f

f

cg c

c

g

25 / 27

References (1)

26 / 27

L. Devroye and P. Morin. Cuckoo hashing: Further analysis. InformationProcessing Letters, 86(4):215–219, 2003.

Michael Drmota. A bivariate asymptotic expansion of coefficients ofpowers of generating functions. European Journal of Combinatorics,15(2):139–152, 1994.

Dimitris Fotakis, Rasmus Pagh, Peter Sanders, and Paul G. Spirakis.Space efficient hash tables with worst case constant access time.Theory Comput. Syst., 38(2):229–248, 2005.

Svante Janson, Donald E. Knuth, Tomasz Łuczak, and Boris Pittel. Thebirth of the giant component. Random Structures and Algorithms,4(3):233–359, 1993.

I. B. Kalugin. The number of components of a random bipartite graph.Discrete Mathematics and Applications, 1(3):289–299, 1991.

References (2)

27 / 27

Donald E. Knuth. The Art of Computer Programming, Volume III: Sortingand Searching. Addison-Wesley, Boston, second edition, 1998.

R. Kutzelnigg. Bipartite random graphs and cuckoo hashing. InProceedings of the 4th Colloquium on Mathematics and ComputerScience, DMTCS, pages 403–406, 2006.

R. Kutzelnigg. An improved version of cuckoo hashing: Average caseanalysis of construction cost and search operations. In Proceedings ofthe 19th IWOCA, pages 253–266, 2008.

Reinhard Kutzelnigg. Random Bipartite Graphs and their Application toCuckoo Hashing. PhD thesis, Vienna University of Technology, 2008.

R. Pagh and F. F. Rodler. Cuckoo hashing. Journal of Algorithms,51(2):122–144, 2004.

Documents

The Structure of Sparse Random Bipartite Graphsdmg.tuwien.ac.at/nfn/WDM2008/talks/wdm_cuckoo.pdf · Random Bipartite Graphs Introduction • Random Bipartite Graphs • First Results