
Analytic Combinatorics (of P. Flajolet)

Profile of Digital Trees

Wojciech Szpankowski∗

Department of Computer Science

Purdue University, W. Lafayette, IN

U.S.A.

August 14, 2012

Polish Combinatorial Conference, Bedlewo, 2012

Dedicated to PHILIPPE FLAJOLET

∗Joint work with H-K. Hwang, M. Drmota, P. Flajolet, P. Nicodeme, G. Park, and M. Ward.

Outline of the Presentation

1. Tries and Suffix Trees

2. Profiles of Tries

• Trie Parameters

• Main Results

• Sketch of Proofs

• Consequences (height, fill-up, shortest path)

3. Profile of Digital Search Trees

4. Applications: Error Resilient Lempel-Ziv’77 (Suffix Trees)

5. Analysis of Algorithms and Analytic Information Theory

Tries and Suffix Trees

[Figure: a binary trie storing the keys 00, 0100, 01010, 01011, 011, 1010, and 1011; edges are labeled 0 (left) and 1 (right).]

A trie, or prefix tree, is an ordered tree data structure that stores keys usually represented by strings.

Tries were introduced by de la Briandais (1959) and Fredkin (1960), who coined the name: "trie" is derived from retrieval.

A suffix tree is a trie built from the suffixes of one string.

Other digital trees are PATRICIA tries and digital search trees (DST).

Typical Tries: In this talk we mostly discuss random tries (and DST) built from $n$ independent sequences generated by a binary memoryless source, with $p$ denoting the probability of generating a "0" ($q = 1 - p \le p$).

Digital Trees

Digital Trees:

Tries, PATRICIA Tries, Digital Search Trees:

Figure 1: A trie, PATRICIA trie, and a digital search tree built from: x1 = 11100..., x2 = 10111..., x3 = 00110..., and x4 = 00001....

Usefulness of Tries

Tries, suffix trees, and DST are widely used in diverse applications:

• automatically correcting words in texts; Kukich (1992);

• taxonomies of regular languages; Watson (1995);

• event history in datarace detection for multi-threaded object-oriented

programs; Choi et al. (2002);

• internet IP addresses lookup; Nilsson and Tikkanen (2002);

• data compression, Lempel-Ziv, . . . ; W.S. (2001);

• distributed hash tables, Malkhi et al. (2002) and Adler et al. (2003).

Fundamental, prototype data structures:

• have a large number of variations and extensions (Patricia, DST, bucket

digital search trees, k-d tries, quadtries, LC-tries, multiple-tries, etc.);

• closely connected to several splitting procedures using coin-flipping:

collision resolution in multi-access (or broadcast) communication

models, loser selection or leader election, etc.

• have direct combinatorial interpretations in terms of words, urn models,

etc.

Outline Update

1. Tries and Suffix Trees

2. Usefulness of Tries

3. Profiles of Tries

• Trie Parameters

• Main Results

• Sketch of Proofs

• Consequences

4. Applications


G. Park, H-K. Hwang, P. Nicodeme, W. Szpankowski,

“Profiles of Tries,”

SIAM J. Computing, 38, 5, 1821-1880, 2009.

External and Internal Profiles

[Figure: the same binary trie as before, annotated with its external and internal profiles. For this tree: $B_n^0 = 0,\ I_n^0 = 1$; $B_n^1 = 0,\ I_n^1 = 2$; $B_n^2 = 1,\ I_n^2 = 2$; $B_n^3 = 1,\ I_n^3 = 2$; $B_n^4 = 3,\ I_n^4 = 1$; $B_n^5 = 2,\ I_n^5 = 0$.]

External profile and internal profile:

$B_n^k$ = number of external nodes at distance $k$ from the root;

$I_n^k$ = number of internal nodes at distance $k$ from the root.
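To make the definitions concrete, here is a small simulation sketch (my own illustration, not from the talk): it builds a trie from $n$ independent keys drawn from a memoryless source with $P(\text{"0"}) = p$ and tabulates the external and internal profiles level by level. The conventions (keys stored only in leaves, a leaf counts as an external node) are my reading of the definitions above.

import random
from collections import defaultdict

class Node:
    __slots__ = ("children", "key")
    def __init__(self):
        self.children = {}   # bit -> Node
        self.key = None      # set only in external (leaf) nodes

def insert(root, bits):
    """Insert a bit string into the trie, splitting leaves as needed."""
    node, depth = root, 0
    while True:
        if node.key is not None:                    # leaf collision: push old key down
            old = node.key
            node.key = None
            node.children[old[depth]] = Node()
            node.children[old[depth]].key = old
        if not node.children:                       # empty node becomes the new leaf
            node.key = bits
            return
        node.children.setdefault(bits[depth], Node())
        node, depth = node.children[bits[depth]], depth + 1

def profiles(root):
    """Return dicts level -> B_n^k (external) and level -> I_n^k (internal)."""
    B, I = defaultdict(int), defaultdict(int)
    stack = [(root, 0)]
    while stack:
        node, k = stack.pop()
        if node.key is not None or not node.children:
            B[k] += 1
        else:
            I[k] += 1
            for child in node.children.values():
                stack.append((child, k + 1))
    return B, I

def random_key(p, length=64):
    return "".join("0" if random.random() < p else "1" for _ in range(length))

root = Node()
n, p = 1000, 0.75
for _ in range(n):
    insert(root, random_key(p))
B, I = profiles(root)
print({k: B[k] for k in sorted(B)})   # external profile of one random trie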

Why Study Profiles?

• Fine, informative shape characteristic;

• Related to path length, depth, height, shortest path, width, etc.;

• Breadth-first search;

• Compression algorithms.

• Mathematically challenging, phenomenally interesting!

Example: Parameters such as the height $H_n$, the shortest path $s_n$, the fill-up level $F_n$, and the depth $D_n$ can be studied through the profiles, since

$$H_n = \max\{k : B_n^k > 0\}, \qquad s_n = \min\{k : B_n^k > 0\},$$

$$F_n = \max\{k : I_n^k = 2^k\}, \qquad \Pr(D_n = k) = \frac{E[B_n^k]}{n}.$$

[Figure: a small trie on keys $x_1, \ldots, x_5$ with the levels $F_n$, $s_n$, $D_n$, and $H_n$ marked.]

Recurrence for the Profiles

External Profile $B_n^k$:

Define the probability generating function

$$B_n^k(u) = E\big[u^{B_n^k}\big] = \sum_{\ell \ge 0} P(B_n^k = \ell)\, u^{\ell}.$$

Then

$$B_n^k(u) = \sum_{i=0}^{n} \binom{n}{i} p^i q^{n-i}\, B_i^{k-1}(u)\, B_{n-i}^{k-1}(u),$$

with $B_n^0(u) = 1$ for $n \ne 1$ and $B_1^0(u) = u$.

The internal profile probability generating function $I_n^k(u) = E\big[u^{I_n^k}\big]$ satisfies the same recurrence with $I_n^0(u) = u$ for $n > 1$ and $I_0^0(u) = I_1^0(u) = 1$.

Average External Profile:

$$E[B_n^k] = \sum_{i=0}^{n} \binom{n}{i} p^i q^{n-i}\big(E[B_i^{k-1}] + E[B_{n-i}^{k-1}]\big), \qquad n \ge 2,\ k \ge 1,$$

under some initial conditions (e.g., $E[B_0^k] = 0$ for all $k$).
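The recurrence can be evaluated exactly by dynamic programming. Below is a short sketch (mine, not from the paper); the initial conditions for $n \le 1$ are my reading of the slides' convention that a single key sits in the root as an external node, i.e., $E[B_1^0] = 1$ and $E[B_n^k] = 0$ for the other small cases.

from math import comb

def expected_external_profile(n_max, k_max, p):
    q = 1.0 - p
    # E[k][n] = E[B_n^k]
    E = [[0.0] * (n_max + 1) for _ in range(k_max + 1)]
    E[0][1] = 1.0                       # one key sits in the root as an external node
    for k in range(1, k_max + 1):
        for n in range(2, n_max + 1):
            E[k][n] = sum(
                comb(n, i) * p**i * q**(n - i) * (E[k - 1][i] + E[k - 1][n - i])
                for i in range(n + 1)
            )
    return E

E = expected_external_profile(n_max=200, k_max=25, p=0.75)
print([round(E[k][200], 3) for k in range(10)])   # expected profile for n = 200 keys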

Main Results

Notation: $r = p/q = p/(1-p) > 1$, and $\alpha := \alpha_{n,k} = k/\log n$. Define

$$\alpha_1 := \frac{1}{\log(1/q)}, \qquad \alpha_2 := \frac{p^2 + q^2}{p^2\log(1/p) + q^2\log(1/q)}, \qquad \alpha_3 := \frac{2}{\log(1/(p^2 + q^2))}.$$

[Figure: $\alpha_1$, $\alpha_2$, and $\alpha_3$ plotted as functions of $p \in (0.5, 1)$.]
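A quick numeric check of these thresholds (my own sketch; natural logarithms assumed, matching $\alpha = k/\log n$ above):

from math import log

def thresholds(p):
    q = 1.0 - p
    a1 = 1.0 / log(1.0 / q)
    a2 = (p**2 + q**2) / (p**2 * log(1.0 / p) + q**2 * log(1.0 / q))
    a3 = 2.0 / log(1.0 / (p**2 + q**2))
    return a1, a2, a3

print(thresholds(0.75))   # e.g. p = 0.75 gives alpha_1 < alpha_2 < alpha_3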

1: Exponential Growth ($0 < \alpha < \alpha_1$):

Let $1 \le k \le \frac{1}{\log q^{-1}}\big(\log n - \log\log\log n + \log(r-1) - \varepsilon\big)$:

$$E[B_n^k] = n q^k (1-q^k)^{n-1}\Big(1 + O\big((\log n)^{-\delta}\big)\Big) = O\big(2^{-n^{\nu}}\big).$$

2: Logarithmic Growth ($0 < \alpha < \alpha_1$):

Let $1 \le k \le \frac{1}{\log q^{-1}}\big(\log n - \log\log\log n + m\log(r-1) - \varepsilon\big)$:

$$E[B_n^k] = O\big(\log\log n \cdot \log^{m-\beta} n\big),$$

where $m$ and $\beta$ are constants ($\beta$ smaller or greater than $m$).

Phase Transitions

3: Polynomial Growth ($\alpha_1\log n < k < \alpha_2\log n$, i.e., $\alpha_1 < \alpha < \alpha_2$):

$$E[B_n^k] \sim G_1(\log n)\, \frac{p^{\rho} q^{\rho}\,(p^{-\rho} + q^{-\rho})}{\sqrt{2\pi\alpha\log(p/q)}} \cdot \frac{n^{\upsilon_1}}{\sqrt{\log n}},$$

where $G_1(x)$ is a periodic function and

$$\upsilon_1 = -\rho + \alpha\log(p^{-\rho} + q^{-\rho}), \qquad \rho = -\frac{1}{\log(p/q)}\log\left(\frac{-1 - \alpha\log q}{1 + \alpha\log p}\right).$$

Figure 2: The fluctuating part of the periodic function $G_1(x)$ for $p = 0.55, 0.65, \ldots, 0.95$; the amplitude of the fluctuations is tiny (of order $10^{-23}$ for $p = 0.55$ and of order $10^{-3}$ for $p = 0.75$).

4: Polynomial Growth/Decay ($\alpha_2\log n < k$, i.e., $\alpha_2 < \alpha$):

$$E[B_n^k] = \frac{2pq}{p^2 + q^2}\, n^{\nu_2} + O(n^{\nu_3}),$$

where $\nu_2 = 2 + \alpha\log(p^2 + q^2)$ for some $\nu_3 < \nu_2$.
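The exponents above are easy to evaluate numerically. The following sketch (mine; natural logarithms assumed) computes $\rho(\alpha)$, $\upsilon_1(\alpha)$, and $\nu_2(\alpha)$ from the closed forms just given:

from math import log

def exponents(p, alpha):
    q = 1.0 - p
    rho = -log((-1.0 - alpha * log(q)) / (1.0 + alpha * log(p))) / log(p / q)
    upsilon1 = -rho + alpha * log(p**-rho + q**-rho)
    nu2 = 2.0 + alpha * log(p**2 + q**2)        # relevant for alpha > alpha_2
    return rho, upsilon1, nu2

p = 0.75
for alpha in (1.0, 1.5, 2.0):                   # values between alpha_1 ~ 0.72 and alpha_2 ~ 2.5
    print(alpha, exponents(p, alpha))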

External Shapes

[Figure: typical external shapes of random tries for $p = 0.5$ (where $\alpha_1 = \alpha_2 = 1/\log 2$) and for $p = 0.75$, annotated with the fill-up level, the typical depth $\log n/(p\log(1/p) + q\log(1/q))$, and the height $\frac{2}{\log(1/(p^2+q^2))}\log n + O(1)$; in the symmetric case the marked levels are $\log_2 n + O(1)$ and $2\log_2 n + O(1)$.]

Average Internal Profile

1: Almost Full Tree ($k < \alpha_1\log n$):

$$E[I_n^k] = 2^k - E[B_n^k](1 + o(1)).$$

2: Phase Transition I ($\alpha_1\log n < k < \alpha_0\log n$, where $\alpha_0 = \frac{2}{\log(1/p) + \log(1/q)}$):

$$E[I_n^k] = 2^k - G_2(\log n)\, E[B_n^k](1 + o(1)),$$

where $G_2(x)$ is a periodic function.

3: Phase Transition II ($\alpha_0\log n < k < \alpha_2\log n$):

$$E[I_n^k] = G_2(\log n)\, E[B_n^k](1 + o(1)),$$

where $G_2(x)$ is a periodic function.

4: Polynomial Growth/Decay ($\alpha_2\log n < k$):

$$E[I_n^k] = \frac{1}{2}\, n^{\nu_2}(1 + o(1)),$$

where $\nu_2 = 2 + \alpha\log(p^2 + q^2)$, as for the external profile.

Variance and Limiting Distributions of the External Profile

Variance:

1: $k < \alpha_1\log n$: $\quad V[B_n^k] \sim E[B_n^k]$.

2: $\alpha_1\log n < k < \alpha_2\log n$: $\quad V[B_n^k] \sim G_3(\log n)\, E[B_n^k]$, where $G_3(\log n)$ is a periodic function.

3: $\alpha_2\log n < k$: $\quad V[B_n^k] \sim 2\, E[B_n^k]$.

Limiting Distributions:

Central Limit Theorem: For $\alpha_1\log n < k < \alpha_3\log n$:

$$\frac{B_n^k - E[B_n^k]}{\sqrt{V[B_n^k]}} \to N(0, 1),$$

where $N(0,1)$ is the standard normal distribution.

Poisson Distribution: For $\alpha_3\log n < k$:

$$P(B_n^k = 2m) = \frac{\lambda_0^m}{m!}\, e^{-\lambda_0} + o(1), \qquad P(B_n^k = 2m+1) = o(1),$$

where $\lambda_0 := pq\, n^2 (p^2 + q^2)^{k-1}$.

Consequences

Height: For large $n$ (cf. Flajolet, 1980; Pittel, 1985; W.S., 1988; Devroye, 1992)

$$H_n = \frac{2}{\log(p^2 + q^2)^{-1}}\,\log n = \alpha_3\log n := k_H \quad \text{(whp)}.$$

Upper Bound: with $k = \lceil(1+\epsilon)k_H\rceil$,

$$P\big(H_n > (1+\epsilon)k_H\big) \le P(B_n^{k} \ge 1) \le E[B_n^{k}] \to 0.$$

Lower Bound:

$$P\big(H_n < (1-\epsilon)k_H\big) \le P\big(B_n^{\lceil(1-\epsilon)k_H\rceil} = 0\big) \le \frac{V\big[B_n^{\lceil(1-\epsilon)k_H\rceil}\big]}{\big(E\big[B_n^{\lceil(1-\epsilon)k_H\rceil}\big]\big)^2} = O\left(\frac{1}{E\big[B_n^{\lceil(1-\epsilon)k_H\rceil}\big]}\right) \to 0.$$

Define: $k_S := \big\lfloor \frac{1}{\log q^{-1}}\big(\log n - \log\log\log n + \log(e\log r)\big)\big\rfloor$.

Shortest Path: For large $n$ (cf. Knessl and W.S., 2005)

$$P(s_n = k_S \ \text{or}\ s_n = k_S + 1) \to 1.$$

Fill-up: For large $n$ (cf. Pittel, 1986; Devroye, 1992; Knessl and W.S., 2005)

$$P(F_n = k_S - 1 \ \text{or}\ F_n = k_S) \to 1.$$
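As a numeric illustration (my own, based directly on the formulas above), the predicted height level $k_H$ and the shortest-path/fill-up level $k_S$ can be evaluated as follows (natural logarithms assumed):

from math import log, floor, e

def height_and_fillup_levels(n, p):
    q = 1.0 - p
    r = p / q
    k_H = 2.0 / log(1.0 / (p**2 + q**2)) * log(n)
    k_S = floor((log(n) - log(log(log(n))) + log(e * log(r))) / log(1.0 / q))
    return k_H, k_S

print(height_and_fillup_levels(10**6, 0.75))   # height ~ k_H; fill-up in {k_S - 1, k_S}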


Sketch of the Proof

1. Recurrence: $E[B_n^k] = \sum_{i=0}^{n}\binom{n}{i} p^i q^{n-i}\big(E[B_i^{k-1}] + E[B_{n-i}^{k-1}]\big)$, $n \ge 2$, $k \ge 1$.

2. Poisson Transform $E_k(z) = \sum_{n\ge 0} E[B_n^k]\frac{z^n}{n!}e^{-z}$:

$$E_k(z) = E_{k-1}(zp) + E_{k-1}(zq), \qquad k \ge 2.$$

3. Mellin Transform $E_k^*(s) := \int_0^{\infty} z^{s-1}E_k(z)\,dz = (p^{-s} + q^{-s})\,E_{k-1}^*(s)$:

$$E_k^*(s) = (p^{-s} + q^{-s})^{k-1}\cdot s\cdot(p^{-s} + q^{-s} - 1)\,\Gamma(s)$$

for $\Re(s) \in (-2, \infty)$, where $\Gamma(s)$ is the Euler gamma function.

4. Inverse Mellin Transform $E_k(z) = \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} z^{-s} E_k^*(s)\,ds$:

$$E_k(z) = \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} s\,(p^{-s} + q^{-s} - 1)\,\Gamma(s)\, z^{-s}(p^{-s} + q^{-s})^{k-1}\,ds,$$

evaluated through the saddle point method (see the next slide for more).

5. Depoissonization: From the Poisson transform $E_k(z)$ back to $E[B_n^k]$.


Saddle Point Method: Phase Transitions

By depoissonization we have $E[B_n^k] \sim E_k(n)$, where recall

$$E_k(n) = \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} g(s)\,\Gamma(s+1)\, n^{-s}(p^{-s} + q^{-s})^{k}\,ds = \frac{1}{2\pi i}\int_{c-i\infty}^{c+i\infty} g(s)\,\Gamma(s+1)\,\exp\big(h(s)\log n\big)\,ds, \qquad k = \alpha\log n,$$

where $g(s) = p^{-s} + q^{-s} - 1$ (note $g(0) = g(-1) = 0$). For $k = \alpha\log n$, the factor $n^{-s}(p^{-s} + q^{-s})^k = \exp(h(s)\log n)$ is large.

Saddle Point Method:

Let $F(z)$ be analytic with nonnegative coefficients and "fast growth". Then

$$f_n = [z^n]F(z) = \frac{1}{2\pi i}\oint_{\mathcal{C}} F(z)\,\frac{dz}{z^{n+1}} = \frac{1}{2\pi i}\oint_{\mathcal{C}} e^{H(z)}\,dz,$$

where $H(z) := \log F(z) - (n+1)\log z$.

Define the saddle point $z_0$ by $H'(z_0) = 0$, so that

$$H(z) = H(z_0) + \tfrac{1}{2}(z - z_0)^2 H''(z_0) + O\big(H'''(z_0)(z - z_0)^3\big).$$

Then

$$[z^n]F(z) = \frac{\exp(H(z_0))}{\sqrt{2\pi|H''(z_0)|}}\left(1 + O\left(\frac{H'''(z_0)}{(H''(z_0))^{3/2}}\right)\right).$$

Infinity of Saddle Points

The saddle point equation is $h'(s) = 0$, where

$$h(s) = \alpha\log(p^{-s} + q^{-s}) - s.$$

It has a unique real root

$$\rho = \frac{-1}{\log r}\log\left(\frac{\alpha\log q^{-1} - 1}{1 - \alpha\log p^{-1}}\right), \qquad \frac{1}{\log q^{-1}} < \alpha < \frac{1}{\log p^{-1}}.$$

Note that $\big|p^{-\rho - i t_j} + q^{-\rho - i t_j}\big| = p^{-\rho} + q^{-\rho}$ for $t_j = 2\pi j/\log r$, $j \in \mathbb{Z}$.
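A quick numerical sanity check (mine, not from the slides): the closed form for $\rho$ above indeed satisfies the saddle point equation $h'(\rho) = 0$.

from math import log

def rho_closed_form(p, alpha):
    q = 1.0 - p
    r = p / q
    return -log((alpha * log(1 / q) - 1) / (1 - alpha * log(1 / p))) / log(r)

def h_prime(s, p, alpha):
    q = 1.0 - p
    num = -(p**-s) * log(p) - (q**-s) * log(q)
    return alpha * num / (p**-s + q**-s) - 1.0

p, alpha = 0.75, 1.5
rho = rho_closed_form(p, alpha)
print(rho, h_prime(rho, p, alpha))   # h'(rho) should be ~ 0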

[Figure: the saddle points $\rho \pm i\,\frac{2\pi j}{\log r}$, $j = 1, 2, \ldots$, lie on the vertical line $\Re(s) = \rho$ in the complex plane.]

Phase Transitions:

1. There are infinitely many saddle points $\rho + i t_j$.

2. $\rho \to \infty$ as $\alpha \downarrow 1/\log q^{-1} = \alpha_1$.

3. $\rho \to -\infty$ as $\alpha \uparrow 1/\log p^{-1}$.

4. Saddle points coalesce with poles of the $\Gamma(s+1)$ function at $s = -2, -3, \ldots$. The pole $s = -2$ leads to $\alpha_2$.

Analytic Depoissonization

Theorem 1 (Jacquet and W.S., 1998). Let $G(z) = \sum_{n\ge 0} g_n\frac{z^n}{n!}e^{-z}$ be the Poisson transform of $g_n$. In a linear cone $S_\theta$ assume two conditions:

(I) For $z \in S_\theta$ and some reals $B, R > 0$, $\nu$:

$$|z| > R \ \Rightarrow\ |G(z)| \le B|z|^{\nu}\,\Psi(|z|),$$

where $\Psi(x)$ is a slowly varying function.

(O) For $z \notin S_\theta$ and $A, \alpha < 1$:

$$|z| > R \ \Rightarrow\ |G(z)e^{z}| \le A\exp(\alpha|z|).$$

Then

$$g_n = G(n) + O\big(n^{\nu-1}\Psi(n)\big).$$

Using the depoissonization theorem, we get $E[B_n^k] = E_k(n) + O\big(n^{\nu-1}\sqrt{\log n}\big)$.
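As a small numerical illustration of depoissonization (my own sketch, reusing the expected_external_profile routine from the earlier trie sketch, which is assumed to be in scope), one can compare $g_n = E[B_n^k]$ with its Poisson transform evaluated at $z = n$; the Poisson sum is truncated at $2n$, where the Poisson($n$) tail is negligible.

from math import exp, lgamma, log

def poisson_transform_at(g, z):
    # sum_m g[m] * z^m / m! * e^{-z}, using log-weights for numerical stability
    return sum(gm * exp(m * log(z) - lgamma(m + 1) - z) for m, gm in enumerate(g) if gm)

n, k, p = 100, 5, 0.75
E = expected_external_profile(n_max=2 * n, k_max=k, p=p)   # from the earlier sketch
g = [E[k][m] for m in range(2 * n + 1)]                    # g_m = E[B_m^k]
print(g[n], poisson_transform_at(g, n))                    # close for large n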

Outline Update

1. Tries and Suffix Trees

2. Profiles of Tries

3. Profile of Digital Search Trees

4. Applications: Error Resilient Lempel-Ziv’77

M. Drmota and W. Szpankowski,

“The Expected Profile of Digital Search Trees”,

J. Combin. Theory, Ser. A, 118, 1939-1965, 2011.

Digital Search Trees

Figure 3: A digital search tree built on eight strings s1, . . . , s8 (i.e.,

s1 = 0 . . ., s2 = 1 . . ., s3 = 01 . . ., s4 = 11 . . ., etc.) with internal and

external (squares) nodes, and its profiles.


Profile of Digital Search Trees

1. Recurrence: $E[B_{n+1}^{k+1}] = \sum_{i=0}^{n}\binom{n}{i} p^i q^{n-i}\big(E[B_i^{k}] + E[B_{n-i}^{k}]\big)$, $n \ge 2$, $k \ge 0$.

2. Poisson Transform $E_k(z) = \sum_{n\ge 0} E[B_n^k]\frac{z^n}{n!}e^{-z}$:

$$E'_{k+1}(z) + E_{k+1}(z) = E_k(zp) + E_k(zq), \qquad k \ge 2.$$

3. Mellin Transform $E_k^*(s) := \int_0^{\infty} z^{s-1}E_k(z)\,dz = -\Gamma(s)\,F_k^*(s)$:

$$F_{k+1}^*(s) - F_{k+1}^*(s-1) = (p^{-s} + q^{-s})\cdot F_k^*(s)$$

for $\Re(s) \in (-k-1, 0)$, and $F_0^*(s) = 1$.

4. The Power Series $f(x, s) = \sum_{k\ge 0} F_k^*(s)\,x^k$ becomes

$$f(x, s) = \frac{g(x, s)}{g(x, 0)}, \qquad g(x, s) = \frac{h(x, s)}{1 - x(p^{-s} + q^{-s})},$$

where $g(x, s) = 1 + x\sum_{j\ge 0} g(x, s-j)\big(p^{-s+j} + q^{-s+j}\big)$, and asymptotically

$$F_k^*(s) \sim A(s)\,(p^{-s} + q^{-s})^{k},$$

where $A(s)$ is analytic with $A(-r) = 0$ for $r = 0, 1, 2, \ldots$.

5. Inverse Mellin Transform by the saddle point method and depoissonization.

6. Asymptotically, the DST profile behaves similarly to the profile of tries.

Main Results

Theorem 2. Let $E[I_n^k]$ denote the expected internal profile in (asymmetric) digital search trees, $0 < p < q = 1 - p < 1$. The following assertions hold:

1. $\alpha_1 + \varepsilon = \frac{1}{\log\frac{1}{p}} + \varepsilon \le \frac{k}{\log n} \le \alpha_0 - \varepsilon$:

$$E[I_n^k] = 2^k - G_3\big(\log(p^k n)\big)\,\frac{(p^{-\rho_{n,k}} + q^{-\rho_{n,k}})^{k}\, n^{-\rho_{n,k}}}{\sqrt{2\pi\beta(\rho_{n,k})\,k}}\,\big(1 + O(k^{-1/2})\big),$$

where $G_3(x)$ is a non-zero periodic function with period 1.

2. $k = \alpha_0\big(\log n + \xi\sqrt{\alpha_0\beta(0)\log n}\big)$, where $\xi = o\big((\log n)^{1/6}\big)$; then

$$E[I_n^k] = 2^k\,\Phi(-\xi)\left(1 + O\left(\frac{1 + |\xi|^3}{\sqrt{\log n}}\right)\right),$$

where $\Phi$ is the standard normal distribution function.

3. $\alpha_0 + \varepsilon \le \frac{k}{\log n} \le \frac{1}{\log\frac{1}{q}} - \varepsilon = \alpha_4 - \varepsilon$:

$$E[I_n^k] = G_3\big(\log(p^k n)\big)\,\frac{(p^{-\rho_{n,k}} + q^{-\rho_{n,k}})^{k}\, n^{-\rho_{n,k}}}{\sqrt{2\pi\beta(\rho_{n,k})\,k}}\,\big(1 + O(k^{-1/2})\big),$$

where $\rho_{n,k} = \rho(k/\log n)$.

Thank You!

Outline Update

1. Tries and Suffix Trees

2. Usefulness of Tries

3. Profiles of Tries

4. Applications: Error Resilient Lempel-Ziv’77 (Suffix Trees)

S. Lonardi, M. Ward, and W. Szpankowski,

“Error Resilient LZ’77 Compression: Algorithms, Analysis, and Experiments”,

IEEE Trans. Information Theory, 53, 1799-1813, 2007.

Error Resilient LZ’77 Scheme

1. The Lempel-Ziv'77 scheme works on-line: it compresses phrases by replacing the longest prefix with a (pointer, length) reference to its copy.

2. Castelli and Lastras proved in 2004 that a single error in LZ'77 corrupts $O(n^{2/3})$ phrases, thus about $O(n^{2/3}\log n)$ symbols, where $n$ is the size of the database.

3. There are multiple copies of the longest prefix; we denote their number by $M_n$ for a database of length $n$.

4. By a judicious choice of pointers in the LZ'77 scheme, we can recover $\lfloor\log_2 M_n\rfloor$ bits without losing a bit in compression. Parity bits recovered from the multiple copies are used for Reed-Solomon channel coding.

[Figure: the history and the current position in the LZ'77 window.]
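The pointer-selection idea can be sketched in a few lines (my own toy illustration; the function names and framing are mine, not the paper's): with $M_n$ copies of the longest match available, the choice of which copy to reference encodes $\lfloor\log_2 M_n\rfloor$ extra bits, and the decoder recovers them by re-locating all copies of the decoded match.

from math import floor, log2

def choose_pointer(match_positions, hidden_bits):
    """Embed up to floor(log2(M_n)) hidden bits (e.g., RS parity) in the pointer choice."""
    m = len(match_positions)                     # this is M_n
    capacity = floor(log2(m)) if m > 1 else 0    # how many bits we can hide
    index = int(hidden_bits[:capacity] or "0", 2)
    return match_positions[index], capacity      # the chosen copy, and bits consumed

def recover_bits(chosen_position, match_positions):
    """The decoder re-finds all copies, so the index of the chosen one reveals the bits."""
    m = len(match_positions)
    capacity = floor(log2(m)) if m > 1 else 0
    index = match_positions.index(chosen_position)
    return format(index, "0{}b".format(capacity)) if capacity else ""

positions = [3, 17, 42, 58, 91]                  # M_n = 5 copies -> 2 hidden bits
ptr, used = choose_pointer(positions, "10")
print(ptr, recover_bits(ptr, positions))         # -> 42, "10"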

Analysis of Mn Via Suffix Trees

Why does LZRS’77 work so well?

Performance of LZRS’77 depends on Mn. How does Mn typically behave?

Build a suffix tree from the first $n$ suffixes of the database $X$ (i.e., $S_1 = X_1^{\infty}$, $S_2 = X_2^{\infty}$, ..., $S_n = X_n^{\infty}$). Then insert the $(n+1)$st suffix, $S_{n+1} = X_{n+1}^{\infty}$.

The depth of insertion of $S_{n+1}$ is the $(n+1)$st phrase length, and $M_n$ is the size of the subtree that starts at the insertion point of the $(n+1)$st suffix.

Figure 6: $M_4\,(= 2)$ is the size of the subtree at the insertion point of $S_5$.

Analysis of Mn for Independent Tries

1. Consider digital tries built over $n$ independent strings ($p = 1 - q$ is the probability of generating a "1"). The average $E[M_n^I]$ and the probability generating function satisfy the following recurrences:

$$E[M_n^I] = p^n\big(q\,n + p\,E[M_n^I]\big) + q^n\big(p\,n + q\,E[M_n^I]\big) + \sum_{k=1}^{n-1}\binom{n}{k} p^k q^{n-k}\big(p\,E[M_k^I] + q\,E[M_{n-k}^I]\big),$$

$$E\big[u^{M_n^I}\big] = p^n\big(q\,u^n + p\,E[u^{M_n^I}]\big) + q^n\big(p\,u^n + q\,E[u^{M_n^I}]\big) + \sum_{k=1}^{n-1}\binom{n}{k} p^k q^{n-k}\big(p\,E[u^{M_k^I}] + q\,E[u^{M_{n-k}^I}]\big).$$

[Figure: recursive decomposition of $M_n^I$ into $M_k^I$ or $M_{n-k}^I$ depending on the first bit.]

• (Analytic) Poissonization $\big(W(z) = \sum_{n\ge 0} E[M_n^I]\frac{z^n}{n!}e^{-z}\big)$:

$$W(z) = qpz\,e^{qz} + pqz\,e^{pz} + p\,W(pz) + q\,W(qz).$$

• Mellin Transform $\big(f^*(s) = \int_0^{\infty} f(x)\,x^{s-1}\,dx\big)$:

$$W^*(s) = \frac{\Gamma(s+1)\,(p\,q^{-s} + q\,p^{-s})}{1 - p^{-s+1} - q^{-s+1}}.$$

• Inverse Mellin Transform: $W(z) = 1/h + \text{fluctuations}$.

• (Analytic) Depoissonization: $E[M_n^I] = 1/h + \text{fluctuations}$.
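A Monte Carlo sketch (mine, under the independent-strings model described above, with the convention that $M_n^I = n$ when the fresh string branches off at the root) comparing the simulated mean of $M_n^I$ with the leading term $1/h$:

import random
from math import log

def longest_match_count(strings, new, depth=0):
    """Count strings sharing the longest prefix they have in common with `new`."""
    if len(strings) <= 1 or depth >= len(new):
        return len(strings) if strings else 0
    same = [s for s in strings if s[depth] == new[depth]]
    return longest_match_count(same, new, depth + 1) if same else len(strings)

def estimate(n, p, trials=500, length=32):
    bit = lambda: "1" if random.random() < p else "0"
    key = lambda: "".join(bit() for _ in range(length))
    total = 0
    for _ in range(trials):
        total += longest_match_count([key() for _ in range(n)], key())
    return total / trials

p, q = 0.6, 0.4
h = -p * log(p) - q * log(q)
print(estimate(200, p), 1.0 / h)   # the two should be reasonably close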

Analysis of Mn for Dependent Strings

2. Suffix Trees: Using analytic combinatorics on words we prove that

$$M(z, u) = \sum_{n=1}^{\infty}\sum_{k=1}^{\infty} P(M_n = k)\,u^k z^n = \sum_{w\in\mathcal{A}^*}\sum_{\alpha\in\mathcal{A}} \frac{u\,P(\alpha)\,P(w)}{D_w(z)}\cdot\frac{D_{w\alpha}(z) - (1-z)}{D_w(z) - u\big(D_{w\alpha}(z) - (1-z)\big)},$$

where $D_w(z) = (1-z)S_w(z) + z^m P(w)$ and $S_w(z)$ is the autocorrelation polynomial:

$$S_w(z) = \sum_{k\in\mathcal{P}(w)} P\big(w_{k+1}^{m}\big)\,z^{m-k};$$

$\mathcal{P}(w)$ denotes the set of positions $k$ of $w$ satisfying $w_1\ldots w_k = w_{m-k+1}\ldots w_m$.

For any $\varepsilon > 0$ there exists $\beta > 1$ such that (all the hard analytic work is here!)

$$|\Pr(M_n = k) - \Pr(M_n^I = k)| = O\big(n^{-\varepsilon}\beta^{-k}\big).$$

Random suffix trees resemble random independent tries (cf. P. Jacquet and W.S., 1994; Lonardi, W.S., and Ward, 2005; IEEE Trans. Inf. Theory, 2007).

Main Results

Theorem 4 (Ward, W.S., 2005). Let $z_k = \frac{2kr\pi i}{\ln p}$ for all $k \in \mathbb{Z}$, where $\frac{\ln p}{\ln q} = \frac{r}{s}$ for some relatively prime integers $r, s$ (i.e., $\frac{\ln p}{\ln q}$ is rational).

The $j$th factorial moment $E[(M_n)_j] = E[M_n(M_n - 1)\cdots(M_n - j + 1)]$ is

$$E[(M_n)_j] = \Gamma(j)\,\frac{q(p/q)^j + p(q/p)^j}{h} + \delta_j(\log_{1/p} n) + O(n^{-\eta}),$$

where $h = -p\log p - q\log q$ is the entropy rate, $\eta > 0$, $\Gamma$ is the Euler gamma function, and

$$\delta_j(t) = \sum_{k\ne 0} \frac{-e^{2kr\pi i t}\,\Gamma(z_k + j)\,\big(p^j q^{-z_k - j + 1} + q^j p^{-z_k - j + 1}\big)}{p^{-z_k + 1}\ln p + q^{-z_k + 1}\ln q}.$$

$\delta_j$ is a periodic function that has small magnitude and exhibits fluctuations when $\frac{\ln p}{\ln q}$ is rational.

Note: On average there are $E[M_n] \sim 1/h$ additional pointers.

Magnitude of the fluctuations, $\frac{1}{\ln 2}\sum_{k\ne 0}\big|\Gamma\big(j - \frac{2ki\pi}{\ln 2}\big)\big|$, for selected $j$:

j     (1/ln 2) sum_{k != 0} |Gamma(j - 2 k i pi / ln 2)|
1     1.4260 x 10^{-5}
3     1.2072 x 10^{-3}
5     1.1421 x 10^{-1}
6     1.1823 x 10^{0}
8     1.4721 x 10^{2}
9     1.7798 x 10^{3}
10    2.2737 x 10^{4}

Distribution of Mn

Theorem 5 (Ward, W.S., 2005). Let $z_k = \frac{2kr\pi i}{\ln p}$ for all $k \in \mathbb{Z}$, where $\frac{\ln p}{\ln q} = \frac{r}{s}$. Then

$$P(M_n = j) = \frac{p^j q + q^j p}{jh} + \sum_{k\ne 0} \frac{-e^{2kr\pi i\log_{1/p} n}\,\Gamma(z_k)\,(p^j q + q^j p)\,(z_k)_j}{j!\,\big(p^{-z_k+1}\ln p + q^{-z_k+1}\ln q\big)} + O(n^{-\eta}),$$

where $\eta > 0$ and $\Gamma$ is the Euler gamma function.

Therefore, $M_n$ follows the logarithmic series distribution with mean $1/h$ (plus some fluctuations).

The logarithmic series distribution $\big((p^j q + q^j p)/(jh)\big)$ is well concentrated around its mean $E[M_n] \approx 1/h$.

[Figure: plot of the logarithmic series distribution.]
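A small numeric check (mine, not from the talk) of the leading term: the mass of $(p^j q + q^j p)/(jh)$ sums to 1 and its mean is $1/h$, consistent with $E[M_n] \approx 1/h$.

from math import log

def leading_pmf(p, j_max=60):
    q = 1.0 - p
    h = -p * log(p) - q * log(q)
    pmf = [(p**j * q + q**j * p) / (j * h) for j in range(1, j_max + 1)]
    return pmf, h

pmf, h = leading_pmf(0.6)
mean = sum((j + 1) * w for j, w in enumerate(pmf))
print(sum(pmf), mean, 1.0 / h)   # total mass ~ 1, mean ~ 1/h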

Analytic Information Theory and Analysis of Algorithms

• In the 1997 Shannon Lecture Jacob Ziv presented compelling

arguments for “backing off” from first-order asymptotics in order to

predict the behavior of real systems with finite length description.

• To overcome these difficulties we propose replacing first-order analyses

by full asymptotic expansions and more accurate analyses (e.g., large

deviations, central limit laws).

• Following Hadamard’s precept1, we study information theory problems

using techniques of complex analysis such as generating functions,

combinatorial calculus, Rice’s formula, Mellin transform, Fourier series,

sequences distributed modulo 1, saddle point methods, analytic

poissonization and depoissonization, and singularity analysis.

• This program, which applies complex-analytic tools of analysis of

algorithms to information theory, constitutes analytic information

theory.

1The shortest path between two truths on the real line passes through the complex plane.

Thank You!
