Download pdf - Analytic Samplers and the Combinatorial Rejection Method

Analytic Samplers and the Combinatorial Rejection Method

Olivier Bodini∗ Jérémie Lumbroso† Nicolas Rolin˚

AbstractBoltzmann samplers, introduced by Duchon et al. in2001, make it possible to uniformly draw approximatesize objects from any class which can be specifiedthrough the symbolic method. This, through by evalu-ating the associated generating functions to obtain thecorrect branching probabilities.

But these samplers require generating functions, inparticular in the neighborhood of their sunglarity, whichis a complex problem; they also require picking an ap-propriate tuning value to best control the size of gen-erated objects. Although Pivoteau et al.have brought asweeping question to the first question, with the intro-duction of their Newton oracle, questions remain.

By adapting the rejection method, a classical toolfrom the random, we show how to obtain a variant ofthe Boltzmann sampler framework, which is tolerant ofapproximation, even large ones. Our goal for this istwofold: this allows for exact sampling with approxi-mate values; but this also allows much more flexibilityin tuning samplers. For the class of simple trees, wewill show how this could be used to more easily cali-brate samplers.

IntroductionBeing able to randomly generate large objects of anygiven combinatorial class (for instance described by agrammar), is a fundamental problem with countlessapplications in scientific modeling.

Nijenhuis and Wilf introduced the recursivemethod [16] in the late 70s (later extended by Fla-jolet et al. [12]), the first automatic random genera-tion method; so termed automatic because it can di-rectly derive random samplers from any combinatorialdescription—no bijection, no clever algorithm, no com-plicated equations are needed. The drawback is thatthis method is costly, notably in preprocessing: to com-pute the probabilities involved in generating an object ofsize n, the method requires knowing the complete enu-

∗LIPN, UMR 7030, Université Paris 13/CNRS, F-93430 Vil-letaneuse, France.†Department of Computer Science, Princeton University,

35 Olden Street, Princeton, NJ 08540, USA.

meration of the combinatorial class up to size n; andpredictably when n is large, this enumeration is signifi-cant both to calculate and to store.

Enter Boltzmann sampling, introduced by Duchonet al. in 2002 [7, 8], of which the key insight was that aclass’ enumeration is not required to compute the cor-rect branching probabilities: instead, such probabilitiescan be obtained by evaluating the counting generatingfunctions—for an unlabelled combinatorial class C, forwhich there are cn elements of size n, its counting gen-erating function is defined as

Cpzq :“8ÿ

n“1

cnzn.

Through evaluation, all the coefficients of a generatingfunction are smashed together, and the resulting prob-abilities take into account objects of all sizes. Thus,while you do know that the object returned will be uni-formly sampled among objects of the same size, the sizeitself is a random variable—which you have no directcontrol over. As a result, a significant aspect of Boltz-mann sampling involves: rejecting objects which are notwithin the desired size interval; manipulating the gener-ating functions so the size distribution is such that nottoo many objects need be rejected.

The efficiency of this approach, combined withits mathematical appeal—in many regards Boltzmannsampling is an elegant and natural application ofAnalytic Combinatorics pioneered by Flajolet andSedgewick [11]—have made it a fertile topic, and manyof its aspects have been developed through a broad num-ber of papers.

The Boltzmann model. A Boltzmann samplerfor an unlabelled combinatorial class C (of which thereare cn elements of size n), is an algorithm that drawsany given object γ P C with probability

Pzrγs “z|γ|

Cpzqwith Cpzq :“

8ÿ

n“1

cnzn “

ÿ

γPCz|γ|

where |γ| denotes the size of object γ and z is somecontrol parameter to be chosen. Thus the probability of

arX

iv:1

304.

1881

v3 [

cs.D

M]

13

Nov

201

4

0.0 0.1 0.2 0.3 0.4 0.50

2

4

6

8

10

Figure 1. Plot associated with the combinatorial specification of binary trees where all nodes are counted,B “ Z ` Z ˆ B2. The bottom thick black curve plots Cpzq “ z ` zCpzq2, or all coordinates pz, Cpzqq usuallyconsidered for Boltzmann sampling; the shaded area is the region verifying c ě z ` zc2, and from which we getcoordinates pz, cq which we use in our modified model.

obtaining an object of size n is

Pzr|γ| “ ns “cnz

n

CpzqPsrγ | |γ| “ ns “

1

cn

while the probability of drawing an object conditionedon its size is uniform.

The name of the method is an analogy to theBoltzmann model of statistical physics that assigns toeach possible state of a system the probability e´βEZ,where E is the energy of the state, β “ 1T is aconstant, and Z is normalizing constant—the originalauthors noted that this was similar to the probabilitydistribution of objects. But in truth, the distributionof the sizes of objects is a very generic distributionalready known to probabilists as the Power SeriesDistribution1, and according to Johnson et al. [13, §2.2],the terminology is usually credited to Noack [17] around1950.

Evaluating GFs near their singularity. Boltz-mann samplers depend on the evaluation2 of generatingfunction in the neighborhood of their singularity3—andit was assumed that this could be done in constant timearithmetic complexity.

The problem of how these functions should be eval-uated was left open by the original paper, and re-

1The Poisson, geometric, log-series distributions are all specialcases of this distributions, a fact which is put to use by Flajo-let et al. [10, §2] who designed the Von Neumann/Flajolet schemeto simulate power series distributions using only random bits.

2A feature of Boltzmann samplers is that they generally require(see Otter trees in Section 3 for an exception) a constant numberof such evaluations which can be charged a preprocessing; and thisnumber of evaluation is dependent on the size of the combinatorialsystem, rather than on the size of the objects to be generated.

3In keeping with the usage of analytic combinatorics, we callsingularity, the smallest positive point at which a generatingfunction is not defined.

mained without answer until the contribution of Piv-oteau et al. [19, 20]. They introduced a variant of New-ton’s iteration for combinatorial systems, which has ahighly efficient quadratic convergence (and what’s more,is provably convergent for any specifiable combinatorialclass). However some aspects have remained open:

(a) Solving the evaluation problem does not entirelysolve the issue of tuning the samplers—that is,picking the value of z, which will yield the bestconcentration of objects of the targeted size. Cur-rently, expected value tuning requires inverting asystem of equations, and to our knowledge thisis not routinely done for large combinatorial sys-tems. For certain combinatorial classes (algebraicclasses), singular samplers are tuned by approach-ing the singularity as close as possible using a bi-nary search, requiring making a logarithmic num-ber of calls to the oracle, as shown by Darrasse [5].

(b) As a related issue, it seems legitimate to ask: isevaluating the generating functions truly necessary,since these evaluations are in fine used to computeprobabilities of average? and can relaxing thisrequirement possibly lead to more simple (or moreefficient) implementations?

(c) Finally, as a point of minor practical concern,but of conceptual interest: because of the finitenature of computers, and although the oracle canprovide arbitrary precise values, the evaluationof generating functions is in practice restrictedto fixed precision approximations. It has beenargued that the incurred bias in uniformity isminimal: is it possible to make exact simulationsfrom approximate values?

Our contribution: an extended framework. Thispresent paper attempts to investigate some of the afore-mentioned questions. Our novel idea appeals to theclassical random generation concept of rejection, as de-scribed for instance in Devroye’s chapter on the rejec-tion method [6, §2]. Instead of evaluating the generat-ing functions exactly, we pick some nearby point that iseasier to compute.

This is illustrated in Figure 1. Both Boltzmannsamplers and our samplers use coordinates from theshaded region. But while Boltzmann samplers limitthemselves to the coordinates that belong to the thickblack curve at the bottom of the region, we allowourselves to pick any point within the region. Of course,this introduces a bias, which needs to be compensatedwith some additional rejection. But we show that thisrejection is constant and that for reasonable choices ofcoordinates it is practically negligible.

In this paper, we introduce this idea in Section 1,and showcase how it may be used through some illus-trating examples: in Section 2, we show our samplerscan enable using alternate techniques to determine thetuning parameter; in Section 3, we use the example ofOtter trees (non plane binary trees) to show how wecan circumvent having to make a non-constant numberof evaluations of the generating function.

The ideas presented here followed from the firstauthor’s work to extend Boltzmann samplers to infiniteobjects [2], and the second author’s attempts to modifygenerating functions to shape the size distribution ofsampled objects.

Limits of this first version. This preliminarywork comes with a set of restrictions: because none ofthe authors are presently familiar with the extensivelitterature on multidimensional optimization, we haveavoided describing how to apply this idea to combinato-rial systems (or multitype definitions), focusing insteadon combinatorial classes which can be described in asingle equation4. This is not because the idea of rejec-tion cannot be trivially extended to systems, but ratherbecause we felt this strengthened our exposition, whilethe added complexity (in notations, etc.) of describingsystems could not be justified by our current findings.

1 Analytic SamplersIn this section, we provide the main definitions forour analytic random samplers, and then show thealgorithms associated with the basic constructions.

We have used the name Analytic Samplers to dis-tinguish our contribution in this paper from Boltzmannsamplers, but it should be noted that the latter could

4This was also the case of the original Boltzmann paper.

legitimately be termed Analytic Samplers.

1.1 Main definitions

Definition 1.1. Let A be an unlabelled combinatorialclass, and an the number of objects from A that havesize n. The ordinary generating function (OGF) asso-ciated with class A is defined equivalently by

Apzq :“8ÿ

n“0

anzn or Apzq :“

ÿ

αPAz|α|.

The ordinary generating function enumerates combina-torial class it is associated to. The tenet of the sym-bolic method [11] is that if a combinatorial class canbe symbolically specified using a set of operators (dis-joint union, Cartesian product, sequence, multiset, etc.)from initial terminal symbols called atoms which haveunit size, then this specification can be directly trans-lated to obtain the ordinary generating function.

Definition 1.2. Let A be a symbolically defined com-binatorial class which can be translated, following thesymbolic method, to a functional equation on Apzq, thegenerating function associated with A,

A “ ΦpZ,A,Xq ñ Apzq “ φpz,Apzq,Xpzqq,

where both Φ and φ may possibly involve otherclasses/generating functions which we note using vec-tors in bold (and each symbol/generating function com-ponent of the vector itself defined by their own equa-tions).

Definition 1.3. Given a combinatorial class A asgiven in Definition 1.2, a pair of coordinates pz, aq issaid to be analytically valid coordinates for the combi-natorial class A if and only if they verify the inequality

a ě φpz, a,xq.

Remark. It is true that we could have some strongerbound, for instance, a ě Apzq. But this is not desirablebecause the bound involves the generating function, theevaluation of which we are trying to avoid.

Indeed, a subtle remark is that while computingthe right-hand side of the inequality is not necessary torun the analytic samplers, it is necessary to make theinitial calibration. If this right-hand side depended onthe generating function, we would be requiring strictlythe same amount of work as traditional Boltzmannsamplers—if not more!

In general, for convenience and clarity, we will omitthe vector in the notations, and any additional boundsymbols will be implicit.

Definition 1.4. An analytic sampler for an unlabelledcombinatorial class A is an algorithm which samples anobject α P A, of size |α|, with probability

Ppz,aqrαs “z|α|

a

and fails with probability

Ppz,aqr=s “ 1Ápzq

a

where Apzq is the ordinary generating function associ-ated with class A, and the analytically valid coordinatespz, aq are called the control parameter. Moreover wedenote by ΓApz, aq such an analytic sampler.

Because the original Boltzmann samplers alreadyused the concept of rejection to control the size ofthe output and constrain it to a tolerance interval, wechoose instead to call our additional rejection, failure,to avoid confusion.

Theorem 1.1. Let A be a combinatorial class and Apzqits generating function, and let pz, aq be analyticallyvalid coordinates for A. The proportion of objects forwhich the generation has failed, does not depend on thesize of the successfully generated object, and is equal to1Ápzqa.

Proof. This follows from the definition of the model ofAnalytic Samplers wherein the probability of a singledraw failing is constant—in the sense that it doesnot depend on the size of the object that was beingconstructed when the sampling failed—and equal to

Ppz,aqr=s “ 1Ápzq

a,

and thus an object (of some random size) is drawn withthe complementary probability. The number of failuresbefore an actual object is drawn is then geometricallydistributed with p “ 1Ápzqa. We then have:

Epz,aqr#=s “8ÿ

k“0

k

ˆ

1Ápzq

a

˙kApzq

a

“Apzq

a

´

1´ Apzqa

¯

´

Apzqa

¯2

“a

Apzq

ˆ

1Ápzq

a

˙

“a

Apzq´ 1

and because there is one last object generated (theone that does not fail, and after which we are done)the expected proportion of objects which have failed isEpz,aqr#=s pEpz,aqr#=s ` 1q as stated.

Note that the lower bound for a is a “ Apzq, and forthis choice of value, the analytic sampler does not fail:the proportion of failed objects is 0%, and we revert tothe case of Boltzmann samplers.

Indeed the inequality can naturally be seen as anequality involving a slack variable δ, a “ φpz, a,xq ` δ.In essence, if you are willing to spend the computationaltime needed to compute the generating function, thenyou are rewarded for your efforts by having no rejectionat all.

1.2 Elementary constructions. In this subsection,we give the basic constructions used by our analyticsamplers. We follow the notation of the original arti-cle [8], and extend it to include our failure probability,

ΓA : rp1s ¨ Berpp2q ñ X | Y

means that we first fail with probability 1 ´ p1, thenwe draw a Bernoulli variable U of parameter p2: if itis equal to U “ 1 then we return X, otherwise wereturn Y . Furthermore, when instead of a Bernoullidistribution, we have some variable K drawn accordingto a discrete distrbution, we mean that we return a tupleof K independent calls to the specified sampler.

Let A, B and C be combinatorial classes. Werecall, for clarity, that in this article we note Prαs theprobability of drawing an object α; when we want tomake explicit from which class this object is drawn, wenote Prα P As.

Disjoint union. Let A “ B ` C, and a ě b ` c.We first reject 1 ´ pb ` cqa of the objects, then we doa normal Boltzmann.

ΓA :

„

b` c

a

¨ Ber

ˆ

b

b` c

˙

ñ ΓB | ΓC

Proof. We must show that the sampler ΓA returnsobjects α P A with the correct probability Ppz,aqrαs “z|α|a (that is the probability of drawing an object fromA follows the law given in Definition 1.4), assuminginductively that the generators ΓB and ΓC are correct.

Hence:

Ppz,aqrα P As “b` c

a

ˆ

b

b` cPpz,bqrα P Bs

`c

b` cPpz,cqrα P Cs

˙

that is, the probability of sampling an object from A isthe probability of first not failing, pb ` cqa, and thenthe probability of drawing the object using the samplerfor class B or C with the correct Bernoulli probability.By hypothesis those two samplers return objects with

correct probabilities, so

Ppz,aqrα P As “b` c

a

ˆ

b

b` c

z|α|

b`

c

b` c

z|α|

c

˙

“z|α|

a.

Note that in this proof, and the following, we do notexplicitly prove the probability of failure as it is astraightforward consequence: sum the probability ofdrawing an object over all possible possible objects, andtake the complimentary probability.

Cartesian product. Let A “ Bˆ C, and a ě b ¨ c.

ΓA :

„

b ¨ c

a

ñ pΓB ; ΓCq

Proof. The proof follows the same model as the previousconstruction; let α “ pβ, γq,

Ppz,aqrα P As “b ¨ c

aPpz,bqrβ P BsPpz,cqrγ P Cs

and since the samplers for ΓB and ΓC are inductivelyassumed to be correct,

Ppz,aqrα P As “b ¨ c

a

z|β|

b

z|γ|

c“z|β|`|γ|

a“z|α|

a.

Example. At this point we will illustrate the initialdefinitions and these constructors by looking at the classB of binary trees, in which all nodes both internal andexternal count towards the size of the tree. These treescan be symbolically specified as either a leaf (Z) or anode which has two subtrees (Z ˆ B2),

B “ Z ` Z ˆ B2 ñ Bpzq “ z ` zBpzq2.(1.1)

This functional equation can then be translated to aninequality,

b ě z ` zb2.

The analytically valid coordinates for B are all pointsthat belong to the shaded region in Figure 1. Thecorresponding analytic sampler is:

ΓB :

„

z ` zb2

b

¨ Ber

ˆ

z

z ` zb2

˙

ñ | pΓB ; ΓBq

where is a leaf of unit weight.Sequence. Let A “ Seq pBq and a ě 1p1´ bq.

ΓA :

„

p1´ bq´1

a

¨Geopbq ñ pΓB, . . .q

Proof. We follow again the same model as previously.Let α “ pβ1, . . . , βkq.

Ppz,aqrα P As “p1´ bq´1

aPrGeopbq “ ks

kź

i“1

Ppz,bqrβi P Bs

and by hypothesis

Ppz,aqrα P As “p1´ bq´1

abkp1´ bq

kź

i“1

z|βi|

b

“p1´ bq´1

abkp1´ bq

z|β1|`...`|βk|

bk

“z|β1|`...`|βk|

a“z|α|

a.

1.3 Illustration of the rate of failure with Cay-ley Trees. For the purpose of giving an example thatis somewhat more interesting than binary trees, we goslightly beyond the scope of unlabelled constructionswhich we have presented thus far, and present a labelledclass which uses the Set operator. The point of thissubsection is to illustrate, with an example, how mov-ing away from the curve of a generating function impactsthe rate of failure—that is, the rejection which must bedone to compensate for sampling bias introduced by theapproximation.

Consider the example given by the class T of Cay-ley trees (labelled, unrestricted, non-plane trees), sym-bolically specified as T “ Z ‹ Set pTq. The exponen-tial generating function of this class, T pzq “ zeT pzq, isclosely related to Lambert’s W -function, which is im-plicitly defined. Actual standalone5 implementations ofBoltzmann samplers requiring this function have, for in-stance, have resorted to using its truncated Taylor seriesexpansion, see Bassino et al. [1].

With analytic samplers, our starting point is thesystem of functional equations yielded by the symbolicmethod (here there is only a single equation), replaceany occurrence of a function by a free variable, andobtain the inequality t ě z ¨ exp ptq and the algorithm

ΓTpz, tq :

„

z exp ptq

t

¨ Poiptq ñ ˝pΓTpz, tq, . . . ,ΓTpz, tqq

in other words, after an initial rejection (what we callfailure) with probability z ¨ exp ptqt to account for theapproximation of the generating function, we draw a

5By standalone, we are referring to Boltzmann samplers notimplemented within a computer algebra system, such as Mapleor Mathematica, which usually provide computational access tosuch functions.

0 0.1 0.2 0.3

0.0

0.5

1.0

1.5

2.0

2.5

3.0

0.350 0.355 0.360 0.3650.90

0.92

0.94

0.96

0.98

1.00

1.02

Figure 2. This is a plot of the region defined by the inequality t ě z ¨ exp ptq, with z on the x-axis and t on they-axis. The lower bound of the region, in bolded-red, is the curve of the exponential generating function T pzq.Each point corresponds to one of the columns of Table 1. The second figure, on the right, is a close-up near thesingularity, at z “ e´1.

Poisson random variable of parameter t, to indicate howmany children to generate.

It is straightforward enough to see that this algo-rithm is correct for any pair pz, tq which satisfies theaforementioned inequality. Notice also the failure ra-tio can easily be simulated exactly using techniques de-scribed by Flajolet et al. [10].

An experiment summarized in Table 1 illustratesthat the impact of approximation is modest. Forvarious pairs of pz, tq, the table summarizes the resultof making 1000 calls to the sampler: it indicates inwhat proportion the sampling failed prematurely; andmakes note of the average and maximal size among thetrees actually drawn. The case where z “ e´1 andt “ 1 “ T pzq is special: first because this is the onlycase in which t is exactly equal to the evaluated EGF(thus we have 0% failure and our analytic sampler is atraditional Boltzmann sampler); second because sincewe are evaluating the EGF in its singularity, this isactual a singular Boltzmann sampler (for which theexpected value of the size of the output in unbounded).All other points, as illustrated in Figure 2 are more orless distant to the plot of T pzq, with a consequentlyhigher failure rate: but even at relatively significantdistance from the curve, the failure rate remains largelytolerable.

2 Simply generated treesWe now would like to illustrate how dealing with aregion (and inequality) might make searching for anoptimal pair of values for the sampler easier. It thissection, we show how we can search for the best valueof the tuning parameter z (which happens to be in thevicinity of the singularity) for a family of combinatorialclasses, without evaluating the generating function a

logarithmic number of times.Simply generated trees were introduced by Meir and

Moon [15] as classes of trees defined by the followingspecification

Y “ Z ˆ ΦpYq(2.2)

where Φ is a polynomial defined as

Φpxq “ÿ

ωPΩ

xω Φpxq “ÿ

ωPΩ

x|ω|

|ω|!(2.3)

respectively depending on whether the class is unla-belled or labelled, and where Ω Ď N is the multi-set of allowable degrees (for instance, for binary trees,Ω “ t0, 2u). Meir and Moon identified that trees fami-lies defined in such a way shared an important numberof common properties (such as mean path length of or-der n

?n or average height of order

?n).

2.1 Existing approaches. Randomly samplingfrom this class of tree is no longer particularly challeng-ing: there are several methods to do this, with variousproperties of optimality (time, random-bit, etc.). Sowe do not presume to introduce samplers with anysort of new efficiency. However the example of simplygenerated trees illustrates a way in which calibrationmight be more practical with analytic samplers.

Simply generated trees happen to have a branchingsingularity. This means: that their generating functioncan be evaluated at the singularity, and also that thesize distribution of objects produced by a Boltzmannsampler would be ‘peaked’, that is, highly concentratedtowards smaller objects. The solution has traditionallybeen to do singular sampling: to pick z as being at, ornear, the singularity, generate objects with unboundedexpected size, and reject those that are too big.

t “ 1 t “ 0.98

z “ 0.35 0.36 0.367 0.3678 0.36787 0.367879 e´1 0.367

failure (observed) 28.8% 19.2% 6.4% 1.7% 0.4% 0.3% 0% 3.5%failure (theoretical) 28.3% 19.4% 6.8% 2.1% 0.7% 0.2% 0% 3.9%average size 6.6 9.9 28.8 127. 177.3 2716.7 4944.3 35.9maximal size 235 131 1493 17 799 26 531 826 167 2 518 975 1563

Table 1. This table summarizes the result of making 1000 calls to an analytic sampler for Cayley trees, withvarious values of z (the control parameter) and t (the approximation of the generating function). The failure isthe ratio of trees that must be rejected as a direct result of approximating the generating function, instead ofevaluating it. Thus for the pair of values z “ e´1 and t “ 1 “ T pzq, in which we use the exact value of T pzq, oursamplers are exactly equivalent to Boltzmann samplers, hence the failure is of 0%. What is remarkable is thatthe failure resulting from rather large approximations remains manageable.

0.0 0.2 0.4 0.6 0.8 1.0

0

5

10

15

0 5 10 15

0.0

0.2

0.4

0.6

0.8

1.0

Figure 3. This is the plot of a the generating function of a simply generated tree (in this case, the class U ofunary-binary trees, i.e., Ω “ t0, 1, 2u). On the left, the x-axis is z, the control parameter and the y-axis hasUpzq “ zΦpUpzqq. On the right, we are plotting the function u ÞÑ uΦpuq. The problem of looking for thesingularity, left, has been reduced to the more palatable problem of maximizing a function, right.

Except in simple cases (such as binary trees, forwhich the singularity is well known to be 14), thesingularity is not known, so it must be determinedempirically. This is usually done with a binary search, asimplemented by Darrasse [5]: the oracle introduced byPivoteau et al. [20] converges when inside the radius ofconvergence, and diverges otherwise; thus it is possibleto detect whether we have gone beyond the singularity.This method requires a logarithmic number of calls tothe oracle—a logarithmic number of evaluations thatare not done in constant time.

2.2 New approach: maximize a polynomial.From the specification in Equation 2.2, we obtain thecondition for analytic-validity of a pair pz, yq,

y ě z ¨ Φpyq.

With this, it is now easier, instead of looking at thegenerating function Y pzq, to look instead at y ÞÑ

yΦpyq, which is a rational function. This functionadmits a maximal point in the unit interval, which isthe singular point of Y pzq.

Looking for this maximal point is a considerablyeasier problem, that does not require any evaluation ofthe generating function (except perhaps for an initialguess): it can be solved by differentiation, by Newton it-eration, or with specifically optimized algorithms avail-able in the litterature, such as Brent’s algorithm [4].

3 Substitution operatorWhile we have only described, for space and pertinencepurposes, how to build analytic samplers for classes us-ing elementary constructors, the possibilities are muchbroader. In particular, functional operators, such aspointing (differentiation) or substituting (composition)can naturally be used.

We will not go into detail, but instead provide theexample of the unordered pair, MSet2, and present anapplication with the random sampling of Otter trees.

3.1 Unordered pair construction Let B be a com-binatorial class, and A “ MSet2 pBq be the class con-taining unordered pairs of elements of B. The corre-sponding generating functions Apzq and Bpzq verify thefunctional equation

Apzq “Bpzq2 `Bpz2q

2.

Assuming there is an analytic sampler for B, we canbuild an analytic sampler for A. Let pz, bq and pz2, bqboth be analytically valid for B (note that the variablez must be the same in both pairs), and let pz, aq be

analytically valid for A, that is

a ěb2 ` b

2.

Using the notation we have introduced,

ΓApz, aq :

„

b2 ` b

2a

¨ Ber

ˆ

b2

b2 ` b

˙

ñ

tΓBpz, bq ; ΓBpz, bqu |

ΓBpz2, bq and duplicate.

In other terms, after making the obligatory failure test,we choose with the proper probability whether to createa pair of elements resulting from independent callsto ΓBpz, bq, or whether to make one call to ΓBpz2, bqand duplicating the resulting object to make a pair ofidentical objects.

Proof. As before, proving the validity of this algorithminvolves showing that the analytic sampler ΓApz, aqreturns any object α P A with probability z|α|a. Wedistinguish two disjoint cases.

• Either the pair α “ tβ1;β2u contains two distinctelements, β1 “ β2. Then this pair could only havebeen produced by two independent (and distin-guished) calls to ΓBpz, bq. Thus under this setting,

Pz,arα | β1 “ β2s “b2 ` b

2a¨

b2

b2 ` b¨

pPz,brβ1s ¨ Pz,brβ2s

`Pz,brβ2s ¨ Pz,brβ1sq .

By hypothesis, ΓBpz, bq is an analytic sampler forclass B, which means it returns an object β P Bwith probability z|β|b,

Pz,arα | β1 “ β2s “b2 ` b

2a¨

b2

b2 ` b

2z|β1|`|β2|

b2

“z|β1|`|β2|

a“z|α|

a.

• Or the pair α “ tβ;βu contains two identicalobjects. The pair could then have been drawnby either branch: from two independent calls toΓBpz, bq which happen to return the same object;or from the call to ΓBpz2, bq which is duplicated.In this case,

Pz,arα | β “ βs “b2 ` b

2a¨

ˆ

b2

b2 ` b¨ Pz,brβs2

`b2

b2 ` b¨ Pz2,brβs

˙

.

Assuming the analytic sampler for B is correct,

Pz,arα | β “ βs “b2 ` b

2a¨

˜

b2

b2 ` b¨

ˆ

z|β|

b

˙2

`b2

b2 ` b¨pz2q|β|

b

˙

which finally yields

Pz,arα | β “ βs “z2|β|

a“z|α|

a.

3.2 Otter Trees We’ve already thoroughly discussedthe class B of binary trees. These binary trees areplane, in the sense that there the children of an internalnode are distinguished: there is a left node and a rightnode. We now consider the class V of Otter tree, whichare binary trees that are non plane, using the MSet2

operator introduced in the previous subsection,

V “ Z `MSet2 pVq .

The generating function V pzq for Otter trees satisfiesthe functional equation

V pzq “ z `V pzq2 ` V pz2q

2,

and note that, for this class, we only count exter-nal nodes. This combinatorial class does not have aclosed form generating function: prior Boltzmann sam-plers for Otter trees have already informally used ap-proximations [9, §5]; Pivoteau [18] used the fact thatV pzq “ 1´

a

1´ 2z ´ V pz2q. In practice these approxi-mations yield correct simulations, but theoretically theycould introduce a bias. With analytic samplers, thispossible bias is corrected by failing with some proba-bility; this also gives us more flexibility to choose theapproximations.

Setting up the inequality. For our analyticalsamplers, we need the values vris, corresponding toV pz2i

q, which are defined recursively by the system ofinequalities

@i P N`, vris ě z2i

`vris

2 ` vri`1s

2.(3.4)

Because this system is infinite, we are first going to picka threshold index after which the equations will be ap-proximated; and we will determine a good approxima-tion for the remaining terms.

In order to find solutions, we need an initial intervalfor z, which need not be especially precise: to this end,

0 ď z ď 1 suffices (even though it is simple enough toargue that 14 ă z ă 12). The constant part of thisrecursive inequation is z2i , thus it makes sense to letvris “ Kz2i , which we can then inject in our inequation.Dividing both sides by z2i and factoring, we obtain

K ě 1`Kz2i

2pK ` 1q .(3.5)

Choosing parameters. At this point we now havetwo parameters to pick. First we have to find a constantK satisfying Inequation (3.5); K can be as small as wewant, as long as K ą 1.

Once we have picked a threshold i0, and the con-stant K which will approximate terms vris for i ą i0, wecan exactly compute the initial terms. This is done bysolving exactly the quadratic equations,

1

2vris

2 ´ vris `

ˆ

z2i

`1

2vri`1s

˙

“ 0

going backwards from i0 ´ 1 to 0, and with, as we said,the remaining terms oris “ Kz2i .

The approximations we have taken here will impactthe failure rate, and we can decrease it by taking any ofthe following measures: we can pick a higher thresholdi0; we can pick a z that is closer to the singularity; wecan use more than the constant part of the equation inthe step where we reject z2i`1 to approximate the termsbeyond the threshold.

This leads to an efficient sampler for Otter trees, ofwhich we have drawn a very large tree in Figure 4. Con-sider that this allows for interesting empirical analysesof these trees.

4 ConclusionIn this paper, we have proposed to integrate the classicalidea of rejection sampling to the Boltzmann samplermodel, therefore relaxing the condition that generatingfunctions must be evaluated exactly.

The resulting model, which we call analytic sam-plers, is fully compatible with all prior approaches usedin Boltzmann samplers (in particular, these samplerscan work well with Pivoteau et al.’s oracle), and in factprovides sound theoretical ground by which to allow theroutine approximations that have been made in existingBoltzmann samplers.

But beyond that, we also believe the relaxed prop-erties can allow for possible improvements and simplifi-cations in the way the samplers target the size of theiroutput. To illustrate these ideas, we show two typesof applications. First, with the example of simply gen-erated trees, we illustrate how tuning can be done inan alternate way, by using the added degree of freedom

of exploring points in a region instead of a curve. Sec-ond, we show for Otter trees, that our samplers allowfor much larger approximation to be made with littleside-effects.

Some details involve how to simulate the probabil-ities without resorting to arbitrary precision: severalanswers exist, for instance in the form of Buffon ma-chine as introduced by Flajolet et al. [10] or, because weare not restricted to fixed curve, selecting only rationalprobabilities, which can be easily simulated exactly, asshown by Lumbroso [14].

The open question is to determine whether theseproperties can be leveraged for large combinatorialsystems: indeed, the initial Boltzmann paper was onlyillustrated by combinatorial classes defined as one or ahandful of equations. The real impressive strength ofthe oracle provided by Pivoteau et al. [20] was to beable to handle combinatorial systems with thousandsof equations. It remains to be seen if, when dealingwith much more complex polytopes, it is possible to usesimple refinements of the ideas we have shown for simplygenerated trees. Finally, an topic which has not yetreached practical maturity is that of multidimensionalcombinatorial classes (where the distribution is notuniform, but biased according to some combinatorialparameter). The article of Bodini and Ponty [3] hashighlighted some issues, which we believe our presentframework might help bypass.

AcknowledgmentsWe would like to acknowledge and thank the consider-able feedback we have received from anonymous refereesboth on a prior draft of this work, and on this currentsubmission at ANALCO 2015.

This work was financed in part by the French ANRMagnum. We would also like to thank the LIA Lircoand J.-C. Aval and J.-F. Marckert, as well as the ANR

QuasiCool for travel funds which have enabled thesecond author to give talks on this topic.

References

[1] Frédérique Bassino, Julien David, and Cyril Nicaud.Regal: A library to randomly and exhaustively gener-ate automata. In Jan Holub and Jan Žďárek, editors,Implementation and Application of Automata, volume4783 of Lecture Notes in Computer Science, pages 303–305. Springer Berlin Heidelberg, 2007.

[2] Olivier Bodini, Guillaume Moroz, and Hanane Tafat-Bouzid. Infinite Boltzmann samplers and applicationsto branching processes. In Proceedings of GASCOM2012, pages 1–7, June 2012.

[3] Olivier Bodini and Yann Ponty. Multi-dimensionalBoltzmann sampling of languages. DMTCS Proceed-ings, 0(01):49–64, 2010.

[4] Richard P. Brent. Algorithms for minimization withoutderivatives. Courier Dover Publications, 1973.

[5] Alexis Darrasse. Structures arborescentes complexes:analyse combinatoire, génération aléatoire et applica-tions. PhD thesis, Paris 6, 2010.

[6] Luc Devroye. Non-Uniform Random Variate Genera-tion. Springer Verlag, 1986.

[7] Philippe Duchon, Philippe Flajolet, Guy Louchard,and Gilles Schaeffer. Random sampling from Boltz-mann principles. In Peter Widmayer et al., editor, Au-tomata, Languages, and Programming, number 2380in Lecture Notes in Computer Science, pages 501–513.Springer Verlag, 2002.

[8] Philippe Duchon, Philippe Flajolet, Guy Louchard,and Gilles Schaeffer. Boltzmann samplers for the ran-dom generation of combinatorial structures. Combi-natorics, Probability and Computing, 13(4–5):577–625,2004. Special issue on Analysis of Algorithms.

[9] Philippe Flajolet, Éric Fusy, and Carine Pivoteau.Boltzmann sampling of unlabelled structures. InDavid Applegate et al., editor, Proceedings of the NinthWorkshop on Algorithm Engineering and Experiments

0

2000

4000

6000

8000

10 000

12 000

14 000

Figure 4. An Otter tree of size n “ 17 979 (we were targeting 20 000 ˘ 10%), generated in 13s on a standard2012 laptop computer. The bar chart summarizes the degree of symmetry of the leaves: the first bar indicateshow many leaves are not duplicated; then duplicated once; then four times; then eight times.

and the Fourth Workshop on Analytic Algorithmics andCombinatorics, pages 201–211. SIAM Press, 2007. Pro-ceedings of the New Orleans Conference.

[10] Philippe Flajolet, Maryse Pelletier, and Michèle So-ria. On Buffon machines and numbers. In DanaRandall, editor, Proceedings of the Twenty-Second An-nual ACM-SIAM Symposium on Discrete Algorithms,SODA 2011, pages 172–183. SIAM, 2011.

[11] Philippe Flajolet and Robert Sedgewick. AnalyticCombinatorics. Cambridge University Press, 2009.824 pages (ISBN-13: 9780521898065); also availableelectronically from the authors’ home pages.

[12] Philippe Flajolet, Paul Zimmerman, and BernardVan Cutsem. A calculus for the random generationof labelled combinatorial structures. Theoretical Com-puter Science, 132(1-2):1–35, 1994.

[13] Norman L. Johnson, Adrienne W. Kemp, and SamuelKotz. Univariate Discrete Distributions. Wiley Seriesin Probability and Statistics. John Wiley & Sons, Inc.,3rd edition, 2005.

[14] Jérémie Lumbroso. Optimal Discrete Uniform Genera-tion from Coin Flips, and Applications. Preprint, 2013.

[15] A. Meir and J. W. Moon. On the altitude of nodesin random trees. Canadian Journal of Mathematics,30:997–1015, 1978.

[16] Albert Nijenhuis and Herbert S. Wilf. CombinatorialAlgorithms. Academic Press, 2nd edition, 1978.

[17] Albert Noack. A Class of Random Variables withDiscrete Distributions. The Annals of MathematicalStatistics, 21(1):127–132, March 1950.

[18] Carine Pivoteau. Mémoire de master d’informatique.génération aléatoire et modeles de boltzmann. Mas-ter’s thesis, 2009.

[19] Carine Pivoteau, Bruno Salvy, and Michèle Soria.Boltzmann oracle for combinatorial systems. In Pro-ceedings of the Fifth Colloquium on Mathematics andComputer Science. Blaubeuren, Germany, pages 475–488, 2008.

[20] Carine Pivoteau, Bruno Salvy, and Michèle Soria. Al-gorithms for combinatorial structures: Well-foundedsystems and Newton iterations. Journal of Combina-torial Theory, Series A, 119(8):1711–1773, 2012.