
Families of power series distributions, with particular reference to the Lerch family


Journal of Statistical Planning and Inference

Journal of Statistical Planning and Inference 140 (2010) 2255–2259

journal homepage: www.elsevier.com/locate/jspi

Adrienne W. Kemp

School of Mathematics and Statistics, University of St Andrews, Scotland

E-mail address: [email protected]

Article info

Available online 20 January 2010

MSC: 33E20; 60E99; 60G10

Keywords: Families of power series distributions; Lerch family; Stochastic process models

0378-3758/$ - see front matter © 2010 Published by Elsevier B.V.
doi:10.1016/j.jspi.2010.01.021

Abstract

The paper revisits the concept of a power series distribution by defining its series function, its power parameter, and hence its probability generating function. Realization that the series function for a particular distribution is a special case of a recognized mathematical function enables distributions to be classified into families. Examples are the generalized hypergeometric family and the q-series family, both of which contain generalizations of the geometric distribution. The Lerch function (a third generalization of the geometric series) is the series function for the Lerch family. A list of distributions belonging to the Lerch family is provided.

The advantage of classifying power series distributions into families is that distributions within a family can be expected to have analogous properties which depend on the mathematical properties of the series function. This is demonstrated by focussing on equilibrium birth and death processes for the Lerch family.

© 2010 Published by Elsevier B.V.

1. Introduction

Most discrete statistical distributions are lattice distributions. The number of values can be either infinite, e.g. $0, 1, \ldots$ for the Poisson distribution, or finite, e.g. $0, 1, \ldots, n$ for the binomial distribution. They were studied intermittently pre-1960. In the 1960s Patil, e.g. Patil (1962), made great advances in the theory of the power series subclass (PSD) and the generalized power series subclass (GPSD). PSD’s have probability mass functions (pmfs) of the form

$$\Pr[X = x] = \frac{a_x \theta^x}{\eta(\theta)}, \quad x = 0, 1, \ldots, \quad \theta > 0,$$

where $a_x \ge 0$, and $\eta(\theta) = \sum_{x=0}^{\infty} a_x \theta^x$. We call $\theta$ the power parameter of the distribution and $\eta(\theta)$ its series function. For GPSD’s the set of values that the variable can take is any nonempty enumerable set of nonnegative integers. The class of modified power series distributions (MPSD) was created by Gupta in the 1970s; see e.g. Gupta (1974). These have pmfs of the form $\Pr[X = x] = a_x [u(\theta)]^x / \sum_x a_x [u(\theta)]^x$. They include distributions derived from Lagrangian expansions by Consul and his colleagues; see e.g. Consul and Shenton (1972).

PSD’s and GPSD’s have many global properties (although some may fail to hold if $\theta = 1$). Beheading or truncating a GPSD creates another GPSD, as does the convolution of two GPSD’s. Moreover, given a sample of independent observations from a PSD, then equating the observed and theoretical means gives the maximum likelihood estimate of $\theta$. If a pmf can be put into the form $\Pr[X = x] = \exp[a(\phi) b(x) + c(\phi) + d(x)]$, where $\phi$ is a parameter and $a(\cdot)$, $b(\cdot)$, $c(\cdot)$ and $d(\cdot)$ are known functions, then the distribution belongs to the important exponential family.
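The PSD form can be illustrated with a short Python sketch. It builds the pmf from a finite list of coefficients $a_x$ and a value of the power parameter (`theta` in the code), then checks the binomial case, where the mean should equal $np$. The helper name `psd_pmf` and the numerical values are assumptions made for this illustration; they do not come from the paper.

```python
import math

def psd_pmf(coeffs, theta):
    """Pmf a_x * theta**x / eta(theta) for x = 0, ..., len(coeffs) - 1.

    Hypothetical helper: eta(theta) is the finite sum over the supplied
    coefficients, so infinite-support cases are only approximated.
    """
    weights = [a * theta**x for x, a in enumerate(coeffs)]
    eta = sum(weights)  # series function evaluated at theta
    return [w / eta for w in weights]

# Binomial(n, p) written as a PSD: a_x = C(n, x), theta = p / (1 - p).
n, p = 5, 0.3
theta = p / (1 - p)
pmf = psd_pmf([math.comb(n, x) for x in range(n + 1)], theta)
mean = sum(x * px for x, px in enumerate(pmf))  # should be close to n * p
```

Because the binomial support is finite, the truncated sum here is the exact series function $(1+\theta)^n$; for infinite-support members the list of coefficients would have to be long enough for the tail to be negligible.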


Much of the theory of PSD’s and GPSD’s can be deduced from the probability generating function (pgf)

$$G(z) = \sum_x a_x \theta^x z^x \Big/ \sum_x a_x \theta^x = \eta(\theta z)/\eta(\theta).$$

For the Poisson distribution $\eta(\theta) = e^{\theta}$, $\theta > 0$, and for the binomial distribution $\eta(\theta) = (1+\theta)^n$, $n \in \mathbb{Z}^+$, $\theta = p/(1-p) > 0$. Some researchers have focussed on the characteristic function (cf) $G(e^{it})$; others on the Laplace transform $G(e^{-t})$.

The moment generating function (mgf) is $G(e^t) = \eta(\theta e^t)/\eta(\theta) = 1 + \sum_{r \ge 1} \mu'_r t^r/r!$, where $\mu'_r$ are the uncorrected moments. Often it is easier to obtain the moment properties from the factorial moment generating function (fmgf) $G(1+t) = \eta(\theta + \theta t)/\eta(\theta) = 1 + \sum_{r \ge 1} \mu'_{[r]} t^r/r!$, where $\mu'_{[r]}$ are the factorial moments, using the relationships $\mu = \mu'_1 = \mu'_{[1]}$, $\mu_2 = \mu'_2 - \mu^2 = \mu'_{[2]} + \mu - \mu^2$, etc. The cumulant generating function (cgf), $\ln G(e^t) = \sum_{r \ge 1} \kappa_r t^r/r!$, and the factorial cumulant generating function (fcgf), $\ln G(1+t) = \sum_{r \ge 1} \kappa_{[r]} t^r/r!$, are also useful theoretically. For example, $\kappa_{r+1} = \theta\, d\kappa_r/d\theta$. So if $\mu = \kappa_1$ is known as a function of $\theta$, then all the other cumulants and hence all the moments are known.
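As a worked check of the cumulant recurrence, consider the geometric distribution on $0, 1, \ldots$, whose series function is $(1-\theta)^{-1}$; this case is chosen here purely for illustration. The mean follows from $G'(1) = \theta\,\eta'(\theta)/\eta(\theta)$, and one application of the recurrence gives the variance:

```latex
\begin{aligned}
\kappa_1 &= \mu = \theta\,\frac{\eta'(\theta)}{\eta(\theta)}
          = \theta\,\frac{(1-\theta)^{-2}}{(1-\theta)^{-1}}
          = \frac{\theta}{1-\theta}, \\
\kappa_2 &= \theta\,\frac{d\kappa_1}{d\theta}
          = \theta \cdot \frac{1}{(1-\theta)^{2}}
          = \frac{\theta}{(1-\theta)^{2}} .
\end{aligned}
```

This agrees with the familiar geometric variance $q/p^2$ when $p = 1-\theta$ and $q = \theta$.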

2. Choices for the series function

Many different types of function for $\eta(\theta)$ have been found useful. When the function has many mathematical properties, the resultant distribution generally has many statistical properties, e.g. the exponential function, $\eta(\theta) = e^{\theta}$, gives the pgf $G(z) = \exp(\theta z)/\exp(\theta)$ and the Poisson distribution. The nineteenth century “wooden plough” of Gaussian hypergeometric functions was one of the first classes of mathematical functions to be explored; see e.g. Johnson et al. (2005, Section 6.2). Here the series function is $\eta(\theta) = {}_2F_1[a, b; c; \theta]$, giving

$$G(z) = \frac{{}_2F_1[a, b; c; \theta z]}{{}_2F_1[a, b; c; \theta]}.$$

The series function ${}_AF_B[a_1, \ldots, a_A; c_1, \ldots, c_B; \theta]$ (a generalized hypergeometric function) is also used. The geometric distribution is the simplest in these families, with $\eta(\theta) = {}_2F_1[1, 0; 0; \theta] = {}_1F_0[1; -; \theta] = (1-\theta)^{-1}$ and

$$G(z) = \frac{{}_2F_1[1, 0; 0; \theta z]}{{}_2F_1[1, 0; 0; \theta]} = (1-\theta) \sum_{x=0}^{\infty} \theta^x z^x, \quad 0 < \theta < 1.$$

Interest in the q-series family of distributions with series functions of the form

$$\eta(\theta) = {}_A\phi_B[a_1, \ldots, a_A; \gamma_1, \ldots, \gamma_B; q, \theta]$$

(see Johnson et al., 2005, Section 11.2.20) began with Dunkl’s (1981) study of the inverse absorption distribution and continues at this Conference with Charalambides’ paper on a q-binomial distribution and the paper by Kyriakoussis and Vamvakari. The simplest member of the q-series family is again the geometric distribution; here $\eta(\theta) = {}_1\phi_0[q; -; q, \theta] = 1/(1-\theta)$ and again

$$G(z) = \frac{{}_1\phi_0[q; -; q, \theta z]}{{}_1\phi_0[q; -; q, \theta]} = (1-\theta) \sum_{x=0}^{\infty} \theta^x z^x, \quad 0 < \theta < 1.$$

The Lerch (1887) function (Lerch’s transcendent function) is a less well-known generalization of the geometric series. Here

$$\Phi(\rho, c, \nu) = \sum_{x=0}^{\infty} (\nu + x)^{-c} \rho^x \quad \left( \text{with } \Phi(\rho, 0, 1) = \sum_{x=0}^{\infty} \rho^x \text{ when } 0 < \rho < 1 \right);$$

see e.g. Erdélyi et al. (1953, Higher Transcendental Functions, Vol. 1, eqn. (1.11.1)), Gradshteyn and Ryzhik (1980, Table of Integrals, Series and Products, Section 9.55), Weisstein (1998, CRC Concise Encyclopedia of Mathematics, p. 1070).

The Riemann zeta function is

$$\zeta(c) = \sum_{x=1}^{\infty} \frac{1}{x^c} = \sum_{x=0}^{\infty} \frac{1}{(1+x)^c} = \Phi(1, c, 1);$$

the generalized zeta (Hurwitz zeta) function is

$$\zeta(c, \nu) = \sum_{x=0}^{\infty} \frac{1}{(\nu + x)^c} = \Phi(1, c, \nu);$$

we have

$$\nu^{-1}\, {}_2F_1(1, \nu; \nu + 1; \rho) = \sum_{x=0}^{\infty} \frac{\rho^x}{\nu + x} = \Phi(\rho, 1, \nu);$$

and the polylogarithmic (Jonquière’s) function (Erdélyi et al., 1953, Eqn. (1.11.14)) is

$$F(\rho, c) = \sum_{x=1}^{\infty} \frac{\rho^x}{x^c} = \rho \sum_{x=0}^{\infty} \frac{\rho^x}{(1+x)^c} = \rho\, \Phi(\rho, c, 1).$$

In the statistical literature there is confusion concerning the Lerch function. It appears correctly in Kulasekera and Tonkyn (1992) and in Wimmer and Altmann (1999). Unfortunately there is inconsistency in Johnson et al. (2005, Section 11.2.20) concerning the lower limit of summation. In Zörnig and Altmann (1995), the seminal paper on the Lerch family, the second and third parameters are in reverse order and the limits of summation are given as $x = 1$ to $\infty$.
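Since the lower limit of summation is the point of confusion, the following Python sketch evaluates the Lerch function by direct summation starting at $x = 0$ and checks it against the geometric and logarithmic special cases given above. The function name `lerch_phi`, the truncation length and the test value $\rho = 0.5$ are assumptions for illustration; the truncated sum is only suitable for $0 < \rho < 1$.

```python
import math

def lerch_phi(rho, c, nu, terms=200):
    """Truncated Lerch series: sum over x = 0..terms-1 of rho**x / (nu + x)**c.

    Hypothetical helper; assumes 0 < rho < 1 and nu > 0 so that the
    neglected tail is tiny for the chosen number of terms.
    """
    return sum(rho**x / (nu + x)**c for x in range(terms))

rho = 0.5
geometric_series = lerch_phi(rho, 0, 1)          # Phi(rho, 0, 1) = 1 / (1 - rho)
polylog_c1 = rho * lerch_phi(rho, 1, 1)          # rho * Phi(rho, 1, 1) = -ln(1 - rho)
```

Starting the sum at $x = 1$ instead would drop the term $\nu^{-c}$ and break both identities, which is why the convention matters when comparing sources.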

3. Members of the Lerch family

For $G(z) = \Phi(\rho z, c, \nu)/\Phi(\rho, c, \nu)$ to be a valid pgf with support $0, 1, \ldots$ we need $\rho > 0$, $\nu > 0$ and also, for convergence, either

$$\text{(i) } \rho = 1,\ c > 1 \quad \text{or} \quad \text{(ii) } 0 < \rho < 1,\ -\infty < c < \infty;$$

see e.g. Johnson et al. (2005, p. 527). Tail truncation is necessary when the series does not converge. Zörnig and Altmann (1995) gave a list of members of the Lerch family accompanied by extensive references going back to Estoup (1916). Table 1 gives the series functions, pmf, support, and constraints for such distributions; in addition to Zörnig and Altmann (1995) see Johnson et al. (2005, Section 11.2.20) and Gupta et al. (2008). We note that the Riemann zeta and Hurwitz zeta distributions of Lin and Hu (2001) and Hu et al. (2006) are the usual Riemann zeta and Hurwitz zeta distributions moved to support $-\ln x$, $x = 1, 2, \ldots$. The recent paper by Shi Shan (2005) uses a different definition of a generalized Zipf function; his distributions include the Yule but not the geometric distribution.

For distributions with support $0, 1, \ldots$ the pgf is $G(z) = \Phi(\rho z, c, \nu)/\Phi(\rho, c, \nu)$. For distributions with support $1, 2, \ldots$ the pgf is $G(z) = z\Phi(\rho z, c, \nu)/\Phi(\rho, c, \nu)$. And for distributions with support $1, 2, \ldots, n$ the pgf is

$$G(z) = \frac{z\Phi(\rho z, c, \nu) - z^{n+1} \rho^n \Phi(\rho z, c, \nu + n)}{\Phi(\rho, c, \nu) - \rho^n \Phi(\rho, c, \nu + n)}.$$
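The tail-truncated pgf can be checked numerically. The sketch below, in Python, compares the closed form in terms of Lerch functions with a direct sum over the support $1, 2, \ldots, n$, using pmf weights $\rho^{x-1}/(\nu - 1 + x)^c$. The function names and parameter values are hypothetical, and the check uses $0 < \rho < 1$ so that the untruncated series converges; for $\rho = 1$ with $c \le 1$ only the direct sum is meaningful.

```python
def lerch_phi(rho, c, nu, terms=500):
    """Truncated Lerch series (hypothetical helper; assumes 0 < rho < 1, nu > 0)."""
    return sum(rho**x / (nu + x)**c for x in range(terms))

def truncated_pgf_direct(z, rho, c, nu, n):
    """pgf on support 1..n by direct summation of rho**(x-1) * z**x / (nu-1+x)**c."""
    num = sum(rho**(x - 1) * z**x / (nu + x - 1)**c for x in range(1, n + 1))
    den = sum(rho**(x - 1) / (nu + x - 1)**c for x in range(1, n + 1))
    return num / den

def truncated_pgf_lerch(z, rho, c, nu, n):
    """Same pgf written with Lerch functions, as in the displayed formula."""
    num = z * lerch_phi(rho * z, c, nu) - z**(n + 1) * rho**n * lerch_phi(rho * z, c, nu + n)
    den = lerch_phi(rho, c, nu) - rho**n * lerch_phi(rho, c, nu + n)
    return num / den

# Illustrative parameter values (assumptions, not from the paper).
z, rho, c, nu, n = 0.8, 0.6, 1.5, 2.0, 6
direct = truncated_pgf_direct(z, rho, c, nu, n)
closed = truncated_pgf_lerch(z, rho, c, nu, n)
```

The two values should agree to rounding error, because $\rho^n \Phi(\rho, c, \nu + n)$ is exactly the tail of the series beyond the first $n$ terms.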

The Riemann zeta distribution is also known as the discrete Pareto distribution and as the Zipf distribution. It has been applied to a variety of data, for example to numbers of insurance policies per individual by Seal (1947), in linguistics by Good (1957), and for the distribution of surnames by Fox and Lasker (1983). Seal (1952) discussed methods for fitting the distribution. The series $\Phi(1, c, 1)$ does not converge when $c = 0, 1$; tail truncation produces the discrete rectangular when $c = 0$ and the Estoup when $c = 1$. The Estoup and the Lotka ($c = 2$) distributions have been used in ranking problems.

The Hurwitz zeta distribution has been used for ranking problems in linguistics; see e.g. Mandelbrot (1959) and Wen Chen (1980). The series $\Phi(1, 1, \nu)$ does not converge; tail truncation yields the Zipf–Mandelbrot distribution.

The Good-I distribution was used in population studies by Good (1953), for word frequency data by Good (1957), and for sizes of business firms by Ijiri and Simon (1977). It is a size-biased logarithmic distribution and was found useful for species per genus data by Kemp (1995) who called it the polylogarithmic distribution. Kulasekera and Tonkyn (1992) relaxed the parameter constraint that had previously been assumed, changing it to $-\infty < c < \infty$. They showed that for $c > 0$ the distribution (like those previously discussed) is reversed J-shaped, while for $c < 0$ the distribution is unimodal. They discussed in depth its properties for survival studies, moments, genesis, and parameter estimation. Simon (1955) investigated the closeness of the Good-I and the Yule distribution. Doray and Luong (1997) reconsidered parameter estimation, with emphasis on maximum likelihood estimation and a quadratic distance method. The geometric and logarithmic distributions are the special cases $c = 0$ and $1$.

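Table 1. Distributions belonging to the Lerch family ($C$ is an appropriate summation constant; except where $\nu = 1$ the constraint is $\nu > 0$).

| Distribution | Series function | pmf ($= p_x$) | Support | Constraints |
|---|---|---|---|---|
| Riemann zeta | $\zeta(c) = \Phi(1, c, 1)$ | $C/x^c$ | $x = 1, 2, \ldots$ | $c > 1$ |
| Rectangular | $\zeta(0) = \Phi(1, 0, 1)$ | $C$ | $x = 1, 2, \ldots, n$ | – |
| Estoup | $\zeta(1) = \Phi(1, 1, 1)$ | $C/x$ | $x = 1, 2, \ldots, n$ | – |
| Lotka | $\zeta(2) = \Phi(1, 2, 1)$ | $C/x^2$ | $x = 1, 2, \ldots$ | – |
| Hurwitz zeta | $\zeta(c, \nu) = \Phi(1, c, \nu)$ | $C/(\nu - 1 + x)^c$ | $x = 1, 2, \ldots$ | $c > 1$ |
| Zipf–Mandelbrot | $\zeta(1, \nu) = \Phi(1, 1, \nu)$ | $C/(\nu - 1 + x)$ | $x = 1, 2, \ldots, n$ | – |
| Good-I | $F(\rho, c) = \rho\Phi(\rho, c, 1)$ | $C\rho^x/x^c$ | $x = 1, 2, \ldots$ | $0 < \rho < 1$, $-\infty < c < \infty$ |
| Geometric | $F(\rho, 0) = \rho\Phi(\rho, 0, 1)$ | $C\rho^x$ | $x = 1, 2, \ldots$ | $0 < \rho < 1$ |
| Logarithmic | $F(\rho, 1) = \rho\Phi(\rho, 1, 1)$ | $C\rho^x/x$ | $x = 1, 2, \ldots$ | $0 < \rho < 1$ |
| Pearson-III | $\Phi(\rho, c, \nu)$ | $C\rho^x/(\nu + x)^c$ | $x = 0, 1, \ldots$ | $0 < \rho < 1$, $c < 0$ |
| Good-II | $\rho\Phi(\rho, 1, \nu)$ | $C\rho^x/(\nu - 1 + x)$ | $x = 1, 2, \ldots$ | $0 < \rho < 1$ |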


The only published member of the Lerch family with choice for all three parameters is the discrete Pearson-III distribution. This was obtained by Haight (1957) as an equilibrium queue size distribution for the M/M/1 queue with balking. The Good-II distribution is the special case with $c = 1$; it appears in Good’s (1957) paper on modelling long-tailed data.

Although Zörnig and Altmann (1995) did not use the standard mathematical form of the Lerch function, their paper provides an interesting discussion of parameter estimation for Lerch distributions, examples of fitting to empirical data on surnames, and an extensive bibliography.

Recent applications using members of the Lerch family include: numbers of e-mail contacts (Adamic and Huberman, 2002); DNA sequencing (Aksenov et al., 2002); and numbers of bankruptcies (Fujiwara, 2004).

4. Stochastic process models for the Lerch family

Power series distributions arise in many ways, including (i) sampling (e.g. the binomial and classical hypergeometric distributions), (ii) from stochastic processes (e.g. the assumption of exponential interarrival times leads to Poisson numbers of arrivals; the lack of memory process also characterizes the Poisson distribution), and (iii) the construction of new mathematical expressions for $\eta(\theta)$ when fitting data and when approximating other distributions.

Many Lerch-type distributions were created by method (iii) and subsequently gained popularity because of the good fits that they provided.

Stochastic process models for the negative binomial distribution were the subject of Kendall’s (1948) innovative paper. His limiting process for the logarithmic distribution was adapted by Caraco (1979) in his study of avian foraging groups with size $x = 1, 2, \ldots$. The derivation is equivalent to assuming an equilibrium birth and death process with birth and death rates $\lambda_i = \lambda i$, $i = 1, 2, \ldots$, $\mu_i = \mu i$, $i = 2, 3, \ldots$, and $\lambda_0 = \mu_0 = \mu_1 = 0$. From the forward Kolmogorov equations the equilibrium solution is

$$\frac{p_x}{p_{x-1}} = \frac{\lambda_{x-1}}{\mu_x}, \quad \text{i.e.} \quad p_x = p_1 \rho^{x-1}/x, \quad x = 2, 3, \ldots,$$

where $\rho = \lambda/\mu$. The series function is $\rho\Phi(\rho, 1, 1)$ and the pgf is $\rho z\Phi(\rho z, 1, 1)/[\rho\Phi(\rho, 1, 1)]$.

Kemp’s (1995) genesis of the Good-I (polylogarithmic) distribution for the number of species per genus, $x = 1, 2, \ldots$, assumes that $\lambda_i = \lambda i^c$, $i = 1, 2, \ldots$, $\mu_i = \mu i^c$, $i = 2, 3, \ldots$, and $\lambda_0 = \mu_0 = \mu_1 = 0$. This gives

$$p_x = p_1 \rho^{x-1}/x^c, \quad x = 2, 3, \ldots;$$

the series function is $\rho\Phi(\rho, c, 1)$ and the pgf is $\rho z\Phi(\rho z, c, 1)/[\rho\Phi(\rho, c, 1)]$.

Kulasekera and Tonkyn (1992) were interested in the survival theory properties of the Good-I distribution with support $x = 1, 2, \ldots$. Incidentally they showed that it is the equilibrium solution of a density-dependent birth and death process with $\lambda_i = \rho[(i+1)/i]^r$, $i = 1, 2, \ldots$, $\mu_{i+1} = [(i+1)/i]^s$, $i = 1, 2, \ldots$, and $\lambda_0 = \mu_0 = \mu_1 = 0$. This gives $p_x = p_1 \rho^{x-1} x^{r-s}$, $x = 2, 3, \ldots$. The model allows for independent density dependent increasing/decreasing birth and death rates. Here the series function is $\rho\Phi(\rho, s - r, 1)$ and the pgf is $\rho z\Phi(\rho z, s - r, 1)/[\rho\Phi(\rho, s - r, 1)]$.

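Haight’s (1957) queueing-with-balking model for the discrete Pearson type-III distribution is also equivalent to an equilibrium birth and death process. Consider the M/M/1 queue of size $x = 0, 1, \ldots$ with constant service rate and balking ‘distribution’

$$b(x) = \left[ \frac{a(a + x + 1)}{(a + 1)(a + x)} \right]^{\gamma}, \quad x = 0, 1, \ldots.$$

Here $\lambda_i = \lambda b(i)$, $i = 0, 1, \ldots$, $\mu_i = \mu$, $i = 1, 2, \ldots$, and $\mu_0 = 0$. This gives

$$\frac{p_x}{p_{x-1}} = \frac{\lambda}{\mu} \left[ \frac{a(a + x)}{(a + 1)(a + x - 1)} \right]^{\gamma},$$

i.e. $p_x = C\rho^x/(a + x)^c$, where $\rho = \lambda a^{\gamma} \mu^{-1} (a + 1)^{-\gamma}$ and $c = -\gamma$. The series function is $\Phi(\rho, -\gamma, a)$ and, since the support is $0, 1, \ldots$, the pgf is $\Phi(\rho z, -\gamma, a)/\Phi(\rho, -\gamma, a)$.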

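We can generalize these results in the following manner, thereby giving a general equilibrium birth and death process for the Lerch family. Suppose that the birth and death rates are $\lambda_i = \lambda[(\nu + i + 1)/(\nu + i)]^r$, $i = 0, 1, \ldots$, and $\mu_{i+1} = \mu[(\nu + i + 1)/(\nu + i)]^s$, $i = 0, 1, \ldots$, with $\mu_0 = 0$. This gives

$$p_x \propto \lambda^x \left[ \frac{\nu + 1}{\nu} \cdot \frac{\nu + 2}{\nu + 1} \cdots \frac{\nu + x}{\nu + x - 1} \right]^r \Bigg/ \left\{ \mu^x \left[ \frac{\nu + 1}{\nu} \cdot \frac{\nu + 2}{\nu + 1} \cdots \frac{\nu + x}{\nu + x - 1} \right]^s \right\}, \quad x = 0, 1, \ldots,$$

i.e.

$$p_x = C\rho^x/(\nu + x)^c \quad \text{where} \quad \rho = \lambda/\mu, \quad c = s - r;$$

the series function is $\Phi(\rho, c, \nu)$ and the pgf is $\Phi(\rho z, c, \nu)/\Phi(\rho, c, \nu)$.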
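The general result can be checked numerically. The Python sketch below builds the equilibrium distribution from the detailed-balance ratio $p_x/p_{x-1} = \lambda_{x-1}/\mu_x$ with the density-dependent rates above, and compares it with the Lerch pmf $C\rho^x/(\nu + x)^c$, $c = s - r$. The function names, the state-space truncation `n_states` and the parameter values are assumptions for illustration only.

```python
def equilibrium_pmf(lam, mu, r, s, nu, n_states=200):
    """Equilibrium pmf of the birth and death process, truncated to n_states states.

    Uses p_x / p_{x-1} = lambda_{x-1} / mu_x with
    lambda_{x-1} = lam * ((nu + x) / (nu + x - 1))**r and
    mu_x         = mu  * ((nu + x) / (nu + x - 1))**s.
    """
    weights = [1.0]  # unnormalized p_0
    for x in range(1, n_states):
        ratio = (nu + x) / (nu + x - 1)
        birth = lam * ratio**r   # lambda_{x-1}
        death = mu * ratio**s    # mu_x
        weights.append(weights[-1] * birth / death)
    total = sum(weights)
    return [w / total for w in weights]

def lerch_pmf(rho, c, nu, n_states=200):
    """Lerch pmf C * rho**x / (nu + x)**c on the same truncated support."""
    weights = [rho**x / (nu + x)**c for x in range(n_states)]
    total = sum(weights)
    return [w / total for w in weights]

# Illustrative values (assumptions): rho = lam / mu = 0.6 and c = s - r = 1.5.
lam, mu, r, s, nu = 1.2, 2.0, 0.5, 2.0, 1.5
p_process = equilibrium_pmf(lam, mu, r, s, nu)
p_lerch = lerch_pmf(lam / mu, s - r, nu)
max_diff = max(abs(a - b) for a, b in zip(p_process, p_lerch))
```

The telescoping product in the rates leaves only $[(\nu + x)/\nu]^{r-s}$, so `max_diff` should be at rounding-error level.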


5. Comments

With notable exceptions, such as Zörnig and Altmann (1995), very few papers in the literature have studied the entire Lerch family. Recently Gut (2005) has shown that the Riemann zeta distribution is a compound Poisson distribution and has given expressions for its higher moments. Also this distribution is log-convex and therefore when shifted to support $0, 1, \ldots$ it is infinitely divisible, i.e. it is a generalized Poisson distribution. Also Meintanis (2007) has suggested a test statistic for this distribution and has put forward a bootstrap method given an unknown parameter.

The purpose of this paper is to highlight the advantage of classifying power series distributions into families. Distributions within a family can be expected to have analogous properties conferred on them by the mathematical form of the series function. Generalization of known properties for individual distributions will give wider-holding results. The previous section showed, for example, that equilibrium birth and death models can be constructed for the whole Lerch family.

Very recently two important papers that deal with the entire Lerch family have become available. Gupta et al. (2008) have obtained important results concerning its reliability properties, discussed maximum likelihood estimation of the parameters and shown that the whole family is log-convex and hence that shifted to the origin it is infinitely divisible. In preprints submitted to the Web (February 2008), Aksenov and Savageau (2008) refer to a Mathematica package for the Lerch distribution, Aksenov (2004); they obtain certain general results for the family, include comments on the variance to mean ratio, and discuss estimation methods, with examples.

References

Adamic, L.A., Huberman, B.A., 2002. Zipf’s law and the internet. Glottometrics 3, 143–150.
Aksenov, S.V., 2004. Mathematica package lerchdistribution.m. URL: http://aksenov.freeshell.org/.
Aksenov, S.V., Savageau, M.A., 2008. Some properties of the Lerch family of discrete distributions; Auxiliary information for article “Some properties of the Lerch family of discrete distributions”, February, Preprints submitted to Elsevier Science and available on the Web.
Aksenov, S.V., Savageau, M.A., Jentschura, U.D., Becher, J., Soff, G., Mohr, P.J., 2002. Application of the combined non-linear-condensation transformation to problems in statistical analysis and theoretical physics. arXiv:math.NA/0207086 v1.
Caraco, T., 1979. Ecological response of animal group size frequencies. In: Ord, J.K., Patil, G.P., Taillie, C. (Eds.), Statistical Ecology. Statistical Distributions in Ecological Work, vol. 4. International Co-operative Publishing House, Fairland, MD, pp. 371–386.
Consul, P.C., Shenton, L.R., 1972. Some interesting properties of Lagrangian distributions. Communications in Statistics 2, 263–272.
Doray, L.G., Luong, A., 1997. Efficient estimators for the Good family. Communications in Statistics-Simulation and Computation 26, 1075–1088.
Dunkl, C.F., 1981. The absorption distribution and the q-binomial theorem. Communications in Statistics-Theory and Methods 10, 1915–1920.
Erdélyi, A., Magnus, W., Oberhettinger, F., Tricomi, F.G., 1953. Higher Transcendental Functions, vol. I. McGraw-Hill, New York.
Estoup, J.B., 1916. Les Gammes Sténographiques. Institut Sténographique, Paris.
Fox, W.R., Lasker, G.W., 1983. The distribution of surname frequencies. International Statistical Review 51, 81–87.
Fujiwara, Y., 2004. Zipf law in firms bankruptcy. Physica A 337, 219–230.
Good, I.J., 1953. The population frequencies of species and the estimation of population parameters. Biometrika 40, 237–264.
Good, I.J., 1957. Distribution of word frequencies. Nature 179, 595.
Gradshteyn, I.S., Ryzhik, I.M., 1980. Tables of Integrals, Series and Products (Enlarged edition). Academic Press, London.
Gupta, R.C., 1974. Modified power series distributions and some of its applications. Sankhyā, Series B 35, 288–298.
Gupta, P.L., Gupta, R.C., Ong, S.-H., Srivastava, H.M., 2008. A class of Hurwitz-Lerch Zeta distributions and their applications in reliability. Applied Mathematics and Computation 196, 521–531.
Gut, A., 2005. Some remarks on the Riemann zeta distribution. Uppsala University Department of Mathematics Report 2005:6.
Haight, F.A., 1957. Queueing with balking. Biometrika 44, 360–369.
Hu, C.-Y., Iksanov, A.M., Lin, G.D., Zakusylo, O.K., 2006. The Hurwitz zeta distribution. Australian and New Zealand Journal of Statistics 48, 1–6.
Ijiri, Y., Simon, H.A., 1977. Skew Distributions and the Size of Business Firms. North-Holland, Amsterdam.
Johnson, N.L., Kemp, A.W., Kotz, S., 2005. Univariate Discrete Distributions (3e). Wiley, Hoboken, NJ.
Kemp, A.W., 1995. Splitters, lumpers and species per genus. The Mathematical Scientist 20, 107–118.
Kendall, D.G., 1948. On some modes of population growth leading to R.A. Fisher’s logarithmic series distribution. Biometrika 35, 6–15.
Kulasekera, K.B., Tonkyn, D.W., 1992. A new distribution with applications to survival dispersal and dispersion. Communications in Statistics-Simulation and Computation 21, 499–518.
Lerch, M., 1887. Acta Mathematica, XI.
Lin, G.D., Hu, C.-Y., 2001. The Riemann zeta distribution. Bernoulli 7, 817–828.
Mandelbrot, B., 1959. A note on a class of skew distribution functions: analysis and critique of a paper by H.A. Simon. Information and Control 2, 90–99.
Meintanis, S.G., 2007. A unified approach of testing for discrete and continuous Pareto laws. Statistical Papers 50, 569–580.
Patil, G.P., 1962. Certain properties of the generalized power series distribution. Annals of the Institute of Statistical Mathematics, Tokyo 14, 179–182.
Seal, H.L., 1947. A probability distribution of deaths at age x when policies are counted instead of lives. Skandinavisk Aktuarietidskrift 30, 18–43.
Seal, H.L., 1952. The maximum likelihood fitting of the discrete Pareto law. Journal of the Institute of Actuaries 78, 115–121.
Shi Shan, 2005. On the generalized Zipf distribution, Part I. Information Processing and Management 41, 1369–1386.
Simon, H.A., 1955. On a class of skew distribution functions. Biometrika 42, 425–440.
Weisstein, E.W., 1998. CRC Concise Encyclopedia of Mathematics. CRC Press, Boca Raton.
Wen Chen, 1980. On the weak form of Zipf’s law. Journal of Applied Probability 17, 611–622.
Wimmer, G., Altmann, G., 1999. Thesaurus of Univariate Discrete Probability Distributions. Stamm Verlag, Essen.
Zörnig, P., Altmann, G., 1995. Unified representation of Zipf distributions. Computational Statistics and Data Analysis 19, 461–473.