ARTICLE IN PRESS
Contents lists available at ScienceDirect
Journal of Statistical Planning and Inference
Journal of Statistical Planning and Inference 140 (2010) 2255–2259
journal homepage: www.elsevier.com/locate/jspi
0378-3758/$ - see front matter © 2010 Published by Elsevier B.V.
doi:10.1016/j.jspi.2010.01.021
Families of power series distributions, with particular reference to the Lerch family
Adrienne W. Kemp
School of Mathematics and Statistics, University of St Andrews, Scotland
E-mail address: [email protected]
a r t i c l e i n f o
Available online 20 January 2010
MSC:
33E20
60E99
60G10
Keywords:
Families of power series distributions
Lerch family
Stochastic process models
a b s t r a c t
The paper revisits the concept of a power series distribution by defining its series
function, its power parameter, and hence its probability generating function. Realization
that the series function for a particular distribution is a special case of a recognized
mathematical function enables distributions to be classified into families. Examples are
the generalized hypergeometric family and the q-series family, both of which contain
generalizations of the geometric distribution. The Lerch function (a third generalization
of the geometric series) is the series function for the Lerch family. A list of distributions
belonging to the Lerch family is provided.
The advantage of classifying power series distributions into families is that
distributions within a family can be expected to have analogous properties which
depend on the mathematical properties of the series function. This is demonstrated by
focussing on equilibrium birth and death processes for the Lerch family.
© 2010 Published by Elsevier B.V.
1. Introduction
Most discrete statistical distributions are lattice distributions. The number of values can be either infinite, e.g. $0, 1, \ldots$ for the Poisson distribution, or finite, e.g. $0, 1, \ldots, n$ for the binomial distribution. They were studied intermittently pre-1960. In the 1960s Patil, e.g. Patil (1962), made great advances in the theory of the power series subclass (PSD) and the generalized power series subclass (GPSD). PSD’s have probability mass functions (pmfs) of the form
$$\Pr[X = x] = \frac{a_x \theta^x}{\eta(\theta)}, \qquad x = 0, 1, \ldots; \quad \theta > 0,$$
where $a_x \ge 0$, and $\eta(\theta) = \sum_{x=0}^{\infty} a_x \theta^x$. We call $\theta$ the power parameter of the distribution and $\eta(\theta)$ its series function. For GPSD’s the set of values that the variable can take is any nonempty enumerable set of nonnegative integers. The class of modified power series distributions (MPSD) was created by Gupta in the 1970s; see e.g. Gupta (1974). These have pmfs of the form $\Pr[X = x] = a_x [u(\theta)]^x / \sum_x a_x [u(\theta)]^x$. They include distributions derived from Lagrangian expansions by Consul and his colleagues; see e.g. Consul and Shenton (1972).
PSD’s and GPSD’s have many global properties (although some may fail to hold if $\theta = 1$). Beheading or truncating a GPSD creates another GPSD, as does the convolution of two GPSD’s. Moreover, given a sample of independent observations from a PSD, then equating the observed and theoretical means gives the maximum likelihood estimate of $\theta$. If a pmf can be put into the form $\Pr[X = x] = \exp[a(\phi) b(x) + c(\phi) + d(x)]$, where $\phi$ is a parameter and $a(\cdot)$, $b(\cdot)$, $c(\cdot)$ and $d(\cdot)$ are known functions, then the distribution belongs to the important exponential family.
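The PSD definition and the mean-matching estimate of the power parameter can be illustrated with a minimal Python sketch. The function names (`psd_pmf`, `psd_mean`, `fit_theta_by_mean`), the truncation at 60 terms, and the bisection bounds are assumptions made for illustration only; they do not come from the paper.

```python
import math

def psd_pmf(coeffs, theta):
    """pmf of a PSD: p_x = a_x * theta**x / eta(theta), x = 0, 1, ..., len(coeffs) - 1."""
    weights = [a * theta ** x for x, a in enumerate(coeffs)]
    eta = sum(weights)  # series function eta(theta), truncated to the stored terms
    return [w / eta for w in weights]

def psd_mean(coeffs, theta):
    """Theoretical mean of the (truncated) PSD."""
    return sum(x * p for x, p in enumerate(psd_pmf(coeffs, theta)))

def fit_theta_by_mean(coeffs, sample_mean, lo=1e-9, hi=50.0, tol=1e-10):
    """Solve psd_mean(theta) = sample_mean by bisection; the mean increases with theta."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psd_mean(coeffs, mid) < sample_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Poisson case: a_x = 1/x!, truncated at 60 terms (an illustrative cut-off).
poisson_coeffs = [1.0 / math.factorial(x) for x in range(60)]
theta_hat = fit_theta_by_mean(poisson_coeffs, sample_mean=2.5)
print(theta_hat)
```

For the Poisson coefficients the fitted value should be very close to the sample mean itself, since the Poisson mean equals its power parameter. Passing binomial coefficients `math.comb(n, x)` instead gives the binomial case with power parameter p/(1-p).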
Much of the theory of PSD’s and GPSD’s can be deduced from the probability generating function (pgf)
$$G(z) = \sum_x a_x \theta^x z^x \Big/ \sum_x a_x \theta^x = \eta(\theta z)/\eta(\theta).$$
For the Poisson distribution $\eta(\theta) = e^{\theta}$, $\theta > 0$, and for the binomial distribution $\eta(\theta) = (1 + \theta)^n$, $n \in \mathbb{Z}^{+}$, $\theta = p/(1 - p) > 0$. Some researchers have focussed on the characteristic function (cf) $G(e^{it})$; others on the Laplace transform $G(e^{-t})$.
The moment generating function (mgf) is $G(e^{t}) = \eta(\theta e^{t})/\eta(\theta) = 1 + \sum_{r \ge 1} \mu'_r t^r / r!$, where $\mu'_r$ are the uncorrected moments. Often it is easier to obtain the moment properties from the factorial moment generating function (fmgf) $G(1 + t) = \eta(\theta + \theta t)/\eta(\theta) = 1 + \sum_{r \ge 1} \mu'_{[r]} t^r / r!$, where $\mu'_{[r]}$ are the factorial moments, using the relationships $\mu = \mu'_1 = \mu'_{[1]}$, $\mu_2 = \mu'_2 - \mu^2 = \mu'_{[2]} + \mu - \mu^2$, etc. The cumulant generating function (cgf), $\ln G(e^{t}) = \sum_{r \ge 1} \kappa_r t^r / r!$, and the factorial cumulant generating function (fcgf), $\ln G(1 + t) = \sum_{r \ge 1} \kappa_{[r]} t^r / r!$, are also useful theoretically. For example, $\kappa_{r+1} = \theta \, d\kappa_r / d\theta$. So if $\mu = \kappa_1$ is known as a function of $\theta$, then all the other cumulants and hence all the moments are known.
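The cumulant recurrence can be checked numerically. The sketch below assumes the standard consequence of the pgf that the mean is θη′(θ)/η(θ) (obtained by differentiating G(z) at z = 1), and approximates derivatives by central differences; the function names and step sizes are illustrative choices, not part of the paper.

```python
import math

def mean_from_series(eta, theta, h=1e-4):
    """kappa_1 = theta * eta'(theta) / eta(theta), with eta' by central difference."""
    d_eta = (eta(theta + h) - eta(theta - h)) / (2 * h)
    return theta * d_eta / eta(theta)

def variance_from_series(eta, theta, h=1e-4):
    """kappa_2 = theta * d(kappa_1)/d(theta), again by central difference."""
    d_mu = (mean_from_series(eta, theta + h) - mean_from_series(eta, theta - h)) / (2 * h)
    return theta * d_mu

def geometric_eta(theta):
    """Series function of the geometric distribution, 1 / (1 - theta)."""
    return 1.0 / (1.0 - theta)

# Poisson: eta(theta) = exp(theta); mean and variance should both be near theta.
print(mean_from_series(math.exp, 3.0), variance_from_series(math.exp, 3.0))
# Geometric: mean theta/(1-theta) and variance theta/(1-theta)**2.
print(mean_from_series(geometric_eta, 0.5), variance_from_series(geometric_eta, 0.5))
```

Only the series function is needed as input, which is the point of the recurrence: once the mean is available as a function of θ, the higher cumulants follow by repeated differentiation.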
2. Choices for the series function
Many different types of function for $\eta(\theta)$ have been found useful. When the function has many mathematical properties, the resultant distribution generally has many statistical properties, e.g. the exponential function, $\eta(\theta) = e^{\theta}$, gives the pgf $G(z) = \exp(\theta z)/\exp(\theta)$ and the Poisson distribution. The nineteenth century “wooden plough” of Gaussian hypergeometric functions was one of the first classes of mathematical functions to be explored; see e.g. Johnson et al. (2005, Section 6.2). Here the series function is $\eta(\theta) = {}_2F_1[a, b; c; \theta]$, giving
$$G(z) = \frac{{}_2F_1[a, b; c; \theta z]}{{}_2F_1[a, b; c; \theta]}.$$
The series function ${}_AF_B[a_1, \ldots, a_A; c_1, \ldots, c_B; \theta]$ (a generalized hypergeometric function) is also used. The geometric distribution is the simplest in these families, with $\eta(\theta) = {}_2F_1[1, 0; 0; \theta] = {}_1F_0[1; -; \theta] = (1 - \theta)^{-1}$ and
$$G(z) = \frac{{}_2F_1[1, 0; 0; \theta z]}{{}_2F_1[1, 0; 0; \theta]} = (1 - \theta) \sum_{x=0}^{\infty} \theta^x z^x, \qquad 0 < \theta < 1.$$
Interest in the q-series family of distributions with series functions of the form
$$\eta(\theta) = {}_A\phi_B[a_1, \ldots, a_A; \gamma_1, \ldots, \gamma_B; q, \theta]$$
(see Johnson et al., 2005, Section 11.2.20) began with Dunkl’s (1981) study of the inverse absorption distribution and continues at this Conference with Charalambides’ paper on a q-binomial distribution and the paper by Kyriakoussis and Vamvakari. The simplest member of the q-series family is again the geometric distribution; here $\eta(\theta) = {}_1\phi_0[q; -; q, \theta] = 1/(1 - \theta)$ and again
$$G(z) = \frac{{}_1\phi_0[q; -; q, \theta z]}{{}_1\phi_0[q; -; q, \theta]} = (1 - \theta) \sum_{x=0}^{\infty} \theta^x z^x, \qquad 0 < \theta < 1.$$
The Lerch (1887) function (Lerch’s transcendent function) is a less well-known generalization of the geometric series. Here
$$\Phi(\rho, c, \nu) = \sum_{x=0}^{\infty} (\nu + x)^{-c} \rho^x \qquad \left( \text{with } \Phi(\rho, 0, 1) = \sum_{x=0}^{\infty} \rho^x \text{ when } 0 < \rho < 1 \right);$$
see e.g. Erdelyi et al. (1953, Higher Transcendental Functions, Vol. 1, eqn. (1.11.1)), Gradshteyn and Ryzhik (1980, Table of Integrals, Series and Products, Section 9.55), Weisstein (1998, CRC Concise Encyclopedia of Mathematics, p. 1070).
The Riemann zeta function is
$$\zeta(c) = \sum_{x=1}^{\infty} \frac{1}{x^c} = \sum_{x=0}^{\infty} \frac{1}{(1 + x)^c} = \Phi(1, c, 1);$$
the generalized zeta (Hurwitz zeta) function is
$$\zeta(c, \nu) = \sum_{x=0}^{\infty} \frac{1}{(\nu + x)^c} = \Phi(1, c, \nu);$$
we have
$$\nu^{-1} \, {}_2F_1(1, \nu; \nu + 1; \rho) = \sum_{x=0}^{\infty} \frac{\rho^x}{\nu + x} = \Phi(\rho, 1, \nu);$$
and the polylogarithmic (Jonquière’s) function (Erdelyi et al., 1953, Eqn. (1.11.14)) is
$$F(\rho, c) = \sum_{x=1}^{\infty} \frac{\rho^x}{x^c} = \rho \sum_{x=0}^{\infty} \frac{\rho^x}{(1 + x)^c} = \rho \, \Phi(\rho, c, 1).$$
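These special cases can be checked with a direct truncated summation of the Lerch series. This is a minimal sketch: the name `lerch_phi` and the default number of terms are assumptions, and a plain partial sum converges slowly when the first argument equals 1, so it is meant for illustration rather than accurate evaluation.

```python
import math

def lerch_phi(rho, c, nu, terms=100_000):
    """Truncated Lerch series: sum over x = 0..terms-1 of rho**x / (nu + x)**c."""
    total = 0.0
    power = 1.0  # holds rho**x, updated multiplicatively
    for x in range(terms):
        total += power / (nu + x) ** c
        power *= rho
        if power == 0.0:  # all remaining terms have underflowed to zero
            break
    return total

# Geometric series: Phi(rho, 0, 1) = 1 / (1 - rho).
print(lerch_phi(0.5, 0, 1), 1 / (1 - 0.5))
# Polylogarithm with c = 1: rho * Phi(rho, 1, 1) = -ln(1 - rho).
print(0.5 * lerch_phi(0.5, 1, 1), -math.log(1 - 0.5))
# Riemann zeta: Phi(1, 2, 1) approximates zeta(2) = pi**2 / 6 (slow convergence).
print(lerch_phi(1.0, 2, 1), math.pi ** 2 / 6)
```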
In the statistical literature there is confusion concerning the Lerch function. It appears correctly in Kulasekera and Tonkyn (1992) and in Wimmer and Altmann (1999). Unfortunately there is inconsistency in Johnson et al. (2005, Section 11.2.20) concerning the lower limit of summation. In Zornig and Altmann (1995), the seminal paper on the Lerch family, the second and third parameters are in reverse order and the limits of summation are given as $x = 1$ to $\infty$.
3. Members of the Lerch family
For $G(z) = \Phi(\rho z, c, \nu)/\Phi(\rho, c, \nu)$ to be a valid pgf with support $0, 1, \ldots$ we need $\rho > 0$, $\nu > 0$ and also, for convergence, either
$$\text{(i)} \ \rho = 1, \ c > 1 \qquad \text{or} \qquad \text{(ii)} \ 0 < \rho < 1, \ -\infty < c < \infty;$$
see e.g. Johnson et al. (2005, p. 527). Tail truncation is necessary when the series does not converge. Zornig and Altmann (1995) gave a list of members of the Lerch family accompanied by extensive references going back to Estoup (1916). Table 1 gives the series functions, pmf, support, and constraints for such distributions; in addition to Zornig and Altmann (1995) see Johnson et al. (2005, Section 11.2.20) and Gupta et al. (2008). We note that the Riemann zeta and Hurwitz zeta distributions of Lin and Hu (2001) and Hu et al. (2006) are the usual Riemann zeta and Hurwitz zeta distributions moved to support $-\ln x$, $x = 1, 2, \ldots$. The recent paper by Shi Shan (2005) uses a different definition of a generalized Zipf function; his distributions include the Yule but not the geometric distribution.
For distributions with support $0, 1, \ldots$ the pgf is $G(z) = \Phi(\rho z, c, \nu)/\Phi(\rho, c, \nu)$.
For distributions with support $1, 2, \ldots$ the pgf is $G(z) = z \Phi(\rho z, c, \nu)/\Phi(\rho, c, \nu)$.
And for distributions with support $1, 2, \ldots, n$ the pgf is
$$G(z) = \frac{z \Phi(\rho z, c, \nu) - z^{n+1} \rho^n \Phi(\rho z, c, \nu + n)}{\Phi(\rho, c, \nu) - \rho^n \Phi(\rho, c, \nu + n)}.$$
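The truncated-support pgf can be compared with a direct finite sum. The sketch below assumes a convergent case (the power parameter strictly between 0 and 1, and 0 < z ≤ 1), because it evaluates the Lerch series by plain truncation; with the power parameter equal to 1 and c ≤ 1 the individual series diverge and only the finite sum makes sense. The helper names and parameter values are illustrative assumptions.

```python
def lerch_phi(rho, c, nu, terms=2000):
    """Truncated Lerch series: sum over x = 0..terms-1 of rho**x / (nu + x)**c."""
    total, power = 0.0, 1.0
    for x in range(terms):
        total += power / (nu + x) ** c
        power *= rho
        if power == 0.0:  # remaining terms underflow to zero
            break
    return total

def pgf_truncated(z, rho, c, nu, n):
    """pgf on support 1..n written as a difference of two Lerch functions."""
    num = z * lerch_phi(rho * z, c, nu) - z ** (n + 1) * rho ** n * lerch_phi(rho * z, c, nu + n)
    den = lerch_phi(rho, c, nu) - rho ** n * lerch_phi(rho, c, nu + n)
    return num / den

def pgf_direct(z, rho, c, nu, n):
    """Same pgf from the pmf p_x proportional to rho**(x-1) / (nu + x - 1)**c, x = 1..n."""
    weights = [rho ** (x - 1) / (nu + x - 1) ** c for x in range(1, n + 1)]
    return sum(w * z ** x for x, w in enumerate(weights, start=1)) / sum(weights)

# Hypothetical parameter values chosen only to exercise the formula.
print(pgf_truncated(0.7, 0.6, 1.5, 2.0, 5), pgf_direct(0.7, 0.6, 1.5, 2.0, 5))
```

The two values should agree to rounding error, which confirms that subtracting the shifted Lerch function removes exactly the terms beyond n.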
The Riemann zeta distribution is also known as the discrete Pareto distribution and as the Zipf distribution. It has been applied to a variety of data, for example to numbers of insurance policies per individual by Seal (1947), in linguistics by Good (1957), and for the distribution of surnames by Fox and Lasker (1983). Seal (1952) discussed methods for fitting the distribution. The series $\Phi(1, c, 1)$ does not converge when $c = 0, 1$; tail truncation produces the discrete rectangular when $c = 0$ and the Estoup when $c = 1$. The Estoup and the Lotka ($c = 2$) distributions have been used in ranking problems.
The Hurwitz zeta distribution has been used for ranking problems in linguistics; see e.g. Mandelbrot (1959) and Wen Chen (1980). The series $\Phi(1, 1, \nu)$ does not converge; tail truncation yields the Zipf–Mandelbrot distribution.
The Good-I distribution was used in population studies by Good (1953), for word frequency data by Good (1957), and for sizes of business firms by Ijiri and Simon (1977). It is a size-biased logarithmic distribution and was found useful for species per genus data by Kemp (1995) who called it the polylogarithmic distribution. Kulasekera and Tonkyn (1992) relaxed the parameter constraint that had previously been assumed, changing it to $-\infty < c < \infty$. They showed that for $c > 0$ the distribution (like those previously discussed) is reversed J-shaped, while for $c < 0$ the distribution is unimodal. They discussed in depth its properties for survival studies, moments, genesis, and parameter estimation. Simon (1955) investigated the closeness of the Good-I and the Yule distribution. Doray and Luong (1997) reconsidered parameter estimation, with emphasis on maximum likelihood estimation and a quadratic distance method. The geometric and logarithmic distributions are the special cases $c = 0$ and $1$.
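Table 1
Distributions belonging to the Lerch family ($C$ is an appropriate summation constant; except where $\nu = 1$ the constraint is $\nu > 0$).

| Distribution | Series function | pmf ($= p_x$) | Support | Constraints |
|---|---|---|---|---|
| Riemann zeta | $\zeta(c) = \Phi(1, c, 1)$ | $C/x^c$ | $x = 1, 2, \ldots$ | $c > 1$ |
| Rectangular | $\zeta(0) = \Phi(1, 0, 1)$ | $C$ | $x = 1, 2, \ldots, n$ | – |
| Estoup | $\zeta(1) = \Phi(1, 1, 1)$ | $C/x$ | $x = 1, 2, \ldots, n$ | – |
| Lotka | $\zeta(2) = \Phi(1, 2, 1)$ | $C/x^2$ | $x = 1, 2, \ldots$ | – |
| Hurwitz zeta | $\zeta(c, \nu) = \Phi(1, c, \nu)$ | $C/(\nu - 1 + x)^c$ | $x = 1, 2, \ldots$ | $c > 1$ |
| Zipf–Mandelbrot | $\zeta(1, \nu) = \Phi(1, 1, \nu)$ | $C/(\nu - 1 + x)$ | $x = 1, 2, \ldots, n$ | – |
| Good-I | $F(\rho, c) = \rho \Phi(\rho, c, 1)$ | $C \rho^x / x^c$ | $x = 1, 2, \ldots$ | $0 < \rho < 1$; $-\infty < c < \infty$ |
| Geometric | $F(\rho, 0) = \rho \Phi(\rho, 0, 1)$ | $C \rho^x$ | $x = 1, 2, \ldots$ | $0 < \rho < 1$ |
| Logarithmic | $F(\rho, 1) = \rho \Phi(\rho, 1, 1)$ | $C \rho^x / x$ | $x = 1, 2, \ldots$ | $0 < \rho < 1$ |
| Pearson-III | $\Phi(\rho, c, \nu)$ | $C \rho^x / (\nu + x)^c$ | $x = 0, 1, \ldots$ | $0 < \rho < 1$; $c < 0$ |
| Good-II | $\rho \Phi(\rho, 1, \nu)$ | $C \rho^x / (\nu - 1 + x)$ | $x = 1, 2, \ldots$ | $0 < \rho < 1$ |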
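As a small illustration of how rows of Table 1 nest inside one another, the sketch below evaluates the Good-I pmf with a truncated normalizing sum and checks that c = 0 and c = 1 reproduce the geometric and logarithmic pmfs on support 1, 2, .... The function name `good1_pmf`, the truncation length, and the value 0.4 for the power parameter are assumptions for illustration.

```python
import math

def good1_pmf(x, rho, c, terms=5000):
    """Good-I pmf C * rho**x / x**c on x = 1, 2, ..., with C from a truncated sum."""
    norm = sum(rho ** k / k ** c for k in range(1, terms + 1))
    return rho ** x / x ** c / norm

rho = 0.4
# c = 0: geometric on 1, 2, ..., i.e. (1 - rho) * rho**(x - 1).
print(good1_pmf(3, rho, 0), (1 - rho) * rho ** 2)
# c = 1: logarithmic, i.e. -rho**x / (x * ln(1 - rho)).
print(good1_pmf(2, rho, 1), -rho ** 2 / (2 * math.log(1 - rho)))
```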
The only published member of the Lerch family with choice for all three parameters is the discrete Pearson-III distribution. This was obtained by Haight (1957) as an equilibrium queue size distribution for the M/M/1 queue with balking. The Good-II distribution is the special case with $c = 1$; it appears in Good’s (1957) paper on modelling long-tailed data.
Although Zornig and Altmann (1995) did not use the standard mathematical form of the Lerch function, their paper provides an interesting discussion of parameter estimation for Lerch distributions, examples of fitting to empirical data on surnames, and an extensive bibliography.
Recent applications using members of the Lerch family include: numbers of e-mail contacts (Adamic and Huberman, 2002); DNA sequencing (Aksenov et al., 2002); and numbers of bankruptcies (Fujiwara, 2004).
4. Stochastic process models for the Lerch family
Power series distributions arise in many ways, including (i) sampling (e.g. the binomial and classical hypergeometric distributions), (ii) from stochastic processes (e.g. the assumption of exponential interarrival times leads to Poisson numbers of arrivals, the lack of memory process also characterizes the Poisson distribution), and (iii) the construction of new mathematical expressions for $\eta(\theta)$ when fitting data and when approximating other distributions.
Many Lerch-type distributions were created by method (iii) and subsequently gained popularity because of the good fits that they provided.
Stochastic process models for the negative binomial distribution were the subject of Kendall’s (1948) innovative paper. His limiting process for the logarithmic distribution was adapted by Caraco (1979) in his study of avian foraging groups with size $x = 1, 2, \ldots$. The derivation is equivalent to assuming an equilibrium birth and death process with birth and death rates $\lambda_i = \lambda i$, $i = 1, 2, \ldots$, $\mu_i = \mu i$, $i = 2, 3, \ldots$, and $\lambda_0 = \mu_0 = \mu_1 = 0$. From the forward Kolmogorov equations the equilibrium solution is
$$\frac{p_x}{p_{x-1}} = \frac{\lambda_{x-1}}{\mu_x}, \quad \text{i.e.} \quad p_x = p_1 \rho^{x-1}/x, \qquad x = 2, 3, \ldots,$$
where $\rho = \lambda/\mu$. The series function is $\rho \Phi(\rho, 1, 1)$ and the pgf is $\rho z \Phi(\rho z, 1, 1) / [\rho \Phi(\rho, 1, 1)]$.
Kemp’s (1995) genesis of the Good-I (polylogarithmic) distribution for the number of species per genus, $x = 1, 2, \ldots$, assumes that $\lambda_i = \lambda i^c$, $i = 1, 2, \ldots$, $\mu_i = \mu i^c$, $i = 2, 3, \ldots$, and $\lambda_0 = \mu_0 = \mu_1 = 0$. This gives
$$p_x = p_1 \rho^{x-1}/x^c, \qquad x = 2, 3, \ldots;$$
the series function is $\rho \Phi(\rho, c, 1)$ and the pgf is $\rho z \Phi(\rho z, c, 1) / [\rho \Phi(\rho, c, 1)]$.
Kulasekera and Tonkyn (1992) were interested in the survival theory properties of the Good-I distribution with support $x = 1, 2, \ldots$. Incidentally they showed that it is the equilibrium solution of a density-dependent birth and death process with $\lambda_i = \rho [(i+1)/i]^r$, $i = 1, 2, \ldots$; $\mu_{i+1} = [(i+1)/i]^s$, $i = 2, 3, \ldots$; and $\lambda_0 = \mu_0 = \mu_1 = 0$. This gives $p_x = p_1 \rho^{x-1} x^{r-s}$, $x = 2, 3, \ldots$. The model allows for independent density dependent increasing/decreasing birth and death rates. Here the series function is $\rho \Phi(\rho, s - r, 1)$ and the pgf is $\rho z \Phi(\rho z, s - r, 1) / [\rho \Phi(\rho, s - r, 1)]$.
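The equilibrium relation between successive probabilities (birth rate out of state x-1 divided by death rate out of state x) translates directly into code. The following sketch builds the equilibrium pmf from arbitrary rate functions and checks Kemp's Good-I case; the function name `equilibrium_pmf`, the finite cut-off `last`, and the numerical rates are hypothetical choices for illustration.

```python
def equilibrium_pmf(birth, death, first, last):
    """Equilibrium probabilities on first..last using p_x / p_{x-1} = birth(x-1) / death(x)."""
    weights = [1.0]  # unnormalized weight of the lowest state
    for x in range(first + 1, last + 1):
        weights.append(weights[-1] * birth(x - 1) / death(x))
    total = sum(weights)
    return [w / total for w in weights]

# Kemp's rates: birth lam * i**c and death mu * i**c (hypothetical values below).
lam, mu, c = 0.3, 1.0, 1.5
p = equilibrium_pmf(lambda i: lam * i ** c, lambda i: mu * i ** c, first=1, last=400)

rho = lam / mu
# p[x - 1] is the probability of state x; compare p_4 / p_1 with rho**3 / 4**c.
print(p[3] / p[0], rho ** 3 / 4 ** c)
```

Swapping in the Kulasekera and Tonkyn rates only requires different `birth` and `death` callables, which is why the generic ratio form is used here rather than a closed formula.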
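Haight’s (1957) queueing-with-balking model for the discrete Pearson type-III distribution is also equivalent to an equilibrium birth and death process. Consider the M/M/1 queue of size $x = 0, 1, \ldots$ with constant service rate and balking ‘distribution’
$$b(x) = \left[ \frac{a (a + x + 1)}{(a + 1)(a + x)} \right]^{\gamma}, \qquad x = 0, 1, \ldots.$$
Here $\lambda_i = \lambda b(i)$, $i = 0, 1, \ldots$, $\mu_i = \mu$, $i = 1, 2, \ldots$, and $\mu_0 = 0$. This gives
$$\frac{p_x}{p_{x-1}} = \frac{\lambda}{\mu} \left[ \frac{a (a + x)}{(a + 1)(a + x - 1)} \right]^{\gamma},$$
i.e. $p_x = C \rho^x / (a + x)^c$, where $\rho = \lambda a^{\gamma} \mu^{-1} (a + 1)^{-\gamma}$ and $c = -\gamma$. The series function is $\Phi(\rho, -\gamma, a)$ and, since the support is $0, 1, \ldots$, the pgf is $\Phi(\rho z, -\gamma, a)/\Phi(\rho, -\gamma, a)$.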
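The passage from the ratio of successive probabilities to the closed form is a telescoping product. Written out as a worked step, using only the symbols already defined for the balking model:

```latex
\begin{aligned}
p_x &= p_0 \prod_{j=1}^{x} \frac{\lambda}{\mu}
       \left[ \frac{a (a + j)}{(a + 1)(a + j - 1)} \right]^{\gamma} \\
    &= p_0 \left( \frac{\lambda}{\mu} \right)^{x}
       \left( \frac{a}{a + 1} \right)^{\gamma x}
       \left( \frac{a + x}{a} \right)^{\gamma}
     = \frac{p_0}{a^{\gamma}} \, \rho^{x} (a + x)^{\gamma},
\qquad \rho = \frac{\lambda a^{\gamma}}{\mu (a + 1)^{\gamma}} .
\end{aligned}
```

The factors $(a + j)/(a + j - 1)$ cancel in pairs, leaving $(a + x)/a$, so the summation constant is $C = p_0 a^{-\gamma}$ and the exponent is $c = -\gamma$.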
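We can generalize these results in the following manner, thereby giving a general equilibrium birth and death process for the Lerch family. Suppose that the birth and death rates are $\lambda_i = \lambda [(\nu + i + 1)/(\nu + i)]^r$, $i = 0, 1, \ldots$; and $\mu_{i+1} = \mu [(\nu + i + 1)/(\nu + i)]^s$, $i = 1, 2, \ldots$; $\mu_0 = 0$. This gives
$$p_x \propto \frac{\lambda^x \left[ \dfrac{\nu + 1}{\nu} \cdot \dfrac{\nu + 2}{\nu + 1} \cdots \dfrac{\nu + x}{\nu + x - 1} \right]^{r}}{\mu^x \left[ \dfrac{\nu + 1}{\nu} \cdot \dfrac{\nu + 2}{\nu + 1} \cdots \dfrac{\nu + x}{\nu + x - 1} \right]^{s}}, \qquad x = 0, 1, \ldots;$$
i.e.
$$p_x = C \rho^x / (\nu + x)^c \qquad \text{where } \rho = \lambda/\mu, \quad c = s - r;$$
the series function is $\Phi(\rho, c, \nu)$ and the pgf is $\Phi(\rho z, c, \nu)/\Phi(\rho, c, \nu)$.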
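The general process can be checked numerically in the same way. One assumption is made loudly here: the sketch applies the stated death-rate form to every state x ≥ 1 (that is, it also uses it for the first death rate), which is what the product formula requires. The function name, the cut-off `last`, and the parameter values are illustrative.

```python
def lerch_equilibrium(lam, mu, r, s, nu, last=400):
    """Equilibrium pmf on 0..last for the general Lerch birth and death process.

    Assumes birth rate lam * ((nu+x)/(nu+x-1))**r out of state x-1 and
    death rate mu * ((nu+x)/(nu+x-1))**s out of state x, for every x >= 1.
    """
    weights = [1.0]  # unnormalized weight of state 0
    for x in range(1, last + 1):
        step = (nu + x) / (nu + x - 1)
        birth = lam * step ** r  # rate of moving from x-1 up to x
        death = mu * step ** s   # rate of moving from x down to x-1
        weights.append(weights[-1] * birth / death)
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical parameter values.
lam, mu, r, s, nu = 0.5, 1.0, 0.5, 2.0, 1.5
p = lerch_equilibrium(lam, mu, r, s, nu)

rho, c = lam / mu, s - r
# p_x / p_0 should equal rho**x * (nu / (nu + x))**c; check at x = 3.
print(p[3] / p[0], rho ** 3 * (nu / (nu + 3)) ** c)
```

Only the difference s - r enters the result, so increasing both exponents by the same amount changes the transition rates but not the equilibrium distribution.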
5. Comments
With notable exceptions, such as Zornig and Altmann (1995), very few papers in the literature have studied the entire Lerch family. Recently Gut (2005) has shown that the Riemann zeta distribution is a compound Poisson distribution and has given expressions for its higher moments. Also this distribution is log-convex and therefore when shifted to support $0, 1, \ldots$ it is infinitely divisible, i.e. it is a generalized Poisson distribution. Also Meintanis (2007) has suggested a test statistic for this distribution and has put forward a bootstrap method given an unknown parameter.
The purpose of this paper is to highlight the advantage of classifying power series distributions into families. Distributions within a family can be expected to have analogous properties conferred on them by the mathematical form of the series function. Generalization of known properties for individual distributions will give wider-holding results. The previous section showed, for example, that equilibrium birth and death models can be constructed for the whole Lerch family.
Very recently two important papers that deal with the entire Lerch family have become available. Gupta et al. (2008) have obtained important results concerning its reliability properties, discussed maximum likelihood estimation of the parameters and shown that the whole family is log-convex and hence that shifted to the origin it is infinitely divisible. In preprints submitted to the Web (February 2008), Aksenov and Savageau (2008) refer to a Mathematica package for the Lerch distribution, Aksenov (2004); they obtain certain general results for the family, include comments on the variance to mean ratio, and discuss estimation methods, with examples.
References
Adamic, L.A., Huberman, B.A., 2002. Zipf’s law and the internet. Glottometrics 3, 143–150.
Aksenov, S.V., 2004. Mathematica package lerchdistribution.m. URL http://aksenov.freeshell.org/.
Aksenov, S.V., Savageau, M.A., 2008. Some properties of the Lerch family of discrete distributions; Auxiliary information for article “Some properties of the Lerch family of discrete distributions”, February, Preprints submitted to Elsevier Science and available on the Web.
Aksenov, S.V., Savageau, M.A., Jentschura, U.D., Becher, J., Soff, G., Mohr, P.J., 2002. Application of the combined non-linear-condensation transformation to problems in statistical analysis and theoretical physics. arXiv:math.NA/0207086 v1.
Caraco, T., 1979. Ecological response of animal group size frequencies. In: Ord, J.K., Patil, G.P., Taillie, C. (Eds.), Statistical Ecology. Statistical Distributions in Ecological Work, vol. 4. International Co-operative Publishing House, Fairland, MD, pp. 371–386.
Consul, P.C., Shenton, L.R., 1972. Some interesting properties of Lagrangian distributions. Communications in Statistics 2, 263–272.
Doray, L.G., Luong, A., 1997. Efficient estimators for the Good family. Communications in Statistics-Simulation and Computation 26, 1075–1088.
Dunkl, C.F., 1981. The absorption distribution and the q-binomial theorem. Communications in Statistics-Theory and Methods 10, 1915–1920.
Erdelyi, A., Magnus, W., Oberhettinger, F., Tricomi, F.G., 1953. Higher Transcendental Functions, vol. I. McGraw-Hill, New York.
Estoup, J.B., 1916. Les Gammes Stenographiques. Institut Stenographique, Paris.
Fox, W.R., Lasker, G.W., 1983. The distribution of surname frequencies. International Statistical Review 51, 81–87.
Fujiwara, Y., 2004. Zipf law in firms bankruptcy. Physica A 337, 219–230.
Good, I.J., 1953. The population frequencies of species and the estimation of population parameters. Biometrika 40, 237–264.
Good, I.J., 1957. Distribution of word frequencies. Nature 179, 595.
Gradshteyn, I.S., Ryzhik, I.M., 1980. Tables of Integrals, Series and Products (Enlarged edition). Academic Press, London.
Gupta, R.C., 1974. Modified power series distributions and some of its applications. Sankhya, Series B 35, 288–298.
Gupta, P.L., Gupta, R.C., Ong, S.-H., Srivastava, H.M., 2008. A class of Hurwitz-Lerch Zeta distributions and their applications in reliability. Applied Mathematics and Computation 196, 521–531.
Gut, A., 2005. Some remarks on the Riemann zeta distribution. Uppsala University Department of Mathematics Report 2005:6.
Haight, F.A., 1957. Queueing with balking. Biometrika 44, 360–369.
Hu, C.-Y., Iksanov, A.M., Lin, G.D., Zakusylo, O.K., 2006. The Hurwitz zeta distribution. Australian and New Zealand Journal of Statistics 48, 1–6.
Ijiri, Y., Simon, H.A., 1977. Skew Distributions and the Size of Business Firms. North-Holland, Amsterdam.
Johnson, N.L., Kemp, A.W., Kotz, S., 2005. Univariate Discrete Distributions (3e). Wiley, Hoboken, NJ.
Kemp, A.W., 1995. Splitters, lumpers and species per genus. The Mathematical Scientist 20, 107–118.
Kendall, D.G., 1948. On some modes of population growth leading to R.A. Fisher’s logarithmic series distribution. Biometrika 35, 6–15.
Kulasekera, K.B., Tonkyn, D.W., 1992. A new distribution with applications to survival dispersal and dispersion. Communications in Statistics-Simulation and Computation 21, 499–518.
Lerch, M., 1887. Acta Mathematica, XI.
Lin, G.D., Hu, C.-Y., 2001. The Riemann zeta distribution. Bernoulli 7, 817–828.
Mandelbrot, B., 1959. A note on a class of skew distribution functions: analysis and critique of a paper by H.A. Simon. Information and Control 2, 90–99.
Meintanis, S.G., 2007. A unified approach of testing for discrete and continuous Pareto laws. Statistical Papers 50, 569–580.
Patil, G.P., 1962. Certain properties of the generalized power series distribution. Annals of the Institute of Statistical Mathematics, Tokyo 14, 179–182.
Seal, H.L., 1947. A probability distribution of deaths at age x when policies are counted instead of lives. Skandinavisk Aktuarietidskrift 30, 18–43.
Seal, H.L., 1952. The maximum likelihood fitting of the discrete Pareto law. Journal of the Institute of Actuaries 78, 115–121.
Shi Shan, 2005. On the generalized Zipf distribution, Part I. Information Processing and Management 41, 1369–1386.
Simon, H.A., 1955. On a class of skew distribution functions. Biometrika 42, 425–440.
Weisstein, E.W., 1998. CRC Concise Encyclopedia of Mathematics. CRC Press, Boca Raton.
Wen Chen, 1980. On the weak form of Zipf’s law. Journal of Applied Probability 17, 611–622.
Wimmer, G., Altmann, G., 1999. Thesaurus of Univariate Discrete Probability Distributions. Stamm Verlag, Essen.
Zornig, P., Altmann, G., 1995. Unified representation of Zipf distributions. Computational Statistics and Data Analysis 19, 461–473.