IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 35, NO. 3, MAY 1989, pp. 688-692

Entropy Expressions and Their Estimators for Multivariate Distributions

NABIL ALI AHMED AND D. V. GOKHALE, MEMBER, IEEE

Abstract: Entropy expressions for several continuous multivariate distributions are derived. Point estimation of entropy for the multinormal distribution and for the distribution of order statistics from Weinman's exponential distribution is considered. The asymptotic distribution of the uniformly minimum variance unbiased estimator for multinormal entropy is obtained. Simulation results on convergence of the means and variances of these estimators are provided.

Manuscript received July 3, 1987; revised July 22, 1988. This correspondence was presented in part at the International Symposium on Information and Coding Theory, University of Campinas, Campinas, Brazil, July 1987. N. Ali Ahmed is with Bell Communications Research, 3 Corporate Place, Piscataway, NJ 08854, USA. D. V. Gokhale is with the Department of Statistics, University of California, Riverside, CA 92521, USA. IEEE Log Number 8927888.

INTRODUCTION

The entropy of a random variable $X$ having density $f(x)$, absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}^p$, is given by

$$H(f) = -\int_{\mathbb{R}^p} f(x)\,\ln f(x)\,dx,$$

where $x = (x_1, x_2, \ldots, x_p)'$.

In recent years the concept of entropy has been used increasingly in inferential statistics. Vasicek [12] constructed an entropy estimate and a test of goodness of fit of univariate normality based on that estimate. Dudewicz and Van der Meulen [3] proposed a goodness-of-fit test of the uniform distribution based on maximum entropy criteria. Gokhale [4] proposed a general form of a goodness-of-fit test statistic for families of maximum entropy distributions; particular cases of such families are the normal, exponential, double-exponential, gamma, and beta distributions. Lazo and Rathie [7] derived and tabulated entropies for various univariate continuous probability distributions. This correspondence extends their table and results to the entropy of several families of multivariate distributions (Section I). Section II deals with point estimation for the multivariate normal and the distribution of the order statistics from the multivariate exponential distribution proposed by Weinman [13]. Two types of estimators are considered. One, denoted by $H_c(f)$, is a consistent estimator, and the second, denoted by $H^*(f)$, is the uniformly minimum variance unbiased estimator. Section III deals with the asymptotic distribution of the entropy estimators $H^*(f)$ and $H_c(f)$ for the multivariate normal distribution. To compare their performances, means and variances obtained by Monte Carlo simulation are tabulated.

I. PARAMETRIC ENTROPY

The expressions of entropy derived in this section often involve the digamma and trigamma functions, $\psi(a) = (d/da)\ln\Gamma(a)$ and $\psi'(a) = (d/da)\psi(a)$, respectively (Magnus, [8]), and Euler's constant $\gamma = 0.5772156649$.

The density function of the multivariate normal distribution is given by

$$f(x) = (2\pi)^{-p/2}\,|\Sigma|^{-1/2}\exp\left\{-\tfrac12(x-\mu)'\Sigma^{-1}(x-\mu)\right\}, \qquad x \in \mathbb{R}^p,\ \mu \in \mathbb{R}^p,$$

with $\Sigma$ positive definite. The entropy, denoted by $H(\mathrm{NORM})$, is given by

$$H(\mathrm{NORM}) = \frac{p}{2} + \frac{p}{2}\ln(2\pi) + \frac{1}{2}\ln|\Sigma|. \tag{1.1}$$

Let $H(\mathrm{LNORM})$ be the entropy of a multivariate lognormal distribution given by

$$f(x) = (2\pi)^{-p/2}\,|\Sigma|^{-1/2}\left(\prod_{i=1}^p x_i\right)^{-1}\exp\left\{-\tfrac12(\ln x-\mu)'\Sigma^{-1}(\ln x-\mu)\right\},$$

where $\ln x = (\ln x_1, \ldots, \ln x_p)'$ and $x = (x_1, \ldots, x_p)'$. Then

$$H(\mathrm{LNORM}) = H(\mathrm{NORM}) + \sum_{i=1}^p \mu_i. \tag{1.2}$$

Let $H(\mathrm{LOGST})$ denote the entropy of a multivariate logistic distribution given by

$$f(x) = p!\left[\prod_{i=1}^p \beta_i^{-1}\,e^{-(x_i-\mu_i)/\beta_i}\right]\left[1+\sum_{i=1}^p e^{-(x_i-\mu_i)/\beta_i}\right]^{-(p+1)},$$

$-\infty < x_i < +\infty$, $\beta_i > 0$, $-\infty < \mu_i < +\infty$. Then

$$H(\mathrm{LOGST}) = \sum_{i=1}^p \ln(\beta_i) - \ln(p!) + (p+1)A(p), \tag{1.3}$$

where $A(p)$ is a function of $p$ alone, with $A(1) = 1$, so that for $p = 1$ (1.3) reduces to the univariate logistic entropy $\ln\beta_1 + 2$.

Let $H(\mathrm{PARETO})$ denote the entropy of the multivariate Pareto distribution of Type II (Arnold, [2]) given by

$$f(x) = \left[\prod_{i=1}^p \frac{a+i-1}{\theta_i}\right]\left[1+\sum_{i=1}^p \frac{x_i-\mu_i}{\theta_i}\right]^{-(a+p)}$$

for $x_i > \mu_i$, $i = 1,2,\ldots,p$. Then

$$H(\mathrm{PARETO}) = -\sum_{i=1}^p \ln\frac{a+i-1}{\theta_i} + (a+p)\sum_{i=1}^p \frac{1}{a+i-1}. \tag{1.4}$$
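As a quick numerical check of (1.1), a minimal sketch follows (it is ours, not part of the original correspondence; NumPy/SciPy usage and all identifiers in it are illustrative assumptions). It compares the closed form with a Monte Carlo estimate of $-E[\ln f(X)]$:

```python
import numpy as np
from scipy.stats import multivariate_normal

# Closed-form multinormal entropy, eq. (1.1):
# H(NORM) = p/2 + (p/2) ln(2*pi) + (1/2) ln |Sigma|
def h_norm(sigma):
    p = sigma.shape[0]
    return 0.5 * p + 0.5 * p * np.log(2 * np.pi) + 0.5 * np.log(np.linalg.det(sigma))

rng = np.random.default_rng(0)
sigma = np.array([[1.0, 0.5], [0.5, 1.0]])   # example covariance (ours)
x = rng.multivariate_normal(np.zeros(2), sigma, size=200_000)
# Monte Carlo estimate of -E[ln f(X)]
mc = -multivariate_normal(mean=np.zeros(2), cov=sigma).logpdf(x).mean()
print(h_norm(sigma), mc)   # the two values should agree to about 2 decimals
```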



Let $H(\mathrm{MEXP})$ denote the entropy of the multivariate exponential probability density (Arnold, [2]) given by

$$f(x) = \left[\prod_{i=1}^p (a+i-1)\right]\left(e^{x_1}+\cdots+e^{x_p}-p+1\right)^{-(a+p)}\exp\left\{\sum_{i=1}^p x_i\right\}$$

for $x_i > 0$, $i = 1,2,\ldots,p$. Then

$$H(\mathrm{MEXP}) = -\sum_{i=1}^p \ln(a+i-1) + (a+p)\sum_{i=1}^p \frac{1}{a+i-1} - \frac{p}{a}. \tag{1.5}$$

For the multivariate exponential distribution (Weinman, [13]), the joint density function of $X_1, X_2, \ldots, X_p$ over all the $p!$ regions of the sample space is given by

$$f(x_1,\ldots,x_p) = C\,\exp\left\{-\sum_{i=1}^{p}\frac{p+1-i}{\theta_i}\left(x_{(i)}-x_{(i-1)}\right)\right\}, \tag{1.6}$$

where $x_{(1)} \le \cdots \le x_{(p)}$ are the ordered coordinates, $x_{(0)} \equiv 0$, $\theta_i > 0$, and

$$C = \frac{1}{\theta_1\theta_2\cdots\theta_p}.$$

The entropy $H(\mathrm{EXPW})$ for the foregoing probability density function is

$$H(\mathrm{EXPW}) = p + \sum_{i=1}^p \ln\theta_i. \tag{1.7}$$

Let $X_1,\ldots,X_p$ be the i.i.d. failure times from the multivariate exponential distribution due to Weinman [13] and $Y_1,\ldots,Y_p$ be the corresponding order statistics. Using the fact that the spacing $Y_i - Y_{i-1}$, $i = 1,\ldots,p$ (with $Y_0 \equiv 0$), has an exponential distribution with scale parameter $\theta_i/(p+1-i)$, the entropy $H(\mathrm{OMEXP})$ of the joint density $g(y_1, y_2, \ldots, y_p)$ of the order statistics is given by

$$H(\mathrm{OMEXP}) = \sum_{i=1}^p \left[1 + \ln\frac{\theta_i}{p+1-i}\right] = p + \sum_{i=1}^p \ln\theta_i - \ln(p!), \tag{1.8}$$

where each summand $1 + \ln(\theta_i/(p+1-i))$ is the entropy of a univariate exponential distribution with scale parameter $\theta_i/(p+1-i)$.

II. PARAMETRIC ENTROPY ESTIMATORS

In this section we consider two point estimators for the parametric entropy; the first is a consistent estimator denoted by $H_c(f)$, and the second is the uniformly minimum variance unbiased estimator (UMVUE) denoted by $H^*(f)$. Consistent estimators for the parametric entropy of all the multivariate distributions considered in the previous section can be formed by replacing the parameters with their consistent estimators. Consistent estimators for the parameters of the multivariate normal, lognormal, logistic, and Pareto distributions are well known (see, for example, Rao [11], Press [10], Johnson and Kotz [6], and Arnold [2] for the respective families of distributions), and those for the distributions studied by Weinman [13] are given in his reference. Hence, for the sake of brevity, we do not pursue this topic here in more detail.

We first consider the UMVUE for $H(\mathrm{NORM})$.

Definition 1: Let $V$ ($p \times p$) be symmetric and positive definite. The random matrix $V$ is said to follow the nonsingular $p$-dimensional Wishart distribution with scale matrix $\Sigma$ and $n$ degrees of freedom, $n \ge p$, if the joint distribution of the distinct elements of $V$ is continuous with the density function of $V$ given by

$$p(V) = c\,|\Sigma|^{-n/2}\,|V|^{(n-p-1)/2}\exp\left\{-\tfrac12\,\mathrm{tr}\!\left(\Sigma^{-1}V\right)\right\}$$

for $\Sigma$, $V$ positive definite, and $p(V) = 0$ otherwise, where $c$ is a numerical constant defined as

$$c^{-1} = 2^{np/2}\,\pi^{p(p-1)/4}\prod_{i=1}^p \Gamma\!\left(\frac{n+1-i}{2}\right).$$
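As a numerical illustration of Definition 1 (a sketch of ours, not from the original text; the constants and helper names are assumptions), the following simulates $V = XX'$ for mean-zero normal data and compares the sample average of $\ln|V|$ with the identity $E\{\ln|V|\} = \ln|\Sigma| + p\ln 2 + \sum_{i=1}^p \psi((n+1-i)/2)$, which follows from Lemma 1 below:

```python
import numpy as np
from scipy.special import digamma

rng = np.random.default_rng(1)
p, n, reps = 3, 20, 20_000
sigma = 0.5 * np.eye(p) + 0.5            # example scale matrix (ours)
logdet = []
for _ in range(reps):
    x = rng.multivariate_normal(np.zeros(p), sigma, size=n)  # X': n x p
    v = x.T @ x                          # V = X X' ~ Wishart(Sigma, n)
    logdet.append(np.linalg.slogdet(v)[1])

i = np.arange(1, p + 1)
theory = (np.linalg.slogdet(sigma)[1] + p * np.log(2)
          + digamma((n + 1 - i) / 2).sum())
print(np.mean(logdet), theory)           # should agree to about 2 decimals
```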



Theorem 1: For $X_i$: $p \times 1$, let $X_1, X_2, \ldots, X_n$ be mutually independent; let each $X_i$ be normally distributed with parameters $0$ and $\Sigma$, $\Sigma > 0$, and let $X = (X_1, X_2, \ldots, X_n)$, $X$: $p \times n$, $n \ge p$. Define $V = XX'$; then $V > 0$ and $V$ will have a Wishart distribution with parameters $\Sigma$, $p$, and $n$.

Proof: See Press [10].

Lemma 1 (Wijsman, [14]): The distribution of $|V|/|\Sigma|$ is the distribution of the product of $p$ independent chi-squared variables with $n, n-1, \ldots, n-p+1$ degrees of freedom, respectively.

Theorem 2: The uniformly minimum variance unbiased estimator for the parametric entropy of the multivariate normal distribution is

$$H^*(\mathrm{NORM}) = \frac{p}{2} + \frac{p}{2}\ln(2\pi) + \frac12\left[\ln|V| - p\ln 2 - \sum_{i=1}^p \psi\!\left(\frac{n+1-i}{2}\right)\right]. \tag{2.1}$$

Proof: From Wijsman's lemma we have

$$\frac{|V|}{|\Sigma|} \sim \prod_{i=1}^p \chi^2_{(n+1-i)}. \tag{2.2}$$

Taking logarithms, we get

$$\ln|V| = \ln|\Sigma| + \sum_{i=1}^p \ln\left[\chi^2_{(n+1-i)}\right]. \tag{2.3}$$

The mean and the variance of $\ln[\chi^2_{(n+1-i)}]$ are given by

$$E\left\{\ln\left[\chi^2_{(n+1-i)}\right]\right\} = \psi\!\left(\frac{n+1-i}{2}\right) + \ln 2, \qquad \mathrm{var}\left\{\ln\left[\chi^2_{(n+1-i)}\right]\right\} = \psi'\!\left(\frac{n+1-i}{2}\right), \tag{2.4}$$

where $\psi(\cdot)$ and $\psi'(\cdot)$ are the digamma and trigamma functions, respectively (see Johnson and Kotz, [6]). The expected value and the variance of $\ln|V|$ can then be found as

$$E\{\ln|V|\} = \ln|\Sigma| + p\ln 2 + \sum_{i=1}^p \psi\!\left(\frac{n+1-i}{2}\right), \qquad \mathrm{var}\{\ln|V|\} = \sum_{i=1}^p \psi'\!\left(\frac{n+1-i}{2}\right). \tag{2.5}$$

Then it is easy to show that

$$E\{H^*(\mathrm{NORM})\} = H(\mathrm{NORM})$$

and

$$\mathrm{var}\{H^*(\mathrm{NORM})\} = \frac14\sum_{i=1}^p \psi'\!\left(\frac{n+1-i}{2}\right). \tag{2.6}$$

Since the Wishart distribution is a member of the exponential family, $S = V/(n-1)$ is a complete and sufficient statistic for $\Sigma$. By the Lehmann-Scheffe theorem (Graybill, [5]), $H^*(\mathrm{NORM})$ is the UMVUE for $H(\mathrm{NORM})$.
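A compact numerical rendering of Theorem 2 follows (ours, not from the correspondence; the function name and NumPy/SciPy usage are illustrative assumptions). It computes both $H_c(\mathrm{NORM})$ and $H^*(\mathrm{NORM})$ from mean-zero data, following the convention $V = XX'$ of Theorem 1 and $S = V/(n-1)$:

```python
import numpy as np
from scipy.special import digamma, polygamma

# Sketch of the two entropy estimators for the multinormal case.
# Convention as in Theorem 1: rows of x are i.i.d. N_p(0, Sigma), V = X X'.
def entropy_estimators(x):
    n, p = x.shape
    V = x.T @ x                      # Wishart(Sigma, n) when the mean is 0
    S = V / (n - 1)                  # consistent estimator of Sigma
    const = 0.5 * p + 0.5 * p * np.log(2 * np.pi)
    h_c = const + 0.5 * np.log(np.linalg.det(S))   # consistent estimator H_c
    # UMVUE: replace ln|Sigma| by its unbiased estimator,
    # ln|V| - p ln 2 - sum_i psi((n+1-i)/2), per eq. (2.1)
    i = np.arange(1, p + 1)
    h_star = const + 0.5 * (np.log(np.linalg.det(V))
                            - p * np.log(2)
                            - digamma((n + 1 - i) / 2).sum())
    var_star = 0.25 * polygamma(1, (n + 1 - i) / 2).sum()   # eq. (2.6)
    return h_c, h_star, var_star
```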

We can express $H^*(\mathrm{NORM})$ in terms of $H_c(\mathrm{NORM})$ as follows:

$$H^*(\mathrm{NORM}) = H_c(\mathrm{NORM}) + B_n, \tag{2.7}$$

where $H_c(\mathrm{NORM}) = \frac{p}{2} + \frac{p}{2}\ln(2\pi) + \frac12\ln|S|$ is the consistent estimator obtained by substituting $S$ for $\Sigma$ in (1.1). The following lemma shows that the limit of the bias term in $H_c(\mathrm{NORM})$ is equal to zero. Note that from (2.7) the bias $B_n$ equals

$$B_n = \frac12\sum_{i=1}^p\left[\ln(n-1) - \ln 2 - \psi\!\left(\frac{n+1-i}{2}\right)\right]. \tag{2.8}$$

Lemma 2: We have $\lim_{n\to\infty} B_n = 0$.

Proof: The inequality $e^x - 1 \ge x$ for $x \ge 0$ will be needed to show that the limit of $\psi((n+1-i)/2)$ is the limit of $\{\ln(n+1-i) - \ln 2\}$ for $i = 1,2,\ldots,p$. Consider the equality (Binet's second formula)

$$\psi(x) = \ln x - \frac{1}{2x} - 2\int_0^\infty \frac{t\,dt}{(t^2+x^2)(e^{2\pi t}-1)}. \tag{2.9}$$

Let $y = 2\pi t$; then $e^{2\pi t} - 1 \ge 2\pi t$, so

$$2\int_0^\infty \frac{t\,dt}{(t^2+x^2)(e^{2\pi t}-1)} \le \frac{1}{\pi}\int_0^\infty \frac{dt}{t^2+x^2} = \frac{1}{2x}\left[\int_0^\infty \frac{2x\,dt}{\pi(t^2+x^2)}\right] = \frac{1}{2x}.$$

The term in the bracket is the integral over $(0,\infty)$ of twice a Cauchy density with location parameter $0$ and scale parameter $x$, and hence equals $1$. Thus, with $x = (n+1-i)/2$, the last term of (2.9) is less than or equal to $1/(n+1-i)$, and

$$\frac{1}{n+1-i} \le \ln\frac{n+1-i}{2} - \psi\!\left(\frac{n+1-i}{2}\right) \le \frac{2}{n+1-i},$$

so that

$$\lim_{n\to\infty}\psi\!\left(\frac{n+1-i}{2}\right) = \lim_{n\to\infty}\left[\ln(n+1-i) - \ln 2\right].$$

Since $\ln(n-1) - \ln(n+1-i) \to 0$ for each fixed $i$, the limit of (2.8) is equal to zero.



    I EEE TRANSACTIONS ON INFORMATION THEORY, VOL. 35 , NO. 3, MAY 1989 691Lemma 3: Va r { H,(NORM)}, given by (2.6), converges to zero TABLE ITHEORETICALt )AND SIMULATEDs)MEANS ND VARIANCES

    OF H,(NORM) A N D H , (N O RM )"Size Type Mean Variance

    as n -+ 00.Proof: From (2.6) we have to show that lim+'(n) = 0. Con- Sample Estimatorsider the equality

    - n-*Thus lim +'(n) = 0.n + m

    Theorem 3: Both H,(NORM) and H,(NORM) converge inprobability to H(N0RM) as n tends to infinity.Proof: Since S is a consistent estimator of Z and the logfunction is continuous the statement of the theorem follows.

Now we consider the UMVUE for $H(\mathrm{OMEXP})$. Let $X_1, \ldots, X_n$ be a random sample from the multivariate exponential distribution given by (1.6), and let $Y_1, \ldots, Y_n$ be the corresponding random sample of order statistics.

Theorem 4: The uniformly minimum variance unbiased estimator of the parametric entropy of the joint density of the order statistics from the multivariate exponential distribution (1.6) (Weinman, [13]) is

$$H^*(\mathrm{OMEXP}) = \sum_{i=1}^p \ln\!\left(n\hat\theta_i\right) + p - p\,\psi(n) - \ln(p!),$$

where $\hat\theta_i$ is the sample mean estimator of $\theta_i$, with

$$\mathrm{var}\{H^*(\mathrm{OMEXP})\} = p\,\psi'(n).$$

Proof: Equation (1.8) for $H(\mathrm{OMEXP})$ involves the sum of the parametric entropies of the univariate exponential distributions with parameters $\theta_i$, $i = 1, \ldots, p$. The UMVUE for the entropy of the univariate exponential distribution with parameter $\theta_i$ is $\ln(n\hat\theta_i) + 1 - \psi(n)$, with variance $\psi'(n)$; the term $-\ln(p!)$ collects the scale factors $(p+1-i)$ appearing in (1.8). The result follows.
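The sketch below (ours, not from the correspondence; the sampling routine and variable names are assumptions made for illustration) computes $H^*(\mathrm{OMEXP})$ from simulated Weinman spacings for given $\theta_i$ and compares it with (1.8):

```python
import numpy as np
from scipy.special import digamma, polygamma
from math import factorial

rng = np.random.default_rng(2)
theta = np.array([1.0, 2.0, 4.0])      # example scale parameters (ours)
p, n = theta.size, 200

# Spacings Y_i - Y_{i-1} are independent exponentials with
# scale theta_i / (p + 1 - i), as stated before (1.8).
k = p + 1 - np.arange(1, p + 1)
spacings = rng.exponential(theta / k, size=(n, p))

# Recover theta-hat from the mean spacings, then apply Theorem 4.
theta_hat = spacings.mean(axis=0) * k
h_star = (np.log(n * theta_hat).sum() + p - p * digamma(n)
          - np.log(float(factorial(p))))
h_true = p + np.log(theta).sum() - np.log(float(factorial(p)))  # eq. (1.8)
print(h_star, h_true, p * polygamma(1, n))  # estimate, target, its variance
```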

III. ASYMPTOTIC DISTRIBUTION OF $H^*(\mathrm{NORM})$

Note that from (2.1), (2.2), and the fact that $\ln|S| = \ln|V| - p\ln(n-1)$, the random variable in $H^*(\mathrm{NORM})$ is distributed as a linear combination of the variables $\ln[\chi^2_{(n+1-i)}]$, $i = 1, \ldots, p$. The distribution of $\ln[\chi^2_{(n+1-i)}]$ is well approximated by the normal distribution with appropriate mean and variance (Olshen, [9]). Hence $H^*(\mathrm{NORM})$ is also well approximated by the normal distribution with mean $H(\mathrm{NORM})$, given by (1.1), and variance (2.6).

To study the convergence properties of $H_c(\mathrm{NORM})$ and $H^*(\mathrm{NORM})$, we generated 1000 samples of different sizes from bivariate normal distributions with different covariance matrices $\Sigma$. For each sample the estimates $H_c(\mathrm{NORM})$ and $H^*(\mathrm{NORM})$ were calculated. Their means and variances for different sample sizes are given in Tables I-IV. From the tables it is seen that $H^*(\mathrm{NORM})$ converges slightly faster to the true value than $H_c(\mathrm{NORM})$. The simulated variance of $H^*(\mathrm{NORM})$ is slightly smaller than that of $H_c(\mathrm{NORM})$, as expected. This convergence behavior does not seem to be affected much by the increasing correlation between the two variables.

TABLE I
THEORETICAL (t) AND SIMULATED (s) MEANS AND VARIANCES OF H_c(NORM) AND H*(NORM)

Sample size   Type   H_c mean   H* mean   H_c variance   H* variance
10            t      2.78132    2.83790   0.1175100      0.1175100
10            s      2.53974    2.59632   0.1375068      0.1375051
30            t      2.82056    2.83790   0.0350800      0.0350800
30            s      2.73749    2.75847   0.0469360      0.0469300
50            t      2.82766    2.83790   0.0206200      0.0206200
50            s      2.79402    2.80426   0.0207231      0.0207220
100           t      2.83284    2.83790   0.0101500      0.0101500
100           s      2.80969    2.81475   0.0101093      0.0101048

(Sigma = I; theoretical entropy: 2.8379.)

TABLE II
THEORETICAL (t) AND SIMULATED (s) MEANS AND VARIANCES OF H_c(NORM) AND H*(NORM)

Sample size   Type   H_c mean   H* mean   H_c variance   H* variance
10            t      2.73412    2.79070   0.1175100      0.1175100
10            s      2.50423    2.56082   0.1316368      0.1316243
30            t      2.77336    2.79070   0.0350800      0.0350800
30            s      2.70396    2.72130   0.0331871      0.0331804
50            t      2.78046    2.79070   0.0206200      0.0206200
50            s      2.73135    2.74159   0.0241806      0.0230754
100           t      2.78564    2.79070   0.0101500      0.0101500
100           s      2.76253    2.76760   0.0112372      0.0109676

(Sigma = [[1, 0.3], [0.3, 1]]; theoretical entropy: 2.7907.)

TABLE III
THEORETICAL (t) AND SIMULATED (s) MEANS AND VARIANCES OF H_c(NORM) AND H*(NORM)

Sample size   Type   H_c mean   H* mean   H_c variance   H* variance
10            t      2.55852    2.61510   0.1175100      0.1175100
10            s      2.32824    2.38483   0.1316455      0.1316268
30            t      2.59776    2.61510   0.0350800      0.0350800
30            s      2.52798    2.54532   0.0331916      0.0331788
50            t      2.60486    2.61510   0.0206200      0.0206200
50            s      2.55536    2.56500   0.0206548      0.0206537
100           t      2.61004    2.61510   0.0101500      0.0101500
100           s      2.58655    2.59161   0.0104487      0.0104384

(Theoretical entropy: 2.6151.)



TABLE IV
THEORETICAL (t) AND SIMULATED (s) MEANS AND VARIANCES OF H_c(NORM) AND H*(NORM)

Sample size   Type   H_c mean   H* mean   H_c variance   H* variance
10            t      1.95132    2.00790   0.1175100      0.1175100
10            s      1.72102    1.77761   0.1316378      0.1316303
30            t      1.99056    2.00790   0.0350800      0.0350800
30            s      1.92075    1.93810   0.0331787      0.0331827
50            t      1.99766    2.00790   0.0206200      0.0206200
50            s      1.94814    1.95838   0.0206687      0.0206672
100           t      2.00284    2.00790   0.0101500      0.0101500
100           s      1.97933    1.98438   0.0104350      0.0104406

(Theoretical entropy: 2.0079.)
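As a cross-check on Tables I-IV, the following driver (ours, not from the correspondence) repeats the Monte Carlo experiment for one configuration, reusing entropy_estimators from the sketch after Theorem 2; since the exact conventions of the original simulation (for example, whether samples were mean-centered) are not fully specified, small discrepancies against the 's' rows are to be expected:

```python
import numpy as np

# One block of Table I: Sigma = I (p = 2), n = 30, 1000 simulated samples;
# report simulated mean and variance of both estimators.
rng = np.random.default_rng(3)
sigma, n, reps = np.eye(2), 30, 1000
h_c_vals, h_star_vals = [], []
for _ in range(reps):
    x = rng.multivariate_normal(np.zeros(2), sigma, size=n)
    h_c, h_star, _ = entropy_estimators(x)   # defined in the earlier sketch
    h_c_vals.append(h_c)
    h_star_vals.append(h_star)
print(np.mean(h_c_vals), np.var(h_c_vals))        # compare with 's' rows
print(np.mean(h_star_vals), np.var(h_star_vals))  # of Table I at n = 30
```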

REFERENCES

[1] N. A. Ahmed, "Goodness-of-fit tests for testing multivariate families of distributions," Ph.D. dissertation, University of California, Riverside, 1987.
[2] B. C. Arnold, Pareto Distributions. Burtonsville, MD: International Co-operative Publishing House, 1985.
[3] E. J. Dudewicz and E. C. Van der Meulen, "Entropy based statistical inference, I: Testing hypotheses on continuous probability densities, with reference to uniformity," Mededelingen Uit Het Wiskundig Instituut, Katholieke Universiteit, Leuven, Belgium, 120, 1979.
[4] D. V. Gokhale, "On entropy-based goodness-of-fit tests," Comput. Statist. Data Anal., vol. 1, pp. 157-165, 1983.
[5] F. A. Graybill, A. M. Mood, and D. C. Boes, Introduction to the Theory of Statistics (Series in Probability and Statistics). New York: McGraw-Hill, 1974.
[6] N. L. Johnson and S. Kotz, Distributions in Statistics: Continuous Multivariate Distributions. New York: Wiley, 1972.
[7] C. G. Verdugo Lazo and P. N. Rathie, "On the entropy of continuous probability distributions," IEEE Trans. Inform. Theory, vol. IT-24, no. 1, 1978.
[8] W. Magnus, F. Oberhettinger, and R. P. Soni, Formulas and Theorems for the Special Functions of Mathematical Physics. New York: Springer-Verlag, 1966.
[9] A. C. Olshen, "Transformations of the Pearson Type III distributions," Ann. Math. Statist., vol. 8, pp. 176-200, 1937.
[10] S. J. Press, Applied Multivariate Analysis. Malabar, FL: Krieger, 1982.
[11] C. R. Rao, Linear Statistical Inference and Its Applications. New York: Wiley, 1976.
[12] O. Vasicek, "A test for normality based on sample entropy," J. Roy. Statist. Soc., Ser. B, vol. 36, pp. 54-59, 1976.
[13] D. G. Weinman, "A multivariate extension of the exponential distribution," Ph.D. dissertation, Arizona State University, Tempe, 1966.
[14] R. A. Wijsman, "Random orthogonal transformations and their use in some classical distribution problems in multivariate analysis," Ann. Math. Statist., vol. 28, no. 2, 1957.

The Power Spectral Density of Maximum Entropy Charge Constrained Sequences

KENNETH J. KERPEZ, MEMBER, IEEE

Abstract: A limit on the absolute value of the running digital sum of a sequence is known as the charge constraint. Such a limit imposes a spectral null at dc. The maximum entropy distribution of a charge constrained sequence is presented. A closed-form expression for the power spectral density of maximum entropy charge constrained sequences is given and plotted.

Manuscript received June 13, 1988; revised July 14, 1988. This work was supported in part by the National Science Foundation under Grant ECS-8352220, and in part by IBM, CDC, and AT&T. The author is with the Department of Electrical Engineering, Cornell University, Ithaca, NY 14853. IEEE Log Number 8928198.

I. INTRODUCTION

A binary magnetic recording system is considered. Due to transformer coupling, the magnetic recorder cannot pass dc. To avoid errors, the message sequence is mapped by an invertible encoder into a sequence $\{x_n\}$ that has no dc component and can safely pass through the magnetic recorder. Pierobon [2] showed that the charge constraint is a necessary and sufficient condition for a null in the spectrum of a binary sequence at dc. The charge constraint maintains a bound on the absolute value of the running sum of $\{x_n\}$.

For every $m$ input bits, the encoder outputs $n$ bits. The code rate $m/n$ is maximized to pass the largest amount of information possible. The maximization occurs when $\{x_n\}$ has the maximum entropy probability distribution; the rate is then the capacity [1]. The maximum entropy distribution of a charge constrained sequence is presented in Section II. A closed-form solution is given in Section III for the power spectral density (spectrum) of the associated state sequence, from which it is easy to find the spectrum of $\{x_n\}$. The spectrum shows the relation between the charge constraint and the shape of the null at dc. As the constraint becomes looser, the capacity increases and the null becomes less pronounced.

Let $\epsilon_n \in \{+1, -1\}$. Here $\epsilon_n$ is the same as $A_n$, given by Gallopoulos et al. [8], the charge associated to a $\{0,1\}$ run-length sequence. Define the state as the accumulated charge $S_n = \sum_{i=0}^{n}\epsilon_i$. Note that

$$\epsilon_n = S_n - S_{n-1}.$$

The charge constraint $C$ makes $|S_n| \le C$ for all $n$. This state sequence has the $(2C+1) \times (2C+1)$ adjacency matrix $A$,

$$A = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 1 & 0 & 1 & \cdots & 0 \\ 0 & 1 & 0 & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 1 \\ 0 & 0 & \cdots & 1 & 0 \end{pmatrix},$$

the tridiagonal 0-1 matrix with ones on the sub- and superdiagonal, since the accumulated charge can only move up or down by one at each step.

II. THE MAXIMUM ENTROPY DISTRIBUTION

The following theorem of Shannon's about general constrained sequences is employed to find the maximum entropy distribution of charge constrained sequences. Many results in this section can be found in [1], [3], [5], and [7].

Theorem 1 [3]: Given a constrained sequence with an irreducible state transition diagram and an $n \times n$ adjacency matrix $A$ with entries $a_{ij}$, let the maximum eigenvalue of $A$ be $\lambda$. Find a left eigenvector $w^T$ and a right eigenvector $u$ of $A$ associated with $\lambda$ such that $w^T u = 1$. Then the capacity of the sequence is $\log_2(\lambda)$ and the maximum entropy distribution is Markov with transition probability matrix $P$. The entries of $P$ are $p_{ij} = a_{ij}u_j/(\lambda u_i)$, and the stationary distribution of $P$ is $\pi$ with $\pi_i = w_i u_i$, $i, j = 1, 2, \ldots, n$.

For the charge constraint, the symmetry of the matrix $A$ implies that $w = u$. The maximum eigenvalue of $A$, $\lambda$, was found by Chien [1] to be

$$\lambda = 2\cos(\gamma) \qquad \text{with } \gamma = \frac{\pi}{2(C+1)}.$$

Thus the capacity of the charge constrained sequence is given by $\log_2\!\left(2\cos\!\left(\pi/(2(C+1))\right)\right)$.
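To make Theorem 1 concrete, the sketch below (ours, not from the correspondence; all identifiers are illustrative) builds the adjacency matrix for a charge constraint $C$, computes the capacity, the maximum entropy transition matrix $P$, and the stationary distribution, and checks the capacity against Chien's closed form:

```python
import numpy as np

def max_entropy_chain(C):
    """Maximum entropy Markov chain for charge constraint C (Theorem 1)."""
    m = 2 * C + 1
    A = np.zeros((m, m))
    idx = np.arange(m - 1)
    A[idx, idx + 1] = 1          # charge goes up by one
    A[idx + 1, idx] = 1          # charge goes down by one
    evals, evecs = np.linalg.eigh(A)          # A is symmetric, so w = u
    lam = evals[-1]                           # maximum eigenvalue
    u = np.abs(evecs[:, -1])                  # Perron eigenvector, positive
    P = A * u[None, :] / (lam * u[:, None])   # p_ij = a_ij u_j / (lam u_i)
    pi = u * u / (u @ u)                      # stationary law, w'u normalized
    return lam, P, pi

C = 2
lam, P, pi = max_entropy_chain(C)
print(np.log2(lam))                               # capacity from Theorem 1
print(np.log2(2 * np.cos(np.pi / (2 * (C + 1))))) # Chien's closed form
print(P.sum(axis=1))                              # rows of P sum to 1
print(pi @ P - pi)                                # pi is stationary (~0)
```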