Statistical inference for Renyi entropy
David Kallberg
Department of Mathematics and Mathematical Statistics
Umea University
David Kallberg Statistical inference for Renyi entropy
Coauthor
Oleg Seleznjev
Department of Mathematics and Mathematical Statistics
Umea University
Outline
Introduction
Measures of uncertainty
U-statistics
Estimation of entropy
Numerical experiment
Conclusion
Introduction
A system described by probability distribution P
Only partial information about P is available, e.g. the mean
A measure of uncertainty (entropy) in P
What P should we use, if any?
Why entropy?
The entropy maximization principle: choose the P that satisfies the given constraints and has maximum uncertainty.
Objectivity: we don’t use more information than we have.
Measures of uncertainty. The Shannon entropy.
Discrete P = {p(k), k ∈ D}:
h1(P) := −∑_k p(k) log p(k)
Continuous P with density p(x), x ∈ R^d:
h1(P) := −∫_{R^d} log(p(x)) p(x) dx
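The discrete formula is easy to check numerically; a small illustrative sketch (not from the slides, the helper name is mine):

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy h1(P) = -sum_k p(k) log p(k) of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # use the convention 0 * log 0 = 0
    return -np.sum(p * np.log(p))

# A fair coin is the most uncertain two-point distribution: h1 = log 2.
h_fair = shannon_entropy([0.5, 0.5])
h_skewed = shannon_entropy([0.9, 0.1])
```

A skewed coin has strictly smaller entropy than a fair one, matching the intuition that it is less uncertain.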
Measures of uncertainty. The Renyi entropy.
A class of entropies of order s ≠ 1, given by:
Discrete P = {p(k), k ∈ D}:
hs(P) := 1/(1−s) · log(∑_k p(k)^s)
Continuous P with density p(x), x ∈ R^d:
hs(P) := 1/(1−s) · log(∫_{R^d} p(x)^s dx)
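A sketch of the discrete case (helper name is mine). For the uniform distribution on m points the Renyi entropy equals log m for every order s, which makes a convenient sanity check:

```python
import numpy as np

def renyi_entropy(p, s):
    """Renyi entropy hs(P) = log(sum_k p(k)^s) / (1 - s) for discrete P, s != 1."""
    p = np.asarray(p, dtype=float)
    return np.log(np.sum(p ** s)) / (1.0 - s)

# Uniform on m points: sum_k p(k)^s = m^(1 - s), so hs = log m for any s.
m = 8
uniform = [1.0 / m] * m
h2 = renyi_entropy(uniform, 2)
h5 = renyi_entropy(uniform, 5)
```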
Motivation
The Renyi entropy satisfies axioms on how a measure of uncertainty should behave, Renyi (1970).
For both discrete and continuous P, the Renyi entropy is a generalization of the Shannon entropy, because
lim_{s→1} hs(P) = h1(P)
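This limit can be seen numerically: evaluating the Renyi formula at an order just above 1 essentially reproduces the Shannon value (an illustrative check, not from the slides):

```python
import numpy as np

def renyi_entropy(p, s):
    # discrete Renyi entropy of order s != 1
    return np.log(np.sum(np.asarray(p) ** s)) / (1.0 - s)

p = [0.5, 0.3, 0.2]
h1 = -sum(pk * np.log(pk) for pk in p)  # Shannon entropy
h_near_1 = renyi_entropy(p, 1.0001)     # Renyi entropy of order close to 1
```

The gap between the two values shrinks linearly as the order approaches 1.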
Problem
Non-parametric estimation of integer-order Renyi entropy, for discrete and continuous P, from a sample {X1, . . . , Xn} of P-iid observations.
Overview of Renyi entropy estimation, continuous P
Consistency of nearest-neighbor estimators for any s, Leonenko et al. (2008)
Consistency and asymptotic normality for the quadratic case s = 2, Leonenko and Seleznjev (2010)
U-statistics: basic setup
For a P-iid sample {X1, . . . , Xn} and a symmetric kernel function h(x1, . . . , xm) with
Eh(X1, . . . , Xm) = θ(P),
the U-statistic estimator of θ is defined as
Un = Un(h) := (n choose m)^{−1} ∑_{1≤i1<...<im≤n} h(Xi1, . . . , Xim)
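A minimal generic implementation (all names mine): with the kernel h(x, y) = (x − y)²/2, whose expectation is Var(X), the U-statistic reproduces the familiar unbiased sample variance:

```python
from itertools import combinations
from math import comb
from statistics import variance

def u_statistic(sample, kernel, m):
    """Average a symmetric kernel over all (n choose m) size-m subsets."""
    total = sum(kernel(*subset) for subset in combinations(sample, m))
    return total / comb(len(sample), m)

sample = [1.0, 2.0, 4.0, 7.0]
var_u = u_statistic(sample, lambda x, y: (x - y) ** 2 / 2, 2)
```

For this sample, var_u coincides with the unbiased sample variance (7.0).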
U-statistics: properties
Symmetric, unbiased
Optimality properties for a large class of P
Asymptotically normally distributed
Estimation, continuous case
The method relies on estimating the functional
qs := ∫_{R^d} p(x)^s dx = E(p(X)^{s−1})
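The identity qs = E(p(X)^{s−1}) can be checked by plain Monte Carlo. A sketch for s = 2 and P = N(0, 1), where q2 = ∫ p(x)² dx = 1/(2√π) ≈ 0.2821 (the sample size and seed are arbitrary choices of mine):

```python
import numpy as np

# Estimate q2 = E p(X) by averaging the density over a standard normal sample.
rng = np.random.default_rng(1)
x = rng.standard_normal(200_000)
p_at_x = np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)  # density evaluated at the sample
q2_mc = p_at_x.mean()
q2_exact = 1.0 / (2.0 * np.sqrt(np.pi))
```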
Estimation, some notation
d(x, y): the Euclidean distance in R^d
Bε(x) := {y : d(x, y) ≤ ε}: the ball of radius ε with center x
bε: the volume of Bε(x)
pε(x) := P(X ∈ Bε(x)): the ε-ball probability at x
Estimation, useful limit
When p(x) is bounded and continuous, we can rewrite
qs = lim_{ε→0} E(pε(X)^{s−1}) / bε^{s−1}
So an unbiased estimate of qs,ε := E(pε(X)^{s−1}) leads to an asymptotically unbiased estimate of qs as ε → 0.
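For the standard normal this limit has a closed form: X − Y ~ N(0, 2) gives q2,ε = P(|X − Y| ≤ ε) = erf(ε/2), so the scaled ball probability visibly converges to q2 = 1/(2√π) (an illustrative check, not from the slides):

```python
from math import erf, sqrt, pi

# q_{2,eps} / b_eps = erf(eps / 2) / (2 * eps) -> q2 = 1 / (2 sqrt(pi)) as eps -> 0
q2 = 1.0 / (2.0 * sqrt(pi))
ratios = [erf(eps / 2.0) / (2.0 * eps) for eps in (1.0, 0.1, 0.01)]
```

The approximation error shrinks like ε², in line with the smoothness of the normal density.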
Estimation of qs,ε
For s = 2, 3, 4, . . ., let
Iij(ε) := I(d(Xi, Xj) ≤ ε)
Ii(ε) := ∏_{1≤j≤s, j≠i} Iij(ε)
Define the U-statistic Qs,n for qs,ε by the kernel
hs(x1, . . . , xs) := (1/s) ∑_{i=1}^{s} Ii(ε)
Estimation of Renyi entropy
Denote by Q̃s,n := Qs,n / bε^{s−1} an estimator of qs, and by
Hs,n := 1/(1−s) · log(max(Q̃s,n, 1/n)) the corresponding estimator of hs
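A minimal 1-d sketch of this estimator for s = 2 (so bε = 2ε and 1/(1−s) = −1); here Q2,n is simply the fraction of sample pairs within ε of each other, and the value of ε below is a purely illustrative choice of mine:

```python
import numpy as np

def renyi2_estimate(x, eps):
    """H_{2,n} for a 1-d sample: the U-statistic Q_{2,n} counts pairs within eps,
    b_eps = 2 * eps, and H_{2,n} = -log(max(Q_{2,n} / b_eps, 1/n))."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    dist = np.abs(x[:, None] - x[None, :])
    i, j = np.triu_indices(n, k=1)
    q2n = (dist[i, j] <= eps).mean()          # U-statistic for q_{2,eps}
    q2_scaled = q2n / (2.0 * eps)             # divide by b_eps^(s-1)
    return -np.log(max(q2_scaled, 1.0 / n))   # 1 / (1 - s) = -1 for s = 2

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=2000)
h2_hat = renyi2_estimate(x, eps=0.05)  # Uniform(0, 1) has h2 = 0
```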
Consistency
Assume ε = ε(n) → 0 as n → ∞
Let v²s,n := Var(Q̃s,n)
Then v²s,n → 0 when nε^d → a ∈ (0, ∞], so we get:
Theorem
Let nε^d → a, 0 < a ≤ ∞, and let p(x) be bounded and continuous. Then Hs,n is a consistent estimator of hs.
Smoothness conditions
Denote by Hα(K), 0 < α ≤ 2, K > 0, the linear space of continuous functions in R^d that satisfy an α-Hölder condition if 0 < α ≤ 1, or, if 1 < α ≤ 2, have continuous partial derivatives satisfying an (α−1)-Hölder condition with constant K.
Asymptotic normality
When nε^d → ∞, we have v²s,n ∼ s(q2s−1 − qs²)/n
Let Ks,n := max(s(Q̃2s−1,n − Q̃s,n²), 1/n)
L(n) > 0, n ≥ 1, is a slowly varying function as n → ∞
Theorem
Let p(x)^{s−1} ∈ Hα(K) for α > d/2. If ε ∼ L(n) n^{−1/d} and nε^d → ∞, then
(√n Q̃s,n (1−s) / √Ks,n) · (Hs,n − hs) →D N(0, 1)
Numerical experiment
χ² distribution with 4 degrees of freedom: h3 = −(1/2) log(q3), where q3 = 1/54
300 simulations, each of sample size n = 500
The quantile plot and histogram support standard normality
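A sketch of one replication of this experiment (d = 1, s = 3; the value ε = 0.4 is my own illustrative choice, not necessarily the one used for the slides). The identity ∑_triples h3 = (1/3) ∑_i C(deg_i, 2) avoids looping over all C(n, 3) triples:

```python
import numpy as np
from math import comb, log

def q3_u_statistic(x, eps):
    """Q_{3,n}: a triple contributes for center i when both other points lie
    within eps of X_i, so summing C(deg_i, 2) over all centers i counts every
    (triple, center) pair; dividing by 3 * C(n, 3) averages the kernel h3."""
    n = len(x)
    close = np.abs(x[:, None] - x[None, :]) <= eps
    np.fill_diagonal(close, False)
    deg = close.sum(axis=1)  # number of sample points within eps of each point
    centered_pairs = int((deg * (deg - 1) // 2).sum())
    return centered_pairs / (3 * comb(n, 3))

rng = np.random.default_rng(0)
n, eps = 500, 0.4
x = rng.chisquare(4, size=n)
q3_hat = q3_u_statistic(x, eps) / (2.0 * eps) ** 2  # divide by b_eps^(s-1), d = 1
h3_hat = -0.5 * log(max(q3_hat, 1.0 / n))           # H_{3,n}
h3_true = 0.5 * log(54.0)                           # h3 = -(1/2) log(1/54) ≈ 1.99
```

Repeating this 300 times and normalizing as in the theorem produces the Q-Q plot and histogram on the following slides.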
Figures
[Normal Q–Q plot of the normalized estimates: theoretical quantiles vs. sample quantiles]
Figures
[Histogram of the normalized estimates for the χ² sample, with the standard normal density overlaid]
Conclusion
Asymptotically normal estimates are possible for integer-order Renyi entropy.
References
Leonenko, N., Pronzato, L., Savani, V. (2008). A class of Renyi information estimators for multidimensional densities. Annals of Statistics 36, 2153–2182.
Leonenko, N. and Seleznjev, O. (2009). Statistical inference for ε-entropy and quadratic Renyi entropy. Research Report, Dep. Math. and Math. Stat., Univ. Umea, 1–21; J. Multivariate Analysis (submitted).
Renyi, A. (1970). Probability Theory. North-Holland Publishing Company.