Statistical inference for Renyi entropy
David Kallberg
Department of Mathematics and Mathematical Statistics
Umea University
David Kallberg Statistical inference for Renyi entropy
Coauthor
Oleg Seleznjev
Department of Mathematics and Mathematical Statistics
Umea University
Outline
Introduction
Measures of uncertainty
U-statistics
Estimation of entropy
Numerical experiment
Conclusion
Introduction
A system described by probability distribution P
Only partial information about P is available, e.g. the mean
A measure of uncertainty (entropy) in P
What P should we use, if any?
Why entropy?
The entropy maximization principle: choose the P that satisfies the given constraints and has maximum uncertainty.
Objectivity: we don’t use more information than we have.
Measures of uncertainty. The Shannon entropy.
Discrete P = {p(k), k ∈ D}:
h1(P) := −∑_k p(k) log p(k)
Continuous P with density p(x), x ∈ R^d:
h1(P) := −∫_{R^d} log(p(x)) p(x) dx
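The discrete formula is easy to check numerically; a small illustrative sketch (not from the slides, the helper name is mine):

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy h1(P) = -sum_k p(k) log p(k) of a discrete distribution."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # use the convention 0 * log 0 = 0
    return -np.sum(p * np.log(p))

# A fair coin is the most uncertain two-point distribution: h1 = log 2.
h_fair = shannon_entropy([0.5, 0.5])
h_skewed = shannon_entropy([0.9, 0.1])
```

A skewed coin has strictly smaller entropy than a fair one, matching the intuition that it is less uncertain.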
Measures of uncertainty. The Renyi entropy.
A class of entropies of order s ≠ 1, given by:
Discrete P = {p(k), k ∈ D}:
hs(P) := 1/(1−s) · log(∑_k p(k)^s)
Continuous P with density p(x), x ∈ R^d:
hs(P) := 1/(1−s) · log(∫_{R^d} p(x)^s dx)
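A sketch of the discrete case (helper name is mine). For the uniform distribution on m points the Renyi entropy equals log m for every order s, which makes a convenient sanity check:

```python
import numpy as np

def renyi_entropy(p, s):
    """Renyi entropy hs(P) = log(sum_k p(k)^s) / (1 - s) for discrete P, s != 1."""
    p = np.asarray(p, dtype=float)
    return np.log(np.sum(p ** s)) / (1.0 - s)

# Uniform on m points: sum_k p(k)^s = m^(1 - s), so hs = log m for any s.
m = 8
uniform = [1.0 / m] * m
h2 = renyi_entropy(uniform, 2)
h5 = renyi_entropy(uniform, 5)
```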
Motivation
The Renyi entropy satisfies axioms on how a measure of uncertainty should behave, Renyi (1970).
For both discrete and continuous P, the Renyi entropy is a generalization of the Shannon entropy, because
lim_{s→1} hs(P) = h1(P)
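This limit can be seen numerically: evaluating the Renyi formula at an order just above 1 essentially reproduces the Shannon value (an illustrative check, not from the slides):

```python
import numpy as np

def renyi_entropy(p, s):
    # discrete Renyi entropy of order s != 1
    return np.log(np.sum(np.asarray(p) ** s)) / (1.0 - s)

p = [0.5, 0.3, 0.2]
h1 = -sum(pk * np.log(pk) for pk in p)  # Shannon entropy
h_near_1 = renyi_entropy(p, 1.0001)     # Renyi entropy of order close to 1
```

The gap between the two values shrinks linearly as the order approaches 1.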
Problem
Non-parametric estimation of integer-order Renyi entropy, for discrete and continuous P, from a sample {X1, . . . , Xn} of P-iid observations.
Overview of Renyi entropy estimation, continuous P
Consistency of nearest-neighbor estimators for any s, Leonenko et al. (2008)
Consistency and asymptotic normality for the quadratic case s = 2, Leonenko and Seleznjev (2010)
U-statistics: basic setup
For a P-iid sample {X1, . . . , Xn} and a symmetric kernel function h(x1, . . . , xm) with
Eh(X1, . . . , Xm) = θ(P),
the U-statistic estimator of θ is defined as
Un = Un(h) := (n choose m)^{−1} ∑_{1≤i1<...<im≤n} h(Xi1, . . . , Xim)
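A minimal generic implementation (all names mine): with the kernel h(x, y) = (x − y)²/2, whose expectation is Var(X), the U-statistic reproduces the familiar unbiased sample variance:

```python
from itertools import combinations
from math import comb
from statistics import variance

def u_statistic(sample, kernel, m):
    """Average a symmetric kernel over all (n choose m) size-m subsets."""
    total = sum(kernel(*subset) for subset in combinations(sample, m))
    return total / comb(len(sample), m)

sample = [1.0, 2.0, 4.0, 7.0]
var_u = u_statistic(sample, lambda x, y: (x - y) ** 2 / 2, 2)
```

For this sample, var_u coincides with the unbiased sample variance (7.0).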
U-statistics: properties
Symmetric, unbiased
Optimality properties for a large class of P
Asymptotically normally distributed
Estimation, continuous case
The method relies on estimating the functional
qs := ∫_{R^d} p(x)^s dx = E(p(X)^{s−1})
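The identity qs = E(p(X)^{s−1}) can be checked by plain Monte Carlo. A sketch for s = 2 and P = N(0, 1), where q2 = ∫ p(x)² dx = 1/(2√π) ≈ 0.2821 (the sample size and seed are arbitrary choices of mine):

```python
import numpy as np

# Estimate q2 = E p(X) by averaging the density over a standard normal sample.
rng = np.random.default_rng(1)
x = rng.standard_normal(200_000)
p_at_x = np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)  # density evaluated at the sample
q2_mc = p_at_x.mean()
q2_exact = 1.0 / (2.0 * np.sqrt(np.pi))
```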
Estimation, some notation
d(x, y): the Euclidean distance in R^d
Bε(x) := {y : d(x, y) ≤ ε}: the ball of radius ε with center x
bε: the volume of Bε(x)
pε(x) := P(X ∈ Bε(x)): the ε-ball probability at x
Estimation, useful limit
When p(x) is bounded and continuous, we can rewrite
qs = lim_{ε→0} E(pε(X)^{s−1}) / bε^{s−1}
So an unbiased estimate of qs,ε := E(pε(X)^{s−1}) leads to an asymptotically unbiased estimate of qs as ε → 0.
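For the standard normal this limit has a closed form: X − Y ~ N(0, 2) gives q2,ε = P(|X − Y| ≤ ε) = erf(ε/2), so the scaled ball probability visibly converges to q2 = 1/(2√π) (an illustrative check, not from the slides):

```python
from math import erf, sqrt, pi

# q_{2,eps} / b_eps = erf(eps / 2) / (2 * eps) -> q2 = 1 / (2 sqrt(pi)) as eps -> 0
q2 = 1.0 / (2.0 * sqrt(pi))
ratios = [erf(eps / 2.0) / (2.0 * eps) for eps in (1.0, 0.1, 0.01)]
```

The approximation error shrinks like ε², in line with the smoothness of the normal density.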
Estimation of qs,ε
For s = 2, 3, 4, . . ., let
Iij(ε) := I(d(Xi, Xj) ≤ ε)
Ii(ε) := ∏_{1≤j≤s, j≠i} Iij(ε)
Define the U-statistic Qs,n for qs,ε by the kernel
hs(x1, . . . , xs) := (1/s) ∑_{i=1}^{s} Ii(ε)
Estimation of Renyi entropy
Denote by Q̃s,n := Qs,n / bε^{s−1} an estimator of qs, and by
Hs,n := 1/(1−s) · log(max(Q̃s,n, 1/n)) the corresponding estimator of hs
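A minimal 1-d sketch of this estimator for s = 2 (so bε = 2ε and 1/(1−s) = −1); here Q2,n is simply the fraction of sample pairs within ε of each other, and the value of ε below is a purely illustrative choice of mine:

```python
import numpy as np

def renyi2_estimate(x, eps):
    """H_{2,n} for a 1-d sample: the U-statistic Q_{2,n} counts pairs within eps,
    b_eps = 2 * eps, and H_{2,n} = -log(max(Q_{2,n} / b_eps, 1/n))."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    dist = np.abs(x[:, None] - x[None, :])
    i, j = np.triu_indices(n, k=1)
    q2n = (dist[i, j] <= eps).mean()          # U-statistic for q_{2,eps}
    q2_scaled = q2n / (2.0 * eps)             # divide by b_eps^(s-1)
    return -np.log(max(q2_scaled, 1.0 / n))   # 1 / (1 - s) = -1 for s = 2

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=2000)
h2_hat = renyi2_estimate(x, eps=0.05)  # Uniform(0, 1) has h2 = 0
```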
Consistency
Assume ε = ε(n) → 0 as n → ∞
Let v²s,n := Var(Q̃s,n)
Then v²s,n → 0 when nε^d → a ∈ (0, ∞], so we get:
Theorem
Let nε^d → a, 0 < a ≤ ∞, and let p(x) be bounded and continuous. Then Hs,n is a consistent estimator of hs.
Smoothness conditions
Denote by Hα(K), 0 < α ≤ 2, K > 0, the linear space of continuous functions in R^d that satisfy an α-Hölder condition if 0 < α ≤ 1, or, if 1 < α ≤ 2, have continuous partial derivatives satisfying an (α−1)-Hölder condition with constant K.
Asymptotic normality
When nε^d → ∞, we have v²s,n ∼ s(q2s−1 − qs²)/n
Let Ks,n := max(s(Q̃2s−1,n − Q̃s,n²), 1/n)
L(n) > 0, n ≥ 1, is a slowly varying function as n → ∞
Theorem
Let p(x)^{s−1} ∈ Hα(K) for α > d/2. If ε ∼ L(n) n^{−1/d} and nε^d → ∞, then
(√n Q̃s,n (1−s) / √Ks,n) · (Hs,n − hs) →D N(0, 1)
Numerical experiment
χ² distribution with 4 degrees of freedom: h3 = −(1/2) log(q3), where q3 = 1/54
300 simulations, each of sample size n = 500
The quantile plot and histogram support standard normality
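A sketch of one replication of this experiment (d = 1, s = 3; the value ε = 0.4 is my own illustrative choice, not necessarily the one used for the slides). The identity ∑_triples h3 = (1/3) ∑_i C(deg_i, 2) avoids looping over all C(n, 3) triples:

```python
import numpy as np
from math import comb, log

def q3_u_statistic(x, eps):
    """Q_{3,n}: a triple contributes for center i when both other points lie
    within eps of X_i, so summing C(deg_i, 2) over all centers i counts every
    (triple, center) pair; dividing by 3 * C(n, 3) averages the kernel h3."""
    n = len(x)
    close = np.abs(x[:, None] - x[None, :]) <= eps
    np.fill_diagonal(close, False)
    deg = close.sum(axis=1)  # number of sample points within eps of each point
    centered_pairs = int((deg * (deg - 1) // 2).sum())
    return centered_pairs / (3 * comb(n, 3))

rng = np.random.default_rng(0)
n, eps = 500, 0.4
x = rng.chisquare(4, size=n)
q3_hat = q3_u_statistic(x, eps) / (2.0 * eps) ** 2  # divide by b_eps^(s-1), d = 1
h3_hat = -0.5 * log(max(q3_hat, 1.0 / n))           # H_{3,n}
h3_true = 0.5 * log(54.0)                           # h3 = -(1/2) log(1/54) ≈ 1.99
```

Repeating this 300 times and normalizing as in the theorem produces the Q-Q plot and histogram on the following slides.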
Figures
[Normal Q–Q plot of the normalized estimates: theoretical quantiles vs. sample quantiles]
Figures
[Histogram of the normalized estimates for the χ² sample, with the standard normal density overlaid]
Conclusion
Asymptotically normal estimates are possible for integer-order Renyi entropy.
References
Leonenko, N., Pronzato, L., Savani, V. (2008). A class of Renyi information estimators for multidimensional densities. Annals of Statistics 36, 2153–2182.
Leonenko, N. and Seleznjev, O. (2009). Statistical inference for ε-entropy and quadratic Renyi entropy. Research Report, Dep. Math. and Math. Stat., Univ. Umea, 1–21; J. Multivariate Analysis (submitted).
Renyi, A. (1970). Probability Theory. North-Holland Publishing Company.