Statistical inference for Renyi entropy of integer order
David Kallberg
August 23, 2010
David Kallberg Statistical inference for Renyi entropy of integer order
Outline
Introduction
Measures of uncertainty
Estimation of entropy
Numerical experiment
More results
Conclusion
Coauthor:
Oleg Seleznjev
Department of Mathematics and Mathematical Statistics
Umeå University
Introduction
A system described by probability distribution P.
Only partial information about P is available, e.g. the covariance matrix.
A measure of uncertainty (entropy) in P.
Which P should we use, if any?
Why entropy?
The entropy maximization principle: choose P, satisfying given constraints, with maximum uncertainty.
Objectivity: we don’t use more information than we have.
Measures of uncertainty. The Shannon entropy.
Discrete $P = \{p(k), k \in D\}$:
$$h_1(P) := -\sum_{k} p(k) \log p(k)$$
Continuous $P$ with density $p(x)$, $x \in \mathbb{R}^d$:
$$h_1(P) := -\int_{\mathbb{R}^d} p(x) \log p(x)\, dx$$
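Measures of uncertainty. The Renyi entropy
A class of entropies. The entropy of order $s \ge 0$, $s \neq 1$, is given by:
Discrete $P = \{p(k), k \in D\}$:
$$h_s(P) := \frac{1}{1-s} \log\Bigl(\sum_{k} p(k)^s\Bigr)$$
Continuous $P$ with density $p(x)$, $x \in \mathbb{R}^d$:
$$h_s(P) := \frac{1}{1-s} \log\Bigl(\int_{\mathbb{R}^d} p(x)^s\, dx\Bigr)$$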
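As a small illustration of the discrete definitions above, the following sketch evaluates the Renyi entropy of order s and the Shannon entropy for a probability vector. The function names `shannon_entropy` and `renyi_entropy` are hypothetical (not from the talk), and natural logarithms are assumed.

```python
import math

def shannon_entropy(p):
    """h_1(P) = -sum_k p(k) log p(k); terms with p(k) = 0 contribute 0."""
    return -sum(pk * math.log(pk) for pk in p if pk > 0)

def renyi_entropy(p, s):
    """h_s(P) = log(sum_k p(k)^s) / (1 - s) for s >= 0, s != 1."""
    if s == 1:
        raise ValueError("s = 1 is the Shannon case; use shannon_entropy")
    return math.log(sum(pk ** s for pk in p if pk > 0)) / (1 - s)

uniform = [0.25] * 4                 # illustrative distributions, not from the talk
skewed = [0.7, 0.1, 0.1, 0.1]

# For a uniform distribution every order gives log(support size).
print(renyi_entropy(uniform, 2), shannon_entropy(uniform), math.log(4))
# For a non-uniform distribution the entropy decreases as the order grows.
print(shannon_entropy(skewed), renyi_entropy(skewed, 2), renyi_entropy(skewed, 3))
```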
Motivation
The Renyi entropy satisfies axioms on how a measure of uncertainty should behave, Renyi (1961, 1970).
For both discrete and continuous P, the Renyi entropy is a generalization of the Shannon entropy:
$$\lim_{s \to 1} h_s(P) = h_1(P)$$
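The limit can be checked with L'Hopital's rule. The sketch below treats the discrete case and assumes that differentiation under the sum is permitted (for example, finite D); the continuous case is analogous under suitable integrability conditions. Both numerator and denominator vanish at order 1 because the probabilities sum to one.

```latex
\lim_{s \to 1} h_s(P)
  = \lim_{s \to 1} \frac{\log \sum_k p(k)^s}{1 - s}
  = \lim_{s \to 1} \frac{\sum_k p(k)^s \log p(k) \,\big/\, \sum_k p(k)^s}{-1}
  = -\sum_k p(k) \log p(k)
  = h_1(P).
```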
Problem
Non-parametric estimation of integer order Renyi entropy, for discrete and continuous multivariate P, from a sample $\{X_1, \ldots, X_n\}$ of P-i.i.d. observations.
Estimators of entropy are widely used.
Distribution identification problems (Student-r distributions).
Average case analysis for random databases.
Clustering.
Overview of Renyi entropy estimation
Consistency of nearest neighbor estimators for any s, Leonenko et al. (2008). Only for continuous entropy.
Consistency and asymptotic normality for the quadratic case s = 2, Leonenko and Seleznjev (2010).
Notation
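$I(C)$: indicator function of the event $C$.
$S_{s,n}$: set of all $s$-subsets of $\{1, \ldots, n\}$; $\sum_{(s,n)}$ denotes summation over $S_{s,n}$.
$d(x, y)$: the Euclidean distance in $\mathbb{R}^d$.
$B_\epsilon(x) := \{y : d(x, y) \le \epsilon\}$: ball of radius $\epsilon$ with center $x$.
$b_\epsilon = b_\epsilon(d)$: volume of $B_\epsilon(x)$.
$p_\epsilon(x) := P(X \in B_\epsilon(x))$: the $\epsilon$-ball probability at $x$.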
Estimation
The method relies on estimating the functional
$$q_s := E\bigl(p^{s-1}(X)\bigr) = \begin{cases} \sum_k p(k)^s & \text{(discrete)} \\ \int_{\mathbb{R}^d} p(x)^s\, dx & \text{(continuous)} \end{cases}$$
Estimation, discrete case
An estimator of $q_s$:
$$Q_{s,n} := \binom{n}{s}^{-1} \sum_{(s,n)} \psi(S),$$
where
$$\psi(S) := \frac{1}{s} \sum_{i \in S} I(X_i = X_j,\ \forall j \in S).$$
$H_{s,n} := \frac{1}{1-s} \log(\max(Q_{s,n}, 1/n))$ is an estimator of $h_s$.
$Q_{s,n}$ is a U-statistic, so its properties follow from conventional theory.
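The discrete estimator above can be sketched in a few lines. In the discrete kernel every summand is the same indicator, so the kernel equals 1 exactly when all observations indexed by S coincide; the U-statistic is then the fraction of s-subsets with identical observations, which can be counted from value frequencies. The function names are hypothetical, the sample is made up for illustration, and the brute-force version is included only as a cross-check for small n.

```python
from collections import Counter
from itertools import combinations
from math import comb, log

def q_hat_discrete(sample, s):
    """U-statistic Q_{s,n}: fraction of s-subsets whose observations all coincide."""
    n = len(sample)
    counts = Counter(sample)
    # A value observed c times yields comb(c, s) subsets with all elements equal.
    return sum(comb(c, s) for c in counts.values()) / comb(n, s)

def q_hat_bruteforce(sample, s):
    """Same statistic by direct enumeration of all s-subsets; small n only."""
    n = len(sample)
    hits = sum(1 for S in combinations(range(n), s)
               if all(sample[i] == sample[S[0]] for i in S))
    return hits / comb(n, s)

def h_hat_discrete(sample, s):
    """H_{s,n} = log(max(Q_{s,n}, 1/n)) / (1 - s)."""
    n = len(sample)
    return log(max(q_hat_discrete(sample, s), 1 / n)) / (1 - s)

sample = [1, 2, 2, 3, 3, 3, 1, 2, 3, 3]   # illustrative data, not from the talk
print(q_hat_discrete(sample, 2), q_hat_bruteforce(sample, 2))
print(h_hat_discrete(sample, 2))
```

Counting through frequencies avoids enumerating all subsets, which grows combinatorially in n.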
Estimation, continuous case
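A U-statistic estimator of $q_{\epsilon,s} := E(p_\epsilon(X)^{s-1})$:
$$\tilde{Q}_{s,n} := \binom{n}{s}^{-1} \sum_{(s,n)} \psi(S),$$
where
$$\psi(S) := \frac{1}{s} \sum_{i \in S} I(d(X_i, X_j) \le \epsilon,\ \forall j \in S).$$
$Q_{s,n} := \tilde{Q}_{s,n} / b_\epsilon(d)^{s-1}$ is an asymptotically unbiased estimator of $q_s$ if $\epsilon = \epsilon(n) \to 0$ as $n \to \infty$.
$H_{s,n} := \frac{1}{1-s} \log(\max(Q_{s,n}, 1/n))$ is an estimator of $h_s$.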
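A minimal sketch of the continuous-case estimator follows, using only the standard library. It relies on a counting identity: if N_i is the number of other sample points within distance ε of X_i, then X_i acts as the centre of comb(N_i, s-1) subsets, so the sum of the kernel over all s-subsets is (1/s) times the sum of comb(N_i, s-1). The function names, the seed, and the O(n²) pairwise loop are illustrative assumptions; the χ² sample with 4 degrees of freedom, n = 1000 and ε = 1/4 mirror the settings of the talk's numerical experiment.

```python
import math
import random

def ball_volume(eps, d):
    """Volume b_eps(d) of a Euclidean ball of radius eps in R^d."""
    return math.pi ** (d / 2) * eps ** d / math.gamma(d / 2 + 1)

def q_hat_continuous(points, s, eps):
    """U-statistic divided by b_eps(d)^(s-1); points is a list of d-tuples."""
    n, d = len(points), len(points[0])
    # neighbours[i] = number of other sample points within distance eps of X_i.
    neighbours = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= eps:
                neighbours[i] += 1
                neighbours[j] += 1
    # Sum of the kernel over all s-subsets equals (1/s) * sum_i comb(N_i, s-1).
    u_stat = sum(math.comb(c, s - 1) for c in neighbours) / (s * math.comb(n, s))
    return u_stat / ball_volume(eps, d) ** (s - 1)

def h_hat_continuous(points, s, eps):
    """H_{s,n} = log(max(Q_{s,n}, 1/n)) / (1 - s)."""
    q = q_hat_continuous(points, s, eps)
    return math.log(max(q, 1 / len(points))) / (1 - s)

random.seed(0)  # arbitrary seed, for reproducibility only
# Chi-square with 4 degrees of freedom is Gamma(shape=2, scale=2).
sample = [(random.gammavariate(2.0, 2.0),) for _ in range(1000)]
estimate = h_hat_continuous(sample, 3, 0.25)
print(estimate, 0.5 * math.log(54))  # estimate next to the exact h_3
```

The pairwise loop is quadratic in n, which is acceptable at this sample size; a spatial index would be a natural replacement for larger samples.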
Smoothness conditions
Denote by $H^{(\alpha)}(K)$, $0 < \alpha \le 2$, $K > 0$, the linear space of functions continuous in $\mathbb{R}^d$ that satisfy an $\alpha$-Holder condition if $0 < \alpha \le 1$, or, if $1 < \alpha \le 2$, have continuous partial derivatives satisfying an $(\alpha - 1)$-Holder condition, in either case with constant $K$.
Asymptotics, continuous case
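$\epsilon = \epsilon(n) \to 0$ as $n \to \infty$.
$L(n) > 0$, $n \ge 1$: a slowly varying function as $n \to \infty$.
$K_{s,n} := \max\bigl(s^2 (Q_{2s-1,n} - Q_{s,n}^2),\ 1/n\bigr)$: consistent estimator of the asymptotic variance.
Theorem
Suppose that $p(x)$ is bounded and continuous. Let $n\epsilon^d \to a$ for some $0 < a \le \infty$.
(i) Then $H_{s,n}$ is a consistent estimator of $h_s$.
(ii) Let $p(x)^{s-1} \in H^{(\alpha)}(K)$ for some $d/2 < \alpha \le 2$. If $\epsilon \sim L(n) n^{-1/d}$ and $a = \infty$, then
$$\frac{\sqrt{n}\, Q_{s,n} (1-s)}{\sqrt{K_{s,n}}} (H_{s,n} - h_s) \xrightarrow{D} N(0, 1) \quad \text{as } n \to \infty.$$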
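One way to use part (ii) is to invert the normal limit into an approximate confidence interval for the entropy. The sketch below assumes the conditions of the theorem hold and that estimates of the functionals of orders s and 2s-1 are already available; the function name and argument names are hypothetical. The usage lines plug in exact functional values for the χ² density with 4 degrees of freedom (order 3: 1/54, as in the talk; order 5: 3/6250, which follows from the same gamma integral) in place of estimates, purely for illustration.

```python
import math
from statistics import NormalDist

def entropy_confidence_interval(q_s, q_2s_minus_1, n, s, level=0.95):
    """Approximate interval for h_s obtained by inverting the normal limit."""
    if q_s <= 0:
        raise ValueError("q_s must be positive")
    k = max(s ** 2 * (q_2s_minus_1 - q_s ** 2), 1 / n)   # K_{s,n}
    h = math.log(max(q_s, 1 / n)) / (1 - s)              # H_{s,n}
    z = NormalDist().inv_cdf(0.5 + level / 2)            # two-sided normal quantile
    half_width = z * math.sqrt(k) / (math.sqrt(n) * q_s * abs(1 - s))
    return h - half_width, h + half_width

# Illustrative inputs: exact values for chi-square(4) used in place of estimates.
lo, hi = entropy_confidence_interval(1 / 54, 3 / 6250, n=1000, s=3)
print(lo, hi)
```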
Numerical experiment
$\chi^2$ distribution with 4 degrees of freedom. $h_3 = -\frac{1}{2} \log(q_3)$, where $q_3 = 1/54$.
500 simulations, each of size n = 1000, ε = 1/4.
Quantile plot and histogram support standard normality.
Remark. The choice of ε in a practical situation remains an open problem.
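The value of the order-3 functional quoted above can be verified directly from the χ² density with 4 degrees of freedom and the gamma integral; the numerical value of the entropy assumes natural logarithms.

```latex
p(x) = \tfrac{1}{4}\, x\, e^{-x/2}, \quad x > 0,
\qquad
q_3 = \int_0^\infty p(x)^3 \, dx
    = \frac{1}{64} \int_0^\infty x^3 e^{-3x/2} \, dx
    = \frac{1}{64} \cdot \frac{3!}{(3/2)^4}
    = \frac{1}{64} \cdot \frac{32}{27}
    = \frac{1}{54},
\qquad
h_3 = \frac{1}{1-3} \log q_3 = \tfrac{1}{2} \log 54 \approx 1.99 .
```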
Figures
[Figure: "Histogram of x". Horizontal axis: x; vertical axis: standard normal density.]
Figures
[Figure: "Normal Q-Q Plot". Horizontal axis: theoretical quantiles; vertical axis: sample quantiles.]
More results
Estimation of (quadratic) Renyi entropy from an m-dependent sample.
Two different distributions $P_X$ and $P_Y$. Inference for
... functionals of type $\int_{\mathbb{R}^d} p_X(x)^{s_1} p_Y(x)^{s_2}\, dx$, $s_1, s_2 \in \mathbb{N}_+$.
... statistical distances (Bregman):
Discrete:
$$D_2 := \sum_{k} \bigl(p_X(k) - p_Y(k)\bigr)^2$$
Continuous:
$$D_2 := \int_{\mathbb{R}^d} \bigl(p_X(x) - p_Y(x)\bigr)^2\, dx$$
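Expanding the square shows how the distance connects to the functionals listed above: the cross term is the two-distribution functional with both exponents equal to 1, and the other two terms are the quadratic functionals of each distribution on its own. This is only the algebraic identity; how each term is estimated is not spelled out in the slides.

```latex
D_2 = \int_{\mathbb{R}^d} \bigl(p_X(x) - p_Y(x)\bigr)^2 dx
    = \int_{\mathbb{R}^d} p_X(x)^2 \, dx
      \;-\; 2 \int_{\mathbb{R}^d} p_X(x)\, p_Y(x) \, dx
      \;+\; \int_{\mathbb{R}^d} p_Y(x)^2 \, dx .
```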
Conclusion
Asymptotically normal estimates are possible for Renyi entropy of integer order.
References
Leonenko, N., Pronzato, L., Savani, V. (2008). A class of Renyi information estimators for multidimensional densities. Annals of Statistics 36, 2153-2182.
Leonenko, N. and Seleznjev, O. (2010). Statistical inference for ε-entropy and quadratic Renyi entropy. J. Multivariate Analysis 101, Issue 9, 1981-1994.
Renyi, A. (1961). On measures of entropy and information. Proceedings of the 4th Berkeley Symposium on Mathematical Statistics and Probability 1960, 547-561.
Renyi, A. (1970). Probability theory. North-Holland Publishing Company.