Nonparametric Divergence Estimators for Independent Subspace Analysis

Barnabás Póczos (Carnegie Mellon University, USA)

Zoltán Szabó (Eötvös Loránd University, Hungary)

Jeff Schneider (Carnegie Mellon University, USA)

EUSIPCO-2011, Barcelona, Spain, Sept 2, 2011

Outline

•Goal: divergence estimation

•Definitions, basic properties, motivation

•The estimator

•Theoretical results
  •Consistency

•Experimental results
  •Mutual information estimation
  •Independent subspace analysis
  •Low-dimensional embedding of distributions

Measuring divergences

www.juhokim.com/projects.php

Cristiano Ronaldo, Rio Ferdinand, Owen Hargreaves

KL

Rényi

Tsallis

Manchester United 07/08

How should we estimate them?

• Naïve plug-in approach using density estimation
  – density estimators
    • histogram
    • kernel density estimation
    • k-nearest neighbors [D. Loftsgaarden & C. Quesenberry. 1965.]

• How can we estimate them directly?

Density: nuisance parameter

Density estimation: difficult

kNN density estimation

How good is this estimation?

[D. Loftsgaarden and C. Quesenberry. 1965.]

[N. Leonenko et al. 2008]
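
The slide's formula did not survive extraction. As a rough sketch, the standard k-NN density estimate is p̂_k(x) = k / (n · c_d · ρ_k(x)^d), where ρ_k(x) is the distance from x to its k-th nearest sample point and c_d is the volume of the d-dimensional unit ball. The function names, the Euclidean norm, and the default k below are assumptions made for illustration, not taken from the talk.

```python
import math
import numpy as np

def unit_ball_volume(d):
    """Volume c_d of the unit ball in R^d (Euclidean norm)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def knn_density(query, sample, k=5):
    """k-NN density estimate at each row of `query`.

    query:  array of shape (q, d), points that are NOT part of `sample`
    sample: array of shape (n, d), i.i.d. draws from the unknown density
    """
    query = np.asarray(query, dtype=float)
    sample = np.asarray(sample, dtype=float)
    n, d = sample.shape
    # Pairwise Euclidean distances between query points and sample points.
    dists = np.linalg.norm(query[:, None, :] - sample[None, :, :], axis=2)
    # Distance to the k-th nearest sample point for each query point.
    rho_k = np.sort(dists, axis=1)[:, k - 1]
    return k / (n * unit_ball_volume(d) * rho_k ** d)
```

For a fixed k the estimate does not concentrate as n grows, which is one reason the plug-in route is unattractive and the divergence is estimated directly instead.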

Divergence Estimation
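
The estimator formula on this slide was lost in extraction. The sketch below assumes the usual form of the k-NN Rényi divergence estimator: with ρ_k(i) the k-th nearest-neighbor distance of X_i within the X sample and ν_k(i) its k-th nearest-neighbor distance to the Y sample, the integral ∫ p^α q^(1−α) is estimated by averaging ((n−1) ρ_k(i)^d / (m ν_k(i)^d))^(1−α) and multiplying by the correction B = Γ(k)² / (Γ(k−α+1) Γ(k+α−1)). The function names, defaults, and brute-force distance computation are illustrative assumptions.

```python
import math
import numpy as np

def kth_nn_distance(queries, points, k, exclude_self=False):
    """Distance from each query row to its k-th nearest row of `points`.

    With exclude_self=True the queries are assumed to be the rows of
    `points` themselves, so the zero self-distance is skipped.
    """
    dists = np.linalg.norm(queries[:, None, :] - points[None, :, :], axis=2)
    dists.sort(axis=1)
    return dists[:, k if exclude_self else k - 1]

def renyi_divergence(x, y, alpha=0.8, k=5):
    """Sketch of a k-NN estimate of D_alpha(p || q), x ~ p and y ~ q.

    x: array (n, d), y: array (m, d); requires alpha != 1 and k > |alpha - 1|.
    """
    n, d = x.shape
    m = y.shape[0]
    rho = kth_nn_distance(x, x, k, exclude_self=True)  # within-sample distances
    nu = kth_nn_distance(x, y, k)                      # cross-sample distances
    # Correction term B, computed in log space for numerical stability.
    log_b = (2 * math.lgamma(k)
             - math.lgamma(k - alpha + 1)
             - math.lgamma(k + alpha - 1))
    ratio = ((n - 1) * rho ** d / (m * nu ** d)) ** (1 - alpha)
    integral = math.exp(log_b) * ratio.mean()  # estimates the integral of p^alpha q^(1-alpha)
    return math.log(integral) / (alpha - 1)
```

No density is ever formed explicitly here: only k-NN distances enter the estimate. A k-d tree would replace the O(nm) distance matrix for larger samples.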

Asymptotically unbiased

We need to prove:

The estimator

(1−α)-th and (α−1)-th moments of the “normalized k-NN distances”

Normalized k-NN distances converge to the Erlang distribution
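
A small numerical check of the Erlang limit, as a sketch under stated assumptions: for Uniform(0, 1) data and the query point 0.5 (density p = 1, unit-ball volume c_1 = 2), the normalized distance n · c_1 · p · ρ_k should be approximately Erlang(k, 1), whose γ-th moment is Γ(k + γ) / Γ(k). The setup and function names are illustrative choices, not taken from the talk.

```python
import math
import numpy as np

def erlang_moment(k, gamma):
    """E[Z^gamma] for Z ~ Erlang(k, 1), valid for gamma > -k."""
    return math.exp(math.lgamma(k + gamma) - math.lgamma(k))

def simulated_normalized_knn(n, k, trials, rng):
    """Draw `trials` copies of n * c_1 * p * rho_k for Uniform(0, 1) data.

    The query point is 0.5, so p = 1 and c_1 = 2 (length of [-1, 1]).
    """
    samples = rng.random((trials, n))
    rho_k = np.sort(np.abs(samples - 0.5), axis=1)[:, k - 1]
    return n * 2.0 * rho_k

rng = np.random.default_rng(0)
alpha, k = 0.8, 3
z = simulated_normalized_knn(n=500, k=k, trials=4000, rng=rng)
# Monte Carlo moments next to their Erlang limits.
print(np.mean(z ** (1 - alpha)), erlang_moment(k, 1 - alpha))
print(np.mean(z ** (alpha - 1)), erlang_moment(k, alpha - 1))
```

The product of the two limiting moments is Γ(k−α+1) Γ(k+α−1) / Γ(k)², which is why a multiplicative correction by its reciprocal removes the asymptotic bias.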

Agner Krarup Erlang

Asymptotically unbiased

If we could move the limit inside the expectation…

All we need is

A little problem…

Asymptotic uniform integrability…

Solutions:

Increases the paper length by another 20 pages…

Results for divergence estimation

2D Normal

Results for MI estimation

rotated uniform distribution

Independent Subspace Analysis

Observation X=AS

Independent subspaces

Goal: estimate A and S by observing samples from X only

6 by 6 mixing matrix
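
To make the generative model X = AS concrete, here is a minimal sketch that draws sources from independent subspaces and mixes them with a 6 by 6 matrix. The slide does not state the subspace dimensions or the source distributions; three 2-dimensional subspaces, radially scaled uniform directions, and a random orthogonal A are assumptions made for illustration.

```python
import numpy as np

def make_isa_data(n_samples=1000, subspace_dims=(2, 2, 2), seed=0):
    """Generate (A, S, X) with X = A S.

    Coordinates inside a subspace are dependent; the subspaces themselves
    are drawn independently of each other.
    """
    rng = np.random.default_rng(seed)
    blocks = []
    for dim in subspace_dims:
        # Uniform direction on the unit sphere, scaled by a random radius:
        # the coordinates share the radius, so they are dependent.
        direction = rng.standard_normal((n_samples, dim))
        direction /= np.linalg.norm(direction, axis=1, keepdims=True)
        radius = rng.random((n_samples, 1))
        blocks.append(direction * radius)
    s = np.hstack(blocks).T                    # sources, shape (D, n_samples)
    d_total = s.shape[0]
    # Random orthogonal mixing matrix from a QR decomposition.
    a, _ = np.linalg.qr(rng.standard_normal((d_total, d_total)))
    x = a @ s                                  # observations, shape (D, n_samples)
    return a, s, x

a, s, x = make_isa_data()
```

An ISA method sees only `x`. It can recover the subspaces at best up to permutation of the subspaces and invertible transformations within each one.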

Independent Subspace Analysis

Objective:

Low-dimensional embedding of digits

Noisy USPS datasets

Embedding using raw image data

Embedding using Rényi divergences

Be careful, some mistakes are easy to make…

We want:

Helly–Bray theorem

[Annals of Statistics]

Some mistakes …

We want:

Enough:

Erlang

Fatou lemma:

[Journal of Nonparametric Statistics, Problems Information Transmission, IEEE Trans. on Information Theory]

Fatou lemma:

Takeaways

If you need to estimate divergences, then use me!

•Consistent divergence estimator

•Direct: no need to estimate densities

•Simple: it needs only kNN-based statistics

•Can be used for mutual information estimation, independent subspace analysis, low-dimensional embedding

Thanks for your attention!

Attic
