Estimating Local Intrinsic Dimensionality

L. Amsaleg (1), O. Chelly (2), T. Furon (1), S. Girard (3), M. E. Houle (2), K. Kawarabayashi (2), M. Nett (2, 4)

(1) Equipe TEXMEX, INRIA/IRISA Rennes, France
(2) National Institute of Informatics, Chiyoda-ku, Tokyo, Japan
(3) Equipe MISTIS, INRIA Grenoble Rhône-Alpes, France
(4) Google Japan, Minato-ku, Tokyo, Japan

August 3, 2015


Page 2: Estimating Local Intrinsic Dimensionality

Table of contents

1 Introduction

2 Models of ID
  • Global vs. local models of ID
  • Local models of ID

3 Extreme Value Theory
  • Threshold model
  • ID in the threshold model

4 ID Estimation
  • Our estimators
  • Experimental results and interpretation

Page 3: Estimating Local Intrinsic Dimensionality

What is intrinsic dimensionality?

Definition: Intrinsic Dimensionality

Intrinsic dimensionality can be described as the minimum number of coordinates required to locate a point in the space.

Page 4: Estimating Local Intrinsic Dimensionality

Current applications of ID

Analysis of search indices (Expansion Dimension)
  • Navigating Nets [Krauthgamer & Lee (2005)]
  • Cover Tree [Beygelzimer, Kakade, Langford (2006)]
  • Rank Cover Tree [Houle & Nett (2013)]

Analysis of projection, outlier detection (Expansion Dimension)
  • LOF-based PINN outlier detection [de Vries, Chawla, Houle (2010)]

Projection and dimensionality reduction
  • Principal component analysis [Pearson (1901)]

Page 5: Estimating Local Intrinsic Dimensionality

Potential applications of ID

Prediction
  • Estimate the target dimension for dimensionality reduction.
  • Predict the difficulty of data sets, subsets, or points.

Processing
  • Redesign machine learning algorithms by adapting them to the local variation of ID (queries at points with low ID are more trustworthy).

Evaluation
  • Explain the behavior of algorithms and evaluate them more fairly by accounting for the disparity in ID.

Page 6: Estimating Local Intrinsic Dimensionality

Global models of ID vs. Local models of ID

Page 7: Estimating Local Intrinsic Dimensionality

Global models of ID vs. Local models of ID

Properties of global and local models

         | Global models | Local models
---------|---------------|-------------
measure  | the dimensionality of the whole dataset. | the dimensionality in the neighborhood of a point.
require  | a set of data points. | a set of distances to the nearest neighbors.
examples | Topological models (PCA), fractal models (Hausdorff Dimension, Correlation Dimension), graph-based models, etc. | Expansion models (ED, LID).

Page 8: Estimating Local Intrinsic Dimensionality

Expansion models

Expansion-based methods estimate the intrinsic dimension by comparing the expansion in distance with the associated expansion in volume (number of points).

Examples: Expansion Dimension (ED), Minimum Neighbor Distance (MiND), Local Intrinsic Dimension (LID).

[Figure: two balls B1 and B2 of radii x1 and x2]

Page 9: Estimating Local Intrinsic Dimensionality

How to measure the dimension?

Dimensional query

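In an $L_p$-norm space of dimension $m$, if $V$ is a measure of volume, then:

$$ \frac{V(B_2)}{V(B_1)} = \left( \frac{x_2}{x_1} \right)^m $$

The representational dimension $m$ can be obtained by:

$$ m = \frac{\ln V(B_2) - \ln V(B_1)}{\ln x_2 - \ln x_1} $$

[Figure: two balls B1 and B2 of radii x1 and x2]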
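As a quick sanity check of the relation above, the following sketch recovers m from the volumes of two Euclidean balls (the p = 2 case only). The helper names `ball_volume` and `dimension_from_volumes`, the dimension 7, and the radii are illustrative assumptions, not taken from the slides.

```python
import math


def ball_volume(radius, m):
    """Volume of an m-dimensional Euclidean (L2) ball of the given radius."""
    return math.pi ** (m / 2) / math.gamma(m / 2 + 1) * radius ** m


def dimension_from_volumes(v1, v2, x1, x2):
    """m = (ln V(B2) - ln V(B1)) / (ln x2 - ln x1)."""
    return (math.log(v2) - math.log(v1)) / (math.log(x2) - math.log(x1))


# Hypothetical setting: two concentric balls in a 7-dimensional space.
m_true = 7
x1, x2 = 0.5, 0.8
m_recovered = dimension_from_volumes(
    ball_volume(x1, m_true), ball_volume(x2, m_true), x1, x2
)
print(m_recovered)
```

The constant factor of the ball volume cancels in the log-difference, which is why only the ratio of radii and the ratio of volumes matter.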

Page 10: Estimating Local Intrinsic Dimensionality

Expansion Dimension

Expansion dimension

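If volume is measured in terms of the number of points that are captured, then

$$ \mathrm{ED}(q, x_1, x_2) = \frac{\ln k_2 - \ln k_1}{\ln x_2 - \ln x_1} $$

where $k_1$ and $k_2$ are the numbers of points captured by $B_1$ and $B_2$.

[Figure: two balls B1 and B2 of radii x1 and x2]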
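The expansion dimension can be tried out on synthetic data. The sketch below counts the points falling inside two balls around a query and plugs the counts into the formula. The uniform 3-dimensional cube, the query at the origin, the radii, and the function name `expansion_dimension` are assumptions made for illustration.

```python
import numpy as np


def expansion_dimension(dists, x1, x2):
    """ED(q, x1, x2) from the distances between q and every data point."""
    k1 = np.count_nonzero(dists <= x1)  # points captured by B1
    k2 = np.count_nonzero(dists <= x2)  # points captured by B2
    return (np.log(k2) - np.log(k1)) / (np.log(x2) - np.log(x1))


# Hypothetical data: points uniform in the cube [-1, 1]^3, query at the origin.
rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(20_000, 3))
query = np.zeros(3)
dists = np.linalg.norm(points - query, axis=1)

ed = expansion_dimension(dists, 0.4, 0.8)
print(ed)  # expected to be close to 3 for this data
```

Both balls must contain at least one point, and both must stay inside the region where the data are uniform, for the estimate to be meaningful.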

Page 11: Estimating Local Intrinsic Dimensionality

Distance as a continuous random variable

Main assumption

Data can be seen as a sample generated from a set of continuous random variables.

Continuous random distance variable

Let $X$ be an absolutely continuous random distance variable that represents the distance from a reference point $q$, with
  • probability density function $f_X$
  • cumulative distribution function $F_X(x) = \int_0^x f_X(t)\,dt$
  • $(x_i)_{1 \le i \le n}$ an ordered sample of $X$

Page 12: Estimating Local Intrinsic Dimensionality

Intrinsic dimensional query

Definition

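When $F_X(x) > 0$, the continuous ID of $X$ at distance $x$ is defined as

$$ \mathrm{IntrDim}_{F_X}(x) = \lim_{\varepsilon \to 0^+} \frac{\ln F_X((1+\varepsilon)x) - \ln F_X(x)}{\ln(1+\varepsilon)} $$

Remarks
  • The volume is measured in terms of the expected number of points.
  • $X$ depends on the reference point $q$.

[Figure: a ball of radius x and its expansion by ε · x]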
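The limit in the definition can be approximated numerically by fixing a small ε. The sketch below does so for an assumed example distribution, F(x) = 1 − exp(−x^m) with m = 3, which is not from the slides. The function names and the value of ε are likewise illustrative choices.

```python
import math

M = 3.0  # shape parameter of the assumed example distribution


def cdf(x):
    """Assumed example CDF: F(x) = 1 - exp(-x^M), for x >= 0."""
    return -math.expm1(-(x ** M))


def continuous_id(x, eps=1e-6):
    """Finite-epsilon approximation of the limit that defines the continuous ID."""
    numerator = math.log(cdf((1.0 + eps) * x)) - math.log(cdf(x))
    return numerator / math.log1p(eps)


for x in (0.5, 0.1, 0.01):
    print(x, continuous_id(x))
```

For this example the value at x = 0.5 lies somewhat below 3 and the values approach 3 as x shrinks, which illustrates that the ID is a function of the distance x and that its behavior near 0 is the quantity of interest.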

Page 13: Estimating Local Intrinsic Dimensionality

Intrinsic dimensional query

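By applying l'Hôpital's rule to the limit that defines the continuous ID (written $\mathrm{ID}_{F_X}$ from here on):

$$ \mathrm{ID}_{F_X}(x) = \frac{x \cdot f_X(x)}{F_X(x)} $$

[Figure: a ball of radius x and its expansion by ε · x]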
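The l'Hôpital step can be written out as follows. This is a sketch that assumes $f_X$ is continuous at $x$. The closing example distribution $F_X(x) = (x/w)^m$ is an illustrative assumption.

```latex
\begin{align*}
\mathrm{ID}_{F_X}(x)
  &= \lim_{\varepsilon \to 0^+}
     \frac{\ln F_X((1+\varepsilon)x) - \ln F_X(x)}{\ln(1+\varepsilon)} \\
  % differentiate numerator and denominator with respect to \varepsilon
  &= \lim_{\varepsilon \to 0^+}
     \frac{x \, f_X((1+\varepsilon)x) \,/\, F_X((1+\varepsilon)x)}{1/(1+\varepsilon)}
   = \frac{x \, f_X(x)}{F_X(x)}.
\end{align*}
% Example (assumed): F_X(x) = (x/w)^m on [0, w] gives f_X(x) = m x^{m-1} / w^m,
% hence x f_X(x) / F_X(x) = m at every 0 < x <= w.
```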

Page 14: Estimating Local Intrinsic Dimensionality

Disaster prevention

Page 15: Estimating Local Intrinsic Dimensionality

Introduction to EVT

Analogy between central and extreme values

  • Central values → Central limit theorem → Normal distribution
  • Extreme values → Pickands-Balkema-de Haan theorem → Generalized Pareto Distribution

Page 16: Estimating Local Intrinsic Dimensionality

Generalized Pareto Distribution

Cases of the Generalized Pareto Distribution
  • ξ = 0 → Gumbel family
  • ξ > 0 → Fréchet family
  • ξ < 0 → Weibull family (the distribution is upper bounded)

Distance distributions are lower bounded.

We can model distances under a threshold w using the Weibull distribution.

Figure: pdf of the Weibull distribution

Page 17: Estimating Local Intrinsic Dimensionality

Modeling the tail of a distance distribution

Modeling threshold excesses

As a certain threshold $w$ approaches 0, the excess $Y = w - X$ follows a GPD with parameters $\xi$ and $\sigma$:

$$ \Pr[Y \le y \mid Y < w] \approx 1 - \left( 1 + \frac{\xi y}{\sigma} \right)^{-1/\xi} $$

Heuristic

As $w \to 0$, the distribution of $X$ restricted to the tail $[0, w)$ converges to that of a random distance variable $X^*$:

$$ F_{X,w}(x) \approx F_{X^*,w}(x) = \left( \frac{x}{w} \right)^{-1/\xi} $$

Page 18: Estimating Local Intrinsic Dimensionality

Modeling the tail of a distance distribution

Modeling threshold excesses

$$ \mathrm{ID}_{F_X}(0) \approx \lim_{x \to 0} \mathrm{ID}_{F_{X^*}}(x) = \lim_{x \to 0} \frac{x \cdot f_{X^*,w}(x)}{F_{X^*,w}(x)} = -\frac{1}{\xi} $$

Remarks
  • This is an approximation, since it holds for the limit distribution as $w \to 0$.
  • We can use classical estimation methods to estimate the parameter $\xi$.

Note: In EVT, $-1/\xi$ is called the index of the distribution.
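The value $-1/\xi$ can be checked directly from the tail model $F_{X^*,w}(x) = (x/w)^{-1/\xi}$. The short derivation below is a worked check, with $\xi < 0$ so that the exponent is positive. The numerical value $\xi = -1/2$ in the closing comment is only an illustrative choice.

```latex
\begin{align*}
F_{X^*,w}(x) &= \left(\frac{x}{w}\right)^{-1/\xi}, \\
f_{X^*,w}(x) &= \frac{d}{dx} F_{X^*,w}(x)
              = -\frac{1}{\xi} \cdot \frac{1}{x} \left(\frac{x}{w}\right)^{-1/\xi}, \\
\frac{x \cdot f_{X^*,w}(x)}{F_{X^*,w}(x)} &= -\frac{1}{\xi}
  \qquad \text{for every } 0 < x < w.
\end{align*}
% The ratio does not depend on x, so the limit as x -> 0 is also -1/\xi.
% Example (assumed value): \xi = -1/2 gives F(x) = (x/w)^2 and an ID of 2.
```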

Page 19: Estimating Local Intrinsic Dimensionality

Usual statistical methods

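Maximum likelihood estimator

$$ \mathrm{ID}_{F_{X^*}}(0) = -\left( \frac{1}{n} \sum_{i=1}^{n} \ln \frac{x_i}{x_n} \right)^{-1} $$

Method of moments estimator

$$ \mathrm{ID}_{F_{X^*}}(0) = -k \, \frac{\mu_k}{\mu_k - x_n^k} \quad \text{where} \quad \mu_k = \frac{1}{n} \sum_{i=1}^{n} x_i^k $$

Probability weighted moments estimator

$$ \mathrm{ID}_{F_{X^*}}(0) = \frac{\nu_k}{w - (k+1)\,\nu_k} \quad \text{where} \quad \nu_k = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{i - 0.35}{n} \right)^k x_i $$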
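The three estimators above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. The sample is drawn from the assumed tail model F(x) = (x/w)^m with m = 5 and w = 1, the moment μ_k is taken as the sample mean of x_i^k, and the function names `lid_mle`, `lid_mom`, and `lid_pwm` are hypothetical.

```python
import numpy as np


def lid_mle(x):
    """Maximum likelihood estimate: -(mean of ln(x_i / x_n))^-1, x_n the largest distance."""
    x = np.sort(np.asarray(x, dtype=float))
    return -1.0 / np.mean(np.log(x / x[-1]))


def lid_mom(x, k=1):
    """Method of moments estimate, with mu_k the sample mean of x_i^k."""
    x = np.sort(np.asarray(x, dtype=float))
    mu_k = np.mean(x ** k)
    return -k * mu_k / (mu_k - x[-1] ** k)


def lid_pwm(x, w, k=1):
    """Probability weighted moments estimate with plotting positions (i - 0.35) / n."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    nu_k = np.mean(((ranks - 0.35) / n) ** k * x)
    return nu_k / (w - (k + 1) * nu_k)


# Assumed synthetic distances: X = w * U^(1/m) has CDF (x/w)^m on [0, w].
rng = np.random.default_rng(0)
true_id, w, n = 5.0, 1.0, 10_000
sample = w * rng.random(n) ** (1.0 / true_id)

estimates = {
    "MLE": lid_mle(sample),
    "MoM": lid_mom(sample),
    "PWM": lid_pwm(sample, w),
}
print(estimates)  # each value is expected to be close to true_id
```

Sorting matters only for the probability weighted moments estimator, whose weights depend on the rank i. It is kept in the other two so that `x[-1]` is always the largest distance x_n.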

Page 20: Estimating Local Intrinsic Dimensionality

Regularly Varying Functions Estimator

Regularly Varying Functions Estimator

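$$ \mathrm{ID}_{F_{X^*}}(0) = \kappa = \frac{\sum_{j=1}^{J} \alpha_j \ln\left[ F_X((1+\tau_j\delta_n)\,x_n) \,/\, F_X(x_n) \right]}{\sum_{j=1}^{J} \alpha_j \ln(1+\tau_j\delta_n)} $$

under the assumption that $\delta_n \to 0^+$ as $n \to \infty$, where $(\alpha_j)_{1 \le j \le J}$ and $(\tau_j)_{1 \le j \le J}$ are sequences.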
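A quick consistency check of the formula for $\kappa$: if $F_X$ is replaced by the assumed tail model $F(x) = (x/w)^m$, the expression returns exactly $m$ for any choice of the sequences. This is a worked check under that assumption, requiring $(1+\tau_j\delta_n)\,x_n \le w$ and a nonzero denominator.

```latex
\begin{align*}
\ln\frac{F((1+\tau_j\delta_n)\,x_n)}{F(x_n)}
  &= \ln\frac{\left((1+\tau_j\delta_n)\,x_n / w\right)^m}{\left(x_n / w\right)^m}
   = m \,\ln(1+\tau_j\delta_n), \\
\kappa
  &= \frac{\sum_{j=1}^{J} \alpha_j \, m \,\ln(1+\tau_j\delta_n)}
          {\sum_{j=1}^{J} \alpha_j \,\ln(1+\tau_j\delta_n)}
   = m.
\end{align*}
```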

Page 21: Estimating Local Intrinsic Dimensionality

LID estimation in artificial distances

[Figure, two panels: Estimated ID and Standard Deviation / ID, each plotted against Number of samples (10 to 10000); legend entries: MLE, MoM, RV (J=1), RV (J=2), ID]

Figure: Comparison of ID estimates (ID=2)

Page 22: Estimating Local Intrinsic Dimensionality

LID estimation in artificial distances

[Figure, two panels: Estimated ID and Standard Deviation / ID, each plotted against Number of samples (10 to 10000); legend entries: MLE, MoM, RV (J=1), RV (J=2), ID]

Figure: Comparison of ID estimates (ID=32)
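An experiment of this kind can be imitated in a few lines. The sketch below does not reproduce the authors' setup. It draws artificial distances from the assumed model F(x) = (x/w)^m with m = 2 and reports the mean and the relative standard deviation of the maximum likelihood estimate for several sample sizes. The number of trials and the helper names are illustrative assumptions.

```python
import numpy as np


def lid_mle(x):
    """Maximum likelihood estimate from a sample of distances."""
    x = np.asarray(x, dtype=float)
    return -1.0 / np.mean(np.log(x / x.max()))


def simulate(true_id, n, trials, rng, w=1.0):
    """Mean estimate and (standard deviation / true ID) over repeated samples of size n."""
    estimates = np.array(
        [lid_mle(w * rng.random(n) ** (1.0 / true_id)) for _ in range(trials)]
    )
    return estimates.mean(), estimates.std() / true_id


rng = np.random.default_rng(0)
results = {n: simulate(2.0, n, trials=200, rng=rng) for n in (10, 100, 1000)}
for n, (mean_est, rel_std) in results.items():
    print(n, mean_est, rel_std)
```

The relative standard deviation shrinks as the number of samples grows, and the mean estimate moves toward the true ID.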

Page 23: Estimating Local Intrinsic Dimensionality

ID estimation in artificial datasets

Datasets

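manifold | d (intrinsic) | D (representational) | description
---------|---------------|----------------------|------------
1        | 10            | 11                   | Uniformly sampled sphere
2        | 3             | 5                    | Affine space
3        | 4             | 6                    | Concentrated figure confusable with a 3d one
4        | 4             | 8                    | Non-linear manifold
5        | 2             | 3                    | 2-d Helix
6        | 6             | 36                   | Non-linear manifold
7        | 2             | 3                    | Swiss-Roll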

Page 24: Estimating Local Intrinsic Dimensionality

ID estimation in artificial datasets

Datasets

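manifold | d (intrinsic) | D (representational) | description
---------|---------------|----------------------|------------
8        | 12            | 72                   | Non-linear manifold
9        | 20            | 20                   | Affine space
10a      | 10            | 11                   | Uniformly sampled hypercube
10b      | 17            | 18                   | Uniformly sampled hypercube
10c      | 24            | 25                   | Uniformly sampled hypercube
11       | 2             | 3                    | Möbius band 10-times twisted
12       | 20            | 20                   | Isotropic multivariate Gaussian
13       | 1             | 13                   | Curve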

Page 25: Estimating Local Intrinsic Dimensionality

What is the Intrinsic Dimension?

[Figure: plot with axes Column 0, Column 1, and Column 2 (each ranging from -10 to 20); color scale: Estimated dimensionality (1 to 20)]

Figure: Data structures are detected by ID

Page 26: Estimating Local Intrinsic Dimensionality

ID estimation in artificial datasets

Conclusions

  • Local estimators tend to over-estimate the dimensionality of non-linear manifolds, and to under-estimate that of linear manifolds.
  • For non-linear manifolds, global estimators have difficulty identifying the intrinsic dimension.
  • The higher the sampling rate, the lower the bias.
  • Global methods are strongly affected by noise, while local methods are more resistant.

Page 27: Estimating Local Intrinsic Dimensionality

Values of LID in real datasets

[Figure, two histograms (ALOI and MNIST): Abs. Frequency (0K to 4K) against Estimated ID (0 to 35)]

Figure: LID MLE estimates in ALOI and MNIST
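A histogram of per-point estimates like the one in this figure can be produced by applying the maximum likelihood estimator to the k-nearest-neighbor distances of every point. The sketch below is a minimal illustration on synthetic data, not the authors' pipeline. The data set (a 2-dimensional square embedded in 5 dimensions), the neighborhood size k = 20, the brute-force neighbor search, and all function names are assumptions.

```python
import numpy as np


def knn_distances(data, k):
    """Brute-force distances from each point to its k nearest neighbors (self excluded)."""
    diffs = data[:, None, :] - data[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)
    dists.sort(axis=1)
    return dists[:, 1:k + 1]  # column 0 is the zero distance of a point to itself


def lid_mle_per_point(knn_dists):
    """Maximum likelihood LID estimate for every row of sorted neighbor distances."""
    r_k = knn_dists[:, -1:]  # distance to the k-th neighbor
    return -1.0 / np.mean(np.log(knn_dists / r_k), axis=1)


# Assumed synthetic data: a 2-d uniform square embedded in 5 dimensions.
rng = np.random.default_rng(0)
n, k = 500, 20
data = np.hstack([rng.random((n, 2)), np.zeros((n, 3))])

lid = lid_mle_per_point(knn_distances(data, k))
hist, edges = np.histogram(lid, bins=10)  # absolute frequency per bin of estimated ID
median_lid = float(np.median(lid))
print(median_lid)
```

The brute-force search uses memory quadratic in the number of points, so a real data set of the size of ALOI or MNIST would need an index structure or a batched search instead.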

Page 28: Estimating Local Intrinsic Dimensionality

Thank you for your attention!
