Upload
others
View
9
Download
0
Embed Size (px)
Citation preview
IntroductionModels of ID
Extreme Value TheoryID Estimation
Estimating Local Intrinsic Dimensionality
L. Amsaleg1 O. Chelly2 T. Furon1
S. Girard3 M. E. Houle2 K. Kawarabayashi2 M. Nett24
1Equipe TEXMEX, INRIA/IRISA Rennes, France
2National Institute of Informatics, Chiyoda-ku, Tokyo, Japan
3Equipe MISTIS, INRIA Grenoble Rhone-Alpes, France
4Google Japan, Minato-ku, Tokyo, Japan
August 3, 2015
1/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Table of contents
1 Introduction
2 Models of IDGlobal vs. local models of IDLocal models of ID
3 Extreme Value TheoryThreshold modelID in the threshold model
4 ID EstimationOur estimatorsExperimental results and interpretation
2/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
What is intrinsic dimensionality?
Definition: Intrinsic Dimensionality
Intrinsic dimensionality can be described as the minimum numberof coordinates required to locate a point in the space.
3/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Current applications of ID
Analysis of search indices (Expansion Dimension)
Navigating Nets [Krauthgamer & Lee (2005)]
Cover Tree [Beygelzimer, Kakade, Langford (2006)]
Rank Cover Tree [Houle & Nett (2013)]
Analysis of projection, Outlier detection (Expansion Dimension)
LOF-based PINN outlier detection [de Vries, Chawla, Houle(2010)]
Projection and dimensional reduction
Principal component analysis [Pearson (1901)]
4/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Potential applications of ID
Prediction
estimate the target dimension for dimensionality reduction.
predict the difficulty of data sets/subsets/points.
Processing
redesign machine learning algorithms by adapting them to thelocal variation of ID (queries in points with low ID are moretrustworthy).
Evaluation
explain the behavior of algorithms and evaluate them morefairly by accounting for the disparity in ID.
5/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Global vs. local models of IDLocal models of ID
Global models of ID vs. Local models of ID
6/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Global vs. local models of IDLocal models of ID
Global models of ID vs. Local models of ID
Properties of global and local models
Global models Local models
measure the dimensinality of thewhole dataset.
the dimensionality inthe neighborhood of apoint.
require a set of data points. a set of distances to thenearest neighbors.
examples Topological models(PCA), fractal models(Hausdorff Dimension,Correlation Dimension),Graph-based models,etc.
Expansion models (ED,LID).
7/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Global vs. local models of IDLocal models of ID
Expansion models
x1
x2
B1
B2
Expansion-based methods estimate the intrinsic dimension bycomparing the expansion in distance and the associated expansionin volume (number of points).Examples: Expansion Dimension (ED), Minimum NeighborhoodDistance (MiND), Local Intrinsic Dimension (LID).
8/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Global vs. local models of IDLocal models of ID
How to measure the dimension?
Dimensional query
In an Lp-norm space of dimension m, ifV is a measure of volume then :
V (B2)
V (B1)=
(x2x1
)m
The representational dimension m canbe obtained by :
m =lnV (B2)− lnV (B1)
ln x2 − ln x1
x1
x2
B1
B2
9/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Global vs. local models of IDLocal models of ID
Expansion Dimension
Expansion dimension
If volume is measured in terms ofnumber of points that are captured,then
ED(q, x1, x2) =ln k2 − ln k1ln x2 − ln x1
x1
x2
B1
B2
10/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Global vs. local models of IDLocal models of ID
Distance as a continuous random variable
Main assumption
Data can be seen as a sample generated from a set of continuousrandom variables.
Continuous random distance variable
Let X be an absolutely continuous random distance variable thatrepresents the distance from a reference point q with
probability density function fX
cumulative distribution function FX =∫fX (t)dt
(xi )1≤i<n an ordered sample of X
11/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Global vs. local models of IDLocal models of ID
Intrinsic dimensional query
Definition
When FX (x) > 0, the continuous ID ofX at distance x is defined as
IntrDimFX(x) =
limε→0+
(lnFX ((1 + ε)x)− lnFX (x)
ln(1 + ε)
)
Remarks
The volume is measured in termsof expected number of points.
X depends on reference point q.
x
ε · x
12/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Global vs. local models of IDLocal models of ID
Intrinsic dimensional query
By applying l’Hopital’s rule:
IDFX(x) =
x · fX (x)
FX (x)
x
ε · x
13/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Threshold modelID in the threshold model
Disaster prevention
14/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Threshold modelID in the threshold model
Introduction to EVT
Analogy between central and extreme values
Central values → Central limit theorem → Normal distribution
Extreme values → Pickands-Balkema-de Haan theorem →Generalized Pareto Distribution
15/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Threshold modelID in the threshold model
Generalized Pareto Distribution
Cases of the Generalized ParetoDistribution
ξ = 0 → Gumbel family
ξ > 0 → Frechet family
ξ < 0 → Weibull family (Thedistribution is upper bounded)
Distance distributions are lowerbounded.
We can model distances under athreshold w using Weibulldistribution.
Figure: pdf of the Weibulldistribution
16/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Threshold modelID in the threshold model
Modeling the tail of a distance distribution
Modeling threshold excesses
As a certain threshold w approaches 0, the excess Y = w − Xfollows a GPD with parameters ξ and σ:
Pr[Y ≤ y |Y < w ] ≈ 1−(
1 +ξy
σ
)− 1ξ
Heuristic
As w → 0, the distribution of X restricted to the tail [0,w)converges to that of a random distance variable X ∗
FX ,w (x) ≈ FX∗,w (x) =
(x
w
)− 1ξ
17/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Threshold modelID in the threshold model
Modeling the tail of a distance distribution
Modeling threshold excesses
IDFX(0) ≈ lim
x→0IDFX∗ (x) = lim
x→0
x · fX∗,w (x)
FX∗,w (x)= −1
ξ
Remarks
This is an approximation since it holds for the limitdistribution as w → 0.
We can use classical estimation methods to estimate theparameter ξ.
Note: In EVT, −1/ξ is called the index of the distribution.
18/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Our estimatorsExperimental results and interpretation
Usual statistical methods
Maximum likelihood estimator
IDFX∗ (0) = −(
1
n
n∑i=1
lnxixn
)−1
Method of moments estimator
IDFX∗ (0) = −k µkµk − xkn
where µk =n∑
i=1
xki
Probability weighted moments estimator
IDFX∗ (0) =νk
w − (k + 1)νkwhere νk =
1
n
n∑i=1
(i − 0.35
n
)k
xi
19/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Our estimatorsExperimental results and interpretation
Regularly Varying Functions Estimator
Regularly Varying Functions Estimator
IDFX∗ (0) = κ =
J∑j=1
αj ln
[FX ((1 + τjδn)xn)/FX (xn)
]J∑
j=1αj ln(1 + τjδn)
under the assumption that δn → 0+ as n→∞where : (αj)1≤j<J and (τj)1≤j<J are sequences.
20/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Our estimatorsExperimental results and interpretation
LID estimation in artificial distances
1.8
2
2.2
2.4
2.6
2.8
3
3.2
3.4
3.6
10 100 1000 10000
Est
imat
ed ID
Number of samples
MLEMoM
RV (J=1)RV (J=2)
ID
0.5
1
1.5
2
10 100 1000 10000
Sta
ndar
d D
evia
tion
/ ID
Number of samples
MLEMoM
RV (J=1)RV (J=2)
Figure: Comparison of ID estimates (ID=2)
21/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Our estimatorsExperimental results and interpretation
LID estimation in artificial distances
30
32
34
36
38
40
42
44
46
48
10 100 1000 10000
Est
imat
ed ID
Number of samples
MLEMoM
RV (J=1)RV (J=2)
ID
0.5
1
1.5
2
10 100 1000 10000
Sta
ndar
d D
evia
tion
/ ID
Number of samples
MLEMoM
RV (J=1)RV (J=2)
Figure: Comparison of ID estimates (ID=32)
22/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Our estimatorsExperimental results and interpretation
ID estimation in artificial datasets
Datasets
manifold d D description
1 10 11 Uniformly sampled sphere
2 3 5 Affine space
3 4 6 Concentrated figureconfusable with a 3d one
4 4 8 Non-linear manifold
5 2 3 2-d Helix
6 6 36 Non-linear manifold
7 2 3 Swiss-Roll
23/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Our estimatorsExperimental results and interpretation
ID estimation in artificial datasets
Datasets
manifold d D description
8 12 72 Non-linear manifold
9 20 20 Affine space
10a 10 11 Uniformly sampled hypercube
10b 17 18 Uniformly sampled hypercube
10c 24 25 Uniformly sampled hypercube
11 2 3 Mobius band 10-times twisted
12 20 20 Isotropic multivariate Gaussian
13 1 13 Curve
24/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Our estimatorsExperimental results and interpretation
What is the Intrinsic Dimension?
-10 -9 -8 -7 -6 -5 -4 -3 -2 -1 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Column 0
-10-9-8-7-6-5-4-3-2-10123456789
1011121314151617181920
Column 1
-10-8
-6-4
-20
24
6810
1214
1618
20Column 2
12
3
5
10
15
20
Estimated dimensionality
Figure: Data structures are detected by ID
25/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Our estimatorsExperimental results and interpretation
ID estimation in artificial datasets
Conclusions
Local estimators tend to over-estimate the dimensionality ofnon-linear manifolds, and to under-estimate that of linearmanifolds.
For nonlinear manifolds, global estimators have difficulty inidentifying the intrinsic dimension.
The higher the sampling rate, the lower the bias.
Global methods are very affected by noise, while localmethods are more resistant.
26/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Our estimatorsExperimental results and interpretation
Values of LID in real datasets
0K
1K
2K
3K
4K
0 5 10 15 20 25 30 35
Abs.
Fre
quency
Estimated ID
ALOI
0K
1K
2K
3K
4K
0 5 10 15 20 25 30 35
Abs.
Fre
quency
Estimated ID
MNIST
Figure: LID MLE estimates in ALOI and MNIST
27/28
Estimating Local Intrinsic Dimensionality
IntroductionModels of ID
Extreme Value TheoryID Estimation
Thank you for your attention!
28/28
Estimating Local Intrinsic Dimensionality