Intrinsic dimension Yoav Freund UCSD

UCSD DSE MAS - Intrinsic dimension...• Viseme ~ simple model using 11 DoF (Degrees of freedom) • expresseme ~ Using additional codewords to detect extremal expressions • Goal

Intrinsic and extrinsic dimensions

• Extrinsic: dimension as a video frame: 600x400

• Intrinsic: dimension as a mechanical system: 1

Dimension ~ number of degrees of freedom

Intrinsic dimension

• Suppose we have a uniform distribution over some domain.

• We partition it into n cells.

• The “Diameter” ε of the partition is the maximal distance between two points belonging to the same cell.

• As n increases, ε decreases, but at what rate?

• Let's look at some simple examples.

A line segment [0, 1]

n=2, ε=1/2

n=4, ε=1/4

n=8, ε=1/8

n=16, ε=1/16

General rule: ε=1/n

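A 2-D set: the unit square [0, 1] × [0, 1]

$n = 4$: $\varepsilon = \frac{\sqrt{2}}{2} = \frac{1}{\sqrt{2}}$

$n = 16$: $\varepsilon = \frac{\sqrt{2}}{4} = \frac{1}{2\sqrt{2}}$

$n = 64$: $\varepsilon = \frac{\sqrt{2}}{8} = \frac{1}{4\sqrt{2}}$

General formula: $\varepsilon = \sqrt{\frac{2}{n}}$, or $n = \frac{2}{\varepsilon^2}$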

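A 3-D set

$n = 1$: $\varepsilon = \sqrt{3}$

$n = 27$: $\varepsilon = \frac{\sqrt{3}}{3}$

$n = 125$: $\varepsilon = \frac{\sqrt{3}}{5}$

General formula: $\varepsilon = \frac{\sqrt{3}}{\sqrt[3]{n}}$, or $n = \frac{3\sqrt{3}}{\varepsilon^3}$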
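
The three grid examples can be reproduced with a few lines of Python. This is a minimal sketch that assumes the domain is the unit cube [0, 1]^d cut into n equal sub-cubes; the function name `grid_diameter` is made up for illustration.

```python
import math


def grid_diameter(n: int, d: int) -> float:
    """Diameter of one cell when the unit cube [0, 1]^d is cut into n equal sub-cubes."""
    cells_per_side = round(n ** (1.0 / d))
    if cells_per_side ** d != n:
        raise ValueError("n must be a perfect d-th power")
    side = 1.0 / cells_per_side
    # The longest distance inside a cube is its diagonal: side * sqrt(d).
    return math.sqrt(d) * side


# The (d, n) pairs used on the slides for the segment, the square and the cube.
for d, counts in [(1, [2, 4, 8, 16]), (2, [4, 16, 64]), (3, [1, 27, 125])]:
    for n in counts:
        print(f"d={d} n={n:4d} eps={grid_diameter(n, d):.4f}")
```

Each cell is a cube of side n^(-1/d), so its diameter is the cube's diagonal; that is where the factors of √2 and √3 come from.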

General dependence of number of elements on diameter

ε = max diameter

n = number of cells

d = dimension of space

General formula: $n = \frac{C}{\varepsilon^d}$

Alternatively: $\log n = \log C + d \log\frac{1}{\varepsilon}$

We can use the last equation to define the dimension of a dataset.
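
As a check on the general formula, the constant C can be worked out for the grid examples. This short derivation assumes the unit cube [0, 1]^d is cut into n equal sub-cubes, as in the earlier examples:

```latex
\begin{align*}
\text{side of each cell} &= n^{-1/d} \\
\varepsilon &= \sqrt{d}\, n^{-1/d} \\
n &= \left(\frac{\sqrt{d}}{\varepsilon}\right)^{d} = \frac{d^{d/2}}{\varepsilon^{d}},
\qquad C = d^{d/2}
\end{align*}
```

This gives C = 1, 2 and 3√3 for d = 1, 2 and 3, which agrees with the three examples.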

Cell counts for the line and the blob at each scale (both axes run from 0 to 1):

scale   cell size   line   blob   total
0       1/4         7      6      13
1       1/8         30     12     42
2       1/16        56     44     100
3       1/32        134    142    276
4       1/64        281    598    879
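
Counts like these can be produced by laying a grid over the data and counting the cells that contain at least one point. The sketch below assumes the data is a list of coordinate tuples; the helper name `count_occupied_cells` and the diagonal test set are hypothetical, not the data behind the slides.

```python
import math
import random


def count_occupied_cells(points, cell_size):
    """Number of grid cells of side cell_size that contain at least one point."""
    cells = {tuple(math.floor(x / cell_size) for x in p) for p in points}
    return len(cells)


# Hypothetical data: points on the diagonal of the unit square (a 1-D set).
random.seed(0)
line = [(t, t) for t in (random.random() for _ in range(20000))]

for k in range(2, 7):
    size = 1 / 2 ** k
    print(f"size=1/{2 ** k:<3d} occupied cells={count_occupied_cells(line, size)}")
```

For this diagonal the count doubles each time the cell size halves, which is the signature of a one-dimensional set.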

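Estimating intrinsic dimension

$\log n = \log C + d \log\frac{1}{\varepsilon}$

Two scales: $(n_1, \varepsilon_1), (n_2, \varepsilon_2)$; $n_1 < n_2$, $\varepsilon_1 > \varepsilon_2$

$\log\frac{n_2}{n_1} = d \log\frac{\varepsilon_1}{\varepsilon_2}$

$d = \frac{\log n_2 - \log n_1}{\log \varepsilon_1 - \log \varepsilon_2}$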
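
The two-scale estimate is a one-line computation. The sketch below applies it to the line and blob counts at cell sizes 1/8 and 1/64 from the counting example; the function name `two_scale_dimension` is made up for illustration.

```python
import math


def two_scale_dimension(n1, eps1, n2, eps2):
    """d = (log n2 - log n1) / (log eps1 - log eps2), with n1 < n2 and eps1 > eps2."""
    return (math.log(n2) - math.log(n1)) / (math.log(eps1) - math.log(eps2))


# Cell counts at sizes 1/8 and 1/64 taken from the counting example.
line_d = two_scale_dimension(30, 1 / 8, 281, 1 / 64)
blob_d = two_scale_dimension(12, 1 / 8, 598, 1 / 64)
print(f"line: {line_d:.2f}  blob: {blob_d:.2f}")
```

The base of the logarithm cancels in the ratio, so natural logs give the same estimate as base-10 logs: close to 1 for the line and close to 2 for the blob.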

Estimating the dimension

Steeper decline = lower dimension

$d = -\frac{\log n_2 - \log n_1}{\log \varepsilon_2 - \log \varepsilon_1}$

$\log_{10} \varepsilon_1 - \log_{10} \varepsilon_4 = 0.90$

blob: $\log_{10} n_4 - \log_{10} n_1 = 1.70$

line: $\log_{10} n_4 - \log_{10} n_1 = 0.99$

dimension of line $= \frac{0.99}{0.90} = 1.10 \approx 1$

dimension of blob $= \frac{1.70}{0.90} = 1.89 \approx 2$

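Estimating using kmeans++

• Add representatives using the K-means++ rule.

• After adding a representative, estimate the average square distance.

Here ε is the average square distance, so the length scale is $\sqrt{\varepsilon}$:

$d = \frac{\log n_2 - \log n_1}{\log \sqrt{\varepsilon_1} - \log \sqrt{\varepsilon_2}} = 2\,\frac{\log n_2 - \log n_1}{\log \varepsilon_1 - \log \varepsilon_2}$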
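
The procedure can be sketched in plain Python. This is an illustrative sketch, not the code used for the slides: the function names, the synthetic data (a flat 2-D square embedded in 5-D space) and the choice of which two points on the curve to compare are all assumptions.

```python
import math
import random


def kmeanspp_curve(points, max_reps, rng):
    """Add representatives by the k-means++ rule (sample a point with probability
    proportional to its squared distance to the nearest representative) and record
    (number of representatives, average squared distance) after each addition."""
    first = rng.choice(points)
    d2 = [sum((a - b) ** 2 for a, b in zip(p, first)) for p in points]
    curve = [(1, sum(d2) / len(points))]
    for n in range(2, max_reps + 1):
        new = rng.choices(points, weights=d2, k=1)[0]
        d2 = [min(old, sum((a - b) ** 2 for a, b in zip(p, new)))
              for p, old in zip(points, d2)]
        curve.append((n, sum(d2) / len(points)))
    return curve


def dimension_from_curve(curve, i, j):
    """Two-scale estimate with eps = average squared distance (hence the factor 2)."""
    (n1, e1), (n2, e2) = curve[i], curve[j]
    return 2 * (math.log(n2) - math.log(n1)) / (math.log(e1) - math.log(e2))


rng = random.Random(0)
# Hypothetical data: 2 intrinsic dimensions, 5 extrinsic dimensions.
points = [(rng.random(), rng.random(), 0.0, 0.0, 0.0) for _ in range(3000)]
curve = kmeanspp_curve(points, 128, rng)
estimate = dimension_from_curve(curve, 7, 127)  # compare n=8 with n=128
print(f"estimated dimension: {estimate:.2f}")
```

Because the representatives are sampled at random the estimate is noisy, but for this data it should land near 2 rather than near the extrinsic dimension of 5.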

Rotating hand

http://vasc.ri.cmu.edu/idb/html/motion/hand/index.html

Rotating hand dimension estimation

[Plot: one axis spans 33 to 37 (a range of 4), the other spans 0 to 6]

2*6/4 = 3

Swiss Roll

Swiss Roll dimension estimation

[Plot: one axis spans 11.5 to 18.5 (a range of 7), the other spans 0 to 6.5]

2*6.5/7 ≈ 2

The turning tea-pot

Tea-pot dimension estimation

Signal Processing

Normal Heart

Anomalous Heart

Integer and fractional dimensions

• We saw dimensions 1, 2, 3, …

• Can there be fractional dimensions?

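Koch Snowflake

[Figure: construction stages i=0, i=1, i=2, i=3]

Snowflake corresponds to i→∞

$\varepsilon_i = \frac{1}{3^i}$

$n_i = 3 \times 4^i$

$n_i = 3 \times \left(\frac{1}{\varepsilon_i}\right)^{\frac{\log 4}{\log 3}}$

dimension $= \frac{\log 4}{\log 3} = 1.26$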
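
The snowflake's dimension can be checked numerically by applying the two-scale estimate to any two construction stages. This is a small sketch; the helper name `koch_counts` is made up for illustration.

```python
import math


def koch_counts(i):
    """Segment length and number of segments after i refinement steps."""
    eps = 1 / 3 ** i   # each step shrinks every segment by a factor of 3
    n = 3 * 4 ** i     # and replaces it with 4 shorter segments
    return eps, n


eps1, n1 = koch_counts(1)
eps2, n2 = koch_counts(6)
d = (math.log(n2) - math.log(n1)) / (math.log(eps1) - math.log(eps2))
print(round(d, 2))  # prints 1.26
```

Any pair of stages gives the same value, because the constant factor 3 in n_i cancels in the difference of logs.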

Variations on a theme

• Partition count can be defined in many ways

• Hausdorff dimension: max distance between 2 points in the same cell

• VQ: average distance to representative

• Epsilon-cover: all points are at a distance of at most epsilon from a representative

• One can use grids, circles, triangles, line segments, …

• In most cases they all converge to the same number!
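
To illustrate the epsilon-cover variant, here is a hedged sketch that builds a cover greedily and plugs the cover sizes at two radii into the two-scale estimate. The greedy rule, the function name `greedy_cover_size` and the test set (a unit segment embedded in 3-D space) are assumptions made for the example.

```python
import math
import random


def greedy_cover_size(points, eps):
    """Greedy epsilon-cover: a point becomes a new representative only if it is
    farther than eps from every existing one, so all points end up within eps."""
    reps = []
    for p in points:
        if all(math.dist(p, r) > eps for r in reps):
            reps.append(p)
    return len(reps)


rng = random.Random(0)
# Hypothetical data: a straight unit segment sitting in 3-D space.
segment = [(rng.random(), 0.0, 0.0) for _ in range(2000)]

n_coarse = greedy_cover_size(segment, 0.1)
n_fine = greedy_cover_size(segment, 0.01)
d = (math.log(n_fine) - math.log(n_coarse)) / (math.log(0.1) - math.log(0.01))
print(f"cover sizes {n_coarse} and {n_fine}, estimated dimension {d:.2f}")
```

The greedy cover is not minimal, so the counts are only within a constant factor of the best possible; the slope on a log-log scale is what carries the dimension.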

How many balls of radius r does it take to cover the British coastline?

d=1.25

How many squares of size $1/2^i$ does it take to cover the British coastline?

d=1.25

Using line segments: how many line segments of length $1/2^i$ does it take to trace the British coastline?

d=1.25

A comparative study of coastlines

Only slopes are significant!

Plants

d between 1 and 2

Dimensions for different tree types (Boccio and Bastian 2011)

http://www.andreasbastian.com/fractal/fractal.html

[Image panels: original (color), preprocessed]

The Nile from the air.

More examples

Examples of objects with different Hausdorff dimension: http://en.wikipedia.org/wiki/List_of_fractals_by_Hausdorff_dimension

Application to gesture recognition

Facial Motion Capture - Avatar

Motion Capture - Avatar

• Intrinsic dimension = number of degrees of freedom < number of muscles in the human face: around 23.

• 23 markers suffice to capture all expressions!

2007

• Viseme ~ simple model using 11 DoF (degrees of freedom)

• Expresseme ~ using additional codewords to detect extremal expressions

• Goal of work: complement speech signal to improve language recognition.

Emotions and facial expressions

Human facial expressions are universal, not learned

Paul Ekman / 1963 / New Guinea

Human/ape facial expressions

https://www.youtube.com/watch?v=R6galodflTQ

Different notions of dimension and low-D embeddings

• PCA (linear dimension)

• Locally Linear Embedding

• Differential geometry

• Doubling / Hausdorff dimension

• RP-trees

Eigen-Faces

Locally Linear Embedding (LLE)

Weinberger & Saul / 2006

PCA

If the variance is dominated by the d largest eigenvalues, then the set has intrinsic dimension d.

What can we do if the set lies on a d-dimensional manifold that is not affine?

Partition space into small regions in which the set is approximately affine.
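
The eigenvalue criterion can be illustrated on a tiny case. The sketch below assumes 2-D data lying close to a straight line and uses the closed-form eigenvalues of a 2×2 covariance matrix, so it needs no linear-algebra library; the function name and the synthetic data are made up for the example.

```python
import math
import random


def covariance_eigenvalues_2d(points):
    """Eigenvalues (largest first) of the 2x2 covariance matrix of 2-D points."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / n
    syy = sum((y - my) ** 2 for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points) / n
    trace, det = sxx + syy, sxx * syy - sxy ** 2
    root = math.sqrt(max(trace * trace / 4 - det, 0.0))
    return trace / 2 + root, trace / 2 - root


rng = random.Random(0)
# Hypothetical data: points near the line y = 2x, with a little noise.
points = [(t, 2 * t + rng.gauss(0, 0.01)) for t in (rng.random() for _ in range(1000))]

lam1, lam2 = covariance_eigenvalues_2d(points)
share = lam1 / (lam1 + lam2)
print(f"share of variance in the largest eigenvalue: {share:.4f}")
```

Almost all of the variance sits in one eigenvalue, so by the criterion above this set has intrinsic dimension 1 even though each point has two coordinates.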

Manifold dimension

• Differentiable manifold dimension: dimension of the local tangent space.

• Local, infinitesimally small regions. Requires smoothness. Hard to use for sampled data.

http://mathinsight.org/dynamical_system_idea

Doubling dimension

• Similar to Hausdorff dimension.

• The doubling dimension of a set S is the smallest d such that, for any ball B of radius r, the intersection of S and B can be covered by at most 2^d balls of radius r/2.

• Global, all scales, does not require smoothness.

• More general than manifold dimension.
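A rough empirical check of the definition, for sampled data. This is a sketch under assumptions: the covering is built greedily from data points, so it gives an upper bound on the number of balls rather than the minimum, and the names `greedy_cover_size` and `doubling_estimate` are made up for the illustration.

```python
import numpy as np

def greedy_cover_size(P, radius):
    """Number of balls of the given radius, centered at data points, chosen greedily to cover P."""
    uncovered = np.ones(len(P), dtype=bool)
    count = 0
    while uncovered.any():
        center = P[np.argmax(uncovered)]  # first point not yet covered
        uncovered &= np.linalg.norm(P - center, axis=1) > radius
        count += 1
    return count

def doubling_estimate(S, center, r):
    """log2 of the number of r/2-balls a greedy cover uses for S inside the ball B(center, r)."""
    inside = S[np.linalg.norm(S - center, axis=1) <= r]
    return float(np.log2(max(greedy_cover_size(inside, r / 2), 1)))

rng = np.random.default_rng(0)
origin = np.zeros(3)

# Points on a line and on a plane, both embedded in R^3 (hypothetical test data).
line = rng.uniform(-1, 1, 2000)[:, None] * (np.ones(3) / np.sqrt(3))
plane = np.c_[rng.uniform(-1, 1, (2000, 2)), np.zeros(2000)]

est_line = doubling_estimate(line, origin, 0.5)
est_plane = doubling_estimate(plane, origin, 0.5)
print(est_line, est_plane)
```

The estimate depends only on distances between sample points, so it needs no smoothness and no coordinate system, which is the point of the last two bullets.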

Dimension can depend on scale

Dimension can depend on location

Hausdorff vs. PCA

• With PCA we can find a low-dimensional representation (e.g. the eigenvectors explaining 90% of the variance), but only through a linear mapping.

• With Hausdorff dimension we can identify arbitrary low-dimensional structure, but we get no coordinate system.

• Can we combine the two?

Data

[Figure sequence: data in the unit square (axes from 0 to 1); partition using a grid; PCA in each cell, drawn as ellipses.]

• Green ellipses: the first eigenvector explains > X% of the variance in the cell, so we are done.

• Orange ellipses: the first eigenvector explains < X% of the variance in the cell, so we subdivide the cell.

• In high dimensions the data can be divided very unequally among the cells, which leads to non-uniform accuracy.

• We need a better way to divide cells.
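The grid-plus-local-PCA test described above can be sketched as follows. The curve, the 16x16 grid, the 95% threshold (standing in for X%), and the function names are all assumptions chosen for the illustration.

```python
import numpy as np

def first_component_fraction(P):
    """Fraction of the variance of P explained by its first principal direction."""
    Pc = P - P.mean(axis=0)
    eigvals = np.linalg.eigvalsh(Pc.T @ Pc / len(P))  # ascending order
    total = eigvals.sum()
    return float(eigvals[-1] / total) if total > 0 else 1.0

def grid_cells(X, k):
    """Group points of the unit square into a k-by-k grid of cells."""
    idx = np.minimum((X * k).astype(int), k - 1)
    cells = {}
    for key, x in zip(map(tuple, idx), X):
        cells.setdefault(key, []).append(x)
    return {key: np.array(pts) for key, pts in cells.items()}

rng = np.random.default_rng(0)
t = rng.uniform(0, 1, 2000)
X = np.c_[t, 0.5 + 0.4 * np.sin(4 * np.pi * t)]  # a 1-dimensional curve in the unit square

whole = first_component_fraction(X)
cells = grid_cells(X, 16)
fractions = {key: first_component_fraction(P) for key, P in cells.items() if len(P) >= 10}

green = [key for key, f in fractions.items() if f > 0.95]    # approximately affine: done
orange = [key for key, f in fractions.items() if f <= 0.95]  # would be subdivided
sizes = sorted(len(P) for P in cells.values())               # occupancy is uneven
print(round(whole, 2), len(green), len(orange), sizes[0], sizes[-1])
```

Printing the smallest and largest cell occupancy makes the problem in the third bullet visible even in two dimensions.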

Balanced space partitioning using KD-Trees

• Goal: partition space into regions with a similar number of examples in each.

• KD-trees:

• Choose a coordinate at random.

• Partition the data at the median.

• Repeat for the leaves.

• Works well for low-dimensional spaces.

• Works poorly for data with low intrinsic dimension embedded in a high-dimensional space.

• If the dimension is D, then D levels are required to halve the max-diameter of the cells.

• D=20 -> 2^20 > 1,000,000 cells to reduce the diameter from 1 to 1/2.

Figure: points = data, circles = centroids.
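The three KD-tree steps can be sketched as below. This is a minimal illustration, not a full KD-tree (it keeps only the leaf cells, not the tree structure); the name `kd_partition` and the synthetic data are assumptions.

```python
import numpy as np

def kd_partition(X, levels, rng):
    """Recursively split every cell at the median of a randomly chosen coordinate."""
    cells = [X]
    for _ in range(levels):
        next_cells = []
        for P in cells:
            j = rng.integers(P.shape[1])  # choose a coordinate at random
            m = np.median(P[:, j])        # partition the data at the median
            next_cells += [P[P[:, j] <= m], P[P[:, j] > m]]
        cells = next_cells                # repeat for the leaves
    return cells

rng = np.random.default_rng(0)
X = rng.uniform(size=(1024, 20))          # D = 20
cells = kd_partition(X, levels=3, rng=rng)
sizes = [len(P) for P in cells]
print(len(cells), sizes)

# Halving the diameter needs a split along every one of the D coordinates:
cells_needed = 2 ** 20
print(cells_needed)
```

Median splits keep the cells balanced, but with 1024 points there is far too little data to fill the 2^20 cells needed to halve the diameter in D = 20.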

Random-Projection trees

• Goal: partition space into regions with a similar number of examples in each.

• AND create shallow trees if the data has low intrinsic dimension.

• RP-trees:

• Choose a direction uniformly at random.

• Partition the data at the median.

• Repeat for the leaves.

• Works well for datasets with low covariance dimension, even if embedded in a high-dimensional space.

• If the covariance dimension is d, then d levels are required to halve the max-diameter of the cells.
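The simplified splitting rule on this slide can be sketched as follows. It implements only what the bullets state (random direction, median split); the published RP-tree algorithm has further details that are not reproduced here. The names `rp_partition` and `avg_vq_error` and the synthetic data are assumptions.

```python
import numpy as np

def rp_partition(X, levels, rng):
    """Recursively split every cell at the median of its projection on a random direction."""
    cells = [X]
    for _ in range(levels):
        next_cells = []
        for P in cells:
            v = rng.normal(size=P.shape[1])
            v /= np.linalg.norm(v)        # direction uniform on the unit sphere
            proj = P @ v
            m = np.median(proj)           # partition the data at the median
            next_cells += [P[proj <= m], P[proj > m]]
        cells = next_cells                # repeat for the leaves
    return cells

def avg_vq_error(P):
    """Mean squared distance of the points in P to their centroid."""
    return float(((P - P.mean(axis=0)) ** 2).sum(axis=1).mean())

rng = np.random.default_rng(0)
# Hypothetical data: intrinsic dimension 2, embedded in R^50.
X = rng.normal(size=(1024, 2)) @ rng.normal(size=(2, 50))

cells = rp_partition(X, levels=3, rng=rng)
sizes = [len(P) for P in cells]
err_before = avg_vq_error(X)
err_after = float(np.mean([avg_vq_error(P) for P in cells]))
print(sizes, err_before, err_after)
```

The only change from the KD-tree rule is that the split direction is a random unit vector rather than a coordinate axis, so every split has a chance of cutting along the directions in which the data actually varies.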

Splitting a set with low covariance dimension

• “Optimal” split: orthogonal to the largest eigenvector.

• Split on a random direction: almost optimal with constant probability.

Theoretical properties of RP-trees

• Space: R^D

• Measure of progress: average cell diameter

• Tree-structured VQ: average diameter halved every D tree levels

• Data of intrinsic dimension d<<D

• RP-tree: average diameter halved every d tree levels (with constant probability)

Dasgupta & Freund, STOC08


The turning tea-pot

Charting the turning teapot manifold

Problem: put an unordered set of images in order of rotational angle.

Using RP-trees to represent high-dimensional data

Page 177: UCSD DSE MAS - Intrinsic dimension...• Viseme ~ simple model using 11 DoF (Degrees of freedom) • expresseme ~ Using additional codewords to detect extremal expressions • Goal

Using RP-trees to represent high-dimensional data

• Goal: map each data point to a localized PCA projection.

Page 178: UCSD DSE MAS - Intrinsic dimension...• Viseme ~ simple model using 11 DoF (Degrees of freedom) • expresseme ~ Using additional codewords to detect extremal expressions • Goal

Using RP-trees to represent high-dimensional data

• Goal: map each data point to a localized PCA projection.

• Identify the sufficiently linear pieces. (percent variance explained)

Page 179: UCSD DSE MAS - Intrinsic dimension...• Viseme ~ simple model using 11 DoF (Degrees of freedom) • expresseme ~ Using additional codewords to detect extremal expressions • Goal

Using RP-trees to represent high-dimensional data

• Goal: map each data point to a localized PCA projection.

• Identify the sufficiently linear pieces. (percent variance explained)

• Combine representations from different nodes along the path.
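A localized PCA projection for a single tree node might look like the sketch below. The helper names `local_pca` and `encode` are hypothetical, and a short arc of a circle stands in for the data that falls into one RP-tree cell. Combining representations along the path is not shown.

```python
import numpy as np

def local_pca(P, k):
    """Mean, top-k principal directions, and fraction of variance they explain for one cell."""
    mean = P.mean(axis=0)
    _, s, Vt = np.linalg.svd(P - mean, full_matrices=False)
    explained = float((s[:k] ** 2).sum() / (s ** 2).sum())
    return mean, Vt[:k], explained

def encode(x, mean, components):
    """Local coordinates of x in the cell's PCA basis."""
    return components @ (x - mean)

rng = np.random.default_rng(0)

def circle(t):
    return np.c_[np.cos(t), np.sin(t), np.zeros_like(t)]

whole = circle(rng.uniform(0, 2 * np.pi, 1000))  # the full (non-affine) set
leaf = circle(rng.uniform(0, 0.3, 1000))         # stand-in for one small cell

_, _, explained_whole = local_pca(whole, k=1)
mean, components, explained_leaf = local_pca(leaf, k=1)
coords = encode(leaf[0], mean, components)
print(explained_whole, explained_leaf, coords)
```

The percent-variance-explained value is what decides whether a piece is "sufficiently linear": it is low for the whole circle and close to 1 for the small arc.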


Modeling the manifold of handwritten digits

• Using the MNIST digit dataset.

• We use RP-trees to model one digit at a time.

• Can be a useful pre-processing step for digit recognition.

RP-tree for the digit 1

2D distribution of the digit 1

KD-tree vs. RP-tree performance

[Figure: average VQ error (y-axis, about 950 to 1350) vs. tree levels (x-axis, 1 to 5) for four methods: k-d tree (random coord), k-d tree (max var coord), RP tree, PCA tree.]

Unexplained variance vs. tree depth

Another application of RP-trees

Automatic Cameraman

• Controlling a PTZ camera using audio triangulation

• Learning low-dimensional manifolds from sampled data.

• http://www.cse.ucsd.edu/~yfreund/cameraman/index.html

Beamforming basics

Arrays allow us to focus on a source; these techniques are called beamformers. The signal arrives with a delay θ_ij between microphones i and j.

[Figure: an acoustic source s(x, y) and the delayed copies of its signal received at the microphones.]

(Embedded slide: EG2007, Manifold Learning Seminar, "Mic Arrays", June 7, 2007, slide 3 of 21.)

Calibration process

• Goal: map measured delays to pan-tilt direction of camera.

• Training data:

• High-correlation delay for each microphone pair (21)

• Camera pan+tilt (2)

delay 1,2   delay 1,3   delay 2,3   . . .   pan     tilt
9±2         35±1        ?           . . .   77±2    31±2
13±2        30±2        50±20       . . .   80±2    33±2

The delay manifold

• 7 microphones

• 21 microphone pairs

• 2 camera coordinates: pan,tilt

• Together: 23 dimensional space

• Data lies on (or close to) a smooth 3-dimensional manifold.

• If we can learn the manifold from data, we can map a delay vector to (pan, tilt).
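The counts on this slide follow from simple combinatorics, sketched below. The microphone numbering 1 to 7 and the variable names are assumptions made for the illustration.

```python
from itertools import combinations
from math import comb

n_mics = 7
pairs = list(combinations(range(1, n_mics + 1), 2))  # (1, 2), (1, 3), ..., (6, 7)
n_delays = len(pairs)                                # one delay per microphone pair
ambient_dim = n_delays + 2                           # delays plus (pan, tilt)

print(n_delays, comb(n_mics, 2), ambient_dim)
```

The 23 coordinates are the extrinsic dimension of each training example; the intrinsic dimension stated on the slide is only 3.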

Delay manifold for laboratory setup

Mapping of hallway using the top 2 eigenvectors, for one node of the RP-tree.

Summary I

• Dimensionality reduction and lossy compression are methods for reducing the size of data without losing much of the information.

• PCA is the most popular method, but it can only find linear mappings. We say that PCA finds a k-dimensional representation if > X% of the variance is explained by the top k eigenvectors. Equivalently, the top k eigenvalues sum to > X% of the total variance.

• PCA dimension is a global concept.

An old video

• https://www.youtube.com/watch?v=rrOy6LpL940

Summary 2

• Vector quantization is generic but it only finds a partition, not a mapping into new coordinates.

• Scaling dimension / Hausdorff dimension / Metric dimension: characterizes the rate at which the number of parts in the partition increases as the radius/diameter of the parts decreases.

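$$\log n = \log C + d \log \frac{1}{\epsilon}$$

$$\log \frac{n_2}{n_1} = d \log \frac{\epsilon_1}{\epsilon_2}$$

$$d = \frac{\log \frac{n_2}{n_1}}{\log \frac{\epsilon_1}{\epsilon_2}}$$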
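
As a rough illustration of the last formula, the sketch below counts how many balls of radius ε a greedy procedure needs to cover a data set at two scales, then plugs the two counts into the expression for d. The greedy cover is a stand-in chosen for brevity (an assumption; the slides do not specify how the partition is built), and the function names and the test data (a unit circle embedded in 3 dimensions) are hypothetical.

```python
import numpy as np

def greedy_cover_size(X, eps):
    """Number of centers picked greedily so that every point of X
    lies within distance eps of some center."""
    uncovered = np.ones(len(X), dtype=bool)
    n = 0
    while uncovered.any():
        center = X[np.argmax(uncovered)]          # first point not yet covered
        dist = np.linalg.norm(X - center, axis=1)
        uncovered &= dist > eps                   # everything within eps is now covered
        n += 1
    return n

def scaling_dimension(n1, n2, eps1, eps2):
    """d = log(n2 / n1) / log(eps1 / eps2)."""
    return np.log(n2 / n1) / np.log(eps1 / eps2)

# Data (assumption): 2000 evenly spaced points on a unit circle in R^3.
theta = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
X = np.column_stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)])

eps1, eps2 = 0.2, 0.05
n1 = greedy_cover_size(X, eps1)
n2 = greedy_cover_size(X, eps2)
d = scaling_dimension(n1, n2, eps1, eps2)
print(n1, n2, d)
```

The extrinsic dimension here is 3, but the estimate should come out close to 1, since shrinking the radius by a factor of 4 requires roughly 4 times as many balls.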

Summary 3

• Low-dimensional manifold: a subset of the space that is defined by a set of constraints.

• Not a statistical concept

• The local dimension of the manifold at a point is the dimension of the tangent hyperplane at that point.

• Dimension is an infinitesimal concept.
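
One way to make the bullets above concrete, assuming the constraints are smooth equalities g(x) = 0 and the point is a regular one: the tangent hyperplane is the null space of the Jacobian of g, so the local dimension is D minus the rank of that Jacobian. The sketch below is only an illustration under those assumptions; the helper names and the two example manifolds (a sphere and a circle in 3 dimensions) are not from the slides.

```python
import numpy as np

def constraint_jacobian(g, x, h=1e-6):
    """Central-difference Jacobian of g: R^D -> R^m at the point x."""
    x = np.asarray(x, dtype=float)
    columns = []
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = h
        diff = np.atleast_1d(g(x + step)) - np.atleast_1d(g(x - step))
        columns.append(diff / (2.0 * h))
    return np.stack(columns, axis=1)              # shape (m, D)

def local_dimension(g, x, tol=1e-6):
    """Dimension of the tangent space at x: D - rank(Jacobian of g at x)."""
    x = np.asarray(x, dtype=float)
    J = constraint_jacobian(g, x)
    return x.size - int(np.linalg.matrix_rank(J, tol=tol))

def sphere(p):
    # One constraint: |p|^2 = 1
    return np.array([p @ p - 1.0])

def circle(p):
    # Two constraints: x^2 + y^2 = 1 and z = 0
    return np.array([p[0] ** 2 + p[1] ** 2 - 1.0, p[2]])

sphere_dim = local_dimension(sphere, [0.0, 0.0, 1.0])
circle_dim = local_dimension(circle, [1.0, 0.0, 0.0])
print(sphere_dim, circle_dim)
```

No data and no statistics are involved: the answer depends only on derivatives of the constraints at a single point, which is the sense in which this notion of dimension is infinitesimal.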

Local covariance dimension

• A local but not an infinitesimal concept.

• Perform PCA on the data that is in a ball.

• RP-Trees (random projection trees): a space-partitioning data structure that, unlike KD-trees, performs well when the intrinsic dimension is low.
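
"Perform PCA on the data that is in a ball" can be sketched as follows. This is an illustrative sketch only: the function name, the 95% variance threshold, and the example data (a unit circle in 3 dimensions) are assumptions, and the ball is found by a brute-force distance computation rather than with an RP-tree.

```python
import numpy as np

def local_covariance_dimension(X, center, radius, threshold=0.95):
    """PCA dimension of the points of X that fall inside the ball
    of the given radius around `center`."""
    dist = np.linalg.norm(X - np.asarray(center, dtype=float), axis=1)
    ball = X[dist <= radius]
    if len(ball) < 2:
        raise ValueError("need at least two points inside the ball")
    eigvals = np.linalg.eigvalsh(np.cov(ball, rowvar=False))[::-1]
    eigvals = np.clip(eigvals, 0.0, None)         # remove round-off negatives
    explained = np.cumsum(eigvals) / eigvals.sum()
    k = int(np.searchsorted(explained, threshold, side="right")) + 1
    return k, len(ball)

# Data (assumption): 2000 evenly spaced points on a unit circle in R^3.
theta = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
X = np.column_stack([np.cos(theta), np.sin(theta), np.zeros_like(theta)])

center = np.array([1.0, 0.0, 0.0])
small_k, small_count = local_covariance_dimension(X, center, radius=0.2)
large_k, large_count = local_covariance_dimension(X, center, radius=3.0)
print(small_k, large_k)
```

With the small radius the arc inside the ball is nearly straight, so one eigenvector explains almost all the variance. With a radius that contains the whole circle the result is the global PCA dimension, which is 2. The answer therefore depends on the chosen radius, which is why this is a local rather than an infinitesimal concept.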

Future direction: learning piecewise-linear control

Tedrake et al. “Learning to walk in 20 minutes” Science 2004
