Bi-criteria Linear-time Approximations for Generalized k-Mean/Median/Center. Speaker: Dan Feldman. Joint work with Amos Fiat, Danny Segev & Micha Sharir.

Page 1: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Bi-criteria Linear-time Approximations for

Generalized k-Mean/Median/Center

Speaker: Dan Feldman

Joint work with

Amos Fiat, Danny Segev & Micha Sharir

Page 2: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

1-Line Median

Let $P$ be a set of $n$ points in $\mathbb{R}^d$.

Page 3: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

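1-Line Median

The line median $\ell^*$ minimizes $\sum_{p \in P} \mathrm{dist}(p, \ell)$ over all lines $\ell$.

[Figure: a point $p$, the line $\ell^*$, and the distance $\mathrm{dist}(p, \ell^*)$]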

Page 4: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

k-Line Median

Page 5: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

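$(1+\varepsilon)$-Approximation (PTAS)

$L$ (a set of $k$ lines) is a $(1+\varepsilon)$-approximation if

$\sum_{p \in P} \mathrm{dist}(p, L) \le (1+\varepsilon) \cdot \mathrm{OPT}$,

where $\mathrm{OPT} = \min_{|L'| = k} \sum_{p \in P} \mathrm{dist}(p, L')$.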

Page 6: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

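Over all sets $L$ of $k$ lines:

k-Line Median: minimize $\sum_{p \in P} \mathrm{dist}(p, L)$

k-Line Mean: minimize $\sum_{p \in P} \mathrm{dist}^2(p, L)$

k-Line Center: minimize $\max_{p \in P} \mathrm{dist}(p, L)$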
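The three objectives on this slide (sum of distances, sum of squared distances, maximum distance) can be evaluated for a given set of lines with a few lines of code. The following is a minimal sketch, not part of the talk: the function names and the representation of a line as a pair (point on the line, direction vector) are assumptions made for illustration.

```python
import math

def dist_point_line(p, a, u):
    """Euclidean distance from point p to the line {a + s*u : s real}.

    Assumes u is a non-zero direction vector (hypothetical representation).
    """
    w = [pi - ai for pi, ai in zip(p, a)]
    # Parameter of the orthogonal projection of p onto the line.
    s = sum(wi * ui for wi, ui in zip(w, u)) / sum(ui * ui for ui in u)
    return math.sqrt(sum((wi - s * ui) ** 2 for wi, ui in zip(w, u)))

def line_costs(points, lines):
    """Return the median, mean and center costs of a set of lines for the points."""
    # dist(p, L) is the distance from p to its nearest line in L.
    d = [min(dist_point_line(p, a, u) for a, u in lines) for p in points]
    return {
        "median": sum(d),                 # sum of distances
        "mean": sum(x * x for x in d),    # sum of squared distances
        "center": max(d),                 # maximum distance
    }

if __name__ == "__main__":
    pts = [(0.0, 1.0), (2.0, -1.0), (5.0, 3.0)]
    x_axis = ((0.0, 0.0), (1.0, 0.0))
    print(line_costs(pts, [x_axis]))
```

Evaluating a candidate is easy; the hard part, as the next slides argue, is choosing the $k$ lines.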

Page 7: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Can you cover P by k lines?

Page 9: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Equivalently: does $\mathrm{OPT} = 0$?

Page 10: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Equivalently: does $2 \cdot \mathrm{OPT} = 0$?

Page 11: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Equivalently: does $n \cdot \mathrm{OPT} = 0$?

Page 12: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

k-Line Median/Mean/Center is NP-Hard

It is NP-hard to decide whether a set $P \subseteq \mathbb{R}^2$ of $n$ points can be covered by $k$ lines [Megiddo and Tamir, 1983].

Hence, unless P = NP, there is no non-trivial approximation to the k-line median/mean/center that takes $\mathrm{poly}(k)$ time.

Page 13: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

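$(\alpha, \beta)$-Approximation

$L$ is a $\beta$-approximation for the k-line median if

$\sum_{p \in P} \mathrm{dist}(p, L) \le \beta \cdot \mathrm{OPT}$

$L$ is an $(\alpha, \beta)$-approximation for the k-line median if

$|L| \le \alpha$ and $\sum_{p \in P} \mathrm{dist}(p, L) \le \beta \cdot \mathrm{OPT}$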

Page 14: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Example: The 2-Line Median of P

$L^* = \{\ell_1^*, \ell_2^*\}$

[Figure: the point set $P$ with the two optimal lines $\ell_1^*$ and $\ell_2^*$]

Page 15: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

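$(3, \tfrac{1}{2})$-approximation to the 2-Line Median of P

$L = \{\ell_1, \ell_2, \ell_3\}$

$\sum_{p \in P} \mathrm{dist}(p, L) \le \tfrac{1}{2} \cdot \mathrm{OPT}$

[Figure: the point set $P$ with the three lines $\ell_1, \ell_2, \ell_3$]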

Page 16: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

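$(4, 10)$-approximation to the 2-Line Median of P

$L = \{\ell_1, \ell_2, \ell_3, \ell_4\}$

$\sum_{p \in P} \mathrm{dist}(p, L) \le 10 \cdot \mathrm{OPT}$

[Figure: the point set $P$ with the four lines $\ell_1, \ell_2, \ell_3, \ell_4$]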
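The bi-criteria definition relaxes two quantities at once: the number of lines and the cost. A small sketch of the check, under the reading used in the examples above (at most $\alpha$ lines, cost at most $\beta \cdot \mathrm{OPT}$); the function name and argument names are hypothetical and not from the talk.

```python
def is_bicriteria_approx(num_lines, cost, opt_cost, alpha, beta):
    """Check the (alpha, beta) condition for a candidate set of lines.

    num_lines: size of the candidate set L
    cost:      its objective value (e.g. sum of distances)
    opt_cost:  OPT, the optimal cost with exactly k lines (assumed known here)
    """
    return num_lines <= alpha and cost <= beta * opt_cost

if __name__ == "__main__":
    # Four lines whose cost is within a factor 10 of OPT.
    print(is_bicriteria_approx(4, 95.0, 10.0, alpha=4, beta=10))
```

In practice OPT is unknown (computing it is NP-hard), so a check like this is only useful on instances where the optimum has been planted or bounded.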

Page 17: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

k j-Flat Median/Mean/Center

A set F of k j-dimensional flats that minimizes the

sum of distances/

sum of squared distances/

maximum distance

from P to F

Page 18: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Results for $j \ge 1$, $k = 1$

• Mean:

$O(nd^2)$ time, exact (SVD) [Pearson, 1901]

$nd \cdot \mathrm{poly}(j, 1/\varepsilon)$ time, PTAS [Deshpande et al.] [Sarlos] [Har-Peled] (2006)

• Median:

$nd \cdot 2^{\mathrm{poly}(j, 1/\varepsilon)}$ time, PTAS [Shyamalkumar & Varadarajan, 2007]

• Center:

$nd \cdot \exp\bigl(\mathrm{poly}(2^{(j^2)}, 1/\varepsilon)\bigr)$ time, PTAS [Har-Peled & Varadarajan, 2004]

Page 19: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Results for $j = 1$, $k \ge 1$

• Mean/Median:

$nd \cdot k^{O(1)} + (\varepsilon^{-1} \log n)^{O(dk)}$ time, PTAS [Feldman et al., 2006]

• Center:

$n \log n \cdot (1/\varepsilon)^{\mathrm{poly}(d, k)}$ time, PTAS [Agarwal et al., 2002]

$O(dnk^3 \log^4 n)$ time for an $\bigl(O(dk \log k), 8\bigr)$-approximation [Agarwal & Procopiuc, 2000]

Page 20: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Results for $j, k > 1$

PTAS that takes $d \cdot n^{\mathrm{poly}(j, k, 1/\varepsilon)}$ time.

Mean: [Deshpande et al., 2006]
Median: [Shyamalkumar & Varadarajan, 2007]
Center: [Har-Peled & Varadarajan, 2002]

Page 21: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Our Result

A set $F$ which is an $(\alpha, \beta)$-approximation to the k j-flat mean/median/center of $P$ simultaneously, where

$|F| \le \alpha = \log n \cdot (jk \log\log n)^{O(j)}$

$\beta = 2^{O(j)}$

in $dn \cdot (jk)^{O(j)}$ time.

Page 22: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Applications for $j = 1$, $k \ge 1$

First $(1+\varepsilon)$-approximations (with exactly $k$ lines) that take time linear in $n$:

• for the k-line median/mean of $P \subseteq \mathbb{R}^d$, using [Feldman et al., 2006]

• for the k-line center of $P \subseteq \mathbb{R}^d$, using [Agarwal et al., 2002]

Page 23: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Applications for $k = 1$, $j \ge 1$

• PTAS for the 1 j-flat median, using [Feldman et al., 2006]

• More efficient algorithm for the 1 j-flat center, using [Har-Peled & Varadarajan, 2004]

• First coresets for the k-line and j-flat median/mean/center

Page 24: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

The Algorithm

Page 25: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Input: a set of $n$ points $P \subset \mathbb{R}^d$, and integers $k, j \ge 1$.

Page 26: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Output (with high probability): $F$, an $(\alpha, \beta)$-approximation to the k j-flat mean/median/center of $P$

Page 28: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Initialization

1) $t \leftarrow 1$ (counter for iterations)

2) $F \leftarrow \emptyset$ (the output set of j-flats)

Page 29: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

3) Pick a random sample $S_t \subset P$, $|S_t| = O(j^2 k^2 t)$

(t = 1)

Page 30: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

$F_t$ := all possible j-dimensional flats that pass through $(j+1)$ points of $S_t$

(t = 1)

Page 31: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

4) $F \leftarrow F \cup F_t$

(t = 1)

Page 32: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

5) $\forall p \in P$: compute $\mathrm{dist}(p, F_t)$

(t = 1)

Page 33: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

6) Remove $P_t$: the half of $P$ that is closer to $F_t$

(t = 1)

Page 35: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

7) $t \leftarrow t + 1$

8) Repeat steps 3 to 6:

Page 36: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

3) Pick a random sample $S_t \subset P$, $|S_t| = O(j^2 k^2 t)$

(t = 2)

Page 37: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

$F_t$ := all possible j-dimensional flats that pass through $(j+1)$ points of $S_t$

(t = 2)

Page 38: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

4) $F \leftarrow F \cup F_t$

(t = 2)

Page 39: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

5) $\forall p \in P$: compute $\mathrm{dist}(p, F_t)$

(t = 2)

Page 40: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

6) Remove $P_t$: the half of $P$ that is closer to $F_t$

(t = 2)

Page 43: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

7) $t \leftarrow t + 1$

8) Repeat steps 3 to 6 until there are no more input points.

9) Return $F$.
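The loop above (sample, build all flats through the sample, keep them, discard the closer half of the points, repeat) can be sketched for the case of lines ($j = 1$). This is an illustrative sketch only, not the authors' implementation: the function names are hypothetical, the sample size is a caller-supplied parameter because the slides give only an asymptotic bound, and the fallback for a degenerate sample is an added assumption.

```python
import itertools
import math
import random

def dist_point_line(p, a, b):
    """Distance from p to the line through the distinct points a and b."""
    u = [bi - ai for ai, bi in zip(a, b)]
    w = [pi - ai for ai, pi in zip(a, p)]
    s = sum(wi * ui for wi, ui in zip(w, u)) / sum(ui * ui for ui in u)
    return math.sqrt(sum((wi - s * ui) ** 2 for wi, ui in zip(w, u)))

def bicriteria_lines(points, sample_size, seed=0):
    """Sketch of the sampling loop for j = 1; points are tuples of floats.

    sample_size stands in for |S_t| (an assumption: constants are not given).
    Returns F as a list of lines, each a pair of distinct points.
    """
    rng = random.Random(seed)
    remaining = list(points)
    F = []
    while remaining:
        # Step 3: random sample S_t of the remaining points.
        S = rng.sample(remaining, min(sample_size, len(remaining)))
        # F_t: all lines through two distinct sample points.
        Ft = [(a, b) for a, b in itertools.combinations(S, 2) if a != b]
        if not Ft:
            # Assumption: with no two distinct sample points, use an
            # arbitrary axis-parallel line through the first sample point.
            a = S[0]
            Ft = [(a, (a[0] + 1.0,) + tuple(a[1:]))]
        # Step 4: F <- F union F_t.
        F.extend(Ft)
        # Steps 5-6: compute dist(p, F_t) and remove the closer half.
        remaining.sort(key=lambda p: min(dist_point_line(p, a, b) for a, b in Ft))
        remaining = remaining[(len(remaining) + 1) // 2:]
    return F

if __name__ == "__main__":
    pts = [(float(i), 0.0) for i in range(10)] + [(0.0, float(i)) for i in range(1, 10)]
    print(len(bicriteria_lines(pts, sample_size=4)))
```

Because at least half of the remaining points are discarded in every round, the loop runs for about $\log_2 n$ rounds, which is where the $\log n$ factor in the size of $F$ comes from.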

Page 44: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Proof of Correctness for the case of lines ($j = 1$)

Page 46: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Let $F^*$ be any set of $k$ lines in $\mathbb{R}^d$.

Consider the set $F_t$ that is constructed during the $t$-th iteration.

Page 47: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

A point $b \in P$ is bad for $F_t$ if:

$\mathrm{dist}(b, F_t) > 4\,\mathrm{dist}(b, F^*)$

Page 48: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

A point $g \in P$ is good for $F_t$ otherwise:

$\mathrm{dist}(g, F_t) \le 4\,\mathrm{dist}(g, F^*)$

Page 49: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Main Technical Theorem

We can map every bad point $b \in P_t$ to a distinct good point $g \in P_{t+1}$.

Page 51: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

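$\mathrm{dist}(b, F) \le \mathrm{dist}(b, F_t)$, because $F \supseteq F_t$.

Since $b \in P_t$ and $g \in P_{t+1}$:

$\mathrm{dist}(b, F_t) \le \mathrm{dist}(g, F_t)$

Since $g$ is good for $F_t$:

$\mathrm{dist}(g, F_t) \le 4\,\mathrm{dist}(g, F^*)$

Therefore: $\mathrm{dist}(b, F) \le 4\,\mathrm{dist}(g, F^*)$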

Page 52: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

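Applied for k-line Median

$$\sum_{p \in P} \mathrm{dist}(p, F) = \sum_{g} \mathrm{dist}(g, F) + \sum_{b} \mathrm{dist}(b, F) \le \sum_{g} 4\,\mathrm{dist}(g, F^*) + \sum_{g} 4\,\mathrm{dist}(g, F^*) \le 8 \sum_{p \in P} \mathrm{dist}(p, F^*)$$

Similarly for the k j-flat mean/center of $P$.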

Page 53: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Proof of the Technical Theorem

• The number of bad points is at most $|B| = |P_t|/8$

• $|P_{t+1}| = |P_t|/2$

The number of good points in $P_{t+1}$ is at least

$|P_{t+1}| - |B| \ge \frac{|P_t|}{2} - \frac{|P_t|}{8} \ge |B|$

Page 54: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

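[Figure: a point $p$, the points $b_0$ and $b_1$, and the lines $f^*$, $\ell$ and $f_1$]

Claim: only $|B| = |P_t|/8$ points are bad for $f_1 \in F_t$. Every other point $p$ satisfies

$\mathrm{dist}(p, f_1) \le 4\,\mathrm{dist}(p, f^*)$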

Page 55: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

[Figure: the line $f^*$ and the set $B_0$]

$B_0$: the $|P_t|/16$ closest points to $f^*$

Page 56: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

[Figure: the line $f^*$, the set $B_0$ and the point $b_0$]

$B_0$: the $|P_t|/16$ closest points to $f^*$

$B_0$ probably contains a point $b_0 \in S_t$

Page 58: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

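[Figure: the lines $f^*$ and $f_0$, the set $B_0$, the point $b_0$ and a white point $p$]

For every white point $p \in P \setminus B_0$:

$\mathrm{dist}(p, f_0) \le \mathrm{dist}(p, f^*) + \mathrm{dist}(b_0, f^*) \le 2\,\mathrm{dist}(p, f^*)$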

Page 60: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

[Figure: the line $f_0$, the set $B_1$ and the points $b_0$, $b_1$]

$B_1$: the $|P_t|/16$ points with smallest angle to $f_0$

$B_1$ probably contains a point $b_1 \in S_t$

Page 61: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

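[Figure: the lines $f_0$ and $f_1$, the set $B_1$, the points $b_0$, $b_1$ and a white point $p$]

For every white point $p \in P \setminus B_1$:

$\mathrm{dist}(p, f_1) \le 2\,\mathrm{dist}(p, f_0)$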

Page 62: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

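[Figure: the lines $f^*$, $f_0$ and $f_1$, the points $b_0$, $b_1$ and a white point $p$]

$\mathrm{dist}(p, f_1) \le 2\,\mathrm{dist}(p, f_0) \le 4\,\mathrm{dist}(p, f^*)$

All the white points are good for $f_1$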

Page 63: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

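[Figure: the lines $f^*$, $f_0$ and $f_1$, the points $b_0$, $b_1$ and a white point $p$]

$|B| = |B_0| + |B_1| = \frac{|P_t|}{16} + \frac{|P_t|}{16} = \frac{|P_t|}{8}$

Only the black points $B$ are bad for $F_t$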

Page 64: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center
Page 65: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Lines/Edges Detection

Page 66: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

Prediction/Analyzing Time Series

Page 68: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

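$\theta(p, f_1) \le \theta(p, f_0) + \theta(b_1, f_0) \le 2\,\theta(p, f_0)$

or, $\frac{\theta(p, f_1)}{2} \le \theta(p, f_0)$,

so:

$\sin\theta(p, f_1) = 2 \sin\frac{\theta(p, f_1)}{2} \cos\frac{\theta(p, f_1)}{2} \le 2 \sin\frac{\theta(p, f_1)}{2} \le 2 \sin\theta(p, f_0)$.

[Figure: the points $b$ and $q$, the origin $0$, the line $\ell$ and the span $\mathrm{sp}\{b\}$]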

Page 69: Bi-criteria Linear-time Approximations for Generalized  k-Mean/Median/Center

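So, we have $\sin\theta(p, f_1) \le 2 \sin\theta(p, f_0)$.

The distance from $p$ to $f_1$ is thus bounded by

$\mathrm{dist}(p, f_1) = \|p\| \sin\theta(p, f_1) \le \|p\| \cdot 2 \sin\theta(p, f_0) = 2\,\mathrm{dist}(p, f_0)$.

[Figure: the points $b$ and $q$, the origin $0$, the line $\ell$ and the span $\mathrm{sp}\{b\}$]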
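The step from $\theta(p, f_1) \le 2\,\theta(p, f_0)$ to $\sin\theta(p, f_1) \le 2\sin\theta(p, f_0)$ can be sanity-checked numerically. The sketch below assumes, as an added hypothesis not stated on the slides, that both angles lie in $[0, \pi/2]$ (the usual range for the angle between a point and a line through the origin); the function name is hypothetical.

```python
import math
import random

def check_sine_doubling(trials=10000, seed=0):
    """Randomly test: theta1 <= 2*theta0 implies sin(theta1) <= 2*sin(theta0).

    Both angles are drawn from [0, pi/2] (an assumption of this sketch).
    Returns False as soon as a counterexample is found.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        theta0 = rng.uniform(0.0, math.pi / 2)
        # theta1 is any angle in [0, pi/2] not exceeding 2 * theta0.
        theta1 = rng.uniform(0.0, min(2 * theta0, math.pi / 2))
        if math.sin(theta1) > 2 * math.sin(theta0) + 1e-12:
            return False
    return True

if __name__ == "__main__":
    print(check_sine_doubling())
```

A random test is not a proof; it only confirms that the inequality chain on the slide is plausible under the stated range of angles.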