Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
Administrivia, Plan
ā¢ Administrivia:
ā NO CLASS next Tuesday 11/3 (holiday)
ā¢ Plan:
ā Earth-Mover Distance
ā¢ Scriber?
2
Earth-Mover Distance
ā¢ Definition:
ā Given two sets š“, šµ of points in a metric space
ā šøšš·(š“, šµ) = min cost bipartite matching between š“ and šµ
ā¢ Which metric space?
ā Can be plane, ā2, ā1ā¦
ā¢ Applications in image vision
Images courtesy of Kristen Grauman
Embedding EMD into ā1
ā¢ Why ā1?
ā¢ At least as hard as ā1
ā Can embed 0,1 š into EMD with distortion 1
ā¢ ā1 is richer than ā2
ā¢ Will focus on integer grid Ī 2:
4
Embedding EMD into ā1
ā¢ Theorem: Can embed EMD over Ī 2 into ā1 with distortion š log Ī . In fact, will construct a randomized š: 2 Ī 2
ā ā1 such that:ā for any š“, šµ ā Ī 2:
šøšš· š“, šµ ā¤ š¬ ||š š“ ā š šµ ||1 ā¤ š log Ī ā šøšš·(š“, šµ)
ā time to embed a set of š points: š š logĪ .
ā¢ Consequences:ā Nearest Neighbor Search: š(š logĪ ) approximation with
š(š š1+1/š) space, and š(š1/š ā š log Ī) query time.
ā Computation: š(logĪ) approximation in š š logĪ timeā¢ Best known: 1 + š approximation in š š time [ASā12]
[Charikarā02, Indyk-Thaperā03]
What if š“ ā |šµ| ?
ā¢ Suppose:
ā š“ = š
ā šµ = š < š
ā¢ Define
šøšš·ā š“, šµ = Ī š ā š + šššš“ā²,š
šāš“ā²
š(š, š š )
where
š“ā² ranges over all subsets of š“ of size š
š: š“ā² ā šµ ranges over all 1-to-1 mappings
For optimal š“ā², call š ā š“\š“ā² unmatched
6
Embedding EMD over small gridā¢ Suppose = 3
ā¢ š(š“) has nine coordinates, counting # points in each integer point
ā š(š“) = (2,1,1,0,0,0,1,0,0)
ā š(šµ) = (1,1,0,0,2,0,0,0,1)
ā¢ Claim: 2 2 distortion embedding
High level embedding
ā¢ Set in Ī 2 box
ā¢ Embedding of set š“:ā take a quad-tree
ā¢ grid of cell size Ī/3ā¢ partition each cell in 3x3
ā¢ recurse until of size 3x3
ā randomly shift it
ā Each cell gives a
coordinate:
š (š“)š=#points in the
cell š
ā¢ Want to proveš¬ š š“ ā š šµ
1ā šøšš·(š“, šµ)
8
2 2
1 0
02 11
1
0 00
0 0 0
0
0 2
21
š šØ = ā¦2210ā¦0002ā¦0011ā¦0100ā¦0000ā¦
š š© = ā¦1202ā¦0100ā¦0011ā¦0000ā¦1100ā¦
Main idea: intuition
ā¢ Decompose EMD over []2 into EMDs over smaller grids
ā¢ Recursively reduce to = š(1)
+ā
Decomposition Lemma
ā¢ For randomly-shifted cut-grid šŗ of side length š, will prove:1) šøšš·(š“, šµ) ā¤ šøšš·š(š“1, šµ1) + šøšš·š(š“2, šµ2) + āÆ
+ š ā šøšš·Ī/š(š“šŗ, šµšŗ)
2) šøšš·Ī š“, šµ ā„1
3š¬[šøšš·š š“1, šµ1 + šøšš·š š“2, šµ2 + āÆ]
3) šøšš·Ī š“, šµ ā„ š¬[š ā šøšš·Ī/š(š“šŗ , šµšŗ)]
ā¢ The distortion will
follow by applying the lemma
recursively to (AG,BG)
/š
š
1 (lower bound)ā¢ Claim 1: for a randomly-shifted cut-grid šŗ of side length š:
šøšš·(š“, šµ) ā¤ šøšš·š(š“1, šµ1) + šøšš·š(š“2, šµ2) + āÆ+ š ā šøšš·Ī/š(š“šŗ , šµšŗ)
ā¢ Construct a matching š for šøšš·Ī š“, šµ from the matchings on RHS as follows
ā¢ For each šš“ (suppose šš“š) it is either:1) matched in šøšš·(š“š , šµš) to some ššµš (if š ā š“šā²)
ā¢ then š š = š
2) or š ā š“šā², and then it is matched
in šøšš·(š“šŗ , šµšŗ) to some ššµš (š ā š)ā¢ then š š = š
ā¢ Cost? 1) paid by šøšš·(š“š , šµš)2) Move š to center ()
ā¢ Charge to šøšš·(š“š , šµš)
Move from cell š to cell šā¢ Charge š to šøšš·(š“šŗ , šµšŗ)
ā¢ If š“ > |šµ|, extra |š“| ā |šµ|
pay š ā Ī
š= Ī on LHS & RHS
/š
š
2 & 3 (upper bound)
ā¢ Claims 2,3: for a randomly-shifted cut-grid šŗ of side length š, we have:
2) šøšš·Ī š“, šµ ā„1
3š¬[šøšš·š š“1, šµ1 + šøšš·š š“2, šµ2 + āÆ]
3) šøšš·Ī š“, šµ ā„ š¬[š ā šøšš·Ī/š(š“šŗ , šµšŗ)]
ā¢ Fix a matching š minimizing šøšš·Ī(š“, šµ)ā Will construct matchings for each EMD on RHS
ā¢ Uncut pairs š, š ā š are matched in respective (š“, šµ)ā¢ Cut pairs š, š ā š :
ā are unmatched in their mini-grids
ā are matched in (š“šŗ , šµšŗ)
3: Cost
ā¢ Claim 2: ā¢ 3 ā šøšš·Ī š“,šµ ā„ š¬[šøšš·š š“1, šµ1 + šøšš·š š“2, šµ2 + āÆ]
ā¢ Uncut pairs (š, š) are matched in respective (š“š , šµš)ā Total contribution from uncut pairs ā¤ šøšš·Ī(š“, šµ)
ā¢ Consider a cut pair (š, š) at distance š ā š = (šš„ , šš¦)ā (š, š) can contribute to RHS as they may be unmatched in their
own mini-grids
ā Pr[(š, š) cut] = 1 ā 1 āšš„
š +1 ā
šš¦
š +ā¤
šš„
š+
šš¦
šā¤
1
š||š ā š||2
ā Expected contribution of (š, š) to RHS:ā¤Pr[(š, š) cut] ā 2š ā¤ 2 š ā š
2
ā Total expected cost contributed to RHS:2 ā šøšš·Ī(š“, šµ)
ā¢ Total (cut & uncut pairs): 3 ā šøšš·Ī(š“, šµ)
šš„
š
3: Cost
ā¢ Claim:
ā šøšš·Ī š“, šµ ā„ š¬[š ā šøšš·Ī/š(š“šŗ , šµšŗ)]
ā¢ Uncut pairs: contribute zero to RHS!
ā¢ Cut pair: š, š ā š with š ā š = (šš„, šš¦)
ā if šš„ = š„š + šš, and šš¦ = š¦š + šš¦ , then
ā expected cost contribution to š ā šøšš·Ī/š(š“šŗ , šµšŗ):
ā¤ š„ +šš„
šā š + š¦ +
šš¦
šā š = šš„ + šš¦ = š ā š
2
ā¢ Total expected cost ā¤ šøšš·Ī(š“, šµ)
14
šš„
š š š
Recurse on decompositionā¢ For randomly-shifted cut-grid šŗ of side length š, we have:
1) šøšš·(š“, šµ) ā¤ šøšš·š(š“1, šµ1) + šøšš·š(š“2, šµ2) + āÆ
+ š ā šøšš·Ī/š(š“šŗ, šµšŗ)
2) šøšš·Ī š“, šµ ā„1
3š¬[šøšš·š š“1, šµ1 + šøšš·š š“2, šµ2 + āÆ]
3) šøšš·Ī š“, šµ ā„ š¬[š ā šøšš·Ī/š(š“šŗ , šµšŗ)]
ā¢ We applying decomposition recursively for š = 3ā Choose randomly-shifted cut-grid šŗ1 on []2
ā Obtain many grids [3]2, and a big grid [/3]2
ā Then choose randomly-shifted cut-grid šŗ2 on [/3]2
ā Obtain more grids [3]2, and another big grid [/9]2
ā Then choose randomly-shifted cut-grid šŗ3 on [/9]2
ā ā¦
ā¢ Then, embed each of the small grids [3]2 into ā1, using š(1)distortion embedding, and concatenate the embeddingsā Each 3 2 grid occupies 9 coordinates on ā1 embedding
15
Proving recursion worksā¢ Claim: embedding contracts distances by š 1 :
šøšš·(š“, šµ) ā¤
ā¤ āš šøšš·š š“š, šµš + š ā šøšš·Ī/š(š“šŗ1, šµšŗ1
)
ā¤ āš šøšš·š š“š, šµš + šāš šøšš·š š“šŗ1,š, šµšŗ1,š
+š ā šøšš· Ī
š2š“šŗ2
, šµšŗ2
ā¤ ā¦ā¤ sum of šøšš·3 costs of 3 Ć 3 instances
ā¤1
2 2||š š“ ā š šµ ||1
ā¢ Claim: embedding distorts distances by š log Ī in expectation:3 logš Ī ā šøšš·(š“, šµ)
ā„ 3 ā šøšš·(š“, šµ) + 3 logšĪ
šā šøšš·Ī(š“, šµ)
ā„ š[ āš šøšš·š(š“š, šµš) + 3 logšĪ
šā š ā šøšš·Ī/š(š“šŗ1
, šµšŗ1) ]
ā„ āÆā„ sum of šøšš·3 costs of 3 Ć 3 instances
ā„ ||š š“ ā š šµ ||1
Final theoremā¢ Theorem: can embed EMD over [Ī]2 into ā1
with š(log Ī) distortion in expectation.
ā¢ Notes:ā Dimension required: š(Ī2), but a set š“ of size š
maps to a vector that has only š(š ā log Ī) non-zero coordinates.
ā Time: can compute in š(š ā log )ā By Markovās, itās š(log Ī) distortion with 90%
probability
ā¢ Applications:ā Can compute šøšš·(š“, šµ) in time š(š ā log Ī)ā NNS: š(š ā log Ī) approximation, with š(š1+1/š ā
š ) space, and š(š1/š ā š ā log Ī) query time.