k-Nearest Neighbors Search in High Dimensions
Tomer Peled
Dan Kushnir
"Tell me who your neighbors are, and I'll know who you are."
Outline
• Problem definition and flavors
• Algorithms overview – low dimensions
• Curse of dimensionality (d > 10..20)
• Enchanting the curse: Locality Sensitive Hashing (high-dimension approximate solutions)
• l2 extension
• Applications (Dan)
Nearest Neighbor Search – Problem definition
• Given a set P of n points in R^d, over some distance metric
• Find the nearest neighbor p of q in P
Distance metric
Applications
• Classification
• Clustering
• Segmentation
• Indexing
• Dimension reduction (e.g. LLE)
(Example feature space: color vs. weight.)
Naïve solution
• No preprocessing
• Given a query point q: go over all n points, do the comparison in R^d
• Query time = O(nd)
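The naïve scan above can be sketched in a few lines of Python (an illustrative sketch; the function name is ours):

```python
import math

def nearest_neighbor(points, q):
    """Naive NN: scan all n points and compare in R^d -- O(nd) per query."""
    best, best_dist = None, math.inf
    for p in points:
        # one O(d) distance computation
        dist = math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))
        if dist < best_dist:
            best, best_dist = p, dist
    return best
```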
Keep in mind
Common solution
• Use a data structure for acceleration
• Scalability with n and with d is important
When to use nearest neighbor
High level algorithms
Assuming no prior knowledge about the underlying probability structure:
• Parametric → probability distribution estimation
• Non-parametric → density estimation / nearest neighbors
(Complex models, sparse data, and high dimensions favor the non-parametric route.)
Nearest Neighbor
NN(q) = argmin_{p_i ∈ P} dist(q, p_i) — the closest point to q
r - Nearest Neighbor
dist(q, p1) ≤ r
dist(q, p2) ≤ (1 + ε) r
r2 = (1 + ε) r1
Outline
• Problem definition and flavors
• Algorithms overview – low dimensions
• Curse of dimensionality (d > 10..20)
• Enchanting the curse: Locality Sensitive Hashing (high-dimension approximate solutions)
• l2 extension
• Applications (Dan)
The simplest solution
• Lion in the desert
Quadtree
Split the first dimension into 2
Repeat iteratively
Stop when each cell has no more than 1 data point
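The three steps above can be sketched as a minimal 2-D quadtree build (illustrative; assumes distinct points so the recursion terminates):

```python
def build_quadtree(points, xmin, xmax, ymin, ymax):
    """Recursively split the cell in both dimensions until it holds <= 1 point."""
    if len(points) <= 1:
        return {"points": points}
    xmid, ymid = (xmin + xmax) / 2, (ymin + ymax) / 2
    children = {"SW": [], "SE": [], "NW": [], "NE": []}
    for (x, y) in points:
        # route the point to the quadrant containing it
        key = ("N" if y >= ymid else "S") + ("E" if x >= xmid else "W")
        children[key].append((x, y))
    return {
        "split": (xmid, ymid),
        "SW": build_quadtree(children["SW"], xmin, xmid, ymin, ymid),
        "SE": build_quadtree(children["SE"], xmid, xmax, ymin, ymid),
        "NW": build_quadtree(children["NW"], xmin, xmid, ymid, ymax),
        "NE": build_quadtree(children["NE"], xmid, xmax, ymid, ymax),
    }
```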
Quadtree - structure
(Each split point (X1, Y1) partitions the cell into four children: P<X1,P<Y1; P≥X1,P<Y1; P<X1,P≥Y1; P≥X1,P≥Y1.)
Quadtree - Query
In many cases this works: descend the tree to the leaf quadrant containing q.
Quadtree ndash Pitfall1
In some cases it doesn't: the nearest neighbor may lie in an adjacent cell, forcing backtracking.
Quadtree ndash Pitfall1
In some cases nothing works.
Quadtree – pitfall 2
O(2^d) cells may need to be inspected, which could result in query time exponential in the dimension.
Space partition based algorithms
"Multidimensional Access Methods", Volker Gaede & Oliver Günther
Could be improved
Outline
• Problem definition and flavors
• Algorithms overview – low dimensions
• Curse of dimensionality (d > 10..20)
• Enchanting the curse: Locality Sensitive Hashing (high-dimension approximate solutions)
• l2 extension
• Applications (Dan)
Curse of dimensionality
• Query time or space: O(min(n·d, n^d)) — for d > 10..20 this is worse than a sequential scan for most geometric distributions
• Techniques specific to high dimensions are needed
• Proved in theory and in practice by Barkol & Rabani (2000) and Beame & Vee (2002)
Curse of dimensionality – some intuition: the number of cells grows as 2, 2², 2³, …, 2^d.
Outline
• Problem definition and flavors
• Algorithms overview – low dimensions
• Curse of dimensionality (d > 10..20)
• Enchanting the curse: Locality Sensitive Hashing (high-dimension approximate solutions)
• l2 extension
• Applications (Dan)
Preview
• General solution – Locality Sensitive Hashing
• Implementation for Hamming space
• Generalization to l1 & l2
Hash function
Data_Item → Key → Bin/Bucket
Hash function
Example: h(X) = X modulo 3
X = a number in the range 0..n; the key (0..2) gives the storage address in the data structure.
Usually we would like related data items to be stored in the same bin.
Recall r - Nearest Neighbor
dist(q, p1) ≤ r
dist(q, p2) ≤ (1 + ε) r
r2 = (1 + ε) r1
Locality sensitive hashing
A family of hash functions is (r1, r2, p1, p2)-sensitive if:
• Pr[I(p) = I(q)] is "high" (≥ p1) if p is "close" to q (dist ≤ r1)
• Pr[I(p) = I(q)] is "low" (≤ p2) if p is "far" from q (dist ≥ r2 = (1 + ε) r1)
Preview
• General solution – Locality Sensitive Hashing
• Implementation for Hamming space
• Generalization to l1 & l2
Hamming Space
• Hamming space = the 2^N binary strings of length N
• Hamming distance = the number of differing digits
a.k.a. signal distance (Richard Hamming)
Example (N-bit strings):
010100001111
010010000011
Distance = 4
• Hamming distance: dist(X1, X2) = SUM(X1 XOR X2)
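In code, this distance is a popcount of the XOR (a minimal Python sketch):

```python
def hamming_distance(x1, x2):
    """Hamming distance = number of set bits in x1 XOR x2."""
    return bin(x1 ^ x2).count("1")
```

On the example strings above it returns 4, matching the slide.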
L1 to Hamming Space Embedding
Each coordinate value v (0 ≤ v ≤ C) is written in unary: v ones followed by C − v zeros.
Example (C = 11): p = (8, 2) → 11111111000 11000000000
The embedded string has d' = C·d bits.
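The unary embedding can be sketched directly; it reproduces the slide's example and makes l1 distances equal to Hamming distances (illustrative; the function name is ours):

```python
def embed_l1_to_hamming(p, C):
    """Unary embedding: each coordinate v in 0..C becomes v ones
    followed by C - v zeros, so l1 distances become Hamming distances."""
    return "".join("1" * v + "0" * (C - v) for v in p)
```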
Hash function
G_j(p) = p|I_j — k bits sampled from p ∈ H^d' at positions I_j, for j = 1..L (here k = 3 digits).
Store p into bucket G_j(p); there are 2^k buckets (e.g. key 101).
Construction: insert each point p into tables 1, 2, …, L.
Query: look up q in the buckets G_1(q), …, G_L(q).
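The construction and query steps can be sketched as follows (illustrative Python; the function names are ours):

```python
import random

def build_lsh(points, d, k, L, seed=0):
    """Build L tables; table j keys each d'-bit string by k sampled positions I_j."""
    rng = random.Random(seed)
    samplings = [rng.sample(range(d), k) for _ in range(L)]
    tables = [{} for _ in range(L)]
    for p in points:
        for I, table in zip(samplings, tables):
            key = "".join(p[i] for i in I)  # G_j(p) = p | I_j
            table.setdefault(key, []).append(p)
    return samplings, tables

def query_lsh(q, samplings, tables):
    """Candidate set = union of the L buckets G_1(q), ..., G_L(q)."""
    candidates = set()
    for I, table in zip(samplings, tables):
        candidates.update(table.get("".join(q[i] for i in I), []))
    return candidates
```

A stored point always collides with itself in every table, so querying with a database string returns it among the candidates.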
Alternative intuition: random projections
Example (C = 11): p = (8, 2) → 11111111000 11000000000, d' = C·d
Each of the k sampled bits acts as a random axis-parallel cut; together they map p into one of 2^k = 2³ buckets (000, 100, 110, 001, 101, 111, …). Repeating this L times gives L independent bucketings.
Secondary hashing
Supports volume tuning: dataset size vs. storage volume. The 2^k buckets are mapped by a simple hashing into M buckets of size B, with M·B = αn, α = 2.
The above hashing is locality-sensitive
• Pr(p, q in the same bucket) = (1 − Dist(p, q)/d')^k
(The plots show this probability vs. Dist(q, p_i) for k = 1 and k = 2.)
Adapted from Piotr Indyk's slides
Preview
• General solution – Locality Sensitive Hashing
• Implementation for Hamming space
• Generalization to l2
Direct L2 solution
• New hashing function
• Still based on sampling
• Uses a mathematical trick:
• A p-stable distribution for the l_p distance; the Gaussian distribution for the l2 distance
Central limit theorem
For real numbers v1, …, vn and X1, …, Xn independent identically distributed (i.i.d.) Gaussians:
v1·X1 + v2·X2 + … + vn·Xn is again Gaussian — a weighted sum of Gaussians is Gaussian.
Central limit theorem
Dot product and norm:  Σ_i v_i·X_i ~ ||v||₂ · X,  with X ~ N(0, 1).
Dot product and distance: for two feature vectors u and v,
Σ_i u_i·X_i − Σ_i v_i·X_i = Σ_i (u_i − v_i)·X_i ~ ||u − v||₂ · X,
so the difference of the two dot products is Gaussian with scale equal to the distance between the vectors.
The full Hashing

h_{a,b}(v) = ⌊(a·v + b) / w⌋

• v – the feature vector (e.g. [34 82 21])
• a – d random numbers, i.i.d. from a p-stable distribution
• b – a random phase in [0, w]
• w – the discretization step

Example: with a·v = 7944, b = 34, w = 100, the value 7944 + 34 = 7978 falls in the bucket [7900, 8000), i.e. h = 79 (bucket boundaries at 7800, 7900, 8000, 8100, 8200).
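A sketch of h_{a,b} for the l2 case, with Gaussian (2-stable) coefficients (illustrative; the function names are ours):

```python
import math
import random

def make_l2_hash(d, w, seed=0):
    """h_{a,b}(v) = floor((a . v + b) / w), with a ~ N(0,1)^d (2-stable)
    and b a random phase drawn uniformly from [0, w)."""
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(d)]
    b = rng.uniform(0.0, w)
    def h(v):
        return math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
    return h
```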
Generalization: p-stable distributions
• l_p, 0 < p ≤ 2: Generalized Central Limit Theorem → a p-stable distribution (e.g. Cauchy for l1)
• l2: Central Limit Theorem → the Gaussian (normal) distribution
P-stable summary
• Works for the r-nearest-neighbor problem; generalizes to 0 < p ≤ 2
• Improves query time: from O(d·n^{1/(1+ε)}·log n) to O(d·n^{1/(1+ε)²}·log n)
(Latest results, reported by e-mail by Alexander Andoni.)
Parameters selection (for Euclidean space)
• Target: ≥ 90% success probability with the best query-time performance
Parameters selection… (for Euclidean space)
• A single projection hits an r-nearest neighbor with Pr = p1
• k projections hit it with Pr = p1^k
• All L hashings fail to collide with Pr = (1 − p1^k)^L
• To ensure a collision with probability ≥ 1 − δ (e.g. 90%):
  1 − (1 − p1^k)^L ≥ 1 − δ  ⇒  L ≥ log(δ) / log(1 − p1^k)
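Solving the collision inequality for L gives the number of tables directly (a small sketch; the function name is ours):

```python
import math

def tables_needed(p1, k, delta):
    """Smallest L with 1 - (1 - p1**k)**L >= 1 - delta."""
    return math.ceil(math.log(delta) / math.log(1.0 - p1 ** k))
```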
…Parameters selection
(The plots show query time vs. k: as k grows, candidate-extraction time falls while candidate-verification time rises; non-neighbors are rejected and neighbors accepted at the verification step.)
Pros & Cons
Pros:
• Better query time than spatial data structures
• Scales well to higher dimensions and larger data sizes (sub-linear dependence)
• Predictable running time
Cons:
• Extra storage overhead
• Inefficient for data with distances concentrated around the average
• Works best for Hamming distance (although it can be generalized to Euclidean space)
• In secondary storage, a linear scan is pretty much all we can do (for high dimension)
• Requires the radius r to be fixed in advance
From Piotr Indyk's slides
Conclusion
• …but at the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – E-mail Alex Andoni (andoni@mit.edu)
  – Test it over your own data (C code, under Red Hat Linux)
LSH – Applications
• Searching video clips in databases ("Hierarchical, Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression – vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short: whenever K-Nearest Neighbors (KNN) are needed
Motivation
• A variety of procedures in learning require KNN computation
• KNN search is a computational bottleneck
• LSH provides a fast approximate solution to the problem
• LSH requires hash-function construction and parameter tuning
Outline
"Fast Pose Estimation with Parameter Sensitive Hashing", G. Shakhnarovich, P. Viola and T. Darrell
• Finding sensitive hash functions
"Mean Shift Based Clustering in High Dimensions: A Texture Classification Example", B. Georgescu, I. Shimshoni and P. Meer
• Tuning LSH parameters
• The LSH data structure is used for algorithm speedups
The Problem
Given an image x, what are the parameters θ in this image, i.e. the angles of the joints, the orientation of the body, etc.?

Fast Pose Estimation with Parameter Sensitive Hashing
G. Shakhnarovich, P. Viola and T. Darrell
Ingredients
• Input: a query image with unknown angles (parameters)
• A database of human poses with known angles
• An image feature extractor – edge detector
• A distance metric in feature space: d_x
• A distance metric in angle space: d_θ(θ1, θ2) = Σ_{i=1}^{m} (1 − cos(θ1,i − θ2,i))
Example-based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query

Input: query → find the KNN in the database of examples → output: average angles of the KNN
The algorithm flow:
Input query → feature extraction → processed query → PSH (LSH) against the database of examples → LWR (regression) → output: match
The image features
Image features are multi-scale edge histograms.
Feature Extraction PSH LWR
PSH: the basic assumption
There are two metric spaces here: the feature space (d_x) and the parameter space (d_θ). We want similarity to be measured in the angle space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
(Figure: the query q mapped between the feature space and the parameter space of angles.) Is this magic?
Parameter Sensitive Hashing (PSH)
The trick: estimate the performance of different hash functions on examples, and select those sensitive to d_θ.
The hash functions are applied in feature space, but the KNN are valid in angle space.
1. Label pairs of examples with similar angles
2. Define hash functions h on the feature space
3. Predict the labeling of similar/non-similar examples by using h
4. Compare the labelings
5. If the labeling by h is good, accept h; else change h
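The accept/reject step above can be phrased as measuring a candidate hash's accuracy on the labeled pairs (a sketch; the threshold-hash form mirrors the binary hash used here, and the helper names are ours):

```python
def make_threshold_hash(coord, T):
    """A binary hash: +1 if feature `coord` of x is at least T, else -1."""
    return lambda x: 1 if x[coord] >= T else -1

def hash_accuracy(h, labeled_pairs):
    """Fraction of pairs ((x_i, x_j), y) whose label h predicts correctly:
    predicted +1 (similar) iff h puts both examples in the same bucket."""
    correct = 0
    for (xi, xj), y in labeled_pairs:
        y_hat = 1 if h(xi) == h(xj) else -1
        correct += (y_hat == y)
    return correct / len(labeled_pairs)
```

A hash with high accuracy on the labeled pairs is accepted; otherwise the threshold (or feature) is changed.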
PSH as a classification problem
Labels (r = 0.25): a pair of examples (x_i, x_j) is labeled
  y_ij = +1 if d_θ(θ_i, θ_j) ≤ r
  y_ij = −1 if d_θ(θ_i, θ_j) ≥ (1 + ε) r
A binary hash function on the features:
  h_T(x) = +1 if x ≥ T, −1 otherwise
Predict the labels:
  ŷ_ij = +1 if h_T(x_i) = h_T(x_j), −1 otherwise
Find the best T that predicts the true labeling under the probability constraints: h_T will place both examples in the same bin, or separate them.
Local Weighted Regression (LWR)
• Given a query image, PSH returns its KNNs
• LWR uses the KNN to compute a weighted average of the estimated angles of the query: each neighbor x_i ∈ N(x_0) contributes with weight K(d_x(x_i, x_0)), a kernel of its feature-space distance from the query, and the angles minimizing the weighted error are returned
Results
Synthetic data were generated:
• 13 angles: 1 for the rotation of the torso, 12 for the joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, facial expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?): 18-bit hash functions (k = 18), 150 hash tables (L = 150)
• Test on 1,000 synthetic examples: PSH searched only 3.4% of the data per query
• Without the selection, 40 bits and 1,000 hash tables would have been needed
Recall: P1 is the probability of a positive hash, P2 the probability of a bad hash, B the maximum number of points in a bucket.
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 1.3% of the data were searched
Results – real data
Interesting mismatches
Fast pose estimation – summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given n spheres in R^d, centered at P = {p1, …, pn} with radii r1, …, rn
• Goal: given a query q, preprocess the points in P to find a point p_i whose sphere covers the query q
Courtesy of Mohamad Hegaze
Motivation
• Clustering high-dimensional data by using local density measurements (e.g. in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example
B. Georgescu, I. Shimshoni and P. Meer
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
  1. Finding optimal LSH parameters
  2. Data-driven partitions into buckets
  3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
(Figure: the mean-shift window of a given bandwidth around a point.)
Mean-shift | LSH: optimal k,l | LSH: data partition | LSH: data struct
KNN in mean-shift
The bandwidth should be inversely proportional to the density in the region: high density – small bandwidth; low density – large bandwidth.
The bandwidth is based on the kth nearest neighbor of the point (e.g. proportional to the distance to that neighbor).
Adaptive mean-shift vs non-adaptive
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths h_s (spatial) and h_r (color)
3. Apply filtering
"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Image segmentation algorithm
(original → filtered → segmented)
Filtering: the pixel value is set to that of the nearest mode.
Mean-shift trajectories
Filtering examples (original vs. filtered: squirrel, baboon)
"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Segmentation examples
"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries, implemented with LSH
• Statistical curse of dimensionality: sparseness of the data, handled with a variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (d_k, v_k)
• For each point x we evaluate the K inequalities x_{d_k} < v_k; the resulting vector of outcomes is the cell of x
• This partitions the data into cells
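The cell-membership test can be sketched in a few lines (illustrative; the function name is ours):

```python
def cell_key(x, partition):
    """A partition is K pairs (d_k, v_k); the cell of x is the K-vector
    of outcomes of the inequalities x[d_k] < v_k."""
    return tuple(x[d_k] < v_k for (d_k, v_k) in partition)
```

Two points fall in the same cell of a partition exactly when all K inequality outcomes agree.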
Choosing the optimal K and L
• For a query q, compute the smallest number of distances to points in its buckets
• Large K → a smaller number of points in a cell
• If L is too small, points might be missed; but if L is too big, extra points might be included
• As L increases, coverage increases but selectivity decreases; K and L determine the resolution of the data structure
Choosing the optimal K and L
• Determine accurately the KNN for m randomly-selected data points, giving the true distance (bandwidth)
• Choose an error threshold ε
• The optimal K and L should keep the approximate distance within the threshold of the true one
Choosing the optimal K and L
• For each K, estimate the error; in one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K))
(Plots: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)]; the minimum.)
Data-driven partitions
• In the original LSH, cut values are random in the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
(Figure: bucket distribution for uniform vs. data-driven cut points.)
Additional speedup
Assume that all the points in a cell C will converge to the same mode (C acts as a kind of aggregate).
Speedup results
65,536 points; 1,638 points sampled; k = 100
Food for thought
(Low dimension vs. high dimension)
• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning requires KNN
15:30 – cookies…
Summary
• LSH suggests a compromise on accuracy for a gain in complexity
• Applications that involve massive data in high dimension require LSH's fast performance
• Extension of LSH to different spaces (PSH)
• Learning the LSH parameters and hash functions for different applications
Conclusion
• …but at the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – E-mail Alex Andoni (andoni@mit.edu)
  – Test it over your own data (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Outline
bullProblem definition and flavorsProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse
Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
bull Given a set P of n points in Rd
Over some metric
bull find the nearest neighbor p of q in P
Nearest Neighbor SearchProblem definition
Distance metric
Applications
bullClassification bullClustering
bullSegmentation
q
bullIndexingbullDimension reduction
(eg lle)
color
Weight
Naiumlve solution
bullNo preprocess
bullGiven a query point qndashGo over all n pointsndashDo comparison in Rd
bullquery time = O(nd)
Keep in mind
Common solution
bullUse a data structure for acceleration
bullScale-ability with n amp with d is important
When to use nearest neighbor
High level algorithms
Assuming no prior knowledge about the underlying probability structure
complex models Sparse data High dimensions
Parametric Non-parametric
Density estimation
Probability distribution estimation
Nearest neighbors
Nearest Neighbor
min pi P dist(qpi)
Closestqq
r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensionsAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse
Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
The simplest solution
bullLion in the desert
Quadtree
Split the first dimension into 2
Repeat iteratively
Stop when each cell has no more than 1 data point
Quadtree - structure
X
Y
X1Y1 PgeX1PgeY1
PltX1PltY1
PgeX1PltY1
PltX1PgeY1
X1Y1
Quadtree - Query
X
Y
In many cases works
X1Y1PltX1PltY1 PltX1
PgeY1
X1Y1
PgeX1PgeY1
PgeX1PltY1
Quadtree ndash Pitfall1
X
Y
In some cases doesnrsquot
X1Y1PgeX1PgeY1
PltX1
PltX1PltY1 PgeX1
PltY1PltX1PgeY1
X1Y1
Quadtree ndash Pitfall1
X
Y
In some cases nothing works
Quadtree ndash pitfall 2X
Y
O(2d)
Could result in Query time Exponential in dimensions
Space partition based algorithms
Multidimensional access methods Volker Gaede O Gunther
Could be improved
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)Curse of dimensionality (dgt1020)bullEnchanting the curse
Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Curse of dimensionality
bullQuery time or spaceO(nd)bullDgt1020 worst than sequential scan
ndashFor most geometric distributionsbullTechniques specific to high dimensions are needed
bullProoved in theory and in practice by Barkol amp Rabani 2000 amp Beame-Vee 2002
O( min(nd nd) )Naive
Curse of dimensionalitySome intuition
2
22
23
2d
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse Enchanting the curse
Locality Sensitive Hashing Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hash function
Hash function
Hash function
Data_Item
Key
BinBucket
Hash function
X modulo 3
X=Number in the range 0n
02
Storage Address
Data structure
0
Usually we would like related Data-items to be stored at the same bin
Recall r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Locality sensitive hashing
r(1 + ) r
(r p1p2 )Sensitiveequiv Pr[I(p)=I(q)] is ldquohighrdquo if p is ldquocloserdquo to qequiv Pr[I(p)=I(q)] is ldquolowrdquo if p isrdquofarrdquo from q
r2=(1 + ) r1
P1P2
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
bullHamming space = 2N binary strings
bullHamming distance = changed digits
aka Signal distanceRichard Hamming
Hamming SpaceN
010100001111
010100001111
010010000011Distance = 4
bullHamming space
bullHamming distance
SUM(X1 XOR X2)
L1 to Hamming Space Embedding
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Hash function
Lj Hash function
p Hdrsquoisin
Gj(p)=p|Ij
j=1L k=3 digits
Bits sampling from p
Store p into bucket p|Ij 2k buckets101
11000000000 111111110000 111000000000 111111110001
Construction
1 2 L
p
Query
1 2 L
q
Alternative intuition random projections
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation – summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN + smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given n spheres in Rd, centered at P = {p1, …, pn}, with radii r1, …, rn
• Goal: given a query q, preprocess the spheres so as to find a point pi whose sphere covers the query q
Courtesy of Mohamad Hegaze
Motivation
• Clustering high-dimensional data by using local density measurements (e.g., in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example
B. Georgescu, I. Shimshoni, and P. Meer
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
  1. Finding optimal LSH parameters
  2. Data-driven partitions into buckets
  3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
(figure: a window of a given bandwidth around a point is shifted toward the local mean)
Mean-shift | LSH: optimal k,l | LSH: data partition | LSH: data struct
KNN in mean-shift
The bandwidth should be inversely proportional to the density in the region:
high density → small bandwidth; low density → large bandwidth.
It is based on the kth nearest neighbor of the point: the bandwidth is h_i = ||x_i − x_{i,k}||.
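The adaptive scheme above can be sketched as follows: per-point bandwidths taken from the k-th nearest neighbor, then a flat-kernel mean-shift iteration. This is illustrative only; the brute-force distance matrix here is exactly the cost that LSH is brought in to avoid.

```python
import numpy as np

def knn_bandwidths(X, k):
    """Per-point bandwidth h_i = distance from x_i to its k-th nearest neighbor."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    return np.sort(D, axis=1)[:, k]          # column 0 is the point itself

def mean_shift_mode(x, X, h, iters=100, tol=1e-6):
    """Shift x to the mean of the points inside a window of bandwidth h (flat kernel)."""
    for _ in range(iters):
        inside = np.linalg.norm(X - x, axis=1) <= h
        x_new = X[inside].mean(axis=0)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x
```

Dense regions get small h_i, sparse regions large h_i, matching the rule stated above.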
Adaptive mean-shift vs non-adaptive
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths hs (spatial) and hr (color)
3. Apply filtering
Mean Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Image segmentation algorithm
(figure: original → filtered → segmented)
Filtering: pixel value of the nearest mode
Mean-shift trajectories
Filtering examples
(original squirrel → filtered; original baboon → filtered)
Mean Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Segmentation examples
Mean Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries → implemented with LSH
• Statistical curse of dimensionality: sparseness of the data → variable bandwidth
LSH-based data structure
• Choose L random partitions. Each partition includes K pairs (d_k, v_k).
• For each point we check the K inequalities x_{d_k} ≤ v_k.
• This partitions the data into cells.
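A sketch of this structure (hypothetical helper names, seeded RNG for reproducibility): each of the L partitions holds K random (dimension, cut-value) pairs, and a point's cell is the vector of its K inequality outcomes.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_partition(X, K):
    """One partition: K random (dimension d_k, cut value v_k) pairs."""
    dims = rng.integers(0, X.shape[1], size=K)
    cuts = rng.uniform(X[:, dims].min(axis=0), X[:, dims].max(axis=0))
    return dims, cuts

def cell_of(x, partition):
    """Cell label: the K boolean outcomes of x[d_k] <= v_k."""
    dims, cuts = partition
    return tuple(x[dims] <= cuts)

def build_tables(X, K, L):
    """L independent partitions, each bucketing the data points by cell label."""
    tables = []
    for _ in range(L):
        part = make_partition(X, K)
        buckets = {}
        for i, x in enumerate(X):
            buckets.setdefault(cell_of(x, part), []).append(i)
        tables.append((part, buckets))
    return tables
```

A query's candidate set is then the union of its buckets across the L tables (the C∪ of the next slides).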
Choosing the optimal K and L
• For a query q, we want the smallest number of distance computations to the points in its buckets.
• Large K → a smaller number of points in a cell.
• If L is too small, points might be missed; but if L is too big, extra points might be included.
• As L increases, the union C∪ of the query's cells increases, but their intersection C∩ decreases; C∩ determines the resolution of the data structure.
Choosing the optimal K and L
• Determine accurately the KNN distance (bandwidth) for m randomly-selected data points.
• Choose an error threshold ε on the approximate distance.
• The optimal K and L should satisfy this ε-constraint.
• For each K, estimate the error; in one run over all L's, find the minimal L satisfying the constraint: L(K).
• Minimize the running time t(K, L(K)).
(plots: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)]; the minimum)
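The selection loop can be sketched generically. Here `error` and `query_time` stand for measurements taken on the m sampled points; they are hypothetical caller-supplied callbacks, not the paper's code.

```python
def tune_lsh(Ks, Ls, error, query_time, eps=0.05):
    """For each K, take the minimal L whose measured error is within eps,
    then return the (K, L, time) triple with the smallest query time."""
    best = None
    for K in Ks:
        L_min = next((L for L in Ls if error(K, L) <= eps), None)
        if L_min is None:
            continue                  # no L satisfies the constraint for this K
        t = query_time(K, L_min)
        if best is None or t < best[2]:
            best = (K, L_min, t)
    return best
```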
Data-driven partitions
• In the original LSH, cut values are drawn at random in the range of the data.
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value.
(figure: bucket distribution, uniform vs. data-driven points)
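The two choices of cut value, side by side (illustrative helpers with a seeded RNG):

```python
import numpy as np

rng = np.random.default_rng(1)

def uniform_cut(X, dim):
    """Original LSH: a cut value drawn uniformly over the data range in this dimension."""
    return rng.uniform(X[:, dim].min(), X[:, dim].max())

def data_driven_cut(X, dim):
    """Data-driven variant: the coordinate of a randomly selected data point,
    so cuts concentrate where the data are dense and buckets stay balanced."""
    return X[rng.integers(len(X)), dim]
```

With skewed data, uniform cuts tend to fall in empty regions, while data-driven cuts land where the points are.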
Additional speedup
Assume that all points in C∩ will converge to the same mode (C∩ acts like a type of aggregate).
Speedup results
65,536 points; 1,638 points sampled; k = 100
Food for thought
Low dimension vs. high dimension
A thought for food…
• Choose K, L by sample learning, or take the traditional values?
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed.
• The catch: efficient dimensionality learning requires KNN.
15:30: cookies…
Summary
• LSH trades some accuracy for a large gain in complexity.
• Applications that involve massive data in high dimensions require the fast performance of LSH.
• Extension of LSH to different spaces (PSH).
• Learning the LSH parameters and hash functions for different applications.
Conclusion
• …but at the end, everything depends on your data set.
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni (andoni@mit.edu)
  – Test over your own data (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
bull Given a set P of n points in Rd
Over some metric
bull find the nearest neighbor p of q in P
Nearest Neighbor SearchProblem definition
Distance metric
Applications
bullClassification bullClustering
bullSegmentation
q
bullIndexingbullDimension reduction
(eg lle)
color
Weight
Naiumlve solution
bullNo preprocess
bullGiven a query point qndashGo over all n pointsndashDo comparison in Rd
bullquery time = O(nd)
Keep in mind
Common solution
bullUse a data structure for acceleration
bullScale-ability with n amp with d is important
When to use nearest neighbor
High level algorithms
Assuming no prior knowledge about the underlying probability structure
complex models Sparse data High dimensions
Parametric Non-parametric
Density estimation
Probability distribution estimation
Nearest neighbors
Nearest Neighbor
min pi P dist(qpi)
Closestqq
r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensionsAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse
Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
The simplest solution
bullLion in the desert
Quadtree
Split the first dimension into 2
Repeat iteratively
Stop when each cell has no more than 1 data point
Quadtree - structure
X
Y
X1Y1 PgeX1PgeY1
PltX1PltY1
PgeX1PltY1
PltX1PgeY1
X1Y1
Quadtree - Query
X
Y
In many cases works
X1Y1PltX1PltY1 PltX1
PgeY1
X1Y1
PgeX1PgeY1
PgeX1PltY1
Quadtree ndash Pitfall1
X
Y
In some cases doesnrsquot
X1Y1PgeX1PgeY1
PltX1
PltX1PltY1 PgeX1
PltY1PltX1PgeY1
X1Y1
Quadtree ndash Pitfall1
X
Y
In some cases nothing works
Quadtree ndash pitfall 2X
Y
O(2d)
Could result in Query time Exponential in dimensions
Space partition based algorithms
Multidimensional access methods Volker Gaede O Gunther
Could be improved
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)Curse of dimensionality (dgt1020)bullEnchanting the curse
Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Curse of dimensionality
bullQuery time or spaceO(nd)bullDgt1020 worst than sequential scan
ndashFor most geometric distributionsbullTechniques specific to high dimensions are needed
bullProoved in theory and in practice by Barkol amp Rabani 2000 amp Beame-Vee 2002
O( min(nd nd) )Naive
Curse of dimensionalitySome intuition
2
22
23
2d
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse Enchanting the curse
Locality Sensitive Hashing Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hash function
Hash function
Hash function
Data_Item
Key
BinBucket
Hash function
X modulo 3
X=Number in the range 0n
02
Storage Address
Data structure
0
Usually we would like related Data-items to be stored at the same bin
Recall r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Locality sensitive hashing
r(1 + ) r
(r p1p2 )Sensitiveequiv Pr[I(p)=I(q)] is ldquohighrdquo if p is ldquocloserdquo to qequiv Pr[I(p)=I(q)] is ldquolowrdquo if p isrdquofarrdquo from q
r2=(1 + ) r1
P1P2
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
bullHamming space = 2N binary strings
bullHamming distance = changed digits
aka Signal distanceRichard Hamming
Hamming SpaceN
010100001111
010100001111
010010000011Distance = 4
bullHamming space
bullHamming distance
SUM(X1 XOR X2)
L1 to Hamming Space Embedding
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Hash function
Lj Hash function
p Hdrsquoisin
Gj(p)=p|Ij
j=1L k=3 digits
Bits sampling from p
Store p into bucket p|Ij 2k buckets101
11000000000 111111110000 111000000000 111111110001
Construction
1 2 L
p
Query
1 2 L
q
Alternative intuition random projections
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Applications
bullClassification bullClustering
bullSegmentation
q
bullIndexingbullDimension reduction
(eg lle)
color
Weight
Naiumlve solution
bullNo preprocess
bullGiven a query point qndashGo over all n pointsndashDo comparison in Rd
bullquery time = O(nd)
Keep in mind
Common solution
bullUse a data structure for acceleration
bullScale-ability with n amp with d is important
When to use nearest neighbor
High level algorithms
Assuming no prior knowledge about the underlying probability structure
complex models Sparse data High dimensions
Parametric Non-parametric
Density estimation
Probability distribution estimation
Nearest neighbors
Nearest Neighbor
min pi P dist(qpi)
Closestqq
r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensionsAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse
Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
The simplest solution
bullLion in the desert
Quadtree
Split the first dimension into 2
Repeat iteratively
Stop when each cell has no more than 1 data point
Quadtree - structure
X
Y
X1Y1 PgeX1PgeY1
PltX1PltY1
PgeX1PltY1
PltX1PgeY1
X1Y1
Quadtree - Query
X
Y
In many cases works
X1Y1PltX1PltY1 PltX1
PgeY1
X1Y1
PgeX1PgeY1
PgeX1PltY1
Quadtree ndash Pitfall1
X
Y
In some cases doesnrsquot
X1Y1PgeX1PgeY1
PltX1
PltX1PltY1 PgeX1
PltY1PltX1PgeY1
X1Y1
Quadtree ndash Pitfall1
X
Y
In some cases nothing works
Quadtree ndash pitfall 2X
Y
O(2d)
Could result in Query time Exponential in dimensions
Space partition based algorithms
Multidimensional access methods Volker Gaede O Gunther
Could be improved
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)Curse of dimensionality (dgt1020)bullEnchanting the curse
Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Curse of dimensionality
bullQuery time or spaceO(nd)bullDgt1020 worst than sequential scan
ndashFor most geometric distributionsbullTechniques specific to high dimensions are needed
bullProoved in theory and in practice by Barkol amp Rabani 2000 amp Beame-Vee 2002
O( min(nd nd) )Naive
Curse of dimensionalitySome intuition
2
22
23
2d
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse Enchanting the curse
Locality Sensitive Hashing Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hash function
Hash function
Hash function
Data_Item
Key
BinBucket
Hash function
X modulo 3
X=Number in the range 0n
02
Storage Address
Data structure
0
Usually we would like related Data-items to be stored at the same bin
Recall r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Locality sensitive hashing
r(1 + ) r
(r p1p2 )Sensitiveequiv Pr[I(p)=I(q)] is ldquohighrdquo if p is ldquocloserdquo to qequiv Pr[I(p)=I(q)] is ldquolowrdquo if p isrdquofarrdquo from q
r2=(1 + ) r1
P1P2
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
bullHamming space = 2N binary strings
bullHamming distance = changed digits
aka Signal distanceRichard Hamming
Hamming SpaceN
010100001111
010100001111
010010000011Distance = 4
bullHamming space
bullHamming distance
SUM(X1 XOR X2)
L1 to Hamming Space Embedding
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Hash function
Lj Hash function
p Hdrsquoisin
Gj(p)=p|Ij
j=1L k=3 digits
Bits sampling from p
Store p into bucket p|Ij 2k buckets101
11000000000 111111110000 111000000000 111111110001
Construction
1 2 L
p
Query
1 2 L
q
Alternative intuition random projections
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH – Applications
• Searching video clips in databases ("Hierarchical Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression – vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short: whenever K-Nearest Neighbors (KNN) are needed
Motivation
• A variety of procedures in learning require KNN computation
• KNN search is a computational bottleneck
• LSH provides a fast approximate solution to the problem
• LSH requires hash-function construction and parameter tuning
Outline
• Fast Pose Estimation with Parameter Sensitive Hashing (G. Shakhnarovich, P. Viola, and T. Darrell)
  – Finding sensitive hash functions
• Mean Shift Based Clustering in High Dimensions: A Texture Classification Example (B. Georgescu, I. Shimshoni, and P. Meer)
  – Tuning LSH parameters
  – The LSH data structure is used for algorithm speedups
The Problem
Given an image x, what are the parameters θ in this image?
i.e., angles of joints, orientation of the body, etc.

Fast Pose Estimation with Parameter Sensitive Hashing
G. Shakhnarovich, P. Viola, and T. Darrell
Ingredients
• Input: query image with unknown angles (parameters)
• Database of human poses with known angles
• Image feature extractor – edge detector
• Distance metric in feature space, d_x
• Distance metric in angle space:
  d_θ(θ¹, θ²) = Σ_{i=1}^{m} (1 − cos(θ¹_i − θ²_i))
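The angle-space metric can be sketched directly from the formula above (a minimal illustration, not the paper's code):

```python
import math

def angle_dist(theta1, theta2):
    """d_theta(t1, t2) = sum over joints of (1 - cos(t1_i - t2_i))."""
    return sum(1.0 - math.cos(a - b) for a, b in zip(theta1, theta2))

print(angle_dist([0.0, 0.5], [0.0, 0.5]))  # identical poses -> 0.0
print(angle_dist([0.0], [math.pi]))        # opposite angle  -> 2.0
```

Each joint contributes between 0 (same angle) and 2 (opposite angle), so the metric is bounded by 2m.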
Example-based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute KNN from the database
• Use these KNNs to compute the average angles of the query

Input: query → find KNN in the database of examples → output: average angles of the KNN
The algorithm flow:
input query → feature extraction → processed query → PSH (LSH), against the database of examples → LWR (regression) → output: match
The image features
Image features are multi-scale edge histograms.
[figure: edge-direction histograms computed over image regions A, B at several scales]
[Feature Extraction → PSH → LWR]
PSH: the basic assumption
There are two metric spaces here: feature space (d_x) and parameter space (d_θ).
We want similarity to be measured in the angle space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space.
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling Euclidean space.
• But the global structure may be complicated: curved.
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
[figure: mapping between parameter space (angles) and feature space, query q]
Is this magic?
Parameter Sensitive Hashing (PSH)
The trick:
• Estimate the performance of different hash functions on examples, and select those sensitive to d_θ.
• The hash functions are applied in feature space, but the KNN are valid in angle space.

• Label pairs of examples with similar angles
• Define hash functions h on the feature space
• Predict the labeling of similar / non-similar examples by using h
• Compare the labelings
• If the labeling by h is good, accept h; else change h
PSH as a classification problem

Labels (r = 0.25): a pair of examples (x_i, x_j) is labeled
  y_ij = +1 if d_θ(θ_i, θ_j) ≤ r
  y_ij = −1 if d_θ(θ_i, θ_j) ≥ (1 + ε) r

A binary hash function on features:
  h_{φ,T}(x) = +1 if φ(x) ≥ T, −1 otherwise

Predict the labels:
  ŷ_ij = +1 if h_T(x_i) = h_T(x_j), −1 otherwise
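The pair labeling and the single-feature threshold hash can be sketched as follows (an illustration, not the paper's code; pairs whose distance falls strictly between r and (1+ε)r return 0 and are skipped, which is an assumption made here):

```python
def make_label(d_theta, r, eps):
    """y_ij = +1 if d_theta <= r, -1 if d_theta >= (1 + eps) * r.
    In-between pairs return 0 and are ignored (illustrative choice)."""
    def label(theta_i, theta_j):
        d = d_theta(theta_i, theta_j)
        if d <= r:
            return +1
        if d >= (1.0 + eps) * r:
            return -1
        return 0
    return label

def make_h(phi, T):
    """Binary hash on features: h_{phi,T}(x) = +1 if phi(x) >= T else -1."""
    return lambda x: +1 if phi(x) >= T else -1

def predicted_label(h, x_i, x_j):
    """y_hat_ij = +1 if the hash places both examples in the same bin."""
    return +1 if h(x_i) == h(x_j) else -1

label = make_label(lambda a, b: abs(a - b), r=0.25, eps=1.0)
h = make_h(lambda x: x[0], T=0.5)
print(label(0.0, 0.1), predicted_label(h, [0.7], [0.9]))  # 1 1
```

Comparing `label` with `predicted_label` over many pairs scores how parameter-sensitive a candidate (φ, T) is.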
Find the best (φ, T) that predicts the true labeling subject to the probability constraints:
h_{φ,T} will place both examples in the same bin, or separate them.
Local Weighted Regression (LWR)
• Given a query image, PSH returns KNNs.
• LWR uses the KNN to compute a weighted average of the estimated angles of the query:

  β̂ = argmin_β Σ_{x_i ∈ N(x₀)} d_θ( g(x_i; β), θ_i ) · K( d_x(x_i, x₀) )

where d_θ(g(x_i; β), θ_i) is the distance term and K(d_x(x_i, x₀)) is the weight.
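A minimal zeroth-order sketch of the idea — a kernel-weighted average of the neighbors' angles, not the full regression of the paper; the Gaussian kernel and the function names are assumptions:

```python
import math

def lwr_estimate(x0, neighbors, d_x, bandwidth):
    """Weighted average of neighbor angles, weight K(d_x(x_i, x0)).

    neighbors: list of (x_i, theta_i) pairs returned by PSH.
    """
    weights = [math.exp(-(d_x(x, x0) / bandwidth) ** 2) for x, _ in neighbors]
    total = sum(weights)
    m = len(neighbors[0][1])  # number of angle parameters
    return [sum(w * th[j] for w, (_, th) in zip(weights, neighbors)) / total
            for j in range(m)]

nbrs = [([0.0], [0.0, 2.0]), ([2.0], [2.0, 4.0])]
d_x = lambda a, b: abs(a[0] - b[0])
print(lwr_estimate([1.0], nbrs, d_x, bandwidth=1.0))  # equidistant -> [1.0, 3.0]
```

Closer neighbors in feature space get larger weights, so a bad but distant neighbor pulled in by the hash has little influence on the estimated pose.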
Results
Synthetic data were generated:
• 13 angles: 1 for rotation of the torso, 12 for the joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, face expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (l)
• Test on 1,000 synthetic examples
• PSH searched only 3.4% of the data per query
• Without selection, 40 bits and 1,000 hash tables were needed

Recall: P1 is the probability of a positive hash, P2 is the probability of a bad hash, B is the max number of points in a bucket.
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 1.3% of the data were searched
Results – real data
Interesting mismatches
Fast pose estimation – summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given: n spheres in R^d, centered at P = {p1, …, pn}, with radii r1, …, rn
• Goal: given a query q, preprocess the points in P to find a point pi whose sphere covers the query q
[figure: query q inside the sphere of radius ri around pi]
Courtesy of Mohamad Hegaze
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example
B. Georgescu, I. Shimshoni, and P. Meer

Motivation
• Clustering high-dimensional data by using local density measurements (e.g., in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Outline
• Mean-shift in a nutshell + examples

Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
  1. Finding optimal LSH parameters
  2. Data-driven partitions into buckets
  3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
[figure: a window of a given bandwidth around a point is shifted toward the local mean]
(Mean-shift | LSH optimal k,l | LSH data partition | LSH | LSH data struct)
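In code, the shift-to-the-local-mean iteration looks roughly like this (a sketch with a flat kernel; the kernel choice and names are assumptions):

```python
import math

def mean_shift_step(x, points, h):
    """Move x to the mean of the points within bandwidth h (flat kernel)."""
    window = [p for p in points if math.dist(p, x) <= h]
    if not window:
        return x
    return [sum(p[j] for p in window) / len(window) for j in range(len(x))]

def mean_shift(x, points, h, tol=1e-9, max_iter=100):
    """Iterate the shift until the point converges to a mode."""
    for _ in range(max_iter):
        nx = mean_shift_step(x, points, h)
        if math.dist(nx, x) < tol:
            break
        x = nx
    return x

pts = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [10.0, 10.0], [10.1, 10.0]]
print(mean_shift([0.3, 0.3], pts, h=1.0))  # converges to the left cluster's mode
```

The range query inside `mean_shift_step` is exactly the operation that becomes expensive in high dimensions, which is where LSH comes in.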
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region:
high density → small bandwidth; low density → large bandwidth.
Based on the kth nearest neighbor x_{i,k} of the point x_i, the bandwidth is h_i = ||x_i − x_{i,k}||.

Adaptive mean-shift vs. non-adaptive
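A brute-force sketch of that adaptive bandwidth (fine for small n; the function name is illustrative, and in practice the kth-neighbor distance is what the LSH structure approximates):

```python
import math

def adaptive_bandwidths(points, k):
    """h_i = distance from x_i to its k-th nearest neighbor:
    dense regions get small bandwidths, sparse regions large ones."""
    hs = []
    for i, x in enumerate(points):
        dists = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        hs.append(dists[k - 1])
    return hs

print(adaptive_bandwidths([[0.0], [1.0], [3.0]], k=1))  # [1.0, 1.0, 2.0]
```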
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths h_s (spatial) and h_r (color)
3. Apply filtering
[figure: 3D feature space]
(Mean Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02)

Image segmentation algorithm
[figure: original, segmented, and filtered images]
Filtering: pixel value of the nearest mode
Mean-shift trajectories
Filtering examples
[figures: original squirrel → filtered; original baboon → filtered]

Segmentation examples
(Mean Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02)
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries, implemented with LSH
• Statistical curse of dimensionality: sparseness of the data, handled with variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (d_k, v_k)
• For each point x_i, we check whether x_i[d_k] ≤ v_k; the K results determine its cell
• This partitions the data into cells
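A sketch of the cell assignment (function names and the uniform cut range are assumptions):

```python
import random

def make_partition(d, K, lo, hi, rng):
    """One partition: K pairs (d_k, v_k) - a coordinate index and a cut value."""
    return [(rng.randrange(d), rng.uniform(lo, hi)) for _ in range(K)]

def cell_key(x, partition):
    """Concatenate the K boolean tests x[d_k] <= v_k into the cell label."""
    return tuple(x[dk] <= vk for dk, vk in partition)

partition = [(0, 0.5), (1, 0.5)]           # fixed cuts for illustration
print(cell_key([0.2, 0.9], partition))     # (True, False)
```

Building L such partitions and hashing every point by its L cell keys gives the bucket structure queried below.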
Choosing the optimal K and L
• For a query q, compute the smallest number of distances to points in its buckets
• Large K ⟹ a smaller number of points in a cell
• If L is too small, points might be missed; but if L is too big, extra points might be included
• The neighborhood returned for q is the union of the L cells, C = ∪_l C_l
As L increases, the union C grows while the distance to the approximate nearest neighbor decreases; K determines the resolution of the data structure.
Choosing optimal K and L
• Determine accurately the KNN for m randomly-selected data points, giving the true distance (bandwidth)
• Choose an error threshold ε
• The optimal K and L should keep the approximate distance within ε of the true one
Choosing optimal K and L
• For each K, estimate the error for each L
• In one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K))
[figures: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)] with its minimum]
Data-driven partitions
• In the original LSH, cut values are random in the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
[figure: bucket distribution — uniform cuts vs. data-driven points]
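The two cut-selection strategies side by side (a sketch; names are illustrative):

```python
import random

def random_cut(lo, hi, rng):
    """Original LSH: the cut value is uniform over the data range."""
    return rng.uniform(lo, hi)

def data_driven_cut(points, dim, rng):
    """Variant: use a coordinate of a randomly chosen data point, so cuts
    follow the data density and the buckets stay more balanced."""
    return rng.choice(points)[dim]

rng = random.Random(0)
pts = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(data_driven_cut(pts, 0, rng))  # one of 1.0, 3.0, 5.0
```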
Additional speedup
Assume that all points in C will converge to the same mode (C acts like a type of aggregate).
Speedup results
65,536 points; 1,638 points sampled; k = 100
Food for thought
[figure: low dimension vs. high dimension]

A thought for food…
• Choose K, L by sample learning, or take the traditional values?
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning requires KNN

15:30: cookies…
Summary
• LSH trades accuracy for a gain in complexity
• Applications that involve massive data in high dimensions require LSH's fast performance
• Extension of LSH to different spaces (PSH)
• Learning the LSH parameters and hash functions for different applications
Conclusion
• …but in the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni (andoni@mit.edu)
  – Test over your own data (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Naiumlve solution
bullNo preprocess
bullGiven a query point qndashGo over all n pointsndashDo comparison in Rd
bullquery time = O(nd)
Keep in mind
Common solution
bullUse a data structure for acceleration
bullScale-ability with n amp with d is important
When to use nearest neighbor
High level algorithms
Assuming no prior knowledge about the underlying probability structure
complex models Sparse data High dimensions
Parametric Non-parametric
Density estimation
Probability distribution estimation
Nearest neighbors
Nearest Neighbor
min pi P dist(qpi)
Closestqq
r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensionsAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse
Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
The simplest solution
bullLion in the desert
Quadtree
Split the first dimension into 2
Repeat iteratively
Stop when each cell has no more than 1 data point
Quadtree - structure
X
Y
X1Y1 PgeX1PgeY1
PltX1PltY1
PgeX1PltY1
PltX1PgeY1
X1Y1
Quadtree - Query
X
Y
In many cases works
X1Y1PltX1PltY1 PltX1
PgeY1
X1Y1
PgeX1PgeY1
PgeX1PltY1
Quadtree ndash Pitfall1
X
Y
In some cases doesnrsquot
X1Y1PgeX1PgeY1
PltX1
PltX1PltY1 PgeX1
PltY1PltX1PgeY1
X1Y1
Quadtree ndash Pitfall1
X
Y
In some cases nothing works
Quadtree ndash pitfall 2X
Y
O(2d)
Could result in Query time Exponential in dimensions
Space partition based algorithms
Multidimensional access methods Volker Gaede O Gunther
Could be improved
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)Curse of dimensionality (dgt1020)bullEnchanting the curse
Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Curse of dimensionality
bullQuery time or spaceO(nd)bullDgt1020 worst than sequential scan
ndashFor most geometric distributionsbullTechniques specific to high dimensions are needed
bullProoved in theory and in practice by Barkol amp Rabani 2000 amp Beame-Vee 2002
O( min(nd nd) )Naive
Curse of dimensionalitySome intuition
2
22
23
2d
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse Enchanting the curse
Locality Sensitive Hashing Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hash function
Hash function
Hash function
Data_Item
Key
BinBucket
Hash function
X modulo 3
X=Number in the range 0n
02
Storage Address
Data structure
0
Usually we would like related Data-items to be stored at the same bin
Recall r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Locality sensitive hashing
r(1 + ) r
(r p1p2 )Sensitiveequiv Pr[I(p)=I(q)] is ldquohighrdquo if p is ldquocloserdquo to qequiv Pr[I(p)=I(q)] is ldquolowrdquo if p isrdquofarrdquo from q
r2=(1 + ) r1
P1P2
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
bullHamming space = 2N binary strings
bullHamming distance = changed digits
aka Signal distanceRichard Hamming
Hamming SpaceN
010100001111
010100001111
010010000011Distance = 4
bullHamming space
bullHamming distance
SUM(X1 XOR X2)
L1 to Hamming Space Embedding
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Hash function
Lj Hash function
p Hdrsquoisin
Gj(p)=p|Ij
j=1L k=3 digits
Bits sampling from p
Store p into bucket p|Ij 2k buckets101
11000000000 111111110000 111000000000 111111110001
Construction
1 2 L
p
Query
1 2 L
q
Alternative intuition random projections
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Common solution
bullUse a data structure for acceleration
bullScale-ability with n amp with d is important
When to use nearest neighbor
High level algorithms
Assuming no prior knowledge about the underlying probability structure
complex models Sparse data High dimensions
Parametric Non-parametric
Density estimation
Probability distribution estimation
Nearest neighbors
Nearest Neighbor
min pi P dist(qpi)
Closestqq
r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensionsAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse
Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
The simplest solution
bullLion in the desert
Quadtree
Split the first dimension into 2
Repeat iteratively
Stop when each cell has no more than 1 data point
Quadtree - structure
X
Y
X1Y1 PgeX1PgeY1
PltX1PltY1
PgeX1PltY1
PltX1PgeY1
X1Y1
Quadtree - Query
X
Y
In many cases works
X1Y1PltX1PltY1 PltX1
PgeY1
X1Y1
PgeX1PgeY1
PgeX1PltY1
Quadtree ndash Pitfall1
X
Y
In some cases doesnrsquot
X1Y1PgeX1PgeY1
PltX1
PltX1PltY1 PgeX1
PltY1PltX1PgeY1
X1Y1
Quadtree ndash Pitfall1
X
Y
In some cases nothing works
Quadtree ndash pitfall 2X
Y
O(2d)
Could result in Query time Exponential in dimensions
Space partition based algorithms
Multidimensional access methods Volker Gaede O Gunther
Could be improved
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)Curse of dimensionality (dgt1020)bullEnchanting the curse
Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Curse of dimensionality
bullQuery time or spaceO(nd)bullDgt1020 worst than sequential scan
ndashFor most geometric distributionsbullTechniques specific to high dimensions are needed
bullProoved in theory and in practice by Barkol amp Rabani 2000 amp Beame-Vee 2002
O( min(nd nd) )Naive
Curse of dimensionalitySome intuition
2
22
23
2d
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse Enchanting the curse
Locality Sensitive Hashing Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hash function
Hash function
Hash function
Data_Item
Key
BinBucket
Hash function
X modulo 3
X=Number in the range 0n
02
Storage Address
Data structure
0
Usually we would like related Data-items to be stored at the same bin
Recall r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Locality sensitive hashing
r(1 + ) r
(r p1p2 )Sensitiveequiv Pr[I(p)=I(q)] is ldquohighrdquo if p is ldquocloserdquo to qequiv Pr[I(p)=I(q)] is ldquolowrdquo if p isrdquofarrdquo from q
r2=(1 + ) r1
P1P2
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
• Hamming space = 2^N binary strings
• Hamming distance = number of changed digits
a.k.a. signal distance (Richard Hamming)
Hamming Space
Example (N digits): 010100001111 vs. 010010000011 → distance = 4
• Hamming distance = SUM(X1 XOR X2)
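The XOR-and-sum rule above can be sketched in a few lines of Python (function name is illustrative, not from the talk):

```python
def hamming_distance(x1: int, x2: int) -> int:
    # Hamming distance = number of set bits in x1 XOR x2
    return bin(x1 ^ x2).count("1")

# The slide's example: 010100001111 vs. 010010000011
a = int("010100001111", 2)
b = int("010010000011", 2)
print(hamming_distance(a, b))  # -> 4
```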
L1 to Hamming Space Embedding
p = (8, 2), C = 11 → 11111111000 11000000000
Each coordinate x maps to x ones followed by C−x zeros; d' = C·d
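A minimal sketch of this unary embedding, assuming integer coordinates bounded by C (helper names are mine). L1 distance between vectors equals Hamming distance between their embeddings:

```python
def unary_embed(v, C):
    # Each coordinate x in 0..C becomes x ones followed by C - x zeros,
    # so the embedding lives in {0,1}^(C*d)
    return "".join("1" * x + "0" * (C - x) for x in v)

p = unary_embed([8, 2], C=11)   # '11111111000' + '11000000000'
q = unary_embed([5, 4], C=11)
# L1 distance |8-5| + |2-4| = 5 equals the Hamming distance of the embeddings
ham = sum(c1 != c2 for c1, c2 in zip(p, q))
print(ham)  # -> 5
```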
Hash function
p ∈ H^d'
G_j(p) = p|I_j, j = 1..L (bit sampling from p; here k = 3 digits)
Store p into bucket p|I_j (2^k buckets)
Construction: insert each point p into its bucket in every table 1, 2, …, L
Query: look up q's bucket in every table 1, 2, …, L
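The construction and query steps above can be sketched as bit-sampling LSH in Python (a toy sketch with illustrative names, not the reference implementation):

```python
import random

def make_bit_sampler(d_prime, k, seed):
    # G_j(p) = p restricted to a random index set I_j of k bit positions
    rng = random.Random(seed)
    idx = [rng.randrange(d_prime) for _ in range(k)]
    return lambda bits: "".join(bits[i] for i in idx)

def build_tables(points, d_prime, k, L):
    # Construction: store every point in its bucket p|I_j, for j = 1..L
    tables = []
    for j in range(L):
        g = make_bit_sampler(d_prime, k, seed=j)
        table = {}
        for p in points:
            table.setdefault(g(p), []).append(p)
        tables.append((g, table))
    return tables

def query(tables, q):
    # Query: the union of the L buckets q falls into is the candidate set
    cands = set()
    for g, table in tables:
        cands.update(table.get(g(q), []))
    return cands
```

Close strings share many bits, so they collide in at least one of the L sampled projections with high probability.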
Alternative intuition: random projections
p = (8, 2), C = 11 → 11111111000 11000000000, d' = C·d
Alternative intuition: random projections
k samplings of bits map p into one of 2^k buckets (here k = 3: keys 000, 100, 110, 001, 101, 111, …)
Repeating L times
Secondary hashing
Supports volume tuning: dataset size vs. storage volume
2^k buckets → M buckets of size B via simple hashing; M·B = α·n, α = 2
The above hashing is locality-sensitive
• Probability(p, q in the same bucket) = (1 − Distance(q,p)/dimensions)^k
(Plots: probability Pr vs. Distance(q,p_i), for k = 1 and k = 2)
Adapted from Piotr Indyk's slides
Preview
• General solution – Locality Sensitive Hashing
• Implementation for Hamming space
• Generalization to l2
Direct L2 solution
• New hashing function
• Still based on sampling
• Using a mathematical trick:
• p-stable distribution for Lp distance; Gaussian distribution for L2 distance
Central limit theorem
v1·X1 + v2·X2 + … + vn·Xn
(weighted Gaussians) = weighted Gaussian
v1..vn = real numbers; X1..Xn = independent, identically distributed (i.i.d.)
Central limit theorem
Σᵢ vᵢ·Xᵢ ≈ ‖v‖₂ · X    (dot product → norm)
Σᵢ uᵢ·Xᵢ − Σᵢ vᵢ·Xᵢ = Σᵢ (uᵢ−vᵢ)·Xᵢ ≈ ‖u−v‖₂ · X    (dot-product difference → distance)
u, v = feature vectors 1 and 2
The full hashing
h_{a,b}(v) = ⌊(a·v + b) / w⌋
a = d random numbers; v = features vector (d-dimensional)
b = random phase in [0, w]; w = discretization step
The full hashing – example
h_{a,b}(v) = ⌊(a·v + b) / w⌋ with a·v = 7944, b = +34, w = 100
(bucket boundaries at 7800, 7900, 8000, 8100, 8200)
The full hashing
h_{a,b}(v) = ⌊(a·v + b) / w⌋
a = (a1..ad) i.i.d. from a p-stable distribution; v = features vector
b = random phase in [0, w]; w = discretization step
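A minimal sketch of h_{a,b} for L2, using the Gaussian as the 2-stable distribution (function names are mine, not the authors' code):

```python
import math
import random

def make_l2_hash(d, w, seed):
    # h_{a,b}(v) = floor((a . v + b) / w),
    # with a ~ N(0,1)^d (2-stable) and b uniform in [0, w]
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(d)]
    b = rng.uniform(0.0, w)
    def h(v):
        return math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
    return h

h = make_l2_hash(d=3, w=4.0, seed=0)
v1 = [3.4, 8.2, 2.1]
v2 = [3.4, 8.2, 2.1]
# identical vectors always share a cell; near vectors do so with high probability
print(h(v1) == h(v2))  # -> True
```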
Generalization: p-stable distribution
• L2: Central Limit Theorem → Gaussian (normal) distribution
• Lp, 0 < p ≤ 2: Generalized Central Limit Theorem → p-stable distribution (Cauchy for L1)
p-stable summary
• Works for the r-nearest-neighbor problem; generalizes to 0 < p ≤ 2
• Improves query time: O(d·n^{1/(1+ε)}·log n) → O(d·n^{1/(1+ε)²}·log n)
Latest results
(Reported in an email by Alexander Andoni)
Parameters selection (for Euclidean space)
• 90% probability → best query-time performance
Parameters selection…
For Euclidean space
• A single projection hits an r-near neighbor with Pr = p1
• k projections hit an r-near neighbor with Pr = p1^k
• All L hashings fail to collide with Pr = (1 − p1^k)^L
• To ensure collision (e.g. with 1 − δ ≥ 90%):
1 − (1 − p1^k)^L ≥ 1 − δ  ⇒  L ≥ log(δ) / log(1 − p1^k)
(Reject non-neighbors; accept neighbors)
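The bound above can be checked numerically; a minimal Python sketch (the p1 and k values below are illustrative, not from the talk):

```python
import math

def tables_needed(p1, k, delta):
    # Smallest L with 1 - (1 - p1**k)**L >= 1 - delta
    return math.ceil(math.log(delta) / math.log(1.0 - p1 ** k))

L = tables_needed(p1=0.9, k=18, delta=0.1)   # 90% collision guarantee
assert 1 - (1 - 0.9 ** 18) ** L >= 0.9
print(L)
```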
…Parameters selection
(Plot: running time vs. k, trading candidate extraction against candidate verification)
Pros & Cons
Pros:
• Better query time than spatial data structures
• Scales well to higher dimensions and larger data sizes (sub-linear dependence)
• Predictable running time
Cons:
• Extra storage overhead
• Inefficient for data with distances concentrated around the average
• Works best for Hamming distance (although it can be generalized to Euclidean space)
• In secondary storage, a linear scan is pretty much all we can do (for high dimensions)
• Requires the radius r to be fixed in advance
From Piotr Indyk's slides
Conclusion
• …but at the end, everything depends on your data set
• Try it at home:
– Visit http://web.mit.edu/andoni/www/LSH/index.html
– Email Alex Andoni (andoni@mit.edu)
– Test over your own data (C code, under Red Hat Linux)
LSH – Applications
• Searching video clips in databases ("Hierarchical Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression – vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short, whenever K-Nearest Neighbors (KNN) are needed
Motivation
• A variety of procedures in learning require KNN computation
• KNN search is a computational bottleneck
• LSH provides a fast approximate solution to the problem
• LSH requires hash-function construction and parameter tuning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing (G. Shakhnarovich, P. Viola, T. Darrell)
• Finding sensitive hash functions
Mean Shift Based Clustering in High Dimensions: A Texture Classification Example (B. Georgescu, I. Shimshoni, P. Meer)
• Tuning LSH parameters
• The LSH data structure is used for algorithm speedups
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing (G. Shakhnarovich, P. Viola, T. Darrell)
Given an image x, what are the parameters θ in this image, i.e. the angles of joints, the orientation of the body, etc.?
Ingredients
• Input: query image with unknown angles (parameters)
• Database of human poses with known angles
• Image feature extractor – edge detector
• Distance metric in feature space: d_x
• Distance metric in angle space: d_θ(θ1, θ2) = Σᵢ₌₁ᵐ (1 − cos(θ1,i − θ2,i))
Example-based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query
Input: query → find KNN in the database of examples → output: average angles of the KNN
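The KNN-then-average step can be sketched with a brute-force search (a toy stand-in for the PSH lookup; names and data are illustrative):

```python
def knn_average_angles(query_feat, database, k):
    # database: list of (feature_vector, angle_vector) pairs.
    # Brute-force KNN in feature space, then average the neighbors' angles.
    def dist2(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    nearest = sorted(database, key=lambda item: dist2(query_feat, item[0]))[:k]
    m = len(nearest[0][1])
    return [sum(item[1][i] for item in nearest) / k for i in range(m)]

db = [([0.0], [10.0]), ([1.0], [20.0]), ([9.0], [90.0])]
print(knn_average_angles([0.2], db, k=2))  # -> [15.0]
```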
The algorithm flow
Input query → feature extraction → processed query → PSH (LSH) against a database of examples → LWR (regression) → output: match
The image features
Image features are multi-scale edge histograms
(Pipeline: Feature Extraction → PSH → LWR)
PSH: the basic assumption
There are two metric spaces here: the feature space (d_x) and the parameter space (d_θ).
We want similarity to be measured in the angle space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
(Figure: parameter space (angles) ↔ feature space, query q)
Is this magic?
Parameter Sensitive Hashing (PSH)
The trick:
Estimate the performance of different hash functions on examples, and select those sensitive to d_θ.
The hash functions are applied in feature space, but the KNN are valid in angle space.
Label pairs of examples with similar angles; define hash functions h on the feature space
Predict the labeling of similar/non-similar examples by using h, and compare the labelings.
If the labeling by h is good, accept h; else change h.
PSH as a classification problem (labels +1, +1, −1, −1; r = 0.25)
Labels:
a pair of examples (x_i, x_j) is labeled
y_ij = +1 if d_θ(θ_i, θ_j) ≤ r
y_ij = −1 if d_θ(θ_i, θ_j) > (1+ε)·r
A binary hash function on features:
h_T(x) = +1 if feature(x) ≥ T, −1 otherwise
Predict the labels:
ŷ_ij(h) = +1 if h_T(x_i) = h_T(x_j), −1 otherwise
Find the best T that predicts the true labeling, subject to probability constraints:
h_T(x) will place both examples in the same bin, or separate them.
Local Weighted Regression (LWR)
• Given a query image, PSH returns KNNs
• LWR uses the KNN to compute a distance-weighted average of the estimated angles of the query:
θ0 = argmin Σ_{x_i ∈ N(x)} d_θ(g(x_i), θ_i) · K(d_x(x_i, x))
Results
Synthetic data were generated:
• 13 angles: 1 for the rotation of the torso, 12 for joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, face expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (L)
• Test on 1,000 synthetic examples
• PSH searched only 3.4% of the data per query
• Without selection, 40 bits and 1,000 hash tables would be needed
Recall: P1 is the probability of a positive hash, P2 is the probability of a bad hash, B is the max number of points in a bucket
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 1.3% of the data were searched
Results – real data
Interesting mismatches
Fast pose estimation – summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given: n spheres in R^d, centered at P = {p1,…,pn}, with radii r1,…,rn
• Goal: given a query q, preprocess the points in P to find a point p_i whose sphere r_i 'covers' the query q
Courtesy of Mohamad Hegaze
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example (B. Georgescu, I. Shimshoni, P. Meer)
Motivation
• Clustering high-dimensional data by using local density measurements (e.g. in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
1. Finding optimal LSH parameters
2. Data-driven partitions into buckets
3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
(Figure: a bandwidth window around a point)
(Progress bar: Mean-shift → LSH → optimal k,l → LSH data partition → LSH data struct)
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region:
high density → small bandwidth; low density → large bandwidth
Based on the k-th nearest neighbor of the point, the bandwidth is h_i = ‖x_i − x_{i,k}‖
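The adaptive-bandwidth rule can be sketched with a brute-force neighbor search (a toy stand-in for the LSH lookup; names are illustrative):

```python
import math

def adaptive_bandwidths(points, k):
    # h_i = distance from x_i to its k-th nearest neighbor:
    # dense regions get a small bandwidth, sparse regions a large one
    hs = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        hs.append(dists[k - 1])
    return hs

pts = [(0.0,), (1.0,), (2.0,), (10.0,)]
print(adaptive_bandwidths(pts, k=1))  # -> [1.0, 1.0, 1.0, 8.0]
```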
Adaptive mean-shift vs. non-adaptive
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths h_s (spatial) and h_r (color)
3. Apply filtering
"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Image segmentation algorithm
(Figure: original → filtered → segmented)
Filtering: pixel value of the nearest mode
Mean-shift trajectories
Filtering examples
(original squirrel → filtered; original baboon → filtered)
"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Segmentation examples
"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries → implemented with LSH
• Statistical curse of dimensionality: sparseness of the data → variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (d_k, v_k)
• For each point, perform the K comparisons x_{d_k} ≤ v_k
• This partitions the data into cells
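This partition scheme can be sketched as follows (a minimal sketch under the assumption of real-valued points; function names are mine):

```python
import random

def make_partition(points, K, rng):
    # One random partition: K (dimension, cut value) pairs, with cut values
    # drawn uniformly from the data range (the original LSH scheme)
    d = len(points[0])
    cuts = []
    for _ in range(K):
        dim = rng.randrange(d)
        lo = min(p[dim] for p in points)
        hi = max(p[dim] for p in points)
        cuts.append((dim, rng.uniform(lo, hi)))
    return cuts

def cell_key(x, cuts):
    # A point's cell = the bit vector of the K comparisons x[d_k] <= v_k
    return tuple(x[dim] <= v for dim, v in cuts)

# L such partitions are kept; a query's candidate neighbors are the union
# of its cells across the L partitions.
```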
Choosing the optimal K and L
• For a query q, compute the smallest number of distances to points in its buckets
• Large K → a smaller number of points in a cell
• If L is too small, points might be missed; but if L is too big, extra points might be included
• As L increases, the candidate set C grows but the chance of missing neighbors decreases
• K and L determine the resolution of the data structure
Choosing optimal K and L
• Determine accurately the KNN for m randomly selected data points
• Choose an error threshold ε on the approximate distance (bandwidth)
• The optimal K and L should satisfy the threshold on the approximate distance
Choosing optimal K and L
• For each K, estimate the error
• In one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K))
(Plots: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)]; minimum)
Data-driven partitions
• In the original LSH, cut values are random in the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
(Figure: bucket distribution, uniform vs. data-driven cut points)
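The data-driven cut can be sketched in a few lines (an illustrative sketch, not the paper's code):

```python
import random

def data_driven_cut(points, rng):
    # Instead of a uniform random cut value, pick a random data point and
    # use one of its coordinates: bucket boundaries follow the data density
    p = rng.choice(points)
    dim = rng.randrange(len(p))
    return dim, p[dim]

rng = random.Random(1)
pts = [(0.1,), (0.2,), (0.3,), (9.9,)]
dim, v = data_driven_cut(pts, rng)
print(v in {0.1, 0.2, 0.3, 9.9})  # -> True: the cut value always comes from the data
```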
Additional speedup
Assume that all points in C will converge to the same mode (C acts like a type of aggregate)
Speedup results
65,536 points; 1,638 points sampled; k = 100
Food for thought
(Low dimension vs. high dimension)
A thought for food…
• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning requires KNN
15:30 – cookies…
Summary
• LSH suggests a compromise on accuracy for a gain in complexity
• Applications that involve massive data in high dimensions require the fast performance of LSH
• Extension of LSH to different spaces (PSH)
• Learning the LSH parameters and hash functions for different applications
Conclusion
• …but at the end, everything depends on your data set
• Try it at home:
– Visit http://web.mit.edu/andoni/www/LSH/index.html
– Email Alex Andoni (andoni@mit.edu)
– Test over your own data (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
When to use nearest neighbor
High level algorithms
Assuming no prior knowledge about the underlying probability structure
complex models Sparse data High dimensions
Parametric Non-parametric
Density estimation
Probability distribution estimation
Nearest neighbors
Nearest Neighbor
min pi P dist(qpi)
Closestqq
r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensionsAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse
Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
The simplest solution
bullLion in the desert
Quadtree
Split the first dimension into 2
Repeat iteratively
Stop when each cell has no more than 1 data point
Quadtree - structure
X
Y
X1Y1 PgeX1PgeY1
PltX1PltY1
PgeX1PltY1
PltX1PgeY1
X1Y1
Quadtree - Query
X
Y
In many cases works
X1Y1PltX1PltY1 PltX1
PgeY1
X1Y1
PgeX1PgeY1
PgeX1PltY1
Quadtree ndash Pitfall1
X
Y
In some cases doesnrsquot
X1Y1PgeX1PgeY1
PltX1
PltX1PltY1 PgeX1
PltY1PltX1PgeY1
X1Y1
Quadtree ndash Pitfall1
X
Y
In some cases nothing works
Quadtree ndash pitfall 2X
Y
O(2d)
Could result in Query time Exponential in dimensions
Space partition based algorithms
Multidimensional access methods Volker Gaede O Gunther
Could be improved
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)Curse of dimensionality (dgt1020)bullEnchanting the curse
Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Curse of dimensionality
bullQuery time or spaceO(nd)bullDgt1020 worst than sequential scan
ndashFor most geometric distributionsbullTechniques specific to high dimensions are needed
bullProoved in theory and in practice by Barkol amp Rabani 2000 amp Beame-Vee 2002
O( min(nd nd) )Naive
Curse of dimensionalitySome intuition
2
22
23
2d
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse Enchanting the curse
Locality Sensitive Hashing Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hash function
Hash function
Hash function
Data_Item
Key
BinBucket
Hash function
X modulo 3
X=Number in the range 0n
02
Storage Address
Data structure
0
Usually we would like related Data-items to be stored at the same bin
Recall r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Locality sensitive hashing
r(1 + ) r
(r p1p2 )Sensitiveequiv Pr[I(p)=I(q)] is ldquohighrdquo if p is ldquocloserdquo to qequiv Pr[I(p)=I(q)] is ldquolowrdquo if p isrdquofarrdquo from q
r2=(1 + ) r1
P1P2
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
bullHamming space = 2N binary strings
bullHamming distance = changed digits
aka Signal distanceRichard Hamming
Hamming SpaceN
010100001111
010100001111
010010000011Distance = 4
bullHamming space
bullHamming distance
SUM(X1 XOR X2)
L1 to Hamming Space Embedding
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Hash function
Lj Hash function
p Hdrsquoisin
Gj(p)=p|Ij
j=1L k=3 digits
Bits sampling from p
Store p into bucket p|Ij 2k buckets101
11000000000 111111110000 111000000000 111111110001
Construction
1 2 L
p
Query
1 2 L
q
Alternative intuition random projections
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for food…
• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning requires KNN
1530 cookies…
Summary
• LSH trades a controlled loss of accuracy for a gain in complexity
• Applications that involve massive data in high dimension require the fast performance of LSH
• LSH extends to different spaces (PSH)
• The LSH parameters and hash functions can be learned for different applications
Conclusion
• …but at the end, everything depends on your data set
• Try it at home:
– Visit http://web.mit.edu/andoni/www/LSH/index.html
– Email Alex Andoni (andoni@mit.edu)
– Test it over your own data (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensionsAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse
Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
The simplest solution
bullLion in the desert
Quadtree
Split the first dimension into 2
Repeat iteratively
Stop when each cell has no more than 1 data point
Quadtree - structure
X
Y
X1Y1 PgeX1PgeY1
PltX1PltY1
PgeX1PltY1
PltX1PgeY1
X1Y1
Quadtree - Query
X
Y
In many cases works
X1Y1PltX1PltY1 PltX1
PgeY1
X1Y1
PgeX1PgeY1
PgeX1PltY1
Quadtree ndash Pitfall1
X
Y
In some cases doesnrsquot
X1Y1PgeX1PgeY1
PltX1
PltX1PltY1 PgeX1
PltY1PltX1PgeY1
X1Y1
Quadtree ndash Pitfall1
X
Y
In some cases nothing works
Quadtree ndash pitfall 2X
Y
O(2d)
Could result in Query time Exponential in dimensions
Space partition based algorithms
Multidimensional access methods Volker Gaede O Gunther
Could be improved
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)Curse of dimensionality (dgt1020)bullEnchanting the curse
Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Curse of dimensionality
bullQuery time or spaceO(nd)bullDgt1020 worst than sequential scan
ndashFor most geometric distributionsbullTechniques specific to high dimensions are needed
bullProoved in theory and in practice by Barkol amp Rabani 2000 amp Beame-Vee 2002
O( min(nd nd) )Naive
Curse of dimensionalitySome intuition
2
22
23
2d
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse Enchanting the curse
Locality Sensitive Hashing Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hash function
Hash function
Hash function
Data_Item
Key
BinBucket
Hash function
X modulo 3
X=Number in the range 0n
02
Storage Address
Data structure
0
Usually we would like related Data-items to be stored at the same bin
Recall r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Locality sensitive hashing
r(1 + ) r
(r p1p2 )Sensitiveequiv Pr[I(p)=I(q)] is ldquohighrdquo if p is ldquocloserdquo to qequiv Pr[I(p)=I(q)] is ldquolowrdquo if p isrdquofarrdquo from q
r2=(1 + ) r1
P1P2
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
bullHamming space = 2N binary strings
bullHamming distance = changed digits
aka Signal distanceRichard Hamming
Hamming SpaceN
010100001111
010100001111
010010000011Distance = 4
bullHamming space
bullHamming distance
SUM(X1 XOR X2)
L1 to Hamming Space Embedding
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Hash function
Lj Hash function
p Hdrsquoisin
Gj(p)=p|Ij
j=1L k=3 digits
Bits sampling from p
Store p into bucket p|Ij 2k buckets101
11000000000 111111110000 111000000000 111111110001
Construction
1 2 L
p
Query
1 2 L
q
Alternative intuition random projections
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
The bandwidth should be inversely proportional to the density in the region: high density – small bandwidth; low density – large bandwidth. It is based on the kth nearest neighbor of the point: the bandwidth h_i is the distance from the point x_i to its kth nearest neighbor x_{i,k}.
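The per-point bandwidth rule above can be sketched as follows. A minimal sketch: Euclidean distance is used here for simplicity (the paper uses an L1 norm), and `adaptive_bandwidth` is my name, not the paper's; the brute-force O(n²) scan is what the LSH structure replaces.

```python
import math

def adaptive_bandwidth(points, k):
    # For each point, the bandwidth is its distance to its k-th nearest
    # neighbor (identity check excludes the point itself from its own list).
    hs = []
    for x in points:
        d = sorted(math.dist(x, p) for p in points if p is not x)
        hs.append(d[k - 1])
    return hs
```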
Adaptive mean-shift vs non-adaptive
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths hs (spatial) and hr (color)
3. Apply filtering
Mean-shift: A Robust Approach Towards Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Image segmentation algorithm (original, filtered, segmented)
Filtering: each pixel takes the value of its nearest mode
Mean-shift trajectories
Filtering examples
• squirrel: original vs. filtered
• baboon: original vs. filtered
Mean-shift: A Robust Approach Towards Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Segmentation examples
Mean-shift: A Robust Approach Towards Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries – implemented with LSH
• Statistical curse of dimensionality: sparseness of the data – variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (d_k, v_k)
• For each point x, compute the K boolean tests x_{d_k} ≤ v_k, k = 1…K
• This partitions the data into cells
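A minimal sketch of this data structure, under the assumption that coordinates lie in [0, 1] (all names here are mine): each of the L partitions holds K random (dimension, cut-value) pairs, a point's cell is the tuple of K inequality outcomes, and a query gathers the union of the buckets it falls into.

```python
import random

def make_partition(dim, K, lo=0.0, hi=1.0, rng=random):
    # K pairs (d_k, v_k): a random coordinate index and a random cut value.
    return [(rng.randrange(dim), rng.uniform(lo, hi)) for _ in range(K)]

def bucket_key(x, partition):
    # The cell of x: outcomes of the K inequality tests x[d_k] <= v_k.
    return tuple(x[d] <= v for d, v in partition)

def build_tables(points, dim, K, L, rng=random):
    # L independent partitions, each with its own hash table of buckets.
    tables = []
    for _ in range(L):
        part = make_partition(dim, K, rng=rng)
        table = {}
        for i, x in enumerate(points):
            table.setdefault(bucket_key(x, part), []).append(i)
        tables.append((part, table))
    return tables

def query(q, tables):
    # Candidate neighbors: union of the L buckets that q falls into.
    cand = set()
    for part, table in tables:
        cand.update(table.get(bucket_key(q, part), []))
    return cand
```

The candidates returned by `query` are then checked with exact distances, which is where the K/L trade-off discussed next comes from.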
Choosing the optimal K and L
• For a query q, we want the smallest number of distance computations to the points in its buckets
• Large K: a smaller number of points in a cell
• If L is too small, points might be missed; but if L is too big, extra points might be included
• As L increases, the union of cells C∪ increases but the intersection C∩ decreases
• K determines the resolution of the data structure
Choosing optimal K and L
• Determine accurately the KNN distance (bandwidth) for m randomly selected data points
• Choose an error threshold ε
• The optimal K and L should keep the approximate KNN distance within the error threshold of the true distance
Choosing optimal K and L
• For each K, estimate the approximation error
• In one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K))
[Plots: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)] with its minimum]
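The selection procedure above can be sketched as a small search loop. Here `error_fn(K, L)` and `time_fn(K, L)` stand for the empirical measurements on the m sample points (hypothetical names; in practice both come from running the LSH structure on the sample):

```python
def choose_parameters(K_values, error_fn, time_fn, eps):
    # For each K, find the minimal L whose approximation error is within eps,
    # then keep the (K, L) pair with the lowest measured running time.
    # Assumes error_fn(K, L) eventually drops below eps as L grows.
    best = None
    for K in K_values:
        L = 1
        while error_fn(K, L) > eps:
            L += 1
        t = time_fn(K, L)
        if best is None or t < best[0]:
            best = (t, K, L)
    return best[1], best[2]
```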
Data driven partitions
• In the original LSH, cut values are drawn uniformly at random over the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
[Figure: bucket point distribution – uniform vs. data-driven cuts]
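The suggested data-driven cut is a one-line change to how cut values are drawn (hypothetical helper name): instead of a uniform value over the data range, sample a data point and take its coordinate, so cuts concentrate where the data is dense and buckets end up more evenly populated.

```python
import random

def data_driven_cut(points, dim_index, rng=random):
    # Pick a random data point and use its coordinate in the chosen
    # dimension as the cut value, so cuts follow the data density.
    p = rng.choice(points)
    return p[dim_index]
```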
Additional speedup
Assume that all points in C∩ will converge to the same mode (C∩ acts like a type of aggregate).
Speedup results
65,536 points; 1638 points sampled; k = 100
Food for thought
Low dimension High dimension
A thought for food…
• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning requires KNN
15:30 cookies…
Summary
• LSH trades some accuracy for a large gain in complexity
• Applications that involve massive data in high dimension require the fast performance of LSH
• Extension of LSH to different spaces (PSH)
• Learning the LSH parameters and hash functions for different applications
Conclusion
• …but in the end, everything depends on your data set
• Try it at home:
– Visit http://web.mit.edu/andoni/www/LSH/index.html
– Email Alex Andoni (andoni@mit.edu)
– Test it over your own data
(C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Quadtree – pitfall 1

In some cases it doesn't work: the query's nearest neighbor lies in a neighboring cell, so the search must backtrack into adjacent cells. In some cases nothing works, and many cells must be inspected.

Quadtree – pitfall 2

Backtracking can visit O(2^d) cells, so the query time could be exponential in the number of dimensions.

Space-partition-based algorithms

See the survey "Multidimensional access methods", Volker Gaede and O. Günther. These methods can be improved.
Outline

• Problem definition and flavors
• Algorithms overview – low dimensions
• Curse of dimensionality (d > 10..20)
• Enchanting the curse: Locality Sensitive Hashing (high-dimension approximate solutions)
• l2 extension
• Applications (Dan)
Curse of dimensionality

• Query time or space is O(n^d); the naive solution is O(min(n·d, n^d))
• For d > 10..20 this is worse than a sequential scan, for most geometric distributions
• Proved in theory and in practice (Barkol & Rabani 2000; Beame & Vee 2002)
• Techniques specific to high dimensions are needed
Curse of dimensionality – some intuition

A grid split once per dimension has 2, 2², 2³, …, 2^d cells, so space partitioning blows up exponentially with d.
Outline

• Problem definition and flavors
• Algorithms overview – low dimensions
• Curse of dimensionality (d > 10..20)
• Enchanting the curse: Locality Sensitive Hashing (high-dimension approximate solutions)
• l2 extension
• Applications (Dan)
Preview

• General solution – locality sensitive hashing
• Implementation for Hamming space
• Generalization to l1 & l2
Hash function

A hash function maps a data item's key to a bin (bucket) inside the data structure. Example: for X = a number in the range 0..n, h(X) = X modulo 3 maps X to a storage address in 0..2. Usually we would like related data items to be stored in the same bin.
Recall: r-nearest neighbor

If dist(q, p1) ≤ r, a point must be reported; any point p2 with dist(q, p2) ≤ (1 + ε)·r is an acceptable answer. The two radii are related by r2 = (1 + ε)·r1.
Locality sensitive hashing

A hash family is (r1, r2, p1, p2)-sensitive if:
• Pr[I(p) = I(q)] is "high" (≥ p1) when p is "close" to q (dist ≤ r)
• Pr[I(p) = I(q)] is "low" (≤ p2) when p is "far" from q (dist ≥ (1 + ε)·r)
with r2 = (1 + ε)·r1.
Preview: implementation for Hamming space.
Hamming space

• Hamming space = the 2^N binary strings of length N
• Hamming distance = the number of changed digits, a.k.a. signal distance (Richard Hamming)
• Example: 010100001111 vs. 010010000011 → distance = 4
• Hamming distance = SUM(X1 XOR X2)
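The XOR formulation above can be sketched in a few lines; this is an illustrative snippet, not part of any package referenced in this talk:

```python
def hamming(x: int, y: int) -> int:
    """Hamming distance between two equal-length bit strings,
    i.e. SUM(X1 XOR X2): count the set bits of the XOR."""
    return bin(x ^ y).count("1")

# The slide's example pair, written as binary literals:
d = hamming(0b010100001111, 0b010010000011)  # distance = 4
```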
L1 to Hamming space embedding

Each coordinate is written in unary: with coordinates bounded by C = 11, the point p = (8, 2) becomes 11111111000 followed by 11000000000, i.e. 1111111100011000000000. The embedded dimension is d′ = C·d, and the L1 distance between points equals the Hamming distance between their embeddings.
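A minimal sketch of the unary embedding described above (function names are ours, for illustration):

```python
def unary_embed(point, C):
    """Embed each coordinate v (0 <= v <= C) as v ones followed by C - v
    zeros; concatenating the d coordinates gives a d' = C*d bit string."""
    return "".join("1" * v + "0" * (C - v) for v in point)

def hamming_str(a, b):
    """Hamming distance between two equal-length bit strings."""
    return sum(ca != cb for ca, cb in zip(a, b))

# The slide's point p = (8, 2) with C = 11:
e_p = unary_embed((8, 2), 11)   # '1111111100011000000000'
# L1 distances are preserved exactly as Hamming distances:
e_q = unary_embed((5, 4), 11)
```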
Hash function

For j = 1..L, define G_j(p) = p|I_j for p ∈ H^d′: a sampling of k bits (digits) of p (here k = 3). Store p into the bucket p|I_j, one of 2^k buckets.
Construction: each point p is stored in all L tables, 1, 2, …, L.

Query: q is hashed into the same L tables, and the points colliding with it are collected as candidates.
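The construction and query steps can be sketched as follows; this is a toy illustration of bit sampling with L tables, not the authors' C implementation (class and method names are assumptions):

```python
import random
from collections import defaultdict

class BitSamplingLSH:
    """L hash tables; table j is keyed by the k bit positions I_j
    sampled from the d'-bit embedded strings."""

    def __init__(self, d_prime, k, L, seed=0):
        rng = random.Random(seed)
        self.samples = [rng.sample(range(d_prime), k) for _ in range(L)]
        self.tables = [defaultdict(list) for _ in range(L)]

    def _key(self, bits, j):
        return "".join(bits[i] for i in self.samples[j])

    def insert(self, bits, label):
        # Construction: store the point in every table under g_j(p) = p|I_j.
        for j, table in enumerate(self.tables):
            table[self._key(bits, j)].append(label)

    def candidates(self, bits):
        # Query: collect every point colliding with q in at least one table.
        out = set()
        for j, table in enumerate(self.tables):
            out.update(table.get(self._key(bits, j), ()))
        return out

index = BitSamplingLSH(d_prime=12, k=3, L=10, seed=1)
index.insert("010100001111", "p1")
index.insert("101011110000", "p2")  # the bitwise complement of p1
```

An identical query string always collides with its stored copy, while the complement string (which differs in every bit) can never share a sampled key.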
Alternative intuition: random projections

Each sampled bit of the unary embedding tests one coordinate of p against a random threshold, so G(p) = p|I acts like k random axis-parallel cuts of the space. With k = 3 samplings, the space is partitioned into at most 2³ = 8 buckets (000, 100, 110, 001, 101, 111, …).
Repeating and secondary hashing

The sampling is repeated L times. Because most of the 2^k logical buckets are empty, a secondary simple hashing maps them into M physical buckets of size B, with M·B = α·n (α = 2); this supports tuning of dataset size vs. storage volume.
The above hashing is locality-sensitive

Probability that p and q fall in the same bucket:

Pr = (1 − Distance(p, q) / d′)^k

Plotted against Distance(q, pi), the collision probability decays gently for k = 1 and much more sharply for k = 2. (Adopted from Piotr Indyk's slides.)
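The collision-probability formula above can be evaluated directly (a sketch, using the with-replacement approximation shown on the slide):

```python
def collision_prob(dist, d_prime, k):
    """Pr[p and q share a bucket in one table] = (1 - dist/d')**k:
    each of the k sampled bits must avoid the differing positions."""
    return (1.0 - dist / d_prime) ** k
```

Raising k sharpens the curve: close pairs keep a high collision probability while far pairs are suppressed.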
Preview: generalization to l2.
Direct L2 solution

• A new hashing function
• Still based on sampling
• Using a mathematical trick: a p-stable distribution for the Lp distance, the Gaussian distribution for the L2 distance
Central limit theorem

A weighted sum of Gaussians is a Gaussian: for real numbers v1, …, vn and i.i.d. Gaussian variables X1, …, Xn,

v1·X1 + v2·X2 + … + vn·Xn ~ ||v||₂ · X

where X is again Gaussian. Applying this to the dot products of two feature vectors u and v with the same random vector X:

Σᵢ uᵢ·Xᵢ − Σᵢ vᵢ·Xᵢ = Σᵢ (uᵢ − vᵢ)·Xᵢ ~ ||u − v||₂ · X

so the difference of the two random projections is a Gaussian whose scale is exactly the L2 distance between the feature vectors.
The full hashing

h_(a,b)(v) = ⌊(a·v + b) / w⌋

• v – the features vector, e.g. [34, 82, 21, …], of dimension d
• a – d random numbers, drawn i.i.d. from a p-stable distribution
• b – a random phase in [0, w]
• w – the discretization step

Example: if a·v = 7944 and w = 100, the real line is cut into cells … 7800, 7900, 8000, 8100, 8200 …; with phase b = 34, v falls into bucket ⌊(7944 + 34)/100⌋ = 79.
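A sketch of h_(a,b) for the L2 (Gaussian, 2-stable) case; the function names and the choice of N(0, 1) entries are our illustrative assumptions:

```python
import math
import random

def make_pstable_hash(d, w, seed=0):
    """Build h_{a,b}(v) = floor((a . v + b) / w): the d entries of a are
    i.i.d. N(0, 1) draws (the 2-stable case, for L2), b is a random
    phase in [0, w), and w is the discretization step."""
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(d)]
    b = rng.uniform(0.0, w)

    def h(v):
        return math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)

    return h

h = make_pstable_hash(3, 100.0, seed=7)
bucket = h([34.0, 82.0, 21.0])  # the slide's example feature vector
```

Because the projection of a small L2 move is much smaller than w, nearby vectors land in the same or an adjacent cell.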
Generalization: p-stable distributions

• L2: the Central Limit Theorem gives the Gaussian (normal) distribution
• Lp, 0 < p ≤ 2: the Generalized Central Limit Theorem gives a p-stable distribution (e.g. the Cauchy distribution for L1)

P-stable summary

• Works for the r-nearest-neighbor problem and generalizes to 0 < p ≤ 2
• Improves query time from O(d·n^(1/(1+ε))·log n) to O(d·n^(1/(1+ε)²)·log n) (latest results, reported by e-mail by Alexander Andoni)
Parameters selection

For Euclidean space, choose the parameters for ≥ 90% success probability at the best query-time performance:

• A single projection hits an ε-nearest neighbor with Pr = p1
• k projections hit it with Pr = p1^k
• All L hashings fail to collide with Pr = (1 − p1^k)^L
• To ensure a collision with probability 1 − δ (e.g. ≥ 90%): 1 − (1 − p1^k)^L ≥ 1 − δ, i.e. L ≥ log(δ) / log(1 − p1^k)

Larger k rejects more non-neighbors (less candidate-verification time) but accepts fewer true neighbors per table, so L must grow (more candidate-extraction time); the total query time is minimized at an intermediate k.
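The bound above gives the number of tables directly; a small helper (hypothetical, for illustration):

```python
import math

def tables_needed(p1, k, delta):
    """Smallest L with 1 - (1 - p1**k)**L >= 1 - delta, i.e. the number
    of hash tables that make a collision with a true neighbor all but
    certain: L = ceil(log(delta) / log(1 - p1**k))."""
    return math.ceil(math.log(delta) / math.log(1.0 - p1 ** k))

# With k = 18 bits, per-bit collision probability p1 = 0.9 and a 90%
# success target (delta = 0.1):
L = tables_needed(0.9, 18, 0.1)
```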
Pros & cons (from Piotr Indyk's slides)

Pros:
• Better query time than spatial data structures
• Scales well to higher dimensions and larger data sizes (sub-linear dependence)
• Predictable running time

Cons:
• Extra storage overhead
• Inefficient for data with distances concentrated around the average
• Works best for Hamming distance (although it can be generalized to Euclidean space)
• In secondary storage, a linear scan is pretty much all we can do (for high dimensions)
• Requires the radius r to be fixed in advance
Conclusion

• But at the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – E-mail Alex Andoni (andoni@mit.edu)
  – Test over your own data (C code, under Red Hat Linux)
LSH – applications

• Searching video clips in databases ("Hierarchical Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression – vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short: whenever K-Nearest Neighbors (KNN) are needed
Motivation

• A variety of procedures in learning require KNN computation
• KNN search is a computational bottleneck
• LSH provides a fast approximate solution to the problem
• LSH requires hash-function construction and parameter tuning
Outline

• Fast Pose Estimation with Parameter Sensitive Hashing (G. Shakhnarovich, P. Viola and T. Darrell) – finding sensitive hash functions
• Mean Shift Based Clustering in High Dimensions: A Texture Classification Example (B. Georgescu, I. Shimshoni and P. Meer) – tuning LSH parameters; the LSH data structure is used for algorithm speedups
Fast Pose Estimation with Parameter Sensitive Hashing
G. Shakhnarovich, P. Viola and T. Darrell

The problem: given an image x, what are the parameters θ in this image, i.e. the angles of the joints, the orientation of the body, etc.?
Ingredients

• Input: query image with unknown angles (parameters)
• Database of human poses with known angles
• Image feature extractor – edge detector
• Distance metric in feature space: d_x
• Distance metric in angle space: d_θ(θ¹, θ²) = Σ_{i=1..m} (1 − cos(θ¹_i − θ²_i))
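The angle-space metric can be sketched directly from the formula above:

```python
import math

def angle_dist(theta1, theta2):
    """d_theta(t1, t2) = sum_i (1 - cos(t1_i - t2_i)): zero for
    identical poses, up to 2 per joint for opposite angles, and
    invariant to 2*pi wrap-around."""
    return sum(1.0 - math.cos(a - b) for a, b in zip(theta1, theta2))
```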
Example based learning

• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query
The algorithm flow

Input query → features extraction → processed query → PSH (LSH) against the database of examples → LWR (regression) → output: match. In short, find the KNN in the database of examples and output the average angles of the KNN.
The image features

Image features are multi-scale edge histograms: counts of edge pixels, at several scales and orientations, inside sub-windows of the image (such as the windows A and B in the slide's figure).

Pipeline: Feature Extraction → PSH → LWR
PSH: the basic assumption

There are two metric spaces here: the feature space (d_x) and the parameter space (d_θ). We want similarity to be measured in the angle space, whereas LSH works on the feature space.

• Assumption: the feature space is closely related to the parameter space
Insight: manifolds

• A manifold is a space in which every point has a neighborhood resembling a Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.

A query q can thus be mapped between the parameter space (angles) and the feature space. Is this magic?
Parameter Sensitive Hashing (PSH)

The trick: estimate the performance of different hash functions on examples, and select those sensitive to d_θ. The hash functions are applied in feature space, but the KNN are valid in angle space.

• Label pairs of examples with similar angles
• Define hash functions h on the feature space
PSH as a classification problem

• Predict the labeling of similar/non-similar example pairs by using h
• Compare the labelings
• If the labeling by h is good, accept h; else change h
(Example labels: +1, +1, −1, −1, with r = 0.25.)
Labels

A pair of examples (x_i, x_j) is labeled

  y_ij = +1 if d_θ(θ_i, θ_j) ≤ r
  y_ij = −1 if d_θ(θ_i, θ_j) ≥ (1 + ε)·r

A binary hash function on features:

  h_T(x) = +1 if x ≥ T, −1 otherwise

Predict the labels:

  ŷ_h(x_i, x_j) = +1 if h_T(x_i) = h_T(x_j), −1 otherwise
Find the best T that predicts the true labeling within the probability constraints: h_T will place both examples of a pair in the same bin, or separate them.
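The pair-labeling and threshold-hash scoring above can be sketched as a tiny loop (a toy version; the paper's actual selection enforces probability constraints per hash function, and the helper names are ours):

```python
def pair_label(d_theta, r, eps):
    """True label of a pair: +1 if the poses are similar
    (d_theta <= r), -1 if clearly dissimilar
    (d_theta >= (1 + eps) * r), 0 = ignored (the margin)."""
    if d_theta <= r:
        return 1
    if d_theta >= (1.0 + eps) * r:
        return -1
    return 0

def hash_accuracy(T, pairs):
    """Fraction of labeled pairs (xi, xj, y) on which the threshold
    hash h_T(x) = +1 iff x >= T agrees with the true label y."""
    def h(x):
        return 1 if x >= T else -1
    labeled = [(xi, xj, y) for xi, xj, y in pairs if y != 0]
    hits = sum((1 if h(xi) == h(xj) else -1) == y for xi, xj, y in labeled)
    return hits / len(labeled)
```

A threshold that splits similar pairs from dissimilar ones scores 1.0, so scanning candidate thresholds and keeping the best mimics the "accept h or change h" loop.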
Local Weighted Regression (LWR)

• Given a query image, PSH returns its KNNs
• LWR uses the KNN to compute a weighted average of the estimated angles of the query:

  θ_0 = g(x), with g = argmin Σ_{x_i ∈ N(x)} K(d_x(x, x_i)) · (g(x_i) − θ_i)²

where the kernel K converts a feature-space distance into a weight.
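A zeroth-order sketch of the weighted averaging step (a Gaussian kernel and Euclidean feature distance are our illustrative assumptions):

```python
import math

def lwr_estimate(query_features, neighbors):
    """Weighted average of the KNN's known angle vectors; each
    neighbor is (features, angles), weighted by a Gaussian kernel
    of its feature-space distance to the query."""
    def dist(u, v):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    num, den = None, 0.0
    for feats, angles in neighbors:
        w = math.exp(-dist(query_features, feats) ** 2)
        den += w
        if num is None:
            num = [w * t for t in angles]
        else:
            num = [acc + w * t for acc, t in zip(num, angles)]
    return [t / den for t in num]
```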
Results

Synthetic data were generated:
• 13 angles: 1 for the rotation of the torso, 12 for the joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, facial expression
• 1,775,000 example pairs
• 137 out of 5,123 features were selected as meaningful (how?)
• 18-bit hash functions (k), 150 hash tables (l)
• Tested on 1,000 synthetic examples; PSH searched only 3.4% of the data per query
• Without feature selection, 40 bits and 1,000 hash tables would have been needed

Recall: P1 is the probability of a positive hash, P2 is the probability of a bad hash, B is the maximum number of points in a bucket.
Results – real data

• 800 images, processed by a segmentation algorithm
• 1/3 of the data were searched
• Some interesting mismatches occur
Fast pose estimation – summary

• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging
Food for thought

• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for thought: Point Location in Different Spheres (PLDS)

• Given n spheres in R^d, centered at P = {p1, …, pn}, with radii r1, …, rn
• Goal: given a query q, preprocess the points in P to find a point pi whose sphere 'covers' the query q (dist(q, pi) ≤ ri)
(Courtesy of Mohamad Hegaze.)
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example
B. Georgescu, I. Shimshoni and P. Meer

Motivation

• Clustering high-dimensional data by using local density measurements (e.g. in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• The LSH parameters should be adjusted for optimal performance
Outline

• Mean-shift in a nutshell + examples

Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups: 1. finding optimal LSH parameters; 2. data-driven partitions into buckets; 3. additional speedup by using the LSH data structure
Mean-Shift in a Nutshell

Each point is iteratively shifted to the mean of the points inside its bandwidth window until it converges to a density mode.
(Section map: mean-shift → LSH: optimal k, l → LSH: data partition → LSH data structure.)

KNN in mean-shift

The bandwidth should be inversely proportional to the density in the region: high density – small bandwidth; low density – large bandwidth. It is based on the k-th nearest neighbor of the point: the bandwidth is the distance to that neighbor. This gives adaptive mean-shift, vs. the non-adaptive fixed-bandwidth version.
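A minimal sketch of the k-th-nearest-neighbor bandwidth and one mean-shift step (brute force with a flat kernel, for clarity; the paper replaces the quadratic scan with LSH):

```python
import math

def adaptive_bandwidths(points, k):
    """h_i = distance from x_i to its k-th nearest neighbor:
    small bandwidth in dense regions, large in sparse ones."""
    hs = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        hs.append(dists[k - 1])
    return hs

def mean_shift_step(x, points, h):
    """One mean-shift step with a flat kernel of radius h:
    move x to the mean of the points inside its window."""
    inside = [p for p in points if math.dist(x, p) <= h]
    return tuple(sum(c) / len(inside) for c in zip(*inside))
```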
Image segmentation algorithm

1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths h_s (spatial) and h_r (color)
3. Apply filtering
Filtering assigns each pixel the value of its nearest mode, and segmentation groups pixels by mode (original → filtered → segmented); the mean-shift trajectories climb toward the modes.

Filtering examples: squirrel (original → filtered), baboon (original → filtered). Segmentation examples.

(Figures from "Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02.)
Mean-shift in high dimensions

• Computational curse of dimensionality: expensive range queries, implemented with LSH
• Statistical curse of dimensionality: sparseness of the data, handled with variable bandwidth
LSH-based data structure

• Choose L random partitions; each partition includes K pairs (d_k, v_k)
• For each point x, check the K inequalities x_{d_k} ≤ v_k; the K results form the cell key
• This partitions the data into cells
Choosing the optimal K and L

• For a query q, we want to compute the smallest number of distances to points in its buckets
• Large K → a smaller number of points in a cell
• If L is too small, points might be missed; but if L is too big, extra points might be included
• As L increases, the union of cells C̄ = ∪_l C_l covers more of the true neighborhood, but the number of candidate points grows with both L and K; this trade-off determines the resolution of the data structure
Choosing optimal K and L: the procedure

Determine accurately the KNN (at the bandwidth distance) for m randomly selected data points, and choose an error threshold ε for the approximate distances. The optimal K and L should satisfy the error constraint:
• For each K, estimate the error
• In one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K)) and take the minimum
(Plots: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)].)
Data-driven partitions

• In the original LSH, cut values are random in the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
• The resulting points-per-bucket distribution is much more balanced than with uniform cuts
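The two cut-value strategies can be sketched side by side (illustrative names):

```python
import random

def uniform_cuts(low, high, K, rng):
    """Original LSH: the K cut values are drawn uniformly
    over the range of the data."""
    return [rng.uniform(low, high) for _ in range(K)]

def data_driven_cuts(points, K, rng):
    """Suggested variant: pick a random data point and use one of its
    coordinates as the cut value, so cuts concentrate where the data does."""
    cuts = []
    for _ in range(K):
        p = rng.choice(points)
        cuts.append(p[rng.randrange(len(p))])
    return cuts
```

Every data-driven cut coincides with some coordinate of some data point, which is what balances the bucket occupancy on clustered data.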
Additional speedup

Assume that all points in a cell C̄ will converge to the same mode (C̄ acts like a type of aggregate); then the mean-shift iterations need to be run only once per cell rather than once per point.
Speedup results: 65,536 points, 1,638 points sampled, k = 100.
Food for thought: low dimension vs. high dimension

A thought for food…
• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning itself requires KNN

15:30: cookies…
Summary

• LSH trades a little accuracy for a large gain in complexity
• Applications that involve massive data in high dimension require the fast performance of LSH
• LSH extends to different spaces (PSH)
• The LSH parameters and hash functions can be learned for different applications
Conclusion

• But at the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – E-mail Alex Andoni (andoni@mit.edu)
  – Test over your own data (C code, under Red Hat Linux)

Thanks

• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Quadtree – Pitfall 1
In some cases it doesn't.
[Figure: a query near the split point (X1,Y1) must visit several cells: P&lt;X1,P&lt;Y1; P≥X1,P&lt;Y1; P&lt;X1,P≥Y1; P≥X1,P≥Y1]
Quadtree – Pitfall 1
In some cases nothing works.
Quadtree – Pitfall 2
Could result in query time exponential in the dimension: O(2^d).
Space-partition based algorithms
"Multidimensional Access Methods", Volker Gaede, O. Günther
Could be improved
Outline
• Problem definition and flavors
• Algorithms overview - low dimensions
• Curse of dimensionality (d>10..20)
• Enchanting the curse: Locality Sensitive Hashing (high-dimension approximate solutions)
• l2 extension
• Applications (Dan)
Curse of dimensionality
• Query time or space: O(n^d)
• For d>10..20: worse than sequential scan for most geometric distributions
• Techniques specific to high dimensions are needed
• Proved in theory and in practice by Barkol &amp; Rabani 2000 and Beame &amp; Vee 2002
Naïve: O(min(n·d, n^d))
Curse of dimensionality: some intuition
[Figure: the number of cells grows as 2, 2², 2³, …, 2^d]
Outline
• Problem definition and flavors
• Algorithms overview - low dimensions
• Curse of dimensionality (d>10..20)
• Enchanting the curse: Locality Sensitive Hashing (high-dimension approximate solutions)
• l2 extension
• Applications (Dan)
Preview
• General solution – Locality sensitive hashing
• Implementation for Hamming space
• Generalization to l1 &amp; l2
Hash function
[Figure: Data_Item → Hash function → Key → Bin/Bucket]
Example: h(X) = X modulo 3, mapping a number X in the range 0..n to a storage address in 0..2.
Usually we would like related data items to be stored in the same bin.
Recall: r - Nearest Neighbor
dist(q,p1) ≤ r
dist(q,p2) ≥ (1 + ε)·r
r2 = (1 + ε)·r1
[Figure: balls of radius r and (1+ε)r around the query q]
Locality sensitive hashing
A hash family is (r, ε, P1, P2)-sensitive, with r2 = (1 + ε)·r1, if:
≡ Pr[I(p)=I(q)] is "high" (≥ P1) if p is "close" to q (dist ≤ r1)
≡ Pr[I(p)=I(q)] is "low" (≤ P2) if p is "far" from q (dist ≥ r2)
Preview
• General solution – Locality sensitive hashing
• Implementation for Hamming space
• Generalization to l1 &amp; l2
Hamming Space
• Hamming space = the 2^N binary strings of length N
• Hamming distance = the number of changed digits (aka signal distance, Richard Hamming)
Example:
010100001111
010010000011   Distance = 4
• Hamming distance = SUM(X1 XOR X2)
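The XOR formulation can be made concrete with a short sketch (illustrative only, not from the slides):

```python
def hamming(x1: int, x2: int) -> int:
    # Hamming distance = number of differing bits = popcount(x1 XOR x2)
    return bin(x1 ^ x2).count("1")

# The slide's pair of strings:
print(hamming(int("010100001111", 2), int("010010000011", 2)))  # 4
```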
L1 to Hamming Space Embedding
Point p = (8, 2), coordinates in the range 0..C, with C = 11.
Each coordinate x is mapped to x ones followed by C−x zeros (unary code):
8 → 11111111000, 2 → 11000000000, so p → 1111111100011000000000
The embedded dimension is d' = C·d.
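A sketch of the unary embedding (the function name and the check against the slide's example are mine):

```python
def embed_l1(point, C):
    # Thermometer code: coordinate x -> x ones followed by C - x zeros.
    # L1 distance between points = Hamming distance between the codes.
    return "".join("1" * x + "0" * (C - x) for x in point)

print(embed_l1((8, 2), C=11))  # 1111111100011000000000, as on the slide
```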
Hash function
For p ∈ H^d', the j-th hash is Gj(p) = p|Ij, j = 1..L: sampling k bits of p (here k = 3 digits), e.g. p|Ij = 101.
Store p into bucket p|Ij, one of 2^k buckets.
Construction: insert each point p into its bucket in every table 1, 2, …, L.
Query: look up the buckets of q in tables 1, 2, …, L.
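The construction and query steps might be sketched like this (a toy version under my own naming; real implementations add the secondary hashing described later):

```python
import random

def build_tables(points, d, k, L, seed=0):
    # L tables; table j keys each point by the k sampled bit positions I_j.
    rng = random.Random(seed)
    samplings = [rng.sample(range(d), k) for _ in range(L)]
    tables = [{} for _ in range(L)]
    for p in points:
        for I, table in zip(samplings, tables):
            table.setdefault(tuple(p[i] for i in I), []).append(p)  # G_j(p) = p|I_j
    return samplings, tables

def query(q, samplings, tables):
    # Union of the L buckets that q falls into: the candidate set.
    candidates = set()
    for I, table in zip(samplings, tables):
        candidates.update(table.get(tuple(q[i] for i in I), []))
    return candidates

points = [(0, 1, 1, 0), (1, 1, 1, 1), (0, 1, 0, 0)]
samplings, tables = build_tables(points, d=4, k=2, L=3)
print(query((0, 1, 1, 0), samplings, tables))  # candidates near the query
```

Exact matches always land in the query's buckets, so they are always candidates; the verification step then filters the candidates by true distance.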
Alternative intuition: random projections
[Figure: the same L1-to-Hamming embedding; each sampled bit of the unary code tests one coordinate against a threshold]
k samplings split the space into 2^k buckets (000, 001, …, 111 for k = 3); here p falls into bucket 101.
Repeating L times
Secondary hashing: the 2^k sparse buckets are hashed again (simple hashing) into M buckets of size B, supporting volume tuning (dataset size vs. storage volume): M·B = α·n, α = 2.
The above hashing is locality-sensitive:
• Probability(p, q in the same bucket) = (1 − Distance(q,p)/dimensions)^k
[Figure: Pr vs. Distance(q,pi) for k = 1 and k = 2; a larger k makes the collision probability fall off faster]
Adopted from Piotr Indyk's slides
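That collision probability is easy to sanity-check by simulation (the parameters are arbitrary; bits are sampled with replacement, matching the formula):

```python
import random

def empirical_collision(dist, d, k, trials=20000, seed=1):
    # Strings fixed to differ exactly on positions 0..dist-1; k bits sampled
    # with replacement collide iff every sampled position is an agreeing one.
    rng = random.Random(seed)
    hits = sum(
        all(rng.randrange(d) >= dist for _ in range(k))
        for _ in range(trials)
    )
    return hits / trials

print(empirical_collision(dist=5, d=20, k=3), (1 - 5 / 20) ** 3)  # both ~ 0.42
```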
Preview
• General solution – Locality sensitive hashing
• Implementation for Hamming space
• Generalization to l2
Direct L2 solution
• New hashing function
• Still based on sampling
• Using a mathematical trick
• P-stable distribution for Lp distance; Gaussian distribution for L2 distance
Central limit theorem
A weighted sum of Gaussians is a Gaussian: for real numbers v1..vn and i.i.d. Gaussian X1..Xn,
  v1·X1 + v2·X2 + … + vn·Xn ~ ||v||2 · X
so the dot product maps to the norm. Applying this to the difference of two feature vectors u and v:
  Σi ui·Xi − Σi vi·Xi = Σi (ui − vi)·Xi ~ ||u − v||2 · X
so the dot-product difference maps to the distance.
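The norm/distance correspondence can be checked numerically (the vectors and sample count below are arbitrary):

```python
import math, random

random.seed(0)
u = [3.0, 1.0, 4.0, 1.0, 5.0]
v = [2.0, 7.0, 1.0, 8.0, 2.0]
diffs = []
for _ in range(4000):
    a = [random.gauss(0.0, 1.0) for _ in range(len(u))]
    # a.u - a.v = sum_i (u_i - v_i) X_i, a Gaussian scaled by ||u - v||_2
    diffs.append(sum(ai * (ui - vi) for ai, ui, vi in zip(a, u, v)))
rms = math.sqrt(sum(x * x for x in diffs) / len(diffs))
print(rms, math.dist(u, v))  # the two values nearly agree
```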
The full Hashing
  h_{a,b}(v) = ⌊(a·v + b) / w⌋
• v – the features vector (e.g. [34, 82, 21, …], dimension d)
• a – d random numbers, i.i.d. from a p-stable distribution
• b – a random phase in [0, w]
• w – the discretization step
Example: with w = 100, a projection value a·v + b = 7944 falls into the cell [7900, 8000).
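A minimal sketch of that hash (the 2-stable Gaussian case; the names are mine):

```python
import math, random

def make_hash(d, w, seed=0):
    # h_{a,b}(v) = floor((a.v + b) / w): a ~ N(0,1)^d (2-stable), b ~ U[0, w].
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(d)]
    b = rng.uniform(0.0, w)
    return lambda v: math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)

h = make_hash(d=3, w=100.0)
print(h([34.0, 82.0, 21.0]), h([34.5, 81.5, 21.0]))  # nearby points usually share a cell
```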
Generalization: P-stable distributions
• L2: Central Limit Theorem → Gaussian (normal) distribution (2-stable)
• Lp: Generalized Central Limit Theorem → p-stable distribution (e.g. Cauchy, the 1-stable distribution, for L1)
P-Stable summary
• Works for the r - Nearest Neighbor problem; generalizes to 0 &lt; p ≤ 2
• Improves query time: O(d·n^{1/(1+ε)}·log n), and O(d·n^{1/(1+ε)²}·log n) in the latest results (reported in email by Alexander Andoni)
Parameters selection
• 90% success probability; best query-time performance (for Euclidean space)
• A single projection hits an ε-Nearest Neighbor with Pr = p1
• k projections hit an ε-Nearest Neighbor with Pr = p1^k
• L hashings fail to collide with Pr = (1 − p1^k)^L
• To ensure a collision with probability 1 − δ (e.g. ≥ 90%): 1 − (1 − p1^k)^L ≥ 1 − δ, i.e. L ≥ log(δ) / log(1 − p1^k)
Larger k rejects more non-neighbors but accepts fewer neighbors per table; the total query time balances candidate extraction against candidate verification.
[Figure: query time vs. k; candidate-extraction cost grows with k while candidate-verification cost shrinks]
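The bound on L can be turned into a small helper (the illustrative values below are mine):

```python
import math

def tables_needed(p1, k, delta):
    # Smallest L with 1 - (1 - p1**k)**L >= 1 - delta.
    return math.ceil(math.log(delta) / math.log(1.0 - p1 ** k))

L = tables_needed(p1=0.9, k=10, delta=0.1)
print(L, 1 - (1 - 0.9 ** 10) ** L)  # 6 tables give >= 90% collision probability
```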
Pros &amp; Cons
Pros:
• Better query time than spatial data structures
• Scales well to higher dimensions and larger data sizes (sub-linear dependence)
• Predictable running time
Cons:
• Extra storage overhead
• Inefficient for data with distances concentrated around the average
• Works best for Hamming distance (although it can be generalized to Euclidean space)
• In secondary storage, linear scan is pretty much all we can do (for high dim)
• Requires the radius r to be fixed in advance
From Piotr Indyk's slides
Conclusion
• …but in the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni (andoni@mit.edu)
  – Test over your own data (C code under Red Hat Linux)
LSH - Applications
• Searching video clips in databases ("Hierarchical Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression – vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short, whenever K-Nearest Neighbors (KNN) are needed
Motivation
• A variety of procedures in learning require KNN computation
• KNN search is a computational bottleneck
• LSH provides a fast approximate solution to the problem
• LSH requires hash function construction and parameter tuning
Outline
"Fast Pose Estimation with Parameter Sensitive Hashing", G. Shakhnarovich, P. Viola, T. Darrell
• Finding sensitive hash functions
"Mean Shift Based Clustering in High Dimensions: A Texture Classification Example", B. Georgescu, I. Shimshoni, P. Meer
• Tuning LSH parameters
• The LSH data structure is used for algorithm speedups
The Problem
"Fast Pose Estimation with Parameter Sensitive Hashing", G. Shakhnarovich, P. Viola, T. Darrell
Given an image x, what are the parameters θ in this image, i.e. the angles of the joints, the orientation of the body, etc.?
Ingredients
• Input: query image with unknown angles (parameters)
• Database of human poses with known angles
• Image feature extractor – edge detector
• Distance metric in feature space: dx
• Distance metric in angle space: d(θ1, θ2) = Σ_{i=1..m} (1 − cos(θ1,i − θ2,i))
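The angle-space metric, as reconstructed above, is a one-liner (a sketch only; the function name is mine):

```python
import math

def angle_dist(t1, t2):
    # d(theta1, theta2) = sum_i (1 - cos(theta1_i - theta2_i)):
    # 0 for identical poses, 2 per joint rotated by pi.
    return sum(1.0 - math.cos(a - b) for a, b in zip(t1, t2))

print(angle_dist([0.0, math.pi / 2], [0.0, math.pi / 2]))  # 0.0
print(angle_dist([0.0], [math.pi]))                        # 2.0
```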
Example based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query
Input: query → find KNN in the database of examples → output: average angles of the KNN
The algorithm flow
Input query → feature extraction → processed query → PSH (LSH) against the database of examples → LWR (regression) → output: match
The image features
Image features are multi-scale edge histograms.
[Figure: edge-direction histograms computed over image sub-windows A and B]
(Pipeline: Feature Extraction → PSH → LWR)
PSH: the basic assumption
There are two metric spaces here: the feature space (dx) and the parameter space (dθ).
We want similarity to be measured in the angle space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space.
Insight: manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
[Figure: a query q mapped between the parameter space (angles) and the feature space]
Is this magic?
Parameter Sensitive Hashing (PSH)
The trick: estimate the performance of different hash functions on examples, and select those that are sensitive to dθ.
The hash functions are applied in the feature space, but the KNN are valid in the angle space.
PSH as a classification problem:
1. Label pairs of examples with similar angles
2. Define hash functions h on the feature space
3. Predict the labeling of similar/non-similar examples using h
4. Compare the labelings
5. If the labeling by h is good, accept h; else change h
Labels (r = 0.25):
A pair of examples (xi, xj) is labeled
  yij = +1 if dθ(θi, θj) ≤ r
  yij = −1 if dθ(θi, θj) ≥ (1 + ε)·r
A binary hash function on the features:
  hT(x) = +1 if x ≥ T, −1 otherwise
Predict the labels:
  ŷh(xi, xj) = +1 if hT(xi) = hT(xj), −1 otherwise
Find the best threshold T that predicts the true labeling under the probability constraints; hT will place both examples in the same bin or separate them.
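A toy sketch of that selection loop (the feature values, the pair labels, and the candidate-threshold grid are all invented):

```python
def h(x, T):
    # Binary hash on a single feature: +1 if x >= T else -1.
    return 1 if x >= T else -1

def accuracy(T, pairs):
    # Fraction of labeled pairs (xi, xj, y) whose prediction
    # (+1 if hashed together, -1 if separated) matches y.
    ok = sum((1 if h(xi, T) == h(xj, T) else -1) == y for xi, xj, y in pairs)
    return ok / len(pairs)

# similar poses (+1) have close feature values; dissimilar (-1) do not
pairs = [(0.1, 0.2, 1), (0.8, 0.9, 1), (0.1, 0.9, -1), (0.2, 0.8, -1)]
best_T = max((t / 10 for t in range(11)), key=lambda T: accuracy(T, pairs))
print(best_T, accuracy(best_T, pairs))  # a threshold between the clusters wins
```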
Local Weighted Regression (LWR)
• Given a query image x, PSH returns its KNNs
• LWR uses the KNNs to compute a weighted average of the estimated angles of the query:
  θ0 = argmin_θ Σ_{xi ∈ N(x)} dθ(g(xi), θ) · K(dx(xi, x))
  where g(xi) are the stored angles of neighbor xi and the kernel K turns feature-space distance into a weight.
Results
Synthetic data were generated:
• 13 angles: 1 for the rotation of the torso, 12 for the joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, face expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (l)
• Test on 1,000 synthetic examples; PSH searched only 3.4% of the data per query
• Without the selection, 40 bits and 1,000 hash tables would be needed
Recall: P1 is the probability of a positive hash, P2 the probability of a bad hash, B the maximum number of points in a bucket.
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 1.3% of the data were searched
Interesting mismatches
Fast pose estimation - summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given n spheres in R^d, centered at P = {p1, …, pn}, with radii r1, …, rn
• Goal: given a query q, preprocess the points in P to find a point pi whose sphere covers the query q
Courtesy of Mohamad Hegaze
"Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example", B. Georgescu, I. Shimshoni, P. Meer
Motivation
• Clustering high-dimensional data by using local density measurements (e.g. in a feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions, using LSH
• Speedups:
  1. Finding optimal LSH parameters
  2. Data-driven partitions into buckets
  3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
(Roadmap: Mean-shift → LSH → optimal k,l → LSH data partition → LSH data structure)
[Figure: each point moves toward the local density mode within its bandwidth window]
KNN in mean-shift:
• The bandwidth should be inversely proportional to the density in the region: high density → small bandwidth; low density → large bandwidth
• It is based on the kth nearest neighbor of the point: the bandwidth is the distance to that neighbor
Adaptive mean-shift vs. non-adaptive
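The kth-nearest-neighbor bandwidth rule can be sketched in one dimension (a toy version; the paper works in the full feature space):

```python
def adaptive_bandwidth(points, k):
    # Per-point bandwidth = distance to the k-th nearest neighbor:
    # small in dense regions, large in sparse ones.
    bw = []
    for i, p in enumerate(points):
        dists = sorted(abs(p - q) for j, q in enumerate(points) if j != i)
        bw.append(dists[k - 1])
    return bw

pts = [0.0, 0.1, 0.2, 0.3, 5.0, 9.0]   # dense cluster, then a sparse tail
print(adaptive_bandwidth(pts, k=2))    # grows toward the sparse end
```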
Image segmentation algorithm:
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths hs (spatial) and hr (color)
3. Apply filtering
["Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02]
[Figure: original → filtered → segmented]
Filtering: each pixel takes the value of the nearest mode
Mean-shift trajectories
Filtering examples
[Figure: original vs. filtered — squirrel, baboon]
Segmentation examples
["Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02]
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries (implemented with LSH)
• Statistical curse of dimensionality: sparseness of the data (handled with variable bandwidth)
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (dk, vk)
• For each point x we check whether x_{dk} ≤ vk; the K results form the cell key
• It partitions the data into cells
Choosing the optimal K and L
• For a query q, compute the smallest number of distances to points in its buckets
• Large K → a smaller number of points in a cell
• If L is too small, points might be missed; but if L is too big, extra points might be included
• As L increases, the covered neighborhood C grows but the query cost increases; K determines the resolution of the data structure
Choosing optimal K and L
• Determine accurately the KNN distance (bandwidth) for m randomly-selected data points
• Choose an error threshold ε
• The optimal K and L should satisfy: the approximate distance is within (1 + ε) of the true distance
• For each K, estimate the error; in one run over all L's, find the minimal L satisfying the constraint, L(K)
• Minimize the running time t(K, L(K))
[Figure: approximation error for (K, L); L(K) for ε = 0.05; running time t[K, L(K)] with its minimum]
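The tuning loop might be sketched generically; the `error` and `time` models below are invented stand-ins for measurements on sample queries:

```python
def tune(Ks, max_L, error, time, eps=0.05):
    # For each K, find the minimal L with error(K, L) <= eps (this is L(K)),
    # then keep the (K, L(K)) pair minimizing time.
    best = None
    for K in Ks:
        L = next((L for L in range(1, max_L + 1) if error(K, L) <= eps), None)
        if L is not None and (best is None or time(K, L) < time(*best)):
            best = (K, L)
    return best

# toy stand-ins: more bits/tables -> less error; cost grows with K and L
err = lambda K, L: 1.0 / (K * L)
cost = lambda K, L: K + 2 * L
print(tune(range(1, 31), 64, err, cost))  # (5, 4)
```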
Data-driven partitions
• In the original LSH, the cut values are chosen uniformly at random in the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
[Figure: bucket distribution — uniform cuts vs. data-driven cuts]
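The effect of data-driven cuts shows up on skewed data; a small simulation (all numbers invented):

```python
import random

def max_bucket_share(data, cuts):
    # Partition by the cut values; return the largest bucket's share.
    buckets = {}
    for x in data:
        key = tuple(x <= c for c in sorted(cuts))
        buckets[key] = buckets.get(key, 0) + 1
    return max(buckets.values()) / len(data)

data_rng, cut_rng = random.Random(1), random.Random(0)
data = [data_rng.gauss(0, 1) for _ in range(500)] + [100.0]  # cluster + one far outlier
uniform_cuts = [cut_rng.uniform(min(data), max(data)) for _ in range(4)]
driven_cuts = [cut_rng.choice(data) for _ in range(4)]
print(max_bucket_share(data, uniform_cuts), max_bucket_share(data, driven_cuts))
```

With the outlier stretching the data range, the uniform cuts land in empty space and almost everything falls into one bucket, while the data-driven cuts split the dense cluster.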
Additional speedup
Assume that all points in a cell C will converge to the same mode (C acts like a type of aggregate).
Speedup results
65,536 points; 1,638 points sampled; k = 100
Food for thought
Low dimension vs. high dimension
• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning itself requires KNN
15:30: cookies…
Summary
• LSH trades some accuracy for a gain in complexity
• Applications that involve massive data in high dimensions require the fast performance of LSH
• The LSH framework extends to different spaces (PSH)
• The LSH parameters and hash functions can be learned for different applications
Conclusion
• …but in the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni (andoni@mit.edu)
  – Test over your own data (C code under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Quadtree ndash Pitfall1
X
Y
In some cases doesnrsquot
X1Y1PgeX1PgeY1
PltX1
PltX1PltY1 PgeX1
PltY1PltX1PgeY1
X1Y1
Quadtree ndash Pitfall1
X
Y
In some cases nothing works
Quadtree ndash pitfall 2X
Y
O(2d)
Could result in Query time Exponential in dimensions
Space partition based algorithms
Multidimensional access methods Volker Gaede O Gunther
Could be improved
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)Curse of dimensionality (dgt1020)bullEnchanting the curse
Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Curse of dimensionality
bullQuery time or spaceO(nd)bullDgt1020 worst than sequential scan
ndashFor most geometric distributionsbullTechniques specific to high dimensions are needed
bullProoved in theory and in practice by Barkol amp Rabani 2000 amp Beame-Vee 2002
O( min(nd nd) )Naive
Curse of dimensionalitySome intuition
2
22
23
2d
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse Enchanting the curse
Locality Sensitive Hashing Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hash function
Hash function
Hash function
Data_Item
Key
BinBucket
Hash function
X modulo 3
X=Number in the range 0n
02
Storage Address
Data structure
0
Usually we would like related Data-items to be stored at the same bin
Recall r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Locality sensitive hashing
r(1 + ) r
(r p1p2 )Sensitiveequiv Pr[I(p)=I(q)] is ldquohighrdquo if p is ldquocloserdquo to qequiv Pr[I(p)=I(q)] is ldquolowrdquo if p isrdquofarrdquo from q
r2=(1 + ) r1
P1P2
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
bullHamming space = 2N binary strings
bullHamming distance = changed digits
aka Signal distanceRichard Hamming
Hamming SpaceN
010100001111
010100001111
010010000011Distance = 4
bullHamming space
bullHamming distance
SUM(X1 XOR X2)
L1 to Hamming Space Embedding
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Hash function
Lj Hash function
p Hdrsquoisin
Gj(p)=p|Ij
j=1L k=3 digits
Bits sampling from p
Store p into bucket p|Ij 2k buckets101
11000000000 111111110000 111000000000 111111110001
Construction
1 2 L
p
Query
1 2 L
q
Alternative intuition random projections
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Pros & Cons
Pros:
• Better query time than spatial data structures
• Scales well to higher dimensions and larger data sizes (sub-linear dependence)
• Predictable running time
Cons:
• Extra storage overhead
• Inefficient for data with distances concentrated around the average
• Works best for the Hamming distance (although it can be generalized to Euclidean space)
• In secondary storage, a linear scan is pretty much all we can do (for high dimension)
• Requires the radius r to be fixed in advance
From Piotr Indyk's slides
Conclusion
• …but in the end, everything depends on your data set
• Try it at home:
– Visit http://web.mit.edu/andoni/www/LSH/index.html
– E-mail Alex Andoni (andoni@mit.edu)
– Test it on your own data (C code, under Red Hat Linux)
LSH – Applications
• Searching video clips in databases ("Hierarchical Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression – vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short: whenever K-Nearest Neighbors (KNN) are needed
Motivation
• A variety of procedures in learning require KNN computation
• KNN search is a computational bottleneck
• LSH provides a fast approximate solution to the problem
• LSH requires hash-function construction and parameter tuning
Outline
"Fast Pose Estimation with Parameter Sensitive Hashing", G. Shakhnarovich, P. Viola and T. Darrell
• Finding sensitive hash functions
"Mean Shift Based Clustering in High Dimensions: A Texture Classification Example", B. Georgescu, I. Shimshoni and P. Meer
• Tuning LSH parameters
• The LSH data structure is used for algorithm speedups
The Problem
Given an image x, what are the parameters θ in this image, i.e. the angles of the joints, the orientation of the body, etc.?
"Fast Pose Estimation with Parameter Sensitive Hashing", G. Shakhnarovich, P. Viola and T. Darrell
Ingredients
• Input: a query image with unknown angles (parameters)
• A database of human poses with known angles
• An image feature extractor – edge detector
• A distance metric in feature space: d_x
• A distance metric in angle space:
  d_θ(θ₁, θ₂) = Σᵢ₌₁ᵐ (1 − cos(θ₁,ᵢ − θ₂,ᵢ))
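A sketch of this angle distance, as reconstructed from the slide's formula (the test poses below are illustrative):

```python
import math

# d_theta(t1, t2) = sum_i (1 - cos(t1_i - t2_i)) over the m joint angles:
# 0 for identical poses, and up to 2 per angle for opposite ones.
def d_theta(t1, t2):
    return sum(1.0 - math.cos(a - b) for a, b in zip(t1, t2))

same = d_theta([0.1, 0.5], [0.1, 0.5])
opposite = d_theta([0.0], [math.pi])
print(same, opposite)
```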
Example based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query
Input: query → find the KNN in the database of examples → output: average angles of the KNN
The algorithm flow
Input query → feature extraction → processed query → PSH (LSH) against the database of examples → LWR (regression) → output: match
The image features
Image features are multi-scale edge histograms, computed over sub-windows (such as A and B).
Feature Extraction → PSH → LWR
PSH: the basic assumption
There are two metric spaces here: the feature space (d_x) and the parameter space (d_θ).
We want similarity to be measured in the angle space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
[Figure: a query q mapped between the parameter space (angles) and the feature space]
Is this magic?
Parameter Sensitive Hashing (PSH)
The trick: estimate the performance of different hash functions on examples, and select those sensitive to d_θ.
The hash functions are applied in the feature space, but the KNN are valid in the angle space.
• Label pairs of examples with similar angles
• Define hash functions h on the feature space
• Predict the labeling of similar/non-similar examples by using h
• Compare the labelings
• If the labeling by h is good, accept h; else change h
PSH as a classification problem
Labels (r = 0.25): +1, +1, −1, −1
A pair of examples (xᵢ, xⱼ) is labeled:
  yᵢⱼ = +1 if d_θ(θᵢ, θⱼ) ≤ r
  yᵢⱼ = −1 if d_θ(θᵢ, θⱼ) > (1 + ε)·r
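The pair-labeling rule can be sketched as follows (r = 0.25 as on the slide; ε and the sample distances are illustrative):

```python
# +1 for pairs closer than r in angle space, -1 for pairs farther than
# (1 + eps) * r, and 0 for the unlabeled "gray zone" in between.
def label(d, r=0.25, eps=0.5):
    if d <= r:
        return +1
    if d > (1.0 + eps) * r:
        return -1
    return 0

print(label(0.2), label(0.5), label(0.3))
```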
A binary hash function on the features:
  h_T(x) = +1 if the feature value x ≥ T, −1 otherwise
Predict the labels:
  ŷ_h(xᵢ, xⱼ) = +1 if h_T(xᵢ) = h_T(xⱼ), −1 otherwise
Find the best threshold T that predicts the true labeling within the probability constraints; h_T(x) will place both examples in the same bin, or separate them.
Local Weighted Regression (LWR)
• Given a query image, PSH returns its KNNs
• LWR uses the KNNs to compute a weighted average of the estimated angles of the query:
  β₀ = argmin_β Σ_{xᵢ ∈ N(x₀)} d_θ(g(xᵢ, β), θᵢ) · K(d_x(xᵢ, x₀))
  (K is a kernel turning distance into weight)
Results
Synthetic data were generated:
• 13 angles: 1 for the rotation of the torso, 12 for the joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, facial expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (L)
• Tested on 1,000 synthetic examples
• PSH searched only 3.4% of the data per query
• Without the feature selection, 40 bits and 1,000 hash tables were needed
Recall: P1 is the probability of a positive hash, P2 is the probability of a bad hash, B is the maximum number of points in a bucket
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 13% of the data were searched
Results – real data
Interesting mismatches
Fast pose estimation – summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given: n spheres in R^d, centered at P = {p₁,…,pₙ}, with radii r₁,…,rₙ
• Goal: given a query q, preprocess the points in P so as to find a point pᵢ whose sphere covers the query q
Courtesy of Mohamad Hegaze
"Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example", B. Georgescu, I. Shimshoni and P. Meer
Motivation
• Clustering high-dimensional data by using local density measurements (e.g. in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
  1. Finding optimal LSH parameters
  2. Data-driven partitions into buckets
  3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
[Figure: a window of radius "bandwidth" around a point is shifted toward the local mean]
Mean-shift → LSH: optimal k, l → LSH: data partition → LSH: data structure
KNN in mean-shift
• The bandwidth should be inversely proportional to the density in the region: high density → small bandwidth; low density → large bandwidth
• Based on the kth nearest neighbor of the point: the bandwidth is the distance to the point's kth nearest neighbor
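A 1-D sketch of this adaptive rule (the toy data and k are illustrative):

```python
# Each point's bandwidth is the distance to its k-th nearest neighbor,
# so dense regions get small bandwidths and sparse regions large ones.
def kth_nn_bandwidth(points, k):
    return [sorted(abs(p - q) for j, q in enumerate(points) if j != i)[k - 1]
            for i, p in enumerate(points)]

data = [0.0, 0.1, 0.2, 5.0]  # a dense clump and one isolated point
hs = kth_nn_bandwidth(data, k=2)
print(hs)
```

The isolated point receives a much larger bandwidth than the clump.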
Adaptive mean-shift vs non-adaptive
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths: h_s (spatial), h_r (color)
3. Apply filtering
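The mean-shift step behind the filtering can be sketched in 1-D with a flat kernel (toy data and bandwidth are illustrative; the paper uses multivariate kernels):

```python
# Repeatedly replace x by the mean of the data points within bandwidth h
# until it stops moving; the fixed point is a mode of the density.
def mean_shift_mode(x, data, h, iters=100):
    for _ in range(iters):
        window = [p for p in data if abs(p - x) <= h]
        nxt = sum(window) / len(window)
        if abs(nxt - x) < 1e-9:
            break
        x = nxt
    return x

data = [1.0, 1.1, 1.2, 5.0, 5.1]
print(mean_shift_mode(0.9, data, h=0.5), mean_shift_mode(5.3, data, h=0.5))
```

Starting points near each clump converge to that clump's mode, which is exactly what the filtering step exploits.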
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
[3D feature-space view]
"Mean-shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Image segmentation algorithm
original → filtered → segmented
Filtering: pixel value of the nearest mode
Mean-shift trajectories
Filtering examples
original squirrel → filtered; original baboon → filtered
"Mean-shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Segmentation examples
"Mean-shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries → implemented with LSH
• Statistical curse of dimensionality: sparseness of the data → variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (d_k, v_k)
• For each point x we check whether x_{d_k} ≤ v_k for each of the K pairs; the results partition the data into cells
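A minimal sketch of one such partition (dimension, K and the seed are illustrative):

```python
import random

# Each partition is K pairs (d_k, v_k); a point's cell is the K-bit
# vector of boolean tests x[d_k] <= v_k, so the cuts slice the space
# into axis-parallel cells.
random.seed(3)
dim, K = 4, 3
cuts = [(random.randrange(dim), random.random()) for _ in range(K)]

def cell(x):
    return tuple(x[d] <= v for d, v in cuts)

a = [0.1, 0.2, 0.3, 0.4]
print(cell(a))
```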
Choosing the optimal K and L
• For a query q, compute the smallest number of distances to points in its buckets
• Large K → a smaller number of points in each cell C
• If L is too small, points might be missed; but if L is too big, extra points might be included
[Expected numbers of points per cell C, and per union of cells over the L partitions, as functions of n, d, K and L]
As L increases, ∪Cₗ increases but ∩Cₗ decreases; ∩Cₗ determines the resolution of the data structure
Choosing optimal K and L
• Determine accurately the KNN for m randomly-selected data points; their distance gives the bandwidth
• Choose an error threshold ε
• The optimal K and L should satisfy: the approximate distance is within the threshold of the true one
• For each K, estimate the error; in one run over all L's, find the minimal L satisfying the constraint, L(K); then minimize the running time t(K, L(K))
[Plots: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)] with its minimum marked]
Data driven partitions
• In the original LSH, cut values are random in the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
[Histogram: bucket distribution for uniform cut points vs. data-driven cut points]
Additional speedup
Assume that all the points in ∩C converge to the same mode (∩C is like a type of aggregate)
Speedup results
65,536 points; 1,638 points sampled; k = 100
Food for thought
Low dimension vs. high dimension
A thought for food…
• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning requires KNN
15:30 – cookies…
Summary
• LSH suggests a compromise: trade some accuracy for a large gain in complexity
• Applications that involve massive data in high dimension require LSH's fast performance
• Extension of LSH to different spaces (PSH)
• Learning the LSH parameters and hash functions for different applications
Conclusion
• …but in the end, everything depends on your data set
• Try it at home:
– Visit http://web.mit.edu/andoni/www/LSH/index.html
– E-mail Alex Andoni (andoni@mit.edu)
– Test it on your own data (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Quadtree – Pitfall 1
In some cases nothing works
[Figure: X-Y point configuration]
Quadtree – Pitfall 2
Could result in query time exponential in the dimension: O(2^d)
Space-partition based algorithms could be improved
"Multidimensional Access Methods", V. Gaede, O. Günther
Outline
• Problem definition and flavors
• Algorithms overview – low dimensions
• Curse of dimensionality (d > 10..20)
• Enchanting the curse: Locality Sensitive Hashing (high-dimension approximate solutions)
• l2 extension
• Applications (Dan)
Curse of dimensionality
• Query time or space O(n^d)
• For d > 10..20, worse than a sequential scan for most geometric distributions
• Techniques specific to high dimensions are needed
• Proved in theory and in practice by Barkol & Rabani (2000) and Beame & Vee (2002)
• Naive: O(min(n·d, n^d))
Curse of dimensionality – some intuition
The number of cells grows exponentially with the dimension: 2, 2², 2³, …, 2^d
Outline
• Problem definition and flavors
• Algorithms overview – low dimensions
• Curse of dimensionality (d > 10..20)
• Enchanting the curse: Locality Sensitive Hashing (high-dimension approximate solutions)
• l2 extension
• Applications (Dan)
Preview
• General solution – locality sensitive hashing
• Implementation for the Hamming space
• Generalization to l1 & l2
Hash function
Data item → hash function → key → bin/bucket
Example: h(X) = X modulo 3, where X is a number in the range 0..n; the key (0..2) is the storage address in the data structure.
Usually we would like related data items to be stored in the same bin.
Recall: r-nearest neighbor
dist(q, p₁) ≤ r
dist(q, p₂) ≥ (1 + ε)·r,  where r₂ = (1 + ε)·r₁
Locality sensitive hashing
A family is (r₁, r₂, p₁, p₂)-sensitive if:
≡ Pr[I(p) = I(q)] is "high" if p is "close" to q (dist ≤ r₁)
≡ Pr[I(p) = I(q)] is "low" if p is "far" from q (dist ≥ r₂ = (1 + ε)·r₁)
Preview
• General solution – locality sensitive hashing
• Implementation for the Hamming space
• Generalization to l1 & l2
Hamming Space
• The Hamming space is the set of 2^N binary strings of length N
• The Hamming distance is the number of changed digits, a.k.a. the signal distance (Richard Hamming)
Example:
010100001111
010010000011  → distance = 4
• Hamming distance = SUM(X1 XOR X2)
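The definition above in one line of code, using the slide's own example:

```python
# Hamming distance exactly as the slide defines it: SUM(x1 XOR x2).
def hamming(x1: str, x2: str) -> int:
    assert len(x1) == len(x2)
    return sum(b1 != b2 for b1, b2 in zip(x1, x2))

print(hamming("010100001111", "010010000011"))  # 4
```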
L1 to Hamming space embedding
A point p = (8, 2) with C = 11 maps to the concatenation of unary codes:
8 → 11111111000,  2 → 11000000000  ⇒  1111111100011000000000
The new dimension is d′ = C·d, and the L1 distance becomes the Hamming distance.
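A sketch of the unary embedding, using the slide's p = (8, 2) and C = 11 (the second point q is illustrative):

```python
# Each coordinate x in [0, C] becomes x ones followed by C - x zeros,
# so L1 distance turns into Hamming distance.
C = 11

def embed(point):
    return "".join("1" * x + "0" * (C - x) for x in point)

def hamming(s, t):
    return sum(a != b for a, b in zip(s, t))

p, q = (8, 2), (5, 4)
l1 = sum(abs(a - b) for a, b in zip(p, q))
print(embed(p))                       # 22 bits: d' = C * d
print(hamming(embed(p), embed(q)), l1)
```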
Hash function
G_j(p) = p|I_j – k bits sampled from p ∈ H^{d′} (in the example, j = 1..L and k = 3 digits)
Store p in the bucket indexed by p|I_j (one of 2^k buckets), e.g. p|I_j = 101
Construction: insert every p into its bucket in each of the tables 1, 2, …, L
Query: look up q's bucket in each of the tables 1, 2, …, L
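The construction and query steps above can be sketched as follows (string length, k, L and the seed are illustrative):

```python
import random

# Bit-sampling LSH over Hamming strings: each of the L tables samples
# k bit positions I_j and buckets a string p by the key p|I_j.
random.seed(2)
N, k, L = 22, 3, 4
tables = [(random.sample(range(N), k), {}) for _ in range(L)]

def key(p, idx):
    return "".join(p[i] for i in idx)

def insert(p):
    for idx, buckets in tables:
        buckets.setdefault(key(p, idx), set()).add(p)

def query(q):
    candidates = set()
    for idx, buckets in tables:
        candidates |= buckets.get(key(q, idx), set())
    return candidates

insert("1111111100011000000000")
insert("0000000000011111111111")
print(query("1111111100011000000000"))
```

The query returns the union of the L buckets as the candidate set, which is then verified by exact distance computations.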
Alternative intuition: random projections
The embedded string p (e.g. 1111111100011000000000, with C = 11 and d′ = C·d) is projected onto k randomly sampled coordinates.
The k sampled bits (e.g. 101) index one of 2^k buckets – for k = 3: 000, 100, 110, 001, 101, 111, …
Repeating L times
Secondary hashing supports volume tuning (dataset size vs. storage volume): the 2^k buckets (e.g. 011, each of size B) are mapped by a simple hash into M buckets, with M·B = α·n, α = 2.
The above hashing is locality-sensitive:
• Pr[p, q in the same bucket] = (1 − Distance(p, q)/d′)^k
[Plots: collision probability vs. distance(q, pᵢ) for k = 1 and k = 2]
Adapted from Piotr Indyk's slides
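A quick numeric check of this collision probability (d′ and the distances are illustrative):

```python
# Pr[collision] = (1 - dist/d')**k: farther pairs collide less often,
# and raising k sharpens the drop-off.
def p_collide(dist, d_prime, k):
    return (1.0 - dist / d_prime) ** k

near, far = p_collide(2, 22, 3), p_collide(10, 22, 3)
print(near, far)
```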
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Quadtree ndash pitfall 2X
Y
O(2d)
Could result in Query time Exponential in dimensions
Space partition based algorithms
Multidimensional access methods Volker Gaede O Gunther
Could be improved
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)Curse of dimensionality (dgt1020)bullEnchanting the curse
Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Curse of dimensionality
bullQuery time or spaceO(nd)bullDgt1020 worst than sequential scan
ndashFor most geometric distributionsbullTechniques specific to high dimensions are needed
bullProoved in theory and in practice by Barkol amp Rabani 2000 amp Beame-Vee 2002
O( min(nd nd) )Naive
Curse of dimensionalitySome intuition
2
22
23
2d
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse Enchanting the curse
Locality Sensitive Hashing Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hash function
Hash function
Hash function
Data_Item
Key
BinBucket
Hash function
X modulo 3
X=Number in the range 0n
02
Storage Address
Data structure
0
Usually we would like related Data-items to be stored at the same bin
Recall r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Locality sensitive hashing
r(1 + ) r
(r p1p2 )Sensitiveequiv Pr[I(p)=I(q)] is ldquohighrdquo if p is ldquocloserdquo to qequiv Pr[I(p)=I(q)] is ldquolowrdquo if p isrdquofarrdquo from q
r2=(1 + ) r1
P1P2
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
bullHamming space = 2N binary strings
bullHamming distance = changed digits
aka Signal distanceRichard Hamming
Hamming SpaceN
010100001111
010100001111
010010000011Distance = 4
bullHamming space
bullHamming distance
SUM(X1 XOR X2)
L1 to Hamming Space Embedding
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Hash function
Lj Hash function
p Hdrsquoisin
Gj(p)=p|Ij
j=1L k=3 digits
Bits sampling from p
Store p into bucket p|Ij 2k buckets101
11000000000 111111110000 111000000000 111111110001
Construction
1 2 L
p
Query
1 2 L
q
Alternative intuition random projections
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1, …, vn — real numbers; X1, …, Xn — independent, identically distributed (i.i.d.) Gaussians.
v1·X1 + v2·X2 + … + vn·Xn is itself Gaussian:
a weighted sum of Gaussians is a (scaled) Gaussian.
Central limit theorem
Σi vi·Xi  ~  ||v||2 · X    (dot product ↔ norm)
For two feature vectors u and v:
Σi ui·Xi − Σi vi·Xi = Σi (ui − vi)·Xi  ~  ||u − v||2 · X    (dot product ↔ distance)
so the gap between the two projections is a Gaussian scaled by the L2 distance.
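The 2-stability property can be checked numerically (an illustrative sketch): projecting u − v onto random Gaussian vectors gives a Gaussian whose standard deviation is the L2 distance ||u − v||2.

```python
import math
import random

def projected_gap_std(u, v, trials=50000, seed=7):
    """Std of a.u - a.v = a.(u - v) over random Gaussian vectors a;
    by 2-stability this estimates ||u - v||_2."""
    rng = random.Random(seed)
    gaps = []
    for _ in range(trials):
        a = [rng.gauss(0, 1) for _ in u]
        gaps.append(sum(ai * (ui - vi) for ai, ui, vi in zip(a, u, v)))
    mean = sum(gaps) / trials
    return math.sqrt(sum((g - mean) ** 2 for g in gaps) / trials)
```

For u = (3, 4, 0) and v = 0 the estimate should be close to 5.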
The full Hashing
h_{a,b}(v) = ⌊(a·v + b) / w⌋
• v — features vector, e.g. [34, 82, 21, …] (d coordinates)
• a — d random numbers, i.i.d. from a p-stable distribution
• b — random phase in [0, w]
• w — discretization step
Example: a·v = 7944, phase b = 34, step w = 100: the line is divided into
cells …7800, 7900, 8000, 8100, 8200…, and 7944 + 34 falls into cell 79.
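h_{a,b} can be sketched directly (a hypothetical helper with invented names; Gaussian entries give the 2-stable case for L2, and w and seed are free parameters):

```python
import math
import random

def make_hash(d, w, seed=None):
    """One p-stable LSH function for L2: h(v) = floor((a.v + b) / w),
    with a ~ N(0, 1)^d (2-stable) and phase b uniform in [0, w)."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(d)]
    b = rng.uniform(0, w)
    def h(v):
        return math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
    return h
```

Nearby vectors usually land in the same cell of width w; distant ones rarely do.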
Generalization: P-Stable distribution
• Lp, 0 < p ≤ 2: Generalized Central Limit Theorem → p-stable distribution
  (e.g. Cauchy for L1)
• L2: Central Limit Theorem → Gaussian (normal) distribution
P-Stable summary
• Works for r-Nearest Neighbor; generalizes to 0 < p ≤ 2
• Improves query time: from O(d·n^(1/(1+ε))·log n) to O(d·n^(1/(1+ε)²)·log n)
  (latest results, reported by email by Alexander Andoni)
Parameters selection
• 90% probability ↔ best query-time performance
For Euclidean space:
• A single projection hits an ε-Nearest Neighbor with Pr = p1
• k projections hit an ε-Nearest Neighbor with Pr = p1^k
• L hashings fail to collide with Pr = (1 − p1^k)^L
• To ensure collision (e.g. with probability 1 − δ ≥ 90%):
  1 − (1 − p1^k)^L ≥ 1 − δ  ⟹  L ≥ log δ / log(1 − p1^k)
Accept neighbors, reject non-neighbors.
…Parameters selection
[plot: query time vs. k = candidates-verification time + candidates-extraction
time; larger k cheapens verification but raises extraction cost, so the
total is minimized at an intermediate k]
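The collision constraint above gives L directly (a small sketch; p1, k and δ are whatever your hash family and success target dictate):

```python
import math

def tables_needed(p1, k, delta):
    """Smallest L with (1 - p1**k)**L <= delta, i.e. an eps-near
    neighbor collides in at least one of the L tables with
    probability >= 1 - delta."""
    return math.ceil(math.log(delta) / math.log(1.0 - p1 ** k))

# e.g. p1 = 0.9, k = 10 bits per table, 90% success target (delta = 0.1)
```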
Pros & Cons
+ Better query time than spatial data structures
+ Scales well to higher dimensions and larger data size (sub-linear dependence)
+ Predictable running time
− Extra storage overhead
− Inefficient for data with distances concentrated around the average
− Works best for Hamming distance (although it can be generalized to Euclidean space)
− In secondary storage, linear scan is pretty much all we can do (for high dim.)
− Requires the radius r to be fixed in advance
From Piotr Indyk's slides
Conclusion
• …but at the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni: andoni@mit.edu
  – Test over your own data
    (C code, under Red Hat Linux)
LSH - Applications
• Searching video clips in databases ("Hierarchical Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression – vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short, whenever K-Nearest Neighbors (KNN) are needed
Motivation
• A variety of procedures in learning require KNN computation
• KNN search is a computational bottleneck
• LSH provides a fast approximate solution to the problem
• LSH requires hash function construction and parameter tuning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing; G. Shakhnarovich, P. Viola, and T. Darrell
• Finding sensitive hash functions
Mean Shift Based Clustering in High Dimensions: A Texture Classification Example;
B. Georgescu, I. Shimshoni, and P. Meer
• Tuning LSH parameters
• LSH data structure is used for algorithm speedups
Given an image x, what are the parameters θ in this image?
I.e., the angles of the joints, the orientation of the body, etc.
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G. Shakhnarovich, P. Viola, and T. Darrell
Ingredients
• Input: query image with unknown angles (parameters)
• Database of human poses with known angles
• Image feature extractor – edge detector
• Distance metric in feature space: dx
• Distance metric in angle space:
  dθ(θ1, θ2) = Σ_{i=1..m} (1 − cos(θ1,i − θ2,i))
Example based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query
Input: query → find its KNN in the database of examples → output: average angles of the KNN
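The example-based scheme reduces to a few lines (an illustrative sketch with a brute-force KNN and invented names; PSH replaces the exhaustive search in practice):

```python
import math

def estimate_angles(query_feat, database, k=3):
    """Average the known angle vectors of the k nearest database
    entries in feature space; `database` = [(features, angles)]."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    nearest = sorted(database, key=lambda e: dist(e[0], query_feat))[:k]
    m = len(nearest[0][1])
    return [sum(ang[i] for _, ang in nearest) / k for i in range(m)]
```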
The algorithm flow
Input query → features extraction → processed query → PSH (LSH) against
the database of examples → LWR (regression) → output: match
The image features
Image features are multi-scale edge histograms, computed over
sub-windows of the image (e.g. regions A, B) at several scales.
[Feature Extraction → PSH → LWR]
PSH: The basic assumption
There are two metric spaces here: the feature space (dx) and the
parameter space (dθ). We want similarity to be measured in the angle
space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
[figure: parameter space (angles) vs. feature space, query q]
Is this magic?
Parameter Sensitive Hashing (PSH)
The trick: estimate the performance of different hash functions on
examples, and select those sensitive to dθ. The hash functions are
applied in feature space, but the KNN are valid in angle space.
• Label pairs of examples with similar angles
• Define hash functions h on the feature space
• Predict the labeling of similar/non-similar examples by using h
• Compare the labelings
• If the labeling by h is good, accept h; else change h
PSH as a classification problem
A pair of examples (xi, xj) is labeled (with r = 0.25):
y_ij = +1  if dθ(θi, θj) ≤ r
y_ij = −1  if dθ(θi, θj) ≥ (1 + ε)·r
A binary hash function on features:
h_T(x) = +1 if φ(x) ≥ T, −1 otherwise
(φ is a single feature; T a threshold)
Predict the labels:
ŷ_h(xi, xj) = +1 if h_T(xi) = h_T(xj), −1 otherwise
Find the best T that predicts the true labeling within the probability
constraints: h_T will place both examples in the same bin, or separate them.
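Selecting parameter-sensitive threshold hashes can be sketched as a filter over the labeled pairs (names and the accuracy cutoff `min_acc` are invented for this sketch; the paper's actual selection enforces the probability constraints):

```python
def select_sensitive_hashes(pairs, candidates, min_acc=0.7):
    """Keep threshold hash functions h_T that agree with the pair labels.
    pairs: [(x_i, x_j, y)] with y = +1 for similar angles, -1 otherwise.
    candidates: [(feature_index, T)]."""
    chosen = []
    for phi, T in candidates:
        h = lambda x: 1 if x[phi] >= T else -1
        correct = sum((1 if h(xi) == h(xj) else -1) == y
                      for xi, xj, y in pairs)
        if correct / len(pairs) >= min_acc:
            chosen.append((phi, T))
    return chosen
```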
Local Weighted Regression (LWR)
• Given a query image, PSH returns its KNNs
• LWR uses the KNN to compute a weighted average of the estimated angles
  of the query:
  θ0 = argmin_θ Σ_{xi ∈ N(x)} dθ(θi, θ) · K(dx(xi, x))
  where the kernel K of the feature-space distance supplies the weights
Results
Synthetic data were generated:
• 13 angles: 1 for rotation of the torso, 12 for joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, face expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (L)
• Tested on 1,000 synthetic examples; PSH searched only 3.4% of the data per query
• Without selection, 40 bits and 1,000 hash tables would be needed
Recall: P1 is the probability of a positive hash, P2 the probability of a
bad hash, and B the maximal number of points in a bucket.
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 1.3% of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given: n spheres in Rd, centered at P = p1, …, pn, with radii r1, …, rn
• Goal: given a query q, preprocess the points in P to find a point pi
  whose sphere covers the query q (||q − pi|| ≤ ri)
Courtesy of Mohamad Hegaze
Motivation
• Clustering high dimensional data by using local density measurements (e.g. feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example
B. Georgescu, I. Shimshoni, and P. Meer
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
  1. Finding optimal LSH parameters
  2. Data-driven partitions into buckets
  3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
[figure: a window of radius "bandwidth" around a point is shifted toward the local mean]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
KNN in mean-shift: the bandwidth should be inversely proportional to the
density in the region: high density → small bandwidth, low density →
large bandwidth. Based on the kth nearest neighbor of the point, the
bandwidth is h_i = ||x_i − x_{i,k}||.
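One mean-shift trajectory with a flat kernel, in one dimension for brevity (an illustrative sketch; adaptive mean-shift would instead set h per point from its kth nearest neighbor):

```python
def mean_shift_point(x, data, h, iters=50):
    """Repeatedly move x to the mean of the data points within
    bandwidth h until it converges to a mode."""
    for _ in range(iters):
        window = [p for p in data if abs(p - x) <= h]
        if not window:
            break
        x_new = sum(window) / len(window)
        if abs(x_new - x) < 1e-9:   # converged to a mode
            break
        x = x_new
    return x
```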
Adaptive mean-shift vs non-adaptive
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths hs (spatial) and hr (color)
3. Apply filtering
3D
Mean-shift: A Robust Approach Towards Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Image segmentation algorithm
original segmented
filtered
Filtering: each pixel takes the value of its nearest mode
Mean-shift trajectories
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift: A Robust Approach Towards Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Segmentation examples
Mean-shift: A Robust Approach Towards Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries — implemented with LSH
• Statistical curse of dimensionality: sparseness of the data — handled by variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (dk, vk)
• For each point, check whether x_{dk} ≤ vk for each of the K pairs
• This partitions the data into cells
Choosing the optimal K and L
• For a query q, compute the smallest number of distances to points in its buckets
• Large K ⇒ a smaller number of points in a cell
• If L is too small, points might be missed; but if L is too big, extra points might be included
• As L increases, the union of cells C̄ containing the query grows: fewer
  neighbors are missed, but more candidates must be checked
• K and L determine the resolution of the data structure
Choosing optimal K and L
• Determine accurately the KNN distance (bandwidth) for m randomly-selected data points
• Choose an error threshold ε for the approximate distance
• The optimal K and L should satisfy the threshold:
  – For each K, estimate the approximation error
  – In one run over all L's, find the minimal L satisfying the constraint: L(K)
  – Minimize the running time t(K, L(K)), which attains a minimum at an intermediate K
[plots: approximation error for (K, L); L(K) for ε = 0.05; running time t[K, L(K)]]
Data driven partitions
• In the original LSH, cut values are random in the range of the data
• Suggestion: randomly select a point from the data and use one of its
  coordinates as the cut value
[figure: bucket distribution — uniform cuts vs. data-driven points]
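The two cut strategies side by side (a tiny sketch in one dimension; `rng` is any seeded `random.Random`):

```python
import random

def uniform_cut(lo, hi, rng):
    """Original LSH: cut value drawn uniformly from the data range."""
    return rng.uniform(lo, hi)

def data_driven_cut(coords, rng):
    """Data-driven variant: a coordinate of a randomly selected data
    point, so dense regions receive proportionally more cuts."""
    return rng.choice(coords)
```

The data-driven cuts equalize the number of points per bucket.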
Additional speedup
Assume that all points in C̄ will converge to the same mode (C̄ acts like
a type of aggregate), so the mean-shift iterations can be run once per
cell instead of once per point.
Speedup results
65,536 points; 1,638 points sampled; k = 100
Food for thought
Low dimension High dimension
A thought for food…
• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
  Intuitively, dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning itself requires KNN…
15:30 — cookies…
Summary
• LSH trades some accuracy for a large reduction in complexity
• Applications that involve massive data in high dimension require the LSH fast performance
• Extension of the LSH to different spaces (PSH)
• Learning the LSH parameters and hash functions for different applications
Conclusion
• …but at the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni: andoni@mit.edu
  – Test over your own data
    (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Space partition based algorithms
"Multidimensional Access Methods", Volker Gaede, O. Günther
Could be improved…
(r p1p2 )Sensitiveequiv Pr[I(p)=I(q)] is ldquohighrdquo if p is ldquocloserdquo to qequiv Pr[I(p)=I(q)] is ldquolowrdquo if p isrdquofarrdquo from q
r2=(1 + ) r1
P1P2
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
bullHamming space = 2N binary strings
bullHamming distance = changed digits
aka Signal distanceRichard Hamming
Hamming SpaceN
010100001111
010100001111
010010000011Distance = 4
bullHamming space
bullHamming distance
SUM(X1 XOR X2)
L1 to Hamming Space Embedding
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Hash function
Lj Hash function
p Hdrsquoisin
Gj(p)=p|Ij
j=1L k=3 digits
Bits sampling from p
Store p into bucket p|Ij 2k buckets101
11000000000 111111110000 111000000000 111111110001
Construction
1 2 L
p
Query
1 2 L
q
Alternative intuition random projections
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)Curse of dimensionality (dgt1020)bullEnchanting the curse
Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Curse of dimensionality
bullQuery time or spaceO(nd)bullDgt1020 worst than sequential scan
ndashFor most geometric distributionsbullTechniques specific to high dimensions are needed
bullProoved in theory and in practice by Barkol amp Rabani 2000 amp Beame-Vee 2002
O( min(nd nd) )Naive
Curse of dimensionalitySome intuition
2
22
23
2d
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse Enchanting the curse
Locality Sensitive Hashing Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hash function
Hash function
Hash function
Data_Item
Key
BinBucket
Hash function
X modulo 3
X=Number in the range 0n
02
Storage Address
Data structure
0
Usually we would like related Data-items to be stored at the same bin
Recall r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Locality sensitive hashing
r(1 + ) r
(r p1p2 )Sensitiveequiv Pr[I(p)=I(q)] is ldquohighrdquo if p is ldquocloserdquo to qequiv Pr[I(p)=I(q)] is ldquolowrdquo if p isrdquofarrdquo from q
r2=(1 + ) r1
P1P2
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
bullHamming space = 2N binary strings
bullHamming distance = changed digits
aka Signal distanceRichard Hamming
Hamming SpaceN
010100001111
010100001111
010010000011Distance = 4
bullHamming space
bullHamming distance
SUM(X1 XOR X2)
L1 to Hamming Space Embedding
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Hash function
For j = 1..L, Gj(p) = p|Ij: a sampling of k bits from p ∈ Hd' (here k = 3 digits, e.g. the key 101).
Store p into bucket p|Ij — one of 2^k buckets.
Construction: insert every point p into its bucket in each of the L tables (1, 2, …, L).
Query: look up q's bucket in each of the L tables and examine the candidates found there.
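The construction and query steps can be sketched as follows (a minimal, illustrative implementation; the function names and toy strings are mine, not the slides'):

```python
import random
from collections import defaultdict

def build_tables(points, dprime, k, L, seed=0):
    """Bit-sampling LSH for Hamming space: L tables; table j stores each
    bit-string p of length dprime under the key g_j(p) = p|I_j, i.e. the
    k bits of p at the randomly chosen index set I_j."""
    rng = random.Random(seed)
    tables = []
    for _ in range(L):
        I_j = rng.sample(range(dprime), k)
        buckets = defaultdict(list)
        for p in points:
            buckets["".join(p[i] for i in I_j)].append(p)
        tables.append((I_j, buckets))
    return tables

def query(tables, q):
    """Gather every point that collides with q in at least one of the L tables."""
    candidates = set()
    for I_j, buckets in tables:
        candidates.update(buckets.get("".join(q[i] for i in I_j), []))
    return candidates

points = ["11110000", "11100000", "00001111"]
tables = build_tables(points, dprime=8, k=3, L=10)
print(sorted(query(tables, "11110000")))
```

A query point always collides with itself in every table; near points collide with high probability, far points rarely.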
Alternative intuition: random projections
In the unary embedding (p = (8, 2), C = 11 → 1111111100011000000000, d' = C·d), each sampled bit asks whether a coordinate exceeds some threshold, so k samplings act like k random axis-parallel cuts. The resulting k-bit key (e.g. 101) selects one of the 2^3 buckets: 000, 100, 110, 001, 101, 111, …
Repeating L times.
Secondary hashing: the 2^k buckets (e.g. key 011, bucket size B) are mapped by a simple hash into M buckets, tuning storage volume to dataset size: M·B = αn, with α = 2.
The above hashing is locality-sensitive:
• Probability(p, q in same bucket) = (1 − Distance(p, q) / dimensions)^k
(Plots: probability Pr vs Distance(q, pi) for k = 1 and k = 2 — a larger k makes the collision probability fall faster with distance.)
Adapted from Piotr Indyk's slides.
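The collision probability above is easy to tabulate; a short illustrative sketch:

```python
def collision_prob(dist, dims, k):
    """Pr[p and q share a bucket] for one k-bit sampling hash:
    each sampled bit agrees with probability (1 - dist/dims), independently."""
    return (1.0 - dist / dims) ** k

# Larger k makes the probability fall faster with distance, as in the k=1 vs k=2 plots.
for k in (1, 2):
    print([round(collision_prob(d, 100, k), 3) for d in (0, 25, 50, 75)])
```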
Preview
• General solution – Locality sensitive hashing
• Implementation for Hamming space
• Generalization to l2
Direct L2 solution
• A new hashing function
• Still based on sampling
• Uses a mathematical trick: a p-stable distribution for Lp distance — the Gaussian distribution for L2 distance
Central limit theorem
A weighted sum of Gaussians is a Gaussian:
v1, …, vn = real numbers; X1, …, Xn = independent, identically distributed (i.i.d.)
v1·X1 + v2·X2 + … + vn·Xn
Central limit theorem
For i.i.d. Gaussian Xi,
Σi vi·Xi  ~  ‖v‖2 · X,  with X ~ N(0, 1)
(the dot product behaves like the norm), and for two feature vectors u and v,
Σi ui·Xi − Σi vi·Xi = Σi (ui − vi)·Xi  ~  ‖u − v‖2 · X
(the difference of dot products behaves like the distance): projecting onto a random Gaussian vector turns L2 distances into one-dimensional Gaussian spreads.
The full Hashing
h_{a,b}(v) = ⌊ (a · v + b) / w ⌋
• v — the features vector (d entries)
• a — d random numbers, drawn i.i.d. from a p-stable distribution
• b — a random phase in [0, w]
• w — the discretization step
Worked example: with a · v = 79.44, b = 34 and w = 100, the shifted projection is rounded down onto the discretization grid (ticks 78.00, 79.00, 80.00, 81.00, 82.00 in the figure) to pick the bucket.
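A sketch of this hash family in Python (illustrative; `make_hash` and the parameter values are mine):

```python
import math
import random

def make_hash(d, w, seed=0):
    """One L2 (2-stable) LSH function h_{a,b}(v) = floor((a . v + b) / w):
    a has d i.i.d. N(0,1) entries, b is a uniform phase in [0, w),
    and w is the discretization step."""
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(d)]
    b = rng.uniform(0.0, w)
    def h(v):
        return math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
    return h

h = make_hash(d=3, w=100.0)
print(h([3.4, 8.2, 2.1]))  # an integer bucket; close vectors usually share it
```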
Generalization: P-stable distributions
• Lp, 0 < p ≤ 2: Generalized Central Limit Theorem → a p-stable distribution (e.g. the Cauchy distribution for L1)
• L2: Central Limit Theorem → the Gaussian (normal) distribution
P-stable summary
• Generalizes to 0 < p ≤ 2
• Improves the r-Nearest Neighbor query time: from O( d·n^{1/(1+ε)}·log n ) to O( d·n^{1/(1+ε)²}·log n )
(Latest results reported by email by Alexander Andoni.)
Parameters selection
For Euclidean space: aim for ≥ 90% success probability at the best query-time performance.
Parameters selection…
For Euclidean space:
• A single projection hits an r-Nearest Neighbor with Pr = p1
• k projections hit an r-Nearest Neighbor with Pr = p1^k
• All L hashings fail to collide with Pr = (1 − p1^k)^L
• To ensure a collision (e.g. with 1 − δ ≥ 90%):
  1 − (1 − p1^k)^L ≥ 1 − δ   ⇒   L ≥ log(δ) / log(1 − p1^k)
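The bound above pins down L given p1, k and δ; a quick illustrative sketch:

```python
import math

def tables_needed(p1, k, delta):
    """Smallest integer L with 1 - (1 - p1**k)**L >= 1 - delta,
    i.e. L = ceil( log(delta) / log(1 - p1**k) )."""
    return math.ceil(math.log(delta) / math.log(1.0 - p1 ** k))

# e.g. p1 = 0.5, k = 3, 90% success probability (delta = 0.1):
print(tables_needed(0.5, 3, 0.1))  # 18
```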
…Parameters selection
(Figure: query time vs k — candidate-extraction time grows with k while candidate-verification time shrinks; the hash should accept neighbors and reject non-neighbors, so k is chosen at the minimum of the total time.)
Pros & Cons
Pros:
• Better query time than spatial data structures
• Scales well to higher dimensions and larger data sizes (sub-linear dependence)
• Predictable running time
Cons:
• Extra storage overhead
• Inefficient for data with distances concentrated around the average
• Works best for Hamming distance (although it can be generalized to Euclidean space)
• In secondary storage, a linear scan is pretty much all we can do (for high dimensions)
• Requires the radius r to be fixed in advance
From Piotr Indyk's slides.
Conclusion
• …but at the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni (andoni@mit.edu)
  – Test it over your own data
  (C code, under Red Hat Linux)
LSH – Applications
• Searching video clips in databases ("Hierarchical, Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression – vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short: whenever K-Nearest Neighbors (KNN) are needed
Motivation
• A variety of procedures in learning require KNN computation
• KNN search is a computational bottleneck
• LSH provides a fast approximate solution to the problem
• LSH requires hash-function construction and parameter tuning
Outline
• Fast Pose Estimation with Parameter Sensitive Hashing (G. Shakhnarovich, P. Viola, T. Darrell) — finding sensitive hash functions
• Mean Shift Based Clustering in High Dimensions: A Texture Classification Example (B. Georgescu, I. Shimshoni, P. Meer) — tuning LSH parameters; the LSH data structure is used for algorithm speedups
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing (G. Shakhnarovich, P. Viola, T. Darrell)
Given an image x, what are the parameters θ in this image — i.e. the angles of the joints, the orientation of the body, etc.?
Ingredients
• Input: a query image with unknown angles (parameters)
• A database of human poses with known angles
• An image feature extractor – an edge detector
• A distance metric in feature space: d_x
• A distance metric in angle space:
  d_θ(θ1, θ2) = Σ_{i=1..m} (1 − cos(θ1,i − θ2,i))
Example-based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query
Input: query → find its KNN in the database of examples → output: the average angles of the KNN
The algorithm flow
Input query → feature extraction → processed query → PSH (LSH) against the database of examples → LWR (regression) → output: match
The image features
Image features are multi-scale edge histograms.
[Feature Extraction → PSH → LWR]
PSH: the basic assumption
There are two metric spaces here: the feature space (d_x) and the parameter space (d_θ). We want similarity to be measured in the angle space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space.
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
(Figure: the query q in parameter space (angles) vs feature space.)
Is this Magic?
Parameter Sensitive Hashing (PSH)
The trick: estimate the performance of different hash functions on examples, and select those that are sensitive to d_θ. The hash functions are applied in feature space, but the KNN are valid in angle space.
1. Label pairs of examples with similar angles
2. Define hash functions h on the feature space
3. Predict the labeling of similar/non-similar examples by using h
4. Compare the labelings
5. If the labeling by h is good, accept h; else change h
PSH as a classification problem
A pair of examples (x_i, θ_i), (x_j, θ_j) is labeled (with r = 0.25):
y_ij = +1  if d_θ(θ_i, θ_j) ≤ r
y_ij = −1  if d_θ(θ_i, θ_j) ≥ (1 + ε) r
A binary hash function on the features:
h_T(x) = +1 if the feature value of x is above the threshold T, −1 otherwise
Predict the labels:
ŷ_ij(h) = +1 if h_T(x_i) = h_T(x_j), −1 otherwise
Find the best threshold T that predicts the true labeling within the probability constraints: h_T will either place both examples of a pair in the same bin, or separate them.
Local Weighted Regression (LWR)
• Given a query image x, PSH returns its KNNs
• LWR uses the KNN to compute a weighted average of the estimated angles of the query:
  β̂ = argmin_β Σ_{x_i ∈ N(x)} d_θ(g(x_i, β), θ_i) · K(d_x(x_i, x))
  where the kernel K turns each neighbor's feature-space distance into a weight.
Results
Synthetic data were generated:
• 13 angles: 1 for the rotation of the torso, 12 for the joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, facial expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (L)
• Tested on 1,000 synthetic examples; PSH searched only 3.4% of the data per query
• Without feature selection, 40 bits and 1,000 hash tables would have been needed
Recall: P1 is the probability of a positive hash, P2 the probability of a bad hash, and B the maximum number of points in a bucket.
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 1.3% of the data were searched
Interesting mismatches
Fast pose estimation – summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given n spheres in R^d, centered at P = {p1, …, pn}, with radii r1, …, rn
• Goal: preprocess the points in P so that, given a query q, we can find a point pi whose sphere 'covers' the query q
(Courtesy of Mohamad Hegaze.)
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example
B. Georgescu, I. Shimshoni, and P. Meer
Motivation
• Clustering high-dimensional data by using local density measurements (e.g. in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
  1. Finding optimal LSH parameters
  2. Data-driven partitions into buckets
  3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
Iteratively shift each point to the weighted mean of the data points inside its bandwidth window; points converge to the modes of the density.
[Mean-shift | LSH: optimal k,l | LSH: data partition | LSH data struct]
KNN in mean-shift
The bandwidth should be inversely proportional to the density in the region: high density → small bandwidth, low density → large bandwidth. It is based on the kth nearest neighbor of the point: the bandwidth h_i is the distance from x_i to its kth neighbor.
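The kth-neighbor bandwidth rule can be sketched directly (brute force here; the point of the paper is that this KNN query is exactly what LSH accelerates):

```python
def adaptive_bandwidths(points, k):
    """h_i = distance from x_i to its k-th nearest neighbor: small in dense
    regions, large in sparse ones (brute-force O(n^2), for illustration only)."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
    bands = []
    for i, p in enumerate(points):
        ds = sorted(dist(p, q) for j, q in enumerate(points) if j != i)
        bands.append(ds[k - 1])
    return bands

pts = [(0, 0), (1, 0), (0, 1), (10, 10)]
print(adaptive_bandwidths(pts, k=1))  # the isolated point gets a much larger bandwidth
```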
Adaptive mean-shift vs non-adaptive
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths hs (spatial) and hr (color)
3. Apply filtering: each pixel takes the value of its nearest mode
(Figure: original → filtered → segmented, in a 3D feature space.)
Mean-shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Mean-shift trajectories (figure).
Filtering examples: original squirrel → filtered; original baboon → filtered.
Segmentation examples.
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries — implemented with LSH
• Statistical curse of dimensionality: sparseness of the data — handled with a variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (d_k, v_k)
• For each point, check whether x_{d_k} ≤ v_k for each of the K pairs
• This partitions the data into cells
Choosing the optimal K and L
• For a query q, we want to compute the smallest number of distances to points in its buckets
• If L is too small, points might be missed; but if L is too big, extra points might be included
• A large K means a smaller number of points in a cell
• As L increases, the union of cells C̄ increases but the intersection cell C decreases; this determines the resolution of the data structure
Choosing the optimal K and L
• Determine accurately the KNN (and hence the bandwidth distance) for m randomly selected data points
• Choose an error threshold ε for the approximate distance
• The optimal K and L should satisfy this error constraint
• For each K, estimate the error; in one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K))
(Figures: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)] and its minimum.)
Data-driven partitions
• In the original LSH, cut values are drawn uniformly at random over the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
(Figure: bucket distribution — uniform cuts vs data-driven cuts.)
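The data-driven cut rule can be sketched like this (illustrative; the function names are mine):

```python
import random

def data_driven_cuts(points, K, seed=0):
    """Each of the K cuts is a pair (d_k, v_k): a random dimension and, instead
    of a uniform draw over the data range, a coordinate of a randomly chosen
    data point -- so the buckets follow the data density."""
    rng = random.Random(seed)
    dim = len(points[0])
    cuts = []
    for _ in range(K):
        d = rng.randrange(dim)
        v = rng.choice(points)[d]
        cuts.append((d, v))
    return cuts

def bucket_key(point, cuts):
    """A point's cell: the K-bit pattern of which side of each cut it falls on."""
    return tuple(point[d] <= v for d, v in cuts)

pts = [(0.0, 1.0), (2.0, 3.0), (4.0, 5.0)]
cuts = data_driven_cuts(pts, K=4)
print(bucket_key((0.5, 2.5), cuts))
```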
Additional speedup
Assume that all the points in the intersection cell C converge to the same mode (C acts like a type of aggregate), so one mean-shift run can serve all of them.
Speedup results
65,536 points; 1,638 points sampled; k = 100.
Food for thought
(Low dimension vs high dimension.)
• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning itself requires KNN
Summary
• LSH trades accuracy for a gain in complexity
• Applications that involve massive data in high dimensions require LSH's fast performance
• LSH extends to different spaces (PSH)
• The LSH parameters and hash functions can be learned for different applications
Conclusion
• …but at the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni (andoni@mit.edu)
  – Test it over your own data
  (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Curse of dimensionality
bullQuery time or spaceO(nd)bullDgt1020 worst than sequential scan
ndashFor most geometric distributionsbullTechniques specific to high dimensions are needed
bullProoved in theory and in practice by Barkol amp Rabani 2000 amp Beame-Vee 2002
O( min(nd nd) )Naive
Curse of dimensionalitySome intuition
2
22
23
2d
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse Enchanting the curse
Locality Sensitive Hashing Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hash function
Hash function
Hash function
Data_Item
Key
BinBucket
Hash function
X modulo 3
X=Number in the range 0n
02
Storage Address
Data structure
0
Usually we would like related Data-items to be stored at the same bin
Recall r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Locality sensitive hashing
r(1 + ) r
(r p1p2 )Sensitiveequiv Pr[I(p)=I(q)] is ldquohighrdquo if p is ldquocloserdquo to qequiv Pr[I(p)=I(q)] is ldquolowrdquo if p isrdquofarrdquo from q
r2=(1 + ) r1
P1P2
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
bullHamming space = 2N binary strings
bullHamming distance = changed digits
aka Signal distanceRichard Hamming
Hamming SpaceN
010100001111
010100001111
010010000011Distance = 4
bullHamming space
bullHamming distance
SUM(X1 XOR X2)
L1 to Hamming Space Embedding
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Hash function
Lj Hash function
p Hdrsquoisin
Gj(p)=p|Ij
j=1L k=3 digits
Bits sampling from p
Store p into bucket p|Ij 2k buckets101
11000000000 111111110000 111000000000 111111110001
Construction
1 2 L
p
Query
1 2 L
q
Alternative intuition random projections
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Curse of dimensionalitySome intuition
2
22
23
2d
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse Enchanting the curse
Locality Sensitive Hashing Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hash function
Hash function
Hash function
Data_Item
Key
BinBucket
Hash function
X modulo 3
X=Number in the range 0n
02
Storage Address
Data structure
0
Usually we would like related Data-items to be stored at the same bin
Recall r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Locality sensitive hashing
r(1 + ) r
(r p1p2 )Sensitiveequiv Pr[I(p)=I(q)] is ldquohighrdquo if p is ldquocloserdquo to qequiv Pr[I(p)=I(q)] is ldquolowrdquo if p isrdquofarrdquo from q
r2=(1 + ) r1
P1P2
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
• Hamming space = the set of 2^N binary strings of length N.
• Hamming distance = the number of changed digits, a.k.a. signal distance (Richard Hamming).
Example: 010100001111 vs. 010010000011 → distance = 4.
• Hamming distance = SUM(X1 XOR X2).
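In code, the XOR formulation is one line (Python sketch):

```python
# Hamming distance = popcount of the XOR, i.e. SUM(X1 XOR X2).
def hamming(x1: str, x2: str) -> int:
    assert len(x1) == len(x2)
    return sum(a != b for a, b in zip(x1, x2))

# On integers the same thing is bin(a ^ b).count("1").
print(hamming("010100001111", "010010000011"))  # 4
```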
L1 to Hamming Space Embedding
A point p = (8, 2), with coordinates bounded by C = 11, is embedded by writing each coordinate in unary: 8 → 11111111000, 2 → 11000000000, concatenated to 1111111100011000000000. The embedded dimension is d′ = C·d, and L1 distances become Hamming distances.
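The embedding in code (a short sketch; the helper names and the second example point are ours):

```python
# Unary (thermometer) embedding: a coordinate v in 0..C becomes
# v ones followed by C - v zeros; L1 distance turns into Hamming distance.
C = 11

def embed(point):
    return "".join("1" * v + "0" * (C - v) for v in point)

def hamming(s, t):
    return sum(a != b for a, b in zip(s, t))

p, q = (8, 2), (5, 4)
print(embed(p))  # '1111111100011000000000'
l1 = sum(abs(a - b) for a, b in zip(p, q))
print(l1 == hamming(embed(p), embed(q)))  # the two distances agree
```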
Hash function
For j = 1…L, define G_j(p) = p|I_j , where p ∈ H^d′ and I_j is a random set of k bit positions (here k = 3 digits) – bit sampling from p. Store p in the bucket indexed by p|I_j (e.g. 101); there are 2^k buckets.
Construction: insert every point p into its bucket in each of the tables j = 1, 2, …, L.
Query: for q, look up the buckets G_1(q), …, G_L(q) and check only the candidates found there.
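The construction and query steps can be sketched together (a minimal bit-sampling LSH; the class name and parameter defaults are ours):

```python
import random

class HammingLSH:
    """Bit-sampling LSH over d-bit strings: L tables, each keyed on k positions."""
    def __init__(self, d, k=3, L=4, seed=0):
        rng = random.Random(seed)
        self.index_sets = [rng.sample(range(d), k) for _ in range(L)]
        self.tables = [{} for _ in range(L)]

    def _key(self, p, j):
        return "".join(p[i] for i in self.index_sets[j])

    def insert(self, p):
        for j, table in enumerate(self.tables):
            table.setdefault(self._key(p, j), []).append(p)

    def candidates(self, q):
        out = set()
        for j, table in enumerate(self.tables):
            out.update(table.get(self._key(q, j), []))
        return out

lsh = HammingLSH(d=12)
for s in ("010100001111", "010010000011", "111111000000"):
    lsh.insert(s)
print(sorted(lsh.candidates("010100001111")))
```

The candidates are then verified by computing their true Hamming distances to q.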
Alternative intuition: random projections
Each sampled bit of the unary embedding of p = (8, 2) (C = 11, d′ = C·d) acts as a random axis-parallel cut of the space. The k = 3 sampled bits partition the space into 2³ buckets (000, 100, 110, 001, 101, 111, …), and p lands in the bucket matching its k sampled bits (e.g. 101).
Repeating L times; secondary hashing
The k samplings are repeated L times. A secondary (standard) hash then maps the 2^k logical buckets into M physical buckets of size B, supporting tuning of storage volume vs. dataset size: simple hashing with M·B = α·n, α = 2.
The above hashing is locality-sensitive
Probability that p and q fall into the same bucket:
Pr = (1 − Distance(q, p)/dimensions)^k
The collision probability decays with distance, and more sharply for larger k (compare the curves for k = 1 and k = 2). Adopted from Piotr Indyk's slides.
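The curve is easy to tabulate (a sketch under the with-replacement sampling assumption; the example dimension is ours):

```python
# Each of the k sampled positions agrees with probability 1 - dist/d,
# so Pr[p and q share a bucket] = (1 - dist/d) ** k.
def collision_prob(dist, d, k):
    return (1.0 - dist / d) ** k

d = 22  # dimension of the embedded Hamming space
for k in (1, 2):
    print(k, [round(collision_prob(dist, d, k), 3) for dist in (0, 5, 11, 22)])
```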
Preview
• General solution – Locality sensitive hashing
• Implementation for Hamming space
• Generalization to l2
Direct L2 solution
• New hashing function.
• Still based on sampling.
• Using a mathematical trick: a p-stable distribution for the Lp distance – the Gaussian distribution for the L2 distance.
Central limit theorem
v1·X1 + v2·X2 + … + vn·Xn, where v1, …, vn are real numbers and X1, …, Xn are independent, identically distributed (i.i.d.) Gaussians: a weighted sum of Gaussians is again a Gaussian.
For the Gaussian (2-stable) case, the dot product turns into the norm:
Σ_i v_i·X_i ∼ ‖v‖_2 · X
Applied to two feature vectors u and v, the difference of dot products is distributed like their L2 distance:
Σ_i u_i·X_i − Σ_i v_i·X_i = Σ_i (u_i − v_i)·X_i ∼ ‖u − v‖_2 · X
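The 2-stable property is easy to verify empirically (a sketch; the sample size and test vector are arbitrary):

```python
import math, random

# Empirical check of 2-stability: sum_i v_i * X_i with X_i ~ N(0, 1)
# is Gaussian with standard deviation ||v||_2.
random.seed(0)
v = [3.0, 4.0]                     # ||v||_2 = 5
samples = [sum(vi * random.gauss(0, 1) for vi in v) for _ in range(20000)]
mean = sum(samples) / len(samples)
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
print(round(mean, 2), round(std, 2))  # mean near 0, std near 5
```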
The full Hashing
h_{a,b}(v) = ⌊(a·v + b) / w⌋
• v – the features vector, e.g. [34, 82, 21].
• a – d random numbers, i.i.d. from a p-stable distribution.
• b – a random phase in [0, w].
• w – the discretization step.
Example: with a·v = 7944, b = 34 and w = 100, the hashed value falls in the cell covering [7900, 8000).
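A minimal sketch of such a p-stable hash for L2 (Python; w, the seed and the dimensionality are illustrative choices, not values from the slides):

```python
import math, random

class L2Hash:
    """One p-stable LSH function for L2: h_{a,b}(v) = floor((a.v + b) / w)."""
    def __init__(self, d, w=4.0, seed=0):
        rng = random.Random(seed)
        self.a = [rng.gauss(0, 1) for _ in range(d)]  # 2-stable: i.i.d. Gaussian
        self.b = rng.uniform(0, w)                    # random phase in [0, w]
        self.w = w                                    # discretization step

    def __call__(self, v):
        dot = sum(ai * vi for ai, vi in zip(self.a, v))
        return math.floor((dot + self.b) / self.w)

h = L2Hash(d=3)
v = [34.0, 82.0, 21.0]
print(h(v))  # the bucket index of the features vector
```

In practice k such functions are concatenated into one key and L keys are maintained, exactly as in the Hamming scheme.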
Generalization: p-stable distributions
• L2: by the Central Limit Theorem, the Gaussian (normal) distribution is 2-stable.
• Lp, 0 < p ≤ 2: by the Generalized Central Limit Theorem, a p-stable distribution exists (e.g. the Cauchy distribution for L1).
P-stable summary
• Works for the r-nearest-neighbor problem and generalizes to 0 < p ≤ 2.
• Improves the query time from O(d·n^(1/(1+ε))·log n) to O(d·n^(1/(1+ε)²)·log n) (latest results, reported by e-mail by Alexander Andoni).
Parameters selection
For Euclidean space, choose parameters for ~90% success probability at the best query-time performance:
• A single projection hits an ε-nearest neighbor with Pr = p1.
• k projections hit it with Pr = p1^k.
• All L hashings fail to collide with Pr = (1 − p1^k)^L.
• To ensure a collision (e.g. with probability 1 − δ ≥ 90%): 1 − (1 − p1^k)^L ≥ 1 − δ, hence L ≥ log(δ) / log(1 − p1^k).
Parameters selection (cont.)
A good k both rejects non-neighbors and accepts neighbors. As k grows, the candidate-extraction time drops while the candidate-verification time grows; the total query time is minimized at an intermediate k.
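The bound on L can be evaluated directly (sketch; the p1 value is an arbitrary example, not from the slides):

```python
import math

# Smallest L with 1 - (1 - p1**k)**L >= 1 - delta,
# i.e. L >= log(delta) / log(1 - p1**k).
def tables_needed(p1, k, delta=0.1):
    return math.ceil(math.log(delta) / math.log(1.0 - p1 ** k))

print(tables_needed(p1=0.8, k=10, delta=0.1))
```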
Pros & Cons
Pros:
• Better query time than spatial data structures.
• Scales well to higher dimensions and larger data sizes (sub-linear dependence).
• Predictable running time.
Cons:
• Extra storage overhead.
• Inefficient for data with distances concentrated around the average.
• Works best for Hamming distance (although it can be generalized to Euclidean space).
• In secondary storage, a linear scan is pretty much all we can do (for high dimensions).
• Requires the radius r to be fixed in advance.
From Piotr Indyk's slides.
Conclusion
• … but in the end, everything depends on your data set.
• Try it at home:
– Visit http://web.mit.edu/andoni/www/LSH/index.html
– E-mail Alex Andoni (andoni@mit.edu)
– Test over your own data (C code, under Red Hat Linux).
LSH – Applications
• Searching video clips in databases ("Hierarchical Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun).
• Searching image databases (see the following).
• Image segmentation (see the following).
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani).
• Texture classification (see the following).
• Clustering (see the following).
• Embedding and manifold learning (LLE and many others).
• Compression – vector quantization.
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan).
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler).
• In short: whenever k-nearest neighbors (KNN) are needed.
Motivation
• A variety of procedures in learning require KNN computation.
• KNN search is a computational bottleneck.
• LSH provides a fast approximate solution to the problem.
• LSH requires hash-function construction and parameter tuning.
Outline
• "Fast Pose Estimation with Parameter Sensitive Hashing", G. Shakhnarovich, P. Viola and T. Darrell – finding sensitive hash functions.
• "Mean Shift Based Clustering in High Dimensions: A Texture Classification Example", B. Georgescu, I. Shimshoni and P. Meer – tuning the LSH parameters; the LSH data structure is used for algorithm speedups.
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing (G. Shakhnarovich, P. Viola and T. Darrell)
Given an image x, what are the parameters θ in this image, i.e. the angles of the joints, the orientation of the body, etc.?
Ingredients
• Input: query image with unknown angles (parameters).
• Database of human poses with known angles.
• Image feature extractor – edge detector.
• Distance metric in feature space, d_x.
• Distance metric in angles space:
d_θ(θ1, θ2) = Σ_{i=1..m} (1 − cos(θ1,i − θ2,i))
Example based learning
• Construct a database of example images with their known angles.
• Given a query image, run your favorite feature extractor.
• Compute the KNN from the database.
• Use these KNNs to compute the average angles of the query.
Input: query → find the KNN in the database of examples → output: average angles of the KNN.
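The recipe above in toy form (the features, database and naive angle averaging are stand-ins, not the paper's):

```python
# Toy example-based pose estimation: the angles of a query are estimated
# as the average angles of its K nearest neighbors in feature space.
def knn_average_angles(query_feat, database, k=3):
    # database: list of (feature_vector, angle) pairs
    def dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5
    nearest = sorted(database, key=lambda ex: dist(ex[0], query_feat))[:k]
    return sum(angle for _, angle in nearest) / k

db = [((0.0, 0.0), 10.0), ((0.1, 0.0), 12.0), ((0.0, 0.2), 14.0), ((5.0, 5.0), 90.0)]
print(knn_average_angles((0.05, 0.05), db, k=3))  # 12.0
```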
The algorithm flow
Input query → features extraction → processed query → PSH (LSH), matched against the database of examples → LWR (regression) → output: match.
The image features
Image features are multi-scale edge histograms, computed over image sub-windows (e.g. regions A and B) at several scales. [Feature Extraction → PSH → LWR]
PSH: The basic assumption
There are two metric spaces here: the feature space (d_x) and the parameter space (d_θ). We want similarity to be measured in the angles space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space.
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space.
• But the global structure may be complicated: curved.
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
The parameters space (angles) and the feature space of a query q are linked by such a manifold. Is this magic?
Parameter Sensitive Hashing (PSH)
The trick: estimate the performance of different hash functions on examples, and select those sensitive to d_θ. The hash functions are applied in feature space, but the KNN are valid in angle space.
Training loop:
1. Label pairs of examples with similar angles.
2. Define hash functions h on the feature space.
3. Predict the labeling of similar/non-similar examples by using h.
4. Compare the labelings: if the labeling by h is good, accept h, else change h.
PSH as a classification problem
A pair of examples (x_i, x_j) is labeled (with r = 0.25):
• y_ij = +1 if d_θ(θ_i, θ_j) ≤ r
• y_ij = −1 if d_θ(θ_i, θ_j) > (1 + ε)·r
A binary hash function on features (a decision stump on one feature value):
• h_T(x) = +1 if the selected feature of x is ≥ T, −1 otherwise.
Predict the labels:
• ŷ_ij = +1 if h_T(x_i) = h_T(x_j), −1 otherwise.
Find the best T that predicts the true labeling, subject to the probability constraints: h_T will place both examples in the same bin, or separate them.
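Threshold selection can be sketched as a search over T scored by agreement with the pair labels (toy data; the paper instead enforces probability constraints on true/false collisions):

```python
# Toy selection of a decision-stump hash h_T: pick the threshold T whose
# same-bin / different-bin predictions best agree with the pair labels y_ij.
def stump(value, T):
    return 1 if value >= T else -1

def agreement(pairs, labels, T):
    # pairs: (feature_i, feature_j); labels: +1 similar, -1 dissimilar
    hits = 0
    for (fi, fj), y in zip(pairs, labels):
        y_hat = 1 if stump(fi, T) == stump(fj, T) else -1
        hits += (y_hat == y)
    return hits / len(labels)

pairs = [(0.1, 0.2), (0.8, 0.9), (0.1, 0.9), (0.2, 0.8)]
labels = [1, 1, -1, -1]
best_T = max([0.05, 0.5, 0.95], key=lambda T: agreement(pairs, labels, T))
print(best_T)  # the cut that separates dissimilar pairs, keeps similar ones together
```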
Local Weighted Regression (LWR)
• Given a query image, PSH returns its KNNs.
• LWR uses the KNN to compute a weighted average of the estimated angles of the query: roughly,
θ0 = argmin_g Σ_{x_i ∈ N(x)} d_θ(g(x_i), θ_i) · K(d_x(x_i, x))
with the kernel K assigning each neighbor a weight that decays with its feature-space distance to the query.
Results
Synthetic data were generated:
• 13 angles: 1 for rotation of the torso, 12 for joints.
• 150,000 images.
• Nuisance parameters added: clothing, illumination, face expression.
• 1,775,000 example pairs.
• Selected 137 out of 5,123 meaningful features (how?).
• 18-bit hash functions (k), 150 hash tables (L).
• Test on 1,000 synthetic examples; PSH searched only 34 of the data per query.
• Without selection, 40 bits and 1,000 hash tables would be needed.
Recall: P1 is the probability of a positive hash, P2 the probability of a bad hash, and B the maximum number of points in a bucket.
Results – real data
• 800 images.
• Processed by a segmentation algorithm.
• 13 of the data were searched.
Interesting mismatches occur.
Fast pose estimation – summary
• A fast way to compute the angles of a human body figure.
• Moving from one representation space to another.
• Training a sensitive hash function.
• KNN smart averaging.
Food for Thought
• The basic assumption may be problematic (distance metric, representations).
• The training set should be dense.
• Texture and clutter.
• In general, some features are more important than others and should be weighted.
Food for Thought: Point Location in Different Spheres (PLDS)
• Given n spheres in R^d, centered at P = {p1, …, pn}, with radii r1, …, rn.
• Goal: given a query q, preprocess the points in P to find a point p_i whose sphere 'covers' the query q, i.e. ‖q − p_i‖ ≤ r_i.
Courtesy of Mohamad Hegaze.
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example (B. Georgescu, I. Shimshoni and P. Meer)
Motivation
• Clustering high-dimensional data by using local density measurements (e.g. in feature space).
• Statistical curse of dimensionality: sparseness of the data.
• Computational curse of dimensionality: expensive range queries.
• LSH parameters should be adjusted for optimal performance.
Outline
• Mean-shift in a nutshell + examples.
Our scope:
• Mean-shift in high dimensions – using LSH.
• Speedups: 1. finding optimal LSH parameters; 2. data-driven partitions into buckets; 3. additional speedup by using the LSH data structure.
Mean-Shift in a Nutshell
Repeatedly shift each point to the (weighted) mean of the data points inside its bandwidth window, until it converges to a mode.
[Progress: Mean-shift → LSH optimal k,l → LSH data partition → LSH data struct]
KNN in mean-shift
The bandwidth should be inversely proportional to the density in the region: high density – small bandwidth; low density – large bandwidth. It is based on the kth nearest neighbor of the point: the bandwidth is the distance to that neighbor, h_i = ‖x_i − x_{i,k}‖.
Adaptive mean-shift vs. non-adaptive.
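A minimal flat-kernel mean-shift step for a single point (a 1-D toy sketch, ours, not the paper's adaptive version):

```python
# Flat-kernel mean shift in 1-D: move x to the mean of the data points
# within the bandwidth window, and repeat until the shift vanishes.
def mean_shift_point(x, data, bandwidth, tol=1e-6, max_iter=100):
    for _ in range(max_iter):
        window = [p for p in data if abs(p - x) <= bandwidth]
        if not window:
            return x  # no neighbors inside the window
        new_x = sum(window) / len(window)
        if abs(new_x - x) < tol:
            break
        x = new_x
    return x

data = [1.0, 1.1, 1.2, 5.0, 5.1]
print(mean_shift_point(0.9, data, bandwidth=1.0))  # converges to the left mode
```

In high dimensions, the expensive step is exactly the neighbor query inside the window, which is where LSH comes in.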
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y).
2. Resolution controlled by the bandwidths h_s (spatial) and h_r (color).
3. Apply filtering: each pixel takes the value of the nearest mode.
Original → filtered → segmented.
("Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02)
Mean-shift trajectories (figure: the iteration paths toward the modes).
Filtering examples: original squirrel → filtered; original baboon → filtered.
Segmentation examples.
("Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02)
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries – implemented with LSH.
• Statistical curse of dimensionality: sparseness of the data – variable bandwidth.
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (d_k, v_k).
• For each point x we check, over the K pairs, whether x_{d_k} ≤ v_k; the resulting K-bit vector indexes its cell.
• This partitions the data into cells.
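One possible reading of this structure in code (cut values drawn uniformly from the data range, as in the original LSH; the data-driven variant discussed later would instead copy a coordinate of a random data point):

```python
import random

# L random partitions; each holds K (coordinate, cut-value) pairs, and a
# point's cell in a partition is the K-bit string of tests x[d_k] <= v_k.
def make_partition(data, K, rng):
    dim = len(data[0])
    lo = [min(p[i] for p in data) for i in range(dim)]
    hi = [max(p[i] for p in data) for i in range(dim)]
    cuts = []
    for _ in range(K):
        i = rng.randrange(dim)
        cuts.append((i, rng.uniform(lo[i], hi[i])))  # cut within the data range
    return cuts

def cell_of(x, cuts):
    return "".join("1" if x[d] <= v else "0" for d, v in cuts)

rng = random.Random(0)
data = [(1.0, 2.0), (1.1, 2.1), (8.0, 9.0), (8.2, 9.1)]
partitions = [make_partition(data, K=4, rng=rng) for _ in range(3)]  # L = 3
cells = [cell_of((1.05, 2.05), cuts) for cuts in partitions]
print(cells)
```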
Choosing the optimal K and L
• For a query q, we want to compute the smallest number of distances to the points in its buckets.
• Large K → a smaller number of points in a cell.
• If L is too small, neighbor points might be missed; but if L is too big, the union of cells C∪ might include extra points.
• As L increases, C∪ increases but the chance of missing a neighbor decreases; K determines the resolution of the data structure.
Choosing optimal K and L (procedure)
• Determine accurately the KNN, and hence the distance (bandwidth), for m randomly-selected data points.
• Choose an error threshold ε.
• The optimal K and L should satisfy that the approximate distance stays within the threshold of the true one.
• For each K, estimate the error; in one run over all L's, find the minimal L satisfying the constraint, L(K); then minimize the running time t(K, L(K)).
(Plots: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)]; the chosen pair sits at the minimum.)
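The tuning loop reads roughly as follows in code (`err` and `time_model` are stand-ins for measurements on the m sampled points, not the paper's formulas):

```python
# Tuning sketch: for each K take the minimal L meeting the error constraint,
# then keep the (K, L(K)) pair with the lowest measured running time.
def tune(Ks, Ls, lsh_error, lsh_time, eps=0.05):
    best = None
    for K in Ks:
        L_K = next((L for L in Ls if lsh_error(K, L) <= eps), None)
        if L_K is None:
            continue  # no L satisfies the constraint for this K
        t = lsh_time(K, L_K)
        if best is None or t < best[2]:
            best = (K, L_K, t)
    return best

# Stand-in models: error falls with K*L, time grows with L and falls with K.
err = lambda K, L: 1.0 / (L * max(K, 1))
time_model = lambda K, L: L * 1.0 + 100.0 / K
print(tune(Ks=[5, 10, 20], Ls=list(range(1, 50)), lsh_error=err, lsh_time=time_model))
```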
Data driven partitions
• In the original LSH, cut values are random in the range of the data.
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value; this adapts the bucket distribution to the data (uniform vs. data-driven points).
Additional speedup
Assume that all points in C∪ will converge to the same mode (C∪ is like a type of an aggregate).
Speedup results: 65,536 points; 1,638 points sampled; k = 100.
Food for thought: low dimension vs. high dimension.
A thought for food…
• Choose K, L by sample learning, or take the traditional values.
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold? Intuitively, the dimensionality implies the number of hash functions needed.
• The catch: efficient dimensionality learning requires KNN.
15:30 – cookies…
Summary
• LSH suggests a compromise on accuracy for the gain of complexity.
• Applications that involve massive data in high dimension require the fast performance of LSH.
• Extensions of LSH to different spaces (PSH).
• Learning the LSH parameters and hash functions for different applications.
Conclusion
• … but in the end, everything depends on your data set.
• Try it at home:
– Visit http://web.mit.edu/andoni/www/LSH/index.html
– E-mail Alex Andoni (andoni@mit.edu)
– Test over your own data (C code, under Red Hat Linux).
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Outline
bullProblem definition and flavorsbullAlgorithms overview - low dimensions bullCurse of dimensionality (dgt1020)bullEnchanting the curse Enchanting the curse
Locality Sensitive Hashing Locality Sensitive Hashing (high dimension approximate solutions)
bulll2 extensionbullApplications (Dan)
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hash function
Hash function
Hash function
Data_Item
Key
BinBucket
Hash function
X modulo 3
X=Number in the range 0n
02
Storage Address
Data structure
0
Usually we would like related Data-items to be stored at the same bin
Recall r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Locality sensitive hashing
r(1 + ) r
(r p1p2 )Sensitiveequiv Pr[I(p)=I(q)] is ldquohighrdquo if p is ldquocloserdquo to qequiv Pr[I(p)=I(q)] is ldquolowrdquo if p isrdquofarrdquo from q
r2=(1 + ) r1
P1P2
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
bullHamming space = 2N binary strings
bullHamming distance = changed digits
aka Signal distanceRichard Hamming
Hamming SpaceN
010100001111
010100001111
010010000011Distance = 4
bullHamming space
bullHamming distance
SUM(X1 XOR X2)
L1 to Hamming Space Embedding
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Hash function
Lj Hash function
p Hdrsquoisin
Gj(p)=p|Ij
j=1L k=3 digits
Bits sampling from p
Store p into bucket p|Ij 2k buckets101
11000000000 111111110000 111000000000 111111110001
Construction
1 2 L
p
Query
1 2 L
q
Alternative intuition random projections
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hash function
Hash function
Hash function
Data_Item
Key
BinBucket
Hash function
X modulo 3
X=Number in the range 0n
02
Storage Address
Data structure
0
Usually we would like related Data-items to be stored at the same bin
Recall r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Locality sensitive hashing
r(1 + ) r
(r p1p2 )Sensitiveequiv Pr[I(p)=I(q)] is ldquohighrdquo if p is ldquocloserdquo to qequiv Pr[I(p)=I(q)] is ldquolowrdquo if p isrdquofarrdquo from q
r2=(1 + ) r1
P1P2
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
bullHamming space = 2N binary strings
bullHamming distance = changed digits
aka Signal distanceRichard Hamming
Hamming SpaceN
010100001111
010100001111
010010000011Distance = 4
bullHamming space
bullHamming distance
SUM(X1 XOR X2)
L1 to Hamming Space Embedding
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Hash function
p ∈ H^d'  (a d'-bit string)
Gj(p) = p|Ij : the bits of p sampled at index set Ij,  j = 1..L,  k = 3 digits
Store p into bucket p|Ij, one of 2^k buckets.
[Example: sampled key 101 for strings such as 11000000000, 111111110000, 111000000000, 111111110001]
Construction: each point p is stored in the L tables 1, 2, ..., L.
Query: q is looked up in the same L tables, and the colliding points are checked.
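The construction and query steps above can be sketched as follows (illustrative Python, not the authors' code; the tiny example strings and function names are assumptions):

```python
import random
from collections import defaultdict

def build_tables(points, d, k, L, seed=0):
    """Bit-sampling LSH: L tables; table j hashes a d-bit string p by the
    k bits at index set Ij, i.e. Gj(p) = p|Ij."""
    rng = random.Random(seed)
    index_sets = [rng.sample(range(d), k) for _ in range(L)]
    tables = [defaultdict(list) for _ in range(L)]
    for p in points:
        for I, table in zip(index_sets, tables):
            table["".join(p[i] for i in I)].append(p)
    return index_sets, tables

def query(q, index_sets, tables):
    """Union of candidates colliding with q in any of the L tables."""
    cands = set()
    for I, table in zip(index_sets, tables):
        cands.update(table["".join(q[i] for i in I)])
    return cands

points = ["11000000000", "11111111000", "11100000000"]
I, T = build_tables(points, d=11, k=3, L=4)
print(query("11000000001", I, T))  # candidates close to the query in Hamming distance
```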
Alternative intuition: random projections
With the unary embedding (p = (8, 2), C = 11, d' = C·d), each sampled bit corresponds to an axis-parallel cut of the space, so sampling k bits partitions the data into 2^k cells.
[Figure: strings such as 11000000000, 111111110000, 111000000000, 111111110001 land in 2^3 = 8 buckets labeled 000, 100, 110, 001, 101, 111, ...; here p falls in bucket 101]
The k samplings are repeated L times.
Secondary hashing
The 2^k buckets (e.g., bucket 011, of size B) are mapped by a simple hash into M buckets, which supports tuning dataset size vs. storage volume: M·B = α·n, with α = 2.
The above hashing is locality-sensitive
• Probability(p, q in same bucket) = [1 − Distance(p, q)/dimensions]^k
[Plots: Pr vs. Distance(q, pi) for k = 1 and k = 2 - larger k makes the collision probability fall off faster with distance]
Adopted from Piotr Indyk's slides
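The collision probability above is easy to evaluate directly (a small sketch; d' denotes the embedded dimension):

```python
def collision_prob(dist: int, d_prime: int, k: int) -> float:
    """Pr[p and q share a bucket] = (1 - dist/d')**k for k sampled bits."""
    return (1 - dist / d_prime) ** k

# The gap between close pairs (dist=2) and far pairs (dist=50) widens with k:
for k in (1, 2, 4):
    print(k, collision_prob(2, 100, k), collision_prob(50, 100, k))
```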
Preview
• General solution – Locality Sensitive Hashing
• Implementation for Hamming space
• Generalization to l2
Direct L2 solution
• New hashing function
• Still based on sampling
• Using a mathematical trick:
  p-stable distribution for Lp distance; Gaussian distribution for L2 distance
Central limit theorem
v1·X1 + v2·X2 + … + vn·Xn ~ Gaussian
(a weighted sum of Gaussians is a Gaussian)
where v1,…,vn are real numbers and X1,…,Xn are independent, identically distributed (i.i.d.) Gaussian variables.
Central limit theorem
Σi vi·Xi ~ ‖v‖2 · X      (dot product → norm)
Norm → Distance: for two feature vectors u, v
Σi ui·Xi − Σi vi·Xi = Σi (ui − vi)·Xi ~ ‖u − v‖2 · X
so a difference of dot products (dot-product distance) distributes like the L2 distance times a single Gaussian X.
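A quick numerical check of the 2-stability property above (a sketch using only the standard library; u, v are made-up vectors):

```python
import math
import random
import statistics

random.seed(0)
u = [1.0, 2.0, 3.0]
v = [1.0, 0.0, 7.0]
w = [a - b for a, b in zip(u, v)]          # u - v
true_norm = math.sqrt(sum(x * x for x in w))

# 2-stability: sum_i w_i * X_i ~ ||w||_2 * X for i.i.d. standard Gaussians X_i,
# so the sample std of the projections should approach ||u - v||_2.
samples = [sum(wi * random.gauss(0, 1) for wi in w) for _ in range(100_000)]
print(statistics.stdev(samples), true_norm)  # the two numbers nearly agree
```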
The full Hashing
h_{a,b}(v) = ⌊(a·v + b) / w⌋
a : d random numbers, i.i.d. from a p-stable distribution
v : the features vector (d numbers)
b : a random phase, drawn from [0, w]
w : the discretization step
[Example: features vector v = [34, 82, 21, ...]; a·v = 7944; with b = +34 and w = 100 the value falls in the cell 7900..8000 on the axis ...7800, 7900, 8000, 8100, 8200...]
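A minimal sketch of one h_{a,b}, assuming the 2-stable (Gaussian) case for L2; the names are illustrative:

```python
import math
import random

def make_pstable_hash(d, w, seed=0):
    """One h_{a,b}(v) = floor((a.v + b) / w): a holds d i.i.d. N(0,1) draws
    (2-stable, for L2), b is a random phase in [0, w), w is the cell width."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(d)]
    b = rng.uniform(0, w)
    def h(v):
        return math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
    return h

h = make_pstable_hash(d=3, w=4.0)
print(h([1.0, 2.0, 3.0]), h([1.1, 2.0, 3.0]))  # nearby points usually share a cell
```

In a full index, k such functions are concatenated into one key and the scheme is repeated over L tables, exactly as in the Hamming case.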
Generalization: P-Stable distribution
• Lp, 0 < p ≤ 2 : Generalized Central Limit Theorem → p-stable distribution (Cauchy for L1)
• L2 : Central Limit Theorem → Gaussian (normal) distribution
P-Stable summary
• Works for r-Nearest Neighbor queries
• Generalizes to 0 < p ≤ 2
• Improves query time: from O(d·n^(1/(1+ε))·log n) to O(d·n^(1/(1+ε)²)·log n)
  (latest results, reported by e-mail by Alexander Andoni)
Parameters selection
• 90% probability ⇒ best query time performance
For Euclidean space:
• A single projection hits an r-Nearest Neighbor with Pr = p1
• k projections hit an r-Nearest Neighbor with Pr = p1^k
• L hashings fail to collide with Pr = (1 − p1^k)^L
• To ensure collision (e.g., 1 − δ ≥ 90%):
  1 − (1 − p1^k)^L ≥ 1 − δ   ⇒   L ≥ log(δ) / log(1 − p1^k)
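The bound above translates directly into code (a sketch; the example values of p1, k, δ are made up):

```python
import math

def tables_needed(p1: float, k: int, delta: float) -> int:
    """Smallest L with 1 - (1 - p1**k)**L >= 1 - delta,
    i.e. L = ceil(log(delta) / log(1 - p1**k))."""
    return math.ceil(math.log(delta) / math.log(1 - p1 ** k))

print(tables_needed(p1=0.9, k=10, delta=0.1))  # tables needed for >=90% collision
```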
[Figure: an ideal hash accepts neighbors and rejects non-neighbors]
…Parameters selection
[Plot: query time vs. k - candidates-extraction time decreases with k while candidates-verification time increases, giving an optimal k]
Pros & Cons
Pros:
• Better query time than spatial data structures
• Scales well to higher dimensions and larger data size (sub-linear dependence)
• Predictable running time
Cons:
• Extra storage overhead
• Inefficient for data with distances concentrated around the average
• Works best for Hamming distance (although it can be generalized to Euclidean space)
• In secondary storage, a linear scan is pretty much all we can do (for high dimension)
• Requires the radius r to be fixed in advance
From Piotr Indyk's slides
Conclusion
• ...but at the end, everything depends on your data set
• Try it at home:
  - Visit http://web.mit.edu/andoni/www/LSH/index.html
  - E-mail Alex Andoni (andoni@mit.edu)
  - Test over your own data
  (C code, under Red Hat Linux)
LSH - Applications
• Searching video clips in databases ("Hierarchical Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression - vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short: whenever K-Nearest Neighbors (KNN) are needed
Motivation
• A variety of procedures in learning require KNN computation
• KNN search is a computational bottleneck
• LSH provides a fast approximate solution to the problem
• LSH requires hash function construction and parameter tuning
Outline
• "Fast Pose Estimation with Parameter Sensitive Hashing", G. Shakhnarovich, P. Viola, T. Darrell
  - Finding sensitive hash functions
• "Mean Shift Based Clustering in High Dimensions: A Texture Classification Example", B. Georgescu, I. Shimshoni, P. Meer
  - Tuning LSH parameters
  - The LSH data structure is used for algorithm speedups
The Problem
"Fast Pose Estimation with Parameter Sensitive Hashing", G. Shakhnarovich, P. Viola, T. Darrell
Given an image x, what are the parameters θ in this image?
i.e., angles of joints, orientation of the body, etc.
Ingredients
• Input: query image with unknown angles (parameters)
• Database of human poses with known angles
• Image feature extractor - edge detector
• Distance metric in feature space: dx
• Distance metric in angles space:
  dθ(θ1, θ2) = Σ_{i=1..m} (1 − cos(θ1,i − θ2,i))
Example based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query
Input: query → find KNN in the database of examples → output: average angles of the KNN
The algorithm flow:
Input Query → Features extraction → Processed query → PSH (LSH) against the database of examples → LWR (Regression) → Output: Match
The image features
Image features are multi-scale edge histograms.
[Figure: edge-direction histograms computed over image regions A, B at several scales]
Pipeline: Feature Extraction → PSH → LWR
PSH: The basic assumption
There are two metric spaces here: the feature space (dx) and the parameter space (dθ).
We want similarity to be measured in the angles space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space.
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
[Figure: corresponding neighborhoods of the query q in the parameters space (angles) and in the feature space]
Is this magic?
Parameter Sensitive Hashing (PSH)
The trick: estimate the performance of different hash functions on examples, and select those sensitive to dθ.
The hash functions are applied in feature space, but the KNN are valid in angle space.
1. Label pairs of examples with similar angles
2. Define hash functions h on the feature space
3. Predict the labeling of similar/non-similar examples by using h
4. Compare the labelings
5. If the labeling by h is good, accept h; else change h
PSH as a classification problem
A pair of examples (xi, θi), (xj, θj) is labeled
  yij = +1 if dθ(θi, θj) ≤ r
  yij = −1 if dθ(θi, θj) ≥ (1 + ε)·r
(e.g., r = 0.25)
[Figure: example pairs labeled +1, +1, −1, −1]
A binary hash function on the features:
  h_T(x) = +1 if feature(x) ≥ T, −1 otherwise
Predict the labels:
  ŷ_h(xi, xj) = +1 if h_T(xi) = h_T(xj), −1 otherwise
Find the best threshold T that predicts the true labeling, with probability constraints:
h_T will place both examples of a pair in the same bin, or separate them.
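A toy sketch of the selection criterion just described (the 1-D features, candidate thresholds, and names are hypothetical, not from the paper):

```python
def pair_accuracy(features, labels, T):
    """Fraction of labeled pairs ((fi, fj), y) that the threshold hash
    h_T(x) = +1 if x >= T else -1 classifies correctly: the predicted
    pair label is +1 iff both sides hash alike."""
    correct = 0
    for (fi, fj), y in zip(features, labels):
        same_bin = (fi >= T) == (fj >= T)
        correct += (1 if same_bin else -1) == y
    return correct / len(labels)

# Hypothetical pairs: similar poses (+1) have close features here.
pairs = [(0.1, 0.2), (0.8, 0.9), (0.1, 0.9), (0.2, 0.8)]
labels = [+1, +1, -1, -1]
best_T = max([0.0, 0.5, 1.0], key=lambda T: pair_accuracy(pairs, labels, T))
print(best_T, pair_accuracy(pairs, labels, best_T))  # -> 0.5 1.0
```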
Local Weighted Regression (LWR)
• Given a query image, PSH returns its KNNs
• LWR uses the KNN to compute a weighted average of the estimated angles of the query, where each neighbor xi ∈ N(x0) is weighted by a kernel of its feature distance (dist → weight):
  θ0 = argmin Σ_{xi ∈ N(x0)} K(dx(x0, xi)) · d(g(xi), θi)
Results
Synthetic data were generated:
• 13 angles: 1 for rotation of the torso, 12 for joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, face expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (l)
• Test on 1,000 synthetic examples
• PSH searched only 3.4% of the data per query
• Without selection, 40 bits and 1,000 hash tables would have been needed
Recall: P1 is the probability of a positive hash, P2 the probability of a bad hash, B the max number of points in a bucket.
Results - real data
• 800 images
• Processed by a segmentation algorithm
• 1.3% of the data were searched
[Figures: real-data results, including interesting mismatches]
Fast pose estimation - summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given: n spheres in Rd, centered at P = p1,…,pn, with radii r1,…,rn
• Goal: given a query q, preprocess the points in P to find a point pi whose sphere 'covers' the query q
Courtesy of Mohamad Hegaze
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example
B. Georgescu, I. Shimshoni, and P. Meer
Motivation
• Clustering high dimensional data by using local density measurements (e.g., in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions - using LSH
• Speedups:
  1. Finding optimal LSH parameters
  2. Data-driven partitions into buckets
  3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
[Figure: a window of a given bandwidth centered on a point; the mean-shift vector moves it toward the local density mode]
(Slide track: Mean-shift | LSH | optimal k,l | LSH data partition | LSH data struct)
KNN in mean-shift
The bandwidth should be inversely proportional to the density in the region:
high density - small bandwidth; low density - large bandwidth.
The bandwidth is based on the kth nearest neighbor of the point.
Adaptive mean-shift vs. non-adaptive
[Figures: the adaptive bandwidth tracks the local density]
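The kth-nearest-neighbor bandwidth rule above can be sketched in a few lines (1-D points for brevity; a sketch, not the paper's code):

```python
def knn_bandwidth(points, x, k):
    """Adaptive bandwidth: the distance from x to its k-th nearest neighbor,
    so dense regions get small bandwidths and sparse regions large ones."""
    dists = sorted(abs(p - x) for p in points if p != x)
    return dists[k - 1]

pts = [0.0, 0.1, 0.2, 0.3, 5.0, 7.0]
print(knn_bandwidth(pts, 0.1, k=2))  # dense region -> small bandwidth
print(knn_bandwidth(pts, 5.0, k=2))  # sparse region -> large bandwidth
```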
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths: hs (spatial), hr (color)
3. Apply filtering
[Figures: original → filtered → segmented; 3D view of mean-shift trajectories. Filtering: each pixel takes the value of the nearest mode]
Filtering examples: original squirrel / filtered; original baboon / filtered
Segmentation examples
"Mean-shift: A Robust Approach Towards Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries, implemented with LSH
• Statistical curse of dimensionality: sparseness of the data, handled with variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (dk, vk)
• For each point x, check whether x_{dk} ≤ vk; the K boolean results form the hash key
• This partitions the data into cells
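A minimal sketch of one such partition (illustrative; drawing cut values uniformly from [0, 1] is an assumption that fits this toy data, not the paper's procedure):

```python
import random

def make_partition(data, K, seed=0):
    """One random partition: K (coordinate, cut-value) pairs; a point's cell
    is the K-bit key of 'is x[d_k] <= v_k' tests."""
    rng = random.Random(seed)
    dims = len(data[0])
    cuts = [(rng.randrange(dims), rng.uniform(0, 1)) for _ in range(K)]
    def cell(x):
        return tuple(x[d] <= v for d, v in cuts)
    return cell

data = [(0.1, 0.9), (0.15, 0.85), (0.9, 0.1)]
cell = make_partition(data, K=4)
print(cell(data[0]) == cell(data[1]), cell(data[0]) == cell(data[2]))
```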
Choosing the optimal K and L
• For a query q, compute the smallest number of distances to points in its buckets
• If L is too small, points might be missed; but if L is too big, the union ∪C might include extra points
• Large K ⇒ a smaller number of points in a cell C
• As L increases, ∪C increases but each C decreases; K determines the resolution of the data structure
[The original gives formulas for the expected number of points in C and in ∪C as functions of n, K, L, d]
Choosing optimal K and L
• Determine accurately the KNN for m randomly-selected data points; their kth-NN distance is the bandwidth
• Choose an error threshold ε
• The optimal K and L should satisfy the constraint on the approximate distance
• For each K, estimate the error; in one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K))
[Plots: approximation error for (K, L); L(K) for ε = 0.05; running time t[K, L(K)] with its minimum]
Data driven partitions
• In the original LSH, cut values are random in the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
[Figure: bucket distribution - uniform cuts vs. data-driven cuts]
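The two cut-value strategies above side by side (a sketch; 1-D toy data, illustrative names):

```python
import random

def uniform_cut(data, dim, rng):
    """Original LSH: cut value drawn uniformly over the data range in `dim`."""
    lo = min(x[dim] for x in data)
    hi = max(x[dim] for x in data)
    return rng.uniform(lo, hi)

def data_driven_cut(data, dim, rng):
    """Suggested variant: take the coordinate of a randomly chosen data point,
    so cuts concentrate where the data does."""
    return rng.choice(data)[dim]

rng = random.Random(0)
data = [(0.01,), (0.02,), (0.03,), (9.0,)]  # clustered data with one outlier
print(uniform_cut(data, 0, rng), data_driven_cut(data, 0, rng))
```

On clustered data like this, uniform cuts usually land in the empty gap, while data-driven cuts split the dense cluster, balancing the bucket sizes.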
Additional speedup
Assume that all points in C will converge to the same mode (C is like a type of an aggregate).
Speedup results: 65,536 points, 1,638 points sampled, k = 100.
Food for thought
[Figure: low dimension vs. high dimension behavior]
A thought for food…
• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning requires KNN
15:30 cookies…
Summary
• LSH suggests a compromise on accuracy for the gain of complexity
• Applications that involve massive data in high dimension require the LSH fast performance
• Extensions of the LSH to different spaces (PSH)
• Learning the LSH parameters and hash functions for different applications
Conclusion
• ...but at the end, everything depends on your data set
• Try it at home:
  - Visit http://web.mit.edu/andoni/www/LSH/index.html
  - E-mail Alex Andoni (andoni@mit.edu)
  - Test over your own data
  (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Hash function
Hash function
Data_Item
Key
BinBucket
Hash function
X modulo 3
X=Number in the range 0n
02
Storage Address
Data structure
0
Usually we would like related Data-items to be stored at the same bin
Recall r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Locality sensitive hashing
r(1 + ) r
(r p1p2 )Sensitiveequiv Pr[I(p)=I(q)] is ldquohighrdquo if p is ldquocloserdquo to qequiv Pr[I(p)=I(q)] is ldquolowrdquo if p isrdquofarrdquo from q
r2=(1 + ) r1
P1P2
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
bullHamming space = 2N binary strings
bullHamming distance = changed digits
aka Signal distanceRichard Hamming
Hamming SpaceN
010100001111
010100001111
010010000011Distance = 4
bullHamming space
bullHamming distance
SUM(X1 XOR X2)
L1 to Hamming Space Embedding
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Hash function
Lj Hash function
p Hdrsquoisin
Gj(p)=p|Ij
j=1L k=3 digits
Bits sampling from p
Store p into bucket p|Ij 2k buckets101
11000000000 111111110000 111000000000 111111110001
Construction
1 2 L
p
Query
1 2 L
q
Alternative intuition random projections
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive:
• Probability(p, q in same bucket) = (1 − Distance(p, q)/dimensions)^k
(Figure: collision probability Pr vs. Distance(q, pi) for k = 1 and k = 2 — a larger k sharpens the fall-off. Adopted from Piotr Indyk's slides.)
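A quick empirical check of this collision probability, assuming the k bit positions are sampled independently with replacement (a simplifying assumption; the helper is hypothetical):

```python
import random

def collision_prob(dist, d, k, trials=20000, seed=1):
    """Fraction of trials in which none of k sampled bits hits a differing position."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        I = [rng.randrange(d) for _ in range(k)]  # k sampled bit positions
        # p and q are assumed to differ exactly in the first `dist` positions
        hits += all(i >= dist for i in I)
    return hits / trials

# Theory: (1 - 3/12)**2 = 0.5625
print(collision_prob(dist=3, d=12, k=2))
```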
Preview
• General solution – locality sensitive hashing
• Implementation for Hamming space
• Generalization to l2
Direct L2 solution
• New hashing function
• Still based on sampling
• Using a mathematical trick: a p-stable distribution for Lp distance — the Gaussian distribution for L2 distance
Central limit theorem
For real numbers v1,…,vn and independent, identically distributed (i.i.d.) Gaussian X1,…,Xn:
v1·X1 + v2·X2 + … + vn·Xn is again Gaussian — a weighted sum of Gaussians is a (scaled) Gaussian.
Central limit theorem
Σi vi·Xi ≈ ||v||2 · X, with X ~ N(0,1) — the dot product with a Gaussian vector distributes like the L2 norm.
For two feature vectors u and v:
Σi ui·Xi − Σi vi·Xi = Σi (ui − vi)·Xi ≈ ||u − v||2 · X
— the difference of dot products distributes like the L2 distance.
The full Hashing
h_{a,b}(v) = ⌊(a·v + b) / w⌋
• v – the features vector (e.g. [34 82 21 …], of dimension d)
• a – d random numbers, i.i.d. from a p-stable distribution
• b – a random phase in [0, w]
• w – the discretization step
Example: a·v = 7944, b = 34, w = 100 → h = ⌊7978/100⌋ = 79, i.e. the bin [7900, 8000).
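A sketch of h_{a,b}(v) with Gaussian (2-stable) coefficients; the helper names and parameter values are illustrative:

```python
import math
import random

def make_hash(d, w, seed=0):
    """Return one p-stable hash h(v) = floor((a.v + b) / w)."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(d)]  # i.i.d. N(0,1): 2-stable, for L2
    b = rng.uniform(0, w)                    # random phase in [0, w]
    def h(v):
        return math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
    return h

h = make_hash(d=3, w=100.0)
print(h([34, 82, 21]))  # nearby vectors tend to land in the same bin
```

In a full index, k such functions are concatenated per table and L tables are kept, exactly as in the Hamming-space scheme.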
Generalization: P-Stable distribution
• Lp, 0 < p ≤ 2: generalized central limit theorem → a p-stable distribution (Cauchy for L1)
• L2: central limit theorem → the Gaussian (normal) distribution
P-Stable summary
• Generalizes to 0 < p ≤ 2
• Improves query time for the r-nearest-neighbor problem: O(d·n^{1/(1+ε)}·log n) → O(d·n^{1/(1+ε)²}·log n)
(Latest results, reported in e-mail by Alexander Andoni.)
Parameters selection (for Euclidean space)
• 90% probability ⇒ best query time performance
• A single projection hits an ε-nearest neighbor with Pr = p1
• k projections hit an ε-nearest neighbor with Pr = p1^k
• All L hashings fail to collide with Pr = (1 − p1^k)^L
• To ensure a collision (e.g. 1 − δ ≥ 90%): 1 − (1 − p1^k)^L ≥ 1 − δ ⇒ L ≥ log(δ) / log(1 − p1^k)
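The bound above gives the number of tables L directly; a one-line sketch with illustrative values:

```python
import math

def tables_needed(p1, k, delta):
    """Smallest L with 1 - (1 - p1**k)**L >= 1 - delta."""
    return math.ceil(math.log(delta) / math.log(1 - p1 ** k))

print(tables_needed(p1=0.9, k=10, delta=0.1))  # 6
```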
…Parameters selection
(Figure: query time vs. k — the candidates-extraction and candidates-verification costs trade off as k grows: a larger k rejects more non-neighbors, a smaller k accepts more neighbors; choose k at the minimum of the total time.)
Pros & Cons (from Piotr Indyk's slides)
Pros:
• Better query time than spatial data structures
• Scales well to higher dimensions and larger data sizes (sub-linear dependence)
• Predictable running time
Cons:
• Extra storage overhead
• Inefficient for data with distances concentrated around the average
• Works best for Hamming distance (although it can be generalized to Euclidean space)
• In secondary storage, a linear scan is pretty much all we can do (for high dimensions)
• Requires the radius r to be fixed in advance
Conclusion
• …but at the end, everything depends on your data set
• Try it at home:
– Visit http://web.mit.edu/andoni/www/LSH/index.html
– E-mail Alex Andoni (andoni@mit.edu)
– Test over your own data (C code, under Red Hat Linux)
LSH – Applications
• Searching video clips in databases ("Hierarchical, Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression – vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short, whenever K-Nearest Neighbors (KNN) are needed
Motivation
• A variety of procedures in learning require KNN computation
• KNN search is a computational bottleneck
• LSH provides a fast approximate solution to the problem
• LSH requires hash function construction and parameter tuning
Outline
"Fast Pose Estimation with Parameter Sensitive Hashing", G. Shakhnarovich, P. Viola, and T. Darrell
• Finding sensitive hash functions
"Mean Shift Based Clustering in High Dimensions: A Texture Classification Example", B. Georgescu, I. Shimshoni, and P. Meer
• Tuning LSH parameters
• The LSH data structure is used for algorithm speedups
The Problem
("Fast Pose Estimation with Parameter Sensitive Hashing", G. Shakhnarovich, P. Viola, and T. Darrell)
Given an image x, what are the parameters θi in this image? I.e. the angles of joints, the orientation of the body, etc.
Ingredients
• Input: query image with unknown angles (parameters)
• Database of human poses with known angles
• Image feature extractor – edge detector
• Distance metric in feature space: dx
• Distance metric in angle space: dθ(θ1, θ2) = Σ_{i=1}^{m} (1 − cos(θ1,i − θ2,i))
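The angle-space metric can be sketched directly (the function name is illustrative; the formula follows the reconstruction above):

```python
import math

def d_theta(t1, t2):
    """Angle-space distance: sum_i (1 - cos(theta1_i - theta2_i))."""
    return sum(1 - math.cos(a - b) for a, b in zip(t1, t2))

print(d_theta([0.0, math.pi / 2], [0.0, math.pi / 2]))  # 0.0 for identical poses
```

Each term lies in [0, 2], so joints that differ by π contribute the maximum penalty.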
Example based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query
Input: query → find the KNN in the database of examples → output: average angles of the KNN
The algorithm flow
Input query → features extraction → processed query → PSH (LSH) over the database of examples → LWR (regression) → output: match
The image features
Image features are multi-scale edge histograms, computed over image sub-windows (figure: regions A and B with their edge-direction histograms).
Pipeline: Feature Extraction → PSH → LWR
PSH: The basic assumption
There are two metric spaces here: the feature space (dx) and the parameter space (dθ). We want similarity to be measured in the angle space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space.
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
(Figure: corresponding neighborhoods of a query q in the feature space and in the parameters (angles) space.) Is this magic?
Parameter Sensitive Hashing (PSH)
The trick: estimate the performance of different hash functions on examples, and select those sensitive to dθ. The hash functions are applied in feature space, but the KNN are valid in angle space.
1. Label pairs of examples with similar angles
2. Define hash functions h on the feature space
3. Predict the labeling of similar/non-similar examples by using h
4. Compare the labelings
5. If the labeling by h is good, accept h; else change h
PSH as a classification problem
Labels (with r = 0.25): a pair of examples (xi, xj) is labeled
y_ij = +1 if dθ(θi, θj) ≤ r
y_ij = −1 if dθ(θi, θj) ≥ (1 + ε)·r
(Example: pairs labeled +1, +1, −1, −1.)
A binary hash function on the features:
h_T(x) = +1 if the feature value of x ≥ T, −1 otherwise
Predict the labels:
ŷ_ij(h) = +1 if h_T(xi) = h_T(xj), −1 otherwise
Find the best threshold T that predicts the true labeling subject to the probability constraints: h_T will place both examples of a pair in the same bin, or separate them.
Local Weighted Regression (LWR)
• Given a query image, PSH returns its KNNs
• LWR uses the KNN to compute a weighted average of the estimated angles of the query, schematically:
θ0 = argmin_β Σ_{xi ∈ N(x)} dθ(g(xi; β), θi) · K(dx(xi, x))
where the kernel K (weight = f(distance)) down-weights distant neighbors.
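As a rough illustration of the idea (not the paper's exact estimator), the zeroth-order case of LWR reduces to a kernel-weighted average of the neighbors' known angles; all names here are hypothetical:

```python
import math

def weighted_angles(neighbors, query_feat, feat_dist, bandwidth=1.0):
    """neighbors: list of (features, angles). Weight each neighbor's angles
    by a Gaussian kernel of its feature distance, then average per angle."""
    w = [math.exp(-(feat_dist(f, query_feat) / bandwidth) ** 2)
         for f, _ in neighbors]
    total = sum(w)
    m = len(neighbors[0][1])
    return [sum(wi * ang[j] for wi, (_, ang) in zip(w, neighbors)) / total
            for j in range(m)]
```

With equally distant neighbors this degenerates to the plain KNN average mentioned earlier.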
Results
Synthetic data were generated:
• 13 angles: 1 for the rotation of the torso, 12 for joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, face expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (L)
• Tested on 1,000 synthetic examples; PSH searched only 3.4% of the data per query
• Without selection, 40 bits and 1,000 hash tables were needed
Recall: P1 is the probability of a positive hash, P2 the probability of a bad hash, B the max number of points in a bucket.
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 1.3% of the data were searched
(Including some interesting mismatches.)
Fast pose estimation – summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given n spheres in R^d centered at P = {p1,…,pn} with radii r1,…,rn
• Goal: given a query q, preprocess the points in P to find a point pi whose sphere 'covers' the query q
(Courtesy of Mohamad Hegaze.)
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example
B. Georgescu, I. Shimshoni, and P. Meer
Motivation
• Clustering high-dimensional data by using local density measurements (e.g. in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
1. Finding optimal LSH parameters
2. Data-driven partitions into buckets
3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
(Figure: a point shifted within its bandwidth window toward the local mean.)
(Roadmap: mean-shift → LSH → optimal k, l → data-driven partition → LSH data structure.)
KNN in mean-shift
The bandwidth should be inversely proportional to the density in the region: high density – small bandwidth; low density – large bandwidth. The bandwidth of each point is based on its kth nearest neighbor (taken as the distance to it).
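A brute-force sketch of this adaptive bandwidth rule (distance to the kth nearest neighbor; helper names are illustrative):

```python
def knn_bandwidth(points, k, dist):
    """Per-point bandwidth = distance to the k-th nearest neighbor, so
    dense regions get small bandwidths and sparse regions large ones."""
    hs = []
    for x in points:
        ds = sorted(dist(x, y) for y in points if y is not x)
        hs.append(ds[k - 1])
    return hs

pts = [0.0, 0.1, 0.2, 5.0]
print(knn_bandwidth(pts, k=2, dist=lambda a, b: abs(a - b)))
```

The isolated point 5.0 receives a far larger bandwidth than the clustered ones.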
Adaptive mean-shift vs. non-adaptive (figure comparison).
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths hs (spatial) and hr (color)
3. Apply filtering: each pixel takes the value of its nearest mode
(Figures: original, filtered, and segmented images; mean-shift trajectories. From "Mean-Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02.)
Filtering and segmentation examples (figures: original vs. filtered squirrel and baboon; from Comaniciu et al., TPAMI '02).
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries — implemented with LSH
• Statistical curse of dimensionality: sparseness of the data — variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (d_k, v_k)
• For each point, check whether x_{d_k} ≤ v_k for each of the K pairs; the resulting K bits partition the data into cells
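A sketch of such a partition, assuming the K tests have the form x[d_k] ≤ v_k (names and the uniform cut range are illustrative):

```python
import random
from collections import defaultdict

def make_partition(dim, K, lo=0.0, hi=1.0, seed=0):
    """One partition = K pairs (d_k, v_k): a coordinate index and a cut value."""
    rng = random.Random(seed)
    return [(rng.randrange(dim), rng.uniform(lo, hi)) for _ in range(K)]

def cell(x, partition):
    """A point's cell is the K-bit string of the tests x[d_k] <= v_k."""
    return "".join("1" if x[d] <= v else "0" for d, v in partition)

part = make_partition(dim=2, K=4)
buckets = defaultdict(list)
for x in [(0.1, 0.2), (0.15, 0.25), (0.9, 0.8)]:
    buckets[cell(x, part)].append(x)
```

Repeating with L independent partitions and taking the union of a query's cells yields its candidate neighbors, as in the Hamming-space scheme.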
Choosing the optimal K and L
• For a query q, compute the smallest number of distances to points in its buckets.
• If L is too small, points might be missed; but if L is too big, the union cell C̃ might include extra points.
• A large K → a smaller number of points in a cell.
• As L increases, C̃ increases but the per-cell resolution decreases; together K and L determine the resolution of the data structure.
Choosing optimal K and L
• Determine accurately the KNN for m randomly-selected data points → their distances give the bandwidth.
• Choose an error threshold ε.
• The optimal K and L should satisfy: the approximate distance stays within the threshold of the true distance.
• For each K, estimate the error; in one run over all L's, find the minimal L satisfying the constraint, L(K); then minimize the running time t(K, L(K)).
(Figures: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)] with its minimum.)
Data driven partitions
• In the original LSH, cut values are random in the range of the data.
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value.
(Figure: bucket distribution — uniform cuts vs. data-driven points.)
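The suggested data-driven cut can be sketched as follows (the helper is hypothetical):

```python
import random

def data_driven_cut(points, seed=0):
    """Pick a random data point and use one of its coordinates as the cut,
    so cuts concentrate where the data actually lives."""
    rng = random.Random(seed)
    p = rng.choice(points)          # random data point
    dim = rng.randrange(len(p))     # random coordinate index
    return dim, p[dim]              # test: x[dim] <= value ?

dim, val = data_driven_cut([(1.0, 9.0), (2.0, 8.0), (3.0, 7.0)])
```

This tends to even out the bucket occupancy compared with cuts drawn uniformly over the data range.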
Additional speedup
Assume that all points in C̃ will converge to the same mode (C̃ acts as a type of aggregate).
Speedup results: 65,536 points; 1,638 points sampled; k = 100.
Food for thought (low dimension vs. high dimension)
• Choose K, L by sample learning, or take the traditional values.
• Can one estimate K, L without sampling?
• A thought for food: does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed.
• The catch: efficient dimensionality learning requires KNN.
15:30 — cookies…
Summary
• LSH suggests a compromise on accuracy for the gain of complexity
• Applications that involve massive data in high dimensions require the fast LSH performance
• Extension of LSH to different spaces (PSH)
• Learning the LSH parameters and hash functions for different applications
Conclusion
• …but at the end, everything depends on your data set
• Try it at home:
– Visit http://web.mit.edu/andoni/www/LSH/index.html
– E-mail Alex Andoni (andoni@mit.edu)
– Test over your own data (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Hash function
X modulo 3
X=Number in the range 0n
02
Storage Address
Data structure
0
Usually we would like related Data-items to be stored at the same bin
Recall r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Locality sensitive hashing
r(1 + ) r
(r p1p2 )Sensitiveequiv Pr[I(p)=I(q)] is ldquohighrdquo if p is ldquocloserdquo to qequiv Pr[I(p)=I(q)] is ldquolowrdquo if p isrdquofarrdquo from q
r2=(1 + ) r1
P1P2
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
bullHamming space = 2N binary strings
bullHamming distance = changed digits
aka Signal distanceRichard Hamming
Hamming SpaceN
010100001111
010100001111
010010000011Distance = 4
bullHamming space
bullHamming distance
SUM(X1 XOR X2)
L1 to Hamming Space Embedding
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Hash function
Lj Hash function
p Hdrsquoisin
Gj(p)=p|Ij
j=1L k=3 digits
Bits sampling from p
Store p into bucket p|Ij 2k buckets101
11000000000 111111110000 111000000000 111111110001
Construction
1 2 L
p
Query
1 2 L
q
Alternative intuition random projections
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Recall r - Nearest Neighbor
r
(1 + ) r
dist(qp1) r
dist(qp2) (1 + ) r r2=(1 + ) r1
Locality sensitive hashing
r(1 + ) r
(r p1p2 )Sensitiveequiv Pr[I(p)=I(q)] is ldquohighrdquo if p is ldquocloserdquo to qequiv Pr[I(p)=I(q)] is ldquolowrdquo if p isrdquofarrdquo from q
r2=(1 + ) r1
P1P2
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
bullHamming space = 2N binary strings
bullHamming distance = changed digits
aka Signal distanceRichard Hamming
Hamming SpaceN
010100001111
010100001111
010010000011Distance = 4
bullHamming space
bullHamming distance
SUM(X1 XOR X2)
L1 to Hamming Space Embedding
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Hash function
Lj Hash function
p Hdrsquoisin
Gj(p)=p|Ij
j=1L k=3 digits
Bits sampling from p
Store p into bucket p|Ij 2k buckets101
11000000000 111111110000 111000000000 111111110001
Construction
1 2 L
p
Query
1 2 L
q
Alternative intuition random projections
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization: p-stable distributions
• Lp, 0 < p ≤ 2: by the Generalized Central Limit Theorem, use a p-stable distribution (e.g. the Cauchy distribution for L1)
• L2: by the Central Limit Theorem, use the Gaussian (normal) distribution
P-stable summary
• Generalizes to 0 < p ≤ 2
• Improves query time: from O(d·n^(1/(1+ε))·log n) to O(d·n^(1/(1+ε)²)·log n)
(r, ε)-nearest neighbor: latest results, reported by e-mail by Alexander Andoni.
Parameters selection

For Euclidean space, choose the parameters for 90% success probability and the best query-time performance:

• A single projection hits an ε-nearest neighbor with Pr = p1.
• k projections hit an ε-nearest neighbor with Pr = p1^k.
• All L hashings fail to collide with Pr = (1 − p1^k)^L.
• To ensure a collision with probability at least 1 − δ (e.g. 1 − δ ≥ 90%), require 1 − (1 − p1^k)^L ≥ 1 − δ, i.e.

L ≥ log(δ) / log(1 − p1^k)
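The bound above yields the number of tables directly; a small sketch:

```python
import math

def tables_needed(p1, k, delta):
    """Smallest L with 1 - (1 - p1**k)**L >= 1 - delta,
    i.e. L >= log(delta) / log(1 - p1**k)."""
    return math.ceil(math.log(delta) / math.log(1 - p1 ** k))

# e.g. p1 = 0.9 per projection, k = 18 bits, 90% success (delta = 0.1)
L = tables_needed(0.9, 18, 0.1)
print(L)
```

Raising k makes each table more selective (fewer candidates) but forces a larger L to keep the same success probability.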
…Parameters selection

[Plot: candidate-extraction and candidate-verification time as a function of k. A larger k rejects more non-neighbors but accepts fewer true neighbors; the best k balances the two.]
Pros & Cons

Pros:
• Better query time than spatial data structures
• Scales well to higher dimensions and larger data sizes (sub-linear dependence)
• Predictable running time

Cons:
• Extra storage overhead
• Inefficient for data with distances concentrated around the average
• Works best for Hamming distance (although it can be generalized to Euclidean space)
• In secondary storage, a linear scan is pretty much all we can do (for high dimension)
• Requires the radius r to be fixed in advance

From Piotr Indyk's slides
Conclusion
• But at the end, everything depends on your data set.
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – E-mail Alex Andoni (andoni@mit.edu)
  – Test over your own data (C code, under Red Hat Linux)
LSH – Applications
• Searching video clips in databases ("Hierarchical, Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression – vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short, whenever K-Nearest Neighbors (KNN) are needed
Motivation
• A variety of procedures in learning require KNN computation.
• KNN search is a computational bottleneck.
• LSH provides a fast approximate solution to the problem.
• LSH requires hash-function construction and parameter tuning.
Outline

Fast Pose Estimation with Parameter Sensitive Hashing, G. Shakhnarovich, P. Viola and T. Darrell:
• Finding sensitive hash functions

Mean Shift Based Clustering in High Dimensions: A Texture Classification Example, B. Georgescu, I. Shimshoni and P. Meer:
• Tuning LSH parameters
• The LSH data structure is used for algorithm speedups
The Problem

Fast Pose Estimation with Parameter Sensitive Hashing, G. Shakhnarovich, P. Viola and T. Darrell

Given an image x, what are the parameters θ in this image, i.e. the angles of the joints, the orientation of the body, etc.?
Ingredients
• Input: query image with unknown angles (parameters)
• Database of human poses with known angles
• Image feature extractor – an edge detector
• Distance metric in feature space: d_x
• Distance metric in angle space:

d_θ(θ1, θ2) = Σ_{i=1}^{m} (1 − cos(θ1,i − θ2,i))
Example-based learning
• Construct a database of example images with their known angles.
• Given a query image, run your favorite feature extractor.
• Compute the KNN from the database.
• Use these KNNs to compute the average angles of the query.

Input: query → find the KNN in the database of examples → output: average angles of the KNN.
The algorithm flow

Input query → features extraction → processed query → PSH (LSH) against the database of examples → LWR (regression) → output: match.
The image features

[Figure: image features are multi-scale edge histograms computed over image sub-windows A, B, …]

(Feature extraction → PSH → LWR)
PSH: the basic assumption

There are two metric spaces here: the feature space (d_x) and the parameter space (d_θ). We want similarity to be measured in the angle space, whereas LSH works in the feature space.

• Assumption: the feature space is closely related to the parameter space.
Insight: manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space.
• But the global structure may be complicated: curved.
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.

[Figure: a query q and its neighbors occupy corresponding manifolds in the parameter (angle) space and in the feature space. Is this magic?]
Parameter Sensitive Hashing (PSH)

The trick: estimate the performance of different hash functions on examples, and select those that are sensitive to d_θ. The hash functions are applied in the feature space, but the KNN are valid in the angle space.

Training procedure:
1. Label pairs of examples with similar angles.
2. Define hash functions h on the feature space.
3. Predict the labeling of similar/non-similar examples using h.
4. Compare the labelings: if the labeling by h is good, accept h; otherwise change h.
PSH as a classification problem

A pair of examples (xi, xj) is labeled (with, e.g., r = 0.25):

yij = +1  if d_θ(θi, θj) ≤ r
yij = −1  if d_θ(θi, θj) ≥ (1 + ε)·r

A binary hash function on the features:

h_T(x) = +1 if x ≥ T, −1 otherwise

Predict the label of a pair by

ŷij = +1 if h_T(xi) = h_T(xj), −1 otherwise

that is, h_T either places both examples in the same bin or separates them. Find the best T that predicts the true labeling, subject to the probability constraints.
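One way to score a candidate threshold T on labeled pairs (a toy sketch; the paper's actual selection uses probability constraints rather than raw accuracy):

```python
def pair_accuracy(T, pairs):
    """Fraction of labeled pairs ((x_i, x_j), y_ij) that the threshold hash
    h_T(x) = +1 if x >= T else -1 predicts correctly: the predicted label is
    +1 iff both sides fall on the same side of T."""
    correct = 0
    for (xi, xj), y in pairs:
        hi = 1 if xi >= T else -1
        hj = 1 if xj >= T else -1
        pred = 1 if hi == hj else -1
        correct += (pred == y)
    return correct / len(pairs)

# similar pairs (+1) have close feature values; dissimilar pairs (-1) straddle
pairs = [((0.1, 0.2), 1), ((0.8, 0.9), 1), ((0.1, 0.9), -1), ((0.2, 0.8), -1)]
print(pair_accuracy(0.5, pairs), pair_accuracy(0.0, pairs))
```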
Local Weighted Regression (LWR)
• Given a query image x, PSH returns its KNNs.
• LWR uses the KNNs to compute a weighted average of the estimated angles of the query:

θ0 = argmin_θ Σ_{xi ∈ N(x)} d_θ(g(xi), θi) · K(d_x(xi, x))

where N(x) are the neighbors and the kernel K weights each neighbor by its feature-space distance to the query.
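A zeroth-order sketch of this idea: simple kernel-weighted averaging of the neighbors' angles, ignoring angle wrap-around for simplicity (names are illustrative):

```python
import math

def lwr_angle(query_feat, neighbors, bandwidth):
    """Kernel-weighted average of neighbor angles.
    neighbors: list of (feature_vector, angle); the weight of each neighbor
    decays with its feature-space distance to the query (Gaussian kernel)."""
    num = den = 0.0
    for feat, angle in neighbors:
        d = math.dist(query_feat, feat)
        w = math.exp(-(d / bandwidth) ** 2)   # K(d_x)
        num += w * angle
        den += w
    return num / den

neighbors = [([0.0, 0.0], 10.0), ([1.0, 0.0], 20.0), ([5.0, 5.0], 90.0)]
print(lwr_angle([0.1, 0.0], neighbors, bandwidth=1.0))
```

The far neighbor (angle 90) gets almost no weight, so the estimate stays close to the nearby examples.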
Results

Synthetic data were generated:
• 13 angles: 1 for the rotation of the torso, 12 for the joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, face expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (L)
• Test on 1,000 synthetic examples: PSH searched only 3.4% of the data per query
• Without feature selection, 40 bits and 1,000 hash tables would have been needed

Recall: p1 is the probability of a positive hash, p2 is the probability of a bad hash, B is the maximum number of points in a bucket.
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 1.3% of the data were searched

Interesting mismatches
Fast pose estimation – summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging

Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given: n spheres in Rd, centered at P = {p1, …, pn}, with radii r1, …, rn.
• Goal: given a query q, preprocess the points in P so as to find a point pi whose sphere 'covers' the query q, i.e. ||q − pi|| ≤ ri.

Courtesy of Mohamad Hegaze
Motivation
• Clustering high-dimensional data by using local density measurements (e.g. in feature space).
• Statistical curse of dimensionality: sparseness of the data.
• Computational curse of dimensionality: expensive range queries.
• The LSH parameters should be adjusted for optimal performance.

Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example, B. Georgescu, I. Shimshoni and P. Meer
Outline
• Mean-shift in a nutshell + examples

Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
  1. Finding optimal LSH parameters
  2. Data-driven partitions into buckets
  3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell

[Figure: a window of a given bandwidth around a point is repeatedly shifted toward the mean of the points inside it.]

(Roadmap: mean-shift → LSH → optimal k, l → LSH data partition → LSH data structure)
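The core iteration, sketched in 1-D with a flat kernel:

```python
def mean_shift_1d(points, start, bandwidth, iters=50):
    """Move a point to the mean of its neighbors inside the bandwidth window,
    repeatedly, so it converges on a density mode (flat kernel, 1-D)."""
    x = start
    for _ in range(iters):
        window = [p for p in points if abs(p - x) <= bandwidth]
        x = sum(window) / len(window)
    return x

pts = [1.0, 1.1, 1.2, 5.0, 5.1]
print(mean_shift_1d(pts, start=1.4, bandwidth=0.5))  # converges near 1.1
```

Each query of "points within the bandwidth" is a range query, which is exactly the step LSH accelerates in high dimensions.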
KNN in mean-shift

The bandwidth should be inversely proportional to the density in the region: high density – small bandwidth; low density – large bandwidth. It is based on the kth nearest neighbor of the point: the bandwidth is the distance from the point to its kth nearest neighbor.
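A 1-D sketch of this kth-nearest-neighbor bandwidth rule:

```python
def adaptive_bandwidths(points, k):
    """Per-point bandwidth = distance to the k-th nearest neighbor, so dense
    regions get small windows and sparse regions get large ones (1-D sketch)."""
    out = []
    for i, x in enumerate(points):
        dists = sorted(abs(x - y) for j, y in enumerate(points) if j != i)
        out.append(dists[k - 1])
    return out

pts = [0.0, 0.1, 0.2, 5.0]        # three dense points and one outlier
print(adaptive_bandwidths(pts, k=2))
```

The outlier at 5.0 gets a bandwidth an order of magnitude larger than the dense points, as the slide prescribes.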
Adaptive mean-shift vs. non-adaptive

[Figure: clustering results with adaptive vs. non-adaptive bandwidth.]
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y).
2. Resolution is controlled by the bandwidths h_s (spatial) and h_r (color).
3. Apply filtering.

"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Image segmentation algorithm

[Figure: original, filtered, and segmented images, with mean-shift trajectories. Filtering: each pixel takes the value of its nearest mode.]
Filtering examples

[Figures: original vs. filtered squirrel and baboon images.]

Segmentation examples

"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries, implemented with LSH.
• Statistical curse of dimensionality: sparseness of the data, handled with a variable bandwidth.
LSH-based data structure
• Choose L random partitions. Each partition includes K pairs (d_k, v_k).
• For each point x, check whether x_{d_k} ≤ v_k for each of the K pairs.
• This partitions the data into cells.
Choosing the optimal K and L

For a query q, compute distances only to the points in its buckets; the goal is the smallest number of distance computations.

• Large K: a smaller number of points per cell.
• If L is too small, neighbor points might be missed; if L is too big, extra points might be included.
• As L increases, the union cell C∪ increases but the intersection cell C∩ decreases; C∩ determines the resolution of the data structure.
Choosing optimal K and L

Determine accurately the KNN (the bandwidth distance) for m randomly-selected data points, and choose an error threshold ε; the optimal K and L should keep the approximate distance within ε of the true one.
• For each K, estimate the error.
• In one run over all L's, find the minimal L satisfying the constraint: L(K).
• Minimize the running time t(K, L(K)).

[Plots: approximation error for (K, L); L(K) for ε = 0.05; running time t[K, L(K)] with its minimum.]
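The selection loop above can be sketched with stand-in error/runtime probes (both hypothetical here; in the paper they are measured empirically on the m sampled queries):

```python
def tune(Ks, error, runtime, eps):
    """For each K, find the minimal L whose approximation error is within eps,
    then keep the (K, L) pair with the smallest measured running time.
    error(K, L) and runtime(K, L) are caller-supplied probes."""
    best = None
    for K in Ks:
        L = 1
        while error(K, L) > eps:   # error shrinks as L grows
            L += 1
        t = runtime(K, L)
        if best is None or t < best[2]:
            best = (K, L, t)
    return best

# toy stand-ins: error falls with L, faster for small K; time grows with K*L
err = lambda K, L: 1.0 / (L * (11 - K))
time_ = lambda K, L: K * L
print(tune(range(1, 11), err, time_, eps=0.05))
```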
Data-driven partitions
• In the original LSH, cut values are drawn at random from the range of the data.
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value.

[Figure: bucket-occupancy distribution for uniform cuts vs. data-driven cuts.]
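A sketch of the suggested cut-value sampling (illustrative names):

```python
import random

def data_driven_cuts(points, K, seed=0):
    """K (dimension, cut-value) pairs, where each cut value is a coordinate of
    a randomly chosen data point, so the buckets follow the data distribution."""
    rng = random.Random(seed)
    d = len(points[0])
    cuts = []
    for _ in range(K):
        dim = rng.randrange(d)
        cuts.append((dim, rng.choice(points)[dim]))
    return cuts

def bucket(point, cuts):
    """One bit per cut: which side of the cut the point falls on."""
    return tuple(int(point[dim] >= v) for dim, v in cuts)

pts = [(0.0, 0.1), (0.2, 0.3), (10.0, 9.0)]
cuts = data_driven_cuts(pts, K=4)
print(bucket(pts[0], cuts), bucket(pts[2], cuts))
```

Because the cuts are drawn from actual coordinates, dense regions receive more cuts and the bucket sizes become more uniform than with cuts drawn uniformly from the data range.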
Additional speedup

Assume that all points in C∩ will converge to the same mode (C∩ is like a type of aggregate); the mean-shift procedure can then be run once per cell rather than once per point.

Speedup results

[Table: 65,536 points, 1,638 points sampled, k = 100.]
Food for thought

[Figure: low dimension vs. high dimension.]

A thought for food…
• Choose K, L by sample learning, or take the traditional values.
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed.
• The catch: efficient dimensionality learning requires KNN.

15:30 cookies…
Summary
• LSH trades a compromise in accuracy for a gain in complexity.
• Applications that involve massive data in high dimension require LSH's fast performance.
• LSH extends to different spaces (PSH).
• The LSH parameters and hash functions can be learned for different applications.
Conclusion
• But at the end, everything depends on your data set.
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – E-mail Alex Andoni (andoni@mit.edu)
  – Test over your own data (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l1 amp l2
Hamming Space
bullHamming space = 2N binary strings
bullHamming distance = changed digits
aka Signal distanceRichard Hamming
Hamming SpaceN
010100001111
010100001111
010010000011Distance = 4
bullHamming space
bullHamming distance
SUM(X1 XOR X2)
L1 to Hamming Space Embedding
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Hash function
Lj Hash function
p Hdrsquoisin
Gj(p)=p|Ij
j=1L k=3 digits
Bits sampling from p
Store p into bucket p|Ij 2k buckets101
11000000000 111111110000 111000000000 111111110001
Construction
1 2 L
p
Query
1 2 L
q
Alternative intuition random projections
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH – Applications
• Searching video clips in databases ("Hierarchical Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression – vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short: whenever K-Nearest Neighbors (KNN) are needed
Motivation
• A variety of procedures in learning require KNN computation
• KNN search is a computational bottleneck
• LSH provides a fast approximate solution to the problem
• LSH requires hash-function construction and parameter tuning
Outline
• Fast Pose Estimation with Parameter Sensitive Hashing (G. Shakhnarovich, P. Viola, T. Darrell): finding sensitive hash functions
• Mean Shift Based Clustering in High Dimensions: A Texture Classification Example (B. Georgescu, I. Shimshoni, P. Meer): tuning LSH parameters; the LSH data structure is used for algorithm speedups
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing (G. Shakhnarovich, P. Viola, T. Darrell)
Given an image x, what are the parameters θ in this image, i.e., the angles of the joints, the orientation of the body, etc.?
Ingredients
• Input: query image with unknown angles (parameters)
• Database of human poses with known angles
• Image feature extractor – edge detector
• Distance metric in feature space: d_x
• Distance metric in angle space:
  d_θ(θ1, θ2) = Σ_{i=1..m} (1 − cos(θ1,i − θ2,i))
Example based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNN to compute the average angles of the query
Input: query → find KNN in the database of examples → output: average angles of the KNN
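A toy sketch of that last averaging step (the helper and the tiny database are ours; the real system uses PSH for the KNN step and locally weighted regression rather than a plain mean):

```python
import math

def knn_mean_angles(query_feat, database, dist, k=3):
    """database: list of (features, angles). Return the per-joint circular mean
    of the angles of the k examples nearest to query_feat in *feature* space."""
    nbrs = sorted(database, key=lambda ex: dist(query_feat, ex[0]))[:k]
    m = len(nbrs[0][1])
    means = []
    for j in range(m):  # circular mean per angle, robust to wrap-around at 2*pi
        s = sum(math.sin(ex[1][j]) for ex in nbrs)
        c = sum(math.cos(ex[1][j]) for ex in nbrs)
        means.append(math.atan2(s, c))
    return means

db = [((0.0,), [0.10]), ((1.0,), [0.20]), ((9.0,), [3.00])]
dist = lambda a, b: abs(a[0] - b[0])
print(knn_mean_angles((0.5,), db, dist, k=2))  # ≈ [0.15]
```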
The algorithm flow
Input query → features extraction → processed query → PSH (LSH) against the database of examples → LWR (regression) → output: match
The image features
Image features are multi-scale edge histograms.
[Figure: example images A and B with their edge maps and histograms.]
(Pipeline: Feature Extraction → PSH → LWR)
PSH: The basic assumption
There are two metric spaces here: the feature space (d_x) and the parameter space (d_θ). We want similarity to be measured in the angle space, whereas LSH works in the feature space.
• Assumption: the feature space is closely related to the parameter space
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
[Figure: the query q mapped between the parameter space (angles) and the feature space.]
Is this magic?
Parameter Sensitive Hashing (PSH)
The trick: estimate the performance of different hash functions on examples, and select those sensitive to d_θ. The hash functions are applied in feature space, but the KNN are valid in angle space.
• Label pairs of examples with similar angles
• Define hash functions h on the feature space
• Predict the labeling of similar/non-similar examples by using h
• Compare the labelings
• If the labeling by h is good, accept h; else change h
PSH as a classification problem
A pair of examples (x_i, x_j) is labeled
  y_ij = +1 if d_θ(θ_i, θ_j) ≤ r
  y_ij = −1 if d_θ(θ_i, θ_j) ≥ (1 + ε)·r
(e.g., r = 0.25). Example labels: +1, +1, −1, −1.
A binary hash function on features:
  h_{φ,T}(x) = +1 if x_φ ≥ T, −1 otherwise.
Predict the labels:
  ŷ_h(x_i, x_j) = +1 if h_T(x_i) = h_T(x_j), −1 otherwise.
Find the best threshold T that predicts the true labeling under the probability constraints: such a T will place both examples of a similar pair in the same bin, and separate the examples of a non-similar pair.
Local Weighted Regression (LWR)
• Given a query image, PSH returns its KNN
• LWR uses the KNN to compute a weighted average of the estimated angles of the query, with weights K(d_x(x, x_i)) that decay with the feature-space distance:
  θ̂(x) = argmin_β Σ_{x_i ∈ N(x)} K(d_x(x, x_i)) · ‖g(x_i; β) − θ_i‖²
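A zeroth-order sketch of that weighted averaging (our names; the paper fits a local model g rather than a constant):

```python
def lwr_estimate(neighbors, kernel):
    """neighbors: list of (feature_distance, angles). Kernel-weighted
    average of the neighbors' angle vectors (constant local model)."""
    wsum = sum(kernel(d) for d, _ in neighbors)
    m = len(neighbors[0][1])
    return [sum(kernel(d) * th[j] for d, th in neighbors) / wsum
            for j in range(m)]

nbrs = [(0.0, [1.0]), (1.0, [3.0])]
est = lwr_estimate(nbrs, kernel=lambda d: 1.0 / (1.0 + d))
print(est)  # the closer neighbor dominates, pulling the estimate below 2.0
```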
Results
Synthetic data were generated:
• 13 angles: 1 for the rotation of the torso, 12 for the joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, face expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (L)
• Tested on 1,000 synthetic examples; PSH searched only 3.4% of the data per query
• Without feature selection, 40 bits and 1,000 hash tables were needed
Recall: p1 is the probability of a positive hash, p2 the probability of a bad hash, and B is the maximal number of points in a bucket.
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 1.3% of the data were searched
[Figures: real-data results, including some interesting mismatches.]
Fast pose estimation – summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN with smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter are challenging
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given: n spheres in Rd, centered at P = p1,…,pn, with radii r1,…,rn
• Goal: preprocess the points in P so that, given a query q, we can find a point pi whose sphere 'covers' the query q
(Courtesy of Mohamad Hegaze)
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example (B. Georgescu, I. Shimshoni, P. Meer)
Motivation:
• Clustering high-dimensional data by using local density measurements (e.g., in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Outline
• Mean-shift in a nutshell + examples
• Our scope: mean-shift in high dimensions, using LSH
• Speedups:
  1. Finding optimal LSH parameters
  2. Data-driven partitions into buckets
  3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
[Figure: a point with the bandwidth window around it; the mean-shift vector points toward the local mean.]
(Roadmap: mean-shift → optimal k, l → data-driven partitions → LSH data structure)

KNN in mean-shift
The bandwidth should be inversely proportional to the density in the region: high density → small bandwidth; low density → large bandwidth. It is based on the k-th nearest neighbor of the point: the bandwidth h_i is the distance from x_i to its k-th neighbor.
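A tiny 1-D sketch of that bandwidth rule (illustrative; real data would be high-dimensional and the neighbor search itself done with LSH):

```python
def adaptive_bandwidths(points, k):
    """Per-point bandwidth = distance to the k-th nearest neighbor:
    small in dense regions, large in sparse ones. O(n^2) brute force."""
    hs = []
    for x in points:
        dists = sorted(abs(x - y) for y in points if y is not x)
        hs.append(dists[k - 1])
    return hs

pts = [0.0, 0.1, 0.2, 5.0]
print(adaptive_bandwidths(pts, k=2))  # the isolated point 5.0 gets a large bandwidth
```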
Adaptive mean-shift vs non-adaptive
Image segmentation algorithm:
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution is controlled by the bandwidths h_s (spatial) and h_r (color)
3. Apply filtering: each pixel takes the value of its nearest mode
[Figures: original, filtered, and segmented images; mean-shift trajectories in the 3D feature space.]
(Mean Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02)
Filtering examples: original squirrel → filtered; original baboon → filtered.
Segmentation examples.
(Mean Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02)
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries, implemented with LSH
• Statistical curse of dimensionality: sparseness of the data, handled with variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition is defined by K pairs (d_k, v_k): a coordinate index and a cut value
• For each point x we check whether x_{d_k} ≤ v_k; the K boolean results determine the point's cell
• This partitions the data into cells
Choosing the optimal K and L
• For a query q, we want the smallest number of distance computations to the points in its buckets
• A large K means a smaller number of points in a cell
• If L is too small, neighbors might be missed; but if L is too big, extra points might be included
• As L increases, the union of cells C_∪ increases but the intersection C_∩ decreases; K determines the resolution of the data structure
Choosing optimal K and L
• Determine accurately the KNN distance (bandwidth) for m randomly-selected data points
• Choose an error threshold ε on the approximate distance
• The optimal K and L should satisfy the error bound on the approximate distance
• For each K, estimate the error; in one run, over all L's, find the minimal L satisfying the constraint, L(K); then minimize the running time t(K, L(K))
[Figures: approximation error for (K, L); L(K) for ε = 0.05; running time t[K, L(K)] and its minimum.]
Data driven partitions
• In the original LSH, cut values are chosen uniformly at random in the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
[Figure: bucket occupancy distribution, uniform cuts vs. data-driven cuts.]
Additional speedup
Assume that all the points in C_∪ converge to the same mode (C_∪ acts like a type of aggregate); then one mean-shift procedure can serve all of them.
Speedup results
65,536 points; 1,638 points sampled; k = 100
Food for thought: low dimension vs. high dimension
A thought for food…
• Choose K, L by sample learning, or take the traditional values?
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold? Intuitively, the dimensionality implies the number of hash functions needed; the catch is that efficient dimensionality learning itself requires KNN.
15:30 cookies…
Summary
• LSH compromises on accuracy for a gain in complexity
• Applications that involve massive data in high dimensions require LSH's fast performance
• LSH extends to different spaces (PSH)
• The LSH parameters and hash functions can be learned per application
Conclusion
• …but in the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – E-mail Alex Andoni: andoni@mit.edu
  – Test it over your own data (C code, under Red Hat Linux)
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Hamming Space
• Hamming space = the 2^N binary strings of length N
• Hamming distance = the number of changed digits, a.k.a. signal distance (Richard Hamming)
• Hamming distance = SUM(X1 XOR X2)
Example: d(010100001111, 010010000011) = 4
L1 to Hamming Space Embedding
Embed each coordinate in unary: a coordinate x ∈ {0,…,C} becomes x ones followed by C − x zeros; concatenating the d coordinates gives a binary string of length d' = C·d.
Example (C = 11, d = 2): p = (8, 2) → 11111111000 11000000000.
Under this embedding, L1 distances become Hamming distances.
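A minimal sketch of the embedding (function names are ours):

```python
def unary(x, C):
    """Embed the integer coordinate x in {0..C} as x ones then C - x zeros."""
    return "1" * x + "0" * (C - x)

def embed(p, C):
    """Concatenate per-coordinate unary codes; L1 distance becomes Hamming."""
    return "".join(unary(x, C) for x in p)

def hamming(s, t):
    return sum(a != b for a, b in zip(s, t))

p, q, C = (8, 2), (5, 6), 11
print(embed(p, C))  # → 1111111100011000000000 (the slide's example)
l1 = sum(abs(a - b) for a, b in zip(p, q))
print(hamming(embed(p, C), embed(q, C)) == l1)  # → True
```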
Hash function
G_j(p) = p|_{I_j}: bits sampling from p ∈ H^{d'}. The j-th hash keeps only the bit positions in a random index set I_j (j = 1…L; e.g., k = 3 digits), and p is stored in bucket p|_{I_j} (2^k buckets; e.g., bucket 101).
Construction: insert each p into its bucket in each of the tables 1, 2, …, L.
Query: look up q's bucket in each of the tables 1, 2, …, L.
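A compact sketch of that construction and query (names are ours; points are bit strings):

```python
import random
from collections import defaultdict

def build_tables(points, d_prime, k, L, seed=0):
    """L tables; table j keys each point by k sampled bit positions I_j."""
    rng = random.Random(seed)
    tables = []
    for _ in range(L):
        I = sorted(rng.sample(range(d_prime), k))  # the sampled positions I_j
        buckets = defaultdict(list)
        for p in points:
            buckets["".join(p[i] for i in I)].append(p)
        tables.append((I, buckets))
    return tables

def query(tables, q):
    """Union of q's buckets over the L tables: candidates for exact checking."""
    cands = set()
    for I, buckets in tables:
        cands.update(buckets.get("".join(q[i] for i in I), []))
    return cands

tables = build_tables(["0000", "0001", "1111"], d_prime=4, k=2, L=3)
print(query(tables, "0000"))  # "0000" always collides with itself
```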
Alternative intuition: random projections
In the embedded space (d' = C·d; e.g., p = (8, 2) → 1111111100011000000000 for C = 11), each sampled bit acts as an axis-parallel cut, so k samplings project a point into one of 2^k buckets (for k = 3: buckets 000, 100, 110, 001, 101, 111, …). Nearby points tend to land in the same bucket.
Repeating L times.
Secondary hashing: the 2^k sparse buckets (e.g., 011) are mapped by a simple hash into M physical buckets of size B, with M·B = αn (α = 2): support/volume tuning, dataset size vs. storage volume.
The above hashing is locality-sensitive:
Probability(p, q in the same bucket) = (1 − Distance(p, q)/d')^k,
where the distance is Hamming and d' is the number of dimensions (bits).
[Figure: collision probability vs. distance(q, p_i) for k = 1 and k = 2; a larger k makes the probability fall off more sharply.]
(Adapted from Piotr Indyk's slides)
Preview
• General solution – locality sensitive hashing
• Implementation for Hamming space
• Generalization to l2
Direct L2 solution
• A new hashing function
• Still based on sampling
• Using a mathematical trick: p-stable distributions for the Lp distance, the Gaussian distribution for the L2 distance
Central limit theorem
v1·X1 + v2·X2 + … + vn·Xn = a weighted sum of Gaussians = a (scaled) Gaussian,
where v1,…,vn are real numbers and X1,…,Xn are independent, identically distributed (i.i.d.) standard Gaussians.
Dot product and norm (2-stability):
  Σ_i v_i·X_i ∼ ‖v‖_2 · X,  X ∼ N(0, 1),
so the dot product of a features vector with a Gaussian vector encodes its norm.
Dot product and distance: for two features vectors u and v,
  Σ_i u_i·X_i − Σ_i v_i·X_i = Σ_i (u_i − v_i)·X_i ∼ ‖u − v‖_2 · X,
so the difference of the two dot products encodes their distance.
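This 2-stability property is easy to verify empirically (a sketch using the standard-library Gaussian; the two vectors are arbitrary):

```python
import math
import random

rng = random.Random(0)
u = [1.0, -2.0, 0.5]
v = [0.0, 1.0, 2.5]
dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))  # ||u - v||_2

# Project u - v onto many random Gaussian vectors a ~ N(0,1)^d:
samples = []
for _ in range(50000):
    a = [rng.gauss(0.0, 1.0) for _ in u]
    samples.append(sum(ai * (ui - vi) for ai, ui, vi in zip(a, u, v)))

# By 2-stability, a.u - a.v is distributed as ||u - v||_2 * N(0, 1):
std = math.sqrt(sum(s * s for s in samples) / len(samples))
print(round(std, 2), round(dist, 2))  # the two should nearly coincide
```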
The full Hashing
h_{a,b}(v) = ⌊(a·v + b) / w⌋
• v: the features vector, d numbers (e.g., [3.4, 8.2, …, 2.1])
• a: d random numbers, i.i.d. from a p-stable distribution (so a·v ∼ ‖v‖_p · X)
• b: a random phase in [0, w]
• w: the discretization step
Example: with a·v = 7944, b = 34 and w = 100, the shifted projection 7978 falls in the bucket [7900, 8000) of the grid …, 7800, 7900, 8000, 8100, 8200, …
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Hamming SpaceN
010100001111
010100001111
010010000011Distance = 4
bullHamming space
bullHamming distance
SUM(X1 XOR X2)
L1 to Hamming Space Embedding
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Hash function
Lj Hash function
p Hdrsquoisin
Gj(p)=p|Ij
j=1L k=3 digits
Bits sampling from p
Store p into bucket p|Ij 2k buckets101
11000000000 111111110000 111000000000 111111110001
Construction
1 2 L
p
Query
1 2 L
q
Alternative intuition random projections
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH: The basic assumption
There are two metric spaces here: feature space (d_x) and parameter space (d_θ).
We want similarity to be measured in angle space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space.
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling Euclidean space.
• But the global structure may be complicated: curved.
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
Parameter space (angles) ↔ feature space
Is this magic?
Parameter Sensitive Hashing (PSH)
The trick: estimate the performance of different hash functions on examples, and select those sensitive to d_θ. The hash functions are applied in feature space, but the KNN are valid in angle space.
PSH as a classification problem:
1. Label pairs of examples with similar angles.
2. Define hash functions h on the feature space.
3. Predict the labeling of similar/non-similar examples by using h.
4. Compare the labelings: if the labeling by h is good, accept h; else change h.
Labels (with r = 0.25):
A pair of examples (x_i, θ_i), (x_j, θ_j) is labeled
y_ij = +1 if d_θ(θ_i, θ_j) ≤ r
y_ij = −1 if d_θ(θ_i, θ_j) > (1 + ε) r
A binary hash function on features:
h_T(x) = +1 if x ≥ T, −1 otherwise
Predict the labels:
ŷ_ij = +1 if h_T(x_i) = h_T(x_j), −1 otherwise
Find the best threshold T that predicts the true labeling, subject to the probability constraints: h_T will place both examples of a similar pair in the same bin, and separate dissimilar pairs.
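A minimal sketch of the selection idea: score candidate threshold hashes h_T by how well "same bucket" predicts "similar angles" on labeled pairs. The toy data, candidate thresholds, and function names here are illustrative, not from the paper:

```python
import random

random.seed(0)

# Toy data: each example is (feature_value, angle); the feature tracks the angle.
examples = [(random.gauss(a, 0.3), a) for a in [0.0, 0.0, 1.0, 1.0, 2.0, 2.0]]
r = 0.25

def pair_label(ti, tj):
    # +1 for pairs with similar angles, -1 otherwise (the epsilon gap is ignored here)
    return 1 if abs(ti - tj) <= r else -1

def h(x, T):
    # Binary threshold hash on the feature value
    return 1 if x >= T else -1

def accuracy(T):
    """Fraction of pairs whose predicted label (same bin?) matches the true label."""
    pairs = [(i, j) for i in range(len(examples)) for j in range(i + 1, len(examples))]
    correct = 0
    for i, j in pairs:
        (xi, ti), (xj, tj) = examples[i], examples[j]
        predicted = 1 if h(xi, T) == h(xj, T) else -1
        correct += predicted == pair_label(ti, tj)
    return correct / len(pairs)

# Accept the candidate threshold that classifies the pairs best.
best_T = max([0.5, 1.0, 1.5], key=accuracy)
```

In the paper many such hash bits are selected and combined into k-bit LSH keys; this sketch only shows the scoring of a single bit.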
Local Weighted Regression (LWR)
• Given a query image x0, PSH returns KNNs.
• LWR uses the KNN to compute a weighted average of the estimated angles of the query: the model g is fit by minimizing, over the neighborhood N(x0),
  Σ_{x_i ∈ N(x0)} K(d_x(x_i, x0)) · d_θ(g(x_i), θ_i)
  where the kernel K turns feature-space distance into a weight (distance → weight).
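The smart-averaging step can be sketched as a kernel-weighted average of the neighbors' angle vectors. This is a zeroth-order simplification of LWR; the Gaussian kernel and the bandwidth value are illustrative choices:

```python
import math

def lwr_angles(query_dists, neighbor_angles, bandwidth=1.0):
    """Kernel-weighted average of the neighbors' angle vectors.

    Zeroth-order locally weighted regression: neighbors that are closer
    to the query in feature space get larger Gaussian weights.
    """
    weights = [math.exp(-(d / bandwidth) ** 2) for d in query_dists]
    total = sum(weights)
    m = len(neighbor_angles[0])
    return [sum(w * ang[k] for w, ang in zip(weights, neighbor_angles)) / total
            for k in range(m)]

# Three neighbors at increasing feature distance, each with 2 angle parameters.
est = lwr_angles([0.1, 0.5, 0.9], [[1.0, 2.0], [1.2, 2.2], [3.0, 0.0]])
```

The estimate is pulled toward the closest neighbors, which is the intended behavior of the weighting.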
Results
Synthetic data were generated:
• 13 angles: 1 for rotation of the torso, 12 for joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, face expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (l)
• Test on 1,000 synthetic examples
• PSH searched only 3.4% of the data per query
• Without selection, 40 bits and 1,000 hash tables would have been needed
Recall: P1 is the probability of a positive hash, P2 is the probability of a bad hash, B is the max number of points in a bucket.
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 1.3% of the data were searched
• Interesting mismatches were observed
Fast pose estimation – summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given n spheres in R^d centered at P = {p1, …, pn}, with radii r1, …, rn
• Goal: given a query q, preprocess the points in P to find a point pi whose sphere 'covers' the query q, i.e., ||q − pi|| ≤ ri
Courtesy of Mohamad Hegaze
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example
B. Georgescu, I. Shimshoni, and P. Meer
Motivation
• Clustering high-dimensional data by using local density measurements (e.g., in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
  1. Finding optimal LSH parameters
  2. Data-driven partitions into buckets
  3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
(figure: a window of radius "bandwidth" around a point is shifted toward the local mean)
Mean-shift | LSH: optimal k,l | LSH: data partition | LSH | LSH data struct
KNN in mean-shift
The bandwidth should be inversely proportional to the density in the region:
high density – small bandwidth; low density – large bandwidth.
Based on the kth nearest neighbor of the point: the bandwidth is h_i = ||x_i − x_{i,k}||, the distance from x_i to its kth nearest neighbor.
Adaptive mean-shift vs. non-adaptive
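The adaptive bandwidth rule can be sketched directly, taking each point's bandwidth as the distance to its kth nearest neighbor (1-D, brute force, purely illustrative):

```python
def adaptive_bandwidths(points, k):
    """Per-point bandwidth = distance to the k-th nearest neighbor (brute force).

    Dense regions get small bandwidths, sparse regions large ones.
    """
    bandwidths = []
    for p in points:
        dists = sorted(abs(p - q) for q in points if q is not p)
        bandwidths.append(dists[k - 1])
    return bandwidths

# 1-D toy data: a tight cluster and one outlier.
pts = [0.0, 0.1, 0.2, 5.0]
bw = adaptive_bandwidths(pts, k=2)
```

The outlier gets a far larger bandwidth than the clustered points, exactly the behavior the slide describes. In high dimensions this kth-NN query is the expensive step that LSH is brought in to approximate.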
Image segmentation algorithm:
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths h_s (spatial) and h_r (color)
3. Apply filtering
(figure: 3D feature space; original, filtered, and segmented images)
Mean-shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Filtering: pixel value of the nearest mode
Mean-shift trajectories
Filtering examples (original squirrel → filtered; original baboon → filtered)
Segmentation examples
Mean-shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries → implemented with LSH
• Statistical curse of dimensionality: sparseness of the data → variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (d_k, v_k).
• For each point x, check whether x_{d_k} ≤ v_k for each of the K pairs, giving a K-bit signature.
• This partitions the data into cells.
Choosing the optimal K and L
• For a query q, we want to compute the smallest number of distances to points in its buckets.
• Large K → smaller number of points in a cell C.
• If L is too small, points might be missed; but if L is too big, extra points might be included.
• The query is compared against the union of its L cells, C∪ = C_1 ∪ … ∪ C_L.
• As L increases, C∪ increases (more candidate points) but the chance of missing true neighbors decreases; K determines the resolution of the data structure.
Choosing optimal K and L
• Determine accurately the KNN distance (bandwidth) for m randomly-selected data points.
• Choose an error threshold ε on the approximate distance.
• The optimal K and L should satisfy the error constraint:
  – For each K, estimate the error; in one run over all L's, find the minimal L satisfying the constraint, L(K).
  – Minimize the running time t(K, L(K)).
(figure: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)] with its minimum marking the chosen pair)
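The tuning loop above can be sketched as follows. The `error` and `query_time` functions here are toy stand-ins for the quantities the paper measures empirically on a random sample; only the structure of the search (minimal L per K, then minimal time over K) follows the slide:

```python
def error(K, L):
    # Toy error model: more tables (L) reduce misses, larger K increases them.
    return (1.0 - 0.5 ** K) ** L

def query_time(K, L):
    # Toy cost model: hashing cost grows with K*L, candidate-scan cost shrinks with K.
    return K * L + 100.0 / (2 ** K)

def tune(eps, K_range, L_max):
    """For each K find the smallest L meeting the error budget, then pick the
    (K, L(K)) pair with the lowest modeled query time."""
    best = None
    for K in K_range:
        L = next((L for L in range(1, L_max + 1) if error(K, L) <= eps), None)
        if L is None:
            continue
        t = query_time(K, L)
        if best is None or t < best[2]:
            best = (K, L, t)
    return best

K_opt, L_opt, t_opt = tune(eps=0.05, K_range=range(1, 8), L_max=200)
```

With real measurements substituted for the two models, this is the one-pass selection the paper describes.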
Data-driven partitions
• In the original LSH, cut values are chosen at random in the range of the data.
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value.
(figure: bucket-occupancy distribution for uniform vs. data-driven cut points)
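A quick way to see the benefit (illustrative, not the paper's experiment): on skewed data, cut values drawn from the data points themselves split buckets more evenly than cuts drawn uniformly from the data's range:

```python
import random

random.seed(1)

# Skewed 1-D data: most of the mass near 0, with a long tail.
data = [random.expovariate(1.0) for _ in range(1000)]

def uniform_cut(data):
    """Original LSH: cut value drawn uniformly from the data's range."""
    return random.uniform(min(data), max(data))

def data_driven_cut(data):
    """Suggested variant: cut value taken from a random data point's coordinate."""
    return random.choice(data)

def split_balance(data, cut):
    """How evenly a cut splits the data: 0.5 = perfectly balanced."""
    left = sum(x < cut for x in data)
    return min(left, len(data) - left) / len(data)
```

A uniform cut usually lands in the sparse tail and isolates a few points, while a data-driven cut lands where the data actually is.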
Additional speedup
• Assume that all points in a cell C will converge to the same mode (C acts like a type of aggregate).
Speedup results: 65,536 points, 1,638 points sampled, k = 100.
Food for thought: low dimension vs. high dimension
• Choose K, L by sample learning, or take the traditional values.
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, dimensionality implies the number of hash functions needed.
• The catch: efficient dimensionality learning requires KNN.
Summary
• LSH suggests a compromise on accuracy for a gain in complexity.
• Applications that involve massive data in high dimension require the fast performance of LSH.
• Extension of LSH to different spaces (PSH).
• Learning the LSH parameters and hash functions for different applications.
Conclusion
• But at the end, everything depends on your data set.
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni (andoni@mit.edu)
  – Test over your own data
  (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
L1 to Hamming Space Embedding
Each coordinate of p is encoded in unary: a value v becomes v ones followed by C − v zeros (C = 11 here).
Example: p = (8, 2) → 11111111000 11000000000
The embedded dimension is d′ = C·d, and L1 distances become Hamming distances.
Hash function
For p ∈ H^d′, define G_j(p) = p|I_j, j = 1…L: bit sampling from p (here k = 3 digits).
Store p into bucket p|I_j, one of 2^k buckets, in each of the L tables.
Construction: insert p into tables 1, 2, …, L.
Query: probe q's bucket in tables 1, 2, …, L.
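The construction and query steps can be sketched as a minimal bit-sampling LSH over Hamming space (the class and parameter names are mine, purely illustrative):

```python
import random

random.seed(0)

class BitSamplingLSH:
    """Bit-sampling LSH for Hamming space: L tables, each keyed by k sampled bits."""

    def __init__(self, dim, k, L):
        # I_j: a random set of k bit positions for each of the L tables.
        self.index_sets = [random.sample(range(dim), k) for _ in range(L)]
        self.tables = [{} for _ in range(L)]

    def _key(self, p, j):
        # G_j(p) = p restricted to the sampled positions I_j.
        return tuple(p[i] for i in self.index_sets[j])

    def insert(self, p):
        for j, table in enumerate(self.tables):
            table.setdefault(self._key(p, j), []).append(p)

    def candidates(self, q):
        # Union of the query's buckets across all L tables.
        out = []
        for j, table in enumerate(self.tables):
            out.extend(table.get(self._key(q, j), []))
        return out

lsh = BitSamplingLSH(dim=22, k=3, L=5)
p = [1] * 8 + [0] * 3 + [1] * 2 + [0] * 9   # a 22-bit unary-embedded point
lsh.insert(p)
```

A query identical to `p` collides in every table; a query differing in some bits collides only in tables whose sampled positions avoid the differing bits.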
Alternative intuition: random projections
Sampling a bit from the unary embedding of a coordinate is equivalent to thresholding that coordinate at a random cut, so each of the k sampled bits acts like a random axis-parallel projection.
Together the k bits map each point into one of 2^k buckets (000, 100, 110, 001, 101, 111, …).
k samplings, repeated L times.
Secondary hashing: the 2^k buckets are mapped by a simple hash into M buckets of size B, supporting volume tuning (dataset size vs. storage volume), with M·B = α·n, α = 2.
The above hashing is locality-sensitive:
Probability(p, q in the same bucket) = (1 − Distance(p, q)/d′)^k
where d′ is the number of dimensions. Larger k makes the collision probability fall off faster with distance (compare the curves for k = 1 vs. k = 2).
Adopted from Piotr Indyk's slides.
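Evaluating the collision probability numerically shows how k sharpens the contrast between near and far points (the distances and d′ = 22 are illustrative values):

```python
def collision_prob(dist, d_prime, k):
    """Pr[p and q share a bucket] for bit-sampling LSH: (1 - dist/d')^k."""
    return (1.0 - dist / d_prime) ** k

# Larger k widens the gap between near and far points.
near_k1 = collision_prob(2, 22, k=1)   # ~0.909
far_k1 = collision_prob(11, 22, k=1)   # 0.5
near_k2 = collision_prob(2, 22, k=2)   # ~0.826
far_k2 = collision_prob(11, 22, k=2)   # 0.25
```

This is why k trades selectivity against recall: raising k rejects more far points but also loses some near ones, which the L repeated tables then recover.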
Preview
• General solution – locality sensitive hashing
• Implementation for Hamming space
• Generalization to l2
Direct L2 solution
• New hashing function
• Still based on sampling
• Using a mathematical trick
• p-stable distribution for the Lp distance; Gaussian distribution for the L2 distance
Central limit theorem
Let v1, …, vn be real numbers and X1, …, Xn be independent, identically distributed (i.i.d.) random variables. Consider the weighted sum
v1·X1 + v2·X2 + … + vn·Xn = Σ_i v_i X_i
(a weighted sum of Gaussians is again a Gaussian).
For Gaussian X_i: Σ_i v_i X_i ~ ||v||_2 · X, so the dot product encodes the norm.
For two feature vectors u and v:
Σ_i (u_i − v_i) X_i = Σ_i u_i X_i − Σ_i v_i X_i ~ ||u − v||_2 · X
so the difference of dot products encodes the L2 distance.
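A quick numerical check of this fact (illustrative): projections of u − v onto random Gaussian vectors are distributed with standard deviation ≈ ||u − v||_2:

```python
import math
import random

random.seed(0)

u = [3.0, 4.0, 0.0]
v = [0.0, 0.0, 0.0]
l2 = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))  # 5.0

# Project u - v onto many Gaussian vectors; the projections are ~ N(0, l2^2).
projections = []
for _ in range(20000):
    x = [random.gauss(0, 1) for _ in u]
    projections.append(sum((a - b) * xi for a, b, xi in zip(u, v, x)))

std = math.sqrt(sum(p * p for p in projections) / len(projections))
# std should come out close to l2 = 5.0
```

This 2-stability of the Gaussian is exactly what the L2 hash function below relies on.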
The full hashing
h_{a,b}(v) = ⌊(a·v + b) / w⌋
• v: the features vector (d components, e.g., [34, 82, 21, …])
• a: d random numbers, i.i.d. from a p-stable distribution
• b: random phase in [0, w]
• w: discretization step
Reconstructing the slide's example: with b = 34 and w = 100, a·v + b = 7944 falls in the bucket [7900, 8000), so h(v) = 79.
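The full hash can be sketched as follows: a minimal 2-stable (Gaussian) version for L2, with illustrative parameter values:

```python
import math
import random

random.seed(0)

def make_hash(dim, w):
    """One p-stable LSH function for L2: h(v) = floor((a . v + b) / w),
    with a ~ N(0, 1)^dim (the 2-stable case) and b uniform in [0, w)."""
    a = [random.gauss(0, 1) for _ in range(dim)]
    b = random.uniform(0, w)
    return lambda v: math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)

h = make_hash(dim=3, w=4.0)
v1 = [1.0, 2.0, 3.0]
v2 = [1.0, 2.0, 3.1]   # a near neighbor of v1
```

Because the projection a·v preserves L2 distances in distribution, near neighbors land in the same width-w bin far more often than distant points do.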
Generalization: p-stable distributions
• L2: Central Limit Theorem → Gaussian (normal) distribution (the 2-stable case)
• Lp, 0 < p ≤ 2: Generalized Central Limit Theorem → p-stable distribution (the Cauchy distribution is the 1-stable case, for L1)
P-stable summary
• Works for the r-nearest neighbor problem
• Generalizes to 0 < p ≤ 2
• Improves query time: O(d·n^(1/(1+ε)) log n) → O(d·n^(1/(1+ε)²) log n)
Latest results, reported by email by Alexander Andoni.
Parameter selection (for Euclidean space)
• 90% success probability ↔ best query time performance
• A single projection hits an r-nearest neighbor with Pr = p1.
• k projections hit an r-nearest neighbor with Pr = p1^k.
• L hashings fail to collide with Pr = (1 − p1^k)^L.
• To ensure a collision with probability at least 1 − δ (e.g., 90%):
  1 − (1 − p1^k)^L ≥ 1 − δ  ⇒  L ≥ log(δ) / log(1 − p1^k)
• Larger k rejects more non-neighbors but requires a larger L to accept neighbors; the total query time trades candidate extraction against candidate verification.
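The bound on L can be computed directly (the values of p1, k, and δ below are illustrative):

```python
import math

def tables_needed(p1, k, delta):
    """Smallest L with 1 - (1 - p1**k)**L >= 1 - delta,
    i.e. L = ceil(log(delta) / log(1 - p1**k))."""
    return math.ceil(math.log(delta) / math.log(1.0 - p1 ** k))

# e.g. per-projection collision prob 0.9, 10-bit keys, 90% target success
L = tables_needed(p1=0.9, k=10, delta=0.1)
```

Raising k shrinks p1^k, so L (and hence storage) grows; this is the extraction-vs-verification trade-off named on the slide.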
Pros & Cons
+ Better query time than spatial data structures
+ Scales well to higher dimensions and larger data sizes (sub-linear dependence)
+ Predictable running time
− Extra storage overhead
− Inefficient for data with distances concentrated around the average
− Works best for Hamming distance (although it can be generalized to Euclidean space)
− In secondary storage, a linear scan is pretty much all we can do (for high dimensions)
− Requires the radius r to be fixed in advance
From Piotr Indyk's slides.
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given: n spheres in R^d, centered at P = {p_1, ..., p_n}, with radii r_1, ..., r_n
• Goal: given a query q, preprocess the points in P to find a point p_i whose sphere 'covers' the query, i.e. ||q - p_i|| ≤ r_i
Courtesy of Mohamad Hegaze
Motivation
• Clustering high-dimensional data by using local density measurements (e.g., in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions: using LSH
• Speedups:
  1. Finding optimal LSH parameters
  2. Data-driven partitions into buckets
  3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
[Figure: a window of radius equal to the bandwidth around a point]
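The slide's illustration is lost in extraction; as background, the standard mean-shift update that the window performs (a textbook fact about the algorithm, not recovered from the slide) is:

```latex
% one mean-shift step with bandwidth h: move x to the weighted mean of the
% points in its window, where g is the derivative of the kernel profile
m(x) \;=\; \frac{\sum_i x_i \, g\!\left(\left\lVert \tfrac{x - x_i}{h} \right\rVert^2\right)}
                {\sum_i g\!\left(\left\lVert \tfrac{x - x_i}{h} \right\rVert^2\right)} \;-\; x,
\qquad x \leftarrow x + m(x)
```

Iterating this step moves each point uphill in density until it converges to a mode.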
KNN in mean-shift
• The bandwidth should be inversely proportional to the density in the region: high density - small bandwidth; low density - large bandwidth
• Base it on the k-th nearest neighbor of the point: the bandwidth is h_i = ||x_i - x_{i,k}||, the distance from x_i to its k-th nearest neighbor
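The k-th-neighbor bandwidth rule can be sketched as follows (brute-force kNN with a Euclidean norm assumed; the paper itself uses an L1 norm):

```python
import numpy as np

def adaptive_bandwidths(points, k):
    """h_i = distance from x_i to its k-th nearest neighbor:
    small in dense regions, large in sparse ones."""
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)        # pairwise distance matrix
    dists_sorted = np.sort(dists, axis=1)        # column 0 is the self-distance 0
    return dists_sorted[:, k]                    # k-th neighbor, excluding self

pts = np.array([[0.0], [0.1], [0.2], [5.0]])
h = adaptive_bandwidths(pts, k=2)
# the dense cluster gets small bandwidths, the outlier a large one
```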
Adaptive mean-shift vs. non-adaptive
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths h_s (spatial) and h_r (color)
3. Apply filtering
Mean Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Image segmentation algorithm
original segmented
filtered
Filtering: pixel value of the nearest mode
Mean-shift trajectories
original squirrel filtered
original baboon filtered
Filtering examples
Mean Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Segmentation examples
Mean Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries, implemented with LSH
• Statistical curse of dimensionality: sparseness of the data, handled with variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K (coordinate, cut-value) pairs (d_k, v_k)
• For each point x_i we check whether x_i[d_k] ≤ v_k for each of the K pairs
• The K boolean answers partition the data into cells
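A minimal sketch of the cell structure just described, assuming data scaled to [0,1] (names and the cut-value range are illustrative):

```python
import random
from collections import defaultdict

def build_partitions(points, K, L, seed=0):
    """Build L random partitions; each is K (coordinate, cut-value) pairs,
    and a point's K boolean answers to x[d_k] <= v_k form its cell key."""
    rng = random.Random(seed)
    dim = len(points[0])
    tables = []
    for _ in range(L):
        cuts = [(rng.randrange(dim), rng.uniform(0.0, 1.0)) for _ in range(K)]
        cells = defaultdict(list)
        for i, p in enumerate(points):
            key = tuple(p[d] <= v for d, v in cuts)
            cells[key].append(i)
        tables.append((cuts, cells))
    return tables

def query(tables, q):
    """Union, over the L partitions, of the cell q falls into -> candidates."""
    cands = set()
    for cuts, cells in tables:
        key = tuple(q[d] <= v for d, v in cuts)
        cands.update(cells.get(key, []))
    return cands

pts = [[0.1, 0.2], [0.15, 0.25], [0.9, 0.8]]
tables = build_partitions(pts, K=4, L=8)
print(sorted(query(tables, [0.12, 0.22])))  # candidate indices near the query
```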
Choosing the optimal K and L
• For a query q, compute the smallest number of distances to points in its buckets
• Large K → smaller number of points in a cell C
• If L is too small, points might be missed; but if L is too big, extra points might be included
• As L increases, the candidate set grows, but the error decreases
• K and L determine the resolution of the data structure
Choosing optimal K and L
• Determine accurately the KNN, and the distance (bandwidth) to them, for m randomly selected data points
• Choose an error threshold ε
• The optimal K and L should keep the approximate distance within the threshold of the true distance
Choosing optimal K and L
• For each K, estimate the error
• In one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K))
[Figures: approximation error for (K, L); L(K) for ε = 0.05; running time t[K, L(K)], with the minimum marked]
Data-driven partitions
• In the original LSH, cut values are random in the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
[Figure: points-per-bucket distribution, uniform vs. data-driven]
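The data-driven variant changes only how a cut value is drawn; a small sketch of the two choices (function names hypothetical):

```python
import random

def uniform_cut(lo, hi, rng):
    # original LSH: cut value drawn uniformly over the data range
    return rng.uniform(lo, hi)

def data_driven_cut(values, rng):
    # suggested variant: a coordinate of a randomly selected data point,
    # so cuts land where the data actually lie and buckets stay balanced
    return rng.choice(values)

rng = random.Random(1)
values = [0.01, 0.02, 0.03, 0.95]      # a skewed coordinate
cut = data_driven_cut(values, rng)
print(cut in values)  # True by construction
```

On skewed data a uniform cut usually falls in the empty gap, leaving most points on one side; a data-driven cut splits the dense region.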
Additional speedup
Assume that all points in a cell C will converge to the same mode (C acts like a type of aggregate).
Speedup results
65,536 points; 1,638 points sampled; k = 100
Food for thought
[Figure: low dimension vs. high dimension]
A thought for food...
• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning itself requires KNN
Summary
• LSH trades accuracy for a gain in complexity
• Applications that involve massive data in high dimension require LSH's fast performance
• LSH extends to different spaces (PSH)
• The LSH parameters and hash functions can be learned for different applications
Conclusion
• ...but at the end, everything depends on your data set
• Try it at home:
  - Visit http://web.mit.edu/andoni/www/LSH/index.html
  - Email Alex Andoni: andoni@mit.edu
  - Test it on your own data
  (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Query
1 2 L
q
Alternative intuition random projections
p
8
C=11
1111111100011000000000
2
1111111100011000000000
drsquo=Cd
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
• Large K → smaller number of points in a cell.
• If L is too small, points might be missed; but if L is too big, extra points might be included.
The intersection of a point's cells determines the resolution of the data structure: as L increases, the union C∪ increases but the intersection C∩ decreases.
Choosing optimal K and L
• Determine accurately the KNN distance (bandwidth) for m randomly-selected data points.
• Choose an error threshold ε.
• The optimal K and L should keep the approximate distance within the threshold of the true KNN distance.
Choosing optimal K and L
• For each K, estimate the error of the approximate KNN.
• In one run, for all L's, find the minimal L satisfying the constraint: L(K).
• Minimize the running time t(K, L(K)).
(Plots: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)] with its minimum marked.)
Data-driven partitions
• In the original LSH, cut values are random in the range of the data.
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value.
(Figure: bucket point distribution, uniform vs data-driven cuts.)
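The data-driven cut suggestion, as a sketch:

```python
import random

def data_driven_cut(data):
    """Pick a random data point and one of its coordinates as the cut value,
    so cuts land where the data actually is (more balanced buckets)."""
    point = random.choice(data)
    coord = random.randrange(len(point))
    return coord, point[coord]

random.seed(1)
data = [[0.0, 10.0], [0.1, 11.0], [0.2, 12.0]]
d, v = data_driven_cut(data)
assert v in [row[d] for row in data]
```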
Additional speedup
Assume that all points in C will converge to the same mode (C is like a type of aggregate).
Speedup results
65,536 points; 1,638 points sampled; k = 100.
Food for thought
(Figures: low dimension vs high dimension.)
A thought for food…
• Choose K, L by sample learning, or take the traditional values?
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold? Intuitively, the dimensionality implies the number of hash functions needed.
• The catch: efficient dimensionality learning requires KNN.
15:30 cookies…
Summary
• LSH trades some accuracy for a large gain in complexity.
• Applications that involve massive data in high dimension require the fast performance of LSH.
• Extension of LSH to different spaces (PSH).
• Learning the LSH parameters and hash functions for different applications.
Conclusion
• But at the end, everything depends on your data set.
• Try it at home:
– Visit http://web.mit.edu/andoni/www/LSH/index.html
– Email Alex Andoni (andoni@mit.edu)
– Test over your own data (C code, under Red Hat Linux).
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Alternative intuition: random projections
A point p = (8, 2), with each coordinate bounded by C = 11, is embedded into a binary string of length d′ = C·d by unary-coding each coordinate:
8 → 11111111000, 2 → 11000000000 (concatenated: 1111111100011000000000).
Alternative intuition: random projections
Sampling k of the d′ bits maps each embedded string (e.g. 11000000000…, 111111110001…) to a k-bit key; with k = 3 the keys 000, 100, 110, 001, 101, 111, … index 2³ = 8 buckets.
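The unary embedding behind this intuition (assuming non-negative integer coordinates bounded by C) turns L1 distance into Hamming distance:

```python
def unary_embed(point, C):
    """Embed integer coordinates (each <= C) into a C*d bit string:
    coordinate value c becomes c ones followed by C - c zeros."""
    bits = []
    for c in point:
        bits += [1] * c + [0] * (C - c)
    return bits

# Hamming distance between embeddings equals the L1 distance between points.
p, q, C = [8, 2], [7, 4], 11
ep, eq = unary_embed(p, C), unary_embed(q, C)
hamming = sum(a != b for a, b in zip(ep, eq))
assert hamming == abs(8 - 7) + abs(2 - 4)  # == 3
```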
k samplings, repeating L times.
Secondary hashing: the 2^k buckets are mapped by simple hashing into M buckets of size B, which supports volume tuning (dataset size vs storage volume), with M·B = α·n, α = 2.
The above hashing is locality-sensitive:
Pr[p, q in the same bucket] = (1 − Distance(p, q) / dimensions)^k
(Plots: Pr vs Distance(q, p_i) for k = 1 and k = 2. Adopted from Piotr Indyk's slides.)
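A minimal sketch of this collision probability, assuming the (1 − distance/d)^k form for k sampled bits of a d-bit embedding:

```python
def collision_prob(distance, d, k):
    """Pr[p, q share a bucket] = (1 - distance/d)**k when k bits are
    sampled from a d-bit Hamming embedding."""
    return (1.0 - distance / d) ** k

d = 22
# Probability decays with distance, and a larger k sharpens the decay.
assert collision_prob(2, d, 1) > collision_prob(10, d, 1)
assert collision_prob(10, d, 2) < collision_prob(10, d, 1)
```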
Preview
• General solution – locality sensitive hashing
• Implementation for Hamming space
• Generalization to l2
Direct L2 solution
• New hashing function
• Still based on sampling
• Using a mathematical trick: p-stable distribution for Lp distance, Gaussian distribution for L2 distance
Central limit theorem
v1,…,vn = real numbers; X1,…,Xn = independent, identically distributed (i.i.d.) random variables:
v1·X1 + v2·X2 + … + vn·Xn = (weighted Gaussians) = weighted Gaussian.
Central limit theorem
Σᵢ vᵢ·Xᵢ ∼ ‖v‖₂ · X    (dot product → norm)
For two feature vectors u, v:
Σᵢ uᵢ·Xᵢ − Σᵢ vᵢ·Xᵢ = Σᵢ (uᵢ − vᵢ)·Xᵢ ∼ ‖u − v‖₂ · X    (dot-product distance → norm distance)
The full hashing
h_{a,b}(v) = ⌊(a·v + b) / w⌋
• v – features vector, e.g. [34, 82, 21, …]
• a – d random numbers, i.i.d. from a p-stable distribution
• b – random phase in [0, w]
• w – discretization step
Example: a·v + b = 7944; with discretization step w = 100 the bins are …, 7800, 7900, 8000, 8100, 8200, …, so v falls into the bin starting at 7900.
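A sketch of h_{a,b} (Gaussian entries for the 2-stable L2 case; the parameter values below are illustrative):

```python
import math
import random

def make_hash(d, w, rng):
    """h_{a,b}(v) = floor((a . v + b) / w), with a ~ Gaussian (2-stable)
    and b ~ Uniform[0, w]."""
    a = [rng.gauss(0.0, 1.0) for _ in range(d)]
    b = rng.uniform(0.0, w)
    return lambda v: math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)

rng = random.Random(0)
h = make_hash(d=3, w=4.0, rng=rng)
v = [34.0, 82.0, 21.0]
assert isinstance(h(v), int)  # the bucket index is a single integer
```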
Generalization: p-stable distributions
• L2: Central Limit Theorem → Gaussian (normal) distribution.
• Lp, 0 < p ≤ 2: Generalized Central Limit Theorem → p-stable distribution (e.g. Cauchy for L1).
P-stable summary
• Works for the r-nearest neighbor problem; generalizes to 0 < p ≤ 2.
• Improves query time: O(d·n^(1/(1+ε))·log n) → O(d·n^(1/(1+ε)²)·log n) (latest results, reported by email by Alexander Andoni).
Parameters selection (for Euclidean space)
• 90% probability → best query-time performance.
• A single projection hits an ε-nearest neighbor with Pr = p1.
• k projections hit an ε-nearest neighbor with Pr = p1^k.
• L hashings fail to collide with Pr = (1 − p1^k)^L.
• To ensure a collision (e.g. 1 − δ ≥ 90%): 1 − (1 − p1^k)^L ≥ 1 − δ, i.e. L ≥ log δ / log(1 − p1^k).
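The bound on L can be computed directly; a sketch assuming collision probability p1 per projection, k projections per table, and failure probability δ:

```python
import math

def tables_needed(p1, k, delta):
    """Smallest L with 1 - (1 - p1**k)**L >= 1 - delta,
    i.e. L = ceil(log(delta) / log(1 - p1**k))."""
    return math.ceil(math.log(delta) / math.log(1.0 - p1 ** k))

L = tables_needed(p1=0.9, k=10, delta=0.1)
# L tables succeed with >= 90% probability, L - 1 tables do not.
assert 1.0 - (1.0 - 0.9 ** 10) ** L >= 0.9
assert 1.0 - (1.0 - 0.9 ** 10) ** (L - 1) < 0.9
```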
Accept neighbors, reject non-neighbors.
…Parameters selection
(Plot: running time vs k, split into candidates-extraction and candidates-verification time.)
Pros & Cons (from Piotr Indyk's slides)
Pros:
• Better query time than spatial data structures.
• Scales well to higher dimensions and larger data size (sub-linear dependence).
• Predictable running time.
Cons:
• Extra storage overhead.
• Inefficient for data with distances concentrated around the average.
• Works best for Hamming distance (although it can be generalized to Euclidean space).
• In secondary storage, a linear scan is pretty much all we can do (for high dimension).
• Requires the radius r to be fixed in advance.
LSH - Applications
• Searching video clips in databases ("Hierarchical Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun).
• Searching image databases (see the following).
• Image segmentation (see the following).
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani).
• Texture classification (see the following).
• Clustering (see the following).
• Embedding and manifold learning (LLE and many others).
• Compression – vector quantization.
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan).
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler).
• In short, whenever K-Nearest Neighbors (KNN) are needed.
Motivation
• A variety of procedures in learning require KNN computation.
• KNN search is a computational bottleneck.
• LSH provides a fast approximate solution to the problem.
• LSH requires hash-function construction and parameter tuning.
Outline
Fast Pose Estimation with Parameter Sensitive Hashing, G. Shakhnarovich, P. Viola, and T. Darrell
• Finding sensitive hash functions
Mean Shift Based Clustering in High Dimensions: A Texture Classification Example, B. Georgescu, I. Shimshoni, and P. Meer
• Tuning LSH parameters
• The LSH data structure is used for algorithm speedups
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing, G. Shakhnarovich, P. Viola, and T. Darrell
Given an image x, what are the parameters θ in this image, i.e. the angles of joints, the orientation of the body, etc.?
Ingredients
• Input: query image with unknown angles (parameters).
• Database of human poses with known angles.
• Image feature extractor – edge detector.
• Distance metric in feature space: d_x.
• Distance metric in angles space: d_θ(θ1, θ2) = Σᵢ₌₁ᵐ (1 − cos(θ1,i − θ2,i)).
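The angle-space metric (assuming the per-joint form 1 − cos(Δθ), summed over m joints) is tiny to implement:

```python
import math

def d_theta(t1, t2):
    """Angle-space distance: sum over joints of 1 - cos(theta1_i - theta2_i).
    Zero for identical poses, maximal for joints that are pi apart."""
    return sum(1.0 - math.cos(a - b) for a, b in zip(t1, t2))

assert d_theta([0.0, 0.0], [0.0, 0.0]) == 0.0
assert abs(d_theta([0.0], [math.pi]) - 2.0) < 1e-9
```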
Example-based learning
• Construct a database of example images with their known angles.
• Given a query image, run your favorite feature extractor.
• Compute the KNN from the database.
• Use these KNNs to compute the average angles of the query.
Input: query → find KNN in the database of examples → output: average angles of the KNN.
The algorithm flow: input query → features extraction → processed query → PSH (LSH) against the database of examples → LWR (regression) → output match.
The image features
Image features are multi-scale edge histograms, computed over sub-windows (A, B) of the image.
Feature Extraction → PSH → LWR
PSH: the basic assumption
There are two metric spaces here: the feature space (d_x) and the parameter space (d_θ). We want similarity to be measured in the angles space, whereas LSH works in the feature space.
• Assumption: the feature space is closely related to the parameter space.
Insight: manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space.
• But the global structure may be complicated: curved.
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
(Figure: a query q mapped between the parameters space (angles) and the feature space.) Is this magic?
Parameter Sensitive Hashing (PSH)
The trick: estimate the performance of different hash functions on examples, and select those sensitive to d_θ. The hash functions are applied in the feature space, but the KNN are valid in the angle space.
• Label pairs of examples with similar angles.
• Define hash functions h on the feature space.
• Predict the labeling of similar/non-similar examples by using h.
• Compare the labelings.
• If the labeling by h is good, accept h; else change h.
PSH as a classification problem
Labels (r = 0.25): a pair of examples (x_i, θ_i), (x_j, θ_j) is labeled
y_ij = +1 if d_θ(θ_i, θ_j) ≤ r
y_ij = −1 if d_θ(θ_i, θ_j) ≥ (1 + ε)·r
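The pair-labeling rule can be sketched as follows (treating the gap between r and (1+ε)·r as unlabeled is an assumption consistent with the two thresholds):

```python
def pair_label(d_theta, r, eps):
    """+1 for pairs closer than r in angle space, -1 for pairs farther
    than (1 + eps)*r; pairs in between are left out of training (None)."""
    if d_theta <= r:
        return +1
    if d_theta >= (1.0 + eps) * r:
        return -1
    return None

r, eps = 0.25, 0.5
assert pair_label(0.1, r, eps) == +1
assert pair_label(0.5, r, eps) == -1
assert pair_label(0.3, r, eps) is None
```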
A binary hash function on the features:
h_T(x) = +1 if the feature value of x exceeds the threshold T, −1 otherwise.
Predict the labels:
ŷ_h(x_i, x_j) = +1 if h_T(x_i) = h_T(x_j), −1 otherwise.
Find the best T that predicts the true labeling with the probability constraints: h_T will place both examples in the same bin, or separate them.
Local Weighted Regression (LWR)
• Given a query image x₀, PSH returns KNNs.
• LWR uses the KNN to compute a weighted average of the estimated angles of the query, with distance-based weights:
θ̂₀ = argmin_θ Σ_{x_i ∈ N(x₀)} d_θ(θ, θ_i) · K(d_x(x_i, x₀))
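A simplified sketch of the idea: here the minimization is replaced by a kernel-weighted mean of the neighbors' angles (an assumption; the paper solves a weighted regression):

```python
import math

def lwr_estimate(neighbor_angles, query_dists, bandwidth):
    """Kernel-weighted average of the KNN angles: neighbors that are
    closer to the query in feature space get larger weights."""
    weights = [math.exp(-(d / bandwidth) ** 2) for d in query_dists]
    total = sum(weights)
    m = len(neighbor_angles[0])
    return [sum(w * th[i] for w, th in zip(weights, neighbor_angles)) / total
            for i in range(m)]

angles = [[10.0, 20.0], [30.0, 40.0]]
est = lwr_estimate(angles, query_dists=[0.1, 5.0], bandwidth=1.0)
# The near neighbor dominates the estimate.
assert abs(est[0] - 10.0) < 1.0
```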
Results
Synthetic data were generated:
• 13 angles: 1 for rotation of the torso, 12 for joints.
• 150,000 images.
• Nuisance parameters added: clothing, illumination, face expression.
• 1,775,000 example pairs.
• Selected 137 out of 5,123 meaningful features (how?).
• 18-bit hash functions (k), 150 hash tables (l).
• Test on 1,000 synthetic examples: PSH searched only 3.4% of the data per query.
• Without selection, 40 bits and 1,000 hash tables were needed.
Recall: P1 is the probability of a positive hash, P2 the probability of a bad hash, B the max number of points in a bucket.
Results – real data
• 800 images.
• Processed by a segmentation algorithm.
• 1.3% of the data were searched.
Interesting mismatches (figure).
Fast pose estimation – summary
• A fast way to compute the angles of a human body figure.
• Moving from one representation space to another.
• Training a sensitive hash function.
• KNN smart averaging.
Food for Thought
• The basic assumption may be problematic (distance metric, representations).
• The training set should be dense.
• Texture and clutter.
• In general, some features are more important than others and should be weighted.
Food for Thought: Point Location in Different Spheres (PLDS)
• Given n spheres in R^d, centered at P = {p1,…,pn}, with radii r1,…,rn.
• Goal: given a query q, preprocess the points in P to find a point p_i whose sphere 'covers' the query q.
(Courtesy of Mohamad Hegaze.)
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example, B. Georgescu, I. Shimshoni, and P. Meer
Motivation
• Clustering high-dimensional data by using local density measurements (e.g. in feature space).
• Statistical curse of dimensionality: sparseness of the data.
• Computational curse of dimensionality: expensive range queries.
• LSH parameters should be adjusted for optimal performance.
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
1. Finding optimal LSH parameters
2. Data-driven partitions into buckets
3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell (bandwidth)
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Alternative intuition random projections
8
C=11
1111111100011000000000
2
1111111100011000000000
p
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
A weighted sum of Gaussians is again a Gaussian:
v₁·X₁ + v₂·X₂ + … + vₙ·Xₙ
• v₁, …, vₙ = real numbers
• X₁, …, Xₙ = independent, identically distributed (i.i.d.) Gaussians
Central limit theorem
Dot product → norm:
 Σᵢ vᵢXᵢ ≈ ‖v‖₂ · X
Norm → distance (for features vectors u, v):
 Σᵢ uᵢXᵢ − Σᵢ vᵢXᵢ = Σᵢ (uᵢ − vᵢ)Xᵢ ≈ ‖u − v‖₂ · X
i.e. the difference of the two dot products is distributed as the L2 distance between the features vectors, times a Gaussian.
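The 2-stable property above — a·u − a·v behaves like ‖u − v‖₂ times a standard Gaussian — can be checked numerically. A stdlib-only sketch (the vectors u, v are illustrative assumptions):

```python
import math
import random

def projected_diff_std(u, v, trials=50000, seed=1):
    """Empirical std of a.(u - v) over random Gaussian vectors a.
    By 2-stability it should equal ||u - v||_2."""
    rng = random.Random(seed)
    d = len(u)
    diffs = []
    for _ in range(trials):
        a = [rng.gauss(0, 1) for _ in range(d)]  # i.i.d. N(0,1) entries
        diffs.append(sum(ai * (ui - vi) for ai, ui, vi in zip(a, u, v)))
    mean = sum(diffs) / trials
    return math.sqrt(sum((x - mean) ** 2 for x in diffs) / trials)

u = [3.0, 4.0, 0.0]
v = [0.0, 0.0, 0.0]
true_dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))  # 5.0
print(round(projected_diff_std(u, v), 2), true_dist)
```

The empirical spread of the projected differences matches the true L2 distance of 5.0.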
The full hashing
h_{a,b}(v) = ⌊(a·v + b) / w⌋
• v = features vector of dimension d (e.g. [34, 82, 21, …])
• a = d random numbers, a₁ … a_d i.i.d. from a p-stable distribution
• b = random phase in [0, w]
• w = discretization step
Example: with w = 100 and b = +34, a features vector with a·v + b = 7944 falls into the bin [7900, 8000), i.e. bucket 79.
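A minimal sketch of h_{a,b}(v) = ⌊(a·v + b)/w⌋ for the L2 case (stdlib only; the vectors, w, and seed are illustrative assumptions):

```python
import math
import random

def make_hash(d, w, seed=0):
    """Build one p-stable hash h_{a,b}(v) = floor((a.v + b) / w)."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(d)]  # i.i.d. N(0,1): 2-stable, for L2
    b = rng.uniform(0, w)                    # random phase in [0, w]
    def h(v):
        return math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
    return h

h = make_hash(d=3, w=4.0)
v1 = [34.0, 82.0, 21.0]
v2 = [34.1, 82.0, 21.2]  # close in L2 -> very likely the same bucket
v3 = [0.0, 0.0, 0.0]
print(h(v1), h(v2), h(v3))
```

In a real index, k such functions are concatenated per table and L tables are built, exactly as in the Hamming construction.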
Generalization: p-stable distributions
• Lp, 0 < p ≤ 2: Generalized Central Limit Theorem → p-stable distribution (e.g. Cauchy for L1)
• L2: Central Limit Theorem → Gaussian (normal) distribution
p-stable summary
• Generalizes to 0 < p ≤ 2
• Improves query time: O(d·n^(1/(1+ε))·log n) → O(d·n^(1/(1+ε)²)·log n)
r - Nearest Neighbor: latest results
(reported in email by Alexander Andoni)
Parameters selection
• 90% probability ⇒ best query-time performance
For Euclidean space
Parameters selection…
For Euclidean space:
• A single projection hits an ε-Nearest Neighbor with Pr = p₁
• k projections hit an ε-Nearest Neighbor with Pr = p₁^k
• All L hashings fail to collide with Pr = (1 − p₁^k)^L
• To ensure a collision (e.g. with 1 − δ ≥ 90%):
 1 − (1 − p₁^k)^L ≥ 1 − δ ⟹ L ≥ log(δ) / log(1 − p₁^k)
Accept neighbors, reject non-neighbors.
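A minimal stdlib sketch of turning the bound above into a table count (the p₁, k, δ values are illustrative assumptions, not from the slides):

```python
import math

def tables_needed(p1, k, delta):
    """Smallest L with 1 - (1 - p1**k)**L >= 1 - delta."""
    return math.ceil(math.log(delta) / math.log(1 - p1 ** k))

# e.g. a single projection collides with a near neighbor with p1 = 0.9;
# with k = 10 bits per table, ensure >= 90% recall (delta = 0.1):
L = tables_needed(p1=0.9, k=10, delta=0.1)
print(L)  # -> 6
```

Larger k makes each table more selective (fewer candidates to verify) but drives L up, which is exactly the extraction-vs-verification trade-off of the next slide.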
…Parameters selection
[Figure: query time vs. k — candidate-extraction time grows with k while candidate-verification time shrinks; the optimal k balances the two]
Pros & Cons
Pros:
• Better query time than spatial data structures
• Scales well to higher dimensions and larger data sizes (sub-linear dependence)
• Predictable running time
Cons:
• Extra storage overhead
• Inefficient for data with distances concentrated around the average
• Works best for Hamming distance (although it can be generalized to Euclidean space)
• In secondary storage, a linear scan is pretty much all we can do (for high dimension)
• Requires the radius r to be fixed in advance
From Piotr Indyk's slides
Conclusion
• …but in the end, everything depends on your data set
• Try it at home:
– Visit http://web.mit.edu/andoni/www/LSH/index.html
– Email Alex Andoni (andoni@mit.edu)
– Test it over your own data
(C code, under Red Hat Linux)
LSH - Applications
• Searching video clips in databases ("Hierarchical, Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression – vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short: whenever K-Nearest Neighbors (KNN) are needed
Motivation
• A variety of procedures in learning require KNN computation
• KNN search is a computational bottleneck
• LSH provides a fast approximate solution to the problem
• LSH requires hash-function construction and parameter tuning
Outline
• "Fast Pose Estimation with Parameter Sensitive Hashing", G. Shakhnarovich, P. Viola, and T. Darrell
 – Finding sensitive hash functions
• "Mean Shift Based Clustering in High Dimensions: A Texture Classification Example", B. Georgescu, I. Shimshoni, and P. Meer
 – Tuning LSH parameters; the LSH data structure is used for algorithm speedups
The Problem
"Fast Pose Estimation with Parameter Sensitive Hashing", G. Shakhnarovich, P. Viola, and T. Darrell
Given an image x, what are the parameters θ in this image, i.e. the angles of the joints, the orientation of the body, etc.?
Ingredients
• Input: query image with unknown angles (parameters)
• Database of human poses with known angles
• Image feature extractor – edge detector
• Distance metric in feature space: d_x
• Distance metric in angles space:
 d_θ(θ¹, θ²) = Σᵢ₌₁..ₘ (1 − cos(θ¹ᵢ − θ²ᵢ))
Example-based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query
Input: query → find KNN in the database of examples → output: average angles of the KNN
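Averaging angles naively fails near the 0°/360° wrap-around; one standard fix (an illustrative choice here, not necessarily the paper's exact estimator) is the circular mean of the neighbors' angles:

```python
import math

def mean_angle(angles):
    """Average angles on the circle (avoids the 359-vs-1 degree trap)
    by averaging unit vectors and taking atan2 of the result."""
    s = sum(math.sin(a) for a in angles)
    c = sum(math.cos(a) for a in angles)
    return math.atan2(s, c)

# angles of one joint among the K nearest neighbors, in radians (toy values)
knn_angles = [math.radians(a) for a in (350.0, 5.0, 15.0)]
print(round(math.degrees(mean_angle(knn_angles)) % 360, 1))  # -> 3.3
```

A naive arithmetic mean of 350°, 5°, 15° would give 123.3°, which is nowhere near the neighbors.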
The algorithm flow
Input query → features extraction → processed query → PSH (LSH) against the database of examples → LWR (regression) → output: match
The image features
Image features are multi-scale edge histograms.
[Figure: edge-direction histograms computed over image sub-windows at several scales]
Feature Extraction → PSH → LWR
PSH: The basic assumption
• There are two metric spaces here: feature space (d_x) and parameter space (d_θ)
• We want similarity to be measured in the angles space, whereas LSH works on the feature space
• Assumption: the feature space is closely related to the parameter space
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
[Figure: a query q maps between the parameters space (angles) and the feature space]
Is this magic?
Parameter Sensitive Hashing (PSH)
The trick: estimate the performance of different hash functions on examples, and select those sensitive to d_θ.
The hash functions are applied in feature space, but the KNN are valid in angle space.
1. Label pairs of examples with similar angles
2. Define hash functions h on the feature space
3. Predict the labeling of similar/non-similar examples by using h
4. Compare the labelings
5. If the labeling by h is good, accept h; else change h
PSH as a classification problem
Labels (r = 0.25): a pair of examples (xᵢ, θᵢ), (xⱼ, θⱼ) is labeled
 yᵢⱼ = +1 if d_θ(θᵢ, θⱼ) < r
 yᵢⱼ = −1 if d_θ(θᵢ, θⱼ) > (1 + ε)·r
A binary hash function on features:
 h_T(x) = +1 if the selected feature value of x ≥ T, −1 otherwise
Predict the labels:
 ŷ_h(xᵢ, xⱼ) = +1 if h_T(xᵢ) = h_T(xⱼ), −1 otherwise
Find the best T that predicts the true labeling, subject to the probability constraints: h_T(x) will place both examples in the same bin, or separate them.
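The selection of T can be sketched as a tiny search over candidate (feature, threshold) pairs, scoring each by agreement with the ±1 pair labels (the toy pairs and thresholds below are assumptions for illustration, not from the paper):

```python
def h(x, phi, T):
    """Axis-parallel binary hash: +1 if feature phi of x exceeds threshold T."""
    return 1 if x[phi] >= T else -1

def agreement(pairs, phi, T):
    """Fraction of labeled pairs whose collision prediction matches the label."""
    ok = 0
    for xi, xj, y in pairs:  # y = +1: similar angles, -1: dissimilar
        y_hat = 1 if h(xi, phi, T) == h(xj, phi, T) else -1
        ok += (y_hat == y)
    return ok / len(pairs)

# toy pairs: feature 0 separates the classes, feature 1 is noise
pairs = [
    ([5.0, 1.0], [6.0, 9.0], +1),
    ([5.5, 8.0], [6.5, 2.0], +1),
    ([1.0, 5.0], [6.0, 5.5], -1),
    ([0.5, 3.0], [5.5, 3.5], -1),
]
best = max(((agreement(pairs, phi, T), phi, T)
            for phi in (0, 1) for T in (0.75, 3.0, 5.25, 8.5)),
           key=lambda t: t[0])
print(best)  # the winning (score, feature, threshold)
```

Here the search correctly picks feature 0 with threshold 3.0, which sends every similar pair to the same bin and every dissimilar pair to different bins.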
Local Weighted Regression (LWR)
• Given a query image, PSH returns its KNNs
• LWR uses the KNN to compute a weighted average of the estimated angles of the query:
 β₀ = argmin_β Σ_{xᵢ∈N(x₀)} d_θ(g(xᵢ; β), θᵢ) · K(d_x(xᵢ, x₀))  (dist × weight)
Results
Synthetic data were generated:
• 13 angles: 1 for the rotation of the torso, 12 for the joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, face expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (l)
• Test on 1,000 synthetic examples
• PSH searched only 3.4% of the data per query
• Without selection, 40 bits and 1,000 hash tables were needed
Recall: P1 is the probability of a positive hash, P2 is the probability of a bad hash, B is the max number of points in a bucket.
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 1.3% of the data were searched
[Figure: interesting mismatches]
Fast pose estimation - summary
• A fast way to compute the angles of the human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN: smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given n spheres in Rᵈ, centered at P = p₁, …, pₙ, with radii r₁, …, rₙ
• Goal: given a query q, preprocess the points in P to find a point pᵢ whose sphere covers the query q
Courtesy of Mohamad Hegaze
Motivation
• Clustering high-dimensional data by using local density measurements (e.g. in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
"Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example", B. Georgescu, I. Shimshoni, and P. Meer
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
 1. Finding optimal LSH parameters
 2. Data-driven partitions into buckets
 3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
[Figure: a window of radius h (the bandwidth) around a point is shifted toward the local mean until it converges to a mode]
Mean-shift → LSH: optimal k,l → LSH: data partition → LSH: data struct
KNN in mean-shift
• The bandwidth should be inversely proportional to the density in the region: high density - small bandwidth; low density - large bandwidth
• Based on the kth nearest neighbor of the point, the bandwidth is hᵢ = ‖xᵢ − xᵢ,ₖ‖
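As a sketch (stdlib only; the toy points and k are illustrative assumptions), the adaptive bandwidth of each point can be taken as its distance to its k-th nearest neighbor:

```python
import math

def adaptive_bandwidth(points, k):
    """h_i = distance from x_i to its k-th nearest neighbor:
    small in dense regions, large in sparse ones."""
    hs = []
    for i, x in enumerate(points):
        dists = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        hs.append(dists[k - 1])
    return hs

dense = [(0.0, 0.1 * i) for i in range(5)]    # tightly packed points
sparse = [(10.0, 3.0 * i) for i in range(5)]  # spread-out points
h = adaptive_bandwidth(dense + sparse, k=2)
print(h[0], h[5])  # bandwidth in the dense region << sparse region
```

Brute-force KNN as above is O(n²); this is exactly the bottleneck the LSH data structure replaces in high dimensions.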
Adaptive mean-shift vs. non-adaptive
[Figure: clustering results with fixed vs. adaptive bandwidth]
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths: h_s (spatial) and h_r (color)
3. Apply filtering
[Figure: original → filtered → segmented; filtering assigns each pixel the value of its nearest mode]
"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Mean-shift trajectories
[Figure: trajectories of mean-shift iterations converging to modes]
Filtering examples
[Figure: original squirrel → filtered; original baboon → filtered]
Segmentation examples
"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries, implemented with LSH
• Statistical curse of dimensionality: sparseness of the data, handled with variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (d_k, v_k)
• For each point x we check whether x_{d_k} ≤ v_k; the K boolean results define the point's cell
• This partitions the data into cells
Choosing the optimal K and L
• For a query q, compute the smallest number of distances to points in its buckets
• Large K ⇒ a smaller number of points in a cell
• If L is too small, points might be missed; but if L is too big, extra points might be included
• As L increases, the union of cells C∪ increases but each cell C decreases; this determines the resolution of the data structure
[Equations relating the expected number of points per cell and per union of cells to n, K, d, and L]
Choosing optimal K and L
• Determine accurately the KNN for m randomly-selected data points → the distance (bandwidth)
• Choose an error threshold ε
• The optimal K and L should satisfy the approximate distance within (1 + ε)
• For each K, estimate the error; in one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K))
[Figure: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)] with its minimum]
Data-driven partitions
• In the original LSH, cut values are random in the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
[Figure: bucket distribution — uniform cuts vs. data-driven cuts]
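The contrast between uniform and data-driven cut values can be sketched on skewed toy data (all names and numbers below are illustrative assumptions):

```python
import random

def uniform_cuts(data, dim, k, rng):
    """Original LSH: cut values drawn uniformly from the range of the data."""
    lo = min(x[dim] for x in data)
    hi = max(x[dim] for x in data)
    return [rng.uniform(lo, hi) for _ in range(k)]

def data_driven_cuts(data, dim, k, rng):
    """Suggested variant: use a coordinate of a randomly selected data point,
    so cuts concentrate where the points do."""
    return [rng.choice(data)[dim] for _ in range(k)]

rng = random.Random(0)
# skewed 1-D data: 95% of the mass near 0, a few far outliers
data = [(rng.random(),) for _ in range(95)] + [(100.0,)] * 5
u = uniform_cuts(data, 0, 200, rng)
dd = data_driven_cuts(data, 0, 200, rng)
print(sum(c < 1 for c in u), "uniform cuts land in the dense region")
print(sum(c < 1 for c in dd), "data-driven cuts land in the dense region")
```

With uniform cuts, almost all splits fall in the empty gap and buckets stay badly unbalanced; data-driven cuts follow the point distribution, which is the effect shown in the bucket-distribution figure.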
Additional speedup
Assume that all points in a cell C will converge to the same mode (C is like a type of aggregate).
Speedup results
65,536 points; 1,638 points sampled; k = 100
Food for thought
[Figure: behavior in low dimension vs. high dimension]
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Alternative intuition random projections
101
11000000000 111111110000 111000000000 111111110001
000
100
110
001
101
111
2233 BucketsBucketsp
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
k samplings
Repeating
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
• Input: query image with unknown angles (parameters)
• Database of human poses with known angles
• Image feature extractor – edge detector
• Distance metric in feature space: d_x
• Distance metric in angle space:
  d_θ(θ_1, θ_2) = Σ_{i=1}^{m} (1 − cos(θ_{1,i} − θ_{2,i}))
Example-based learning
• Construct a database of example images with their known angles.
• Given a query image, run your favorite feature extractor.
• Compute the KNN from the database.
• Use these KNNs to compute the average angles of the query.
Input: query → find KNN in the database of examples → output: average angles of the KNN.
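A minimal sketch of the example-based learning loop above, with a brute-force neighbor search standing in for PSH and synthetic random data; `knn_average_angles` and all names here are illustrative, not from the paper:

```python
import numpy as np

def knn_average_angles(query_feat, db_feats, db_angles, k=3):
    """Brute-force stand-in for the PSH lookup: find the k nearest
    database examples in feature space, then average their known angles."""
    dists = np.linalg.norm(db_feats - query_feat, axis=1)
    idx = np.argsort(dists)[:k]          # indices of the k nearest examples
    return db_angles[idx].mean(axis=0)   # naive average; LWR refines this

# toy database: 100 example images, 16-D features, 13 angles each
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 16))
angles = rng.uniform(0.0, 2.0 * np.pi, size=(100, 13))
est = knn_average_angles(feats[0], feats, angles, k=5)
```

Replacing the brute-force scan with a hash lookup is exactly what PSH contributes; the averaging step stays the same.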
The algorithm flow:
Input: query → feature extraction → processed query → PSH (LSH) against the database of examples → LWR (regression) → output: match.
The image features
Image features are multi-scale edge direction histograms.
(Pipeline: Feature Extraction → PSH → LWR)
PSH: The basic assumption
There are two metric spaces here: the feature space (d_x) and the parameter space (d_θ).
We want similarity to be measured in the angle space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space.
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space.
• But the global structure may be complicated: curved.
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
Parameter space (angles) vs. feature space: a query q and its neighbors map between the two spaces. Is this magic?
Parameter Sensitive Hashing (PSH)
The trick: estimate the performance of different hash functions on examples, and select those sensitive to d_θ.
The hash functions are applied in feature space, but the KNN are valid in angle space.
• Label pairs of examples with similar angles.
• Define hash functions h on the feature space.
• Predict the labeling of similar/non-similar examples by using h.
• Compare the labelings.
• If the labeling by h is good, accept h; else change h.
PSH as a classification problem
Labels (r = 0.25): a pair of examples (x_i, θ_i), (x_j, θ_j) is labeled
  y_ij = +1 if d_θ(θ_i, θ_j) < r
  y_ij = −1 if d_θ(θ_i, θ_j) > (1 + ε) r
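The labeling rule above can be sketched as a small helper; `pair_label` is a hypothetical name, and the default ε = 1.0 is an assumption for illustration (pairs in the gray zone between r and (1 + ε)r get no label):

```python
def pair_label(d_theta, r=0.25, eps=1.0):
    """PSH training label for a pair of examples, given their distance
    d_theta in angle space: +1 if closer than r, -1 if farther than
    (1 + eps) * r, and None for the unused 'gray zone' in between."""
    if d_theta < r:
        return +1
    if d_theta > (1.0 + eps) * r:
        return -1
    return None
```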
A binary hash function on features:
  h_T(x) = +1 if the selected feature of x exceeds the threshold T, −1 otherwise.
Predict the labels:
  ŷ_h(x_i, x_j) = +1 if h_T(x_i) = h_T(x_j), −1 otherwise.
Find the best threshold T that predicts the true labeling, subject to the probability constraints: h_T will place both examples of a pair in the same bin, or separate them.
Local Weighted Regression (LWR)
• Given a query image, PSH returns its KNNs.
• LWR uses the KNN to compute a weighted average of the estimated angles of the query:
  θ_0 = argmin_g Σ_{x_i ∈ N(x)} K(d_x(x_i, x)) · (g(x_i) − θ_i)²
  where K is a distance-weighting kernel and N(x) is the neighbor set.
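A zeroth-order sketch of the weighted-averaging step, assuming a Gaussian kernel K and a constant local model (a simplification of full locally-weighted regression; `lwr_angles` and the bandwidth value are illustrative):

```python
import numpy as np

def lwr_angles(query_feat, nn_feats, nn_angles, bandwidth=1.0):
    """Weighted average of the neighbors' angles: weights K(d_x) decay
    with feature-space distance to the query (a constant local model,
    i.e. zeroth-order LWR)."""
    d = np.linalg.norm(nn_feats - query_feat, axis=1)
    w = np.exp(-(d / bandwidth) ** 2)   # Gaussian kernel K
    w = w / w.sum()
    return w @ nn_angles

# the far neighbor contributes almost nothing at this bandwidth
est = lwr_angles(np.zeros(2),
                 np.array([[0.0, 0.0], [10.0, 10.0]]),
                 np.array([[1.0], [3.0]]),
                 bandwidth=1.0)
```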
Results
Synthetic data were generated:
• 13 angles: 1 for the rotation of the torso, 12 for the joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, facial expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (l)
• Tested on 1,000 synthetic examples
• PSH searched only 3.4% of the data per query
• Without feature selection, 40 bits and 1,000 hash tables would have been needed
Recall: P1 is the probability of a positive hash, P2 is the probability of a bad hash, and B is the maximum number of points in a bucket.
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 1.3% of the data were searched
Interesting mismatches occur.
Fast pose estimation – summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given n spheres in R^d, centered at P = {p_1, …, p_n}, with radii r_1, …, r_n.
• Goal: given a query q, preprocess the points in P so as to find a point p_i whose sphere 'covers' the query q.
Courtesy of Mohamad Hegaze.
Motivation
• Clustering high-dimensional data by using local density measurements (e.g. in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance

Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example (B. Georgescu, I. Shimshoni, P. Meer)
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
  1. Finding optimal LSH parameters
  2. Data-driven partitions into buckets
  3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
(Progress: mean-shift → LSH: optimal k, l → LSH: data partition → LSH data structure)
Each mean-shift iteration moves a point toward the weighted mean of its neighbors within the bandwidth.
KNN in mean-shift:
The bandwidth should be inversely proportional to the density in the region: high density – small bandwidth; low density – large bandwidth.
Based on the kth nearest neighbor of the point, the bandwidth is h_i = ||x_i − x_{i,k}||.
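The adaptive-bandwidth rule can be sketched with a brute-force pairwise computation (exactly the k-NN step that LSH is meant to accelerate); `adaptive_bandwidths` is an illustrative name:

```python
import numpy as np

def adaptive_bandwidths(points, k=5):
    """Per-point bandwidth h_i = distance from x_i to its k-th nearest
    neighbor: small in dense regions, large in sparse ones."""
    diff = points[:, None, :] - points[None, :, :]
    d = np.linalg.norm(diff, axis=-1)    # full pairwise distance matrix
    d.sort(axis=1)                       # row i: sorted distances from x_i
    return d[:, k]                       # column 0 is x_i itself (distance 0)

rng = np.random.default_rng(0)
dense = rng.normal(0.0, 0.1, size=(50, 2))            # tight cluster
sparse = rng.normal(0.0, 5.0, size=(50, 2)) + 100.0   # spread-out cluster
h = adaptive_bandwidths(np.vstack([dense, sparse]), k=5)
```

The O(n²) distance matrix is what makes this infeasible at scale; the paper replaces it with the LSH lookup below.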
Adaptive mean-shift vs. non-adaptive mean-shift.

Image segmentation algorithm:
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y).
2. Resolution controlled by the bandwidths h_s (spatial) and h_r (color).
3. Apply filtering.
("Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02)
Image segmentation algorithm: original → filtered → segmented.
Filtering: each pixel takes the value of its nearest mode.
Mean-shift trajectories.
Filtering examples: original squirrel → filtered; original baboon → filtered.
Segmentation examples.
("Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02)
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries, implemented with LSH.
• Statistical curse of dimensionality: sparseness of the data, handled with a variable bandwidth.
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (d_k, v_k).
• For each point x, check the K inequalities x_{d_k} < v_k (coordinate d_k against cut value v_k).
• This partitions the data into cells.
Choosing the optimal K and L
• For a query q, we want to compute the smallest number of distances to points in its buckets.
• Large K → a smaller number of points in a cell.
• If L is too small, points might be missed; if L is too big, extra points might be included.
• As L increases, the union of cells C∪ increases but the intersection C∩ decreases; C∩ determines the resolution of the data structure.
Choosing optimal K and L
• Determine accurately the KNN for m randomly-selected data points; let d_k be the exact distance (bandwidth) and d̂_k the approximate distance returned with a given K, L.
• Choose an error threshold ε; the optimal K and L should keep the relative error of d̂_k within ε.
• For each K, estimate the error; in one run over all L's, find the minimal L satisfying the constraint, L(K).
• Minimize the running time t(K, L(K)).
(Plots: approximation error for (K, L); L(K) for ε = 0.05; running time t[K, L(K)] and its minimum.)
Data-driven partitions
• In the original LSH, cut values are chosen uniformly at random in the range of the data.
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value.
(Figure: bucket distribution, uniform vs. data-driven cut points.)
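A sketch of the data-driven cut selection, under the assumption that a partition is a list of (coordinate, threshold) pairs as described above; `data_driven_cuts` and `bucket_key` are illustrative names:

```python
import random

def data_driven_cuts(points, K, seed=0):
    """K cut pairs (d_k, v_k): instead of drawing each threshold uniformly
    over the data range, pick a random data point and use one of its
    coordinates, so cuts land where the data actually is."""
    rnd = random.Random(seed)
    dim = len(points[0])
    cuts = []
    for _ in range(K):
        d_k = rnd.randrange(dim)      # which coordinate to cut on
        p = rnd.choice(points)        # random data point
        cuts.append((d_k, p[d_k]))    # its d_k-th coordinate = cut value
    return cuts

def bucket_key(x, cuts):
    """A point's cell: the pattern of the K boolean tests x[d_k] < v_k."""
    return tuple(x[d] < v for d, v in cuts)

pts = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.1, 4.9)]
cuts = data_driven_cuts(pts, K=4)
```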
Additional speedup
Assume that all points in C∩ will converge to the same mode (C∩ acts as a type of aggregate).
Speedup results
65,536 points; 1,638 points sampled; k = 100.
Food for thought
Low dimension vs. high dimension.
A thought for food…
• Choose K, L by sample learning, or take the traditional values.
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed.
• The catch: efficient dimensionality learning requires KNN.
15:30 – cookies…
Summary
• LSH trades accuracy for a gain in complexity.
• Applications that involve massive data in high dimensions require the fast performance of LSH.
• Extension of LSH to different spaces (PSH).
• Learning the LSH parameters and hash functions for different applications.
Conclusion
• …but in the end, everything depends on your data set.
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni (andoni@mit.edu)
  – Test it over your own data
  (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Repeating L times
Secondary hashing
Support volume tuning: dataset size vs. storage volume.
The 2^k buckets are hashed again into M buckets of size B.
Simple hashing: M·B = α·n, with α = 2.
(Skip)
The above hashing is locality-sensitive:
• Probability(p and q in the same bucket) = (1 − Distance(p, q)/d)^k
(Plots: collision probability vs. Distance(q, p_i), for k = 1 and k = 2.)
Adopted from Piotr Indyk's slides.
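A quick empirical check of the collision probability above for bit-sampling in Hamming space, assuming the k coordinates are sampled uniformly with replacement (`hamming_lsh_collision` is an illustrative name):

```python
import random

def hamming_lsh_collision(dist, d, k, trials=20000, seed=0):
    """Empirical collision rate of the bit-sampling hash
    g(p) = (p[i1], ..., p[ik]) for two binary vectors at Hamming
    distance `dist`; theory predicts (1 - dist/d) ** k."""
    rnd = random.Random(seed)
    p = [0] * d
    q = [0] * d
    for i in range(dist):
        q[i] = 1                                    # flip `dist` bits
    hits = 0
    for _ in range(trials):
        idx = [rnd.randrange(d) for _ in range(k)]  # k sampled coordinates
        if all(p[i] == q[i] for i in idx):
            hits += 1
    return hits / trials

est = hamming_lsh_collision(dist=10, d=100, k=3)    # theory: 0.9**3 = 0.729
```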
Preview
• General solution – locality sensitive hashing
• Implementation for the Hamming space
• Generalization to l2
Direct L2 solution
• New hashing function
• Still based on sampling
• Uses a mathematical trick:
• p-stable distributions for the Lp distance; the Gaussian distribution for the L2 distance
Central limit theorem
v_1·(Gaussian) + v_2·(Gaussian) + … + v_n·(Gaussian) = (weighted Gaussians) = a weighted Gaussian.

Central limit theorem
v_1, …, v_n = real numbers; X_1, …, X_n = independent, identically distributed (i.i.d.).
What is v_1·X_1 + v_2·X_2 + … + v_n·X_n?

Central limit theorem (dot product → norm)
Σ_i v_i X_i = ||v||_2 · X, with X a standard Gaussian.

Central limit theorem (norm → distance)
For feature vectors u and v:
Σ_i u_i X_i − Σ_i v_i X_i = Σ_i (u_i − v_i) X_i = ||u − v||_2 · X.
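The 2-stability property sketched above can be checked numerically; this simulation assumes standard normal X_i:

```python
import numpy as np

# 2-stability: a Gaussian-weighted sum v1*X1 + ... + vn*Xn is distributed
# like ||v||_2 times a single standard Gaussian X.
rng = np.random.default_rng(1)
v = np.array([3.0, 4.0])              # ||v||_2 = 5
X = rng.normal(size=(200_000, 2))     # i.i.d. N(0, 1) draws
s = X @ v                             # one weighted sum per row
# s should look like N(0, 5**2)
```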
The full Hashing
h_{a,b}(v) = ⌊(a·v + b) / w⌋
• v: the features vector, e.g. [34, 82, 21]
• a: d random numbers (one per dimension)
• b: a random phase in [0, w]
• w: the discretization step
The full Hashing – worked example
h_{a,b}(v) = ⌊(a·v + b) / w⌋ with phase b = 34 and discretization step w = 100: the projection a·v + b = 7944 falls in the cell [7900, 8000), on an axis marked 7800, 7900, 8000, 8100, 8200.
The full Hashing
h_{a,b}(v) = ⌊(a·v + b) / w⌋
• v: the features vector (d-dimensional)
• a_1, …, a_d: drawn i.i.d. from a p-stable distribution
• b: a random phase in [0, w]
• w: the discretization step
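A sketch of the h_{a,b} family for L2, assuming the 2-stable (Gaussian) case; `make_p_stable_hash` and the parameter values are illustrative:

```python
import numpy as np

def make_p_stable_hash(dim, w, rng):
    """One L2 LSH function h_{a,b}(v) = floor((a.v + b) / w), with a drawn
    i.i.d. from N(0,1) (the 2-stable distribution) and b uniform in [0, w);
    nearby vectors land in the same cell with high probability."""
    a = rng.normal(size=dim)
    b = rng.uniform(0.0, w)
    return lambda v: int(np.floor((np.dot(a, v) + b) / w))

rng = np.random.default_rng(0)
hashes = [make_p_stable_hash(3, w=4.0, rng=rng) for _ in range(100)]
v = np.array([1.0, 2.0, 3.0])
near = v + 0.01                     # tiny perturbation
far = v + 100.0                     # far away in L2
near_collisions = sum(h(v) == h(near) for h in hashes)
far_collisions = sum(h(v) == h(far) for h in hashes)
```

Concatenating k such functions and repeating over L tables gives the full scheme from the earlier slides.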
Generalization: p-stable distributions
• L2: Central Limit Theorem → the Gaussian (normal) distribution
• Lp, 0 < p ≤ 2: Generalized Central Limit Theorem → a p-stable distribution (e.g. Cauchy for L1)
p-stable summary
• Works for the r-nearest neighbor problem
• Generalizes to 0 < p ≤ 2
• Improves the query time: O(d·n^{1/(1+ε)}·log n) → O(d·n^{1/(1+ε)²}·log n)
(Latest results reported by email by Alexander Andoni.)
Parameters selection (for Euclidean space)
• Target: 90% success probability with the best query-time performance.
Parameters selection…
• A single projection hits an ε-nearest neighbor with Pr = p_1.
• k projections hit an ε-nearest neighbor with Pr = p_1^k.
• All L hashings fail to collide with Pr = (1 − p_1^k)^L.
• To ensure a collision (e.g. 1 − δ ≥ 90%):
  1 − (1 − p_1^k)^L ≥ 1 − δ  ⟹  L ≥ log(δ) / log(1 − p_1^k)
(Plot: query time vs. k, split into candidate extraction and candidate verification; k trades off rejecting non-neighbors against accepting neighbors.)
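The bound above gives the number of tables L directly; a small helper (illustrative name `tables_needed`) for a target retrieval probability 1 − δ:

```python
import math

def tables_needed(p1, k, delta=0.1):
    """Smallest L with 1 - (1 - p1**k)**L >= 1 - delta: the number of hash
    tables needed so a true near neighbor collides with the query in at
    least one table with probability >= 1 - delta."""
    return math.ceil(math.log(delta) / math.log(1.0 - p1 ** k))

# e.g. p1 = 0.9 per projection, k = 18-bit hash keys, 90% recall target
L = tables_needed(0.9, 18, delta=0.1)
```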
Pros & Cons
+ Better query time than spatial data structures
+ Scales well to higher dimensions and larger data sizes (sub-linear dependence)
+ Predictable running time
− Extra storage overhead
− Inefficient for data with distances concentrated around the average
− Works best for the Hamming distance (although it can be generalized to Euclidean space)
− In secondary storage, a linear scan is pretty much all we can do (for high dimensions)
− Requires the radius r to be fixed in advance
(From Piotr Indyk's slides)
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Repeating L times
Repeating L times
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivation
• Clustering high-dimensional data by using local density measurements (e.g. in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example
B. Georgescu, I. Shimshoni and P. Meer
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions, using LSH
• Speedups:
  1. Finding optimal LSH parameters
  2. Data-driven partitions into buckets
  3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
(figure: a window of a given bandwidth around a point shifts toward the mean of the points it covers)
Progress: Mean-shift → LSH → optimal k,l → LSH data partition → LSH data struct
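The nutshell above can be sketched in a few lines: repeatedly replace a point by the mean of the data falling inside its window until it stops moving. The flat kernel, the stopping rule, and the toy data are illustrative choices, not the paper's exact procedure:

```python
import numpy as np

def mean_shift_point(x, data, bandwidth, n_iter=50, tol=1e-6):
    """Shift a single query point toward the nearest density mode.

    Each iteration replaces x with the mean of all data points inside a
    window of the given bandwidth (flat kernel)."""
    for _ in range(n_iter):
        dists = np.linalg.norm(data - x, axis=1)
        window = data[dists <= bandwidth]
        if len(window) == 0:
            break
        new_x = window.mean(axis=0)
        if np.linalg.norm(new_x - x) < tol:
            break
        x = new_x
    return x

# Two well-separated clusters in 2D; a start near the first cluster
# should converge to that cluster's mode.
rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0, 0.1, (50, 2)),
                  rng.normal(5, 0.1, (50, 2))])
mode = mean_shift_point(np.array([0.3, 0.3]), data, bandwidth=1.0)
```

The expensive part is the range query inside the loop, which is exactly what the LSH structure later accelerates.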
KNN in mean-shift
The bandwidth should be inversely proportional to the density in the region:
high density → small bandwidth; low density → large bandwidth.
Based on the kth nearest neighbor of the point, the bandwidth is h_i = ‖x_i − x_{i,k}‖, the distance from x_i to its kth nearest neighbor.
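A minimal sketch of this kth-nearest-neighbor bandwidth rule (brute-force pairwise distances; the function name, k, and the toy data are illustrative assumptions):

```python
import numpy as np

def adaptive_bandwidths(data, k):
    """Per-point bandwidth h_i = distance from x_i to its k-th nearest
    neighbor, so dense regions get small windows and sparse ones large."""
    d = np.linalg.norm(data[:, None, :] - data[None, :, :], axis=2)
    d.sort(axis=1)          # column 0 is the point itself (distance 0)
    return d[:, k]

# Same number of points over a 1-unit and a 10-unit interval:
dense = np.linspace(0, 1, 20).reshape(-1, 1)
sparse = np.linspace(0, 10, 20).reshape(-1, 1)
h_dense = adaptive_bandwidths(dense, k=3)
h_sparse = adaptive_bandwidths(sparse, k=3)   # 10x larger, point for point
```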
Adaptive mean-shift vs non-adaptive
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths: hs (spatial), hr (color)
3. Apply filtering
(3D figure)
Mean-shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Image segmentation algorithm
(figures: original → filtered → segmented)
Filtering: pixel value of the nearest mode
Mean-shift trajectories
Filtering examples
(figures: original squirrel → filtered; original baboon → filtered)
Mean-shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Segmentation examples
Mean-shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries → implemented with LSH
• Statistical curse of dimensionality: sparseness of the data → variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (d_k, v_k)
• For each point x we check, for k = 1…K, whether x(d_k) ≤ v_k; the K Boolean results select the point's cell
It partitions the data into cells.
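A small sketch of this structure with uniformly drawn cut values (the function names and parameters below are ours, not the paper's):

```python
import numpy as np
from collections import defaultdict

def build_lsh_partitions(data, K, L, rng):
    """Build L random partitions; each partition is defined by K pairs
    (d_k, v_k): a dimension index and a cut value.  A point's K-bit key
    of tests x[d_k] <= v_k assigns it to an axis-parallel cell."""
    tables = []
    for _ in range(L):
        dims = rng.integers(0, data.shape[1], size=K)
        cuts = rng.uniform(data.min(0)[dims], data.max(0)[dims])
        buckets = defaultdict(list)
        for i, x in enumerate(data):
            buckets[tuple(x[dims] <= cuts)].append(i)
        tables.append((dims, cuts, buckets))
    return tables

def query_union(tables, q):
    """Candidate set: the union of q's buckets over all L partitions."""
    cand = set()
    for dims, cuts, buckets in tables:
        cand.update(buckets.get(tuple(q[dims] <= cuts), []))
    return cand

rng = np.random.default_rng(1)
data = rng.normal(size=(200, 5))
tables = build_lsh_partitions(data, K=4, L=5, rng=rng)
candidates = query_union(tables, data[0])   # always contains point 0 itself
```

Distances are then computed only against the candidate set instead of all n points.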
Choosing the optimal K and L
• For a query q, the cost is the number of distance computations to the points in its buckets; choose K and L so that this number is smallest while the true neighbors are still found
• Large K → a smaller number of points in a cell C
• If L is too small, points might be missed; but if L is too big, the union C̄ of the query's L cells might include extra points
(equation residue for the expected numbers of points N_C and N_C̄ as functions of n, K, d and L omitted)
As L increases, C̄ increases but C decreases; C determines the resolution of the data structure.
Choosing optimal K and L
• Determine accurately the KNN for m randomly selected data points, recording each point's exact kNN distance (bandwidth)
• Choose an error threshold ε
• The optimal K and L should satisfy: the approximate distance returned by the LSH stays within the threshold of the exact distance
Choosing optimal K and L
• For each K, estimate the error for each L
• In one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K))
(plots: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)] with its minimum marked)
Data-driven partitions
• In the original LSH, cut values are random in the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
(figure: bucket distribution for uniform vs. data-driven cut points)
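The two cut-selection rules can be contrasted in a few lines (the function names are ours; the uniform variant is included only for comparison):

```python
import numpy as np

def uniform_cuts(data, K, rng):
    """Original LSH: cut values drawn uniformly over each chosen dimension's range."""
    dims = rng.integers(0, data.shape[1], size=K)
    return dims, rng.uniform(data.min(0)[dims], data.max(0)[dims])

def data_driven_cuts(data, K, rng):
    """Data-driven variant: each cut value is a coordinate of a randomly
    chosen data point, so cuts land where the data actually is and the
    buckets come out more evenly populated."""
    dims = rng.integers(0, data.shape[1], size=K)
    rows = rng.integers(0, data.shape[0], size=K)
    return dims, data[rows, dims]

rng = np.random.default_rng(0)
data = rng.normal(size=(50, 3))
dims, cuts = data_driven_cuts(data, K=8, rng=rng)
```

Every data-driven cut value is, by construction, an actual coordinate of some data point along the chosen dimension.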
Additional speedup
Assume that all points in C̄ will converge to the same mode (C̄ is like a type of aggregate).
Speedup results
(65,536 points; 1,638 points sampled; k = 100)
Food for thought
(figure: low dimension vs. high dimension)
A thought for food…
• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning requires KNN
15:30 cookies…
Summary
• LSH trades some accuracy for a gain in complexity
• Applications that involve massive data in high dimension require LSH's fast performance
• Extension of the LSH to different spaces (PSH)
• Learning the LSH parameters and hash functions for different applications
Conclusion
• But at the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni: andoni@mit.edu
  – Test over your own data (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Repeating L times
Secondary hashing
Support volume tuning: dataset size vs. storage volume.
The 2^k logical buckets are mapped by a simple secondary hash into M buckets of size B, with M·B = α·n, α = 2.
The above hashing is locality-sensitive:
• Probability(p, q in same bucket) = (1 − Distance(p, q)/dimensions)^k
(plots: collision probability vs. Distance(q, p_i) for k = 1 and k = 2)
Adopted from Piotr Indyk's slides
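A quick empirical check of this collision probability for the bit-sampling hash in Hamming space (positions are drawn without replacement here, which makes the true probability very slightly below the formula):

```python
import random

def sample_bits_hash(k, dim, rng):
    """Hamming-space LSH: project onto k randomly chosen bit positions."""
    positions = rng.sample(range(dim), k)
    return lambda v: tuple(v[i] for i in positions)

# Estimate Pr[h(p) == h(q)] and compare with (1 - Distance/dimensions)^k.
dim, k, dist, trials = 100, 3, 20, 5000
rng = random.Random(0)
p = [rng.randint(0, 1) for _ in range(dim)]
q = p[:]
for i in rng.sample(range(dim), dist):   # flip `dist` bits to fix the distance
    q[i] ^= 1
hits = 0
for _ in range(trials):
    h = sample_bits_hash(k, dim, rng)
    hits += h(p) == h(q)
estimate = hits / trials                 # near (1 - 20/100)**3 = 0.512
```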
Preview
• General solution – locality-sensitive hashing
• Implementation for Hamming space
• Generalization to l2
Direct L2 solution
• New hashing function
• Still based on sampling
• Using a mathematical trick
• P-stable distribution for Lp distance; Gaussian distribution for L2 distance
Central limit theorem
v1, …, vn = real numbers
X1, …, Xn = independent identically distributed (i.i.d.) Gaussians
v1·X1 + v2·X2 + … + vn·Xn = (sum of weighted Gaussians) = a weighted Gaussian
Central limit theorem
∑_i v_i·X_i ∼ ‖v‖₂ · X
Dot product → norm
Norm → distance:
∑_i u_i·X_i − ∑_i v_i·X_i = ∑_i (u_i − v_i)·X_i ∼ ‖u − v‖₂ · X
(u and v are features vectors 1 and 2; ‖u − v‖₂ is the distance between them)
Dot product → distance
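This chain (dot product → norm → distance) can be verified numerically: for a Gaussian projection vector a, the difference of dot products a·u − a·v spreads exactly like ‖u − v‖₂ times a standard normal. The vectors below are arbitrary illustrative values:

```python
import numpy as np

# Empirical check of 2-stability: a·u − a·v = a·(u − v) ~ ||u − v||_2 · N(0, 1).
rng = np.random.default_rng(0)
u = np.array([3.4, 8.2, 2.1])            # features vector 1
v = np.array([2.2, 7.7, 4.2])            # features vector 2
A = rng.normal(size=(100_000, 3))        # 100k independent projection vectors a
proj_diff = A @ u - A @ v                # equals A @ (u - v)
emp_std = proj_diff.std()                # should match the L2 distance
true_dist = np.linalg.norm(u - v)        # ||u − v||_2
```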
The full Hashing
h_{a,b}(v) = ⌊(a·v + b) / w⌋
• a: d random numbers; v: the features vector (dimension d)
• b: random phase in [0, w]
• w: discretization step
(figure: example features vector [34 82 21] being projected and binned)
The full Hashing (worked example)
h_{a,b}(v) = ⌊(a·v + b) / w⌋
• a = (a_1, …, a_d), i.i.d. from a p-stable distribution; v is the features vector
• b: random phase in [0, w]; w: discretization step
• Example: a·v = 7944, w = 100 (bin edges …, 7800, 7900, 8000, 8100, 8200, …), b = 34, so h(v) = ⌊(7944 + 34)/100⌋ = 79
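A sketch of this hash for L2, using the Gaussian as the 2-stable distribution (the class name and parameter values are ours):

```python
import numpy as np

class PStableHash:
    """One L2 LSH function h_{a,b}(v) = floor((a·v + b) / w), with
    a ~ N(0, I) (the Gaussian is 2-stable), b a random phase uniform
    in [0, w), and w the discretization step."""
    def __init__(self, dim, w, rng):
        self.a = rng.normal(size=dim)
        self.b = rng.uniform(0, w)
        self.w = w

    def __call__(self, v):
        return int(np.floor((self.a @ v + self.b) / self.w))

rng = np.random.default_rng(0)
hashes = [PStableHash(dim=3, w=4.0, rng=rng) for _ in range(200)]
u = np.array([3.4, 8.2, 2.1])
near = u + 0.01                # a tiny perturbation of u
far = u + 100.0                # much farther away than w
near_coll = sum(h(u) == h(near) for h in hashes)   # nearly all 200 collide
far_coll = sum(h(u) == h(far) for h in hashes)     # very few collide
```

Points within the discretization step almost always share a bin, while distant points almost never do, which is exactly the locality-sensitive property.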
Generalization: P-stable distributions
• Lp, 0 < p ≤ 2: Generalized Central Limit Theorem → a p-stable distribution (e.g. Cauchy for L1)
• L2: Central Limit Theorem → the Gaussian (normal) distribution
P-Stable summary
• Works for Lp, and generalizes to 0 < p ≤ 2
• Improves the query time: O(d·n^(1/(1+ε)) · log n) → O(d·n^(1/(1+ε)²) · log n)
r - Nearest Neighbor
Latest results
Reported in an e-mail by Alexander Andoni
Parameters selection
• 90% probability: best query time performance
• For Euclidean space

Parameters selection…
For Euclidean space:
• A single projection hits an ε-nearest neighbor with Pr = p₁
• k projections hit an ε-nearest neighbor with Pr = p₁^k
• L hashings fail to collide with Pr = (1 − p₁^k)^L
• To ensure collision (e.g. with probability 1 − δ ≥ 90%):
  1 − (1 − p₁^k)^L ≥ 1 − δ  ⇒  L ≥ log(δ) / log(1 − p₁^k)
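The last inequality gives the number of tables directly; a small helper (the function name and the example values p₁ = 0.8, k = 10, δ = 0.1 are ours):

```python
import math

def required_tables(p1, k, delta):
    """Smallest L with 1 - (1 - p1**k)**L >= 1 - delta: the number of
    hash tables needed so that an ε-NN whose single-projection collision
    probability is p1 is retrieved with probability at least 1 - delta."""
    return math.ceil(math.log(delta) / math.log(1 - p1 ** k))

L = required_tables(p1=0.8, k=10, delta=0.1)
```

Increasing k makes each table more selective (p₁^k shrinks), so L must grow to keep the overall success probability at 1 − δ.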
Parameters selection…
Accept neighbors; reject non-neighbors.
(plot: query time vs. k, trading off candidate-verification time, which falls with k, against candidate-extraction time, which grows with k)
Pros & Cons
Pros:
• Better query time than spatial data structures
• Scales well to higher dimensions and larger data sizes (sub-linear dependence)
• Predictable running time
Cons:
• Extra storage overhead
• Inefficient for data with distances concentrated around the average
• Works best for Hamming distance (although it can be generalized to Euclidean space)
• In secondary storage, a linear scan is pretty much all we can do (for high dimensions)
• Requires the radius r to be fixed in advance
From Piotr Indyk's slides
LSH – Applications
• Searching video clips in databases ("Hierarchical Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression – vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short: whenever K-Nearest Neighbors (KNN) are needed
Motivation
• A variety of procedures in learning require KNN computation
• KNN search is a computational bottleneck
• LSH provides a fast approximate solution to the problem
• LSH requires hash function construction and parameter tuning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing (G. Shakhnarovich, P. Viola and T. Darrell)
• Finding sensitive hash functions
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example (B. Georgescu, I. Shimshoni and P. Meer)
• Tuning LSH parameters
• The LSH data structure is used for algorithm speedups
The Problem
Given an image x, what are the parameters θ in this image, i.e. the angles of the joints, the orientation of the body, etc.?
Fast Pose Estimation with Parameter Sensitive Hashing
G. Shakhnarovich, P. Viola and T. Darrell
Ingredients
• Input: query image with unknown angles (parameters)
• Database of human poses with known angles
• Image feature extractor – edge detector
• Distance metric in feature space: d_x
• Distance metric in angles space: d_θ(θ¹, θ²) = Σ_{i=1..m} (1 − cos(θ¹_i − θ²_i))
Example-based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query
(figure: input query → find KNN in database of examples → output: average angles of KNN)
The algorithm flow
Input query → Features extraction → Processed query → PSH (LSH) over the database of examples → LWR (regression) → Output: match
The image features
Image features are multi-scale edge histograms.
(figure: edge histograms over image regions A and B; equation residue omitted)
PSH: The basic assumption
There are two metric spaces here: feature space (d_x) and parameter space (d_θ).
We want similarity to be measured in the angles space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space.
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
(figure: a query q mapped between the parameters space (angles) and the feature space)
Is this magic?
Parameter Sensitive Hashing (PSH)
The trick: estimate the performance of different hash functions on examples, and select those sensitive to d_θ.
The hash functions are applied in feature space, but the KNN are valid in angle space.
Label pairs of examples with similar angles
Define hash functions h on feature space
Predict the labeling of similar/non-similar examples by using h.
Compare the labelings.
If the labeling by h is good, accept h; else change h.
PSH as a classification problem
A pair of examples (x_i, θ_i), (x_j, θ_j) is labeled (here r = 0.25):
  y_ij = +1 if d_θ(θ_i, θ_j) < r
  y_ij = −1 if d_θ(θ_i, θ_j) > (1 + ε)·r
(figure: four example pairs labeled +1, +1, −1, −1)
A binary hash function on the features:
  h_T(x) = +1 if T(x), −1 otherwise
Predict the labels:
  ŷ_h(x_i, x_j) = +1 if h_T(x_i) = h_T(x_j), −1 otherwise
Find the best T that predicts the true labeling, subject to the probability constraints: h_T will place both examples in the same bin, or separate them.
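A toy sketch of this selection step. A 1-D "angle", a single scalar feature correlated with it, and a same-side-of-threshold hash are all illustrative assumptions; the paper selects over many candidate hash functions, not one threshold:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: a 1-D angle theta and a noisy feature correlated with it.
theta = rng.uniform(0, 1, 300)
feat = theta + rng.normal(0, 0.05, 300)

r = 0.1   # pairs closer than r in angle space are labeled +1
pairs = [(i, j) for i in range(0, 300, 7) for j in range(1, 300, 11)]
y = np.array([1 if abs(theta[i] - theta[j]) < r else -1 for i, j in pairs])

def agreement(T):
    """Fraction of pairs where the hash label (same side of T) matches y."""
    yhat = np.array([1 if (feat[i] > T) == (feat[j] > T) else -1
                     for i, j in pairs])
    return float((yhat == y).mean())

# Sweep candidate thresholds and keep the most parameter-sensitive one.
Ts = np.linspace(0, 1, 21)
best_T = max(Ts, key=agreement)
```

Each candidate T is scored by how well its bin assignments in feature space reproduce the similarity labels from angle space, which is the classification view of PSH.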
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Secondary hashing
Support volume tuning
dataset-size vs storage volume
2k buckets
011
Size=B
M Buckets
Simple Hashing
MB=αn α=2
Skip
The above hashing is locality-sensitive
bullProbability (pq in same bucket)=
k=1 k=2
Distance (qpi) Distance (qpi)
Pro
babi
lity Pr
Adopted from Piotr Indykrsquos slides
kqp
dimensions
)(Distance1
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
The above hashing is locality-sensitive:

• Probability (p, q in same bucket) = (1 - Hamming(p, q) / d)^k

[Figure, adopted from Piotr Indyk's slides: collision probability Pr vs. Distance(q, pi), for k = 1 and k = 2 over d dimensions - the probability decays faster with distance for larger k]
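The collision probability above has a simple closed form for bit-sampling hashes. A minimal sketch (function and variable names are my own, not from the slides):

```python
import random

def sample_bits_hash(k, d, seed=0):
    """Bit-sampling LSH for Hamming space: keep k randomly chosen coordinates."""
    rng = random.Random(seed)
    idx = [rng.randrange(d) for _ in range(k)]  # sampled with replacement
    return lambda p: tuple(p[i] for i in idx)

def collision_prob(hamming, d, k):
    """Pr[h(p) == h(q)] = (1 - Hamming(p, q) / d) ** k."""
    return (1.0 - hamming / d) ** k

d, k = 16, 4
h = sample_bits_hash(k, d, seed=7)
p = [0] * d
q = [1] + [0] * (d - 1)          # Hamming distance 1 from p
print(collision_prob(1, d, k))   # (15/16)**4, about 0.77
```

Each sampled bit agrees with probability 1 - Hamming/d, and the k samples are independent, which gives the product form.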
Preview
bullGeneral Solution ndash Locality sensitive hashing
bullImplementation for Hamming space
bullGeneralization to l2
Direct L2 solution
• New hashing function
• Still based on sampling
• Using a mathematical trick: p-stable distributions for Lp distance (the Gaussian distribution for L2 distance)
Central limit theorem

(Weighted Gaussians) = Weighted Gaussian:

v1·X1 + v2·X2 + … + vn·Xn  ~  ||v||_2 · X

where v1, …, vn are real numbers, X1, …, Xn are independent, identically distributed (i.i.d.) standard Gaussians, and X ~ N(0, 1).

• Dot product ↔ norm: Σᵢ vᵢXᵢ is distributed as ||v||_2 · X.
• Dot-product distance ↔ norm distance: for two feature vectors u and v,

Σᵢ uᵢXᵢ − Σᵢ vᵢXᵢ = Σᵢ (uᵢ − vᵢ)Xᵢ  ~  ||u − v||_2 · X
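The 2-stability property above is easy to check empirically. A minimal sketch (names are my own):

```python
import math
import random

def stable_projection(v, rng):
    """Dot product of v with i.i.d. standard Gaussians: sum_i v_i * X_i."""
    return sum(vi * rng.gauss(0.0, 1.0) for vi in v)

rng = random.Random(0)
v = [3.0, 4.0]                    # ||v||_2 = 5
n = 20000
samples = [stable_projection(v, rng) for _ in range(n)]
# 2-stability: the projections are distributed as ||v||_2 * N(0, 1),
# so their standard deviation should be close to 5.
std = math.sqrt(sum(s * s for s in samples) / n)
print(round(std, 1))
```

This is exactly why a random Gaussian projection preserves L2 distances in distribution: replacing v with u - v gives a projection whose spread is ||u - v||_2.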
The full Hashing

h_{a,b}(v) = ⌊(a · v + b) / w⌋

• v - features vector (e.g. v = [34, 82, 21, …] in R^d)
• a - d random numbers, i.i.d. from a p-stable distribution
• b - random phase, drawn uniformly from [0, w]
• w - discretization step

Worked example: suppose a · v = 7944, b = 34 and w = 100. The projection line is cut into bins [… 7800, 7900, 8000, 8100, 8200 …]; the shifted projection 7944 + 34 = 7978 falls into the bin starting at 7900, so h(v) = ⌊7978 / 100⌋ = 79.
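The hash above can be sketched in a few lines (function names are my own; the formula is the slides' h_{a,b}):

```python
import math
import random

def make_l2_hash(d, w, seed=0):
    """h_{a,b}(v) = floor((a . v + b) / w) with a_i ~ N(0, 1) and b ~ U[0, w]."""
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(d)]   # d random numbers
    b = rng.uniform(0.0, w)                        # random phase in [0, w]
    def h(v):
        return math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
    return h

h = make_l2_hash(d=3, w=100.0, seed=1)
v_near1 = [34.0, 82.0, 21.0]
v_near2 = [34.5, 81.5, 21.5]   # close in L2, so very likely the same bucket
print(h(v_near1), h(v_near2))
```

Larger w makes nearby points collide more often but discriminates distant points less; in practice w is tuned together with k and L.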
Generalization: P-Stable distribution

• L2: Central Limit Theorem → Gaussian (normal) distribution
• Lp, 0 < p ≤ 2: Generalized Central Limit Theorem → p-stable distribution (e.g. the Cauchy distribution for L1)
P-Stable summary

• Works for the r-nearest-neighbor problem; generalizes to any Lp with 0 < p ≤ 2
• Improves query time: from O(d·n^(1/(1+ε))·log n) to O(d·n^(1/(1+ε)²)·log n)
  (latest results, reported by email by Alexander Andoni)
Parameters selection

For Euclidean space, aim e.g. for 90% collision probability at the best query-time performance:

• A single projection hits an ε-nearest neighbor with Pr = p1
• k projections hit an ε-nearest neighbor with Pr = p1^k
• All L hashings fail to collide with Pr = (1 − p1^k)^L
• To ensure collision with probability at least 1 − δ (e.g. 1 − δ ≥ 90%):

1 − (1 − p1^k)^L ≥ 1 − δ,  i.e.  L ≥ log(δ) / log(1 − p1^k)

Trade-off in k: a larger k rejects more non-neighbors per table (cheaper candidate extraction) but needs more tables, so candidate-extraction and candidate-verification times must be balanced to minimize total query time.
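The bound on L above is directly computable. A minimal sketch (the function name is my own):

```python
import math

def tables_needed(p1, k, delta):
    """Smallest L with 1 - (1 - p1**k)**L >= 1 - delta (collision w.p. >= 1 - delta)."""
    miss_one_table = 1.0 - p1 ** k   # probability that one table misses the neighbor
    return math.ceil(math.log(delta) / math.log(miss_one_table))

# e.g. p1 = 0.9 per projection, k = 10 projections, target 90% success (delta = 0.1)
L = tables_needed(0.9, 10, 0.1)
print(L)  # 6 tables suffice
```

Note how quickly L grows as k increases: p1^k shrinks geometrically, so each extra projection bit must be paid for with more tables.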
Pros & Cons (from Piotr Indyk's slides)

Pros:
• Better query time than spatial data structures
• Scales well to higher dimensions and larger data sizes (sub-linear dependence)
• Predictable running time

Cons:
• Extra storage overhead
• Inefficient for data with distances concentrated around the average
• Works best for Hamming distance (although it can be generalized to Euclidean space)
• In secondary storage, a linear scan is pretty much all we can do (for high dimensions)
• Requires the radius r to be fixed in advance
Conclusion

• …but at the end, everything depends on your data set
• Try it at home:
  - Visit http://web.mit.edu/andoni/www/LSH/index.html
  - Email Alex Andoni (andoni@mit.edu)
  - Test over your own data (C code under Red Hat Linux)
LSH - Applications

• Searching video clips in databases ("Hierarchical, Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression - vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short: whenever K-Nearest Neighbors (KNN) are needed
Motivation

• A variety of procedures in learning require KNN computation
• KNN search is a computational bottleneck
• LSH provides a fast approximate solution to the problem
• LSH requires hash-function construction and parameter tuning
Outline

"Fast Pose Estimation with Parameter Sensitive Hashing", G. Shakhnarovich, P. Viola and T. Darrell
• Finding sensitive hash functions

"Mean Shift Based Clustering in High Dimensions: A Texture Classification Example", B. Georgescu, I. Shimshoni and P. Meer
• Tuning LSH parameters
• The LSH data structure is used for algorithm speedups
The Problem

"Fast Pose Estimation with Parameter Sensitive Hashing", G. Shakhnarovich, P. Viola and T. Darrell

Given an image x, what are the pose parameters θ in this image, i.e. angles of joints, orientation of the body, etc.?
Ingredients

• Input: query image with unknown angles (parameters)
• Database of human poses with known angles
• Image feature extractor - edge detector
• Distance metric in feature space: d_X
• Distance metric in angle space:

d_θ(θ1, θ2) = Σᵢ₌₁ᵐ (1 − cos(θ1,ᵢ − θ2,ᵢ))
Example based learning

• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query

The algorithm flow:
input query → features extraction → processed query → PSH (LSH) over the database of examples → KNN → LWR (regression) → output: match (average angles of the KNN)
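The example-based pipeline above can be sketched end-to-end with a brute-force KNN step standing in for PSH, and plain averaging standing in for LWR (names and toy data are my own):

```python
import math

def knn_pose(query_feat, database, k):
    """database: list of (features, angles) examples.
    Return the mean angles of the k examples nearest in feature space (L2)."""
    nearest = sorted(database, key=lambda ex: math.dist(query_feat, ex[0]))[:k]
    m = len(nearest[0][1])
    return [sum(ex[1][i] for ex in nearest) / k for i in range(m)]

db = [([0.0, 0.0], [0.10, 1.0]),
      ([0.1, 0.1], [0.20, 1.2]),
      ([5.0, 5.0], [3.00, 0.0])]   # a far-away pose that should be ignored
est = knn_pose([0.05, 0.05], db, k=2)
print([round(a, 2) for a in est])  # averages the two nearby examples
```

In the actual system the sorted scan is replaced by the PSH lookup, which is what makes the search sub-linear.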
The image features

Image features are multi-scale edge histograms.

Feature Extraction → PSH → LWR
PSH: The basic assumption

There are two metric spaces here: feature space (with metric d_X) and parameter space (with metric d_θ). We want similarity to be measured in the angle space, whereas LSH works on the feature space.

• Assumption: the feature space is closely related to the parameter space
Insight: Manifolds

• A manifold is a space in which every point has a neighborhood resembling a Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.

The parameter space (angles) and the feature space can be seen as two such manifolds, with the query q mapped between them. Is this magic?
Parameter Sensitive Hashing (PSH)

The trick: estimate the performance of different hash functions on examples, and select those sensitive to d_θ. The hash functions are applied in feature space, but the KNN are valid in angle space.

• Label pairs of examples with similar angles
• Define hash functions h on the feature space
• Predict the labeling of similar/non-similar examples by using h
• Compare the labelings
• If the labeling by h is good, accept h; else change h
PSH as a classification problem

A pair of examples (xᵢ, θᵢ), (xⱼ, θⱼ) is labeled (e.g. r = 0.25):

yᵢⱼ = +1  if d_θ(θᵢ, θⱼ) ≤ r
yᵢⱼ = −1  if d_θ(θᵢ, θⱼ) ≥ (1 + ε)·r

Feature Extraction → PSH → LWR
A binary hash function on features:

h_T(x) = +1  if x ≥ T (a threshold on a feature),  −1 otherwise

Predict the labels:

ŷ_h(xᵢ, xⱼ) = +1  if h_T(xᵢ) = h_T(xⱼ),  −1 otherwise
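The threshold hash and its pair-label prediction fit in a few lines (function names are my own; the rule is the slides' h_T and ŷ):

```python
def stump_hash(feature_index, threshold):
    """h(x) = +1 if x[feature_index] >= threshold, else -1."""
    return lambda x: 1 if x[feature_index] >= threshold else -1

def predicted_pair_label(h, xi, xj):
    """y_hat = +1 if h puts both examples in the same bin, else -1."""
    return 1 if h(xi) == h(xj) else -1

h = stump_hash(0, 0.5)
print(predicted_pair_label(h, [0.9], [0.7]),   # same bin -> +1
      predicted_pair_label(h, [0.9], [0.1]))   # separated -> -1
```

Selecting hash functions then reduces to scoring each candidate (feature, threshold) pair by how often its predicted labels agree with the true angle-based labels.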
Find the best threshold T that predicts the true labeling subject to the probability constraints: h_T(x) will place both examples of a pair in the same bin, or separate them.
Local Weighted Regression (LWR)

• Given a query image, PSH returns KNNs
• LWR uses the KNN to compute a weighted average of the estimated angles of the query:

θ₀ = argmin_θ Σ_{xᵢ ∈ N(x)} d_θ(g(xᵢ), θ) · K(d_X(xᵢ, x))

where N(x) are the neighbors of the query x, g(xᵢ) are their known angles, and the kernel K turns feature-space distance into a weight.
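A kernel-weighted average captures the spirit of the step above. This is a simplified stand-in for the paper's locally-weighted regression, with a Gaussian kernel and names of my own choosing:

```python
import math

def kernel_weighted_angles(query_feat, neighbors, bandwidth):
    """neighbors: list of (features, angles) returned by the KNN search.
    Gaussian-kernel weighted mean of the neighbors' angles."""
    total_w = 0.0
    acc = [0.0] * len(neighbors[0][1])
    for feat, angles in neighbors:
        # weight = K(d_X(x_i, x)): nearer neighbors count more
        w = math.exp(-math.dist(query_feat, feat) ** 2 / (2 * bandwidth ** 2))
        total_w += w
        acc = [s + w * a for s, a in zip(acc, angles)]
    return [s / total_w for s in acc]

nbrs = [([0.0], [1.0]), ([1.0], [3.0])]
est = kernel_weighted_angles([0.0], nbrs, bandwidth=1.0)
print(round(est[0], 3))  # pulled toward the nearer neighbor's angle (below 2.0)
```

The true LWR minimizes the angle-space distance d_θ rather than averaging in Euclidean angle coordinates, which matters when angles wrap around.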
Results

Synthetic data were generated:
• 13 angles: 1 for rotation of the torso, 12 for joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, face expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (l)
• Test on 1,000 synthetic examples; PSH searched only 34 of the data per query
• Without selection, 40 bits and 1,000 hash tables would have been needed

Recall: P1 is the probability of a positive hash, P2 is the probability of a bad hash, B is the max number of points in a bucket.
Results - real data

• 800 images
• Processed by a segmentation algorithm
• 13 of the data were searched
• Some interesting mismatches occur (figures)
Fast pose estimation - summary

• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging

Food for Thought

• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)

• Given: n spheres in R^d, centered at P = {p1, …, pn}, with radii r1, …, rn
• Goal: given a query q, preprocess the points in P to find a point pi whose sphere covers the query q

Courtesy of Mohamad Hegaze
"Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example", B. Georgescu, I. Shimshoni and P. Meer

Motivation

• Clustering high-dimensional data by using local density measurements (e.g. in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Outline

• Mean-shift in a nutshell + examples

Our scope:
• Mean-shift in high dimensions - using LSH
• Speedups:
  1. Finding optimal LSH parameters
  2. Data-driven partitions into buckets
  3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell

Each iteration moves a point toward the (kernel-weighted) mean of the data inside its bandwidth window, so points climb toward modes of the density.

[Navigation: Mean-shift | LSH | optimal k,l | LSH data partition | LSH data struct]

KNN in mean-shift

• The bandwidth should be inversely proportional to the density in the region: high density - small bandwidth; low density - large bandwidth
• Based on the kth nearest neighbor of the point: the bandwidth is hᵢ = ||xᵢ − xᵢ,ₖ||, the distance from xᵢ to its kth neighbor
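The per-point bandwidth rule above is a one-liner per point with a brute-force neighbor scan (names and toy data are my own):

```python
import math

def adaptive_bandwidths(points, k):
    """h_i = distance from x_i to its kth nearest neighbor:
    small in dense regions, large in sparse regions."""
    hs = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        hs.append(dists[k - 1])
    return hs

pts = [[0.0], [0.1], [0.2], [5.0]]   # three dense points and one outlier
hs = adaptive_bandwidths(pts, k=2)
print(hs[0] < hs[3])   # True: the outlier gets a much larger bandwidth
```

The O(n²) scan here is exactly the bottleneck the paper replaces with approximate KNN via LSH.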
Adaptive mean-shift vs. non-adaptive (figure comparison)
Image segmentation algorithm

1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths: hs (spatial), hr (color)
3. Apply filtering: each pixel takes the value of the nearest mode

original → filtered → segmented

"Mean-shift: A Robust Approach Towards Feature Space Analysis", D. Comaniciu et al., TPAMI '02
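The core mean-shift iteration behind the filtering step can be sketched with a flat kernel (names and toy data are my own; the paper uses smooth kernels and adaptive bandwidths):

```python
import math

def mean_shift_mode(start, points, h, iters=100):
    """Flat-kernel mean-shift: repeatedly move to the mean of the points
    within bandwidth h until the trajectory converges to a mode."""
    y = list(start)
    for _ in range(iters):
        window = [p for p in points if math.dist(p, y) <= h]
        y_new = [sum(coord) / len(window) for coord in zip(*window)]
        if math.dist(y_new, y) < 1e-9:
            break
        y = y_new
    return y

data = [[0.0], [0.2], [0.4], [5.0], [5.2]]   # two 1-D clusters
left = mean_shift_mode([0.0], data, h=1.0)
right = mean_shift_mode([5.2], data, h=1.0)
print(round(left[0], 3), round(right[0], 3))  # the two cluster means
```

Filtering runs this from every pixel's feature vector and assigns the pixel the value of the mode it converges to; pixels converging to the same mode form one segment.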
Mean-shift trajectories (figure)

Filtering examples: original squirrel → filtered; original baboon → filtered

Segmentation examples

"Mean-shift: A Robust Approach Towards Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions

• Computational curse of dimensionality: expensive range queries - implemented with LSH
• Statistical curse of dimensionality: sparseness of the data - variable bandwidth
LSH-based data structure

• Choose L random partitions; each partition includes K pairs (d_k, v_k)
• For each point x we check whether x_{d_k} ≤ v_k; the K test results partition the data into cells
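The coordinate-cut partitioning above is easy to sketch (function names are my own; the (d_k, v_k) pairs are the slides' notation):

```python
import random

def make_partition(dim, K, lo, hi, rng):
    """One random partition: K pairs (d_k, v_k) of coordinate index and cut value."""
    return [(rng.randrange(dim), rng.uniform(lo, hi)) for _ in range(K)]

def cell_of(x, partition):
    """The cell label of x: the K-bit pattern of the tests  x[d_k] <= v_k."""
    return tuple(x[d] <= v for d, v in partition)

rng = random.Random(0)
partitions = [make_partition(dim=2, K=3, lo=0.0, hi=1.0, rng=rng)
              for _ in range(4)]                     # L = 4 partitions
x = [0.3, 0.7]
labels = [cell_of(x, p) for p in partitions]
print(len(labels), len(labels[0]))  # 4 cell labels, 3 bits each
```

A query's candidate neighbors are the union, over the L partitions, of the points sharing its cell label.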
Choosing the optimal K and L

• For a query q, the goal is the smallest number of distance computations to the points in its buckets
• Large K - smaller number of points in a cell
• If L is too small, points might be missed; but if L is too big, the union of cells might include extra points
As L increases, the union of cells grows but the intersection of cells shrinks; K determines the resolution of the data structure.
Choosing optimal K and L

• Determine accurately the KNN distance (bandwidth) for m randomly-selected data points
• Choose an error threshold ε
• The optimal K and L should satisfy: the approximate KNN distance stays within the error threshold of the true one
• For each K, estimate the error for every L; in one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K)) over K

[Figures: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)] - the time curve has a clear minimum]
Data driven partitions

• In the original LSH, cut values are random in the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value

[Figure: uniform vs. data-driven points/bucket distribution]
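The data-driven cut suggestion above is a two-line change to how cut values are drawn (names are my own):

```python
import random

def data_driven_cut(points, rng):
    """Pick a random data point and one of its coordinates as the cut value,
    so cuts concentrate where the points actually lie."""
    p = rng.choice(points)
    d = rng.randrange(len(p))
    return d, p[d]

rng = random.Random(3)
pts = [[0.1, 10.0], [0.2, 20.0], [0.3, 30.0]]
d, v = data_driven_cut(pts, rng)
print(d, v in [p[d] for p in pts])  # the cut value always comes from the data
```

Because cuts follow the data distribution, buckets end up with a more even number of points than uniform cuts give on clustered data.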
Additional speedup

Assume that all points in the intersection cell will converge to the same mode (the intersection cell acts like a type of aggregate).

Speedup results: 65,536 points, 1,638 points sampled, k = 100
A thought for food… (low dimension vs. high dimension)

• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold? Intuitively, dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning itself requires KNN

15:30 - cookies…
Summary

• LSH suggests a compromise: accuracy is traded for a gain in complexity
• Applications that involve massive data in high dimensions require the LSH's fast performance
• The LSH extends to different spaces (PSH)
• The LSH parameters and hash functions can be learned for different applications
Conclusion

• …but at the end, everything depends on your data set
• Try it at home:
  - Visit http://web.mit.edu/andoni/www/LSH/index.html
  - Email Alex Andoni (andoni@mit.edu)
  - Test over your own data (C code under Red Hat Linux)

Thanks

• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Direct L2 solution
bullNew hashing function
bullStill based on sampling
bullUsing mathematical trick
bullP-stable distribution for Lp distance bullGaussian distribution for L2 distance
Central limit theorem
v1 +v2 hellip+vn =+hellip
(Weighted Gaussians) = Weighted Gaussian
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
• Input: a query image with unknown angles (parameters)
• A database of human poses with known angles
• An image feature extractor – an edge detector
• A distance metric in feature space, d_x
• A distance metric in angle space:
  d_θ(θ¹, θ²) = Σ_{i=1}^{m} [1 − cos(θ¹_i − θ²_i)]
Example-based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query
Input: query → find its KNN in the database of examples → output: average angles of the KNN
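The example-based pipeline above can be sketched in a few lines. This is a minimal illustration, not the paper's system: brute-force KNN stands in for PSH, the feature arrays are hypothetical, and the angles are averaged on the unit circle since they are circular quantities:

```python
import numpy as np

def knn_pose_estimate(query_feat, db_feats, db_angles, k=5):
    """Estimate pose angles as the circular mean of the k nearest examples.

    query_feat: (d,) feature vector of the query image
    db_feats:   (n, d) features of the example database
    db_angles:  (n, m) known joint angles (radians) per example
    """
    # Brute-force KNN in feature space (the paper replaces this with PSH).
    dists = np.linalg.norm(db_feats - query_feat, axis=1)
    nn = np.argsort(dists)[:k]
    # Angles are circular, so average on the unit circle, not arithmetically.
    s = np.sin(db_angles[nn]).mean(axis=0)
    c = np.cos(db_angles[nn]).mean(axis=0)
    return np.arctan2(s, c)
```

The circular mean avoids the wrap-around artifact of a plain arithmetic average (e.g., averaging 359° and 1° should give 0°, not 180°).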
The algorithm flow:
Input query → feature extraction → processed query → PSH (LSH) against the database of examples → LWR (regression) → output: match
The image features
Image features are multi-scale edge histograms.
(Pipeline: Feature Extraction → PSH → LWR)
PSH: The basic assumption
There are two metric spaces here: feature space (d_x) and parameter space (d_θ). We want similarity to be measured in the angle space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
(figure: the parameter space (angles) mapped to the feature space; query q)
Is this magic?
Parameter Sensitive Hashing (PSH)
The trick: estimate the performance of different hash functions on examples, and select those sensitive to d_θ. The hash functions are applied in feature space, but the KNN are valid in angle space.
• Label pairs of examples with similar angles
• Define hash functions h on the feature space
• Predict the labeling of similar/non-similar examples by using h
• Compare the labelings
• If the labeling by h is good, accept h; else change h
PSH as a classification problem
Labels (r = 0.25): a pair of examples (x_i, x_j) is labeled
  y_ij = +1 if d_θ(θ_i, θ_j) ≤ r
  y_ij = −1 if d_θ(θ_i, θ_j) ≥ (1 + ε) r
A binary hash function on features:
  h_T(x) = +1 if x_T ≥ T, −1 otherwise
Predict the labels:
  ŷ_h(x_i, x_j) = +1 if h_T(x_i) = h_T(x_j), −1 otherwise
Find the best T that predicts the true labeling subject to the probability constraints: h_T should place both examples of a similar pair in the same bin, and separate dissimilar pairs.
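The selection loop above can be sketched as follows. This is a toy illustration with hypothetical names; the random search over (dimension, threshold) candidates is my simplification, not the paper's exact procedure:

```python
import numpy as np

def pair_accuracy(feats, pairs, labels, dim, T):
    """Accuracy of h_T (threshold T on feature `dim`) as a pair classifier.

    pairs:  list of (i, j) index pairs
    labels: +1 if the pair has similar angles, -1 otherwise
    h predicts +1 when both examples fall on the same side of T.
    """
    side = np.where(feats[:, dim] >= T, 1, -1)
    pred = np.array([1 if side[i] == side[j] else -1 for i, j in pairs])
    return np.mean(pred == np.array(labels))

def select_hashes(feats, pairs, labels, n_candidates=50, min_acc=0.7, seed=0):
    """Sample random (dim, T) candidates and keep the accurate ones."""
    rng = np.random.default_rng(seed)
    kept = []
    for _ in range(n_candidates):
        dim = rng.integers(feats.shape[1])
        T = rng.uniform(feats[:, dim].min(), feats[:, dim].max())
        if pair_accuracy(feats, pairs, labels, dim, T) >= min_acc:
            kept.append((dim, T))
    return kept
```

Each kept (dim, T) pair is one parameter-sensitive bit; k of them form one hash key.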
Local Weighted Regression (LWR)
• Given a query image, PSH returns its KNNs
• LWR uses the KNN to compute a weighted average of the estimated angles of the query:
  θ̂(x₀) = argmin_θ Σ_{x_i ∈ N(x₀)} d_θ(g(x_i), θ) · K(d_x(x_i, x₀))
  where K is a kernel turning feature-space distance into a weight
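The argmin above has no closed form under the angular distance; a common approximation (my sketch, not the paper's exact robust solver) replaces it with a kernel-weighted circular mean of the neighbors' angles:

```python
import numpy as np

def lwr_angles(query_feat, nn_feats, nn_angles, h=1.0):
    """Kernel-weighted circular mean of the neighbours' angles.

    Approximates argmin_theta sum_i d_theta(g(x_i), theta) * K(d_x(x_i, x0))
    with a Gaussian kernel K over feature-space distance.
    """
    d = np.linalg.norm(nn_feats - query_feat, axis=1)
    w = np.exp(-(d / h) ** 2)               # closer neighbours weigh more
    w = w / w.sum()
    s = (w[:, None] * np.sin(nn_angles)).sum(axis=0)
    c = (w[:, None] * np.cos(nn_angles)).sum(axis=0)
    return np.arctan2(s, c)
```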
Results
Synthetic data were generated:
• 13 angles: 1 for the rotation of the torso, 12 for the joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, face expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (l)
• Test on 1,000 synthetic examples
• PSH searched only 3.4% of the data per query
• Without selection, 40 bits and 1,000 hash tables would have been needed
Recall: P1 is the probability of a positive hash, P2 is the probability of a bad hash, and B is the max number of points in a bucket.
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 1.3% of the data were searched
Results – real data: interesting mismatches
Fast pose estimation - summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general: some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given n spheres in R^d centered at P = p1, …, pn, with radii r1, …, rn
• Goal: given a query q, preprocess the points in P to find a point pi whose sphere 'covers' the query q
(figure: query q inside the sphere of radius ri around pi)
Courtesy of Mohamad Hegaze
Motivation
• Clustering high-dimensional data by using local density measurements (e.g., in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
"Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example", B. Georgescu, I. Shimshoni, and P. Meer
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
  1. Finding optimal LSH parameters
  2. Data-driven partitions into buckets
  3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
(figure: a bandwidth window centered on a point)
Roadmap: Mean-shift → LSH → optimal k, l → LSH data partition → LSH data structure
KNN in mean-shift
• The bandwidth should be inversely proportional to the density in the region: high density → small bandwidth, low density → large bandwidth
• Based on the kth nearest neighbor of the point: the bandwidth is the distance from the point to its kth nearest neighbor
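A sketch of the adaptive-bandwidth idea (brute-force distances here; in the paper the expensive k-NN queries are exactly what LSH accelerates, and the function names are mine):

```python
import numpy as np

def knn_bandwidths(X, k):
    """Per-point bandwidth = distance to the k-th nearest neighbour:
    dense regions get small windows, sparse regions large ones."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    D.sort(axis=1)               # column 0 is the point itself (distance 0)
    return D[:, k]

def mean_shift_step(y, X, h):
    """One mean-shift update with a flat kernel of radius h:
    move y to the mean of the data points inside its window."""
    inside = np.linalg.norm(X - y, axis=1) <= h
    return X[inside].mean(axis=0)
```

Iterating `mean_shift_step` from each data point (with that point's own bandwidth) moves it uphill in density until it converges to a mode.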
Adaptive mean-shift vs non-adaptive
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths: h_s (spatial) and h_r (color)
3. Apply filtering
(figure: the 3D feature space)
"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Image segmentation algorithm
(figures: original → filtered → segmented)
Filtering: each pixel takes the value of its nearest mode
Mean-shift trajectories
Filtering examples
(figures: original squirrel → filtered; original baboon → filtered)
"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Segmentation examples
"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries → implemented with LSH
• Statistical curse of dimensionality: sparseness of the data → variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (d_k, v_k)
• For each point x we check whether x_{d_k} ≤ v_k, for all K pairs
• This partitions the data into cells
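A sketch of such a structure (the names are mine; a real implementation hashes the K-bit pattern into buckets rather than scanning the data, but the cell logic is the same):

```python
import numpy as np

def build_partitions(X, K, L, rng):
    """L random partitions, each defined by K (dimension, cut-value) pairs."""
    parts = []
    for _ in range(L):
        dims = rng.integers(0, X.shape[1], size=K)
        cuts = rng.uniform(X[:, dims].min(axis=0), X[:, dims].max(axis=0))
        parts.append((dims, cuts))
    return parts

def cell_of(x, dims, cuts):
    """A point's cell: the K-bit pattern of the tests x[d_k] <= v_k."""
    return tuple((x[dims] <= cuts).tolist())

def candidates(q, X, parts):
    """Indices of points sharing a cell with q in at least one partition."""
    idx = set()
    for dims, cuts in parts:
        qc = cell_of(q, dims, cuts)
        idx.update(i for i, x in enumerate(X) if cell_of(x, dims, cuts) == qc)
    return idx
```

The query's neighborhood is then searched only within `candidates(q, X, parts)`, the union of its L cells.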
Choosing the optimal K and L
• For a query q, we want to compute the smallest possible number of distances, namely only to the points in its buckets
• Large K → a smaller number of points in a cell
• If L is too small, points might be missed; but if L is too big, extra points might be included
• The query scans the union of its cells, C = ∪_l C_l
• As L increases, |C| increases but the distance d_C decreases; K and L determine the resolution of the data structure
Choosing optimal K and L
• Determine accurately the KNN for m randomly-selected data points, recording each point's exact k-NN distance (bandwidth)
• Choose an error threshold ε
• The optimal K and L should keep the approximate distance within the threshold of the exact one
Choosing optimal K and L
• For each K, estimate the error
• In one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K))
(figures: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)]; the minimum gives the optimal pair)
Data-driven partitions
• In the original LSH, cut values are random in the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
(figure: bucket distributions for uniform vs. data-driven cut points)
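The two cut-selection rules compare like this; a small sketch (function names are mine):

```python
import numpy as np

def uniform_cut(X, dim, rng):
    """Original LSH: cut value drawn uniformly over the data range."""
    return rng.uniform(X[:, dim].min(), X[:, dim].max())

def data_driven_cut(X, dim, rng):
    """Suggested variant: the coordinate of a randomly chosen data point,
    so cuts fall where the data is dense and buckets fill more evenly."""
    return X[rng.integers(len(X)), dim]
```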
Additional speedup
• Assume that all points in a cell C will converge to the same mode (C acts as a kind of aggregate)
Speedup results
65,536 points; 1,638 points sampled; k = 100
Food for thought
(figure: low dimension vs. high dimension)
A thought for food…
• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning requires KNN
15:30 cookies…
Summary
• LSH trades some accuracy for a gain in complexity
• Applications that involve massive data in high dimensions require LSH's fast performance
• Extension of LSH to different spaces (PSH)
• Learning the LSH parameters and hash functions for different applications
Conclusion
• …but at the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni (andoni@mit.edu)
  – Test it on your own data
  (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Central limit theorem
A sum of weighted Gaussians is again a weighted Gaussian:
  v1·X1 + v2·X2 + … + vn·Xn, where
  v1, …, vn are real numbers and
  X1, …, Xn are independent, identically distributed (i.i.d.) Gaussians.
Dot product → norm:
  Σ_i v_i X_i ≈ ||v||₂ · X
Norm → distance: for two feature vectors u and v,
  Σ_i u_i X_i − Σ_i v_i X_i = Σ_i (u_i − v_i) X_i ≈ ||u − v||₂ · X
so the difference of the dot products behaves like the ℓ2 distance between the feature vectors, scaled by a single standard Gaussian X.
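The stability property is easy to check numerically; assuming i.i.d. standard normal X_i, the projection Σ_i v_i X_i should have standard deviation ||v||₂:

```python
import numpy as np

# Empirical check of 2-stability: for i.i.d. standard Gaussians X_i,
# sum_i v_i * X_i is distributed as ||v||_2 times one standard Gaussian.
rng = np.random.default_rng(0)
v = np.array([3.0, 4.0])              # ||v||_2 = 5
samples = rng.normal(size=(200_000, 2)) @ v
print(samples.std())                  # close to 5.0
```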
The full hashing
  h_{a,b}(v) = ⌊(a·v + b) / w⌋
• v: the features vector (d components)
• a: d random numbers, i.i.d. from a p-stable distribution
• b: a random phase in [0, w]
• w: the discretization step
Example: with b = 34 and w = 100, a features vector whose projection gives a·v + b = 7944 falls in the bucket [7900, 8000), i.e., hash value 79.
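Putting the pieces together, one hash function of this family can be sketched as follows (a minimal illustration; the class name is mine):

```python
import numpy as np

class PStableHash:
    """One LSH function h_{a,b}(v) = floor((a . v + b) / w).
    With Gaussian a (2-stable), collisions are sensitive to l2 distance."""
    def __init__(self, d, w, rng):
        self.a = rng.normal(size=d)    # d i.i.d. N(0, 1) projection weights
        self.b = rng.uniform(0, w)     # random phase in [0, w]
        self.w = w                     # discretization step
    def __call__(self, v):
        return int(np.floor((self.a @ v + self.b) / self.w))
```

Nearby vectors project to nearby values of a·v + b, so they usually land in the same width-w bucket; a full index concatenates k such functions per table, over L tables.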
Generalization: p-stable distributions
• L2 ← Central Limit Theorem ← Gaussian (normal) distribution
• Lp, p ∈ (0, 2] ← Generalized Central Limit Theorem ← p-stable distributions (e.g., Cauchy for L1)
P-stable summary
• Generalizes LSH to 0 < p ≤ 2
• Improves query time:
  Query time = O(d·n^{1/(1+ε)}·log n), improved to O(d·n^{1/(1+ε)²}·log n)
r-Nearest Neighbor: latest results, reported by e-mail by Alexander Andoni
Parameters selection
For Euclidean space:
• 90% success probability with the best query-time performance
• A single projection hits an r-nearest neighbor with probability Pr = p1
• k projections (one table) hit it with probability Pr = p1^k
• All L hashings fail to collide with probability Pr = (1 − p1^k)^L
• To ensure a collision (e.g., with probability 1 − δ ≥ 90%):
  1 − (1 − p1^k)^L ≥ 1 − δ  ⇒  L ≥ log(δ) / log(1 − p1^k)
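The inequality above gives the number of tables directly; a small helper (the function name is mine):

```python
import math

def tables_needed(p1, k, delta):
    """Smallest L with 1 - (1 - p1**k)**L >= 1 - delta,
    i.e. L >= log(delta) / log(1 - p1**k)."""
    return math.ceil(math.log(delta) / math.log(1.0 - p1 ** k))

# e.g. p1 = 0.9, k = 10, delta = 0.1 (90% success): p1**10 ~ 0.349,
# so L = ceil(log 0.1 / log 0.651) = 6 tables.
```

Larger k makes each table more selective (fewer false candidates) but drives L, and hence storage, up: this is the extraction/verification trade-off shown on the next slide.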
…Parameters selection
(figure: accept neighbors / reject non-neighbors; time vs. k for candidate extraction and candidate verification)
Pros & Cons
Pros:
• Better query time than spatial data structures
• Scales well to higher dimensions and larger data sizes (sub-linear dependence)
• Predictable running time
Cons:
• Extra storage overhead
• Inefficient for data with distances concentrated around the average
• Works best for the Hamming distance (although it can be generalized to Euclidean space)
• In secondary storage, a linear scan is pretty much all we can do (for high dimensions)
• Requires the radius r to be fixed in advance
From Piotr Indyk's slides
Conclusion
• …but at the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni (andoni@mit.edu)
  – Test it on your own data
  (C code, under Red Hat Linux)
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Central limit theorem
v1vn = Real Numbers
X1Xn = Independent Identically Distributed(iid)
+v2 X2 hellip+vn Xn =+hellipv1 X1
Central limit theorem
XvXvi
ii
ii
21
2||
Dot Product Norm
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Filtering examples
[Figures: original squirrel → filtered; original baboon → filtered]
Segmentation examples
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries → implemented with LSH
• Statistical curse of dimensionality: sparseness of the data → variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (dk, vk) — a coordinate index and a cut value
• For each point x, evaluate the K inequalities x_{dk} ≤ vk; the resulting bit string selects a cell
• This partitions the data into cells
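The partition-and-cell idea can be sketched as follows (a hedged illustration: the dimensions, cut ranges, and K, L values are arbitrary; a real implementation would also store the points that fall in each cell):

```python
import random

def make_partition(d, K, lo=0.0, hi=1.0, rng=random):
    """One partition: K pairs (d_k, v_k) - a random coordinate index and a cut value."""
    return [(rng.randrange(d), rng.uniform(lo, hi)) for _ in range(K)]

def cell_of(x, partition):
    """The cell label is the K-bit string of inequality tests x[d_k] <= v_k."""
    return tuple(x[dk] <= vk for dk, vk in partition)

random.seed(0)
parts = [make_partition(d=3, K=4) for _ in range(5)]  # L = 5 partitions
x = (0.2, 0.8, 0.5)
cells = [cell_of(x, p) for p in parts]  # one cell label per partition
```

Points sharing a cell label in some partition become query candidates for each other; the union of a query's L cells is its candidate set.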
Choosing the optimal K and L
• For a query q, we want to compute the smallest possible number of distances to the points in its buckets
• Large K → a smaller number of points in each cell
• If L is too small, points might be missed; but if L is too big, extra points might be included
• The candidate set is the union C∪ of the L cells containing q; as L increases, C∪ increases but the intersection C∩ of the cells decreases
• C∩ determines the resolution of the data structure
Choosing optimal K and L
• Determine accurately the KNN for m randomly-selected data points; record each KNN distance (the bandwidth)
• Choose an error threshold ε
• The optimal K and L should satisfy: the approximate KNN distance is within (1 + ε) of the true one
• For each K, estimate the error; in one run over all L's, find the minimal L satisfying the constraint — L(K)
• Minimize the running time t(K, L(K))
[Plots: approximation error for (K, L); L(K) for ε = 0.05; running time t[K, L(K)], with the chosen minimum marked]
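The tuning procedure above can be sketched schematically. Here `approx_error` and `query_time` are hypothetical stand-ins for quantities that would be measured on the m sampled points:

```python
def choose_K_L(Ks, Ls, approx_error, query_time, eps):
    """For each K, find the minimal L meeting the error constraint L(K),
    then pick the pair (K, L(K)) with the smallest running time."""
    best = None
    for K in Ks:
        L_K = next((L for L in Ls if approx_error(K, L) <= eps), None)
        if L_K is None:
            continue  # no L satisfies the constraint for this K
        t = query_time(K, L_K)
        if best is None or t < best[2]:
            best = (K, L_K, t)
    return best

# Toy stand-ins: error falls as K*L grows, time grows as K*L.
best = choose_K_L(range(1, 6), range(1, 20),
                  approx_error=lambda K, L: 1.0 / (K * L),
                  query_time=lambda K, L: K * L,
                  eps=0.05)
```

With these toy functions the search settles on K = 2, L = 10; with measured error and time curves the same loop reproduces the "minimize t(K, L(K))" recipe on the slide.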
Data-driven partitions
• In the original LSH, cut values are random in the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
[Figure: bucket distribution — uniform vs. data-driven cut points]
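A minimal sketch contrasting the two cut-value rules (the data set and coordinate choice are illustrative):

```python
import random

def uniform_cut(lo, hi, rng=random):
    """Original LSH: the cut value is drawn uniformly over the data range."""
    return rng.uniform(lo, hi)

def data_driven_cut(data, coord, rng=random):
    """Suggestion: pick a random data point and use one of its coordinates,
    so cut values concentrate where the data is dense."""
    return rng.choice(data)[coord]

random.seed(1)
data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (9.0, 9.0)]
cut = data_driven_cut(data, coord=0)
```

With three of the four points clustered near the origin, a data-driven cut lands near the cluster 75% of the time, while a uniform cut over [0, 9] usually falls in the empty region, producing badly unbalanced buckets.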
Additional speedup
• Assume that all points in C will converge to the same mode (C acts like a type of aggregate), so the mode needs to be computed only once per cell
Speedup results
[Table: 65,536 points; 1,638 points sampled; k = 100]
Food for thought
[Figure: low dimension vs. high dimension]
A thought for food…
• Choose K, L by sample learning, or take the traditional values?
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning itself requires KNN
15:30 — cookies…
Summary
• LSH trades some accuracy for a gain in complexity
• Applications that involve massive data in high dimension require LSH's fast performance
• Extensions of LSH to different spaces (PSH)
• Learning the LSH parameters and hash functions for different applications
Conclusion
• …but in the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni (andoni@mit.edu)
  – Test over your own data
  (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Central Limit Theorem
For a vector v and i.i.d. Gaussian variables X_i:
  Σ_i v_i · X_i ∼ ||v||_2 · X
(a random projection of v is distributed like its L2 norm times a single Gaussian X)

Dot Product ↔ Norm
  a · v = Σ_i a_i · v_i

Norm Distance → Dot Product Distance
For two feature vectors u and v:
  Σ_i u_i · X_i − Σ_i v_i · X_i = Σ_i (u_i − v_i) · X_i ∼ ||u − v||_2 · X
(the difference of the two projections is distributed like the L2 distance times a Gaussian)
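The slide's norm-to-dot-product reduction rests on the 2-stable property: projections of u − v onto random Gaussian vectors have standard deviation ‖u − v‖₂. A small Monte-Carlo sketch (vectors and sample count are illustrative):

```python
import math
import random
import statistics

random.seed(0)
u = [1.0, 2.0, 3.0]
v = [0.0, 2.0, 1.0]
l2 = math.sqrt(sum((ui - vi) ** 2 for ui, vi in zip(u, v)))  # ||u - v||_2

# a.u - a.v = a.(u - v) for Gaussian a is distributed as N(0, ||u - v||_2^2).
samples = []
for _ in range(20000):
    a = [random.gauss(0.0, 1.0) for _ in u]
    samples.append(sum(ai * (ui - vi) for ai, ui, vi in zip(a, u, v)))
sd = statistics.pstdev(samples)  # empirically close to l2
```

The empirical standard deviation lands within a few percent of ‖u − v‖₂ = √5, so comparing projected values really does compare L2 distances in distribution.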
The full Hashing
  h_{a,b}(v) = ⌊(a · v + b) / w⌋
• v — the features vector (dimension d)
• a — d random numbers, drawn i.i.d. from a p-stable distribution (e.g., a = [3.4, 8.2, 2.1, …])
• b — a random phase in [0, w]
• w — the discretization step

Example: for a projection a · v = 79.44 and phase b = 3.4, with step w = 1.00 the real line is cut into cells […, 78.00, 79.00, 80.00, 81.00, 82.00, …], and the hash value is the index of the cell that a · v + b falls into.
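The hash can be written down directly. The numbers below are toy values in the spirit of the slide's example (a projection near 79.4, phase b = 3.4, step w = 1.00), not data from the talk:

```python
import math

def hash_ab(a, b, w):
    """h_{a,b}(v) = floor((a.v + b) / w): project onto a, shift by a random
    phase b, and cut the line into cells of width w."""
    return lambda v: math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)

a = [3.4, 8.2, 2.1]           # would be i.i.d. p-stable draws in real LSH
v = [1.0, 8.0, 5.0]           # a.v = 3.4 + 65.6 + 10.5 = 79.5
h = hash_ab(a, b=3.4, w=1.0)
bucket = h(v)                 # floor(82.9) = 82
```

A nearby vector such as [1.0, 8.0, 5.01] projects to 79.521 and lands in the same cell 82, which is the locality-sensitivity the construction is after.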
Generalization: P-Stable distributions
• L2: Central Limit Theorem → Gaussian (normal) distribution
• Lp, 0 < p ≤ 2: Generalized Central Limit Theorem → p-stable distribution (e.g., Cauchy for L1)

P-Stable summary
• Works for the r-Nearest Neighbor problem; generalizes to 0 < p ≤ 2
• Improves query time: O(d · n^{1/(1+ε)} · log n) → O(d · n^{1/(1+ε)²} · log n)
  (latest results, reported by e-mail by Alexander Andoni)
Parameters selection
• 90% probability → best query-time performance
For Euclidean space:
• A single projection hits an ε-Nearest Neighbor with Pr = p1
• k projections hit an ε-Nearest Neighbor with Pr = p1^k
• L hashings fail to collide with Pr = (1 − p1^k)^L
• To ensure a collision (e.g., with probability 1 − δ ≥ 90%):
  1 − (1 − p1^k)^L ≥ 1 − δ, i.e., L ≥ log(δ) / log(1 − p1^k)
…Parameters selection
[Plots: the collision probabilities separate neighbors (accept) from non-neighbors (reject); total query time vs. k trades candidate-extraction time against candidate-verification time]
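The collision bound gives a direct recipe for the number of tables L (a sketch; the single-projection probability p1 for the chosen hash family is assumed known, and the values below are illustrative):

```python
import math

def tables_needed(p1, k, delta):
    """Smallest L with 1 - (1 - p1**k)**L >= 1 - delta,
    i.e. L >= log(delta) / log(1 - p1**k)."""
    return math.ceil(math.log(delta) / math.log(1.0 - p1 ** k))

# Example: p1 = 0.9 per projection, k = 10 projections, 90% success target.
L = tables_needed(p1=0.9, k=10, delta=0.1)  # -> 6 tables
```

This makes the tension on the slide concrete: raising k sharpens each table (fewer false candidates) but shrinks p1^k, so L must grow to keep the collision guarantee.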
Pros & Cons
Pros:
• Better query time than spatial data structures
• Scales well to higher dimensions and larger data sizes (sub-linear dependence)
• Predictable running time
Cons:
• Extra storage overhead
• Inefficient for data with distances concentrated around the average
• Works best for Hamming distance (although it can be generalized to Euclidean space)
• In secondary storage, a linear scan is pretty much all we can do (for high dimensions)
• Requires the radius r to be fixed in advance
(From Piotr Indyk's slides)
Conclusion
• …but in the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni (andoni@mit.edu)
  – Test over your own data
  (C code, under Red Hat Linux)
LSH – Applications
• Searching video clips in databases ("Hierarchical Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression – vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short: whenever K-Nearest Neighbors (KNN) are needed
Motivation
• A variety of procedures in learning require KNN computation
• KNN search is a computational bottleneck
• LSH provides a fast approximate solution to the problem
• LSH requires hash-function construction and parameter tuning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing (G. Shakhnarovich, P. Viola, and T. Darrell)
• Finding sensitive hash functions
Mean Shift Based Clustering in High Dimensions: A Texture Classification Example (B. Georgescu, I. Shimshoni, and P. Meer)
• Tuning the LSH parameters
• The LSH data structure is used for algorithm speedups
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G. Shakhnarovich, P. Viola, and T. Darrell
Given an image x, what are the parameters θ in this image?
i.e., the angles of the joints, the orientation of the body, etc.
Ingredients
• Input: a query image with unknown angles (parameters)
• A database of human poses with known angles
• An image feature extractor – an edge detector
• A distance metric in feature space, d_x
• A distance metric in angle space:
  d_θ(θ1, θ2) = Σ_{i=1}^{m} (1 − cos(θ1,i − θ2,i))
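The angle-space metric can be written directly (a sketch, assuming the per-joint 1 − cos(Δθ) form; angles in radians):

```python
import math

def d_theta(t1, t2):
    """Distance in angle space: sum over joints of 1 - cos(theta1_i - theta2_i).
    Zero iff all angles match modulo 2*pi, so it ignores the wrap-around."""
    return sum(1.0 - math.cos(a - b) for a, b in zip(t1, t2))

same = d_theta([0.0, math.pi / 2], [2 * math.pi, math.pi / 2])  # wrap-around: ~0
far = d_theta([0.0, 0.0], [math.pi, math.pi])                   # opposite: 4.0
```

Each joint contributes at most 2 (when the angles are opposite), so the metric is bounded and insensitive to the 0/2π discontinuity that a plain squared difference would suffer from.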
Example-based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query
(Input query → find the KNN in the database of examples → output: average angles of the KNN)
The algorithm flow
Input query → features extraction → processed query → PSH (LSH) against a database of examples → LWR (regression) → output: match
The image features
Image features are multi-scale edge histograms
[Figure: image sub-windows A, B and their edge-histogram features]
(Pipeline: Feature Extraction → PSH → LWR)
PSH: The basic assumption
There are two metric spaces here: the feature space (d_x) and the parameter space (d_θ).
We want similarity to be measured in the angle space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space.
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
[Figure: a query q and its neighborhood, mapped between the parameter space (angles) and the feature space]
Is this magic?
Parameter Sensitive Hashing (PSH)
The trick: estimate the performance of different hash functions on examples, and select those that are sensitive to d_θ.
The hash functions are applied in feature space, but the returned KNN are valid in angle space.
Training loop:
• Label pairs of examples with similar angles
• Define hash functions h on the feature space
• Predict the labeling of similar/non-similar examples by using h
• Compare the predicted labeling with the true one
• If the labeling by h is good, accept h; else change h
PSH as a classification problem
Labels (with r = 0.25): a pair of examples (x_i, x_j) is labeled
  y_ij = +1 if d_θ(θ_i, θ_j) ≤ r
  y_ij = −1 if d_θ(θ_i, θ_j) ≥ (1 + ε) · r
A binary hash function on the features:
  h_T(x) = +1 if the selected feature of x is above the threshold T, −1 otherwise
Predict the labels:
  ŷ_h(x_i, x_j) = +1 if h_T(x_i) = h_T(x_j), −1 otherwise
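The paired-classification view can be sketched with decision-stump hashes; the feature index, threshold, and example vectors below are illustrative:

```python
def stump_hash(feature_idx, threshold):
    """A binary hash on features: +1 if the selected feature clears the threshold."""
    return lambda x: 1 if x[feature_idx] >= threshold else -1

def predicted_label(h, xi, xj):
    """y_hat = +1 if h puts both examples in the same bin, -1 otherwise."""
    return 1 if h(xi) == h(xj) else -1

h = stump_hash(feature_idx=0, threshold=0.5)
p = predicted_label(h, (0.7, 0.1), (0.9, 0.3))  # both features >= 0.5 -> +1
n = predicted_label(h, (0.7, 0.1), (0.2, 0.3))  # split by the threshold -> -1
```

Training then amounts to comparing these predicted labels against the true angle-based labels y_ij over many pairs and keeping only the stumps that agree often enough.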
Find the best T that predicts the true labeling, subject to the probability constraints:
h_T should place both examples of a positive pair in the same bin, and separate the examples of a negative pair.
Local Weighted Regression (LWR)
• Given a query image x0, PSH returns its KNNs
• LWR uses these KNNs to compute a weighted estimate of the query's angles:
  β0 = argmin_β Σ_{x_i ∈ N(x0)} d_θ(g(x_i; β), θ_i) · K(d_x(x_i, x0)),
  where g is the local regression model and K is a kernel turning each neighbor's feature-space distance to the query into a weight
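As a simplified stand-in for the full LWR fit, one can kernel-weight the neighbors' angle vectors by their feature-space distance to the query (an illustrative sketch, not the paper's exact estimator; the neighbor data and bandwidth are made up):

```python
import math

def weighted_angle_average(neighbors, query, bandwidth=1.0):
    """Average the neighbors' angle vectors, weighting each neighbor by a
    Gaussian kernel of its feature-space distance to the query.
    neighbors: list of (feature_vec, angle_vec) pairs returned by PSH."""
    def K(d):
        return math.exp(-(d / bandwidth) ** 2)
    def dist(p, q):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))
    ws = [K(dist(f, query)) for f, _ in neighbors]
    total = sum(ws)
    m = len(neighbors[0][1])
    return [sum(w * ang[i] for w, (_, ang) in zip(ws, neighbors)) / total
            for i in range(m)]

nbrs = [((0.0, 0.0), [10.0, 20.0]), ((1.0, 0.0), [30.0, 40.0])]
est = weighted_angle_average(nbrs, query=(0.0, 0.0))
```

The neighbor at feature-space distance 0 dominates, so the estimate sits closer to its angles — the "smart averaging" behavior the summary slide refers to.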
Results
Synthetic data were generated:
• 13 angles: 1 for the rotation of the torso, 12 for the joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, face expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (l)
• Tested on 1,000 synthetic examples; PSH searched only 3.4% of the data per query
• Without feature selection, 40 bits and 1,000 hash tables were needed
Recall: P1 is the probability of a positive hash, P2 is the probability of a bad hash, and B is the maximal number of points in a bucket
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 1.3% of the data were searched
Results – real data: interesting mismatches
Fast pose estimation – summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Features vector 1
Features vector 2 Distance
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Norm Distance
XvuXvXui
iii
iii
ii
21
2||
Dot Product
Dot Product Distance
The full Hashing
w
bvavh ba )(
[34 82 21]1
227742
d
d random numbers
+b
phaseRandom[0w]
wDiscretization step
Features vector
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applications
• Searching video clips in databases ("Hierarchical Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression – vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short: whenever K-Nearest Neighbors (KNN) are needed
Motivation
• A variety of procedures in learning require KNN computation
• KNN search is a computational bottleneck
• LSH provides a fast approximate solution to the problem
• LSH requires hash-function construction and parameter tuning
Outline
• "Fast Pose Estimation with Parameter Sensitive Hashing", G. Shakhnarovich, P. Viola, T. Darrell
– Finding sensitive hash functions
• "Mean Shift Based Clustering in High Dimensions: A Texture Classification Example", B. Georgescu, I. Shimshoni, P. Meer
– Tuning LSH parameters
– The LSH data structure is used for algorithm speedups
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G. Shakhnarovich, P. Viola, T. Darrell
Given an image x, what are the parameters θ in this image?
i.e. angles of joints, orientation of the body, etc.
Ingredients
• Input: query image with unknown angles (parameters)
• Database of human poses with known angles
• Image feature extractor – edge detector
• Distance metric in feature space: dx
• Distance metric in angle space:
dθ(θ1, θ2) = Σ_{i=1..m} (1 - cos(θ1,i - θ2,i))
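The angle-space metric above can be written as a one-liner (a sketch; angles are assumed to be given in radians):

```python
import math

def angle_distance(theta1, theta2):
    # d(theta1, theta2) = sum_i (1 - cos(theta1_i - theta2_i)):
    # 0 for identical poses, up to 2 per joint for opposite angles.
    return sum(1 - math.cos(a - b) for a, b in zip(theta1, theta2))
```

The 1 - cos form makes the metric insensitive to the 2π wrap-around of each joint angle.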
Example-based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query
Input: query → find KNN in the database of examples → output: average angles of the KNN
The algorithm flow:
Input query → feature extraction → processed query → PSH (LSH), against the database of examples → LWR (regression) → output: match
The image features
Image features are multi-scale edge histograms.
[Figure: image regions A and B with their multi-scale edge-histogram counts]
Feature Extraction → PSH → LWR
PSH: The basic assumption
There are two metric spaces here: the feature space (dx) and the parameter space (dθ).
We want similarity to be measured in the angle space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
[Figure: a query q mapped between the parameter space (angles) and the feature space]
Is this magic?
Parameter Sensitive Hashing (PSH)
The trick:
• Estimate the performance of different hash functions on examples, and select those sensitive to dθ
• The hash functions are applied in feature space, but the KNN are valid in angle space
• Label pairs of examples with similar angles
• Define hash functions h on the feature space
• Predict the labeling of similar/non-similar examples by using h
• Compare the labelings
• If the labeling by h is good, accept h; else change h
PSH as a classification problem
Labels (r = 0.25): a pair of examples (xi, xj) is labeled
y_ij = +1 if dθ(θi, θj) ≤ r
y_ij = -1 if dθ(θi, θj) ≥ (1 + ε) r
[Figure: example image pairs labeled +1, +1, -1, -1]
A binary hash function on the features:
h_T(x) = +1 if the selected feature of x is ≥ T, -1 otherwise
Predict the labels:
ŷ_ij(h) = +1 if h_T(xi) = h_T(xj), -1 otherwise
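A sketch of such a threshold hash and the pair labels it predicts; the feature index and threshold here are arbitrary illustrations:

```python
# One-feature threshold hash: +1 above the threshold, -1 below.
def make_hash(feature_index, threshold):
    return lambda x: +1 if x[feature_index] >= threshold else -1

# Predicted pair label: +1 when both points land in the same bin.
def predicted_label(h, xi, xj):
    return +1 if h(xi) == h(xj) else -1

h = make_hash(0, 0.5)
```

Comparing these predictions against the true pair labels is what lets the algorithm accept or reject each candidate h.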
Find the best threshold T that predicts the true labeling, subject to the probability constraints.
The threshold T(x) will place both examples of a pair in the same bin, or separate them.
[Figure: distribution of a feature value, split by the threshold T(x)]
Local Weighted Regression (LWR)
• Given a query image, PSH returns KNNs
• LWR uses the KNN to compute a weighted average of the estimated angles of the query, weighting each neighbor by a kernel of its feature-space distance to the query:
θ0 = argmin Σ_{xi ∈ N(x)} dθ(θ(xi), g(xi)) · K(dx(xi, x))   (weight = kernel of distance)
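A minimal, zeroth-order sketch of this idea: a kernel-weighted average of the neighbors' angles, where neighbors closer to the query in feature space get larger weight. The Gaussian kernel and its bandwidth are illustrative assumptions, not the paper's exact choice:

```python
import math

def weighted_angles(neighbors, query, bandwidth=1.0):
    """neighbors: list of (features, angle); returns the weighted mean angle."""
    dist = lambda u, v: sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5
    # Gaussian kernel of the feature-space distance to the query.
    w = [math.exp(-(dist(f, query) / bandwidth) ** 2) for f, _ in neighbors]
    return sum(wi * ang for wi, (_, ang) in zip(w, neighbors)) / sum(w)
```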
Results
Synthetic data were generated:
• 13 angles: 1 for the rotation of the torso, 12 for the joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, face expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (l)
• Test on 1,000 synthetic examples
• PSH searched only 34 of the data per query
• Without selection, 40 bits and 1,000 hash tables would have been needed
Recall: P1 is the probability of a positive hash, P2 the probability of a bad hash, and B the maximum number of points in a bucket.
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 13 of the data were searched
Results – real data: interesting mismatches
Fast pose estimation - summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general: some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given n spheres in Rd, centered at P = {p1, …, pn}, with radii r1, …, rn
• Goal: given a query q, preprocess the points in P to find a point pi whose sphere 'covers' the query q
Courtesy of Mohamad Hegaze
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example
B. Georgescu, I. Shimshoni, P. Meer
Motivation
• Clustering high-dimensional data by using local density measurements (e.g. in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
1. Finding optimal LSH parameters
2. Data-driven partitions into buckets
3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
[Figure: a window of the given bandwidth around a point, shifted toward the local mean]
Mean-shift · LSH · optimal k,l · LSH data partition · LSH data struct
KNN in mean-shift
• The bandwidth should be inversely proportional to the density in the region: high density – small bandwidth; low density – large bandwidth
• It is based on the kth nearest neighbor of the point: the bandwidth is h_i = ||x_i - x_{i,k}||, the distance from x_i to its kth neighbor
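The adaptive-bandwidth rule can be sketched as follows (brute-force distances, for illustration only; the real algorithm obtains these neighbors via LSH):

```python
# Per-point bandwidth = distance to the k-th nearest neighbor,
# so dense regions get small bandwidths and sparse regions large ones.
def adaptive_bandwidths(points, k):
    dist = lambda u, v: sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5
    h = []
    for p in points:
        d = sorted(dist(p, q) for q in points if q is not p)
        h.append(d[k - 1])  # distance to the k-th nearest neighbor
    return h
```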
Adaptive mean-shift vs non-adaptive
Image segmentation algorithm:
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths hs (spatial) and hr (color)
3. Apply filtering
"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Image segmentation algorithm
original segmented
filtered
Filtering: pixel value of the nearest mode
Mean-shift trajectories
original squirrel filtered
original baboon filtered
Filtering examples
"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Segmentation examples
"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries, implemented with LSH
• Statistical curse of dimensionality: sparseness of the data, handled with variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (dk, vk)
• For each point x we check, for each of the K pairs, whether x_{dk} ≤ vk; the K boolean answers determine the point's cell
• It partitions the data into cells
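A sketch of the cell (bucket) key this structure computes for one partition; the coordinate/cut-value pairs below are arbitrary examples:

```python
# For one partition, K (coordinate, cut-value) pairs turn a point
# into a K-bit key; points sharing the key share a cell.
def bucket_key(x, pairs):
    """pairs: list of (d_k, v_k); returns a tuple of K booleans."""
    return tuple(x[d] <= v for d, v in pairs)

key = bucket_key([0.2, 0.9], [(0, 0.5), (1, 0.5)])
```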
Choosing the optimal K and L
• Goal: for a query q, compute the smallest number of distances to the points in its buckets
• Large K ⟹ a smaller number of points in a cell
• If L is too small, points might be missed; but if L is too big, the union of buckets C̄ might include extra points
• As L increases, the union C̄ increases, but the chance of missing a true neighbor decreases
• K determines the resolution of the data structure
Choosing optimal K and L:
• Determine accurately the KNN for m randomly-selected data points, and record the kth-NN distance (bandwidth) of each
• Choose an error threshold ε
• The optimal K and L should satisfy: the approximate kth-NN distance returned via LSH stays within the error threshold of the true one
Choosing optimal K and L:
• For each K, estimate the error
• In one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K))
[Figure: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)] with its minimum marked]
Data-driven partitions
• In the original LSH, cut values are random in the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
[Figure: bucket-occupancy distribution for uniform vs. data-driven cut points]
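The two cut-value choices can be contrasted in a few lines (a sketch: with data-driven cuts, dense regions attract more cut values, so points spread more evenly over the buckets):

```python
import random

# Original LSH: a cut value drawn uniformly from the data's range.
def uniform_cut(points, d):
    lo = min(p[d] for p in points)
    hi = max(p[d] for p in points)
    return random.uniform(lo, hi)

# Data-driven variant: a coordinate of a randomly chosen data point.
def data_driven_cut(points, d):
    return random.choice(points)[d]
```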
Additional speedup
• Assume that all points in C̄ will converge to the same mode (C̄ is like a type of aggregate)
Speedup results
65,536 points; 1,638 points sampled; k = 100
Food for thought
[Figure: low dimension vs. high dimension]
A thought for food…
• Choose K, L by sample learning, or take the traditional values?
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning requires KNN
15:30: cookies…
Summary
• LSH suggests a compromise: some accuracy is given up for a large gain in complexity
• Applications that involve massive data in high dimensions require LSH's fast performance
• Extension of LSH to different spaces (PSH)
• Learning the LSH parameters and hash functions for different applications
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
The full Hashing
h_{a,b}(v) = ⌊(a·v + b) / w⌋
where:
• v is the features vector (of dimension d)
• a is a vector of d random numbers, drawn i.i.d. from a p-stable distribution
• b is a random phase, drawn from [0, w]
• w is the discretization step
Numeric example: with a·v = 7944, phase b = +34, and step w = 100, the value 7944 + 34 = 7978 falls in the bucket [7900, 8000), i.e. h = 79.
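A hedged sketch of this hash for the L2 case, where the 2-stable distribution is the Gaussian; the dimension, step w, and seed below are illustrative choices:

```python
import math, random

def make_pstable_hash(d, w, seed=0):
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(d)]  # i.i.d. 2-stable (Gaussian)
    b = rng.uniform(0.0, w)                      # random phase in [0, w)
    def h(v):
        # h_{a,b}(v) = floor((a.v + b) / w)
        return math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
    return h

h = make_pstable_hash(d=3, w=4.0)
# Nearby vectors usually land in the same bucket; distant ones usually don't.
```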
Generalization: p-stable distributions
• L2 → Central Limit Theorem → Gaussian (normal) distribution (the 2-stable distribution)
• Lp, 0 < p ≤ 2 → Generalized Central Limit Theorem → p-stable distributions (e.g. Cauchy for L1)
P-stable summary:
• Works for the r-Nearest Neighbor problem
• Generalizes to 0 < p ≤ 2
• Improves the query time to O(d·n^(1/(1+ε)²)·log n)
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
The full Hashing
w
bvavh ba )(
+34
100
7944
7900 8000 8100 82007800
The full Hashing
w
bvavh ba )(
+34
phaseRandom[0w]
100Discretization step
7944
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
(figure: original → filtered → segmented)
Filtering: pixel value of the nearest mode
(figure: mean-shift trajectories)
Filtering examples (original vs. filtered: squirrel, baboon)
Segmentation examples
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries → implemented with LSH
• Statistical curse of dimensionality: sparseness of the data → variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (dk, vk)
• For each point x, check the K inequalities x_{dk} ≤ vk
This partitions the data into cells.
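A minimal sketch of such a boolean-partition structure, including the paper's data-driven variant where cut values are taken from the data itself (helper names are hypothetical):

```python
import random

def build_partitions(points, K, L, data_driven=True, rng=None):
    """Build L random partitions; each is a list of K tests (d_k, v_k).
    With data_driven=True, the cut value v_k is a coordinate of a random
    data point rather than uniform over the coordinate's range."""
    rng = rng or random.Random(0)
    d = len(points[0])
    partitions = []
    for _ in range(L):
        tests = []
        for _ in range(K):
            dk = rng.randrange(d)
            if data_driven:
                vk = rng.choice(points)[dk]
            else:
                lo = min(p[dk] for p in points)
                hi = max(p[dk] for p in points)
                vk = rng.uniform(lo, hi)
            tests.append((dk, vk))
        partitions.append(tests)
    return partitions

def cell_of(x, tests):
    """Hash key: the boolean vector of the K inequality tests x[d_k] <= v_k."""
    return tuple(x[dk] <= vk for dk, vk in tests)
```

Each partition maps a point to one of at most 2^K cells; a query is compared only against points sharing a cell in at least one of the L partitions.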
Choosing the optimal K and L
• For a query q, distances are computed only to the points in its buckets – the goal is to keep this number as small as possible
If L is too small, points might be missed; but if L is too big, extra points might be included.
Large K → a smaller number of points in a cell.
As L increases, the union of cells C∪ grows and fewer neighbors are missed, but the query cost increases; K determines the resolution of the data structure.
Choosing optimal K and L
Determine accurately the KNN for m randomly-selected data points and record the true KNN distance (bandwidth).
Choose an error threshold ε.
The optimal K and L should satisfy the constraint relating the approximate distance to the true one.
Choosing optimal K and L
• For each K, estimate the error for each L
• In one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K))
(figures: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)] and its minimum)
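The tuning procedure above can be sketched as a simple search. Here `error_fn` and `time_fn` stand for the measured quantities (mean relative error on the m sample queries and the empirical running time); both names are hypothetical placeholders:

```python
def minimal_L(error_fn, K, Ls, eps):
    """Smallest L whose measured approximation error on the sample queries
    drops below the threshold eps; None if no candidate L qualifies."""
    for L in sorted(Ls):
        if error_fn(K, L) <= eps:
            return L
    return None

def best_K_L(error_fn, time_fn, Ks, Ls, eps):
    """Among pairs (K, L(K)) meeting the error constraint, pick the
    fastest according to time_fn(K, L)."""
    candidates = []
    for K in Ks:
        L = minimal_L(error_fn, K, Ls, eps)
        if L is not None:
            candidates.append((time_fn(K, L), K, L))
    return min(candidates)[1:] if candidates else None
```

With measured error and time surfaces plugged in, this reproduces the two-step scheme on the slide: find L(K) per K, then minimize t(K, L(K)).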
Data driven partitions
• In the original LSH, cut values are random in the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
(figure: bucket distribution – uniform vs. data-driven cut points)
Additional speedup
Assume that all points in C∪ will converge to the same mode (C∪ acts as a type of aggregate).
Speedup results
65,536 points; 1,638 points sampled; k = 100
Food for thought
(figure: low dimension vs. high dimension)
A thought for food…
• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning requires KNN
15:30 – cookies…
Summary
• LSH suggests trading accuracy for a gain in complexity
• Applications that involve massive data in high dimensions require the fast performance of LSH
• Extension of the LSH to different spaces (PSH)
• Learning the LSH parameters and hash functions for different applications
Conclusion
• But at the end, everything depends on your data set
• Try it at home:
– Visit http://web.mit.edu/andoni/www/LSH/index.html
– Email Alex Andoni (andoni@mit.edu)
– Test it over your own data
(C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
The full Hashing
h_{a,b}(v) = ⌊(a · v + b) / w⌋
• v – the features vector, v ∈ Rd
• a – a vector with entries drawn i.i.d. from a p-stable distribution
• b – a random phase in [0, w]
• w – the discretization step
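A minimal sketch of one such hash for the L2 case, where the 2-stable distribution is the Gaussian (function and parameter names are hypothetical):

```python
import math
import random

def make_pstable_hash(d, w, rng=None):
    """One L2 hash h_{a,b}(v) = floor((a.v + b) / w):
    a has i.i.d. N(0,1) coordinates (Gaussian is 2-stable), b ~ U[0, w]."""
    rng = rng or random.Random(0)
    a = [rng.gauss(0.0, 1.0) for _ in range(d)]
    b = rng.uniform(0.0, w)

    def h(v):
        # Project onto a, shift by the random phase, quantize into width-w bins.
        return math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)

    return h
```

By 2-stability, the projection a·v1 − a·v2 is distributed as ||v1 − v2||2 times a standard Gaussian, so close points fall into the same bin with higher probability than distant ones.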
Generalization: P-Stable distribution
• L2: Central Limit Theorem → Gaussian (normal) distribution
• Lp, 0 < p ≤ 2: Generalized Central Limit Theorem → p-stable distribution (e.g. Cauchy, which is 1-stable, for L1)
P-Stable summary
• Works for ℓp and generalizes to any 0 < p ≤ 2
• Improves the query time: from O(d·n^{1/(1+ε)}·log n) to O(d·n^{1/(1+ε)²}·log n)
r-Nearest Neighbor: latest results (reported by email by Alexander Andoni)
Parameters selection (for Euclidean space)
• 90% success probability → best query-time performance
• A single projection hits an ε-nearest neighbor with Pr = p1
• k projections hit it with Pr = p1^k
• All L hashings fail to collide with Pr = (1 − p1^k)^L
• To ensure a collision (e.g. 1 − δ ≥ 90%): 1 − (1 − p1^k)^L ≥ 1 − δ, i.e. L ≥ log(δ) / log(1 − p1^k)
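The last inequality gives the number of tables directly; a one-line helper (hypothetical name) makes the calculation concrete:

```python
import math

def tables_needed(p1, k, delta):
    """Smallest L with 1 - (1 - p1**k)**L >= 1 - delta, i.e. the number of
    hash tables needed so a true neighbor collides with probability
    at least 1 - delta:  L = ceil(log(delta) / log(1 - p1**k))."""
    return math.ceil(math.log(delta) / math.log(1.0 - p1 ** k))
```

For example, with p1 = 0.9, k = 4 and δ = 0.1, three tables suffice, and two do not.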
Parameters selection…
(figure: the hash should reject non-neighbors and accept neighbors; as k grows, candidate-extraction time decreases while candidate-verification time increases)
Pros & Cons
Pros:
• Better query time than spatial data structures
• Scales well to higher dimensions and larger data sizes (sub-linear dependence)
• Predictable running time
Cons:
• Extra storage overhead
• Inefficient for data with distances concentrated around the average
• Works best for Hamming distance (although it can be generalized to Euclidean space)
• In secondary storage, a linear scan is pretty much all we can do (for high dimensions)
• Requires the radius r to be fixed in advance
From Piotr Indyk's slides
LSH – Applications
• Searching video clips in databases ("Hierarchical Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression – vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short: whenever K-Nearest Neighbors (KNN) are needed
Motivation
• A variety of procedures in learning require KNN computation
• KNN search is a computational bottleneck
• LSH provides a fast approximate solution to the problem
• LSH requires hash-function construction and parameter tuning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing, G. Shakhnarovich, P. Viola, and T. Darrell
• Finding sensitive hash functions
Mean Shift Based Clustering in High Dimensions: A Texture Classification Example, B. Georgescu, I. Shimshoni, and P. Meer
• Tuning LSH parameters
• The LSH data structure is used for algorithm speedups
The Problem
Given an image x, what are the parameters θ in this image, i.e. angles of the joints, orientation of the body, etc.?
Fast Pose Estimation with Parameter Sensitive Hashing, G. Shakhnarovich, P. Viola, and T. Darrell
Ingredients
• Input: a query image with unknown angles (parameters)
• A database of human poses with known angles
• An image feature extractor – an edge detector
• A distance metric in feature space, dx
• A distance metric in angle space: dθ(θ1, θ2) = Σ_{i=1}^{m} (1 − cos(θ1,i − θ2,i))
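The angle-space metric is easy to compute directly; a small sketch (the function name is an assumption):

```python
import math

def d_theta(t1, t2):
    """Distance in parameter (angle) space:
    d(theta1, theta2) = sum_i (1 - cos(theta1_i - theta2_i))."""
    return sum(1.0 - math.cos(a - b) for a, b in zip(t1, t2))
```

Each term is 0 for identical angles and at most 2 for opposite ones, and the metric is naturally invariant to full 2π wrap-arounds, which is why it suits joint angles better than a plain Euclidean difference.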
Example based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query
Input: a query → find its KNN in the database of examples → output: the average angles of the KNN
The algorithm flow:
input query → feature extraction → processed query → PSH (LSH) over the database of examples → LWR (regression) → output match
The image features
(figure: image sub-windows A, B and their edge histograms)
Image features are multi-scale edge histograms.
PSH: The basic assumption
There are two metric spaces here: the feature space (dx) and the parameter space (dθ).
We want similarity to be measured in the angle space, whereas LSH works in the feature space.
• Assumption: the feature space is closely related to the parameter space.
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
(figure: the mapping between the parameter space (angles) and the feature space around a query q)
Is this magic?
Parameter Sensitive Hashing (PSH)
The trick:
Estimate the performance of different hash functions on examples, and select those sensitive to dθ.
The hash functions are applied in the feature space, but the KNN are valid in the angle space.
• Label pairs of examples with similar angles
• Define hash functions h on the feature space
• Predict the labeling of similar/non-similar examples by using h
• Compare the labelings
• If the labeling by h is good, accept h; else change h
PSH as a classification problem
Labels (with r = 0.25): a pair of examples (xi, xj) is labeled
yij = +1 if dθ(θi, θj) ≤ r
yij = −1 if dθ(θi, θj) ≥ (1 + ε) r
A binary hash function on the features:
hT(x) = +1 if x ≥ T, −1 otherwise
Predict the labels:
ŷh(xi, xj) = +1 if hT(xi) = hT(xj), −1 otherwise
Find the best T that predicts the true labeling subject to the probability constraints: hT will place both examples in the same bin, or separate them.
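A minimal sketch of the labeling and of a single-feature threshold hash as described above (all names are hypothetical, and a real hash would act on one selected feature of x):

```python
def pair_label(dtheta, r, eps):
    """True label for a pair of training examples: +1 if their angle
    distance is within r, -1 if beyond (1 + eps) * r, 0 (ignored) in between."""
    if dtheta <= r:
        return 1
    if dtheta >= (1.0 + eps) * r:
        return -1
    return 0

def h_T(feature_value, T):
    """Binary threshold hash on a single feature."""
    return 1 if feature_value >= T else -1

def predicted_label(xi, xj, T):
    """+1 when the hash puts both examples in the same bin."""
    return 1 if h_T(xi, T) == h_T(xj, T) else -1
```

Selecting a hash function then amounts to choosing the feature and threshold T whose predicted labels best match the true pair labels.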
Local Weighted Regression (LWR)
• Given a query image, PSH returns the KNNs
• LWR uses the KNNs to compute a weighted average of the estimated angles of the query, weighting each neighbor by a kernel of its feature-space distance: θ0 = argmin_θ Σ_{xi ∈ N(x)} K(dx(xi, x)) · dθ(θ, θi)
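As a simplified sketch, the zeroth-order case of LWR reduces to a kernel-weighted mean of the neighbors' angles. This Gaussian-kernel version is an assumption for illustration, not the paper's exact robust formulation:

```python
import math

def lwr_estimate(query_feat, neighbors, bandwidth):
    """Kernel-weighted average of the neighbors' angles; weights come from a
    Gaussian kernel of the feature-space distance to the query.
    neighbors: list of (feature_vector, angle_vector) pairs from PSH."""
    weights = [
        math.exp(-math.dist(query_feat, f) ** 2 / (2 * bandwidth ** 2))
        for f, _ in neighbors
    ]
    total = sum(weights)
    m = len(neighbors[0][1])
    return [
        sum(w * ang[i] for w, (_, ang) in zip(weights, neighbors)) / total
        for i in range(m)
    ]
```

Neighbors far from the query in feature space contribute almost nothing, so a bad hash collision has little effect on the final angle estimate.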
Results
Synthetic data were generated:
• 13 angles: 1 for rotation of the torso, 12 for the joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, face expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (l)
• Tested on 1,000 synthetic examples
• PSH searched only 34 of the data per query
• Without the selection, 40 bits and 1,000 hash tables would have been needed
Recall: P1 is the probability of a positive hash, P2 is the probability of a bad hash, B is the max number of points in a bucket.
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 13 of the data were searched
Results – real data: interesting mismatches
Fast pose estimation – summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
The full Hashing
w
bvavh ba )(
a1 v d
iid from p-stable distribution
+b
phaseRandom[0w]
wDiscretization step
Features vector
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Generalization P-Stable distribution
bullLp p=eps2
bullGeneralized Central Limit Theorem
bullP-stable distributionCauchy for L2
bullL2
bullCentral Limit Theorem
bullGaussian (normal) distribution
P-Stable summary
bullWorks for bullGeneralizes to 0ltplt=2
bullImproves query time
Query time = O (dn1(1+)log(n) ) O (dn1(1+)^2log(n) )
r - Nearest Neighbor
Latest resultsReported in Email by
Alexander Andoni
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
• Input: query image with unknown angles (parameters)
• Database of human poses with known angles
• Image feature extractor – edge detector
• Distance metric in feature space: d_x
• Distance metric in angle space:
  d_θ(θ1, θ2) = Σ_{i=1..m} (1 − cos(θ1,i − θ2,i))
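A minimal sketch of the angle-space metric above, assuming angles are given in radians:

```python
import math

def d_theta(theta1, theta2):
    """Angle-space distance: sum over the m joints of 1 - cos(angle difference).
    Zero for identical poses; each joint contributes at most 2."""
    return sum(1.0 - math.cos(a - b) for a, b in zip(theta1, theta2))
```

Identical poses are at distance 0, and a joint rotated by π contributes the maximum of 2.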
Example based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query
Input: query → find KNN in the database of examples → output: average angles of the KNN
The algorithm flow
Input query → feature extraction → processed query → PSH (LSH) over a database of examples → LWR (regression) → output: match
The image features
Image features are multi-scale edge histograms
Roadmap: Feature Extraction → PSH → LWR
PSH: The basic assumption
There are two metric spaces here: feature space (d_x) and parameter space (d_θ).
We want similarity to be measured in the angle space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
Parameter space (angles) vs. feature space: is this magic?
Parameter Sensitive Hashing (PSH)
The trick:
• Estimate the performance of different hash functions on examples, and select those sensitive to d_θ
• The hash functions are applied in feature space, but the KNN are valid in angle space
• Label pairs of examples with similar angles
• Define hash functions h on the feature space
• Predict the labeling of similar/non-similar examples by using h
• Compare the labelings
• If the labeling by h is good, accept h; else change h

PSH as a classification problem
Labels: a pair of examples (x_i, x_j) is labeled
  y_ij = +1 if d_θ(θ_i, θ_j) < r
  y_ij = −1 if d_θ(θ_i, θ_j) > (1 + ε) r        (r = 0.25)
A binary hash function on features:
  h_{φ,T}(x) = +1 if φ(x) ≥ T, −1 otherwise

Predict the labels:
  ŷ_ij(h) = +1 if h_{φ,T}(x_i) = h_{φ,T}(x_j), −1 otherwise
Find the best T that predicts the true labeling under the probability constraints:
h_{φ,T(φ)} will place both examples in the same bin, or separate them.
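The selection loop above can be sketched as follows; the scalar feature `phi` and the toy pairs are illustrative, not the paper's actual features:

```python
def true_label(d_angles, r=0.25, eps=1.0):
    """+1 for similar poses, -1 for clearly dissimilar, 0 = ignored gray zone."""
    if d_angles < r:
        return 1
    if d_angles > (1.0 + eps) * r:
        return -1
    return 0

def h(phi, T, x):
    """Binary hash: threshold a single feature value."""
    return 1 if phi(x) >= T else -1

def score(phi, T, pairs):
    """Fraction of labeled pairs whose hash agreement predicts the true label.
    pairs: (x_i, x_j, d_theta(theta_i, theta_j)); pick the T maximizing this."""
    labeled = [(xi, xj, true_label(d)) for xi, xj, d in pairs]
    labeled = [t for t in labeled if t[2] != 0]
    hits = sum((1 if h(phi, T, xi) == h(phi, T, xj) else -1) == y
               for xi, xj, y in labeled)
    return hits / len(labeled)
```

A threshold that separates dissimilar pairs while keeping similar pairs together scores 1.0.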
Local Weighted Regression (LWR)
• Given a query image, PSH returns its KNNs
• LWR uses the KNNs to compute a weighted average of the estimated angles of the query:
  θ̂ = argmin_θ Σ_{x_i ∈ N(x)} d_θ(θ, θ_i) · K(d_x(x, x_i)),   with kernel K mapping distance to weight
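As a sketch of this step, under assumptions of my own (an exponential kernel and a per-joint circular mean; this is one plausible reading of the formula, not the paper's exact estimator):

```python
import math

def lwr_angles(query_feat, neighbors, d_x, kernel=lambda d: math.exp(-d)):
    """Kernel-weighted average of the KNNs' joint angles.
    neighbors: list of (feature_vector, angle_vector) returned by PSH."""
    w = [kernel(d_x(query_feat, f)) for f, _ in neighbors]
    m = len(neighbors[0][1])
    est = []
    for j in range(m):
        # circular mean, so that angles near -pi/+pi average sensibly
        s = sum(wi * math.sin(ang[j]) for wi, (_, ang) in zip(w, neighbors))
        c = sum(wi * math.cos(ang[j]) for wi, (_, ang) in zip(w, neighbors))
        est.append(math.atan2(s, c))
    return est
```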
Results
Synthetic data were generated:
• 13 angles: 1 for rotation of the torso, 12 for joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, face expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (l)
• Tested on 1,000 synthetic examples
• PSH searched only 3.4% of the data per query
• Without selection, 40 bits and 1,000 hash tables would be needed
Recall: P1 is the probability of a positive hash, P2 is the probability of a bad hash, B is the max number of points in a bucket
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 1.3% of the data were searched
Interesting mismatches
Fast pose estimation - summary
• A fast way to compute the angles of the human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging

Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given: n spheres in R^d, centered at P = {p1, …, pn}, with radii r1, …, rn
• Goal: given a query q, preprocess the points in P to find a point pi whose sphere 'covers' the query q
Courtesy of Mohamad Hegaze
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example
B. Georgescu, I. Shimshoni, and P. Meer

Motivation
• Clustering high-dimensional data by using local density measurements (e.g. in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
  1. Finding optimal LSH parameters
  2. Data-driven partitions into buckets
  3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell (bandwidth h)
Roadmap: mean-shift → LSH → optimal k, l → LSH data partition → LSH data structure
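In a nutshell indeed; a flat-kernel sketch of the iteration (window radius = bandwidth h). This is a generic illustration, not the paper's adaptive variant:

```python
def mean_shift_mode(x, points, h, iters=100):
    """Repeatedly move x to the mean of the points within distance h
    (flat kernel); x converges to a local density mode."""
    for _ in range(iters):
        window = [p for p in points
                  if sum((a - b) ** 2 for a, b in zip(x, p)) ** 0.5 <= h]
        if not window:
            return x
        x = tuple(sum(coord) / len(window) for coord in zip(*window))
    return x
```

Starting points in the basin of a cluster converge to that cluster's mode, which is what the clustering step exploits.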
KNN in mean-shift
• The bandwidth should be inversely proportional to the density in the region:
  high density – small bandwidth; low density – large bandwidth
• Based on the kth nearest neighbor of the point: the bandwidth is h_i = ||x_i − x_{i,k}||
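A brute-force sketch of this rule (in the real algorithm the LSH structure replaces the exhaustive neighbor scan):

```python
def adaptive_bandwidths(points, k):
    """h_i = distance from x_i to its k-th nearest neighbor:
    small in dense regions, large in sparse ones."""
    hs = []
    for i, p in enumerate(points):
        dists = sorted(
            sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
            for j, q in enumerate(points) if j != i
        )
        hs.append(dists[k - 1])
    return hs
```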
Adaptive mean-shift vs non-adaptive
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths: hs (spatial), hr (color)
3. Apply filtering
Mean-Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Image segmentation algorithm: original → filtered → segmented
Filtering: pixel value of the nearest mode
Mean-shift trajectories
Filtering examples: original squirrel → filtered; original baboon → filtered
Segmentation examples
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries – implemented with LSH
• Statistical curse of dimensionality: sparseness of the data – variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (d_k, v_k)
• For each point, check whether x_{d_k} ≤ v_k; the K answers select a cell
• It partitions the data into cells
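A sketch of this structure, with cut values drawn uniformly over each coordinate's data range (names are illustrative):

```python
import random
from collections import defaultdict

def build_tables(points, K, L, seed=0):
    """L random partitions; each uses K (coordinate, cut-value) pairs.
    A point's K-bit key (x[d_k] <= v_k) selects its cell."""
    rng = random.Random(seed)
    d = len(points[0])
    lo = [min(p[j] for p in points) for j in range(d)]
    hi = [max(p[j] for p in points) for j in range(d)]
    tables = []
    for _ in range(L):
        dims = [rng.randrange(d) for _ in range(K)]
        cuts = [(j, rng.uniform(lo[j], hi[j])) for j in dims]
        buckets = defaultdict(list)
        for i, p in enumerate(points):
            buckets[tuple(p[j] <= v for j, v in cuts)].append(i)
        tables.append((cuts, buckets))
    return tables

def candidate_union(tables, q):
    """The union set: all points sharing a cell with q in at least one table."""
    out = set()
    for cuts, buckets in tables:
        out.update(buckets.get(tuple(q[j] <= v for j, v in cuts), ()))
    return out
```

Querying with any stored point always returns at least that point, since it hashes to its own cell in every table.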
Choosing the optimal K and L
• For a query q, compute the smallest number of distances to points in its buckets
• Large K → a smaller number of points in a cell
• If L is too small, points might be missed; but if L is too big, the union of cells C∪ might include extra points
As L increases, C∪ increases but C∩ decreases; K determines the resolution of the data structure
Choosing optimal K and L
• Determine accurately the KNN distance (bandwidth) for m randomly selected data points
• Choose an error threshold ε
• The optimal K and L should keep the approximate distance within the threshold
Choosing optimal K and L
• For each K, estimate the error; in one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K)) over K
Plots: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)]; minimum
Data-driven partitions
• In the original LSH, cut values are random in the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
• Bucket distribution: uniform vs. data-driven cut points
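The suggested change is a one-liner (sketch): the cut follows the data density, so buckets are filled more evenly:

```python
import random

def data_driven_cut(points, rng):
    """Cut value = a random coordinate of a random data point,
    instead of a uniform draw over the coordinate's range."""
    j = rng.randrange(len(points[0]))
    return j, rng.choice(points)[j]
```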
Additional speedup
Assume that all points in C∩ will converge to the same mode (C∩ acts as a type of aggregate)
Speedup results
65,536 points; 1,638 points sampled; k = 100
Food for thought: low dimension vs. high dimension

A thought for food…
• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• A thought for food: does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning requires KNN
15:30 cookies…
Summary
• LSH suggests a compromise on accuracy for a gain in complexity
• Applications that involve massive data in high dimension require the fast performance of LSH
• Extension of LSH to different spaces (PSH)
• Learning the LSH parameters and hash functions for different applications
Conclusion
• ...but at the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni (andoni@mit.edu)
  – Test over your own data (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
P-Stable summary
• Works for ℓ2; generalizes to 0 < p ≤ 2
• Improves query time: from O(d·n^{1/(1+ε)} log n) to O(d·n^{1/(1+ε)²} log n)

Latest results for the r-nearest-neighbor problem, reported in email by Alexander Andoni
Parameters selection (for Euclidean space)
• 90% probability ↔ best query-time performance
• A single projection hits an ε-nearest neighbor with Pr = p1
• k projections hit an ε-nearest neighbor with Pr = p1^k
• L hashings fail to collide with Pr = (1 − p1^k)^L
• To ensure collision (e.g. 1 − δ ≥ 90%):
  1 − (1 − p1^k)^L ≥ 1 − δ  ⇒  L ≥ log(δ) / log(1 − p1^k)
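The bound above can be evaluated directly; a minimal sketch:

```python
import math

def min_tables(p1, k, delta):
    """Smallest L guaranteeing a collision with probability >= 1 - delta:
    1 - (1 - p1**k)**L >= 1 - delta  =>  L >= log(delta) / log(1 - p1**k)."""
    return math.ceil(math.log(delta) / math.log(1.0 - p1 ** k))
```

For example, p1 = 0.9 and k = 10 give p1^k ≈ 0.35, so a handful of tables already reaches 90% collision probability.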
Parameters selection: choosing k trades off candidate-extraction time against candidate-verification time (accept neighbors, reject non-neighbors)
Pros & Cons
Pros:
• Better query time than spatial data structures
• Scales well to higher dimensions and larger data sizes (sub-linear dependence)
• Predictable running time
Cons:
• Extra storage overhead
• Inefficient for data with distances concentrated around the average
• Works best for Hamming distance (although it can be generalized to Euclidean space)
• In secondary storage, a linear scan is pretty much all we can do (for high dimension)
• Requires the radius r to be fixed in advance
From Piotr Indyk's slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Parameters selection
bull90 Probability Best quarry time performance
For Euclidean Space
Parameters selectionhellip
For Euclidean Space
bullSingle projection hit an - Nearest Neighbor with Pr=p1
bullk projections hits an - Nearest Neighbor with Pr=p1k
bullL hashings fail to collide with Pr=(1-p1k)L
bullTo ensure Collision (eg 1-δge90)
bull1( -1-p1k)Lge 1-δ)1log(
)log(
1kp
L
L
Reject Non-NeighborsAccept Neighbors
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Parameters selection…
For Euclidean Space:
• A single projection hits an ε-Nearest Neighbor with Pr = p1
• k projections (one k-bit hash function) hit an ε-Nearest Neighbor with Pr = p1^k
• All L hash tables fail to collide with Pr = (1 - p1^k)^L
• To ensure a collision with probability at least 1 - δ (e.g. ≥ 90%):
  1 - (1 - p1^k)^L ≥ 1 - δ  ⇒  L ≥ log(δ) / log(1 - p1^k)
Reject Non-Neighbors / Accept Neighbors
…Parameters selection
[Figure: query time as a function of k, split into candidate-extraction time and candidate-verification time]
Pros & Cons
Pros:
• Better query time than spatial data structures
• Scales well to higher dimensions and larger data sizes (sub-linear dependence)
• Predictable running time
Cons:
• Extra storage overhead
• Inefficient for data with distances concentrated around the average
• Works best for Hamming distance (although it can be generalized to Euclidean space)
• In secondary storage, a linear scan is pretty much all we can do (for high dimensions)
• Requires the radius r to be fixed in advance
(From Piotr Indyk's slides)
Conclusion
• …but at the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni: andoni@mit.edu
  – Test it over your own data
  (C code, under Red Hat Linux)
LSH - Applications
• Searching video clips in databases ("Hierarchical Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression – vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short: whenever K-Nearest Neighbors (KNN) are needed
Motivation
• A variety of procedures in learning require KNN computation
• KNN search is a computational bottleneck
• LSH provides a fast approximate solution to the problem
• LSH requires hash function construction and parameter tuning
Outline
"Fast Pose Estimation with Parameter Sensitive Hashing", G. Shakhnarovich, P. Viola, and T. Darrell
• Finding sensitive hash functions
"Mean Shift Based Clustering in High Dimensions: A Texture Classification Example", B. Georgescu, I. Shimshoni, and P. Meer
• Tuning LSH parameters
• The LSH data structure is used for algorithm speedups
Fast Pose Estimation with Parameter Sensitive Hashing
G. Shakhnarovich, P. Viola, and T. Darrell

The Problem
Given an image x, what are the parameters θ in this image, i.e. angles of joints, orientation of the body, etc.?
Ingredients
• Input: query image with unknown angles (parameters)
• Database of human poses with known angles
• Image feature extractor – edge detector
• Distance metric in feature space: d_x
• Distance metric in angles space:
  d_θ(θ1, θ2) = Σ_{i=1..m} (1 - cos(θ1,i - θ2,i))
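The angle-space metric above is easy to state in code. A minimal sketch in plain Python; the pose vectors below are illustrative:

```python
import math

def d_theta(t1, t2):
    """Angle-space distance: sum of 1 - cos(difference) per joint.
    Zero for identical poses; each joint contributes at most 2."""
    return sum(1.0 - math.cos(a - b) for a, b in zip(t1, t2))

print(d_theta([0.0, 0.5], [0.0, 0.5]))   # identical poses -> 0.0
print(d_theta([0.0], [math.pi]))         # opposite angle -> 2.0
```

Using 1 - cos makes the metric insensitive to full-turn wraparound, which a plain squared difference of angles would get wrong.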
Example based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query

Input: query → find KNN in the database of examples → output: average angles of the KNN
The algorithm flow:
Input Query → Features extraction → Processed query → PSH (LSH) against the database of examples → LWR (Regression) → Output: Match
The image features
Image features are multi-scale edge histograms.
[Figure: multi-scale edge-histogram features for two image regions A and B]
PSH: The basic assumption
There are two metric spaces here: the feature space (d_x) and the parameter space (d_θ). We want similarity to be measured in the angles space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
[Figure: a query q mapped between the parameters space (angles) and the feature space]
Is this Magic?

Parameter Sensitive Hashing (PSH)
The trick: estimate the performance of different hash functions on examples, and select those sensitive to d_θ. The hash functions are applied in feature space, but the KNN are valid in angle space.
• Label pairs of examples with similar angles
• Define hash functions h on the feature space
• Predict the labeling of similar/non-similar examples by using h
• Compare the labelings
• If the labeling by h is good, accept h; else change h
PSH as a classification problem
[Figure: example pairs labeled +1, +1, -1, -1 at r = 0.25]
A pair of examples (x_i, θ_i), (x_j, θ_j) is labeled:
  y_ij = +1 if d_θ(θ_i, θ_j) ≤ r
  y_ij = -1 if d_θ(θ_i, θ_j) ≥ (1 + ε) r
A binary hash function on features:
  h_T(x) = +1 if x ≥ T, -1 otherwise
Predict the labels:
  ŷ_ij = +1 if h_T(x_i) = h_T(x_j), -1 otherwise
Find the best threshold T that predicts the true labeling within the probability constraints: h_T will place both examples of a pair in the same bin, or separate them.
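The threshold selection above can be sketched as a toy search over candidate cuts on a single feature, scored by how well they reproduce the pair labels. All names and data here are illustrative, not from the paper:

```python
def pair_accuracy(T, pairs):
    """Fraction of labeled pairs (fi, fj, y) that threshold T classifies
    correctly: both features on the same side of T predicts y = +1."""
    correct = 0
    for fi, fj, y in pairs:
        y_hat = 1 if (fi >= T) == (fj >= T) else -1
        correct += (y_hat == y)
    return correct / len(pairs)

def best_threshold(candidates, pairs):
    """Pick the candidate cut that best predicts the true pair labels."""
    return max(candidates, key=lambda T: pair_accuracy(T, pairs))

# Toy data: one feature value per image; pairs labeled by angle similarity
pairs = [(0.1, 0.2, 1), (0.8, 0.9, 1), (0.1, 0.9, -1), (0.2, 0.8, -1)]
T = best_threshold([0.0, 0.5, 1.0], pairs)
print(T)  # 0.5: separates the dissimilar pairs, keeps similar ones together
```

In the real algorithm each accepted threshold becomes one bit of a parameter-sensitive hash function, and the selection is repeated over many features.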
Local Weighted Regression (LWR)
• Given a query image, PSH returns the KNNs
• LWR uses the KNN to compute a weighted average of the estimated angles of the query (distance → weight):
  θ(x) = g(x; β0), where β0 = argmin_β Σ_{xi ∈ N(x)} d_θ(g(xi; β), θi) · K(d_x(x, xi))
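In the simplest case (a locally constant model g) the objective above reduces to a kernel-weighted average of the neighbors' angles. A sketch under that assumption, using a circular mean so angles near ±π average correctly; all names and values are illustrative:

```python
import math

def weighted_angle_average(angles, dists, bandwidth=1.0):
    """Kernel-weighted circular mean of neighbor angles.
    dists are feature-space distances from the query to each neighbor."""
    w = [math.exp(-(d / bandwidth) ** 2) for d in dists]
    s = sum(wi * math.sin(a) for wi, a in zip(w, angles))
    c = sum(wi * math.cos(a) for wi, a in zip(w, angles))
    return math.atan2(s, c)

# Two close neighbors agree; one far neighbor disagrees but gets tiny weight
est = weighted_angle_average([0.1, -0.1, 3.0], [0.2, 0.2, 5.0])
print(round(est, 3))  # ~0.0: dominated by the two nearby neighbors
```

Weighting by feature-space distance is what makes the estimate robust to the occasional bad neighbor that PSH may return.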
Results
Synthetic data were generated:
• 13 angles: 1 for rotation of the torso, 12 for joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, face expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (l)
• Test on 1,000 synthetic examples
• PSH searched only 3.4% of the data per query
• Without the feature selection, 40 bits and 1,000 hash tables would be needed
Recall: P1 is the probability of a positive hash, P2 is the probability of a bad hash, B is the max number of points in a bucket.
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 13 of the data were searched
[Figure: example matches on real data, and some interesting mismatches]
Fast pose estimation - summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging

Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given n spheres in R^d, centered at P = {p1, …, pn} with radii r1, …, rn
• Goal: given a query q, preprocess the points in P to find a point pi whose sphere covers the query q
[Figure: query q inside the sphere of radius ri around pi]
Courtesy of Mohamad Hegaze
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example
B. Georgescu, I. Shimshoni, and P. Meer

Motivation
• Clustering high-dimensional data by using local density measurements (e.g. in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
  1. Finding optimal LSH parameters
  2. Data-driven partitions into buckets
  3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
[Figure: a point and its bandwidth window; the window mean shifts toward the density mode]
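The nutshell picture can be sketched in a few lines: move a point to the mean of its window and repeat. A 1-D toy with a flat kernel of fixed bandwidth; all data here is illustrative:

```python
def mean_shift(x, points, bandwidth, iters=50):
    """Repeatedly move x to the mean of the points within the bandwidth
    window; x converges to a local density mode."""
    for _ in range(iters):
        window = [p for p in points if abs(p - x) <= bandwidth]
        if not window:
            break
        x = sum(window) / len(window)
    return x

data = [1.0, 1.1, 1.2, 5.0, 5.1]   # two clusters
print(round(mean_shift(0.9, data, bandwidth=1.0), 2))   # -> 1.1
print(round(mean_shift(5.3, data, bandwidth=1.0), 2))   # -> 5.05
```

Each starting point converges to the mode of its own cluster; grouping points by their final mode is exactly mean-shift clustering.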
KNN in mean-shift
The bandwidth should be inversely proportional to the density in the region: high density → small bandwidth; low density → large bandwidth. The bandwidth of each point is based on its kth nearest neighbor.
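A sketch of that adaptive choice, taking each point's bandwidth to be its distance to its kth nearest neighbor (one common form of the rule; the data and the exact norm are illustrative):

```python
def knn_bandwidth(points, k):
    """Per-point bandwidth = distance to the k-th nearest neighbor,
    so dense regions get small windows and sparse regions large ones."""
    bw = []
    for x in points:
        dists = sorted(abs(x - p) for p in points if p is not x)
        bw.append(dists[k - 1])
    return bw

data = [1.0, 1.1, 1.2, 5.0]
print(knn_bandwidth(data, k=2))   # dense points get ~0.2, the outlier ~3.9
```

Computing these k-NN distances for every point is precisely the expensive step that the LSH data structure below accelerates.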
Adaptive mean-shift vs non-adaptive
Image segmentation algorithm:
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths hs (spatial) and hr (color)
3. Apply filtering
[Figure: 3D feature-space view. From "Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02]

Image segmentation algorithm
[Figure: original, filtered, and segmented images; mean-shift trajectories]
Filtering: each pixel takes the value of its nearest mode.
Filtering examples
[Figure: squirrel and baboon images, original vs. filtered]

Segmentation examples
(Figures from "Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02)
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries → implemented with LSH
• Statistical curse of dimensionality: sparseness of the data → variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (d_k, v_k)
• For each point x, test whether x_{d_k} ≤ v_k for each of the K pairs; the resulting K-bit vector is the point's bucket in that partition
• This partitions the data into cells
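The structure above can be sketched compactly. Plain Python; the coordinate/threshold choices and the sample points are illustrative:

```python
import random
from collections import defaultdict

def build_lsh(points, K, L, dim, seed=0):
    """L partitions; each is K (coordinate, threshold) pairs.
    A point's key in a partition is the K-bit vector of tests x[d] <= v."""
    rng = random.Random(seed)
    cuts = [[(rng.randrange(dim), rng.uniform(0, 1)) for _ in range(K)]
            for _ in range(L)]
    tables = [defaultdict(list) for _ in range(L)]
    for x in points:
        for table, pairs in zip(tables, cuts):
            key = tuple(x[d] <= v for d, v in pairs)
            table[key].append(x)
    return cuts, tables

def query(q, cuts, tables):
    """Union of the query's buckets over all L partitions."""
    out = []
    for table, pairs in zip(tables, cuts):
        out.extend(table[tuple(q[d] <= v for d, v in pairs)])
    return out

pts = [(0.1, 0.2), (0.15, 0.25), (0.9, 0.8)]
cuts, tables = build_lsh(pts, K=4, L=3, dim=2)
print(query((0.12, 0.22), cuts, tables))  # nearby points are likely returned
```

The union of the L buckets is the candidate set; only these candidates get exact distance computations.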
Choosing the optimal K and L
• Goal: for a query q, compute the smallest possible number of distances to the points in its buckets
• If L is too small, neighbor points might be missed; but if L is too big, extra points might be included
• Large K → a smaller number of points in each cell
• As L increases, the union of the query's cells C∪ increases, but their intersection C∩ decreases; C∩ determines the resolution of the data structure
Choosing optimal K and L
• Determine accurately the KNN distance (bandwidth) for m randomly-selected data points
• Choose an error threshold ε
• The optimal K and L should satisfy that the approximate distance is within (1 + ε) of the true distance
Choosing optimal K and L
• For each K, estimate the error; in one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K))
[Figure: approximation error for (K, L); L(K) for ε = 0.05; running time t[K, L(K)] with its minimum marked]
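The selection loop above can be sketched as follows; the error and time estimators are illustrative stand-ins for measurements on the m sampled points:

```python
def choose_k_l(error, time_cost, Ks, Ls, eps):
    """For each K, find the minimal L meeting the error constraint,
    then return the (K, L) pair with the smallest running time."""
    best = None
    for K in Ks:
        L_K = next((L for L in Ls if error(K, L) <= eps), None)
        if L_K is None:
            continue  # no L meets the constraint for this K
        t = time_cost(K, L_K)
        if best is None or t < best[2]:
            best = (K, L_K, t)
    return best

# Illustrative stand-ins: error falls as K*L grows, time grows with K*L
err = lambda K, L: 1.0 / (K * L)
cost = lambda K, L: K * L
print(choose_k_l(err, cost, Ks=[5, 10, 20], Ls=range(1, 100), eps=0.05))
```

In the paper's scheme the same idea is used with measured quantities: the error comes from comparing approximate and exact KNN distances on the sample, and one pass over the data evaluates all L's per K.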
Data driven partitions
• In the original LSH, cut values are drawn at random from the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
[Figure: points-per-bucket distribution, uniform vs. data-driven cuts]
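The two ways of drawing a cut can be contrasted in a few lines (an illustrative sketch; function names are not from the paper):

```python
import random

def uniform_cut(points, dim, rng):
    """Original LSH: random coordinate, threshold uniform over its range."""
    d = rng.randrange(dim)
    vals = [p[d] for p in points]
    return d, rng.uniform(min(vals), max(vals))

def data_driven_cut(points, dim, rng):
    """Suggested variant: take the threshold from an actual data point,
    so cuts concentrate where the data does."""
    d = rng.randrange(dim)
    return d, rng.choice(points)[d]

rng = random.Random(0)
pts = [(0.01, 0.02), (0.02, 0.03), (0.03, 0.01), (0.9, 0.95)]
d, v = data_driven_cut(pts, dim=2, rng=rng)
print(d, v)   # the cut value is always a coordinate of some data point
```

With clustered data, uniform cuts often fall in empty regions and produce very uneven buckets; data-driven cuts track the empirical distribution and even out the points-per-bucket histogram, as the figure above illustrates.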
Additional speedup
• Assume that all points in C∩ will converge to the same mode (C∩ acts as a type of aggregate)
[Figure: the intersection cell C∩ around a point and its mode]
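Under that assumption, each intersection cell needs only one mean-shift run. A cache sketch; the cell function and mode finder are assumed given, and the toy stand-ins below are illustrative:

```python
def cluster_with_cache(points, cell_of, find_mode):
    """Run the (expensive) mode search once per cell and reuse the
    result for every other point that falls in the same cell."""
    cache = {}
    modes = []
    for x in points:
        key = cell_of(x)
        if key not in cache:
            cache[key] = find_mode(x)
        modes.append(cache[key])
    return modes

# Toy stand-ins: the cell is the integer part, the "mode" is the cell center
modes = cluster_with_cache([0.2, 0.4, 1.7, 1.9],
                           cell_of=lambda x: int(x),
                           find_mode=lambda x: int(x) + 0.5)
print(modes)   # [0.5, 0.5, 1.5, 1.5]
```

The speedup comes from skipping the iterative mode search for all but one representative per cell, at the cost of a small approximation when a cell straddles two basins of attraction.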
Speedup results
65,536 points; 1,638 points sampled; k = 100
[Figure: speedup in low dimension vs. high dimension]

Food for thought
A thought for food…
• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality, or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning requires KNN

15:30 – cookies…
Summary
• LSH trades some accuracy for a large gain in complexity
• Applications that involve massive data in high dimensions require LSH's fast performance
• Extension of LSH to different spaces (PSH)
• Learning the LSH parameters and hash functions for different applications
Conclusion
• …but at the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni: andoni@mit.edu
  – Test it over your own data
  (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
hellipParameters selection
K
k
time Candidates verification Candidates extraction
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Better Query Time than Spatial Data Structures
Scales well to higher dimensions and larger data size ( Sub-linear dependence )
Predictable running time
Extra storage over-head
Inefficient for data with distances concentrated around average
works best for Hamming distance (although can be generalized to Euclidean space)
In secondary storage linear scan is pretty much all we can do (for high dim)
requires radius r to be fixed in advance
Pros amp Cons
From Pioter Indyk slides
Conclusion
bullbut at the endeverything depends on your data set
bullTry it at homendashVisit
httpwebmiteduandoniwwwLSHindexhtml
ndashEmail Alex AndoniAndonimitedundashTest over your own data
(C code under Red Hat Linux )
LSH - Applicationsbull Searching video clips in databases (Hierarchical Non-Uniform Locality Sensitive
Hashing and Its Application to Video Identificationldquo Yang Ooi Sun)
bull Searching image databases (see the following)
bull Image segmentation (see the following)
bull Image classification (ldquoDiscriminant adaptive Nearest Neighbor Classificationrdquo T Hastie R Tibshirani)
bull Texture classification (see the following)
bull Clustering (see the following)
bull Embedding and manifold learning (LLE and many others)
bull Compression ndash vector quantizationbull Search engines (ldquoLSH Forest SelfTuning Indexes for Similarity Searchrdquo M Bawa T Condie P Ganesanrdquo)
bull Genomics (ldquoEfficient Large-Scale Sequence Comparison by Locality-Sensitive Hashingrdquo J Buhler)
bull In short whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths hs (spatial) and hr (color)
3. Apply filtering
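Step 1 above can be sketched as follows. Dividing the spatial part by hs and the color part by hr, so that a single window applies in the joint space, is a common convention assumed here rather than quoted from the paper:

```python
# Sketch of building the joint 5D feature space for segmentation:
# 2 spatial coordinates scaled by hs, 3 color channels scaled by hr
# (normalization convention assumed, not quoted from the paper).
def joint_features(image, hs, hr):
    """image: 2D list of (r, g, b) tuples -> list of 5D feature vectors."""
    feats = []
    for y, row in enumerate(image):
        for x, (r, g, b) in enumerate(row):
            feats.append((x / hs, y / hs, r / hr, g / hr, b / hr))
    return feats
```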
Mean Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Image segmentation algorithm
[Figure: original, filtered, and segmented images]
Filtering: pixel value of the nearest mode
Mean-shift trajectories
[Figure: mean-shift trajectories in feature space]
Filtering examples
[Figure: original and filtered squirrel and baboon images]
Mean Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Segmentation examples
Mean Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries → implemented with LSH
• Statistical curse of dimensionality: sparseness of the data → variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (dk, vk)
• For each point x, check the K inequalities: is coordinate dk of x below the cut value vk?
• This partitions the data into cells
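The structure just described might be sketched like this; the exact cut-value distribution and bookkeeping in the paper differ, so treat `make_partitions` and `cell_key` as illustrative names:

```python
import random

# Sketch of the cell structure above (details assumed): each of the L
# partitions is a list of K (coordinate, threshold) pairs; a point's key
# in a partition is the K-bit pattern of its threshold tests.
def make_partitions(points, K, L, seed=0):
    rng = random.Random(seed)
    d = len(points[0])
    lo = [min(p[j] for p in points) for j in range(d)]
    hi = [max(p[j] for p in points) for j in range(d)]
    return [
        [(j, rng.uniform(lo[j], hi[j]))
         for j in (rng.randrange(d) for _ in range(K))]
        for _ in range(L)
    ]

def cell_key(x, partition):
    """K boolean threshold tests identify x's cell in this partition."""
    return tuple(x[j] < v for j, v in partition)
```

Points that share a key in some partition fall in the same cell of that partition, which is the bucket searched at query time.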
Choosing the optimal K and L
• Goal: for a query q, keep small the number of distance computations to the points in its buckets
• Large K: smaller number of points in a cell
• If L is too small, points might be missed; but if L is too big, extra points might be included
• As L increases, the set of retrieved points grows, but efficiency decreases; K determines the resolution of the data structure
Choosing optimal K and L
• Determine accurately the KNN distance (bandwidth) for m randomly-selected data points
• Choose an error threshold ε
• The optimal K and L should satisfy: the approximate distance is within the threshold of the true distance
Choosing optimal K and L
• For each K, estimate the error; in one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K))
[Figure: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)] and its minimum]
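The tuning loop above can be sketched as follows. `exact_knn` and `lsh_knn` are assumed stand-ins returning the true and approximate k-NN distances for a query; for each K the minimal L whose mean relative error is below eps is L(K), and the fastest feasible (K, L) pair wins:

```python
import time

# Sketch of the K/L tuning procedure (helper names assumed): scan L in
# increasing order, take the first feasible L per K, keep the fastest pair.
def tune(sample, Ks, Ls, eps, exact_knn, lsh_knn):
    best = None  # (time, K, L)
    for K in Ks:
        for L in sorted(Ls):
            t0 = time.perf_counter()
            errs = [(lsh_knn(q, K, L) - exact_knn(q)) / exact_knn(q)
                    for q in sample]
            t = time.perf_counter() - t0
            if sum(errs) / len(errs) <= eps:  # first feasible L is L(K)
                if best is None or t < best[0]:
                    best = (t, K, L)
                break
    return (best[1], best[2]) if best else None
```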
Data-driven partitions
• In the original LSH, cut values are uniformly random in the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
[Figure: points-per-bucket distribution, uniform vs. data-driven cuts]
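The two cut-value strategies can be contrasted in a couple of lines: uniform cuts sample the value range, while data-driven cuts reuse a coordinate of a random data point, so cuts concentrate where the data does and buckets come out more balanced (a minimal sketch, not the paper's implementation):

```python
import random

# Uniform cut: any value in the data's range is equally likely.
def uniform_cut(values, rng):
    return rng.uniform(min(values), max(values))

# Data-driven cut: reuse a coordinate of a randomly chosen data point,
# so the cut distribution follows the data distribution.
def data_driven_cut(values, rng):
    return rng.choice(values)
```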
Additional speedup
• Assume that all points in a cell C will converge to the same mode (C acts as a type of aggregate)
Speedup results
65,536 points; 1,638 points sampled; k = 100
Food for thought
[Figure: low-dimension vs. high-dimension comparison]
A thought for food…
• Choose K, L by sample learning, or take the traditional values?
• Can one estimate K, L without sampling?
• A thought for food: does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning requires KNN
15:30 – cookies…
Summary
• LSH trades some accuracy for a large gain in complexity
• Applications that involve massive data in high dimension require LSH's fast performance
• Extension of LSH to different spaces (PSH)
• Learning the LSH parameters and hash functions for different applications
Conclusion
• …but in the end, everything depends on your data set
• Try it at home:
– Visit http://web.mit.edu/andoni/www/LSH/index.html
– Email Alex Andoni (andoni@mit.edu)
– Test over your own data
(C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
LSH – Applications
• Searching video clips in databases ("Hierarchical, Non-Uniform Locality Sensitive Hashing and Its Application to Video Identification", Yang, Ooi, Sun)
• Searching image databases (see the following)
• Image segmentation (see the following)
• Image classification ("Discriminant Adaptive Nearest Neighbor Classification", T. Hastie, R. Tibshirani)
• Texture classification (see the following)
• Clustering (see the following)
• Embedding and manifold learning (LLE and many others)
• Compression – vector quantization
• Search engines ("LSH Forest: Self-Tuning Indexes for Similarity Search", M. Bawa, T. Condie, P. Ganesan)
• Genomics ("Efficient Large-Scale Sequence Comparison by Locality-Sensitive Hashing", J. Buhler)
• In short: whenever K-Nearest Neighbors (KNN) are needed
Motivation
bull A variety of procedures in learning require KNN computation
bull KNN search is a computational bottleneck
bull LSH provides a fast approximate solution to the problem
bull LSH requires hash function construction and parameter tunning
Outline
Fast Pose Estimation with Parameter Sensitive Hashing G Shakhnarovich P Viola and T Darrell
bull Finding sensitive hash functions
Mean Shift Based Clustering in HighDimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
bull Tuning LSH parametersbull LSH data structure is used for algorithm
speedups
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Image segmentation algorithm
(figure: original, filtered, and segmented versions of an image, with the mean-shift trajectories)
Filtering: each pixel takes the value of its nearest mode
Filtering examples
(figures: original vs. filtered — squirrel, baboon)
Mean Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Segmentation examples
Mean Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries → implemented with LSH
• Statistical curse of dimensionality: sparseness of the data → variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (d_k, v_k) — a coordinate index and a cut value
• For each point x_i, check whether x_{i,d_k} ≤ v_k for k = 1, …, K
• This partitions the data into cells
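The cell structure above can be sketched as follows; the helper names are illustrative, and the cut values here are drawn uniformly over the data range (the data-driven variant comes later):

```python
import numpy as np
from collections import defaultdict

def build_partitions(data, K, L, rng):
    """L random partitions, each defined by K (coordinate, cut value) pairs."""
    lo, hi = data.min(axis=0), data.max(axis=0)
    tables = []
    for _ in range(L):
        dims = rng.integers(0, data.shape[1], size=K)   # d_k: coordinate indices
        cuts = rng.uniform(lo[dims], hi[dims])          # v_k: cut values
        buckets = defaultdict(list)
        for i, x in enumerate(data):
            key = tuple(x[dims] <= cuts)                # K boolean tests -> cell id
            buckets[key].append(i)
        tables.append((dims, cuts, buckets))
    return tables

def query_candidates(q, tables):
    """Union of the L cells that contain the query."""
    out = set()
    for dims, cuts, buckets in tables:
        out.update(buckets.get(tuple(q[dims] <= cuts), []))
    return out

rng = np.random.default_rng(2)
data = rng.uniform(0.0, 1.0, (200, 5))
tables = build_partitions(data, K=4, L=8, rng=rng)
cand = query_candidates(data[0], tables)
```

The candidate set returned for a query is exactly the union of the cells it falls into, one per partition.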
Choosing the optimal K and L
• For a query q, we want to compute as few distances as possible to the points in its buckets
• Large K → a smaller number of points in a cell
• If L is too small, points might be missed; but if L is too big, extra points might be included
• The neighborhood of a query is the union of the cells containing it over the L partitions, C = ∪_l C_l
• The intersection cell C∩ determines the resolution of the data structure
• As L increases, the union C∪ increases but the intersection C∩ decreases
Choosing optimal K and L
• Determine accurately the KNN (at the bandwidth distance) for m randomly selected data points
• Choose an error threshold ε
• The optimal K and L should satisfy: the approximate k-NN distance stays within (1 + ε) of the exact one
Choosing optimal K and L
• For each K, estimate the error for each L
• In one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K))
(figures: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)] — the minimum gives the chosen pair)
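The error estimate that drives this selection can be sketched as follows; `candidates_fn` stands in for the bucket lookup of the LSH structure (an illustrative name, not from the paper):

```python
import numpy as np

def knn_distance_error(data, queries, k, candidates_fn):
    """Mean relative error of the approximate k-th NN distance vs. the exact one."""
    errs = []
    for q in queries:
        exact = np.sort(np.linalg.norm(data - q, axis=1))[k]
        cand = data[sorted(candidates_fn(q))]
        if len(cand) <= k:
            errs.append(np.inf)  # the structure missed too many points
            continue
        approx = np.sort(np.linalg.norm(cand - q, axis=1))[k]
        errs.append(approx / exact - 1.0)  # >= 0: pruning can only grow the distance
    return float(np.mean(errs))

rng = np.random.default_rng(3)
data = rng.normal(size=(100, 3))
# sanity check: with all points as candidates the error is exactly zero
err = knn_distance_error(data, data[:5], k=3, candidates_fn=lambda q: range(len(data)))
```

For each K one would sweep L upward until this error drops below the chosen ε, then keep the (K, L(K)) pair with the smallest running time.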
Data-driven partitions
• In the original LSH, cut values are drawn at random over the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
(figure: bucket-occupancy distribution for uniform vs. data-driven cut points)
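The difference between the two cut-value policies shows up immediately on skewed data; a small sketch (the toy data and names are illustrative):

```python
import numpy as np

def uniform_cut(col, rng):
    # original LSH: cut value uniform over the range of the data
    return rng.uniform(col.min(), col.max())

def data_driven_cut(col, rng):
    # suggestion: the coordinate of a randomly selected data point
    return col[rng.integers(len(col))]

# skewed 1D data: 95 points near 0, 5 outliers near 100
rng = np.random.default_rng(4)
col = np.concatenate([rng.uniform(0.0, 1.0, 95), rng.uniform(99.0, 100.0, 5)])
frac_dd = np.mean([data_driven_cut(col, rng) < 1.0 for _ in range(200)])
frac_u = np.mean([uniform_cut(col, rng) < 1.0 for _ in range(200)])
```

Data-driven cuts land where the mass is, keeping bucket occupancy balanced; uniform cuts mostly fall in the empty gap between the clusters.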
Additional speedup
• Assume that all points in C converge to the same mode (C acts as a type of aggregate)
Speedup results
(table: 65,536 points, 1,638 points sampled, k = 100)
Food for thought
(figure: low dimension vs. high dimension)
A thought for food…
• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• A thought for food: does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning requires KNN
15:30: cookies…
Summary
• LSH trades some accuracy for a large gain in complexity
• Applications that involve massive data in high dimensions require LSH's fast performance
• LSH extends to different spaces (PSH)
• The LSH parameters and hash functions can be learned for different applications
Conclusion
• …but in the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni (andoni@mit.edu)
  – Test it over your own data (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Outline
Fast Pose Estimation with Parameter Sensitive Hashing, G. Shakhnarovich, P. Viola, and T. Darrell
• Finding sensitive hash functions
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example, B. Georgescu, I. Shimshoni, and P. Meer
• Tuning LSH parameters
• The LSH data structure is used for algorithm speedups
Fast Pose Estimation with Parameter Sensitive Hashing
G. Shakhnarovich, P. Viola, and T. Darrell
The Problem
Given an image x, what are the parameters θ in this image, i.e. the angles of the joints, the orientation of the body, etc.?
Ingredients
• Input: query image with unknown angles (parameters)
• Database of human poses with known angles
• Image feature extractor – an edge detector
• Distance metric in feature space: d_x
• Distance metric in angle space: d_θ(θ1, θ2) = Σ_{i=1..m} (1 − cos(θ1,i − θ2,i))
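The angle metric reconstructed above is easy to sanity-check directly:

```python
import math

def d_theta(t1, t2):
    """Angle-space distance: sum over joints of 1 - cos(difference)."""
    return sum(1.0 - math.cos(a - b) for a, b in zip(t1, t2))

identical = d_theta([0.1, 0.2, 0.3], [0.1, 0.2, 0.3])  # same pose
opposite = d_theta([0.0], [math.pi])                    # opposite angle
```

Unlike plain squared error on raw angles, this metric is insensitive to 2π wrap-around.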
Example-based learning
• Construct a database of example images with their known angles
• Given a query image, run your favorite feature extractor
• Compute the KNN from the database
• Use these KNNs to compute the average angles of the query
Input: query → find the KNN in the database of examples → output: the average angles of the KNN
The algorithm flow
Input query → features extraction → processed query → PSH (LSH) against the database of examples → LWR (regression) → output: match
The image features
(figure: edge histograms computed over image sub-windows A and B)
Image features are multi-scale edge histograms
Feature Extraction | PSH | LWR
PSH: The basic assumption
There are two metric spaces here: the feature space (d_x) and the parameter space (d_θ).
We want similarity to be measured in the angle space, whereas LSH works on the feature space.
• Assumption: the feature space is closely related to the parameter space
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space
• But the global structure may be complicated: curved
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
(figure: a query q with corresponding neighborhoods in the parameter space (angles) and in the feature space)
Is this magic?
Parameter Sensitive Hashing (PSH)
The trick: estimate the performance of different hash functions on examples, and select those that are sensitive to d_θ.
The hash functions are applied in the feature space, but the KNN are valid in the angle space.
• Label pairs of examples with similar angles
• Define hash functions h on the feature space
• Predict the labeling of similar/non-similar examples by using h
• Compare the labelings
• If the labeling by h is good, accept h; else change h
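The accept/reject loop above amounts to scoring candidate threshold hashes against the ±1 similarity labels; a toy sketch (the data and names are illustrative, not the paper's selection procedure):

```python
import numpy as np

def pair_accuracy(T, feat, pairs, labels):
    """How well the hash h_T labels pairs: +1 if hashed together, -1 if apart."""
    h = np.where(feat >= T, 1, -1)                        # binary hash on one feature
    pred = np.where(h[pairs[:, 0]] == h[pairs[:, 1]], 1, -1)
    return float(np.mean(pred == labels))

def select_threshold(feat, pairs, labels, candidates):
    # keep the threshold whose induced labeling best matches the true labels
    return max(candidates, key=lambda T: pair_accuracy(T, feat, pairs, labels))

# toy setup: one feature separates two pose groups at 0
rng = np.random.default_rng(5)
feat = np.concatenate([rng.uniform(-1.0, -0.2, 50), rng.uniform(0.2, 1.0, 50)])
similar = np.array([[i, i + 1] for i in range(49)])       # pairs within group 1
different = np.array([[i, i + 50] for i in range(50)])    # pairs across groups
pairs = np.vstack([similar, different])
labels = np.concatenate([np.ones(49), -np.ones(50)])
best = select_threshold(feat, pairs, labels, candidates=[-0.5, 0.0, 0.5])
```

In the paper the selection runs over many features and thresholds, with bounds on the false-positive and false-negative rates rather than raw accuracy.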
PSH as a classification problem
(figure: example pairs labeled +1, +1, −1, −1; r = 0.25)
A pair of examples (x_i, θ_i), (x_j, θ_j) is labeled:
  y_ij = +1 if d_θ(θ_i, θ_j) ≤ r
  y_ij = −1 if d_θ(θ_i, θ_j) ≥ (1 + ε) r
A binary hash function on the features:
  h_T(x) = +1 if the selected feature of x is above the threshold T, −1 otherwise
Predict the labels:
  ŷ_h(x_i, x_j) = +1 if h_T(x_i) = h_T(x_j), −1 otherwise
Find the best threshold T that predicts the true labeling subject to the probability constraints: h_T either places both examples of a pair in the same bin, or separates them.
Local Weighted Regression (LWR)
• Given a query image, PSH returns its KNNs
• LWR uses the KNN to compute a weighted average of the estimated angles of the query:
  θ̂ = argmin_θ Σ_{x_i ∈ N(x_0)} K(d_x(x_i, x_0)) · d_θ(g(x_i), θ)
  (the neighbors N(x_0) are weighted by their feature-space distance to the query x_0; g(x_i) is the stored pose of example x_i)
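A zeroth-order sketch of that estimate, i.e. a kernel-weighted average of the neighbor poses (the paper fits a local regression; the names and the Gaussian kernel here are illustrative):

```python
import numpy as np

def lwr_pose(query_feat, nbr_feats, nbr_angles, h=1.0):
    """Weighted average of the neighbors' angles, weighted in feature space."""
    d = np.linalg.norm(nbr_feats - query_feat, axis=1)
    w = np.exp(-(d / h) ** 2)                 # kernel weight K(d_x)
    return (w[:, None] * nbr_angles).sum(axis=0) / w.sum()

# three neighbors returned by PSH; the closest one dominates the estimate
query = np.array([0.0, 0.0])
feats = np.array([[0.1, 0.0], [2.0, 0.0], [2.5, 0.0]])
angles = np.array([[0.5], [1.5], [1.7]])
est = lwr_pose(query, feats, angles)
```

Because the weights decay with feature-space distance, a spurious far-away neighbor barely perturbs the estimated pose.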
Results
Synthetic data were generated:
• 13 angles: 1 for the rotation of the torso, 12 for the joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, facial expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (l)
• Tested on 1,000 synthetic examples
• PSH searched only 3.4% of the data per query
• Without the selection, 40 bits and 1,000 hash tables would have been needed
Recall: P1 is the probability of a positive hash, P2 is the probability of a bad hash, B is the max number of points in a bucket
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 1.3% of the data were searched
Results – real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Given an image x what are the parameters θ in this image
ie angles of joints orientation of the body etc1048698
The Problem
Fast Pose Estimation with Parameter Sensitive Hashing
G Shakhnarovich P Viola and T Darrell
i
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Ingredients
bull Input query image with unknown angles (parameters)
bull Database of human poses with known anglesbull Image feature extractor ndash edge detector
bull Distance metric in feature space dx
bull Distance metric in angles space
m
i
iid1
2121 )cos(1)(
Example based learning
bull Construct a database of example images with their known angles
bull Given a query image run your favorite feature extractorbull Compute KNN from databasebull Use these KNNs to compute the average angles of the
query
Input queryFind KNN in database of examples
Output Average angles of KNN
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Example-based learning

• Construct a database of example images with their known angles.
• Given a query image, run your favorite feature extractor.
• Compute the KNN from the database.
• Use these KNNs to compute the average angles of the query.

Input: query → find the KNN in the database of examples → Output: average angles of the KNN.
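The steps above can be sketched in a few lines. This is a toy illustration with made-up features and angles, not the paper's pipeline; `estimate_angles`, the database arrays, and k = 3 are all assumptions for the example.

```python
import numpy as np

def estimate_angles(query_feat, db_feats, db_angles, k=3):
    """Toy example-based pose estimation: find the k nearest database
    examples in feature space and average their known angles."""
    dists = np.linalg.norm(db_feats - query_feat, axis=1)
    knn = np.argsort(dists)[:k]          # indices of the k closest examples
    return db_angles[knn].mean(axis=0)

# hypothetical database: 4 examples, 2-D features, one angle each
db_feats = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]])
db_angles = np.array([[10.0], [20.0], [30.0], [90.0]])
estimate_angles(np.array([0.1, 0.1]), db_feats, db_angles, k=3)  # -> [20.]
```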
The algorithm flow:

Input query → feature extraction → processed query → PSH (LSH), matched against the database of examples → LWR (regression) → output: match.
The image features

Image features are multi-scale edge histograms.

Feature Extraction PSH LWR
PSH: The basic assumption

There are two metric spaces here: the feature space (X, d_x) and the parameter space (Θ, d_θ).

We want similarity to be measured in the angle space, whereas LSH works on the feature space.

• Assumption: the feature space is closely related to the parameter space.
Insight: Manifolds

• A manifold is a space in which every point has a neighborhood resembling a Euclidean space.
• But the global structure may be complicated: curved.
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
(figure: a query q mapped between the parameter space of angles and the feature space)

Is this magic?

Parameter Sensitive Hashing (PSH)

The trick:
Estimate the performance of different hash functions on examples, and select those sensitive to d_θ.
The hash functions are applied in the feature space, but the KNN are valid in the angle space.
• Label pairs of examples with similar angles.
• Define hash functions h on the feature space.
• Predict the labeling of similar/non-similar examples by using h.
• Compare the labelings.
• If the labeling by h is good, accept h; else change h.
PSH as a classification problem

Labels (r = 0.25):
A pair of examples (x_i, θ_i), (x_j, θ_j) is labeled

  y_ij = +1 if d_θ(θ_i, θ_j) ≤ r
  y_ij = -1 if d_θ(θ_i, θ_j) ≥ (1 + ε) r

(figure: example pairs labeled +1, +1, -1, -1)
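A minimal sketch of the pair-labeling rule above, for scalar angles; r = 0.25 matches the slide, while ε = 1 and the handling of in-margin pairs are assumptions for the example.

```python
def pair_label(theta_i, theta_j, r=0.25, eps=1.0):
    """Label a training pair by parameter-space distance:
    +1 if closer than r, -1 if farther than (1 + eps) * r."""
    d = abs(theta_i - theta_j)           # d_theta for scalar angles
    if d <= r:
        return +1
    if d >= (1 + eps) * r:
        return -1
    return None                          # pairs in the margin are unused

pair_label(0.1, 0.2)   # -> 1   (d = 0.1 <= 0.25)
pair_label(0.1, 0.9)   # -> -1  (d = 0.8 >= 0.5)
```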
A binary hash function on features:

  h_T(x) = +1 if x ≥ T
           -1 otherwise

Predict the labels:

  ŷ_ij = +1 if h_T(x_i) = h_T(x_j)
         -1 otherwise
Find the best threshold T that predicts the true labeling, subject to the probability constraints: h_T will place both examples in the same bin or separate them.

Feature Extraction PSH LWR
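One way to read the selection step: score each candidate stump (feature, T) by how often its same-bin/different-bin prediction agrees with the true pair labels, and keep the best scorers. The toy data and the plain agreement score are illustrative assumptions, not the paper's exact probabilistic constraints.

```python
def h(x, feature, T):
    """Axis-parallel binary hash: +1 if the chosen feature is >= T."""
    return 1 if x[feature] >= T else -1

def agreement(pairs, labels, feature, T):
    """Fraction of labeled pairs whose predicted label
    (same bin -> +1, different bins -> -1) matches the true label."""
    ok = 0
    for (xi, xj), y in zip(pairs, labels):
        y_hat = 1 if h(xi, feature, T) == h(xj, feature, T) else -1
        ok += (y_hat == y)
    return ok / len(labels)

pairs = [([0.1], [0.2]), ([0.1], [0.9]), ([0.8], [0.95])]
labels = [1, -1, 1]
agreement(pairs, labels, feature=0, T=0.5)   # -> 1.0
```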
Local Weighted Regression (LWR)

• Given a query image, PSH returns its KNNs.
• LWR uses the KNN to compute a weighted average of the estimated angles of the query:

  θ_0 = argmin_θ Σ_{x_i ∈ N(x)} K(d(x_i, x)) · (g(x_i; θ) - θ_i)²

  where the kernel K turns feature-space distance into a weight.

Feature Extraction PSH LWR
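A zeroth-order sketch of the idea: instead of the full regression, just take a kernel-weighted average of the neighbors' angles, with weights decaying in feature-space distance. The Gaussian kernel and the bandwidth value are assumptions for the example.

```python
import numpy as np

def lwr_angle(query, nbr_feats, nbr_angles, bandwidth=1.0):
    """Kernel-weighted average of the KNN's angles (zeroth-order LWR)."""
    d = np.linalg.norm(nbr_feats - query, axis=1)
    w = np.exp(-(d / bandwidth) ** 2)    # dist -> weight
    return float(np.sum(w * nbr_angles) / np.sum(w))

# three equidistant neighbors -> a plain average of their angles
nbr_feats = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]])
nbr_angles = np.array([10.0, 20.0, 30.0])
lwr_angle(np.zeros(2), nbr_feats, nbr_angles)   # -> 20.0
```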
Results

Synthetic data were generated:
• 13 angles: 1 for the rotation of the torso, 12 for the joints.
• 150,000 images.
• Nuisance parameters added: clothing, illumination, face expression.
• 1,775,000 example pairs.
• Selected 137 out of 5,123 meaningful features (how?).
• 18-bit hash functions (k), 150 hash tables (l).
• Test on 1,000 synthetic examples.
• PSH searched only 3.4% of the data per query.
• Without the feature selection, 40 bits and 1,000 hash tables would be needed.

Recall: P1 is the probability of a positive hash, P2 is the probability of a bad hash, and B is the maximum number of points in a bucket.
Results – real data

• 800 images.
• Processed by a segmentation algorithm.
• 1.3% of the data were searched.

Interesting mismatches
Fast pose estimation – summary

• A fast way to compute the angles of a human body figure.
• Moving from one representation space to another.
• Training a sensitive hash function.
• KNN smart averaging.

Food for thought

• The basic assumption may be problematic (distance metric, representations).
• The training set should be dense.
• Texture and clutter.
• In general, some features are more important than others and should be weighted.
Food for thought: Point Location in Different Spheres (PLDS)

• Given n spheres in R^d, centered at P = {p_1, …, p_n}, with radii r_1, …, r_n.
• Goal: given a query q, preprocess the points in P so as to find a point p_i whose sphere 'covers' the query q.

Courtesy of Mohamad Hegaze
Motivation

• Clustering high-dimensional data by using local density measurements (e.g., in feature space).
• Statistical curse of dimensionality: sparseness of the data.
• Computational curse of dimensionality: expensive range queries.
• LSH parameters should be adjusted for optimal performance.

Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example
B. Georgescu, I. Shimshoni, and P. Meer
Outline

• Mean-shift in a nutshell + examples.

Our scope:
• Mean-shift in high dimensions – using LSH.
• Speedups:
  1. Finding optimal LSH parameters.
  2. Data-driven partitions into buckets.
  3. Additional speedup by using the LSH data structure.
Mean-Shift in a Nutshell

(roadmap: mean-shift | LSH | optimal k,l | data-driven partition | LSH data structure)

(figure: a point and its bandwidth window)

KNN in mean-shift

The bandwidth should be inversely proportional to the density in the region: high density – small bandwidth; low density – large bandwidth.

Based on the kth nearest neighbor x_{i,k} of the point x_i, the bandwidth is h_i = ||x_i - x_{i,k}||.
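The kth-nearest-neighbor bandwidth rule can be sketched directly; the brute-force distance matrix below is only for illustration and the toy data are an assumption.

```python
import numpy as np

def adaptive_bandwidths(X, k):
    """Per-point bandwidth h_i = distance from x_i to its k-th nearest
    neighbor: small in dense regions, large in sparse ones."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    D.sort(axis=1)           # column 0 holds the zero self-distance
    return D[:, k]

X = np.array([[0.0], [0.1], [0.2], [5.0]])
adaptive_bandwidths(X, k=2)   # -> [0.2, 0.1, 0.2, 4.9]
```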
Adaptive mean-shift vs non-adaptive
Image segmentation algorithm:
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y).
2. Resolution controlled by the bandwidths h_s (spatial) and h_r (color).
3. Apply filtering.

(figure: 3D feature-space view)
Mean-Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Image segmentation algorithm

(figures: original, filtered, and segmented images; mean-shift trajectories)
Filtering: each pixel takes the value of its nearest mode.

Filtering examples

(figures: squirrel and baboon, original vs. filtered)
Mean-Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02

Segmentation examples

Mean-Shift: A Robust Approach Toward Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions

• Computational curse of dimensionality: expensive range queries, implemented with LSH.
• Statistical curse of dimensionality: sparseness of the data, handled with a variable bandwidth.
LSH-based data structure

• Choose L random partitions. Each partition includes K pairs (d_k, v_k).
• For each point x_i we check, for each of the K pairs, whether x_i[d_k] ≤ v_k.
• This partitions the data into cells.
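The partition structure can be sketched as follows; the dimensions, cut values, and table layout here are randomly generated assumptions, not tuned parameters.

```python
import numpy as np

def bucket_key(x, cuts):
    """K-bit label of x under one partition: bit k records whether
    coordinate d_k of x is below the cut value v_k."""
    return tuple(bool(x[d] <= v) for d, v in cuts)

rng = np.random.default_rng(0)
X = rng.uniform(size=(100, 5))           # toy data: 100 points in 5-D

K, L = 3, 4                              # K cuts per partition, L partitions
partitions = [[(int(rng.integers(5)), float(rng.uniform()))
               for _ in range(K)] for _ in range(L)]

tables = [{} for _ in range(L)]          # bucket key -> list of point indices
for i, x in enumerate(X):
    for table, part in zip(tables, partitions):
        table.setdefault(bucket_key(x, part), []).append(i)
# a query is then compared only against the union of its L buckets
```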
Choosing the optimal K and L

• For a query q, we want to compute the smallest number of distances to points in its buckets.
• Large K: a smaller number of points in a cell.
• If L is too small, points might be missed; but if L is too big, extra points might be included.
• K and L determine the resolution of the data structure: as L increases, the union of cells C∪ increases, but the intersection C∩ decreases.
Choosing optimal K and L

• Determine accurately the KNN distance (the bandwidth) for m randomly-selected data points.
• Choose an error threshold ε.
• The optimal K and L should satisfy: the approximate KNN distance is within a factor of (1 + ε) of the true one.
Choosing optimal K and L

• For each K, estimate the error.
• In one run over all L's, find the minimal L satisfying the constraint: L(K).
• Minimize the running time t(K, L(K)).

(figure: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)], with the minimum marked)
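The shape of this optimization can be illustrated with toy models. Both the error model (per-table miss probability (1 - Q**K)**L) and the cost model below are assumptions invented for the sketch; the paper estimates these quantities from sampled data instead.

```python
import math

Q = 0.9   # assumed probability that one random cut keeps a true neighbor

def L_of_K(K, eps=0.05):
    """Smallest L whose miss probability (1 - Q**K)**L is <= eps."""
    return math.ceil(math.log(eps) / math.log(1 - Q ** K))

def t(K, L, n=1000):
    """Assumed cost: L*K hash evaluations plus ~n*L/2**K candidates."""
    return L * K + n * L / 2 ** K

best_K = min(range(1, 31), key=lambda K: t(K, L_of_K(K)))
best_K, L_of_K(best_K)   # with these toy models: K = 8, L = 6
```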
Data-driven partitions

• In the original LSH, cut values are random in the range of the data.
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value.

(figure: bucket distribution, uniform vs. data-driven cut points)
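A quick numerical illustration of why data-driven cuts help on non-uniform data; the Gaussian sample and the balance score are assumptions for the demo.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=1000)             # clearly non-uniform 1-D data

def balance(cut):
    """How evenly a cut splits the data (0.5 would be a perfect split)."""
    left = float(np.mean(data <= cut))
    return min(left, 1.0 - left)

# original LSH: cuts drawn uniformly over the data's range
u = np.mean([balance(rng.uniform(data.min(), data.max()))
             for _ in range(200)])
# data-driven variant: cuts taken from randomly selected data points
d = np.mean([balance(rng.choice(data)) for _ in range(200)])
# on this data the data-driven cuts give markedly more even buckets (d > u)
```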
Additional speedup

Assume that all points in C∩ will converge to the same mode (C∩ acts like a type of aggregate).
Speedup results

65,536 points; 1,638 points sampled; k = 100.

Food for thought: low dimension vs. high dimension

A thought for food…
• Choose K, L by sample-based learning, or take the traditional values.
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold? Intuitively, the dimensionality implies the number of hash functions needed. The catch: efficient dimensionality learning requires KNN.

15:30, cookies…
Summary

• LSH trades accuracy for a gain in complexity.
• Applications that involve massive data in high dimension require the LSH's fast performance.
• The LSH extends to different spaces (PSH).
• The LSH parameters and hash functions can be learned for different applications.

Conclusion

• But in the end, everything depends on your data set.
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni (andoni@mit.edu)
  – Test over your own data (C code, under Red Hat Linux)
Thanks

• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Input Query
Features extraction
Processed query
PSH (LSH)
Database of examples
The algorithm flow
LWR (Regression)
Output Match
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
The image features
B A
Axx 4107 )(
4
3
2
4 0
Image features are multi-scale edge histograms
Feature Extraction PSH LWR
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
PSH The basic assumption
There are two metric spaces here feature space ( )
and parameter space ( )
We want similarity to be measured in the angles
space whereas LSH works on the feature space
bull Assumption The feature space is closely related to the parameter space
xd
d
Feature Extraction PSH LWR
Insight Manifolds
bull Manifold is a space in which every point has a neighborhood resembling a Euclid space
bull But global structure may be complicated curved
bull For example lines are 1D manifolds planes are 2D manifolds etc
Feature Extraction PSH LWR
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and L
• Determine accurately the KNN for m randomly selected data points; their KNN distance is the bandwidth.
• Choose an error threshold ε.
• The optimal K and L should satisfy an error bound on the approximate distance.
Choosing optimal K and L
• For each K, estimate the error.
• In one run over all L's, find the minimal L satisfying the constraint: L(K).
• Minimize the running time t(K, L(K)).
[Figure panels: approximation error for (K, L); L(K) for ε = 0.05; running time t[K, L(K)]; the minimum]
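The selection loop can be sketched with toy error/time models; in the real procedure both quantities are measured on the m sampled points, so `error` and `time_cost` below are illustrative stand-ins:

```python
def choose_k_l(error, time_cost, Ks, Ls, eps):
    """For each K, scan L in increasing order and keep the minimal L whose
    estimated error meets the threshold eps; among the feasible (K, L(K))
    pairs, return the one with the smallest running time."""
    best = None  # (K, L, time)
    for K in Ks:
        L_ok = next((L for L in sorted(Ls) if error(K, L) <= eps), None)
        if L_ok is None:
            continue  # no L satisfies the error constraint for this K
        t = time_cost(K, L_ok)
        if best is None or t < best[2]:
            best = (K, L_ok, t)
    return best

# Toy monotone models: error falls as L grows and rises with K;
# time grows with L but each table gets cheaper as K grows.
err = lambda K, L: K / (10.0 * L)
cost = lambda K, L: L * 2 ** -K + K
```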
Data driven partitions
• In the original LSH, cut values are random in the range of the data.
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value.
[Figure: bucket point distribution, uniform vs. data-driven cuts]
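A sketch of the two cut-value rules (helper names are hypothetical); the point is that data-driven cuts follow the empirical distribution of the data, so buckets stay balanced in dense regions:

```python
import random

def uniform_cut(points, dim, rng):
    """Original LSH: cut value drawn uniformly over the data range in `dim`."""
    vals = [p[dim] for p in points]
    return rng.uniform(min(vals), max(vals))

def data_driven_cut(points, dim, rng):
    """Suggested variant: the coordinate of a randomly chosen data point,
    so cuts land where the data is dense."""
    return rng.choice(points)[dim]
```

With skewed data (say, a tight cluster plus one far outlier), uniform cuts mostly fall in empty space, while data-driven cuts land inside the cluster.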
Additional speedup
Assume that all points in a cell C will converge to the same mode (C acts like a type of aggregate).
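The aggregation idea can be sketched as a per-cell cache; `find_mode` stands in for a full mean-shift iteration and is an assumed callable, not the paper's code:

```python
def modes_with_cell_cache(points, cell_of, find_mode):
    """Run the (expensive) mode search once per cell and reuse the result
    for every other point in the same cell."""
    cache = {}
    modes = []
    for i, p in enumerate(points):
        c = cell_of[i]
        if c not in cache:
            cache[c] = find_mode(p)  # full mean-shift iteration, once per cell
        modes.append(cache[c])
    return modes
```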
Speedup results
65,536 points; 1,638 points sampled; k = 100.
Food for thought
Low dimension High dimension
A thought for food…
• Choose K, L by sample learning, or take the traditional values?
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed.
• The catch: efficient dimensionality learning itself requires KNN.
15:30: cookies…
Summary
• LSH trades some accuracy for a gain in complexity.
• Applications that involve massive data in high dimension require the fast performance of LSH.
• The LSH extends to different spaces (PSH).
• The LSH parameters and hash functions can be learned for different applications.
Conclusion
• But in the end, everything depends on your data set.
• Try it at home:
– Visit http://web.mit.edu/andoni/www/LSH/index.html
– Email Alex Andoni, andoni@mit.edu
– Test it over your own data
(C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Insight: Manifolds
• A manifold is a space in which every point has a neighborhood resembling a Euclidean space.
• But the global structure may be complicated and curved.
• For example, lines are 1D manifolds, planes are 2D manifolds, etc.
Pipeline: Feature Extraction → PSH → LWR
[Figure: a query q mapped between the feature space and the parameter space (angles)]
Is this magic?
Parameter Sensitive Hashing (PSH)
The trick:
Estimate the performance of different hash functions on examples, and select those sensitive to d_θ.
The hash functions are applied in feature space, but the KNN are valid in angle space.
1. Label pairs of examples with similar angles.
2. Define hash functions h on the feature space.
3. Predict the labeling of similar/non-similar examples by using h.
4. Compare the labelings.
5. If the labeling by h is good, accept h; else change h.
PSH as a classification problem
[Figure: example pairs labeled +1, +1, −1, −1 (r = 0.25)]
Labels: a pair of examples (x_i, x_j) is labeled
y_ij = +1 if d(θ_i, θ_j) ≤ r
y_ij = −1 if d(θ_i, θ_j) ≥ (1 + ε)r
A binary hash function on features:
h_T(x) = +1 if φ(x) > T, −1 otherwise
Predict the labels:
ŷ_h(x_i, x_j) = +1 if h_T(x_i) = h_T(x_j), −1 otherwise
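A toy version of this evaluation, assuming single-feature threshold hashes (here φ(x) is just one coordinate, an illustrative simplification):

```python
def h(x, feat, T):
    """Binary hash on one feature: +1 if it exceeds the threshold T."""
    return 1 if x[feat] > T else -1

def pair_accuracy(pairs, labels, feat, T):
    """Fraction of labeled pairs where the same-bucket prediction
    (+1 iff both hash values agree) matches the true label."""
    hits = 0
    for (xi, xj), y in zip(pairs, labels):
        y_hat = 1 if h(xi, feat, T) == h(xj, feat, T) else -1
        hits += int(y_hat == y)
    return hits / len(pairs)
```

Scanning candidate thresholds T and keeping those with high pair accuracy is the spirit of "accept h or change h" above.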
Find the best T that predicts the true labeling, with the probability constraints: h_T will place both examples in the same bin, or separate them.
Local Weighted Regression (LWR)
• Given a query image, PSH returns its KNNs.
• LWR uses the KNN to compute a weighted average of the estimated angles of the query:
θ̂_0 = argmin_β Σ_{x_i ∈ N(x_0)} K(d(x_i, x_0)) (g(x_i; β) − θ_i)², where the kernel K turns distance into weight.
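A sketch of the zeroth-order case, where the weighted regression reduces to a kernel-weighted average of the neighbors' angle vectors (the Gaussian kernel and the names are assumptions):

```python
import math

def lwr_angles(query, neighbors, angles, bandwidth):
    """Kernel-weighted average of the neighbors' angle vectors, with
    weights decaying in the feature-space distance to the query."""
    w = [math.exp(-(math.dist(query, x) / bandwidth) ** 2) for x in neighbors]
    s = sum(w)
    n_angles = len(angles[0])
    return [sum(wi * a[j] for wi, a in zip(w, angles)) / s
            for j in range(n_angles)]
```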
Results
Synthetic data were generated:
• 13 angles: 1 for rotation of the torso, 12 for the joints.
• 150,000 images.
• Nuisance parameters added: clothing, illumination, facial expression.
• 1,775,000 example pairs.
• Selected 137 out of 5,123 meaningful features (how?).
• 18-bit hash functions (k), 150 hash tables (l).
• Tested on 1,000 synthetic examples.
• PSH searched only 3.4% of the data per query.
• Without feature selection, 40 bits and 1,000 hash tables would have been needed.
Recall: P1 is the probability of a positive hash, P2 is the probability of a bad hash, and B is the maximum number of points in a bucket.
Results – real data
• 800 images.
• Processed by a segmentation algorithm.
• 1.3% of the data were searched.
Results – real data
Interesting mismatches
Fast pose estimation – summary
• A fast way to compute the angles of a human body figure.
• Moving from one representation space to another.
• Training a sensitive hash function.
• KNN smart averaging.
Food for Thought
• The basic assumption may be problematic (distance metric, representations).
• The training set should be dense.
• Texture and clutter.
• In general, some features are more important than others and should be weighted.
Food for Thought: Point Location in Different Spheres (PLDS)
• Given n spheres in R^d centered at P = p_1, …, p_n, with radii r_1, …, r_n.
• Goal: given a query q, preprocess the points in P to find a point p_i whose sphere 'covers' the query q.
[Figure: query q inside the sphere of radius r_i around p_i]
Courtesy of Mohamad Hegaze
Motivation
• Clustering high-dimensional data using local density measurements (e.g., in feature space).
• Statistical curse of dimensionality: sparseness of the data.
• Computational curse of dimensionality: expensive range queries.
• The LSH parameters should be adjusted for optimal performance.
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example
B. Georgescu, I. Shimshoni, and P. Meer
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
1. Finding the optimal LSH parameters
2. Data-driven partitions into buckets
3. Additional speedup using the LSH data structure
Mean-Shift in a Nutshell
[Figure: the bandwidth window around a point and the resulting mean-shift vector]
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region:
high density – small bandwidth; low density – large bandwidth.
Based on the k-th nearest neighbor of the point: the bandwidth is the distance from the point to that neighbor.
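A brute-force sketch of this per-point bandwidth (quadratic in n; the point of the LSH machinery above is precisely to avoid this exhaustive neighbor search):

```python
import math

def adaptive_bandwidths(points, k):
    """h_i = distance from x_i to its k-th nearest neighbor:
    small in dense regions, large in sparse ones."""
    hs = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        hs.append(dists[k - 1])
    return hs
```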
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Parameters Space (angles)
Feature Space
q
Is this Magic
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Parameter Sensitive Hashing (PSH)
The trick
Estimate performance of different hash functions on examples and select those sensitive to
The hash functions are applied in feature space but the KNN are valid in angle space
d
Feature Extraction PSH LWR
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Label pairs of examples with similar angles
Define hash functions h on feature space
Feature Extraction PSH LWR
Predict labeling of similarnon-similar examples by using h
Compare labeling
If labeling by h is goodaccept h else change h
PSH as a classification problem
+1 +1 -1 -1
(r=025)
Labels
)1()( if 1
)( if 1y
labeled is
)x()(x examples ofpair A
ij
ji
rd
rd
ji
ji
ji
Feature Extraction PSH LWR
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
Labels
(figure: example pairs labeled +1 / −1, r = 0.25)
A pair of examples (x_i, x_j) is labeled:
y_ij = +1 if d(x_i, x_j) ≤ r
y_ij = −1 if d(x_i, x_j) ≥ (1 + ε)r
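A minimal sketch of this pairwise labeling rule, assuming Euclidean distance (the radius and slack values are illustrative; the slide's figure uses r = 0.25):

```python
import math

def label_pair(xi, xj, r, eps):
    """Label a pair of examples: +1 if they are closer than r,
    -1 if farther than (1 + eps) * r, None in the ambiguous band."""
    d = math.dist(xi, xj)
    if d <= r:
        return 1
    if d >= (1 + eps) * r:
        return -1
    return None  # inside the (r, (1+eps)r) margin: left unlabeled

print(label_pair((0.0, 0.0), (0.1, 0.1), r=0.25, eps=1.0))  # close pair -> 1
print(label_pair((0.0, 0.0), (1.0, 1.0), r=0.25, eps=1.0))  # far pair -> -1
```

Pairs falling in the margin between r and (1 + ε)r are not used for training, mirroring the gap between the two thresholds on the slide.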
Feature Extraction PSH LWR
A binary hash function on features:
h_T(x) = +1 if T(x), −1 otherwise
Predict the labels:
ŷ(x_i, x_j) = +1 if h_T(x_i) = h_T(x_j), −1 otherwise
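In code, with T modeled as a (feature index, threshold) test — a hypothetical concretization of the slide's predicate T(x):

```python
def h(x, feature, threshold):
    """Binary hash bit: +1 if the chosen feature clears the threshold."""
    return 1 if x[feature] >= threshold else -1

def predict_pair(xi, xj, feature, threshold):
    """Predicted pair label: +1 when both examples hash to the same bin."""
    return 1 if h(xi, feature, threshold) == h(xj, feature, threshold) else -1

# both above the threshold on feature 0 -> same bin -> +1
print(predict_pair((0.9, 0.2), (0.8, 0.7), feature=0, threshold=0.5))
# opposite sides of the threshold -> separated -> -1
print(predict_pair((0.9, 0.2), (0.1, 0.7), feature=0, threshold=0.5))
```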
Feature
Find the best T that predicts the true labeling with the probability constraints:
h_T(x) will place both examples in the same bin, or separate them.
Local Weighted Regression (LWR)
• Given a query image, PSH returns KNNs.
• LWR uses the KNN to compute a weighted average of the estimated angles of the query:
θ̂(x_0) = argmin Σ_{x_i ∈ N(x_0)} K(d(x_i, x_0)) · (θ_i − g(x_i))²
where K is a distance-weighting kernel and g is the local model being fit.
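A zeroth-order sketch of this step: a kernel-weighted mean of the neighbors' stored angles. The Gaussian kernel and bandwidth here are assumptions; the paper fits a local regression model, of which the weighted mean is the simplest case:

```python
import math

def lwr_angle(query_dists, neighbor_angles, bandwidth=1.0):
    """Weighted average of the KNN's angle estimates, weighting each
    neighbor by K(d(x_i, x_0)) = exp(-(d / bandwidth)^2)."""
    weights = [math.exp(-(d / bandwidth) ** 2) for d in query_dists]
    return sum(w * a for w, a in zip(weights, neighbor_angles)) / sum(weights)

# a nearby neighbor dominates a distant one
print(lwr_angle([0.0, 3.0], [10.0, 90.0]))  # close to 10
```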
Results
Synthetic data were generated:
• 13 angles: 1 for rotation of the torso, 12 for joints
• 150,000 images
• Nuisance parameters added: clothing, illumination, face expression
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (l)
• Test on 1,000 synthetic examples
• PSH searched only 3.4% of the data per query
• Without selection, needed 40 bits and 1,000 hash tables
Recall: P1 is the probability of a positive hash, P2 is the probability of a bad hash, and B is the max number of points in a bucket.
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 1.3% of the data were searched
Results – real data
Interesting mismatches
Fast pose estimation – summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought: Point Location in Different Spheres (PLDS)
• Given n spheres in R^d, centered at P = {p_1, …, p_n}, with radii r_1, …, r_n
• Goal: given a query q, preprocess the points in P to find a point p_i whose sphere 'covers' the query q
(figure: query q inside the sphere of radius r_i around p_i)
Courtesy of Mohamad Hegaze
Motivation
• Clustering high-dimensional data by using local density measurements (e.g. feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example
B. Georgescu, I. Shimshoni, and P. Meer
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
1. Finding optimal LSH parameters
2. Data-driven partitions into buckets
3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
(figure: bandwidth window around a data point)
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
KNN in mean-shift
The bandwidth should be inversely proportional to the density in the region:
high density – small bandwidth; low density – large bandwidth.
It is based on the k-th nearest neighbor of the point: the bandwidth is the distance from the point to its k-th nearest neighbor.
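A brute-force sketch of this adaptive bandwidth rule. It is O(n²), which is exactly the cost the deck's LSH machinery is brought in to avoid on large data:

```python
import math

def adaptive_bandwidths(points, k):
    """h_i = distance from x_i to its k-th nearest neighbor:
    small bandwidth in dense regions, large in sparse ones.
    Brute force O(n^2); the deck uses LSH to approximate this step."""
    out = []
    for p in points:
        dists = sorted(math.dist(p, q) for q in points if q is not p)
        out.append(dists[k - 1])
    return out

pts = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 0.0)]
print(adaptive_bandwidths(pts, k=1))  # the isolated point gets a large bandwidth
```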
Adaptive mean-shift vs non-adaptive
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths: h_s (spatial), h_r (color)
3. Apply filtering
(figure: 3D feature space)
Mean-shift: A Robust Approach Towards Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Image segmentation algorithm
(figures: original, filtered, and segmented images; mean-shift trajectories)
Filtering: pixel value of the nearest mode
Filtering examples
(figures: squirrel and baboon, original vs. filtered)
Mean-shift: A Robust Approach Towards Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Segmentation examples
Mean-shift: A Robust Approach Towards Feature Space Analysis, D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions
Computational curse of dimensionality: expensive range queries – implemented with LSH.
Statistical curse of dimensionality: sparseness of the data – variable bandwidth.
LSH-based data structure
• Choose L random partitions. Each partition includes K pairs (d_k, v_k).
• For each point x, check whether x_{d_k} ≤ v_k for each of the K pairs.
• This partitions the data into cells.
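A sketch of this structure, assuming axis-aligned cuts: each of the L partitions is K (dimension, value) pairs, and a point's bucket key is the K-bit vector of comparison outcomes:

```python
import random
from collections import defaultdict

def make_partition(points, K, rng):
    """One random partition: K (dimension, cut-value) pairs, with cut
    values drawn uniformly over each dimension's data range."""
    dim = len(points[0])
    part = []
    for _ in range(K):
        d = rng.randrange(dim)
        lo, hi = min(p[d] for p in points), max(p[d] for p in points)
        part.append((d, rng.uniform(lo, hi)))
    return part

def bucket_key(x, partition):
    """The K boolean comparisons x[d_k] <= v_k form the cell label."""
    return tuple(x[d] <= v for d, v in partition)

def build_tables(points, K, L, rng):
    """L independent partitions, each hashing every point into its cell."""
    tables = []
    for _ in range(L):
        part = make_partition(points, K, rng)
        table = defaultdict(list)
        for i, p in enumerate(points):
            table[bucket_key(p, part)].append(i)
        tables.append((part, table))
    return tables

def query(q, tables):
    """Candidate neighbors: the union of q's buckets over the L partitions."""
    hits = set()
    for part, table in tables:
        hits.update(table[bucket_key(q, part)])
    return hits

rng = random.Random(0)
pts = [(0.0, 0.0), (0.1, 0.1), (5.0, 5.0)]
tables = build_tables(pts, K=2, L=3, rng=rng)
print(sorted(query((0.05, 0.05), tables)))
```

Only points sharing a bucket with the query in at least one of the L partitions are ever compared against it.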
Choosing the optimal K and L
• For a query q, compute distances only to the points in its buckets – as few distances as possible.
Large K – a smaller number of points in a cell.
If L is too small, points might be missed; but if L is too big, extra points might be included.
(figure: a query's cell in each partition; the union C∪ and intersection C∩ of its L cells)
As L increases, C∪ increases but C∩ decreases; K determines the resolution of the data structure.
Choosing optimal K and L
Determine accurately the KNN distance (bandwidth) for m randomly-selected data points.
Choose an error threshold ε.
The optimal K and L should keep the approximate KNN distance within the threshold of the true distance.
Choosing optimal K and L
• For each K, estimate the error for each L.
• In one run over all L's, find the minimal L satisfying the constraint: L(K).
• Minimize the running time t(K, L(K)).
(figures: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)], whose minimum gives the chosen pair)
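The selection loop can be sketched as a grid search. Here `error` and `time` stand in for the quantities measured on the m sample points; the analytic shapes used in the demo are purely illustrative, not from the paper:

```python
def choose_k_l(k_values, l_values, error, time, eps=0.05):
    """For each K, find the minimal L meeting the error constraint (L(K)),
    then return the (K, L(K)) pair minimizing the running time."""
    best = None
    for K in k_values:
        L = next((L for L in l_values if error(K, L) <= eps), None)
        if L is None:
            continue  # no L satisfies the constraint for this K
        t = time(K, L)
        if best is None or t < best[2]:
            best = (K, L, t)
    return best

# illustrative stand-ins: error falls as L grows (faster for small K),
# running time grows with both K and L
err = lambda K, L: K / (8.0 * L)
cost = lambda K, L: K * L
print(choose_k_l(range(4, 21, 4), range(1, 200), err, cost))  # -> (4, 10, 40)
```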
Data-driven partitions
• In the original LSH, cut values are random in the range of the data.
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value.
(figure: bucket point-distribution, uniform vs. data-driven cuts)
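The suggestion in one line: draw the cut from the empirical distribution instead of the uniform one (a sketch):

```python
import random

def data_driven_cut(points, dim, rng):
    """Cut value = that coordinate of a randomly chosen data point,
    so cuts fall where the data is and buckets stay better balanced."""
    return rng.choice(points)[dim]

# skewed data: a uniform cut over [0.0, 9.9] would usually split off nothing
pts = [(0.0,), (0.1,), (0.2,), (9.9,)]
print(data_driven_cut(pts, dim=0, rng=random.Random(1)))
```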
Additional speedup
Assume that all points in C∩ will converge to the same mode (C∩ is like a type of an aggregate).
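A sketch of this aggregation: run the iterative convergence once per cell and reuse the mode for every other point in that cell. `cell_of` and `converge` are hypothetical stand-ins for the C∩ assignment and the mean-shift iteration:

```python
def cluster_with_cell_cache(points, cell_of, converge):
    """Pay the convergence cost once per cell; later points reuse the mode."""
    mode_of_cell = {}
    modes = []
    for p in points:
        c = cell_of(p)
        if c not in mode_of_cell:
            mode_of_cell[c] = converge(p)   # expensive step, once per cell
        modes.append(mode_of_cell[c])       # cheap lookup for the rest
    return modes

calls = []
def toy_converge(p):
    calls.append(p)          # count how often the expensive step runs
    return round(p)          # pretend the mode is the nearest integer

modes = cluster_with_cell_cache([0.1, 0.2, 0.9, 1.1, 0.15],
                                cell_of=round, converge=toy_converge)
print(modes, len(calls))  # -> [0, 0, 1, 1, 0] 2
```

Five points, but only two runs of the iteration: one per occupied cell.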
Speedup results
(table: 65,536 points, 1,638 points sampled, k = 100)
Food for thought
Low dimension vs. high dimension
A thought for food…
• Choose K, L by sample learning, or take the traditional values?
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed.
• The catch: efficient dimensionality learning requires KNN.
15:30: cookies…
Summary
• LSH suggests a compromise on accuracy for a gain in complexity.
• Applications that involve massive data in high dimension require the LSH's fast performance.
• Extension of the LSH to different spaces (PSH).
• Learning the LSH parameters and hash functions for different applications.
Conclusion
• But at the end, everything depends on your data set.
• Try it at home:
– Visit http://web.mit.edu/andoni/www/LSH/index.html
– Email Alex Andoni (andoni@mit.edu)
– Test over your own data (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
otherwise 1-
T(x) if 1)(
xh T
A binary hash functionfeatures
otherwise 1
if 1ˆ
labels ePredict th
)(xh)(xh)x(xy
jTiTjih
Feature Extraction PSH LWR
Feature
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Feature Extraction PSH LWR
sconstraint iesprobabilit with thelabeling true thepredicts that Tbest theFind
themseparateor bin
same in the examplesboth place willTh
)(xT
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Local Weighted Regression (LWR)bull Given a query image PSH returns
KNNs
bull LWR uses the KNN to compute a weighted average of the estimated angles of the query
weightdist
iXiixNx
xxdKxgdi
0)(
))(())((minarg0
Feature Extraction PSH LWR
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Results
Synthetic data were generated
bull 13 angles 1 for rotation of the torso 12 for joints
bull 150000 images
bull Nuisance parameters added clothing illumination face expression
bull 1775000 example pairs
bull Selected 137 out of 5123 meaningful features (how)
18 bit hash functions (k) 150 hash tables (l)
bull Test on 1000 synthetic examplesbull PSH searched only 34 of the data per query
bull Without selection needed 40 bits and
1000 hash tables
Recall P1 is prob of positive hashP2 is prob of bad hashB is the max number of pts in a bucket
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries → implemented with LSH
• Statistical curse of dimensionality: sparseness of the data → variable bandwidth
LSH-based data structure
• Choose L random partitions; each partition includes K pairs (dk, vk).
• For each point x we test x[dk] ≤ vk, k = 1…K.
• This partitions the data into cells.
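The partition scheme above (L partitions of K (dk, vk) pairs each, with buckets keyed by the pattern of comparisons) can be sketched as follows. Drawing the cut values uniformly over the data range is an assumption here, as are the function names:

```python
import random
from collections import defaultdict

def build_lsh(points, K, L, seed=0):
    """L random partitions; each is K (dimension, cut-value) pairs.
    A point's bucket key in one partition is the K-bit pattern of
    x[d_k] <= v_k tests."""
    rng = random.Random(seed)
    d = len(points[0])
    lo = [min(p[i] for p in points) for i in range(d)]
    hi = [max(p[i] for p in points) for i in range(d)]
    tables = []
    for _ in range(L):
        cuts = []
        for _ in range(K):
            dim = rng.randrange(d)
            cuts.append((dim, rng.uniform(lo[dim], hi[dim])))
        buckets = defaultdict(list)
        for idx, p in enumerate(points):
            key = tuple(p[dim] <= v for dim, v in cuts)
            buckets[key].append(idx)
        tables.append((cuts, buckets))
    return tables

def candidates(tables, q):
    """Union of q's buckets across the L partitions."""
    out = set()
    for cuts, buckets in tables:
        key = tuple(q[dim] <= v for dim, v in cuts)
        out.update(buckets.get(key, ()))
    return out
```

A query then inspects one bucket per partition, so only the union of L buckets is searched instead of all n points.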
Choosing the optimal K and L
• For a query q, we want to compute the smallest number of distances to points in its buckets.
• Large K: smaller number of points in a cell.
• If L is too small, points might be missed; but if L is too big, extra points might be included.
• K determines the resolution of the data structure.
• As L increases, [the number of retrieved points] increases but [the chance of missing neighbors] decreases.
[Equations relating the expected cell population and query time to n, d, K, and L were lost in extraction.]
Choosing optimal K and L
• Determine accurately the KNN distance (bandwidth) for m randomly selected data points.
• Choose an error threshold ε.
• The optimal K and L should satisfy an error bound relating the approximate distance to the true KNN distance.
Choosing optimal K and L
• For each K, estimate the error.
• In one run over all L's, find the minimal L satisfying the constraint: L(K).
• Minimize the running time t(K, L(K)).
[Plots: approximation error for (K, L); L(K) for ε = 0.05; running time t[K, L(K)] with its minimum marked]
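The selection procedure on these slides boils down to a small search. This sketch takes the error estimator and the time model as caller-supplied callables, since the slides' exact formulas were lost in extraction:

```python
def choose_parameters(error, time_cost, K_values, L_values, eps):
    """For each K, find the minimal L whose estimated error is <= eps
    (L_values must be sorted ascending), then return the (K, L(K)) pair
    with the smallest modeled running time."""
    best = None
    for K in K_values:
        for L in L_values:
            if error(K, L) <= eps:
                t = time_cost(K, L)
                if best is None or t < best[0]:
                    best = (t, K, L)
                break  # minimal L for this K found
    if best is None:
        raise ValueError("no (K, L) meets the error threshold")
    return best[1], best[2]
```

In practice `error(K, L)` would be measured by comparing LSH results against the exact KNN of the m sampled points, and `time_cost` by timing queries.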
Data-driven partitions
• In the original LSH, cut values are random in the range of the data.
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value.
[Figure: bucket distribution for uniform vs. data-driven cut points]
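The two cut-value strategies compared above can be sketched side by side; both helper names are illustrative, not from the paper:

```python
import random

def uniform_cut(points, dim, rng):
    """Original LSH: cut value drawn uniformly over the data range."""
    vals = [p[dim] for p in points]
    return rng.uniform(min(vals), max(vals))

def data_driven_cut(points, dim, rng):
    """Slide's suggestion: a random data point's coordinate, so cuts
    land where the data actually is and buckets come out balanced."""
    return rng.choice(points)[dim]
```

With skewed data (say, a tight cluster plus a far outlier), uniform cuts tend to fall in the empty gap and leave one bucket holding almost everything, while data-driven cuts split the cluster itself.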
Additional speed-up
• Assume that all points in C will converge to the same mode (C is like a type of aggregate).
Speed-up results
65,536 points; 1,638 points sampled; k = 100
Food for thought
Low dimension High dimension
A thought for food…
• Choose K, L by sample learning, or take the traditional values?
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, dimensionality implies the number of hash functions needed.
• The catch: efficient dimensionality learning requires KNN.
15:30 — cookies…
Summary
• LSH suggests a compromise: accuracy is traded for reduced complexity.
• Applications that involve massive data in high dimension require LSH's fast performance.
• Extensions of LSH to different spaces (PSH).
• Learning the LSH parameters and hash functions for different applications.
Conclusion
• But at the end, everything depends on your data set.
• Try it at home:
– Visit http://web.mit.edu/andoni/www/LSH/index.html
– Email Alex Andoni (andoni@mit.edu)
– Test over your own data
(C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
• 1,775,000 example pairs
• Selected 137 out of 5,123 meaningful features (how?)
• 18-bit hash functions (k), 150 hash tables (l)
• Test on 1,000 synthetic examples
• PSH searched only 34 of the data per query
• Without feature selection, 40 bits and 1,000 hash tables would have been needed
Recall: P1 is the probability of a positive hash; P2 is the probability of a bad hash; B is the max number of points in a bucket.
Results – real data
• 800 images
• Processed by a segmentation algorithm
• 13 of the data were searched
Results – real data
Interesting mismatches
Fast pose estimation – summary
• A fast way to compute the angles of a human body figure
• Moving from one representation space to another
• Training a sensitive hash function
• KNN smart averaging
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Results ndash real data
bull 800 images
bull Processed by a segmentation algorithm
bull 13 of the data were searched
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Results ndash real data
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Interesting mismatches
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Fast pose estimation - summary
bull Fast way to compute the angles of human body figure
bull Moving from one representation space to another
bull Training a sensitive hash function
bull KNN smart averaging
Food for Thought
bull The basic assumption may be problematic (distance metric representations)
bull The training set should be dense
bull Texture and clutter
bull General some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
[Figure: low dimension vs. high dimension]
A thought for food…
• Choose K, L by sample learning, or take the traditional values?
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning requires KNN
15:30 cookies…
Summary
• LSH trades some accuracy for a large gain in complexity
• Applications that involve massive data in high dimensions require the fast performance of LSH
• Extensions of LSH to different spaces (PSH)
• Learning the LSH parameters and hash functions for different applications
Conclusion
• …but in the end, everything depends on your data set
• Try it at home:
– Visit http://web.mit.edu/andoni/www/LSH/index.html
– Email Alex Andoni (andoni@mit.edu)
– Test it over your own data (C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
Food for Thought
• The basic assumption may be problematic (distance metric, representations)
• The training set should be dense
• Texture and clutter
• In general, some features are more important than others and should be weighted
Food for Thought Point Location in Different Spheres (PLDS)
• Given n spheres in R^d centered at P = {p_1, …, p_n}, with radii r_1, …, r_n
• Goal: preprocess the points in P so that, given a query q, we can find a point p_i whose sphere covers q
[Figure: query q inside the sphere of radius r_i around p_i]
Courtesy of Mohamad Hegaze
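The open problem asks for a data structure that beats the naive linear scan, which for reference can be sketched as:

```python
def covering_sphere(query, centers, radii):
    # naive O(n) scan: return the index of a sphere that covers the query, else None
    for i, (c, r) in enumerate(zip(centers, radii)):
        if sum((q - ci) ** 2 for q, ci in zip(query, c)) <= r * r:
            return i
    return None

centers = [(0.0, 0.0), (3.0, 0.0)]
radii = [1.0, 0.5]
print(covering_sphere((0.5, 0.5), centers, radii))  # → 0
print(covering_sphere((5.0, 5.0), centers, radii))  # → None
```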
Motivation
• Clustering high-dimensional data by using local density measurements (e.g., in feature space)
• Statistical curse of dimensionality: sparseness of the data
• Computational curse of dimensionality: expensive range queries
• LSH parameters should be adjusted for optimal performance
Mean-Shift Based Clustering in High Dimensions: A Texture Classification Example
B. Georgescu, I. Shimshoni, and P. Meer
Outline
• Mean-shift in a nutshell + examples
Our scope:
• Mean-shift in high dimensions – using LSH
• Speedups:
1. Finding optimal LSH parameters
2. Data-driven partitions into buckets
3. Additional speedup by using the LSH data structure
Mean-Shift in a Nutshell
[Figure: mean-shift window of the given bandwidth around a point]
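In symbols: with a flat kernel, one mean-shift step moves a point to the mean of the data points inside its bandwidth window, and iterating converges to a density mode. A minimal 1-D sketch (not the paper's multivariate implementation):

```python
def mean_shift_step(x, points, bandwidth):
    # flat (uniform) kernel: average of all points within the bandwidth window
    window = [p for p in points if abs(p - x) <= bandwidth]
    return sum(window) / len(window)

def mean_shift(x, points, bandwidth, tol=1e-6, max_iter=100):
    # iterate until the shift is negligible: x converges to a density mode
    for _ in range(max_iter):
        nx = mean_shift_step(x, points, bandwidth)
        if abs(nx - x) < tol:
            break
        x = nx
    return x

data = [1.0, 1.1, 1.2, 5.0, 5.1]
print(round(mean_shift(0.9, data, bandwidth=0.5), 3))  # → 1.1
```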
KNN in mean-shift
The bandwidth should be inversely proportional to the density in the region:
high density → small bandwidth; low density → large bandwidth
• Based on the kth nearest neighbor of the point
• The bandwidth is the distance to that neighbor
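A brute-force sketch of this rule (illustrative only; the paper computes these neighbors approximately with LSH):

```python
def knn_bandwidth(points, k):
    # bandwidth of each point = distance to its kth nearest neighbor,
    # so dense regions get small windows and sparse regions large ones
    hs = []
    for i, x in enumerate(points):
        dists = sorted(abs(x - p) for j, p in enumerate(points) if j != i)
        hs.append(dists[k - 1])
    return hs

data = [0.0, 0.1, 0.2, 5.0]
print(knn_bandwidth(data, k=2))  # → [0.2, 0.1, 0.2, 4.9]
```

The isolated point at 5.0 gets a bandwidth an order of magnitude larger than the clustered points, which is exactly the adaptive behavior the slide describes.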
Adaptive mean-shift vs non-adaptive
Image segmentation algorithm
1. Input: data in 5D (3 color + 2 x,y) or 3D (1 gray + 2 x,y)
2. Resolution controlled by the bandwidths h_s (spatial) and h_r (color)
3. Apply filtering
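The 5-D input of step 1 is the joint spatial–range representation: each pixel contributes its two coordinates plus its three color channels. A sketch with a hypothetical helper (normalization by h_s and h_r is omitted):

```python
def features_5d(image):
    # image: H x W grid of (r, g, b) pixels; joint domain = 2 spatial + 3 color coords
    return [(x, y, r, g, b)
            for y, row in enumerate(image)
            for x, (r, g, b) in enumerate(row)]

img = [[(10, 20, 30), (11, 21, 31)],
       [(12, 22, 32), (13, 23, 33)]]
feats = features_5d(img)
print(len(feats), feats[0])  # → 4 (0, 0, 10, 20, 30)
```

Mean-shift is then run on these 5-D vectors; dividing the spatial and color coordinates by h_s and h_r respectively is what gives the two bandwidths their role as resolution controls.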
[Figure: 3D feature space]
"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Image segmentation algorithm
4. Filtering: replace each pixel value with that of the nearest mode
[Figure: original, filtered, and segmented images]
Mean-shift trajectories
Filtering examples
[Figures: squirrel and baboon, original vs. filtered]
"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Segmentation examples
"Mean Shift: A Robust Approach Toward Feature Space Analysis", D. Comaniciu et al., TPAMI '02
Mean-shift in high dimensions
• Computational curse of dimensionality: expensive range queries → implemented with LSH
• Statistical curse of dimensionality: sparseness of the data → variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Food for Thought Point Location in Different Spheres (PLDS)
bull Given n spheres in Rd centered at P=p1hellippn
with radii r1helliprn
bull Goal given a query q preprocess the points in P to find point pi that its sphere lsquocoverrsquo the query q
qpi
ri
Courtesy of Mohamad Hegaze
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Motivationbull Clustering high dimensional data by using local
density measurements (eg feature space)bull Statistical curse of dimensionality
sparseness of the databull Computational curse of dimensionality
expensive range queriesbull LSH parameters should be adjusted for optimal
performance
Mean-Shift Based Clustering in High Dimensions A Texture Classification Example
B Georgescu I Shimshoni and P Meer
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Outline
bull Mean-shift in a nutshell + examples
Our scope
bull Mean-shift in high dimensions ndash using LSH
bull Speedups1 Finding optimal LSH parameters
2 Data-driven partitions into buckets
3 Additional speedup by using LSH data structure
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Mean-Shift in a Nutshellbandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
point
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
KNN in mean-shift
Bandwidth should be inversely proportional to the density in the region
high density - small bandwidth low density - large bandwidth
Based on kth nearest neighbor of the point
The bandwidth is
Adaptive mean-shift vs non-adaptive
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing optimal K and Lbull For each K estimate the error forbull In one run for all Lrsquos find the minimal L satisfying the constraint L(K)bull Minimize time t(KL(K))
minimum
Approximationerror for KL
L(K) for =005 Running timet[KL(K)]
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Data driven partitions
bull In original LSH cut values are random in the range of the databull Suggestion Randomly select a point from the data and use one of its coordinates as the cut value
uniform data driven pointsbucket distribution
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Additional speedup
aggregate)an of typea like is (C mode same
the toconverge willCin points all that Assume
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
C
C
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Image segmentation algorithm1 Input Data in 5D (3 color + 2 xy) or 3D (1 gray +2 xy)2 Resolution controlled by the bandwidth hs (spatial) hr (color)3 Apply filtering
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
3D
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Image segmentation algorithm
original segmented
filtered
Filtering pixel value of the nearest mode
Mean-shift trajectories
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
original squirrel filtered
original baboon filtered
Filtering examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Segmentation examples
Mean-shift A Robust Approach Towards Feature Space Analysis D Comaniciu et al TPAMI 02rsquo
Mean-shift in high dimensions
Computational curse of dimensionality
Statistical curse of dimensionality
Expensive range queries implemented with LSH
Sparseness of the data variable bandwidth
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
LSH-based data structure
bull Choose L random partitionsEach partition includes K pairs
(dkvk)bull For each point we check
kdi vxK
It Partitions the data into cells
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
Choosing the optimal K and L
bull For a query q compute smallest number of distances to points in its buckets
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
points extra includemight big toois L ifbut
missed bemight points small toois L If
cell ain points ofnumber smaller k Large
C
l
l
CC
dC
LNN
dKnN
)1(
C
C
Mean-shift LSH optimal kl LSH data partition
LSH LSH data struct
structure data theof resolution thedetermines
decreases but increases increases L As
C
CC
Choosing optimal K and LDetermine accurately the KNN for m randomly-selected data points
distance (bandwidth)
Choose error threshold
The optimal K and L should satisfy
the approximate distance
Choosing optimal K and L
• For each K, estimate the approximation error
• In one run over all L's, find the minimal L satisfying the constraint: L(K)
• Minimize the running time t(K, L(K))
[Plots: approximation error for K, L; L(K) for ε = 0.05; running time t[K, L(K)], minimum marked]
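The selection loop above can be sketched generically; `true_dist`, `approx_dist`, and `time_of` are hypothetical callables standing in for the brute-force KNN distances on the m sample queries, the LSH estimates under a given (K, L), and a timing run:

```python
def choose_k_l(Ks, Ls, true_dist, approx_dist, time_of, eps=0.05):
    """Pick (K, L(K)) per the slides: for each K take the minimal L
    meeting the error constraint, then keep the pair with the smallest
    measured running time.
    """
    best = None
    for K in Ks:
        for L in sorted(Ls):
            # Constraint: every approximate distance within (1 + eps)
            # of the corresponding true KNN distance.
            ok = all(a <= (1 + eps) * t
                     for a, t in zip(approx_dist(K, L), true_dist))
            if ok:
                t_run = time_of(K, L)
                if best is None or t_run < best[0]:
                    best = (t_run, K, L)
                break  # minimal satisfying L for this K
    return None if best is None else best[1:]
```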
Data-driven partitions
• In the original LSH, cut values are drawn uniformly at random over the range of the data
• Suggestion: randomly select a point from the data and use one of its coordinates as the cut value
[Figure: points-per-bucket distribution, uniform vs. data-driven cuts]
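A sketch of the two ways to draw cut values (the function name `cut_values` is illustrative, not from the slides):

```python
import numpy as np

def cut_values(data, dims, rng, data_driven=True):
    """One cut value per chosen dimension.

    data_driven=False: uniform over the data range (original LSH).
    data_driven=True: a coordinate of a randomly chosen data point,
    so cuts land where the points actually are.
    """
    if data_driven:
        rows = rng.integers(0, len(data), size=len(dims))
        return data[rows, dims]
    return np.array([rng.uniform(data[:, d].min(), data[:, d].max())
                     for d in dims])
```

On skewed data, uniform cuts leave many near-empty buckets; data-driven cuts follow the density, spreading points more evenly across buckets.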
Additional speedup
• Assume that all points in C will converge to the same mode (C is like a type of aggregate)
Speedup results
65,536 points; 1,638 points sampled; k = 100
Food for thought
Low dimension vs. high dimension

A thought for food…
• Choose K, L by sample learning, or take the traditional values
• Can one estimate K, L without sampling?
• Does it help to know the data dimensionality or the data manifold?
• Intuitively, the dimensionality implies the number of hash functions needed
• The catch: efficient dimensionality learning requires KNN
15:30 cookies…
Summary
• LSH trades some accuracy for a large gain in complexity
• Applications that involve massive data in high dimension require LSH's fast performance
• Extensions of LSH to different spaces (PSH)
• Learning the LSH parameters and hash functions for different applications
Conclusion
• But at the end, everything depends on your data set
• Try it at home:
  – Visit http://web.mit.edu/andoni/www/LSH/index.html
  – Email Alex Andoni (andoni@mit.edu)
  – Test it on your own data
(C code, under Red Hat Linux)
Thanks
• Ilan Shimshoni (Haifa)
• Mohamad Hegaze (Weizmann)
• Alex Andoni (MIT)
• Mica and Denis
bull Mica and Denis
Speedup results
65536 points 1638 points sampled k=100
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Food for thought
Low dimension High dimension
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
A thought for foodhellipbull Choose K L by sample learning or take the traditionalbull Can one estimate K L without samplingbull A thought for food does it help to know the data
dimensionality or the data manifoldbull Intuitively dimensionality implies the number of hash
functions neededbull The catch efficient dimensionality learning requires
KNN
1530 cookieshellip
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Summary
bull LSH suggests a compromise on accuracy for the gain of complexity
bull Applications that involve massive data in high dimension require the LSH fast performance
bull Extension of the LSH to different spaces (PSH)
bull Learning the LSH parameters and hash functions for different applications
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Conclusion
bull but at the endeverything depends on your data set
bull Try it at homendash Visit
httpwebmiteduandoniwwwLSHindexhtmlndash Email Alex Andoni Andonimitedundash Test over your own data
(C code under Red Hat Linux )
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis
Thanks
bull Ilan Shimshoni (Haifa)
bull Mohamad Hegaze (Weizmann)
bull Alex Andoni (MIT)
bull Mica and Denis