What Is the Most Efficient Way to Select Nearest Neighbor Candidates for Fast Approximate Nearest Neighbor Search?

Masakazu Iwamura, Tomokazu Sato and Koichi Kise (Osaka Prefecture University, Japan)

ICCV’2013

Sydney, Australia

Finding similar data

A basic but important problem in information processing.

Possible applications include near-duplicate detection, object recognition, document image retrieval, character recognition, face recognition, and gait recognition.

A typical solution: Nearest Neighbor (NN) Search

Finding similar data by NN Search

Desired properties: fast and accurate, and applicable to large-scale data.

The paper presents a way to realize faster approximate nearest neighbor search for a given accuracy.

Benefit from improvements in computing power.

Contents

1. NN and Approximate NN Search
2. Performance comparison
3. Keys to improve performance

Nearest Neighbor (NN) Search

This is a problem in which the true NN is always found.

[Figure: data points and a query]

In a naïve way, the query is compared with every data point, so more data requires more time.
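The naïve search can be sketched as a linear scan. This is a minimal illustration, not code from the talk; the function name `nearest_neighbor` and the toy data are assumptions made for the example.

```python
import math

def nearest_neighbor(data, query):
    """Return (index, distance) of the point in data closest to query."""
    best_index, best_dist = -1, math.inf
    for i, point in enumerate(data):
        dist = math.dist(point, query)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index, best_dist

# Toy data (hypothetical): three 2-D points and one query.
data = [(0.0, 0.0), (5.0, 5.0), (2.0, 1.0)]
index, dist = nearest_neighbor(data, (2.0, 2.0))
print(index, dist)  # 2 1.0
```

Every query touches every data point, so the cost grows linearly with the size of the data set.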

Nearest Neighbor (NN) Search

Finding the nearest neighbor efficiently

Before the query is given:

1. Index the data

After the query is given:

1. Select search regions
2. Calculate distances of the selected data

The true NN must be contained in the selected search regions. Ensuring this takes a long time.

[Figure: data, query, NN, and search regions]

Approximate Nearest Neighbor Search

Finding the nearest neighbor more efficiently

[Figure: search regions]

Much faster.

“Approximate” means that the true NN is not guaranteed to be retrieved.

ANN search on 100M SIFT features

Selected results

[Figure: performance comparison of IMI (Babenko 2012), IVFADC (Jegou 2011), and BDH (proposed method). Speedups marked on the plot: 2.0 times, 4.5 times, 9.4 times, 2.9 times.]


The novelty of BDH was reduced by IMI before we succeeded in publishing it… (For more detail, check out the Wakate program on Aug. 1.)


So-called binary coding is suitable for saving memory usage, not for fast retrieval.

Keys to improve performance

1. Select search regions in subspaces
2. Find the closest ones in the original space efficiently

Select search regions in subspaces

In past methods (IVFADC, Jegou 2011 and VQ-index, Tuncel 2002), the data are indexed by vector quantization (k-means clustering), and each cluster is a search region.

Pros.: proven to give the least quantization error.

Cons.: selecting the search regions takes a very long time.
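A vector-quantization index of this kind can be sketched as follows. This is an illustrative sketch only, not the IVFADC or VQ-index implementation: the centroids are assumed to have been produced by k-means beforehand, and `build_index`, `search`, and `n_probe` are hypothetical names.

```python
import math

def build_index(data, centroids):
    """Assign every point to the bucket of its nearest centroid."""
    buckets = [[] for _ in centroids]
    for i, point in enumerate(data):
        nearest = min(range(len(centroids)),
                      key=lambda c: math.dist(point, centroids[c]))
        buckets[nearest].append(i)
    return buckets

def search(data, centroids, buckets, query, n_probe=1):
    """Approximate NN: scan only the n_probe buckets closest to the query."""
    # Selecting regions costs one full-dimensional distance per centroid.
    order = sorted(range(len(centroids)),
                   key=lambda c: math.dist(query, centroids[c]))
    candidates = [i for c in order[:n_probe] for i in buckets[c]]
    if not candidates:
        return None
    return min(candidates, key=lambda i: math.dist(data[i], query))

centroids = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]  # assumed k-means output
data = [(1.0, 1.0), (9.0, 1.0), (4.9, 0.0), (1.0, 9.0), (5.2, 0.0)]
buckets = build_index(data, centroids)

query = (5.04, 0.0)
print(search(data, centroids, buckets, query, n_probe=1))  # 4 (true NN missed)
print(search(data, centroids, buckets, query, n_probe=2))  # 2 (true NN found)
```

With one probed region the true NN (index 2) is missed because it lies just across a cluster boundary, which is what "approximate" means here. The `sorted` call also shows where the selection cost comes from: the query is compared with every centroid in the full-dimensional space.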

Select search regions in subspaces

In the past state-of-the-art (IMI, Babenko 2012), the data are indexed by product quantization:

1. Divide the feature vectors into two or more subvectors
2. Index each subspace by k-means clustering
3. Calculate distances in the subspaces
4. Select the regions in the original space

Pros.: much less processing time.

Cons.: less accurate (more quantization error).

Realize a better ratio efficiently.
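The subspace idea can be illustrated with a small sketch. It is not the IMI code; the codebooks are hypothetical stand-ins for k-means output in each subspace, and squared Euclidean distance is assumed so that subspace distances add up exactly.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split(vector, n_parts=2):
    """Divide a vector into n_parts equal-length subvectors."""
    size = len(vector) // n_parts
    return [vector[i * size:(i + 1) * size] for i in range(n_parts)]

def subspace_distances(query, codebooks):
    """Squared distance from each query subvector to each subspace centroid."""
    return [[sq_dist(sub, c) for c in book]
            for sub, book in zip(split(query, len(codebooks)), codebooks)]

# Hypothetical codebooks: 2 centroids per 2-D subspace, giving 2 * 2 = 4
# regions in the original 4-D space.
codebooks = [
    [(0.0, 0.0), (4.0, 4.0)],  # subspace 1
    [(1.0, 1.0), (5.0, 5.0)],  # subspace 2
]
query = (1.0, 0.0, 4.0, 5.0)
tables = subspace_distances(query, codebooks)
print(tables)  # [[1.0, 25.0], [25.0, 1.0]]

# The squared distance to a region centroid in the original space equals
# the sum of the squared subspace distances.
for i, c1 in enumerate(codebooks[0]):
    for j, c2 in enumerate(codebooks[1]):
        assert tables[0][i] + tables[1][j] == sq_dist(query, c1 + c2)
```

With k centroids per subspace, two subspaces address k × k regions while the query is compared with only 2k half-length centroids, which is where the saving in processing time comes from.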

Find the closest search regions in original space

In the past state-of-the-art (IMI, Babenko 2012), search regions are selected in the ascending order of distances in the original space.

[Figure: grid of search regions formed by the centroids in each subspace. Distances in one subspace: 1, 3, 8, 15. Distances in the other subspace (subspace 2): 1, 2. Distances to the centroids in the original space: 2, 4, 9, 16 and 3, 5, 10.]

This can be done more efficiently with the branch and bound method. It does not consider the order of selecting buckets.
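Selecting regions in strictly ascending order of distance can be sketched with a priority queue. This is a generic sketch of ordered enumeration over two sorted distance lists, not the IMI implementation; the function name `ascending_regions` is hypothetical, and the example distances (1, 3, 8, 15 and 1, 2) are illustrative.

```python
import heapq

def ascending_regions(d1, d2):
    """Yield (distance, i, j) in ascending order of d1[i] + d2[j].

    d1 and d2 must be non-empty and sorted in ascending order.
    """
    heap = [(d1[0] + d2[0], 0, 0)]
    seen = {(0, 0)}
    while heap:
        dist, i, j = heapq.heappop(heap)
        yield dist, i, j
        # Only the right and lower neighbours can be the next smallest sum.
        for ni, nj in ((i + 1, j), (i, j + 1)):
            if ni < len(d1) and nj < len(d2) and (ni, nj) not in seen:
                seen.add((ni, nj))
                heapq.heappush(heap, (d1[ni] + d2[nj], ni, nj))

d1 = [1, 3, 8, 15]  # distances to the centroids in one subspace
d2 = [1, 2]         # distances to the centroids in the other subspace
order = [dist for dist, _, _ in ascending_regions(d1, d2)]
print(order)  # [2, 3, 4, 5, 9, 10, 16, 17]
```

Keeping the exact order requires a heap operation and a visited-set lookup for every region that is produced, which is the overhead a method that ignores the order can avoid.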

Find the closest search regions in original space efficiently

In the proposed method:

[Figure: centroids in each subspace, with the upper limit ("Max 8") marked in each subspace]

Assume that the upper limit is set to 8.

The upper and lower bounds are increased in a step-by-step manner until a sufficient number of data points are selected.
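The bound-based selection can be sketched as a small branch and bound. This is a sketch under stated assumptions, not the authors' BDH implementation: the fixed increment `step`, the function names, the example distances, and the bucket sizes are all hypothetical.

```python
def regions_within(tables, lower, upper):
    """Branch and bound: region index tuples whose summed subspace distance d
    satisfies lower < d <= upper. Each table must be sorted ascending."""
    n = len(tables)
    min_rest = [0] * (n + 1)  # smallest sum obtainable from subspaces s..n-1
    max_rest = [0] * (n + 1)  # largest sum obtainable from subspaces s..n-1
    for s in range(n - 1, -1, -1):
        min_rest[s] = min_rest[s + 1] + tables[s][0]
        max_rest[s] = max_rest[s + 1] + tables[s][-1]
    found = []

    def visit(s, partial, chosen):
        if s == n:
            found.append(tuple(chosen))
            return
        for idx, d in enumerate(tables[s]):
            if partial + d + min_rest[s + 1] > upper:
                break      # every later centroid is even farther
            if partial + d + max_rest[s + 1] <= lower:
                continue   # whole branch was covered by earlier bounds
            visit(s + 1, partial + d, chosen + [idx])

    visit(0, 0, [])
    return found

def select_regions(tables, bucket_sizes, needed, step):
    """Raise the bounds step by step until enough data points are selected."""
    max_total = sum(t[-1] for t in tables)
    selected, count = [], 0
    lower, upper = float("-inf"), step
    while count < needed and lower < max_total:
        for region in regions_within(tables, lower, upper):
            selected.append(region)
            count += bucket_sizes.get(region, 0)
        lower, upper = upper, upper + step
    return selected

tables = [[1, 3, 8, 15], [1, 2]]  # sorted subspace distances (illustrative)
bucket_sizes = {(0, 0): 3, (0, 1): 2, (1, 0): 4, (1, 1): 1,
                (2, 0): 5, (2, 1): 2, (3, 0): 1, (3, 1): 6}
print(regions_within(tables, float("-inf"), 8))
# [(0, 0), (0, 1), (1, 0), (1, 1)]
print(len(select_regions(tables, bucket_sizes, needed=12, step=8)))  # 7
```

With the upper limit at 8, the four regions within the bound are returned without sorting them against each other. When they hold too few data points, the bounds move to (8, 16] and only the newly covered regions are added.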

