Slides for the paper presented at MIRAGE 2009 http://link.springer.com/chapter/10.1007%2F978-3-642-01811-4_38?LI=true
K-nearest neighbor search GPU-based brute-force method Experiments
Searching high-dimensional neighbors: CPU-based tailored data-structures versus GPU-based brute-force method
Vincent Garcia and Frank Nielsen
Ecole Polytechnique, Palaiseau, France
Mirage 2009
Garcia, Nielsen Ecole Polytechnique, Palaiseau, France
Searching high-dimensional neighbors: CPU-based tailored data-structures versus GPU-based brute-force method
K-nearest neighbor search problem (kNN)
Input data
R = {r_1, r_2, ..., r_m}: set of m reference points in R^d
Q = {q_1, q_2, ..., q_n}: set of n query points in R^d
Problem: For a given query point q, find the k nearest neighbors of q in R.
Applications
Image processing
Tracking
Texture synthesis
Image retrieval
Statistics
Shannon entropy
Kullback-Leibler divergence
Other applications
Physics
Biology
Finance
Multimedia
State of the art
Brute-force method
Compute the distances between q and the reference points
Sort the distances and deduce the k-nearest neighbors
Space-partitioning-based methods
Partition the reference points (kd-tree)
Fast kNN search using this structure
Locality sensitive hashing (LSH)
Apply a set of hashing functions to the reference points (buckets)
Apply the same hashing functions to q
Compute the distances between q and the reference points belonging to the same bucket
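The three LSH steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say which hash family is used, so the sketch assumes quantised random projections, a single hash table, and hypothetical helper names (`make_hash`, `build_buckets`, `lsh_knn`). The result is approximate and may contain fewer than k points.

```python
import math
import random
from collections import defaultdict

def make_hash(dim, num_proj, width, rng):
    """Return h(p): a tuple of quantised random projections (assumed hash family)."""
    projs = [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
             for _ in range(num_proj)]

    def h(p):
        return tuple(math.floor((sum(a * x for a, x in zip(vec, p)) + b) / width)
                     for vec, b in projs)
    return h

def build_buckets(refs, h):
    """Step 1: hash every reference point into a bucket (stores indices)."""
    buckets = defaultdict(list)
    for i, r in enumerate(refs):
        buckets[h(r)].append(i)
    return buckets

def lsh_knn(q, refs, buckets, h, k):
    """Steps 2 and 3: hash q, then rank only the points sharing its bucket."""
    candidates = buckets.get(h(q), [])
    return sorted(candidates, key=lambda i: math.dist(q, refs[i]))[:k]

rng = random.Random(0)
refs = [[rng.random() for _ in range(8)] for _ in range(200)]
h = make_hash(dim=8, num_proj=4, width=1.0, rng=rng)
buckets = build_buckets(refs, h)
approx = lsh_knn(refs[5], refs, buckets, h, k=3)
```

Practical LSH schemes typically use several hash tables to reduce the chance of missing a true neighbor; the single table here keeps the sketch short.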
GPGPU
GPU: Graphics Processing Unit
Set of graphics processors (up to 240 at 1.5 GHz)
Dedicated graphics rendering device
Specialized in highly parallel processing
GPGPU: General-Purpose computing on GPU
Technique of using a GPU to perform computation in applications traditionally handled by the CPU
NVIDIA CUDA (since Feb. 2007)
OpenCL (specifications in Dec. 2008)
GPU-based brute-force method
Brute-force kNN search is highly parallelizable
Idea
Propose a GPU implementation of the brute-force kNN search using NVIDIA CUDA
Input data: m reference points and n query points
Kernels
1. Compute the m × n distances in parallel
2. Sort the n sets of distances in parallel
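The two kernels listed above can be mirrored by a serial CPU sketch in Python. It is not the CUDA implementation from the slides: the function names are hypothetical and the Euclidean distance is an assumption. It only shows why both steps parallelize well, since every distance entry and every per-query sort is independent of the others.

```python
import math

def distance_matrix(refs, queries):
    """Step 1: all n x m distances; row i holds the distances from query i."""
    return [[math.dist(q, r) for r in refs] for q in queries]

def knn_from_distances(dist_rows, k):
    """Step 2: sort each row independently and keep the k smallest entries."""
    result = []
    for row in dist_rows:
        order = sorted(range(len(row)), key=row.__getitem__)
        result.append([(j, row[j]) for j in order[:k]])
    return result

def brute_force_knn(refs, queries, k):
    return knn_from_distances(distance_matrix(refs, queries), k)

refs = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
queries = [(0.1, 0.0), (2.9, 3.0)]
neighbors = brute_force_knn(refs, queries, k=2)
```

Each entry of `neighbors` is a list of `(reference index, distance)` pairs for one query, ordered from nearest to farthest.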
GPU-based brute-force method: sorting algorithm
We have to perform n sorts of m distances
Sorting algorithms:
Quicksort (recursive algorithm)
Comb sort
Modified insertion sort
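Two of the candidates above are easy to sketch in Python. The comb sort is the standard algorithm with the common shrink factor 1.3. The slides do not say how the insertion sort is modified; the version below assumes one plausible reading, namely that only the k smallest distances are kept sorted, since kNN never needs the rest. Both function names are hypothetical.

```python
def comb_sort(values, shrink=1.3):
    """Standard comb sort: bubble-sort passes with a shrinking gap."""
    a = list(values)
    gap = len(a)
    swapped = True
    while gap > 1 or swapped:
        gap = max(1, int(gap / shrink))
        swapped = False
        for i in range(len(a) - gap):
            if a[i] > a[i + gap]:
                a[i], a[i + gap] = a[i + gap], a[i]
                swapped = True
    return a

def k_smallest_insertion(values, k):
    """Insertion sort that keeps only the k smallest values (assumed modification)."""
    if k <= 0:
        return []
    best = []  # sorted ascending, never longer than k
    for v in values:
        if len(best) == k and v >= best[-1]:
            continue  # v cannot enter the current top-k
        if len(best) < k:
            best.append(v)
        else:
            best[-1] = v  # overwrite the current k-th smallest
        i = len(best) - 1
        while i > 0 and best[i - 1] > best[i]:
            best[i - 1], best[i] = best[i], best[i - 1]
            i -= 1
    return best

distances = [5.0, 1.0, 4.0, 2.0, 3.0]
full = comb_sort(distances)
top3 = k_smallest_insertion(distances, 3)
```

Under this assumption the partial sort does at most k shifts per element, so its cost grows with k rather than with the number of reference points alone.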
Experiments
Goal
Decrease the computation time of the kNN search
Compare our GPU-based implementation with a similar CPU-based approach
Compare our GPU-based implementation with a fast CPU-based approach
Setup
Pentium 4 @ 3.4 GHz, 2 GB of DDR2 PC2-5300 memory
NVIDIA GeForce 8800 GTX, 768 MB of GDDR3 memory, 128 graphics processors @ 1350 MHz
Latest graphics card: NVIDIA GeForce GTX 285, 1024 MB of GDDR3 memory, 240 graphics processors @ 1476 MHz
Experiments: Synthetic data
Parameters
n: number of reference points and query points
d: dimension of the points
k: number of neighbors
Data: points randomly drawn from U(0, 1).
Methods
BF-Matlab: Brute-force method implemented in Matlab
BF-C: Brute-force method implemented in C
BF-CUDA: Brute-force method implemented in CUDA
ANN-C++: ANN method implemented in C++
Experiments: Synthetic data
       Methods     n=1200   n=2400   n=4800   n=9600
d=8    BF-Matlab     0.51     1.69     7.84    35.08
       BF-C          0.13     0.49     1.90     7.53
       ANN-C++       0.13     0.33     0.81     2.43
       BF-CUDA       0.01     0.02     0.04     0.13
d=64   BF-Matlab     2.24     9.37    38.16   149.76
       BF-C          1.71     7.28    26.11   111.91
       ANN-C++       0.78     3.56    14.66    59.28
       BF-CUDA       0.02     0.04     0.11     0.40
d=96   BF-Matlab     3.30    13.89    55.77   231.69
       BF-C          2.54    10.56    39.26   168.58
       ANN-C++       1.20     4.96    19.68    82.45
       BF-CUDA       0.02     0.05     0.15     0.57
Maximum speed-up of BF-CUDA
407 X faster than BF-Matlab
295 X faster than BF-C
148 X faster than ANN-C++
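The maximum speed-ups can be rechecked from the timings in the table above. The short Python sketch below divides each CPU timing by the BF-CUDA timing for the same d and n and keeps the largest ratio; the variable and function names are hypothetical. Because the table entries are rounded to two decimals, the ratios agree with the quoted figures only to within about one unit.

```python
# Timings copied from the table; columns are n = 1200, 2400, 4800, 9600.
times = {
    8: {"BF-Matlab": [0.51, 1.69, 7.84, 35.08],
        "BF-C": [0.13, 0.49, 1.90, 7.53],
        "ANN-C++": [0.13, 0.33, 0.81, 2.43],
        "BF-CUDA": [0.01, 0.02, 0.04, 0.13]},
    64: {"BF-Matlab": [2.24, 9.37, 38.16, 149.76],
         "BF-C": [1.71, 7.28, 26.11, 111.91],
         "ANN-C++": [0.78, 3.56, 14.66, 59.28],
         "BF-CUDA": [0.02, 0.04, 0.11, 0.40]},
    96: {"BF-Matlab": [3.30, 13.89, 55.77, 231.69],
         "BF-C": [2.54, 10.56, 39.26, 168.58],
         "ANN-C++": [1.20, 4.96, 19.68, 82.45],
         "BF-CUDA": [0.02, 0.05, 0.15, 0.57]},
}

def max_speedup(times, method, fast="BF-CUDA"):
    """Largest ratio time(method) / time(fast) over all (d, n) cells."""
    return max(slow / quick
               for per_d in times.values()
               for slow, quick in zip(per_d[method], per_d[fast]))

speedups = {m: max_speedup(times, m) for m in ("BF-Matlab", "BF-C", "ANN-C++")}
```

The largest ratios against BF-Matlab and BF-C occur at d = 96, n = 9600, while the largest ratio against ANN-C++ occurs at d = 64, n = 9600.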
Experiments: Synthetic data
Influence of the parameter k
d = 32, n = 4800
Experiments: Synthetic data
Influence of the parameter d
k = 20, n = 4800
Experiments: Synthetic data
Influence of the parameter n
k = 20, d = 32
Experiments: Finding similar patches in images
Goal
Manually define the first patch
Find the k most similar patches
A color patch of size 21 × 21 = a point in R^1323 (21 × 21 × 3 = 1323)
For a color image of size 128 × 128, the problem is equivalent to finding the kNN among 16383 points.
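The mapping from a patch to a point can be sketched as follows. This is a minimal Python illustration with a hypothetical helper `patch_vector` and a synthetic test image. It assumes the patch center lies at least 10 pixels from the border, because the slides do not say how border pixels are handled.

```python
def patch_vector(image, row, col, half=10):
    """Flatten the (2*half+1) x (2*half+1) patch centered at (row, col)."""
    vec = []
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            vec.extend(image[r][c])  # image[r][c] is an (R, G, B) tuple
    return vec

# Synthetic 128 x 128 color image, used only to exercise the helper.
image = [[(r % 256, c % 256, (r + c) % 256) for c in range(128)]
         for r in range(128)]
v = patch_vector(image, 64, 64)
dimension = len(v)
```

For a gray-level image each pixel contributes one value instead of three, giving 21 × 21 = 441 dimensions.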
Experiments: Finding similar patches in images
Data
Image size: 128 × 128 = 16384 pixels
Patch size: 21 × 21
k = 10
Gray level image (dimension=441)
ANN-C++: 3550 ms
BF-CUDA: 60 ms
Speed-up = 60X
Color image (dimension=1323)
ANN-C++: 11.03 s
BF-CUDA: 0.15 s
Speed-up = 75X
Experiments: Texture synthesis
Texture synthesis: Efros and Leung, ICCV’99
[Figure: source image I_s and target image I_t; the target is filled in scanline order using an L-shape window of width 2s + 1]
Principle
Fill the target image with random-colored pixels
For a given pixel, find the closest L-shape in the source image
Assign the value of the L-shape center to the current pixel
The L-shape corresponding to a square window of size (2s + 1) × (2s + 1) contains 2s^2 + 2s pixels
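The pixel count of the L-shape can be checked by enumerating its offsets. The Python sketch below assumes the L-shape consists of the pixels already visited in scanline order inside the window, that is, the s rows above the current pixel plus the s pixels to its left. This reading is an assumption, but it reproduces the count given above. The function name is hypothetical.

```python
def l_shape_offsets(s):
    """Offsets (dy, dx) of the L-shape inside a (2s+1) x (2s+1) window."""
    offsets = []
    for dy in range(-s, 1):
        for dx in range(-s, s + 1):
            if dy < 0 or dx < 0:  # rows above, plus pixels left of the center
                offsets.append((dy, dx))
    return offsets

s = 10
count = len(l_shape_offsets(s))   # 2*s*s + 2*s
gray_dimension = count            # one value per pixel
color_dimension = 3 * count       # three channels per pixel
```

With s = 10 the neighborhood has 220 pixels, so a gray-level L-shape is a point in R^220 and a color one a point in R^660.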
Experiments: Texture synthesis
Efros and Leung
Experiments: Texture synthesis
Data
Source image: 64 × 64 = 4096 pixels
Target image: 128 × 128 = 16384 pixels
Gray level image (dimension=220)
ANN-C++: 720 ms
BF-CUDA: 18 ms
Speed-up = 40X
Color image (dimension=660)
ANN-C++: 2 s
BF-CUDA: 40 ms
Speed-up = 50X
Conclusion
We proposed a GPU-based brute-force kNN search implementation
Using a GPU, a simple but highly parallelizable method can be faster than a smart algorithm
Speed-up obtained on synthetic data
up to 400 X faster than a Matlab implementation
up to 300 X faster than a C implementation
up to 150 X faster than the ANN C++ library
Speed-up obtained on real image processing applications
Finding similar patches: 75 X
Texture synthesis: 50 X
Questions
Thanks for your attention