A Hardware Processing Unit For Point Sets S. Heinzle, G. Guennebaud, M. Botsch, M. Gross Graphics Hardware 2008


A Hardware Processing Unit For Point Sets

S. Heinzle, G. Guennebaud, M. Botsch, M. Gross

Graphics Hardware 2008


Motivation

• Point-based graphics established
• Powerful algorithms
  – Representation
  – Processing
  – Manipulation
  – Rendering
• Decomposition
  – Get neighborhood
  – Operate on neighbors


Motivation

• GPUs not suited for getting neighborhood
  – SIMD
  – Incoherent branching
  – Dynamic data structures slow
  – Recursive calls not supported
• CPUs
  – Small number of FPUs
  – Inflexible memory caches

[Images courtesy of NVIDIA and Intel]

Contributions

• Hardware architecture for point sets
  – Neighbor search module
  – Novel advanced caching mechanism
  – Reconfigurable processing module
  – Programmability using FPGA compiler
• FPGA prototype and measurements
• Small & Lean

Integration into multi-core CPU/GPU possible


Outline

• Related Work
• Spatial Searching and Caching
• Architecture and Prototype
• Results
• Conclusion


Related Work

Kd-Tree [Bentley 75]

kNN on GPUs [Ma and McCool 02]

Kd-Tree Hardware [Woop et al. 05] [Woop et al. 06]

Kd-Tree on GPUs [Popov et al. 07]


Related Work

Adaptive SPH Fluid Simulation [Adams et al. ’07]

Linear Moving Least Squares [Adamson and Alexa ’04]

Algebraic Moving Least Squares [Guennebaud and Gross ’07]


Linear Moving Least Squares

• Implicit surface defined by a set of points
  [Figure: query point x; sample points p_i with normals n_i]
• Iterative projections onto plane
  [Figure: successive iterates x, x’, x’’, x’’’]
• Surface defined by points projecting onto themselves
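
The projection loop described on these slides can be sketched as follows. This is a minimal illustration, not necessarily the exact formulation of the cited work: it assumes Gaussian weights with a support `h`, a plane through the weighted centroid, and a normal taken as the normalized weighted average of the sample normals. The helper name `mls_project` and all parameters are hypothetical, and the loop weights every sample, whereas a real pipeline would only use the neighbors returned by the spatial search.

```python
import math

def mls_project(x, points, normals, h=1.0, iterations=10):
    """Iteratively project x onto a locally fitted plane (hypothetical helper).

    Assumptions: Gaussian weights with support h; the plane passes through
    the weighted centroid of the points and uses the normalized weighted
    average of their normals.
    """
    x = list(x)
    for _ in range(iterations):
        weights = [math.exp(-sum((xi - pi) ** 2 for xi, pi in zip(x, p)) / h ** 2)
                   for p in points]
        total = sum(weights)
        centroid = [sum(w * p[d] for w, p in zip(weights, points)) / total
                    for d in range(3)]
        normal = [sum(w * n[d] for w, n in zip(weights, normals)) for d in range(3)]
        length = math.sqrt(sum(c * c for c in normal))
        normal = [c / length for c in normal]
        # Signed distance from x to the plane, then step back onto the plane.
        dist = sum((x[d] - centroid[d]) * normal[d] for d in range(3))
        x = [x[d] - dist * normal[d] for d in range(3)]
    return x

# Four samples of the plane z = 0, all with normal (0, 0, 1).
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 1.0, 0.0)]
normals = [(0.0, 0.0, 1.0)] * 4
projected = mls_project((0.3, 0.2, 0.5), points, normals)
```

A point that already lies on the fitted plane is left unchanged by the loop, which matches the last bullet: the surface is the set of points that project onto themselves.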


Spatial Search

• Spatial search: kNN and NN
  – Common in most point operations
  – Based on kd-tree

• Example NN:
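
The figure for this example did not survive. As a stand-in, here is a small sketch of a fixed-radius query on a kd-tree, assuming that "NN" on the slide denotes a range query with a given radius (the next slide's "start with infinite radius" suggests so). The nested-dict tree layout and the names `build_kdtree`, `radius_search` and `sq_dist` are illustrative assumptions, not the hardware's data layout.

```python
def build_kdtree(points, depth=0, leaf_size=2):
    """Build a kd-tree as nested dicts (illustrative layout)."""
    if len(points) <= leaf_size:
        return {"leaf": list(points)}
    axis = depth % 3
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {"axis": axis, "split": pts[mid][axis],
            "left": build_kdtree(pts[:mid], depth + 1, leaf_size),
            "right": build_kdtree(pts[mid:], depth + 1, leaf_size)}

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def radius_search(node, q, radius, out=None):
    """Collect all points within `radius` of q."""
    out = [] if out is None else out
    if "leaf" in node:
        out.extend(p for p in node["leaf"] if sq_dist(p, q) <= radius ** 2)
        return out
    delta = q[node["axis"]] - node["split"]
    near, far = ((node["left"], node["right"]) if delta < 0
                 else (node["right"], node["left"]))
    radius_search(near, q, radius, out)
    # The far side can only hold neighbors if the split plane cuts the query ball.
    if abs(delta) <= radius:
        radius_search(far, q, radius, out)
    return out

sample_points = [(0.5 * x, 0.25 * y, 0.3 * ((x * y) % 3))
                 for x in range(5) for y in range(5)]
tree = build_kdtree(sample_points)
query = (1.1, 0.6, 0.2)
neighbors = radius_search(tree, query, 0.6)
```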


Spatial Search

• kNN search similar to NN search:
  – Start with infinite radius
  – Sort leaf points into priority queue
  – Shrink radius with every point sorted
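
The three steps above can be sketched in software with a bounded max-heap: the search radius is infinite until k candidates are queued, and afterwards it equals the distance of the current k-th candidate. This is a minimal sketch under assumed names (`build_kdtree`, `knn_search`) and an assumed nested-dict tree layout; it uses Python's `heapq` rather than anything resembling the hardware queue.

```python
import heapq

def build_kdtree(points, depth=0, leaf_size=2):
    """Build a kd-tree as nested dicts (illustrative layout)."""
    if len(points) <= leaf_size:
        return {"leaf": list(points)}
    axis = depth % 3
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {"axis": axis, "split": pts[mid][axis],
            "left": build_kdtree(pts[:mid], depth + 1, leaf_size),
            "right": build_kdtree(pts[mid:], depth + 1, leaf_size)}

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def knn_search(node, q, k, heap=None):
    """Return a heap of (-squared distance, point); its root is the k-th candidate."""
    heap = [] if heap is None else heap
    if "leaf" in node:
        for p in node["leaf"]:
            d2 = sq_dist(p, q)
            if len(heap) < k:
                heapq.heappush(heap, (-d2, p))
            elif d2 < -heap[0][0]:
                # Closer than the current k-th candidate: replace it (radius shrinks).
                heapq.heapreplace(heap, (-d2, p))
        return heap
    delta = q[node["axis"]] - node["split"]
    near, far = ((node["left"], node["right"]) if delta < 0
                 else (node["right"], node["left"]))
    knn_search(near, q, k, heap)
    # Squared search radius: infinite until k points are queued.
    r2 = float("inf") if len(heap) < k else -heap[0][0]
    if delta * delta <= r2:
        knn_search(far, q, k, heap)
    return heap

sample_points = [(0.5 * x, 0.25 * y, 0.3 * ((x * y) % 3))
                 for x in range(5) for y in range(5)]
tree = build_kdtree(sample_points)
query = (1.1, 0.6, 0.2)
result = knn_search(tree, query, 4)
```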


Coherent Neighbor Cache (NN)

• Find neighbors in slightly bigger radius
• Re-use result for spatially close query

Re-use if: [condition shown as a formula on the slide; not preserved in this transcript]
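
One way to read the two bullets: if a query q' was answered with radius r + m, then any later query q with |q − q'| ≤ m has its radius-r ball inside the cached ball, so the cached list already contains every true neighbor of q. The sketch below works under that reading. The class `RangeCache`, the name `margin`, and the brute-force search standing in for the kd-tree are assumptions, and the condition used here is only a sufficient one derived from the triangle inequality, not necessarily the formula on the slide.

```python
import math

class RangeCache:
    """One-entry neighbor cache for fixed-radius queries (illustrative sketch)."""

    def __init__(self, points, radius, margin):
        self.points, self.radius, self.margin = points, radius, margin
        self.center, self.cached = None, []
        self.hits = 0

    def query(self, q):
        if self.center is not None and math.dist(q, self.center) <= self.margin:
            # Hit: ball(q, radius) lies inside ball(center, radius + margin).
            self.hits += 1
        else:
            # Miss: search the enlarged radius (brute force stands in for the kd-tree).
            big = self.radius + self.margin
            self.cached = [p for p in self.points if math.dist(p, q) <= big]
            self.center = q
        # Filter the cached superset down to the exact radius around q.
        return [p for p in self.cached if math.dist(p, q) <= self.radius]

cloud = [(0.13 * i % 1.0, 0.29 * i % 1.0, 0.41 * i % 1.0) for i in range(60)]
cache = RangeCache(cloud, radius=0.3, margin=0.05)
queries = [(0.5 + 0.01 * j, 0.5, 0.5) for j in range(10)]
answers = [cache.query(q) for q in queries]
```

A larger `margin` raises the hit rate but makes every miss more expensive, since the enlarged search returns more candidates to filter.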


Coherent Neighbor Cache (kNN, exact)

• Find (k+1) neighbors
• Re-use result for spatially close query

Re-use if: [condition shown as a formula on the slide; not preserved in this transcript]
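
A sufficient condition for exact re-use follows from the triangle inequality. Let r_k and r_{k+1} be the distances of the k-th and (k+1)-th cached neighbors from the cached query q', and let e = |q − q'|. Every cached neighbor is within r_k + e of q, and every other point is at least r_{k+1} − e away, so the cached k-set is still the k-nearest set of q whenever 2e ≤ r_{k+1} − r_k. The sketch below uses this test; whether it matches the slide's formula exactly is an assumption, as are the names `KnnCache` and `knn_brute`.

```python
import math

def knn_brute(points, q, k):
    """Reference search; stands in for the kd-tree module."""
    return sorted(points, key=lambda p: math.dist(p, q))[:k]

class KnnCache:
    """One-entry cache of the k + 1 nearest neighbors of the last missed query."""

    def __init__(self, points, k):
        self.points, self.k = points, k
        self.center = None
        self.neighbors = []  # k + 1 nearest points of self.center, ascending
        self.hits = 0

    def query(self, q):
        if self.center is not None:
            e = math.dist(q, self.center)
            r_k = math.dist(self.neighbors[self.k - 1], self.center)
            r_k1 = math.dist(self.neighbors[self.k], self.center)
            if 2 * e <= r_k1 - r_k:
                self.hits += 1
                # Same set of k points; re-sort by distance to the new query.
                return sorted(self.neighbors[:self.k], key=lambda p: math.dist(p, q))
        self.center = q
        self.neighbors = knn_brute(self.points, q, self.k + 1)
        return self.neighbors[:self.k]

line = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0),
        (6.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
cache = KnnCache(line, k=2)
first = cache.query((0.0, 0.0, 0.0))   # miss: r_k = 1, r_{k+1} = 3
second = cache.query((0.5, 0.0, 0.0))  # e = 0.5, 2e = 1 <= 2: hit
third = cache.query((1.8, 0.0, 0.0))   # e = 1.8, 2e = 3.6 > 2: miss
```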


Coherent Neighbor Cache (kNN, approximation)

• Approximation error
  – Enlarge radius

Re-use if: [condition shown as a formula on the slide; not preserved in this transcript]
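
The slide's approximate criterion is not recoverable, so the following only illustrates the general trade-off with a hypothetical rule: re-use the cached k-set while the new query stays within `alpha * r_k` of the cached query. With e = |q − q'|, each returned neighbor is at most r_k + e from q while the true k-th distance is at least r_k − e, so the returned distances exceed the exact ones by at most 2e. The function name `approx_knn_stream` and the parameter `alpha` are assumptions.

```python
import math

def approx_knn_stream(points, queries, k, alpha):
    """Answer queries in order, re-using the last exact result while the query
    stays within alpha * r_k of the cached query (hypothetical tolerance rule)."""
    center, cached, r_k, hits, results = None, [], 0.0, 0, []
    for q in queries:
        if center is not None and math.dist(q, center) <= alpha * r_k:
            hits += 1  # accept the cached set; error in distance is at most 2e
        else:
            # Miss: exact search (brute force stands in for the kd-tree).
            cached = sorted(points, key=lambda p: math.dist(p, q))[:k]
            center, r_k = q, math.dist(cached[-1], q)
        results.append(list(cached))
    return results, hits

line = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0),
        (6.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
stream = [(0.0, 0.0, 0.0), (0.4, 0.0, 0.0), (2.5, 0.0, 0.0)]
results, hits = approx_knn_stream(line, stream, k=2, alpha=0.5)
```

Setting `alpha` to zero disables re-use for every query that differs from the cached one; larger values trade accuracy for more cache hits.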


The Architecture

[Figure: architecture diagram; surviving label: Host]

Coherent Neighbor Cache

• Eight cached neighborhoods
• Problem: parallel queries in kd-tree module
  Interleave spatially similar queries

Kd-Tree Traversal


• Kd-tree structure on chip
• 16 threads
• Pipelining and multi-threading

[Figure: traversal pipeline; surviving labels: Node, Recurse]

Stacks

• 16 stacks
• Parallel read/write
• Bounded in depth
• 6 bytes per thread per recursion
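
A bounded, explicit stack is the usual replacement for recursion in a traversal like this. The sketch below runs a fixed-radius kd-tree query with a fixed-capacity stack of deferred subtrees and also works out the total stack storage implied by the slide's figures. `MAX_DEPTH = 32` is a hypothetical bound chosen for the example (the slide does not give one), and the tree layout and function names are assumptions.

```python
MAX_DEPTH = 32       # hypothetical bound on traversal depth
THREADS = 16         # from the slide: 16 stacks
BYTES_PER_ENTRY = 6  # from the slide: 6 bytes per thread per recursion
STACK_BYTES = THREADS * BYTES_PER_ENTRY * MAX_DEPTH  # total storage at this depth

def build_kdtree(points, depth=0, leaf_size=2):
    """Build a kd-tree as nested dicts (illustrative layout)."""
    if len(points) <= leaf_size:
        return {"leaf": list(points)}
    axis = depth % 3
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return {"axis": axis, "split": pts[mid][axis],
            "left": build_kdtree(pts[:mid], depth + 1, leaf_size),
            "right": build_kdtree(pts[mid:], depth + 1, leaf_size)}

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def radius_search_iterative(root, q, radius):
    """Fixed-radius query without recursion, using a fixed-capacity stack."""
    stack = [None] * MAX_DEPTH  # deferred far subtrees
    top, out, node = 0, [], root
    while node is not None:
        if "leaf" in node:
            out.extend(p for p in node["leaf"] if sq_dist(p, q) <= radius ** 2)
            node = None
        else:
            delta = q[node["axis"]] - node["split"]
            near, far = ((node["left"], node["right"]) if delta < 0
                         else (node["right"], node["left"]))
            if abs(delta) <= radius:
                if top == MAX_DEPTH:
                    raise OverflowError("traversal stack exceeded its bound")
                stack[top] = far  # visit the far side later
                top += 1
            node = near
        if node is None and top > 0:
            top -= 1
            node = stack[top]
    return out

sample_points = [(0.5 * x, 0.25 * y, 0.3 * ((x * y) % 3))
                 for x in range(5) for y in range(5)]
tree = build_kdtree(sample_points)
query = (1.1, 0.6, 0.2)
neighbors = radius_search_iterative(tree, query, 0.6)
```

At most one subtree is deferred per tree level along the current path, so the stack never needs more entries than the tree is deep.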


Leaf

• 16 parallel priority queues (1-cycle ops)
• Queues store pointers and distances
• Bandwidth bottleneck
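
The slide does not say how the queues are built. One common hardware pattern for single-cycle insertion is a sorted register array in which the new entry is compared against every slot at once and the larger entries shift down by one. The sketch below models that behavior sequentially; the class `BoundedSortedQueue` and the use of an integer index as the "pointer" are assumptions made for illustration.

```python
class BoundedSortedQueue:
    """Keeps the k smallest (distance, pointer) pairs in ascending order."""

    def __init__(self, k):
        self.k = k
        self.slots = []

    def insert(self, distance, pointer):
        # Hardware could compare against all slots in parallel and shift in
        # one step; this loop finds the same insertion position sequentially.
        i = len(self.slots)
        while i > 0 and self.slots[i - 1][0] > distance:
            i -= 1
        self.slots.insert(i, (distance, pointer))
        del self.slots[self.k:]  # the entry pushed past slot k is dropped

    def radius(self):
        """Current search radius: infinite until the queue holds k entries."""
        return self.slots[-1][0] if len(self.slots) == self.k else float("inf")

queue = BoundedSortedQueue(k=3)
radii = []
for pointer, distance in enumerate([5.0, 1.0, 4.0, 2.0, 3.0]):
    queue.insert(distance, pointer)
    radii.append(queue.radius())
```

Storing only a pointer and a distance per entry keeps each slot small; the point data itself is fetched separately.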


Processing Module

• Multithreaded quad-port bank of 16 registers
• 128 threads
• Programmability using FPGA technology


Further Data

• Implemented on two FPGAs
  – 64 bit DDR DRAM
  – Interconnection: no overhead
• Resource usage (registers and LUTs)
  – Virtex 2 Pro 100 (kNN): 26% registers, 38% LUTs
  – Virtex 2 Pro 70 (MLS): 47% registers, 52% LUTs
• Clock frequency: 75 MHz


Applications

• Tested on various applications

• PCI interface of prototype slow

[Figures: application examples, [Weyrich et al. 04] and [Adams et al. 07]]

Results kNN

[Figure: number of queries vs. number of neighbors; clock frequencies shown: 75 MHz, 1200 MHz, 2200 MHz]

Annotated factors, in slide order:
  CUDA: x4, CPU: x1.5, FPGA: x1
  CUDA: x2.4, CPU: x1.4, FPGA: x1, CUDA w/o sort: x4.0
  CUDA: x1.6, CPU: x1.1, FPGA: x1, CUDA w/o sort: x3.1
  ASIC estimate, 500 MHz: x6.6


• Small hardware footprint
• FPGA slightly slower
• Realistic clock frequency

Prototype faster than CPU/GPU
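
The x6.6 ASIC figure is close to what pure clock scaling would give. Assuming throughput grows linearly with clock frequency (an assumption; the slides do not state how the estimate was derived), moving from the 75 MHz prototype to a 500 MHz ASIC yields:

```latex
\[
  \frac{f_{\text{ASIC}}}{f_{\text{FPGA}}}
  = \frac{500\ \text{MHz}}{75\ \text{MHz}}
  \approx 6.67
\]
```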


Results MLS

[Figure: number of queries vs. number of neighbors; clock frequencies shown: 75 MHz, 1200 MHz, 2200 MHz]

Annotated factors: FPGA: x1, MLS CPU: x0.4, MLS CUDA: x3.8

FPGA faster than CPU

kNN bottleneck
  – FPGA
  – GPU


Coherent Neighbor Cache

[Figure: number of queries vs. level of coherence; curves: CPU (approximate, parameter = 0.1), FPGA (exact), FPGA (approximate, parameter = 0.1)]


Results Approximation Error (MLS projection)

[Figure: MLS error vs. approximation, with the no-approximation case as reference]

[Figure: cache hits vs. approximation]


Approximation Error (visual)


Coherent Neighbor Cache:

• Not optimal for exact queries
• Approximate queries
  – Can be tolerated in most cases
  – Greatly increases performance
  – Even for small approximations


Conclusion

• Novel hardware architecture for
  – Nearest-neighbor searches
  – Generic meshless processing operators
• Cache exploiting spatial coherence
• Good performance considering resources
• Possible GPU integration


Future Work

• Programmable data structure
  – Support different data structures
  – Programmability in data structure
  – Construction on-chip

• ‘Real’ programmability in point processing module
