Ray Tracing in CUDA Andrei Monteiro Marcelo Gattass Assignment 3 June 2010

Page 1: Ray Tracing in CUDA

Ray Tracing in CUDA

Andrei Monteiro
Marcelo Gattass
Assignment 3
June 2010

Page 2: Ray Tracing in CUDA

Topics

Motivation
Related Work
Grid Construction
Ray Tracing in CUDA
Results
Conclusion
References

Page 3: Ray Tracing in CUDA

Motivation

Ray tracing is a technique for generating an image by casting a ray through each pixel and computing its intersections with the scene objects.

Simulates several effects naturally, such as reflection and refraction, producing a very high degree of visual realism.

Computationally expensive.

Can use different acceleration structures (kd-trees, uniform grids, BVHs).

Why CUDA?
Designed for general-purpose computing.
Grid construction is faster than for other structures (e.g. kd-trees, BVHs).
Provides natural compactness, avoiding the memory waste of the stencil routing algorithm in GLSL.
Use of shared memory speeds up construction.
Fast data transfers.
Atomic operations.

Page 4: Ray Tracing in CUDA

Motivation

Page 5: Ray Tracing in CUDA

Related Work

Uniform Grid
A Parallel Algorithm for Construction of Uniform Grids, Kalojanov, J.
GPU-Accelerated Uniform Grid Construction for Ray Tracing Dynamic Scenes, Ivson, P., Duarte, L., Celes, W.

Ray Tracing
Understanding the Efficiency of Ray Traversal on GPUs, Aila, T. NVIDIA.
Ray Tracing on Programmable Graphics Hardware, Purcell, T.
Ray Tracing Animated Scenes using Coherent Grid Traversal, Wald et al.

Page 6: Ray Tracing in CUDA

Grid Construction

Uniform Grid
Speeds up the simulation by avoiding an intersection test against every scene object.
Supports dynamic scenes.
Each voxel contains a list of primitives.
The ray traverses the grid.

Grid Resolution

Page 7: Ray Tracing in CUDA

Grid Construction

Algorithm in CUDA:

1. Insert triangles in voxels

2. Calculate grid hash table

3. Sort the pairs

4. Write cell start and end

5. Reorder particles

Page 8: Ray Tracing in CUDA

Grid Construction

Insert triangles in voxels
Bounding box (corners)
Check triangle plane intersection
Avoids storing more than one reference to the same triangle inside a voxel.

Contained in more than one voxel

Contained in same voxel 4 times
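The bounding-box step above can be sketched on the CPU as follows. This is a minimal illustration, not the presentation's kernel: `Vec3`, `CellRange`, `toCell`, `triangleCellRange` and the grid parameters (`gridMin`, `cellSize`, `res`) are hypothetical names, and cubic voxels are assumed. Iterating over the returned range visits each overlapped voxel exactly once, which is one way to avoid duplicate references; the triangle-plane check from the slide would then reject voxels that the box covers but the triangle does not touch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
struct CellRange { int lo[3]; int hi[3]; };  // inclusive voxel range on each axis

// Voxel index of coordinate p along one axis, clamped to the grid.
inline int toCell(float p, float gridMin, float cellSize, int res) {
    const int c = static_cast<int>(std::floor((p - gridMin) / cellSize));
    return std::min(std::max(c, 0), res - 1);
}

// Range of voxels overlapped by the axis-aligned bounding box of triangle (a, b, c).
inline CellRange triangleCellRange(Vec3 a, Vec3 b, Vec3 c,
                                   Vec3 gridMin, float cellSize, const int res[3]) {
    assert(cellSize > 0.0f);
    const float lo[3] = { std::min({a.x, b.x, c.x}), std::min({a.y, b.y, c.y}),
                          std::min({a.z, b.z, c.z}) };
    const float hi[3] = { std::max({a.x, b.x, c.x}), std::max({a.y, b.y, c.y}),
                          std::max({a.z, b.z, c.z}) };
    const float g[3] = { gridMin.x, gridMin.y, gridMin.z };
    CellRange r{};
    for (int k = 0; k < 3; ++k) {
        r.lo[k] = toCell(lo[k], g[k], cellSize, res[k]);
        r.hi[k] = toCell(hi[k], g[k], cellSize, res[k]);
    }
    return r;
}
```

For a triangle with corners (1, 1, 0.5), (4, 2, 0.5), (2, 5, 0.5) in a 3x3x1 grid with cell size 3, the range covers voxels 0 to 1 in x and y, so the triangle is referenced from four voxels.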

Page 9: Ray Tracing in CUDA

Grid Construction

2. Grid Hash Table

Pairs of cell index and particle index. E.g. cell dimension = 3, grid resolution = 3x3.

[Figure: 3x3 grid spanning 0 to 9 on each axis, cells numbered 0 to 8 row by row, with particles 0 to 8 placed inside]

PARTICLES (x y z):

Particle Index:  0      1      2      3      4      5      6      7      8
Position:        4 4 0  1 2 0  8 4 0  5 7 0  0 3 0  8 8 0  3 1 0  8 1 0  7 1 0

HASH:

Cell Index:      4 0 5 7 3 8 1 2 2
Particle Index:  0 1 2 3 4 5 6 7 8
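The hash table of this slide can be reproduced with a short CPU sketch. It assumes a 2D grid whose cells are numbered row by row and non-negative coordinates, as in the example; `Particle`, `HashEntry`, `cellIndex` and `buildHash` are illustrative names, and the loop body stands in for what one CUDA thread per particle would compute.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Particle { float x, y, z; };
struct HashEntry { unsigned cell; unsigned particle; };

// Cell index in a 2D grid numbered row by row (x varies fastest).
inline unsigned cellIndex(const Particle& p, float cellSize, unsigned resX) {
    const unsigned cx = static_cast<unsigned>(p.x / cellSize);
    const unsigned cy = static_cast<unsigned>(p.y / cellSize);
    return cy * resX + cx;
}

// One (cell index, particle index) pair per particle.
inline std::vector<HashEntry> buildHash(const std::vector<Particle>& ps,
                                        float cellSize, unsigned resX) {
    assert(cellSize > 0.0f);
    std::vector<HashEntry> hash(ps.size());
    for (std::size_t i = 0; i < ps.size(); ++i)  // one GPU thread per iteration
        hash[i] = { cellIndex(ps[i], cellSize, resX), static_cast<unsigned>(i) };
    return hash;
}
```

With the nine particle positions listed above, cell size 3 and a row width of 3, the cell indices come out as 4 0 5 7 3 8 1 2 2, matching the HASH row.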

Page 10: Ray Tracing in CUDA

Grid Construction

Sorting the Pairs

1. To calculate each cell's start and end, the particles must be ordered by the index of the cell they belong to.

2. In practice, the application sorts the hash table from the previous step by its keys (the cell indices).

3. Uses the radix sort from the CUDA SDK.

Page 11: Ray Tracing in CUDA

Grid Construction

3. Sorting the Pairs
Sort the hash table by key values (cell indices).

HASH:

Cell Index:      4 0 5 7 3 8 1 2 2
Particle Index:  0 1 2 3 4 5 6 7 8

Sorted HASH:

Cell Index:      0 1 2 2 3 4 5 7 8
Particle Index:  1 6 7 8 4 0 2 3 5
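The sort can be illustrated on the CPU with `std::stable_sort` standing in for the CUDA SDK radix sort. This is a stand-in only: the GPU sort is a different algorithm, but a stable key sort yields the same ordering for this example. `HashEntry` and `sortByCell` are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct HashEntry { unsigned cell; unsigned particle; };

// Sort (cell, particle) pairs by cell index, keeping equal keys in input order.
inline void sortByCell(std::vector<HashEntry>& hash) {
    const auto byCell = [](const HashEntry& a, const HashEntry& b) {
        return a.cell < b.cell;
    };
    std::stable_sort(hash.begin(), hash.end(), byCell);
    assert(std::is_sorted(hash.begin(), hash.end(), byCell));
}
```

Applied to the HASH table above, the particle indices end up in the order 1 6 7 8 4 0 2 3 5, as in the Sorted HASH table.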

Page 12: Ray Tracing in CUDA

Grid Construction

4-5. Finding Cell Start/End and Reordering Particles

Sorted HASH:

Index:       0 1 2 3 4 5 6 7 8 ...
Cell Index:  0 0 1 1 1 1 2 2 2 ...

Each thread compares Cell Index[thread_id] with Cell Index[thread_id - 1]. For the current thread (thread_id = 2) the two values differ, so:

CellStart[Cell Index[thread_id]] = 2 (Cell 1: start = 2)

CellEnd[Cell Index[thread_id - 1]] = 2 (Cell 0: end = 2)

Cell Start/End:

Cell Index:  0    1    2 ...
Start/End:   0/2  2/6  6/...
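The neighbour comparison on this slide can be written as a sequential sketch in which each loop iteration plays the role of one thread. `CellRanges`, `findCellRanges` and the `kEmpty` marker for cells without entries are assumptions made for the illustration; the end index is exclusive, as in the 0/2, 2/6 example.

```cpp
#include <cassert>
#include <vector>

constexpr unsigned kEmpty = 0xffffffffu;  // marks cells that hold no entries

struct CellRanges { std::vector<unsigned> start, end; };

// sortedCells: cell index of every entry of the sorted hash table.
inline CellRanges findCellRanges(const std::vector<unsigned>& sortedCells,
                                 unsigned numCells) {
    CellRanges r{ std::vector<unsigned>(numCells, kEmpty),
                  std::vector<unsigned>(numCells, kEmpty) };
    const unsigned n = static_cast<unsigned>(sortedCells.size());
    for (unsigned i = 0; i < n; ++i) {  // iteration i corresponds to thread_id i
        const unsigned cell = sortedCells[i];
        assert(cell < numCells);
        if (i == 0 || cell != sortedCells[i - 1]) {
            r.start[cell] = i;                         // first entry of this cell
            if (i > 0) r.end[sortedCells[i - 1]] = i;  // previous cell ends here
        }
        if (i == n - 1) r.end[cell] = n;               // last entry closes its cell
    }
    return r;
}
```

For the cell indices 0 0 1 1 1 1 2 2 2 this gives start/end pairs 0/2, 2/6 and 6/9. Because every iteration writes to distinct array slots, the same logic maps to independent threads without atomics.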

Page 13: Ray Tracing in CUDA

Ray Tracing in CUDA

Can be easily parallelized.
Each thread is responsible for one pixel / ray intersection.

Problems that slow performance:
Internal loops: cause threads to diverge.
Random memory access: causes bank conflicts and non-coalesced reads.

Page 14: Ray Tracing in CUDA

Ray Tracing in CUDA

Kernels

1. Build grid (if scene changes)

2. Setup rays (if camera moved)

3. Launch Rays

4. Get Hits

5. Get Shadow Hits

6. Get Reflection Hits (repeat)

7. Shade

Page 15: Ray Tracing in CUDA

Ray Tracing in CUDA

Setup Rays
Calculate the ray equation for each pixel.
One thread per pixel.

$\mathbf{p}(t) = \mathbf{o} + t\,\mathbf{d}$
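As an illustration of the per-pixel ray equation, the sketch below builds the origin o and direction d for one pixel. The slides do not describe the camera model, so a pinhole camera with an orthonormal basis is assumed, and `Camera`, `Ray`, `setupRay` and `pointAt` are hypothetical names; each call corresponds to the work of one thread.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
inline Vec3 operator+(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
inline Vec3 operator*(float s, Vec3 a) { return { s * a.x, s * a.y, s * a.z }; }
inline Vec3 normalize(Vec3 v) {
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return (1.0f / len) * v;
}

struct Ray { Vec3 o, d; };

// Assumed pinhole camera: orthonormal basis plus the tangent of half the vertical FOV.
struct Camera { Vec3 eye, right, up, forward; float tanHalfFov; int width, height; };

// Ray through the centre of pixel (px, py); row 0 is the top of the image.
inline Ray setupRay(const Camera& cam, int px, int py) {
    assert(px >= 0 && px < cam.width && py >= 0 && py < cam.height);
    const float aspect = static_cast<float>(cam.width) / static_cast<float>(cam.height);
    const float u = (2.0f * (px + 0.5f) / cam.width - 1.0f) * cam.tanHalfFov * aspect;
    const float v = (1.0f - 2.0f * (py + 0.5f) / cam.height) * cam.tanHalfFov;
    return { cam.eye, normalize(cam.forward + u * cam.right + v * cam.up) };
}

// p(t) = o + t d
inline Vec3 pointAt(const Ray& r, float t) { return r.o + t * r.d; }
```

Since the rays depend only on the camera, they need to be recomputed only when the camera moves, which matches the "Setup rays (if camera moved)" kernel.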

Page 16: Ray Tracing in CUDA

Ray Tracing in CUDA

Launch Rays
Calculates the ray-grid intersection, if any.
Rays that do not intersect the grid are discarded for the next steps.
Returns the first intersected cell and the parameters for traversing the grid.

[Figure: ray p(t)]
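One common way to compute the ray-grid intersection is the slab test against the grid's bounding box. The slides do not say which test is used, so this is an assumption; `intersectGrid` and its parameters are illustrative names.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <utility>

struct Vec3 { float x, y, z; };

// Slab test of the ray o + t d (t >= 0) against the box [bmin, bmax].
// On a hit, tEnter and tExit bound the part of the ray inside the grid.
inline bool intersectGrid(Vec3 o, Vec3 d, Vec3 bmin, Vec3 bmax,
                          float& tEnter, float& tExit) {
    assert(bmin.x <= bmax.x && bmin.y <= bmax.y && bmin.z <= bmax.z);
    const float O[3] = { o.x, o.y, o.z }, D[3] = { d.x, d.y, d.z };
    const float lo[3] = { bmin.x, bmin.y, bmin.z }, hi[3] = { bmax.x, bmax.y, bmax.z };
    tEnter = 0.0f;
    tExit = std::numeric_limits<float>::infinity();
    for (int a = 0; a < 3; ++a) {
        if (D[a] == 0.0f) {  // ray parallel to this pair of planes
            if (O[a] < lo[a] || O[a] > hi[a]) return false;
            continue;
        }
        float t0 = (lo[a] - O[a]) / D[a];
        float t1 = (hi[a] - O[a]) / D[a];
        if (t0 > t1) std::swap(t0, t1);
        tEnter = std::max(tEnter, t0);
        tExit = std::min(tExit, t1);
        if (tEnter > tExit) return false;  // slabs do not overlap: ray misses
    }
    return true;
}
```

The first cell is then the voxel containing p(tEnter), and a `false` result marks the ray as discarded for the later kernels.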

Page 17: Ray Tracing in CUDA

Ray Tracing in CUDA

Get Hits
The most expensive step of the simulation.
Typical algorithm:

    while (not hit and ray inside grid) {
        traverse cell;
        if (!cell empty) {
            for each triangle in cell {
                get hit();
            }
        }
    }

Problem: causes too much thread divergence.
Solution: use the while-while algorithm, which causes less divergence.

Page 18: Ray Tracing in CUDA

Ray Tracing in CUDA

while-while algorithm

    while-while trace():
        while ray not terminated
            while node does not contain primitives
                traverse to the next node
            while node contains untested primitives
                perform a ray-primitive intersection test
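The control flow of the while-while loop can be shown with a deliberately simplified CPU sketch. The grid is replaced by a hypothetical list of the cells a ray passes through, each holding primitive ids, and `hitTest` is a placeholder for the ray-primitive test; the sketch stops at the first hit instead of searching for the closest one.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// cellsAlongRay[i] lists the primitive ids stored in the i-th cell the ray visits.
// Returns the id of the first primitive that hitTest accepts, or -1 if none does.
inline int traceWhileWhile(const std::vector<std::vector<int>>& cellsAlongRay,
                           const std::function<bool(int)>& hitTest) {
    assert(static_cast<bool>(hitTest));
    std::size_t cell = 0;
    int hit = -1;
    while (hit < 0 && cell < cellsAlongRay.size()) {       // ray not terminated
        while (cell < cellsAlongRay.size() && cellsAlongRay[cell].empty())
            ++cell;                                         // traverse to the next node
        if (cell == cellsAlongRay.size()) break;            // left the grid
        std::size_t k = 0;
        while (hit < 0 && k < cellsAlongRay[cell].size()) { // untested primitives
            if (hitTest(cellsAlongRay[cell][k])) hit = cellsAlongRay[cell][k];
            ++k;
        }
        ++cell;
    }
    return hit;
}
```

Separating the traversal loop from the intersection loop means that neighbouring threads tend to execute the same kind of work at the same time, which is the motivation the slide gives for the reduced divergence.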

Page 19: Ray Tracing in CUDA

Ray Tracing in CUDA

Get Hits
Traversal algorithm: 3D DDA (shown here for two axes)

    if (nextx < nexty) {
        nextx += deltax;
        X += 1;
    } else {
        nexty += deltay;
        Y += 1;
    }
    process_grid(X, Y);
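A fuller version of the DDA step, extended to three axes and to negative directions, might look like the sketch below. It assumes a grid whose minimum corner is at the origin, cubic cells, and a ray origin already inside the grid; `Cell` and `traverseGrid` are illustrative names, and collecting the visited cells replaces the `process_grid` call.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

struct Cell { int x, y, z; };

// Cells visited by the ray o + t d, in order, until it leaves the grid.
inline std::vector<Cell> traverseGrid(const float o[3], const float d[3],
                                      float cellSize, const int res[3]) {
    assert(cellSize > 0.0f);
    const float inf = std::numeric_limits<float>::infinity();
    int cell[3], step[3];
    float next[3], delta[3];
    for (int a = 0; a < 3; ++a) {
        cell[a] = static_cast<int>(std::floor(o[a] / cellSize));
        step[a] = (d[a] > 0.0f) ? 1 : -1;
        if (d[a] == 0.0f) { next[a] = inf; delta[a] = inf; continue; }
        // Ray parameter t at which the next cell boundary on this axis is crossed.
        const float boundary = (cell[a] + (d[a] > 0.0f ? 1 : 0)) * cellSize;
        next[a] = (boundary - o[a]) / d[a];
        delta[a] = cellSize / std::fabs(d[a]);  // t needed to cross one full cell
    }
    std::vector<Cell> visited;
    while (cell[0] >= 0 && cell[0] < res[0] && cell[1] >= 0 && cell[1] < res[1] &&
           cell[2] >= 0 && cell[2] < res[2]) {
        visited.push_back({ cell[0], cell[1], cell[2] });
        int a = 0;                            // axis with the nearest boundary
        if (next[1] < next[a]) a = 1;
        if (next[2] < next[a]) a = 2;
        if (next[a] == inf) break;            // zero direction: nothing to step
        cell[a] += step[a];
        next[a] += delta[a];
    }
    return visited;
}
```

For a ray starting at (0.5, 0.5, 0.5) with direction (1, 0.5, 0) in a 3x3x1 grid of unit cells, the visited cells are (0,0,0), (1,0,0), (1,1,0) and (2,1,0).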

Page 20: Ray Tracing in CUDA

Ray Tracing in CUDA

Get Hits
Triangle intersection

Möller's algorithm
Enables face culling.
Greatly increased performance.

Careful: triangles can be in more than one voxel, so it is necessary to check whether the intersection point lies in the current voxel.
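The two checks on this slide can be sketched as follows, assuming the Möller-Trumbore formulation with back-face culling (the slide only says "Möller"). `intersectTriangle` and `hitInsideVoxel` are illustrative names, and counter-clockwise triangles are treated as front-facing.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };
inline Vec3 operator-(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
inline float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 cross(Vec3 a, Vec3 b) {
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

// Ray-triangle test with back-face culling. On a hit, t is the ray parameter
// and (u, v) are the barycentric coordinates relative to v1 and v2.
inline bool intersectTriangle(Vec3 o, Vec3 d, Vec3 v0, Vec3 v1, Vec3 v2,
                              float& t, float& u, float& v) {
    const float eps = 1e-6f;
    const Vec3 e1 = v1 - v0, e2 = v2 - v0;
    const Vec3 p = cross(d, e2);
    const float det = dot(e1, p);
    if (det < eps) return false;          // back-facing or parallel: culled
    const Vec3 s = o - v0;
    u = dot(s, p);
    if (u < 0.0f || u > det) return false;
    const Vec3 q = cross(s, e1);
    v = dot(d, q);
    if (v < 0.0f || u + v > det) return false;
    const float inv = 1.0f / det;
    t = dot(e2, q) * inv;
    u *= inv;
    v *= inv;
    return t >= 0.0f;
}

// Accept a hit only if the hit point lies inside the voxel being traversed.
inline bool hitInsideVoxel(Vec3 o, Vec3 d, float t, Vec3 vmin, Vec3 vmax) {
    assert(vmin.x <= vmax.x && vmin.y <= vmax.y && vmin.z <= vmax.z);
    const Vec3 p = { o.x + t * d.x, o.y + t * d.y, o.z + t * d.z };
    return p.x >= vmin.x && p.x <= vmax.x && p.y >= vmin.y && p.y <= vmax.y &&
           p.z >= vmin.z && p.z <= vmax.z;
}
```

Without the voxel check, a triangle that spans several voxels could report a hit that actually lies in a later voxel, ahead of a closer triangle stored there.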

Page 21: Ray Tracing in CUDA

Ray Tracing in CUDA

Increase efficiency
The internal loops make threads diverge and thus lower performance.
To work around this problem, NVIDIA researcher T. Aila introduced a method called persistent threads in CUDA.
The idea is to keep threads busy while at least one of them is not done.

The performance gain depends on the GPU:
9800 GX2: 2.2x increase
GTX 480: 3.0x increase

Page 22: Ray Tracing in CUDA

Ray Tracing in CUDA

Persistent threads implementation code

Page 23: Ray Tracing in CUDA

Ray Tracing in CUDA

Shade
Linear interpolation using barycentric coordinates:
Normal
Texture

Texture
Uses a CUDA 3D texture to support a variable number of scene textures.

Phong Shading

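$$
\begin{pmatrix} I_r \\ I_g \\ I_b \end{pmatrix}
=
\begin{pmatrix} I_{ar} \\ I_{ag} \\ I_{ab} \end{pmatrix} \otimes
\begin{pmatrix} k_{dr} \\ k_{dg} \\ k_{db} \end{pmatrix}
+ \sum_{\text{lights}} f_s \left[
\begin{pmatrix} l_r \\ l_g \\ l_b \end{pmatrix} \otimes
\begin{pmatrix} k_{dr} \\ k_{dg} \\ k_{db} \end{pmatrix}
(\hat{n} \cdot \hat{L})
+
\begin{pmatrix} l_r \\ l_g \\ l_b \end{pmatrix} \otimes
\begin{pmatrix} k_{sr} \\ k_{sg} \\ k_{sb} \end{pmatrix}
(\hat{r} \cdot \hat{L})^{n}
\right]
+ k_r \begin{pmatrix} I_{rr} \\ I_{rg} \\ I_{rb} \end{pmatrix}
+ (1 - o) \begin{pmatrix} I_{tr} \\ I_{tg} \\ I_{tb} \end{pmatrix}
$$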
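The barycentric interpolation used in the shading step can be sketched for vertex normals as below. It assumes that (u, v) are the barycentric coordinates returned by the triangle test, with the remaining weight 1 - u - v belonging to the first vertex; `interpolateNormal` is an illustrative name. Texture coordinates would be interpolated with the same weights, without the final normalization.

```cpp
#include <cassert>
#include <cmath>

struct Vec3 { float x, y, z; };

// Interpolate the vertex normals n0, n1, n2 at barycentric coordinates (u, v)
// and renormalize, since a weighted sum of unit vectors is generally shorter.
inline Vec3 interpolateNormal(Vec3 n0, Vec3 n1, Vec3 n2, float u, float v) {
    assert(u >= 0.0f && v >= 0.0f && u + v <= 1.0f + 1e-5f);
    const float w = 1.0f - u - v;
    const Vec3 n = { w * n0.x + u * n1.x + v * n2.x,
                     w * n0.y + u * n1.y + v * n2.y,
                     w * n0.z + u * n1.z + v * n2.z };
    const float len = std::sqrt(n.x * n.x + n.y * n.y + n.z * n.z);
    assert(len > 0.0f);
    return { n.x / len, n.y / len, n.z / len };
}
```

The interpolated normal is what would be used as the surface normal in the Phong terms when shading the hit point.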

Page 24: Ray Tracing in CUDA

Results

Real-Time Ray Tracing

Performance depends on:
Grid resolution
Number of primitives
Camera inside/outside the grid
Shadow pass
Reflection passes (1 or more times)

Scenes with reflections and many primitives run at roughly 20 to 30 fps.

Page 25: Ray Tracing in CUDA

Results

Page 26: Ray Tracing in CUDA

Results

Page 27: Ray Tracing in CUDA

Results

Page 28: Ray Tracing in CUDA

Results

Page 29: Ray Tracing in CUDA

Results

Page 30: Ray Tracing in CUDA

Results

Page 31: Ray Tracing in CUDA

Results

Page 32: Ray Tracing in CUDA

Results

Page 33: Ray Tracing in CUDA

Conclusion

The user was able to replicate physical effects.
CUDA is slower than other languages (e.g. GLSL) unless the code is optimized to make full use of its optimization resources.

There are still several optimizations pending in this work:
Math
CUDA threads and kernels
Too much memory used

Page 34: Ray Tracing in CUDA

References

Kalojanov, J. A Parallel Algorithm for Construction of Uniform Grids. High Performance Graphics, 2009. Retrieved Apr 21, 2010.

Ivson, P., Duarte, L., Celes, W. GPU-Accelerated Uniform Grid Construction for Ray Tracing Dynamic Scenes.

Aila, T. Understanding the Efficiency of Ray Traversal on GPUs. NVIDIA Research. Retrieved May 23, 2010.

Purcell, T. Ray Tracing on Programmable Graphics Hardware. Stanford University. Retrieved May 28, 2010.

Wald et al. Ray Tracing Animated Scenes using Coherent Grid Traversal. SCI Institute, University of Utah. Retrieved May 25, 2010.

NVIDIA CUDA Programming Guide. V. 2.0, 2008. Retrieved Mar 29, 2010.