
Rendering on the GPU Tom Fili. Agenda Global Illumination using Radiosity Ray Tracing Global Illumination using Rasterization Photon Mapping Rendering


Rendering on the GPU

Tom Fili

Agenda

Global Illumination using Radiosity

Ray Tracing

Global Illumination using Rasterization

Photon Mapping

Rendering with CUDA

Global Illumination using Radiosity

Global Illumination using Progressive Refinement Radiosity by Greg Coombe and Mark Harris (GPU Gems 2: Chapter 39)

The radiosity energy is stored in texels, and fragment programs are used to do computation.

Global Illumination using Radiosity

Radiosity breaks the scene into many small elements and calculates how much energy is transferred between the elements.

The transfer between two elements (the form factor) is a function of their distance and relative orientation.

The visibility term V is 0 if the elements are occluded from each other, 1 if they are fully visible.

Global Illumination using Radiosity

The form factor equation only works if the elements are very small.

To increase speed we use larger areas and approximate them with oriented discs.
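As a rough illustration of the two slides above, the following Python sketch evaluates a form factor under the oriented-disc approximation, F = A cos(i) cos(j) / (pi d^2 + A), multiplied by the visibility term V. The function name, the argument layout, and the choice to fold the shooter's area A into the returned value are assumptions made for this example, not something the slides specify.

```python
import math

def form_factor_disc(recv_pos, recv_normal, shoot_pos, shoot_normal,
                     shoot_area, visible=1.0):
    """Oriented-disc form factor from a shooter element to a receiver point."""
    # Vector from the receiver to the shooter, and its squared length.
    r = [s - p for s, p in zip(shoot_pos, recv_pos)]
    dist2 = sum(c * c for c in r)
    inv_len = 1.0 / math.sqrt(dist2)
    r = [c * inv_len for c in r]
    # Cosines of the angles between each normal and the connecting line.
    cos_i = sum(n * c for n, c in zip(recv_normal, r))
    cos_j = -sum(n * c for n, c in zip(shoot_normal, r))
    if cos_i <= 0.0 or cos_j <= 0.0:
        return 0.0  # the elements face away from each other
    return shoot_area * cos_i * cos_j / (math.pi * dist2 + shoot_area) * visible
```

Adding the area to the denominator keeps the value bounded when the two elements are very close, which is where the point-to-point form breaks down.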

Global Illumination using Radiosity

The classic radiosity algorithm solves a large system of linear equations composed of the pairwise form factors.

These equations describe the radiosity of an element as a function of the energy from every other element, weighted by their form factors and the element's reflectance, r.

The classical linear system requires O(N^2) storage, which is prohibitive for large scenes.
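To make the storage cost concrete, here is a minimal sketch of the classical formulation, assuming the system is written as B_i = E_i + r_i * sum_j F_ij * B_j and solved by Jacobi iteration. The solver choice and all names are assumptions for illustration only. The dense `form_factors` matrix is the part that needs O(N^2) storage.

```python
def solve_radiosity(emission, reflectance, form_factors, iterations=50):
    """Jacobi iteration for B_i = E_i + r_i * sum_j F[i][j] * B_j."""
    n = len(emission)
    radiosity = list(emission)
    for _ in range(iterations):
        # Every update reads a full row of the N x N form factor matrix.
        radiosity = [
            emission[i] + reflectance[i] * sum(
                form_factors[i][j] * radiosity[j] for j in range(n))
            for i in range(n)
        ]
    return radiosity
```

For N elements the matrix holds N * N entries, so a 10,000 element scene would already need 100 million form factors.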

Progressive Refinement

Instead we use progressive refinement.

Each element in the scene maintains two energy values: an accumulated energy value and residual (or "unshot") energy.

All energy values are set to 0 except the residual energy of light sources.

Progressive Refinement

To implement this on the GPU we use 2 textures (accumulated and residual) for each element.

We render from the POV of the shooter.

Then we iterate over receiving elements and test for visibility.

We then draw each visible element into the frame buffer and use a fragment program to compute the form factor.

Progressive Refinement

    initialize shooter residual E
    while not converged
    {
      render scene from POV of shooter
      for each receiving element
      {
        if element is visible
        {
          compute form factor FF
          ΔE = r * FF * E
          add ΔE to residual texture
          add ΔE to radiosity texture
        }
      }
      shooter's residual E = 0
      compute next shooter
    }
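The loop above can be sketched on the CPU as follows. This is only an illustration under stated assumptions: `form_factor(i, j)` is a hypothetical callback that already includes visibility, the next shooter is taken to be the element with the largest residual, and the stopping tolerance is arbitrary. Following the initialization described earlier, the accumulated values start at 0 and only received energy is added to them.

```python
def progressive_refinement(emission, reflectance, form_factor,
                           max_steps=100, tolerance=1e-6):
    """Shoot residual energy until the largest residual drops below tolerance.

    form_factor(i, j) is a hypothetical callback: the fraction of energy
    shot by element i that reaches element j (0 if j is hidden from i).
    """
    n = len(emission)
    radiosity = [0.0] * n        # accumulated energy
    residual = list(emission)    # unshot energy; nonzero only for lights
    for _ in range(max_steps):
        shooter = max(range(n), key=lambda i: residual[i])
        energy = residual[shooter]
        if energy < tolerance:
            break                # converged: nothing significant left to shoot
        for j in range(n):
            if j == shooter:
                continue
            delta = reflectance[j] * form_factor(shooter, j) * energy
            residual[j] += delta
            radiosity[j] += delta
        residual[shooter] = 0.0
    return radiosity
```

Only two values per element are stored, so memory grows linearly with the number of elements instead of quadratically.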

Visibility

The visibility term of the form factor equation is usually computed using a hemicube.
  • The scene is rendered onto the five faces of a cube map, which is then used to test visibility.

Instead, we can avoid rendering the scene five times by using a vertex program to project the vertices onto a hemisphere.
  • The hemispherical projection, also known as a stereographic projection, allows us to compute the visibility in only one rendering pass.

The objects must be tessellated at a higher level to conform to the hemisphere.

Visibility

Projection Vertex Program

    void hemiwarp(float4 Position: POSITION,     // World Pos
                  uniform half4x4 ModelView,     // Modelview Matrix
                  uniform half2 NearFar,         // Near/Far planes
                  out float4 ProjPos: POSITION)  // Projected Pos
    {
      // transform the geometry to camera space
      half4 mpos = mul(ModelView, Position);

      // project to a point on a unit hemisphere
      half3 hemi_pt = normalize( mpos.xyz );

      // Compute (f-n), but let the hardware divide z by this
      // in the w component (so premultiply x and y)
      half f_minus_n = NearFar.y - NearFar.x;
      ProjPos.xy = hemi_pt.xy * f_minus_n;

      // compute depth proj. independently,
      // using OpenGL orthographic
      ProjPos.z = (-2.0 * mpos.z - NearFar.y - NearFar.x);
      ProjPos.w = f_minus_n;
    }

Visibility Test Fragment Program

    bool Visible(half3 ProjPos,          // camera-space pos
                 uniform fixed3 RecvID,  // ID of receiver
                 sampler2D HemiItemBuffer )
    {
      // Project the texel element onto the hemisphere
      half3 proj = normalize(ProjPos);

      // Vector is in [-1,1], scale to [0..1] for texture lookup
      proj.xy = proj.xy * 0.5 + 0.5;

      // Look up projected point in hemisphere item buffer
      fixed3 xtex = tex2D(HemiItemBuffer, proj.xy);

      // Compare the value in item buffer to the
      // ID of the fragment
      return all(xtex == RecvID);
    }
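For readers without a Cg toolchain, the same idea can be tried on the CPU. The sketch below is an assumption-laden stand-in, not a port of the shaders: the item buffer is modeled as a square list of lists holding integer element IDs, and nearest-texel lookup replaces `tex2D`.

```python
import math

def hemi_project(point):
    """Project a camera-space point onto the unit hemisphere, then drop z."""
    x, y, z = point
    length = math.sqrt(x * x + y * y + z * z)
    return (x / length, y / length)

def to_texcoord(xy):
    """Map hemisphere coordinates in [-1, 1] to texture coordinates in [0, 1]."""
    return (xy[0] * 0.5 + 0.5, xy[1] * 0.5 + 0.5)

def visible(point, recv_id, item_buffer):
    """Return True if the item buffer holds recv_id where the point projects."""
    u, v = to_texcoord(hemi_project(point))
    size = len(item_buffer)
    col = min(int(u * size), size - 1)   # nearest-texel lookup, clamped
    row = min(int(v * size), size - 1)
    return item_buffer[row][col] == recv_id
```

Because every point is normalized before the lookup, a single buffer covers the whole hemisphere in front of the shooter, which is why one rendering pass is enough.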

Form Factor Computation

    half3 FormFactorEnergy(
        half3 RecvPos,              // world-space position of this element
        uniform half3 ShootPos,     // world-space position of shooter
        half3 RecvNormal,           // world-space normal of this element
        uniform half3 ShootNormal,  // world-space normal of shooter
        uniform half3 ShootEnergy,  // energy from shooter residual texture
        uniform half ShootDArea,    // the delta area of the shooter
        uniform fixed3 RecvColor )  // the reflectivity of this element
    {
      // a normalized vector from receiver to shooter
      half3 r = ShootPos - RecvPos;
      half distance2 = dot(r, r);
      r = normalize(r);

      // the angles of the receiver and the shooter from r
      half cosi = dot(RecvNormal, r);
      half cosj = -dot(ShootNormal, r);

      // compute the disc approximation form factor
      const half pi = 3.1415926535;
      half Fij = max(cosi * cosj, 0) / (pi * distance2 + ShootDArea);
      Fij *= Visible(); // returns visibility as 0 or 1 (arguments omitted on the slide)

      // Modulate shooter's energy by the receiver's reflectivity
      // and the area of the shooter.
      half3 delta = ShootEnergy * RecvColor * ShootDArea * Fij;

      return delta;
    }

Adaptive Subdivision

We create smaller elements along areas that need more detail (e.g. shadow edges).

We reuse the same algorithms, except that we compute visibility on the leaf nodes.

We evaluate the gradient of the radiosity at each fragment, and if it is above a certain threshold we discard the fragment.

If we discard enough fragments then we subdivide the current node.
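The subdivision test can be sketched as below. This is a guess at one reasonable CPU formulation, not the method from the chapter: the gradient is estimated with forward differences, and both thresholds and all names are hypothetical.

```python
def should_subdivide(texels, gradient_threshold, max_discarded_fraction):
    """Decide whether a node's radiosity texture needs finer elements."""
    rows, cols = len(texels), len(texels[0])
    discarded = 0
    for y in range(rows):
        for x in range(cols):
            # Forward differences, clamped at the texture border.
            dx = texels[y][min(x + 1, cols - 1)] - texels[y][x]
            dy = texels[min(y + 1, rows - 1)][x] - texels[y][x]
            if (dx * dx + dy * dy) ** 0.5 > gradient_threshold:
                discarded += 1   # this fragment would be discarded
    return discarded / (rows * cols) > max_discarded_fraction
```

On the GPU the count of discarded fragments would have to be read back in some way; the explicit counter here simply stands in for that step.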

Performance

Can render a 10,000 element version of the Cornell Box at 2 fps.

To get this we need to make some optimizations:
  • Use occlusion queries in the visibility pass.
  • Shoot rays at a lower resolution than the texture.
  • Batch together multiple shooters.
  • Use lower resolution textures to compute indirect lighting. Compute direct lighting separately and add it in later.


Ray Tracing

Ray Tracing on Programmable Graphics Hardware by Timothy J. Purcell, et al. SIGGRAPH 2002

Shows how to design a streaming ray tracer that runs on parallel graphics hardware.

Streaming Ray Tracer

Multi-pass algorithm

Divides the scene into a uniform grid, which is represented by a 3D texture.

Splits the operation into 4 kernels executed as fragment programs.

Uses the stencil buffer to keep track of which pass a ray is on.

Storage

Grid Texture
  • 3D Texture

Triangle List
  • 1D Texture
  • Single Channel

Triangle-Vertex List
  • 1D Texture
  • 3 Channel (RGB)

Eye Ray Generator

Simplest of the kernels.

Given the camera parameters it generates a ray for each screen pixel.

A fragment program is invoked for each pixel which generates a ray.

Also tests rays against the scene’s bounding volume and terminates the ones outside the volume.
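A minimal sketch of per-pixel ray generation follows, assuming a pinhole camera at the origin that looks down the negative z axis with y up. The camera model, the default field of view, and the function name are assumptions for this example. The bounding volume test mentioned above is left out.

```python
import math

def eye_ray(px, py, width, height, fov_y_degrees=60.0):
    """Return (origin, direction) for the ray through pixel (px, py)."""
    aspect = width / height
    half_height = math.tan(math.radians(fov_y_degrees) / 2.0)
    # Pixel center in normalized device coordinates, each in [-1, 1].
    ndc_x = 2.0 * (px + 0.5) / width - 1.0
    ndc_y = 1.0 - 2.0 * (py + 0.5) / height
    d = (ndc_x * aspect * half_height, ndc_y * half_height, -1.0)
    length = math.sqrt(d[0] ** 2 + d[1] ** 2 + d[2] ** 2)
    return (0.0, 0.0, 0.0), (d[0] / length, d[1] / length, d[2] / length)
```

Each call depends only on its own pixel coordinates, which is what lets a fragment program produce all eye rays in one pass.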

Traverser

For each ray it steps through the grid.

A pass is required for each step through the grid.

If a voxel contains triangles, then the ray is marked to run the intersection kernel on triangles in that voxel.

If not, then it continues stepping through the grid.
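The stepping can be sketched with a standard 3D-DDA grid walk. The slides do not say which stepping scheme is used, so treat the algorithm choice as an assumption. `has_triangles` is a hypothetical lookup standing in for the 3D grid texture, voxels are unit cubes, and the ray is assumed to start inside the grid with a nonzero direction.

```python
import math

def traverse_grid(origin, direction, grid_size, has_triangles):
    """Step a ray through a unit-voxel grid; return the first occupied voxel."""
    voxel = [int(math.floor(c)) for c in origin]
    step, t_max, t_delta = [], [], []
    for axis in range(3):
        d = direction[axis]
        if d > 0:
            step.append(1)
            t_max.append((voxel[axis] + 1 - origin[axis]) / d)
            t_delta.append(1.0 / d)
        elif d < 0:
            step.append(-1)
            t_max.append((voxel[axis] - origin[axis]) / d)
            t_delta.append(-1.0 / d)
        else:
            step.append(0)
            t_max.append(math.inf)   # never crosses a boundary on this axis
            t_delta.append(math.inf)
    while all(0 <= voxel[a] < grid_size for a in range(3)):
        if has_triangles(*voxel):
            return tuple(voxel)             # hand this voxel to the intersector
        axis = t_max.index(min(t_max))      # nearest voxel boundary
        voxel[axis] += step[axis]
        t_max[axis] += t_delta[axis]
    return None  # the ray left the grid without meeting geometry
```

Each trip around the `while` loop corresponds to one traversal pass in the multi-pass scheme described above.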

Intersector

Tests the ray for intersection with all triangles within a voxel.

A pass is required for each ray-triangle intersection test.

If an intersection occurs then the ray is marked for execution in the shading stage.

If not, the ray continues in the traversal stage.

Intersection Shader (Pseudo)Code

    // ro, rd: ray origin and direction; list_pos: index into the triangle list
    // h: best hit so far, stored as (t, u, v, id)
    float4 IntersectTriangle( float3 ro, float3 rd, int list_pos, float4 h )
    {
      // fetch the triangle's three vertices
      float tri_id = texture( list_pos, trilist );
      float3 v0 = texture( tri_id, v0 );
      float3 v1 = texture( tri_id, v1 );
      float3 v2 = texture( tri_id, v2 );
      float3 edge1 = v1 - v0;
      float3 edge2 = v2 - v0;
      float3 pvec = Cross( rd, edge2 );
      float det = Dot( edge1, pvec );
      float inv_det = 1/det;
      float3 tvec = ro - v0;
      // barycentric coordinates (u, v) and ray parameter t
      float u = Dot( tvec, pvec ) * inv_det;
      float3 qvec = Cross( tvec, edge1 );
      float v = Dot( rd, qvec ) * inv_det;
      float t = Dot( edge2, qvec ) * inv_det;
      // the hit is valid if it lies inside the triangle and is closer than h
      bool validhit = select( u >= 0.0f, true, false );
      validhit = select( v >= 0, validhit, false );
      validhit = select( u+v <= 1, validhit, false );
      validhit = select( t < h[0], validhit, false );
      validhit = select( t >= 0, validhit, false );
      // keep the previous hit unless this one is valid
      t = select( validhit, t, h[0] );
      u = select( validhit, u, h[1] );
      v = select( validhit, v, h[2] );
      float id = select( validhit, tri_id, h[3] );
      return float4( t, u, v, id );
    }
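The pseudocode follows the shape of the Möller-Trumbore ray-triangle test. A runnable Python version is sketched below for experimentation. Unlike the branch-free shader it returns early for rays parallel to the triangle, and it takes the vertices directly rather than fetching them from textures; both are simplifications made for this example.

```python
def intersect_triangle(ro, rd, v0, v1, v2, t_best=float("inf")):
    """Return (t, u, v) for a hit closer than t_best, otherwise None."""
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    edge1, edge2 = sub(v1, v0), sub(v2, v0)
    pvec = cross(rd, edge2)
    det = dot(edge1, pvec)
    if abs(det) < 1e-12:
        return None  # ray is parallel to the triangle plane
    inv_det = 1.0 / det
    tvec = sub(ro, v0)
    u = dot(tvec, pvec) * inv_det
    qvec = cross(tvec, edge1)
    v = dot(rd, qvec) * inv_det
    t = dot(edge2, qvec) * inv_det
    # Same validity tests as the chained select() calls in the pseudocode.
    if u >= 0 and v >= 0 and u + v <= 1 and 0 <= t < t_best:
        return (t, u, v)
    return None
```

Passing the closest distance found so far as `t_best` plays the role of `h[0]` in the shader, so testing triangles one after another keeps only the nearest hit.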

Shader

This adds the shading for the pixel.

It also generates new rays and marks them for processing in a future rendering pass.

Also gives new rays a weight so the color can be simply added.

Global Illumination using Rasterization

High-Quality Global Illumination Rendering Using Rasterization by Toshiya Hachisuka (GPU Gems 2: Chapter 38)

Instead of adapting global illumination algorithms to the GPU, it makes use of the GPU’s rasterization hardware.

Two-pass methods

First pass uses photon mapping or radiosity to compute a rough approximation of illumination.

In the second pass, the first pass result is refined and rendered.

The most common way to use the first pass is as a source of indirect illumination.

Final Gathering

The process of final gathering is used to compute the amount of indirect light by shooting a large number of rays.
  • This can be the bottleneck.

Sampling and interpolation are used to speed it up.
  • This can lead to rendering artifacts.

Final Gathering via Rasterization

Precomputes directions and traces all of the rays at once using rasterization.

This is done with a parallel projection of the scene along the current direction, called the global ray direction.

Depth Peeling

Each depth layer is a subsection of the scene.

Shoot a ray in the opposite direction of the global ray direction.

This can be achieved by rendering multiple times using a greater-than depth test.

Depth Peeling

Step through the depth layers, computing the indirect illumination until no fragments are rendered.

Repeat with another global ray direction until the number of samples is sufficient.
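The peeling loop for a single pixel can be sketched as follows. This models only the depth bookkeeping, under the assumption that each pass keeps the nearest fragment lying strictly beyond the previous layer; the function name and the list-of-depths input are hypothetical.

```python
def peel_layers(fragment_depths):
    """Yield successive depth layers for one pixel along the ray direction.

    fragment_depths holds the depth of every surface covering the pixel.
    Each pass keeps the nearest fragment that is strictly farther than the
    previous layer, which mimics a greater-than test against the last pass.
    """
    previous = float("-inf")
    while True:
        candidates = [d for d in fragment_depths if d > previous]
        if not candidates:
            return  # no fragments rendered: peeling is finished
        previous = min(candidates)
        yield previous
```

For example, `list(peel_layers([0.7, 0.2, 0.5]))` walks the three surfaces front to back, and the generator stops on the pass that renders nothing, matching the termination rule above.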

Rendering

This method only computes indirect illumination.

The first rendering pass can be done with any CPU or GPU method that computes the irradiance distribution.
  • They suggest Grid Photon Mapping.

We use this in the final gathering pass.

Direct illumination must be computed with a real-time shadowing technique.
  • They suggest shadow mapping and stencil shadows.

Direct and indirect illumination are summed before the final rendering.

Performance

It's hard to compare performance because the algorithms are very different.

Performance is similar to CPU-based sampling/interpolation methods.

It is much faster than a CPU method that samples all pixels.


Photon Mapping

Photon Mapping on Programmable Graphics Hardware by Timothy J. Purcell, et al. SIGGRAPH 2003

Photon Tracing

Each pass of the photon tracing reads from the previous frame.

At each surface interaction a photon is written to the texture and another is emitted.

The initial frame has the photons on the light sources and their random directions.

The direction of each photon bounce is computed from a random number texture.

Photon Map Data Structure

The original photon map algorithm uses a balanced k-d tree for locating the nearest photons.

This structure makes it possible to quickly locate the nearest photons at any point.

It requires random access writes to construct efficiently.
  • This can be slow on the GPU.

Instead we use a uniform grid for storing the photons.
  • Bitonic Merge Sort: fragment program
  • Stencil Routing: vertex program

Fragment Program Method

We can index the photons by grid cell and sort them by cell.

Then find the index of the first photon in each cell using a binary search.

Bitonic Merge Sort is a parallel sorting algorithm that takes O(log^2 n) steps.

It can be implemented as a fragment program with each rendering pass being one stage of the sort.

Page 37: Rendering on the GPU Tom Fili. Agenda Global Illumination using Radiosity Ray Tracing Global Illumination using Rasterization Photon Mapping Rendering

Bitonic Merge SortBitonic Merge Sort

1

2

3

4

5

6

7

8

8x monotonic lists: (3) (7) (4) (8) (6) (2) (1) (5)4x bitonic lists: (3,7) (4,8) (6,2) (1,5)

Page 38: Rendering on the GPU Tom Fili. Agenda Global Illumination using Radiosity Ray Tracing Global Illumination using Rasterization Photon Mapping Rendering

Bitonic Merge SortBitonic Merge Sort

1

2

3

4

5

6

7

8

Sort the bitonic lists

Page 39: Rendering on the GPU Tom Fili. Agenda Global Illumination using Radiosity Ray Tracing Global Illumination using Rasterization Photon Mapping Rendering

Bitonic Merge SortBitonic Merge Sort

1

2

3

4

5

6

7

8

3

8

7

4

5

6

1

2

4x monotonic lists: (3,7) (8,4) (2,6) (5,1)2x bitonic lists: (3,7,8,4) (2,6,5,1)

Page 40: Rendering on the GPU Tom Fili. Agenda Global Illumination using Radiosity Ray Tracing Global Illumination using Rasterization Photon Mapping Rendering

Bitonic Merge SortBitonic Merge Sort

1

2

3

4

5

6

7

8

3

8

7

4

5

6

1

2

Sort the bitonic lists

Page 41: Rendering on the GPU Tom Fili. Agenda Global Illumination using Radiosity Ray Tracing Global Illumination using Rasterization Photon Mapping Rendering

Bitonic Merge SortBitonic Merge Sort

3

8

4

7

2

6

1

5

1

2

3

4

5

6

7

8

3

8

7

4

5

6

1

2

Sort the bitonic lists

Page 42: Rendering on the GPU Tom Fili. Agenda Global Illumination using Radiosity Ray Tracing Global Illumination using Rasterization Photon Mapping Rendering

Bitonic Merge SortBitonic Merge Sort

3

8

4

7

2

6

1

5

1

2

3

4

5

6

7

8

3

8

7

4

5

6

1

2

Sort the bitonic lists

Page 43: Rendering on the GPU Tom Fili. Agenda Global Illumination using Radiosity Ray Tracing Global Illumination using Rasterization Photon Mapping Rendering

Bitonic Merge SortBitonic Merge Sort

3

7

4

8

2

5

1

6

3

8

4

7

2

6

1

5

1

2

3

4

5

6

7

8

3

8

7

4

5

6

1

2

2x monotonic lists: (3,4,7,8) (6,5,2,1)1x bitonic list: (3,4,7,8, 6,5,2,1)

Page 44: Rendering on the GPU Tom Fili. Agenda Global Illumination using Radiosity Ray Tracing Global Illumination using Rasterization Photon Mapping Rendering

Bitonic Merge SortBitonic Merge Sort

3

7

4

8

2

5

1

6

3

8

4

7

2

6

1

5

1

2

3

4

5

6

7

8

3

8

7

4

5

6

1

2

Sort the bitonic list

Page 45: Rendering on the GPU Tom Fili. Agenda Global Illumination using Radiosity Ray Tracing Global Illumination using Rasterization Photon Mapping Rendering

Bitonic Merge SortBitonic Merge Sort

3

2

4

1

7

5

8

6

3

7

4

8

2

5

1

6

3

8

4

7

2

6

1

5

1

2

3

4

5

6

7

8

3

8

7

4

5

6

1

2

Sort the bitonic list

Page 46

Bitonic Merge Sort

[Figure: bitonic sorting network on eight elements (the values 1 to 8), animation step]

Sort the bitonic list

Page 47

Bitonic Merge Sort

[Figure: bitonic sorting network on eight elements (the values 1 to 8), animation step]

Sort the bitonic list

Page 48

Bitonic Merge Sort

[Figure: bitonic sorting network on eight elements (the values 1 to 8), animation step]

Sort the bitonic list

Page 49

Bitonic Merge Sort

[Figure: bitonic sorting network on eight elements (the values 1 to 8), final animation step]

Done!
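The complete sort can be sketched as a sequence of passes in which every output element is computed independently from the previous pass's array, roughly the way a fragment program writes one texel per fragment. This is a CPU illustration in Python, not the GPU implementation; the function name and the example input are assumptions chosen for illustration.

```python
def bitonic_sort_passes(values):
    """Bitonic merge sort for a power-of-two length.

    Returns (sorted_list, passes), where passes holds the array after
    each data-parallel pass.
    """
    a = list(values)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    passes = []
    block = 2  # size of the sorted runs produced by the current stage
    while block <= n:
        distance = block // 2
        while distance >= 1:
            out = []
            for i in range(n):  # one "fragment" per output element
                partner = i ^ distance
                ascending = (i & block) == 0
                # The lower index keeps the minimum in ascending blocks,
                # the maximum in descending blocks.
                keep_min = (i < partner) == ascending
                lo, hi = min(a[i], a[partner]), max(a[i], a[partner])
                out.append(lo if keep_min else hi)
            a = out
            passes.append(a)
            distance //= 2
        block *= 2
    return a, passes


result, passes = bitonic_sort_passes([3, 7, 4, 8, 6, 2, 1, 5])
print(len(passes))  # 6
print(passes[2])    # [3, 4, 7, 8, 6, 5, 2, 1]
print(result)       # [1, 2, 3, 4, 5, 6, 7, 8]
```

For n = 2^k elements the loop runs k(k+1)/2 passes, so eight elements take six passes. With this example input, the third pass produces the bitonic list (3,4,7,8, 6,5,2,1) shown earlier.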

Page 50

Fragment Program Method

Binary search can be used to locate the contiguous block of photons occupying a given grid cell.

We compute an array of the indices of the first photon in every cell. If no photon is found for a cell, the first photon in the next grid cell is located.

The simple fragment program implementation of binary search requires O(log n) photon lookups.

All of the photon lookups can be unrolled into a single rendering pass.
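A minimal Python sketch of this lookup, assuming the photons are already sorted by grid cell index. The search uses a fixed number of iterations to mirror the unrolled lookups. The names `first_photon_index` and `build_cell_starts` and the sample data are hypothetical.

```python
def first_photon_index(sorted_cells, cell):
    """Index of the first photon whose cell id is >= cell.

    sorted_cells[i] is the grid cell id of photon i, in ascending order.
    The loop count is fixed (about log2 n) so it could be unrolled.
    """
    lo, hi = 0, len(sorted_cells)
    for _ in range(len(sorted_cells).bit_length()):
        if lo < hi:
            mid = (lo + hi) // 2
            if sorted_cells[mid] < cell:
                lo = mid + 1
            else:
                hi = mid
    return lo


def build_cell_starts(sorted_cells, num_cells):
    """Start index for every cell, plus one end sentinel."""
    return [first_photon_index(sorted_cells, c) for c in range(num_cells + 1)]


photon_cells = [0, 0, 1, 3, 3, 3, 4]  # cell id of each sorted photon
starts = build_cell_starts(photon_cells, num_cells=6)
print(starts)  # [0, 2, 3, 3, 6, 7, 7]
```

The photons of cell c occupy the index range starts[c] to starts[c + 1]. Cell 2 is empty here, so its entry points at the first photon of cell 3, which is the behavior described above.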

Page 51

Fragment Program Method

Page 52

Vertex Program Method

Since the Bitonic Merge Sort can add many rendering passes, it may not be useful for interactive rendering.

You can use stencil routing to route photons to each grid cell in one rendering pass.

Each grid cell covers an m x m set of pixels.

Draw a point with a point size of m and then use the stencil buffer to send the photon to the correct fragment.
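The slides do not spell out the stencil setup. The simulation below assumes the scheme described in the photon mapping paper listed in the references, as understood here: each cell's m x m block of stencil values starts as the pattern 0 to m^2 - 1, a fragment is written only where the stencil value equals m^2 - 1, and every covered stencil value is incremented for each photon drawn. The function name is hypothetical, and stencil values are unbounded integers rather than a fixed-width buffer.

```python
def route_photons(photon_cells, num_cells, m):
    """Simulate stencil routing of photons into m*m pixel slots per grid cell.

    photon_cells[i] is the grid cell that photon i falls into.
    Returns (stored, dropped): stored[c][p] is the photon id written to
    pixel p of cell c (or None), dropped[c] counts photons that found no slot.
    """
    slots = m * m
    stencil = [list(range(slots)) for _ in range(num_cells)]
    stored = [[None] * slots for _ in range(num_cells)]
    dropped = [0] * num_cells
    for photon_id, cell in enumerate(photon_cells):
        written = False
        for p in range(slots):  # the point of size m covers the whole block
            if stencil[cell][p] == slots - 1:  # assumed stencil test: equal
                stored[cell][p] = photon_id
                written = True
            stencil[cell][p] += 1  # assumed stencil op: always increment
        if not written:
            dropped[cell] += 1
    return stored, dropped


stored, dropped = route_photons([0, 1, 0, 0, 0, 0, 0], num_cells=2, m=2)
print(stored[0], dropped[0])  # [4, 3, 2, 0] 2
print(stored[1], dropped[1])  # [None, None, None, 1] 0
```

Each photon drawn into a cell lands in a different pixel, and once m^2 photons have arrived no stencil value can match again, so later photons for that cell are discarded.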

Page 53

Vertex Program Method

Page 54

Vertex Program Method

There are two drawbacks to this method.

We must read from a photon texture, which requires a readback.

We allocate a fixed amount of memory, so we must redistribute the power for cells with more than m^2 photons, and space is wasted if there are fewer.
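The slides do not say how the power is redistributed. One plausible approach, shown here purely as an assumption, is to scale the stored photons of an overfull cell so that their total power equals the total power of all photons that were routed to it. The function name and the scalar power values are hypothetical.

```python
def redistribute_power(stored_powers, dropped_powers):
    """Scale stored photon powers so the cell keeps its total power.

    stored_powers: powers of the photons that got a slot in the cell.
    dropped_powers: powers of the photons that overflowed the cell.
    """
    stored_total = sum(stored_powers)
    if not dropped_powers or stored_total == 0.0:
        return list(stored_powers)
    # Assumed rule: uniform scale factor that preserves the cell total.
    scale = (stored_total + sum(dropped_powers)) / stored_total
    return [p * scale for p in stored_powers]


# A cell with m = 2 (four slots) that received six unit-power photons.
print(redistribute_power([1.0, 1.0, 1.0, 1.0], [1.0, 1.0]))
# [1.5, 1.5, 1.5, 1.5]
```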

Page 55

Radiance Estimate

We accumulate a radiance value based on a predefined number of nearest photons.

We search all photons in the cell.

If the photon is in the search range, then we add it.

If not, then we ignore it unless we don’t have enough photons. Then we add it and expand the range.
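The gathering rule above can be sketched in Python as follows. The final division by the disc area pi * r^2 is the usual photon-map density estimate and is an assumption here, since the slides do not give the formula; the surface BRDF is left out and photon power is a single scalar for brevity. The function name and sample data are hypothetical.

```python
import math


def radiance_estimate(photons, x, k, radius):
    """Accumulate photons around point x following the rule above.

    photons: iterable of (position, power) pairs from the grid cell.
    k: the predefined number of photons we want at minimum.
    radius: initial search radius (must be positive).
    Returns (estimate, photons_used, final_radius).
    """
    r = radius
    used = 0
    flux = 0.0
    for position, power in photons:
        d = math.dist(position, x)
        if d <= r:
            # Inside the current search range: always add.
            used += 1
            flux += power
        elif used < k:
            # Outside the range but we are short of photons:
            # add it and expand the range to reach it.
            used += 1
            flux += power
            r = d
        # Otherwise the photon is ignored.
    return flux / (math.pi * r * r), used, r


cell_photons = [
    ((0.5, 0.0, 0.0), 1.0),
    ((2.0, 0.0, 0.0), 1.0),
    ((0.0, 0.2, 0.0), 1.0),
    ((3.0, 0.0, 0.0), 1.0),
    ((0.0, 0.0, 4.0), 1.0),
]
estimate, used, r = radiance_estimate(cell_photons, (0.0, 0.0, 0.0), k=3, radius=1.0)
print(used, r)  # 3 2.0
```

Because photons are visited in storage order, the result depends on that order and is not guaranteed to be the exact k nearest neighbors.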

Page 56

Rendering

Use a stochastic ray tracer written as a fragment program to output a texture with all the hit points, normals, and colors for a given ray depth.

This texture is used as input to several additional fragment programs.

One program computes the direct illumination using one or more shadow rays to estimate the visibility of the light sources.

One program invokes the ray tracer to compute reflections and refractions.

One program computes the radiance.

Page 57

Video

Page 58

CUDA Rendering

All of these rendering techniques can be done with CUDA.

They are simpler to implement because you don’t have to store everything in textures and you can use shared memory.

Page 59

CUDA Rendering Demo

Page 60

References

GPU Gems 2 – Chapters 38 & 39

Ray Tracing on Programmable Graphics Hardware by Timothy J. Purcell, et al., Siggraph 2002

Photon Mapping on Programmable Graphics Hardware by Timothy J. Purcell, et al., Siggraph 2003

Jon Olick Video http://www.youtube.com/watch?v=VpEpAFGplnI

CUDA Voxel Demo http://www.geeks3d.com/20090317/cuda-voxel-rendering-engine/