Paris Master Class 2011 - 04 Shadow Maps

Shadow Maps

Wolfgang Engel, Confetti Special Effects Inc., Carlsbad

Paris Master Class

Agenda

• The Shadow Map Basics
• “Attaching” a Shadow Map frustum around a view frustum
• Multi-Frustum Shadow Maps
• Cascaded Shadow Maps (CSM): Splitting up the View
• CSM Challenges
• Cube Shadow Maps
• Softening the Penumbra
• Soft Shadow Maps
• References

Shadows

• In a game: many shadows to consider
  – Cloud shadows: just projected down
  – Character self-shadowing: those are optional shadows with their own frustum that is just around the bounding boxes
  – Sun shadows: Cascaded Shadow Maps
  – Shadows from point, spot and other light types

The Basics

The Basics III

1. Render from the light's point of view -> render depth into the shadow map
2. Render from the eye's point of view
   a. The shadow map is projected down from the light's point of view == standard projective texturing
   b. The current pixel is projected into light space, which gives its depth Rpp
   c. The depth value from the shadow map (D) is compared to the depth value of the pixel (Rpp)
   d. If Rpp == D -> nothing occludes this point -> lit
   e. If Rpp > D -> there must have been an object in front of this point -> shadowed

The Basics II

The Basics IV

// Pseudo code
// Get the pixel depth from the point of view of the light camera
float4 pos = mul(float4(WorldSpace.xyz, 1.0f), LightMatrix);
// Fetch the shadow map depth value
float depth = tex2D(ShadowSampler, pos.xy).x;
// Compare: step() returns 1 (lit) if depth >= pos.z, otherwise 0 (shadowed)
return step(pos.z, depth);

How to Create a Light Frustum

• How to create a light frustum that encloses the view frustum


1. Create a Unit Cube in Post-Projective Space
// bound frustum points in post-projective space
vec3 points[8] = {
    vec3(-1.0f,-1.0f,-1.0f), vec3(1.0f,-1.0f,-1.0f), vec3(-1.0f,1.0f,-1.0f), vec3(1.0f,1.0f,-1.0f),
    vec3(-1.0f,-1.0f, 1.0f), vec3(1.0f,-1.0f, 1.0f), vec3(-1.0f,1.0f, 1.0f), vec3(1.0f,1.0f, 1.0f),
};

[Figure: unit cube in post-projective space, corners labeled (-1.0, 1.0, 0.0) and (1.0, -1.0, 1.0)]


2. Transform it into view space with the inverse projection matrix -> end up with the eight points of the whole view frustum
// world space bound frustum points | transform from post-projective space to camera view space == world space
for(int i = 0; i < 8; i++) {
    vec4 point = iprojection * vec4(points[i]);
    points[i] = vec3(point) / point.w;
}


3. Now we get the direction of the vectors from the near to the far plane
// world space bound frustum directions
vec3 directions[4];
for(int i = 0; i < 4; i++)
    directions[i] = normalize(points[i + 4] - points[i]);


4. Build a bounding box -> around this a sphere
// min and max are the near and far distances of the frustum slice along its edges
BoundBox bb;
for(int j = 0; j < 4; j++) {
    // builds the bounding box by moving planes for the front and back side of the bb forward
    bb.expand(points[j] + directions[j] * min);
    bb.expand(points[j] + directions[j] * max);
}
BoundSphere bs(bb);


5. Build the ortho projection and the lookAt matrix for the light camera frustum
// imodelview – inverse camera view matrix -> transform to world space
vec3 target = imodelview * bs.getCenter();
projections[i] = ortho(bs.getRadius(), -bs.getRadius(), bs.getRadius(), -bs.getRadius(), shadow_range / 1000.0f, shadow_range);
// direction is light direction | shadow_range - depth of light view frustum
modelviews[i] = lookAt(target + direction * shadow_range / 2.0f, target - direction * shadow_range / 2.0f, up);


• View frustum surrounded by
  – AABB surrounded by
    • Sphere surrounded by
      – Light orthographic view frustum

Multi-Frustum Shadow Maps

• Using several light frustums in a scene
  – Per Object [Forsyth] -> self-shadowing on characters
  – Slicing up the view frustum == Cascaded Shadow Maps [Engel]

Cascaded Shadow Maps

• Having several view frustum slices
// split bound frustum
for(int i = 0; i < 4; i++) {
    float k0 = (float)(i + 0) / 4;
    float k1 = (float)(i + 1) / 4;

    // znear + (zfar - znear) * k0 – uniform split scheme
    // znear * powf(zfar / znear, k0) – logarithmic split scheme, see
    // http://appsrv.cse.cuhk.edu.hk/~fzhang/pssm_project/shadow_vrcia.pdf
    float min = lerp(znear * powf(zfar / znear, k0), znear + (zfar - znear) * k0, shadow_distribute);
    float max = lerp(znear * powf(zfar / znear, k1), znear + (zfar - znear) * k1, shadow_distribute);

    BoundBox bb;
    for(int j = 0; j < 4; j++) {
        // builds the bounding box by moving planes for the front and back side of the bb forward
        bb.expand(points[j] + directions[j] * min);
        bb.expand(points[j] + directions[j] * max);
    }
    BoundSphere bs(bb);

    // imodelview – inverse camera view matrix
    vec3 target = imodelview * bs.getCenter();
    projections[i] = ortho(bs.getRadius(), -bs.getRadius(), bs.getRadius(), -bs.getRadius(), shadow_range / 1000.0f, shadow_range);

    // direction is light direction | shadow_range - depth of light view frustum
    modelviews[i] = lookAt(target + direction * shadow_range / 2.0f, target - direction * shadow_range / 2.0f, up);
}
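The split computation in the loop can be isolated into a small CPU-side helper. The sketch below is an illustration, not code from the talk: the name `splitDistance` and its parameter list are hypothetical, and `distribute` plays the role of `shadow_distribute` (0 selects the logarithmic scheme, 1 the uniform one).

```cpp
#include <cmath>

// Hypothetical helper: distance of split boundary i (0..count) from the camera.
// Mirrors lerp(logarithmic, uniform, shadow_distribute) from the loop above.
float splitDistance(int i, int count, float znear, float zfar, float distribute) {
    const float k = static_cast<float>(i) / static_cast<float>(count);
    const float logarithmic = znear * std::pow(zfar / znear, k);
    const float uniform = znear + (zfar - znear) * k;
    // Linear blend between the two schemes.
    return logarithmic + (uniform - logarithmic) * distribute;
}
```

Slice i then spans `splitDistance(i, 4, ...)` to `splitDistance(i + 1, 4, ...)`. A value of `distribute` between 0 and 1 trades the tight near slices of the logarithmic scheme against the even coverage of the uniform one.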

CSMs

CSMs

• Advantages
  – Large shadow distance: the required shadow resolution decreases with distance -> maps farther away can cover larger areas -> LOD scheme
  – Scalability: depending on the hardware, the number of view frustum slices can be increased or decreased; near and far plane can be adjusted; e.g. they can be moved out for a sniper's view or a motion-blurred image
  – Robustness: orthographic projection in combination with several shadow maps aligned to the view is robust under changing light conditions

CSMs

• How does it work
1. Render each object from the POV of the light into a texture atlas of e.g. four shadow maps
   – DX9: objects need to be drawn for each map
   – DX10: the geometry shader can be used to replicate triangles in each map -> object needs to be drawn only once for all maps
2. Set the texture atlas and render all shadowed objects into a screen-space render target in one render pass

CSM Challenges

• How to render all objects in one pass into the screen-space render target?
  – Compare the sphere that surrounds the view frustum slice against the pixel position

CSM Challenges

// 1. Calculate the distance between two points
// 2. Compare to the radius. If smaller, the point is in the sphere
// Original equation
// inside = sqrt((WorldSpace.x - CenterOfSphere.x)^2 + (WorldSpace.y - CenterOfSphere.y)^2 + (WorldSpace.z - CenterOfSphere.z)^2) < RadiusOfSphere
// The equation is simplified like this
// inside = ((CenterOfSphere.x - WorldSpace.x)^2 + (CenterOfSphere.y - WorldSpace.y)^2 + (CenterOfSphere.z - WorldSpace.z)^2) < RadiusOfSphere^2
// For certain hardware platforms those are too many constants. Check out [Valient] for a solution that only utilizes texture coordinate registers
float4 Dist;
Dist.x = dot((WorldSpace.xyz - gShadowSpheres[0].xyz), (WorldSpace.xyz - gShadowSpheres[0].xyz));
Dist.y = dot((WorldSpace.xyz - gShadowSpheres[1].xyz), (WorldSpace.xyz - gShadowSpheres[1].xyz));
Dist.z = dot((WorldSpace.xyz - gShadowSpheres[2].xyz), (WorldSpace.xyz - gShadowSpheres[2].xyz));
Dist.w = dot((WorldSpace.xyz - gShadowSpheres[3].xyz), (WorldSpace.xyz - gShadowSpheres[3].xyz));

// whatever comes first, pick that
// this is distance < radius
// to make this work with the simplified equation, radius comes in as radius^2 in the w channel
float mapToUse = (Dist.x < gShadowSpheres[0].w) ? 0 :
                 (Dist.y < gShadowSpheres[1].w) ? 1 :
                 (Dist.z < gShadowSpheres[2].w) ? 2 :
                 (Dist.w < gShadowSpheres[3].w) ? 3 : 4;

// outside of all spheres -> not shadowed
if(mapToUse == 4)
    return 1.0f;

// Get pixel depth from the point of view of the light camera | read [Zhang] why using world space interpolated can be wrong
float4 pos = mul(float4(WorldSpace.xyz, 1.0f), LightMatrixArr[mapToUse]);
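The sphere test can be checked on the CPU with a few lines of C++. This is a minimal sketch under assumptions: the `ShadowSphere` struct and `selectCascade` function are hypothetical names, and the fourth component stores the squared radius, as in the w channel of `gShadowSpheres`.

```cpp
#include <array>

// Hypothetical CPU mirror of gShadowSpheres: center plus squared radius.
struct ShadowSphere { float x, y, z, radiusSq; };

// Returns the index of the first sphere containing the point, or 4 if none
// does (the shader treats 4 as "not shadowed").
int selectCascade(const std::array<ShadowSphere, 4>& spheres, float px, float py, float pz) {
    for (int i = 0; i < 4; ++i) {
        const float dx = px - spheres[i].x;
        const float dy = py - spheres[i].y;
        const float dz = pz - spheres[i].z;
        // Squared distance against squared radius avoids the sqrt.
        if (dx * dx + dy * dy + dz * dz < spheres[i].radiusSq) return i;
    }
    return 4;
}
```

Because the spheres overlap, the order of the comparisons matters: the nearest cascade that contains the point wins, which is also the one with the highest shadow map resolution.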

CSM Challenges

• Flickering: shadow light frustums are moved with the camera -> data in the shadow maps moves by less than the texel size
• Solution: move light frustums texel-by-texel == quantize light frustum movement

Vector2 fWorldUnitsPerTexel = Vector2(radius / ShadowMapSize.x, radius / ShadowMapSize.y) * 2.0f;
// transform sphere center to light space
…
// fmod - floating-point remainder of numerator/denominator
…
SphereCenter.x -= fmodf( SphereCenter.x, fWorldUnitsPerTexel.x );
SphereCenter.y -= fmodf( SphereCenter.y, fWorldUnitsPerTexel.y );
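The quantization step can be tried in isolation. The following sketch assumes a single light-space coordinate and a square shadow map; `snapToTexel` is a hypothetical helper name, not part of the original code.

```cpp
#include <cmath>

// Hypothetical helper: snaps one light-space coordinate of the sphere center
// to a multiple of the world-space size of one shadow map texel.
float snapToTexel(float lightSpaceCoord, float sphereRadius, int shadowMapSize) {
    // The orthographic frustum spans 2 * radius across shadowMapSize texels.
    const float worldUnitsPerTexel = 2.0f * sphereRadius / static_cast<float>(shadowMapSize);
    return lightSpaceCoord - std::fmod(lightSpaceCoord, worldUnitsPerTexel);
}
```

With a radius of 512 and a 1024 texel map, one texel covers 1.0 world unit, so centers at 10.3 and 10.9 both snap to 10.0 and the shadow map content stays stable while the camera moves inside that texel. Note that `fmod` truncates toward zero, so negative coordinates snap toward zero as well.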

Cube Shadow Maps

• Six view frustums in all six directions -> write into a cube map

• DirectX 10 and higher can replicate primitives in the geometry shader, so that one draw call is enough to fill up all six faces of a cube map

• Shown in the Cube Shadow Map example in the file shadow.shd

• Can be called Ellipsoidal Light Shadow Maps if the light source is ellipsoidal [Engel11]

Ellipsoidal Light Shadows






Cube Shadow Maps

• Simplified shader
[maxvertexcount(18)] // 6 faces * 3 vertices
void GS_CubeMap( triangle VS_OUTPUT_CUBEMAP In[3], inout TriangleStream<GS_OUTPUT_CUBEMAP> CubeMapStream )
{
    // Replicate the incoming triangle once per cube map face
    for( int f = 0; f < 6; ++f )
    {
        // Compute screen coordinates
        GS_OUTPUT_CUBEMAP Out;
        Out.RTIndex = f; // selects the cube map face to render into
        for( int v = 0; v < 3; v++ )
        {
            Out.Pos = mul( In[v].Pos, g_mViewCM[f] );
            Out.Pos = mul( Out.Pos, mProj );
            Out.Tex = In[v].Tex;
            CubeMapStream.Append( Out );
        }
        CubeMapStream.RestartStrip();
    }
}

Cube Shadow Maps

• Culling
  – The most effective optimization for cube shadow maps is to skip rendering into cube map faces
    -> check if the light view frustum of each face is in the regular view frustum
    -> if outside -> skip rendering into it
  – Objects casting shadows that are not visible in the viewing frustum do not need to be drawn into the shadow map
    • Create an extruded bounding box around the shadow caster
    • Extrusion represents the projected shadow
    • Extrusion is created by extending the bounding box of the object in the direction of the light vector -> the resulting bounding box is similar to a frustum
    • The resulting frustum is tested against the view frustum -> if not visible -> not drawn

Cube Shadow Maps

• Filtering of Cube Maps
  – Edges of the cube maps are hard to filter
  – [Waliszewski] constructs 3D texture coordinates and then lerps the results

Softening the Penumbra

• Percentage-Closer Filtering
// perspective projection
projCoords = oTex1.xy / oTex1.w;
// sample nearest 2x2 quad
shadowMapVals.r = tex2D(ShadowSampler, projCoords);
shadowMapVals.g = tex2D(ShadowSampler, projCoords + texelOffsets[1].xy * g_vFullTexelOffset.xy);
shadowMapVals.b = tex2D(ShadowSampler, projCoords + texelOffsets[2].xy * g_vFullTexelOffset.xy);
shadowMapVals.a = tex2D(ShadowSampler, projCoords + texelOffsets[3].xy * g_vFullTexelOffset.xy);
// evaluate shadow map test on quad of shadow map texels
inLight = ( dist < shadowMapVals );
// percent in light
percentInLight = dot(inLight, float4(0.25, 0.25, 0.25, 0.25));
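The key point of percentage-closer filtering is that the depth comparison happens per tap and only the comparison results are averaged. A minimal CPU-side sketch of the 2x2 case, with the hypothetical function name `percentInLight2x2` and the four shadow map depths passed in directly:

```cpp
#include <array>

// Hypothetical helper: averages the results of four depth comparisons,
// mirroring dot(inLight, float4(0.25, ...)) in the shader above.
float percentInLight2x2(const std::array<float, 4>& shadowMapDepths, float pixelDepth) {
    float sum = 0.0f;
    for (float d : shadowMapDepths) {
        // A tap counts as lit when the pixel is closer to the light than
        // the depth stored in the shadow map.
        sum += (pixelDepth < d) ? 0.25f : 0.0f;
    }
    return sum;
}
```

Averaging the depths first and comparing once would give a hard 0 or 1 again, which is why regular texture filtering on a depth shadow map does not soften the edge.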


• Percentage-Closer Filtering
  – Drawbacks
    • needs many samples to fight aliasing
    • cost scales linearly with the sample count


• Exponential Shadow Maps [Salvi]
float depth = tex2D(ShadowSampler, pos.xy).x;
shadow = saturate(2.0 - exp((pos.z - depth) * k));

• Approximate the step function (z - d > 0) by
exp(k * (z - d)) = exp(k * z) * exp(-k * d)
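The factorization is what makes the technique filterable: the factor that depends only on the occluder depth d can be stored in the shadow map and filtered like a regular texture, while the receiver factor is evaluated per pixel. The sketch below uses hypothetical function names and evaluates the shadow term from the slide both directly and in factored form.

```cpp
#include <algorithm>
#include <cmath>

// Shadow term as written on the slide: 1 = lit, 0 = shadowed.
float esmShadow(float receiverDepth, float occluderDepth, float k) {
    const float e = std::exp((receiverDepth - occluderDepth) * k);
    return std::min(std::max(2.0f - e, 0.0f), 1.0f); // saturate
}

// Same term with the exponent split into a receiver and an occluder factor.
// The occluder factor is what a pre-filtered shadow map would provide.
float esmShadowFactored(float receiverDepth, float occluderFactor, float k) {
    const float e = std::exp(k * receiverDepth) * occluderFactor;
    return std::min(std::max(2.0f - e, 0.0f), 1.0f);
}
```

Calling `esmShadowFactored(z, std::exp(-k * d), k)` reproduces `esmShadow(z, d, k)` up to floating-point error. Large values of k sharpen the transition but push the exponentials toward the limits of the texture format.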

Good overview on the latest development in [Bavoil]

Soft Shadows Agenda

• Basics of Soft Shadows
• Point Light Soft Shadow Rendering Architecture
• Min-Z Map
• Collect Shadow Data in Screen-Space
• Screen-Space Anisotropic Filtering
• See [Engel2010] [Engel2011]

Image courtesy of Randy Fernando [Fernando]

Basics of Soft Shadows

• Terminology


• PCSS searches the red region for blockers
  – A blocker is defined by being closer to the light than the receiving point
  – Averages the depth values of the blockers
  – The averaged depth value is used for penumbra size estimation


• Blocker search
// is it a blocker?
if (shadMapDepth < receiver)
{
    blockerSum += shadMapDepth;
    blockerCount++;
    foundBlocker = 1;
}
…
// return average depth of the blockers
result = blockerSum / blockerCount;


• How to calculate the scale factor for the penumbra

• dBlocker – the result of the blocker search
• dReceiver – the depth of the pixel that is currently rendered
• wlight – the light size
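In the symbols listed above, the penumbra estimate from [Fernando] follows from similar triangles between the light, the blocker and the receiver. Written out (the name w_penumbra for the result is an assumption, the other symbols are the ones from the list):

```latex
w_{\mathrm{penumbra}} = \frac{\left(d_{\mathrm{Receiver}} - d_{\mathrm{Blocker}}\right) \cdot w_{\mathrm{light}}}{d_{\mathrm{Blocker}}}
```

The penumbra is zero where the receiver touches the blocker and widens as the receiver moves away from it or as the light gets larger.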


Rendering Architecture

For Point Light Screen-Space soft shadows we are going to change mainly two things:

1. Replace the blocker search with a minimum Z-map or dilated shadow map [Gumbau] -> the blocker search is expensive
2. Anisotropic screen-space filter kernel instead of a light-space filter kernel -> screen-space is less expensive

The algorithm in steps:
1. Calculate the cube shadow map
2. Generate the min Z map [Gumbau]
3. Blend the “unfiltered” exponential shadow map data of all cube maps into a screen-space shadow map
4. Based on each min Z map, calculate
   a. The x and y offset values for the filter kernel based on
      • Adjustment based on distance from camera
      • Penumbra size [Fernando]
      • Anisotropic kernel adjustment [Geusebroek]
   b. Early out value (optimization) [Gumbau]
   -> store the end result in a screen-space texture
5. Apply a screen-space anisotropic Gaussian filter kernel based on 4.

Rendering Architecture

Min-Z Map

• Blocker search
  • is used to determine the distance of the shadow blockers -> dBlocker
  • this is used to determine the penumbra width
• Minimum Z map represents the minimum Z values == closest to the lights of the whole scene ~ kind of like blocker data

Min-Z Map

• Generated from the shadow map
• Into a lower-res render target == coarse shadow map == one pixel represents an area of the original shadow map -> runs only CoarseMapsizeX * CoarseMapsizeY times -> fast
• 2-pass filter kernel that returns the minimum Z values of its area in light space
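The downsampling idea can be shown with a plain CPU reference. This sketch is not the 2-pass GPU kernel from the slide: it is a single-pass version with the hypothetical name `buildMinZMap`, and it assumes a square, row-major shadow map whose size is a multiple of the block size.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical reference: each coarse texel stores the minimum depth of a
// blockSize x blockSize area of the original shadow map.
std::vector<float> buildMinZMap(const std::vector<float>& shadowMap, int size, int blockSize) {
    const int coarseSize = size / blockSize;
    std::vector<float> coarse(static_cast<std::size_t>(coarseSize) * coarseSize,
                              std::numeric_limits<float>::max());
    for (int y = 0; y < coarseSize * blockSize; ++y) {
        for (int x = 0; x < coarseSize * blockSize; ++x) {
            float& cell = coarse[(y / blockSize) * coarseSize + (x / blockSize)];
            cell = std::min(cell, shadowMap[y * size + x]);
        }
    }
    return coarse;
}
```

Since a minimum over a rectangle equals the minimum of the row minima, the same result can be produced by a horizontal pass followed by a vertical pass, which is what makes a 2-pass kernel possible on the GPU.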

Min-Z Map

• Issue: the maximum size of the penumbra is restricted by the size of the filter kernel -> no way to figure out the max size of the penumbra -> sensible user-defined constant value that scales up the filter kernel
  • too high -> artifacts
  • too low -> loss of softness
• In other words: the filter kernel is determined by a
  • value representing the light size
  • + a value that is the magic user-defined constant

Min-Z Map

• Advantage of the Min-Z map approach
  • much faster
  • … therefore allows soft cube shadow maps

• Disadvantage: min-Z value not only from blockers but for the whole lit scene -> min-Z aliasing

Collect Shadow Data in Screen-Space

• All the Exponential Shadow Map data is just blended via BLEND_ADD into a screen-space texture
• This texture can be called shadow collector or shadow mask

Screen-Space Anisotropic Filter Kernel

• Why a screen-space filter kernel?
  • In light space we filter per shadow map
  • In screen-space we filter only once for all shadow maps -> many light sources -> advantage

• Why a Gauss filter kernel?
  • The Gauss filter kernel is separable, unlike the PCSS filter kernel

• What we need to do:
  • Determine the values that scale the filter kernel
  • Write those values for all shadow maps into a screen-space render target
  • … the Gauss filter will read those values later from there

• What are the scale values required for the Gauss filter?
1. The x and y offset values for the filter kernel are based on
   a. Adjustment based on distance from camera
   b. Penumbra size & light size [Fernando]
   c. Anisotropic kernel adjustment [Geusebroek]
2. Early out value (optimization) [Gumbau]

• … store those values in a 16:16 fp render target or calculate them on the fly while filtering

Distance to the Camera

• The screen-space filter kernel gets bigger with increasing distance because of the projection -> decrease the kernel size with increasing distance
• A simple way to do this is 1.0 / (distance^2 * bias) ~ light attenuation
• This requires a linear depth value in camera space [Gillham]:
float depthLin = (-NearClip * Q) / (Depth - Q);
Q = FarClip / (FarClip - NearClip)
Depth = value from depth buffer
• Source code:
// scale based on distance to the viewer
sampleStep.xy = TexelSize.zw * sqrt(1.0f / ((depthLin.xx * depthLin.xx) * bias));
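Both formulas can be checked numerically on the CPU. In this sketch the function names `linearDepth` and `kernelScale` are hypothetical, and the depth buffer value is assumed to lie in [0, 1], with 0 at the near plane and 1 at the far plane.

```cpp
#include <cmath>

// Converts a depth buffer value back to a linear camera-space distance,
// using the formula above with Q = FarClip / (FarClip - NearClip).
float linearDepth(float depth, float nearClip, float farClip) {
    const float q = farClip / (farClip - nearClip);
    return (-nearClip * q) / (depth - q);
}

// Scale factor for the filter step: sqrt(1 / (distance^2 * bias)).
// The texel size multiplication from the shader is left out here.
float kernelScale(float depthLin, float bias) {
    return std::sqrt(1.0f / (depthLin * depthLin * bias));
}
```

With a near plane at 1 and a far plane at 100, a depth of 0 maps to 1 and a depth of 1 maps to 100, and the scale factor halves whenever the linear depth doubles.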

Penumbra Size

• To calculate the Penumbra, [Fernando] suggested the following equation

• dBlocker – the result of the blocker search

• dReceiver – the depth of the pixel that is currently rendered

• wlight – the light size
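As a worked example with concrete numbers, the estimate (dReceiver - dBlocker) * wlight / dBlocker from [Fernando] can be wrapped in a one-line function. The name `penumbraWidth` is hypothetical, and all three inputs are assumed to be in the same units.

```cpp
// Penumbra width estimate by similar triangles:
// (dReceiver - dBlocker) * wLight / dBlocker.
// dBlocker is assumed to be greater than zero.
float penumbraWidth(float dReceiver, float dBlocker, float wLight) {
    return (dReceiver - dBlocker) * wLight / dBlocker;
}
```

A receiver at depth 10 behind a blocker at depth 5 with a light of size 2 gets a penumbra width of 2, and the width drops to 0 where the receiver touches the blocker.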

Penumbra Size

Anisotropic Filter Kernel Adjustment

• Anisotropic filter kernel: round filter kernel projected into an ellipse following the orientation of the geometry [Geusebroek] -> need to determine the shape and orientation of this ellipse

float Aniso = saturate(sqrt(dot( viewVec, normal )));

Screen-Space Anisotropic Filter Kernel

• Screen-space challenges
  • The filter kernel can smear values into the penumbra around corners of geometry
  • Compare the Z value of the pixel with the Z value of the shadow map tap

bool isValidSample = bool( abs(sampleDepth - d) < errDepth );
if (isValidSample && isShadow)
{
    // the sample is considered valid
    sumWeightsOK += weights[i+1];      // accumulate valid weights
    Shadow += sampleL0.x * weights[i+1]; // accumulate weighted shadow value
}
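The rejection test only pays off when the result is renormalized by the weights that were actually accepted. The sketch below shows that idea on the CPU under stated assumptions: the `Tap` struct, the function name `filterShadow` and the `fallback` parameter are hypothetical, the `isShadow` condition from the snippet is left out, and dividing by the accumulated valid weights is an assumption about how `sumWeightsOK` is used.

```cpp
#include <cmath>
#include <vector>

// One filter tap: its depth, its shadow value and its Gauss weight.
struct Tap { float depth; float shadow; float weight; };

// Depth-aware weighted average: taps whose depth differs too much from the
// center pixel are skipped, the rest is renormalized by the accepted weights.
float filterShadow(const std::vector<Tap>& taps, float centerDepth, float errDepth, float fallback) {
    float sumWeightsOK = 0.0f;
    float shadow = 0.0f;
    for (const Tap& tap : taps) {
        if (std::fabs(tap.depth - centerDepth) < errDepth) {
            sumWeightsOK += tap.weight;
            shadow += tap.shadow * tap.weight;
        }
    }
    // With no valid tap, fall back to the caller's unfiltered value.
    return sumWeightsOK > 0.0f ? shadow / sumWeightsOK : fallback;
}
```

Without the renormalization, pixels next to a depth discontinuity would get darker or brighter simply because fewer taps contributed.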

Screen-Space Anisotropic Filter Kernel

• Screen-space challenges
  • “Light in a box” or occlusion of shadow data in general -> should not affect the Gauss filter -> need to deal with occlusion or ignore it (game specific)
  • Overlapping shadows in screen-space -> starts with the philosophical question: what kind of entity are shadows?

Tips & Tricks

• Seriously! Who needs 64 point light shadows perceptually correct on screen?
  • switch them off when the lights are moving fast
  • when they are far away
  • … or in all other cases you can think of
• How to render shadow data into a cube map -> fill up a texture array; then type cast to cube maps
• Try Dual-Paraboloid Shadow Maps … might be faster with DX10 / 11 … I have not tried it so far

Massive Soft Point Light Shadows

16 Point Light Soft Shadows filtered in screen-space





References

• [Bavoil] Louis Bavoil, “Advanced Soft Shadow Mapping Techniques”, http://developer.download.nvidia.com/presentations/2008/GDC/GDC08_SoftShadowMapping.pdf
• [Fernando] Randy Fernando, “Percentage-Closer Soft Shadows”, SIGGRAPH 2005
• [Forsyth] Tom Forsyth, “Making Shadow Buffers Robust Using Multiple Dynamic Frustums”, ShaderX4, pp. 331 – 345
• [Engel11] Wolfgang Engel, “Shadows - Thoughts on Ellipsoid Light Shadow Rendering”, http://altdevblogaday.org/2011/02/28/shadows-thoughts-on-ellipsoid-light-shadow-rendering/
• [Geusebroek] Jan-Mark Geusebroek, Arnold W. M. Smeulders, J. van de Weijer, “Fast anisotropic Gauss filtering”, IEEE Transactions on Image Processing, Volume 12 (8), pp. 938 – 943, 2003
• [Gillham] David Gillham, “Real-Time Depth-of-Field Implemented with a Post-Processing only Technique”, ShaderX5: Advanced Rendering, Charles River Media / Thomson, pp. 163 – 175, ISBN 1-58450-499-4
• [Gumbau] Jesus Gumbau, Miguel Chover, and Mateu Sbert, “Screen-Space Soft Shadows”, GPU Pro, pp. 477 – 490
• [Engel] Wolfgang Engel, “Cascaded Shadow Maps”, ShaderX5, pp. 197 – 206
• [Engel2010] Wolfgang Engel, “Massive Point Light Soft Shadows”, http://www.confettispecialfx.com/massive-point-light-soft-shadows
• [Engel2011] Wolfgang Engel, “Screen-Space: Rules for Designing Graphics Sub-systems (Part I)”, http://altdevblogaday.org/2011/06/13/screen-space-rules-for-designing-graphics-sub-systems-part-i/
• [Salvi] Marco Salvi, “Rendering Filtered Shadows with Exponential Shadow Maps”, ShaderX6. Marco Salvi’s website: http://pixelstoomany.wordpress.com/?s=Exponential
• [Valient] Michal Valient, “Stable Rendering of Cascaded Shadow Maps”, ShaderX6, pp. 231 – 238
• [Waliszewski] Arkadiusz Waliszewski, “Floating-point Cube Maps”, ShaderX2 – Shader Programming Tips and Tricks with DirectX9, Wordware Inc., pp. 319 – 323. http://www.realtimerendering.com/blog/shaderx2-books-available-for-free-download/
• [Zhang] Fan Zhang, Alexander Zaprjagaev, Allan Bentham, “Practical Cascaded Shadow Maps”, ShaderX7, pp. 305 – 329
