Using wavelets on the XBOX360 For current and future games San Francisco, GDC 2008 Mike Boulton Senior Software Engineer Rare/MGS [email protected]

Using wavelets on the XBOX360For current and future games

San Francisco, GDC 2008

Mike BoultonSenior Software Engineer

Rare/MGS

[email protected]

Introduction

• Fixed compression methods such as DXT are becoming antiquated

• Distribution of data “density” becoming less uniform as complexity increases

• Why can’t compression efforts be focused where it matters most?

• Wavelets can help!

What are wavelets?

• Wavelets are mathematical functions formed from scaled and translated copies of a few basis functions

• The advantage is their ability to localize functions in both frequency and space– Fourier just frequency– “Point basis” just space

• There are lots of common wavelet bases

Why 2D Haar?

• This talk will focus on 2D Haar wavelets• Haar is the simplest basis set– In general, this means more basis terms and/or

blocky artefacts– But also easiest to implement – “Blocky” nature of bases is less of a problem for

operations such as integration• What do the wavelets look like?

2D Haar continued...

• Three basis wavelet functions, plus ‘solid’ scaling function– White represents +1, black -1

The wavelet advantage• Local coverage allows windowed changes– Compared with spherical harmonics, which have

global cover– This means that local changes only involve local

bases• Mip-map chains not required for image

decompression• Truly variable compression– Can focus efforts where it counts

Usage examples• Real-time shader image decompression• Lighting• Static shadow maps• Displacement maps over large areas• Easy dynamic texture packing• Geometry representations• ...many more!

Talk focus• Real-time shader image decompression– Real-time on XBOX360 GPU– 500Hz+ for full-screen 720p monochrome

decompression• Double-product integration for relighting– Real-time (ish!) on XBOX360 GPU

• Other applications– E.G. geometry representation– If time!

Real-time image decompression• Image broken up into 16x16 texel blocks• Each block compressed into wavelet sub-tree• Texture at 1/16th resolution stores scaling

coefficient and offset to start of sub-tree• Why?– Decompression performance should be

independent of main image resolution– Each sub-tree fits inside a texture cache tile, so no

traversal cache thrashing– Can unroll traversal loop in shader for perf. gain

Real-time image decompression continued...

• Given a (u,v) coordinate, pixel shader traverses appropriate sub-tree for final value

• Mip-map chain not required– Obtained as by-product of traversal

• Intermediate memory not required– All decompression performed directly in the pixel

shader• Dynamic predication works well– Good vector coherency if minification avoided


• How is the wavelet tree represented?– Stored breadth-first in a line texture– Multiple ways to pack the data– Example: ARGB8• {r, g, b} stores windowed wavelet basis coefficients• {a} stores linear offset to child of node, if one exists

– If no child exists, we jump to an “empty” node• Wait for the rest of the pixel vector to finish

– Can use wider data formats• E.g. U16, F32


Uncompressed with mipmaps ~2MB @ 1578fps


Wavelet compressed, cut-off 0.05, 249KB @ 579fps


Wavelet compressed, cut-off 0.08, 157KB @ 622fps


• Notice how areas of high contrast have their detail preserved, and areas of lower contrast are “smoothed” out

• Can use a different notion of importance!


• Has advantages over fixed DXT-like compression schemes:– Can be lossless in areas you need it to be

• Particularly important for “data” textures, such as SH coefficients

– Will do a much better job for images with areas of low contrast• Again, higher-order SH terms are a good example• But most surface-parameterised data is likely to be like this

– Can use “focus” ability to increase the area of parameterisation


• (Video)

Double-product integration• Another good application is relighting– Work by [Ng, Ramamoorthi and Hanrahan]– Diffuse BRDF

• We represent both the transfer function (with cosine) and the environment as wavelet trees in a texture

Double-product integration continued...

• Here we need to traverse only the intersection of both trees– So if one is simple, should get good performance

• Parallel GPU traversal needs to know how to “jump over” bits of either tree not contained in the intersection– If a node has a child, store linear offset to sibling

or ancestor (could be root)– Then we can jump over whole child branch if the

other tree has no children at that point


• Performance is a function of the size of the intersection between both trees– Plus other factors such as cache behaviour

• Loads of other interesting ideas– Transfer function wavelet trees have a tendency to

be quite similar in localised patches– Clustering scheme would help further– Could store “representative” tree for each patch,

and then a (smaller) “residual” tree per texel of patch


• (Video)

Other applications• Wavelets can be used to define surface

deformations– As a “dynamic” displacement map

• Can keep memory overhead constant by removing oldest high-frequency terms for new deformation terms

• Windowed nature of wavelets makes applying localised deformations simple

• Can easily modify e.g. moment of inertia

Documents

Using wavelets on the XBOX360 For current and future games San Francisco, GDC 2008 Mike Boulton Senior Software Engineer Rare/MGS [email protected]