Using wavelets on the XBOX360
For current and future games
San Francisco, GDC 2008
Mike Boulton
Senior Software Engineer
Rare/MGS
Introduction
• Fixed compression methods such as DXT are becoming antiquated
• Distribution of data “density” becoming less uniform as complexity increases
• Why can’t compression efforts be focused where it matters most?
• Wavelets can help!
What are wavelets?
• Wavelets are mathematical functions formed from scaled and translated copies of a few basis functions
• The advantage is their ability to localize functions in both frequency and space
– Fourier just frequency
– “Point basis” just space
• There are lots of common wavelet bases
Why 2D Haar?
• This talk will focus on 2D Haar wavelets
• Haar is the simplest basis set
– In general, this means more basis terms and/or blocky artefacts
– But also easiest to implement
– “Blocky” nature of bases is less of a problem for operations such as integration
• What do the wavelets look like?
2D Haar continued...
• Three basis wavelet functions, plus ‘solid’ scaling function
– White represents +1, black -1
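The four patterns can be written out for a single 2x2 block. This is a minimal sketch, not the talk's implementation: the names (`SCALING`, `HORIZONTAL`, `VERTICAL`, `DIAGONAL`, `haar_2x2_forward`, `haar_2x2_inverse`), the pattern orientations, and the divide-by-4 normalisation (chosen so the scaling coefficient is the block mean) are all assumptions made for illustration.

```python
import numpy as np

# Assumed 2x2 Haar patterns: +1 is "white", -1 is "black".
SCALING    = np.array([[ 1,  1], [ 1,  1]])   # 'solid' scaling function
HORIZONTAL = np.array([[ 1, -1], [ 1, -1]])   # left half vs right half
VERTICAL   = np.array([[ 1,  1], [-1, -1]])   # top half vs bottom half
DIAGONAL   = np.array([[ 1, -1], [-1,  1]])   # checkerboard
BASES = [SCALING, HORIZONTAL, VERTICAL, DIAGONAL]

def haar_2x2_forward(block):
    """Project a 2x2 block onto the four patterns.

    Dividing by 4 makes the first coefficient the mean of the block.
    """
    return np.array([np.sum(block * b) / 4.0 for b in BASES])

def haar_2x2_inverse(coeffs):
    """Rebuild the 2x2 block as a weighted sum of the four patterns."""
    return sum(c * b for c, b in zip(coeffs, BASES))

block = np.array([[1.0, 2.0], [3.0, 4.0]])
coeffs = haar_2x2_forward(block)
restored = haar_2x2_inverse(coeffs)
```

With this normalisation the round trip is exact, and zeroing any of the three wavelet coefficients only removes detail around the block mean.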
The wavelet advantage
• Local coverage allows windowed changes
– Compared with spherical harmonics, which have global cover
– This means that local changes only involve local bases
• Mip-map chains not required for image decompression
• Truly variable compression
– Can focus efforts where it counts
Usage examples
• Real-time shader image decompression
• Lighting
• Static shadow maps
• Displacement maps over large areas
• Easy dynamic texture packing
• Geometry representations
• ...many more!
Talk focus
• Real-time shader image decompression
– Real-time on XBOX360 GPU
– 500Hz+ for full-screen 720p monochrome decompression
• Double-product integration for relighting
– Real-time (ish!) on XBOX360 GPU
• Other applications
– E.g. geometry representation
– If time!
Real-time image decompression
• Image broken up into 16x16 texel blocks
• Each block compressed into wavelet sub-tree
• Texture at 1/16th resolution stores scaling coefficient and offset to start of sub-tree
• Why?
– Decompression performance should be independent of main image resolution
– Each sub-tree fits inside a texture cache tile, so no traversal cache thrashing
– Can unroll traversal loop in shader for perf. gain
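The block addressing described above can be sketched on the CPU as follows. This is an illustration under assumptions: `locate`, `BLOCK` and `LEVELS` are hypothetical names, and the clamping of (u,v) at the image edge is a choice made here, not something the talk specifies.

```python
BLOCK = 16   # block edge length in texels, as in the slides
LEVELS = 4   # log2(16): levels of detail inside one block, so a fixed loop count

def locate(u, v, width, height):
    """Map a (u, v) coordinate in [0, 1] to a block index and a texel inside that block.

    The block index addresses the 1/16th-resolution texture (one texel per block);
    the local texel selects the path through that block's sub-tree.
    """
    x = min(int(u * width), width - 1)    # clamp so u == 1.0 stays inside the image
    y = min(int(v * height), height - 1)
    block_xy = (x // BLOCK, y // BLOCK)
    local_xy = (x % BLOCK, y % BLOCK)
    return block_xy, local_xy

centre = locate(0.5, 0.5, 64, 64)
edge = locate(1.0, 0.0, 64, 64)
```

Because every sub-tree has at most `LEVELS` levels regardless of image size, the traversal cost per pixel does not grow with the main image resolution.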
Real-time image decompression continued...
• Given a (u,v) coordinate, pixel shader traverses appropriate sub-tree for final value
• Mip-map chain not required
– Obtained as by-product of traversal
• Intermediate memory not required
– All decompression performed directly in the pixel shader
• Dynamic predication works well
– Good vector coherency if minification avoided
Real-time image decompression continued...
• How is the wavelet tree represented?
– Stored breadth-first in a line texture
– Multiple ways to pack the data
– Example: ARGB8
• {r, g, b} stores windowed wavelet basis coefficients
• {a} stores linear offset to child of node, if one exists
– If no child exists, we jump to an “empty” node
• Wait for the rest of the pixel vector to finish
– Can use wider data formats
• E.g. U16, F32
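A CPU-side sketch of the traversal may make the layout concrete. Several details here are assumptions, not taken from the talk: each node is a tuple `(h, v, d, child)`, `child` is a relative offset to the first of four contiguously stored children (0 meaning "no child"), index 0 of the array is reserved as the all-zero “empty” node, and the sign convention per quadrant is one possible choice. The names `decode_texel`, `nodes`, `EMPTY` and `LEVELS` are hypothetical.

```python
LEVELS = 4   # a 16x16 block has log2(16) = 4 levels
EMPTY = 0    # index of the reserved all-zero node (assumed layout)

def decode_texel(nodes, root, scaling, lx, ly):
    """Walk one sub-tree from its root down to texel (lx, ly) of a 16x16 block.

    The loop always runs LEVELS times, mirroring an unrolled shader loop;
    once a branch ends, the walk parks on the empty node and adds zeros.
    """
    value = scaling                      # block average from the low-res texture
    node = root
    for level in range(LEVELS):
        h, v, d, child = nodes[node]
        shift = LEVELS - 1 - level
        qx = (lx >> shift) & 1           # 0 = left half, 1 = right half
        qy = (ly >> shift) & 1           # 0 = top half, 1 = bottom half
        sh = 1 - 2 * qx                  # +1 on the left, -1 on the right
        sv = 1 - 2 * qy                  # +1 on the top, -1 on the bottom
        value += sh * h + sv * v + sh * sv * d
        # After this step, value is the average of the current quadrant,
        # i.e. the corresponding mip level comes for free.
        node = EMPTY if child == 0 else node + child + (qy * 2 + qx)
    return value

nodes = [
    (0.0, 0.0, 0.0, 0),    # index 0: empty node
    (1.0, 0.5, 0.25, 1),   # index 1: root, children start at 1 + 1 = 2
    (2.0, 0.0, 0.0, 0),    # index 2: top-left child, carries extra detail
    (0.0, 0.0, 0.0, 0),    # index 3: top-right child
    (0.0, 0.0, 0.0, 0),    # index 4: bottom-left child
    (0.0, 0.0, 0.0, 0),    # index 5: bottom-right child
]
top_left = decode_texel(nodes, 1, 10.0, 0, 0)
bottom_right = decode_texel(nodes, 1, 10.0, 15, 15)
```

Only the top-left quadrant of this example carries a second level of detail; texels elsewhere reach the empty node after the first step and simply idle through the remaining iterations.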
Real-time image decompression continued...
Uncompressed with mipmaps ~2MB @ 1578fps
Real-time image decompression continued...
Wavelet compressed, cut-off 0.05, 249KB @ 579fps
Real-time image decompression continued...
Wavelet compressed, cut-off 0.08, 157KB @ 622fps
Real-time image decompression continued...
• Notice how areas of high contrast have their detail preserved, and areas of lower contrast are “smoothed” out
• Can use a different notion of importance!
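The slides do not say exactly how the cut-off is applied; one plausible reading is a magnitude threshold on the wavelet coefficients, optionally scaled by a per-coefficient importance weight. The sketch below assumes that reading; `apply_cutoff` and its `importance` parameter are hypothetical names.

```python
import numpy as np

def apply_cutoff(coeffs, cutoff, importance=None):
    """Zero every coefficient whose weighted magnitude falls below the cut-off.

    importance is an optional array of per-coefficient weights; larger weights
    protect detail in regions that matter more. Returns the pruned coefficients
    and how many survived.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    weights = np.ones_like(coeffs) if importance is None else np.asarray(importance, dtype=float)
    kept = np.where(np.abs(coeffs) * weights >= cutoff, coeffs, 0.0)
    return kept, int(np.count_nonzero(kept))

coeffs = [0.5, 0.04, -0.06, 0.01]
by_contrast, n_contrast = apply_cutoff(coeffs, 0.05)
by_weight, n_weight = apply_cutoff(coeffs, 0.05, importance=[1.0, 2.0, 1.0, 1.0])
```

With uniform weights, contrast alone decides what survives; raising the weight of a region keeps small coefficients there that would otherwise be smoothed out.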
Real-time image decompression continued...
• Has advantages over fixed DXT-like compression schemes:
– Can be lossless in areas you need it to be
• Particularly important for “data” textures, such as SH coefficients
– Will do a much better job for images with areas of low contrast
• Again, higher-order SH terms are a good example
• But most surface-parameterised data is likely to be like this
– Can use “focus” ability to increase the area of parameterisation
Real-time image decompression continued...
• (Video)
Double-product integration
• Another good application is relighting
– Work by [Ng, Ramamoorthi and Hanrahan]
– Diffuse BRDF
• We represent both the transfer function (with cosine) and the environment as wavelet trees in a texture
Double-product integration continued...
• Here we need to traverse only the intersection of both trees
– So if one is simple, should get good performance
• Parallel GPU traversal needs to know how to “jump over” bits of either tree not contained in the intersection
– If a node has a child, store linear offset to sibling or ancestor (could be root)
– Then we can jump over whole child branch if the other tree has no children at that point
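Why only the intersection matters can be shown without the GPU layout. In an orthonormal wavelet basis, the integral of the product of two functions equals the dot product of their coefficient vectors, so a coefficient contributes only if it is non-zero in both trees. The sketch below assumes an orthonormal basis and stores each sparse tree as a dictionary from a node key to its coefficient; `double_product` and the integer keys are hypothetical, and the skip-offset traversal from the slides is not modelled.

```python
def double_product(light, transfer):
    """Sum the products of coefficients that exist in both sparse trees.

    Iterating over the smaller tree means the cost is bounded by the simpler
    of the two inputs, matching the remark that a simple tree gives good performance.
    """
    if len(transfer) < len(light):
        light, transfer = transfer, light
    return sum(c * transfer[k] for k, c in light.items() if k in transfer)

environment = {0: 1.0, 3: 0.5, 7: -0.2}   # node key -> coefficient
transfer_fn = {0: 0.8, 7: 0.5, 9: 1.0}
radiance = double_product(environment, transfer_fn)
```

Only keys 0 and 7 appear in both trees, so the result is 1.0 * 0.8 + (-0.2) * 0.5; keys 3 and 9 are never multiplied.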
Double-product integration continued...
• Performance is a function of the size of the intersection between both trees
– Plus other factors such as cache behaviour
• Loads of other interesting ideas
– Transfer function wavelet trees have a tendency to be quite similar in localised patches
– Clustering scheme would help further
– Could store “representative” tree for each patch, and then a (smaller) “residual” tree per texel of patch
Double-product integration continued...
• (Video)
Other applications
• Wavelets can be used to define surface deformations
– As a “dynamic” displacement map
• Can keep memory overhead constant by removing oldest high-frequency terms for new deformation terms
• Windowed nature of wavelets makes applying localised deformations simple
• Can easily modify e.g. moment of inertia
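The constant-memory idea for deformations can be sketched as a fixed-size pool of wavelet terms. The eviction rule here is one interpretation of "oldest high-frequency terms": when the pool is full, drop the term at the finest level, breaking ties by age, and never the term just added. The class `DeformationBudget` and its fields are hypothetical.

```python
class DeformationBudget:
    """Fixed-capacity store of deformation wavelet terms (illustrative only)."""

    def __init__(self, max_terms):
        self.max_terms = max_terms
        self.terms = {}    # (level, x, y) -> (coefficient, age stamp)
        self.clock = 0     # increases with every added term

    def add(self, level, x, y, coeff):
        """Insert or overwrite a term, evicting old fine-detail terms if over budget."""
        key = (level, x, y)
        self.clock += 1
        self.terms[key] = (coeff, self.clock)
        while len(self.terms) > self.max_terms:
            candidates = [k for k in self.terms if k != key]
            # Highest level first (finest detail), then smallest stamp (oldest).
            victim = max(candidates, key=lambda k: (k[0], -self.terms[k][1]))
            del self.terms[victim]

budget = DeformationBudget(max_terms=2)
budget.add(3, 0, 0, 1.0)   # fine detail, oldest
budget.add(3, 1, 0, 2.0)   # fine detail, newer
budget.add(1, 0, 0, 0.5)   # coarse term arrives; pool is over budget
```

After the third call the oldest level-3 term has been dropped, so the pool still holds two terms and the coarse shape of the deformation is preserved.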