Rasterization
Setup triangles (calculate slope values).
Fill triangle: interpolate parameters.
Parameters: R, G, B, z, r, s, t, q.
Pixel Planes
Calculate 3 edge functions: if all the edge functions are positive at a point (x, y), the point is inside the triangle.
E(x, y) = (x – X)dY – (y – Y)dX
E(x, y) > 0 if (x, y) is to the “right” side.
E(x, y) = 0 if (x, y) is exactly on the line.
E(x, y) < 0 if (x, y) is to the “left” side.
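A minimal sketch of the edge function, assuming the edge passes through (X, Y) with direction (dX, dY) and that the y axis points up (which is what makes the positive side the "right" side). The sample edge and points are hypothetical.

```python
def edge(x, y, X, Y, dX, dY):
    """Edge function E(x, y) = (x - X)dY - (y - Y)dX for the line through
    (X, Y) with direction (dX, dY)."""
    return (x - X) * dY - (y - Y) * dX

# Hypothetical edge: passes through (0, 0) and points straight up (+Y).
X, Y, dX, dY = 0, 0, 0, 1

right = edge(2, 5, X, Y, dX, dY)    # point with x > 0
on = edge(0, 7, X, Y, dX, dY)       # point on the line itself
left = edge(-3, 1, X, Y, dX, dY)    # point with x < 0
```

Only the sign of the result is used for classification; the magnitude grows with the distance from the line.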
Classification (1)
A polygon defined by N vertices:
(xi, yi)  0 < i <= N
(x0, y0) = (xN, yN)
The incremental classification of the points around a polygon can be calculated as:
Initial values:
dXi = Xi – X(i-1)
dYi = Yi – Y(i-1)
Ei(Xs, Ys) = (Xs – Xi) dYi – (Ys – Yi) dXi
for 0 < i <= N
Classification (2)
Incremental computation for a unit step along the X and Y axes:
Ei(x + 1, y) = Ei(x, y) + dYi
Ei(x – 1, y) = Ei(x, y) – dYi
Ei(x, y + 1) = Ei(x, y) – dXi
Ei(x, y – 1) = Ei(x, y) + dXi
A fragment is inside the triangle if:
Ei >= 0 for all i : 0 < i <= N
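The incremental classification can be sketched as a small rasterizer that evaluates each Ei once at a start pixel and then only adds dYi (step in X) or subtracts dXi (step in Y). This is an illustration, not a hardware description: the function name, the start pixel (0, 0) and the sample triangle are assumptions, and the vertices must be ordered so that the interior lies on the positive side of every edge.

```python
def rasterize(vertices, width, height):
    """Return the set of integer pixels (x, y) with Ei >= 0 for every edge i.

    Assumes the interior lies on the positive side of every edge
    (clockwise vertex order with the y axis pointing up)."""
    edges = []
    for i in range(len(vertices)):
        (xp, yp), (xi, yi) = vertices[i - 1], vertices[i]
        dx, dy = xi - xp, yi - yp              # dXi, dYi
        e0 = (0 - xi) * dy - (0 - yi) * dx     # Ei at the start pixel (0, 0)
        edges.append((dx, dy, e0))

    inside = set()
    row = [e0 for (_, _, e0) in edges]         # Ei at (0, y)
    for y in range(height):
        e = list(row)                          # Ei at (x, y)
        for x in range(width):
            if all(v >= 0 for v in e):
                inside.add((x, y))
            # Unit step in X: Ei(x + 1, y) = Ei(x, y) + dYi
            e = [v + dy for v, (_, dy, _) in zip(e, edges)]
        # Unit step in Y: Ei(x, y + 1) = Ei(x, y) - dXi
        row = [v - dx for v, (dx, _, _) in zip(row, edges)]
    return inside

# Hypothetical triangle covering x >= 0, y >= 0, x + y <= 4.
pixels = rasterize([(0, 0), (0, 4), (4, 0)], width=6, height=6)
```

After setup, the inner loop needs one addition per edge per pixel and no multiplications.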
Parallel Rasterization
E(x + L, y) = E(x, y) + L dY
Allows a group of interpolators, each responsible for a pixel within a block of contiguous pixels, to simultaneously compute the edge function of an adjacent block in a single cycle.
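As an illustration of the block step, the sketch below models each interpolator as one list element (a "lane"). The helper names, the block width of 4 and the sample edge values are hypothetical.

```python
def block_edge_values(e_left, dY, block_width):
    """Edge values for a row of `block_width` contiguous pixels, given the
    value e_left at the leftmost pixel: E(x + k, y) = E(x, y) + k * dY."""
    return [e_left + k * dY for k in range(block_width)]

def step_block(values, dY, L):
    """Move every lane L pixels to the right at once:
    E(x + L, y) = E(x, y) + L * dY. Each lane performs one addition."""
    stride = L * dY                       # computed once per edge at setup
    return [v + stride for v in values]

# Hypothetical edge with dY = 3 and E = -10 at the first pixel; blocks of 4.
block0 = block_edge_values(-10, 3, 4)     # pixels x .. x+3
block1 = step_block(block0, 3, 4)         # pixels x+4 .. x+7
```

Because every lane adds the same precomputed stride, the lanes are independent and can all step in the same cycle.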
Olano and Greer
Triangle Scan Conversion using 2D Homogeneous Coordinates
Based on the Pixel Planes and Pineda approach (edge functions) but using homogeneous coordinates.
Avoids the need for clipping.
Adds a hither edge function for user clipping.
Perspective-correct interpolation.
Interpolation function
A parameter varies linearly across a triangle in 3D:
u = aX + bY + cZ
The 3D position (X, Y, Z) projects to 2D, using 2DH coords (x = X, y = Y, w = Z). The equation in 2DH space:
u = ax + by + cw
2D perspective-correct function (division by w):
u/w = a x/w + b y/w + c = a X + b Y + c
u/w is a linear function in screen space (X, Y), where X = x/w and Y = y/w.
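A small numeric check of this claim, using exact fractions. The coefficients a, b, c and the two points are hypothetical; the sketch compares interpolating u/w and 1/w in screen space against naively averaging u.

```python
from fractions import Fraction as F

a, b, c = F(2), F(-1), F(3)          # hypothetical coefficients

def u_2dh(x, y, w):
    return a * x + b * y + c * w     # u = ax + by + cw

def project(x, y, w):
    return x / w, y / w              # screen-space (X, Y)

# Two hypothetical points on the triangle, in 2DH coordinates.
p0 = (F(1), F(2), F(1))
p1 = (F(6), F(2), F(4))

# Screen-space midpoint of the two projected points.
(X0, Y0), (X1, Y1) = project(*p0), project(*p1)
Xm, Ym = (X0 + X1) / 2, (Y0 + Y1) / 2

# u/w and 1/w are both linear in (X, Y), so averaging them is exact.
u_over_w = (u_2dh(*p0) / p0[2] + u_2dh(*p1) / p1[2]) / 2
one_over_w = (1 / p0[2] + 1 / p1[2]) / 2
u_correct = u_over_w / one_over_w

# Averaging u directly (no division by w) gives a different, wrong value.
u_naive = (u_2dh(*p0) + u_2dh(*p1)) / 2
```

The interpolated u/w matches aX + bY + c at the midpoint, while the naive average does not match the perspective-correct value.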
Interpolation function
If each vertex has a value for u, we can solve for [a b c] using this equation:
[u0 u1 u2] = [a b c] M, where column i of M is the 2DH vertex (xi, yi, wi), so [a b c] = [u0 u1 u2] M^-1
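One way to solve for [a b c] is to multiply the row vector of per-vertex values by the inverse of the vertex matrix. The sketch below assumes that M holds one 2DH vertex (xi, yi, wi) per column, which is consistent with u = ax + by + cw; the helper names and the sample data are hypothetical.

```python
from fractions import Fraction as F

def inverse3(m):
    """Inverse of a 3x3 matrix (list of rows) via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("singular matrix")
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[F(v) / det for v in row] for row in adj]

def row_times_matrix(v, m):
    """Row vector times matrix: result[j] = sum_k v[k] * m[k][j]."""
    return [sum(v[k] * m[k][j] for k in range(3)) for j in range(3)]

# Hypothetical 2DH vertices (x, y, w); column i of M is vertex i.
verts = [(0, 0, 1), (4, 0, 2), (0, 6, 3)]
M = [[v[r] for v in verts] for r in range(3)]

u = [10, 30, 24]                      # value of u at each vertex
a, b, c = row_times_matrix(u, inverse3(M))
```

Evaluating ax + by + cw at each vertex reproduces the value of u given for that vertex.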
Scan conversion
Edge function parameters: [1 0 0], [0 1 0], [0 0 1].
1/w interpolation parameter: [1 1 1].
Zero-area and back-facing triangles: the 3x3 inverse of M only exists if the determinant of M isn’t 0. The determinant is a function of the area of the triangle.
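The determinant test can be sketched as follows. The mapping from sign to "front" or "back" is an assumption that depends on the winding convention in use; the function names and sample triangles are hypothetical.

```python
def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def vertex_matrix(verts):
    """Column i holds the 2DH vertex (xi, yi, wi)."""
    return [[v[r] for v in verts] for r in range(3)]

def classify(verts):
    d = det3(vertex_matrix(verts))
    if d == 0:
        return "degenerate"      # M has no inverse: nothing to draw
    # Assumption: which sign means "front" depends on the winding convention.
    return "front" if d > 0 else "back"

front = classify([(0, 0, 1), (1, 0, 1), (0, 1, 1)])
back = classify([(0, 0, 1), (0, 1, 1), (1, 0, 1)])       # same triangle, reversed
flat = classify([(0, 0, 1), (1, 1, 1), (2, 2, 1)])       # collinear vertices
```

Since the determinant is needed for the matrix inversion anyway, this test adds no extra setup work.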
Arbitrary clip planes
To add arbitrary clip planes (user clip planes) we need to add new clip edge functions, one per clip plane.
Algorithm
To summarize the algorithm:
setup:
  three edge functions = M^-1 = inverse of 2D homogeneous vertex matrix
  for each clip edge
    clip edge function = dot product test * M^-1
  interpolation function for 1/w = sum of rows of M^-1
  for each parameter
    interpolation function = parameter vector * M^-1
pixel processing:
  interpolate linear edge and parameter functions
  where all edge functions are positive
    w = 1/(1/w)
    for each parameter
      perspective-correct parameter = parameter * w
Cost
Setup:
Calculate the interpolation coefficients and slopes.
1 matrix inversion (1 division, multiple multiplications/additions).
1 matrix-vector multiplication for each parameter. This includes the edge and clip edge functions, the 1/w value and the other parameters (r, g, b, z, s, t, r) (3x3 matrix/vector multiplication: 9 Mul + 6 Add).
Calculate the X and Y slopes (derivatives) for each parameter and the initial value at the first pixels (2 Mul + 2 Add per parameter).
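The setup counts can be tallied with a few lines of arithmetic. This sketch simply applies the per-function figures listed above to every interpolated function and leaves out the matrix inversion; the function name and the choice of 7 parameters are assumptions for illustration.

```python
MATVEC_MUL, MATVEC_ADD = 9, 6     # 3x3 matrix-vector product
SLOPE_MUL, SLOPE_ADD = 2, 2       # slopes and initial value

def setup_cost(num_params, num_clip_edges=0):
    """Multiplications and additions at setup, excluding the matrix inversion.

    Counts 3 edge functions, the clip edge functions, 1/w and `num_params`
    other parameters, each costed with the figures listed above."""
    functions = 3 + num_clip_edges + 1 + num_params
    mul = functions * (MATVEC_MUL + SLOPE_MUL)
    add = functions * (MATVEC_ADD + SLOPE_ADD)
    return mul, add

mul, add = setup_cost(num_params=7)
```

This is an upper bound: the edge functions use the unit vectors [1 0 0], [0 1 0], [0 0 1], so their matrix-vector products reduce to reading rows of the inverse.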
Cost (2)
Per pixel:
  Interpolate parameters: 1 addition per parameter.
  Determine if the 3 edge functions are positive (3 sign tests).
  Determine if the clip edge functions are positive (n sign tests).
Per pixel inside the triangle:
  w = 1/(1/w) (1 division?)
  For each parameter, perspective-correct parameter value: u = (u/w) * w (1 multiplication for each parameter).
Rasterization/Fragments
Calculate the final color value of the fragment:
  Texture read.
  Color sum.
  Fog.
Per fragment (tests)
Determine the visibility of the fragment:
  Ownership test.
  Scissor test.
  Alpha test.
  Stencil test.
  Depth buffer test.
Final pixel color:
  Blending.
  Dithering.
  Logic operation.
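A partial sketch of this chain, covering only the scissor test, the alpha test and blending (the ownership, stencil and depth tests, dithering and logic operations are left out). The comparison used by the alpha test and the blend equation are assumptions, since both are configurable in real APIs; all names are hypothetical.

```python
def process_fragment(frag, scissor, alpha_ref, dst_color):
    """Return the new pixel color, or None if the fragment is discarded.

    frag: dict with "x", "y" and "color" = (r, g, b, a), components in [0, 1].
    scissor: (x0, y0, x1, y1), inclusive lower and exclusive upper bounds.
    Assumptions: the alpha test is "alpha > alpha_ref"; blending is
    src * a + dst * (1 - a)."""
    x0, y0, x1, y1 = scissor
    if not (x0 <= frag["x"] < x1 and y0 <= frag["y"] < y1):
        return None                        # scissor test failed
    r, g, b, a = frag["color"]
    if not a > alpha_ref:
        return None                        # alpha test failed
    return tuple(s * a + d * (1 - a) for s, d in zip((r, g, b), dst_color))

scissor = (0, 0, 4, 4)
blue = (0.0, 0.0, 1.0)
blended = process_fragment({"x": 1, "y": 1, "color": (1.0, 0.0, 0.0, 0.5)},
                           scissor, 0.1, blue)
clipped = process_fragment({"x": 9, "y": 1, "color": (1.0, 0.0, 0.0, 0.5)},
                           scissor, 0.1, blue)
killed = process_fragment({"x": 1, "y": 1, "color": (1.0, 0.0, 0.0, 0.0)},
                          scissor, 0.1, blue)
```

The cheap rejection tests run first, so a discarded fragment never reaches the blending stage.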
Z-Buffer
Visibility test.
  1 read from the Z-buffer (24 bits).
  If the test fails the fragment is discarded.
  If not, 1 write to the Z-buffer (24 bits).
Early Z test (avoids useless work).
Hierarchical Z-buffer: reduces bandwidth.
Z-buffer compression: reduces bandwidth and memory usage.
Fast Z clear.
Pixel shaders that change pixel depth (Z) disable the early Z test.
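The basic read, test and write sequence can be sketched as below. The class name, the "less than" comparison and the read/write counters are assumptions for illustration; hierarchical Z, compression and fast clear are not modeled.

```python
class ZBuffer:
    def __init__(self, width, height, far=(1 << 24) - 1):
        self.width = width
        self.depth = [far] * (width * height)    # 24-bit depth values
        self.reads = 0
        self.writes = 0

    def test_and_update(self, x, y, z):
        """Depth test with a "less than" comparison (an assumption; the
        comparison is configurable). Returns True if the fragment is visible."""
        i = y * self.width + x
        self.reads += 1                          # 1 read from the Z-buffer
        if not z < self.depth[i]:
            return False                         # test failed: discard
        self.depth[i] = z                        # test passed: 1 write
        self.writes += 1
        return True

zb = ZBuffer(4, 4)
visible = [zb.test_and_update(1, 1, z) for z in (500, 800, 200)]
```

Running this test before texturing and shading, as an early Z test does, lets the fragment at depth 800 skip that work entirely.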
Textures
Original: additional color (material) information per pixel. It is used to compensate for the lack of geometry information.
Current: color, normals or any other kind of information.
Different formats (access) supported by hardware (1D, 2D, 3D, cubemap).
Supports dependent reads (use information from a texture as the address to access another texture).
Minification, magnification.
MIP mapping (multum in parvo): multiple levels of detail for a single texture.
Filtering: bilinear (4 accesses to the same mipmap), trilinear (8 accesses to two mipmaps), anisotropic (up to 128 accesses, 16x trilinear).
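A minimal bilinear filter, to show where the 4 accesses come from. It assumes a single-channel texture stored as a list of rows, texel centers at integer coordinates and clamp-to-edge addressing; these conventions and the function name are illustrative choices, not those of any particular hardware.

```python
import math

def bilinear(texture, u, v):
    """Sample a 2D texture (list of rows of floats) at texel-space (u, v).

    Assumptions: texel centers at integer coordinates, clamp-to-edge
    addressing. Reads exactly 4 texels."""
    h, w = len(texture), len(texture[0])
    x0, y0 = math.floor(u), math.floor(v)
    fx, fy = u - x0, v - y0                      # fractional weights

    def clamp(i, n):
        return max(0, min(n - 1, i))

    xa, xb = clamp(x0, w), clamp(x0 + 1, w)
    ya, yb = clamp(y0, h), clamp(y0 + 1, h)
    top = texture[ya][xa] * (1 - fx) + texture[ya][xb] * fx
    bottom = texture[yb][xa] * (1 - fx) + texture[yb][xb] * fx
    return top * (1 - fy) + bottom * fy

tex = [[0.0, 1.0],
       [2.0, 3.0]]
center = bilinear(tex, 0.5, 0.5)
corner = bilinear(tex, 1, 1)
```

Trilinear filtering repeats this on two adjacent mipmap levels and blends the results (2 x 4 = 8 accesses), and 16 trilinear samples give the 128 accesses quoted for anisotropic filtering.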
Register combiners
Multitexture: multiple textures can be read per cycle (multiple texture units per pipe, up to 4 in Matrox Parhelia). Also multiple textures per pass (loop mode, up to 16 in DX9 hardware).
The output of those textures is combined (*, +, ...) with the pixel interpolated color.
First implementation of pixel shaders (not really instructions for a processor, but a configuration for the hardware).
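The idea of combining texture outputs with the interpolated color through a fixed configuration can be sketched as follows. This is a toy model, not the semantics of any real register combiner: the two operations, the register names and the two-stage configuration are all hypothetical.

```python
def combine(op, a, b):
    """Component-wise combine of two RGB triples, clamped to [0, 1]."""
    ops = {"mul": lambda x, y: x * y, "add": lambda x, y: x + y}
    return tuple(min(1.0, max(0.0, ops[op](x, y))) for x, y in zip(a, b))

# A hypothetical two-stage configuration: (tex0 * color) + tex1.
config = [("mul", "tex0", "color"), ("add", "result", "tex1")]

def run(config, registers):
    regs = dict(registers)
    for op, src_a, src_b in config:
        regs["result"] = combine(op, regs[src_a], regs[src_b])
    return regs["result"]

out = run(config, {
    "color": (0.5, 0.5, 0.5),       # interpolated fragment color
    "tex0": (1.0, 0.5, 0.0),        # first texture sample
    "tex1": (0.25, 0.25, 0.25),     # second texture sample
})
```

The configuration only selects inputs and an operation for each of a small, fixed number of stages; there is no program counter, no loops and no general register file.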
GeForce256 Register Combiners
[Figure: register combiner block diagram. Texture fetching fills a register set (Texture 0, Texture 1, Fragment Color, Specular Color, Fog Color/Factor, Spare 0). General Combiner 0 and General Combiner 1 each have 4 RGB inputs, 4 alpha inputs, 3 RGB outputs and 3 alpha outputs. The Final Combiner has 6 RGB inputs and 1 alpha input.]
Texture Effects
A large number of new graphics effects can be achieved with those extended texture functions:
  Cubemap (lighting, shadows).
  Bump mapping (per-pixel lighting/shading).
  Others?
Pixel Shaders
DX9 pixel shaders are true processors. Based on vertex shaders but without branching. They replace (or complement) the register combiner stage.
Most instructions of the vertex shader are present in the pixel shader (except branches). Condition codes, swizzle, negate, absolute value, mask, conditional mask (NV30).
Additional instructions (NV30):
  Texture read: TEX, TXP, TXD.
  Partial derivatives: DDX, DDY.
  Pack/Unpack: PK2H, PK2US, PK4B, PK4UB, PK4UBG, UP2H, UP2US, UP4B, UP4UB, UP4UBG.
  Fragment conditional kill: KIL.
  Extra math: LRP (linear interpolation), X2D (2D coordinate transform), RFL (reflection), POW (exponentiation).
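To make two of the extra math instructions concrete, the sketch below gives plain-Python equivalents of LRP and RFL. The formulas follow the usual definitions of linear interpolation and of reflecting a direction about an axis; treat the operand order as an assumption rather than a statement of the exact instruction encoding.

```python
def lrp(a, b, c):
    """LRP: component-wise a * b + (1 - a) * c."""
    return tuple(ai * bi + (1 - ai) * ci for ai, bi, ci in zip(a, b, c))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rfl(axis, direction):
    """RFL: reflect `direction` about `axis`:
    2 * (axis . direction) / (axis . axis) * axis - direction."""
    k = 2 * dot(axis, direction) / dot(axis, axis)
    return tuple(k * ax - d for ax, d in zip(axis, direction))

white, black = (1.0, 1.0, 1.0), (0.0, 0.0, 0.0)
quarter = lrp((0.25, 0.25, 0.25), white, black)
mirrored = rfl((0, 1, 0), (1, 1, 0))
```

Both operations would otherwise take several multiply and add instructions, so providing them directly shortens lighting and environment-mapping shaders.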
Pixel Shader
Inputs: 1 position (x, y, z, 1/w), 2 colors (4-component RGBA vectors), 8 texture coordinates, 1 fog coordinate.
Outputs: fragment color (RGBA), optionally new fragment depth. In NV30/R300 also output to 4 RGBA textures.
Temporaries (NV30): 32 32-bit registers (64 16-bit registers).
Constants (NV30): unlimited? (maybe memory?). Accessed by ‘name’ (label). Also literal constants (embedded).
R300: 12 temporary registers, 32 constants.
16 samplers and 8 texture coordinates (DX9).
Pixel Shader
R300: 64 ALU instructions, 32 texture instructions, 4 levels of dependent read. Up to 96 instructions (?).
R300:
  ALU instructions: ADD, MOV, MUL, MAD, DP3, DP4, FRAC, RCP, RSQ, EXP, LOG, CMP.
  Texture: TEXLD, TEXLDP, TEXLDBIAS, TEXKILL.
NV30: up to 1024 instructions.