These are confidential sessions—please refrain from streaming, blogging, or taking pictures

With Multiple Render Targets


Advances in OpenGL ES 3.0

Filip Iliescu Graphics and Media Evangelist [email protected]

Apple A7 Processor
OpenGL ES 3.0
Xcode 5

Agenda

• Highlights of the Apple A7 GPU
• Moving from ES2 to ES3
• Deeper dive into ES3
• Tuning using Xcode 5 OpenGL ES Debugger
  ■ New Shader Profiler for A7 GPU

Apple A7 GPU


Overview

• Tile Based Deferred Renderer (TBDR)
• Up to 2x graphics performance
  ■ Compared to A6 (iPhone 5)
• Fully native OpenGL ES 3.0, 2.0
  ■ Shader-based pipeline
  ■ ES1.1 backwards compatibility

• GLSL ES 3.00
• Multiple Render Targets
• Framebuffer fetch
• Transform feedback
• Instanced Rendering
• Uniform Buffer Objects
• ETC2/EAC
• Vertex Array Objects
• Immutable Texture Storage
• Half-Float Color Buffers
• Program Performance View
• Extended Indices
• MSAA Render to Texture
• Separate Shader Objects
• Vertex Buffer Objects
• Pixel Buffer Objects
• Map Buffer Range
• PVRTC
• Render to Mipmap level
• 2D Texture arrays
• Primitive Restart
• Vertex Texture Fetch
• 3D Textures
• Deferred Lighting / Shading
• Non Power of Two Textures
• GLSL ES 1.00
• 1 & 2 component textures
• Per-Line Shader Performance Metrics
• More Texture Units
• Seamless cubemap filtering
• Render to Texture
• Sampler Objects
• sRGB Textures
• Fast texture copy
• Occlusion Query

OpenGL ES Limits

OpenGL ES Attribute                 Apple A7       SGX 554        Apple A7 with ES2 context
Max Texture Image Units             16             8              8
Max Combined Texture Image Units    32             8              8
Max Vertex Texture Image Units      16             8              8
Max Vertex Uniform Vectors          512            128            128
Max Fragment Uniform Vectors        224            64             64
Max Varying Vectors                 15             8              8
Max Color Attachments               4              NA             NA
Max Texture & Renderbuffer Size     4096 x 4096    4096 x 4096    4096 x 4096

Key Differences From A6

• Performance
  ■ No penalty for dependent texture reads
  ■ Higher penalty for framebuffer loads and stores
• Precision
  ■ lowp shader values promoted to 16-bit (mediump)
  ■ All FP shader calculations performed with scalar processor
• Limits
  ■ Apps with ES2 context get ES2 limits
  ■ Apps with ES3 context get ES3 limits

Moving from ES2 to ES3


The Big Picture

• Core
  ■ ES2 is compatible with the ES3 API
  ■ ES2 is a subset of ES3
• Extensions (3 cases)
  ■ Some have moved into the ES3 core as-is
  ■ Some have moved into the ES3 core with semantic changes
  ■ Some extensions in ES2 are still extensions in ES3

Case #1: ES2 extensions promoted directly to ES3 core

• These work identically in ES2 and ES3
• Just remove EXT, APPLE, OES API suffixes

■ OES_depth24
■ OES_element_index_uint
■ OES_fbo_render_mipmap
■ OES_rgb8_rgba8
■ OES_texture_half_float_linear
■ OES_vertex_array_object
■ EXT_blend_minmax
■ EXT_draw_instanced
■ EXT_instanced_arrays
■ EXT_map_buffer_range
■ EXT_occlusion_query_boolean
■ EXT_texture_storage
■ APPLE_sync
■ APPLE_texture_max_level

Case #1 Examples

• EXT_texture_storage
    ES2: glTexStorage2DEXT(GL_TEXTURE_2D, 1, GL_RGBA8_OES, width, height);
    ES3: glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8, width, height);

• EXT_map_buffer_range
    ES2: glMapBufferRangeEXT(GL_ARRAY_BUFFER, offset, length,
             GL_MAP_WRITE_BIT_EXT | GL_MAP_FLUSH_EXPLICIT_BIT_EXT | GL_MAP_UNSYNCHRONIZED_BIT_EXT);
    ES3: glMapBufferRange(GL_ARRAY_BUFFER, offset, length,
             GL_MAP_WRITE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT | GL_MAP_UNSYNCHRONIZED_BIT);

Case #2: ES2 extensions promoted with API changes

■ OES_mapbuffer
■ EXT_discard_framebuffer
■ APPLE_framebuffer_multisample
■ OES_depth_texture
■ OES_packed_depth_stencil
■ OES_texture_float
■ OES_texture_half_float
■ EXT_texture_rg
■ EXT_sRGB

Case #2 Examples

• OES_mapbuffer
    ES2: map = glMapBufferOES(GL_ARRAY_BUFFER, GL_WRITE_ONLY_OES);
    ES3: map = glMapBufferRange(GL_ARRAY_BUFFER, 0, size, GL_MAP_WRITE_BIT);

• EXT_discard_framebuffer
    ES2: glDiscardFramebufferEXT(GL_FRAMEBUFFER, count, attachments);
    ES3: glInvalidateFramebuffer(GL_FRAMEBUFFER, count, attachments);

• APPLE_framebuffer_multisample
    ES2: glResolveMultisampleFramebufferAPPLE();
    ES3: glBlitFramebuffer(0, 0, w, h, 0, 0, w, h, GL_COLOR_BUFFER_BIT, GL_NEAREST);

Case #3: Some extensions in ES2 are still extensions in ES3

• Check GL_EXTENSIONS
  ■ APPLE_copy_texture_levels
  ■ APPLE_rgb_422
  ■ APPLE_texture_format_BGRA_8888
  ■ EXT_color_buffer_half_float
  ■ EXT_debug_label
  ■ EXT_debug_marker
  ■ EXT_read_format_bgra
  ■ EXT_separate_shader_objects
  ■ EXT_shader_framebuffer_fetch
  ■ EXT_texture_filter_anisotropic
  ■ IMG_read_format
  ■ IMG_texture_compression_pvrtc
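
The "Check GL_EXTENSIONS" step can be sketched in plain C. This is a minimal sketch, not code from the session: the helper name has_extension is hypothetical, and it assumes the caller passes in the space-separated string that glGetString(GL_EXTENSIONS) returns, where names carry a GL_ prefix. It matches whole tokens, so a name that is only a prefix of a longer extension name does not count as present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true when `name` appears as a whole
 * space-separated token in `ext_list`. */
bool has_extension(const char *ext_list, const char *name)
{
    assert(name != NULL && name[0] != '\0');
    if (ext_list == NULL)
        return false;   /* no extension string available */

    size_t len = strlen(name);
    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        /* Accept the hit only if it is bounded by spaces or string ends. */
        bool starts_token = (p == ext_list) || (p[-1] == ' ');
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return true;
        p += len;       /* partial hit inside a longer name: keep scanning */
    }
    return false;
}
```

A caller would typically run this check once after context creation and cache the result, for example before compiling a shader that relies on EXT_shader_framebuffer_fetch.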

GLSL ES

• #version 100
  ■ Fully supported
  ■ ES2-style shaders are compatible with both ES2 and ES3
  ■ Version 100 assumed if no #version specified
• #version 300 es
  ■ Many language additions and changes
  ■ Similar to desktop GLSL 330
  ■ Video: “Migrating to OpenGL Core Profile”

Adoption Strategy

• Now
  ■ Test your ES2 based games on iPhone 5s
  ■ Especially, correct any logical buffer loads/stores
• Next
  ■ Support both ES2 and ES3
  ■ Try for an ES3 context, fall back to ES2 if not available
  ■ Handle extension APIs conditionally at runtime
• Some games: Go ES3 only
  ■ Games requiring ES3 features, deferred shading, etc.
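
One way to "handle extension APIs conditionally at runtime" is to branch on the context's version once it has been created. The sketch below is an illustration under assumptions, not the session's code: parse_gles_version is a hypothetical helper, and it relies only on the fact that an OpenGL ES GL_VERSION string begins with "OpenGL ES <major>.<minor>". The vendor text used in the usage note is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: extracts major/minor from a GL_VERSION-style string.
 * Returns true on success; the outputs are meaningful only in that case. */
bool parse_gles_version(const char *version_string, int *major, int *minor)
{
    assert(major != NULL && minor != NULL);
    *major = 0;
    *minor = 0;
    if (version_string == NULL)
        return false;   /* e.g. no current context */
    return sscanf(version_string, "OpenGL ES %d.%d", major, minor) == 2;
}
```

With a result of major >= 3 an app would call core entry points such as glMapBufferRange; otherwise it would use the suffixed ES2 extension entry points such as glMapBufferRangeEXT. For example, parsing "OpenGL ES 3.0 (vendor text)" yields major 3 and minor 0.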

Deeper into OpenGL ES 3.0

• GLSL ES 3.00
• Multiple Render Targets
• Framebuffer fetch
• Transform feedback
• Instanced Rendering
• Uniform Buffer Objects
• ETC2/EAC Compressed Textures
• Vertex Array Objects
• Immutable Texture Storage
• Half-Float Color Buffers
• Program Performance View
• Extended Indices
• More Texture Units
• Separate Shader Objects
• Vertex Buffer Objects
• Pixel Buffer Objects
• Map Buffer Range
• PVRTC
• Render to Mipmap level
• 2D Texture arrays
• Primitive Restart
• New Buffer Formats
• Deferred Lighting / Shading
• Non Power of Two (NPOT) Textures
• MSAA Render to Texture
• GLSL ES 1.00
• 1 & 2 component textures
• Per-Line Shader Performance Metrics
• 3D Textures
• Seamless cubemap filtering
• Render to Texture
• Sampler Objects
• sRGB Textures
• Fast texture copy
• Occlusion Query


Instanced Rendering


Drawing Many (Similar) Objects

Drawing Many Objects: Without instancing

// Draw stars and planet
[self drawStarsAndPlanet];

// Draw asteroids
for (x = 0; x < NUM_ASTEROIDS; x++) {
    // Set asteroid position, rotation, etc.
    glUniformMatrix4fv(asteroidParameters[x]);

    // Draw one asteroid
    glDrawArrays(GL_TRIANGLES, 0, asteroidVertexCount);
}

Instanced Rendering: Faster way to draw many similar objects

• Draws the same object many times
  ■ All in a single draw call
• Each can have different parameters
  ■ Positions
  ■ Rotations
  ■ Texture coordinates
  ■ etc.

Instanced Rendering: Two forms

• Instanced arrays
  ■ All instance parameters stored in an attribute array
• Shader instance ID
  ■ Instance parameters derived from gl_InstanceID in vertex shader
• Both available on all iOS 7 devices
  ■ ES3: In the ES3 core
  ■ ES2: GL_EXT_instanced_arrays, GL_EXT_draw_instanced

Instance ID: Vertex shader

• gl_InstanceID incremented for each instance
  ■ 0, 1, 2, 3, ... n
• You take it from there
  ■ Use ID as input to calculation in shader
  ■ Use ID for lookup in Uniform Buffer Object (UBO)
  ■ Use ID for lookup with Vertex Texture Sampling

Instance ID: Basic example

// Vertex attributes for one asteroid
glVertexAttribPointer(0, ..., vertices);
glVertexAttribPointer(1, ..., normals);
glVertexAttribPointer(2, ..., colors);

// Uniforms for all
glUniformMatrix4fv(modelViewProjectionMatrix);

// All in one draw call
glDrawArraysInstanced(GL_TRIANGLES, 0, NUM_VERTICES, NUM_ASTEROIDS);

Instance ID: Vertex shader, basic example

#version 300 es
in vec4 inPos;
in vec3 inNorm, inColor;
uniform float spacing;
uniform mat4 cameraMVP;
...
void main()
{
    vec4 pos = inPos;

    ivec2 instancePosition = ivec2(gl_InstanceID % 100, gl_InstanceID / 100);
    pos.xy += vec2(instancePosition) * spacing;

    gl_Position = cameraMVP * pos;
}
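
To see what the instance ID arithmetic in the vertex shader produces, the same mapping can be mirrored on the CPU. This is a sketch for illustration only: Offset2D and instance_offset are hypothetical names, and the grid width is passed as a parameter where the shader hard-codes 100.

```c
#include <assert.h>

typedef struct {
    float x;
    float y;
} Offset2D;

/* CPU mirror of the shader's placement math: instance IDs fill a grid
 * row by row, `columns` instances per row, `spacing` units apart. */
Offset2D instance_offset(int instance_id, int columns, float spacing)
{
    assert(instance_id >= 0 && columns > 0);
    Offset2D offset;
    offset.x = (float)(instance_id % columns) * spacing;  /* column index */
    offset.y = (float)(instance_id / columns) * spacing;  /* row index */
    return offset;
}
```

With 100 columns and a spacing of 2.0, instance 99 lands at the end of the first row (x = 198, y = 0) and instance 100 starts the second row (x = 0, y = 2).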

Demo: Instanced Asteroids

Instanced Rendering: Demo is doing much more

• Instance ID
  ■ Used to look up matrix in UBO
  ■ Used as seed for spin rate per-asteroid
• Uniform Buffer Object
  ■ Holds transformation, color data for each instance
  ■ Limited size
• Transform feedback & rasterizer discard
  ■ Vertex stage only
  ■ Used to populate the UBO at startup with model view matrix, etc.
  ■ Reduces per-vertex calculations to per-instance
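
The "Limited size" bullet puts a ceiling on how many instances one uniform block can describe. The sketch below estimates that ceiling. It is not taken from the demo: the helper name and the per-instance layout (one mat4 plus one vec4 color under std140, 64 + 16 bytes) are assumptions. The block size to feed in would come from querying GL_MAX_UNIFORM_BLOCK_SIZE, for which OpenGL ES 3.0 guarantees at least 16384 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* std140 sizes: a mat4 is four vec4 columns, a vec4 is four floats. */
enum {
    MAT4_BYTES_STD140 = 64,
    VEC4_BYTES_STD140 = 16
};

/* Hypothetical helper: how many per-instance records fit in one block. */
size_t max_instances_per_block(size_t max_block_bytes, size_t bytes_per_instance)
{
    assert(bytes_per_instance > 0);
    return max_block_bytes / bytes_per_instance;
}
```

At the guaranteed minimum of 16384 bytes, a block holds 256 bare matrices, or 204 records of matrix plus color. Larger instance counts would need several draw calls or a different lookup, such as the vertex texture sampling option listed earlier.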

Multiple Render Targets


Multiple Render Targets: Concept

• Render to multiple textures or renderbuffers from a single draw call
  ■ 4 outputs from fragment shader
  ■ Quadruple the channels
• Enables deferred lighting/shading, other effects
• Each attachment’s format can differ from the others
• 128 bits per pixel
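
A G-buffer layout can be checked against these limits before any framebuffer is created. The sketch below reads the "128 bits per pixel" bullet as a combined budget across all color attachments, which is an interpretation of the slide rather than something it states outright; fits_mrt_budget is a hypothetical helper, and the caller supplies the per-format sizes (for example 32 bits for RGBA8 or R32UI, 64 bits for RGBA16F).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when the attachment count and the summed
 * bits per pixel both stay within the given limits. */
bool fits_mrt_budget(const unsigned *bits_per_pixel, size_t count,
                     size_t max_attachments, unsigned max_total_bits)
{
    assert(bits_per_pixel != NULL || count == 0);
    if (count > max_attachments)
        return false;

    unsigned total = 0;
    for (size_t i = 0; i < count; i++)
        total += bits_per_pixel[i];
    return total <= max_total_bits;
}
```

Four 32-bit attachments use exactly 128 bits and pass with limits of 4 attachments and 128 bits. Two RGBA16F targets plus one RGBA8 target add up to 160 bits and fail, even though only three attachments are used.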

Deferred Shading: Using Multiple Render Targets

[Diagram: Geometry Stage, Fragment Shader, Lighting Stage (+ Lighting)]

Multiple Render Targets: Setup

// Define 4 color attachments for currently bound framebuffer
GLenum renderbuffers[] = { GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1,
                           GL_COLOR_ATTACHMENT2, GL_COLOR_ATTACHMENT3 };

// Attach textures as output buffers (the texture target argument is required)
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, colorTex, 0);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT1, GL_TEXTURE_2D, normalTex, 0);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT2, GL_TEXTURE_2D, depthTex, 0);
glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT3, GL_TEXTURE_2D, albedoTex, 0);

// Tell GL to enable buffers to draw into
glDrawBuffers(4, renderbuffers);

// Draw
glDrawElements(...);

Multiple Render Targets: Fragment shader

#version 300 es

layout(location = 0) out lowp vec4 fs_color_output;
layout(location = 1) out lowp vec4 fs_normal_output;
layout(location = 2) out highp uint fs_depth_output;
layout(location = 3) out lowp vec4 fs_albedo_output;

void main(void)
{
    fs_color_output = ...
    fs_normal_output = ...
    fs_depth_output = ...
    fs_albedo_output = ...
}

Multiple Render Targets: For Deferred Shading

Framebuffer Fetch


Framebuffer Fetch: EXT_shader_framebuffer_fetch

• Provides current destination color in fragment shader
• Syntax
  ■ Built-in variable in #version 100 shaders
      gl_LastFragData[0]
  ■ User-declared in #version 300 es
      layout(location = 0) inout lowp vec4 my_destination_name;
• Useful for
  ■ Programmable blending
  ■ Local post-processing effects
  ■ Fetching non-color framebuffer data

Framebuffer Fetch: With Multiple Render Targets

• Read and write multiple attachments
  ■ Read with framebuffer fetch
  ■ Write with MRT
• Read from one, write to another
• All from the same shader

Deferred Shading in One Pass: Using Multiple Render Targets and Framebuffer Fetch

[Diagram: Output G-buffers, Compute (and Output) Lighting, Clean up and Present]

Deferred Shading on A7: Three stages

• Multiple Render Targets
  ■ Render G-buffer attachments in one pass
  ■ Formats can vary between attachments
• Framebuffer fetch
  ■ Render deferred lights in the same pass
  ■ Read from all attachments, write to one
  ■ G-buffer becomes per-pixel scratch space
• Framebuffer invalidate
  ■ To avoid logical buffer stores

OpenGL ES Tools

Demo: Xcode 5 OpenGL ES Debugger

Apple A7 Processor
OpenGL ES 3.0
OpenGL ES Debugger

More Information

Filip Iliescu
Graphics and Media Evangelist
[email protected]

Apple Developer Forums
http://devforums.apple.com/

Developer Documentation
http://developer.apple.com/opengles/

Migrating to OpenGL Core Profile video
http://developer.apple.com/opengl/