These are confidential sessions—please refrain from streaming, blogging, or taking pictures
Advances in OpenGL ES 3.0
Filip Iliescu Graphics and Media Evangelist [email protected]
Apple A7 Processor, OpenGL ES 3.0, Xcode 5
Agenda
• Highlights of the Apple A7 GPU
• Moving from ES2 to ES3
• Deeper dive into ES3
• Tuning using Xcode 5 OpenGL ES Debugger
  ■ New Shader Profiler for A7 GPU
Apple A7 GPU
Overview
• Tile Based Deferred Renderer (TBDR)
• Up to 2x graphics performance
  ■ Compared to A6 (iPhone 5)
• Fully native OpenGL ES 3.0, 2.0
  ■ Shader-based pipeline
  ■ ES1.1 backwards compatibility
GLSL ES 3.00
Multiple Render Targets
Framebuffer Fetch
Transform feedback
Instanced Rendering
Uniform Buffer Objects
ETC2/EAC
Vertex Array Objects
Immutable Texture Storage
Half-Float Color Buffers
Program Performance View
Extended Indices
MSAA Render to Texture
Separate Shader Objects
Vertex Buffer Objects
Pixel Buffer Objects
Map Buffer Range
PVRTC
Render to Mipmap level
2D Texture arrays
Primitive Restart
Vertex Texture Fetch
3D Textures
Deferred Lighting / Shading
Non Power of Two Textures
GLSL ES 1.00
1 & 2 component textures
Per-Line Shader Performance Metrics
More Texture Units
Seamless cubemap filtering
Render to Texture
Sampler Objects
sRGB Textures
Fast texture copy
Occlusion Query
OpenGL ES Limits
OpenGL ES Attribute                 Apple A7      SGX 554       Apple A7 with ES2 context
Max Texture Image Units             16            8             8
Max Combined Texture Image Units    32            8             8
Max Vertex Texture Image Units      16            8             8
Max Vertex Uniform Vectors          512           128           128
Max Fragment Uniform Vectors        224           64            64
Max Varying Vectors                 15            8             8
Max Color Attachments               4             NA            NA
Max Texture & Renderbuffer Size     4096 x 4096   4096 x 4096   4096 x 4096
Key Differences From A6
• Performance
  ■ No penalty for dependent texture reads
  ■ Higher penalty for framebuffer loads and stores
• Precision
  ■ lowp shader values promoted to 16-bit (mediump)
  ■ All FP shader calculations performed with scalar processor
• Limits
  ■ Apps with ES2 context get ES2 limits
  ■ Apps with ES3 context get ES3 limits
Moving from ES2 to ES3
The Big Picture
• Core
  ■ ES2 is compatible with the ES3 API
  ■ ES2 is a subset of ES3
• Extensions (3 cases)
  ■ Some have moved into the ES3 core as-is
  ■ Some move into ES3 core with semantic changes
  ■ Some extensions in ES2 are still extensions in ES3
Case #1
ES2 extensions promoted directly to ES3 core
• These work identically in ES2 and ES3
• Just remove EXT, APPLE, OES API suffixes
  ■ OES_depth24
  ■ OES_element_index_uint
  ■ OES_fbo_render_mipmap
  ■ OES_rgb8_rgba8
  ■ OES_texture_half_float_linear
  ■ OES_vertex_array_object
  ■ EXT_blend_minmax
  ■ EXT_draw_instanced
  ■ EXT_instanced_arrays
  ■ EXT_map_buffer_range
  ■ EXT_occlusion_query_boolean
  ■ EXT_texture_storage
  ■ APPLE_sync
  ■ APPLE_texture_max_level
Case #1 Examples
• EXT_texture_storage
    ES2: glTexStorage2DEXT(GL_TEXTURE_2D, 1, GL_RGBA8_OES, width, height);
    ES3: glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8, width, height);
• EXT_map_buffer_range
    ES2: glMapBufferRangeEXT(GL_ARRAY_BUFFER, offset, length,
             GL_MAP_WRITE_BIT_EXT | GL_MAP_FLUSH_EXPLICIT_BIT_EXT | GL_MAP_UNSYNCHRONIZED_BIT_EXT);
    ES3: glMapBufferRange(GL_ARRAY_BUFFER, offset, length,
             GL_MAP_WRITE_BIT | GL_MAP_FLUSH_EXPLICIT_BIT | GL_MAP_UNSYNCHRONIZED_BIT);
Case #2
ES2 extensions promoted with API changes
  ■ OES_mapbuffer
  ■ EXT_discard_framebuffer
  ■ APPLE_framebuffer_multisample
  ■ OES_depth_texture
  ■ OES_packed_depth_stencil
  ■ OES_texture_float
  ■ OES_texture_half_float
  ■ EXT_texture_rg
  ■ EXT_sRGB
Case #2 Examples
• OES_mapbuffer
    ES2: map = glMapBufferOES(GL_ARRAY_BUFFER, GL_WRITE_ONLY_OES);
    ES3: map = glMapBufferRange(GL_ARRAY_BUFFER, 0, size, GL_MAP_WRITE_BIT);
• EXT_discard_framebuffer
    ES2: glDiscardFramebufferEXT(GL_FRAMEBUFFER, count, attachments);
    ES3: glInvalidateFramebuffer(GL_FRAMEBUFFER, count, attachments);
• APPLE_framebuffer_multisample
    ES2: glResolveMultisampleFramebufferAPPLE();
    ES3: glBlitFramebuffer(0, 0, w, h, 0, 0, w, h, GL_COLOR_BUFFER_BIT, GL_NEAREST);
Case #3
Some extensions in ES2 are still extensions in ES3
• Check GL_EXTENSIONS
  ■ APPLE_copy_texture_levels
  ■ APPLE_rgb_422
  ■ APPLE_texture_format_BGRA_8888
  ■ EXT_color_buffer_half_float
  ■ EXT_debug_label
  ■ EXT_debug_marker
  ■ EXT_read_format_bgra
  ■ EXT_separate_shader_objects
  ■ EXT_shader_framebuffer_fetch
  ■ EXT_texture_filter_anisotropic
  ■ IMG_read_format
  ■ IMG_texture_compression_pvrtc
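A minimal sketch of the "Check GL_EXTENSIONS" step, assuming the space-separated string returned by glGetString(GL_EXTENSIONS) has already been fetched. In that string the names carry a GL_ prefix (for example GL_EXT_debug_label). The helper name has_extension is hypothetical and not part of OpenGL ES; it only shows why a whole-token match is safer than a plain substring search.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a GL call): returns true if `name` appears as a
 * complete space-separated token in `extensions`.
 * A bare strstr() is not enough, because "GL_EXT_sRGB" would also match
 * inside a longer name such as "GL_EXT_sRGB_write_control". */
bool has_extension(const char *extensions, const char *name)
{
    if (extensions == NULL || name == NULL || *name == '\0')
        return false;

    const size_t len = strlen(name);
    const char *p = extensions;

    while ((p = strstr(p, name)) != NULL) {
        /* The match must begin and end on a token boundary. */
        const bool starts_token = (p == extensions) || (p[-1] == ' ');
        const bool ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return true;
        p += len;
    }
    return false;
}
```

In an app, the first argument would typically be (const char *)glGetString(GL_EXTENSIONS), queried once after the context is made current and cached.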
GLSL ES
• #version 100
  ■ Fully supported
  ■ ES2-style shaders are compatible with both ES2 and ES3
  ■ Version 100 assumed if no #version specified
• #version 300 es
  ■ Many language additions and changes
  ■ Similar to desktop GLSL 330
    ■ Video: “Migrating to OpenGL Core Profile”
Adoption Strategy
• Now
  ■ Test your ES2-based games on iPhone 5s
  ■ Especially, correct any logical buffer loads/stores
• Next
  ■ Support both ES2 and ES3
  ■ Try for an ES3 context, fall back to ES2 if not available
  ■ Handle extension APIs conditionally at runtime
• Some games: Go ES3 only
  ■ Games requiring ES3 features, deferred shading, etc.
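The "try for an ES3 context, fall back to ES2" step can be sketched in plain C as follows. Everything named here (api_level, create_context_fn, create_best_context and the two stand-in device functions) is hypothetical and exists only for illustration. On iOS the real creation call is EAGLContext's initWithAPI:, tried first with kEAGLRenderingAPIOpenGLES3 and then with kEAGLRenderingAPIOpenGLES2.

```c
#include <stddef.h>

/* Hypothetical API levels, standing in for the platform's own constants. */
typedef enum { API_NONE = 0, API_ES2 = 2, API_ES3 = 3 } api_level;

/* Hypothetical factory callback: returns a context handle, or NULL when the
 * requested API level is not available on the device. */
typedef void *(*create_context_fn)(api_level level);

typedef struct {
    void *context;    /* NULL if no context could be created */
    api_level level;  /* API level actually obtained */
} gl_context;

/* Try ES3 first, then fall back to ES2. */
gl_context create_best_context(create_context_fn create)
{
    const api_level preferred[] = { API_ES3, API_ES2 };
    gl_context result = { NULL, API_NONE };

    for (size_t i = 0; i < sizeof preferred / sizeof preferred[0]; i++) {
        void *ctx = create(preferred[i]);
        if (ctx != NULL) {
            result.context = ctx;
            result.level = preferred[i];
            break;
        }
    }
    return result;
}

/* Stand-in factories for illustration only: a device that supports just
 * ES2, and a device that supports both ES2 and ES3. */
static int dummy_context;

void *es2_only_device(api_level level)
{
    return level == API_ES2 ? (void *)&dummy_context : NULL;
}

void *es3_device(api_level level)
{
    (void)level;
    return &dummy_context;
}
```

The returned level can then drive the runtime decisions listed above, such as choosing between core entry points and their suffixed extension equivalents.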
Deeper into OpenGL ES 3.0
GLSL ES 3.00
Multiple Render Targets
Framebuffer fetch
Transform feedback
Instanced Rendering
Uniform Buffer Objects
ETC2/EAC Compressed Textures
Vertex Array Objects
Immutable Texture Storage
Half-Float Color Buffers
Program Performance View
Extended Indices
More Texture Units
Separate Shader Objects
Vertex Buffer Objects
Pixel Buffer Objects
Map Buffer Range
PVRTC
Render to Mipmap level
2D Texture arrays
Primitive Restart
New Buffer Formats
NPOT textures
Deferred Lighting / Shading
Non Power of Two Textures
MSAA Render to Texture
GLSL ES 1.00
1 & 2 component textures
Per-Line Shader Performance Metrics
3D Textures
Seamless cubemap filtering
Render to Texture
Sampler Objects
sRGB Textures
Fast texture copy
Occlusion Query
Instanced Rendering
Drawing Many (Similar) Objects
Drawing Many Objects
Without instancing
    // Draw stars and planet
    [self drawStarsAndPlanet];

    // Draw asteroids
    for (x = 0; x < NUM_ASTEROIDS; x++) {
        // Set asteroid position, rotation, etc.
        glUniformMatrix4fv(asteroidParameters[x]);

        // Draw one asteroid
        glDrawArrays(GL_TRIANGLES, 0, asteroidVertexCount);
    }
Instanced Rendering
Faster way to draw many similar objects
• Draws the same object many times
  ■ All in a single draw call
• Each can have different parameters
  ■ Positions
  ■ Rotations
  ■ Texture coordinates
  ■ etc.
Instanced Rendering
Two forms
• Instanced arrays
  ■ All instance parameters stored in an attribute array
• Shader instance ID
  ■ Instance parameters derived from gl_InstanceID in vertex shader
• Both available on all iOS 7 devices
  ■ ES3: In the ES3 core
  ■ ES2: GL_EXT_instanced_arrays, GL_EXT_draw_instanced
Instance ID
Vertex shader
• gl_InstanceID incremented for each instance
  ■ 0, 1, 2, 3, ... n
• You take it from there
  ■ Use ID as input to calculation in shader
  ■ Use ID for lookup in Uniform Buffer Object (UBO)
  ■ Use ID for lookup with Vertex Texture Sampling
Instance ID
Basic example
    // Vertex attributes for one asteroid
    glVertexAttribPointer(0, ..., vertices);
    glVertexAttribPointer(1, ..., normals);
    glVertexAttribPointer(2, ..., colors);

    // Uniforms for all
    glUniformMatrix4fv(modelViewProjectionMatrix);

    // All in one draw call
    glDrawArraysInstanced(GL_TRIANGLES, 0, NUM_VERTICES, NUM_ASTEROIDS);
Instance ID
Vertex shader - basic example
    #version 300 es
    in vec4 inPos;
    in vec3 inNorm, inColor;
    uniform float spacing;
    uniform mat4 cameraMVP;
    ...
    void main() {
        vec4 pos = inPos;

        ivec2 instancePosition = ivec2(gl_InstanceID % 100, gl_InstanceID / 100);
        pos.xy += vec2(instancePosition) * spacing;

        gl_Position = cameraMVP * pos;
    }
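To see what the shader's instance math produces, the same integer arithmetic can be mirrored on the CPU. This is only an illustrative sketch: vec2f and instance_offset are hypothetical names, and the row width (hard-coded as 100 in the shader) is made a parameter here.

```c
/* Hypothetical 2D float pair, standing in for GLSL's vec2. */
typedef struct { float x, y; } vec2f;

/* CPU-side mirror of the shader's per-instance offset:
 *   ivec2(gl_InstanceID % 100, gl_InstanceID / 100) * spacing
 * Instances fill a grid row by row, `columns` instances per row. */
vec2f instance_offset(int instance_id, int columns, float spacing)
{
    vec2f offset;
    offset.x = (float)(instance_id % columns) * spacing;  /* column index */
    offset.y = (float)(instance_id / columns) * spacing;  /* row index */
    return offset;
}
```

With 100 columns and a spacing of 2.0, instance 250 lands in column 50 of row 2, so its offset is (100.0, 4.0). A helper like this can be handy for checking a shader's layout against expected positions.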
Demo: Instanced Asteroids
Instanced Rendering
Demo is doing much more
• Instance ID
  ■ Used to look up matrix in UBO
  ■ Used as seed for spin rate per asteroid
• Uniform Buffer Object
  ■ Holds transformation, color data for each instance
  ■ Limited size
• Transform feedback & rasterizer discard
  ■ Vertex stage only
  ■ Used to populate the UBO at startup with model view matrix, etc.
  ■ Reduces per-vertex calculations to per-instance
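Because the UBO has a limited size, the number of instances that fit in one uniform block bounds how many can be drawn per call. The sketch below estimates that, assuming a hypothetical per-instance record of one mat4 plus one vec4 (80 bytes under std140 layout). The block size should be queried at runtime with glGetIntegerv(GL_MAX_UNIFORM_BLOCK_SIZE); 16384 bytes is the minimum the ES 3.0 specification guarantees and is not a measured A7 value. Both function names are illustrative.

```c
#include <stddef.h>

/* How many per-instance records fit in one uniform block. */
size_t instances_per_block(size_t max_block_bytes, size_t bytes_per_instance)
{
    if (bytes_per_instance == 0)
        return 0;
    return max_block_bytes / bytes_per_instance;
}

/* How many instanced draw calls (one UBO range each) are needed to cover
 * `instance_count` instances, rounding up. */
size_t batches_needed(size_t instance_count, size_t per_block)
{
    if (per_block == 0)
        return 0;
    return (instance_count + per_block - 1) / per_block;
}
```

With a 16384-byte block and 80 bytes per instance, 204 instances fit per block, so 1000 asteroids would need 5 batches under these assumptions.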
Multiple Render Targets
Multiple Render Targets
Concept
• Render to multiple textures or renderbuffers from a single draw call
  ■ 4 outputs from fragment shader
  ■ Quadruple the channels
• Enables deferred lighting/shading, other effects
• Attachment formats can differ from one another
• 128 bits per pixel
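One way to read the last two bullets together is as a budget check when planning a G-buffer layout. The sketch below assumes the 128 bits per pixel figure is a combined limit across all color attachments, with at most 4 attachments (the Max Color Attachments value in the limits table). The function name and the constants' names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define MRT_MAX_ATTACHMENTS    4    /* Max Color Attachments on A7 */
#define MRT_MAX_BITS_PER_PIXEL 128  /* assumed combined budget, per the slide */

/* Returns true if the given per-attachment sizes (in bits per pixel) fit
 * within both the attachment count and the combined bit budget. */
bool mrt_layout_fits(const unsigned *bits_per_attachment, size_t count)
{
    if (count == 0 || count > MRT_MAX_ATTACHMENTS)
        return false;

    unsigned total = 0;
    for (size_t i = 0; i < count; i++)
        total += bits_per_attachment[i];

    return total <= MRT_MAX_BITS_PER_PIXEL;
}
```

For example, four 32-bit attachments (such as RGBA8) total exactly 128 bits and fit, while two 64-bit attachments (such as RGBA16F) plus one 32-bit attachment total 160 bits and do not.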
Deferred Shading
Using Multiple Render Targets
[Diagram: Geometry Stage (fragment shader), followed by Lighting Stage (fragment shader + lighting)]
Multiple Render Targets
Setup
    // Define 4 color attachments for currently bound framebuffer
    GLenum renderbuffers[] = { GL_COLOR_ATTACHMENT0, GL_COLOR_ATTACHMENT1,
                               GL_COLOR_ATTACHMENT2, GL_COLOR_ATTACHMENT3 };

    // Attach textures as output buffers
    // (glFramebufferTexture2D takes the texture target before the texture name)
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, colorTex, 0);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT1, GL_TEXTURE_2D, normalTex, 0);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT2, GL_TEXTURE_2D, depthTex, 0);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT3, GL_TEXTURE_2D, albedoTex, 0);

    // Tell GL to enable buffers to draw into
    glDrawBuffers(4, renderbuffers);

    // Draw
    glDrawElements(...);
Multiple Render Targets
Fragment shader
    #version 300 es

    layout(location = 0) out lowp vec4 fs_color_output;
    layout(location = 1) out lowp vec4 fs_normal_output;
    layout(location = 2) out highp uint fs_depth_output;
    layout(location = 3) out lowp vec4 fs_albedo_output;

    void main(void) {
        fs_color_output = ...
        fs_normal_output = ...
        fs_depth_output = ...
        fs_albedo_output = ...
    }
Multiple Render Targets
For Deferred Shading
Framebuffer Fetch
Framebuffer Fetch
EXT_shader_framebuffer_fetch
• Provides current destination color in fragment shader
• Syntax
  ■ Built-in variable in #version 100 shaders
      gl_LastFragData[0]
  ■ User-declared in #version 300 es
      layout(location = 0) inout lowp vec4 my_destination_name;
• Useful for
  ■ Programmable blending
  ■ Local post-processing effects
  ■ Fetching non-color framebuffer data
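As an illustration of programmable blending, the sketch below models on the CPU what a fragment shader could compute per pixel once it can read the destination color. It uses a difference blend, which is not one of the fixed-function blend equations (add, subtract, reverse subtract, min, max). This is not shader code: the rgba type and blend_difference function are hypothetical, and in a real shader dst would come from gl_LastFragData[0] or the inout variable.

```c
/* Hypothetical RGBA color, standing in for GLSL's vec4. */
typedef struct { float r, g, b, a; } rgba;

/* Absolute value without pulling in the math library. */
static float absf(float v)
{
    return v < 0.0f ? -v : v;
}

/* Difference blend: |dst - src| per color channel. Destination alpha is
 * kept unchanged, which is a choice made for this sketch. */
rgba blend_difference(rgba dst, rgba src)
{
    rgba out;
    out.r = absf(dst.r - src.r);
    out.g = absf(dst.g - src.g);
    out.b = absf(dst.b - src.b);
    out.a = dst.a;
    return out;
}
```

Blending a destination of (1.0, 0.5, 0.0, 1.0) with a source of (0.25, 0.5, 0.75, 1.0) gives (0.75, 0.0, 0.75, 1.0).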
Framebuffer Fetch
With Multiple Render Targets
• Read and write multiple attachments
  ■ Read with framebuffer fetch
  ■ Write with MRT
• Read from one, write to another
• All from the same shader
Deferred Shading in One Pass
Using Multiple Render Targets and Framebuffer Fetch
[Diagram: a single fragment shader pass with three steps: Output G-buffers, Compute (and Output) Lighting, Clean up and Present]
Three stages
• Multiple Render Targets
  ■ Render G-buffer attachments in one pass
  ■ Formats can vary between attachments
• Framebuffer fetch
  ■ Render deferred lights in the same pass
  ■ Read from all attachments, write to one
  ■ G-buffer becomes per-pixel scratch space
• Framebuffer invalidate
  ■ To avoid logical buffer stores
Deferred Shading on A7
OpenGL ES Tools
Demo: Xcode 5 OpenGL ES Debugger
Apple A7 Processor, OpenGL ES 3.0, OpenGL ES Debugger
More Information
Filip Iliescu Graphics and Media Evangelist [email protected]
Apple Developer Forums http://devforums.apple.com/
Developer Documentation http://developer.apple.com/opengles/
Migrating to OpenGL Core Profile video http://developer.apple.com/opengl/