16
1 Bandwidth-efficient graphics with ARM Mali GPUs June 27th (Friday), 2014 Hessed Choi @ ARM

Bandwidth-efficient graphics with ARM Mali GPUs

  • Upload
    others

  • View
    5

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Bandwidth-efficient graphics with ARM Mali GPUs

1

Bandwidth-efficient graphics with

ARM Mali GPUs

June 27th (Friday), 2014

Hessed Choi @ ARM

Page 2: Bandwidth-efficient graphics with ARM Mali GPUs

2

Memory Bandwidth

Page 3: Bandwidth-efficient graphics with ARM Mali GPUs

3

Vertex load

Varyings

Textures

Framebuffer output

Bandwidth Where does it go?

Page 4: Bandwidth-efficient graphics with ARM Mali GPUs

4

Vertex load

Varyings

Textures

Framebuffer output

Bandwidth Where does it go?

Page 5: Bandwidth-efficient graphics with ARM Mali GPUs

5

Less power = less memory bandwidth Desktop: 170 Watts to >300 Watts… That’s just the GPU!

Console: 80 - 100 Watts (CPU/GPU/WiFi/Network)

Mobile: 3 - 7 Watts (CPU/GPU/Modem/WiFi)

Need smarter solutions

Bandwidth Where are we?

Page 6: Bandwidth-efficient graphics with ARM Mali GPUs

6

Mali Architecture

Page 7: Bandwidth-efficient graphics with ARM Mali GPUs

7

Mali is a tile-based deferred rendering architecture Framebuffer is divided into tiles

Renders tile by tile

16x16 tile size

▪ Color ▪ Depth ▪ Stencil

ARM® Mali™ GPU Rendering

Page 8: Bandwidth-efficient graphics with ARM Mali GPUs

8

Deferred Shading and Extensions Support

Page 9: Bandwidth-efficient graphics with ARM Mali GPUs

9

Popular technique on PC and console games

Deferred Shading

Very memory bandwidth intensive Traditionally not a good fit for mobile

Limitation

Page 10: Bandwidth-efficient graphics with ARM Mali GPUs

10

Fragment shader extensions for OpenGL® ES 2.0 and above

Allows reading of existing framebuffer color, depth and stencil values

Enables: Programmable blending Programmable depth/stencil testing Soft particles Reconstruction of 3D position etc

Extensions (1) Shader Framebuffer Fetch

http://www.khronos.org/registry/gles/extensions/ARM/ARM_shader_framebuffer_fetch.txt http://www.khronos.org/registry/gles/extensions/ARM/ARM_shader_framebuffer_fetch_depth_stencil.txt

Page 11: Bandwidth-efficient graphics with ARM Mali GPUs

11

Fragment shader extension for OpenGL® ES 3.0 and above

On the ARM® Mali™ -T600 series this amounts to 128-bits per pixel Mali-T760 can support even more data per pixel

Enables reading and writing the current pixel’s data that is persistent throughout the

lifetime of the framebuffer

Independent of framebuffer format

Extensions (2) Shader Pixel Local Storage (PLS)

http://www.khronos.org/registry/gles/extensions/EXT/EXT_shader_pixel_local_storage.txt

Page 12: Bandwidth-efficient graphics with ARM Mali GPUs

12

Compute final pixel color based on the Pixel Local Storage data

Output to current framebuffer format

Deferred Shading Resolve

Page 13: Bandwidth-efficient graphics with ARM Mali GPUs

13

Bandwidth Comparison

Deferred Shading

0. 750. 1500. 2250.

Write MB/s

Read MB/s

Total MB/s

Using extensionsMultiple render targets

deferred shading example rendering to 4xRGBA8 1080p@30fps

Page 14: Bandwidth-efficient graphics with ARM Mali GPUs

14

Roadmap

Page 15: Bandwidth-efficient graphics with ARM Mali GPUs

15

Various deferred shading/lighting

Order independent transparency

Deferred virtual texturing

Volume rendering

etc, etc, etc

Future

http://geomerics.com/downloads/SIGGRAPH-2013-SamMartinEtAl-Challenges.pdf

Page 16: Bandwidth-efficient graphics with ARM Mali GPUs

16

Questions? Thank you.