ARM Mali GPU Architecture Sam Martin ARM Game Developer Day - London Graphics Architect, ARM 03/12/2015


ARM Mali GPU Architecture

Sam Martin

ARM Game Developer Day - London

Graphics Architect, ARM

03/12/2015


Agenda

Mali architecture and tiling introduction

Behind the scenes – power limits

Vulkan


Mali GPU Taxonomy In a Nutshell

Mali 4xx series OpenGL ES 2.0

1-8 shader cores, separate fragment and vertex processors

Mali 6xx – 8xx OpenGL ES 3.x

Unified “tri-pipe” shader core

Larger core configurations, up to 16 cores from Mali 760 onwards

AFBC, ASTC, Transaction Elimination, ...

All tile-based GPUs


[Figure: GPU pipeline. Command stream from CPU, then input assembly, vertex shader, rasterizer, pixel shader, and output merger. The diagram groups the work into a command phase, a geometry phase, and a pixel phase.]


Tile-based GPUs

Fragments >> Geometry

Phased structure

1. Buffer all operations into “render passes”

2. Transform + bin all geometry into screen space tiles

3. Fully shade each tile into local memory, then write back

[Figure: pipeline from the CPU command stream through input assembly, vertex shader, rasterizer, pixel shader, and output merger.]
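The binning step in the phased structure can be sketched in a few lines of Python. This is a minimal illustration, not a description of the Mali hardware: the 16-pixel tile size, the bounding-box test, and the name `bin_triangles` are all assumptions made for the example.

```python
import math
from collections import defaultdict

TILE_SIZE = 16  # assumed tile edge in pixels, for illustration only


def bin_triangles(triangles, width, height, tile_size=TILE_SIZE):
    """Return {(tile_x, tile_y): [triangle indices]} using bounding boxes.

    Each triangle is three (x, y) screen-space points. A conservative
    bounding-box test is used, so a triangle may land in a tile it does
    not actually cover.
    """
    tiles_x = (width + tile_size - 1) // tile_size
    tiles_y = (height + tile_size - 1) // tile_size
    bins = defaultdict(list)
    for index, tri in enumerate(triangles):
        xs = [p[0] for p in tri]
        ys = [p[1] for p in tri]
        # Triangles entirely off-screen are not binned at all.
        if max(xs) < 0 or max(ys) < 0 or min(xs) >= width or min(ys) >= height:
            continue
        # Clamp the bounding box to the grid of tiles.
        x0 = max(math.floor(min(xs)) // tile_size, 0)
        x1 = min(math.floor(max(xs)) // tile_size, tiles_x - 1)
        y0 = max(math.floor(min(ys)) // tile_size, 0)
        y1 = min(math.floor(max(ys)) // tile_size, tiles_y - 1)
        for ty in range(y0, y1 + 1):
            for tx in range(x0, x1 + 1):
                bins[(tx, ty)].append(index)
    return dict(bins)


# Two triangles on a 64x64 target: one inside a single tile, one spanning four.
example_triangles = [
    ((1, 1), (10, 1), (1, 10)),
    ((20, 20), (40, 20), (20, 40)),
]
example_bins = bin_triangles(example_triangles, 64, 64)
```

After binning, a tiler would shade each entry of `example_bins` in turn, keeping that tile's colour and depth data in local memory and writing the finished tile to main memory once.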


Mali Architecture

Hardware tiling

Forward Pixel Kill

Reduce overdraw

Framebuffer memory on-chip

4x MSAA for “free”

Advanced on-chip shading

Bandwidth efficiencies

ARM Framebuffer Compression

Transaction elimination

ASTC
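The idea behind transaction elimination, skipping the write-back of tiles that have not changed since the previous frame, can be sketched as follows. The slides do not describe the mechanism, so this assumes a simple per-tile checksum comparison; the use of CRC-32 and the name `write_back_tiles` are hypothetical choices for the example.

```python
import zlib


def write_back_tiles(tiles, previous_signatures):
    """Decide which shaded tiles need writing to main memory.

    tiles: {(tile_x, tile_y): bytes} of shaded colour data for this frame.
    previous_signatures: {(tile_x, tile_y): int} from the previous frame.
    Returns (tiles_to_write, signatures_for_this_frame).
    """
    tiles_to_write = {}
    signatures = {}
    for coord, data in tiles.items():
        signature = zlib.crc32(data)  # assumed signature function
        signatures[coord] = signature
        # Only tiles whose signature changed cost memory bandwidth.
        if previous_signatures.get(coord) != signature:
            tiles_to_write[coord] = data
    return tiles_to_write, signatures


# Frame 1 writes both tiles; frame 2 changes one byte of one tile.
frame1 = {(0, 0): bytes(16), (1, 0): bytes(15) + b"\x01"}
frame2 = {(0, 0): bytes(16), (1, 0): bytes(15) + b"\x02"}
written1, signatures1 = write_back_tiles(frame1, {})
written2, signatures2 = write_back_tiles(frame2, signatures1)
```

In this sketch `written2` contains only the tile at `(1, 0)`, so a static region of the screen generates no write traffic on the second frame.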


Mobile Power Limits

Lifetime constrained by battery

High-end performance constrained by heat

Thermal Design Power/Point (TDP): capacity constrained by ability to dissipate heat

Memory bandwidth particularly expensive. Rule of thumb: 100 mW per GB/s, assume 1 W total

Low-mid end GPUs are constrained by die area. Savings prolong battery life but may not increase performance

Phones: 1-3 Watts

Tablets: 3-5 Watts

Small laptop-like: 10-25 Watts

Regular laptop: 25-50 Watts

Integrated desktop: 40-100 Watts
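The bandwidth rule of thumb can be turned into a quick estimate. The sketch below applies the 100 mW per GB/s figure from the slide to one full-screen framebuffer write per frame; the 1920x1080 resolution, 4 bytes per pixel, and 60 frames per second are assumptions chosen for illustration, and the function names are hypothetical.

```python
MW_PER_GB_PER_S = 100.0  # rule of thumb quoted on the slide


def bandwidth_power_mw(gb_per_s):
    """Estimated power cost in mW of sustaining the given memory traffic."""
    return gb_per_s * MW_PER_GB_PER_S


def framebuffer_traffic_gb_per_s(width, height, bytes_per_pixel, fps, passes=1):
    """Traffic in GB/s (1 GB = 1e9 bytes) for full-screen reads or writes."""
    return width * height * bytes_per_pixel * fps * passes / 1e9


# Assumed workload: one 1080p, 32-bit colour write per frame at 60 fps.
traffic = framebuffer_traffic_gb_per_s(1920, 1080, 4, 60)
power = bandwidth_power_mw(traffic)
print(f"{traffic:.3f} GB/s costs about {power:.1f} mW")
```

Under these assumptions a single full-screen pass costs roughly 0.5 GB/s, or about 50 mW. Each extra full-screen read or write to main memory adds the same amount again, which is why keeping intermediate data in on-chip tile memory matters within a 1-3 W phone budget.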


[Figure: die areas shown to scale. Similarly capable mobile GPUs at 3 mm², 5 mm², 10 mm², and 30 mm², next to the NVIDIA GeForce GTX Titan at 561 mm².]

[Figure builds: the die-area comparison (3 mm², 5 mm², 10 mm², 30 mm², and 561 mm²) repeated with the low-end, mid-range, and high-end mobile GPUs highlighted in turn.]

1-10x range, just within mobile phones

Servicing such a wide range demands scalable GPU designs

GPU feature set cannot indicate performance capability


Thermal Throttling

[Figure: GL Benchmark 2.7 (T-Rex HD), 3 runs. Frequency against time (s) for CPU - big, CPU - LITTLE, and GPU, with the max OPP marked for each. Median filtered chart for clarity.]



Vulkan

Good match for mobile and tiling architectures

Explicit multi-pass render passes

No hidden costs (copies, allocs, shader recompiles, etc.)

Multi-threaded

Low overhead

Gloves-off API

Needs care – look out for future info post-release


[email protected] @palgorithm

Coming up:

Increase texturing efficiency and quality

Daniele Di Donato, “Get the most out of ASTC” – up next!

Advanced use of tiled framebuffers

Marius Bjørge, “Fast Approximate Indirect Lighting on Mobile”, 11am

Compute shaders & tessellation

Hans-Kristian Arntzen, “Real-time GPU-driven Ocean Rendering on Mobile”, 11.30am

Thanks! Questions?


For more information visit the Mali Developer Centre:

http://malideveloper.arm.com

• Revisit this talk in PDF and audio format post event

• Download tools and resources


The trademarks featured in this presentation are registered and/or unregistered trademarks of ARM Limited (or its subsidiaries) in the EU and/or elsewhere. All rights reserved. All other marks featured may be trademarks of their respective owners.

Copyright © 2015 ARM Limited