Venugopala Madumbu, NVIDIA GTC 2017 – 210D S7105 – ADAS/AD CHALLENGES: GPU SCHEDULING & SYNCHRONIZATION

Page 2: ADVANCED DRIVING ASSIST SYSTEMS (ADAS) & AUTONOMOUS DRIVING (AD)

High Compute Workloads Mapped to GPU

Page 3: ADAS/AD

Requirements & Challenges

Real-Time Behavior
• Determinism
• Freedom from Interference
• Priority of Functionalities

Performance
• Maximum Throughput
• Minimal Latency

Multi-Core CPU

GPU/DSP/HWA

Page 4: ADAS/AD WORKLOADS

Challenges Illustrated

Scenario #1 – Standalone Exec
• GL Workload: X msec

Scenario #2 – Standalone Exec
• CUDA Workload: Y msec

Scenario #3 – Concurrent Exec
• GL Workload (X msec standalone) and CUDA Workload (Y msec standalone): > (X+Y) msec

Time Shared GPU Execution

If so, how to
• Achieve determinism
• Achieve freedom from interference
• Prioritize one workload over the other

While also having
• Maximum throughput
• Minimum latency

Page 5: GPU IN TEGRA

High Level Tegra SoC Block Diagram

[Block diagram labels: GPU (Host, Engines, GPU Memory Interface), CPU, Other Clients (ISP, Display, etc.), Memory Controller, DRAM]

CPU submits job/work to GPU

GPU runs asynchronously to CPU

GPU has its own hardware scheduler (Host)

It switches between workloads without CPU involvement

Page 6: GPU SCHEDULING

Concepts

Channel – Independent stream of work on the GPU

Command Push Buffer – Command buffer written by software and read by hardware

Channel Switching – Save/restore GPU state on a channel switch

Semaphores/SyncPoints – Synchronization mechanism for events within the GPU

Time Slice – How long the GPU executes commands of a channel before a channel switch

Run-list – An ordered list of channels that software wants the GPU to execute
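The concepts on this slide can be pictured as a small data model. The sketch below is only an illustration: the class names `Channel` and `RunList`, their fields, and the command strings are hypothetical and do not correspond to any NVIDIA driver interface.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Channel:
    """An independent stream of work on the GPU (illustrative model only)."""
    name: str
    timeslice_ms: float  # how long the GPU runs this channel before switching
    # Command push buffer: written by software, read by hardware.
    pushbuffer: deque = field(default_factory=deque)

    def submit(self, command: str) -> None:
        """Software side: queue a command for the hardware to read later."""
        self.pushbuffer.append(command)

    def has_work(self) -> bool:
        """True while the push buffer still holds unread commands."""
        return bool(self.pushbuffer)


@dataclass
class RunList:
    """An ordered list of channels that software wants the GPU to execute."""
    entries: list

    def order(self) -> list:
        return [channel.name for channel in self.entries]


gl = Channel("GL", timeslice_ms=2.0)
cuda = Channel("CUDA", timeslice_ms=4.0)
cuda.submit("launch_kernel")        # hypothetical command name
runlist = RunList([gl, cuda, gl])   # the same channel can be listed more than once
print(runlist.order())
```

In this model the time slice belongs to the channel and the execution order belongs to the run-list, which are the two knobs the later slides tune.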

Page 7: GPU SCHEDULING

Timesharing by Channel Switching

Channel switching occurs when any ONE of the following happens:
• Time slice expires
• Engine runs out of work (no more commands)
• Blocked on a semaphore

Channel Switch time = Drain Time + Save/Restore time

Preemption can reduce Channel Switch times drastically

[Figure: Timesliced Round-Robin. GPU occupancy over time, cycling through App1, App2, App3, App4, ...]
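A toy simulation can make the cost of timesliced round-robin concrete. The sketch below assumes every channel gets the same time slice and that each channel switch costs a fixed `switch_ms` (drain time plus save/restore time folded into one number). The function name and all figures are hypothetical, and real hardware behaviour is more involved.

```python
def simulate_round_robin(work_ms, timeslice_ms, switch_ms):
    """Return {channel: completion time in ms} under timesliced round-robin.

    work_ms      -- dict of channel name -> total GPU work in ms
    timeslice_ms -- time slice given to every channel
    switch_ms    -- cost of one channel switch (drain + save/restore)
    """
    remaining = dict(work_ms)
    order = list(work_ms)
    done = {}
    now = 0.0
    current = None
    while remaining:
        for name in order:
            if name not in remaining:
                continue  # this channel already ran out of work
            if current is not None and current != name:
                now += switch_ms  # pay for switching to another channel
            current = name
            ran = min(timeslice_ms, remaining[name])
            now += ran
            remaining[name] -= ran
            if remaining[name] <= 0:
                del remaining[name]
                done[name] = now
    return done


# X = 6 ms of GL work and Y = 4 ms of CUDA work, 2 ms slices, 0.5 ms per switch.
done = simulate_round_robin({"GL": 6, "CUDA": 4}, timeslice_ms=2, switch_ms=0.5)
print(done)  # {'CUDA': 9.5, 'GL': 12.0}
```

With these made-up numbers the last workload finishes at 12 ms, which is more than X + Y = 10 ms. The extra 2 ms is the four channel switches, matching the "> (X+Y) msec" observation for concurrent execution.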

Page 8: GPU SCHEDULING

Preemption

Page 9: GPU SCHEDULING

Channel Switching with Time Slice Scenarios

1. Channel finishes before time slice expires
• Context switch to next channel

2. Channel preemption
• Stop all commands in pipeline
• Wait for engines to idle
• Higher context switch time

3. Channel reset
• Engine could not idle and context could not be saved before the channel switch timeout
• Callback to notify kernel of channel reset event

[Figure: one timeline per scenario for Channel 1, marking the time slice, the channel switch timeout and, in scenario 3, the channel reset]
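The three scenarios can be read as a simple decision rule. The sketch below is one possible reading, not the hardware's actual logic: `classify_switch`, `SwitchOutcome` and the `idle_ms` input (the time the engines need to drain and save context after the time slice expires) are hypothetical names introduced for illustration.

```python
from enum import Enum


class SwitchOutcome(Enum):
    FINISHED_EARLY = 1  # scenario 1: work done within the time slice
    PREEMPTED = 2       # scenario 2: time slice expired, engines idled in time
    RESET = 3           # scenario 3: engines did not idle before the timeout


def classify_switch(work_ms, timeslice_ms, idle_ms, switch_timeout_ms):
    """Classify how a channel leaves the GPU.

    work_ms           -- GPU work the channel has queued
    timeslice_ms      -- time slice assigned to the channel
    idle_ms           -- time needed to idle the engines and save context
    switch_timeout_ms -- channel switch timeout
    """
    if work_ms <= timeslice_ms:
        return SwitchOutcome.FINISHED_EARLY
    if idle_ms <= switch_timeout_ms:
        return SwitchOutcome.PREEMPTED
    return SwitchOutcome.RESET


print(classify_switch(2, 3, 0.5, 1))  # SwitchOutcome.FINISHED_EARLY
print(classify_switch(5, 3, 0.5, 1))  # SwitchOutcome.PREEMPTED
print(classify_switch(5, 3, 2.0, 1))  # SwitchOutcome.RESET
```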

Page 10: CHALLENGE REVISITED

How can we achieve both?

Real-Time behavior:
• Determinism
• Freedom from Interference
• Priority of Functionalities

Performance:
• Maximum Throughput
• Minimal Latency

Page 11: GPU SYNCHRONIZATION & SCHEDULING

Software Control

1. User Driver Level (GPU Synchronization Approach)
• Syncpoints/Semaphores for synchronization
• Through EGLStreams, EGLSync, etc.

2. Kernel Driver Level (GPU Priority Scheduling Approach)
• Run-List Engineering
• How long a channel runs
• Order of channel execution

Page 12: GPU SYNCHRONIZATION APPROACH

No Synchronization Case

[Timeline figure, 0 to 35 msec. CPU row: CPU Tasks with kernel launches. GPU row: Priority GPU Task and GPU Tasks. Annotation: latency due to concurrent execution. Legend: Kernel launch, GPU Semaphore]

Page 13: GPU SYNCHRONIZATION APPROACH

Synchronization on CPU: Not good for GPU

[Timeline figure, 0 to 35 msec. CPU row: CPU Tasks with kernel launches. GPU row: Priority GPU Task and GPU Tasks. Legend: Kernel launch, GPU Semaphore]

Page 14: GPU SYNCHRONIZATION APPROACH

Synchronization on GPU: No Context Switches

[Timeline figure, 0 to 35 msec. CPU row: CPU Tasks with kernel launches. GPU row: Priority GPU Task and GPU Tasks. Annotation: Delayed Start. Legend: Kernel launch, GPU Semaphore]

• Determinism
• Freedom from Interference
• Priority of Functionalities
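The three synchronization cases can be compared with a deliberately simple timing model. This is a sketch under stated assumptions: both GPU tasks are ready at t = 0, channel switches are free, and in the CPU case the GPU sits idle for a hypothetical `cpu_latency_ms` while the CPU notices completion and launches the next task. Function names and numbers are illustrative only.

```python
def no_sync(p_ms, n_ms, slice_ms):
    """Both tasks share the GPU in equal time slices.
    Returns (priority task completion, other task completion) in ms."""
    remaining = {"P": p_ms, "N": n_ms}
    finish = {}
    now = 0.0
    while remaining:
        for name in ("P", "N"):
            if name in remaining:
                ran = min(slice_ms, remaining[name])
                now += ran
                remaining[name] -= ran
                if remaining[name] <= 0:
                    finish[name] = now
                    del remaining[name]
    return finish["P"], finish["N"]


def cpu_sync(p_ms, n_ms, cpu_latency_ms):
    """The CPU waits for the priority task, then launches the other task.
    The GPU is idle for cpu_latency_ms between the two tasks."""
    p_done = float(p_ms)
    return p_done, p_done + cpu_latency_ms + n_ms


def gpu_semaphore_sync(p_ms, n_ms):
    """The other task waits on a GPU semaphore released when the priority
    task completes, so the priority task runs uninterrupted."""
    p_done = float(p_ms)
    return p_done, p_done + n_ms  # delayed start for the other task


print(no_sync(10, 10, slice_ms=2))           # (18.0, 20.0)
print(cpu_sync(10, 10, cpu_latency_ms=1.5))  # (10.0, 21.5)
print(gpu_semaphore_sync(10, 10))            # (10.0, 20.0)
```

In this model the priority task finishes at its standalone time only in the two synchronized cases, and only the GPU semaphore variant avoids the idle gap, at the price of a delayed start for the other task.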

Page 15: GPU PRIORITY SCHEDULING APPROACH

Hypothetical Example

TASK   PRIORITY          FPS   WORST CASE EXECUTION TIME (WCET)
H1     High              60    9 ms
M1     Medium            30    4 ms
M2     Medium            30    4 ms
L1     Low/Best Effort   30    10 ms

Page 16: GPU PRIORITY SCHEDULING APPROACH

Engineered Run-list and Time Slice Ensuring FPS and Latency

Run-List
• H1 (Max Exec Time = 9 ms): Time slice = 9 ms
• M1 (Max Exec Time = 4 ms): Time slice = 3 ms
• M2 (Max Exec Time = 4 ms): Time slice = 3 ms
• L1 (Max Exec Time = 10 ms): Time slice = 1 ms

[Figure: work on GPU over time. Run-list entries as extracted: H1, M1, M2, H1, M1, L1, M2, ... Annotation: ensured not > 16 ms for 60 fps operation]
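The arithmetic behind an engineered run-list can be checked with a few lines of code. The sketch below assumes the run-list order H1, M1, M2, H1, M1, L1, M2, which is a reconstruction from the flattened figure and may not match the original slide. It also assumes every entry uses its full time slice and that channel switches cost nothing.

```python
# Assumed run-list order (reconstructed, not confirmed by the slide).
RUNLIST = ["H1", "M1", "M2", "H1", "M1", "L1", "M2"]
# Time slices in ms, taken from the slide.
TIMESLICE_MS = {"H1": 9, "M1": 3, "M2": 3, "L1": 1}


def start_times(runlist, timeslice_ms, cycles=2):
    """Start time of every run-list entry over several passes."""
    now, starts = 0, []
    for _ in range(cycles):
        for name in runlist:
            starts.append((name, now))
            now += timeslice_ms[name]
    return starts


def max_gap(starts, task):
    """Longest interval between consecutive slice starts for one task."""
    times = [t for name, t in starts if name == task]
    return max(b - a for a, b in zip(times, times[1:]))


starts = start_times(RUNLIST, TIMESLICE_MS)
cycle_ms = sum(TIMESLICE_MS[name] for name in RUNLIST)
m1_per_cycle_ms = RUNLIST.count("M1") * TIMESLICE_MS["M1"]

print(cycle_ms)               # 31
print(max_gap(starts, "H1"))  # 16
print(m1_per_cycle_ms)        # 6
```

Under these assumptions H1 is scheduled at most every 16 ms, which is within the 16.7 ms frame period of 60 fps and consistent with the slide's annotation. M1 receives 6 ms of GPU time per 31 ms pass, enough for its 4 ms WCET within a 33.3 ms frame at 30 fps. L1 receives only 1 ms per pass, so it runs as best effort.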

Page 17: GPU PRIORITY SCHEDULING APPROACH

Reduce Latency for GPU Work Completion

Ensure the time slice is long enough to complete work

Ensure work is continually submitted and also well ahead in time
• To avoid
  • GPU idle time
  • Unnecessary context switches

Page 18: GPU SCHEDULING

Best Practices to Keep GPU Busy

Submit work in advance
• So the GPU has some work to execute at any point in time

Try to reduce/eliminate work dependencies

Have a contingency plan for work overload
• If feedback shows the work is over budget, submit work a few frames ahead and spread it

Plan for the worst-case scenario
• Deal with the GPU reset case, especially for the low-priority cases
• GL Robustness Extensions
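The benefit of submitting work in advance can be sketched with a two-stage pipeline model. The model assumes a fixed CPU preparation time and a fixed GPU execution time per frame; the function names and numbers are hypothetical and ignore scheduling effects from other channels.

```python
def frame_time_ms(cpu_ms, gpu_ms, submit_ahead):
    """Steady-state time per frame.

    cpu_ms       -- CPU time to build and submit one frame's commands
    gpu_ms       -- GPU time to execute one frame's commands
    submit_ahead -- True if the CPU prepares frame k+1 while the GPU runs frame k
    """
    if submit_ahead:
        return max(cpu_ms, gpu_ms)  # CPU and GPU work overlap
    return cpu_ms + gpu_ms          # each side waits for the other in turn


def gpu_idle_ms(cpu_ms, gpu_ms, submit_ahead):
    """GPU idle time per frame in steady state."""
    return frame_time_ms(cpu_ms, gpu_ms, submit_ahead) - gpu_ms


print(gpu_idle_ms(4, 10, submit_ahead=False))  # 4
print(gpu_idle_ms(4, 10, submit_ahead=True))   # 0
```

With 4 ms of CPU preparation and 10 ms of GPU execution, submitting one frame ahead removes the 4 ms of GPU idle time per frame in this model.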

Page 19: CONCLUSION

GPU Synchronization & Scheduling Approaches

Real-Time behavior:
• Determinism
• Freedom from Interference
• Priority of Functionalities

Performance:
• Maximum Throughput
• Minimal Latency

Page 20: ACKNOWLEDGEMENTS

• Scott Whitman, NVIDIA
• Vladislav Buzov, NVIDIA
• Amit Rao, NVIDIA
• Yogesh Kini, NVIDIA

GTC Instructor-led Lab:

L7105 – EGLSTREAMS: INTEROPERABILITY OF CAMERA, CUDA AND OPENGL

11TH MAY 2017 9:30-11:30AM LL21D

Page 21: Q&A

Page 22: THANK YOU