Abhijit Patait, 5/8/2017 NVIDIA VIDEO TECHNOLOGIES

NVIDIA VIDEO TECHNOLOGIES - GPU Technology …on-demand.gputechconf.com/gtc/2017/presentation/s7111...NVIDIA Video Technologies New SDK Release Major Focus Areas Video SDK Features



AGENDA

NVIDIA Video Technologies

New SDK Release

Major Focus Areas

Video SDK Features

Software Flow

FFmpeg Performance and Benchmarking Tips

Benchmarks


NVIDIA VIDEO TECHNOLOGIES


VIDEO CODEC SDK

A comprehensive set of APIs for GPU-accelerated Video Encode and Decode

NVIDIA Video Codec SDK technology is used to stream video with NVIDIA ShadowPlay running on NVIDIA GPUs

The SDK consists of two hardware acceleration interfaces:

NVENCODE API for video encode acceleration

NVDECODE API for video decode acceleration (formerly called NVCUVID API)

Independent of CUDA/3D cores on GPU


NVIDIA VIDEO TECHNOLOGIES

SOFTWARE

FFMPEG & LIBAV: Easy access to NVIDIA GPU hardware acceleration

VIDEO CODEC SDK: A comprehensive set of APIs for GPU-accelerated Video Encode and Decode for Windows and Linux; CUDA, DirectX, OpenGL interoperability

NVIDIA DRIVER

HARDWARE

NVENC: Independent Hardware Encoder Function

NVDEC: Independent Hardware Decoder Function


NVIDIA VIDEO TECHNOLOGIES

Block diagram: CPU, NVDEC, NVENC, CUDA Cores, Buffer

Encode HW* (NVENC)
• Formats: H.264, H.265, Lossless
• Bit depth: 8 bit, 10 bit
• Color**: YUV 4:4:4, YUV 4:2:0
• Resolution: Up to 8K***

Decode HW* (NVDEC)
• Formats: MPEG-2, VC1, VP8, VP9, H.264, H.265, Lossless
• Bit depth: 8 bit, 10 bit
• Color**: YUV 4:2:0
• Resolution: Up to 8K***

* See support diagram for previous NVIDIA HW generations

** 4:2:2 is not natively supported on HW

*** Support is codec dependent


VIDEO SDK EVOLUTION

Video SDK 8.0

2014, SDK 4.0: Maxwell 1; H.264 4:4:4, lossless

2015, SDK 5.0: Maxwell 2; HEVC; Perf++

2015, SDK 6.0: ARGB; Quality+; Dec+Enc; ME-only

2016, SDK 7.x: Pascal; 10-bit encode; FFmpeg; ME-only for VR; Quality++

2017, SDK 8.0: 10-bit transcode; 10/12-bit decode; OpenGL; Dec. optimizations; WP, AQ, Enc. Quality


MAJOR FOCUS AREAS


VIDEO TRANSCODING

➢ Content variety

➢ Codecs, resolutions, quality, bitrate

➢ Live, VOD, ultra-low-latency, broadcast, archives

➢ Pre-encoded or encoded-on-demand

➢ Performance/Watt

Performance/Watt


GAME/APP STREAMING

Stream
➢ Interactive, single frame latency
➢ Capture: NVFBC, Encode: NVENC, Decode: NVDEC
➢ 4K, HDR

Record, Broadcast
➢ Quality

Ultra-low-latency


GPU VIRTUALIZATION

➢ Capture + encode

➢ Low-latency

➢ H.264, HEVC

➢ 4:2:0, 4:4:4, lossless

➢ Multiple-displays

Quality & reliability


MOTION-ESTIMATION ONLY MODE

➢ Video frame interpolation
➢ Camera stitching (mono to stereo)
➢ Camera stabilization
➢ Computer vision

Accuracy

Frames shown: #N, #(N+1), #(N+1.5), #(N+2)

Frame #(N+1.5) is interpolated based on motion vectors between frame #(N+1) and frame #(N+2)
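
The interpolation idea can be sketched in a few lines. This is only an illustration under stated assumptions: the block positions and motion vectors are made-up inputs, the function name `interpolate_block_positions` is hypothetical, and no NVENC API is called. It shows where each block lands at a fractional time between two frames when it is moved along its motion vector.

```python
def interpolate_block_positions(blocks, motion_vectors, t=0.5):
    """Return block positions at fractional time t between two frames.

    blocks: (x, y) top-left positions of blocks in the earlier frame.
    motion_vectors: (dx, dy) displacement of each block from the earlier
        frame to the later frame (assumed to come from motion estimation).
    t: 0.0 is the earlier frame, 1.0 the later frame, 0.5 the midpoint.
    """
    if len(blocks) != len(motion_vectors):
        raise ValueError("need exactly one motion vector per block")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie between 0 and 1")
    # Move each block a fraction t of the way along its motion vector.
    return [(x + t * dx, y + t * dy)
            for (x, y), (dx, dy) in zip(blocks, motion_vectors)]


# Hypothetical example: two 16x16 blocks and their motion vectors.
positions = interpolate_block_positions([(0, 0), (16, 0)], [(4, -2), (0, 8)])
print(positions)
```

A real interpolator would also have to fill holes and resolve overlapping blocks; the sketch covers only the position arithmetic.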


VIDEO SDK FEATURES


ENCODE FEATURES (1/2)

H.264 | HEVC | Use-case
Base, Main, High | Main, Main10 | Baseline standards
8-bit | 8-bit, 10-bit | 10-bit for HDR
B-frames | No B-frames | Higher compression & quality
Up to 4096 × 4096 | Up to 8192 × 8192 | High-res
H.264 and HEVC: YUV 4:2:0, 4:4:4 | Subsampled or full-res chroma (e.g. wireframes)
H.264 and HEVC: Lossless | High-quality archiving
H.264 and HEVC: Error resiliency (Intra refresh, LTR, ref-pic invalidation) | Handle streaming bit errors


ENCODE FEATURES (2/2)

H.264 and HEVC | Use-case
Rate control modes: 1-pass, 2-pass | Quality vs performance
Look-ahead | Efficient bit distribution across GOP; higher quality
Adaptive quantization, ∆QP | Finer quality control
Weighted prediction (SDK 8.0) | Fade-in/fade-out, explosion
RGB inputs | Direct NVFBC interoperability
ME-only mode, MV-hints (SDK 8.0) | Motion stabilization, Optical flow for VR stereo stitching, Frame interpolation
1-3 NVENCs per chip | High throughput
CUDA, DX, OGL (Linux) (SDK 8.0) | Easy integration


DECODE FEATURES

Feature | Use-case
MPEG2, VC-1, MPEG-4, H.264, HEVC, VP8, VP9 | Baseline standards
8-bit (all codecs), 10/12 bit (HEVC, VP9) (SDK 8.0) | HDR decoding
Up to 8192 × 8192 for HEVC, 4096 × 4096 for H.264 | High-res
Error resiliency and concealment | Internet streaming


VIDEO SDK – CONTENTS (1/2)

➢ Header, documentation, sample applications

➢ Binaries (.dll, .so) in NVIDIA display driver

➢ Unified API for Windows & Linux

➢ NVIDIA developer zone

➢ Encode limitations

➢ Unconstrained: Tesla, GRID, Quadro ≥ X2000 (X = K, M, P)

➢ 2 sessions/system: GeForce, Quadro < X2000

➢ No decode limitations


VIDEO SDK – CONTENTS (2/2)

Sample Applications

➢ Decode: DX9, DX11, CUDA, OpenGL
➢ Encode: Basic functionality, features (NvEncoder)
➢ Encode: Performance (NvEncoderPerf)
➢ Encode: CUDA interop, D3D interop, OGL interop
➢ Encode: Low-latency (NvEncoderLowLatency)
➢ Transcode (NvTranscoder)
➢ Coming soon: Reusable classes


FFMPEG/LIBAV

➢ Major SW focus area for past 6 months

➢ Feature parity with Video SDK 7.1, SDK 8.0 post GTC

➢ End-to-end FFmpeg transcoding @ best possible quality & perf


SOFTWARE FLOW


ENCODE APP FLOW

Diagram, top to bottom:

Client application: Initialize, Configure, Encode

NVENC API

NVENC Driver, alongside CUDA, DirectX, and OpenGL (OpenGL-CUDA interop, NVENC-CUDA interop): Configure HW

NVENC firmware + hardware: HW Encode

Output: Encoded bitstream


ENCODE APP FLOW

APIs defined in nvEncodeAPI.h

Open encode session
  Device type: CUDA, DirectX, OpenGL
  API functions: NvEncOpenEncodeSessionEx

Query capabilities
  Codec, presets, features
  API functions: NvEncGetEncodeCaps, NvEncGetInputFormats, NvEncGetEncodePresetGUIDs

Initialize encoder
  W/H, framerate, preset, RC, codec-specific params
  API functions: NvEncInitializeEncoder
  Structures: NV_ENC_INITIALIZE_PARAMS, NV_ENC_CONFIG_H264/HEVC, NV_ENC_RC_PARAMS

Allocate buffers
  Internal/external: DIRECTX, CUDADEVICEPTR, OPENGL_TEX
  API functions: NvEncRegisterResource
  Structures: NV_ENC_REGISTER_RESOURCE

Encode picture
  Picture-level config parameters; Synchronous (Win/Linux), Async (Win)
  API functions: NvEncEncodePicture
  Structures: NV_ENC_PIC_PARAMS

Retrieve bitstream
  API functions: NvEncLockBitstream, NvEncUnlockBitstream

Clean-up
  Buffers, session, device
  API functions: NvEncUnregisterResource


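DECODE APP FLOW

Diagram components: Source, Demux, Parser, Client application, NV DECODE API, NVDEC Driver, NVDEC

Bitstream: passes from the Source through Demux to the Parser; the Parser reaches the Client application through Callbacks

Video frames out:
• YUV
• RGB
• DX
• CUDA

Legend: Data flow; Decode API calls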


DECODE APP FLOW

APIs defined in dynlink_nvcuvid.h, dynlink_cuviddec.h

Query capabilities
  Codecs, resolutions supported
  API functions: cuvidGetDecoderCaps()
  Structures: CUVIDDECODECAPS

Create decoder
  W/H, scaling, bit-depth
  API functions: cuvidCreateDecoder()
  Structures: CUVIDDECODECREATEINFO

Decode picture
  Picture parameters from bitstream parser
  API functions: cuvidDecodePicture()
  Structures: CUVIDPICPARAMS

Post-processing
  Scaling, CSC, etc.
  CUDA kernels

Clean-up
  API functions: cuvidDestroyDecoder()


FFMPEG APP FLOW

➢ Chain of filters

➢ -hwaccel cuvid: Use end-to-end NVIDIA hardware acceleration

➢ h264_cuvid: Use NVCUVID/NVDECODE

➢ h264_nvenc: Use NVENCODE

➢ scale_npp: high-perf CUDA scaling

ffmpeg -y -vsync 0 -hwaccel cuvid -c:v h264_cuvid -i input.mp4 -c:a copy -vf scale_npp=1280:720 -c:v h264_nvenc -b:v 5M output.mp4

Input → Decode (h264_cuvid) → Post-processing: Scale (scale_npp=x:y) → Encode (h264_nvenc) → Output
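
For scripting, the same filter chain can be assembled as an argument list. The sketch below is a convenience wrapper, not part of FFmpeg or the Video Codec SDK: the function name `build_transcode_command` is hypothetical, and it uses only the flags shown on this slide. It prints the command rather than running it, since running it needs an FFmpeg build with cuvid, NPP, and NVENC support.

```python
import shlex


def build_transcode_command(src, dst, width, height, bitrate):
    """Assemble the GPU decode -> scale -> encode command as an argv list."""
    return [
        "ffmpeg", "-y", "-vsync", "0",
        "-hwaccel", "cuvid",                 # keep decoded frames on the GPU
        "-c:v", "h264_cuvid", "-i", src,     # decode on NVDEC
        "-c:a", "copy",                      # pass audio through untouched
        "-vf", f"scale_npp={width}:{height}",  # scale with CUDA (NPP)
        "-c:v", "h264_nvenc", "-b:v", bitrate,  # encode on NVENC
        dst,
    ]


cmd = build_transcode_command("input.mp4", "output.mp4", 1280, 720, "5M")
command_line = shlex.join(cmd)
print(command_line)
```

Passing `cmd` to `subprocess.run` would execute it; an argument list avoids shell quoting problems with unusual file names.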


HARDWARE ACCELERATED TRANSCODE USING FFMPEG


PERFORMANCE CONSIDERATIONS - FFMPEG

➢ Minimize memory (PCIe) transfers

➢ Saturate on-chip encoder/decoder

➢ Efficient M:N command line

➢ Minimize I/O

➢ Encode settings

➢ GPU Clocks


SW TRANSCODE

ffmpeg -c:v h264 -i input.mp4 -c:a copy -c:v h264 -b:v 5M output.mp4

Diagram (all buffers in System Memory): Bitstream → SW Decode → YUV → SW Encode → Bitstream

32 fps*

* 1:2 transcode, fps per session; 4 GHz Intel i7-6700K


SW TRANSCODE + SCALE

ffmpeg -c:v h264 -i input.mp4 -vf scale=1280:720 -c:a copy -c:v h264 -b:v 5M output.mp4

Diagram (all buffers in System Memory): Bitstream → SW Decode → YUV → Preprocess (e.g. scaling) → YUV → SW Encode → Bitstream

29 fps*

* 1:2 transcode, fps per session; 4 GHz Intel i7-6700K


GPU UNOPTIMIZED TRANSCODE

ffmpeg -y -vsync 0 -c:v h264_cuvid -i input.mp4 -c:a copy -c:v h264_nvenc -b:v 5M output.mp4

Diagram: NVDEC Decode and NVENC Encode run on the GPU. The input Bitstream, the decoded YUV, and the output Bitstream sit in System Memory, so YUV frames take two PCIe transfers between GPU Memory and System Memory.

288 fps*

* 1:2 transcode, fps per session; GP104 GPU


GPU UNOPTIMIZED TRANSCODE + CPU SCALE

ffmpeg -y -vsync 0 -c:v h264_cuvid -i input.mp4 -c:a copy -vf scale=1280:720 -c:v h264_nvenc -b:v 5M output.mp4

Diagram: NVDEC Decode and NVENC Encode run on the GPU. Preprocess (e.g. scaling) runs on YUV frames in System Memory, so YUV frames take two PCIe transfers between GPU Memory and System Memory.

76 fps*

* 1:2 transcode, fps per session; GP104 GPU


HIGH-PERF GPU OPTIMIZED TRANSCODE

ffmpeg -y -vsync 0 -hwaccel cuvid -c:v h264_cuvid -i input.mp4 -c:a copy -vf scale_npp=1280:720 -c:v h264_nvenc -b:v 5M output.mp4

Diagram: Bitstream (System Memory) → NVDEC Decode → YUV → Preprocess (scaling in CUDA) → YUV → NVENC Encode → Bitstream (System Memory). All YUV frames stay in GPU Memory.

472 fps*

* 1:2 transcode, fps per session; GP104 GPU
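
The fps figures on the five transcode slides can be compared directly. The short calculation below uses only the numbers quoted on those slides; the dictionary keys and the helper name `speedup` are labels chosen for this sketch. Note that the software figures come from a 4 GHz Intel i7-6700K and the GPU figures from a GP104, so the ratios describe those two test setups rather than a general rule.

```python
# fps per session for a 1:2 transcode, as quoted on the slides.
fps_per_session = {
    "SW transcode": 32,
    "SW transcode + scale": 29,
    "GPU unoptimized transcode": 288,
    "GPU unoptimized transcode + CPU scale": 76,
    "High-perf GPU optimized transcode": 472,
}


def speedup(fast, slow):
    """Ratio of two quoted throughput figures."""
    return fps_per_session[fast] / fps_per_session[slow]


# Scaling pipelines: all-GPU path versus the two paths that scale on the CPU.
optimized_vs_sw_scale = speedup("High-perf GPU optimized transcode",
                                "SW transcode + scale")
optimized_vs_cpu_scale = speedup("High-perf GPU optimized transcode",
                                 "GPU unoptimized transcode + CPU scale")
# Non-scaling pipelines: GPU decode/encode versus software.
gpu_vs_sw = speedup("GPU unoptimized transcode", "SW transcode")

print(f"optimized vs SW + scale:        {optimized_vs_sw_scale:.1f}x")
print(f"optimized vs GPU + CPU scale:   {optimized_vs_cpu_scale:.1f}x")
print(f"GPU unoptimized vs SW (no scale): {gpu_vs_sw:.1f}x")
```

Moving the scale step from the CPU to CUDA is what removes the PCIe round trip, which is where the gap between 76 fps and 472 fps comes from.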


PERFORMANCE CONSIDERATIONS

➢ Pipelining

➢ Input/output buffers

➢ Tools: nvidia-smi, Microsoft GPUView

Saturating encoder/decoder


ANALYZING PERFORMANCE BOTTLENECKS

Microsoft GPUView (Windows only)


ANALYZING PERFORMANCE BOTTLENECKS

nvidia-smi (Windows & Linux)


PARALLEL TRANSCODES (1:N)

Single command line

ffmpeg -y -vsync 0 -hwaccel cuvid -c:v h264_cuvid -i input.mp4 \
  -vf scale_npp=1920:1080 -c:a copy -c:v h264_nvenc -b:v 8M output_1080p.mp4 \
  -vf scale_npp=1280:720 -c:a copy -c:v h264_nvenc -b:v 5M output_720p.mp4 \
  -vf scale_npp=640:480 -c:a copy -c:v h264_nvenc -b:v 3M output_480p.mp4 \
  -vf scale_npp=320:240 -c:a copy -c:v h264_nvenc -b:v 2M output_240p.mp4 \
  -vf scale_npp=160:128 -c:a copy -c:v h264_nvenc -b:v 1M output_128p.mp4
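
A long 1:N command line is easier to maintain when it is generated from a table of output renditions. The sketch below assumes the same flags and file names as the slide; `LADDER` and `build_one_to_n_command` are hypothetical names introduced here. It decodes the input once and appends one scale-and-encode group per output.

```python
import shlex

# (width, height, video bitrate, output file), copied from the slide.
LADDER = [
    (1920, 1080, "8M", "output_1080p.mp4"),
    (1280, 720, "5M", "output_720p.mp4"),
    (640, 480, "3M", "output_480p.mp4"),
    (320, 240, "2M", "output_240p.mp4"),
    (160, 128, "1M", "output_128p.mp4"),
]


def build_one_to_n_command(src, ladder):
    """One ffmpeg process: a single decode feeding every output rendition."""
    cmd = ["ffmpeg", "-y", "-vsync", "0",
           "-hwaccel", "cuvid", "-c:v", "h264_cuvid", "-i", src]
    for width, height, bitrate, dst in ladder:
        # Each output gets its own scale filter, audio copy, and NVENC encode.
        cmd += ["-vf", f"scale_npp={width}:{height}",
                "-c:a", "copy",
                "-c:v", "h264_nvenc", "-b:v", bitrate,
                dst]
    return cmd


cmd = build_one_to_n_command("input.mp4", LADDER)
print(shlex.join(cmd))
```

Adding or removing a rendition then means editing one row of `LADDER` instead of a multi-line shell command.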


PARALLEL TRANSCODES (1:N)

Multiple command lines

ffmpeg -y -vsync 0 -hwaccel cuvid -c:v h264_cuvid -i input.mp4 -vf scale_npp=1920:1080 -c:a copy -c:v h264_nvenc -b:v 5M output1.mp4

ffmpeg -y -vsync 0 -hwaccel cuvid -c:v h264_cuvid -i input.mp4 -vf scale_npp=1280:720 -c:a copy -c:v h264_nvenc -b:v 5M output2.mp4

ffmpeg -y -vsync 0 -hwaccel cuvid -c:v h264_cuvid -i input.mp4 -vf scale_npp=640:480 -c:a copy -c:v h264_nvenc -b:v 5M output3.mp4

ffmpeg -y -vsync 0 -hwaccel cuvid -c:v h264_cuvid -i input.mp4 -vf scale_npp=320:240 -c:a copy -c:v h264_nvenc -b:v 5M output4.mp4

ffmpeg -y -vsync 0 -hwaccel cuvid -c:v h264_cuvid -i input.mp4 -vf scale_npp=160:128 -c:a copy -c:v h264_nvenc -b:v 5M output5.mp4
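
Running one process per output is straightforward to script. The sketch below is one possible launcher, with hypothetical names (`RENDITIONS`, `build_commands`, `launch_all`) and the flags taken from the commands on this slide. The launch call is left commented out because it needs an FFmpeg build with cuvid, NPP, and NVENC support.

```python
import subprocess

# (width, height, output file), copied from the slide.
RENDITIONS = [
    (1920, 1080, "output1.mp4"),
    (1280, 720, "output2.mp4"),
    (640, 480, "output3.mp4"),
    (320, 240, "output4.mp4"),
    (160, 128, "output5.mp4"),
]


def build_commands(src, renditions, bitrate="5M"):
    """One independent ffmpeg command (own decode + encode) per rendition."""
    return [
        ["ffmpeg", "-y", "-vsync", "0",
         "-hwaccel", "cuvid", "-c:v", "h264_cuvid", "-i", src,
         "-vf", f"scale_npp={width}:{height}",
         "-c:a", "copy", "-c:v", "h264_nvenc", "-b:v", bitrate, dst]
        for width, height, dst in renditions
    ]


def launch_all(commands):
    """Start every command at once, then wait and collect the exit codes."""
    processes = [subprocess.Popen(cmd) for cmd in commands]
    return [process.wait() for process in processes]


commands = build_commands("input.mp4", RENDITIONS)
# exit_codes = launch_all(commands)  # run on a machine with a suitable build
```

Every process here opens and decodes `input.mp4` on its own, so the input is read and decoded five times.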


PARALLEL TRANSCODES (1:N)

Single command line

PROS
➢ Low init time per transcode (amortized)
➢ Minimize memory transfers
➢ Leverage high encoder perf
➢ Low memory overhead

CONS
➢ Complex command line
➢ 1:N use-case only
➢ Unsuitable for 1:1 VOD
➢ Typically encoder-limited


PARALLEL TRANSCODES (1:N)

Multiple command lines

PROS
➢ Simple command line
➢ Easy scripting
➢ Use-case: 1:1 VOD
➢ Process-level scheduling optimizations

CONS
➢ High init time per transcode
➢ High memory overhead
➢ Typically decoder-limited
➢ Multiple disk I/O for input


PARALLEL TRANSCODES (M:N)

Hybrid approach

Most flexible approach

Balance memory utilization/complexity/perf

Maximum utilization of encode/decode capacity


ENCODE SETTINGS

Highest Quality
  Use-case: Transcoding, Archiving, Broadcast streaming (w/ latency), surveillance
  NVENC API preset to use: High quality (HQ) presets
  Latency (set by the application via VBV buffer size): Depends on what application sets; Typically > 8-10 frames
  PSNR delta (0 = High quality)*: 0 dB (reference)
  Advanced features typically used: Look-ahead, B-frames (H.264 only), adaptive B-frames (H.264 only), AQ (Adaptive Quantization)
  Motion search and mode: High, 2-pass
  Modes: All high quality modes
  Entropy coding: CABAC (H.264)

Minimum Delay
  Use-case: Game & app streaming, surveillance
  NVENC API preset to use: Low-latency (Low delay) presets
  Latency (set by the application via VBV buffer size): Depends on what application sets; Typically 1 frame
  PSNR delta (0 = High quality)*: Approx. -0.5 dB
  Advanced features typically used: Strict frame-size compliance, low VBV (Video Buffering Verifier), AQ (Adaptive Quantization)
  Motion search and mode: Medium, 1-pass/2-pass
  Modes: Most modes
  Entropy coding: CABAC (H.264)

Highest Performance
  Use-case: All, w/ high performance requirement
  NVENC API preset to use: High performance (HP) presets
  Latency (set by the application via VBV buffer size): Depends on what application sets
  PSNR delta (0 = High quality)*: Approx. -0.5 to -2 dB
  Motion search and mode: Low, 1-pass
  Modes: Most modes
  Entropy coding: CAVLC (H.264)
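
The link between VBV buffer size and latency can be made concrete with a little arithmetic. The sketch below assumes the usual constant-bitrate approximation: a buffer of B bits drained at R bits per second holds B / R seconds of video, which is B × fps / R frames. The function names are hypothetical, and no NVENC or FFmpeg parameter is set here.

```python
def vbv_buffer_bits(bitrate_bps, fps, latency_frames):
    """VBV buffer size in bits that holds `latency_frames` frames of data."""
    if bitrate_bps <= 0 or fps <= 0 or latency_frames <= 0:
        raise ValueError("bitrate, fps and latency must all be positive")
    return round(bitrate_bps * latency_frames / fps)


def vbv_latency_frames(vbv_bits, bitrate_bps, fps):
    """Inverse relation: how many frames of delay a given buffer implies."""
    return vbv_bits * fps / bitrate_bps


# 5 Mbit/s at 30 fps: single-frame latency versus a 10-frame buffer.
one_frame_buffer = vbv_buffer_bits(5_000_000, 30, 1)
ten_frame_buffer = vbv_buffer_bits(5_000_000, 30, 10)
print(one_frame_buffer, ten_frame_buffer)
```

With a one-frame buffer every frame has to fit in roughly the average frame budget, which is why the low-delay column pairs low VBV with strict frame-size compliance.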


BENCHMARKS


ENCODE PERFORMANCE

H.264 1080p (1920x1080) 4:2:0 8bit 30fps (SINGLE NVENC)

Chart: Number of Streams / NVENC for Pascal, Maxwell 2nd Gen, Maxwell 1st Gen, and Kepler; series: Highest Quality, Highest Performance. Data labels: 21, 13, 12, 7, 9, 6, 4, 2.

#NVENC GPUs
Kepler Quadro K2000/K2000D/K4000/K4200/K5000/K5200/K6000
Kepler Tesla K20X/K40
Maxwell Quadro K2200 (1st Gen)/M2000 (2nd Gen)
Maxwell (2nd Gen) Tesla M4
Pascal Quadro P2000/P4000
Kepler Tesla K10/K80
Kepler GRID K2/K520
Maxwell (2nd Gen) Quadro M4000/M5000/M6000
Maxwell (2nd Gen) Tesla M6/M40
Pascal Quadro P5000/P6000
Pascal Tesla P4/P40
Pascal Quadro GP100
Pascal Tesla P100
Kepler GRID K1/K340
Maxwell (2nd Gen) Tesla M60

Note: All GPUs not featured above are limited to 2 simultaneous sessions

Performance represents an approximation of max performance and may vary based on GPU clock speed, OS, software versions, and motherboard configuration


ENCODE PERFORMANCE

HEVC 4K (3840x2160) 4:2:0 8bit 30fps (SINGLE NVENC)

Chart: Number of Streams / NVENC for Pascal and Maxwell (2nd Gen); series: Highest Quality, Highest Performance. Data labels: 13, 7, 5, 3.

#NVENC GPUs
Maxwell Quadro M2000
Maxwell Tesla M4
Pascal Quadro P2000/P4000
Maxwell Quadro M4000/M5000/M6000
Maxwell Tesla M6/M40
Pascal Quadro P5000/P6000
Pascal Tesla P4/P40
Pascal Quadro GP100
Pascal Tesla P100
Maxwell Tesla M60

Note: All GPUs not featured above are limited to 2 simultaneous sessions

Performance represents an approximation of max performance and may vary based on GPU clock speed, OS, software versions, and motherboard configuration


ENCODE PERFORMANCE

H.264 1080p (1920x1080) 4:4:4 8bit 30fps (SINGLE NVENC)

Chart: Number of Streams / NVENC for Pascal, Maxwell 2nd Gen, and Maxwell 1st Gen; series: Highest Quality, Highest Performance. Data labels: 13, 9, 7, 7, 5, 4.

#NVENC GPUs
Maxwell Quadro K2200 (1st Gen)/M2000 (2nd Gen)
Maxwell (2nd Gen) Tesla M4
Pascal Quadro P2000/P4000
Maxwell (2nd Gen) Quadro M4000/M5000/M6000
Maxwell (2nd Gen) Tesla M6/M40
Pascal Quadro P5000/P6000
Pascal Tesla P4/P40
Pascal Quadro GP100
Pascal Tesla P100
Maxwell (2nd Gen) Tesla M60

Note: All GPUs not featured above are limited to 2 simultaneous sessions

Performance represents an approximation of max performance and may vary based on GPU clock speed, OS, software versions, and motherboard configuration


ENCODE PERFORMANCE

Pascal HEVC 10bit 30fps (SINGLE NVENC)

Chart: Number of Streams / NVENC for 1080P 4:2:0, 1080P 4:4:4, 4K 4:2:0, and 4K 4:4:4; series: Highest Quality, Highest Performance. Data labels: 12, 9, 3, 2, 5, 4, 1, 1.

#NVENC GPUs
Pascal Quadro P2000/P4000
Pascal Quadro P5000/P6000
Pascal Tesla P4/P40
Pascal Quadro GP100
Pascal Tesla P100

Note: All GPUs not featured above are limited to 2 simultaneous sessions

Performance represents an approximation of max performance and may vary based on GPU clock speed, OS, software versions, and motherboard configuration


DECODE PERFORMANCE

NVDEC H.264 YUV 4:2:0

Chart: Number of 30fps Streams / NVDEC (axis 0 to 12) for Tesla P40 and Tesla M60; series: 4096 x 4096, 3840 x 2160, 2560 x 1440. Data labels: 11, 8, 5, 4.

Performance represents an approximation of max performance and may vary based on GPU clock speed, OS, software versions, and motherboard configuration


ENCODE PERF/QUALITY

Encode quality latest results (slow/med: ±0.4 dB within x264)

Chart: BD-PSNR vs FPS RATIO. X axis: FPS ratio (4x to 12x); Y axis: BD-PSNR (-2.0 to 0.0). Series: NVENC vs libx264 HQ MEDIUM, NVENC vs libx264 HQ SLOW. Points: medium/720p, medium/1080p, medium/2160p, slow/720p, slow/1080p, slow/2160p.


MOTION VECTOR QUALITY

➢ KITTI Vision Benchmark Suite for Optical Flow

➢ Measures distortion of motion vectors compared to “true” motion

➢ Average distortion ≈ 7%, improves 1-2% by motion hints


ME-ONLY MODE

Frame 0

Source: http://www.cvlibs.net/datasets/kitti/, under Creative Commons License


ME-ONLY MODE

Frame 1

Source: http://www.cvlibs.net/datasets/kitti/, under Creative Commons License


ME-ONLY MODE

Motion Vector Distortion

Panels: “True” motion; NVENC estimated motion

Distortion score = 2%


RESOURCES

Video Codec SDK: https://developer.nvidia.com/nvidia-video-codec-sdk

FFmpeg GIT: https://git.ffmpeg.org/ffmpeg.git

Libav GIT: https://git.Libav.org/libav.git

FFmpeg builds with hardware acceleration: http://ffmpeg.zeranoe.com/builds/

Video SDK support: [email protected]

Video SDK forums: https://devtalk.nvidia.com/default/board/175/video-technologies/
