持续助力数据中心虚拟化：KVM里的 GPU6 VM NVIDIA driver Apps NVIDIA vGPU KVM...

持续助力数据中心虚拟化：KVM里的虚拟GPU

AGENDA

NVIDIA vGPU on KVM

NVIDIA vGPU product overview

Summary

NVIDIA VGPU ON KVM

KVM HYPERVISORFast Growing and Widely Adopted

NVIDIA vGPU

• Fully enables NVIDIA GPU on virtualized platforms

• Wide availability - supported by all major hypervisors

• Great app compatibility – NVIDIA driver inside VM

• Great performance – VM direct access to GPU hardware

• Improved density

• Multiple VMs can share one GPU

• Highly manageable

• NVIDIA host driver, management tools retain full control of the GPU

• vGPU suspend, resume, live migration enables workloads to be transparently moved between GPUs

Performance, Density, Manageability – for GPUNVIDIA vGPU

Guest OS

NVIDIA

Driver

Tesla GPU

Hypervisor

vGPU Manager

Guest OS

NVIDIA

Driver

NVIDIA driver

NVIDIA vGPU KVM Architecture 101

Linux kernel

GRID vGPU Manager

VFIO Mediated Framework

Tesla GPU

VFIO PCI driver

kvm.ko

NVIDIA driver

VFIO PCI driver

Based on upstreamVFIO-mediated architecture

No VFIO UAPI change

Mediated device managed by generic sysfs interface or libvirt

Mediated Device Framework – VFIO MDEV

Present in KVM Forum 2016, upstream since Linux 4.10, kernel maintainer - Kirti Wankhede @ NVIDIA

Mediated core module (new)

Mediated bus driver, create mediated device

Physical device interface for vendor driver callbacks

Generic mediate device management user interface (sysfs)

Mediated device module (new)

Manage created mediated device, fully compatible with VFIO user API

VFIO IOMMU driver (enhancement)

VFIO IOMMU API TYPE1 compatible, easy to extend to non-TYPE1

A common framework for mediated I/O devices

Mediated Device Framework - NVIDIA

MDEVDevice Register

Interface

Mediated CBsMediated

Bus Driver

Mdev Driver

Register Interface

IOMMUMMU

Mdev sysfs

Mediated Core

VFIO UAPITYPE1 IOMMU

VFIOTYPE1

NVIDIA driver

Tesla GPUTesla GPUTesla GPU

Mediated Device Framework

After NVIDIA driver device registration, under physical device sysfs:

create : create a virtual device (aka mdev device)

mdev_supported_types : supported mdev and configuration of this device

name: GRID M60-8Q, for example.

description: num_heads, frl_config=60, framebuffer size, max_resolution, max_instance

device_api: vfio-pci

Mdev node: /sys/bus/mdev/devices/$mdev_UUID/

https://www.kernel.org/doc/Documentation/ABI/testing/sysfs-bus-vfio-mdev

Mediated Device sysfs

Creation of vGPU

Generate vGPU mdev UUID via uuid-gen, for example "98d19132-f8f0-4d19-8743-9efa23e6c493"

Create vGPU device:

dmesg output:

Start vGPU VM

Directly via QEMU command line:

-sysfsdev /sys/bus/mdev/devices/98d19132-f8f0-4d19-8743-9efa23e6c493

libvirt:

Mediated Device FrameworkQEMU – Device Initialization

Registers VFIO MDEV as driver

NVIDIA driver registers devices

NVIDIA driver registers Mediated CBs

User writes mdev sysfs to create mdev device

QEMU calls VFIO API to add VFIO dev to IOMMU container, group, get fd back

QEMU access device fd and present it into VM

Device Register

Interface

Bus Driver

Mdev Driver

Register Interface

Mdev sysfs

Mediated Core

IOMMUMMURAM

VFIOTYPE1

NVIDIA driver

pin/unpin

Guest OS

NVIDIA

driver

RAM NVIDIA

Mdev fd

/sys/bus/mdev/devices/$mdev_UUID

Mediated Device Access

Virtual device memory region are presented inside guest for consistent view of vendor driver

Access to emulated regions are redirected to mediated vendor driver for virtualization support

Access to passthrough region are directly sent to device corresponding region for max performance

1st access redirected to mediated vendor driver for CPU page table setup

Emulated vs Passthrough

Mediated Device AccessMMAP

QEMU gets region info via VFIO UAPI from vendor driver thru VFIO MDEV and Mediated CBs

NVIDIA driver accesses MDEV MMIO trapped region backed by mdev fdtriggers EPT violation

KVM services EPT violation and forwards to QEMU VFIO PCI driver

QEMU convert request from KVM to R/W access to MDEV fd

RW handled by NVIDIA driver via Mediated CBs and VFIO MDEV

Device Register

Interface

Bus Driver

Mdev Driver

Register Interface

Mdev sysfs

Mediated Core

IOMMUMMURAM

VFIO TYPE1

NVIDIA driver

pin/unpinKVM

Guest OS

NVIDIA

driver

RAM NVIDIA

Mdev fd

/sys/bus/mdev/devices/$mdev_UUID

NVIDIA driver

IOMMUMMURAM

Tesla GPU

Mediated DMA translationWith QEMU – Runtime Memory pinning

QEMU Starts

Memory regions gets added by QEMU

QEMU calls VFIO_DMA_MAP via Memory listener

TYPE1 IOMMU tracks <VA, GFN>

NVIDIA driver pin/translate GFN by TYPE1 IOMMU to get PFN

NVIDIA driver call pci_map_sg to map PFNs to BFN, program DMA

VA GFN

Device Register

Interface

Bus Driver

Mdev Driver

Register Interface

Mdev sysfs

Mediated Core

DMABFN

VFIOTYPE1

IOMMUpin/unpin

Guest OS

NVIDIA

driver

NVIDIA

Mediated Device FrameworkQEMU – Interrupt injection in runtime

QEMU query MDEV supported interrupt type, provided by NVIDIA driver

QEMU setups up KVM IRQFD

QEMU notifies the vendor driver IRQFD via VFIO PCI UAPI

NVIDIA driver inject interrupt by signaling on eventfd, trigger guest ISR

Device Register

Interface

Bus Driver

Mdev Driver

Register Interface

Mdev sysfs

Mediated Core

IOMMUMMURAM

VFIO TYPE1

NVIDIA driver

pin/unpinKVM

Guest OS

NVIDIA

driver

NVIDIA

GFNISR

NVIDIA vGPU Migration

Proposing a much more complete solution for VFIO MDEV -based migration

Pre-copy support

Dirty page tracking

Iterative memory copy

Initial NVIDIA KVM vGPU patch posted in Oct 2018

[RFC,v1,0/4] Add migration support for VFIO device

For KVM

How to Integrate to your Hypervisor?

Integrate major VFIO MDEV, KVM and QEMU patches

Contact NVIDIA vGPU Product Management team for evaluation driver and license

Let us know your integration experience

NVIDIA GRID GPU VIRTUALIZATION PLATFORMIndustry standard virtualization platform

NVIDIA Tesla GPUs

NVIDIA Virtualization Software

Hypervisor

GRID vPC Quadro vDWS

Multiple-GPUGPU Sharing

M60, M6, M10 (graphics/sharing only) P40, P6, P100, P4, V100, T4 (soon)

Data Center and/or Cloud Accessible

vGPU Monitoring, Insight and Management

CUDA Compute Support(Workstation Apps, Rendering, HPC, DL, AI)

NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

VGPU EVERYWHERE FOR EVERYONE, EVERY WORKLOAD

GPU ACCELERATED DATA CENTER & CLOUD

NVIDIA VGPUPRODUCT OVERVIEW

NON-STOP INNOVATION

2013 2016 2017 2018 2019

GPU in data center.

GRID 4

Virtual Desktop Infrastructure.Density and performance.

vGPU 5

Provision and management.

vGPU 6/7

More hypervisors and CSPs.Advanced data center features.

vGPU 8

AI/DL.

Source: Source information is 8 pt, italic

23NVIDIA CONFIDENTIAL. DO NOT DISTRIBUTE.

MULTI-vGPUDelivering a More Powerful Virtual Workstation

Virtual WorkstationVirtual Workstation Virtual Workstation Virtual Workstation Virtual Workstation

Multiple VM’s Sharing a GPU with NVIDIA Quadro vDWS

A Single VM Accessing Multiple Tesla GPUs with NVIDIA Quadro vDWS (7.0)

QUADRO vDWS MULTI-vGPUEnabling New Workflows in Strategic Markets

Oil & GasSeismic interpretation, simulation

ManufacturingSimulation, modeling & design

Federal GovernmentSimulation & training

Media & EntertainmentRendering

94% FASTER RENDERING USING MULTI-VGPU

Up to 94% faster render time using two Tesla V100-32Q GPUs versus a single Tesla V100-32Q

SOLIDWORKS Visualize (Iray) Render Time

TAKEWAYS

NVIDIA GRID VGPURuns on all Tesla GPUs

Maxwell

Pascal

Turing (Soon)

COMMITTED TO INNOVATION

2013 2016 2017 2018 / 2019

Business Users

Simulation/Photo Realism/3D

Rendering

Designers/Architects/Engineers

Source: Source information is 8 pt, italic

持续助力数据中心虚拟化：KVM里的 GPU6 VM NVIDIA driver Apps NVIDIA vGPU KVM...

Documents

Nvidia grid and vGPU

VFIO, OVMF, GPU, and You - Kernel-based Virtual … OVMF, GPU, and You The state of GPU assignment in QEMU/KVM Alex Williamson / alex.williamson@redhat.com The current state of VGA

VGA Assignment Using VFIO Alex Williamson · 2013-10-10 · VGA Assignment Using VFIO Alex Williamson alex.williamson@redhat.com October 21st, 2013. 2 Alex Williamson ... No VGA

LongView LongView Companion - KVM pro - IP KVM project

Kernel Virtual Machine (KVM): KVM security - IBM - United States

Mastering kvm virtualization- A complete guide of KVM virtualization

Kernel Virtual Machine (KVM): Best practices for KVM · “Best practice: Virtualize memory ... IBM ® DS3400 controllers ... 4 Kernel Virtual Machine (KVM): Best practices for KVM

GRID VGPU FOR CITRIX XENSERVER - Nvidiaus.download.nvidia.com/Windows/Quadro_Certified/GRID/312...NVIDIA CONFIDENTIAL GRID vGPU for Citrix XenServer DU-06920-001 | iv APPENDIX A. XenServer

GRID VGPU FOR VMWARE VSPHEREus.download.nvidia.com/Windows/Quadro_Certified/GRID/348.07/346.68...GRID vGPU for VMware vSphere DU-07354-001 | 3 GETTING STARTED This document provides

17 8-Port Combo VGA LCD IP KVM Switch...KVM-KC1-1.8 1.8M USB KVM Cable with built-in PS2 to USB Converter KVM-KC1-3 3M USB KVM Cable with built-in PS2 to USB Converter KVM-KC1-5 5M

VFIO, OVMF, GPU, and You · The state of GPU assignment in QEMU/KVM Alex Williamson / alex.williamson@redhat.com. The current state of VGA assignment. VGA assignment defined: Graphics

Engine Log Book #2.pdfkr85 kx170b sharc7 vfio-210 vfio-210 removed sub total installed items c1121 c1121 ... adf reciever 066-1023-00 vhf nav/comm 069-102000 aircraft wire n/a elt

VGPU ON KVM...Neo Jia & Kirti Wankhede, 08/25/2016 VGPU ON KVM VFIO BASED MEDIATED DEVICE FRAMEWORK 2 AGENDA Background / Motivation Mediated Device Framework –Overview Mediated

GRID vGPU for VMware vSphere Version 367.64/369us.download.nvidia.com/Windows/Quadro_Certified/... · 5.5. VM configured with more than one vGPU fails to initialize vGPU when booted

GRID VGPU FOR CITRIX XENSERVER - us.download.nvidia.comus.download.nvidia.com/Windows/Quadro_Certified/GRID/332.83/331.59... · GRID vGPU for Citrix XenServer DU-06920-001 | 1 Chapter

VFIO - Linux Plumbers Conf VFIO High performance user-space driver IOMMU based security PCI & Non-PCI support Eventfd based interrupts Original PCI implementation: Tom Lyon @Cisco

Release Notes - Nvidiaus.download.nvidia.com/.../GRID/...nvidia-grid-vgpu-release-notes.pdf · GRID vGPU for VMware vSphere Version 352.54 / 354.13 RN-07347-001 | 1 RELEASE NOTES

GRID VGPU FOR CITRIX XENSERVER - Nvidia's Download site!!

VFIO, OVMF, GPU, and You - linux-kvm.org · VFIO, OVMF, GPU, and You The state of GPU assignment in QEMU/KVM Alex Williamson / alex.williamson@redhat.com

Virtual GPU Software R450 for Red Hat Enterprise Linux ......Red Hat Enterprise Linux with KVM 7.8, 7.7, 7.6 All NVIDIA GPUs that NVIDIA vGPU software supports are supported with vGPU