Upload
opnfv
View
545
Download
0
Embed Size (px)
Citation preview
KVM Enhancements for OPNFV
Jun Nakajima
Intel Corporation
Contributors: https://wiki.opnfv.org/nfv-kvm
Project: NFV Hypervisors-KVM
1. Minimal Interrupt latency variation for data plane VNFs
2. Inter-VM Communication
3. Fast Live Migration
11/12/2015 KVM Enhancements for NFV
https://wiki.opnfv.org/nfv-kvm
2
Enhancements for NFV Hypervisor
1
1. Exclusive allocation of whole CPU cores to VMs
2
3
3. Inter-VM Communication (direct-memory mapped)
4
2. Direct I/O (e.g. SR-IOV)
4. vSwitch implementa1on as a high performance VM
General public and enterprise cloud Hypervisor Architecture
NFV Hypervisor Architecture
From ETSI “Network Functions Virtualization (NFV); Infrastructure; Hypervisor Domain”
3
KVM Hypervisor
Hardware
System Configuration
Testing and Perf Tools
NIC
VM
core core
Virtual Switch (vSwitch)
VM
VM
VM
KVM kernel module
2. Inter-VM Communication
1. Minimal Latency Variation
Configuration Information (BIOS, boot,
kernel)Latency Variation
Perf. Tools
Inter-VM Config. Information
Inter-VM Comm. Perf. Tools
Live Migration Perf. Tools
Live Migration Config.
Information3. Fast Live Migration
Scope of the Project
11/12/2015 KVM Enhancements for NFV 4
Minimal Interrupt Latency Variation
11/12/2015 KVM Enhancements for NFV 5
Goals of Minimal Interrupt Latency Variation
• Timer latency => AVG 5usec, MAX 15 us
• Interrupt latency => MAX 20 us (Future target: 5 us)
• Maximum guest vCPU pre-emption period => 10 us (Future target: 2 us)
• Cumulative pre-emption within 1 ms window => X < 20 us (Future target: X < 4 us)
https://etherpad.opnfv.org/p/nfv-kvm 11/12/2015 KVM Enhancements for NFV 6
User-Level
Linux Kernel
HardwareNIC
core
KVM Modules
User-Level
core
NIC
core core
Real-‐&me Guest Summary Of Solutions
PREEMPT-RT Kernel
Intr_Handler Excusive/Static Allocation
Soft “Partitioning”, CPU Binding, Huge
Pages
Software Real-Time Linux, Code inspection,
testing/measurements
Hardware Technologies Cache Allocation
Technology, Advanced VT features
11/12/2015 KVM Enhancements for NFV 7
Update on Enhancements: Cyclictest (Initial)
Latency (μs) 0
100000
200000
300000
400000
500000
600000
700000
800000
900000
1 4 7 10 13 16 19 22 25 28 31 34 37 40 43 46 49 52 55 58
Baseline
w/ RT Extensions
Histogram (Total: 1M Samples)
RT Linux Guest on KVM
Latency (μs) 11/12/2015 KVM Enhancements for NFV 8
Update on Enhancements: Cyclictest (After)
• Cyclic Test in Guest: Latency (in μs )
– Min: 7 – Avg: 9 – Max: 16
RT Linux Guest on KVM
Latency (μs)
Host: Linux with RT patches
000007 000003000008 11757562000009 67812652000010 159222000011 069100000012 011004000013 000379000014 000207000015 000049000016 000005
Latency
99.69% (Total #: 79,810,183)
Occurrences
0
10000000
20000000
30000000
40000000
50000000
60000000
70000000
80000000
0 2 4 6 8 10 12 14 16 18 20 22
Histogram
11/12/2015 KVM Enhancements for NFV 9
Linux Kernel
Hardware
KVM Modules
Intr_Hander
PCIe Device
Interrupt (MSI)
Device Driver
Periodic (1ms)
Update on Enhancements:Latency from Periodic External Interrupts
• Latency from periodic external interrupt:
– Time delta from interrupt occurrence to invocation of interrupt handler in guest (unit: in μs)
Min: 3.98 Avg: 4.42 Max: 9.10
11/12/2015 KVM Enhancements for NFV 10
Inter-VM Communication
11/12/2015 KVM Enhancements for NFV 11
Goals of Inter-VM Communication
• Add fast-paths in VMs as optimized inter-VM communication
• Maintain consistent flow table entries in VMs
• Enable protected access to the destination VM or shared memory
– Open the Window when needed – Close it immediately when done
11/12/2015 KVM Enhancements for NFV 12
VM1
virtio-net
In-VM Switch
VM2
TX
RX
virtio-net
Simple Flow Table
NIC
In-VM Switch
VM3
TX
RX
virtio-net
In-VM Switch
Inter-VM data transfer
API Libraries
ProcessProcess
Summary Of Solutions
Flow Table
Entries
Shared Memory
If (In-VM Switch is present)
vSwitch (e.g. OVS-‐dpdk, Snabb Switch)
11/12/2015 KVM Enhancements for NFV 13
Update on Enhancements:PoC
• Transfer 64B packets from virtio-net to another VM (fast path)
– Memory Copy – VMFUNC – VMFUNC with Trampoline
Code • 65Mpps with 32-packet
batching*:
11/12/2015 KVM Enhancements for NFV *Intel internal estimation 14
Update on Enhancements:Upstream Project
• Proposal from Michael S. Tsirkin
– Inter-VM Communication for two VMs – Receiver allows one to access its own memory via existing IOMMU
protection mechanism • Our plan
– Extend Michael’s proposal to support multiple VMs – Working with him for implementation
11/12/2015 KVM Enhancements for NFV
https://wiki.opnfv.org/vm2vm_mst
15
Fast Live Migration
11/12/2015 KVM Enhancements for NFV 16
17
Goals of Fast Live Migration
• Migration Time – < 2 sec.
• Downtime – Hypervisor downtime for IO intensive
workload: 10 ms (future 3ms) – Hypervisor downtime for memory
intensive workload: < 500 ms – Network downtime: 25 ms
• Configuration – Typical VM size: 8G to 16G – Network throughput: 5Gbps for
64bytes packets
• SR-IOV Migration – Support NIC with kernel Land driver – Support NIC with DPDK driver
11/12/2015 KVM Enhancements for NFV
18
Summary Of Solutions
• Shorten VM Downtime - Clean up operation after completion of
data transmission
- Minimize cpu_synchronize_all_states() calls
• Shorten Total Migration Time - Optimize handling of unused or zero
pages
- Use CPU instructions to optimize zero pages checking
- Use hardware accelerator to compress data before transmitting
11/12/2015 KVM Enhancements for NFV
Update on Enhancements
11/12/2015 KVM Enhancements for NFV
Guest
Host
Host
Guest
10Gbps for migra&on
10G used by pktgen
DPDK L2 FW APP
Eth0 Eth1
pktgen
DPDK L2 FW APP
Eth0 Eth1
Eth1 Eth0 Eth0 Eth1
vBridge vBridge
Network connec&on
Host CPU: Intel(R) Xeon(R) CPU E5-‐2699 v3 @ 2.30GHz RAM: 64G OS: RHEL 7.1, Kernel: 4.2-‐rc6 QEMU:2.4.0 Ethernet controller: Intel Corpora&on 82599ES 10-‐Gigabit SFI/SFP+ -‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐-‐ Parameters: /root/qemu.git/x86_64-‐sojmmu/qemu-‐system-‐x86_64 -‐enable-‐kvm -‐cpu host -‐smp 4 –device vir&o-‐net-‐pci,netdev=net1,mac=52:54:00:12:34:56 –netdev type=tap,id=net1,script=/etc/kvm/qemu-‐ifup,downscript=no,vhost=on –device vir&o-‐net-‐pci,netdev=net2,mac=54:54:00:12:34:56 –netdev type=tap,id=net2,script=/etc/kvm/qemu-‐ifup,downscript=no,vhost=on -‐m 8192 -‐monitor stdio /mnt/liang/ia32e_rhel6u5.qcow
Test environment:
0
500
1000
1500
2000
Idle guest DPDK FW workload
Total migra1on 1me (ms)
Original
Ajer op&miza&on
0
10
20
30
40
50
Idle guest DPDK FW workload
VM down1me (ms)
Original
Ajer op&miza&on
Performance data:
19
Q & A
11/12/2015 KVM Enhancements for NFV 20