
XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc


Page 1: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Windows PV Network Performance

Paul Durrant, Senior Principal Software Engineer, Citrix Systems

Windows PV Driver Community Lead

Xen Project Developer Summit 2016

Page 2: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Agenda

• Background

• The netif protocol

• Windows RSS

• Protocol Extensions

• Performance Measurements

• Q & A

Page 3: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Background

Page 4: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

The netif protocol

• Canonical header: xen/include/public/io/netif.h

• Usual split driver model:

[Diagram: a single shared ring between frontend and backend, carrying requests from frontend to backend and responses back]

• But…

Page 5: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

The netif protocol

• Duplicated for RX and TX:

[Diagram: two separate request/response rings between frontend and backend, one for RX and one for TX]

• RX requests still come from frontend so ring needs to be ‘pre-filled’
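The data rings themselves are standard Xen shared rings. A minimal sketch of how netif.h instantiates them with the generic macros from xen/include/public/io/ring.h (paraphrased from the public header; the request and response structures are described on the next two slides):

    /* One shared ring each for TX and RX; the macro generates the shared page
     * layout (netif_tx_sring / netif_rx_sring), the front/back ring handles and
     * accessors such as RING_GET_REQUEST() and RING_GET_RESPONSE(). */
    DEFINE_RING_TYPES(netif_tx, struct netif_tx_request, struct netif_tx_response);
    DEFINE_RING_TYPES(netif_rx, struct netif_rx_request, struct netif_rx_response);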

Page 6: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

The netif protocol

• TX packet fragments (requests):

[Diagram: a packet occupies consecutive ring slots: Frag 1, Extra 1 … Extra n, Frag 2 … Frag n]

• Data specified by grant_ref, offset and size

• size of ‘Frag 1’ is total size of packet, not just the fragment

• id field echoed in corresponding response

• ‘Extra’ fragments have no room for id. How are responses matched?

• They’re not, but…

• ‘Extra’ response has magic RSP_NULL status
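For reference, a sketch of the TX request and response layouts, paraphrased from netif.h (some copies carry xen_-prefixed names; comments summarise the points above):

    struct netif_tx_request {
        grant_ref_t gref;   /* grant reference of the page holding the data    */
        uint16_t offset;    /* offset of the data within the granted page      */
        uint16_t flags;     /* NETTXF_* flags, e.g. more_data, extra_info      */
        uint16_t id;        /* echoed in the corresponding response            */
        uint16_t size;      /* total packet size in 'Frag 1', fragment size
                             * in later fragments                              */
    };

    struct netif_tx_response {
        uint16_t id;        /* id copied from the matching request             */
        int16_t  status;    /* NETIF_RSP_* status; 'Extra' slots get RSP_NULL  */
    };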

Page 7: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

The netif protocol

• RX packet fragments (responses):

[Diagram: as on the TX side, a packet occupies consecutive ring slots: Frag 1, Extra 1 … Extra n, Frag 2 … Frag n]

• Data specified by offset

• No size field. A positive value of status is the fragment size

• grant_ref is in the request, so the id is needed to find the right data, but…

• ‘Extra’ fragments have no room for id. How are responses matched?

• Responses must be in same ring slot as corresponding request, so id isn’t actually needed!
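Similarly, a sketch of the RX request and response layouts, paraphrased from netif.h:

    struct netif_rx_request {
        uint16_t id;        /* echoed in the response; the slot position is
                             * what actually matches request to response       */
        grant_ref_t gref;   /* grant reference of the page to receive into     */
    };

    struct netif_rx_response {
        uint16_t id;        /* id copied from the request in the same slot     */
        uint16_t offset;    /* offset of the data within the granted page      */
        uint16_t flags;     /* NETRXF_* flags, e.g. more_data, extra_info      */
        int16_t  status;    /* negative: NETIF_RSP_* error; positive: fragment
                             * size in bytes                                   */
    };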

Page 8: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

The netif protocol

• Performance Issues:

• Single event channel for RX and TX completion
  • Fixed by feature-split-event-channels

• Single ring (therefore single vCPU) for RX and TX processing
  • Fixed by multi-queue…

• Single page ring
  • Still an open question…
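Both the split-event-channel and multi-queue fixes above are negotiated through xenstore. A sketch of the frontend area once split event channels and four queues have been agreed (key names follow netif.h; paths and values are illustrative):

    # Backend advertises support with feature-split-event-channels = "1"
    # and multi-queue-max-queues = "<n>"; the frontend then writes:
    device/vif/0/multi-queue-num-queues = "4"
    device/vif/0/queue-0/tx-ring-ref = "<grant ref>"
    device/vif/0/queue-0/rx-ring-ref = "<grant ref>"
    device/vif/0/queue-0/event-channel-tx = "<port>"
    device/vif/0/queue-0/event-channel-rx = "<port>"
    ...
    device/vif/0/queue-3/event-channel-rx = "<port>"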

Page 9: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Windows RSS

• Relies on NIC functionality (which most implement):

[Diagram: incoming packet → Toeplitz hash over a hash key → indirection table → MSI-X interrupt → CPU0 … CPUn; the key and indirection table are set by the Windows network stack]
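In software terms the NIC is doing something like the following: a Toeplitz hash over the flow tuple using the key supplied by the stack, then an indirection-table lookup to choose the CPU. This is an illustrative sketch only, not the netif.h reference implementation (real NICs typically index the table with the low-order bits of the hash):

    #include <stdint.h>
    #include <stddef.h>

    /* Toeplitz: for every set bit i of the input, XOR in key bits [i, i+31]. */
    static uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
                                  const uint8_t *data, size_t data_len)
    {
        uint64_t window = 0;
        uint32_t hash = 0;
        size_t i;

        /* Prime a 64-bit window with the first 8 bytes of the key. */
        for (i = 0; i < 8; i++)
            window = (window << 8) | (i < key_len ? key[i] : 0);

        for (i = 0; i < data_len * 8; i++) {
            /* If input bit i is set, mix in the 32 key bits starting at i. */
            if (data[i / 8] & (0x80u >> (i % 8)))
                hash ^= (uint32_t)(window >> 32);

            /* Slide the window one bit to the left, pulling in key bit i + 64. */
            window <<= 1;
            if (i + 64 < key_len * 8 &&
                (key[(i + 64) / 8] & (0x80u >> ((i + 64) % 8))))
                window |= 1;
        }

        return hash;
    }

    /* The hash then selects a CPU (or, later, a queue) via the indirection table. */
    static unsigned int rss_select(const unsigned int *table, size_t table_size,
                                   uint32_t hash)
    {
        return table[hash % table_size];
    }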

Page 10: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Windows RSS

“So how do we do this with PV drivers?”

Page 11: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Windows RSS

• This bit needs to be in the frontend:

[Diagram: queue-0/event-channel-rx … queue-n/event-channel-rx each bound to its own vCPU (CPU0 … CPUn) via EVTCHNOP_bind_vcpu, with per-vCPU callback vectors registered via HVMOP_set_evtchn_upcall_vector]

ALREADY DONE
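A minimal sketch of the two operations named above, using the structure layouts from the Xen public headers; the hypercall wrappers here are hypothetical names, as the Windows PV drivers use their own plumbing:

    /* Assumes the Xen public headers are on the include path:
     *   public/event_channel.h defines EVTCHNOP_bind_vcpu and
     *     struct evtchn_bind_vcpu { evtchn_port_t port; uint32_t vcpu; };
     *   public/hvm/hvm_op.h defines HVMOP_set_evtchn_upcall_vector and
     *     struct xen_hvm_evtchn_upcall_vector { uint32_t vcpu; uint8_t vector; };
     */

    /* Hypothetical hypercall wrappers. */
    extern int hypercall_event_channel_op(unsigned int cmd, void *arg);
    extern int hypercall_hvm_op(unsigned int cmd, void *arg);

    /* Route one queue's RX event channel to a particular vCPU. */
    static void route_queue_to_vcpu(evtchn_port_t port, uint32_t vcpu, uint8_t vector)
    {
        struct xen_hvm_evtchn_upcall_vector uv = { .vcpu = vcpu, .vector = vector };
        struct evtchn_bind_vcpu bind = { .port = port, .vcpu = vcpu };

        /* Register the interrupt vector Xen should raise on this vCPU... */
        hypercall_hvm_op(HVMOP_set_evtchn_upcall_vector, &uv);

        /* ...and deliver this event channel's notifications to that vCPU. */
        hypercall_event_channel_op(EVTCHNOP_bind_vcpu, &bind);
    }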

Page 12: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Windows RSS

• This bit needs to be in the backend:

[Diagram: incoming packet → Toeplitz hash over a hash key → indirection table → queue-0 … queue-n; the key and indirection table are set by the Windows network stack]

HOW?

Page 13: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Protocol Extensions

Page 14: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Protocol Extensions

• Need some way to…

• Specify hash algorithm

• Specify hash key and flags

• Specify indirection table

…in the backend

Page 15: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Protocol Extensions

• Introduce netif control ring:

[Diagram: a new CTRL ring between frontend and backend, alongside the existing data rings, carrying requests and responses]

Requests:

XEN_NETIF_CTRL_TYPE_GET_HASH_FLAGS
XEN_NETIF_CTRL_TYPE_SET_HASH_FLAGS
XEN_NETIF_CTRL_TYPE_SET_HASH_KEY
XEN_NETIF_CTRL_TYPE_GET_HASH_MAPPING_SIZE
XEN_NETIF_CTRL_TYPE_SET_HASH_MAPPING_SIZE
XEN_NETIF_CTRL_TYPE_SET_HASH_MAPPING
XEN_NETIF_CTRL_TYPE_SET_HASH_ALGORITHM

Responses:

XEN_NETIF_CTRL_STATUS_SUCCESS
XEN_NETIF_CTRL_STATUS_NOT_SUPPORTED
XEN_NETIF_CTRL_STATUS_INVALID_PARAMETER
XEN_NETIF_CTRL_STATUS_BUFFER_OVERFLOW
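A sketch of the control-ring message format, paraphrased from the definitions added to netif.h:

    struct xen_netif_ctrl_request {
        uint16_t id;        /* echoed in the response                          */
        uint16_t type;      /* one of the XEN_NETIF_CTRL_TYPE_* values above   */
        uint32_t data[3];   /* type-specific parameters, e.g. a grant reference
                             * and length for SET_HASH_KEY / SET_HASH_MAPPING  */
    };

    struct xen_netif_ctrl_response {
        uint16_t id;        /* id copied from the request                      */
        uint16_t type;      /* type copied from the request                    */
        uint32_t status;    /* one of the XEN_NETIF_CTRL_STATUS_* values above */
        uint32_t data;      /* type-specific result, e.g. the current mapping
                             * size for GET_HASH_MAPPING_SIZE                  */
    };

    /* The ring itself is instantiated with the standard shared-ring macro. */
    DEFINE_RING_TYPES(xen_netif_ctrl,
                      struct xen_netif_ctrl_request,
                      struct xen_netif_ctrl_response);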

Page 16: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Protocol Extensions

• xen-netback implementation:

• New ndo_select_queue op (overrides default):

    unsigned int size = vif->hash.mapping_size;

    xenvif_set_skb_hash(vif, skb);   /* Toeplitz implementation (actually in netif.h) */

    return vif->hash.mapping[skb_get_hash_raw(skb) % size];
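For context, a sketch of how that fragment might sit in a complete ndo_select_queue implementation. The prototype follows the Linux 4.7-era net_device_ops signature and the field names shown above; treat it as illustrative rather than the exact xen-netback source:

    static u16 xenvif_select_queue(struct net_device *dev, struct sk_buff *skb,
                                   void *accel_priv,
                                   select_queue_fallback_t fallback)
    {
        struct xenvif *vif = netdev_priv(dev);
        unsigned int size = vif->hash.mapping_size;

        /* No frontend-supplied mapping: fall back to the stack's default. */
        if (size == 0)
            return fallback(dev, skb) % dev->real_num_tx_queues;

        /* Compute (or recover) the flow hash, then use the frontend-supplied
         * indirection table to pick the queue. */
        xenvif_set_skb_hash(vif, skb);

        return vif->hash.mapping[skb_get_hash_raw(skb) % size];
    }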

Page 17: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Protocol Extensions

• xen-netback implementation:

• New debugfs node:

root@brixham:~# ls /sys/kernel/debug/xen-netback/vif1.1
ctrl  io_ring_q0  io_ring_q1  io_ring_q2  io_ring_q3

root@brixham:~# cat /sys/kernel/debug/xen-netback/vif1.1/ctrl
Hash Algorithm: TOEPLITZ

Hash Flags:
- IPv4
- IPv4 + TCP
- IPv6
- IPv6 + TCP

Page 18: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Protocol Extensions

“What about the hash values?”

Page 19: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Protocol Extensions

• New ‘Extra’ frag type:

XEN_NETIF_EXTRA_TYPE_HASH

struct {
    uint8_t type;
    uint8_t algorithm;
    uint8_t value[4];
} hash;

• Windows passes the RX flow hash on the TX side, so the correct queue can be chosen.
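A sketch of how a frontend might fill in that extra segment on the TX path, assuming netif.h is included. The function name and the 'hash' and 'hash_type' parameters are hypothetical stand-ins for the recorded RSS hash and the matching XEN_NETIF_CTRL_HASH_TYPE_* value:

    #include <stdint.h>
    #include <string.h>

    static void fill_hash_extra(struct netif_extra_info *extra,
                                uint32_t hash, uint8_t hash_type)
    {
        extra->type = XEN_NETIF_EXTRA_TYPE_HASH;     /* extra-segment type     */
        extra->flags = 0;
        extra->u.hash.type = hash_type;              /* which tuple was hashed */
        extra->u.hash.algorithm =
            XEN_NETIF_CTRL_HASH_ALGORITHM_TOEPLITZ;  /* algorithm negotiated
                                                      * over the control ring  */

        /* Carry the 32-bit hash value; see netif.h for the exact encoding. */
        memcpy(extra->u.hash.value, &hash, sizeof(hash));
    }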

Page 20: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Performance Measurements

Page 21: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Performance Measurements

• Hardware:

  • Gigabyte Brix i7-4770R

  • 32GB RAM

  • 200GB SATA SSD

Page 22: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Performance Measurements

• Software:

  • 2 x Windows 10 32-bit domU
    • 4 vCPUs
    • 4GB RAM
    • 8.2.0 (master) PV Drivers

  • Xen 4.7.0
    • Upstream QEMU

  • Linux 4.7.0
    • debugfs patch

  • IXIA Chariot
    • TCP Throughput

Page 23: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Performance Measurements

• Single Pair:

Page 24: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Performance Measurements

• Two Pairs:

Page 25: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Performance Measurements

• Four Pairs (one per CPU):

Page 26: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Performance Measurements

“Does RSS make a difference over basic multi-queue?”

Page 27: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Performance Measurements

• Four Pairs (multi-queue, no RSS):

Throughput is unbalanced because the flows are competing for the same CPU.

Page 28: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Performance Measurements

“What if all flows compete for the same CPU?”

Page 29: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Performance Measurements

• Four Pairs (RSS forced to single queue):

The worst case is bad: down ~6Gbps on the best case.

Page 30: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Performance Measurements

• Conclusions

• Multi-queue works best when queues are targeted at different CPUs

• RSS allows the guest to control the TCP flow-to-queue mapping and hence get the best from multi-queue

Page 31: XPDS16: Windows PV Network Performance - Paul Durrant, Citrix Systems Inc

Q & A