On the cost of tunnel endpoint processing in overlay virtual networks
J. Weerasinghe & F. Abel, IBM Research Zurich Laboratory
J. Weerasinghe; NVSDN2014, London; 8th December 2014
© 2014 IBM Corporation


Page 1: On the cost of tunnel endpoint processing in overlay

© 2014 IBM Corporation

On the cost of tunnel endpoint processing in overlay virtual networks

J. Weerasinghe & F. Abel, IBM Research – Zurich Laboratory

J. Weerasinghe; NVSDN2014, London; 8th December 2014

Page 2: Outline

Motivation
Overlay Virtual Networks
  – background
  – tunnel endpoint
Cost of Tunnel Endpoint Processing
Proposal & Implementation of Acceleration
Measurements
Conclusions

Page 3: Motivation

Rack- & Blade-Servers
  – Discrete NICs (PCIe- or CPU-attached)
Hyper-scale Servers
  – Integrated NIC (inside the CPU)
Deployment of OVNs in Integrated NICs (iNICs)
  – area and power restricted

[Figure: rack server, blade server, and micro server with iNIC]

Page 4: Overlay Virtual Networks (1/2)

Millions of Virtual L2 Networks (Overlay Networks)
Single Physical Network (Underlay Network, both L2 & L3)
Packet Encapsulation
  – packet in the virtual network: HEADER | PAYLOAD
  – packet in the physical network: HEADER | HEADER | PAYLOAD

[Figure: millions of virtual networks layered on top of a single physical network]

Page 5: Overlay Virtual Networks (2/2)

Many Flavors
  – different encapsulation protocols
    • STT: IP/TCP-encap
    • NVGRE: IP/GRE-encap
    • VXLAN, GENEVE, DOVE: IP/UDP-encap

Page 6: Tunnel Endpoint (TEP)

Place where packets are encapsulated and decapsulated
Usually an IP interface (L4 depends on encap protocol)
Can be implemented in:
  (a) HW Switch
  (b) NIC
  (c) SW

[Figure: a TEP on each side of the physical network; HEADER | PAYLOAD packets in the virtual network become HEADER | HEADER | PAYLOAD packets in the physical network]

Page 7: a) TEP in HW Switch: Pros & Cons

Performance is good
But not scalable
  • isolation of traffic between App and TEP
  • MAC table explosion

[Figure: two servers (App in SW, NIC in HW) connected across the physical network, with the TEPs located in the HW switches]

Page 8: b) TEP in NIC: Pros & Cons

Performance is good
But not scalable
  • isolation of traffic between App and TEP
  • MAC table explosion

[Figure: two servers (App in SW, NIC in HW) connected across the physical network, with the TEPs located in the NICs]

Page 9: c) TEP in SW: Pros & Cons

Scales well
But longer code-path in SW degrades performance

[Figure: two servers (App in SW, NIC in HW) connected across the physical network, with the TEPs located in the server SW]

Page 10: Approach On Improving SW TEP Performance

OVN
  – VXLAN
Application Protocol
  – TCP/IP
    • a widely used application protocol
Rx Path
  – critical part of packet processing
    • due to lack of prior knowledge
SW TEP
  – analyze Linux implementation
  – assess the cost
Identify Functions to be Accelerated
Accelerate the Identified Functions

VXLAN-encapsulated TCP/IP packet:
  MAC | IP | UDP | VXLAN | MAC | IP | TCP | PAYLOAD
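The packet format above can be made concrete with a small sketch of the VXLAN header itself. The header layout (8 bytes, an "I" flag of 0x08, a 24-bit VNI) follows RFC 7348; the function names are illustrative only, and the outer header sizes assume untagged Ethernet and IPv4 without options, which the slides do not state.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag: the VNI field is valid (RFC 7348)

# Outer headers added in front of the inner frame.
# Assumption: untagged Ethernet and IPv4 without options.
OUTER_HEADER_BYTES = {"MAC": 14, "IP": 20, "UDP": 8, "VXLAN": 8}


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VXLAN Network ID."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 0: flags in the top byte, the rest reserved.
    # Word 1: VNI in the top 24 bits, the low byte reserved.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Extract the VNI from a VXLAN header, checking the "I" flag."""
    word0, word1 = struct.unpack("!II", header[:8])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return word1 >> 8


if __name__ == "__main__":
    header = pack_vxlan_header(5000)
    print(header.hex(), unpack_vni(header))
    print("encapsulation overhead in bytes:", sum(OUTER_HEADER_BYTES.values()))
```

Under these assumptions, the outer MAC, IP, UDP, and VXLAN headers add 50 bytes in front of every inner frame.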

Page 11: VXLAN-encapsulated TCP/IP Packet

  • Path in the Linux Network Stack
    – each packet has to travel twice in the stack

Outer Stack: MAC | IP | UDP | VXLAN
Inner Stack: MAC | IP | TCP | PAYLOAD

[Figure: receive path of a VXLAN-encapsulated packet through the Linux network stack]

Page 12: Assessing the Cost of TEP Processing (1/4)

Experimental Setup
  – Linux on bare metal
  – sender and receiver
    • netperf-based
    • connected back-to-back
Measurements
  – clock cycles (for TEP processing)
  – BW
  – latency

[Figure: Experimental Setup. netperf sender (Tx) and netperf receiver (Rx), each with a VXLAN TEP on top of eth0 and the NIC]
[Table: Specification of the Experimental Setup (contents not recoverable)]

Page 13: Assessing the Cost of TEP Processing (2/4)

Clock Cycle Measurement
  – fine-grained instrumentation of the code using the time stamp counter

Procedure (in the Linux kernel network source code)
  M1 = Measure()
  M2 = Measure()
  M3 = Measure()
  <code segment of interest>
  M4 = Measure()

  Measurement overhead = M2 - M1
  Clock cycles spent on executing the code w/ measurement overhead = M4 - M3
  Clock cycles spent on executing the code = (M4 - M3) - (M2 - M1)
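The procedure above can be sketched in user space as follows. This is not the kernel instrumentation used in the talk: Python cannot read the time stamp counter directly, so `time.perf_counter_ns` stands in for `Measure()` and the result is in nanoseconds rather than clock cycles. The names `net_cost` and `measure_segment` are made up for this illustration.

```python
import time
from typing import Callable


def net_cost(m1: int, m2: int, m3: int, m4: int) -> int:
    """(M4 - M3) - (M2 - M1): cost of the segment minus measurement overhead."""
    return (m4 - m3) - (m2 - m1)


def measure_segment(
    segment: Callable[[], object],
    clock: Callable[[], int] = time.perf_counter_ns,
) -> int:
    """Run `segment` once and return its cost with the timer overhead removed."""
    m1 = clock()
    m2 = clock()   # two back-to-back readings give the measurement overhead
    m3 = clock()
    segment()      # code segment of interest
    m4 = clock()
    return net_cost(m1, m2, m3, m4)


if __name__ == "__main__":
    cost = measure_segment(lambda: sum(range(10_000)))
    print("net cost in ns:", cost)
```

A single run is noisy; repeating the measurement and taking the minimum or median is a common way to stabilize the result, although the slides do not say how the samples were aggregated.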

Page 14: Assessing the Cost of TEP Processing (4/4)

CPU Clock Cycles for VXLAN Packet Processing

Format of VXLAN-encapsulated TCP/IP packet:
  Outer Stack: MAC | IP | UDP | VXLAN
  Inner Stack: MAC | IP | TCP | PAYLOAD

Number of Clock Cycles Spent on Outer- and Inner-stack Processing

  Stack   Layer/Function   Sub Function   Clock Cycles   Total   % of MTU-Size Packet
  Outer   Net Core                        120            1224    21%
          L3 (IP)                         668
          L4 (UDP)                        180
          VXLAN                           256
  Inner   Net Core                        92             1604    27%
          L3 (IP)          Checksum       24
                           Other          228
          L4 (TCP)         Checksum       608
                           Other          652

Callouts: 21% overhead; 10% overhead
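As a cross-check of the table, the per-layer cycle counts can be summed and compared with the reported totals. The numbers below are copied from the table; the dictionary layout and helper names are only an illustration.

```python
# Clock cycles per layer, copied from the table above.
CYCLES = {
    "outer": {
        "Net Core": 120,
        "L3 (IP)": 668,
        "L4 (UDP)": 180,
        "VXLAN": 256,
    },
    "inner": {
        "Net Core": 92,
        "L3 (IP) checksum": 24,
        "L3 (IP) other": 228,
        "L4 (TCP) checksum": 608,
        "L4 (TCP) other": 652,
    },
}


def stack_total(stack: str) -> int:
    """Sum of all per-layer cycle counts of one stack."""
    return sum(CYCLES[stack].values())


def checksum_share(stack: str) -> float:
    """Fraction of a stack's cycles spent in entries labelled 'checksum'."""
    checksum = sum(v for k, v in CYCLES[stack].items() if "checksum" in k)
    return checksum / stack_total(stack)


if __name__ == "__main__":
    print("outer total:", stack_total("outer"))
    print("inner total:", stack_total("inner"))
    print(f"checksum share of inner stack: {checksum_share('inner'):.1%}")
```

The sums reproduce the totals of 1224 and 1604 cycles, and they show that IP and TCP checksum verification together account for roughly 39% of the inner-stack cycles.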

Page 15: Assessing the Cost of TEP Processing (3/4)

Bandwidth
  – Netperf TCP_STREAM
  – BW performance = data rate (Gbps) / CPU utilization
  – result: 31.8% BW drop
Latency
  – Netperf TCP_RR
  – RTT/2
  – result: 32.5% latency increase
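The two metrics defined above can be written out as small helpers. The function names are illustrative, CPU utilization is assumed to be expressed as a fraction between 0 and 1 (the slide does not give the unit), and the numbers in the usage block are normalized placeholders, not measurements from the talk.

```python
def bw_performance(data_rate_gbps: float, cpu_utilization: float) -> float:
    """BW performance = data rate (Gbps) / CPU utilization.

    Assumption: cpu_utilization is a fraction in (0, 1].
    """
    if not 0.0 < cpu_utilization <= 1.0:
        raise ValueError("cpu_utilization must be in (0, 1]")
    return data_rate_gbps / cpu_utilization


def one_way_latency(rtt: float) -> float:
    """Latency is taken as half of the TCP_RR round-trip time."""
    return rtt / 2.0


def percent_change(baseline: float, value: float) -> float:
    """Relative change of `value` against `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0


if __name__ == "__main__":
    # Placeholder values: a baseline normalized to 100 and a value 31.8% lower.
    print(f"{percent_change(100.0, 68.2):+.1f}%")
```

Dividing the data rate by the CPU utilization means a configuration scores higher when it delivers the same throughput with less CPU, which is the quantity a software TEP puts under pressure.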

Page 16: Proposed Acceleration (1/3)

From SW TEP to Hybrid (SW & HW) TEP
  – part of the Rx path SW TEP processing is moved to the NIC HW & driver
  – Tx path TEP processing is not changed

[Figure: before and after. In the SW TEP, the TEP sits in the VXLAN layer above eth0; in the hybrid TEP, TEP functions are also placed in the driver and the NIC]

Page 17: Proposed Acceleration (2/3)

Stack Acceleration
  – outer stack acceleration (OSA)
    • (a) packet decapsulation in NIC
  – inner stack acceleration (ISA)
    • (b) checksum in NIC
    • (c) direct access to L4

Outer Stack: MAC | IP | UDP | VXLAN
Inner Stack: MAC | IP | TCP | PAYLOAD

In the NIC
  (a) NIC decapsulates,
  (b) verifies the inner packet checksum,
  (c) extracts inner stack information
Output of the NIC: inner packet + Rx descriptor
  – Rx descriptor fields: VNI, CS, STACK INFO (VNI: VXLAN Network ID, CS: Checksum)
In the driver
  (c) driver places the packet directly in the L4 (TCP) layer
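The effect of the Rx descriptor on the receive path can be sketched as a simple dispatch. Everything here is hypothetical: the slides name the descriptor fields (VNI, CS, STACK INFO) but not their layout, so `RxDescriptor`, its `l4_offset` field, and the stage names are assumptions, as is dropping a packet whose checksum failed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RxDescriptor:
    """Hypothetical view of the meta-data produced by the NIC."""
    vni: int           # VXLAN Network ID
    checksum_ok: bool  # CS: result of the inner checksum verification
    l4_offset: int     # stand-in for STACK INFO: where the TCP header starts


# Software path without acceleration: the packet travels through the stack twice.
FULL_PATH = [
    "outer L2", "outer L3 (IP)", "outer L4 (UDP)", "VXLAN",
    "inner L2", "inner L3 (IP)", "inner L4 (TCP)",
]


def rx_stages(descriptor: Optional[RxDescriptor]) -> List[str]:
    """Return the stack stages the driver must still run in software."""
    if descriptor is None:
        return list(FULL_PATH)      # plain SW TEP
    if not descriptor.checksum_ok:
        return ["drop"]             # assumption: bad checksum means drop
    return ["inner L4 (TCP)"]       # (c) direct access to L4


if __name__ == "__main__":
    print(rx_stages(None))
    print(rx_stages(RxDescriptor(vni=42, checksum_ok=True, l4_offset=34)))
```

With a valid descriptor only the inner TCP layer remains in software, which corresponds to steps (a) to (c) above.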

Page 18: Proposed Acceleration (3/3)

  • Accelerated Path in the Linux Network Stack
    (a) each packet travels only once in the stack
    (b) L2 and L3 are bypassed

Outer Stack: MAC | IP | UDP | VXLAN
Inner Stack: MAC | IP | TCP | PAYLOAD

[Figure: accelerated receive path through the Linux network stack]

Page 19: Implementation

@Tx driver
  – meta-data is generated (in a real implementation this happens in the Rx NIC)
  – and prepended to the packet
@Rx driver
  – meta-data is removed
  – and the packet is decapsulated (in a real implementation this happens in the Rx NIC)

Meta-data fields: VNI, CS, STACK INFO (VNI: VXLAN Network ID, CS: Checksum)

[Figure: netperf sender (Tx) and netperf receiver (Rx), each with VXLAN TEP, driver, and NIC (eth0); the meta-data travels in front of the encapsulated packet from the Tx driver to the Rx driver]
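The emulation described above (meta-data prepended at the Tx driver, then stripped together with the outer headers at the Rx driver) might look roughly like this. The byte layout of the meta-data is not given in the slides, so `META_FORMAT` and the field widths are invented for the sketch, and the 50-byte outer header length assumes untagged Ethernet and IPv4 without options.

```python
import struct
from typing import Tuple

# Hypothetical meta-data layout: VNI (32 bits), checksum-verified flag (8 bits),
# inner L4 offset as "stack info" (16 bits), all in network byte order.
META_FORMAT = "!IBH"
META_SIZE = struct.calcsize(META_FORMAT)

# Outer MAC + IP + UDP + VXLAN (assumption: no VLAN tag, no IP options).
OUTER_HEADER_SIZE = 14 + 20 + 8 + 8


def tx_prepend(vni: int, checksum_ok: bool, l4_offset: int, frame: bytes) -> bytes:
    """Tx driver: generate the meta-data and prepend it to the encapsulated frame."""
    return struct.pack(META_FORMAT, vni, int(checksum_ok), l4_offset) + frame


def rx_strip(buffer: bytes) -> Tuple[int, bool, int, bytes]:
    """Rx driver: remove the meta-data, then drop the outer headers (decapsulate)."""
    vni, checksum_ok, l4_offset = struct.unpack_from(META_FORMAT, buffer)
    inner = buffer[META_SIZE + OUTER_HEADER_SIZE:]
    return vni, bool(checksum_ok), l4_offset, inner


if __name__ == "__main__":
    outer = bytes(OUTER_HEADER_SIZE)   # placeholder outer headers
    inner = b"inner-frame"             # placeholder inner packet
    wire = tx_prepend(7, True, 34, outer + inner)
    print(rx_strip(wire))
```

Carrying the meta-data in band lets the acceleration be evaluated without changing the NIC, at the price of a few extra bytes per packet on the wire.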

Page 20: Results

Bandwidth
  – 97.1% of the BW performance achieved
Latency
  – 94.4% of the latency performance achieved

OSA: Outer Stack Acceleration
ISA: Inner Stack Acceleration

[Figures: bandwidth and latency plots]

Page 21: Conclusion

SW Tunnel Endpoint
  – supports all OVN requirements
  – but performance is degraded
  – cost
    • VXLAN adds 21% of CPU cycles to the processing of an MTU-size packet
    • BW performance drops by 31.8%
    • latency increases by 32.5%
Accelerated Tunnel Endpoint
  – light-weight stack acceleration
  – achieved performance
    • 97.1% of BW
    • 94.4% of latency
Future Work
  – add OVN support to integrated NICs

[Figure: Performance of Accelerated TEP (higher is better); y-axis: Percentage, 0 to 100; x-axis: BW, Latency; legend: Non-VXLAN, Non-Accelerated VXLAN]

Page 22: THANKS