
XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

DESCRIPTION

While virtualization technologies like Xen have been around for a long time, it is only in recent years that they have started to be targeted as viable systems for implementing middlebox processing (e.g., firewalls, NATs). But can they provide this functionality while yielding the high performance expected from hardware-based middlebox offerings? In this talk, Joao Martins will introduce ClickOS, a tiny, MiniOS-based virtual machine tailored for network processing. In addition to the VM itself, he will describe performance improvements made to the entire Xen I/O pipe. Finally, he will discuss an evaluation showing that ClickOS can be instantiated in 30 milliseconds, can process traffic at 10 Gb/s for almost all packet sizes, introduces a delay of 40 microseconds, and can run middleboxes at rates of 5 Mp/s.

Page 1: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Enabling Fast, Dynamic Network Processing with ClickOS

Joao Martins*, Mohamed Ahmed*, Costin Raiciu§, Roberto Bifulco*, Vladimir Olteanu§, Michio Honda*, Felipe Huici*

* NEC Labs Europe, Heidelberg, Germany

§ University Politehnica of Bucharest

[email protected], [email protected]

Page 2: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

The Idealized Network

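[Figure: protocol stacks of four nodes in an idealized network. Two end hosts implement all five layers (Physical, Datalink, Network, Transport, Application); of the other two nodes, one implements Physical, Datalink and Network, and one implements only Physical and Datalink.]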

Page 3: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

A Middlebox World

carrier-grade NAT

load balancer

DPI

QoE monitor

ad insertion

BRAS

session border controller

transcoder

WAN accelerator

DDoS protection

firewall

IDS

Page 4: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Hardware Middleboxes - Drawbacks

▐ Middleboxes are useful, but…

– Expensive

– Difficult to add new features, lock-in

– Difficult to manage

– Cannot be scaled with demand

– Cannot share a device among different tenants

– Hard for new players to enter market

▐ Clearly shifting middlebox processing to a software-based, multi-tenant platform would address these issues

– But can it be built using commodity hardware while still achieving high performance?

▐ ClickOS: tiny Xen-based virtual machine that runs Click

Page 5: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Click Runtime

▐ Modular architecture for network processing

▐ Based around the concept of “elements”

▐ Elements are connected in a configuration file

▐ A configuration is installed via a command line executable (e.g., click-install router.click)

▐ An element

– Can be configured with parameters (e.g., Queue::length)

– Can expose read and write variables available via sockets or the /proc system under Linux (e.g., Counter::reset, Counter::count)

▐ Compiled 262/300 elements

▐ Programmers can write new ones to extend Click runtime
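To make the element model concrete, here is a toy sketch in Python. It is not Click's C++ API: the `Element`, `Counter` and `Queue` classes, the `connect` and `push` methods, and the `capacity` parameter are all hypothetical names chosen for illustration. The sketch only shows the idea of connecting elements into a graph and exposing readable and writable state, loosely in the spirit of Counter::count and Counter::reset.

```python
class Element:
    """Hypothetical stand-in for a Click element with one input and one output."""

    def __init__(self):
        self.next = None

    def connect(self, other):
        # Plays the role of the "a -> b" arrow in a configuration file.
        self.next = other
        return other

    def push(self, packet):
        out = self.process(packet)
        if out is not None and self.next is not None:
            self.next.push(out)

    def process(self, packet):
        return packet


class Counter(Element):
    """Counts packets; `count` is readable and `reset()` is a write action."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def process(self, packet):
        self.count += 1
        return packet

    def reset(self):
        self.count = 0


class Queue(Element):
    """Stores up to `capacity` packets and silently drops the rest."""

    def __init__(self, capacity):
        super().__init__()
        self.capacity = capacity
        self.packets = []

    @property
    def length(self):
        return len(self.packets)

    def process(self, packet):
        if len(self.packets) < self.capacity:
            self.packets.append(packet)
        return None  # the queue ends this toy pipeline


counter = Counter()
queue = Queue(capacity=2)
counter.connect(queue)
for pkt in (b"a", b"b", b"c"):
    counter.push(pkt)
print(counter.count, queue.length)
```

In this toy run the counter sees all three packets while the queue keeps only two, which is the kind of per-element state a control plane would read or reset at runtime.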

Page 6: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

A simple (click-based) firewall example

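// Receive from and send to the paravirtualized network device.
in :: FromNetFront(DEVMAC 00:11:22:33:44:55, BURST 1024);
out :: ToNetFront(DEVMAC 00:11:22:33:44:55, BURST 1);

// Allow UDP from 10.0.0.1 to 10.1.0.1; drop everything else.
filter :: IPFilter(
    allow src host 10.0.0.1 && dst host 10.1.0.1 && udp,
    drop all);

// The IP header starts 14 bytes in, after the Ethernet header.
in -> CheckIPHeader(14) -> filter;
filter[0] -> Print("allow") -> out;
filter[1] -> Print("drop") -> Discard();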
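The decision the filter rules encode can be restated outside Click as a small first-match classifier. The Python below is only an illustrative sketch: the `Packet` tuple and the `classify` function are hypothetical names, and real packets would of course be parsed from raw headers rather than passed in as strings.

```python
from collections import namedtuple

# Hypothetical, already-parsed view of the fields the rules look at.
Packet = namedtuple("Packet", "src dst proto")


def classify(pkt):
    """Return 'allow' or 'drop'; the first matching rule wins."""
    if pkt.src == "10.0.0.1" and pkt.dst == "10.1.0.1" and pkt.proto == "udp":
        return "allow"
    return "drop"  # corresponds to the final catch-all rule


print(classify(Packet("10.0.0.1", "10.1.0.1", "udp")))
print(classify(Packet("10.0.0.1", "10.1.0.1", "tcp")))
```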

Page 7: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

What's ClickOS ?

[Figure: a standard domU (paravirtualized guest OS running apps) next to a ClickOS domain (paravirtualized MiniOS running Click).]

▐ Work consisted of:

– Build system to create ClickOS images (5 MB in size)

– Emulating a Click control plane over MiniOS/Xen

– Reducing boot times (roughly 30 milliseconds)

– Optimizations to the data plane (10 Gb/s for almost all pkt sizes)

Page 8: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Performance analysis

[Figure: packet path between the Driver Domain (or Dom 0) and the ClickOS Domain. Driver domain: NW driver, Linux/OVS bridge, vif, netback. ClickOS domain: netfront, Click (FromNetfront, ToNetfront). The domains are connected via the Xen bus/store, an event channel, and the Xen ring API (data). Rates annotated on the figure: 300* Kp/s, 350 Kp/s, 225 Kp/s.]

* maximum-sized packets

pkt size (bytes)    10Gb rate
64                  14.8 Mp/s
128                 8.4 Mp/s
256                 4.5 Mp/s
512                 2.3 Mp/s
1024                1.2 Mp/s
1500                810 Kp/s
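The 10 Gb/s packet rates can be approximated with a short calculation. The sketch below assumes each frame costs an extra 20 bytes on the wire (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap), which is standard Ethernet framing; the function name `line_rate_pps` is just an illustrative choice. For 1500 bytes this accounting gives about 0.82 Mp/s, slightly above the 810 Kp/s quoted on the slide, presumably because the slide counts the 1500 bytes differently.

```python
LINK_BPS = 10_000_000_000  # 10 Gb/s
WIRE_OVERHEAD_BYTES = 20   # preamble (7) + SFD (1) + inter-frame gap (12)


def line_rate_pps(frame_bytes, link_bps=LINK_BPS, overhead=WIRE_OVERHEAD_BYTES):
    """Packets per second needed to saturate the link with frames of this size."""
    return link_bps / ((frame_bytes + overhead) * 8)


for size in (64, 128, 256, 512, 1024, 1500):
    print(f"{size:>5} bytes  {line_rate_pps(size) / 1e6:6.2f} Mp/s")
```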

Page 9: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Main issues

▐ Backend switches (bridge / openvswitch) are slow

▐ Copying pages between domains (grant copy) greatly affects packet I/O

– These are done in batches, but still expensive

▐ Packet metadata (skb or mbufs) allocations

▐ MiniOS netfront not as good as Linux

– 225 Kpps vs. 430 Kpps Tx

– only 8 Kpps Rx

Page 10: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Optimizing Network I/O – Backend Switch

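[Figure: packet path with VALE as the backend switch. Driver Domain (or Dom 0): NW driver (netmap mode), VALE port, netback. ClickOS Domain: netfront, Click (FromNetfront, ToNetfront). The domains are connected via the Xen bus/store, an event channel, and the Xen ring API (data).]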

▐ Introduce VALE as the backend switch

– NIC switches to netmap-mode

▐ Slight modifications to the netback driver only

▐ Batch more I/O requests through multi-page rings

▐ Removed packet metadata manipulation

▐ 625 Kpps (1500-byte packets, 2.7x improvement) and 1.2 Mpps (64-byte packets, 4.2x improvement)

Page 11: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Background - Netmap

▐ Fast packet I/O framework

– 14.88 Mpps on 1 core at 900 MHz

▐ Available in FreeBSD 9+

– Also runs on Linux

▐ Minimal device driver modifications

– Critical resources (NIC registers, physical buffer addresses, and descriptors) not exposed to the user

– NIC works in special mode, bypassing the host stack

▐ Amortizes syscall cost by using large batches

▐ Preallocated packet buffers, and memory mapped to userspace
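Two bits of arithmetic help put these bullets in perspective. The sketch below first derives the cycle budget implied by 14.88 Mpps on a 900 MHz core, then shows how batching spreads a fixed system-call cost over many packets. The 1000 ns syscall cost and 50 ns per-packet cost are made-up illustrative numbers, not measurements from the talk, and both function names are hypothetical.

```python
def cycles_per_packet(clock_hz, packets_per_second):
    """CPU cycles available per packet at a given clock and packet rate."""
    return clock_hz / packets_per_second


def per_packet_cost_ns(syscall_ns, work_ns, batch):
    """Average cost per packet when one syscall covers `batch` packets."""
    if batch < 1:
        raise ValueError("batch must be at least 1")
    return work_ns + syscall_ns / batch


# Budget implied by the slide: 14.88 Mpps on one core at 900 MHz.
print(f"{cycles_per_packet(900e6, 14.88e6):.1f} cycles per packet")

# Illustrative (assumed) costs: 1000 ns per syscall, 50 ns of work per packet.
for batch in (1, 16, 256):
    print(batch, per_packet_cost_ns(1000, 50, batch))
```

With these assumed costs, the syscall dominates at a batch size of 1 and becomes nearly negligible at a batch size of 256.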

Netmap – a novel framework for fast packet I/O
http://info.iet.unipi.it/~luigi/netmap/
Luigi Rizzo, Universita di Pisa

Page 12: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Background - VALE Software Switch

▐ High performance switch based on netmap API (18 Mpps between virtual ports, one CPU core)

▐ Packet processing is “modular”

– Default as learning bridge

– Modules are independent kernel modules

▐ Applications use the netmap API
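For readers unfamiliar with the term, a learning bridge remembers which port each source MAC address was seen on and floods frames whose destination it has not yet learned. The Python below is a generic sketch of that behaviour, not VALE's kernel code; the `LearningBridge` class and its `forward` method are hypothetical names.

```python
class LearningBridge:
    """Generic learning-bridge logic: learn source ports, flood unknowns."""

    BROADCAST = "ff:ff:ff:ff:ff:ff"

    def __init__(self, ports):
        self.ports = set(ports)
        self.table = {}  # MAC address -> port it was last seen on

    def forward(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame should be sent out on."""
        self.table[src_mac] = in_port  # learn where the sender lives
        out_port = self.table.get(dst_mac)
        if dst_mac == self.BROADCAST or out_port is None:
            return sorted(self.ports - {in_port})  # flood
        if out_port == in_port:
            return []  # destination is on the ingress port; nothing to do
        return [out_port]


bridge = LearningBridge(ports=[0, 1, 2])
print(bridge.forward(0, "aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb"))  # unknown: flood
print(bridge.forward(1, "bb:bb:bb:bb:bb:bb", "aa:aa:aa:aa:aa:aa"))  # learned
```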

VALE, a Virtual Local Ethernet
http://info.iet.unipi.it/~luigi/vale/
Luigi Rizzo, Giuseppe Lettieri, Universita di Pisa

Page 13: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Optimizing Network I/O

[Figure: Driver Domain (or Dom 0) with NW driver, VALE and netback; ClickOS Domain with netfront and Click (FromNetfront, ToNetfront). The domains are connected via the Xen bus/store, TX/RX event channels, and the netmap API (data).]

▐ No longer need the extra copy between domains

▐ Netmap rings (in the VALE switch) are mapped all the way to the guest

▐ An I/O request doesn't require a response to be consumed by the guest

▐ Event channels are used to proxy netmap operations from/to guest and VALE

▐ Breaks other (non-MiniOS) guests :(

– But we have implemented a netmap-based Linux netfront driver

Page 14: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Optimizing Network I/O – Initialization and Memory usage

Initialization

1. opens netmap device

2. registers a VALE port

3. ring/bufs pages granted

4. ring grant refs read from the xenstore; buffer refs read from the mapped ring slot

[Figure: netback (Xen) in the Driver Domain attached to VALE; netfront and an app using the netmap API in Mini-OS; buffer slots [0], [1], [2] pointing into the netmap buffers pool.]

slots    KB (per ring)    # grants (per ring)
64       135              33
128      266              65
256      528              130
512      1056             259
1024     2117             516
2048     4231             1033

▐ Netmap buffers are contiguous pages in guest memory

▐ Buffers are 2k in size, each page fits 2 buffers

▐ Ring fits 1 page for 64 and 128 slots (2+ for 256+ slots)
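The grants column can be reproduced from the buffer and page sizes given on the slide, assuming one grant per 4 KB page. In the sketch below, the ring page counts for 64 and 128 slots come from the slide; the counts for 256 slots and above are back-derived from the grants column and should be treated as an assumption. The function name `grants_per_ring` is illustrative, and the KB column is not modelled.

```python
PAGE_SIZE = 4096  # bytes per page (one grant per page is assumed)
BUF_SIZE = 2048   # netmap buffer size from the slide: two buffers per page

# Pages occupied by the ring itself. 1 page for 64 and 128 slots is stated
# on the slide; the larger values are inferred from the grants column.
RING_PAGES = {64: 1, 128: 1, 256: 2, 512: 3, 1024: 4, 2048: 9}


def grants_per_ring(slots, ring_pages):
    """Pages to grant: one buffer per slot, plus the pages holding the ring."""
    buffer_pages = slots * BUF_SIZE // PAGE_SIZE
    return buffer_pages + ring_pages


for slots, ring_pages in RING_PAGES.items():
    print(slots, grants_per_ring(slots, ring_pages))
```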

Page 15: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Optimizing Network I/O – Synchronization

[Figure: Domain-0 (VALE, netback (Xen)) and Guest (Mini-OS: netfront, app) share buffer slots 0, 1, 2, which are mapped on both sides. The TX event channel carries "Packets to transmit" and "Backend finished" notifications.]

▐ In a netmap application, the operation is done in the sender's context

▐ Backend/frontend private copies are not included in the shared ring page(s)

▐ Event channels used for synchronization
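The synchronization pattern can be pictured as a shared ring plus a notification. The following single-threaded Python sketch is only a model: `SharedRing`, `Guest`, `Backend` and the `kick` callback are hypothetical names, and a real event channel is an asynchronous notification between domains, not a direct function call as it is here.

```python
class SharedRing:
    """Toy ring shared by guest and backend; head and tail only ever grow."""

    def __init__(self, num_slots):
        self.slots = [None] * num_slots
        self.head = 0  # next slot the backend will consume
        self.tail = 0  # next slot the guest will fill

    def free_slots(self):
        return len(self.slots) - (self.tail - self.head)


class Backend:
    def __init__(self, ring):
        self.ring = ring
        self.sent = []
        self.finished_events = 0

    def on_tx_event(self):
        """Runs when the guest signals 'packets to transmit'."""
        while self.ring.head != self.ring.tail:
            index = self.ring.head % len(self.ring.slots)
            self.sent.append(self.ring.slots[index])
            self.ring.head += 1
        self.finished_events += 1  # stands in for 'backend finished'


class Guest:
    def __init__(self, ring, kick):
        self.ring = ring
        self.kick = kick  # stands in for the TX event channel

    def transmit(self, packets):
        """Fill as many free slots as possible, then notify once."""
        queued = 0
        for packet in packets:
            if self.ring.free_slots() == 0:
                break
            index = self.ring.tail % len(self.ring.slots)
            self.ring.slots[index] = packet
            self.ring.tail += 1
            queued += 1
        if queued:
            self.kick()
        return queued


ring = SharedRing(num_slots=4)
backend = Backend(ring)
guest = Guest(ring, kick=backend.on_tx_event)
packets = [f"p{i}" for i in range(6)]
first = guest.transmit(packets)           # fills the 4 slots, one notification
second = guest.transmit(packets[first:])  # the remaining 2 packets
print(first, second, backend.sent)
```

The point of the model is that a whole batch of slots is handed over with a single notification rather than one notification per packet.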

Page 16: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

EVALUATION

Page 17: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

ClickOS Base Performance

[Figure: two panels, RX and TX]

Intel Xeon E1220 4-core 3.2GHz, 16GB RAM, dual-port Intel x520 10Gb/s NIC. One CPU core assigned to VM, the rest to dom0

Page 18: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Scaling out – Multiple NICs/VMs

Intel Xeon E1650 6-core 3.2GHz, 16GB RAM, dual-port Intel x520 10Gb/s NIC. 3 cores assigned to VMs, 3 cores for dom0

Page 19: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Linux Guest Performance

Page 20: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

ClickOS (virtualized) Middlebox Performance

Page 21: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

ClickOS Delay vs. Other Systems

Page 22: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Conclusions

▐ Presented ClickOS:

– Tiny (5MB) Xen VM tailored for network processing

– Can be booted (on demand) in 30 milliseconds

– Can achieve 10Gb/s throughput using only a single core

– Can run a varied range of middleboxes with high throughput

▐ Future work:

– Improving performance on NUMA systems

– High consolidation of ClickOS VMs (thousands)

– Service chaining

Page 24: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

MiniOS (pkt-gen) Performance

[Figure: two panels, RX and TX]

Page 25: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Scaling Out – Multiple VMs TX

Page 26: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

ClickOS VM and middlebox Boot time

30 milliseconds

220 milliseconds