NFV SDN Summit March 2014 D1 07 kireeti_kompella Native MPLS Fabric

Copyright 2014 Juniper Networks

A NATIVE MPLS FABRIC

MPLS World Congress 2014

CTO, R&D

Kireeti Kompella


PROBLEM STATEMENT (DC)

Overlays are all the rage in the data center … §  except that we’ve been doing overlays/underlays with MPLS pretty

much since 1997

The DC overlays start at the host (server) … § which requires true “plug-and-play” operation

To have an MPLS underlay network, the host must be part of the underlay … §  this talk is about making that easy and plug-and-play


PROBLEM STATEMENT (ACCESS)

Many have asked that MPLS start at the access node (DSLAM, OLT, cell-site gateway) §  “Seamless MPLS” has suggested the use of LDP “Downstream on

Demand” (DoD) for this § However, there haven’t been many implementations of LDP DoD

from access node vendors

Maybe the time has come for a different approach/protocol for this functionality, one that is easier to implement


OVERLAYS AND UNDERLAYS: NEW?

~64K end

points core: no endpoint

state

VCs Single VP

ATM overlay (two level)

O(10^4) VPNs,

O(10^7) addresses

PE P

core: no VPN or VPN address

state

LSP MPLS overlay (multi-level)

edge: lots of state

edge: lots of state


OVERLAY/UNDERLAY DATA PLANE

Commodity chips implemented the MPLS data plane about a decade ago Now, some have implemented just one of a largish crop of new overlay encapsulations; another is close to shipping §  Each has its own way of identifying tenants, doing load balancing,

and of course, its own encap/decap implementations §  And, as we speak, there is yet another proposal for an encap


MPLS has a very sophisticated, robust, scalable and interoperable control plane §  Various types of hierarchy are supported §  {BGP, T-LDP} [overlay] over {LDP, RSVP-TE} [underlay] §  The overlay control plane is for service state §  The underlay control plane is for traffic management

The new overlay encapsulations don’t have well-specified, interoperable control planes for either the overlay or the underlay §  There is a recent proposal to extend BGP for an overlay (EVPN/

IPVPN) over VXLAN and perhaps NVGRE

OVERLAY/UNDERLAY CONTROL PLANES


WHY IP UNDERLAYS?

The theory is, IP is simple, plug-and-play, well understood §  and all forwarding chips can do IP §  also, network interface cards on servers can do TCP offload

Furthermore, IP is ubiquitous § Outside the DC, it is all IP § Upstream from the access node, it is again all IP


WHY IP UNDERLAYS?

The theory is, IP is simple, plug-and-play, well understood §  and all forwarding chips can do IP §  also, network interface cards on servers can do TCP offload

Furthermore, IP is ubiquitous § Outside the DC, it is all IP § Upstream from the access node, it is again all IP

The reality is, IP encapsulations are pretty heavyweight §  VXLAN and NVGRE forwarding is relatively new §  forwarding hasn’t got all the features yet §  these encapsulations add significant complexity to forwarding •  the above are true whether forwarding is in software or in hardware

Furthermore, while IP is ubiquitous, MPLS is very common too, especially in metro, core and inter-DC networks §  To support WAN multi-tenancy, it is very common to use MPLS §  Alternately, one can use IPsec, but that is a very different use case,

and requires very different forwarding support

^ not

…


The theory is, MPLS is complex, hard to provision, hard to understand, hard to troubleshoot §  and few people seem to remember if forwarding chips can do MPLS §  there is a concern about TCP offload if MPLS is used

Furthermore, MPLS features are not needed for DCs (?) §  After all, who would need sophisticated load balancing in a DC? § Who would need traffic engineering? §  Fast reroute?

WHY NOT MPLS UNDERLAYS?


The theory is, MPLS is complex, hard to provision, hard to understand, hard to troubleshoot §  and few people seem to remember if forwarding chips can do MPLS §  there is a concern about performance if MPLS is used

Furthermore, MPLS features are not needed for DCs (?) §  After all, who would need sophisticated load balancing in a DC? § Who would need traffic engineering? §  Fast reroute?

Furthermore, MPLS features are indeed needed in DCs §  Among them, sophisticated load balancing with entropy labels §  Traffic engineering to take care of “mice” and “elephant” flows §  and fast reroute – DC applications are real-time and critical

WHY NOT MPLS UNDERLAYS?

The reality is, all networking is complex today. We need to make network provisioning and troubleshooting easier § while focusing on the features we want or even need §  Excellent performance is possible with MPLS!


WHAT TO DO ABOUT OVERLAYS?

Having a proliferation of overlay technologies means significant work for forwarding planes (both software and hardware), which in turn means increased cost § most of all, a dilution of effort that serves no one §  a repetition of features and a slow “catch-up” game §  a host of interoperability issues – yet more features to implement

Focusing solely on encapsulations loses the plot – we have to look at the whole picture §  provisioning, control plane, data plane, troubleshooting

Do it once, do it right!


CAN THE MPLS CONTROL PLANE (IN SOME CASES) BE TOO SOPHISTICATED?

Can’t have a flat IGP with so many hosts LDP DoD with static routing is a possibility, but not ideal Absolutely has to be plug-and-play – new hosts are added at a high rate

1000s of nodes

VMs

VMs

hosts

ToRs

100000s

VMs

10^7s


PROXY ARP RECAP

IP1 IP2

1) Hey, give me a hardware address (of type Ethernet) that I can use to reach IP2

3) You can use MAC1

MAC1 H2 T1 H1 T2 …

2) T1 can reach IP2

FIB To reach IP2, use MAC1


LABELED ARP (1)

IP1 IP2 … H2

T1 has a label (L2) to reach IP2

T1 H1 T2 Label L2 to reach

IP2

LDP Label L3 to reach

IP2

LDP

T2 leaks its hosts’ routes (here IP2) into LDP (with label L3)

LFIB L3: pop & send to H2


LABELED ARP (2)

IP1 IP2

1) Hey, give me a hardware address (of type MPLSoEthernet)

that I can use to reach IP2

3) You can use MAC1:L1

MAC1

LFIB L1 à L2

H2 T1 H1 T2 L2 L3 L1

Functionality is very much like LDP DoD.

However, ARP code is plug-and-play and

ubiquitous

…

2) T1 can reach IP2 via MPLS, so it allocates L1 for H1 to reach IP2

Note: new h/w type means that this code

can coexist with “normal” ARP code

LFIB L3: pop & send to H2


POINTS TO NOTE

ARP is ubiquitous – present on 10s of billions of devices L-ARP can coexist peacefully with existing Ethernet ARP §  A host can choose to do E-ARP or L-ARP by setting the hardware

type in the ARP request

There is an IETF draft describing L-ARP in detail §  The goal is to make this an open standard for anyone to implement

We have prototype L-ARP code for Linux servers §  The goal is to make this open source for anyone to use

No, sorry, we don’t have an iOS/Android app (yet!) §  But just ask :)


USE CASE 1: EGRESS PEERING TRAFFIC ENGINEERING

Content Server

ISP1

Data Center

Peering link to ISP1 Intra DC

Network

ISP2

Internet

Direct traffic to prefix1 to ISP1

Direct traffic to prefix2 to ISP2

IP1

IP2

RIB: Prefix1: nh IP1 Prefix2: nh IP2

Peering link to ISP2

Use MPLS underlay for traffic steering

CS L-ARPs for IP1/IP2 to get required labels

Bonus: DC switches carry “a few” MPLS LSPs rather than full

Internet routing


USE CASE 2: MPLS UNDERLAY FOR DCs (WITH VRFs/E-VPNs FOR OVERLAY)

IP1 IP2 … H2 T1 H1 T2

RR

VM2

BGP: to reach VM2, use VL2 with

IP2 as the nexthop

VM1

1) VM1 wants to talk

to VM2 (same VPN).

LDP LDP L-ARP L-ARP

2) BGP says to reach VM2, go to IP2 3) H1 resolves IP2 using L-ARP

4) Then, packets from VM1 to VM2 are encapsulated with outer label from L-ARP and inner label from BGP


CONCLUSION

A proliferation of encapsulations hurts the industry §  Encapsulations need a number of features in the data plane –

encap/decap, load balancing, offload, hierarchy, FRR § Overlays need a control plane for services and tunnels

MPLS is a powerful data plane and has a great control plane – it is robust, scalable, and very widely used §  It just needs to be zero-config, plug-and-play, rack-and-stack, which

is the proposal here See the demo in the booth! § We have prototype L-ARP client code for Linux servers and L-ARP

server code for Junos

Technology

NFV SDN Summit March 2014 D1 07 kireeti_kompella Native MPLS Fabric