MidoNet Deep Dive Pino de Candia

Technical Deep Dive into MidoNet

Page 1: Technical Deep Dive into MidoNet

MidoNet Deep Dive

Pino de Candia

Page 2: Technical Deep Dive into MidoNet

Agenda

1. Virtual Network Topology
2. Physical-Virtual Boundary
3. Cluster nodes (aka Network State DB)
4. Compute nodes
5. Flow Switch concept
6. Gateway nodes
7. How tunneling works
8. Change propagation and flow invalidation
9. L4 Flow State

Page 3: Technical Deep Dive into MidoNet

MidoNet transforms this...

[Figure: physical view. Bare metal servers and hosts full of VMs, all attached to an IP fabric.]

Page 4: Technical Deep Dive into MidoNet

into this...

[Figure: virtual view. VMs and bare metal servers attached to a virtual topology with firewalls (FW) and load balancers (LB), connected to the Internet/WAN.]

then moves packets...

Page 5: Technical Deep Dive into MidoNet

Virtual-Physical Boundary

“Port-Interface Bindings”

● Vport1 => Compute1, tap12345
● Vport2 => Compute2, tap67890
● Uplink1 => Gateway1, eth1

Bindings (and the virtual network topology) are stored in MidoNet’s cluster and propagated to the MidoNet Agents.
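The bindings above can be pictured as a small lookup table. The following is a minimal sketch, not MidoNet's data model: the `Binding` class and the `local_bindings` helper are hypothetical names, and only the three bindings listed on the slide are used as data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    """Where a virtual port lives physically (hypothetical structure)."""
    host: str
    interface: str


# The three bindings shown on the slide.
BINDINGS = {
    "Vport1": Binding("Compute1", "tap12345"),
    "Vport2": Binding("Compute2", "tap67890"),
    "Uplink1": Binding("Gateway1", "eth1"),
}


def local_bindings(bindings, host):
    """Return {interface: vport} for the virtual ports bound on one host."""
    return {b.interface: vport for vport, b in bindings.items() if b.host == host}
```

Under this assumption, an agent on Compute1 would only need `local_bindings(BINDINGS, "Compute1")` to know which local interfaces belong to the virtual topology.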

Page 6: Technical Deep Dive into MidoNet

Cluster stores and propagates topology

[Figure: physical view. Three cluster nodes (midonet cluster 1, 2 and 3) sit on the same IP fabric as the bare metal servers and VM hosts.]

Page 7: Technical Deep Dive into MidoNet

Port-Interface Bindings

[Figure: virtual view with three bound ports marked.
1: Vport1 => Compute1, tap12345
2: Vport2 => Compute2, tap67890
3: Uplink1 => Gateway1, eth1]

Page 8: Technical Deep Dive into MidoNet

Back to the physical view...

[Figure: physical view. Compute 1 and Compute 2 are labeled among the VM hosts; midonet cluster 1, 2 and 3 share the IP fabric with them.]

Page 9: Technical Deep Dive into MidoNet

Port-Interface Bindings in the Physical View

The compute hosts in a little more detail

[Figure: Compute 1 (IP1) and Compute 2 (IP2) each run a Flow Switch (in-kernel OVS) with a VXLAN tunnel port, and reach the IP fabric through eth0.
1: Vport1 => Compute1, tap12345 (port5 on Compute 1)
2: Vport2 => Compute2, tap67890 (port6 on Compute 2)]

Page 10: Technical Deep Dive into MidoNet

What is a flow switch?

[Figure: Compute 1 (IP1) runs a Flow Switch (in-kernel OVS) with port1, port2, port6 and port8, a VXLAN tunnel port and eth0. The MidoNet Agent (Java Daemon) runs in user space on the same host. Example packets: 10.0.0.4->10.0.0.5, 10.0.0.3->200.0.0.5, 10.0.0.3->10.10.0.2.]

Rule1: Match: in=6, srcIP=10.0.0.4 ➔ Actions: []

Rule2: Match: in=8, srcIP=10.0.0.3, dstIP=200.0.0.5, proto=TCP, srcPort=23109, dstPort=22 ➔ Actions: [srcIP=111.0.0.4, tunnel=[src=192.168.0.3, dst=192.168.0.4, key=100], out=1]

Rule3: Match: in=8, srcIP=10.0.0.3, dstIP=10.10.0.2, proto=ICMP ➔ Actions: [srcMAC=M1, dstMAC=M2, out=2]

Miss packets go to user-space via Netlink channel.

MidoNet can:
1. ignore it
2. send it back with actions
3. install a new flow rule
4. do both #2 and #3
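The match/miss behaviour described on this slide can be sketched in a few lines. This is a simplified model, not the OVS datapath or the MidoNet Agent: `FlowSwitch`, `agent` and the dictionary packet format are hypothetical, and the agent here always picks the last option from the list (send the packet back with actions and install a rule).

```python
class FlowSwitch:
    """Toy flow table: first matching rule wins, misses go to a handler."""

    def __init__(self, on_miss):
        self.rules = []          # list of (match dict, actions list)
        self.on_miss = on_miss   # stands in for the Netlink upcall to user space

    def install(self, match, actions):
        self.rules.append((match, actions))

    def process(self, packet):
        for match, actions in self.rules:
            # A rule matches when every field it names equals the packet's value.
            if all(packet.get(field) == value for field, value in match.items()):
                return actions
        return self.on_miss(self, packet)


misses = []


def agent(switch, packet):
    """Hypothetical user-space handler: install a rule and return its actions."""
    misses.append(packet)
    actions = [("out", 2)]
    switch.install(dict(packet), actions)
    return actions


switch = FlowSwitch(on_miss=agent)
# Rule1 from the slide: an empty action list drops the packet.
switch.install({"in": 6, "srcIP": "10.0.0.4"}, [])
```

In this sketch only the first packet of a flow reaches `agent`; later packets with the same fields hit the installed rule and never leave the switch.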

Page 11: Technical Deep Dive into MidoNet

Port-Interface Bindings

[Figure: virtual view with the uplink port marked.
3: Uplink1 => Gateway1, eth1]

Page 12: Technical Deep Dive into MidoNet

Detail of the Gateway Node

[Figure: Compute 1 (IP1) and Gateway 1 (IP3) each run a Flow Switch (in-kernel OVS) with a VXLAN tunnel port, and reach the IP fabric through eth0. Gateway 1 also runs Quagga (bgpd) and connects to the Internet/WAN/DC through eth1.
1: Vport1 => Compute1, tap12345 (port5 on Compute 1)
3: Uplink1 => Gateway1, eth1]

Page 13: Technical Deep Dive into MidoNet

Back to the physical view...

[Figure: physical view. midonet cluster 1, 2 and 3 and midonet gateway 1, 2 and 3 share the IP fabric with the VM hosts; the gateways also connect to the Internet/WAN/DC.]

Page 14: Technical Deep Dive into MidoNet

Detail of the Gateway Node - pre-installed flows

[Figure: Gateway 1 (IP3) runs Quagga (bgpd), the MidoNet Agent (Java Daemon) and a Flow Switch (in-kernel OVS) with port1, port2 and port3. port3 is veth0, paired with veth1. eth0 faces the IP fabric and eth1 faces the Internet/WAN/DC. Compute 1 (IP1) is shown as before.
1: Vport1 => Compute1, tap12345 (port5 on Compute 1)
3: Uplink1 => Gateway1, eth1]

Rule1: Match: in=2, srcIP=<Uplink1 Peer’s IP>, dstIP=<Uplink1’s IP>, proto=TCP, dstPort=BGP ➔ Actions: [out=3]

Rule2: Match: in=2, srcIP=<Uplink1 Peer’s IP>, dstIP=<Uplink1’s IP>, proto=TCP, srcPort=BGP ➔ Actions: [out=3]

Rule3: Match: in=3 ➔ Actions: [out=2]

Rule4: Match: in=2, ethertype=ARP, op=BOTH, srcIP=<Uplink1 Peer’s IP> ➔ Actions: [out=3, to-user-space]

Page 15: Technical Deep Dive into MidoNet

Flow rule computation and tunneling

● Flow rules are computed at the ingress host
● by simulating a packet’s path through the virtual topology
● without fetching any information off-box (~99% of the time)
● if the egress port is on a different host, then the packet is tunneled
● and the tunnel key encodes the egress port
● so that no computation is needed at the egress

MidoNet uses VNIs to encode Vports - NOT network segments.
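The last three bullets can be illustrated with a short sketch. It assumes the simulation has already produced an egress Vport; the tables, the VNI values (101, 102) and the function `actions_for` are hypothetical, and the tunnel port number 1 simply follows the `out=1` used in the example rules on the slides.

```python
# Hypothetical lookup tables an ingress host would hold locally.
VPORT_HOST = {"Vport1": "Compute1", "Vport2": "Compute2"}
HOST_IP = {"Compute1": "IP1", "Compute2": "IP2"}
VPORT_VNI = {"Vport1": 101, "Vport2": 102}   # one VNI per Vport, not per segment
LOCAL_PORT = {("Compute1", "Vport1"): 5, ("Compute2", "Vport2"): 6}
TUNNEL_PORT = 1


def actions_for(ingress_host, egress_vport):
    """Turn a simulation result (the egress Vport) into datapath actions."""
    egress_host = VPORT_HOST[egress_vport]
    if egress_host == ingress_host:
        # Egress is local: output directly on the bound interface.
        return [("out", LOCAL_PORT[(ingress_host, egress_vport)])]
    # Egress is remote: tunnel to that host, with the key naming the egress port.
    tunnel = ("tunnel", HOST_IP[ingress_host], HOST_IP[egress_host],
              VPORT_VNI[egress_vport])
    return [tunnel, ("out", TUNNEL_PORT)]
```

Because the key identifies the egress port rather than a network segment, the receiving host in this model only has to map the key to a local interface.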

Page 16: Technical Deep Dive into MidoNet

Pre-installed flows on the compute hosts

[Figure: Compute 1 (IP1) and Compute 2 (IP2) each run a Flow Switch (in-kernel OVS) with a VXLAN tunnel port and eth0 on the IP fabric; port1 is labeled on Compute 1. A packet ExtIP->VM1 crosses the fabric with outer header IP3 -> IP1 and key <VNI of VM1>, and is delivered as ExtIP->VM1.]

Rule1: Match: in=1, tunKey=<VNI of VM1> ➔ Actions: [out=2]

Rule2: Match: in=1, tunKey=<VNI of VM2> ➔ Actions: [out=3]

Rule3: Match: in=1, tunKey=<VNI of VM3> ➔ Actions: [out=4]

… and so on...
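The pre-installed rules follow a regular pattern: one rule per locally bound port, matching the tunnel port and that port's VNI. A minimal sketch of generating and using such rules follows; the function names and the VNI numbers are hypothetical, while the port numbers mirror Rule1 to Rule3.

```python
def preinstalled_flows(local_ports, tunnel_port=1):
    """Build one decap rule per local port.

    local_ports maps a VNI to the datapath port of the VM it identifies.
    """
    return [
        ({"in": tunnel_port, "tunKey": vni}, [("out", port)])
        for vni, port in sorted(local_ports.items())
    ]


def egress_port(flows, in_port, tun_key):
    """Look up the output port for a tunneled packet, or None on a miss."""
    for match, actions in flows:
        if match == {"in": in_port, "tunKey": tun_key}:
            return actions[0][1]
    return None


# Hypothetical VNIs for VM1, VM2 and VM3 on this host.
flows = preinstalled_flows({1001: 2, 1002: 3, 1003: 4})
```

With these rules in place, delivery at the egress host is a single table lookup and needs no simulation.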

Page 17: Technical Deep Dive into MidoNet

A flow between two VMs...

[Figure: virtual view. The packet's addresses are rewritten as it crosses the virtual topology; the labels shown are VM1->FIP1, FIP2->FIP1, FIP2->VIP1 and VIP1->VM2.]

Page 18: Technical Deep Dive into MidoNet

...is tunneled C1 to C2 (no middle compute nodes)

[Figure: Compute 1 (IP1) and Compute 2 (IP2), each with a Flow Switch (in-kernel OVS) and a VXLAN tunnel port. VM1->FIP1 enters Compute 1 on port5 (tap12345). The host network stack performs encapsulation, and VIP1->VM2 crosses the IP fabric with outer header IP1 -> IP2 and key <VNI of VM2>. On Compute 2 the host network stack performs decapsulation and VIP1->VM2 is delivered.]

New Rule: Match: in=5, srcIP=VM1, dstIP=FIP1, proto=TCP ➔ Actions: [srcIP=VIP1, dstIP=VM2, tunnel=[src=IP1, dst=IP2, key=<VNI of VM2>], out=1]

Page 19: Technical Deep Dive into MidoNet

A flow that exits an uplink...

[Figure: virtual view. The packet leaves VM1 as VM1->ExtIP1 and heads to the Internet/WAN as FIP1->ExtIP1.]

Page 20: Technical Deep Dive into MidoNet

...is tunneled C1 to L3GW node

[Figure: Compute 1 (IP1) and Gateway 1 (IP3), each with a Flow Switch (in-kernel OVS) and a VXLAN tunnel port on eth0. Gateway 1 also runs Quagga (bgpd) and reaches the Internet/WAN/DC through eth1. VM1->ExtIP1 enters Compute 1 on port5 (tap12345), crosses the IP fabric as FIP1->ExtIP1 with outer header IP1 -> IP3 and key <VNI of Uplink1>, and leaves Gateway 1 as FIP1->ExtIP1.]

New Rule: Match: in=5, srcIP=VM1, dstIP=ExtIP1, proto=TCP ➔ Actions: [srcIP=FIP1, dstIP=ExtIP1, tunnel=[src=IP1, dst=IP3, key=<VNI of Uplink1>], out=1]

Page 21: Technical Deep Dive into MidoNet

If an uplink fails...

[Figure: virtual view of the topology and its connection to the Internet/WAN.]

Page 22: Technical Deep Dive into MidoNet

...notify whoever needs to know

[Figure: physical view. midonet cluster 1, 2 and 3 and midonet gateway 1, 2 and 3 share the IP fabric with the VM hosts; the gateways also connect to the Internet/WAN/DC.]

Page 23: Technical Deep Dive into MidoNet

The receiving Agent invalidates related rules

[Figure: Compute 1 (IP1) with its Flow Switch (in-kernel OVS), VXLAN tunnel port, eth0, port1 and port5 (tap12345), and the MidoNet Agent (Java Daemon). The packet shown is VM1->ExtIP1.]

Uplink1 is Down

Invalidated rule: Match: in=5, srcIP=VM1, dstIP=ExtIP1, proto=TCP ➔ Actions: [srcIP=FIP1, dstIP=ExtIP1, tunnel=[src=IP1, dst=IP3, key=<VNI of Uplink1>], out=1]

If the flow is still active, its next packet misses in the flow switch and is sent to the MN Agent via Netlink, and the Agent can compute a new flow rule that doesn’t use the failed uplink.
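One way to find the "related rules" is to tag every installed rule with the virtual objects its simulation touched, then remove all rules carrying the tag of the object that changed. The sketch below assumes that tagging scheme; the `FlowTable` class and the tag names are hypothetical and are not taken from the slides.

```python
from collections import defaultdict


class FlowTable:
    """Toy flow table that remembers which tags each rule depends on."""

    def __init__(self):
        self.flows = {}                  # match (tuple) -> actions
        self.by_tag = defaultdict(set)   # tag -> set of matches

    def install(self, match, actions, tags):
        self.flows[match] = actions
        for tag in tags:
            self.by_tag[tag].add(match)

    def invalidate(self, tag):
        """Remove every rule that depends on `tag`; return how many were removed."""
        candidates = self.by_tag.pop(tag, set())
        removed = [m for m in candidates if self.flows.pop(m, None) is not None]
        return len(removed)


table = FlowTable()
table.install(("in=5", "VM1", "ExtIP1", "TCP"),
              ["srcIP=FIP1", "tunnel to IP3", "out=1"],
              tags={"Uplink1", "Vport1"})
table.install(("in=5", "VM1", "VM2", "TCP"),
              ["tunnel to IP2", "out=1"],
              tags={"Vport1", "Vport2"})
```

Calling `table.invalidate("Uplink1")` removes only the first rule, so the next packet of that flow misses and triggers a fresh computation while the VM-to-VM rule keeps forwarding.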

Page 24: Technical Deep Dive into MidoNet

If a flow had L4 state (SNAT)...

[Figure: virtual view. VM1->ExtIP1 is translated to FIP1->ExtIP1 on its way to the Internet/WAN.]

Page 25: Technical Deep Dive into MidoNet

The state is shared with return flow ingress(es)

[Figure: physical view. midonet cluster 1, 2 and 3 and midonet gateway 1, 2 and 3 share the IP fabric with the VM hosts; the gateways also connect to the Internet/WAN/DC.]

Page 26: Technical Deep Dive into MidoNet

...is tunneled C1 to L3GW node

[Figure: Compute 1 (IP1) runs a Flow Switch (in-kernel OVS) with a VXLAN tunnel port on eth0. VM1->ExtIP1 enters on port5 (tap12345) and crosses the IP fabric as FIP1->ExtIP1 with outer header IP1 -> IP3 and key <VNI of Uplink1>. Gateway 1 (IP3), Gateway 2 and Gateway 3 (the latter two labeled IP5 and IP6) each run Quagga (bgpd) and a Flow Switch (in-kernel OVS) with eth0, a tunnel port and eth1 toward the Internet/WAN/DC. Compute 1 also sends Flow State in tunnel packets labeled "IP1 -> IP2, Special VNI".]
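The idea of pushing L4 state to the hosts where the return flow may arrive can be sketched as follows. This is an illustration under stated assumptions, not MidoNet's wire format: the `Host` class, `share_snat_state`, and the port numbers 23109 and 40000 are hypothetical, and a plain method call stands in for the tunnel packet with the special VNI.

```python
class Host:
    """Toy host that stores SNAT state received from other hosts."""

    def __init__(self, name):
        self.name = name
        self.nat_state = {}   # return-flow key -> original (ip, port)

    def receive_state(self, key, original):
        self.nat_state[key] = original

    def reverse_snat(self, packet):
        """Undo SNAT on a return packet (src_ip, src_port, dst_ip, dst_port)."""
        original = self.nat_state.get(packet)
        if original is None:
            return None
        return (packet[0], packet[1], original[0], original[1])


def share_snat_state(forward, translated, peers):
    """Push the reverse mapping of one SNAT decision to every peer host."""
    # The return flow swaps source and destination of the translated packet.
    return_key = (translated[2], translated[3], translated[0], translated[1])
    original = (forward[0], forward[1])
    for peer in peers:
        peer.receive_state(return_key, original)


gateways = [Host("Gateway1"), Host("Gateway2"), Host("Gateway3")]
share_snat_state(forward=("VM1", 23109, "ExtIP1", 22),
                 translated=("FIP1", 40000, "ExtIP1", 22),
                 peers=gateways)
```

In this model any of the three gateways can translate the return packet back to VM1 without asking the ingress host, whichever uplink the reply arrives on.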

Page 27: Technical Deep Dive into MidoNet

Port’s packet pipeline in MN 5.0

Inbound: from wire -> Port Mirroring -> Service Redirection Chain -> Filtering Chain -> into device

Outbound: from device -> Filtering Chain -> Service Redirection Chain -> Port Mirroring -> onto wire to next port or end simulation
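The port pipeline is an ordered list of stages, traversed in one order for packets coming from the wire and in the reverse order for packets leaving the device. A minimal sketch, assuming each stage can either pass or drop a packet; `run_pipeline` and the handler convention (return `False` to drop) are hypothetical.

```python
# Stage order for packets arriving from the wire, as listed on the slide.
INBOUND = ["port mirroring", "service redirection chain", "filtering chain"]
# Packets leaving the device traverse the same stages in reverse.
OUTBOUND = list(reversed(INBOUND))


def run_pipeline(stages, handlers, packet):
    """Run a packet through the stages; return (stages visited, delivered?).

    handlers maps a stage name to a function; a handler that returns False
    drops the packet, and stages without a handler pass it through.
    """
    visited = []
    for stage in stages:
        visited.append(stage)
        handler = handlers.get(stage)
        if handler is not None and handler(packet) is False:
            return visited, False
    return visited, True
```

For example, `run_pipeline(OUTBOUND, {"filtering chain": lambda p: False}, {})` stops at the first outbound stage, before redirection or mirroring ever sees the packet.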

Page 28: Technical Deep Dive into MidoNet

Bridge packet pipeline in MN 5.0

from port -> Pre-forwarding Chain -> Forwarding Table -> Post-forwarding Chain -> to one or more ports

Page 29: Technical Deep Dive into MidoNet

Router packet pipeline in MN 5.0

from port -> Pre-forwarding Chain -> Routing Table -> Post-forwarding Chain -> to one or more ports

L4 LBaaS

Page 30: Technical Deep Dive into MidoNet

Security Groups are translated to Chains and Rules
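As an illustration of such a translation, the sketch below turns a list of allow rules into one chain of ACCEPT rules followed by a catch-all DROP, which reflects the allow-list nature of security groups. The field names, the chain naming scheme and the function `security_group_to_chain` are hypothetical and do not describe MidoNet's actual translation.

```python
def security_group_to_chain(name, sg_rules):
    """Translate allow-list security group rules into an ordered rule chain."""
    rules = [
        {
            "match": {
                "proto": rule["protocol"],
                "dstPort": rule["port"],
                "srcCIDR": rule["remote_cidr"],
            },
            "action": "ACCEPT",
        }
        for rule in sg_rules
    ]
    # Anything not explicitly allowed falls through to the final drop rule.
    rules.append({"match": {}, "action": "DROP"})
    return {"name": f"sg-{name}-ingress", "rules": rules}


chain = security_group_to_chain("web", [
    {"protocol": "tcp", "port": 22, "remote_cidr": "10.0.0.0/24"},
    {"protocol": "tcp", "port": 80, "remote_cidr": "0.0.0.0/0"},
])
```

A chain built this way could then be attached as a port's filtering chain, so the rules are evaluated in order on every simulated packet.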

Page 31: Technical Deep Dive into MidoNet

New in MN 5.0: L2 SFC API Objects

L2Insertion:
● inspected vm port UUID
● inspected vm MAC
● service port UUID
● vlan tag
● fail-open (true/false)
● position (relative to other insertions for the same inspected vm port)

L2Service:
● service port UUID
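The L2Insertion fields listed above map naturally onto a small record type. The sketch below is a hypothetical model, not the MidoNet API: the Python field names are invented, and the fail-open/fail-close behaviour (bypass a service that is down versus drop the traffic) is an assumption based on the usual meaning of those terms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class L2Insertion:
    port_id: str          # inspected vm port UUID
    mac: str              # inspected vm MAC
    srv_port_id: str      # service port UUID
    vlan: Optional[int]   # vlan tag
    fail_open: bool       # fail-open (true/false)
    position: int         # order among insertions on the same inspected port


def service_path(insertions, port_id, service_up):
    """Return the service ports traffic visits, or None if it must be dropped.

    service_up maps a service port UUID to True (up) or False (down).
    """
    mine = sorted((i for i in insertions if i.port_id == port_id),
                  key=lambda i: i.position)
    path = []
    for ins in mine:
        if service_up.get(ins.srv_port_id, False):
            path.append(ins.srv_port_id)
        elif not ins.fail_open:
            return None   # fail-close: a down service blocks the traffic
        # fail-open: skip the down service and continue
    return path


insertions = [
    L2Insertion("vm1-port", "aa:bb:cc:00:00:01", "sf2-port", None, True, 2),
    L2Insertion("vm1-port", "aa:bb:cc:00:00:01", "sf1-port", 10, False, 1),
]
```

Sorting by `position` keeps the chain order independent of the order in which the insertions were created.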

Page 32: Technical Deep Dive into MidoNet

1 protected VM, 1 SF

[Figure: VM1 (protected), VM2 and SF1.]

1 protected VM, SF down, fail-close

1 protected VM, SF down, fail-open