SWAN: Software-driven wide area network Ratul Mahajan

Partners in crime

Ratul Mahajan
Chi-Yao Hong
Vijay Gill
Srikanth Kandula
Ming Zhang
Roger Wattenhofer
Harry Liu
Xin Jin
Rohan Gandhi
Mohan Nanduri

Inter-DC WAN: A critical, expensive resource

[Figure: map of the inter-DC WAN with sites at Hong Kong, Seoul, Seattle, Los Angeles, New York, Miami, Dublin, and Barcelona]

But it is highly inefficient

One cause of inefficiency: Lack of coordination

Another cause of inefficiency: Local, greedy resource allocation

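Local, greedy allocation vs. globally optimal allocation

[Figure: two panels with labels A through H, one titled "Local, greedy allocation" and one titled "Globally optimal allocation"]

[Latency inflation with MPLS-based traffic engineering, IMC 2011]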

SWAN: Software-driven WAN

Goals
• Highly efficient WAN
• Flexible sharing policies

Key design elements
• Coordinate across services
• Centralize resource allocation

[Achieving high utilization with software-driven WAN, SIGCOMM 2013]

Software-defined networking: Primer

Networks today
• Beefy routers
• Control plane: distributed, on-board
• Data plane: indirect configuration

SDNs
• Streamlined switches
• Control plane: centralized, off-board
• Data plane: direct configuration

SWAN overview

[Figure: architecture diagram with the SWAN controller, service brokers, network agents, service hosts, and the WAN. Arrow labels: traffic demand, BW allocation, rate limiting, topology and traffic, network config.]

Key design challenges

Scalably computing BW allocations

Avoiding congestion during network updates

Working with limited switch memory

Scalably computing allocation

Goal: Prefer higher-priority traffic and max-min fair within a class

Challenge: Network-wide fairness requires solving many multi-commodity flow (MCF) problems

Approach: Bounded max-min fairness (fixed number of MCFs)

Bounded max-min fairness

[Figure: per-flow demands plotted against the allocation thresholds αU, α²U, α³U]

• Geometrically partition the demand space with parameters α and U
• Maximize throughput while limiting all allocations below αU
• Fix the allocation for smaller flows
• Continue until all flows fixed

Fairness bound: Each flow is within a factor of α of its fair rate
Number of MCFs: ⌈log_α(d_max / U)⌉, where d_max is the largest demand
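The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the SWAN implementation: it assumes a single shared link, so the per-step MCF reduces to handing out spare capacity under a per-flow cap. The function name `bounded_max_min`, the tolerance, and the example demands are hypothetical.

```python
def bounded_max_min(demands, capacity, alpha=2.0, U=1.0):
    """Approximate max-min fair allocation on ONE shared link.

    demands:  dict flow -> demand (assumption: all flows share one link)
    capacity: capacity of that link
    alpha, U: parameters of the geometric thresholds alpha*U, alpha^2*U, ...
    """
    alloc = {f: 0.0 for f in demands}
    active = set(demands)          # flows whose allocation is not yet fixed
    bound = U
    while active:
        bound *= alpha             # next threshold: alpha*U, alpha^2*U, ...
        free = capacity - sum(alloc.values())
        # Single-link stand-in for the MCF step: maximize throughput while
        # capping every unfixed flow at min(demand, bound).
        for f in sorted(active):
            extra = min(min(demands[f], bound) - alloc[f], free)
            alloc[f] += extra
            free -= extra
        # Fix every flow that stayed below the threshold (it is either
        # satisfied or bottlenecked); the rest continue to the next step.
        active = {f for f in active if alloc[f] >= bound - 1e-9}
    return alloc


if __name__ == "__main__":
    print(bounded_max_min({"a": 1, "b": 4, "c": 10}, capacity=8))
```

With these example inputs the exact max-min fair rates are 1, 3.5, and 3.5, while the sketch yields 1, 4, and 3, so every flow stays within a factor of α = 2 of its fair rate after only three rounds.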

SWAN computes fair allocations

In practice, only 4% of the flows deviate by more than 5% from their fair rate

[Figure: relative deviation (y-axis) for flows sorted by demand (x-axis), with one panel for SWAN and one for MPLS TE]

Congestion during network updates

Congestion-free network updates

Computing congestion-free update plan

Leave scratch capacity s on each link
• Ensures a plan with at most ⌈1/s⌉ − 1 steps

Find a plan with minimal number of steps using an LP
• Search for a feasible plan with 1, 2, …, max steps

Use scratch capacity for background traffic
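To make the role of scratch capacity concrete, here is a minimal Python sketch. It does not reproduce the LP search from the slide. Instead it uses plain linear interpolation between the old and new configurations, which is safe when both leave a fraction s of every link free and the move is split into ⌈1/s⌉ transitions (⌈1/s⌉ − 1 intermediate configurations). The configuration format and the names `transition_is_safe` and `interpolated_plan` are assumptions made for illustration.

```python
import math


def transition_is_safe(before, after, capacity):
    """True if no link can overload while flows switch in any order.

    before/after: dict (flow, link) -> load (hypothetical format)
    capacity:     dict link -> capacity
    During an asynchronous update each flow may still use its old load or
    already its new one, so the worst case per flow is the max of the two.
    """
    worst = {}
    for key in set(before) | set(after):
        _, link = key
        load = max(before.get(key, 0.0), after.get(key, 0.0))
        worst[link] = worst.get(link, 0.0) + load
    return all(load <= capacity[link] + 1e-9 for link, load in worst.items())


def interpolated_plan(old, new, scratch):
    """Configurations from old to new in ceil(1/scratch) equal transitions."""
    transitions = math.ceil(1 / scratch)
    keys = set(old) | set(new)
    plan = []
    for j in range(transitions + 1):
        t = j / transitions
        plan.append({k: (1 - t) * old.get(k, 0.0) + t * new.get(k, 0.0)
                     for k in keys})
    return plan


if __name__ == "__main__":
    capacity = {"L1": 10.0}
    old = {("A", "L1"): 7.5}   # 25% of L1 left as scratch
    new = {("B", "L1"): 7.5}
    print(transition_is_safe(old, new, capacity))   # direct switch
    plan = interpolated_plan(old, new, scratch=0.25)
    print(all(transition_is_safe(a, b, capacity)
              for a, b in zip(plan, plan[1:])))     # stepwise switch
```

In the example, switching directly could put 15 units on a 10-unit link, while each interpolated transition peaks at 9.375 units. An LP-based search can often find plans with fewer steps than this fixed schedule.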

SWAN provides congestion-free updates

[Figure: complementary CDFs of oversubscription ratio and of extra traffic (MB)]

Working with limited switch memory

• Install only the “working set” of paths
• Use scratch capacity to enable disruption-free updates to the set

SWAN comes close to optimal

[Figure: throughput relative to optimal for SWAN, SWAN w/o rate control, and MPLS TE]

Deploying SWAN in production

[Figure: WAN and data center diagrams for three stages: lab prototype, partial deployment, full deployment]

Key lesson from using SDNs

Loops in SDNs

Dependencies are worse than you might think

Ongoing work: Robust, fast network updates

Understand fundamental limits

Develop practical solutions

What are the minimal dependencies for a desired consistency property?

[On consistent updates in software-defined networks, HotNets 2013]

Network update pipeline

[Figure: pipeline of three stages (robust rule generation, dependency graph generation, update execution), annotated with routing policy, consistency property, and network characteristics]

Robust rule generation: Example

[Figure: traffic engineering (TE) configurations on four switches S1 to S4; capacity of links: 10]

(a) Initial TE. Demands: S4->S3: 10, S2->S3: 10, S1->S3: 0
(b) Unreliable new TE (TE-1). Demands: S4->S3: 10, S2->S3: 10, S1->S3: 10
(c) Congestion when S2 fails to update to TE-1. Demands: S4->S3: 10, S2->S3: 10, S1->S3: 10
(d) Reliable new TE (TE-2). Demands: S4->S3: 10, S2->S3: 10, S1->S3: 10
(e) No congestion when S2 fails to update to TE-2. Demands: S4->S3: 10, S2->S3: 10, S1->S3: 10

Robust rule generation

Goal: No congestion if any k or fewer switches fail to update

Challenge: Too many failure combinations

Approach: Use a sorting network to identify worst k failures – O(kn) constraints
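The goal above can be checked for a single link with a short Python sketch. It is an illustration under stated assumptions, not the sorting-network encoding from the slide: it assumes that a switch that fails to update keeps sending its old load, so the worst case with up to k failures adds the k largest "old minus new" surpluses to the planned load. The function name `worst_case_load` and the example loads are hypothetical.

```python
import heapq


def worst_case_load(new_load, old_load, k):
    """Worst-case load on one link if up to k switches keep their old rules.

    new_load/old_load: dict switch -> load that switch's traffic places on
    the link under the new/old configuration (hypothetical format).
    """
    planned = sum(new_load.values())
    # Extra load a switch adds to this link if it fails to update.
    surplus = [max(old_load.get(s, 0.0) - new_load.get(s, 0.0), 0.0)
               for s in set(new_load) | set(old_load)]
    # The adversarial choice of failures is simply the k largest surpluses.
    return planned + sum(heapq.nlargest(k, surplus))


if __name__ == "__main__":
    capacity = 10
    old = {"S2": 5, "S4": 5}
    fragile = {"S1": 5, "S4": 5}   # fills the link, no room for stale rules
    robust = {"S1": 5}             # leaves headroom for one stale switch
    print(worst_case_load(fragile, old, k=1) <= capacity)
    print(worst_case_load(robust, old, k=1) <= capacity)
```

Here the brute-force alternative would enumerate every subset of at most k switches. Picking the k largest surpluses avoids that enumeration, which is the same idea a sorting network lets an optimizer express with a number of constraints that grows as O(kn).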

Robust rule generation: Preliminary results

[Figure: normalized throughput (0 to 1) for max, k/n=10%, k/n=20%, k/n=30%, and worst-case]

Dependency graph generation

[Figure: dependency graph of update operations, with nodes labeled "Add @" and "Del @"]

Update execution

• Updates without parents can be applied right away
• Break any cycles and shorten long chains
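The first point can be illustrated with a small Python sketch that releases updates in rounds once all of their parents have completed. It only detects cycles and reports them. How cycles are broken or long chains shortened is not specified on the slide and is not modeled here. The function name `schedule_updates` and the update labels are hypothetical.

```python
def schedule_updates(parents):
    """Group updates into rounds that respect a dependency graph.

    parents: dict update -> set of updates that must complete first.
    Returns a list of rounds; every update in a round has no unfinished
    parent, so the whole round can be issued right away.
    """
    remaining = {u: set(p) for u, p in parents.items()}
    for deps in parents.values():
        for p in deps:
            remaining.setdefault(p, set())   # parents listed only as deps
    rounds = []
    while remaining:
        ready = sorted(u for u, deps in remaining.items() if not deps)
        if not ready:
            # Every remaining update waits on another one: a cycle.
            raise ValueError(f"dependency cycle among {sorted(remaining)}")
        rounds.append(ready)
        for u in ready:
            del remaining[u]
        for deps in remaining.values():
            deps.difference_update(ready)
    return rounds


if __name__ == "__main__":
    graph = {
        "add@S2": set(),
        "mod@S1": {"add@S2"},
        "del@S3": {"mod@S1"},
        "del@S4": {"mod@S1"},
    }
    for i, batch in enumerate(schedule_updates(graph), start=1):
        print(i, batch)
```

The number of rounds equals the length of the longest dependency chain, which is one way to see why shortening long chains speeds up an update.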

Update execution: Preliminary results

Summary

SWAN yields a highly efficient and flexible WAN
– Coordinated service transmissions and centralized resource allocation
– Bounded fairness for scalable computation
– Scratch capacities for “safe” transitions

Change is hard for SDNs
– Need to understand fundamental limits, develop practical solutions