136
Integrating Network Management for Cloud Computing Services Final Public Oral Exam Peng Sun

Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Integrating Network

Management for Cloud

Computing Services

Final Public Oral Exam

Peng Sun

Page 2: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

What is Cloud Computing

• Deliver applications as services

over Internet

• Run applications in datacenters

• Lower capital and operational

expenses

1

Page 3: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Cloud Services are Growing

• Public cloud for consumer service• Amazon Web Services

• Microsoft Azure

• …

2

Page 4: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Cloud Services are Growing

• Public cloud for consumer service• Amazon Web Services

• Microsoft Azure

• …

• Enterprises out-source IT service• Storage: Box, …

• Analytics: Salesforce, …

• Productivity: Office365, …

2

Page 5: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Quality of Cloud Service

Depends on Network Quality

3

OS

App

Servers

ISP

Hardware

Routing

Network

Devices

ISP

ISP

Enterprise

Network

Datacenter

Page 6: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Improve Network Quality

• The old way cannot keep up with

the growth of cloud services

• Deploying more devices with higher

bandwidth

4

Page 7: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Improve Network Quality

• The old way cannot keep up with

the growth of cloud services

• Deploying more devices with higher

bandwidth

• The key is proper management

of network resources

4

Page 8: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Examples of Network

Management Solutions

5

Page 9: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Examples of Network

Management Solutions

5

Traffic Engineering

Page 10: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Examples of Network

Management Solutions

5

Traffic EngineeringLoad

Balancing

Page 11: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Examples of Network

Management Solutions

5

Traffic EngineeringLoad

BalancingLink Corruption Mitigation

Page 12: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Examples of Network

Management Solutions

5

Traffic EngineeringLoad

BalancingLink Corruption Mitigation

Network Device

Configuration

Page 13: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Examples of Network

Management Solutions

5

Traffic EngineeringLoad

BalancingLink Corruption Mitigation

Network Device

ConfigurationDevice Power Control

Page 14: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Examples of Network

Management Solutions

5

Traffic EngineeringLoad

BalancingLink Corruption Mitigation

Network Device

Configuration

……

Device Power Control

Page 15: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Problems of Current

Network Management

• Disjoint management of network

components

• Low-level interfaces with network

devices

6

Page 16: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Disjoint Management Systems

• Division of labor in pre-cloud era

7

OS

App

Servers

ISP

Hardware

Routing

Network Devices

Users

Page 17: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Disjoint Management Systems

• Division of labor in pre-cloud era

7

OS

App

Servers

ISP

Hardware

Routing

Network Devices

Users

Server

Provisioning

Page 18: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Disjoint Management Systems

• Division of labor in pre-cloud era

7

OS

App

Servers

ISP

Hardware

Routing

Network Devices

Users

Server

Provisioning

Traffic Engineering

Page 19: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Disjoint Management Systems

• Division of labor in pre-cloud era

7

OS

App

Servers

ISP

Hardware

Routing

Network Devices

Users

Server

Provisioning

Infrastructure

Traffic Engineering

Page 20: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Datacenter Breaks Balance

• Coordination of server & network:

• More applications are built as multi-

tier distributed systems

• Intra-datacenter traffic is new majority

8

Page 21: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Datacenter Breaks Balance

• Coordination of server & network:

• More applications are built as multi-

tier distributed systems

• Intra-datacenter traffic is new majority

• Infrastructure evolution speeds up:

• Network architecture changes (e.g.,

FatTree)

8

Page 22: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Disjoint Mgmt. is Bottleneck

• Because:

• Cloud service providers have stakes

in all three management areas:

• Server, infrastructure, traffic

9

Page 23: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Disjoint Mgmt. is Bottleneck

• Because:

• Cloud service providers have stakes

in all three management areas:

• Server, infrastructure, traffic

• Great opportunity exists for

consolidation

• Google and Microsoft on SoftWAN

9

Page 24: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Yet Another Problem:

Low-level Device Interaction

• Hardware vendors differentiate

with specialized devices

• Network operation:

• intensively uses vendor-specific APIs

• heavily depends on experiences

10

Page 25: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Cloud Service is Different

• Much more devices in datacenters than traditional networks• Automation is a must

11

Page 26: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Cloud Service is Different

• Much more devices in datacenters than traditional networks• Automation is a must

• Commodity devices instead of specialized hardware• Homogeneity is preferred

11

Page 27: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Cloud Service is Different

• Much more devices in datacenters than traditional networks• Automation is a must

• Commodity devices instead of specialized hardware• Homogeneity is preferred

• Lower vendor dependence pays off (MS & Amazon on SoftLB)

11

Page 28: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Promising Yet Limited SDN

• Great way of automating traffic

management with high-level

programming paradigms

• Yet literature focus on just ‘traffic’

12

Page 29: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Summary of Problems

• This dissertation solves:

• Disjoint management of server,

infrastructure, and traffic

• Low-level device interaction for

broader scope of network

management

13

Page 30: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Practical Approach

14

Page 31: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Practical Approach

14

Identify opportunity of

integration

Page 32: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Practical Approach

14

Simple and expressive

programming abstraction

Identify opportunity of

integration

Page 33: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Practical Approach

14

Simple and expressive

programming abstraction

Efficient and scalable

system for execution

Identify opportunity of

integration

Page 34: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Practical Approach

14

Simple and expressive

programming abstraction

Efficient and scalable

system for execution

Deployment in real world

Identify opportunity of

integration

Page 35: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Practical Approach

14

Simple and expressive

programming abstraction

Efficient and scalable

system for execution

Deployment in real world

Identify opportunity of

integration

Page 36: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Project Publication Deployment

Statesman SIGCOMM’14 Microsoft Azure

HONEJNSM, Vol. 23,

2015

Overture/

Verizon

Sprite SOSR’15In process with

OIT

Contributions

15

Page 37: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Statesman: Safe Datacenter

Traffic/Infrastructure Management

16

OS

App

Servers

Hardware

Routing

Network

Devices

Datacenter

Enterprises

ISP ISP

Other ISPs

Page 38: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Hone: End-host/Network

Cooperative Traffic Management

17

OS

App

Servers

Hardware

Routing

Network

Devices

Datacenter

Enterprises

ISP ISP

Other ISPs

Page 39: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Sprite: Direct Control of

Entrant ISP for Enterprise Traffic

18

OS

App

Servers

Hardware

Routing

Network

Devices

Datacenter

Enterprises

ISP ISP

Other ISPs

Page 40: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

What Follows

• Brief explanation of each project

• Open issues

• Related work

• Q&A

19

Page 41: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Statesman:

Integrating Network

Infrastructure Management

20

Page 42: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Problem for Cloud Providers

• Multiple mgmt. solutions coexist

• for traffic and infrastructure mgmt.

Page 43: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Problem for Cloud Providers

• Multiple mgmt. solutions coexist

• for traffic and infrastructure mgmt.

• Complexity is the main problem

Page 44: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Problem for Cloud Providers

• Multiple mgmt. solutions coexist

• for traffic and infrastructure mgmt.

• Complexity is the main problem

• Development

• Scale & heterogeneity of devices

Page 45: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Problem for Cloud Providers

• Multiple mgmt. solutions coexist

• for traffic and infrastructure mgmt.

• Complexity is the main problem

• Development

• Scale & heterogeneity of devices

• Coordination

• Conflicts and safety violations

Page 46: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Agg

A

ToRs

Agg

B

Core1 2

Conflict

22

Page 47: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Agg

A

ToRs

Agg

B

Core1 2

Conflict

22

Link-corruption-

mitigation adjusts

traffic away from

Core1

Page 48: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Agg

A

ToRs

Agg

B

Core1 2

Conflict

22

Link-corruption-

mitigation adjusts

traffic away from

Core1

TE tunes traffic

among links to

Core1, 2

Page 49: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Agg

A

ToRs

Agg

B

Core1 2

Safety Violation

23

Page 50: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Agg

A

ToRs

Agg

B

Core1 2

Safety Violation

23

Link-corruption-

mitigation shuts

down faulty Agg A

Page 51: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Agg

A

ToRs

Agg

B

Core1 2

Safety Violation

23

Link-corruption-

mitigation shuts

down faulty Agg A

Firmware-upgrade

schedules Agg B

to upgrade

Page 52: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

The Statesman System

• Network operating system• Common layer to consolidate traffic

and infrastructure management

• Resolve conflicts & safety violations

24

Page 53: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

The Statesman System

• Network operating system• Common layer to consolidate traffic

and infrastructure management

• Resolve conflicts & safety violations

• Core techniques:• Network state & three-view workflow

• State dependency model

• Scalable & robust system

24

Page 54: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Three-view Workflow with

Network State

25

Observed

State

Target

StateProposed

State

Page 55: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Three-view Workflow with

Network State

25

Observed

State

Target

StateProposed

State

ApplicationApplicationSolutions

Page 56: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Three-view Workflow with

Network State

25

Observed

State

Target

StateProposed

State

What we

see from

the network

ApplicationApplicationSolutions

Page 57: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Three-view Workflow with

Network State

25

Observed

State

Target

StateProposed

State

What we

see from

the network

What we want

the network

to be

ApplicationApplicationSolutions

Page 58: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Three-view Workflow with

Network State

25

Observed

State

Target

StateProposed

State

What we

see from

the network

What we want

the network

to be

What can be

actually done

on the network

StatesmanApplicationApplication

Solutions

Page 59: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

State Dependency for

Conflict Detection

26

Device

Link

Page 60: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

State Dependency for

Conflict Detection

26

PowerStateDevice

Link

Page 61: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

State Dependency for

Conflict Detection

26

PowerState

FirmwareVersion

Device

Link

Page 62: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

State Dependency for

Conflict Detection

26

PowerState

FirmwareVersion

ConfigurationState

Device

Link

Page 63: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

State Dependency for

Conflict Detection

26

PowerState

FirmwareVersion

ConfigurationState AdminState

ConfigurationState

Device

Link

Page 64: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

State Dependency for

Conflict Detection

26

PowerState

FirmwareVersion

ConfigurationState

RoutingState

AdminState

ConfigurationState

Device

Link

Page 65: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

State Dependency for

Conflict Detection

26

PowerState

FirmwareVersion

ConfigurationState

RoutingState

AdminState

ConfigurationState

PathState

Device

Link

Page 66: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Statesman System

27

Target

State

Proposed

State

Observed

State

Page 67: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Statesman System

27

Target

State

Proposed

State

Observed

State

Storage Service

Page 68: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Statesman System

27

Target

State

Monitor

Proposed

State

Observed

State

Storage Service

Page 69: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Statesman System

27

Target

State

Monitor

Checker

Proposed

State

Observed

State

Storage Service

Page 70: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Statesman System

27

Target

State

Monitor Updater

Checker

Proposed

State

Observed

State

Storage Service

Page 71: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Deployment Overview

• Operational in Microsoft Azure

since October 2013

• Cover 10 DCs of 20K devices

• 3 production management

solutions built and running

28

Page 72: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Case: Resolve ConflictInter-DC TE &

Firmware-upgrade

29

BR 1

BR 2

DC 1

BR 8

BR 7

DC 4

BR 3BR 4

DC 2

BR 5

DC 3

BR 6

DC = Data Center

BR = Border Router

Page 73: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

30

……

……

Page 74: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

30

……

……

Page 75: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

30

Firmware-upgrade

acquires lock of BR1

……

……

Page 76: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

30

TE fails to acquire lock,

and moves traffic away

……

……

Page 77: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

30

TE fails to acquire lock,

and moves traffic away

……

……

Page 78: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

30

BR1 firmware

upgrade starts

……

……

Page 79: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

30

BR1 firmware

upgrade starts

BR1 firmware upgrade

ends. Lock released.

……

……

Page 80: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

30

BR1 firmware

upgrade starts

TE re-acquires lock, and

moves traffic back

……

……

Page 81: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

30

BR1 firmware

upgrade starts

TE re-acquires lock, and

moves traffic back

……

……

Page 82: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Statesman Summary

• Programming abstraction

• Three-view network state model

• Efficient and scalable system

• Automatic and safe infrastructure management system

• Deployment

• Operational in Microsoft Azure worldwide

• SIGCOMM 2014

31

Page 83: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Hone:

Combining End-host and

Network for Traffic Management

32

Page 84: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Problem

• Traffic management is limited by

the scope of network devices

• Coarse granularity & limited view

33

Server

App

OS Network

Traffic Management Solutions

Page 85: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

The Hone System

• Bring end hosts into traffic mgmt.

34

Page 86: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

The Hone System

• Bring end hosts into traffic mgmt.

• Core techniques:

• Access of socket & transport layers

34

Page 87: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

The Hone System

• Bring end hosts into traffic mgmt.

• Core techniques:

• Access of socket & transport layers

• Expressive three-stage programming

framework

34

Page 88: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

The Hone System

• Bring end hosts into traffic mgmt.

• Core techniques:

• Access of socket & transport layers

• Expressive three-stage programming

framework

• Efficient system with parallel execution

of partitioned program

34

Page 89: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Programming Model

• Framework for each stage

• Programmable body of each stage

• Focus on measurement and analysis

35

Measure

StageAnalyze Stage Control Stage

Page 90: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Hone System

36

Network

Host

Server OS

App

Page 91: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Hone System

36

Network

Host

Server OS

Hone

AgentApp

Page 92: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Hone Runtime System

Hone System

36

Network

Host

Server OS

Hone

AgentApp

Controller

Mgmt.

Program

Page 93: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Hone Runtime System

Hone System

36

Network

Host

Server OS

Hone

AgentApp

Controller

Mgmt.

Program

Page 94: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Hone Runtime System

Hone System

36

Network

Host

Server OS

Hone

AgentApp

Controller

Global Exe. PlanNet Local

Exe. Plan

Host Local

Exe. Plan

Page 95: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Hone Runtime System

Hone System

36

Network

Host

Server OS

Hone

AgentApp

Controller

Global Exe. Plan

Net Local

Exe. Plan

Host Local

Exe. Plan

Page 96: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Evaluation

• Built multiple traffic management

solutions on Hone

• Show ‘distributed rate limiting’ here

• Limit total bandwidth across all

instances of a tenant in a public cloud

37

Page 97: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

DRL on HONE

• Host-side execution:

• Measure

• Calculate throughput

• Aggregate among connections

• Controller-side execution:

• Aggregate among hosts

• Generate new rate-limit policies

38

Page 98: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Distributed Rate Limiting

39

Page 99: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Distributed Rate Limiting

39

Page 100: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Distributed Rate Limiting

39

Page 101: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Distributed Rate Limiting

39

Page 102: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Distributed Rate Limiting

39

Page 103: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Hone Summary

• Programming abstraction• Access to fine-grained data in servers• Three-stage framework

• Efficient and scalable runtime

• Deployment• Integrated into product of Overture for

Verizon Business Cloud

• Springer JNSM Volume 23, 2015

40

Page 104: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Sprite:

Bridging Enterprise and ISP

for Inbound Traffic Control

41

Page 105: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Problem for Enterprises:

Inbound Traffic Engineering

42

ISP 1

ISP 2

ISP 3

Enterprise

NetworkInternet

Service 2

Service 1

Page 106: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Problem for Enterprises:

Inbound Traffic Engineering

42

ISP 1

ISP 2

ISP 3

Enterprise

NetworkInternet

Service 2

Page 107: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Problem for Enterprises:

Inbound Traffic Engineering

42

ISP 1

ISP 2

ISP 3

Enterprise

NetworkInternet

Service 2

Page 108: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Mechanism of Sprite

43

ISP 1

ISP 2

Other

ISPs

YouTube

VoIP

User

Enterprise Network

Page 109: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Mechanism of Sprite

43

ISP 1

ISP 2

Other

ISPs

YouTube

VoIP

User

1.1.1.0/24

1.1.2.0/24

Enterprise Network

Page 110: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Mechanism of Sprite

43

ISP 1

ISP 2

Other

ISPs

YouTube

VoIP

User

1.1.1.0/24

1.1.2.0/24

10.1.1.2

Enterprise Network

Page 111: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Mechanism of Sprite

43

ISP 1

ISP 2

Other

ISPs

YouTube

VoIP

User

1.1.1.0/24

1.1.2.0/24

10.1.1.2

Enterprise Network

Src:1.1.1.2

Page 112: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Mechanism of Sprite

43

ISP 1

ISP 2

Other

ISPs

YouTube

VoIP

User

1.1.1.0/24

1.1.2.0/24

10.1.1.2

Enterprise Network

Src:1.1.1.2

Src:1.1.2.2

Page 113: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Challenges

• Naïve solution• All done by the border router

44

Page 114: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Challenges

• Naïve solution• All done by the border router

• Need a distributed solution

44

Page 115: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Challenges

• Naïve solution• All done by the border router

• Need a distributed solution• Control-plane scaling

44

Page 116: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Challenges

• Naïve solution• All done by the border router

• Need a distributed solution• Control-plane scaling

• Data-plane scaling

44

Page 117: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Challenges

• Naïve solution• All done by the border router

• Need a distributed solution• Control-plane scaling

• Data-plane scaling

• Need a simple management interface

44

Page 118: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Three Tiers of Abstractions

45

Abstraction Example

Page 119: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Three Tiers of Abstractions

45

High-level

Policy

Abstraction Example

Page 120: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Three Tiers of Abstractions

45

High-level

Policy

Abstraction

<BioDept, Analytic> BestLatency

Example

Page 121: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Three Tiers of Abstractions

45

High-level

Policy

Network-

level Rule

Abstraction

<BioDept, Analytic> BestLatency

Example

Page 122: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Three Tiers of Abstractions

45

High-level

Policy

Network-

level Rule

Abstraction

<BioDept, Analytic> BestLatency

<10.1.0.0/16, 184.168.1.0/24> ISP1

Example

Page 123: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Three Tiers of Abstractions

45

High-level

Policy

Network-

level Rule

Abstraction

<BioDept, Analytic> BestLatency

<10.1.0.0/16, 184.168.1.0/24> ISP1

Example

Page 124: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Three Tiers of Abstractions

45

High-level

Policy

Network-

level Rule

Per-flow

Rule

Abstraction

<BioDept, Analytic> BestLatency

<10.1.0.0/16, 184.168.1.0/24> ISP1

Example

Page 125: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Three Tiers of Abstractions

45

High-level

Policy

Network-

level Rule

Per-flow

Rule

Abstraction

<BioDept, Analytic> BestLatency

<10.1.0.0/16, 184.168.1.0/24> ISP1

<10.1.1.42,184.168.1.17>

SNAT to 1.1.1.2

Example

Page 126: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Case: Dynamic

Perf-driven Balancing

• Move users’ connections among

ISPs for best per-user performance

• Live Internet experiment on AWS

VPC and PEERING

• 10 users watch movies on YouTube

46

Page 127: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

User YouTube Throughput

47

Page 128: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Sprite Summary

• Direct and fine-grained inbound traffic control

• Scalable solution• Scaling control and data planes

• Evaluation• In collaboration with OIT

• SOSR’15

48

Page 129: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Open Issue #1

Statesman + Hone

• Merge Hone into Statesman

• Joint server, traffic, infrastructure mgmt.

• Possible exploration

• State abstraction for server data

• Dependency of server and network

states

• Safety invariants involving servers

49

Page 130: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Open Issue #2

Transactional Statesman

• Transactional semantics for conflict

resolution

• Possible exploration

• Grouping semantics

• Condition semantics

50

Page 131: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Open Issue #3

Hone for Multi-tenant Cloud

• Loosen the assumption of having

access to the hosts’ OS

• Possible exploration

• Infer stats in guest OS from data in

hypervisor

51

Page 132: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Related work

52

Page 133: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Statesman Related Work

53

Work What they do Statesman

SDN worksCentralized control of

flows

Wider spectrum of

management

applications

OnixSingle repository of

network stats

Three-view network

state model

PyreticTarget at flow

management

Wider spectrum of

applications

CorybanticTight cross-application

proposal evaluationLoose coupling

FlowVisor Virtual topology slicing Network state model

Page 134: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Hone Related Work

54

Work What they do Hone

Network

Exception

Handler

Use hosts only as

software switches

Go deeper into

socket layer

Gigascope Extend SQLUse functional

language to construct

ChimeraSpecific application

(intrusion detection)

Support various

management

solutions

MapReduceNaturally

parallelizable data

Data inherently

associated with

collection point

Page 135: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

Sprite Related Work

55

Work What they do Sprite

BGP AS-path prepending

Tune BGP

configurations to influence neighboring

ASes

Direct control from inside enterprise

networks

Works on

Internet re-

architecture

Clean-slate design in

routing or hostsIncrementally

deployable

Tunnel-basedworks

Two ends cooperate to control traffic

Only need the client side to act

Page 136: Integrating Network Management for Cloud Computing Servicesjrex/thesis/peng-sun-talk.pdf · 2015. 5. 13. · Integrating Network Management for Cloud Computing Services Final Public

56

Thanks!

Q&A