Generic and Automatic Address Configuration for Data Center Networks

Kai Chen (1), Chuanxiong Guo (2), Haitao Wu (2), Jing Yuan (3), Zhenqian Feng (4), Yan Chen (1), Songwu Lu (5), Wenfei Wu (6)

(1) Northwestern University, (2) Microsoft Research Asia, (3) Tsinghua, (4) NUDT, (5) UCLA, (6) BUAA

SIGCOMM 2010, New Delhi, India

Motivation
Address autoconfiguration is desirable in networked systems
  Manual configuration is error-prone
    50%-80% network outages are due to manual configuration
  DHCP for layer-2 Ethernet autoconfiguration
Address autoconfiguration in data centers (DC) has become a problem
  Applications need locality information for computation
  New DC designs encode topology information for routing
  DHCP is not enough - no such locality/topology information

Research Problem

Given a new/generic DC, how to autoconfigure the addresses for all the devices in the network?

DAC: data center address autoconfiguration

Outline
  Motivation
  Research Problem
  DAC
  Implementation and Experiments
  Simulations
  Conclusion

DAC Input
Blueprint Graph (Gb)
  A DC graph with logical IDs
  Logical ID can be any format
  Available earlier and can be automatically generated
Physical Topology Graph (Gp)
  A DC graph with device IDs
  Device ID can be MAC address
  Not available until the DC is built and topology is collected
(a) Blueprint: each node has a logical ID (e.g. 10.0.0.3)
(b) Physical network topology: each device has a device ID (e.g. 00:19:B9:FA:88:E2)
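
As a minimal sketch of the two inputs, both graphs can be held as undirected adjacency maps whose node keys are logical IDs (Gb) or device IDs (Gp). The helper name `build_graph` and all IDs other than 10.0.0.3 and 00:19:B9:FA:88:E2 are hypothetical and chosen only for illustration; the slides do not prescribe a data structure.

```python
from collections import defaultdict


def build_graph(edges):
    """Return an undirected adjacency map {node_id: set(neighbor_ids)}."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return dict(adj)


# Blueprint graph Gb: nodes are logical IDs (any format; IP-like strings here).
# Only 10.0.0.3 appears on the slide; the other IDs are made up.
gb = build_graph([
    ("10.0.0.1", "10.0.0.2"),
    ("10.0.0.1", "10.0.0.3"),
])

# Physical topology graph Gp: nodes are device IDs (MAC addresses here).
# Only 00:19:B9:FA:88:E2 appears on the slide; the other MACs are made up.
gp = build_graph([
    ("00:19:B9:FA:88:E1", "00:19:B9:FA:88:E2"),
    ("00:19:B9:FA:88:E1", "00:19:B9:FA:88:E3"),
])
```

The two maps use unrelated key spaces on purpose: nothing in the input says which device corresponds to which logical ID, only the link structure is shared.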

DAC System Framework

Physical Topology Collection

Device-to-logical ID Mapping

Logical ID Dissemination

Malfunction Detection

Two Main Challenges
Challenge 1: Device-to-logical ID Mapping
  Assign a logical ID to a device, preserving the topological relationship between devices
Challenge 2: Malfunction Detection
  Detect the malfunctioning devices if the physical topology is not the same as blueprint (NP-complete and even APX-hard)

Roadmap

Physical Topology Collection

Device-to-logical ID Mapping

Logical ID Dissemination

Malfunction Detection

Device-to-logical ID Mapping
How to preserve the topological relationship?
  Abstract DAC mapping into the Graph Isomorphism (GI) problem
  The GI problem is hard: complexity (P or NPC) is unknown
Introduce O2: a one-to-one mapping for DAC
  O2 Base Algorithm and O2 Optimization Algorithm
  Adopt and improve techniques from graph theory

O2 Base Algorithm
Gb: {l1 l2 l3 l4 l5 l6 l7 l8}
Gp: {d1 d2 d3 d4 d5 d6 d7 d8}
  => Decomposition
Gb: {l1} {l2 l3 l4 l5 l6 l7 l8}
Gp: {d1} {d2 d3 d4 d5 d6 d7 d8}
  => Refinement
Gb: {l1} {l5} {l2 l3 l4 l6 l7 l8}
Gp: {d1} {d2 d3 d5 d7} {d4 d6 d8}

O2 Base Algorithm
Gb: {l1 l2 l3 l4 l5 l6 l7 l8}
Gp: {d1 d2 d3 d4 d5 d6 d7 d8}
  => Decomposition
Gb: {l5} {l1 l2 l3 l4 l6 l7 l8}
Gp: {d1} {d2 d3 d4 d5 d6 d7 d8}
  => Refinement
Gb: {l5} {l1 l2 l7 l8} {l3 l4 l6}
Gp: {d1} {d2 d3 d5 d7} {d4 d6 d8}
  => Refinement
Gb: {l5} {l1 l2 l7 l8} {l6} {l3 l4}
Gp: {d1} {d2 d3 d5 d7} {d6} {d4 d8}

O2 Base Algorithm

  => Refinement
Gb: {l5} {l6} {l1 l2} {l7 l8} {l3 l4}
Gp: {d1} {d6} {d2 d7} {d3 d5} {d4 d8}
  => Decomposition
Gb: {l5} {l6} {l1} {l2} {l7 l8} {l3 l4}
Gp: {d1} {d6} {d2} {d7} {d3 d5} {d4 d8}
  => Decomposition & Refinement
Gb: {l5} {l6} {l1} {l2} {l7} {l8} {l3} {l4}
Gp: {d1} {d6} {d2} {d7} {d3} {d5} {d4} {d8}
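
The decomposition/refinement walk-through above can be sketched as a small individualization-refinement search. This is a simplified illustration, not the O2 implementation: cells are represented as integer colours, refinement splits cells by the colours of each node's neighbours, and decomposition singles out one node pair and backtracks on failure. The names `refine` and `o2_map` are hypothetical, and both graphs are assumed to be adjacency maps whose values are sets.

```python
def refine(adj_b, adj_p, col_b, col_p):
    """Refinement: split cells by neighbour colours, both graphs in lockstep.

    Returns refined colourings, or None if the two partitions stop matching.
    """
    while True:
        sig_b = {u: (col_b[u], tuple(sorted(col_b[w] for w in adj_b[u])))
                 for u in adj_b}
        sig_p = {v: (col_p[v], tuple(sorted(col_p[w] for w in adj_p[v])))
                 for v in adj_p}
        # Matching cells must have identical signatures and sizes.
        if sorted(sig_b.values()) != sorted(sig_p.values()):
            return None
        names = {s: i for i, s in enumerate(sorted(set(sig_b.values())))}
        new_b = {u: names[sig_b[u]] for u in adj_b}
        new_p = {v: names[sig_p[v]] for v in adj_p}
        if len(names) == len(set(col_b.values())):  # no cell was split
            return new_b, new_p
        col_b, col_p = new_b, new_p


def o2_map(adj_b, adj_p, col_b=None, col_p=None):
    """Return a dict logical ID -> device ID preserving links, or None."""
    if len(adj_b) != len(adj_p):
        return None
    if col_b is None:  # start with a single cell per graph
        col_b = {u: 0 for u in adj_b}
        col_p = {v: 0 for v in adj_p}
    refined = refine(adj_b, adj_p, col_b, col_p)
    if refined is None:
        return None
    col_b, col_p = refined
    cells = {}
    for u, c in col_b.items():
        cells.setdefault(c, []).append(u)
    big = [c for c, members in cells.items() if len(members) > 1]
    if not big:
        # All cells are singletons: read off the mapping and verify links.
        by_colour = {c: v for v, c in col_p.items()}
        mapping = {u: by_colour[c] for u, c in col_b.items()}
        ok = all({mapping[w] for w in adj_b[u]} == adj_p[mapping[u]]
                 for u in adj_b)
        return mapping if ok else None
    # Decomposition: fix one node u of a non-singleton cell and try each
    # candidate v from the corresponding cell of the physical graph.
    cell = big[0]
    u = cells[cell][0]
    fresh = max(col_b.values()) + 1
    for v in [x for x, cx in col_p.items() if cx == cell]:
        trial_b = dict(col_b)
        trial_b[u] = fresh
        trial_p = dict(col_p)
        trial_p[v] = fresh
        result = o2_map(adj_b, adj_p, trial_b, trial_p)
        if result is not None:
            return result
    return None
```

Like the base algorithm on the slides, this sketch tries candidates in arbitrary order and refines with every cell, so it is only practical for small graphs.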

O2 Base Algorithm
O2 base algorithm is very slow because of 3 problems:
  P1: Iterative splitting in Refinement: it tries to use each cell to split every other cell iteratively
    Gp: π1 π2 π3 … πn-1 πn
  P2: Iterative mapping in Decomposition: when the current mapping fails, it iteratively selects the next node as a candidate for mapping
  P3: Random selection of mapping candidate: no explicit hint for how to select a candidate for mapping

O2 Optimization Algorithm
Heuristics based on DC topology features
  Sparse => Selective Splitting (for Problem 1)
    R1: A cell cannot split another cell that is disjoint with itself.
  Symmetric => Candidate Filtering via Orbit (for Problem 2)
    R2: If u in Gb cannot be mapped to v in Gp, then all nodes in the same orbit with u cannot be mapped to v either.
  Asymmetric => Candidate Selection via SPLD (Shortest Path Length Distribution) (for Problem 3)
    R3: Two nodes, u in Gb and v in Gp, cannot be mapped to each other if they have different SPLDs.
We propose the last one and adopt the first two from graph theory
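
A minimal sketch of rule R3, assuming the SPLD of a node is the histogram of shortest-path lengths from that node to every reachable node, computed here with a breadth-first search. The function names `spld` and `candidates` are hypothetical, and the graphs are assumed to be adjacency maps `{node: set(neighbors)}`.

```python
from collections import Counter, deque


def spld(adj, src):
    """Shortest Path Length Distribution of src: ((distance, node count), ...)."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return tuple(sorted(Counter(dist.values()).items()))


def candidates(adj_b, adj_p, u):
    """Devices in Gp that rule R3 still allows as a mapping target for u in Gb."""
    target = spld(adj_b, u)
    return [v for v in adj_p if spld(adj_p, v) == target]
```

In a highly symmetric topology many nodes share the same SPLD, so this filter narrows the candidate set rather than deciding the mapping on its own. Recomputing `spld` for every device on each call is wasteful; a real implementation would presumably cache the distributions.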

Speed of O2 Mapping

[Figure: O2 mapping time; values annotated on the chart: 12.4 hours, 8.9 seconds]

Roadmap

Physical Topology Collection

Device-to-logical ID Mapping

Logical ID Dissemination

Malfunction Detection

Malfunction Detection
Types of Malfunctions
  Node failure, Link failure, Miswiring
Effects of Malfunctions
  O2 cannot find device-to-logical ID mapping
Our Goal
  Detect malfunctioning devices
Problem Complexity
  An ideal solution
    1. Find Maximum Common Subgraph (MCS) between Gb and Gp, say Gmcs
    2. Remove Gmcs from Gp => the rest are malfunctions
  MCS is NP-complete and even APX-hard

Practical Solution
Observations
  Most node/link failures and miswirings cause a node degree change
  Special, rare miswirings happen without a degree change
Our Idea
  Degree change case: exploit the degree regularity in DC
    Devices in DC have regular degrees (common sense)
  No degree change case: probe sub-graphs derived from anchor points, and correlate the miswired devices using majority voting
    Select anchor point pairs from the 2 graphs
    Probe sub-graphs iteratively; stop when the k-hop subgraphs are isomorphic but the (k+1)-hop subgraphs are not, then increase the counters for the k- and (k+1)-hop nodes
    Output the node counter list: a higher counter means the node is more likely to be miswired
[Figure: sub-graphs grown hop by hop (1, 2, ..., k, k+1) from an anchor point pair; isomorphic up to k hops, non-isomorphic at k+1 hops]
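
The degree change case can be illustrated with a deliberately simplified check: collect the set of degrees that occur in the blueprint and flag any device whose observed degree is not in that set. This is an assumption-laden sketch, not the procedure from the talk; the function name `degree_suspects` and the toy IDs are hypothetical, and the anchor-point probing for miswirings without degree change is not covered.

```python
def degree_suspects(adj_b, adj_p):
    """Return devices in Gp whose degree never occurs in the blueprint Gb."""
    expected = {len(neighbors) for neighbors in adj_b.values()}
    return sorted(dev for dev, neighbors in adj_p.items()
                  if len(neighbors) not in expected)


# Blueprint: one switch with three servers (degrees 3 and 1).
blueprint = {"sw": {"s1", "s2", "s3"},
             "s1": {"sw"}, "s2": {"sw"}, "s3": {"sw"}}
# Physical topology: the link between the switch and h3 is missing.
physical = {"SW": {"h1", "h2"},
            "h1": {"SW"}, "h2": {"SW"}, "h3": set()}

suspects = degree_suspects(blueprint, physical)
```

Both endpoints of the missing link end up in `suspects`, which matches the observation that most failures and miswirings show up as a degree change.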

Simulations on Miswiring Detection

Over data centers with tens of thousands of devices
  With 1.5% nodes as anchor points to identify all hardest-to-detect miswirings

Roadmap

Physical Topology Collection

Device-to-logical ID Mapping

Logical ID Dissemination

Malfunction Detection

Basic DAC Protocols
CBP: Communication Channel Building Protocol
  Top-Down, from root to leaves
PCP: Physical Topology Collection Protocol
  Bottom-Up, from leaves to root
LDP: Logical ID Dissemination Protocol
  Top-Down, from root to leaves
DAC manager:
  1. handles all the intelligence
  2. can be any server in the network
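
The top-down and bottom-up directions can be sketched as a centralized simulation, assuming the communication channel is a breadth-first spanning tree rooted at the DAC manager and each device initially knows only its own links. The real protocols are distributed and exchange messages; the function names and the toy topology below are hypothetical.

```python
from collections import deque


def build_tree(adj, manager):
    """Top-down (CBP-like): grow a spanning tree from the manager outward."""
    parent = {manager: None}
    order = [manager]  # devices in the order they join the tree
    queue = deque([manager])
    while queue:
        x = queue.popleft()
        for y in sorted(adj[x]):
            if y not in parent:
                parent[y] = x
                order.append(y)
                queue.append(y)
    return parent, order


def collect_topology(adj, parent, order):
    """Bottom-up (PCP-like): each device forwards known links to its parent."""
    reports = {d: {frozenset((d, n)) for n in adj[d]} for d in order}
    # Reverse join order visits every device after all of its descendants.
    for d in reversed(order):
        if parent[d] is not None:
            reports[parent[d]] |= reports[d]
    return reports[order[0]]  # the manager now holds the whole edge set


adj = {"mgr": {"sw"}, "sw": {"mgr", "h1", "h2"}, "h1": {"sw"}, "h2": {"sw"}}
parent, order = build_tree(adj, "mgr")
edges = collect_topology(adj, parent, order)
```

Logical ID dissemination would then walk the same tree top-down again, handing each device the ID chosen for it by the mapping step.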

Implementation and Experiments
Over a BCube(8,1) network with 64 servers
  1. Communication Channel Building (CCB)
  2. Transition time
  3. Physical Topology Collection (TC)
  4. Device-to-logical ID Mapping
  5. Logical IDs Dissemination (LD)
The total time used: 275 milliseconds

Simulations
Over large-scale data centers (in milliseconds)
46 seconds for the DCell(6, 3) with 3.8+ million devices
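
The device counts quoted for DCell(6, 3) and BCube(8,1) can be reproduced with a short calculation. The formulas are the standard definitions from the DCell and BCube papers rather than anything stated on these slides: a DCell_k with t servers per DCell_(k-1) has t * (t + 1) servers, each n-server DCell_0 has one switch, and a BCube(n, k) has n^(k+1) servers. The function names are hypothetical.

```python
def dcell_servers(n, k):
    """Servers in DCell(n, k): t_0 = n, t_i = t_(i-1) * (t_(i-1) + 1)."""
    t = n
    for _ in range(k):
        t = t * (t + 1)
    return t


def dcell_devices(n, k):
    """Servers plus switches, with one switch per n-server DCell_0."""
    servers = dcell_servers(n, k)
    return servers + servers // n


def bcube_servers(n, k):
    """Servers in BCube(n, k)."""
    return n ** (k + 1)


print(dcell_devices(6, 3))  # 3807349
print(bcube_servers(8, 1))  # 64
```

For DCell(6, 3) this gives 3,263,442 servers and 543,907 switches, which is consistent with the "3.8+ million devices" figure.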

Summary
DAC: address autoconfiguration for generic data center networks, especially when the address is topology-aware
Graph isomorphism for address configuration
  275 ms for a 64-server BCube, and 46 s for a DCell with 3.8+ million devices
Anchor point probing for malfunction detection
  With 1.5% of nodes as anchor points to identify all hardest-to-detect miswirings
DAC is a small step towards the more ambitious goal of automanagement of whole data centers

Q & A? Thanks!