
    Extreme Networks White Paper

    © 2011 Extreme Networks, Inc. All rights reserved. Do not reproduce.

    Make Your Network Mobile

    Abstract

The broad adoption of virtualization has led to a flurry of server consolidation projects. IT administrators are looking to push the envelope when it comes to how many virtual servers, or Virtual Machines (VMs), can be packed on a single physical server. This is a disruptive change and impacts traditional network architectures and best practices in many ways. This white paper examines the challenges and the different architectural approaches to meet bandwidth, redundancy and resiliency requirements from the server edge to the core of the network in a virtualized environment.

Exploring New Data Center Network Architectures with Multi-Switch Link Aggregation (M-LAG)


    Introduction

The broad adoption of virtualization has led to a flurry of server consolidation projects. IT administrators are looking to push the envelope when it comes to how many virtual servers, or Virtual Machines (VMs) as they are commonly referred to, can be packed on a single physical server. This is a disruptive change and impacts traditional network architectures and best practices in many ways. One direct consequence of higher server virtualization ratios is that as more VMs are packed on a single server, the bandwidth demands from the server edge, all the way to the core of the network, are growing at a rapid pace. Additionally, with more virtual machines on a single server, the redundancy and resiliency requirements from the server edge to the core of the network are increasing.

Traditionally, the approach to increasing bandwidth from the server to the network edge has been to add more Network Interface Cards (NICs) and use Link Aggregation (LAG), or "NIC teaming" as it is commonly called, to bond links to achieve higher bandwidth. If any of the links in the group of aggregated links fails, the traffic load is redistributed among the remaining links. Link aggregation thus provides a simple way to both increase bandwidth and add resiliency, and it is also commonly used between two switches for the same purpose. However, in both cases, link aggregation works only between two individual devices, for example switch to switch, or server to switch. If any one of the devices on either end of the link aggregated group (or trunk, as it is also called) fails, then there is complete loss of connectivity.
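The hash-and-redistribute behavior described above can be sketched in a few lines. This is a toy illustration, not a real switch implementation: a flow tuple is hashed to pick one member link (so all packets of a flow stay on one link and ordering is preserved), and on a link failure the same hash is simply applied over the surviving links.

```python
import zlib

def select_link(flow_tuple, active_links):
    """Hash a flow onto one member link of a LAG.

    Real switches hash a configurable mix of L2-L4 header fields;
    CRC32 over a MAC pair stands in for that here.
    """
    if not active_links:
        raise RuntimeError("all links in the LAG are down")
    key = "|".join(flow_tuple).encode()
    return active_links[zlib.crc32(key) % len(active_links)]

# A four-link aggregated group (link names are illustrative).
links = ["eth1", "eth2", "eth3", "eth4"]
flow = ("00:11:22:33:44:55", "66:77:88:99:aa:bb")  # (src MAC, dst MAC)

primary = select_link(flow, links)

# If the chosen link fails, the flow is redistributed over the rest.
surviving = [l for l in links if l != primary]
failover = select_link(flow, surviving)
```

Note that the hash is deterministic per flow, which is what keeps a flow's packets in order across the group.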

In order to add device-level redundancy, various other mechanisms have been deployed. Where Layer 3 routing and segmentation is deployed in the network, router redundancy protocols such as VRRP, in conjunction with interior gateway protocols such as OSPF, provide adequate resiliency, failover and redundancy. However, with virtualization driving the need for "flatter" Layer 2 topologies (since virtual machine movement today is typically restricted to within a subnet boundary), the drive towards a broader, flatter Layer 2 data center network is gaining momentum. In this environment, protocols such as the Spanning Tree Protocol have typically provided redundancy around both link and device failures. Spanning Tree Protocol works by blocking ports on redundant paths so that all nodes in the network are reachable through a single path. If a device or link failure occurs, the spanning tree algorithm opens up a selective redundant path or paths to allow traffic to flow, while still reducing the topology to a tree structure that prevents loops. Spanning Tree Protocol can be used in combination with link aggregation, where links between two nodes, such as switch-to-switch connections, are aggregated to increase bandwidth and resiliency between devices. Spanning tree typically treats the aggregated link as a single logical port in its calculations to come up with a loop-free topology.

While spanning tree has served for many years as the de facto network redundancy protocol, the changing requirements of data center networks are forcing a re-examination of the choice of redundancy mechanisms. For example, one of the drawbacks of Spanning Tree Protocol is that by blocking redundant ports and paths, it significantly reduces the available bandwidth, i.e. the bandwidth available on the redundant paths goes unused until a failure occurs. Additionally, in many situations the choice of which ports to block can lead to a suboptimal path of communication between end nodes by forcing traffic to go up and down the spanning tree. See Figure 1 below. Finally, the time taken to recompute the spanning tree and propagate the changes in the event of a failure can vary as well.
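The bandwidth loss is easy to quantify on a toy topology. The sketch below is illustrative only (switch names are hypothetical): it builds a breadth-first spanning tree over a small looped fabric and counts how many physical links end up blocked and idle.

```python
from collections import deque

def spanning_tree(links, root):
    """Return the set of links a BFS spanning tree keeps active.

    A stand-in for STP's tree computation; real STP elects a root
    bridge and picks ports by path cost, but the outcome is the
    same kind of loop-free tree.
    """
    adj = {}
    for a, b in links:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    kept, seen, queue = set(), {root}, deque([root])
    while queue:
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in seen:
                seen.add(nbr)
                kept.add(frozenset((node, nbr)))
                queue.append(nbr)
    return kept

# Two aggregation and two access switches, redundantly meshed: 5 links.
links = [("agg1", "acc1"), ("agg1", "acc2"),
         ("agg2", "acc1"), ("agg2", "acc2"),
         ("agg1", "agg2")]
kept = spanning_tree(links, "agg1")
blocked = len(links) - len(kept)
```

With four nodes, any spanning tree keeps exactly three links, so two of the five physical links (40% of the installed capacity) sit idle until a failure occurs.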

Figure 1: A multi-switch topology with LAG links between nodes. Spanning tree blocks redundant ports ("STP Block"), forcing the traffic path up and down the tree.


Multi-Switch Link Aggregation (M-LAG)

A number of new protocols and approaches have been suggested to address some of the shortcomings of Spanning Tree Protocol. One approach to addressing both the performance and the resiliency requirements of these highly virtualized data centers is to extend the link-level redundancy capabilities of link aggregation and add support for device-level redundancy. This can be accomplished by allowing one end of the link aggregated port group to be dual-homed into two different devices to provide device-level redundancy, while the other end of the group is still single-homed into a single device. See Figure 2 below.

In Figure 2, Device 1 treats the link aggregated ports as a normal link aggregated trunk group, i.e. it does not see anything different. Traffic from Device 1 is distributed across the ports in the group using traditional link aggregation algorithms, which typically hash the traffic across the ports using a variety of hashing algorithms. If one of the links in the group were to go down, traffic would get redistributed across the remaining ports in the group. However, the other end of the link aggregated group is where things now function differently. Device 2 and Device 3 work together to create the perception of a common link aggregated group, so that Device 1 does not see anything different from a link aggregation perspective, even though the link aggregated ports are now distributed across Device 2 and Device 3, hence the term Multi-Switch Link Aggregation (M-LAG). Device 2 and Device 3 communicate information to each other over the Inter Switch Link (ISL) so that forwarding, learning and bridging work consistently without causing any loops. The ISL itself can be a regular LAG. If either the link to Device 1 from Device 2 or Device 3 were to go down, or if Device 2 or Device 3 itself went down, traffic would be forwarded across the remaining link/device, thus providing both link-level and device-level redundancy.

The intelligence that allows the ports on Device 2 and Device 3 to present themselves as a single link aggregated trunk group to Device 1 is today implemented using mostly proprietary mechanisms, i.e. M-LAG technology is still largely proprietary. However, the proprietary nature of the technology is confined to the layer which presents itself as a distributed link aggregated group, specifically Device 2 and Device 3 in the figure below, both of which should come from the same vendor. Device 1 does not participate in this proprietary protocol; in fact, Device 1 may come from a different vendor and can even be a different type of device. For example, Device 1 can be a server with dual NICs teamed together, while Device 2 and Device 3 can be Ethernet switches from a single vendor. M-LAG can work at different layers, from the access to the core of the network, and can be used in conjunction with traditional link aggregation to increase bandwidth as well as add link-level redundancy between devices. See Figure 3 below.

Figure 2: Device 1 runs a standard LAG; Device 2 and Device 3, connected by an ISL, together present the M-LAG.

Figure 3: M-LAG (Device 2 & Device 3) used in conjunction with traditional LAGs (Device 1 and Device 2).
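The active-active and failover behavior of Figures 2 and 3 can be modeled in a few lines. This is a conceptual sketch only; the class and port names are hypothetical, and a real M-LAG implementation synchronizes far more state over the ISL (MAC tables, a shared LACP system identity, and so on).

```python
class MlagPeer:
    """One of the two switches jointly presenting the M-LAG."""

    # Both peers advertise the same system identity, so Device 1
    # believes it is aggregating toward a single switch.
    SHARED_SYSTEM_ID = "02:00:00:00:00:01"

    def __init__(self, name, ports):
        self.name = name
        self.ports = ports   # member ports facing Device 1
        self.up = True

def active_ports(peers):
    """Ports Device 1 can hash traffic across: members on live peers."""
    return [p for peer in peers if peer.up for p in peer.ports]

d2 = MlagPeer("Device 2", ["d2/1"])
d3 = MlagPeer("Device 3", ["d3/1"])

both = active_ports([d2, d3])     # both links usable, active-active

d2.up = False                     # device-level failure of Device 2
remaining = active_ports([d2, d3])  # traffic shifts entirely to Device 3
```

The point of the sketch is the last step: because the group spans two devices, the failure of an entire switch removes only some member ports, not the whole trunk.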


Combining M-LAG with the Direct Attach Architecture (M-LAG Direct Attach)

M-LAG serves as a powerful mechanism to address newer architectural requirements centered on bandwidth and resiliency. Since the proprietary nature of M-LAG is limited to only the switches providing the distributed link aggregation capabilities, it can be combined with other technologies, devices and vendor equipment to build better network architectures. For example, M-LAG can be used in conjunction with the Direct Attach™ architecture from Extreme Networks® to eliminate tiers and simplify switching in the data center. The Direct Attach™ architecture allows virtual machines to be directly switched in the aggregation or the core of the physical network, thereby eliminating multiple switching tiers such as the virtual switch, the blade switch and the access switch. M-LAG allows dual-homing links from the server into the network and using both links in an active-active manner. By combining the two, virtual machines on a single server can be dual-homed directly into the aggregation or core of the network, while using both links in an active-active manner. Creating an M-LAG Direct Attach architecture not only helps to eliminate multiple tiers of switches, thereby eliminating multiple points of oversubscription, latency and power consumption, but it also adds link-, device- and network-level resiliency to the data center fabric. And it does this without blocking ports or links, thereby allowing full utilization of the capacity built into the data center fabric. In effect, an M-LAG Direct Attach architecture provides a very scalable, high-performance, low-latency network fabric for highly virtualized data centers. See Figure 4 below.

For more information on the Extreme Networks Direct Attach architecture and how it works to reduce tiers, read the white paper at www.extremenetworks.com/go/DirectAttach.

A key benefit of an M-LAG and M-LAG/Direct Attach approach is that it can be deployed on existing data center switches using a simple software upgrade, i.e. it does not require an infrastructure refresh. While M-LAG itself is proprietary, it works in conjunction with standard, proven link aggregation technology commonly available across server and switch vendors, as well as with different hypervisor technologies in Direct Attach mode.

Figure 4: M-LAG Direct Attach. VMs on a blade server connect through pass-through modules in the blade chassis and are dual-homed via LAG and M-LAG into two 96-port 10/100/1000BASE-T switches.


© 2011 Extreme Networks, Inc. All rights reserved. Extreme Networks, the Extreme Networks logo and Direct Attach are either registered trademarks or trademarks of Extreme Networks, Inc. in the United States and/or other countries. Specifications are subject to change without notice. 1750_02 07/11

Corporate and North America: Extreme Networks, Inc., 3585 Monroe Street, Santa Clara, CA 95051 USA, Phone +1 408 579 2800

Europe, Middle East, Africa and South America: Phone +31 30 800 5100

Asia Pacific: Phone +65 6836 5437

Japan: Phone +81 3 5842 4011

    extremenetworks.com

    TRILL and SPB

M-LAG is one of several approaches to building out modern data center network fabrics. TRILL and Shortest Path Bridging (SPB) are two other new approaches that are being positioned as alternatives and replacements for the Spanning Tree Protocol. TRILL and SPB are competing proposals being pursued in the IETF and IEEE respectively. Both TRILL and SPB use link state routing protocols to compute optimal paths between nodes in the network. However, unlike traditional link state routing protocols that operate at the IP or Layer 3 level, both TRILL and SPB operate at Layer 2. While SPB leverages the IS-IS link state protocol, TRILL uses a variant of IS-IS. Additionally, both TRILL and SPB use encapsulation mechanisms to transport packets across the network: TRILL uses a form of MAC-in-MAC encapsulation, while SPB has variants for both MAC-in-MAC and Q-in-Q encapsulation. Both TRILL and SPB provide for multiple active redundant paths to fully utilize available bandwidth.
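The encapsulation idea common to both approaches can be illustrated conceptually: the original Ethernet frame rides unchanged inside an outer header addressed between backbone switches, which forward on the outer header only. The sketch below is a deliberate simplification, not the exact TRILL or 802.1ah wire format; the 0x88E7 EtherType shown is the one 802.1ah defines for the backbone service tag, used here merely as a marker.

```python
import struct

def mac(s):
    """Parse 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    return bytes(int(part, 16) for part in s.split(":"))

def encapsulate(inner_frame, outer_src, outer_dst, ethertype=0x88E7):
    """Wrap a complete Ethernet frame in an outer Ethernet header.

    Backbone switches learn and forward on outer_src/outer_dst only,
    so inner (customer/VM) MACs never enter their tables.
    """
    outer = mac(outer_dst) + mac(outer_src) + struct.pack("!H", ethertype)
    return outer + inner_frame

def decapsulate(frame):
    """Strip the 14-byte outer header at the egress backbone switch."""
    return frame[14:]

# Inner frame: dst MAC, src MAC, EtherType 0x0800, payload.
inner = (mac("66:77:88:99:aa:bb") + mac("00:11:22:33:44:55")
         + b"\x08\x00" + b"payload")

# Wrapped for transit between two backbone switches (addresses illustrative).
wrapped = encapsulate(inner, "0a:00:00:00:00:02", "0a:00:00:00:00:09")
restored = decapsulate(wrapped)
```

This added outer header is exactly why the text below notes that existing hardware often cannot participate: the forwarding ASICs must parse and act on the new encapsulation.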

The challenge with both TRILL and SPB is that they are new protocols that require understanding and expertise in two new technologies in the data center: IS-IS and MAC-in-MAC encapsulation. Additionally, depending on the protocol, multicast forwarding can require computing additional multicast trees, which can add further complexity from a troubleshooting and debugging perspective. Lastly, both TRILL and SPB require new infrastructure due to the encapsulation mechanisms they use for data forwarding, i.e. most existing data center network infrastructures will not support either TRILL or SPB.

Different vendors have expressed support for either TRILL or SPB, leading to some confusion as to the industry direction. In the face of this uncertainty, and a lack of broad support across product lines for either TRILL or SPB, the M-LAG Direct Attach approach provides a viable alternative today for deploying a scalable and resilient data center fabric.