6
IEEE Network • March/April 2008 16 0890-8044/08/$25.00 © 2008 IEEE ost currently deployed routing protocols select only a single forwarding path for the traffic between each source-destination pair. In this article, we explore techniques that allow flexi- ble division of traffic over multiple paths. That is, we argue that an end host or edge network should have access to multi- ple paths through the Internet, and direct control lover which traffic traverses each path. Application requirements dictate the granularity of division, for example, by IP address blocks (i.e., IP prefixes), destination host, a TCP flow, or a single packet. Motivation for Flexible Multipath Routing In the Internet today, there exists a market for value-added services, many of which can be realized by flexible multipath routing: Customizing to application performance requirements: Differ- ent applications have different needs. If multiple paths exist, VoIP and online gaming traffic can use a low-delay path, while file sharing traffic uses a high-throughput path. In addition, an application can access more bandwidth by using multiple paths simultaneously. Improving end-to-end reliability: If multiple paths exist, traffic can switch quickly to an alternate path when a link or router fails. Similarly, if an adversary drops packets along a path, the traffic could be moved to an alternate path to cir- cumvent the adversary [1]. This is particularly useful if dis- joint paths are available. Avoiding congested paths: When multiple paths are available, traffic can move to an alternate path to circumvent conges- tion. Despite problems with routing oscillation in the early ARPANET, recent work has shown how to dynamically split traffic over multiple paths in a stable fashion [2]. In fact, by just having two paths and flexible splitting between them, protocols can be easily tuned to efficiently utilize net- work resources. Past work indicates that the Internet’s network layer topology has significant underlying path diversity. 1 Each network is a collection ofrouters and links under the con- trol of one entity, such as an Internet service provider (ISP) that offers connectivity to other networks or a stub network that just provides connectivity to its own users and services. In this article we explore how to give stub net- works greater end-to-end path diversity. Extra end-to-end paths may arise because a stub network is connected to multiple IPSs, individual ISPs have intradomain path diver- sity, or ISPs connect to each other in multiple locations. In fact, a measurement study of a large ISP found that almost 90 percent of point-of-presence (PoP) pairs have at least four link-disjoint paths between them [4]. Another study showed that although Internet traffic traverses a single path, 30–80 percent of the time an alternate path with lower loss or smaller delay exists [3]. Challenges: Scalability and Incentives Unfortunately much of the existing path diversity in the Inter- net topology is never exploited. The scalability challengers of multipath routing are one of the reasons. Multipath routing introduces extra overhead in both the control plane and the data plane of the routers. In the control plane, routers run routing protocols to compute the forwarding tables that the data plane then uses to direct incoming packets to outgoing links. Multipath routing would increase the overhead in both the control and data planes: Control plane overhead: First, exchanging the extra topology or path information required for multipath routing con- sumes extra bandwidth and processing resources. Second, storage overhead at each router grows with the number of paths. Third, computing multiple paths requires more com- putational power. Data plane overhead: Forwarding traffic on different paths requires the data packets to carry an extra header or label. In addition, forwarding tables need extra entities for each destination, thus consuming more memory; in addition, this data plane memory is expensive, due to the need to forward packets at high speed. M M Jiayue He and Jennifer Rexford, Princeton University Abstract The Internet would be more efficient and robust if routers could flexibly divide traf- fic over multiple paths. Often, having one or two extra paths is sufficient for cus- tomizing paths for different applications, improving security, reacting to failures, and balancing load. However, support for Internet-wide multipath routing faces two significant barriers. First, multipath routing could impose significant computational and storage overhead in a network the size of the Internet. Second, the indepen- dent networks that comprise the Internet will not relinquish control over the flow of traffic without appropriate incentives. In this article, we survey flexible multipath routing techniques that are both scalable and incentive compatible. Techniques covered include: multihoming, tagging, tunneling, and extensions to existing Inter- net routing protocols. Toward Internet-Wide Multipath Routing 1 In this article we focus on the path diversity at the network layer. There is a large body of research on path diversity at the physical layer that is important for reliability and security, but the shared risks at the physical layer are not visible to the IP routing system.

Toward internet-wide multipath routing

  • Upload
    j

  • View
    216

  • Download
    0

Embed Size (px)

Citation preview

IEEE Network • March/April 200816 0890-8044/08/$25.00 © 2008 IEEE

ost currently deployed routing protocols selectonly a single forwarding path for the trafficbetween each source-destination pair. In thisarticle, we explore techniques that allow flexi-

ble division of traffic over multiple paths. That is, we arguethat an end host or edge network should have access to multi-ple paths through the Internet, and direct control lover whichtraffic traverses each path. Application requirements dictatethe granularity of division, for example, by IP address blocks(i.e., IP prefixes), destination host, a TCP flow, or a singlepacket.

Motivation for Flexible Multipath RoutingIn the Internet today, there exists a market for value-addedservices, many of which can be realized by flexible multipathrouting:• Customizing to application performance requirements: Differ-

ent applications have different needs. If multiple pathsexist, VoIP and online gaming traffic can use a low-delaypath, while file sharing traffic uses a high-throughput path.In addition, an application can access more bandwidth byusing multiple paths simultaneously.

• Improving end-to-end reliability: If multiple paths exist, trafficcan switch quickly to an alternate path when a link orrouter fails. Similarly, if an adversary drops packets along apath, the traffic could be moved to an alternate path to cir-cumvent the adversary [1]. This is particularly useful if dis-joint paths are available.

• Avoiding congested paths: When multiple paths are available,traffic can move to an alternate path to circumvent conges-tion. Despite problems with routing oscillation in the earlyARPANET, recent work has shown how to dynamicallysplit traffic over multiple paths in a stable fashion [2]. Infact, by just having two paths and flexible splitting betweenthem, protocols can be easily tuned to efficiently utilize net-work resources.Past work indicates that the Internet’s network layer

topology has significant underlying path diversity.1 Eachnetwork is a collection ofrouters and links under the con-trol of one entity, such as an Internet service provider(ISP) that offers connectivity to other networks or a stub

network that just provides connectivity to its own users andservices. In this article we explore how to give stub net-works greater end-to-end path diversity. Extra end-to-endpaths may arise because a stub network is connected tomultiple IPSs, individual ISPs have intradomain path diver-sity, or ISPs connect to each other in multiple locations. Infact, a measurement study of a large ISP found that almost90 percent of point-of-presence (PoP) pairs have at leastfour link-disjoint paths between them [4]. Another studyshowed that although Internet traffic traverses a singlepath, 30–80 percent of the time an alternate path withlower loss or smaller delay exists [3].

Challenges: Scalability and IncentivesUnfortunately much of the existing path diversity in the Inter-net topology is never exploited. The scalability challengers ofmultipath routing are one of the reasons. Multipath routingintroduces extra overhead in both the control plane and thedata plane of the routers. In the control plane, routers runrouting protocols to compute the forwarding tables that thedata plane then uses to direct incoming packets to outgoinglinks. Multipath routing would increase the overhead in boththe control and data planes:• Control plane overhead: First, exchanging the extra topology

or path information required for multipath routing con-sumes extra bandwidth and processing resources. Second,storage overhead at each router grows with the number ofpaths. Third, computing multiple paths requires more com-putational power.

• Data plane overhead: Forwarding traffic on different pathsrequires the data packets to carry an extra header or label.In addition, forwarding tables need extra entities for eachdestination, thus consuming more memory; in addition, thisdata plane memory is expensive, due to the need to forwardpackets at high speed.

MM

Jiayue He and Jennifer Rexford, Princeton University

AbstractThe Internet would be more efficient and robust if routers could flexibly divide traf-fic over multiple paths. Often, having one or two extra paths is sufficient for cus-tomizing paths for different applications, improving security, reacting to failures,and balancing load. However, support for Internet-wide multipath routing faces twosignificant barriers. First, multipath routing could impose significant computationaland storage overhead in a network the size of the Internet. Second, the indepen-dent networks that comprise the Internet will not relinquish control over the flow oftraffic without appropriate incentives. In this article, we survey flexible multipathrouting techniques that are both scalable and incentive compatible. Techniquescovered include: multihoming, tagging, tunneling, and extensions to existing Inter-net routing protocols.

Toward Internet-Wide Multipath Routing

1 In this article we focus on the path diversity at the network layer. There isa large body of research on path diversity at the physical layer that isimportant for reliability and security, but the shared risks at the physicallayer are not visible to the IP routing system.

HE LAYOUT 3/6/08 1:45 PM Page 16

IEEE Network • March/April 2008 17

The ultimate flexibility would befor sources to see the entire Inter-net-wide topology and utilize allpaths to each destination. Thiswould create a large scaling prob-lem, however, since the Internethas more than 25,000 networks andmany more paths. Even if the scala-bility challenges were surmount-able, accessing all paths wouldrequire many (or even all) networksto cooperate, which may be unreal-istic. Instead, it is more likely forISPs to allow other networks toselect from a small set of paths,under a specific business agreement. Since business models inthe Internet today are bilateral, multipath solutions based oncooperation between pairs of networks are much more likelyto succeed than solutions that require widespread cooperationbetween many (sometimes competing) networks. Fortunately,multipath routing solutions that limit the number of additio-nial paths and the coordination between different networksare aligned with both goals: scalability and business incentives.As such, in this article we focus on solutions where stub net-works select from among a small set of paths provided by alimited number of bilateral agreements, rather than tech-niques that require a stub network to compute andsignal acomplete end-to-end path.

This survey focuses on multipath routing schemes with lowoverhead and minimal cooperation between networks. Thesections progress from deployed techniques to proposed solu-tions that are easily deployable, to techniques that rely on newbusiness models. We start by reviewing how Internet routingworks today, with an eye toward the limitations of the existingrouting system. Multipath routing relies on two key capabili-ties: discovering extra paths and directing packets over them.We cover a range of solutions for flexible forwarding, such astunneling and tagging, for directing packets to different paths.Then, we describe control-plane extensions that enable ASs tolearn additional paths. We discuss techniques for a single ASto achieve multipath routing without requiring cooperationfrom other ASs. However, the impact of a single AS on end-to-end path performance is limited, and more end-to-endpaths will be available if ASs cooperate. We discuss tech-niques that require only cooperation between a pair of ASs.Finally, we summarize and conclude the article.

Internet Routing TodayIn this section, we introduce the key Internet routing proto-cols used today. Routers use the Border Gateway Protocol(BGP) to exchange reachability information with neighboringnetworks. BGP is a path-vector protocol, where routing deci-sions are made based on local policies. Inside a network,routers communicate using an Interior Gateway Protocol(IGP). Most ISPs run link-state protocols that perform short-est-path routing based on configurable link weights. The linkweights in IGP and the policies in BGP are configured byhuman operators to satisfy business objectives.

Interdomain: Path Vector Protocol and MultihomingFigure 1a represents a network-level topology, where eachcloud is a network and each link represents a physical connec-tion, as well as the existence of a business relationshipbetween two ASs. In a path-vector protocol, the entire routingpath is exchanged between neighbors. Edge routers in eachAS learn multiple paths to reach a particular destination and

store all of them in a routing table. From the list of paths, arouter then applies a set of policies to select a single activeroute. A router optionally advertises the active route to eachneighboring network, depending on the business relationship.Using a path-vector protocol allows BGP to support flexiblelocal policies that give each network control over its incomingand outgoing traffic. For example, a stub network like net-work D in Fig. 1 would not advertise routes learned from B toC (and vice versa) because D does not wish to carry transittraffic between the two neighbors.

Today’s BGP has two limitations as a single-path protocol.First, since only the active path is advertised, customer net-works are prevented from seeing alternate paths, includingones they might prefer. Second, by using only the active path,a network does not have fine-grained control, and can onlybalance traffic over multiple paths at the IP address block(i.e., prefix) level. Extending BGP to a multipath protocol,however, requires alignment of economic incentives betweennetworks. The economic incentives are likely to grow strongerin the future as the demand for performance and robustnessincreases and customers are willing to pay for value-addedInternet services. Today, two ASs usually have a customer-provider relationship or a peering relationship. In a peeringrelationship, two ASes can mutually provide additional pathsto each other without any economic exchange, similar to howthey carry traffic on peering links for free today. In a cus-tomer-provider relationship, the provider can offer additionalpaths to its customers as a value-added service.

One such example is multihoming, where a stub AS pays toconnect to more than one ISP. The use of multihoming hasseen a dramatic increase in recent years for two main reasons.First, as more and more enterprises rely heavily on the Inter-net for their business transactions, availability is of primaryimportance. Second, multihoming can be used to drive downthe cost of Internet access. For example, the multihomed net-work can use a cheaper ISP for general traffic and a moreexpensive but better ISP for performance-sensitive traffic. InFig. 1a, AS D is multihomed to AS B and AS C. Despite hav-ing two upstream routes, AS D can load balance between thetwo only at the prefix level and only forward on a single path.So, while multihoming provides additional paths to stub net-works, fine-grained control remains elusive.

Intradomain: Link State ProtocolUnlike the interdomain case, here each AS has full control ofits internal network. In addition, a network typically has justtens or hundreds of routers, much fewer than 25,000 networksin the Internet. Inside a single AS, each router is configuredwith a static integer weight on each of its outgoing links, asshown in Fig. 1b. The routers flood the link weights through-out the network and compute shortest paths as the sum of theweights using the Dijkstra algorithm. Each router uses this

n Figure 1. Sample inter-AS topology, with a close-up on one AS: a) topology between fourASs: B and C are ISPs; b) topology inside AS C: A and D are enterprise networks.

21

3

5j

33

3

2 1

1i

(a) (b)

B C

A

D

HE LAYOUT 3/6/08 1:45 PM Page 17

IEEE Network • March/April 200818

information to construct a table that drives the forwarding ofeach IP packet to the next hop in its path to the destination.Link-state protocols offer several advantages. First, routing isbased only on a single link metric, namely, link weights. Sec-ond, to reduce message-passing overhead, routers disseminateinformation only when the topology changes. Finally, byflooding the link-weight information, each router has a com-plete view of the topology and associated link weights.

On the other hand, even though each router can see thewhole topology, the existing path diversity is under exploited[4]. Even when alternative paths have been computed, packetstoward a destination are forwarded on a single path. Equal-cost multipath is a commonly deployed technique where therouters keep track of all shortest paths and then split evenlyamong them. In Fig. 1b, we see that router i has two shortestpaths to reach router j. In today’s IGPs, the traffic would bedivided evenly between the two paths. Even this limited ver-sion of multipath routing is useful for fast reaction to failures.In fact, some operators tune the link weights to create equal-cost multipaths [5].

Multiple shortest paths enable the operator to balance loadand react quickly to failures, but does not enable the operatorto customize paths for different applications. An existingoption for operators to customize paths inside their own net-work is the Constrained Shortest Path First (CSPF) protocol,an extension of the shortest-path protocol. The path comput-ed using CSPF is a shortest path fulfilling a set of constraints.A constraint could be minimum bandwidth required per link,end-to-end delay, or maximum number of links traversed.CSPF can be useful for a range of applications, for example,picking a low-delay path for a VoIP call, but cannot pickpaths based on dynamic constraints such as packet loss.

Toward Flexible ForwardingThe most prevalent forwarding mechanism in the Internettoday is destination-based hop-by-hop forwarding. Each routerforwards a packet to an outgoing link based on the destina-tion address from the IP packet header and its correspondinglongest-prefix match entry in the forwarding table. For exam-ple, in Fig. 1b, a router will forward a packet destined for j,independent of where the packet came from. Destination-based hop-by-hop forwarding leads to small forwarding tables,but cannot implement all forwarding policies. For example, inFig. 1a, if AS A wanted to reach D via (B, C), but B wanted toreach D directly, then A is forced to use path (A, B, D). Evenwhen the forwarding table contains multiple next hops for thesame destination, common practice would divide the trafficevenly among the multiple paths.

In this section, we describe alternative schemes that for-ward traffic over multiple paths. This is useful for customizingpaths for different applications. To decide which path(s)should carry a packet, an edge router or end host needs tofirst classify a packet, and then map the packet:• Packet Classification: Packets are classified based on the

requirements of the application. For example, the applica-tion may want low delay, high throughput, or a secure path.The application could be defined by a prefix, a destination,or a Transmission Control Protocol (TCP) flow (source anddestination addresses and port numbers). Packets withinthe same flow are normally classified in the same way. Oneoption is to mark packets using the type of service bits inthe IP header and later, to forward using the same bits.

• Mapping Packets to Path(s): The edge routers can measure(or infer) path properties, then identify the path that isbest-suited to each class of traffic. By examining the packetheader, a packet can be mapped to an appropriate path.

Designing a measurement infrastructure to monitor pathperformance is challenging. One reason is that measure-ments of path performance can be inherently inaccurate,for example, round-trip time estimation is a classic chal-lenge. In addition, the inaccuracies can be even greater in acompetitive environment where other ASs may treat prob-ing packets differently than data packets to make pathslook more attractive than they are.Both packet classification and mapping packets to path(s)

incur extra data-plane overhead. Although the overhead ofmarking packets and processing the marked packets is mini-mal, the measurement overhead associated with monitoringpath performance can be significant if the measurements arefine-grained (e.g., at the destination prefix level).

If multiple paths are associated with a particular class oftraffic, the router can send a fraction of the packets on eachpath to balance load and circumvent congestion. Later we dis-cuss the pros and cons of splitting traffic at different granular-ities. We focus on existing techniques (round-robin, hashing,and flowcache), but also describe flowlet-cache, a promisingtechnique that is yet to be deployed.

Forwarding on Alternate PathsTunneling is a widely available alternative to destination-based hop-by-hop forwarding that offers much more flexibil-ity. At a high level, tunneling establishes a logical linkbetween two routers (or hosts). Forwarding packets over atunnel usually involves “pushing’’ a header (or label) at thetunnel ingress and “popping’’ the header (or label) at thetunnel egress, in a process called encapsulation. For exam-ple, in Fig. 2, a packet going from B to F could be encapsu-lated to ensure it travels through E. At B, an extra headerwould be pushed on the packet to indicate E is the destina-tion. After the packet reaches router E, the extra headerwould be popped from the packet; then E would forwardthe packet to F hop-by-hop. Encapsulation can be imple-mented through IP-in-IP tunnels or multiprotocol labelswitching (MPLS). MPLS is a label-based forwarding mech-anism that encapsulates packets using labels. In either case,encapsulation requires packets to carry an extra label or anextra IP header. In the case of MPLS, each router alsostores the label-based forwarding table, although a label-based look up is simpler than matching the longest prefix ofthe destination address.

The path between the tunnel ingress to tunnel egress candepend on the underlying routing protocol, or the entirepath can be specified explicitly. Encapsulation alone often issufficient for most application requirements, such as direct-ing a packet to a particular egress point or through a partic-ular AS. When the path between tunnel end points onlydepends on the underlying protocol, it has the added advan-tage of adapting to topology changes automatically. Forexample, in Fig. 2, B could forward the packet toward E onehop at a time. This implies that if the link from C to D fails,the encapsulated packets would switch transparently toanother path. Still, by specifying only the end points of thetunnel, it is difficult to satisfy certain applications require-ments, for example, the end-to-end bandwidth requirement.So for those specialized applications, explicit routing is auseful alternative.

Explicit routing specifies every router (or AS) along thepath. The routers (or networks) along the path can be speci-fied directly in the packet header or indirectly through a labelin the packet header. One possibility is to implement explicitrouting by specifying the whole router-level path with IPoptions. In Fig. 2, if the path sequence (A, B, C, D, E, F) is anexplicit path for certain packets traveling from A to F, A

HE LAYOUT 3/6/08 1:45 PM Page 18

IEEE Network • March/April 2008 19

would know to forward to router B based on the IP options inthe packet header. Explicit routing can be implemented byMPLS as a combination of CSPF and Resource ReservationProtocol (RSVP). CSPF selects the path using a variety ofmetrics, whereas RSVP is the signaling protocol used to setup the path. RSVP establishes a hop-by-hop chain of labels torepresent the path, and it reserves bandwidth along the pathby signaling in advance. At the source end of the path, a labelwould be pushed onto the packet, based on information fromthe packet header, such as source address, destination address,and port numbers. Each intermediate router would perform alabel look up to find the outgoing label and outgoing link.Compared to tunneling, explicit routing does impose moredata plane overhead (to swap the labels at each hop), althoughthe overhead is manageable when the number of explicitlyrouted paths is limited.

Flexible Splitting among Multiple PathsThe network management system may wish to balance trafficbetween multiple paths to achieve certain traffic engineeringobjectives. For example, an application can send 40 percent oftraffic on one path and 60 percent of traffic on another. Toachieve a given splitting percentage, traffic can be switchedonto different paths using four major techniques: round-robin,hashing, flow cache, and flowlet cache [6]. Each techniquestrikes a different trade-off between overhead, splitting per-centage accuracy, and out-of-order packets.

A weighted round-robin will switch traffic at the granularityof packets. Since packets are small in size, round-robinscheduling can achieve very accurate splitting percentages andadds very little extra overhead to today’s forwarding functions.The downside is that because different paths between thesame source-destination pair often have different delays, somepackets that belong to the same TCP flow could arrive out oforder. This is problematic because TCP considers out-of-orderpacket delivery a sign of network congestion, and consequent-ly, the TCP sender would slow down the transfer. If the pathshave very similar delay, then weighted round-robin is a goodchoice due to its low overhead and accurate splitting percent-ages.

Hashing involves first dividing the hash space into weightedpartitions corresponding to the outbound paths. Then, packetsare hashed based on their header information and forwardedon the corresponding path. Hashing ensures in-order deliveryof most packets because a flow is likely to be mapped to aspecific path for its entire duration. A flow is defined by thefollowing attributes in the packet header: source IP address,destination IP address, transport protocol, source port, anddestination port. On the other hand, because flows vary dras-tically in their sizes and rates, it is difficult to realize accurate

splitting percentages. Finally, if splitting percent-ages change or a path fails, a flow is likely to behashed onto a different path than before, possiblystill causing a few out-of-order packets during thetransition.

The best way to avoid out-of-order packets isto implement a flow cache. A flow cache is a for-warding table that keeps track of which patheach active flow traverses. A flow cache ensurespackets belonging to the same flow always tra-verse the same path. Another advantage of flowcaching over hashing is that when new flowsarrive, they can be placed on any path, whichleads to better control of dynamic splitting per-centages although the splitt ing percentagesachieved are less accurate than with a round-robin. The big downside is that a high-speed link

could easily carry tens of thousands of concurrent flows [6],leading the flow cache to consume a significant amount ofadditional memory in the router.

It is possible to reduce data-plane overhead and improvesplitting ratios by dividing traffic at the granularity of packet-bursts, using a flowlet cache [6]. If the time between two suc-cessive packets is larger than the maximum delay differencebetween multiple paths, then the second packet can be for-warded on any available path without reordering. A flowletcache is typically much smaller than a flow cache, becausethere are significantly fewer active packet bursts than activeflows [6]. In addition, flowlet switching always achieves withina few percent of the desired splitting percentage withoutreordering any packets. Overall, flowlet cache would be thebest choice for most applications, although it is not imple-mented yet in routers today.

Multipath Routing by a Single ASIn this section, we present incrementally deployable tech-niques that can be adopted by a single network. Each ISP canexploit its internal path diversity, and a multihomed stub net-work can split traffic over multiple end-to-end paths.

Intradomain: Non-Shortest Paths within an ISPEach AS can select its own IGP, enabling it to change theprotocol without requiring cooperation from others. In link-state protocols, because link weights and topology informa-tion already are flooded to all routers, multipath routingdoes not incur extra overhead. One way to extend the link-state protocol for multipath is to compute K-shortest pathsrather than just the shortest path. This is cumbersome forseveral reasons. To start with, computing the K-shortestpaths is more computationally intensive (i.e., O(Nlog N +KN) for a network with N routers) than computing a singleshortest path (i.e., O(Nlog N)). The forwarding table sizewould also grow with the increase in number of paths perdestination. Perhaps the biggest overhead increase is in thedata plane, where K tunnels need to be established betweeneach source pair. If each router does destination-based hop-by-hop forwarding, there is no guarantee packets wouldtravel on the K-shortest paths from source to destination.This is significantly more cumbersome than the current hop-by-hop forwarding.

Another approach is to run multiple instances of the linkstate routing protocol [11]. Instead of having a single weightassociated with each link, each link has a vector of weights.Each instance of the link state protocol can just compute theshortest path and create a forwarding table for the corre-sponding topology. The vector of weights does not lead to the

n Figure 2. Illustration of how a tunnel works.

Src: ADest: F

Data

Logical view:A

Src: ADest: F

Data

Src: BDest: E

Src: BDest: E

Src: ADest: F

Data

Src: ADest: F

Data

BTunnel

E F

Physical view:A B C D E F

HE LAYOUT 3/6/08 1:45 PM Page 19

IEEE Network • March/April 200820

K shortest paths, but rather a shortest path for each of K setsof link weights. Each set of link weights can be tuned inde-pendently to customize the paths to different applications; forexample, one set of weights could be tuned for high through-put and another for low delay. The link weights could even bespecialized to handle different failure scenarios. In the controlplane, if K routing instances run simultaneously, the controlplane overhead would be exactly K times as much as shortest-path routing. In the data plane there are two ways to forwardpackets on the multiple topologies. The simpler (and morerestrictive) way is for each packet to belong to a single topolo-gy [11]. An alternate approach to multipath routing is to for-ward traffic on all paths that make forward progress towardthe destination [8, 9], based on a single set of link weights.Each router can make local forwarding decisions based on thecost of the shortest path through each of its neighbors [10].Forwarding packets only to routers that have a shorter path tothe destination guarantees that the path is loop-free [8]. Toencourage the use of shorter paths, diminishing proportions ofthe traffic would be directed on the longer paths. For exam-ple, in Fig. 1b i has two outgoing links along shorter paths toj. Since these paths have costs 8 and 9, less trafic would beplaced on the path with cost 9 [10]. Under this scheme, thepath computation costs are still O(Nlog N), since each routerwill just run Dijkstra’s shortest path algorithm. Compared toshortest path routing, forwarding along the “downward” pathsrequires more entries in the forwarding tables. In addition,there will be slightly more data plane overhead in order toimplement the splitting percentages, as explained earlier. Still,each router can make local forwarding decisions without theuse of tunnels.

Interdomain: Flexible Splitting by a Multihomed Stub

So far, we have described how to exploit path diversity insidea single AS. Next, we examine how to exploit interdomainpath diversity. Many routers learn multiple interdomain pathsand could conceivably split traffic over them by installing mul-tiple next-hops in the forwarding table. This is not done inpractice due to the extra control-plane overhead. For ISP net-works, edge routers would be required to announce multiplepaths to neighboring ASs, and the neighboring ASs wouldnow be required to store multiple paths. In addition, tunnel-ing would be required to direct packets on any non-defaultpaths, as explained earlier.

In a stub AS, however, edge routers are notrequired to propagate any of the learned paths asit has no customers. In addition, packet classifi-cation is simpler for a stub network since thedate rates end to be lower and all packets origi-nate from a single domain. Therefore, stub net-works are natural places to deploy flexiblesplitting. Because applications are run at theedge, the stub network also has direct knowledgeof the application requirements. Flexible splittingenables a network to place different classes oftraffic onto different paths and balance the loadacross multiple paths. Balancing loads betweenmultiple classes allows for efficient use of net-work resources and can avoid potential routingoscillations. For example, if all traffic is forward-ed on the least-delay path, route oscillations canoccur. Fortunately, flexible path selection addsvery little extra overhead on the data plane for astub AS, because choosing an outgoing linkdetermines the entire path a packet will follow,and no tunneling is required.

Cross-Network Cooperation for MultiplePathsWhen multiple ASs cooperate, more paths are available thanwhen a network acts alone. In addition to scalability chal-lenges, new business models must be put in place to enableinter-AS cooperation, for example, charging for providingadditional paths. In this section, we focus on schemes thataccess additional paths with only limited cooperation betweenASs. Sources can encapsulate packets to direct the trafficthrough a deflection point — an end host or an edge routerthat lies on an alternate path. This only requires a bilateralagreement between two parties. Sources can also deflect pack-ets indirectly via tagging, where a few opaque bits in the packetheader are used to indicate dissatisfaction with the currentpath. Tagging requires more inter-AS cooperation thanencapsulation because routers in participating ASs must bemodified to forward packets based on the tags.

Encapsulation: Forwarding through a Deflection PointEncapsulation can be used to explicitly force traffic onto analternate path with better performance properties. A packetwould be encapsulated to first arrive at the deflection point,then follow the default path of that deflection point to thedestination [7]. Deflections can occur at the application layeror the network layer. The key difference is that the choice ofa deflection point at the network layer can be based on data-plane and control-plane information.

The easiest way to access another path is by deflectingthrough another end host, which does not require cooperationfrom or coordination between ISPs. First, an overlay or logicaltopology can be established between end hosts using tunnels[11]. Then each end host can measure the end-to-end perfor-mance properties of paths to a destination via other end hosts.If a path with better performance is found, packets can bedeflected through another end host as seen in Fig. 3. In addi-tion to ease of deployment, application-layer deflections areattractive because they avoid the advertisement of additionalpaths. On the other hand, as the overlay grows in size, prob-ing all paths through other end hosts imposes a significantamount of measurement overhead and does not scale beyondtens of end hosts due to the measurement overhead [11]. Inaddition, sending traffic through other end hosts consumes

n Figure 3. The default path, shown in solid line is through AS B. Deflectionthrough AS C is possible either with an overlay (dot-dash line) or through anISP (dashed line).

C

Src

Host

Dest

A

B

D

HE LAYOUT 3/6/08 1:45 PM Page 20

IEEE Network • March/April 2008 21

edge link bandwidth and potentially incurs extra costs for theedge network.

A more scalable and efficient approach is for ISPs to pro-vide alternative paths [7]. As seen in Fig. 3, the deflectionpoint can be an edge router inside a network, rather than anend host. Although this approach requires more cooperationfrom (and between) ISPs, it is still incrementally deployable.To ensure scalability, a network would request only an alter-native path (perhaps with certain properties) from another ASif it is dissatisfied with its default path. For example, in Fig. 3,the source could request an alternative path from its providerAS A for reaching the destination D; AS A can then choose toforward traffic on the alternative path (A, C, D) — possiblyfor a price. Encapsulation would be used to deflect the pack-ets through AS C. The amount of control-plane overhead isdirectly proportional to the portion of ASs that are dissatis-fied with their default path.

Tagging: Requesting an Alternate PathAn alternative to encapsulation is for end hosts to tag theirpackets to request an alternate path [8, 10], without knowingthe details of the path. A router forwards an incoming packeton the default path or an alternate path, based on the associ-ated tag. Alternative paths inside an ISP can be constructedby one of the methods described earlier.

Tagging without path visibility is effective when an end-to-end path is undesirable due to one particular segment of thepath. For example, the path could contain a low capacity link,a high delay link, or a point of congestion. In these cases,routing around the problem link or router does not requiredirect knowledge of the route. By trying out a few tag num-bers, the source AS is likely to find a better path.

Tagging is quite scalable in the control plane because inter-mediate ASs are not required to disseminate extra informa-tion or store AS-level paths. An intermediate AS can merelyexploit the path diversity inside its own domain. There is littleextra data-plane overhead, because the tag can use somerarely used bits in the existing IP header [8]. The extra data-plane overhead only comes from an ISP processing thereceived tag and directing the packet onto an alternate pathbased on its tag number. Although tagging imposes less over-head than forwarding through a deflection point, it mayrequire business relationships between the stub network andmultiple ISPs. The incentives for honoring the tags are themost obvious in the context of hosts served by a single ISP.

ConclusionsThe ability to forward traffic on multiple paths would be use-ful for customizing paths for different applications, improvingreliability and balancing load. Yet Internet-wide multipathrouting remains elusive, due to scalability and economic chal-lenges In this article, we survey a variety of incrementallydeployable techniques that achieve flexible multipath routing.Routers already have data plane support for forwarding onalternate paths through tunneling: encapsulation and explicit

routing, although such techniques should be used in modera-tion for scalability reasons. We examine existing techniquesfor fine-grained traffic division, then propose flowlet-cache asa more accurate and scalable alternative. To access more end-to-end paths, stub networks can continue the trend of multi-homing and extend it to perform fine-grained load balancing.

Inside an ISP, multitopology routing and forwarding on“downward” paths are both lightweight and easily deployablemethods to leverage internal path diversity. Finally, we arguethat deflecting packets at the network layer is a promising wayto access more end-to-end paths with limited cooperationbetween networks, although new business models are needed toenable internetwork cooperation. We believe that moreresearch could be done to better quantify the trade-off betweenoverhead and performance for the more heavyweight solutions,including end-to-end signaling techniques not surveyed in thisarticle. As technology advances, routers may become morecapable of handling the overhead, making a wider range ofsolutions viable in practice. In addition, the economic incentivesfor providing value-added services will likely grow in the futureand hopefully motivate the creation of new internetwork busi-ness models that enable Internet-wide multipath routing.

References[1] D. Wendlandt et al., "Don't Secure Routing Protocols, Secure Data Delivery,"

Proc. SIGCOMM Wksp. Hot Topics in Networking, Nov. 2006.[2] S. Kandula et al., "Walking the Tightrope: Responsive Yet Stable Traffic Engi-

neering," Proc. ACM SIGCOMM, Aug. 2005.[3] S. Savage et al., "The End-to-End Effects of Internet Path Selection," Proc.

ACM SIGCOMM, Aug. 1999.[4] R. Teixeira et al., "Characterizing and Measuring Path Diversity of Internet

Topologies," Proc. ACM SIGMETRICS, June 2003.[5] G. Iannaccone et al., "Feasibility of IP Restoration in a Tier-1 Backbone,"

IEEE Network, Mar. 2004.[6] S. Kandula et al., "Dynamic Load Balancing Without Packet Reordering,"

ACM SIGCOMM Comp. Commun. Rev., June 2007.[7] W. Xu and J. Rexford, “MIRO: Multipath Interdomain Routing,” Proc. ACM

SIGCOMM, Aug. 2006.[8] X. Yang and D. Wetherall, "Source Selectable Path Diversity Via Routing

Deflections," Proc. ACM SIGCOMM, Aug. 2006.[9] D. Xu, M. Chiang, and J. Rexford, "DEFT: Distributed Exponentially Weight-

ed Flow Splitting," Proc. IEEE INFOCOM, May 2007.[10] M. Motiwala, N. Feamster, and S. Vempala, "Path Splicing: Reliable Con-

nectivity With Rapid Recovery," Proc. SIGCOMM Wksp. Hot Topics in Net-working, Nov. 2007.

[11] D. G. Andersen et al., "Resilient Overlay Networks," Proc. ACM SOSP,Oct. 2001.

BiographiesJIAYUE HE ([email protected]) received her B.A.Sc. (Hon.) in engineering sciencefrom the University of Toronto in 2004. She is currently working toward herPh.D. degree at Princeton University. Her Ph.D. is partially funded by the Gor-don Wu Fellowship at Princeton University and the graduate fellowship fromNational Science and Engineering Research Council of Canada.

JENNIFER REXFORD ([email protected]) received her B.S.E. degree in electricalengineering from Princeton University in 1991, and her M.S.E. and Ph.D.degrees in computer science and electrical engineering from the University ofMichigan in 1993 and 1996, respectively. She is a professor in the ComputerScience Department at Princeton University. From 1996 to 2004 she was amember of the Network Management and Performance Department at AT&TLabs-Research. She serves on the CRA Board of Directors, the ACM Council, andthe GENI Science Council.

HE LAYOUT 3/6/08 1:45 PM Page 21