6
IP Fast Reroute with Failure Inferencing Junling Wang and Srihari Nelakuditi Department of Computer Science and Engineering University of South Carolina {wang257,srihari}@cse.sc.edu ABSTRACT Five nines availability is being expected from IP networks due to the growing popularity of IP telephony and the in- creasing usage of the Internet for mission-critical applica- tions. This necessitates enhancing the resiliency of IP net- works against transient failures that are observed to happen relatively frequently even in well-managed networks. To- wards that end, we proposed failure inferencing based fast rerouting (FIFR) approach that exploits the existence of a forwarding table per line-card, for lookup efficiency in cur- rent routers, to provide fast rerouting similar to MPLS, while adhering to the destination-based forwarding paradigm. Earlier, we have shown that FIFR can deal with either single link or single node failures in a network consisting of point- to-point links with symmetric link weights. In this paper, we generalize FIFR to handle both link and node failures in net- works with asymmetric link weights and multi-access links too. Furthermore, we apply FIFR for protecting against inter-AS failures also. With these extensions, we argue that FIFR elevates the resiliency of any IP network with minimal changes to the forwarding and routing planes. Categories and Subject Descriptors: C.2.2 [Network Protocols]: Routing Protocols General Terms: Algorithms Keywords: Fast Rerouting, Failure Protection 1. INTRODUCTION With the increasing dependence on the Internet for de- livering various critical services, it is expected to be always available. In particular, applications such as Voice over IP demand five-nines availability (99.999% uptime) from IP networks as is the case with traditional telephone networks. But it has been observed that transient failures happen rel- atively frequently even in well-managed IP backbone net- works [10]. The existing IGPs such as OSPF and IS-IS typi- cally take several seconds to converge and resume forwarding after a failure, whereas failure recovery within 50 ms is con- Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. INM’07, August 27–31, 2007, Kyoto, Japan. Copyright 2007 ACM 978-1-59593-788-9/07/0008 ...$5.00. sidered desirable for mission-critical applications [3]. Recent studies have shown that the IGP convergence can be acceler- ated to sub-second by tuning some parameters of these rout- ing protocols [7], but bringing it down to sub-50ms level can potentially cause instability in the network, particularly due to hot-potato routing [13]. MPLS provides fast local rerout- ing effectively with label stacking [12], but it is not scal- able, as it requires a careful configuration of many backup label-switched paths for protection. As an alternative, IP fast reroute mechanisms have been proposed that make use of loop-free alternates [2], u-turn alternates [1], not-via ad- dresses [4], or multiple routing configurations [9], for local rerouting upon a failure. While each of these mechanisms have their relative strengths, they either do not reroute to all destinations or rely on encapsulation or marking of data- grams for rerouting. We are interested in an IP fast reroute scheme that requires little or no changes to datagram for- mat, forwarding process, or routing protocol while ensuring forwarding continuity to all reachable destinations. We proposed a failure inferencing based fast rerouting (FIFR) approach earlier for handling transient failure of a link [11] or a node [15] in an IP network, leveraging the existence of a forwarding table per line-card of a router. Under FIFR, routers prepare for failures by computing interface-specific forwarding tables, i.e., a packet’s next hop depends on both its destination address and incoming interface. A router, without being explicitly notified of a failure, can infer it from the incoming interface and destination address, and precompute interface-specific forwarding entries excluding the inferred failures. When a link/node fails, adjacent router suppresses global advertisement and instead initiates local rerouting of packets that were to be forwarded through the failed link/node. All other routers simply forward packets according to their precomputed interface-specific forward- ing tables without relying on network-wide link-state adver- tisements. The attraction of FIFR is that the only change needed to the existing routers, with a line-card per inter- face, for providing protection against single link/node fail- ures, is to replace the Dijkstra’s algorithm for computing interface-independent forwarding entries with an algorithm to compute an additional interface-specific forwarding entry per destination. However, our earlier approach to handle node failures treats a link failure also as a failure of its tail node. This presumption could result in no route to the des- tination, even when there exists a path to it without the failed link, if the tail node partitions the network. More- over, our previous work assumes that the network consists of point-to-point links with symmetric link weights only. 268

[ACM Press the 2007 SIGCOMM workshop - Kyoto, Japan (2007.08.27-2007.08.31)] Proceedings of the 2007 SIGCOMM workshop on Internet network management - INM '07 - IP fast reroute with

  • Upload
    srihari

  • View
    214

  • Download
    0

Embed Size (px)

Citation preview

Page 1: [ACM Press the 2007 SIGCOMM workshop - Kyoto, Japan (2007.08.27-2007.08.31)] Proceedings of the 2007 SIGCOMM workshop on Internet network management - INM '07 - IP fast reroute with

IP Fast Reroute with Failure Inferencing

Junling Wang and Srihari NelakuditiDepartment of Computer Science and Engineering

University of South Carolina{wang257,srihari}@cse.sc.edu

ABSTRACTFive nines availability is being expected from IP networksdue to the growing popularity of IP telephony and the in-creasing usage of the Internet for mission-critical applica-tions. This necessitates enhancing the resiliency of IP net-works against transient failures that are observed to happenrelatively frequently even in well-managed networks. To-wards that end, we proposed failure inferencing based fastrerouting (FIFR) approach that exploits the existence of aforwarding table per line-card, for lookup efficiency in cur-rent routers, to provide fast rerouting similar to MPLS,while adhering to the destination-based forwarding paradigm.Earlier, we have shown that FIFR can deal with either singlelink or single node failures in a network consisting of point-to-point links with symmetric link weights. In this paper, wegeneralize FIFR to handle both link and node failures in net-works with asymmetric link weights and multi-access linkstoo. Furthermore, we apply FIFR for protecting againstinter-AS failures also. With these extensions, we argue thatFIFR elevates the resiliency of any IP network with minimalchanges to the forwarding and routing planes.

Categories and Subject Descriptors: C.2.2 [NetworkProtocols]: Routing ProtocolsGeneral Terms: AlgorithmsKeywords: Fast Rerouting, Failure Protection

1. INTRODUCTIONWith the increasing dependence on the Internet for de-

livering various critical services, it is expected to be alwaysavailable. In particular, applications such as Voice over IPdemand five-nines availability (99.999% uptime) from IPnetworks as is the case with traditional telephone networks.But it has been observed that transient failures happen rel-atively frequently even in well-managed IP backbone net-works [10]. The existing IGPs such as OSPF and IS-IS typi-cally take several seconds to converge and resume forwardingafter a failure, whereas failure recovery within 50 ms is con-

Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee.INM’07, August 27–31, 2007, Kyoto, Japan.Copyright 2007 ACM 978-1-59593-788-9/07/0008 ...$5.00.

sidered desirable for mission-critical applications [3]. Recentstudies have shown that the IGP convergence can be acceler-ated to sub-second by tuning some parameters of these rout-ing protocols [7], but bringing it down to sub-50ms level canpotentially cause instability in the network, particularly dueto hot-potato routing [13]. MPLS provides fast local rerout-ing effectively with label stacking [12], but it is not scal-able, as it requires a careful configuration of many backuplabel-switched paths for protection. As an alternative, IPfast reroute mechanisms have been proposed that make useof loop-free alternates [2], u-turn alternates [1], not-via ad-dresses [4], or multiple routing configurations [9], for localrerouting upon a failure. While each of these mechanismshave their relative strengths, they either do not reroute toall destinations or rely on encapsulation or marking of data-grams for rerouting. We are interested in an IP fast reroutescheme that requires little or no changes to datagram for-mat, forwarding process, or routing protocol while ensuringforwarding continuity to all reachable destinations.

We proposed a failure inferencing based fast rerouting (FIFR)approach earlier for handling transient failure of a link [11]or a node [15] in an IP network, leveraging the existence ofa forwarding table per line-card of a router. Under FIFR,routers prepare for failures by computing interface-specificforwarding tables, i.e., a packet’s next hop depends on bothits destination address and incoming interface. A router,without being explicitly notified of a failure, can infer itfrom the incoming interface and destination address, andprecompute interface-specific forwarding entries excludingthe inferred failures. When a link/node fails, adjacent routersuppresses global advertisement and instead initiates localrerouting of packets that were to be forwarded through thefailed link/node. All other routers simply forward packetsaccording to their precomputed interface-specific forward-ing tables without relying on network-wide link-state adver-tisements. The attraction of FIFR is that the only changeneeded to the existing routers, with a line-card per inter-face, for providing protection against single link/node fail-ures, is to replace the Dijkstra’s algorithm for computinginterface-independent forwarding entries with an algorithmto compute an additional interface-specific forwarding entryper destination. However, our earlier approach to handlenode failures treats a link failure also as a failure of its tailnode. This presumption could result in no route to the des-tination, even when there exists a path to it without thefailed link, if the tail node partitions the network. More-over, our previous work assumes that the network consistsof point-to-point links with symmetric link weights only.

268

Page 2: [ACM Press the 2007 SIGCOMM workshop - Kyoto, Japan (2007.08.27-2007.08.31)] Proceedings of the 2007 SIGCOMM workshop on Internet network management - INM '07 - IP fast reroute with

We make several contributions in this paper to strengthenFIFR and make it applicable to a broad range of IP net-works. First, we revise FIFR to guarantee loop-free for-warding to a destination if it is reachable, regardless of thefailure of a link or a node in the network. Second, we gen-eralize FIFR to handle failures in networks of bidirectionallinks with different link weights in each direction. This re-quires revamping of the procedures for inferring potentiallink/node failures and computing interface-specific forward-ing entries. A noteworthy feature of the new version of FIFRis that it works fine in a network with a mix of symmetricand asymmetric link weights, and behaves exactly like theearlier version when all links have symmetric weights. Third,we show that FIFR is suitable even for networks with multi-access links by modelling each such link as a virtual nodewith an adjacency between it and each of the nodes adjacentto that link. Finally, we apply FIFR for protection againstthe failure of an AS peering link or an AS border node in caseof an AS with multiple egresses to a destination. With theseextensions, we believe FIFR would become a compelling al-ternative to the existing IP fast reroute schemes.

2. LINK AND NODE FAILURESIn this section, we introduce the FIFR approach for han-

dling transient failures in a network consisting of point-to-point links with symmetric link weights. First, we describethe general framework of the FIFR approach. We thenpresent the details on how to handle link failures, followedby node failures, and finally both link and node failures.

2.1 FIFR FrameworkUnder FIFR, a router usually forwards a packet to the

next-hop along the shortest path to its destination. But incase of a failure, the router adjacent to it locally reroutespackets that were affected by the failure to alternate next-hops without notifying all other routers about the failure.The central idea behind the FIFR approach is the abilityof a router to infer potential non-adjacent failures from theincoming interface and the destination of a packet withoutbeing explicitly informed of the failure. When a packet ar-rives at a router through an unusual interface along the re-verse shortest path1, due to local rerouting by the routeradjacent to the failure, through which it would never arrivehad there been no failure, that router can identify the cor-responding set of potential individual failures. Since theseinferences can be made in advance, the forwarding entry as-sociated with that interface and destination can be precom-puted excluding the corresponding potential failures fromthe network topology. Due to such failure inferencing andinterface-specific forwarding, in case of a single failure, FIFRcan achieve both loop-freedom and destination-reachability,the two fundamental objectives of any fast rerouting scheme.

The packet forwarding under FIFR can be summarized asfollows. Each router i under FIFR maintains a routing tableentry Rd

i per each destination d. In addition, it keeps a for-warding table entry Fd

j→i and a backwarding table entry Bdi→j

per each neighbor j and destination d. A packet originatingat i to destination d is forwarded to Rd

i . A packet destined

1Since FIFR operates within AS, we do not expect it tointerfere with unicast reverse path filtering (uRPF) [6] de-ployed at network ingress nodes for defeating denial of ser-vice attacks which employ IP source address spoofing.

Table 1: NotationV set of all verticesE set of all edgesE(n) set of all edges adjacent to node nRd

i set of next hops from i to dFd

j→i set of next hops from j→i to d.Bd

i→j set of back hops from i→j to d.KLd

j→i key link corresponding to j→i and d.KN d

j→i key node corresponding to j→i and d.SPT(i,V ′, E ′) shortest path tree at i w.r.t. (V ′, E ′)rSPT(i,V ′, E ′) reverse SPT rooted at i w.r.t (V ′, E ′)N(d, T ) next hops to d from root of SPT TP (d, T ) previous hops from d to root of rSPT T

for d arriving at i through neighbor j is forwarded to Fdj→i.

When an adjacent link i−j fails or adjacent node j fails,router i locally reroutes a packet destined for d to Bd

i→j , if itwere to be forwarded to j had there been no failure.

The computation ofRdi and Bd

i→j is straightforward whereas

the computation of Fdj→i is intricate but captures the essence

of FIFR approach. Based on the notation in Table 1,

Rdi = N(d, SPT(i,V, E))

The computation of Bdi→j depends on whether the adjacent

failure is treated as a link or a node failure. Assuming onlylink failures, for ease of illustration, we have

Bdi→j = N(d, SPT(i,V, E \ {i−j}))

Again, assuming only links fail, before computing Fdj→i, we

need to infer KLdj→i, a key link failure among all the potential

failures that would cause a packet to destination d arrive atnode i through neighbor j. We can then compute Fd

j→i by

removing KLdj→i from the network topology as follows.

Fdj→i = N(d, SPT(i,V, E \ {KLd

j→i}))

2.2 Identifying Key LinksWe now describe how a key link is identified for each in-

terface and destination. A link u→v is considered to be acandidate key link w.r.t. interface j→i and destination d, ifit satisfies both of the following conditions:

1. with u→v, j is a next hop from i to d.

2. without u→v, shortest path from u to d contains j→i.

In other words, u→v is a candidate if it is along the shortestpath from i via j to d, and it’s failure causes a packet ford to arrive at i through j along the reverse shortest path.Since these candidate links are common to all the shortestpaths from i to d [11], we can find the one closest to thedestination, referred to as the key link KLd

j→i. Once the keylink is identified, the forwarding entry can be computed asshown earlier by excluding it from the network topology. Wehave shown that if there exists a path from i to d withoutthe failed link, there exists a path without the key link [11].

Consider the topology shown in Fig. 1 where each link islabelled with its weight. When a packet destined to F arrivesat A via B, router A can infer that either B−E or E−F musthave been down. Therefore, the key link corresponding tointerface B→A and destination F, KLF

B→A is E−F since it

269

Page 3: [ACM Press the 2007 SIGCOMM workshop - Kyoto, Japan (2007.08.27-2007.08.31)] Proceedings of the 2007 SIGCOMM workshop on Internet network management - INM '07 - IP fast reroute with

is closer to F than B−E. Consequently, FFB→A is H, i.e., A

forwards the packet to H ensuring that it reaches destinationF without actually knowing whether B−E or E−F failed.

2.3 Identifying Key NodesAnalogous to key link inferencing, we can also identify key

nodes assuming only node failures. A node v is considered tobe a candidate key node w.r.t. interface j→i and destinationd, if it satisfies both of the following conditions:

1. with v, j is a next hop from i to d.

2. without v, the shortest path from a parent node of v(w.r.t. SPT(i,V, E)) to d contains j→i.

Among all the candidates, the one closest to d is referred toas the key node, KN d

j→i. Hence, for handling node failures,the forwarding and backwarding table entries would be

Bdi→j = N(d, SPT(i,V \ {j}, E \ E(j)))

Fdj→i = N(d, SPT(i,V \ {KN d

j→i}, E \ E(KN dj→i)))

Consider the previous illustration based on Fig. 1 where apacket destined for F arrives at A via B. For the sake of con-venience, let us refer to the link version of FIFR as FIFRL

and the node version as FIFRN . Under FIFRN , A wouldidentify E as the key node, KNF

B→A, and forward the packetto H as before. Now imagine a scenario where a packet isbeing forwarded from A to D when node H failed. UnderFIFRL, A treats it as the failure of A−H and reroutes thepacket to G which in turn reroutes it back to A, resulting ina forwarding loop. In contrast, under FIFRN , A reroutes thepacket to D along the path A−B−E−F−D. But in anotherscenario, when link H−D is down, a packet from A to D issuccessfully forwarded by FIFRL while it is discarded un-der FIFRN by H incorrectly assuming that node D is down.This is referred to as the last-hop problem with FIFRN [9].

2.4 Merging Key Links and NodesDue to the discrepancy in the preparation and the occur-

rence of the type of failures, there could be a forwarding loopin case of a node failure, if we prepare only for link failuresas in FIFRL. On the other hand, some destinations maynot be reachable in case of a link failure if it is treated asa node failure like in FIFRN . Note that the latter happensonly when the failure of a single node partitions the networkor a link adjacent to the destination, i.e., last-hop, is down.Therefore, we propose to remedy this by having a router un-der FIFR treat an adjacent failure as a node failure, unlessdoing so causes the destination to be unreachable, in whichcase it will be treated as a link failure. Thus, we have

Bdi→j = N(d, SPT(i,V \ {j}, E \ E(j)))

If Bdi→j is ∅, then

Bdi→j = N(d, SPT(i,V, E \ {i−j}))

The other nodes non-adjacent to the failure identify boththe key node KN d

j→i and the key link KLdj→i. If KN d

j→i is ∅,

Fdj→i = N(d, SPT(i,V, E \ {KLd

j→i}))

Otherwise,

Fdj→i = N(d, SPT(i,V \ {KN d

j→i}, E \ E(KN dj→i)))

1

1

1

2 2

1

C

A E F

1

D

3

G

2

H

1

B

Figure 1: Topology with symmetric link weights

1

A

C

D

GFE

B

5

1

5

51

1

1

1 2

25 2

1

1

1

6

1

Figure 2: Topology with asymmetric link weights

One final wrinkle is that whenever a failure adjacent tonode i is treated as the failure of link i−j, to avoid forward-ing loops when node j indeed failed, node i needs to encap-sulate the packet with destination d, using generic routingencapsulation [5], in another packet with destination j be-fore rerouting to Bj

i→j . When the packet arrives at j, it isdecapsulated and forwarded to destination d. FIFR dictatesthat no more than one-level of encapsulation is done perpacket. Consequently, when node j is indeed down, the en-capsulated packet would arrive at another node adjacent toj, which would discard it since it has already been reroutedas indicated by the encapsulation. We understand that en-capsulation/decapsulation add additional burden on routers.Note that a packet is encapsulated under FIFR only whenthe failure of a node renders its destination to be unreach-able, which happens rarely in practical topologies. Thus,the revised FIFR forwards successfully to all reachable des-tinations while being agnostic to the type of failure.

3. ASYMMETRIC LINK WEIGHTSAll the versions of FIFR presented above assume that the

network consists of links with symmetric weights only. Theyall share the property that a non-adjacent router infers keyfailure and performs unusual forwarding only when a packetarrives through an unusual interface of the router along thereverse shortest path w.r.t. the destination. Otherwise, ifthe packet arrives through any other interface, the routerjust forwards it to the usual next-hop to its destination. Sim-ilarly, a router adjacent to the failure determines backward-ing table entries by simply excluding the failed link/node, asit would compute the usual routing table entries if the adja-cent link/node is permanently down. While these relativelysimple operations suffice to handle failures in networks withsymmetric link weights, forwarding loops can occur due tolocal rerouting, when link weights are asymmetric. In thefollowing, we illustrate the problem and present the fix byredesigning backwarding and forwarding table computation.

Suppose the network is as given in Fig. 2 where eachdirected link is labelled with its weight. For ease of il-

270

Page 4: [ACM Press the 2007 SIGCOMM workshop - Kyoto, Japan (2007.08.27-2007.08.31)] Proceedings of the 2007 SIGCOMM workshop on Internet network management - INM '07 - IP fast reroute with

lustration, consider the forwarding of a packet from C toD, under FIFRL, when link C−D is down. Upon detect-ing the failure, node C recomputes the alternate path asC→B→A→E→F→D and forwards the packet to B. NodeB infers the failure of link C−D and forwards the packet tonode A according to its precomputed interface-specific for-warding table entry. Node A in turn forwards the packet toits usual next-hop E. When node E receives the packet fromA, however, it will not be able to infer any failure, since in-terface A→E is a usual interface from A via E to reach D.Therefore, E forwards the packet to its usual next-hop B,which in turn forwards it to its usual next-hop C, resultingin a forwarding loop C→B→A→E→B→C→B· · · .

The shortest path from node s to node d in an asymmetricnetwork is not the exact reverse of that from d to s. Forexample, in Fig. 2, node E reaches B directly while B reachesE through A. This leads to forwarding loops in asymmetricnetworks since link weights used during usual forwardingand local rerouting upon failures are inconsistent. To enforceconsistency, upon a failure adjacent to s, we require that apacket from s to d is forwarded along rrSP(s, d), i.e., thereverse path of the shortest path from d to s. Note thatrrSP(s, d) equals the shortest path from s to d when alllinks in the network are symmetric. In order to computerrSP, a router can build the reverse shortest path tree rSPT,where each path from the root represents a rrSP. Similarly,non-adjacent routers infer failures and compute forwardingentries for unusual interfaces using rSPT as described below.Note that rrSP is only used for local rerouting upon a failureto affected destinations, while the conventional shortest pathis still used for failure-free forwarding.

Let us consider the previous example again where a packetis being forwarded from C to D. The shortest path fromD to C without C−D is D→G→F→E→B→C. Therefore,rrSP(C,D) excluding C−D is C→B→E→F→G→D. NodeB, which is not adjacent to the failed link C−D, can inferthe failure associated with the corresponding reverse usualinterface by computing the rrSP to D, and forward it to Einstead of A as before. Node E would also infer similarly asB and forward the packet to F. Since E is not the usual nexthop of F to D, F would forward the packet to its usual next-hop which is node D itself. Thus, a packet from C arrivesat D along a loop-free path C→B→E→F→D.

The computation of backwarding and forwarding table en-tries have to be done differently to accommodate asymmetriclink weights. First, backwarding entries would be,

Bdi→j = P (d, rSPT(i,V \ {j}, E \ E(j)))

If Bdi→j is ∅, then

Bdi→j = P (d, rSPT(i,V, E \ {i−j}))

The computation of forwarding entries involves redefiningthe conditions for identifying candidate key links and nodes.A link u→v is a candidate key link w.r.t. j→i and d, if

1. with u→v, j is a next hop from i to d.

2. without u→v, rrSP(u,d) contains j→i.

A node v is a candidate key node w.r.t. j→i and d, if

1. with v, j is a next hop from i to d.

2. without v, the rrSP from a parent node of v (w.r.t.SPT(i,V, E)) to d contains j→i.

The key node KN dj→i and key link KLd

j→i will still be thoseclosest to d among the corresponding candidates. Now,if KN d

j→i 6= ∅,

Fdj→i = P (d, rSPT(i,V \ {KN d

j→i}, E \ E(KN dj→i)))

else if KLdj→i 6= ∅,

Fdj→i = P (d, rSPT(i,V, E \ {KLd

j→i}))

otherwise,

Fdj→i = N(d, SPT(i,V, E))

When all links are symmetric, the above formulation yieldsthe same backwarding and forwarding entries as before.

4. MULTI-ACCESS LINKSThe description of FIFR so far assumed that the network

consists of only point-to-point links such that each routerknows the neighbor attached to the incoming interface of apacket, enabling it to infer failures. It is not obvious whetherFIFR can be generalized to work with broadcast LANs andnon-broadcast multi-access (NBMA) links where multipleneighbors are attached to a router through the same inter-face. It appears as if FIFR can not be employed in suchnetworks, since a router seems not to know which neighboris forwarding it the packet. However, in the following, weargue that it is still possible to infer non-adjacent failureseven in the presence of multi-access links.

4.1 Failure DetectionCompared to point-to-point links where a failure is either

a link or a node failure, broadcast links have more possiblescenarios of failures. We consider, w.r.t. a node i and a next-hop node j, the failure of one of the following: 1) interfaceof i to LAN; 2) LAN itself; 3) interface of j to LAN; 4) nodej. We treat the first two as LAN failure and the last twoas node failure. We assume that the linecard hardware of arouter attached to a broadcast LAN can detect the failureof its connection to LAN, as in SDH/SONET and GigabitEthernet, which could signal either the LAN failure or theinterface failure. In order to handle the failure promptly,we consider it as the LAN failure without further diagnosis.However, in general there is no easy way for a router tolearn from the hardware about the failure of another routeron the broadcast LAN. Bidirectional Forwarding Detection(BFD [8]), an implementation of faster hello, can be used inthis case to detect the individual router failure.

4.2 Modeling Broadcast LinksInstead of modeling the connectivity between each pair of

routers in a broadcast LAN as a point-to-point link, OSPFrepresents a LAN as a virtual node in the link state databaseand elects a designated router (DR) for the LAN. Only theDR forms the adjacency between the virtual node and theother routers in the LAN, and is responsible for generatingLSAs for the LAN itself. Fig. 3 illustrates a simple exampleof a routing domain that contains a LAN connecting threerouters, C, E and F. Under OSPF, the LAN is modeled asa virtual node N in the link state database with the cost ofedges from N to routers as 0 as shown in Fig. 4. By repre-senting the LAN as a virtual node in the topology graph, theLAN interfaces are modeled as point-to-point links between

271

Page 5: [ACM Press the 2007 SIGCOMM workshop - Kyoto, Japan (2007.08.27-2007.08.31)] Proceedings of the 2007 SIGCOMM workshop on Internet network management - INM '07 - IP fast reroute with

G1 2

C

F

1

1

1

1

E

A

D6

7

2

2

B

Figure 3: A network with routers on a LAN

G

1

A C

F

1

1

2

1

1

0

0

0

N

D

E

6

7

B

2

2

Figure 4: A representation of the LAN

the routers and the virtual LAN node, and the topology be-comes a graph with asymmetric links. Thus, FIFR describedin the previous section can be applied without any hitches.The following details how non-adjacent nodes infer failuresin a routing domain with broadcast LANs.

4.2.1 Inference of LAN FailureAs explained earlier, an adjacent router treats the failure

associated with a LAN as if the LAN itself failed. The othernon-adjacent routers can infer the LAN failure using theusual key node definition described in the previous section.Since the LAN is modeled as a virtual node, FIFR does notinfer the link failure for all links between the routers andthe LAN node. For example, in Fig. 4, links C−N, F−N andE−N are not considered to fail separately and therefore theycan not be the key links for any interface and destination.

4.2.2 Inference of LAN Router FailureWhen a router does not receive a BFD response from an-

other LAN router, it treats the failure as a node failure ifits failure does not partition the routing domain. For exam-ple, without any failure, C would forward a packet destinedfor D to F over the LAN. When C does not receive BFDresponse from F, it simply considers it as the failure of F.Node C then computes the alternate rrSP, which would beC→B→A→D and forwards the packet to B. Again, the otherrouters can infer the node failure through the usual defini-tion of the key node whose failure will cause its parent nodeto reroute the packet through the reverse usual interface.We want to stress that when FIFR is applied to LANs, theparent node of a router on LAN can not be the virtual LANnode as suggested by the topology, since the LAN node itselfcan not make rerouting decisions. In the previous example,the shortest path from B to D is B→C→N→F→D. How-ever, when B infers the key node, C should be consideredthe parent of node F instead of N. Otherwise, if N is treatedas the parent, the failure of F causes N to reroute the packetthrough E to D, not through C→B. Thus, F would not bethe key node for interface C→B and destination D. So, whenB receives a packet from C, it infers only the LAN failureand forwards the packet to G, causing the packet to oscillatebetween G and C. Except for taking care of this, FIFR can

AS1

A B

AS2

P

C DE

(a) Multiple peering links

AS1

A B

P

AS3 AS4

C DE

(b) Peering with multiple ASes

Figure 5: AS with multiple egress nodes to a prefix

be applied as is to networks with multi-access links.When the failure of a router on a LAN partitions the rout-

ing domain or specifically the router itself is the destination,it is not appropriate to treat it as the node failure. In thiscase, the router adjacent to the failure attempts to deliverthe packet to its destination by encapsulating the packet inanother packet with the next-hop before the failure as itsdestination and forwards it along an alternate path afterexcluding the virtual LAN node. This is similar to the pro-cedure we discussed in the previous section on merging keylinks and nodes. For example, if C has a packet destined forF, when it does not get BFD response from F, C finds analternate path through B to F after excluding the LAN. Incase node F indeed failed, the other adjacent node G will dis-card the encapsulated packet as it indicates that the packethas already encountered a failure before. In summary, thegeneralized version of FIFR protects against both link andnode failures in an IP network with point-to-point/multi-access links with symmetric/asymmetric weights.

5. INTER-AS FAILURESThe discussion of FIFR thus far has been about dealing

with failures inside an autonomous system (AS). We nowexplain how FIFR approach is applicable for dealing withnot only intra-AS failures but also inter-AS failures.

First, let us examine how the forwarding entry is com-puted currently at a node inside an AS to a destinationprefix outside that AS. Consider the scenario illustrated inFig. 5 where AS1 can reach destination prefix P via its egressnodes C and D through the same AS (5(a)) or different ASes(5(b)). The corresponding BGP NEXT-HOPs are A and Brespectively. One of them may be chosen as the best NEXT-HOP according to BGP policies based on attributes such asLOCAL PREF or MED. If there is a tie, the closest nodein terms of IGP distance is chosen as the best NEXT-HOP,as per hot-potato routing principle. Once the best BGPNEXT-HOP is chosen for each destination prefix, then theIGP next-hop to that BGP NEXT-HOP is computed basedon IGP metrics. In other words, a forwarding entry for adestination prefix P at a node E is determined by first map-ping from P to a BGP NEXT-HOP (A or B) and then fromBGP NEXT-HOP to an IGP next-hop (C or D).

Now, we describe how FIFR computes forwarding entriesto the destination prefix P at node E to protect against thefailure of any one of the AS border nodes A, B, C, D, and theinter-AS links A−C, B−D. We require that, for each prefix,apart from the best (primary) NEXT-HOP, FIFR is made

272

Page 6: [ACM Press the 2007 SIGCOMM workshop - Kyoto, Japan (2007.08.27-2007.08.31)] Proceedings of the 2007 SIGCOMM workshop on Internet network management - INM '07 - IP fast reroute with

aware by BGP of at least one more (secondary) NEXT-HOP.Suppose A is the primary and B the secondary NEXT-HOPto P. The core idea is then to assign an appropriate IGP coststo the “virtual links” A−P and B−P, as in Fig. 5, only forthe purpose of computing forwarding table entries at a node.If A is chosen as primary based on LOCAL PREF or MED,then a small cost of 1 is assigned to A−P, while assigninga sufficiently large cost to B−P. Otherwise, when primaryis chosen based on IGP distance, an equal cost is assignedto both the virtual links, i.e., cost(A−P) = cost(B−P) = 1.Note that the view of the topology including these virtuallinks is specific to the prefix, but is consistent across all thenodes within an AS. It may appear that the number of suchviews will be too many, but all the prefixes that have thesame set of primary and secondary NEXT-HOPs share thesame view. For example, in an AS such as AS1 in Fig. 5with two egress nodes, at most three such topology viewsare needed regardless of the number of prefixes.

Given the topology view corresponding to a destinationprefix P, FIFR can then be applied as is to determine interface-specific forwarding entries to P at a node such as E. Basedon this view, the failure of primary NEXT-HOP node A orprimary egress node C or primary peering link A−C, is in-ferred by E as the failure of node A, when the packet arrivesthrough an interface along the reverse shortest path fromE to A, i.e, A is the key node for that interface for desti-nation P. Therefore, E will forward the packet towards thesecondary NEXT-HOP B via egress node D. Since the topol-ogy view is consistent at all routers within the AS, the packetunder FIFR gets rerouted consistently from a node adjacentto the failure to the destination via secondary egress D.

Occasionally, FIFR may reroute packets to the secondaryegress on a few intra-AS failures even when the primaryinter-AS link is still up. This is because failure inferencingin FIFR is inclusive and intra-AS failures can cause somenodes to infer the inter-AS failure. It may be undesirablethat FIFR switches to the secondary even when there existsa path via the primary to the destination. But it only lastsfor a short period of time after the failure while it is beingsuppressed. In the other case where there is only one BGPpath to the destination, the key node then can not be theprimary NEXT-HOP. For example, assume that AS4 doesnot exist in Fig. 5(b). In this case, node A can not be the keynode to P for any interface, since its failure would not causethe packet to be rerouted but to be discarded. Therefore, onall intra-AS failures within a stub AS, FIFR would attemptto find an alternate route to its only egress node.

This problem of recovering from BGP peering failures isaddressed in [3] by relying on the provisioning of protec-tion tunnels between the primary egress and the secondaryegress or the NEXT-HOP. In contrast, FIFR does loop-freelocal rerouting upon any single intra-AS or inter-AS failurewithout setting up of additional protection tunnels.

6. CONCLUSIONSIn this paper, we revised a previously proposed FIFR ap-

proach for local rerouting around failed links and nodes with-out explicit link state updates. We generalized FIFR toguarantee protection against any single failure of intra-ASor inter-AS, link or node, in a network with point-to-pointor multi-access links with symmetric or asymmetric weights.Due to space limitation, no proofs on the correctness or com-plexity of FIFR are included here but can be found in [14].

7. ACKNOWLEDGEMENTSWe thank Amund Kvalbein and Audun Fosselie Hansen

from Simula Research Laboratory for their valuable sugges-tions on extending the FIFR approach to handle inter-ASfailures. We also acknowledge the support of the NationalScience Foundation under CAREER Award CNS-0448272and CRI grant CNS-0551650. The opinions and findings inthis paper are those of the authors and do not necessarilyrepresent the views of the National Science Foundation.

8. REFERENCES[1] Atlas, A. U-turn Alternates for IP/LDP

Fast-Reroute. IETF Internet Draft, Feb. 2005.draft-atlas-ip-local-protect-uturn-02.txt.

[2] Atlas, A., et al. Loop-Free Alternates for IP/LDPLocal Protection”. Internet Draft(work in progress),May 2005. draft-atlas-ip-local-protect-loopfree-00.txt.

[3] Bonaventure, O., Filsfils, C., and Francois, P.Achieving Sub-50 Milliseconds Recovery Upon BGPPeering Link Failures.

[4] Bryant, S., Shand, M., and Previdi, S. IP FastReroute using Not-via Addresses. Internet Draft(workin progress), Mar. 2006.draft-bryantshand-IPFRR-notvia-addresses-02.txt.

[5] Farinacci, D., Li, T., Hanks, S., Meyer, D., andTraina, P. Generic Routing Encapsulation (GRE).RFC 2784, Mar. 2000.

[6] Ferguson, P., and Senie, D. Network IngressFiltering: Defeating Denial of Service Attacks whichemploy IP Source Address Spoofing. RFC 2827, May2000.

[7] Francois, P., Filsfils, C., Evans, J., andBonaventure, O. Achieving Sub-Second IGPConvergence in Large IP Networks. ACM SIGCOMMComputer Communications Review 35, 2 (July 2005),35–44.

[8] Katz, D., and Ward, D. Bidirectional ForwardingDetection. draft-ietf-bfd-base-06.txt, Mar. 2007.

[9] Kvalbein, A., Hansen, A., Cicic, T., Gjessing, S.,and Lysne, O. Fast IP Network Recovery usingMultiple Routing Configurations. In Proc. IEEEInfocom (Apr. 2006).

[10] Markopulu, A., Iannaccone, G., Bhattacharya,S., Chuah, C.-N., and Diot, C. Characterization offailures in an IP backbone. In Proc. IEEE Infocom(Mar. 2004).

[11] Nelakuditi, S., Lee, S., Yu, Y., Zhang, Z.-L.,and Chuah, C.-N. Fast Local Rerouting for HandlingTransient Link Failures. IEEE/ACM Trans.Networking 15, 2 (Apr. 2007), 359–372.

[12] Sharma, V., and Hellstrand, F. Framework forMPLS-based Recovery. RFC 3469, Feb. 2003.

[13] Teixeira, R., Shaikh, A., Griffin, T., andRexford, J. Dynamics of Hot-Potato Routing in IPNetworks. In Proc. ACM Sigmetrics (June 2004).

[14] Wang, J., and Nelakuditi, S. IP Fast Reroute withFailure Inferencing. Tech. Rep. TR-2007-006,University of South Carolina, June 2007.

[15] Zhong, Z., Nelakuditi, S., Yu, Y., Lee, S., Wang,J., and Chuah, C. Failure Inferencing based FastRerouting for Handling Transient Link and NodeFailures. In GI (Mar. 2005).

273