Route flap damping: harmful? NANOG / Eugene 2002.10.28 Randy Bush (IIJ) Tim Griffin (AT&T Research) Z. Morley Mao (UC Berkeley) <http://psg.com/~randy/021028.zmao-nanog>



2002.10.27 Flap 2

Verified using BGP Beacons

• BGP Beacon:
  - A prefix that is announced and withdrawn at well-known times

[Figure: the 1st BGP Beacon (198.133.206.0) attached to the Internet cloud, seen from three observation points: 1 Oregon Route Views, 2 RIPE, 3 Berkeley]

2002.10.27 Flap 3

Transient instability

• Router reboot
  - Due to circuit or software upgrade, etc.
• A single link flap
  - Due to network congestion, link connectivity problems, etc.

[Figure: timeline of route availability (available, unavailable, available), marking a withdrawal and a re-announcement; interval labels "< 1 hour" and "one hour"; callout "Route is suppressed!"]

2002.10.27 Flap 4

What is route flap damping?

• RFC2439
• Supported by all major router vendors
• Believed to be widely deployed
• Responsible for Internet stability?

• Goals:
  - Reduce router processing load due to instability
  - Do not sacrifice convergence times for well-behaved routes (!?)

2002.10.27 Flap 5

How does route flap damping work?

• For each peer, per destination, keep a penalty value
• Penalty increases for each flap
  - A flap is a route change
• Penalty decays exponentially

$$P(t') = P(t)\, e^{-\lambda (t' - t)}$$

• Parameters:
  - Fixed: penalty increment
  - Configurable: half-life, suppress-threshold, reuse-threshold, max suppressed time

[Figure: penalty versus time (min), using default Cisco parameters. Penalty axis from 0 to 3000, with the suppress threshold at 2000 and the reuse threshold at 750; time axis marked at 2, 4 and 32 min; the falling part of the curve is labelled "Exponentially decayed".]
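The penalty curve can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: each flap is modelled as a withdrawal worth 1000, the half-life is 15 minutes, the thresholds are 2000 and 750 (the Cisco defaults quoted on these slides), and the 60 minute max suppress time is ignored. The function names are illustrative and do not come from any router implementation.

```python
import math

# Cisco default values as quoted on the slides (assumed here).
HALF_LIFE_MIN = 15.0
FLAP_PENALTY = 1000.0        # penalty added per withdrawal
SUPPRESS_THRESHOLD = 2000.0
REUSE_THRESHOLD = 750.0
DECAY_RATE = math.log(2) / HALF_LIFE_MIN   # lambda, per minute


def decay(penalty, minutes):
    """P(t') = P(t) * exp(-lambda * (t' - t))."""
    return penalty * math.exp(-DECAY_RATE * minutes)


def penalties_after_flaps(flap_times_min):
    """Return the penalty value right after each flap."""
    penalty, last, history = 0.0, None, []
    for t in flap_times_min:
        if last is not None:
            penalty = decay(penalty, t - last)
        penalty += FLAP_PENALTY
        history.append(penalty)
        last = t
    return history


def minutes_until_reuse(penalty):
    """Time for the penalty to decay down to the reuse threshold."""
    if penalty <= REUSE_THRESHOLD:
        return 0.0
    return math.log(penalty / REUSE_THRESHOLD) / DECAY_RATE


# Three flaps, two minutes apart (hypothetical input).
history = penalties_after_flaps([0, 2, 4])
suppressed_for = minutes_until_reuse(history[-1])
print([round(p) for p in history], round(suppressed_for, 1))
```

With flaps at 0, 2 and 4 minutes the penalty first crosses 2000 on the third flap and needs a little over 28 minutes to fall back to 750, which is consistent with the 32 minute mark on the time axis.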

2002.10.27 Flap 6

Router vendor default values

• Cisco
  - Three flaps can suppress route
• Juniper
  - Minimum four flaps to suppress route
• Example:
  - Three flaps with 2 min interval
  - Cisco: suppress on the third flap for more than 28 minutes

Parameter                    Cisco   Juniper
Withdrawal penalty           1000    1000
Re-advertisement penalty     0       1000
Attributes change penalty    500     500
Suppress threshold           2000    3000
Half-life (min)              15      15
Reuse threshold              750     750
Max suppress time (min)      60      60
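The "three flaps" versus "four flaps" figures can be checked roughly against the table. The sketch below assumes that every penalized update adds 1000 (a withdrawal on Cisco; a withdrawal or re-advertisement on Juniper), that updates arrive at a fixed spacing, and that a route is suppressed only once the penalty strictly exceeds the suppress threshold. The function name and the spacing values are illustrative.

```python
HALF_LIFE_MIN = 15.0  # same half-life for both vendors in the table


def updates_to_suppress(penalty_per_update, suppress_threshold,
                        spacing_min, limit=1000):
    """Count penalized updates until the penalty exceeds the threshold.

    Returns None if the threshold is not reached within `limit` updates
    (possible when updates are spaced far enough apart).
    """
    decay = 0.5 ** (spacing_min / HALF_LIFE_MIN)
    penalty = 0.0
    for count in range(1, limit + 1):
        penalty = penalty * decay + penalty_per_update
        if penalty > suppress_threshold:
            return count
    return None


cisco = updates_to_suppress(1000, 2000, spacing_min=2)
juniper = updates_to_suppress(1000, 3000, spacing_min=2)
print(cisco, juniper)
```

Under these assumptions the counts come out as 3 for the Cisco column and 4 for the Juniper column, and they stay the same when the updates arrive back to back (`spacing_min=0`).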

2002.10.27 Flap 7

Cascaded withdrawals! (2)

• Peer: 212.47.190.1, AS=9177 from RIPE
• In response to WD-beacon at 18:00, Aug 10th.
• Using Cisco setting + RIPE229 recommendation

Time 8/10   A/W   ASPath                             Penalty
18:00:15    A     9177 3320 1 2914 3130 3927         500
18:00:41    A     9177 6730 5400 2914 3130 3927      990
18:01:41    A     9177 3320 2914 3130 3927           1445
18:03:06    A     9177 3320 1239 2914 3130 3927      1853
18:03:35    W                                        2812
18:04:03    A     9177 6730 5400 2914 3130 3927      2752
18:04:31    W                                        3694

(Slide annotation on the table: "Above suppress threshold")
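The penalty column for the AS9177 peer can be replayed from the timestamps alone. The sketch below assumes the Cisco penalties from the vendor table (withdrawal 1000, re-advertisement 0, attribute change 500) and a 15 minute half-life. It also assumes that the peer already held a route before the beacon was withdrawn, so an announcement that follows another announcement counts as an attribute change, and an announcement that follows a withdrawal counts as a re-advertisement. The classification rule and all names are illustrative.

```python
HALF_LIFE_S = 15 * 60  # 15 minute half-life, in seconds

# Cisco penalties as listed in the vendor table (assumed here).
PENALTY = {"withdrawal": 1000, "readvertisement": 0, "attribute_change": 500}

# (time, A/W) pairs copied from the AS9177 table.
UPDATES = [
    ("18:00:15", "A"), ("18:00:41", "A"), ("18:01:41", "A"),
    ("18:03:06", "A"), ("18:03:35", "W"), ("18:04:03", "A"),
    ("18:04:31", "W"),
]


def to_seconds(hms):
    h, m, s = (int(x) for x in hms.split(":"))
    return h * 3600 + m * 60 + s


def replay(updates):
    """Return the penalty after each update."""
    penalty, last_t, last_kind = 0.0, None, "A"  # route assumed present
    out = []
    for hms, kind in updates:
        t = to_seconds(hms)
        if last_t is not None:
            penalty *= 0.5 ** ((t - last_t) / HALF_LIFE_S)
        if kind == "W":
            penalty += PENALTY["withdrawal"]
        elif last_kind == "W":
            penalty += PENALTY["readvertisement"]
        else:
            penalty += PENALTY["attribute_change"]
        out.append(penalty)
        last_t, last_kind = t, kind
    return out


penalties = replay(UPDATES)
print([int(p) for p in penalties])
```

Under these assumptions the replayed values agree with the Penalty column to within one unit, which suggests that the table was produced by applying the same per-update rule to the observed update stream.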

2002.10.27 Flap 8

Cascaded withdrawals! (1)

• Peer: 213.200.87.254, AS=3257 from RouteViews
• In response to WD-beacon at 01:00, Aug 20th.
• Using Cisco setting + RIPE229 recommendation
• (Note: first 2 announcements differ in community attributes)

Time 8/20   A/W   ASPath                             Penalty
01:00:16    A     3257 1299 2914 3130 3927           500
01:00:47    A     3257 1299 2914 3130 3927           988
01:00:50    W                                        1985
01:00:50    A     3257 1299 4200 2914 3130 3927      1985
01:01:13    A     3257 1299 701 2914 3130 3927       2451
01:02:05    W                                        3354

(Slide annotation on the table: "Above suppress threshold")

2002.10.27 Flap 9

Why does this happen?

• BGP is a path vector protocol
  - Explores alternate routes before withdrawal
  - Topology dependent

• Delay in messages due to variations in
  - MinRouteAdver timer values
  - Propagation delays
  - Router processing overhead

• Route flap damping parameter setting
  - Cisco/Juniper punish virtually all route changes
  - Default setting and RIPE-229 recommendation are too aggressive

2002.10.27 Flap 10

The Naïve View

2002.10.27 Flap 11

… reality is a wonderful thing

2002.10.27 Flap 12

No wonder I’m going crazy trying to interpret those BGP updates …

It is easy to construct a 5 node BGP system where a simple Announce/Withdraw signal (a_0 b_0) at one node can produce any of these 52 output signals at another…

a_0 b_0

2002.10.27 Flap 13

Max no. updates for ANN-signal

2002.10.27 Flap 14

Max no. updates for WD-signal

2002.10.27 Flap 15

Multihoming Won’t Save You

• Damping is not per path
• Damping is per prefix!
• Sure, multi-homing gives you more paths
• But damping just loves more paths

2002.10.27 Flap 16

Preliminary ideas to improve route flap damping

• Change the constants
  - Decrease suppression threshold
  - Decrease penalty increment, especially for attribute changes

• Change the algorithms
  - Do we really need damping at all?
  - Note that damping
    - is opaque
    - happens on the other side of the Internet
    - and is hard to debug

2002.10.27 Flap 17

Existing BGP Beacons

• Announced and withdrawn with a fixed period (2 hours) between updates
• 1st daily ANN: 3:00AM GMT
• 1st daily WD: 1:00AM GMT

Prefix              Source AS   Start date   Upstream providers   Beacon host
198.133.206.0/24    3927        8/10/02      AS2914, AS1          Randy Bush
192.135.183.0/24    5637        9/4/02       AS3701, AS2914       David Meyer
198.32.7.0/24       3944        10/24/02     AS2914, AS8001       Andrew Partan
203.10.63.0/24      1221        9/25/02      AS1221               Geoff Huston

2002.10.27 Flap 18

Reference

• RIPE-229: “RIPE Routing-WG Recommendations for Coordinated Route-flap Damping Parameters”, 2001
• RFC 2439: “BGP Route Flap Damping”, 1998
• Randy Bush, Tim Griffin, Z. Mao, “Route flap Damping: harmful?”, RIPE 43, 2002
• Tim Griffin, “What is the sound of one route flapping”, Dartmouth talk slides, June 2002
• C. Labovitz, G. R. Malan, F. Jahanian, “Internet Routing Instability”, TON 1998
• C. Labovitz, R. Malan, F. Jahanian, “Origins of Internet Routing Instability”, Infocom 1999

2002.10.27 Flap 19

More references

• C. Labovitz, A. Ahuja, F. Jahanian, “Experimental Study of Internet Stability and Wide-Area Network Failures”, FTCS 1999
• C. Labovitz, A. Ahuja, A. Bose, F. Jahanian, “Delayed Internet Routing Convergence”, Sigcomm 2000
• C. Labovitz, A. Ahuja, R. Wattenhofer, S. Venkatachary, “The Impact of Internet Policy and Topology on Delayed Routing Convergence”, Infocom 2001
• Z. Mao, R. Govindan, G. Varghese, R. Katz, “Route Flap Damping Exacerbates Internet Routing Convergence”, Sigcomm 2002