BGP
Border Gateway Protocol (an introduction)
Karst Koymans
Informatics Institute, University of Amsterdam
(version 3.10, 2014/03/11 10:50:06)
Monday, March 10, 2014
.
General ideas behind BGP
  ▶ Background
  ▶ Providers, Customers and Peers
  ▶ External and Internal BGP
  ▶ BGP information bases
The BGP protocol
  ▶ BGP attributes
  ▶ BGP messages
Traffic Engineering
  ▶ Outbound Traffic Engineering
  ▶ Inbound Traffic Engineering
IBGP scaling
.
BGP version 4
▶ Border Gateway Protocol version 4 (BGP4)
▶ Specified in RFC 4271
▶ The inter-AS routing protocol
▶ “Monopolises” the Internet
▶ Based on path vector routing
  ▶ which is in between distance vector and link state
▶ Uses (often non-coordinated) routing policies
  ▶ which can be problematic for convergence
.
Autonomous system (AS)
Definition (AS: Autonomous System)
▶ A connected group of networks and routers
▶ Representing some assigned set of IP prefixes
▶ Having a single, consistent routing policy
  ▶ Both internally and externally
.
.
Autonomous system illustration
[Figure: Autonomous Systems, showing AS2503, AS192 and AS29077.]
Slide courtesy Iljitsch van Beijnum.
Providers and Customers
[Figure: the Internet, a Provider and a Customer, connected by arrows labelled "IP".]
.
Peers
[Figure: Provider 1, Provider 2 and Provider 3 linked side by side, with Customer 1, Customer 2 and Customer 3 attached; arrows are labelled "IP" and "No packets".]
.
Providers, Customers and Peers
[Figure: AS graph with nodes G1, G2, R1, P1, P2, C1, C2, C3 and C4; some links are labelled "IP".]
.
.
The AS abstraction
AS Graph != Internet Topology
The AS graph may look like this. Reality may be closer to this…
BGP was designed to throw away information!
Slide courtesy Timothy Griffin.
Providers, Customers and Peers routing preferences
▶ The order of preference for a route is
  ▶ Customers have highest preference
  ▶ Peers have the next highest preference
  ▶ Providers have the lowest preference
▶ Transit relationships are enforced by export filtering
  ▶ Do not advertise provider or peer routes to other providers or peers
  ▶ Do advertise all routes to customers
  ▶ Do advertise customer routes to providers and peers
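The export rules above can be sketched as a small decision function. This is an illustration only: the names `Rel`, `PREFERENCE` and `should_export` are hypothetical, the numeric preference values are arbitrary, and locally originated routes are not modelled.

```python
from enum import Enum


class Rel(Enum):
    """Business relationship with a BGP neighbor (hypothetical names)."""
    CUSTOMER = "customer"
    PEER = "peer"
    PROVIDER = "provider"


# Import side: customer routes are preferred over peer routes, which are
# preferred over provider routes. The numbers are illustrative only.
PREFERENCE = {Rel.CUSTOMER: 3, Rel.PEER: 2, Rel.PROVIDER: 1}


def should_export(learned_from: Rel, send_to: Rel) -> bool:
    """Decide whether a route learned from one neighbor type may be
    advertised to a neighbor of another type."""
    if send_to is Rel.CUSTOMER:
        return True  # customers receive all routes
    # Providers and peers only receive customer routes.
    return learned_from is Rel.CUSTOMER


for src in Rel:
    for dst in Rel:
        print(f"{src.value:>8} -> {dst.value:<8} {should_export(src, dst)}")
```

Only three of the nine combinations pass a route upwards or sideways, which is what keeps an AS from providing free transit between its providers and peers.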
.
Providers, Customers and Peers: Import
Import Routes
[Figure: routes are imported from peers, from providers and from customers. Legend: provider route, customer route, peer route, ISP route.]
Slide courtesy Timothy Griffin.
Providers, Customers and Peers: Export
Export Routes
[Figure: routes are exported to peers, to customers and to providers; filters block some of them. Legend: provider route, customer route, peer route, ISP route.]
Slide courtesy Timothy Griffin
.
.
External and Internal BGP (1)
▶ EBGP (External BGP)
  ▶ Used for BGP neighbors between different ASs
    ▶ Exchanging prefixes
    ▶ Implementing policies
▶ IBGP (Internal BGP)
  ▶ Used for BGP neighbors within one and the same AS
    ▶ Distributing Internet prefixes across the backbone in order to create a consistent view among all entry/exit points
    ▶ Inserting locally originated prefixes, for instance for customers that do not speak BGP
.
External and Internal BGP (2)
▶ Routes imported from one IBGP peer are not distributed to another IBGP peer
  ▶ This prevents possible routing loops
  ▶ Loop detection is based on duplicates in AS paths
    ▶ EBGP detects this between different ASs
    ▶ IBGP cannot detect this inside one and the same AS
  ▶ Requires IBGP peers to be configured as a full mesh
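To see why the full-mesh requirement becomes a scaling problem, it helps to count sessions: every pair of IBGP routers needs its own session, so n routers need n(n-1)/2 sessions. A minimal sketch, where the function name is a hypothetical choice:

```python
def ibgp_full_mesh_sessions(routers: int) -> int:
    """Number of IBGP sessions needed when every pair of routers peers directly."""
    if routers < 0:
        raise ValueError("number of routers cannot be negative")
    return routers * (routers - 1) // 2


for n in (4, 10, 100):
    print(f"{n} routers need {ibgp_full_mesh_sessions(n)} IBGP sessions")
```

The quadratic growth (6, 45 and 4950 sessions for 4, 10 and 100 routers) is the motivation for IBGP scaling techniques.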
.
Routing Information Bases (RIBs)
▶ Adj-RIB-In (one per peer)
  ▶ Routes after input filtering
  ▶ Every AS needs an input policy
▶ Loc-RIB (only one globally)
  ▶ Routes after best path selection
  ▶ Path selection is a fixed and specified algorithm
▶ Adj-RIB-Out (one per peer)
  ▶ Routes after output filtering
  ▶ Every AS needs an output policy
.
BGP route processing
[Figure: BGP route processing pipeline.]
1. Receive BGP updates
2. Apply import policies (apply policy = filter routes & tweak attributes)
3. Best route selection, based on attribute values
4. Best route table; install forwarding entries for best routes in the IP forwarding table
5. Apply export policies (apply policy = filter routes & tweak attributes)
6. Transmit BGP updates
Applying policy is open-ended programming, constrained only by the vendor configuration language.
Slide courtesy Timothy Griffin
.
.
BGP protocol
▶ Uses TCP on port 179
  ▶ Usually with a directly connected neighbor on layer 2
▶ Exchanges Network Layer Reachability Information (NLRI)
  ▶ Prefixes that can or can no longer be reached through the router
  ▶ Accompanied by BGP attributes used by the best route selection algorithm
.
Some important BGP attributes
▶ In order of path selection importance
  ▶ LOCAL PREF (Local Preference)
  ▶ AS PATH
  ▶ ORIGIN (Historical)
  ▶ MULTI EXIT DISC (MED; Multi-exit discriminator)
▶ And unrelated to path selection
  ▶ NEXT HOP
    ▶ Must be reachable (directly or via IGP), except in the case of multi-hop BGP
.
Next Hop in EBGP and IBGP
BGP Next Hop Attribute
Every time a route announcement crosses an AS boundary, the Next Hop attribute is changed to the IP address of the border router that announced the route.
[Figure: 135.207.0.0/16 is announced from AS 6431 (AT&T Research) via AS 7018 (AT&T) to AS 12654 (RIPE NCC RIS project). The announcements are labelled Next Hop = 12.125.133.90 and Next Hop = 12.127.0.121, the addresses of the announcing border routers.]
Slide courtesy Timothy Griffin.
Interaction between BGP and IGP
Join EGP with IGP for connectivity
[Figure: AS 1 and AS 2 are connected over 192.0.2.0/30. AS 2 announces 135.207.0.0/16 with Next Hop = 192.0.2.1; inside AS 1 the link subnet is reached via 10.10.10.10.]

Learned via EGP:
destination       next hop
135.207.0.0/16    192.0.2.1

Learned via IGP:
destination       next hop
192.0.2.0/30      10.10.10.10

Resulting forwarding table:
destination       next hop
135.207.0.0/16    10.10.10.10
192.0.2.0/30      10.10.10.10
Slide courtesy Timothy Griffin
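Joining the EGP and IGP tables amounts to a recursive lookup: the BGP next hop is itself resolved through the IGP table. The sketch below reuses the addresses from this slide; the function names `lookup` and `build_forwarding_table` are hypothetical, and the tables are plain dictionaries rather than a real router data structure.

```python
import ipaddress


def lookup(table, address):
    """Longest-prefix match of an address in a {prefix: next_hop} table."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return None if best is None else best[1]


def build_forwarding_table(bgp_routes, igp_routes):
    """Resolve each BGP next hop through the IGP and merge both tables."""
    fib = dict(igp_routes)
    for prefix, bgp_next_hop in bgp_routes.items():
        igp_next_hop = lookup(igp_routes, bgp_next_hop)
        if igp_next_hop is not None:  # an unreachable next hop makes the route unusable
            fib[prefix] = igp_next_hop
    return fib


bgp_routes = {"135.207.0.0/16": "192.0.2.1"}
igp_routes = {"192.0.2.0/30": "10.10.10.10"}

for destination, next_hop in build_forwarding_table(bgp_routes, igp_routes).items():
    print(destination, "->", next_hop)
```

This is also why the NEXT HOP attribute must be reachable: a route whose next hop cannot be resolved never makes it into the forwarding table.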
.
.
BGP attribute types
▶ Well-known mandatory
  ▶ ORIGIN, AS PATH, NEXT HOP
▶ Well-known discretionary
  ▶ LOCAL PREF, ATOMIC AGGREGATE
▶ Optional transitive
  ▶ COMMUNITIES, AGGREGATOR
▶ Optional non-transitive
  ▶ MULTI EXIT DISC
.
LOCAL PREF (Local Preference)
▶ Advertised within a single AS (via IBGP)
▶ Used to implement local policies
▶ Can depend on any locally available information
  ▶ This might be learned outside of BGP
▶ Default value is 100
▶ Highest value wins
.
AS PATH
▶ Sequence of ASs
  ▶ An AS can also be generalized to a set of ASs
▶ Used for loop detection
▶ The sequence length defines the metric (distance)
▶ Shortest path wins
▶ Prepend your own AS in EBGP updates
  ▶ Possibly multiple times, enabling traffic engineering
▶ Leave unchanged in IBGP updates
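The AS PATH handling described above can be sketched in a few lines. The function names are hypothetical, AS paths are modelled as plain lists of AS numbers (AS sets are ignored), and the AS numbers are only examples.

```python
def accept(as_path, local_as):
    """Loop detection: reject any route whose AS path already contains our AS."""
    return local_as not in as_path


def export_ebgp(as_path, local_as, prepend=1):
    """EBGP export: prepend our own AS, possibly several times."""
    if prepend < 1:
        raise ValueError("the local AS must be prepended at least once")
    return [local_as] * prepend + list(as_path)


def export_ibgp(as_path):
    """IBGP export: the AS path is left unchanged."""
    return list(as_path)


path = [6341]                      # as announced by the originating AS
path = export_ebgp(path, 7018)     # [7018, 6341]
print(path, len(path))
print(export_ebgp([], 2, prepend=3))        # [2, 2, 2]: a longer, less attractive path
print(accept([1, 333, 7018, 877], 7018))    # False: AS 7018 is already in the path
```

Prepending works as traffic engineering only because the path length is the distance metric; it does nothing when the receiver decides on local preference first.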
.
AS PATH example
ASPATH Attribute
[Figure: AS 6341 (AT&T Research) originates the prefix 135.207.0.0/16. Other nodes: AS 7018 (AT&T), AS 1239 (Sprint), AS 1755 (Ebone), AS 3549 (Global Crossing), AS 1129 (Global Access) and AS 12654 (RIPE NCC RIS project). The announcements of 135.207.0.0/16 carry these AS paths as they propagate:]
▶ AS Path = 6341
▶ AS Path = 7018 6341
▶ AS Path = 3549 7018 6341
▶ AS Path = 1239 7018 6341
▶ AS Path = 1755 1239 7018 6341
▶ AS Path = 1129 1755 1239 7018 6341
Slide courtesy Timothy Griffin
.
.
AS PATH length can be deceptive
Shorter Doesn’t Always Mean Shorter
[Figure: AS 1, AS 2, AS 3 and AS 4.]
Mr. BGP says that path 4 1 is better than path 3 2 1. Duh!
In fairness: could you do this “right” and still scale? Exporting internal state would dramatically increase global instability and the amount of routing state.
Slide courtesy Timothy Griffin.
AS PATH for loop prevention
Interdomain Loop Prevention
BGP at AS YYY will never accept a route with ASPATH containing YYY.
[Figure: AS 7018 is offered 12.22.0.0/16 with ASPATH = 1 333 7018 877 by AS 1. Don’t accept!]
Slide courtesy Timothy Griffin
.
Traffic often follows AS PATH
[Figure: AS 4, AS 3, AS 2 and AS 1 in a chain; AS 1 originates 135.207.0.0/16, which reaches AS 4 with ASPATH = 3 2 1. An IP packet with destination 135.207.44.66 travels from AS 4 towards AS 1.]
Slide courtesy Timothy Griffin.
Sometimes traffic does not follow AS PATH
… But It Might Not
[Figure: the same chain AS 4, AS 3, AS 2, AS 1. AS 1 announces 135.207.0.0/16 (ASPATH = 1), which reaches AS 4 with ASPATH = 3 2 1. AS 5 announces the more specific 135.207.44.0/25 (ASPATH = 5). AS 2 filters all subnets with masks longer than /24. The IP packet has destination 135.207.44.66.]
From AS 4, it may look like this packet will take path 3 2 1, but it actually takes path 3 2 5.
Slide courtesy Timothy Griffin
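The reason the packet ends up in AS 5 is longest-prefix matching inside AS 2: forwarding follows the most specific matching prefix, regardless of what was advertised upstream. A minimal sketch with the prefixes from this slide; the table contents and the function name `next_as` are illustrative assumptions.

```python
import ipaddress

# Hypothetical view of AS 2's forwarding table: it knows both the /16 from
# AS 1 and the more specific /25 from AS 5, but only passes the /16 on.
as2_table = {
    ipaddress.ip_network("135.207.0.0/16"): "AS 1",
    ipaddress.ip_network("135.207.44.0/25"): "AS 5",
}


def next_as(table, destination):
    """Return the neighbor chosen by longest-prefix match, or None."""
    addr = ipaddress.ip_address(destination)
    candidates = [net for net in table if addr in net]
    if not candidates:
        return None
    return table[max(candidates, key=lambda net: net.prefixlen)]


print(next_as(as2_table, "135.207.44.66"))   # AS 5: the /25 wins
print(next_as(as2_table, "135.207.99.1"))    # AS 1: only the /16 matches
```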
.
.
ORIGIN
▶ The ORIGIN attribute tells where the route (NLRI) originated
  ▶ Interior to the originating AS: ORIGIN = 0
  ▶ Via the EGP protocol (historic): ORIGIN = 1
  ▶ Via some other means: ORIGIN = 2
▶ A lower ORIGIN wins
.
MULTI EXIT DISC (Multi-Exit Discriminator or MED)
▶ The MED (or metric, formerly INTER AS METRIC) is meant to be advertised between neighboring ASs (via EBGP)
  ▶ Some implementations also carry the MED on via IBGP
  ▶ Hot potato versus cold potato
▶ The MED is non-transitive (is not transferred into a third AS)
▶ A lower MED wins
▶ The default MED is 0 (lowest possible value)
  ▶ Some implementations choose the highest possible value
.
Best route selection
Definition (Route selection preference)
1. (Weight; Cisco specific)
2. Highest Local Preference
3. Shortest AS Path
4. (Lowest Origin; hardly used; historic)
5. Lowest MED
6. Prefer EBGP over IBGP
7. Lowest IGP cost to BGP egress
8. Lowest Router ID
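One way to read the preference order above is as a lexicographic sort key. The sketch below is a simplification under stated assumptions: the `Route` fields and function names are hypothetical, a higher weight is assumed to win, and the MED is compared across all routes, whereas real implementations typically compare it only between routes from the same neighboring AS.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Route:
    """Illustrative subset of the attributes used in route selection."""
    as_path: Tuple[int, ...]
    local_pref: int = 100
    origin: int = 0        # 0, 1 or 2; lower wins
    med: int = 0
    ebgp: bool = True
    igp_cost: int = 0
    router_id: int = 0
    weight: int = 0        # vendor specific; assumed here: higher wins


def preference_key(route: Route):
    """Smaller key means more preferred; steps are compared in order."""
    return (
        -route.weight,            # 1. highest weight
        -route.local_pref,        # 2. highest local preference
        len(route.as_path),       # 3. shortest AS path
        route.origin,             # 4. lowest origin
        route.med,                # 5. lowest MED
        0 if route.ebgp else 1,   # 6. EBGP over IBGP
        route.igp_cost,           # 7. lowest IGP cost to egress
        route.router_id,          # 8. lowest router ID
    )


def best_route(routes):
    return min(routes, key=preference_key)


padded_customer = Route(as_path=(2,) * 14, local_pref=100)
short_provider = Route(as_path=(1, 2), local_pref=80)
print(best_route([padded_customer, short_provider]) == padded_customer)  # True
```

The example shows why a long AS path does not help against a higher local preference: step 2 is decided before step 3 is ever looked at.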
.
BGP message header
Field layout (bit positions within 32-bit rows):
Marker: bits 0-31, four rows (128 bits)
Length: bits 0-15
Type: bits 16-23

We use the term message and not packet, because BGP “packets” are in fact part of one single TCP stream.
.
.
BGP header fields
Marker: 128 bits of 1 (compatibility)
Length: Total length (min 19, max 4096), including header, no padding (1)
Type: 1: OPEN, 2: UPDATE, 3: NOTIFICATION, 4: KEEPALIVE, 5: Route-REFRESH

(1) No superfluous bytes are allowed inside the TCP stream.
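The header fields above map directly onto a fixed 19-byte structure. The following sketch parses one header with Python's `struct` module; the function name and the error messages are hypothetical, and only the checks listed on this slide are performed.

```python
import struct

# Marker (16 bytes), Length (16 bits), Type (8 bits), network byte order.
HEADER = struct.Struct("!16sHB")
MESSAGE_TYPES = {
    1: "OPEN",
    2: "UPDATE",
    3: "NOTIFICATION",
    4: "KEEPALIVE",
    5: "Route-REFRESH",
}


def parse_header(data: bytes):
    """Return (total length, type name) for the message starting at data[0]."""
    if len(data) < HEADER.size:
        raise ValueError("need at least 19 bytes for a BGP header")
    marker, length, msg_type = HEADER.unpack_from(data)
    if marker != b"\xff" * 16:
        raise ValueError("marker must be 128 one-bits")
    if not 19 <= length <= 4096:
        raise ValueError("length out of range")
    if msg_type not in MESSAGE_TYPES:
        raise ValueError("unknown message type")
    return length, MESSAGE_TYPES[msg_type]


# A KEEPALIVE is a header with nothing after it.
keepalive = b"\xff" * 16 + struct.pack("!HB", 19, 4)
print(parse_header(keepalive))
```

Because the Length field counts the header itself and no padding is allowed, a receiver can split the TCP stream into messages by reading 19 bytes, then `length - 19` more.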
BGP OPEN message
Field layout:
Version (8 bits)
My Autonomous System (16 bits)
Hold Time (16 bits)
BGP Identifier (32 bits)
Opt Parm Len (8 bits)
Optional Parameters (variable)
.
OPEN message fields
Version: 4
My Autonomous System: Sender’s AS
Hold Time: Liveness detection
BGP Identifier: Sender’s identifying IP address
Opt Parm Length: Length of parameter field
Optional Parameters: TLV-encoded options

One interesting parameter is the Capabilities Optional Parameter, which defines (among others) the Route Refresh Capability.
.
BGP KEEPALIVE message
This page intentionally left blank.
http://www.this-page-intentionally-left-blank.org/
(A KEEPALIVE message consists of the 19-byte header only.)
.
.
KEEPALIVE message fields
:)
.
BGP NOTIFICATION message
Field layout:
Error code (8 bits)
Error subcode (8 bits)
Data (variable)
.
NOTIFICATION message fields
Error code: 1: Message Header Error, 2: OPEN Error, 3: UPDATE Error, 4: Hold Timer Expired, . . .
Error subcode: Depends on error code
Data: Depends on error code and subcode
.
BGP Route-REFRESH message
Field layout (bit positions within one 32-bit row):
AFI: bits 0-15
Reserved: bits 16-23
SAFI: bits 24-31
.
.
Route-REFRESH message fields
AFI: Address Family Identifier
Reserved: 0
SAFI: Subsequent Address Family Identifier
.
BGP UPDATE message
Field layout:
Unfeasible Routes Length (16 bits)
Withdrawn Routes (variable length)
Total Path Attribute Length (16 bits)
Path Attributes (variable length)
Network Layer Reachability Information (variable length)
.
UPDATE message fields
Unfeasible Routes Length: Length of Withdrawn Routes
Withdrawn Routes: List of prefixes (2)
Total Path Attribute Length: Length of Path Attributes
Path Attributes: TLV-encoded attributes
Network Layer Reachability Information: List of NLRI prefixes

(2) A prefix is specified by its length and just enough bytes of the network IP address to cover this length.
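The prefix encoding from the footnote can be made concrete: one byte holds the prefix length in bits, followed by the smallest number of address bytes that covers that length. The sketch below assumes IPv4 only, and the function names are hypothetical.

```python
import ipaddress


def encode_prefix(prefix: str) -> bytes:
    """Encode an IPv4 prefix as <length in bits> + <just enough address bytes>."""
    net = ipaddress.IPv4Network(prefix)
    nbytes = (net.prefixlen + 7) // 8
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]


def decode_prefix(data: bytes):
    """Decode one prefix; return (prefix string, number of bytes consumed)."""
    length = data[0]
    nbytes = (length + 7) // 8
    padded = data[1:1 + nbytes] + bytes(4 - nbytes)   # pad back to 4 bytes
    return f"{ipaddress.IPv4Address(padded)}/{length}", 1 + nbytes


for p in ("135.207.0.0/16", "192.0.2.0/24", "135.207.44.0/25"):
    wire = encode_prefix(p)
    print(p, wire.hex(), decode_prefix(wire))
```

A /16 therefore costs three bytes on the wire and a /25 costs five, and the consumed-bytes count lets a parser walk a list of prefixes back to back.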
Tweaking your policies
Tweak Tweak Tweak
• For inbound traffic
  – Filter outbound routes
  – Tweak attributes on outbound routes in the hope of influencing your neighbor’s best route selection
• For outbound traffic
  – Filter inbound routes
  – Tweak attributes on inbound routes to influence best route selection
[Figure labels: outbound routes, inbound routes, inbound traffic, outbound traffic.]
In general, an AS has more control over outbound traffic.
Slide courtesy Timothy Griffin
.
.
Outbound Traffic Engineering
▶ This works by manipulating incoming routes
  ▶ Changing local preference
  ▶ Extending inbound AS paths
  ▶ Manipulating the metric (MED), for instance by using inbound communities
▶ It is relatively simple
  ▶ Based on your own policy
  ▶ You are in control yourself
.
Choice between provider, peer or customer
So Many Choices
Which route should Frank pick to 13.13.0.0/16?
[Figure: Frank’s Internet Barn with neighbors AS 1, AS 2, AS 3 and AS 4 over links labelled peer, peer, customer and provider, and the destination 13.13.0.0/16.]
Slide courtesy Timothy Griffin
.
Manipulating local preference
Prefer customer over peer over provider
[Figure: the routes to 13.13.0.0/16 via AS 1 to AS 4 are assigned local pref = 100, local pref = 90 and local pref = 80.]
Higher local preference values are more preferred.
Local preference is used ONLY in iBGP.
Slide courtesy Timothy Griffin
.
Primary and backup links
Implementing Backup Links with Local Preference (Outbound Traffic)
Forces outbound traffic to take primary link, unless link is down.
[Figure: AS 65000 is connected to AS 1 over a primary link and a backup link.]
▶ Primary link: set Local Pref = 100 for all routes from AS 1
▶ Backup link: set Local Pref = 50 for all routes from AS 1
We’ll talk about inbound traffic soon …
Slide courtesy Timothy Griffin
.
.
Multihomed primary and backup links
Multihomed Backups (Outbound Traffic)
Forces outbound traffic to take primary link, unless link is down.
[Figure: AS 2 has two providers, AS 1 over the primary link and AS 3 over the backup link.]
▶ Primary link: set Local Pref = 100 for all routes from AS 1
▶ Backup link: set Local Pref = 50 for all routes from AS 3
Slide courtesy Timothy Griffin.
Inbound Traffic Engineering
▶ This works by manipulating outgoing routes
  ▶ Extending outbound AS PATHs is a traditional hack
  ▶ Manipulating the metric (MED) is the official way
  ▶ Setting outbound communities is a more modern approach
    ▶ Agreements with your neighbors are necessary (common policy)
▶ Inbound is more complex than outbound
  ▶ Inbound depends (also) on neighbor’s policy
  ▶ You are not in control by yourself
▶ Announcing more specific routes
  ▶ Method of last resort, but often a bad idea
.
Traffic engineering a longer AS PATH
Shedding Inbound Traffic with ASPATH Padding. Yes, this is a Glorious Hack …
Padding will (usually) force inbound traffic from AS 1 to take primary link.
[Figure: customer AS 2 announces 192.0.2.0/24 to provider AS 1 with ASPATH = 2 on the primary link and ASPATH = 2 2 2 on the backup link.]
Slide courtesy Timothy Griffin.
Your provider might overrule your effort
… But Padding Does Not Always Work
[Figure: customer AS 2 announces 192.0.2.0/24 with ASPATH = 2 on the primary link to provider AS 1, and with ASPATH = 2 2 2 2 2 2 2 2 2 2 2 2 2 2 on the backup link to provider AS 3.]
AS 3 will send traffic on the “backup” link because it prefers customer routes, and local preference is considered before ASPATH length!
Padding in this way is often used as a form of load balancing.
Slide courtesy Timothy Griffin
.
.
But you can make an agreement by using a community
COMMUNITY Attribute to the Rescue!
[Figure: customer AS 2 announces 192.0.2.0/24 with ASPATH = 2 on the primary link to provider AS 1, and with ASPATH = 2, COMMUNITY = 3:70 on the backup link to provider AS 3.]
Customer import policy at AS 3:
▶ If 3:90 in COMMUNITY then set local preference to 90
▶ If 3:80 in COMMUNITY then set local preference to 80
▶ If 3:70 in COMMUNITY then set local preference to 70
AS 3: normal customer local pref is 100, peer local pref is 90.
Slide courtesy Timothy Griffin.
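The customer import policy at AS 3 can be sketched as a lookup from community to local preference. The names below are hypothetical, communities are modelled as (ASN, value) pairs, and the policy entries are checked in the order in which the slide lists them.

```python
DEFAULT_CUSTOMER_LOCAL_PREF = 100
PEER_LOCAL_PREF = 90

# Policy of AS 3 for routes received from customers, in order of evaluation.
CUSTOMER_IMPORT_POLICY = {
    (3, 90): 90,
    (3, 80): 80,
    (3, 70): 70,
}


def customer_local_pref(communities):
    """Return the local preference AS 3 assigns to a customer route."""
    for community, local_pref in CUSTOMER_IMPORT_POLICY.items():
        if community in communities:
            return local_pref
    return DEFAULT_CUSTOMER_LOCAL_PREF


backup_announcement = {(3, 70)}            # 192.0.2.0/24 tagged with 3:70
print(customer_local_pref(backup_announcement))   # 70, below the peer value of 90
print(customer_local_pref(set()))                 # 100, the normal customer value
```

With the tag 3:70 the customer route on the backup link drops below AS 3's peer local preference, so AS 3 prefers any route to the prefix that it learns from a peer, which is what AS PATH padding could not achieve.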
Hot potato routing
Hot Potato Routing: Go for the Closest Egress Point
[Figure: a router has two BGP routes to 192.44.78.0/24, via egress 1 at IGP distance 15 and via egress 2 at IGP distance 56.]
Hot potato: get traffic off of your network as soon as possible. Go for egress 1!
Slide courtesy Timothy Griffin
.
Burnt by the hot potato
Getting Burned by the Hot Potato
[Figure: a high-bandwidth provider backbone and a low-bandwidth customer backbone interconnect at SFF and NYC; a heavy-content web farm sits in San Diego. Distance labels: 15, 56, 17 and 2865. A tiny HTTP request travels one way, a huge HTTP reply the other.]
Many customers want their provider to carry the bits!
Slide courtesy Timothy Griffin.
Cold potato routing by honoring MEDs
Cold Potato Routing with MEDs (Multi-Exit Discriminator Attribute)
[Figure: same topology as before, with distance labels 15, 56, 17 and 2865. The network of the heavy-content web farm announces 192.44.78.0/24 with MED = 15 at one interconnect and with MED = 56 at the other.]
Prefer lower MED values.
This means that MEDs must be considered BEFORE IGP distance!
Note 1: some providers will not listen to MEDs.
Note 2: MEDs need not be tied to IGP distance.
Slide courtesy Timothy Griffin
.
.
Communities
▶ An optional transitive attribute
▶ A community can be used to communicate preferred treatment of a route
▶ Communities can be used for both inbound and outbound traffic engineering
▶ Some communities have well-known semantics
  ▶ NO EXPORT: don’t export beyond current AS (or confederation)
  ▶ NO ADVERTISE: don’t export at all
.
Use of communities
▶ Inbound from your upstream
  ▶ Learn where your upstream imported this route
  ▶ You can base policy decisions on that
▶ Outbound to your upstream
  ▶ Request specific upstream treatment
    ▶ Setting of local preference
    ▶ Announcements or not to specific ASs
    ▶ AS PATH prepending for certain peerings
  ▶ Your upstream promises to implement the requested policy
.
Structure and semantics of communities
How Can Routes be Colored? BGP Communities!
A community value is 32 bits.
By convention, the first 16 bits are the ASN indicating who is giving it an interpretation; the remaining 16 bits are the community number.
Very powerful BECAUSE it has no (predefined) meaning.
Community Attribute = a list of community values. (So one route can belong to multiple communities.)
RFC 1997 (August 1996)
Used for signalling within and between ASes.
Two reserved communities:
no_export = 0xFFFFFF01: don’t export out of AS
no_advertise = 0xFFFFFF02: don’t pass to BGP neighbors
Slide courtesy Timothy Griffin.
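The 32-bit layout of a community value (16-bit ASN followed by a 16-bit community number) can be sketched with a few bit operations. The function names are hypothetical; the two constants are the reserved values given on this slide.

```python
NO_EXPORT = 0xFFFFFF01
NO_ADVERTISE = 0xFFFFFF02


def encode_community(asn: int, number: int) -> int:
    """Pack ASN and community number into one 32-bit value."""
    if not (0 <= asn <= 0xFFFF and 0 <= number <= 0xFFFF):
        raise ValueError("both halves must fit in 16 bits")
    return (asn << 16) | number


def decode_community(value: int):
    """Split a 32-bit community value into (ASN, community number)."""
    return value >> 16, value & 0xFFFF


def format_community(value: int) -> str:
    """Render a community in the usual ASN:number notation."""
    asn, number = decode_community(value)
    return f"{asn}:{number}"


value = encode_community(3, 70)
print(hex(value), format_community(value))    # 0x30046 3:70
print(format_community(NO_EXPORT))            # 65535:65281
```

The notation 3:70 used in policies is thus only a readable form of the single 32-bit value carried in the attribute.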
Route Reflectors
▶ Specified in RFC 4456
▶ A route reflector is a kind of “super” IBGP peer
▶ A route reflector has clients with which it peers via IBGP and for which it reflects (transitively) routes
▶ A route reflector is part of a full mesh of other route reflectors and non-clients
.
.
Route reflectors illustration
[Figure: Full Mesh.]
Slide courtesy Iljitsch van Beijnum.
Route reflectors illustration
[Figure: Route Reflection.]
Slide courtesy Iljitsch van Beijnum
.
Confederations
▶ Specified in RFC 5065
▶ Use multiple private ASs inside your main AS
▶ Talk to the outside world with your main AS
  ▶ This hides the private ASs
▶ Talk to the inside world as if using EBGP and IBGP
  ▶ Using the different private ASs
  ▶ This needs special AS PATH segment types
.
Confederations illustration
[Figure: Confederations.]
Slide courtesy Iljitsch van Beijnum