Xuan Zheng (modified by M. Veeraraghavan)


• BGP overview

• BGP operations

• BGP messages

• BGP decision algorithm

• BGP states


BGP overview

• Currently in version 4.

• Inter-AS (or interdomain) routing protocol for exchanging network reachability information among BGP routers.

• Uses TCP on port 179 to send routing messages.

• BGP is a distance vector protocol, but unlike in RIP, routing messages in BGP contain complete routes (the full AS path to each destination), which is why BGP is often called a path vector protocol.

• Network administrators can specify routing policies.


BGP overview (cont.)

BGP routers are also called BGP speakers.


BGP operations

• Two BGP routers exchanging information on a connection are called peers.

– Initially, BGP peers exchange the entire BGP routing table.

– A BGP router retains the current version of the entire BGP routing tables of all of its peers for the duration of the connection.

– Subsequently, only incremental updates are sent as the routing tables change.

– Keepalive messages are sent periodically to ensure that the connection between the BGP peers is alive.

– Notification messages are sent in response to errors or special conditions.


BGP operations (cont.)

• A route is defined as a unit of information that pairs a destination with the attributes of a path to that destination.

• Routes are stored in the Routing Information Bases (RIBs).

• A RIB within a BGP router consists of three distinct parts:

– Adj-RIBs-In: contains unprocessed routing information that has been advertised to the local BGP router by its peers;

– Loc-RIB: contains the routes that have been selected by the local BGP router's Decision Process;

– Adj-RIBs-Out: organizes the routes for advertisement to specific peers by means of the local speaker's UPDATE messages.
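The three RIB parts can be pictured as three tables that a route moves through. The following is a minimal sketch, not how any particular router implements it: the class and method names (`Route`, `Rib`, `learn`, `decide`, `advertise`) are hypothetical, and the "shortest AS path" rule in `decide` is only a stand-in for the real Decision Process.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Route:
    """A destination paired with (a few of) the attributes of a path to it."""
    prefix: str        # destination, e.g. "10.1.2.0/24"
    as_path: tuple     # sequence of AS numbers
    next_hop: str      # next-hop IP address


@dataclass
class Rib:
    adj_ribs_in: dict = field(default_factory=dict)   # peer -> {prefix: Route}
    loc_rib: dict = field(default_factory=dict)       # prefix -> Route
    adj_ribs_out: dict = field(default_factory=dict)  # peer -> {prefix: Route}

    def learn(self, peer, route):
        """Store a route advertised by a peer, unprocessed, in Adj-RIBs-In."""
        self.adj_ribs_in.setdefault(peer, {})[route.prefix] = route

    def decide(self, prefix):
        """Pick one route per prefix into Loc-RIB (placeholder policy)."""
        candidates = [routes[prefix] for routes in self.adj_ribs_in.values()
                      if prefix in routes]
        if not candidates:
            self.loc_rib.pop(prefix, None)
            return None
        best = min(candidates, key=lambda r: len(r.as_path))  # assumption
        self.loc_rib[prefix] = best
        return best

    def advertise(self, prefix, peers):
        """Queue the selected route in Adj-RIBs-Out for the given peers."""
        route = self.loc_rib.get(prefix)
        if route is None:
            return
        for peer in peers:
            self.adj_ribs_out.setdefault(peer, {})[prefix] = route
```

Keeping a separate Adj-RIBs-In table per peer is what lets a speaker re-run its decision when one peer withdraws a route, without asking the other peers to resend theirs.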


eBGP and iBGP

• BGP can also be used within an AS. BGP connections inside an AS are called internal BGP (iBGP), and BGP connections between different ASs are called external BGP (eBGP).

[Figure: routers R2 and R3 in AS2 are connected by iBGP; R1 (AS1) and R4 (AS4) connect to AS2 by eBGP.]

• If an AS has multiple connections to other ASs, multiple BGP speakers are needed.

• All BGP speakers representing the same AS must give a consistent image of the AS to the outside.

• Hence the need for iBGP.

• The purpose of iBGP is to ensure that network reachability information is consistent among multiple BGP routers in the same AS.


• The BGP synchronization rule states that if an AS provides transit service to another AS, BGP should not advertise a route until all of the routers within the AS have learned about the route via an IGP.

BGP messages

• BGP header format

– Marker (16 octets): authenticates incoming BGP messages or detects loss of synchronization between a pair of BGP peers

– Length (2 octets): indicates the total length of the message in octets, including the BGP header

– Type (1 octet): indicates the type of the message

Header layout (bit offsets 0, 16, 24, 31): the Marker comes first, followed by Length (bits 0 to 15) and Type (bits 16 to 23).
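The header fields above map directly onto a fixed 19-octet structure. The sketch below packs and parses it with Python's `struct` module. The function names are hypothetical. The type codes (1 to 4), the all-ones Marker used when no authentication is in effect, and the 4096-octet maximum message size are taken from RFC 1771 rather than from these slides.

```python
import struct

# Marker (16 octets), Length (2 octets), Type (1 octet), network byte order.
HEADER = struct.Struct("!16sHB")

# Message type codes as defined in RFC 1771.
MESSAGE_TYPES = {1: "OPEN", 2: "UPDATE", 3: "NOTIFICATION", 4: "KEEPALIVE"}


def build_header(msg_type, body_len=0):
    """Return a BGP header for a message whose body is body_len octets."""
    marker = b"\xff" * 16  # all ones: no authentication in use
    return HEADER.pack(marker, HEADER.size + body_len, msg_type)


def parse_header(data):
    """Return (marker, length, type_name) from the first 19 octets."""
    marker, length, msg_type = HEADER.unpack_from(data)
    if not HEADER.size <= length <= 4096:
        raise ValueError(f"bad message length {length}")
    return marker, length, MESSAGE_TYPES.get(msg_type, "UNKNOWN")
```

Because Length counts the header itself, a receiver reading from the TCP byte stream can read 19 octets, parse them, and then read exactly `length - 19` more octets to get the rest of the message.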


OPEN message

• Purpose: first message sent by each side after the TCP connection is opened

• Version: the protocol version number of the message

• My autonomous system: the AS number of the sending router

• Hold time: the maximum number of seconds the sender proposes may elapse between the receipt of successive KEEPALIVE or UPDATE messages

• BGP identifier: identifier of the sending BGP router (one of its interface IP addresses)

• Optional parameters: a list of optional parameters

Message layout (rows are 32 bits wide; bit offsets 0, 8, 16, 24, 31):

– Marker (16 octets)

– Length (2 octets), Type = OPEN (1 octet), Version (1 octet)

– My autonomous system (2 octets), Hold time (2 octets)

– BGP identifier (4 octets)

– Optional parameter length (1 octet)

– Optional parameters (variable)
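To make the OPEN layout concrete, here is a minimal sketch that builds and parses an OPEN message with no optional parameters. The function names and the dictionary keys are hypothetical. The field widths, the all-ones Marker, and type code 1 for OPEN follow RFC 1771.

```python
import socket
import struct

HEADER = struct.Struct("!16sHB")       # Marker, Length, Type
OPEN_BODY = struct.Struct("!BHH4sB")   # Version, My AS, Hold time, BGP ID, Opt len
TYPE_OPEN = 1


def build_open(my_as, hold_time, bgp_id, version=4):
    """Build an OPEN message that carries no optional parameters."""
    body = OPEN_BODY.pack(version, my_as, hold_time,
                          socket.inet_aton(bgp_id), 0)
    header = HEADER.pack(b"\xff" * 16, HEADER.size + len(body), TYPE_OPEN)
    return header + body


def parse_open(msg):
    """Decode the fixed part of an OPEN message that follows the header."""
    version, my_as, hold_time, raw_id, opt_len = OPEN_BODY.unpack_from(
        msg, HEADER.size)
    return {
        "version": version,
        "my_as": my_as,
        "hold_time": hold_time,
        "bgp_id": socket.inet_ntoa(raw_id),
        "opt_param_len": opt_len,
    }
```

For example, `build_open(65001, 90, "10.10.1.3")` yields a 29-octet message: 19 octets of header plus 10 octets of fixed OPEN fields. The AS number 65001 and the 90-second hold time are illustrative values only.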


KEEPALIVE message

• If the hold time is zero, then KEEPALIVE messages will not be sent.

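Message layout: the BGP header alone, that is, Marker (16 octets), Length (2 octets), and Type = KEEPALIVE (1 octet), with no further fields.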


NOTIFICATION message

• When a BGP speaker detects an error, it sends a NOTIFICATION and then closes the TCP connection.

• Error code: the type of error condition

• Error subcode: specific information about the nature of the error

• Data: the reason for the notification

• Examples: OPEN message error, UPDATE message error (bad attribute), hold timer expired, etc.

Message layout (bit offsets 0, 8, 16, 24, 31):

– Marker (16 octets)

– Length (2 octets), Type = NOTIFICATION (1 octet), Error code (1 octet)

– Error subcode (1 octet), Data (variable)
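The NOTIFICATION layout can be sketched the same way as the other messages. The function names here are hypothetical. The error-code table and type code 3 come from RFC 1771, not from the slide.

```python
import struct

# Marker, Length, Type, Error code, Error subcode (Data follows).
FIXED = struct.Struct("!16sHBBB")
TYPE_NOTIFICATION = 3

# Error codes as listed in RFC 1771.
ERROR_CODES = {
    1: "Message Header Error",
    2: "OPEN Message Error",
    3: "UPDATE Message Error",
    4: "Hold Timer Expired",
    5: "Finite State Machine Error",
    6: "Cease",
}


def build_notification(code, subcode=0, data=b""):
    """Build a NOTIFICATION message; Length covers header, codes, and Data."""
    length = FIXED.size + len(data)
    return FIXED.pack(b"\xff" * 16, length, TYPE_NOTIFICATION,
                      code, subcode) + data


def parse_notification(msg):
    """Return (error_name, subcode, data) for a NOTIFICATION message."""
    _marker, length, msg_type, code, subcode = FIXED.unpack_from(msg)
    if msg_type != TYPE_NOTIFICATION:
        raise ValueError("not a NOTIFICATION message")
    return ERROR_CODES.get(code, "Unknown"), subcode, msg[FIXED.size:length]
```

There is no separate length field for Data: its size is whatever remains after the 21 fixed octets, which is why the parser slices up to the header's Length value.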


UPDATE message

• Unfeasible routes length: the total length of the Withdrawn routes field in octets.

• Withdrawn routes: a list of IP address prefixes for the routes that need to be withdrawn from BGP routing tables.

• Total path attribute length: the total length of the Path attributes field in octets.

• Path attributes: a variable-length sequence of path attributes.

• NLRI (Network Layer Reachability Information): a list of IP prefixes.

Message layout, in order:

– BGP header

– Unfeasible routes length (2 octets)

– Withdrawn routes (variable): a sequence of entries, each a Length (1 octet) followed by a Prefix (variable)

– Total path attribute length (2 octets)

– Path attributes (variable): a sequence of entries, each an Attribute type, Attribute length, and Attribute value

– Network layer reachability information (variable): a sequence of entries, each a Length (1 octet) followed by a Prefix (variable)
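Both the Withdrawn routes and NLRI fields use the same (Length, Prefix) encoding. The sketch below shows one way to encode and decode it for IPv4. It relies on a detail from RFC 1771 that the slide does not spell out: Length is the prefix length in bits, and the Prefix occupies only as many octets as are needed to hold those bits. The function names are hypothetical.

```python
import ipaddress


def encode_prefix(cidr):
    """Encode an IPv4 prefix as Length (1 octet, in bits) + minimal Prefix."""
    net = ipaddress.ip_network(cidr)
    nbytes = (net.prefixlen + 7) // 8
    return bytes([net.prefixlen]) + net.network_address.packed[:nbytes]


def decode_prefixes(data):
    """Decode a run of (Length, Prefix) entries into CIDR strings."""
    prefixes = []
    i = 0
    while i < len(data):
        bits = data[i]
        nbytes = (bits + 7) // 8
        # Pad the truncated prefix back to four octets.
        raw = data[i + 1:i + 1 + nbytes].ljust(4, b"\x00")
        prefixes.append(f"{ipaddress.IPv4Address(raw)}/{bits}")
        i += 1 + nbytes
    return prefixes
```

With this encoding a /24 costs four octets on the wire (one for the length, three for the prefix), and a /22 also costs four, since 22 bits still need three octets.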


UPDATE message (cont.)

• Each path attribute is encoded as Attribute type, Attribute length, Attribute value. The Attribute type consists of an Attribute flags octet (bits O, T, P, E, followed by four unused bits set to 0) and an Attribute type code octet.

• Attribute flags (1 octet):

– O bit: attribute is optional (O=1), or well-known (required) (O=0).

– T bit: an optional attribute is transitive (T=1), or non-transitive (T=0). Well-known attributes are always transitive.

– P bit: the information in the optional transitive attribute is partial (P=1), or complete (P=0).

– E bit: the attribute length is two octets (E=1), or one octet (E=0).

• Four types of attributes:

– Well-known mandatory: recognized by all BGP speakers

– Well-known discretionary

– Optional transitive

– Optional non-transitive

• A BGP speaker that does not recognize an optional transitive attribute still passes it on with the path, but an unrecognized optional non-transitive attribute should be silently dropped.
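The flag bits decide how the rest of an attribute is read, in particular whether the length is one or two octets. A minimal sketch of decoding a single attribute follows. The function names are hypothetical. The bit positions assume O is the high-order bit of the flags octet, as in RFC 1771.

```python
def decode_attr_flags(flags):
    """Split the Attribute flags octet into its O, T, P, and E bits."""
    return {
        "optional": bool(flags & 0x80),         # O bit
        "transitive": bool(flags & 0x40),       # T bit
        "partial": bool(flags & 0x20),          # P bit
        "extended_length": bool(flags & 0x10),  # E bit
    }


def parse_attribute(data, offset=0):
    """Parse one path attribute starting at offset.

    Returns (type_code, flags, value, next_offset).
    """
    flags = data[offset]
    type_code = data[offset + 1]
    if flags & 0x10:  # E=1: two-octet length
        length = int.from_bytes(data[offset + 2:offset + 4], "big")
        start = offset + 4
    else:             # E=0: one-octet length
        length = data[offset + 2]
        start = offset + 3
    value = data[start:start + length]
    return type_code, decode_attr_flags(flags), value, start + length
```

As a usage example, the four octets `40 01 01 00` decode as a well-known, transitive attribute with type code 1 (ORIGIN) and the one-octet value 0 (IGP).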


Types of attributes

• Attribute type code:

– ORIGIN (type code 1): well-known mandatory

  • defines the origin of the NLRI

    – 0: IGP – indicates that the NLRI is interior to the originating AS

    – 1: EGP – indicates that the NLRI was learned via EGP (the Exterior Gateway Protocol)

    – 2: incomplete – NLRI learned through some other means

– AS_PATH (type code 2): well-known mandatory

  • lists the sequence of ASs that the route has traversed to reach the destination

    – A BGP speaker propagating a route prepends its own AS to the AS_PATH list

    – Used to detect loops

– NEXT_HOP (type code 3): well-known mandatory

  • defines the IP address of the border router that should be used as the next hop to reach the destinations listed in the NLRI

– MULTI_EXIT_DISC (MED) (type code 4): Multi-Exit Discriminator, optional non-transitive; an inter-AS metric (hop count)

  • discriminates among multiple entry/exit points to a neighboring AS and gives a hint to the neighboring AS about the preferred path.

    – it makes no sense to compare a MED value set by one AS with a MED set by another AS, because metrics vary from AS to AS.
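The two AS_PATH rules above (prepend on propagation, use for loop detection) fit in a few lines. This is only a sketch with hypothetical function names: it models AS_PATH as a plain tuple of AS numbers and ignores path segment types.

```python
def accept_route(as_path, local_as):
    """Reject a route whose AS_PATH already contains our own AS (a loop)."""
    return local_as not in as_path


def advertise_to_external_peer(as_path, local_as):
    """Prepend the local AS before passing the route to another AS."""
    return (local_as,) + tuple(as_path)
```

For instance, a speaker in AS 1 that receives a route with AS_PATH (2, 1, 3) can tell the route has already passed through AS 1 and discards it, while a route with AS_PATH (2, 3) is accepted and re-advertised as (1, 2, 3).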


Types of attributes (cont.)

• Attribute type code:

– LOCAL_PREF (type code 5): well-known discretionary

  • informs other BGP routers within the same AS of its degree of preference for an advertised route

    – only part of iBGP; not included in eBGP exchanges

– ATOMIC_AGGREGATE (type code 6): well-known discretionary

  • a BGP speaker, when presented with a set of overlapping routes from one of its peers to reach a given NLRI, informs other BGP routers that it selected a less specific route without selecting a more specific one that is included in it. This ensures that certain aggregates are not deaggregated.

  • a route describing a smaller set of destinations (a longer prefix) is said to be more specific than a route describing a larger set of destinations (a shorter prefix)

– AGGREGATOR (type code 7): optional transitive

  • specifies the last AS number that formed the aggregate route, followed by the IP address of the BGP router that formed the aggregate route.

  • advertises which AS and which BGP speaker within that AS performed the aggregation


Example

• 10.10.3.0/24: CIDR (Classless Interdomain Routing) notation; 24 is the number of network mask bits – so the network prefix here is 10.10.3 and mask is 255.255.255.0.

• 205.100.0.0/22 means the mask is 255.255.252.0, so the prefix covers addresses 205.100.0.0 through 205.100.3.255 (the four /24 networks 205.100.0.0 to 205.100.3.0).
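The mask and range arithmetic for these two CIDR prefixes can be checked with Python's standard `ipaddress` module. The helper name `describe` and its dictionary keys are hypothetical, chosen just for this illustration.

```python
import ipaddress


def describe(cidr):
    """Return the mask, address range, and constituent /24s of a prefix."""
    net = ipaddress.ip_network(cidr)
    slash24s = ([str(s) for s in net.subnets(new_prefix=24)]
                if net.prefixlen <= 24 else [])
    return {
        "mask": str(net.netmask),
        "first": str(net.network_address),
        "last": str(net.broadcast_address),
        "slash24s": slash24s,
    }
```

Calling `describe("205.100.0.0/22")` gives the mask 255.255.252.0, the range 205.100.0.0 to 205.100.3.255, and the four /24 networks 205.100.0.0/24 through 205.100.3.0/24. Calling `describe("10.10.3.0/24")` gives the mask 255.255.255.0.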

[Figure: example topology with routers R1, R2, and R3 in AS1 (iBGP) peering over eBGP with AS2 (R4, R5). Labels shown: networks 10.10.3.0/24, 10.1.2.0/24, and 10.1.1.0/24 (advertised with MED 100 and with MED 200); interface addresses 10.10.1.1, 10.10.1.2, 10.10.1.3, 10.10.4.1, 10.10.4.2.]

• NEXT_HOP: R4 advertises 10.1.2.0/24 to R3 (eBGP) with a next hop of 10.10.1.2 (the IP address of the BGP peer).

• R3 should advertise 10.1.2.0/24 using iBGP with a next hop of 10.10.1.2 (as in eBGP). The reason is that R3 is not an immediate neighbor of R1 or R2; R1 and R2 should update their routing table information for 10.1.2.0/24 with the next hop to reach 10.10.1.2 based on their IGP information.

• Routing table at R2:

– Reach 10.1.2.0/24 through 10.10.1.2

– Reach 10.10.3.0/24 through 10.10.4.1

• Routing table at R3:

– Reach 10.1.2.0/24 through 10.10.1.2

– Reach 10.10.3.0/24 through 10.10.4.2

• MED: R3 will assume AS2 wants it to use R4 to reach 10.1.1.0/24 because its MED is lower (100 versus 200).

– Reach 10.1.1.0/24 via 10.10.1.2


The BGP decision algorithm

• After a BGP router receives updates about different destinations from its peers, the protocol has to decide which path to choose in order to reach a specific destination.

• BGP will choose only a single path to reach a specific destination.

• The decision process is based on different attributes, such as next hop, local preference, the route origin, and so on.

• BGP will always propagate the best path to its neighbors.


How BGP selects a path to a destination

BGP selects one path as the best path to a destination, places it in its routing table, and propagates the path information to its neighbors (Cisco web page).

1. If the path specifies a NextHop that is inaccessible, drop the update.

2. Prefer the path with the largest Weight. (Weight is a Cisco-specific concept, not part of BGP: it is a locally assigned number, and routes with higher weights are preferred.)

3. If the weights are the same, prefer the path with the largest Local Preference.

4. If the Local Preference is the same, prefer the route that was originated by BGP running on this router.

5. If no route was originated, prefer the shorter AS_path.

6. If all paths have the same AS_path length, prefer the lowest origin code (IGP < EGP < INCOMPLETE).

7. If the origin codes are the same, prefer the path with the lowest MED.

8. If all paths have the same MED, prefer the External path over the Internal one.

9. If all paths are still the same, prefer the path through the closest IGP neighbor.

10. Prefer the route with the lowest IP address value as specified by the BGP router ID.
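Because each step only breaks ties left by the previous one, the ten steps above can be approximated as a single lexicographic sort key. The sketch below is a simplification under stated assumptions: the `Path` fields and default values are hypothetical, and MED is compared across all paths, whereas in practice it is only meaningful between paths from the same neighboring AS.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    next_hop_reachable: bool = True
    weight: int = 0               # Cisco-specific, higher is better
    local_pref: int = 100         # higher is better
    locally_originated: bool = False
    as_path_len: int = 1          # shorter is better
    origin: int = 0               # 0 IGP < 1 EGP < 2 INCOMPLETE
    med: int = 0                  # lower is better
    is_external: bool = True      # eBGP-learned preferred over iBGP
    igp_metric: int = 0           # distance to next hop, lower is better
    router_id: str = "0.0.0.0"    # lowest wins the final tie-break


def best_path(paths):
    """Return the preferred path, or None if no next hop is reachable."""
    candidates = [p for p in paths if p.next_hop_reachable]   # step 1
    if not candidates:
        return None

    def key(p):
        return (
            -p.weight,                         # step 2
            -p.local_pref,                     # step 3
            not p.locally_originated,          # step 4
            p.as_path_len,                     # step 5
            p.origin,                          # step 6
            p.med,                             # step 7
            not p.is_external,                 # step 8
            p.igp_metric,                      # step 9
            ipaddress.ip_address(p.router_id), # step 10
        )

    return min(candidates, key=key)
```

Negating Weight and Local Preference turns "largest wins" into "smallest wins", so one `min` call handles every step. The ordering also shows why Local Preference is such a strong policy knob: a path with a higher Local Preference wins even if its AS_path is much longer.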


BGP finite state machine

• Idle state: In this state BGP refuses all incoming BGP connections. No resources are allocated to the peer.

• Connect state: In this state BGP is waiting for the transport protocol (TCP) connection to be completed.

• Active state: In this state BGP is trying to acquire a peer by initiating a transport protocol connection. When done, it sends an OPEN message.

• OpenSent state: In this state BGP waits for an OPEN message from its peer.

• OpenConfirm state: In this state BGP waits for a KEEPALIVE or NOTIFICATION message.

• Established state: In the Established state BGP can exchange UPDATE, NOTIFICATION, and KEEPALIVE messages with its peer.
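The happy path through these six states can be written as a small transition table. This is a deliberately simplified sketch: the event names are hypothetical, timers are omitted, and every event not listed (including receipt of a NOTIFICATION or any error) is treated as a return to Idle, which is coarser than the full state machine in RFC 1771.

```python
# (current state, event) -> next state; event names are illustrative only.
TRANSITIONS = {
    ("Idle", "ManualStart"): "Connect",
    ("Connect", "TcpConnected"): "OpenSent",       # OPEN is sent here
    ("Connect", "TcpFailed"): "Active",
    ("Active", "TcpConnected"): "OpenSent",        # OPEN is sent here
    ("OpenSent", "OpenReceived"): "OpenConfirm",   # KEEPALIVE is sent here
    ("OpenConfirm", "KeepaliveReceived"): "Established",
    ("Established", "KeepaliveReceived"): "Established",
    ("Established", "UpdateReceived"): "Established",
}


def step(state, event):
    """Advance the session; anything unexpected drops back to Idle."""
    return TRANSITIONS.get((state, event), "Idle")


def run(events, state="Idle"):
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = step(state, event)
    return state
```

For example, `run(["ManualStart", "TcpConnected", "OpenReceived", "KeepaliveReceived"])` ends in Established, and a subsequent `step("Established", "NotificationReceived")` returns Idle.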


References

• Section 8.7.3 of Communication Networks by A. Leon Garcia and I. Widjaja

• RFC 1771 (can be obtained from www.ietf.org)

• "Using BGP for inter-domain routing," http://www.cisco.com/univercd/cc/td/doc/cisintwk/ics/icsbgp4.htm

• "BGP case studies," http://www.cisco.com/warp/public/459/bgp-toc.html