Upload
leah-morrison
View
16
Download
0
Tags:
Embed Size (px)
DESCRIPTION
computing the relationships between autonomous systems. giuseppe di battista maurizio patrignani maurizio pizzonia. univ. of rome III http://www.dia.uniroma3.it/ ~compunet/. router 1. router 2. announcements and traffic flows. bgp allows a router to offer connectivity to another router - PowerPoint PPT Presentation
Citation preview
computing the relationships between autonomous systems
giuseppe di battistamaurizio patrignanimaurizio pizzonia
univ. of rome IIIhttp://www.dia.uniroma3.it/~compunet/
announcements and traffic flows• bgp allows a router to offer connectivity to
another router
• “offering connectivity” means “promising the delivery to a specific destination”
195.10.14.0/24
bgp announcement
ip traffic (to be delivered to 195.10.14.0/24)
AS100AS100 AS200AS200
using AS-path to avoid cycles
193.204.161.0/24
193.204.161.0/24
40 10
193.204.161.0/24
10
prefix
as-path
legenda:
AS10AS10
AS40AS40 AS31AS31
AS60AS60
AS212AS212
193.204.161.0/24
10193.204.161.0/24
31 40 10
193.204.161.0/24
60 31 40 10
193.204.161.0/24
212 60 31 40 10
using prepending to implement preferences
193.204.161.0/24
193.204.161.0/24
40 10
AS10AS10
AS40AS40 AS31AS31
AS60AS60
AS212AS212
193.204.161.0/24
10193.204.161.0/24
31 40 10
193.204.161.0/24
212 10 10 10
193.204.161.0/24
10 10 10
how to ensure cooperation?
193.204.161.0/24
193.204.161.0/24
40 10
AS10AS10
AS40AS40 AS31AS31
AS60AS60
AS212AS212
193.204.161.0/24
10193.204.161.0/24
31 40 10
193.204.161.0/24
212 10
193.204.161.0/24
10
why AS212 should propagate AS10 announcement?
a basic question
AS1AS1 provider: grants connectivity
customer: pays for connectivity
an Autonomous System may be customer of another Autonomous System (its “provider”)
how can these kinds of relationships be inferred by observing routing information?
AS2AS2
topology discovery does not provide inter-AS commercial relationships
peer-peer
sibling-sibling
peer-peer
AS1AS1 AS2AS2
AS3AS3 AS4AS4 AS5AS5
AS6AS6 AS7AS7 AS8AS8 AS9AS9
not available route
providercustomer
problem history• the problem is introduced by Lixin Gao
(“On Inferring Autonomous System Relationships in the Internet”, IEEE Trans. Networking, 2001)– relationships are classified into three categories:
customer-provider, peer-peer, and sibling-sibling
– BGP routing tables are used as input– heuristics are verified with information coming
from other sources
a simpler problem formulationdue to Subramanian, Agarwal, Rexford, and Katz (“Characterizing the Internet Hierarchy from Multiple Vantage Points” INFOCOM 2002) [SARK 2002]
peer-peer relationshipcustomer-provider relationship
from a provider from a provider
from a provider
from a provider
from a customer from a customer
from a customer
from a customer
from a peer
from a peer
from a peerfrom a peer
valid and invalid AS-paths
type-1 type-2
if all ASes respect the advertisement rules, AS-paths would be all valid
poss
ibly
mis
sing
poss
ibly
mis
sing
possibly missing
possibly missing
type-2 valid AS-path: a (possibly missing) uphill path followed by a peer-peer edge, followed by a (possibly missing) downhill path
type-1 valid AS-path: a (possibly missing) uphill path followed by a (possibly missing) downhill path
the ToR (Type of Relationships) problem
in [SARK 2002] the ToR problem is conjectured to be NP-hard
given an undirected graph G and a set of paths P, give an orientation to some of the edges of G to minimize the number of invalid paths in P
ToR problem [SARK 2002]
invalid paths: involving more than one peering or not “valley free”
alley
a real-life ToR problem instance
networknetwork next hopnext hop pathpath
200.1.225.0200.1.225.0 167.142.3.6167.142.3.6 5056 701 6461 4926 4270 4387 i5056 701 6461 4926 4270 4387 i
200.10.112.0/2200.10.112.0/233
167.142.3.6167.142.3.6 5056 701 4926 4926 4926 6461 2914 174 174 174 174 5056 701 4926 4926 4926 6461 2914 174 174 174 174 14318 i14318 i
204.71.2.0204.71.2.0 203.181.248.23203.181.248.2333
7660 1 5056 701 11334 i 7660 1 5056 701 11334 i
231.172.64.0/1231.172.64.0/199
167.142.3.6167.142.3.6 5056 1239 1 1755 1755 1755 1755 3216 13099 i 5056 1239 1 1755 1755 1755 1755 3216 13099 i
200.33.121.0200.33.121.0 167.142.3.6167.142.3.6 5056 1 1239 8151 i5056 1 1239 8151 i
204.71.2.0204.71.2.0 144.228.241.81144.228.241.81 1239 5056 701 11334 i 1239 5056 701 11334 i
six rows extracted from the BGP routing table of Oregon RouteViews (Apr 18, 2001)
5056 701 6461 4926 4270 43875056 701 6461 4926 4270 4387
5056 701 4926 4926 4926 6461 2914 174 174 174 174 143185056 701 4926 4926 4926 6461 2914 174 174 174 174 14318
7660 1 5056 701 113347660 1 5056 701 11334
5056 1239 1 1755 1755 1755 1755 3216 130995056 1239 1 1755 1755 1755 1755 3216 13099
5056 1 1239 81515056 1 1239 8151
1239 5056 701 113341239 5056 701 11334
extract AS paths and eliminate prepending
a real-life ToR problem instance
network network
next hopnext hop pathpath
200.1.225.0200.1.225.0 167.142.3.6167.142.3.6 5056 701 6461 4926 4270 4387 i5056 701 6461 4926 4270 4387 i
200.10.112.0/2200.10.112.0/233
167.142.3.6167.142.3.6 5056 701 4926 4926 4926 6461 2914 174 174 174 174 5056 701 4926 4926 4926 6461 2914 174 174 174 174 14318 i14318 i
204.71.2.0204.71.2.0 203.181.248.23203.181.248.2333
7660 1 5056 701 11334 i 7660 1 5056 701 11334 i
231.172.64.0/1231.172.64.0/199
167.142.3.6167.142.3.6 5056 1239 1 1755 1755 1755 1755 3216 13099 i 5056 1239 1 1755 1755 1755 1755 3216 13099 i
200.33.121.0200.33.121.0 167.142.3.6167.142.3.6 5056 1 1239 8151 i5056 1 1239 8151 i
204.71.2.0204.71.2.0 144.228.241.81144.228.241.81 1239 5056 701 11334 i 1239 5056 701 11334 i
six rows extracted from the BGP routing table of Oregon RouteViews (Apr 18, 2001)
5056 701 6461 4926 4270 43875056 701 6461 4926 4270 43875056 701 4926 4926 4926 6461 2914 174 174 174 174 143185056 701 4926 4926 4926 6461 2914 174 174 174 174 14318
7660 1 5056 701 113347660 1 5056 701 113345056 1239 1 1755 1755 1755 1755 3216 130995056 1239 1 1755 1755 1755 1755 3216 13099
5056 1 1239 81515056 1 1239 81511239 5056 701 113341239 5056 701 11334
extract AS paths and eliminate prepending
6461 2914 174 174 174 174 14318 6461 2914 174 174 174 174 14318
14318 14318
3216 13099 3216 13099
5056 1 1239 81515056 1 1239 8151
building the corresponding AS graph
AS2914AS2914AS174AS174
AS14318AS14318
AS4387AS4387 AS4270AS4270
AS6461AS6461
AS701AS701 AS5056AS5056
AS4926AS4926
AS11334AS11334
AS1AS1
AS1239AS1239 AS8151AS8151
AS7660AS7660
AS1755AS1755
AS13099AS13099
AS3216AS3216
5056 701 6461 4926 4270 43875056 701 6461 4926 4270 43875056 701 4926 6461 2914 174 143185056 701 4926 6461 2914 174 14318
7660 1 5056 701 113347660 1 5056 701 113345056 1239 1 1755 3216 130995056 1239 1 1755 3216 13099
1239 5056 701 113341239 5056 701 11334
an orientation for the AS graph
AS2914AS2914AS174AS174
AS14318AS14318
AS4387AS4387 AS4270AS4270
AS6461AS6461
AS701AS701 AS5056AS5056
AS4926AS4926
AS11334AS11334
AS1AS1
AS1239AS1239 AS8151AS8151
AS1755AS1755
AS13099AS13099
AS3216AS3216
5056 1 1239 81515056 1 1239 8151
5056 701 6461 4926 4270 43875056 701 6461 4926 4270 43875056 701 4926 6461 2914 174 143185056 701 4926 6461 2914 174 14318
7660 1 5056 701 113347660 1 5056 701 113345056 1239 1 1755 3216 130995056 1239 1 1755 3216 13099
1239 5056 701 113341239 5056 701 11334
AS7660AS7660
alley alley
our contributions• we show that, although the ToR-problem is NP-
hard, a solution without invalid paths (if it exists) can be found in linear time
• we propose heuristics for the general problem based on a novel paradigm and show their effectiveness against publicly available data sets
• the experiments put in evidence that our heuristics performs significantly better than state of the art heuristics
formulation as a decision problem• the ToR minimization problem corresponds
to the following decision problem
given an undirected graph G, a set of paths P, and an integer k, test if it is possible to give an orientation to some of the edges of G so that number of invalid paths in P is at most k
ToR-D problem
what if we give an orientation to all the edges?
simplifying the decision problem for k=0suppose to have a solution to the ToR problem with all valid paths and consider an edge labeled as a peer-peer edge
peer-peer
example of orientable AS graph
AS3582AS3582
AS8709AS8709 AS8938AS8938
AS4197AS4197
AS1740AS1740
AS7018AS7018
AS3967AS3967
AS1031AS103111
AS3561AS3561
AS8843AS8843
AS15493AS15493
AS6893AS6893
an orientation leaving all valid paths
AS3582AS3582
AS8709AS8709 AS8938AS8938
AS4197AS4197
AS1740AS1740
AS7018AS7018
AS3967AS3967
AS1031AS103111
AS3561AS3561
AS8843AS8843
AS15493AS15493
AS6893AS6893
a different representation
AS8709AS8709
AS8938AS8938
AS1031AS103111
AS3561AS3561
AS8843AS8843
AS15493AS15493
AS6893AS6893
AS3582AS3582 AS4197AS4197
AS1740AS1740
AS7018AS7018
AS3967AS3967
example of not orientable AS graph
AS2914AS2914AS174AS174
AS14318AS14318
AS4387AS4387 AS4270AS4270
AS6461AS6461
AS701AS701 AS5056AS5056
AS4926AS4926
AS11334AS11334
AS1AS1
AS1239AS1239 AS8151AS8151
AS1755AS1755
AS13099AS13099
AS3216AS3216
AS7660AS7660
5056 1 1239 81515056 1 1239 8151
5056 701 6461 4926 4270 43875056 701 6461 4926 4270 43875056 701 4926 6461 2914 174 143185056 701 4926 6461 2914 174 14318
7660 1 5056 701 113347660 1 5056 701 113345056 1239 1 1755 3216 130995056 1239 1 1755 3216 13099
1239 5056 701 113341239 5056 701 11334
“chestnuts” and contradictions
chestnut
chestnut
contradiction!
testing if a solution exists with all valid paths
based on this observation the problem can be mapped to 2SAT:
a path p= v1, …, vn is valid if and only if it does not have a vertex vi such that the two edges of p incident on vi are directed away form vi (it does not have a “valley”)
observation
given a set X of Boolean variables and a formula in conjunctive normal form composed by clauses of two literals, where a literal is a variable or a negated variable, find a truth assignment for the Boolean variables in X such that the formula is satisfied
vi
example: (x1 x2) (x2 x3) (x3 x1)
alley
mapping the problem to 2SAT
v3v1 v2 v4 v5
x1,2 x2,3 x3,4 x4,5
• associate each edge with a Boolean variable• provide each edge with an arbitrary (say random) orientation
• a truth assignment for the Boolean variables corresponds to an orientation for all the edges of the AS-graph (and vice versa)
• if the variable is true, the associated edge preserves its original direction
• if the variable is false, the associated edge is reversed
construction of the 2SAT instance
v3v1 v2 v4 v5
x1,2 x2,3 x3,4 x4,5
(x1,2 x2,3) (x2,3 x3,4) (x3,4 x4,5)
v3v1 v2 v4 v5
x1,2 x2,3 x3,4 x4,5
v3v1 v2 v4 v5
x1,2 x2,3 x3,4 x4,5
solving a huge 2SAT
there is a direct path between a literal and its opposite and vice versa iff a solution without invalid paths does not exist (Aspvall, Plass, Tarjan, IPL ‘79)
(x1,2 x2,3) (x2,3 x3,4) (x3,4 x4,5) (x4,5 x2,3)
x1,2
x2,3
x2,3x1,2
x3,4
x3,4
x4,5
x4,5
compute the graph of the literals
given a 2SAT formula
strongly connected componentscompute the strongly connected components of the graph of the literals
strongly connected component = maximal set of vertices such that for each pair u, v of vertices of the set there exists a directed path from u to v and vice versa
testing if a solution exists• a solution for the 2SAT problem does not exists if and only if
the two literals of the same Boolean variable (edge) fall into the same strongly connected component
• the test takes O(n+m+q) time – where n is the number of ASes, m is the number of edges, and q is the
sum of the lengths of the AS-paths
x
x
AS graphs from the various sourceshttp://www.cs.berkeley.edu/~sagarwal/research/BGP-hierarchy/
AS # AS NameApr 18, 2001 Apr 6, 2002
Paths Vertices Edges Paths Vertices Edges
1 Genuity 58,156 10,203 13,001 63,744 12,700 15,946
1740 CERFnet 70,830 10,007 13,416 not available
3549 Globalcrossing 60,409 10,288 13,039 76,572 12,533 16,025
3582 Univ.of Oregon 2,584,230 10,826 22,440 4,600,981 13,055 27,277
3967 Exodus Comm. 254,123 10,387 18,401 399,023 12,616 21,527
4197 Global Online Japan 55,060 10,288 13,004 59,745 12,518 15,628
5388 Energis Squared 58,832 10,411 13,259 117,003 12,659 16,822
7018 AT&T 120,283 9,252 12,117 170,325 11,706 15,429
8220 COLT Internet 46,606 8,376 10,932 154,855 12,660 18,421
8709 Exodus, Europe 114,931 10,333 15,006 126,370 12,555 18,175
1755 Ebone 23,469 10,540 13,898 not available
2516 KDDI 126,414 10,790 17,735 not available
2548 MaeWest 80,549 10,583 15,249 not available
6893 CW 70,265 10,523 14,359 not available
telnet sourcesw
eb sources
solvable without invalid paths?
AS # AS nameApr 18, 2001 Apr 6, 2002
orientable w/o invalid paths orientable w/o invalid paths
1 Genuity yes yes
1740 CERFnet yes not available
3549 Globalcrossing yes yes
3582 Univ.of Oregon no no
3967 Exodus Comm. yes yes
4197 Global Online Japan yes yes
5388 Energis Squared yes yes
7018 AT&T yes yes
8220 COLT Internet yes yes
8709 Exodus, Europe yes yes
1755 Ebone yes not available
2516 KDDI no not available
2548 MaeWest yes not available
6893 CW yes not available
telnet sourcesw
eb sources
the general problem is NP-hard
MAX2SAT
instance
MAX2SAT
assignment
example(x1 x2) (x2 x3) (x2 x4) (x1 x4) (x3 x4)
x1
x2
x3
x4
a hot topic• independently of our work, Erlebach, Hall, and
Schank of the Theory of Communication Networks Group of Zurich discovered analogous results concerning the time complexity of the general problem and the linearity in the case of all valid paths
• however, while they put more emphasis on the approximability of the problem, we focus more on the engineering and the experimentation of an effective heuristic approach
our heuristic approach
size of the problem:3,423,460 AS-paths10,916 vertices23,761 edges
very simple approach:
1. find a very large set of AS-paths admitting an orientation without invalid paths
2. try to reinsert the kept out AS-paths
1) find a very large set of paths admitting an orientation
x
x
yy
1.a) rank the edges with respect to the number of paths using them1.b) construct the graph of the literals1.c) compute the strongly connected components of it1.d) find out all the variables whose two literals fall into the same strongly
connected component1.e) consider the corresponding edges and eliminate the one with the highest
rank (and all the paths using it, too)
2) try to reinsert the left out AS-paths
2.a) set x = 12.b) add x paths2.c) if it is still solvable
commit the path additionx = x*2
elseif x is 1
discard the path else x = 12.d) repeat 2.b and 2.c until no path is left
some figures: out of 3,423,460 paths, we removed 246,835 paths, ending with 3,176,625 paths after the first step; we reinserted 222,764 with the second step, ending with 3,399,389 valid paths
comparison with the SARK paperAS # AS Name SARK Err. % SAT Err. %
1 Genuity 0.65 0.45
1740 CERFnet n.a. 0.36
3549 Globalcrossing n.a. 0.13
3582 Univ.of Oregon n.a. 0.57
3967 Exodus Comm. n.a. 0.42
4197 Global Online Japan n.a. 0.46
5388 Energis Squared n.a. 0.46
7018 AT&T 0.63 0.21
8220 COLT Internet n.a. 0.22
8709 Exodus, Europe n.a. 0.21
1755 Ebone 2.89 1.52
2516 KDDI 8.97 4.95
2548 MaeWest 1.49 0.19
6893 CW 2.92 0.64
SARK Err.%
16.85
6.29
17.28
30.52
29.65
13.67
55.37
23.30
12.62
11.68
6.32
76.36
2.71
23.63
telnet sourcesw
eb sources
recomputed
issues• how to determine peer-peer relationships once the
graph is oriented?• each AS-path is weighted one, irrespectively of the
size of its prefix. To what extent is this correct?• could snapshots of the same BGP table taken at
different dates help in better understanding the relationships between ASes?
• what if we knew in advance the orientation of some of the edges?
questions?
computing a solution without invalid paths• consider the directed acyclic graph of the connected components
of the graph of the literals• compute a topological sorting of the connected components and
assign an integer to each component• call f(x) the index of the component to which the literal x belongs • a true value is assigned to variable x if f(x) > f(x), false
otherwise
1
2
3
4
degrees of freedom in determining the peering relationships
need for more semantic information for determining peer-peer relationships
AS # AS Name edges candidates peer-peer % peer-peer
1 Genuity 13,001 11,395 8,573 65.94
1740 CERFnet 13,416 10,741 7,931 59.12
3549 Globalcrossing 13,039 10,851 8,250 63.28
3582 Univ.of Oregon
3967 Exodus Comm. 18,401 14,562 11,626 63.18
4197 Global Online Japan 13,004 9,794 7,573 58.24
5388 Energis Squared 13,259 8,969 7,383 55.68
7018 AT&T 12,117 9,664 7,672 63.32
8220 COLT Internet 10,932 7,970 6,616 60.52
8709 Exodus, Europe 15,006 10,079 8,477 56.49
all the ten together 23,761 4,148 3,936 16.56
conclusions• we show that the ToR-problem is NP-hard but…• if a solution without invalid paths exists, it can be
found in linear time by mapping the problem to 2SAT
• we propose heuristics for the general problem based on the 2SAT mapping
• we show the effectiveness of the heuristics against publicly available data sets
• the experiments put in evidence that our heuristics performs significantly better than state of the art heuristics
• we show that discovering peer-peer relationships is a hard problem if one wants to maximize them
acknowledgements
• Subramanian, Agarwal, Rexford, and Katz for sharing their data and explanations
• Thomas Erlebach, Alexander Hall, and Thomas Schank for their insight
• Debora Donato, Andrea Vitaletti for useful discussions
• Massimo Rimondini for computing statistics on the graphs
discovering the peer-peer relationships
1v1 v2
v3
v4v5
v6
2
3
4
5
6
given an oriented graph, discovering peer-peer relationships is a hard problem if you want to maximize them (MAX-INDEPENDENT-SET can be mapped to it)
the experimental setting• same setting used by [SARK 2002]
• ten telnet looking glasses are selected as sources of data
• all the AS-paths put together are used to test the algorithms
• four web sources are used to validate the output
a deeper comparisonSAT algorithm
number of nodes 10897
number of edges 23690
peering edges 0
percentage of total 0%
path covering distribution
minimum: 1
average: 80
maximum: 19918
strongly connected components
10312
scc's to edges ratio 0.4352892
size distribution: 10311 scc's of size 1
1 scc of size 586
SARK algorithm
number of nodes 10923
number of edges 23757
peering edges 1136
percentage of total 5%
path covering distribution
minimum: 0
average: 50
maximum: 22327
strongly connected components
10302
scc's to edges ratio 0.4336406
size distribution: 10096 scc's of size 1
176 scc's of size 2
21 scc's of size 3
4 scc's of size 4
1 scc of size 6
1 scc of size 7
1 scc of size 19
1 scc of size 86
1 scc of size 278
“graph of the differences”
graph of the differencesnumber of nodes 489
number of edges 1148
path covering distribution
minimum: 1
average: 37
maximum: 3420
connected components
19
cc's to edges ratio 0.0165505
size distribution: 18 cc's of size 2
1 cc of size 453
we inserted one undirected edge (and the two end vertices if needed) into the graph of the differences for every oppositely directed edge
links• Lixin Gao, “On Inferring Autonomous System Relationships in
the Internet”http://www-unix.ecs.umass.edu/~lgao/
• Subramanian, Agarwal, Rexford, and Katz “Characterizing the Internet Hierarchy from Multiple Vantage Points”http://www.cs.berkeley.edu/~sagarwal/research/BGP-hierarchy/
• Erlebach, Hall, and Schank, “Classifying Customer-Provider Relationships in the Internet” http://www.tik.ee.ethz.ch/~the/
• website dedicated to the algorithms described in this presentation http://www.dia.uniroma3.it/~compunet/relationships/
computing the relationships between autonomous systems
giuseppe di battistamaurizio patrignanimaurizio pizzonia
univ. of rome IIIhttp://www.dia.uniroma3.it/~compunet/
the routing process
lan send to
directlyconnected
router 1everythingelse
lan send to
directlyconnected
router 4
router 2
lan send to
router 5
router 3
router 1 lan send to
router 6
router 2
directlyconnected
routing protocols
• routing protocols are used to automatically update the routing tables
• they fall into two main cathegories:– link-state routing protocols
• approach: send information about your neighbors to everyone
• each router reconstructs the whole network and computes a shortest path tree to all destinations
• examples: is-is, ospf
– distance-vector routing protocols• approach: send all your information to your neighbors
• each router updates its routing table based on the neighbors’ ones
• examples: rip
flat routing a single routing algorithm involving all the organizations
• it was this way before 1984• problems: slow to converge, difficult to deploy a new
routing algorithm, difficult to debug and configure, does not take into account the ownership of the links
lan send to
router 5
router 3
router 1
lan send to
router 9
router 2
router 6
hierarchical routingmore than one routing algorithm involved
• approach:– each organization runs a local routing algorithm by using an
interior gateway protocol (igp)
– external targets are injected into the local routing algorithm
route injection
• the interior routing algorithm spreads into the network the local targets as well as the remote ones injected by border routers
• border routers are aware of remote targets since they talk a higher level routing protocol with other border routers
lan send to
router 2
router 1
router 1
exterior gateway protocol
• approach:
exterior gateway protocol
• approach:– hide the interior part of all organizations
represent the internal targets
• each border router represents its internal targets as if they were local
simplify the graph
• simplify the graph– consider the (external and internal) reachability of the
routers– the graph is actually managed through tcp connections
called peerings
solve the routing problem
• solve the routing problem on the simplified graph– based on political considerations
border gateway protocol
• bgp is an exterior gateway protocol: it keeps the routing tables updated and propagates routing information
• takes into account the willingness of the organizations to cooperate in the routing process (commercial agreements, local preferences, priorities, legal issues, …)