Inferring Geography from BGP raw data

Luca Sani

Enrico Gregori, Alessandro Improta, Luciano Lenzini, Lorenzo Rossi

Luca Sani Inferring Geography from BGP raw data 1 / 18

The Internet

The Internet: a huge set of interconnected Autonomous Systems (ASes)

ASes are owned by organizations with different geographic distributions and economic purposes


Motivation

Internet AS-level topology inferred from BGP data:

1 node = 1 AS

1 edge = 1 or more BGP connections between two ASes

This global view hides the Internet heterogeneity


Goals

1 Infer regional AS-level topologies from BGP data

2 Analyze graph and economic properties . . .

. . . at continental granularity:

Africa

Asia Pacific (Asia and Oceania)

Europe

Latin America (the Caribbean, Central America, Mexico and South America)

North America (Bermuda, Canada, Greenland, Saint Pierre and Miquelon, USA)


What is “BGP data”?

We use BGP data provided by the University of Oregon RouteViews and the RIPE RIS projects (October 2011)

These projects have deployed route collectors around the world

Route collectors gather routes from cooperating ASes (feeders)


What is “BGP data”? (cont.)

Each route carries three pieces of information relevant to our work:

Set of AS paths ⇒ Global Topology

39,974 ASes, 139,944 connections
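As a rough illustration of how a set of AS paths turns into an AS-level topology (1 node = 1 AS, 1 edge = 1 or more BGP connections between two ASes), the Python sketch below collects nodes and undirected edges from adjacent ASes in each path. The function name build_topology and the sample paths (documentation-range AS numbers) are made up for illustration, and collapsing prepended ASes is a common practice that the slides do not describe.

```python
def build_topology(as_paths):
    """Return (nodes, edges) of the AS-level graph implied by the AS paths."""
    nodes, edges = set(), set()
    for path in as_paths:
        # Collapse AS-path prepending (the same AS repeated consecutively).
        hops = [asn for i, asn in enumerate(path) if i == 0 or asn != path[i - 1]]
        nodes.update(hops)
        for a, b in zip(hops, hops[1:]):
            # One undirected edge per AS pair, however many BGP sessions exist.
            edges.add(frozenset((a, b)))
    return nodes, edges


# Hypothetical AS paths, not taken from the RouteViews/RIS data set.
paths = [
    [64496, 64497, 64498],
    [64496, 64497, 64497, 64499],  # 64497 is prepended
    [64500, 64498],
]
nodes, edges = build_topology(paths)
print(len(nodes), "ASes,", len(edges), "connections")
```

Storing each edge as a frozenset makes (A, B) and (B, A) the same connection, which matches the undirected view used for the global topology.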


First Step - AS geolocation

“An AS is a connected group of one or more IP prefixes run by one or more IP network operators which has a single and clearly defined routing policy” (RFC 1930)

For each AS

1 We collect its IP prefixes from BGP data

2 We geolocate it by geolocating its prefixes (MaxMind GeoLite database)

96% of the 39,974 ASes are located in only one region

88% of the 139,944 connections involve at least one AS located in only one region
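The two steps above can be sketched as follows. This is a minimal illustration, assuming that a geolocation database lookup has already produced a hypothetical prefix_regions mapping from each prefix to the set of regions its addresses fall in. The AS numbers, prefixes and region assignments are invented.

```python
def as_regions(as_prefixes, prefix_regions):
    """Map each AS to the union of the regions of its prefixes."""
    return {
        asn: set().union(*(prefix_regions.get(p, set()) for p in prefixes))
        for asn, prefixes in as_prefixes.items()
    }


# Hypothetical input: prefixes announced by each AS (as seen in BGP data).
as_prefixes = {
    64496: ["192.0.2.0/24"],
    64497: ["198.51.100.0/24", "203.0.113.0/24"],
}
# Hypothetical output of the geolocation step, one region set per prefix.
prefix_regions = {
    "192.0.2.0/24": {"Europe"},
    "198.51.100.0/24": {"Europe"},
    "203.0.113.0/24": {"North America"},
}

regions = as_regions(as_prefixes, prefix_regions)
# ASes whose prefixes all fall in one region.
single_region = {asn for asn, r in regions.items() if len(r) == 1}
print(regions, single_region)
```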


Second Step - Single Region ASes

The connection A-B is geolocated in North America

BGP requires the interfaces of the connection to share the same IP subnet (exception: BGP multihop)

The subnet S belongs either to AS A or to AS B


Second Step - Single Region ASes (cont.)

A does not own any IP address outside North America!


Second Step - Single Region ASes (cont.)

In any case IP A must be in North America

⇒ The connection is geolocated in the single common region
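The single-region argument can be expressed as a small rule. The sketch below assumes each AS is described by its set of regions. The function name and the None return value for the cases the rule does not cover are illustrative choices, not part of the original method description.

```python
def geolocate_connection(regions_a, regions_b):
    """Return the region of connection A-B if the single-region rule applies, else None."""
    common = regions_a & regions_b
    # The rule needs at least one endpoint located in exactly one region.
    single = len(regions_a) == 1 or len(regions_b) == 1
    if single and len(common) == 1:
        return next(iter(common))
    # No shared region (e.g. multihop or partial geolocation) or both ASes multi-region.
    return None


a = {"North America"}                        # single-region AS
b = {"North America", "Europe"}              # multi-region AS
print(geolocate_connection(a, b))            # the only region A and B share
print(geolocate_connection(a, {"Europe"}))   # no region in common
```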


Second Step - Other cases

We exploit single region ASes or SOURCE and DESTINATION regions

We geolocate the connection in North America (regional principle)


Regional Topologies: EU and NA case

              Europe        North America   World
ASes          17,101        15,894          39,974
Connections   72,581        42,610          139,944
Avg. Degree   8.49          5.36            6.97
Max Degree    1818 (RETN)   2542 (Level3)   3418 (Cogent)

[Figure: log-log plot of P(X > x) versus x = k (node degree), with curves for Europe, North America and World]
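For readers who want to reproduce this kind of degree statistic on their own edge list, the sketch below computes the average degree (2 × connections / ASes) and the complementary cumulative distribution P(X > x) of the node degree. The toy edge list is invented, and the code assumes each connection appears only once.

```python
from collections import Counter


def node_degrees(edges):
    """Count how many connections each AS takes part in."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


def average_degree(edges):
    """Average degree of an undirected graph: 2 * edges / nodes."""
    return 2 * len(edges) / len(node_degrees(edges))


def degree_ccdf(edges):
    """Return {x: P(X > x)} for every degree value x observed in the graph."""
    degree = node_degrees(edges)
    n = len(degree)
    return {
        x: sum(1 for k in degree.values() if k > x) / n
        for x in sorted(set(degree.values()))
    }


# Hypothetical toy topology: AS 1 connects to 2, 3 and 4, and 2 connects to 3.
toy_edges = [(1, 2), (1, 3), (1, 4), (2, 3)]
print(average_degree(toy_edges))
print(degree_ccdf(toy_edges))
```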


Regional Topologies: EU and NA case

[Figure: plot of P(X > x) versus x = kNN/max(k), logarithmic x axis and linear y axis from 0 to 1, with curves for Europe, North America and World]
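The slide does not define kNN. Assuming it denotes the average degree of a node's neighbors (the usual meaning of k_nn in the complex-network literature), the normalized quantity kNN/max(k) could be computed as in this sketch. The function name and the toy edge list are invented.

```python
from collections import defaultdict


def normalized_knn(edges):
    """Return {node: average neighbor degree / maximum degree in the graph}."""
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    degree = {node: len(nb) for node, nb in neighbors.items()}
    k_max = max(degree.values())
    return {
        node: sum(degree[m] for m in nb) / len(nb) / k_max
        for node, nb in neighbors.items()
    }


# Hypothetical toy topology: AS 1 connects to 2, 3 and 4, and 2 connects to 3.
knn = normalized_knn([(1, 2), (1, 3), (1, 4), (2, 3)])
print(knn)
```

In this toy graph the leaf AS 4 has a single neighbor, the maximum-degree AS 1, so its normalized value is 1.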


Economic Analysis

In order to get better insights we investigate the economic nature of the connections

Classic Economic Tags

Provider-to-Customer (P2C), Peer-to-Peer (P2P), Sibling-to-Sibling (S2S)

We adapted an economic tagging algorithm* to deal with geographic information

*Enrico Gregori, Alessandro Improta, Luciano Lenzini, Lorenzo Rossi, Luca Sani: BGP and inter-AS Economic Relationships, IFIP Networking ’11


Economic Analysis - Results

       Europe    North America   World
P2C    32,471    31,820          80,095
P2P    39,813    10,230          58,040
S2S    297       560             1,743

P2C = Provider-to-Customer, P2P = Peer-to-Peer, S2S = Sibling-to-Sibling

Europe vs North America case

They have a similar number of P2C connections

Europe has many more P2P connections

IXPs play a fundamental role in this difference
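To put the Europe vs North America comparison in relative terms, the short calculation below derives the P2P share of tagged connections from the counts on this slide. The counts are copied from the table, while the dictionary layout and function name are illustrative.

```python
# Counts copied from the table on this slide.
tags = {
    "Europe": {"P2C": 32471, "P2P": 39813, "S2S": 297},
    "North America": {"P2C": 31820, "P2P": 10230, "S2S": 560},
}


def p2p_share(counts):
    """Fraction of tagged connections that are peer-to-peer."""
    return counts["P2P"] / sum(counts.values())


for region, counts in tags.items():
    print(f"{region}: {p2p_share(counts):.1%} of tagged connections are P2P")
```

With these numbers roughly 55% of the European connections are P2P, against about 24% in North America.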


Conclusion

We developed a methodology to infer continental AS-level topologies

We analyzed their graph and economic properties

We highlighted structural differences otherwise hidden in the global topology


Future

Fine-grained analysis (requires a high-precision geolocation tool)

Sensitivity of the results with respect to geolocation databases

Influence of current BGP feeders distribution on the results


The End

Thank you for your attention!

Questions

[email protected]


Backup

Backup Slides


Active Measurement

Active measurement to solve geolocation of particular connections


Step 2 - FROM & NEXT HOP

The FROM field identifies the BGP neighbor

The NEXT HOP field identifies the IP address of the neighbor


Economic Analysis - Tag Changes

Tag changes from the worldwide to the regional scenarios

                     Africa   Asia Pacific   Europe   Latin America   North America
Peering to transit   12       86             325      36              219
Transit to peering   165      824            2,304    361             1,136


Geolocation issues

a) 145 ASes are not geolocated at all

b) Pairs of ASes that do not share any region (partial geolocation or multihop)

These do not appear in any regional topology:

6,141 of 139,944 connections

199 of 39,974 ASes ⇐ 145 because of a), 44 because of b)


Tagging algorithm

step A: Inference of all the possible economic relationships for each direct AS connection

direct means that (A,B) ≠ (B,A)

It is based on the approach proposed by Oliveira et al. in [2]

The list of Tier-1 ASes provided by Wikipedia has been exploited

For each tag, the lifespan of the AS path used is maintained

At the end of this step we have multiple (tag, lifespan) pairs for each connection


Tagging algorithm

step B: Inference of a single economic relationship for each direct AS connection

All (tag, lifespan) pairs related to the same direct connection have to be merged

Find the max lifespan among all the pairs

Merge only those pairs that have a lifespan comparable with the max, i.e. those that do not differ from the max by more than N orders of magnitude

Record the largest lifespan as the lifespan of the resulting tag

Merge table (rows and columns are the two [A, B] tags being merged):

        p2c   p2p   c2p   s2s
p2c     p2c   p2c   s2s   s2s
p2p     p2c   p2p   c2p   s2s
c2p     s2s   c2p   c2p   s2s
s2s     s2s   s2s   s2s   s2s
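A minimal sketch of step B, under stated assumptions: lifespans are positive numbers in an arbitrary unit, "comparable" is read as a base-10 ratio of at most N orders of magnitude, and the default N = 1 is an arbitrary placeholder because the slide does not give a value. The MERGE dictionary transcribes the merge table from this slide. The function name and sample pairs are invented.

```python
import math
from functools import reduce

TAGS = ("p2c", "p2p", "c2p", "s2s")
# Merge table from the slide: MERGE[row][column].
MERGE = {
    "p2c": dict(zip(TAGS, ("p2c", "p2c", "s2s", "s2s"))),
    "p2p": dict(zip(TAGS, ("p2c", "p2p", "c2p", "s2s"))),
    "c2p": dict(zip(TAGS, ("s2s", "c2p", "c2p", "s2s"))),
    "s2s": dict(zip(TAGS, ("s2s", "s2s", "s2s", "s2s"))),
}


def merge_pairs(pairs, n_orders=1):
    """Merge the (tag, lifespan) pairs of one direct connection into a single pair.

    Assumes every lifespan is strictly positive.
    """
    max_life = max(life for _, life in pairs)
    # Keep only tags whose lifespan is within n_orders orders of magnitude of the max.
    comparable = [
        tag for tag, life in pairs
        if math.log10(max_life / life) <= n_orders
    ]
    # Fold the surviving tags through the merge table.
    tag = reduce(lambda a, b: MERGE[a][b], comparable)
    # The resulting tag inherits the largest lifespan.
    return tag, max_life


# Hypothetical pairs: the short-lived c2p observation is discarded.
print(merge_pairs([("p2c", 2_000_000), ("p2p", 900_000), ("c2p", 30)]))
# Comparable p2c and c2p observations merge into s2s.
print(merge_pairs([("p2c", 100), ("c2p", 80)]))
```

Because the table is symmetric and its merge operation is associative, the order in which the surviving tags are folded does not change the result.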


Tagging algorithm

step C: Final tagging and two-way validation

To obtain the economic relationship between AS A and AS B, the tags inferred for the (A,B) and (B,A) connections have to be merged

The approach is the same as in step B, taking into account the different direction of the connections, e.g. (A,B) = p2c and (B,A) = c2p have the same meaning

The merge is still based on lifespan: if the lifespans are not comparable, only the longer-lasting tag affects the final tag

If there is a tag for both (A,B) and (B,A) and their lifespans are comparable, then the tag is said to be two-way validated
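Step C could be sketched as below. This is an illustration under assumptions, not the authors' implementation: lifespans are strictly positive, "comparable" means a base-10 ratio of at most N orders of magnitude (N = 1 is an arbitrary placeholder), and merge_tags is a compact re-statement of the step B merge table. All function names are invented.

```python
import math

# Reverse a tag so that a (B,A) observation is expressed from A's point of view.
FLIP = {"p2c": "c2p", "c2p": "p2c", "p2p": "p2p", "s2s": "s2s"}


def merge_tags(a, b):
    """Compact form of the step B merge table."""
    if a == b:
        return a
    if "p2p" in (a, b) and "s2s" not in (a, b):
        return b if a == "p2p" else a  # p2p yields to p2c or c2p
    return "s2s"  # p2c vs c2p, or anything merged with s2s


def final_tag(ab, ba, n_orders=1):
    """ab, ba: (tag, lifespan) for (A,B) and (B,A), or None if not observed.

    Returns (tag from A's point of view, two_way_validated).
    """
    if ab is None and ba is None:
        raise ValueError("no tag in either direction")
    if ba is None:
        return ab[0], False
    if ab is None:
        return FLIP[ba[0]], False
    (tag_ab, life_ab), (tag_ba, life_ba) = ab, ba
    tag_ba = FLIP[tag_ba]
    ratio = max(life_ab, life_ba) / min(life_ab, life_ba)
    if math.log10(ratio) > n_orders:
        # Lifespans not comparable: only the longer-lasting tag counts.
        return (tag_ab if life_ab >= life_ba else tag_ba), False
    return merge_tags(tag_ab, tag_ba), True


print(final_tag(("p2c", 1000), ("c2p", 800)))   # same meaning in both directions
print(final_tag(("p2p", 5), ("c2p", 100_000)))  # short-lived p2p is ignored
```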


Step 1 - Inferring Enhanced Routes from BGP data

IP Geolocation Database: Maxmind GeoLiteCity

The geolocation of the IP feeder is trivial

The geolocation of a /X prefix requires geolocating 2^(32−X) IP addresses

What about AS geolocation?
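To give a feel for the 2^(32−X) figure, the following snippet computes the number of IPv4 addresses covered by a few prefixes using Python's standard ipaddress module. The prefixes are arbitrary examples and the function name is invented.

```python
import ipaddress


def addresses_in_prefix(prefix):
    """Number of IPv4 addresses in a /X prefix, i.e. 2 ** (32 - X)."""
    net = ipaddress.ip_network(prefix)
    if net.version != 4:
        raise ValueError("the 2 ** (32 - X) formula applies to IPv4 prefixes")
    return 2 ** (32 - net.prefixlen)


for p in ("192.0.2.0/24", "198.18.0.0/15", "10.0.0.0/8"):
    print(p, addresses_in_prefix(p))
```

A /24 covers 256 addresses while a /8 covers 16,777,216, which is why geolocating short prefixes address by address is costly.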


Step 2 - Detection of SRLTPs inside enhanced Routes

SRLTP = Single Region Located Transit Point

In each enhanced route we find regions through which the traffic has to transit:

1 SOURCE REGION, DEST REGION and one-region located ASes

2 ASes with only one region in common with neighbors
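One possible reading of the two rules is sketched below. It is an interpretation rather than the authors' algorithm: an enhanced route is modeled as a source region, a destination region and one region set per AS on the path, and rule 2 is read as "the AS shares exactly one region with an adjacent AS on the path". The data layout, function name and sample route are all assumptions.

```python
def find_srltps(as_regions, source_region, dest_region):
    """Return (position, region) transit points for one enhanced route.

    as_regions: list of region sets, one per AS along the path.
    Positions -1 and len(as_regions) stand for the SOURCE and DEST regions.
    """
    points = [(-1, source_region)]
    for i, regions in enumerate(as_regions):
        if len(regions) == 1:
            # Rule 1: the AS is located in a single region.
            points.append((i, next(iter(regions))))
            continue
        # Rule 2 (as interpreted here): exactly one region shared with a neighbor.
        for j in (i - 1, i + 1):
            if 0 <= j < len(as_regions):
                common = regions & as_regions[j]
                if len(common) == 1:
                    point = (i, next(iter(common)))
                    if point not in points:
                        points.append(point)
    points.append((len(as_regions), dest_region))
    return points


# Hypothetical enhanced route crossing three regions.
route = [
    {"Europe"},
    {"Europe", "North America"},
    {"North America", "Asia Pacific"},
    {"Asia Pacific"},
]
srltps = find_srltps(route, "Europe", "Asia Pacific")
print(srltps)
```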


Step 2 - Detection of SRLTP - Examples


Step 3 - Inferring Geographic AS paths

For each enhanced route we analyze each SRLTP

Given the region of an SRLTP we try to expand the set of connections in that region


Regional Topologies - Results

                       Africa      Asia Pacific   Europe      Latin America   North America   World
ASes                   815         6,427          17,101      2,453           15,894          39,974
Connections            2,002       18,040         72,581      8,329           42,610          139,944
Avg. Overlap (Conns)   0.03±0.01   0.05±0.02      0.03±0.02   0.03±0.01       0.05±0.02       -
Avg. Degree            4.90        5.61           8.49        6.79            5.36            6.68


Economic Analysis - Results

       Africa   Asia Pacific   Europe   Latin America   North America   World
P2C    1,456    12,808         32,471   4,514           31,820          80,095
P2P    492      5,012          39,747   3,719           10,164          58,040
S2S    21       102            297      37              350             1,743

P2C = Provider-to-Customer, P2P = Peer-to-Peer, S2S = Sibling-to-Sibling


Economic Analysis

In order to get better insights we investigate the economic nature of the connections

          Original*                         Adapted
Input:    AS paths + Lifespans              Geographic AS paths + Lifespans
Output:   Global Economic Tagged Topology   Regional Economic Tagged Topologies

Classic Economic Tags

Provider-to-Customer (P2C), Peer-to-Peer (P2P), Sibling-to-Sibling (S2S)

*Enrico Gregori, Alessandro Improta, Luciano Lenzini, Lorenzo Rossi, Luca Sani: BGP and inter-AS Economic Relationships, IFIP Networking ’11


Economic Relationships

provider-to-customer: the customer pays the provider to reach all ASes that it cannot reach in other ways

peer-to-peer: the two ASes exploit each other to reach their customer cones (typically free of charge)

sibling-to-sibling: each AS acts as a provider for the other
