IEPM/PingER Project, Les Cottrell, SLAC. DoE 2004 PI Network Research Meeting, FNAL, Sep 15-17 ‘04. www.slac.stanford.edu/grp/scs/net/talk03/scidac-pinger-sep04.ppt


IEPM/PingER Project

Les Cottrell, SLAC
DoE 2004 PI Network Research Meeting, FNAL, Sep 15-17 ‘04
www.slac.stanford.edu/grp/scs/net/talk03/scidac-pinger-sep04.ppt


Outline
• PingER
  – Purpose etc.
  – Methodology
  – Results
• PingER-NG ≡ IEPM-BW
  – Low network impact bandwidth tool (INCITE)
  – Traceroute visualization
  – Topology (INCITE)


PingER
• Uses ping to provide lightweight performance monitoring:
  – < 100 bits/s per pair measured
  – No software to install at remote sites
  – Measures loss, RTT, reachability, jitter
• For planning, troubleshooting
• Originally (1990s) for HENP sites
• More recently also to characterize the Digital Divide
  – ICFA/SCIC, Internet2 Hard to Reach Places, WSIS, ICTP/eJDS


Methodology

• Use ubiquitous ping

• Every 30 minutes from monitoring site to target:
  – 1 ping to prime caches
  – by default send 11 x 100 Byte pkts followed by 10 x 1000 Byte pkts
• Low network impact + no software to install / configure / maintain at remote sites + no passwords / accounts needed = good for developing sites / regions
• Record loss & RTT (+ reorders, duplicates)
• Derive throughput, jitter, unreachability …
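The probe schedule above can be checked with a little arithmetic, and the recorded loss and RTT can be turned into a derived throughput. The sketch below is illustrative only: it counts payload bytes and ignores the priming ping, and it assumes the derived throughput follows the widely used Mathis et al. approximation, rate ~ MSS / (RTT * sqrt(loss)), with an assumed MSS of 1460 bytes. The function names are hypothetical and are not taken from the PingER code.

```python
import math

# Default schedule as stated on the slide: 11 x 100 Byte and 10 x 1000 Byte
# pings every 30 minutes (the single cache-priming ping is ignored here).
SMALL_PKTS, SMALL_BYTES = 11, 100
LARGE_PKTS, LARGE_BYTES = 10, 1000
INTERVAL_S = 30 * 60


def probe_load_bits_per_s():
    """Average offered load for one monitor/target pair, payload bytes only."""
    total_bytes = SMALL_PKTS * SMALL_BYTES + LARGE_PKTS * LARGE_BYTES
    return total_bytes * 8 / INTERVAL_S


def summarize(rtts_ms, sent):
    """Summarize one measurement: rtts_ms are the RTTs of replies received."""
    received = len(rtts_ms)
    loss = 1 - received / sent
    avg_rtt = sum(rtts_ms) / received if received else None
    return {"loss": loss, "avg_rtt_ms": avg_rtt, "unreachable": received == 0}


def derived_throughput_kbits(avg_rtt_ms, loss, mss_bytes=1460):
    """Assumed Mathis-style estimate; bits per millisecond equals kbits/s."""
    if avg_rtt_ms is None or loss <= 0:
        return None  # the approximation is undefined without loss or RTT
    return mss_bytes * 8 / (avg_rtt_ms * math.sqrt(loss))
```

Under these assumptions the schedule offers about 49 bits/s per pair, consistent with the "< 100 bits/s" figure, and a path with 200 ms RTT and 1% loss yields a derived throughput of roughly 584 kbits/s.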


Architecture

• Hierarchical vs. full mesh

[Figure: PingER architecture diagram. Monitoring sites (~35) ping Remote sites (~550); Archive sites (SLAC, FNAL) gather the data from the Monitoring sites via HTTP, cache it, and publish Reports & Data on the WWW. The unit of measurement is 1 monitor host / remote host pair.]


Coverage
• In last 9 months added:
  – Several sites in Russia (thanks GLORIAD)
  – Many hosts in Africa (5 => 36 now, in 27 out of 54 countries)
  – Monitoring sites in Pakistan and Brazil (Sao Paulo and Rio)
• Now monitoring 650 sites in 115 countries
• Working to install monitoring host in Bangalore, India

[Figure: world map of PingER coverage. Legend: Monitoring site, Remote site.]


World View
• S.E. Europe, Russia: catching up
• Latin Am., Mid East, China: keeping up
• India, Africa: falling behind
• C. Asia, Russia, S.E. Europe, L. America, M. East, China: 4-5 yrs behind
• India, Africa: 7 yrs behind
• Important for policy makers
• 50% improvement/year ~ factor of 10 in < 6 years

[Figure: TCP throughput measured from N. America to World Regions. x-axis: Jan-95 to Dec-04; y-axis: Derived TCP throughput in KBytes/sec (log scale, 1 to 10000). Series: Edu (141), Europe (150), Canada (27), Latin America (37), S.E. Europe (21), Russia (17), Mid East (16), China (13), Africa (30), India (7), C. Asia (8), Caucasus (8). From the PingER project, Aug 2004.]
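The "50% improvement/year ~ factor of 10 in < 6 years" rule of thumb is compound growth, and the same arithmetic gives a feel for what "N years behind" means as a throughput ratio. A minimal sketch, assuming a constant 50%/year rate applies equally to every region (a simplification; the helper names are hypothetical):

```python
import math


def years_to_gain(factor, annual_improvement=0.5):
    """Years of compound improvement needed to gain the given factor."""
    return math.log(factor) / math.log(1 + annual_improvement)


def gap_factor(years_behind, annual_improvement=0.5):
    """Throughput ratio implied by a lag of `years_behind` years."""
    return (1 + annual_improvement) ** years_behind
```

With these assumptions a factor of 10 takes about 5.7 years, and a 7 year lag corresponds to roughly a factor of 17 in throughput.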


View from CERN
• Confirms view from N. America

[Figure: TCP throughput from CERN to World Regions. x-axis: Feb-98 to Dec-04; y-axis: Derived TCP throughput in Kbits/s (log scale, 1 to 100000). Series: Europe, N America, SE Europe, M East, Russia, L America, Africa, China, India. From the PingER project, August 2004.]


From Developing Regions

Brazil (Sao Paulo)
[Figure: TCP throughput measured from Brazil to World Regions. x-axis: Jan-04 to Aug-04; y-axis: Derived TCP throughput in KBytes/s (log scale, 10 to 10000). Series: Africa, E. Asia, Europe, N. America, Russia, S. America, S. Asia. Labeled curves: Latin America, Europe, N. America.]
• As expected Brazil to L. America is good
• Actually dominated by Brazil to Brazil
• To Chile & Uruguay poor since goes via US

Novosibirsk
[Figure: TCP throughput from Novosibirsk to world regions. x-axis: Sep-02 to Aug-04; y-axis: Derived throughput in Kbits/s (log scale, 1 to 10000). Series: Africa, Australasia, Balkans, E. Asia, Europe, M. East, N. America, Russia, S. America, S. Asia. Labeled curves: Moscow, Japan/China, N. America. Annotation: big loss increase to Moscow (from < 1% to 2-3%).]
• NSK to Moscow used to be OK but loss went up in Sep. 2003
• GLORIAD may help


Technology Achievement Index (TAI)

• TAI captures how well a country is creating and diffusing technology and building a human skills base.

• TAI from UNDP hdr.undp.org/reports/global/2001/en/pdf/techindex.pdf

TAI top 12
Country          TAI
Finland          0.744
US               0.733
Sweden           0.703
Japan            0.698
Korea Rep. of    0.666
Netherlands      0.630
UK               0.606
Canada           0.589
Australia        0.587
Singapore        0.585
Germany          0.583
Norway           0.579

US & Canada off-scale


PingER-NG = IEPM-BW
• Need measurement tools for high-performance paths/applications
  – BER 10^-8 takes > day to see 1 loss
  – Ping losses ≠ TCP losses
• Build infrastructure to
  – Measure with:
    • Iperf (TCP mem-to-mem), GridFTP, bbftp
    • Lightweight packet pair dispersion
  – Evaluate measurement tools
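The "> day to see 1 loss" point can be illustrated with rough numbers. The sketch below reads BER as a bit error rate, assumes each bit error costs exactly one probe packet, and uses the default PingER schedule (11 x 100 Byte plus 10 x 1000 Byte pings every 30 minutes); all three are assumptions made for illustration, and the names are hypothetical.

```python
# Bits sent per day to one target under the assumed default ping schedule.
BITS_PER_CYCLE = (11 * 100 + 10 * 1000) * 8   # payload bits per 30 min cycle
CYCLES_PER_DAY = 48
PING_BITS_PER_DAY = BITS_PER_CYCLE * CYCLES_PER_DAY


def days_to_first_loss(ber, bits_per_day=PING_BITS_PER_DAY):
    """Expected days until one bit error, i.e. one lost probe packet."""
    return 1 / (ber * bits_per_day)


def bits_per_s_for_daily_loss(ber):
    """Probe rate needed to expect one loss per day at the given BER."""
    return 1 / ber / 86400
```

Under these assumptions a path with BER 10^-8 shows one lost ping roughly every 23 days, and seeing one loss per day would need a sustained probe rate of about 1.2 kbits/s, which is why ping alone says little about high-performance paths.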


Low impact bandwidth measurement
• Goals:
  – Make a measurement in < 1 second rather than tens of seconds
  – Inject little network traffic
  – Provide reasonable agreement with more intense methods (e.g. iperf)
• Enables:
  – Measurements of low performance links (e.g. to developing countries)
  – Helps avoid need for scheduling
  – More frequent measurements (minutes vs. hours)
  – Lower impact, more friendly


Low impact Bandwidth
• Use 20 packet pairs to roughly estimate dynamic bw Capacity & Xtraffic, then Available = Capacity – Xtraffic
  – Capacity from min pair separation; Xtraffic from packet pair dispersion

[Figure: ABwE SLAC to Caltech, Mar 19, 2004. Curves: Dynamic bandwidth capacity (DBC); Cross-traffic; Available bandwidth = DBC – X-traffic; Iperf.]
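The packet pair idea can be sketched in a few lines. This is a simplified illustration, not the estimator used by the actual tool: it assumes the capacity is the packet size divided by the smallest observed pair separation, and that the share of total dispersion above that minimum is the fraction of capacity taken by cross-traffic. The function name and the 1500 byte packet size are assumptions.

```python
def estimate_bandwidth(dispersions_s, packet_bytes=1500):
    """Estimate (capacity, cross-traffic, available) in bits/s.

    dispersions_s: receiver-side separation of each packet pair, in seconds.
    """
    bits = packet_bytes * 8
    t_min = min(dispersions_s)
    # Capacity: the pair that was least delayed crossed the bottleneck
    # back to back, so its separation is the bottleneck transmission time.
    capacity = bits / t_min
    # Cross-traffic: extra separation beyond the minimum is time the
    # bottleneck spent sending other traffic between the two packets.
    extra = sum(t - t_min for t in dispersions_s)
    busy_fraction = extra / sum(dispersions_s)
    xtraffic = capacity * busy_fraction
    return capacity, xtraffic, capacity - xtraffic
```

For example, 20 pairs of which half arrive 12 microseconds apart and half 24 microseconds apart give a capacity of about 1 Gbit/s, cross-traffic of about one third of that, and the remainder as available bandwidth.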


Achievable throughput & file transfer

• IEPM-BW
  – High impact (iperf, bbftp, GridFTP …) measurements at 90 ± 15 min intervals

[Figure: IEPM-BW time series display. Labels: Select focal area; Fwd route change; Rev route change; Min RTT; Avg RTT; Iperf; bbftp; iperf1; abing.]


Anomalous Event Detection
• Too many graphs to scan by hand, need to automate
  – SLAC to Caltech link performance dropped by factor 5 for ~ a month before noticed, fixed within 4 hours of reporting
• Looking for long-term step down changes in bandwidth
• Use modified “plateau” algorithm from NLANR
  – Divide data into history & trigger buffer
  – If y < mh – β * σh then trigger, else history (mh, σh = mean and standard deviation of the history buffer; β = sensitivity)
    • When trigger buffer fills: if mt < δ * mh, then have an event (mt = mean of the trigger buffer; δ = threshold)
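The two-buffer scheme can be sketched as follows. This is one reading of the slide, not the deployed code: a sample goes to the trigger buffer when it falls more than `sensitivity` standard deviations below the history mean, and an event is declared when the full trigger buffer's mean is below `threshold` times the history mean. The reset rules (an ordinary sample clears the trigger buffer; after a decision the trigger data becomes the new history), the exact form of the event test, and all names are assumptions.

```python
from collections import deque
from statistics import mean, pstdev


def detect_step_downs(samples, history_len=600, trigger_len=60,
                      sensitivity=2.0, threshold=0.4):
    """Return a list of (index, history_mean, trigger_mean) events."""
    history = deque(maxlen=history_len)  # recent "normal" samples
    trigger = []                         # consecutive suspiciously low samples
    events = []
    for i, y in enumerate(samples):
        if len(history) < 2:
            history.append(y)            # warm-up: need data for mean/stdev
            continue
        m_h = mean(history)
        s_h = pstdev(history)
        if y < m_h - sensitivity * s_h:
            trigger.append(y)
            if len(trigger) == trigger_len:
                m_t = mean(trigger)
                if m_t < threshold * m_h:
                    events.append((i, m_h, m_t))
                # Assumption: the low plateau becomes the new baseline.
                history.clear()
                history.extend(trigger)
                trigger = []
        else:
            history.append(y)
            trigger = []                 # assumption: a normal sample resets
    return events
```

With small buffers for illustration, `detect_step_downs([100] * 50 + [20] * 10, history_len=50, trigger_len=5)` reports a single event once five low samples have accumulated, while a drop from 100 to 80 fills the trigger buffer but does not pass the 0.4 threshold test.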


Route table Example
• Compact so can see many routes at once

[Figure: traceroute route table web page, with callouts:]
  – History navigation
  – Multiple route changes (due to GEANT), later restored to original route
  – Available bandwidth
  – Raw traceroute logs for debugging
  – Textual summary of traceroutes for email to ISP
  – Description of route numbers with date last seen
  – User readable (web table) routes for this host for this day
  – Route # at start of day, gives idea of route stability
  – Mouseover for hops & RTT
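A compact route table of this kind relies on giving each distinct path a short number. The sketch below shows one plausible way to do that, assuming a traceroute is reduced to its ordered list of hop addresses and that two traceroutes are the same route only when those lists are identical. The function names and data layout are hypothetical and are not taken from the IEPM-BW code.

```python
def number_routes(traceroutes):
    """Assign a small integer to each distinct hop sequence.

    traceroutes: iterable of (timestamp, hops), hops being a list of
    router addresses in path order.
    Returns (timeline, route_ids, last_seen).
    """
    route_ids = {}   # hop tuple -> route number
    last_seen = {}   # route number -> latest timestamp it was observed
    timeline = []    # (timestamp, route number), one entry per traceroute
    for ts, hops in traceroutes:
        rid = route_ids.setdefault(tuple(hops), len(route_ids))
        last_seen[rid] = ts
        timeline.append((ts, rid))
    return timeline, route_ids, last_seen


def route_changes(timeline):
    """List (timestamp, old_route, new_route) wherever the route changed."""
    changes = []
    for (_, prev), (ts, cur) in zip(timeline, timeline[1:]):
        if cur != prev:
            changes.append((ts, prev, cur))
    return changes
```

The timeline gives one number per measurement, which is what makes a whole day of routes fit on one table row, and `last_seen` corresponds to the "date last seen" column in the route description.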


Another example

[Figure: route table web page, with callouts:]
  – TCP probe type
  – Host not pingable
  – Intermediate router does not respond
  – ICMP checksum error
  – Level change
  – Get AS information for routes


Topology
• Choose times and hosts and submit request
• Nodes colored by ISP
• Mouseover shows node names
• Click on node to see subroutes
• Click on end node to see its path back
• Also can get raw traceroutes with AS's

[Figure: topology map of routes from SLAC. Node labels: DL, CLRC, IN2P3, CESnet, ESnet, JAnet, GEANT. Callouts: Alternate route; Hour of day.]


Putting it together

[Figure: Bandwidth from SLAC to Supernet.org, June 2, 2004. x-axis: 6/2/04 0:00 to 6/3/04 0:00; y-axis: Bandwidth in Mbits/s (0 to 1000). Curves: Xtr, Abw, Cap; reference lines mh and mh – 2 σh; route changes marked. Network labels: SLAC, ESnet, CENIC, Abilene, SOX, Supernet.]
• mh = 954 Mbits/s, mt = 753 Mbits/s
• (mh – mt) / sqrt((σh^2 + σt^2) / 2) = 2.4
• sensitivity = 2; threshold 40%; history buffer length l = 600; trigger buffer length t = 60


New features in works (with NIIT)

• Improve new site set-up tools
• Improve management
  – Discover non working links faster
• Improve access to data and meta data
  – Provide data base with lat/long, country etc.
  – Add web services access
• Improve visualization:
  – Provide map with drill down to node information
  – Automate production of long term trend plots for regions
  – More node selection capabilities
• Traceroute measurement and analysis


More
• PingER Project
  – http://www-iepm.slac.stanford.edu/pinger/
  – IEEE Communications Magazine on Network Traffic Measurements and Experiments.
• ICFA/SCIC Network Monitoring report, Jan ‘04
  – http://www.slac.stanford.edu/xorg/icfa/icfa-net-paper-jan04/
• IEPM-BW
  – http://www-iepm.slac.stanford.edu/