/afs/slac/u/sf/cottrell/talk/escc/oct97

ESnet NMTF/NMFG - Status

Les Cottrell, SLAC & Dave Martin, HEPNRC

<cottrell@slac.stanford.edu>, <dem@hep.net>

Presented at the ESCC Meeting, JLAB, Oct 1997

Outline of Talk

• What happened to the NMTF/NMFG?

• What are we measuring?

• How are we measuring?

• Tools we are using/developing

• Coordination with others

• Next Steps

• Summary

What happened to the NMTF/NMFG?

• It evolved

– Some of the original members (BNL & ORNL) were unable to continue the effort

– SLAC & HEPNRC retained focus on monitoring

– ICFA concerned about impact of network performance on HENP research

• Created NTF with various WGs, one on Monitoring

• More focus on HENP issues and international links

• Embraced work done by NMTF/NMFG and supported continued development

• Brought in new partners, in particular INFN and CERN, as well as other collection sites

Mission etc. of the ICFA-NTF WG on Monitoring

• Mission of Group:

– Obtain as uniform a picture as possible of the present performance of the connectivity used by the ICFA community

• Two meetings so far, CHEP97 (Apr-97), & Santa Fe (Sep-97)

• Produced an interim status report for Sep-97

• Will update for Dec-97, with a final report Apr-98.

Our Main Metric is Ping

• “Universally available”, easy to understand

– No software for clients to install

• Low network impact

• Provides loss, response time, reachability, unpredictability

• Select hosts carefully: concerns over routers, loaded hosts etc. (provide guidelines)

• Does provide useful measures

Ping Response Time vs Bytes

Ping Response vs Web Response

[Figure: HTTP GET response (ms, 0 to 2000) plotted against minimum ping response (ms, 0 to 500), with the lines y = 2x, y = 1.71x + 720, and y = 2.57x]

Method - Measurement

• Each Collection site keeps list of remote hosts to ping at sites it is interested in

• Every 30 mins ping each remote host with 11 * 100 byte followed by 10 * 1000 byte pings

• Min separation of pings is 1 second, timeout 20 seconds

• Throw away first ping

• Measure response, packet loss, host unreachable (no answer to any ping)

• Record data and make available
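The procedure above can be sketched in Python. This is a minimal illustration, not the timeping code: the function name `summarize_pings` and the representation of a batch as a list of round-trip times in ms (with `None` for a ping that timed out) are assumptions made for the example.

```python
from statistics import mean

def summarize_pings(rtts_ms):
    """Summarize one batch of pings to a single remote host.

    rtts_ms: round-trip times in ms in the order sent; None marks a ping
    that got no reply within the timeout (hypothetical representation).
    """
    kept = rtts_ms[1:]                      # throw away the first ping
    replies = [r for r in kept if r is not None]
    sent = len(kept)
    lost = sent - len(replies)
    return {
        "sent": sent,
        "loss_pct": 100.0 * lost / sent if sent else None,
        # "no answer to any ping" is checked over the whole batch (an assumption)
        "unreachable": all(r is None for r in rtts_ms),
        "min_ms": min(replies) if replies else None,
        "avg_ms": mean(replies) if replies else None,
    }

# 11 pings of 100 bytes: the first is discarded, one of the remaining 10 is lost.
batch = [95.0, 62.0, 61.0, None, 63.0, 60.0, 64.0, 62.0, 61.0, 65.0, 62.0]
summary = summarize_pings(batch)
```

The same function would be applied separately to the 100 byte and the 1000 byte batches, so that the two packet sizes can be compared.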

Architecture

• Three Types of Sites

– Remote Sites - need only to respond to ping packets

– Collecting Sites

• Collecting Data: Perl script pings nodes, records data in common documented format

• Serving Data: CGI/Perl script makes data available to Analysis Sites

• WWW CGI tools make reports available

– Analysis Sites

• Retrieving Data: Perl script retrieves data from Collecting Sites

• Analysis: SAS program analyzes data and generates graphs

• Reports: WWW form makes customized reports available

Architecture

[Diagram: WWW, Analysis sites, Collecting sites, and Remote sites, connected by links labeled HTTP and Pings. Other labels: E.g. HEPNRC, E.g. SLAC, Archive, Reports & Data, Cache]

Available Tools - Data Collection

• Collect data (timeping)

– HEPNRC rearchitected, developed & documented

– Deployed at 12 sites in 6 countries

• ARM, BNL, CERN, CMU, DoE/GMTN, HEPNRC/FNAL, INFN/CNAF, KEK, Hungary, RAL, SLAC, UMD

– DESY, IN2P3, TRIUMF, MSU, Beijing also expressed interest, plus commercial sites

• Data available (pingdata) in common format

– Data collected is available from the collection site via HTTP

– Allows data for specific times to be retrieved

Current Deployment

[Map. Legend: ESnet site (monitored from SLAC), N. American site (monitored from SLAC), International site (monitored from SLAC), Monitoring site. Sites labeled on the map: HEPNRC/FNAL, RAL, INFN/CNAF, CERN, RMKI/KFKI, BNL, KEK, CMU, UMD, SLAC, DESY]

Analysis / Archive Site

• Gathers & archives data

– HEPNRC gathers data from collection sites a few times daily

– Archives the data (200 Mbytes/month)

– Works with collection sites to resolve problems

– Provides Web access to archive data via form (ping_data.pl)

Access to Raw Data

Analysis / Archive Site

• Gathers & archives data

– HEPNRC gathers data from collection sites a few times daily

– Archives the data (200 Mbytes/month)

– Works with collection sites to resolve problems

– Provides Web access to archive data via form (ping_data.pl)

• Provides Web form to allow simple plotting (graph_pings.pl), uses SAS for speed

Form to Select Analysis Graphs

[Figure: graph generated by HEPNRC @ Fermilab, 02SEP97]

Analysis Tools for Collection Sites

• Short-term analysis / reports

– Recent data (e.g. last 30 days cached)

• Web sortable table of latest measurements, colored for quality

Ping Loss Quality

[Figure: Distributions for Host Groups. Stacked percentile bars (0% to 70%) for the groups ESnet, ISP Local, International, N America E, and N America W, annotated with (host-months, median loss) pairs: (76, 5.46), (183, 7.18), (150, 0.79), (199, 6.3), (188, 6.21)]

Loss quality bands:

• <= 1% loss: Good

• > 1% & <= 5% loss: Acceptable

• > 5% & <= 12% loss: Poor

• > 12% & <= 25% loss: Bad

• > 25% loss: Unusable

Similar to Internet Weather Report (<6%, <12%, > 12%)
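The quality bands can be written as a small lookup. This is a sketch using the band boundaries from the figure legend (<= 1%, <= 5%, <= 12%, <= 25%); the function name `loss_quality` is made up for the example.

```python
def loss_quality(loss_pct):
    """Map a packet-loss percentage to the quality label from the legend."""
    if loss_pct <= 1:
        return "Good"
    if loss_pct <= 5:
        return "Acceptable"
    if loss_pct <= 12:
        return "Poor"
    if loss_pct <= 25:
        return "Bad"
    return "Unusable"

# Median losses quoted on the slide for the five host groups
for median_loss in (5.46, 7.18, 0.79, 6.3, 6.21):
    print(median_loss, loss_quality(median_loss))
```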

Analysis Tools for Collection Sites

• Short-term analysis / reports

– Recent data (e.g. last 30 days cached)

• Web sortable table of latest measurements, colored for quality, with output (TSV) for Excel (connectivity.pl)

Latest Ping Measurements

Raw Data from last 24 Hours

Latest Ping Measurements

Ping Performance for Last 180 Days

Analysis Tools for Collection Sites

• Short-term analysis / reports

– Recent data (e.g. last 30 days cached)

• Web sortable table of latest measurements, colored for quality, with output (TSV) for Excel (connectivity.pl)

• Web form to select sites and time frames to be plotted (ping_data_plot.pl)

Request Plot of Collection Site Data

Plot from Collection Site

Tools in Development

• Re-engineering SLAC long term reports

– exception report

Exception Reports

Last 10 Weeks Ping Data

Nodename      Mean 10wks % Loss  Mean 1 wk % Loss  Stdev 10wks Loss  # Std From Mean  Mean 10wks Avg (ms)  Mean 1 wk Avg (ms)
BNL.GOV       0.1                0.4               0.1               2.4              65.4                 66.9
CALTECH.EDU   0.3                0.1               0.6               -0.3             22.9                 22.7
CEBAF.GOV     0.1                0                 0.2               -0.3             62.9                 57.6
CERN.CH       4                  5.7               1.5               1.2              242.9                236.9

Callouts on the screenshot:

• Color highlights extent of exception

• Click here to burrow down to more information

• Click to sort by column
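The "# Std From Mean" column appears to express how far the latest week's loss sits from the 10-week mean, in units of the 10-week standard deviation. The slide does not give the formula, so the following is only a sketch under that assumption; the function name, the choice to include the latest week in the 10-week statistics, the flagging threshold, and the sample data are all hypothetical.

```python
from statistics import mean, stdev

def std_from_mean(weekly_loss_pct):
    """Deviation of the latest week from the mean of all weeks, in stdev units.

    weekly_loss_pct: per-week mean % loss, oldest first (hypothetical input).
    Returns None when the history shows no variation.
    """
    mu = mean(weekly_loss_pct)
    sigma = stdev(weekly_loss_pct)          # sample standard deviation
    if sigma == 0:
        return None
    return (weekly_loss_pct[-1] - mu) / sigma

# Ten made-up weekly loss figures for one node; the last week is an outlier.
history = [0.1, 0.0, 0.2, 0.1, 0.1, 0.0, 0.1, 0.2, 0.1, 0.6]
score = std_from_mean(history)
flagged = score is not None and abs(score) > 2   # threshold is an assumption
```

Expressing the exception in standard deviations rather than absolute loss lets a normally lossy link and a normally clean link be judged on the same scale.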

Tools in Development

• Re-engineering SLAC long term reports

– exception report

– last 180 days

180 Days SLAC - Stanford

[Figure: 180 days of ping data between SLAC and Stanford, Feb-97 to Aug-97. Annotations: Uwave & routing problems; direct connect; via ESnet; loss < 1%; loss 3-6%; response times 30 ms, 20 ms, and 5.5 ms]

Tools in Development

• Re-engineering SLAC long term reports

– exception report

– last 180 days

– monthly points going back for years in tabular form with quality coloring, sorting & hyperlinks

• Loss (by site, and by group of sites)

• Response ( “ “ )

• Reachability ( “ “ )

• % time network “Quiescent” or “Busy”

Ping Loss History

TSV Output to Excel for Further Analysis

Ping Response by Group

[Figure: Median monthly prime-time ping response time from Jan-95 thru Jul-97, seen from SLAC. Y-axis: median ping round trip response time for 100 byte pings (0 to 450); x-axis: month, Jan-95 to Jul-97. Series: ESnet, N America W, N America E, International, each with an exponential trend line.]

Annotations on the plot:

• Main cause of apparent increase in ESnet response is the addition of MIT to monitoring

• Big contribution to International from 2 sites (CN and SU). Without these 2 sites the median response time is closer to 300 ms.

• Added ihep.cn to monitoring

• Big improvement in response time to Dresden

• DESY got bad
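The "Expon." series in the legend are exponential trend lines. One common way to obtain such a trend, assumed here rather than taken from the slides, is a least-squares line fit to the logarithm of the values. The function name and the data below are invented for illustration.

```python
import math

def fit_exponential(ys):
    """Fit y = a * exp(b * x) for x = 0, 1, 2, ... by least squares on ln(y).

    Requires at least two strictly positive values.
    """
    n = len(ys)
    xs = range(n)
    logs = [math.log(y) for y in ys]
    x_mean = sum(xs) / n
    l_mean = sum(logs) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxl = sum((x - x_mean) * (l - l_mean) for x, l in zip(xs, logs))
    b = sxl / sxx                      # growth rate per month
    a = math.exp(l_mean - b * x_mean)  # fitted value at x = 0
    return a, b

# Made-up monthly medians that shrink by 5% per month from 400 ms.
monthly_ms = [400 * 0.95 ** m for m in range(12)]
a, b = fit_exponential(monthly_ms)
```

A negative `b` corresponds to response times that improve over time, a positive `b` to response times that get worse.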

Prime-time Packet Loss by Group

[Figure: Monthly median prime-time ping packet loss by group, Jan-95 thru Jul-97, seen from SLAC. Y-axis: median monthly ping packet loss (0.00 to 20.00); x-axis: month, Jan-95 to Jul-97. Series: ESnet, N. America W, N. America E, International, each with an exponential trend line.]

“Quiescent” Frequency by Group

[Figure: Monthly median frequency for no packet loss by group from Jan-95 thru Jul-96, seen from SLAC. Y-axis: median monthly percentage frequency of no packet loss (50 to 100); x-axis ticks: Jan-95 to Jul-97. Series: ESnet, N America W, N America E, International.]

Annotations on the plot:

• Colorado & US NW bad

• Carleton, McGill & Virginia Tech bad

• Dresden, ETH & RAL bad

• ESnet peers with vBNS

International Site “Busy” Frequency

[Figure: Monthly international site frequency of non-zero loss (out of 10 pings) from Jan-9 thru Jul-97, from SLAC. Y-axis: % frequency of non-zero pings (0 to 90). Series: CERN.CH, ETHZ.CH, IN2P3.FR, DESY.DE, PHY.TU-DRESDE, FZU.CZ, GE.INFN.IT, LNF.INFN.IT, NA.INFN.IT, PD.INFN.IT, ROMA1.INFN.IT, TS.INFN., RL.AC.UK, KEK.JP, IHEP.AC.CN, INP.NSK.SU, with exponential trend lines for RL.AC.UK, ROMA1.INFN.IT, CERN.CH, and DESY.DE.]

Annotations on the plot:

• RL.UK

• UK - US link upgraded

• Italian nodes track & look good

• CERN & IN2P3 track
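The "Quiescent" and "Busy" frequencies can be read as the share of measurement sets with zero and with non-zero packet loss respectively. A minimal sketch under that reading, assuming each sample is the number of pings lost out of a set of 10 (the function name and data are hypothetical):

```python
def busy_quiescent(lost_per_sample):
    """Return (% quiescent, % busy) over a list of per-sample lost-ping counts.

    A sample is one set of 10 pings; it counts as quiescent when no ping
    was lost and as busy otherwise (interpretation assumed from the titles).
    """
    if not lost_per_sample:
        raise ValueError("no samples")
    quiescent = sum(1 for lost in lost_per_sample if lost == 0)
    total = len(lost_per_sample)
    pct_quiescent = 100.0 * quiescent / total
    return pct_quiescent, 100.0 - pct_quiescent

# Eight made-up samples: six with no loss, two with some loss.
samples = [0, 0, 1, 0, 0, 3, 0, 0]
pct_quiescent, pct_busy = busy_quiescent(samples)
```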

Tools in Development

• Re-engineering SLAC long term reports

– exception report

– last 180 days

– monthly points going back for years in tabular form with quality coloring, sorting & hyperlinks

• Loss (by site, and by group of sites)

• Response ( “ “ )

• Reachability ( “ “ )

• % time network “Quiescent” or “Busy”

• Ten Worst links in HEP

Ten Worst HEP Links

Source                Destination                Ping Size  % of Time Unreachable  % of Packets Lost  Average RT Delay (ms)  Standard Deviation of Avg RT Delay
sgiserv.rmki.kfki.hu  www.jinr.dubna.su          100        3.1                    40.4               1305                   576
sgiserv.rmki.kfki.hu  fnal.fnal.gov              100        0.5                    39.8               670                    173
sgiserv.rmki.kfki.hu  unixhub.slac.stanford.edu  100        1.0                    38.4               710                    173
sgiserv.rmki.kfki.hu  hepnrc.hep.net             100        1.6                    38.1               677                    178
sgiserv.rmki.kfki.hu  www.hep.anl.gov            100        0.5                    37.9               670                    170
sgiserv.rmki.kfki.hu  www.slac.stanford.edu      100        1.0                    36.2               717                    166
sgiserv.rmki.kfki.hu  w4.lns.cornell.edu         100        6.8                    35.2               658                    173
dxcnaf.cnaf.infn.it   www.jinr.dubna.su          100        0.6                    33.5               933                    512
hepnrc.hep.net        www.phys.s.u-tokyo.ac.jp   100        20                     30.8               242                    36.7

Ranked by % Packets Lost
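A ranking of this kind can be sketched as a sort on the loss column. The rows below are copied from the table; the tuple layout and the function name `worst_links` are assumptions for the example, not the report's actual implementation.

```python
def worst_links(rows, n=10):
    """Return up to n rows with the highest % of packets lost, worst first.

    Each row is (source, destination, pct_unreachable, pct_lost, avg_rt_ms).
    """
    return sorted(rows, key=lambda row: row[3], reverse=True)[:n]

rows = [
    ("hepnrc.hep.net", "www.phys.s.u-tokyo.ac.jp", 20.0, 30.8, 242),
    ("sgiserv.rmki.kfki.hu", "fnal.fnal.gov", 0.5, 39.8, 670),
    ("dxcnaf.cnaf.infn.it", "www.jinr.dubna.su", 0.6, 33.5, 933),
    ("sgiserv.rmki.kfki.hu", "www.jinr.dubna.su", 3.1, 40.4, 1305),
]
ranked = worst_links(rows, n=3)
```

Sorting on a different tuple index (for example `row[2]` for % of time unreachable) would give an alternative "worst" list from the same data.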

What are Typical Uses

• Setting Expectations

• Service Level Contract

• Choosing ISPs

• Identifying problems, and verifying solutions

• Planning for upgrades

Summary to Help Choose Upgrades

Prime Time Packet Loss Jun-Aug 97

[Figure: Jun-Aug 1997 prime-time ping packet loss between SLAC & 60 sites. One horizontal bar group per site, with sites within each group ranked by ping packet loss. X-axis: 100 byte % ping packet loss (0 to 80). Series: Jun-97, Jul-97, Aug-97. Groups: ESnet, N. America W, N. America E, International.]

Coordination etc.

• XIWT/IPWT Interest/deployment

XIWT/IPWT interest

• Austin meeting in Sep-97

– available tools presented by developers: IWR, CAIDA/NLANR, Intel, Auto Industry/Bellcore, IETF/IPPM Surveyor …

• XIWT/IPWT want to:

– Measure performance of members' own networks

– Get tests to validate and understand what to recommend to other commercial customers and for what purposes.

– Build a community within XIWT so it can be evolved to address harder issues.

• Selected our tools to initially deploy at 6 sites

– includes Intel, SBC, HAI, BellSouth, CNRI, NIST

Coordination etc.

• XIWT/IPWT Interest/deployment

• MICS funded joint SLAC/LBL proposal on Internet End-to-end performance monitoring for 1 year

• LBL/NIMI project

NIMI (1)

• NIMI = National Internet Measurement Infrastructure, a collaboration of LBL/PSC (V. Paxson, M. Mathis, J. Mahdavi).

• It is a software suite (not hardware). Deploy on “measurement hosts” around the Internet for black box infrastructure measurements.

• Ready for deployment Nov-97. Perl daemon with treno, Poisson packet generation for loss & delays.

• Hooks for other tools such as pathchar, tcpanaly.

NIMI (2)

• Challenges: accurate clock synchronization (one way measurements), scaling to millions of nimids (nb end-to-end measurement strategies are usually not cost free, some things may be over-measured), data retrieval, new measurement strategies.

• There is no central management.

• Both HEPNRC & SLAC plan to install NIMI hosts (PCs running FreeBSD) at their sites

Coordination etc.

• XIWT/IPWT interest/deployment

• MICS funded joint SLAC/LBL proposal on Internet End-to-end performance monitoring for 1 year

• LBL/NIMI project

• Proposed joint work with NLANR to extend Mapnet Java tools to view our data

NLANR Mapnet Tool

• Java Applet

• Zoom & pan

• Select ISPs

• Color:

– ISP

– bandwidth

• Mouse over

– link details

– node details

Maproute (from NLANR)

• Shapes show function

– router at NAP, at transit backbone, at ISP

• Color shows variance of transit time

• Meshes of paths to destination show flaps

• Can zoom in to get site information etc.

Coordination etc.

• XIWT/IPWT interest/deployment

• MICS funded joint SLAC/LBL proposal on Internet End-to-end performance monitoring for 1 year

• LBL/NIMI project

• Proposed joint work with NLANR to extend Mapnet Java tools to view our data

• Will submit paper to IETF for this December

• Surveyor installation proposed at ESnet sites

Surveyor

• PC Hardware with GPS located at ANS & 23 CSG partner sites

• Measure one way loss & response time using clock synchronization, metrics defined by IETF/IPPM

• 8 sites now operational, monitor 56 paths ((N-1)*N)

• Results show can have big asymmetries (asymmetric loading & routing)

• Willing to deploy (at their cost) at 5 DOE sites

• For more see http://www.advanced.org/csg-ippm/
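The path count follows from measuring each ordered pair of sites separately, since one-way measurements distinguish A to B from B to A. A short sketch (the function name and site labels are placeholders, not Surveyor names):

```python
from itertools import permutations

def one_way_paths(sites):
    """List every ordered (source, destination) pair of distinct sites."""
    return list(permutations(sites, 2))

sites = [f"site{i}" for i in range(1, 9)]   # 8 operational sites, names made up
paths = one_way_paths(sites)
print(len(paths))   # N * (N - 1) = 8 * 7 = 56
```

Because the count grows as N * (N - 1), the 23 partner sites mentioned above would give 23 * 22 = 506 one-way paths if all were measured against each other.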

Asymmetric One-way Delays

[Figure: two panels, “Advanced to U Chicago” and “U Chicago to Advanced”, each showing Loss (0% to 20%) and Delay (0 ms to 300 ms), with an x-axis from 0 to 24]

Next Steps

• Longer term reports (10 week exceptions, 180 days, monthly going back forever)

• Provide monthly summary tables with lots of statistical measures to allow faster generation of long term reports, and more robust metrics

• Extend grouping, e.g. by AS, country, time zones crossed, more geographic regions, user selectable, by experiment, by community, by collection site

• Summaries (c.f. Weather Map, top 10s, weekly, Consumer Reports)

• NIMI/Surveyor install, NLANR tools, help XIWT

Summary

• 12 sites, 6 countries collecting data on > 400 links

• Need care selecting remote sites

• Deployment of data collection went well

• Collection sites easy to maintain after initial install

• Biggest effort at the moment (> 1 FTE) is in:

– Tool definition & development

– Data gathering & archiving (looking after pathologies)

• Gearing up to extend SAS tools and attendant scripts

• Lot of interest & collaboration outside ESnet

To Join

• Collection site needs:

– perl5 & HTTP server

– install timeping & pingdata (need only cgi-bin access, not root)

– Decide on links to monitor

– Get an analysis site to retrieve & generate graphs, or at least get connectivity.pl & ping_data_plot.pl

• Need volunteers to work on analysis scripts, some of which will require SAS; also need Java applets to visualize

More Information

• Monitoring WG home page (includes links to the status report, meeting notes, how to access data, and get & install code etc.)

– http://www.slac.stanford.edu/xorg/icfa/ntf/home.html

• WAN Monitoring at SLAC has lots of links

– http://www.slac.stanford.edu/comp/net/wan-mon.html

• Tutorial on WAN Monitoring

– http://www.slac.stanford.edu/comp/net/wan-mon/tutorial.html