perfSONAR-based Network Research Brian Tierney, ESnet, bl>[email protected] February 10, 2016

Page 1: perfSONAR-based Network Research

perfSONAR-based Network Research

Brian Tierney, ESnet, bl>[email protected], 2016

Page 2: perfSONAR-based Network Research

What is perfSONAR?
• perfSONAR is a tool to:
  – Set (hopefully raise) network performance expectations
  – Find network problems (“soft failures”)
  – Help fix these problems
• All in multi-domain environments
• These problems are all harder when multiple networks are involved
  – Focus on Research and Education (R&E) networking, 1 Gbps links or higher
• perfSONAR provides a standard way to publish active and passive monitoring data
  – This data is interesting to network researchers as well as network operators

Page 3: perfSONAR-based Network Research

Current perfSONAR components
• Measurement tools
  – iperf3, bwctl, owamp, traceroute, paris-traceroute, etc.
• Measurement archive
• Central test mesh management tools
• Host management tools
  – Configure tests, configure NTP, etc.
• Data analysis tools
  – Plot data from the archive
  – Dashboard tools
• Lookup Service

Page 4: perfSONAR-based Network Research

Hard vs. Soft Failures
• “Hard failures” are the kind of problems every organization understands
  – Fiber cut
  – Power failure takes down routers
  – Hardware ceases to function
• Classic monitoring systems are good at alerting on hard failures
  – i.e., the NOC sees something turn red on their screen
  – Engineers are paged by monitoring systems
• “Soft failures” are different and often go undetected
  – Basic connectivity (ping, traceroute, web pages, email) works
  – Performance is just poor
• How much should we care about soft failures?


Page 6: perfSONAR-based Network Research

Main perfSONAR role: Find “Soft Failures”

[Figure: throughput (Gb/s) over one month, showing normal performance, degrading performance, and recovery after repair. Example causes: gradually failing optics; under-powered firewall device.]
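The degradation pattern on this slide suggests one way a soft failure might be caught from regular throughput tests: compare recent results against a longer-term baseline. The following is a minimal sketch, assuming a plain list of throughput samples in Gb/s. The function name, window sizes, and 20% threshold are illustrative assumptions, not part of perfSONAR.

```python
from statistics import median


def is_degraded(samples_gbps, baseline_n=20, recent_n=5, drop_fraction=0.2):
    """Return True if the recent median is well below the baseline median.

    samples_gbps: throughput results in Gb/s, oldest first.
    All parameter names and defaults are hypothetical, for illustration only.
    """
    if len(samples_gbps) < baseline_n + recent_n:
        return False  # not enough history to judge
    # Baseline window: the baseline_n samples just before the recent window.
    baseline = median(samples_gbps[-(baseline_n + recent_n):-recent_n])
    recent = median(samples_gbps[-recent_n:])
    return recent < (1.0 - drop_fraction) * baseline


# Made-up sample data: a steady link and one that slowly loses throughput.
healthy = [9.3] * 25
failing = [9.3] * 20 + [8.0, 7.1, 6.5, 5.9, 5.2]
print(is_degraded(healthy))
print(is_degraded(failing))
```

Medians are used rather than means so that a single bad test does not trigger an alert on its own; a real deployment would likely tune the windows to its test schedule.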

Page 7: perfSONAR-based Network Research

perfSONAR History
• perfSONAR can trace its origin to the Internet2 “End2End performance Initiative” from the year 2000.
• What has changed since 2000?
  – The Good News:
    • TCP is much less fragile; Cubic is the default congestion control algorithm, and autotuning and larger TCP buffers are everywhere
    • Reliable parallel transfers via tools like Globus Online
    • High-performance UDP-based commercial tools like Aspera
    • More good news in the latest Linux kernel, but it will take 3-4 years before this is widely deployed
  – The Bad News:
    • The wizard gap is still large
    • Under-buffered switches and routers are still common
    • Under-powered/misconfigured firewalls are common
    • Soft failures still go undetected for months
    • User performance expectations are still too low

Page 8: perfSONAR-based Network Research

The perfSONAR collaboration
• The perfSONAR collaboration is an Open Source project led by ESnet, Internet2, Indiana University, and GEANT.
  – Each organization has committed 1.5 FTE effort to the project
  – Plus additional help from many others in the community (OSG, RNP, SLAC, and more)
• The perfSONAR Roadmap is influenced by
  – requests on the project issue tracker
  – annual user surveys sent to everyone on the user list
  – regular meetings with VOs using perfSONAR such as the WLCG and OSG
  – discussions at various perfSONAR-related workshops
• Based on the above, every 6-12 months the perfSONAR governance group meets to prioritize features based on:
  – impact to the community
  – level of effort required to implement and support
  – availability of someone with the right skill set for the task

Page 9: perfSONAR-based Network Research

Public perfSONAR Servers (Jan 2016)
• Total of around 1700 publicly registered servers
  – Equal number of non-registered servers?
• ESnet: 70
  – mostly 10G, includes a 40G host in Boston
• GEANT: 22
• Internet2: 3
• Some other top deployments:
  – Onenet (24), AMPATH (8), bc.net (10), RNP (8), Canarie (13), kreonet (14), NERO (12), AARnet (19), JGN (17), CENIC (5), KANREN (5)

© 2016, http://www.perfsonar.net

Page 10: perfSONAR-based Network Research

perfSONAR Hardware
• These days you can get a good 1U host capable of pushing 10 Gbps TCP for around $500 (+10G NIC cost, $750?).
  – See perfSONAR user list
• And you can get a host capable of 1G for around $150!
  – Get a multi-core Intel Celeron-based host
    • ARM is not fast enough
  – e.g.: ZBOX by ZOTAC: https://www.zotac.com/us/product/mini_pcs/zbox-ci323-nano
• VMs are not recommended
  – Tools are more accurate if you can guarantee NIC isolation

Page 11: perfSONAR-based Network Research

perfSONAR 3.5 Update
• perfSONAR 3.5 released October 2015
  – Modernize the GUIs
  – Support for central host management and node auto-configuration
  – Support for Debian, VMs, and other installation options
  – Support for low cost ($150), 1 Gbps nodes

Page 12: perfSONAR-based Network Research

Expanded perfSONAR Use Cases
• Previous Use Case: perfSONAR Toolkit
  – Includes CentOS 6 and all perfSONAR components
• New Use Cases
  – perfSONAR tools only
    • Support for both RHEL-based and Debian-based hosts
  – perfSONAR hosts that are centrally managed
    • Central manager package
    • Testpoint package

Page 13: perfSONAR-based Network Research

perfSONAR for Network Researchers
• Active measurement interesting for network researchers
  – Traceroute data automatically collected along with bwctl/owamp results
  – TCP retransmits as measured by iperf3
• Data easy to download for analysis
  – esmond-ps-get-bulk
    • Output CSV or JSON
    • See: https://pypi.python.org/pypi/esmond_client
• Additional Information at:
  – http://docs.perfsonar.net/client_apis.html
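For readers who prefer to query a measurement archive directly rather than through esmond-ps-get-bulk, the sketch below shows roughly what such a request and its CSV conversion might look like. It is an illustration under stated assumptions: the archive path and the filter names (source, destination, event-type, time-range) follow my reading of the client API documentation linked on this slide and should be checked against it, the host names are hypothetical, and the sample payload is made up rather than real archive output.

```python
import csv
import io
import json
from urllib.parse import urlencode


def build_archive_query(host, source, destination, event_type, time_range_s):
    """Build a metadata query URL for a measurement archive.

    The path and parameter names are assumptions to verify against the
    client API documentation; the host is supplied by the caller.
    """
    params = {
        "source": source,
        "destination": destination,
        "event-type": event_type,
        "time-range": time_range_s,
    }
    return "http://{}/esmond/perfsonar/archive/?{}".format(host, urlencode(params))


def points_to_csv(points):
    """Convert a list of {"ts": ..., "val": ...} records to CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["ts", "val"])
    for point in points:
        writer.writerow([point["ts"], point["val"]])
    return out.getvalue()


# Hypothetical hosts; no network request is made in this sketch.
url = build_archive_query("ps-archive.example.org", "a.example.org",
                          "b.example.org", "throughput", 86400)
print(url)

# Made-up payload standing in for a time-series response.
sample = json.loads('[{"ts": 1454976000, "val": 9300000000.0}]')
print(points_to_csv(sample))
```

Keeping URL construction and parsing separate from the actual HTTP request makes both steps easy to test offline before pointing the script at a real archive.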

Page 14: perfSONAR-based Network Research

perfSONAR on Low Cost Hardware
• Motivation: make perfSONAR affordable enough to deploy on all subnets
• Assumptions:
  – 1 Gbps test nodes
  – Centralized measurement archive
  – Centralized configuration management
  – Debian Linux

Page 15: perfSONAR-based Network Research

Current perfSONAR development
• One of the themes for v3.6 will be “Control and Scalability”
  – perfSONAR is successful because of the ‘default open’ model.
  – BUT, as the number of perfSONAR hosts worldwide grows, we need a way to control
    • Who is running tests
    • How often they are allowed to run tests
    • What hosts can I run tests to? How do I get my host added to someone else’s list of allowed hosts?
• Working on a new test scheduler (psScheduler):
  – Shared by all tests and aware of the resources each uses
  – Containing finer grained controls about who can run tests and what tests they are allowed to run.
  – Increased visibility and control as to when tests will be run

Page 16: perfSONAR-based Network Research

Roadmap for v3.6
• A test scheduler (psScheduler):
  – Shared by all tests and aware of the resources each uses
  – Containing finer grained controls about who can run tests and what tests they are allowed to run.
  – Increased visibility and control as to when tests will be run
• New graphs that allow for easier comparison of multiple metrics
  – based on ESnet Tools team react-based plotting tools
• A web interface for creating test meshes
• Easier selection of endpoints based on topology location, geographic location, accessibility and/or custom searches
• Dashboards that support alerting based on patterns across an entire mesh
• Debian 8 support
• CentOS 7 versions of the tools, testpoint, core, and central management bundles
  – Full CentOS 7 Toolkit will be in the next release
• Pre-packaged perfSONAR VM images

Page 17: perfSONAR-based Network Research

Example perfSONAR Research Projects

Page 18: perfSONAR-based Network Research

perfSONAR Control Plane (PSCP) Project
Prof. Yan Luo ([email protected])

Page 19: perfSONAR-based Network Research

perfSONAR Control Plane (PSCP)
• Objectives
  – Measurement Archive Data Analysis
    • What are the measurement results? What can we learn?
  – Automatic perfSONAR Peer Selection
    • Quickly identify the most suitable PS node(s) on the routes in question
  – Programmable Measurement and Troubleshooting
    • Define measurement tasks and conditions with software
• The Design of perfSONAR Control Plane
  – Path Discovery
  – Measurement Task Control

https://github.com/ACANETS/pscp
Prof. Yan Luo ([email protected])

Page 20: perfSONAR-based Network Research

Operation and Use Case of PSCP
• Obtain traceroute information from 95 perfSONAR Measurement Archives
• Build a traceroute graph based on the 1831 records
• Find a set of perfSONAR node pairs to start bandwidth tests and monitor the results
• Use Case: Diagnostic analysis and troubleshooting of a soft network link failure
  – <= 300 LOC Python code, <= 15 minutes
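The graph-building and pair-selection steps on this slide can be sketched in a few lines of Python. This is an illustration only, not the PSCP code: the record format (a list of hop names per traceroute, source first) and all node names are hypothetical.

```python
from collections import defaultdict


def build_graph(records):
    """Build a directed adjacency map from traceroute hop lists.

    records: iterable of hop lists, each a list of node names, source first.
    """
    graph = defaultdict(set)
    for hops in records:
        for a, b in zip(hops, hops[1:]):
            graph[a].add(b)
    return graph


def pairs_crossing(records, link):
    """Return sorted (source, destination) pairs whose path uses the link.

    link: a (from_node, to_node) tuple.
    """
    return sorted({(hops[0], hops[-1]) for hops in records
                   if link in zip(hops, hops[1:])})


# Made-up traceroute records between hypothetical perfSONAR hosts.
records = [
    ["ps-a", "r1", "r2", "ps-b"],
    ["ps-c", "r1", "r2", "ps-d"],
    ["ps-a", "r1", "r3", "ps-d"],
]
graph = build_graph(records)
print(sorted(graph["r1"]))
print(pairs_crossing(records, ("r1", "r2")))
```

Selecting the host pairs whose paths cross a suspect link gives a natural set of endpoints between which to start bandwidth tests.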

Page 21: perfSONAR-based Network Research

Pythia Network Diagnosis Infrastructure (PuNDIT)

PIs: Shawn McKee ([email protected]) and

Constantine Dovrolis ([email protected])

Page 22: perfSONAR-based Network Research

About PuNDIT
• PuNDIT is an NSF SSI project which uses perfSONAR data to identify and localize network problems (2014-2016)
  – Goal: to automate watching/analyzing perfSONAR metrics
    • inform users/site-admins when there are real network problems they should address.
• See further details at http://pundit.gatech.edu
• User GUI mock-up: http://punditui.aglt2.org/

Page 23: perfSONAR-based Network Research

PuNDIT Architecture
• perfSONAR provides the base measurement infrastructure
• Collects network metrics like latency, loss and reordering
• Collects topological information
• Adds scamper support to perfSONAR: Multipath Detection Algorithm (MDA) from the paris-traceroute team to handle load balanced paths
• A lightweight PuNDIT process on each host performs detection
• The central server holds the event repository and runs a localization algorithm
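The slides do not describe PuNDIT's localization algorithm, so the sketch below is not it. It only illustrates the general idea of localization with a simple heuristic: a link is suspect if it lies on every path where a problem was detected and on no path that looks healthy. The data layout and all names are hypothetical.

```python
def suspect_links(paths, bad_pairs):
    """Return links on every problematic path and on no healthy path.

    paths: dict mapping (src, dst) -> list of link names on that path.
    bad_pairs: set of (src, dst) pairs where a problem was detected.
    This is a generic heuristic for illustration, not PuNDIT's algorithm.
    """
    bad = [set(links) for pair, links in paths.items() if pair in bad_pairs]
    if not bad:
        return set()
    good_links = set()
    for pair, links in paths.items():
        if pair not in bad_pairs:
            good_links.update(links)
    return set.intersection(*bad) - good_links


# Made-up topology: two degraded paths share r1-r2 and r2-b,
# but a healthy path also uses r2-b, which clears that link.
paths = {
    ("a", "b"): ["a-r1", "r1-r2", "r2-b"],
    ("c", "b"): ["c-r1", "r1-r2", "r2-b"],
    ("a", "d"): ["a-r1", "r1-r3", "r3-d"],
    ("e", "b"): ["e-r2", "r2-b"],
}
print(suspect_links(paths, {("a", "b"), ("c", "b")}))
```

The heuristic assumes a single faulty link and stable, known paths; load-balanced paths, which the MDA support mentioned above is meant to handle, would need the full set of possible links per path.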

Page 24: perfSONAR-based Network Research

PuNDIT Details

[Figure: four-stage PuNDIT pipeline: Gather and Analyze Network Topologies; Collect Network Metrics; Detect Problem Signatures; Localize Problematic Links]

Page 25: perfSONAR-based Network Research

Email Lists and Reference Materials

Page 26: perfSONAR-based Network Research

Active and Growing perfSONAR Community
• Active email lists and forums provide:
  – Instant access to advice and expertise from the community.
  – Ability to share metrics, experience and findings with others to help debug issues on a global scale.
• Joining the community automatically increases the reach and power of perfSONAR
  – More endpoints means many more ways to test, discover issues, and compare metrics (the number of testable host pairs grows quadratically with the number of endpoints)

Page 27: perfSONAR-based Network Research

perfSONAR Community
• The perfSONAR collaboration is working to build a strong user community to support the use and development of the software.
• perfSONAR Mailing Lists
  – Announcement Lists:
    • https://mail.internet2.edu/wws/subrequest/perfsonar-announce
  – Users List:
    • https://mail.internet2.edu/wws/subrequest/perfsonar-users

Page 28: perfSONAR-based Network Research

Useful URLs
• http://docs.perfsonar.net/
• http://www.perfsonar.net/
• http://fasterdata.es.net/
  – http://fasterdata.es.net/performance-testing/network-troubleshooting-tools/
• https://github.com/perfsonar
  – https://github.com/perfsonar/project/wiki

Page 29: perfSONAR-based Network Research

Extra Slides

Page 30: perfSONAR-based Network Research

bwctl features
• BWCTL lets you run any of the following between any 2 perfSONAR nodes:
  – iperf3, iperf, nuttcp, ping, owping, traceroute, and tracepath
• Sample Commands:
    • bwctl -c psmsu02.aglt2.org -s elpa-pt1.es.net -T iperf3
    • bwping -s atla-pt1.es.net -c ga-pt1.es.net
    • bwping -E -c www.google.com
    • bwtraceroute -T tracepath -c lbl-pt1.es.net -l 8192 -s atla-pt1.es.net
    • bwping -T owamp -s atla-pt1.es.net -c ga-pt1.es.net -N 1000 -i .01

Page 31: perfSONAR-based Network Research

A small amount of packet loss makes a huge difference in TCP performance

[Figure: TCP performance versus path distance (Local (LAN), Metro Area, Regional, Continental, International), with series Measured (TCP Reno), Measured (HTCP), Theoretical (TCP Reno), and Measured (no loss)]

With loss, high performance beyond metro distances is essentially impossible
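The slide does not say which model produced its "Theoretical (TCP Reno)" curve. A common one is the Mathis et al. approximation, which bounds steady-state throughput by (MSS / RTT) × (C / √p), where p is the packet loss rate and C ≈ √(3/2). The sketch below evaluates that bound. The round-trip times per distance class and the loss rate are illustrative assumptions, not values taken from the slide.

```python
import math


def mathis_throughput_bps(mss_bytes, rtt_s, loss_rate, c=math.sqrt(1.5)):
    """Upper bound on TCP Reno throughput (bits/s) per the Mathis model."""
    if rtt_s <= 0 or not 0 < loss_rate <= 1:
        raise ValueError("rtt_s must be > 0 and loss_rate in (0, 1]")
    return (mss_bytes * 8 / rtt_s) * (c / math.sqrt(loss_rate))


# Assumed round-trip times per distance class (seconds); illustrative only.
rtts = {
    "Local (LAN)": 0.001,
    "Metro Area": 0.005,
    "Regional": 0.020,
    "Continental": 0.080,
    "International": 0.150,
}
loss = 1e-4  # assumed loss rate: one packet in 10,000
for name, rtt in rtts.items():
    mbps = mathis_throughput_bps(1460, rtt, loss) / 1e6
    print("{:<14} {:>10.1f} Mb/s".format(name, mbps))
```

Because the bound scales with 1/RTT, the same loss rate that is barely noticeable on a LAN caps a long-distance path at a small fraction of a 10 Gb/s link, which is the point the slide makes.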

Page 32: perfSONAR-based Network Research

Improved Support for Central Management
• Goals:
  – Make it easy to incorporate perfSONAR hosts into existing host management systems (puppet, chef, SaltStack, cfengine, etc.)
    • Include sample puppet config files
  – Make it easy to manage many perfSONAR hosts at a single institution
  – New rpm and debian bundles to support this

Page 33: perfSONAR-based Network Research

New perfSONAR Installation options
• In addition to the traditional “Toolkit” install, you now have these additional options:
  – perfSONAR-Tools:
    • iperf3, bwctl, owamp, nuttcp, etc.
    • Install this on DTNs, etc. to help with troubleshooting
    • Does not support scheduled testing
    • CentOS and Debian support
  – perfSONAR-TestPoint:
    • tools plus Lookup Service registration and ‘mesh agent’
    • For use in environments with a central measurement archive
    • For use on low end/older hardware (e.g.: $100 nodes)
    • Supports scheduled testing
    • CentOS and Debian support
• See: http://docs.perfsonar.net/install_options.html

Page 34: perfSONAR-based Network Research

New perfSONAR Installation options (cont.)
• perfSONAR-Core:
  – Includes everything except the web interface
  – Use this in environments where your site sysadmins want to fully manage the host configuration, but don’t want to set up a central measurement archive
  – CentOS only
• perfSONAR-CentralManagement:
  – Includes measurement archive, test mesh manager, dashboard
  – Use this to manage a collection of perfSONAR hosts at your site/campus
  – CentOS only

Page 35: perfSONAR-based Network Research

New perfSONAR Installation options (cont.)
• perfSONAR-Complete
  – All perfSONAR packages
  – Use this in environments where your sysadmins want to manage the install, but still use the toolkit web interface, system settings, etc.
    • the toolkit install will override certain changes on every update.
  – CentOS only
• Other packages to note:
  – Separate rpms/debs for iptables config, sysctl config, and ntp packages so you can add them on top of perfSONAR-Core as desired.
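The installation options on the last three slides can be summarized as a small lookup table. The sketch below encodes only what the slides state (operating system support and a one-line summary per bundle); the dictionary layout and the helper function are illustrative assumptions, not part of any perfSONAR tooling.

```python
# Bundle facts as stated on the slides; the structure itself is hypothetical.
BUNDLES = {
    "perfSONAR-Tools": {
        "os": {"CentOS", "Debian"},
        "summary": "measurement tools only, no scheduled testing",
    },
    "perfSONAR-TestPoint": {
        "os": {"CentOS", "Debian"},
        "summary": "tools plus Lookup Service registration and mesh agent",
    },
    "perfSONAR-Core": {
        "os": {"CentOS"},
        "summary": "everything except the web interface",
    },
    "perfSONAR-CentralManagement": {
        "os": {"CentOS"},
        "summary": "measurement archive, test mesh manager, dashboard",
    },
    "perfSONAR-Complete": {
        "os": {"CentOS"},
        "summary": "all perfSONAR packages",
    },
}


def bundles_for(os_name):
    """Return the sorted names of bundles the slides list for this OS."""
    return sorted(name for name, info in BUNDLES.items()
                  if os_name in info["os"])


for name in bundles_for("Debian"):
    print(name, "-", BUNDLES[name]["summary"])
```

Calling `bundles_for("CentOS")` returns all five bundles, while `bundles_for("Debian")` returns only the Tools and TestPoint bundles, matching the support notes on the slides.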