DICE: Performance Update Eric L. Boyd (Internet2) Joe Metzger (ESnet) Nicolas Simar (G2 – JRA1)

Page 1

DICE: Performance Update

Eric L. Boyd (Internet2)

Joe Metzger (ESnet)

Nicolas Simar (G2 – JRA1)

Page 2

Vision: Performance Information is …

• Available
  – People can find it (Discovery)
  – “Community of trust” allows access across administrative domain boundaries (AA)
• Ubiquitous
  – Widely deployed (Paths of interest covered)
  – Reliable (Consistently configured correctly)
• Valuable
  – Actionable (Analysis suggests course of action)
  – Automatable (Applications act on data)

Page 3

Getting There: Build & Empower the Community

Decouple the Problem Space:
• Analysis and Visualization
• Performance Data Sharing
• Performance Data Generation

Grow the Footprint:
• Clean APIs and protocols between each layer
• Widespread deployment of measurement infrastructure
• Widespread deployment of common performance measurement tools

[Diagram: layered stack of Analysis & Visualization, Measurement Infrastructure, and Performance Tools, with an API between adjacent layers]

Page 4

perfSONAR Credits

• perfSONAR is a joint effort:
  – ESnet
  – Fermilab
  – GÉANT2 JRA1
  – Internet2
  – RNP
• Internet2 includes:
  – University of Delaware
  – Georgia Tech
  – Internet2 staff
• GÉANT2 JRA1 includes:
  – Arnes
  – Belnet
  – Carnet
  – Cesnet
  – DANTE
  – DFN
  – FCCN
  – GRNet
  – GARR
  – ISTF
  – PSNC
  – Nordunet (Uninett)
  – Renater
  – RedIRIS
  – Surfnet
  – SWITCH

Page 5

perfSONAR: Project Activity Meter

• Interactions
  – 1-2 conf calls/week
  – 1 new service/month (accelerating)
  – 3-4 development workshops/year
  – 3-4 paper submissions/year
• Recruitment
  – RNP has joined the effort
  – Outreach to LHC community
  – GaTech beginning six-month commitment

Page 6

perfSONAR: Services (1)

• Measurement Point Service
  – Enables the initiation of performance tests
• Measurement Archive Service
  – Stores performance monitoring results
• Lookup Service
  – Allows the client to discover the existing services and other LS services.
  – Dynamic: services register themselves with the LS and advertise their capabilities; they can also leave, or be removed if a service goes down.
• AuthN/Z Service
  – Internet2 MAT, GN2-JRA5 (eduGAIN)
  – Authorization functionality for the framework
  – Users can have several roles; authorisation is based on the user's role.
  – Trust relationships defined between users affiliated with different administrative domains.

Page 7

perfSONAR Services (2)

• Transformation Service

– Transform the data (aggregation, concatenation, correlation, translation, etc).

• Topology Service

– Make the network topology information available to the framework.

– Find the closest MP, provide topology information for visualisation tools

• Resource protector

– Arbitrate the consumption of limited resources between multiple services.

Page 8

Types of perfSONAR Services

• Core Services
  – Set released by perfSONAR Team
    • e.g. LS, AA, 3 MPs, 2 MAs, RP, ToS, TS
  – Tested for interoperability
  – Serve as examples for affiliated developers
  – Targeted at next generation network needs (e.g. GÉANT2, Internet2 New Network, etc.)
• Affiliated Services
  – Released by perfSONAR partners, lag Core
  – May share development infrastructure (Bugzilla, Website, Mailing Lists)
  – Candidates for migration to Core Services
• Unaffiliated Services

Page 9

perfSONAR: Core Status Update

• Production release of core services package v1.0 ready (pending licensing completion)
• Core services include:
  – Single domain LS solution (PSNC)
  – RRD MA (PSNC)
• Affiliate services and client applications supporting this version will soon follow:
  – BWCTL MP (DFN)
  – perfSONAR UI (ISTF)
• Ongoing work
  – AA Design (Internet2, JRA1, JRA5)
  – Multi-LS (PSNC, RNP, UDel)
  – ToS (DFN, UDel)

Page 10

perfSONAR Process Status Update

• We have processes … ;-)
• Release management process implemented (Internet2, RedIRIS, UDel)
• Bugzilla up and running (UDel)
• Migrated from CVS to SVN (Internet2)
• Functional testing under construction (GRNet)
• Monitoring deployed services with Tomcat (ISTF)
• Installation process eased significantly (DANTE, PSNC, UDel)
• www.perfsonar.net under development (Internet2, Renater)
  – Development information will stay on the Wiki
  – Adopter information will migrate to website

Page 11

perfSONAR: Affiliate Status Update

• Affiliated Services
  – Command Line Interface MP (Ping, OWAMP, Traceroute) (RNP, released)
  – BWCTL MP (DFN, released)
  – SQL MA (PSNC, released)
  – L2-specific MA (DANTE)
  – SSH MP (Looking Glass) (Belnet, released)
  – ABW MP (bandwidth packet capture cards) (Cesnet)
  – NMS MP (SDH status) (DANTE)
  – Hades MA (OWD, Jitter, OWPL) (DFN)
  – Flow Replicator MA (Surfnet, Carnet)
• User Interfaces
  – CNM (DFN)
  – perfSONAR UI (ISTF)
  – Visual PerfSONAR (Carnet)
  – Looking Glass (Belnet)
  – ICE/NeTraMet (RNP)

Page 12

What You See Is What You Get

• perfsonarUI
  – Retrieval of published data
    • RRD MA
    • Hades MA
  – Visualisation of OWD, IPDV and packet loss between Hades MPs
  – Parsing of arbitrary IPv4 or IPv6 traceroute commands
• CNM – map based
  – GEANT2 + NRENs maps
• VisualperfSONAR
• Looking Glass

Page 13

RRD MA features

• Wrapper around RRD tool.
• Request/reply interface.
• Write into RRD.
• LS registration.
• Installation scripts.
• Test configuration files available.

Page 14

Lookup Service Features

• Centralized LS (creating a distributed LS is ongoing development)
• Service registration (including updates) functionality
• Service deregistration functionality
• Lookup/query functionality (XQuery/XPath)
• Service keep-alives
  – including database cleanup, scheduled functionality
• Registration component for a service available.
• Installation scripts.
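
The registration, deregistration, keep-alive, and cleanup features listed above can be illustrated with a minimal in-memory sketch. It is not the perfSONAR LS implementation: the real service answers XQuery/XPath queries, whereas this toy matches on a plain capability string. The class name `LookupRegistry`, its methods, and the TTL parameter are all hypothetical.

```python
import time
from typing import Callable, Dict, List


class LookupRegistry:
    """Toy stand-in for a centralized lookup service (hypothetical API)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._services: Dict[str, dict] = {}

    def register(self, service_id: str, capabilities: List[str]) -> None:
        """Register a service, or update its capabilities if already known."""
        self._services[service_id] = {
            "capabilities": list(capabilities),
            "last_seen": self._clock(),
        }

    def keepalive(self, service_id: str) -> bool:
        """Refresh a registration; return False if the service is unknown."""
        entry = self._services.get(service_id)
        if entry is None:
            return False
        entry["last_seen"] = self._clock()
        return True

    def deregister(self, service_id: str) -> bool:
        """Remove a service; return False if it was not registered."""
        return self._services.pop(service_id, None) is not None

    def cleanup(self) -> List[str]:
        """Drop registrations whose last keep-alive is older than the TTL."""
        now = self._clock()
        expired = sorted(sid for sid, entry in self._services.items()
                         if now - entry["last_seen"] > self._ttl)
        for sid in expired:
            del self._services[sid]
        return expired

    def lookup(self, capability: str) -> List[str]:
        """Return the ids of services advertising the given capability."""
        return sorted(sid for sid, entry in self._services.items()
                      if capability in entry["capabilities"])
```

In a deployment, `cleanup()` would run on a schedule so that services which stop sending keep-alives disappear from lookup results.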

Page 15

RRD MA deployment Status

Page 16

MA deployment over time

[Chart: "Numbers of MAs deployed". x-axis: time in months, Sep-05 to Jun-06; y-axis: number of MAs deployed, 0 to 14; series: new MAs deployed and total MAs deployed]

Page 17

PerfSONAR Next steps

• Formal partnership
  – License, Partnership Agreement
  – Interim solution
• Upgrade existing user base (currently using prototype)
• Data exchange policy (measurement peering agreement)
• Consistent offer of services.
  – What services package to suggest to networks.
• L2 status monitoring.

Page 18

ESnet

Joe Metzger

Page 19

GÉANT2 JRA1

Page 20

Last months (Jan – May) - 1

• Services
  – Lookup Service
    • Centralized
    • Registration / deregistration
    • Lookup query
    • Result code
  – SQL MA
    • Stores data in relational database
    • Supports
      – Utilization
      – L2 status
      – Result code.
  – HADES MA
    • Provides access to the data archive of Hades measurements from GEANT2 network

Page 21

Last months - 2

• Tools integration
  – Telnet / SSH MP
    • On-demand requests for device specific information
    • Cisco/Juniper/Quagga support
    • Resource protection mechanisms
      – To verify the parameters sent in the commands
      – To prevent flood of requests
  – BWCTL / OWAMP MP
    • BWCTL
      – TCP throughput measurements
    • OWAMP
      – OWD, PL measurements
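
The two resource protection mechanisms named for the Telnet / SSH MP (verifying command parameters and preventing request floods) can be sketched as follows. This is an illustration under assumptions, not the MP's actual code: the allowed command set, the argument pattern, and the sliding-window limits are all hypothetical.

```python
import re
import time
from collections import deque
from typing import Callable, Deque, Dict

# Hypothetical allow-list: command name -> pattern its single argument must match.
# The first character may not be '-', so an argument cannot smuggle in an option.
_HOST = re.compile(r"[A-Za-z0-9:][A-Za-z0-9.:\-]{0,252}")
ALLOWED_COMMANDS = {"ping": _HOST, "traceroute": _HOST}


def validate(command: str, argument: str) -> bool:
    """Accept only allow-listed commands whose argument fully matches."""
    pattern = ALLOWED_COMMANDS.get(command)
    return pattern is not None and pattern.fullmatch(argument) is not None


class RateLimiter:
    """Sliding-window limiter: at most max_requests per window per client."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._history: Dict[str, Deque[float]] = {}

    def allow(self, client: str) -> bool:
        now = self._clock()
        stamps = self._history.setdefault(client, deque())
        # Forget requests that have left the window.
        while stamps and now - stamps[0] >= self._window:
            stamps.popleft()
        if len(stamps) >= self._max:
            return False
        stamps.append(now)
        return True
```

A request would be forwarded to the router only if both `validate(...)` and `limiter.allow(client)` return True.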

Page 22

Last months - 3

• Tools integration
  – Passive
    • ABW
      – Counts the number of captured packets and bytes and computes used bandwidth
      – Short timescale intervals
    • Tracefile Capture Measurement Point (TCMP)
      – Used for capturing packets of selected flows using either regular Eth cards or special DAG or COMBO6 cards
    • SNMP MP
      – Web Service access to the usage of SNMP
      – Get for now and OID discovery
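
The ABW description (count captured packets and bytes, then compute used bandwidth over short intervals) amounts to simple binning arithmetic. The sketch below shows that calculation on a list of (timestamp, length) pairs; the function name and input format are assumptions for illustration and say nothing about how ABW itself reads from capture cards.

```python
from typing import Iterable, List, Tuple


def used_bandwidth(packets: Iterable[Tuple[float, int]],
                   interval_s: float) -> List[Tuple[int, int, float]]:
    """Bin packets into fixed intervals starting at the first timestamp.

    packets: (timestamp in seconds, captured length in bytes) pairs.
    Returns one (packet_count, byte_count, bits_per_second) tuple per
    interval, including empty intervals between the first and last packet.
    """
    ordered = sorted(packets)
    if not ordered:
        return []
    t0 = ordered[0][0]
    n_bins = int((ordered[-1][0] - t0) // interval_s) + 1
    counts = [0] * n_bins
    sizes = [0] * n_bins
    for ts, length in ordered:
        idx = int((ts - t0) // interval_s)
        counts[idx] += 1
        sizes[idx] += length
    # Used bandwidth is bytes * 8 bits divided by the interval length.
    return [(c, b, b * 8 / interval_s) for c, b in zip(counts, sizes)]
```

For example, 1500 bytes seen in a 0.5 s interval corresponds to 1500 × 8 / 0.5 = 24000 bit/s.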

Page 23

Last months - 4

• Alcatel NMS MP
  – Web Service access to SDH and WDM monitoring parameters such as SES, ES, UAS and also the G.709 metric BBE
  – Acts as a reference implementation
  – Can be used by other NRENs in order to build perfSONAR compliant services which can retrieve data from NMS
• AA
  – Designing and developing a perfSONAR AA service making use of JRA5's eduGAIN
• Topology Service
  – Common schema with SA3

Page 24

What You See Is What You Get

• perfsonarUI
  – Retrieval of published data
    • RRD MA
    • Hades MA
  – Visualisation of OWD, IPDV and packet loss between Hades MPs
  – Parsing of arbitrary IPv4 or IPv6 traceroute commands
• CNM – map based
  – GEANT2 + NRENs maps
• VisualperfSONAR
• Looking Glass

Page 25

Meet the NRENs sessions

• Powerful tools and useful information
• Design (MA’s, MP’s… approach is good)
• The number of deployed services is high
• Friendly user interfaces
• Tools bring a motivation for installing services for attendees
• Sharing of info between projects is useful

• Need to integrate the tools in a single visualisation application.
• There are too few network nodes running the services
• Not enough data available
• Not enough information available about perfSONAR
• Would like to have libraries/APIs
• Requirement for having its network perfSONAR enabled.

Page 26

Next dissemination workshops

• SEEREN2 workshop in Heraklion.
• E2E service status services deployment for NRENs next week in Muenchen.
• Three more installation workshops planned over the next 12 months.

Page 27

Data Exchange for E2E Monitoring – Archive scenario

• NREN in charge of retrieving the data from the NMS/DB to analyse them and pass the information to a Java class.
  – About 700-1000 lines of code for GÉANT – 15 days.
• JRA1
  – Provides the “mySQL MA service” code
  – Maintains it.
  – Provides the script to write into the DB
• JRA4 in charge of the E2E NOC visualisation.

Page 28

Year 3 Objectives

• Improving the visualisation and tools features (NOC, PERT, project)
• Integration of AA.
• Services deployments.
• Going operational (with SA3 WI15).
• Mastering the amount of data.
• L1-L2
• Dissemination workshops for NRENs.

Page 29

Timeline

Page 30

Phase III

• End of November 2006
• Going operational (SA3 WI-15):
  – RRD MA
  – LS (plus LS registration for the other services)
  – SNMP MP
  – perfsonarUI
  – CNM
  – Hades and RIPE TTM MA
  – BWCTL MP.
  – L2 status MP.
• Novelties
  – Netflow Integration
  – Topology Service
  – VisualperfSONAR
  – Multi-LS
  – Push interface

Page 31

Phase IV

• End of May 2007
• Going operational (SA3 WI-15):
  – Multi-LS
  – Topology Service
  – Hades MP
  – BWCTL MA
  – VisualperfSONAR
• Novelties
  – First set of services using JRA5 Authentication with some Authorization.
  – Performance anomaly detection

Page 32

Phase V

• End of November 2007
• Going operational (SA3 WI-15):
  – Authentication Service

Page 33

Internet2

Eric Boyd

Page 34

Vision: Performance Information is …

• Available
  – People can find it (Discovery)
  – “Community of trust” allows access across administrative domain boundaries (AA)
• Ubiquitous
  – Widely deployed (Paths of interest covered)
  – Reliable (Consistently configured correctly)
• Valuable
  – Actionable (Analysis suggests course of action)
  – Automatable (Applications act on data)

Page 35

Getting There: Build & Empower the Community

Decouple the Problem Space:
• Analysis and Visualization
• Performance Data Sharing
• Performance Data Generation

Grow the Footprint:
• Clean APIs and protocols between each layer
• Widespread deployment of measurement infrastructure
• Widespread deployment of common performance measurement tools

[Diagram: layered stack of Analysis & Visualization, Measurement Infrastructure, and Performance Tools, with an API between adjacent layers]

Page 36

Result: No more mystery …

• Increase network awareness
  – Set user expectations accurately
• Reduce diagnostic costs
  – Performance problems noticed early
  – Performance problems addressed efficiently
  – Network engineers can see & act outside their turf
• Transform application design
  – Incorporate network intuition into application behavior

Page 37

Immediate Game-plan:

• Internet2 is leveraged to help provide diagnostic information for “backbone” portion of problem
  – Create *some* diagnostic tools
  – Make Abilene data as public as is reasonable
• Work on efforts to more widely make performance data available (perfSONAR)
  – Contribute to ‘base’ development
  – Integrate ‘our’ diagnostic tools as ‘good’ example MP/MA services

Page 38

BWCTL (Bandwidth Controller)

• What is it?
  – A resource allocation and scheduling daemon for arbitration of iperf tests
• Typical Solution
  – Run “iperf” or similar tool on two endpoints and hosts on intermediate paths
• Typical road blocks
  – Need permissions on all systems involved
  – Need to coordinate testing with others
  – Need to run software on both sides with specified test parameters
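
The "scheduling daemon for arbitration" idea can be sketched as a slot allocator that never lets two throughput tests overlap on one host. This is only an illustration of the arbitration concept under stated assumptions: the class `IperfSlotScheduler` and its `reserve` method are hypothetical, and the real bwctld additionally negotiates a common slot between the daemons on both endpoints and applies policy limits.

```python
from typing import List, Tuple


class IperfSlotScheduler:
    """Grants non-overlapping time slots for tests on a single host."""

    def __init__(self) -> None:
        # (start, end, requester) tuples, kept sorted by start time.
        self._reservations: List[Tuple[float, float, str]] = []

    def reserve(self, requester: str, earliest_start: float,
                duration: float) -> float:
        """Return the start of the first free slot at or after earliest_start."""
        start = earliest_start
        for r_start, r_end, _ in self._reservations:
            if start + duration <= r_start:
                break  # the test fits in the gap before this reservation
            if start < r_end:
                start = r_end  # overlaps: push the test past this reservation
        self._reservations.append((start, start + duration, requester))
        self._reservations.sort()
        return start
```

A client asking for a 10-second test "as soon as possible" simply receives the granted start time and begins its iperf run then.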

Page 39

BWCTL: 3-Party Flow Diagram

[Diagram: a bwctl client and two bwctld hosts. On each host: bwctld resource broker (master daemon), bwctld request broker, bwctld peer agent, and iperf test process.]

Page 40

OWAMP: One-Way Active Measurement Protocol

• What is it?
  – Measures one-way latency: 1-way ping
  – Control connection used to broker test request based upon policy restrictions and available resources. (Bandwidth/disk limits)
• Specification
  – http://tools.ietf.org/wg/ippm/draft-ietf-ippm-owdp/draft-ietf-ippm-owdp-14.txt
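
To make "1-way ping" concrete, the sketch below derives one-way delay and loss from per-packet send and receive timestamps. It is a simplified illustration, not the OWAMP protocol: the function name and the dictionary input format are assumptions, and the result is only meaningful when sender and receiver clocks are synchronized.

```python
import statistics
from typing import Dict, Optional


def one_way_stats(sent: Dict[int, float],
                  received: Dict[int, float]) -> Dict[str, Optional[float]]:
    """Summarize one-way delay (seconds) and loss for a test stream.

    sent:     sequence number -> send timestamp on the sender's clock
    received: sequence number -> receive timestamp on the receiver's clock
    """
    if not sent:
        raise ValueError("no packets were sent")
    delays = sorted(received[seq] - sent[seq] for seq in sent if seq in received)
    loss = (len(sent) - len(delays)) / len(sent)
    if not delays:
        return {"min": None, "median": None, "max": None, "loss": loss}
    return {
        "min": delays[0],
        "median": statistics.median(delays),
        "max": delays[-1],
        "loss": loss,
    }
```

Unlike a round-trip ping, this separates the two directions of a path, so an asymmetric problem shows up only in the affected direction.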

Page 41

OWAMP Flow Diagram

[Diagram: client side with an owping client (control) and an OWD test endpoint; server side with owampd (resource broker), owampd (control), and an OWD test endpoint]

Page 42

Thrulay Overview

• Network capacity and delay tester
• Same class of tools as iperf, netperf, nettest, nuttcp, ttcp, etc.
• Unique features not found in other tools:
  – TCP: measures round-trip delay along with goodput
  – UDP: measures:
    • One-way delay, with quantiles
    • Packet loss
    • Packet duplication
    • Reordering
  – UDP: ability to send precisely positioned true Poisson streams (microsecond errors in sending times)
  – Human and machine-readable output (ready to be fed to gnuplot)
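
A Poisson stream is one whose gaps between consecutive packets are independent and exponentially distributed. The sketch below generates such a send schedule; it illustrates the statistical idea only and is not thrulay's code. The function name and parameters are assumptions, and actually hitting the scheduled times with microsecond accuracy is a separate problem this example does not address.

```python
import random
from typing import List, Optional


def poisson_send_times(rate_pps: float, duration_s: float,
                       seed: Optional[int] = None) -> List[float]:
    """Return send offsets (seconds) for a Poisson stream.

    rate_pps:   mean packets per second
    duration_s: length of the test; all offsets fall in [0, duration_s)
    """
    rng = random.Random(seed)
    times: List[float] = []
    # Inter-packet gaps are exponential with mean 1 / rate_pps.
    t = rng.expovariate(rate_pps)
    while t < duration_s:
        times.append(t)
        t += rng.expovariate(rate_pps)
    return times
```

Over a long test the number of packets is close to `rate_pps * duration_s`, but the individual gaps vary, which avoids synchronizing the probes with periodic behaviour in the network.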

Page 43

Thrulay Update

• New release v0.8
• Tests with multiple TCP streams
• Set DSCP (a.k.a. first 6 bits of the TOS byte)
• Report MTU and/or MSS (whichever the OS makes available)
• More UDP statistics: duplication, reordering, quantiles of delay
• SPARC/Solaris support
• Mac OS X support
• IPv6 support
• Non-busy-waiting UDP mode (less precise, but can run more concurrent tests)
• Documentation: manual pages have been added
• Basic client authorization based on IP address
• Integration of TSC timekeeping projects for faster and more precise timestamping
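
Because the DSCP occupies the first (most significant) six bits of the TOS byte, setting it on a socket means shifting the code point left by two bits. The sketch below shows that arithmetic with Python's standard `socket` module; the helper names are hypothetical, and the `IP_TOS` socket option is platform-dependent, so `apply_dscp` is an assumption that holds on typical Linux and macOS hosts for IPv4 sockets.

```python
import socket


def dscp_to_tos(dscp: int) -> int:
    """Place a 6-bit DSCP value in the upper six bits of the TOS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0..63")
    return dscp << 2


def tos_to_dscp(tos: int) -> int:
    """Extract the DSCP from a TOS byte, discarding the two low-order bits."""
    return (tos >> 2) & 0x3F


def apply_dscp(sock: socket.socket, dscp: int) -> None:
    """Mark outgoing IPv4 packets on sock (where the OS offers IP_TOS)."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
```

For example, DSCP 46 (Expedited Forwarding) corresponds to a TOS byte of 184 (0xB8).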

Page 44

NDT: Network Diagnostic Tool

• Web100 enhanced server handles testing and diagnostic services
• Java based and command line clients allow testing from any client (local or remote)
• Performance and configuration faults reported back to client
• Drill-down functions provide more details & error reporting capabilities
• Grant from NIH/NLM to explore duplex mismatch detection

Page 45

NDT Flow Diagram

[Diagram: components are a client (web browser, Java applet), a well-known NDT server, and an NDT server (web server, testing engine, spawned child test engine). Labelled flows: web request, redirect message, web page request, web page response, test request, control channel, specific test channels.]

Page 46

Bulk Transport

• Build a library / tool for bulk transport that does not require kernel level modifications yet achieves the performance of such
• VFER library
  – Congestion control hooks
  – Implements loss-based congestion control
  – Working on delay-based version
• File transfer utility
  – An initial version demoed

Everything we work on is available

• Tools are open source, supported, well-documented• BWCTL/Iperf, OWAMP, NDT are deployed across Abilene

backbone and at many partners• You can:

– See ongoing measurement results at the Abilene Observatory

– Test to/from the Abilene backbone

Page 48

Network Performance Measurement Workshops

Goals:
– Grow installed base of BWCTL/Iperf, OWAMP, and NDT at GigaPoP and regional campuses.
  • http://e2epi.internet2.edu/pipes/pmp/pmp-dir.html
– Begin integration into IT support processes.
– Create an installed base for perfSONAR deployment.
– Give each participant tool-specific cookbooks.
– Example Course Materials:
  • http://e2epi.internet2.edu/npw/presentations.html

Page 49

Network Performance Measurement Workshop Locations and Dates

• Completed
  – SOX / GaTech (03/05)
  – CENIC / UCLA (06/05)
  – JT – Vancouver (07/05)
  – OARNet / OSU (09/05)
  – MAGPI / FMM (09/05)
  – MAX / College Park (12/05)
  – APAN (01/06)
  – JT - Albuquerque (02/06)
  – MERIT (02/06)
  – Columbia / NYSERNet (04/06)
  – University of Virginia (04/06)
• Planned
  – Wisconsin (07/06)
• Under Consideration
  – Alaska, …