NMC-WG Session 1

March 16th 2010, OGF 28
Jeff Boote – Internet2
Martin Swany – University of Delaware


OGF IPR

• “I acknowledge that participation in this meeting is subject to the OGF Intellectual Property Policy.”

• Intellectual Property Notices Note Well: All statements related to the activities of the OGF and addressed to the OGF are subject to all provisions of Appendix B of GFD-C.1, which grants to the OGF and its participants certain licenses and rights in such statements. Such statements include verbal statements in OGF meetings, as well as written and electronic communications made at any time or place, which are addressed to:
  – the OGF plenary session,
  – any OGF working group or portion thereof,
  – the OGF Board of Directors, the GFSG, or any member thereof on behalf of the OGF,
  – the ADCOM, or any member thereof on behalf of the ADCOM,
  – any OGF mailing list, including any group list, or any other list functioning under OGF auspices,
  – the OGF Editor or the document authoring and review process

• Statements made outside of an OGF meeting, mailing list or other function, that are clearly not intended to be input to an OGF activity, group or function, are not subject to these provisions.

• Excerpt from Appendix B of GFD-C.1: “Where the OGF knows of rights, or claimed rights, the OGF secretariat shall attempt to obtain from the claimant of such rights, a written assurance that upon approval by the GFSG of the relevant OGF document(s), any party will be able to obtain the right to implement, use and distribute the technology or works when implementing, using or distributing technology based upon the specific specification(s) under openly specified, reasonable, non-discriminatory terms. The working group or research group proposing the use of the technology with respect to which the proprietary rights are claimed may assist the OGF secretariat in this effort. The results of this procedure shall not affect advancement of document, except that the GFSG may defer approval where a delay may facilitate the obtaining of such assurances. The results will, however, be recorded by the OGF Secretariat, and made available. The GFSG may also direct that a summary of the results be included in any GFD published containing the specification.”

• OGF Intellectual Property Policies are adapted from the IETF Intellectual Property Policies that support the Internet Standards Process.

2 – 04/21/23, © 2009 Internet2

Overview

• OGF-NMC relationship to perfSONAR
• perfSONAR Overview
  – Motivation
  – What is perfSONAR
  – Who is involved
  – Who is adopting
• NMC working status

NMC Charter

Charter Focus/Purpose and Scope: The purpose of the Network Measurement and Control Working Group is to standardize the XML-based protocols that are currently in use in the perfSONAR project to control network measurement infrastructure and to share the results of the measurements and metrics that are generated. These protocols are already in widespread use and are described across a number of documents with various degrees of formality.

The scope of the Network Measurement and Control Working Group is to define base protocols and extension frameworks for those protocols, as well as to define extensions that are already in common use.

Why Worry About Network Performance?

• Networks are not flawless
  – Heterogeneous equipment
  – Cost factors heavily into design, e.g. you get what you pay for
  – Design heavily favors protection and availability over performance
• Communication protocols are not advancing as fast as networks
  – TCP/IP is the king of the protocol stack
    • Guarantees reliable transfers
    • Adjusts to failures in the network
    • Adjusts speed to be fair for all
• User Expectations
  – Big Science is prevalent globally
  – The “8 Second Rule” is present in Scientific Communities too [1]

Motivation – A Typical Scenario

• User and resource are geographically separated
• Both have access to a high speed communication network
  – LAN infrastructure – 1 Gbps Ethernet
  – WAN infrastructure – 10 Gbps Optical Backbone

Motivation – A Typical Scenario

• User wants to access a file at the resource (e.g. ~600 MB)
• Plans to use COTS tools (e.g. SCP, but could easily be something scientific like GridFTP or simple like a web browser)
• What are the expectations?
  – 1 Gbps network (e.g. bottleneck speed on the LAN)
  – 600 MB * 8 = 4,800 Mb file
  – User expects line rate, e.g. 4,800 Mb / 1,000 Mbps = 4.8 Seconds
  – Audience Poll: Is this expectation too high?
• What are the realities?
  – Congestion and other Network performance factors
  – Host performance
  – Protocol Performance
  – Application performance

Motivation – A Typical Scenario

• Real Example (New York USA to Los Angeles USA):
• 10 minutes seems unreasonable given the investment in technology
  – Backbone network
  – High speed LAN
  – Capable hosts
• Performance realities as network speed decreases:
  – 100 Mbps Speed – 48 Seconds
  – 10 Mbps Speed – 8 Minutes
  – 1 Mbps Speed – 80 Minutes
• How could this happen?
• More importantly, why are there not more complaints?
• Audience Poll: Would you complain? If so, to whom?
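The arithmetic behind these figures can be checked with a short sketch. It assumes the 10 minutes quoted for the real example refers to the same 600 MB file, and it ignores protocol overhead; the function names are illustrative only.

```python
FILE_MEGABYTES = 600
FILE_MEGABITS = FILE_MEGABYTES * 8  # 4,800 Mb, as on the slide


def transfer_seconds(size_megabits, rate_mbps):
    """Ideal transfer time at a sustained rate, ignoring protocol overhead."""
    return size_megabits / rate_mbps


def effective_rate_mbps(size_megabits, seconds):
    """Average rate implied by an observed transfer time."""
    return size_megabits / seconds


# Expected times at the line rate and at the slower rates listed on the slide.
for rate in (1000, 100, 10, 1):
    secs = transfer_seconds(FILE_MEGABITS, rate)
    print(f"{rate:>5} Mbps: {secs:>7.1f} s ({secs / 60:.1f} min)")

# Assumption: the 10-minute real example moved the same 600 MB file.
observed = effective_rate_mbps(FILE_MEGABITS, 10 * 60)
print(f"A 10-minute transfer implies about {observed:.0f} Mbps on average")
```

Under that assumption the user saw roughly 8 Mbps on a path whose slowest link was 1 Gbps, which is the gap the rest of the talk sets out to explain.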

Motivation – A Typical Scenario

• Expectation does not even come close to experience: time to debug. But where to start?
  – Application
    • Have other users reported problems? Is this the most up to date version?
  – Protocol
    • Protocols can typically be tuned on an individual basis; consult your operating system.
  – Host
    • Are the hardware (network card, system internals) and software (drivers, operating system) functioning as they should be?
  – LAN Networks
    • Consult with the local administrators on status and potential choke points
  – Backbone Network
    • Consult the administrators at remote locations on status and potential choke points

Motivation – A Typical Scenario

• Following through, what normally happens …
  – Application
    • This step is normally skipped; the application designer will blame the network
  – Protocol
    • These settings are normally never explored
  – Host
    • Checking and diagnostic steps normally stop after establishing connectivity
  – LAN Networks
    • Will assure internal performance, but LAN administrators will ignore most user complaints and shift blame to upstream sources
  – Backbone Network
    • Will assure internal performance, but Backbone responsibilities normally stop at the demarcation point; blame is shifted to other networks up and down stream

Motivation – A Typical Scenario

• Stumbling Blocks to solving performance problems
  – Lack of a clear process
    • Knowledge of the proper order to approach problems is paramount
    • This knowledge is not just for end users, but also for application developers and network operators
  – Impatience
    • Everyone is impatient, from the user who wants things to work to the network staff and application developers who do not want to hear complaints
  – Information Void
    • Lack of a clear location that describes symptoms and steps that can be taken to mitigate risks and solve problems
    • Lack of available performance information, e.g. the current status of a given network in a public and easily accessible forum
  – Communication
    • Finding whom to contact to report problems or get help in debugging is frustrating

Motivation – Possible Solutions

• The purpose of this workshop is to introduce and motivate solutions in the network space
  – Federated debugging
  – Unified views of end to end network performance
  – Presentation and retrieval of measurement data for use by developers, operators, and users alike.
• More research and implementation is needed for other areas that will not be mentioned here:
  – Applications
    • Developers should be aware of TCP performance and structure their applications accordingly, perhaps considering other protocols when appropriate
  – Protocols
    • Linux Kernel autotuning support is advancing, but vigilance is needed for supporting large network flows on end hosts
  – Host Tuning
    • Lots of work being done here for manual tuning, see also ESnet’s guide: http://fasterdata.es.net/

Motivation – Possible Solutions

• Finding a solution to network performance problems can be broken into two distinct steps:
  – Use of Diagnostic Tools to locate problems
    • Tools that actively measure performance (e.g. Latency, Available Bandwidth)
    • Tools that passively observe performance (e.g. error counters)
  – Regular Monitoring to establish performance baselines and alert when performance drops below expectation.
    • Using diagnostic tools in a structured manner
    • Visualizations and alarms to analyze the collected data
• Incorporation of either of these techniques must be:
  – ubiquitous, e.g. the solution works best when it is available everywhere
  – seamless (e.g. federated) in presenting information from different resources and domains
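The regular-monitoring step can be sketched as a baseline plus a threshold check. This is a minimal illustration, not a perfSONAR component: the function names, the use of the median as the baseline, the 80% tolerance, and the sample throughput values are all assumptions.

```python
from statistics import median


def baseline(history_mbps):
    """Baseline taken as the median of past measurements (an assumed choice)."""
    return median(history_mbps)


def check(history_mbps, latest_mbps, tolerance=0.8):
    """Flag the latest measurement if it falls below tolerance * baseline."""
    base = baseline(history_mbps)
    return {
        "baseline": base,
        "latest": latest_mbps,
        "alert": latest_mbps < tolerance * base,
    }


# Hypothetical throughput history from a regularly scheduled test, in Mbps.
history = [900, 920, 940, 910, 930]
print(check(history, 905))  # close to the baseline, no alert
print(check(history, 450))  # well below the baseline, raises an alert
```

The median is used here because a single bad test does not move it much; a production system would likely also consider time of day and measurement noise.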

Motivation – Possible Solutions

• Desirable design features for any solution
  – Component Based
    • Functionality should be split into logical units
    • Each function (e.g. visualization) should operate through well defined communication with other components (e.g. data storage)
  – Modular
    • Monolithic designs rarely work
    • Components allow choice of how to operate a customized end solution.
  – Accessible
    • Well defined interfaces (e.g. APIs)
• Initial design should facilitate future expansion
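As a rough illustration of the component-based idea, the sketch below separates a data storage component from a visualization component that only talks to it through a small interface. All class and method names are hypothetical and do not come from any perfSONAR implementation.

```python
from typing import List, Protocol


class DataStore(Protocol):
    """Interface the visualization component depends on (hypothetical)."""

    def fetch(self, metric: str) -> List[float]: ...


class InMemoryStore:
    """One possible storage component; a database-backed one could replace it."""

    def __init__(self, series):
        self._series = dict(series)

    def fetch(self, metric):
        return list(self._series.get(metric, []))


class TextSummary:
    """Visualization component that only knows the DataStore interface."""

    def __init__(self, store: DataStore):
        self._store = store

    def render(self, metric):
        values = self._store.fetch(metric)
        if not values:
            return f"{metric}: no data"
        return f"{metric}: min={min(values)} max={max(values)} n={len(values)}"


# Hypothetical latency samples in milliseconds.
store = InMemoryStore({"latency_ms": [70, 72, 71]})
print(TextSummary(store).render("latency_ms"))
```

Because `TextSummary` depends only on `fetch`, either side can be swapped out without touching the other, which is the modularity the slide asks for.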

Motivation – Possible Solutions

[Figure: diagram with the labels Analysis & Visualization, API, Measurement Infrastructure, Data Collection, and Performance Tools]

What is perfSONAR?

• Most organizations perform monitoring and diagnostics of their own network
  – SNMP Monitoring via common tools (e.g. MRTG, Cacti)
  – Enterprise monitoring (e.g. Nagios)
• Networking is increasingly a cross-domain effort
  – International collaborations in many spaces (e.g. science, the arts and humanities) are common
  – Interest in development and use of R&E networks at an all time high
• Monitoring and diagnostics must also become a cross-domain effort

What is perfSONAR?

• A collaboration
  – Production network operators focused on designing and building tools that they will deploy and use on their networks to provide monitoring and diagnostic capabilities to themselves and their user communities.
• An architecture & set of communication protocols
  – Web Services (WS) Architecture
  – Protocols established in the Open Grid Forum
    • Network Measurement Working Group (NM-WG)
    • Network Measurement and Control Working Group (NMC-WG)
• Several interoperable software implementations
  – perfSONAR-MDM
  – perfSONAR-PS
• A Deployed Measurement infrastructure

perfSONAR Inception

• perfSONAR originated from discussions between Internet2’s End-to-End Performance Initiative (E2Epi) and the Géant2 project in September 2004.
• Members of the OGF’s (then GGF) NM-WG provided guidance on the encoding of network measurement data.
• Additional network partners, including ESnet and RNP, provided development resources and served as early adopters.
• The first release of perfSONAR branded software was available in July 2006.
• All perfSONAR branded software is open source
• All products looking to be labeled as perfSONAR compliant must establish protocol compliance based on the public standards of the OGF

perfSONAR Architecture Overview

• Interoperable network measurement middleware designed as a Service Oriented Architecture (SOA):
  – Each component is modular
  – All are Web Services (WS) based
  – The global perfSONAR framework as well as individual deployments are decentralized
  – All perfSONAR tools are locally controlled
• perfSONAR Integrates:
  – Network measurement tools and archives (e.g. stored measurement results)
  – Data manipulation
  – Information Services
    • Discovery
    • Topology
  – Authentication and authorization

perfSONAR Architecture Overview

• The key concept of perfSONAR is that each entity performs a service
  – Each service provides a limited set of functions, e.g. collecting measurements between arbitrary points or managing the registration and location of distributed services
  – The service is a self contained entity and provides functionality on its own as well as when deployed with the remainder of the framework
• Services interact through protocol exchanges
  – Standardized message formats
  – Standardized exchange patterns
• A collection of perfSONAR services within a domain is a deployment
  – Deploying perfSONAR can be done à la carte, or through a complete solution
• Services federate with each other, locally and globally
  – Services are designed to automatically discover the presence of other perfSONAR components
  – Clients are designed with this distributed paradigm in mind
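A request/response exchange with a standardized XML message format might look like the sketch below. It is an illustration only: the namespace URI, the element and attribute names (`message`, `metadata`, `data`, `eventType`, `metadataIdRef`, `messageIdRef`), the message type `ExampleRequest`, and the event type URI are assumptions modeled on the NM-WG style, and should be checked against the OGF documents rather than taken as the protocol definition.

```python
import xml.etree.ElementTree as ET

# Assumed namespace, modeled on NM-WG style schemas; verify before relying on it.
NMWG_NS = "http://ggf.org/ns/nmwg/base/2.0/"
ET.register_namespace("nmwg", NMWG_NS)


def tag(name):
    """Qualify an element name with the assumed namespace."""
    return f"{{{NMWG_NS}}}{name}"


def build_request(message_type, event_type):
    """Build a message: metadata says what is asked, data points back to it."""
    message = ET.Element(tag("message"), {"type": message_type, "id": "msg1"})
    metadata = ET.SubElement(message, tag("metadata"), {"id": "meta1"})
    ET.SubElement(metadata, tag("eventType")).text = event_type
    ET.SubElement(message, tag("data"), {"id": "data1", "metadataIdRef": "meta1"})
    return message


def respond(request):
    """Pair each XRequest with an XResponse that references the request id."""
    response_type = request.get("type").replace("Request", "Response")
    response = ET.Element(
        tag("message"),
        {"type": response_type, "id": "msg2", "messageIdRef": request.get("id")},
    )
    for metadata in request.findall(tag("metadata")):
        response.append(metadata)  # echo what was asked so the client can match it
    return response


# Hypothetical message type and event type, for illustration only.
request = build_request("ExampleRequest", "http://example.org/eventType/throughput")
wire = ET.tostring(request, encoding="unicode")
print(wire)
print(ET.tostring(respond(ET.fromstring(wire)), encoding="unicode"))
```

The point of the sketch is the pattern rather than the vocabulary: every service accepts the same envelope, and the request and response types are paired in a predictable way.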


perfSONAR Architecture Overview

• A perfSONAR deployment can be any combination of services
  – An instance of the Lookup Service is required to share information
  – Any combination of data services and analysis and visualization tools is possible
• perfSONAR services automatically federate globally
  – The Lookup Service communicates with a confederated group of directory services (e.g. the Global Lookup Service)
  – Global discovery is possible through APIs
• perfSONAR is most effective when all paths are monitored
  – Debugging network performance must be done end-to-end
  – Lack of information for specific domains can delay or hinder the debug process
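The registration and discovery role of the Lookup Service can be pictured with a toy in-memory model. This is not the perfSONAR Lookup Service API: the class, its methods, the record fields, and the example URLs are all hypothetical, and real federation happens over the network rather than through object references.

```python
class LookupService:
    """Toy directory: services register locally, queries can span peers."""

    def __init__(self, name):
        self.name = name
        self._records = []
        self._peers = []

    def add_peer(self, peer):
        """Federate with another directory (stands in for global lookup)."""
        self._peers.append(peer)

    def register(self, service_name, event_type, access_point):
        self._records.append(
            {
                "domain": self.name,
                "service": service_name,
                "event_type": event_type,
                "access_point": access_point,
            }
        )

    def query_local(self, event_type):
        return [r for r in self._records if r["event_type"] == event_type]

    def query_global(self, event_type):
        """Local matches first, then matches from every federated peer."""
        results = self.query_local(event_type)
        for peer in self._peers:
            results.extend(peer.query_local(event_type))
        return results


# Two hypothetical domains, each running its own lookup instance.
domain_a = LookupService("domain-a")
domain_b = LookupService("domain-b")
domain_a.add_peer(domain_b)
domain_a.register("archive-a", "throughput", "https://ma.domain-a.example/services")
domain_b.register("archive-b", "throughput", "https://ma.domain-b.example/services")

for record in domain_a.query_global("throughput"):
    print(record["domain"], record["access_point"])
```

A client that asks one directory and gets answers from several domains is what makes end-to-end debugging practical; a domain that registers nothing simply does not appear in the result.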

[Figure: diagram of five domains: FNAL (AS3152) [US], ESnet (AS293) [US], GEANT (AS20965) [Europe], DFN (AS680) [Germany], and DESY (AS1754) [Germany]. Each domain is shown with a measurement archive and measurement points m1, m3, and m4. The diagram also shows a user, a performance GUI, and an analysis tool.]

Many collaborations are inherently multi-domain, so for an end-to-end monitoring tool to work everyone must participate in the monitoring infrastructure

Example perfSONAR Use Case

• perfSONAR should be used to diagnose an end-to-end performance problem
  – User is attempting to download a remote resource
  – Resource and user are separated by distance
  – Both are assumed to be connected to high speed networks
• Operation does not go as planned; where to start?

Example perfSONAR Use Case

• Simple tools like traceroute can be used to determine the path traveled

• There could be a performance problem anywhere in here

• The problem may be something we could fix, but the chances are greater that it is not

Example perfSONAR Use Case

• Each segment of the path is controlled by a different domain.

• Each domain will have network staff that could help fix the problem, but how to contact them?

• All we really want is some information regarding performance

Example perfSONAR Use Case

• Each domain has made measurement data available via perfSONAR

• The user was able to discover this automatically

• Automated tools such as visualizations and analyzers can be powered by this network data

Example perfSONAR Use Case

• In the end the problem is isolated based on testing.

• The user can contact the domain in question to inquire about this performance problem

• When fixed, the transfer should progress as intended
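Once every domain on the path publishes measurements, isolating the problem can be as simple as comparing segments. The sketch below is an illustration under stated assumptions: the domain names and throughput values are invented, and flagging segments below half of the median is an arbitrary rule chosen for the example.

```python
from statistics import median


def suspect_segments(measurements, fraction=0.5):
    """Return domains whose throughput is below fraction * median of the path."""
    typical = median(m["throughput_mbps"] for m in measurements)
    return [
        m["domain"]
        for m in measurements
        if m["throughput_mbps"] < fraction * typical
    ]


# Hypothetical per-domain results for one end-to-end path, in Mbps.
path = [
    {"domain": "campus-a", "throughput_mbps": 940},
    {"domain": "regional-a", "throughput_mbps": 935},
    {"domain": "backbone", "throughput_mbps": 120},
    {"domain": "regional-b", "throughput_mbps": 940},
    {"domain": "campus-b", "throughput_mbps": 938},
]

print(suspect_segments(path))  # domains worth contacting first
```

The output is a short list of domains to contact, which replaces the blame-shifting described earlier with data that every party on the path can see.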

Who is perfSONAR?

• The perfSONAR Consortium is a collaboration between
  – ESnet
  – Géant
  – Internet2
  – Rede Nacional de Ensino e Pesquisa (RNP)
• Decisions regarding protocol development, software branding, and interoperability are handled at this organization level
• There are two independent efforts to develop software that is compatible with perfSONAR
  – perfSONAR-MDM
  – perfSONAR-PS
• Each project works on an individual development roadmap and works with the consortium to further protocol development and ensure compatibility

Who is perfSONAR-MDM?

• perfSONAR-MDM is made up of participants in the Géant project:
  – Arnes
  – Belnet
  – Carnet
  – Cesnet
  – CYNet
  – DANTE
  – DFN
  – FCCN
  – GRNet
  – GARR
  – ISTF
  – PSNC
  – Nordunet (Uninett)
  – Renater
  – RedIRIS
  – Surfnet
  – SWITCH
• perfSONAR-MDM is written in Java primarily and was designed to serve as the monitoring solution for the Large Hadron Collider (LHC) project.
• perfSONAR-MDM is available as Debian or RPM packages.

Who is perfSONAR-PS?

• perfSONAR-PS is made up of several members:
  – ESnet
  – Fermilab
  – Georgia Tech
  – Indiana University
  – Internet2
  – SLAC
  – The University of Delaware
• perfSONAR-PS products are written in the Perl programming language and are available for installation via source or RPM packages
• perfSONAR-PS is also a major component of the Internet2 pS Performance Toolkit, a bootable Linux CD containing measurement tools.

perfSONAR Adoption

• perfSONAR is gaining traction as an interoperable and extensible monitoring solution
• Adoption has progressed in the following areas:
  – R&E networks including backbone, regional, and exchange points
  – Universities on an international basis
  – Federal labs and agencies in the United States (e.g. JET nets)
  – Scientific Virtual Organizations, notably the LHC project
• Recent interest has also accrued from:
  – International R&E network partners and exchange points
  – Commercial Providers in the United States
  – Hardware manufacturers

perfSONAR Adoption

• Networks
  – APAN, CENIC, CSTNET, ESnet, Geant, Gloriad, GPN, Internet2, JGN2, LONI, MAX, NOX, NSERNET, RNP, Starlight, Transpac2, UEN
• Labs
  – ANL, BNL, FNAL **, NERSC, PNNL, PSC, SLAC
• International Sites
  – Chinese University of Hong Kong, Chonnam National University (Korea), KISTI (Korea), Monash University (Melbourne, Victoria, Australia), MRREE (Lima, Peru), NCHC (Taiwan), NICT (Japan), Simon Fraser (Burnaby, BC, Canada), Thaisarn Nectec (Bangkok, Thailand), UNIFACS (Salvador, Bahia, Brazil)
• Other
  – Cobham, Northrop Grumman, Ocala Electric, Philadelphia Orchestra, REDDnet
• Current
  – http://www.perfsonar.net/activeServices/IS/
• Universities
  – Boston University *
  – College of William and Mary
  – George Mason Univ
  – Georgia Tech University
  – Hope College
  – Indiana University *
  – Leeward Community College
  – Louisiana State University
  – Michigan State University *
  – Middle Tennessee State University
  – Northwestern **
  – Oregon State
  – Penn State University
  – Southern Methodist University *
  – Syracuse
  – Texas A&M University *
  – Tufts *
  – University of California Los Angeles
  – University of California San Diego **
  – University of Chicago *
  – University of Connecticut
  – University of Delaware
  – University of Hawaii
  – University of Michigan *
  – University of Northern Iowa
  – University of Oklahoma *
  – University of Texas *
  – University of Utah
  – University of Wisconsin (Condor)
  – University of Wisconsin (Madison) * **
  – Vanderbilt **
  – University of Florida **

* USATLAS
** USCMS

Workgroup Status

• Holding regular calls
• Making progress on the base document (new version will be posted to gridforge today)
• Having discussions on problems related to specifying the work done by organizations such as perfSONAR:
  – Effective split of the work between documents
    • Base doc should only have pS messages in it
    • Profile document discussing how to use pS messages in HTTP/SOAP context should come out concurrently
  – Result Codes
• Would like more people willing to author and edit documents
• Next session will be a “working” session. Please stay, and expect to get work assignments.


For more information, visit https://forge.gridforum.org/projects/nmc-wg
