NM-WG Specification Adoption in perfSONAR
Aaron Brown, Internet2, University of Delaware
Martin Swany, University of Delaware, Internet2
What is perfSONAR • A collaboration
• Production network operators focused on designing and building tools that they will deploy and use on their networks to provide monitoring and diagnostic capabilities to themselves and their user communities.
• An architecture & a set of protocols • Web Services Architecture • Protocols based on the Open Grid Forum Network
Measurement Working Group (NM-WG) Schemas • Emerging standards in the Network Markup Language WG
(NML-WG) • Several interoperable software implementations
• Java & Perl • A Deployed Measurement infrastructure
perfSONAR Goals
• Increase network awareness • Set user expectations accurately
• Reduce diagnostic costs • Performance problems noticed early • Performance problems addressed efficiently • Network engineers can see & act outside their “turf”
• Transform application design • Incorporate network intuition into application behavior
Vision: Network Performance Information is … • Available
• People can find it (Discovery) • “Community of trust” allows access across administrative
domain boundaries • Ubiquitous
• Widely deployed (Paths of interest covered) • Reliable (Consistently configured correctly)
• Valuable • Actionable (Analysis suggests course of action) • Automatable (Applications act on data)
perfSONAR Collaborators • GRNET • HEAnet • Internet2 • ISTF • POZNAN • UNINETT • University of Delaware • Renater • RedIRIS • SLAC • SWITCH • SURFnet
• RNP • ARNES • BELNET • CARNET • CESNET • CYNET • DANTE • DFN • ESnet • FCCN • FERMI • GARR • GEANT
And anybody else we missed
perfSONAR Architecture • Interoperable network measurement middleware:
• Modular • Web services-based • Decentralized • Locally controlled
• Integrates: • Network measurement tools • Network measurement archives • Discovery • Authentication and authorization • Data manipulation • Resource protection • Topology
• Based on: • Open Grid Forum Network Measurement Working Group
schema.
perfSONAR: System Description
• Domains represented by a set of services • Each domain can deploy services important to the domain • Analysis clients interact with services across multiple domains
perfSONAR: Services (1) • Lookup Service
• Allows the client to discover the existing services and other LS services.
• Dynamic: services register themselves with the LS and advertise their capabilities; they can also leave, or be removed if a service goes down.
• AuthN/Z Service • Internet2 Middleware Group, GN2-JRA5
(eduGAIN) • Authorization functionality for the framework • Users can have several roles; authorisation is
based on the user's role. • Trust relationships defined between users affiliated
with different administrative domains.
perfSONAR Services (2) • Transformation Service
• Transform the data (aggregation, concatenation, correlation, translation, etc).
• Topology Service • Make the network topology information available
to the framework. • Find the closest MP, provide topology information
for visualisation tools • Resource protector
• Arbitrate the consumption of limited resources between multiple services.
Inter-domain perfSONAR example interaction
[Figure: a Client and two domains, Network A (LS A, MA A, AA A) and Network B (LS B, MA B, AA B), with links labelled a, b, c, d, e, f. Request/reply pairs shown in the figure:]
• "Where link utilisation along path a,b,c,d,e,f?" Reply: "a,b,c: Network A – LS A; c,d,e,f: Network B, MA B, AA B"
• "Where link utilisation along path a,b,c?" Reply: "a,b,c: Network A, MA A, AA A"
• "Here is who I am, I’d like to access MA A" Reply: Token MA
• "Get link utilisation a,b,c" Reply: "Here you go"
• "Here is who I am, I’d like to access MA B" Reply: Token MB
• "Get link utilisation c,d,e,f" Reply: "Here you go"
• Result: useful graph
Schema • Key Goals: Extensibility, Normalization,
Readability • Break representation of performance
measurements down into basic elements • Data and Metadata • Measurement Data • A set of measurement events that have
some value or values at a particular time • Measurement Metadata • The details about the set of measurement data
Schema Normalization
• Can simplify the database representation for many types of measurement data • While optimizations are certainly possible,
many measurement types can be viewed as one value over time
• Assists Combination/Concatenation of metrics • Creating derived metrics
• Normalization helps with inferring relationships between types of metrics
Schema Basic Elements - Metadata
• Subject • The measured/tested entity
• EventType (Verb) • What type of measurement, value, or event
occurred • Characteristic, tool output, or generic event
• Parameters (Adjectives and Adverbs) • How, or under what conditions, did this event
occur?
Schema Basic Elements - Data
• Some sort of value - Datum • Existence of an event might point to the case
where there is no additional value • As in “Link up/down” or threshold events
• Time • Must be extensible since even agreement
about the right structure is not easy • E.g. UNIX timestamp vs NTP time
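The two time formats named above differ mainly in their epoch and in how fractions of a second are carried. A minimal sketch of the conversion, assuming the standard 64-bit NTP timestamp layout (32-bit seconds since 1900-01-01, 32-bit binary fraction); the function names are illustrative and not part of any perfSONAR schema:

```python
from typing import Tuple

# Seconds between the NTP epoch (1900-01-01) and the UNIX epoch (1970-01-01).
NTP_UNIX_OFFSET = 2_208_988_800


def ntp_to_unix(ntp_seconds: int, ntp_fraction: int = 0) -> float:
    """Convert NTP seconds plus a 32-bit binary fraction to UNIX seconds."""
    return (ntp_seconds - NTP_UNIX_OFFSET) + ntp_fraction / 2**32


def unix_to_ntp(unix_seconds: float) -> Tuple[int, int]:
    """Convert UNIX seconds to an (NTP seconds, 32-bit fraction) pair."""
    whole = int(unix_seconds // 1)          # floor, also for negative values
    fraction = int((unix_seconds - whole) * 2**32)
    return whole + NTP_UNIX_OFFSET, fraction
```

An extensible time element lets a service carry either form and leave this kind of conversion to the consumer.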
A Message
[Figure: a Message containing Metadata and Data elements]
An Object Store
[Figure: a Store containing Metadata and Data elements]
A Data is Linked to a Metadata
Metadata: <id>someId</id>
Data: <metadataIdRef>someId</metadataIdRef>
A Metadata may be linked to another
Metadata: <id>someId</id>
Metadata: <id>someOtherId</id> <metadataIdRef>someId</metadataIdRef>
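The id/metadataIdRef linking can be sketched as following references from one element to the next. This is a minimal sketch over a hypothetical in-memory model (a dictionary mapping each metadata id to the id it references, or None); it is not the parsing code of any perfSONAR implementation:

```python
from typing import Dict, List, Optional


def resolve_chain(metadata_refs: Dict[str, Optional[str]], start_id: str) -> List[str]:
    """Follow metadataIdRef links from start_id until a metadata with no reference."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = start_id
    while current is not None:
        if current in seen:
            # A reference loop would otherwise never terminate.
            raise ValueError(f"cyclic metadataIdRef at {current!r}")
        if current not in metadata_refs:
            raise KeyError(f"unknown metadata id {current!r}")
        seen.add(current)
        chain.append(current)
        current = metadata_refs[current]
    return chain


# The two metadata elements from the slide: someOtherId refers to someId.
refs = {"someId": None, "someOtherId": "someId"}
print(resolve_chain(refs, "someOtherId"))
```

A Data element would be resolved the same way, starting from the id in its metadataIdRef.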
Schema Namespaces
• All measurements have some sort of Data and Time
• All measurements can be described by the Metadata identifying who, what and how
• The specific structures of the Data and Metadata elements depend on the measurement
• Approach: Consistently use Data and Metadata elements and vary the namespaces of the specific elements
Schema Namespaces - 2
• We encode the measurement/event type in the namespace • And as a standalone element
• Some components of the system can pass Data and Metadata elements through without understanding their specific structure
• Allows an implementation to decide whether it supports a particular type of data or not
• Allows validation based on extended (namespace-specific) schemata
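A minimal sketch of how a component might decide by namespace alone whether it understands a Metadata element, passing the rest through untouched. The namespace URIs, element names and the `classify` helper are placeholders chosen for illustration, not the actual NM-WG namespaces:

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, used only for this example.
BASE_NS = "http://example.org/ns/base/"
UTIL_NS = "http://example.org/ns/utilization/"
OTHER_NS = "http://example.org/ns/other/"

MESSAGE = f"""
<base:message xmlns:base="{BASE_NS}" xmlns:util="{UTIL_NS}" xmlns:other="{OTHER_NS}">
  <util:metadata id="m1"/>
  <other:metadata id="m2"/>
</base:message>
"""

SUPPORTED = {UTIL_NS}  # namespaces this component knows how to interpret


def split_tag(tag):
    """Split ElementTree's '{namespace}local' tag form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


def classify(message_xml):
    """Return (ids of supported metadata, ids of metadata passed through)."""
    root = ET.fromstring(message_xml.strip())
    supported, passed_through = [], []
    for child in root:
        namespace, local = split_tag(child.tag)
        if local != "metadata":
            continue
        target = supported if namespace in SUPPORTED else passed_through
        target.append(child.get("id"))
    return supported, passed_through


print(classify(MESSAGE))
```

The element name stays the same in every case; only the namespace tells the component whether it needs to look inside.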
Schema Namespaces and Extensibility
• One key to extensibility is the use of hierarchy with delegation • Similar to OIDs in the IETF management
world • The NM-WG has a hierarchy of network
characteristics • Good starting point
• However, not all tools are cleanly mapped onto the Characteristic space • Often a matter of some debate
Schema Namespaces and Extensibility - 2
• Organization-rooted tools namespace addresses this
• Some top-level tools • ping, traceroute
• Easy to add new tools in organization-specific namespaces
• Performance Event Repository • Add a schema and get a URI • Add Java classes
perfSONAR-PS Motivation
• Create separate implementation of perfSONAR standard • Use same protocol/standards
• Proof of interoperability (strengthens the standard) • Targeted for NOC deployments
• Lightweight
• Easy to deploy/manage
• (We were unable to convince our primary users to deploy Java services due to the complexity of dependencies)
perfSONAR-PS Beta Release (0.06) (1/21/08) • Focus on development of major perfSONAR components
• LS - perfSONAR_PS::Services::LS::LS • SNMP MA - perfSONAR_PS::Services::MA::SNMP • Status MA - perfSONAR_PS::Services::MA::Status • CircuitStatus MA - perfSONAR_PS::Services::MA::CircuitStatus • Topology MA - perfSONAR_PS::Services::MA::Topology • PingER (SLAC) *
• Not yet released • OWAMP/BWCTL archive (perfSONARBUOY)
• Not released via CPAN
SNMP Measurement Archive
• Provide access to network performance data • Utilization • Errors • Discards
• Numerous tools exist to collect passive measurements (via SNMP): • MRTG • Cacti • Cricket
• Expose archives from RRD files
SNMP Measurement Archive
• Current Deployment: • Internet2 Network • ESnet • Georgia Tech/SOX • Fermilab
PingER-Based MP/MA
• Joint effort between Fermilab and SLAC • Present views of historic PingER data • Expose interface to schedule live tests
• Built with perfSONAR-PS infrastructure
Link Status Measurement Archive
• Provide access to up/down status information about layer2 links • Data stored in a SQL database • Database schema allows for storing time ranges during
which a link had a certain status • Minimizes storage costs for rarely changing links
• Communication/Configuration via XML • Target audience is network operators and users
interested in obtaining the status of the links over which their data flows
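The time-range storage idea can be sketched as collapsing consecutive samples with the same status into a single row. This is a minimal sketch assuming samples arrive in time order as (timestamp, status) pairs; the tuple layout is hypothetical and not the archive's actual database schema:

```python
from typing import Iterable, List, Tuple

Sample = Tuple[int, str]             # (timestamp, status)
StatusRange = Tuple[int, int, str]   # (start, end, status)


def to_ranges(samples: Iterable[Sample]) -> List[StatusRange]:
    """Merge consecutive samples that share a status into one time range."""
    ranges: List[StatusRange] = []
    for timestamp, status in samples:
        if ranges and ranges[-1][2] == status:
            # Same status as the open range: extend its end time.
            start, _, _ = ranges[-1]
            ranges[-1] = (start, timestamp, status)
        else:
            # Status changed (or first sample): open a new range.
            ranges.append((timestamp, timestamp, status))
    return ranges


samples = [(0, "up"), (60, "up"), (120, "down"), (180, "down"), (240, "up")]
print(to_ranges(samples))
```

A link that never changes status occupies one row however many times it is polled, which is where the storage saving comes from.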
Link Status Measurement Archive
• Collector • Allows for the periodic collection of the status of
one or more links • Can use SNMP, Scripts or simply Constants • Can store results directly into a database or into
a remote Measurement Archive • Future Plans: TL1 Collection
Link Status Measurement Archive
• Visualization • A perfSONAR-UI Plugin is available that can display a
network and the status of its links • Current Deployment
• Internet2 Network • HOPI (in2p3 circuit)
• Planned Deployment • SLAC
Circuit Status Measurement Archive
• An e2emon-compatible service • Integrates with the Link Status MA to provide the
information stored in MAs • Can work with local MAs directly or with remote MAs
• Can use the Topology MA to obtain necessary information about nodes
• Can use a Lookup Service to look up the MA containing information on each link
• Target audience is administrators who want to publish circuit status information to e2emon clients
Circuit Status Measurement Archive
• Visualization • Any tool that is compatible with e2emon will
work with this service • Current Deployment • Internet2 Network • HOPI (in2p3 circuit)
• Planned Deployment • SLAC
Topology Service
• Provides a queryable repository for obtaining topology information about a domain • Can obtain the entire network • XQuery interface allows the construction of
complex queries about the network • Topology is specified according to the
schema in development in the OGF
Topology Service
• Current Deployments • Internet2 • SLAC (PingER Topology Information)
• Planned Deployments • DICE Dynamic Circuit Service Sites • ESnet
perfSONAR Lookup Service
• Directory service of perfSONAR deployments • Accept service registrations • Handles queries for service location and capabilities
and location of available data • Manage the lifetimes of data and services to keep
information up to date • Web Service interface to XML Database
• Sleepycat XML Database • Service Info/Data kept in native formats
• Offloads complex query tasks from otherwise 'busy' services
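Lifetime management can be sketched as a registry in which every registration carries an expiry time and must be refreshed to stay listed. This is a minimal in-memory sketch; the class, method names and service URLs are hypothetical, and the real service stores its data in an XML database:

```python
from typing import Dict, List


class Registry:
    """Toy lookup registry: registrations expire unless refreshed."""

    def __init__(self, ttl_seconds: int) -> None:
        self.ttl = ttl_seconds
        self._expiry: Dict[str, int] = {}  # service URL -> expiry time

    def register(self, service_url: str, now: int) -> None:
        """Register a service, or refresh an existing registration."""
        self._expiry[service_url] = now + self.ttl

    def active(self, now: int) -> List[str]:
        """Services whose registration has not yet expired."""
        return sorted(url for url, expiry in self._expiry.items() if expiry > now)

    def purge(self, now: int) -> List[str]:
        """Drop expired registrations and report which ones were removed."""
        expired = sorted(url for url, expiry in self._expiry.items() if expiry <= now)
        for url in expired:
            del self._expiry[url]
        return expired


registry = Registry(ttl_seconds=300)
registry.register("http://ma.example.org/snmp", now=0)
registry.register("http://ma.example.org/status", now=200)
print(registry.active(now=250))
print(registry.purge(now=400))
```

Time is passed in explicitly so the behaviour is easy to test; a running service would read the clock instead.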
Lookup Service
• Also XML based configuration/protocol • Native storage/query mechanisms [XPath/XQuery] • Message format to exchange the data
• Targeted at single domain deployment • Single instance to manage multiple services
• Client components and applications use the LS to find services • perfSONAR-UI • perfAdmin
Lookup Service
• Current Deployment: • Internet2 (Ann Arbor) • University of Delaware
• Planned Deployment: • IU for Internet2 network and regionals • DICE Dynamic Circuit Network sites • International Partners
Distributed Lookup Service
• Federation of individual LS instances into a global system
• “Meta”-lookup phase allows a query to find the specific LS that has relevant information • Or perhaps the relevant LSes that have said info
• The specific query is sent directly to the LS in question
• Recent active design and development
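The "meta"-lookup phase can be sketched as matching the address of interest against whatever each LS has advertised, and only then sending the real query to the matching LS instances. This is a minimal sketch assuming each LS advertises a list of address prefixes; the host names and the `find_lookup_services` helper are hypothetical:

```python
import ipaddress
from typing import Dict, List


def find_lookup_services(summaries: Dict[str, List[str]], address: str) -> List[str]:
    """Return the LS instances whose advertised prefixes cover the address."""
    ip = ipaddress.ip_address(address)
    return sorted(
        ls
        for ls, prefixes in summaries.items()
        if any(ip in ipaddress.ip_network(prefix) for prefix in prefixes)
    )


# Hypothetical advertisements from two lookup services.
summaries = {
    "ls-a.example.org": ["192.0.2.0/25"],
    "ls-b.example.org": ["192.0.2.128/25", "198.51.100.0/24"],
}
print(find_lookup_services(summaries, "192.0.2.200"))
```

The specific query then goes only to the returned LS instances; an empty result means no known LS holds relevant information.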
Distributed Lookup Service • Service and measurement metadata is
“summarized” for propagation to distant domains • IP addresses in service and measurement
metadata are compressed into network/netmask pairs in the same way that routes are advertised (CIDR-style)
• These summarized metadata elements are advertised to external “scopes” • A “scope” is a set of LSes that are related by e.g.
being in the same administrative domain (although multiple scopes within a single domain are possible)
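CIDR-style summarization can be sketched with Python's standard `ipaddress` module. Note the assumption: `collapse_addresses` performs exact aggregation only, merging adjacent addresses into the smallest set of prefixes that covers precisely those addresses, whereas a Lookup Service might choose coarser summaries. The `summarize` helper is illustrative:

```python
import ipaddress
from typing import Iterable, List


def summarize(addresses: Iterable[str]) -> List[str]:
    """Collapse individual IP addresses into the fewest exact CIDR prefixes."""
    parsed = [ipaddress.ip_address(address) for address in addresses]
    return [str(network) for network in ipaddress.collapse_addresses(parsed)]


interfaces = ["192.0.2.0", "192.0.2.1", "192.0.2.2", "192.0.2.3", "198.51.100.7"]
print(summarize(interfaces))
```

Four adjacent addresses become a single /30, while the isolated address stays a /32, so the summary stays small without claiming addresses the domain does not hold.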
Weather Maps - Internet2
Gmaps from SLAC
CNM from DFN
CNM from DFN
perfSONARUI from acad.bg
PerfsonarUI 1
PerfsonarUI 2
PerfsonarUI 3
Oscars Circuit plugin - Internet2
Oscars circuit plugin
E2Emon - Monitoring Circuits
E2Emon: Status of E2E link CERN-LHCOPN-FNAL-001
E2Emon generated view of the data for one OPN link [E2EMON]
Traceroute Visualizer
• Forward direction bandwidth utilization on application path from LBNL to INFN-Frascati (Italy)
• Traffic shown as bars on those network device interfaces that have an associated MP service (the first 4 graphs are normalized to 2000 Mb/s, the last to 500 Mb/s)
• Link capacity is also provided
1 ir1000gw (131.243.2.1)
2 er1kgw
3 lbl2-ge-lbnl.es.net
4 slacmr1-sdn-lblmr1.es.net (GRAPH OMITTED)
5 snv2mr1-slacmr1.es.net (GRAPH OMITTED)
6 snv2sdn1-snv2mr1.es.net
7 chislsdn1-oc192-snv2sdn1.es.net (GRAPH OMITTED)
8 chiccr1-chislsdn1.es.net
9 aofacr1-chicsdn1.es.net (GRAPH OMITTED)
10 esnet.rt1.nyc.us.geant2.net (NO DATA)
11 so-7-0-0.rt1.ams.nl.geant2.net (NO DATA)
12 so-6-2-0.rt1.fra.de.geant2.net (NO DATA)
13 so-6-2-0.rt1.gen.ch.geant2.net (NO DATA)
14 so-2-0-0.rt1.mil.it.geant2.net (NO DATA)
15 garr-gw.rt1.mil.it.geant2.net (NO DATA)
16 rt1-mi1-rt-mi2.mi2.garr.net
17 rt-mi2-rt-rm2.rm2.garr.net (GRAPH OMITTED)
18 rt-rm2-rc-fra.fra.garr.net (GRAPH OMITTED)
19 rc-fra-ru-lnf.fra.garr.net (GRAPH OMITTED)
20
21 www6.lnf.infn.it (193.206.84.223) 189.908 ms 189.596 ms 189.684 ms