Scuola Superiore Sant’Anna

Enabling Platforms for high-performance computational Grids oriented to scalable virtual organization (GRID.IT)

P. Castoldi, F. Baroncelli, F. Cugini, B. Martini, V. Martini, F. Paolucci, L. Valcarenghi

TERENA Workshop on "Service Oriented Optical Networks", Catania, May 14th 2006

Facts about the GRID.IT project

15 Workpackages

• WP1 - Grid Oriented Optical Switching Paradigms
• WP2 - High Performance Photonic Testbed
• WP3 - Grid Deployment
• WP4 - Security
• WP5 - Data Intensive Core Services
• WP6 - Knowledge Services for Intensive Data Analysis
• WP7 - Grid Portals
• WP8 - High-performance Component-based Programming Environments
• WP9 - Grid-enabled Scientific Libraries
• WP10 - Grid Applications for Astrophysics
• WP11 - Grid Applications for Earth Observation Systems Application
• WP12 - Grid Applications for Biology
• WP13 - Grid Applications for Molecular Virtual Reality
• WP14 - Grid Applications for Geophysics
• WP15 - Management

• National project funded by Ministry of University and Research under the FIRB (Fundamental Research Incentive Fund) line

• Duration: 3+1 years (Nov. '02 – Oct. '06)

• 4 clusters of partners:

– CNIT* (5 universities), with UT Dallas as CNIT subcontractor
– CNR, National Research Council (3 institutes)
– INFN, National Institute for Nuclear Physics (3 institutes)
– ASI, Italian Space Agency

*CNIT (National Inter-university Consortium for Telecommunications) is a non-profit consortium of 34 Italian universities operating in the telecom area, coordinating large research initiatives with its own researchers and staff from affiliated universities.

Global Grid Computing

Global Grid Computing expands the resource horizon from LAN to WAN (not limited to optical networks).

Bottlenecks

• Computational, storage and similar resources (CPU): same as before

• Network resources become scarce and difficult to reserve

Requirements

• QoS-enabled network connectivity

• Network resource monitoring, adaptation and availability

• Application task staging should be network-aware

• Grid users should have some ability to trigger network connectivity

A solution - main streamline

• A new functional layer, consisting of network middleware, is introduced to meet the above requirements (general concept)

[Figure: L7 Grid resources (Application, L7 Grid Middleware with Scheduler and GRAM) and L1/2/3 resources (L1/2/3 Resource Management System, Network Monitoring, Network Configurator, Transport Network), joined by the Network Interface (OIF-UNI)]

GRAM = Grid Resource Allocation Manager

(*) Draft-ggf-masum-grid-network-services

“Note that, the network service interfaces for Grid will have a higher level of abstraction (hiding details) than what is provided by a traditional Service Network or Element Management System” (*)

The Grid Network Interface should (*):

• hide network details (e.g., topology, configuration) from the Grid middleware
• be as simple as possible
• allow end-to-end, on-demand, and real-time service requests

A Grid network is an overlay L7 network on top of an independent L1/2/3 network

A network-centric view of a Grid

[Figure labels: Service Entity, Application Entity]

Application services and network services

[Figure: customers (Customer 1 to Customer 5) and end users run application services (Storage, Grid, Internet Access, Hosting, VoIP/MoIP, PSTN) on top of network protocols (IP, MPLS, POS, ATM, SDH, SONET, TDM, Fast/Giga Ethernet, WDM)]

• Customers run application services that exploit (a stack of) network protocols for their connectivity needs

• Application services are abstract descriptions of the application (logic)

• Network protocols (transport and ancillary functions such as routing, signaling, link management) are logically classified into a few categories of network services: connectionless IP, L1/L2/L3 VPN

How do we map the elements of these two sets?

From UNI to a service interface

• Via the User to Network Interface (UNI) in Control Plane-enabled ASTN/GMPLS networks, client networks can request some network services, but (e.g.):
– it provides only point-to-point network services
– it does not coordinate services provided by an arbitrary set of edge nodes
– it is not designed to be used by applications

• Consistent with existing approaches (IMS and NGN) and with the efforts of other EU projects (MUPBED, NOBEL), applications (e.g. grid) should be enabled to set up an “application platform”, i.e. a network service tailored to their needs

• For this purpose, a Service Plane is used that exports a new service interface towards applications, namely the user-to-service interface (USI).

The SO-ASTN

[Figure: SO-ASTN architecture. Planes: Service Plane (Centralized Service Plane with CSEs, Distributed Service Plane with DSEs), Management Plane, Control Plane, Transport Plane. Interfaces: USI, UPI, UNI, NMI-A, NMI-T, CCI. Other elements: Application Host, SLA, Client/Access Network, Edge, CPE.]

The Service Plane is a network middleware that implements distributed signaling to coordinate the CP edge nodes and exports a service interface (USI) at a higher level of abstraction than the UNI.

The service interface

• The UPI is an interim human-to-human or machine-to-machine interface, mediated by the Management Plane (MP), that is currently used as a service interface.

• The USI is an evolved machine-to-machine interface that must enable the application entity to request services:
– provided by different administrative network domains
– without dealing with the network technology details
– without dealing with the network topology details

• The USI must support:
– both executive services on multiple administrative domains and informative services on an administrative domain
– the transparency of applications across multiple domains
– session-based services (e.g., high-definition video-telephony)
– non-session-based services (e.g., e-Business transactions)

On-Demand VPN via USI: Experimental demonstration

1 - VPN Service Request (B, C, bandwidth)
2 - VPN Config (router ID, groups name, VPN id, bandwidth)
3 - VPN Routing Configuration (local address, groups name, routing instance)
4 - Tunnel LSP Set-up (egress router 1, bandwidth)
5 - DSE ACK
6 - VPN ACK

[Figure: MPLS testbed (OSPF, RSVP-TE) with edge routers ER 1 (Router ID = 1.1.1.1), ER 2 (Router ID = 2.2.2.2) and ER 3 (Router ID = 3.3.3.3), core routers CR 1, CR 2 and CR 3, distributed service elements DSE 1, DSE 2 and DSE 3, and application entities AE-A, AE-B and AE-C at Sites A, B and C; messages 1 to 6 are exchanged to set up the VPN]

AE = Application Entity, ER = Edge Router, CR = Core Router, DSE = Distributed Service Element

Supporting functions for the SP

RESOURCE MONITORING

Network Topology: why?

1. Grid applications need the network topology to optimally allocate tasks among different sites.

2. A detailed topology detector is needed in order to satisfy QoS requirements.

So far: existing tools provide Grids with only end-to-end network parameters, which are not sufficient for guaranteed-bandwidth connection requests (LSP, VPN).

RESOURCE PROVISIONING

Path Computation Element (PCE): why?

Definition: an entity capable of computing a network path or route based on a network graph and applying computational constraints.

Advantages

1. Traffic Engineering (TE) route elaboration may be highly CPU-intensive. The PCE avoids router CPU utilization.

2. Optimal TE solutions, administrative policies and optimal management solutions

3. Useful in scenarios where the node has limited visibility of the network topology to the destination (multi-area, multi-domain, multi-layer)

INTEGRATED FAULT TOLERANCE

Combining network and application resilience mechanisms: why?

• Grid fault-tolerance schemes alone may not be as efficient as network resilience schemes

• An application-layer scheme may not restore the previous QoS connectivity in full

Centralized TDS

1. Topology request
2. USI Queries
3. XML Replies
4. XML Topology file

• Based on a central resource broker.

• Broker has the routers list and administrator privileges on them.

• Broker directly queries routers with router-based requests.

• Three kinds of topologies can be discovered: physical, MPLS and OSPF-TE

• The Grid topology is discovered or updated in a few seconds

TDS: XML Topologies and Retrieval Strategies

TDS Triggering Mechanisms

TIMEOUT BASED
• Periodic polling
• Delivery time < Timeout
• No active monitoring

EVENT-DRIVEN
• Network status changes: active network monitoring
• SNMP traps sent by VO nodes

TDS Update Methods

GLOBAL
• Refresh entire topology at each invocation
• Large number of messages exchanged

INCREMENTAL
• Update of existing topology
• Low network load
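The contrast between the two update methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a topology is modelled as a set of undirected links, and the names `link`, `diff`, `global_update` and `incremental_update` are hypothetical, not part of the TDS implementation.

```python
from typing import FrozenSet, Set, Tuple

Link = FrozenSet[str]  # an undirected link, identified by its two node IDs


def link(a: str, b: str) -> Link:
    """Build an undirected link between nodes a and b."""
    return frozenset((a, b))


def diff(old: Set[Link], new: Set[Link]) -> Tuple[Set[Link], Set[Link]]:
    """Return (added, removed) links between two topology snapshots."""
    return new - old, old - new


def global_update(discovered: Set[Link]) -> Set[Link]:
    """GLOBAL: drop the stored topology and keep the freshly discovered one."""
    return set(discovered)


def incremental_update(stored: Set[Link], added: Set[Link],
                       removed: Set[Link]) -> Set[Link]:
    """INCREMENTAL: apply only the reported changes to the stored topology."""
    return (stored - removed) | added


if __name__ == "__main__":
    stored = {link("A", "B"), link("B", "C")}
    fresh = {link("A", "B"), link("A", "C")}
    added, removed = diff(stored, fresh)
    # Both methods end with the same topology; the incremental one only
    # needs the two changed links rather than the full link set.
    print(incremental_update(stored, added, removed) == global_update(fresh))
```

In this toy model the incremental method transfers only the changed links, which is one way to read the "low network load" property listed above.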

Topologies

PHYSICAL TOPOLOGY
• Nodes queried: all
• Adjacency detection method: IP subnet matching
• XML information: nodes graph, physical and logical interfaces, link nominal bandwidth
• Protocol dependencies: -

MPLS TOPOLOGY
• Nodes queried: all
• Adjacency detection method: LSP info matching
• XML information: active LSPs, intermediate nodes (ERO), reserved bandwidth, load-balancing options
• Protocol dependencies: MPLS-RSVP

OSPF-TE TOPOLOGY
• Nodes queried: one single node
• Adjacency detection method: IP subnet matching
• XML information: nodes graph, RSVP link resources, OSPF areas, TE link metrics
• Protocol dependencies: OSPF-TE, RSVP
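As an illustration of adjacency detection by IP subnet matching, the following minimal sketch treats two nodes as adjacent when they each own an interface in the same IP subnet. The function name `detect_adjacencies` and the interface addresses are assumptions made for the example; the slides do not show how the TDS implements this step.

```python
import ipaddress
from collections import defaultdict
from itertools import combinations


def detect_adjacencies(interfaces):
    """Infer links from interface addresses.

    interfaces: iterable of (node_id, "address/prefixlen") pairs.
    Returns a set of (node_a, node_b) tuples with node_a < node_b.
    """
    nodes_by_subnet = defaultdict(set)
    for node_id, cidr in interfaces:
        # ip_interface keeps the host address and exposes its subnet
        subnet = ipaddress.ip_interface(cidr).network
        nodes_by_subnet[subnet].add(node_id)

    links = set()
    for nodes in nodes_by_subnet.values():
        # every pair of nodes sharing a subnet is considered adjacent
        for a, b in combinations(sorted(nodes), 2):
            links.add((a, b))
    return links


# Hypothetical interface list (addresses invented for illustration)
EXAMPLE_INTERFACES = [
    ("1.1.1.1", "10.10.12.1/30"),
    ("2.2.2.2", "10.10.12.2/30"),
    ("2.2.2.2", "10.10.23.1/30"),
    ("3.3.3.3", "10.10.23.2/30"),
]

if __name__ == "__main__":
    print(sorted(detect_adjacencies(EXAMPLE_INTERFACES)))
```

With point-to-point /30 subnets each subnet yields exactly one link; on a shared LAN segment the same rule would produce a full mesh among the attached nodes.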

Path Computation Element (PCE)

[Figure, steps 1 and 2: TED download (OSPF-TE), then XSLT elaboration of the TED into the topology used by the PCE]

TED (extract):

<ted-database junos:style="detail">
  <ted-database-id>10.10.14.1-1</ted-database-id>
  <ted-database-type>Net</ted-database-type>
  <ted-database-age>22648</ted-database-age>
  <ted-database-link-in>2</ted-database-link-in>
  <ted-database-link-out>2</ted-database-link-out>
  <ted-database-protocol>OSPF(0.0.0.0)</ted-database-protocol>
  <ted-link junos:style="database">
    <ted-link-to>10.10.13.2</ted-link-to>
    <ted-link-local-address>0.0.0.0</ted-link-local-address>
    <ted-link-remote-address>0.0.0.0</ted-link-remote-address>
    <ted-link-metric>0</ted-link-metric>
    <switching-capability-descriptor heading="ISCD(1):">
      <switching-type>Packet</switching-type>
      <encoding-type>Packet</encoding-type>
      <maximum-lsp-bw0>[0] 0bps</maximum-lsp-bw0>
      <maximum-lsp-bw1>[1] 0bps</maximum-lsp-bw1>
      <maximum-lsp-bw2>[2] 0bps</maximum-lsp-bw2>
      <maximum-lsp-bw3>[3] 0bps</maximum-lsp-bw3>
      <maximum-lsp-bw4>[4] 0bps</maximum-lsp-bw4>
      <maximum-lsp-bw5>[5] 0bps</maximum-lsp-bw5>
      <maximum-lsp-bw6>[6] 0bps</maximum-lsp-bw6>
      <maximum-lsp-bw7>[7] 0bps</maximum-lsp-bw7>
    </switching-capability-descriptor>
  </ted-link>
  …….
</ted-database>

Topology (extract):

<topology>
  <node>
    <node-id>10.10.1.1</node-id>
    <num-links>2</num-links>
    <link>
      <adj-node-id>10.10.2.1</adj-node-id>
      <available-bw7>1000</available-bw7>
    </link>
    <link>
      <adj-node-id>10.10.3.1</adj-node-id>
      <available-bw7>1000</available-bw7>
    </link>
  </node>
  …….
</topology>
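To show how a topology file of this shape could be consumed, here is a minimal Python sketch that loads it into an adjacency map. It assumes the element names shown in the extract (`node`, `node-id`, `link`, `adj-node-id`, `available-bw7`); the function name `parse_topology` is hypothetical, the unit of `available-bw7` is not stated in the slides, and the slides themselves describe a C elaboration for this step, not Python.

```python
import xml.etree.ElementTree as ET

# Sample built from the extract above, with the elided part left out
SAMPLE = """<topology>
  <node>
    <node-id>10.10.1.1</node-id>
    <num-links>2</num-links>
    <link>
      <adj-node-id>10.10.2.1</adj-node-id>
      <available-bw7>1000</available-bw7>
    </link>
    <link>
      <adj-node-id>10.10.3.1</adj-node-id>
      <available-bw7>1000</available-bw7>
    </link>
  </node>
</topology>"""


def parse_topology(xml_text):
    """Return {node_id: {adjacent_node_id: available_bandwidth}}."""
    graph = {}
    root = ET.fromstring(xml_text)
    for node in root.findall("node"):
        node_id = node.findtext("node-id")
        graph[node_id] = {
            # bandwidth kept as a float; its unit is not given in the source
            link.findtext("adj-node-id"): float(link.findtext("available-bw7"))
            for link in node.findall("link")
        }
    return graph


if __name__ == "__main__":
    print(parse_topology(SAMPLE))
```

A dictionary of dictionaries is used here because it maps directly onto the node and link nesting of the XML and is a convenient input for a path computation.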

[Figure, steps 3 to 5: C elaboration combines the topology and the LSP traffic matrix into an LP formulation; LP elaboration yields the LSP strict routes; router configuration follows]

PCE functions for optimal TE solution:

1, 2, 3 - Download of the relevant information from the TE Database, XSLT elaboration, C elaboration to produce the LP formulation
4 - PCE runs the LP formulation to identify the Label Switched Path (LSP) traffic allocation that minimizes the maximum link bandwidth utilization (least-fill policy)
5 - PCE configures the LSPs on every ingress router (strict routes)

Results show that the PCE is fast and achieves optimal bandwidth utilization compared with the CSPF algorithm run by the nodes.
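The least-fill objective of step 4 can be illustrated on a toy instance. The sketch below is not the LP formulation used by the PCE: to stay self-contained it swaps in an exhaustive search over a hypothetical list of candidate routes per LSP and picks the combination with the smallest maximum link utilization. The function name `least_fill`, the topology, the capacities and the demands are all invented for illustration.

```python
from itertools import product


def least_fill(capacity, demands):
    """Pick one route per LSP so that the most loaded link is as empty as possible.

    capacity: {(u, v): capacity}, undirected links keyed with sorted endpoints
    demands:  list of (bandwidth, [candidate routes]); a route is a node list
    Returns (best maximum utilization, tuple of chosen routes).
    """
    def links_of(route):
        # consecutive node pairs, normalised to the sorted-endpoint key
        return [tuple(sorted(pair)) for pair in zip(route, route[1:])]

    best_util, best_choice = float("inf"), None
    # enumerate every combination of one candidate route per LSP
    for choice in product(*(routes for _, routes in demands)):
        load = dict.fromkeys(capacity, 0.0)
        for (bandwidth, _), route in zip(demands, choice):
            for l in links_of(route):
                load[l] += bandwidth
        worst = max(load[l] / capacity[l] for l in capacity)
        if worst < best_util:
            best_util, best_choice = worst, choice
    return best_util, best_choice


# Hypothetical triangle network with two LSPs from A to C
CAPACITY = {("A", "B"): 1000, ("B", "C"): 1000, ("A", "C"): 1000}
DEMANDS = [
    (600, [["A", "C"], ["A", "B", "C"]]),
    (600, [["A", "C"], ["A", "B", "C"]]),
]

if __name__ == "__main__":
    print(least_fill(CAPACITY, DEMANDS))
```

In this instance, routing both LSPs on the direct link would overload it, while splitting them across the direct and the two-hop route keeps every link at 60% utilization. Exhaustive search grows exponentially with the number of LSPs, which is why a real PCE would rely on an LP or similar solver.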

Cooperative application-network QoS-Aware Fault Tolerance

• Assumption
– Qualified applications (e.g. visualization) require communication QoS guarantees
– QoS parameter: minimum bandwidth

• Objective
– Maximize recovered connections and minimize required network resources upon network link failure

• Possible approach
– Integrating QoS-unaware layer (application) and QoS-capable layer (network) fault tolerance, yielding QoS-aware integrated fault tolerance

• QoS-capable layer fault tolerance
– (G)MPLS path restoration

• Software layer fault tolerance
– Service replication (server migration)

Integrated Fault Tolerance Advantages: Path Restoration + Service Replication

[Figure: Client, Primary Video Server and Backup Video Server, connected by a Primary LSP, a Backup LSP, another primary LSP, and an LSP to the Backup Video Server]
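One possible way to combine the two mechanisms is sketched below. It assumes, as a simplification not stated in the slides, that path restoration towards the primary server is tried first and service replication (an LSP to the backup server) is used only when no surviving path meets the minimum bandwidth. The function names `find_path` and `recover`, the topology and the bandwidth values are hypothetical.

```python
from collections import deque


def find_path(links, src, dst, min_bw):
    """Breadth-first search over links whose residual bandwidth is >= min_bw."""
    adjacency = {}
    for (u, v), bw in links.items():
        if bw >= min_bw:
            adjacency.setdefault(u, []).append(v)
            adjacency.setdefault(v, []).append(u)
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


def recover(links, failed_link, client, primary, backup, min_bw):
    """Try network-layer restoration first, then application-layer replication.

    failed_link must use the same key orientation as the links dictionary.
    """
    surviving = {l: bw for l, bw in links.items() if l != failed_link}
    path = find_path(surviving, client, primary, min_bw)
    if path:
        return "path_restoration", path
    path = find_path(surviving, client, backup, min_bw)
    if path:
        return "service_replication", path
    return "unrecovered", None


# Hypothetical topology: residual bandwidth per link, arbitrary units
LINKS = {
    ("client", "r1"): 100,
    ("r1", "primary"): 100,
    ("r1", "r2"): 100,
    ("r2", "primary"): 10,
    ("r2", "backup"): 100,
}

if __name__ == "__main__":
    print(recover(LINKS, ("r1", "primary"), "client", "primary", "backup", 50))
    print(recover(LINKS, ("r1", "primary"), "client", "primary", "backup", 5))
```

With a minimum bandwidth of 50 the only surviving route to the primary server is too thin, so the connection is recovered by moving to the backup server; with a minimum of 5 plain path restoration is enough.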

Conclusions

• The problem of providing a connection-oriented service in a WAN environment to individual qualified applications (e.g. grid) has been addressed from an architectural point of view with regard to

– The Service Plane and service interface

– A Centralized Topology Discovery Service (TDS)

– Path Computation Element (PCE)

– Integrated resilience scheme

• But ☺

– People working on grid computing are mainly computer scientists

– People working on networks are telecommunication engineers

– Not easy to create a common view on the topic.


E-mail: [email protected]

Sant’Anna School & CNIT, CNR research area, Via Moruzzi 1, 56124 Pisa, Italy
