Marco Verlato, INFN 23 March, 2011 ISGC2011/OGF31, Taipei,Taiwan Interoperability solutions in India...


Marco Verlato, INFN

23 March, 2011

ISGC2011/OGF31, Taipei, Taiwan

Interoperability solutions in India


EU-IndiaGrid2 Consortium

GARUDA National Grid Computing Initiative in Operational Phase

• To collaborate on research and engineering of technologies, architectures, standards and applications in Grid computing

• To contribute to the aggregation of resources in the Grid

• Spans 17 cities, with more than 45 participating research and academic institutions connected through NKN


EUIG2 gLite production infrastructure and resources

• Centralized Services for euindia VO

– Maintained by INFN @ Padova and Bologna

• VOMS server

• LFC server

• WMS/LB

• Top Level BDII

• Monitoring & accounting

• 13 production sites (>10k CPU cores) currently involved (see next slide) where users can run jobs

• X.509 certificates issued by IGCA are accepted


• Computing & Storage resources last month


> 30 CPU-years, > 8500 jobs


GARUDA MW status

• One-year stop between EUIG and EUIG2

– In the meantime, big changes in GARUDA:

• From GT2 to GT4 WS

• From Moab to GridWay metascheduler

• Abandoning SRB, going to SRM

– GARUDA production now:

• Based on Globus 4.x

• Customized GridWay metascheduler

• MDS4, Ganglia, Nagios for information management

• GridFTP + SRM for data management

• GSI-based security

• VOMS for VO support


GARUDA-EGI MW components analysis

[Diagram: components of the two middleware stacks. Labels: Portal UI, GridWay, Globus, WMS, ICE, JC+LM, lcg-CE (GT2 based), GARUDA-CE (GT4 based), CREAM-CE, BDII, MDS, GSI+VOMS (on both sides), LFC, SRM (on both sides), PBS-like LRMS, WN-1 ... WN-n]

EUIG2 Interoperability steps (I)

• Work started on both sides

– CDAC and ICTP/INFN contributions

– First example of interoperability made at CDAC-Bangalore in July 2010

• Key element: GridWay, the metascheduler adopted by GARUDA

– Plugin structure designed to facilitate interoperability among different MW layers

– Provides a unified pool of information about compute resources through its MADs (Middleware Access Drivers)

– According to the requirements and VO privileges, users can choose the candidate resource for the application
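The plugin idea above can be illustrated with a minimal sketch. It is not GridWay code and does not use GridWay's actual MAD interface: the `Resource` fields, the `AccessDriver` protocol, the `StaticDriver` class and all host names are hypothetical, chosen only to show how per-middleware drivers could feed one unified pool that is then filtered by requirements and VO privileges.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Resource:
    """One compute resource as seen by the metascheduler (hypothetical schema)."""
    host: str
    middleware: str          # e.g. "GT4" or "GT2"
    free_cpus: int
    allowed_vos: frozenset   # VOs permitted to run on this resource


class AccessDriver(Protocol):
    """Hypothetical stand-in for a middleware access driver."""
    def discover(self) -> list: ...


class StaticDriver:
    """Toy driver returning a fixed list instead of querying MDS or BDII."""
    def __init__(self, resources):
        self._resources = list(resources)

    def discover(self):
        return list(self._resources)


def unified_pool(drivers):
    """Merge what every driver reports into a single pool."""
    return [r for d in drivers for r in d.discover()]


def candidates(pool, vo, min_cpus):
    """Keep resources the VO may use that meet the CPU requirement, most free CPUs first."""
    ok = [r for r in pool if vo in r.allowed_vos and r.free_cpus >= min_cpus]
    return sorted(ok, key=lambda r: r.free_cpus, reverse=True)


# Assumed example data: one GT4-style resource and two GT2-style resources.
garuda = StaticDriver([
    Resource("garuda-ce.example", "GT4", 64, frozenset({"garuda", "euindia"})),
])
egi = StaticDriver([
    Resource("lcg-ce.example", "GT2", 16, frozenset({"euindia"})),
    Resource("other-ce.example", "GT2", 128, frozenset({"othervo"})),
])

pool = unified_pool([garuda, egi])
chosen = candidates(pool, vo="euindia", min_cpus=8)
print([r.host for r in chosen])
```

In this toy setup a user in the `euindia` VO sees candidates from both middleware stacks, while the resource reserved for another VO is filtered out. Adding support for a further middleware would only require another driver object.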


GridWay metascheduler


GARUDA-EGI interoperation architecture

[Diagram: GARUDA user and EGI user submit through GridWay, which uses MADs to reach lcg-CE (GT2 based), GARUDA-CE (GT4 based) and CREAM-CE, each in front of a PBS cluster. Other labels: BDII, MDS, GSI+VOMS]

EUIG2 Interoperability steps (II)

• Key element: GridSEED VM environment

• A complete Grid environment based on VirtualBox, developed by ICTP

• Consists of virtual instances of gLite CE, SE, WN, VOMS, CA, BDII, WMS, LB, LFC and MILU (Miramare Lightweight UI)

• The experience at CDAC-Bangalore led to a new version of GridSEED, which now integrates GridWay into MILU, plus a GT4.0.x gateway (Globus CE/GARUDA CE)

• Successful demo at OGF30 in Brussels, October 2010: submission, from MILU, of the same job to both worlds (gLite & GARUDA)


OGF30 Demo Scenario

[Diagram: GridSEED demo testbed. Hosts: GT4 CE (globus.grid.seed), lcg CE (ce-3.grid.seed), CREAM CE (ce-2.grid.seed), MiLU UI (milu-i386.grid.seed), VOMS (master.grid.seed), Top BDII (central-1.grid.seed), two PBS servers, CREAM WN, LCG WN. Legend: GridWay information system, proxy, job submission, work in progress]


EUIG2 Interoperability steps (III)

• A collaboration with IGE was also established at OGF30:

» VOMS integration into Globus;

» GridWay (that is an IGE supported component);

» IGE developed LoadLeveler batch adaptor for GT5;

» Globusonline.eu to orchestrate file transfer;

» LTA can tunnel ports through firewalls via (gsi)ssh;

» Replacing RFT;

» OGSA-DAI is a supported component in IGE

• GT4 phase-out issue: GARUDA is not planning to move its production infrastructure to GT5 in the short term


EUIG2 Interoperability steps (IV)

• EUINDIA workshop in Delhi, December 2010:

- Successfully demonstrated that the RegCM4 application can run on the two infrastructures, with GridWay configured to talk to both GARUDA and LCG-CE (GridSEED based)

- Submitted RegCM4 with a small data set to a GARUDA resource (parallel RegCM4) and to LCG-CE (serial RegCM4)

- Defined plans for moving from demo to production infrastructure

- Network interoperability issues identified and solved at the workshop
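The parallel/serial split used in the demo can be pictured with a small sketch. This is only an illustration under assumptions: the `Target` fields, the `plan_job` helper, the executable names and the process counts are all hypothetical and do not come from the workshop setup or from any GridWay or gLite API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    """A submission target and what it can run (hypothetical description)."""
    name: str
    supports_mpi: bool
    max_procs: int


def plan_job(app, target, wanted_procs):
    """Build a job description: parallel where MPI is available, serial otherwise."""
    if target.supports_mpi and wanted_procs > 1:
        procs = min(wanted_procs, target.max_procs)
        return {"target": target.name, "executable": f"{app}_mpi",
                "mode": "parallel", "procs": procs}
    return {"target": target.name, "executable": f"{app}_serial",
            "mode": "serial", "procs": 1}


# Assumed capabilities: the GARUDA side runs MPI jobs, the LCG-CE side runs serial jobs.
targets = [Target("garuda-ce", True, 32), Target("lcg-ce", False, 1)]
plans = [plan_job("regcm4", t, wanted_procs=8) for t in targets]
for plan in plans:
    print(plan)
```

The same application request thus yields two different job descriptions, one per infrastructure, which is the pattern the RegCM4 demonstration followed.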


Network activities

• Commissioned NKN Full Phase (1 billion Euro/10 years)

• TEIN3 POP, Mumbai @ ERNET, GARUDA on ERNET IPs

• TEIN3 POP moved to NKN

– Issue with connectivity performance

– EUIG2 had a prime role in solving the problem

• GARUDA moved to NKN but kept ERNET public IPs

– Issues contacting services from Europe, which is essential for interoperability with GARUDA

– Problem: routing ERNET IPs through the NKN infrastructure; solving it was a major achievement of EUIG2 @ Delhi Workshop


Planned ahead

• Supporting CREAM-CE through GridWay

- A CREAM-CE MAD for GridWay has been developed

- Integration and testing are in progress

• MPI support for interoperable infrastructures

- GARUDA supports MPI applications through GridWay

- gLite plans support for whole-node reservation

• Integrating the production infrastructures of GARUDA and EGI
