EU DataGrid Project TestBed Status and Plans
Bob Jones
EU DataGrid Technical Coordinator
CERN
Project Schedule
EU contract signed on December 29th, 2000
TestBed 0 (early 2001)
  International test bed 0 infrastructure deployed
  Globus 1 only - no EDG middleware
TestBed 1 (now)
  First release of EU DataGrid software based on Globus 2, delivered to defined users within the project:
  HEP experiments (WP 8), Biology applications (WP 9), Earth Observation (WP 10)
Project Review by EU (1st March 2002)
  External reviewers inspect deliverables & demo, partners' contributions etc.
TestBed 2 (Sept-Oct. 2002)
  Builds on TestBed 1 to extend the facilities of DataGrid
TestBed 3 (March 2003) & 4 (Sept 2003)
Grid aspects covered by EDG testbed 1
VO servers: LDAP directory for mapping users (with certificates) to the correct VO
Storage Element: Grid-aware storage area, situated close to a CE
User Interface: Submit & monitor jobs, retrieve output
Replica Manager: Replicates data to one or more CEs
Job Submission Service: Manages submission of jobs to the Resource Broker
Replica Catalog: Keeps track of multiple data files "replicated" on different CEs
Information Index: Provides info about grid resources via the GIIS/GRIS hierarchy
Information & Monitoring: Provides info on resource utilization & performance
Resource Broker: Uses the Information Index to discover & select resources based on job requirements
Grid Fabric Management: Configures, installs & maintains grid software packages and environment
Logging and Bookkeeping: Collects resource usage & job status
Network performance, security and monitoring: Provides efficient network transport, security & bandwidth monitoring
Computing Element: Gatekeeper to a grid computing resource
Testbed administration: Certificate authorities, user registration, usage policy etc.
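The Information Index entries above are served by Globus MDS, which is an LDAP directory, so the GRIS/GIIS hierarchy can be browsed with standard LDAP tools. A minimal sketch (the fully-qualified host name is an assumption based on the Info Service node listed later; port 2135 and the "mds-vo-name=local, o=grid" base are the usual MDS 2.x defaults):

# Query a GRIS/GIIS for everything it publishes about local grid resources
# (host name assumed for illustration; any CE or information-index node could be used)
ldapsearch -x -H ldap://lxshare0225.cern.ch:2135 -b "mds-vo-name=local, o=grid" "(objectclass=*)"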
TestBed 1 layout at CERN
Diagram: machine assignments
  GateKeeper: lxshare0220; Worker Node: lxshare0221; Storage Element: lxshare0219
  GateKeeper: lxshare0223; Worker Node: lxshare0224; Storage Element: lxshare0222
  Batch systems: LSF (initially fork) and PBS
  Information Service (MDS/ftree): lxshare0225
  User Interface: testbed006
  Replica Catalog: testbed0226
  Resource Broker / Job Submission Service: testbed011
  Logging & Bookkeeping: testbed012
  LDAP servers for VOs: NIKHEF
Testbed usage to date

Physicists from the LHC experiments submitting jobs with their application software that uses:
  User Interface (job submission language etc.)
  Resource Broker & Job Submission Service
  Information Service & Monitoring
  Data Replication
Figure: Generic HEP application flowchart. Generate raw events on local disk; add the lfn/pfn to the Replica Catalog; read raw events and write dst events (copying raw data from the SE to local disk first if the pfn is not local); add the new lfn/pfn to the Replica Catalog; optionally move the output to the Storage Element and/or Mass Storage; write a logbook (raw_xxxxxx_dat.log, dst_xxxxxx_dat.log) on the client node. Job arguments: data type (raw/dst), run number, number of events, number of words per event, Replica Catalog flag (0/1), Mass Storage flag (0/1).
[reale@testbed006 JDL]$ dg-job-submit gridpawCNAF.jdl
Connecting to host testbed011.cern.ch, port 7771
Transferring InputSandbox files... done
Logging to host testbed011.cern.ch, port 15830
========= dg-job-submit Success =========
The job has been successfully submitted to the Resource Broker.
Use dg-job-status command to check job current status.
Your job identifier (dg_jobId) is:
https://testbed011.cern.ch:7846/137.138.181.253/185337169921026?testbed011.cern.ch:7771
=========================================
[reale@testbed006 JDL]$ dg-job-get-output https://testbed011.cern.ch:7846/137.138.181.253/185337169921026?testbed011.cern.ch:7771
Retrieving OutputSandbox files...done
============ dg-get-job-output Success ============
Output sandbox files for the job:
- https://testbed011.cern.ch:7846/137.138.181.253/185337169921026?testbed011.cern.ch:7771
have been successfully retrieved and stored in the directory:
/sandbox/185337169921026
First simulated ALICE event generated using the DataGrid Job Submission Service
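For reference, the input to dg-job-submit is a small JDL (Job Description Language, ClassAd-style) file. The sketch below is illustrative only; the file name, executable and sandbox contents are invented and are not the actual gridpawCNAF.jdl used in the session above:

// hello.jdl - minimal illustrative EDG job description
Executable    = "/bin/hostname";
StdOutput     = "std.out";
StdError      = "std.err";
OutputSandbox = {"std.out", "std.err"};

Submission and retrieval then follow the same pattern as the transcript: dg-job-submit hello.jdl, dg-job-status <dg_jobId>, dg-job-get-output <dg_jobId>.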
Groups Involved in Testbed 1
Middleware WPs (WP1-WP5, WP7): Elisabetta Ronchieri, Shahzad Muzaffar, Alex Martin, Maite Barroso Lopez, Jean Philippe Baud, Frank Bonnassieux

Integration team (WP6): Brian Coghlan, Flavia Donno, Eric Fede, Fabio Hernandez, Nadia Lajili, Charles Loomis, Pietro Paolo Martucci, Andrew McNab, Sophie Nicoud, Yannik Patois, Anders Waananen

Applications (WP8, WP9, WP10): PierGiorgio Cerello, Eric Van Herwijnen, Julian Lindford, Andrea Parrini, Yannick Legre

VO Admins: ALICE: Daniele Mura; ATLAS: Alessandro de Salvo; CMS: Andrea Sciaba; LHCb: Joel Closier; EO: Yannick Legre; Bio-Info: John van de Vegte

Experiment contacts: ALICE: Ingo Augstin & Steve Burke; ATLAS: Mario Reale & Alessia Tricomi; CMS & WP10: JJ Blaising; LHCb: Jeff Templon

Users:
  ALICE: Piergiorgio Cerello, Yves Schutz, Roberto Barbera, Predrag Buncic, Federico Carminati, Marisa Luvisetto, Daniele Mura, Fons Rademakers, Mario Sitta
  CMS: Claude Charlot, Dave Newbold, Andrea Sciaba, Claudio Grandi, Igor Semeniouk, Paolo Capiluppi, Olga Kodolova, N. Kruglov, A. Kryukov, L. Shamardin, V. Kolosov, E. Tikhonenko, V. Mitsyn, Marco Verlato, Massimo Sgaravatto, A. Edunov, B. Berdnikov
  LHCb: Eric van Herwijnen, J. Closier, G. Patrick, D. Galli, V. Vagnoni, S. Klous, H. van Bulten, G. D. Patel, N. Brook, A. Khan, F. Harris, D. McPherson
  ATLAS: Silvia Resconi, Craig Tull, Oxana Smirnova, Stan Thompson, Fairouz Ohlsson-Malek

Site Admins: CERN: Markus Schulz; Lyon: Fabio Hernandez; CNAF: A. Chierici et al.; NIKHEF: Jeff Templon & David Groep; RAL: B. Saunders
Security Certificates
The project software supports 12 Certification Authorities from the various partners involved in the project
http://marianne.in2p3.fr/datagrid/ca/ca-table-ca.html
For a machine to participate as a Testbed 1 resource, all the CAs must be enabled.
All CA certificates can be installed without compromising local site security
Each host running a Grid service needs to be able to authenticate users and other hosts
The site manager has full control over security for local nodes
A Virtual Organisation (VO) represents a community of users
6 VOs: 4 HEP (ALICE, ATLAS, CMS, LHCb), 1 EO, 1 Biology
Usage guidelines
Account Registration
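A minimal sketch of how these pieces fit together on a testbed node, assuming the standard Globus 2 mechanisms used by EDG (the certificate subject and pool account below are invented for illustration):

# On the User Interface: create a short-lived proxy from the user certificate
# issued by one of the supported CAs (reads ~/.globus/usercert.pem and userkey.pem)
grid-proxy-init -hours 12

# On a Gatekeeper or Storage Element: /etc/grid-security/grid-mapfile maps
# authenticated certificate subjects onto local accounts, e.g. a VO pool account.
"/O=Grid/O=CERN/OU=cern.ch/CN=Jane Doe" .alice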
Node configuration and installation tools

EDG testbed 1 RPMs, grouped by category: LCFG, external packages, external services, Globus, EDG configuration, security, middleware, applications.

Diagram: LCFG architecture. On the server node, mkxprof compiles the LCFG configuration files into one XML profile per client node, published by a web server. Each client node fetches its profile over HTTP (rdxprof/ldxprof), stores it in a DBM file, and the LCFG components (generic components) apply the configuration locally.
For reference platform (Linux RedHat 6.2)
Initial installation tool using system image cloning
LCFG (Edinburgh University) for software updates and maintenance
Total of ~750 RPMs; with a 10 Mbit/sec link it takes just ~10 minutes to install a node
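As a rough consistency check (assuming the ~750 RPMs total on the order of 750 MB, i.e. roughly 1 MB per package on average, an assumption not stated in the slides): 10 Mbit/s is about 1.25 MB/s, and 1.25 MB/s x 600 s is about 750 MB, which matches a full node install in roughly 10 minutes.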
Lessons learnt from testbed 1
The raw ingredients exist – we just need to be sure of the recipe
  Sufficient expertise exists in the different institutes to cover all aspects of the project
  Expertise and enthusiasm need to be channeled using an agreed framework
CERN's central role was underestimated and under-resourced
Integration and deployment is a labour-intensive task
  More planning needed & WP6 (software integration) needs reinforcement (especially at CERN)
  Better done in small steps using iterative releases
Support by and relationship with the Globus developers is very important
International aspects
  Already an international testbed; need to agree plans with similar US activities
  Underestimated the administrative effort involved in running an international testbed
Need more emphasis on testing
  More unit & integration testing
  Middleware WPs need to develop a test plan (also WP6 for external packages & integration tests) and involve applications from an early stage
Iterative Releases
Planned intermediate release schedule
TestBed 1: October 2001
Release 1.1: January 2002
Release 1.2: March 2002
Release 1.3: May 2002
Release 1.4: July 2002
TestBed 2: September 2002
Each release includes:
  feedback from use of the previous release by application groups
  agreed high-priority improvements/extensions
  use of software infrastructure
  feedback feeds into the architecture group
A similar schedule will be organised for 2003
Software Release Procedure
Coordination meeting: gather feedback on the previous release; review the plan for the next release
WP meeting: take the basic plan and clarify effort/people/dependencies
Software development: performed by the WPs in dispersed institutes; unit tests run
Software integration: performed by WP6 on frozen software; integration tests run
Acceptance tests: performed by the Loose Cannons et al.
Roll-out: present the software to the application groups; deploy on the testbed
Diagram: release cycle. The coordination meeting takes release feedback and produces the next release plan; WP meetings (WP1, WP3, WP7, ...) deliver their components, which are integrated with Globus into an EDG release; the distributed EDG release is presented at the roll-out meeting, and feedback flows into the following release plan. TestBed 1 roll-out meeting: Dec 11 2001, ~100 participants.
Development & Production testbeds
Development
  The initial set of 5 sites will keep a small cluster of PCs for development purposes, to test new versions of the software, configurations etc.
Production
  A more stable environment for use by application groups: more sites, more nodes per site (growing to a meaningful size at major centres), more users per VO
  Usage already foreseen in the Data Challenge schedules for the LHC experiments; harmonize release schedules
Participating in InterGrid discussions on testbed organisation (Antonia Ghiselli, Bob Jones, Francesco Prelz)
Analysis of interface with US testbeds to be performed by end of April (GriPhyN/PPDG meeting)
Future Plans
Prepare for the first EU project review on 1st March 2002 (“managed panic”)
Expand testbed
More nodes per site, more sites (including US), more users
Evolve architecture and software on the basis of TestBed usage and feedback from users
Closer integration of the software components
Improve software infrastructure toolset and test suites
Look for convergence with PPDG/GriPhyN architecture
Enhance synergy with US via DataTAG-iVDGL and InterGrid
Address shortcomings in plan by collaborating with other EU projects (DataTAG, GridSTART, CrossGrid)
Promote early standards adoption through participation in GGF WGs
Final software release by end of 2003