CERN
The European DataGrid Project
Technical status
www.eu-datagrid.org
Bob Jones (CERN), Deputy Project Leader
B. Jones - Nov 2002
DataGrid scientific applications
Developing grid middleware to enable large-scale usage by scientific applications
Earth Observation
• about 100 GBytes of data per day (ERS 1/2)
• 500 GBytes for the ENVISAT mission

Bio-informatics
• data mining on genomic databases (exponential growth)
• indexing of medical databases (TB/hospital/year)

Particle Physics
• simulate and reconstruct complex physics phenomena millions of times
• LHC experiments will generate 6-8 PetaBytes/year
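As a quick back-of-envelope check of the volumes quoted above, the annual LHC figure can be converted to a daily rate. This is a sketch, not from the slides; decimal units (1 PB = 10^6 GB) and a 365-day year are assumptions.

```python
# Convert the annual data volumes quoted on this slide into daily rates.
# Assumption: 1 PB = 10**6 GB (decimal units) and a 365-day year.

def pb_per_year_to_gb_per_day(pb_per_year: float) -> float:
    """Annual volume in petabytes -> daily rate in gigabytes."""
    return pb_per_year * 1_000_000 / 365

# LHC experiments: 6-8 PetaBytes/year
for pb in (6, 8):
    print(f"{pb} PB/year is about {pb_per_year_to_gb_per_day(pb):,.0f} GB/day")
```

At 6-8 PB/year this works out to tens of terabytes per day, which is why distributed storage and replication were central to the project.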
The Project
• 9.8 M Euros of EU funding over 3 years
• 90% for middleware and applications (HEP, Earth Observation and Biomedicine)
• three-year phased developments & demos (2001-2003)
• 21 partners in total: research and academic institutes as well as industrial companies
• extensions (time and funds) on the basis of first successful results:
  - DataTAG (2002-2003) www.datatag.org
  - CrossGrid (2002-2004) www.crossgrid.org
  - GridStart (2002-2004) www.gridstart.org
Research and Academic Institutes
• CESNET (Czech Republic)
• Commissariat à l'énergie atomique, CEA (France)
• Computer and Automation Research Institute, Hungarian Academy of Sciences, MTA SZTAKI (Hungary)
• Consiglio Nazionale delle Ricerche (Italy)
• Helsinki Institute of Physics (Finland)
• Institut de Fisica d'Altes Energies, IFAE (Spain)
• Istituto Trentino di Cultura, IRST (Italy)
• Konrad-Zuse-Zentrum für Informationstechnik Berlin (Germany)
• Royal Netherlands Meteorological Institute, KNMI (Netherlands)
• Ruprecht-Karls-Universität Heidelberg (Germany)
• Stichting Academisch Rekencentrum Amsterdam, SARA (Netherlands)
• Swedish Research Council (Sweden)
Assistant Partners
Industrial Partners
• Datamat (Italy)
• IBM-UK (UK)
• CS-SI (France)
EDG structure: work packages

The EDG collaboration is structured in 12 Work Packages:
WP1: Work Load Management System
WP2: Data Management
WP3: Grid Monitoring / Grid Information Systems
WP4: Fabric Management
WP5: Storage Element
WP6: Testbed and demonstrators
WP7: Network Monitoring
WP8: High Energy Physics Applications
WP9: Earth Observation
WP10: Biology
WP11: Dissemination
WP12: Management
(WP8-WP10 are the application work packages)
Project Schedule
• Project started on 1/1/2001
• TestBed 0 (early 2001): international test bed 0 infrastructure deployed; Globus 1 only, no EDG middleware
• TestBed 1 (now): first release of EU DataGrid software to defined users within the project: HEP experiments, Earth Observation, Biomedical applications
• Project successfully reviewed by the EU on March 1st 2002
• TestBed 2 (end 2002): builds on TestBed 1 to extend the facilities of DataGrid
• TestBed 3 (2nd half 2003)
• Project completion expected by end 2003
EDG middleware GRID architecture

[Architecture diagram: APPLICATIONS sit on top of the EDG middleware (M/W), which builds on GLOBUS; services are grouped into Grid, Fabric and Local Computing]

• Local Application / Local Database
• Grid Application Layer: Data Management, Job Management, Metadata Management, Service Index
• Collective Services: Information & Monitoring, Replica Manager, Grid Scheduler
• Underlying Grid Services: Computing Element Services, Authorization Authentication and Accounting, Replica Catalog, Storage Element Services, SQL Database Services
• Fabric services: Configuration Management, Node Installation & Management, Monitoring and Fault Tolerance, Resource Management, Fabric Storage Management
EDG interfaces

[Interfaces diagram: the same layered architecture, annotated with the external entities each layer interfaces to]

• Local Application / Local Database
• Grid Application Layer: Data Management, Job Management, Metadata Management, Object to File Mapping
• Collective Services: Information & Monitoring, Replica Manager, Grid Scheduler, Service Index
• Underlying Grid Services: Computing Element Services, Authorization Authentication and Accounting, Replica Catalog, Storage Element Services, SQL Database Services
• Fabric services: Configuration Management, Node Installation & Management, Monitoring and Fault Tolerance, Resource Management, Fabric Storage Management
• External entities: Scientists, Application Developers, System Managers, User Accounts, Certificate Authorities, Computing Elements, Storage Elements, Operating Systems, File Systems, Batch Systems (PBS, LSF, etc.), Mass Storage Systems (HPSS, Castor)
EDG overview: current project status
• EDG currently provides a set of middleware services:
  - job & data management
  - Grid & network monitoring
  - security, authentication & authorization tools
  - fabric management
• Runs on the Linux Red Hat 6.1 platform; site install & config tools and a set of common services are available
• 5 core EDG 1.2 sites currently belong to the EDG testbed: CERN (CH), RAL (UK), NIKHEF (NL), CNAF (I), CC-Lyon (F)
  - also deployed on ~15 other testbed sites via CrossGrid, DataTAG and national grid projects
  - actively used by application groups
• Intense middleware development is continuously going on, concerning:
  - new features for job partitioning and check-pointing, billing and accounting
  - new tools for Data Management and Information Systems
  - integration of network monitoring information into the brokering policies
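The last point above, folding network monitoring into brokering, can be illustrated with a toy ranking function. This is purely an illustrative sketch, not EDG code: the class, field names, weight and site names are all hypothetical.

```python
# Toy sketch of a resource broker that ranks computing elements (CEs)
# using both free CPUs and an estimated network cost to the input data.
# All names and the weighting are hypothetical, for illustration only.

from dataclasses import dataclass

@dataclass
class ComputingElement:
    name: str
    free_cpus: int        # as published via the information system
    network_cost: float   # e.g. estimated time to reach the input data

def rank(ce: ComputingElement, weight: float = 10.0) -> float:
    """Higher is better: reward free CPUs, penalise expensive data access."""
    return ce.free_cpus - weight * ce.network_cost

def choose(ces, weight: float = 10.0) -> ComputingElement:
    """Pick the best-ranked CE among the candidates."""
    return max(ces, key=lambda ce: rank(ce, weight))

candidates = [
    ComputingElement("ce.close.example", free_cpus=50, network_cost=0.2),
    ComputingElement("ce.far.example", free_cpus=80, network_cost=5.0),
]
print(choose(candidates).name)  # the nearby CE wins despite fewer free CPUs
```

The design point is simply that a CE with fewer free CPUs can still win the ranking if the job's input data is close to it.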
Testbed Sites

[Map of testbed sites: Dubna, Moscow, RAL, Lund, Lisboa, Santander, Madrid, Valencia, Barcelona, Paris, Berlin, Lyon, Grenoble, Marseille, Brno, Prague, Torino, Milano, BO-CNAF, PD-LNL, Pisa, Roma, Catania, ESRIN, CERN, IPSL, ESTEC, KNMI]
Tutorials
DAY 1
• Introduction to Grid computing and overview of the DataGrid project
• Security
• Testbed overview
• Job submission
• lunch
• hands-on exercises: job submission

DAY 2
• Data management
• LCFG, fabric mgmt & sw distribution & installation
• Applications and use cases
• Future directions
• lunch
• hands-on exercises: data mgmt

The tutorials are aimed at users wishing to "gridify" their applications using EDG software and are organized over 2 full consecutive days.

December: 2 & 3 - Edinburgh; 5 & 6 - Turin; 9 & 10 - NIKHEF

To date about 120 people have been trained.

http://hep-proj-grid-tutorials.web.cern.ch/hep-proj-grid-tutorials/
Through links with sister projects, there is the potential for a truly global scientific applications grid.
Demonstrated at IST2002 and SC2002 in November
Related Grid projects
Details of release 1.3
• Based on Globus 2.0beta, but with binary modifications taken from Globus 2.2:
  - large file transfers (GridFTP)
  - "lost" jobs (GASS cache)
  - unstable information system (MDS 2.2)
  - new Replica Catalog schema
• More reliable job submission: the Resource Broker returns errors if overloaded; stability tests successfully passed
• Minor extensions to JDL
• Improved data management tools:
  - GDMP v3.2: automatic & explicit triggering of staging to MSS; support for parallel streams (configurable)
  - edg-replica-manager v2.0: uses GDMP for MSS staging; shorter aliases for commands (e.g. edg-rm-l for edg-replica-manager-listReplicas); new file management commands: getBestFile, cd, ls, cat etc.; support for parallel streams (configurable)
• Better fabric management: bad RPMs no longer block installation
• Available on Linux RH 6. Not backward compatible with EDG 1.2
• Addresses bugs found by applications in EDG 1.2; being deployed in November
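The "parallel streams" feature mentioned for GDMP and edg-replica-manager amounts to splitting one transfer into several concurrent byte-range fetches that are reassembled at the destination. The following is a toy in-memory illustration of that idea, not GDMP or GridFTP code; all names are hypothetical.

```python
# Toy illustration of parallel-stream transfer: split one file into N byte
# ranges, fetch them concurrently, reassemble in order. An in-memory buffer
# stands in for the remote server; this is not GDMP/GridFTP code.

from concurrent.futures import ThreadPoolExecutor

REMOTE = bytes(range(256)) * 1000  # stand-in for the remote file

def fetch_range(start: int, end: int) -> bytes:
    """Fetch one byte range (a real client would issue a partial retrieve)."""
    return REMOTE[start:end]

def parallel_fetch(size: int, streams: int) -> bytes:
    """Download `size` bytes using `streams` concurrent range fetches."""
    chunk = -(-size // streams)  # ceiling division
    ranges = [(i * chunk, min((i + 1) * chunk, size)) for i in range(streams)]
    with ThreadPoolExecutor(max_workers=streams) as pool:
        parts = pool.map(lambda r: fetch_range(*r), ranges)
    return b"".join(parts)  # parts come back in submission order

assert parallel_fetch(len(REMOTE), streams=4) == REMOTE
```

Making the stream count configurable, as the release notes say, lets sites trade a single TCP stream's throughput limits against the extra load of many concurrent connections.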
Incremental Steps for testbed 2
1. Fix "show-stoppers" for application groups - middleware WPs (continuous)
2. Build EDG1.2.x with autobuild tools
3. Improved (automatic) release testing
4. Automatic installation & configuration procedure for pre-defined site
5. Start autobuild server for RH 7.2 and attempt build of release 1.2.x
6. Updated fabric mgmt tools
7. Introduce prototypes in parallel to existing modules: RLS, R-GMA
8. GLUE modified info providers/ consumers
9. Storage Element v1.0
10. Introduce Reptor
11. Add NetworkCost Function
12. GridFTP server access to CASTOR
13. Introduce VOMS
14. Improved Res. Broker
15. LCFGng for RH 7.2
16. Storage Element v2.0
17. Integrate mapcentre and R-GMA
18. Storage Element V3.0
Expect this list to be updated regularly
Prioritized list of improvements to be made to the current release, as established with users, from September through to the end of 2002
Plans for the future
• Further developments in 2003:
  - further iterative improvements to middleware, driven by users' needs
  - more extensive testbeds providing more computing resources
  - prepare EDG software for future migration to the Open Grid Services Architecture
• Interaction with the LHC Computing Grid (LCG): LCG intends to make use of the DataGrid middleware; LCG is contributing to DataGrid
• Testbed support and infrastructure: get access to more computing resources in HEP computing centres
• Testing and verification: reinforce the testing group and maintain a certification testbed
• Fabric management and middleware development
• New EU project: make plans to preserve the current major asset of the project, probably the largest Grid development team in the world; EoI for FP6 (www.cern.ch/egee-ei), possible extension of the project, etc.