19.07.2012, GRID2012, Dubna 1
GRID AND HPC SUPPORT
FOR NATIONAL PARTICIPATION
IN LARGE-SCALE COLLABORATIONS
Department of Computational Physics and Information Technologies,
'Horia Hulubei' National Institute for R&D in Physics
and Nuclear Engineering (IFIN-HH),
Magurele, Romania
M. Dulea, S. Constantinescu, M. Ciubancan,
T. Ivanoaica, C. Placinta, I.T. Vasile, D. Ciobanu-Zabet
Resource centres:
• 11 active sites hosted by 7 institutions
• Total number of cores: 5830
• Total disk capacity: 1.8 PB
Network infrastructure provided by
NREN: RoEduNet (min. 10 Gbps)
NATIONAL GRID INFRASTRUCTURE
WLCG VOs:
cover 98% of the national Grid production
alice - 54% (supported by 3 centres)
atlas - 41% (supported by 4 centres)
lhcb - 3% (supported by 3 centres)
(reported period 07.2011-06.2012)
2%:
envirogrids.vo.eu-egee.org - models & scenarios for the environment of the Black Sea Catchment
gridifin - provides a framework for running non-HEP applications, supported by the National Grid for Physics and Related Areas (GriNFiC)
see, seegrid - regional (South East Europe)
hone
ilc
SUPPORTED VOs AND GRID USAGE
Tier-2 Normalised CPU time
per country (HEPSPEC06-hours)
Romania: 1.58% of the total (10th position of 33 countries)
Total: 114,842,196
ALICE: 64,488,248 (6th)
ATLAS: 46,523,872
LHCb: 3,830,076
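The HEPSPEC06-hours figures above come from scaling raw CPU time by the benchmark rating of the cores the jobs ran on; a minimal illustrative sketch (the job records and per-core HS06 ratings below are made-up examples, not RO-LCG accounting data):

```python
# Illustrative sketch: normalised CPU time in HEPSPEC06-hours is the
# raw CPU time of each job scaled by the HS06 benchmark rating of the
# core it ran on.  Job records and ratings below are assumed examples.

def normalised_cpu_hours(jobs):
    """Sum cpu_hours * hs06_per_core over all job records."""
    return sum(j["cpu_hours"] * j["hs06_per_core"] for j in jobs)

jobs = [
    {"vo": "alice", "cpu_hours": 12.0, "hs06_per_core": 8.0},
    {"vo": "atlas", "cpu_hours": 30.0, "hs06_per_core": 10.0},
]

total = normalised_cpu_hours(jobs)
print(round(total, 1))  # 12*8 + 30*10 = 396.0
```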
Supported VOs and Grid usage
(07.2011-06.2012)
Tier-2 number of jobs
Romania: 3.63% of the total (9th position of 33 countries)
Total: 15,009,823
alice: 6,521,359 (1st)
atlas: 8,335,369 (9th); 90.6% CPU efficiency (4th)
lhcb: 153,095
Supported VOs and Grid usage
(07.2011-06.2012)
Year-over-year growth of normalised CPU time:

Years                    Tier-2   RO-LCG
2010-2011 / 2009-2010    1.74     2.16
2011-2012 / 2010-2011    1.44     2.65
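The growth factors in the comparison above are simple year-over-year ratios of the normalised CPU time totals; a sketch with hypothetical yearly totals (chosen for illustration, not the actual accounting figures):

```python
# Illustrative sketch: year-over-year growth factor of normalised CPU
# time, as in the Tier-2 / RO-LCG comparison.  Yearly totals below are
# hypothetical, not actual accounting figures.

def growth_factor(this_year, previous_year):
    """Ratio of this year's production to the previous year's."""
    return this_year / previous_year

# hypothetical normalised CPU time totals (HEPSPEC06-hours)
totals = {"2009-2010": 20_000_000, "2010-2011": 43_200_000}

print(round(growth_factor(totals["2010-2011"], totals["2009-2010"]), 2))
# 2.16
```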
[Project logos: HP-SEE (High-Performance Computing Infrastructure for South East Europe's Research Communities), WLCG (Worldwide LHC Computing Grid), HaPPSDaG (Processing of PetaByte-Scale Data @ FR Cloud), GriCeFCo]
• Coordination of the Romanian HPC consortium that participates in the HP-SEE project (FP7-RI-261499, 2010-2013)
• Coordination of the Romanian Tier-2 Federation RO-LCG, composed of 5 institutions that participate in the WLCG collaboration with CERN (since 2006)
• IRFU/CEA - IFIN-HH collaboration on Efficient Handling and Processing of Petabyte-Scale Data for the Computing Centres within the French Cloud (HaPPSDaG)
• Coordination of the National Grid for Research in Physics and Related Areas, built through EU structural funds (SOP IEC 2.2.3 - Grid)
• RO - LIT-JINR / Dubna collaboration under the Hulubei-Meshcheriakov programme (2005-2013), project: Optimization Investigations of the GRID and Parallel Computing Facilities at LIT-JINR and Magurele Campus
COLLABORATIONS AND PROJECTS
(DFCTI)
Computing centre:
• IFIN_Bio cluster: Myrinet 2000, 256 cores
• IFIN_BC cluster: Infiniband, 1040 cores
• APC InfraStruXure™ Data Center
GRID AND HPC INFRASTRUCTURE @ DFCTI
DFCTI hosts 3 grid sites:
RO-07-NIPNE ( alice, atlas, lhcb)
RO-11-NIPNE (lhcb)
IFIN GRID (GriNFiC)
DPM disk storage servers
2 CREAM CEs, VO-Box, etc.
Running atlas analysis jobs required a scalable solution for handling the increasing data transfers between the SE and the WNs.
A convenient solution was the stacking of switches, which provides the required minimum bandwidth when the number of simultaneous transfers grows.
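The sizing argument behind the switch stacking can be sketched numerically; the figures below (per-job streaming rate, uplink capacities, stack size) are illustrative assumptions, not measured DFCTI values:

```python
# Illustrative sketch: aggregate SE<->WN bandwidth demand grows linearly
# with the number of simultaneous analysis jobs, so a single uplink is
# eventually saturated while a switch stack keeps pace.
# All figures below are assumed, not measured values.

PER_JOB_MBPS = 40              # assumed average streaming rate of one job
SINGLE_UPLINK_MBPS = 10_000    # one 10 Gbps uplink
STACK_MBPS = 4 * 10_000        # e.g. a 4-switch stack, one uplink each

def saturating_job_count(capacity_mbps, per_job_mbps=PER_JOB_MBPS):
    """Number of simultaneous jobs that saturates the given capacity."""
    return capacity_mbps // per_job_mbps

print(saturating_job_count(SINGLE_UPLINK_MBPS))  # 250 jobs
print(saturating_job_count(STACK_MBPS))          # 1000 jobs
```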
[Diagram: network topology of the IFIN GRID and RO-07 sites]
LOCAL NETWORK TOPOLOGY
MONITORING
Provided at several levels:
1) utility and support equipment (electric power, UPS, cooling, etc.)
2) data traffic on the main switches
3) grid activity: number of jobs of different types, Grid production (normalised CPU time), traffic on the main servers (SE, GW, etc.)
All results are published on a common interface: http://www.nipne.ro/RO-07-NIPNE.html
SERVICE AVAILABILITY MONITORING
SAM provided by IFIN GRID, through ifops VO; results transferred to WMS and published by Nagios
ifops is dedicated to the monitoring of the sites of GriNFiC, including those of RO-LCG
The GUI allows comparison of the results of the ifops tests with those of the ops tests performed by the NGI and published on the EGI monitoring portal.
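The kind of aggregation such an interface performs can be sketched as follows; the probe names and result records are hypothetical, not the actual ifops test set:

```python
# Illustrative sketch: site availability as the fraction of passed
# probes, of the kind compared between two test sources (e.g. ifops
# vs NGI ops).  Probe names and statuses below are hypothetical.

def availability(results):
    """Fraction of probes with status 'OK' (0.0 if no results)."""
    if not results:
        return 0.0
    ok = sum(1 for r in results if r["status"] == "OK")
    return ok / len(results)

ifops_results = [
    {"probe": "CREAMCE-JobSubmit", "status": "OK"},
    {"probe": "SRM-Put", "status": "OK"},
    {"probe": "SRM-Get", "status": "CRITICAL"},
    {"probe": "WN-Rep", "status": "OK"},
]

print(f"{availability(ifops_results):.0%}")  # 75%
```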
from the sublime to the ridiculous ...
NETWORK SUPPORT
Extensive investigation of the connectivity in HaPPSDaG
Most of the RO-LCG resources are connected to the NREN's NOC through a 12 km aerial fibre-optic cable (10 Gbps).
perfSONAR tests make no sense when this connection is cut, which has begun to happen rather frequently.
Below: reaching 8.5 Gbps throughput during alice file transfers, just before the link was cut.
The development of the HPC infrastructure started in 2006 and benefited from the cooperation with LIT-JINR.
There are now four parallel clusters, based on Infiniband, Myrinet 2G, and GbE interconnect technologies.
The large-scale computations are performed on the IFIN_BC cluster (IBM Blade Center)
IFIN_BC configuration:

Server type       IBM QS22             IBM LS22           IBM HS22                          Total
CPU               IBM PowerXCell 8i    AMD Opteron 2376   Intel Xeon X5650
Clock frequency   3.2 GHz              2.3 GHz            2.67 GHz
Cores / CPU       1x PPE + 8x SPE      4                  6
L2 cache / CPU    512 KB               512 KB             6x 256 KB
FSB frequency     1066 MHz             1000 MHz           3200 MHz
HDD / node        8 GB SSD             76 / 146 GB SAS    500 GB SAS
RAM / node        32 GB                8 GB               24 / 36 / 48 GB
Total RAM         512 GB               80 GB              1908 GB (21x24 + 23x36 + 12x48)   2500 GB
Nodes             16                   10                 56                                82
CPUs              32                   20                 112                               164
Cores             32x PPE + 256x SPE   80                 672                               1040

Interconnect: Infiniband 4x QDR, 40 Gbps
HPC INFRASTRUCTURE
• Simulation and modeling of large biomolecular systems by means of molecular dynamics codes (e.g. NAMD) (collab. with Fac. of Biology / Univ. of Bucharest)
• Modeling drug - efflux pump inhibitors interaction dynamics: Activity modeling and simulation of efflux pump inhibitors based on advanced laser methods (collaboration with INFLPR)
• Dynamics of Bose-Einstein condensates (collaboration with IPB, Belgrade)
• Modeling of seismic events (INCDFP collaboration)
• Ab-initio investigation of charge transport in nanostructures (collab. with Physics Faculty / Univ. of Bucharest)
HPC SUPPORT FOR RESEARCH PROJECTS
High-Performance Computing Infrastructure for South East Europe’s Research Communities (HP-SEE) (2010-2013)
Coordinator: GRNET, Greece; 14 partners.
DFCTI leads the Romanian consortium:
• IFIN-HH • UVT • UPB (NCIT) • ISS
HPC SUPPORT FOR REGIONAL RESEARCH COMMUNITIES 1/2
Contribution to the realization of a common integrated HPC infrastructure in SEE (11 HPC resource centres).
Grid middleware is used in order to hide the various ways of accessing the HPC sites provided by the local resource management systems (LRMS).
Several middleware solutions can be used, and this access layer can also be generalized by application-specific graphical portal solutions.
Example: WS-PGRADE can communicate with all the EMI legacy middleware solutions: ARC, dCache, gLite, and UNICORE (https://www.shiwa-workflow.eu/wiki/-/wiki/Main/WS-PGRADE)
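The idea of hiding the per-site access method behind one layer can be sketched as a small dispatcher; the site catalogue and backend mapping below are simplified assumptions (a real WS-PGRADE or EMI setup involves far more):

```python
# Illustrative sketch of an access layer that hides the per-site LRMS:
# the caller always asks for submit_command(site, job_file); the
# dispatcher picks the backend-specific command line.
# The site-to-backend mapping below is an assumed example.

SITES = {
    "IFIN_BC": "pbs",        # PBS/Torque HPC cluster
    "PARADOX": "pbs",
    "RO-07-NIPNE": "glite",  # gLite/CREAM grid site
}

def submit_command(site, job_file):
    """Return the backend-specific submission command for a site."""
    backend = SITES[site]
    if backend == "pbs":
        return ["qsub", job_file]
    if backend == "glite":
        return ["glite-ce-job-submit", "-r", site, job_file]
    raise ValueError(f"no backend configured for {site}")

print(submit_command("IFIN_BC", "namd.pbs"))
# ['qsub', 'namd.pbs']
```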
HPC SUPPORT FOR REGIONAL RESEARCH COMMUNITIES 2/2
In the figure: the ISyMAB user can choose to work on IFIN_BC or on PARADOX @ IPB, Belgrade.
The working directory of the user on any cluster can be synchronized with the ISyMAB directory using the Sync buttons.
ISyMAB - Integrated System for Modeling and data Analysis of complex Biomolecules
The application provides a remote access framework for NAMD clusters, offering the users an integrated interface with analysis tools.
Users with access rights can launch jobs through PBS and execute shell scripts on various HP-SEE clusters.
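A PBS job of the kind launched through such an interface can be sketched as below; the queue name, resource request, and charmrun/namd2 invocation are illustrative assumptions, not the actual ISyMAB configuration:

```python
# Illustrative sketch: build a PBS batch script for a NAMD run of the
# kind launched through a portal like ISyMAB.  Queue name, resources
# and the charmrun/namd2 command line are assumed, not actual settings.

def namd_pbs_script(job_name, conf_file, nodes=4, ppn=8, queue="batch"):
    """Return the text of a PBS script running NAMD on nodes*ppn cores."""
    cores = nodes * ppn
    return "\n".join([
        "#!/bin/bash",
        f"#PBS -N {job_name}",
        f"#PBS -q {queue}",
        f"#PBS -l nodes={nodes}:ppn={ppn}",
        "cd $PBS_O_WORKDIR",
        f"charmrun +p{cores} namd2 {conf_file} > {job_name}.log",
    ])

script = namd_pbs_script("apoa1", "apoa1.namd")
print(script)
```

A portal would write this text to a file in the user's working directory and pass it to `qsub`.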
DFCTI will use a new building (under construction): 316 sqm data centre, electric power: 750 kW
PROSPECTS 1/2
SHORT- AND MID-TERM TASKS
Realization of a redundant network connection (in collaboration with the NREN), improving availability, and integration in LHCONE
Federated storage for the atlas sites RO-07-NIPNE, RO-02-NIPNE
Simplifying the structure, operation, and management of the RO-LCG data centres
LONG-TERM TASKS:
Providing computing support for large-scale collaborations, on the long term, based on international agreements:
• WLCG: implementing the post-LS1 strategy
• ELI-NP project (Extreme Light Infrastructure - Nuclear Physics, http://www.eli-np.ro);
• FAIR-GSI experiments (CBM, PANDA);
• ITER
• EURATOM programme, Integrated Tokamak Modelling Task Force (ITM-TF); etc.
PROSPECTS 2/2
During the first years of LHC data taking, RO-LCG provided better production results than the Tier-2 average
Scalable solutions were designed and implemented for improving the transfer, processing and storage of large datasets at the DFCTI center in the framework of the computing support of the ATLAS experiment at LHC-CERN.
A significant contribution to the performance management of the Grid system came from the NGI-independent set of tools implemented for monitoring data transfer, storage efficacy, and service availability in the resource centres; this avoids possible inconveniences related to the end of EGI support.
An integrated system for modeling, production runs, and data analysis of complex biomolecules was developed for the molecular dynamics simulations performed in the framework of the HP-SEE infrastructure.
Urgent measures are required to improve the network connectivity, in order to participate in LHCONE.
CONCLUSIONS
INVITATION: RO-LCG CONFERENCE
Grid, Cloud & High Performance Computing in Science 25-27.10.2012
www.itim-cj.ro/rolcg2012
Deadlines:
15.08.2012: full paper submission
• National contribution to the development of the LCG computing grid for elementary particle physics, project funded by the National Authority for Scientific Research (ANCS) under contract 8EU/2012
• CEA - IFA Partnership, R&D project: Efficient Handling and Processing of Petabyte-Scale Data for the Computing Centres within the French Cloud, contract C1-06/2010, co-funded by ANCS
• Collaboration with LIT-JINR / Dubna in the framework of the Hulubei-Meshcheriakov programme, project: Optimization Investigations of the Grid and Parallel Computing Facilities at LIT-JINR and Magurele Campus
• FP7 HP-SEE project: High-Performance Computing Infrastructure for South East Europe's Research Communities, contract FP7-RI-261499, www.hp-see.eu
ACKNOWLEDGEMENTS