23-06-2003, L. Perini, CSN1
ATLAS Italy Computing
Status and plans: the role of LCG
No funding is requested now (it will be in September)
Layout
• Software status and plans
– Slides selected from D. Quarrie's presentation at the GDB on 10 June
• Past and future of the Data Challenges
– Slides selected from G. Poulard's presentation at the GDB on 10 June
• Role of LCG and fallbacks
– Slides produced by the Computing Coordinator for this meeting, agreed within the group of ATLAS representatives in LCG
GDB Meeting - 10 June 2003
ATLAS Offline Software
David R. Quarrie
Lawrence Berkeley National Laboratory
Software Overview
• ATLAS is in the closing stages of the transition from FORTRAN-based software to C++-based software
– For DC-0 & DC-1 simulation was based on Geant3
– For DC-2 it will be based on Geant4
– ATLAS has been very active in validating Geant4
• Common Framework (Athena) based on collaboration with LHCb
• First version of C++ reconstruction in place
– Used in Level 2 and Event Filter as well as offline
• First major design iteration underway
• Functionality and robustness are already good (>10^6 events in DC-2)
• Performance in some areas needs work
Athena
• Based on concepts of Components: Services, Algorithms and Tools
• Highly modular and flexible
• Good mapping to GRID services
• Based on abstract interfaces - no direct coupling with algorithms
• Compatible with non-GRID environment (e.g. laptop)
• Integration with interactive scripting language (Python)
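
The component idea listed for Athena (services and algorithms coupled only through abstract interfaces) can be illustrated with a minimal Python sketch. This is not Athena code: every class and function name here (`Service`, `InMemoryStore`, `Algorithm`, `SumEnergy`, `run`) is hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class Service(ABC):
    """Abstract interface: algorithms see only this, never a concrete class."""

    @abstractmethod
    def fetch(self, key: str) -> list[float]:
        ...


class InMemoryStore(Service):
    """Hypothetical stand-in for a local (laptop) data service."""

    def __init__(self, data: dict[str, list[float]]) -> None:
        self._data = data

    def fetch(self, key: str) -> list[float]:
        return self._data[key]


class Algorithm(ABC):
    """An algorithm receives its services by name, not by concrete type."""

    def __init__(self, services: dict[str, Service]) -> None:
        self.services = services

    @abstractmethod
    def execute(self) -> float:
        ...


class SumEnergy(Algorithm):
    """Toy algorithm: sums the values stored under one key."""

    def execute(self) -> float:
        return sum(self.services["store"].fetch("energies"))


def run(algorithms: list[Algorithm]) -> list[float]:
    """Minimal sequencer: execute each algorithm once, in order."""
    return [alg.execute() for alg in algorithms]


store = InMemoryStore({"energies": [1.5, 2.5, 4.0]})
results = run([SumEnergy({"store": store})])
print(results)
```

Because `SumEnergy` depends only on the `Service` interface, a Grid-backed implementation could in principle replace `InMemoryStore` without touching the algorithm, which is the decoupling the slide describes.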
Software Distribution (2/2)
• A recurring problem is compatibility with external software
– e.g. POOL/SEAL need their own set of external software (e.g. Boost)
– Still grappling with obvious (e.g. incompatible versions) and not so obvious (e.g. compilation/configuration flags) problems
• This is still an area requiring more work to minimize ATLAS-specific external packages and take advantage of e.g. LCG common installations
Release 7.0.0 - 31 July 2003
• LCG Component Integration
• POOL/SEAL
• Geant4 Integration
• Pile-up Infrastructure in place
• All detectors supported
• Detector Description Integration
• Reconstruction and G4 Simulation from common geometry
• Calibration/alignment infrastructure in place
• Begin to incorporate feedback from Reco Task Force
• New Reco EDM
Release 8.0.0 (Feb 2004)
• Dual targets
– DC-2 (Q2-Q3 2004)
– Combined Testbeams (Q2-Q3 2004)
• A major focus is consolidation from 7.0.0
– Robustness, house-cleaning
– Performance
• G4 validated for production
• Full integration from RecoTaskForce designs/recommendations
• Interactive as well as batch
• Replacement of jobOptions files by Python scripts
• GRID integration
GRID Projects
• Multiple prototypes developed in conjunction with data challenges
– Both European and USA
– Magda, AMI, Grappa, etc.
– Some overlapping functionality, but necessary to explore
• Distributed Physics Analysis Projects developing
– GANGA, DIAL, Chimera
• Goal is to bring these under a coherent umbrella by end of Q3 2003 ready for DC-2
Towards ATLAS Data Challenges 2
LCG-GDB, 10th June 2003
Gilbert Poulard
ATLAS Data Challenges Co-ordinator
CERN EP-ATC
ATLAS DC1 (July 2002-April 2003)
• Primary concern was delivery of events to High Level Trigger (HLT) and to Physics communities
– HLT-TDR due by June 2003
– Athens Physics workshop in May 2003
• Put in place the full software chain from event generation to reconstruction
– Switch to AthenaRoot I/O (for Event generation)
– Updated geometry
– New Event Data Model and Detector Description
– Reconstruction (mostly OO) moved to Athena
• Put in place the distributed production
– “ATLAS kit” (rpm) for software distribution
– Scripts and tools (monitoring, bookkeeping): AMI database; Magda replica catalogue; VDC; Job production (AtCom)
– Quality Control and Validation of the full chain
• Use as much as possible Grid tools
Tools in DC1
[Figure: diagram of the DC1 tools AMI, Magda, VDC, AtCom and GRAT. Labels in the diagram: physics metadata (AMI), replica catalog, recipe catalog, permanent production log, transient production log, interactive production framework, automatic production framework.]
DC1 in numbers

Process                    No. of events   CPU time (kSI2k.months)   CPU-days (400 SI2k)   Volume of data (TB)
Simulation, physics evt.   10^7            415                       30000                 23
Simulation, single part.   3x10^7          125                       9600                  2
Lumi02 pile-up             4x10^6          22                        1650                  14
Lumi10 pile-up             2.8x10^6        78                        6000                  21
Reconstruction             4x10^6          50                        3750
Reconstruction + Lvl1/2    2.5x10^6        (84)                      (6300)
Total                                      690 (+84)                 51000 (+6300)         60
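
The two CPU columns of the table are related by a unit conversion, and the totals can be re-added from the rows. The sketch below does both; it assumes a 30-day month, which the slide does not state, so the estimates only reproduce the quoted CPU-days to within a few percent (the quoted figures appear to be rounded).

```python
# Rows copied from the "DC1 in numbers" table:
# (process, CPU time in kSI2k.months, quoted CPU-days at 400 SI2k)
ROWS = [
    ("Simulation, physics evt.", 415, 30000),
    ("Simulation, single part.", 125, 9600),
    ("Lumi02 pile-up", 22, 1650),
    ("Lumi10 pile-up", 78, 6000),
    ("Reconstruction", 50, 3750),
]

SI2K_PER_CPU = 400    # reference CPU quoted in the table header
DAYS_PER_MONTH = 30   # assumption, not stated on the slide


def ksi2k_months_to_cpu_days(ksi2k_months: float) -> float:
    """Convert kSI2k.months into days on one 400 SI2k CPU."""
    return ksi2k_months * 1000 / SI2K_PER_CPU * DAYS_PER_MONTH


# Re-add the totals from the individual rows.
total_ksi2k_months = sum(row[1] for row in ROWS)
total_cpu_days = sum(row[2] for row in ROWS)

for name, ksi2k_months, quoted in ROWS:
    estimate = ksi2k_months_to_cpu_days(ksi2k_months)
    print(f"{name:26s} quoted {quoted:6d}  estimated {estimate:8.0f}")
print("totals:", total_ksi2k_months, "kSI2k.months,", total_cpu_days, "CPU-days")
```

The row sums match the quoted totals of 690 kSI2k.months and 51000 CPU-days (excluding the bracketed Reconstruction + Lvl1/2 figures).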
Contribution to the overall CPU-time (%) per country
[Pie chart, 16 slices. Values as extracted, not matched to the numbered country list: 1.41, 10.92, 0.01, 1.46, 9.59, 2.36, 4.94, 10.72, 2.22, 3.15, 4.33, 1.89, 3.99, 14.33, 0.02, 28.66]

ATLAS DC1 Phase 1: July-August 2002
• 3200 CPUs
• 110 kSI95
• 71000 CPU days
• 5x10^7 events generated
• 1x10^7 events simulated
• 3x10^7 single particles
• 30 TBytes
• 35 000 files
• Grid tools used at 11 sites

39 Institutes in 18 Countries:
1. Australia
2. Austria
3. Canada
4. CERN
5. Czech Republic
6. France
7. Germany
8. Israel
9. Italy
10. Japan
11. Nordic
12. Russia
13. Spain
14. Taiwan
15. UK
16. USA
Primary data (in 8 sites)
Total amount of primary data: 59.1 TBytes

Site      Data (TB)   Share
Alberta   3.6         6%
BNL       12.1        20%
CNAF      3.6         6%
Lyon      17.9        31%
FZK       2.2         4%
Oslo      2.6         4%
RAL       2.3         4%
CERN      14.7        25%

Data (TB):
• Simulation: 23.7 (40%)
• Pile-up: 35.4 (60%)
– Lumi02: 14.5
– Lumi10: 20.9

Pile-up:
• Low luminosity: ~4 x 10^6 events (~4 x 10^3 NCU days)
• High luminosity: ~3 x 10^6 events (~12 x 10^3 NCU days)

Data replication using Grid tools (Magda)
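
As a quick consistency check on the primary-data figures, the sketch below recomputes each site's share from the per-site volumes and the quoted 59.1 TB total. The site-to-percentage pairing is an assumption based on the order in which the values appear on the slide. The recomputed Lyon share rounds to 30% rather than the quoted 31%, presumably because the chart values were adjusted to sum to 100.

```python
# Per-site primary data volumes in TB, as listed on the slide.
SITE_TB = {
    "Alberta": 3.6,
    "BNL": 12.1,
    "CNAF": 3.6,
    "Lyon": 17.9,
    "FZK": 2.2,
    "Oslo": 2.6,
    "RAL": 2.3,
    "CERN": 14.7,
}
QUOTED_TOTAL_TB = 59.1


def share_percent(volume_tb: float, total_tb: float = QUOTED_TOTAL_TB) -> int:
    """Share of the quoted total, rounded to a whole percent."""
    return round(100 * volume_tb / total_tb)


shares = {site: share_percent(tb) for site, tb in SITE_TB.items()}
site_sum_tb = sum(SITE_TB.values())

for site, tb in SITE_TB.items():
    print(f"{site:8s} {tb:5.1f} TB  {shares[site]:3d}%")
print(f"sum over sites: {site_sum_tb:.1f} TB (quoted total {QUOTED_TOTAL_TB} TB)")
```

The per-site volumes add up to 59.0 TB, consistent with the quoted 59.1 TB once rounding of the individual entries is taken into account.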
Grid in ATLAS DC1
• US-ATLAS (GRAT & Chimera): part of simulation; pile-up; reconstruction
• EDG Testbed Prod: reproduce part of phase 1 data production; several tests
• NorduGrid: full phase 1 & 2; reconstruction
ATLAS Data Challenges: DC2
July 2003 – July 2004
At this stage the goal includes:
• Full detector simulation with Geant4
• Pile-up and digitization in Athena
• Deployment of the complete Event Data Model and the Detector Description
• Use the LCG Applications software (e.g. POOL) as much as possible
• Test the calibration and alignment procedures
• Perform large-scale physics analysis
• Make wide use of the GRID middleware
• Use more and more GRID tools
• Run as much of the production as possible on LCG-1
DC2 and LCG-1 (LP summary)
• LCG-1
– We intend to use, and contribute to validating, LCG-1 components when they become available (R-GMA; RLS; …)
– ATLAS-EDG becoming ATLAS-LCG task force
• Scale of DC2
– About 10^7 events simulated, as in DC1 (but with GEANT4)
– All of them with pile-up
– All of them reconstructed
– Analysis….
DC2: Time scale
Milestones:
• End-July: Release 7
• Mid-November: pre-production release
• February 1st: “production” release
• April 1st
• June 1st: “DC2”
• July 15th
Activities:
• Put in place, understand & validate:
– Geant4
– POOL persistency & LCG App.
– Event Data Model
– Digitization; pile-up; byte-stream
– Conversion of DC1 data to POOL and run reconstruction
• Testing and validation
• Run test-production
• Start final validation
• Start simulation
• Pile-up & digitization
• Transfer data to CERN
• Start Reconstruction on “Tier0”
• Distribution of ESD & AOD
• Calibration; alignment
• Start Physics analysis
• Reprocessing
ATLAS Data Challenges: DC2
• We are building an ATLAS Grid production & Analysis system
• We intend to put in place a “permanent” Monte Carlo production system
• If we continue to produce simulated data during summer 2004 we want to keep open the possibility to run another “DC” later (November 2004?) with more statistics
INFN, 23 June 2003
ATLAS Software & Computing and LCG products in 2003-2004
Dario Barberis
CERN & Genoa University/INFN
ATLAS Computing Timeline
• POOL/SEAL release
• ATLAS release 7 (with POOL persistency)
• LCG-1 deployment
• ATLAS complete Geant4 validation
• ATLAS release 8
• DC2 Phase 1: simulation production
• DC2 Phase 2: intensive reconstruction (the real challenge!)
• Combined test beams (barrel wedge)
• Computing Model paper
• ATLAS Computing TDR and LCG TDR
• DC3: produce data for PRR and test LCG-n
• Computing Memorandum of Understanding
• Physics Readiness Report
• Start commissioning run
• GO!
[Timeline axis: 2003, 2004, 2005, 2006, 2007, with a "NOW" marker]
How to get there: 1) Software
• Software developments in progress:
– Geant4 simulation validation for production
– GeoModel (Detector Description) integration in simulation and reconstruction
– Full implementation of new Event Data Model
– Restructuring of trigger selection, reconstruction and analysis environment
– POOL persistency
– Interval of Validity service and Conditions DataBase
– Detector response simulation in Athena
– Pile-up in Athena (was in atlsim/G3)
LCG Applications Components
• SEAL
– Plug-in manager
  • Internal use by POOL now
  • Full integration into Athena Q3 2003
– Data Dictionary
  • Integrated into Athena now
  • Includes Python support
• POOL
– Integration underway
– Goal is to have demonstrated support for POOL by 31 July
  • Ability to read and write components of the ATLAS EDM
– Complete support by Oct 2003
• SEAL Maths Library
– Integrate in time for DC-2
• PI
– Integrate ROOT implementation of AIDA API Q3 2003
LCG Applications: fall-back solutions
• The main product we need urgently is POOL persistency
– Right now many integration problems
– Several ATLAS and LCG people actively working on them
– We assume major problems will be sorted out by end July (ATLAS release 7), and full deployment in October
• What if...?
– If there are problems of principle that cannot be overcome (discovered during the Summer):
  • go back to AthenaROOT (home-made direct coupling of Athena/StoreGate to ROOT I/O, already prototyped)
  • write converters by hand
  • introduce delays as more work is needed
  • not nice.
– Decision in October 2003 to be ready anyway for DC2
How to get there: 2) Data Challenges
• DC1 (2002-2003) completed in April 2003:
– 2nd pass of reconstruction with Trigger L1 and L2 algorithms for HLT TDR in progress
– Zebra/Geant3 files will be converted to POOL format and used for large-scale persistency tests
– they will be used as input for validation of new reconstruction environment
• DC2 (1st half 2004):
– provide data for Computing Model document (end 2004)
– full use of Geant4, POOL and Conditions DB
– simulation of full ATLAS and of 2004 combined test beam
– prompt reconstruction of 2004 combined test beam
• DC3 (2nd half 2005):
– scale up computing infrastructure and complexity
– provide data for Physics Readiness Report
• Commissioning Run (from 2nd half 2006):
– real operation!
LCG-1 Deployment
• We plan to test (and use) the LCG-1 infrastructure as soon as deployed and functional
– First tests will start in 2nd half of July as soon as CERN installation is open to the experiments
  • ATLAS-EDG test group will become ATLAS-LCG test group
  • can run jobs of varying complexity (CPU and I/O), simulation, pile-up, reconstruction, (analysis later)
– In parallel, we continue developing our production tools
  • we have to live with several Grid flavours for a long time to come
  • and we have for the time being to continue productions in non-Grid environments
– Effort on distributed analysis tool underway within several national Grid projects
  • new RTAG-11 should help here to get some coherence in developments
  • internal ATLAS coordination also started in this area
LCG-1 Deployment: fall-back solutions
• What if...?
– We assume there will always be several flavours of Grids and other production sites we have to cope with
  • typical examples are electrical grids: we can move electrical appliances all over the world but we need different connectors and transformers
– For large-scale productions we know how to cope in the “old” way
  • in DC1 we have produced >10^7 fully-simulated events using >50 different sites, some linked in Grid systems
– The real need is for the “added values” of Grids, mainly useful for end-user data analysis:
  • user certification
  • data and CPU management (submit jobs to a single interface)
• Conclusion: we can cope with delays in the availability of a fully performant system till Q2 2004 (DC2): if still problematic at that point we have to re-think our computing model.
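
The "connectors and transformers" analogy and the wish to "submit jobs to a single interface" can be sketched as one adapter per Grid flavour behind a common front end, with fall-back to whichever site is available. This is only an illustration of the idea, not an ATLAS production tool: `Job`, `Backend`, `FakeBackend`, `Submitter` and the site names are all hypothetical.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    step: str  # e.g. "simulation", "pile-up", "reconstruction"


class Backend(ABC):
    """One adapter per Grid flavour or non-Grid site (the 'connector')."""

    name: str

    @abstractmethod
    def available(self) -> bool:
        ...

    @abstractmethod
    def submit(self, job: Job) -> str:
        ...


class FakeBackend(Backend):
    """Hypothetical backend used only to make the sketch runnable."""

    def __init__(self, name: str, up: bool) -> None:
        self.name = name
        self._up = up

    def available(self) -> bool:
        return self._up

    def submit(self, job: Job) -> str:
        return f"{self.name}:{job.name}"


class Submitter:
    """Single interface: tries each backend in order, falls back if one is down."""

    def __init__(self, backends: list[Backend]) -> None:
        self._backends = backends

    def submit(self, job: Job) -> str:
        for backend in self._backends:
            if backend.available():
                return backend.submit(job)
        raise RuntimeError("no production site available")


backends = [FakeBackend("grid-a", up=False), FakeBackend("batch-farm", up=True)]
ticket = Submitter(backends).submit(Job("dc2-sim-0001", "simulation"))
print(ticket)
```

Here the first backend is unavailable, so the job goes to the second one; the caller never needs to know which flavour accepted it.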