
Page 1: ATLAS computing and software: latest relevant news

18-9-2002 CSN1-Catania L.Perini 1

Distributed DC1 phase 1 completed successfully
Commitment to the Grid decided for phase 2
EDG now focused on ATLAS, first good results

Page 2

Disclaimer

• I am not giving a complete overview of ATLAS computing here
• There have been very positive recent developments; I report on those and on the INFN role in them
• Recent ATLAS EB (6 September) with a discussion of computing, recent EDG conference in Budapest (1-5 September), recent meeting of the ATLAS-EDG task force
  – The slides I present are largely recycled from these occasions, with suitable “stitching” and INFN additions

Page 3

DC1 preparation: datasets

• Datasets
  – A few sets of data samples: full list
    • “validation”: run Atrecon and look at standard plots
      – “old” known data sets; new ones; single particles
    • High statistics
    • Medium statistics
  – Priorities defined
  – Data sets allocated to the participating centres
• Documentation
  – Web pages

Page 4

DC1-1 and the INFN part

• DC1-1 practically done:
  – CPUs: Roma1 46, CNAF 40, Milano 20, Napoli 16, LNF 10
• 2 x 10**6 of the 1.5 x 10**7 dijets > 17 GeV, shared among all sites (filter factor = 9): done
  – 8 x 10**5 Roma1, 2 x 10**5 Napoli, 10 x 10**5 between CNAF and Milano (~7 and 3)
• 2.7 x 10**7 single muons at Roma1: done
• 5 x 10**5 dijets > 130 GeV (filter = 42%): ~done
  – Approx. 40% Milano and 60% CNAF
• In addition, LNF 25K events of muon physics and Napoli 5K
• Bulk of the work done between the last week of July and 10 September!
• INFN about 1.5 TB of output out of 22.9 TB in total, i.e. 6.5% (in line with CPU: 5700/88000 SI95); but INFN = 10% of ATLAS.
  – Total ATLAS ~2M hours (500 MHz CPU), INFN 130K hours with 130 CPUs
  – We need to rebalance for 2003!
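The INFN shares quoted on this slide can be cross-checked with a few lines of arithmetic. This is a minimal sketch that uses only the figures given above; the variable and function names are illustrative, and the per-site di-jet split assumes the "~7 and 3" refers to CNAF and Milano in that order.

```python
# Figures quoted on the slide; names are illustrative, not from the source.
infn_output_tb, atlas_output_tb = 1.5, 22.9      # output volume in TB
infn_si95, atlas_si95 = 5700, 88000              # CPU power in SI95
infn_cpu_hours, atlas_cpu_hours = 130e3, 2e6     # CPU hours (500 MHz CPU)


def share_percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


output_share = share_percent(infn_output_tb, atlas_output_tb)
power_share = share_percent(infn_si95, atlas_si95)
hours_share = share_percent(infn_cpu_hours, atlas_cpu_hours)

# Di-jet (> 17 GeV) split among INFN sites, in units of 10**5 events.
# Assumption: "~7 and 3" means CNAF ~7 and Milano ~3.
dijet_split = {"Roma1": 8, "Napoli": 2, "CNAF": 7, "Milano": 3}
dijet_total = sum(dijet_split.values()) * 10**5

print(f"output share: {output_share:.1f}%")
print(f"CPU power share: {power_share:.1f}%")
print(f"CPU hours share: {hours_share:.1f}%")
print(f"di-jets produced by INFN: {dijet_total:.1e}")
```

All three shares come out close to 6.5%, below the 10% that INFN represents in ATLAS, which is the imbalance the slide asks to correct for 2003.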

Page 5

Data Samples I (Armin Nairz, ATLAS Software Workshop, RHUL, September 16-20, 2002)

Validation samples (740k events)
– single particles (e and others), jet scans, Higgs events
– see Jean-Francois Laporte’s talk

Single-particle production (30 million events)
– single particles (low pT; pT=1000 GeV with 2.8<eta<3.2)
– single particles (pT=3, …, 100 GeV)
– single e and other particles
  • different energies (E=5, 10, …, 200, 1000 GeV)
  • fixed points; scans (|eta|<2.5); crack scans (1.3<eta<1.8)
  • standard beam spread (sigma_z=5.6 cm); fixed vertex z-components (z=0, 4, 10 cm)

Minimum-bias production (1 million events)
– different regions (|eta|<3, 5, 5.5, 7)

Page 6

Data Samples II (Armin Nairz, ATLAS Software Workshop, RHUL, September 16-20, 2002)

QCD di-jet production (5.2 million events)
– different cuts on ET(hard scattering) during generation
– large production of ET>11, 17, 25, 55 GeV samples, applying particle-level filters
– large production of ET>17, 35 GeV samples, without filtering, full simulation within |eta|<5
– smaller production of ET>70, 140, 280, 560 GeV samples

Physics events requested by various HLT groups (e/gamma, Level-1, jet/ETmiss, B-physics, b-jet, …; 4.4 million events)
– large samples for the b-jet trigger simulated with default (3 pixel layers) and staged (2 pixel layers) layouts
– B-physics (PL) events taken from old TDR tapes

Page 7

Data produced (as of ~30 August; 1 NCU ~ 1 PIII 500 MHz)

Sample             Type      Events   CPU (NCU days)   Storage
Validation         Single    0.7 M    758              53 GB
                   Physics   44 K     163              57 GB
High Statistics    Single    27 M     530              540 GB
                   Physics   6.2 M    40635            13.2 TB
Medium Statistics  Single    2.7 M    9502             119 GB
                   Physics   4.4 M    25095            9.0 TB
Total              Single    30.4 M   10790            0.7 TB
                   Physics   10.5 M   65893            22.2 TB
Grand Total                           76683            22.9 TB
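The totals in the "Data produced" table can be re-derived from the individual rows. The sketch below is only a consistency check on the numbers quoted on the slide; the row layout and names are illustrative, and storage is converted with 1 TB = 1000 GB as an assumption.

```python
# Rows of the "Data produced" table:
# (sample, type, CPU in NCU days, storage in GB). 1 TB = 1000 GB is assumed.
rows = [
    ("Validation", "Single", 758, 53),
    ("Validation", "Physics", 163, 57),
    ("High Statistics", "Single", 530, 540),
    ("High Statistics", "Physics", 40635, 13200),
    ("Medium Statistics", "Single", 9502, 119),
    ("Medium Statistics", "Physics", 25095, 9000),
]


def totals(kind):
    """Sum CPU (NCU days) and storage (TB) over all rows of one type."""
    cpu = sum(r[2] for r in rows if r[1] == kind)
    storage_tb = sum(r[3] for r in rows if r[1] == kind) / 1000.0
    return cpu, storage_tb


single_cpu, single_tb = totals("Single")
physics_cpu, physics_tb = totals("Physics")
grand_cpu = single_cpu + physics_cpu
grand_tb = single_tb + physics_tb

print(f"Single:  {single_cpu} NCU days, {single_tb:.2f} TB")
print(f"Physics: {physics_cpu} NCU days, {physics_tb:.2f} TB")
print(f"Grand total: {grand_cpu} NCU days, {grand_tb:.2f} TB")
```

The single-particle rows add up to roughly 0.7 TB, the value that makes the quoted grand total of 22.9 TB consistent with the physics total of 22.2 TB.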

Page 8

Participation in DC1/Phase 1

Country            City                                                   Max. machines   SI95
Australia          Melbourne                                              24              1008
Austria            Innsbruck                                              2               185
Canada             Alberta, CERN                                          185             8825
CERN                                                                      500             18900
Czech Republic     Prague
Denmark            Copenhagen
France             Lyon                                                   40              1470
Germany            Karlsruhe                                              140             6972
Israel             Weizmann                                               74              3231
Italy              Bologna                                                40              2201
                   Milano, Napoli, Roma                                   80              3058
Japan              Tokyo                                                  78              4586
Norway             Bergen
Russia             Dubna, Moscow, Protvino                                115             4329
Spain              Valencia                                               100             5376
Taiwan             Taipei                                                 48              1984
UK                 Birmingham, Cambridge, Glasgow, Lancaster, Liverpool   300             4410
                   RAL                                                    100             5880
USA                Arlington, Oklahoma                                    37              991
                   BNL                                                    100             3780
                   LBNL                                                   800             11172
Total (potential)                                                         2763            88358

Page 9

Use of the ATLAS farm at Roma1 (A. De Salvo); followed by CNAF and Milano as a percentage of their respective 40 and 20 CPUs (G. Negri)

Number of hosts and processors online in the farm as a function of time. In total, 52 CPUs have been online in the farm since the beginning of July. The two main disk servers and the monitoring server, 6 CPUs in total, are not used by the queue system.

Number of available processors (slots, in green) and of processors in use (in blue) through the farm's batch queue system, as a function of time.

Page 10

Page 11

Page 12

ATLAS DC1 Phase 1: July-August 2002

• Samples done (< 3% job failures)
  – 50 x 10**6 events generated
  – 10 x 10**6 events, after filtering, simulated with GEANT3
  – 31 x 10**6 single particles simulated

Now, in mid-September, DC1-1 is concluded, having produced all the high-priority samples and most of the medium-priority ones. Some medium-priority and part of the low-priority samples continue asynchronously at specific sites.

Most of the work was done in the last week of July, in August and in the first days of September: about 40 days, for a total of 110 kSI95*month.

NorduGrid was used between Bergen, Copenhagen and Stockholm; US Grid tools between LBNL, Oklahoma and Arlington, for 200k events and 30k CPU hours over 3 weeks, with storage locally and at BNL HPSS.

Goals:
• Produce the data needed for the HLT TDR
• Get as many ATLAS institutes involved as possible

Page 13

ATLAS DC1 Phase 1: July-August 2002

• CPU resources used:
  – Up to 3200 processors (5000 PIII/500 equivalent)
  – 110 kSI95 (~ 50% of one Regional Centre at LHC startup)
  – 71000 CPU*days
  – To simulate one di-jet event: 13 000 SI95sec
• Data volume:
  – 30 Tbytes
  – 35 000 files
  – Output size for one di-jet event: 2.4 Mbytes
  – Data kept at production site for further processing
    • Pile-up
    • Reconstruction
    • Analysis
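The resource figures on this slide hang together under a simple back-of-the-envelope calculation. The sketch below is an illustration, not the collaboration's own accounting: it derives an SI95 rating per PIII/500 from the two numbers quoted above, and it assumes that all 10 x 10**6 events simulated with GEANT3 cost as much as a di-jet event. The variable names are illustrative.

```python
# Figures quoted on the slides; the per-CPU rating is derived, not quoted.
total_power_si95 = 110e3      # 110 kSI95
piii500_equivalents = 5000    # "5000 PIII/500 equivalent"
si95_sec_per_dijet = 13000    # SI95*sec to simulate one di-jet event
mb_per_dijet = 2.4            # output size of one di-jet event, in MB
simulated_events = 10e6       # events simulated with GEANT3 after filtering

# Implied rating of one PIII/500 CPU, in SI95.
si95_per_cpu = total_power_si95 / piii500_equivalents

# Wall time for one di-jet event on one such CPU, in seconds.
seconds_per_event = si95_sec_per_dijet / si95_per_cpu

# Assumption: every simulated event costs as much as a di-jet event.
cpu_days = simulated_events * seconds_per_event / 86400
output_tb = simulated_events * mb_per_dijet / 1e6

print(f"{si95_per_cpu:.0f} SI95 per PIII/500")
print(f"{seconds_per_event:.0f} s per di-jet event")
print(f"{cpu_days:.0f} CPU*days, {output_tb:.0f} TB")
```

Under these assumptions the estimate lands within a few percent of the quoted 71000 CPU*days and somewhat below the quoted 30 Tbytes, which also include the single-particle samples.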

Page 14

ATLAS DC1 Phase I

• Phase I
  – Primary concern is delivery of events to HLT community
    • Goal ~10**7 events (several samples!)
  – Put in place the MC event generation & detector simulation chain
    • Switch to AthenaRoot I/O (for Event generation)
    • Updated geometry (“P” version for muon spectrometer)
      – Late modifications in digitization (few detectors)
    • Filtering
      – To be done in Atlsim
      – To be checked with Atlfast
    • Validate the chain:
      Athena/Event Generator -> (Root I/O) -> Atlsim/Dice/Geant3 -> (Zebra)
  – Put in place the distributed MonteCarlo production
    • “ATLAS kit” (rpm)
    • Scripts and tools (monitoring, bookkeeping)
    • Quality Control and Validation of the full chain

Page 15

DC1 preparation: software

• One major issue was to get the software ready
  – New geometry (compared to December-DC0 geometry)
    • Inner Detector
      – Pixels: more information in hits; better digitization
      – TRT: bug fix in digitization
      – Better services
    • Calorimeter
      – ACBB readout
      – ENDE readout updated (last minute update to be avoided if possible)
      – End-caps shifted by 4 cm
    • Muon
      – AMDB p.03 (more detailed chambers cutouts)
  – New persistency mechanism
    • AthenaROOT/IO
      – Used for generated events
      – Readable by Atlfast and Atlsim
• And Validated

Page 16

DC1 preparation: kit; scripts & tools

• Kit
  – “ATLAS kit” (rpm) to distribute the s/w (Alessandro de Salvo)
    • It installs release 3.2.1 (all binaries) without any need of AFS
      – Last update July 2002
    • It requires:
      – Linux OS (Redhat 6.2 or Redhat 7.2)
      – CERNLIB 2001 (from DataGrid repository) cern-0.0-2.i386.rpm (~289 MB)
    • It can be downloaded:
      – from a multi-release page (22 rpm's; global size ~ 250 MB)
      – “tar” file also available
      – Installation notes are available:
        » http://pcatl0a.mi.infn.it/~resconi/kit/RPM-INSTALL
• Scripts and tools (monitoring, bookkeeping)
  – Standard scripts to run the production
  – AMI bookkeeping database (developed by Grenoble group)

Page 17

DC1 preparation: validation & quality control

• We processed the same data in the various centres and compared the results
  – To ensure that the same software was running in all production centres
  – We also checked the random number sequences
• We defined “validation” datasets
  – “old” generated data which were already simulated with a previous version of the simulation
    • Differences with the previous simulation understood
  – “new” data
    • Physics samples
    • Single particles
  – Part of the simulated data was reconstructed (Atrecon) and checked
• This was a very “intensive” activity
  – We should increase the number of people involved
• It is a “key issue” for the success!

Page 18

Expression of Interest (some…)

• CERN
• INFN
  – CNAF, Milan, Roma1, Naples
• CCIN2P3 Lyon
• IFIC Valencia
• FZK Karlsruhe
• UK
  – RAL, Birmingham, Cambridge, Glasgow, Lancaster, Liverpool
• “Nordic” cluster
  – Copenhagen, Oslo, Bergen, Stockholm, Uppsala, Lund
• FCUL Lisboa
• Prague
• Manno
• Thessaloniki
• USA
  – BNL; LBNL
  – Arlington; Oklahoma
• Russia
  – JINR Dubna
  – ITEP Moscow
  – SINP MSU Moscow
  – IHEP Protvino
• Canada
  – Alberta, TRIUMF
• Tokyo
• Taiwan
• Melbourne
• Weizmann
• Innsbruck
• ……

Page 19

DC1 Phase II and pileup

• For the HLT studies we need to produce events with “pile-up”:
  – It will be produced in Atlsim
  – Today “output” is still “Zebra”
    • We don’t know when another solution will be available
    • Will we need to “convert” the data?
  – Fraction of the data to be “piled-up” is under discussion
    • Both CPU and storage resources are an issue
      – If “full” pile-up is required, the CPU needed would be higher than what was used for Phase I
    • Priorities have to be defined

Page 20

Pile-up production in DC1 (1 data sample), still under discussion

L            Number of events   Event size (MB)   Total size (TB)   Eta cut
2 x 10**33   1.2 x 10**6        3.6               4.3               < 3
10**34       1.2 x 10**6        5.6               6.7               < 3
2 x 10**33   1.2 x 10**6        4.9               6.0               < 5
10**34       1.2 x 10**6        14.3              17                < 5
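The total sizes in the pile-up storage table follow from the number of events times the event size. The sketch below reproduces them under two assumptions that are not stated on the slide: event sizes are in MB, and 1 TB = 10**6 MB. The names are illustrative.

```python
# (luminosity label, eta cut, number of events, event size) per table row.
# Assumption: event sizes are in MB and 1 TB = 10**6 MB.
rows = [
    ("2 x 10**33", "< 3", 1.2e6, 3.6),
    ("10**34", "< 3", 1.2e6, 5.6),
    ("2 x 10**33", "< 5", 1.2e6, 4.9),
    ("10**34", "< 5", 1.2e6, 14.3),
]


def total_size_tb(n_events, event_size_mb):
    """Total sample size in TB for n_events of the given size in MB."""
    return n_events * event_size_mb / 1e6


sizes_tb = [total_size_tb(n, mb) for _, _, n, mb in rows]

for (lumi, cut, _, _), size in zip(rows, sizes_tb):
    print(f"L = {lumi}, eta {cut}: {size:.1f} TB")
```

The results agree with the table to within rounding, which supports reading the event-size column as MB per event.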

Page 21

Pile-up production in DC1 (1 data sample), still under discussion

L            Number of events   Time per event (NCU sec)   Total time (NCU days)   Eta cut
2 x 10**33   1.2 x 10**6        3700                       5 x 10**4               < 3
10**34       1.2 x 10**6        3700                       5 x 10**4               < 3
2 x 10**33   1.2 x 10**6        5700                       8 x 10**4               < 5
10**34       1.2 x 10**6        13400                      19 x 10**4              < 5
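The CPU totals in the pile-up timing table are the number of events times the time per event, converted from seconds to days. The following sketch redoes that conversion with the figures from the table; the function and variable names are illustrative.

```python
SECONDS_PER_DAY = 86400

# (luminosity label, eta cut, number of events, NCU seconds per event).
rows = [
    ("2 x 10**33", "< 3", 1.2e6, 3700),
    ("10**34", "< 3", 1.2e6, 3700),
    ("2 x 10**33", "< 5", 1.2e6, 5700),
    ("10**34", "< 5", 1.2e6, 13400),
]


def total_ncu_days(n_events, ncu_sec_per_event):
    """Total CPU time in NCU days for a sample."""
    return n_events * ncu_sec_per_event / SECONDS_PER_DAY


ncu_days = [total_ncu_days(n, sec) for _, _, n, sec in rows]

for (lumi, cut, _, _), days in zip(rows, ncu_days):
    print(f"L = {lumi}, eta {cut}: {days:.0f} NCU days")
```

For scale, the "Data produced" table quotes 76683 NCU days for the whole of Phase 1, so a single piled-up sample at 10**34 within eta < 5 would need more than twice that.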

Page 22

DC1-2 Goals: Norman at EB

• Athena-based reconstruction of data simulated in Phase 1:
  – for HLT;
  – for other physics studies;
  – to discover shortcomings in the reconstruction codes;
  – to use new Event Data Model and Det. Descr., as required for recon.
• To carry out studies of ‘data-flow’ and algorithms in HLT (‘bytestream’ etc.);
• To make a significant ‘next step’ in our use of Geant4 (specification of this is an ongoing discussion with Simulation Co-ordinator).

Page 23

DC1-2 Goals: Norman at EB

• To make limited (but we hope finite) use of Grid tools for the ‘batch-type’ reconstruction. (Document sent to potential Grid ‘suppliers’: special session during next s/w week at RHUL, Sep. 19)
• To build on and develop world-wide momentum gained from Phase 1;
• To meet LHCC DC1 milestone for end of this year.
• NOTE: as stated previously, it has always been clear that we would not complete ALL HLT-related studies by end of this year.

Page 24

An important step for ATLAS and the Grid: the ATLAS-EDG task force

• The aim is to prove that DC1-2 can be done with the DataGrid, at least for a significant part
• The means is to reproduce a few % of what was already done in DC1-1
• Work started at the end of July and we are succeeding! Programme:
  – 100 jobs of 24 hours each, reproduced by experts
  – In addition: 250 “new” jobs will be submitted by Luc (a DC1 expert who has never seen Grids…)

Page 25

Task Force (EDG Coo Budapest), with my mods

• Task force with ATLAS & EDG people (led by Oxana Smirnova)
• ATLAS is eager to use Grid tools for the Data Challenges
  – ATLAS Data Challenges are already on the Grid (NorduGrid, iVDGL)
  – DC1/phase 2 (to start in late November) is expected to be done mostly using the Grid tools
• By September 19 (ATLAS SW week Grid meeting) evaluate the usability of EDG for the DC tasks
  – The task: to process 5 input partitions of the Dataset 2000 at the EDG Testbed + one non-EDG site (Karlsruhe)
• Intensive activity has meant they could process some partitions, but problems with long-running jobs still need a final Globus fix
• Data Management chain is proving difficult to use and sometimes unreliable
• Need to clarify policy for distribution/installation of application s/w
• On-going activity with a very short timescale: highest priority task

Page 26

Members and sympathizers

ATLAS: Jean-Jacques Blaising, Frederic Brochu, Alessandro De Salvo, Michael Gardner, Luc Goossens, Marcus Hardt, Roger Jones, Christos Kanellopoulos, Guido Negri, Fairouz Ohlsson-Malek, Steve O'Neale, Laura Perini, Gilbert Poulard, Alois Putzer, Di Qing, David Rebatto, Zhongliang Ren, Silvia Resconi, Oxana Smirnova, Stan Thompson, Luca Vaccarossa

EDG: Ingo Augustin, Stephen Burke, Frank Harris, Bob Jones, Peter Kunszt, Emanuele Leonardi, Charles Loomis, Mario Reale, Markus Schulz, Jeffrey Templon

and counting…

Page 27

Task description: Oxana

• Input: set of generated events as ROOT files (each input partition ca 1.8 GB, 100,000 events); master copies are stored in CERN CASTOR
• Processing: ATLAS detector simulation using a pre-installed software release 3.2.1
  – Each input partition is processed by 20 jobs (5000 events each)
  – Full simulation is applied only to filtered events, ca 450 per job
  – A full event simulation takes ca 150 seconds per event on a 1 GHz processor
• Output: simulated events are stored in ZEBRA files (ca 1 GB each output partition); an HBOOK histogram file and a log-file (stdout+stderr) are also produced. 20% of output to be stored in CASTOR.
• Total: 9 GB of input, 2000 CPU-hours of processing, 100 GB of output.
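The totals in the last bullet can be rebuilt from the per-partition and per-job figures above. This is a minimal sketch; the count of 5 input partitions comes from the task-force slide and is consistent with 9 GB / 1.8 GB, and the variable names are illustrative.

```python
# Figures quoted on the slide; names are illustrative.
n_partitions = 5            # 5 input partitions (9 GB / 1.8 GB each)
gb_per_input_partition = 1.8
jobs_per_partition = 20     # 5000 events each out of 100,000
filtered_events_per_job = 450
seconds_per_event = 150     # on a 1 GHz processor
gb_per_output_partition = 1.0

n_jobs = n_partitions * jobs_per_partition
input_gb = n_partitions * gb_per_input_partition
cpu_hours = n_jobs * filtered_events_per_job * seconds_per_event / 3600
hours_per_job = cpu_hours / n_jobs
output_gb = n_jobs * gb_per_output_partition

print(f"{n_jobs} jobs, {input_gb:.0f} GB input")
print(f"{cpu_hours:.0f} CPU-hours in total, {hours_per_job:.2f} h per job")
print(f"{output_gb:.0f} GB output")
```

The estimate gives a little under 2000 CPU-hours, with each job running for most of a day, which matches the "100 jobs of 24 hours" programme of the task force.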

Page 28

Execution of jobs: Oxana

• It is expected that we make full use of the Resource Broker functionality
  – Data-driven job steering
  – Best available resources otherwise
• A job consists of the standard DC1 shell-script, very much the way it is done in a non-Grid world
• A Job Definition Language is used to wrap up the job, specifying:
  – The executable file (script)
  – Input data
  – Files to be retrieved manually by the user
  – Optionally, other attributes (maxCPU, Rank etc)
• Storage and registration of output files is a part of the job
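To make the job wrapping concrete, the sketch below renders a job description in ClassAd-style `Name = value;` form from a Python dictionary. It is an illustration only: the attribute names (`Executable`, `InputData`, `OutputSandbox`, `Rank` and so on) are my assumption of what such a description would contain and should be checked against the EDG documentation, and every file name, the logical file name and the rank expression are hypothetical.

```python
class Expr(str):
    """Marks a value that is written unquoted (an expression, not a string)."""


def fmt(value):
    """Format one value: expressions raw, lists in braces, strings quoted."""
    if isinstance(value, Expr):
        return str(value)
    if isinstance(value, (list, tuple)):
        return "{" + ", ".join(fmt(v) for v in value) + "}"
    if isinstance(value, str):
        return '"' + value + '"'
    return str(value)


def render_jdl(attrs):
    """Render a dict as one 'Name = value;' line per attribute."""
    return "\n".join(f"{name} = {fmt(value)};" for name, value in attrs.items())


# All names below are hypothetical placeholders, not taken from the slides.
job = {
    "Executable": "dc1.simul.sh",                 # the standard DC1 shell script
    "Arguments": "00001",                         # output partition number
    "StdOutput": "dc1.log",
    "StdError": "dc1.log",                        # log-file holds stdout+stderr
    "InputSandbox": ["dc1.simul.sh"],
    "InputData": ["LF:dc1.evgen.partition.root"],  # input data, drives the steering
    "OutputSandbox": ["dc1.log", "dc1.hbook"],    # files retrieved by the user
    "Rank": Expr("other.FreeCPUs"),               # optional attribute
}

jdl_text = render_jdl(job)
print(jdl_text)
```

The ZEBRA output does not appear among the retrieved files because, as the last bullet says, its storage and registration are done by the job itself.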

Page 29

Preliminary work done: Oxana

• ATLAS 3.2.1 RPMs are distributed with the EDG tools to provide the ATLAS runtime environment
• Validation of the ATLAS runtime environment by submitting a short (100 input events) DC1 job was done at several sites:
  – CERN
  – NIKHEF
  – RAL
  – CNAF
  – Lyon
  – Karlsruhe (in progress)
• A fruitful cooperation between ATLAS users and EDG experts
• The task force attributes:
  – Mailing list [email protected]
  – Web-page http://cern.ch/smirnova/atlas-edg

Page 30

Status Sept. 11 (Guido), with my mods: what has been tested with ATLAS production jobs

• Only one site (CERN Production Testbed, 22 dual-processor WNs) has been used; now the other 5 are in use for the last 50 jobs
• 5 users have submitted 10 jobs each (5000 evts/job); output partitions:
  – 00001 to 00010: Guido Negri
  – 00011 to 00020: Fairouz Ohlsson-Malek
  – 00021 to 00030: Silvia Resconi
  – 00031 to 00040: Oxana Smirnova
  – 00041 to 00050: Frederic Brochu
• Matchmaking successfully done (see next slide)
• Registration of output zebra files successfully done
• 3 jobs failed:
  – “Failure while executing job wrapper” (Frederic Brochu and Fairouz Ohlsson-Malek)
  – “Globus Failure: cannot access cache files… check permission, quota and disk space” (Frederic Brochu)
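The partition bookkeeping on this slide is simple enough to express directly. The sketch below maps an output partition number to the user who submitted it and computes the failure rate of this first batch; the helper name `owner` is hypothetical and the block size of 10 is read off the slide.

```python
# Submitters in the order of their partition blocks, as listed on the slide.
submitters = [
    "Guido Negri",            # 00001 to 00010
    "Fairouz Ohlsson-Malek",  # 00011 to 00020
    "Silvia Resconi",         # 00021 to 00030
    "Oxana Smirnova",         # 00031 to 00040
    "Frederic Brochu",        # 00041 to 00050
]
jobs_per_user = 10


def owner(partition):
    """Return who submitted the job for a given output partition (1..50)."""
    if not 1 <= partition <= len(submitters) * jobs_per_user:
        raise ValueError(f"partition {partition} is outside the tested range")
    return submitters[(partition - 1) // jobs_per_user]


total_jobs = len(submitters) * jobs_per_user
failed_jobs = 3
failure_rate = 100.0 * failed_jobs / total_jobs

print(owner(1), "|", owner(31), "|", owner(50))
print(f"{failed_jobs}/{total_jobs} jobs failed ({failure_rate:.0f}%)")
```

For comparison, the conventional DC1 Phase 1 production reported fewer than 3% job failures.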

Page 31

WP8 Summary in Budapest

• Current WP8 top priority activity is Atlas/EDG Task Force work
  – This has been very positive. It focuses attention on the real user problems, and as a result we review our requirements, design etc. Remember the eternal cycle! We should not be surprised if we change our ideas. We must maintain flexibility with continuing dialogue between users and developers.

• Will continue Task Force flavoured activities with the other experiments

• Current use of Testbed is focused on main sites (CERN,Lyon,Nikhef,CNAF,RAL) – this is mainly for reasons of support

• Once stability is achieved (see Atlas/EDG work) we will expand to other sites. But we should be careful in selection of these sites in the first instance. Local support would seem essential.

• WP8 will maintain a role in architecture discussions, and maybe be involved in some common application layer developments

• THANKS To members of IT and the middleware WPs for heroic efforts in past months, and to Federico for laying WP8 foundations

Page 32

The future

• DC1-2: reconstruction of the data, < 1/10 of the DC1-1 CPU
  – Start at the end of November, data at the proto-Tier1 sites, use of the Grid; how and when is to be decided at the 19 September meeting and right after it
  – Hypothesis: use of compatible but different Grid tools, with interoperability through the EDT-iVDGL work (EU-US demo with ATLAS applications planned for November); this would also be a first test of the GLUE/LCG line
• How much pile-up and Geant4 will be included in DC1-2 (to be concluded in Jan. 02) is still to be decided; the rest is for DC2, after mid-2003
• INFN-ATLAS resources, Tier1+Tier2 (Milano+Roma1): from 120 CPUs to 300, to ensure a 10% share in DC2:
  – 140 Tier1, 80 Milano, 50 Roma1, 30 other sites (Napoli and LNF took part in DC1-1 with 16 and 10 CPUs respectively) would be reached with the 2003 requests (Tier1+LCG); but Roma1 will have to at least reach Milano, and the number of Tier2s could grow (already in 2003?). We are counting on the CSN1 LCG budget line.

Page 33

CPU and Disk for ATLAS DC2 (DC1-1: ~110 kSI95 months)

Year-Quarter: 03Q3 03Q4 04Q1 04Q2

Computing power (kSI95 months)
Total requirement for Simulation: 55
Total requirement for pile-up: 100 250
Total requirement for Reconstruction: 5
Total requirement for Analysis: 20 20

Processing power required
CERN T0: 18 83
CERN T1 (DC related only): 10 10
Offsite T1+T2 (DC only): 37 167 10 10

Storage (TeraBytes)
Data Simulated at CERN: 7 20 5
Data Simulated Offsite: 15 40
Data Transferred to CERN
Data stored at CERN
Active Data at CERN
Assume number of active Offsite
Sum Data Stored Offsite

For ATLAS Data Challenge 3 (end of 2004, beginning of 2005?) we intend to generate and simulate 5 times more data than for DC2.
For ATLAS Data Challenge 4 (end of 2005, beginning of 2006?) we intend to generate and simulate 2 times more data than for DC3.