
Page 1

CERN IT Department
CH-1211 Genève 23, Switzerland
www.cern.ch/it

The Computing Infrastructure for the LHC: the ATLAS Point of View

Simone Campana, CERN IT/GS

Page 2

ATLAS Event Data Model

RAW (1.6 MB/ev)
ESD (1 MB/ev)
AOD (150 kB/ev)
DPD (20 kB/ev)

Raw Data: output of the Event Filter Farm (HLT) in byte-stream format.

Event Summary Data: output of the event reconstruction (tracks, hits, calorimeter cells and clusters, combined reconstruction objects, etc.). Used for calibration, alignment, refitting …

Analysis Object Data: reduced representation of the events, suitable for analysis; contains reconstructed "physics objects" (electrons, muons, jets …).

Derived Physics Data: reduced information for ROOT-specific analysis.
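To get a feeling for what these per-event sizes mean in data volume, here is a minimal sketch; the 200 Hz trigger rate matches the nominal rate quoted later in the talk, while the 10-hour run length is purely illustrative.

```python
# Minimal sketch: data volume produced by each event format, using the
# per-event sizes quoted on the slide. Running conditions are illustrative.
EVENT_SIZE_MB = {"RAW": 1.6, "ESD": 1.0, "AOD": 0.15, "DPD": 0.02}

def volume_tb(n_events, fmt):
    """Volume in TB produced by n_events events of the given format."""
    return n_events * EVENT_SIZE_MB[fmt] / 1e6   # 1 TB = 1e6 MB (decimal units)

if __name__ == "__main__":
    n_events = 200 * 10 * 3600        # 200 Hz for a 10-hour run (assumed)
    for fmt in EVENT_SIZE_MB:
        print(f"{fmt:>3}: {volume_tb(n_events, fmt):7.2f} TB")
```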

Page 3

Page 4

ATLAS Tiers Organization

"Tier Cloud Model" (unit: 1 T1 + n T2/T3)

• All Tier-1s have a predefined (software) channel with CERN and with every other Tier-1.
• Tier-2s are associated with one Tier-1 and form the "cloud".
• Tier-2s have a predefined channel with their parent Tier-1 only.

[Diagram: the Tier-1 sites (CERN, LYON, BNL, FZK, TRIUMF, ASGC, PIC, SARA, RAL, CNAF, NG) with associated Tier-2/Tier-3 sites (GRIF, LPC, Clermont, LAPP, CCPM, Tokyo, Pékin, Roumanie, NET2, NW, SW, GL, SLAC, TWT2, Melbourne, …) grouped into clouds such as the FR cloud and the BNL cloud.]
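To make the channel topology concrete, the sketch below derives the set of allowed transfer channels from the cloud model; the cloud membership shown is illustrative and deliberately incomplete, not the real ATLAS topology.

```python
# Minimal sketch of the "Tier Cloud Model": Tier-1s talk to CERN and to each
# other, Tier-2s only talk to their parent Tier-1. Cloud contents are
# illustrative placeholders, not the actual ATLAS site list.
CLOUDS = {
    "LYON": ["GRIF", "LPC", "Clermont", "Tokyo"],
    "BNL":  ["NET2", "SLAC", "NW"],
    "FZK":  [],
}

def allowed_channels():
    """Return the set of predefined (source, destination) transfer channels."""
    channels = set()
    for t1 in CLOUDS:
        channels.add(("CERN", t1))                 # T0 -> T1
        for other in CLOUDS:
            if other != t1:
                channels.add((t1, other))          # T1 -> every other T1
        for t2 in CLOUDS[t1]:
            channels.add((t1, t2))                 # T1 -> its own T2s
            channels.add((t2, t1))                 # T2 -> parent T1 only
    return channels

if __name__ == "__main__":
    chans = allowed_channels()
    print(("LYON", "GRIF") in chans)   # True: intra-cloud channel
    print(("BNL", "GRIF") in chans)    # False: no cross-cloud T1 -> T2 channel
```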

Page 5

Detector Data Distribution

Original processing: T0 → T1

• RAW data: mass storage at CERN
• RAW data: Tier-1 centers, the complete dataset distributed among the T1s
• ESD: Tier-1 centers, 2 copies of the ESD distributed worldwide
• AOD: each Tier-1 center, 1 full set per T1
• T2: 100% of the AOD, a small fraction of ESD and RAW

[Diagram: data flow from CERN to the Tier-1s (NG, LYON, BNL, FZK, TRIUMF, ASGC, PIC, SARA, RAL, CNAF).]
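The placement rules above can be read as a per-format policy; the sketch below expresses them as a small function (site names and the round-robin splitting of the RAW and ESD shares are illustrative choices, not the actual share assignment).

```python
# Minimal sketch of the T0 -> T1 placement policy: RAW is archived at CERN and
# split as one complete set across the T1s, ESD gets two copies worldwide,
# AOD goes in full to every T1. Share assignment below is illustrative only.
TIER1S = ["LYON", "BNL", "FZK", "TRIUMF", "ASGC", "PIC", "SARA", "RAL", "CNAF", "NG"]

def destinations(fmt, dataset_index=0):
    """Return the sites that should receive one dataset of the given format."""
    if fmt == "RAW":
        # custodial copy at CERN plus one Tier-1 share (illustrative round-robin)
        return ["CERN-TAPE", TIER1S[dataset_index % len(TIER1S)]]
    if fmt == "ESD":
        # two copies distributed worldwide (illustrative pairing)
        i = dataset_index % len(TIER1S)
        return [TIER1S[i], TIER1S[(i + len(TIER1S) // 2) % len(TIER1S)]]
    if fmt == "AOD":
        return list(TIER1S)            # one full AOD set per Tier-1
    raise ValueError(f"unknown format: {fmt}")

if __name__ == "__main__":
    for fmt in ("RAW", "ESD", "AOD"):
        print(fmt, "->", destinations(fmt, dataset_index=3))
```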

Page 6

Reprocessed Data Distribution

Reprocessing: T1 → T1

• Each T1 reconstructs its own RAW
• Produces new ESD, AOD
• Ships:
  – ESD to the associated T1
  – AOD to all other T1s

[Diagram: reprocessed data flow among the Tier-1s (NG, LYON, BNL, FZK, TRIUMF, ASGC, PIC, SARA, RAL, CNAF).]

Page 7

ATLAS Tier-2 Activities

• Monte Carlo production (ESD, AOD)
  – ships RAW, ESD, AOD to the associated T1
• Physics analysis
  – gets (ESD) AOD from the associated T1

[Diagram: Tier-2 sites (Tokyo, GRIF, Pékin, Clermont, Roumanie, …) exchanging data with their parent Tier-1.]

Page 8

ATLAS and Grid Middleware

• ATLAS resources are distributed across different Grid infrastructures: EGEE, OSG, NorduGrid
• Most of the Grid services are shared across the different Grids:
  – SRM interface for Storage Elements, with different backend storage implementations
  – LCG File Catalog (LFC): at all ATLAS T1s, contains information on file replicas in the cloud
  – File Transfer Service (FTS) at every T1: the baseline transfer service to import data at any site of the cloud
  – VOMS: to administer VO membership
  – CondorG: for job dispatching
• The ATLAS computing framework guarantees Grid interoperability

Page 9

The DDM in a Nutshell

The Distributed Data Management system …

• … enforces the concept of a dataset
  – a logical collection of files
  – dataset contents and locations are stored in central catalogs
  – file information is stored in local File Catalogs (LFC) at the T1s
• … is based on a subscription model
  – datasets are subscribed to sites
  – a series of services enforce the subscription:
    • look up the data location in the LFC
    • trigger data movement via FTS
    • validate the data transfer
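As a rough illustration of the subscription model, here is a minimal sketch of the loop such a service might run; the function names and the catalog/transfer interfaces are invented stand-ins, not the real LFC or FTS client APIs.

```python
# Minimal sketch of how a DDM-like subscription could be enforced: look up
# existing replicas, trigger the missing transfers, validate the result.
# The LFC/FTS calls below are stand-ins, not the real client APIs.
from dataclasses import dataclass

@dataclass
class Subscription:
    dataset: str        # logical dataset name
    destination: str    # target site

def lookup_replicas(dataset):
    """Stand-in for an LFC query: {logical file name: [sites holding a copy]}."""
    return {"file_001": ["CERN"], "file_002": ["CERN", "LYON"]}

def trigger_transfer(lfn, source, destination):
    """Stand-in for submitting an FTS transfer job; returns True on success."""
    print(f"FTS: {lfn} {source} -> {destination}")
    return True

def fulfil(sub):
    """Process one subscription: copy every missing file to the destination."""
    for lfn, sites in lookup_replicas(sub.dataset).items():
        if sub.destination not in sites:
            if not trigger_transfer(lfn, sites[0], sub.destination):
                return False        # transfer failed validation: retry later
    return True

if __name__ == "__main__":
    print(fulfil(Subscription("data08_cos.00012345.RAW", "LYON")))
```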

Page 10

Testing Data Distribution: CCRC08

• Week 1: Data Distribution Functional Test
  – to make sure all files get where we want them to go
  – between Tier-0 and the Tier-1s, for disk and tape
• Week 2: Tier-1 to Tier-1 tests
  – similar rates as between Tier-0 and Tier-1
  – more difficult to control and monitor centrally
• Week 3: Throughput test
  – try to maximize throughput while still following the model
  – Tier-0 to Tier-1 and Tier-1 to Tier-2
• Week 4: Final, all tests together
  – plus artificial extra load from simulation production

Page 11

Week-4: Full Exercise

Page 12

Transfer Ramp-up

Test of backlog recovery: data was first generated over 12 hours and then subscribed in bulk; the 12-hour backlog was recovered in 90 minutes.

[Plot: T0 → T1s throughput in MB/s.]
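A quick back-of-the-envelope check of what that recovery implies; the arithmetic and the assumption of a constant generation rate are mine, not from the slide.

```python
# Recovering a 12-hour backlog in 90 minutes while new data keeps arriving
# means moving 13.5 hours' worth of data in 1.5 hours, i.e. sustaining roughly
# 9x the nominal generation rate over that window. Illustrative arithmetic.
backlog_h, recovery_h = 12.0, 1.5
speedup = (backlog_h + recovery_h) / recovery_h
print(f"sustained ~{speedup:.0f}x the nominal rate")   # ~9x
```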

Page 13

Week-4: T0 → T1s Data Distribution

• Suspect datasets: the dataset is complete (OK) but was registered twice ("double registration" problem)
• Incomplete datasets: effect of the power cut at CERN on Friday morning

Page 14

Week-4: T1-T1 Transfer Matrix

• YELLOW boxes: effect of the power cut
• DARK GREEN boxes: double-registration problem
• Compared with week 2 (3 problematic sites): very good improvement

Page 15

Week-4: T1 → T2s Transfers

• SIGNET: ATLAS DDM configuration issue (LFC vs RLS)
• CSTCDIE: joined very late; prototype site
• Many T2s oversubscribed (should get 1/3 of the AOD)

Page 16

Throughputs

• T1 → T2 transfers show a time structure: datasets are subscribed upon completion at the T1 and every 4 hours
• T0 → T1 transfers: problem at the load generator on the 27th, power cut on the 30th

[Plots: T0 → T1 and T1 → T2 throughput in MB/s, with the expected rate overlaid.]

Page 17

Week-4: Concurrent Production

[Plots: number of running jobs and number of jobs per day.]

Page 18

Week-4: metrics

• We said: • T0->T1: sites should demonstrate to be capable to import 90% of

the subscribed datasets (complete datasets) within 6 hours from the end of the exercise

• T1->T2: a complete copy of the AODs at T1 should be replicated at among the T2s, withing 6 hours from the end of the exercise

• T1-T1 functional challenge, sites should demonstrate to be capable to import 90% of the subscribed datasets (complete datasets) for within 6 hours from the end of the exercise

• T1-T1 throughput challenge, sites should demonstrate to be capable to sustain the rate during nominal rate reprocessing i.e. F*200Hz, where F is the MoU share of the T1.

• Every site (cloud) meet the metric– Despite power-cut– Despite “double registration problem”– Despite competition of production activities
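To give a feeling for the throughput metric, here is a small worked example; the 25% MoU share and the use of the ESD size as the exported payload are assumptions made for the illustration.

```python
# Illustrative arithmetic for the T1-T1 throughput metric: a Tier-1 with MoU
# share F must keep up with F * 200 Hz of reprocessed events. Using the ESD
# size from the event data model slide (1 MB/ev) as the exported payload is
# an assumption for this example only.
nominal_rate_hz = 200
esd_size_mb = 1.0
mou_share = 0.25                           # hypothetical 25% Tier-1

event_rate = mou_share * nominal_rate_hz   # events/s to reprocess
export_mb_s = event_rate * esd_size_mb     # outbound ESD throughput
print(f"{event_rate:.0f} Hz of events, ~{export_mb_s:.0f} MB/s of ESD export")
```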

Page 19

Disk Space (month)

• ATLAS "moved" 1.4 PB of data in May 2008
• 1 PB was deleted in EGEE + NDGF in well under a day; possibly another 250 TB deleted in OSG
• Deletion agent at work, using SRM + LFC bulk methods
• The deletion rate is more than good (but those were big files)

Page 20

Lessons Learned from CCRC08

• The data distribution framework seems in good shape and ready for data taking
• A few things need attention:
  – FTS servers at the T1s need global tuning of parameters
  – some bugs were found in the ATLAS DDM services (now fixed)
  – in at least 3 cases, a network problem or inefficiency was discovered
  – monitoring …

Page 21

A Few Words About the FDR

• FDR = Full Dress Rehearsal
  – tests the full chain, from the HLT to analysis at the T2s
  – the same set of Monte Carlo data (approx. 8 TB) in byte-stream format is injected every day into the T0 machinery
  – data (RAW and reprocessed) are distributed and handled as real data
• FDR2 data exports (June 2008)
  – much less challenging than CCRC08 in terms of distributed computing: 6 hours of data per day to be distributed within 24 h
  – three days of RAW data were distributed in less than 4 hours
  – all datasets (RAW and derived) complete at every T1 and T2 (one exception for a T2)

Page 22

Data Export after CCRC08 and FDR

• Data distribution functional test, to exercise data transfers:
  – Tier-0 to all Tier-1s, tape and disk (RAW, ESD, AOD)
  – all Tier-1s to all other Tier-1s (AOD, DPD)
  – each Tier-1 to all Tier-2s in the same cloud (AOD, DPD)
  – muon calibration streams from Tier-0 to some special Tier-2s
• Completely automated (see the sketch below):
  – at 5% of the nominal rate, with fake data generated at the T0
  – starts every Monday at midday, stops the next Sunday at midnight
  – central deletion of the test data everywhere
  – weekly statistics reports
• Data taking
  – mostly cosmics …
  – RAW data exported to the T1s (for custodial storage)
  – ESD exported to 2 T1s, following the Computing Model
  – some data kept permanently on disk at CERN
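As an illustration of how such a weekly automated test might be described, here is a minimal configuration sketch; the rate, schedule and flows come from the slide, while the configuration format and field names are invented.

```python
# Minimal sketch of a weekly data-distribution functional test: 5% of nominal
# rate, fake data generated at the T0, running Monday midday to Sunday
# midnight. The structure of this configuration is illustrative only.
FUNCTIONAL_TEST = {
    "rate_fraction": 0.05,                     # 5% of the nominal rate
    "source": "T0 fake load generator",
    "flows": [
        ("T0",      "all T1s (disk+tape)", ["RAW", "ESD", "AOD"]),
        ("each T1", "all other T1s",       ["AOD", "DPD"]),
        ("each T1", "its cloud T2s",       ["AOD", "DPD"]),
        ("T0",      "calibration T2s",     ["muon calibration stream"]),
    ],
    "start": "Monday 12:00",
    "stop": "Sunday 24:00",
    "cleanup": "central deletion of test data",
    "report": "weekly statistics",
}

if __name__ == "__main__":
    for src, dst, formats in FUNCTIONAL_TEST["flows"]:
        print(f"{src} -> {dst}: {', '.join(formats)}")
```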

Page 23

Activity after CCRC08

Most inefficiencies were due to scheduled downtimes.

Page 24

Detector Data Replication

Page 25

Simulation Production

• Bursty activity, mainly depending on software readiness
• Main samples: fdr2, 10 TeV, 900 GeV and validations
• Runs at Tier-2s but also at Tier-1s
  – no competition yet with analysis (T2) and reprocessing (T1)
• Average of 10k simultaneous jobs, with peaks of 25k jobs
• All production is now submitted through the Panda system

Page 26

Monte Carlo Production

[Diagram: job definitions in ProdDB are pulled by the Bamboo agent and fed to the Panda server; a condor-g scheduler submits pilots through gLite to worker nodes at sites A and B; the pilots pull jobs from the server over https and run them.]

Page 27

Panda in a Nutshell

• Job definitions are hosted in the Production Database (ProdDB)
• The "Bamboo" agent polls jobs from ProdDB and feeds the Panda server
• The Panda server manages all job information centrally
  – priority control
  – resource allocation
  – job scheduling
• A job scheduler dispatches pilot jobs to sites
  – using various mechanisms: local batch system commands, gLite WMS, CondorG
• Pilot jobs are pre-scheduled to Grid sites
  – pilots pull "real jobs" from the Panda server as soon as suitable CPUs become available
• Output data are aggregated at the T1s using DDM
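A very rough sketch of the pilot pull model described above; the request/response layout and the function names are invented stand-ins, not the real Panda pilot interface, and the "payload" simply prints a message.

```python
# Minimal sketch of a pull-based pilot: it lands on a worker node, asks the
# central server for a job matching its resources, runs it, and reports back.
# The server interface and payload format are illustrative placeholders.
import subprocess
import sys

def get_job_from_server(site, cpus):
    """Stand-in for an authenticated https request asking the server for work."""
    return {"id": 42,
            "cmd": [sys.executable, "-c", "print('simulating 25 events')"]}

def report_status(job_id, status):
    """Stand-in for the https callback reporting the job outcome."""
    print(f"job {job_id}: {status}")

def run_pilot(site="EXAMPLE_SITE", cpus=1):
    job = get_job_from_server(site, cpus)
    if job is None:
        return                        # no suitable work: the pilot exits quietly
    result = subprocess.run(job["cmd"])
    report_status(job["id"], "finished" if result.returncode == 0 else "failed")

if __name__ == "__main__":
    run_pilot()
```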

Page 28

Simulation Production

[Plots: running jobs (monthly statistics), number of jobs per day, and errors.]

Page 29

Simulation Production Functional Test

• Submits one real MC task as a test to each cloud every Monday (see the sketch below)
  – 5000 events, 25 events/job, i.e. 200 jobs of ~6 hours each
  – jobs should run at each of the Tier-2s (and the Tier-1) in the cloud
  – low priority, so as not to interfere with real production
• The task is aborted on Thursday
  – remaining jobs are killed and all output is removed
  – statistics are generated: efficiency, brokering, problem sites
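The job count quoted above follows directly from the task parameters; a one-line helper (illustrative only) makes the arithmetic explicit.

```python
# 5000 events at 25 events/job gives the 200 jobs quoted on the slide.
def n_jobs(total_events, events_per_job):
    return -(-total_events // events_per_job)   # ceiling division

print(n_jobs(5000, 25))   # 200
```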

Page 30

Reprocessing

• Reprocessing is "just" a special case of production-system job
  – handled by Panda
  – runs at T1s only (to a first approximation)
• However …
  – it needs to pre-stage files (RAW data) from tape at the T1s
  – it needs to access the detector conditions data on the Oracle racks at the T1s
• Current issues:
  – pre-staging is still not quite working yet
    • the software exists and is being tested
    • every T1 has a different storage setup, performance, etc. …
  – conditions database access is not quite working yet
    • each job opens several connections to the database at the beginning of the job
    • too many concurrent and simultaneous jobs overload the database; being investigated

Page 31

Analysis

• The ATLAS analysis model is "jobs go to data"
  – analysis mostly runs on DPDs and AODs
  – initially, large access to ESD and possibly RAW
• Currently, 2 frameworks for analysis: Ganga and pAthena
  – both fully integrated with ATLAS DDM for data co-location
  – they will possibly be merged into a single tool
  – there is now a single support team

Page 32

Ganga

• Client-based analysis framework
  – central core component
  – multiple plug-ins to take advantage of various job submission systems:
    • gLite WMS
    • CondorG
    • local batch systems (LSF, PBS)
• Multi-VO project
• Analysis functional tests

Page 33

pAthena

• Server-based analysis framework
  – full usage of the Panda infrastructure
  – very advanced monitoring
  – offers job prioritization and user shares

[Plots: monitoring per user; worldwide pAthena activity (last month).]

Page 34

User Storage Space

• ATLAS now uses the SRM v2 interface everywhere
  – it offers the possibility to partition the space (space tokens) depending on the use case
• For central activities
  – DATADISK and DATATAPE for real data
  – MCDISK, MCTAPE and PRODDISK for simulation production
• For group analysis (GROUPDISK)
  – ideally, quota management per group
  – in reality, only a global quota and little possibility to configure group-based ACLs; needs policing
• For user analysis
  – USERDISK: scratch space for job output, lifetime cannot be guaranteed
  – LOCALGROUPDISK: not ATLAS-pledged resources, "home" space for users; same limitations as for GROUPDISK

Page 35

Experience from one week of beam data

Page 36

Day 1: we were ready

Page 37

Data arrived …

Page 38

We started exporting … and we saw issues.

• Effect of concurrent data access from centralized transfers and user activity (overload of a disk server)

[Plots: data-export throughput in MB/s and number of errors.]

Page 39

Conclusions

• Computing for an LHC experiment is extremely challenging
  – very demanding use cases
  – the system is complex and relies on many external components
• Centralized data distribution works reliably
  – tested in many challenges and in real life
• The Monte Carlo production framework is also reliable
  – but this is not yet true for data reprocessing
  – database access and data pre-staging need attention
• Data analysis by users is the real challenge now
  – it does not follow a particular pattern (non-organized by definition)
  – it is not always possible to protect production from users, or users from other users
  – it has never been "tested" at the real scale
• The EGEE Grid offers the necessary baseline services and infrastructure for ATLAS data taking
  – improvements in the area of storage are foreseen in the near future, based on the experiments' inputs and lessons