14
Roberto Divià, Roberto Divià, CERN/ALICE CERN/ALICE 1 CHEP 2009, Prague, 21-27 March 2009 CHEP 2009, Prague, 21-27 March 2009 The The ALICE Online ALICE Online Data Storage Data Storage System System Roberto Divià Roberto Divià (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva (CERN), (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva (CERN), Pierre Vande Vyvre (CERN) Pierre Vande Vyvre (CERN) Valerio Altini (CERN), Franco Carena (CERN), Wisla Carena (CERN), Sylvain Chapeland (CERN), Vasco Valerio Altini (CERN), Franco Carena (CERN), Wisla Carena (CERN), Sylvain Chapeland (CERN), Vasco Chibante Barroso (CERN), Chibante Barroso (CERN), Filippo Costa (CERN), Filimon Roukoutakis (CERN), Klaus Schossmaier (CERN), Csaba Soòs (CERN), Filippo Costa (CERN), Filimon Roukoutakis (CERN), Klaus Schossmaier (CERN), Csaba Soòs (CERN), Barthelemy Von Haller (CERN) Barthelemy Von Haller (CERN) For the ALICE collaboration For the ALICE collaboration

Roberto Divià, CERN/ALICE 1 CHEP 2009, Prague, 21-27 March 2009 The ALICE Online Data Storage System Roberto Divià (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva

Embed Size (px)

Citation preview

Page 1: Roberto Divià, CERN/ALICE 1 CHEP 2009, Prague, 21-27 March 2009 The ALICE Online Data Storage System Roberto Divià (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva

Roberto Divià, CERN/ALICERoberto Divià, CERN/ALICE 1CHEP 2009, Prague, 21-27 March 2009CHEP 2009, Prague, 21-27 March 2009

TheTheALICE OnlineALICE OnlineData Storage Data Storage

SystemSystemRoberto DiviàRoberto Divià (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva (CERN), Pierre Vande Vyvre (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva (CERN), Pierre Vande Vyvre

(CERN)(CERN)

Valerio Altini (CERN), Franco Carena (CERN), Wisla Carena (CERN), Sylvain Chapeland (CERN), Vasco Chibante Barroso (CERN), Valerio Altini (CERN), Franco Carena (CERN), Wisla Carena (CERN), Sylvain Chapeland (CERN), Vasco Chibante Barroso (CERN),

Filippo Costa (CERN), Filimon Roukoutakis (CERN), Klaus Schossmaier (CERN), Csaba Soòs (CERN), Barthelemy Von Haller Filippo Costa (CERN), Filimon Roukoutakis (CERN), Klaus Schossmaier (CERN), Csaba Soòs (CERN), Barthelemy Von Haller

(CERN)(CERN)

For the ALICE collaborationFor the ALICE collaboration

Page 2: Roberto Divià, CERN/ALICE 1 CHEP 2009, Prague, 21-27 March 2009 The ALICE Online Data Storage System Roberto Divià (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva

Roberto Divià, CERN/ALICERoberto Divià, CERN/ALICE 2CHEP 2009, Prague, 21-27 March 2009CHEP 2009, Prague, 21-27 March 2009

PDSPDS

CASTORCASTOR

1.25 GB/s1.25 GB/s

ALICEALICE

environmentenvironment

for the GRIDfor the GRID

AliEnAliEn

GDCGDC

ALICE trigger, DAQ & ALICE trigger, DAQ & HLTHLT

CTPCTP

LTULTU

TTCTTC

FEROFERO FEROFERO

LTULTU

TTCTTC

FEROFERO FEROFERO

LDCLDCLDCLDC

Storage NetworkStorage Network

Transient Data Storage (TDS)Transient Data Storage (TDS)

D-RORCD-RORCD-RORCD-RORC

LDCLDC

D-RORCD-RORC D-RORCD-RORC

LDCLDC

D-RORCD-RORC

HLT FarmHLT Farm

25 GB/s25 GB/s

MoverMover

DAQ NetworkDAQ Network

Page 3: Roberto Divià, CERN/ALICE 1 CHEP 2009, Prague, 21-27 March 2009 The ALICE Online Data Storage System Roberto Divià (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva

Roberto Divià, CERN/ALICERoberto Divià, CERN/ALICE 3CHEP 2009, Prague, 21-27 March 2009CHEP 2009, Prague, 21-27 March 2009

ALICE trigger, DAQ & ALICE trigger, DAQ & HLTHLT

PDSPDS

CASTORCASTOR

1.25 GB/s1.25 GB/s

GDCGDC

Storage NetworkStorage Network

Transient Data Storage (TDS)Transient Data Storage (TDS)

MoverMover

DAQ NetworkDAQ Network

Page 4: Roberto Divià, CERN/ALICE 1 CHEP 2009, Prague, 21-27 March 2009 The ALICE Online Data Storage System Roberto Divià (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva

Roberto Divià, CERN/ALICERoberto Divià, CERN/ALICE 4CHEP 2009, Prague, 21-27 March 2009CHEP 2009, Prague, 21-27 March 2009

Our objectivesOur objectives• Ensure steady and reliable data flow up to the design specsEnsure steady and reliable data flow up to the design specs

• Avoid stalling the detectors with data flow slowdownsAvoid stalling the detectors with data flow slowdowns

• Give sufficient resources for online objectification in ROOT format via AliROOTGive sufficient resources for online objectification in ROOT format via AliROOT very CPU-intensive procedurevery CPU-intensive procedure

• Satisfy needs from ALICE parallel runs and from multiple detectors Satisfy needs from ALICE parallel runs and from multiple detectors commissioningcommissioning

• Allow a staged deployment of the DAQ/TDS hardwareAllow a staged deployment of the DAQ/TDS hardware

• Provide sufficient storage for a complete LHC spill in case the transfer between the Provide sufficient storage for a complete LHC spill in case the transfer between the experiment and the CERN Computer Center does not progressexperiment and the CERN Computer Center does not progress

PDSPDS

CASTORCASTOR

1.25 GB/s1.25 GB/s

GDCGDC

Storage NetworkStorage Network

Transient Data Storage (TDS)Transient Data Storage (TDS)

MoverMover

DAQ NetworkDAQ Network

Page 5: Roberto Divià, CERN/ALICE 1 CHEP 2009, Prague, 21-27 March 2009 The ALICE Online Data Storage System Roberto Divià (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva

Roberto Divià, CERN/ALICERoberto Divià, CERN/ALICE 5CHEP 2009, Prague, 21-27 March 2009CHEP 2009, Prague, 21-27 March 2009

Current TDS Current TDS architecturearchitecture

CVFS over IPCVFS over IP

5 switches Qlogic SANBox 5602:5 switches Qlogic SANBox 5602:• FC 4 Gb: equipment, PCs, storageFC 4 Gb: equipment, PCs, storage• FC 10 Gb: inter-switches connectionsFC 10 Gb: inter-switches connections

• 5 * 5 Disk Arrays Infotrend A16F models G2422 & G24305 * 5 Disk Arrays Infotrend A16F models G2422 & G2430• 5 * 15 disk volumes5 * 15 disk volumes• Total maximum space: 59 TBTotal maximum space: 59 TB• CVFS: StorNext 3.1.2CVFS: StorNext 3.1.2• Handled by the Transient Data Storage Manager (TDSM)Handled by the Transient Data Storage Manager (TDSM)

5 * ( 6 * GDCs + 2 * Movers )5 * ( 6 * GDCs + 2 * Movers )

……

……

++

Page 6: Roberto Divià, CERN/ALICE 1 CHEP 2009, Prague, 21-27 March 2009 The ALICE Online Data Storage System Roberto Divià (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva

Roberto Divià, CERN/ALICERoberto Divià, CERN/ALICE 6CHEP 2009, Prague, 21-27 March 2009CHEP 2009, Prague, 21-27 March 2009

Future upgradesFuture upgrades

Two switches SANBox 9000, 8 blades maximum each:Two switches SANBox 9000, 8 blades maximum each:

• 9 * Blades with 16 ports FC 4 Gb: equipment, hosts, storage9 * Blades with 16 ports FC 4 Gb: equipment, hosts, storage

• 2 * Blades with 4 ports FC 10 Gb: inter-switches connections2 * Blades with 4 ports FC 10 Gb: inter-switches connections

Page 7: Roberto Divià, CERN/ALICE 1 CHEP 2009, Prague, 21-27 March 2009 The ALICE Online Data Storage System Roberto Divià (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva

Roberto Divià, CERN/ALICERoberto Divià, CERN/ALICE 7CHEP 2009, Prague, 21-27 March 2009CHEP 2009, Prague, 21-27 March 2009

emptyingemptying

fullfull

fillingfilling

CASTORCASTOR

TDSM architectureTDSM architecture

TDSTDS

GDCGDC GDCGDC GDCGDCFeedbackFeedback

to the CTPto the CTP

TDSMTDSM

FileFile

MoverMover

TDSMTDSM

FileFile

MoverMover

TDSMTDSM

ManagerManager

DAQ networkDAQ network

CERNCERN

GPNGPNMSSMSS

networknetwork

AliEnAliEn

spoolerspooler

TDSMTDSMconfigurationconfiguration

& control DB& control DB

GDCGDC

TDSMTDSM

FileFile

MoverMover

TDSMTDSM

FileFile

MoverMover

freefree

disableddisabled

AliEnAliEn

DAQDAQ

logbooklogbook

Page 8: Roberto Divià, CERN/ALICE 1 CHEP 2009, Prague, 21-27 March 2009 The ALICE Online Data Storage System Roberto Divià (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva

Roberto Divià, CERN/ALICERoberto Divià, CERN/ALICE 8CHEP 2009, Prague, 21-27 March 2009CHEP 2009, Prague, 21-27 March 2009

MonitoringMonitoringTDSM DB to get the status and history of the TDS/TDSMTDSM DB to get the status and history of the TDS/TDSM

Page 9: Roberto Divià, CERN/ALICE 1 CHEP 2009, Prague, 21-27 March 2009 The ALICE Online Data Storage System Roberto Divià (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva

Roberto Divià, CERN/ALICERoberto Divià, CERN/ALICE 9CHEP 2009, Prague, 21-27 March 2009CHEP 2009, Prague, 21-27 March 2009

MonitoringMonitoringLogbook to monitor the status of the migrationLogbook to monitor the status of the migration

Page 10: Roberto Divià, CERN/ALICE 1 CHEP 2009, Prague, 21-27 March 2009 The ALICE Online Data Storage System Roberto Divià (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva

Roberto Divià, CERN/ALICERoberto Divià, CERN/ALICE 10CHEP 2009, Prague, 21-27 March 2009CHEP 2009, Prague, 21-27 March 2009

MonitoringMonitoringLemon to monitor the system setup and metricsLemon to monitor the system setup and metrics

Page 11: Roberto Divià, CERN/ALICE 1 CHEP 2009, Prague, 21-27 March 2009 The ALICE Online Data Storage System Roberto Divià (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva

Roberto Divià, CERN/ALICERoberto Divià, CERN/ALICE 11CHEP 2009, Prague, 21-27 March 2009CHEP 2009, Prague, 21-27 March 2009

Validation & testingValidation & testing Small “lab style” test setupsSmall “lab style” test setups

Special running mode where:Special running mode where: a single LDC can inject several real eventsa single LDC can inject several real events the Event builder unpacks it as for the original eventthe Event builder unpacks it as for the original event

Dedicated “write and forget” CASTOR poolDedicated “write and forget” CASTOR pool Ad-hoc “black hole” AliEn registration serviceAd-hoc “black hole” AliEn registration service

Profiling during detectors commissioning and cosmic runsProfiling during detectors commissioning and cosmic runs

ALICE Data ChallengesALICE Data Challenges Run between 1999 and 2006Run between 1999 and 2006 Periodic full-chain tests (ALICE DAQ/Offline + IT department)Periodic full-chain tests (ALICE DAQ/Offline + IT department)

Page 12: Roberto Divià, CERN/ALICE 1 CHEP 2009, Prague, 21-27 March 2009 The ALICE Online Data Storage System Roberto Divià (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva

Roberto Divià, CERN/ALICERoberto Divià, CERN/ALICE 12CHEP 2009, Prague, 21-27 March 2009CHEP 2009, Prague, 21-27 March 2009

ALICE Data ALICE Data ChallengesChallenges

Define an architecture (HW and SW)Define an architecture (HW and SW) Re-use existing idling components (IT and ALICE)Re-use existing idling components (IT and ALICE) If needed, add some glue here and thereIf needed, add some glue here and there

Earlier ADCs: lots of glue!Earlier ADCs: lots of glue! Evaluate and profile the individual componentsEvaluate and profile the individual components Put them together and check the resultPut them together and check the result Do short sustained tests (hours)Do short sustained tests (hours) Run the final Challenge (7 days) with two targets:Run the final Challenge (7 days) with two targets:

Sustained overall data rateSustained overall data rate Amount of data to PDSAmount of data to PDS

Repeat the exercise year after year with more challenging objectivesRepeat the exercise year after year with more challenging objectives Achieve quasi-ALICE results with minimum glue right before Achieve quasi-ALICE results with minimum glue right before

ALICE commissioningALICE commissioning

Page 13: Roberto Divià, CERN/ALICE 1 CHEP 2009, Prague, 21-27 March 2009 The ALICE Online Data Storage System Roberto Divià (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva

Roberto Divià, CERN/ALICERoberto Divià, CERN/ALICE 13CHEP 2009, Prague, 21-27 March 2009CHEP 2009, Prague, 21-27 March 2009

The TDS in 2008The TDS in 2008 25 February to 9 March 2008: ALICE Cosmic runs25 February to 9 March 2008: ALICE Cosmic runs

1500 runs1500 runs 340 hours340 hours 70 TB70 TB

3+4Q08:3+4Q08: 6800 runs6800 runs 3300 hours3300 hours 108 TB108 TB

Page 14: Roberto Divià, CERN/ALICE 1 CHEP 2009, Prague, 21-27 March 2009 The ALICE Online Data Storage System Roberto Divià (CERN), Ulrich Fuchs (CERN), Irina Makhlyueva

Roberto Divià, CERN/ALICERoberto Divià, CERN/ALICE 14CHEP 2009, Prague, 21-27 March 2009CHEP 2009, Prague, 21-27 March 2009

In conclusion…In conclusion… Continuous evaluation of HW & SW components Continuous evaluation of HW & SW components

proved the feasibility of the TDS/TDSM architecture proved the feasibility of the TDS/TDSM architecture All components validated and profiledAll components validated and profiled ADCs gave highly valuable information for the R&D processADCs gave highly valuable information for the R&D process

Additional ADCs added to the ALICE DAQ planning for 2009Additional ADCs added to the ALICE DAQ planning for 2009 Detector commissioning went smoothly & all objectives were metDetector commissioning went smoothly & all objectives were met No problems during cosmic and preparation runsNo problems during cosmic and preparation runs Staged commissioning on its wayStaged commissioning on its way Global tuning in progressGlobal tuning in progress

We are ready for LHC startupWe are ready for LHC startup