TDAQ commissioning R. Fantechi

R. Fantechi. TDAQ commissioning Coordination activity started on January Several items to be attacked already in January Regular meetings: 15/1, 29/1,

TDAQ commissioning
Coordination activity started in January
Several items to be attacked already in January
Regular meetings: 15/1, 29/1, 6/2
First topics attacked:
  PC farm
  Run control
  L0TP tests

PC farm
Basic constraints
Data throughput in input
  Good estimate: 25-30 full 10 Gb links (52-56 from detectors)
Number of 10 Gb ports on the router available for PCs
  O(30)
CPU needed for L1/L2 processing
  Unknown
  Estimation of an extreme case: run the full reconstruction with (guessed) 1 sec/event, the power of 30 2*12-core PCs will process 40 kHz of data
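The extreme-case estimate is a cores-over-processing-time calculation. The following is a minimal sketch of that arithmetic, assuming each core handles one event at a time and that the farm scales perfectly (neither assumption is stated on the slide); the function names are hypothetical. Taken literally, 30 PCs with 2*12 cores at 1 s/event give 720 events/s, while a 40 kHz rate on the same cores corresponds to about 18 ms per event, so the quoted 40 kHz presumably relies on a different per-event time or on assumptions not recorded here.

```python
def farm_event_rate_hz(n_pcs, cores_per_pc, seconds_per_event):
    """Events per second, assuming one event per core at a time (assumption)."""
    total_cores = n_pcs * cores_per_pc
    return total_cores / seconds_per_event


def time_budget_s(n_pcs, cores_per_pc, target_rate_hz):
    """Per-event processing time that the farm can afford at a target rate."""
    total_cores = n_pcs * cores_per_pc
    return total_cores / target_rate_hz


N_PCS = 30            # O(30) PCs, from the slide
CORES_PER_PC = 2 * 12  # 2 CPUs with 12 cores each, from the slide

rate_at_1s = farm_event_rate_hz(N_PCS, CORES_PER_PC, 1.0)
budget_at_40khz = time_budget_s(N_PCS, CORES_PER_PC, 40_000.0)

print(f"Rate at 1 s/event: {rate_at_1s:.0f} events/s")
print(f"Time budget at 40 kHz: {budget_at_40khz * 1e3:.0f} ms/event")
```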

[Figure: network layout diagram. Labels recovered from the diagram:]
  HP8212 with 88x 10Gb and 24x 1Gb
  52(56)x 10Gb ports to detector area
  30x 10Gb ports to online farm
  34x 1Gb ports to merger
  HPxx 48x 1Gb + 4x 10Gb
  HP2920 24x 1Gb or 48x 1Gb DCS IPMI ...
  CERN CDR + GPN
  CERN TN
  30 farm computers
  3 merger computers?
  Legend: 1Gb link, 10Gb link
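The 10 Gb port budget implied by these labels can be checked with a short calculation. This is only a sketch: it assumes that the detector links and the farm links all terminate on the 88 10 Gb ports of the HP8212, which the flattened diagram suggests but does not state explicitly. The function name is hypothetical.

```python
ROUTER_10G_PORTS = 88  # HP8212 with 88x 10Gb (from the diagram)
FARM_PORTS = 30        # 30x 10Gb ports to online farm (from the diagram)


def spare_10g_ports(detector_ports, farm_ports=FARM_PORTS,
                    total_ports=ROUTER_10G_PORTS):
    """Return the number of unused 10 Gb ports for a given detector link count."""
    used = detector_ports + farm_ports
    if used > total_ports:
        raise ValueError(f"{used} ports needed, only {total_ports} available")
    return total_ports - used


# The diagram quotes 52 (or 56) links to the detector area.
for detector_ports in (52, 56):
    spare = spare_10g_ports(detector_ports)
    print(f"{detector_ports} detector links: {spare} spare 10 Gb ports")
```

Under this assumption only a handful of 10 Gb ports remain free, which is consistent with the farm being limited to O(30) PCs.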

PC farm discussion
Focus on O(30) PCs
  2 CPUs, 8 or 12 cores, 64 GB of memory
  Good power supply (PS) and at least one PCI 16X GEN 3 slot for a GPU
    GPU to be installed at a later stage to boost L1/L2 capacity
  Keep the network infrastructure as simple as it is today
Several investigations ongoing
  Alberto and Paolo looking for offers in Italy
  RF looking for existing CERN-wide contracts in IT or LHC experiments, to profit from good discounts
  Jonas searching the market in Germany and evaluating the cost of his proposal of assembling PCs on shelves inside the racks
Some points still to be defined for the pf-ring driver

PC farm discussion
2 more mergers are needed
  Connect the 1 Gb/s outputs of the 30 PCs to a 48-port 2920 with four 10 Gb interfaces
  Three 10 Gb interfaces go to the 3 mergers
  The fourth one goes back to the router, to reach the CDR
Additional network switch needed for IPMI control
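The merger fan-in described above can be sanity-checked numerically. The sketch below assumes the worst case in which all 30 PCs send at the full 1 Gb/s line rate and the traffic is split evenly over the 3 mergers; both are assumptions made for illustration, and the function names are hypothetical.

```python
def aggregate_output_gbps(n_pcs, per_pc_gbps):
    """Worst-case total output of the farm into the 2920 switch."""
    return n_pcs * per_pc_gbps


def uplink_utilisation(n_pcs, per_pc_gbps, n_mergers, uplink_gbps):
    """Fraction of each merger uplink used, assuming an even split (assumption)."""
    per_merger = aggregate_output_gbps(n_pcs, per_pc_gbps) / n_mergers
    return per_merger / uplink_gbps


total = aggregate_output_gbps(n_pcs=30, per_pc_gbps=1.0)
load = uplink_utilisation(n_pcs=30, per_pc_gbps=1.0,
                          n_mergers=3, uplink_gbps=10.0)

print(f"Aggregate farm output: {total:.0f} Gb/s")
print(f"Utilisation of each 10 Gb merger uplink: {load:.0%}")
```

With these inputs the three 10 Gb uplinks exactly match the worst-case aggregate of the 30 1 Gb/s outputs, so the configuration has no headroom at full line rate but is not oversubscribed.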

PC farm discussion
Start to think about a strategy in case more power is needed
GPUs are one way, but no work has started yet
  However, there is a limit to the number/power of GPUs that can be put in a PC
    Power supply capability, PCI slots
    Cost tradeoff for high-performance GPUs
On the other side, increase the number of PCs
  Change the network configuration
    Either build a tree of routers (topology and configuration to be checked) or go to a more expensive Brocade router to replace the current HP8xxx one
  Limited solution (Jonas): equip each of the 30 PCs with two 10 Gb interfaces and connect another 30 PCs in cascade
  On top of this, consider that, depending on the performance of the trigger, we may have an input rate of less than 30 10 Gb links. This would free slots in the router for more PCs. Again, not a scalable solution

L0TP
No request for the time being for a test at CERN
  Ferrara claims it is able to exercise the L0TP proto with a clever PC program
  Possibility of a test here in the middle of March
ECN3 facilities always ready (continuous dry run)
  Clock, Talk board
  Service PCs
  Everything restarted at the beginning of 2014
  Regularly used for the CREAM tests
  To be eventually integrated with more TEL62s

Network
Almost all the switches available in IT
  A first batch of installations in ECN3 is foreseen soon
    LAV4-11, CHANTI, router line cards
  Followed by the LKr switches
  Complete with Straws and the rest
  All fiber patches for clock and network ordered
Isolation of the network
  A cluster of two virtual machines running Windows Terminal Server has been defined and is available to us
    Work to be started to configure the accesses
  Another VM with Linux available to start the same exercise for Linux applications

Run control
Meeting on the upgrade of the run control before Christmas
  Nicolas has prepared a draft note
    To be finalized this week
  Implementation starting in March
Main points
  Maintain configuration files in a central database
  To be transferred to the detector clients at init/startrun
  Utility to handle the interaction with the database
Clients to be upgraded/written
  Tdspy (main effort), LKr init, LKr calib, GTK and Straw init, L0tp
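The main points above (central storage of configuration files, transfer to the clients at init/startrun, a utility wrapping the database access) can be illustrated with a small sketch. It is not the design in the draft note: the table layout, the function names, the "physics" run type and the placeholder file contents are all hypothetical, and an in-memory SQLite database stands in for the central database.

```python
import sqlite3


def open_config_db():
    """Create an in-memory stand-in for the central configuration database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE config ("
        " client TEXT, run_type TEXT, content TEXT,"
        " PRIMARY KEY (client, run_type))"
    )
    return db


def store_config(db, client, run_type, content):
    """Utility side: save (or overwrite) one client's configuration file."""
    db.execute(
        "INSERT OR REPLACE INTO config VALUES (?, ?, ?)",
        (client, run_type, content),
    )
    db.commit()


def fetch_config(db, client, run_type):
    """Return the configuration text for one client, or raise KeyError."""
    row = db.execute(
        "SELECT content FROM config WHERE client = ? AND run_type = ?",
        (client, run_type),
    ).fetchone()
    if row is None:
        raise KeyError(f"no configuration for {client!r} / {run_type!r}")
    return row[0]


def on_start_run(db, clients, run_type):
    """At init/startrun, collect the configuration to send to each client."""
    return {client: fetch_config(db, client, run_type) for client in clients}


db = open_config_db()
# Placeholder contents; real files would come from the detector groups.
store_config(db, "Tdspy", "physics", "placeholder Tdspy settings")
store_config(db, "L0tp", "physics", "placeholder L0tp settings")

configs = on_start_run(db, ["Tdspy", "L0tp"], "physics")
for client, content in configs.items():
    print(client, "->", content)
```

Keeping all database access inside a few helper functions mirrors the "utility to handle the interaction with the database" item: clients only see the configuration text they receive at init/startrun.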

TTCex & LTUs
All the requested TTCex available since Feb 2013
  MK I type
  No new production in the near future
  Few spare modules available in PH-ESE, but more could come at the end of the year, due to some dismantling in CMS
    Occasion to procure a reasonable number of spares
Preparation of the final LTU crate(s)
  See Marian's talk

Crates
Long saga following the initial failures in 2012
  Problem not understood on sysreset line
  Filtering ok for LAV
Several checks done on the existing crates
  A test board prepared to check voltages and sysreset functionality
  Indeed, one PS found with a problem on 48 V (not going immediately to zero after power off)
  Other two with sysreset problem
  All three were damaged in 2012 (badly repaired?)
  These three sent back to Wiener as the first ones to have the filtering upgraded

Crates
It is anyway time now to make the order
  Sent from the pool coordinator at the end of March
  6 crates on hold, 5 prepared in December
Organize a set of hot spares
  Two full LAV-type crates in Valeri’s lab
  It should be safe to have in addition one PS LAV type + 1 PS TEL type as additional spares
  Available at the experiment for a first line intervention

Crate summary
Prototype crates
  1 LAV type (Frascati) + 1 TEL type (Roma2, now at CERN)
2011 order
  8 LAV type: LAV1, LAV2, LAV3, CEDAR, MUV, CHOD, Straw, spare
2012 order (on hold, to be unblocked)
  6 LAV type: LAV4, LAV5, LAV6, LAV7, LAV8, LAV9
  However one filtered LAV PS has been delivered
2013 order (prepared, wait for our ok)
  4 LAV type: LAV10, LAV11, 2*CHANTI
  4 TEL type: 3*L0/LKR, 1 RICH
CHOD was originally RICH, it will be LAV12
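The crate counts in this summary can be tallied per type. The sketch below simply adds up the numbers quoted per order; it counts prototypes together with production crates, which is a choice made here for illustration, and the names are hypothetical.

```python
from collections import Counter

# Crate counts per order and type, as quoted in the summary.
CRATE_ORDERS = {
    "prototypes": {"LAV": 1, "TEL": 1},
    "2011": {"LAV": 8},
    "2012": {"LAV": 6},
    "2013": {"LAV": 4, "TEL": 4},
}


def total_by_type(orders):
    """Sum the crate counts of every order, grouped by crate type."""
    totals = Counter()
    for crates in orders.values():
        totals.update(crates)  # Counter.update adds counts from a mapping
    return totals


totals = total_by_type(CRATE_ORDERS)
print(f"LAV type: {totals['LAV']}, TEL type: {totals['TEL']}, "
      f"all crates: {sum(totals.values())}")
```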

Items for the next meetings
Decision on the PC farm purchases
Tuning of the PC farm software
  Rates, merger output rate, diagnostics, documentation
Review of the requirements of the various detectors
  GTK, Straw, TEL62s, Gandalf, LKr
  As far as network, PC farm, etc
Scheduling of common and private dry runs
Synergy with computing WG for CDR and storage
  Together with IT in the newly formed coordination meetings