Run II DZERO DAQ / Level 3 Trigger System Ron Rechenmacher, Fermilab The DØ DAQ Group (Brown/FNAL-CD/U.of Washington) CHEP03

Run II DZERO DAQ / Level 3 Trigger System

Ron Rechenmacher, Fermilab

The DØ DAQ Group (Brown/FNAL-CD/U. of Washington)

CHEP03

3/24/2003 Ron Rechenmacher, Fermilab Slide 2

What to Expect

DAQ/trigger system overview

Hardware, software description

Performance

Lessons learned

Scaling to the future

3/24/2003 Ron Rechenmacher, Fermilab Slide 3

D0 at Fermilab

3/24/2003 Ron Rechenmacher, Fermilab Slide 4

D0 DAQ at Fermilab

Collision rate of 7.6 MHz

3 level trigger system

L1/L2 reduce rate to 1KHz into L3

250 MB/s average event data rate into L3

50 Hz and 12.5 MB/s output from L3

[Diagram: 7.5 MHz into L1/L2 (Custom Hardware); 1 KHz into L3/DAQ (Commodity HW/SW); 50 Hz to Data Tape]
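
The rates quoted on this slide can be cross-checked with a few lines of arithmetic. The sketch below is not from the talk. It assumes decimal units (1 MB = 1000 KB), which is what the 250 KB at 1 KHz = 250 MB/s figure implies, and the function and variable names are illustrative only.

```python
def data_rate_mb_s(event_rate_hz: float, event_size_kb: float) -> float:
    """Average data rate in MB/s, assuming 1 MB = 1000 KB."""
    return event_rate_hz * event_size_kb / 1000.0


# Input to L3: 1 KHz of 250 KB events.
l3_input_mb_s = data_rate_mb_s(1000, 250)

# Output from L3: 12.5 MB/s at 50 Hz implies the average event size below.
l3_output_event_kb = 12.5 * 1000 / 50

# Rejection factors implied by the quoted rates.
l1l2_rejection = 7.6e6 / 1e3   # collision rate -> L3 input rate
l3_rejection = 1000 / 50       # L3 input rate -> rate to tape

print(l3_input_mb_s, l3_output_event_kb, l1l2_rejection, l3_rejection)
```

The output side is consistent with the input side: 12.5 MB/s at 50 Hz is the same 250 KB average event, so L3 reduces the rate by selecting events, not by shrinking them.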

3/24/2003 Ron Rechenmacher, Fermilab Slide 5

Commodity DAQ/L3 System

We chose a good mix of hardware and software and built a system that easily met the 250KB @ 1KHz (=250MB/sec) requirement.

• Great depth of software development tools and methodologies.

Commodity software development environment

Commodity hardware

3/24/2003 Ron Rechenmacher, Fermilab Slide 6

DAQ/trigger System Overview

3/24/2003 Ron Rechenmacher, Fermilab Slide 7

Commodity Single Board Computer “SBC”

Mechanically supported in crate by custom 9U “Extender” board

• 933 MHz CPU
• 128 MB Flash ROM
• 128 MB RAM
• “PMC” slot (filled with BVM I/O module)
• VME to PCI (Universe II)
• Dual 100Mb Ethernet
• J3 connector
• SBC front-panel connections
• Status lights

3/24/2003 Ron Rechenmacher, Fermilab Slide 8

Hardware Description - Switches

6509 (single central switch)
• 16 GB/s backplane
• 9 module slots
• 8 port Gb (fiber or copper)
• 48 port 100Mb/s
• 112MB/48 ports for output buffering

2948G (currently 5 of these in the system)
• 48 100Mb ports and 2 Gb ports
• “Concentrator” switch
• Combines data from up to 20 100Mb/s inputs into 2 Gb outputs
• No packet loss possible if limited to 20 inputs

Capacity well exceeds D0 requirements
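
The "no packet loss if limited to 20 inputs" claim for the concentrator switch follows from simple capacity arithmetic. The sketch below is an illustration, not the group's analysis. It assumes that "2 Gb outputs" means two 1 Gb/s uplinks and that traffic is spread evenly across them; the function name and defaults are hypothetical.

```python
def uplink_utilization(n_inputs: int,
                       input_mbps: int = 100,
                       n_uplinks: int = 2,
                       uplink_mbps: int = 1000) -> float:
    """Worst-case offered load divided by total uplink capacity."""
    return (n_inputs * input_mbps) / (n_uplinks * uplink_mbps)


# 20 inputs at line rate exactly fill the two uplinks (assumed 1 Gb/s each).
print(uplink_utilization(20))

# All 48 ports at line rate would oversubscribe the uplinks.
print(uplink_utilization(48))
```

With a ratio at or below 1.0 the uplink queues cannot grow without bound, even if every input sends at full line rate, which is the sense in which loss is ruled out.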

3/24/2003 Ron Rechenmacher, Fermilab Slide 9

Hardware Description - Nodes

82 nodes in total, currently

Dual CPU

1 GB RAM

1 GHz PIII / AMD Athlon 2000’s

Dual ethernet

Cost effective

3/24/2003 Ron Rechenmacher, Fermilab Slide 10

Developed Software Description

Multiple runs can be configured simultaneously

Connections to monitor server (talk on Thursday)

All connections TCP
• Auto re-connection

Application buffer tracking

Components of the system can be restarted ‘on the fly’

[Diagram: components D0 Run Control, DAQ Supervisor, Routing Master, SBCs, Nodes, Monitoring; flows labeled Configuration, Routing, Crate-lists, Buffer-info, Event Data, Selected Events (To Tape), Runs]

3/24/2003 Ron Rechenmacher, Fermilab Slide 11

Software Infrastructure

Linux 2.4 kernel
• Modifiable
• One arp patch
• Easy development
• Kernel debugging – KGDB

Tcpdump

Fermi Linux Trace
• Complete system – kernel <-> user space interaction

Rgang
• Single executable
• Parallel remote execution and file copy for “farms”

3/24/2003 Ron Rechenmacher, Fermilab Slide 12

Software TRACE

root@d0sbc001b:/proc/trace>cat buffer | head -20
 count        timeStamp  PID TraceName CPU lvl message
------------------------------------------------------------------------
     1 1048198375378446 1620 KERNEL      0  30 exit do_softirq
     2 1048198375378425 1620 KERNEL      0  30 enter do_softirq
     3 1048198375378422 1620 KERNEL      0  30 exit handle_IRQ_event irq=5
     4 1048198375378411 1620 KERNEL      0  30 enter handle_IRQ_event irq=5
     5 1048198375377887 1339 KERNEL      0  31 sched: prev=1339 next=1620
     6 1048198375377788 1339 KERNEL      0  30 exit do_softirq
     7 1048198375377780 1339 KERNEL      0  30 enter do_softirq
     8 1048198375377779 1339 KERNEL      0  30 exit handle_IRQ_event irq=5
     9 1048198375377771 1339 KERNEL      0  30 enter handle_IRQ_event irq=5
    10 1048198375377716 1339 l3xetg      0   8 Node idx 55: total=3 delta=1 latency=0
    11 1048198375377688 1339 KERNEL      0  30 exit do_softirq
    12 1048198375377686 1339 KERNEL      0  30 enter do_softirq
    13 1048198375377684 1339 KERNEL      0  30 exit handle_IRQ_event irq=5
    14 1048198375377680 1339 KERNEL      0  30 enter handle_IRQ_event irq=5
    15 1048198375377668 1620 KERNEL      0  31 sched: prev=1620 next=1339
    16 1048198375377646 1339 KERNEL      0  31 sched: prev=1339 next=1620
    17 1048198375377615 1339 KERNEL      0  30 exit do_softirq
root@d0sbc001b:/proc/trace>echo KERNEL=0x0fffffff >|level
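
A listing like the one above is easy to post-process. The following is a minimal sketch, not part of the Fermi Linux Trace package, that parses the seven columns shown on the slide and computes the time between consecutive entries. It assumes the timeStamp column is microseconds since the Unix epoch and that the buffer is printed newest entry first; both are inferred from the listing, and all names are hypothetical.

```python
from typing import List, NamedTuple, Optional


class TraceEntry(NamedTuple):
    count: int
    timestamp_us: int   # assumed: microseconds since the Unix epoch
    pid: int
    name: str
    cpu: int
    level: int
    message: str


def parse_trace_line(line: str) -> Optional[TraceEntry]:
    """Parse one data row; return None for the header and separator rows."""
    parts = line.split(None, 6)
    if len(parts) < 7 or not parts[0].isdigit():
        return None
    count, ts, pid, name, cpu, lvl, msg = parts
    return TraceEntry(int(count), int(ts), int(pid), name,
                      int(cpu), int(lvl), msg.rstrip())


def parse_trace(text: str) -> List[TraceEntry]:
    entries = []
    for line in text.splitlines():
        entry = parse_trace_line(line)
        if entry is not None:
            entries.append(entry)
    return entries


def gaps_us(entries: List[TraceEntry]) -> List[int]:
    """Time between neighbouring entries (buffer is listed newest first)."""
    return [newer.timestamp_us - older.timestamp_us
            for newer, older in zip(entries, entries[1:])]


SAMPLE = """\
count timeStamp PID TraceName CPU lvl message
------------------------------------------------------------------------
1 1048198375378446 1620 KERNEL 0 30 exit do_softirq
2 1048198375378425 1620 KERNEL 0 30 enter do_softirq
3 1048198375378422 1620 KERNEL 0 30 exit handle_IRQ_event irq=5
"""

entries = parse_trace(SAMPLE)
print(gaps_us(entries))
```

Under the microsecond assumption, the first gap is the time spent inside do_softirq for that interrupt.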

3/24/2003 Ron Rechenmacher, Fermilab Slide 13

Software Logic Analyzer

3/24/2003 Ron Rechenmacher, Fermilab Slide 14

Control

DØ Run Control sends configuration commands to Level 3
• Level 3 is a black box to the rest of DZERO

Level 3 Supervisor configures L3/DAQ system
• Allows configuration of multiple runs.

All components can crash or reboot at any time
• System will automatically reconfigure without contacting run control.

[Diagram: Run Control; Level 3/DAQ Supervisor; Routing Master; Nodes]

3/24/2003 Ron Rechenmacher, Fermilab Slide 15

Monitoring

Example use: A status display in the Control Room (or your living room!)

All components of the DAQ are clients

The Server caches recent queries to limit the load on clients

There are many displays, each serving a specialized purpose (uMon, l3xqt, history, systray, and web pages)

Based on TCP/IP, ACE, and XML
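
The caching behaviour of the monitor server can be pictured with a short sketch. This is not the ACE/XML implementation described in the talk. It assumes a time-based cache with an illustrative maximum age, and the class, the `fetch` callback, and its arguments are all hypothetical.

```python
import time
from typing import Any, Callable, Dict, Tuple


class CachingMonitorServer:
    """Answer display queries from a cache; ask the DAQ client only when stale."""

    def __init__(self,
                 fetch: Callable[[str, str], Any],
                 max_age_s: float = 1.0,          # assumed value, not from the talk
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch = fetch
        self._max_age_s = max_age_s
        self._clock = clock
        self._cache: Dict[Tuple[str, str], Tuple[float, Any]] = {}

    def query(self, client: str, item: str) -> Any:
        key = (client, item)
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self._max_age_s:
            return hit[1]                         # served without touching the client
        reply = self._fetch(client, item)         # one request reaches the client
        self._cache[key] = (now, reply)
        return reply
```

With this shape, forty displays asking for the same quantity within the cache lifetime cost the DAQ component a single request.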

Real-time “Trace”

Example use: See the event numbers that were in an SBC’s buffers just before some glitch occurs

Combines low-level debugging information and log-file entries in a single real-time circular buffer

The buffer can be “frozen” by either software or hardware triggers

A system-wide display has been demonstrated and is under development
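
The idea of a circular trace buffer that can be frozen is sketched below. This is an illustration only, not the kernel-level TRACE facility. The capacity and the names are hypothetical, and `freeze()` stands in for whichever software or hardware trigger fires.

```python
from collections import deque
from typing import Deque, List


class TraceBuffer:
    """Fixed-size circular buffer: new entries overwrite the oldest ones."""

    def __init__(self, capacity: int) -> None:
        self._entries: Deque[str] = deque(maxlen=capacity)
        self.frozen = False

    def trace(self, message: str) -> None:
        # Once frozen, further entries are dropped so the history
        # leading up to the trigger is preserved.
        if not self.frozen:
            self._entries.append(message)

    def freeze(self) -> None:
        self.frozen = True

    def snapshot(self) -> List[str]:
        return list(self._entries)


buf = TraceBuffer(capacity=3)
for n in range(1, 6):
    buf.trace(f"event {n} buffered")
buf.freeze()                      # e.g. a glitch was detected
buf.trace("event 6 buffered")     # ignored: buffer is frozen
print(buf.snapshot())
```

Freezing is what makes the "just before some glitch" use case work: the buffer stops overwriting itself at the moment of interest.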

Log-files

Example use: Understand why a node was not connected to an SBC yesterday

Centralized, accessible, and time-stamped

Errors go to SES

Able to control the amount written to log files

[Diagram: Clients; XML Server; Displays; Web Pages]

Talk on this topic on Thursday

3/24/2003 Ron Rechenmacher, Fermilab Slide 16

Performance

Current rate is 400Hz with 300KB events
• 120 MB/sec

Subset at 2KHz with smaller events

Subset utilizing dual ethernet using large events

3/24/2003 Ron Rechenmacher, Fermilab Slide 17

Performance

`Yearly' Graph

Percent backplane utilization

3/24/2003 Ron Rechenmacher, Fermilab Slide 18

Lessons Learned

R&D (systems analysis) goes a long way

(VME) systems integration expertise goes a long way
• Transcend sub-system boundaries

TCP expertise needed
• 200 ms dropped packet problem
• TCP not tuned for ‘real-time’ applications by default
• TCP_RTO_MIN parameter and others need tuning
• Understanding of Linux Kernel and TCP tools
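
A short calculation shows why a 200 ms stall matters at these rates. The 200 ms figure presumably corresponds to the Linux kernel's default minimum retransmission timeout (TCP_RTO_MIN, defined as HZ/5). The sketch below only computes how much data arrives during one such stall at the design rate quoted elsewhere in the talk; it says nothing about how the real system buffered it, and the names are illustrative.

```python
RTO_MIN_MS = 200  # default Linux minimum retransmission timeout (HZ/5)


def events_during_stall(event_rate_hz: int, stall_ms: int = RTO_MIN_MS) -> int:
    """Number of new events triggered while one connection waits to retransmit."""
    return event_rate_hz * stall_ms // 1000


def data_during_stall_mb(event_rate_hz: int, event_size_kb: int,
                         stall_ms: int = RTO_MIN_MS) -> float:
    """Event data (MB, decimal units) produced system-wide during the stall."""
    return events_during_stall(event_rate_hz, stall_ms) * event_size_kb / 1000.0


print(events_during_stall(1000))        # events at 1 KHz
print(data_during_stall_mb(1000, 250))  # MB of 250 KB events
```

At 1 KHz a single dropped packet leaves one connection idle for as long as it takes 200 further events to arrive, which is why the default timer is a poor fit for a DAQ.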

Track all software/configuration

3/24/2003 Ron Rechenmacher, Fermilab Slide 19

R&D

Significant upfront analysis/investigation

“To the metal” understanding/expertise

Basis for smooth integration

3/24/2003 Ron Rechenmacher, Fermilab Slide 20

R&D - VME SBCs

Universe II
• VMETRO studies

Linux interrupt latency measurements

VMETRO VBT-325C VME Trace    Sampling: STATE at Middle
Trace Search Jump Count Format Markers Window Quit Help
+----------------------------------VME#1-----------------------------------+
|        Time  BgL  AM Address  Data     Size Cycle Stat IRQ7:1* Iack      |
|  -17   1.35us ---- 39 ..21A004 ....0001 WORD RD    OK   ....... ----     |
|  -16  47.41us ---- 39 ..224012 ....0500 WORD RD    OK   ....... ----     |
|  -15  49.97us ---- 39 ..22C012 ....0500 WORD RD    OK   ....... ----     |
|  -14  46.61us ---- 39 ..234012 ....0500 WORD RD    OK   ....... ----     |
|  -13 165.04ms ---- 39 ..21A004 ....0001 WORD RD    OK   ....... ----     |
|  -12   1.15us ---- 39 ..21A004 ....0001 WORD RD    OK   ....... ----     |
|  -11   1.15us ---- 39 ..21A004 ....0001 WORD RD    OK   ....... ----     |
|  -10   1.35us ---- 39 ..21A004 ....0011 WORD RD    OK   ....... ----     |
|   -9  46.93us ---- 39 ..224012 ....0500 WORD RD    OK   ....... ----     |
|   -8  49.81us ---- 39 ..22C012 ....0500 WORD RD    OK   ....... ----     |
|   -7  46.61us ---- 39 ..234012 ....0500 WORD RD    OK   ....... ----     |
|   -6 166.35ms ---- 39 ..21A004 ....0001 WORD RD    OK   ....... ----     |
|   -5   1.15us ---- 39 ..21A004 ....0001 WORD RD    OK   ....... ----     |
|   -4   1.15us ---- 39 ..21A004 ....0001 WORD RD    OK   ....... ----     |
|   -3   1.33us ---- 39 ..21A004 ....0001 WORD RD    OK   ....... ----     |
|   -2  46.45us ---- 39 ..224012 ....0500 WORD RD    OK   ....... ----     |
|   -1  50.77us ---- 39 ..22C012 ....0500 WORD RD    OK   ....... ----     |
| HALT  47.09us ---- 39 ..234012 ....0500 WORD RD    OK   ....... ----     |
+--------------------------------------------------------------------------+
Ok. <PF2=Menu> <^W=Nxt wnd>

Efficient OS access
• Writev
• Limit memory copying
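
To illustrate how writev limits memory copying: a gather write hands the kernel a list of separate buffers in one system call, so a header and several data fragments never have to be copied into one contiguous block first. The sketch below uses Python's `os.writev` wrapper (Unix only) purely to show the pattern. The SBC code itself is not shown in the talk, and the function names and the header/fragment layout are hypothetical.

```python
import os
from typing import Sequence


def writev_all(fd: int, buffers: Sequence[bytes]) -> int:
    """Write every buffer with gather writes, retrying after partial writes."""
    views = [memoryview(b) for b in buffers if len(b) > 0]
    total = 0
    while views:
        written = os.writev(fd, views)
        total += written
        # Drop the buffers that went out completely ...
        while views and written >= len(views[0]):
            written -= len(views[0])
            views.pop(0)
        # ... and trim the one that was only partly written.
        if views and written:
            views[0] = views[0][written:]
    return total


def send_event(fd: int, header: bytes, fragments: Sequence[bytes]) -> int:
    """Send a header followed by its fragments without concatenating them."""
    return writev_all(fd, [header, *fragments])
```

Usage: `send_event(sock.fileno(), header, fragments)` on a connected TCP socket sends the whole event without building an intermediate copy in user space.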

Linux RT scheduling

3/24/2003 Ron Rechenmacher, Fermilab Slide 21

R&D – Switches/ethernet

VME to ethernet
• Rate and CPU

We analyzed the architecture of the 6509
• buffering increased at the last minute and turned out not to be an issue
• Prepared to control required buffering via control of TCP window size

Round-trip message-passing timings

Tests done under Linux
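
Controlling switch buffering through the TCP window works because a sender can have at most one window of unacknowledged data in flight. If N senders converge on one output port, the port has to absorb at most N windows. The sketch below is a hedged illustration and not the group's procedure. It assumes the 112 MB of output buffering is divided evenly over 48 ports, takes the 63 sources from the summary slide as the worst-case fan-in, and uses the standard SO_RCVBUF socket option as the user-space knob for bounding the advertised window.

```python
import socket


def max_window_bytes(port_buffer_bytes: int, n_senders: int) -> int:
    """Largest per-connection window such that n_senders windows fit in the buffer."""
    return port_buffer_bytes // n_senders


def open_bounded_socket(window_bytes: int) -> socket.socket:
    """Create a TCP socket whose receive buffer caps the advertised window.

    The option is set before connect()/listen() so that it influences
    the window negotiated at connection setup.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, window_bytes)
    return sock


# Assumption: 112 MB shared evenly by 48 ports, decimal megabytes.
per_port_buffer = 112_000_000 // 48
window = max_window_bytes(per_port_buffer, n_senders=63)
print(per_port_buffer, window)
```

Note that Linux reports back roughly twice the requested SO_RCVBUF value because it reserves space for bookkeeping, so the effective window should be checked by measurement, not assumed.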

3/24/2003 Ron Rechenmacher, Fermilab Slide 22

Expanding

Room to grow
• This system could easily double

Utilization indicator

3/24/2003 Ron Rechenmacher, Fermilab Slide 23

Scaling

[Diagram labels: Emperor; Routing Master; SBCs; DAQ Nodes; Node Master; Event Node Groups; Event Nodes]

Get info from Emperor and pass to SBCs

Tell DAQ nodes which event node to use

Advertise total free buffers to the Emperor

Emperor… for each event:
• Pick an Event Node Group (ENG) with the most free buffers
• Inform the NM and RMs of the choice
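
The Emperor's per-event choice on this slide can be sketched as below. This is an illustration of the stated policy (pick the Event Node Group with the most free buffers), not code from the D0 system. The local decrement between advertisements, the tie-breaking, and all names are assumptions.

```python
from typing import Dict, Optional


class Emperor:
    """Track free buffers per Event Node Group (ENG) and pick one per event."""

    def __init__(self) -> None:
        self.free_buffers: Dict[str, int] = {}

    def advertise(self, group: str, free: int) -> None:
        """Called when a Node Master advertises its group's total free buffers."""
        self.free_buffers[group] = free

    def assign(self, event_number: int) -> Optional[str]:
        """Return the chosen ENG, or None if no group has a free buffer."""
        candidates = {g: n for g, n in self.free_buffers.items() if n > 0}
        if not candidates:
            return None
        # Ties go to the group that was registered first.
        group = max(candidates, key=candidates.get)
        # Assumed: count the buffer as used until the next advertisement.
        self.free_buffers[group] -= 1
        return group


emperor = Emperor()
emperor.advertise("ENG-A", 2)
emperor.advertise("ENG-B", 3)
choices = [emperor.assign(n) for n in range(6)]
print(choices)
```

In the real system the returned choice would then be passed on to the NM and RMs, as the slide describes.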

3/24/2003 Ron Rechenmacher, Fermilab Slide 24

Summary

Commodity-based ethernet DAQ built for D0
• 250 MB/s: 1 KHz of 250 KB events
• 63 sources and >80 targets

Commodity (ethernet) systems
• wow, a lot of stuff can show up!

You need a TCP/IP expert or two

People that can transcend boundaries

“to the metal” understanding

Infrastructure

3/24/2003 Ron Rechenmacher, Fermilab Slide 25

References / Additional Information

Fermitools
• http://fermitools.fnal.gov

Buffering
• http://www-d0.fnal.gov/cgi-bin/cvsweb.cgi/~checkout~/l3xsbc/doc/buffering/index.html?rev=HEAD&content-type=text/html

DAQ Scaling, DAQ Overview, sci2002
• http://d0.phys.washington.edu/~haas/d0/L3/

L3DAQ Homepage
• http://www-d0online.fnal.gov/www/groups/l3daq/default.html

L3 switch backplane load
• http://m-d0-mrtg.fnal.gov/s-d0-dab2cr-l3/s-d0-dab2cr-l3.backp2.html

The D0 Experiment
• http://www-d0.fnal.gov/

D0 Run II Operations
• http://www-d0.fnal.gov/runcoor/

3/24/2003 Ron Rechenmacher, Fermilab Slide 26

Extra Slides Follow

3/24/2003 Ron Rechenmacher, Fermilab Slide 27

Monitoring

Centralized, caching, monitor server.

Based on TCP/IP, ACE, and XML

Supports many displays and clients• 40 displays simultaneously

• 200 data sources

Talk and poster on this topic Thursday.

3/24/2003 Ron Rechenmacher, Fermilab Slide 28

Performance

`Weekly' Graph (2 Hour Average)

Max 5-min.: 17.0 %   Average 5-min.: 4.0 %   Current 5-min.: 0.0 %

Max 5-min.: 1.0 %   Average 5-min.: 0.0 %   Current 5-min.: 0.0 %

3/24/2003 Ron Rechenmacher, Fermilab Slide 29

Software Description - RM

Event routing (Routing Master)

• Receives “run” information from supervisor
• Farm node list and crate list per bit

• Gets bit fired by event# from TFW

• Receives no. of free buffers from each farm node

• Decides which nodes receive which events

• Sends routing info by event# to SBCs

• Sends crate list by event# to farm nodes

• Disables triggers when necessary
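
The Routing Master duties listed above can be tied together in a small sketch. It is an illustration under stated assumptions, not the D0 implementation: the slide only says the RM "decides which nodes receive which events", so the policy used here (most free buffers among the nodes configured for the fired bit), the re-enable rule, and every name are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


class RoutingMaster:
    def __init__(self,
                 nodes_per_bit: Dict[int, List[str]],
                 crates_per_bit: Dict[int, List[int]]) -> None:
        # "Run" information from the supervisor: node list and crate list per bit.
        self.nodes_per_bit = nodes_per_bit
        self.crates_per_bit = crates_per_bit
        self.free_buffers: Dict[str, int] = {}
        self.triggers_enabled = True

    def report_free_buffers(self, node: str, count: int) -> None:
        """A farm node reports its number of free buffers."""
        self.free_buffers[node] = count
        if count > 0:
            self.triggers_enabled = True   # assumed re-enable rule

    def route(self, event_number: int, bit: int) -> Optional[Tuple[str, List[int]]]:
        """Return (node for the SBCs, crate list for that node), or None."""
        candidates = [n for n in self.nodes_per_bit[bit]
                      if self.free_buffers.get(n, 0) > 0]
        if not candidates:
            self.triggers_enabled = False  # disable triggers when necessary
            return None
        node = max(candidates, key=lambda n: self.free_buffers[n])
        self.free_buffers[node] -= 1
        return node, self.crates_per_bit[bit]


rm = RoutingMaster(nodes_per_bit={0: ["node01", "node02"]},
                   crates_per_bit={0: [10, 11, 12]})
rm.report_free_buffers("node01", 1)
rm.report_free_buffers("node02", 2)
first = rm.route(event_number=1, bit=0)
print(first, rm.triggers_enabled)
```

The returned node would be sent to the SBCs as routing info by event number, and the crate list to the chosen farm node so it knows which fragments to expect.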