
Some data acquisition and processing considerations at L = 10^36

SuperB04 workshop, January 2004

Gregory Dubois-Felsmann

Caltech, BaBar


What this talk is about and not about

• It’s an exploration of certain basic parameters of a multi-level data acquisition system that could handle the rates expected at an e+e- B-factory with a luminosity of around 10^36 cm^-2 s^-1.

• It starts with certain assumptions about what the first-level trigger at a detector appropriate for this facility might/should be able to do.
  – That remains to be demonstrated, especially with regard to how one might make an all-silicon tracking detector part of the trigger…

• It doesn’t report any DAQ R&D specifically focused on very high luminosities – that’s yet to really get started within the BaBar collaboration.
  – We’ve been focused on nearer-term luminosity and other upgrades.
  – These have generated some relevant insights, but there’s much more to do.


Philosophy

• Many people’s reaction when they hear “10^36” has been…

“you have to have a tight, targeted trigger.”

• At least from the point of view of the data acquisition system, I will argue that this is not true, and in fact…
  – There’s a physics case for a relatively open trigger, and…
  – advances in technology mean that it’s feasible, and in some respects even “easy”, with the right architecture, in terms of the basic capabilities of processing, networking, and storage systems.

• Nevertheless:
  – The first-level trigger and front-end DAQ electronics will not be as straightforward, and…
  – The real challenge will be to design an analysis environment that makes it possible to do physics with such an immense data set.


Parameters and a definition

• Luminosity: one pb^-1 s^-1

– A “Snowmass year” of 10^7 s implies 10 ab^-1/year.

• Start date at that luminosity: ~2010–2012
  – Will use 2009 for Moore’s Law calculations

• Cross sections (turned into rates and yields in the sketch after this list):
  – B-Bbar: 1 nb – or 10^10 B-Bbar pairs/year

– uds: 1.6 nb, c: 1 nb

– leptons (μμ, ττ): 0.78 nb each

– recognizable Bhabhas: ~50 nb

• A “trigger” is considered very generally, in this talk, as a component of the system that irrevocably discards events.
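To make these parameters concrete, here is a minimal back-of-envelope sketch in Python (the structure and variable names are mine, not from the talk) turning the cross sections above into instantaneous rates and annual yields:

```python
# Back-of-envelope rates and yields at L = 1e36 cm^-2 s^-1 (= 1 pb^-1/s),
# using the cross sections quoted on this slide.

SNOWMASS_YEAR_S = 1e7                 # "Snowmass year" in seconds
LUMI_PB_PER_S = 1.0                   # 1 pb^-1 per second
INT_LUMI_PB = LUMI_PB_PER_S * SNOWMASS_YEAR_S   # 1e7 pb^-1 = 10 ab^-1/year

cross_sections_nb = {
    "B-Bbar": 1.0,
    "uds": 1.6,
    "c": 1.0,
    "lepton pairs (each)": 0.78,
    "recognizable Bhabhas": 50.0,
}

for name, sigma_nb in cross_sections_nb.items():
    sigma_pb = sigma_nb * 1e3                    # 1 nb = 1000 pb
    rate_hz = sigma_pb * LUMI_PB_PER_S
    per_year = sigma_pb * INT_LUMI_PB
    print(f"{name}: {rate_hz:,.0f} Hz, {per_year:.1e} events/year")

# B-Bbar: 1,000 Hz and 1.0e+10 pairs/year, matching the slide.
```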


(Some of) the physics case for an open trigger

• Rare-decay analyses that require looking at the recoil against fully reconstructed tags will be a mainstay of the physics program.

– B tags look pretty much like generic B events until fully reconstructed.

• Decays that are “easy to trigger on” will in many cases also be easy for the hadronic B-factories.

• An efficient, open trigger is usually easier to understand and less biased, limiting systematic errors in high-statistics analyses.

• A 10^36 super-B-factory should be seen in part as a “facility instrument” that should be able to fulfill needs not foreseen at the time of its design, at the time it turns on, or even ones arising after it has closed.

– A lot will be learned about B physics from other experiments right up through turn-on… and beyond, especially if relevant new physics turns up at the LHC or elsewhere…

A little more on each of these…


Tag-recoil analyses

• Many studies of hard-to-identify rare decays, for instance those involving neutrinos, will benefit from examining the unbiased recoil against fully reconstructed B’s.
  – B tags are fairly unremarkable-looking at the level of the individual tracks that compose them – efficient selection depends very much on full reconstruction.
    • Neutral and soft charged pions play an important role.

– It seems very difficult to design a tight, fast, well-characterized trigger for B tags, short of one relying on full reconstruction

– Experience shows that experiments tend to develop their tag selection algorithms over time as the detector response (and the set of B decay modes that are useful) become better understood: flexibility is key

– Trying to trigger on specific characteristics of the recoil would only work for some modes and would introduce significant systematics.


Triggering vs. hadron collider experiments

• More generally, hadron collider experiments with similar or greater production rates for B mesons will be the competition.

One inherent advantage of the e+e- B-factories is precisely the ability to go after decay modes that don’t have simple identifiable features and so present real triggering problems in the hadronic environment.


The community facility

• A super-B-factory should be seen as in part a “facility instrument” that should be able to fulfill needs not foreseen at the time of its design.

– Acquiring a sample of (a few) 10^10 B’s in a clean environment is likely a not-to-be-repeated opportunity.

– The choice of data acquisition and trigger strategy has to be made early.

– A full-acceptance strategy, though argued here to be feasible, is undoubtedly a challenge in certain respects, particularly in the demands it places on downstream (“offline”) systems, and will require advance commitment and R&D, especially to develop the correspondingly necessary analysis tools.

– Yet, a lot will be learned about B physics right up through turn-on… and even afterwards, from the experiment’s initial results, and these might point in directions not foreseen in the trigger design.

– Things learned at the LHC, LC, and beyond may well raise questions best addressed by going back and looking at this B sample

• This suggests trying very hard to find a way to keep all the B’s around and make sure that people with new ideas can apply them, even years later.

– That includes finding a way to keep the archive usable for analysis for a long time.


So, let’s see if it’s possible

• This is only a sketch of how this might go…

• But I hope it’s enough to make it clear that this is plausible enough to consider in more detail, and that…

• … perhaps this should even be the default assumption.


Some basic assumptions

• Level 1 (hardware trigger) rate:
  – BaBar projections show that the hardware trigger rate will be dominated by luminosity-driven interactions at luminosities above 10^34.
    • These are based on fairly old test runs, being redone this week…

– Scaled to 10^36, this gives a rate for a BaBar-like hardware trigger of about 75,000 “luminosity-ish” events per second. About 40–50,000 of these are Bhabhas. The same set of numbers projects only about 8,000 “beam-background-ish” events per second.

• Background may be worse than this because of continuous injection, but we’ve recently made great progress in understanding how to suppress these events cleanly.

• We should study how one might veto a substantial fraction of the Bhabhas in the hardware trigger. Perhaps in a Level 2?

• Two-photon veto possible in hardware trigger?

– In this talk, I will try to see whether we can live with a 100,000-event-per-second hardware trigger rate (a quick budget check follows below).
– “Hardware trigger” = triggering applied to incompletely built events.
  • “Hardware” will surely be in part a misnomer…
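As a sanity check on that budget, a small sketch (Python; the Bhabha veto fractions are purely illustrative assumptions, not numbers from the talk):

```python
# Level 1 rate budget at 1e36, from the scaled BaBar projections above.
lumi_ish_hz = 75_000        # "luminosity-ish" events/s (includes Bhabhas)
bhabha_hz = 45_000          # mid-range of the 40-50,000 Bhabha estimate
background_hz = 8_000       # "beam background-ish" events/s

total_hz = lumi_ish_hz + background_hz
print(f"Projected L1 rate: {total_hz:,} Hz vs. a 100,000 Hz budget")

# Illustrative only: if some fraction of Bhabhas could be vetoed at
# Level 1 (or in a Level 2), the headroom improves accordingly.
for veto in (0.5, 0.8):
    print(f"{veto:.0%} Bhabha veto -> {total_hz - veto * bhabha_hz:,.0f} Hz")
```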


Assumptions II

• Software trigger rate
  – This is the rate at which events are saved to permanent storage.

– This is envisioned as including stages equivalent to both BaBar’s Level 3 and “BGFilter” selections.

• Level 3: does basic DCH tracking and EMC clustering; the selection is very open (two tracks, or one high-p track, coming from the IP, plus basic EMC energy and multiplicity cuts)

• BGFilter: runs partway through the BaBar full-reconstruction executable, including tracking and clustering but not PID, physics combinatorics, etc.; tries to identify multihadron events


The task

• Past studies suggest that we could set up a selection with a 7 nb cross section.
  – Allows some “leakage” of Bhabhas and two-photon events beyond the ~4.5 nb udscb cross section.

– Very likely we can learn how to do better – we already have identified some reasonable handles on the two-photon events – but…

– It seems to be very hard to eliminate a substantial fraction of charm and uds events while maintaining high, unbiased efficiency for B’s – at a coarse level, continuum events look very much like low-multiplicity charmless B decays.

• The charm physics may also be interesting…

• Assume 100,000 events in, 7,000 events out per second. Can this be handled?
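The 7,000 Hz output rate is just the selection cross section times the luminosity; a one-line check (Python):

```python
# Software-trigger output rate = selection cross section x luminosity.
sigma_sel_pb = 7.0 * 1e3      # 7 nb = 7000 pb
lumi_pb_per_s = 1.0           # 1e36 cm^-2 s^-1 = 1 pb^-1/s
print(sigma_sel_pb * lumi_pb_per_s, "events/second")   # -> 7000.0
```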


Assumptions III

• Raw event size
  – Most important, but least predictable right now

– Depends on detector choices, details of backgrounds

– BaBar now 26-30 kB

– Assume 50 kB. If it’s wrong, at least it’s a scaling point.

– Will re-evaluate after this week’s background studies in BaBar

• Processed event size (more on this later):
  – 1 kB “skimmable tag”

– 10 kB DST

– Equivalent to BaBar’s “mini” format

– Assumed to be a product of the full software trigger


Basic architectural principles

• Keep things parallel for as long as possible
  – Acquire data in parallel

– Run the software trigger in parallel

– Record raw data to archival storage in parallel
    • Must design a storage catalog that can handle this

• Keep data access sequential as much as possible
  – Parallelize and concatenate all jobs needing to read all the data

• Provide the hardware needed to give reasonable response time & throughput

• Notional goal:
  – No single-path bottlenecks until the user actually looks at a histogram
    • Even large “ntuple” files should be parallelized, if possible.
  – Random access limited to users’ analyses using users’ resources (a toy sketch of this striped pattern follows below)
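As a toy illustration of these principles (not a design from the talk: the stream count, file names, and helper functions are all hypothetical), events can be striped across independent output files at acquisition time and later scanned sequentially, one reader per stripe, with results combined only at the very end:

```python
# Toy illustration of the "parallel everywhere" principle: events are
# striped round-robin across N independent output streams as they are
# acquired, and a later job reads each stripe sequentially in parallel.
from multiprocessing import Pool

N_STREAMS = 4  # stand-in for ~100 software-trigger nodes

def stripe(events):
    """Distribute events across N_STREAMS files, round-robin."""
    files = [open(f"stream_{i:03d}.dat", "w") for i in range(N_STREAMS)]
    for n, event in enumerate(events):
        files[n % N_STREAMS].write(event + "\n")
    for f in files:
        f.close()

def scan(stream_id):
    """Sequentially scan one stripe; no cross-stream coordination needed."""
    with open(f"stream_{stream_id:03d}.dat") as f:
        return sum(1 for _ in f)   # stand-in for a real analysis pass

if __name__ == "__main__":
    stripe(f"event {n}" for n in range(1000))
    with Pool(N_STREAMS) as pool:           # one reader per stripe
        counts = pool.map(scan, range(N_STREAMS))
    print(sum(counts))  # results concatenated only at the very end
```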


Moore’s Law considerations

(Chart courtesy of Richard Mount)

• CPU speed / processor doubling time is about 1.35 years

• Use 2009 as the design year for Moore’s Law purposes

• Assume things really do continue to scale in this way…
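The scaling factors used later in this talk follow from such doubling times; a minimal sketch (Python; `moore_factor` is my name, and the 2.1-year storage doubling time is my reading of the garbled factor on the storage slide):

```python
# Moore's-law scaling factor between two years, given a doubling time.
def moore_factor(year_from, year_to, doubling_years):
    return 2 ** ((year_to - year_from) / doubling_years)

# CPU speed, 1.35-year doubling: the "x36" used for 2002->2009 below.
print(round(moore_factor(2002, 2009, 1.35)))    # -> 36

# Sequential storage cost, ~2.1-year doubling: the "x7.2" for 2003->2009.
print(round(moore_factor(2003, 2009, 2.1), 1))  # -> 7.2
```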


Moore’s law versus luminosity

• At first this may not look so encouraging…

• But it appears to be sufficient if we can plan wisely…

[Chart: log-scale relative scale (0.1–10,000) vs. year (1998–2014) for luminosity (relative to 3×10^33), CPU/$, CPU speed, and sequential storage/$; annotated with BaBar (L = 3×10^33), the BaBar online technology acquisition point, and SuperB (L = 10^36).]


1) Filtering and storing the data

• Two questions:
  – Can we move the bytes?

– Can we afford to record them?

• Data volume into the software trigger:
  – 100,000 Hz × 50 kB = 5 GB/s

– The full rate appears only in the network switch.
    • Switches with 64 Gbps backplanes are available today

– Assume 100 software trigger machines
    • No problem to move 50 MB/s over Gigabit Ethernet into each machine

– Gigabit Ethernet has become a commodity product

– In use in BaBar for the last year

– Today’s Linux boxes can saturate about 1.5 Gbit/s over UDP

– Improved hardware support is rapidly decreasing the CPU cost

– No comment in this talk about 100 kHz front-end DAQ…
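The bandwidth arithmetic, as a sketch (Python; variable names mine):

```python
# Event-builder input bandwidth and per-node share, from the slide's numbers.
l1_rate_hz = 100_000            # hardware-trigger accept rate
event_size_kb = 50              # assumed raw event size
n_nodes = 100                   # software-trigger machines

total_gb_per_s = l1_rate_hz * event_size_kb / 1e6
per_node_mb_per_s = total_gb_per_s * 1e3 / n_nodes

print(f"Aggregate into the switch: {total_gb_per_s:.0f} GB/s")  # 5 GB/s
print(f"Per trigger node: {per_node_mb_per_s:.0f} MB/s")        # 50 MB/s
# 50 MB/s = 0.4 Gbit/s, comfortably within one Gigabit Ethernet link.
```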


Storing the data II

• Data volume out of the software trigger:
  – 7000 Hz × 60 kB = 420 MB/s (60 kB = 50 kB raw + 10 kB DST)

= 1.5 TB/hr, 36 TB/day, 4200 TB/Snowmass year

– Distribute over 100 software trigger machines
    • No problem to move 4.2 MB/s off each machine.
    • No problem to write to O(20) “tape” servers each writing 21 MB/s.

• Paying for the “tape”:
  – BaBar silo media: currently ~2000 TB/$M
  – Moore’s law 2003–2009: × 2^(6/2.1) = × 7.2 → ~14,500 TB/$M
  – $300K / Snowmass year
  – Affordable, but still a large expense
  – “Tape” may actually be MAID by then (Massive Array of Idle Disks)

A factor of four in the 50 kB event size would be OK for transport (everything scales) but would become problematic in storage cost.

We really have to try not to let the events balloon because of new detector technology or backgrounds.

We also can’t afford many replications of the data.
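The same arithmetic in code (Python; a sketch using the slide’s numbers, including the ×2^(6/2.1) storage-cost scaling as reconstructed above):

```python
# Storage volume and media cost per "Snowmass year", from the slide's numbers.
out_rate_hz = 7_000
stored_event_kb = 60                      # 50 kB raw + 10 kB DST
snowmass_year_s = 1e7

mb_per_s = out_rate_hz * stored_event_kb / 1e3            # 420 MB/s
tb_per_year = mb_per_s * snowmass_year_s / 1e6            # 4200 TB

tb_per_megadollar = 2_000 * 2 ** ((2009 - 2003) / 2.1)    # ~14,500 TB/$M
cost_kusd = tb_per_year / tb_per_megadollar * 1_000
print(f"{mb_per_s:.0f} MB/s, {tb_per_year:.0f} TB/year, ~${cost_kusd:.0f}K/year")
# -> 420 MB/s, 4200 TB/year, ~$290K/year (the ~$300K quoted above)
```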


Filtering the data II

• On a 100-node farm, each node processes 1000 Hz of input.

• Moore for CPU speed, 2002–2009: × 36
  – “2002” machines are the present BaBar online farm, 1.4 GHz PIII.

• Are 50 GHz-equivalent CPUs a credible possibility? Keep reading…

• Stage I: identical to the present BaBar Level 3 (very conservative!)
  – Would produce a 20 kHz output rate, consuming 5 ms of “BaBar online CPU” per event.
  – Assume a 10 ms baseline to allow for background-related problems.
  – Scales to 0.28 ms/event in 2009 → ~30% of each CPU used for Stage I (sketched below).

– Leaves lots of room for growth, improvements in algorithms

– Produces 200 Hz / node intermediate stream.
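The per-node Stage I budget, as a sketch (Python):

```python
# Stage I CPU budget per node, scaled by Moore's law from the 2002 baseline.
input_hz_per_node = 1_000          # 100 kHz spread over 100 nodes
baseline_ms = 10                   # assumed 2002-era cost per event
moore = 2 ** ((2009 - 2002) / 1.35)             # ~36x

ms_2009 = baseline_ms / moore                   # ~0.28 ms/event
cpu_fraction = ms_2009 * input_hz_per_node / 1_000
print(f"{ms_2009:.2f} ms/event -> {cpu_fraction:.0%} of each CPU")
# -> 0.28 ms/event, ~28% (the ~30% quoted on this slide)
```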


Filtering the data III

• Stage II: comparable to BaBar BGFilter
  – Right now, BaBar BGFilter takes O(0.5 s)/event.
  – 2009 Moore: 14 ms/event
  – Need 280 CPUs to process 20 kHz

– Choices:
    • Can redistribute events from the Level 3 CPUs to 3 CPUs each

• Can expand the farm to, say, 175 2-CPU nodes and keep all processing within the box

– If the BGFilter estimate is off due to background, this is a very scalable system:

• Easy to double the number of CPUs
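Farm sizing for Stage II is a linear function of the per-event cost, which is what makes the system so scalable; a sketch (Python; `cpus_needed` is my name):

```python
# Stage II farm sizing: CPUs needed scale linearly with per-event cost.
def cpus_needed(rate_hz, seconds_per_event):
    return rate_hz * seconds_per_event

moore = 2 ** ((2009 - 2002) / 1.35)             # ~36x speedup
t_2009 = 0.5 / moore                            # BGFilter: 0.5 s -> ~14 ms
print(round(cpus_needed(20_000, t_2009)))       # -> 275 (the ~280 quoted)

# If backgrounds double the per-event cost, just double the farm:
print(round(cpus_needed(20_000, 2 * t_2009)))   # -> ~550 CPUs
```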


Will Moore’s Law really continue to hold for CPU?

• Recall the “50 GHz” number that emerged above – from a relatively conservative application of baseline dates and empirical scaling data.

• Is this a sensible expectation?

a) It may well be. Consider: IBM claims that the “strained silicon” technology which they introduced into mass production in 2003 is readily scalable to 20 GHz processor clock speeds.

b) It doesn’t matter. Indications are that, even before raw speed scaling breaks down, manufacturers are going to be introducing tightly integrated multi-CPU-per-chip hardware. None of the argument of this talk is at all dependent on the availability of extremely high-speed single-thread computation.


Reconstructing the data

• BaBar reconstruction: ~1–1.5 s / hadronic event
  – After Moore, this is 28–56 ms / event.

– To do this on 7000 events/second requires 200–300 CPUs → seems doable.

– The rest is in Rainer Bartoldus’ talk


What if a stage requires “live” calibration data?

• Suppose filtering requires best-quality calibrations, comparable to BaBar “rolling calibrations”…

• Can we imagine introducing an hour’s buffering, say?
  – One hour of the 20 kHz Stage One data is 3600 GB.

– That’s 36 GB of buffering per node.
    • Not a problem as far as attaching the necessary disk space.

– Stage One data rate per node is 200 Hz, or 10 MB/s.
    • Can imagine writing this out and reading it back in (20 MB/s round trip) already with present technology.

– OK, but…
    • … no explanation of how to do the calibration.
      – Perhaps one would strip out a small prescaled sample of dimuons?
      – Could constants be computed independently on every node to avoid introducing a bottleneck?
    • … and we’ve learned in BaBar how much one would like to avoid introducing feedback loops in the data flow, especially at this low level.
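The buffering arithmetic, as a sketch (Python):

```python
# One-hour calibration buffer, per the slide's Stage One numbers.
stage1_hz = 20_000                 # Stage I output rate (whole farm)
event_kb = 50
n_nodes = 100
buffer_s = 3_600                   # one hour

farm_gb = stage1_hz * event_kb * buffer_s / 1e6
node_gb = farm_gb / n_nodes
node_mb_per_s = stage1_hz / n_nodes * event_kb / 1e3

print(f"Farm-wide buffer: {farm_gb:.0f} GB")             # 3600 GB
print(f"Per node: {node_gb:.0f} GB at {node_mb_per_s:.0f} MB/s "
      f"({2 * node_mb_per_s:.0f} MB/s round trip)")      # 36 GB, 10 -> 20 MB/s
```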


Costs of parallelism

• Coordinating the components
  – Requires a lot of R&D to build the necessary tools
  – Lots of other people are trying to solve the same problem

• Reliability
  – Need to be able to tolerate failures of single components in a large system
  – Build the system to be inherently resilient
    • “Only 99 nodes available today? Fine!”

– Again requires R&D

• Figuring out how to do analysis in such a world

• All these things are of interest to BaBar in the medium term; there should be lots of opportunities to do the R&D there and elsewhere in high energy physics.