20
SuperBelle Event Building System Takeo Higuchi (IPNS, KEK) S.Y.Suzuki (CC, KEK) 2008/12/12 1st Open Meeting of the SuperKEKB Collaboration

SuperBelle Event Building System

Embed Size (px)

DESCRIPTION

SuperBelle Event Building System. Takeo Higuchi (IPNS, KEK) S.Y.Suzuki (CC, KEK). Belle DAQ Diagram. Belle Event Builder. COPPER/FASTBUS readout. Construction of event record Data transmission to RFARM. Configuration of SB Event Builder. CPR crate. CPR crate. CPR crate. CPR crate. - PowerPoint PPT Presentation

Citation preview

SuperBelle Event Building System

Takeo Higuchi (IPNS, KEK)S.Y.Suzuki (CC, KEK)

2008/12/12 1st Open Meeting of the SuperKEKB Collaboration

Belle DAQ DiagramBelle Event Builder

• COPPER/FASTBUS readout • COPPER/FASTBUS readout

• Construction of event record• Data transmission to RFARM• Construction of event record• Data transmission to RFARM

Configuration of SB Event BuilderCPR crate ROPC #01

RFARM #01RFARM #01

RFARM #02RFARM #02

RFARM #03RFARM #03

EFARM #01EFARM #01 SWSW

EFARM #02EFARM #02 SWSW

EFARM #03EFARM #03 SWSW

EFARM #nEFARM #n

CPR crate ROPC #02

CPR crate ROPC #03

CPR crate ROPC #04

CPR crate ROPC #m

… …

SWSWSWSW

SWSWSWSW

SWSW RFARM #nRFARM #n

The m and n depend on the detector design.

From our experience, the performance limit here is at 1-2Gbps

Regulation on FINESSE by EB

• No event# skip / No event# shuffle– Any kind of data record must be sent out from the

FINESSE upon the L1 trigger. • At least a capsule with event# tag (and empty body)

must be generated even for a totally zero-suppressed event or an L2-aborted event.

– Event # by the FINESSE must be increased monotonically by 1.

12345678134589101112435876

………

OK

NG FATAL

Inside EFARM #xx

• Multi-layer event building– Similar to the present EFARM configuration.

EFARM #xxEFARM #xx

1st layer 2nd layer i-th layer Exit layer…

PCPCPCPC

PCPC

PCPC

PCPC PCPC

PCPC

PCPC

PCPC

Performance Study Inside EB (1)Master thesis of K.Ito (2005)

P2P connection Connection over SW

100BaseT

1000BaseT

100BaseT

GbE SW

• P2P connection• Connection over SW

SW works fine

N=8

N=1..11 @800Bytes/ev

1000BaseT

100BaseT

GbE SWScalability up to N~10…

Performance Study Inside EB (2)Master thesis of K.Ito (2005)

1000BaseT

100BaseT

GbE SWROPC

1stLayer

Pseudo2nd

Layer

Event rate Total data transfer rate

N=1..20 N < 5N < 5~

Performance Study Inside EB (3)Master thesis of K.Ito (2005)

N=5N=5

N=1..10

ROPC1st Layer

2nd Layer

800 bytes/ev 1600 bytes/ev 3200 bytes/ev 6400 bytes/ev

vs. # of 1st later PCs

Event rate Total data transfer rate

Concepts of Implementation

• Switch from house-made technologies to generally existing technologies

– E.g. (1): event buffering• Now: House made (shared memory + semaphore).• SB: Socket buffer within Linux kernel.

– E.g.(2): state machine• Now: All EB nodes are managed by NSM.• SB: rwho or finger+.plan

Only one NSM node to respond to master.This collect all EB nodes’ status via thosecommands.

Less effort for documentationLess effort for documentation

Concepts of Implementation

• Quit from complexity

– NSM: Only one NSM node to respond to master.– D2: Own error reporting / logging syslog.– No (or minimum) configuration file:

the present EB requires large effort to add new ROPCsfor its multiple and complicated configuration files...

– Common software for all EB PCs in each layer.

Less effort for trouble shootingLess effort for

trouble shooting

We have just started the conceptual design.Following slides includes many unimplemented ideas.

Network Link Propagation

…Network connections propagate at RUN START

from DOWNstream to UPstream. # or ports awaiting for connection can be unity.

COPP

ERs

COPP

ERs

E101E101

E201E201

RO01RO01

RO0nRO0n

E102E102

E202E202

En01En01 RFRF

Determination of PCs to Connect

• Idea #1: Determine EB PCsto connect from its own hostname + rule.

– EB PC connects to all available (= accepting connection) PCs at run start, which match to the rule.

– Two new schemes must be created:• Detection of PCs which are not booted but included to CDAQ.• Detection of PCs which are booted but excluded from CDAQ.

E02_01E02_01

E02_02E02_02

E03_01E03_01

E01_11E01_11 E01_12E01_12 E01_19E01_19E01_13E01_13E.g.: …

E01_21E01_21 E01_22E01_22 E01_29E01_29E01_23E01_23 …

E02_11E02_11 E02_12E02_12 E02_19E02_19E02_13E02_13 …

Recall “no (or minimum)configuration file”

Recall “no (or minimum)configuration file”

Determination of PCs to Connect

• Idea #2: “EB_CONFIG server”.– EB_CONFIG server knows which EB PCs are booted up

and which are not.– EB_CONFIG server knows which subsystems are

included and which are excluded (told from MASTER).– EB_CONFIG server knows which EB PCs belong to

which subsystem (from a configuration file). • But this is what we like to avoid.

EB PCEB PC

Inside EB PC (Process Diagram)downstream upstream

port #nnn

…port #nnn

port #nnn

port #nnn

connect toupstream

connect toupstream

connect toupstream

output todownstream

eventbuilder

xinetdxinetd

eb_process

fork(2)

An EB PC in the exit layer may have to output the built-up record to one of multiple destinations (RFARM PCs), which consequently means the EB PC accepts multiple connections from each destination PC. We have found a way to pass socket descriptor to another process. This enables us to perform event building, destination selection, and data transmission in a single process.

Input Buffer

• Use Linux kernel bufferinstead of house-made buffering scheme– Much easier implementation/maintenance.– Unlike when the present EB developed, very large buffer

(~80MB) can be allocated in the kernel now.

Insidekernel

Data

shmoprecv

Sharedmemory

Build

Insidekernel

Data

recv

Build

Now SuperBelle

Output Buffer

• Recent Linux provides information about how much unsent data remain in the kernel output buffer.– Established by SK DAQ team (Hayato , Yamada).

Enables dynamic traffic balancing.

ioctl( sd, SIOCOUTQ, &val);ioctl( sd, SIOCOUTQ, &val);

PC #00in RFARM

#01

PC #00in RFARM

#01

EB PCin exit layer

EB PCin exit layer

PC #01in RFARM

#01

PC #01in RFARM

#01

PC #02in RFARM

#01

PC #02in RFARM

#01

idle

idle

too busyto recv data

unsent dataPC #00

in RFARM #01

PC #00in RFARM

#01

EB PCin exit layer

EB PCin exit layer

PC #01in RFARM

#01

PC #01in RFARM

#01

PC #02in RFARM

#01

PC #02in RFARM

#01

Select data destinationto balance the traffic load.

PC: EBNSMPC: EBNSM

State Machine / Status Collection

• Possible statuses of EB PC

• Collection of EB PC statuses and EB Configuration

Accepting side

Not ready

Ready

Partially connected

Fully connected

From downstreamto the EB PC

From the EB PCto upstream

Connecting side

Not ready

Ready

Partially connected

Fully connectedReady to run

EB_CONFserver

Collection ofEB PC statuses

Query: which EB PC is ready? to connect? included?

NSM command:EB_CONFIG …

NSM reply:READY, NOTREADY …

nsmd

CDAQ sideEFARM side rwho is used.

Error Reporting and Logging

INFOWARNERRORFATALEB

Pro

cess

syslog(3)

/var/log/messages

tail –fgrep

•Watch with

• Throw error messages tothe NSM master.

BASF Substitution

• Separation of BASF from EFARM– Present EFARM strongly depends on BASF; BASF modules

build up event records. May cause BELLE_LEVEL problem.

– SuperBelle EFARM is not to use BASF to build up event.

• Substitution of BASF– BASF-like substitution for online data quality check and/or

L3 trigger use will be provided.

const int (*userfunc[])(unsigned int *data, size_t size);

const int (*userfunc[])(unsigned int *data, size_t size);

Summary

• We have started the design of event building farm for SuperBelle.

• Regulation on the FINESSE is given from the EFARM.• Several performance studies about network data link

are already made to input to the new EFARM design.• The new EFARM fully utilizes pre-implemented Linux

functions instead our own invention.• The new EFARM tries to be self-configured to be free

from configuration-file nightmare.